docs(test): enforce kernel and dev/header version matching gates

2026-05-12 19:15:46 -04:00
parent 2ff747d500
commit a89de319b6


@@ -26,6 +26,14 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 - Apply this rule to every relevant step in this procedure.
 - For every CMC install/reinstall command in this test, always include installer option: `-no-prebuilt-mtdi-nexus`.
+## Kernel Package Matching Rule (Global)
+- For every planned kernel upgrade, verify matching development/header packages are available for the exact target kernel version before installing that kernel.
+- On Red Hat-family systems, verify `kernel-devel-<target>` and `kernel-headers-<target>` availability (or documented distro-equivalent package names where applicable).
+- The first kernel upgrade attempt must not use the latest kernel in the filtered candidate list; reserve the latest kernel for the final kernel-upgrade stage.
+- When upgrading kernel versions, also upgrade/install the matching development/header packages for that same version.
+- After each kernel upgrade and reboot, verify running kernel version and installed dev/header package versions all match.
+- If kernel and dev/header package versions are mismatched at any point, stop immediately as blocker-fail and do not continue with remediation by assumption.
 ## Red Hat Preflight (Global, Manual Tasks Only)
 - Apply this section when the test target is a Red Hat machine and the run is manually executed.
 - Do not apply this section to ATVM automation runs that already handle subscription flow.
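Review note: the match-or-stop gate introduced by the Kernel Package Matching Rule in this hunk can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not part of the commit: the names `find_version_mismatches`, `enforce_match`, and `BlockerFail` are hypothetical, and it assumes the caller has already collected the running kernel release and the installed package versions as strings in the same `version-release.arch` form. The version strings in the usage lines are illustrative sample values only.

```python
from typing import Dict, List

# Package names taken from the rule for Red Hat-family systems.
REQUIRED_PACKAGES = ("kernel-devel", "kernel-headers")


class BlockerFail(Exception):
    """Hypothetical exception type: raised when the test must stop as blocker-fail."""


def find_version_mismatches(running_kernel: str, installed: Dict[str, str]) -> List[str]:
    """Return one message per missing or mismatched package; empty list means all match.

    Assumption: `installed` maps package name to a version string in the same
    form as `running_kernel`, gathered by the caller.
    """
    problems = []
    for name in REQUIRED_PACKAGES:
        version = installed.get(name)
        if version is None:
            problems.append(f"{name}: not installed")
        elif version != running_kernel:
            problems.append(f"{name}: {version} != running kernel {running_kernel}")
    return problems


def enforce_match(running_kernel: str, installed: Dict[str, str]) -> None:
    """Stop immediately on any mismatch instead of attempting remediation."""
    problems = find_version_mismatches(running_kernel, installed)
    if problems:
        raise BlockerFail("; ".join(problems))


# Illustrative usage with sample version strings.
sample_running = "5.14.0-362.8.1.el9_3.x86_64"
sample_installed = {
    "kernel-devel": "5.14.0-362.8.1.el9_3.x86_64",
    "kernel-headers": "5.14.0-362.8.1.el9_3.x86_64",
}
enforce_match(sample_running, sample_installed)  # matching versions: no exception
```

The comparison is deliberately an exact string match, which mirrors the rule's "match exactly" wording; any normalization of version strings would be a separate decision for the test owner.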
@@ -88,29 +96,35 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 26. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
 27. Wait for initial sync completion.
 28. Check available kernels again using full candidate listing (not latest-only output).
-29. Select upgrade target one step above current kernel from the filtered candidate list (same major; same minor preferred).
-30. Install selected kernel and reboot.
-31. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
-32. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
-33. Write sample data to source 10GB disk.
-34. Trigger sync and confirm tracking status using `cirrusdata`.
-35. Uninstall CMC.
-36. Post-uninstall cleanup checkpoint:
+29. Select first-upgrade target from filtered candidate list (same major; same minor preferred), ensuring it is not the latest candidate.
+30. Verify matching dev/header packages for the selected first-upgrade target are available.
+31. Install selected first-upgrade kernel and matching dev/header packages, then reboot.
+32. Verify running kernel and installed dev/header packages match the selected first-upgrade version.
+33. If versions do not match exactly, stop as blocker-fail.
+34. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
+35. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
+36. Write sample data to source 10GB disk.
+37. Trigger sync and confirm tracking status using `cirrusdata`.
+38. Uninstall CMC.
+39. Post-uninstall cleanup checkpoint:
 - Run MCP offline-host cleanup for `skidamarink`.
 - If the cloned VM is still marked online after uninstall, remove that cloned VM host entry specifically.
-37. Check available kernels.
-38. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
-39. Upgrade to selected latest target kernel and reboot.
-40. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`.
-41. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
-42. Confirm machine is online in `skidamarink` using `cirrusdata`.
-43. SSH and verify MTDI, Galaxy Migrate services/driver are up.
-44. Success-path cleanup only: power off cloned machine.
-45. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory.
-46. Success-path final cleanup checkpoint:
+40. Check available kernels.
+41. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
+42. Verify matching dev/header packages for the selected latest-upgrade target are available.
+43. Install selected latest-upgrade kernel and matching dev/header packages, then reboot.
+44. Verify running kernel and installed dev/header packages match the selected latest-upgrade version.
+45. If versions do not match exactly, stop as blocker-fail.
+46. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`.
+47. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
+48. Confirm machine is online in `skidamarink` using `cirrusdata`.
+49. SSH and verify MTDI, Galaxy Migrate services/driver are up.
+50. Success-path cleanup only: power off cloned machine.
+51. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory.
+52. Success-path final cleanup checkpoint:
 - Run MCP offline-host cleanup for `skidamarink`.
 - If the cloned VM is still marked online at the end of the test, remove that cloned VM host entry specifically.
-47. Blocker-fail path after clone creation:
+53. Blocker-fail path after clone creation:
 - Stop test immediately after recording failure details.
 - Leave cloned VM powered on and present in inventory for manual inspection.
 - Do not run clone power-off/delete steps in blocker-fail path.
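Review note: the two selection steps in this hunk (a first-upgrade target that is not the latest candidate, then a latest-upgrade target) could be expressed roughly as follows. This is a sketch under assumptions, not the procedure's defined algorithm: all function names are hypothetical, the numeric sort key is a simplification rather than the package manager's own version comparison, and picking the lowest eligible candidate for the first upgrade is one possible reading of the step. The release strings in the usage lines are illustrative sample values.

```python
import re
from typing import List, Tuple


def version_key(release: str) -> Tuple[int, ...]:
    """Simplified sort key: every run of digits in the release string, in order."""
    return tuple(int(n) for n in re.findall(r"\d+", release))


def newer_same_major(current: str, candidates: List[str]) -> List[str]:
    """Candidates newer than `current` with the same major version, oldest first."""
    cur = version_key(current)
    newer = {
        c for c in candidates
        if version_key(c)[:1] == cur[:1] and version_key(c) > cur
    }
    return sorted(newer, key=version_key)


def prefer_same_minor(current: str, releases: List[str]) -> List[str]:
    """Keep only same-minor releases when any exist; otherwise keep all."""
    cur = version_key(current)
    same_minor = [r for r in releases if version_key(r)[1:2] == cur[1:2]]
    return same_minor or releases


def select_latest_upgrade(current: str, candidates: List[str]) -> str:
    """Latest-stage target: same major required, same minor preferred."""
    newer = newer_same_major(current, candidates)
    if not newer:
        raise ValueError("no newer same-major kernel candidate available")
    return prefer_same_minor(current, newer)[-1]


def select_first_upgrade(current: str, candidates: List[str]) -> str:
    """First-stage target: never the kernel reserved for the latest stage."""
    reserved = select_latest_upgrade(current, candidates)
    eligible = [r for r in newer_same_major(current, candidates) if r != reserved]
    if not eligible:
        raise ValueError("no candidate left once the latest kernel is reserved")
    # Assumption: take the lowest eligible release as the first step.
    return prefer_same_minor(current, eligible)[0]


# Illustrative usage with sample release strings.
sample_current = "5.14.0-284.11.1.el9_2"
sample_candidates = [
    "5.14.0-284.11.1.el9_2",
    "5.14.0-362.8.1.el9_3",
    "5.14.0-427.13.1.el9_4",
    "5.14.0-503.11.1.el9_5",
]
first_target = select_first_upgrade(sample_current, sample_candidates)
latest_target = select_latest_upgrade(sample_current, sample_candidates)
```

Raising when only one newer candidate exists keeps the "reserve the latest kernel for the final stage" rule explicit: the sketch refuses to pick a first-upgrade target rather than silently reusing the latest one.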
@@ -139,10 +153,14 @@ Use one cumulative results file and append one new section per tested host.
 - Start kernel version:
 - Kernel list before first upgrade (full candidate list, filtered by scope rule):
 - Kernel selected for step-up upgrade:
+- Matching dev/header packages for step-up target (availability check):
 - Kernel after step-up reboot:
+- Installed dev/header package versions after step-up:
 - Kernel list before latest upgrade (full candidate list, filtered by scope rule):
 - Kernel selected for latest upgrade:
+- Matching dev/header packages for latest target (availability check):
 - Kernel after latest reboot:
+- Installed dev/header package versions after latest upgrade:
 ### Execution Summary (Short Bullets)
 - Clone created / FC PCI detached: `PASS|FAIL` - notes
@@ -150,11 +168,13 @@ Use one cumulative results file and append one new section per tested host.
 - CMC reinstall #1: `PASS|FAIL` - notes
 - Local migration #1 (10GB -> 11GB) initial sync: `PASS|FAIL` - notes
 - Step-up kernel upgrade: `PASS|FAIL` - notes
+- Step-up dev/header package match check: `PASS|FAIL` - notes
 - Online in skidamarink after step-up: `PASS|FAIL` - notes
 - MTDI/Galaxy Migrate service+driver health after step-up: `PASS|FAIL` - notes
 - Write data + tracking status: `PASS|FAIL` - notes
 - CMC uninstall: `PASS|FAIL` - notes
 - Latest kernel upgrade: `PASS|FAIL` - notes
+- Latest dev/header package match check: `PASS|FAIL` - notes
 - CMC reinstall #2: `PASS|FAIL` - notes
 - Local migration #2 (10GB -> 11GB) initial sync: `PASS|FAIL` - notes
 - Online in skidamarink after latest upgrade: `PASS|FAIL` - notes