From 7477d18cff65bf5d32c2570efd25e16fb055e7bd Mon Sep 17 00:00:00 2001
From: "anthony.wen"
Date: Tue, 12 May 2026 17:14:40 -0400
Subject: [PATCH] docs(test): move kernel gate pre-clone and clarify same-major/minor upgrade selection

---
 tests/cmc-upgrade-test.md | 95 ++++++++++++++++++++-------------------
 1 file changed, 50 insertions(+), 45 deletions(-)

diff --git a/tests/cmc-upgrade-test.md b/tests/cmc-upgrade-test.md
index b101728..bcddf01 100644
--- a/tests/cmc-upgrade-test.md
+++ b/tests/cmc-upgrade-test.md
@@ -55,52 +55,57 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 ## Test Procedure
 
 1. Remove offline hosts in `skidamarink` using MCP offline-host cleanup.
-2. Confirm source host is powered off.
-3. Determine base clone name: `aw999-[source-without-atvmxxx-]`.
-4. Before cloning, check whether that clone name already exists in vCenter.
-5. If the name exists, choose the next available suffixed name: `aw999-[source-without-atvmxxx-]-1`, then `-2`, then `-N` as needed.
-6. Clone source VM using the resolved unique clone name on datastore `AutomatedTest-UnitTesting` only.
-7. For the clone command destination name, pass only the VM name (for example `aw999-ubuntu24.04-1`), not an inventory path like `/CDSHQ-Eng/vm/...`; set folder separately if needed.
-8. Detach the 2 FC PCI adapters from the cloned VM.
-9. Power on clone.
-10. SSH to `` using credentials from `/home/aw/code/cds/.env.credentials.local`.
-11. Change OS hostname to clone name, replacing `.` with `-`.
-12. Convert networking from static IP to DHCP.
-13. Remove/clean static IP configuration references.
-14. Reboot clone.
-15. Find DHCP address and verify it is not ``.
-16. If still ``, fix static config cleanup and repeat reboot/verify.
-17. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
-18. Check available kernel versions.
-19. Verify at least 2 upgrade candidates exist.
-20. If fewer than 2 candidates: stop test, power off clone, delete clone and its disks, end run.
-21. Gate check:
-   - If step 20 triggered a stop condition, execute no further steps.
+2. Confirm source host is powered on. If it is powered off, power it on.
+3. SSH to the source host and check available kernel versions on the source before cloning.
+4. Build source-host kernel candidate list from all available versions (include intermediate versions, not just the latest from `check-update`).
+5. Candidate scope rule:
+   - Include only kernels in the same major OS family as the current machine (no major-version upgrades).
+   - Prefer candidates within the same minor stream as current OS/kernel when available.
+6. Verify at least 2 upgrade candidates exist in the filtered candidate list.
+7. If fewer than 2 candidates: hard stop and end run before clone creation.
+8. Gate check:
+   - If step 7 triggered a stop condition, execute no further steps.
    - If no stop condition was triggered, continue with the next step.
-22. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone.
-23. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
-24. Wait for initial sync completion.
-25. Check available kernels again.
-26. Select upgrade target one step above current kernel (not latest).
-27. If only 1 available version, stop test.
-28. Install selected kernel and reboot.
-29. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
-30. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
-31. Write sample data to source 10GB disk.
-32. Trigger sync and confirm tracking status using `cirrusdata`.
-33. Uninstall CMC.
-34. Post-uninstall cleanup checkpoint:
+9. Confirm source host is powered off (required pre-clone state).
+10. Determine base clone name: `aw999-[source-without-atvmxxx-]`.
+11. Before cloning, check whether that clone name already exists in vCenter.
+12. If the name exists, choose the next available suffixed name: `aw999-[source-without-atvmxxx-]-1`, then `-2`, then `-N` as needed.
+13. Clone source VM using the resolved unique clone name on datastore `AutomatedTest-UnitTesting` only.
+14. For the clone command destination name, pass only the VM name (for example `aw999-ubuntu24.04-1`), not an inventory path like `/CDSHQ-Eng/vm/...`; set folder separately if needed.
+15. Detach the 2 FC PCI adapters from the cloned VM.
+16. Power on clone.
+17. SSH to `` using credentials from `/home/aw/code/cds/.env.credentials.local`.
+18. Change OS hostname to clone name, replacing `.` with `-`.
+19. Convert networking from static IP to DHCP.
+20. Remove/clean static IP configuration references.
+21. Reboot clone.
+22. Find DHCP address and verify it is not ``.
+23. If still ``, fix static config cleanup and repeat reboot/verify.
+24. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
+25. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone.
+26. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
+27. Wait for initial sync completion.
+28. Check available kernels again using full candidate listing (not latest-only output).
+29. Select upgrade target one step above current kernel from the filtered candidate list (same major; same minor preferred).
+30. Install selected kernel and reboot.
+31. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
+32. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
+33. Write sample data to source 10GB disk.
+34. Trigger sync and confirm tracking status using `cirrusdata`.
+35. Uninstall CMC.
+36. Post-uninstall cleanup checkpoint:
    - Run MCP offline-host cleanup for `skidamarink`.
    - If the cloned VM is still marked online after uninstall, remove that cloned VM host entry specifically.
-35. Check available kernels.
-36. Upgrade to latest kernel and reboot.
-37. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`).
-38. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
-39. Confirm machine is online in `skidamarink` using `cirrusdata`.
-40. SSH and verify MTDI, Galaxy Migrate services/driver are up.
-41. Power off cloned machine.
-42. Delete cloned VM and its disks from vCenter inventory.
-43. Final cleanup checkpoint:
+37. Check available kernels.
+38. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
+39. Upgrade to selected latest target kernel and reboot.
+40. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`).
+41. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
+42. Confirm machine is online in `skidamarink` using `cirrusdata`.
+43. SSH and verify MTDI, Galaxy Migrate services/driver are up.
+44. Power off cloned machine.
+45. Delete cloned VM and its disks from vCenter inventory.
+46. Final cleanup checkpoint:
    - Run MCP offline-host cleanup for `skidamarink`.
    - If the cloned VM is still marked online at the end of the test, remove that cloned VM host entry specifically.
 
@@ -126,10 +131,10 @@ Use one cumulative results file and append one new section per tested host.
 ### Kernel / OS Tracking
 - Start OS version:
 - Start kernel version:
-- Kernel list before first upgrade:
+- Kernel list before first upgrade (full candidate list, filtered by scope rule):
 - Kernel selected for step-up upgrade:
 - Kernel after step-up reboot:
-- Kernel list before latest upgrade:
+- Kernel list before latest upgrade (full candidate list, filtered by scope rule):
 - Kernel selected for latest upgrade:
 - Kernel after latest reboot:
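
Reviewer note: the candidate scope rule, the two-candidate gate, and the step-up versus latest selection introduced by this patch can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure's actual tooling. The function names, the `MIN_CANDIDATES` constant, and the version-string format are hypothetical. The sketch treats the first numeric component of a kernel version as "major" and the second as "minor", and it reads "same minor preferred" as "use the same-minor subset only when it alone satisfies the two-candidate gate"; the patch does not spell out either point.

```python
import re
from typing import List, Tuple

MIN_CANDIDATES = 2  # gate threshold from the procedure (at least 2 upgrade candidates)


def version_key(version: str) -> Tuple[int, ...]:
    """Hypothetical parser: '5.15.0-101-generic' -> (5, 15, 0, 101)."""
    return tuple(int(n) for n in re.findall(r"\d+", version))


def filter_candidates(current: str, available: List[str]) -> List[str]:
    """Apply the scope rule: newer than current, same major, same minor preferred."""
    cur = version_key(current)
    # Keep every newer kernel in the same major line, intermediate versions included.
    same_major = sorted(
        {v for v in available
         if version_key(v) > cur and version_key(v)[0] == cur[0]},
        key=version_key,
    )
    same_minor = [v for v in same_major if version_key(v)[:2] == cur[:2]]
    # Assumption: fall back to the same-major list when the same-minor
    # subset is too small to pass the gate on its own.
    return same_minor if len(same_minor) >= MIN_CANDIDATES else same_major


def gate_passes(candidates: List[str]) -> bool:
    """Pre-clone gate: continue only with at least MIN_CANDIDATES candidates."""
    return len(candidates) >= MIN_CANDIDATES


def select_step_up(candidates: List[str]) -> str:
    """First upgrade target: one step above the current kernel, not the latest."""
    return candidates[0]


def select_latest(candidates: List[str]) -> str:
    """Second upgrade target: the newest kernel in the filtered list."""
    return candidates[-1]


if __name__ == "__main__":
    available = ["5.15.0-97", "5.15.0-100", "5.15.0-101",
                 "5.15.0-105", "5.19.0-20", "6.2.0-10"]
    candidates = filter_candidates("5.15.0-100", available)
    if gate_passes(candidates):
        print("step-up:", select_step_up(candidates))
        print("latest:", select_latest(candidates))
    else:
        print("hard stop: fewer than 2 candidates, do not clone")
```

In practice the `available` list would come from the distribution's full package listing rather than latest-only output, and the filtered list would be recomputed on the clone before each upgrade, since the current kernel changes after the step-up reboot.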