diff --git a/tests/cmc-upgrade-kernel-test.md b/tests/cmc-upgrade-kernel-test.md
index 2f0b7ae..f134d2d 100644
--- a/tests/cmc-upgrade-kernel-test.md
+++ b/tests/cmc-upgrade-kernel-test.md
@@ -62,6 +62,7 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 - If the clone is missing or identity is uncertain, stop and do not delete any other VM.
 - If any blocker occurs after clone creation, stop the test and leave the cloned VM powered on for manual inspection.
 - Do not delete or power off the clone on blocker-fail outcomes.
+- After source-host kernel inspection is complete, power the source VM off and re-verify in vCenter that it is powered off before cloning.
 - Detaching the 2 FC PCI passthrough adapters from the cloned VM is mandatory before any guest boot or guest-side change.
 - Verify in vCenter that both FC passthrough devices are absent before proceeding past the clone-prep stage.
 - Cleanup actions that remove hosts from CMC must target only the cloned host used in the current test run.
@@ -69,7 +70,7 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 ## Test Procedure
 
 1. Remove offline hosts in `skidamarink` using MCP offline-host cleanup.
-2. Confirm source host is powered on. If it is powered off, power it on.
+2. Confirm source host is powered on for the inspection phase. If it is powered off, power it on.
 3. SSH to the source host and check available kernel versions on the source before cloning.
 4. Build source-host kernel candidate list from all available versions (include intermediate versions, not just the latest from `check-update`).
 5. Candidate scope rule:
@@ -80,60 +81,61 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 8. Gate check:
    - If step 7 triggered a stop condition, execute no further steps.
    - If no stop condition was triggered, continue with the next step.
-9. Confirm source host is powered off (required pre-clone state).
-10. Determine base clone name: `aw999-[source-without-atvmxxx-]`.
-11. Before cloning, check whether that clone name already exists in vCenter.
-12. If the name exists, choose the next available suffixed name: `aw999-[source-without-atvmxxx-]-1`, then `-2`, then `-N` as needed.
-13. Clone source VM using the resolved unique clone name on datastore `AutomatedTest-UnitTesting` only.
-14. For the clone command destination name, pass only the VM name (for example `aw999-ubuntu24.04-1`), not an inventory path like `/CDSHQ-Eng/vm/...`; set folder separately if needed.
-15. Detach the 2 FC PCI adapters from the cloned VM.
-16. Verify in vCenter that both FC passthrough devices are no longer present on the clone.
-17. Power on clone.
-18. SSH to `` using credentials from `/home/aw/code/cds/.env.credentials.local`.
-19. Change OS hostname to clone name, replacing `.` with `-`.
-20. Convert networking from static IP to DHCP.
-21. Remove/clean static IP configuration references.
-22. Reboot clone.
-23. Find DHCP address and verify it is not ``.
-24. If still ``, fix static config cleanup and repeat reboot/verify.
-25. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
-26. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone, always adding `-no-prebuilt-mtdi-nexus`.
-27. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
-28. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
-29. Wait for initial sync completion.
-30. Check available kernels again using full candidate listing (not latest-only output).
-31. Select first-upgrade target from filtered candidate list (same major; same minor preferred), ensuring it is not the latest candidate.
-32. Verify matching dev/header packages for the selected first-upgrade target are available.
-33. Install selected first-upgrade kernel and matching dev/header packages, then reboot.
-34. Verify running kernel and installed dev/header packages match the selected first-upgrade version.
-35. If versions do not match exactly, stop as blocker-fail.
-36. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
-37. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
-38. Write sample data to source 10GB disk.
-39. Trigger sync and confirm tracking status using `cirrusdata`.
-40. Uninstall CMC.
-41. Post-uninstall cleanup checkpoint:
+9. After source-host inspection is complete, power the source VM off.
+10. Confirm in vCenter that the source host is powered off before cloning.
+11. Determine base clone name: `aw999-[source-without-atvmxxx-]`.
+12. Before cloning, check whether that clone name already exists in vCenter.
+13. If the name exists, choose the next available suffixed name: `aw999-[source-without-atvmxxx-]-1`, then `-2`, then `-N` as needed.
+14. Clone source VM using the resolved unique clone name on datastore `AutomatedTest-UnitTesting` only.
+15. For the clone command destination name, pass only the VM name (for example `aw999-ubuntu24.04-1`), not an inventory path like `/CDSHQ-Eng/vm/...`; set folder separately if needed.
+16. Detach the 2 FC PCI adapters from the cloned VM.
+17. Verify in vCenter that both FC passthrough devices are no longer present on the clone.
+18. Power on clone.
+19. SSH to `` using credentials from `/home/aw/code/cds/.env.credentials.local`.
+20. Change OS hostname to clone name, replacing `.` with `-`.
+21. Convert networking from static IP to DHCP.
+22. Remove/clean static IP configuration references.
+23. Reboot clone.
+24. Find DHCP address and verify it is not ``.
+25. If still ``, fix static config cleanup and repeat reboot/verify.
+26. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
+27. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone, always adding `-no-prebuilt-mtdi-nexus`.
+28. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
+29. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
+30. Wait for initial sync completion.
+31. Check available kernels again using full candidate listing (not latest-only output).
+32. Select first-upgrade target from filtered candidate list (same major; same minor preferred), ensuring it is not the latest candidate.
+33. Verify matching dev/header packages for the selected first-upgrade target are available.
+34. Install selected first-upgrade kernel and matching dev/header packages, then reboot.
+35. Verify running kernel and installed dev/header packages match the selected first-upgrade version.
+36. If versions do not match exactly, stop as blocker-fail.
+37. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
+38. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
+39. Write sample data to source 10GB disk.
+40. Trigger sync and confirm tracking status using `cirrusdata`.
+41. Uninstall CMC.
+42. Post-uninstall cleanup checkpoint:
    - Run MCP offline-host cleanup for `skidamarink`.
    - If the cloned VM is still marked online after uninstall, remove that cloned VM host entry specifically via MCP (target only this test clone host).
    - Because CMC status can lag behind VM state, poll briefly for status transition; if still online, perform targeted MCP host removal for the tested clone.
-42. Check available kernels.
-43. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
-44. Verify matching dev/header packages for the selected latest-upgrade target are available.
-45. Install selected latest-upgrade kernel and matching dev/header packages, then reboot.
-46. Verify running kernel and installed dev/header packages match the selected latest-upgrade version.
-47. If versions do not match exactly, stop as blocker-fail.
-48. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`.
-49. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
-50. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
-51. Confirm machine is online in `skidamarink` using `cirrusdata`.
-52. SSH and verify MTDI, Galaxy Migrate services/driver are up.
-53. Success-path cleanup only: power off cloned machine.
-54. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory.
-55. Success-path final cleanup checkpoint:
+43. Check available kernels.
+44. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
+45. Verify matching dev/header packages for the selected latest-upgrade target are available.
+46. Install selected latest-upgrade kernel and matching dev/header packages, then reboot.
+47. Verify running kernel and installed dev/header packages match the selected latest-upgrade version.
+48. If versions do not match exactly, stop as blocker-fail.
+49. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`.
+50. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
+51. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
+52. Confirm machine is online in `skidamarink` using `cirrusdata`.
+53. SSH and verify MTDI, Galaxy Migrate services/driver are up.
+54. Success-path cleanup only: power off cloned machine.
+55. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory.
+56. Success-path final cleanup checkpoint:
    - Run MCP offline-host cleanup for `skidamarink`.
    - If the cloned VM is still marked online at the end of the test, remove that cloned VM host entry specifically via MCP (target only this test clone host).
    - Because CMC status can lag behind VM deletion/power-off, wait/poll briefly first; if still online, perform targeted MCP host removal for the tested clone.
-56. Blocker-fail path after clone creation:
+57. Blocker-fail path after clone creation:
    - Stop test immediately after recording failure details.
    - Leave cloned VM powered on and present in inventory for manual inspection.
    - Do not run clone power-off/delete steps in blocker-fail path.
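
Reviewer note: the clone-naming and hostname rules in the procedure (base name `aw999-[source-without-atvmxxx-]`, then `-1`, `-2`, `-N` on collision, and hostname with `.` replaced by `-`) can be sketched as below. This is only an illustrative sketch. The function names are hypothetical, the `atvm<digits>-` prefix pattern is an assumption about what `atvmxxx-` stands for, and the set of existing names is assumed to have been fetched from vCenter by some other means.

```python
from __future__ import annotations

import re


def base_clone_name(source_name: str) -> str:
    """Build the base clone name from the source VM name.

    Assumption: the source name starts with "atvm" + digits + "-",
    which is one reading of the `atvmxxx-` placeholder in the test plan.
    """
    stripped = re.sub(r"^atvm\d+-", "", source_name)
    return f"aw999-{stripped}"


def resolve_clone_name(base: str, existing: set[str]) -> str:
    """Return `base` if unused, else the first free `base-N` (N = 1, 2, ...).

    `existing` is assumed to hold the VM names already present in vCenter.
    """
    if base not in existing:
        return base
    n = 1
    while f"{base}-{n}" in existing:
        n += 1
    return f"{base}-{n}"


def clone_hostname(clone_name: str) -> str:
    """Derive the OS hostname: the clone name with every '.' replaced by '-'."""
    return clone_name.replace(".", "-")
```

For example, with a hypothetical source named `atvm123-ubuntu24.04` and `aw999-ubuntu24.04` already present in inventory, the resolved clone name would be `aw999-ubuntu24.04-1` and the hostname `aw999-ubuntu24-04-1`.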