diff --git a/tests/cmc-upgrade-kernel-test.md b/tests/cmc-upgrade-kernel-test.md
index 3bc16ae..2f0b7ae 100644
--- a/tests/cmc-upgrade-kernel-test.md
+++ b/tests/cmc-upgrade-kernel-test.md
@@ -62,6 +62,8 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 - If the clone is missing or identity is uncertain, stop and do not delete any other VM.
 - If any blocker occurs after clone creation, stop the test and leave the cloned VM powered on for manual inspection.
 - Do not delete or power off the clone on blocker-fail outcomes.
+- Detaching the 2 FC PCI passthrough adapters from the cloned VM is mandatory before any guest boot or guest-side change.
+- Verify in vCenter that both FC passthrough devices are absent before proceeding past the clone-prep stage.
 - Cleanup actions that remove hosts from CMC must target only the cloned host used in the current test run.
 - Treat migration session creation failures (for either migration #1 or migration #2) as blocker-fail events.
 
@@ -85,52 +87,53 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 13. Clone source VM using the resolved unique clone name on datastore `AutomatedTest-UnitTesting` only.
 14. For the clone command destination name, pass only the VM name (for example `aw999-ubuntu24.04-1`), not an inventory path like `/CDSHQ-Eng/vm/...`; set folder separately if needed.
 15. Detach the 2 FC PCI adapters from the cloned VM.
-16. Power on clone.
-17. SSH to `` using credentials from `/home/aw/code/cds/.env.credentials.local`.
-18. Change OS hostname to clone name, replacing `.` with `-`.
-19. Convert networking from static IP to DHCP.
-20. Remove/clean static IP configuration references.
-21. Reboot clone.
-22. Find DHCP address and verify it is not ``.
-23. If still ``, fix static config cleanup and repeat reboot/verify.
-24. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
-25. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone, always adding `-no-prebuilt-mtdi-nexus`.
-26. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
-27. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
-28. Wait for initial sync completion.
-29. Check available kernels again using full candidate listing (not latest-only output).
-30. Select first-upgrade target from filtered candidate list (same major; same minor preferred), ensuring it is not the latest candidate.
-31. Verify matching dev/header packages for the selected first-upgrade target are available.
-32. Install selected first-upgrade kernel and matching dev/header packages, then reboot.
-33. Verify running kernel and installed dev/header packages match the selected first-upgrade version.
-34. If versions do not match exactly, stop as blocker-fail.
-35. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
-36. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
-37. Write sample data to source 10GB disk.
-38. Trigger sync and confirm tracking status using `cirrusdata`.
-39. Uninstall CMC.
-40. Post-uninstall cleanup checkpoint:
+16. Verify in vCenter that both FC passthrough devices are no longer present on the clone.
+17. Power on clone.
+18. SSH to `` using credentials from `/home/aw/code/cds/.env.credentials.local`.
+19. Change OS hostname to clone name, replacing `.` with `-`.
+20. Convert networking from static IP to DHCP.
+21. Remove/clean static IP configuration references.
+22. Reboot clone.
+23. Find DHCP address and verify it is not ``.
+24. If still ``, fix static config cleanup and repeat reboot/verify.
+25. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
+26. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone, always adding `-no-prebuilt-mtdi-nexus`.
+27. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
+28. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
+29. Wait for initial sync completion.
+30. Check available kernels again using full candidate listing (not latest-only output).
+31. Select first-upgrade target from filtered candidate list (same major; same minor preferred), ensuring it is not the latest candidate.
+32. Verify matching dev/header packages for the selected first-upgrade target are available.
+33. Install selected first-upgrade kernel and matching dev/header packages, then reboot.
+34. Verify running kernel and installed dev/header packages match the selected first-upgrade version.
+35. If versions do not match exactly, stop as blocker-fail.
+36. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
+37. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
+38. Write sample data to source 10GB disk.
+39. Trigger sync and confirm tracking status using `cirrusdata`.
+40. Uninstall CMC.
+41. Post-uninstall cleanup checkpoint:
    - Run MCP offline-host cleanup for `skidamarink`.
    - If the cloned VM is still marked online after uninstall, remove that cloned VM host entry specifically via MCP (target only this test clone host).
    - Because CMC status can lag behind VM state, poll briefly for status transition; if still online, perform targeted MCP host removal for the tested clone.
-41. Check available kernels.
-42. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
-43. Verify matching dev/header packages for the selected latest-upgrade target are available.
-44. Install selected latest-upgrade kernel and matching dev/header packages, then reboot.
-45. Verify running kernel and installed dev/header packages match the selected latest-upgrade version.
-46. If versions do not match exactly, stop as blocker-fail.
-47. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`.
-48. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
-49. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
-50. Confirm machine is online in `skidamarink` using `cirrusdata`.
-51. SSH and verify MTDI, Galaxy Migrate services/driver are up.
-52. Success-path cleanup only: power off cloned machine.
-53. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory.
-54. Success-path final cleanup checkpoint:
+42. Check available kernels.
+43. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
+44. Verify matching dev/header packages for the selected latest-upgrade target are available.
+45. Install selected latest-upgrade kernel and matching dev/header packages, then reboot.
+46. Verify running kernel and installed dev/header packages match the selected latest-upgrade version.
+47. If versions do not match exactly, stop as blocker-fail.
+48. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`.
+49. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
+50. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
+51. Confirm machine is online in `skidamarink` using `cirrusdata`.
+52. SSH and verify MTDI, Galaxy Migrate services/driver are up.
+53. Success-path cleanup only: power off cloned machine.
+54. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory.
+55. Success-path final cleanup checkpoint:
    - Run MCP offline-host cleanup for `skidamarink`.
    - If the cloned VM is still marked online at the end of the test, remove that cloned VM host entry specifically via MCP (target only this test clone host).
    - Because CMC status can lag behind VM deletion/power-off, wait/poll briefly first; if still online, perform targeted MCP host removal for the tested clone.
-55. Blocker-fail path after clone creation:
+56. Blocker-fail path after clone creation:
    - Stop test immediately after recording failure details.
    - Leave cloned VM powered on and present in inventory for manual inspection.
    - Do not run clone power-off/delete steps in blocker-fail path.
@@ -139,6 +142,7 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 - Cannot verify clone identity.
 - Cannot detach required FC PCI adapters.
 - Clone cannot be created on datastore `AutomatedTest-UnitTesting`.
+- FC passthrough adapters remain attached after the detach/verification step.
 - DHCP transition cannot be completed (clone remains static at ``).
 - Kernel upgrade candidate criteria not met.
 - Migration session creation failed (including API/service errors such as HTTP 5xx or equivalent backend unavailability).
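
The kernel-selection steps in this plan (a first-upgrade target that is not the latest candidate, then a latest-upgrade target, both on the same major version with the same minor preferred) could be sketched roughly as below. This is only an illustration, not part of the test plan: `parse_version` and `select_targets` are hypothetical helper names, the version strings are assumed to be Ubuntu-style (for example `6.8.0-45-generic`), and choosing the oldest newer candidate as the first-upgrade target is an assumption, since the plan only requires that it not be the latest.

```python
import re


def parse_version(version):
    """Return the numeric parts of a kernel version string as a tuple.

    Assumes Ubuntu-style names such as '6.8.0-45-generic' -> (6, 8, 0, 45).
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


def select_targets(running, candidates):
    """Pick (first_upgrade, latest_upgrade) from a full candidate listing.

    Hypothetical helper: same major is required, same minor is preferred,
    and the first-upgrade target is never the latest candidate.
    """
    current = parse_version(running)

    # Keep only candidates on the same major version that are newer than
    # the running kernel, sorted from oldest to newest.
    newer = sorted(
        {c for c in candidates
         if parse_version(c)[0] == current[0] and parse_version(c) > current},
        key=parse_version,
    )

    # Prefer the same minor version when it still leaves two distinct targets.
    same_minor = [c for c in newer if parse_version(c)[1] == current[1]]
    pool = same_minor if len(same_minor) >= 2 else newer

    if len(pool) < 2:
        # Maps to the "Kernel upgrade candidate criteria not met" blocker.
        raise ValueError("Kernel upgrade candidate criteria not met")

    # Assumption: oldest newer kernel first, newest kernel last.
    return pool[0], pool[-1]


if __name__ == "__main__":
    first, latest = select_targets(
        "6.8.0-31-generic",
        ["6.8.0-31-generic", "6.8.0-41-generic", "6.8.0-45-generic"],
    )
    print(first, latest)
```

Raising an error when fewer than two candidates qualify keeps the sketch aligned with the blocker-fail rule rather than silently reusing one kernel for both upgrade stages.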