From c11878a05c1b27f05d4184073a64cf3c2c99b9be Mon Sep 17 00:00:00 2001
From: "anthony.wen"
Date: Tue, 12 May 2026 19:57:20 -0400
Subject: [PATCH] docs(test): hard-stop on migration creation failures and preserve clone

---
 tests/cmc-upgrade-kernel-test.md | 58 +++++++++++++++++---------------
 1 file changed, 31 insertions(+), 27 deletions(-)

diff --git a/tests/cmc-upgrade-kernel-test.md b/tests/cmc-upgrade-kernel-test.md
index 3c06673..3bc16ae 100644
--- a/tests/cmc-upgrade-kernel-test.md
+++ b/tests/cmc-upgrade-kernel-test.md
@@ -63,6 +63,7 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 - If any blocker occurs after clone creation, stop the test and leave the cloned VM powered on for manual inspection.
 - Do not delete or power off the clone on blocker-fail outcomes.
 - Cleanup actions that remove hosts from CMC must target only the cloned host used in the current test run.
+- Treat migration session creation failures (for either migration #1 or migration #2) as blocker-fail events.
 
 ## Test Procedure
 1. Remove offline hosts in `skidamarink` using MCP offline-host cleanup.
@@ -95,39 +96,41 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 24. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
 25. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone, always adding `-no-prebuilt-mtdi-nexus`.
 26. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
-27. Wait for initial sync completion.
-28. Check available kernels again using full candidate listing (not latest-only output).
-29. Select first-upgrade target from filtered candidate list (same major; same minor preferred), ensuring it is not the latest candidate.
-30. Verify matching dev/header packages for the selected first-upgrade target are available.
-31. Install selected first-upgrade kernel and matching dev/header packages, then reboot.
-32. Verify running kernel and installed dev/header packages match the selected first-upgrade version.
-33. If versions do not match exactly, stop as blocker-fail.
-34. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
-35. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
-36. Write sample data to source 10GB disk.
-37. Trigger sync and confirm tracking status using `cirrusdata`.
-38. Uninstall CMC.
-39. Post-uninstall cleanup checkpoint:
+27. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
+28. Wait for initial sync completion.
+29. Check available kernels again using full candidate listing (not latest-only output).
+30. Select first-upgrade target from filtered candidate list (same major; same minor preferred), ensuring it is not the latest candidate.
+31. Verify matching dev/header packages for the selected first-upgrade target are available.
+32. Install selected first-upgrade kernel and matching dev/header packages, then reboot.
+33. Verify running kernel and installed dev/header packages match the selected first-upgrade version.
+34. If versions do not match exactly, stop as blocker-fail.
+35. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
+36. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
+37. Write sample data to source 10GB disk.
+38. Trigger sync and confirm tracking status using `cirrusdata`.
+39. Uninstall CMC.
+40. Post-uninstall cleanup checkpoint:
    - Run MCP offline-host cleanup for `skidamarink`.
    - If the cloned VM is still marked online after uninstall, remove that cloned VM host entry specifically via MCP (target only this test clone host).
    - Because CMC status can lag behind VM state, poll briefly for status transition; if still online, perform targeted MCP host removal for the tested clone.
-40. Check available kernels.
-41. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
-42. Verify matching dev/header packages for the selected latest-upgrade target are available.
-43. Install selected latest-upgrade kernel and matching dev/header packages, then reboot.
-44. Verify running kernel and installed dev/header packages match the selected latest-upgrade version.
-45. If versions do not match exactly, stop as blocker-fail.
-46. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`.
-47. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
-48. Confirm machine is online in `skidamarink` using `cirrusdata`.
-49. SSH and verify MTDI, Galaxy Migrate services/driver are up.
-50. Success-path cleanup only: power off cloned machine.
-51. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory.
-52. Success-path final cleanup checkpoint:
+41. Check available kernels.
+42. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
+43. Verify matching dev/header packages for the selected latest-upgrade target are available.
+44. Install selected latest-upgrade kernel and matching dev/header packages, then reboot.
+45. Verify running kernel and installed dev/header packages match the selected latest-upgrade version.
+46. If versions do not match exactly, stop as blocker-fail.
+47. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`.
+48. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
+49. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
+50. Confirm machine is online in `skidamarink` using `cirrusdata`.
+51. SSH and verify MTDI, Galaxy Migrate services/driver are up.
+52. Success-path cleanup only: power off cloned machine.
+53. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory.
+54. Success-path final cleanup checkpoint:
    - Run MCP offline-host cleanup for `skidamarink`.
    - If the cloned VM is still marked online at the end of the test, remove that cloned VM host entry specifically via MCP (target only this test clone host).
    - Because CMC status can lag behind VM deletion/power-off, wait/poll briefly first; if still online, perform targeted MCP host removal for the tested clone.
-53. Blocker-fail path after clone creation:
+55. Blocker-fail path after clone creation:
    - Stop test immediately after recording failure details.
    - Leave cloned VM powered on and present in inventory for manual inspection.
    - Do not run clone power-off/delete steps in blocker-fail path.
@@ -138,6 +141,7 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 - Clone cannot be created on datastore `AutomatedTest-UnitTesting`.
 - DHCP transition cannot be completed (clone remains static at ``).
 - Kernel upgrade candidate criteria not met.
+- Migration session creation failed (including API/service errors such as HTTP 5xx or equivalent backend unavailability).
 - Any critical migration/service validation failure that blocks continuation.
 
 ## Per-Host Test Result Record
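
Note (not part of the patch): the hard-stop rule added for migration #1 and migration #2 could be automated along the following lines. This is a minimal sketch under stated assumptions, not how the test is actually driven: `BlockerFail`, `CreateResult`, `require_migration_created`, and `run_with_clone_preserved` are hypothetical names, and the `create` callable stands in for whatever wraps the `cirrusdata` call, whose output format the patch does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class BlockerFail(Exception):
    """Raised to hard-stop the run; the clone must be left powered on."""


@dataclass
class CreateResult:
    # Hypothetical shape: the real tool's output is not described in the patch.
    ok: bool
    http_status: Optional[int] = None
    detail: str = ""


def require_migration_created(
    label: str, create: Callable[[], CreateResult]
) -> CreateResult:
    """Run `create` and turn any creation failure into a BlockerFail."""
    try:
        result = create()
    except Exception as exc:  # backend unavailable, transport error, etc.
        raise BlockerFail(f"{label}: creation raised {exc!r}") from exc
    # A 5xx status counts as a failure even if the wrapper reported ok=True.
    server_error = result.http_status is not None and result.http_status >= 500
    if not result.ok or server_error:
        raise BlockerFail(
            f"{label}: migration session creation failed "
            f"(status={result.http_status}, detail={result.detail!r})"
        )
    return result


def run_with_clone_preserved(
    steps: Callable[[], None], cleanup: Callable[[], None]
) -> str:
    """Run success-path cleanup only when no blocker occurred."""
    try:
        steps()
    except BlockerFail:
        # No power-off or delete: the clone stays up for manual inspection.
        return "blocker-fail"
    cleanup()
    return "success"
```

Raising a dedicated exception keeps the "do not power off or delete the clone" rule in one place: success-path cleanup is reachable only when every step returned normally.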