Move one-time disk prep before first CMC install

Cirrus Codex
2026-05-14 17:50:38 -04:00
parent eda18702f6
commit 9e5203cb60


@@ -106,45 +106,46 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
26. SSH to the new live clone IP and verify the DHCP state. 26. SSH to the new live clone IP and verify the DHCP state.
27. If the clone still reports the previous static IP, fix static config cleanup and repeat reboot/verify. 27. If the clone still reports the previous static IP, fix static config cleanup and repeat reboot/verify.
28. Continue all remaining steps using the live DHCP IP from vCenter and credentials from `/home/aw/code/cds/.env.credentials.local`. 28. Continue all remaining steps using the live DHCP IP from vCenter and credentials from `/home/aw/code/cds/.env.credentials.local`.
29. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone, always adding `-no-prebuilt-mtdi-nexus`. 29. Before the first CMC install, wipe the 10GB source disk so it has no filesystem, partition table, or other residual content. This disk prep is one-time only and must not be repeated in later stages of the test.
30. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`. 30. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone, always adding `-no-prebuilt-mtdi-nexus`.
31. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail. 31. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
32. Wait for initial sync completion. 32. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
33. Check available kernels again using full candidate listing (not latest-only output). 33. Wait for initial sync completion.
34. Select first-upgrade target from filtered candidate list (same major; same minor preferred), ensuring it is not the latest candidate. 34. Check available kernels again using full candidate listing (not latest-only output).
35. Verify matching dev/header packages for the selected first-upgrade target are available. 35. Select first-upgrade target from filtered candidate list (same major; same minor preferred), ensuring it is not the latest candidate.
36. Install selected first-upgrade kernel and matching dev/header packages, then reboot. 36. Verify matching dev/header packages for the selected first-upgrade target are available.
37. Query vCenter guest-tools again for the live clone IP after reboot. 37. Install selected first-upgrade kernel and matching dev/header packages, then reboot.
38. SSH to the rebooted clone via the live vCenter IP and verify running kernel and installed dev/header package versions match the selected first-upgrade version. 38. Query vCenter guest-tools again for the live clone IP after reboot.
39. If versions do not match exactly, stop as blocker-fail. 39. SSH to the rebooted clone via the live vCenter IP and verify running kernel and installed dev/header package versions match the selected first-upgrade version.
40. After reboot, verify clone is online in `skidamarink` using `cirrusdata`. 40. If versions do not match exactly, stop as blocker-fail.
41. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up. 41. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
42. Write sample data to source 10GB disk. 42. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
43. Trigger sync and confirm tracking status using `cirrusdata`. 43. Write sample data to source 10GB disk.
44. Uninstall CMC. 44. Trigger sync and confirm tracking status using `cirrusdata`.
45. Post-uninstall cleanup checkpoint: 45. Uninstall CMC.
46. Post-uninstall cleanup checkpoint:
- Run MCP offline-host cleanup for `skidamarink`. - Run MCP offline-host cleanup for `skidamarink`.
- If the cloned VM is still marked online after uninstall, remove that cloned VM host entry specifically via MCP (target only this test clone host). - If the cloned VM is still marked online after uninstall, remove that cloned VM host entry specifically via MCP (target only this test clone host).
- Because CMC status can lag behind VM state, poll briefly for status transition; if still online, perform targeted MCP host removal for the tested clone. - Because CMC status can lag behind VM state, poll briefly for status transition; if still online, perform targeted MCP host removal for the tested clone.
46. Check available kernels. 47. Check available kernels.
47. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred). 48. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
48. Verify matching dev/header packages for the selected latest-upgrade target are available. 49. Verify matching dev/header packages for the selected latest-upgrade target are available.
49. Install selected latest-upgrade kernel and matching dev/header packages, then reboot. 50. Install selected latest-upgrade kernel and matching dev/header packages, then reboot.
50. Query vCenter guest-tools again for the live clone IP after reboot. 51. Query vCenter guest-tools again for the live clone IP after reboot.
51. SSH to the rebooted clone via the live vCenter IP and verify running kernel and installed dev/header package versions match the selected latest-upgrade version. 52. SSH to the rebooted clone via the live vCenter IP and verify running kernel and installed dev/header package versions match the selected latest-upgrade version.
52. If versions do not match exactly, stop as blocker-fail. 53. If versions do not match exactly, stop as blocker-fail.
53. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`. 54. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`.
54. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion. 55. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
55. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail. 56. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
56. Confirm machine is online in `skidamarink` using `cirrusdata`. 57. Confirm machine is online in `skidamarink` using `cirrusdata`.
57. SSH and verify MTDI, Galaxy Migrate services/driver are up. 58. SSH and verify MTDI, Galaxy Migrate services/driver are up.
58. Success-path cleanup only: power off cloned machine. 59. Success-path cleanup only: power off cloned machine.
59. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory. 60. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory.
60. Success-path final cleanup checkpoint: 61. Success-path final cleanup checkpoint:
- Run MCP offline-host cleanup for `skidamarink`. - Run MCP offline-host cleanup for `skidamarink`.
- If the cloned VM is still marked online at the end of the test, remove that cloned VM host entry specifically via MCP (target only this test clone host). - If the cloned VM is still marked online at the end of the test, remove that cloned VM host entry specifically via MCP (target only this test clone host).
- Because CMC status can lag behind VM deletion/power-off, wait/poll briefly first; if still online, perform targeted MCP host removal for the tested clone. - Because CMC status can lag behind VM deletion/power-off, wait/poll briefly first; if still online, perform targeted MCP host removal for the tested clone.
61. Blocker-fail path after clone creation: 62. Blocker-fail path after clone creation:
- Stop test immediately after recording failure details. - Stop test immediately after recording failure details.
- Leave cloned VM powered on and present in inventory for manual inspection. - Leave cloned VM powered on and present in inventory for manual inspection.
- Do not run clone power-off/delete steps in blocker-fail path. - Do not run clone power-off/delete steps in blocker-fail path.
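The two kernel target selections in the steps above (first-upgrade and latest-upgrade) can be sketched as follows. This is only an illustration and not part of the runbook: the function names, the naive digit-based version parsing, and the tie-break for the first-upgrade target (newest candidate older than the latest-upgrade target) are all assumptions, and the version strings are made-up examples.

```python
import re


def parse_version(version):
    """Naive parse (assumption): every run of digits becomes one integer."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def filter_candidates(running, candidates):
    """Keep candidates newer than the running kernel with the same major version."""
    run = parse_version(running)
    kept = [
        c for c in candidates
        if parse_version(c)[0] == run[0] and parse_version(c) > run
    ]
    return sorted(kept, key=parse_version)


def pick_latest_upgrade(running, candidates):
    """Same major required; same minor preferred; newest match wins."""
    run = parse_version(running)
    ordered = filter_candidates(running, candidates)
    if not ordered:
        return None
    same_minor = [c for c in ordered if parse_version(c)[1] == run[1]]
    return (same_minor or ordered)[-1]


def pick_first_upgrade(running, candidates):
    """Pick a target that is strictly older than the latest-upgrade target.

    Assumption: among the eligible candidates, take the newest one,
    preferring the running kernel's minor version.
    """
    latest = pick_latest_upgrade(running, candidates)
    if latest is None:
        return None
    run = parse_version(running)
    older = [
        c for c in filter_candidates(running, candidates)
        if parse_version(c) < parse_version(latest)
    ]
    if not older:
        return None  # nothing between running and latest: no staged upgrade possible
    same_minor = [c for c in older if parse_version(c)[1] == run[1]]
    return (same_minor or older)[-1]


# Hypothetical full candidate listing (not latest-only output).
running_kernel = "5.15.0-101"
available = [
    "5.15.0-97", "5.15.0-105", "5.15.0-107",
    "5.15.0-112", "5.19.0-50", "6.2.0-20",
]
first_target = pick_first_upgrade(running_kernel, available)
latest_target = pick_latest_upgrade(running_kernel, available)
```

If `pick_first_upgrade` returns `None`, the full candidate listing has no usable intermediate kernel, and the two-stage upgrade cannot be exercised on that host as described.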
@@ -188,6 +189,7 @@ Use one cumulative results file and append one new section per tested host.
- Clone created / FC PCI detached: `PASS|FAIL` - notes - Clone created / FC PCI detached: `PASS|FAIL` - notes
- Hostname/IP DHCP conversion: `PASS|FAIL` - notes - Hostname/IP DHCP conversion: `PASS|FAIL` - notes
- CMC reinstall #1: `PASS|FAIL` - notes - CMC reinstall #1: `PASS|FAIL` - notes
- 10GB source disk prep before first CMC install: `PASS|FAIL` - notes
- Local migration #1 (10GB -> 11GB) initial sync: `PASS|FAIL` - notes - Local migration #1 (10GB -> 11GB) initial sync: `PASS|FAIL` - notes
- Step-up kernel upgrade: `PASS|FAIL` - notes - Step-up kernel upgrade: `PASS|FAIL` - notes
- Step-up dev/header package match check: `PASS|FAIL` - notes - Step-up dev/header package match check: `PASS|FAIL` - notes