docs: make FC passthrough detach mandatory in kernel upgrade test

Cirrus Codex
2026-05-13 17:20:37 -04:00
parent c11878a05c
commit a53e2ee068

@@ -62,6 +62,8 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 - If the clone is missing or identity is uncertain, stop and do not delete any other VM.
 - If any blocker occurs after clone creation, stop the test and leave the cloned VM powered on for manual inspection.
 - Do not delete or power off the clone on blocker-fail outcomes.
+- Detaching the 2 FC PCI passthrough adapters from the cloned VM is mandatory before any guest boot or guest-side change.
+- Verify in vCenter that both FC passthrough devices are absent before proceeding past the clone-prep stage.
 - Cleanup actions that remove hosts from CMC must target only the cloned host used in the current test run.
 - Treat migration session creation failures (for either migration #1 or migration #2) as blocker-fail events.
@@ -85,52 +87,53 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 13. Clone source VM using the resolved unique clone name on datastore `AutomatedTest-UnitTesting` only.
 14. For the clone command destination name, pass only the VM name (for example `aw999-ubuntu24.04-1`), not an inventory path like `/CDSHQ-Eng/vm/...`; set folder separately if needed.
 15. Detach the 2 FC PCI adapters from the cloned VM.
-16. Power on clone.
-17. SSH to `<INITIAL_CLONE_HOST_OR_IP>` using credentials from `/home/aw/code/cds/.env.credentials.local`.
-18. Change OS hostname to clone name, replacing `.` with `-`.
-19. Convert networking from static IP to DHCP.
-20. Remove/clean static IP configuration references.
-21. Reboot clone.
-22. Find DHCP address and verify it is not `<INITIAL_CLONE_HOST_OR_IP>`.
-23. If still `<INITIAL_CLONE_HOST_OR_IP>`, fix static config cleanup and repeat reboot/verify.
-24. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
-25. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone, always adding `-no-prebuilt-mtdi-nexus`.
-26. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
-27. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
-28. Wait for initial sync completion.
-29. Check available kernels again using full candidate listing (not latest-only output).
-30. Select first-upgrade target from filtered candidate list (same major; same minor preferred), ensuring it is not the latest candidate.
-31. Verify matching dev/header packages for the selected first-upgrade target are available.
-32. Install selected first-upgrade kernel and matching dev/header packages, then reboot.
-33. Verify running kernel and installed dev/header packages match the selected first-upgrade version.
-34. If versions do not match exactly, stop as blocker-fail.
-35. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
-36. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
-37. Write sample data to source 10GB disk.
-38. Trigger sync and confirm tracking status using `cirrusdata`.
-39. Uninstall CMC.
-40. Post-uninstall cleanup checkpoint:
+16. Verify in vCenter that both FC passthrough devices are no longer present on the clone.
+17. Power on clone.
+18. SSH to `<INITIAL_CLONE_HOST_OR_IP>` using credentials from `/home/aw/code/cds/.env.credentials.local`.
+19. Change OS hostname to clone name, replacing `.` with `-`.
+20. Convert networking from static IP to DHCP.
+21. Remove/clean static IP configuration references.
+22. Reboot clone.
+23. Find DHCP address and verify it is not `<INITIAL_CLONE_HOST_OR_IP>`.
+24. If still `<INITIAL_CLONE_HOST_OR_IP>`, fix static config cleanup and repeat reboot/verify.
+25. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
+26. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone, always adding `-no-prebuilt-mtdi-nexus`.
+27. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
+28. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
+29. Wait for initial sync completion.
+30. Check available kernels again using full candidate listing (not latest-only output).
+31. Select first-upgrade target from filtered candidate list (same major; same minor preferred), ensuring it is not the latest candidate.
+32. Verify matching dev/header packages for the selected first-upgrade target are available.
+33. Install selected first-upgrade kernel and matching dev/header packages, then reboot.
+34. Verify running kernel and installed dev/header packages match the selected first-upgrade version.
+35. If versions do not match exactly, stop as blocker-fail.
+36. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
+37. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
+38. Write sample data to source 10GB disk.
+39. Trigger sync and confirm tracking status using `cirrusdata`.
+40. Uninstall CMC.
+41. Post-uninstall cleanup checkpoint:
 - Run MCP offline-host cleanup for `skidamarink`.
 - If the cloned VM is still marked online after uninstall, remove that cloned VM host entry specifically via MCP (target only this test clone host).
 - Because CMC status can lag behind VM state, poll briefly for status transition; if still online, perform targeted MCP host removal for the tested clone.
-41. Check available kernels.
-42. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
-43. Verify matching dev/header packages for the selected latest-upgrade target are available.
-44. Install selected latest-upgrade kernel and matching dev/header packages, then reboot.
-45. Verify running kernel and installed dev/header packages match the selected latest-upgrade version.
-46. If versions do not match exactly, stop as blocker-fail.
-47. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`.
-48. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
-49. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
-50. Confirm machine is online in `skidamarink` using `cirrusdata`.
-51. SSH and verify MTDI, Galaxy Migrate services/driver are up.
-52. Success-path cleanup only: power off cloned machine.
-53. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory.
-54. Success-path final cleanup checkpoint:
+42. Check available kernels.
+43. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
+44. Verify matching dev/header packages for the selected latest-upgrade target are available.
+45. Install selected latest-upgrade kernel and matching dev/header packages, then reboot.
+46. Verify running kernel and installed dev/header packages match the selected latest-upgrade version.
+47. If versions do not match exactly, stop as blocker-fail.
+48. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`), always adding `-no-prebuilt-mtdi-nexus`.
+49. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
+50. If migration session creation fails (including API/service errors such as 5xx), hard stop as blocker-fail.
+51. Confirm machine is online in `skidamarink` using `cirrusdata`.
+52. SSH and verify MTDI, Galaxy Migrate services/driver are up.
+53. Success-path cleanup only: power off cloned machine.
+54. Success-path cleanup only: delete cloned VM and its disks from vCenter inventory.
+55. Success-path final cleanup checkpoint:
 - Run MCP offline-host cleanup for `skidamarink`.
 - If the cloned VM is still marked online at the end of the test, remove that cloned VM host entry specifically via MCP (target only this test clone host).
 - Because CMC status can lag behind VM deletion/power-off, wait/poll briefly first; if still online, perform targeted MCP host removal for the tested clone.
-55. Blocker-fail path after clone creation:
+56. Blocker-fail path after clone creation:
 - Stop test immediately after recording failure details.
 - Leave cloned VM powered on and present in inventory for manual inspection.
 - Do not run clone power-off/delete steps in blocker-fail path.
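
The two cleanup checkpoints in the steps above share one pattern: poll briefly for the clone to go offline, and only if it stays online remove that single host entry. A minimal Python sketch of that pattern follows. The callables `is_online` and `remove_host` are hypothetical stand-ins for whatever `cirrusdata` or MCP calls the test harness really uses, and the timeout and interval defaults are illustrative assumptions, not values from the test plan.

```python
import time
from typing import Callable


def wait_until_offline(
    is_online: Callable[[], bool],
    timeout_s: float = 120.0,      # assumed value, not from the test plan
    interval_s: float = 5.0,       # assumed value, not from the test plan
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until is_online() returns False. Return False if the timeout expires first."""
    deadline = clock() + timeout_s
    while True:
        if not is_online():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def cleanup_clone_host(
    clone_host: str,
    is_online: Callable[[str], bool],     # hypothetical status lookup
    remove_host: Callable[[str], None],   # hypothetical targeted host removal
    **poll_kwargs,
) -> str:
    """Remove the host entry only for this clone, and only if it is still online after polling."""
    if wait_until_offline(lambda: is_online(clone_host), **poll_kwargs):
        return "offline"
    remove_host(clone_host)
    return "removed"
```

Passing the clone's name explicitly into `remove_host` keeps the removal scoped to the host used in the current run, which is the constraint the plan places on every cleanup action.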
@@ -139,6 +142,7 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 - Cannot verify clone identity.
 - Cannot detach required FC PCI adapters.
 - Clone cannot be created on datastore `AutomatedTest-UnitTesting`.
+- FC passthrough adapters remain attached after the detach/verification step.
 - DHCP transition cannot be completed (clone remains static at `<INITIAL_CLONE_HOST_OR_IP>`).
 - Kernel upgrade candidate criteria not met.
 - Migration session creation failed (including API/service errors such as HTTP 5xx or equivalent backend unavailability).
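
The "kernel upgrade candidate criteria" blocker can be made concrete with a small Python sketch of the selection rules (same major required, same minor preferred, first-upgrade target must not be the latest candidate). Several details here are assumptions rather than part of the test plan: version strings are taken to look like Ubuntu's `6.8.0-45-generic`, only candidates newer than the running kernel are kept, and the first-upgrade target is chosen as the oldest remaining candidate. The names `filter_candidates`, `select_first_upgrade`, `select_latest_upgrade`, and `CriteriaNotMet` are hypothetical.

```python
import re
from typing import List, Tuple


class CriteriaNotMet(Exception):
    """Raised when no kernel satisfies the selection rules (a blocker-fail)."""


def parse_kernel(version: str) -> Tuple[int, ...]:
    """Extract leading numeric fields, e.g. '6.8.0-45-generic' -> (6, 8, 0, 45)."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)(?:-(\d+))?", version)
    if m is None:
        raise ValueError(f"unrecognized kernel version: {version!r}")
    return tuple(int(g) for g in m.groups(default="0"))


def filter_candidates(running: str, available: List[str]) -> List[str]:
    """Keep newer kernels with the same major; narrow to the same minor when possible."""
    cur = parse_kernel(running)
    same_major = sorted(
        {v for v in available
         if parse_kernel(v)[0] == cur[0] and parse_kernel(v) > cur},
        key=parse_kernel,
    )
    same_minor = [v for v in same_major if parse_kernel(v)[1] == cur[1]]
    return same_minor or same_major


def select_first_upgrade(candidates: List[str]) -> str:
    """Pick the oldest candidate; need at least two so it is not also the latest."""
    if len(candidates) < 2:
        raise CriteriaNotMet("need at least two candidates for a non-latest target")
    return candidates[0]


def select_latest_upgrade(candidates: List[str]) -> str:
    """Pick the newest candidate from the filtered list."""
    if not candidates:
        raise CriteriaNotMet("no upgrade candidates available")
    return candidates[-1]
```

In this sketch a `CriteriaNotMet` exception corresponds to stopping the run as blocker-fail, and `filter_candidates` would be called again with the new running kernel before the latest-upgrade selection.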