docs(test): move kernel gate pre-clone and clarify same-major/minor upgrade selection

2026-05-12 17:14:40 -04:00
parent 7b20a524fd
commit 7477d18cff

@@ -55,52 +55,57 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
## Test Procedure
1. Remove offline hosts in `skidamarink` using MCP offline-host cleanup.
2. Confirm source host is powered off.
3. Determine base clone name: `aw999-[source-without-atvmxxx-]`.
4. Before cloning, check whether that clone name already exists in vCenter.
5. If the name exists, choose the next available suffixed name: `aw999-[source-without-atvmxxx-]-1`, then `-2`, then `-N` as needed.
6. Clone source VM using the resolved unique clone name on datastore `AutomatedTest-UnitTesting` only.
7. For the clone command destination name, pass only the VM name (for example `aw999-ubuntu24.04-1`), not an inventory path like `/CDSHQ-Eng/vm/...`; set folder separately if needed.
8. Detach the 2 FC PCI adapters from the cloned VM.
9. Power on clone.
10. SSH to `<INITIAL_CLONE_HOST_OR_IP>` using credentials from `/home/aw/code/cds/.env.credentials.local`.
11. Change OS hostname to clone name, replacing `.` with `-`.
12. Convert networking from static IP to DHCP.
13. Remove/clean static IP configuration references.
14. Reboot clone.
15. Find DHCP address and verify it is not `<INITIAL_CLONE_HOST_OR_IP>`.
16. If still `<INITIAL_CLONE_HOST_OR_IP>`, fix static config cleanup and repeat reboot/verify.
17. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
18. Check available kernel versions.
19. Verify at least 2 upgrade candidates exist.
20. If fewer than 2 candidates: stop test, power off clone, delete clone and its disks, end run.
21. Gate check:
- If step 20 triggered a stop condition, execute no further steps.
2. Confirm source host is powered on. If it is powered off, power it on.
3. SSH to the source host and check available kernel versions on the source before cloning.
4. Build source-host kernel candidate list from all available versions (include intermediate versions, not just the latest from `check-update`).
5. Candidate scope rule:
- Include only kernels in the same major OS family as the current machine (no major-version upgrades).
- Prefer candidates within the same minor stream as current OS/kernel when available.
6. Verify at least 2 upgrade candidates exist in the filtered candidate list.
7. If fewer than 2 candidates: hard stop and end run before clone creation.
8. Gate check:
- If step 7 triggered a stop condition, execute no further steps.
- If no stop condition was triggered, continue with the next step.
22. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone.
23. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
24. Wait for initial sync completion.
25. Check available kernels again.
26. Select upgrade target one step above current kernel (not latest).
27. If only 1 available version, stop test.
28. Install selected kernel and reboot.
29. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
30. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
31. Write sample data to source 10GB disk.
32. Trigger sync and confirm tracking status using `cirrusdata`.
33. Uninstall CMC.
34. Post-uninstall cleanup checkpoint:
9. Power off the source host if it was powered on for the kernel check, then confirm it is powered off (required pre-clone state).
10. Determine base clone name: `aw999-[source-without-atvmxxx-]`.
11. Before cloning, check whether that clone name already exists in vCenter.
12. If the name exists, choose the next available suffixed name: `aw999-[source-without-atvmxxx-]-1`, then `-2`, then `-N` as needed.
13. Clone source VM using the resolved unique clone name on datastore `AutomatedTest-UnitTesting` only.
14. For the clone command destination name, pass only the VM name (for example `aw999-ubuntu24.04-1`), not an inventory path like `/CDSHQ-Eng/vm/...`; set folder separately if needed.
15. Detach the 2 FC PCI adapters from the cloned VM.
16. Power on clone.
17. SSH to `<INITIAL_CLONE_HOST_OR_IP>` using credentials from `/home/aw/code/cds/.env.credentials.local`.
18. Change OS hostname to clone name, replacing `.` with `-`.
19. Convert networking from static IP to DHCP.
20. Remove/clean static IP configuration references.
21. Reboot clone.
22. Find DHCP address and verify it is not `<INITIAL_CLONE_HOST_OR_IP>`.
23. If still `<INITIAL_CLONE_HOST_OR_IP>`, fix static config cleanup and repeat reboot/verify.
24. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
25. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone.
26. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
27. Wait for initial sync completion.
28. Check available kernels again using full candidate listing (not latest-only output).
29. Select upgrade target one step above current kernel from the filtered candidate list (same major; same minor preferred).
30. Install selected kernel and reboot.
31. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
32. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
33. Write sample data to source 10GB disk.
34. Trigger sync and confirm tracking status using `cirrusdata`.
35. Uninstall CMC.
36. Post-uninstall cleanup checkpoint:
- Run MCP offline-host cleanup for `skidamarink`.
- If the cloned VM is still marked online after uninstall, remove that cloned VM host entry specifically.
35. Check available kernels.
36. Upgrade to latest kernel and reboot.
37. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`).
38. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
39. Confirm machine is online in `skidamarink` using `cirrusdata`.
40. SSH and verify MTDI, Galaxy Migrate services/driver are up.
41. Power off cloned machine.
42. Delete cloned VM and its disks from vCenter inventory.
43. Final cleanup checkpoint:
37. Check available kernels.
38. Select the newest kernel in the filtered candidate list as the latest-upgrade target (same major required; same minor preferred).
39. Upgrade to the selected target kernel and reboot.
40. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`).
41. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
42. Confirm machine is online in `skidamarink` using `cirrusdata`.
43. SSH and verify MTDI, Galaxy Migrate services/driver are up.
44. Power off cloned machine.
45. Delete cloned VM and its disks from vCenter inventory.
46. Final cleanup checkpoint:
- Run MCP offline-host cleanup for `skidamarink`.
- If the cloned VM is still marked online at the end of the test, remove that cloned VM host entry specifically.
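The clone-naming steps in the procedure above (base name, `-1`/`-2`/`-N` suffixing, and the `.` to `-` hostname rewrite) can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, the `atvm<id>-` prefix pattern is an assumption about how source VM names look, and `existing_names` stands in for whatever the vCenter inventory lookup returns.

```python
import re


def base_clone_name(source_vm: str) -> str:
    """Build `aw999-[source-without-atvmxxx-]` from a source VM name."""
    # Assumption: the source prefix has the shape "atvm<id>-".
    stripped = re.sub(r"^atvm[^-]*-", "", source_vm)
    return f"aw999-{stripped}"


def resolve_clone_name(base: str, existing_names: set[str]) -> str:
    """Return `base` if it is free, else the first free `base-N` (N = 1, 2, ...)."""
    if base not in existing_names:
        return base
    suffix = 1
    while f"{base}-{suffix}" in existing_names:
        suffix += 1
    return f"{base}-{suffix}"


def os_hostname(clone_name: str) -> str:
    """OS hostname for the clone: the clone name with `.` replaced by `-`."""
    return clone_name.replace(".", "-")


if __name__ == "__main__":
    # Hypothetical inventory: the base name is already taken.
    inventory = {"aw999-ubuntu24.04"}
    base = base_clone_name("atvm123-ubuntu24.04")
    clone = resolve_clone_name(base, inventory)
    print(clone, os_hostname(clone))
```

The resolved value is a bare VM name, which matches the requirement to pass only the name (not an inventory path) as the clone destination.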
@@ -126,10 +131,10 @@ Use one cumulative results file and append one new section per tested host.
### Kernel / OS Tracking
- Start OS version:
- Start kernel version:
- Kernel list before first upgrade:
- Kernel list before first upgrade (full candidate list, filtered by scope rule):
- Kernel selected for step-up upgrade:
- Kernel after step-up reboot:
- Kernel list before latest upgrade:
- Kernel list before latest upgrade (full candidate list, filtered by scope rule):
- Kernel selected for latest upgrade:
- Kernel after latest reboot:
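
As a rough illustration of how the filtered candidate list, the 2-candidate gate, and the two selections recorded above could be derived, here is a minimal sketch. It rests on several assumptions that the procedure does not state: kernel versions compare by their leading numeric fields, "major" means the first field and "minor stream" the first two, and the same-minor list is used whenever it is non-empty. All function names are hypothetical.

```python
import re


def version_key(version: str) -> tuple[int, ...]:
    """Leading numeric fields, e.g. "5.15.0-101-generic" -> (5, 15, 0, 101)."""
    prefix = re.match(r"[\d.\-]*", version).group(0)
    return tuple(int(part) for part in re.findall(r"\d+", prefix))


def filter_candidates(current: str, available: list[str]) -> list[str]:
    """Upgrade candidates newer than `current`, oldest first.

    Same major is required; same minor is preferred when any exist
    (assumed reading of the candidate scope rule).
    """
    cur = version_key(current)
    newer = [v for v in available if version_key(v) > cur]
    same_major = [v for v in newer if version_key(v)[:1] == cur[:1]]
    same_minor = [v for v in same_major if version_key(v)[:2] == cur[:2]]
    chosen = same_minor if same_minor else same_major
    return sorted(chosen, key=version_key)


def gate_passes(candidates: list[str], minimum: int = 2) -> bool:
    """Pre-clone gate: at least `minimum` upgrade candidates must exist."""
    return len(candidates) >= minimum


def step_up_target(candidates: list[str]) -> str:
    """One step above the current kernel (not the latest)."""
    return candidates[0]


def latest_target(candidates: list[str]) -> str:
    """Newest kernel within the filtered scope."""
    return candidates[-1]
```

With a hypothetical current kernel of `5.15.0-101-generic` and an available list that also contains `5.15.0-105-generic`, `5.15.0-113-generic`, and `6.5.0-35-generic`, this sketch keeps only the two `5.15` entries, passes the gate, and picks `-105` for the step-up and `-113` for the latest upgrade. Distributions whose kernel strings do not sort by leading numeric fields would need a different key.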