docs(test): move kernel gate pre-clone and clarify same-major/minor upgrade selection

2026-05-12 17:14:40 -04:00
parent 7b20a524fd
commit 7477d18cff


@@ -55,52 +55,57 @@ Validate CMC behavior across staged kernel upgrades on a cloned VM, including re
 ## Test Procedure
 1. Remove offline hosts in `skidamarink` using MCP offline-host cleanup.
-2. Confirm source host is powered off.
-3. Determine base clone name: `aw999-[source-without-atvmxxx-]`.
-4. Before cloning, check whether that clone name already exists in vCenter.
-5. If the name exists, choose the next available suffixed name: `aw999-[source-without-atvmxxx-]-1`, then `-2`, then `-N` as needed.
-6. Clone source VM using the resolved unique clone name on datastore `AutomatedTest-UnitTesting` only.
-7. For the clone command destination name, pass only the VM name (for example `aw999-ubuntu24.04-1`), not an inventory path like `/CDSHQ-Eng/vm/...`; set folder separately if needed.
-8. Detach the 2 FC PCI adapters from the cloned VM.
-9. Power on clone.
-10. SSH to `<INITIAL_CLONE_HOST_OR_IP>` using credentials from `/home/aw/code/cds/.env.credentials.local`.
-11. Change OS hostname to clone name, replacing `.` with `-`.
-12. Convert networking from static IP to DHCP.
-13. Remove/clean static IP configuration references.
-14. Reboot clone.
-15. Find DHCP address and verify it is not `<INITIAL_CLONE_HOST_OR_IP>`.
-16. If still `<INITIAL_CLONE_HOST_OR_IP>`, fix static config cleanup and repeat reboot/verify.
-17. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
-18. Check available kernel versions.
-19. Verify at least 2 upgrade candidates exist.
-20. If fewer than 2 candidates: stop test, power off clone, delete clone and its disks, end run.
-21. Gate check:
-   - If step 20 triggered a stop condition, execute no further steps.
+2. Confirm source host is powered on. If it is powered off, power it on.
+3. SSH to the source host and check available kernel versions on the source before cloning.
+4. Build source-host kernel candidate list from all available versions (include intermediate versions, not just the latest from `check-update`).
+5. Candidate scope rule:
+   - Include only kernels in the same major OS family as the current machine (no major-version upgrades).
+   - Prefer candidates within the same minor stream as current OS/kernel when available.
+6. Verify at least 2 upgrade candidates exist in the filtered candidate list.
+7. If fewer than 2 candidates: hard stop and end run before clone creation.
+8. Gate check:
+   - If step 7 triggered a stop condition, execute no further steps.
    - If no stop condition was triggered, continue with the next step.
-22. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone.
-23. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
-24. Wait for initial sync completion.
-25. Check available kernels again.
-26. Select upgrade target one step above current kernel (not latest).
-27. If only 1 available version, stop test.
-28. Install selected kernel and reboot.
-29. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
-30. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
-31. Write sample data to source 10GB disk.
-32. Trigger sync and confirm tracking status using `cirrusdata`.
-33. Uninstall CMC.
-34. Post-uninstall cleanup checkpoint:
+9. Confirm source host is powered off (required pre-clone state).
+10. Determine base clone name: `aw999-[source-without-atvmxxx-]`.
+11. Before cloning, check whether that clone name already exists in vCenter.
+12. If the name exists, choose the next available suffixed name: `aw999-[source-without-atvmxxx-]-1`, then `-2`, then `-N` as needed.
+13. Clone source VM using the resolved unique clone name on datastore `AutomatedTest-UnitTesting` only.
+14. For the clone command destination name, pass only the VM name (for example `aw999-ubuntu24.04-1`), not an inventory path like `/CDSHQ-Eng/vm/...`; set folder separately if needed.
+15. Detach the 2 FC PCI adapters from the cloned VM.
+16. Power on clone.
+17. SSH to `<INITIAL_CLONE_HOST_OR_IP>` using credentials from `/home/aw/code/cds/.env.credentials.local`.
+18. Change OS hostname to clone name, replacing `.` with `-`.
+19. Convert networking from static IP to DHCP.
+20. Remove/clean static IP configuration references.
+21. Reboot clone.
+22. Find DHCP address and verify it is not `<INITIAL_CLONE_HOST_OR_IP>`.
+23. If still `<INITIAL_CLONE_HOST_OR_IP>`, fix static config cleanup and repeat reboot/verify.
+24. Continue all remaining steps using DHCP IP and credentials from `/home/aw/code/cds/.env.credentials.local`.
+25. Using `cirrusdata` (`gcstage`, project `skidamarink`), reinstall CMC on clone.
+26. Create local migration from 10GB source disk to 11GB destination disk using `cirrusdata`.
+27. Wait for initial sync completion.
+28. Check available kernels again using full candidate listing (not latest-only output).
+29. Select upgrade target one step above current kernel from the filtered candidate list (same major; same minor preferred).
+30. Install selected kernel and reboot.
+31. After reboot, verify clone is online in `skidamarink` using `cirrusdata`.
+32. SSH to clone and verify MTDI, Galaxy Migrate services/driver are up.
+33. Write sample data to source 10GB disk.
+34. Trigger sync and confirm tracking status using `cirrusdata`.
+35. Uninstall CMC.
+36. Post-uninstall cleanup checkpoint:
    - Run MCP offline-host cleanup for `skidamarink`.
    - If the cloned VM is still marked online after uninstall, remove that cloned VM host entry specifically.
-35. Check available kernels.
-36. Upgrade to latest kernel and reboot.
-37. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`).
-38. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
-39. Confirm machine is online in `skidamarink` using `cirrusdata`.
-40. SSH and verify MTDI, Galaxy Migrate services/driver are up.
-41. Power off cloned machine.
-42. Delete cloned VM and its disks from vCenter inventory.
-43. Final cleanup checkpoint:
+37. Check available kernels.
+38. Select latest-upgrade target kernel from the filtered candidate list (same major required; same minor preferred).
+39. Upgrade to selected latest target kernel and reboot.
+40. Reinstall CMC via `cirrusdata` (`gcstage`, `skidamarink`).
+41. Create a local migration (10GB -> 11GB) via `cirrusdata` and wait for initial sync completion.
+42. Confirm machine is online in `skidamarink` using `cirrusdata`.
+43. SSH and verify MTDI, Galaxy Migrate services/driver are up.
+44. Power off cloned machine.
+45. Delete cloned VM and its disks from vCenter inventory.
+46. Final cleanup checkpoint:
    - Run MCP offline-host cleanup for `skidamarink`.
    - If the cloned VM is still marked online at the end of the test, remove that cloned VM host entry specifically.
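
Reviewer note: the clone-naming steps in the procedure (use the base name if it is free, otherwise `-1`, `-2`, ... `-N`) and the hostname step (replace `.` with `-`) can be sketched as below. This is a minimal illustration, not part of the commit. `resolve_clone_name`, `hostname_for`, and the `existing` set are hypothetical names; how the existing VM names are fetched from vCenter is left out, and the base name is assumed to have been derived already.

```python
def resolve_clone_name(base: str, existing: set[str]) -> str:
    """Return `base` if unused, else the first free `base-N` (N = 1, 2, ...).

    `existing` is assumed to hold the VM names already present in vCenter.
    """
    if base not in existing:
        return base
    n = 1
    while f"{base}-{n}" in existing:
        n += 1
    return f"{base}-{n}"


def hostname_for(clone_name: str) -> str:
    """Derive the OS hostname from the clone name by replacing '.' with '-'."""
    return clone_name.replace(".", "-")


if __name__ == "__main__":
    taken = {"aw999-ubuntu24.04"}
    name = resolve_clone_name("aw999-ubuntu24.04", taken)
    print(name, hostname_for(name))
```

The function returns only a bare VM name, which matches the procedure's requirement that the clone destination be a name rather than an inventory path.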
@@ -126,10 +131,10 @@ Use one cumulative results file and append one new section per tested host.
 ### Kernel / OS Tracking
 - Start OS version:
 - Start kernel version:
-- Kernel list before first upgrade:
+- Kernel list before first upgrade (full candidate list, filtered by scope rule):
 - Kernel selected for step-up upgrade:
 - Kernel after step-up reboot:
-- Kernel list before latest upgrade:
+- Kernel list before latest upgrade (full candidate list, filtered by scope rule):
 - Kernel selected for latest upgrade:
 - Kernel after latest reboot:
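
Reviewer note: one possible reading of the candidate scope rule, the two-candidate gate, and the step-up versus latest selection is sketched below. It is not part of the commit and rests on several assumptions: "major" and "minor" are taken to be the first and second numeric fields of the kernel version string, "prefer same minor" is read as "use only same-minor candidates when any exist", and versions are ordered by their numeric fields only. The function names and sample version strings are hypothetical; real distributions may need a distro-specific version comparison.

```python
import re


def version_key(kernel: str) -> tuple[int, ...]:
    """Order kernels by the integers in the version string (an assumption)."""
    return tuple(int(n) for n in re.findall(r"\d+", kernel))


def filter_candidates(current: str, available: list[str]) -> list[str]:
    """Return upgrade candidates newer than `current`, sorted oldest first.

    Same major is required; same minor is used exclusively when any exist.
    """
    cur = version_key(current)
    newer = [
        k for k in set(available)
        if version_key(k) > cur and version_key(k)[:1] == cur[:1]
    ]
    same_minor = [k for k in newer if version_key(k)[:2] == cur[:2]]
    return sorted(same_minor or newer, key=version_key)


def passes_gate(candidates: list[str]) -> bool:
    """The procedure requires at least 2 upgrade candidates before cloning."""
    return len(candidates) >= 2


if __name__ == "__main__":
    available = [
        "6.8.0-31-generic", "6.8.0-35-generic", "6.8.0-40-generic",
        "6.11.0-8-generic", "7.0.0-1-generic",
    ]
    candidates = filter_candidates("6.8.0-31-generic", available)
    if passes_gate(candidates):
        step_up, latest = candidates[0], candidates[-1]
        print("step-up:", step_up, "latest:", latest)
    else:
        print("hard stop: fewer than 2 candidates")
```

Because the list is sorted oldest first, the step-up target is the first element and the latest target is the last, so both tracking fields can be filled from the same filtered list.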