Reorganize cdsmcp workspace into docs, templates, and artifacts
Restructure the cdsmcp folder to separate operator guidance, reusable templates, and runtime artifacts into clearer top-level areas. Move the VMware migration guide and run learnings into docs/, move vmw.yaml into templates/, move the existing log into artifacts/logs/, replace the old index file with a README, and split the former monolithic guide into focused documents for VMware MigrateOps workflow, VM lookup and FC/disk assignment, and CMC install reference. Update internal references so the reorganized layout remains coherent without changing the underlying operational guidance.
33
cdsmcp/docs/cmc-install-reference.md
Normal file
@@ -0,0 +1,33 @@
# CMC Install Reference

This file contains the CMC install, uninstall, and reinstall fallback reference used by the CDS MCP VMware workflow.

## Default Project Rule

- Default project: `Skidamarink`
- Default registration code: `BZHKABCODZLIOK6RTAJ4`
- Default endpoint: `portal.gcstage.cloud.nonprod.cirrusdata.com:443`
- Use a different project code only when the user explicitly requests it in that run.

## Skidamarink Install (Linux)

```bash
curl https://get.cirrusdata.cloud/install-cmc | bash -s -- -rgc BZHKABCODZLIOK6RTAJ4 -gce portal.gcstage.cloud.nonprod.cirrusdata.com:443 -pkg-mode PRE_RELEASE
```

## Skidamarink Install (Windows)

```powershell
iex "& { $(irm https://get.cirrusdata.cloud/install-cmc-win) } -rgc BZHKABCODZLIOK6RTAJ4 -gce portal.gcstage.cloud.nonprod.cirrusdata.com:443 -pkg-mode PRE_RELEASE"
```

## Uninstall (Linux)

```bash
curl https://get.cirrusdata.cloud/install-cmc | bash -s -- -uninstall
```

## Uninstall (Windows)

```powershell
iex "& { $(irm https://get.cirrusdata.cloud/install-cmc-win) } -uninstall"
```

## CMC Reinstall Fallback (RHEL 10)

- If installer-based reinstall fails due to repo metadata/download errors, use the cached local `mtdi-daemon` and `galaxy-migrate` RPMs, start services, enforce `galaxy_complete_endpoint`, then manually register.
- Do not continue MigrateOps create until the source host is visible as connected in CDC.
47
cdsmcp/docs/run-learnings.md
Normal file
@@ -0,0 +1,47 @@
# ESX / vCenter Run Learnings

This file stores run-specific examples only when a run produced a new learning relevant to future tasks.

## Entry Rule

- Add an entry only when the run changed workflow behavior, uncovered a new failure pattern, or confirmed a new required check.
- Do not add routine successful runs with no new learning.

## Run Learning: Operation 14208

- Learning: `wait-for-vm-registration` helper registration can be the longest early-stage step.
- Action for future runs: if step 6/7 is slow, verify helper VM existence in vCenter before remediation.

## Run Learning: Operation 14213

- Learning: completion response was sent before destination delete prompt, operation archive, and offline-host cleanup.
- Action for future runs: completion must be gated on delete prompt handling, archive, and cleanup verification.

## Run Learning: Operation 14214

- Learning: stale helper/source entries can remain and require explicit offline-host cleanup reruns.
- Action for future runs: rerun cleanup until stale entries are actually removed.

## Run Learning: Operation 14215

- Learning: helper creation can fail with vSphere `ReconfigVM` errors and recover via controlled retries.
- Action for future runs:
  - remove leftover helper artifacts before retry
  - avoid manual helper power actions during active task execution
  - keep waiting while heartbeats/progress still advance

## Run Learning: Operation 14216

- Learning: destination login validation and post-run cleanup were missed before completion reporting.
- Action for future runs: always perform destination login validation + archive + cleanup automatically before declaring completion.

## Run Learning: Operation 14218

- Learning: source/helper entries can remain `connected` with stale `last_checkin` after migration.
- Action for future runs: enforce heartbeat-timeout waits and rerun cleanup until source/helper entries are removed.

## Run Learning: Operation 14221

- Learning: source/helper CDC entries for the current request can be removed cleanly by timeout-based cleanup loop after archive, and final 4-entity status listing is effective for closure.
- Action for future runs:
  - always provide final source/destination/access/helper listing across CDC and vCenter
  - keep destination delete as explicit user-confirmed step only

## Run Learning: Operation 14223

- Learning: on RHEL 10, CMC reinstall via installer script can fail when repo metadata is unavailable; local RPM install + explicit CDC endpoint config + manual register can recover the source in-place.
- Action for future runs:
  - if Linux installer fails on repo metadata, check cached `mtdi-daemon` and `galaxy-migrate` RPMs and install directly
  - enforce `galaxy_complete_endpoint` before manual register
  - proceed with migrateops only after source host is confirmed connected in CDC
57
cdsmcp/docs/vm-lookup-and-assignment.md
Normal file
@@ -0,0 +1,57 @@
# VM Lookup And Assignment

This file covers vCenter VM lookup responses and the workflow for assigning existing disks and PCI passthrough FC adapters to a VM.

## Cluster Scope Rule

- Only work under cluster `QACL-ATVMCypressONLY` unless explicitly told otherwise.

## Ignore VMs

- `vCLS-bf0ec6f6-c7e2-4383-b11e-9c97cec7ed44`
- `vCLS-e5b3c60e-6a1c-46a6-8357-191fc0ab8e14`

## IP Lookup Rule

- If asked about an IP address, only check powered-on VMs.

## VM Lookup Response Rule

- Unless the user explicitly asks otherwise, return VM lookup/list results only from cluster `QACL-ATVMCypressONLY`.
- For vCenter VM lookup requests, always report:
  - VM name
  - datastore name
  - VM notes/annotation
  - include power state and IP when available

## VM Disk And FC Assignment Workflow

- When asked to assign existing disks and PCI passthrough FC adapters to a specified VM, treat the request as a two-step workflow:
  - first gather and report findings
  - then wait for explicit approval before making any changes
- Always log into vCenter `192.168.0.201`.
- Find the specified VM and verify the ESXi host it is currently running on.
- Default expected ESXi host is `192.168.1.165`, but always verify live placement before planning changes.
- Always identify and report the datastore where the VM is stored before planning disk attachment.
- Unless the operator explicitly specifies alternatives, default to these PCI passthrough FC adapters:
  - `vmhba7` (`0000:85:00.0`)
  - `vmhba8` (`0000:85:00.1`)
- Do not substitute any other PCI FC passthrough adapter if a default or operator-specified adapter cannot be found.
- Unless the operator explicitly specifies alternatives, default to these existing disks from the VM's datastore under the `atvm-DISKS` directory:
  - `atvm-DISK_1.vmdk`
  - `atvm-DISK_2.vmdk`
- Do not substitute any other disk if a default or operator-specified disk cannot be found.
- If the specified adapters or specified disks cannot be found, do nothing and report that nothing will be assigned.
- Before any assignment action, always provide a summary of:
  - the VM found
  - the ESXi host
  - the datastore
  - whether `vmhba7` and `vmhba8` were found and are usable
  - whether `atvm-DISK_1.vmdk` and `atvm-DISK_2.vmdk` were found under `atvm-DISKS`
  - exactly what would be assigned
- Never perform the assignment step until the operator explicitly approves after seeing that summary.
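
The "all found or nothing assigned" rule above can be sketched as a small shell helper. This is a minimal sketch under stated assumptions, not part of the workflow tooling: the function name `assignment_decision` and its `name=found` / `name=missing` argument format are hypothetical, and gathering the findings from vCenter is outside its scope.

```bash
# Hypothetical helper: decides whether an assignment plan may be proposed.
# Each argument is "name=found" or "name=missing", as collected during the
# read-only findings step (the collection itself is not shown here).
assignment_decision() {
  local item
  local missing=()
  for item in "$@"; do
    case "$item" in
      *=found) ;;                           # item located, nothing to record
      *) missing+=("${item%%=*}") ;;        # anything else counts as missing
    esac
  done
  if ((${#missing[@]} > 0)); then
    # No substitution is attempted: one missing item blocks the whole plan.
    echo "nothing will be assigned; missing: ${missing[*]}"
    return 1
  fi
  # Finding everything only unlocks the summary; changes still need approval.
  echo "all items found; awaiting explicit operator approval"
}
```

A call such as `assignment_decision vmhba7=found vmhba8=found atvm-DISK_1.vmdk=found atvm-DISK_2.vmdk=found` only reports readiness; the assignment itself stays behind the operator's explicit approval.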

## Common VM Credentials

- Username: `root`
- Password: `cdsi2012`

## Status Output Format (Power-Off/Revert/Power-On)

- `VM [vm name] was poweredOn, so I powered it off` (or `already poweredOff`)
- `Snapshot rollback completed`
- `VM [vm name] powered back on successfully`
- `Current IP: <ip>`
102
cdsmcp/docs/vmware-migrateops-guide.md
Normal file
@@ -0,0 +1,102 @@
# VMware Compute MigrateOps Guide

This file is for workflow guidance only. Do not add specific run examples here.

## Update Rule

- After every run, update this file only when a workflow rule/checklist/default behavior changed.
- Add run-specific examples and evidence to `run-learnings.md` only when that run produced a new learning.

## vCenter Access

- Address: `192.168.0.201`
- Username: `administrator@qalab.cdsi.local`
- Password: `CDSi101!`
- Standard CLI path: `/home/aw/.local/bin/govc`
- Use only this standard vCenter login for vCenter actions unless explicitly instructed otherwise.
- Do not use `192.168.3.190` for vCenter actions; that machine is reserved for Cypress ATVM automation.

## IP And Power-State Policy (Mandatory)

- Before finding guest IP or attempting SSH, confirm VM power state in vCenter and power on if needed.
- Treat only these as stable references:
  - `192.168.0.201` for vCenter login only
  - `192.168.3.190` for ATVM Cypress automation only
  - `192.168.3.191` as default ATVM target reference
- Any other VM IP must be obtained live from vCenter for that run only.
- Do not carry forward ad-hoc VM IPs from previous runs in runbooks.
- When the operator refers to `192.168.3.191`, assume ATVM target SSH access should ignore host key mismatch by default with `-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null`.
- When the operator refers to `192.168.3.191`, assume default SSH credentials `root / cdsi2012` unless the operator explicitly overrides them.
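
The host-key rule for the ATVM target can be illustrated with a short helper that builds the `ssh` command line. This is a sketch only: `atvm_ssh_command` is a hypothetical name, it prints the command rather than running it, and it deliberately leaves password handling out.

```bash
# Hypothetical helper: prints the ssh command line for a host.
# Only the default ATVM target gets the relaxed host-key options;
# every other host keeps ssh's normal host-key checking.
atvm_ssh_command() {
  local host="$1"
  local user="${2:-root}"      # default user per the policy above
  local args=(ssh)
  if [[ "$host" == "192.168.3.191" ]]; then
    args+=(-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null)
  fi
  args+=("${user}@${host}")
  printf '%s\n' "${args[*]}"   # joined with single spaces
}
```

Keeping the relaxed options tied to the one stable target avoids silently disabling host-key checks for IPs that were obtained live for a single run.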

## Related References

- VM lookup, datastore reporting, and FC/disk assignment:
  - `vm-lookup-and-assignment.md`
- CMC install, uninstall, and reinstall fallback:
  - `cmc-install-reference.md`

## VMware Compute MigrateOps Defaults

- Use `/home/aw/code/cds/cdsmcp/templates/vmw.yaml` as the starting template.
- Default sequence for requested source machine:
  - clean CDC state for that machine
  - reinstall CMC Linux on that machine
  - perform migration preflight and operation create
- If user provides a client name, replace consistently:
  - `config.system_name`
  - `migrateops_vmware_compute.compute.vm_name`
  - operation `name`
- Validate `integration_name` is active in target project before create.
- Default access node: `atvm-linux-h2h` (must be powered on in vCenter and connected in CDC).
- Always discover `source_nic` from live source host networking.

## Approval and Monitoring Defaults

- Auto-approve cutover by default.
- Start monitoring immediately after operation create.
- Approve as soon as `final-synchronization` requests input.
- Skip auto-approval only if user explicitly asks for manual approval.
- Patience rule:
  - if heartbeat/progress is advancing, keep waiting
  - allow longer waits for helper deployment/registration steps
  - intervene only for terminal failure, confirmed blocker, or prolonged no-progress
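
The patience rule can be expressed as a small decision function. This is a minimal sketch with assumed inputs: the function name, the state names `failed` and `blocked`, and the 1800-second default limit are all hypothetical and not taken from the CDC API; a caller would pass a larger limit for helper deployment/registration steps.

```bash
# Hypothetical helper: monitor_decision STATE IDLE_SECONDS [LIMIT_SECONDS]
#   STATE         operation state as seen by the monitor (names are assumed)
#   IDLE_SECONDS  time since heartbeat/progress last advanced
#   LIMIT_SECONDS no-progress threshold; 1800 is an illustrative default
monitor_decision() {
  local state="$1"
  local idle_seconds="$2"
  local limit="${3:-1800}"
  case "$state" in
    failed)  echo "intervene: terminal failure"; return 0 ;;
    blocked) echo "intervene: confirmed blocker"; return 0 ;;
  esac
  if (( idle_seconds >= limit )); then
    echo "intervene: prolonged no-progress"
  else
    echo "wait"   # progress is still recent enough, keep waiting
  fi
}
```

For example, `monitor_decision running 2000 3600` still returns `wait`, which is how a longer allowance for helper registration could be passed in.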

## Preflight Checklist

- Source host connected in CDC.
- Integration exists and is active in same project.
- `atvm-linux-h2h` powered on in vCenter.
- `atvm-linux-h2h` connected in same CDC project.
- Destination VM name does not already exist in vCenter.
- Destination datastore/host/network resolve in vCenter.
- `source_nic` discovered via SSH from source host.

## Post-Migration Validation and Cleanup Pattern

- Validate destination login before cleanup:
  - get destination guest IP from vCenter
  - verify SSH/login works
  - if guest IP empty, keep polling and do not skip validation
  - do not mark run complete before validation result is recorded
- Before deleting destination VM:
  - always prompt user for explicit confirmation
  - never delete destination VM without that confirmation in the same run
- For delete path:
  - resolve source VM ID and destination VM ID separately
  - abort if IDs match
  - power off destination if needed
  - delete destination by explicit VM ID
  - verify destination removed and source still exists
- Always run project cleanup after terminal migration state:
  - archive completed operation
  - run global offline-host cleanup
  - cleanup must target source VM named in current request only
  - if source/helper entries still connected, force-disconnect conditions and rerun cleanup
  - if stale connected state persists after VM removal/power-off, wait heartbeat timeout and rerun cleanup until removed
  - verify helper entry from this run (`migrateops-<opid>-<source-system-name>`) is removed
- Completion gate:
  - do not report run complete until archive + cleanup verification are done
  - always provide read-only final listing for source, destination, access node, helper:
    - CDC status (`present` or `cleaned up`)
    - vCenter status (`present` or `cleaned up`, and if present include power state + IP)
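
Two of the checks above lend themselves to a short illustration: the "abort if IDs match" guard on the delete path and the helper entry name to verify during cleanup. The sketch below assumes the VM IDs were already resolved separately from vCenter; both function names are hypothetical, and the IDs and system name in the usage note are placeholders.

```bash
# Hypothetical helper: builds the helper entry name for a run, following
# the migrateops-<opid>-<source-system-name> pattern.
helper_entry_name() {
  local opid="$1"
  local source_system_name="$2"
  printf 'migrateops-%s-%s\n' "$opid" "$source_system_name"
}

# Hypothetical guard: succeeds only when both IDs are non-empty and differ.
# It performs no deletion itself; it only decides whether the delete path
# may continue.
safe_to_delete_destination() {
  local source_id="$1"
  local dest_id="$2"
  if [[ -z "$source_id" || -z "$dest_id" ]]; then
    echo "abort: could not resolve both VM IDs" >&2
    return 1
  fi
  if [[ "$source_id" == "$dest_id" ]]; then
    echo "abort: source and destination IDs match" >&2
    return 1
  fi
  return 0
}
```

With placeholder values, `safe_to_delete_destination vm-101 vm-202` passes and `safe_to_delete_destination vm-101 vm-101` aborts; treating an unresolved ID as an abort is a conservative choice made for this sketch.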

## Default Behavior Contract

- Perform automatically on every VMware compute run:
  - destination login validation
  - operation archive
  - offline-host cleanup and source/helper stale verification
- Still require explicit user confirmation before destination delete:
  - always prompt
  - if no confirmation, keep destination and record `deletion skipped by user`
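
The contract above amounts to a gate that a run must pass before it is reported complete. A minimal sketch follows; the function name `completion_gate`, the `done` marker, and the `yes`/`no` confirmation argument are assumptions made for illustration only.

```bash
# Hypothetical helper:
#   completion_gate VALIDATION ARCHIVE CLEANUP [DELETE_CONFIRMED]
# The first three arguments are "done" when that step has been verified;
# DELETE_CONFIRMED is "yes" only when the user explicitly confirmed.
completion_gate() {
  local validation="$1"
  local archive="$2"
  local cleanup="$3"
  local delete_confirmed="${4:-no}"
  local pending=()
  [[ "$validation" == "done" ]] || pending+=("destination login validation")
  [[ "$archive" == "done" ]]    || pending+=("operation archive")
  [[ "$cleanup" == "done" ]]    || pending+=("offline-host cleanup")
  if ((${#pending[@]} > 0)); then
    local IFS=','               # join pending steps with commas
    echo "not complete; pending: ${pending[*]}"
    return 1
  fi
  if [[ "$delete_confirmed" == "yes" ]]; then
    echo "complete; destination delete confirmed by user"
  else
    # Without confirmation the destination is kept and the skip is recorded.
    echo "complete; deletion skipped by user"
  fi
}
```

The gate never treats a missing confirmation as a blocker for completion; it only changes what is recorded about the destination.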