Reorganize ATVM workspace into scripts, docs, inventory, and archive

Restructure the ATVM folder to separate executable scripts from workflow documentation and long-form environment reference material.

Move setup and automation scripts into scripts/, move setup and automation guides into docs/, add top-level README and workflow conventions, and organize durable environment details into inventory/ while preserving the original long-form ATVM notes under archive/imported-notes/.

Update internal documentation paths to match the new layout and remove the archived Zone.Identifier metadata file.
2026-03-21 20:39:23 -04:00
parent 08b2ab3104
commit 274b920b40
17 changed files with 332 additions and 191 deletions

@@ -1,189 +1,70 @@
# ATVM AGENTS Guide
This file defines how to operate and maintain the ATVM folder workflows.
It is rebuilt from current files in `/home/aw/code/cds/atvm`.
This file defines how to operate and maintain the ATVM workspace in `/home/aw/code/cds/atvm`.
## Scope
Two operational tracks exist in this folder:
- Setup/bootstrap track:
- `atvm-setup-script.sh`
- `run-atvm-setup-and-collect-log.sh`
- `atvm-setup-script-guide.md`
- `atvm-setup-script-runs.md`
- Cypress automation track:
- `atvm-automation-guide.md`
- `atvm-automation-examples.md`
- `atvm-automation-runs.md`
## Workspace Layout
- `README.md`
- human entry point for the folder
- `scripts/`
- executable workflow assets
- `docs/setup/`
- setup/bootstrap procedure and run learnings
- `docs/automation/`
- ATVM Cypress automation procedure, examples, and run learnings
- `docs/workflow/`
- shared conventions for how the docs are maintained
- `inventory/`
- environment reference, credentials, IP allocations, and inventory indexes
- `archive/imported-notes/`
- preserved original long-form source material
Reference/inventory material:
- `cypress-automation-for-cmc.md`
- `cypress-automation-for-cmc.md:Zone.Identifier`
## Authoritative Sources
- Setup/bootstrap procedure:
- `docs/setup/guide.md`
- Setup/bootstrap learnings:
- `docs/setup/run-learnings.md`
- Automation execution procedure:
- `docs/automation/guide.md`
- Automation command examples:
- `docs/automation/examples.md`
- Automation run learnings:
- `docs/automation/run-learnings.md`
- Workspace conventions:
- `docs/workflow/conventions.md`
## File Roles
- `*-guide.md` files:
- Guide-only procedures, rules, defaults, and checklists.
- No dated or one-off run examples.
- `*-runs.md` files:
- Run-specific learnings only when a run introduces new information.
- No routine/no-change run logs.
- `*-examples.md` files:
- Reusable command examples and commonly used option combinations.
- Keep generic; avoid dated one-off run outcomes.
- Keep command-focused; move workflow rules, defaults, blacklist policy, and reporting requirements into the corresponding `*-guide.md` or `*-runs.md` files.
## Setup Track: Required Behavior
Use `atvm-setup-script-guide.md` as the procedure source and keep behavior aligned with `atvm-setup-script.sh`.
### Safety-Critical Rules
1. Never run setup without operator-provided `--expected-ip` and `--expected-hostname`.
2. Never infer expected hostname from target host output.
3. Stop immediately on hostname mismatch or expected-IP-not-assigned.
4. Keep static IP configuration as a final step to avoid mid-run connection loss.
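The identity rules above can be sketched as a small shell check. This is a minimal illustration, not code from `atvm-setup-script.sh`: the function name `check_host_identity` and its argument order are hypothetical, and the caller is assumed to pass in the observed hostname and assigned addresses.

```shell
# Hypothetical helper (not taken from atvm-setup-script.sh).
# Returns 0 only when the operator-provided hostname matches the observed
# hostname and the operator-provided IP is among the assigned addresses.
check_host_identity() {
    expected_ip=$1
    expected_hostname=$2
    actual_hostname=$3
    assigned_ips=$4   # whitespace-separated addresses, without prefix length

    # Both expected values must come from the operator; never infer them.
    if [ -z "$expected_ip" ] || [ -z "$expected_hostname" ]; then
        echo "[ERROR] expected IP and hostname must be provided by the operator" >&2
        return 2
    fi

    # Stop on hostname mismatch.
    if [ "$actual_hostname" != "$expected_hostname" ]; then
        echo "[ERROR] hostname mismatch: expected $expected_hostname, got $actual_hostname" >&2
        return 1
    fi

    # Stop when the expected IP is not assigned to this host.
    for ip in $assigned_ips; do
        if [ "$ip" = "$expected_ip" ]; then
            return 0
        fi
    done
    echo "[ERROR] expected IP $expected_ip is not assigned" >&2
    return 1
}
```

On a Linux target, one possible call is `check_host_identity "$EXPECTED_IP" "$EXPECTED_HOSTNAME" "$(hostname)" "$(hostname -I)" || exit 1`, assuming `hostname -I` is available on the distribution in question.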
### Canonical Setup Order
1. Parse args.
2. Validate host identity.
3. Check sudo/privileges.
4. Fix repositories.
5. Configure Ubuntu root SSH/password workflow (Ubuntu only).
6. Install sudo if needed.
7. Configure Oracle default non-UEK kernel (Oracle Linux only).
8. Disable Ubuntu auto-upgrades (Ubuntu only).
9. Run package cleanup/install.
10. Disable SELinux (RHEL-family).
11. Configure static IP.
12. Print summary.
13. Reboot + post-reboot SELinux verifier when applicable.
14. Keep client on until controller log copy + SHA256 verification completes.
15. Power off only after verified success and no real error log lines.
### Setup Defaults
## Setup Track Defaults
- ATVM static IP target: `192.168.3.191/22`
- Gateway: `192.168.0.1`
- DNS: `8.8.8.8`, `8.8.4.4`
- Ubuntu root SSH workflow credential in docs/script: `root / cdsi2012`
- Client log file: `atvm_setup_script.log` (typically `/root/atvm_setup_script.log` when run as root)
- When the operator refers to `192.168.3.191`, treat it as the ATVM target host by default.
- Default setup credential: `root / cdsi2012`
- Client log file: `atvm_setup_script.log`
- Treat `192.168.3.191` as the default ATVM target host reference.
- For SSH to `192.168.3.191`, ignore host key mismatch by default with `-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null`.
- For SSH to `192.168.3.191`, use default credentials `root / cdsi2012` unless the operator explicitly overrides them.
- For SSH to `192.168.3.191`, use default credentials `root / cdsi2012` unless explicitly overridden.
### Setup Controller Wrapper Rules
- Wrapper supports:
- run-and-collect (default)
- `--collect-after-complete`
- `run-and-collect` requires env vars:
- `EXPECTED_IP_ARG`
- `EXPECTED_HOSTNAME_ARG`
- Wrapper validates success marker and SHA256 before success.
- Wrapper powers off only when log has no lines matching `^\[ERROR\]`.
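The two gating checks named above (SHA256 verification and the `^\[ERROR\]` scan) could look roughly like the following. The function names are hypothetical and this is not the wrapper's actual code; the success-marker check is left out because its format is not stated here.

```shell
# Hypothetical helpers; names are illustrative, not from the wrapper script.

# Succeeds (exit 0) when the log contains at least one line starting with [ERROR].
log_has_error_lines() {
    grep -q '^\[ERROR\]' "$1"
}

# Succeeds when the SHA256 digest of file $1 equals the expected hex digest $2.
sha256_matches() {
    actual=$(sha256sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}
```

A controller could then allow power-off only when `sha256_matches` succeeds and `log_has_error_lines` fails for the copied log.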
## Automation Track Defaults
- Controller host: `atvm-cypres-vm-1`
- Controller IP: `192.168.3.190`
- Controller credentials: `root / atvmcdsi2012`
- Default plugin: `--use_specified_plugin iscsi`
- Always include `--ignore_force_shutdown` unless explicitly told not to.
- Default config family: `gold`
## Cypress Automation Track: Required Behavior
Use `atvm-automation-guide.md` as the execution source.
Use `atvm-automation-examples.md` as the common options/command reference.
Treat `atvm-automation-examples.md` as reference-only.
Do not treat example options, excludes, plugins, or filters as default operator intent unless the operator explicitly asks for them.
## Required Operating Rules
- Never run setup without operator-provided `--expected-ip` and `--expected-hostname`.
- Keep static IP configuration as the final setup step.
- Before any automation run, always check whether automation is already running.
- Always show exact planned ATVM commands before execution.
- Never execute setup or automation commands that require approval until the operator explicitly approves them.
- Treat `docs/automation/examples.md` as reference-only, not default operator intent.
- Put reusable workflow rules in `guide.md` files.
- Put dated lessons only in `run-learnings.md` files.
- Keep durable environment reference in `inventory/`.
- Preserve imported long-form notes in `archive/imported-notes/`; do not treat them as the primary runbook.
### Controller Client
- Hostname: `atvm-cypres-vm-1`
- IP: `192.168.3.190`
- Credentials: `root / atvmcdsi2012`
### Mandatory Run Control
1. Before planning a new run, check for active automation processes.
2. Report running/not-running status.
3. If running, ask before termination; terminate only with explicit approval.
4. Always show exact planned command(s) exactly as they will be executed before execution.
5. Never execute `cmc-templates.py`, `run-sorry-cypress.py`, or any other ATVM run command until the operator explicitly approves the displayed command(s).
6. Approval is required for template-generation commands as well as the runner command.
7. If the operator changes the requested scope, plugin, config, build name, or any other option after commands are shown, rebuild the commands, display them again, and wait for fresh approval.
8. If monitoring is not requested, report immediate command success/failure and any errors.
9. Monitor completion only when explicitly requested by the operator.
10. For monitored runs, allow long runtime windows (15-30 minutes or longer) and continue until completion unless the operator instructs otherwise.
11. Do not terminate monitored runs unless the operator explicitly instructs termination.
### Status Request Format
When the operator asks for run status, report in this order:
1. Heading/title using the run `build_name`.
2. Completed machines with machine name first and status second for each machine.
3. Notes.
4. Skipped machines with reason.
5. Remaining machines still to run.
6. Summary counts for finished, passed, failed, and skipped machines.
7. Timing details:
- start time
- end time if complete
- total run time if complete, or elapsed run time if still running
- quickest completed test runtime
- longest completed test runtime
- average completed test runtime
8. Estimated completion time.
Status details:
- Use the same status display format for every ATVM automation status response regardless of template type (`e2e`, `systemOS`, `reboot`, `migrateops`, and others).
- Treat references to the "ATVM automation run" or "automation run" as referring to the local ATVM automation workflow in `/home/aw/code/cds/atvm` and the automation VM at `192.168.3.190`, not to Cirrus project operations with similar names.
- Treat a status request as a request for live status by default.
- Use the live run log on the automation VM when available.
- If no automation is currently running, fall back to the most recent historical run artifacts and logs.
- Use the run `build_name` as the heading/title when available.
- Format every machine entry as `machine-name - STATUS`.
- Put each machine on its own line; do not combine multiple machine statuses on one line.
- Put failure reasons, oddities, and operator-facing context in a dedicated `Notes` section.
- Show blacklisted machines under skipped machines when they are part of the requested scope.
- Show in-progress machines under remaining machines as `RUNNING`.
- Show not-yet-started machines as `NOT STARTED`.
- Use completed spec results already recorded in the log to determine machine pass/fail state.
- For failed machines, mark the machine `FAIL` in the completed list and append a longer failure description on the same machine line when the reason materially helps the operator understand the failure.
- Keep `Notes` for broader context, anomalies, or follow-up details that are not part of the machine-specific failure description.
- Include start time in status output when it can be derived from the log.
- Include end time and total runtime for completed runs, or elapsed runtime for active runs.
- Include quickest completed test runtime, longest completed test runtime, and average completed test runtime under timing details when they can be derived from the log.
- Make the estimated completion time refer to the entire remaining run, not only the current machine/spec.
- For categorized runs, reconstruct the status of the entire run across all category batches whether the run is still active or only historical artifacts remain.
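The quickest, longest, and average runtimes listed under timing details can be derived with a short `awk` pass. This sketch assumes the per-test runtimes have already been extracted from the log as whole seconds, one per line; the log format itself is not described here, and `runtime_stats` is a hypothetical name.

```shell
# Hypothetical helper: reads one completed-test runtime (in seconds) per line
# on stdin and prints the quickest, longest, and average runtime.
runtime_stats() {
    awk '
        NR == 1 { min = $1; max = $1 }
        {
            if ($1 < min) min = $1
            if ($1 > max) max = $1
            sum += $1
        }
        END {
            # Print nothing when no runtimes were supplied.
            if (NR > 0) printf "quickest=%d longest=%d average=%d\n", min, max, sum / NR
        }
    '
}
```

For example, `printf '120\n300\n180\n' | runtime_stats` prints `quickest=120 longest=300 average=200`.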
### Automation Blacklist
Always exclude these machines with `--exclude_partial_match` when building ATVM automation commands.
CMC install blacklist (`BLACKLISTED: CMC INSTALL - CAN'T COMPILE`):
- `atvm6-centos6.0`
- `atvm41-redhat6.0`
- `atvm73-oracle6.0`
Crash blacklist (`BLACKLISTED: CRASHES WHEN CREATING MIGRATION SESSION - BUG`):
- `atvm144-suse15.0`
Support-request blacklist (`BLACKLISTED: SUPPORT REQUEST - WAITING`):
- `atvm113-debian9.0.0`
- `atvm115-debian9.1.0`
- `atvm116-debian9.2.0`
Re-create-might-be-needed blacklist (`BLACKLISTED: RE-CREATE MIGHT BE NEEDED`):
- `atvm156-debian9.3.0`
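As a rough illustration of how the blacklist could be checked before building a command, the sketch below keeps the names above in one variable and tests a machine name against them. The variable and function names are hypothetical, and plain substring matching is an assumption; the exact matching semantics and argument syntax of `--exclude_partial_match` are not described here.

```shell
# Machine names copied from the blacklist above.
ATVM_BLACKLIST="atvm6-centos6.0 atvm41-redhat6.0 atvm73-oracle6.0 atvm144-suse15.0 atvm113-debian9.0.0 atvm115-debian9.1.0 atvm116-debian9.2.0 atvm156-debian9.3.0"

# Hypothetical helper: succeeds when machine name $1 contains any blacklist entry.
# Substring matching is an assumption about what "partial match" means.
is_blacklisted() {
    for entry in $ATVM_BLACKLIST; do
        case "$1" in
            *"$entry"*) return 0 ;;
        esac
    done
    return 1
}
```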
### Operator Preferences
- Do not include Gold Disk IDs in `--build_name`.
- `--build_name` must not contain spaces; use `-` between words.
- Prefer distro-scoped filtering (for example `--containsVm redhat9`) when possible.
- Always include `--ignore_force_shutdown` on `cmc-templates.py` commands unless the operator explicitly asks not to.
- Default to `--use_specified_plugin iscsi` unless the operator explicitly requests a different plugin.
- Do not reference `cypress.atvm-config.ts` by default.
- Unless the operator explicitly asks for another config, use ATVM config files with `gold` in the filename (for example `cypress.atvm-config-gold.ts`).
- If the operator-specified ATVM config file is missing, fail fast and report the missing filename.
- Do not search for alternate config files or silently substitute another config unless the operator explicitly asks for that.
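The `--build_name` rule above (no spaces, `-` between words) can be enforced with a one-line normalizer. The function name `normalize_build_name` is hypothetical. Note that this sketch also collapses repeated hyphens, which may or may not be desired.

```shell
# Hypothetical helper: replaces each run of spaces in $1 with a single hyphen.
# tr -s also squeezes repeated hyphens already present in the input.
normalize_build_name() {
    printf '%s\n' "$1" | tr -s ' ' '-'
}
```

For example, `normalize_build_name "redhat9 iscsi gold"` prints `redhat9-iscsi-gold`.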
## Update Policy (Both Tracks)
After each run:
- Update corresponding `*-guide.md` only if workflow/rules/default behavior changed.
- Update corresponding `*-examples.md` when common command patterns/options change.
- Update corresponding `*-runs.md` only if the run produced new learning.
## Path and Naming Consistency Note
Current repo filenames use hyphen style, but some script text/defaults still show underscore-style paths (for example `atvm_setup_script.sh`, `run_atvm_setup_and_collect_log.sh`, `/home/aw/code/atvm`).
When operating:
1. Use actual filesystem paths in this repo first (`/home/aw/code/cds/atvm/...`).
2. If script defaults are used, verify they match existing files before execution.
3. If changing path conventions, update scripts and guides in the same change.
## Non-Goals
- Do not treat `cypress-automation-for-cmc.md` as executable runbook logic.
- Do not record secrets/tokens into new guide or runs entries.
## Maintenance Rules
- When changing workflow behavior, update the corresponding `guide.md`.
- When adding a reusable command pattern, update `docs/automation/examples.md`.
- When a run produces a new lesson, update the appropriate `run-learnings.md`.
- Keep filesystem paths in docs aligned with the actual repo layout.
- Do not remove detailed inventory or credential information from this workspace unless explicitly instructed.