# ATVM AGENTS Guide

This file defines how to operate and maintain the ATVM folder workflows.
It is rebuilt from current files in `/home/aw/code/cds/atvm`.

## Scope
Two operational tracks exist in this folder:
- Setup/bootstrap track:
  - `atvm-setup-script.sh`
  - `run-atvm-setup-and-collect-log.sh`
  - `atvm-setup-script-guide.md`
  - `atvm-setup-script-runs.md`
- Cypress automation track:
  - `atvm-automation-guide.md`
  - `atvm-automation-examples.md`
  - `atvm-automation-runs.md`

Reference/inventory material:
- `cypress-automation-for-cmc.md`
- `cypress-automation-for-cmc.md:Zone.Identifier`

## File Roles
- `*-guide.md` files:
  - Guide-only procedures, rules, defaults, and checklists.
  - No dated or one-off run examples.
- `*-runs.md` files:
  - Run-specific learnings only when a run introduces new information.
  - No routine/no-change run logs.
- `*-examples.md` files:
  - Reusable command examples and commonly used option combinations.
  - Keep generic; avoid dated one-off run outcomes.
  - Keep command-focused; move workflow rules, defaults, blacklist policy, and reporting requirements into the corresponding `*-guide.md` or `*-runs.md` files.

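As a rough illustration of the naming rules above, a small shell helper could map a filename to its role. The function name `file_role` is hypothetical and is not part of the ATVM scripts.

```shell
# Hypothetical helper: classify an ATVM doc by its filename suffix.
file_role() {
  case "$1" in
    *-guide.md)    echo "guide" ;;     # procedures, rules, defaults, checklists
    *-runs.md)     echo "runs" ;;      # run-specific learnings only
    *-examples.md) echo "examples" ;;  # reusable command examples
    *)             echo "other" ;;     # scripts, reference material, and so on
  esac
}

file_role atvm-automation-guide.md    # prints: guide
file_role atvm-setup-script-runs.md   # prints: runs
file_role atvm-setup-script.sh        # prints: other
```

A check like this can help decide where a new piece of information belongs before editing any file.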
## Setup Track: Required Behavior
Use `atvm-setup-script-guide.md` as the procedure source and keep behavior aligned with `atvm-setup-script.sh`.

### Safety-Critical Rules
1. Never run setup without operator-provided `--expected-ip` and `--expected-hostname`.
2. Never infer expected hostname from target host output.
3. Stop immediately on hostname mismatch or expected-IP-not-assigned.
4. Keep static IP configuration as a final step to avoid mid-run connection loss.

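The identity rules above can be sketched as a single check that fails fast. This is a minimal sketch, not the logic in `atvm-setup-script.sh`: the function name `validate_host_identity`, its argument order, its return codes, and the sample values (`192.0.2.10`, `example-host`) are all hypothetical.

```shell
# Sketch only: the real checks live in atvm-setup-script.sh and may differ.
# Args: expected IP, expected hostname, actual hostname, space-separated assigned IPs.
validate_host_identity() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "ERROR: --expected-ip and --expected-hostname are required" >&2
    return 2
  fi
  if [ "$3" != "$2" ]; then
    echo "ERROR: hostname mismatch: expected '$2', found '$3'" >&2
    return 3
  fi
  for ip in $4; do
    # Stop searching as soon as the expected IP is found on the host.
    [ "$ip" = "$1" ] && return 0
  done
  echo "ERROR: expected IP '$1' is not assigned to this host" >&2
  return 4
}

# Placeholder values; real values must come from the operator.
validate_host_identity "192.0.2.10" "example-host" "example-host" "192.0.2.10 10.0.0.5" \
  && echo "identity ok"
```

On a Linux target the last two arguments could be filled from `hostname` and `hostname -I`, but the expected values must still come from the operator and never from the target's own output.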
### Canonical Setup Order
1. Parse args.
2. Validate host identity.
3. Check sudo/privileges.
4. Fix repositories.
5. Configure Ubuntu root SSH/password workflow (Ubuntu only).
6. Install sudo if needed.
7. Configure Oracle default non-UEK kernel (Oracle Linux only).
8. Disable Ubuntu auto-upgrades (Ubuntu only).
9. Run package cleanup/install.
10. Disable SELinux (RHEL-family).
11. Configure static IP.
12. Print summary.
13. Reboot and run the post-reboot SELinux verifier when applicable.
14. Keep the client powered on until the controller's log copy and SHA256 verification complete.
15. Power off only after verified success and when the log has no real error lines (no lines matching `^\[ERROR\]`).

### Setup Defaults
- ATVM static IP target: `192.168.3.191/22`
- Gateway: `192.168.0.1`
- DNS: `8.8.8.8`, `8.8.4.4`
- Ubuntu root SSH workflow credential in docs/script: `root / cdsi2012`
- Client log file: `atvm_setup_script.log` (typically `/root/atvm_setup_script.log` when run as root)

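The static IP and gateway defaults above are only consistent if both fall in the same `/22` network. A `/22` mask is `255.255.252.0`, so `192.168.3.191` and `192.168.0.1` both reduce to network `192.168.0.0`. The sketch below checks this arithmetic; the helper names `ip_to_int` and `same_subnet` are hypothetical and are not taken from the setup script.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeeds when both addresses share a network for the given prefix length.
# Args: address A, address B, prefix length (for example 22).
same_subnet() {
  mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

if same_subnet 192.168.3.191 192.168.0.1 22; then
  echo "gateway is inside the static IP's /22"
fi
```

The same check confirms that an address such as `192.168.4.1` would fall outside the network and could not serve as the gateway.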
### Setup Controller Wrapper Rules
- Wrapper supports:
  - run-and-collect (default)
  - `--collect-after-complete`
- `run-and-collect` requires env vars:
  - `EXPECTED_IP_ARG`
  - `EXPECTED_HOSTNAME_ARG`
- Wrapper validates the success marker and SHA256 before reporting success.
- Wrapper powers off the client only when the log has no lines matching `^\[ERROR\]`.

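The last two wrapper rules can be sketched as two small checks. This is an illustration under stated assumptions, not the code in `run-atvm-setup-and-collect-log.sh`: the function names are hypothetical, the success marker text is passed in as a parameter because this file does not define it, and `sha256sum` from GNU coreutils is assumed to be available on the controller.

```shell
# Sketch only; function names and the marker argument are hypothetical.
# Succeeds when the log contains the success marker and has no [ERROR] lines.
# Args: log file path, success marker text.
log_allows_poweroff() {
  grep -qF -- "$2" "$1" || return 1        # the marker must be present
  if grep -q '^\[ERROR\]' "$1"; then       # any [ERROR] line blocks power-off
    return 1
  fi
  return 0
}

# Succeeds when the collected log copy matches the SHA256 reported by the client.
# Args: local log copy path, expected SHA256 hex digest.
sha256_matches() {
  actual=$(sha256sum "$1" | awk '{print $1}')
  [ "$actual" = "$2" ]
}
```

A caller would run `sha256_matches` first and only then `log_allows_poweroff`, leaving the client powered on if either check fails.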
## Cypress Automation Track: Required Behavior
Use `atvm-automation-guide.md` as the execution source.
Use `atvm-automation-examples.md` as the common options/command reference.
Treat `atvm-automation-examples.md` as reference-only.
Do not treat example options, excludes, plugins, or filters as default operator intent unless the operator explicitly asks for them.

### Controller Client
- Hostname: `atvm-cypres-vm-1`
- IP: `192.168.3.190`
- Credentials: `root / atvmcdsi2012`

### Mandatory Run Control
1. Before planning a new run, check for active automation processes.
2. Report running/not-running status.
3. If running, ask before termination; terminate only with explicit approval.
4. Always show exact planned command(s) before execution.
5. Execute only after explicit approval.
6. If monitoring is not requested, report immediate command success/failure and any errors.
7. Monitor completion only when explicitly requested by the operator.
8. For monitored runs, allow long runtime windows (15-30 minutes or longer) and continue until completion unless the operator instructs otherwise.
9. Do not terminate monitored runs unless the operator explicitly instructs termination.

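Rules 4 and 5 above amount to an approval gate: show the exact command, then run it only on an explicit yes. A minimal sketch of that gate follows. The function name `run_with_approval` and the literal `yes` answer are assumptions made for illustration; the real approval happens in conversation with the operator.

```shell
# Sketch only: print the exact planned command, then execute it
# only if the next line read from stdin is exactly "yes".
run_with_approval() {
  printf 'Planned command: %s\n' "$*"
  printf 'Type "yes" to approve execution.\n'
  read -r answer || answer=""
  if [ "$answer" = "yes" ]; then
    "$@"
  else
    echo "Not approved; nothing was executed."
    return 1
  fi
}

# Example (reads the answer from stdin):
#   run_with_approval echo "hello from an approved command"
```

Anything other than the exact approval word, including an empty answer, leaves the command unexecuted, which matches the intent of requiring explicit approval.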
### Status Request Format
When the operator asks for run status, report in this order:
1. Heading/title using the run `build_name`.
2. Completed machines with machine name first and status second for each machine.
3. Notes.
4. Skipped machines with reason.
5. Remaining machines still to run.
6. Summary counts for finished, passed, failed, and skipped machines.
7. Timing details:
   - start time
   - end time if complete
   - total run time if complete, or elapsed run time if still running
   - quickest completed test runtime
   - longest completed test runtime
   - average completed test runtime
8. Estimated completion time.

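The quickest, longest, and average runtimes in the timing details can be derived once per-test durations have been extracted from the log. The sketch below assumes those durations are already available as whole seconds, one per line; how they are extracted depends on the log format, which this file does not define. The function name `timing_summary` is hypothetical, and the estimate (average runtime times remaining count) is one naive choice, not a documented rule.

```shell
# Sketch only: reads completed-test durations (whole seconds, one per line) on stdin.
# $1 = number of machines still to run, used for a naive remaining-time estimate.
timing_summary() {
  awk -v remaining="$1" '
    NR == 1 { min = $1; max = $1 }
    { if ($1 < min) min = $1; if ($1 > max) max = $1; sum += $1 }
    END {
      if (NR == 0) { print "no completed tests yet"; exit }
      avg = sum / NR
      # %d truncates fractional seconds in the average and the estimate.
      printf "quickest=%ds longest=%ds average=%ds est_remaining=%ds\n", min, max, avg, avg * remaining
    }'
}

printf '%s\n' 600 900 1200 | timing_summary 4
# prints: quickest=600s longest=1200s average=900s est_remaining=3600s
```

Multiplying by the count of all remaining machines keeps the estimate scoped to the whole remaining run rather than to the machine currently in progress.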
Status details:
- Use the same status display format for every ATVM automation status response regardless of template type (`e2e`, `systemOS`, `reboot`, `migrateops`, and others).
- Use the live run log on the automation VM when available.
- Use the run `build_name` as the heading/title when available.
- Format every machine entry as `machine-name - STATUS`.
- Put each machine on its own line; do not combine multiple machine statuses on one line.
- Put oddities and operator-facing context in a dedicated `Notes` section; machine-specific failure reasons belong on the failed machine's line, as described in the `FAIL` rule in this list.
- Show blacklisted machines under skipped machines when they are part of the requested scope.
- Show in-progress machines under remaining machines as `RUNNING`.
- Show not-yet-started machines as `NOT STARTED`.
- Use completed spec results already recorded in the log to determine machine pass/fail state.
- For failed machines, mark the machine `FAIL` in the completed list and append a longer failure description on the same machine line when the reason materially helps the operator understand the failure.
- Keep `Notes` for broader context, anomalies, or follow-up details that are not part of the machine-specific failure description.
- Include start time in status output when it can be derived from the log.
- Include end time and total runtime for completed runs, or elapsed runtime for active runs.
- Include quickest completed test runtime, longest completed test runtime, and average completed test runtime under timing details when they can be derived from the log.
- Make the estimated completion time refer to the entire remaining run, not only the current machine/spec.

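The machine-line rules above can be illustrated with a small formatter. This is a sketch under assumptions: the function name `format_machine_line` is hypothetical, the machine names and failure text are made-up placeholders, and wrapping the failure description in parentheses is one possible way to append it, not a format this guide prescribes.

```shell
# Sketch only: render one machine per line as "machine-name - STATUS".
# For FAIL entries, an optional third argument is appended on the same line.
format_machine_line() {
  if [ "$2" = "FAIL" ] && [ -n "${3:-}" ]; then
    printf '%s - %s (%s)\n' "$1" "$2" "$3"
  else
    printf '%s - %s\n' "$1" "$2"
  fi
}

# Placeholder machines, not real ATVM inventory.
format_machine_line "atvm000-example1.0" "PASS"
format_machine_line "atvm001-example2.0" "FAIL" "placeholder reason: spec timed out"
format_machine_line "atvm002-example3.0" "RUNNING"
format_machine_line "atvm003-example4.0" "NOT STARTED"
```

Broader context that does not belong to a single machine would still go in the `Notes` section rather than on these lines.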
### Automation Blacklist
Always exclude these machines with `--exclude_partial_match` when building ATVM automation commands.

CMC install blacklist (`BLACKLISTED: CMC INSTALL - CAN'T COMPILE`):
- `atvm6-centos6.0`
- `atvm41-redhat6.0`
- `atvm73-oracle6.0`

Crash blacklist (`BLACKLISTED: CRASHES WHEN CREATING MIGRATION SESSION - BUG`):
- `atvm144-suse15.0`

Support-request blacklist (`BLACKLISTED: SUPPORT REQUEST - WAITING`):
- `atvm113-debian9.0.0`
- `atvm115-debian9.1.0`
- `atvm116-debian9.2.0`
- `atvm156-debian9.3.0`

Re-create blacklist:
- `atvm157-debian13.0.0`

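When preparing a command, it can help to check a machine name against the blacklist above before deciding how to report it. The sketch below is illustrative only: the variable `ATVM_BLACKLIST` and the function `is_blacklisted` are hypothetical names, and the list is a copy that must be kept in sync with this section by hand. How the names are passed to `--exclude_partial_match` is not specified in this file, so that part is deliberately left out.

```shell
# Copy of the blacklist from this guide, one machine per line.
ATVM_BLACKLIST="atvm6-centos6.0
atvm41-redhat6.0
atvm73-oracle6.0
atvm144-suse15.0
atvm113-debian9.0.0
atvm115-debian9.1.0
atvm116-debian9.2.0
atvm156-debian9.3.0
atvm157-debian13.0.0"

# Succeeds when the given machine name is on the blacklist.
# -x matches the whole line and -F treats the name as a literal string,
# so the dots in version numbers are not read as regex wildcards.
is_blacklisted() {
  printf '%s\n' "$ATVM_BLACKLIST" | grep -qxF -- "$1"
}

if is_blacklisted "atvm144-suse15.0"; then
  echo "atvm144-suse15.0 is blacklisted; report it under skipped machines"
fi
```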
### Operator Preferences
- Do not include Gold Disk IDs in `--build_name`.
- `--build_name` must not contain spaces; use `-` between words.
- Prefer distro-scoped filtering (for example `--containsVm redhat9`) when possible.
- Do not reference `cypress.atvm-config.ts` by default.
- Unless the operator explicitly asks for another config, use ATVM config files with `gold` in the filename (for example `cypress.atvm-config-gold.ts`).

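Two of the preferences above are mechanical enough to check before showing a planned command: the `--build_name` spacing rule and the `gold` config default. The helpers below are a sketch with hypothetical names (`normalize_build_name`, `build_name_is_valid`, `uses_gold_config`); they do not detect Gold Disk IDs, whose format this file does not describe.

```shell
# Trim the name and replace each run of whitespace with a single "-".
normalize_build_name() {
  printf '%s\n' "$1" |
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//; s/[[:space:]][[:space:]]*/-/g'
}

# Succeeds only when the build name is non-empty and contains no whitespace.
build_name_is_valid() {
  [ -n "$1" ] || return 1
  if printf '%s' "$1" | grep -q '[[:space:]]'; then
    return 1
  fi
  return 0
}

# Succeeds when the config file's basename contains "gold".
uses_gold_config() {
  case "${1##*/}" in
    *gold*) return 0 ;;
    *)      return 1 ;;
  esac
}

normalize_build_name "redhat9 e2e  nightly"   # prints: redhat9-e2e-nightly
```

A name that fails `build_name_is_valid` can be passed through `normalize_build_name` and shown to the operator for approval rather than silently rewritten.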
## Update Policy (Both Tracks)
After each run:
- Update corresponding `*-guide.md` only if workflow/rules/default behavior changed.
- Update corresponding `*-examples.md` when common command patterns/options change.
- Update corresponding `*-runs.md` only if the run produced new learning.

## Path and Naming Consistency Note
Current repo filenames use hyphen style, but some script text/defaults still show underscore-style paths (for example `atvm_setup_script.sh`, `run_atvm_setup_and_collect_log.sh`, `/home/aw/code/atvm`).

When operating:
1. Use actual filesystem paths in this repo first (`/home/aw/code/cds/atvm/...`).
2. If script defaults are used, verify they match existing files before execution.
3. If changing path conventions, update scripts and guides in the same change.

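Step 2 above, verifying that a script default points at a file that actually exists, can be sketched as follows. The helper names `to_hyphen_name` and `resolve_repo_file` are hypothetical. The sketch tries the name as given and then its hyphen-style form, and it is meant for script filenames only, since the client log file `atvm_setup_script.log` legitimately keeps its underscores.

```shell
# Convert an underscore-style name to the hyphen style used by repo filenames.
to_hyphen_name() {
  printf '%s\n' "$1" | tr '_' '-'
}

# Print the first existing candidate: the name as given, else its hyphen-style form.
# Args: directory, filename taken from a script default.
resolve_repo_file() {
  for candidate in "$2" "$(to_hyphen_name "$2")"; do
    if [ -e "$1/$candidate" ]; then
      printf '%s\n' "$1/$candidate"
      return 0
    fi
  done
  echo "ERROR: neither $2 nor its hyphen-style form exists in $1" >&2
  return 1
}

to_hyphen_name "run_atvm_setup_and_collect_log.sh"
# prints: run-atvm-setup-and-collect-log.sh
```

If neither form exists, the non-zero return gives the caller a clear point to stop before executing anything.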
## Non-Goals
- Do not treat `cypress-automation-for-cmc.md` as executable runbook logic.
- Do not record secrets/tokens in new guide or runs entries.