cds-ai/atvm/AGENTS.md
anthony.wen c9a908ff3d Document ATVM missing-config fail-fast rule
Record the operator preference to stop immediately when a requested ATVM config file is missing. Update the ATVM guide, local agent instructions, and run learnings to avoid searching for or substituting alternate config files without explicit direction.
2026-03-14 20:11:06 -04:00


ATVM AGENTS Guide

This file defines how to operate and maintain the ATVM folder workflows. It is rebuilt from current files in /home/aw/code/cds/atvm.

Scope

Two operational tracks exist in this folder:

  • Setup/bootstrap track:
    • atvm-setup-script.sh
    • run-atvm-setup-and-collect-log.sh
    • atvm-setup-script-guide.md
    • atvm-setup-script-runs.md
  • Cypress automation track:
    • atvm-automation-guide.md
    • atvm-automation-examples.md
    • atvm-automation-runs.md

Reference/inventory material:

  • cypress-automation-for-cmc.md
  • cypress-automation-for-cmc.md:Zone.Identifier

File Roles

  • *-guide.md files:
    • Guide-only procedures, rules, defaults, and checklists.
    • No dated or one-off run examples.
  • *-runs.md files:
    • Run-specific learnings only when a run introduces new information.
    • No routine/no-change run logs.
  • *-examples.md files:
    • Reusable command examples and commonly used option combinations.
    • Keep generic; avoid dated one-off run outcomes.
    • Keep command-focused; move workflow rules, defaults, blacklist policy, and reporting requirements into the corresponding *-guide.md or *-runs.md files.

Setup Track: Required Behavior

Use atvm-setup-script-guide.md as the procedure source and keep behavior aligned with atvm-setup-script.sh.

Safety-Critical Rules

  1. Never run setup without operator-provided --expected-ip and --expected-hostname.
  2. Never infer expected hostname from target host output.
  3. Stop immediately on hostname mismatch or expected-IP-not-assigned.
  4. Keep static IP configuration as the last configuration step to avoid mid-run connection loss.
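
The identity rules above can be illustrated with a small pre-flight check. This is a minimal sketch, not the logic of atvm-setup-script.sh: the function name, the exception type, and the way the actual hostname and assigned addresses are passed in are all assumptions made for illustration.

```python
import ipaddress


class IdentityError(ValueError):
    """Raised when the target host does not match the operator-provided identity."""


def validate_host_identity(expected_ip, expected_hostname, actual_hostname, assigned_ips):
    """Hypothetical helper: raise unless the host matches what the operator supplied.

    actual_hostname and assigned_ips are assumed to have been read from the
    target host by the caller; they are only compared, never used as defaults.
    """
    # Rule 1: both expected values must come from the operator.
    if not expected_ip or not expected_hostname:
        raise IdentityError("expected IP and expected hostname are both required")

    # Reject malformed addresses early (ip_address raises ValueError).
    wanted = ipaddress.ip_address(expected_ip)

    # Rules 2 and 3: compare against the operator value and stop on mismatch.
    if actual_hostname != expected_hostname:
        raise IdentityError(
            f"hostname mismatch: expected {expected_hostname!r}, found {actual_hostname!r}"
        )
    if wanted not in {ipaddress.ip_address(a) for a in assigned_ips}:
        raise IdentityError(f"expected IP {expected_ip} is not assigned on the target host")
```

Raising instead of returning a flag makes it harder for a caller to continue past a mismatch by accident.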

Canonical Setup Order

  1. Parse args.
  2. Validate host identity.
  3. Check sudo/privileges.
  4. Fix repositories.
  5. Configure Ubuntu root SSH/password workflow (Ubuntu only).
  6. Install sudo if needed.
  7. Configure Oracle default non-UEK kernel (Oracle Linux only).
  8. Disable Ubuntu auto-upgrades (Ubuntu only).
  9. Run package cleanup/install.
  10. Disable SELinux (RHEL-family).
  11. Configure static IP.
  12. Print summary.
  13. Reboot + post-reboot SELinux verifier when applicable.
  14. Keep client on until controller log copy + SHA256 verification completes.
  15. Power off only after verified success and no real error log lines.

Setup Defaults

  • ATVM static IP target: 192.168.3.191/22
  • Gateway: 192.168.0.1
  • DNS: 8.8.8.8, 8.8.4.4
  • Ubuntu root SSH workflow credential in docs/script: root / cdsi2012
  • Client log file: atvm_setup_script.log (typically /root/atvm_setup_script.log when run as root)

Setup Controller Wrapper Rules

  • Wrapper supports:
    • run-and-collect (default)
    • --collect-after-complete
  • run-and-collect requires env vars:
    • EXPECTED_IP_ARG
    • EXPECTED_HOSTNAME_ARG
  • Wrapper validates success marker and SHA256 before success.
  • Wrapper powers off only when log has no lines matching ^\[ERROR\].
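
The three wrapper checks (success marker, SHA256 match, no lines matching ^\[ERROR\]) can be sketched as follows. This is an illustration under assumptions, not the wrapper's code: the function names are hypothetical, and the success marker text is passed in as a parameter because this guide does not state it.

```python
import hashlib
import re

# Anchored at line start, as in the rule above: "[ERROR]" appearing
# mid-line does not count as an error line.
ERROR_LINE = re.compile(r"^\[ERROR\]")


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def may_power_off(log_bytes: bytes, client_sha256: str, success_marker: str) -> bool:
    """Hypothetical gate: True only when all three checks pass.

    log_bytes      -- the copied log as read on the controller
    client_sha256  -- digest reported for the log on the client
    success_marker -- marker text to look for (assumed, supplied by the caller)
    """
    lines = log_bytes.decode("utf-8", errors="replace").splitlines()

    checksum_ok = sha256_hex(log_bytes) == client_sha256.strip().lower()
    marker_ok = any(success_marker in line for line in lines)
    no_error_lines = not any(ERROR_LINE.match(line) for line in lines)

    return checksum_ok and marker_ok and no_error_lines
```

Hashing the raw bytes rather than the decoded text keeps the comparison independent of any encoding assumptions.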

Cypress Automation Track: Required Behavior

Use atvm-automation-guide.md as the execution source. Use atvm-automation-examples.md only as a reference for common options and commands. Do not treat example options, excludes, plugins, or filters as default operator intent unless the operator explicitly asks for them.

Controller Client

  • Hostname: atvm-cypres-vm-1
  • IP: 192.168.3.190
  • Credentials: root / atvmcdsi2012

Mandatory Run Control

  1. Before planning a new run, check for active automation processes.
  2. Report running/not-running status.
  3. If running, ask before termination; terminate only with explicit approval.
  4. Always show exact planned command(s) before execution.
  5. Execute only after explicit approval.
  6. If monitoring is not requested, report immediate command success/failure and any errors.
  7. Monitor completion only when explicitly requested by the operator.
  8. For monitored runs, allow long runtime windows (15-30 minutes or longer) and continue until completion unless the operator instructs otherwise.
  9. Do not terminate monitored runs unless the operator explicitly instructs termination.

Status Request Format

When the operator asks for run status, report in this order:

  1. Heading/title using the run build_name.
  2. Completed machines with machine name first and status second for each machine.
  3. Notes.
  4. Skipped machines with reason.
  5. Remaining machines still to run.
  6. Summary counts for finished, passed, failed, and skipped machines.
  7. Timing details:
    • start time
    • end time if complete
    • total run time if complete, or elapsed run time if still running
    • quickest completed test runtime
    • longest completed test runtime
    • average completed test runtime
  8. Estimated completion time.

Status details:

  • Use the same status display format for every ATVM automation status response regardless of template type (e2e, systemOS, reboot, migrateops, and others).
  • Use the live run log on the automation VM when available.
  • Use the run build_name as the heading/title when available.
  • Format every machine entry as machine-name - STATUS.
  • Put each machine on its own line; do not combine multiple machine statuses on one line.
  • Put failure reasons, oddities, and operator-facing context in a dedicated Notes section.
  • Show blacklisted machines under skipped machines when they are part of the requested scope.
  • Show in-progress machines under remaining machines as RUNNING.
  • Show not-yet-started machines as NOT STARTED.
  • Use completed spec results already recorded in the log to determine machine pass/fail state.
  • For failed machines, mark the machine FAIL in the completed list and append a longer failure description on the same machine line when the reason materially helps the operator understand the failure.
  • Keep Notes for broader context, anomalies, or follow-up details that are not part of the machine-specific failure description.
  • Include start time in status output when it can be derived from the log.
  • Include end time and total runtime for completed runs, or elapsed runtime for active runs.
  • Include quickest completed test runtime, longest completed test runtime, and average completed test runtime under timing details when they can be derived from the log.
  • Make the estimated completion time refer to the entire remaining run, not only the current machine/spec.
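
The report order and line format described above can be sketched as a small formatter. This is one possible rendering, not a required implementation: the function, its parameters, the section labels, and the "SKIPPED (reason)" wording are assumptions, and parsing the run log into these inputs is left out.

```python
from datetime import datetime
from statistics import mean


def format_status(build_name, completed, notes, skipped, remaining,
                  start, now, runtimes_min, end=None, eta=None):
    """Hypothetical formatter for an ATVM automation status response.

    completed    -- list of (machine, status), e.g. ("vm", "PASS")
    skipped      -- list of (machine, reason)
    remaining    -- list of (machine, "RUNNING" or "NOT STARTED")
    runtimes_min -- completed test runtimes in minutes
    start/now/end/eta -- datetime values; end and eta may be None
    """
    fmt = "%Y-%m-%d %H:%M"
    out = [build_name, "", "Completed machines:"]
    out += [f"{name} - {status}" for name, status in completed]
    out += ["", "Notes:"] + list(notes)
    out += ["", "Skipped machines:"]
    out += [f"{name} - SKIPPED ({reason})" for name, reason in skipped]
    out += ["", "Remaining machines:"]
    out += [f"{name} - {state}" for name, state in remaining]

    passed = sum(1 for _, status in completed if status.startswith("PASS"))
    failed = sum(1 for _, status in completed if status.startswith("FAIL"))
    out += ["", f"Summary: finished={len(completed)} passed={passed} "
                f"failed={failed} skipped={len(skipped)}"]

    out += ["", "Timing:", f"start time: {start.strftime(fmt)}"]
    if end is not None:
        out += [f"end time: {end.strftime(fmt)}", f"total run time: {end - start}"]
    else:
        out.append(f"elapsed run time: {now - start}")
    if runtimes_min:
        out += [
            f"quickest completed test runtime: {min(runtimes_min)} min",
            f"longest completed test runtime: {max(runtimes_min)} min",
            f"average completed test runtime: {mean(runtimes_min):.1f} min",
        ]

    eta_text = eta.strftime(fmt) if eta is not None else "unknown"
    out += ["", f"Estimated completion time: {eta_text}"]
    return "\n".join(out)


if __name__ == "__main__":
    print(format_status(
        "example-build-name",
        completed=[("example-vm-a", "PASS")],
        notes=[],
        skipped=[],
        remaining=[("example-vm-b", "RUNNING")],
        start=datetime(2026, 3, 14, 10, 0),
        now=datetime(2026, 3, 14, 10, 45),
        runtimes_min=[12],
    ))
```

Each machine lands on its own line in machine-name - STATUS form, and a failure description can be carried in the status string itself (for example "FAIL: reason").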

Automation Blacklist

Always exclude these machines with --exclude_partial_match when building ATVM automation commands.

CMC install blacklist (BLACKLISTED: CMC INSTALL - CAN'T COMPILE):

  • atvm6-centos6.0
  • atvm41-redhat6.0
  • atvm73-oracle6.0

Crash blacklist (BLACKLISTED: CRASHES WHEN CREATING MIGRATION SESSION - BUG):

  • atvm144-suse15.0

Support-request blacklist (BLACKLISTED: SUPPORT REQUEST - WAITING):

  • atvm113-debian9.0.0
  • atvm115-debian9.1.0
  • atvm116-debian9.2.0
  • atvm156-debian9.3.0

Re-create blacklist:

  • atvm157-debian13.0.0
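
One way to guard the blacklist rule is to check a planned command before showing it to the operator. The sketch below is hypothetical and deliberately loose: it only confirms that the --exclude_partial_match flag and every blacklisted machine name appear somewhere in the command text. It does not assume or verify how values are passed to the flag; that syntax belongs to atvm-automation-guide.md.

```python
# Machine names copied from the blacklist above, grouped by reason.
BLACKLIST = {
    "CMC INSTALL - CAN'T COMPILE": [
        "atvm6-centos6.0",
        "atvm41-redhat6.0",
        "atvm73-oracle6.0",
    ],
    "CRASHES WHEN CREATING MIGRATION SESSION - BUG": [
        "atvm144-suse15.0",
    ],
    "SUPPORT REQUEST - WAITING": [
        "atvm113-debian9.0.0",
        "atvm115-debian9.1.0",
        "atvm116-debian9.2.0",
        "atvm156-debian9.3.0",
    ],
    "Re-create": [
        "atvm157-debian13.0.0",
    ],
}


def all_blacklisted():
    """Return every blacklisted machine name in list order."""
    return [name for names in BLACKLIST.values() for name in names]


def missing_exclusions(planned_command):
    """Return blacklisted names that the planned command does not mention.

    If the exclude flag itself is absent, every name is reported as missing.
    """
    if "--exclude_partial_match" not in planned_command:
        return all_blacklisted()
    return [name for name in all_blacklisted() if name not in planned_command]
```

An empty result means the command mentions every blacklisted machine; a non-empty result is a reason to stop and revise the command before asking for approval.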

Operator Preferences

  • Do not include Gold Disk IDs in --build_name.
  • --build_name must not contain spaces; use - between words.
  • Prefer distro-scoped filtering (for example --containsVm redhat9) when possible.
  • Do not reference cypress.atvm-config.ts by default.
  • Unless the operator explicitly asks for another config, use ATVM config files with gold in the filename (for example cypress.atvm-config-gold.ts).
  • If the operator-specified ATVM config file is missing, fail fast and report the missing filename.
  • Do not search for alternate config files or silently substitute another config unless the operator explicitly asks for that.
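
The --build_name and config-file preferences above lend themselves to a fail-fast check. This is a minimal sketch under assumptions: both function names and the config_dir parameter are hypothetical, the default filename is the example given above, and the Gold Disk ID rule is not checked because this guide does not define the ID format.

```python
from pathlib import Path

# Example gold config name taken from the preference list above.
DEFAULT_CONFIG = "cypress.atvm-config-gold.ts"


def check_build_name(build_name):
    """Return build_name unchanged, or raise if it is empty or contains whitespace."""
    if not build_name or any(ch.isspace() for ch in build_name):
        raise ValueError(
            f"invalid --build_name {build_name!r}: no spaces allowed, use '-' between words"
        )
    return build_name


def resolve_config(config_dir, requested=None):
    """Return the path of the requested config, or raise naming the missing file.

    No directory search and no substitution: if the file is absent,
    the caller gets an error that reports the filename.
    """
    name = requested or DEFAULT_CONFIG
    path = Path(config_dir) / name
    if not path.is_file():
        raise FileNotFoundError(f"ATVM config file not found: {name}")
    return path
```

Because resolve_config never lists the directory, there is no code path that could pick an alternate file on the operator's behalf.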

Update Policy (Both Tracks)

After each run:

  • Update corresponding *-guide.md only if workflow/rules/default behavior changed.
  • Update corresponding *-examples.md when common command patterns/options change.
  • Update corresponding *-runs.md only if the run produced new learning.

Path and Naming Consistency Note

Current repo filenames use hyphen style, but some script text/defaults still show underscore-style paths (for example atvm_setup_script.sh, run_atvm_setup_and_collect_log.sh, /home/aw/code/atvm).

When operating:

  1. Use actual filesystem paths in this repo first (/home/aw/code/cds/atvm/...).
  2. If script defaults are used, verify they match existing files before execution.
  3. If changing path conventions, update scripts and guides in the same change.

Non-Goals

  • Do not treat cypress-automation-for-cmc.md as executable runbook logic.
  • Do not record secrets or tokens in new guide or runs entries.