ATVM AGENTS Guide

This file defines how to operate and maintain the ATVM workspace in /home/aw/code/cds/atvm.

Workspace Layout

  • README.md
    • human entry point for the folder
  • scripts/
    • executable workflow assets
  • docs/setup/
    • setup/bootstrap procedure and run learnings
  • docs/automation/
    • ATVM Cypress automation procedure, examples, and run learnings
  • docs/workflow/
    • shared conventions for how the docs are maintained
  • inventory/
    • environment reference, credentials, IP allocations, and inventory indexes
  • archive/imported-notes/
    • preserved original long-form source material

Authoritative Sources

  • Setup/bootstrap procedure:
    • docs/setup/guide.md
  • Setup/bootstrap learnings:
    • docs/setup/run-learnings.md
  • Automation execution procedure:
    • docs/automation/guide.md
  • Automation command examples:
    • docs/automation/examples.md
  • Automation run learnings:
    • docs/automation/run-learnings.md
  • Workspace conventions:
    • docs/workflow/conventions.md

Setup Track Defaults

  • ATVM static IP target: 192.168.3.191/22
  • Gateway: 192.168.0.1
  • DNS: 8.8.8.8, 8.8.4.4
  • Default Linux setup credential source: /home/aw/code/cds/.env.credentials.local via ATVM_TARGET_USER and ATVM_TARGET_PASSWORD
  • Client log file: atvm_setup_script.log
  • Treat 192.168.3.191 as the default ATVM target host reference.
  • For SSH to 192.168.3.191, ignore host key mismatch by default with -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null.
  • For Linux SSH to 192.168.3.191, source /home/aw/code/cds/.env.credentials.local and use ATVM_TARGET_USER plus ATVM_TARGET_PASSWORD unless explicitly overridden.
  • Use ATVM_WINDOWS_TARGET_USER plus ATVM_WINDOWS_TARGET_PASSWORD for Windows guest access to the same ATVM target IP unless explicitly overridden.
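
The SSH defaults above can be sketched as a small helper. This is a minimal illustration, not part of the workspace scripts: the function names parse_env_text and build_ssh_command are hypothetical, and the sketch assumes the credentials file holds simple KEY=VALUE lines (optionally prefixed with export), which this guide does not confirm. Password handling is deliberately left out, since ssh does not take a password on the command line.

```python
# Minimal sketch of the Linux SSH defaults for the ATVM target.
# Assumption: the credentials file uses KEY=VALUE lines; the real format may differ.

DEFAULT_TARGET = "192.168.3.191"

# Host key mismatch is ignored by default for this target.
SSH_OPTS = [
    "-o", "StrictHostKeyChecking=no",
    "-o", "UserKnownHostsFile=/dev/null",
]


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blanks and comments (hypothetical helper)."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of simple quoting.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def build_ssh_command(user, remote_command, host=DEFAULT_TARGET):
    """Return the ssh argv list with the default host key options applied."""
    return ["ssh", *SSH_OPTS, f"{user}@{host}", remote_command]
```

In use, the text of /home/aw/code/cds/.env.credentials.local would be passed to parse_env_text, and the ATVM_TARGET_USER value handed to build_ssh_command unless the operator overrides it.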

Automation Track Defaults

  • Controller host: atvm-cypres-vm-1
  • Controller IP: 192.168.3.190
  • Controller credential source: /home/aw/code/cds/.env.credentials.local via ATVM_CONTROLLER_USER and ATVM_CONTROLLER_PASSWORD
  • Detailed test artifact root on controller: /root/cdc-e2e-cyp-12.17.4/cypress/cmcReporter
  • For vCenter inspection, prefer govc and raw vCenter REST calls before inventing or depending on higher-level wrappers.
  • Default Mattermost status destination config: /home/aw/code/cds/.env.credentials.local
  • Default plugin-bearing template plugin: --use_specified_plugin iscsi
  • Always include --ignore_force_shutdown unless explicitly told not to.
  • Always include --test_partition unless explicitly told not to.
  • For cmc-migrateops-compute-migration to VMware, default to --vm_platforms vmware and --set_static_ip_dest unless explicitly told otherwise.
  • For ATVM automation runs that involve Windows guests, default the runner command to --hang_retries 0 unless explicitly told otherwise.
  • Default config family: gold
  • Treat cmc-systemOS as not using plugin or integration-type arguments. Do not auto-add --use_specified_plugin, --integration_type, or watcher integration/plugin metadata for that template.
  • Do not auto-add the maintained --exclude_partial_match blacklist when the operator explicitly targets named VMs with --specify_vms.
  • Even for explicit --specify_vms requests, first check whether any requested VM is on the maintained blacklist and stop if it is.
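
The flag defaults and the blacklist pre-check above can be sketched as follows. This is an illustration under stated assumptions, not the real command builder: default_flags and blacklisted_requests are hypothetical names, every template other than cmc-systemOS is assumed to be plugin-bearing, and blacklist entries are assumed to be substrings (following the --exclude_partial_match naming). The sketch does not decide which script each flag belongs to, and it omits --hang_retries 0 because that default applies to the runner command only.

```python
# Sketch of the automation flag defaults. Assumptions are marked inline.

# Only cmc-systemOS is documented as not taking plugin arguments.
PLUGINLESS_TEMPLATES = {"cmc-systemOS"}


def default_flags(template, to_vmware=False, declined=frozenset()):
    """Return default flags for a template.

    `declined` holds flag names the operator explicitly said not to add.
    """
    groups = []
    if template not in PLUGINLESS_TEMPLATES:
        # Assumption: any other template is plugin-bearing.
        groups.append(("--use_specified_plugin", "iscsi"))
    groups.append(("--ignore_force_shutdown",))
    groups.append(("--test_partition",))
    if template == "cmc-migrateops-compute-migration" and to_vmware:
        groups.append(("--vm_platforms", "vmware"))
        groups.append(("--set_static_ip_dest",))
    # Drop a flag together with its value when the operator declined it.
    return [tok for group in groups if group[0] not in declined for tok in group]


def blacklisted_requests(requested_vms, blacklist):
    """Return requested VM names that match a blacklist entry.

    Assumption: entries are partial-match substrings.
    """
    return [vm for vm in requested_vms if any(entry in vm for entry in blacklist)]
```

A non-empty result from blacklisted_requests corresponds to the stop condition for explicit --specify_vms requests; an empty result means the run can proceed without auto-adding the blacklist.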

Required Operating Rules

  • Never run setup without operator-provided --expected-ip and --expected-hostname.
  • Keep static IP configuration as the final setup step.
  • Before any automation run, always check whether automation is already running.
  • Treat every new ATVM run request as requiring a fresh live controller running-state check; never rely on the immediately previous status check when deciding whether a new run may start.
  • For ATVM automation runs, execute without a pre-run approval gate unless the operator explicitly asks to review commands first.
  • Default all ATVM automation runs to watcher-backed execution unless the operator explicitly says to run without watcher.
  • After starting an ATVM automation run, report the exact executed cmc-templates.py and run-sorry-cypress.py commands.
  • Treat git/commit requests as a separate approval gate.
  • Follow /home/aw/code/cds/atvm/git-guide.md for ATVM git command drafting and commit-request handling, including these rules:
    • The controller e2e cypress repo behavior applies only when the operator explicitly asks for the e2e cypress repo or a close variation.
    • Draft plain git commands rather than SSH-wrapped controller login commands unless explicitly requested.
    • Include the SSH-prefixed push example required for that repo.
    • Phrases such as create me a git, create a git, create a git description, make me a git, make a git, make me a git description, create me a git description, and close variations are prepare-only until the operator explicitly approves the displayed commit command.
  • Never execute git push from the assistant for this workspace.
  • After creating a local commit for the explicitly requested e2e cypress controller repo, stop and give the operator the exact manual SSH-prefixed push command reference from git-guide.md, unless they explicitly ask for a different remote or branch.
  • Do not treat approve after a commit as permission to push; pushing requires separate explicit wording and still remains manual-reference-only unless the operator explicitly overrides this workspace rule.
  • After cmc-templates.py, always verify that the generated spec files and the config specPattern still include every requested VM before starting run-sorry-cypress.py.
  • If any requested VM is missing after template generation, stop and report the mismatch instead of launching the runner.
  • Do not infer plugin behavior from the mere presence of plugin-specific strings or code blocks in a generated spec file.
  • For plugin selection questions, determine expected behavior from the template/runtime gate such as Cypress.env(...) conditions or other execution guards, and only treat it as a mismatch if the runtime logic would execute the wrong plugin path.
  • For cmc-reboot, treat --use_specified_plugin both as a separate confirmation gate, not a normal plugin selection.
  • When a planned reboot command uses --use_specified_plugin both, warn that FC+iSCSI together may have a "chicken before the egg" timing issue where iSCSI disks are not attached before mTDI / CMC services start, and ask the operator to confirm that both is really intended.
  • Unless the operator explicitly reconfirms both for cmc-reboot, prefer only fc or only iscsi, not both.
  • For default watcher-backed runs, start the watcher before run-sorry-cypress.py.
  • For default watcher-backed runs, build the watcher-start command so it automatically includes the exact cmc-templates.py command via --template-command and the exact run-sorry-cypress.py command via --runner-command; the operator should not need to restate them separately.
  • When watcher-backed execution is used, prefer the controller-local atvm-runner@... systemd service over detached SSH background launch patterns for run-sorry-cypress.py.
  • Do not start the runner before the watcher, because the watcher helper clears stale /tmp/<build-name>.log and can delete the fresh live runner log if the runner starts first.
  • Prefer the combined start-atvm-run.sh wrapper when starting both services so watcher and runner are never launched in parallel against the same /tmp/<build-name>.log.
  • For host-level test detail and failed-test investigation, use /root/cdc-e2e-cyp-12.17.4/cypress/cmcReporter, especially logs/, xml/, and mochawesome/.
  • Apply failed-host detail recovery consistently for every ATVM template run, not just cmc-reboot.
  • For any failed ATVM host, recover failure detail in this order when available: consolidated run log, mochawesome, structured reporter artifacts (json/xml), then text reporter artifacts.
  • Keep the HOSTS detail column compact with the failing step plus a short error summary only.
  • Put richer per-host error excerpts in a dedicated FAILURE NOTES: section, and reserve NOTES: for non-failure context such as the template command, Currents URL, and operator-facing caveats.
  • When reporting TEST FLOW: for an ATVM run, prefer the numbered steps extracted from the generated spec for that exact run.
  • If the generated spec exists, do not rely on a static template flow list for TEST FLOW:.
  • Only fall back to template-level or static flow definitions when the generated spec cannot be located or parsed.
  • Treat /var/lib/atvm-run-watcher/<build>/state.json as cached watcher output, not the source of truth for a completed-run confirmation.
  • Before confirming a completed ATVM run status, verify in this order: live launch log, matching reporter artifacts, Cloud Run Finished summary / Currents URL, then compare against saved watcher state.
  • If saved watcher state disagrees with the launch log or a replay of the exact artifacts through the current watcher code, treat the saved state as stale and do not report from it.
  • Never confirm a completed ATVM run from state.json alone.
  • For categorized runs, never report a grouped sub-run as PASS from watcher host_results, grouped XML, or a lone check-xml-files.ts result by itself.
  • Before reporting a categorized grouped sub-run as PASS, confirm that the matching child batch also passed in the live launch log or the final Cloud Run Finished summary for that child run.
  • If the operator asks for ATVM run status without mentioning Mattermost, respond locally only and do not post externally.
  • If the operator asks to send ATVM run status to Mattermost, use MATTERMOST_ATVM_WEBHOOK and MATTERMOST_ATVM_CHANNEL from /home/aw/code/cds/.env.credentials.local by default and send the final status only after the run has fully completed, whether the run passed or failed.
  • For vCenter VM snapshot requests, default the snapshot name to VM Snapshot [mm/dd/yyyy:hh:mm:ss AM/PM] in the local America/New_York timezone unless the operator explicitly requests a different name.
  • For VM power, shutdown, startup, snapshot, or maintenance requests, operate only on the exact VM names provided by the operator.
  • If any requested VM name is missing or not found, stop immediately and report the missing names; do not infer, substitute, or search for replacement VMs unless the operator explicitly requests a discovery/mapping step.
  • For atvm_prep requests, infer common option values from operator shorthand when intent is clear.
  • For any atvm_prep execution request, always present the exact planned atvm_prep.py command first and wait for explicit operator approval before running it.
  • Always execute atvm_prep.py on the ATVM Cypress controller host 192.168.3.190 (atvm-cypres-vm-1).
  • atvm_prep datastore shorthand mapping for -n:
    • Gold -> AutomatedTest-VMBootImg-Gold
    • Gold2 -> AutomatedTest-VMBootImg-Gold-2
    • ComputeMigration (or compute migration) -> AutomatedTest-VMBootImgComputeMigration-Gold
  • atvm_prep client/ESXi pair mapping:
    • CDS1-ESX165 <-> 192.168.1.165
    • CDS1-ESX166 <-> 192.168.1.166
  • When an atvm_prep request specifies only one side of the client/ESXi pair (-c or -e), auto-fill the other side from the mapping.
  • If both -c and -e are provided but conflict with the mapping, stop and ask the operator to confirm the intended pair before any execution.
  • Do not call out expected, harmless systemctl reset-failed ... unit not loaded output in routine run updates; mention it only if it blocks startup or matters for debugging.
  • Treat docs/automation/examples.md as reference-only, not default operator intent.
  • Put reusable workflow rules in guide.md files.
  • Put dated lessons only in run-learnings.md files.
  • Keep durable environment reference in inventory/.
  • Preserve imported long-form notes in archive/imported-notes/; do not treat them as the primary runbook.
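
The atvm_prep shorthand and client/ESXi pair rules above can be sketched as a small resolver. This is a minimal illustration, not the logic inside atvm_prep.py: resolve_datastore and resolve_pair are hypothetical names, and passing unknown datastore names through unchanged is an assumption. A raised ValueError stands in for "stop and ask the operator".

```python
# Sketch of the atvm_prep shorthand resolution. Function names are hypothetical.

DATASTORE_SHORTHAND = {
    "gold": "AutomatedTest-VMBootImg-Gold",
    "gold2": "AutomatedTest-VMBootImg-Gold-2",
    "computemigration": "AutomatedTest-VMBootImgComputeMigration-Gold",
}

CLIENT_TO_ESXI = {
    "CDS1-ESX165": "192.168.1.165",
    "CDS1-ESX166": "192.168.1.166",
}
ESXI_TO_CLIENT = {ip: name for name, ip in CLIENT_TO_ESXI.items()}


def resolve_datastore(shorthand):
    """Map operator shorthand for -n to the full datastore name."""
    key = shorthand.replace(" ", "").lower()
    # Assumption: names that are not known shorthand are passed through as given.
    return DATASTORE_SHORTHAND.get(key, shorthand)


def resolve_pair(client=None, esxi=None):
    """Return (client, esxi), auto-filling the missing side from the mapping."""
    if client and esxi:
        expected_esxi = CLIENT_TO_ESXI.get(client)
        expected_client = ESXI_TO_CLIENT.get(esxi)
        if (expected_esxi and expected_esxi != esxi) or (
            expected_client and expected_client != client
        ):
            raise ValueError(f"-c {client} and -e {esxi} conflict with the mapping")
        return client, esxi
    if client:
        if client not in CLIENT_TO_ESXI:
            raise ValueError(f"no ESXi mapping for client {client}")
        return client, CLIENT_TO_ESXI[client]
    if esxi:
        if esxi not in ESXI_TO_CLIENT:
            raise ValueError(f"no client mapping for ESXi host {esxi}")
        return ESXI_TO_CLIENT[esxi], esxi
    raise ValueError("either -c or -e is required")
```

The resolved values would then go into the planned atvm_prep.py command that is shown to the operator for approval before anything runs on the controller.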

Maintenance Rules

  • When changing workflow behavior, update the corresponding guide.md.
  • When adding a reusable command pattern, update docs/automation/examples.md.
  • When a run produces a new lesson, update the appropriate run-learnings.md.
  • Keep filesystem paths in docs aligned with the actual repo layout.
  • Do not remove detailed inventory or credential information from this workspace unless explicitly instructed.