Run ATVM Automation Guide
This file is guide-only documentation for operating ATVM CMC automation.
Do not put specific run examples here.
For reusable command examples and common option combinations, use examples.md.
Treat examples.md as reference-only.
Do not assume the operator wants the extra options shown in examples unless they explicitly request them.
Purpose
Run ATVM CMC automation tests on the designated automation VM without unintended system or file changes.
ATVM Cypress Automation Controller Client
- Hostname: `atvm-cypres-vm-1`
- IP: `192.168.3.190`
- Credentials: source `/home/aw/code/cds/.env.credentials.local` and use `ATVM_CONTROLLER_USER` plus `ATVM_CONTROLLER_PASSWORD`
ATVM Target Host Default
- Treat `192.168.3.191` as the default ATVM target host reference.
- For SSH to `192.168.3.191`, ignore host key mismatch by default with `-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null`.
- For Linux SSH access to `192.168.3.191`, source `/home/aw/code/cds/.env.credentials.local` and use `ATVM_TARGET_USER` plus `ATVM_TARGET_PASSWORD` unless the operator explicitly overrides them.
- `ATVM_LINUX_TARGET_HOST`, `ATVM_LINUX_TARGET_USER`, and `ATVM_LINUX_TARGET_PASSWORD` mirror the Linux default values when an OS-specific reference is clearer.
- For Windows guest access to `192.168.3.191`, source `/home/aw/code/cds/.env.credentials.local` and use `ATVM_WINDOWS_TARGET_USER` plus `ATVM_WINDOWS_TARGET_PASSWORD` unless the operator explicitly overrides them.
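The SSH defaults above can be sketched as a small command builder. This is a minimal sketch, not the project's tooling: it assumes the credentials file has already been sourced into the environment and that passing the password through `sshpass -e` (which reads it from the `SSHPASS` variable) is acceptable. The function name `build_target_ssh_command` is hypothetical.

```python
import os

DEFAULT_TARGET_HOST = "192.168.3.191"
# Options from this guide: ignore host key mismatch by default.
SSH_OPTIONS = ["-o", "StrictHostKeyChecking=no", "-o", "UserKnownHostsFile=/dev/null"]


def build_target_ssh_command(remote_command, env=None, host=DEFAULT_TARGET_HOST):
    """Return (argv, child_env) for a Linux SSH call to the target host.

    Hypothetical helper: expects ATVM_TARGET_USER and ATVM_TARGET_PASSWORD
    in env, for example after sourcing .env.credentials.local.
    """
    env = dict(os.environ if env is None else env)
    user = env["ATVM_TARGET_USER"]
    child_env = dict(env)
    # sshpass -e reads the password from the SSHPASS environment variable.
    child_env["SSHPASS"] = env["ATVM_TARGET_PASSWORD"]
    argv = ["sshpass", "-e", "ssh", *SSH_OPTIONS, f"{user}@{host}", remote_command]
    return argv, child_env
```

The builder only returns the argument list and environment; executing it still falls under the approval rules in this guide.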
Operating Constraints
- Run only scripts/commands explicitly requested.
- Do not make manual system configuration changes on the client.
- Do not edit client files unless explicitly requested.
Operator Preferences
- Do not include Gold Disk identifiers in `--build_name`.
- `--build_name` must not contain spaces; use `-` between words.
- For multiple VMs in the same distro, use distro-scoped filtering (`--containsVm`) instead of long explicit VM lists.
- Always include `--ignore_force_shutdown` on `cmc-templates.py` commands unless the operator explicitly asks not to.
- Default to `--use_specified_plugin iscsi` unless the operator explicitly requests a different plugin.
- Before preparing a new run, always check whether automation is already running.
- Always report whether automation is currently running.
- If running, ask whether to terminate; terminate only with explicit approval.
- After termination approval, terminate first, then present planned command(s), then wait for separate execution approval.
- Before any run, always show exact planned command(s) exactly as they will be executed and wait for explicit approval.
- Never execute `cmc-templates.py`, `run-sorry-cypress.py`, or any other ATVM run command until the operator explicitly approves the displayed command(s).
- Approval is required even for preparation-only steps such as template generation.
- If the operator changes any part of the request after commands are displayed, rebuild the commands, show the updated commands, and wait for fresh approval before executing anything.
- Execute ATVM run commands only after explicit approval.
- Treat `approve` as approval to run and also start the per-run watcher service for that build.
- Treat `approve without watcher` as approval to execute the ATVM run without starting the watcher.
- When `--categorize` is used with watcher enabled, treat the watcher as a sequential grouped-run watcher:
  - it must post one final Mattermost status per completed categorized group/sub-run
  - it must stay active between grouped sub-runs while the parent categorized request is still running
  - it must not stop after the first grouped run simply because one grouped run completed
  - if the child build id label does not match the actual host/spec being executed, report the grouped run using the inferred host-based group instead of the raw child build id label
  - it must not wait and replace those with one single parent-only post
- After execution, report immediate success/failure only.
- Do not include expected, harmless `systemctl reset-failed ... unit not loaded` output in routine run-start confirmations.
- Mention `reset-failed` output only when it prevents watcher startup or becomes relevant to debugging.
- Do not actively monitor completion unless explicitly requested.
- If monitoring is requested, allow long runtime windows (15-30+ minutes) and continue until completion unless operator instructs otherwise.
- Report command errors immediately.
- `sshpass` may be used where password-based SSH automation is required.
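The two `--build_name` rules in the list above can be sketched as a pair of helpers. This is an illustration only: the helper names are hypothetical, and the pattern used to detect a Gold Disk identifier (any occurrence of `gold`) is an assumption, since this guide does not define what such an identifier looks like.

```python
import re

# Assumption: any occurrence of "gold" counts as a Gold Disk identifier.
GOLD_DISK_PATTERN = re.compile(r"gold", re.IGNORECASE)


def normalize_build_name(description):
    """Join the words of a description with '-' so the name has no spaces."""
    words = description.split()
    if not words:
        raise ValueError("build name must not be empty")
    return "-".join(words)


def check_build_name(name):
    """Return the name unchanged, or raise ValueError if it breaks a rule."""
    if re.search(r"\s", name):
        raise ValueError("build name must not contain spaces; use '-' between words")
    if GOLD_DISK_PATTERN.search(name):
        raise ValueError("build name must not include Gold Disk identifiers")
    return name
```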
Core Scripts
- Template prep: `/root/cdc-e2e-cyp-12.17.4/cmc-templates.py`
- Test execution: `./run-sorry-cypress.py`
- Detailed host-level test artifacts: `/root/cdc-e2e-cyp-12.17.4/cypress/cmcReporter`
Detailed Test Artifacts
- Use `/root/cdc-e2e-cyp-12.17.4/cypress/cmcReporter` on the automation controller for detailed per-host test evidence.
- Reporter subdirectories of interest:
  - `logs/`: per-host text and JSON logs for the executed tests
  - `xml/`: machine result XML files and the final `check-xml-files.ts` bookkeeping output
  - `mochawesome/`: per-run HTML reports
- When a machine fails, use the matching `logs/` entry first to capture the detailed failure context for that host.
- When reconstructing historical status, prefer `cmcReporter` artifacts over less-specific runner output because they preserve per-host results after the live run has ended.
Typical sequence:
- Build the exact `cmc-templates.py` and `run-sorry-cypress.py` commands for the request.
- Show those exact commands to the operator.
- Wait for explicit approval.
- Run `cmc-templates.py` with the approved options.
- Wait for `cmc-templates.py` to fully finish and confirm success.
- Verify the generated `.ts` files and the config `specPattern` include every requested VM before starting the runner.
- If the watcher is approved, start the watcher before launching `run-sorry-cypress.py`.
- Run `run-sorry-cypress.py` with the matching approved config and build name.
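The verification step in the sequence above might look like the following check. It is a sketch under stated assumptions: this guide does not describe how generated spec files are named or how `specPattern` entries are written, so the code assumes each VM name appears as a substring of its generated file path and of a `specPattern` entry. The function name `find_missing_vms` is hypothetical.

```python
def find_missing_vms(requested_vms, generated_files, spec_pattern_entries):
    """Map each requested VM to the places where it is missing.

    Assumption: a VM counts as present when its name is a substring of at
    least one generated file path and at least one specPattern entry.
    """
    missing = {}
    for vm in requested_vms:
        gaps = []
        if not any(vm in path for path in generated_files):
            gaps.append("generated files")
        if not any(vm in entry for entry in spec_pattern_entries):
            gaps.append("specPattern")
        if gaps:
            missing[vm] = gaps
    return missing
```

An empty result means the runner may be launched; any other result is a mismatch to report instead of launching. Substring matching can produce false positives when one VM name is a prefix of another, so a stricter comparison may be needed in practice.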
Config File / Gold Disk Mapping
- `cypress.atvm-config-gold.ts` -> Gold Disk 1
- `cypress.atvm-config-gold-2.ts` -> Gold Disk 2
- Additional numbered config variants map to corresponding Gold Disks.
- Do not default to `cypress.atvm-config.ts`.
- Unless the operator explicitly requests another config, use a config file with `gold` in the filename.
- If the operator-specified config file is missing, stop immediately and report the missing file.
- Do not search for substitute ATVM config files and do not switch to another config unless the operator explicitly instructs it.
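The mapping and the missing-file rule above can be sketched as follows. The filenames for Gold Disks 1 and 2 come from this guide; extending the `-<n>` suffix to higher numbers is an assumption based on the "additional numbered config variants" rule. Both function names are hypothetical.

```python
from pathlib import Path


def config_name_for_gold_disk(number):
    """Return the config filename for a Gold Disk number.

    Disks 1 and 2 follow this guide; higher numbers assume the same
    "-<n>" suffix pattern.
    """
    if number < 1:
        raise ValueError("Gold Disk number must be 1 or greater")
    if number == 1:
        return "cypress.atvm-config-gold.ts"
    return f"cypress.atvm-config-gold-{number}.ts"


def resolve_config(number, directory):
    """Return the config path, or stop with an error if the file is missing.

    No substitute config is searched for, matching the rule above.
    """
    path = Path(directory) / config_name_for_gold_disk(number)
    if not path.is_file():
        raise FileNotFoundError(f"missing ATVM config file: {path}")
    return path
```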
Available Templates
- `cmc-e2e`
- `cmc-group-consistency`
- `cmc-h2h-diff-platf`
- `cmc-h2h-same-platf`
- `cmc-migrateops`
- `cmc-migrateops-compute-migration`
- `cmc-reboot`
- `cmc-systemOS`
Command Pattern
python3 cmc-templates.py --template <template> --ignore_force_shutdown --config_file_path ./<config-file> --use_specified_plugin iscsi [template options or explicit plugin override...]; \
python3 ./run-sorry-cypress.py --config_file <config-file> --build_name <hyphenated-description-no-spaces> [--categorize]
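The pattern above can also be assembled programmatically so that the displayed command strings are exactly what gets executed after approval. This is a minimal sketch: the function name and parameters are hypothetical, and only flags that appear in this guide are used.

```python
import shlex


def build_commands(template, config_file, build_name, plugin="iscsi",
                   categorize=False, extra_template_args=()):
    """Return (template_command, runner_command) as display-ready strings."""
    if any(ch.isspace() for ch in build_name):
        raise ValueError("--build_name must not contain spaces")
    template_cmd = [
        "python3", "cmc-templates.py",
        "--template", template,
        "--ignore_force_shutdown",          # default unless the operator opts out
        "--config_file_path", f"./{config_file}",
        "--use_specified_plugin", plugin,   # iscsi unless the operator overrides
        *extra_template_args,
    ]
    runner_cmd = [
        "python3", "./run-sorry-cypress.py",
        "--config_file", config_file,
        "--build_name", build_name,
    ]
    if categorize:
        runner_cmd.append("--categorize")
    # shlex.join quotes arguments so the strings can be pasted into a shell.
    return shlex.join(template_cmd), shlex.join(runner_cmd)
```

Returning strings rather than running anything keeps the review gate intact: the operator sees the commands first, and execution happens only after explicit approval.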
Examples Reference
- Commonly used command examples: `examples.md`
- Keep this guide focused on run-control rules and workflow constraints.
- Use examples as reference material only, not as default intent for new operator requests.
- Keep `examples.md` limited to reusable example commands; keep workflow rules, defaults, blacklist policy, and reporting rules in this guide or `run-learnings.md`.
Example Option Patterns (Guide-Only)
- Distro-scoped VM selection:
  - `--containsVm redhat`
  - `--containsVm redhat9`
- Explicit VM selection:
  - `--specify_vms <vm1> <vm2> ...`
- Compute migrateops platform:
  - `--vm_platforms vmware|ovirt|openshift|proxmox`
Blacklisted Machines
Always exclude these machines from broad-scope ATVM automation runs by adding them to --exclude_partial_match.
If the operator explicitly targets one or more named VMs with --specify_vms, do not add the maintained --exclude_partial_match list unless the operator also explicitly asks for it.
Even for explicit --specify_vms requests, first check whether any requested VM is on the maintained blacklist and stop instead of launching the run if one is included.
Permanently blacklisted because CMC cannot compile:
- `atvm6-centos6.0`
- `atvm41-redhat6.0`
- `atvm73-oracle6.0`
Temporarily blacklisted because the run crashes when creating a migration session:
- `atvm144-suse15.0`
Temporarily blacklisted while support requests are waiting:
- `atvm113-debian9.0.0`
- `atvm115-debian9.1.0`
- `atvm116-debian9.2.0`
Temporarily blacklisted because re-creation might be needed:
- `atvm156-debian9.3.0`
Preferred exclude list:
--exclude_partial_match atvm6-centos6.0 atvm41-redhat6.0 atvm73-oracle6.0 atvm144-suse15.0 atvm113-debian9.0.0 atvm115-debian9.1.0 atvm116-debian9.2.0 atvm156-debian9.3.0
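The two blacklist rules (append the exclude list for broad-scope runs; stop if an explicitly specified VM is blacklisted) can be sketched as one helper. The function name is hypothetical, and the explicit-VM check uses exact name equality, which is an assumption about how requested names are written.

```python
BLACKLIST = [
    "atvm6-centos6.0", "atvm41-redhat6.0", "atvm73-oracle6.0",
    "atvm144-suse15.0",
    "atvm113-debian9.0.0", "atvm115-debian9.1.0", "atvm116-debian9.2.0",
    "atvm156-debian9.3.0",
]


def exclusion_args(specified_vms=None):
    """Return the extra template arguments implied by the blacklist policy.

    Broad-scope run (no specified VMs): add the maintained exclude list.
    Explicit --specify_vms run: add nothing, but stop if any requested VM
    is blacklisted.
    """
    if specified_vms:
        blocked = [vm for vm in specified_vms if vm in BLACKLIST]
        if blocked:
            raise ValueError(f"blacklisted VM(s) requested: {', '.join(blocked)}")
        return []
    return ["--exclude_partial_match", *BLACKLIST]
```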
Running-Automation Check (Mandatory)
Before any new automation request:
- SSH to `root@192.168.3.190`.
- Check for active automation processes (for example `run-sorry-cypress.py`, `cmc-templates.py`, and related Cypress runners).
- Report:
  - `Running` with process details, or
  - `Not running`.
- If `Running`, ask operator whether to terminate.
- If termination is approved, terminate matching process(es), confirm termination, then proceed to planned-command approval.
- If termination is not approved, do not start a new run.
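The reporting half of this check can be sketched as a filter over a process listing. This is an illustration only: it assumes the listing was captured on the controller with something like `ps -eo pid,etime,args`, and it matches only the two script names this guide mentions, so related Cypress runner processes would need additional markers. Function names are hypothetical.

```python
# Script names taken from this guide; extend for related Cypress runners.
AUTOMATION_MARKERS = ("run-sorry-cypress.py", "cmc-templates.py")


def find_automation_processes(ps_output):
    """Return the process-listing lines that mention an automation script."""
    return [
        line.strip()
        for line in ps_output.splitlines()
        if any(marker in line for marker in AUTOMATION_MARKERS)
    ]


def running_report(ps_output):
    """Return 'Not running', or 'Running' followed by the matching lines."""
    matches = find_automation_processes(ps_output)
    if not matches:
        return "Not running"
    return "Running\n" + "\n".join(matches)
```

The sketch only reports; terminating anything still requires explicit operator approval.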
Approval Workflow (Mandatory)
- Build exact command(s) for the request.
- Present them verbatim as planned commands before running anything.
- Wait for explicit approval.
- When the watcher is available, present the watcher-start command separately from the core run commands.
- Treat `approve` as approval to execute the ATVM run and start the watcher for that build.
- Treat `approve without watcher` as approval to execute the ATVM run without starting the watcher.
- If the run uses `--categorize` and the watcher is requested, include `--categorize` on the watcher start command too so the watcher tracks sequential categorized sub-runs correctly.
- Run only approved command(s), no extra options and no silent substitutions.
- When both template generation and the Cypress runner are requested, run them sequentially, not in parallel.
- Do not launch `run-sorry-cypress.py` until `cmc-templates.py` has exited successfully and finished updating the intended config/spec files.
- After `cmc-templates.py`, always verify that the generated spec files on disk and the config `specPattern` both contain the full requested VM set before launching `run-sorry-cypress.py`.
- If any requested VM is missing from the generated files or `specPattern`, stop and report the mismatch instead of launching the runner.
- Treat displayed commands as a review gate: do not execute either command until the operator has had a chance to review them and explicitly approve.
- If the operator asks to change plugin, config, filters, build name, Gold Disk, or scope after commands are shown, discard the old plan, show the revised commands, and wait for new approval.
- If monitoring was not requested, report immediate success/failure for each command.
- If monitoring was requested, keep monitoring until completion and report final outcome.
- When the watcher is requested, launch the watcher before `run-sorry-cypress.py`.
- Do not start the runner before the watcher, because the watcher helper clears stale `/tmp/<build-name>.log` and can delete the fresh live runner log if the runner starts first.
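The approval keywords and the required ordering can be sketched as below. This is a minimal sketch, assuming replies are matched case-insensitively after whitespace is collapsed, which this guide does not specify. The step labels are descriptive placeholders, not real commands, because the watcher start command is not shown here.

```python
def parse_approval(reply):
    """Interpret an operator reply; anything unrecognized is not approval."""
    text = " ".join(reply.lower().split())
    if text == "approve":
        return {"run": True, "watcher": True}
    if text == "approve without watcher":
        return {"run": True, "watcher": False}
    return {"run": False, "watcher": False}


def execution_order(approval, categorize=False):
    """Return the ordered step labels for an approved request."""
    if not approval["run"]:
        return []
    steps = ["cmc-templates.py", "verify generated specs and specPattern"]
    if approval["watcher"]:
        # The watcher starts before the runner so it cannot delete a live log.
        steps.append("start watcher --categorize" if categorize else "start watcher")
    steps.append("run-sorry-cypress.py")
    return steps
```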
Requested Test Style
When asked for one VM or a VM set:
- choose requested template/options,
- choose correct config file for intended Gold Disk,
- default to a config filename containing `gold` unless the operator explicitly says otherwise,
- always include `--ignore_force_shutdown` on the template-generation command unless the operator explicitly overrides that default,
- default to `--use_specified_plugin iscsi` unless the operator explicitly requests another plugin or the template does not use plugin selection,
- use a descriptive `--build_name` without Gold Disk IDs.
Update Rule
- After each run, update this guide only for workflow/rule/default changes.
- Update `examples.md` for reusable command/option examples.
- Add run-specific learnings only to `run-learnings.md` when the run produced new information.
Monitoring Policy
- Monitor only when the operator explicitly asks to monitor.
- If monitoring was not requested, run commands and report execution success/failure and any errors.
- If monitoring was requested, do not terminate processes automatically; only terminate if the operator explicitly instructs termination.
Mattermost Status Posting
- Treat a normal ATVM status request as local-only output by default.
- When the operator asks to send ATVM automation run status to Mattermost, use the local defaults from `/home/aw/code/cds/.env.credentials.local`.
- Default Mattermost variables:
  - `MATTERMOST_ATVM_WEBHOOK`
  - `MATTERMOST_ATVM_CHANNEL`
- Treat these as the default destination for ATVM automation run-status posts unless the operator explicitly overrides them.
- Send the final ATVM run status only after the run has fully completed, regardless of whether the run passed or failed.
- Do not send interim or in-progress ATVM run status updates to Mattermost unless the operator explicitly asks for that.
- Use the same ATVM status layout that would be shown to the operator locally when posting to Mattermost.
- Default status template: `/home/aw/code/cds/atvm/docs/automation/status-template.md`
- Do not post to Mattermost unless the operator explicitly asks for the run status to be sent there.
- For categorized execution with watcher enabled, send one Mattermost status per completed categorized sub-run/group after that grouped run fully finishes.
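The posting gate above (post only when explicitly requested, and only after the run has fully completed) can be sketched as a request builder. It is an illustration under assumptions: it treats `MATTERMOST_ATVM_WEBHOOK` as a Mattermost incoming-webhook URL that accepts a JSON body with `text` and `channel` fields, and the function name is hypothetical.

```python
import json
import urllib.request


def prepare_status_post(env, status_text, run_complete, operator_requested_post):
    """Return a ready-to-send Request, or None when posting is not allowed.

    Assumption: the webhook accepts a JSON payload with "channel" and "text".
    """
    if not (operator_requested_post and run_complete):
        return None
    payload = {"channel": env["MATTERMOST_ATVM_CHANNEL"], "text": status_text}
    return urllib.request.Request(
        env["MATTERMOST_ATVM_WEBHOOK"],
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A caller would pass a non-`None` result to `urllib.request.urlopen`. Building the request separately keeps the decision to post testable without touching the network.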
Status Reporting Format
When the operator asks for the status of an ATVM automation run, report in this order:
- Heading/title using the run `build_name`.
- `SUMMARY:` section with finished, passed, failed, and skipped counts.
- `HOSTS:` section with the machine rows.
- `TIMING:` section with start, end, total, quickest, longest, and average.
- `COVERAGE:` section describing what the run was intended to cover, excluding the target-host list.
- `TEST FLOW:` section describing the template-specific numbered run flow for the test.
- `NOTES:` section for broader context and anomalies.
- Remaining machines still to run.
- Summary counts for finished, passed, failed, and skipped machines.
- Timing details:
  - start time
  - end time if complete
  - total run time if complete, or elapsed run time if still running
  - quickest completed test runtime
  - longest completed test runtime
  - average completed test runtime
- Estimated completion time.
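The timing details and the completion estimate listed above reduce to simple arithmetic over per-machine runtimes. The sketch below assumes those runtimes have already been extracted from the run log as seconds, that machines run one at a time, and that "recent" means the last few completed machines; the function names and the `recent` parameter are hypothetical.

```python
from datetime import timedelta


def timing_summary(completed_seconds):
    """Return quickest, longest, and average runtime in seconds."""
    if not completed_seconds:
        return None
    return {
        "quickest": min(completed_seconds),
        "longest": max(completed_seconds),
        "average": sum(completed_seconds) / len(completed_seconds),
    }


def estimate_remaining(completed_seconds, remaining_count, recent=3):
    """Estimate time to finish the whole remaining run.

    Uses the average of the most recent completed runtimes multiplied by
    the full remaining machine count (assumes sequential execution).
    """
    if remaining_count <= 0:
        return timedelta(0)
    if not completed_seconds:
        return None  # nothing completed yet, so no basis for an estimate
    window = completed_seconds[-recent:]
    return timedelta(seconds=remaining_count * sum(window) / len(window))
```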
Status-report expectations:
- Use the same display layout for every ATVM automation status response regardless of test type (`e2e`, `systemOS`, `reboot`, `migrateops`, and others).
- Use `/home/aw/code/cds/atvm/docs/automation/status-template.md` as the default template for both local status output and Mattermost status posts.
- The default ATVM status template uses flat bullet-list sections for `COVERAGE:` and `TEST FLOW:`, Markdown tables for `SUMMARY:`, `HOSTS:`, and `TIMING:`, and uses `NOTES:` for flat operator-facing notes.
- Order the status sections as `SUMMARY:`, `HOSTS:`, `TIMING:`, `COVERAGE:`, `TEST FLOW:`, then `NOTES:`.
- Keep `NOTES:` focused on operator-facing value such as the Currents run URL, real anomalies, failure context, or material fallback behavior.
- Do not include generic watcher bookkeeping messages in `NOTES:` such as artifact-detection confirmations.
- Do not include internal watcher fallback notes in `NOTES:` such as `check-xml-files.ts` validation confirmations or reporter-artifact recovery details.
- The `HOSTS:` table includes `Host`, `Kernel`, `Status`, and `Detail` columns in that order.
- In `COVERAGE:`, describe the template, datastore/config family, migration style, and plugin/integration path, but do not list target hosts there.
- In `TEST FLOW:`, show the template-specific numbered run flow once for the whole test, not per host.
- Resolve the flow from the run template name.
- `cmc-e2e` currently uses the 22-step migration flow documented in `/home/aw/code/cds/atvm/docs/automation/status-template.md`.
- For the `Kernel` column, cross-reference the host name against `/home/aw/code/cds/atvm/inventory/vm-inventory.md`.
- If the hostname is not present in `vm-inventory.md`, report the kernel value as `unknown`.
- Treat references to the "ATVM automation run" or "automation run" as referring to this ATVM folder workflow and the automation VM at `192.168.3.190`, not to Cirrus project operations such as the `atvm - cypress` project.
- Treat a status request as a request for live status by default.
- Unless the operator explicitly asks to send the status to Mattermost, print the status only in the local terminal response.
- Use the live automation VM state when available.
- If no automation is currently running, fall back to the most recent historical run artifacts and logs.
- Prefer local automation evidence in this order: active runner processes, live automation-VM files, shell history for the last launch command, then historical reporter artifacts.
- For detailed machine-level failure information, use `/root/cdc-e2e-cyp-12.17.4/cypress/cmcReporter/logs/` on the automation VM.
- Derive the heading/title from the run `build_name` when available.
- Format every machine entry as `machine-name - STATUS`.
- Put each machine on its own line; never combine multiple machines into one paragraph or comma-separated line.
- Use a separate `Notes` section for failure reasons, anomalies, or operator-relevant context rather than cramming those details into the completed-machine list.
- For categorized runs, reconstruct the whole run across all category batches; do not treat the current live category batch as the full run scope.
- For categorized runs with no active automation, reconstruct the status from the full historical run across all category batches, not only the most recent category batch.
- Always report the status of the entire requested run, even when the runner split execution into multiple category batches or cloud sub-runs.
- Derive completed-machine status from completed spec results already written during the same run.
- Parse all same-run `test-result-*.xml` files, not only machine-named `test-result-atvm*.xml` files.
- When XML filenames are hash-named, extract the machine name from XML contents such as `testsuite file=`, `testsuite name=`, or `testcase name=`.
- Ignore `check-xml-files.ts` XML outputs when counting machine completion because they are bookkeeping steps, not machine runs.
- When multiple same-run XML files exist for one machine, use the most recently written XML for that machine.
- Include the run start time in every status response when it can be derived from the run log.
- If the run is complete, include the end time and total run time.
- If the run is still active, include the elapsed run time so far.
- Include quickest completed test runtime, longest completed test runtime, and average completed test runtime under timing details when they can be derived from the run log.
- Show blacklisted machines under skipped machines even if they are part of the broader machine family requested by the operator.
- For skipped machines, include the reason category:
  - `BLACKLISTED: CMC INSTALL - CAN'T COMPILE`
  - `BLACKLISTED: SUPPORT REQUEST - WAITING`
  - `BLACKLISTED: RE-CREATE MIGHT BE NEEDED`
  - `BLACKLISTED: RE-CREATE NEEDED`
- If a machine is currently in progress, show it under remaining machines as `RUNNING`.
- If a machine has not started yet, show it under remaining machines as `NOT STARTED`.
- If no failures are present in completed spec results, report those completed machines as `PASS`.
- If a completed spec result shows a failure, report that machine as `FAIL` in the completed list and append a longer same-line failure description when the extra detail is useful to the operator.
- Use `Notes` for extra context beyond the machine-specific same-line failure description.
- Base the completion estimate on the full remaining machine count and recent per-machine runtime visible in the run log.
- Make the estimate explicitly refer to completion of the entire remaining run, not only the current machine/spec.
- When the operator also asks to send the status to Mattermost, send this same final status output to the configured Mattermost destination only after the run has fully completed.
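The XML rules in the list above (read the machine name from XML contents, ignore `check-xml-files.ts` outputs, keep the most recently written file per machine) can be sketched with the standard library. This is a sketch under assumptions: the result files are taken to be JUnit-style XML with `testsuite` and `testcase` elements and `failure` children, and the machine-name regex is inferred from the VM names in this guide rather than from a documented naming rule. Function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: machine names look like "atvm6-centos6.0" or "atvm113-debian9.0.0".
MACHINE_RE = re.compile(r"atvm\d+-[a-z]+\d+(?:\.\d+)*")


def machine_result(xml_text):
    """Return (machine, "PASS" or "FAIL"), or None for non-machine XML."""
    root = ET.fromstring(xml_text)
    candidates = []
    for suite in root.iter("testsuite"):
        candidates += [suite.get("file", ""), suite.get("name", "")]
    for case in root.iter("testcase"):
        candidates.append(case.get("name", ""))
    # Bookkeeping output is not a machine run.
    if any("check-xml-files.ts" in text for text in candidates):
        return None
    for text in candidates:
        match = MACHINE_RE.search(text)
        if match:
            failed = next(root.iter("failure"), None) is not None
            return match.group(0), ("FAIL" if failed else "PASS")
    return None


def latest_results(xml_files):
    """Map machine -> status from (mtime, xml_text) pairs, newest file wins."""
    latest = {}
    for _mtime, xml_text in sorted(xml_files, key=lambda item: item[0]):
        result = machine_result(xml_text)
        if result:
            machine, status = result
            latest[machine] = status
    return latest
```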