- update the watcher to detect the active categorized sub-run from the live `--ci-build-id` process state instead of treating the parent run as one synthetic grouped run
- fix host XML parsing so the watcher prefers the real host suite over the `Root Suite` entry, avoiding `0 tests, 0 failures` summaries
- use the first timestamp inside the run log as the watcher start time so restarted watchers do not miss current-run categorized artifacts because of log file mtime drift
- improve active-host inference for categorized runs so the watcher maps the current categorized build to the correct host family while the sub-run is still in progress
- update the watcher start helper to stop any stale watcher instance for the same requested parent build name and remove its old state directory before starting fresh
- document that reused parent build names must not inherit stale cancelled, posted, state.json, or subruns state from older runs
- update the watcher install and design docs so the controller workflow explicitly treats stale reused-build-name state as part of startup cleanup
- update the watcher design and automation guide to treat --categorize as sequential ATVM sub-runs rather than one parent run with internal phases
- document that categorized runs should send one Mattermost status per completed grouped sub-run instead of one parent-only final post
- add a --categorize option to the watcher start helper so categorized mode is explicit in watcher startup
- update the watcher implementation to track categorized sub-runs separately, write per-subrun state, and post each completed grouped run once
- add explicit Windows ATVM guest credential references alongside the existing Linux target defaults
- update the ATVM automation guide and AGENTS rules so Linux SSH uses ATVM_TARGET_* while Windows guest access uses ATVM_WINDOWS_TARGET_*
- update the CDS MCP CMC install and VMware workflow docs to distinguish Linux and Windows credential usage for the shared ATVM target IP
- update the VM lookup reference so common VM credentials list both Linux and Windows target variables
- update the watcher cancel helper so it writes a final CANCELLED state into state.json before stopping the service
- record cancellation timestamps and a cancellation note in the watcher state file for clearer post-run inspection
- update the watcher service docs so the documented cancel behavior matches the state-file handling
- remove the committed watcher-service __pycache__ bytecode file from git tracking
- ignore Python bytecode artifacts so generated .pyc files do not get committed again
- keep the watcher-service source files as the only tracked implementation artifacts
- add the per-run ATVM watcher service package under atvm/watcher-service, including the Python watcher, systemd template unit, helper scripts, and deployment docs
- document the watcher-service install and operating model, including one-run-per-instance behavior, Mattermost posting rules, and the best-practice /opt/atvm-watcher-service install path
- clarify ATVM run approval semantics so `approve` means run without watcher and `approve with watcher` means run and start the watcher
- update the ATVM automation guide and AGENTS rules so watcher usage and approval behavior are explicit and consistent
- fix the remaining in-family version ordering for Oracle Linux, Redhat Linux, Rocky Linux, and Windows rows
- normalize distro labels so the inventory consistently uses Oracle Linux, Redhat Linux, and Rocky Linux
- leave hostnames, kernel versions, notes, and blacklist annotations unchanged
- reorder atvm/inventory/vm-inventory.md so distro families stay grouped and versions sort together
- add a Notes column to the VM inventory table
- annotate blacklisted VMs with their current known reasons from the ATVM automation docs
- leave existing OS labels, naming inconsistencies, and typos unchanged while restructuring the table
- update the ATVM status template so the HOSTS: table includes a Kernel column after Host
- document that kernel values should be resolved by cross-referencing hostnames in atvm/inventory/vm-inventory.md
- document that `unknown` should be used when a kernel value cannot be verified from the VM inventory
- align the ATVM automation guide so local status output and Mattermost posts use the kernel-aware host table
- update the ATVM status template to include COVERAGE: and FUNCTIONALLY: sections ahead of the existing summary tables
- document that COVERAGE: should describe intended run scope without listing target hosts
- document that FUNCTIONALLY: should summarize the intended workflow steps at a high level
- align the ATVM automation guide so local status output and Mattermost posts use the expanded default format
- remove hardcoded credentials, tokens, registration codes, and similar secret values from tracked ATVM and CDS MCP docs
- replace those values with references to /home/aw/code/cds/.env.credentials.local and the corresponding environment variable names
- update current operator guides to instruct sourcing .env.credentials.local before credential-dependent setup and automation workflows
- update the ATVM setup scripts to consume ATVM_TARGET_PASSWORD from the environment instead of hardcoding the Ubuntu root SSH password
- scrub the remaining tracked artifact log entry that still included the old CMC registration code
- keep the local-only credential inventory in .env.credentials.local while leaving that file untracked
- change ATVM status formatting to the approved Markdown-table template with SUMMARY:, HOSTS:, TIMING:, and NOTES:
- document that normal status requests print locally only unless explicitly asked to send to Mattermost
- document Mattermost defaults and posting rules, including only sending after full run completion
- document the controller-side systemd watcher design for future automation
- add the secrets migration/cleanup review doc
- ignore .env.credentials.local in git and reflect the move toward using that local credentials file instead of hardcoded secrets
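
For the host XML parsing fix listed above (preferring the real host suite over the `Root Suite` entry), the selection logic might look roughly like the sketch below. It assumes a JUnit-style report with `<testsuite name= tests= failures=>` elements; the actual reporter layout, the `pick_host_suite` helper, and the sample host name are illustrative assumptions, not the watcher's real code.

```python
import xml.etree.ElementTree as ET

ROOT_SUITE_NAME = "Root Suite"  # placeholder suite named in the change note


def pick_host_suite(xml_text):
    """Return name/tests/failures for the preferred suite in a report.

    Assumption: JUnit-style <testsuite> elements; the real report may differ.
    """
    root = ET.fromstring(xml_text)
    suites = list(root.iter("testsuite"))
    if not suites:
        raise ValueError("report contains no <testsuite> elements")
    # Prefer any suite that is not the placeholder; fall back to the
    # placeholder only when nothing else exists.
    real = [s for s in suites if s.get("name") != ROOT_SUITE_NAME]
    chosen = real[0] if real else suites[0]
    return {
        "name": chosen.get("name"),
        "tests": int(chosen.get("tests", "0")),
        "failures": int(chosen.get("failures", "0")),
    }


# Hypothetical sample report: the placeholder suite comes first and is empty.
SAMPLE = """<testsuites>
  <testsuite name="Root Suite" tests="0" failures="0"/>
  <testsuite name="example-host" tests="12" failures="1"/>
</testsuites>"""

summary = pick_host_suite(SAMPLE)
```

Taking the first suite blindly would produce the `0 tests, 0 failures` summary described in the fix; filtering out the placeholder first avoids that.
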
Restructure the ATVM folder to separate executable scripts from workflow documentation and long-form environment reference material.
Move setup and automation scripts into scripts/, move setup and automation guides into docs/, add top-level README and workflow conventions, and organize durable environment details into inventory/ while preserving the original long-form ATVM notes under archive/imported-notes/.
Update internal documentation paths to match the new layout and remove the archived Zone.Identifier metadata file.
Add guide rules that treat 192.168.3.191 as the default ATVM target host.
For that IP, default SSH access now assumes root/cdsi2012 credentials and ignores host key mismatches by setting StrictHostKeyChecking=no and UserKnownHostsFile=/dev/null.
Update ATVM setup, automation, and ESX/vCenter guides so future runs use the same default behavior consistently.
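
As an illustration of the default SSH behavior described above, the two OpenSSH options could be assembled into a command line as in this minimal sketch. The `build_ssh_command` helper and the `uname -r` remote command are hypothetical, and authentication is deliberately left out.

```python
from typing import List


def build_ssh_command(host: str, user: str, remote_command: str) -> List[str]:
    """Build an ssh argument list that skips host key verification.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    return [
        "ssh",
        # Accept whatever host key the target presents.
        "-o", "StrictHostKeyChecking=no",
        # Do not read or update a known_hosts file.
        "-o", "UserKnownHostsFile=/dev/null",
        f"{user}@{host}",
        remote_command,
    ]


cmd = build_ssh_command("192.168.3.191", "root", "uname -r")
```

Pointing `UserKnownHostsFile` at `/dev/null` means no host key is remembered between connections, which is why a rebuilt target with a new key does not trigger a mismatch.
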
Update the ATVM automation guidance so exact planned commands must always be shown for operator review before any execution. Require explicit approval before running cmc-templates.py, run-sorry-cypress.py, or any other ATVM run command, and require fresh approval whenever the displayed command set changes. Also record the new approval rule in the ATVM run-learning notes and operator instruction file.
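
One way to picture the "fresh approval whenever the displayed command set changes" rule is to tie each approval to a fingerprint of the exact commands shown. The sketch below is an assumption about how that could be modeled; `ApprovalGate` and the example command strings are hypothetical and are not part of the ATVM tooling.

```python
import hashlib
import json


def command_set_fingerprint(commands):
    """Hash the exact, ordered list of command strings shown to the operator."""
    payload = json.dumps(list(commands)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ApprovalGate:
    """Remembers which displayed command set the operator approved."""

    def __init__(self):
        self._approved = None

    def approve(self, commands):
        self._approved = command_set_fingerprint(commands)

    def is_approved(self, commands):
        # Any edit to any command yields a different fingerprint, so the
        # earlier approval no longer applies.
        return self._approved == command_set_fingerprint(commands)


gate = ApprovalGate()
shown = ["cmc-templates.py --ignore_force_shutdown"]  # hypothetical command text
gate.approve(shown)
still_ok = gate.is_approved(shown)
changed_ok = gate.is_approved(shown + ["run-sorry-cypress.py"])
```
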
Adjust the maintained ATVM blacklist so that atvm156-debian9.3.0 uses the reason RE-CREATE MIGHT BE NEEDED, and remove atvm157-debian13.0.0 from the blacklist entries and reusable exclude examples. Also clarify the ATVM automation workflow so cmc-templates.py must finish successfully before run-sorry-cypress.py is started.
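
The ordering rule (cmc-templates.py must finish successfully before run-sorry-cypress.py starts) can be sketched as a runner that stops at the first non-zero exit code. This is only an illustration: `run_in_order` is a hypothetical helper, and the demonstration uses a fake runner so nothing is executed.

```python
import subprocess
from types import SimpleNamespace


def run_in_order(commands, runner=subprocess.run):
    """Run commands one by one; stop before the next step if one fails."""
    completed = []
    for cmd in commands:
        result = runner(cmd)
        if result.returncode != 0:
            raise RuntimeError(
                f"{cmd[0]} exited with {result.returncode}; later steps not started"
            )
        completed.append(cmd[0])
    return completed


# Dry demonstration: the fake runner records what was started and pretends
# that cmc-templates.py failed.
started = []


def fake_runner(cmd):
    started.append(cmd[0])
    return SimpleNamespace(returncode=1 if cmd[0] == "cmc-templates.py" else 0)


try:
    run_in_order([["cmc-templates.py"], ["run-sorry-cypress.py"]], runner=fake_runner)
    outcome = "all steps ran"
except RuntimeError as exc:
    outcome = str(exc)
```
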
Clarify that ATVM automation status requests refer to the local atvm
folder workflow and the automation VM at 192.168.3.190, not Cirrus
project operations such as the atvm - cypress project.
Update the ATVM guide and AGENTS notes to prefer local evidence in this
order when reporting status:
- active runner processes
- live automation VM files
- shell history for the last launch command
- historical reporter artifacts
Record the new run learning in atvm-automation-runs.md so future status
requests use the correct source of truth.
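
The evidence order above amounts to a first-match lookup over a fixed priority list. A minimal sketch, assuming the caller has already gathered whatever evidence exists into a dictionary (`pick_evidence` and the sample values are hypothetical):

```python
# Priority order taken from the list above, highest first.
EVIDENCE_ORDER = [
    "active runner processes",
    "live automation VM files",
    "shell history for the last launch command",
    "historical reporter artifacts",
]


def pick_evidence(available):
    """Return (source, value) for the highest-priority non-empty source."""
    for source in EVIDENCE_ORDER:
        value = available.get(source)
        if value:
            return source, value
    return None


# Hypothetical situation: no runner is active, but live files and old
# reporter artifacts both exist.
found = pick_evidence({
    "active runner processes": [],
    "live automation VM files": ["run.log"],
    "historical reporter artifacts": ["report.xml"],
})
```
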
Document that ATVM automation defaults always include --ignore_force_shutdown on cmc-templates.py commands and use --use_specified_plugin iscsi unless explicitly overridden.
Update the ATVM automation guide, folder-level AGENTS instructions, and run learnings so future runs follow the same defaults.
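
To make the defaults concrete, the argument list for cmc-templates.py could be assembled as in the sketch below. Only the two flags come from the note above; the `cmc_templates_args` helper, its parameters, and the `nbd` override value are hypothetical.

```python
def cmc_templates_args(plugin="iscsi", extra=()):
    """Return default cmc-templates.py arguments.

    --ignore_force_shutdown is always present; the plugin stays iscsi
    unless the caller explicitly overrides it.
    """
    return ["--ignore_force_shutdown", "--use_specified_plugin", plugin, *extra]


default_args = cmc_templates_args()
# "nbd" is a made-up plugin name used only to show an explicit override.
overridden_args = cmc_templates_args(plugin="nbd")
```
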
Clarify ATVM automation status handling so status requests default to live run data from the automation VM. If no automation is active, fall back to the most recent historical run artifacts and logs.
Also document that categorized runs must always be reconstructed as a single full run across all category batches or cloud sub-runs, rather than reporting only the current or latest batch.
Update:
- atvm/atvm-automation-guide.md
- atvm/AGENTS.md
- atvm/atvm-automation-runs.md
Record the operator preference to stop immediately when a requested ATVM config file is missing. Update the ATVM guide, local agent instructions, and run learnings to avoid searching for or substituting alternate config files without explicit direction.
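
The stop-immediately preference can be expressed as a check that fails loudly instead of searching for alternatives. A minimal sketch, assuming the requested config is given as a file path (`require_config` is a hypothetical helper):

```python
from pathlib import Path


def require_config(path):
    """Return the requested config path, or stop if it does not exist.

    Deliberately performs no search and no substitution of similar files.
    """
    config = Path(path)
    if not config.is_file():
        raise FileNotFoundError(
            f"requested config not found: {config}; not substituting an alternate"
        )
    return config
```

Raising here leaves the choice of any replacement file to the operator, which matches the "without explicit direction" wording above.
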
- show failed machines with a longer failure description on the same status line
- keep Notes for broader context beyond the machine-specific failure reason
- update the ATVM automation guide and AGENTS rules to match
- record the reporting preference in atvm-automation-runs.md
- keep atvm-automation-examples.md limited to reusable example commands
- move example-file role guidance into AGENTS.md and atvm-automation-guide.md
- document that all ATVM automation run types use the same status display format
- record the status-format rule as a run learning in atvm-automation-runs.md
- Treat atvm-automation-examples.md as reference material rather than default operator intent.
- Use only explicitly requested options, plus maintained mandatory blacklist handling.
- Record the rule as a run learning in atvm-automation-runs.md.
Stop defaulting to cypress.atvm-config.ts in ATVM guidance.
Prefer gold-named config files unless explicitly told otherwise.
Update automation examples to use cypress.atvm-config-gold.ts.
Record the run learning explaining why the old default is unreliable.
Add atvm144-suse15.0 to the ATVM automation blacklist because it crashes when creating a migration session.
Update maintained exclude examples to include the new blacklist entry.
Tighten status reporting guidance to require one machine per line.
Add a Notes section for failure reasons and operator-facing context.
Record the new run learnings in atvm-automation-runs.md.