cds-ai/atvm/watcher-service/README.md
anthony.wen c9706e9702 Record cancelled watcher state on ATVM run cancellation
- update the watcher cancel helper so it writes a final CANCELLED state into state.json before stopping the service
- record cancellation timestamps and a cancellation note in the watcher state file for clearer post-run inspection
- update the watcher service docs so the documented cancel behavior matches the state-file handling
2026-03-25 18:24:17 -04:00

ATVM Watcher Service

This folder contains a per-run ATVM watcher service package. Review it locally first; install it on the ATVM Cypress controller only when explicitly requested.

Purpose

Watch a single ATVM automation run until it reaches a terminal state, then:

  • post the final status to Mattermost if the run state is COMPLETED or FAILED
  • verify the Mattermost post succeeded
  • write durable watcher state
  • exit cleanly so the service stops

The watcher does not run indefinitely. It is designed for one run per service instance.
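The one-run, exit-when-done behavior can be sketched as a small polling loop. This is an illustrative sketch, not the code in atvm_run_watcher.py: the function name, the read_state callable, and the 30-second default interval are assumptions, and any state outside the terminal set (for example a hypothetical RUNNING) is treated as "keep polling".

```python
import time
from typing import Callable

# Terminal states named in this README; anything else means "still running".
TERMINAL_STATES = frozenset(
    {"COMPLETED", "FAILED", "CANCELLED", "TERMINATED", "HUNG", "UNKNOWN"}
)


def watch_until_terminal(
    read_state: Callable[[], str],
    poll_seconds: float = 30.0,  # assumed interval, not taken from the real watcher
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll read_state() until it reports a terminal state, then return it.

    Returning (instead of looping forever) is what lets the service
    instance exit once its single run is finished.
    """
    while True:
        state = read_state()
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)
```

Passing sleep as a parameter keeps the loop testable without real delays; a caller would normally leave it at the default.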

Files

  • atvm_run_watcher.py
    • main watcher implementation
  • atvm-run-watcher@.service
    • systemd template unit for one watcher instance per build name
  • start-atvm-run-watcher.sh
    • helper to write per-run environment data and start a watcher instance
  • cancel-atvm-run-watcher.sh
    • helper to mark a run cancelled and stop the watcher instance

Intended Controller Paths

These are the default install targets assumed by the included unit file:

  • service package root: /opt/atvm-watcher-service
  • watcher state root: /var/lib/atvm-run-watcher
  • controller ATVM automation root: /root/cdc-e2e-cyp-12.17.4
  • watcher environment file: /etc/atvm-run-watcher.env

Use /opt/atvm-watcher-service as the controller install root for future installs and reinstalls. Do not treat /root/atvm-watcher-service as the preferred long-term install location.

Per-Run Behavior

Each watcher instance is tied to one build name.

Typical workflow:

  1. Launch the ATVM run.
  2. Start the watcher for that run.
  3. The watcher polls the run log, process state, and cmcReporter artifacts.
  4. When the run reaches a terminal state:
    • COMPLETED or FAILED
      • build the final ATVM status
      • send the status to Mattermost
      • verify Mattermost returned ok
      • mark the run as posted
      • exit
    • CANCELLED, TERMINATED, HUNG, or UNKNOWN
      • do not post
      • mark the final state
      • exit
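The branch in step 4 can be sketched as follows. This is a minimal illustration under stated assumptions: finalize, build_status, and post are hypothetical names, the returned dictionary keys are invented for the example, and "returned ok" is interpreted here as the webhook response body being the text ok.

```python
from typing import Callable, Dict, Union

POSTABLE_STATES = frozenset({"COMPLETED", "FAILED"})
SILENT_STATES = frozenset({"CANCELLED", "TERMINATED", "HUNG", "UNKNOWN"})


def finalize(
    state: str,
    build_status: Callable[[], str],
    post: Callable[[str], str],
) -> Dict[str, Union[str, bool]]:
    """Apply the terminal-state rules and return the final record to persist.

    build_status and post are injected (hypothetical) callables: one builds
    the ATVM status text, the other sends it and returns the response body.
    """
    if state in POSTABLE_STATES:
        response_body = post(build_status())
        # Assumption: a successful post is signalled by a body of "ok".
        posted = response_body.strip() == "ok"
        return {"final_state": state, "posted": posted}
    if state in SILENT_STATES:
        # Never post for these states; only record the final state.
        return {"final_state": state, "posted": False}
    raise ValueError(f"not a terminal state: {state!r}")
```

Keeping the two state sets explicit means an unexpected state fails loudly rather than silently posting or silently exiting.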

Required Environment

The service expects the following values from the local credentials file to be available on the controller through the service environment:

  • MATTERMOST_ATVM_WEBHOOK
  • MATTERMOST_ATVM_CHANNEL

Optional metadata for better status formatting:

  • ATVM_WATCHER_TEMPLATE
  • ATVM_WATCHER_CONFIG_FAMILY
  • ATVM_WATCHER_MIGRATION_STYLE
  • ATVM_WATCHER_INTEGRATION_PLUGIN
  • ATVM_WATCHER_SCOPE_DESCRIPTION
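One way a watcher could validate these variables at startup is sketched below. The variable names come from the lists above; the function name load_watcher_env and the fail-fast behavior are assumptions for illustration, not a description of the actual implementation.

```python
import os
from typing import Dict, Mapping

REQUIRED_VARS = ("MATTERMOST_ATVM_WEBHOOK", "MATTERMOST_ATVM_CHANNEL")
OPTIONAL_VARS = (
    "ATVM_WATCHER_TEMPLATE",
    "ATVM_WATCHER_CONFIG_FAMILY",
    "ATVM_WATCHER_MIGRATION_STYLE",
    "ATVM_WATCHER_INTEGRATION_PLUGIN",
    "ATVM_WATCHER_SCOPE_DESCRIPTION",
)


def load_watcher_env(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the watcher settings, failing fast if a required one is missing."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "missing required environment variables: " + ", ".join(missing)
        )
    config = {name: environ[name] for name in REQUIRED_VARS}
    # Optional metadata only improves status formatting, so absent or empty
    # values are skipped rather than treated as errors.
    for name in OPTIONAL_VARS:
        if environ.get(name):
            config[name] = environ[name]
    return config
```

Failing before the polling loop starts avoids watching a whole run only to discover at the end that the status cannot be posted.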

Start Example

This helper writes a per-run environment file and starts the matching instance:

./start-atvm-run-watcher.sh \
  --build-name e2e-redhat9.6-ubuntu24.04-w2k25-fc \
  --template cmc-e2e \
  --config-family gold \
  --migration-style "ATVM end-to-end migration validation" \
  --integration-plugin "pure with fc" \
  --scope-description "mixed Linux and Windows FC E2E validation on the gold datastore set"

That results in:

  • state dir:
    • /var/lib/atvm-run-watcher/e2e-redhat9.6-ubuntu24.04-w2k25-fc
  • service instance:
    • atvm-run-watcher@e2e-redhat9.6-ubuntu24.04-w2k25-fc.service
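The mapping from build name to state directory and unit instance can be sketched as below. The state root and unit template name are taken from this README; the function name watcher_targets and the input check are assumptions, and the sketch applies no systemd escaping, matching the unescaped instance name in the example above.

```python
from pathlib import PurePosixPath
from typing import Dict

STATE_ROOT = PurePosixPath("/var/lib/atvm-run-watcher")
UNIT_TEMPLATE = "atvm-run-watcher@{instance}.service"


def watcher_targets(build_name: str) -> Dict[str, str]:
    """Derive the per-run state directory and systemd instance name."""
    # Reject names that would escape the state root or break the unit name.
    if not build_name or "/" in build_name or build_name in {".", ".."}:
        raise ValueError(f"unsupported build name: {build_name!r}")
    return {
        "state_dir": str(STATE_ROOT / build_name),
        "unit": UNIT_TEMPLATE.format(instance=build_name),
    }


targets = watcher_targets("e2e-redhat9.6-ubuntu24.04-w2k25-fc")
print(targets["state_dir"])
print(targets["unit"])
```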

Cancel Example

./cancel-atvm-run-watcher.sh --build-name e2e-redhat9.6-ubuntu24.04-w2k25-fc

This writes a cancellation marker, records a final CANCELLED state in state.json (with a cancellation timestamp and note), and then stops the watcher instance. The watcher will not send Mattermost results for that run.
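The state-file update can be illustrated in Python, although the real helper is a shell script. Everything specific here is an assumption: the function name, the field names state, cancelled_at, and cancel_note, and the write-then-rename approach are chosen for the sketch and may not match the actual state.json layout.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict


def write_cancelled_state(
    state_dir: Path, note: str = "cancelled by operator"
) -> Dict[str, Any]:
    """Merge a final CANCELLED record into state_dir/state.json."""
    state_path = state_dir / "state.json"
    state: Dict[str, Any] = {}
    if state_path.exists():
        # Keep whatever the watcher already recorded for this run.
        state = json.loads(state_path.read_text(encoding="utf-8"))
    state.update(
        {
            "state": "CANCELLED",  # hypothetical field names throughout
            "cancelled_at": datetime.now(timezone.utc).isoformat(),
            "cancel_note": note,
        }
    )
    # Write to a temp file in the same directory, then rename over the
    # target, so a reader never sees a half-written state.json.
    fd, tmp_name = tempfile.mkstemp(dir=str(state_dir), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)
    os.replace(tmp_name, state_path)
    return state
```

Writing the state before stopping the service means the file still reflects the cancellation even if the watcher process is killed immediately afterwards.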

Notes

  • The watcher uses the same ATVM status layout documented in atvm/docs/automation/status-template.md.
  • Kernel values are resolved from atvm/inventory/vm-inventory.md.
  • Best-practice controller install path: /opt/atvm-watcher-service.
  • This package is local-only right now. Nothing here is installed on the controller yet.