Add ATVM watcher service and explicit watcher approval flow
- add the per-run ATVM watcher service package under atvm/watcher-service, including the Python watcher, systemd template unit, helper scripts, and deployment docs
- document the watcher-service install and operating model, including one-run-per-instance behavior, Mattermost posting rules, and the best-practice /opt/atvm-watcher-service install path
- clarify ATVM run approval semantics so `approve` means run without watcher and `approve with watcher` means run and start the watcher
- update the ATVM automation guide and AGENTS rules so watcher usage and approval behavior are explicit and consistent
109
atvm/watcher-service/README.md
Normal file
@@ -0,0 +1,109 @@
# ATVM Watcher Service

This folder contains a per-run ATVM watcher service package that is intended to be reviewed locally first and installed on the ATVM Cypress controller later only when explicitly requested.

## Purpose

Watch a single ATVM automation run until it reaches a terminal state, then:

- post the final status to Mattermost if the run state is `COMPLETED` or `FAILED`
- verify the Mattermost post succeeded
- write durable watcher state
- exit cleanly so the service stops

The watcher does not run indefinitely. It is designed for one run per service instance.

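
The "write durable watcher state" step could be sketched as an atomic JSON write. This is only an illustration, not the code in `atvm_run_watcher.py`: the `state.json` file name, the record fields, and the `write_state` helper are all assumptions made for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def write_state(state_dir: Path, record: dict) -> Path:
    """Atomically write the record to <state_dir>/state.json (hypothetical file name)."""
    state_dir.mkdir(parents=True, exist_ok=True)
    target = state_dir / "state.json"
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(state_dir), prefix=".state-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(record, handle, indent=2, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in one step, so readers never see a partial write.
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```

Writing to a temporary file and renaming it means a crash mid-write leaves the previous state file intact, which is one way to keep the state durable across watcher restarts.
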
## Files

- `atvm_run_watcher.py`
  - main watcher implementation
- `atvm-run-watcher@.service`
  - `systemd` template unit for one watcher instance per build name
- `start-atvm-run-watcher.sh`
  - helper to write per-run environment data and start a watcher instance
- `cancel-atvm-run-watcher.sh`
  - helper to mark a run cancelled and stop the watcher instance

## Intended Controller Paths

These are the default install targets assumed by the included unit file:

- service package root: `/opt/atvm-watcher-service`
- watcher state root: `/var/lib/atvm-run-watcher`
- controller ATVM automation root: `/root/cdc-e2e-cyp-12.17.4`
- watcher environment file: `/etc/atvm-run-watcher.env`

Use `/opt/atvm-watcher-service` as the controller install root for future installs and reinstalls.
Do not treat `/root/atvm-watcher-service` as the preferred long-term install location.

## Per-Run Behavior

Each watcher instance is tied to one build name.

Typical workflow:

1. Launch the ATVM run.
2. Start the watcher for that run.
3. The watcher polls the run log, process state, and `cmcReporter` artifacts.
4. When the run reaches a terminal state:
   - `COMPLETED` or `FAILED`
     - build the final ATVM status
     - send the status to Mattermost
     - verify Mattermost returned `ok`
     - mark the run as posted
     - exit
   - `CANCELLED`, `TERMINATED`, `HUNG`, or `UNKNOWN`
     - do not post
     - mark the final state
     - exit

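
The terminal-state handling in step 4 can be sketched as a small dispatch function. This is a simplified illustration under stated assumptions, not the actual watcher logic: `finish_run`, the record layout, and the injected `send` callable are hypothetical, and comparing the response body to `ok` is one plausible reading of "verify Mattermost returned `ok`".

```python
from typing import Callable

# Terminal states that trigger a Mattermost post, per the workflow above.
POSTABLE_STATES = {"COMPLETED", "FAILED"}
# Terminal states that are recorded without posting.
SILENT_STATES = {"CANCELLED", "TERMINATED", "HUNG", "UNKNOWN"}


def finish_run(state: str, status_text: str, send: Callable[[str], str]) -> dict:
    """Return the final record for a terminal state, posting only when allowed.

    `send` is a hypothetical callable that delivers the text to Mattermost
    and returns the response body as a string.
    """
    if state in POSTABLE_STATES:
        reply = send(status_text)
        # Assumption: a successful post is signalled by the body text "ok".
        posted = reply.strip().lower() == "ok"
        return {"state": state, "posted": posted}
    if state in SILENT_STATES:
        # Never call send() here, so no message can leak for these states.
        return {"state": state, "posted": False}
    raise ValueError(f"not a terminal state: {state!r}")
```

Passing `send` in as a parameter keeps the decision logic testable without a live webhook; a test can supply a stub that records calls and returns a canned reply.
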
## Required Environment

The service expects the following values from the local credentials file to be available on the controller through the service environment:

- `MATTERMOST_ATVM_WEBHOOK`
- `MATTERMOST_ATVM_CHANNEL`

Optional metadata for better status formatting:

- `ATVM_WATCHER_TEMPLATE`
- `ATVM_WATCHER_CONFIG_FAMILY`
- `ATVM_WATCHER_MIGRATION_STYLE`
- `ATVM_WATCHER_INTEGRATION_PLUGIN`
- `ATVM_WATCHER_SCOPE_DESCRIPTION`

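
A watcher could validate these variables at startup along the following lines. This is a minimal sketch: the variable names come from the lists above, while `load_settings` and its fail-fast behavior are assumptions for illustration, not a description of `atvm_run_watcher.py`.

```python
import os
from typing import Mapping, Optional

REQUIRED = ("MATTERMOST_ATVM_WEBHOOK", "MATTERMOST_ATVM_CHANNEL")
OPTIONAL = (
    "ATVM_WATCHER_TEMPLATE",
    "ATVM_WATCHER_CONFIG_FAMILY",
    "ATVM_WATCHER_MIGRATION_STYLE",
    "ATVM_WATCHER_INTEGRATION_PLUGIN",
    "ATVM_WATCHER_SCOPE_DESCRIPTION",
)


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect watcher settings, failing fast if a required value is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required environment: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    # Optional metadata only improves status formatting, so absent values are skipped.
    settings.update({name: env[name] for name in OPTIONAL if env.get(name)})
    return settings
```

Failing before the first poll, rather than at post time, would surface a misconfigured environment file while the run is still in progress.
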
## Start Example

This helper writes a per-run environment file and starts the matching instance:

```bash
./start-atvm-run-watcher.sh \
  --build-name e2e-redhat9.6-ubuntu24.04-w2k25-fc \
  --template cmc-e2e \
  --config-family gold \
  --migration-style "ATVM end-to-end migration validation" \
  --integration-plugin "pure with fc" \
  --scope-description "mixed Linux and Windows FC E2E validation on the gold datastore set"
```

That results in:

- state dir:
  - `/var/lib/atvm-run-watcher/e2e-redhat9.6-ubuntu24.04-w2k25-fc`
- service instance:
  - `atvm-run-watcher@e2e-redhat9.6-ubuntu24.04-w2k25-fc.service`

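
The mapping from build name to state directory and unit instance shown above can be expressed as a small helper. This is a sketch for illustration only: `watcher_targets` is a hypothetical function, and the character check is a conservative assumption made here, not a rule taken from the helper scripts.

```python
import re
from pathlib import PurePosixPath

STATE_ROOT = PurePosixPath("/var/lib/atvm-run-watcher")
UNIT_TEMPLATE = "atvm-run-watcher@{instance}.service"
# Assumed safe set: letters, digits, dot, underscore, hyphen; must not start with a dot.
SAFE_BUILD_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def watcher_targets(build_name: str) -> dict:
    """Derive the per-run state directory and systemd instance name from a build name."""
    if not SAFE_BUILD_NAME.fullmatch(build_name):
        raise ValueError(f"unsupported build name: {build_name!r}")
    return {
        "state_dir": str(STATE_ROOT / build_name),
        "unit": UNIT_TEMPLATE.format(instance=build_name),
    }
```

Rejecting names with slashes or leading dots keeps a build name from resolving to a path outside the state root.
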
## Cancel Example

```bash
./cancel-atvm-run-watcher.sh --build-name e2e-redhat9.6-ubuntu24.04-w2k25-fc
```

This writes a cancellation marker and stops the watcher instance. The watcher will not send Mattermost results for that run.

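
One way a cancellation marker could work is as an empty file in the run's state directory that the watcher checks before posting. The sketch below assumes a marker file named `cancelled`; that name and both helper functions are hypothetical and may not match what `cancel-atvm-run-watcher.sh` actually writes.

```python
from pathlib import Path

CANCEL_MARKER = "cancelled"  # hypothetical marker file name


def mark_cancelled(state_dir: Path) -> Path:
    """Create the cancellation marker for a run; safe to call more than once."""
    state_dir.mkdir(parents=True, exist_ok=True)
    marker = state_dir / CANCEL_MARKER
    marker.touch(exist_ok=True)
    return marker


def is_cancelled(state_dir: Path) -> bool:
    """Return True if the run was cancelled, meaning no Mattermost post should be sent."""
    return (state_dir / CANCEL_MARKER).is_file()
```

A file-based marker survives a service restart, so a watcher started again for the same build name would still see that the run was cancelled.
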
## Notes

- The watcher uses the same ATVM status layout documented in `atvm/docs/automation/status-template.md`.
- Kernel values are resolved from `atvm/inventory/vm-inventory.md`.
- Best-practice controller install path: `/opt/atvm-watcher-service`.
- This package is local-only right now. Nothing here is installed on the controller yet.