Keep categorized ATVM watcher alive until parent run finishes

- update the watcher to treat categorized parent-run activity as the authoritative signal for whether the overall request is still running
- prevent the watcher from exiting early just because one categorized grouped sub-run completed and wrote artifacts
- document that categorized watcher instances must remain alive between grouped runs until the parent request has actually gone inactive past the grace window
- update the ATVM guide, watcher design, and install docs to reflect the stricter categorized parent-run completion rule
2026-03-26 12:39:23 -04:00
parent 1ba508169f
commit 44e6e0e653
5 changed files with 23 additions and 1 deletions

@@ -120,6 +120,7 @@ Recommended permissions:
- if the run uses `--categorize`, also pass `--categorize` to the watcher start helper
- confirm final Mattermost delivery for a completed run
- confirm categorized execution sends one post per completed grouped sub-run
- confirm the watcher stays alive between categorized grouped runs while the parent request is still active
- confirm reused parent build names do not inherit stale `cancelled.marker`, `posted.marker`, or `subruns/` state from older runs

## Recommended Validation Commands
@@ -191,6 +192,7 @@ The cancel helper should:
- This is not a daemon.
- One watcher instance is started per ATVM run.
- Categorized execution is treated as one watcher instance tracking sequential grouped ATVM sub-runs.
- In categorized execution, the watcher must remain alive until the parent request has actually gone inactive past the grace window, even if one grouped sub-run already completed.
- The watcher exits after the run reaches a terminal state.
- The watcher writes state under `/var/lib/atvm-run-watcher/<build-name>`.
- The watcher prevents duplicate Mattermost posts by writing posted markers.
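
The categorized exit rule described in the notes above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the watcher's actual code: the names `Observation`, `should_exit`, `parent_idle_seconds`, and `grace_seconds` are hypothetical, and how the real watcher detects parent-run activity is not shown in this commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One polling snapshot (hypothetical shape, for illustration only)."""
    categorized: bool           # watcher was started with --categorize
    run_terminal: bool          # the current run or grouped sub-run reached a terminal state
    parent_idle_seconds: float  # time since parent-run activity was last seen; 0.0 while active


def should_exit(obs: Observation, grace_seconds: float) -> bool:
    """Return True when the watcher may stop polling."""
    if not obs.categorized:
        # Non-categorized run: exit once the single run is terminal.
        return obs.run_terminal
    # Categorized run: a finished grouped sub-run is not enough on its own.
    # Only parent inactivity beyond the grace window ends the watcher.
    return obs.parent_idle_seconds > grace_seconds
```

In this sketch a completed grouped sub-run never triggers an exit by itself in categorized mode, so the watcher stays alive between grouped runs as long as the parent request keeps showing activity.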