64 Commits

Author SHA1 Message Date
3b9b9eef0f Merge categorized watcher subruns by inferred group identity
- update the watcher to merge consecutive categorized child build ids into one logical grouped subrun when they resolve to the same real host-based group
- prevent one real grouped run from being fragmented across multiple watcher subruns just because the raw ci-build-id label changes between hosts
- keep grouped Mattermost posting aligned with the inferred group identity instead of the unstable raw child build id
2026-03-26 16:51:10 -04:00
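Illustration for 3b9b9eef0f: a minimal sketch of merging consecutive child build ids by inferred group, assuming each sub-run record carries a `build_id` and a `host` field. The record shape, `infer_group_from_host`, and the host-name prefixes are hypothetical and are not taken from the watcher source.

```python
from itertools import groupby


def infer_group_from_host(subrun):
    """Hypothetical rule: derive the group from the host name, not the raw label."""
    host = subrun["host"].lower()
    if host.startswith("rhel"):
        return "redhat"
    if host.startswith("ubuntu"):
        return "ubuntu"
    return "unknown"


def merge_consecutive_subruns(subruns, infer_group=infer_group_from_host):
    """Collapse adjacent sub-runs that resolve to the same inferred group."""
    merged = []
    # groupby only merges *consecutive* items with the same key, which matches
    # the "consecutive child build ids" wording in the commit message.
    for group, members in groupby(subruns, key=infer_group):
        merged.append({"group": group, "build_ids": [m["build_id"] for m in members]})
    return merged
```

With this shape, two child build ids labelled differently but both running on `rhel*` hosts end up in one logical sub-run keyed by `redhat`.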
35342c1f23 Post categorized watcher results from early host completion evidence
- update the categorized watcher to use the latest matching host reporter artifact written before grouped finalization so grouped results can be posted as soon as the host run is actually done
- reduce dependence on the later parent Cloud Run Finished summary block for grouped-run completion reporting
- keep grouped runs from waiting until near parent-run completion when real host-level completion evidence is already available
2026-03-26 15:19:54 -04:00
293b9bf239 Delay categorized watcher posts until host summary is available
- keep categorized grouped sub-runs in RUNNING state when only the grouped check-xml-files.ts artifact has appeared and the parent log has not yet exposed the real host summary
- prevent blank Mattermost grouped-run posts with empty host tables, zero counts, and zero/invalid timing caused by posting too early
- require the categorized watcher to wait for host-level grouped summary data before treating the grouped run as complete
2026-03-26 14:40:11 -04:00
f5eb21cccd Infer categorized watcher group names from actual host execution
- update the watcher to stop trusting misleading categorized child build labels when they do not match the host/spec actually being executed
- infer the reported categorized group name from the actual host being run, so mismatched labels like ubuntu-batch for a Red Hat host are corrected in status reporting
- document the categorized watcher workaround in the ATVM guide, watcher design, and watcher README without changing the underlying ATVM runner scripts
2026-03-26 14:20:22 -04:00
7d49896ac2 Improve categorized watcher duration parsing for grouped runs
- relax grouped-run duration parsing so Linux categorized summaries like `12m 19.4` and `15m 42.2` are converted reliably even when the trailing `s` is split awkwardly in the parent run log
- keep the categorized grouped-run summary extraction aligned with the host results already being parsed from the parent Cloud Run Finished blocks
- refresh the controller watcher copy so the next categorized run uses the improved grouped duration parser
2026-03-26 13:41:06 -04:00
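Illustration for 7d49896ac2: a minimal sketch of a relaxed duration parser that accepts values such as `12m 19.4` with or without a trailing `s`, including when whitespace or a line break separates the number from the `s`. The function name and the exact grammar (optional hours, minutes, seconds) are assumptions; the watcher's real parser is not shown in this log.

```python
import re
from typing import Optional

# Optional hours and minutes, optional fractional seconds, optional trailing "s".
# \s also matches newlines, so a trailing "s" pushed onto the next line still parses.
_DURATION_RE = re.compile(
    r"^\s*(?:(?P<h>\d+)\s*h)?\s*(?:(?P<m>\d+)\s*m)?\s*"
    r"(?:(?P<s>\d+(?:\.\d+)?)\s*s?)?\s*$"
)


def parse_duration_seconds(text: str) -> Optional[float]:
    """Return the duration in seconds, or None if no component could be parsed."""
    match = _DURATION_RE.match(text)
    if match is None:
        return None
    parts = match.groupdict()
    if all(value is None for value in parts.values()):
        return None  # empty or whitespace-only input
    hours = int(parts["h"] or 0)
    minutes = int(parts["m"] or 0)
    seconds = float(parts["s"] or 0.0)
    return hours * 3600 + minutes * 60 + seconds
```

Returning `None` rather than `0.0` for unparseable input keeps a failed parse distinguishable from a genuinely zero-length run.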
c74f74bc46 Improve categorized watcher grouped-run summary extraction
- update the watcher to parse completed categorized grouped-run host summaries from the parent run log instead of relying only on grouped XML files that often contain only check-xml-files.ts
- add grouped-run duration parsing so categorized sub-run timing can be derived from the Cloud Run Finished summary when host XML details are absent
- fix the completed-summary to grouped-xml alignment so filtered-out older artifacts do not shift host-summary assignment for the current run
2026-03-26 13:22:37 -04:00
44e6e0e653 Keep categorized ATVM watcher alive until parent run finishes
- update the watcher to treat categorized parent-run activity as the authoritative signal for whether the overall request is still running
- prevent the watcher from exiting early just because one categorized grouped sub-run completed and wrote artifacts
- document that categorized watcher instances must remain alive between grouped runs until the parent request has actually gone inactive past the grace window
- update the ATVM guide, watcher design, and install docs to reflect the stricter categorized parent-run completion rule
2026-03-26 12:39:23 -04:00
1ba508169f Fix watcher timestamp handling for reused categorized runs
- interpret ATVM controller log timestamps in the controller's local timezone before converting to UTC so the watcher uses the correct current-run window
- prevent newly started categorized runs from immediately picking up older categorized artifacts just because the parent build name was reused
- keep the categorized watcher focused on artifacts from the current controller run instead of stale prior attempts
2026-03-26 12:26:23 -04:00
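Illustration for 1ba508169f: a minimal sketch of interpreting a naive controller log timestamp in the controller's local timezone before converting it to UTC. The `%Y-%m-%d %H:%M:%S` log format and the function name are assumptions; the fixed `-04:00` offset simply mirrors the offset shown on the commit dates in this log.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def controller_timestamp_to_utc(text: str, controller_tz: tzinfo) -> datetime:
    """Parse a naive log timestamp as controller-local time and return it in UTC."""
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    # Attach the controller's zone first; treating the naive value as UTC
    # would shift the current-run window by the zone offset.
    local = naive.replace(tzinfo=controller_tz)
    return local.astimezone(timezone.utc)


# Example: a controller running at UTC-04:00.
CONTROLLER_TZ = timezone(timedelta(hours=-4))
run_start_utc = controller_timestamp_to_utc("2026-03-26 12:26:23", CONTROLLER_TZ)
```

A fixed offset keeps the sketch dependency-free. A `zoneinfo.ZoneInfo` instance can be passed instead when daylight-saving transitions matter.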
3ea732d63c Improve categorized ATVM watcher sub-run detection
- update the watcher to detect the active categorized sub-run from the live `--ci-build-id` process state instead of treating the parent run as one synthetic grouped run
- fix host XML parsing so the watcher prefers the real host suite over the `Root Suite` entry, avoiding `0 tests, 0 failures` summaries
- use the first timestamp inside the run log as the watcher start time so restarted watchers do not miss current-run categorized artifacts because of log file mtime drift
- improve active-host inference for categorized runs so the watcher maps the current categorized build to the correct host family while the sub-run is still in progress
2026-03-26 12:01:07 -04:00
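Illustration for 3ea732d63c: a minimal sketch of preferring the real host suite over the `Root Suite` entry when summarizing a host XML report. It assumes a JUnit-style layout with `<testsuite>` elements carrying `name`, `tests`, and `failures` attributes; the function name and the tie-breaking rule (largest `tests` count wins) are hypothetical.

```python
import xml.etree.ElementTree as ET


def _int_attr(suite, attr):
    """Read an integer attribute, treating missing or malformed values as 0."""
    try:
        return int(suite.get(attr, "0"))
    except ValueError:
        return 0


def summarize_host_suite(xml_text):
    """Return name/tests/failures for the most relevant suite, or None if absent."""
    root = ET.fromstring(xml_text)
    suites = list(root.iter("testsuite"))
    if not suites:
        return None
    # Skip the "Root Suite" wrapper when any other suite exists, since it
    # typically reports 0 tests and 0 failures.
    real = [s for s in suites if s.get("name") != "Root Suite"]
    chosen = max(real or suites, key=lambda s: _int_attr(s, "tests"))
    return {
        "name": chosen.get("name"),
        "tests": _int_attr(chosen, "tests"),
        "failures": _int_attr(chosen, "failures"),
    }
```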
f5849dde0c Reset reused watcher state before starting a new ATVM run
- update the watcher start helper to stop any stale watcher instance for the same requested parent build name and remove its old state directory before starting fresh
- document that reused parent build names must not inherit stale cancelled, posted, state.json, or subruns state from older runs
- update the watcher install and design docs so the controller workflow explicitly treats stale reused-build-name state as part of startup cleanup
2026-03-26 11:30:28 -04:00
d60b8b9b18 Update ATVM watcher for categorized sub-run posting
- update the watcher design and automation guide to treat --categorize as sequential ATVM sub-runs rather than one parent run with internal phases
- document that categorized runs should send one Mattermost status per completed grouped sub-run instead of one parent-only final post
- add a --categorize option to the watcher start helper so categorized mode is explicit in watcher startup
- update the watcher implementation to track categorized sub-runs separately, write per-subrun state, and post each completed grouped run once
2026-03-26 11:00:39 -04:00
c9706e9702 Record cancelled watcher state on ATVM run cancellation
- update the watcher cancel helper so it writes a final CANCELLED state into state.json before stopping the service
- record cancellation timestamps and a cancellation note in the watcher state file for clearer post-run inspection
- update the watcher service docs so the documented cancel behavior matches the state-file handling
2026-03-25 18:24:17 -04:00
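Illustration for c9706e9702: a minimal sketch of writing a final CANCELLED state into `state.json` before the service is stopped. The field names (`status`, `cancelled_at`, `cancel_note`) and the function name are assumptions; only the `state.json` file name and the CANCELLED value come from the commit message.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def record_cancelled_state(state_path, note):
    """Merge a CANCELLED marker into the existing state file and return the new state."""
    try:
        with open(state_path, encoding="utf-8") as handle:
            state = json.load(handle)
    except (FileNotFoundError, json.JSONDecodeError):
        state = {}  # start fresh if the file is missing or unreadable

    state["status"] = "CANCELLED"
    state["cancelled_at"] = datetime.now(timezone.utc).isoformat()
    state["cancel_note"] = note

    # Write to a temp file in the same directory, then rename over the target,
    # so a reader never sees a half-written state file.
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)
    os.replace(tmp_path, state_path)
    return state
```

Writing the state before stopping the service means the cancellation is recorded even if the stop step itself fails.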
9caa7deb94 Stop tracking Python bytecode in watcher service
- remove the committed watcher-service __pycache__ bytecode file from git tracking
- ignore Python bytecode artifacts so generated .pyc files do not get committed again
- keep the watcher-service source files as the only tracked implementation artifacts
2026-03-25 17:44:52 -04:00
ba8354b95c Add ATVM watcher service and explicit watcher approval flow
- add the per-run ATVM watcher service package under atvm/watcher-service, including the Python watcher, systemd template unit, helper scripts, and deployment docs
- document the watcher-service install and operating model, including one-run-per-instance behavior, Mattermost posting rules, and the best-practice /opt/atvm-watcher-service install path
- clarify ATVM run approval semantics so `approve` means run without watcher and `approve with watcher` means run and start the watcher
- update the ATVM automation guide and AGENTS rules so watcher usage and approval behavior are explicit and consistent
2026-03-25 17:41:50 -04:00