Fix categorized ATVM watcher host result recovery
@@ -356,3 +356,20 @@ This file stores run-specific examples only when a run produced a new learning r
- Keep the maintained `--exclude_partial_match` list for broad selectors such as `--containsVm` or `--randomize`.
- When the operator uses `--specify_vms`, do not auto-add the blacklist unless they explicitly request it.
- Even when the operator uses `--specify_vms`, first check whether any requested VM is on the maintained blacklist and stop instead of launching it if one is included.

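
The `--specify_vms` pre-flight check above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names and the substring-based partial match are assumptions, and the blacklist entries in the usage note are hypothetical.

```python
from typing import Iterable, List


def blacklisted_requests(requested_vms: Iterable[str], blacklist: Iterable[str]) -> List[str]:
    """Return the requested VM names that partially match a blacklist entry.

    Assumption: "partial match" means a case-insensitive substring match,
    mirroring what an --exclude_partial_match style list suggests.
    """
    patterns = [entry.lower() for entry in blacklist if entry]
    return [
        vm for vm in requested_vms
        if any(pattern in vm.lower() for pattern in patterns)
    ]


def preflight_specify_vms(requested_vms: Iterable[str], blacklist: Iterable[str]) -> None:
    """Stop before launch if any explicitly requested VM is blacklisted."""
    hits = blacklisted_requests(requested_vms, blacklist)
    if hits:
        # Do not silently drop the VM; the operator asked for it by name,
        # so surface the conflict and let them decide.
        raise ValueError("requested VMs are on the maintained blacklist: " + ", ".join(hits))
```

For example, `preflight_specify_vms(["vm-a", "vm-b"], ["vm-b"])` would raise before anything is launched, while a request that matches no blacklist entry passes through unchanged.
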
## Run Learning: 2026-03-30 (Controller watcher deployment must match the repo watcher before trusting live posts)
- Observed failure mode:
  - The repo watcher had the corrected `cmc-reboot` flow, but the controller install at `/opt/atvm-watcher-service/atvm_run_watcher.py` still had the old generic 5-step fallback.
  - A live categorized reboot subrun therefore posted the stale 5-step `TEST FLOW:` even though the repo copy had already been fixed.
- Action for future runs:
  - Before trusting watcher-generated live posts for new watcher behavior, verify that the controller install matches the intended repo watcher version.
  - If the controller install is stale and the operator approves it, deploy the updated watcher code to `/opt/atvm-watcher-service` and restart only the watcher instance for the active build.

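
One way to check that the controller install matches the repo watcher is to compare file digests. The sketch below assumes both copies are readable from the same machine; the helper names are hypothetical, and it only reports a mismatch. Deployment and restart stay a separate, operator-approved step.

```python
import hashlib
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


def file_sha256(path: PathLike) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def watcher_is_current(repo_copy: PathLike, installed_copy: PathLike) -> bool:
    """True only if the installed watcher is byte-identical to the repo copy.

    A missing install counts as stale rather than raising, so the caller
    can report it the same way as an outdated file.
    """
    installed = Path(installed_copy)
    if not installed.is_file():
        return False
    return file_sha256(repo_copy) == file_sha256(installed)
```

In this setting the second argument would be `/opt/atvm-watcher-service/atvm_run_watcher.py`, and a `False` result means live posts should not yet be trusted to reflect new watcher behavior.
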
## Run Learning: 2026-03-30 (Categorized grouped recovery must parse real per-host reporter status, not assume pass)
- Observed failure mode:
  - A categorized Red Hat reboot subrun posted both hosts as passed even though `atvm71-redhat9.1` actually failed during `1. Verifying set up`.
  - The grouped XML only contained `check-xml-files.ts`, and the watcher incorrectly treated the presence of a per-host reporter artifact as `PASS completed`.
- Action for future runs:
  - When grouped XML lacks explicit host testcase results, recover grouped host status from the per-host reporter JSON or equivalent detailed artifact.
  - Carry through the real `failures`, `pending`, and failure message from that host artifact instead of assuming `PASS completed`.
  - If a correction post is needed because stale or reconstructed state was wrong, mark it explicitly as a correction that supersedes the earlier result.
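
The recovery rule above could look roughly like this. It is a sketch under stated assumptions, not the watcher's real implementation: the JSON shape (a `stats` object with `failures` and `pending` counts, plus a `failures` list whose entries carry `fullTitle` and `err.message`) resembles Mocha's JSON reporter and is assumed here, and `HostResult` and `recover_host_result` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class HostResult:
    host: str
    status: str  # "PASS", "FAIL", or "UNKNOWN"
    failures: int
    pending: int
    message: Optional[str] = None


def recover_host_result(host: str, reporter_json: Optional[str]) -> HostResult:
    """Derive a host's status from its reporter JSON, never from mere presence."""
    if not reporter_json:
        return HostResult(host, "UNKNOWN", 0, 0, "no per-host reporter artifact")
    try:
        report = json.loads(reporter_json)
    except json.JSONDecodeError as exc:
        return HostResult(host, "UNKNOWN", 0, 0, f"unreadable reporter artifact: {exc}")

    stats = report.get("stats") if isinstance(report, dict) else None
    if not isinstance(stats, dict) or "failures" not in stats:
        # Without an explicit failure count, do not guess a pass.
        return HostResult(host, "UNKNOWN", 0, 0, "reporter artifact has no failure count")

    failures = int(stats["failures"])
    pending = int(stats.get("pending", 0))

    message = None
    details = report.get("failures") or []
    if failures and details and isinstance(details[0], dict):
        first = details[0]
        title = first.get("fullTitle", "")
        err = (first.get("err") or {}).get("message", "")
        message = f"{title}: {err}".strip(": ")

    return HostResult(host, "FAIL" if failures else "PASS", failures, pending, message)
```

The design choice is that a host is reported as passed only when the artifact explicitly records zero failures; a missing or unreadable artifact yields `UNKNOWN` so the post cannot repeat the `PASS completed` assumption.
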