fix(atvm-watcher): synthesize failed host result on hang-kill/nonzero exit; update run learning and vm inventory

2026-05-12 14:42:11 -04:00
parent 222fb1aaa2
commit 8c4985d33a
3 changed files with 50 additions and 0 deletions

@@ -658,3 +658,10 @@ This file stores run-specific examples only when a run produced a new learning r
- When parsing parent `Cloud Run Finished` tables, treat standalone wrapped `s` rows as duration-cell continuations and remove those rows instead of appending `s` to the end of the host line.
- Rely on the existing duration parser to accept wrapped values without the trailing `s`.
- Replay the exact launch log through the current watcher code after this fix before trusting a corrected host count.
## Run Learning: 2026-05-07 (Synthesize failed host row when hang-kill occurs before reporter artifacts)
- Observed failure mode:
- Some hang-killed runs exit before host-level reporter artifacts are emitted, which can leave Mattermost statuses with a `FAILED` summary but no host rows.
- Action for future runs:
- When a run is marked `FAILED` from hang-kill markers or a non-zero runner exit and no host results are available, synthesize one failed host row from the current host/spec inference.
- Use a clear failure detail such as `hang timeout killed runner` so operator-facing status always includes a concrete host failure line.
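
The action above can be sketched roughly as follows. This is a minimal illustration, not the watcher's actual code: `HostResult`, `ensure_failed_host_row`, the placeholder host/spec names, and the exit-code detail string are all hypothetical. Only the `hang timeout killed runner` detail text comes from the note itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HostResult:
    # Hypothetical shape of one host row in the operator-facing status.
    host: str
    spec: str
    status: str
    detail: str


def ensure_failed_host_row(
    run_status: str,
    host_results: List[HostResult],
    inferred_host: Optional[str],
    inferred_spec: Optional[str],
    hang_killed: bool,
    exit_code: Optional[int],
) -> List[HostResult]:
    """Return host_results, adding one synthesized failed row when needed."""
    # Only synthesize when the run failed and no real host rows exist.
    if run_status != "FAILED" or host_results:
        return host_results

    if hang_killed:
        detail = "hang timeout killed runner"
    elif exit_code is not None and exit_code != 0:
        # Assumed wording for the non-zero exit case.
        detail = f"runner exited with code {exit_code}"
    else:
        # No failure signal we know how to describe; leave the list as-is.
        return host_results

    return [
        HostResult(
            host=inferred_host or "unknown-host",  # assumed placeholder
            spec=inferred_spec or "unknown-spec",  # assumed placeholder
            status="FAILED",
            detail=detail,
        )
    ]
```

Real host rows always take precedence in this sketch: the synthesized row appears only when the list would otherwise be empty, so a run that did emit reporter artifacts is never overwritten.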