Tighten ATVM completed-run status verification
@@ -401,3 +401,28 @@ This file stores run-specific examples only when a run produced a new learning r
- Do not classify a reporter TXT artifact as failed just because it contains the word `error`.
- For TXT fallback, require explicit terminal failure markers such as `cy:command error`, `cy:task error`, or real `Error:`/`AssertionError:`/timeout text.
- Prefer the parent run summary when available, because it is less prone to false failure signals than raw per-step console text.
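
The rules above could be sketched roughly as follows. This is a minimal illustration, not the watcher's actual code: `classify_run`, the `summary` dict with a `failures` count, and the `timed out` wording for timeout text are all assumptions made for the example.

```python
import re

# Literal markers named in the notes above.
LITERAL_MARKERS = ("cy:command error", "cy:task error")

# "Real" error lines: Error:/AssertionError: at the start of a line.
ERROR_LINE = re.compile(r"^\s*(?:Error|AssertionError):", re.MULTILINE)

# Assumed wording for timeout text; adjust to the reporter's real output.
TIMEOUT_TEXT = re.compile(r"\btimed out\b", re.IGNORECASE)


def txt_has_terminal_failure(text: str) -> bool:
    """Return True only for explicit failure markers, not the bare word 'error'."""
    if any(marker in text for marker in LITERAL_MARKERS):
        return True
    return bool(ERROR_LINE.search(text) or TIMEOUT_TEXT.search(text))


def classify_run(summary, txt: str) -> str:
    """Prefer the parent run summary; fall back to the TXT artifact."""
    # `summary` is a hypothetical dict such as {"tests": 57, "failures": 1}.
    if summary is not None and "failures" in summary:
        return "failed" if summary["failures"] > 0 else "passed"
    return "failed" if txt_has_terminal_failure(txt) else "passed"
```

With this shape, a TXT artifact that merely mentions `error` in passing is classified as passed, and a parent summary, when present, overrides whatever the raw console text suggests.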

## Run Learning: 2026-03-30 (Replay exact artifacts before assuming a thin closed-run detail is a current watcher bug)
- Observed failure mode:
  - The saved controller state for `reboot-redhat8.10-both` still showed only `1 failures` under the host detail, even though the launch log contained the full md5sum failure text.
  - Replaying the exact launch log and reporter artifacts through the currently installed watcher produced the correct host detail with `57 tests, 1 failures` and the failing testcase/error text.
- Action for future runs:
  - Before patching the watcher again for a thin closed-run detail, replay the exact run artifacts through the currently installed watcher code.
  - Treat a mismatch between saved state and current replay as evidence of a stale in-memory watcher instance or stale deployment, not automatically as a parser regression.
  - Use an isolated temp state directory or other no-post path for that replay so historical validation does not repost results.
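
One way to structure such a replay is sketched below. The watcher's real entry point is not shown in these notes, so `replay` is a hypothetical callable standing in for it, and `replay_in_isolation` and `diagnose` are illustrative names only.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def replay_in_isolation(
    artifacts: Iterable[Path],
    replay: Callable[[Path, Path], str],
) -> str:
    """Replay copied artifacts against a throwaway state directory.

    `replay(artifact_dir, state_dir)` is a hypothetical hook for the
    installed watcher's parser; it should return the host detail text
    and must not post results anywhere.
    """
    with tempfile.TemporaryDirectory(prefix="watcher-replay-") as tmp:
        root = Path(tmp)
        artifact_dir = root / "artifacts"
        state_dir = root / "state"
        artifact_dir.mkdir()
        state_dir.mkdir()
        # Work on copies so the original run artifacts stay untouched.
        for src in artifacts:
            shutil.copy2(src, artifact_dir / Path(src).name)
        return replay(artifact_dir, state_dir)


def diagnose(saved_detail: str, replayed_detail: str) -> str:
    """Compare saved controller state with the fresh replay."""
    # A mismatch points at a stale watcher instance or deployment first.
    return "match" if saved_detail == replayed_detail else "mismatch"
```

Because the state directory lives inside the temporary directory, it is removed when the replay finishes, so nothing from the historical run leaks into the live controller state.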

## Run Learning: 2026-03-30 (Red Hat 8.10 Pure both failure on step 38 was a missing FC reboot-validation artifact with concurrent storage instability)
- Observed failure mode:
  - The failing testcase was `38. Verify diskname2Reboot file is the same as diskname2Reboot’s source (Reboot test)`.
  - The concrete error was `md5sum: /root/tmp/fcDisk/diskname2Reboot.md5: No such file or directory`.
  - On the target after the run, `/root/tmp/fcDisk` contained `diskname2Disk` and `diskname2Disk.md5`, but not `diskname2Reboot.md5`.
- Additional host findings:
  - The target showed repeated iSCSI authorization failures and later `Could not log into all portals`.
  - `mtdi-driver.service` started at `17:30:26 EDT`.
  - `iscsid.service` / `Open-iSCSI` started at `17:30:30 EDT`.
  - `iscsi.service`, `mtdi-daemon.service`, and `galaxy-migrate.service` reached active state at `17:32:45 EDT`.
  - Repeated multipath reinitialization and `failed to get ... uid` messages continued through the run window.
- Action for future runs:
  - If this failure recurs, treat it as a host/storage investigation first, not just a watcher-formatting issue.
  - Check whether the FC reboot-validation step actually created `diskname2Reboot.md5` on `/root/tmp/fcDisk` before the md5 verification step ran.
  - Check whether repeated iSCSI auth failures or multipath churn during the same boot window are interfering with the expected disk/file state.
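
The two checks above might be scripted along these lines. This is only a sketch: the helper names are invented for illustration, the log text is assumed to have been collected from the target already, and the patterns cover only the two message shapes quoted in these notes.

```python
import re
from pathlib import Path


def missing_artifacts(directory, expected):
    """Return the expected file names that are absent from `directory`."""
    base = Path(directory)
    return [name for name in expected if not (base / name).is_file()]


# Message shapes quoted in the host findings; other wording is not covered.
STORAGE_PATTERNS = {
    "portal_login": re.compile(r"Could not log into all portals"),
    "multipath_uid": re.compile(r"failed to get .* uid"),
}


def count_storage_signals(log_text: str) -> dict:
    """Count log lines that match each known storage-instability pattern."""
    counts = {key: 0 for key in STORAGE_PATTERNS}
    for line in log_text.splitlines():
        for key, pattern in STORAGE_PATTERNS.items():
            if pattern.search(line):
                counts[key] += 1
    return counts
```

For example, running `missing_artifacts("/root/tmp/fcDisk", ["diskname2Reboot.md5"])` on the target before step 38 would show whether the verification step has anything to compare against, while nonzero counts from `count_storage_signals` for the same boot window would support a storage-side investigation.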