Reset reused watcher state before starting a new ATVM run

- update the watcher start helper to stop any stale watcher instance for the same requested parent build name and remove its old state directory before starting fresh
- document that reused parent build names must not inherit stale cancelled, posted, state.json, or subruns state from older runs
- update the watcher install and design docs so the controller workflow explicitly treats stale reused-build-name state as part of startup cleanup
@@ -144,6 +144,10 @@ Possible contents:
 - one cancellation marker per parent run id
 - optional lock file to prevent multiple watcher instances from racing
 
+When the same requested parent build name is reused for a new run:
+- the watcher start workflow must clear old watcher state for that requested build name before starting
+- stale `cancelled.marker`, `posted.marker`, `state.json`, and `subruns/` contents must not be allowed to affect the new run
+
 ## Recommended Operator Workflow
 Normal completion workflow:
 1. ATVM run starts.
@@ -120,6 +120,7 @@ Recommended permissions:
 - if the run uses `--categorize`, also pass `--categorize` to the watcher start helper
 - confirm final Mattermost delivery for a completed run
 - confirm categorized execution sends one post per completed grouped sub-run
+- confirm reused parent build names do not inherit stale `cancelled.marker`, `posted.marker`, or `subruns/` state from older runs
 
 ## Recommended Validation Commands
 
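The validation item about reused build names can be spot-checked with a small script. The following is a minimal sketch and not part of this commit: `find_stale_entries` is a hypothetical helper name, and it assumes the check runs right after the start helper, before the new run has had a chance to write markers of its own. The entry names are the ones listed in the docs above.

```bash
#!/usr/bin/env bash
# Hypothetical validation sketch; not part of the commit.
set -eu

# find_stale_entries RUN_DIR
# Prints one line per leftover entry and returns 1 if any were found.
find_stale_entries() {
  local run_dir="${1:?run directory is required}"
  local found=0
  local name

  # Marker files that must not survive a reset.
  for name in cancelled.marker posted.marker; do
    if [ -e "${run_dir}/${name}" ]; then
      echo "stale entry: ${run_dir}/${name}"
      found=1
    fi
  done

  # Assumption: subruns/ may exist on a fresh start but should be empty.
  if [ -d "${run_dir}/subruns" ] && [ -n "$(ls -A "${run_dir}/subruns")" ]; then
    echo "stale entry: ${run_dir}/subruns/"
    found=1
  fi

  return "$found"
}
```

A call might look like `find_stale_entries "${STATE_ROOT}/${BUILD_NAME}"`, where `STATE_ROOT` is whatever state root the start helper is configured with.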
@@ -154,6 +155,7 @@ Once installed, the intended workflow is:
 
 1. Launch the ATVM run as usual.
 2. Start the watcher for that build name.
+- the start helper must clear any stale watcher state for that same requested build name before starting the new watcher instance
 3. Let the watcher run on the controller.
 4. The watcher exits on terminal state.
 
@@ -49,6 +49,7 @@ Typical workflow:
 1. Launch the ATVM run.
 2. Start the watcher for that run.
 3. The watcher polls the run log, process state, and `cmcReporter` artifacts.
+- before starting, the helper resets any prior watcher state for the same requested build name so stale cancellation or posted markers do not leak into a new run
 4. For non-categorized runs, when the run reaches a terminal state:
 - `COMPLETED` or `FAILED`
 - build the final ATVM status
@@ -105,6 +106,12 @@ That results in:
 - service instance:
 - `atvm-run-watcher@e2e-redhat9.6-ubuntu24.04-w2k25-fc.service`
 
+The helper also:
+
+- stops any stale watcher instance for that same requested build name
+- removes the old watcher state directory for that requested build name
+- starts the new watcher with a clean state root for the new run
+
 ## Cancel Example
 
 ```bash
@@ -49,6 +49,8 @@ if [[ -z "$BUILD_NAME" ]]; then
 fi
 
 RUN_DIR="${STATE_ROOT}/${BUILD_NAME}"
+systemctl stop "atvm-run-watcher@${BUILD_NAME}.service" >/dev/null 2>&1 || true
+rm -rf "$RUN_DIR"
 mkdir -p "$RUN_DIR"
 
 cat >"${RUN_DIR}/watch.env" <<EOF
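Because the helper now runs `rm -rf "$RUN_DIR"`, the removal is only as safe as the value of `BUILD_NAME`. The hunk header shows that an emptiness check already precedes this code. A hedged sketch of an additional guard, which is not part of this commit, might look like the following. `reset_run_dir` is a hypothetical function name, and rejecting build names that contain `/` is an assumed policy rather than something the repository states.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a guarded state reset; not part of the commit.
set -eu

# reset_run_dir STATE_ROOT BUILD_NAME
# Removes and recreates STATE_ROOT/BUILD_NAME, refusing names that could
# resolve outside the state root.
reset_run_dir() {
  # ${var:?} aborts if the argument is unset or empty, so an empty build
  # name can never turn the removal target into the state root itself.
  local state_root="${1:?state root is required}"
  local build_name="${2:?build name is required}"

  # Assumed policy: a build name is a single path component.
  case "$build_name" in
    */* | . | ..)
      echo "refusing unsafe build name: ${build_name}" >&2
      return 1
      ;;
  esac

  local run_dir="${state_root}/${build_name}"

  # "--" stops option parsing in case the path starts with a dash.
  rm -rf -- "$run_dir"
  mkdir -p -- "$run_dir"
}
```

In the helper this would be called as `reset_run_dir "$STATE_ROOT" "$BUILD_NAME"` after the `systemctl stop` line, in place of the separate `rm -rf` and `mkdir -p` commands.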