atvm: default automation runs to watcher-backed execute mode
Update ATVM run workflow rules to remove the default pre-run approval gate for automation requests while keeping safety checks around live running-state and spec verification. Set watcher-backed execution as the default unless explicitly overridden and require post-execution reporting of the exact template and runner commands used. Record the workflow shift in automation run learnings with a dated entry for future consistency.
@@ -62,9 +62,10 @@ This file defines how to operate and maintain the ATVM workspace in `/home/aw/co
|
|||||||
- Never run setup without operator-provided `--expected-ip` and `--expected-hostname`.
|
- Never run setup without operator-provided `--expected-ip` and `--expected-hostname`.
|
||||||
- Keep static IP configuration as the final setup step.
|
- Keep static IP configuration as the final setup step.
|
||||||
- Before any automation run, always check whether automation is already running.
|
- Before any automation run, always check whether automation is already running.
|
||||||
- Always show exact planned ATVM commands before execution.
|
- Treat every new ATVM run request as requiring a fresh live controller running-state check; never rely on the immediately previous status check when deciding whether a new run may start.
|
||||||
- Never execute setup or automation commands that require approval until the operator explicitly approves them.
|
- For ATVM automation runs, execute without a pre-run approval gate unless the operator explicitly asks to review commands first.
|
||||||
- For ATVM run approvals, treat `approve` as run-with-watcher and `approve without watcher` as run-without-watcher.
|
- Default all ATVM automation runs to watcher-backed execution unless the operator explicitly says to run without watcher.
|
||||||
|
- After starting an ATVM automation run, report the exact executed `cmc-templates.py` and `run-sorry-cypress.py` commands.
|
||||||
- Treat git/commit requests as a separate approval gate.
|
- Treat git/commit requests as a separate approval gate.
|
||||||
- Follow `/home/aw/code/cds/atvm/git-guide.md` for ATVM git command drafting and commit-request handling, including the rule that the controller `e2e cypress` repo behavior only applies when the operator explicitly asks for the `e2e cypress` repo or a close variation, the rule to draft plain git commands rather than SSH-wrapped controller login commands unless explicitly requested, the SSH-prefixed push example requirement for that repo, and the rule that phrases such as `create me a git`, `create a git`, `create a git description`, `make me a git`, `make a git`, `make me a git description`, `create me a git description`, and close variations are prepare-only until the operator explicitly approves the displayed commit command.
|
- Follow `/home/aw/code/cds/atvm/git-guide.md` for ATVM git command drafting and commit-request handling, including the rule that the controller `e2e cypress` repo behavior only applies when the operator explicitly asks for the `e2e cypress` repo or a close variation, the rule to draft plain git commands rather than SSH-wrapped controller login commands unless explicitly requested, the SSH-prefixed push example requirement for that repo, and the rule that phrases such as `create me a git`, `create a git`, `create a git description`, `make me a git`, `make a git`, `make me a git description`, `create me a git description`, and close variations are prepare-only until the operator explicitly approves the displayed commit command.
|
||||||
- Never execute `git push` from the assistant for this workspace.
|
- Never execute `git push` from the assistant for this workspace.
|
||||||
@@ -77,8 +78,8 @@ This file defines how to operate and maintain the ATVM workspace in `/home/aw/co
|
|||||||
- For `cmc-reboot`, treat `--use_specified_plugin both` as a separate confirmation gate, not a normal plugin selection.
|
- For `cmc-reboot`, treat `--use_specified_plugin both` as a separate confirmation gate, not a normal plugin selection.
|
||||||
- When a planned reboot command uses `--use_specified_plugin both`, warn that FC+iSCSI together may have a "chicken before the egg" timing issue where iSCSI disks are not attached before mTDI / CMC services start, and ask the operator to confirm that `both` is really intended.
|
- When a planned reboot command uses `--use_specified_plugin both`, warn that FC+iSCSI together may have a "chicken before the egg" timing issue where iSCSI disks are not attached before mTDI / CMC services start, and ask the operator to confirm that `both` is really intended.
|
||||||
- Unless the operator explicitly reconfirms `both` for `cmc-reboot`, prefer only `fc` or only `iscsi`, not both.
|
- Unless the operator explicitly reconfirms `both` for `cmc-reboot`, prefer only `fc` or only `iscsi`, not both.
|
||||||
- When the watcher is requested, start the watcher before `run-sorry-cypress.py`.
|
- For default watcher-backed runs, start the watcher before `run-sorry-cypress.py`.
|
||||||
- When the watcher is requested, build the watcher-start command so it automatically includes the exact approved `cmc-templates.py` command via `--template-command` and the exact approved `run-sorry-cypress.py` command via `--runner-command`; the operator should not need to restate them separately.
|
- For default watcher-backed runs, build the watcher-start command so it automatically includes the exact `cmc-templates.py` command via `--template-command` and the exact `run-sorry-cypress.py` command via `--runner-command`; the operator should not need to restate them separately.
|
||||||
- When watcher-backed execution is used, prefer the controller-local `atvm-runner@...` systemd service over detached SSH background launch patterns for `run-sorry-cypress.py`.
|
- When watcher-backed execution is used, prefer the controller-local `atvm-runner@...` systemd service over detached SSH background launch patterns for `run-sorry-cypress.py`.
|
||||||
- Do not start the runner before the watcher, because the watcher helper clears stale `/tmp/<build-name>.log` and can delete the fresh live runner log if the runner starts first.
|
- Do not start the runner before the watcher, because the watcher helper clears stale `/tmp/<build-name>.log` and can delete the fresh live runner log if the runner starts first.
|
||||||
- Prefer the combined `start-atvm-run.sh` wrapper when starting both services so watcher and runner are never launched in parallel against the same `/tmp/<build-name>.log`.
|
- Prefer the combined `start-atvm-run.sh` wrapper when starting both services so watcher and runner are never launched in parallel against the same `/tmp/<build-name>.log`.
|
||||||
|
|||||||
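The watcher-start rule in the hunk above (pass the exact template and runner commands via `--template-command` and `--runner-command`) can be sketched as below. This is an illustration only, not the workspace's actual helper. The `watcher_helper` executable name and the placeholder arguments are assumptions; only the two flag names and `--categorize` come from the rules.

```python
import shlex


def build_watcher_start_command(watcher_helper, template_command,
                                runner_command, categorize=False):
    """Build the watcher-start argv from the exact commands being executed.

    watcher_helper is a hypothetical executable name supplied by the caller;
    the rules name the flags but not the helper itself.
    """
    argv = [
        watcher_helper,
        # Each command is passed as one shell-quoted string so the watcher
        # receives exactly what is executed.
        "--template-command", shlex.join(template_command),
        "--runner-command", shlex.join(runner_command),
    ]
    if categorize:
        # Mirror --categorize so the watcher tracks grouped sub-runs.
        argv.append("--categorize")
    return argv


if __name__ == "__main__":
    # TEMPLATE_ARGS and RUNNER_ARGS are placeholders, not real options.
    template = ["cmc-templates.py", "TEMPLATE_ARGS"]
    runner = ["run-sorry-cypress.py", "RUNNER_ARGS"]
    print(build_watcher_start_command("watcher-helper", template, runner))
```

Deriving the watcher arguments from the same lists used for the run means the operator does not have to restate the commands, and the strings reported after execution match the ones the watcher received.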
@@ -39,16 +39,14 @@ Run ATVM CMC automation tests on the designated automation VM without unintended
|
|||||||
- When `cmc-reboot` is planned with `--use_specified_plugin both`, warn that FC+iSCSI together may hit a "chicken before the egg" timing problem where iSCSI disks are not attached before mTDI / CMC services start.
|
- When `cmc-reboot` is planned with `--use_specified_plugin both`, warn that FC+iSCSI together may hit a "chicken before the egg" timing problem where iSCSI disks are not attached before mTDI / CMC services start.
|
||||||
- For `cmc-reboot`, prefer `--use_specified_plugin fc` or `--use_specified_plugin iscsi` unless the operator explicitly reconfirms that `both` is really intended after seeing that warning.
|
- For `cmc-reboot`, prefer `--use_specified_plugin fc` or `--use_specified_plugin iscsi` unless the operator explicitly reconfirms that `both` is really intended after seeing that warning.
|
||||||
- Before preparing a new run, always check whether automation is already running.
|
- Before preparing a new run, always check whether automation is already running.
|
||||||
|
- Treat a prior status check as stale once control returns to the operator or a new ATVM request arrives; perform a fresh live controller check at request time instead of relying on the immediately previous result.
|
||||||
- Always report whether automation is currently running.
|
- Always report whether automation is currently running.
|
||||||
- If running, ask whether to terminate; terminate only with explicit approval.
|
- If running, ask whether to terminate; terminate only with explicit approval.
|
||||||
- After termination approval, terminate first, then present planned command(s), then wait for separate execution approval.
|
- After termination approval, terminate first, then execute the new run command set.
|
||||||
- Before any run, always show exact planned command(s) exactly as they will be executed and wait for explicit approval.
|
- By default, execute `cmc-templates.py` and `run-sorry-cypress.py` without a pre-run approval gate.
|
||||||
- Never execute `cmc-templates.py`, `run-sorry-cypress.py`, or any other ATVM run command until the operator explicitly approves the displayed command(s).
|
- If the operator explicitly asks to review planned commands first, show them before execution.
|
||||||
- Approval is required even for preparation-only steps such as template generation.
|
- If the operator changes any part of the request before execution, rebuild commands and execute the revised command set.
|
||||||
- If the operator changes any part of the request after commands are displayed, rebuild the commands, show the updated commands, and wait for fresh approval before executing anything.
|
- Default to watcher-backed execution for every run unless the operator explicitly asks to run without watcher.
|
||||||
- Execute ATVM run commands only after explicit approval.
|
|
||||||
- Treat `approve` as approval to run and also start the per-run watcher service for that build.
|
|
||||||
- Treat `approve without watcher` as approval to execute the ATVM run without starting the watcher.
|
|
||||||
- When `--categorize` is used with watcher enabled, treat the watcher as a sequential grouped-run watcher:
|
- When `--categorize` is used with watcher enabled, treat the watcher as a sequential grouped-run watcher:
|
||||||
- it must post one final Mattermost status per completed categorized group/sub-run
|
- it must post one final Mattermost status per completed categorized group/sub-run
|
||||||
- it must stay active between grouped sub-runs while the parent categorized request is still running
|
- it must stay active between grouped sub-runs while the parent categorized request is still running
|
||||||
@@ -56,6 +54,7 @@ Run ATVM CMC automation tests on the designated automation VM without unintended
|
|||||||
- if the child build id label does not match the actual host/spec being executed, report the grouped run using the inferred host-based group instead of the raw child build id label
|
- if the child build id label does not match the actual host/spec being executed, report the grouped run using the inferred host-based group instead of the raw child build id label
|
||||||
- it must not wait and replace those with one single parent-only post
|
- it must not wait and replace those with one single parent-only post
|
||||||
- After execution, report immediate success/failure only.
|
- After execution, report immediate success/failure only.
|
||||||
|
- After execution, include the exact executed `cmc-templates.py` and `run-sorry-cypress.py` commands in the response.
|
||||||
- Do not include expected, harmless `systemctl reset-failed ... unit not loaded` output in routine run-start confirmations.
|
- Do not include expected, harmless `systemctl reset-failed ... unit not loaded` output in routine run-start confirmations.
|
||||||
- Mention `reset-failed` output only when it prevents watcher startup or becomes relevant to debugging.
|
- Mention `reset-failed` output only when it prevents watcher startup or becomes relevant to debugging.
|
||||||
- Do not actively monitor completion unless explicitly requested.
|
- Do not actively monitor completion unless explicitly requested.
|
||||||
@@ -101,16 +100,16 @@ Run ATVM CMC automation tests on the designated automation VM without unintended
|
|||||||
|
|
||||||
Typical sequence:
|
Typical sequence:
|
||||||
1. Build the exact `cmc-templates.py` and `run-sorry-cypress.py` commands for the request.
|
1. Build the exact `cmc-templates.py` and `run-sorry-cypress.py` commands for the request.
|
||||||
2. Show those exact commands to the operator.
|
2. Run `cmc-templates.py` with the requested options.
|
||||||
3. Wait for explicit approval.
|
3. Wait for `cmc-templates.py` to fully finish and confirm success.
|
||||||
4. Run `cmc-templates.py` with the approved options.
|
4. Verify the generated `.ts` files and the config `specPattern` include every requested VM before starting the runner.
|
||||||
5. Wait for `cmc-templates.py` to fully finish and confirm success.
|
5. By default, use watcher-backed execution unless the operator explicitly asks not to.
|
||||||
6. Verify the generated `.ts` files and the config `specPattern` include every requested VM before starting the runner.
|
6. For watcher-backed runs, make sure the controller's deployed watcher code is the intended version before relying on its posts.
|
||||||
7. If the watcher is approved, make sure the controller's deployed watcher code is the intended version before relying on its posts.
|
7. For watcher-backed runs, build the watcher-start command so it automatically includes the exact `cmc-templates.py` command via `--template-command` and the exact `run-sorry-cypress.py` command via `--runner-command`.
|
||||||
8. If the watcher is approved, build the watcher-start command so it automatically includes the exact approved `cmc-templates.py` command via `--template-command` and the exact approved `run-sorry-cypress.py` command via `--runner-command`.
|
8. For watcher-backed runs, prefer the controller-local `atvm-runner@...` systemd service instead of detached SSH background launch patterns for `run-sorry-cypress.py`.
|
||||||
9. If the watcher is approved, prefer the controller-local `atvm-runner@...` systemd service instead of detached SSH background launch patterns for `run-sorry-cypress.py`.
|
9. For watcher-backed runs, start the watcher before launching the runner service.
|
||||||
10. If the watcher is approved, start the watcher before launching the runner service.
|
10. Start the runner with the matching config and build name.
|
||||||
11. Start the runner with the matching approved config and build name.
|
11. Report immediate start success/failure and include the exact executed template and runner commands.
|
||||||
|
|
||||||
Completed-run verification sequence:
|
Completed-run verification sequence:
|
||||||
1. Read the launch log for the build.
|
1. Read the launch log for the build.
|
||||||
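Step 4 of the typical sequence above (verify the generated `.ts` files and the config `specPattern` before starting the runner) might look roughly like this. It is a minimal sketch that assumes each requested VM name appears as a substring of its generated file name and of a `specPattern` entry, and that `specPattern` is available as a list of strings. Neither detail is stated in the guide, and the VM names are hypothetical.

```python
def missing_vms(requested_vms, generated_files, spec_pattern):
    """Return requested VMs absent from the generated files or specPattern.

    Assumes (not confirmed by the guide) that a VM name is a substring of
    both its generated file name and its specPattern entry.
    """
    missing = []
    for vm in requested_vms:
        in_files = any(vm in name for name in generated_files)
        in_pattern = any(vm in entry for entry in spec_pattern)
        if not (in_files and in_pattern):
            missing.append(vm)
    return missing


if __name__ == "__main__":
    # Hypothetical VM names and paths, for illustration only.
    requested = ["vm-alpha", "vm-beta"]
    files = ["specs/vm-alpha.ts"]
    pattern = ["specs/vm-alpha.ts"]
    gaps = missing_vms(requested, files, pattern)
    if gaps:
        print("Mismatch, do not launch the runner:", gaps)
```

Substring matching is deliberately naive: a name such as `vm-1` would also match `vm-10`, so a real check would need to match on whatever naming scheme the templates actually produce.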
@@ -206,35 +205,33 @@ Preferred exclude list:
|
|||||||
Before any new automation request:
|
Before any new automation request:
|
||||||
1. SSH to `root@192.168.3.190`.
|
1. SSH to `root@192.168.3.190`.
|
||||||
2. Check for active automation processes (for example `run-sorry-cypress.py`, `cmc-templates.py`, and related Cypress runners).
|
2. Check for active automation processes (for example `run-sorry-cypress.py`, `cmc-templates.py`, and related Cypress runners).
|
||||||
3. Report:
|
3. Treat every new operator request to start, replace, or block on an ATVM run as requiring a new live controller check, even if a status check was performed earlier in the same conversation.
|
||||||
|
4. Do not reuse the result of the immediately previous running-state check when deciding whether to block or allow a new ATVM run request.
|
||||||
|
5. Report:
|
||||||
- `Running` with process details, or
|
- `Running` with process details, or
|
||||||
- `Not running`.
|
- `Not running`.
|
||||||
4. If `Running`, ask operator whether to terminate.
|
6. If `Running`, ask operator whether to terminate.
|
||||||
5. If termination is approved, terminate matching process(es), confirm termination, then proceed to planned-command approval.
|
7. If termination is approved, terminate matching process(es), confirm termination, then proceed to planned-command approval.
|
||||||
6. If termination is not approved, do not start a new run.
|
8. If termination is not approved, do not start a new run.
|
||||||
|
|
||||||
## Approval Workflow (Mandatory)
|
## Execution Workflow (Mandatory)
|
||||||
1. Build exact command(s) for the request.
|
1. Build exact command(s) for the request.
|
||||||
2. Present them verbatim as planned commands before running anything.
|
2. By default, execute without a pre-run approval gate. If the operator explicitly asks for command review first, show planned commands before running.
|
||||||
3. Wait for explicit approval.
|
3. When both template generation and the Cypress runner are requested, run them sequentially, not in parallel.
|
||||||
4. When the watcher is available, present the watcher-start command separately from the core run commands.
|
4. Do not launch the ATVM runner until `cmc-templates.py` has exited successfully and finished updating the intended config/spec files.
|
||||||
5. Treat `approve` as approval to execute the ATVM run and start the watcher for that build.
|
5. After `cmc-templates.py`, always verify that the generated spec files on disk and the config `specPattern` both contain the full requested VM set before launching the ATVM runner.
|
||||||
6. Treat `approve without watcher` as approval to execute the ATVM run without starting the watcher.
|
6. If any requested VM is missing from the generated files or `specPattern`, stop and report the mismatch instead of launching the runner.
|
||||||
7. When the watcher is requested, the displayed watcher-start command must automatically include the exact approved `cmc-templates.py` command via `--template-command` and the exact approved `run-sorry-cypress.py` command via `--runner-command`.
|
7. If the operator asks to change plugin, config, filters, build name, Gold Disk, or scope before execution, discard the old plan and execute only the revised command set.
|
||||||
8. If the run uses `--categorize` and the watcher is requested, include `--categorize` on the watcher start command too so the watcher tracks sequential categorized sub-runs correctly.
|
8. If the planned command is `cmc-reboot` with `--use_specified_plugin both`, add the FC+iSCSI timing warning and require explicit confirmation that `both` is intended before execution.
|
||||||
9. Run only approved command(s), no extra options and no silent substitutions.
|
9. Default to watcher-backed execution unless explicitly told not to.
|
||||||
10. When both template generation and the Cypress runner are requested, run them sequentially, not in parallel.
|
10. When watcher-backed execution is used, the watcher-start command must automatically include the exact executed `cmc-templates.py` command via `--template-command` and the exact executed `run-sorry-cypress.py` command via `--runner-command`.
|
||||||
11. Do not launch the ATVM runner until `cmc-templates.py` has exited successfully and finished updating the intended config/spec files.
|
11. If the run uses `--categorize` and watcher is enabled, include `--categorize` on the watcher start command so the watcher tracks sequential categorized sub-runs correctly.
|
||||||
12. After `cmc-templates.py`, always verify that the generated spec files on disk and the config `specPattern` both contain the full requested VM set before launching the ATVM runner.
|
12. When watcher-backed execution is used, launch the watcher before the runner service.
|
||||||
13. If any requested VM is missing from the generated files or `specPattern`, stop and report the mismatch instead of launching the runner.
|
13. Do not start the runner before the watcher, because the watcher helper clears stale `/tmp/<build-name>.log` and can delete the fresh live runner log if the runner starts first.
|
||||||
14. Treat displayed commands as a review gate: do not execute either command until the operator has had a chance to review them and explicitly approve.
|
14. Prefer the combined `start-atvm-run.sh` wrapper when both services are used, so watcher and runner are not launched in parallel.
|
||||||
15. If the operator asks to change plugin, config, filters, build name, Gold Disk, or scope after commands are shown, discard the old plan, show the revised commands, and wait for new approval.
|
15. If monitoring was not requested, report immediate success/failure for each command.
|
||||||
16. If the planned command is `cmc-reboot` with `--use_specified_plugin both`, add the FC+iSCSI timing warning to the review message and require explicit confirmation that `both` is intended before execution.
|
16. If monitoring was requested, keep monitoring until completion and report final outcome.
|
||||||
17. If monitoring was not requested, report immediate success/failure for each command.
|
17. After execution, always include the exact executed `cmc-templates.py` and `run-sorry-cypress.py` commands in the response.
|
||||||
18. If monitoring was requested, keep monitoring until completion and report final outcome.
|
|
||||||
19. When the watcher is requested, launch the watcher before the runner service.
|
|
||||||
20. Do not start the runner before the watcher, because the watcher helper clears stale `/tmp/<build-name>.log` and can delete the fresh live runner log if the runner starts first.
|
|
||||||
21. Prefer the combined `start-atvm-run.sh` wrapper when both services are used, so watcher and runner are not launched in parallel.
|
|
||||||
|
|
||||||
## Requested Test Style
|
## Requested Test Style
|
||||||
When asked for one VM or a VM set:
|
When asked for one VM or a VM set:
|
||||||
|
|||||||
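The running-state report described under "Before any new automation request" could be produced along these lines. This sketch assumes the process listing is plain text captured live from the controller at request time (for example, the output of a process-listing command). The guide does not say which command is used, and "related Cypress runners" are not covered by the two markers shown here.

```python
# Script names taken from the guide; extend as needed for related runners.
AUTOMATION_MARKERS = ("run-sorry-cypress.py", "cmc-templates.py")


def find_automation_processes(process_listing):
    """Return the listing lines that mention a known automation script."""
    return [
        line.strip()
        for line in process_listing.splitlines()
        if any(marker in line for marker in AUTOMATION_MARKERS)
    ]


def running_state_report(process_listing):
    """Format the report as `Running` with process details, or `Not running`."""
    matches = find_automation_processes(process_listing)
    if matches:
        return "Running\n" + "\n".join(matches)
    return "Not running"


if __name__ == "__main__":
    # Hypothetical listing text; a real check must use a fresh capture.
    sample = "101 python3 run-sorry-cypress.py\n202 sshd"
    print(running_state_report(sample))
```

The functions take the listing as an argument and cache nothing, which fits the rule that every new request needs a fresh live check.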
@@ -9,6 +9,25 @@ This file stores run-specific examples only when a run produced a new learning r
|
|||||||
## Current State
|
## Current State
|
||||||
- No run-learning entries recorded yet from `guide.md` source material.
|
- No run-learning entries recorded yet from `guide.md` source material.
|
||||||
|
|
||||||
|
## Run Learning: 2026-05-06 (ATVM runs now execute by default with watcher and report executed commands after start)
|
||||||
|
- Observed requirement:
|
||||||
|
- The operator does not want a pre-run approval/review gate for standard ATVM automation run requests.
|
||||||
|
- The operator wants watcher-backed execution to be the default unless explicitly overridden.
|
||||||
|
- The operator wants the exact executed template and runner commands reported after execution starts.
|
||||||
|
- Action for future runs:
|
||||||
|
- Execute ATVM run requests by default without waiting for explicit `approve`.
|
||||||
|
- Default to watcher-backed launch using the combined `start-atvm-run.sh` wrapper unless the operator explicitly asks to run without watcher.
|
||||||
|
- After command execution, report the exact executed `cmc-templates.py` and `run-sorry-cypress.py` command strings in the response.
|
||||||
|
|
||||||
|
## Run Learning: 2026-05-02 (Do not reuse the previous controller status check for a new ATVM request)
|
||||||
|
- Observed failure mode:
|
||||||
|
- A later ATVM run request was blocked because the assistant reused the immediately previous controller status result instead of performing a fresh live running-state check at request time.
|
||||||
|
- The earlier check had been correct when it was taken, but the prior run had already failed and exited by the time the new request arrived.
|
||||||
|
- Action for future runs:
|
||||||
|
- Treat each new ATVM request to start, replace, or block on a run as requiring a fresh live controller check on `192.168.3.190`.
|
||||||
|
- Do not decide whether to block or allow a new run from a previously reported controller state, even within the same conversation.
|
||||||
|
- Re-check live runner and watcher state before telling the operator that automation is still running.
|
||||||
|
|
||||||
## Run Learning: 2026-04-29 (Combined watcher wrapper must execute template generation before runner startup)
|
## Run Learning: 2026-04-29 (Combined watcher wrapper must execute template generation before runner startup)
|
||||||
- Observed failure mode:
|
- Observed failure mode:
|
||||||
- A watcher-backed `start-atvm-run.sh` launch for `cmc-migrateops-compute-migration` started `run-sorry-cypress.py` without ever running the approved `cmc-templates.py` command.
|
- A watcher-backed `start-atvm-run.sh` launch for `cmc-migrateops-compute-migration` started `run-sorry-cypress.py` without ever running the approved `cmc-templates.py` command.
|
||||||
|
|||||||
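The defaults recorded in the 2026-05-06 run learning above reduce to two independent overrides. The sketch below is one possible way to express that. The `RunMode` type and the parameter names are illustrative assumptions, not part of the ATVM tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunMode:
    review_first: bool  # show planned commands before executing
    use_watcher: bool   # watcher-backed execution


def resolve_run_mode(asked_for_review=False, asked_without_watcher=False):
    """Apply the defaults: no pre-run review gate, watcher-backed execution.

    Each default changes only when the operator explicitly asks for it.
    """
    return RunMode(
        review_first=asked_for_review,
        use_watcher=not asked_without_watcher,
    )


if __name__ == "__main__":
    print(resolve_run_mode())
    print(resolve_run_mode(asked_without_watcher=True))
```

The sketch covers standard automation runs only. Separate gates such as git/commit requests and `cmc-reboot` with `--use_specified_plugin both` still require explicit confirmation.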