diff --git a/atvm/AGENTS.md b/atvm/AGENTS.md index f6cbd9e..3b9284e 100644 --- a/atvm/AGENTS.md +++ b/atvm/AGENTS.md @@ -62,9 +62,10 @@ This file defines how to operate and maintain the ATVM workspace in `/home/aw/co - Never run setup without operator-provided `--expected-ip` and `--expected-hostname`. - Keep static IP configuration as the final setup step. - Before any automation run, always check whether automation is already running. -- Always show exact planned ATVM commands before execution. -- Never execute setup or automation commands that require approval until the operator explicitly approves them. -- For ATVM run approvals, treat `approve` as run-with-watcher and `approve without watcher` as run-without-watcher. +- Treat every new ATVM run request as requiring a fresh live controller running-state check; never rely on the immediately previous status check when deciding whether a new run may start. +- For ATVM automation runs, execute without a pre-run approval gate unless the operator explicitly asks to review commands first. +- Default all ATVM automation runs to watcher-backed execution unless the operator explicitly says to run without watcher. +- After starting an ATVM automation run, report the exact executed `cmc-templates.py` and `run-sorry-cypress.py` commands. - Treat git/commit requests as a separate approval gate. 
- Follow `/home/aw/code/cds/atvm/git-guide.md` for ATVM git command drafting and commit-request handling, including the rule that the controller `e2e cypress` repo behavior only applies when the operator explicitly asks for the `e2e cypress` repo or a close variation, the rule to draft plain git commands rather than SSH-wrapped controller login commands unless explicitly requested, the SSH-prefixed push example requirement for that repo, and the rule that phrases such as `create me a git`, `create a git`, `create a git description`, `make me a git`, `make a git`, `make me a git description`, `create me a git description`, and close variations are prepare-only until the operator explicitly approves the displayed commit command. - Never execute `git push` from the assistant for this workspace. @@ -77,8 +78,8 @@ This file defines how to operate and maintain the ATVM workspace in `/home/aw/co - For `cmc-reboot`, treat `--use_specified_plugin both` as a separate confirmation gate, not a normal plugin selection. - When a planned reboot command uses `--use_specified_plugin both`, warn that FC+iSCSI together may have a "chicken before the egg" timing issue where iSCSI disks are not attached before mTDI / CMC services start, and ask the operator to confirm that `both` is really intended. - Unless the operator explicitly reconfirms `both` for `cmc-reboot`, prefer only `fc` or only `iscsi`, not both. -- When the watcher is requested, start the watcher before `run-sorry-cypress.py`. -- When the watcher is requested, build the watcher-start command so it automatically includes the exact approved `cmc-templates.py` command via `--template-command` and the exact approved `run-sorry-cypress.py` command via `--runner-command`; the operator should not need to restate them separately. +- For default watcher-backed runs, start the watcher before `run-sorry-cypress.py`. 
+- For default watcher-backed runs, build the watcher-start command so it automatically includes the exact `cmc-templates.py` command via `--template-command` and the exact `run-sorry-cypress.py` command via `--runner-command`; the operator should not need to restate them separately. - When watcher-backed execution is used, prefer the controller-local `atvm-runner@...` systemd service over detached SSH background launch patterns for `run-sorry-cypress.py`. - Do not start the runner before the watcher, because the watcher helper clears stale `/tmp/.log` and can delete the fresh live runner log if the runner starts first. - Prefer the combined `start-atvm-run.sh` wrapper when starting both services so watcher and runner are never launched in parallel against the same `/tmp/.log`. diff --git a/atvm/docs/automation/guide.md b/atvm/docs/automation/guide.md index ef558f0..5888f75 100644 --- a/atvm/docs/automation/guide.md +++ b/atvm/docs/automation/guide.md @@ -39,16 +39,14 @@ Run ATVM CMC automation tests on the designated automation VM without unintended - When `cmc-reboot` is planned with `--use_specified_plugin both`, warn that FC+iSCSI together may hit a "chicken before the egg" timing problem where iSCSI disks are not attached before mTDI / CMC services start. - For `cmc-reboot`, prefer `--use_specified_plugin fc` or `--use_specified_plugin iscsi` unless the operator explicitly reconfirms that `both` is really intended after seeing that warning. - Before preparing a new run, always check whether automation is already running. +- Treat a prior status check as stale once control returns to the operator or a new ATVM request arrives; perform a fresh live controller check at request time instead of relying on the immediately previous result. - Always report whether automation is currently running. - If running, ask whether to terminate; terminate only with explicit approval. 
-- After termination approval, terminate first, then present planned command(s), then wait for separate execution approval. -- Before any run, always show exact planned command(s) exactly as they will be executed and wait for explicit approval. -- Never execute `cmc-templates.py`, `run-sorry-cypress.py`, or any other ATVM run command until the operator explicitly approves the displayed command(s). -- Approval is required even for preparation-only steps such as template generation. -- If the operator changes any part of the request after commands are displayed, rebuild the commands, show the updated commands, and wait for fresh approval before executing anything. -- Execute ATVM run commands only after explicit approval. -- Treat `approve` as approval to run and also start the per-run watcher service for that build. -- Treat `approve without watcher` as approval to execute the ATVM run without starting the watcher. +- After termination approval, terminate first, then execute the new run command set. +- By default, execute `cmc-templates.py` and `run-sorry-cypress.py` without a pre-run approval gate. +- If the operator explicitly asks to review planned commands first, show them before execution. +- If the operator changes any part of the request before execution, rebuild commands and execute the revised command set. +- Default to watcher-backed execution for every run unless the operator explicitly asks to run without watcher. 
- When `--categorize` is used with watcher enabled, treat the watcher as a sequential grouped-run watcher: - it must post one final Mattermost status per completed categorized group/sub-run - it must stay active between grouped sub-runs while the parent categorized request is still running @@ -56,6 +54,7 @@ Run ATVM CMC automation tests on the designated automation VM without unintended - if the child build id label does not match the actual host/spec being executed, report the grouped run using the inferred host-based group instead of the raw child build id label - it must not wait and replace those with one single parent-only post - After execution, report immediate success/failure only. +- After execution, include the exact executed `cmc-templates.py` and `run-sorry-cypress.py` commands in the response. - Do not include expected, harmless `systemctl reset-failed ... unit not loaded` output in routine run-start confirmations. - Mention `reset-failed` output only when it prevents watcher startup or becomes relevant to debugging. - Do not actively monitor completion unless explicitly requested. @@ -101,16 +100,16 @@ Run ATVM CMC automation tests on the designated automation VM without unintended Typical sequence: 1. Build the exact `cmc-templates.py` and `run-sorry-cypress.py` commands for the request. -2. Show those exact commands to the operator. -3. Wait for explicit approval. -4. Run `cmc-templates.py` with the approved options. -5. Wait for `cmc-templates.py` to fully finish and confirm success. -6. Verify the generated `.ts` files and the config `specPattern` include every requested VM before starting the runner. -7. If the watcher is approved, make sure the controller's deployed watcher code is the intended version before relying on its posts. -8. 
If the watcher is approved, build the watcher-start command so it automatically includes the exact approved `cmc-templates.py` command via `--template-command` and the exact approved `run-sorry-cypress.py` command via `--runner-command`. -9. If the watcher is approved, prefer the controller-local `atvm-runner@...` systemd service instead of detached SSH background launch patterns for `run-sorry-cypress.py`. -10. If the watcher is approved, start the watcher before launching the runner service. -11. Start the runner with the matching approved config and build name. +2. Run `cmc-templates.py` with the requested options. +3. Wait for `cmc-templates.py` to fully finish and confirm success. +4. Verify the generated `.ts` files and the config `specPattern` include every requested VM before starting the runner. +5. By default, use watcher-backed execution unless the operator explicitly asks not to. +6. For watcher-backed runs, make sure the controller's deployed watcher code is the intended version before relying on its posts. +7. For watcher-backed runs, build the watcher-start command so it automatically includes the exact `cmc-templates.py` command via `--template-command` and the exact `run-sorry-cypress.py` command via `--runner-command`. +8. For watcher-backed runs, prefer the controller-local `atvm-runner@...` systemd service instead of detached SSH background launch patterns for `run-sorry-cypress.py`. +9. For watcher-backed runs, start the watcher before launching the runner service. +10. Start the runner with the matching config and build name. +11. Report immediate start success/failure and include the exact executed template and runner commands. Completed-run verification sequence: 1. Read the launch log for the build. @@ -206,35 +205,33 @@ Preferred exclude list: Before any new automation request: 1. SSH to `root@192.168.3.190`. 2. Check for active automation processes (for example `run-sorry-cypress.py`, `cmc-templates.py`, and related Cypress runners). -3. 
Report: +3. Treat every new operator request to start, replace, or block on an ATVM run as requiring a new live controller check, even if a status check was performed earlier in the same conversation. +4. Do not reuse the result of the immediately previous running-state check when deciding whether to block or allow a new ATVM run request. +5. Report: - `Running` with process details, or - `Not running`. -4. If `Running`, ask operator whether to terminate. -5. If termination is approved, terminate matching process(es), confirm termination, then proceed to planned-command approval. -6. If termination is not approved, do not start a new run. +6. If `Running`, ask operator whether to terminate. +7. If termination is approved, terminate matching process(es), confirm termination, then proceed to the Execution Workflow. +8. If termination is not approved, do not start a new run. -## Approval Workflow (Mandatory) +## Execution Workflow (Mandatory) 1. Build exact command(s) for the request. -2. Present them verbatim as planned commands before running anything. -3. Wait for explicit approval. -4. When the watcher is available, present the watcher-start command separately from the core run commands. -5. Treat `approve` as approval to execute the ATVM run and start the watcher for that build. -6. Treat `approve without watcher` as approval to execute the ATVM run without starting the watcher. -7. When the watcher is requested, the displayed watcher-start command must automatically include the exact approved `cmc-templates.py` command via `--template-command` and the exact approved `run-sorry-cypress.py` command via `--runner-command`. -8. If the run uses `--categorize` and the watcher is requested, include `--categorize` on the watcher start command too so the watcher tracks sequential categorized sub-runs correctly. -9. Run only approved command(s), no extra options and no silent substitutions. -10. 
When both template generation and the Cypress runner are requested, run them sequentially, not in parallel. -11. Do not launch the ATVM runner until `cmc-templates.py` has exited successfully and finished updating the intended config/spec files. -12. After `cmc-templates.py`, always verify that the generated spec files on disk and the config `specPattern` both contain the full requested VM set before launching the ATVM runner. -13. If any requested VM is missing from the generated files or `specPattern`, stop and report the mismatch instead of launching the runner. -14. Treat displayed commands as a review gate: do not execute either command until the operator has had a chance to review them and explicitly approve. -15. If the operator asks to change plugin, config, filters, build name, Gold Disk, or scope after commands are shown, discard the old plan, show the revised commands, and wait for new approval. -16. If the planned command is `cmc-reboot` with `--use_specified_plugin both`, add the FC+iSCSI timing warning to the review message and require explicit confirmation that `both` is intended before execution. -17. If monitoring was not requested, report immediate success/failure for each command. -18. If monitoring was requested, keep monitoring until completion and report final outcome. -19. When the watcher is requested, launch the watcher before the runner service. -20. Do not start the runner before the watcher, because the watcher helper clears stale `/tmp/.log` and can delete the fresh live runner log if the runner starts first. -21. Prefer the combined `start-atvm-run.sh` wrapper when both services are used, so watcher and runner are not launched in parallel. +2. By default, execute without a pre-run approval gate. If the operator explicitly asks for command review first, show planned commands before running. +3. When both template generation and the Cypress runner are requested, run them sequentially, not in parallel. +4. 
Do not launch the ATVM runner until `cmc-templates.py` has exited successfully and finished updating the intended config/spec files. +5. After `cmc-templates.py`, always verify that the generated spec files on disk and the config `specPattern` both contain the full requested VM set before launching the ATVM runner. +6. If any requested VM is missing from the generated files or `specPattern`, stop and report the mismatch instead of launching the runner. +7. If the operator asks to change plugin, config, filters, build name, Gold Disk, or scope before execution, discard the old plan and execute only the revised command set. +8. If the planned command is `cmc-reboot` with `--use_specified_plugin both`, add the FC+iSCSI timing warning and require explicit confirmation that `both` is intended before execution. +9. Default to watcher-backed execution unless explicitly told not to. +10. When watcher-backed execution is used, the watcher-start command must automatically include the exact executed `cmc-templates.py` command via `--template-command` and the exact executed `run-sorry-cypress.py` command via `--runner-command`. +11. If the run uses `--categorize` and watcher is enabled, include `--categorize` on the watcher start command so the watcher tracks sequential categorized sub-runs correctly. +12. When watcher-backed execution is used, launch the watcher before the runner service. +13. Do not start the runner before the watcher, because the watcher helper clears stale `/tmp/.log` and can delete the fresh live runner log if the runner starts first. +14. Prefer the combined `start-atvm-run.sh` wrapper when both services are used, so watcher and runner are not launched in parallel. +15. If monitoring was not requested, report immediate success/failure for each command. +16. If monitoring was requested, keep monitoring until completion and report final outcome. +17. 
After execution, always include the exact executed `cmc-templates.py` and `run-sorry-cypress.py` commands in the response. ## Requested Test Style When asked for one VM or a VM set: diff --git a/atvm/docs/automation/run-learnings.md b/atvm/docs/automation/run-learnings.md index ab60aeb..6737d96 100644 --- a/atvm/docs/automation/run-learnings.md +++ b/atvm/docs/automation/run-learnings.md @@ -9,6 +9,25 @@ This file stores run-specific examples only when a run produced a new learning r ## Current State - No run-learning entries recorded yet from `guide.md` source material. +## Run Learning: 2026-05-06 (ATVM runs now execute by default with watcher and report executed commands after start) +- Observed requirement: + - The operator does not want a pre-run approval/review gate for standard ATVM automation run requests. + - The operator wants watcher-backed execution to be the default unless explicitly overridden. + - The operator wants the exact executed template and runner commands reported after execution starts. +- Action for future runs: + - Execute ATVM run requests by default without waiting for explicit `approve`. + - Default to watcher-backed launch using the combined `start-atvm-run.sh` wrapper unless the operator explicitly asks to run without watcher. + - After command execution, report the exact executed `cmc-templates.py` and `run-sorry-cypress.py` command strings in the response. + +## Run Learning: 2026-05-02 (Do not reuse the previous controller status check for a new ATVM request) +- Observed failure mode: + - A later ATVM run request was blocked because the assistant reused the immediately previous controller status result instead of performing a fresh live running-state check at request time. + - The earlier check had been correct when it was taken, but the prior run had already failed and exited by the time the new request arrived. 
+- Action for future runs: + - Treat each new ATVM request to start, replace, or block on a run as requiring a fresh live controller check on `192.168.3.190`. + - Do not decide whether to block or allow a new run from a previously reported controller state, even within the same conversation. + - Re-check live runner and watcher state before telling the operator that automation is still running. + ## Run Learning: 2026-04-29 (Combined watcher wrapper must execute template generation before runner startup) - Observed failure mode: - A watcher-backed `start-atvm-run.sh` launch for `cmc-migrateops-compute-migration` started `run-sorry-cypress.py` without ever running the approved `cmc-templates.py` command.