Improve ATVM watcher status metadata and run workflow
@@ -77,6 +77,7 @@ This file defines how to operate and maintain the ATVM workspace in `/home/aw/co
 - When a planned reboot command uses `--use_specified_plugin both`, warn that FC+iSCSI together may have a "chicken before the egg" timing issue where iSCSI disks are not attached before mTDI / CMC services start, and ask the operator to confirm that `both` is really intended.
 - Unless the operator explicitly reconfirms `both` for `cmc-reboot`, prefer only `fc` or only `iscsi`, not both.
 - When the watcher is requested, start the watcher before `run-sorry-cypress.py`.
+- When the watcher is requested, build the watcher-start command so it automatically includes the exact approved `cmc-templates.py` command via `--template-command` and the exact approved `run-sorry-cypress.py` command via `--runner-command`; the operator should not need to restate them separately.
 - Do not start the runner before the watcher, because the watcher helper clears stale `/tmp/<build-name>.log` and can delete the fresh live runner log if the runner starts first.
 - For host-level test detail and failed-test investigation, use `/root/cdc-e2e-cyp-12.17.4/cypress/cmcReporter`, especially `logs/`, `xml/`, and `mochawesome/`.
 - Apply failed-host detail recovery consistently for every ATVM template run, not just `cmc-reboot`.
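The watcher-start rule in this hunk can be sketched as a small helper that assembles the command from the two approved command strings, so the operator never has to retype them. This is only a minimal sketch, not the workspace's actual tooling: `build_watcher_start_argv` and the build name `demo-build` are hypothetical, the helper path is the one used in the documentation example elsewhere in this commit, and only flags that appear in this diff are used.

```python
import shlex
from typing import List

# Helper path as shown in the documentation example in this commit.
WATCHER_HELPER = "/opt/atvm-watcher-service/start-atvm-run-watcher.sh"


def build_watcher_start_argv(
    build_name: str,
    template: str,
    template_command: str,
    runner_command: str,
) -> List[str]:
    """Hypothetical helper: embed the approved commands verbatim."""
    return [
        WATCHER_HELPER,
        "--build-name", build_name,
        "--template", template,
        "--template-command", template_command,
        "--runner-command", runner_command,
    ]


approved_template = (
    "python3 ./cmc-templates.py --template_name cmc-e2e "
    "--config_file cypress.atvm-config-gold.ts"
)
approved_runner = (
    "python3 ./run-sorry-cypress.py --config_file cypress.atvm-config-gold.ts "
    "--build_name demo-build"
)

argv = build_watcher_start_argv("demo-build", "cmc-e2e", approved_template, approved_runner)
# shlex.join quotes each argument, so the displayed command can be pasted into a shell.
display_command = shlex.join(argv)
print(display_command)
```

Keeping the commands as single argv entries means the text shown for review is exactly the text passed to the helper.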
@@ -104,8 +104,9 @@ Typical sequence:
 5. Wait for `cmc-templates.py` to fully finish and confirm success.
 6. Verify the generated `.ts` files and the config `specPattern` include every requested VM before starting the runner.
 7. If the watcher is approved, make sure the controller's deployed watcher code is the intended version before relying on its posts.
-8. If the watcher is approved, start the watcher before launching `run-sorry-cypress.py`.
-9. Run `run-sorry-cypress.py` with the matching approved config and build name.
+8. If the watcher is approved, build the watcher-start command so it automatically includes the exact approved `cmc-templates.py` command via `--template-command` and the exact approved `run-sorry-cypress.py` command via `--runner-command`.
+9. If the watcher is approved, start the watcher before launching `run-sorry-cypress.py`.
+10. Run `run-sorry-cypress.py` with the matching approved config and build name.

 Completed-run verification sequence:
 1. Read the launch log for the build.
@@ -200,19 +201,20 @@ Before any new automation request:
 4. When the watcher is available, present the watcher-start command separately from the core run commands.
 5. Treat `approve` as approval to execute the ATVM run and start the watcher for that build.
 6. Treat `approve without watcher` as approval to execute the ATVM run without starting the watcher.
-7. If the run uses `--categorize` and the watcher is requested, include `--categorize` on the watcher start command too so the watcher tracks sequential categorized sub-runs correctly.
-8. Run only approved command(s), no extra options and no silent substitutions.
-9. When both template generation and the Cypress runner are requested, run them sequentially, not in parallel.
-10. Do not launch `run-sorry-cypress.py` until `cmc-templates.py` has exited successfully and finished updating the intended config/spec files.
-11. After `cmc-templates.py`, always verify that the generated spec files on disk and the config `specPattern` both contain the full requested VM set before launching `run-sorry-cypress.py`.
-12. If any requested VM is missing from the generated files or `specPattern`, stop and report the mismatch instead of launching the runner.
-13. Treat displayed commands as a review gate: do not execute either command until the operator has had a chance to review them and explicitly approve.
-14. If the operator asks to change plugin, config, filters, build name, Gold Disk, or scope after commands are shown, discard the old plan, show the revised commands, and wait for new approval.
-15. If the planned command is `cmc-reboot` with `--use_specified_plugin both`, add the FC+iSCSI timing warning to the review message and require explicit confirmation that `both` is intended before execution.
-16. If monitoring was not requested, report immediate success/failure for each command.
-17. If monitoring was requested, keep monitoring until completion and report final outcome.
-18. When the watcher is requested, launch the watcher before `run-sorry-cypress.py`.
-19. Do not start the runner before the watcher, because the watcher helper clears stale `/tmp/<build-name>.log` and can delete the fresh live runner log if the runner starts first.
+7. When the watcher is requested, the displayed watcher-start command must automatically include the exact approved `cmc-templates.py` command via `--template-command` and the exact approved `run-sorry-cypress.py` command via `--runner-command`.
+8. If the run uses `--categorize` and the watcher is requested, include `--categorize` on the watcher start command too so the watcher tracks sequential categorized sub-runs correctly.
+9. Run only approved command(s), no extra options and no silent substitutions.
+10. When both template generation and the Cypress runner are requested, run them sequentially, not in parallel.
+11. Do not launch `run-sorry-cypress.py` until `cmc-templates.py` has exited successfully and finished updating the intended config/spec files.
+12. After `cmc-templates.py`, always verify that the generated spec files on disk and the config `specPattern` both contain the full requested VM set before launching `run-sorry-cypress.py`.
+13. If any requested VM is missing from the generated files or `specPattern`, stop and report the mismatch instead of launching the runner.
+14. Treat displayed commands as a review gate: do not execute either command until the operator has had a chance to review them and explicitly approve.
+15. If the operator asks to change plugin, config, filters, build name, Gold Disk, or scope after commands are shown, discard the old plan, show the revised commands, and wait for new approval.
+16. If the planned command is `cmc-reboot` with `--use_specified_plugin both`, add the FC+iSCSI timing warning to the review message and require explicit confirmation that `both` is intended before execution.
+17. If monitoring was not requested, report immediate success/failure for each command.
+18. If monitoring was requested, keep monitoring until completion and report final outcome.
+19. When the watcher is requested, launch the watcher before `run-sorry-cypress.py`.
+20. Do not start the runner before the watcher, because the watcher helper clears stale `/tmp/<build-name>.log` and can delete the fresh live runner log if the runner starts first.

 ## Requested Test Style
 When asked for one VM or a VM set:
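The verification gate in the numbered rules of this hunk (check the generated spec files and `specPattern` for every requested VM, and stop on any mismatch) could look roughly like the following. Everything here is an assumption made for illustration: the function names, the example paths, and the idea that a VM counts as present when its name appears in a path. The real layout of the generated files is not shown in this diff.

```python
from typing import Dict, Iterable, List


def find_missing_vms(
    requested_vms: Iterable[str],
    spec_files: Iterable[str],
    spec_pattern: Iterable[str],
) -> Dict[str, List[str]]:
    """Return requested VMs absent from the spec files or from specPattern.

    Assumption: a VM is "present" when its name occurs in at least one path.
    """
    spec_files = list(spec_files)
    spec_pattern = list(spec_pattern)
    missing: Dict[str, List[str]] = {"spec_files": [], "spec_pattern": []}
    for vm in requested_vms:
        if not any(vm in path for path in spec_files):
            missing["spec_files"].append(vm)
        if not any(vm in entry for entry in spec_pattern):
            missing["spec_pattern"].append(vm)
    return missing


def may_launch_runner(missing: Dict[str, List[str]]) -> bool:
    """The runner may start only when nothing is missing in either place."""
    return not missing["spec_files"] and not missing["spec_pattern"]


# Hypothetical example data.
requested = ["redhat9.6", "ubuntu24.04", "w2k25"]
files_on_disk = [
    "cypress/e2e/redhat9.6.ts",
    "cypress/e2e/ubuntu24.04.ts",
    "cypress/e2e/w2k25.ts",
]
pattern_entries = ["cypress/e2e/redhat9.6.ts", "cypress/e2e/ubuntu24.04.ts"]

report = find_missing_vms(requested, files_on_disk, pattern_entries)
if not may_launch_runner(report):
    print("Mismatch, runner not launched:", report)
```

Substring matching is deliberately simple; a VM name that is a prefix of another would need a stricter comparison.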
@@ -276,6 +278,7 @@ Status-report expectations:
 - Order the status sections as `SUMMARY:`, `HOSTS:`, `TIMING:`, `COVERAGE:`, `TEST FLOW:`, `FAILURE NOTES:`, then `NOTES:`.
 - Keep `NOTES:` focused on non-failure operator-facing value such as the Currents run URL, real anomalies unrelated to the direct failure text, or material fallback behavior.
 - Include the exact `cmc-templates.py` command used to trigger the ATVM automation run in `NOTES:`, without the outer `sshpass`/`ssh` wrapper and without trimming it.
+- Include the exact `run-sorry-cypress.py` command used to launch the ATVM automation run in `NOTES:`, without the outer `sshpass`/`ssh` wrapper and without trimming it.
 - Do not include generic watcher bookkeeping messages in `NOTES:` such as artifact-detection confirmations.
 - Do not include internal watcher fallback notes in `NOTES:` such as `check-xml-files.ts` validation confirmations or reporter-artifact recovery details.
 - The `HOSTS:` table includes `Host`, `Kernel`, `Status`, and `Detail` columns in that order.
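The section-order and column-order expectations in this hunk can be encoded as data so a report builder cannot emit them out of order. This is a hedged sketch only: `order_sections` and the constants are hypothetical names, and the watcher's real report code is not part of this hunk.

```python
from typing import Dict, List

# Orders taken from the status-report expectations in this hunk.
SECTION_ORDER = [
    "SUMMARY:", "HOSTS:", "TIMING:", "COVERAGE:",
    "TEST FLOW:", "FAILURE NOTES:", "NOTES:",
]
HOSTS_COLUMNS = ["Host", "Kernel", "Status", "Detail"]


def order_sections(sections: Dict[str, str]) -> List[str]:
    """Return the section names that are present, in the required order."""
    unknown = set(sections) - set(SECTION_ORDER)
    if unknown:
        raise ValueError(f"unexpected sections: {sorted(unknown)}")
    return [name for name in SECTION_ORDER if name in sections]


draft = {"NOTES:": "Run command: ...", "SUMMARY:": "3 hosts", "HOSTS:": "..."}
ordered = order_sections(draft)
hosts_header = " | ".join(HOSTS_COLUMNS)
print(ordered, hosts_header)
```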
@@ -79,6 +79,7 @@ Use this as the default ATVM automation run-status template for:
 - Put broader non-failure context under `NOTES:`.
 - When available, put the persistent Currents run URL in `NOTES:` so operators can open the exact recorded run directly.
 - Include the exact `cmc-templates.py` command used to trigger the run in `NOTES:`, without the outer `sshpass`/`ssh` wrapper.
+- Include the exact `run-sorry-cypress.py` command used to launch the run in `NOTES:`, without the outer `sshpass`/`ssh` wrapper.
 - Keep `FAILURE NOTES:` limited to detailed per-host error excerpts.
 - Keep `NOTES:` limited to meaningful non-failure operator-facing items such as the Currents link, real anomalies, or important fallback behavior.
 - Do not include generic watcher bookkeeping lines in `NOTES:` such as "run artifacts were detected" or "final reporting artifacts were detected."
@@ -166,6 +166,8 @@ Example:
 /opt/atvm-watcher-service/start-atvm-run-watcher.sh \
   --build-name e2e-redhat9.6-ubuntu24.04-w2k25-fc \
   --template cmc-e2e \
+  --template-command "python3 ./cmc-templates.py --template_name cmc-e2e --config_file cypress.atvm-config-gold.ts" \
+  --runner-command "python3 ./run-sorry-cypress.py --config_file cypress.atvm-config-gold.ts --build_name e2e-redhat9.6-ubuntu24.04-w2k25-fc --categorize" \
   --config-family gold \
   --migration-style "ATVM end-to-end migration validation" \
   --integration-plugin "pure with fc" \
@@ -83,6 +83,8 @@ Optional metadata for better status formatting:
 - `ATVM_WATCHER_CONFIG_FAMILY`
 - `ATVM_WATCHER_MIGRATION_STYLE`
 - `ATVM_WATCHER_INTEGRATION_PLUGIN`
+- `ATVM_WATCHER_TEMPLATE_COMMAND`
+- `ATVM_WATCHER_RUNNER_COMMAND`
 - `ATVM_WATCHER_SCOPE_DESCRIPTION`
 - `ATVM_WATCHER_CATEGORIZED`
@@ -94,6 +96,8 @@ This helper writes a per-run environment file and starts the matching instance:
 ./start-atvm-run-watcher.sh \
   --build-name e2e-redhat9.6-ubuntu24.04-w2k25-fc \
   --template cmc-e2e \
+  --template-command "python3 ./cmc-templates.py --template_name cmc-e2e --config_file cypress.atvm-config-gold.ts" \
+  --runner-command "python3 ./run-sorry-cypress.py --config_file cypress.atvm-config-gold.ts --build_name e2e-redhat9.6-ubuntu24.04-w2k25-fc --categorize" \
   --config-family gold \
   --migration-style "ATVM end-to-end migration validation" \
   --integration-plugin "pure with fc" \
@@ -196,6 +196,35 @@ def run_ps() -> str:
     return proc.stdout


+def normalize_logged_command(raw: str, command_name: str) -> Optional[str]:
+    patterns = {
+        "cmc-templates.py": r"((?:python3?\s+)?(?:\./)?cmc-templates\.py\b.*)",
+        "run-sorry-cypress.py": r"((?:python3?\s+)?(?:\./)?run-sorry-cypress\.py\b.*)",
+    }
+    pattern = patterns.get(command_name)
+    if not pattern:
+        return None
+    match = re.search(pattern, raw)
+    if not match:
+        return None
+    normalized = " ".join(match.group(1).split())
+    return normalized or None
+
+
+def extract_command_from_ps(build_name: str, command_name: str) -> Optional[str]:
+    output = run_ps()
+    matches: List[str] = []
+    for line in output.splitlines():
+        if command_name not in line:
+            continue
+        if command_name == "run-sorry-cypress.py" and f"--build_name {build_name}" not in line:
+            continue
+        normalized = normalize_logged_command(line, command_name)
+        if normalized:
+            matches.append(normalized)
+    return matches[-1] if matches else None
+
+
 def process_active(build_name: str) -> bool:
     output = run_ps()
     for line in output.splitlines():
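To make the behavior of `normalize_logged_command` concrete, the function from this hunk is reproduced below and run against a few sample lines. The sample `ps` and wrapper lines, the host name `controller`, and the build name `demo-build` are invented for illustration; real watcher input may look different.

```python
import re
from typing import Optional


def normalize_logged_command(raw: str, command_name: str) -> Optional[str]:
    # Same logic as the function added in this commit.
    patterns = {
        "cmc-templates.py": r"((?:python3?\s+)?(?:\./)?cmc-templates\.py\b.*)",
        "run-sorry-cypress.py": r"((?:python3?\s+)?(?:\./)?run-sorry-cypress\.py\b.*)",
    }
    pattern = patterns.get(command_name)
    if not pattern:
        return None
    match = re.search(pattern, raw)
    if not match:
        return None
    # Collapse runs of whitespace into single spaces.
    normalized = " ".join(match.group(1).split())
    return normalized or None


# Hypothetical ps-style line: leading columns are dropped, spacing is collapsed.
ps_line = (
    "root      4242  0.3  python3   ./run-sorry-cypress.py "
    "--config_file cypress.atvm-config-gold.ts   --build_name demo-build --categorize"
)
from_ps = normalize_logged_command(ps_line, "run-sorry-cypress.py")

# Hypothetical wrapped line: the ssh prefix is dropped, but the closing quote
# of the wrapper stays because the pattern captures to the end of the line.
wrapped = 'ssh root@controller "python3 ./cmc-templates.py --template_name cmc-e2e"'
from_wrapper = normalize_logged_command(wrapped, "cmc-templates.py")

print(from_ps)
print(from_wrapper)
```

The second case suggests that log lines containing a quoted outer wrapper may need an extra trimming step before the command is shown in `NOTES:`.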
@@ -261,6 +290,20 @@ def extract_currents_url(log_text: str) -> Optional[str]:
     return match.group(1) if match else None


+def extract_command_from_log(log_text: str, command_name: str, build_name: Optional[str] = None) -> Optional[str]:
+    matches: List[str] = []
+    for line in log_text.splitlines():
+        if command_name not in line:
+            continue
+        normalized = normalize_logged_command(line, command_name)
+        if not normalized:
+            continue
+        if command_name == "run-sorry-cypress.py" and build_name and f"--build_name {build_name}" not in normalized:
+            continue
+        matches.append(normalized)
+    return matches[-1] if matches else None
+
+
 def load_state(state_file: Path) -> Dict[str, object]:
     if not state_file.exists():
         return {}
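One detail a reviewer might check in `extract_command_from_log`: the build-name filter is a substring test, so a build name that is a prefix of another build name also matches. The sketch below contrasts that with a token-based comparison. It is a suggestion under assumptions, not part of the commit: `command_targets_build` is a hypothetical helper, the build names are invented, and it assumes logged commands can be split with shell-like rules.

```python
import shlex


def command_targets_build(command: str, build_name: str) -> bool:
    """Hypothetical stricter check: compare the token after --build_name exactly."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes (for example a stray wrapper quote): treat as no match.
        return False
    for index, token in enumerate(tokens[:-1]):
        if token == "--build_name":
            return tokens[index + 1] == build_name
    return False


logged = "python3 ./run-sorry-cypress.py --config_file cypress.atvm-config-gold.ts --build_name e2e-fc-retry"

# The substring test used in the diff accepts the shorter name as well.
substring_match = "--build_name e2e-fc" in logged
token_match_short = command_targets_build(logged, "e2e-fc")
token_match_exact = command_targets_build(logged, "e2e-fc-retry")
print(substring_match, token_match_short, token_match_exact)
```

Whether this matters in practice depends on how build names are chosen; if no build name is ever a prefix of another, the substring test is sufficient.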
@@ -1108,7 +1151,7 @@ def infer_host_from_subrun_build(
     return remaining_hosts[0] if remaining_hosts else None


-def infer_metadata() -> Dict[str, object]:
+def infer_metadata(build_name: str, log_text: str) -> Dict[str, object]:
     try:
         extra_options = json.loads(os.environ.get("ATVM_WATCHER_EXTRA_OPTIONS", "[]"))
     except json.JSONDecodeError:
@@ -1116,9 +1159,16 @@ def infer_metadata() -> Dict[str, object]:
     if not isinstance(extra_options, list):
         extra_options = []
     extra_options = [value for value in extra_options if isinstance(value, str) and value]
+    template_command = os.environ.get("ATVM_WATCHER_TEMPLATE_COMMAND", "")
+    if not template_command:
+        template_command = extract_command_from_log(log_text, "cmc-templates.py") or extract_command_from_ps(build_name, "cmc-templates.py") or ""
+    runner_command = os.environ.get("ATVM_WATCHER_RUNNER_COMMAND", "")
+    if not runner_command:
+        runner_command = extract_command_from_log(log_text, "run-sorry-cypress.py", build_name) or extract_command_from_ps(build_name, "run-sorry-cypress.py") or ""
     return {
         "template": os.environ.get("ATVM_WATCHER_TEMPLATE", "unknown"),
-        "template_command": os.environ.get("ATVM_WATCHER_TEMPLATE_COMMAND", ""),
+        "template_command": template_command,
+        "runner_command": runner_command,
         "config_family": os.environ.get("ATVM_WATCHER_CONFIG_FAMILY", "unknown"),
         "config_file": os.environ.get("ATVM_WATCHER_CONFIG_FILE", "unknown"),
         "migration_style": os.environ.get("ATVM_WATCHER_MIGRATION_STYLE", "ATVM automation validation"),
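The lookup order that `infer_metadata` now uses (environment variable first, then the run log, then the process list) relies on `or` short-circuiting, so later sources are only consulted when earlier ones are empty. A small sketch of that precedence, with the log and process lookups replaced by hypothetical stand-in callables so the behavior can be observed in isolation:

```python
from typing import Callable, List, Mapping, Optional


def resolve_command(
    env: Mapping[str, str],
    env_key: str,
    from_log: Callable[[], Optional[str]],
    from_ps: Callable[[], Optional[str]],
) -> str:
    """Explicit value wins; otherwise try the log, then ps, then give up."""
    return env.get(env_key, "") or from_log() or from_ps() or ""


calls: List[str] = []


def fake_log() -> Optional[str]:
    # Stand-in for extract_command_from_log.
    calls.append("log")
    return "python3 ./cmc-templates.py --template_name cmc-e2e"


def fake_ps() -> Optional[str]:
    # Stand-in for extract_command_from_ps.
    calls.append("ps")
    return None


# No environment value: the log answers, so ps is never consulted.
found = resolve_command({}, "ATVM_WATCHER_TEMPLATE_COMMAND", fake_log, fake_ps)
calls_without_env = list(calls)

# Environment value present: neither fallback runs.
calls.clear()
explicit = resolve_command(
    {"ATVM_WATCHER_TEMPLATE_COMMAND": "python3 ./cmc-templates.py --template_name cmc-reboot"},
    "ATVM_WATCHER_TEMPLATE_COMMAND",
    fake_log,
    fake_ps,
)
calls_with_env = list(calls)
print(found, calls_without_env, explicit, calls_with_env)
```

This is why passing `--template-command` and `--runner-command` to the start helper gives the most reliable `NOTES:` output: the fallbacks depend on what happens to be in the log or process list.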
@@ -1309,6 +1359,9 @@ def build_status_markdown(
     template_command = metadata.get("template_command")
     if isinstance(template_command, str) and template_command:
         notes = notes + [f"Template command: `{template_command}`"]
+    runner_command = metadata.get("runner_command")
+    if isinstance(runner_command, str) and runner_command:
+        notes = notes + [f"Run command: `{runner_command}`"]
     template_name = metadata.get("template")
     integration_plugin = metadata.get("integration_plugin")
     if (
@@ -1908,10 +1961,9 @@ if __name__ == "__main__":
     posted_marker = build_dir / "posted.marker"

     inventory = load_inventory(inventory_file)
-    metadata = infer_metadata()
-
     state = load_state(state_file)
     log_text_for_start = read_text(run_log)
+    metadata = infer_metadata(build_name, log_text_for_start)
     default_started_at = first_log_timestamp(log_text_for_start) or (datetime.fromtimestamp(run_log.stat().st_mtime, tz=timezone.utc) if run_log.exists() else now_utc())
     started_at = parse_xml_timestamp(state.get("started_at")) or default_started_at
     state.setdefault("build_name", build_name)
@@ -1927,6 +1979,11 @@ if __name__ == "__main__":
         if active:
             process_gone_since = None
+            current_log_text = read_text(run_log)
+            refreshed_metadata = infer_metadata(build_name, current_log_text)
+            for key in ("template_command", "runner_command"):
+                value = refreshed_metadata.get(key)
+                if isinstance(value, str) and value and not metadata.get(key):
+                    metadata[key] = value

         run_state, subrun_states, host_results, start_ts, end_ts, currents_url, notes = determine_state(
             build_name=build_name,
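The refresh step added in this hunk only fills keys that are still empty, so a command captured earlier is never overwritten by a later guess. A minimal standalone sketch of that merge rule follows; `fill_missing` is a hypothetical name and the command strings are invented.

```python
from typing import Dict, Tuple


def fill_missing(
    metadata: Dict[str, object],
    refreshed: Dict[str, object],
    keys: Tuple[str, ...] = ("template_command", "runner_command"),
) -> None:
    """Copy non-empty string values for the given keys, but only into empty slots."""
    for key in keys:
        value = refreshed.get(key)
        if isinstance(value, str) and value and not metadata.get(key):
            metadata[key] = value


metadata: Dict[str, object] = {
    "template_command": "python3 ./cmc-templates.py --template_name cmc-e2e",
    "runner_command": "",
}
refreshed: Dict[str, object] = {
    "template_command": "python3 ./cmc-templates.py --template_name other",
    "runner_command": "python3 ./run-sorry-cypress.py --build_name demo-build",
}
fill_missing(metadata, refreshed)
print(metadata)
```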
@@ -10,6 +10,7 @@ Options:
   --build-name <name>
   --template <name>
   --template-command <text>
+  --runner-command <text>
   --config-family <name>
   --config-file <path>
   --migration-style <text>
@@ -24,6 +25,7 @@ EOF
 BUILD_NAME=""
 TEMPLATE=""
 TEMPLATE_COMMAND=""
+RUNNER_COMMAND=""
 CONFIG_FAMILY=""
 CONFIG_FILE=""
 MIGRATION_STYLE=""
@@ -38,6 +40,7 @@ while [[ $# -gt 0 ]]; do
     --build-name) BUILD_NAME="${2:-}"; shift 2 ;;
     --template) TEMPLATE="${2:-}"; shift 2 ;;
     --template-command) TEMPLATE_COMMAND="${2:-}"; shift 2 ;;
+    --runner-command) RUNNER_COMMAND="${2:-}"; shift 2 ;;
     --config-family) CONFIG_FAMILY="${2:-}"; shift 2 ;;
     --config-file) CONFIG_FILE="${2:-}"; shift 2 ;;
     --migration-style) MIGRATION_STYLE="${2:-}"; shift 2 ;;
@@ -74,6 +77,7 @@ PY
 cat >"${RUN_DIR}/watch.env" <<EOF
 ATVM_WATCHER_TEMPLATE=${TEMPLATE@Q}
 ATVM_WATCHER_TEMPLATE_COMMAND=${TEMPLATE_COMMAND@Q}
+ATVM_WATCHER_RUNNER_COMMAND=${RUNNER_COMMAND@Q}
 ATVM_WATCHER_CONFIG_FAMILY=${CONFIG_FAMILY@Q}
 ATVM_WATCHER_CONFIG_FILE=${CONFIG_FILE@Q}
 ATVM_WATCHER_MIGRATION_STYLE=${MIGRATION_STYLE@Q}
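The helper writes each value with bash's `${VAR@Q}` expansion, which emits the value in a shell-quoted form so that spaces and dashes inside a full command survive. How the watcher service actually loads `watch.env` is not shown in this diff; the sketch below only illustrates that such lines can be read back losslessly with Python's `shlex`, under the assumption that values contain no control characters (bash may switch to `$'...'` quoting for those, which `shlex` does not understand). `parse_watch_env` is a hypothetical name.

```python
import shlex
from typing import Dict


def parse_watch_env(text: str) -> Dict[str, str]:
    """Parse KEY='value' lines such as those written to watch.env."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)  # removes the shell quoting
        values[key] = parts[0] if parts else ""
    return values


runner = (
    "python3 ./run-sorry-cypress.py --config_file cypress.atvm-config-gold.ts "
    "--build_name demo-build --categorize"
)
# shlex.quote stands in for ${VAR@Q}: bash always adds quotes, shlex.quote only
# when needed, but both forms parse back to the same value here.
env_text = "\n".join([
    "ATVM_WATCHER_TEMPLATE=" + shlex.quote("cmc-e2e"),
    "ATVM_WATCHER_RUNNER_COMMAND=" + shlex.quote(runner),
    "ATVM_WATCHER_CONFIG_FILE=''",
])
parsed = parse_watch_env(env_text)
print(parsed)
```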