ATVM Automation Runs

This file records run-specific examples, and only for runs that produced a new learning relevant to future automation tasks.

Entry Rule

  • Add an entry only when a run changed workflow behavior, exposed a failure mode, or confirmed a required new check.
  • Do not add routine runs with no new learning.

Current State

  • No run-learning entries have been recorded yet from the atvm-automation-guide.md source material; the entries below were recorded from individual runs.

Run Learning: 2026-03-08 (E2E redhat9.7, pure/fc)

  • Request:
    • template: cmc-e2e
    • filter: --containsVm redhat9.7
    • integration: --integration_type pure
    • plugin: --use_specified_plugin fc
  • Observed result:
    • Cypress spec execution passed (1 test, 1 passing, 0 failing).
    • Cloud run URL was produced and marked uploaded.
    • run-sorry-cypress.py remained running afterward with a defunct npm exec cypress-cloud child process and did not exit cleanly on its own.
  • Action for future runs:
    • If pass/upload is confirmed but run-sorry-cypress.py does not exit, treat it as a runner hang condition.
    • Capture run URL and pass/fail status first, then terminate the stuck runner process cleanly.
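
The termination step above can be sketched as follows. This is a minimal illustration, not the runner's own shutdown logic: stop_runner and grace_seconds are hypothetical names, and the sketch assumes the stuck runner was started from Python as a subprocess.Popen object. Call it only after the run URL and pass/fail status have been recorded.

```python
import subprocess


def stop_runner(proc: subprocess.Popen, grace_seconds: float = 10.0) -> int:
    """Ask a hung runner to exit, escalating only if it ignores the request.

    Hypothetical helper: assumes the run URL and pass/fail status were
    already captured before this is called.
    """
    if proc.poll() is not None:
        return proc.returncode  # the runner already exited on its own

    proc.terminate()  # polite request first (SIGTERM on POSIX)
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # forced stop (SIGKILL on POSIX)
        return proc.wait()  # wait() reaps the process so it does not stay defunct
```

Trying terminate() before kill() gives the runner a chance to clean up; the final wait() matters because an unreaped child is exactly what shows up as a defunct process.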

Run Learning: 2026-03-09 (Blacklist handling and status format)

  • Observed requirement:
    • Some ATVM machines must be skipped even when a broad selector such as --containsVm or --randomize would otherwise include them.
  • Machines to blacklist via --exclude_partial_match:
    • BLACKLISTED: CMC INSTALL - CAN'T COMPILE:
      • atvm6-centos6.0
      • atvm41-redhat6.0
      • atvm73-oracle6.0
    • BLACKLISTED: SUPPORT REQUEST - WAITING:
      • atvm113-debian9.0.0
      • atvm115-debian9.1.0
      • atvm116-debian9.2.0
      • atvm156-debian9.3.0
    • Needs re-creation:
      • atvm157-debian13.0.0
  • Action for future runs:
    • Add these machine names to --exclude_partial_match when building broad-scope automation commands.
    • When reporting run status, include skipped blacklisted machines separately with their reason, in addition to completed and remaining machines.
    • Use the run build_name as the heading/title for status responses so the test type is obvious.
    • For failed machines in status responses, include the failure reason taken from the run log.
    • Include timing details in status responses: start time, end time when complete, and total or elapsed runtime.
    • Also include timing stats in status responses: quickest completed test runtime, longest completed test runtime, and average completed test runtime.
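
As a rough sketch of the reporting rules above, the snippet below separates blacklisted machines (with their reason) from the rest of a scope and computes the three timing stats. The machine names and reasons are copied from this entry; the BLACKLIST dict layout and the split_scope and timing_stats helpers are illustrative assumptions, not part of the existing tooling. It uses exact-name lookup and does not reproduce how --exclude_partial_match matches names.

```python
from statistics import mean

# Names and reasons come from the entry above; the dict layout is illustrative.
BLACKLIST = {
    "atvm6-centos6.0": "CMC INSTALL - CAN'T COMPILE",
    "atvm41-redhat6.0": "CMC INSTALL - CAN'T COMPILE",
    "atvm73-oracle6.0": "CMC INSTALL - CAN'T COMPILE",
    "atvm113-debian9.0.0": "SUPPORT REQUEST - WAITING",
    "atvm115-debian9.1.0": "SUPPORT REQUEST - WAITING",
    "atvm116-debian9.2.0": "SUPPORT REQUEST - WAITING",
    "atvm156-debian9.3.0": "SUPPORT REQUEST - WAITING",
    "atvm157-debian13.0.0": "Needs re-creation",
}


def split_scope(machines):
    """Return (runnable machines, {skipped machine: reason}) for a run scope."""
    runnable = [m for m in machines if m not in BLACKLIST]
    skipped = {m: BLACKLIST[m] for m in machines if m in BLACKLIST}
    return runnable, skipped


def timing_stats(runtimes_seconds):
    """Quickest, longest, and average runtime over completed tests only."""
    if not runtimes_seconds:
        return None  # nothing has completed yet, so there is nothing to report
    return {
        "quickest": min(runtimes_seconds),
        "longest": max(runtimes_seconds),
        "average": mean(runtimes_seconds),
    }
```

Only completed tests should be passed to timing_stats; including a still-running machine would skew the quickest and average values.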

Run Learning: 2026-03-11 (Machine-first status lines and whole-run ETA)

  • Observed requirement:
    • Status output must list each machine first and then its status, rather than leading with the status label.
    • Estimated completion time must refer to the entire remaining automation run, not only the currently running machine.
  • Action for future runs:
    • Format machine entries as machine-name - STATUS.
    • Keep failure reasons after the machine/status entry when a machine failed.
    • When giving ETA, explicitly state it is the estimate for completion of the full remaining run.
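
The formatting rules above could look like this in code. It is a sketch under assumptions: status_line, whole_run_eta, and eta_line are hypothetical helpers, the ": reason" suffix is one possible way to place the failure reason after the machine/status entry, and the ETA is a crude estimate that assumes machines run one at a time at the average completed runtime.

```python
from datetime import datetime, timedelta


def status_line(machine, status, reason=None):
    """Machine name first, then status, then the failure reason if there is one."""
    line = f"{machine} - {status.upper()}"
    if reason:
        line += f": {reason}"
    return line


def whole_run_eta(now, remaining_machines, average_runtime_seconds):
    """Estimate when the full remaining run finishes.

    Assumption: remaining machines (including the one currently running)
    execute sequentially and each takes the average completed runtime.
    """
    return now + timedelta(seconds=remaining_machines * average_runtime_seconds)


def eta_line(eta):
    """State explicitly that the ETA covers the full remaining run."""
    return f"ETA (completion of the full remaining run): {eta:%Y-%m-%d %H:%M}"
```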

Run Learning: 2026-03-11 (Categorized run status must be reconstructed across batches)

  • Observed failure mode:
    • run-sorry-cypress.py --categorize mutates the active config to the current category batch, so live state such as specPattern, current_vm, and the newest Cypress JSON in /tmp describes only the current category, not the full automation run.
    • Answering from only the current live batch underreports the run and misses already-finished machines from earlier category batches.
  • Action for future runs:
    • Reconstruct whole-run status from the generated machine scope plus all machine result artifacts written since the run start time.
    • Use the current batch only to identify the live RUNNING machine and immediate next machine(s), not as the full run scope.
    • Do not answer status requests for categorized runs until earlier category results have been checked as part of the same run.
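
A minimal sketch of the reconstruction described above, assuming the whole-run machine scope and the per-machine results from all earlier batches have already been collected. The function name, its arguments, and the RUNNING and REMAINING labels are illustrative; how the scope and results are gathered is not shown here.

```python
def whole_run_status(scope, results, running=None):
    """Build the status of every machine in the run, not just the live batch.

    scope:   ordered machine names generated for the whole run
    results: machine -> result (for example "PASSED" or "FAILED"), gathered
             from every category batch since the run start time
    running: the live machine reported by the current batch, if any
    """
    status = {}
    for machine in scope:
        if machine in results:
            status[machine] = results[machine]  # finished in this or an earlier batch
        elif machine == running:
            status[machine] = "RUNNING"  # the only fact taken from the live batch
        else:
            status[machine] = "REMAINING"
    return status
```

Iterating over the full scope, rather than over the live batch, is what keeps machines from earlier categories in the answer.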

Run Learning: 2026-03-11 (Hash-named XML files still belong to machine runs)

  • Observed failure mode:
    • Same-run JUnit output is not consistently named test-result-atvm...xml.
    • Many machine results for the same automation run were written as hash-named files such as test-result-01fe412894862398d06d9cc4bc7e81a0.xml.
    • Limiting status reconstruction to machine-named XML files causes major undercounting of completed machines.
  • Action for future runs:
    • Parse all test-result-*.xml files written since the run start time, not only test-result-atvm*.xml.
    • Extract the machine name from XML contents such as testsuite file=, testsuite name=, or testcase name= when the filename does not include the machine name.
    • Treat check-xml-files.ts XML outputs as bookkeeping steps, not machine results.
    • Prefer the most recently written same-run XML per machine when multiple XML files exist for that machine.
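
The XML handling above might be sketched as follows, using only the Python standard library. Several details are assumptions made for illustration: the MACHINE_RE pattern is inferred from the machine names listed in this file, bookkeeping files are assumed to be detectable by the string check-xml-files.ts in a testsuite attribute, and file modification time stands in for "written since the run start time".

```python
import glob
import os
import re
import xml.etree.ElementTree as ET

# Assumed naming pattern, inferred from names such as atvm157-debian13.0.0.
MACHINE_RE = re.compile(r"atvm\d+-[A-Za-z]+\d+(?:\.\d+)*")
BOOKKEEPING_SPEC = "check-xml-files.ts"
# Attributes to search when the filename carries no machine name.
LOOKUP_ORDER = (("testsuite", "file"), ("testsuite", "name"), ("testcase", "name"))


def is_bookkeeping(root):
    """Assumption: bookkeeping output names check-xml-files.ts in a testsuite attribute."""
    return any(
        BOOKKEEPING_SPEC in value
        for suite in root.iter("testsuite")
        for value in suite.attrib.values()
    )


def machine_name(path, root):
    """Take the machine name from the filename, else from the XML contents."""
    match = MACHINE_RE.search(os.path.basename(path))
    if match:
        return match.group(0)
    for tag, attr in LOOKUP_ORDER:
        for elem in root.iter(tag):
            match = MACHINE_RE.search(elem.get(attr, ""))
            if match:
                return match.group(0)
    return None


def latest_xml_per_machine(directory, run_start):
    """Map each machine to its newest test-result-*.xml written since run_start.

    run_start is a POSIX timestamp compared against file modification times.
    """
    latest = {}
    for path in glob.glob(os.path.join(directory, "test-result-*.xml")):
        mtime = os.path.getmtime(path)
        if mtime < run_start:
            continue  # belongs to an earlier run
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # partially written or corrupt file
        if is_bookkeeping(root):
            continue  # a bookkeeping step, not a machine result
        machine = machine_name(path, root)
        if machine is None:
            continue
        if machine not in latest or mtime > latest[machine][0]:
            latest[machine] = (mtime, path)
    return {machine: path for machine, (_, path) in latest.items()}
```

Globbing on test-result-*.xml rather than test-result-atvm*.xml is the key change: hash-named files are picked up, and the machine name is then recovered from their contents.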