Initial commit
162
atvm/AGENTS.md
Normal file
@@ -0,0 +1,162 @@
# ATVM AGENTS Guide

This file defines how to operate and maintain the ATVM folder workflows.
It is rebuilt from current files in `/home/aw/code/cds/atvm`.

## Scope
Two operational tracks exist in this folder:
- Setup/bootstrap track:
  - `atvm-setup-script.sh`
  - `run-atvm-setup-and-collect-log.sh`
  - `atvm-setup-script-guide.md`
  - `atvm-setup-script-runs.md`
- Cypress automation track:
  - `atvm-automation-guide.md`
  - `atvm-automation-examples.md`
  - `atvm-automation-runs.md`

Reference/inventory material:
- `cypress-automation-for-cmc.md`
- `cypress-automation-for-cmc.md:Zone.Identifier`

## File Roles
- `*-guide.md` files:
  - Guide-only procedures, rules, defaults, and checklists.
  - No dated or one-off run examples.
- `*-runs.md` files:
  - Run-specific learnings only when a run introduces new information.
  - No routine/no-change run logs.
- `*-examples.md` files:
  - Reusable command examples and commonly used option combinations.
  - Keep generic; avoid dated one-off run outcomes.

## Setup Track: Required Behavior
Use `atvm-setup-script-guide.md` as the procedure source and keep behavior aligned with `atvm-setup-script.sh`.

### Safety-Critical Rules
1. Never run setup without operator-provided `--expected-ip` and `--expected-hostname`.
2. Never infer expected hostname from target host output.
3. Stop immediately on hostname mismatch or expected-IP-not-assigned.
4. Keep static IP configuration as a final step to avoid mid-run connection loss.

### Canonical Setup Order
1. Parse args.
2. Validate host identity.
3. Check sudo/privileges.
4. Fix repositories.
5. Configure Ubuntu root SSH/password workflow (Ubuntu only).
6. Install sudo if needed.
7. Configure Oracle default non-UEK kernel (Oracle Linux only).
8. Disable Ubuntu auto-upgrades (Ubuntu only).
9. Run package cleanup/install.
10. Disable SELinux (RHEL-family).
11. Configure static IP.
12. Print summary.
13. Reboot + post-reboot SELinux verifier when applicable.
14. Keep client on until controller log copy + SHA256 verification completes.
15. Power off only after verified success and no real error log lines.

### Setup Defaults
- ATVM static IP target: `192.168.3.191/22`
- Gateway: `192.168.0.1`
- DNS: `8.8.8.8`, `8.8.4.4`
- Ubuntu root SSH workflow credential in docs/script: `root / cdsi2012`
- Client log file: `atvm_setup_script.log` (typically `/root/atvm_setup_script.log` when run as root)

### Setup Controller Wrapper Rules
- Wrapper supports:
  - run-and-collect (default)
  - `--collect-after-complete`
- `run-and-collect` requires env vars:
  - `EXPECTED_IP_ARG`
  - `EXPECTED_HOSTNAME_ARG`
- Wrapper validates the success marker and SHA256 before reporting success.
- Wrapper powers off only when log has no lines matching `^\[ERROR\]`.
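
The power-off rule can be sketched as a small log check. This is a minimal sketch, not the wrapper's actual code: the function names `log_has_real_errors` and `poweroff_decision` are hypothetical, and treating an unreadable log as "keep on" is an assumption made for safety.

```bash
# Hypothetical helper: succeeds when the log has at least one line that
# starts with the literal text "[ERROR]" (the ^\[ERROR\] rule above).
log_has_real_errors() {
  local log_file="$1"
  grep -q '^\[ERROR\]' "$log_file"
}

# Hypothetical decision helper: prints "poweroff" only for a readable log
# with no real error lines; anything else prints "keep-on".
poweroff_decision() {
  local log_file="$1"
  if [[ ! -r "$log_file" ]]; then
    echo "keep-on"   # assumption: never power off when the log cannot be read
  elif log_has_real_errors "$log_file"; then
    echo "keep-on"
  else
    echo "poweroff"
  fi
}
```

A line that only mentions `[ERROR]` mid-line does not count as a real error under this pattern, which matches the anchored regex in the rule.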

## Cypress Automation Track: Required Behavior
Use `atvm-automation-guide.md` as the execution source.
Use `atvm-automation-examples.md` as the common options/command reference.

### Controller Client
- Hostname: `atvm-cypres-vm-1`
- IP: `192.168.3.190`
- Credentials: `root / atvmcdsi2012`

### Mandatory Run Control
1. Before planning a new run, check for active automation processes.
2. Report running/not-running status.
3. If running, ask before termination; terminate only with explicit approval.
4. Always show exact planned command(s) before execution.
5. Execute only after explicit approval.
6. If monitoring is not requested, report immediate command success/failure and any errors.
7. Monitor completion only when explicitly requested by the operator.
8. For monitored runs, allow long runtime windows (15-30 minutes or longer) and continue until completion unless the operator instructs otherwise.
9. Do not terminate monitored runs unless the operator explicitly instructs termination.

### Status Request Format
When the operator asks for run status, report in this order:
1. Heading/title using the run `build_name`.
2. Completed machines with pass/fail state for each machine.
3. Skipped machines with reason.
4. Remaining machines still to run.
5. Summary counts for finished, passed, failed, and skipped machines.
6. Timing details:
   - start time
   - end time if complete
   - total run time if complete, or elapsed run time if still running
   - quickest completed test runtime
   - longest completed test runtime
   - average completed test runtime
7. Estimated completion time.

Status details:
- Use the live run log on the automation VM when available.
- Use the run `build_name` as the heading/title when available.
- Show blacklisted machines under skipped machines when they are part of the requested scope.
- Show in-progress machines under remaining machines as `RUNNING`.
- Show not-yet-started machines as `NOT STARTED`.
- Use completed spec results already recorded in the log to determine machine pass/fail state.
- For failed machines, include the failure reason from the run log in the status output.
- Include start time in status output when it can be derived from the log.
- Include end time and total runtime for completed runs, or elapsed runtime for active runs.
- Include quickest completed test runtime, longest completed test runtime, and average completed test runtime under timing details when they can be derived from the log.

### Automation Blacklist
Always exclude these machines with `--exclude_partial_match` when building ATVM automation commands.

CMC install blacklist (`BLACKLISTED: CMC INSTALL - CAN'T COMPILE`):
- `atvm6-centos6.0`
- `atvm41-redhat6.0`
- `atvm73-oracle6.0`

Support-request blacklist (`BLACKLISTED: SUPPORT REQUEST - WAITING`):
- `atvm113-debian9.0.0`
- `atvm115-debian9.1.0`
- `atvm116-debian9.2.0`
- `atvm156-debian9.3.0`

Re-create blacklist (`BLACKLISTED: RE-CREATE NEEDED`):
- `atvm157-debian13.0.0`

### Operator Preferences
- Do not include Gold Disk IDs in `--build_name`.
- `--build_name` must not contain spaces; use `-` between words.
- Prefer distro-scoped filtering (for example `--containsVm redhat9`) when possible.
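
The `--build_name` spacing rule can be checked before a command is presented for approval. A minimal sketch, assuming nothing about the real scripts: `normalize_build_name` and `is_valid_build_name` are hypothetical helper names, and neither one detects Gold Disk IDs, which still need a manual check.

```bash
# Hypothetical helper: turn a free-form description into a --build_name value
# by trimming the ends and collapsing each run of whitespace into one "-".
normalize_build_name() {
  local raw="$1"
  printf '%s\n' "$raw" |
    sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//; s/[[:space:]]+/-/g'
}

# Hypothetical check: succeeds only when the value is non-empty and
# contains no whitespace.
is_valid_build_name() {
  local name="$1"
  [[ -n "$name" && ! "$name" =~ [[:space:]] ]]
}
```

For example, `normalize_build_name "nightly e2e pure plugin"` prints `nightly-e2e-pure-plugin`.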

## Update Policy (Both Tracks)
After each run:
- Update corresponding `*-guide.md` only if workflow/rules/default behavior changed.
- Update corresponding `*-examples.md` when common command patterns/options change.
- Update corresponding `*-runs.md` only if the run produced new learning.

## Path and Naming Consistency Note
Current repo filenames use hyphen style, but some script text/defaults still show underscore-style paths (for example `atvm_setup_script.sh`, `run_atvm_setup_and_collect_log.sh`, `/home/aw/code/atvm`).

When operating:
1. Use actual filesystem paths in this repo first (`/home/aw/code/cds/atvm/...`).
2. If script defaults are used, verify they match existing files before execution.
3. If changing path conventions, update scripts and guides in the same change.

## Non-Goals
- Do not treat `cypress-automation-for-cmc.md` as executable runbook logic.
- Do not record secrets/tokens into new guide or runs entries.
97
atvm/atvm-automation-examples.md
Normal file
@@ -0,0 +1,97 @@
## Examples

- `--build_name` values must not include spaces; use `-` between words.
- Add the maintained blacklist to `--exclude_partial_match` for runs that use broad selection or randomization.
- Maintained blacklist:
  - `atvm6-centos6.0`
  - `atvm41-redhat6.0`
  - `atvm73-oracle6.0`
  - `atvm113-debian9.0.0`
  - `atvm115-debian9.1.0`
  - `atvm116-debian9.2.0`
  - `atvm156-debian9.3.0`
  - `atvm157-debian13.0.0`

### E2E: Pure iscsi+fc with specific VMs
```bash
python3 cmc-templates.py --template cmc-e2e --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --test_partition --integration_type pure --use_specified_plugin both --specify_vms atvm3-ubuntu18.04 atvm109-w2k12R2; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name nightly-e2e-pure-plugin
```

### E2E: Infinibox fc with specific VMs
```bash
python3 cmc-templates.py --template cmc-e2e --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --test_partition --integration_type infinibox --use_specified_plugin fc --specify_vms atvm51-redhat6.10 atvm110-w2k16; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name nightly-e2e-infinibox-plugin
```

### E2E: Regular cutover
```bash
python3 cmc-templates.py --template cmc-e2e --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --test_partition --integration_type pure --use_specified_plugin fc --specify_vms atvm93-oracle7.9 atvm111-w2k19 --regular_cutover; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name nightly-e2e-regular-cutover
```

### Reboot test
```bash
python3 cmc-templates.py --template cmc-reboot --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --integration_type pure --use_specified_plugin fc --specify_vms atvm37-rocky8.8 atvm112-w2k22 --wait_for_power_on 120; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name nightly-reboot
```

### SystemOS test
```bash
python3 cmc-templates.py --template cmc-systemOS --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --specify_vms atvm118-oracle9.3 atvm145-w2k25; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name nightly-systemOS
```

### MigrateOPS test
```bash
python3 cmc-templates.py --template cmc-migrateops --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --integration_type pure --use_specified_plugin fc --specify_vms atvm139-redhat9.5 atvm112-w2k22; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name nightly-migrateOPS
```

### Compute MigrateOPS: vmware
```bash
python3 cmc-templates.py --template cmc-migrateops-compute-migration --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --vm_platforms vmware --test_partition --specify_vms atvm138-oracle9.4-opt atvm112-w2k22 --set_static_ip_dest; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name nightly-computeMigrateOPS-vmware
```

### Compute MigrateOPS: ovirt
```bash
python3 cmc-templates.py --template cmc-migrateops-compute-migration --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --vm_platforms ovirt --test_partition --specify_vms atvm124-redhat8.8 atvm111-w2k19 --set_static_ip_dest; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name nightly-computeMigrateOPS-ovirt
```

### Group consistency
```bash
python3 cmc-templates.py --template cmc-group-consistency --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --integration_type pure --use_specified_plugin fc --specify_vms atvm4-ubuntu20.04 atvm112-w2k22 --enable_uuid; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name nightly-consistentyGroup
```

### H2H same platform
```bash
python3 cmc-templates.py --template cmc-h2h-same-platf --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --integration_type pure --use_specified_plugin fc --specify_vms atvm38-rocky9.0 atvm112-w2k22; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name nightly-h2hSamePlatform
```

### H2H different platform
```bash
python3 cmc-templates.py --template cmc-h2h-diff-platf --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --integration_type pure --use_specified_plugin fc --specify_vms atvm65-redhat8.3 atvm112-w2k22; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name nightly-h2hDifferentPlatform
```

### Randomized reboot sanity
```bash
python3 cmc-templates.py --template cmc-reboot --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --test_partition --integration_type pure --use_specified_plugin fc --randomize 1 --exclude_partial_match suse15.0 atvm6-centos6.0 atvm41-redhat6.0 atvm73-oracle6.0 atvm113-debian9.0.0 atvm115-debian9.1.0 atvm116-debian9.2.0 atvm156-debian9.3.0 atvm157-debian13.0.0 --wait_for_power_on 120; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name sanity-reboot-iscsi
```

### Randomized e2e sanity
```bash
python3 cmc-templates.py --template cmc-e2e --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --test_partition --integration_type pure --use_specified_plugin both --randomize 1 --exclude_partial_match suse15.0 atvm6-centos6.0 atvm41-redhat6.0 atvm73-oracle6.0 atvm113-debian9.0.0 atvm115-debian9.1.0 atvm116-debian9.2.0 atvm156-debian9.3.0 atvm157-debian13.0.0; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name sanity-e2e
```

### Randomized systemOS sanity
```bash
python3 cmc-templates.py --template cmc-systemOS --ignore_force_shutdown --config_file_path ./cypress.atvm-config.ts --randomize 1 --exclude_partial_match suse15.0 fedora34 atvm6-centos6.0 atvm41-redhat6.0 atvm73-oracle6.0 atvm113-debian9.0.0 atvm115-debian9.1.0 atvm116-debian9.2.0 atvm156-debian9.3.0 atvm157-debian13.0.0; \
python3 ./run-sorry-cypress.py --config_file cypress.atvm-config.ts --build_name sanity-systemOS
```
166
atvm/atvm-automation-guide.md
Normal file
@@ -0,0 +1,166 @@
# Run ATVM Automation Guide

This file is guide-only documentation for operating ATVM CMC automation.
Do not put specific run examples here.
For reusable command examples and common option combinations, use `atvm-automation-examples.md`.

## Purpose
Run ATVM CMC automation tests on the designated automation VM without unintended system or file changes.

## ATVM Cypress Automation Controller Client
- Hostname: `atvm-cypres-vm-1`
- IP: `192.168.3.190`
- Credentials: `root / atvmcdsi2012`

## Operating Constraints
- Run only scripts/commands explicitly requested.
- Do not make manual system configuration changes on the client.
- Do not edit client files unless explicitly requested.

## Operator Preferences
- Do not include Gold Disk identifiers in `--build_name`.
- `--build_name` must not contain spaces; use `-` between words.
- For multiple VMs in the same distro, use distro-scoped filtering (`--containsVm`) instead of long explicit VM lists.
- Before preparing a new run, always check whether automation is already running.
- Always report whether automation is currently running.
- If running, ask whether to terminate; terminate only with explicit approval.
- After termination approval, terminate first, then present planned command(s), then wait for separate execution approval.
- Before any run, always show exact planned command(s) and wait for explicit approval.
- Execute only after explicit approval (for example `approve`).
- After execution, report immediate success/failure only.
- Do not actively monitor completion unless explicitly requested.
- If monitoring is requested, allow long runtime windows (15-30+ minutes) and continue until completion unless the operator instructs otherwise.
- Report command errors immediately.
- `sshpass` may be used where password-based SSH automation is required.

## Core Scripts
- Template prep: `/root/cdc-e2e-cyp-12.17.4/cmc-templates.py`
- Test execution: `./run-sorry-cypress.py`

Typical sequence:
1. Run `cmc-templates.py` with requested template/options.
2. Run `run-sorry-cypress.py` with matching config and build name.

## Config File / Gold Disk Mapping
- `cypress.atvm-config-gold.ts` -> Gold Disk 1
- `cypress.atvm-config-gold-2.ts` -> Gold Disk 2
- Additional numbered config variants map to corresponding Gold Disks.

## Available Templates
- `cmc-e2e`
- `cmc-group-consistency`
- `cmc-h2h-diff-platf`
- `cmc-h2h-same-platf`
- `cmc-migrateops`
- `cmc-migrateops-compute-migration`
- `cmc-reboot`
- `cmc-systemOS`

## Command Pattern
```bash
python3 cmc-templates.py --template <template> --config_file_path ./<config-file> [template options...]; \
python3 ./run-sorry-cypress.py --config_file <config-file> --build_name <hyphenated-description-no-spaces>
```

## Examples Reference
- Commonly used command examples: `atvm-automation-examples.md`
- Keep this guide focused on run-control rules and workflow constraints.

## Example Option Patterns (Guide-Only)
- Distro-scoped VM selection:
  - `--containsVm redhat`
  - `--containsVm redhat9`
- Explicit VM selection:
  - `--specify_vms <vm1> <vm2> ...`
- Compute migrateops platform:
  - `--vm_platforms vmware|ovirt|openshift|proxmox`

## Blacklisted Machines
Always exclude these machines from ATVM automation runs by adding them to `--exclude_partial_match`.

Permanently blacklisted because CMC cannot compile:
- `atvm6-centos6.0`
- `atvm41-redhat6.0`
- `atvm73-oracle6.0`

Temporarily blacklisted while support requests are waiting:
- `atvm113-debian9.0.0`
- `atvm115-debian9.1.0`
- `atvm116-debian9.2.0`
- `atvm156-debian9.3.0`

Temporarily blacklisted until re-created:
- `atvm157-debian13.0.0`

Preferred exclude list:
- `--exclude_partial_match atvm6-centos6.0 atvm41-redhat6.0 atvm73-oracle6.0 atvm113-debian9.0.0 atvm115-debian9.1.0 atvm116-debian9.2.0 atvm156-debian9.3.0 atvm157-debian13.0.0`
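
When a run needs extra exclusions on top of the blacklist (the examples file adds `suse15.0`, for instance), the option can be assembled from one maintained list. A minimal sketch: the array name `ATVM_BLACKLIST` and the function `build_exclude_option` are hypothetical and are not part of `cmc-templates.py`.

```bash
# Blacklist as listed in this guide; keep it in sync with the sections above.
ATVM_BLACKLIST=(
  atvm6-centos6.0
  atvm41-redhat6.0
  atvm73-oracle6.0
  atvm113-debian9.0.0
  atvm115-debian9.1.0
  atvm116-debian9.2.0
  atvm156-debian9.3.0
  atvm157-debian13.0.0
)

# Hypothetical helper: print the --exclude_partial_match option, with any
# run-specific patterns (passed as arguments) placed before the blacklist.
build_exclude_option() {
  local patterns=("$@" "${ATVM_BLACKLIST[@]}")
  echo "--exclude_partial_match ${patterns[*]}"
}
```

Calling `build_exclude_option` with no arguments reproduces the preferred exclude list; `build_exclude_option suse15.0` puts `suse15.0` in front of it.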

## Running-Automation Check (Mandatory)
Before any new automation request:
1. SSH to `root@192.168.3.190`.
2. Check for active automation processes (for example `run-sorry-cypress.py`, `cmc-templates.py`, and related Cypress runners).
3. Report:
   - `Running` with process details, or
   - `Not running`.
4. If `Running`, ask operator whether to terminate.
5. If termination is approved, terminate matching process(es), confirm termination, then proceed to planned-command approval.
6. If termination is not approved, do not start a new run.
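
Steps 2 and 3 can be sketched as a filter over a process listing. This is an illustration only: the function names are hypothetical, the pattern list is an assumption built from the script names in this guide plus `cypress-cloud`, and the check never terminates anything.

```bash
# Hypothetical filter: read `ps`-style lines on stdin and keep only those
# that mention the automation scripts or the cypress-cloud runner.
filter_automation_processes() {
  grep -E 'run-sorry-cypress\.py|cmc-templates\.py|cypress-cloud' || true
}

# Hypothetical reporter: print "Running" plus the matching process lines,
# or "Not running" when nothing matches.
report_automation_status() {
  local matches
  matches="$(filter_automation_processes)"
  if [[ -n "$matches" ]]; then
    printf 'Running\n%s\n' "$matches"
  else
    echo "Not running"
  fi
}

# Example use against the automation VM (needs SSH access; not run here):
#   ssh root@192.168.3.190 'ps -eo pid,etime,args' | report_automation_status
```

Reading the listing from stdin keeps the SSH step separate from the reporting logic.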

## Approval Workflow (Mandatory)
1. Build exact command(s) for the request.
2. Present them verbatim as planned commands.
3. Wait for explicit approval.
4. Run only approved command(s), no extra options.
5. If monitoring was not requested, report immediate success/failure for each command.
6. If monitoring was requested, keep monitoring until completion and report final outcome.

## Requested Test Style
When asked for one VM or a VM set:
- choose requested template/options,
- choose correct config file for intended Gold Disk,
- use a descriptive `--build_name` without Gold Disk IDs.

## Update Rule
- After each run, update this guide only for workflow/rule/default changes.
- Update `atvm-automation-examples.md` for reusable command/option examples.
- Add run-specific learnings only to `atvm-automation-runs.md` when the run produced new information.

## Monitoring Policy
- Monitor only when the operator explicitly asks to monitor.
- If monitoring was not requested, run commands and report execution success/failure and any errors.
- If monitoring was requested, do not terminate processes automatically; only terminate if the operator explicitly instructs termination.

## Status Reporting Format
When the operator asks for the status of an ATVM automation run, report in this order:
1. Heading/title using the run `build_name`.
2. Completed machines with pass/fail state for each machine.
3. Skipped machines with reason.
4. Remaining machines still to run.
5. Summary counts for finished, passed, failed, and skipped machines.
6. Timing details:
   - start time
   - end time if complete
   - total run time if complete, or elapsed run time if still running
   - quickest completed test runtime
   - longest completed test runtime
   - average completed test runtime
7. Estimated completion time.

Status-report expectations:
- Use the live automation VM state when available.
- Derive the heading/title from the run `build_name` when available.
- Derive completed-machine status from completed spec results already written to the run log.
- Include the run start time in every status response when it can be derived from the run log.
- If the run is complete, include the end time and total run time.
- If the run is still active, include the elapsed run time so far.
- Include quickest completed test runtime, longest completed test runtime, and average completed test runtime under timing details when they can be derived from the run log.
- Show blacklisted machines under skipped machines when they fall within the broader machine family requested by the operator.
- For skipped machines, include the reason category:
  - `BLACKLISTED: CMC INSTALL - CAN'T COMPILE`
  - `BLACKLISTED: SUPPORT REQUEST - WAITING`
  - `BLACKLISTED: RE-CREATE NEEDED`
- If a machine is currently in progress, show it under remaining machines as `RUNNING`.
- If a machine has not started yet, show it under remaining machines as `NOT STARTED`.
- If no failures are present in completed spec results, report those completed machines as `PASS`.
- If a completed spec result shows a failure, report that machine as `FAIL` and include the failure reason from the run log.
- Base the completion estimate on the current remaining machine count and recent per-machine runtime visible in the run log.
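
The timing figures can be derived once per-machine runtimes have been pulled from the run log. A minimal sketch: the helper names are hypothetical, the input format (one runtime in seconds per line) is an assumption because the log format is not described here, and the estimate uses the plain average as a stand-in for "recent per-machine runtime".

```bash
# Hypothetical helper: read one completed-machine runtime per line (seconds)
# and print "quickest longest average" as whole seconds.
runtime_stats() {
  awk '
    NF == 0 { next }                      # skip blank lines
    count == 0 { min = $1; max = $1 }     # seed min/max from the first value
    {
      if ($1 < min) min = $1
      if ($1 > max) max = $1
      sum += $1
      count++
    }
    END {
      if (count == 0) { print "n/a n/a n/a"; exit }
      printf "%d %d %d\n", min, max, int(sum / count + 0.5)
    }
  '
}

# Hypothetical estimate: remaining machine count times average runtime.
estimate_remaining_seconds() {
  local remaining="$1" average="$2"
  echo $(( remaining * average ))
}
```

With runtimes of 600, 900, and 1500 seconds the average is 1000 seconds, so three remaining machines give an estimate of 3000 seconds.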
47
atvm/atvm-automation-runs.md
Normal file
@@ -0,0 +1,47 @@
# Run ATVM Automation Runs

This file stores run-specific examples only when a run produced a new learning relevant to future automation tasks.

## Entry Rule
- Add an entry only when a run changed workflow behavior, exposed a failure mode, or confirmed a required new check.
- Do not add routine runs with no new learning.

## Current State
- No run-learning entries were carried over from the `atvm-automation-guide.md` source material; the dated entries in this file come from individual runs.

## Run Learning: 2026-03-08 (E2E redhat9.7, pure/fc)
- Request:
  - template: `cmc-e2e`
  - filter: `--containsVm redhat9.7`
  - integration: `--integration_type pure`
  - plugin: `--use_specified_plugin fc`
- Observed result:
  - Cypress spec execution passed (`1` test, `1` passing, `0` failing).
  - Cloud run URL was produced and marked uploaded.
  - `run-sorry-cypress.py` remained running afterward with a defunct `npm exec cypress-cloud` child process and did not exit cleanly on its own.
- Action for future runs:
  - If pass/upload is confirmed but `run-sorry-cypress.py` does not exit, treat it as a runner hang condition.
  - Capture run URL and pass/fail status first, then terminate the stuck runner process cleanly.

## Run Learning: 2026-03-09 (Blacklist handling and status format)
- Observed requirement:
  - Some ATVM machines must be skipped even when a broad selector such as `--containsVm` or `--randomize` would otherwise include them.
- Machines to blacklist via `--exclude_partial_match`:
  - `BLACKLISTED: CMC INSTALL - CAN'T COMPILE`:
    - `atvm6-centos6.0`
    - `atvm41-redhat6.0`
    - `atvm73-oracle6.0`
  - `BLACKLISTED: SUPPORT REQUEST - WAITING`:
    - `atvm113-debian9.0.0`
    - `atvm115-debian9.1.0`
    - `atvm116-debian9.2.0`
    - `atvm156-debian9.3.0`
  - Needs re-creation:
    - `atvm157-debian13.0.0`
- Action for future runs:
  - Add these machine names to `--exclude_partial_match` when building broad-scope automation commands.
  - When reporting run status, include skipped blacklisted machines separately with their reason, in addition to completed and remaining machines.
  - Use the run `build_name` as the heading/title for status responses so the test type is obvious.
  - For failed machines in status responses, include the failure reason taken from the run log.
  - Include timing details in status responses: start time, end time when complete, and total or elapsed runtime.
  - Also include timing stats in status responses: quickest completed test runtime, longest completed test runtime, and average completed test runtime.
165
atvm/atvm-setup-script-guide.md
Normal file
@@ -0,0 +1,165 @@
# ATVM Setup Script Guide

This file is guide-only documentation for running and maintaining the ATVM setup workflow.
Do not put dated run examples here.

## Scope
- Client setup script: `/home/aw/code/cds/atvm/atvm-setup-script.sh`
- Controller wrapper: `/home/aw/code/cds/atvm/run-atvm-setup-and-collect-log.sh`
- Run-learnings log: `/home/aw/code/cds/atvm/atvm-setup-script-runs.md`

## Purpose
The setup flow performs a controlled bootstrap across supported Linux distributions:
1. Validate target host identity using expected IP + expected hostname before any configuration.
2. Fix repositories (especially CD/DVD media repo entries).
3. On Ubuntu, configure root SSH password-login workflow (`root/cdsi2012`) for follow-on root operations.
4. On Oracle Linux, set default boot kernel to non-UEK when available.
5. Disable unattended auto-upgrades on Ubuntu.
6. Remove specific storage-related packages and install base tooling.
7. Disable SELinux on Red Hat-family systems.
8. Configure static IP as the final step.
9. Print final summary and write logs to `atvm_setup_script.log`.
10. On SELinux-capable distros, reboot and verify runtime SELinux status post-reboot.
11. Keep client powered on after successful setup so controller-side log collection + SHA256 verification can complete.
12. Power off from controller only after successful verification and no setup errors.

## Execution Model
- Shell safety flags: `set -euo pipefail`
- Logging: colorized console + plain text log file
- Entry point: `main "$@"`
- Default operator assumption for setup access: `root / cdsi2012` unless explicitly overridden.

## Mandatory Identity Gate
Setup must not start unless operator explicitly provides both values:
- `--expected-ip <ip>`
- `--expected-hostname <hostname>`

Rules:
- Connect to the operator-provided target IP directly.
- Do not pre-scan alternate candidate IPs.
- Do not infer hostname from target.
- If hostname is missing from request, stop and ask for it.
- If detected hostname does not exactly match expected hostname, stop immediately.
- If expected IP is not assigned on target, stop immediately.
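
The gate amounts to two exact comparisons made before any configuration step. The sketch below is not the script's `validate_target_host_identity` implementation: `identity_gate` is a hypothetical name, and passing the detected values in as arguments is an assumption made so the logic can be read and tested on its own.

```bash
# Hypothetical gate: succeeds only when both expected values are supplied,
# the detected hostname matches exactly, and the expected IP is assigned.
# Arguments: expected_ip expected_hostname detected_hostname assigned_ip...
identity_gate() {
  local expected_ip="$1" expected_hostname="$2" detected_hostname="$3"
  shift 3
  if [[ -z "$expected_ip" || -z "$expected_hostname" ]]; then
    echo "[ERROR] --expected-ip and --expected-hostname are both required" >&2
    return 1
  fi
  if [[ "$detected_hostname" != "$expected_hostname" ]]; then
    echo "[ERROR] hostname mismatch: expected '$expected_hostname', detected '$detected_hostname'" >&2
    return 1
  fi
  local ip
  for ip in "$@"; do
    [[ "$ip" == "$expected_ip" ]] && return 0
  done
  echo "[ERROR] expected IP $expected_ip is not assigned on target" >&2
  return 1
}

# Example call on the target (one possible way to collect the values):
#   identity_gate "$EXPECTED_IP" "$EXPECTED_HOSTNAME" "$(hostname)" $(hostname -I)
```

The hostname comparison is case-sensitive on purpose, in line with the "exactly match" rule.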

## Canonical Run Order
1. `parse_args`
2. `validate_target_host_identity`
3. `check_sudo`
4. `fix_repositories`
5. `configure_ubuntu_root_ssh_access` (Ubuntu only)
6. `install_sudo_if_needed`
7. `configure_oracle_non_uek_kernel` (Oracle Linux only)
8. `disable_ubuntu_auto_upgrades` (Ubuntu only)
9. `run_package_installation`
10. `disable_selinux` (RHEL-family only)
11. `configure_static_ip` (final configuration step)
12. `print_final_summary`
13. `reboot_and_verify_selinux_if_needed`
14. `poweroff_client_if_successful` (controller-driven after verification)

## Core Behavior By Step

### Repository Fix
- Debian/Ubuntu: comment `cdrom` entries in apt lists and run `apt-get update`.
- RHEL-family/Oracle: disable media/cdrom/dvd repo entries and run `yum clean all && yum makecache`.
- Fedora: same model via `dnf clean all && dnf makecache`.
- openSUSE/SLES: disable CD/DVD repos with `zypper mr -d` and refresh.
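
For the Debian/Ubuntu case, commenting out `cdrom` entries is a one-line `sed` edit. This is a sketch under assumptions, not the script's `fix_repositories` code: the helper name is hypothetical, it only handles the classic `deb cdrom:` / `deb-src cdrom:` line form, and running `apt-get update` afterwards is left to the caller.

```bash
# Hypothetical helper: comment out active "deb cdrom:" and "deb-src cdrom:"
# lines in one apt sources file. Lines that are already commented are left
# unchanged, and a ".bak" copy of the original file is kept.
comment_cdrom_entries() {
  local sources_file="$1"
  sed -i.bak -E 's/^([[:space:]]*deb(-src)?[[:space:]]+cdrom:)/# \1/' "$sources_file"
}
```

Because the pattern is anchored at the start of the line, running the helper twice does not add a second `#`.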

### Oracle Linux Kernel Handling
- Oracle Linux only.
- Select first non-UEK kernel via `grubby --info=ALL` and set GRUB default.
- Track whether default changed and whether reboot is required.

### Ubuntu Root SSH Workflow
- Ubuntu only.
- Set root password `cdsi2012`, unlock root account.
- Write `/etc/ssh/sshd_config.d/99-atvm-root-login.conf` enabling root + password auth.
- Validate config and restart SSH service.

### Ubuntu Auto-Upgrade Disable
- Ubuntu only.
- Update `/etc/apt/apt.conf.d/20auto-upgrades` to disable periodic update/upgrade actions.

### Package Installation
- Package manager detection order: `apt-get`, `dnf`, `yum`, `zypper`, `pacman`, `apk`.
- Pre-cleanup removes multipath/iSCSI packages where applicable.
- Installs kernel headers per distro.
- Base package set includes:
`curl wget git vim perl gdb scsitools net-tools parted fio ca-certificates python3 elfutils-libelf-devel`
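
The detection order can be expressed as a first-match loop over `command -v`. A minimal sketch only: `detect_package_manager` is a hypothetical name and may not match how `atvm-setup-script.sh` performs the check.

```bash
# Hypothetical helper: print the first package manager found on PATH, checked
# in the documented order; returns non-zero when none is available.
detect_package_manager() {
  local candidate
  for candidate in apt-get dnf yum zypper pacman apk; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}
```

Order matters on hosts that ship more than one tool: a system with both `dnf` and `yum` reports `dnf`.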

### SELinux Disable
- RHEL-family only.
- If enforcing/permissive, back up and rewrite `/etc/selinux/config` to disabled.
- Marks reboot recommendation/requirement in summary.

### Static IP Configuration (Final Step)
Hardcoded target values:
- IP: `192.168.3.191`
- Prefix: `22`
- Gateway: `192.168.0.1`
- DNS: `8.8.8.8`, `8.8.4.4`

Interface detection priority:
1. default-route interface
2. first non-loopback interface with IPv4
3. first non-loopback interface from link list
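
The three-level priority can be sketched as a fallback chain over `ip` output. This is an illustration, not the script's code: `pick_interface` is a hypothetical name, and taking captured command output as arguments (rather than calling `ip` directly) is an assumption made so the logic can be exercised offline.

```bash
# Hypothetical helper: pick an interface name using the priority above.
# Inputs are captured command output:
#   $1 = output of `ip -4 route show default`
#   $2 = output of `ip -o -4 addr show`
#   $3 = output of `ip -o link show`
pick_interface() {
  local routes="$1" addrs="$2" links="$3" iface
  # 1. Interface named after "dev" in the first default route.
  iface="$(printf '%s\n' "$routes" |
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }')"
  # 2. First non-loopback interface that has an IPv4 address.
  [[ -z "$iface" ]] && iface="$(printf '%s\n' "$addrs" |
    awk 'NF >= 2 && $2 != "lo" { print $2; exit }')"
  # 3. First non-loopback interface in the link list ("2: eth0: <...>").
  [[ -z "$iface" ]] && iface="$(printf '%s\n' "$links" |
    awk -F': ' 'NF >= 2 && $2 != "lo" { print $2; exit }')"
  [[ -n "$iface" ]] && echo "$iface"
}

# Example call on a live host (not run here):
#   pick_interface "$(ip -4 route show default)" "$(ip -o -4 addr show)" "$(ip -o link show)"
```

The function returns non-zero when no candidate is found, so a caller can stop before touching network configuration.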

Network-stack handling includes `netplan`, `NetworkManager`/`nmcli`, `wicked`, and legacy `ifcfg` fallback patterns.
|
||||
|
||||
### SELinux Reboot Verification

- Applies to `rhel`, `centos`, `rocky`, `almalinux`, `fedora`, `ol` when SELinux changed.
- Creates one-time systemd verifier service before reboot.
- Post-reboot service records runtime `getenforce` and self-removes.
- On success/no real errors, keeps client on for controller log copy/hash verification before controller power-off.
- On errors, leaves client on for manual inspection.

## Power-State Rules

- After successful setup, keep client powered on until controller log collection + SHA256 verification completes.
- If verification succeeds and no real error lines exist (`^\[ERROR\]`), controller powers off client.
- If any real error lines exist, keep client powered on.

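The gating rule can be illustrated locally. The anchored pattern `^\[ERROR\]` is the one named above; the helper names are hypothetical, and the sketch assumes log copy and SHA256 verification have already succeeded.

```bash
# Succeed when the log has at least one line that starts with "[ERROR]".
# Lines that merely mention "[ERROR]" mid-sentence do not match.
log_has_real_errors() {
  grep -Eq '^\[ERROR\]' "$1"
}

# Print the power action implied by the rules for a verified log file.
power_action_for_log() {
  if log_has_real_errors "$1"; then
    echo "keep-on"
  else
    echo "power-off"
  fi
}
```

Anchoring the match at the start of the line is what keeps instructional text that quotes `[ERROR]` from blocking the power-off.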
## Logging and Verification

- Client log filename: `atvm_setup_script.log`
- Common client log path when run as root: `/root/atvm_setup_script.log`
- Controller collected log naming: `atvm_configuration_<hostname>_<yyyymmdd_hhmmss>.log`

Required post-run validation:

1. Copy client log to controller `atvm/log/` path.
2. Compare SHA256 between client and copied controller log.
3. Require exact match.

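A minimal local sketch of the naming scheme and the hash comparison, assuming `sha256sum` from coreutils is available. The helper names are hypothetical; in the real workflow one hash is computed on the client over SSH and the other on the controller copy.

```bash
# Build the controller-side log name for a given hostname.
collected_log_name() {
  printf 'atvm_configuration_%s_%s.log\n' "$1" "$(date +%Y%m%d_%H%M%S)"
}

# Print only the hex digest of a file.
sha256_of() {
  sha256sum "$1" | awk '{ print $1 }'
}

# Succeed only when both files have exactly the same SHA256 digest.
files_match_sha256() {
  [ "$(sha256_of "$1")" = "$(sha256_of "$2")" ]
}
```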
## Preferred Execution Commands

Direct client execution:

```bash
sudo bash /home/cirrususer/atvm-setup-script.sh \
  --expected-ip <current-client-ip> \
  --expected-hostname <exact-hostname>
```

Controller run + collect:

```bash
EXPECTED_IP_ARG=<current-client-ip> EXPECTED_HOSTNAME_ARG=<exact-hostname> \
  /home/aw/code/cds/atvm/run-atvm-setup-and-collect-log.sh
```

Controller collect-only after client run:

```bash
/home/aw/code/cds/atvm/run-atvm-setup-and-collect-log.sh --collect-after-complete
```

## Troubleshooting

- If local collected log is missing, do not rerun full setup just for log recovery.
- Use collect-only mode and verify SHA256 after copy.
- If wrapper appears stuck after IP/reboot transition, stop older wrapper sessions and run one fresh collect-only session.
- If `sshpass` is missing on controller, wrapper can still run but may require repeated interactive password prompts.

## Operational Caveats

- Not fully idempotent for all paths; repeated runs may rewrite network configs and create multiple backups.
- Static IP values are hardcoded; adjust before use in other environments.
- Run in maintenance windows because network changes can interrupt active sessions.
- Preserve host identity gating; do not weaken expected IP/hostname checks.

## Update Rule

- After each run, update this file only for guide/rule/checklist/default behavior changes.
- Put run-specific outcomes in `atvm-setup-script-runs.md` only when the run produced a new learning.
40
atvm/atvm-setup-script-runs.md
Normal file
@@ -0,0 +1,40 @@
# ATVM Setup Script Runs

This file stores run-specific examples only when a run produced a new learning relevant to future tasks.

## Entry Rule

- Add an entry only when the run changed workflow behavior, exposed a new failure mode, or confirmed a new required check.
- Do not add routine runs with no new learning.

## Run Learning: 2026-03-03 (Ubuntu 24.04)

- Environment:
  - Initial IP: `192.168.0.89`
  - Final static IP: `192.168.3.191`
  - Hostname: `atvm-codextest-vm-1`
- Learning:
  - Root SSH password workflow (`root/cdsi2012`) and log copy/hash verification path are valid end-to-end.
  - Wrapper must enforce identity arguments for run-and-collect mode.
- Action for future runs:
  - Require `EXPECTED_IP_ARG` and `EXPECTED_HOSTNAME_ARG` for wrapper run-and-collect.

## Run Learning: 2026-03-05 (RHEL 9)

- Environment:
  - Initial IP: `192.168.3.212`
  - Final static IP: `192.168.3.191`
  - Hostname: `atvm-codextest-vm-2`
- Learning:
  - SELinux disable path with reboot + post-reboot verifier worked.
  - Auto power-off can race controller-side log collection if done too early.
- Action for future runs:
  - Keep client powered on until controller log copy + SHA256 verification completes.
  - Only then perform controller-side power-off when no real error lines are present.

## Run Learning: 2026-03-06 (Oracle Linux 9)

- Environment:
  - Initial IP: `192.168.0.121`
  - Final static IP: `192.168.3.191`
  - Hostname: `atvm-codextest-vm`
- Learning:
  - Wrapper auto power-off was blocked by false-positive error detection from instructional text.
- Action for future runs:
  - Match only real error log lines using `^\[ERROR\]` for power-off gating.
1867
atvm/atvm-setup-script.sh
Normal file
File diff suppressed because it is too large
1319
atvm/cypress-automation-for-cmc.md
Normal file
File diff suppressed because it is too large
BIN
atvm/cypress-automation-for-cmc.md:Zone.Identifier
Normal file
Binary file not shown.
228
atvm/run-atvm-setup-and-collect-log.sh
Executable file
@@ -0,0 +1,228 @@
#!/usr/bin/env bash

set -euo pipefail

# All settings below can be overridden via environment variables.
# Note: the PROJECT_DIR and LOCAL_SETUP_SCRIPT defaults differ from the folder
# (/home/aw/code/cds/atvm) and file name (atvm-setup-script.sh) used in AGENTS.md;
# set them explicitly if the defaults do not resolve on the controller.
REMOTE_IP_PRIMARY="${REMOTE_IP_PRIMARY:-192.168.0.121}"
REMOTE_IP_SECONDARY="${REMOTE_IP_SECONDARY:-192.168.3.191}"
REMOTE_USER="${REMOTE_USER:-root}"
PROJECT_DIR="${PROJECT_DIR:-/home/aw/code/atvm}"
LOCAL_LOG_DIR="${LOCAL_LOG_DIR:-$PROJECT_DIR/log}"
LOCAL_SETUP_SCRIPT="${LOCAL_SETUP_SCRIPT:-$PROJECT_DIR/atvm_setup_script.sh}"
REMOTE_SETUP_SCRIPT="${REMOTE_SETUP_SCRIPT:-/root/atvm_setup_script.sh}"
REMOTE_LOG_FILE="${REMOTE_LOG_FILE:-/root/atvm_setup_script.log}"
WAIT_TIMEOUT_SECONDS="${WAIT_TIMEOUT_SECONDS:-600}"
MODE="${1:-run-and-collect}"
EXPECTED_IP_ARG="${EXPECTED_IP_ARG:-}"
EXPECTED_HOSTNAME_ARG="${EXPECTED_HOSTNAME_ARG:-}"

SSH_OPTS=(-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectTimeout=5)

if [[ ! -f "$LOCAL_SETUP_SCRIPT" ]]; then
  echo "ERROR: Local setup script not found: $LOCAL_SETUP_SCRIPT" >&2
  exit 1
fi

mkdir -p "$LOCAL_LOG_DIR"

if ! command -v ssh >/dev/null 2>&1 || ! command -v scp >/dev/null 2>&1; then
  echo "ERROR: ssh/scp is required." >&2
  exit 1
fi

SSH_CMD=(ssh "${SSH_OPTS[@]}")
SCP_CMD=(scp "${SSH_OPTS[@]}")

if [[ -n "${ATVM_PASSWORD:-}" ]]; then
  if command -v sshpass >/dev/null 2>&1; then
    SSH_CMD=(sshpass -p "$ATVM_PASSWORD" ssh "${SSH_OPTS[@]}")
    SCP_CMD=(sshpass -p "$ATVM_PASSWORD" scp "${SSH_OPTS[@]}")
  else
    echo "WARNING: ATVM_PASSWORD is set, but sshpass is not installed. Falling back to interactive password prompts."
  fi
fi

run_ssh() {
  local host="$1"
  shift
  "${SSH_CMD[@]}" "${REMOTE_USER}@${host}" "$@"
}

run_scp_to_remote() {
  local src="$1"
  local host="$2"
  local dst="$3"
  "${SCP_CMD[@]}" "$src" "${REMOTE_USER}@${host}:${dst}"
}

run_scp_from_remote() {
  local host="$1"
  local src="$2"
  local dst="$3"
  "${SCP_CMD[@]}" "${REMOTE_USER}@${host}:${src}" "$dst"
}

wait_for_reachable_host() {
  local start_ts current_ts elapsed
  start_ts="$(date +%s)"

  while true; do
    for host in "$REMOTE_IP_PRIMARY" "$REMOTE_IP_SECONDARY"; do
      if run_ssh "$host" "echo ready" >/dev/null 2>&1; then
        echo "$host"
        return 0
      fi
    done

    current_ts="$(date +%s)"
    elapsed=$((current_ts - start_ts))
    if (( elapsed >= WAIT_TIMEOUT_SECONDS )); then
      return 1
    fi
    sleep 5
  done
}

pick_initial_host() {
  for host in "$REMOTE_IP_PRIMARY" "$REMOTE_IP_SECONDARY"; do
    if run_ssh "$host" "echo ready" >/dev/null 2>&1; then
      echo "$host"
      return 0
    fi
  done
  return 1
}

wait_for_completed_task() {
  local start_ts current_ts elapsed
  start_ts="$(date +%s)"

  while true; do
    for host in "$REMOTE_IP_PRIMARY" "$REMOTE_IP_SECONDARY"; do
      if run_ssh "$host" "test -f '$REMOTE_LOG_FILE' && grep -q 'SUCCESS: ATVM VM Setup Complete!' '$REMOTE_LOG_FILE'" >/dev/null 2>&1; then
        echo "$host"
        return 0
      fi
    done

    current_ts="$(date +%s)"
    elapsed=$((current_ts - start_ts))
    if (( elapsed >= WAIT_TIMEOUT_SECONDS )); then
      return 1
    fi
    sleep 5
  done
}

wait_for_host_offline() {
  local host="$1"
  local start_ts current_ts elapsed
  start_ts="$(date +%s)"

  while true; do
    if ! run_ssh "$host" "echo still-up" >/dev/null 2>&1; then
      return 0
    fi

    current_ts="$(date +%s)"
    elapsed=$((current_ts - start_ts))
    if (( elapsed >= WAIT_TIMEOUT_SECONDS )); then
      return 1
    fi
    sleep 5
  done
}

if [[ "$MODE" != "run-and-collect" && "$MODE" != "--collect-after-complete" ]]; then
  echo "Usage:"
  echo " $0 # run setup on client, then collect log"
  echo " $0 --collect-after-complete # wait for completed client task, then collect log only"
  exit 1
fi

if [[ "$MODE" == "run-and-collect" ]]; then
  if [[ -z "$EXPECTED_IP_ARG" || -z "$EXPECTED_HOSTNAME_ARG" ]]; then
    echo "ERROR: run-and-collect requires EXPECTED_IP_ARG and EXPECTED_HOSTNAME_ARG." >&2
    echo "Example:" >&2
    echo " EXPECTED_IP_ARG=192.168.0.121 EXPECTED_HOSTNAME_ARG=atvm-codextest-vm $0" >&2
    exit 1
  fi

  INITIAL_HOST="$(pick_initial_host)" || {
    echo "ERROR: Could not reach ${REMOTE_IP_PRIMARY} or ${REMOTE_IP_SECONDARY} for initial setup." >&2
    exit 1
  }

  echo "Copying setup script to ${REMOTE_USER}@${INITIAL_HOST}:${REMOTE_SETUP_SCRIPT}"
  run_scp_to_remote "$LOCAL_SETUP_SCRIPT" "$INITIAL_HOST" "$REMOTE_SETUP_SCRIPT"

  echo "Running remote setup script on ${INITIAL_HOST} (disconnect is expected during IP/reboot steps)"
  set +e
  run_ssh "$INITIAL_HOST" "chmod +x '$REMOTE_SETUP_SCRIPT' && bash '$REMOTE_SETUP_SCRIPT' --expected-ip '$EXPECTED_IP_ARG' --expected-hostname '$EXPECTED_HOSTNAME_ARG'"
  run_status=$?
  set -e
  if (( run_status != 0 )); then
    echo "INFO: Remote run returned non-zero (${run_status}). Continuing because network reconfiguration/reboot can interrupt SSH."
  fi

  echo "Waiting for completed client task marker in ${REMOTE_LOG_FILE} (timeout: ${WAIT_TIMEOUT_SECONDS}s)"
  ACTIVE_HOST="$(wait_for_completed_task)" || {
    echo "ERROR: Could not detect completed task marker in remote log within timeout." >&2
    exit 1
  }
else
  echo "Waiting for completed client task marker in ${REMOTE_LOG_FILE} (timeout: ${WAIT_TIMEOUT_SECONDS}s)"
  ACTIVE_HOST="$(wait_for_completed_task)" || {
    echo "ERROR: Could not detect completed task marker in remote log within timeout." >&2
    exit 1
  }
fi

echo "Host reachable at: ${ACTIVE_HOST}"

REMOTE_HOSTNAME="$(run_ssh "$ACTIVE_HOST" "hostname" | tr -d '\r' | tail -n1)"
RUN_TS="$(date +%Y%m%d_%H%M%S)"
LOCAL_LOG_FILE="${LOCAL_LOG_DIR}/atvm_configuration_${REMOTE_HOSTNAME}_${RUN_TS}.log"

echo "Collecting remote log: ${REMOTE_LOG_FILE}"
run_scp_from_remote "$ACTIVE_HOST" "$REMOTE_LOG_FILE" "$LOCAL_LOG_FILE"

REMOTE_HASH="$(run_ssh "$ACTIVE_HOST" "sha256sum '$REMOTE_LOG_FILE' | awk '{print \$1}'" | tr -d '\r' | tail -n1)"
LOCAL_HASH="$(sha256sum "$LOCAL_LOG_FILE" | awk '{print $1}')"

if [[ "$REMOTE_HASH" != "$LOCAL_HASH" ]]; then
  echo "ERROR: Hash mismatch after log copy." >&2
  echo "Remote: $REMOTE_HASH" >&2
  echo "Local: $LOCAL_HASH" >&2
  exit 1
fi

HAS_ERRORS_IN_LOG=false
# Match only real error log records. Do not match instructional text that mentions "[ERROR]".
if run_ssh "$ACTIVE_HOST" "grep -Eq '^\\[ERROR\\]' '$REMOTE_LOG_FILE'"; then
  HAS_ERRORS_IN_LOG=true
fi

if [[ "$HAS_ERRORS_IN_LOG" == true ]]; then
  echo "WARNING: [ERROR] entries detected in remote log. VM will remain powered on for manual inspection."
else
  echo "Log indicates success with no [ERROR] entries. Powering off ${ACTIVE_HOST}."
  set +e
  run_ssh "$ACTIVE_HOST" "shutdown -h now"
  shutdown_status=$?
  set -e
  if (( shutdown_status != 0 )); then
    echo "INFO: Shutdown command returned non-zero (${shutdown_status}); this can occur if SSH disconnects during shutdown."
  fi

  echo "Waiting for ${ACTIVE_HOST} to go offline (timeout: ${WAIT_TIMEOUT_SECONDS}s)"
  if wait_for_host_offline "$ACTIVE_HOST"; then
    echo "Power-off confirmed: ${ACTIVE_HOST} is offline."
  else
    echo "WARNING: Could not confirm ${ACTIVE_HOST} offline within timeout."
  fi
fi

echo "Success"
echo "Active host: ${ACTIVE_HOST}"
echo "Local log: ${LOCAL_LOG_FILE}"
echo "SHA256: ${LOCAL_HASH}"