Add ATVM systemd runner service
@@ -1,13 +1,14 @@
 # ATVM Watcher Service Install Plan
 
-This document describes how to deploy the ATVM per-run watcher service to the ATVM Cypress controller at `192.168.3.190`.
+This document describes how to deploy the ATVM per-run watcher and runner services to the ATVM Cypress controller at `192.168.3.190`.
 
 This is a deployment plan only. It does not perform the installation.
 
 ## Goal
 
-Install the local watcher package so the controller can:
+Install the local watcher/runner package so the controller can:
 
+- start one requested ATVM Cypress runner per service instance
 - watch one requested ATVM run per watcher instance
 - for non-categorized runs, send one final Mattermost status only for `COMPLETED` or `FAILED`
 - for categorized runs, send one final Mattermost status per completed categorized sub-run/group
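The notification rule for non-categorized runs in the Goal list can be sketched in a few lines of shell. This is only an illustration: `should_notify` is a hypothetical helper that does not appear in the commit, and the `RUNNING` and `CANCELLED` status names in the loop are assumed placeholders for "any other status".

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the commit): succeed only for the two
# statuses the plan treats as final for a non-categorized run.
should_notify() {
  case "$1" in
    COMPLETED|FAILED) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage example. RUNNING and CANCELLED are assumed status names used only
# to show the non-notifying branch.
for status in RUNNING COMPLETED FAILED CANCELLED; do
  if should_notify "$status"; then
    echo "$status: send final Mattermost status"
  else
    echo "$status: no notification"
  fi
done
```

Keeping the decision in one small function would make it straightforward to guarantee that a run sends at most one final status.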
@@ -20,6 +21,8 @@ Recommended controller paths:
 
 - package root:
 - `/opt/atvm-watcher-service`
+- runner service unit:
+- `/etc/systemd/system/atvm-runner@.service`
 - service unit:
 - `/etc/systemd/system/atvm-run-watcher@.service`
 - global environment file:
@@ -40,6 +43,10 @@ Best-practice rule:
 From the local workspace:
 
 - `/home/aw/code/cds/atvm/watcher-service/atvm_run_watcher.py`
+- `/home/aw/code/cds/atvm/watcher-service/run-atvm-runner.sh`
+- `/home/aw/code/cds/atvm/watcher-service/atvm-runner@.service`
+- `/home/aw/code/cds/atvm/watcher-service/start-atvm-runner.sh`
+- `/home/aw/code/cds/atvm/watcher-service/cancel-atvm-runner.sh`
 - `/home/aw/code/cds/atvm/watcher-service/atvm-run-watcher@.service`
 - `/home/aw/code/cds/atvm/watcher-service/start-atvm-run-watcher.sh`
 - `/home/aw/code/cds/atvm/watcher-service/cancel-atvm-run-watcher.sh`
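Before copying, it may help to confirm that every file in the list above is present in the workspace. The following is a minimal sketch of such a preflight check; `ATVM_PACKAGE_FILES` and `missing_package_files` are hypothetical names introduced here, not part of the commit.

```bash
#!/usr/bin/env bash

# The eight package files named in the workspace list.
ATVM_PACKAGE_FILES=(
  atvm_run_watcher.py
  run-atvm-runner.sh
  atvm-runner@.service
  start-atvm-runner.sh
  cancel-atvm-runner.sh
  atvm-run-watcher@.service
  start-atvm-run-watcher.sh
  cancel-atvm-run-watcher.sh
)

# Hypothetical helper: print each expected file that is missing from the
# directory given as $1, and return non-zero if any file is missing.
missing_package_files() {
  local dir="$1" missing=0 f
  for f in "${ATVM_PACKAGE_FILES[@]}"; do
    if [ ! -f "$dir/$f" ]; then
      echo "$f"
      missing=1
    fi
  done
  return "$missing"
}

# Example call (left as a comment so the sketch has no side effects):
#   missing_package_files /home/aw/code/cds/atvm/watcher-service
```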
@@ -84,12 +91,18 @@ Recommended permissions:
 - `/var/lib/atvm-run-watcher`
 
 2. Copy package files to the controller.
+- copy the runner wrapper
+- copy the runner `systemd` unit file
+- copy the runner helper scripts
 - copy the Python watcher
 - copy the `systemd` unit file
 - copy the helper scripts
 - copy `vm-inventory.md`
 
 3. Set executable permissions.
+- `run-atvm-runner.sh`
+- `start-atvm-runner.sh`
+- `cancel-atvm-runner.sh`
 - `atvm_run_watcher.py`
 - `start-atvm-run-watcher.sh`
 - `cancel-atvm-run-watcher.sh`
@@ -99,6 +112,7 @@ Recommended permissions:
 - keep permissions restricted
 
 5. Install the `systemd` unit file.
+- copy the runner unit to `/etc/systemd/system/atvm-runner@.service`
 - copy to `/etc/systemd/system/atvm-run-watcher@.service`
 
 6. Reload `systemd`.
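Steps 2, 3 and 5 of the plan (copy the files, mark the scripts executable, install both unit files) could be combined into one function. This is a rough sketch, not the procedure the commit prescribes: `install_atvm_package` is a hypothetical name, mode `644` for the unit files is an assumed conventional choice, and `vm-inventory.md` is left out because its destination is not shown in this diff.

```bash
#!/usr/bin/env bash

# Hypothetical sketch of steps 2, 3 and 5.
#   $1 = source directory, $2 = package root, $3 = systemd unit directory
install_atvm_package() {
  local src="$1" pkg_root="$2" unit_dir="$3" f
  mkdir -p "$pkg_root" "$unit_dir" || return 1

  # Scripts are copied with mode 755, matching the chmod commands in the plan.
  for f in run-atvm-runner.sh start-atvm-runner.sh cancel-atvm-runner.sh \
           atvm_run_watcher.py start-atvm-run-watcher.sh cancel-atvm-run-watcher.sh; do
    install -m 755 "$src/$f" "$pkg_root/$f" || return 1
  done

  # Unit files are copied with mode 644 (an assumption; the plan does not state a mode).
  for f in atvm-runner@.service atvm-run-watcher@.service; do
    install -m 644 "$src/$f" "$unit_dir/$f" || return 1
  done
}

# Example call on the controller (left as a comment so the sketch has no side effects):
#   install_atvm_package . /opt/atvm-watcher-service /etc/systemd/system
#   systemctl daemon-reload
```

Using `install` rather than separate `cp` and `chmod` calls sets the mode at copy time, so a script is never left in place without its executable bit.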
@@ -132,6 +146,9 @@ mkdir -p /opt/atvm-watcher-service /var/lib/atvm-run-watcher
 ```
 
 ```bash
+chmod 755 /opt/atvm-watcher-service/run-atvm-runner.sh
+chmod 755 /opt/atvm-watcher-service/start-atvm-runner.sh
+chmod 755 /opt/atvm-watcher-service/cancel-atvm-runner.sh
 chmod 755 /opt/atvm-watcher-service/atvm_run_watcher.py
 chmod 755 /opt/atvm-watcher-service/start-atvm-run-watcher.sh
 chmod 755 /opt/atvm-watcher-service/cancel-atvm-run-watcher.sh
@@ -139,6 +156,7 @@ chmod 755 /opt/atvm-watcher-service/cancel-atvm-run-watcher.sh
 
 ```bash
 systemctl daemon-reload
+systemctl cat atvm-runner@.service
 systemctl cat atvm-run-watcher@.service
 ```
 
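Both unit files are `systemd` templates (note the `@` before `.service`), so each run gets its own instance named `<template>@<instance>.service`. The sketch below shows how the two instance names for one run might be derived. It assumes the instance name is the build name used verbatim, which this diff does not confirm; the helper scripts may transform or escape it. `atvm_unit_names` is a hypothetical function.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the watcher and runner instance unit names for
# one build. Assumes the build name is used unchanged as the instance name.
atvm_unit_names() {
  local build="$1"
  printf 'atvm-run-watcher@%s.service\n' "$build"
  printf 'atvm-runner@%s.service\n' "$build"
}

# Example (left as a comment because it needs a host running systemd):
#   atvm_unit_names e2e-redhat9.6-ubuntu24.04-w2k25-fc | while read -r unit; do
#     systemctl status "$unit"
#   done
```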
@@ -146,6 +164,10 @@ systemctl cat atvm-run-watcher@.service
 python3 /opt/atvm-watcher-service/atvm_run_watcher.py --help
 ```
 
+```bash
+/opt/atvm-watcher-service/start-atvm-runner.sh --help
+```
+
 ```bash
 /opt/atvm-watcher-service/start-atvm-run-watcher.sh --help
 ```
@@ -154,10 +176,10 @@ python3 /opt/atvm-watcher-service/atvm_run_watcher.py --help
 
 Once installed, the intended workflow is:
 
-1. Launch the ATVM run as usual.
-2. Start the watcher for that build name.
+1. Start the watcher for that build name.
 - the start helper must clear any stale watcher state for that same requested build name before starting the new watcher instance
-3. Let the watcher run on the controller.
+2. Start the runner service for that build name.
+3. Let the runner and watcher run on the controller.
 4. The watcher exits on terminal state.
 
 Example:
@@ -173,10 +195,19 @@ Example:
 --integration-plugin "pure with fc" \
 --categorize \
 --scope-description "mixed Linux and Windows FC E2E validation on the gold datastore set"
+
+/opt/atvm-watcher-service/start-atvm-runner.sh \
+--build-name e2e-redhat9.6-ubuntu24.04-w2k25-fc \
+--runner-command "python3 ./run-sorry-cypress.py --config_file cypress.atvm-config-gold.ts --build_name e2e-redhat9.6-ubuntu24.04-w2k25-fc --categorize"
 ```
 
 Cancel example:
 
+```bash
+/opt/atvm-watcher-service/cancel-atvm-runner.sh \
+--build-name e2e-redhat9.6-ubuntu24.04-w2k25-fc
+```
+
 ```bash
 /opt/atvm-watcher-service/cancel-atvm-run-watcher.sh \
 --build-name e2e-redhat9.6-ubuntu24.04-w2k25-fc
@@ -192,7 +223,9 @@ The cancel helper should:
 ## Operational Notes
 
 - This is not a daemon.
+- One runner instance is started per ATVM run.
 - One watcher instance is started per ATVM run.
+- Prefer the `atvm-runner@...` service over detached SSH background launch patterns for `run-sorry-cypress.py`.
 - Categorized execution is treated as one watcher instance tracking sequential grouped ATVM sub-runs.
 - In categorized execution, the watcher must remain alive until the parent request has actually gone inactive past the grace window, even if one grouped sub-run already completed.
 - The watcher exits after the run reaches a terminal state.