# AI Agent Instructions for `cdssync`

These instructions apply anywhere under `/home/aw/code/cds/cdssync`.

## Migration Test Dataset Workflow

For migration test datasets in this workspace, follow this process by default:

1. Generate the dataset locally from this workspace.
2. Preserve the local generated dataset as the canonical original copy.
3. Copy the dataset to the test machine using metadata-preserving tooling.
4. Verify the copied dataset on the test machine before using it for migration testing.

## Generation Rules

- Use `/home/aw/code/cds/cdssync/generate_migration_test_dataset.sh` to create the dataset unless the user explicitly asks for a different method.
- Prefer `/home/aw/code/cds/cdssync/migration-test-dataset` as the local canonical dataset location unless the user specifies another target.
- The generator script accepts an optional `UPDATE_INTERVAL_SECONDS` argument:
  - omit it to create the dataset once and exit
  - use `0` for continuous random content updates
  - use any integer greater than `0` to rewrite mutable files every `N` seconds
- The generator script also accepts `--update-only`:
  - use it to update an existing dataset in place without recreating files, links, or directories
  - combine it with `UPDATE_INTERVAL_SECONDS` to keep mutating an existing dataset on a fixed interval
- The generator script can also create additional bulk test data under `bulk/`:
  - `--folder-count N` controls how many bulk folders are created
  - `--files-per-folder N` controls how many bulk files are created in each folder
  - `--min-file-size-mib N` and `--max-file-size-mib N` control the random size range for bulk files
  - `--max-dataset-size-mib N` caps the total size of generated bulk files only
  - once bulk files exist, update mode rewrites them too as part of the mutable-content set
- If ACL/xattr coverage matters, ensure the generation host has:
  - `acl` installed for `setfacl` and `getfacl`
  - `attr` installed for `setfattr` and `getfattr`

## Monitoring Helper

- Use `/home/aw/code/cds/cdssync/monitor_mtdi_galaxy.sh` when the user wants CPU and memory logging for `mtdi-daemon` and `galaxy-migrate`.
- The script accepts:
  - `INTERVAL_SECONDS`, default `10`
  - `LOG_FILE`, default `/root/monitor_mtdi_galaxy.log`
- A common remote run pattern is:
  - `nohup /root/monitor_mtdi_galaxy.sh 10 /root/monitor_mtdi_galaxy.log >/dev/null 2>&1
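
Step 4 of the Migration Test Dataset Workflow asks for the copied dataset to be verified, but these instructions do not prescribe a method. The following is a minimal sketch of one possible content check, assuming `sha256sum` is available on both machines. The function names `dataset_manifest` and `verify_dataset_copy` are hypothetical and are not part of `cdssync`. The check covers regular file contents only; ACLs and xattrs would need a separate comparison, for example with `getfacl` and `getfattr`.

```shell
# Sketch only: hypothetical helpers, not part of cdssync.

# Print "<sha256>  <relative path>" for every regular file under the
# dataset root given as $1, sorted by path so two manifests are comparable.
dataset_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 )
}

# Compare two dataset trees by content checksum.
# Returns 0 when the manifests match, 1 when they differ,
# and 2 when either tree cannot be read.
# Note: plain sh has no local variables, so these names are global.
verify_dataset_copy() {
    original_manifest=$(dataset_manifest "$1") || return 2
    copy_manifest=$(dataset_manifest "$2") || return 2
    [ "$original_manifest" = "$copy_manifest" ]
}
```

When the original and the copy live on different machines, one option is to run `dataset_manifest` on each side, save the output to a file, and compare the two files with `diff`.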