# AI Agent Instructions for `cdssync`

These instructions apply anywhere under `/home/aw/code/cds/cdssync`.

## Migration Test Dataset Workflow

For migration test datasets in this workspace, follow this process by default:

1. Generate the dataset locally from this workspace.
2. Preserve the local generated dataset as the canonical original copy.
3. Copy the dataset to the test machine using metadata-preserving tooling.
4. Verify the copied dataset on the test machine before using it for migration testing.

## Generation Rules

- Use `/home/aw/code/cds/cdssync/generate_migration_test_dataset.sh` to create the dataset unless the user explicitly asks for a different method.
- Prefer `/home/aw/code/cds/cdssync/migration-test-dataset` as the local canonical dataset location unless the user specifies another target.
- The generator script accepts an optional `UPDATE_INTERVAL_SECONDS` argument:
  - omit it to create the dataset once and exit
  - use `0` for continuous random content updates
  - use any integer greater than `0` to rewrite mutable files every `N` seconds
- The generator script also accepts `--update-only`:
  - use it to update an existing dataset in place without recreating files, links, or directories
  - combine it with `UPDATE_INTERVAL_SECONDS` to keep mutating an existing dataset on a fixed interval
- The generator script can also create additional bulk test data under `bulk/`:
  - `--folder-count N` controls how many bulk folders are created
  - `--files-per-folder N` controls how many bulk files are created in each folder
  - `--min-file-size-mib N` and `--max-file-size-mib N` control the random size range for bulk files
  - `--max-dataset-size-mib N` caps the total size of generated bulk files only
  - once bulk files exist, update mode rewrites them too as part of the mutable-content set
- If ACL/xattr coverage matters, ensure the generation host has:
  - `acl` installed for `setfacl` and `getfacl`
  - `attr` installed for `setfattr` and `getfattr`

## Copy Rules

- Use `rsync -aHAX` by default when copying the dataset to another machine.
- Preserve permissions, timestamps, symlinks, hard links, ACLs, and xattrs.
- Do not use GUI copy/paste or non-preserving copy methods for this dataset unless the user explicitly asks for that.

## Verification Rules

After copying to a test machine, verify at least:

- file and directory structure
- permissions
- symlinks
- hard links
- timestamps
- ACLs
- xattrs

Preferred verification commands include:

- `find DEST_DIR | sort`
- `stat DEST_DIR/regular/script_3mb_700.sh`
- `stat DEST_DIR/readonly-dir/locked_text_1mb_444.txt`
- `readlink DEST_DIR/links/symlink_to_text_1mb_644.txt`
- `stat DEST_DIR/regular/random_3mb_644.bin DEST_DIR/links/hardlink_to_random_3mb_644.bin`
- `getfacl -p DEST_DIR/metadata/acl_text_1mb_644.txt`
- `getfattr -d DEST_DIR/metadata/xattr_text_1mb_644.txt`

## Destination Host Requirements

- If the destination host lacks `acl` or `attr`, ACL/xattr verification will be incomplete.
- If the destination filesystem does not support ACLs or xattrs, those attributes may not survive transfer even when the copy method is correct.
- The generator now logs and continues when ACL/xattr assignment is unsupported on the target filesystem instead of exiting.
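
Some of the verification checks listed above can be scripted so they fail loudly instead of relying on a visual read of `stat` output. The following is a minimal sketch, not part of `cdssync`. The helper names `same_inode`, `mode_of`, and `check_dataset` are hypothetical. The expected modes (`700`, `444`) are an assumption inferred from the file names, and the `stat -c` format strings assume GNU coreutils.

```shell
#!/bin/sh
# Sketch of post-copy spot checks for a copied dataset.
# Helper names are hypothetical; `stat -c` assumes GNU coreutils.

# Succeed only when both paths exist and share the same device and inode.
same_inode() {
    [ -e "$1" ] && [ -e "$2" ] &&
        [ "$(stat -c '%d:%i' "$1")" = "$(stat -c '%d:%i' "$2")" ]
}

# Print the octal permission bits of a path.
mode_of() {
    stat -c '%a' "$1"
}

# Run spot checks against DEST_DIR ($1). Print each failure and
# return non-zero if any check fails.
check_dataset() {
    dest=$1
    rc=0
    # Expected modes are inferred from the file names (assumption).
    [ "$(mode_of "$dest/regular/script_3mb_700.sh")" = "700" ] ||
        { echo "FAIL: mode of regular/script_3mb_700.sh"; rc=1; }
    [ "$(mode_of "$dest/readonly-dir/locked_text_1mb_444.txt")" = "444" ] ||
        { echo "FAIL: mode of readonly-dir/locked_text_1mb_444.txt"; rc=1; }
    [ -L "$dest/links/symlink_to_text_1mb_644.txt" ] ||
        { echo "FAIL: links/symlink_to_text_1mb_644.txt is not a symlink"; rc=1; }
    same_inode "$dest/regular/random_3mb_644.bin" \
        "$dest/links/hardlink_to_random_3mb_644.bin" ||
        { echo "FAIL: hard link pair does not share an inode"; rc=1; }
    return "$rc"
}

# Usage (DEST_DIR is a placeholder for the copied dataset path):
#   check_dataset DEST_DIR && echo "spot checks passed"
```

This covers only permissions, the symlink, and the hard link pair. Timestamps, ACLs, and xattrs still need the `stat`, `getfacl`, and `getfattr` commands listed under Verification Rules, compared against the canonical local copy.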