Add the cdssync migration test dataset manifest, generator script, workspace instructions, and gitignore.

This sets the default workflow to:
- generate the dataset locally
- copy it to the test machine with metadata preserved
- verify the copied data before migration testing
Migration Test Dataset Manifest
This manifest defines a compact, high-value filesystem test set for validating file migration behavior. It is intended to cover common file-content, naming, metadata, and directory edge cases without generating an unnecessarily large corpus.
Recommended Root Layout
- regular/
- hidden/
- spaces in name/
- deep/tree/level1/level2/level3/
- readonly-dir/
- links/
- metadata/
- empty-dirs/
Test Objects
Regular Files
- regular/text_1mb_644.txt
- regular/text_3mb_600.txt
- regular/text_5mb_755.txt
- regular/random_1mb_600.bin
- regular/random_3mb_644.bin
- regular/random_5mb_755.bin
- regular/compressible_1mb_644.log
- regular/compressible_3mb_600.log
- regular/compressible_5mb_755.log
- regular/script_1mb_755.sh
- regular/script_3mb_700.sh
- regular/script_5mb_755.sh
- regular/sparse_1mb_600.img
- regular/sparse_3mb_600.img
- regular/sparse_5mb_600.img
- regular/empty_000_644.txt
- regular/empty_001_600.txt
- regular/empty_002_755.txt
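The names in this manifest appear to encode a size token such as 1mb and a three-digit octal permission mode just before the extension. The manifest does not state this convention explicitly, so the following is only a sketch under that assumption; `FileSpec` and `parse_name` are hypothetical helpers, not part of the generator script.

```python
import re
from typing import NamedTuple, Optional

MIB = 1024 * 1024


class FileSpec(NamedTuple):
    size_bytes: int        # 0 when the name has no "<N>mb" token
    mode: Optional[int]    # None when the name does not end in three octal digits


_SIZE = re.compile(r"(\d+)mb")
_MODE = re.compile(r"([0-7]{3})$")


def parse_name(path: str) -> FileSpec:
    """Derive the assumed size and mode from a manifest path."""
    name = path.rsplit("/", 1)[-1]
    # Drop the extension, but keep a leading dot that marks a hidden file.
    visible = name.lstrip(".")
    stem = name.rsplit(".", 1)[0] if "." in visible else name
    size = _SIZE.search(stem)
    mode = _MODE.search(stem)
    return FileSpec(
        size_bytes=int(size.group(1)) * MIB if size else 0,
        mode=int(mode.group(1), 8) if mode else None,
    )
```

A parser like this could let a generator and a verifier share one source of truth for expected sizes and modes, provided the naming assumption holds for every entry.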
Hidden Files
- hidden/.hidden_text_1mb_644.txt
- hidden/.hidden_random_3mb_600.bin
- hidden/.hidden_script_1mb_755.sh
- hidden/.hidden_empty_644
- hidden/.hidden_sparse_5mb_600.img
Files With Spaces
- spaces in name/file with spaces text 1mb 644.txt
- spaces in name/file with spaces random 3mb 600.bin
- spaces in name/file with spaces script 1mb 755.sh
- spaces in name/file with spaces empty 644
- spaces in name/file with spaces sparse 5mb 600.img
Long-Name Files
- regular/longname_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_text_1mb_644.txt
- regular/longname_bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb_random_3mb_600.bin
- regular/longname_cccccccccccccccccccccccccccccccc_compressible_5mb_755.log
Deep Path Files
- deep/tree/level1/level2/level3/deep_text_1mb_644.txt
- deep/tree/level1/level2/level3/deep_random_3mb_600.bin
- deep/tree/level1/level2/level3/deep_script_1mb_755.sh
- deep/tree/level1/level2/level3/deep_sparse_5mb_600.img
Duplicate-Content Cases
- regular/dup_source_text_3mb_644.txt
- regular/dup_copy_a_text_3mb_600.txt
- deep/tree/level1/level2/dup_copy_b_text_3mb_755.txt
Timestamp Variants
- regular/old_text_1mb_644.txt
- regular/recent_text_1mb_644.txt
- regular/futureish_text_1mb_644.txt
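One way to produce the timestamp variants is to write each file and then set its access and modification times with `os.utime`. The sketch below is illustrative only: the offsets (about ten years back, one minute back, two days ahead) are assumptions, since the manifest does not say how old or how far in the future these files should be, and the files get placeholder content rather than the full 1 MiB.

```python
import os
import time

DAY = 24 * 60 * 60

# Assumed offsets relative to "now"; the manifest does not specify exact values.
OFFSETS = {
    "old_text_1mb_644.txt": -10 * 365 * DAY,
    "recent_text_1mb_644.txt": -60,
    "futureish_text_1mb_644.txt": 2 * DAY,
}


def make_timestamp_variants(directory, now=None):
    """Create the three variants and return the mtime applied to each name."""
    now = time.time() if now is None else now
    os.makedirs(directory, exist_ok=True)
    applied = {}
    for name, offset in OFFSETS.items():
        path = os.path.join(directory, name)
        with open(path, "wb") as fh:
            fh.write(b"placeholder\n")  # stand-in for the real 1 MiB payload
        os.chmod(path, 0o644)
        stamp = int(now + offset)
        os.utime(path, (stamp, stamp))  # sets both atime and mtime
        applied[name] = stamp
    return applied
```

Whole-second timestamps are used so that a comparison after transfer is not affected by differing sub-second precision between filesystems.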
Read-Only Or Awkward Placement Cases
- readonly-dir/locked_text_1mb_444.txt
- readonly-dir/locked_random_3mb_400.bin
- readonly-dir/locked_script_1mb_500.sh
Links
- links/symlink_to_text_1mb_644.txt
- links/symlink_to_deep_random_3mb_600.bin
- links/symlink_to_hidden_file
- links/hardlink_to_random_3mb_644.bin
- links/hardlink_to_compressible_5mb_755.log
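The link cases can be created with `os.symlink` and `os.link`. The sketch below builds one symlink and one hard link from the list; it assumes the symlinks are relative and point into regular/, which the manifest does not confirm, and it writes small placeholder payloads. `make_links` is a hypothetical helper name.

```python
import os


def make_links(root):
    """Create one symlink and one hard link; return (symlink, hardlink, hardlink source)."""
    regular = os.path.join(root, "regular")
    links = os.path.join(root, "links")
    os.makedirs(regular, exist_ok=True)
    os.makedirs(links, exist_ok=True)

    # Placeholder targets; a real generator would write the full-size files.
    with open(os.path.join(regular, "text_1mb_644.txt"), "wb") as fh:
        fh.write(b"placeholder\n")
    hard_source = os.path.join(regular, "random_3mb_644.bin")
    with open(hard_source, "wb") as fh:
        fh.write(os.urandom(1024))

    # Assumed: a relative target, so the link still resolves after the tree is moved.
    symlink = os.path.join(links, "symlink_to_text_1mb_644.txt")
    os.symlink(os.path.join("..", "regular", "text_1mb_644.txt"), symlink)

    # A hard link shares the inode with its source and must live on the same filesystem.
    hardlink = os.path.join(links, "hardlink_to_random_3mb_644.bin")
    os.link(hard_source, hardlink)
    return symlink, hardlink, hard_source
```

After a transfer, `os.readlink` on the copy shows whether the symlink target survived unchanged, and `os.path.samefile` on the two hard-link paths shows whether they still share an inode or were split into independent files.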
Directories
- empty-dirs/empty_a/
- empty-dirs/empty_b/
- empty-dirs/.hidden_empty_dir/
- readonly-dir/no_write_subdir/
- deep/tree/level1/level2/level3/
Metadata Cases
These should only be created if the source filesystem supports them and the test environment allows them.
- metadata/xattr_text_1mb_644.txt
- metadata/xattr_random_3mb_600.bin
- metadata/acl_text_1mb_644.txt
- metadata/acl_script_1mb_755.sh
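Because these cases depend on filesystem support, a generator could probe for it before creating them. The following is a minimal sketch for extended attributes only, assuming a Linux host where Python exposes `os.setxattr`. The attribute name `user.migration_probe` and the function `xattr_supported` are made up for illustration. ACLs are not covered here because the Python standard library has no ACL interface.

```python
import os
import tempfile


def xattr_supported(directory):
    """Return True if a user xattr can be written and read back in `directory`."""
    if not hasattr(os, "setxattr"):
        return False  # platform without the Linux xattr calls in the os module
    fd, probe = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        os.setxattr(probe, "user.migration_probe", b"1")
        return os.getxattr(probe, "user.migration_probe") == b"1"
    except OSError:
        return False  # filesystem or mount options reject user xattrs
    finally:
        os.unlink(probe)
```

When the probe returns False, skipping the metadata/ entries (and recording that they were skipped) keeps the rest of the dataset usable.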
Approximate Storage
Estimated real disk usage for this manifest:
- core allocated files: about 95 MiB to 125 MiB
- with filesystem overhead and modest headroom: plan for about 150 MiB
- comfortable reserve for later additions: 250 MiB
Important notes:
- sparse files may report a logical size of 1 MiB to 5 MiB while using much less physical disk space
- symlinks, hard links, directories, ACLs, xattrs, and empty files add little compared with regular allocated files
- if you later expand this set with more size permutations or more metadata variants, storage will grow mostly with the fully allocated non-sparse files
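The difference between logical and physical size for sparse files can be observed directly. This sketch creates a sparse file by extending an empty file with `truncate`, then reads both sizes from `os.stat`. It assumes a Unix-like system where `st_blocks` is available and counts 512-byte units; the helper names are hypothetical.

```python
import os

MIB = 1024 * 1024


def make_sparse(path, size_bytes):
    """Create a file of the given logical size without writing any data blocks."""
    with open(path, "wb") as fh:
        fh.truncate(size_bytes)


def logical_and_allocated(path):
    """Return (logical size in bytes, allocated size in bytes) for a file."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512
```

On a filesystem that supports holes, the allocated figure for such a file is typically far below the logical one. If a copied sparse file reports roughly equal values, the transfer tool has filled in the holes.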
Usage Recommendation
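Use this directory as the canonical definition of the source dataset. Generate the files once, keep the original unchanged, and transfer a copy to the source test machine using metadata-preserving tooling such as rsync -aH, cp -a, or a tar archive workflow. Note that rsync -aH preserves hard links but not ACLs or extended attributes; add -A and -X when the metadata cases are included, and -S to keep sparse files sparse.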
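To check that the transferred copy matches the original before migration testing, the two trees can be compared entry by entry. The sketch below is one possible approach, not the project's own verification step: it records type, permission bits, size, whole-second mtime, and a SHA-256 digest for regular files, and the link target for symlinks. It does not check hard-link grouping, ACLs, or xattrs, and `snapshot` and `compare_trees` are hypothetical names.

```python
import hashlib
import os
import stat


def _sha256(path):
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each relative path under root to a tuple describing the entry."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            st = os.lstat(full)  # lstat, so symlinks are described, not followed
            mode = stat.S_IMODE(st.st_mode)
            if stat.S_ISLNK(st.st_mode):
                entries[rel] = ("symlink", os.readlink(full))
            elif stat.S_ISDIR(st.st_mode):
                entries[rel] = ("dir", mode)
            else:
                entries[rel] = ("file", mode, st.st_size, int(st.st_mtime), _sha256(full))
    return entries


def compare_trees(source, copy):
    """Return the sorted relative paths that are missing or differ between the trees."""
    a, b = snapshot(source), snapshot(copy)
    return sorted(rel for rel in a.keys() | b.keys() if a.get(rel) != b.get(rel))
```

An empty result from `compare_trees(original, transferred)` means every checked attribute survived the copy; any listed path is worth inspecting by hand before it is blamed on the migration tool under test.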