Add support for running content updates against an existing migration test dataset without recreating the filesystem structure. Also make ACL/xattr updates non-fatal on filesystems that do not support those operations.
Migration Test Dataset Manifest
This manifest defines a compact, high-value filesystem test set for validating file migration behavior. It is intended to cover common file-content, naming, metadata, and directory edge cases without generating an unnecessarily large corpus.
The generator script can also run in continuous update mode after initial creation. In that mode, mutable content files are rewritten with random data on a fixed interval:
- omit the interval argument to create the dataset once and exit
- use 0 for continuous rewrites with no sleep between passes
- use any integer greater than 0 to rewrite mutable files every N seconds
- use --update-only to run updates against an already-existing dataset without recreating the special-case filesystem objects first
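The invocation modes listed above could be wired up roughly as follows. This is only a sketch: the manifest does not show the generator's real argument handling, so `parse_args`, `plan`, and the positional name `interval` are hypothetical, and the behavior of `--update-only` when no interval is given is not specified here.

```python
import argparse


def parse_args(argv):
    """Parse a hypothetical command line: [interval] [--update-only]."""
    parser = argparse.ArgumentParser(description="dataset generator (sketch)")
    parser.add_argument("interval", nargs="?", type=int, default=None,
                        help="seconds between rewrite passes; omit to create once")
    parser.add_argument("--update-only", action="store_true",
                        help="do not recreate the dataset before updating")
    args = parser.parse_args(argv)
    if args.interval is not None and args.interval < 0:
        parser.error("interval must be 0 or a positive integer")
    return args


def plan(args):
    """Return (create_first, run_update_loop, sleep_seconds)."""
    create_first = not args.update_only       # --update-only skips creation
    run_update_loop = args.interval is not None
    sleep_seconds = args.interval or 0        # 0 means no sleep between passes
    return create_first, run_update_loop, sleep_seconds
```

For example, `plan(parse_args(["30", "--update-only"]))` describes a run that skips creation and rewrites mutable files every 30 seconds.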
Important implementation detail for update mode:
- the update loop rewrites content-bearing regular files that are intended to simulate active data churn
- it does not rewrite script files, sparse files, symlinks, hard links, or empty files
- this preserves the special-case filesystem structure while still generating ongoing content changes
- if ACL/xattr assignment is unsupported on the target filesystem, the script logs that condition and continues
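One way to express the selection rules above is sketched below. It is not the generator's actual code: the function names are hypothetical, scripts and sparse files are recognised purely by the words "script" and "sparse" in the file name (an assumption based on the naming used in this manifest), and any file with more than one hard link is skipped under every one of its names.

```python
import os
import stat


def is_mutable_content_file(path):
    """Return True if the update loop in this sketch would rewrite `path`."""
    st = os.lstat(path)               # lstat so symlinks are not followed
    if not stat.S_ISREG(st.st_mode):  # skips symlinks and directories
        return False
    if st.st_nlink > 1:               # skips every name of a hard-linked file
        return False
    if st.st_size == 0:               # empty files stay empty
        return False
    name = os.path.basename(path).lower()
    # Assumption: scripts and sparse files are recognised by name only.
    if "script" in name or "sparse" in name:
        return False
    return True


def rewrite_in_place(path):
    """Overwrite `path` with random bytes of the same length."""
    size = os.path.getsize(path)
    with open(path, "r+b") as fh:     # r+b keeps the inode, mode, and size
        fh.write(os.urandom(size))


def update_pass(root):
    """Rewrite every mutable file under `root`; return the rewritten paths."""
    rewritten = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if is_mutable_content_file(path):
                rewrite_in_place(path)
                rewritten.append(path)
    return rewritten
```

Rewriting in place rather than replacing the file keeps permissions and link structure intact. Read-only files such as those under readonly-dir/ would raise a permission error for a non-root user in this sketch and would need extra handling.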
Recommended Root Layout
- regular/
- hidden/
- spaces in name/
- deep/tree/level1/level2/level3/
- readonly-dir/
- links/
- metadata/
- empty-dirs/
Test Objects
Regular Files
- regular/text_1mb_644.txt
- regular/text_3mb_600.txt
- regular/text_5mb_755.txt
- regular/random_1mb_600.bin
- regular/random_3mb_644.bin
- regular/random_5mb_755.bin
- regular/compressible_1mb_644.log
- regular/compressible_3mb_600.log
- regular/compressible_5mb_755.log
- regular/script_1mb_755.sh
- regular/script_3mb_700.sh
- regular/script_5mb_755.sh
- regular/sparse_1mb_600.img
- regular/sparse_3mb_600.img
- regular/sparse_5mb_600.img
- regular/empty_000_644.txt
- regular/empty_001_600.txt
- regular/empty_002_755.txt
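The file names in this manifest appear to encode a size in MiB and an octal permission mode (for example `text_1mb_644.txt`). That reading is an inference from the names, not something the manifest states. Under that assumption, a checker could recover the expected size and mode from a name with a small hypothetical helper such as `parse_name`:

```python
import re

MIB = 1024 * 1024

# Optional "<n>mb" token, then a three-digit octal mode, then an optional
# extension. The separator may be "_" or a space (see "spaces in name/").
_NAME_RE = re.compile(r"(?:(\d+)mb[_ ])?([0-7]{3})(?:\.[A-Za-z0-9]+)?$")


def parse_name(filename):
    """Return (size_in_bytes_or_None, mode), or None if nothing matches."""
    match = _NAME_RE.search(filename)
    if match is None:
        return None
    size_mib, mode_digits = match.groups()
    size = int(size_mib) * MIB if size_mib else None
    return size, int(mode_digits, 8)
```

Names without a size token, such as the empty files, yield `None` for the size, and link names that carry no mode yield `None` overall.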
Hidden Files
- hidden/.hidden_text_1mb_644.txt
- hidden/.hidden_random_3mb_600.bin
- hidden/.hidden_script_1mb_755.sh
- hidden/.hidden_empty_644
- hidden/.hidden_sparse_5mb_600.img
Files With Spaces
- spaces in name/file with spaces text 1mb 644.txt
- spaces in name/file with spaces random 3mb 600.bin
- spaces in name/file with spaces script 1mb 755.sh
- spaces in name/file with spaces empty 644
- spaces in name/file with spaces sparse 5mb 600.img
Long-Name Files
- regular/longname_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_text_1mb_644.txt
- regular/longname_bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb_random_3mb_600.bin
- regular/longname_cccccccccccccccccccccccccccccccc_compressible_5mb_755.log
Deep Path Files
- deep/tree/level1/level2/level3/deep_text_1mb_644.txt
- deep/tree/level1/level2/level3/deep_random_3mb_600.bin
- deep/tree/level1/level2/level3/deep_script_1mb_755.sh
- deep/tree/level1/level2/level3/deep_sparse_5mb_600.img
Duplicate-Content Cases
- regular/dup_source_text_3mb_644.txt
- regular/dup_copy_a_text_3mb_600.txt
- deep/tree/level1/level2/dup_copy_b_text_3mb_755.txt
Timestamp Variants
- regular/old_text_1mb_644.txt
- regular/recent_text_1mb_644.txt
- regular/futureish_text_1mb_644.txt
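The manifest does not say which timestamps these three files carry. As an illustration only, the sketch below assigns modification times with `os.utime`; the offsets in `OFFSETS` (about ten years in the past, one minute in the past, one day in the future) and the helper name are hypothetical choices, not values taken from the generator.

```python
import os
import time

DAY = 86400

# Hypothetical offsets in seconds relative to "now", keyed by name prefix.
OFFSETS = {
    "old": -3650 * DAY,
    "recent": -60,
    "futureish": DAY,
}


def apply_timestamp_variant(path, now=None):
    """Set atime and mtime from the name prefix; return the stamp or None."""
    now = time.time() if now is None else now
    name = os.path.basename(path)
    for prefix, offset in OFFSETS.items():
        if name.startswith(prefix + "_"):
            stamp = now + offset
            os.utime(path, (stamp, stamp))  # (atime, mtime)
            return stamp
    return None                             # not a timestamp-variant file
```

After migration, comparing `os.stat(path).st_mtime` on source and destination shows whether the tool preserved old and future-dated timestamps.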
Read-Only Or Awkward Placement Cases
- readonly-dir/locked_text_1mb_444.txt
- readonly-dir/locked_random_3mb_400.bin
- readonly-dir/locked_script_1mb_500.sh
Links
- links/symlink_to_text_1mb_644.txt
- links/symlink_to_deep_random_3mb_600.bin
- links/symlink_to_hidden_file
- links/hardlink_to_random_3mb_644.bin
- links/hardlink_to_compressible_5mb_755.log
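The link targets are not spelled out in this manifest. The sketch below assumes, from the names alone, that `symlink_to_text_1mb_644.txt` points at `regular/text_1mb_644.txt` and that `hardlink_to_random_3mb_644.bin` shares an inode with `regular/random_3mb_644.bin`. `make_links` and `link_kind` are hypothetical helpers showing how such objects can be created and then checked after a migration.

```python
import os


def make_links(root):
    """Create one symlink and one hard link under root/links (sketch)."""
    links_dir = os.path.join(root, "links")
    os.makedirs(links_dir, exist_ok=True)

    # Assumed target: regular/text_1mb_644.txt, stored as a relative path.
    symlink = os.path.join(links_dir, "symlink_to_text_1mb_644.txt")
    os.symlink(os.path.join("..", "regular", "text_1mb_644.txt"), symlink)

    # Assumed target: regular/random_3mb_644.bin; both names share one inode.
    hardlink = os.path.join(links_dir, "hardlink_to_random_3mb_644.bin")
    os.link(os.path.join(root, "regular", "random_3mb_644.bin"), hardlink)
    return symlink, hardlink


def link_kind(path):
    """Classify a non-directory path as 'symlink', 'hardlink', or 'regular'."""
    if os.path.islink(path):
        return "symlink"
    return "hardlink" if os.lstat(path).st_nlink > 1 else "regular"
```

A relative symlink target is used so the link still resolves when the whole tree is copied to a different root. A migration tool that does not preserve hard links typically produces two independent copies, which `link_kind` would report as `regular`.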
Directories
- empty-dirs/empty_a/
- empty-dirs/empty_b/
- empty-dirs/.hidden_empty_dir/
- readonly-dir/no_write_subdir/
- deep/tree/level1/level2/level3/
Metadata Cases
These should only be created if the source filesystem supports them and the test environment allows them.
- metadata/xattr_text_1mb_644.txt
- metadata/xattr_random_3mb_600.bin
- metadata/acl_text_1mb_644.txt
- metadata/acl_script_1mb_755.sh
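A non-fatal extended-attribute assignment, in the spirit of the "log and continue" behavior described for the generator, might look like the sketch below. It is an illustration rather than the script's real code: the attribute name `user.migration_test`, its value, and the function name are hypothetical, and `os.setxattr` is only available on Linux. ACLs are not covered because the Python standard library has no ACL interface.

```python
import errno
import logging
import os

log = logging.getLogger("dataset")


def try_set_xattr(path, name="user.migration_test", value=b"1"):
    """Try to set one extended attribute; return True on success.

    Unsupported or disallowed operations are logged and reported as False
    instead of aborting the run.
    """
    setxattr = getattr(os, "setxattr", None)  # missing on non-Linux platforms
    if setxattr is None:
        log.warning("xattrs not available on this platform, skipping %s", path)
        return False
    try:
        setxattr(path, name, value)
    except OSError as exc:
        if exc.errno in (errno.ENOTSUP, errno.EPERM, errno.EACCES):
            log.warning("xattr not set on %s: %s", path, exc)
            return False
        raise                                  # unexpected errors stay fatal
    return True
```

Only the "not supported" and "not permitted" error codes are downgraded to warnings, so unrelated failures such as a missing file still surface.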
Approximate Storage
Estimated real disk usage for this manifest:
- core allocated files: about 95 MiB to 125 MiB
- with filesystem overhead and modest headroom: plan for about 150 MiB
- comfortable reserve for later additions: 250 MiB
Important notes:
- sparse files may report a logical size of 1 MiB to 5 MiB while using much less physical disk space
- symlinks, hard links, directories, ACLs, xattrs, and empty files add little compared with regular allocated files
- if you later expand this set with more size permutations or more metadata variants, storage will grow mostly with the fully allocated non-sparse files
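The difference between logical and allocated size can be checked directly. The sketch below uses hypothetical helper names and assumes a Linux-style `st_blocks` counted in 512-byte units. It creates a sparse file by extending it without writing data, then reports both sizes.

```python
import os


def make_sparse(path, size):
    """Create a file of `size` logical bytes without writing any data."""
    with open(path, "wb") as fh:
        fh.truncate(size)              # extends the file; no blocks written


def size_report(path):
    """Return (logical_bytes, allocated_bytes) for `path`."""
    st = os.stat(path)
    logical = st.st_size
    allocated = st.st_blocks * 512     # st_blocks counts 512-byte units
    return logical, allocated
```

On a filesystem that supports sparse files, `size_report` on a file made by `make_sparse(path, 1024 * 1024)` should show a logical size of 1 MiB and an allocated size close to zero. Whether a migration tool keeps the file sparse on the destination can be checked the same way.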
Usage Recommendation
Use this directory as the canonical definition of the source dataset. Generate the files once, preserve the original unchanged, and transfer a copy to the source test machine using metadata-preserving tooling such as rsync -aH, cp -a, or a tar archive workflow.