Add bulk dataset generation options to test data script
Add bulk data generation controls for folder count, files per folder, file size range, and bulk dataset size limits. Also update the cdssync docs to describe the new options and how update mode applies to generated bulk files.
@@ -22,6 +22,12 @@ For migration test datasets in this workspace, follow this process by default:
- The generator script also accepts `--update-only`:
  - use it to update an existing dataset in place without recreating files, links, or directories
  - combine it with `UPDATE_INTERVAL_SECONDS` to keep mutating an existing dataset on a fixed interval
- The generator script can also create additional bulk test data under `bulk/`:
  - `--folder-count N` controls how many bulk folders are created
  - `--files-per-folder N` controls how many bulk files are created in each folder
  - `--min-file-size-mib N` and `--max-file-size-mib N` control the random size range for bulk files
  - `--max-dataset-size-mib N` caps the total size of generated bulk files only
  - once bulk files exist, update mode rewrites them too as part of the mutable-content set
- If ACL/xattr coverage matters, ensure the generation host has:
  - `acl` installed for `setfacl` and `getfacl`
  - `attr` installed for `setfattr` and `getfattr`
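To illustrate how the bulk options above could interact, here is a minimal Python sketch of a size planner. It is not the generator script's actual code: the function name `plan_bulk_files`, the `bulk/folder_NNNN/file_NNNN.bin` naming, the whole-MiB uniform sizing, and the "stop as soon as the next file would exceed the cap" behaviour are all assumptions made for illustration.

```python
import random

MIB = 1024 * 1024


def plan_bulk_files(folder_count, files_per_folder, min_mib, max_mib,
                    max_dataset_mib=None, seed=None):
    """Return a list of (relative_path, size_in_bytes) for planned bulk files.

    Hypothetical helper: mirrors --folder-count, --files-per-folder,
    --min-file-size-mib, --max-file-size-mib and --max-dataset-size-mib.
    """
    if min_mib > max_mib:
        raise ValueError("minimum file size must not exceed maximum file size")

    rng = random.Random(seed)
    budget = None if max_dataset_mib is None else max_dataset_mib * MIB
    plan = []
    total = 0

    for folder in range(folder_count):
        for index in range(files_per_folder):
            # Assumption: sizes are whole MiB drawn uniformly from the range.
            size = rng.randint(min_mib, max_mib) * MIB
            # Assumption: planning stops once the cap would be exceeded.
            if budget is not None and total + size > budget:
                return plan
            # Hypothetical naming scheme under bulk/.
            plan.append((f"bulk/folder_{folder:04d}/file_{index:04d}.bin", size))
            total += size

    return plan


if __name__ == "__main__":
    for path, size in plan_bulk_files(2, 3, 1, 4, max_dataset_mib=8, seed=0):
        print(path, size // MIB, "MiB")
```

Under these assumptions the cap applies only to the planned bulk files, matching the note that `--max-dataset-size-mib` does not limit the rest of the dataset.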