Add the cdssync migration test dataset manifest, generator script,
workspace instructions, and .gitignore.
This establishes the following default workflow:
- generate the dataset locally
- copy it to the test machine with metadata preserved
- verify the copied data before migration testing
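
The verification step above could be sketched roughly as follows. This is a minimal illustration, not the generator script or manifest format added by this commit: the helper names `snapshot` and `diff_snapshots` are hypothetical, and the choice of checked fields (SHA-256, size, permission bits, mtime in whole seconds) is an assumption about what "metadata preserved" should cover.

```python
import hashlib
import stat
from pathlib import Path


def snapshot(root):
    """Map each regular file's relative path to (sha256, size, mode bits, mtime).

    Hypothetical helper: the fields recorded here are an assumption, not the
    actual manifest schema.
    """
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # directories and special files are skipped in this sketch
        info = path.stat()
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        result[path.relative_to(root).as_posix()] = (
            digest,
            info.st_size,
            stat.S_IMODE(info.st_mode),  # permission bits only
            int(info.st_mtime),          # whole seconds, to tolerate coarse timestamps
        )
    return result


def diff_snapshots(source, copy):
    """Return the sorted relative paths that are missing, extra, or different."""
    return sorted(
        path for path in source.keys() | copy.keys()
        if source.get(path) != copy.get(path)
    )
```

Under these assumptions, one would run `snapshot` on the generated tree and on the copy on the test machine, then treat an empty result from `diff_snapshots` as the signal that migration testing can proceed.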