Data Transfer¶
Move records — and optionally files — between environments. Two commands serve different use-cases:
| Command | Scope | Entities | Files |
|---|---|---|---|
| `grace data copy` | Flat entity copy with dependency resolution | devices, locations, vendors, videos | No |
| `grace data transfer` | Video-centric deep copy with child entities | videos + annotation_runs, segmentations, video_steps, devices, collectors, locations, vendors | Optional GCS blob copy |
grace data copy¶
Copy records from one environment to another — typically from production to development for testing and debugging.
Core semantics¶
grace data copy is additive and safe by design:
| Behavior | Description |
|---|---|
| Create only | New records are created in the target environment |
| Skip existing | Records that already exist (by UUID) are silently skipped |
| No updates | Existing records in the target are never modified |
| No deletes | Nothing is ever removed from the target |
| Dependency resolution | Required parent records are discovered and copied automatically |
UUIDs are preserved
Records keep their original UUIDs when copied, so you can trace a record back to its production origin, and re-running the same copy is idempotent.
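The create-only, skip-existing behavior amounts to a set difference on UUIDs. As a rough illustration (the ID lists are invented, and `missing_in_target` is a hypothetical helper, not part of the CLI), the records a copy would create are those whose UUID appears in the source but not in the target:

```shell
# Hypothetical UUID lists standing in for the source and target environments.
source_ids="uuid-a
uuid-b
uuid-c"
target_ids="uuid-b"

# Print the IDs in $1 (one per line) that are absent from $2.
# grep -v inverts the match, -x matches whole lines, -F treats patterns as fixed strings.
missing_in_target() {
  printf '%s\n' "$1" | grep -vxF -e "$2" || true
}

first_run=$(missing_in_target "$source_ids" "$target_ids")
printf 'first run creates:\n%s\n' "$first_run"

# After the first run the target holds every source ID,
# so a second run has nothing left to create.
target_ids="$source_ids"
second_run=$(missing_in_target "$source_ids" "$target_ids")
printf 'second run creates: [%s]\n' "$second_run"
```

The second run printing an empty list is what makes re-running a copy safe.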
Basic usage¶
Running `grace data copy --source prod --target dev --entity videos --limit 50` copies up to 50 videos from prod to dev, along with any devices, locations, and vendors they depend on.
Workflow¶
Every copy follows the same four-step flow:
1. Connect¶
The CLI authenticates against both the source and target environments. Both must have stored credentials (see Authentication).
2. Plan¶
The planner:
- Fetches matching records from the source
- Resolves dependency chains (videos depend on devices, locations, vendors)
- Checks which records already exist in the target
- Builds a summary of what needs to be created
3. Review¶
A summary table is printed:
Copy plan: prod → dev
┏━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Entity ┃ Selected ┃ Dependencies ┃ Already in target ┃ To create ┃
┡━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ devices │ 0 │ 5 │ 3 │ 2 │
│ locations│ 0 │ 3 │ 3 │ 0 │
│ vendors │ 0 │ 2 │ 1 │ 1 │
│ videos │ 50 │ 0 │ 12 │ 38 │
└──────────┴──────────┴───────────────┴───────────────────┴───────────┘
- Selected: Records matching your query
- Dependencies: Parent records pulled in automatically
- Already in target: Skipped (UUIDs already present)
- To create: Records that will be written
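In the sample table, each row's To create value equals Selected plus Dependencies minus Already in target. The following check reproduces the sample numbers; the relationship is inferred from the table rather than taken from the CLI's source, and `to_create` is a hypothetical helper:

```shell
# to_create SELECTED DEPENDENCIES ALREADY_IN_TARGET
to_create() {
  echo $(( $1 + $2 - $3 ))
}

to_create 0 5 3     # devices   -> 2
to_create 0 3 3     # locations -> 0
to_create 0 2 1     # vendors   -> 1
to_create 50 0 12   # videos    -> 38
```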
4. Execute¶
After confirmation, records are created in dependency order, with a progress bar showing how far the copy has got.
Dry run¶
Add `--dry-run` to preview a copy without writing anything.
The plan table is shown, but no records are created. Use this to understand the scope of a copy before committing to it.
Filtering¶
Narrow down which source records to copy:
grace data copy \
  --source prod --target dev \
  --entity videos \
  --filter "collection_meta.data_source:eq:internal" \
  --filter "createdAt:gte:2025-06-01" \
  --limit 100
Filter syntax is `key:operator:value`. Dot notation works for JSONB fields.
Use the `in` operator with bracket syntax to match multiple values:
`id:in:[uuid-1,uuid-2,uuid-3]`.
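When the ID list comes from a script, the bracketed value can be assembled by joining the IDs with commas. A small sketch follows; `build_in_filter` is a hypothetical helper name and the UUIDs are placeholders:

```shell
# build_in_filter KEY ID...  ->  KEY:in:[ID,ID,...]
build_in_filter() {
  key=$1
  shift
  # Inside the subshell, "$*" joins the remaining arguments with the
  # first character of IFS, which is set to a comma here.
  ( IFS=,; printf '%s:in:[%s]\n' "$key" "$*" )
}

filter=$(build_in_filter id uuid-1 uuid-2 uuid-3)
echo "$filter"
# The result can then be passed on as: --filter "$filter"
```

Quoting `"$filter"` when passing it to the CLI keeps the shell from interpreting the square brackets as a glob pattern.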
Transforms¶
Transforms modify records during copy — useful when certain fields don't make sense in the target environment.
--clear-storage-meta¶
Nulls out the `storage_meta` field on video records. Use this when copying
videos whose cloud storage paths don't exist in the target environment, for example `grace data copy --source prod --target dev --entity videos --clear-storage-meta`.
Without this flag, videos are copied with their original storage_meta intact.
When to use
Always use --clear-storage-meta when copying from production to a dev
environment that doesn't share the same GCS bucket. This prevents downstream
tools from attempting to access non-existent files.
Dependency resolution¶
When copying videos, the planner automatically resolves dependencies.
Each video references a device, location, and vendor by UUID. The planner fetches these parent records from the source and creates any that are missing in the target — before creating the videos.
Devices, locations, and vendors have no dependencies themselves, so copying those entity types directly is always a flat operation.
Supported entities¶
| Entity | Dependencies | Notes |
|---|---|---|
| `devices` | None | Flat copy |
| `locations` | None | Flat copy |
| `vendors` | None | Flat copy |
| `videos` | devices, locations, vendors | Auto-resolves parents |
grace data transfer¶
Deep, video-centric transfer that copies a video plus all of its child entities (annotation runs, segmentations, video steps) and optionally the referenced GCS files.
When to use transfer vs copy¶
| Scenario | Command |
|---|---|
| Populate a dev environment with raw video records | grace data copy |
| Replicate a complete video pipeline result (annotations, steps, files) | grace data transfer |
| Debug a specific video's processing in a lower environment | grace data transfer |
Core semantics¶
grace data transfer shares the additive/idempotent guarantee of copy, with
additional behaviors:
| Behavior | Description |
|---|---|
| Device resolution | Devices are matched across environments by device_no (serial number), not UUID |
| Device ID remapping | collection_meta.device_id on videos is rewritten to point at the target device UUID |
| Collector resolution | Collectors are matched by name (e.g. collector-01#0000); missing ones are created as minimal Worker records |
| Child entity fetch | annotation_runs, segmentations, and video_steps linked to selected videos are fetched and diffed |
| GCS file copy | With --copy-files, referenced GCS blobs are copied to the target bucket |
| URI rewriting | When copying files, gs:// URIs in storage_meta and result_ref/debug_ref are rewritten to point at the target bucket |
Why device_no instead of UUID?
Device UUIDs differ between environments since devices are registered
independently in each environment. The device_no serial number is the
stable natural key that identifies the same physical device across
environments.
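To make the matching concrete, here is a sketch of how a remap table from source UUID to target UUID can be built from `device_no`. The device lists are invented for illustration, and the CLI's actual implementation may differ:

```shell
# Each line: <device UUID> <device_no>. All values are invented.
source_devices="src-uuid-1 SN-100
src-uuid-2 SN-200"
target_devices="tgt-uuid-9 SN-100"

# Feed the target list first, then a separator, then the source list.
# awk indexes target UUIDs by device_no and looks up each source device.
remap=$(printf '%s\n---\n%s\n' "$target_devices" "$source_devices" | awk '
  $0 == "---"         { in_source = 1; next }
  !in_source          { target_uuid[$2] = $1; next }
  ($2 in target_uuid) { print $1, "->", target_uuid[$2]; next }
                      { print $1, "-> (missing, would be created)" }
')
printf '%s\n' "$remap"
```

The first source device maps to an existing target device because the serial numbers match, even though the UUIDs differ. The second has no counterpart and would be created.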
Basic usage¶
Running `grace data transfer --source prod --target dev --limit 20` transfers up to 20 videos from prod to dev with all their child
entities. Devices are resolved by `device_no`, collectors by name.
Transfer with file copy¶
Adding `--copy-files` also copies all GCS blobs referenced in `storage_meta` (videos) and
`result_ref`/`debug_ref` (video steps), and rewrites the URIs to point at the
derived target bucket.
GCS prerequisites
`--copy-files` requires:
- The `gcs` optional dependency: `pip install 'grace-cli[gcs]'`
- Application Default Credentials: `gcloud auth application-default login`
The CLI validates both before proceeding.
Bucket derivation¶
By default, the target GCS bucket is derived by replacing the source environment name in the bucket name:
| Source bucket | Source env | Target env | Derived target bucket |
|---|---|---|---|
| `co-prod-data` | `prod` | `dev` | `co-dev-data` |
| `co-prod-annotations` | `prod` | `dev` | `co-dev-annotations` |
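The default derivation in the table can be reproduced with a plain string substitution. This sketch assumes the environment name appears exactly once in the bucket name; how the CLI treats other cases is not covered here, and `derive_bucket` is a hypothetical name:

```shell
# derive_bucket SOURCE_BUCKET SOURCE_ENV TARGET_ENV
derive_bucket() {
  # Replace the first occurrence of the source env name with the target env name.
  printf '%s\n' "$1" | sed "s/$2/$3/"
}

derive_bucket co-prod-data prod dev          # co-dev-data
derive_bucket co-prod-annotations prod dev   # co-dev-annotations
```

If a bucket name does not contain the source environment name, a substitution like this returns it unchanged, which is one situation where an explicit override is the safer choice.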
Override this with --target-gcs-bucket:
grace data transfer \
  --source prod --target dev \
  --limit 5 \
  --copy-files \
  --target-gcs-bucket my-custom-bucket
Workflow¶
1. Connect¶
Authenticate against both source and target environments.
2. Plan¶
The planner:
- Fetches matching videos from the source (respecting `--filter` and `--limit`)
- Extracts `device_id`, `collector_id`, `location_id`, `vendor_id` from each video's `collection_meta`
- Resolves devices by `device_no` and builds a remap table of source UUID → target UUID
- Resolves collectors by `name` and flags missing ones for creation
- Fetches and diffs locations and vendors (direct UUID match)
- Fetches child entities (`annotation_runs`, `segmentations`, `video_steps`) for all selected video IDs
- Diffs each entity type against the target
3. Review¶
Transfer plan: prod → dev
┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Entity ┃ Selected ┃ Already in target ┃ To create ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ devices (by device_no) │ 5 │ 3 │ 2 │
│ collectors (by name) │ - │ - │ 1 │
│ locations │ 3 │ 3 │ 0 │
│ vendors │ 2 │ 1 │ 1 │
│ videos │ 20 │ 4 │ 16 │
│ annotation_runs │ 30 │ 0 │ 30 │
│ segmentations │ 45 │ 0 │ 45 │
│ video_steps │ 160 │ 0 │ 160 │
└─────────────────────────┴──────────┴───────────────────┴───────────┘
GCS blobs to copy: 87
4. Execute¶
After confirmation, the executor:
- Creates missing devices in the target
- Creates missing collectors as Worker records
- Copies GCS blobs (if `--copy-files`) with multi-threaded parallelism
- Creates locations, vendors, videos (with device ID remapping + URI rewriting)
- Creates annotation_runs, segmentations, video_steps in dependency order
Dry run¶
With `--dry-run`, the CLI shows the full plan, including the GCS blob count, but writes nothing.
Verbose output¶
Add `--verbose` (or `-v`) to see every record and file that will be transferred.
This prints per-entity detail tables showing each record's action (create or
skip) along with key identifiers (video ID, step key, GCS URI, etc.).
Combine --verbose with --dry-run to inspect the full plan before executing,
or use --verbose without --dry-run to see the details right before the
confirmation prompt.
Filtering¶
grace data transfer \
  --source prod --target dev \
  --filter "collection_meta.data_source:eq:internal" \
  --filter "createdAt:gte:2025-06-01" \
  --limit 50
Use the `in` operator to select specific video IDs, for example `--filter "id:in:[uuid-1,uuid-2,uuid-3]"`.
Filters apply to the video query. Child entities are fetched for all matched videos — they cannot be filtered independently.
Existing records (idempotency)¶
By default, when a video already exists in the target environment, the video
record itself is skipped. Its child entities (annotation_runs,
segmentations, video_steps) are still diffed independently — if the video
exists but has new child records in the source, those child records will be
created.
Re-running the same transfer is always safe: everything that already exists is skipped.
Update mode (--update)¶
Add `--update` to compare existing video records field-by-field against the
source and patch those that differ.
When enabled:
- Video records that exist in the target are fetched and compared (ignoring server-managed timestamps like `created_at`, `updated_at`)
- If fields differ, the video is queued for update via `PATCH /videos/batch`
- The summary table gains a To update column
- All other entity types (`annotation_runs`, `segmentations`, `video_steps`) remain create-or-skip regardless of this flag
Why only videos?
annotation_runs are immutable by design (re-running QC creates a new
run). segmentations are always created fresh alongside annotation runs.
video_steps use upsert semantics that maintain an audit trail via
video_step_events — a raw PATCH would bypass that.
Transforms applied automatically¶
grace data transfer applies two transforms transparently:
- RemapDeviceIdTransform: rewrites `collection_meta.device_id` on video records using the device_no-based remap table.
- RewriteGcsUriTransform (when `--copy-files`): rewrites all `gs://` URIs in `storage_meta.gcs` (videos) and `result_ref`/`debug_ref` (video steps) to point at the target bucket.
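As an illustration of the URI rewriting, only the bucket component of a `gs://` URI changes while the object path is left alone. The URI below is a made-up example, and `rewrite_gcs_uri` is a hypothetical helper rather than the CLI's implementation; it assumes the URI includes an object path after the bucket:

```shell
# rewrite_gcs_uri URI TARGET_BUCKET
rewrite_gcs_uri() {
  path=${1#gs://}     # strip the scheme, leaving bucket/object/path
  object=${path#*/}   # strip the bucket, leaving object/path
  printf 'gs://%s/%s\n' "$2" "$object"
}

rewrite_gcs_uri gs://co-prod-data/videos/example.mp4 co-dev-data
```

This prints `gs://co-dev-data/videos/example.mp4`, so a record rewritten this way points at the copied blob rather than the original one.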
Entity creation order¶
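Records are created in strict dependency order to satisfy foreign key constraints. Following the Execute steps above, devices and collectors come first, then locations, vendors, and videos, and finally annotation_runs, segmentations, and video_steps.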
Partial failure¶
If a batch fails mid-transfer, the CLI stops creating further records of that entity type and reports how many were created before the failure. Already-created records are not rolled back — re-running the same transfer will skip them.
GCS blob copy failures are non-fatal: the CLI warns and continues with record creation.
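Because already-created records are skipped on a re-run, a failed transfer can simply be retried with the same arguments. The wrapper below is a sketch under two assumptions that are not confirmed here: that the CLI exits with a non-zero status when a batch fails, and that the transfer confirmation prompt accepts `y` on stdin. `retry_transfer` is a hypothetical name:

```shell
# retry_transfer MAX_ATTEMPTS ARGS...
# Re-runs the same transfer until it succeeds or the attempts run out.
retry_transfer() {
  max=$1
  shift
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    # Assumption: a failed transfer returns a non-zero exit status.
    if echo "y" | grace data transfer "$@"; then
      return 0
    fi
    echo "attempt $attempt failed" >&2
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage: retry_transfer 3 --source prod --target dev --limit 20
```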
Option reference¶
grace data copy¶
| Option | Short | Required | Description |
|---|---|---|---|
| `--source` | `-s` | Yes | Source environment name |
| `--target` | `-t` | Yes | Target environment name |
| `--entity` | | Yes | Entity type (devices, locations, vendors, videos) |
| `--filter` | `-f` | No | Filter expression (key:op:value), repeatable |
| `--limit` | `-l` | No | Max records to copy |
| `--dry-run` | | No | Show plan without executing |
| `--clear-storage-meta` | | No | Null out storage_meta on videos |
grace data transfer¶
| Option | Short | Required | Description |
|---|---|---|---|
| `--source` | `-s` | Yes | Source environment name |
| `--target` | `-t` | Yes | Target environment name |
| `--filter` | `-f` | No | Filter expression (key:op:value), repeatable |
| `--limit` | `-l` | No | Max videos to transfer |
| `--dry-run` | | No | Show plan without executing |
| `--verbose` | `-v` | No | Show detailed per-record and per-file lists |
| `--copy-files` | | No | Also copy GCS blobs to target bucket |
| `--target-gcs-bucket` | | No | Override derived target bucket name |
| `--update` | | No | Compare existing videos field-by-field and update those that differ |
Scripting patterns¶
JSON output for automation¶
Non-interactive execution¶
Preview with `--dry-run`, then run the same command with the confirmation prompt answered automatically:
# Preview first; stop here if the plan looks wrong
grace data copy --source prod --target dev --entity videos --limit 50 --dry-run
# Execute, answering the confirmation prompt with "y"
echo "y" | grace data copy --source prod --target dev --entity videos --limit 50
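In a script, it can help to wrap both steps in one function so the preview and the real run always use identical arguments. A sketch, where `copy_with_preview` is a hypothetical name that relies only on `--dry-run` and on the prompt accepting `y` on stdin, as in the commands above:

```shell
# copy_with_preview ARGS...
# Runs a dry run first, then the real copy with the same arguments.
copy_with_preview() {
  # Stop if the dry run itself fails, for example on a bad filter.
  grace data copy "$@" --dry-run || return 1
  # Answer the confirmation prompt automatically.
  echo "y" | grace data copy "$@"
}

# Usage: copy_with_preview --source prod --target dev --entity videos --limit 50
```

Note that this runs the copy immediately after the preview. For a manual review step, run the dry run on its own first.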
Transfer a specific video by ID¶
grace data transfer \
  --source prod --target dev \
  --filter "id:eq:a1b2c3d4-5678-90ab-cdef-1234567890ab" \
  --copy-files
Transfer videos from a CSV file¶
Given a CSV with a `video_id` column, extract the IDs and pass them via
an `in` filter such as `id:in:[uuid-1,uuid-2]`.
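One way to do this with standard tools, assuming a simple comma-separated file with Unix line endings and no quoted fields. The sample file and its contents are placeholders created only to make the sketch self-contained:

```shell
# Sample input; in practice the CSV already exists.
csv=$(mktemp)
cat > "$csv" <<'EOF'
video_id,label
uuid-1,first
uuid-2,second
EOF

# Locate the video_id column from the header row,
# then join the values of that column with commas.
ids=$(awk -F, '
  NR == 1 { for (i = 1; i <= NF; i++) if ($i == "video_id") col = i; next }
  col && $col != "" { printf "%s%s", sep, $col; sep = "," }
' "$csv")

filter="id:in:[$ids]"
echo "$filter"
rm -f "$csv"

# Then pass it to the CLI, for example:
#   grace data transfer --source prod --target dev --filter "$filter"
```

For the sample file this prints `id:in:[uuid-1,uuid-2]`. CSV files with quoted fields or embedded commas need a real CSV parser instead of `awk -F,`.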