Copy
Copy hooks transfer files between storage locations. This is particularly useful for moving files between different storage systems, making backups, or archiving data.
Configuration
- type: copy
  from: "connection1/path/to/source" # Required: source location
  to: "connection2/path/to/dest" # Required: destination location
  single_file: true # Optional: true/false. Force treating the source as a single file (skips listing the source location)
  on_failure: abort # Optional: abort/warn/quiet/skip
  id: my_id # Optional. Generated if not provided. Use a `log` hook with {runtime_state} to view state.

Properties
single_file
Required: No
Boolean flag to specify whether to treat the source as a single file. If true, copies as a single file. If false, uses a recursive copy for directories. If not specified, the behavior is detected automatically from the source path.

on_failure
Required: No
What to do if the copy fails (abort / warn / quiet / skip). See the sketch below.
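Putting both optional properties together, a hook that forces a single-file copy and only warns on failure could look like this minimal sketch (the connection names s3 and local and the file paths are illustrative placeholders):

- type: copy
  from: "s3/reports/latest.csv"   # hypothetical source file
  to: "local//backup/latest.csv"
  single_file: true               # skip listing; treat the source as one file
  on_failure: warn                # log a warning and continue instead of aborting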
Output
When the copy hook executes successfully, it returns the following output, which can be accessed in subsequent hooks:
status: success # Status of the hook execution
from_uri: "s3://bucket/path/to/file" # The normalized URI of the source file
from_path: "/path/to/source/file" # The source path
to_uri: "gcs://bucket/path/to/file" # The normalized URI of the destination file
to_path: "/path/to/dest/file" # The destination path
bytes_written: 1024 # Number of bytes written

You can access these values in subsequent hooks using the following syntax (jmespath):
{state.hook_id.status} - Status of the hook execution
{state.hook_id.from_uri} - The normalized URI of the source file
{state.hook_id.from_path} - The source path
{state.hook_id.to_uri} - The normalized URI of the destination file
{state.hook_id.to_path} - The destination path
{state.hook_id.bytes_written} - Number of bytes written
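For example, assuming a copy hook declared with id: backup_copy, a subsequent log hook could report how many bytes were transferred. This is a sketch only; the connection names and paths are placeholders, and it assumes the log hook accepts a message property:

hooks:
  post:
    - type: copy
      id: backup_copy
      from: "s3/data/file.json"
      to: "local//backup/file.json"

    - type: log
      # hypothetical message; assumes the log hook takes a `message` property
      message: "Copied {state.backup_copy.bytes_written} bytes to {state.backup_copy.to_path}"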
Examples
Archive Files Between Cloud Storage
Archive files between different cloud storage providers:
hooks:
  post:
    - type: copy
      if: run.status == "success"
      from: "aws_s3/{run.object.file_path}"
      to: "gcs/archives/{target.name}/{timestamp.YYYY}/{timestamp.MM}/{run.object.file_path}"
      on_failure: warn

Upload A Local DuckDB Database into S3
Copy a local database file into Amazon S3 after writing to it:
hooks:
  end:
    - type: copy
      from: "{target.instance}"
      to: "aws_s3/duckdb/{env.FOLDER}/backup.db"
      on_failure: warn

Copy Multiple Files Pattern
Copy multiple files matching a pattern:
hooks:
  post:
    - type: copy
      from: "local//tmp/exports/*.parquet"
      to: "gcs/data-lake/raw/{run.stream.name}/"
      on_failure: abort

Force Single File Copy
Force treating the source as a single file, even if it could be interpreted as a pattern:
hooks:
  post:
    - type: copy
      from: "s3/data/file.json"
      to: "local//backup/file.json"
      single_file: true
      on_failure: abort

Force Directory Copy
Force recursive directory copying:
hooks:
  post:
    - type: copy
      from: "local//data/exports"
      to: "s3/backup/exports"
      single_file: false
      on_failure: warn