Copy

Copy hooks allow you to transfer files between storage locations. This is particularly useful for moving files between different storage systems, making backups, or archiving data.

Configuration

- type: copy
  from: "connection1/path/to/source"   # Required: Source Location
  to: "connection2/path/to/dest"       # Required: Destination Location
  on_failure: abort       # Optional: abort/warn/quiet/skip
  id: my_id      # Optional: generated automatically if omitted. Use a `log` hook with {runtime_state} to view state.

Properties

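| Property | Required | Description |
| --- | --- | --- |
| from | Yes | Source location |
| to | Yes | Destination location |
| on_failure | No | What to do if the copy fails (abort/warn/quiet/skip) |
| id | No | Identifier for the hook; generated if omitted |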

Output

When the copy hook executes successfully, it returns the following output that can be accessed in subsequent hooks:

status: success  # Status of the hook execution
from_uri: "s3://bucket/path/to/file"  # The normalized URI of the source file
from_path: "/path/to/source/file"  # The source path
to_uri: "gcs://bucket/path/to/file"  # The normalized URI of the destination file
to_path: "/path/to/dest/file"  # The destination path
bytes_written: 1024  # Number of bytes written

You can access these values in subsequent hooks using the following syntax (JMESPath), where `hook_id` is the `id` of the copy hook:

  • {state.hook_id.status} - Status of the hook execution

  • {state.hook_id.from_uri} - The normalized URI of the source file

  • {state.hook_id.from_path} - The source path

  • {state.hook_id.to_uri} - The normalized URI of the destination file

  • {state.hook_id.to_path} - The destination path

  • {state.hook_id.bytes_written} - Number of bytes written
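As a minimal sketch of how these values might be consumed, the fragment below gives the copy hook an explicit `id` and reads its output in a following `log` hook. The id `archive_copy` is a hypothetical name chosen for illustration, and the `message` field on the `log` hook is an assumption, since this page does not document the `log` hook's properties.

```yaml
hooks:
  post:
    - type: copy
      id: archive_copy                     # hypothetical id, referenced below
      from: "aws_s3/{run.object.file_path}"
      to: "gcs/archives/{run.object.file_path}"
      on_failure: warn

    - type: log
      # `message` is assumed to be the log hook's text field
      message: "Copied {state.archive_copy.bytes_written} bytes to {state.archive_copy.to_uri}"
```

Setting the `id` explicitly keeps the `{state...}` references stable, rather than depending on a generated identifier.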

Examples

Archive Files Between Cloud Storage

Archive files between different cloud storage providers:

hooks:
  post:
    - type: copy
      if: run.status == "success"
      from: "aws_s3/{run.object.file_path}"
      to: "gcs/archives/{target.name}/{timestamp.YYYY}/{timestamp.MM}/{run.object.file_path}"
      on_failure: warn

Upload a Local DuckDB Database to S3

Copy a local database file into Amazon S3 after writing to it:

hooks:
  end:
    - type: copy
      from: "{target.instance}"
      to: "aws_s3/duckdb/{env.FOLDER}/backup.db"
      on_failure: warn

Copy Multiple Files Matching a Pattern

Copy multiple files matching a pattern:

hooks:
  post:
    - type: copy
      from: "local//tmp/exports/*.parquet"
      to: "gcs/data-lake/raw/{run.stream.name}/"
      on_failure: abort
