Copy

Copy hooks allow you to transfer files between storage locations. This is particularly useful for moving files between different storage systems, making backups, or archiving data.

Configuration

- type: copy
  from: "connection1/path/to/source"   # Required: Source Location
  to: "connection2/path/to/dest"       # Required: Destination Location
  single_file: true       # Optional: true/false. Force treating the source as a single file (skips listing the source location)
  on_failure: abort       # Optional: abort/warn/quiet/skip
  id: my_id      # Optional: generated if omitted. Use a `log` hook with {runtime_state} to view state.
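
To find a generated id, or to confirm a custom one, the runtime state can be printed with a `log` hook, as the comment on `id` suggests. The following is a minimal sketch, assuming the `log` hook accepts a `message` key; the id `archive_copy` and both locations are hypothetical and chosen only for illustration.

```yaml
hooks:
  post:
    - type: copy
      id: archive_copy                 # hypothetical id, chosen for illustration
      from: "aws_s3/{run.object.file_path}"
      to: "gcs/archives/{run.object.file_path}"

    - type: log
      message: "{runtime_state}"       # assumed key; prints the full runtime state
```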

Properties

| Property | Required | Description |
| --- | --- | --- |
| from | Yes | The source location string. Contains the connection name and path. |
| to | Yes | The destination location string. Contains the connection name and path. |
| single_file | No | Boolean flag that specifies whether to treat the source as a single file. If true, copies as a single file. If false, uses recursive copy for directories. If not specified, the mode is detected automatically from the source path. |
| on_failure | No | What to do if the copy fails (abort/warn/quiet/skip). |

Output

When the copy hook executes successfully, it returns the following output that can be accessed in subsequent hooks:

status: success  # Status of the hook execution
from_uri: "s3://bucket/path/to/file"  # The normalized URI of the source file
from_path: "/path/to/source/file"  # The source path
to_uri: "gcs://bucket/path/to/file"  # The normalized URI of the destination file
to_path: "/path/to/dest/file"  # The destination path
bytes_written: 1024  # Number of bytes written

You can access these values in subsequent hooks using the following JMESPath syntax, where hook_id is the id of the copy hook:

  • {state.hook_id.status} - Status of the hook execution

  • {state.hook_id.from_uri} - The normalized URI of the source file

  • {state.hook_id.from_path} - The source path

  • {state.hook_id.to_uri} - The normalized URI of the destination file

  • {state.hook_id.to_path} - The destination path

  • {state.hook_id.bytes_written} - Number of bytes written
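
As an illustration of the syntax above, a later hook can reference the copy hook's output through its id. This is a sketch under stated assumptions rather than a definitive configuration: the id `copy_export` and the file paths are hypothetical, and the `log` hook's `message` key is assumed.

```yaml
hooks:
  post:
    - type: copy
      id: copy_export                  # hypothetical id, referenced by the next hook
      from: "local//tmp/exports/data.parquet"
      to: "gcs/data-lake/raw/data.parquet"
      single_file: true
      on_failure: warn

    - type: log
      # Reads the copy hook's output via {state.<id>.<field>}
      message: "Copied {state.copy_export.bytes_written} bytes to {state.copy_export.to_uri}"
```

Setting an explicit id keeps the reference stable instead of depending on a generated value.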

Examples

Archive Files Between Cloud Storage

Archive files between different cloud storage providers:

hooks:
  post:
    - type: copy
      if: run.status == "success"
      from: "aws_s3/{run.object.file_path}"
      to: "gcs/archives/{target.name}/{timestamp.YYYY}/{timestamp.MM}/{run.object.file_path}"
      on_failure: warn

Upload A Local DuckDB Database into S3

Copy a local database file into Amazon S3 after writing to it:

hooks:
  end:
    - type: copy
      from: "{target.instance}"
      to: "aws_s3/duckdb/{env.FOLDER}/backup.db"
      on_failure: warn

Copy Multiple Files Matching a Pattern

Copy multiple files matching a pattern:

hooks:
  post:
    - type: copy
      from: "local//tmp/exports/*.parquet"
      to: "gcs/data-lake/raw/{run.stream.name}/"
      on_failure: abort

Force Single File Copy

Force treating the source as a single file, even if it could be interpreted as a pattern:

hooks:
  post:
    - type: copy
      from: "s3/data/file.json"
      to: "local//backup/file.json"
      single_file: true
      on_failure: abort

Force Directory Copy

Force recursive directory copying:

hooks:
  post:
    - type: copy
      from: "local//data/exports"
      to: "s3/backup/exports"
      single_file: false
      on_failure: warn
