Command

Command hooks allow you to execute system commands or scripts as part of your replication workflow. This is particularly useful for running data processing scripts, triggering external processes, or performing system-level operations.

Configuration

- type: command
  command: ["executable", "arg1", "arg2"]  # Required: Command and arguments as array
  print: false      # Optional: Print command output to console (default: false)
  capture: false    # Optional: Capture command output in hook result (default: false)
  env:              # Optional: Environment variables for the command
    ENV_VAR1: "value1"
    ENV_VAR2: "value2"
  on_failure: abort # Optional: abort/warn/quiet/skip
  id: my_id         # Optional: Generated if omitted. Use a `log` hook with {runtime_state} to view state.
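
As a rough sketch of the tip in the `id` comment above, a `log` hook placed after the command hook can print the runtime state. The `list_files` id and the `ls` command are hypothetical, and the `message` field of the `log` hook is an assumption made for illustration; check the `log` hook documentation for the exact fields.

```yaml
hooks:
  post:
    - type: command
      id: list_files                 # hypothetical id, chosen for this example
      command: ["ls", "-l", "/tmp"]
      capture: true

    - type: log
      message: "{runtime_state}"     # assumed field name for the log hook
```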

Properties

| Property | Required | Description |
|---|---|---|
| command | Yes | Array containing the command and its arguments |
| print | No | Whether to print command output to console (default: false) |
| capture | No | Whether to capture command output in hook result (default: false) |
| env | No | Map of environment variables to set for the command |
| on_failure | No | What to do if the command fails (abort/warn/quiet/skip) |

Output

When the command hook executes successfully, it returns the following output that can be accessed in subsequent hooks:

status: success  # Status of the hook execution
binary: "/path/to/executable"  # The binary that was executed
arguments: ["arg1", "arg2"]  # The arguments passed to the command
start: "2024-01-01T00:00:00Z"  # Command start time
end: "2024-01-01T00:00:01Z"  # Command end time
output:  # Only present if capture: true
  stdout: "Standard output text"
  stderr: "Standard error text"
  combined: "Combined output text"

You can access these values in subsequent hooks using the following JMESPath syntax, where hook_id is the id of the command hook:

  • {state.hook_id.status} - Status of the hook execution

  • {state.hook_id.binary} - The binary that was executed

  • {state.hook_id.arguments} - The arguments passed to the command

  • {state.hook_id.start} - Command start time

  • {state.hook_id.end} - Command end time

  • {state.hook_id.output.stdout} - Standard output (if capture: true)

  • {state.hook_id.output.stderr} - Standard error (if capture: true)

  • {state.hook_id.output.combined} - Combined output (if capture: true)
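
The references above might be combined as follows. This is a sketch under stated assumptions, not a definitive configuration: the `row_count` id, the `wc` command, and the file path are hypothetical, and the `message` field of the `log` hook is assumed.

```yaml
hooks:
  post:
    - type: command
      id: row_count                               # hypothetical id, referenced below
      command: ["wc", "-l", "/tmp/export.csv"]    # hypothetical file
      capture: true                               # needed for output.* to be present

    - type: log
      # assumed field name for the log hook
      message: "wc finished with status {state.row_count.status}: {state.row_count.output.stdout}"
```

Since the output block is only present when capture is true, a reference to {state.row_count.output.stdout} from a hook that does not capture would have nothing to resolve.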

Examples

Run Data Processing Script

Execute a Python script to process data before replication:

hooks:
  pre:
    - type: command
      command: ["python", "scripts/process_data.py", "--stream", "{run.stream.name}"]
      env:
        PYTHONPATH: "/path/to/libs"
        DATA_DIR: "{env.data_directory}"
      print: true
      on_failure: abort

System Cleanup

Clean up temporary files after processing:

hooks:
  post:
    - type: command
      command: ["rm", "-rf", "/tmp/processed/{run.stream.name}"]  # Removes the whole directory; a "*" glob would need a shell to expand it
      on_failure: warn

Run Data Quality Checks

Execute a data quality checking script and capture its output:

hooks:
  post:
    - type: command
      command: [
        "python",
        "scripts/quality_check.py",
        "--table", "{run.object.full_name}",
        "--date", "{timestamp.date}"
      ]
      capture: true
      env:
        DB_CONNECTION: "{target.connection_string}"
      on_failure: warn

Conditional Command Execution

Run a command only when a condition holds, such as an environment variable having a given value:

hooks:
  post:
    - type: command
      if: env.PRODUCTION == "true"
      command: ["notify-admin", "--stream", "{run.stream.name}", "--status", "{run.status}"]
      print: true
      env:
        NOTIFY_TOKEN: "{env.notification_token}"

Run Shell Script

Execute a shell script with parameters:

hooks:
  pre:
    - type: command
      command: [
        "bash",
        "scripts/prepare_environment.sh",
        "{target.environment}",
        "{run.stream.name}"
      ]
      print: true
      capture: true
      on_failure: abort

Generate Reports

Run a report generation tool after successful replication:

hooks:
  post:
    - type: command
      if: run.status == "success"
      command: [
        "report-generator",
        "--input", "{run.object.full_name}",
        "--output", "reports/{run.stream.name}_{timestamp.date}.pdf"
      ]
      env:
        REPORT_TEMPLATE: "templates/standard.tpl"
        OUTPUT_DIR: "/var/reports"
