Command
Command hooks allow you to execute system commands or scripts as part of your replication workflow. This is particularly useful for running data processing scripts, triggering external processes, or performing system-level operations.
Configuration
- type: command
  command: ["executable", "arg1", "arg2"]  # Required: Command and arguments as array or string
  print: true      # Optional: Print command output to console (default: true)
  capture: true    # Optional: Capture command output in hook result (default: false)
  timeout: 300     # Optional: Command timeout in seconds (default: no timeout)
  env:             # Optional: Environment variables for the command
    ENV_VAR1: "value1"
    ENV_VAR2: "value2"
  on_failure: abort  # Optional: abort/warn/quiet/skip
  id: my_id          # Optional. Will be generated. Use `log` hook with {runtime_state} to view state.
Properties
| Property | Required | Description |
| --- | --- | --- |
| command | Yes | String or array containing the command and its arguments |
| print | No | Whether to print command output to console (default: false) |
| capture | No | Whether to capture command output in hook result (default: false) |
| timeout | No | Command timeout in seconds. If 0 or not specified, no timeout is applied |
| env | No | Map of environment variables to set for the command |
| on_failure | No | What to do if the command fails (abort/warn/quiet/skip) |
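Since `command` accepts either a string or an array, the same invocation can be written both ways. The following is a minimal sketch; the script path and the `--mode full` flag are hypothetical and only illustrate the two forms. The array form keeps each argument as its own element, which is usually easier to get right when a value contains spaces or quotes.

```yaml
hooks:
  pre:
    # String form: command and arguments in a single string
    - type: command
      command: python scripts/process_data.py --mode full   # hypothetical script and flag

    # Array form: executable first, then one element per argument
    - type: command
      command: ["python", "scripts/process_data.py", "--mode", "full"]
```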
Output
When the command hook executes successfully, it returns the following output that can be accessed in subsequent hooks:
status: success                   # Status of the hook execution
binary: "/path/to/executable"     # The binary that was executed
arguments: ["arg1", "arg2"]       # The arguments passed to the command
start: "2024-01-01T00:00:00Z"     # Command start time
end: "2024-01-01T00:00:01Z"       # Command end time
timeout: 300                      # Only present if timeout was specified
output:                           # Only present if capture: true
  stdout: "Standard output text"
  stderr: "Standard error text"
  combined: "Combined output text"
You can access these values in subsequent hooks using the following syntax (JMESPath):
- `{state.hook_id.status}` - Status of the hook execution
- `{state.hook_id.binary}` - The binary that was executed
- `{state.hook_id.arguments}` - The arguments passed to the command
- `{state.hook_id.start}` - Command start time
- `{state.hook_id.end}` - Command end time
- `{state.hook_id.timeout}` - Timeout value (if specified)
- `{state.hook_id.output.stdout}` - Standard output (if capture: true)
- `{state.hook_id.output.stderr}` - Standard error (if capture: true)
- `{state.hook_id.output.combined}` - Combined output (if capture: true)
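As an illustration of these references, the sketch below gives a command hook an explicit `id`, enables `capture`, and reads the result in a later hook. It assumes that the `log` hook accepts a `message` property; the `row_count` id and the script path are hypothetical names chosen for this example.

```yaml
hooks:
  post:
    - type: command
      id: row_count                                  # hypothetical id, referenced below
      command: ["python", "scripts/count_rows.py"]   # hypothetical script
      capture: true                                  # required for output.* to be present

    - type: log                                      # assumes a `message` property on the log hook
      message: "row_count finished with status {state.row_count.status}: {state.row_count.output.stdout}"
```

Without `capture: true` on the first hook, only the non-output fields such as `status`, `start`, and `end` would be available to the second hook.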
Examples
Run Data Processing Script
Execute a Python script to process data before replication:
hooks:
  pre:
    - type: command
      command: python scripts/process_data.py --stream "{run.stream.name}"
      timeout: 600  # 10 minute timeout
      env:
        PYTHONPATH: "/path/to/libs"
        DATA_DIR: "{env.data_directory}"
      print: true
      on_failure: abort
System Cleanup
Clean up temporary files after processing:
hooks:
  post:
    - type: command
      command: ["rm", "-rf", "/tmp/processed/{run.stream.name}/*"]
      on_failure: warn
Run Data Quality Checks
Execute a data quality checking script and capture its output:
hooks:
  post:
    - type: command
      command: [
          "python",
          "scripts/quality_check.py",
          "--table", "{run.object.full_name}",
          "--date", "{timestamp.date}"
        ]
      timeout: 1800  # 30 minute timeout
      capture: true
      env:
        DB_CONNECTION: "{target.connection_string}"
      on_failure: warn
Conditional Command Execution
Run commands based on environment or conditions:
hooks:
  post:
    - type: command
      if: env.PRODUCTION == "true"
      command: notify-admin --stream "{run.stream.name}" --status {run.status}
      print: true
      env:
        NOTIFY_TOKEN: "{env.notification_token}"
Run Shell Script
Execute a shell script with parameters:
hooks:
  pre:
    - type: command
      command: [
          "bash",
          "scripts/prepare_environment.sh",
          "{target.environment}",
          "{run.stream.name}"
        ]
      print: true
      capture: true
      on_failure: abort
Generate Reports
Run a report generation tool after successful replication:
hooks:
  post:
    - type: command
      if: run.status == "success"
      command: [
          "report-generator",
          "--input", "{run.object.full_name}",
          "--output", "reports/{run.stream.name}_{timestamp.date}.pdf"
        ]
      timeout: 900  # 15 minute timeout
      env:
        REPORT_TEMPLATE: "templates/standard.tpl"
        OUTPUT_DIR: "/var/reports"