Column Metrics & Validation

Column-level statistics, validation rules, and value monitoring

Column-level monitoring lets you collect detailed statistics and apply validation rules to individual columns. These metrics feed into anomaly detection over time, alerting you when column values deviate from historical patterns.

Column Statistics

Key             Type  Description
count           bool  Total non-null and null value counts
null_count      bool  Number of null values
count_distinct  bool  Number of unique values (cardinality)
unique_count    bool  Unique value count
size            bool  Total size in bytes of column values
min_max_mean    bool  Minimum, maximum, and mean values for numeric columns
min_max_len     bool  Minimum and maximum string length for text columns
percentile      bool  Percentile statistics (p50, p90, p95, p99) and standard deviation

Enable a statistic for a column by setting its key to true under that column's name:

objects:
  public.orders:
    columns:
      revenue:
        count: true
        min_max_mean: true
        percentile: true

      customer_id:
        count_distinct: true
        null_count: true

      description:
        min_max_len: true

Column Validation

Validation rules check column values against defined patterns or value lists. Violations are reported as anomaly events.

Regex Patterns

Use regex_match to define patterns that values should match, and regex_not_match for patterns that values should not match:
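
As a minimal sketch, the two rules might be declared next to the statistics keys under a column. The placement under columns, the single-string pattern form, and the public.users table with its email column are assumptions for illustration, not confirmed syntax:

```yaml
objects:
  public.users:              # hypothetical table
    columns:
      email:                 # hypothetical column
        # Values are expected to look like an email address
        regex_match: '^[^@\s]+@[^@\s]+\.[^@\s]+$'
        # Values ending in a placeholder domain count as violations
        regex_not_match: '@example\.com$'
```

Single-quoted YAML strings keep backslashes literal, so sequences such as \s and \. reach the regex engine unchanged.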

Value Lists

Use accepted_values to define valid values (anything else is a violation), and rejected_values to define values that should not appear:
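
A sketch of both lists, assuming they sit under a column like the statistics keys and take a YAML sequence of strings. The status and payment_method columns and their values are hypothetical:

```yaml
objects:
  public.orders:
    columns:
      status:                # hypothetical column
        # Any value outside this list is reported as a violation
        accepted_values: ["pending", "shipped", "delivered", "cancelled"]
      payment_method:        # hypothetical column
        # These values should never appear in the column
        rejected_values: ["unknown", "test"]
```

Quoting each entry keeps YAML from reading values such as null or true as non-string types.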

Validation results include match counts, violation counts, and a valid boolean. These are tracked over time and can trigger anomaly alerts when violation rates change.

Column Wildcard

Use "*" as a column name in defaults to apply metrics to all columns across all monitored objects:
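
A sketch of the wildcard, assuming defaults is a top-level key with a columns map that mirrors the per-object layout (that nesting is not confirmed here):

```yaml
defaults:
  columns:
    "*":                     # applies to every column of every monitored object
      count: true
      null_count: true
```

The quotes are required: an unquoted * starts an alias in YAML, so the bare character would not parse as a key.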

Complete Example
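
The following sketch combines the features described on this page in one configuration. It is an illustration under assumptions, not a reference configuration: the defaults layout, the placement of the validation keys under a column, and the status column with its values are hypothetical.

```yaml
defaults:
  columns:
    "*":                     # baseline metrics for all columns (assumed layout)
      count: true
      null_count: true

objects:
  public.orders:
    columns:
      revenue:
        min_max_mean: true   # numeric summary
        percentile: true     # p50, p90, p95, p99 and standard deviation
      customer_id:
        count_distinct: true
      description:
        min_max_len: true    # string length bounds
      status:                # hypothetical column
        accepted_values: ["pending", "shipped", "delivered", "cancelled"]
        # Violation if the value contains whitespace
        regex_not_match: '\s'
```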
