MAD Detector (Median Absolute Deviation)

The MAD (Median Absolute Deviation) detector is a robust statistical method for anomaly detection that uses median-based statistics instead of mean-based approaches.

Overview

MAD is particularly effective for:

Data with outliers - More robust than standard deviation methods
Skewed distributions - Works well with non-normal data
Time-series with seasonality - Supports seasonality grouping
General-purpose detection - Good default choice for most metrics

Algorithm

The MAD detector works in five steps:

1. Calculate the median of the historical window values.
2. Calculate the MAD (the median of absolute deviations from that median).
3. Scale the MAD to σ-equivalents: sigma_est = 1.4826 × MAD (1.4826 is the normal-consistency constant).
4. Build the confidence interval: [median - threshold × 1.4826 × MAD, median + threshold × 1.4826 × MAD].
5. Flag a value as anomalous when it falls outside the interval.
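
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the detector's actual implementation: the function names `mad_bounds` and `is_anomaly` are hypothetical, and the sketch ignores seasonality, weighting, and detrending.

```python
from statistics import median

NORMAL_CONSISTENCY = 1.4826  # scales MAD to a sigma estimate under normality


def mad_bounds(window, threshold=3.0):
    """Return (lower, upper) bounds built from a window of historical values."""
    center = median(window)
    mad = median(abs(x - center) for x in window)
    half_width = threshold * NORMAL_CONSISTENCY * mad
    return center - half_width, center + half_width


def is_anomaly(value, window, threshold=3.0):
    """Flag a value that falls outside the interval built from the window."""
    lower, upper = mad_bounds(window, threshold)
    return value < lower or value > upper


# Hypothetical history with one large outlier (100) already in the window.
history = [10, 11, 9, 10, 12, 10, 11, 9, 10, 100]
print(mad_bounds(history))
print(is_anomaly(10.5, history), is_anomaly(25, history))
```

The outlier in `history` barely moves the median or the MAD, so the interval stays tight around 10. A mean and standard deviation computed on the same window would be pulled strongly toward 100.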

Thanks to the 1.4826 scaling, threshold is expressed in σ-equivalents exactly like Z-Score: threshold=3.0 corresponds to 3-sigma on Gaussian noise (~0.27% false positives). Without scaling, 3×MAD would only be ≈2σ and fire on ~4% of normal points.
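
The false-positive figures quoted above follow from the two-sided tail of a standard normal distribution. A quick check, assuming purely Gaussian noise (the helper `two_sided_tail` is illustrative and not part of the detector):

```python
import math


def two_sided_tail(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2))


scaled = two_sided_tail(3.0)             # threshold=3.0 with the 1.4826 scaling
unscaled = two_sided_tail(3.0 / 1.4826)  # 3 x raw MAD is only about 2.02 sigma

print(f"with scaling:    {scaled:.2%}")
print(f"without scaling: {unscaled:.2%}")
```

The first value comes out at roughly 0.27% and the second at a little over 4%, matching the numbers in the text.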

With Seasonality Grouping

When seasonality is configured:

1. Compute global statistics (entire window).
2. For each seasonality group:
   - Compute group statistics (matching seasonality values only).
   - Calculate multipliers (group_stat / global_stat).
3. Apply the multipliers to adjust the confidence intervals.
4. Detect anomalies using the adjusted intervals.

This creates adaptive intervals that vary per seasonality pattern (e.g., different intervals for each hour of day).
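
A rough sketch of the multiplier idea for a single seasonality component follows. It rests on several assumptions that this page does not spell out: the adjusted statistic is taken to be the global statistic times its multiplier, a group with too few samples is given a multiplier of 1.0, and zero global statistics are skipped to avoid division by zero. The function `seasonal_bounds` and the sample data are hypothetical.

```python
from statistics import median

K = 1.4826  # normal-consistency constant


def mad(values, center):
    """Median of absolute deviations from the given center."""
    return median(abs(v - center) for v in values)


def seasonal_bounds(values, keys, current_key, threshold=3.0,
                    min_samples_per_group=10):
    """Bounds for a point whose seasonality key is `current_key`."""
    g_med = median(values)
    g_mad = mad(values, g_med)

    group = [v for v, k in zip(values, keys) if k == current_key]
    med_mult = mad_mult = 1.0  # assumed fallback: use global statistics
    if len(group) >= min_samples_per_group and g_med != 0 and g_mad != 0:
        grp_med = median(group)
        med_mult = grp_med / g_med
        mad_mult = mad(group, grp_med) / g_mad

    adj_med = g_med * med_mult
    adj_mad = g_mad * mad_mult
    half = threshold * K * adj_mad
    return adj_med - half, adj_med + half


# Hypothetical metric: low at night, high during the day.
night = [10, 11, 9, 10, 12] * 2
day = [100, 104, 96, 100, 108] * 2
values = night + day
keys = ["night"] * len(night) + ["day"] * len(day)

print(seasonal_bounds(values, keys, "night"))                            # seasonal band
print(seasonal_bounds(values, keys, "night", min_samples_per_group=50))  # global fallback
```

With the seasonal band, a night-time value of 30 falls outside the interval. With the global fallback, the same value sits well inside the much wider band.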

Parameters

Algorithm Parameters

`threshold` (float, default: 3.0)

Number of σ-equivalents from median to consider anomalous. MAD is multiplied by the normal-consistency constant 1.4826, so the interval is median ± threshold × 1.4826 × MAD.

Higher values (e.g., 5.0) = less sensitive, fewer anomalies
Lower values (e.g., 2.0) = more sensitive, more anomalies
The default 3.0 corresponds to 3-sigma on Gaussian noise (~0.27% false positives)
Typical range: 2.0 - 5.0

Example:

detectors:
  - type: mad
    params:
      threshold: 3.0  # Standard sensitivity

`window_size` (int, default: 100)

Number of historical points to use for computing statistics.

Larger windows (e.g., 1000) = more stable, less responsive to changes
Smaller windows (e.g., 50) = more responsive, less stable
Recommended: at least 2-4 weeks of data for daily seasonality, for example:
- For 10-minute intervals: window_size = 8640 (60 days)
- For hourly data: window_size = 672 (4 weeks)
- For daily data: window_size = 60 (2 months)
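
The window sizes above are simply the covered time span divided by the metric interval. A small helper (hypothetical, for sizing only) reproduces them:

```python
from datetime import timedelta


def points_in(span: timedelta, interval: timedelta) -> int:
    """Number of points a trailing window of `span` holds at a fixed `interval`."""
    return span // interval


print(points_in(timedelta(days=60), timedelta(minutes=10)))  # 8640
print(points_in(timedelta(weeks=4), timedelta(hours=1)))     # 672
print(points_in(timedelta(days=60), timedelta(days=1)))      # 60
```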

Example:

detectors:
  - type: mad
    params:
      window_size: 8640  # 60 days of 10-min intervals

`min_samples` (int, default: 30, minimum: 1)

Minimum valid samples required before detection starts.

Ensures statistical reliability
Points before this threshold are marked as “insufficient_data”
Should be significantly smaller than window_size
Typical: 10-30% of window_size
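
The interplay between window_size and min_samples can be illustrated with a simplified per-point loop. This is a sketch under assumptions: every incoming value is treated as valid, and the status names "ok" and "anomaly" are invented here for illustration (only "insufficient_data" comes from this page).

```python
from collections import deque
from statistics import median


def score_stream(values, window_size=100, min_samples=30, threshold=3.0):
    """Score each value against the trailing window that precedes it."""
    window = deque(maxlen=window_size)  # oldest points drop out automatically
    statuses = []
    for v in values:
        if len(window) < min_samples:
            statuses.append("insufficient_data")
        else:
            center = median(window)
            mad = median(abs(x - center) for x in window)
            half_width = threshold * 1.4826 * mad
            statuses.append("anomaly" if abs(v - center) > half_width else "ok")
        window.append(v)
    return statuses


stream = [10, 11, 9, 10, 12] * 2 + [50]
statuses = score_stream(stream, window_size=10, min_samples=5)
print(statuses)
```

The first five points are reported as insufficient_data, the next five are scored normally, and the final value of 50 is flagged.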

Example:

detectors:
  - type: mad
    params:
      min_samples: 1000  # Wait for 1000 valid samples

`seasonality_components` (list, optional)

List of seasonality groups to apply adaptive intervals.

Format: a list whose entries are either a column name (one grouping per column) or a list of column names (one combined grouping)

Names must match the metric’s seasonality features: the built-in seasonality_columns names (hour, day_of_week, day_of_month, month, is_weekend, is_holiday) or custom columns returned by the query and declared in query_columns.seasonality.

Examples:

Single seasonality component:

detectors:
  - type: mad
    params:
      seasonality_components:
        - "hour"  # Different intervals per hour

Multiple separate components:

detectors:
  - type: mad
    params:
      seasonality_components:
        - "day_of_week"   # Different intervals per weekday
        - "hour"          # Different intervals per hour

Combined component (interaction):

detectors:
  - type: mad
    params:
      seasonality_components:
        - ["hour", "day_of_week"]  # Different per hour+weekday combo
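
To make the difference between separate and combined components concrete, the sketch below derives grouping keys for a single data point. The helpers `normalize_components` and `group_keys` are hypothetical and only illustrate the configuration format, not how the detector stores its groups internally.

```python
def normalize_components(components):
    """Turn every entry into a tuple of column names."""
    return [(c,) if isinstance(c, str) else tuple(c) for c in components]


def group_keys(point, components):
    """Return one grouping key per configured component for one data point."""
    return [tuple(point[col] for col in cols)
            for cols in normalize_components(components)]


point = {"hour": 14, "day_of_week": 2}

# Two separate components: one key per component (24 + 7 possible groups).
print(group_keys(point, ["day_of_week", "hour"]))    # [(2,), (14,)]

# One combined component: a single joint key (24 x 7 = 168 possible groups).
print(group_keys(point, [["hour", "day_of_week"]]))  # [(14, 2)]
```

Combined components split the window into many more groups, so each group receives fewer samples.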

`min_samples_per_group` (int, default: 10)

Minimum samples required in each seasonality group for applying multipliers. Groups below this threshold fall back to global statistics.

Advanced: size the window to fill a group. A group’s multiplier engages only when the trailing window holds at least min_samples_per_group points sharing the current point’s key. Same-key points recur once every distinct_keys points (the cardinality of the key), so you need window_size ≳ min_samples_per_group × distinct_keys. For hourly data grouped by hour (24 keys), the default min_samples_per_group = 10 needs window_size ≳ 240. With the default window_size = 100, no group ever fills and the seasonality silently has no effect: every point uses the global band. The detector logs a one-time warning in this case. To fix it, raise window_size, lower min_samples_per_group, or use a coarser grouping.
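
The sizing rule can be checked with simple arithmetic. The helpers below are illustrative only and assume the keys are spread evenly across the window:

```python
def min_window_for_groups(min_samples_per_group: int, distinct_keys: int) -> int:
    """Smallest window in which every group can reach min_samples_per_group."""
    return min_samples_per_group * distinct_keys


def groups_can_fill(window_size: int, min_samples_per_group: int,
                    distinct_keys: int) -> bool:
    """True when the window is large enough for groups to fill."""
    return window_size >= min_window_for_groups(min_samples_per_group, distinct_keys)


print(min_window_for_groups(10, 24))       # 240: hourly data grouped by hour
print(groups_can_fill(100, 10, 24))        # False: default window_size is too small
print(groups_can_fill(672, 10, 24))        # True: 4 weeks of hourly data
print(min_window_for_groups(10, 144 * 3))  # 4320: 10-minute slot x 3 league days
```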

Example:

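detectors:
  - type: mad
    params:
      window_size: 8640  # 432 groups × 15 samples = 6480 points needed; the default 100 would never fill a group
      seasonality_components:
        - ["offset_10minutes", "league_day"]  # custom columns from query_columns.seasonality
      min_samples_per_group: 15  # Need 15 samples per group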

Shared Parameters (Preprocessing, Weighting, Detrending)

input_type, smoothing, window_weights / half_life, and detrend behave identically across MAD, Z-Score and IQR. See Shared Detector Parameters for the full reference, defaults, and tuning recipes.

Execution Parameters

start_time and batch_size control how detection runs without affecting results (they are not part of the detector ID). See Shared Detector Parameters → Execution Parameters.

Detector Identity

All result-affecting parameters (everything except start_time and batch_size) are hashed into the detector_id. See Shared Detector Parameters → Detector Identity and Recomputation.
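
The idea of a parameter-derived identity can be sketched as follows. The hashing scheme shown (SHA-256 over canonical JSON, truncated to 16 characters) is purely an assumption for illustration. The real scheme is described in Shared Detector Parameters and may differ.

```python
import hashlib
import json

# Parameters that control execution only and do not affect results.
EXECUTION_ONLY = {"start_time", "batch_size"}


def detector_id(detector_type, params):
    """Hypothetical ID: hash of the detector type and result-affecting params."""
    relevant = {k: v for k, v in params.items() if k not in EXECUTION_ONLY}
    payload = json.dumps({"type": detector_type, "params": relevant}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


a = detector_id("mad", {"threshold": 3.0, "window_size": 100, "batch_size": 500})
b = detector_id("mad", {"threshold": 3.0, "window_size": 100, "batch_size": 2160,
                        "start_time": "2024-03-01 00:00:00"})
c = detector_id("mad", {"threshold": 2.0, "window_size": 100})
print(a == b)  # True: only execution parameters differ
print(a == c)  # False: threshold affects results
```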

Configuration Examples

Basic Usage

Minimal configuration:

name: cpu_usage
interval: 1min
query: "SELECT timestamp, cpu_percent FROM metrics"

detectors:
  - type: mad
    params:
      threshold: 3.0

With Historical Window

Recommended for production:

detectors:
  - type: mad
    params:
      threshold: 3.0
      window_size: 2880   # 2 days of 1-min data
      min_samples: 500    # Wait for 500 valid samples

With Seasonality (Single Component)

For metrics with daily patterns:

name: website_traffic
interval: 1hour
query: "SELECT timestamp, visitor_count FROM traffic"

# Extract hour-of-day seasonality from timestamp (built-in feature "hour")
seasonality_columns:
  - hour

detectors:
  - type: mad
    params:
      threshold: 3.0
      window_size: 672    # 4 weeks of hourly data
      seasonality_components:
        - "hour"          # Different intervals for each hour

With Combined Seasonality

For complex patterns (e.g., gaming metrics with multi-day tournaments):

name: group_assigned_users_pct
interval: 10min
query_file: sql/group_assigned.sql

query_columns:
  timestamp: period_time
  metric: group_assigned_users_pct
  seasonality:
    - offset_10minutes  # 0-143 (10-min intervals in day)
    - league_day        # 1-3 (tournament days)

detectors:
  - type: mad
    params:
      threshold: 3.0
      window_size: 8640   # 60 days
      min_samples: 1000
      start_time: "2024-03-01 00:00:00"
      batch_size: 2160
      seasonality_components:
        - ["offset_10minutes", "league_day"]  # 432 unique combinations
      min_samples_per_group: 10

When to Use MAD Detector

Best For:

General-purpose anomaly detection - Good default choice
Data with outliers - More robust than Z-Score
Skewed distributions - Doesn’t assume normality
Metrics with seasonality - Excellent seasonality support
Production systems - Stable and predictable behavior

Consider Alternatives:

Symmetric distributions with no outliers → Z-Score may be more sensitive
Known bounds → Manual Bounds for strict thresholds
Extreme skewness → IQR detector

Performance Characteristics

Speed: ~1,500 points/second (including I/O)
Memory: O(window_size) per metric
CPU: Lightweight (median calculation only)
Seasonality impact: Minimal performance penalty

Detection runs a per-point window loop, so large initial backfills are the slow path; incremental runs only score the few new points and are cheap.

Detection Metadata

Each detection result includes metadata:

{
    "global_median": 0.5123,        # Median of entire window
    "global_mad": 0.0234,           # MAD of entire window
    "adjusted_median": 0.5234,      # After seasonality adjustment
    "adjusted_mad": 0.0187,         # After seasonality adjustment
    "window_size": 8640,            # Actual valid samples used
    "ess": 412.7,                   # Effective sample size (Kish) — when window_weights is set
    "trend_slope_per_point": -0.0002,  # Estimated trend slope — when detrend is set
    "seasonality_groups": [         # Applied adjustments
        {
            "group": ["offset_10minutes", "league_day"],
            "median_multiplier": 1.023,
            "mad_multiplier": 0.876,
            "group_size": 23
        }
    ],
    # Only for anomalies:
    "direction": "above",           # "above" or "below"
    "severity": 4.52,               # σ-equivalents beyond the bound: distance / (1.4826 × adjusted_mad); adjusted_mad equals global_mad only when no seasonality multiplier applies
    "distance": 0.2298              # Absolute distance from bound
}
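
The anomaly-only fields relate to each other as described in the comments above. The sketch below recomputes them from hypothetical input values, assuming the bounds are built from adjusted_median and adjusted_mad and that adjusted_mad is greater than zero. The function `explain_anomaly` is not part of the detector.

```python
K = 1.4826  # normal-consistency constant


def explain_anomaly(value, adjusted_median, adjusted_mad, threshold=3.0):
    """Return direction, distance, and severity, or None if the value is in bounds."""
    half_width = threshold * K * adjusted_mad
    lower = adjusted_median - half_width
    upper = adjusted_median + half_width
    if lower <= value <= upper:
        return None
    direction = "above" if value > upper else "below"
    distance = value - upper if direction == "above" else lower - value
    return {
        "direction": direction,
        "distance": distance,                      # absolute distance from the bound
        "severity": distance / (K * adjusted_mad), # sigma-equivalents beyond the bound
    }


# Hypothetical inputs, not taken from the metadata example above.
result = explain_anomaly(value=0.70, adjusted_median=0.50, adjusted_mad=0.02)
print(result)
print(explain_anomaly(value=0.55, adjusted_median=0.50, adjusted_mad=0.02))  # None
```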

Comparison with Other Detectors

| Feature | MAD | Z-Score | IQR | Manual |
|---|---|---|---|---|
| Robust to outliers | Very | No | Very | N/A |
| Normally distributed data | Not required | Required | Not required | N/A |
| Seasonality support | Excellent | Yes | Yes | No |
| Sensitivity tuning | Threshold | Threshold | Threshold | Bounds |
| Performance | Fast | Fast | Fast | Very Fast |
