MAD Detector (Median Absolute Deviation)

The MAD (Median Absolute Deviation) detector is a robust statistical method for anomaly detection that uses median-based statistics instead of mean-based approaches.

MAD is particularly effective for:

  • Data with outliers - More robust than standard deviation methods
  • Skewed distributions - Works well with non-normal data
  • Time-series with seasonality - Supports seasonality grouping
  • General-purpose detection - Good default choice for most metrics

The MAD detector works in five steps:

  1. Calculate median of historical window values
  2. Calculate MAD (median of absolute deviations from median)
  3. Scale MAD to σ-equivalents: sigma_est = 1.4826 × MAD (normal-consistency constant)
  4. Build confidence interval: [median - threshold × 1.4826 × MAD, median + threshold × 1.4826 × MAD]
  5. Detect anomalies when values fall outside the interval
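
The five steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the detector's actual implementation; the function names `mad_interval` and `is_anomaly` are hypothetical.

```python
from statistics import median

NORMAL_CONSISTENCY = 1.4826  # scales MAD to a sigma-equivalent for Gaussian data


def mad_interval(window, threshold=3.0):
    """Return (lower, upper) bounds computed from a window of historical values."""
    center = median(window)                            # step 1: median
    mad = median(abs(x - center) for x in window)      # step 2: MAD
    sigma_est = NORMAL_CONSISTENCY * mad               # step 3: sigma-equivalent
    half_width = threshold * sigma_est                 # step 4: interval half-width
    return center - half_width, center + half_width


def is_anomaly(value, window, threshold=3.0):
    """Step 5: a value is anomalous when it falls outside the interval."""
    lower, upper = mad_interval(window, threshold)
    return value < lower or value > upper


history = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]  # one outlier already in the window
lower, upper = mad_interval(history)
print(round(lower, 4), round(upper, 4))  # 5.5522 14.4478
```

Note that the outlier `50` inside the window barely moves the bounds, which is the robustness property the median-based statistics provide.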

Thanks to the 1.4826 scaling, threshold is expressed in σ-equivalents exactly like Z-Score: threshold=3.0 corresponds to 3-sigma on Gaussian noise (~0.27% false positives). Without scaling, 3×MAD would only be ≈2σ and fire on ~4% of normal points.
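
The false-positive figures quoted above can be checked with the standard normal tail probability. The sketch below assumes perfectly Gaussian noise and uses only `math.erfc`; `two_sided_tail` is a hypothetical helper name.

```python
import math

MAD_TO_SIGMA = 1.4826  # normal-consistency constant


def two_sided_tail(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2))


scaled = two_sided_tail(3.0)                    # threshold x 1.4826 x MAD is 3 sigma
unscaled = two_sided_tail(3.0 / MAD_TO_SIGMA)   # 3 x MAD is only about 2.02 sigma

print(f"with scaling:    {scaled:.2%}")
print(f"without scaling: {unscaled:.2%}")
```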

When seasonality is configured:

  1. Compute global statistics (entire window)
  2. For each seasonality group:
    • Compute group statistics (matching seasonality values only)
    • Calculate multipliers (group_stat / global_stat)
  3. Apply multipliers to adjust confidence intervals
  4. Detect anomalies using adjusted intervals

This creates adaptive intervals that vary per seasonality pattern (e.g., different intervals for each hour of day).
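
As a rough illustration of the multiplier step, the sketch below computes `group_stat / global_stat` for the median and the MAD. It assumes that multipliers fall back to 1.0 when a group is too small or a global statistic is zero; the zero guard and the function names are assumptions for illustration, not documented behavior.

```python
from statistics import median


def mad(values):
    """Median of absolute deviations from the median."""
    center = median(values)
    return median(abs(v - center) for v in values)


def group_multipliers(values, keys, current_key, min_samples_per_group=10):
    """Return (median_multiplier, mad_multiplier) for the group of current_key."""
    group = [v for v, k in zip(values, keys) if k == current_key]
    global_median, global_mad = median(values), mad(values)
    # Assumed fallback: use the global statistics unchanged.
    if len(group) < min_samples_per_group or global_median == 0 or global_mad == 0:
        return 1.0, 1.0
    return median(group) / global_median, mad(group) / global_mad


night = [9, 10, 11, 10, 9, 11]      # values observed at hour 0
day = [28, 30, 32, 30, 29, 31]      # values observed at hour 12
values = night + day
keys = [0] * len(night) + [12] * len(day)

med_mult, mad_mult = group_multipliers(values, keys, 12, min_samples_per_group=5)
print(round(med_mult, 3), round(mad_mult, 3))  # 1.538 0.1
```

Here the daytime group sits well above the global median but varies much less than the window as a whole, so its interval is shifted up and narrowed.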

threshold (default: 3.0)

Number of σ-equivalents from the median beyond which a value is considered anomalous. MAD is multiplied by the normal-consistency constant 1.4826, so the interval is median ± threshold × 1.4826 × MAD.

  • Higher values (e.g., 5.0) = less sensitive, fewer anomalies
  • Lower values (e.g., 2.0) = more sensitive, more anomalies
  • The default, 3.0, corresponds to 3-sigma on Gaussian noise (~0.27% false positives)
  • Typical range: 2.0 - 5.0

Example:

detectors:
  - type: mad
    params:
      threshold: 3.0  # Standard sensitivity

window_size (default: 100)

Number of historical points used to compute statistics.

  • Larger windows (e.g., 1000) = more stable, less responsive to changes
  • Smaller windows (e.g., 50) = more responsive, less stable
  • Recommended: at least 2-4 weeks of data for daily seasonality, for example:
    • For 10-minute intervals: window_size = 8640 (60 days)
    • For hourly data: window_size = 672 (4 weeks)
    • For daily data: window_size = 60 (2 months)
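
The window sizes listed above follow from dividing the desired history length by the metric interval. A small sketch, where `window_size_for` is a hypothetical helper and not part of the tool:

```python
from datetime import timedelta


def window_size_for(history: timedelta, interval: timedelta) -> int:
    """Number of points needed to cover `history` at one point per `interval`."""
    return history // interval


print(window_size_for(timedelta(days=60), timedelta(minutes=10)))  # 8640
print(window_size_for(timedelta(weeks=4), timedelta(hours=1)))     # 672
print(window_size_for(timedelta(days=60), timedelta(days=1)))      # 60
```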

Example:

detectors:
  - type: mad
    params:
      window_size: 8640  # 60 days of 10-min intervals

min_samples (int, default: 30, minimum: 1)

Minimum valid samples required before detection starts.

  • Ensures statistical reliability
  • Points before this threshold are marked as “insufficient_data”
  • Should be significantly smaller than window_size
  • Typical: 10-30% of window_size
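
The gating behavior can be pictured as a count of valid samples in the window. The sketch below assumes that "valid" means neither `None` nor NaN, and the `"ok"` label is a placeholder; only `"insufficient_data"` comes from the description above.

```python
import math


def detection_status(window, min_samples=30):
    """Return "insufficient_data" until the window holds enough valid samples."""
    # Assumption: missing values are represented as None or NaN.
    valid = [x for x in window if x is not None and not math.isnan(x)]
    return "ok" if len(valid) >= min_samples else "insufficient_data"


print(detection_status([1.0] * 29))                   # insufficient_data
print(detection_status([1.0] * 30))                   # ok
print(detection_status([1.0] * 29 + [float("nan")]))  # insufficient_data
```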

Example:

detectors:
  - type: mad
    params:
      window_size: 8640  # min_samples must stay well below window_size
      min_samples: 1000  # Wait for 1000 valid samples

seasonality_components

List of seasonality groups for which adaptive intervals are computed.

Format: List of column names or lists of column names

Names must match the metric’s seasonality features: the built-in seasonality_columns names (hour, day_of_week, day_of_month, month, is_weekend, is_holiday) or custom columns returned by the query and declared in query_columns.seasonality.

Examples:

Single seasonality component:

detectors:
  - type: mad
    params:
      seasonality_components:
        - "hour"  # Different intervals per hour

Multiple separate components:

detectors:
  - type: mad
    params:
      seasonality_components:
        - "day_of_week"  # Different intervals per weekday
        - "hour"         # Different intervals per hour

Combined component (interaction):

detectors:
  - type: mad
    params:
      seasonality_components:
        - ["hour", "day_of_week"]  # Different per hour+weekday combo
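
The difference between separate and combined components comes down to how group keys are formed. The sketch below shows one plausible reading, in which a plain name yields a one-element key and a list yields a combined key. The helper `group_keys` and the row layout are assumptions for illustration.

```python
def group_keys(row, seasonality_components):
    """Build one key per component; a list component becomes a combined key."""
    keys = []
    for component in seasonality_components:
        names = [component] if isinstance(component, str) else list(component)
        keys.append(tuple(row[name] for name in names))
    return keys


row = {"hour": 14, "day_of_week": 2}
print(group_keys(row, ["day_of_week", "hour"]))    # [(2,), (14,)]
print(group_keys(row, [["hour", "day_of_week"]]))  # [(14, 2)]
```

Two separate components give 7 + 24 groups, while the combined component gives up to 7 × 24 = 168 groups, each with far fewer samples.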

min_samples_per_group (default: 10)

Minimum samples required in each seasonality group for applying multipliers. Groups below this threshold fall back to global statistics.

Advanced: size the window so that groups can fill. A group’s multiplier engages only when the trailing window holds at least min_samples_per_group points that share the current point’s key. With keys spread evenly, each key covers about 1/distinct_keys of the window, so you need window_size ≳ min_samples_per_group × distinct_keys. For hourly data grouped by hour (24 keys), the default min_samples_per_group = 10 needs window_size ≳ 240. With the default window_size = 100, no group ever fills and seasonality silently has no effect (every point uses the global band); the detector logs a one-time warning in this case. To fix this, raise window_size, lower min_samples_per_group, or use a coarser grouping.
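
The sizing rule can be checked before deploying a configuration. This is a back-of-the-envelope sketch that assumes keys are spread evenly across the window; both function names are hypothetical.

```python
def min_window_for_groups(min_samples_per_group, distinct_keys):
    """Smallest window in which every group can reach min_samples_per_group."""
    return min_samples_per_group * distinct_keys


def groups_can_fill(window_size, min_samples_per_group, distinct_keys):
    """True when the window is large enough for seasonality to take effect."""
    return window_size >= min_window_for_groups(min_samples_per_group, distinct_keys)


print(min_window_for_groups(10, 24))   # 240: hourly data grouped by hour
print(groups_can_fill(100, 10, 24))    # False: default window is too small
print(groups_can_fill(8640, 10, 432))  # True: 144 offsets x 3 league days
```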

Example:

detectors:
  - type: mad
    params:
      seasonality_components:
        - ["offset_10minutes", "league_day"]  # custom columns from query_columns.seasonality
      min_samples_per_group: 15  # Need 15 samples per group

Shared Parameters (Preprocessing, Weighting, Detrending)

input_type, smoothing, window_weights / half_life, and detrend behave identically across MAD, Z-Score and IQR. See Shared Detector Parameters for the full reference, defaults, and tuning recipes.

start_time and batch_size control how detection runs without affecting results (they are not part of the detector ID). See Shared Detector Parameters → Execution Parameters.

All result-affecting parameters (everything except start_time and batch_size) are hashed into the detector_id. See Shared Detector Parameters → Detector Identity and Recomputation.
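
To make the identity rule concrete, the sketch below hashes every parameter except start_time and batch_size. The actual hashing scheme is not described here; SHA-256 over sorted JSON, the 16-character truncation, and the function name are all assumptions chosen for illustration.

```python
import hashlib
import json

EXECUTION_ONLY = {"start_time", "batch_size"}  # do not affect results


def detector_id(detector_type, params):
    """Hypothetical ID: stable hash of the result-affecting parameters only."""
    relevant = {k: v for k, v in params.items() if k not in EXECUTION_ONLY}
    payload = json.dumps({"type": detector_type, "params": relevant}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


a = detector_id("mad", {"threshold": 3.0, "window_size": 8640, "batch_size": 2160})
b = detector_id("mad", {"threshold": 3.0, "window_size": 8640, "batch_size": 500})
c = detector_id("mad", {"threshold": 2.0, "window_size": 8640})

print(a == b)  # True: batch_size is ignored
print(a == c)  # False: threshold changes the results
```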

Minimal configuration:

name: cpu_usage
interval: 1min
query: "SELECT timestamp, cpu_percent FROM metrics"
detectors:
  - type: mad
    params:
      threshold: 3.0

Recommended for production:

detectors:
  - type: mad
    params:
      threshold: 3.0
      window_size: 2880  # 2 days of 1-min data
      min_samples: 500   # Wait for 500 valid samples

For metrics with daily patterns:

name: website_traffic
interval: 1hour
query: "SELECT timestamp, visitor_count FROM traffic"
# Extract hour-of-day seasonality from timestamp (built-in feature "hour")
seasonality_columns:
  - hour
detectors:
  - type: mad
    params:
      threshold: 3.0
      window_size: 672  # 4 weeks of hourly data
      seasonality_components:
        - "hour"  # Different intervals for each hour

For complex patterns (e.g., gaming metrics with multi-day tournaments):

name: group_assigned_users_pct
interval: 10min
query_file: sql/group_assigned.sql
query_columns:
  timestamp: period_time
  metric: group_assigned_users_pct
  seasonality:
    - offset_10minutes  # 0-143 (10-min intervals in day)
    - league_day        # 1-3 (tournament days)
detectors:
  - type: mad
    params:
      threshold: 3.0
      window_size: 8640  # 60 days
      min_samples: 1000
      start_time: "2024-03-01 00:00:00"
      batch_size: 2160
      seasonality_components:
        - ["offset_10minutes", "league_day"]  # 432 unique combinations
      min_samples_per_group: 10

When to use MAD:

  • General-purpose anomaly detection - Good default choice
  • Data with outliers - More robust than Z-Score
  • Skewed distributions - Doesn’t assume normality
  • Metrics with seasonality - Excellent seasonality support
  • Production systems - Stable and predictable behavior

Consider alternatives for:

  • Symmetric distributions with no outliers → Z-Score may be more sensitive
  • Known bounds → Manual Bounds for strict thresholds
  • Extreme skewness → IQR detector

Performance characteristics:

  • Speed: ~1,500 points/second (including I/O)
  • Memory: O(window_size) per metric
  • CPU: Lightweight (median calculation only)
  • Seasonality impact: Minimal performance penalty

Detection runs a per-point window loop, so large initial backfills are the slow path; incremental runs only score the few new points and are cheap.
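
The cost difference between backfill and incremental runs is easiest to see in a sketch of the loop. The version below assumes the trailing window excludes the point being scored and omits the min_samples gate for brevity; `score_points` is a hypothetical name, not the tool's API.

```python
from statistics import median


def score_points(series, start, window_size=100, threshold=3.0):
    """Score series[start:] one point at a time against its trailing window."""
    flags = []
    for i in range(start, len(series)):
        window = series[max(0, i - window_size):i]  # assumed to exclude point i
        if not window:
            flags.append(False)  # nothing to compare against yet
            continue
        center = median(window)
        mad = median(abs(x - center) for x in window)
        flags.append(abs(series[i] - center) > threshold * 1.4826 * mad)
    return flags


series = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50, 10]

backfill = score_points(series, start=0, window_size=9)     # one pass per point
incremental = score_points(series, start=9, window_size=9)  # only the 2 new points
print(len(backfill), incremental)  # 11 [True, False]
```

Each scored point costs one median computation over the window, so a backfill over n points does roughly n times the work of scoring a single new point.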

Each detection result includes metadata (illustrative values):

{
  "global_median": 0.5123,            # Median of entire window
  "global_mad": 0.0234,               # MAD of entire window
  "adjusted_median": 0.5234,          # After seasonality adjustment
  "adjusted_mad": 0.0187,             # After seasonality adjustment
  "window_size": 8640,                # Actual valid samples used
  "ess": 412.7,                       # Effective sample size (Kish); present when window_weights is set
  "trend_slope_per_point": -0.0002,   # Estimated trend slope; present when detrend is set
  "seasonality_groups": [             # Applied adjustments
    {
      "group": ["offset_10minutes", "league_day"],
      "median_multiplier": 1.023,
      "mad_multiplier": 0.876,
      "group_size": 23
    }
  ],
  # Only for anomalies:
  "direction": "above",               # "above" or "below"
  # severity: σ-equivalents beyond the bound, distance / (1.4826 × adjusted_mad).
  # adjusted_mad equals global_mad only when no seasonality multiplier applies.
  "severity": 4.52,
  "distance": 0.2298                  # Absolute distance from bound
}
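
The anomaly-only fields can be derived from the adjusted statistics as described in the comments above. The sketch below is one way to compute them under that reading; `describe_anomaly` is a hypothetical helper, and the infinite severity for a zero MAD is an assumption.

```python
def describe_anomaly(value, adjusted_median, adjusted_mad, threshold=3.0):
    """Return direction, distance and severity, or None if value is in bounds."""
    sigma_est = 1.4826 * adjusted_mad
    lower = adjusted_median - threshold * sigma_est
    upper = adjusted_median + threshold * sigma_est
    if lower <= value <= upper:
        return None
    direction = "above" if value > upper else "below"
    distance = value - upper if direction == "above" else lower - value
    # Assumption: report infinite severity when the interval has zero width.
    severity = distance / sigma_est if sigma_est > 0 else float("inf")
    return {"direction": direction, "distance": distance, "severity": severity}


result = describe_anomaly(1.0, adjusted_median=0.5, adjusted_mad=0.1)
print(result["direction"], round(result["distance"], 4), round(result["severity"], 2))
```

Severity measures how far the value lies beyond the bound, not how far it lies from the median, so a point just outside the interval has a severity close to zero.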
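
Comparison with other detectors:

| Feature             | MAD          | Z-Score   | IQR          | Manual    |
|---------------------|--------------|-----------|--------------|-----------|
| Robust to outliers  | Very         | No        | Very         | N/A       |
| Normal distribution | Not required | Required  | Not required | N/A       |
| Seasonality support | Excellent    | Yes       | Yes          | No        |
| Sensitivity tuning  | Threshold    | Threshold | Threshold    | Bounds    |
| Performance         | Fast         | Fast      | Fast         | Very Fast |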