MAD Detector (Median Absolute Deviation)
The MAD (Median Absolute Deviation) detector is a robust statistical method for anomaly detection that uses median-based statistics instead of mean-based approaches.
Overview
MAD is particularly effective for:
- Data with outliers - More robust than standard deviation methods
- Skewed distributions - Works well with non-normal data
- Time-series with seasonality - Supports seasonality grouping
- General-purpose detection - Good default choice for most metrics
Algorithm
The MAD detector works in five steps:
- Calculate the median of the historical window values
- Calculate MAD (the median of absolute deviations from that median)
- Scale MAD to σ-equivalents: sigma_est = 1.4826 × MAD (1.4826 is the normal-consistency constant)
- Build the confidence interval: [median - threshold × 1.4826 × MAD, median + threshold × 1.4826 × MAD]
- Detect anomalies when values fall outside the interval
Thanks to the 1.4826 scaling, threshold is expressed in σ-equivalents
exactly like Z-Score: threshold=3.0 corresponds to 3-sigma on Gaussian
noise (~0.27% false positives). Without scaling, 3×MAD would only be ≈2σ and
fire on ~4% of normal points.
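
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the detector's actual implementation: the function names `mad_interval` and `is_anomaly` are hypothetical, and the sketch ignores seasonality, weighting, and detrending.

```python
from statistics import median

MAD_SCALE = 1.4826  # normal-consistency constant from the text above


def mad_interval(window, threshold=3.0):
    """Return (lower, upper) bounds computed from a window of historical values."""
    center = median(window)
    mad = median([abs(x - center) for x in window])
    sigma_est = MAD_SCALE * mad  # MAD expressed in sigma-equivalents
    return center - threshold * sigma_est, center + threshold * sigma_est


def is_anomaly(value, window, threshold=3.0):
    """True when the value falls outside the interval built from the window."""
    lower, upper = mad_interval(window, threshold)
    return value < lower or value > upper


# Illustrative data: the 50 is an outlier already present in the history.
history = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
lower, upper = mad_interval(history)
print(f"interval: [{lower:.4f}, {upper:.4f}]")
print(is_anomaly(15, history), is_anomaly(14, history))
```

In this example the past outlier (50) barely widens the interval, because neither the median nor MAD is pulled by a single extreme value.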
With Seasonality Grouping
When seasonality is configured:
- Compute global statistics (entire window)
- For each seasonality group:
- Compute group statistics (matching seasonality values only)
- Calculate multipliers (group_stat / global_stat)
- Apply multipliers to adjust confidence intervals
- Detect anomalies using adjusted intervals
This creates adaptive intervals that vary per seasonality pattern (e.g., different intervals for each hour of day).
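
A rough sketch of the multiplier idea for a single seasonality component follows. It assumes that the adjusted statistic is simply the global statistic times its multiplier, and that groups with too few samples (or a zero global statistic) fall back to the global band; the detector's real combination rules may differ, and `seasonal_interval` is a hypothetical name.

```python
from statistics import median

MAD_SCALE = 1.4826


def median_and_mad(values):
    center = median(values)
    return center, median([abs(v - center) for v in values])


def seasonal_interval(values, keys, current_key, threshold=3.0,
                      min_samples_per_group=10):
    """Interval for a point whose seasonality key is `current_key`.

    `values` and `keys` are parallel lists covering the historical window.
    """
    global_median, global_mad = median_and_mad(values)
    group = [v for v, k in zip(values, keys) if k == current_key]

    median_mult = mad_mult = 1.0  # default: use global statistics unchanged
    if (len(group) >= min_samples_per_group
            and global_median != 0 and global_mad != 0):
        group_median, group_mad = median_and_mad(group)
        median_mult = group_median / global_median  # group_stat / global_stat
        mad_mult = group_mad / global_mad

    adjusted_median = global_median * median_mult
    adjusted_mad = global_mad * mad_mult
    half_width = threshold * MAD_SCALE * adjusted_mad
    return adjusted_median - half_width, adjusted_median + half_width


# Illustrative data: quiet night hour (key 3) and busy day hour (key 15).
night = [9, 10, 11, 10, 9, 10, 11, 10, 9, 11]
day = [95, 100, 105, 100, 95, 100, 105, 100, 95, 105]
values, keys = night + day, [3] * 10 + [15] * 10

print(seasonal_interval(values, keys, 3))   # narrow band around the night level
print(seasonal_interval(values, keys, 15))  # separate band around the day level
print(seasonal_interval(values, keys, 7))   # unseen key: wide global band
```

The point of the example is that each hour gets its own band, while a key with too few samples inherits the much wider global band.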
Parameters
Algorithm Parameters
threshold (float, default: 3.0)
Number of σ-equivalents from median to consider anomalous. MAD is multiplied
by the normal-consistency constant 1.4826, so the interval is
median ± threshold × 1.4826 × MAD.
- Higher values (e.g., 5.0) = less sensitive, fewer anomalies
- Lower values (e.g., 2.0) = more sensitive, more anomalies
- Default 3.0 genuinely corresponds to 3-sigma on Gaussian noise (~0.27% false positives)
- Typical range: 2.0 - 5.0
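
To see how the threshold maps to an expected false-positive rate on Gaussian noise, the two-sided normal tail probability can be computed directly. This is a worked calculation under the assumption of perfectly Gaussian data; real metrics will deviate from it, and the function name is illustrative.

```python
import math


def gaussian_false_positive_rate(threshold):
    """Two-sided tail probability P(|Z| > threshold) for a standard normal Z."""
    return math.erfc(threshold / math.sqrt(2))


for t in (2.0, 3.0, 5.0):
    print(f"threshold={t}: {gaussian_false_positive_rate(t):.5%}")

# Effect of the 1.4826 scaling: 3 x raw MAD is only about 2 sigma.
scaled = gaussian_false_positive_rate(3.0)             # about 0.27%
unscaled = gaussian_false_positive_rate(3.0 / 1.4826)  # about 4.3%
print(f"scaled: {scaled:.2%}, unscaled: {unscaled:.2%}")
```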
Example:

    detectors:
      - type: mad
        params:
          threshold: 3.0  # Standard sensitivity

window_size (int, default: 100)
Number of historical points to use for computing statistics.
- Larger windows (e.g., 1000) = more stable, less responsive to changes
- Smaller windows (e.g., 50) = more responsive, less stable
- Recommended: 2-4 weeks of data for daily seasonality
  - For 10-minute intervals: window_size = 8640 (60 days)
  - For hourly data: window_size = 672 (4 weeks)
  - For daily data: window_size = 60 (2 months)
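
The window sizes listed above are just the desired history length divided by the metric interval. A small helper (hypothetical, not part of the tool) makes the arithmetic explicit:

```python
from datetime import timedelta


def window_size_for(history: timedelta, interval: timedelta) -> int:
    """Number of points needed to cover `history` at the given interval."""
    return history // interval  # timedelta // timedelta yields an int


print(window_size_for(timedelta(days=60), timedelta(minutes=10)))  # 8640
print(window_size_for(timedelta(weeks=4), timedelta(hours=1)))     # 672
print(window_size_for(timedelta(days=60), timedelta(days=1)))      # 60
```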
Example:

    detectors:
      - type: mad
        params:
          window_size: 8640  # 60 days of 10-min intervals

min_samples (int, default: 30, minimum: 1)
Minimum valid samples required before detection starts.
- Ensures statistical reliability
- Points before this threshold are marked as “insufficient_data”
- Should be significantly smaller than window_size
- Typical: 10-30% of window_size
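
The interaction between window_size and min_samples can be illustrated with a simple per-point loop. This sketch assumes the window holds only points that precede the current one and that every value counts as valid; both are assumptions, and `score_series` and the "normal"/"anomaly" labels are hypothetical names.

```python
from statistics import median

MAD_SCALE = 1.4826


def score_series(values, threshold=3.0, window_size=100, min_samples=30):
    """Label each point using only the trailing window that precedes it."""
    labels = []
    for i, value in enumerate(values):
        window = values[max(0, i - window_size):i]
        if len(window) < min_samples:
            # Not enough history yet: detection has not started.
            labels.append("insufficient_data")
            continue
        center = median(window)
        sigma_est = MAD_SCALE * median([abs(v - center) for v in window])
        outside = abs(value - center) > threshold * sigma_est
        labels.append("anomaly" if outside else "normal")
    return labels


series = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11, 30, 10]
labels = score_series(series, window_size=10, min_samples=5)
print(labels)
```

Here the first five points are reported as insufficient_data, and the spike to 30 is the only point flagged once the window has filled.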
Example:

    detectors:
      - type: mad
        params:
          window_size: 8640  # min_samples must fit inside the window
          min_samples: 1000  # Wait for 1000 valid samples

seasonality_components (list, optional)
List of seasonality groups for which adaptive intervals are computed.
Format: List of column names or lists of column names
Names must match the metric’s seasonality features: the built-in
seasonality_columns names (hour, day_of_week, day_of_month,
month, is_weekend, is_holiday) or custom columns returned by the
query and declared in query_columns.seasonality.
Examples:
Single seasonality component:

    detectors:
      - type: mad
        params:
          seasonality_components:
            - "hour"  # Different intervals per hour

Multiple separate components:

    detectors:
      - type: mad
        params:
          seasonality_components:
            - "day_of_week"  # Different intervals per weekday
            - "hour"         # Different intervals per hour

Combined component (interaction):

    detectors:
      - type: mad
        params:
          seasonality_components:
            - ["hour", "day_of_week"]  # Different per hour+weekday combo

min_samples_per_group (int, default: 10)
Minimum samples required in each seasonality group for applying multipliers. Groups below this threshold fall back to global statistics.
Advanced: size the window to fill a group. A group’s multiplier engages only when the trailing window holds at least min_samples_per_group points sharing the current point’s key. Each key recurs only once per cycle of distinct keys, so you need window_size ≳ min_samples_per_group × distinct_keys. For hourly data grouped by hour (24 keys), the default min_samples_per_group = 10 needs window_size ≳ 240; with the default window_size = 100 no group ever fills and the seasonality silently has no effect (every point uses the global band). The detector logs a one-time warning in this case. To fix it, raise window_size, lower min_samples_per_group, or use a coarser grouping.
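
The sizing rule can be checked before deploying a configuration. The helpers below are a hypothetical pre-flight check, not part of the detector, and they assume every key occurs equally often in the window:

```python
def min_window_for_groups(min_samples_per_group: int, distinct_keys: int) -> int:
    """Smallest window in which every group can reach min_samples_per_group."""
    return min_samples_per_group * distinct_keys


def groups_can_fill(window_size: int, min_samples_per_group: int,
                    distinct_keys: int) -> bool:
    return window_size >= min_window_for_groups(min_samples_per_group,
                                                distinct_keys)


# Hourly data grouped by hour (24 keys) with the default of 10 per group.
print(min_window_for_groups(10, 24))  # 240
print(groups_can_fill(100, 10, 24))   # False: default window is too small
print(groups_can_fill(672, 10, 24))   # True: 4 weeks of hourly data
```

If keys are unevenly distributed (for example, rare holiday groups), the real requirement is larger than this lower bound.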
Example:

    detectors:
      - type: mad
        params:
          seasonality_components:
            - ["offset_10minutes", "league_day"]  # custom columns from query_columns.seasonality
          min_samples_per_group: 15  # Need 15 samples per group

Shared Parameters (Preprocessing, Weighting, Detrending)
input_type, smoothing, window_weights / half_life, and detrend behave
identically across MAD, Z-Score and IQR. See
Shared Detector Parameters for the full reference,
defaults, and tuning recipes.
Execution Parameters
start_time and batch_size control how detection runs without affecting
results (they are not part of the detector ID). See
Shared Detector Parameters → Execution Parameters.
Detector Identity
All result-affecting parameters (everything except start_time and
batch_size) are hashed into the detector_id. See
Shared Detector Parameters → Detector Identity and Recomputation.
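
As an illustration of the idea only, an ID that ignores execution parameters could be derived as below. The actual hashing scheme is defined in Shared Detector Parameters and is not reproduced here; the function name, the use of SHA-256, the JSON serialization, and the 16-character truncation are all assumptions made for this sketch.

```python
import hashlib
import json

# Execution-only parameters, excluded from identity per the text above.
EXECUTION_ONLY = {"start_time", "batch_size"}


def detector_id(detector_type: str, params: dict) -> str:
    """Hypothetical ID: stable hash of the type plus result-affecting params."""
    identity = {k: v for k, v in params.items() if k not in EXECUTION_ONLY}
    payload = json.dumps({"type": detector_type, "params": identity},
                         sort_keys=True)  # key order must not change the hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


base = detector_id("mad", {"threshold": 3.0, "window_size": 8640})
rebatched = detector_id("mad", {"threshold": 3.0, "window_size": 8640,
                                "batch_size": 2160,
                                "start_time": "2024-03-01 00:00:00"})
retuned = detector_id("mad", {"threshold": 2.5, "window_size": 8640})
print(base == rebatched, base == retuned)
```

Under this scheme, changing batch_size or start_time keeps the same ID, while changing threshold produces a new one.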
Configuration Examples
Basic Usage
Minimal configuration:

    name: cpu_usage
    interval: 1min
    query: "SELECT timestamp, cpu_percent FROM metrics"

    detectors:
      - type: mad
        params:
          threshold: 3.0

With Historical Window
Recommended for production:

    detectors:
      - type: mad
        params:
          threshold: 3.0
          window_size: 2880  # 2 days of 1-min data
          min_samples: 500   # Wait for 500 valid samples

With Seasonality (Single Component)
For metrics with daily patterns:

    name: website_traffic
    interval: 1hour
    query: "SELECT timestamp, visitor_count FROM traffic"

    # Extract hour-of-day seasonality from timestamp (built-in feature "hour")
    seasonality_columns:
      - hour

    detectors:
      - type: mad
        params:
          threshold: 3.0
          window_size: 672  # 4 weeks of hourly data
          seasonality_components:
            - "hour"  # Different intervals for each hour

With Combined Seasonality
For complex patterns (e.g., gaming metrics with multi-day tournaments):

    name: group_assigned_users_pct
    interval: 10min
    query_file: sql/group_assigned.sql

    query_columns:
      timestamp: period_time
      metric: group_assigned_users_pct
      seasonality:
        - offset_10minutes  # 0-143 (10-min intervals in day)
        - league_day        # 1-3 (tournament days)

    detectors:
      - type: mad
        params:
          threshold: 3.0
          window_size: 8640  # 60 days
          min_samples: 1000
          start_time: "2024-03-01 00:00:00"
          batch_size: 2160
          seasonality_components:
            - ["offset_10minutes", "league_day"]  # 432 unique combinations
          min_samples_per_group: 10

When to Use MAD Detector
Best For:
- General-purpose anomaly detection - Good default choice
- Data with outliers - More robust than Z-Score
- Skewed distributions - Doesn’t assume normality
- Metrics with seasonality - Excellent seasonality support
- Production systems - Stable and predictable behavior
Consider Alternatives:
- Symmetric distributions with no outliers → Z-Score may be more sensitive
- Known bounds → Manual Bounds for strict thresholds
- Extreme skewness → IQR detector
Performance Characteristics
- Speed: ~1,500 points/second (including I/O)
- Memory: O(window_size) per metric
- CPU: Lightweight (median calculation only)
- Seasonality impact: Minimal performance penalty
Detection runs a per-point window loop, so large initial backfills are the slow path; incremental runs only score the few new points and are cheap.
Detection Metadata
Each detection result includes metadata:

    {
      "global_median": 0.5123,           # Median of entire window
      "global_mad": 0.0234,              # MAD of entire window
      "adjusted_median": 0.5234,         # After seasonality adjustment
      "adjusted_mad": 0.0187,            # After seasonality adjustment
      "window_size": 8640,               # Actual valid samples used
      "ess": 412.7,                      # Effective sample size (Kish); present when window_weights is set
      "trend_slope_per_point": -0.0002,  # Estimated trend slope; present when detrend is set
      "seasonality_groups": [            # Applied adjustments
        {
          "group": ["offset_10minutes", "league_day"],
          "median_multiplier": 1.023,
          "mad_multiplier": 0.876,
          "group_size": 23
        }
      ],
      # Only for anomalies:
      "direction": "above",              # "above" or "below"
      "severity": 4.52,                  # σ-equivalents beyond the bound: distance / (1.4826 × adjusted_mad); adjusted_mad equals global_mad only when no seasonality multiplier applies
      "distance": 0.2298                 # Absolute distance from bound
    }

Comparison with Other Detectors
| Feature | MAD | Z-Score | IQR | Manual |
|---|---|---|---|---|
| Robust to outliers | Very | No | Very | N/A |
| Normal distribution | Not required | Required | Not required | N/A |
| Seasonality support | Excellent | Yes | Yes | No |
| Sensitivity tuning | Threshold | Threshold | Threshold | Bounds |
| Performance | Fast | Fast | Fast | Very Fast |
References
See Also
- Z-Score Detector - For normally distributed data
- IQR Detector - For extremely skewed data
- Detectors Guide - Choosing the right detector
- Configuration Guide - Complete config reference