Quickstart
This guide will walk you through creating your first detectkit project and monitoring a metric.
Prerequisites
- detectkit installed (Installation Guide)
- Database connection (ClickHouse, PostgreSQL, or MySQL)
- Basic SQL knowledge
Fastest start: set up with an AI assistant
If you use Claude Code, it can do the setup for you interactively, with no hand-editing of YAML:

    dtk init my_monitoring && cd my_monitoring
    dtk init-claude   # adds CLAUDE.md + .claude/rules + skills to this folder

Then, in Claude Code, ask it to run the dtk-setup-project skill. It walks you through profiles.yml based on your database (ClickHouse today): connection details, the internal _dtk_* vs data locations, and an optional first alert channel. It then verifies the result with a non-destructive run. Next, ask it to run dtk-new-metric to scaffold your first metric. That's the whole setup.
Prefer to do it by hand? The manual steps below do exactly the same thing.
Step 1: Initialize Project
Create a new detectkit project:

    dtk init my_monitoring
    cd my_monitoring

This creates the following structure:

    my_monitoring/
    ├── detectkit_project.yml       # Project configuration
    ├── profiles.yml                # Database connections
    ├── README.md                   # Project readme with quick commands
    ├── metrics/                    # Metric definitions
    │   └── example_cpu_usage.yml   # Working starter metric (mad + zscore, alerting)
    ├── incidents/                  # Labeled incidents for supervised `dtk autotune`
    │   └── example_cpu_usage.yml   # Example labels file
    └── sql/                        # SQL queries
        └── .gitkeep

metrics/example_cpu_usage.yml is a complete, runnable example. Use it as a template for your own metrics.
Tip: set up an AI assistant. If you use Claude Code, run dtk init-claude in this folder. It writes a CLAUDE.md and a .claude/rules/detectkit/ reference plus the setup skills, dtk-setup-project (walks through profiles.yml interactively) and dtk-new-metric (scaffolds a metric), so the assistant can do the setup below for you and help you write metrics, tune detectors and configure alerts. Re-run it after upgrading detectkit to refresh the context. See dtk init-claude.
Step 2: Configure Database Connection
Shortcut: let the assistant do it. If you ran dtk init-claude (see the tip above), just ask Claude Code to run the dtk-setup-project skill. It asks for your connection details, branches on the database type, fills in the profile fields below for you, and verifies the result. The manual steps below are the same thing by hand.
Edit profiles.yml to add your database connection:
ClickHouse Example
    default_profile: prod

    profiles:
      prod:
        type: clickhouse
        host: localhost
        port: 9000
        user: default
        password: ""

        # Internal tables location (for _dtk_* tables)
        internal_database: analytics

        # Data tables location
        data_database: default

        settings:
          max_execution_time: 600

Edit before running. The auto-generated profiles.yml ships a dev profile with example values: host: localhost and the two required ClickHouse locations, internal_database: detectkit (for the _dtk_* tables) and data_database: default (where your source tables live). Change the host, port, credentials and both database names to match your environment. (There is no database: field. ClickHouse needs both internal_database and data_database, or the run fails with "internal_database must be set for ClickHouse".)
Tip: dtk init --db-type postgres (or mysql) scaffolds profiles.yml with the right fields for that backend from the start.
PostgreSQL Example
PostgreSQL connects to a database (which must already exist) and uses schemas:

    profiles:
      prod:
        type: postgres
        host: localhost
        port: 5432
        user: postgres
        password: "your_password"
        database: detectkit          # must already exist
        internal_schema: detectkit   # auto-created
        data_schema: public

MySQL Example
MySQL (8.0+) uses databases (auto-created):

    profiles:
      prod:
        type: mysql
        host: localhost
        port: 3306
        user: root
        password: "your_password"
        internal_database: detectkit
        data_database: analytics

See the Databases guide for the full per-backend breakdown (install extras, connection fields, SQL dialect).
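Before moving on to the first metric, it can be worth confirming that the host and port you entered in profiles.yml accept TCP connections at all. The sketch below is not part of detectkit: port_is_open is a hypothetical helper, and the ports are simply the ones used in the example profiles above. It checks reachability only and says nothing about credentials or database names.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper, not a detectkit API. It only shows that something
    is listening; it does not check credentials or database names.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Ports taken from the example profiles in this guide (adjust to your setup).
EXAMPLE_PORTS = {"clickhouse": 9000, "postgres": 5432, "mysql": 3306}
```

For example, if port_is_open("localhost", EXAMPLE_PORTS["clickhouse"]) returns False, the problem is most likely the host, the port or a firewall rather than the detectkit configuration.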
Step 3: Create Your First Metric
Create a metric configuration file:

    touch metrics/api_response_time.yml

Edit metrics/api_response_time.yml:

    # Basic metric info
    name: api_response_time
    interval: 5min

    # SQL query to load data.
    # Built-in template variables: {{ dtk_start_time }}, {{ dtk_end_time }}
    # (rendered as 'YYYY-MM-DD HH:MM:SS' strings) and {{ interval_seconds }}.
    query: |
      SELECT
        timestamp,
        AVG(response_time_ms) AS value
      FROM api_logs
      WHERE timestamp >= '{{ dtk_start_time }}'
        AND timestamp < '{{ dtk_end_time }}'
      GROUP BY timestamp
      ORDER BY timestamp

    # Column mapping (optional if columns match defaults)
    query_columns:
      timestamp: timestamp
      metric: value

    # Detector configuration
    detectors:
      - type: mad
        params:
          threshold: 3.0
          window_size: 288  # 1 day of 5-min intervals
          min_samples: 50

    # Alerting configuration
    alerting:
      enabled: true
      channels:
        - mattermost_ops
      consecutive_anomalies: 3  # Require 3 anomalies in a row
      alert_cooldown: "30min"   # Recommended: without it a persisting
                                # anomaly re-alerts on every run

Step 4: Configure Alert Channel
Edit profiles.yml to add an alert channel:

    # At the end of profiles.yml
    alert_channels:
      mattermost_ops:
        type: mattermost
        webhook_url: "https://mattermost.example.com/hooks/your_webhook_id"
        channel: "alerts"
        # Bot name + avatar default to the detectkit brand. Override per channel
        # with username / icon_url / icon_emoji (see the Alert Channels guide).

Step 5: Run Your Metric
Run the metric for the first time:

    dtk run --select api_response_time

The output looks like this: a header with the project root and the metric count, then a per-metric block (config file and steps) and the load → detect → alert pipeline rendered as a tree, ending in a success line:

    Project root: /path/to/my_monitoring
    Found 1 metric(s) to process

    Processing metric: api_response_time
      Config file: metrics/api_response_time.yml
      Steps: load, detect, alert

      ┌─ LOAD
      │  ... (load progress)
      └─ ... (detect / alert progress)

    ✓ Pipeline completed successfully

The per-step detail lines are emitted by the pipeline itself, so the exact middle of the tree depends on how much data was loaded and how many anomalies were found.
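To build intuition for what the detect and alert steps decide with the Step 3 settings (threshold: 3.0, window_size: 288, consecutive_anomalies: 3, alert_cooldown: "30min"), here is a minimal sketch in Python. It is a textbook version written for illustration, not detectkit's implementation: the function names are hypothetical, the 1.4826 scaling constant is an assumption, and min_samples handling is left out.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import Optional, Sequence, Tuple

# 288 points = 1 day of 5-minute intervals, as in the Step 3 config.
WINDOW_SIZE = timedelta(days=1) // timedelta(minutes=5)


def mad_bounds(window: Sequence[float], threshold: float = 3.0) -> Tuple[float, float]:
    """Textbook MAD band around the median of `window`.

    Assumption: uses the common 1.4826 consistency constant; the real
    detector may scale differently.
    """
    center = median(window)
    mad = median(abs(x - center) for x in window)
    half_width = threshold * 1.4826 * mad
    return center - half_width, center + half_width


def is_anomaly(value: float, window: Sequence[float], threshold: float = 3.0) -> bool:
    """Flag a value that falls outside the MAD band of the preceding window."""
    lower, upper = mad_bounds(window, threshold)
    return value < lower or value > upper


def should_alert(
    flags: Sequence[bool],
    now: datetime,
    last_alert: Optional[datetime] = None,
    consecutive: int = 3,
    cooldown: timedelta = timedelta(minutes=30),
) -> bool:
    """Decide whether to alert. `flags` are anomaly flags, oldest first."""
    # Require the most recent `consecutive` points to all be anomalous.
    if len(flags) < consecutive or not all(flags[-consecutive:]):
        return False
    # Suppress repeats while the cooldown since the last alert is still running.
    if last_alert is not None and now - last_alert < cooldown:
        return False
    return True
```

Under these assumptions, a single spike does not alert on its own; three anomalous points in a row do, and a persisting anomaly stays quiet for 30 minutes after each alert.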
Step 6: Explore Results
View Loaded Data
Data is stored in the _dtk_datapoints table:

    SELECT *
    FROM analytics._dtk_datapoints
    WHERE metric_name = 'api_response_time'
    ORDER BY timestamp DESC
    LIMIT 10;

View Detections
Anomalies are stored in the _dtk_detections table:

    SELECT
      timestamp,
      value,
      confidence_lower,
      confidence_upper,
      detection_metadata
    FROM analytics._dtk_detections
    WHERE metric_name = 'api_response_time'
      AND is_anomaly = true
    ORDER BY timestamp DESC;

Common Use Cases
1. Error Rate Monitoring

    name: error_rate
    interval: 1min

    query: |
      SELECT
        toStartOfMinute(timestamp) AS timestamp,
        countIf(status >= 500) / count() AS value
      FROM http_requests
      WHERE timestamp >= '{{ dtk_start_time }}'
        AND timestamp < '{{ dtk_end_time }}'
      GROUP BY timestamp
      ORDER BY timestamp

    detectors:
      - type: manual_bounds
        params:
          upper_bound: 0.01  # Alert if error rate > 1%

2. CPU Usage Monitoring
    name: cpu_usage
    interval: 30s

    query: |
      SELECT
        timestamp,
        avg_cpu_percent AS value
      FROM system_metrics
      WHERE timestamp >= '{{ dtk_start_time }}'
        AND timestamp < '{{ dtk_end_time }}'
      ORDER BY timestamp

    detectors:
      - type: zscore
        params:
          threshold: 3.0
          window_size: 120  # 1 hour of 30s intervals

3. Daily Active Users
    name: daily_active_users
    interval: 1day

    query: |
      SELECT
        toDate(timestamp) AS timestamp,
        uniqExact(user_id) AS value
      FROM user_events
      WHERE timestamp >= '{{ dtk_start_time }}'
        AND timestamp < '{{ dtk_end_time }}'
      GROUP BY timestamp
      ORDER BY timestamp

    detectors:
      - type: mad
        params:
          threshold: 3.0
          window_size: 60  # 2 months of daily points

CLI Commands
Run Specific Metrics

    # Run single metric
    dtk run --select api_response_time

    # Run multiple metrics
    dtk run --select "api_*"

    # Run all metrics
    dtk run --select "*"

Partial Pipeline
    # Only load data (skip detection and alerting)
    dtk run --select api_response_time --steps load

    # Load and detect anomalies (skip alerting)
    dtk run --select api_response_time --steps load,detect

Full Refresh
    # Delete all data and reload from scratch
    dtk run --select api_response_time --full-refresh

Historical Backfill
    # Load data from a specific date
    dtk run --select api_response_time --from "2024-01-01 00:00:00"

    # Bounded backfill: pair --from with --to to load a closed window
    dtk run --select api_response_time --from "2024-01-01" --to "2024-02-01"

Exclude and Force
    # Run everything except a subset
    dtk run --select "*" --exclude "metrics/staging/*"

    # Ignore a stuck lock left by a crashed run
    dtk run --select api_response_time --force

See the CLI Reference for the full flag list.
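For a long backfill it can be convenient to split the range into smaller --from / --to windows and run them one after another. The sketch below only builds the argument lists and executes nothing; month_windows and backfill_commands are hypothetical helpers, and it uses only the flags shown above. Adjacent windows share a boundary date, and this guide does not say whether --to is inclusive, so check the CLI Reference before relying on it.

```python
from datetime import date
from typing import Iterator, List, Tuple


def month_windows(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield (from, to) pairs that split the range start..end at month boundaries."""
    current = start
    while current < end:
        # First day of the month after `current`.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        upper = min(next_month, end)
        yield current, upper
        current = upper


def backfill_commands(metric: str, start: date, end: date) -> List[List[str]]:
    """Build one `dtk run` argument list per window (nothing is executed here)."""
    return [
        ["dtk", "run", "--select", metric,
         "--from", lo.isoformat(), "--to", hi.isoformat()]
        for lo, hi in month_windows(start, end)
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True) so that the loop stops at the first failing window.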
Test Alert
    # Preview alert message without real anomalies
    dtk test-alert api_response_time

Clear a Stuck Lock

    # If a run was killed without releasing its lock (e.g. the database
    # restarted mid-run), later runs fail with "Failed to acquire lock".
    # Clear it immediately:
    dtk unlock --select api_response_time

Stuck locks also auto-expire after 1 hour, so the next normal run recovers on its own. dtk unlock just does it right away.
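The auto-expiry rule can be pictured as a simple age check on the lock. This is an illustrative sketch only: can_acquire is a hypothetical function, and the way detectkit actually stores and checks locks is not described in this guide. Only the 1-hour expiry comes from the text above.

```python
from datetime import datetime, timedelta
from typing import Optional

LOCK_TTL = timedelta(hours=1)  # expiry stated in this guide


def can_acquire(locked_at: Optional[datetime], now: datetime) -> bool:
    """Return True if a new run may take the lock (illustrative logic only)."""
    if locked_at is None:
        # Nobody holds the lock.
        return True
    # A lock at least LOCK_TTL old is treated as stale and may be taken over.
    return now - locked_at >= LOCK_TTL
```

In this picture, dtk unlock corresponds to resetting locked_at to None without waiting for the hour to pass.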
Prune Stale Data After Editing Configs
    # Editing a metric's detectors/alerting leaves the old results behind.
    # Preview what no longer matches the config (dry-run), then delete it:
    dtk clean --select api_response_time
    dtk clean --select api_response_time --execute

    # Renamed or deleted a metric? Purge everything left under the old name:
    dtk clean --orphaned-metrics --execute

See the CLI Reference for both modes.
Next Steps
Now that you have a working metric:
- Add seasonality - MAD Detector with Seasonality
- Handle trending metrics - window_weights: exponential + half_life, or detrend: linear (Detectors Guide)
- Configure multiple detectors - Detectors Guide
- Set up multiple channels - Alerting Guide
- Fan out to independent alert rules - alerting: can be a list of alert blocks, each with its own channels, conditions and template (Multiple alert blocks)
- Explore examples - Examples
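As a rough picture of why exponential window weights with a half_life help on trending metrics, the sketch below computes weights that halve every half_life points going back in time, so recent values dominate the baseline. It is a generic formulation and an assumption on our part: exponential_weights is a hypothetical function, half_life is measured in points here, and the exact formula detectkit uses is the one documented in the Detectors Guide.

```python
from typing import List


def exponential_weights(window_size: int, half_life: float) -> List[float]:
    """Weights for a window ordered oldest -> newest, normalised to sum to 1.

    A point `half_life` steps older than another gets half of its weight.
    """
    # The newest point has age 0, the oldest has age window_size - 1.
    ages = range(window_size - 1, -1, -1)
    raw = [0.5 ** (age / half_life) for age in ages]
    total = sum(raw)
    return [w / total for w in raw]
```

With window_size=5 and half_life=2, the newest point weighs twice as much as the point two steps before it and four times as much as the oldest point.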
Troubleshooting
“Table _dtk_datapoints does not exist”
Solution: detectkit creates its internal tables automatically on the first run. If the table is still missing, check that the database user has the permissions needed to create tables in the internal location.
“Connection refused”
Solution: Verify the database connection in profiles.yml:

    # Test ClickHouse connection
    clickhouse-client --host=localhost --port=9000

    # Test PostgreSQL connection
    psql -h localhost -U postgres -d analytics

“No data loaded”
Solution: Check that your SQL query returns data:

    -- Run the query manually with sample dates
    SELECT
      timestamp,
      AVG(response_time_ms) AS value
    FROM api_logs
    WHERE timestamp >= '2024-03-01 00:00:00'
      AND timestamp < '2024-03-02 00:00:00'
    GROUP BY timestamp
    ORDER BY timestamp;

“All points marked as insufficient_data”
Solution: Increase the historical data range or decrease min_samples:

    detectors:
      - type: mad
        params:
          min_samples: 10  # Reduce from default 30

Getting Help
- Documentation: Full guides available in docs/
- Examples: See examples/ for more configurations
- If something looks like a bug: when a dtk command errors or behaves unexpectedly and it isn't your config, use the dtk-feedback skill (from dtk init-claude) to file a redacted bug report or feature request upstream. It collects diagnostics, strips every secret, and asks you to confirm first.
- Issues: Report bugs at https://github.com/alexeiveselov92/detectkit/issues