Time-Series SQL You’ll Actually Use: SAMPLE BY + FILL, LATEST ON, ASOF/LT/SPLICE JOIN, and EXPLAIN
Meta description (158 chars):
Practical time-series SQL patterns for mid-level data engineers: downsampling with SAMPLE BY + FILL, LATEST ON, ASOF/LT/SPLICE joins, and profiling with EXPLAIN.
Why this matters
Your PM wants a “60-second dashboard” that never shows gaps, your SRE needs the last known device state, and finance insists trades be joined to the nearest quote—without lying about time. These are the four patterns you actually need in production: downsample & fill, latest-per-key, time-aware joins, and explain the plan. The examples below use QuestDB’s time-series SQL because it ships these constructs natively. SAMPLE BY/FILL, LATEST ON, and ASOF/LT/SPLICE JOIN are first-class features, and EXPLAIN shows you how queries will execute. (QuestDB)
Pattern 1 — Downsample without lying: SAMPLE BY … FILL
Scenario: You have second-level metrics but want minute buckets, no holes.
-- Per-minute error rate, fill gaps by carrying forward last value
SELECT
ts,
sum(errors) / nullif(sum(requests), 0) AS error_rate
FROM api_metrics
SAMPLE BY 1m
FILL(PREV); -- carry forward
What it does:
- SAMPLE BY groups rows into uniform time buckets.
- FILL(PREV | LINEAR | NULL | <const>) controls how missing buckets are populated.
- For pre/post window padding (before the first row / after the last row), add FROM '...' TO '...'. Note: FROM-TO works only for non-keyed queries and doesn't combine with FILL(PREV) or FILL(LINEAR). (QuestDB)
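If it helps to see the fill semantics outside SQL, here is a minimal pure-Python sketch of what SAMPLE BY 1m + FILL(PREV) produces: bucket rows to the minute, then carry the last observed value into empty buckets. This is illustrative only, not QuestDB internals, and it stands in "last value per bucket" for a real aggregate.

```python
# Illustrative sketch of SAMPLE BY 1m + FILL(PREV) semantics (not QuestDB internals).
# Rows are (epoch_seconds, value); bucket to the minute, then forward-fill gaps.

def sample_by_1m_fill_prev(rows):
    buckets = {}
    for ts, value in rows:
        buckets[ts // 60 * 60] = value  # last value wins within a bucket (stand-in for an aggregate)
    start, end = min(buckets), max(buckets)
    out, prev = [], None
    for b in range(start, end + 60, 60):
        if b in buckets:
            prev = buckets[b]
        out.append((b, prev))  # empty buckets reuse the previous bucket's value
    return out

rows = [(0, 1.0), (30, 2.0), (150, 5.0)]  # gap: no row in the 60-119s minute
print(sample_by_1m_fill_prev(rows))  # → [(0, 2.0), (60, 2.0), (120, 5.0)]
```

The middle bucket has no rows, so PREV repeats 2.0 instead of emitting a hole, which is exactly why dashboards built on FILL(PREV) never show gaps.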
Helpful options:
-- Pin buckets to the calendar & a timezone
SELECT ts, count()
FROM api_metrics
SAMPLE BY 1h
FILL(NULL)
ALIGN TO CALENDAR TIME ZONE 'America/New_York';
Gotcha: These operations rely on a designated timestamp column on the table, so set it at create time. (QuestDB)
Pattern 2 — The last known state: LATEST ON … PARTITION BY
Scenario: “Show me the newest reading per device (or per device+sensor).”
SELECT device_id, status, ts
FROM device_events
LATEST ON ts PARTITION BY device_id;
That returns one row per device: the most recent by the table's designated timestamp. With multiple partition columns, you'll get the latest per (device_id, sensor_id) combination, but expect a full scan at very high key cardinality. (QuestDB)
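The semantics are easy to picture outside SQL: keep, for each key, the row with the greatest timestamp. A minimal pure-Python sketch (illustrative only, not QuestDB internals):

```python
# Illustrative sketch of LATEST ON ts PARTITION BY device_id (not QuestDB internals):
# keep, per key, the row with the greatest timestamp.

def latest_on(rows, key_field, ts_field):
    latest = {}
    for row in rows:
        k = row[key_field]
        if k not in latest or row[ts_field] > latest[k][ts_field]:
            latest[k] = row
    return sorted(latest.values(), key=lambda r: r[key_field])

events = [
    {"device_id": "a", "status": "ok",   "ts": 1},
    {"device_id": "a", "status": "warn", "ts": 5},
    {"device_id": "b", "status": "ok",   "ts": 3},
]
print(latest_on(events, "device_id", "ts"))
# → [{'device_id': 'a', 'status': 'warn', 'ts': 5}, {'device_id': 'b', 'status': 'ok', 'ts': 3}]
```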
Operator precedence trick — choose whether filters apply before or after “latest” by bracketing:
-- Filter then latest
SELECT * FROM balances
WHERE balance > 0
LATEST ON ts PARTITION BY cust_id;
-- Latest then filter (note the parentheses)
(SELECT * FROM balances LATEST ON ts PARTITION BY cust_id)
WHERE balance > 0;
The order changes the result set and the work the engine does. (QuestDB)
Pattern 3 — Time-aware joins that match how you think: ASOF, LT, SPLICE
Scenario A (pricing): For each trade, attach the nearest quote at or before the trade time.
SELECT t.ts, t.symbol, t.qty, q.bid, q.ask
FROM trades t
ASOF JOIN quotes q ON (symbol)
-- Optional tolerance to avoid “ancient” matches
TOLERANCE '5m';
ASOF JOIN matches each left row to the closest right-side timestamp ≤ the left’s timestamp (per key), optionally bounded by TOLERANCE. It requires time ordering via the designated timestamp. (QuestDB)
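To build intuition for the matching rule, here is a minimal pure-Python sketch of an asof lookup with a tolerance bound, for a single key (a real ASOF JOIN does this per key, e.g. per symbol). Illustrative only, not QuestDB internals:

```python
# Illustrative sketch of ASOF JOIN + TOLERANCE semantics (not QuestDB internals).
# For each trade timestamp, binary-search the latest quote with quote_ts <= trade_ts,
# then discard matches older than the tolerance window.
import bisect

def asof_join(trade_ts_list, quote_ts_list, tolerance):
    """quote_ts_list must be sorted ascending (designated-timestamp order)."""
    out = []
    for t in trade_ts_list:
        i = bisect.bisect_right(quote_ts_list, t) - 1  # rightmost quote_ts <= t
        if i >= 0 and t - quote_ts_list[i] <= tolerance:
            out.append((t, quote_ts_list[i]))
        else:
            out.append((t, None))  # no quote at/under t within tolerance
    return out

print(asof_join([10, 95, 400], [0, 60, 90], tolerance=300))
# → [(10, 0), (95, 90), (400, None)]
```

The last trade gets no match: the nearest quote is 310 seconds old, outside the 300-second tolerance, which is the "ancient match" the TOLERANCE clause is there to prevent.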
Scenario B (strict causality): Same as above, but disallow equal timestamps—use LT JOIN (“less-than”). (QuestDB)
Scenario C (fill both sides): Need a full-outer, time-aware union? Use SPLICE JOIN. It behaves like a time-series-aware FULL JOIN, pairing nearest neighbors across streams. (QuestDB)
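Since SPLICE is the one join without an example above, here is a minimal syntax sketch reusing the trades/quotes tables from Scenario A; verify the exact pairing semantics against the QuestDB JOIN reference before relying on them:

```sql
-- Sketch: time-aware FULL-like pairing of trades and quotes per symbol.
-- Rows from either stream with no counterpart still appear, paired with
-- the nearest-by-time row from the other side (semantics per QuestDB docs).
SELECT t.ts, t.symbol, t.px, q.bid, q.ask
FROM trades t
SPLICE JOIN quotes q ON (symbol);
```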
Quick reference
| Join | Meaning (per key) | Typical use |
|---|---|---|
| ASOF JOIN | Right row with timestamp ≤ left, nearest by time; optional TOLERANCE | Trades → most recent quote |
| LT JOIN | Like ASOF but strictly < (no equals) | Causality-safe telemetry merges |
| SPLICE JOIN | Time-aware FULL-like pairing across streams | Rebuilding continuous series from intermittent feeds |
All three are native SQL join types in QuestDB. (QuestDB)
Pattern 4 — Profile it: EXPLAIN (read the plan before you scale)
When queries feel “mysteriously slow,” don’t guess. EXPLAIN prints the execution plan—scan strategies, join order and method, and where JIT kicks in—so you can fix the shape of the query instead of throwing hardware at it. (QuestDB)
EXPLAIN
SELECT t.ts, t.symbol, q.bid
FROM trades t
ASOF JOIN quotes q ON (symbol) TOLERANCE '2m'
WHERE t.ts IN '2025-11-01';
Two simple wins you’ll often see:
- Put the big table first when combining huge and tiny subqueries; it can change which side gets hashed. The docs show a 10M-row example where join order matters. (QuestDB)
- Prune by time (WHERE ts IN 'YYYY-MM-DD' or FROM '…' TO '…' with SAMPLE BY) so the engine touches fewer partitions; QuestDB even injects a time filter to match your FROM-TO window. (QuestDB)
End-to-end mini-lab (copy/paste friendly)
Goal: Attach quotes to trades (ASOF), downsample to 1-minute buckets with no holes, and sanity-check the plan.
-- 1) Schemas with designated timestamps
CREATE TABLE trades (
ts TIMESTAMP, symbol SYMBOL, qty INT, px DOUBLE
) TIMESTAMP(ts) PARTITION BY DAY;
CREATE TABLE quotes (
ts TIMESTAMP, symbol SYMBOL, bid DOUBLE, ask DOUBLE
) TIMESTAMP(ts) PARTITION BY DAY;
-- 2) Time-aware join (nearest quote at/under trade time)
WITH enriched AS (
SELECT t.ts, t.symbol, t.qty, t.px, q.bid, q.ask
FROM trades t
ASOF JOIN quotes q ON (symbol)
TOLERANCE '2m'
)
-- 3) Downsample and fill every minute for the last day
SELECT
ts,
sum(qty) AS trades,
avg(px) AS avg_px,
avg((bid+ask)/2) AS mid_px
FROM enriched
WHERE ts IN '2025-11-19'
SAMPLE BY 1m
FILL(0, PREV, LINEAR); -- per-column fill strategies
Then inspect the plan:
EXPLAIN
WITH enriched AS (
SELECT t.ts, t.symbol, t.qty, t.px, q.bid, q.ask
FROM trades t
ASOF JOIN quotes q ON (symbol) TOLERANCE '2m'
)
SELECT ts, sum(qty), avg(px)
FROM enriched
WHERE ts IN '2025-11-19'
SAMPLE BY 1m FILL(PREV);
You should see a time-pruned scan and an ASOF join operator in the tree. If not, tighten your time filter or revisit join keys. (QuestDB)
Best practices & common pitfalls
- Always set the designated timestamp. Many time-aware features assume it. Without it, you’ll be forced into subqueries and full scans. (QuestDB)
- Choose the right fill. PREV is honest for counters and states; LINEAR is okay for smooth signals; never fabricate critical KPIs: use NULL and make the gap visible. FROM-TO padding works only on non-keyed SAMPLE BY and not with PREV/LINEAR. (QuestDB)
- Bound your ASOF/LT joins. Add TOLERANCE to avoid matching a quote from "yesterday" to a trade "now." (QuestDB)
- Be intentional about precedence. Bracket LATEST ON to control whether you filter before or after picking the latest row. (QuestDB)
- Fix query shape before scaling compute. Use EXPLAIN to catch bad join order or missing time pruning. (QuestDB)
Summary
If you master just these four patterns, you can ship fast, honest dashboards:
- SAMPLE BY … FILL to downsample without gaps,
- LATEST ON for last known state per key,
- ASOF/LT/SPLICE for time-aware joins that respect causality,
- EXPLAIN to keep yourself honest about performance.
They’re compact, expressive, and production-ready.
Call to action: Pick one production panel this week and rewrite its query using these patterns. Measure before/after with EXPLAIN and a wall-clock. If latency doesn’t drop, your data model—not your SQL—may be the real problem.
Internal link ideas (official docs)
- QuestDB SAMPLE BY & FILL reference. (QuestDB)
- QuestDB LATEST ON reference. (QuestDB)
- QuestDB JOIN reference (ASOF/LT/SPLICE + TOLERANCE). (QuestDB)
- QuestDB EXPLAIN keyword and blog deep-dive. (QuestDB)
Image prompt (for DALL·E / Midjourney)
“A crisp, minimalist time-series SQL diagram: three panels showing (1) SAMPLE BY with filled gaps, (2) LATEST ON selecting last row per device, (3) ASOF/LT/SPLICE joins aligning two timelines, plus a small EXPLAIN plan tree. Isometric, high-contrast, vector.”
Tags
#TimeSeries #SQL #QuestDB #DataEngineering #ASOFJoin #SAMPLEBY #LATESTON #DatabasePerformance #EXPLAIN #StreamingAnalytics