Query Store Playbook: Finding Regressions and Forcing the Right Plan in Azure SQL
Why this matters (a quick story)
Yesterday your API latency doubled after a harmless deployment. No code paths changed, but users are timing out. Classic. The optimizer picked a different plan, and your once-fast query regressed. Azure SQL’s Query Store is the black box flight recorder you use to (1) find that regression fast and (2) stabilize performance by forcing the good plan or applying a Query Store hint—without redeploying code. (Microsoft Learn)
What you’ll do (at a glance)
- Turn on & configure Query Store (if not already). (Microsoft Learn)
- Find the regressed query (Portal/SSMS + DMVs). (Microsoft Learn)
- Compare plans & pick the winner (evidence-based). (Microsoft Learn)
- Stabilize with Force Plan or Query Store Hints. (Microsoft Learn)
- Optionally enable Automatic Plan Correction to auto-revert bad flips. (Microsoft Learn)
Step 0 — Enable & tune Query Store (one-time)
T-SQL
-- Enable Query Store if needed
ALTER DATABASE [YourDb]
SET QUERY_STORE = ON
(
OPERATION_MODE = READ_WRITE,
CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30),
DATA_FLUSH_INTERVAL_SECONDS = 900,
MAX_STORAGE_SIZE_MB = 2048,
INTERVAL_LENGTH_MINUTES = 60,
QUERY_CAPTURE_MODE = AUTO -- or ALL for short windows
);
Query Store captures queries, plans, and runtime stats over time so you can see when a plan changed (and why your SLOs fell off a cliff). (Microsoft Learn)
Portal (optional): Azure SQL → Intelligent Performance → Query Performance Insight / Query Store for built-in reports. (Microsoft Learn)
Step 1 — Find the regression
A) Fast path via SSMS UI
- Open SSMS → connect to the database → Database → Query Store → Regressed Queries.
- Sort by Duration or CPU Delta. The chart shows when performance diverged and lists plans side-by-side. Use Plan Summary to visualize all plans for the query and their performance trend; this view also exposes a Force Plan button. (Microsoft Learn)
Screenshot cue:
Regressed Queries chart with time slider narrowed to “since yesterday”; Plan Summary pane showing 2+ plans with the “good” plan highlighted.
B) Scriptable path with DMVs
-- Top regressions: last 24 hours vs prior 24 hours (duration-focused example)
WITH rs AS (
SELECT p.plan_id, q.query_id, qt.query_sql_text,
SUM(CASE WHEN rs.runtime_stats_interval_id IN (
SELECT TOP (100000) runtime_stats_interval_id
FROM sys.query_store_runtime_stats_interval
WHERE start_time >= DATEADD(day,-1,SYSDATETIME())
) THEN rs.avg_duration * rs.count_executions ELSE 0 END) AS dur_last,
SUM(CASE WHEN rs.runtime_stats_interval_id IN (
SELECT TOP (100000) runtime_stats_interval_id
FROM sys.query_store_runtime_stats_interval
WHERE start_time < DATEADD(day,-1,SYSDATETIME())
AND start_time >= DATEADD(day,-2,SYSDATETIME())
) THEN rs.avg_duration * rs.count_executions ELSE 0 END) AS dur_prev
FROM sys.query_store_query q
JOIN sys.query_store_plan p ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats rs ON rs.plan_id = p.plan_id
JOIN sys.query_store_query_text qt ON q.query_text_id = qt.query_text_id
GROUP BY p.plan_id, q.query_id, qt.query_sql_text
)
SELECT TOP (20) query_id, plan_id, LEFT(query_sql_text, 200) AS sample_text,
dur_prev, dur_last, (dur_last - dur_prev) AS dur_delta
FROM rs
WHERE dur_prev > 0 AND dur_last > dur_prev * 1.5 -- 50% slower
ORDER BY dur_delta DESC;
You can also use official samples and patterns to pull “regressed” sets similar to the UI. (mssqltips.com)
Screenshot cue:
SSMS Results grid showing query_id, plan_id, sample_text, and dur_delta sorted descending.
Step 2 — Compare plans like a pro
Open the plans for the same query_id (either from the UI or by right-clicking the plan in Query Store → Top Resource Consuming Queries). Look for:
- Cardinality shifts (est vs actual rows)
- Index usage changes (seek→scan, missing include columns)
- Join type flips (hash vs nested loops)
- Parameter sniffing behavior differences
- Parallelism toggles and memory grants
The UI overlays runtime metrics over time so you can see exactly when a different plan took over. (Microsoft Learn)
Screenshot cue:
Two execution plans side-by-side; callouts on changed join type and index choice.
Step 3 — Stabilize the winner
Option 1: Force the known-good plan (quickest)
UI: In Regressed Queries → Plan Summary, select the better plan → Force Plan. (Microsoft Learn)
T-SQL:
-- Force a plan
EXEC sys.sp_query_store_force_plan @query_id = 12345, @plan_id = 67890;
-- Verify forcing state and location (incl. secondaries)
SELECT * FROM sys.query_store_plan WHERE plan_id = 67890;
SELECT * FROM sys.query_store_plan_forcing_locations WHERE plan_id = 67890;
-- Unforce if it backfires
EXEC sys.sp_query_store_unforce_plan @query_id = 12345, @plan_id = 67890;
Plan forcing pins a same or similar plan; in rare cases it can still diverge, so monitor and unforce if needed. Requires ALTER on the DB. (Microsoft Learn)
Option 2: Apply a Query Store Hint (surgical, code-free)
If the query needs a hinted shape (e.g., OPTIMIZE FOR, MAXDOP, join strategy) and you cannot change app code:
-- Find the query_id for a text fragment
SELECT TOP (1) q.query_id
FROM sys.query_store_query q
JOIN sys.query_store_query_text qt ON q.query_text_id = qt.query_text_id
WHERE qt.query_sql_text LIKE N'%FROM Sales.Orders WHERE CustomerId = @p%';
-- Add a hint (example: force newer optimizer, or set MAXDOP)
EXEC sys.sp_query_store_set_hints
@query_id = 12345,
@query_hints = N'OPTION (USE HINT(''QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_160''), MAXDOP 4)';
-- List current hints
SELECT * FROM sys.query_store_query_hints WHERE query_id = 12345;
-- Clear hints
EXEC sys.sp_query_store_clear_hints @query_id = 12345;
Query Store hints shape plans without changing code and can even target replica groups in QS-on-secondary scenarios. Use judiciously. (Microsoft Learn)
Option 3: Let Automatic Plan Correction help
For “it was good, then it got bad” flips, enable FORCE_LAST_GOOD_PLAN so Azure can auto-revert the regression and keep you stable while you fix root cause:
-- Enable at the database level (Azure SQL)
ALTER DATABASE [YourDb]
SET AUTOMATIC_TUNING ( FORCE_LAST_GOOD_PLAN = ON );
This uses Query Store history to detect and correct regressions automatically. You can manage it in the Azure Portal under Automatic tuning. (Microsoft Learn)
Step 4 — Validate impact & watch for drift
-- Before/after snapshot for the stabilized query
SELECT p.plan_id, rs.count_executions, rs.avg_duration, rs.avg_cpu_time, rs.last_execution_time
FROM sys.query_store_plan p
JOIN sys.query_store_runtime_stats rs ON rs.plan_id = p.plan_id
WHERE p.query_id = 12345
ORDER BY rs.last_execution_time DESC;
Expect lower variance and resettled medians. If you forced a plan, watch for schema/index changes that can invalidate it (the engine will unforce on certain incompatibilities). (Microsoft Learn)
Best practices & common pitfalls
Best practices
- Capture policy: Start with
QUERY_CAPTURE_MODE = AUTO; switch toALLduring short incident windows to ensure visibility. (Microsoft Learn) - Keep history sane: Set retention and size caps; monitor
MAX_STORAGE_SIZE_MBto avoid read-only fallback. (Microsoft Learn) - Prefer plan hints over plan forcing for chronic issues (e.g., parameter sensitivity) when code changes are hard. Track and review hints regularly. (Microsoft Learn)
- Automate guardrails: Turn on FORCE_LAST_GOOD_PLAN in production; it’s reversible and buys time. (Microsoft Learn)
- Document query_ids in your runbooks so incidents are minutes, not hours.
Pitfalls
- Blindly forcing: The “good” plan for one parameter can be awful for another; watch parameter skew. Unforce if regressions appear. (Microsoft Learn)
- Hint sprawl: Too many hints = fragile system. Hints override statement hints and plan guides; treat them as temporary crutches. (Microsoft Learn)
- Ignoring stats/indices: A force or hint masks root cause (stale stats, missing indexes). Fix fundamentals after stabilizing.
Comparison: Force Plan vs Query Store Hints vs Automatic Plan Correction
| Approach | When to Use | Pros | Cons |
|---|---|---|---|
| Force Plan | You have a known good prior plan and need instant stability | One click; reversible; no code change | Can drift; “similar” not identical; brittle across schema/index changes (Microsoft Learn) |
| Query Store Hints | You need to shape the optimizer (e.g., MAXDOP, join strategy, optimizer level) without code change | Surgical; survives recompiles; can target replicas | Requires careful selection; can accumulate; still a band-aid (Microsoft Learn) |
| Automatic Plan Correction | Frequent plan flips; want safety net | Autonomous revert to last good plan; portal-managed | Not a substitute for tuning; depends on QS telemetry and detection windows (Microsoft Learn) |
End-to-end example (snackable)
- Incident triage: Open Regressed Queries → filter last 24h → pick query with largest duration delta. (Microsoft Learn)
- Plan diff: Compare plans; identify regression cause (e.g., scan due to sniffed parameter). (Microsoft Learn)
- Stabilize:
- If a prior plan is clearly better → Force Plan. (Microsoft Learn)
- If shape is the issue → sp_query_store_set_hints with
OPTIMIZE FOR/MAXDOP/join hint. (Microsoft Learn)
- Guardrail: Enable FORCE_LAST_GOOD_PLAN at DB level. (Microsoft Learn)
- Root cause (later): refresh stats, add/fix indexes, adjust code.
Screenshot cues to include in your doc/runbook
- Regressed Queries view with plan summary
- Forced Plans report after forcing
- Automatic tuning blade showing FORCE_LAST_GOOD_PLAN = ON
Conclusion & takeaways
- Query Store is your timeline—it shows what changed and when.
- Force Plan for immediate relief; Query Store Hints for controlled shaping; Automatic Plan Correction for safety.
- Fix fundamentals after you stabilize: stats, indexes, parameterization, query shape.
If you don’t institutionalize these steps (and the screenshots), you’ll repeat the same firefights.
Internal link ideas (for your blog/wiki)
- “Parameter Sensitivity & PSR in Azure SQL”
- “Indexing Playbook: Seekability vs Write Costs”
- “Baseline & SLO Dashboards from Query Store DMVs”
- “When to Trust Automatic Tuning vs Manual Control”
Image prompt
“A clean, modern diagram showing Azure SQL Query Store flow: queries → plan A (fast) → plan B (slow) → detection in ‘Regressed Queries’ → decision fork to ‘Force Plan’ or ‘Apply Query Store Hint’ → monitoring loop. Minimalistic, high contrast, 3D isometric style.”
Tags
#AzureSQL #QueryStore #SQLServer #PerformanceTuning #PlanForcing #QueryStoreHints #AutomaticTuning #DatabaseReliability #SRE #DataEngineering
Sources
- Monitor performance & UI concepts for Query Store and plan-change detection. (Microsoft Learn)
- Usage scenarios & “Regressed Queries” + Force Plan in SSMS. (Microsoft Learn)
- sp_query_store_force_plan behavior/risks/permissions. (Microsoft Learn)
- Query Store Hints overview & procedures (set, clear, list). (Microsoft Learn)
- Automatic Tuning (FORCE_LAST_GOOD_PLAN) in Azure SQL. (Microsoft Learn)








Leave a Reply