SQL Server performance tuning is most effective when you can answer two questions quickly and defensibly: what is slow, and why. Dynamic Management Views (DMVs) are built-in, low-overhead instrumentation surfaces that expose SQL Server’s internal state—query execution statistics, wait events, memory clerks, file I/O latency, index usage, and more. They are not a single “performance tool” so much as a toolkit for building an evidence-based workflow.
This guide focuses on using DMVs for performance analysis in a way that fits how IT administrators and system engineers operate: you need repeatable steps, scripts that are safe to run on production, and interpretation guidance that avoids common misreads. The emphasis is on correlating signals across DMVs so you can distinguish CPU pressure from I/O stalls, or a missing index symptom from a parameter sensitivity issue.
A key theme throughout is that most DMVs are cumulative since the last restart (or since a database was brought online), and many are scoped by plan cache lifetime. That means DMVs reward disciplined baselining. You will see better results if you collect snapshots on a schedule and compare deltas.
How DMVs fit into a repeatable tuning workflow
DMVs are best used in a loop: you observe symptoms, collect evidence, validate hypotheses, apply a change, and then measure again. Because SQL Server systems are dynamic—workload patterns change by hour, new deployments modify query shapes, and maintenance jobs shift I/O—tuning based on a single snapshot is risky.
A pragmatic workflow for SQL Server performance tuning with DMVs typically looks like this:
First, establish a baseline of wait statistics, file I/O latencies, and top query resource consumers during a known “healthy” period. Then, when users report slowness (or monitoring detects SLO/SLA violations), capture the same set of snapshots and compare.
Second, triage at the instance level using wait statistics and host counters: are you CPU-bound, I/O-bound, or blocked? DMVs give you the SQL Server perspective; you should also correlate with OS-level signals (PerfMon on Windows or perf/sar on Linux), but DMVs are often enough to identify where to dig.
Third, drill down: identify the top expensive queries, correlate them with execution plans and indexes, and validate whether the problem is plan quality, missing/inefficient indexes, parameter sensitivity, spills to tempdb, or external waits.
Finally, implement changes conservatively. For example, an index change might reduce reads for one query but increase write overhead and log generation for the rest of the workload. DMVs help you see these trade-offs by tracking both query stats and index operational metrics.
Understanding DMV scope, reset behavior, and permissions
Before running scripts, it is worth understanding how to interpret what you see.
Many DMVs are cumulative. For example, sys.dm_os_wait_stats accumulates waits since the last SQL Server service start. sys.dm_io_virtual_file_stats accumulates I/O counters since the last start as well. Query stats in sys.dm_exec_query_stats persist only as long as the plan stays in cache; a restart, memory pressure, or plan cache eviction changes what is visible.
This reset behavior is why “top queries by total CPU” can be misleading immediately after a restart: the list may reflect startup activity, index maintenance, or a batch job that happened to run early.
Permissions also matter. Many server-scoped DMVs require VIEW SERVER STATE. Database-scoped views often require VIEW DATABASE STATE (or VIEW SERVER STATE for some contexts). In locked-down environments, you may need a controlled role granting these rights to a monitoring login rather than using sysadmin.
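As a minimal sketch, granting these rights to a dedicated monitoring principal might look like the following (the login and database names are illustrative):
sql
USE master;
GO
-- Server-scoped DMVs (sys.dm_os_*, sys.dm_exec_requests, and so on)
GRANT VIEW SERVER STATE TO [monitoring_login];
GO
-- Database-scoped DMVs in a specific database:
-- USE YourDatabase;
-- GRANT VIEW DATABASE STATE TO [monitoring_user];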
On Azure SQL Database and other managed offerings, you don’t get the full set of server-scoped DMVs. Many concepts still apply, but some scripts must be adapted to database-scoped equivalents.
Establishing a baseline with lightweight snapshots
A baseline is not a single number; it is a time series of the signals you’ll rely on. The minimal useful baseline for DMV-driven tuning usually includes:
Wait stats snapshot (so you can calculate deltas per interval rather than relying on cumulative totals).
File I/O latencies per database file (data and log), because storage is a frequent constraint.
Top query consumers by CPU, logical reads, and duration, because expensive queries are often the immediate cause of user-visible slowness.
If you already have monitoring, you can still benefit from a DMV baseline because it gives you SQL Server’s internal attribution, not just external symptoms.
A simple pattern is to store snapshots in a DBA database. The following example tables are intentionally minimal. In production you would add retention and indexing, but this shows the idea.
sql
-- Create a DBA database if you don't already have one
-- CREATE DATABASE DBA;
GO
USE DBA;
GO
CREATE TABLE dbo.WaitStatsSnapshot
(
SnapshotTime datetime2(0) NOT NULL,
wait_type nvarchar(60) NOT NULL,
waiting_tasks_count bigint NOT NULL,
wait_time_ms bigint NOT NULL,
signal_wait_time_ms bigint NOT NULL
);
CREATE TABLE dbo.IoFileStatsSnapshot
(
SnapshotTime datetime2(0) NOT NULL,
database_id int NOT NULL,
file_id int NOT NULL,
num_of_reads bigint NOT NULL,
num_of_bytes_read bigint NOT NULL,
io_stall_read_ms bigint NOT NULL,
num_of_writes bigint NOT NULL,
num_of_bytes_written bigint NOT NULL,
io_stall_write_ms bigint NOT NULL,
io_stall_ms bigint NOT NULL
);
To populate these tables, you can schedule a SQL Agent job (or an external scheduler) every 5 minutes. Keep the collection lightweight; you are capturing counters, not running complex joins.
sql
INSERT INTO DBA.dbo.WaitStatsSnapshot
(
SnapshotTime, wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
)
SELECT
SYSDATETIME(), wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats;
INSERT INTO DBA.dbo.IoFileStatsSnapshot
(
SnapshotTime, database_id, file_id,
num_of_reads, num_of_bytes_read, io_stall_read_ms,
num_of_writes, num_of_bytes_written, io_stall_write_ms,
io_stall_ms
)
SELECT
SYSDATETIME(),
vfs.database_id, vfs.file_id,
vfs.num_of_reads, vfs.num_of_bytes_read, vfs.io_stall_read_ms,
vfs.num_of_writes, vfs.num_of_bytes_written, vfs.io_stall_write_ms,
vfs.io_stall
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs;
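If you use SQL Agent, a minimal sketch of the collection job follows; the job, step, and schedule names are illustrative, and the step body would hold the two INSERT statements above.
sql
USE msdb;
GO
EXEC dbo.sp_add_job @job_name = N'DBA - DMV snapshots';
EXEC dbo.sp_add_jobstep
    @job_name = N'DBA - DMV snapshots',
    @step_name = N'Collect wait and file I/O snapshots',
    @subsystem = N'TSQL',
    @database_name = N'DBA',
    @command = N'/* the two INSERT ... SELECT statements shown above */';
EXEC dbo.sp_add_jobschedule
    @job_name = N'DBA - DMV snapshots',
    @name = N'Every 5 minutes',
    @freq_type = 4,             -- daily
    @freq_interval = 1,
    @freq_subday_type = 4,      -- subday unit: minutes
    @freq_subday_interval = 5;  -- every 5 minutes
EXEC dbo.sp_add_jobserver @job_name = N'DBA - DMV snapshots';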
Once you have snapshots, the tuning workflow becomes more reliable because you can compute per-interval deltas. That’s crucial for waits and I/O where a cumulative view hides “what just happened.”
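For example, the following sketch computes per-interval wait deltas with LAG (available since SQL Server 2012) against the DBA.dbo.WaitStatsSnapshot table defined earlier, discarding intervals that span a service restart:
sql
WITH Ordered AS
(
    SELECT
        SnapshotTime,
        wait_type,
        wait_time_ms,
        signal_wait_time_ms,
        LAG(wait_time_ms) OVER (PARTITION BY wait_type ORDER BY SnapshotTime) AS prev_wait_time_ms,
        LAG(signal_wait_time_ms) OVER (PARTITION BY wait_type ORDER BY SnapshotTime) AS prev_signal_wait_time_ms
    FROM DBA.dbo.WaitStatsSnapshot
)
SELECT TOP (20)
    SnapshotTime,
    wait_type,
    wait_time_ms - prev_wait_time_ms AS wait_delta_ms,
    signal_wait_time_ms - prev_signal_wait_time_ms AS signal_delta_ms
FROM Ordered
WHERE prev_wait_time_ms IS NOT NULL
  AND wait_time_ms >= prev_wait_time_ms   -- a decreasing counter means a restart; skip that interval
ORDER BY wait_delta_ms DESC;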
Triage: is the instance waiting, working, or blocked?
When a slowdown hits, your first job is to identify whether SQL Server is actively consuming CPU (working), stalled on resource waits (waiting), or stuck behind locks (blocked). You can do this quickly by combining a small number of DMVs.
Start with current requests and sessions. sys.dm_exec_requests shows what is executing right now; sys.dm_exec_sessions provides session metadata; and sys.dm_os_waiting_tasks shows waits at the task level.
sql
SELECT
r.session_id,
r.status,
r.command,
r.cpu_time,
r.total_elapsed_time,
r.reads,
r.writes,
r.logical_reads,
r.wait_type,
r.wait_time,
r.last_wait_type,
r.blocking_session_id,
DB_NAME(r.database_id) AS database_name
FROM sys.dm_exec_requests AS r
WHERE r.session_id <> @@SPID
ORDER BY r.total_elapsed_time DESC;
This view is not a “top queries” list; it is a “what is currently running” list. For a bursty workload, the culprit may come and go quickly, so it’s helpful to capture a few snapshots during the incident.
If you see many sessions with a non-zero blocking_session_id, shift immediately to blocking analysis. If you see many sessions waiting on I/O waits, the I/O path is a prime suspect. If you see a small number of sessions consuming large cpu_time with minimal waits, you may be CPU-bound or running inefficient query plans.
A helpful next check is task-level waits to see exactly what each request is waiting on.
sql
SELECT
wt.session_id,
wt.wait_type,
wt.wait_duration_ms,
wt.blocking_session_id,
wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
-- Restrict to user sessions; the old "session_id > 50" heuristic is unreliable on modern versions
WHERE wt.session_id IN
    (SELECT session_id FROM sys.dm_exec_sessions WHERE is_user_process = 1)
ORDER BY wt.wait_duration_ms DESC;
At this stage you are not fixing anything; you are classifying the incident into a path: waits, blocking, CPU, or I/O. The following sections expand each path.
Finding expensive queries with sys.dm_exec_query_stats
Once you know the instance is spending time doing work (or doing too much work), you need to identify which queries are responsible. sys.dm_exec_query_stats is the core DMV for this. It aggregates execution statistics per cached plan, exposing total worker time (CPU), total elapsed time, total logical reads, and execution counts.
Because it’s per cached plan, it can hide parameter sensitivity issues where the same query text has multiple plans, or where a single plan performs poorly only for certain parameter values. Still, it is the fastest way to find “hot” query patterns.
A practical approach is to look at the top queries by average resource usage and by total usage. Totals find the biggest overall consumers; averages find “one execution is expensive,” which often correlates with user complaints.
The query below finds the top cached statements by total CPU and includes query text and plan handle so you can pull the plan.
sql
SELECT TOP (25)
qs.total_worker_time / 1000.0 AS total_cpu_ms,
qs.total_elapsed_time / 1000.0 AS total_duration_ms,
qs.total_logical_reads AS total_logical_reads,
qs.execution_count,
(qs.total_worker_time / NULLIF(qs.execution_count,0)) / 1000.0 AS avg_cpu_ms,
(qs.total_elapsed_time / NULLIF(qs.execution_count,0)) / 1000.0 AS avg_duration_ms,
(qs.total_logical_reads / NULLIF(qs.execution_count,0)) AS avg_logical_reads,
DB_NAME(st.dbid) AS database_name,
SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
((CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
ELSE qs.statement_end_offset END - qs.statement_start_offset)/2)+1) AS statement_text,
qs.plan_handle,
qs.sql_handle
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
When you review the results, avoid the temptation to tune the top row immediately. Use it as a starting point for deeper inspection: retrieve the execution plan, check whether the query is doing large scans, doing key lookups at high frequency, spilling to tempdb, or executing many times with small inefficiencies that add up.
To pull the cached plan, use sys.dm_exec_query_plan.
sql
DECLARE @plan_handle varbinary(64);  -- set to a plan_handle value from the previous query
SELECT qp.query_plan
FROM sys.dm_exec_query_plan(@plan_handle) AS qp;
In SSMS, clicking the XML opens the graphical plan. In automation contexts, you can save the XML for offline analysis.
Real-world scenario 1: “CPU is high after a deployment”
Consider a line-of-business application where a deployment introduces a new filter on a reporting endpoint. Users report timeouts, and you observe sustained CPU at 85–95% on the SQL Server VM.
You capture sys.dm_exec_requests and see a handful of sessions with high cpu_time and minimal waits, suggesting CPU-bound execution. Query stats show a new statement near the top by total worker time, with an average CPU per execution far higher than other statements. The plan reveals a hash match and a scan of a large table, caused by a non-sargable predicate (for example, wrapping a column in a function) that prevents index seeks.
The DMV workflow leads you to a targeted fix: adjust the query or add a computed column/index that makes the predicate sargable. After the change, the same DMV query shows a drop in average logical reads and CPU per execution, and host CPU returns to baseline.
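To illustrate the pattern (the table and column names here are hypothetical), compare a non-sargable predicate with its sargable rewrite:
sql
-- Non-sargable: the function on the column blocks an index seek on OrderDate.
SELECT OrderId FROM dbo.Orders
WHERE YEAR(OrderDate) = 2024;

-- Sargable rewrite: a range predicate on the bare column allows a seek.
SELECT OrderId FROM dbo.Orders
WHERE OrderDate >= '20240101' AND OrderDate < '20250101';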
The important point is the DMVs didn’t just tell you “CPU is high.” They provided attribution to a specific cached statement and plan, making the remediation defensible.
Using Query Store alongside DMVs (when available)
Query Store is not a DMV, but it complements DMV-based analysis because it persists query performance history across restarts and plan evictions. DMVs are “what’s in cache now”; Query Store is “what happened over time.” If you are on SQL Server 2016+ and Query Store is enabled, use it to validate whether an expensive query is a new regression or an ongoing pattern.
A common tuning pattern is: DMVs identify the current top consumer; Query Store shows that the query’s average duration jumped after a specific deployment or stats update, which points to plan regression or parameter sensitivity.
Even if you rely on DMVs day-to-day, Query Store reduces the risk of losing evidence when the plan cache changes. It also supports forcing a plan, which is sometimes appropriate as a mitigation while you work on a long-term fix.
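As a sketch of that validation (the text filter is a placeholder), you can chart a statement's per-interval duration and plan changes from the Query Store catalog views:
sql
SELECT
    q.query_id,
    p.plan_id,
    rsi.start_time,
    rs.count_executions,
    rs.avg_duration / 1000.0 AS avg_duration_ms
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt
    ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan AS p
    ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON rs.plan_id = p.plan_id
JOIN sys.query_store_runtime_stats_interval AS rsi
    ON rsi.runtime_stats_interval_id = rs.runtime_stats_interval_id
WHERE qt.query_sql_text LIKE N'%YourQueryFragment%'   -- placeholder filter
ORDER BY q.query_id, p.plan_id, rsi.start_time;
A duration jump that lines up with a new plan_id is a strong sign of plan regression rather than data growth.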
Interpreting wait statistics with sys.dm_os_wait_stats
Wait statistics answer: “Where is SQL Server spending time waiting?” A wait occurs when a worker thread cannot proceed because it needs a resource (I/O completion, lock, latch, memory grant, CPU scheduling, etc.). Waits are not automatically “bad,” but changes in wait profile are often the quickest way to identify bottlenecks.
The primary DMV is sys.dm_os_wait_stats. It includes both resource waits and signal waits. Signal wait time is time spent waiting to get scheduled on CPU after the resource is available; high signal wait can indicate CPU pressure.
Because the DMV is cumulative, you should either baseline and compute deltas, or periodically reset it in controlled windows. Resetting with DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR); is disruptive to long-term trending and should be done only if you have an alternate history mechanism. In most environments, snapshotting is safer.
A standard first query is to exclude benign or idle waits (sleep, background tasks) and rank by total wait time.
sql
WITH Waits AS
(
SELECT
wait_type,
wait_time_ms,
signal_wait_time_ms,
waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN
(
'SLEEP_TASK','SLEEP_SYSTEMTASK','BROKER_TASK_STOP','BROKER_TO_FLUSH',
'SQLTRACE_BUFFER_FLUSH','CLR_AUTO_EVENT','CLR_MANUAL_EVENT','LAZYWRITER_SLEEP',
'XE_TIMER_EVENT','XE_DISPATCHER_WAIT','BROKER_EVENTHANDLER','CHECKPOINT_QUEUE',
'REQUEST_FOR_DEADLOCK_SEARCH','LOGMGR_QUEUE','FT_IFTS_SCHEDULER_IDLE_WAIT'
)
)
SELECT TOP (20)
wait_type,
wait_time_ms/1000.0 AS wait_s,
signal_wait_time_ms/1000.0 AS signal_wait_s,
waiting_tasks_count,
(wait_time_ms - signal_wait_time_ms)/1000.0 AS resource_wait_s
FROM Waits
ORDER BY wait_time_ms DESC;
Interpreting waits requires context. For example:
PAGEIOLATCH_* often indicates data page reads waiting on storage (or a memory pressure pattern causing more reads).
WRITELOG indicates waiting for log flushes; this points to log disk latency, excessive log generation, or commit frequency.
CXPACKET/CXCONSUMER are parallelism-related waits; they can be benign or can indicate skew or over-parallelization.
LCK_M_* indicates lock waits (blocking), which is an application and indexing/design concern.
SOS_SCHEDULER_YIELD suggests CPU pressure where workers yield frequently.
A good habit is to correlate waits with current activity. If WRITELOG is high in the delta, check sys.dm_io_virtual_file_stats for log file write latency and whether a batch job is generating large log volume.
Moving from waits to root cause: CPU, I/O, locks, or memory
Wait stats are a compass, not a map. Once you see the dominant waits, you need supporting evidence to avoid chasing noise.
If CPU-related waits dominate (high signal waits, high runnable tasks), look at scheduler health in sys.dm_os_schedulers. If I/O waits dominate, check file latency and read/write patterns. If lock waits dominate, identify the blocker chain and transaction scope. If memory grant waits show up, examine queries that request large memory grants and spill.
The sections that follow build those supporting checks, starting with CPU and scheduling.
CPU and scheduler pressure: sys.dm_os_schedulers and worker signals
SQL Server schedules work on schedulers (roughly aligned to logical CPUs, excluding hidden schedulers). When CPU is saturated or workers are queued, query latency rises even if I/O is fine.
sys.dm_os_schedulers helps you see runnable queues and whether tasks are waiting to get CPU.
sql
SELECT
scheduler_id,
status,
current_tasks_count,
runnable_tasks_count,
active_workers_count,
load_factor
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';
A sustained runnable_tasks_count greater than 0 across multiple schedulers suggests CPU pressure. It pairs with increased SOS_SCHEDULER_YIELD waits and higher signal wait time in sys.dm_os_wait_stats.
At this point, you return to sys.dm_exec_query_stats to find top CPU consumers and validate whether the CPU usage comes from a few bad queries, from compilation/recompilation, or from an expected workload increase. DMVs can also help with compilation pressure via sys.dm_exec_query_optimizer_info (less commonly used) and by spotting waits such as RESOURCE_SEMAPHORE_QUERY_COMPILE, which indicate queries queuing for compile memory.
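For a quick look at optimizer activity, you can snapshot sys.dm_exec_query_optimizer_info and diff the occurrence counts between two captures; the counters are cumulative since startup:
sql
SELECT counter, occurrence, value
FROM sys.dm_exec_query_optimizer_info
WHERE counter IN (N'optimizations', N'elapsed time', N'final cost');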
If the workload is genuinely CPU-heavy, you may need scaling (more cores), but in many cases a single inefficient query or missing index is responsible.
Storage and file I/O latency: sys.dm_io_virtual_file_stats
When wait stats show PAGEIOLATCH_* or WRITELOG, you need file-level evidence. sys.dm_io_virtual_file_stats returns cumulative I/O counts and stalls per file. You can join it to sys.master_files to get file paths and types.
sql
SELECT
DB_NAME(vfs.database_id) AS database_name,
mf.type_desc,
mf.physical_name,
vfs.num_of_reads,
vfs.io_stall_read_ms,
CASE WHEN vfs.num_of_reads = 0 THEN 0
ELSE vfs.io_stall_read_ms * 1.0 / vfs.num_of_reads END AS avg_read_ms,
vfs.num_of_writes,
vfs.io_stall_write_ms,
CASE WHEN vfs.num_of_writes = 0 THEN 0
ELSE vfs.io_stall_write_ms * 1.0 / vfs.num_of_writes END AS avg_write_ms,
vfs.size_on_disk_bytes / 1024 / 1024 AS size_mb
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
ON vfs.database_id = mf.database_id
AND vfs.file_id = mf.file_id
ORDER BY (vfs.io_stall_read_ms + vfs.io_stall_write_ms) DESC;
Interpreting latency depends on your storage and workload, but as a rough operational heuristic, sustained average latencies in the tens of milliseconds for data reads can be problematic for OLTP, and log writes above a few milliseconds can become visible as commit delays. The key is comparing to your baseline and correlating with waits.
If you see high avg_write_ms on log files with WRITELOG waits, you have a coherent story: commits are waiting for slow log flush. If you see high data read latency and PAGEIOLATCH_*, you may have storage throughput constraints or memory pressure causing excessive reads.
Real-world scenario 2: “WRITELOG spikes during nightly processing”
A common incident pattern is a stable daytime workload but severe slowdowns during nightly ETL or index maintenance. Users might not be online, but downstream systems time out.
DMV snapshots show a sharp increase in WRITELOG wait deltas during the job window. sys.dm_io_virtual_file_stats deltas show log file write stall rising to 15–25 ms average. The ETL batch performs frequent small transactions, causing high log flush frequency.
The remediation might involve moving the log file to faster storage, batching commits, adjusting recovery model for the ETL database if acceptable, or reducing concurrent writers during that window. The DMVs help you validate which change worked by showing that WRITELOG deltas drop and log write latency returns to baseline.
Memory pressure signals: clerks, grants, and page life
Memory issues show up in different ways: buffer pool pressure (more physical reads), memory grant pressure (queries waiting for workspace memory), or overall OS pressure causing SQL Server to shrink its working set.
DMVs provide several views into this:
sys.dm_os_process_memory shows process memory and whether SQL Server is in a low-memory condition.
sys.dm_os_memory_clerks shows which components are consuming memory.
sys.dm_exec_query_memory_grants shows active memory grants and who is waiting.
Start with process memory to see if SQL Server believes it is under pressure.
sql
SELECT
physical_memory_in_use_kb / 1024 AS physical_memory_in_use_mb,
locked_page_allocations_kb / 1024 AS locked_pages_mb,
total_virtual_address_space_kb / 1024 AS vas_mb,
available_commit_limit_kb / 1024 AS os_commit_limit_mb,
process_physical_memory_low,
process_virtual_memory_low
FROM sys.dm_os_process_memory;
If process_physical_memory_low is 1, SQL Server is detecting low physical memory available. That often correlates with OS-level pressure from other processes or misconfigured max server memory.
Next, look at memory clerks to understand where SQL Server is spending memory.
sql
SELECT TOP (20)
type,
pages_kb / 1024 AS pages_mb
FROM sys.dm_os_memory_clerks
ORDER BY pages_kb DESC;
For query tuning, memory grants are especially important. A memory grant is workspace memory used for sorts, hashes, and some batch operations. If many queries request large grants, concurrency can collapse: new queries wait on RESOURCE_SEMAPHORE while existing queries hold memory.
sql
SELECT
mg.session_id,
mg.requested_memory_kb,
mg.granted_memory_kb,
mg.used_memory_kb,
mg.max_used_memory_kb,
mg.wait_time_ms,
mg.is_next_candidate,
mg.dop,
DB_NAME(r.database_id) AS database_name,
r.status,
r.command
FROM sys.dm_exec_query_memory_grants AS mg
LEFT JOIN sys.dm_exec_requests AS r
ON mg.session_id = r.session_id
ORDER BY mg.requested_memory_kb DESC;
If you see many sessions with non-zero wait_time_ms, the workload is waiting for memory grants. That’s often driven by poor cardinality estimates (the optimizer requests too much), missing indexes leading to hash joins/sorts, or parameter sensitivity. The fix is usually query and index work, sometimes alongside configuration (for example, ensuring sufficient max server memory and avoiding external memory pressure).
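To see the grant queue directly, sys.dm_exec_query_resource_semaphores summarizes workspace memory per resource pool; a sustained non-zero waiter_count corroborates RESOURCE_SEMAPHORE pressure:
sql
SELECT
    pool_id,
    resource_semaphore_id,
    target_memory_kb,
    available_memory_kb,
    granted_memory_kb,
    grantee_count,
    waiter_count,
    timeout_error_count,
    forced_grant_count
FROM sys.dm_exec_query_resource_semaphores;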
tempdb contention and usage with DMVs
tempdb is a shared system database used for sorts, hashes, row versioning, temporary objects, and internal worktables. Performance issues related to tempdb can be caused by I/O latency, allocation contention, or workload patterns that spill.
To see tempdb space usage by session and request, use sys.dm_db_session_space_usage and sys.dm_db_task_space_usage. These are invaluable when tempdb grows unexpectedly.
sql
SELECT TOP (50)
ssu.session_id,
(ssu.user_objects_alloc_page_count - ssu.user_objects_dealloc_page_count) * 8 AS user_objects_kb,
(ssu.internal_objects_alloc_page_count - ssu.internal_objects_dealloc_page_count) * 8 AS internal_objects_kb
FROM sys.dm_db_session_space_usage AS ssu
ORDER BY (ssu.user_objects_alloc_page_count - ssu.user_objects_dealloc_page_count
+ ssu.internal_objects_alloc_page_count - ssu.internal_objects_dealloc_page_count) DESC;
If tempdb contention is suspected, wait stats may show PAGELATCH_* waits on tempdb allocation pages (classically PAGELATCH_UP or PAGELATCH_EX on PFS, GAM, and SGAM pages, with resource descriptions such as 2:1:1). Modern SQL Server versions have improved tempdb allocation behavior, but configuration still matters: multiple tempdb data files of equal size, appropriate growth settings, and fast storage.
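A quick way to verify that configuration (file count, sizes, and growth settings) is to query tempdb's file catalog:
sql
SELECT
    name,
    type_desc,
    size * 8 / 1024 AS size_mb,
    CASE WHEN is_percent_growth = 1
         THEN CAST(growth AS varchar(10)) + N'%'
         ELSE CAST(growth * 8 / 1024 AS varchar(10)) + N' MB'
    END AS growth_setting
FROM tempdb.sys.database_files;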
When you identify sessions consuming large internal object space, connect it back to the query: large sorts and hashes may indicate missing indexes or underestimated cardinalities causing spills.
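To make that connection while the workload is running, a sketch over sys.dm_db_task_space_usage (mentioned above) ties allocations to the executing statement:
sql
-- Task-level tempdb allocations for currently executing requests,
-- joined back to the running statement (run while the incident is live).
WITH TaskAlloc AS
(
    SELECT
        session_id,
        request_id,
        SUM(internal_objects_alloc_page_count) * 8 AS internal_alloc_kb,
        SUM(user_objects_alloc_page_count) * 8 AS user_alloc_kb
    FROM sys.dm_db_task_space_usage
    GROUP BY session_id, request_id
)
SELECT TOP (20)
    ta.session_id,
    ta.internal_alloc_kb,
    ta.user_alloc_kb,
    st.text AS running_sql
FROM TaskAlloc AS ta
JOIN sys.dm_exec_requests AS r
    ON r.session_id = ta.session_id AND r.request_id = ta.request_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
ORDER BY ta.internal_alloc_kb DESC;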
Real-world scenario 3: “tempdb grows and latency spikes during a reporting run”
Imagine a shared SQL Server where OLTP is fine most of the day, but a mid-morning reporting job causes tempdb to grow rapidly and the whole instance becomes sluggish.
During the event, sys.dm_exec_query_memory_grants shows several large grants in use, and wait deltas show RESOURCE_SEMAPHORE and PAGEIOLATCH_*. sys.dm_db_session_space_usage reveals one session allocating a large amount of internal objects. Pulling the plan for that session’s query shows a large sort with a spill to tempdb, driven by a missing index on the report’s join/filter columns.
After adding an appropriate index (or rewriting the report to pre-aggregate), tempdb usage returns to normal and the reporting job stops impacting OLTP. The DMVs provided a chain of evidence from symptom (tempdb growth) to mechanism (spill) to cause (index/query shape).
Blocking and lock waits: sys.dm_tran_locks and active requests
Blocking is one of the most user-visible forms of slowness: queries appear “stuck,” often waiting on LCK_M_* waits. DMVs can identify the blocker, the blocked sessions, and what resources are involved.
Start by finding blocked requests.
sql
SELECT
r.session_id,
r.status,
r.wait_type,
r.wait_time,
r.blocking_session_id,
DB_NAME(r.database_id) AS database_name,
r.command,
r.cpu_time,
r.total_elapsed_time
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0
ORDER BY r.wait_time DESC;
Once you know the blocking session id, retrieve its SQL text and transaction state.
sql
DECLARE @blocker_session_id int;  -- set to the blocking_session_id found above
SELECT
s.session_id,
s.login_name,
s.host_name,
s.program_name,
s.status,
s.open_transaction_count,
t.transaction_id,
t.transaction_begin_time,
t.transaction_state,
t.transaction_type
FROM sys.dm_exec_sessions AS s
LEFT JOIN sys.dm_tran_session_transactions AS st
ON s.session_id = st.session_id
LEFT JOIN sys.dm_tran_active_transactions AS t
ON st.transaction_id = t.transaction_id
WHERE s.session_id = @blocker_session_id;
SELECT st.text
FROM sys.dm_exec_connections AS c
CROSS APPLY sys.dm_exec_sql_text(c.most_recent_sql_handle) AS st
WHERE c.session_id = @blocker_session_id;
To see what locks are held and requested, use sys.dm_tran_locks. This is often noisy, but it helps when you need to understand whether the blocker holds an X lock on a key, page, or object.
sql
DECLARE @blocker_session_id int;  -- set to the blocking_session_id found above
SELECT
tl.request_session_id AS session_id,
tl.resource_type,
tl.resource_database_id,
tl.resource_associated_entity_id,
tl.request_mode,
tl.request_status
FROM sys.dm_tran_locks AS tl
WHERE tl.request_session_id IN (@blocker_session_id)
ORDER BY tl.resource_type, tl.request_mode;
Blocking resolution is often about transaction scope and isolation level. Long-running transactions (including those left open by application bugs) can block a large portion of the workload. Indexing can help reduce lock footprints by enabling seeks instead of scans, but the root fix may be in application transaction management.
If row versioning isolation (READ COMMITTED SNAPSHOT or SNAPSHOT) is enabled, reader-writer blocking can be reduced, but you must then monitor tempdb version store usage; DMVs help there too.
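A minimal sketch of that monitoring sums version store pages in tempdb (SQL Server 2017+ also offers sys.dm_tran_version_store_space_usage for per-database attribution):
sql
SELECT
    SUM(version_store_reserved_page_count) * 8 / 1024 AS version_store_mb,
    SUM(internal_object_reserved_page_count) * 8 / 1024 AS internal_objects_mb,
    SUM(user_object_reserved_page_count) * 8 / 1024 AS user_objects_mb
FROM tempdb.sys.dm_db_file_space_usage;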
Index usage and missing index signals
Indexes are a frequent tuning lever, but DMV-based index tuning must be approached carefully. You are balancing read performance against write overhead and maintenance cost.
Two DMV families help:
sys.dm_db_index_usage_stats shows how indexes have been used (seeks, scans, lookups, updates) since last restart.
The “missing index” DMVs (sys.dm_db_missing_index_*) suggest index patterns that could improve query cost.
Index usage stats: identifying unused or heavily updated indexes
sys.dm_db_index_usage_stats is useful for spotting indexes that are updated frequently but rarely used, and for validating whether a new index is actually being used. Because it resets on restart, it’s best used with a long enough observation window.
sql
SELECT
DB_NAME(ius.database_id) AS database_name,
OBJECT_NAME(ius.object_id, ius.database_id) AS table_name,
i.name AS index_name,
i.index_id,
ius.user_seeks,
ius.user_scans,
ius.user_lookups,
ius.user_updates,
ius.last_user_seek,
ius.last_user_scan,
ius.last_user_lookup
FROM sys.dm_db_index_usage_stats AS ius
JOIN sys.indexes AS i
ON i.object_id = ius.object_id
AND i.index_id = ius.index_id
WHERE ius.database_id = DB_ID()
ORDER BY ius.user_updates DESC;
High user_updates with low seeks/scans can indicate an index that adds write cost without delivering much value. However, “unused” does not always mean “safe to drop”: some indexes support rare but critical queries, constraints, or operational processes. You should correlate with query stats and application requirements before removing anything.
Missing index DMVs: using recommendations responsibly
SQL Server collects missing index hints during optimization and exposes them via DMVs. These are suggestions, not prescriptions. They are also limited: they don’t consider existing indexes comprehensively, they can recommend overlapping indexes, and they don’t include maintenance cost.
Still, they are valuable when you correlate them with top expensive queries.
sql
SELECT TOP (25)
migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) AS improvement_measure,
DB_NAME(mid.database_id) AS database_name,
OBJECT_NAME(mid.object_id, mid.database_id) AS table_name,
mid.equality_columns,
mid.inequality_columns,
mid.included_columns,
migs.user_seeks,
migs.user_scans,
migs.avg_user_impact
FROM sys.dm_db_missing_index_group_stats AS migs
JOIN sys.dm_db_missing_index_groups AS mig
ON migs.group_handle = mig.index_group_handle
JOIN sys.dm_db_missing_index_details AS mid
ON mig.index_handle = mid.index_handle
ORDER BY improvement_measure DESC;
Treat the output as a backlog of hypotheses. Before creating an index, check whether an existing index already covers the key columns (perhaps with a different column order), whether the table is write-heavy, and whether the suggested included columns would make the index excessively wide.
A common safe pattern is to create a targeted nonclustered index that supports a specific top query and then validate improvement with sys.dm_exec_query_stats and sys.dm_db_index_usage_stats. If the index does not get used, you can remove it.
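For example (the table, column, and index names here are purely illustrative), the create-and-validate cycle might look like this:
sql
-- Hypothetical covering index for one specific top query.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId_OrderDate
ON dbo.Orders (CustomerId, OrderDate)
INCLUDE (OrderStatus);
GO
-- Later, confirm the index is actually being used (run in the target database).
SELECT ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates
FROM sys.dm_db_index_usage_stats AS ius
JOIN sys.indexes AS i
    ON i.object_id = ius.object_id AND i.index_id = ius.index_id
WHERE ius.database_id = DB_ID()
  AND i.name = N'IX_Orders_CustomerId_OrderDate';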
Plan quality: spills, warnings, and parameter sensitivity signals
DMVs don’t replace execution plan analysis, but they can tell you where plan quality problems are likely.
If you see high logical reads, large elapsed time, or large memory grants, pull plans for those statements and look for:
Spills to tempdb (sort/hash warnings), which correlate with tempdb usage and memory grant issues.
Key lookups executed many times, which can be resolved by covering indexes or query changes.
Implicit conversions, which can prevent index seeks and lead to scans.
Parameter sensitivity (sometimes called parameter sniffing), where the cached plan is optimal for one parameter value but terrible for another.
DMVs can help with parameter sensitivity indirectly by showing high variance between average and max times. sys.dm_exec_query_stats includes max_elapsed_time and max_worker_time. High max compared to average can suggest skew.
sql
SELECT TOP (25)
qs.execution_count,
qs.total_elapsed_time/1000.0 AS total_duration_ms,
qs.max_elapsed_time/1000.0 AS max_duration_ms,
(qs.total_elapsed_time/NULLIF(qs.execution_count,0))/1000.0 AS avg_duration_ms,
qs.total_logical_reads,
qs.max_logical_reads,
DB_NAME(st.dbid) AS database_name,
SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
((CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
ELSE qs.statement_end_offset END - qs.statement_start_offset)/2)+1) AS statement_text,
qs.plan_handle
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.max_elapsed_time DESC;
From here, the plan inspection and Query Store history (if enabled) help determine whether a plan regression or parameter sensitivity is at play. Fixes can include updating statistics, rewriting predicates, or using targeted query hints sparingly. In modern SQL Server versions, features like the Query Store and automatic tuning can help, but they still require operational oversight.
Database-level operational stats: log usage and file growth signals
Performance incidents sometimes trace back to operational events: log growth, file autogrowth stalls, or stalled checkpoints. DMVs can help identify these patterns.
To see log space usage per database, use sys.dm_db_log_space_usage (database-scoped) where available.
sql
SELECT
DB_NAME() AS database_name,
total_log_size_in_bytes/1024/1024 AS total_log_mb,
used_log_space_in_bytes/1024/1024 AS used_log_mb,
used_log_space_in_percent
FROM sys.dm_db_log_space_usage;
In multi-database instances you can iterate databases to collect this into a central table. This helps you correlate WRITELOG waits with log saturation or growth.
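One way to do that iteration is to build and execute a per-database batch; this is a sketch, and the DBA.dbo.LogSpaceSnapshot table is hypothetical, mirroring the columns above:
sql
DECLARE @sql nvarchar(max) = N'';
SELECT @sql += N'
USE ' + QUOTENAME(name) + N';
INSERT INTO DBA.dbo.LogSpaceSnapshot
    (SnapshotTime, database_name, total_log_mb, used_log_mb, used_log_percent)
SELECT SYSDATETIME(), DB_NAME(),
       total_log_size_in_bytes/1024/1024,
       used_log_space_in_bytes/1024/1024,
       used_log_space_in_percent
FROM sys.dm_db_log_space_usage;'
FROM sys.databases
WHERE state_desc = N'ONLINE';
EXEC sys.sp_executesql @sql;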
Autogrowth events themselves are not directly exposed as a simple DMV counter (the default trace records them when it is enabled), but you can infer their impact from sudden changes in file size and coincident latency increases. Proactively sizing files and using sensible growth increments is one of the most effective ways to avoid unpredictable stalls.
Building an incident capture script (production-safe)
During an incident, speed matters. Having a single script that captures key DMVs into a timestamped output reduces the chance of missing transient evidence.
The goal is not to run a “kitchen sink” script that adds load. The goal is to capture:
Current executing requests (what is running right now).
Waiting tasks (what is waiting and on what).
Top cached query stats (what has been expensive recently).
Wait stats (current cumulative; ideally you’ll compute deltas against a prior snapshot).
File I/O stats.
Here is a compact capture you can run in SSMS and save results. It avoids expensive plan retrieval for every row.
sql
-- 1) Current requests
SELECT
SYSDATETIME() AS capture_time,
r.session_id,
r.status,
r.command,
DB_NAME(r.database_id) AS database_name,
r.cpu_time,
r.total_elapsed_time,
r.reads,
r.writes,
r.logical_reads,
r.wait_type,
r.wait_time,
r.blocking_session_id
FROM sys.dm_exec_requests AS r
WHERE r.session_id <> @@SPID;
-- 2) Waiting tasks
SELECT
SYSDATETIME() AS capture_time,
wt.session_id,
wt.wait_type,
wt.wait_duration_ms,
wt.blocking_session_id,
wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
-- Restrict to user sessions; the old "session_id > 50" heuristic is unreliable on modern versions
WHERE wt.session_id IN
    (SELECT session_id FROM sys.dm_exec_sessions WHERE is_user_process = 1);
-- 3) Top cached query stats by CPU
SELECT TOP (20)
SYSDATETIME() AS capture_time,
qs.total_worker_time/1000.0 AS total_cpu_ms,
qs.execution_count,
(qs.total_worker_time/NULLIF(qs.execution_count,0))/1000.0 AS avg_cpu_ms,
qs.total_logical_reads,
DB_NAME(st.dbid) AS database_name,
SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
((CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
ELSE qs.statement_end_offset END - qs.statement_start_offset)/2)+1) AS statement_text,
qs.plan_handle
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
-- 4) Wait stats (top)
SELECT TOP (30)
SYSDATETIME() AS capture_time,
wait_type,
wait_time_ms,
signal_wait_time_ms,
waiting_tasks_count
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;
-- 5) File I/O stats
SELECT
SYSDATETIME() AS capture_time,
DB_NAME(vfs.database_id) AS database_name,
mf.type_desc,
mf.physical_name,
vfs.num_of_reads,
vfs.io_stall_read_ms,
vfs.num_of_writes,
vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
ON vfs.database_id = mf.database_id
AND vfs.file_id = mf.file_id;
For operations teams, it’s often helpful to wrap this in PowerShell so you can run it consistently and store output files with timestamps.
powershell
# Requires SqlServer module
# Install-Module SqlServer
$server = "SQLPROD01"
$outDir = "C:\DBA\captures"
New-Item -ItemType Directory -Force -Path $outDir | Out-Null
$ts = Get-Date -Format "yyyyMMdd-HHmmss"
$outFile = Join-Path $outDir "dmv-capture-$server-$ts.csv"
$query = @"
SET NOCOUNT ON;
SELECT 'requests' AS dataset, * FROM (
SELECT
SYSDATETIME() AS capture_time,
r.session_id,
r.status,
r.command,
DB_NAME(r.database_id) AS database_name,
r.cpu_time,
r.total_elapsed_time,
r.reads,
r.writes,
r.logical_reads,
r.wait_type,
r.wait_time,
r.blocking_session_id
FROM sys.dm_exec_requests AS r
WHERE r.session_id <> @@SPID
) d
UNION ALL
SELECT 'waiting_tasks' AS dataset, * FROM (
SELECT
SYSDATETIME() AS capture_time,
wt.session_id,
CAST(NULL AS nvarchar(60)) AS status,
wt.wait_type AS command,
CAST(NULL AS sysname) AS database_name,
CAST(NULL AS int) AS cpu_time,
wt.wait_duration_ms AS total_elapsed_time,
CAST(NULL AS bigint) AS reads,
CAST(NULL AS bigint) AS writes,
CAST(NULL AS bigint) AS logical_reads,
wt.wait_type,
wt.wait_duration_ms AS wait_time,
wt.blocking_session_id
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.session_id IN
    (SELECT session_id FROM sys.dm_exec_sessions WHERE is_user_process = 1)
) d;
"@
Invoke-Sqlcmd -ServerInstance $server -Database master -Query $query |
Export-Csv -NoTypeInformation -Path $outFile
Write-Host "Saved $outFile"
This PowerShell example flattens different datasets into a single CSV for convenience by aligning columns, which is imperfect but operationally practical. In a more robust implementation, you would export separate files or write to dedicated tables.
Correlating DMVs: turning signals into decisions
A common failure mode in SQL Server performance tuning is to treat each DMV as a standalone truth. In practice, you get reliable answers when you correlate.
If wait deltas show PAGEIOLATCH_* and file stats show elevated read latency on a subset of data files, that’s a strong I/O story. From there you decide whether the fix is storage, missing indexes, or memory sizing.
If wait deltas show lock waits and sys.dm_exec_requests shows many blocked sessions behind one blocker, you focus on transaction scope and indexing to reduce lock footprints.
If wait deltas show high signal waits and schedulers show runnable queues, you are CPU constrained; then you use query stats to identify top CPU consumers and determine whether tuning or scaling is appropriate.
If you see RESOURCE_SEMAPHORE waits and sys.dm_exec_query_memory_grants shows many waiting grants, you likely have memory grant pressure. That leads you back to specific queries and plans (hashes/sorts), and often to indexing improvements or stats/plan stability work.
This correlation discipline is what makes DMVs a practical tool for system engineers: you can explain the chain from symptom to mechanism to root cause.
Practical query tuning patterns driven by DMV findings
Once you have identified candidate queries, a few tuning patterns come up repeatedly.
When logical reads are high, the plan often shows scans where a seek is possible. Adding or adjusting an index (or rewriting predicates to be sargable) is usually the highest-impact fix.
When CPU is high but reads are moderate, look for expensive joins, scalar UDF usage (on older compatibility levels), or row-by-row patterns. Plans may show nested loops with high row counts or compute scalars evaluated many times.
When duration is high but CPU is modest, and waits point to I/O or locks, query changes may not help as much as addressing the underlying resource bottleneck.
It is also important to validate improvements with the same DMVs you used to identify the problem. For example, after an index change you should see:
Reduced logical reads and duration for the target statement in sys.dm_exec_query_stats.
Increased seek usage for the new index in sys.dm_db_index_usage_stats.
Potentially changed wait profile at the instance level during the affected workload window.
If you don’t see these changes, the fix may not be effective—or the workload may be using a different query shape than you tested.
Avoiding common DMV misinterpretations
DMVs are powerful, but there are several common traps.
First, confusing cumulative totals with “current.” Wait stats and I/O stats accumulate since restart. Always compare to a prior snapshot or compute deltas.
Second, tuning based solely on missing index DMVs. These DMVs can recommend redundant or overly wide indexes. Use them to inform, not to automate.
Third, assuming sys.dm_exec_query_stats shows everything. It shows cached plans only. If the workload uses ad hoc queries with frequent recompilation or if plans are evicted, you may miss important queries. Query Store can fill this gap.
Fourth, focusing on a single wait type without context. For example, CXPACKET/CXCONSUMER waits can appear in healthy parallel workloads. The question is whether parallelism is helping or hurting, and whether there is skew.
Finally, ignoring the write side of performance. An index that improves a read-heavy report might harm an OLTP insert/update workload. Always check user_updates and overall log/I/O patterns after changes.
Operationalizing DMV-based tuning in production
To make DMVs part of routine operations rather than an emergency tool, incorporate them into monitoring and change management.
Start by scheduling baseline snapshots for waits and file I/O, as shown earlier, and retain enough history to compare weekdays vs weekends and month-end vs normal periods. Even a week of 5-minute snapshots can reveal patterns.
When an incident occurs, capture the “current activity” DMVs (sys.dm_exec_requests, sys.dm_os_waiting_tasks, memory grants) immediately, because these are transient.
As you implement fixes, record what you changed (query rewrite, index add, configuration adjustment) and then use DMV deltas to validate. Over time, you build a playbook: you know what a storage-latency incident looks like in your wait profile and file stats, and you know which queries are the typical offenders.
If you have multiple SQL Server instances, standardizing these scripts and storing snapshots centrally makes cross-instance comparison possible. That is especially useful for identifying whether an issue is workload-specific or infrastructure-wide (for example, a shared storage problem).