Wait Stats Analysis – Data Acquisition
Sunday, November 2, 2008 at 5:04PM











The above are just some of the examples of charts based on the Wait Stats data
(click the thumbnails to see the full size versions on flickr.com)
Description:
SQL Server Dynamic Management Queries provide in-depth details about the waits and wait types that are occurring for internally running sql processes at any given time, as well as over a past period of time since the statistics were cleared or the server restarted. In many administrative functions for SQL Server the identification and resolution of bottle-necks or other performance impeding issues threatening a stable running system are high priority functions and are the center point of effectiveness for the DBA.
Wait statistics in SQL Server is a great way to see what SQL Server is waiting on while it is trying to do what it is asked of it. These waits identify directly the resources in contention and give great pointers where to look for the exact T-SQL or application causing the waits. I use this on a daily basis on all my SQL Servers to augment and provide valuable information for: Capacity planning, troubleshooting performance issues, trouble shooting blocking scenarios, performance trend awareness, monitoring of new application modification and deployments, job staggering assistance, maintenance drains, backup effects on performance, on and on…
What is really nice is the data acquisition is not intensive on SQL Server, and the reports that compile and present the data can be run anytime, real-time, quickly and easily as the data has already been “tabled” and indexed.
SupportSQL.com will showcase in this blog and others many such examples of SQL Server performance issues that can be quickly identified and resolved using this data with the charts, graphs and alerts created on top of it.
This proc will:
Create the WaitStats table if it doesn’t exist
Populate that table with the results of the WaitStats DM Queries for the intervals as set when running the proc.
To do:
Create the proc with this code
Schedule the proc to run according to your monitoring/tracking needs – notes: - I run this every 50 minutes for 40 samples taken 30 sec apart (a 20 minute window) and I do not clear the history table but I do clear the accrued statistics on every run to get a good real-time snapshot, instead of an “up-to-now” report. Taking 25 or so snapshots a day (about every hour) of current wait statistics will provide many opportunities for visual analysis that will uncover several SQL Server performance issues and ultimately be a favorite DBA tool.
Use the view gp_Perf_GetWaitStats_2005 to show the results (provided below)
Create MS Access or MS Excel Pivot Charts to analyze the data further, e.g:
Here’s the data acquisition stored procedure script (schedule this to run regularly):
CREATE proc [dbo].[gp_Perf_TrackWaitStats_2005]
(@num_samples int=60
,@delay_interval int=1
,@delay_type nvarchar(10)=’minutes’
,@truncate_history nvarchar(1)=’N’
,@clear_waitstats nvarchar(1)=’Y’)
as
—sample exec stmt:
—exec dbo.gp_Perf_TrackWaitStats_2005 @num_samples=20 —take 20 samples
— ,@delay_interval=30 —one every 30
— ,@delay_type=’s’ — seconds
— ,@truncate_history=’n’ —truncate the entire history table
— ,@clear_waitstats=’y’ —clear the sql server internal stats for waits (get stats as of now, instead of as of last clearing)
set nocount on
—create table if not exists
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N’[dbo].[WaitStats]’) AND type in (N’U’))
BEGIN
CREATE TABLE [dbo].[WaitStats](
[wait_type] [nvarchar](60) NOT NULL,
[waiting_tasks_count] [bigint] NOT NULL,
[wait_time_ms] [bigint] NOT NULL,
[max_wait_time_ms] [bigint] NOT NULL,
[signal_wait_time_ms] [bigint] NOT NULL,
[now] [datetime] NOT NULL DEFAULT (getdate())
) ON [PRIMARY]
CREATE CLUSTERED INDEX [ixc_waitstats] ON [dbo].[WaitStats]
(
[now] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [ix_waitstats_type] ON [dbo].[WaitStats]
(
[wait_type] ASC
)
INCLUDE ( [waiting_tasks_count],
[wait_time_ms],
[max_wait_time_ms],
[signal_wait_time_ms]) WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 50) ON [PRIMARY]
END
—validation
If lower(@truncate_history) not in (N’y’,N’n’)
begin
raiserror (‘valid @truncate_history values are ”y” or ”n”’,16,1) with nowait
end
If lower(@clear_waitstats) not in (N’y’,N’n’)
begin
raiserror (‘valid @clear_waitstats values are ”y” or ”n”’,16,1) with nowait
end
If lower(@truncate_history) = N’y’
truncate table dbo.waitstats
If lower (@clear_waitstats) = N’y’
dbcc sqlperf ([sys.dm_os_wait_stats],clear) with no_infomsgs — clear out waitstats
declare @i int,@delay varchar(8),@dt varchar(3), @now datetime, @totalwait numeric(20,1)
,@endtime datetime,@begintime datetime
,@hr int, @min int, @sec int
select @i = 1
select @dt = case lower(@delay_type)
when N’minutes’ then ‘m’
when N’minute’ then ‘m’
when N’min’ then ‘m’
when N’mi’ then ‘m’
when N’n’ then ‘m’
when N’m’ then ‘m’
when N’seconds’ then ‘s’
when N’second’ then ‘s’
when N’sec’ then ‘s’
when N’ss’ then ‘s’
when N’s’ then ‘s’
else @delay_type
end
if @dt not in (’s’,’m’)
begin
raiserror (‘delay type must be either ”seconds” or ”minutes”’,16,1) with nowait
return
end
if @dt = ‘s’
begin
select @sec = @delay_interval % 60, @min = cast((@delay_interval / 60) as int), @hr = cast((@min / 60) as int)
end
if @dt = ‘m’
begin
select @sec = 0, @min = @delay_interval % 60, @hr = cast((@delay_interval / 60) as int)
end
select @delay= right(‘0’+ convert(varchar(2),@hr),2) + ‘:’ +
+ right(‘0’+convert(varchar(2),@min),2) + ‘:’ +
+ right(‘0’+convert(varchar(2),@sec),2)
if @hr > 23 or @min > 59 or @sec > 59
begin
select ‘delay interval and type: ’ + convert (varchar(10),@delay_interval) + ‘,’ + @delay_type + ’ converts to ’ + @delay
raiserror (‘hh:mm:ss delay time cannot > 23:59:59’,16,1) with nowait
return
END
—get the samples on a loop for time intervals
while (@i <= @num_samples)
begin
select @now = getdate()
insert into [dbo].[waitstats] ([wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms], [signal_wait_time_ms], now)
select [wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms], [signal_wait_time_ms], @now
from sys.dm_os_wait_stats
insert into [dbo].[waitstats] ([wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms], [signal_wait_time_ms], now)
select ‘Total’,sum([waiting_tasks_count]), sum([wait_time_ms]), 0, sum([signal_wait_time_ms]),@now
from [dbo].[waitstats]
where now = @now
select @i = @i + 1
waitfor delay @delay
end
print ‘done’
Here’s a screenshot of the table and data generated.
This will be the base table for analysis of all wait statistics in SQL Server:
Here is a view to retrieve the basic detailed WaitStats results:
CREATE PROC [dbo].[gp_Perf_GetWaitStats_2005]
(
@report_format varchar(20) = ‘all’
,@report_order varchar(20) = ‘resource’)
AS —e.g:
—exec [dbo].[gp_Perf_GetWaitStats_2005] @report_format=’detail’,@report_order=’resource’
— this proc will create waitstats report listing wait types by percentage.
— (1) total wait time is the sum of resource & signal waits, @report_format=’all’ reports resource & signal
— (2) Basics of execution model (simplified)
— a. spid is running then needs unavailable resource, moves to resource wait list at time T0
— b. a signal indicates resource available, spid moves to runnable queue at time T1
— c. spid awaits running status until T2 as cpu works its way through runnable queue in order of arrival
— (3) resource wait time is the actual time waiting for the resource to be available, T1-T0
— (4) signal wait time is the time it takes from the point the resource is available (T1)
— to the point in which the process is running again at T2. Thus, signal waits are T2-T1
— (5) Key questions: Are Resource and Signal time significant?
— a. Highest waits indicate the bottleneck you need to solve for scalability
— b. Generally if you have LOW% SIGNAL WAITS, the CPU is handling the workload e.g. spids spend move through runnable queue quickly
— c. HIGH % SIGNAL WAITS indicates CPU can’t keep up, significant time for spids to move up the runnable queue to reach running status
— (6) This proc can be run when gp_Perf_TrackWaitStats is executing
SET nocount ON
DECLARE
@now datetime
,@totalwait numeric(20, 1)
,@totalsignalwait numeric(20, 1)
,@totalresourcewait numeric(20, 1)
,@endtime datetime
,@begintime datetime
,@hr int
,@min int
,@sec int
IF NOT EXISTS ( SELECT
1
FROM
sysobjects
WHERE
id = object_id(N’[dbo].[waitstats]’)
AND objectproperty(id, N’IsUserTable’) = 1 )
BEGIN
RAISERROR (‘Error [dbo].[waitstats] table does not exist’ , 16 , 1) WITH nowait
RETURN
END
IF lower(@report_format) NOT IN (‘all’, ‘detail’, ‘simple’)
BEGIN
RAISERROR (‘@report_format must be either ”all”,”detail”, or ”simple”’ , 16 , 1) WITH nowait
RETURN
END
IF lower(@report_order) NOT IN (‘resource’, ‘signal’, ‘total’)
BEGIN
RAISERROR (‘@report_order must be either ”resource”, ”signal”, or ”total”’ , 16 , 1) WITH nowait
RETURN
END
IF lower(@report_format) = ‘simple’
AND lower(@report_order) <> ‘total’
BEGIN
RAISERROR (‘@report_format is simple so order defaults to ”total”’ , 16 , 1) WITH nowait
SELECT
@report_order = ‘total’
END
SELECT
@now = max(now)
,@begintime = min(now)
,@endtime = max(now)
FROM
[dbo].[waitstats]
WHERE
[wait_type] = ‘Total’
—- subtract waitfor, sleep, and resource_queue from Total
SELECT
@totalwait = sum([wait_time_ms]) + 1
,@totalsignalwait = sum([signal_wait_time_ms]) + 1
FROM
waitstats
—where [wait_type] not in (‘WAITFOR’,’SLEEP’,’RESOURCE_QUEUE’, ‘Total’, ‘***total***’) and now = @now
WHERE
[wait_type] NOT IN (‘CLR_SEMAPHORE’, ‘LAZYWRITER_SLEEP’, ‘RESOURCE_QUEUE’, ‘SLEEP_TASK’, ‘SLEEP_SYSTEMTASK’, ‘Total’, ‘WAITFOR’, ‘***total***’)
AND now = @now
SELECT
@totalresourcewait = 1 + @totalwait - @totalsignalwait
— insert adjusted totals, rank by percentage descending
DELETE
waitstats
WHERE
[wait_type] = ‘***total***’
AND now = @now
INSERT INTO
waitstats
SELECT
‘***total***’
,0
,@totalwait
,0
,@totalsignalwait
,@now
SELECT
‘start time’ = @begintime
,’end time’ = @endtime
,’duration (hh:mm:ss:ms)’ = convert(varchar(50), @endtime - @begintime, 14)
,’report format’ = @report_format
,’report order’ = @report_order
IF lower(@report_format) IN (‘all’, ‘detail’)
BEGIN
——- format=detail, column order is resource, signal, total. order by resource desc
IF lower(@report_order) = ‘resource’
INSERT INTO
WaitStats_Trend
SELECT
[wait_type]
,[waiting_tasks_count]
,’Resource wt (T1-T0)’ = [wait_time_ms] - [signal_wait_time_ms]
,’res_wt_%’ = cast(100 * ([wait_time_ms] - [signal_wait_time_ms]) / @totalresourcewait AS numeric(20, 1))
,’Signal wt (T2-T1)’ = [signal_wait_time_ms]
,’sig_wt_%’ = cast(100 * [signal_wait_time_ms] / @totalsignalwait AS numeric(20, 1))
,’Total wt (T2-T0)’ = [wait_time_ms]
,’wt_%’ = cast(100 * [wait_time_ms] / @totalwait AS numeric(20, 1))
,@now AS AsOf
FROM
waitstats
WHERE
[wait_type] NOT IN (‘CLR_SEMAPHORE’, ‘LAZYWRITER_SLEEP’, ‘RESOURCE_QUEUE’, ‘SLEEP_TASK’, ‘SLEEP_SYSTEMTASK’, ‘Total’, ‘WAITFOR’)
AND now = @now
ORDER BY
‘res_wt_%’ DESC
IF lower(@report_order) = ‘resource’
SELECT
[wait_type]
,[waiting_tasks_count]
,’Resource wt (T1-T0)’ = [wait_time_ms] - [signal_wait_time_ms]
,’res_wt_%’ = cast(100 * ([wait_time_ms] - [signal_wait_time_ms]) / @totalresourcewait AS numeric(20, 1))
,’Signal wt (T2-T1)’ = [signal_wait_time_ms]
,’sig_wt_%’ = cast(100 * [signal_wait_time_ms] / @totalsignalwait AS numeric(20, 1))
,’Total wt (T2-T0)’ = [wait_time_ms]
,’wt_%’ = cast(100 * [wait_time_ms] / @totalwait AS numeric(20, 1))
FROM
waitstats
WHERE
[wait_type] NOT IN (‘CLR_SEMAPHORE’, ‘LAZYWRITER_SLEEP’, ‘RESOURCE_QUEUE’, ‘SLEEP_TASK’, ‘SLEEP_SYSTEMTASK’, ‘Total’, ‘WAITFOR’)
AND now = @now
ORDER BY
‘res_wt_%’ DESC
——- format=detail, column order signal, resource, total. order by signal desc
IF lower(@report_order) = ‘signal’
SELECT
[wait_type]
,[waiting_tasks_count]
,’Signal wt (T2-T1)’ = [signal_wait_time_ms]
,’sig_wt_%’ = cast(100 * [signal_wait_time_ms] / @totalsignalwait AS numeric(20, 1))
,’Resource wt (T1-T0)’ = [wait_time_ms] - [signal_wait_time_ms]
,’res_wt_%’ = cast(100 * ([wait_time_ms] - [signal_wait_time_ms]) / @totalresourcewait AS numeric(20, 1))
,’Total wt (T2-T0)’ = [wait_time_ms]
,’wt_%’ = cast(100 * [wait_time_ms] / @totalwait AS numeric(20, 1))
FROM
waitstats
WHERE
[wait_type] NOT IN (‘CLR_SEMAPHORE’, ‘LAZYWRITER_SLEEP’, ‘RESOURCE_QUEUE’, ‘SLEEP_TASK’, ‘SLEEP_SYSTEMTASK’, ‘Total’, ‘WAITFOR’)
AND now = @now
ORDER BY
‘sig_wt_%’ DESC
——- format=detail, column order total, resource, signal. order by total desc
IF lower(@report_order) = ‘total’
SELECT
[wait_type]
,[waiting_tasks_count]
,’Total wt (T2-T0)’ = [wait_time_ms]
,’wt_%’ = cast(100 * [wait_time_ms] / @totalwait AS numeric(20, 1))
,’Resource wt (T1-T0)’ = [wait_time_ms] - [signal_wait_time_ms]
,’res_wt_%’ = cast(100 * ([wait_time_ms] - [signal_wait_time_ms]) / @totalresourcewait AS numeric(20, 1))
,’Signal wt (T2-T1)’ = [signal_wait_time_ms]
,’sig_wt_%’ = cast(100 * [signal_wait_time_ms] / @totalsignalwait AS numeric(20, 1))
FROM
waitstats
WHERE
[wait_type] NOT IN (‘CLR_SEMAPHORE’, ‘LAZYWRITER_SLEEP’, ‘RESOURCE_QUEUE’, ‘SLEEP_TASK’, ‘SLEEP_SYSTEMTASK’, ‘Total’, ‘WAITFOR’)
AND now = @now
ORDER BY
‘wt_%’ DESC
END
ELSE
—— simple format, total waits only
SELECT
[wait_type]
,[wait_time_ms]
,percentage = cast(100 * [wait_time_ms] / @totalwait AS numeric(20, 1))
FROM
waitstats
WHERE
[wait_type] NOT IN (‘CLR_SEMAPHORE’, ‘LAZYWRITER_SLEEP’, ‘RESOURCE_QUEUE’, ‘SLEEP_TASK’, ‘SLEEP_SYSTEMTASK’, ‘Total’, ‘WAITFOR’)
AND now = @now
ORDER BY
percentage DESC
—— compute cpu resource waits
SELECT
‘total waits’ = [wait_time_ms]
,’total signal=CPU waits’ = [signal_wait_time_ms]
,’CPU resource waits % = signal waits / total waits’ = cast(100 * [signal_wait_time_ms] / [wait_time_ms] AS numeric(20, 1))
,now
FROM
[dbo].[waitstats]
WHERE
[wait_type] = ‘***total***’
ORDER BY
now
Here is the output of this proc:( a basic detail report showing the highest resource waits and signal waits for the number of tasks waiting shown.)
More examples of charts based on the Wait Stats data: (click these to see the full size versions on flickr.com)








View Printer Friendly Version
Email Article to Friend
Reader Comments