mediumOther

SQLTRACE_SHUTDOWN Wait Type Explained

SQL Server SQLTRACE_SHUTDOWN wait type occurs during trace session shutdown. Learn diagnostic queries, fix scripts, and prevention strategies for trace shutdown waits.

Quick Answer

SQLTRACE_SHUTDOWN waits occur when SQL Server is stopping a trace session (SQL Trace or Extended Events) and must wait for in-flight trace events to complete before fully shutting down the trace. This wait type is typically benign but can indicate excessive tracing overhead during trace deactivation, especially in high-volume OLTP systems.

Root Cause Analysis

SQLTRACE_SHUTDOWN manifests when the trace subsystem executes its cleanup sequence during trace termination. The SQL Server trace architecture operates through event producers (various engine components) that queue events to trace consumers via the trace buffer pool. When a trace stops, the trace manager must ensure all queued events complete processing before deallocating trace resources.

The wait occurs specifically in the trace shutdown coordinator thread, which blocks until all worker threads finish draining their event queues. This coordination prevents memory corruption and ensures trace file consistency. In SQL Server 2016 and later, Extended Events largely replaced SQL Trace, but the underlying coordination mechanism remains similar for both technologies.

Heavy trace loads with complex event filters or frequent file rollovers amplify this wait duration. The trace buffer pool becomes the bottleneck as events accumulate faster than the file writer threads can persist them. SQL Server 2019 improved trace shutdown performance by implementing parallel buffer flushing, reducing typical wait times from seconds to milliseconds under normal conditions.

High SQLTRACE_SHUTDOWN waits often correlate with oversized trace definitions capturing excessive events, particularly in systems running third-party monitoring tools that create multiple concurrent traces. The trace subsystem enforces memory limits, but shutdown coordination still requires sequential event draining regardless of memory pressure.

AutoDBA checks trace session configuration, buffer utilization monitoring, and Extended Events migration recommendations across your entire SQL Server instance in 60 seconds. Download the free diagnostic script and see what else needs attention.

Diagnostic Queries

-- Check active traces and their current status
SELECT 
    t.id,
    t.status,
    t.path,
    t.buffer_count,
    t.buffer_size,
    t.file_position,
    DATEDIFF(MINUTE, t.start_time, GETDATE()) AS runtime_minutes
FROM sys.traces t
WHERE t.status = 1;
-- Identify current trace-related waits and their duration
SELECT 
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    max_wait_time_ms,
    signal_wait_time_ms,
    wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms
FROM sys.dm_os_wait_stats 
WHERE wait_type LIKE '%TRACE%'
    AND wait_time_ms > 0
ORDER BY wait_time_ms DESC;
-- Check for traces with large buffer counts indicating high event volume
SELECT 
    t.id,
    t.path,
    t.buffer_count,
    t.buffer_size,
    (t.buffer_count * t.buffer_size) / 1024.0 / 1024.0 AS total_memory_mb,
    te.eventid,
    te.columnid
FROM sys.traces t
CROSS APPLY (
    SELECT TOP 5 eventid, columnid 
    FROM sys.fn_trace_geteventinfo(t.id) ei
) te
WHERE t.status = 1;
-- Monitor Extended Events sessions that might be causing similar waits
SELECT 
    s.name,
    s.blocked_event_fire_time,
    s.dropped_event_count,
    s.dropped_buffer_count,
    st.target_name,
    CAST(st.target_data AS XML) as target_config
FROM sys.dm_xe_sessions s
INNER JOIN sys.dm_xe_session_targets st ON s.address = st.event_session_address
WHERE s.blocked_event_fire_time > 0 
   OR s.dropped_event_count > 0;
-- Check for processes currently stopping traces
SELECT 
    r.session_id,
    r.command,
    r.status,
    r.wait_type,
    r.wait_time,
    r.last_wait_type,
    t.text
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.command LIKE '%TRACE%' 
   OR t.text LIKE '%sp_trace_setstatus%'
   OR r.wait_type = 'SQLTRACE_SHUTDOWN';

Fix Scripts

Stop problematic traces immediately

-- Identify and stop traces with excessive buffer usage
DECLARE @trace_id INT;
DECLARE trace_cursor CURSOR FOR
SELECT id 
FROM sys.traces 
WHERE status = 1 
    AND buffer_count > 1000  -- Adjust threshold as needed
    AND id > 1;  -- Exclude default trace

OPEN trace_cursor;
FETCH NEXT FROM trace_cursor INTO @trace_id;

WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC sp_trace_setstatus @trace_id, 0;  -- Stop trace
    EXEC sp_trace_setstatus @trace_id, 2;  -- Delete trace definition
    PRINT 'Stopped and deleted trace ID: ' + CAST(@trace_id AS VARCHAR(10));
    FETCH NEXT FROM trace_cursor INTO @trace_id;
END;

CLOSE trace_cursor;
DEALLOCATE trace_cursor;

Test this in development first. Stopping traces will lose any unbuffered trace data and may break dependent monitoring applications.

Optimize trace buffer settings for existing traces

-- Recreate traces with smaller buffer sizes to reduce shutdown time
-- This script template shows the pattern; customize for your specific trace needs
DECLARE @new_trace_id INT;
EXEC sp_trace_create 
    @traceid = @new_trace_id OUTPUT,
    @options = 0,  -- No rollover
    @tracefile = N'C:\Temp\optimized_trace',
    @maxfilesize = 100,  -- Smaller files for faster shutdown
    @stoptime = NULL,
    @filecount = 5;

-- Add only essential events, avoid capturing high-frequency events like RPC:Completed
EXEC sp_trace_setevent @new_trace_id, 12, 1, 1;   -- SQL:BatchCompleted - TextData
EXEC sp_trace_setevent @new_trace_id, 12, 13, 1;  -- SQL:BatchCompleted - Duration

-- Set filters to reduce event volume
EXEC sp_trace_setfilter @new_trace_id, 13, 0, 4, 100;  -- Duration >= 100ms only

EXEC sp_trace_setstatus @new_trace_id, 1;  -- Start trace
PRINT 'Created optimized trace ID: ' + CAST(@new_trace_id AS VARCHAR(10));

Always verify trace output meets monitoring requirements after optimization. Reduced event capture may miss important diagnostics.

Configure Extended Events as SQL Trace replacement

-- Create lightweight Extended Events session to replace heavy SQL Traces
IF EXISTS (SELECT * FROM sys.dm_xe_sessions WHERE name = 'optimized_monitoring')
    DROP EVENT SESSION optimized_monitoring ON SERVER;

CREATE EVENT SESSION optimized_monitoring ON SERVER
ADD EVENT sqlserver.sql_batch_completed (
    ACTION (sqlserver.database_name, sqlserver.username)
    WHERE duration > 100000  -- 100ms in microseconds
),
ADD EVENT sqlserver.rpc_completed (
    ACTION (sqlserver.database_name, sqlserver.username) 
    WHERE duration > 100000
)
ADD TARGET package0.asynchronous_file_target (
    SET filename = 'C:\Temp\optimized_monitoring.xel',
        max_file_size = 50,
        max_rollover_files = 10
)
WITH (MAX_MEMORY = 16MB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS);

ALTER EVENT SESSION optimized_monitoring ON SERVER STATE = START;

Extended Events provide better performance and less shutdown coordination overhead than SQL Trace. Test event filtering thoroughly as XE syntax differs from SQL Trace.

AutoDBA generates fix scripts like these automatically, with impact estimates and rollback SQL included.

Prevention

Configure trace buffer sizes appropriately for your workload volume. Keep buffer counts under 1000 and buffer sizes under 64KB for most production systems. Larger buffers increase shutdown coordination time exponentially.

Replace SQL Trace with Extended Events wherever possible. XE sessions shut down faster due to improved buffer management and reduced lock contention during cleanup. SQL Server 2019+ provides significant XE performance improvements over earlier versions.

Implement trace rotation policies with smaller file sizes rather than single large trace files. Files under 100MB reduce I/O blocking during shutdown sequences. Configure automatic trace stops during maintenance windows to avoid shutdown waits during planned restarts.

Monitor trace buffer utilization through sys.traces and establish alerts when buffer_count exceeds normal thresholds. High buffer counts indicate event production exceeding consumption rates, leading to longer shutdown waits.

Avoid capturing high-frequency events like RPC:Completed or SQL:StmtCompleted without appropriate filters. These events generate massive volumes in OLTP systems and create shutdown bottlenecks. Filter by duration, database_name, or application_name to reduce event volume by 90% while maintaining diagnostic value.

Need hands-on help?

Dealing with persistent sqltrace_shutdown issues across your environment? Samix Technology provides hands-on SQL Server performance consulting with 15+ years of production DBA experience.

Related Pages