Medium | Always On

HADR_XRF_STACK_ACCESS Wait Type Explained

How the HADR_XRF_STACK_ACCESS wait type arises in SQL Server Always On availability groups: causes, diagnostic queries, and fixes for extended recovery fork stack waits during log processing.

Quick Answer

HADR_XRF_STACK_ACCESS waits occur when Always On availability groups access the extended recovery fork stack for log record processing during database recovery or synchronization. This wait type appears during heavy log activity, failovers, or when secondary replicas process large transaction log volumes. It is generally a low concern unless wait times are consistently high.

Root Cause Analysis

The extended recovery fork (XRF) stack tracks recovery points and fork operations during log processing in Always On availability groups. SQL Server maintains this stack to handle complex recovery scenarios where multiple recovery paths exist, particularly when dealing with log records that create recovery forks during database state transitions.

This wait manifests when the log manager accesses the XRF stack structure during log record processing on both primary and secondary replicas. The stack operations include lookups for existing recovery forks, additions of new fork points, and deletions of obsolete entries. These operations require synchronization to maintain consistency across the stack structure.

In SQL Server 2012 through 2014, XRF stack access was less optimized and showed higher wait times during heavy transactional workloads. SQL Server 2016 introduced improvements to the stack management algorithms, reducing contention. SQL Server 2019 and later versions further optimized stack access patterns, particularly for workloads with frequent small transactions that create multiple recovery points.

The wait occurs most frequently during secondary replica log apply operations, automatic failovers, and planned manual failovers where extensive log record processing creates multiple recovery fork scenarios. Database startup recovery after failover events also triggers intensive XRF stack activity as the recovery process evaluates multiple potential recovery paths.


Diagnostic Queries

-- Check current HADR_XRF_STACK_ACCESS wait statistics
SELECT 
    waiting_tasks_count,
    wait_time_ms,
    max_wait_time_ms,
    signal_wait_time_ms,
    wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats 
WHERE wait_type = 'HADR_XRF_STACK_ACCESS';

-- Identify sessions currently experiencing XRF stack waits
SELECT 
    s.session_id,
    r.status,
    r.wait_type,
    r.wait_time,
    r.last_wait_type,
    s.program_name,
    r.command,
    DB_NAME(r.database_id) as database_name
FROM sys.dm_exec_requests r
JOIN sys.dm_exec_sessions s ON r.session_id = s.session_id
WHERE r.wait_type = 'HADR_XRF_STACK_ACCESS'
   OR r.last_wait_type = 'HADR_XRF_STACK_ACCESS';

-- Analyze AG replica log processing performance
-- (one row per database per replica; queue sizes are in KB, rates in KB/sec)
SELECT 
    ag.name AS ag_name,
    ar.replica_server_name,
    ars.role_desc,
    ars.operational_state_desc,
    DB_NAME(drs.database_id) AS database_name,
    drs.log_send_queue_size,
    drs.log_send_rate,
    drs.redo_queue_size,
    drs.redo_rate
FROM sys.availability_groups ag
JOIN sys.availability_replicas ar ON ag.group_id = ar.group_id
JOIN sys.dm_hadr_availability_replica_states ars ON ar.replica_id = ars.replica_id
LEFT JOIN sys.dm_hadr_database_replica_states drs ON ars.replica_id = drs.replica_id
ORDER BY ag.name, ar.replica_server_name;

-- Monitor transaction log usage
-- sys.dm_db_log_space_usage reports on the current database only, so run this
-- in the context of each AG database; log_reuse_wait_desc comes from sys.databases
SELECT 
    DB_NAME(lsu.database_id) AS database_name,
    d.log_reuse_wait_desc,
    lsu.log_space_in_bytes_since_last_backup / 1024 / 1024 AS log_since_backup_mb,
    lsu.total_log_size_in_bytes / 1024 / 1024 AS total_log_mb,
    lsu.used_log_space_in_percent
FROM sys.dm_db_log_space_usage lsu
JOIN sys.databases d ON d.database_id = lsu.database_id
WHERE lsu.database_id IN (
    SELECT database_id 
    FROM sys.dm_hadr_database_replica_states
);

-- Check for log apply backlog on local replicas
SELECT 
    ag.name AS ag_name,
    DB_NAME(drs.database_id) AS database_name,
    drs.redo_queue_size,
    drs.redo_rate,
    drs.log_send_queue_size,
    drs.last_commit_time,
    drs.synchronization_state_desc
FROM sys.availability_groups ag
JOIN sys.dm_hadr_database_replica_states drs ON ag.group_id = drs.group_id
WHERE drs.is_local = 1
  AND drs.redo_queue_size > 0;

Fix Scripts

Optimize transaction log backup frequency

Reduces log chain length and XRF stack complexity by minimizing recovery fork scenarios.

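-- Generate log backup commands for AG databases whose primary replica is local
-- TEST THIS SCHEDULE IN DEV FIRST
-- @backup_dir is a placeholder path: replace it with your real backup location.
-- Do not back up to NUL: that discards the log records and breaks the restore chain.
DECLARE @sql NVARCHAR(MAX);
DECLARE @dbname SYSNAME;
DECLARE @backup_dir NVARCHAR(260) = N'X:\LogBackups\'; -- placeholder
DECLARE @stamp NVARCHAR(20) = FORMAT(GETDATE(), 'yyyyMMdd_HHmmss');

DECLARE db_cursor CURSOR LOCAL FAST_FORWARD FOR
SELECT DISTINCT DB_NAME(database_id)
FROM sys.dm_hadr_database_replica_states
WHERE is_local = 1 AND is_primary_replica = 1;

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @dbname;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME protects database names containing brackets or spaces
    SET @sql = N'BACKUP LOG ' + QUOTENAME(@dbname)
             + N' TO DISK = N''' + @backup_dir + @dbname + N'_' + @stamp + N'.trn''';
    PRINT 'Execute: ' + @sql;
    -- EXEC sp_executesql @sql; -- Uncomment after testing
    
    FETCH NEXT FROM db_cursor INTO @dbname;
END;

CLOSE db_cursor;
DEALLOCATE db_cursor;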

Clear wait statistics for monitoring reset

Resets accumulated wait stats to establish new baseline measurements.

-- Clear wait statistics to establish fresh baseline
-- WARNING: This clears ALL wait statistics, not just XRF waits
-- Run during maintenance window only
DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);

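Configure AG replica session timeout

Adjusts how long a replica waits for a response from its partner before treating the connection as failed. SESSION_TIMEOUT is the session timeout for the replica, not a synchronous commit timeout. The intent is to prevent excessive log accumulation that increases XRF stack operations. Run the statement on the primary replica.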

-- Adjust AG replica timeout settings
-- VERIFY NETWORK LATENCY REQUIREMENTS FIRST
ALTER AVAILABILITY GROUP [YourAGName]
MODIFY REPLICA ON 'SecondaryReplicaName'
WITH (SESSION_TIMEOUT = 30); -- Increase from default 10 seconds if needed

Monitor and alert on excessive XRF waits

Creates a monitoring view for ongoing XRF stack wait detection.

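-- Create monitoring view for XRF stack waits
-- Deploy as part of monitoring solution
-- CREATE OR ALTER requires SQL Server 2016 SP1 or later
-- Averages are cumulative since the last restart or wait stats clear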
CREATE OR ALTER VIEW dbo.v_hadr_xrf_monitor AS
SELECT 
    GETDATE() as sample_time,
    waiting_tasks_count,
    wait_time_ms,
    wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms,
    CASE 
        WHEN wait_time_ms / NULLIF(waiting_tasks_count, 0) > 100 THEN 'HIGH'
        WHEN wait_time_ms / NULLIF(waiting_tasks_count, 0) > 50 THEN 'MEDIUM'
        ELSE 'LOW'
    END as severity_level
FROM sys.dm_os_wait_stats 
WHERE wait_type = 'HADR_XRF_STACK_ACCESS'
  AND waiting_tasks_count > 0;


Prevention

Configure transaction log backups every 1-2 minutes for AG databases to minimize log chain complexity and reduce XRF stack depth. Frequent log backups prevent accumulation of recovery fork points that require stack management overhead.
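
To see whether log backups are actually running at that cadence, a query along these lines can help. It is a minimal sketch that assumes backup history is available in msdb on the instance where it runs. History is recorded only on the replica that took the backup, so it may need to run on the backup replica.

```sql
-- Most recent log backup per local AG database, oldest first
-- (databases with no recorded log backup show NULL and sort to the top)
SELECT
    DB_NAME(drs.database_id) AS database_name,
    MAX(b.backup_finish_date) AS last_log_backup,
    DATEDIFF(MINUTE, MAX(b.backup_finish_date), GETDATE()) AS minutes_since_last_log_backup
FROM sys.dm_hadr_database_replica_states drs
LEFT JOIN msdb.dbo.backupset b
    ON b.database_name = DB_NAME(drs.database_id)
   AND b.type = 'L'              -- 'L' = transaction log backup
WHERE drs.is_local = 1
GROUP BY drs.database_id
ORDER BY last_log_backup;
```

Any database showing more than a couple of minutes since its last log backup falls outside the 1-2 minute target described above.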

Size transaction log files appropriately with 8GB-16GB initial size and reasonable autogrowth settings (512MB-1GB increments) to reduce log management operations. Undersized logs create frequent autogrowth events that complicate recovery fork tracking.
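
Current log file sizes and autogrowth settings can be compared against those ranges with a query such as the following. It is a sketch that reads sys.master_files and reports only log files for databases with a local AG replica.

```sql
-- Log file size and autogrowth setting for local AG databases
SELECT
    DB_NAME(mf.database_id) AS database_name,
    mf.name AS log_file_name,
    CAST(mf.size AS BIGINT) * 8 / 1024 AS size_mb,          -- size is stored in 8 KB pages
    CASE WHEN mf.is_percent_growth = 1
         THEN CAST(mf.growth AS VARCHAR(20)) + ' %'          -- percentage growth
         ELSE CAST(CAST(mf.growth AS BIGINT) * 8 / 1024 AS VARCHAR(20)) + ' MB'
    END AS autogrowth
FROM sys.master_files mf
WHERE mf.type_desc = 'LOG'
  AND mf.database_id IN (
      SELECT database_id
      FROM sys.dm_hadr_database_replica_states
      WHERE is_local = 1
  )
ORDER BY database_name;
```

Percentage-based growth settings stand out in the autogrowth column and are the usual candidates for changing to fixed increments.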

Monitor redo queue size on secondary replicas and maintain queues below 100MB during normal operations. Large redo queues increase log processing time and XRF stack operations. Configure readable secondaries carefully as read workloads can slow log apply processes.
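
The 100MB guideline can be checked directly on a secondary. The sketch below relies on redo_queue_size being reported in KB and redo_rate in KB/sec. The catch-up estimate is a rough approximation derived from the current redo rate, not a guarantee.

```sql
-- Local secondary databases whose redo queue exceeds 100 MB
SELECT
    DB_NAME(drs.database_id) AS database_name,
    drs.redo_queue_size / 1024.0 AS redo_queue_mb,
    drs.redo_rate / 1024.0 AS redo_rate_mb_per_sec,
    drs.redo_queue_size * 1.0 / NULLIF(drs.redo_rate, 0) AS est_catchup_seconds
FROM sys.dm_hadr_database_replica_states drs
WHERE drs.is_local = 1
  AND drs.is_primary_replica = 0
  AND drs.redo_queue_size > 100 * 1024    -- 100 MB expressed in KB
ORDER BY drs.redo_queue_size DESC;
```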

Implement AG dashboard monitoring that tracks XRF wait times alongside redo rates and log send rates. Set alerts when average XRF wait times exceed 50ms consistently over 5-minute periods, indicating potential log processing bottlenecks.
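
Because sys.dm_os_wait_stats is cumulative, a 5-minute average has to be derived from two samples. One way to do that, as a minimal sketch, is to snapshot the counters, wait, and diff them. The table variable name and the fixed 5-minute delay are illustrative choices, and a scheduled job writing samples to a table would serve the same purpose.

```sql
-- Snapshot the cumulative counters
DECLARE @before TABLE (waiting_tasks_count BIGINT, wait_time_ms BIGINT);

INSERT INTO @before (waiting_tasks_count, wait_time_ms)
SELECT waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'HADR_XRF_STACK_ACCESS';

-- Sampling interval (holds the session open for 5 minutes)
WAITFOR DELAY '00:05:00';

-- Difference between now and the snapshot = activity within the interval
SELECT
    w.waiting_tasks_count - b.waiting_tasks_count AS waits_in_interval,
    w.wait_time_ms - b.wait_time_ms AS wait_ms_in_interval,
    (w.wait_time_ms - b.wait_time_ms) * 1.0
        / NULLIF(w.waiting_tasks_count - b.waiting_tasks_count, 0) AS avg_wait_ms_in_interval
FROM sys.dm_os_wait_stats w
CROSS JOIN @before b
WHERE w.wait_type = 'HADR_XRF_STACK_ACCESS';
```

An avg_wait_ms_in_interval above 50 across several consecutive samples matches the alert condition described above. Negative values indicate that wait statistics were cleared or the instance restarted during the interval, so that sample should be discarded.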

Place AG replicas on dedicated network connections with consistent low latency (under 5ms) to prevent network-induced log apply delays that increase recovery fork complexity. Avoid mixing AG traffic with backup or other high-bandwidth operations.

