Quick Answer
HADR_SYNCHRONIZING_THROTTLE occurs when a synchronous-commit secondary replica cannot keep up with the primary replica's transaction rate, forcing the primary to throttle commits. This is normal during initial seeding or catch-up scenarios, but sustained occurrences indicate performance bottlenecks on the secondary replica or network issues.
Root Cause Analysis
When an Always On availability group operates in synchronous-commit mode, the primary replica must wait for acknowledgment from secondary replicas before committing transactions. The log transport mechanism sends log records to secondary replicas, which must harden them to disk and acknowledge receipt before the primary can proceed.
HADR_SYNCHRONIZING_THROTTLE specifically manifests when the secondary replica falls behind in applying log records and the primary determines that continued log generation would exceed the secondary's ability to catch up. The Always On availability group subsystem implements an internal throttling mechanism that deliberately slows down transaction commits on the primary to prevent the log send queue from growing excessively large.
This wait occurs in the log capture thread on the primary replica when it detects that the secondary replica's redo queue size exceeds internal thresholds. SQL Server 2016 SP2 and later versions improved the throttling algorithm to be more adaptive, reducing unnecessary throttling during temporary spikes. SQL Server 2019 introduced better telemetry around redo queue management and more granular throttling controls.
The throttling mechanism considers multiple factors: network latency to the secondary replica, secondary replica's redo rate, available memory for log caching, and the configured session timeout values. When these factors indicate the secondary cannot maintain synchronization without intervention, the primary replica's log capture process enters this wait state.
AutoDBA checks Always On availability group health, redo queue monitoring, and secondary replica performance metrics across your entire SQL Server instance in 60 seconds. Download the free diagnostic script and see what else needs attention.
Diagnostic Queries
-- Check current redo queue sizes and rates across all secondary replicas
SELECT
ar.replica_server_name,
db_name(drs.database_id) AS database_name,
drs.redo_queue_size / 1024.0 AS redo_queue_size_mb,
drs.redo_rate AS redo_rate_kb_sec,
drs.log_send_queue_size / 1024.0 AS log_send_queue_mb,
drs.last_redone_time,
datediff(second, drs.last_redone_time, getdate()) AS seconds_behind
FROM sys.dm_hadr_database_replica_states drs
JOIN sys.availability_replicas ar ON drs.replica_id = ar.replica_id
WHERE drs.is_local = 0
ORDER BY drs.redo_queue_size DESC;
-- Identify sessions experiencing HADR_SYNCHRONIZING_THROTTLE waits
SELECT
s.session_id,
s.login_name,
r.command,
r.wait_type,
r.wait_time,
r.wait_resource,
t.text AS current_sql
FROM sys.dm_exec_sessions s
JOIN sys.dm_exec_requests r ON s.session_id = r.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.wait_type = 'HADR_SYNCHRONIZING_THROTTLE'
ORDER BY r.wait_time DESC;
-- Analyze historical wait statistics for throttling patterns
SELECT
wait_type,
waiting_tasks_count,
wait_time_ms / 1000.0 AS wait_time_seconds,
wait_time_ms / waiting_tasks_count AS avg_wait_time_ms,
signal_wait_time_ms,
(wait_time_ms - signal_wait_time_ms) / waiting_tasks_count AS avg_resource_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'HADR_SYNCHRONIZING_THROTTLE'
AND waiting_tasks_count > 0;
-- Check secondary replica performance metrics and configuration
SELECT
ar.replica_server_name,
ar.availability_mode_desc,
ar.failover_mode_desc,
drs.synchronization_state_desc,
drs.synchronization_health_desc,
pi.counter_name,
pi.cntr_value
FROM sys.availability_replicas ar
JOIN sys.dm_hadr_database_replica_states drs ON ar.replica_id = drs.replica_id
JOIN sys.dm_os_performance_counters pi ON db_name(drs.database_id) = pi.instance_name
WHERE ar.secondary_role_allow_connections_desc != 'NO'
AND pi.object_name LIKE '%Database Replica%'
AND pi.counter_name IN ('Redo Bytes/sec', 'Log Apply Pending Queue', 'Redo blocked/sec')
ORDER BY ar.replica_server_name, pi.counter_name;
-- Examine Always On extended events for throttling details
SELECT
event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
event_data.value('(event/@timestamp)[1]', 'datetime2') AS event_timestamp,
event_data.value('(event/data[@name="database_id"]/value)[1]', 'int') AS database_id,
event_data.value('(event/data[@name="throttle_type"]/text)[1]', 'varchar(50)') AS throttle_type,
event_data.value('(event/data[@name="throttle_reason"]/text)[1]', 'varchar(100)') AS throttle_reason
FROM (
SELECT CAST(target_data AS XML) AS target_data
FROM sys.dm_xe_sessions s
JOIN sys.dm_xe_session_targets t ON s.address = t.event_session_address
WHERE s.name = 'AlwaysOn_health'
) AS xml_data
CROSS APPLY target_data.nodes('//event[data[@name="throttle_type"]]') AS n(event_data)
WHERE event_data.value('(event/@timestamp)[1]', 'datetime2') > dateadd(hour, -4, getdate())
ORDER BY event_timestamp DESC;
Fix Scripts
Increase redo thread parallelism on secondary replica This script enables multiple redo threads per database on the secondary replica to improve log apply performance. Test thoroughly in development as this changes fundamental redo behavior.
-- Execute on secondary replica instance
-- Enables parallel redo for specific database
ALTER AVAILABILITY GROUP [YourAGName]
MODIFY REPLICA ON 'SecondaryReplicaName'
WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));
-- For SQL Server 2016 SP2 and later, enable parallel redo
ALTER DATABASE [YourDatabaseName] SET TARGET_RECOVERY_TIME = 60 SECONDS;
Modify session timeout to reduce sensitivity Increases the session timeout between replicas to reduce premature throttling during temporary performance degradations. This trades availability detection speed for reduced throttling.
-- Execute on primary replica
-- Increase session timeout from default 10 seconds to 30 seconds
ALTER AVAILABILITY GROUP [YourAGName]
MODIFY REPLICA ON 'SecondaryReplicaName'
WITH (SESSION_TIMEOUT = 30);
Optimize secondary replica for redo performance Configures the secondary replica database for optimal redo thread performance by adjusting parallelism and recovery interval settings.
-- Execute on secondary replica instance
-- Set database-specific configurations for better redo performance
ALTER DATABASE [YourDatabaseName] SET AUTO_CREATE_STATISTICS OFF;
ALTER DATABASE [YourDatabaseName] SET AUTO_UPDATE_STATISTICS OFF;
ALTER DATABASE [YourDatabaseName] SET PAGE_VERIFY CHECKSUM;
ALTER DATABASE [YourDatabaseName] SET TARGET_RECOVERY_TIME = 0; -- Use checkpoint interval instead
-- Ensure adequate memory allocation for redo operations
EXEC sp_configure 'max server memory (MB)', 8192; -- Adjust based on server capacity
RECONFIGURE;
Configure readable secondary for reduced contention When the secondary replica serves read workloads, those queries can block redo operations. This script optimizes the readable secondary configuration.
-- Execute on secondary replica
-- Enable read committed snapshot to reduce blocking between readers and redo
ALTER DATABASE [YourDatabaseName] SET READ_COMMITTED_SNAPSHOT ON;
ALTER DATABASE [YourDatabaseName] SET ALLOW_SNAPSHOT_ISOLATION ON;
-- Configure application intent routing for read workloads
ALTER AVAILABILITY GROUP [YourAGName]
MODIFY REPLICA ON 'SecondaryReplicaName'
WITH (SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://SecondaryServer:1433'));
AutoDBA generates fix scripts like these automatically, with impact estimates and rollback SQL included.
Prevention
Deploy secondary replicas on hardware with comparable or superior I/O performance to the primary replica. The secondary replica's log redo operations are inherently single-threaded per database until SQL Server 2016 SP2, requiring fast sequential write performance to keep pace with primary replica log generation.
Configure appropriate session timeout values based on your network characteristics and acceptable failover detection latency. Networks with higher latency or occasional packet loss require longer session timeouts to prevent unnecessary throttling. Monitor session timeout events in the Always On health extended events session.
Implement proper read workload management on readable secondary replicas using read committed snapshot isolation or dedicated read-only routing to prevent reader queries from blocking redo operations. Long-running analytical queries on secondary replicas directly contribute to redo queue buildup and subsequent throttling.
Size tempdb appropriately on secondary replicas as redo operations may require temporary workspace for certain operations like index rebuilds replayed from the primary. Insufficient tempdb space causes redo operations to stall, triggering throttling on the primary replica.
Monitor redo queue sizes and redo rates continuously using the diagnostic queries above. Establish baseline performance metrics during normal operations to quickly identify when secondary replica performance degrades. Configure alerts when redo queue sizes exceed 10MB consistently or when redo rates drop below historical averages.
Consider using asynchronous-commit mode for secondary replicas that serve primarily as disaster recovery targets rather than requiring strict data consistency. This eliminates throttling entirely but removes zero data loss guarantees for those replicas.
Need hands-on help?
Dealing with persistent hadr_synchronizing_throttle issues across your environment? Samix Technology provides hands-on SQL Server performance consulting with 15+ years of production DBA experience.