Quick Answer
HADR_SYNC_COMMIT waits occur when a primary replica waits for synchronous secondary replicas to acknowledge transaction log hardening during commit processing. This is expected behavior in synchronous Always On Availability Groups but becomes concerning when wait times exceed 5-10ms consistently, indicating network latency, storage bottlenecks, or overloaded secondaries.
Root Cause Analysis
When a transaction commits on the primary replica in a synchronous Always On Availability Group, the Log Writer process must wait for acknowledgment from all synchronous secondary replicas before completing the commit. The primary sends log records through the Database Mirroring endpoint to secondaries, which must write the log to disk and send confirmation back to the primary.
The HADR_SYNC_COMMIT wait specifically occurs in the log flush path when the Log Writer thread blocks waiting for the hardening acknowledgment. SQL Server 2016 introduced improvements to the Always On log transport mechanism, reducing CPU overhead in the transport layer. SQL Server 2019 added Accelerated Database Recovery (ADR), which can reduce some commit latencies by minimizing undo operations, though this primarily affects crash recovery rather than synchronous commit waits.
The wait encompasses three distinct phases: network transmission time to secondaries, storage write latency on secondary replicas, and network acknowledgment time back to the primary. High HADR_SYNC_COMMIT waits typically indicate bottlenecks in secondary storage subsystems, network latency between replicas, or resource contention on secondary instances preventing timely log hardening.
SQL Server 2022 enhanced Always On telemetry and introduced contained availability groups, but the core synchronous commit mechanism remains unchanged. The wait statistics aggregate across all databases in the availability group, making per-database analysis crucial for identifying specific problem databases.
AutoDBA checks Always On availability group health, synchronization latency, and replica performance monitoring across your entire SQL Server instance in 60 seconds. Download the free diagnostic script and see what else needs attention.
Diagnostic Queries
-- Check current HADR wait statistics and averages
SELECT
wait_type,
waiting_tasks_count,
wait_time_ms,
max_wait_time_ms,
signal_wait_time_ms,
wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'HADR_SYNC_COMMIT'
AND waiting_tasks_count > 0;
-- Examine Always On replica synchronization health
SELECT
ag.name AS ag_name,
ar.replica_server_name,
ar.availability_mode_desc,
ars.role_desc,
ars.synchronization_health_desc,
ars.last_commit_time,
ars.last_hardened_time,
DATEDIFF(ms, ars.last_hardened_time, ars.last_commit_time) AS commit_lag_ms
FROM sys.availability_groups ag
JOIN sys.availability_replicas ar ON ag.group_id = ar.group_id
JOIN sys.dm_hadr_availability_replica_states ars ON ar.replica_id = ars.replica_id
WHERE ar.availability_mode = 1; -- Synchronous commit only
-- Monitor per-database synchronization performance
SELECT
ag.name AS ag_name,
db.name AS database_name,
drs.synchronization_state_desc,
drs.last_commit_time,
drs.last_hardened_time,
drs.log_send_queue_size,
drs.log_send_rate,
drs.redo_queue_size,
drs.redo_rate
FROM sys.dm_hadr_database_replica_states drs
JOIN sys.databases db ON drs.database_id = db.database_id
JOIN sys.availability_groups ag ON drs.group_id = ag.group_id
WHERE drs.is_local = 0 -- Secondary replicas
ORDER BY drs.log_send_queue_size DESC;
-- Identify sessions currently experiencing HADR waits
SELECT
s.session_id,
s.host_name,
s.program_name,
r.wait_type,
r.wait_time,
r.blocking_session_id,
t.text AS current_sql
FROM sys.dm_exec_sessions s
JOIN sys.dm_exec_requests r ON s.session_id = r.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.wait_type = 'HADR_SYNC_COMMIT';
-- Check Always On endpoint configuration and status
SELECT
e.name AS endpoint_name,
e.protocol_desc,
e.type_desc,
e.state_desc,
t.port,
e.is_compression_enabled
FROM sys.database_mirroring_endpoints e
JOIN sys.tcp_endpoints t ON e.endpoint_id = t.endpoint_id
WHERE e.type = 4; -- Database mirroring endpoint
Fix Scripts
Temporarily switch to asynchronous mode during maintenance
-- Switch problematic replica to async mode to reduce commit waits
-- WARNING: This removes synchronous data protection guarantees
ALTER AVAILABILITY GROUP [YourAGName]
MODIFY REPLICA ON N'SecondaryServerName'
WITH (AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT);
-- Monitor the change
SELECT
ar.replica_server_name,
ar.availability_mode_desc,
ars.synchronization_health_desc
FROM sys.availability_replicas ar
JOIN sys.dm_hadr_availability_replica_states ars ON ar.replica_id = ars.replica_id
WHERE ar.replica_server_name = 'SecondaryServerName';
-- Switch back after resolving underlying issues
-- ALTER AVAILABILITY GROUP [YourAGName]
-- MODIFY REPLICA ON N'SecondaryServerName'
-- WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT);
Increase session timeout to handle temporary network delays
-- Increase session timeout from default 10 seconds to 30 seconds
-- This prevents false failovers during temporary network issues
ALTER AVAILABILITY GROUP [YourAGName]
MODIFY REPLICA ON N'SecondaryServerName'
WITH (SESSION_TIMEOUT = 30);
-- Verify the change
SELECT
replica_server_name,
session_timeout,
availability_mode_desc
FROM sys.availability_replicas
WHERE replica_server_name = 'SecondaryServerName';
Enable endpoint compression for high-latency network scenarios
-- Enable compression at the database mirroring endpoint level
-- This reduces network traffic for AG transport on high-latency links
-- Test thoroughly as this increases CPU usage on both replicas
-- Run on EACH replica that should use compression
ALTER ENDPOINT [Hadr_endpoint]
FOR DATABASE_MIRRORING (COMPRESSION = ENABLED);
-- Verify endpoint compression setting
SELECT
name AS endpoint_name,
type_desc,
state_desc,
is_compression_enabled
FROM sys.database_mirroring_endpoints;
Configure readable secondary to reduce primary workload
-- Configure secondary for read-only workload to justify sync cost
ALTER AVAILABILITY GROUP [YourAGName]
MODIFY REPLICA ON N'SecondaryServerName'
WITH (SECONDARY_ROLE(ALLOW_CONNECTIONS = READ_ONLY));
-- Create read-only routing configuration
ALTER AVAILABILITY GROUP [YourAGName]
MODIFY REPLICA ON N'PrimaryServerName'
WITH (PRIMARY_ROLE(READ_ONLY_ROUTING_URL = N'TCP://PrimaryServerName:1433'));
ALTER AVAILABILITY GROUP [YourAGName]
MODIFY REPLICA ON N'SecondaryServerName'
WITH (SECONDARY_ROLE(READ_ONLY_ROUTING_URL = N'TCP://SecondaryServerName:1433'));
AutoDBA generates fix scripts like these automatically, with impact estimates and rollback SQL included.
Prevention
Deploy secondary replicas on storage subsystems with comparable or superior performance to the primary. Synchronous commit waits directly correlate with secondary storage write latency, making fast SSD storage essential for acceptable performance. Use dedicated network connections between replicas with latency under 5ms for optimal synchronous commit performance.
Configure connection string retry logic in applications to handle brief availability group failovers gracefully. Implement application-level connection pooling with appropriate timeout settings to avoid cascading failures during temporary Always On delays.
Monitor the Transaction Delay performance counter continuously, alerting when average delays exceed 10ms over 5-minute periods. Establish baseline performance metrics during low-activity periods to identify degradation trends before they impact production workloads.
Consider hybrid approaches using one synchronous replica for local high availability and additional asynchronous replicas for disaster recovery. This balances data protection requirements with commit performance, reducing the number of replicas that must acknowledge each transaction synchronously.
Implement regular storage performance testing on secondary replicas using tools like SQLIO or CrystalDiskMark to identify storage degradation before it impacts Always On performance. Schedule maintenance windows to update storage drivers and firmware on secondary replicas before primaries to minimize production impact.
Need hands-on help?
Dealing with persistent hadr_sync_commit issues across your environment? Samix Technology provides hands-on SQL Server performance consulting with 15+ years of production DBA experience.