Quick Answer
HADR_REPLICAINFO_SYNC occurs when Always On availability group replicas wait for concurrency control to update replica state information in system metadata. This wait typically appears during replica state transitions, failovers, or when the primary updates replica information. Low occurrences are normal, but sustained high waits indicate replica communication issues or metadata contention.
Root Cause Analysis
This wait type occurs within the Always On availability group engine when the SQL Server Database Engine needs to update replica state information stored in internal system tables. The wait happens at the scheduler level when threads compete for access to update the replica metadata structures that track primary/secondary states, synchronization health, and connection status.
The concurrency control mechanism uses lightweight locks to protect the replica information data structures from simultaneous updates. When multiple threads attempt to modify replica state simultaneously, such as during automatic failovers, manual failovers, or health monitoring updates, threads queue on this wait type until they can acquire exclusive access to the metadata.
In SQL Server 2012 through 2014, this wait was more frequent due to less optimized metadata update patterns. SQL Server 2016 introduced improvements to the Always On state machine that reduced unnecessary replica information updates. SQL Server 2019 and later versions further optimized the locking granularity around replica metadata, making these waits shorter in duration but potentially more frequent during active state changes.
The wait specifically occurs in the availability group worker threads that manage replica health monitoring, the lease timeout mechanisms, and the distributed availability group coordination processes. Network latency between replicas can exacerbate this wait when the primary must update replica states based on delayed heartbeat responses.
Diagnostic Queries
-- Current HADR_REPLICAINFO_SYNC waits and their duration
SELECT
wait_type,
waiting_tasks_count,
wait_time_ms,
max_wait_time_ms,
signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'HADR_REPLICAINFO_SYNC'
AND waiting_tasks_count > 0;
-- Active sessions experiencing HADR_REPLICAINFO_SYNC waits
SELECT
s.session_id,
s.login_name,
s.program_name,
r.wait_type,
r.wait_time,
r.wait_resource,
r.command,
t.text
FROM sys.dm_exec_requests r
INNER JOIN sys.dm_exec_sessions s ON r.session_id = s.session_id
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) t -- OUTER APPLY keeps background AG tasks that have no SQL text
WHERE r.wait_type = 'HADR_REPLICAINFO_SYNC';
-- Availability group replica states and health
SELECT
ag.name AS availability_group_name,
r.replica_server_name,
r.endpoint_url,
rs.role_desc,
rs.operational_state_desc,
rs.connected_state_desc,
rs.synchronization_health_desc,
rs.last_connect_error_number,
rs.last_connect_error_description
FROM sys.availability_replicas r
INNER JOIN sys.availability_groups ag ON r.group_id = ag.group_id
INNER JOIN sys.dm_hadr_availability_replica_states rs ON r.replica_id = rs.replica_id;
-- Recent Always On error log entries
EXEC xp_readerrorlog 0, 1, N'Always On', NULL, NULL, NULL, N'DESC';
-- Availability group database synchronization states
SELECT
ag.name AS availability_group_name,
db.database_name,
r.replica_server_name,
drs.synchronization_state_desc,
drs.synchronization_health_desc,
drs.last_sent_time,
drs.last_received_time,
drs.last_hardened_time,
drs.log_send_queue_size,
drs.redo_queue_size
FROM sys.availability_databases_cluster db
INNER JOIN sys.availability_groups ag ON db.group_id = ag.group_id
INNER JOIN sys.availability_replicas r ON ag.group_id = r.group_id
LEFT JOIN sys.dm_hadr_database_replica_states drs ON db.group_database_id = drs.group_database_id
AND r.replica_id = drs.replica_id
ORDER BY ag.name, db.database_name, r.replica_server_name;
Fix Scripts
Reset Wait Statistics to Establish Baseline
-- Clear wait statistics to get fresh measurements
-- WARNING: This clears ALL wait stats, run during maintenance window
DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);
This resets all wait statistics counters to establish a clean baseline for monitoring HADR_REPLICAINFO_SYNC waits going forward. Only run during planned maintenance windows as it affects monitoring visibility across all wait types.
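Where clearing the counters is not acceptable, a non-destructive alternative is to sample sys.dm_os_wait_stats twice and compare the two readings. The following is a minimal sketch of that idea, not a tuned monitoring script: the 60-second window and the temporary table name #wait_before are illustrative assumptions, and the login needs permission to read server state.

```sql
-- Sketch: measure HADR_REPLICAINFO_SYNC over a short window without clearing counters.
-- The window length and the temp table name are illustrative assumptions.

-- First snapshot of the cumulative counters
SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
INTO #wait_before
FROM sys.dm_os_wait_stats
WHERE wait_type = N'HADR_REPLICAINFO_SYNC';

-- Sampling window (adjust to suit the workload)
WAITFOR DELAY '00:01:00';

-- Second reading, diffed against the first
SELECT
    a.wait_type,
    a.waiting_tasks_count - b.waiting_tasks_count AS waits_in_window,
    a.wait_time_ms - b.wait_time_ms AS wait_ms_in_window,
    a.signal_wait_time_ms - b.signal_wait_time_ms AS signal_ms_in_window,
    -- Average is NULL when no new waits were recorded in the window
    CASE WHEN a.waiting_tasks_count > b.waiting_tasks_count
         THEN (a.wait_time_ms - b.wait_time_ms) * 1.0
              / (a.waiting_tasks_count - b.waiting_tasks_count)
    END AS avg_ms_per_wait
FROM sys.dm_os_wait_stats AS a
INNER JOIN #wait_before AS b ON a.wait_type = b.wait_type;

DROP TABLE #wait_before;
```

Repeating the sample at intervals shows whether the wait is sustained or only appears around state changes such as failovers.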
Force Availability Group Health Detection Cycle
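-- Temporarily change the health check timeout, then restore it
-- Assumes the original value is the default of 30000 ms; check health_check_timeout in sys.availability_groups first
ALTER AVAILABILITY GROUP [YourAvailabilityGroupName]
SET (HEALTH_CHECK_TIMEOUT = 60000);
-- Restore the original timeout after a short pause
WAITFOR DELAY '00:00:05';
ALTER AVAILABILITY GROUP [YourAvailabilityGroupName]
SET (HEALTH_CHECK_TIMEOUT = 30000);
Replace [YourAvailabilityGroupName] with your actual AG name, and run the script on the primary replica. The first statement must set a value different from the current one, otherwise nothing changes; the second restores the original value, so substitute your own if it is not 30000. Changing the setting rewrites the AG configuration and is intended to prompt a fresh health evaluation, but an immediate health detection cycle is not documented behavior, so treat this as a low-risk nudge rather than a guaranteed fix. The script raises the timeout temporarily rather than lowering it, because a lower value tightens failure detection and can trigger failover conditions if replicas are genuinely unhealthy. Test in development first.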
Restart Availability Group Resource
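-- PowerShell script to restart AG resource (run from Windows cluster node)
-- Import-Module FailoverClusters
-- Stop-ClusterResource -Name "YourAGResourceName"
-- Start-ClusterResource -Name "YourAGResourceName"
-- Verify AG replica placement and cluster join state after restart
SELECT
n.group_name,
n.replica_server_name,
n.node_name,
cs.join_state_desc
FROM sys.dm_hadr_availability_replica_cluster_nodes n
INNER JOIN sys.availability_groups_cluster agc ON n.group_name = agc.name
LEFT JOIN sys.dm_hadr_availability_replica_cluster_states cs ON agc.group_id = cs.group_id
AND n.replica_server_name = cs.replica_server_name;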
This PowerShell approach restarts the cluster resource managing the availability group, which can resolve stuck replica metadata states. Taking the resource offline makes the availability group databases unavailable until it comes back online, so coordinate with your Windows administrator, schedule a maintenance window, and test thoroughly in development first.
Prevention
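Configure appropriate timeouts for your environment. HEALTH_CHECK_TIMEOUT controls how long the cluster waits for sp_server_diagnostics to report instance health; it defaults to 30,000 ms and cannot be set below 15,000 ms, so it should be sized for how the instance responds under load rather than for network round-trip time. Sensitivity to network latency between replicas is governed by the per-replica SESSION_TIMEOUT, which defaults to 10 seconds. Values that are too low for either setting cause false positives and excessive metadata updates.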
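Before changing any timeout, it helps to review what is currently configured. The query below is a small sketch that lists the group-level health check settings alongside each replica's session timeout, using the sys.availability_groups and sys.availability_replicas catalog views; the column aliases are illustrative.

```sql
-- Sketch: current health check and session timeout settings per AG and replica
SELECT
    ag.name AS availability_group_name,
    ag.failure_condition_level,
    ag.health_check_timeout AS health_check_timeout_ms,
    r.replica_server_name,
    r.availability_mode_desc,
    r.failover_mode_desc,
    r.session_timeout AS session_timeout_seconds
FROM sys.availability_groups AS ag
INNER JOIN sys.availability_replicas AS r ON ag.group_id = r.group_id
ORDER BY ag.name, r.replica_server_name;
```

Recording these values first also gives you the numbers to restore if a change has to be rolled back.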
Implement robust network monitoring between Always On replicas. Network packet loss, high latency, or intermittent connectivity creates frequent replica state changes that trigger HADR_REPLICAINFO_SYNC waits. Use dedicated network links for Always On traffic when possible.
Monitor the Always On dashboard and the health-related extended events sessions regularly. Configure alerts on availability group state changes, connection failures, and synchronization issues before they manifest as sustained wait types. The AlwaysOn_health session captures the Always On state-change events, and the system_health session complements it with instance-level diagnostics.
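As one way to review recent replica state changes, the sketch below reads the event files written by the AlwaysOn_health extended events session. It assumes that session is running with its default file target in the instance's default log directory, so the file pattern may need adjusting. The two event names in the filter are the ones this sketch assumes are of interest.

```sql
-- Sketch: recent replica state-change events from the AlwaysOn_health session files.
-- Assumes the session writes AlwaysOn_health*.xel to the default log directory.
SELECT
    xe.object_name AS event_name,
    CAST(xe.event_data AS xml).value('(event/@timestamp)[1]', 'datetime2') AS event_time_utc,
    CAST(xe.event_data AS xml) AS event_xml   -- full payload for manual inspection
FROM sys.fn_xe_file_target_read_file(N'AlwaysOn_health*.xel', NULL, NULL, NULL) AS xe
WHERE xe.object_name IN (
    N'availability_replica_state_change',
    N'availability_replica_manager_state_change'
)
ORDER BY event_time_utc DESC;
```

Comparing the event timestamps with periods of elevated HADR_REPLICAINFO_SYNC waits can show whether the waits line up with actual state transitions.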
Avoid frequent manual failovers and role changes in production environments. Each role transition requires replica metadata updates that generate these waits. Implement proper load balancing and capacity planning to minimize unnecessary failovers.
Consider upgrading to SQL Server 2019 or later versions if experiencing frequent HADR_REPLICAINFO_SYNC waits on older versions. The internal Always On state management improvements significantly reduce metadata contention in newer releases.