Quick Answer
SQL Server Error 825 indicates a storage subsystem failure where the operating system had to retry read operations multiple times before succeeding. This points to failing disk hardware, faulty storage controllers, or deteriorating SAN connectivity that will cause data corruption if not immediately addressed.
Root Cause Analysis
Error 825 triggers when SQL Server's I/O completion routines detect that the Windows storage stack required multiple retry attempts to complete a read operation. The SQL Server buffer pool manager requests data pages through the Windows I/O subsystem, which normally completes synchronously or asynchronously with a single attempt. When the storage subsystem encounters hardware-level errors, Windows implements automatic retry logic before reporting failure back to SQL Server.
The error surfaces in SQL Server's I/O completion ports when the storage stack reports ERROR_CRC or similar hardware errors during the initial read attempt. SQL Server's lazywriter and checkpoint processes continue operating normally, but the underlying storage reliability has degraded significantly. The buffer pool manager still receives the requested data pages, but the retry behavior indicates imminent hardware failure.
Starting with SQL Server 2016, Microsoft enhanced I/O error reporting to include more detailed hardware diagnostics in the error log. SQL Server 2019 introduced additional correlation with Windows System Event Log entries. SQL Server 2022 expanded integration with Azure Monitor for hybrid environments, providing better visibility into storage health patterns.
The error typically correlates with specific physical storage components: failing disk drives in RAID arrays, degrading SAS/SATA cables, overheating storage controllers, or SAN fabric instability. Unlike logical corruption errors, Error 825 represents actual hardware communication failures at the Windows kernel level.
AutoDBA checks I/O performance metrics, storage health monitoring, and database integrity check scheduling across your entire SQL Server instance in 60 seconds. Download the free diagnostic script and see what else needs attention.
Diagnostic Queries
-- Check for recent I/O stall patterns and error frequencies
SELECT
database_id,
file_id,
DB_NAME(database_id) AS database_name,
io_stall_read_ms,
io_stall_write_ms,
num_of_reads,
num_of_writes,
io_stall_read_ms / NULLIF(num_of_reads, 0) AS avg_read_stall_ms,
io_stall_write_ms / NULLIF(num_of_writes, 0) AS avg_write_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL)
WHERE io_stall_read_ms > 1000 OR io_stall_write_ms > 1000
ORDER BY io_stall_read_ms DESC;
-- Identify specific database files experiencing I/O pressure
SELECT
mf.database_id,
mf.file_id,
mf.physical_name,
mf.type_desc,
divfs.num_of_reads,
divfs.num_of_writes,
divfs.io_stall,
divfs.io_stall_read_ms,
divfs.io_stall_write_ms
FROM sys.master_files mf
INNER JOIN sys.dm_io_virtual_file_stats(NULL, NULL) divfs
ON mf.database_id = divfs.database_id
AND mf.file_id = divfs.file_id
WHERE divfs.io_stall > 50000
ORDER BY divfs.io_stall DESC;
-- Check for pending I/O requests that may timeout
SELECT
database_id,
file_id,
io_stall,
io_pending_ms_ticks,
scheduler_address
FROM sys.dm_io_pending_io_requests
ORDER BY io_pending_ms_ticks DESC;
-- Review recent error log entries for I/O related messages
EXEC xp_readerrorlog 0, 1, N'I/O', N'error';
-- Examine wait statistics for storage-related bottlenecks
SELECT
wait_type,
waiting_tasks_count,
wait_time_ms,
max_wait_time_ms,
signal_wait_time_ms,
wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN ('PAGEIOLATCH_SH', 'PAGEIOLATCH_EX', 'WRITELOG',
'IO_COMPLETION', 'ASYNC_IO_COMPLETION')
AND waiting_tasks_count > 0
ORDER BY wait_time_ms DESC;
Fix Scripts
Immediate Data Integrity Check Run DBCC CHECKDB on affected databases to identify any corruption that may have occurred during the I/O retry events. Error 825 often precedes detectable corruption.
-- Check database integrity with extended logical checks
-- WARNING: This will consume significant I/O and CPU resources
-- Run during maintenance window or low activity period
USE [YourDatabaseName];
GO
DBCC CHECKDB WITH EXTENDED_LOGICAL_CHECKS, NO_INFOMSGS;
GO
Enable Page Verify and Backup Validation Configure checksum page verification to detect storage corruption earlier and validate backup integrity.
-- Enable page checksum verification for early corruption detection
-- This only affects new pages, existing pages retain their current verification
ALTER DATABASE [YourDatabaseName] SET PAGE_VERIFY CHECKSUM;
GO
-- Verify recent backups are not corrupted
RESTORE VERIFYONLY FROM DISK = 'C:\YourBackupPath\database.bak'
WITH CHECKSUM;
Increase Error Log Monitoring Reduce error log recycling to retain more historical I/O error information for pattern analysis.
-- Increase error log retention to capture more I/O error patterns
-- Default is typically 6 logs, increase to 20 for better trend analysis
EXEC sys.sp_cycle_errorlog;
EXEC sys.xp_instance_regwrite
N'HKEY_LOCAL_MACHINE',
N'SOFTWARE\Microsoft\MSSQLServer\MSSQLServer',
N'NumErrorLogs',
REG_DWORD,
20;
Configure Database File Placement Review Document current file placement for hardware replacement planning and validate RAID configuration.
-- Document current database file layout for hardware remediation
SELECT
DB_NAME(database_id) AS DatabaseName,
name AS LogicalName,
physical_name AS PhysicalPath,
type_desc AS FileType,
size * 8 / 1024 AS SizeMB,
CASE WHEN max_size = -1 THEN 'Unlimited'
ELSE CAST(max_size * 8 / 1024 AS VARCHAR(20)) + ' MB'
END AS MaxSize,
growth AS GrowthSetting,
is_percent_growth
FROM sys.master_files
ORDER BY database_id, type_desc DESC;
AutoDBA generates fix scripts like these automatically, with impact estimates and rollback SQL included.
Prevention
Replace the failing storage hardware immediately. Error 825 represents actual hardware failure that will worsen rapidly, leading to data corruption or complete database inaccessibility. Contact your storage vendor for emergency hardware replacement.
Implement comprehensive I/O monitoring using Performance Monitor counters for Avg. Disk sec/Read and Avg. Disk sec/Write. Values consistently above 20ms indicate storage performance degradation that precedes Error 825 conditions. Monitor PhysicalDisk\Current Disk Queue Length to identify I/O saturation patterns.
Configure SQL Server trace flag 3226 to suppress backup completion messages in the error log, making I/O error patterns more visible. Establish baseline I/O performance metrics using sys.dm_io_virtual_file_stats weekly snapshots to identify performance trends before hardware failure occurs.
Deploy database files across multiple physical spindles or separate SSD devices to prevent single points of failure. For critical databases, implement database mirroring or Always On Availability Groups with replicas on independent storage subsystems. This provides immediate failover capability when storage hardware fails.
Schedule monthly DBCC CHECKDB operations with EXTENDED_LOGICAL_CHECKS to detect corruption before it spreads. Configure database mail alerts for Error 825 occurrences to enable immediate response. Document your storage topology including RAID levels, controller models, and firmware versions to accelerate vendor support engagement during hardware failures.
Need hands-on help?
Dealing with persistent sql server error 825 issues across your environment? Samix Technology provides hands-on SQL Server performance consulting with 15+ years of production DBA experience.