SQL Server DBCC Deep Dive
DBCC CHECKDB
- Only way to read and test all pages in DB
- By default its CPU bound (full check), but can be made IO bound using WITH PHYSICAL_ONLY
- A resource hog, and uses tempdb as well
- wont take locks unless specified WITH TABLOCK
- New Features starting 2005:
(a) progress reporting using sys.dm_exec_reports
(b) Data purity reports e.g. datetime correctness
(c) last DBCC completed successfully
- New Features starting 2008:
(a) WITH EXTENDED_LOGICAL_CHECKS
How to Run CHECKDB?
- Returns first 200 errors by default, use ALL_ERRORMSGS
- No harm in letting it complete even if its taking time; if running for too long, check for error 5628 to check if its rescheduling certain check
What CHECKDB Do
(1) primitive check for critical system tables (any problem here – game over)
(2) allocation checks (IAM pages etc)
(3) Logical checks of critical system tables (key order, index mapping, LOB pointer, file stream)
(4) logical check for all tables
(5) Service Broker data validation
(6) Metadata table checks
(7) Indexed view, XML index, spatial index checks (off by default)
NOTE that repairs are done at each stage. CheckDB creates a snapshot and runs the check on that for consistency. You can as well create a snapshot of your own and run DBCC on that (in case default site dont have space)
How CHECKDB checks as above:
- Primitive system table checks
- DBCC CHECKALLOC
- DBCC CHECKTABLE (system tables)
- DBCC CHECKTABLE
- DBCC CHECKCATALOG.
You might want to optionally run DBCC CHECKIDENT and CHECKCONSTRAINTS.
CHECKFILEGROUP: checks file and hosted tables and index. can be opted for after file based operations.
CHECKTABLE: Dont use WITHOUT INDEX flag.
NOTE: – Size TempDB, use WITH_ESTIMATEONLY flag if needed. – If checkdb running for too long it might mean it has found something to work upon. Run CHECKDN as frequently as can.
Interpreting and Fixing Issue Using CheckDB
- If there are only a couple errors, check MSDN – If there are many errors, use some tips and tricks, as below:
(1) If checkdb doesn’t complete it tells why it couldn’t (also means you cant perform recovery, so restore). e.g. of fatal errors, 7984-7988 critical system table corrupt; 8967 invalid states in checkdb; 8930 corrupt metadata
(2) Are the error messages for NonClustered Index (NCI)? Run index rebuild. When rebuilding indexes, it references older records, so issue might persist. So drop and recreated. Always rerun DBCC
(3) Unrepairable erros:
(a) 8909, 8938, 8939 i.e. page header corruption – restore page from backup
(b) 8970 invalid data for column – work with BAs
(c) 8992 checkcatalog error; find object ID, find catalog table having errorenous objectID, shutdown db, startup with -m: param, delete the entry from errooneous table, rerun DBCC, if fixed startup without -m: flag
(d) 2570 data purity error – find the page from DBCC CHECKDB output, run DBCC PAGE, try delete or update the records as appropriate
REPAIR_ALLOW_DATA_LOSS
- Beware of using it
- repairs physical strucuture w/o any consideration for data / biz logic; fastest mode of repair.
- Drop create for what cant be corrected
- Doesnt take into account: replication, mirroring, biz logic, data relationship, foreign key constraints
- Take backup before doing this
- After using it, run CHECKCONSTRAINTS and fix up replication topologies
Reference: SQL Server MCM Deep Dive Sessions