Few IT scenarios inspire as much dread as the words “the database is corrupted.” For UK businesses that depend on databases to run their operations — customer records, financial transactions, inventory management, order processing — a corrupted database is not merely an inconvenience. It is a potential business-ending event if not handled correctly.
Database corruption can strike without warning. A power failure during a write operation, a failed storage controller, a buggy application committing malformed data, or even a ransomware attack can leave your database in an inconsistent state where some or all of the data is inaccessible, incorrect, or missing entirely.
This guide covers the types of database corruption you might encounter, the recovery options available in SQL Server and Azure SQL, step-by-step recovery procedures, and — most importantly — the prevention strategies that ensure you are never in the devastating position of having no viable recovery path.
Understanding Database Corruption
Database corruption occurs when data stored in a database becomes inconsistent, unreadable, or logically incorrect. Not all corruption is created equal — the type and extent of corruption determine the severity of the problem and the appropriate recovery approach.
Physical corruption (also called hardware-level corruption) occurs when the actual bits stored on disk are damaged or altered. This can result from storage hardware failures (failing disk sectors, RAID controller issues), sudden power loss during write operations, firmware bugs in storage controllers, or faulty memory (RAM). Physical corruption is typically detected by SQL Server through checksum validation — every data page includes a checksum that is verified when the page is read. If the checksum does not match, SQL Server reports a corruption error.
Logical corruption occurs when the data on disk is physically intact but the logical structure of the database is inconsistent. This can manifest as orphaned rows (data that exists in a table but is not referenced by any index), broken page chains (where the pointers linking data pages together are incorrect), metadata inconsistencies (where the system tables disagree about the structure of the database), or torn pages (where only part of a page was written before a failure occurred).
SQL Server uses several mechanisms to detect corruption. Page checksums are calculated when a page is written to disk and verified when it is read back. If the checksums do not match, SQL Server raises error 824. Torn page detection checks whether all sectors within a page were written together by storing a bit pattern that is inverted with each write. If the pattern is inconsistent, a torn page error (error 824) is raised. DBCC CHECKDB performs a comprehensive consistency check that validates both physical and logical integrity. It checks every page in every table, validates all index structures, and identifies orphaned or mislinked pages. Running DBCC CHECKDB regularly is the single most important corruption detection measure you can implement.
Immediate Response: What to Do When You Discover Corruption
When database corruption is detected, your immediate response is critical. The wrong actions in the first few minutes can make the situation significantly worse — potentially turning a recoverable problem into a catastrophic data loss.
Step one: Do not panic, and do not restart anything. The natural instinct when something goes wrong with a database is to restart SQL Server or reboot the server. Resist this urge. Restarting can cause SQL Server to roll back transactions in a way that compounds the corruption, or the recovery process during startup might fail entirely, leaving you in a worse position.
Step two: Assess the scope. Run DBCC CHECKDB against the affected database to determine the full extent of the corruption. This command will report every consistency error it finds, giving you a clear picture of what is damaged and what is intact. Document the output carefully — you will need it to plan your recovery.
Step three: Protect your backups. Before attempting any repair, verify that your backup chain is intact. Check that your most recent full backup, differential backup, and transaction log backups are accessible and uncorrupted. Copy them to a separate location if necessary. Your backups are your lifeline — protect them above all else.
Step four: Take the database offline if necessary. If the corruption is actively spreading (which can happen with certain storage failures), take the database offline to prevent further damage. Use ALTER DATABASE [YourDatabase] SET OFFLINE to stop all access.
Recovery Option 1: Restore from Backup
Restoring from backup is the gold standard for database recovery. If you have a solid backup strategy with regular full backups, differential backups, and transaction log backups, you can recover to any point in time with minimal data loss.
The restore process follows a specific sequence. First, restore the most recent full backup using RESTORE DATABASE with the NORECOVERY option (which leaves the database in a restoring state, ready to accept additional backups). Next, restore the most recent differential backup (again with NORECOVERY). Finally, restore each transaction log backup in sequence up to the point in time you want to recover to. The last restore in the chain uses the RECOVERY option to bring the database online.
For SQL Server on-premises, the critical question is how much data you can afford to lose. If your last full backup was taken at midnight and the corruption occurred at 3pm, restoring only the full backup would lose 15 hours of data. But if you have been taking transaction log backups every 15 minutes, you can restore to within 15 minutes of the corruption event — losing at most a quarter of an hour's worth of transactions.
SQL Server On-Premises Recovery
Azure SQL Database Recovery
Recovery Option 2: DBCC CHECKDB with REPAIR
When backup restoration is not possible — because backups are missing, corrupted themselves, or too old to be useful — DBCC CHECKDB with repair options becomes the fallback. It is important to understand that this is a last resort, not a first choice, because repair operations can result in data loss.
DBCC CHECKDB offers two repair levels. REPAIR_REBUILD attempts to fix corruption without data loss by rebuilding indexes, correcting page header information, and fixing minor structural issues. This is always the first repair option to try. REPAIR_ALLOW_DATA_LOSS performs more aggressive repairs that may delete corrupted data to restore the database to a consistent state. As the name implies, this can result in permanent data loss. It should only be used when REPAIR_REBUILD cannot fix the problem and no viable backup exists.
Before running either repair option, the database must be in single-user mode (ALTER DATABASE [YourDatabase] SET SINGLE_USER WITH ROLLBACK IMMEDIATE). After repair completes, run DBCC CHECKDB again to verify the database is now consistent, then return it to multi-user mode.
The data loss from REPAIR_ALLOW_DATA_LOSS is not random — it follows a specific pattern. SQL Server identifies the corrupted pages and deallocates them, effectively deleting the rows stored on those pages. It then rebuilds any affected indexes. The amount of data lost depends entirely on how many pages are corrupted. In many cases, only a handful of rows are affected, and the loss is manageable. In severe cases, entire tables can be lost. Always document which pages and tables are affected (from the DBCC CHECKDB output) before running repair, so you know exactly what data might be lost and can attempt to recover it from other sources (application logs, paper records, third-party systems) after the repair.
Recovery Option 3: Point-in-Time Recovery in Azure SQL
If your database runs on Azure SQL Database or Azure SQL Managed Instance, you have access to one of the most powerful recovery capabilities available: automatic point-in-time restore (PITR).
Azure SQL automatically takes full backups weekly, differential backups every 12 to 24 hours, and transaction log backups every 5 to 10 minutes. These backups are stored on geo-redundant storage by default, meaning they exist in at least two Azure regions. The default retention period is 7 days for Basic tier, 35 days for Standard and Premium tiers, and up to 10 years with long-term retention policies.
To restore to a specific point in time, navigate to the database in the Azure portal, click “Restore,” and select the exact date and time you want to restore to. Azure creates a new database with the data as it existed at that moment. The restore typically completes in minutes for small databases and an hour or so for larger ones. Once restored, you can verify the data, rename the databases, and switch your application to the restored copy.
Prevention: The Best Recovery Strategy
The most effective approach to database corruption is preventing it from happening in the first place, and ensuring that when it does happen (because eventually it will), you have multiple recovery options available.
Implement a robust backup strategy. Follow the 3-2-1 rule: three copies of your data, on two different media types, with one copy off-site. For SQL Server on-premises, this means automated full backups at least daily, differential backups every four to six hours, and transaction log backups every 15 minutes for databases in full recovery mode. Store backups on separate storage from the database files, and replicate them to a secondary location (a different physical site, Azure Blob Storage, or an off-site backup service).
Run DBCC CHECKDB regularly. Schedule DBCC CHECKDB to run against every database at least weekly. For critical databases, run it daily during a maintenance window. DBCC CHECKDB is the only way to detect latent corruption — corruption that exists on disk but has not yet been accessed by a query. Without regular consistency checks, you might not discover corruption until weeks after it occurred, by which time your backup chain may have rotated past the last clean backup.
Monitor storage health. Most database corruption originates from storage issues. Monitor your storage for SMART warnings (for local disks), RAID degradation alerts, latency spikes that could indicate failing disks, and error rates on storage controller logs. For Azure VMs, use Azure Monitor to track disk latency, IOPS, and throughput, and set alerts for anomalous patterns.
| Prevention Measure | Frequency | What It Protects Against | Implementation Effort |
|---|---|---|---|
| Full database backups | Daily | All data loss scenarios | Low — SQL Server Agent job |
| Transaction log backups | Every 15 minutes | Minimises RPO for point-in-time recovery | Low — SQL Server Agent job |
| DBCC CHECKDB | Weekly (daily for critical DBs) | Detects corruption before it causes outage | Low — scheduled maintenance plan |
| Backup verification (RESTORE VERIFYONLY) | Weekly | Ensures backups are actually restorable | Low — automated script |
| Off-site backup replication | Daily or real-time | Site-level disasters, ransomware | Medium — requires secondary storage |
| UPS with graceful shutdown | Always active | Power failure during write operations | Medium — hardware investment |
| Storage health monitoring | Continuous | Hardware-level corruption from failing disks | Low — monitoring tools |
GDPR and Data Recovery Obligations
For UK businesses subject to GDPR (which is virtually all of them), database corruption and data loss have regulatory implications. Under Article 32 of the UK GDPR, organisations must implement appropriate technical and organisational measures to ensure the ongoing confidentiality, integrity, availability, and resilience of processing systems, and the ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident.
In practical terms, this means your backup and recovery procedures are not optional — they are a legal requirement. The ICO expects organisations to demonstrate that they have tested backup and recovery procedures, that backups are encrypted and stored securely, that recovery time objectives are defined and achievable, and that regular tests confirm the ability to restore data from backups. If a database corruption incident results in the loss of personal data and you cannot demonstrate that adequate backup and recovery measures were in place, the ICO may consider this a failure of Article 32 obligations, potentially resulting in enforcement action.
Having backups is not the same as having a working recovery capability. A shocking number of UK businesses discover during an actual incident that their backups are incomplete, corrupted, or cannot be restored within an acceptable timeframe. The only way to confirm your recovery works is to test it regularly. Schedule quarterly recovery tests where you restore a full backup to a test environment and verify that the database is consistent and the data is complete. Document the recovery time and compare it against your recovery time objective (RTO). If the test reveals problems — slow restore times, missing data, incompatible backup formats — fix them before you face a real incident. The cost of quarterly testing is negligible compared to the cost of discovering your backups do not work when your business is at stake.
Building Your Database Resilience Plan
Every UK business that depends on databases should have a documented database resilience plan that covers prevention, detection, and recovery. This plan should define the RPO (Recovery Point Objective) — the maximum amount of data loss you can tolerate, the RTO (Recovery Time Objective) — the maximum acceptable downtime, the backup schedule and retention policy, the DBCC CHECKDB schedule, the escalation path when corruption is detected, step-by-step recovery procedures for each scenario, and the testing schedule for recovery procedures.
This plan should be reviewed annually, tested quarterly, and updated whenever your database environment changes — new databases, new applications, infrastructure migrations, or changes to your compliance requirements.
Need Help with Database Recovery or Prevention?
Cloudswitched provides database management and recovery services for UK businesses running SQL Server on-premises and Azure SQL in the cloud. From emergency corruption recovery to implementing comprehensive backup strategies, DBCC CHECKDB scheduling, and Azure SQL point-in-time recovery configuration, we ensure your data is protected and recoverable. If you are facing a database emergency or want to strengthen your prevention strategy, contact us immediately.
GET IN TOUCH
