SQL Server Corruption: More about Detecting Corruption

SQL Server Corruption: More about Detecting Corruption
Download PDF

The integrity of a Microsoft SQL Server database can be severely undermined by corruption, leading to data inaccessibility and potential operational disasters. Steve Stedman’s comprehensive approach to identifying and rectifying such corruption is often considered an essential tool in the SQL Server administrator’s kit, underlining its significance in maintaining database health.

Transcribed from a presentation on Database Corruption by Steve Stedman and Derrick Bovenkamp.

Derrick : How do we find corruption? Oh, man, I wish it was as easy as taking out the magnifying glass there.

Steve : Yeah, if we had this magic magnifying glass that would let us look at the data bytes like that, that’d be cool. But we don’t unfortunately. So the tools we have, the first one is CHECK DB. This is really the number one standard for finding corruption. With checkdb, run it frequently have alerting on it. We’ll go into a little bit more detail on that later. But when you run check dB, the nice thing is for those who are not colorblind is that everything that check DB puts in the output window, that’s red is usually a problem. And everything that’s black is usually okay, or a summary of the problem. So the quick thing is you run check DB on whichever database you’re looking into and you then scan through the results and look for anything red. Anything red is really bad and has to be dealt with at that point.

Derrick : Yeah, and the next thing that you can do is, especially if your database is too big to get check DB back in a reasonable time, if you suspect corruption, is you could try just a check table.

Steve : And that’s something that if you can script or automate these, it’ll make things a whole lot easier. Now, some people say, Well, can I just run check table on all my tables in the database and not run a check dB? Yeah, you could do that but you’re going to end up missing things. And there’s some other checks that are part of DBCC, like check constraints, check catalog, check Alec, check file group, check identity, and checking system tables as well, because when you run check table, you’re usually not checking the hidden system tables you don’t even see as part of the database. And when those become corrupt, that was one of the more difficult ones that we’ve had to repair. But you can run all these other things to check specific areas.

Derrick : You can also look at suspect pages to see if there’s any specific pages there. We have seen this before, a lot of the times, you’ll actually see errors in the SQL server error log. You’ll sometimes see the suspect pages if they used to have corruption, and it was repaired and the suspect pages weren’t cleared out of that suspect pages table.

Steve : If you happen to download any of the database corruption, challenge corrupt databases, and then do the Restore, usually during the restore injects data into the suspect pages, that there’s something corrupt. So just another table to check regularly. Also, just running a query may show corruption, select star from revenue, and you’re expecting piles and piles of money from your company’s revenue from your accounting database. But instead you get a logical consistency based IO error. Well, that’s no good. Anytime you get read output like this, this is something that really needs some attention.

Derrick : Before I started working with Steve, I wouldn’t have guessed it as possible but put your smell test next to any output you get. If you expect revenue to have, let’s say, 1000 rows in it, and your query just keeps selecting and selecting and selecting and selecting, and you’re at 50,000 rows, something might be up.

Steve : And there’s actually an example in the database corruption challenge that shows where the linkage between pages in the database gets mixed up and puts you in an endless loop and select star from table, the table will run until you run out of memory. It just keeps repeating the same rows over and over and over again. And that’s a real bad spot to be in. Also, if you’ve restarted your server, remember earlier, we said don’t reboot your SQL Server, it may make things worse, well, if you if you did do that, and you’ve got a corrupt database, it may not come back online. And you may be either in a recovery pending state or in a suspect state. Now recovery pending is okay for a little bit. And what happens is, after you’ve done an unclean shutdown of the SQL Server, it goes through what’s called Crash Recovery mode where it’s rebuilding all those open transactions that that were running when the date when the SQL Server was shut down. And depending on the speed of your server that may take a few minutes to a few hours. But when that’s going on, you usually see things in the error log to show that the database is moving, it’s attempting to be recovered. If it’s not moving, or if it’s been in recovery pending mode for days, it’s probably never going to come back without some type of intervention.

Derrick : I would also add, sometimes you’ll see that recovery pending with a really, really slow storage. I kind of just that screenshot almost sent shivers down my spine because I worked on a few SQL servers where the storage was so slow and the database was so big that every time you restart SQL it take five or 10 minutes for it to come back up and wondering if there’s actually a problem or if it’s going to come online.

Steve : Absolutely.

Want to learn more about how to prepare for corruption on SQL Server? Take a look at my class that will teach how to to be prepared if corruption strikes.

Corruption Class by Steve Stedman.

 

More from Stedman Solutions:

SteveStedman5
Steve and the team at Stedman Solutions are here for all your SQL Server needs.
Contact us today for your free 30 minute consultation..
We are ready to help!

Leave a Reply

Your email address will not be published. Required fields are marked *

*