Database Corruption vs Drive Failure Explained – Video

If you don’t know the difference between database corruption and drive failure, this video is for you. It is all explained in this quick 6-minute video.

Transcription:

Steve Stedman 0:10
Now, one of the things I want to point out when we talk about drive failures is that there’s a big difference between corruption and complete drive failure. Here’s an example. This is one from a couple of years ago, and I think Randolph West, if he’s on the call, came in and assisted with this one as well to take a look at it. It was another medical group, and we do a lot of work with medical groups. They called us because they had a corrupt database, so we started looking into it to find out what was corrupt. It turned out they had a RAID 50 drive array, and on that was the storage for the virtual machine, and it had multiple drive failures. And it turns out that with RAID, even RAID 5 or RAID 50, if you lose enough disks, eventually you end up with just a big hole in your data. And by a big hole, I mean, like 20 to 25% of every disk was gone. And for big files, 20 to 25% of the file was gone.
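The "lose enough disks" point can be illustrated with the XOR parity that RAID 5 relies on. The following is a minimal sketch, not a model of the client's array: the four-drive layout, the block contents, and the helper name `xor_blocks` are all made up for illustration. It shows that one missing block in a stripe can be rebuilt from the survivors, while two missing blocks cannot. RAID 50 stripes data across several RAID 5 groups, so a group that loses two drives leaves a gap in every large file striped over it.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical 4-drive RAID 5 set: 3 data blocks plus 1 parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# One drive lost: XOR the surviving blocks with parity to rebuild the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])

# Two drives lost: the survivors only give the XOR of the two missing blocks,
# which is not enough to recover either block on its own.
leftover = xor_blocks([data[2], parity])

print(rebuilt == data[1])                            # single failure is recoverable
print(leftover == xor_blocks([data[0], data[1]]))    # double failure leaves only A xor B
```

Once a second drive in the same parity group is gone, there is nothing left to compute the missing blocks from, which is why this is data loss rather than repairable corruption.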

Derrick Bovenkamp 1:11
Yeah. So another tough thing they had here is, we put that the client had no backup. They did have backups, but they were on the same RAID 50 array as the database. Yep. So the backups also had the same 20% hole missing out of them. And, you know, this is another one where I don’t know if this client had hot spares available and they’d eaten through a hot spare as well. But they definitely ate through two drives and not just the one drive.

Steve Stedman 1:41
And with this, basically, there was nothing we could do other than help them restore a backup from six months ago. But that was something that they were able to do on their own. And basically 20% of every large file on the drive was missing. And when we look at that, that’s not corruption. Corruption is areas where things are damaged but can be repaired. This is like the difference between your car getting a fender bender that you can repair, and your car being stolen. It’s just gone at that point. Yeah. And basically, there was no option from the database side to repair this in any way. And here’s an example of what the SQL Server error log looked like.

Derrick Bovenkamp 2:20
Yeah, so if you look at that, you can see the error log looks normal at the top and at the bottom. And there’s just a whole bunch of some kind of garbage in the middle. That’s not their log.
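A rough way to spot that "normal at the top and bottom, garbage in the middle" pattern in a text file is to look for stretches where most bytes are not printable. This is only a sketch under stated assumptions: the helper `find_garbage_chunks`, the chunk size, the threshold, and the sample log line are all hypothetical, and it assumes single-byte text. A log stored as UTF-16 would need to be decoded first.

```python
import string

# Byte values that count as ordinary text (ASCII letters, digits, punctuation, whitespace).
PRINTABLE = set(string.printable.encode("ascii"))

def find_garbage_chunks(data: bytes, chunk_size: int = 64, threshold: float = 0.3):
    """Return start offsets of chunks where too many bytes are non-printable."""
    suspicious = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        bad = sum(1 for b in chunk if b not in PRINTABLE)
        if bad / len(chunk) > threshold:
            suspicious.append(start)
    return suspicious

# Made-up sample: readable lines, then a run of non-text bytes, then readable lines again.
good = b"2020-01-01 12:00:00 example log line\n"
sample = good * 2 + bytes(range(128, 256)) + good * 2

print(find_garbage_chunks(sample))    # offsets that fall inside the damaged region
print(find_garbage_chunks(good * 4))  # a clean file reports nothing
```

A check like this only tells you that a file contains non-text regions. It says nothing about whether the cause is corruption or, as in this case, storage that is simply gone.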

Steve Stedman 2:33
And every single large file, and not even that large, but every single medium to large file on the disk had this type of problem with it. So this was a reinstall-the-operating-system-and-rebuild-from-scratch situation, based on where they were. So there’s a lot of confusion with database corruption. And I think that if you’ve ever run CHECKDB and got errors, and you start googling some of those errors, you find a lot of people out there who maybe have not done as much corruption repair as they should have to be talking about it. And they come up with all kinds of things that they suggest or tell you about what’s going on with that corruption. And one of those is that a full backup and restore of a corrupt database may help fix the corruption.

Derrick Bovenkamp 3:16
Unfortunately, that is false. When you do a full database backup, it backs up the corrupted data with it. And, you know, we put it in brackets there: that’s if you can take a backup. There are a lot of times where the backup just quits halfway through. And you know, one of the ways that we can prove this is the Database Corruption Challenge. That’s how we distributed the corrupt databases: we backed them up.

Steve Stedman 3:41
Yeah, in the Database Corruption Challenge, if you download many of the weeks of database corruption, you download a backup file, you do that restore, and it comes back corrupt. So doing a full backup and restore of the database will not help fix it. However, there are some situations where, if you happen to have a full backup and maybe some transaction log backups, you might be able to replay the transaction log backups to get past the point in time where the corruption occurred, provided the storage holding the transaction log backups does not have the same corruption that caused the database to be corrupt to begin with.
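Why a backup and restore brings the corruption back can be shown with a toy model. This is a sketch only, not how SQL Server implements backups: the page dictionaries, the CRC32 checksum, and the function names are assumptions chosen for illustration. Each page stores the checksum computed when it was written, the backup copies pages exactly as they sit on disk, and the restore writes them back unchanged.

```python
import zlib

def make_page(payload: bytes):
    """Store a payload together with the checksum computed when it was written."""
    return {"payload": payload, "checksum": zlib.crc32(payload)}

def is_corrupt(page) -> bool:
    """A page is suspect when its contents no longer match the stored checksum."""
    return zlib.crc32(page["payload"]) != page["checksum"]

def backup(database):
    """Copy every page as it is, damaged or not."""
    return [dict(page) for page in database]

def restore(backup_pages):
    """Write the backed-up pages back out unchanged."""
    return [dict(page) for page in backup_pages]

database = [make_page(b"row %d" % i) for i in range(4)]
database[2]["payload"] = b"garbage!"  # simulate damage on disk after the page was written

restored = restore(backup(database))
print([is_corrupt(page) for page in restored])  # the damaged page is still damaged
```

Nothing in the copy step knows what the damaged page used to contain, so the only way back to good data is a source that predates the damage, such as an older backup plus intact transaction log backups.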

Derrick Bovenkamp 4:14
Yeah, and the next one, and it’s the first thought of, you know, many system analysts and system administrators is, hey, if I reboot the servers, I’m gonna fix it.

Steve Stedman 4:24
Yep. And you know, that’s one of those things where, if it was a file server, I’d say yeah, kick it for a reboot and see if it works any better. But with SQL Server, that’s not the case. Once the file becomes corrupt, a reboot won’t help it. And actually, it may make things worse, like a whole lot worse. What I mean by that is that if you have corruption and your database is running, that’s a bad spot to be in. But if you have corruption, and your system is shut down and will not start up, that’s a much worse place to be in than corrupt and running.

Derrick Bovenkamp 4:52
Yeah. And we’ll cover that more in the next few slides. Yep. So what about if I ignore corruption: is it going to go away, is it going to

Steve Stedman 5:00
repair itself? That’s not going to happen. And technically I say it’s unlikely, because if you have a regular process that maybe truncates a table and then refills it with some kind of data warehousing or ETL type process, yeah, you might get lucky: it truncates the table, the corruption goes away, and then you wouldn’t have corruption. But other than that, without the act of truncating a table or clearing out a table completely, it’s extremely unlikely that ignoring it will make it go away. And, in fact, ignoring it may make things a whole lot worse, because without fixing it and without knowing what caused it, you may get more and more corruption over time. So yeah, be careful with what people are telling you on the internet, because a lot of those things may not be accurate on database corruption.

 

