Solution to Week 10 Part 1 Database Corruption Challenge

Download PDF

If you are looking for more info on the corrupt database with Week 10 of the Database Corruption Challenge, you can take a look at the week 10 Challenge Details.

With this being the final competition of the Database Corruption Challenge, I had to make it more difficult than the others, so I added 3 different types of corruption. This challenge was about the equivalent of any 3 of the other challenges.

I have split the solution to the Week 10 Challenge into 3 different posts, with each post explaining how to fix one of the three specific corrupt areas.

Corruption in the Orders Table

WARNING: DBCC WritePage is a dangerous command, that should never be used on any production database. It may invalidate your ability to get support from Microsoft on issues that arise with that database going forward. It is not my intention to encourage anyone to use DBCC WritePage ever. This is just what I used to create a corrupt database, and since creating corrupt databases is not part of the role of most DBAs, you should not use DBCC WritePage. Consider yourself warned.

The corruption in the orders table was unlike any of the other corruption challenges. This was something new. For details on the way that a table scan uses a linked list of pages see my blog post on SQL Server Tables – BTree or Linked List.

Lets start with how I created the orders table and made it corrupt.

CREATE TABLE [dbo].[Orders](
	[id] [int] IDENTITY(-2147483647,1) NOT NULL,
	[orderDate] [DATETIME] NOT NULL,
	[customerId] [int] NULL,
	[shippingType] varchar(100),
	[orderDetails] varchar(max),
	[totalPrice] DECIMAL(12, 2)
	CONSTRAINT [PK_Revenue] PRIMARY KEY CLUSTERED ([id] ASC, [orderDate])
)
ON [UserObjects];

CREATE NONCLUSTERED INDEX [ncOrders] ON [dbo].[Orders]
(
	[orderDate]  ASC,
	[totalPrice] ASC
);

I then added rows and deleted rows, until I ended up with a the orders table having 59019 rows in it.

My goal was to corrupt the table in a way that it would create an infinite loop when doing a table scan (or clustered index scan) on the orders table. Here is how I did that.

First I used


DBCC TRACEON(3604) with no_infomsgs;
DBCC IND(CorruptionChallenge10,'Orders', 1) with no_infomsgs;

First I used DBCC IND to get a list of the pages in the orders table (clustered index).

Next I used DBCC page to pick a page somewhere near the beginning of the table, so I chose page 2323. Using DBCC Page, I looked up the m_nextPage and m_prevPage pointers, and recorded them. I then modified the pointer to the next page to point to the previous page creating the infinite loop, or infinite query.


-- point a nextPage point to the prev page producing an infinite loop
DBCC Page(CorruptionChallenge10, 3, 2323, 2) with no_infomsgs;
--m_prevPage = (3:2322) 0x1209
--m_nextPage = (3:2324) 0x1409

-- overwrite the m_nextPage point to point to the previous page.
DBCC WritePage(CorruptionChallenge10, 3, 2323, 16, 2, 0x1209);
DBCC CheckTable(Orders);

-- infinite query
SELECT * FROM [dbo].[Orders];

TableAsALinkedListInfiniteLoop

At this point it turned out that the query wasn’t entirely infinite. After a while my SQL Server ran out of memory and the query stopped at about 33 million rows.

Here I realized that it would be too easy, all you needed to do at this point was just select in the opposite direction and you could get all the data out.

SELECT *
  FROM [dbo].[Orders]
 ORDER BY id DESC;

Being the final week of the challenge, I couldn’t leave it at just that, instead, I had to mess with one of the m_prevPage pointers near the end of the table to cause a similar loop, so I did the following:

-- point a prevPage point to the next page producing an infinite loop
DBCC Page(CorruptionChallenge10, 3, 3058, 2) with no_infomsgs;
--  m_prevPage = (3:3057)   0xF10B
--  m_nextPage = (3:3059)   0xF30B

--dbcc writepage ({'dbname' | dbid}, fileid, pageid, offset, length, data [, directORbufferpool])
DBCC WritePage(CorruptionChallenge10, 3, 3058, 8, 2, 0xF30B);
DBCC CheckTable(Orders);

--infinite query
SELECT *
  FROM [dbo].[Orders]
 ORDER BY id DESC;

Yes… at this point you were unable to SELECT * from the table with either ASC or DESC.

So to make it a little worse, I decided to mess with another page somewhere in the middle of the table, and overwrite the page header (similar to an early challenge). So I stomped on another page, this time not touching the prev page or next page pointer.


DBCC WritePage(CorruptionChallenge10, 3, 2603, 0, 2, 0x0909);
DBCC CheckTable(Orders);

I then backed up the database and we have a corrupt database with the orders table.

Fixing the Corruption – DBCC WritePage

So now to fix the corruption. Some people were able to track down the problem, and just used DBCC WritePage to the same addresses to fix the corruption. Those solutions looked something like this solution from Rob Farley:

dbcc traceon(3604)
dbcc page ('CorruptionChallenge10', 3, 2323,3) --with tableresults
select cast(2322 as binary(2)) --0x0912 so look for 1209
dbcc page ('CorruptionChallenge10', 3, 2323,2) --with tableresults
select cast(2323 as binary(2)) --0x0913 so look for 1309
dbcc page ('CorruptionChallenge10', 3, 2322,2) --with tableresults --end of first line
select cast(2324 as binary(2)) --0x0914 so WANT 1409

--01010400 00820001 12090000 03001d00 12090000 should be
--01010400 00820001 12090000 03001d00 14090000

--page 2323.

DBCC WRITEPAGE ('CorruptionChallenge10', 3, 2323, 16, 2, 0x1409)
--Good. That's done.

--Now similarly for 3058's previous page, where the first 3059 should be 3057

select cast(3059 as binary(2)) --0x0BF3 so look for F30B
dbcc page ('CorruptionChallenge10', 3, 3058,2) --with tableresults
select cast(3057 as binary(2)) --0x0BF1 so WANT F10B

DBCC WRITEPAGE ('CorruptionChallenge10', 3, 3058, 8, 2, 0xF10B)

--Good. That's done too.

--Now for page 2603, which shouldn't be type 9.
dbcc page ('CorruptionChallenge10', 3, 2603, 2) --with tableresults --79 rows, that's encouraging considering how many are missing
dbcc page ('CorruptionChallenge10', 3, 2602, 2) --with tableresults
--2602: 01010400 00820001 290a0000 03001d00 2b0a0000
--2603: 09090400 00820001 2a0a0000 03001d00 2c0a0000
DBCC WRITEPAGE ('CorruptionChallenge10', 3, 2603, 0, 2, 0x0101)

dbcc checktable ('Orders') --No more errors!

Many solved the challenge with a similar solution to this. There were some however who solved the corruption by reading each page, and reconstructing the table, specifically Patrick Flynn who had a solution very similar to his solution to the Week 7 Challenge.

Not everyone used DBCC WritePage to solve the challenge this week, however many did.

I hope this helps get a better understanding Fixing the Corruption in a bizarre corruption example in the Database Corruption Challenge.

Related Links

Database Corruption Challenge Sponsored by

Stedman Solutions for Help with Database Corruption and Performance Tuning

 

More from Stedman Solutions:

SteveStedman5
Steve and the team at Stedman Solutions are here for all your SQL Server needs.
Contact us today for your free 30 minute consultation..
We are ready to help!

Leave a Reply

Your email address will not be published. Required fields are marked *

*