Corruption Challenge Week 4 – The Winning Solution

Download PDF

Congratulations to Randolph West who won the corruption challenge this week with the following solution which restored all of the data.

First he restored the database to get started. Note some of his code and comments have been reformatted to better fit in the blog format.

Restore database. I use KEEP_CDC and KEEP_REPLICATION because of the hints you dropped in your blog.
The hint is that there is 100% chance of data recovery, so I will look for Change Data Capture tables.

USE master;
GO
--DROP DATABASE [CorruptionChallenge4] ;
RESTORE DATABASE [CorruptionChallenge4]
   FROM DISK = N'C:\DBBackups\CorruptionChallenge4_Corrupt.bak'
   WITH FILE = 1,
        MOVE N'CorruptionChallenge4'
		  TO N'C:\SQL_DATA\CorruptionChallenge4.mdf',
        MOVE N'UserObjects'
		  TO N'C:\SQL_DATA\CorruptionChallenge4_UserObjects.ndf',
        MOVE N'CorruptionChallenge4_log'
		  TO N'C:\SQL_DATA\CorruptionChallenge4_log.ldf',
        NOUNLOAD,
    REPLACE,
    STATS = 5,
    KEEP_CDC,
    KEEP_REPLICATION;

Next he ran CheckDB to see what is wrong with this database.

USE [CorruptionChallenge4];
-- Run DBCC CHECKDB to see what's wrong
DBCC CHECKDB
WITH ALL_ERRORMSGS,
	NO_INFOMSGS,
	DATA_PURITY

CheckDB returned 17,515 lines of errors.
Week4Winner1
Ouch, lots of errors. But there’s one object ID that keeps showing up: 2105058535

SELECT *
FROM sys.objects
WHERE [object_id] = 2105058535

Week4Winner2

dbo.Customers seems to be the problem. Let’s cheat a little and look for a CDC table using this name

SELECT *
FROM sys.tables

Week4Winner3
dbo_Customers_CT … 373576369 … 2015-05-01 08:19:32.570 … 2015-05-01 15:24:30.177
This looks promising. Let’s get the CDC output for net changes on this table, and dump them into a temp table.
We get the Start LSN from sys.fn_cdc_get_min_lsn(‘Customers’)

SELECT id, FirstName, MiddleName, LastName
INTO tempdb.dbo.fff
FROM [cdc].[fn_cdc_get_net_changes_dbo_Customers](0x0000009A000009700267, sys.fn_cdc_get_max_lsn(), 'all')

Which inserted 511740 rows into the tempdb.dbo.fff table.

Hmmm … how many records is that compared to the original corrupt table?

SELECT COUNT(*) FROM dbo.Customers
SELECT COUNT(*) FROM tempdb.dbo.fff

Week4Winner4
This looks very promising. Now let’s query the rows at ID 510900 and 510901


SELECT * FROM dbo.Customers -- oops, page 3:22 is corrupt

Week4Winner5

 


SELECT * FROM dbo.Customers WHERE id IN (510900, 510901) -- oops, page 3:2150 is corrupt too

Week4Winner6

During my playing around earlier, I discovered that the two NC indexes were affected by corruption in page 3:88 as well.

SELECT * FROM tempdb.dbo.fff WHERE id IN (510900, 510901)

Week4Winner7

This is a mess – I guess we’ll have to turf the entire Customers table
But first we should double-check that the Orders table is ok

DBCC CHECKTABLE ('[dbo].[Orders]') WITH ALL_ERRORMSGS, NO_INFOMSGS

[Orders] is good.

 

Set database in single user mode to enable repair

ALTER DATABASE [CorruptionChallenge4]
SET SINGLE_USER
WITH ROLLBACK IMMEDIATE;

The corrupt table is dbo.Customer
Run repair of dbo.Customer

DBCC CheckTable('dbo.Customers', REPAIR_ALLOW_DATA_LOSS)
WITH NO_INFOMSGS;

Week4Winner8


EXEC sp_changedbowner 'sa'
GO

-- This sometimes shows up, but should be dropped if CDC is off
DISABLE TRIGGER [tr_MScdc_ddl_event]
 ON DATABASE
GO

DISABLE TRIGGER [noDropTables]
 ON DATABASE
GO

DISABLE TRIGGER [noNewTables]
 ON DATABASE
GO

EXEC sys.sp_cdc_disable_db
GO

 

Based on the DBCC CheckTable removing all rows from the Customers table, this next section could be skipped.


ALTER TABLE [dbo].[Orders]

DROP CONSTRAINT [FK_Orders_People]
GO

ALTER TABLE [dbo].[Customers]

DROP CONSTRAINT [PK_Customers]
GO

TRUNCATE TABLE [dbo].[Customers]

ALTER TABLE [dbo].[Customers] ADD CONSTRAINT [PK_Customers] PRIMARY KEY CLUSTERED ([id] ASC) ON [UserObjects]
GO

Now to put everything back into the Customers table.


SET IDENTITY_INSERT [dbo].[Customers] ON

INSERT INTO [dbo].[Customers] (
 [id],
 [FirstName],
 [MiddleName],
 [LastName]
 )
SELECT [id],
 [FirstName],
 [MiddleName],
 [LastName]
FROM tempdb.dbo.fff

SET IDENTITY_INSERT [dbo].[Customers] OFF

Turn the triggers back on


ALTER TABLE [dbo].[Orders]
 WITH CHECK ADD CONSTRAINT [FK_Orders_People] FOREIGN KEY ([customerId]) REFERENCES [dbo].[Customers]([id])
GO

ALTER TABLE [dbo].[Orders] CHECK CONSTRAINT [FK_Orders_People]
GO

EXEC sys.sp_cdc_enable_db
GO

EXECUTE sys.sp_cdc_enable_table
 @source_schema = N'dbo',
 @source_name = N'Customers',
 @role_name = NULL;

EXECUTE sys.sp_cdc_enable_table
 @source_schema = N'dbo',
 @source_name = N'Orders',
 @role_name = NULL;

ENABLE TRIGGER [noDropTables]
 ON DATABASE
GO

ENABLE TRIGGER [noNewTables]
 ON DATABASE
GO

ALTER DATABASE [CorruptionChallenge4]
SET MULTI_USER

Not for the checks.


SELECT *
FROM [dbo].[Customers]

Week4Winner11

 


SELECT * FROM dbo.Customers WHERE id IN (510900, 510901)

Week4Winner10


SELECT COUNT(*)
FROM sys.objects
WHERE is_ms_shipped = 0;

And that wraps it up, that is how Randolph West won the Week 4 Database Corruption Challenge.

Related Links:

Tagged with: , , , , , , ,
6 comments on “Corruption Challenge Week 4 – The Winning Solution
  1. Shashikant Shakya says:

    Nice work!!
    I was not aware of these restore options.
    Thanks Steve for posting such corruption challenges.

  2. kumar says:

    How did you pick the Start LSN from sys.fn_cdc_get_min_lsn(‘Customers’) = 0x0000009A000009700267
    did you query cdc.dbo_Customers_CT table and pick the _$start_lsn from the first row.

    i have different record count:
    select capture_instance from cdc.change_tables
    select sys.fn_cdc_get_min_lsn(‘dbo_Customers’)

    my value is: 0x000000180000004E0040

    my count did not match.

    • SteveStedman SteveStedman says:

      Kumar,
      Here is an alternate way to to get the start and end lsn numbers, that was provided in a different solution, perhaps this will help you.

      DECLARE @begin_time datetime, @end_time datetime, @from_lsn binary(10), @to_lsn binary(10);
      SET @begin_time = (Select Min(tran_begin_time) from [cdc].[lsn_time_mapping]);
      
      -- Obtain the end of the time interval.
      SET @end_time = GETDATE();
      -- Map the time interval to a change data capture query range.
      SET @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', @begin_time);
      SET @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @end_time);
      

      -Steve Stedman

  3. kumar says:

    Thanks steve & Randolph.
    my issue is not the lsn i was running query against the wrong table. i was running against [cdc].[fn_cdc_get_all_changes_dbo_Customers] instead of running [cdc].[fn_cdc_get_net_changes_dbo_Customers].

    But any way it is still puzzling me how do we find out if CDC is enabled or replication is enabled from the back up.
    RESTORE HEADERONLY, RESTORE LABELONLY , RESTORE FILELISTONLY could not tell.
    is there a way to learn this from the backup copy you provided.

    The only difference i found restoring with keep_cdc, keep_replication is i was able to expand system tables and tabled valued functions. (i used 2008 R2).

    i tried your backup on sql 2014 and never able to complete dbcc checkdb

    • SteveStedman SteveStedman says:

      On SQL Server 2014, DBCC CheckDB appears to be doing more work. When I ran it on 2014, it eventually timed out and crashed my sql instance after around 25 minutes.

  4. Bill Hughes says:

    I just recently came across this challenge, and as an aspiring DBA I thought it would be good to run through some of these and their solutions. It is great to have something like this to learn from. Now, maybe it is just my inexperience or maybe I missed something somewhere. Where does the query for 510900 and 510901 come from?

Leave a Reply

Your email address will not be published. Required fields are marked *

*

Time limit is exhausted. Please reload CAPTCHA.