Corruption Challenge Week 4 – The Winning Solution
Congratulations to Randolph West who won the corruption challenge this week with the following solution which restored all of the data.
First he restored the database to get started. Note some of his code and comments have been reformatted to better fit in the blog format.
Restore database. I use KEEP_CDC and KEEP_REPLICATION because of the hints you dropped in your blog.
The hint is that there is 100% chance of data recovery, so I will look for Change Data Capture tables.
USE master; GO --DROP DATABASE [CorruptionChallenge4] ; RESTORE DATABASE [CorruptionChallenge4] FROM DISK = N'C:\DBBackups\CorruptionChallenge4_Corrupt.bak' WITH FILE = 1, MOVE N'CorruptionChallenge4' TO N'C:\SQL_DATA\CorruptionChallenge4.mdf', MOVE N'UserObjects' TO N'C:\SQL_DATA\CorruptionChallenge4_UserObjects.ndf', MOVE N'CorruptionChallenge4_log' TO N'C:\SQL_DATA\CorruptionChallenge4_log.ldf', NOUNLOAD, REPLACE, STATS = 5, KEEP_CDC, KEEP_REPLICATION;
Next he ran CheckDB to see what is wrong with this database.
USE [CorruptionChallenge4]; -- Run DBCC CHECKDB to see what's wrong DBCC CHECKDB WITH ALL_ERRORMSGS, NO_INFOMSGS, DATA_PURITY
CheckDB returned 17,515 lines of errors.
Ouch, lots of errors. But there’s one object ID that keeps showing up: 2105058535
SELECT * FROM sys.objects WHERE [object_id] = 2105058535
dbo.Customers seems to be the problem. Let’s cheat a little and look for a CDC table using this name
SELECT * FROM sys.tables
dbo_Customers_CT … 373576369 … 2015-05-01 08:19:32.570 … 2015-05-01 15:24:30.177
This looks promising. Let’s get the CDC output for net changes on this table, and dump them into a temp table.
We get the Start LSN from sys.fn_cdc_get_min_lsn(‘Customers’)
SELECT id, FirstName, MiddleName, LastName INTO tempdb.dbo.fff FROM [cdc].[fn_cdc_get_net_changes_dbo_Customers](0x0000009A000009700267, sys.fn_cdc_get_max_lsn(), 'all')
Which inserted 511740 rows into the tempdb.dbo.fff table.
Hmmm … how many records is that compared to the original corrupt table?
SELECT COUNT(*) FROM dbo.Customers SELECT COUNT(*) FROM tempdb.dbo.fff
This looks very promising. Now let’s query the rows at ID 510900 and 510901
SELECT * FROM dbo.Customers -- oops, page 3:22 is corrupt
SELECT * FROM dbo.Customers WHERE id IN (510900, 510901) -- oops, page 3:2150 is corrupt too
During my playing around earlier, I discovered that the two NC indexes were affected by corruption in page 3:88 as well.
SELECT * FROM tempdb.dbo.fff WHERE id IN (510900, 510901)
This is a mess – I guess we’ll have to turf the entire Customers table
But first we should double-check that the Orders table is ok
DBCC CHECKTABLE ('[dbo].[Orders]') WITH ALL_ERRORMSGS, NO_INFOMSGS
[Orders] is good.
Set database in single user mode to enable repair
ALTER DATABASE [CorruptionChallenge4] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
The corrupt table is dbo.Customer
Run repair of dbo.Customer
DBCC CheckTable('dbo.Customers', REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS;
EXEC sp_changedbowner 'sa' GO -- This sometimes shows up, but should be dropped if CDC is off DISABLE TRIGGER [tr_MScdc_ddl_event] ON DATABASE GO DISABLE TRIGGER [noDropTables] ON DATABASE GO DISABLE TRIGGER [noNewTables] ON DATABASE GO EXEC sys.sp_cdc_disable_db GO
Based on the DBCC CheckTable removing all rows from the Customers table, this next section could be skipped.
ALTER TABLE [dbo].[Orders] DROP CONSTRAINT [FK_Orders_People] GO ALTER TABLE [dbo].[Customers] DROP CONSTRAINT [PK_Customers] GO TRUNCATE TABLE [dbo].[Customers] ALTER TABLE [dbo].[Customers] ADD CONSTRAINT [PK_Customers] PRIMARY KEY CLUSTERED ([id] ASC) ON [UserObjects] GO
Now to put everything back into the Customers table.
SET IDENTITY_INSERT [dbo].[Customers] ON INSERT INTO [dbo].[Customers] ( [id], [FirstName], [MiddleName], [LastName] ) SELECT [id], [FirstName], [MiddleName], [LastName] FROM tempdb.dbo.fff SET IDENTITY_INSERT [dbo].[Customers] OFF
Turn the triggers back on
ALTER TABLE [dbo].[Orders] WITH CHECK ADD CONSTRAINT [FK_Orders_People] FOREIGN KEY ([customerId]) REFERENCES [dbo].[Customers]([id]) GO ALTER TABLE [dbo].[Orders] CHECK CONSTRAINT [FK_Orders_People] GO EXEC sys.sp_cdc_enable_db GO EXECUTE sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'Customers', @role_name = NULL; EXECUTE sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'Orders', @role_name = NULL; ENABLE TRIGGER [noDropTables] ON DATABASE GO ENABLE TRIGGER [noNewTables] ON DATABASE GO ALTER DATABASE [CorruptionChallenge4] SET MULTI_USER
Not for the checks.
SELECT * FROM [dbo].[Customers]
SELECT * FROM dbo.Customers WHERE id IN (510900, 510901)
SELECT COUNT(*) FROM sys.objects WHERE is_ms_shipped = 0;
And that wraps it up, that is how Randolph West won the Week 4 Database Corruption Challenge.
Related Links:
- Week 4 Building the Corrupt Database
- Week 4 Results
- Week 4 Database Corruption Challenge
- Database Corruption Challenge Scores
- Week 4 Extra Clue
- Is Week 4 Too Diabolical?
More from Stedman Solutions:
Steve and the team at Stedman Solutions are here for all your SQL Server needs.
Contact us today for your free 30 minute consultation..
We are ready to help!
Nice work!!
I was not aware of these restore options.
Thanks Steve for posting such corruption challenges.
How did you pick the Start LSN from sys.fn_cdc_get_min_lsn(‘Customers’) = 0x0000009A000009700267
did you query cdc.dbo_Customers_CT table and pick the _$start_lsn from the first row.
i have different record count:
select capture_instance from cdc.change_tables
select sys.fn_cdc_get_min_lsn(‘dbo_Customers’)
my value is: 0x000000180000004E0040
my count did not match.
Kumar,
Here is an alternate way to to get the start and end lsn numbers, that was provided in a different solution, perhaps this will help you.
-Steve Stedman
Thanks steve & Randolph.
my issue is not the lsn i was running query against the wrong table. i was running against [cdc].[fn_cdc_get_all_changes_dbo_Customers] instead of running [cdc].[fn_cdc_get_net_changes_dbo_Customers].
But any way it is still puzzling me how do we find out if CDC is enabled or replication is enabled from the back up.
RESTORE HEADERONLY, RESTORE LABELONLY , RESTORE FILELISTONLY could not tell.
is there a way to learn this from the backup copy you provided.
The only difference i found restoring with keep_cdc, keep_replication is i was able to expand system tables and tabled valued functions. (i used 2008 R2).
i tried your backup on sql 2014 and never able to complete dbcc checkdb
On SQL Server 2014, DBCC CheckDB appears to be doing more work. When I ran it on 2014, it eventually timed out and crashed my sql instance after around 25 minutes.
I just recently came across this challenge, and as an aspiring DBA I thought it would be good to run through some of these and their solutions. It is great to have something like this to learn from. Now, maybe it is just my inexperience or maybe I missed something somewhere. Where does the query for 510900 and 510901 come from?