Week 7 Database Corruption Challenge Results

Download PDF

The seventh week of ten in the Database Corruption Challenge went well. There were a few new participants, and many of those who have competed week over week. The first to submit a correct answer this week was Raul Gonzalez.

What was interesting about this challenge was that it was not your typical corruption where you could just run DBCC CheckDB and find the problem. To understand exactly what happened, solving this required reading through the transaction log that had not yet been backed up and finding the specific problem.

The original post describing the challenge stated the following:

Your goal if you choose to accept it is to get restore all the data in the [OPEN_NFIRS].[Record1000] table prior to its disappearance. Note: there may have been multiple DBA’s working on this system in several different time zones across the world.

Which contained some valid information, and some distracting information. First the distracting part, the fact that the DBAs working on the solution were from multiple time zones across the world. This had nothing to do with anything, and some people asked about it. That statement were purely there to distract. The key items to take away from this comment were:

  • Multiple DBAs (not necessarily at the same location).
  • The table contents just disappeared.

There was another key clue in the original post…

Not all missing data is caused by hardware failures or issues, sometimes it is caused by people. – Unknown DBA

Where I was going with this was that the table appeared to be “corrupt”, but after initial investigation it could be discovered that someone who was logged in as the user ‘sa’ ran a delete command that possibly forgot a where clause, and dumped the entire [OPEN_NFIRS].[Record1000] table. There was an extra point offered for those who figured that part out.

So once you figured that part out all you had to do was the following:

  • Take a tail of the log backup from the 3 database files included in the challenge.
  • Figure out the specific time or LSN of the delete statement from the log.
  • Restore the provided full backup.
  • Restore the provided log backups.
  • Restore the tail of the log backup just up to the point in time before the DELETE statement was run.

The following solution, all the code from Raul Gonzales, and the commentary and screen shots are from me. I changed a couple of paths to match my configuration when testing this:

-- try to attach the database files
CREATE DATABASE [CorruptionChallenge7] ON
	( FILENAME='C:\SQL_DATA\CorruptionChallenge7.mdf' )
	, ( FILENAME='C:\SQL_DATA\CorruptionChallenge7_log.ldf' )
	, ( FILENAME='C:\SQL_DATA\UserObjects.ndf' ) FOR ATTACH
GO

The database attaches just fine, and upgrades the files to the current version of SQL Server.
Week7_Corruption_Step1

Next take a look to see what the recovery model is on the database. This will determine if the tail of the log backup may be available.

SELECT recovery_model_desc, name FROM sys.databases
WHERE name = 'CorruptionChallenge7'
--recovery_model_desc	name
--FULL	CorruptionChallenge7

I like how Raul included his results as comments in the code. This allows me to confirm that I see the same thing that he did going through the process.
Week7_Corruption_Step2

So yes, the database CorruptionChallenge7 is in FULL recovery model. Lets take a look at the transaction log.

USE [CorruptionChallenge7]
GO

SELECT COUNT(*)
    FROM fn_dblog (NULL, NULL)
-- loads of things there

Yes, loads of things in the transaction log, 56591 to be exact.

Week7_Corruption_Step3

So now its time to create a tail of the log backup.

-- Backup tail of the log
BACKUP LOG [CorruptionChallenge7] TO DISK = 'C:\DBBackups\CorruptionChallenge7\CorruptionChallenge7_tail_log.trn'
WITH NO_TRUNCATE

Week7_Corruption_Step4At this point we at least know that using backups we can restore to the exact point in time that we are at now. We also know that we can restore to any point along the way if we know where to stop.

So next we run the final check query that needs to be submitted with the solution to see what the table looks like, and we select everything out of the [Record1000] table.

USE [CorruptionChallenge7]
GO
SELECT COUNT([Record1000Id]) as check1
      ,CHECKSUM_AGG(CHECKSUM([FireDeptID])) as check2
      ,CHECKSUM_AGG(CHECKSUM([FireDeptState])) as check3
      ,CHECKSUM_AGG(CHECKSUM([AlarmDate])) as check4
      ,CHECKSUM_AGG(CHECKSUM([IncidentNumber])) as check5
      ,CHECKSUM_AGG(CHECKSUM([ExposureNumberZeroBased])) as check6
      ,CHECKSUM_AGG(CHECKSUM([RecordType])) as check7
      ,CHECKSUM_AGG(CHECKSUM([TransactionType])) as check8
      ,CHECKSUM_AGG(CHECKSUM([FireDepartmentStation])) as check9
FROM [OPEN_NFIRS].[Record1000];

SELECT * FROM [OPEN_NFIRS].[Record1000]
-- Empty

Which shows there are no rows in the table.

Week7_Corruption_Step5

So where did the rows go?

Restore all the transaction logs without the tail of the log backup.


-- Restore From backup to the latest
USE master

ALTER DATABASE CorruptionChallenge7 SET SINGLE_USER WITH ROLLBACK IMMEDIATE

RESTORE DATABASE [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\CorruptionChallenge7_1.bak'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_0.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_1.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_2.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_3.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_4.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_5.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE DATABASE [CorruptionChallenge7] WITH RECOVERY

Week7_Corruption_Step6

Now lets take a look at how things look.

DBCC CHECKDB('CorruptionChallenge7') WITH NO_INFOMSGS
-- No corruption this week..

Week7_Corruption_Step7
No corruption remaining, so lets look at the data.


USE [CorruptionChallenge7]
GO
SELECT COUNT([Record1000Id]) as check1
      ,CHECKSUM_AGG(CHECKSUM([FireDeptID])) as check2
      ,CHECKSUM_AGG(CHECKSUM([FireDeptState])) as check3
      ,CHECKSUM_AGG(CHECKSUM([AlarmDate])) as check4
      ,CHECKSUM_AGG(CHECKSUM([IncidentNumber])) as check5
      ,CHECKSUM_AGG(CHECKSUM([ExposureNumberZeroBased])) as check6
      ,CHECKSUM_AGG(CHECKSUM([RecordType])) as check7
      ,CHECKSUM_AGG(CHECKSUM([TransactionType])) as check8
      ,CHECKSUM_AGG(CHECKSUM([FireDepartmentStation])) as check9
FROM [OPEN_NFIRS].[Record1000];
-- Check done to the latest transaction log backup
-- check1	check2	check3	check4	check5	check6	check7	check8	check9
-- 1010		835		0		131902	2099290	0		0		0		67

Week7_Corruption_Step8

Showing 1010 rows, and some checksums. But it can be that simple…

-- let's see what the tail log looks like
DECLARE @LogFile varchar(max);

SET @LogFile = 'C:\DBBackups\Challenge7Tail.bak'

SELECT *
INTO #dump_dblog
FROM fn_dump_dblog( DEFAULT, DEFAULT,DEFAULT, DEFAULT, @LogFile, DEFAULT,DEFAULT,
                      DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
                      DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
                      DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
                      DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
                      DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
                      DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
                      DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
                      DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
                      DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT )

SELECT * FROM #dump_dblog
WHERE [transaction Name] IS NOT NULL
-- Interested in this one
--Beginlog Status	Xact Type	Begin Time		Transaction Name	Transaction SID
--0x01		1		2015/05/31	17:57:56:287	DELETE				0x01

Using the fn_dup_dblog function to scan the tail of the log transaction log backup to find the interesting DELETE statement, which occurred at 17:57:56:278.

Week7_Corruption_Step9

 

So lets look at what happened after that DELETE statement.

SELECT * FROM #dump_dblog
WHERE [Begin Time] >= '2015/05/31 17:57:56:287'
-- Seems like the last transaction on that table..

 

Week7_Corruption_Step10

Scrolling through the list after the DELETE, no other commands, just SQL Server doing its thing to cleanup after the DELETE statement. So now we know everything that is needed to restore to the exact point in time (or just before it). Run the same list of transaction log backups we did before, but add in the tail of the log backup and tell it to stop at the DELETE statement.

-- Restore the database and STOPAT the time of the last
USE master

ALTER DATABASE CorruptionChallenge7 SET SINGLE_USER WITH ROLLBACK IMMEDIATE

RESTORE DATABASE [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\CorruptionChallenge7_1.bak'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_0.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_1.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_2.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_3.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_4.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_5.trn'
WITH NORECOVERY, REPLACE, STATS=10
, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

--==================================== IMPORTANT BIT ==================================
RESTORE LOG [CorruptionChallenge7]
FROM DISK='C:\DBBackups\Challenge7Tail.bak'
WITH NORECOVERY, REPLACE, STATS=10

, STOPAT='2015/05/31 17:57:56:287' -- date time of the what I believe is the last transaction on the table

, MOVE 'CorruptionChallenge7' TO 'C:\SQL_DATA\CorruptionChallenge7.mdf'
, MOVE 'UserObjects' TO 'C:\SQL_DATA\UserObjects.ndf'
, MOVE 'CorruptionChallenge7_log' TO 'C:\SQL_DATA\CorruptionChallenge7_log.ldf'

RESTORE DATABASE [CorruptionChallenge7] WITH RECOVERY

The tail of the log backup is restored to just before the DELETE statement.

Week7_Corruption_Step11

 

So now run the final checksum query that needed to be submitted as proof of completion.

USE [CorruptionChallenge7]
GO
SELECT COUNT([Record1000Id]) as check1
      ,CHECKSUM_AGG(CHECKSUM([FireDeptID])) as check2
      ,CHECKSUM_AGG(CHECKSUM([FireDeptState])) as check3
      ,CHECKSUM_AGG(CHECKSUM([AlarmDate])) as check4
      ,CHECKSUM_AGG(CHECKSUM([IncidentNumber])) as check5
      ,CHECKSUM_AGG(CHECKSUM([ExposureNumberZeroBased])) as check6
      ,CHECKSUM_AGG(CHECKSUM([RecordType])) as check7
      ,CHECKSUM_AGG(CHECKSUM([TransactionType])) as check8
      ,CHECKSUM_AGG(CHECKSUM([FireDepartmentStation])) as check9
FROM [OPEN_NFIRS].[Record1000];
-- Current
--check1	check2	check3	check4	check5		check6	check7	check8	check9
--3010		2346	0		1643050	26216226	0		0		0		246

-- Previous
-- check1	check2	check3	check4	check5	check6	check7	check8	check9
-- 1010		835		0		131902	2099290	0		0		0		67

-- Seems like we've got much more rows..

Week7_Corruption_Step12

And that’s how Database Corruption Challenge #7 was won by Raul Gonzalez.

Not all “corruption” is what you traditionally think of as corruption.

Raul received 4 points this week:

  • One point for being first to complete the challenge correction.
  • Two points for solving it before any additional clues where given.
  • One point for providing that the DELETE was performed by the user ‘sa’.

 

Almost all of the winning solutions this week were very similar to this, however some used the LSN on the final restore, rather than the timestamp. Those solutions looked something like this:


RESTORE DATABASE [CorruptionChallenge7]
   FROM DISK = N'C:\DBBackups\CorruptionChallenge7\CorruptionChallenge7_1.bak'
   WITH FILE = 1,
        MOVE N'CorruptionChallenge7' TO N'C:\SQL_DATA\CorruptionChallenge7.mdf',
        MOVE N'UserObjects' TO N'C:\SQL_DATA\CorruptionChallenge7UserObjects.ndf',
        MOVE N'CorruptionChallenge7_log' TO N'C:\SQL_DATA\CorruptionChallenge7_log.ldf',
        NORECOVERY, REPLACE, STATS=10;

-- apply the log files that we have
RESTORE LOG [CorruptionChallenge7]
	FROM DISK = 'C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_0.trn' WITH NORECOVERY;
RESTORE LOG [CorruptionChallenge7]
	FROM DISK = 'C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_1.trn' WITH NORECOVERY;
RESTORE LOG [CorruptionChallenge7]
	FROM DISK = 'C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_2.trn' WITH NORECOVERY;
RESTORE LOG [CorruptionChallenge7]
	FROM DISK = 'C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_3.trn' WITH NORECOVERY;
RESTORE LOG [CorruptionChallenge7]
	FROM DISK = 'C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_4.trn' WITH NORECOVERY;
RESTORE LOG [CorruptionChallenge7]
	FROM DISK = 'C:\DBBackups\CorruptionChallenge7\TransLog_CorruptionChallenge7_5.trn' WITH NORECOVERY;
RESTORE LOG [CorruptionChallenge7]
	FROM DISK = 'C:\DBBackups\Challenge7Tail.bak' WITH NORECOVERY,
STOPATMARK = 'lsn:52000000032800002';

RESTORE DATABASE [CorruptionChallenge7] WITH RECOVERY;

The solutions using the LSN required converting it from hex to decimal, which isn’t exactly obvious. I created a function to do the LSN conversion, and included it in the extra clue this week. Another participant Dan Andrei Stefan provided an alternate function to do the same thing. His function was the inspiration to create my own function to do the conversion. Here is my function.

CREATE FUNCTION [dbo].[dbhCrackLSN](@currentLSN AS VARCHAR(25))
RETURNS VARCHAR(25)
AS
BEGIN
	DECLARE @returnLSN as [numeric](25);

	;WITH lsnCte
	AS
	(
		SELECT CONVERT(INT, CONVERT(VARBINARY, '0x' + LEFT(@currentLSN, 8), 1)) AS Part1,
		CONVERT(INT, CONVERT(VARBINARY, '0x' + SUBSTRING(@currentLSN, 10, 8) , 1)) AS Part2,
		CONVERT(INT, CONVERT(VARBINARY, '0x' + RIGHT(@currentLSN, 4) , 1)) AS Part3
	)
	SELECT @returnLSN = CAST(Part1 as VARCHAR(10)) + RIGHT('0000000000' + CAST(Part2 as VARCHAR(10)), 10) + RIGHT('00000' + CAST(Part3 as VARCHAR(5)), 5)
	  FROM lsnCte;
	RETURN @returnLSN;
END
GO

There was one solution provided by Patrick Flynn which restored all of the data without using any of the backups. He only used the 3 database files (MDF, LDF, NDF). I will post this solution tomorrow, so check back to see how he did it.

So that wraps up Database Corruption Challenge #7. I hope everyone enjoyed this weeks competition.

 

-Steve Stedman

 

Related Links

Database Corruption Challenge Sponsored by
StedmanSolutionsRemoteSkilledDBA

Posted in Corruption Tagged with: , , , , , ,
2 comments on “Week 7 Database Corruption Challenge Results
  1. Bogdan says:

    Conversion of LSN values from hexa to decimal can be avoided using 0x prefix: … STOPBEFOREMARK = ‘lsn:0x00000025:00000270:0001’

  2. Jon Gurgul says:

    “One point for providing that the DELETE was performed by the user ‘sa’.”

    The sa account maps to the Transaction SID column value of 0x01. But that does not mean that it was necessarily directly the sa account.

    If for example someone wanted to cover their tracks they could simply use:

    EXECUTE AS LOGIN = ‘sa’

    …stuff…

    REVERT

Leave a Reply

Your email address will not be published. Required fields are marked *

*

Time limit is exhausted. Please reload CAPTCHA.