DBCC ShrinkDatabase

Being day 24 of the DBCC Command month at SteveStedman.com, today’s featured DBCC Command is DBCC SHRINKDATABASE.

When I first heard about DBCC Shrink Database (many years ago), I immediately thought “what a great feature, run this every night when the system load is low, and the database will be smaller, perform better, and all around be in better shape”. Boy was I wrong.

To ‘DBCC SHRINKDATABASE’ or Not To ‘DBCC SHRINKDATABASE': What’s the question

If you read Microsoft Books Online, there is a lot of great info on all the benefits of shrinking your database, and hidden down in the best practices section one line about how “A shrink operation does not preserve the fragmentation state of indexes in the database”.

So why not just shrink the database every day like I had attempted so many years ago. The problem is index fragmentation, which is a pretty common problem on many SQL Servers. Index fragmentation is such a performance issue that the other obsolete DBCC commands DBCC IndexDefrag and DBCC DBReIndex were created, and later replaced with ALTER INDEX options for rebuilding and reorganizing

Is it a good idea to run DBCC SHRINKDATABASE regularly?

Download the sample file ShrinkSample.

This article and samples apply to SQL Server 2005, 2008,  2008R2, and SQL 2012.

This really depends on a number of factors, but generally the answer is NO, it is not a good idea to run DBCC SHRINKDATABASE regularly.

For the purpose of this article, I am going to assume a couple of things:

  1. You are concerned about database performance.
  2. Over time your database is growing (which is probably why are are concerned about performance).
  3. You want to do your best to improve the overall health of the database, not just fixing one thing.

Most DBAs who are not familiar with the issues around index fragmenation just set up maintenance plans, and see SHRINKDATABASE as a nice maintenance plan to add.  It must be good since it is going to make the database take up less space than it does now.  This is the problem, although SHRINKDATABASE may give you a small file, the amount of index fragmentation is massive.

I have seen maintenance plans that first reorganize or rebuild all of the indexes, then call DBCC SHRINKDATABASE.  This should be translated as the first reorganize all of the indexes, then they scramble them again.

Here is an example showing some new tables, with a clustered index on the largest, that are then fragmented, then REORGANIZED, then SHRINKDATABASE.  You might find the results interesting.

To start with, I am going to create a new database, with two simple tables. One table uses a CHAR column and the other users VARCHAR. The reason for the CHAR column is to just take up extra space for the purpose of the demonstration. Each table will be filled with 10,000 rows holding text that is randomly generated with the NEWID() function and then cast to be a VARCHAR. For the purpose of demonstrating, that appeared to be a good way to fill up the table with some characters.

USE MASTER;
GO

IF EXISTS(SELECT * FROM Sys.sysdatabases WHERE [name] = 'IndexTest')
	DROP DATABASE [IndexTest];
GO

CREATE DATABASE [IndexTest];
GO

USE [IndexTest];
CREATE TABLE [Table1] (id INT IDENTITY,  name CHAR (6000));
SET nocount ON;
GO

INSERT INTO [Table1] (name) SELECT CAST(Newid() AS VARCHAR(100));
GO 10000

CREATE TABLE [Table2] (id INT IDENTITY,  name VARCHAR(6000));
CREATE CLUSTERED INDEX [Table2Cluster] ON [Table2] ([id] ASC);
GO

INSERT INTO [Table2] (name) SELECT CAST(Newid() AS VARCHAR(100));
GO 10000

Now that we have some tables, lets take a look at the size of the database and the fragmentation on Table2. We will run thee following two queries before after each of the commands following this.

DBCC showcontig('[Table2]');

SELECT CAST(CONVERT(DECIMAL(12,2),
            Round(t1.size/128.000,2)) AS VARCHAR(10)) AS [FILESIZEINMB] ,
       CAST(CONVERT(DECIMAL(12,2),
	        Round(Fileproperty(t1.name,'SpaceUsed')/128.000,2)) AS VARCHAR(10)) AS [SPACEUSEDINMB],
       CAST(CONVERT(DECIMAL(12,2),
	        Round((t1.size-Fileproperty(t1.name,'SpaceUsed'))/128.000,2)) AS VARCHAR(10)) AS [FREESPACEINMB],
       CAST(t1.name AS VARCHAR(16)) AS [name]
FROM dbo.sysfiles t1;

The results of the two checks are shown below. You can see that the “Logical scan fragmentation” is 2.9% which is very good. You can also see that the data file is taking 80.0mb of disk space. Remember these numbers as they will be changing later.

Next we drop Table1 which will free up space at the beginning of the datafile. This is done to force Table2 to be moved when we run DBCC SHRINKDATABASE later.

DROP TABLE [Table1];

The checks after dropping the table show that there is no change to the Table2 fragmentation, but free space in the datafile is now 78.38mb.

Next we shrink the database, then run the same 2 queries to check the size and the fragmentation.

DBCC shrinkdatabase ('IndexTest', 5);

The results show good news and bad news. The good news is that the filesize has been reduced from 80mb to just 1.88mb. The bad news shows that fragmentation is now 98.55%, which indicates that the index is not going to perform as optimal as it should. You can see the shrinkdatabase has succeeded just as expected, and if you didn’t know where to look, you wouldn’t know that the clustered index on Table2 is now very fragmented.

Imagine running DBCC SHRINKDATABASE every night on a large database with hundreds or thousands of tables. The effect would be that very quickly every table with a clustered index would end up at close to 100% fragmented. These heavily fragmented indexes will slow down queries and seriously impact performance.

To fix this fragmentation, you must REORGANIZE or REBUILD the index.
The standard recommendation is to REORGANIZE if the fragmentation is between 5% and 30%, and to REBUILD if it is more than 30% fragmented. This is a good recommendation if you are running on SQL Server Enterprise Edition with the ability to REBUILD indexes online, but with standard edition this is not available so the REORGANIZE will do the job.

ALTER INDEX table2cluster ON [IndexTest].[dbo].[Table2] reorganize;

Once we run this our check script shows that after the REORGANIZE the fragmentation has been reduced to 10.14%, which is a big improvement over the 98.55% it was at earlier.

Next we try the REBUILD.

ALTER INDEX table2cluster ON [IndexTest].[dbo].[Table2] rebuild;

Which reduces the fragmenation to 4.17%, but it increases the filesize to 34.88mb. This effectively is undoing a big part of the original DBCC SHRINKDATABASE.

Notes

You can REBUILD or REORGANIZE all of your indexes on the system at one time, but this is not recommended. The REBUILD or REORGANIZE of all of the indexes will impact performance while it is running, and it may cause excessive transaction logs to be generated.

After doing a REORGANIZE of an index, it is suggested that statistics be updated immediately after the REORGANIZE.

Summary

It is my opinion that DBCC SHRINKDATABASE should never be run on a production system that is growing over time. It may be necessary to shrink the database if a huge amount of data has been removed from the database, but there are other options besides shink in this case. After any DBCC SHRINKDATABASE, if you chose to use it, you will need to REBUILD or REORGANIZE all of your indexes.

Even if you never use DBCC SHRINKDATABASE your indexes will end up getting fragmented over time. My suggestion is to create a custom Maintenance Plan which finds the most fragmented indexes and REBUILD or REORGANIZE them over time. You could for instance create a stored procedure that finds and REORGANIZES the 4 or 5 indexes that are the most fragmented. This could be run a couple times per night during a slow time allowing your system to automatically find and fix any indexes that are too fragmented.

Related Posts:

Blog:  Index Fragmentation

Blog: Index Fragmentation and SHRINKDATABASE

Notes:

For more information see TSQL Wiki DBCC shrinkdatabase.

DBCC Command month at SteveStedman.com is almost as much fun as realizing how fragmented your indexes are after running DBCC SHRINKDATABASE.

CTE Hierarchy compared to the alternative

After my CTE presentation at SQL Saturday 108 in Redmond, I was asked many questions, and received several great suggestions from people.  Based on that feedback, I am updating my presentation for SQL Saturday 114 in Vancouver to include some additional content.

One question was how does the performance compare between a recursive CTE to generate a hierarchical tree path listing and a query using self JOINs and UNION ALL to generate similar results.  To test this I created a simple table for departments in an online store that contained the following rows.

The I wrote two queries to generate a tree path for all departments in the list.

First the CTE Query:



-- Performance Differences 
-- Remember Ctrl+M to turn on Execution Plan 
-- Recursive CTE compared to Multiple Self Joins 
;WITH departmentcte(deptid, department, parent, LEVEL, treepath) AS
( SELECT id AS deptid, department, parent, 0 AS LEVEL,
CAST(department AS VARCHAR(1024)) AS treepath
FROM departments
WHERE parent IS NULL
UNION ALL -- and now for the recursive part  
SELECT d.id AS deptid, d.department, d.parent,
departmentcte.LEVEL + 1 AS LEVEL,
CAST(departmentcte.treepath + ' -> ' +
CAST(d.department AS VARCHAR(1024))
AS VARCHAR(1024)) AS treepath
FROM departments d
INNER JOIN departmentcte
ON departmentcte.deptid = d.parent
)
SELECT deptid, treepath
FROM departmentcte
ORDER BY treepath;

 

 

Then the Non CTE Query (ugly):



-- Multiple Self Joins Unioned 
-- Difficult to display Parent categories 
SELECT d.id AS deptid, d.department AS treepath
FROM departments d
WHERE d.parent IS NULL
UNION ALL
SELECT da2.id AS deptid,
da1.department + ' -> ' +
da2.department AS treepath
FROM departments da1
INNER JOIN departments da2 ON da1.id = da2.parent
WHERE da1.parent IS NULL
UNION ALL
SELECT db3.id AS deptid,
db1.department + ' -> ' +
db2.department + ' -> ' +
db3.department AS treepath
FROM departments db1
INNER JOIN departments db2 ON db1.id = db2.parent
INNER JOIN departments db3 ON db2.id = db3.parent
WHERE db1.parent IS NULL
UNION ALL
SELECT dc3.id AS deptid,
dc1.department + ' -> ' +
dc2.department + ' -> ' +
dc3.department + ' -> ' +
dc4.department AS treepath
FROM departments dc1
INNER JOIN departments dc2 ON dc1.id = dc2.parent
INNER JOIN departments dc3 ON dc2.id = dc3.parent
INNER JOIN departments dc4 ON dc3.id = dc4.parent
WHERE dc1.parent IS NULL
UNION ALL
SELECT dd3.id AS deptid,
dd1.department + ' -> ' +
dd2.department + ' -> ' +
dd3.department + ' -> ' +
dd4.department + ' -> ' + dd5.department AS treepath
FROM departments dd1
INNER JOIN departments dd2 ON dd1.id = dd2.parent
INNER JOIN departments dd3 ON dd2.id = dd3.parent
INNER JOIN departments dd4 ON dd3.id = dd4.parent
INNER JOIN departments dd5 ON dd4.id = dd5.parent
WHERE dd1.parent IS NULL

ORDER BY treepath
;

 

The two queries were confirmed to produce the same results:

But when looking at the actual execution plan for the queries, there is a very big difference.  The self JOIN, and UNION ALL query took 13 times as long to run as the CTE query did.

Results:

The CTE compared to the self JOIN with UNION ALL is a double win, first the CTE is much easier to read, second the CTE is much faster with 1/13th of the time it took to execute the self JOIN, UNION ALL solution. Also in the examples shown we showed the queries were used to show up to 5 levels of categories. If we wanted to go further than 5 levels, the difference gets even worse between the CTE and the self JOIN, UNION ALL solution.

Related Links:

Common Table Expressions Book

If you enjoyed this posting, and want to learn more about common table expressions, please take a look at my book on CTE’s at Amazon.com. The book is titled Common Table Expressions – Joes 2 Pros® – A CTE Tutorial on Performance, Stored Procedures, Recursion, Nesting and the use of Multiple CTEs.

Determining free disk space with TSQL

Working on a new report for the SQL Server Health reports, I needed to display the amount of free disk space on a SQL Server.

EXEC MASTER..Xp_fixeddrives;

Which was useful if I just wanted to look, but I needed to use the results in a query, and I didn’t want to put the results into a temp table, so here is how I decided to do it, using table variables.

DECLARE @disk_space TABLE(drive CHAR(1) NOT NULL, free INTEGER NOT NULL)
INSERT INTO @disk_space
EXEC MASTER..Xp_fixeddrives;

SELECT drive,
CAST(free* 10 / 1024 AS FLOAT) / 10 AS gbfree
FROM @disk_space
ORDER BY gbfree ASC;

With this, I am now about to query the @disk_space table variable and join it with other tables in the database.

Index Fragmentation and SHRINKDATABASE

Is it a good idea to run DBCC SHRINKDATABASE regularly?

Download the sample file ShrinkSample.

This article and samples apply to SQL Server 2005, 2008, and 2008R2.

This really depends on a number of factors, but generally the answer is NO, it is not a good idea to run DBCC SHRINKDATABASE regularly.

For the purpose of this article, I am going to assume a couple of things:

  1. You are concerned about database performance.
  2. Over time your database is growing (which is probably why are are concerned about performance).
  3. You want to do your best to improve the overall health of the database, not just fixing one thing.

Most DBAs who are not familiar with the issues around index fragmenation just set up maintenance plans, and see SHRINKDATABASE as a nice maintenance plan to add.  It must be good since it is going to make the database take up less space than it does now.  This is the problem, although SHRINKDATABASE may give you a small file, the amount of index fragmentation is massive.

I have seen maintenance plans that first reorganize or rebuild all of the indexes, then call DBCC SHRINKDATABASE.  This should be translated as the first reorganize all of the indexes, then they scramble them again.

Here is an example showing some new tables, with a clustered index on the largest, that are then fragmented, then REORGANIZED, then SHRINKDATABASE.  You might find the results interesting.

 

To start with, I am going to create a new database, with two simple tables. One table uses a CHAR column and the other users VARCHAR. The reason for the CHAR column is to just take up extra space for the purpose of the demonstration. Each table will be filled with 10,000 rows holding text that is randomly generated with the NEWID() function and then cast to be a VARCHAR. For the purpose of demonstrating, that appeared to be a good way to fill up the table with some characters.

USE MASTER;
GO

IF EXISTS(SELECT name FROM sys.databases WHERE name = 'IndexTest')
DROP DATABASE [IndexTest];
CREATE DATABASE [IndexTest];
USE [IndexTest]

CREATE TABLE [Table1] (id INT IDENTITY,  name CHAR (6000));
SET nocount ON;
GO
INSERT INTO [Table1] (name) SELECT CAST(Newid() AS VARCHAR(100));
GO 10000

CREATE TABLE [Table2] (id INT IDENTITY,  name VARCHAR(6000));
CREATE CLUSTERED INDEX [Table2Cluster] ON [Table2] ([id] ASC);
GO
INSERT INTO [Table2] (name) SELECT CAST(Newid() AS VARCHAR(100));
GO 10000

 

Now that we have some tables, lets take a look at the size of the database and the fragmentation on Table2. We will run thee following two queries before after each of the commands following this.

DBCC showcontig('[Table2]');
SELECT CAST(CONVERT(DECIMAL(12,2),Round(t1.size/128.000,2)) AS VARCHAR(10)) AS [FILESIZEINMB] ,
CAST(CONVERT(DECIMAL(12,2),Round(Fileproperty(t1.name,'SpaceUsed')/128.000,2)) AS VARCHAR(10)) AS [SPACEUSEDINMB],
CAST(CONVERT(DECIMAL(12,2),Round((t1.size-Fileproperty(t1.name,'SpaceUsed'))/128.000,2)) AS VARCHAR(10)) AS [FREESPACEINMB],
CAST(t1.name AS VARCHAR(16)) AS [name]
FROM dbo.sysfiles t1;

 

The results of the two checks are shown below. You can see that the “Logical scan fragmentation” is 2.9% which is very good. You can also see that the data file is taking 80.0mb of disk space. Remember these numbers as they will be changing later.

Next we drop Table1 which will free up space at the beginning of the datafile. This is done to force Table2 to be moved when we run DBCC SHRINKDATABASE later.

DROP TABLE [Table1];

The checks after dropping the table show that there is no change to the Table2 fragmentation, but free space in the datafile is now 78.38mb.

 

Next we shrink the database, then run the same 2 queries to check the size and the fragmentation.

DBCC shrinkdatabase (‘IndexTest’, 5);

The results show good news and bad news. The good news is that the filesize has been reduced from 80mb to just 1.88mb. The bad news shows that fragmentation is now 98.55%, which indicates that the index is not going to perform as optimal as it should. You can see the shrinkdatabase has succeeded just as expected, and if you didn’t know where to look, you wouldn’t know that the clustered index on Table2 is now very fragmented.

Imagine running DBCC SHRINKDATABASE every night on a large database with hundreds or thousands of tables. The effect would be that very quickly every table with a clustered index would end up at close to 100% fragmented. These heavily fragmented indexes will slow down queries and seriously impact performance.

To fix this fragmentation, you must REORGANIZE or REBUILD the index.
The standard recommendation is to REORGANIZE if the fragmentation is between 5% and 30%, and to REBUILD if it is more than 30% fragmented. This is a good recommendation if you are running on SQL Server Enterprise Edition with the ability to REBUILD indexes online, but with standard edition this is not available so the REORGANIZE will do the job.

ALTER INDEX table2cluster ON [IndexTest].[dbo].[Table2] reorganize;

Once we run this our check script shows that after the REORGANIZE the fragmentation has been reduced to 10.14%, which is a big improvement over the 98.55% it was at earlier.

Next we try the REBUILD.

ALTER INDEX table2cluster ON [IndexTest].[dbo].[Table2] rebuild;

Which reduces the fragmenation to 4.17%, but it increases the filesize to 34.88mb. This effectively is undoing a big part of the original DBCC SHRINKDATABASE.

Notes
You can REBUILD or REORGANIZE all of your indexes on the system at one time, but this is not recommended. The REBUILD or REORGANIZE of all of the indexes will impact performance while it is running, and it may cause excessive transaction logs to be generated.

After doing a REORGANIZE of an index, it is suggested that statistics be updated immediately after the REORGANIZE.

Summary
It is my opinion that DBCC SHRINKDATABASE should never be run on a production system that is growing over time. It may be necessary to shrink the database if a huge amount of data has been removed from the database, but there are other options besides shink in this case. After any DBCC SHRINKDATABASE, if you chose to use it, you will need to REBUILD or REORGANIZE all of your indexes.

Even if you never use DBCC SHRINKDATABASE your indexes will end up getting fragmented over time. My suggestion is to create a custom Maintenance Plan which finds the most fragmented indexes and REBUILD or REORGANIZE them over time. You could for instance create a stored procedure that finds and REORGANIZES the 4 or 5 indexes that are the most fragmented. This could be run a couple times per night during a slow time allowing your system to automatically find and fix any indexes that are too fragmented.