T-SQL Technical Debt

Technical Debt is a programming concept that is also known as code debt.

The concept of Technical Debt is most closely associated with Agile software development, but it applies to database development as well. There are many tools available to report on technical debt for C#, PHP, and other programming languages, but there isn't much available to manage SQL Server T-SQL technical debt, or general database design technical debt, on SQL Server.

Some of the traditional technical debt causes are:

  • Poor design.
  • Lack of coding standards.
  • A rush to get things done rather than focusing on stability and sustainability.
  • Not refactoring code as it grows.
  • Evolving code or environments to just make things work, rather than to make things work well.
  • And many more…

Reducing technical debt will lead to a more sustainable environment.

An Example of Technical Debt in SQL Server

Consider a project that has been using SQL Server over the last 10 years, which has evolved from SQL Server 2000, is currently running on SQL Server 2008, and has a plan to move to SQL Server 2012. There were practices with SQL Server 2000 that were considered best practices, or the right way of doing things, that aren't the case any more. For instance, in SQL Server 2000 using TEXT columns was the norm for anything larger than 8000 bytes that wouldn't fit into a VARCHAR column. In SQL Server 2005 Microsoft introduced VARCHAR(MAX) to go beyond the 8000 byte limit (along with NVARCHAR(MAX) and VARBINARY(MAX)). It has been rumored since SQL Server 2008 that TEXT columns would be going away, but they are still supported in SQL Server 2008 R2. SQL Server 2012 has the TEXT column on the official deprecated features list with the statement "Use varchar(max), nvarchar(max), and varbinary(max) data types."

In this example, where Microsoft has announced that the TEXT column type will be going away at some point in the future, the use of the TEXT column type is considered technical debt. Just using that column type will (based on the deprecated feature list from Microsoft) cause your code to break at some point in the future. TEXT columns still work in SQL Server 2012 even though they are deprecated, and they still work in the CTP1 early release of SQL Server 2014, but since Microsoft has announced the deprecation, there is no guarantee for the future.
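One way to start paying down this particular piece of debt is to find where the deprecated types are still in use, and convert them. Here is a minimal sketch; the dbo.Documents table and its Body column are hypothetical names for illustration:

```sql
-- Find columns still using the deprecated large-object types.
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE IN ('text', 'ntext', 'image');

-- Convert a hypothetical TEXT column to VARCHAR(MAX) in place.
ALTER TABLE dbo.Documents ALTER COLUMN Body VARCHAR(MAX);
```

NTEXT maps to NVARCHAR(MAX) and IMAGE maps to VARBINARY(MAX) in the same way.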

Technical debt is one of those things that is up for much debate based on your specific coding standards. Anything that analyzes technical debt needs to allow specific overrides based on your local coding standards. For instance, some people argue that using spaces instead of tabs for indentation in your T-SQL code is the right way, and others argue that tabs are the only way to do proper indentation. For this specific case there are pros and cons on either side of the argument; the key is having a coding standard. If your coding standard specifies one way or the other, you are in better shape than if it is not defined.

The next beta of the Database Health Reports (Beta 7) will introduce a technical debt analysis tool. This is scheduled to be released sometime late summer 2013.

What do you consider T-SQL Technical Debt?

DBCC ShrinkDatabase

Being day 24 of the DBCC Command month at SteveStedman.com, today’s featured DBCC Command is DBCC SHRINKDATABASE.

When I first heard about DBCC Shrink Database (many years ago), I immediately thought “what a great feature, run this every night when the system load is low, and the database will be smaller, perform better, and all around be in better shape”. Boy was I wrong.

To ‘DBCC SHRINKDATABASE’ or Not To ‘DBCC SHRINKDATABASE’: That is the question

If you read Microsoft Books Online, there is a lot of great info on all the benefits of shrinking your database, and hidden down in the best practices section is one line about how "A shrink operation does not preserve the fragmentation state of indexes in the database".

So why not just shrink the database every day, like I had attempted so many years ago? The problem is index fragmentation, which is a pretty common problem on many SQL Servers. Index fragmentation is such a performance issue that the now-obsolete DBCC INDEXDEFRAG and DBCC DBREINDEX commands were created to address it, and were later replaced with ALTER INDEX options for rebuilding and reorganizing.

Is it a good idea to run DBCC SHRINKDATABASE regularly?

Download the sample file ShrinkSample.

This article and samples apply to SQL Server 2005, 2008, 2008 R2, and SQL Server 2012.

This really depends on a number of factors, but generally the answer is NO, it is not a good idea to run DBCC SHRINKDATABASE regularly.

For the purpose of this article, I am going to assume a couple of things:

  1. You are concerned about database performance.
  2. Over time your database is growing (which is probably why you are concerned about performance).
  3. You want to do your best to improve the overall health of the database, not just fixing one thing.

Most DBAs who are not familiar with the issues around index fragmentation just set up maintenance plans, and see SHRINKDATABASE as a nice step to add. It must be good, since it is going to make the database take up less space than it does now. This is the problem: although SHRINKDATABASE may give you a smaller file, the amount of index fragmentation it causes is massive.

I have seen maintenance plans that first reorganize or rebuild all of the indexes, then call DBCC SHRINKDATABASE. This should be translated as: first they reorganize all of the indexes, then they scramble them again.

Here is an example showing two new tables, with a clustered index on the largest, which is then fragmented by SHRINKDATABASE and repaired with REORGANIZE and REBUILD. You might find the results interesting.

To start with, I am going to create a new database with two simple tables. One table uses a CHAR column and the other uses VARCHAR. The reason for the CHAR column is just to take up extra space for the purpose of the demonstration. Each table will be filled with 10,000 rows holding text that is randomly generated with the NEWID() function and then cast to a VARCHAR. For the purpose of this demonstration, that is a convenient way to fill the table with some characters.

USE MASTER;
GO

IF EXISTS(SELECT * FROM Sys.sysdatabases WHERE [name] = 'IndexTest')
	DROP DATABASE [IndexTest];
GO

CREATE DATABASE [IndexTest];
GO

USE [IndexTest];
CREATE TABLE [Table1] (id INT IDENTITY,  name CHAR (6000));
SET nocount ON;
GO

INSERT INTO [Table1] (name) SELECT CAST(Newid() AS VARCHAR(100));
GO 10000

CREATE TABLE [Table2] (id INT IDENTITY,  name VARCHAR(6000));
CREATE CLUSTERED INDEX [Table2Cluster] ON [Table2] ([id] ASC);
GO

INSERT INTO [Table2] (name) SELECT CAST(Newid() AS VARCHAR(100));
GO 10000

Now that we have some tables, let's take a look at the size of the database and the fragmentation on Table2. We will run the following two queries before and after each of the commands that follow.

DBCC showcontig('[Table2]');

SELECT CAST(CONVERT(DECIMAL(12,2),
            Round(t1.size/128.000,2)) AS VARCHAR(10)) AS [FILESIZEINMB] ,
       CAST(CONVERT(DECIMAL(12,2),
	        Round(Fileproperty(t1.name,'SpaceUsed')/128.000,2)) AS VARCHAR(10)) AS [SPACEUSEDINMB],
       CAST(CONVERT(DECIMAL(12,2),
	        Round((t1.size-Fileproperty(t1.name,'SpaceUsed'))/128.000,2)) AS VARCHAR(10)) AS [FREESPACEINMB],
       CAST(t1.name AS VARCHAR(16)) AS [name]
FROM dbo.sysfiles t1;

The results of the two checks are shown below. You can see that the “Logical scan fragmentation” is 2.9% which is very good. You can also see that the data file is taking 80.0mb of disk space. Remember these numbers as they will be changing later.

Next we drop Table1 which will free up space at the beginning of the datafile. This is done to force Table2 to be moved when we run DBCC SHRINKDATABASE later.

DROP TABLE [Table1];

The checks after dropping the table show that there is no change to the Table2 fragmentation, but free space in the datafile is now 78.38mb.

Next we shrink the database, then run the same 2 queries to check the size and the fragmentation.

DBCC shrinkdatabase ('IndexTest', 5);

The results show good news and bad news. The good news is that the file size has been reduced from 80mb to just 1.88mb. The bad news is that fragmentation is now 98.55%, which indicates that the index is not going to perform as well as it should. You can see that SHRINKDATABASE succeeded just as expected, and if you didn't know where to look, you wouldn't know that the clustered index on Table2 is now very fragmented.

Imagine running DBCC SHRINKDATABASE every night on a large database with hundreds or thousands of tables. The effect would be that very quickly every table with a clustered index would end up at close to 100% fragmented. These heavily fragmented indexes will slow down queries and seriously impact performance.

To fix this fragmentation, you must REORGANIZE or REBUILD the index.
The standard recommendation is to REORGANIZE if the fragmentation is between 5% and 30%, and to REBUILD if it is more than 30% fragmented. This is a good recommendation if you are running SQL Server Enterprise Edition, which has the ability to REBUILD indexes online; with Standard Edition online rebuilds are not available, so REORGANIZE will do the job.
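A sketch of how that recommendation might be applied using the sys.dm_db_index_physical_stats DMV (available on SQL Server 2005 and later), which reports fragmentation for every index in the current database:

```sql
-- Suggest REORGANIZE (5-30% fragmentation) or REBUILD (over 30%).
SELECT OBJECT_NAME(ips.[object_id]) AS TableName,
       i.name AS IndexName,
       ips.avg_fragmentation_in_percent,
       CASE WHEN ips.avg_fragmentation_in_percent > 30 THEN 'REBUILD'
            WHEN ips.avg_fragmentation_in_percent > 5  THEN 'REORGANIZE'
            ELSE 'OK'
       END AS Suggestion
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.[object_id] = ips.[object_id] AND i.index_id = ips.index_id
WHERE ips.index_id > 0; -- skip heaps
```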

ALTER INDEX table2cluster ON [IndexTest].[dbo].[Table2] reorganize;

Once we run this our check script shows that after the REORGANIZE the fragmentation has been reduced to 10.14%, which is a big improvement over the 98.55% it was at earlier.

Next we try the REBUILD.

ALTER INDEX table2cluster ON [IndexTest].[dbo].[Table2] rebuild;

Which reduces the fragmentation to 4.17%, but increases the file size to 34.88mb. This effectively undoes a big part of the original DBCC SHRINKDATABASE.

Notes

You can REBUILD or REORGANIZE all of the indexes on the system at one time, but this is not recommended. Rebuilding or reorganizing all of the indexes will impact performance while it is running, and it may cause excessive transaction log growth.

After doing a REORGANIZE of an index, it is suggested that you update statistics immediately, since unlike REBUILD, REORGANIZE does not update them for you.
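For instance, to update the statistics on the Table2 clustered index from the example above right after reorganizing it:

```sql
ALTER INDEX [Table2Cluster] ON [dbo].[Table2] REORGANIZE;
UPDATE STATISTICS [dbo].[Table2] [Table2Cluster];
```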

Summary

It is my opinion that DBCC SHRINKDATABASE should never be run on a production system that is growing over time. It may be necessary to shrink the database if a huge amount of data has been removed from it, but there are other options besides shrink in this case. After any DBCC SHRINKDATABASE, if you choose to use it, you will need to REBUILD or REORGANIZE all of your indexes.

Even if you never use DBCC SHRINKDATABASE, your indexes will end up getting fragmented over time. My suggestion is to create a custom maintenance plan which finds the most fragmented indexes and REBUILDs or REORGANIZEs them over time. You could, for instance, create a stored procedure that finds and REORGANIZEs the 4 or 5 most fragmented indexes. This could be run a couple of times per night during a slow period, allowing your system to automatically find and fix any indexes that are too fragmented.
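A minimal sketch of that idea; the TOP count and the 5% threshold are arbitrary values you would tune for your own system:

```sql
-- Build and run REORGANIZE statements for the 5 most fragmented indexes.
DECLARE @sql NVARCHAR(MAX) = N'';

SELECT TOP (5)
       @sql = @sql + N'ALTER INDEX ' + QUOTENAME(i.name)
            + N' ON ' + QUOTENAME(SCHEMA_NAME(o.[schema_id]))
            + N'.' + QUOTENAME(o.name) + N' REORGANIZE; '
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.[object_id] = ips.[object_id] AND i.index_id = ips.index_id
JOIN sys.objects AS o
  ON o.[object_id] = ips.[object_id]
WHERE ips.index_id > 0
  AND ips.avg_fragmentation_in_percent > 5
ORDER BY ips.avg_fragmentation_in_percent DESC;

EXEC sp_executesql @sql;
```

Wrapped in a stored procedure and scheduled, this keeps the worst offenders in check without touching every index every night.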

Related Posts:

Blog:  Index Fragmentation

Blog: Index Fragmentation and SHRINKDATABASE

Notes:

For more information see TSQL Wiki DBCC shrinkdatabase.

DBCC Command month at SteveStedman.com is almost as much fun as realizing how fragmented your indexes are after running DBCC SHRINKDATABASE.

T-SQL: A Simple Example Using a Cursor

In SQL Server the cursor is a tool that is used to iterate over a result set, or to loop through each row of a result set one row at a time. It may not be the best way to work with a set of data, but if you need to loop row by agonizing row (RBAR) in a T-SQL script then a cursor is one way of doing it.

Note: If you are new to SQL Server and come from an Oracle background, you should know that cursors on SQL Server are different from those on Oracle.

Before creating the cursor, we will just start with a simple query that will end up being used in the cursor.


USE AdventureWorks2008;
GO

SELECT BusinessEntityID, Name
 FROM Sales.Store;

Which produces a simple two-column result set of BusinessEntityID and Name values.

Now to convert it to a cursor, instead of just a select statement.

Step 1: Declare variables to hold the output from the cursor.

DECLARE @BusinessEntityID as INT;
DECLARE @BusinessName as NVARCHAR(50);

Step 2: Declare the cursor object.


DECLARE @BusinessCursor as CURSOR;

Step 3: Assign the query to the cursor.


SET @BusinessCursor = CURSOR FOR
SELECT BusinessEntityID, Name
 FROM Sales.Store;

Step 4: Open the cursor.


OPEN @BusinessCursor;

Step 5: Fetch the first row.


FETCH NEXT FROM @BusinessCursor INTO @BusinessEntityID, @BusinessName;

Step 6: Loop until there are no more results. In the loop, print out the ID and the name from the result set, and fetch the next row.


WHILE @@FETCH_STATUS = 0
BEGIN
 PRINT cast(@BusinessEntityID as VARCHAR (50)) + ' ' + @BusinessName;
 FETCH NEXT FROM @BusinessCursor INTO @BusinessEntityID, @BusinessName;
END

Step 7: Close the cursor.


CLOSE @BusinessCursor;

Step 8: Deallocate the cursor to free up any memory or open result sets.


DEALLOCATE @BusinessCursor;

Now putting it all together:


DECLARE @BusinessEntityID as INT;
DECLARE @BusinessName as NVARCHAR(50);

DECLARE @BusinessCursor as CURSOR;

SET @BusinessCursor = CURSOR FOR
SELECT BusinessEntityID, Name
 FROM Sales.Store;

OPEN @BusinessCursor;
FETCH NEXT FROM @BusinessCursor INTO @BusinessEntityID, @BusinessName;

WHILE @@FETCH_STATUS = 0
BEGIN
 PRINT cast(@BusinessEntityID as VARCHAR (50)) + ' ' + @BusinessName;
 FETCH NEXT FROM @BusinessCursor INTO @BusinessEntityID, @BusinessName;
END

CLOSE @BusinessCursor;
DEALLOCATE @BusinessCursor;


This should give you a quick overview of how to quickly build and use a cursor on Microsoft SQL Server. The example shown was run on SQL Server 2008, and works the same on SQL Server 2005 and 2012.

Enjoy!


Recursive Scalar Function in T-SQL

In my Common Table Expressions presentation the topic of recursion often comes up, but recursion in T-SQL scalar functions might be less familiar.
This article shows how a scalar function in SQL Server can call itself, which makes it recursive.

What is the Fibonacci Sequence

The Fibonacci Sequence is a classic example that is used to demonstrate recursion.

By definition, the first two numbers in the Fibonacci sequence are 0 and 1, and each subsequent number is the sum of the previous two: F(0) = 0, F(1) = 1, and F(n) = F(n-1) + F(n-2) for n ≥ 2.

For instance, the following are Fibonacci Numbers:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887, 9227465, 14930352, 24157817, 39088169, 63245986, 102334155, 165580141, 267914296, 433494437, 701408733 …

Fibonacci Sequence as a Computer Science Challenge

Oftentimes calculating the Fibonacci sequence is used as a computer science puzzle, or a programming interview question to see if you understand recursion. It is very simple to do in any programming language that supports recursion.

What is Recursion

Recursion is a programming concept where a function, procedure or section of code calls itself. For instance in the following T-SQL example, the scalar function Fibonacci calculates the Fibonacci sequence using recursion, by calling itself.  This is accomplished by calling the function name inside of the body of the function.


CREATE FUNCTION dbo.Fibonacci (@Num INTEGER, @prev INTEGER, @next INTEGER)
RETURNS VARCHAR(4000)
AS
BEGIN
    DECLARE @returnValue AS VARCHAR(4000) = CAST(@prev AS VARCHAR(4000));
    IF (@Num > 0)
    BEGIN
        IF (LEN(@returnValue) > 0)
        BEGIN
            SET @returnValue = @returnValue + ',';
        END
        SET @returnValue = @returnValue + dbo.Fibonacci(@Num - 1, @next, @next + @prev);
    END

    RETURN @returnValue;
END
GO

To call the function you simply include it in a SELECT statement, with the first parameter being the number of recursive steps to take, and the second and third parameters always 0 and 1. The second and third parameters prime the recursive function, and are used internally to pass the previous and current values along to each recursive call.

select dbo.Fibonacci(10, 0, 1);

Which produces the following output in SSMS:

0,1,1,2,3,5,8,13,21,34,55

For more details on the CTE version of Fibonacci, take a look at my earlier post.
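For comparison, here is a minimal sketch of the recursive CTE approach, which produces the same sequence as the scalar function call above, one number per row:

```sql
WITH Fib (n, prev, nxt) AS
(
    SELECT 1, CAST(0 AS BIGINT), CAST(1 AS BIGINT)
    UNION ALL
    SELECT n + 1, nxt, prev + nxt
    FROM Fib
    WHERE n < 11
)
SELECT prev AS Fibonacci
FROM Fib;
```

Using BIGINT in the anchor lets the CTE version go well past the point where an INT would overflow.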

Enjoy!

T-SQL DATEPART Explained

Here is a quick rundown on the T-SQL DATEPART function for SQL Server. DATEPART is used to pull a single part of a date/time element out as shown below.

The following query uses DATEPART to extract the year from the datetime input.

SELECT DATEPART(year, 'April 19, 2013') as year;


The following query uses DATEPART to extract the quarter from the datetime input.

SELECT DATEPART(quarter, 'April 19, 2013') as quarter;

The following query uses DATEPART to extract the month part from the datetime input.

SELECT DATEPART(month, 'April 19, 2013') as month ;

The following query uses DATEPART to extract the dayofyear part from the datetime input.

SELECT DATEPART(dayofyear, 'April 19, 2013') as dayofyear ;

The following query uses DATEPART to extract the day part from the datetime input.

SELECT DATEPART(day, 'April 19, 2013') as day ;

The following query uses DATEPART to extract the week part from the datetime input.

SELECT DATEPART(week, 'April 19, 2013') as week ;

The following query uses DATEPART to extract the weekday part from the datetime input.

SELECT DATEPART(weekday, 'April 19, 2013') as weekday ;

The following query uses DATEPART to extract the hour part from the datetime input.

SELECT DATEPART(hour, 'April 19, 2013 09:01:22.123') as hour ;

The following query uses DATEPART to extract the minute part from the datetime input.

SELECT DATEPART(minute, 'April 19, 2013 09:01:22.123') as minute ;

The following query uses DATEPART to extract the second part from the datetime input.

SELECT DATEPART(second, 'April 19, 2013 09:01:22.123') as second ;

The following query uses DATEPART to extract the millisecond part from the datetime input.

SELECT DATEPART(millisecond, 'April 19, 2013 09:01:22.123') as millisecond ;


Abbreviated format for DATEPART

The following query uses DATEPART to extract the Year part from the datetime input.

SELECT DATEPART(Yy, 'April 19, 2013') as Yy;

The following query uses DATEPART to extract the Quarter part from the datetime input.

SELECT DATEPART(Qq, 'April 19, 2013') as Qq;

The following query uses DATEPART to extract the Month part from the datetime input.

SELECT DATEPART(Mm, 'April 19, 2013') as Mm;

The following query uses DATEPART to extract the Day of the Year part from the datetime input.

SELECT DATEPART(Dy, 'April 19, 2013') as Dy;

The following query uses DATEPART to extract the Day of the Month part from the datetime input.

SELECT DATEPART(Dd, 'April 19, 2013') as Dd;

The following query uses DATEPART to extract the Week part from the datetime input.

SELECT DATEPART(Wk, 'April 19, 2013') as Wk;

The following query uses DATEPART to extract the Day of the Week part from the datetime input.

SELECT DATEPART(Dw, 'April 19, 2013') as Dw;

The following query uses DATEPART to extract the Hour part from the datetime input.

SELECT DATEPART(Hh, 'April 19, 2013 09:01:22.123') as Hh;

The following query uses DATEPART to extract the Minute part from the datetime input.

SELECT DATEPART(Mi, 'April 19, 2013 09:01:22.123') as Mi;

The following query uses DATEPART to extract the Second part from the datetime input.

SELECT DATEPART(Ss, 'April 19, 2013 09:01:22.123') as Ss;

The following query uses DATEPART to extract the Millisecond part from the datetime input.

SELECT DATEPART(Ms, 'April 19, 2013 09:01:22.123') as Ms;


Zero Padding

Note: the DATEPART function returns an INT data type. If you need a zero-padded result you can cast it to a VARCHAR, concatenate a '0' to the front, then use the RIGHT function to trim off the extra padding, for instance:

SELECT RIGHT ('0' + CAST(DATEPART(Mi, 'April 19, 2013 09:01:22.123') AS VARCHAR(2)),2) as Mi;


There are many great ways to use DATEPART; these are just some of them.
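As a side note, SQL Server 2012 and later also offer the FORMAT function, which returns zero-padded date parts directly as strings (it returns NVARCHAR rather than the INT that DATEPART returns):

```sql
SELECT FORMAT(CAST('2013-04-19 09:01:22.123' AS DATETIME), 'MM') AS [Month],
       FORMAT(CAST('2013-04-19 09:01:22.123' AS DATETIME), 'mm') AS [Minute];
-- Uppercase 'MM' is the month ('04'), lowercase 'mm' is the minute ('01').
```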

Enjoy.

Select Favorite SQL Server 2012 Articles

The following articles are a collection of my favorite SQL Server related posts over the last 6 months.

3 Steps to Work More Efficiently in SSMS. Posted 2/17/2013

Recently the 3 Steps to Work More Efficiently in SSMS article was one of my favorites. The three steps were:

  1. Display results in a separate tab.
  2. Hotkey for SELECT TOP 100 * FROM …
  3. sp_help by using Alt+F1

Displaying results in a separate tab is one of the first things I set up when working in SSMS on a new computer.

VARCHAR diff function Posted 2/13/2013

What made the VARCHAR diff function posting one of my favorites was how it built on a number of previous articles to finally get to something much more useful, all based on common table expressions. The VARCHAR diff function used code from a previous posting called Using a CTE to Split a String Into Rows with Line Numbers. A couple of weeks after writing the VARCHAR diff function with CTEs, I ended up using it on a project to create an automated testing script that compared the output from an outdated code project to the results from the new T-SQL implementation. The VARCHAR diff function allowed a single tester to test the work of 3 developers in far less time than was anticipated on the project.

The VARCHAR diff function probably isn’t something that anyone would use on a daily basis, but it is a massive time saver when needed.

TSQL 2012 – Generating letters with a SEQUENCE object Posted 11/9/2012

This article was a lot of fun. Having learned about how to use a SQL Server 2012 sequence object to generate letters at a SQL Saturday presentation, I just had to blog about this one. It was a fun post.

Adventureworks2012 data to Google GeoChart Visualization Posted 10/19/2012

In this article I pulled address data from the AdventureWorks2012 database, and covered how to extract it and reformat it correctly to display on the Google GeoChart Visualization.


Again another fun posting.

I hope everyone enjoys these as much as I did.