How BIG is too BIG for a BIGINT IDENTITY?

After writing a recent blog post on Are you wasting half the capacity of IDENTITY, I started thinking about how much is too big for an BIGINT IDENTITY.

If you don’t know about, you can get Are you wasting half the capacity of IDENTITY 9,223,372,036,854,775,807 IDENTITY values out of a BIGINT in SQL Server.  Which in easier terminology is 9 with roughly 18 digits after it, which works out to 9 quintillion (us) or 9 trillion for the rest of the world.  Translating again that is roughly 9 billion billion, which is a huge amount of IDENTITY values.

Now if you consier the article of Are you wasting half the capacity of IDENTITY you can get just about 18.5 quintillion rows out of an identity column.  I know as DBA’s and programmers we all like to plan big, but there are very few real world cases where you would need more than 18.5 quintillion rows out of an IDENTITY COLUMN.

So I ask the question, what are you doing where you need more than 18.5 quintillion different id’s in a table?  Perhaps its time to rethink the table design if it is getting that big.

Are you wasting half the capacity of IDENTITY?

When creating a table you can specify IDENTITY with the following syntax:


IDENTITY [ (seed , increment) ]

For example
CREATE TABLE [dbo].[TestIntIdentity](
  [cpuID] [int] IDENTITY(1,1) NOT NULL,
  [logTime] [datetime] NOT NULL DEFAULT GetDate()
) ON [PRIMARY];

The IDENTITY property is valid for data types of the integer data type category, except for the bit data type, or decimal data type.  So how many rows can you insert into that table?  It depends on the data type used.

  • BIGINT maximum value is 9,223,372,036,854,775,807
  • INT maximum value is 2,147,483,647
  • SMALLINT maximum value is 32,767
  • TINYINT maximum value is 255

What happens when I run out of numbers?  Let’s take a look at an example using SMALLINT.


-- SMALLINT should accommodate 32767 values
CREATE TABLE [dbo].[TestSmallIntIdentity](
 [cpuID] [smallint] IDENTITY(1,1) NOT NULL,
 [logTime] [datetime] NOT NULL DEFAULT GetDate()
) ON [PRIMARY];

GO

INSERT INTO TestSmallIntIdentity DEFAULT VALUES;

SELECT * FROM TestSmallIntIdentity;
SELECT count(*) FROM TestSmallIntIdentity;

GO

At this point we have 1 row in the table.  Now lets fill almost all the way full.


INSERT INTO TestSmallIntIdentity DEFAULT VALUES;
GO 32765

-- Run it 32,765 times

SELECT * FROM TestSmallIntIdentity;
SELECT count(*) FROM TestSmallIntIdentity;

GO

Keep in mind that specifying a number after the GO statement specifies that we want the previous statement run that many times.  In this case run the insert statement 32756 times plus the one previous should leave us with 32766 rows in the table as shown here.

Now lets take a quick look at this table using the Identity Column Usage Report from Database Health Reports project.

Now the big question, there are 32766 items in the table, we know that it will hold 32767, what happens when we do another insert or 2?


INSERT INTO TestSmallIntIdentity DEFAULT VALUES;

SELECT count(*) FROM TestSmallIntIdentity;
GO

This one works fine and puts our count at 32767,  what next.  Insert one more row…


INSERT INTO TestSmallIntIdentity DEFAULT VALUES;

SELECT count(*) FROM TestSmallIntIdentity;
GO

This message is SQL Servers way of saying that the identity column is full.

Msg 8115, Level 16, State 1, Line 1
Arithmetic overflow error converting IDENTITY to data type smallint.
Arithmetic overflow occurred.

How do I fix it?

This is where it gets ugly.  You can’t change the datatype on a IDENTITY value on an existing table with dropping it and recreating it, which would lead to data loss.

If there are no tables with foreign key relationships, you could just do the following:

  1. Create another table with a larger integer type for the IDENTITY column, in this case BIGINT or INT.
  2. Then select everything out of this table into the new table, excluding the primary key.
  3. Then drop the original table.
  4. Then rename the new table to the original name.
The more complex solution is if there are foreign keys, you need to also drop the foreign key relationship,  turn on INDENTITY_INSERT do the insert, and turn off IDENTITY INSERT, then re-enable the foreign keys.  Here are the steps.
  1. Create another table with a larger integer type for the IDENTITY column, in this case BIGINT or INT.
  2. Disable all the foreign keys associated with the original table.
  3. Turn on identity_insert for the new table.
  4. Then select everything out of the original table into the new table, including the identity column.
  5. Then drop the original table.
  6. Then rename the new table to the original name.
  7. Turn off IDENTITY_INSERT for the new table.
  8. Re-enable all the foreign keys.
Oh, and did I mention do this at a time when nobody is inserting anything new into the original table.

How to prevent it?

2 options when you originally create a table.

  • Use a larger integer data type.
  • Seed the identity at the smallest value available for that data type.
  • On SQL Server 2012 you could consider the SEQUENCE object instead of an IDENTITY.

The first option of a larger integer data type is more expensive on the database, for instance if you just decided to set every identity to be a BIGINT, you would have a huge amount of overhead and waste on many of your tables.

The second option of seeding the identity to the smallest value available for the sequence.

  • BIGINT minimum value is -9,223,372,036,854,775,808
  • INT minimum value is -2,147,483,648
  • SMALLINT minimum value is -32,768
  • TINYINT minimum value is 0

There really isn’t anything to be gained with the TINYINT since its minimum value is 0, but with the 3 other INT data types, you double the capacity of your identity values by starting them at their minimum value instead of at 1.

Take a look at the following example, seeding the value with -32768, instead of 1.


-- SMALLINT should accommodate -32768 to 32767 values
CREATE TABLE [dbo].[TestSmallIntIdentity2](
[cpuID] [smallint] IDENTITY(-32768,1) NOT NULL,
[logTime] [datetime] NOT NULL DEFAULT GetDate()
) ON [PRIMARY];

GO

INSERT INTO TestSmallIntIdentity2 DEFAULT VALUES;

SELECT * FROM TestSmallIntIdentity2;
SELECT count(*) FROM TestSmallIntIdentity2;

GO

After inserting 1 row, we see the following output.

Where you can see that the first value inserted has an identity value of -32768.  Now what happens when we insert 37268 values.

INSERT INTO TestSmallIntIdentity2 DEFAULT VALUES;</pre>
GO 32768

-- Run it 32,768 times
-- take a look at the last to inserted.
SELECT TOP 10 *
FROM TestSmallIntIdentity2
ORDER BY cpuID DESC;

SELECT count(*) FROM TestSmallIntIdentity2;

Which produces the following output.

Where you can see that after inserting 32769 rows were are now at an IDENTITY value of 0, rather than overflowing.

Now lets take a look at the same table using the Identity Column Usage Report from Database Health Reports project.

Summary

From what we have observed, I ponder the question of why would you ever not start an identity at its minimum value in order to take full advantage the capacity of the datatype you are using.

-Steve Stedman

Bellingham WA

Muda – or the 7 Deadly Wastes

In the manufacturing world it is referred to the 7 wastes, or the Japanese term Muda.  How does this apply to software development.

THE 7 WASTES OR MUDA ARE:

  • Transportation
  • Inventory
  • Motion
  • Waiting
  • Over-Processing
  • Over-Production
  • Defects

These all apply to software development in similar ways as they apply to a manufacturing facility.

For instance:

  • Transportation refers to moving things, which in software development may refer to electronically moving things through the process.
  • Inventory – The backlog of bugs, feature requests or other work.
  • Motion – Moving people, for instance in person meetings or walking to a different office for things that could be done over the phone, over Skype, IM, or other modern technology.
  • Waiting – Developers distracted by waiting for information, waiting for a build, waiting for test, etc.
  • Over Processing – depends a lot on the environment.
  • Over Production – doing more than what is asked for or needed.
  • Defects of Muda translates to Bugs in software.

Find out more at Lean6Agile.SteveStedman.com

Database Health Version 3.1 So Soon After Beta 3

Today I release Beta 3.1 of the Database Health Reports.

Immediately after the release of Beta 3, there were 3 bugs reported that were very quick to fix.  I was able to get these fixed, along with one other bug fix, once they were all tested, I launched Beta 3.1 of the Database Health Reports.

For anyone who downloaded Beta 3 and set up the historic waits monitoring:  What you set up will still be active if you upgrade to Beta 3.1, just upgrade to Beta 3.1, and  reconnect to it by clicking on the historic waits panel and providing the login credentials.

I hope everyone enjoys this update.

-Steve Stedman

Bellingham WA

Beta 3 is Now Available.

One big new feature in Beta 3 of the Database Health Reports, the historic monitoring.

The way that the historic monitoring works is that it continuously monitors your SQL Server database and logs the details around queries that are causing waits.

Here is an example, the following ugly block of TSQL updates a table with a few million rows in it, and that table has about 20 indexes on the primary key of a unique identifier.   Like I said ugly, ugly, ugly, but this query is quickly detected in the historic monitoring in Beta 3 of the Database Health Reports.


declare @counter as integer;
set @counter = 100;
WHILE @counter > 0
BEGIN
UPDATE dbo.NoLockDemo SET SomeData = NEWID()
set @counter = @counter - 1
WAITFOR DELAY '00:00:00.300'
END

Here is what I discovered using the Historic Waits reporting in Beta 3 of the Database Health Reports.

From here you can see that this query was run a lot between 4:00pm and 6:00pm, and that the wait that caused the most trouble was the WAITFOR wait type, but following that was the IO_COMPLETION, LOGBUFFER, PAGEIOLATCH_EX, followed by WRITELOG.

Once you activate the historic waits reporting, it may take a couple hours to a day to collect the data to be able to see trends like we see here.

The Beta 3 of the Database Health Reports gives you a quick graphical way to track down the queries that may be causing the most pain on your SQL Server.

Download Beta 3 of the Database Health Reports and give it a try.

Beta 3 of the Database Health Reports will be launched tomorrow.

Finally after a month and a half of preparation on Beta 3 of the Database Health Reports, it will be released tomorrow.

The biggest feature in Beta 3 is the historic monitoring.  This process installs a small database to track wait times continuously, with a very small (less than 1%) on the database being monitored.

Look for the download of Beta 3 of the Database Health Reports tomorrow.

The Beta 3 release of the Database Health Reports adds 2 major milestones, first the Historic Monitoring which, in my opinion is as useful as the entire rest of the Database Health Reports project.   The second milestone was support for SQL Server 2005.  Several bugs were fixed that allow for the program to function correctly when connection to SQL Server 2005.

The supported versions of SQL Server for the Database Health Reports project are now:

  • SQL Server 2005 SP1 or newer
  • SQL Server 2008
  • SQL Server 2008 R2
  • SQL Server 2012

New Features

  • Added Historic Monitoring.  This is the main feature added in Beta 3.
    • Added an additional drilldown level in the tree view hierarchy.
    • Creates a small database on one SQL Server instance to track historic data from multiple instances.
    • Includes filtering by time, user, database, query, and host.
    • Allow drilldown by time, user, database, query and host.
    • Added the Historic Query Waits Advisor with charting over time, waits summary, and links to the original Waits Advisor.
  • Added a configuration option for the minimum number of pages to consider in the fragmented indexes report.  Allows users to skip the small tables with fragmentation since fragmented small tables don’t have as big of an impact as the larger tables.

Bug Fixes

  • Fixed a timing related crashing bug on the Table Sizes Advisor.
  • Fixed crash on startup related to the splash screen.  Implemented a new splash screen.
  • Fixed several crashes with SQL Server 2005 instances.  Right, in Beta 1 and Beta 2 I didn’t do enough testing on SQL Server 2005.  With Beta 3 I think we are good on 2005, although a couple reports are disabled because the use functionality not available on SQL Server 2005.
    • Bug 70 – Crash on backup reports with SQL Server 2005.
    • Bug 71 – Stored Procs by logical writes crashes on SQL Server 2005.
    • Bug 73 –  Re-implemented the statistics reports so that they function on SQL Server 2005.
  • Fixed bug relating to ;, + and other not alphanumeric chars in passwords.  It was causing the login to fail, now it works.
  • SQL Server 2000 is not supported, none of the functionality works on SQL Server 2000, I now disable connections to SQL Server 2000 to prevent a variety of crashes throughout the reports.