TSQL Queries – Using NTILE

NTILE is a SQL Server function that can be called in a TSQL Query. This is part of the free sql query training to prepare for the Microsoft 70-461 queries exam.
Here is a video presentation on using the TSQL NTILE function.

And here is the sample code associated with the presentation. Feel free to copy and paste to SSMS and run it yourself.


CREATE DATABASE [QueryTraining];
GO

USE [QueryTraining];



-- Table to be used by Over Clause Rows/Range
CREATE TABLE REVENUE
(
[DepartmentID] int,
[Revenue] int,
[Year] int
);

insert into REVENUE
values (1,10030,1998),(2,20000,1998),(3,40000,1998),
       (1,20000,1999),(2,60000,1999),(3,50000,1999),
       (1,40000,2000),(2,40000,2000),(3,60000,2000),
       (1,30000,2001),(2,30000,2001),(3,70000,2001),
       (1,90000,2002),(2,20000,2002),(3,80000,2002),
       (1,10300,2003),(2,1000,2003), (3,90000,2003),
       (1,10000,2004),(2,10000,2004),(3,10000,2004),
       (1,20000,2005),(2,20000,2005),(3,20000,2005),
       (1,40000,2006),(2,30000,2006),(3,30000,2006),
       (1,70000,2007),(2,40000,2007),(3,40000,2007),
       (1,50000,2008),(2,50000,2008),(3,50000,2008),
       (1,20000,2009),(2,60000,2009),(3,60000,2009),
       (1,30000,2010),(2,70000,2010),(3,70000,2010),
       (1,80000,2011),(2,80000,2011),(3,80000,2011),
       (1,10000,2012),(2,90000,2012),(3,90000,2012);



-- first lets look at the REVENUE table

SELECT * 
  FROM Revenue;



-- NTILE Function


-- Rank all the revenue amounts on a scale of 1 to 10
SELECT *, NTILE(10) OVER(ORDER BY [Revenue] ASC) 
  FROM Revenue
  ORDER BY [Year];

-- Rank all the revenue amounts on a scale of 1 to 100
SELECT *, NTILE(100) OVER(ORDER BY [Revenue] ASC) AS Percentile
  FROM Revenue
  ORDER BY [Year];
-- or does it.....


SELECT *, NTILE(20) OVER(ORDER BY [Revenue] DESC) AS NTile20 
  FROM Revenue
  ORDER BY NTile20;


-- multiple NTILES
SELECT *, 
	   NTILE(10) OVER(ORDER BY [Revenue] DESC) AS NTile10, 
	   NTILE(20) OVER(ORDER BY [Revenue] DESC) AS NTile20, 
	   NTILE(4) OVER(ORDER BY [Revenue] DESC) AS Quartile
  FROM Revenue
  ORDER BY NTile20;

Enabling Transparent Data Encryption on SQL Server 2014

To start with for Transparent Data Encyrption (TDE) we will need to be using Enterprise (or Developer) edition of SQL Server 2014. I am using SQL Server 2014 Enterprise Edition CTP1, which is the early release of SQL Server 2014.

Next we need a database to work with.  I will just use the Sample Database for Common Table Expressions, a simple database for TDE with just a couple tables. After downloading the script and running it, we can see the following in the Object Exporer.

TDE14_1

Encrypting the Database

First lets set up the master key to be used for encryption, then create a certificate for the database we are going to encrypt.


USE master;
GO
CREATE MASTER KEY ENCRYPTION
    BY PASSWORD = 'BeSureToUseABetterPasswordThanThis!!!!!!';
GO
CREATE CERTIFICATE cteDemoCert
  WITH SUBJECT = 'cte_demo Certificate';

The next step is to create the encryption key for the database being encrypted. In this case we are encrypting the cte_demo database.

USE cte_demo;
GO

CREATE DATABASE ENCRYPTION KEY
WITH ALGORITHM = TRIPLE_DES_3KEY
ENCRYPTION BY SERVER CERTIFICATE cteDemoCert;

As soon as you run this you will be shown a warning stating that key has not been backed up and that you should back it up.
TDE14_2

Warning: The certificate used for encrypting the database encryption key has not been backed up. You should immediately back up the certificate and the private key associated with the certificate. If the certificate ever becomes unavailable or if you must restore or attach the database on another server, you must have backups of both the certificate and the private key or you will not be able to open the database.

Now lets backup the certificate as the warning suggests:

USE master;
GO
BACKUP CERTIFICATE cteDemoCert TO FILE = 'c:\Backups\cteDemoCert';

And the final step is to turn on TDE encryption for the database. This may take a while for larger databases.


USE cte_demo;
GO
ALTER DATABASE cte_demo SET ENCRYPTION ON;

And to check to see if everything worked, we can just query the sys.databases table to see if we the cte_demo database is now encrypted.

SELECT NAME, IS_ENCRYPTED
 FROM sys.databases;

TDE14_4From there we can see that the cte_demo database has been encrypted with transparent data encryption. Be sure to make a full database backup at this time and to safely store your backed up encryption key.

The same script will also work on SQL Server 2012.

SQL Server 2014 related articles:

Query To Simulate CPU Load

Quite often when working testing different performance measures from the Database Health Reports to testing Resource Governor configuration I end up in a situation where I need to emulate a high CPU load on my test SQL Server. In this case it is usually my desktop or a Hyper-V virtual machine, where I want to see how things will work if I had a nearly 100% CPU load. In the real world you would just have to get millions of users to visit your website that has a SQL Server backend, that its not that easy in my development / test lab.

Here is that I came up with for a plan. Keeping in mind the goal here is to write queries that will use up as much CPU as possible. This is counter-intuitive, completely against everything that I practice on a daily basis, but here goes.

First create a table with poor design. Using UNIQUEIDENTIFIERS for a primary key and a foreign key (parent_id) is probably ugly enough.


CREATE TABLE SplitThrash
(
 id UNIQUEIDENTIFIER default newid(),
 parent_id UNIQUEIDENTIFIER default newid(),
 name VARCHAR(50) default cast(newid() as varchar(50))
);

Next we fill the table up with lots and lots of rows, specifically 1,000,000 rows, remember here the goal is to simulate CPU load. If this isn’t enough I often times run this script several times. Keep in mind the GO statement followed by a number says to execute the batch that many times.


SET NOCOUNT ON;
INSERT INTO SplitThrash DEFAULT VALUES;
GO  1000000

Next, this part makes me just feel nasty. Create a CLUSTERED index on the table that we just filled up, and cluster on BOTH columns that were UNIQUEIDENTIFIERS.


CREATE CLUSTERED INDEX [ClusteredSplitThrash] ON [dbo].[SplitThrash]
(
 [id] ASC,
 [parent_id] ASC
);

At this point is is a bit ugly, but it still doesn’t use much memory. You are probably wondering why I called the table split thrash. I gave it this name so that updating the UNIQUEIDENTIFER would cause as many page splits or new page allocations as possible. So we update the parent_id which is part of the clustered index

</pre>
UPDATE SplitThrash
SET parent_id = newid(), id = newid();
GO 100
<pre>

This update statement causes chaos in the page structure for the table as updating the unique identifiers causes quite a bit of processor work.

On my wimpy VM for this development environment I need to repeat this entire process creating 4 or 5 tables, and doing the update in 4 or 5 SSMS windows in order to use up all of the CPU on the database.

Don’t try this on a production system, but it is a great test to run on a development server.

See Also:

If you like this you might also like my article on how to use up all of your memory with a common table expression.

CTE Scope

Day 5 of Common Table Expression Month (June) at SteveStedman.com, today I will cover concepts around the scope of a CTE.

The queries in the video will be using that database that was set up in a previous posting on the CTE_DEMO Sample Database, if you haven’t set up the sample database, download it and set it up now.

Where is live is not to far from the US/Canadian border. On most days I have no need to visit Canada, but occasionally I may got to Vancouver for a SQLSaturday, or just to go out for some sushi or a comedy club. As long as I have my passport with me, I have the ability to cross into Canada and return to the United States. If I don’t plan ahead and take my passport with me, then visiting Canada is not an option, it is just not allowed. When we talk about scope in any programming language it is similar to  not being able to get to that sushi restaurant inVancouver BC because I don’t have my passport and I can’t cross the border without it. That sushi restaurant would be considered out of scope. In programming languages, including T-SQL, scope is the concept of what you can or can’t access. Some things in T-SQL are accessible from anywhere on the SQL Server (assuming permissions have been granted), for instance a table has what I would call global scope in that it can be referenced from any query accessing the SQL Server (again assuming the correct permissions). If you look at something like a temporary table, it has the scope of the function or query batch that created it, there is also a concept of a global temporary table which has a greater scope.

One of the confusing things around CTE is the scope of a CTE. Given that a CTE is described as being similar to a temporary table, like a temporary named result set, or like a temporary view, all of these imply something that might be around for more than a single query. When using CTE’s, the CTE only exists for the single statement that is using it.

The scope of the CTE is only a single query. That means that the single select, insert or update statement that uses the CTE is the only thing that can access that CTE.

Lets take a look at scope using the following query:


USE [cte_demo];
 GO

SELECT *
FROM Departments
ORDER BY id;

;WITH departmentsCTE(id, department, parent) AS
 ( SELECT id, department, parent
 FROM Departments)
 SELECT * FROM departmentsCTE WHERE id = 1;

SELECT * FROM departmentsCTE WHERE id = 2;

SELECT department
 FROM Departments
 ORDER BY id DESC;

At first glance this batch of SQL may look fine, but when we run it it throws an error.

CTE_Scope1

Even thought we get two result sets.

CTE_Scope2

Lets take a look at the code again, so we can see what is in scope and out of scope. The green rectangle outlines a single query that is using a CTE, and that CTE is available only inside of that single statement.  The red rectangle is attempting to access the CTE from the previous query, and the CTE is not reachable, it is out of scope.

CTE_Scope3

If we take a look again at the original query, what can we do to achieve the results that were intended here, 4 result sets returned, and the 2nd and 3rd queries using the CTE.


USE [cte_demo];
GO

SELECT *
FROM Departments
ORDER BY id;

;WITH departmentsCTE(id, department, parent) AS
( SELECT id, department, parent
FROM Departments)
SELECT * FROM departmentsCTE WHERE id = 1;

SELECT * FROM departmentsCTE WHERE id = 2;

SELECT department
FROM Departments
ORDER BY id DESC;

To make it work we just need to copy the CTE and paste it in to the second query so that it looks like this:


USE [cte_demo];
GO

SELECT *
 FROM Departments
 ORDER BY id;

;WITH departmentsCTE(id, department, parent) AS
( SELECT id, department, parent
 FROM Departments)
SELECT * FROM departmentsCTE WHERE id = 1;

;WITH departmentsCTE(id, department, parent) AS
( SELECT id, department, parent
 FROM Departments)
SELECT * FROM departmentsCTE WHERE id = 2;

SELECT department
 FROM Departments
 ORDER BY id DESC;

Which produces the desired result sets:

CTE_Scope4

Here is how the scope of the CTE’s works. The first CTE query, the CTE is only available int the green rectangle, and the second CTE only has the scope of the blue rectangle.

CTE_Scope6

I hope this helps with the overall understanding of the scope of a common table expression.

Related Links:

Common Table Expressions Book

If you enjoyed this posting, and want to learn more about common table expressions, please take a look at my book on CTE’s at Amazon.com. The book is titled Common Table Expressions – Joes 2 Pros® – A CTE Tutorial on Performance, Stored Procedures, Recursion, Nesting and the use of Multiple CTEs.

Common Table Expressions – Terminating the Previous Statement

Day 4 of Common Table Expression Month (June) at SteveStedman.com, today I will cover the topic of terminating the previous T-SQL statement.

These queries will be using that database that was set up in a previous posting on the CTE_DEMO Sample Database, if you haven’t set up the sample database, download it and set it up now.

To start with having come from a background years ago of writing C programs, the concept of terminating statements is natural to me. I always feel a bit uncomfortable of a SQL Statement isn’t properly terminated with a semicolon. If I am doing a T-SQL code review for someone else, I usually will make at least one comment about terminating T-SQL statements, unless it has already been done. I know that is not a requirement to terminate every SQL Statement, but it does make it easier to read. However in most cases, other than readability the termination of the previous statement really doesn’t matter. . . In most cases.

For instance, the following two sets of queries work exactly the same with our without the semicolons.

Batch 1:


SELECT * FROM Departments
SELECT * FROM (SELECT id, department, parent FROM Departments) as Dept

Batch 2:


SELECT * FROM Departments;
SELECT * FROM (SELECT id, department, parent FROM Departments) as Dept;

In most cases, when writing multiple T-SQL statements in a batch, the semicolon to terminate the lines really doesn’t matter, but there are a few cases in SQL Server where it does matter, and common table expressions are one of them.

Improper Termination

To start with lets take a look at the sample code from our previous posting titled “Writing Your First Common Table Expression with SQL Server“. If we tried to run this SQL without any semicolons, it would run fine if it was run one line at a time, but if we run it as a batch, it throws an error.

USE cte_demo
GO

SELECT * FROM Departments

WITH departmentsCTE(id, department, parent) AS
(
 SELECT id, department, parent
 FROM Departments
)
SELECT *
 FROM departmentsCTE

When run produces the following results:

CTE_Termination

Msg 336, Level 15, State 1, Line 5
Incorrect syntax near 'departmentsCTE'. If this is intended to be a 
common table expression, you need to explicitly terminate the previous 
statement with a semi-colon.

Which is a descriptive enough message stating that we need to explicitly terminate the previous statement with a semi-colon.

Proper Termination

When there is another T-SQL Statement in a batch prior to a CTE, that statement must be terminated with a semi-colon like this:

USE cte_demo
GO

SELECT * FROM Departments;

WITH departmentsCTE(id, department, parent) AS
(
 SELECT id, department, parent
 FROM Departments
)
SELECT *
 FROM departmentsCTE

Note, the only required semi-colon is the one at the end of the SELECT * FROM Departments; query, and we get the desired results.
CTE_Termination2

GO Keyword in SSMS

When working in SQL Server Management Studio, an alternative is to terminate the batch, and the GO keyword is used to break a number of T-SQL statements into different batches, like this:


USE cte_demo;
GO

SELECT * FROM Departments

GO

WITH departmentsCTE(id, department, parent) AS
(
 SELECT id, department, parent
 FROM Departments
)
SELECT *
 FROM departmentsCTE

GO

CTE_Termination3

Clean Code

Being an advocate for clean code, and clearing specifying your intentions, the way that I would write is with a semi-colon at the end of every T-SQL statement, and an extra semi-colon at the beginning of any CTE just to be safe.


USE cte_demo;
GO

SELECT * FROM Departments;

;WITH departmentsCTE(id, department, parent) AS
(
 SELECT id, department, parent
 FROM Departments
)
SELECT *
 FROM departmentsCTE;

CTE_Termination4

How you decide to do it is up to your own coding standards, and as long as it is clearly defined and supports clean code it is just fine.

Related Links:

Common Table Expressions Book

If you enjoyed this posting, and want to learn more about common table expressions, please take a look at my book on CTE’s at Amazon.com. The book is titled Common Table Expressions – Joes 2 Pros® – A CTE Tutorial on Performance, Stored Procedures, Recursion, Nesting and the use of Multiple CTEs.

DBCC ShrinkFile

Being day 25 of the DBCC Command month at SteveStedman.com, today’s featured DBCC Command is DBCC SHRINKFILE.

Description:

DBCC SHRINKFILE is used to shrink an individual log in SQL Server.

DBCC SHRINKFILE Syntax:

dbcc shrinkfile
(
    { 'file_name' | file_id }
    {
        [ , EMPTYFILE]
        | [ [, target_size ] [ , { NOTRUNCATE | TRUNCATEONLY } ] ]
    }
)
    [ WITH NO_INFOMSGS ]

Example:

The following example DBCC ShrinkFile is used to shrink a log file that grew too large.

-- check the size of the files.
SELECT size / 128.0 as sizeMB, name
 FROM sys.database_files;

GO
-- Truncate the log by changing the database recovery model to SIMPLE.
ALTER DATABASE DBHealthHistory SET RECOVERY SIMPLE;
GO
-- Shrink the truncated log file to 1 MB.
DBCC SHRINKFILE (DBHealthHistory_log, 1);
GO
-- Reset the database recovery model.
ALTER DATABASE DBHealthHistory SET RECOVERY FULL;
GO
-- Be sure to do a full backup, then kick off transaction log backups

-- check the size of the files.
SELECT size / 128.0 as sizeMB, name
 FROM sys.database_files;

Related Posts:

Blog Post Shrinking a Log File.

Notes:

For more information see TSQL Wiki DBCC shrinkfile.

DBCC Command month at SteveStedman.com is almost as much fun as chasing geoducks on the beach.