
CTE

Understanding Common Table Expressions (CTEs) in SQL Server

Common Table Expressions (CTEs) are a powerful feature in SQL Server, introduced with SQL Server 2005, that allow you to create temporary, named result sets. These result sets exist only within the scope of a single statement (SELECT, INSERT, UPDATE, DELETE, or MERGE) and can greatly simplify complex queries, improve readability, and enhance maintainability. A CTE is defined using the WITH keyword, followed by the CTE name, an optional list of column names, the AS keyword, and a query in parentheses that defines the data. Once defined, the CTE can be referenced like a table or view in the statement that immediately follows it.
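As a minimal sketch of that syntax, the query below defines a CTE with an explicit column list and then selects from it. The CTE name DepartmentCTE and the inline VALUES rows are made up for illustration, so no tables are needed to run it.

```sql
-- Minimal CTE sketch. DepartmentCTE and the inline rows are illustrative only.
-- Note: the statement before WITH must end with a semicolon.
WITH DepartmentCTE (id, department) AS
(
    SELECT d.id, d.department
    FROM (VALUES (1, 'Camping'),
                 (2, 'Cycle'),
                 (3, 'Snowsports')) AS d (id, department)
)
SELECT id, department
FROM DepartmentCTE      -- referenced like a table or view
WHERE id > 1
ORDER BY id;
```

The CTE goes out of scope as soon as this statement finishes; a second SELECT from DepartmentCTE in the same batch would fail with an invalid object name error.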

Simplifying Complex Queries

One of the primary benefits of CTEs is their ability to break down complex queries into manageable chunks. For instance, if you have a deeply nested query or one with multiple levels of joins and aggregations, a CTE can allow you to isolate and name each logical step. This not only makes the query easier to write but also significantly improves its readability for anyone maintaining it later. For example, a CTE can be used to calculate a running total or to filter intermediate results before applying additional logic.
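Here is one way the "name each logical step" idea might look, assuming some made-up sales rows. The first CTE supplies the data, the second aggregates it by year, and the final query adds a running total. The names Sales, YearTotals, and the columns are all hypothetical, and the windowed SUM with ORDER BY assumes SQL Server 2012 or later.

```sql
-- Sketch of chained CTEs: each step gets a name. All data is made up.
WITH Sales AS
(
    -- Step 1: the raw rows (inline so the example is self-contained)
    SELECT s.sale_year, s.department, s.amount
    FROM (VALUES (2022, 'Camping', 100.00),
                 (2022, 'Cycle',   250.00),
                 (2023, 'Camping', 175.00),
                 (2023, 'Cycle',   300.00)) AS s (sale_year, department, amount)
),
YearTotals AS
(
    -- Step 2: summarize by year
    SELECT sale_year, SUM(amount) AS year_total
    FROM Sales
    GROUP BY sale_year
)
-- Step 3: running total over the yearly summary
SELECT sale_year,
       year_total,
       SUM(year_total) OVER (ORDER BY sale_year
                             ROWS UNBOUNDED PRECEDING) AS running_total
FROM YearTotals
ORDER BY sale_year;
```

Each CTE can reference the ones defined before it in the same WITH clause, which is what lets a nested query be read top to bottom instead of inside out.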

Recursion with CTEs

CTEs also support recursion, making them an excellent choice for working with hierarchical data such as organizational structures or tree-like datasets. Recursive CTEs consist of two parts joined by UNION ALL: an anchor member, which provides the base result set, and a recursive member, which references the CTE itself and runs repeatedly until it returns no more rows. By default SQL Server stops a recursive CTE with an error after 100 levels of recursion. The OPTION (MAXRECURSION n) query hint changes that limit (0 removes it), which helps to catch runaway queries caused by cycles in the data. Recursive CTEs are often simpler and more intuitive than alternative approaches like chains of self-joins or cursor-based solutions.
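The following is a small sketch of the anchor and recursive members walking a parent/child hierarchy. The rows are inlined in a first CTE (Dept) so the query runs on its own; the names Dept, DepartmentTree, and depth are illustrative, and the limit of 10 is an arbitrary choice for this example.

```sql
-- Recursive CTE sketch over made-up department rows.
WITH Dept AS
(
    SELECT d.id, d.department, d.parent
    FROM (VALUES (1, 'Camping',   NULL),
                 (2, 'Cycle',     NULL),
                 (5, 'Tents',     1),
                 (6, 'Backpacks', 1),
                 (9, 'Bikes',     2)) AS d (id, department, parent)
),
DepartmentTree AS
(
    -- Anchor member: top-level departments have no parent
    SELECT id, department, parent, 0 AS depth
    FROM Dept
    WHERE parent IS NULL

    UNION ALL

    -- Recursive member: children of the rows found so far
    SELECT d.id, d.department, d.parent, t.depth + 1
    FROM Dept AS d
    INNER JOIN DepartmentTree AS t
            ON d.parent = t.id
)
SELECT id, department, parent, depth
FROM DepartmentTree
ORDER BY depth, id
OPTION (MAXRECURSION 10);   -- fail with an error rather than recurse past 10 levels
```

The hint goes on the outer statement, not inside the CTE definition.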

Temporary Scope and Performance

It’s important to note that CTEs exist only for the duration of the statement they are defined in. They are not persisted as objects in the database, which makes them lightweight and flexible. However, this temporary nature means that they cannot be indexed directly, and a CTE that is referenced more than once in a statement may be evaluated each time it is referenced. While CTEs are generally performant for intermediate processing, their execution depends on how SQL Server’s query optimizer handles the underlying logic. For complex cases or when performance issues arise, it may be necessary to analyze execution plans and consider alternatives like indexed views or temp tables.
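When an intermediate result needs to be reused or indexed, one possible alternative is to materialize it in a temp table. This is only a sketch: the table name #TopLevel, the index name, and the rows are made up, and whether the index actually helps depends on your data and the execution plan.

```sql
-- Sketch: materialize an intermediate result once, then index it.
CREATE TABLE #TopLevel
(
    id         int          NOT NULL,
    department varchar(200) NOT NULL
);

INSERT INTO #TopLevel (id, department)
VALUES (1, 'Camping'), (2, 'Cycle'), (3, 'Snowsports'), (4, 'Fitness');

-- Unlike a CTE, a temp table can be indexed
CREATE CLUSTERED INDEX IX_TopLevel_id ON #TopLevel (id);

-- The stored rows can now be queried as many times as needed
SELECT id, department
FROM #TopLevel
WHERE id = 3;

DROP TABLE #TopLevel;
```

The trade-off is that the temp table is written to TempDB, while a CTE is simply part of the query that uses it.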

Practical Applications of CTEs

CTEs are widely used in scenarios like improving query readability, managing hierarchical data, and simplifying reporting queries. For example, they are ideal for tasks such as filtering out duplicate rows, summarizing sales data by year, or identifying cycles in graph-like datasets. CTEs also lend themselves well to troubleshooting and debugging SQL code, allowing developers to isolate problematic sections of queries. By mastering CTEs, SQL Server developers and DBAs can create more efficient and maintainable solutions for a variety of business needs.
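As one example of the duplicate-filtering case, here is a sketch that numbers the rows within each group and keeps only the first. The names RawRows, Numbered, and rn are hypothetical, as is the rule that the lowest id wins.

```sql
-- Sketch: filter out duplicate rows with ROW_NUMBER() inside a CTE.
WITH RawRows AS
(
    SELECT r.id, r.department
    FROM (VALUES (1, 'Camping'),
                 (2, 'Cycle'),
                 (3, 'Camping'),    -- duplicate department name
                 (4, 'Fitness'),
                 (5, 'Cycle')) AS r (id, department)   -- another duplicate
),
Numbered AS
(
    SELECT id,
           department,
           ROW_NUMBER() OVER (PARTITION BY department
                              ORDER BY id) AS rn
    FROM RawRows
)
SELECT id, department
FROM Numbered
WHERE rn = 1        -- keep the first row of each department
ORDER BY id;
```

Changing the final query to select the rows WHERE rn > 1 lists the duplicates instead, which is a handy way to inspect them before removing anything.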

For expert assistance with SQL Server queries, performance tuning, or CTE optimization, consider the Managed Services from Stedman Solutions. Learn more at Stedman Solutions Managed Services.

Temp Table vs Table Variable vs CTE and the use of TEMPDB.

For more information on Common Table Expressions and performance, take a look at my book at Amazon.com: Common Table Expressions Joes 2 Pros®: A CTE Tutorial on Performance, Stored Procedures, Recursion, Nesting and the use of Multiple CTEs

There are many reasons that a Temp Table, Table Variable or Common Table Expression can be a good solution, but there is also a lot of confusion around these.  Much of that confusion is around the use of TempDB.

For the purpose of this article I am going to use my standard Departments table from my CTE presentation. It is just a simple database to work with. Also, this is running on my local SQL Server with no other users connected.


USE [master];
set statistics io off;

IF EXISTS(SELECT name FROM sys.databases WHERE name = 'cte_demo')
BEGIN
 ALTER DATABASE [cte_demo] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
 DROP DATABASE [cte_demo];
END
CREATE DATABASE [cte_demo];
GO

USE [cte_demo];
-- create a table to use for CTE query demo
CREATE TABLE [Departments] (
 id int, --would normally be an INT IDENTITY
 department VARCHAR (200),
 parent int
 );

-- insert top level departments
insert into [Departments] (id, department, parent) values (1, 'Camping', null);
insert into [Departments] (id, department, parent) values (2, 'Cycle', null);
insert into [Departments] (id, department, parent) values (3, 'Snowsports', null);
insert into [Departments] (id, department, parent) values (4, 'Fitness', null);

-- now some sub-departments for camping
insert into [Departments] (id, department, parent) values (5, 'Tents', 1);
insert into [Departments] (id, department, parent) values (6, 'Backpacks', 1);
insert into [Departments] (id, department, parent) values (7, 'Sleeping Bags', 1);
insert into [Departments] (id, department, parent) values (8, 'Cooking', 1);

-- now some sub-departments for cycle
insert into [Departments] (id, department, parent) values (9, 'Bikes', 2);
insert into [Departments] (id, department, parent) values (10, 'Helmets', 2);
insert into [Departments] (id, department, parent) values (11, 'Locks', 2);

-- now some sub-departments for snowsports
insert into [Departments] (id, department, parent) values (12, 'Ski', 3);
insert into [Departments] (id, department, parent) values (13, 'Snowboard', 3);
insert into [Departments] (id, department, parent) values (14, 'Snowshoe', 3);

-- now some sub-departments for fitness
insert into [Departments] (id, department, parent) values (15, 'Running', 4);
insert into [Departments] (id, department, parent) values (16, 'Swimming', 4);
insert into [Departments] (id, department, parent) values (17, 'Yoga', 4);

Then to compare the difference in TempDB usage, I am going to use an undocumented function called fn_dblog(), which shows you a list of the items that are written to the transaction log in different scenarios. I will also combine this with the CHECKPOINT command to clear the contents of the log for TempDB.

To try this out, run the following on a TEST or DEVELOPMENT database. Don’t run this on your production system.


USE tempdb;
GO

SELECT *
 FROM fn_dblog(null, null);

CHECKPOINT;

SELECT *
 FROM fn_dblog(null, null);

Then take a look at the output which will look something like this.

The first time you select from fn_dblog() you may see many rows. In my example fn_dblog() returned 171 rows; you may have many more. But after running CHECKPOINT you only get 3 rows, and those are the 3 rows that you will normally get after running CHECKPOINT.
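The same CHECKPOINT and fn_dblog() pattern can be used to see how much log activity a given operation generates. The following is only a sketch: the temp table #LogDemo is made up, and because fn_dblog() is undocumented, its columns and row counts can differ between SQL Server versions. As above, run it on a TEST or DEVELOPMENT server only.

```sql
USE tempdb;
GO

-- Clear the TempDB log so we start from a small baseline
CHECKPOINT;

SELECT COUNT(*) AS log_records_before
FROM fn_dblog(NULL, NULL);

-- The operation being measured (hypothetical temp table)
CREATE TABLE #LogDemo (id int);
INSERT INTO #LogDemo (id) VALUES (1);

SELECT COUNT(*) AS log_records_after
FROM fn_dblog(NULL, NULL);

-- Break the log records down by operation type
SELECT Operation, COUNT(*) AS record_count
FROM fn_dblog(NULL, NULL)
GROUP BY Operation
ORDER BY record_count DESC;

DROP TABLE #LogDemo;
```

Comparing the before and after counts gives a rough measure of what the operation wrote to the TempDB log, provided nothing else is using the server at the same time.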
