Calculating Factorials with a Recursive CTE

What is a Factorial:

The product of an integer and all the integers below it; e.g., factorial four (4!) is equal to 24.

 

The factorial of a positive integer n, written n!, is the product of all the positive integers from 1 up to and including n
Example:
1! = 1
2! = 1 * 2 = 2
3! = 1 * 2 * 3 = 6
4! = 1 * 2 * 3 * 4 = 24

 

It is simple to do with a recursive function, and actually commonly used as a programming interview question to determine if you understand recursion.  For instance, years ago, I was asked the following during an interview at Microsoft.

What is recursion? Calculate factorial numbers using a recursive function?

Which is very easy to answer using C, which was the programming language of choice when I was asked it.  But what if you were asked that on an intervie for a SQL Server DBA, or SQL Server Programmer job.  Your first answer might be to use a recursive stored procedure, which would be a good answer, but what if you were asked the following, how would you answer it?

What is recursion?  Using TSQL write a query to calculate factorials without creating any stored procedures or functions.

Think, think, think…. You want the job, and that is the question that the interviewer is asking you.  How do you solve it?

  1. First, make sure you understand factorials.  If you don’t understand what a factorial is you won’t get anywhere with this.
  2. Then write the query, using a Recursive Common Table Expression.

Here is how I solved it.

First, get started with a simple CTE that just calculates the first factorial 1! as shown below.  Pretty simple at this point.  The result set is correct, the Factorial of 1 is just 1.

 

Next, I would add the recursive CTE query to continue with the factorial calculations beyond 1!, this time going up to 5!  When working with recursive CTE‘s you usually want to have an escape clause to avoid too much recursion.  In this example the CTE exits at 5.

 

That’s nice, but lets go further.  Lets go for 20!  Which you will see in the image below caused some problems as shown with an error of “Arithmetic overflow error converting expression to data type int.”

 

Now to fix the overflow.  It turns out that Factorial of 20 gets big pretty quick, and an integer only can hold a number up to just over 2 Billion. But a BIGINT can go much further, to just over 9 Quintillion (US).  So to fix this we cast the factorial column to be a BIGINT rather than an INT.

INT

Range: -2^31 (-2,147,483,648) to 2^31-1 (2,147,483,647)
Space: 4 Bytes

BIGINT

Range: -2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807)
Space: 8 Bytes

 

So here we go with an attempt to get to 20! , which works great.

But how far can we go with the BIGINT 9 Quintillion limit of BIGINT.  As it turns out 21! exceeds the size of a BIGINT.

 

So what now?   What if we want to calculate more than 20! using SQL Server?     Cast it to a NUMERIC(38,0) which gives us 38 digits to work with as shown here.

 

Now what???

I haven’t been able to find a solution to do math and store the results larger than a NUMERIC(38,0).

Here is the final query.

WITH factorial (n, factorial)
AS (SELECT 1,
CAST(1 AS NUMERIC(38, 0)) – Cast to BIGINT to avoid overflow
UNION ALL – here is where it gets recursive
SELECT n + 1,
( n + 1 ) * factorial
FROM   factorial – reference back to the CTE
WHERE  n < 33 – abort when we get to 33!
)
SELECT n,
factorial
FROM   factorial;

 

I hope you have found this useful.

 

CTE – With An Insert Statement

Queries with Common table expressions (CTE) are made up of two parts, the CTE part, and the SQL that references the CTE.  In preparation for SQL Saturday, the question came up of can you use an INSERT or UPDATE statement with a CTE.  Referring to the documentation I confirmed that using an insert or update inside of the CTE is invalid, but you can use an insert or update statement outside of the CTE.

 

For Example.


DECLARE @NumTableVar TABLE( n INT);

 

;WITH numbers (n)
AS (SELECT 1
UNION ALL
SELECT 1 + n
FROM   numbers
WHERE  n < 1000)

INSERT INTO @NumTableVar   (n)
SELECT n
FROM   numbers
OPTION (MAXRECURSION 0);


SELECT *
FROM   @NumTableVar;

When run confirms that you can use the insert statement with a CTE, but not inside of a CTE.

 

This would be very useful if you had just created a table and wanted to fill it up quickly for testing purposes.

Fibonacci Sequence

As part of my CTE research for my SQL Saturday presentation in Redmond in 2 weeks I decided to take on some classic computer science algorithms with CTE’s. Here is what I cam up with.

Any math geek can tell you what the Fibonacci Sequence is, but do you know how to calculate it with a query?

First, what is the Fibonacci Sequence?

By definition, the first two numbers in the Fibonacci sequence are 0 and 1, and each subsequent number is the sum of the previous two.

For instance, the following are Fibonacci Numbers:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887, 9227465, 14930352, 24157817, 39088169, 63245986, 102334155, 165580141, 267914296, 433494437, 701408733 …

 

Fibonacci Sequence as a Computer Science Challenge

Often times calculating the Fibonacci Sequence is used as a computer science puzzle, or programming interview question to see if you understand recursion.  It is very simple to do in any programming language that supports recursion.

 

So what is recursion…  Recursion is a method of programming where a function ends up calling back into itself.

Writing a query to calculate the Fibonacci Sequence

In order to do this you need to understand recursion and CTE‘s.

The following query uses a recursive CTE query to generate the fibonacci sequence.


– Calculating the Fibonacci sequence 
WITH Fibonacci (PrevN, N) AS

( SELECT 0, 1

UNION ALL

SELECT N, PrevN + N

FROM Fibonacci

WHERE N < 1000000000

)

SELECT PrevN as Fibo

FROM Fibonacci

OPTION (MAXRECURSION 0);

 

Which generates the following output.

Now to make it more fun, using the CSV conversion from a previous blog post, how do we convert it to a Comma Separated List.


WITH Fibonacci (PrevN, N) AS
( SELECT 0, 1

UNION ALL

SELECT N, PrevN + N

FROM Fibonacci

WHERE N < 1000000000

)

SELECT Substring((SELECT cast(‘, ’ as varchar(max)) + cast(PrevN as varchar(max))

FROM Fibonacci

FOR XML PATH()),3,10000000) AS list

Output…

 

What other classic math problems can you do with SQL Server.

Generating a Tree Path with a CTE

The following example shows how to create a Tree Path with a Common Table Expression (CTE).

First off what is a tree path?

For this example I will be referring to the product categories at a camping and fitness store, where you have a top level with 4 categories, and some of the categories have subcategories:

  • Camping
    • Tents
      • 1 Person
      • 2 Person
        • Family Camping
        • Backpacking
        • Mountaineering
      • 3 Person
      • 4 Person
    • Backpacks
    • Sleeping Bags
    • Cooking
  • Cycle
  • Snowsports
  • Fitness

 

In this example, the tree path for 2 person mountaineering tents would be  ”Camping -> Tents -> 2 Person -> Mountaineering”.  This is sometimes referred to as bread crumbs when you are creating a website or storefront.

 

So how do you get from the following table structure into a hierarchy that includes the Tree Path.

 

You could do it by doing a self JOIN from the table back to itself, joining parent to id, which would work for 2 levels.  Then to go to the third or fourth level, you would need to another one or two self JOINs to get those levels in.  The problem with this strategy is that you are limited on the level of categories or the depth of the tree.

Using a Common Table Expression

The way that I would do it is with a CTE to get the following results.

 

Here is how you do it, with a recursive CTE, which will accommodate any number of levels up to 100 (even more if you specify a deeper MAXRECURSION).

 

WITH departmentcte(deptid, department, parent, LEVEL, treepath) AS

( SELECT id AS deptid, department, parent, 0 AS LEVEL,

CAST(department AS VARCHAR(1024)) AS treepath

FROM departments

WHERE parent IS NULL

UNION ALL – and now for the recursive part 

SELECT d.id AS deptid, d.department, d.parent,

departmentcte.LEVEL + 1 AS LEVEL,

CAST(departmentcte.treepath + ‘ -> ’ +

CAST(d.department AS VARCHAR(1024))

AS VARCHAR(1024)) AS treepath

FROM departments d

INNER JOIN departmentcte

ON departmentcte.deptid = d.parent)

SELECT *

FROM departmentcte

ORDER BY treepath;

Just another example of using a Recursive Common Table Expression.

For more details, see my CTE presentation at SQL Saturday in Redmond.

CTE Query Performance

The following question came up when working on my CTE presentation for SQL Saturday.

Does a query that JOINs a CTE to itself execute the CTE query once or twice?

For instance:


 ;WITH deptCTE(id, department, parent)
    AS (SELECT id,department,parent FROM   Departments)
SELECT q1.department,q2.department
  FROM deptCTE q1
 INNER JOIN deptCTE q2 ON q1.id = q2.parent
 WHERE q1.parent IS NULL; 

 

The following execution plan is produced showing that the Departments table is hit twice with a table scan each time with the same cost.

In the above example the query inside of the CTE gets run twice, and has no performance improvement over using the same query twice as a subquery. But it does clean up the syntax and reduce errors by only having the query in one place.

Sessions Submitted for SQL Saturday #120 Orange County CA

After being approved as a speaker at SQL Saturday 114 – Vancouver BC, and SQL Saturday 108 – Redmond WA, I have decided to give it another shot at SQL Saturday 120 in Orange County California.

I have submitted 2 abstracts for sessions:

The first one is titled Unleashing Common Table Expressions in SQL Server, and if accepted this will be the third time presenting this one on a SQL Saturday.

The next session which I will also be presenting in Vancouver BC is Using SSRS reports to analyze SQL Server health.

 

I am looking forward to travelling to sunny southern California for these presentations.  Nice to get away from the snow and rain for a while.