Deleting from a CTE with an EXISTS statement
During my 24 Hours of Pass presentation on Advanced CTE’s today I was asked the question about deleting from a CTE when it uses an EXISTS statement that queries another table. I figured I would create quick blog post to show the example.
First some background when using the delete statement with a CTE you can’t delete if the query inside of a CTE references multiple tables. For instance:
;WITH CustomerCTE AS ( SELECT c.* FROM Customer AS c INNER JOIN SalesInvoice AS si ON si.CustomerID = c.CustomerID WHERE c.LastName like 'Williams' ) DELETE FROM CustomerCTE;
Will return an error like this:
An the question was asked around referencing a second table using EXISTS, for instance:
;WITH CustomerCTE AS ( SELECT c.* FROM Customer AS c WHERE EXISTS (SELECT 1 FROM SalesInvoice AS si WHERE si.CustomerID = c.CustomerID AND c.LastName like 'Williams') ) DELETE FROM CustomerCTE;
Since the CTE using the EXISTS example only references one table in the CTE query (technically the query in the exists is a correlated subquery or derived table), you are able to do it just fine.
So if you can restructure a query inside of a CTE into an EXIST clause being used for DELETE, you can make it work as a way to get around the single table in a CTE delete statement error.
Thanks for asking.
More from Stedman Solutions:
Steve and the team at Stedman Solutions are here for all your SQL Server needs.
Contact us today for your free 30 minute consultation..
We are ready to help!
To delete 100,000 row chunks from a 9,542,067 row table, how about:
DECLARE @rowcount int
,@recordcount int
,@continue int
SET @rowcount = 0
SET @recordcount = 0
SET @continue = 1
WHILE @continue = 1
BEGIN
BEGIN TRANSACTION
;WITH [table_CTE] AS (
SELECT TOP (100000) * FROM [table_A]
WHERE EXISTS (
SELECT TOP (100000) * [table_A].[composite_key_pt1]
,[table_A].[REV]
FROM [table_A]
INNER JOIN [table_B] ON [table_A].[composite_key_pt1] = [table_B].[composite_key_pt1]
AND [table_A].[composite_key_pt2] = [table_B].[composite_key_pt2]
WHERE [table_A].[composite_key_pt1] = [table_B].[composite_key_pt1]
AND [table_A].[composite_key_pt2] = [table_B].[composite_key_pt2]
)
ORDER BY [table_A].[composite_key_pt1]
,[table_A].[composite_key_pt2]
)
DELETE FROM [table_CTE]
SET @rowcount = @@ROWCOUNT
IF @rowcount > 0
BEGIN
SET @recordcount = @recordcount + @rowcount
PRINT ‘Running total deleted record count: ‘ + CAST(@recordcount AS varchar(20))
END
ELSE
BEGIN
SET @continue = 0
SET @recordcount = @recordcount + @rowcount
PRINT ‘Total deleted record count: ‘ + CAST(@recordcount AS varchar(20))
END
COMMIT
END
Any assistance would be appreciated!
Randy, thats a great question. I have had to do something very similar lately.
The problem that you run into when you are deleting 100,000 rows is lock escalation.
When you run a delete statement with just a few rows or a small number of rows server generally locks the pages associated with those rows. When you cross a threshold somewhere between 2000 rows and 4000 rows depending on your server the page locks get escalated to a table lock, thus locking the entire table until all 100,000 rows are deleted which could take some time.
When I want to delete many rows from very large tables, I will usually do it in batches of 2000 rows at a time or less per transaction.
Unfortunately the CTE doesn’t help or hurt in any way here.
My recommendation would be to create a loop that checks to see if there are rows that need to be deleted based on your join, then deletes them at 1000 to 2000 at a time.
Another way to do it would be to get a count of the rows to be deleted and then loop adding up the number of rows deleted (1000 to 2000 at a time), and delete until they are all gone.
Also you might want to consider index structure, fragmentation and statistics when deleting that many rows. When deleting large numbers of rows I have sometime put in an index regorg somewhere along the way or a statistics update, just to help speed up the process.
Another thing to consider if you don’t have a good index for your where clause or join criteria would be to add a filtered index for the data you are deleting, then when you are done deleting you can drop the filtered index.
There really isn’t any great way to delete lots and lots of rows, unless of course you want to delete everything from the table with truncate table.
I hope this helps.
-Steve Stedman