After my post yesterday on the performance implications of concatenating long strings with the + operator and the CONCAT function, today I am following up with a way to speed up concatenation if you need to build really long strings.
What the previous show as that the longer the string gets the slower the concatenation is. So instead we declare a second VARCHAR(MAX) variable called @stringBuilder, and each time through the loop we concatenate to that, then every thousandth time through the loop we take the @stringBuilder variable and concatenate it on to the @bigString, then clear out the @stringBuilder variable. This keeps the @stringBuilder variable relatively short, and reduces the number of concatenations to the @bigString to roughly 1/1000th the original.
My choice of 100 iterations going to the @stringBuilder variable, then flushing it to the @bigString came from some simple experimentation. I started out at 10 iterations, then 100, and then tried 1000 which worked well. After that I increased by a factor of 10 to 10,000 and it was much slower on my SQL Server 2019 instance. There may be some other number in there between 100, and 10,000 that works better, but 1000 seems to work just fine from the performance perspective.
The TSQL code looks like this:
SET NOCOUNT ON;
DECLARE @textToAdd VARCHAR(max) = 'We the People of the United States, in Order to form a more ...'; -- perfect Union, establish Justice, insure domestic Tranquility, provide for the common defense, promote the general Welfare, and secure the Blessings of Liberty to ourselves and our Posterity, do ordain and establish this Constitution for the United States of America.';
DECLARE @startTime DATETIME2 = GETDATE();
DECLARE @endTime DATETIME2;
DECLARE @diff INTEGER = 0;
DECLARE @bigString VARCHAR(max) = '';
DECLARE @loop INTEGER = 0;
DECLARE @highestTestPass INTEGER = 500000;
DECLARE @maxConcats INTEGER = 0;
DROP TABLE #results ;
CREATE TABLE #results
(
testType VARCHAR(100),
concats INTEGER,
durationMiliseconds INTEGER
);
DECLARE @stringBuilder as VARCHAR(MAX) = ''; -- declare the variable
-- testing + for concatenation
SET @maxConcats = 0;
WHILE @maxConcats < @highestTestPass
BEGIN
SET @maxConcats = @maxConcats + 10000;
SET @bigString = '';
SET @loop = 0;
WHILE @loop < @maxConcats
BEGIN
SET @loop = @loop + 1;
SET @stringBuilder = @stringBuilder + @textToAdd;
IF @loop % 1000 = 0
BEGIN
SET @bigString = @bigString + @stringBuilder;
SET @stringBuilder = '';
END
END
SET @bigString = @bigString + @stringBuilder;
SET @endTime = GETDATE();
SET @diff = datediff(MILLISECOND, @startTime, @endTime);
print LEN(@bigString);
print concat('Difference ', @diff, ' for ', @loop, ' + concatenations');
INSERT INTO #results VALUES ('+', @loop, @diff);
END
SELECT * , durationMiliseconds * 1.0 / concats as milisecondsPerConcat
FROM #results;
The results really changed here in that the average time for each concatenation with the + operator stayed at less than a millisecond. Also note that yesterdays test only went to 50,000 concatenations, this test went 10 times that to 500,000 concatenations.
Since the performance issues are tied to the length of the string being built, we can get significantly faster performance using the string builder concept of a shorter VARCHAR(MAX) that gets added to then cleared out over time.
I first used this method on a project several years ago using SQL Server 2008, and although I did not run this same test against SQL Server 2008 today, I did run it against SQL Server 2012, SQL Server 2014, SQL Server 2016, SQL Server 2017 and SQL Server 2019 today, and I know for certain that the same concepts apply from 2008 to 2019.
More from Stedman Solutions:
Steve and the team at Stedman Solutions are here for all your SQL Server needs.
Contact us today for your free 30 minute consultation..
We are ready to help!