VARCHAR diff function

Last month I posted another CTE blog entry Using a CTE to Split a String Into Rows with Line Numbers.  Since then I have used it but, but realized that what I really needed was a diff tool to compare two strings similar to the command line diff tool.

First we take a look the SplitWithLineNumber from the previous article.  This will be used in the CTE diff tool

CREATE FUNCTION dbo.SplitWithLineNumber (@sep char(1), @s varchar(max))
RETURNS table
AS
RETURN (
WITH splitter_cte AS (
  SELECT CHARINDEX(@sep, @s) as pos,
         cast(0 as bigint) as lastPos,
         cast(0 as int) as LineNumber
  UNION ALL
 SELECT CHARINDEX(@sep, @s, pos + 1),
        cast(pos as bigint),
        LineNumber + 1 as LineNumber
  FROM splitter_cte
 WHERE pos > 0
)
SELECT LineNumber,
       SUBSTRING(@s, lastPos + 1,
         case when pos = 0 then 2147483647
         else pos - lastPos -1 end) as chunk
  FROM splitter_cte
);

Next we look at the new function which calls the original, and compares the results.

CREATE FUNCTION [dbo].[varcharDiff](@s1 varchar(max), @s2 varchar(max))
RETURNS table
AS
RETURN (
WITH FirstStringAsTable as
(
SELECT leftDiff.lineNumber,
       leftDiff.chunk AS leftChunk,
       rightDiff.chunk AS rightChunk
  FROM dbo.SplitWithLineNumber(char(10), @s1) AS leftDiff
  LEFT JOIN dbo.SplitWithLineNumber(char(10), @s2) AS rightDiff
    ON leftDiff.lineNumber = rightDiff.lineNumber,
)
SELECT *
  FROM FirstStringAsTable
 WHERE leftChunk <> rightChunk
)

In order to run it we need to include two varchar’s that are slightly diff so that we can run the diff tool against them.

declare @InputString as varchar(8000);
set @InputString = 'Anyone can perform a SQL Server upgrade, but the risk of failure is much
greater if you don’t plan it correctly. By the end of this chapter you will understand how
to plan for the things that could go wrong during the upgrade, and options you can take to
mitigate those risks.
This chapter will cover the details that you will need to know when upgrading a stand alone,
non High Availability (HA), SQL Server from one version to a newer version, with minimal
downtime. I use the term “minimal downtime” as the goal, as it is very expensive to get to the
point of zero downtime with a SQL Server upgrade without an HA solution. For the purpose of
this upgrade process we will be considering a single SQL Server being upgraded, although it
may end up on different hardware or it may be the same hardware.
Upgrading may involve upgrading just the hardware for better performance or more capacity and
not the SQL Server version. The specific upgrade path may depend on your business needs and
the resources or budget that you have available.
Most of the summaries and stories in this chapter are based on my experiences using SQL Server
over the last 21 years. I will refer to several examples in this chapter of things that went
wrong, and then at the end follow up with a case study of an upgrade that went very well.
Considering the things that can go wrong will help you think about ways to prevent them, or
reduce their likelihood of happening.'

declare @InputString2 as varchar(8000);
set @InputString2 = 'Anyone can perform a SQL Server upgrade, but the risk of failure is much
greater if you don’t plan it correctly. By the end of this chapter you will understand how
to plan for the things that could go wrong during the upgrade, and options you can take to
mitigate those risks.
This chapter will cover the details that you need to know when upgrading a stand alone,
non High Availability (HA), SQL Server from one version to a newer version, with minimal
downtime. I use the term “minimalistic downtime” as the goal, as it is very expensive to get to the
point of zero downtime with a SQL Server upgrade without an HA solution. For the purpose of
this upgrade process we will be considering a single SQL Server being upgraded, although it
may end up on different hardware or it may be the same hardware.
Upgrading may involve upgrading just the hardware for better performance or more capacity and
not the SQL Server version. The specific upgrade path may depend on your business needs and
the resources or budget that you have.
Most of the summaries and stories in this chapter are based on my experiences using SQL Server
over the last 21 years. I will refer to several examples in this chapter of things that went
wrong, and then at the end follow up with a case study of an upgrade that went very well.
Considering the things that can go wrong will help you think about ways to prevent them, or
reduce their likelihood of happening.'

And then call the new function.

SELECT *
  FROM [dbo].[varcharDiff](@InputString, @InputString2);

Which produces the following output.

CTE_DIFFWhere we can see that there are 3 lines different between the two varchar strings.  Lines 4, 6 and 12 are the lines that are different, and the left and right pieces are shown here.

Useful for doing Test Driven Development (TDD) to test the output of a function against what yoa are expecting it to return.

I hope this script helps you and that you find enjoy it.

Using a CTE to Split a String Into Rows with Line Numbers

Last year while working on my CTE presentation for a SQL Saturday I added a blog post called “Using a CTE to Split a String Into Rows“, and since that posting I have used it many times.   But as things go in development, eventually there is a need to do something more.

Doing some complex string building to create files lately I cam across the need to use a CTE to split strings into rows but to also include line numbers.  To accomplish that I first started with the same function that I used last year, but renamed it, and added a second output column called LineNumber, here is what it looked like:


CREATE FUNCTION dbo.SplitWithLineNumber (@sep char(1), @s varchar(max))
RETURNS table
AS
RETURN (
 WITH splitter_cte AS (
 SELECT CHARINDEX(@sep, @s) as pos,
        cast(0 as bigint) as lastPos,
        0 as LineNumber
 UNION ALL
 SELECT CHARINDEX(@sep, @s, pos + 1),
        cast(pos as bigint),
        LineNumber + 1 as LineNumber
 FROM splitter_cte
 WHERE pos > 0
 )
 SELECT LineNumber,
        SUBSTRING(@s,
                 lastPos + 1,
                 case when pos = 0 then 2147483647
                      else pos - lastPos -1 end) as chunk
 FROM splitter_cte
 );

The things that have changed since the last post “Using a CTE to Split a String Into Rows“, are the following:

  • I added another column to the output called Line Number.
  • Modified the input to be varchar(max).
  • Adjusted the case statement to accommodate varchar(max).
  • Cast the positions as bigint’s in order to accomodate varchar(max).

So now when you run this on the original input, you get the following:

SELECT *
 FROM dbo.SplitWithLineNumber (' ',
          'the quick brown dog jumped over the lazy fox')
OPTION(MAXRECURSION 0);

CTEtoSplitaStringIntoRowswithLineNumbers

Just a small change, but if you need to lave line numbers on the output, this will do it for you.


Using a CTE in a Function to Split Up a Database Connect String

I came across this example when preparing my CTE presentation for SQL Saturday in Portland OR in 2 weeks.

CREATE FUNCTION dbo.SplitConnectString(@s varchar(8000))
RETURNS table
AS
RETURN (
 WITH splitter_cte AS (
 SELECT CHARINDEX(';', @s) as pos, 0 as lastPos
 UNION ALL
 SELECT CHARINDEX(';', @s, pos + 1), pos
 FROM splitter_cte
 WHERE pos > 0
 ),
 pair_cte AS (
 SELECT chunk,
 CHARINDEX('=', chunk) as pos
 FROM (
 SELECT SUBSTRING(@s, lastPos + 1,
 case when pos = 0 then 80000
 else pos - lastPos -1 end) as chunk
 FROM splitter_cte) as t1
 )
 SELECT substring(chunk, 0, pos) as keyName,
 substring(chunk, pos+1, 8000) as keyValue
 FROM pair_cte
);
GO
declare @connectString varchar(2048)
set @connectString = 'server=myserver;user id=sa;password=asdfasdfasdasdffjfjfj';
SELECT *
 FROM dbo.SplitConnectString(@connectString)
OPTION(MAXRECURSION 0);

Which produces the following output.

Using a CTE in a Function to Split Up a Query String

The question came up as to how do I parse a query string using TSQL.  So here you go, using a common table expression, and basing this on a similar function to what I put together yesterday for split.

CREATE FUNCTION dbo.SplitQueryString (@s varchar(8000))
RETURNS table
AS
RETURN (
 WITH splitter_cte AS (
 SELECT CHARINDEX('&', @s) as pos, 0 as lastPos
 UNION ALL
 SELECT CHARINDEX('&', @s, pos + 1), pos
 FROM splitter_cte
 WHERE pos > 0
 ),
 pair_cte AS (
 SELECT chunk,
 CHARINDEX('=', chunk) as pos
 FROM (
 SELECT SUBSTRING(@s, lastPos + 1,
 case when pos = 0 then 80000
 else pos - lastPos -1 end) as chunk
 FROM splitter_cte) as t1
 )
 SELECT substring(chunk, 0, pos) as keyName,
 substring(chunk, pos+1, 8000) as keyValue
 FROM pair_cte
)
GO

declare @queryString varchar(2048)
set @queryString = 'foo=bar&temp=baz&key=value';
SELECT *
 FROM dbo.SplitQueryString(@queryString)
OPTION(MAXRECURSION 0);

which produces this output.

A quick and easy way to parse a query string  in TSQL using a CTE.

Using a CTE to Split a String Into Rows

One of the common questions that comes up in my CTE class/presentation asks can you use the CTE inside of a function or stored procedure. The answer is yes, but today I came across another question, how can I split a string (varchar) into rows. I did some searching but didn’t find quite what I was looking for. I did fine one CTE example, but I didn’t like the way it worked, so I decided to create my own Split function, using a recursive CTE to implement the split function.

Keep in mind that the OPTION(MAXRECURSION 0) does not go inside of the table valued function, be instead it goes in the query that calls the function.

GO
CREATE FUNCTION dbo.Split (@sep char(1), @s varchar(8000))
RETURNS table
AS
RETURN (
    WITH splitter_cte AS (
      SELECT CHARINDEX(@sep, @s) as pos, 0 as lastPos
      UNION ALL
      SELECT CHARINDEX(@sep, @s, pos + 1), pos
      FROM splitter_cte
      WHERE pos > 0
    )
    SELECT SUBSTRING(@s, lastPos + 1,
                     case when pos = 0 then 80000
                     else pos - lastPos -1 end) as chunk
    FROM splitter_cte
  )
GO

SELECT *
  FROM dbo.Split(' ', 'the quick brown dog jumped over the lazy fox')
OPTION(MAXRECURSION 0);

which produced the following output.

chunk
---------------------
the
quick
brown
dog
jumped
over
the
lazy
fox
(9 row(s) affected)

There are programming tasks that can be accomplished easily with a Recursive Common Table Expression.