SQLCLR: string or SqlString?

One of the most asked question regarding the SQL CRL is whether to use string instead of System.Data.SqlString (see [MSDN] SqlString Structure for further details). You might guess that, from a performance point of view, using string is better. Let’s confirm (or reject) this assumption. First we need to create two very simple SQLCLR functions:

 [SqlFunction(
    DataAccess=DataAccessKind.None,
    SystemDataAccess=SystemDataAccessKind.None,
    IsDeterministic=true,
    IsPrecise=true)]
public static string UseString(string input)
{
    return input;
}

[SqlFunction(
    DataAccess = DataAccessKind.None,
    SystemDataAccess = SystemDataAccessKind.None,
    IsDeterministic = true,
    IsPrecise = true)]
public static SqlString UseSqlString(SqlString sqlinput)
{
    return sqlinput;
}

As you can see we just return the parameter received without even touching it. Note, however, the different types between the two functions.

In order to map there functions in our SQL Server instance we can use this code:

 SET NOCOUNT ON;
GO

USE [master];
GO

CREATE DATABASE DemoCLR;
GO
ALTER DATABASE DemoCLR SET RECOVERY SIMPLE;
GO
USE DemoCLR;
GO

CREATE SCHEMA [CLR];
GO


CREATE ASSEMBLY [ClassLibrary1] FROM 'c:\yourpath\bin\Debug\ClassLibrary1.dll';
GO

CREATE FUNCTION [CLR].UseString(@input NVARCHAR(255))
RETURNS  NVARCHAR(255)
AS EXTERNAL NAME [ClassLibrary1].[ClassLibrary1.Class1].UseString;
GO

CREATE FUNCTION [CLR].UseSqlString(@input NVARCHAR(255))
RETURNS  NVARCHAR(255)
AS EXTERNAL NAME [ClassLibrary1].[ClassLibrary1.Class1].UseSqlString;
GO

This T-SQL batch will create a test DB and load the assembly in it.

Now, in order to test them we should create a dummy table:

 CREATE TABLE tbl(ID INT IDENTITY PRIMARY KEY CLUSTERED, Txt NVARCHAR(255));
GO

INSERT INTO tbl(Txt) VALUES(CONVERT(NVARCHAR, GETDATE()));
GO 1000

INSERT INTO tbl(Txt) SELECT Txt FROM tbl;
GO 10

This script will load a million rows in our table called tbl. Now the test code itself:

 USE DemoCLR;
GO

DECLARE @dt DATETIME
DECLARE @tbl TABLE(Txt NVARCHAR(255));
SET @dt = GETDATE()
INSERT INTO @tbl(Txt)
SELECT [CLR].UseString(Txt) FROM tbl;
PRINT N'String ' +  CONVERT(NVARCHAR(255), DATEDIFF(MS, @dt, GETDATE())) + N' ms.';
GO

DECLARE @dt DATETIME
DECLARE @tbl TABLE(Txt NVARCHAR(255));
SET @dt = GETDATE()
INSERT INTO @tbl(Txt)
SELECT [CLR].UseSqlString(Txt) FROM tbl;
PRINT N'SqlString ' +  CONVERT(NVARCHAR(255), DATEDIFF(MS, @dt, GETDATE())) + N' ms.';
GO

These two batches will call the function for each row. The results, in my case, are like these:

 

 

Notice the higher time for the SqlString function, as expected. Notice, however, that the time difference is fairly small. Given that – even slight – time advantage you may not want to use the string directly. But, in fact, what’s the purpose of SqlString anyway? The purpose is, as explained in the MSDN in the Remarks session:

String always uses the default CultureInfo from the current thread, but SqlString contains its own locale Id and compare flags. The comparison result of two SqlString structures is independent of operating system and current thread locale.

In our SQL Server terms it means that SqlString is aware of the collation setting of your database. That’s very important since we know that string comparison is always case sensitive but, in SQL Server, it depends of the collation settings.

In order to test that we add two more SQLCLR functions:

 [SqlFunction(
DataAccess = DataAccessKind.None,
SystemDataAccess = SystemDataAccessKind.None,
IsDeterministic = true,
IsPrecise = true)]
public static bool CompareString(string input1, string input2)
{
    return input1.Equals(input2);
}

[SqlFunction(
    DataAccess = DataAccessKind.None,
    SystemDataAccess = SystemDataAccessKind.None,
    IsDeterministic = true,
    IsPrecise = true)]
public static SqlBoolean CompareSqlString(SqlString input1, SqlString input2)
{
    return input1.Equals(input2);
}

In this case we just test for equality (question: will using the = operator yield the same result?).

Now use them to compare two strings: Test and TEST.

If we use a case sensitive collation:

 use [master]
GO
ALTER DATABASE DemoCLR
COLLATE SQL_Latin1_General_CP1_CS_AS ;
GO
USE DemoCLR;
GO

SELECT 
    collation_name, 
   [CLR].CompareString('Test', 'TEST') AS 'StringCompare',  
 [CLR].CompareSqlString('Test', 'TEST') AS 'SqlStringCompare',
 IIF('Test' = 'TEST', 1, 0) AS 'SQLCompare'
FROM sys.databases WHERE name = 'DemoCLR';
GO

The result is:

 

Both string and SqlString return false. But what about a case insensitive collation?

 use [master]
GO
ALTER DATABASE DemoCLR
COLLATE SQL_Latin1_General_CP1_CI_AS ;
GO
USE DemoCLR;
GO

SELECT 
    collation_name, 
   [CLR].CompareString('Test', 'TEST') AS 'StringCompare',  
 [CLR].CompareSqlString('Test', 'TEST') AS 'SqlStringCompare',
 IIF('Test' = 'TEST', 1, 0) AS 'SQLCompare'
FROM sys.databases WHERE name = 'DemoCLR';
GO

In this case the result is:

 

So in this case only SqlString comparison behaved as expected: using SqlString shielded our code from the collation complexities!

Happy coding,

Francesco Cogno