SQL CLR Functions and parallelelism

If you develop your own SQLCLR functions, you know you should decorate the method with this attribute class: [MSDN] SqlFunctionAttribute Class. You might wonder what these options will do. The DataAccess Property, according to the documentation, “Indicates whether the function involves access to user data stored in the local instance of SQL Server”. You might be inclined to enable it regardless you actually need it. But what happens if you do? Let’s try it first and then explain what happens.

First, create a SQLCRL Assembly with these two methods:

 [SqlFunction
(IsDeterministic = true,
IsPrecise = true,
DataAccess = DataAccessKind.None,
SystemDataAccess = SystemDataAccessKind.None)]
public static string NoAccess(string str)
{
return str.ToUpper();
}

 [SqlFunction
(IsDeterministic = true,
IsPrecise = true,
DataAccess = DataAccessKind.Read,
SystemDataAccess = SystemDataAccessKind.None)]
public static string WithAccess(string str)
{
return str.ToUpper();
}

These functions are the same except for the DataAccess parameter. The NoAccess function sets it to None while the WithAccess one specifies Read. Now create a sample database and load the assembly in it:

 CREATE DATABASE DemoCLR;
GO

USE DemoCLR;
GO

CREATE ASSEMBLY [Sample] FROM 'YourDLL.dll'
WITH PERMISSION_SET=EXTERNAL_ACCESS;
GO

CREATE FUNCTION NoAccess (@txt NVARCHAR(MAX))
RETURNS NVARCHAR(MAX) EXTERNAL NAME [Sample].[Class].NoAccess;
GO
CREATE FUNCTION WithAccess (@txt NVARCHAR(MAX))
RETURNS NVARCHAR(MAX) EXTERNAL NAME [Sample].[Class].WithAccess;
GO

CREATE TABLE tbl(ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, SomeTxt NVARCHAR(255) NOT NULL);
GO

INSERT INTO tbl(SomeTxt) SELECT name FROM sys.objects;
GO

INSERT INTO tbl(SomeTxt) 
SELECT SomeTxt FROM tbl;
GO 14

Now we can use them as in this TSQL:

 SELECT SomeTxt FROM tbl
WHERE dbo.NoAccess(SomeTxt) LIKE '%pip%';

The execution plan is this:

Note how SQL choose to parallelize this query. It makes sense since we wrote a non-sargable predicate leading to a full scan.

Let’s look at the STATISTICS TIME too:

SQL Server Execution Times:

   CPU time = 27831 ms, elapsed time = 4357 ms.

Note the elapsed time compared to the CPU time. Since our function is CPU bound having more than one core working in parallel speeded up our query (remember, CPU time is core cumulative).

Now try to do execute the same query with the WithAccess function:

 SELECT SomeTxt FROM tbl
WHERE dbo.WithAccess(SomeTxt) LIKE '%pip%';

It will take a while longer and the execution plan will be:

This is interesting: no parallelism was chosen by SQL Server. It’s reflected in the STATISTICS TIME result too:

SQL Server Execution Times:

   CPU time = 15584 ms, elapsed time = 15639 ms.

Here the elapsed time is bigger than the CPU Time (and the elapsed time is bigger than before). The moral of the story here is: allowing a SQLCLR function to access data context prevents the SQL Server optimizer from picking a parallel operator. So be careful when decorating your functions :). You might wonder why SQL Server would not execute your function in parallel in this case. Think about this: when you access data in your SQLCLR function you’re expected to reuse the existing connection (called context connection). It’s very efficient; you can read more about it here: [MSDN] Context Connection. The context connection is always one (and, no, MARS isn’t supported). This means that SQL Server cannot call in parallel your function since only one instance at a given time can “own” the context connection.

Happy Coding,

Francesco Cogno

 

PS: Can you guess my sys.dm_os_schedulers configuration? ;)