CASE To The Rescue Again: Order Of Operations In SQL Server

You might remember way back when, when we discussed the fact that sometimes we inherit systems -- and the design decisions which spawn them -- which don't quite match how we'd approach things. A recent exchange on one of the Microsoft SQL Server discussion lists highlights this wisdom, and also offers up a nifty programming insight.

David wrote:

This works great:

SELECT [BinaryName]
, [SimpleVersion]
, CONVERT(float, [MetricValue]) as ChurnValue
FROM [BRATDB].[dbo].[BinaryBuildMetricView]
WHERE MetricName='ChurnDensity'
ORDER BY SimpleVersion, ChurnValue DESC

But I’m only look for ‘ChurnDensity’ records on a specific BinaryName. When I change the WHERE line to:

WHERE MetricName='ChurnDensity' AND BinaryName='wininet.dll'

I get the following error:

Msg 8114, Level 16, State 5, Line 1
Error converting data type nvarchar to float.

It seems like the query optimizer is getting all the MetricValues for a BinaryName and trying to CONVERT all of them instead of doing a query to narrow down the MetricValue column first to only contain ‘ChurnDensity’ values (which will cause the CONVERT to succeed) and then filtering that list by BinaryName.

When I asked David for more insight into the contents of the [MetricValue] column, he wrote:

If MetricName=’ChurnDensity’, its matching MetricValue will always be CONVERT-able to a float (I enforce that outside the database, plus I’ve manually verified that every ChurnDensity’s value can be converted in my current data set).

Now, if MetricName <> ‘ChurnDensity’, chances are, it’ll be some string that can’t be converted…which is why I want to avoid selecting those records before the CONVERT happens.

I know this begs the question why values of different types are being shoved into MetricValue(nvarchar)…but let’s just let sleeping dogs lie on that…

So, poor David is stuck with a design I hope no reader of this blog would countenance -- what an old boss of mine used to derisively call a "multi-use" column. This column is so multi-use that the underlying type of the column's contents is different, depending on the contents of another column.

Umachandar replied to David's query with a wonderfully complete discussion of his solution:

You cannot guarantee that the CONVERT happens last. It depends on the query plan. There are cases in SQL2000 also where a query that does these types of operations will work some time. The only way to change order in which expressions are evaluated is to use a CASE expression. So in your example, you should modify the SELECT list like:

,CASE MetricName WHEN 'ChurnDensity' THEN CONVERT(float, [MetricValue]) END as ChurnValue

This will ensure that the expression if evaluated for rows that doesn’t have MetricName = ‘ChurnDensity’ does produce run-time error.

Note that even though the ANSI SQL standard describes how various parts of the SELECT statement are evaluated, lot of it is left to implementation on how to produce the results. SQL Server 2005 is even more aggressive in this area to better match computed column expression for example. See the “Behavior Changes to Database Engine Features” topic in SQL Server 2005 BOL ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/instsql9/html/65eaafa1-9e06-4264-b547-cbee8013c995.htm.

I can't add much to this except to thank both David and Umachandar for consenting to my blogging of their exchange. Hopefully this little trick will come in handy if you find yourself inheriting a multi-use column.

You'd never design one of those yourself, would you? :-)