Database Programming: Why I, Like, May Never Write Another LIKE, Like, Ever Again

[UPDATE 30 October 2007: There are significant ambiguities in this post which are addressed in the comments and in this follow-up, which I recommend reading after you've read this post and its comments. -wp]

I've been doing some performance tuning work over the last couple of days, and I've found a new use for a technique first shown here.

One of the primary functions of our product is to examine program code for certain patterns and practices. This is accomplished through pattern-matching queries against system tables, which take the general form:

SELECT data

FROM mytable

WHERE column LIKE N'%value%'

This is among the most common performance issues we face. Full-Text Search is a wonderful technology for addressing this issue; it unleashes the CONTAINS syntax, which is a powerful and wonderful thing. Sometimes, though, Full-Text Search isn't an option. In those scenarios, here's some equivalent syntax which provides a 2.5x performance improvement for every LIKE it replaces:

SELECT data

FROM mytable

WHERE patindex('%value%',column COLLATE Latin1_General_BIN) > 0

If you just use the PATINDEX without the binary collation, you get performance equivalent to the LIKE. Adding the binary collation turns the trick and unlocks the performance boost.

So, when would I ever code a LIKE in light of this discovery? Well, the only thing I haven't figured out how to do with PATINDEX is escape a reserved character, which, as you'l recall from this post, is among the LIKE command's capabilities.

I hope this tip comes in handy for you..

-wp