Linchi Shea Makes an Interesting Point About Hints

I had a couple of extra minutes today and found this post on Linchi Shea’s blog; I wanted to commend to all of you who’ve taken in the programming methodology and set-based thinking discussions in this little corner.

While I disagree some with the point in the main post about views (JOINing views, to my mind, is imposing an additional layer of abstraction on our query and already detuned. JOINing views is not a purely set-based solution to a problem; performance problems in this scenario are legendary and completely expected), the really interesting nugget comes in the comment thread where Linchi says:

I don't know any set theory that has hints as its basic construct. When you give a hint, you are giving a 'procedural' directive on how to do it. Sure, this is more semantic than substantial. But that's how arguments go in this realm.

Wow! Take a minute to get your head around that.

In that a hint is instructing the optimizer on how to do a thing, if you tilt your head just right, I think it’s completely fair to look at a hint as procedural code. Before reading Linchi’s comment, this mindset had never occurred to me.

Adam Machanic chimed in as well, with his variation of a story we can all tell (I told my version here): the elegant, perhaps even poetic query that’s hinted to the hilt, only to perform better as a series of smaller queries.

It doesn’t “break” set-based thinking if we get better performance from three small statements than a single poetic one. Temporary tables don’t even break set-based thinking. All of these techniques are about alternative approaches to speaking lovingly to the optimizer while instantiating sets.

Procedural code breaks set-based thinking, and Linchi has expanded my horizons to include hints in this category, along with the old standards WHILE loops, CURSORs, etc.

The old line about cursors aside, none of these techniques are inherently evil. None of them are my first choice for solving a database programming problem, either. In the field, I think it’s fair to say that all are frequently employed when more efficient set-based alternatives are available. If we can speak to the optimizer through minimal abstractions (reduce procedural code + JOINS on views + views of views; can you think of other constructs to add to this list?), in general we’ll get more performant code.

Of course, there are always exceptions. That’s why the procedural syntax is in the product. The proof is in the performance of a particular application. There are no absolutes here, and there should be no religious arguments. You should be open-mindedly worshipping at the altar of performance. Only a fool would ignore best practices, because very few of us are traversing untrodden paths, and none of us should want to walk alone. At the same time, only a fool would leave an interesting new idea untested.

My thanks and compliments to Linchi for a thought-provoking post..

-wp