Insufficient data from Andrew Fryer

The place where I page to when my brain is full up of stuff about the Microsoft platform

SQL Server Advent Calendar 6 – Hierarchy Data Type

Day 6 of my virtual advent calendar, about stuff I like in SQL Server 2008..

Using a relational database to store structures like the hierarchy of a company or the product catalogue usually involves one of two techniques:

  1. Have a table for each level in the hierarchy with a foreign primary key relationship to represent the parent child relationship.  This works fine where the depth (number of levels) in the hierarchy is known.
  2. Have a self join in the table e.g. a member of staff would have the id of their manager in a manager id column which refer to the staff id of the manager in the same table.  This means that the number of levels in the hierarchy is not predetermined and so the schema doesn’t have to be changed if the business introduces new levels.

The downside of this self join approach is that is slow to navigate and awkward to report on.  So In SQL Server 2008 there is a new HierarchyID data type. This stores the structure of the hierarchy in a system defined user defined type (UDT) like there is in SQL Server for spatial data.  and like spatial data there are three things you get:

  1. the data type itself
  2. functions to make it work e.g. tostring(), reparent(), GetParent(), GetDescendant() etc.
  3. indexing to make it fast.

Of course this is one new feature which will require development time, but it is a lot faster and intuitive then self joins while having the same advantage of not requiring schema changes to reflect changes in the number of levels you need.

If you want to try it then I have put a simple example in this separate article on my blog.