Partitioning in SQL 2005 - on RANGE LEFT and RIGHT

This time, an article in English, since this is an answer to Kimberly Tripp's blog post on left and right partition functions here. In this excellent article, you also find the basics on partition functions and partition schemes.

So, I want to show why RANGE RIGHT partition functions are good in some scenarios. First, why don't I like RANGE LEFT partition functions? It's mostly a matter of style: When you do a LEFT based partition function on dates, you need to leverage knowledge on internal SQL Server storage of dates, especially the fact that time resolution is 3 milliseconds in SQL Server. Kimberly herself demonstrates this by using DATEADD(ms,-3,'20010101') in the partition function in this article. I like the clearer and more descriptive syntax of

CREATE PARTITION FUNCTION OrderDateRangePFN(datetime)
AS
RANGE RIGHT FOR VALUES ('20001001',
'20010101',
'20010401',
'20010701')

way better than

CREATE PARTITION FUNCTION OrderDateRangePFN(datetime)
AS
RANGE LEFT FOR VALUES ('20000930 23:59:59.997',
'20001231 23:59:59.997',
'20010331 23:59:59.997',
'20010630 23:59:59.997')

and it should work even if DATE storage resolution is changed later.

But Kimberly's point was that in a sliding window scenario, data need to be moved with a RANGE RIGHT partition function but not with a RANGE LEFT partition function. Here is how you can avoid this:

The trick is an empty first partition. Note that a RANGE LEFT scenario also requires an empty partition, but it must be the last partition. See here.

In this example, I want to partition by year, and I currently have 3 years worth of data: 2001, 2002, and 2003. So I create a RANGE RIGHT partition function like this:

CREATE PARTITION FUNCTION PF_NachJahren(DATETIME)
AS RANGE RIGHT
FOR VALUES
('2001-01-01', '2002-01-01', '2003-01-01')

and the partition scheme

CREATE PARTITION SCHEME PS_NachJahren
AS PARTITION PF_NachJahren to
( [PRIMARY], -- never used, always empty!
PartFilegroup2001, -- filegroup for < 2002
PartFilegroup2002, -- filegroup for 2002
PartFilegroup2003 -- filegroup for >= 2003
)

I put a partitioned table on it:

CREATE TABLE UmsatzPartitioniert(
OrderId int ,
Datum datetime DEFAULT (getdate()),
Umsatz money NULL
) ON PS_NachJahren(Datum)

To guarantee that there are no data in the first partition, I put a check constraint on the table:

ALTER TABLE UmsatzPartitioniert
ADD CONSTRAINT DatumChk
CHECK (Datum >= '2001-01-01')

You should also CHECK that there are no data with date values after end of 2003!

To put some data in the table you can select it from AdventureWorks:

INSERT INTO UmsatzPartitioniert
(OrderID, Datum, Umsatz)
SELECT SalesOrderID, OrderDate, TotalDue
FROM AdventureWorks.Sales.SalesOrderHeader
WHERE OrderDate < '2004-01-01'

Now for the sliding window scenario:
At a later point in time, data from 2001 should be removed and data from 2004 should be added.
How is this done without moving any data? First, we need a table to use for the 2001 data, so we create an archive table:

CREATE TABLE UmsatzArchiv2001(
OrderId int ,
Datum datetime DEFAULT (getdate()),
Umsatz money NULL
) on PartFilegroup2001

Then we switch the 2001 data from the partitioned table to the archive table:

ALTER TABLE UmsatzPartitioniert SWITCH PARTITION 2 TO UmsatzArchiv2001

This removes the 2001 data from UmsatzPartitioniert and adds it to UmsatzArchiv2001. It is just a pointer change, so no data is moved. Note that it is partition 2 that is switched, since partition 1 is the empty partition for data < 1.1.2001.

Now I have no 2001 data in UmsatzPartitioniert any more, so I can remove the 2001 range from my partition function. Since both the first and the second partition are currently empty, no data is moved:

ALTER PARTITION FUNCTION PF_NachJahren() MERGE RANGE ('2001-01-01')

Then the constraints need to be changed so no 2001 data can be inserted any more:

ALTER TABLE UmsatzPartitioniert
DROP CONSTRAINT DatumChk
GO
ALTER TABLE UmsatzPartitioniert
ADD CONSTRAINT DatumChk
CHECK (Datum >= '2002-01-01')

You may then back up the archive table, but for simplicity of the scenario, I just drop it and remove the file & filegroup the 2001 data resided before:

DROP TABLE UmsatzArchiv2001
ALTER DATABASE PartitioningBeispiel REMOVE FILE FileAufPartFilegroup1
ALTER DATABASE PartitioningBeispiel REMOVE FILEGROUP PartFilegroup1

That was the merge scenario - no data moved!
Now for the SPLIT. We want to add 2004 data to the table. First, we need a file and a filegroup to hold the data:

ALTER DATABASE PartitioningBeispiel
ADD FILEGROUP PartFilegroup2004

ALTER DATABASE PartitioningBeispiel
ADD FILE
( NAME = N'FileAufPartFilegroup2004', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\FileAufPartFilegroup2004.ndf' , SIZE = 3072KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
TO FILEGROUP PartFilegroup2004

Now we must tell the partitioning scheme that the new filegroup is ready for use:

ALTER PARTITION SCHEME PS_NachJahren NEXT USED PartFilegroup2004

Then the partition function can be changed to add the new 2004 range:

ALTER PARTITION FUNCTION PF_NachJahren() SPLIT RANGE ('2004-01-01')

On a split, the new partition (residing on the NEXT USED PartFilegroup2004) is always the one that contains the boundary value. Since this is a RANGE RIGHT partition function, the boundary value 2004-01-01 is the lower boundary of the right partition. This means, the 2003 partition is untouched by the split - no data is moved (if there are only data up to 2003 before the split).

As a result: You can use a RANGE RIGHT sliding window scenario without moving data on SPLIT and MERGE. The cost is the same as with a RANGE LEFT scenario: a partition that is always empty. Only it needs to be the first on RANGE RIGHT while it needs to be the last on RANGE LEFT.

Demo script on request (just leave a comment).

Regards,
Steffen