SQL–Partition SPLIT takes forever to create new empty partition

 

This is one of those recent crazy cases where I was working.

Issue Description

Customer had a table of 1.5 TB in size with multiple partition (yearly partitioning strategy). Even though Microsoft always recommends that the last partition should be empty, they could not follow the strategy and added data in the last partition and it was more than 100GB at the time of call.

When they further tried using SPLIT function to create a new empty Partition so that they can move the 2011 data to a new partition, it was taking lot of IO and it was completing after almost 20+ hours. They wanted to know what was actually happening in the Partition Split and how to reduce the long processing time.

Research Outcome

Well this was one of those requests where recommendations were not followed and when there is an issue, explanation are expected. I really did not have any other option except looking at some internals. Mainly I referred few articles written by:

Kimberly Tripp: https://msdn.microsoft.com/library/ms345146.aspx

Kalen Delaney: https://sqlblog.com/blogs/kalen_delaney/archive/2009/08/16/altering-a-partition-function.aspx

I even looked at all the documentations which could help but alas, there is not much which could be shared. So here all what I could find:

 

  • When you create Partition on one of your tables in SQL Server, primarily it keeps :
    • Partition ID
    • Partition Number
  • Partition SPLIT or Partition MERGE will not do anything with First or Last Partition but would change the Partition Number (Sequence Number) depending upon the activity. In simple words, SQL would create new Partition and move the pointers so that it becomes part of the table:

            image

  • This process seems to be simple when the last partition is empty and we are creating a new in range partition where data is not in the table. (Scenario of last partition empty) where what max is happening is update in Partition Sequence Number.
  • But when you have data in your last partition and you create new partition using partition SPLIT, it definitely follows the basic steps such as:
    • Creating a new empty Partition.
    • Update all the sequences numbers
    • But at the end, the most important activity is, moving the data what was earlier sitting in the last partition due to no range definition (if that is the intent of defining the partition) but now has to move in new partition based on your new range.
  • The above step is a time consuming process as physically we are moving the data pages from one location to another. And for sure, more the data, more the time it’s going to consume.

Summary

Overall to summarize, we commented on their approach. We mentioned that splitting the data partition is wrong and particularly with huge amount of data in the last partition, it is expected to give high IO usage and it would take longer time. We recommended them to go for Sliding Windows Approach with Partition Switch and follow archival strategy than keeping all the data in one table. Another alternate option what we suggested them to review is documented in Partitioning White Paper (Page 51).