SQL Server 2008 Fragmentation

Here are some notes on "SQL Server 2008 Fragmentation" I took while attending an advanced class on SQL Server taught by Paul Randal https://sqlskills.com/AboutPaulSRandal.asp).

Please note that, although these notes were taken during the class, I might have added some of my own (mis)interpretation :-). Always check your facts on Books Online (I try to provide links when applicable). As with anything else you get from a blog, never use any of this in production before you thoroughly validate it a test environment. Please keep in mind that some of those will be hard to follow without the right pre-requisite knowledge and the right context. Reading the post from top to bottom will help.


Types of searches

  • Full scans
  • Binary searches
  • Linear interpolation - fixed record size, monotonely increasing key

Singleton Lookup

  • Index Seek / Clustered Index Seek
  • Find record using index pages

Allocation Order Scan

  • Table Scan or unordered Clustered/Index Scan
  • Pages splits during allocation order scans
  • Table lock helps (TABLOCK, TABLOCKX) but not required

Advanced scanning

Range Scan

  • Index Scan / Clustered Index Scan
  • Find first, then follw next page pointer

Readahead

  • Pre-reading of pages in the leaf level during Range Scans
  • 1, 8 or 32 pages (8KB, 64KB or 256KB) - only if pages are contiguous
  • 128 pages (1024KB) only in Enterprise edition
  • Logical fragmentation prevents optimal readahead
  • See https://msdn.microsoft.com/en-us/library/ms191475.aspx

Logical Fragmentation

  • Next page pointer does not point to next physical page
  • DBCC SHOWCONTIG is deprecated now
  • Look at avg_fragmentation_in_percent in sys.dm_db_index_physical_stats
  • Not relevant if you have a small number of pages (less than 100?)
  • Not relevant if all pages are already in the buffer pool (in cache)
  • Logical IOs are not necessarily physical IOs

When to do something about it

  • When to rebuild/reorganize? It always depends...
  • Different for data warehouse or OLTP, for instance
  • If you must have a number: 5-30% - ALTER INDEX REORGANIZE
  • If you must have a number: >30% - ALTER INDEX REBUILD WITH (ONLINE = ON)*
  • See https://msdn.microsoft.com/en-us/library/ms189858.aspx
  • Many just rebuild everything every week (could be totally unnecessary)

Extent Fragmentation

  • Extents are not contiguous
  • Affects readahead, but not much (can read 8 but not 32 pages)
  • In a heap, logical fragmentation is actually extent fragmentation
  • With multiple files, round robin does not necessarily mean fragmentation
  • Interesting KB: https://support.microsoft.com/kb/329526

Page Density

  • How much space is actually in use in the page
  • Affected by row size (5K fixed-length row leaves 3K unused in every page)
  • Affected by page splits
  • Affected by FILLFACTOR
  • In the end, low page density leads to more IOs
  • This is actually a kind of fragmentation

Page Splits

  • Page is full and you need to add something (INSERT, UPDATE with larger data)
  • Take half the rows to a new page, fix up all the pointers in both
  • Could lead to page splits to non-leaf levels, all the way up to the root
  • All fully logged, not matter what log level you are in
  • Typically creates two low-density pages and are not contiguous
  • Does not happen with a heap, only with a clustered index
  • Does not happen with a key that is ever increasing (like identity)

Things that cause fragmentation

  • GUID as high order key (NEWSEQUENTIALID can help)
  • Variable lenght colums
  • Misconfigured FILLFACTOR
  • Wide rows (with 5K rows being the typical example)
  • Clustered indexes that cause random insertion order (sales with customer,date)

FILLFACTOR

  • Leave space on pages to avoid page splits in leaf level
  • Applies only on rebuild, not when doing INSERT/UPDATE
  • PAD_INDEX will do the same thing for non-leaf level
  • For data warehouse, no need for FILLFACTOR
  • For OLTP, it's hard to calculate the ideal factor, no quick answer
  • Look at how many splits, how frequently you can rebuild
  • Pick a value, watch how it goes. Then change factor or rebuild frequency
  • If you must choose a number, try 70%

Symptoms of at Fragmentation

  • Longer run times
  • More disk activity (SET STATISTICS IO ON)
  • More logging activity

Looking at Fragmentation

Querying sys.dm_db_index_physical_stats

  • It could be expensive, since it bring lots of pages into the buffer pool
  • DETAILED (reads everything)
  • SAMPLED (reads 1% of the pages if less than 10,000 pages)
  • LIMITED (parent level of b-tree, same as DBCC SHOWCONTIG WITH FAST)
  • Limited can't tell you page density, since it does not read the page level

Fixing fragmentation

  • Three main choices
  • Rebuild - ALTER INDEX ... REBUILD (replaces DBCC REINDEX)
  • Recreate - CREATE INDEX WITH DROP_EXISTING
  • Reorganize - ALTER INDEX ... REORGANIZE (replaces DBCC INDEXDEFRAG)
  • DROP, CREATE (separately) not good due to possible issue with constraints
  • You can also decide not to do anything at all
  • You don't necessarily have to rebuild everything every day...

Rebuild

  • Can be done online (unless you have LOB columns)
  • Can use multiple CPUs (control with MAXDOP)
  • Works single partition or all partitions
  • Rebuilds index statistics (equivalent of a full scan)
  • Does not update column statistics
  • Can be minimally logged
  • Atomic. If you interrupt, it rolls back.
  • Online - short-hold Shared lock, short-hold SCHema_Mod lock
  • Offline clustered - eXclusive lock
  • Offline non-clustered - Shared lock
  • Creates new before dropping old. Reads from the old index.
  • Always rebuilds everything. Faster for largely fragmented indexes.

Online Index Rebuild

  • 1) Preparation - New index is created and set to write-only
  • very short Shared lock
  • 2) Build - Data is inserted from source.
  • Two copies of the data being updated
  • Scans know they shouldn't use the other - "anti-matter" record
  • 3) Final - Index metadata is updated.
  • short-hold SCHema_Mod lock
  • Old data is dropped
  • Details at https://msdn.microsoft.com/en-us/library/ms191261.aspx

Recreate

  • Basically same as rebuild
  • Can move to a new location
  • Can change the schema
  • Good way to do a shrink without using shrink

Reorganize

  • Is always online (even if you have LOB columns)
  • Always single threaded
  • Works single partition or all partitions
  • Table IX lock - locks only TABLOCKs and escalated locks
  • Addresses only existing fragmentation
  • Faster for lightly fragemented indexes
  • Does not use much extra space

Reorganize Phase 1 - Page compaction

  • Moves data from neighbor pages to get closer to FILLFACTOR
  • Leaf level only. Works on a sliding window (a few pages at a time)
  • Tries eXclusive lock on the pages involved. If can't, moves on.
  • If it can empty pages, it will deallocated them.
  • It will also remove ghost records.

Reorganize Phase 2 - Page defragment

  • Make logical order the same as the allocation order
  • Leaf level only. Shuffle pages arounds without using extra space.
  • Uses one extra free page (page type 19 - unlinked reorg page)
  • Locks enough pages to get the operation done without blocking others.
  • Reminded me of the "towers of hanoi" problem :-)

SQL Server Best Practices Article - Predeployment I/O Best Practices
https://technet.microsoft.com/en-us/library/cc966412.aspx