Siemens: Using compression to improve performance and reduce costs

In prior posts, I have described the benefit of Read-Committed Snapshot Isolation (RCSI). RCSI will usually benefit most ISV databases. Another method may benefit many ISV databases (running on SQL Server 2008 and later):  compression. Compression is often ignored because of user experiences with the apparent slowness of zipping and unzipping files. There is an important difference with SQL Server compression; unlike unzipping a file, there is no need to locate a decompression program, then load it into memory, then pass it the file name, then allocate memory for processing. SQL Server has all of these items in memory always. There is no slow searching for and then reading physical files from the hard drive.

Compression has two interesting impacts on performance:

  • Less physical disk data is needed to read the data. Although your hard drive may be rotating at 15,000 RPM and have read cache memory, you may still be waiting for the spindle to rotate.
    • Less physical spindle tracks will be read (remember 360 tracks per second is the typical speed limit with hard drives).
  • The number of records stored in memory will be more, so less need to read from the hard drive.

A recent case study on Siemens Teamcenter found:

  • Database size was reduced by 50% with data compression compared to SQL Server 2005. This reduces the cost of storage and allows existing equipment to be kept in service longer.
  • Compression does increase CPU usage by just 1-2%
    • SQL Server 2008 reduced CPU by 10% compared to SQL Server 2005 .
  • 20% improvement in response time (compared to SQL Server 2005 without compression)

You may read this case study here, and more experiences with compression here.

One of the key issues to consider for compression is whether your SQL Server is CPU bound or I/O bound. If it is CPU bound, then compression may be questionable. If it is I/O bound, then I/O volume is reduced by consuming idle CPU capacity.

If you are using SQL Server 2008R2, then database compression is more effective. One user reported that 56% compression of a database on SQL Server 2008, has increased to 70% on SQL Server 2008 R2.

How big a speed boost from compression?

I contacted some of the customer engineering teams at Microsoft for real world data, and they sent this example from a customer (who happens to be local to me). As usual, results may vary based on the type of data in the table and the nature of the query.

Table Design Time to read one location (seconds) Space Used (MB) Rows

Standard table

95

7,703

  7,991,461

Compressed

12

767

  7,991,461

Making life simpler

Codeplex has just released a tool to optimize the use of compression, SQL Server Compression Estimator (SSCE), available here. This tool helps determine the optimal compression ratio for different objects in the database, reducing the time it takes to get compression working well. Just select the rows in the tool and it will enable compression for you.

Give compression and this tool a try on your test system after doing a benchmark and see how much impact it has. Do not be afraid to post your results as comments!