Architectural Overview of SQL Server 2014’s In-Memory OLTP Technology

July 22, 2013

At the SQL PASS conference last November, we announced the In-memory OLTP database technology (code-named “Hekaton”) for a future SQL Server release. Microsoft’s technical fellow Dave Campbell’s blog provided a broad overview of the motivation and design principles behind the technology and discussed four architectural principles: (1) optimize for main memory data access, (2) Accelerate business logic processing, (3) Provide frictionless scale-up, (4) Built-in to SQL Server. This post will focus on the components behind the first three principles – both the motivation for the new components and how they achieve efficiency and scaling integral to the first three principles.

The new In-Memory OLTP architecture achieves much of its performance by executing stored procedures more efficiently. This focus on efficiency is critical as individual CPU cores are not getting significantly faster and there is not much opportunity for multi-core parallelism within OLTP stored procedures that access relatively few rows. The inescapable conclusion, supported by research and SQL’s performance engineering team, is that to achieve dramatic improvements in performance the database system code must get more dramatically more efficient.

But, what areas of the database engine need to be more efficient? To answer that question we have to understand where time is spent in the database system. The component stack diagram on the left shows, very roughly, the percentage of time spent in various components in an OLTP workload running on previous versions of SQL Server. The green boxes correspond to what is frequently called the relational engine while the blue boxes correspond to the storage engine.

The first observation from this diagram is that speeding up any one area will not lead to factors better performance. Second, even building an infinitely fast main-memory engine (effectively a streamlined storage engine) will at most get you twice the efficiency, since there is significant time spent in query and stored procedure processing. Adding to the challenge is the fact that SQL Server has been continuously optimized with each successive release.

Rather than change the existing SQL Server components that are already highly optimized for what they do best, the approach we too was to focus on new streamlined components that optimize for OLTP workload patterns and the reality that the hottest, most performance critical, data always reside in memory. Then, incorporate these components directly into SQL Server.

This focus on efficiency leads aligns directly to two key architectural components: a memory-optimized engine and a compiler that converts stored procedures and queries into machine code running against the memory-optimized engine.

Memory-Optimized Engine

The memory optimized engine manages all aspects of indexing, transactions, checkpoint (i.e. storage) and recovery for tables denoted as memory_optimized and as such it takes the place of core pieces of SQL Server’s storage engine for these tables.

In Dave Campbell’s blog post he pointed out that the engine achieves much of its performance gains by optimizing around its data always residing in memory. More generally, one might say that it achieves its performance by taking an “optimistic” approach and streamlining and optimizing for common cases and eliminating rare situations or just accepting that rare situations will not be as fast. Assuming that data resides in memory is indeed optimistic, but it’s also the reality for high-performance scenarios. For example, even in 2008 Facebook was deploying 28 terabytes of main-memory to accelerate databases.

This optimistic approach shows up in more subtle ways. For example SQL Server’s storage engine generates a transaction log record for every update to any table or index. This log record contains information not only for redoing the operation but also for undoing it and the log record must be placed in a log buffer immediately (a highly contentious operation) – this is all necessary because the storage engine is operating around the pessimistic assumption that the buffer page containing the recently updated record may be checkpointed or forced to disk before the updating transaction has committed. In reality, this is very unlikely for typical OLTP transactions and buffer pools of reasonable size. Instead, the engine takes an optimistic approach and simply doesn’t allow uncommitted changes to be checkpointed or flushed to disk and therefore doesn’t need to log undo information for these changes nor push them into log buffers aggressively. Instead, for short transactions, it generates a single log record describing the transaction when the transaction commits.

One of the most significant areas of optimism in in the engine architecture is based around optimistic concurrency control where the engine is optimized around transactions that don’t conflict. Multi-versioning eliminates many potential conflicts between transactions and the rest are handled by rolling back one of the conflicting transactions rather than blocking.

This avoidance of blocking is central to the engine design and goes beyond transaction concurrency. The proliferation of processor cores requires that database engines be highly tuned for running large numbers of concurrent transactions. While SQL Server scales very well for many OLTP workloads, there are cases, such as contention on the tail of an index, where some applications encounter scaling limitations. The engine design eliminates such contention points by avoiding any latches or spinlocks in any performance critical areas and uses multi-version optimistic concurrency control to eliminate the contention and overhead of locking. The combination of optimistic concurrency control, multi-versioning and latch-free data structures results in a system where threads execute without stalling or waiting. This is critical because the blocking implies context switches which are very expensive relative to the efficiency that engine operates at. In fact in the earlier diagram, the lower gray box partially accounts for this context switch time as “thread management” overhead.

Stored Procedure Compiler

As noted earlier, a fast engine deals with less than half the cost of executing an OLTP transaction. The cost of interpreting T-SQL stored procedures and their query plans is also quite significant. For an OLTP workload the primary goal is to support efficient execution of compile-once-and-execute-many-times workloads as opposed to optimizing the execution of ad hoc queries. This is where the stored procedure compiler comes into play. It transforms T-SQL stored procedures (marked as native_compiled) including their queries into highly customized native code. The compiler reuses much of the SQL Server T-SQL language stack including the metadata, parser, name resolution, type derivation, and query optimizer. This tight integration helps achieve syntactic and semantic equivalence with the existing SQL Server T-SQL language. The output of the stored procedure compiler is C code and the compiler leverages Microsoft’s Visual C/C++ compiler to convert the C code into machine code.

The efficiency of the resulting code is the result of many optimizations. Most of these are related to having specialized code generated rather than executing a general purpose interpreter. In other words the system pushes work to compilation time in order to further optimize execution time. Other optimizations are also done. For example, the generated code can take advantage of properties of the memory-optimized engine such as the fact that row versions are immutable. Because of this the generated code is not obligated to make copies of rows and instead can use direct pointers to the live database rows in memory.

Conclusion

The end result of the main_memory engine and stored procedure compiler is stored procedures and queries that execute with 10x-40x fewer instructions and processor cycles compared to regular SQL Server running on in-memory data. There is other computation in the system that is outside the scope of the generated code and the engine, such as client server communication and transaction log processing, that contributed to execution time and thus any particular workload will see less of a speedup. We intend to address these areas in future releases.

Of course fast and scalable components alone don’t make a database product – customers expect security, high-availability, administration tools, rich programming languages, and more. The in-memory OLTP system achieves this by integrating its highly optimized components into SQL Server in a way that leverages what customers know and expect from SQL Server and yet still unleashes the components raw performance. That’s a topic for another post.

For a more in-depth understanding of the components discussed here and their integration with SQL Server, see our recent SIGMOD 2013 article, download SQL Server 2014 CTP1, or see more blogs in the series introduction and index here!