State Transition of Checkpoint Files in Databases with Memory-Optimized Tables

January 23, 2014

In the previous blog, Merge Operations, we described how a checkpoint file pair (CFP) transitions across various stages in its life cycle. In this blog, we will walk through an actual example to show the state transitions. Note that some of the states shown for CFPs are only visible in RTM bits (not in CTP2), but internally the state transitions are the same. Basing this example on RTM bits adds a lot more clarity about how storage is managed. Also note that RTM bits allocate CFPs of 16MB for the data file and 1MB for the delta file on machines with physical memory <= 16GB. This example was run on a machine with 8GB of RAM using pre-RTM bits.

CREATE DATABASE hkdb1 ON
PRIMARY (NAME = [hkdb1_hk_fs_data], FILENAME = 'C:\hekaton_test\data\hkdb1_data.mdf'),
FILEGROUP [hkdb1_hk_fs_fg] CONTAINS MEMORY_OPTIMIZED_DATA
(NAME = [hkdb1_hk_fs_dir], FILENAME = 'C:\hekaton_test\data\hkdb1_hk_fs_dir')
LOG ON (name = [hktest_log], Filename='C:\hekaton_test\data\hkdb1_log.ldf', size=100MB)
go

-- disable auto-merge so that we can show the merge
-- in a predictable way
dbcc traceon (9851, -1)

-- set the database to full recovery. This is a common practice for production scenarios
alter database hkdb1 set recovery full

use hkdb1
go

-- create a memory-optimized table with each row close to 8KB in size
CREATE TABLE dbo.t_memopt (
c1 int NOT NULL,
c2 char(40) NOT NULL,
c3 char(8000) NOT NULL,
CONSTRAINT [pk_t_memopt_c1] PRIMARY KEY NONCLUSTERED HASH (c1) WITH (BUCKET_COUNT = 100000)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)

-- do the database backup. We do this so that we can do transaction log backups
BACKUP DATABASE [hkdb1] TO DISK = N'C:\hekaton_test\data\hk-empty-data.bak'
WITH NOFORMAT, INIT, NAME = N'hkdb1-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10

-- load 8000 rows. This should fill five 16MB data files on a machine with <= 16GB of memory
declare @i int = 0
while (@i < 8000)
begin
insert t_memopt values (@i, 'a', replicate ('b', 8000))
set @i += 1;
end

In this example, we will show the output of DMV sys.dm_db_xtp_checkpoint_files using the following query.

select file_type_desc, state_desc, internal_storage_slot, file_size_in_bytes, file_size_used_in_bytes,
inserted_row_count, deleted_row_count, lower_bound_tsn, upper_bound_tsn, last_backup_page_count, drop_table_deleted_row_count
from sys.dm_db_xtp_checkpoint_files
order by container_id, file_type_desc, upper_bound_tsn

Here is the output after removing some columns and rows (showing only the rows for DATA files).

There are 8 data files in the PRECREATED state. This demo was run on a machine with 4 logical processors and 8GB of memory. The size of each file is 16MB (supported in SQL Server 2014 RTM on a machine with <= 16GB of memory).
There are 5 data files marked UNDER CONSTRUCTION because a checkpoint has not been taken yet, but together these files store the 8000 rows that we inserted. If the database is restarted, the data rows will be loaded from the transaction log and NOT from these files, as they are not part of a durable checkpoint.
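
The file counts described above can also be read off directly by grouping the DMV rows. The following is a minimal sketch that uses only columns already referenced in this post; the alias file_count is an illustrative name.

```sql
use hkdb1
go
-- Sketch: count checkpoint files per state and file type.
-- file_count is an illustrative alias, not a DMV column.
select state_desc, file_type_desc, count(*) as file_count
from sys.dm_db_xtp_checkpoint_files
group by state_desc, file_type_desc
order by state_desc, file_type_desc
```

Running this query after each step of the walkthrough gives a compact view of how many DATA and DELTA files sit in each state.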

Now, execute a manual checkpoint with the following command and observe the state transition in the CFPs. You will note that the UNDER CONSTRUCTION CFPs are now marked ACTIVE, as they are part of the durable checkpoint that we have just completed. The manual checkpoint also closed a data file (identified in the output by its internal_storage_slot) even though it was only 25% full. This data file can potentially be merged in the future, as it qualifies under the merge policy.

checkpoint

Now, we will delete 50% of the rows.

-- delete every other row (the even values of c1)
declare @i int = 0
while (@i < 8000)
begin
delete t_memopt where c1 = @i
set @i += 2;
end

checkpoint

Here is the output of the DMV for CFPs in the ACTIVE state, using the following query:

select file_type_desc, state_desc, internal_storage_slot, file_size_in_bytes, file_size_used_in_bytes,
inserted_row_count, deleted_row_count, lower_bound_tsn, upper_bound_tsn, last_backup_page_count, drop_table_deleted_row_count
from sys.dm_db_xtp_checkpoint_files
where state = 2
order by container_id, file_type_desc, upper_bound_tsn

Note that the delta files now hold references to the deleted rows. There is also a CFP with 0 data rows, which is a result of the manual checkpoint.
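
As a quick sanity check on the delete step, the row counts can be compared against the table itself. This is a sketch rather than part of the original walkthrough; the aliases are illustrative, and it assumes that inserted_row_count is reported on data files and deleted_row_count on delta files.

```sql
use hkdb1
go
-- 8000 rows (c1 = 0..7999) were inserted and the even values of c1 were deleted,
-- so this should return 4000.
select count(*) as live_rows from dbo.t_memopt

-- Totals across the ACTIVE checkpoint files. SUM ignores NULLs, so file types
-- that do not report a given counter do not affect the result.
select sum(inserted_row_count) as rows_in_data_files,
       sum(deleted_row_count) as rows_marked_deleted
from sys.dm_db_xtp_checkpoint_files
where state = 2
```

If the walkthrough was followed as written, the difference between the two sums would be expected to match live_rows.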

For this example, we had disabled auto-merge so that we can see the state transitions with no surprises.

We will now force a manual merge (merging all the ACTIVE CFPs) by executing the following command:

-- now do the manual merge
-- it generates merge-target and other files stay as regular files
exec sys.sp_xtp_merge_checkpoint_files 'hkdb1', 1877, 12004

The DMV shows that the merge was executed and that there is a new CFP marked as MERGE TARGET containing 4000 rows. Also note that the data file is around 33MB, much larger than 16MB. This happened because we forced the merge. The auto-merge would not have merged all the ACTIVE CFPs because they would not meet the merge policy.
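
The two numeric arguments passed to sys.sp_xtp_merge_checkpoint_files are transaction sequence number (TSN) bounds. One way to find a candidate range is to read lower_bound_tsn and upper_bound_tsn from the ACTIVE data files, as sketched below. The variable names and aliases are illustrative, and the exact values the procedure accepts may differ from what this query returns, so treat it as a starting point only.

```sql
use hkdb1
go
-- Sketch: derive a TSN range that spans every ACTIVE data file.
-- @lower and @upper are illustrative names.
declare @lower bigint, @upper bigint

select @lower = min(lower_bound_tsn), @upper = max(upper_bound_tsn)
from sys.dm_db_xtp_checkpoint_files
where state = 2 and file_type_desc = 'DATA'

select @lower as merge_lower_bound_tsn, @upper as merge_upper_bound_tsn

-- The values could then be passed to the merge procedure, for example:
-- exec sys.sp_xtp_merge_checkpoint_files 'hkdb1', @lower, @upper
```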

At this time, the CFPs that are sources of the merge are still marked as ACTIVE. This is because the merge has not been installed yet. You can query the state of the merge request as follows:

select request_state_desc, lower_bound_tsn, upper_bound_tsn
from sys.dm_db_xtp_merge_requests

request_state_desc   lower_bound_tsn   upper_bound_tsn
------------------   ---------------   ---------------
PENDING              0                 12007

Now, we will force a checkpoint to install the merge.

checkpoint

Here is the output of the DMVs. It shows that the MERGE TARGET has now changed to the ACTIVE state and that all the CFPs that were used as sources of the merge are now marked as MERGED SOURCE. Referring to the blog <reference>, these CFPs are now in category-3.

The source CFPs will transition into category-4 automatically as automatic checkpoints and log backups are taken. For this example, we will force the checkpoint and log backup as follows:

checkpoint

-- do the log backup.
BACKUP LOG [hkdb1] TO DISK = N'C:\hekaton_test\data\hk-log-1.bak'
WITH NOFORMAT, INIT, NAME = N'hk-log Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10

Here is the output of the DMV showing that the earlier CFPs have now transitioned to the state REQUIRED FOR BACKUP/HA, meaning that these CFPs can't be garbage collected because they are needed for operational correctness of the database with memory-optimized tables.

After a couple of manual checkpoints, each followed by a log backup, the DMV output shows that the CFPs are finally being handed off to the garbage collector for File Stream. At this stage, these CFPs are no longer needed for operational correctness of our database. To state it differently, a full backup of the database will NOT include these CFPs, but they continue to take storage space in the memory-optimized filegroup until they are garbage collected.

After one more cycle of manual checkpoint followed by log backup, the CFPs actually transition to the TOMBSTONE state, which indicates that they will be removed as part of File Stream garbage collection. Please refer to FS Garbage Collection for details.
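
The repeated checkpoint and log backup cycles used in the last few steps could be scripted as in the sketch below. The cycle count of 3, the backup file name hk-log-cycle.bak, and the backup set name are assumptions made for illustration; the number of cycles actually needed may differ on your system.

```sql
use hkdb1
go
-- Hypothetical helper loop: the cycle count (3) and the backup file name
-- are assumptions, not values from the walkthrough.
declare @cycle int = 1
while (@cycle <= 3)
begin
    checkpoint

    -- NOINIT appends each log backup to the same file instead of overwriting it
    backup log [hkdb1] to disk = N'C:\hekaton_test\data\hk-log-cycle.bak'
        with noinit, name = N'hkdb1 log backup (cycle)'

    -- count the checkpoint files that have already left the ACTIVE set
    select @cycle as cycle, file_type_desc, state_desc, count(*) as file_count
    from sys.dm_db_xtp_checkpoint_files
    where state_desc not in ('PRECREATED', 'UNDER CONSTRUCTION', 'ACTIVE', 'MERGE TARGET')
    group by file_type_desc, state_desc
    order by file_type_desc, state_desc

    set @cycle += 1
end
```

Each iteration returns one result set, so the movement of the merged source files from one state to the next can be followed cycle by cycle.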

This concludes the state transition of CFPs. The main point to remember is that it takes a few checkpoints and log backups after a set of CFPs has been merged before the storage can be garbage collected. We expect these state transitions to occur automatically in a production environment, because checkpoints occur automatically for each additional 512MB of transaction log generated and regular log backups are taken as part of the backup strategy. However, you will need to account for this extra storage as CFPs go through the transition from MERGED SOURCE to finally being garbage collected.

SQL Server Team
