* But Were Afraid to Ask
Part 2 of this series (Part 1 is here and Part 3 is here) breaks down the events that take place during the backup of a mounted and active replicated database in an Exchange 2010 Database Availability Group called, simply enough, “DAG”. In this example the backup server is asked to create a full backup of database DB1 on server ADA-MBX1, using non-persistent COW snapshots:
Event 9606 indicates that the VSS requestor has engaged the Exchange writer, and reports the instance GUID for the backup job that is starting. In this case the instance is 830705de-32d9-4059-94ea-b9e9aad38615. This instance GUID persists for the duration of a job and changes with each subsequent job, so you can use it to track the sequence of events for each individual job. At this point the Exchange writer provides the backup application with metadata about the databases and logs present.
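If you export the Application log and want to follow one job at a time, the GUID tracking described above can be sketched roughly as follows. This is a minimal illustration, not a tool from this series: the function name `group_by_instance` and the shortened message strings are made up for the example, and real event text is longer and worded differently.

```python
import re
from collections import defaultdict

# Matches a GUID in the usual 8-4-4-4-12 hexadecimal layout.
GUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def group_by_instance(events):
    """Group (event_id, message) pairs by the first GUID found in the message.

    Events whose message contains no GUID are collected under the key None.
    """
    jobs = defaultdict(list)
    for event_id, message in events:
        match = GUID_RE.search(message)
        key = match.group(0).lower() if match else None
        jobs[key].append(event_id)
    return dict(jobs)


# Hypothetical, simplified messages used only to exercise the function.
sample = [
    (9606, "writer instance 830705de-32d9-4059-94ea-b9e9aad38615 engaged"),
    (9608, "writer instance 830705de-32d9-4059-94ea-b9e9aad38615 acknowledged snapshot request"),
    (2001, "freeze started"),
]
jobs = group_by_instance(sample)
print(jobs)
```

Events that carry no GUID end up under `None`; for those, the ESE instance number is the better handle.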
Events 2005 and 9811 indicate an instance number assignment for ESE. Along with the writer instance GUID from event 9606, we can therefore also track a job’s progress using these ESE instance numbers, which increment by one with each job. At this stage the database is marked with “backup in progress” in the Information Store Service’s memory space.
Once the backup application has determined which disks need snapshots, based on the data locations in the Exchange writer metadata, it requests those snapshots. As the snapshot requests arrive, event 9608 is generated, indicating the Exchange writer’s acknowledgment of what’s about to happen. The writer must then halt disk writes to the database(s) and logs for the duration of the snapshot generation process. This halt is known as a “freeze”.
When event 2001 is generated the current transaction log is closed, and the freeze begins. Writes from STORE.exe to the disks are held in memory.
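To keep the events up to this point straight, here is a small lookup sketch. The event IDs and their meanings are taken from this walkthrough; the names `STAGES`, `annotate`, and `freeze_started` are just illustrative, and the table is not a complete list of what Exchange logs during a backup.

```python
# Event IDs and their meaning, as described in this post so far.
STAGES = {
    9606: "VSS requestor engaged the Exchange writer; instance GUID reported",
    2005: "ESE instance number assigned",
    9811: "ESE instance number assigned",
    9608: "Exchange writer acknowledged the snapshot request",
    2001: "current transaction log closed; freeze begins",
}


def annotate(event_ids):
    """Return (event_id, description) pairs; unknown IDs get a placeholder."""
    return [
        (eid, STAGES.get(eid, "not covered in this walkthrough"))
        for eid in event_ids
    ]


def freeze_started(event_ids):
    """True once event 2001 has been seen for the job."""
    return 2001 in event_ids


for eid, text in annotate([9606, 2005, 9811, 9608, 2001]):
    print(eid, text)
```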
Once these events appear we know the snapshot(s) have been created, and writes are allowed to database data blocks again.
Once the snapshots are created the backup application can copy blocks of data from the VSS subsystem, getting blocks of data from shadow storage if they’ve been preserved due to a change, or from the actual disk volume if they haven’t. The Exchange Writer waits for the signal that the transfer of data is complete. This flow of data is represented by the purple arrows, which in this case indicates data getting copied out of the snapshots in storage, through I/O of the Exchange server, and on to the backup server.
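The copy-on-write read path described above can be modeled in a few lines. This is a toy sketch of the general COW idea, not how the VSS provider is actually implemented: the class `CowSnapshot` and its methods are invented for illustration, and blocks are plain strings in a Python list.

```python
class CowSnapshot:
    """Toy copy-on-write snapshot over a list of blocks."""

    def __init__(self, volume):
        self.volume = volume  # live volume: a mutable list of blocks
        self.shadow = {}      # block index -> content at snapshot time

    def write(self, index, data):
        """Write to the live volume, preserving the original block first."""
        if index not in self.shadow:
            # First change since the snapshot: save the old content once.
            self.shadow[index] = self.volume[index]
        self.volume[index] = data

    def read_snapshot(self, index):
        """Read the block as it looked when the snapshot was taken."""
        if index in self.shadow:
            return self.shadow[index]  # changed since snapshot: use shadow storage
        return self.volume[index]      # unchanged: read the live volume


volume = ["A0", "B0", "C0"]
snap = CowSnapshot(volume)
snap.write(1, "B1")
snap.write(1, "B2")  # a second write does not overwrite the preserved block

backup = [snap.read_snapshot(i) for i in range(len(volume))]
print(backup)  # ['A0', 'B0', 'C0']
print(volume)  # ['A0', 'B2', 'C0']
```

Only the block that changed takes up shadow storage, which is why the backup application reads most data straight from the original volume.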
Once the backup application finishes copying data it signals VSS that it’s done. VSS in turn signals the Exchange writer, which then initiates post-backup steps, signified by the above events. Event 225 appears to state that log truncation won’t occur, but that event is misleading. For a standalone database, ESE would go ahead and clear logs upon backup completion. When a DAG replicated database is involved, however, the other database copies must first be checked, in coordination with the Exchange Replication Service, to ensure log truncation can continue. Once that check is complete, the logs eligible for truncation are deleted. The database header is marked with information about the backup, and the “backup in progress” bit is switched off in memory. In this case the snapshots used for the job are destroyed as part of the completion. In other types of backups, such as incremental, the persistence of the snapshot varies.
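The difference between the standalone and DAG cases can be pictured with a deliberately simplified model. Everything here is an assumption made for illustration: the function name, the idea that each copy reports a single "highest log generation it no longer needs", and the copy names ADA-MBX2 and ADA-MBX3 are all hypothetical, and the real check performed with the Replication Service takes more conditions into account.

```python
def truncatable_generations(logs_on_disk, backed_up_through, copies_done_through):
    """Return the log generations that may be deleted in this simplified model.

    logs_on_disk:        log generation numbers present on the backed-up copy
    backed_up_through:   highest generation captured by the completed backup
    copies_done_through: copy name -> highest generation that copy no longer
                         needs (empty dict for a standalone database)
    """
    # A log can go only if the backup has it AND every other copy is done with it.
    limit = min([backed_up_through, *copies_done_through.values()])
    return sorted(g for g in logs_on_disk if g <= limit)


logs = range(100, 110)

# Standalone database: only the backup matters.
standalone = truncatable_generations(logs, 107, {})

# DAG: the slowest copy (hypothetical ADA-MBX2) holds truncation back.
dag = truncatable_generations(logs, 107, {"ADA-MBX2": 105, "ADA-MBX3": 108})

print(standalone)  # [100, 101, 102, 103, 104, 105, 106, 107]
print(dag)         # [100, 101, 102, 103, 104, 105]
```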
In the next post in this series we’ll break down the backup of a passive DAG replicated database copy.