How long will it take to back up and restore Exchange data with DPM?

One of the most frequently asked, and most difficult to answer, questions about Microsoft System Center Data Protection Manager (DPM) is how long it will take to back up and restore data. As you'd expect, the answer is 'it depends'. So I thought I'd provide an example from my own experience, which you might find useful for extrapolating an estimate for your own environment.

So first let's have a quick look at my test environment...

[Image: DPM Blog]

Some other key bits of information:

  • 7 Protection Groups each protecting a single Storage Group
  • Custom Volumes
  • 8 Day retention period
  • Weekly Express Full schedule
  • 15 minute incremental schedule
  • 2 minute incremental offset
  • No Express Full offset

So here are my results...

1. Replica Creation and Consistency Check

This test is the initial job which copies the data from Exchange Server to DPM and runs a consistency check between the Exchange Server volumes protected by DPM and the DPM volume where the corresponding data will be stored. For a definition of a consistency check please go to the following location: https://technet.microsoft.com/en-us/library/cc161653.aspx

| Protection Group | Data Transferred | Change Percentage | Time to Completion (h:m:s) |
|---|---|---|---|
| PG1 | 80,689.88MB | 100% | 01:42:29 |
| PG2 | 80,788.82MB | 100% | 01:52:41 |
| PG3 | 80,821.94MB | 100% | 01:43:30 |
| PG4 | 80,609.82MB | 100% | 01:46:00 |
| PG5 | 80,647.94MB | 100% | 01:50:26 |
| PG6 | 80,131.94MB | 100% | 01:51:04 |
| PG7 | 80,611.82MB | 100% | 01:44:01 |

 

| Total Data Transferred | Bandwidth Consumption* | Backup Time** |
|---|---|---|
| 564,302.16MB | 368Mbps | 01:52:41 |

* Bandwidth consumption was calculated as an average bytes received per second for the duration of the peak transfer of data during the consistency check.
** All jobs ran in parallel and so the backup time is the time it took for the longest job to complete.
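
As a rough sketch, the summary row can be reproduced from the per-group figures: the total is a simple sum, and because the jobs ran in parallel the backup time is the longest single job. The Python below is only an illustration; the names `JOBS`, `parse_hms` and `summarise` are made up for this example and have nothing to do with DPM itself.

```python
from datetime import timedelta

# Figures copied from the replica creation table: (MB transferred, h:m:s).
JOBS = {
    "PG1": (80689.88, "01:42:29"),
    "PG2": (80788.82, "01:52:41"),
    "PG3": (80821.94, "01:43:30"),
    "PG4": (80609.82, "01:46:00"),
    "PG5": (80647.94, "01:50:26"),
    "PG6": (80131.94, "01:51:04"),
    "PG7": (80611.82, "01:44:01"),
}


def parse_hms(text: str) -> timedelta:
    """Turn an 'hh:mm:ss' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def summarise(jobs):
    """Return (total MB, elapsed time) for jobs that all run in parallel."""
    total_mb = round(sum(mb for mb, _ in jobs.values()), 2)
    # Parallel jobs finish when the slowest one finishes.
    longest = max(parse_hms(hms) for _, hms in jobs.values())
    return total_mb, longest


total_mb, backup_time = summarise(JOBS)
print(f"{total_mb:,.2f}MB transferred, finished after {backup_time}")
```

If the jobs had run one after another instead, the elapsed time would be the sum of the individual times rather than the maximum.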

2. First Express Full Backup

This job transfers any changes made since the original replica creation and consistency check. Have a look here for more information about the Express Full job.

| Protection Group | Data Transferred | Change Percentage | Time to Completion (h:m:s) |
|---|---|---|---|
| PG1 | 1,164.19MB | 1.4% | 00:24:43 |
| PG2 | 1,155.94MB | 1.4% | 00:33:35 |
| PG3 | 1,129.50MB | 1.36% | 00:25:10 |
| PG4 | 267.38MB | 0.3% | 00:24:48 |
| PG5 | 1,144.25MB | 1.4% | 00:33:12 |
| PG6 | 286.88MB | 0.3% | 00:21:21 |
| PG7 | 1,171.44MB | 1.4% | 00:33:04 |

* A consistency check had to be run previously against Protection Groups 4 and 6 which is why there was less data to transfer. The issue was resolved by the installation of the DPM Feature Pack.

| Total Data Transferred | Bandwidth Consumption* | Backup Time |
|---|---|---|
| 6,320MB | 104Mbps | 00:33:35 |

* Bandwidth consumption was calculated as an average bytes received per second for the duration of the peak transfer of data during the backup job.

3. Second Express Full Backup

This job transfers any changes made since the last Express Full backup job.

| Protection Group | Data Transferred | Change Percentage | Time to Completion (h:m:s) |
|---|---|---|---|
| PG1 | 4,114.50MB | 5% | 00:26:44 |
| PG2 | 3,997.31MB | 4.5% | 00:34:41 |
| PG3 | 3,926.00MB | 4.5% | 00:40:28 |
| PG4 | 4,123.56MB | 5% | 00:41:18 |
| PG5 | 1,144.25MB | 4.5% | 00:35:36 |
| PG6 | 4,019.13MB | 4.5% | 00:19:19 |
| PG7 | 4,002.19MB | 5% | 00:36:56 |

 

| Total Data Transferred | Bandwidth Consumption | Backup Time |
|---|---|---|
| 28,338.44MB | 240Mbps | 00:41:18 |

4. Example Incremental synchronisation job

This test is the regular incremental synchronisation, which occurs by default every 15 minutes. Have a look here for more information about this type of job.

| Protection Group | Data Transferred | Time to Completion (h:m:s) |
|---|---|---|
| PG1 | 51.19MB | 00:01:05 |
| PG2 | 56.19MB | 00:01:04 |
| PG3 | 55.19MB | 00:01:05 |
| PG4 | 57.19MB | 00:01:06 |
| PG5 | 46.19MB | 00:01:04 |
| PG6 | 40.19MB | 00:01:04 |
| PG7 | 36.19MB | 00:01:07 |

 

| Total Data Transferred | Bandwidth Consumption | Backup Time |
|---|---|---|
| 342.33MB | 4Mbps | 00:01:07 |

5. Database Restore

The final test I ran was to restore a single database over the top of a 'failed' database to its original location.

| Protection Group | Total Data Transferred | Time to Completion (h:m:s) | Bandwidth Consumption |
|---|---|---|---|
| PG2 | 92,176.44MB | 00:21:33 | 744Mbps |
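
The 744Mbps figure reads like a peak rate, in the same way the bandwidth figures for the backup tests were measured during the peak transfer. For planning purposes the average rate over the whole job can be more useful, and it can be worked out from the size and the duration. A minimal sketch, assuming the figures above and a hypothetical helper called `average_mbps`; whether DPM reports MB as 1,000,000 or 1,048,576 bytes is an assumption either way, so both are shown.

```python
def average_mbps(data_mb: float, hms: str, bytes_per_mb: int = 1_000_000) -> float:
    """Average throughput in megabits per second for a completed job."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    elapsed = hours * 3600 + minutes * 60 + seconds
    if elapsed <= 0:
        raise ValueError("job duration must be greater than zero")
    bits = data_mb * bytes_per_mb * 8
    return bits / elapsed / 1_000_000  # 1 Mbps = 1,000,000 bits per second


# PG2 restore: 92,176.44MB in 00:21:33.
decimal_rate = average_mbps(92176.44, "00:21:33")
binary_rate = average_mbps(92176.44, "00:21:33", bytes_per_mb=1024 * 1024)
print(f"{decimal_rate:.0f}Mbps if 1MB = 1,000,000 bytes")
print(f"{binary_rate:.0f}Mbps if 1MB = 1,048,576 bytes")
```

Either way the average comes out below the 744Mbps peak, which is what you would expect when the rate dips part way through the job.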

It is quite interesting to see what happens to the bandwidth consumption when an incremental backup job kicks off during the restore, which is what you would expect to see on a production server. The following screenshot shows how the bytes sent per second drops from around 744Mbps to about 520Mbps. So parallel backups will increase restore times. Pretty obvious, I know, but interesting to see.

[Screenshot: dpm_restore]

So what conclusions can we draw from these results?

Well, first of all, DPM seems to make pretty efficient use of the available bandwidth. At times during the consistency check and the restore, DPM was approaching the saturation point of Gigabit Ethernet.

Secondly, because DPM is only concerned with changes following the initial replica creation and consistency check, backups are fast. My Express Full backups were taking under 45 minutes, and because DPM is VSS based the interruption to service was always under 2 seconds. The regular transaction log syncs were taking about a minute each time.

And of course, as we are backing up from the replica database, there is no impact on the active database and therefore none on the clients.

You should also be able to get pretty decent restore rates - roughly 92GB in under 25 minutes is pretty fast, I reckon.

Lastly, I'd like to point out that the tests were performed using the latest Feature Pack for DPM, available for download at https://www.microsoft.com/downloads/details.aspx?familyid=AD5CD1A2-9B87-4A2C-90A2-9DBAF1024310&displaylang=en, and for the duration of the testing DPM proved to be very stable.

Of course, I do have to point out that this was a test rig with nothing else going on on its isolated network. Unfortunately, it's very difficult to say exactly how DPM might perform in your environment based on these results. However, I hope the information is useful as a guide for some.
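
For anyone who wants a first-pass number anyway, the simplest approach is to scale one of the observed jobs linearly to your own data size. The sketch below does exactly that and nothing more: it assumes the time grows in proportion to the data, which ignores your own network, disks and competing load, and the 200,000MB database in the example is a made-up figure, not something that was tested. The helper name `estimate_duration` is hypothetical.

```python
from datetime import timedelta


def estimate_duration(data_mb: float, observed_mb: float,
                      observed: timedelta) -> timedelta:
    """Scale an observed job time linearly to a different amount of data."""
    if observed_mb <= 0:
        raise ValueError("observed data size must be greater than zero")
    seconds = observed.total_seconds() * data_mb / observed_mb
    return timedelta(seconds=round(seconds))


# Observed in the restore test: 92,176.44MB in 00:21:33.
observed_restore = timedelta(minutes=21, seconds=33)

# Hypothetical 200,000MB database, purely for illustration.
estimate = estimate_duration(200_000, 92176.44, observed_restore)
print(f"Estimated restore time: {estimate}")
```

Treat the result as a starting point for your own testing rather than a prediction.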

---
Doug Gowans