How to Recover from "Disk Full" on an Exchange Log Drive

The_Exchange_Team · May 12 2004

Introduction

Every change made to an Exchange database must first be written in a transaction log file. Most Exchange administrators keep Exchange transaction log files on a dedicated drive. If the drive fills up, all Exchange databases in an affected storage group will dismount. Before you can re-mount any of the databases, you must free up some space on the transaction log drive.

There are two basic rules to follow when freeing up space on a log drive:

Do not delete log files outright. Move them to a different drive so that you can get them back if they are needed.

Do not remove all the log files, even if the databases have been shut down cleanly. Remove only log files that are older than the current checkpoint.

This post will explain best practices for managing log drive disk space and tell you how to determine which log files you can safely move if your log drive does fill up.

Why does this subject warrant a post?

In looking over cases opened with Microsoft Product Support Services (PSS), we discovered that a surprising percentage of Exchange disaster recovery cases are triggered by an Exchange log drive filling up. The intuitive thing to do if a drive is full is to delete files to free up space. But that is exactly the wrong thing to do when it comes to Exchange transaction logs. The log files are critical to Exchange's recoverability.

How do transaction log files work in Exchange?

In brief, when an Exchange database starts ("mounts") it attaches itself to the transaction log files for the storage group to which the database belongs. Every change made to the database will be recorded in a log file before it is written to the database.

Log files are each 5 megabytes in size, and they automatically "roll over" to a new log file when they fill up. Log files are numbered in sequence according to this naming scheme:

EnnXXXXX.log

Enn represents the log file prefix. The prefix is different for each set of log files on a server, and is usually E00, E01, E02 or E03 (because there can be up to four storage groups on a server).

XXXXX represents the log file generation or sequence number. Log files are numbered hexadecimally, and you can have just over a million of them in a series (0xFFFFF = 1,048,575) before you have to start over with log file number 0x00001.
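The naming scheme can be expressed in a few lines of Python. This is only an illustrative sketch: the function names log_file_name and parse_log_file_name are made up for this example and are not part of Exchange or Eseutil.

```python
import re
from typing import Tuple

MAX_GENERATION = 0xFFFFF  # 1,048,575: the last generation number in a series


def log_file_name(prefix: str, generation: int) -> str:
    """Build a numbered log file name, e.g. ("E02", 0x21EE) -> "E02021EE.log"."""
    if not 1 <= generation <= MAX_GENERATION:
        raise ValueError("generation must be between 0x00001 and 0xFFFFF")
    # Five uppercase hexadecimal digits, zero padded.
    return f"{prefix}{generation:05X}.log"


def parse_log_file_name(name: str) -> Tuple[str, int]:
    """Split a numbered log file name into (prefix, generation)."""
    match = re.fullmatch(r"([0-9A-Za-z]{3})([0-9A-Fa-f]{5})\.log", name, re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a numbered log file name: {name!r}")
    return match.group(1), int(match.group(2), 16)


if __name__ == "__main__":
    print(log_file_name("E02", 0x21EE))
    print(parse_log_file_name("E02021EE.log"))
```

Parsing the generation back out of a file name is the building block for every "which logs are older than X" comparison later in this post.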

All databases in a storage group (up to five databases) share a single set of log files. Log records for each database are interleaved together inside the log files.

Before a database is shut down ("dismounted"), Exchange ensures that all transactions in the log files have been written to the database files. After this has finished, the database detaches itself from the transaction log files. At this point, the database files are self-contained and independent of the log files. The database could be mounted again in a different storage group or with an entirely different set of log files, if you wished.

Until a database has been shut down normally, it remains attached to the log files. If a database crashes, there is no opportunity to complete normal shutdown operations and detach it from the log files. The database files are left in an inconsistent, Dirty Shutdown state. Therefore, the database must be recovered before it can be mounted again.

Recovery works by scanning the log files for a storage group, applying outstanding transactions to each of the databases in the storage group, and then detaching each of the databases properly from the log stream. This process leaves the database files consistent and in Clean Shutdown state, ready to be mounted again.

After a server crash, recovery runs automatically the next time you try to mount any database in a storage group. Most of the time, you will not even notice that recovery has happened. (If recovery is required, the process is recorded in the Application Log, including the name of each transaction log file replayed. An Event 302 from ESE Logging/Recovery indicates successful completion of recovery.)

The recovery process not only allows Exchange to survive a sudden crash without loss of data, but also lets you roll the database forward after restoration of a backup. Transaction logs generated subsequent to the backup can be played into the restored database, thus bringing it up to date.

There are many good books, Microsoft Knowledge Base articles and white papers that explain in more detail how Exchange transaction logging and recovery work. For in-depth information, you can start here and browse the white papers and books about Exchange disaster recovery:

www.microsoft.com/exchange/library

If any of the log files needed for recovery are missing or damaged after a crash, recovery will fail, and the database files will remain in a Dirty Shutdown state. A database cannot be mounted until it has been recovered to a Clean Shutdown state. So you have to get those logs back if you want to mount the database. If you can't get the logs back, you will have to restore the database from backup or repair the database in order to get it running again. And that brings us to the mistake administrators sometimes make after a log drive is full: they delete logs needed to recover the database in order to free up disk space.

But if I'm not supposed to delete log files, how do I free up disk space so I can start my databases again?

Not all logs on the disk will be required in order to re-start the database. Typically, you only need the last several. So you can free up disk space by moving older transaction logs to a different drive, as long as you preserve the last several logs that are really needed to start the database.

IMPORTANT: you should never just delete Exchange transaction log files in this situation. Instead, move them to another drive so that you can move them back in case you make a mistake about which ones are needed.

How can I tell which transaction logs are needed for recovery?

There are two ways:

You can read the Checkpoint field in the checkpoint file, which lists the oldest log needed for recovery.
You can read the Log Required field in the database header, which lists the range of logs needed for recovery.

The first method works for all versions of Exchange. The second method only works for Exchange 2000 and later versions.

Method 1

To read the checkpoint file, you first have to find it. The location of the checkpoint file is the System Path listed on the General page for each storage group's properties. That same properties page also lists the transaction log file path and the prefix for the storage group. The name of the checkpoint file will always be [prefix].chk, for example, E00.chk.

After you find the checkpoint file, you can read it by using Eseutil.exe. Eseutil is a text mode utility that is kept in the \exchsrvr\bin directory. To read a checkpoint file, you must use the /MK command line switch. For example:

"C:\Program Files\exchsrvr\bin\eseutil.exe" /MK "C:\Program Files\exchsrvr\mdbdata\E02.chk"

This command will give you about half a screen of output, and one of the lines in the output will be the checkpoint field:

Checkpoint (0x21EE,11B0,9A)

There are three values separated by commas in the parentheses, but you only have to pay attention to the first one. The first number lists the generation or sequence number of the oldest log file needed to recover the databases in the storage group. The other two numbers are offsets into the log file, and don't really matter.

In this example, if the log file prefix for the storage group is E02, then that means you need log file E02021EE.log, along with all newer log files, in order to recover the database. This means you can remove log file E02021ED.log, and all older log files, without affecting the recoverability of the database.

Remember: move log files, don't delete them. If you make a mistake, or if you later have to restore the database from backup, you'll want to get those log files back.
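If you have many log files to sort through, the comparison against the checkpoint can be scripted. The following Python sketch assumes you have already captured the text output of Eseutil /MK and a list of the file names in the log folder. The function names are hypothetical, and the script only reports candidates; it does not move or delete anything.

```python
import re
from typing import Iterable, List

# Matches the first value inside the parentheses of the Checkpoint field.
# The optional colon is an assumption, to tolerate slightly different layouts.
CHECKPOINT_RE = re.compile(r"Checkpoint\s*:?\s*\(\s*(0x[0-9A-Fa-f]+|\d+)\s*,")


def checkpoint_generation(eseutil_mk_output: str) -> int:
    """Return the generation of the oldest log needed for recovery."""
    match = CHECKPOINT_RE.search(eseutil_mk_output)
    if match is None:
        raise ValueError("no Checkpoint field found in the output")
    text = match.group(1)
    # Hexadecimal when prefixed with 0x, otherwise decimal (Exchange 5.5).
    return int(text, 16) if text.lower().startswith("0x") else int(text, 10)


def logs_safe_to_move(file_names: Iterable[str], prefix: str, checkpoint_gen: int) -> List[str]:
    """List numbered logs for this prefix that are older than the checkpoint."""
    numbered = re.compile(rf"{re.escape(prefix)}([0-9A-F]{{5}})\.log", re.IGNORECASE)
    movable = []
    for name in file_names:
        match = numbered.fullmatch(name)
        if match is None:
            continue  # skips the current log (Enn.log), the .chk file, other prefixes
        if int(match.group(1), 16) < checkpoint_gen:
            movable.append(name)
    return sorted(movable)


if __name__ == "__main__":
    names = ["E02.log", "E02021EC.log", "E02021ED.log", "E02021EE.log", "E02.chk"]
    gen = checkpoint_generation("Checkpoint (0x21EE,11B0,9A)")
    print(logs_safe_to_move(names, "E02", gen))
```

Keeping the decision (which files) separate from the action (moving them) lets you review the list before you touch anything on the log drive.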

In Exchange 5.5, the checkpoint is listed in decimal numbers. Here is the same checkpoint as in the example above, as it would appear in Exchange 5.5:

Checkpoint (8686,4528,154)

Hint: Use Calc.exe in Scientific mode to easily convert numbers between decimal and hexadecimal. You can turn on Scientific mode from the View menu in Calc.

When Eseutil lists numbers hexadecimally, it prefaces them with "0x" to alert you. Here, the checkpoint is decimal, so you will have to convert it to hexadecimal (8686 turns out to be equivalent to 0x21EE). In Exchange 5.5, there is only a single storage group and the log file prefix is always EDB. So the checkpoint log in Exchange 5.5 would be EDB021EE.log.
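If you would rather not do the conversion in Calc, the same arithmetic is a one-liner in Python. The helper name below is invented for this illustration; it simply takes the first decimal value of an Exchange 5.5 checkpoint and formats it as a log file name with the EDB prefix.

```python
def exchange55_checkpoint_log(checkpoint_field: str) -> str:
    """Turn a decimal Exchange 5.5 checkpoint such as "(8686,4528,154)"
    into the name of the checkpoint log file."""
    first_value = checkpoint_field.strip().strip("()").split(",")[0]
    generation = int(first_value, 10)  # Exchange 5.5 lists the value in decimal
    return f"EDB{generation:05X}.log"  # five hexadecimal digits, zero padded


if __name__ == "__main__":
    print(exchange55_checkpoint_log("(8686,4528,154)"))
```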

Method 2

Starting with Exchange 2000, the Log Required field is available in each database's header, and can be viewed by Eseutil with the /MH command line switch, for example:

"C:\Program Files\exchsrvr\bin\eseutil.exe" /MH "D:\mdbdata\Mb1.edb"

If you are unsure where your database files reside, you can find the path in Exchange System Manager on the Database page for each database's properties.

The Log Required field looks like this in Exchange 2000:

Log Required: 8686-8690

The field lists the range of log generations needed to recover the database, but it lists them in decimal, so you have to convert them to hexadecimal. In this example, the range of logs required is E02021EE.log to E02021F2.log.

In Exchange 2003, the conversion is done automatically for you and the Log Required field looks like this:

Log Required: 8686-8690 (0x21EE-0x21F2)
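As a rough sketch, the Log Required range can be expanded into the list of file names that must stay on the log drive. This Python example reads only the decimal range, which appears in both the Exchange 2000 and Exchange 2003 formats shown above. The function name required_logs is an assumption made for this example.

```python
import re
from typing import List

LOG_REQUIRED_RE = re.compile(r"Log Required:\s*(\d+)-(\d+)")


def required_logs(header_text: str, prefix: str) -> List[str]:
    """Expand the Log Required field into the log file names it covers."""
    match = LOG_REQUIRED_RE.search(header_text)
    if match is None:
        raise ValueError("no Log Required field found in the header")
    low, high = int(match.group(1)), int(match.group(2))
    if low == 0 and high == 0:
        return []  # 0-0: this database needs no logs to mount
    return [f"{prefix}{gen:05X}.log" for gen in range(low, high + 1)]


if __name__ == "__main__":
    for name in required_logs("Log Required: 8686-8690 (0x21EE-0x21F2)", "E02"):
        print(name)
```

Keep in mind that the highest generation in the range may not exist under its numbered name yet, because the log currently in use is still named with only the prefix (for example, E02.log).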

What if one of the logs required is missing?

If a required log is really missing, there is no way to mount the database again without restoring it from backup or repairing it. But it's not very often that a log is really missing.

You may think a log is missing because the most recent log doesn't have a number associated with it yet. While Exchange is still filling up a log with data, the name of the log is set to only the log prefix (Enn) with no hexadecimal generation number (XXXXX). For example, E02.log would be the current log in use by the storage group with the log prefix E02. When this log is full, it might be renamed to something like E02021F2.log, and a new E02.log will be created.

You can tell what the current log will eventually be named by using Eseutil with the /ML switch:

"C:\Program Files\exchsrvr\bin\eseutil.exe" /ML "L:\mdbdata\E02.log"

Look for the lGeneration line in the Eseutil output:

lGeneration (0x21F2)

Now you know that E02.log is really E02021F2.log.

Note: In Exchange 5.5, the lGeneration field is in decimal and requires conversion to hexadecimal.
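Here is a small Python sketch that works out the eventual name of the current log from the lGeneration line. It accepts either a 0x-prefixed hexadecimal value or a plain decimal value, so it should cope with the Exchange 5.5 format as well. The function name and the exact line layouts it tolerates are assumptions for this example.

```python
import re

# First number after "lGeneration", with an optional colon and parenthesis.
LGENERATION_RE = re.compile(r"lGeneration\s*:?\s*\(?\s*(0x[0-9A-Fa-f]+|\d+)")


def current_log_future_name(eseutil_ml_output: str, prefix: str) -> str:
    """Return the numbered name the current log will get when it fills up."""
    match = LGENERATION_RE.search(eseutil_ml_output)
    if match is None:
        raise ValueError("no lGeneration line found in the output")
    text = match.group(1)
    generation = int(text, 16) if text.lower().startswith("0x") else int(text, 10)
    return f"{prefix}{generation:05X}.log"


if __name__ == "__main__":
    print(current_log_future_name("lGeneration (0x21F2)", "E02"))
```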

What if the Log Required field is 0-0?

If the Log Required field is 0-0, this means the database has been shut down cleanly and has been detached from the log files properly. This database doesn't need any transaction log files in order to be mounted again. But this does not mean you should remove all the transaction log files.

If Log Required is 0-0 for one database, you should check the header of every other database in the storage group (Eseutil /MH), and make sure that each database shows Log Required: 0-0 and State: Clean Shutdown.

If this is the case, then you can move all numbered log files, but do not remove the current log file (Enn.log).

It is true that databases in Clean Shutdown state don't need previous log files in order to mount them again. You could remove all logs without losing any data. But if you remove all log files, Exchange will create a new series of log files starting over with log generation 1. This is called resetting the log series. If you later have to restore from backup, this will greatly complicate rolling the database forward because you'll have two incompatible sets of log files to contend with.

However, if you leave the current log in place, removing only numbered logs, then when the databases start up again, they will attach to the current log and continue the previous series. Remember that this only applies if Log Required is 0-0. If Log Required actually lists a range of logs, you must keep that entire range of logs available to the databases. (By the way, if Log Required is 0-0 for all databases in the storage group, you will find that the checkpoint is always in the current log file too.)
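The 0-0 rule can be summarized as a short decision routine. In this Python sketch, the DatabaseHeader class is a hypothetical container for the two header fields you read with Eseutil /MH; nothing here reads real database files. The routine returns the numbered logs only when every database in the storage group is clean, and it never returns the current log.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class DatabaseHeader:
    name: str          # label for the database, e.g. "Mb1.edb"
    state: str         # State field, e.g. "Clean Shutdown" or "Dirty Shutdown"
    log_required: str  # decimal range from the Log Required field, e.g. "0-0"


def all_databases_clean(headers: Iterable[DatabaseHeader]) -> bool:
    """True only if every database shows Clean Shutdown and Log Required 0-0."""
    headers = list(headers)
    return bool(headers) and all(
        h.state.strip() == "Clean Shutdown" and h.log_required.strip() == "0-0"
        for h in headers
    )


def movable_when_clean(file_names: Iterable[str], prefix: str,
                       headers: Iterable[DatabaseHeader]) -> List[str]:
    """Numbered logs that may be moved; the current Enn.log is always kept."""
    if not all_databases_clean(headers):
        return []  # at least one database still needs a range of logs
    numbered = re.compile(rf"{re.escape(prefix)}[0-9A-F]{{5}}\.log", re.IGNORECASE)
    return sorted(name for name in file_names if numbered.fullmatch(name))


if __name__ == "__main__":
    group = [DatabaseHeader("Mb1.edb", "Clean Shutdown", "0-0"),
             DatabaseHeader("Pub1.edb", "Clean Shutdown", "0-0")]
    print(movable_when_clean(["E02.log", "E02021F0.log", "E02021F1.log"], "E02", group))
```

Because the pattern matches only names with a five-digit generation number, E02.log can never end up in the result, which is what keeps the log series from being reset.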

As a general rule, don't reset your logs without a good reason. One good reason would be because you are getting close to the million log limit on log file generation numbers. If you do reset a log series, take a full backup as soon as possible afterward.

Why do Exchange log drives fill up? Shouldn't Exchange manage the logs automatically?

There are two methods used by Exchange to "prune" old log files so that the disk doesn't fill up:

Circular logging. By default, circular logging is enabled in Exchange 5.5. With circular logging turned on, as soon as the checkpoint passes through a log file, and the log file is therefore no longer needed by any database for crash recovery, then the log is deleted. Typically, if you have circular logging turned on, you will have no more than half a dozen log files in existence at any given time. Circular logging was the Exchange 5.5 default specifically to prevent log drives from filling up. But there is a big drawback to circular logging.

If you have to restore from backup with circular logging enabled, then Exchange will not be able to roll the database forward with additional transaction logs because all those logs will have already been deleted.

In Exchange 2000, the default for circular logging was changed to be off, even though this means a log drive will eventually fill up if an administrator doesn't take regular backups. This means that the problem of log drives running out of space happens more frequently than it did before. The great majority of Exchange administrators tell us that they prefer the new setting, because it means Exchange won't delete any transaction log files without the administrator giving "permission" for it.

NOTE: Small Business Server 2003 includes Exchange 2003. Circular logging is enabled by default for Exchange 2003 running on Small Business Server. However, running the Small Business Server Backup Wizard will automatically turn off circular logging. If you use a third party backup solution with Small Business Server 2003, you should examine the storage group properties to verify that circular logging is not enabled.

Taking an online backup. After a Normal (Full) backup completes successfully, Exchange will automatically remove logs that are not needed to roll forward from the backup. Logs are also pruned after an Incremental backup.

An administrator can also manually remove log files, as long as the rules explained in this post are followed.

I'm taking regular online backups, but my transaction logs aren't getting pruned. Why?

The most common reasons for this are:

You are not backing up all databases in a storage group regularly. All databases in a storage group share a single set of log files. Until no database in the storage group needs a particular log file to roll forward from backup, the log file will not be removed. So if you haven't backed up one of your databases in a long time, all the log files that database needs will stay on disk, even if no other database needs them anymore.

You do not have all databases in the storage group mounted when online backup is done. At the end of backup, Exchange checks with each database in the storage group to find out which logs it still needs for recovery. If a database is offline, Exchange cannot tell what it needs, and so, to be safe, no logs will be pruned.

You are taking Copy or Differential backups instead of Normal (Full) or Incremental backups. By design, no logs will be pruned after a Copy or Differential backup.

How big should I make my log drive?

We recommend that your log drive be sized to hold at least 10 times as many logs as you generate on an average day. This gives you some buffer room if your backup fails for several days in a row, or logs can't be pruned for some other reason.
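To turn this rule of thumb into a number, multiply your average daily log count by 10 and by the 5 megabyte log file size. The Python sketch below does just that. The figure of 400 logs per day in the example is made up, so substitute a count measured on your own server.

```python
LOG_FILE_SIZE_MB = 5  # each log file is 5 megabytes, as described earlier


def minimum_log_drive_mb(avg_logs_per_day: int, days_of_headroom: int = 10) -> int:
    """Minimum log drive size in megabytes for the given daily log count."""
    if avg_logs_per_day < 0 or days_of_headroom < 0:
        raise ValueError("counts must not be negative")
    return avg_logs_per_day * days_of_headroom * LOG_FILE_SIZE_MB


if __name__ == "__main__":
    # Hypothetical server that generates 400 logs (2,000 MB) on an average day.
    print(minimum_log_drive_mb(400), "MB")
```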

The extra space also helps if you have a sudden jump in database activity that causes more logs to be generated than you are used to. For example, if you move a large number of mailboxes between databases, that is likely to generate a large number of log files because you are, in essence, "re-delivering" all the mail all at once.

Another thing you can do to make it easier to recover from running out of log drive disk space is to put a "spacer" file on the log drive. The size of this file should be a little more than the total size of the log files you generate in a day. Put this file in the same folder as the log files, and give it a name such as "AAA DELETE ME IF YOU RUN OUT OF SPACE.SPACER" so that it sorts to the top of the folder and its purpose is obvious.

If you create such a spacer file, then it is likely that even an untrained administrator will delete this file instead of destroying essential Exchange log files.
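One possible way to create a spacer file is sketched below in Python. The function name and the approach of writing zero-filled blocks are choices made for this example. The zeros are written out in full so that the space is reserved on disk, rather than only recorded as a file length.

```python
import os

SPACER_NAME = "AAA DELETE ME IF YOU RUN OUT OF SPACE.SPACER"


def create_spacer_file(log_dir: str, size_bytes: int, name: str = SPACER_NAME) -> str:
    """Create a zero-filled spacer file of size_bytes in log_dir; return its path."""
    path = os.path.join(log_dir, name)
    block = b"\0" * (1024 * 1024)  # write in 1 MB blocks to keep memory use low
    # Mode "xb" refuses to overwrite a file that already exists.
    with open(path, "xb") as spacer:
        remaining = size_bytes
        while remaining > 0:
            chunk = block[:min(remaining, len(block))]
            spacer.write(chunk)
            remaining -= len(chunk)
    return path


if __name__ == "__main__":
    import tempfile
    # Demonstration in a temporary folder; on a real server you would pass
    # the log folder and a size a little larger than one day of logs.
    with tempfile.TemporaryDirectory() as folder:
        created = create_spacer_file(folder, 2 * 1024 * 1024)
        print(os.path.getsize(created), "bytes reserved")
```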

- Mike Lee
