Got a -1018 ..? (Part 1)

Just a quick one, this:

If you ever see a -1018 in your application event log, you must take immediate action.

A -1018 means that one or more of the 4 KB pages in the physical database(s) is corrupt (a checksum mismatch). I will go into a lot more detail about this next week, but in the meantime, if you do see a -1018 (454) in the App log, please consider the following. In the long run it will reduce, if not eliminate, a lot of admin pain:

1) Your online backups are most likely failing, so your ability to recover 100% successfully could be in jeopardy.
2) The number of -1018s logged in the application event log depends on:

   a) How many times that particular page is being accessed (read)
   b) How many corrupt pages there are in the database

3) Your users will be calling the helpdesk about erratic Outlook behaviour when they click on particular folders or mail.
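
To make the checksum mismatch and point 2 a little more concrete, here is a minimal Python sketch. It is an illustration only: it uses CRC32 and a made-up 4-byte page header, not the real ESE page layout or checksum algorithm, and all the function names are hypothetical. It shows a page being verified on read, and why a single bad page that gets read often produces a lot of errors.

```python
import struct
import zlib
from typing import List

PAGE_SIZE = 4096   # 4 KB pages, as described above
HEADER_SIZE = 4    # assumption: toy layout, checksum stored in the first 4 bytes


def make_page(payload: bytes) -> bytes:
    """Build a toy page: a 4-byte checksum followed by the zero-padded payload."""
    body = payload.ljust(PAGE_SIZE - HEADER_SIZE, b"\x00")
    if len(body) != PAGE_SIZE - HEADER_SIZE:
        raise ValueError("payload too large for one page")
    return struct.pack("<I", zlib.crc32(body)) + body


def page_is_valid(page: bytes) -> bool:
    """Recompute the checksum on read and compare it with the stored value."""
    (stored,) = struct.unpack("<I", page[:HEADER_SIZE])
    return zlib.crc32(page[HEADER_SIZE:]) == stored


def find_corrupt_pages(database: bytes) -> List[int]:
    """Return the page numbers whose checksum no longer matches."""
    bad = []
    for offset in range(0, len(database), PAGE_SIZE):
        if not page_is_valid(database[offset:offset + PAGE_SIZE]):
            bad.append(offset // PAGE_SIZE)
    return bad


# Four healthy pages, then damage one byte inside page 2.
pages = [make_page(f"mail item {i}".encode()) for i in range(4)]
database = bytearray(b"".join(pages))
database[2 * PAGE_SIZE + 100] ^= 0xFF
corrupt_pages = find_corrupt_pages(bytes(database))

# Every read of a bad page counts as one error, so a "hot" page is noisy.
reads = [2, 0, 2, 1, 2]
errors_logged = sum(
    1 for n in reads
    if not page_is_valid(bytes(database[n * PAGE_SIZE:(n + 1) * PAGE_SIZE]))
)
print(corrupt_pages, errors_logged)
```

In this toy run only one page is damaged, but it is read three times, so three errors are counted.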

The recommended way to recover from a -1018 is to restore the database from the previous night's online backup.

If you have no healthy backup from BEFORE the first -1018 is reported, then in order to successfully "resolve" the issue you should carry out ALL of the following actions:

1) Run an ESEutil /p against the corrupt database (Repair)

There is the potential for data loss, but this depends on where the corrupt page is in the database. If the corrupt page is near the "top" of the DB (imagine an upside-down oak tree) then the repair function will simply remove this page and, with it, everything underneath it. The EDB+STM files often shrink so much after a /p has been run because it is likely that the attachment table within the EDB has the corrupt page.
(I will go into all of this further next week)
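
The upside-down oak tree picture can be sketched in a few lines of Python. This is a toy model only: the `Page` class and the `repair` function are hypothetical and say nothing about how ESEutil really walks or repairs a database. The point is simply that the higher up the corrupt page sits, the more goes with it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    name: str
    corrupt: bool = False
    children: List["Page"] = field(default_factory=list)


def count_pages(page: Page) -> int:
    """Count a page plus everything underneath it."""
    return 1 + sum(count_pages(child) for child in page.children)


def repair(page: Page) -> int:
    """Drop corrupt child pages with their subtrees; return how many pages were lost.

    Toy simplification: the root page itself is assumed to be healthy.
    """
    lost = 0
    kept = []
    for child in page.children:
        if child.corrupt:
            lost += count_pages(child)   # the page and everything below it
        else:
            lost += repair(child)
            kept.append(child)
    page.children = kept
    return lost


def build_tree(corrupt_name: str) -> Page:
    """Build a small 8-page tree and mark one page as corrupt."""
    branch_a = Page("A", children=[Page("a1"), Page("a2"), Page("a3")])
    branch_b = Page("B", children=[Page("b1"), Page("b2")])
    root = Page("root", children=[branch_a, branch_b])
    stack = [root]
    while stack:
        page = stack.pop()
        if page.name == corrupt_name:
            page.corrupt = True
        stack.extend(page.children)
    return root


lost_near_top = repair(build_tree("A"))    # corrupt page just below the root
lost_at_leaf = repair(build_tree("a2"))    # corrupt page at the very bottom
print(lost_near_top, lost_at_leaf)
```

With the corrupt page near the top, four of the eight pages are lost; with it at a leaf, only one is.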

2) Run an ESEutil /d against the repaired database (Defrag)

A defrag is required because it actually creates a new database, the same sort of principle as moving mailboxes from one mailbox store to another. (Again, more detail on this next week)

3) Finally you must then run ISINTEG -fix -test alltests (Logical)

This final requirement looks at the database from a logical perspective, i.e. the data that lives on the physical 4 KB pages. It makes sure that the logical pointers in the database are all linked.
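
The three steps above can be laid out as an ordered command sequence. The Python sketch below only builds and prints the command lines; it does not run anything. The database path and server name are placeholders, the `repair_plan` helper is hypothetical, and the `-s` server switch on ISINTEG is included on the assumption that your version needs to be told which server to work against. Check every switch against the documentation for your Exchange version before running the real thing.

```python
from typing import List


def repair_plan(database_path: str, server_name: str) -> List[List[str]]:
    """Return the repair, defrag and logical-check commands, in that order."""
    return [
        ["eseutil", "/p", database_path],                              # 1) repair
        ["eseutil", "/d", database_path],                              # 2) defrag
        ["isinteg", "-s", server_name, "-fix", "-test", "alltests"],   # 3) logical
    ]


# Placeholder values, not taken from any real system.
plan = repair_plan(r"D:\Exchsrvr\mdbdata\priv1.edb", "EXCH01")
for step, command in enumerate(plan, start=1):
    print(step, " ".join(command))
```

Keeping the steps in one list makes it harder to skip one or run them out of order, which matters because all three are needed.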

You may be thinking: "Yes, but this could take a long time..!" Yes it could, depending on the size of the database(s) and on processor and disk speed, but trust me, in the long run, if you need to follow this action plan, you will be in a much better place..!

The bottom line, though, is that if you get a -1018 you must do the following (because it could happen again):

a) Take immediate action
b) Check your hardware: this means disks, disk controllers, firmware versions, drivers, etc.

Next week, I will go into all of this a lot further, including moving mailboxes from one DB to another and using ExMerge.

In the meantime though, take a look at the following articles:

https://support.microsoft.com/default.aspx?scid=kb;EN-US;314917

https://support.microsoft.com/default.aspx?scid=kb;EN-US;237953