How does duplicate detection work?

Duplicate detection of delivered messages is done by the Exchange store. The store does duplicate detection based on two properties of the message: the Internet Message Id and the client submit time. We would have liked to do duplicate detection based solely on the Internet Message Id, but there are several not-to-be-named applications out there that use the same Internet Message Id on all their messages.

The store keeps track of duplicates using a table in JET called the DeliveredTo table. When a message is about to be delivered to a user, the store checks this table. If no entry is found, the message is delivered to the user and a row is added to the table to indicate that the user received the message. If an entry is found, the message is treated as a duplicate and turfed (discarded).
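The check described above can be sketched in a few lines of Python. This is only an illustration of the lookup-then-insert logic, not the store's actual code: the class name, the in-memory set standing in for the JET table, and the exact shape of the key (mailbox, Internet Message Id, client submit time) are all assumptions made for the example.

```python
from datetime import datetime


class DeliveredToTable:
    """Toy in-memory stand-in for the per-database DeliveredTo table (hypothetical)."""

    def __init__(self):
        # Each row records that a mailbox already received a given message.
        self._rows = set()

    def deliver(self, mailbox, internet_message_id, client_submit_time):
        """Return True if the message is delivered, False if it is dropped as a duplicate."""
        key = (mailbox, internet_message_id, client_submit_time)
        if key in self._rows:
            return False  # entry found: duplicate, drop the message
        self._rows.add(key)  # no entry: deliver and remember it
        return True


table = DeliveredToTable()
submit = datetime(2004, 7, 14, 9, 30, 0)

first = table.deliver("alice", "<abc@example.com>", submit)       # delivered
second = table.deliver("alice", "<abc@example.com>", submit)      # dropped as duplicate
other_user = table.deliver("bob", "<abc@example.com>", submit)    # delivered, different mailbox
```

Keying on both properties is what lets two messages that share an Internet Message Id but differ in submit time both get through.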

The store only tracks duplicates for 1 hour by default. You can change this with the following registry setting:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSExchangeIS\<Server Name>\<Private/Public-Guid>\Track Duplicates (in hours)

The maximum value that the store will accept for this registry value is 49 days. If a value greater than 49 days is set, the store will ignore it and keep duplicates for 24 hours. Keep in mind that increasing this value will cause the DeliveredTo table to grow very large, which could slow down delivery.
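As a rough sketch of the rule described above, the tracking period that actually applies could be computed like this. The function and constant names are hypothetical, and the only behavior modeled is what this post states: a 1 hour default, a 49 day maximum, and a 24 hour fallback when the maximum is exceeded.

```python
DEFAULT_HOURS = 1        # used when the registry value is not set
MAX_HOURS = 49 * 24      # 49 days, expressed in hours like the registry value
FALLBACK_HOURS = 24      # used when the configured value exceeds the maximum


def effective_tracking_hours(configured=None):
    """Return the number of hours duplicates are tracked (illustrative only)."""
    if configured is None:
        return DEFAULT_HOURS
    if configured > MAX_HOURS:
        # The out-of-range value is ignored rather than clamped to the maximum.
        return FALLBACK_HOURS
    return configured
```

Note that an out-of-range value does not get you the maximum: asking for 50 days results in 24 hours of tracking, not 49 days.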

The store periodically deletes old items from the DeliveredTo table. This is handled by the background cleanup thread, which runs every hour. The interval is also configurable, and the registry setting is:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSExchangeIS\<Server Name>\<Private/Public-Guid>\Background Cleanup (in msecs)
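To make the effect of the cleanup concrete, here is a minimal sketch of an expiry pass over the table. It assumes each row carries the time it was added and that the row key is opaque; the function name and the dictionary representation are inventions for this example, not how the store implements it.

```python
from datetime import datetime, timedelta


def purge_expired(rows, now, track_for=timedelta(hours=1)):
    """Return only the rows that are still inside the tracking window."""
    return {
        key: delivered_at
        for key, delivered_at in rows.items()
        if now - delivered_at < track_for
    }


rows = {
    ("alice", "<old@example.com>"): datetime(2004, 7, 14, 8, 0),   # 2 hours old
    ("alice", "<new@example.com>"): datetime(2004, 7, 14, 9, 30),  # 30 minutes old
}

remaining = purge_expired(rows, now=datetime(2004, 7, 14, 10, 0))
```

Once a row has been purged, a late copy of the same message no longer matches anything in the table and is delivered again.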

There is still a chance that you will get duplicate email, for example when delivery of one copy is delayed, in the following cases:

1) If either the Internet Message Id or the submit time is different on the two messages, the second message will not be treated as a duplicate.

2) If both are the same, but the time interval between the arrival of the two messages is greater than the tracking period (1 hour by default), the store cleanup task will already have deleted the original entry in the DeliveredTo table and the user will get duplicates.

3) If the user's mailbox is moved. The DeliveredTo table is per database and the information is not moved when the user's mailbox is moved.

4) In older versions of Exchange we had a problem where duplicates would occur when a message was sent using OWA to a user and to a DL containing that user. When the message was submitted, the store would stamp an Internet Message Id on the outgoing message. However, OWA submits messages with native MIME, and the Internet Message Id stamped by the store on submission did not update the MIME Message Id header, so the MAPI message was out of sync with the native MIME. The message would then be bifurcated by Transport, resulting in messages with different Internet Message Ids and therefore duplicates. We changed this in Exchange 2003 so that the store only stamps the Internet Message Id on a message if it detects that the MIME has to be regenerated or if it is a pure MAPI message.

- Jaya Matthew

Comments (4)
  1. Christian says:

    What is this good for?

    If a SMTP-server on the way to me duplicates a message, then I want to see that.

    Especially if someone uses the same message-id twice.

    I’m frightened by this: I don’t want Exchange to drop mail.

    How can it be switched off completely?

  2. davidee says:

    In your example "a SMTP-server on the way to me duplicates a message" each message would have an altogether different message ID once it leaves that smtp relay, which would allow delivery into your exchange environment as it should be seen as a unique mail message

  3. Christian says:

    But what is this duplicate-filtering feature good for? Under which circumstances do duplicates normally happen?

    And isn’t it a bit dangerous to rely on the message-id only? I know that Cyrus IMAP does the same (part of the reason why I dropped it).

    I think it should compare the contents of the message or a hash.

  4. jaya says:

    Duplicates may occur in Exchange when a message is sent to a user and to a DL containing that user. This usually happens if the DL needs to be expanded on a different server or if it is a hidden DL.

    Also there is no way to turn off duplicate detection.

Comments are closed.
