This is Jonathan Stephens from the Directory Services team, and I wanted to share with you a recent interoperability issue I encountered. An admin had set up an Apache web server with the OpenSSL mod for SSL/TLS support. Users were able to connect to the secure web site using Firefox, but when they tried to use Internet Explorer the connection failed with the following error: The page cannot be displayed. We were asked to investigate what was happening and fix it if possible.
When connecting to an SSL-enabled web site with Internet Explorer, the client and server must negotiate an SSL session during a process called the SSL (or TLS) Handshake. The client and server exchange what are called records, each record containing information relevant to a step in the negotiation process. Describing the entire Handshake process is beyond the scope of this post, but you can find more information here.
Note: SSL 3.0 is a proprietary protocol developed by Netscape Communications. TLS 1.0 is an Internet Standard (RFC 2246) based upon that proprietary protocol. Functionally, there is little difference between SSL 3.0 and TLS 1.0, and for the purposes of this discussion the two are identical.
As part of the handshake process, the server sends its list of trusted root certificates to the client in the form of a non-encrypted record. This is done so that if the server requires that the client have a digital certificate for authentication, the client is able to select one that will chain up to a root certificate trusted by the server. While there is no defined limit on the number of root certificates that can be in this list, there is a limit on the size of the records exchanged between the client and the server. RFC 2246 defines this limit as 16,384 bytes.
So how does the Handshake protocol handle those scenarios where the list of trusted root certificates exceeds 16,384 bytes? RFC 2246 describes a process called record fragmentation, where any data that would exceed the 16KB record limit is split across multiple fragments. These fragments must be merged into one record by the client in order to retrieve the data.
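To make the idea concrete, here is a minimal Python sketch of fragmentation and reassembly. It is only an illustration, not how any particular SSL/TLS library implements it: the function names fragment and coalesce are my own, the 16,537-byte message is a made-up example, and the sketch ignores the record headers that a real implementation adds to each fragment.

```python
# Illustrative sketch only: names and sizes are assumptions, not a real TLS API.
MAX_FRAGMENT = 2 ** 14  # 16,384 bytes, the record payload limit in RFC 2246


def fragment(message: bytes, limit: int = MAX_FRAGMENT) -> list:
    """Split one oversized message into record-sized fragments, in order."""
    return [message[i:i + limit] for i in range(0, len(message), limit)]


def coalesce(fragments: list) -> bytes:
    """Rejoin the fragments in order, as the receiving peer must do."""
    return b"".join(fragments)


# A hypothetical message that is 153 bytes over the single-record limit.
message = bytes(16_537)
parts = fragment(message)
sizes = [len(p) for p in parts]  # [16384, 153]
restored = coalesce(parts)       # identical to the original message
```

A receiver that cannot perform the coalesce step sees only the first 16,384 bytes as a complete record and has no way to recover the rest of the data.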
Let’s set that aside for a moment and talk briefly about SSL/TLS in Windows. The SSL/TLS protocol is implemented as a security package in Windows; this package is called SChannel, and the associated library is schannel.dll. A Windows application that needs to support SSL/TLS as either a client or a server can use Windows-integrated authentication to leverage the capabilities of the SChannel security package. Two such applications are Internet Explorer (IE) and Internet Information Services (IIS), the Windows web server. Other non-Microsoft products may have their own implementations of SSL/TLS and so would not use SChannel.
This is precisely what our admin discovered while he was investigating this issue. He found that while users were unable to connect to the web site with IE, they could connect successfully with a third-party browser, Firefox.
To understand exactly what was happening, we took a network trace between IE and the Apache server. In that trace, we could clearly see that the list of root certificates sent to the client by the server was split across two records. The first was 16,384 bytes and the second was 153 bytes.
The problem here is that SChannel does not support record fragmentation. When receiving data split across multiple records, SChannel is not able to merge the data, so when record fragmentation is encountered the Handshake fails, resulting in a failed connection. On the server side, for example with IIS, SChannel will truncate data above 16,384 bytes in order to fit it into one record. There are other implementations of SSL/TLS that do support record fragmentation, such as OpenSSL and Firefox. This explains why this problem wasn’t seen when Firefox was used.
In the vast majority of cases, this does not present a problem. Most of the record data exchanged during the Handshake process is considerably smaller than the 16KB limit defined in the RFC. The potential exception to this is the trusted root certificate list record. If a server trusts more than approximately 100 root certificates, the root certificate list could exceed the 16KB limit. Please note the use of the word “approximately”. The actual number of root certificates can vary from environment to environment, and should be determined by testing. Microsoft cannot provide a precise number because the limitation is based solely on the total size of the data in the record rather than the number of entries, and the entries can vary in length.
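A rough size estimate shows why the number is only approximate. The sketch below assumes the list travels in the TLS 1.0 CertificateRequest message as described in RFC 2246: a 4-byte handshake header, a length-prefixed list of certificate types, and a length-prefixed list of distinguished names, each name carrying its own 2-byte length prefix. The function names and the name lengths (120 and 170 bytes) are hypothetical values chosen for illustration; real values depend on the certificates in your environment.

```python
# Rough estimate only: layout assumed from the TLS 1.0 CertificateRequest
# message in RFC 2246; the name lengths used below are hypothetical.
MAX_FRAGMENT = 2 ** 14  # 16,384 bytes


def certificate_request_size(dn_lengths, cert_type_count=1):
    """Estimate the size in bytes of a CertificateRequest handshake message."""
    handshake_header = 4                          # 1-byte type + 3-byte length
    cert_types = 1 + cert_type_count              # 1-byte length + 1 byte per type
    ca_list = 2 + sum(2 + n for n in dn_lengths)  # 2-byte length prefixes
    return handshake_header + cert_types + ca_list


def fits_in_one_record(dn_lengths, cert_type_count=1):
    """True if the estimated message fits in a single 16,384-byte record."""
    return certificate_request_size(dn_lengths, cert_type_count) <= MAX_FRAGMENT


short_names = [120] * 100  # 100 roots with 120-byte names: 12,208 bytes
long_names = [170] * 100   # 100 roots with 170-byte names: 17,208 bytes

print(fits_in_one_record(short_names))  # True
print(fits_in_one_record(long_names))   # False
```

With the same count of 100 roots, one list fits and the other does not, which is why the practical limit has to be found by testing rather than by counting entries.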
In the case of IIS, where SChannel is leveraged for the server side of the Handshake, SChannel will truncate the list of trusted root certificates as I mentioned above. This behavior is described in the following KB article:
933430 Clients cannot make connections if you require client certificates on a Web site or if you use IAS in Windows Server 2003
The above article describes a 12,288-byte limit for the root certificate list. The hotfix described in that article simply increased that limit to the full 16,384-byte limit defined by the RFC. In those cases, however, where the root certificate list exceeds 16KB, the list will still be truncated by SChannel before the record is sent from the server to the client.
When using IIS, the above article describes some specific steps an administrator can take to work around this limitation in SChannel. In cases such as this one, where the web server supports fragmentation but the client does not, the only option is to reduce the number of trusted root certificates to get the size under the 16KB limit for a single record.
In some environments, the lack of support for record fragmentation in SChannel can lead to interoperability problems – failed connections, invalid client certificates, etc. Identifying problems associated with fragmentation is pretty simple; analyzing a brief network trace is usually sufficient to pinpoint instances of fragmentation. As I stated earlier, we usually see this problem in relation to the number of root certificates that are trusted by the server, and currently, the only way we have to resolve this issue is to remove unneeded roots from the server side. We hope to eliminate this problem completely in a future version of Windows.
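If you would rather check a capture programmatically than by eye, the following Python sketch shows one way to flag fragmented handshake messages. It assumes you have already extracted the raw bytes the server sent on the TCP connection (the extraction step is not shown), and it relies only on the record and handshake header layouts from RFC 2246: a 5-byte record header (type, version, length) and a 4-byte handshake header (type, 3-byte length). The function names are my own, and the sample stream at the end is synthetic.

```python
import struct

HANDSHAKE = 22           # record content type for handshake messages
CHANGE_CIPHER_SPEC = 20  # later handshake records are encrypted


def iter_records(stream: bytes):
    """Yield (content_type, payload) for each complete record in stream."""
    offset = 0
    while offset + 5 <= len(stream):
        content_type, _version, length = struct.unpack_from("!BHH", stream, offset)
        payload = stream[offset + 5:offset + 5 + length]
        if len(payload) < length:
            break  # capture ends in the middle of a record
        yield content_type, payload
        offset += 5 + length


def fragmented_handshake_messages(stream: bytes):
    """Return (msg_type, body_length, record_count) for each cleartext
    handshake message that spans more than one record."""
    results = []
    buffer = b""
    records_in_buffer = 0
    for content_type, payload in iter_records(stream):
        if content_type == CHANGE_CIPHER_SPEC:
            break  # stop before the encrypted part of the handshake
        if content_type != HANDSHAKE:
            continue
        buffer += payload
        records_in_buffer += 1
        while len(buffer) >= 4:
            msg_type = buffer[0]
            body_length = int.from_bytes(buffer[1:4], "big")
            if len(buffer) < 4 + body_length:
                break  # the message continues in the next record
            if records_in_buffer > 1:
                results.append((msg_type, body_length, records_in_buffer))
            buffer = buffer[4 + body_length:]
            records_in_buffer = 1 if buffer else 0
    return results


def make_record(payload: bytes) -> bytes:
    """Wrap a payload in a TLS 1.0 handshake record header (test helper)."""
    return struct.pack("!BHH", HANDSHAKE, 0x0301, len(payload)) + payload


# Synthetic example: a 16,537-byte CertificateRequest (handshake type 13)
# split across records of 16,384 and 153 bytes.
message = bytes([13]) + (16_533).to_bytes(3, "big") + bytes(16_533)
sample = make_record(message[:16_384]) + make_record(message[16_384:])
print(fragmented_handshake_messages(sample))  # [(13, 16533, 2)]
```

Any entry in the result means a handshake message was split across records, which is the condition an SChannel client affected by this issue cannot handle.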
UPDATE 8/25/2010: Someone pointed out that I should update this blog post to make clear that the “future version of Windows” referenced above is Windows 7. Sort of. In order to support interoperability with other implementations of SSL/TLS, Windows 7 and Windows Server 2008 R2 both support coalescing fragmented SSL/TLS records on the receiving side, but Windows does not support fragmenting records on the sending side. Any outbound record that exceeds 16KB will still be truncated as described above.
– Jonathan Stephens