HTTPS Client Certificate Request freezes when the Server is handling a large PUT/POST Request
There is a class of problems that may occur when using client-side certificates in HTTPS.
Sometimes, the server’s request for a client certificate freezes (until a timeout of about two minutes) while the server is processing a PUT/POST request with a large payload (e.g., >40KB).
Ideally, the server should request the client certificate before any large request exchange.
Otherwise, the server should request the client certificate immediately after either:
- a request has been completely received, or
- a request has been responded to.
If the server does neither, the large payload fills the network buffers, which cannot be emptied until the certificate is received and everything is processed. This leads to deadlock if the server issues a synchronous call for the client certificate. Although such a call is not illegal, it is what causes the problem. Furthermore, it represents a trivial DoS vector against any such server.
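The buffer-filling effect can be reproduced in miniature. The following is only a toy sketch, not the actual TLS exchange: it uses a local socket pair in place of a real client/server connection, and the names fill_until_blocked, try_send, drain, and demo are made up for illustration. It shows that once the reader stops reading, the writer can queue nothing further (so a certificate message would be stuck behind the payload) until the reader drains the buffers.

```python
import select
import socket

CHUNK = b"x" * 4096  # stand-in for a piece of a large PUT/POST body


def fill_until_blocked(sock, limit=64 * 1024 * 1024):
    """Send body bytes until the kernel refuses more; return the number queued."""
    queued = 0
    while queued < limit:
        try:
            queued += sock.send(CHUNK)
        except BlockingIOError:  # buffers are full, nothing more fits
            break
    return queued


def try_send(sock, data):
    """Attempt a non-blocking send; report whether any bytes were accepted."""
    try:
        return sock.send(data) > 0
    except BlockingIOError:
        return False


def drain(sock):
    """Read everything currently buffered; return the number of bytes read."""
    drained = 0
    while True:
        try:
            data = sock.recv(65536)
        except BlockingIOError:
            return drained
        if not data:
            return drained
        drained += len(data)


def demo():
    """Fill the buffers, show the sender is stuck, then drain and retry."""
    sender, receiver = socket.socketpair()
    sender.setblocking(False)
    receiver.setblocking(False)
    try:
        queued = fill_until_blocked(sender)       # reader is not reading
        stuck = not try_send(sender, CHUNK)       # nothing more can be queued
        drained = drain(receiver)                 # reader finally empties the buffers
        select.select([], [sender], [], 1.0)      # wait until writable again
        unstuck = try_send(sender, b"certificate message")
        return queued, stuck, drained, unstuck
    finally:
        sender.close()
        receiver.close()
```

In the real problem the reader is the server blocked in a synchronous certificate call, so the draining step never happens and both sides wait until the timeout.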
This behavior may depend on the component sitting directly above http.sys. IIS, for example, tries to read as much of the entity body as possible before requesting the client certificate.
There are several alternatives for fixing this issue, but only the first one listed below is deterministic.
By Modifying Only the Server side (when the client cannot be modified):
- (recommended) Set “client certificate required” on the SSL binding so that the client certificate is requested at SSL/TLS connection time, before any HTTP request exchange. This forces a client certificate to be requested for every connection on that binding. Depending on your configuration, you might need a dedicated VIP and/or SSL SNI name for this communication. This requires no server code changes, only a configuration change via “netsh http” on the SSL binding: clientcertnegotiation=enable
Note: If the server is IIS-based, the change needs to be made through IIS. Otherwise, since IIS keeps its own configuration, it may overwrite any changes made directly to http.sys.
- If the server sees that this is a PUT/POST request, it needs to ensure that the server’s TCP buffers have enough space for the client certificate when it arrives. This leads to strategies such as:
  - reading as much of the entity body as possible before requesting the client certificate with an asynchronous call, or
  - modifying the web server app so that it asynchronously pulls the request body while it waits for client certificate retrieval to finish, and cancels the request/connection if too much entity body has been pulled (e.g., several MB) and retrieval has still not finished, or
  - even better, issuing the asynchronous call for the client certificate as early as possible and draining as much of the entity body as possible while waiting for the client certificate to arrive.
This approach definitely requires server code changes. To increase the chances of it working, *all* relevant buffers on the server, on the client, and in between need to have enough space so that the client certificate does not get stuck behind large payloads. Modifying the client to drain buffers (in addition to the server) therefore helps but is not sufficient, because intermediate buffers along the way may also pose a problem (e.g., bufferbloat). This is not a deterministic method.
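The "request early, drain while waiting, cancel past a cap" strategy can be sketched in a few lines. This is a minimal illustration using Python's asyncio, not http.sys or IIS code: read_chunk and get_certificate are hypothetical coroutine functions standing in for whatever the real server stack provides, and make_fake_connection is a toy connection in which the certificate only arrives after some of the body has been drained.

```python
import asyncio


class BodyTooLarge(Exception):
    """The drain cap was reached before the certificate arrived."""


async def drain_until_certificate(read_chunk, get_certificate,
                                  max_buffered=4 * 1024 * 1024):
    """Request the certificate first, then drain the body while it is pending.

    read_chunk and get_certificate are hypothetical coroutine functions;
    read_chunk returns b"" at end of body.
    """
    cert_task = asyncio.ensure_future(get_certificate())  # issued as early as possible
    buffered, size = [], 0
    try:
        while not cert_task.done():
            chunk = await read_chunk()
            if not chunk:              # whole body has been received
                break
            buffered.append(chunk)
            size += len(chunk)
            if size > max_buffered:
                raise BodyTooLarge(f"{size} bytes drained, certificate still pending")
        certificate = await cert_task
    except BaseException:
        cert_task.cancel()             # give up; the caller closes the connection
        raise
    return certificate, b"".join(buffered)


def make_fake_connection(chunks, cert_after):
    """Toy connection: the certificate 'arrives' once cert_after chunks are drained."""
    arrived = asyncio.Event()
    remaining = list(chunks)
    drained = 0

    async def read_chunk():
        nonlocal drained
        await asyncio.sleep(0)         # yield to other tasks, as real I/O would
        if not remaining:
            arrived.set()
            return b""
        drained += 1
        if drained >= cert_after:
            arrived.set()
        return remaining.pop(0)

    async def get_certificate():
        await arrived.wait()
        return "client-cert"

    return read_chunk, get_certificate


async def demo(max_buffered):
    """Run the drain loop against the toy connection (ten 1 KB chunks)."""
    read_chunk, get_certificate = make_fake_connection([b"x" * 1024] * 10, cert_after=3)
    return await drain_until_certificate(read_chunk, get_certificate, max_buffered)
```

Calling asyncio.run(demo(max_buffered=1 << 20)) returns the certificate together with the body bytes drained so far; the caller would then keep reading the rest of the body as usual. With a small cap such as max_buffered=2048, the same call raises BodyTooLarge, which corresponds to cancelling the request/connection.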
By Modifying the Client side in addition to the Server side:
- (recommended) Use requests such as GET or HEAD to prime the connection so that the server can request the certificate without being blocked on receiving an entity body. This costs an extra round trip for the priming request, but if client certificates are involved the application is already making some latency tradeoffs. This is not deterministic, because the immediately following “real” request may use a different connection, although it usually reuses the “primed” connection from the connection pool. This requires client-side changes and also requires that the server expose such an endpoint.
- Use the 100 Continue status code (requires the client to send an “Expect: 100-continue” header). Supporting this may require both client and server changes. Furthermore, it is not a deterministic mechanism.
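The connection-priming approach from the first bullet above might look roughly like this on a Python client using the standard http.client module. It is a sketch under assumptions: the helper names open_connection and prime_then_send are made up for illustration, and the priming path is a placeholder for whatever body-less endpoint the server actually exposes.

```python
import http.client
import ssl


def open_connection(host, certfile, keyfile):
    """Open an HTTPS connection that can present a client certificate."""
    context = ssl.create_default_context()
    context.load_cert_chain(certfile, keyfile)
    return http.client.HTTPSConnection(host, context=context)


def prime_then_send(conn, method, path, body, prime_path="/"):
    """Send a body-less HEAD first, then the real request on the same connection object."""
    conn.request("HEAD", prime_path)
    conn.getresponse().read()      # finish the priming exchange before reusing conn
    conn.request(method, path, body=body)
    response = conn.getresponse()
    return response.status, response.read()
```

A caller would do something like conn = open_connection(host, certfile, keyfile) followed by prime_then_send(conn, "PUT", path, payload). Note that http.client transparently opens a new connection if the server closed the primed one, in which case the large request goes out on an unprimed connection; this is the non-determinism described above.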