How to Get SQL to Accept the Cert or A day (or two) in the life of an MSCOM Debug Engineer – Part 2

(Mr. Cert’s side of the story)

Some days, it just doesn't pay to get out of bed. Nevertheless the day described by Mr. Debug in A day (or two) in the life of an MSCOM Debug Engineer was not one of them for me. If you already had the chance to read his blog you already know the scoop. If not - click on the link above and read it – it is a great blog. So getting to the point – yes you can indeed replace the default self-signed certificate that SQL 2005 is using with a certificate of your choice and here is how.

As expected the first step is to obtain the certificate. A self-signed certificate is a good option if you just need it for testing for example. And, a great tool that can be used to create such self-signed certs is MakeCert.exe (part of Microsoft .NET Framework SDK). Executing the following from the command line will create a self-signed cert issued to sqlserver.ms.com in the local machine certificate store valid until 06/01/2009:

Makecert.exe -r -pe -n CN="sqlserver.ms.com" -eku 1.3.6.1.5.5.7.3.1 -ss my -sr localmachine -sky exchange -sp "Microsoft RSA SChannel Cryptographic Provider" -sy 12 -e 06/01/2009

In case you need a real production cert which is trusted by other clients you will need to request one from a trusted Enterprise or third-party Certification Authority (step-by-step instructions here https://support.microsoft.com/kb/298805 ). Note that creating the Certificate Signing Request (CSR) requires IIS on the server requesting the certificate which is not something that one can usually find on a SQL database server. How do you get around this issue? Relatively easy - use a separate web server to obtain the cert; export it to a .pfx file (https://support.microsoft.com/kb/232136/EN-US) and import it to the Local Machine certificate store on the database server (see first half of https://support.microsoft.com/kb/232137). In both cases you will need to specify the CN or “Issued To” portion of the certificate. When using a standalone server this should be the Fully Qualified Domain Name (FQDN) of the server. For SQL server clusters (Windows Load Balancing or a failover clusters) the CN (“Issued To”) portion should match the FQDN of the VIP (virtual IP address).

If you followed the process above at this point you should have a valid x.509 certificate enabled for server authentication in the local machine cert store of the database server. You can verify this using Microsoft Management Console (MMC) Certificate snap-in. Double-click the certificate name, and then select Details. Click the Enhanced Key Usage property, and then verify that the value looks like this:

 Server Authentication(1.3.6.1.5.5.7.3.1).

Now we can get to the fun part – configuring the SQL Server to use the cert.

The easy way

If you are using a standalone SQL server and have the certificate issued to the FQDN of the server you can use the SQL Server Configuration Manager to configure it. Open the Configuration Manager-extend SQL Server 2005 Network Configuration – Right-Click Protocols for MSSQLSERVER – select Properties and click on Certificate tab. In the drop down you should be able to see and select the certificate. Next you will get the warning stating that the changes will be saved but not take effect until the service is stopped and restarted. Logically – the next step is to try to restart the service. If the service starts successfully your SQL server is up and running and using the new certificate. The keyword here is “if” and the reason is that back that day when we were “trying to get SQL to accept the cert” the SQL Server service did not start for us. It stopped just fine, but it failed to start with the following error:

“Unable to load user-specified certificate. The server will not accept a connection. You should verify that the certificate is correctly installed. See "Configuring Certificate for Use by SSL" in Books Online.”

Hmmm… this seems broken. I was quite positive that nothing was wrong with this certificate since I just configured it but obviously something’s missing. After taking another (closer) look at the server I found the missing piece - the service account that was used to run the SQL Server service did not have access to the private key of the certificate. By default access to the private key of a certificate imported to the local machine store is granted only for BUILTIN\Administrators and NT AUTHORITY\SYSTEM. The account wasn’t Local System and it was not a member of the local admin group either – hence the failure. Granting access to the private key is really straight forward if you have the right tools (WinHttpCertCfg.exe in this particular case). Executing the following from command line fixed the permission issue and the SQL server service was able to start successfully after that.

WinHttpCertCfg.exe -g -c LOCAL_MACHINE\MY -s “sqlserver.ms.com" -a "domain\service_account"

 So far we have SQL working with a cert issued to the FQDN of the server with SQL server service running under a non-admin service account (this could be the built-in Network Service as well). Great! How about using it on a cluster with certificate issued to the FQDN of the cluster VIP? Well, this will require configuration option number two also known as:

The not-so-easy way

Next we got another cert issued to the FQDN of the VIP; imported it and granted permissions pretty much the same way as above. As you already know (you already read part one – right?) we had a hard time configuring the Certificate portion of the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.x\MSSQLServer\SuperSocketNetLib registry key correctly. Reading KB articles and trying thumbprint; serial number; friendly name; the entire public key (1024 bytes) didn’t help much so I got the idea to take a step back and see what is inside that reg key when having the SQL server configured “the easy way” (using the UI to select the cert). It is the thumbprint alright, but with one caveat – NO extra spaces. Running certutil.exe –store MY will list all certificates from the local machine store and some of the certificate details including the thumbprint (a.k.a. Cert Hash) which looks like this:

Cert Hash(sha1): 8a 02 6b 32 a3 56 fa b9 41 30 32 df 70 03 3d 77 74 76 ed 14

 So the string that needs to be present in the registry is this very same hash minus the spaces:

8a026b32a356fab9413032df70033d777476ed14

Finally (after couple of hours) we managed to convince SQL to accept our updated cert. At the same time Mr. Debug finds out that the original issue was because of name resolution and not because of the cert after all.

So, what’s the bottom line?

 We learned that SQL client really likes to do reverse DNS lookup and also how to configure SQL servers to use certificates issued to the cluster’s FQDN…not a waste of time after all. Well, at least that’s my story and I’m sticking to it!