Basics and History of PKI

One of the most commonly used and most misunderstood concepts in IT that I have encountered is Public Key Infrastructure, also known as PKI. PKI is one of the most effective methods available today for ensuring the confidentiality and integrity of data; however, an improper implementation of PKI can severely damage the availability of a service. This post begins a series on PKI and its common uses, ultimately leading up to advanced topics including Active Directory authentication with an external certification authority (CA).

The History of PKI

PKI is based on a branch of mathematics called asymmetric cryptography, first described publicly by two Stanford researchers, Whitfield Diffie and Martin Hellman, in 1976 (ref: https://purl.umn.edu/107353). They were faced with two major problems in cryptography: key distribution and identification. In response, Diffie and Hellman proposed the concept of a key pair in which data encrypted with one key can only be decrypted by its paired key, and they described a way for two parties to agree on a shared secret over an open channel using modular arithmetic with a large prime number. Because the underlying mathematical problem (the discrete logarithm) is so hard to reverse, it is infeasible to recover the secret values when the numbers involved are of any significant size. This method eventually became known as the Diffie-Hellman key exchange (ref: https://cr.yp.to/bib/1988/diffie.pdf, patent: https://www.docstoc.com/docs/31206972/Public-Key-Cryptographic-Apparatus-And-Method---Patent-4218582).
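To make the key exchange idea concrete, here is a minimal sketch in Python. The prime, generator, and secret values are tiny numbers I picked purely for illustration (they are assumptions, not values from the original paper); real exchanges use primes thousands of bits long.

```python
# Toy Diffie-Hellman exchange with deliberately tiny, insecure numbers.
p = 23   # public prime modulus (assumed value for illustration)
g = 5    # public generator (assumed value for illustration)

alice_secret = 6    # chosen privately by Alice, never transmitted
bob_secret = 15     # chosen privately by Bob, never transmitted

# Each side publishes g^secret mod p over the open channel.
alice_public = pow(g, alice_secret, p)
bob_public = pow(g, bob_secret, p)

# Each side combines its own secret with the other side's public value.
alice_shared = pow(bob_public, alice_secret, p)
bob_shared = pow(alice_public, bob_secret, p)

print(alice_public, bob_public)      # 8 19
print(alice_shared == bob_shared)    # True
```

An eavesdropper sees only p, g, and the two public values. With numbers this small the secrets can be found by trial and error, which is exactly why the real thing depends on very large primes.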

Shortly afterward, another group of researchers, Ron Rivest, Adi Shamir, and Leonard Adleman, built on this concept and in 1978 published the RSA algorithm (named for their initials), a practical public-key system for both encryption and digital signatures whose security rests on the difficulty of factoring the product of two large prime numbers (ref: https://people.csail.mit.edu/rivest/Rsapaper.pdf, patent: https://www.patents.com/us-4405829.html). Though Diffie and Hellman introduced the concept of asymmetric cryptosystems first, RSA monetized the concept more effectively, ultimately resulting in the company RSA Security, which was later acquired by EMC.

PKI as we know it took shape around the X.509 certificate standard, first published in 1988, and was brought to the Internet in 1993 with RFC 1422 (ref: https://tools.ietf.org/html/rfc1422). Together these established the concepts of certification authorities, certificate revocation lists (CRLs), and certificate trust that provide the framework for the more advanced PKI-based technologies in use today.

Asymmetric Cryptography

In asymmetric cryptography, two mathematically related keys are generated: the public key and the private key. The relationship between these keys is such that any data encrypted by one key can only be decrypted by the other key in the pair. To clarify, the public key cannot decrypt data encrypted by the public key, and the private key cannot decrypt data encrypted by the private key.
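The relationship can be demonstrated with a small "textbook RSA" sketch. The primes, exponent, and message below are toy values chosen for illustration (my assumptions), and the helper name apply_key is hypothetical. Real keys are vastly larger and use padding, so treat this only as a picture of the math.

```python
# Textbook RSA with tiny primes: illustrative only, never use for real data.
p, q = 61, 53                 # assumed toy primes
n = p * q                     # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent (needs Python 3.8+)

public_key = (e, n)
private_key = (d, n)

def apply_key(value, key):
    """Raise value to the key's exponent, modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65
ciphertext = apply_key(message, public_key)

# Only the matching private key undoes the public-key operation.
recovered_with_private = apply_key(ciphertext, private_key)
# Applying the public key a second time does not recover the message.
recovered_with_public = apply_key(ciphertext, public_key)

print(recovered_with_private == message)   # True
print(recovered_with_public == message)    # False
```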

In this relationship, the private key establishes the identity of the certified asset, while the public key is widely distributed so that others can validate that identity. An important thing to note at this point is that compromise of the private key must immediately result in invalidation of this relationship (the means to accomplish this will be discussed later).

The relationship between these keys enables two important cryptographic capabilities based on use: confidentiality and integrity.

Confidentiality

PKI can be used to ensure the confidentiality of data in transit through use of the recipient's public key. Two common examples of this are Secure Sockets Layer (SSL) encryption and e-mail encryption.

Secure Sockets Layer (SSL)

In SSL encryption, a Web server is configured to use a PKI certificate to establish a trust and cryptographic relationship between clients and the server. In this design, the SSL client requests the server's certificate (containing its public key) and uses that key to encrypt a symmetric key that both parties will share for further communications. Because the symmetric key was encrypted with the server's public key, the client can be reasonably assured that only a server holding the matching private key will be able to decrypt it, which ensures confidentiality. There is much more to this topic than is covered in this section, some of which will be discussed in a later post.
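The key-sharing step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not how a real SSL library is written: the key pair is unpadded toy RSA built from two known Mersenne primes, the certificate is a plain dictionary with a made-up subject name, and the function names are hypothetical. Newer protocol versions also tend to favor Diffie-Hellman style key agreement over this kind of key transport.

```python
import secrets

# Toy RSA key pair for the "server", built from two known Mersenne primes.
# Far too small, and unpadded, for real use.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))   # needs Python 3.8+

# Hypothetical stand-in for a certificate: a name plus the public key.
server_certificate = {"subject": "www.example.test", "public_key": (e, n)}
server_private_key = (d, n)

def client_wrap_session_key(certificate):
    """Client side: create a symmetric key and encrypt it to the server."""
    session_key = secrets.token_bytes(16)
    exponent, modulus = certificate["public_key"]
    wrapped = pow(int.from_bytes(session_key, "big"), exponent, modulus)
    return session_key, wrapped

def server_unwrap_session_key(wrapped, private_key):
    """Server side: recover the symmetric key with the private key."""
    exponent, modulus = private_key
    return pow(wrapped, exponent, modulus).to_bytes(16, "big")

client_key, wrapped_key = client_wrap_session_key(server_certificate)
server_key = server_unwrap_session_key(wrapped_key, server_private_key)

print(client_key == server_key)   # True
```

Only wrapped_key crosses the network. After this step both sides hold the same 16-byte key and can switch to much faster symmetric encryption.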

E-mail Encryption

Similar to SSL encryption, e-mail encryption leverages the recipient's public key (available in their certificate) to ensure that only the intended recipients of the e-mail will be able to read the message. In this form of encryption, the e-mail message itself is encrypted using symmetric encryption. The symmetric key is then encrypted separately with each recipient's public key, so that only the intended recipients are able to view the message contents (due to their possession of the corresponding private key). This concept will be discussed in depth in a later post.
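Here is a rough sketch of that "encrypt once, wrap the key per recipient" pattern. Everything in it is an assumption made for illustration: the addresses are invented, the key pairs are unpadded toy RSA built from known Mersenne primes, and the symmetric cipher is a simple SHA-256 keystream standing in for a real cipher such as AES. Real secure e-mail formats are considerably more involved.

```python
import hashlib
import secrets

E = 65537  # public exponent shared by the toy key pairs

def make_toy_keypair(p, q):
    """Return (public_key, private_key) for unpadded toy RSA."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))   # needs Python 3.8+
    return (E, n), (d, n)

def keystream_xor(key, data):
    """Toy symmetric cipher: XOR data with a SHA-256 counter keystream.
    The same call both encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def encrypt_mail(body, recipient_keys):
    """Encrypt the body once, then wrap the message key for each recipient."""
    message_key = secrets.token_bytes(16)
    encrypted_body = keystream_xor(message_key, body)
    key_int = int.from_bytes(message_key, "big")
    wrapped = {addr: pow(key_int, e, n) for addr, (e, n) in recipient_keys.items()}
    return encrypted_body, wrapped

def decrypt_mail(encrypted_body, wrapped_key, private_key):
    """Unwrap the message key with a private key, then decrypt the body."""
    d, n = private_key
    message_key = pow(wrapped_key, d, n).to_bytes(16, "big")
    return keystream_xor(message_key, encrypted_body)

alice_pub, alice_priv = make_toy_keypair(2**61 - 1, 2**89 - 1)
bob_pub, bob_priv = make_toy_keypair(2**107 - 1, 2**127 - 1)
recipients = {"alice@example.test": alice_pub, "bob@example.test": bob_pub}

body, wrapped = encrypt_mail(b"Quarterly numbers attached.", recipients)
print(decrypt_mail(body, wrapped["bob@example.test"], bob_priv))
```

The message body is encrypted only once regardless of how many recipients there are; only the small message key is repeated, once per recipient.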

Integrity, Authentication, and Non-Repudiation

Another common use of PKI is to ensure integrity and authenticity and to prevent repudiation (refusal to acknowledge an action). In these operations, the private key is used to guarantee the identity of the sender through cryptography. It is important to make the distinction that cryptographic operations performed with the private key usually do not provide confidentiality (rather, obfuscation at best) because the public key is intentionally widely distributed. One common example of this is the use of digital signatures.

Digital Signatures

You may have received a digitally signed e-mail (identified in Outlook by a red ribbon) or have seen a warning message that a driver or piece of software was not digitally signed. Both of these scenarios leverage the relationship between the public and private key to certify that the digitally signed information (the e-mail, software, or driver) was produced by a trusted source and has not changed since it was signed.

Digital signatures begin by taking the data to be signed (either the e-mail contents or the compiled code in the above examples) and passing it through a hashing algorithm. Hashing algorithms are special formulas that produce a fixed-length output that is, for practical purposes, unique to a specific input (for more information, research the SHA family of hash functions, along with keyed constructions built on similar ideas such as HMAC and CBC-MAC). In addition, an effective hashing algorithm will produce a completely different result if any part of the source message changes. Finally, it should not be feasible to reverse-engineer the resulting hash to recover the original message.
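These properties are easy to observe with Python's built-in hashlib module. The messages below are made up for illustration, and SHA-256 is simply one widely used hashing algorithm that I chose for the example.

```python
import hashlib

original = b"Pay Alice 100 dollars"
tampered = b"Pay Alice 900 dollars"   # a single character changed

digest_original = hashlib.sha256(original).hexdigest()
digest_tampered = hashlib.sha256(tampered).hexdigest()
digest_long = hashlib.sha256(original * 1000).hexdigest()

# The output length is fixed, no matter how long the input is.
print(len(digest_original), len(digest_long))   # 64 64

# A small change to the input produces a different hash.
print(digest_original == digest_tampered)       # False

# Count how many of the 64 hex digits changed between the two hashes.
differing = sum(a != b for a, b in zip(digest_original, digest_tampered))
print(differing, "of", len(digest_original), "hex digits differ")
```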

The hash alone only proves that the message has not changed. To prove the identity of the sender, this hash is then encrypted using the sender's private key, resulting in a digital signature. The recipient can use this digital signature to verify the identity of the sender (only the sender's public key will decrypt the signature) and to validate the integrity of the message, by running the same hashing algorithm over the received message and comparing the result with the hash recovered from the signature. As with the previous topics, this will be discussed in depth in an upcoming post.
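The sign-and-verify flow can be sketched as below. This is a toy illustration under loud assumptions: the key pair is unpadded RSA built from two known Mersenne primes, the hash is reduced modulo n only because the toy modulus is smaller than a SHA-256 digest, and the function names and message are hypothetical. Real signature schemes add padding and use much larger keys.

```python
import hashlib

# Toy signer key pair (unpadded RSA, insecure sizes).
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                            # public exponent, widely distributed
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (needs Python 3.8+)

def digest_as_int(message):
    """Hash the message and map the digest to an integer below n.
    The modulo step exists only because the toy modulus is small."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message, private_exponent):
    """Sender: 'encrypt' the hash with the private key."""
    return pow(digest_as_int(message), private_exponent, n)

def verify(message, signature, public_exponent):
    """Recipient: recover the hash with the public key and compare it
    with a freshly computed hash of the received message."""
    return pow(signature, public_exponent, n) == digest_as_int(message)

message = b"Driver build 1.2.3"
signature = sign(message, d)

print(verify(message, signature, e))           # True
print(verify(message + b"!", signature, e))    # False, the content changed
```

Notice that nothing here hides the message. Anyone holding the public key can check the signature, which is the point made earlier about private-key operations not providing confidentiality.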