If you tell enough stories, perhaps the moral will show up.

2007-04-21

Barefoot PKI 1: Asymmetric Encryption

This post is the first in a series. I want to create a "barefoot PKI" to record my own experience and share the lessons. I found I needed to know more about PKI when I saw a sudden flurry of certificate requirements. Outlook users wanting to engage in S/MIME email. The MS Ops Manager implementation authenticating non-domain machines with certificates. Sensitive machines needing certificates to communicate over IPsec. HTTPS-delivered applications needing server certificates...

In the past I'd said "We don't do certificates." A good response, if you're a simpleton, because this is tricky stuff -- not conceptually difficult, but hard to keep straight and get all the bits lined up. But it's not good enough any more. So this is the Security Stories guide to PKI, for simpletons. By an authentic, accredited, self-taught simpleton.

Even for simpletons though, this is bigger than a single post. So I'm going to break it up.

  1. Absolute minimum concepts for working with asymmetric encryption (Public/Private Key encryption), and some links to get more. (This post.)
  2. The X.509 Public Key infrastructure
  3. Simple PKI operations.
  4. Tools and techniques and implementation with OpenSSL

I'm going to use OpenSSL for the practical stuff. This is not really a political statement. I chose it because a) it's what I actually used when I realised I was in trouble, b) it exposes a little more of the gory details, particularly in terms of getting things working with Microsoft systems, and c) it's an easy install, which is what swung it for me -- I didn't feel that I was making ill-informed choices that would bind the firm into the future. I installed on Linux, but I'll be talking in terms of a Windows setup.

Asymmetric Encryption for Simpletons

Encryption used to be easy, and symmetric. Alice encrypted her plaintext message with a secret key (which is really just a string of bits) using a (hopefully) robust encryption algorithm. She sent the resulting unintelligible ciphertext to Bob. Bob knew the secret key and could run the encryption in reverse to yield the plaintext. This is the way it's always been -- the history of cryptography is mostly the history of better and clearer understanding of symmetric encryption.
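For the simpletons who like to see it run, here is a minimal sketch of the symmetric idea in Python. It is a toy, not a robust algorithm: the names keystream and xor_crypt are my own inventions, and stretching the key with SHA-256 is just a convenient assumption so the example needs nothing beyond the standard library. With XOR, "running the encryption in reverse" happens to be the same operation as encrypting.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Stretch the shared key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_crypt(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

secret_key = b"key of the day"       # Alice and Bob both hold this
plaintext = b"Meet me at noon"

ciphertext = xor_crypt(secret_key, plaintext)         # Alice's side
recovered = xor_crypt(secret_key, ciphertext)         # Bob's side
wrong_guess = xor_crypt(b"not the key", ciphertext)   # an eavesdropper guessing

print(ciphertext.hex())
print(recovered)
```

Everything hangs on both ends holding the same secret_key, which is exactly the difficulty that comes next.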

To benefit from symmetric encryption Alice and Bob need to

  1. Agree the encryption to be used. That's easy -- it's not a secret. Modern encryption doesn't rely on obscurity of the algorithm.
  2. Keep the secret key secret, to prevent anyone else reading the message
  3. But nonetheless share it between the two of them. That's difficult -- they need to have a means of communication which is secure so that the key won't leak. The historical solution has been a key of the day or the hour, in a code book. You don't need me to tell you how horrible that is, especially in a situation where Alice and Bob will never meet, or Alice is dealing with multiple mutually untrusting Bobs.

Public Key — asymmetric — encryption seems to have arrived in a bit of a rush between 1975 and 1980. There's still an encryption algorithm which uses a key. But there's no reverse process for decryption -- instead we use a key pair. The pair are chosen to be related by a property which at first sight seems magical: If you encrypt with the first key to get a ciphertext, you must encrypt again, using the same encryption algorithm with the second key, to get back to the original plaintext. And the same is true if you use the keys in the other order (though the intervening ciphertext would be different). It is spooky, but it does work and the smart types who fret about this sort of thing have not been able to find any conceptual weaknesses.
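A toy sketch of that property, assuming RSA as the algorithm (there are others) and the tiny textbook primes 61 and 53. Real keys are hundreds of digits long; these numbers are only small enough to follow by eye. The name apply_key is mine, and the three-argument pow with a negative exponent needs Python 3.8 or later.

```python
# Toy RSA numbers (illustrative only, far too small to be secure).
p, q = 61, 53
n = p * q                     # the modulus both keys work against
phi = (p - 1) * (q - 1)
first_key = 17                # chosen to share no factor with phi
second_key = pow(first_key, -1, phi)   # its partner (Python 3.8+)

def apply_key(value: int, key: int) -> int:
    """The one and only operation: the same function is used with either key."""
    return pow(value, key, n)

message = 65

# Encrypt with the first key, then "encrypt again" with the second.
c1 = apply_key(message, first_key)
back1 = apply_key(c1, second_key)

# The other order works too, via a different intervening ciphertext.
c2 = apply_key(message, second_key)
back2 = apply_key(c2, first_key)

print(c1, back1)
print(c2, back2)
```

Note that there is no separate decrypt function anywhere in the sketch: getting the plaintext back is just apply_key with the other member of the pair.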

This magic offers a solution to the key sharing problem. Alice and Bob both create independent — different — key pairs. Each designates one of their keys as "public" and one as "private". The private key is kept secret — it never has to be shared with anyone. And the public key is actually published, to the whole world if necessary. To send a message to Bob, Alice encrypts it with his public key, knowing that the only person who can read it will be Bob, because the only way to decrypt needs the private key which only Bob knows.
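The exchange can be sketched with two independent toy RSA pairs. As before, the primes are tiny and chosen by me purely for illustration, and make_keypair and apply_key are hypothetical helper names, not anybody's real API.

```python
def make_keypair(p: int, q: int, e: int):
    """Return (public, private) toy RSA keys built from primes p and q."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # Python 3.8+
    return (n, e), (n, d)

def apply_key(value: int, key) -> int:
    n, exponent = key
    return pow(value, exponent, n)

# Each party makes their own pair and publishes only the public half.
bob_public, bob_private = make_keypair(61, 53, 17)
alice_public, alice_private = make_keypair(89, 97, 5)

message = 1234

# Alice needs nothing secret to send to Bob, just his published key.
ciphertext = apply_key(message, bob_public)

# Bob reads it with the private key that never left his machine.
bob_reads = apply_key(ciphertext, bob_private)

# Eve also knows Bob's public key, but applying it again does not undo it.
eve_tries = apply_key(ciphertext, bob_public)

print(ciphertext, bob_reads, eve_tries)
```

No secret ever travelled between Alice and Bob, which is the whole point.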

It also offers some other wins:

  • Alice can sign a file by encrypting it with her private key. Anyone can check that Alice's private key was used, by decrypting it with her public key and checking that it comes out right. Since Alice's private key must have been used, only Alice can have signed it.
  • Once it's encrypted it can't be altered without being noticed, because an altered ciphertext won't decrypt right. So its integrity is guaranteed: the text decrypted is definitely the text that was encrypted.
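Both bullets can be seen in a small sketch. It assumes toy RSA again, this time built from two well-known Mersenne primes so that a short message fits under the modulus as a single number. The check function is a name I made up, and real signature schemes add padding and hashing that this sketch deliberately leaves out.

```python
# Alice's toy RSA pair (two Mersenne primes, chosen for convenience only).
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (Python 3.8+)

document = b"Pay Bob 10 pounds"        # short enough to fit below n
m = int.from_bytes(document, "big")

# Alice signs by applying her private key.
signature = pow(m, d, n)

def check(sig: int, claimed: bytes) -> bool:
    """Anyone can run this: it uses only Alice's public values n and e."""
    return pow(sig, e, n) == int.from_bytes(claimed, "big")

genuine = check(signature, document)
forged_text = check(signature, b"Pay Bob 99 pounds")   # text swapped
altered_sig = check(signature ^ 1, document)           # one bit of signature flipped

print(genuine, forged_text, altered_sig)
```

Only the holder of d could have produced a signature that checks out, and any change to either the text or the signature makes the check fail.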

(Diversion: PK algorithms are relatively slow and require long keys of one or two thousand bits. Instead of using them directly:

  • For encryption, the only message encrypted in the PK system is a randomly chosen key for the ordinary symmetric encryption which will protect the rest of the message -- typically just 128 or 256 bits to protect a message that may be megabytes long. So symmetric encryption is as important as it ever was.
  • For signing, a hash algorithm is used to avoid encrypting the entire document. A hash is a fixed-length digest of a message. It can't be used to recreate the message, but a change to the message is supposed to be very unlikely to produce the same hash. By convention therefore, Alice signs a hash. This is a little worrying as it appears that this promise is not entirely true in the case of the MD5 hash algorithm, which is very widely used.

But the logic works much the same if you think of the PK crypto being used directly. The hash or the key exchange is a convenience only.)
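Here is a rough sketch of both conveniences together. The assumptions are all mine: toy RSA pairs built from Mersenne primes, a SHA-256 keystream standing in for a real symmetric cipher, SHA-256 rather than MD5 for the digest, and invented helper names throughout. Reducing the hash modulo n is a shortcut that only makes sense in a toy.

```python
import hashlib
import secrets

def make_keypair(p: int, q: int, e: int = 65537):
    """Return (public, private) toy RSA keys; illustrative only."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))   # Python 3.8+

def apply_key(value: int, key) -> int:
    n, exponent = key
    return pow(value, exponent, n)

def xor_crypt(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 based keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def digest_as_int(data: bytes, modulus: int) -> int:
    """Fixed-length hash of the message, squeezed under the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus

bob_public, bob_private = make_keypair(2**61 - 1, 2**89 - 1)
alice_public, alice_private = make_keypair(2**31 - 1, 2**107 - 1)

message = b"Quarterly figures attached. " * 1000   # tens of kilobytes

# Alice: the PK system only ever sees a 128-bit session key and a hash.
session_key = secrets.token_bytes(16)
wrapped_key = apply_key(int.from_bytes(session_key, "big"), bob_public)
body = xor_crypt(session_key, message)
signature = apply_key(digest_as_int(message, alice_public[0]), alice_private)

# Bob: unwrap the session key, decrypt the body, then check the signed hash.
unwrapped_key = apply_key(wrapped_key, bob_private).to_bytes(16, "big")
recovered = xor_crypt(unwrapped_key, body)
signed_hash = apply_key(signature, alice_public)
signature_ok = signed_hash == digest_as_int(recovered, alice_public[0])
tampered_ok = signed_hash == digest_as_int(recovered + b"!", alice_public[0])

print(len(message), signature_ok, tampered_ok)
```

The slow public-key arithmetic touches two small numbers; the bulk of the work is done by the fast symmetric cipher and the hash.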

To be secure, the public key must give no clue to the private key. As an example (there are other techniques), RSA builds its key pair from two large prime numbers which are kept secret. Their product, a very large number, is published as part of the public key, but working out the private key requires knowing the two primes. Because large numbers are hard to factorise, the public key gives no practical hint of the private.
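To see why the size of the numbers matters, here is a toy sketch of the attack that factoring would allow. It assumes RSA as usually described, where the public key contains a modulus n that is the product of two secret primes. The helper names are mine, and trial division is only workable because the primes here are absurdly small.

```python
def make_keypair(p: int, q: int, e: int):
    """Return (public, private) toy RSA keys; illustrative only."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))   # Python 3.8+

def factor(n: int):
    """Trial division: fine for toy numbers, hopeless for real key sizes."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    raise ValueError("n has no smaller factors")

public, private = make_keypair(61, 53, 17)

# The attacker starts with nothing but the published key.
n, e = public
p, q = factor(n)
recovered_d = pow(e, -1, (p - 1) * (q - 1))

print(p, q, recovered_d == private[1])
```

With a modulus of a thousand bits or more, the loop in factor would not finish in any useful amount of time, and that gap is what the security rests on.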

There are still a couple of problems. The first is practical: private keys have to be kept secret, or all the guarantees fail. The second feels a bit like we've gone round in a circle: Alice and Bob have to have a trustworthy means of finding out the public keys that their counter-parties will be using. It has to be trustworthy because a malicious person can read or spoof traffic if they can deceive either end about the right public key to use. The reason we haven't returned to the secret key distribution problem is simply that a public key doesn't have to be secret -- just trusted.

Mechanisms set up to distribute and validate public keys are called Public Key Infrastructure (PKI). Certificates are the tool used to make it trustworthy. And X.509 is the standard for certificate formats. And that is quite enough for today.
