↬ I'm Ivan Krstić (@radian). This is a personal site; I speak for no one but myself.

Languages and security: a short reading list

After my HCS talk last week, a grad student who was in attendance mailed to ask for my thoughts about the intersection of security and programming languages.

I’ve received this question with some frequency, and even gave a brief talk about it last year. The subject matter is rather nuanced, and providing an explanation that does it justice would take a lot of effort, so it’s been sitting on my “to properly write about when I have some time” pile for quite a while now. Unfortunately, it recently became clear to me that The Pile is mostly a black hole. Not wishing to sorely disappoint Greg the Grad Student, I sent him the following sketch of an answer.

If I had to grossly overgeneralize, I’d say people looking at language security fall into roughly three schools of thought:

  1. The “My name is Correctness, king of kings” people say that security problems are merely one manifestation of incorrectness, which is dissonance between what the program is supposed to do and what its implementation actually does. This tends to be the group led by mathematicians, and you can recognize them because their solutions revolve around proofs and the writing and (automatic) verification thereof.
  2. The “If you don’t use a bazooka, you can’t blow things up” people say that security problems are a byproduct of exposing insufficiently intelligent or well-trained programmers to dangerous language features that don’t come with a safety interlock. You can identify these guys because they tend to make new languages that no one uses, and frequently describe them as “like popular language X but safer”.
  3. The “We need to change how we fundamentally build software” people say that security problems are the result of having insufficiently fine-grained methods for delegating individual bits of authority to individual parts of a running program. Traditionally, every part of a program ends up holding all of the program’s authority, so the attack surface becomes the Cartesian product of every part of the program and every bit of authority the program uses. You can spot these guys because they tend to throw around the phrase “object-capability model”.
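
To make group 3’s complaint slightly more concrete, here is a minimal sketch in Python of the difference between a component that runs with the whole program’s authority and one that is handed a single, narrow capability. Every name in it (AppendOnlyLog, ambient_style_plugin, capability_style_plugin) is made up for illustration. Python is also not an object-capability language: nothing stops the plugin from importing os or poking at the private attribute, so this shows only the shape of the idea, not an enforcement mechanism.

```python
import io


class AppendOnlyLog:
    """Hypothetical capability: the holder can append lines and nothing else."""

    def __init__(self, stream):
        # By convention only; Python does not actually hide this attribute.
        self._stream = stream

    def append(self, line):
        self._stream.write(line.rstrip("\n") + "\n")


def ambient_style_plugin(path):
    # Ambient authority: this code takes a mere name (a path) and relies on
    # the process's full file system access to turn it into a file. A bug
    # here can touch anything the process can touch.
    with open(path, "a") as f:
        f.write("plugin ran\n")


def capability_style_plugin(log):
    # Delegated authority: the only thing this code was given is an object
    # that appends to one log, so that is all it is meant to be able to do.
    log.append("plugin ran")


# Usage: the caller decides exactly which authority to hand over.
buf = io.StringIO()
capability_style_plugin(AppendOnlyLog(buf))
print(buf.getvalue(), end="")  # prints "plugin ran"
```

In a language that actually enforces this discipline, the second plugin’s contribution to the attack surface is one append operation on one log, rather than everything the process is allowed to do.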

Now, while I’m already grossly overgeneralizing, I think the first group is almost useless, the second group is almost irrelevant, and the third group is absolutely horrible at explaining what the hell they’re talking about.

(If I were trying to be less overly general, I’d mention that in some instances the groups overlap substantially, and some subsets of these groups, such as the subset of group 2 that’s working on SFI and sandboxing, are relevant and occasionally produce good work.)

For a very incomplete reading list on the subject, I recommend starting with Mark Miller’s PhD thesis, then looking at his work on Caja (paper, website), which aims to provide a way to securely write JavaScript without changing the language spec or the existing runtimes, and finally glancing at David Wagner’s work on Joe-E. All of those links fall into the “let’s change programming” group 3.

For a bunch of papers in the “mathematicians do it provably correctly” group 1 (though most not focused on security), see the publications section of the Alloy website.

Finally, for the “practice safe hex” group 2, take a look at Cyclone (paper, website), NaCl (paper, website) and Vx32 (paper, website).

Combined, these will give you enough references to chase the subject matter as far down the rabbit hole as you dare descend. Good luck, and may the gods have mercy on your soul.