“So GUIs use metaphors to make computing easier, but they are bad metaphors.”
– In the Beginning was the Command Line, Neal Stephenson
The worst analogy I have ever come up with is this:
A linked list is like a sequence of water slides with only one ladder.
No, I don’t know what I was thinking either. I no longer recall why it had to be water slides rather than any other kind; I suppose it was to emphasize that one couldn’t climb up a slide, or jump on/off partway down one. What I do recall is that I used this metaphor in an actual tutorial (the NZ version of ‘section’) attended by actual 100-level CS students. Some perfunctory googling seems to indicate that they are all productive members of society nowadays, so with any luck, none of them were permanently scarred by the experience, but – oh, the regret.
Of course, “linked list” is far from an ideal metaphor in the first place. While linking in chains and sausages has a strong sense of both permanence and bi-directionality, linked lists do not necessarily have either.
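For anyone who has mercifully never met one, here is a minimal sketch of a singly linked list in Python. The names (Node, LinkedList, push_front, slides) are made up for illustration and are not taken from any particular library. Each node holds a reference to the next node only, so the links run in one direction and there is exactly one way in: the head.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass
class Node:
    value: Any
    next: Optional["Node"] = None  # reference to the following node only; no way back


class LinkedList:
    def __init__(self) -> None:
        self.head: Optional[Node] = None  # the single entry point (the "one ladder")

    def push_front(self, value: Any) -> None:
        # The new node points at the old head, then becomes the head itself.
        self.head = Node(value, self.head)

    def __iter__(self) -> Iterator[Any]:
        # Traversal can only start at the head and follow the links forward.
        node = self.head
        while node is not None:
            yield node.value
            node = node.next


slides = LinkedList()
for name in ("third", "second", "first"):
    slides.push_front(name)
```

Iterating over slides visits "first", "second", "third" in that order, and there is no way to get from "second" back to "first". The links are not permanent either: re-pointing a single next reference (for example slides.head.next = slides.head.next.next) is enough to drop a node from the list.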
“What has a head, a tail, nodes, and links?” sounds like the lead-in for a low-grade riddle, but then fails to deliver even on that. A horse wearing chain mail? A dog with a golf course? Ugh. Terrible.
This Stack Overflow thread suggests various other analogies, including a train, a conga line, and a scavenger hunt. Maybe, though, students don’t need analogies anymore to subvert their intuitions about what a ‘link’ is — maybe the ubiquity of hyperlinks has solved that problem as neatly as it solved the problem of not being able to sleep because you can’t remember the name of that one guy from that show… you know, the one with the hat.
So, the flawed GUI and OS metaphors that Neal Stephenson talks about are implemented with software artifacts that have their own flawed metaphors. It doesn’t stop there, either. At the machine level we find instructions like the x86 MOV, which doesn’t actually ‘move’ anything unless you count electric curr…wait, no. It’s dubious metaphors all the way down.
An actual solid analogy?
It was therefore with no small amount of surprise that I realized that the computing notion of data taint has a solid analog in the non-computing notion of food contamination. This is the kind of thing you come up with when you are simultaneously writing documentation for CodeSonar’s (recently announced) visual taint analysis and obsessing about cake at a level matched only by Allie Brosh.
There are many food-borne illnesses in the world, with effects ranging from inconvenient to fatal. People want to avoid these illnesses, so they avoid exposure to contaminated food. In the same way, attackers have many different ways to attack software, so software developers try to keep security vulnerabilities out of their code.
The most enticing part of the analogy, perhaps, is that food handling protocol does not concern itself with exactly what pathogens are present in any given food item, nor with the ultimate origin of those pathogens (including whether it was deliberate or accidental) – what matters is that items may be contaminated and so they must be treated with caution.
One approach to the food safety problem is to set up a carefully sterilized kitchen and then never bring in anything from outside the sterile boundary. A moment’s reflection reveals the flaw in this plan, though: while you might avoid E. coli, you will eventually starve to death. Software, likewise, is generally interacting with the wider world in some way that is intrinsic to its usefulness. If this outside interaction cannot be limited, any external artifacts must be treated with care and should not be consumed until their safety has been established.
There are different methods for ensuring that food is suitable for eating, depending on the type of food. For some foods, washing is sufficient. Others must be heated above some threshold temperature, or chilled to below a different threshold. In the same way, tainted data can be “cleansed” of taint by sufficiently rigorous treatment within a program. And until this cleansing has occurred, cross-contamination must be avoided.
It is not safe to cut up raw meat and then immediately use the same equipment to chop vegetables for a salad; likewise, it is not safe to use a tainted value in a computation and then use the computation result for a security-sensitive operation.
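The kitchen rules above can be sketched in a few lines of Python. This is only an illustration of the idea, tracked at run time with a wrapper type; it is not how CodeSonar or any other static analysis tool works, and every name in it (Tainted, cleanse_filename, sensitive_open) is hypothetical. The validation rule is likewise just an assumed example of "sufficiently rigorous treatment" for one particular use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A string that came from outside the sterile boundary."""
    raw: str

    def __add__(self, other):
        # Cross-contamination: tainted + anything is still tainted.
        other_raw = other.raw if isinstance(other, Tainted) else str(other)
        return Tainted(self.raw + other_raw)

    def __radd__(self, other):
        # Same rule when the clean value is on the left-hand side.
        return Tainted(str(other) + self.raw)


def cleanse_filename(value: Tainted) -> str:
    """Assumed 'cooking' step: accept only letters, digits, '-' and '_'."""
    if not value.raw or not all(c.isalnum() or c in "-_" for c in value.raw):
        raise ValueError("input failed validation")
    return value.raw  # an ordinary str again, now considered clean


def sensitive_open(path) -> str:
    """Stand-in for a security-sensitive operation; refuses tainted input."""
    if isinstance(path, Tainted):
        raise TypeError("tainted value reached a sensitive operation")
    return f"opening {path}"
```

With these definitions, "reports/" + Tainted("../../etc/passwd") is itself a Tainted value, so passing it to sensitive_open raises TypeError, and cleanse_filename rejects it with ValueError. A value such as Tainted("q3_summary") passes validation, after which "reports/" + cleanse_filename(...) is a plain string that the sensitive operation accepts.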
It is also worth noting that context is highly relevant. A developer writing a small program for personal use on unimportant data may spend little or no time considering taint issues, while software for secure applications needs to take taint very, very seriously. Similarly, a healthy person may spend very little time worrying about potential contamination of their food, while an immunosuppressed person must take contamination very, very seriously.
[Image: healthy person]
[Image: immunosuppressed person]
If this were a different blog, and I were a different writer, this would probably be where I provided insightful commentary on metaphor and meaning. Instead I will leave you with one final food-based metaphor, perpetrated on my undergraduate self and subsequently memorialized in haiku form:
A wise man once said:
“Trees” should be called “Potatoes”
They grow down, not up
(…And, if you’re actually interested in tainted dataflow analysis, you can check out this white paper that describes how to use static analysis to identify and eliminate software taint.)