This series explains the theory of anonymity and the factors that influence anonymity in online communication and offline interaction. The goal is to provide the background needed to make educated judgements about the effectiveness of methods to increase anonymity.
Let us start out with a definition of the term and then explore its implications:
Anonymity is the degree of uncertainty in relating a person to an event, action or property.
Anonymity is a problem of knowledge. It deals with the certainty or assurance an observer has when assigning information to a person, such as the person’s connection to an event or action, or a property of the person such as their name. The assigning of information to a person is called “attribution”, and the information in question is the “attribute”.
The assurance of attribution is expressed in the “anonymity set”, the group of potential candidates to any of whom the attribute could be assigned [The members of an anonymity set are also called “elements of an anonymity set”. We chose the term “member” here because it is less de-humanizing, though technically improper].
The bigger the anonymity set, the less certain an attribution is and the more anonymity exists for its members.
In the process of attribution the observer tries to shrink the anonymity set by applying deductive and inductive reasoning and by discovering properties that make certain members better or worse candidates for the attribute. A property that makes a member of the set a more likely candidate for attribution is called an “unpooling property”, while a property that makes the member a less likely candidate is called a “pooling property”.
Keep in mind that any new information the observer learns can change the make-up of the anonymity set and thus the attribution, even when the learning and applying spans a considerable amount of time. Each change in knowledge about one member of the anonymity set also changes the certainty of attribution for all other members: discovering a pooling property of one member increases the likelihood of attribution to every other member, while discovering an unpooling property of one member decreases it.
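The effect on the other members can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `reweight`, its `factor` parameter and the members A to D are hypothetical and not part of the theory's vocabulary. A pooling property is modeled as a factor below 1 and an unpooling property as a factor above 1; afterwards the probabilities are rescaled so that they sum to 1 again.

```python
from fractions import Fraction


def reweight(probabilities, member, factor):
    """Scale one member's probability by `factor`, then renormalize the set."""
    weights = dict(probabilities)
    weights[member] = weights[member] * factor
    total = sum(weights.values())
    if total == 0:
        raise ValueError("every member has been excluded")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical anonymity set of four equally likely members.
anonymity_set = {name: Fraction(1, 4) for name in ("A", "B", "C", "D")}

# Pooling property found for A (assumed factor 1/2):
# A becomes less likely, every other member more likely.
after_pooling = reweight(anonymity_set, "A", Fraction(1, 2))

# Unpooling property found for A (assumed factor 3):
# A becomes more likely, every other member less likely.
after_unpooling = reweight(anonymity_set, "A", 3)

for label, result in (("pooling", after_pooling), ("unpooling", after_unpooling)):
    print(label, {name: str(p) for name, p in result.items()})
```

Nothing was learned about B, C or D directly, yet their probabilities move in both cases, because the probabilities of the whole set always have to add up to 1.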
This way the anonymity set is repeatedly shrunk until the observer can assign the attribute to a person with satisfactory certainty. At that point attribution has become “plausible”. The method of reaching attribution by repeatedly shrinking the anonymity set is called “drill-down”.
The above shows that anonymity is never absolute: there is always some probability of attribution for each member of the anonymity set. Attribution is rarely absolute either, and it depends strongly on the certainty required for the case in question. Even in such crucial instances as criminal investigations, attribution is never achieved with 100% certainty, but only with “sufficient” plausibility.
Let us apply the above to a little story to make it easier to understand:
A late evening in winter, a family – mother Hillary, father Mitt, son Ron and daughter Sarah – sits in the living room eating various cookies from a jar. When only a single peanut cookie is left in the jar, the mother leaves the room saying “Do not eat that cookie, I want to give it to our neighbor.”
After a few minutes the mother comes back and finds the cookie jar empty.
Mother Hillary asks: “Who took the peanut cookie from the jar?”
Hillary has become the observer, the attribute to assign is “took the cookie from the jar”.
The anonymity set consists of father Mitt, son Ron and daughter Sarah. Each of them is equally likely to be the thief, so there is a 1⁄3 probability for each.
In the first round of drill-down, Hillary notes that all three suspects have cookie crumbs all over them. This makes none of them a more likely thief than the others: the crumbs are a pooling property that applies to all of them equally. The probability for each remains at 1⁄3.
Second, she notices that the hands of father Mitt are far too large to fit into the cookie jar. This makes him less likely to be the thief (another pooling property) but does not definitively exclude him. Hillary changes the probabilities to 1⁄5 for Mitt, 2⁄5 for Sarah, and 2⁄5 for Ron.
Third, she remembers that her daughter Sarah is severely allergic to peanuts (a strong pooling property) while her son Ron likes peanuts a lot (an unpooling property).
Due to Sarah’s allergy, she is excluded from the anonymity set, and Ron’s probability of being the thief is increased: Mitt 1⁄3, Ron 2⁄3, Sarah 0.
Finally, Hillary is pretty certain that her husband Mitt does not want any trouble with her, again reducing the probability of him being the thief: Mitt 1⁄4, Ron 3⁄4.
Mother Hillary now grumbles at her son Ron, being assured enough that he was the thief.
Of course, this line of reasoning does not guarantee that Ron is the culprit, but the anonymity set was reduced sufficiently for Hillary to risk having to apologize to her son in the unlikely case that she was mistaken.
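Hillary's drill-down can be retraced in a short Python sketch. The probabilities are the ones from the story, but the multiplicative factors attached to each observation and the 3⁄4 plausibility threshold are assumptions chosen here only so that the arithmetic reproduces those numbers. The function names `renormalize` and `drill_down` are likewise made up for this illustration.

```python
from fractions import Fraction


def renormalize(weights):
    """Rescale the weights so that they sum to 1."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def drill_down(prior, evidence, threshold):
    """Apply each observation in turn; stop once one member is plausible enough."""
    current = dict(prior)
    history = [current]
    for factors in evidence:
        # Members not mentioned in an observation keep their weight (factor 1).
        weights = {name: p * factors.get(name, 1) for name, p in current.items()}
        current = renormalize(weights)
        history.append(current)
        suspect = max(current, key=current.get)
        if current[suspect] >= threshold:
            return suspect, history
    return None, history


prior = {name: Fraction(1, 3) for name in ("Mitt", "Ron", "Sarah")}

# Assumed factors, one entry per round of the story.
evidence = [
    {},                        # crumbs on everyone: nobody is singled out
    {"Mitt": Fraction(1, 2)},  # hands too large for the jar
    {"Sarah": 0},              # peanut allergy excludes Sarah; Ron gains by renormalization
    {"Mitt": Fraction(2, 3)},  # Mitt wants no trouble with Hillary
]

suspect, history = drill_down(prior, evidence, threshold=Fraction(3, 4))

for round_number, probabilities in enumerate(history):
    print(round_number, {name: str(p) for name, p in probabilities.items()})
print("plausible attribution:", suspect)
```

With a stricter threshold, for example 9⁄10, the same evidence would not be enough and `drill_down` would return `None`: whether attribution is plausible depends on the certainty the observer requires.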
After having shown the theory of anonymity working in an example we will explore more complex and realistic applications in the next parts of this series – and what Thomas Bayes has to do with it.
- Anonymity is the degree of uncertainty in relating a person to an event, action or property (the attributes).
- The opposite of anonymity is attribution.
- The measure of anonymity is the size of the anonymity set.
- Anonymity of a person is reduced through the discovery of unpooling properties for that person and the discovery of pooling properties of other members of the anonymity set.
- Anonymity and attribution are not absolutes but relative probabilities.
Things to come…
This part explores how much anonymity can be expected online and how anonymity is reduced by everyday technologies used in Internet communication.
Here we apply the theory of anonymity to offline interaction.
Some lessons have been learned that can help to improve anonymity in general, both online and offline.