In this article we are going to explore how anonymity in the physical world is eroded through technologies and conventions that have been introduced over the last 30 years. Most people assume that their physical behavior is mostly disconnected from the world of bits and bytes, databases and surveillance. Sadly, this increasingly proves to be an illusion.
It is easy to overlook how much the digital world has found its way into our physical lives in recent years. Just 30 years ago most people were only consumers of data, be it from the TV or the radio in their living rooms. Most data produced by them was strictly personal or business related and was never distributed widely or made easily accessible to third parties. Daily transactions were settled with cash, and travel records were mostly non-existent. Only their telephones produced usage data, and even that was bound more to the house or office than to the individual user. The data trail left behind by individuals was very small and limited.
This has changed fundamentally. Today, the vast majority of people generate a growing data trail with increasing frequency and accuracy. People have become constant producers of data in their daily, physical, non-Internet lives – often without noticing or understanding the processes involved.
The main reason for this development is the increasing digitization of life. Computers and databases are not restricted to a natural habitat called the Internet, quite the contrary. Computer technology was developed mostly for managing physical events – managing warehouses, cataloging citizens and customers, calculating machine parameters, managing relationships and planning for the future. The focus of attention on communication, social networking and the Internet in general has allowed many developments in the physical world to go almost unnoticed. This is especially true for our perception and defense of anonymity.
Most transactions taking place in the physical world are now mirrored by transactions in the digital world, creating a digital shadow of our daily offline lives. Physical and digital, offline and online, are tightly linked. Physical objects are represented by digital objects used to track and understand the events in the physical world. And these digital representations are increasingly becoming the sole focus for decisions made in the physical world. This transition has only just begun, and will continue towards an Internet of Things that will dominate the way we deal with both the digital and the physical in the future.
The digitization of life, the connection between physical objects and digital representations, has already enveloped most aspects of business-to-consumer transactions, travel and movement, as well as most communication.
However, digitization of these areas comes with several challenges that must be understood in order to assess its impact on physical anonymity:
Human life is notoriously ambiguous, a feature that computers find hard to cope with. This makes it necessary to create means to precisely describe and identify actions and objects that need to be digitally processed. The solution is the introduction of unique identifiers: numbers that are directly tied to one specific action or object and that will not be encountered in any other relation.
Another challenge is data acquisition. For computers to be able to track physical objects or events, data about these must be made available in digital form. This happens through the use of sensors that collect and transmit the data for further processing by computers.
In those cases where data cannot be acquired automatically, or when data needs to be presented to humans, terminals are used. Here humans need to actively participate in data acquisition or communication with the computer.
Furthermore, just to complete the description of the digitization of life, some actions in the physical world can be automated through actuators, devices that can perform operations like opening or closing doors, or moving objects.
Lastly, there needs to be a method for connecting information about multiple objects and events – there needs to be correlation. This is done by means of title and co-presence.
Titles are formal connections between objects that are usually enforced through law – like titles of ownership for cars, identity papers like passports or objects that may only be found in the possession of a specific owner like credit cards.
Co-presence refers to the fact that two or more objects can be located at a specific geographic point at the same time, preferably repeatedly.
While this may sound excessively detailed, the combination of unique identifiers, sensors, terminals and correlation methods describes the infrastructure needed to collect and automatically process vast amounts of identifying information.
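To make correlation by co-presence more tangible, here is a minimal Python sketch. It assumes a hypothetical sensor log of (identifier, location, hour) entries; every identifier, place and threshold is invented for illustration and does not describe any real system. Identifiers that repeatedly show up at the same place and time are reported as probably belonging together.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sensor log: (identifier, location, hour of observation).
# All values are made up for illustration.
OBSERVATIONS = [
    ("plate:B-XY-123", "garage-north", 8),
    ("phone:imsi-001", "garage-north", 8),
    ("phone:imsi-002", "garage-north", 8),
    ("plate:B-XY-123", "mall-east", 13),
    ("phone:imsi-001", "mall-east", 13),
    ("plate:B-XY-123", "station-west", 18),
    ("phone:imsi-001", "station-west", 18),
    ("phone:imsi-002", "airport", 18),
]


def co_presence_counts(observations):
    """Count how often each pair of identifiers was seen at the same place and hour."""
    seen_together = {}
    for identifier, place, hour in observations:
        seen_together.setdefault((place, hour), set()).add(identifier)
    counts = Counter()
    for identifiers in seen_together.values():
        # Sorting gives every pair a stable (a, b) order.
        for pair in combinations(sorted(identifiers), 2):
            counts[pair] += 1
    return counts


def likely_links(observations, min_repeats=2):
    """Return identifier pairs that were co-present at least `min_repeats` times."""
    counts = co_presence_counts(observations)
    return [pair for pair, n in counts.items() if n >= min_repeats]


print(likely_links(OBSERVATIONS))
```

A single shared sighting says little, which is why the sketch only reports pairs that were seen together repeatedly. In the sample data only one phone keeps appearing alongside the license plate.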
Digitization and Anonymity
Just a few decades ago, people only left behind data in the memory of other people. One person would witness the presence or action of another person, and maybe communicate it to a third party. But this data was widely distributed and disconnected, unreliable and only short-lived. Only when a person was specifically targeted were means like photography, audio recording, fingerprint capture, on-foot surveillance and detailed record-keeping employed. When not targeted, most people were anonymous outside of their direct social environment. They were not identified, nor were records of their actions kept.
This stands in stark contrast to today. Through the use of unique identifiers, sensors and correlation, most people are constantly identified, their actions recorded and the records kept indefinitely, even if they are not specifically targeted.
Many of these records are not yet interconnected, but more and more of them are kept by an increasing number of parties that individually combine them. Further interconnection will develop for economic reasons and due to law-enforcement interests. Since these person-specific records are relatively cheap to store and manage, they are kept for not-yet-identified future use.
All of these records reduce the anonymity set of an individual simply by containing massive amounts of unpooling properties. Since many of these properties are unique identifiers, the anonymity set is often reduced to a single member – leaving no anonymity in the physical world – unless the individual takes conscious countermeasures to protect his privacy.
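The shrinking of an anonymity set can be illustrated with a small Python sketch. The population, its attributes and the card numbers below are invented for illustration. Each observed property removes everyone who does not match it, and a unique identifier reduces the set to a single member in one step.

```python
# Hypothetical population; every record and attribute is invented.
POPULATION = [
    {"name": "A", "city": "Berlin", "car": "red", "height_cm": 180, "card": "4111-0001"},
    {"name": "B", "city": "Berlin", "car": "red", "height_cm": 165, "card": "4111-0002"},
    {"name": "C", "city": "Berlin", "car": "blue", "height_cm": 180, "card": "4111-0003"},
    {"name": "D", "city": "Munich", "car": "red", "height_cm": 180, "card": "4111-0004"},
]


def anonymity_set(population, observed):
    """Return everyone who is consistent with all observed properties."""
    return [
        person
        for person in population
        if all(person.get(key) == value for key, value in observed.items())
    ]


def shrink(population, observations):
    """Apply observations one at a time; return the set size after each step."""
    known = {}
    sizes = []
    for key, value in observations:
        known[key] = value
        sizes.append(len(anonymity_set(population, known)))
    return sizes


# Three weak properties narrow the set step by step ...
print(shrink(POPULATION, [("city", "Berlin"), ("car", "red"), ("height_cm", 180)]))
# ... while one unique identifier does it immediately.
print(shrink(POPULATION, [("card", "4111-0003")]))
```

None of the three weak properties identifies anyone on its own, but combined they leave only one candidate, just as the single card number does.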
In the following we shall explore several of the technologies used for unique identification and sensors. Due to the nature of the subject this can only be an overview which is by no means comprehensive, but should enable us to identify other technologies when they are encountered. A more complete list can be found in the notes below.
Probably the best known technology for physical tracking is the use of credit cards and other payment cards (with the exception of pre-paid, anonymous gift cards paid for in cash). They are directly tied to a person and connect that person to the time and place of a payment, in addition to making payment and shopping habits accessible. Thus credit/payment cards are unique identifiers that destroy anonymity. In addition, the payment data is made available both to the shop and the credit card company, and potentially to third parties requesting that data.
The license plate of a car is another unique identifier that is currently gaining popularity with people trying to reduce the anonymity of others. Automated license plate scanners are set up in more and more locations, allowing the automated collection of license plate data combined with time and place. These are often coupled with additional sensors like toll collection systems to identify the in-car toll boxes. Combined, this allows for the automated creation of movement profiles that are directly connected to a person.
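How little processing is needed to turn scanner reads into a movement profile can be shown with a short Python sketch. The record layout, plates, timestamps and locations are assumptions made up for this example and are not taken from any real scanner system.

```python
from collections import defaultdict

# Hypothetical scanner reads: (ISO timestamp, scanner location, plate).
SCANS = [
    ("2024-05-01T17:45", "ring road exit 4", "B-XY-123"),
    ("2024-05-01T08:02", "highway toll gate 7", "B-XY-123"),
    ("2024-05-01T08:10", "city bridge", "M-AB-987"),
    ("2024-05-01T08:31", "downtown garage", "B-XY-123"),
]


def movement_profiles(scans):
    """Group reads by plate and order them chronologically."""
    profiles = defaultdict(list)
    # ISO timestamps of equal format sort chronologically as plain strings.
    for timestamp, location, plate in sorted(scans):
        profiles[plate].append((timestamp, location))
    return dict(profiles)


for plate, stops in movement_profiles(SCANS).items():
    print(plate, stops)
```

Once the plate is tied to a registered owner, each of these lists becomes a timeline of where that person's car has been.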
A more personal, precise and reliable method to create movement profiles and to pinpoint an individual is the mobile phone. Almost everybody carries a mobile phone today, all the time and everywhere, and mobile phones are constantly traceable as long as they are switched on. The mobile phone network knows the location of every active phone at all times, simply by how the network is set up – not because of targeted surveillance or backdoors. Every mobile phone has a globally unique hardware identification number, the IMEI (International Mobile Equipment Identity), which is frequently transmitted to the network. Furthermore, the IMSI (International Mobile Subscriber Identity) number, which is stored on the phone’s SIM card, is made known to the network so that calls can be routed. These pieces of information – location, time, IMEI and IMSI – are frequently stored for extended periods of time and made available to third parties. Together, they form a powerful method to find out where a person was at a given time. Since most mobile phones and network accounts for mobile telephony are bound to a person, they are immediately de-anonymizing.
But there are more ways in which electronic companions, be it a smartphone, tablet or laptop, can de-anonymize their owner. When switched on, wireless-enabled devices broadcast so-called MAC addresses that can easily be captured over dozens of meters. These hardware addresses are intended to be globally unique and unchanging, in order to identify the device to a local hotspot or to other devices like headsets. Both WiFi/Wireless LAN and Bluetooth use identity broadcasting, though many devices can effectively prevent Bluetooth from sending out its ID.
Loyalty cards issued by various commercial entities are another strongly unpooling property. They also allow the collection of transaction data: the products bought, the time and place of purchase, and the person. But since loyalty cards are not legally bound to a person, they can be swapped to reduce the quality of the data collected. This is why we classify loyalty cards as a strongly unpooling property rather than as unique identifiers.
A less likely, but nevertheless frequently used, method of tracing is the use of bank note serial numbers. Though they are not bound to a person, they can be connected to a person through the commercial transaction itself. This is a method frequently used in law-enforcement sting operations. For example, the bank note numbers are known to the ATM at which the target uses a credit card to withdraw money, making it possible to connect the serial numbers to the target’s identity with high probability. However, banks and grocery stores rarely share data with each other, least of all serial number tracing data. It is nevertheless useful to keep this method in mind, since automated bank note scanners are becoming a more frequent piece of equipment, found not just at banks but also at shops and border checkpoints.
Far more prevalent are tracking methods based on pre-paid or subscription tickets for public transportation. For example, the Oyster card used in London allows long-term tracking of movement because its built-in unique number is recorded each time the card is used at the gates of the public transportation network. Since the technology employed (MIFARE RFID chips) has been proven insecure, any stranger could read the ID of an Oyster card carried in a target’s purse and then look up the locations and times of travel. Some transportation ticketing systems also allow access to subscription data, often identifying the person directly.
The underlying technology in many ticket systems is the RFID (Radio Frequency IDentification) chip. But RFID is far from being limited to tickets. RFID chips are found in passports, credit cards and tickets, but they are also attached to everyday products like clothing. RFID-tagged clothing is intended for stock management and anti-theft operations, but it also allows the silent tracking of persons. Since clothes are personal and we usually do not replace all our clothing at once, the correlation between RFID identities in clothing and payment data collected at stores can make the wearer traceable over the long term. For example, RFID scanning gates installed at choke points like hotel entries, subway system entries and store doors can be used to track the wearer of RFID-tagged clothing or of other objects that are equipped with RFID tags.
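The two-step link described here, from tag to payment card at the till and from gate read to tag later on, can be sketched in a few lines of Python. All tag IDs, card numbers, times and gate locations are hypothetical, and the sketch assumes that tags are read together with the payment at checkout.

```python
# Hypothetical data; tag IDs, card numbers and locations are invented.
CHECKOUTS = [
    # (payment card, RFID tags read at the till)
    ("card-4111-0001", ["tag-A1", "tag-A2"]),
    ("card-4111-0002", ["tag-B1"]),
]
GATE_READS = [
    # (time, gate location, tag read)
    ("09:00", "subway entry", "tag-A1"),
    ("09:40", "hotel lobby", "tag-B1"),
    ("12:15", "store door", "tag-A2"),
    ("13:00", "store door", "tag-Z9"),
]


def tag_owners(checkouts):
    """Map each tag to the payment card it was bought with."""
    return {tag: card for card, tags in checkouts for tag in tags}


def sightings_by_card(checkouts, gate_reads):
    """Attribute gate reads to payment cards via the tags sold with them."""
    owners = tag_owners(checkouts)
    sightings = {}
    for time, place, tag in gate_reads:
        card = owners.get(tag)
        if card is not None:  # tags never seen at a till stay unattributed
            sightings.setdefault(card, []).append((time, place))
    return sightings


print(sightings_by_card(CHECKOUTS, GATE_READS))
```

The tag itself carries no name. It only becomes identifying through the payment record it was once seen with, which is why different garments bought on the same card end up in one profile.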
Lastly, the fastest growing area of de-anonymization and tracking is the spreading use of biometrics. Facial recognition systems are now being built into CCTV (camera surveillance) networks and even into shops’ surveillance cameras. Since the human face can be quickly identified by current technology – and the face is constantly visible – facial recognition is probably the strongest future application for identification. However, facial recognition is not limited to surveillance cameras. Due to the growing use of mobile phones with built-in cameras, and the spreading habit of taking pictures always and everywhere to upload them to social media websites, more and more biometric data linked to place and time is made available – with the active and cheerful help of a whole generation of Facebook users.
Must digitization lead to loss of privacy?
It should be noted that the process of digitization does not inevitably lead to a loss of anonymity. Many convenience and efficiency gains can be achieved without impacting privacy, if the technology is designed with data protection in mind.
For example, unique identifiers are not always required, or they can be temporary and changing. It would also be possible to offer more options to opt out of data collection, or to limit data collection to well-defined circumstances in which its usefulness can be demonstrated.
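One way a temporary, changing identifier could work is sketched below in Python. This is only an illustration under stated assumptions, not a description of any deployed system: the function name, the 15-minute rotation period and the idea of a per-device secret are all hypothetical choices. The device derives a fresh identifier for every time window from a secret that only it knows, so an observer cannot link two windows to each other.

```python
import hashlib
import hmac


def rotating_id(device_secret: bytes, timestamp: int, period_seconds: int = 900) -> str:
    """Derive a short-lived identifier that changes every `period_seconds`.

    `device_secret` is a hypothetical per-device key that never leaves the device.
    """
    window = timestamp // period_seconds  # index of the current time window
    digest = hmac.new(device_secret, str(window).encode("ascii"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated for readability


secret = b"example-secret-known-only-to-the-device"
print(rotating_id(secret, 0))    # stable within the first 15 minutes ...
print(rotating_id(secret, 899))  # ... so this prints the same value
print(rotating_id(secret, 900))  # a new window yields an unrelated value
```

Within one window the identifier is stable enough for a sensor to do its job, for example to count a device once. Across windows, linking the values requires the secret.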
However, such “Privacy by Design” approaches are rarely encountered, either because they are not requested by consumers or because non-economic interests (like law enforcement) are at play.
It should be clear that anonymity should not be expected in the physical world. Credit cards, mobile phones and facial recognition are the three most frequent de-anonymizing technologies that we are constantly confronted with. Without special measures, physical anonymity no longer exists.
Things to come…
The first part of this series explores the theoretical aspects of what anonymity is.
This part explores how much anonymity can be expected online and how anonymity is reduced by everyday technologies used in Internet communication.
Here we apply the theory of anonymity to offline interaction.
Some lessons have been learned that can help to improve anonymity in general, both online and offline.
Unique Identifiers, serial numbers: (Ultimately unpooling properties)
- Credit Cards, Cash Cards, ATM cards
- Financial Transactions: Account numbers, check numbers, routing codes
- Mobile phone, also built into many modern cars:
  - IMSI: International Mobile Subscriber Identity. Globally unique, associated with the SIM card
  - IMEI: International Mobile Equipment Identity. Globally unique, associated with the mobile phone hardware
  - Phone number
- Number plates / License plates
- Passports, identity cards
Probable identifiers: (Strongly unpooling properties)
- Face geometry
- Voice characteristics
- DNA / Genetic fingerprint
- Eye: Iris & Retina
- Receipt numbers of purchases
- Loyalty cards
- MAC Address. Publicly visible hardware address of WiFi/Wireless LAN hardware.
- Bluetooth ID. Publicly visible hardware address of a Bluetooth device.
- Tickets for public transportation, Oyster card
- Banknote serial numbers
- Names used in personal interaction
- RFID tags, found in many products
- Artificial DNA marking of objects
- Automated toll payment boxes
Weak unpooling properties:
- ‘Weak biometrics’, visual:
  - Gait (can be automated)
  - Hand/Ear patterns
  - Hand geometry
- Habits, patterns of behavior:
  - Geolocation data (can be very strong)
  - Buying habits
  - Time habits
  - Driving habits
  - Power consumption
Sensors & Terminals
- Payment terminals (credit cards, loyalty cards)
- Mobile phone
- Ticket systems (transportation)
- RFID Gates
- CCTV/Surveillance camera networks
- Automated license plate scanners