Browser fingerprinting is the systematic collection of information from a remote device with the aim of uniquely identifying and tracking its user. The primary motivation behind this technique is the monetization of user data, often for personalized advertising campaigns and user profiling.
Unlike traditional tracking methods such as cookies, fingerprinting operates silently and can be far more persistent. It has been described in technical literature as a "cookieless monster" because it does not require storing any files on a user's device and is invisible to the user.
| Feature | Cookies | Browser Fingerprinting |
|---|---|---|
| Storage | Stores small files on the user's computer. | No files are stored on the user's computer ("cookieless"). |
| User Visibility & Control | Can be viewed, blocked, or deleted by the user through browser settings. | Operates invisibly. The user has no direct way of knowing it is happening or of preventing it. |
| Persistence | Can be removed by the user. | Highly persistent. It can even be used to restore cookies that a user has deleted, re-linking their identity. |
Now that we understand what browser fingerprinting is and why it's more persistent than cookies, let's explore the specific techniques used to create these unique digital identifiers.
A fingerprint's uniqueness comes from combining many different pieces of information. Some are simple browser characteristics, while others rely on more sophisticated techniques.
Simple browser characteristics are basic attributes that a website can collect from the browser to begin building a profile. Each piece of information, when combined with others, helps to narrow down the identity of a device.
Advanced methods exploit modern web technologies to extract subtle but highly identifying details from a device.
Canvas fingerprinting uses the HTML5 Canvas element to draw a hidden image or text. Because devices render it slightly differently due to variations in the graphics card, drivers, and operating system, the resulting image data can be converted into a hash (a short, fixed-length string of characters derived from the data) that serves as a powerful identifier.
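The hashing step described above can be sketched in a few lines. This is a minimal illustration, not a real fingerprinting script: the pixel values are made up, the helper name `hash_pixels` is hypothetical, and SHA-256 is an assumed choice of hash. In a browser, the pixel data would come from the canvas itself, for example via `getImageData()` or `toDataURL()`.

```python
import hashlib


def hash_pixels(pixels: bytes) -> str:
    """Return the hex SHA-256 digest of raw pixel bytes (hypothetical helper)."""
    return hashlib.sha256(pixels).hexdigest()


# Simulated RGBA bytes for the same drawing on two devices (made-up values).
device_a = bytes([255, 0, 0, 255, 12, 34, 56, 255])
# Device B differs by one step in a single colour channel,
# as might happen with different anti-aliasing.
device_b = bytes([255, 0, 0, 255, 12, 35, 56, 255])

hash_a = hash_pixels(device_a)
hash_b = hash_pixels(device_b)

print(hash_a)
print(hash_b)
print("same device, same hash:", hash_a == hash_pixels(device_a))
print("different rendering, same hash:", hash_a == hash_b)
```

The point of the sketch is that a one-unit difference in a single pixel is enough to produce a completely different digest, while identical rendering always yields the same one.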
Canvas font fingerprinting is a variation of Canvas fingerprinting that renders the same text string repeatedly, each time in a different font from a predefined list. Metrics extracted from the rendered images, which differ subtly from font to font and from device to device, are combined into an identifier for the browser.
WebRTC fingerprinting uses the WebRTC API (a technology for real-time communication) to discover a device's true local IP address, even if it is behind a Network Address Translation (NAT) router. Combining this local IP with the public IP address creates a very stable and consistent identification factor.
AudioContext fingerprinting uses the AudioContext API to process a standard, computer-generated audio signal (such as a sine wave). It does not listen to the device's microphone. The processed signal varies subtly with the hardware and software stack of the device, and this output is then hashed to create an identifier.
| Technique | How It Works (Simplified) | Why It's Effective for Identification |
|---|---|---|
| Canvas | Draws a hidden image and analyzes the subtle rendering differences between devices. | Variations in graphics hardware, drivers, and fonts make the final image unique to a device. |
| Canvas Font | Renders the same text with many different fonts to measure rendering inconsistencies. | The specific combination of installed fonts and their rendering creates a highly unique profile. |
| WebRTC | Uses a communication API to reveal the device's local network IP address. | Combining the local and public IP addresses can uniquely identify a device on a network. |
| AudioContext | Processes a standard audio signal to detect differences in a device's audio stack. | The audio processing hardware and software on each device produce a slightly different output. |
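As a rough sketch of how the outputs of these techniques might be merged into a single identifier, the example below serializes a set of attribute values in a fixed order and hashes the result. Everything here is an assumption for illustration: the attribute names, the placeholder values, the helper `combine_fingerprint`, and the use of JSON plus SHA-256. Real fingerprinting scripts may combine signals quite differently.

```python
import hashlib
import json


def combine_fingerprint(attributes: dict) -> str:
    """Hash a set of attribute values into one identifier (hypothetical helper)."""
    # Sort keys so the same attributes always serialize to the same string.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder values standing in for the output of each technique.
sample = {
    "canvas_hash": "placeholder-canvas-hash",
    "canvas_font_hash": "placeholder-font-hash",
    "webrtc_local_ip": "192.168.1.23",
    "public_ip": "203.0.113.7",
    "audio_hash": "placeholder-audio-hash",
}

fingerprint = combine_fingerprint(sample)
print(fingerprint)

# Changing any single attribute yields a different identifier.
changed = dict(sample, audio_hash="another-audio-hash")
print(combine_fingerprint(changed) != fingerprint)
```

Sorting the keys before hashing is a deliberate choice: the identifier should depend on the attribute values, not on the order in which they happened to be collected.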
While each of these techniques gathers a piece of the puzzle, the true power of fingerprinting comes from combining them. The next section explains how that identifying power can be measured.
The identifying power of a piece of information is measured by its information entropy, expressed in bits. The more bits a characteristic carries, the more it helps to distinguish one device from all others.
A simple analogy is a six-sided die. A single roll has six possible outcomes, providing about 2.58 bits of information. If an event only had two outcomes (like a coin flip), it would provide only 1 bit of information. The more possible outcomes, the higher the entropy and the more "information" a result provides.
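The figures in the analogy follow from the base-2 logarithm of the number of equally likely outcomes. A minimal sketch, assuming every outcome is equally probable (the function name `uniform_entropy_bits` is only illustrative):

```python
import math


def uniform_entropy_bits(outcomes: int) -> float:
    """Entropy in bits of an event with `outcomes` equally likely results."""
    return math.log2(outcomes)


coin_bits = uniform_entropy_bits(2)
die_bits = uniform_entropy_bits(6)

print(f"coin flip: {coin_bits:.2f} bits")  # coin flip: 1.00 bits
print(f"die roll:  {die_bits:.2f} bits")   # die roll:  2.58 bits
```

Each additional bit doubles the number of outcomes that can be told apart, which is why entropy grows slowly compared with the number of possibilities.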
When a website collects a browser characteristic, it reduces the uncertainty (entropy) about who you are. It is estimated that approximately 33 bits of entropy are needed to uniquely identify a single person from the global population of 7.5 billion.
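The 33-bit figure can be checked with the same logarithm: it is the number of bits needed to give every person in the stated population a distinct label. The sketch below also shows how each bit learned halves the pool of remaining candidates. The helper `remaining_candidates` is hypothetical, and the calculation assumes the learned bits are independent of one another.

```python
import math

WORLD_POPULATION = 7.5e9  # population figure used in the text

# Bits required to single out one person among the whole population.
bits_needed = math.log2(WORLD_POPULATION)
print(f"bits needed: {bits_needed:.1f}")  # bits needed: 32.8


def remaining_candidates(population: float, bits_learned: float) -> float:
    """Expected size of the group a user still hides in (hypothetical helper)."""
    return population / (2 ** bits_learned)


# Every bit learned halves the group; about 33 bits leave roughly one person.
for bits in (1, 10, 20, 33):
    print(bits, "bits ->", round(remaining_candidates(WORLD_POPULATION, bits)))
```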
The Panopticlick research project provides a clear example of how different browser attributes contribute bits of identifying information.
Example: Bits of Identifying Information
| Browser Characteristic | Bits of Identifying Information | Significance for Identification |
|---|---|---|
| Browser Plugin Details | 9.14 bits | A higher value means this characteristic is rarer and contributes more to making you unique. |
| User Agent | 7.68 bits | This combination of browser and OS is quite uncommon, adding significant identifying power. |
| Hash of canvas fingerprint | 6.62 bits | The way your device renders graphics is a strong identifier. |
| System Fonts | 6.5 bits | The specific list of fonts on your machine is highly distinguishing. |
| Time Zone | 2.7 bits | While not unique on its own, it helps narrow down the possibilities significantly. |
In the Panopticlick test, the combination of these and other values resulted in a total of at least 20.37 bits of identifying information, making the browser unique among more than 1,357,000 others tested. A similar project, AmIUnique.org, also demonstrates this by showing users how their browser fingerprint compares to a large database of others, often finding it to be unique.
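The two numbers in this result are linked: being one in N corresponds to log2(N) bits, so uniqueness among roughly 1,357,000 browsers gives about 20.37 bits. The sketch below checks this and also adds up the five rows of the table. The simple sum comes out higher than the reported total, which is one reason to treat per-attribute bits as non-additive: the attributes are correlated, and bits only add up cleanly when the pieces of information are independent. The function name `bits_from_one_in` is illustrative.

```python
import math


def bits_from_one_in(n: float) -> float:
    """Bits of identifying information conveyed by being one in `n`."""
    return math.log2(n)


total_bits = bits_from_one_in(1_357_000)
print(f"unique among 1,357,000: {total_bits:.2f} bits")  # 20.37 bits

# Per-attribute values from the table above.
table_bits = [9.14, 7.68, 6.62, 6.5, 2.7]
naive_sum = sum(table_bits)
print(f"naive sum of table rows: {naive_sum:.2f} bits")  # 32.64 bits

# The naive sum overshoots because correlated attributes share information.
print("naive sum exceeds reported total:", naive_sum > total_bits)
```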
Because fingerprinting reduces anonymity in a way that can be measured, the strategies for protecting against it can be evaluated in the same terms.
The most important defense principle for users is straightforward: the closer a device is to a general or default configuration, the harder it is to uniquely identify.
Many common privacy tools are not effective against advanced fingerprinting.
A study analyzing the efficiency of different mitigation measures found a clear winner.
Other potentially effective, though sometimes impractical, measures include:
With these defensive strategies in mind, let's summarize the most critical points from these notes.