Anonymity Engineering · 29 min read · intermediate

Threat models for network anonymity

Passive observers, active adversaries, global traffic correlation, and the vocabulary needed to reason about anonymity without hand-waving.

The single most expensive mistake in anonymity engineering is starting with the tool. People reach for Tor, or a VPN, or sing-box with REALITY, and ask "is this anonymous?" — and there's no useful answer because the question doesn't specify anonymous against whom. The same configuration that makes your traffic invisible to the coffee shop's WiFi router can be transparent to the destination service, the relay operator, the recursive resolver three networks downstream, or a global passive adversary correlating timing signals across the world's transit links.

Anonymity engineering starts with the adversary, not the tool. You enumerate who you're hiding from, what they can see, what they can do, and what failure modes you're willing to accept. Only then do you pick mechanisms. A "good" anonymity tool is one whose properties match your threat model — and a tool that's perfect for one threat model is often useless against another.

This module is the foundation for everything else in Track 4. We'll establish a precise vocabulary (anonymity vs. unlinkability vs. unobservability vs. pseudonymity, drawn from the Pfitzmann/Hansen terminology proposal that prevents anonymity discussions from collapsing into hand-waving), enumerate the concrete adversaries an anonymity engineer reasons about, walk through what each adversary actually sees on a real network path, and assemble a worked comparison of what VPN/Tor/mixnet hide against each adversary class. The pattern this module trains is "describe the threat model first; then evaluate the tool against it." Without this discipline, every later module about traffic analysis, padding, mixnets, fingerprinting, and steganography reduces to lore.

Learning objectives

  1. Distinguish confidentiality, anonymity, unlinkability, unobservability, and pseudonymity using precise vocabulary rather than colloquial English.
  2. Model passive observers, active probing adversaries, and global passive adversaries in terms of what packets, metadata, and side channels they can actually see.
  3. Compare what a VPN, Tor circuit, and mix network do and do not hide against concrete adversaries.
  4. Explain why anonymity failures usually come from composition mistakes across layers rather than a single broken cipher.

Why anonymity starts with the adversary, not the tool

Two scenarios:

Scenario A. A journalist reads news articles on her laptop from her home WiFi. She doesn't want her ISP — or anyone else with access to her ISP's logs — to know she reads opposition political news.

Scenario B. A whistleblower in an authoritarian country needs to upload internal government documents to an offshore reporting platform without being identifiable to the government, the platform, or any party who might be coerced by the government.

Both scenarios are "anonymity problems." But the right tools differ enormously:

  • For the journalist, a commercial VPN is overwhelmingly sufficient. Her threat model is "the ISP and downstream observers"; the VPN moves her trust to the VPN provider, who has no business relationship with her ISP and isn't subject to the journalist's local data-retention laws. A VPN doesn't hide her from the website she visits, but the website doesn't care who she is — it has no incentive to share her identity with her ISP.

  • For the whistleblower, a commercial VPN is dangerous. The VPN provider becomes a single point of compromise; the local network observer can see "this person uses a VPN at this time," which itself may be evidence of intent; the upload pattern (sudden 50MB outbound to an unfamiliar destination) is fingerprintable; the browser used to upload may carry identifying state. The right answer probably involves Tor, an obfuscated transport, careful operational security, and physical disposable hardware — and even then the threat model includes "global passive adversary doing traffic correlation across the country's external transit links."

Same word, "anonymity," wildly different tools, because the adversary is different. Until you specify the adversary, you can't pick the tool.

This is also why "is X more anonymous than Y" questions usually have no answer. Tor is more anonymous than a VPN against the destination service (the destination doesn't see your IP) and less anonymous against your ISP (using Tor is itself observable as a behavioral signal, whereas many VPNs blend in with corporate traffic). Both are massively less anonymous than physical operational measures like leaving your phone at home and using cash to buy a prepaid hotspot in a town you don't normally visit. Comparing tools without an adversary frame is meaningless.

The core vocabulary: anonymity, unlinkability, unobservability

Loose English makes anonymity discussions slippery. The Pfitzmann/Hansen terminology proposal, originally from the early 2000s and revised through 2010, gives us precise definitions that let us argue without talking past each other. The relevant terms:

Anonymity is the state of not being identifiable within a set of subjects (the anonymity set). Anonymity is always relative to a set: "Alice is anonymous within the set of all 8 million Tor users" is meaningful; "Alice is anonymous" without qualification is not. The size of the set matters, but so does the distinguishability of subjects within the set — see "anonymity sets and why crowd size is not enough" below.

Unlinkability is the property that an adversary cannot determine whether two or more events (messages, actions, sessions) are related to the same subject. Two messages might individually be anonymous, but if the adversary can link them ("the same anonymous user sent both"), the combined linked set may reveal more than each message alone.

Unobservability is the property that an adversary cannot tell whether communication is happening at all. A communication is unobservable if the adversary cannot distinguish "Alice is communicating right now" from "Alice is not communicating right now." Unobservability subsumes anonymity (if the communication isn't observable, the participants certainly aren't identifiable) and is much harder to achieve.

Pseudonymity is the use of a stable identifier (a pseudonym) that is not the subject's real-world identity, but allows linkability of actions performed under that pseudonym. A pseudonymous user is not anonymous (their actions are linkable to their pseudonym) but their pseudonym is not associated with their real-world identity. Pseudonymity is the model behind reputation systems, anonymous-but-persistent forum accounts, and most cryptocurrency wallets.

Confidentiality is a different property from anonymity. Confidentiality protects the content of a message; anonymity protects the identity of the sender or receiver. A perfectly confidential message between Alice and Bob still tells observers "Alice is talking to Bob" if the IP addresses are visible. Conversely, an anonymous message system doesn't necessarily encrypt content — though most do, since the two properties are usually pursued together.

The properties form a rough ladder:

  • Confidentiality is the easiest. TLS solves the basic case.
  • Anonymity is harder. Solved imperfectly by VPNs (against some adversaries) and Tor (against more adversaries).
  • Unlinkability is harder still. Even Tor is imperfect: same-session activity is linkable by definition; cross-session linkability depends on circuit isolation and application behavior.
  • Unobservability is hardest. True unobservability requires steganography, dead drops, or cover traffic so dense that nothing distinguishes "communicating" from "not communicating." Practical low-latency systems do not achieve it.

When someone says "this VPN gives me anonymity," the precise question is: anonymous against whom, with what unlinkability properties, and at what observability level? Almost always, the honest answer is much narrower than the marketing implies.

Who can see what on a real network path

A normal HTTPS request from a laptop to a website passes through many actors. Each sees a different slice. Mapping these observers explicitly is the first step in any threat model.

A typical request's observability map:

  • The local device. Sees everything: the application making the request, the cleartext content (before TLS), the destination URL, the user. If the device is compromised, anonymity at any other layer doesn't matter.
  • The local LAN. A coworker, a roommate, a coffee-shop attacker on the same WiFi. Sees: TCP/UDP packets to/from your device, including source/destination IPs (not encrypted at the IP layer), TLS SNI (until ECH is widely deployed), traffic timing and volume. Cannot see TLS-encrypted content.
  • The recursive DNS resolver. Sees: every domain you look up. The default for most home networks is your ISP's resolver; cloud-hosted resolvers (1.1.1.1, 8.8.8.8) replace ISP visibility with the resolver operator's visibility. DoH/DoT obscures the lookup from the network in between but not from the resolver itself — see the sketch after this list.
  • The local ISP. Sees: TCP/UDP packets to/from your devices, source/destination IP pairs, packet sizes, timing, DNS lookups (if using their resolver), TLS SNI, total volume. Cannot see TLS-encrypted content. Retains logs for durations that vary by jurisdiction (months in many democracies, years in some authoritarian regimes).
  • Transit ISPs. Networks the packets traverse between your ISP and the destination. Each sees: source/destination IPs of packets that traverse them, sizes, timing. Visibility depends on routing — a request to a major service usually traverses 2-5 transit networks. State-level adversaries who can observe transit links see substantial slices of internet traffic.
  • The destination service. Sees: source IP of incoming connection (which is your VPN exit if you use a VPN, your Tor exit if you use Tor), the full request content (after TLS terminates), session cookies, account credentials if you log in, browser fingerprint (User-Agent, Accept headers, screen resolution if JavaScript runs).
  • CDN / edge networks. Many destinations sit behind Cloudflare, Akamai, Fastly. The edge provider sees what the destination sees, plus may operate the TLS termination point.
  • The certificate transparency log infrastructure. When the destination's TLS cert was issued, it was logged in publicly-auditable CT logs. Anyone can see "this CA issued this certificate for this domain" — generally not a privacy issue, but worth knowing.
  • Out-of-band identity systems. OAuth providers (Google, Microsoft) see your account activity. Push notification services (APNs, FCM) see app activity. Email-verification flows expose email-to-account linkage to email providers.
  • Storage and backup providers. If the device's data is backed up to the cloud, the backup provider sees the data.
  • Global passive adversary. Five Eyes-style intelligence apparatus, certain national-security agencies. Can observe traffic at major internet exchange points, undersea cable landings, and inside major telecom backbones. Sees: a substantial fraction of internet traffic worldwide as encrypted-but-correlatable flows.
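
To make the resolver point concrete, here is a minimal DoH lookup sent straight to the resolver over HTTPS — a sketch assuming Cloudflare's JSON API at cloudflare-dns.com/dns-query (most other DoH resolvers speak the binary wireformat instead). The network path between you and the resolver sees only a TLS connection; the resolver operator still sees every name you ask about.

# pip install requests — a sketch, not a hardened client
import requests

resp = requests.get(
    "https://cloudflare-dns.com/dns-query",      # Cloudflare's JSON DoH endpoint
    params={"name": "example.com", "type": "A"},
    headers={"accept": "application/dns-json"},  # request the JSON answer format
    timeout=10,
)
resp.raise_for_status()
for answer in resp.json().get("Answer", []):
    # The resolver saw this name; the network in between saw only TLS to it.
    print(answer["name"], answer["data"])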

For any specific request, only a subset of these observers is actually relevant. A coffee-shop user worried about the coffee shop owner's nephew has a small observer set; a journalist communicating with an intelligence-target source has a much larger one. The exercise is enumerating which observers actually matter for the situation at hand, then evaluating tools against that set.

Passive, active, and global adversaries

Adversaries fall into three operational classes, and the distinction matters for what defenses make sense.

Passive observers see traffic but don't modify it. They watch. The local LAN observer is passive. The ISP is passive (at least in a country where the ISP doesn't actively interfere with traffic). A snooping coworker is passive. Defenses against passive observers are largely cryptographic — encryption hides content, traffic mixing hides metadata, anonymity overlays hide source/destination linking. Passive observers cannot do active probing, cannot inject packets, cannot tamper with traffic in flight.

Active adversaries modify traffic, inject packets, or otherwise interact with the network in real time. A nation-state firewall doing deep packet inspection and selective blocking is active. An attacker on the same LAN doing ARP poisoning is active. A compromised CA issuing fraudulent certificates is active. Active adversaries can:

  • Block traffic that matches certain patterns ("drop all WireGuard handshakes").
  • Inject responses (DNS poisoning, RST injection on TCP connections, fake CAPTCHAs).
  • Probe servers ("send a TLS handshake and see if it accepts our test cert").
  • Man-in-the-middle connections by intercepting and re-encrypting.
  • Inject identifying patterns into your traffic to fingerprint you later.

Defenses against active adversaries are harder. Cryptographic authentication (TLS with proper cert validation, certificate pinning, mTLS) prevents the MITM case. Pluggable transports and protocol obfuscation prevent the pattern-matching case. But active adversaries can also block instead of attack — if a censor can simply drop traffic they can't classify, that's a denial of access, which most anonymity tools can't directly address.
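
As one concrete instance of the authentication defenses above, here is a minimal certificate-pinning check in Python's standard library. The pinned fingerprint is a placeholder you would have to record out of band before trusting any network; real deployments usually pin through their TLS library or HTTP client configuration rather than a hand-rolled check like this sketch.

import hashlib, socket, ssl

# Placeholder: the known-good certificate's SHA-256 fingerprint, obtained
# out of band (e.g., from the operator) before any untrusted network is used.
PINNED_SHA256 = "0" * 64

ctx = ssl.create_default_context()
with socket.create_connection(("example.com", 443), timeout=10) as sock:
    with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
        der_cert = tls.getpeercert(binary_form=True)  # leaf cert as DER bytes
        fingerprint = hashlib.sha256(der_cert).hexdigest()
        if fingerprint != PINNED_SHA256:
            # An active MITM presenting a different (even CA-signed) cert fails here.
            raise SystemExit(f"pin mismatch: got {fingerprint}")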

Global passive adversaries (GPAs) see traffic at many points simultaneously. They don't have to actively interfere; just observing many flows lets them correlate timing and volume to identify users. Tor's design paper explicitly says Tor does not protect against GPAs: an adversary observing both the traffic from the user to the guard and the traffic from the exit to the destination can correlate the two timing-and-volume signatures and conclude they're the same circuit, even though every cell in between is encrypted.

GPAs are real but their capabilities are debated. Five Eyes-style intelligence services have been documented (via the Snowden disclosures and other sources) operating taps at major internet exchange points and submarine cable landings. The actual scope and quality of correlation against any specific user is harder to know. The conservative assumption in anonymity design is "if you're worth a GPA's attention, low-latency anonymity systems can be defeated by them."

The threat-model question is which adversary class you're defending against. Most users only ever face passive local observers; defenses against them are cheap and effective. A meaningful minority face active censors; defenses (REALITY, naïveproxy, obfuscated transports — see xray-reality-vs-wireguard) exist and work in practice but are an arms race. A small but important minority face GPAs; for them, low-latency systems are insufficient and high-latency mixnets, dead drops, or operational measures (no internet at all for sensitive communications) become relevant.

Anonymity sets and why crowd size is not enough

The anonymity set is the set of subjects within which one specific subject is indistinguishable. "Alice is anonymous within the set of all Tor users" is a useful claim; the set is millions of people; finding Alice within them is hard.

But the anonymity set isn't just "everyone who uses the tool." It's the set of subjects whose observable behavior is indistinguishable from Alice's. A few examples of how the effective set collapses:

  • Distinct timing. Alice always uses Tor at 3am Pacific time. The anonymity set "all Tor users" shrinks to "Tor users active at 3am PT," which is dramatically smaller than the headline number.
  • Distinct browsing behavior. Alice loads https://specific-rare-forum.example.com over Tor. The exit logs (or the destination's logs) show "someone loaded specific-rare-forum.example.com." The anonymity set shrinks to "Tor users who loaded that specific URL." If only a few people read that forum, the set is tiny.
  • Distinct fingerprint. Alice's browser has a particular combination of plugins, screen resolution, time zone, and user-agent that's unique among Tor users (or close to it). Even within the larger anonymity set, fingerprinting collapses her to a singleton. See browser-fingerprint-hardening.
  • Distinct application behavior. Alice logs into the same email account from Tor that she also accesses from her home IP. The application-layer linkability collapses the anonymity entirely — the destination service sees the same account credential from both contexts and can link them.

The point: anonymity-set size is necessary but not sufficient. What matters is anonymity-set size after accounting for distinguishability. A 1-million-user system where each user has a unique fingerprint provides anonymity-set size 1 in practice. The Tor Project understands this and works hard on Tor Browser uniformity (every Tor Browser instance presents nearly identical fingerprints, JavaScript timing precision is reduced, fonts are constrained), but application-layer behavior is the user's responsibility.
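
One common way to formalize "set size after distinguishability" is entropy-based, in the spirit of the degree-of-anonymity metrics from the anonymity-measurement literature: treat the adversary's posterior over candidate subjects as a probability distribution and report the exponential of its Shannon entropy as the effective set size. A toy sketch with invented probabilities:

import math

def effective_set_size(posterior):
    # Exponential of Shannon entropy: a uniform posterior over N subjects
    # gives N; a near-certain singleton gives a value close to 1.
    entropy = -sum(p * math.log2(p) for p in posterior if p > 0)
    return 2 ** entropy

n = 100_000
uniform = [1 / n] * n                            # everyone indistinguishable
collapsed = [0.97] + [0.03 / (n - 1)] * (n - 1)  # unique fingerprint: 97% posterior on Alice

print(effective_set_size(uniform))    # 100000.0 — the headline user count
print(effective_set_size(collapsed))  # ~1.6 — the effective crowd after fingerprinting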

The corollary: smaller anonymity sets aren't necessarily worse if distinguishability is also lower. A small mixnet where everyone behaves identically may provide better effective anonymity than a large low-latency system where everyone is distinguishable by behavioral patterns.

Low-latency anonymity versus high-latency anonymity

This is the central architectural tradeoff in anonymity systems and the one most worth understanding.

Low-latency systems (Tor, VPNs, sing-box overlays) move traffic with minimal delay so users can browse the web, watch video, hold voice calls. The cost: timing signatures of the traffic are preserved in transit. An adversary observing both ends of a Tor circuit sees the same timing pattern at both ends and can correlate them.

High-latency systems (mix networks like Loopix and Nym, traditional Mixmaster-style email mixers) introduce deliberate delays, batching, padding, and reordering at each hop. The cost: traffic takes minutes to hours instead of milliseconds. The benefit: timing correlation becomes statistically much harder; an adversary observing both ends sees timing patterns that are no longer correlated, because the intermediate mixing has scrambled the timing structure.

The tradeoff is real and unavoidable. You cannot have both low latency and timing-correlation resistance. The math of timing analysis is too straightforward; if Alice sends a 1 KB packet at time T and a 1 KB packet leaves the system at time T + 50ms, an adversary observing both has a strong hint they're the same packet. The only way to break this hint is to make T + 50ms long enough and the timing variance large enough that the correlation is no longer reliable — which means high latency.
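
A toy simulation makes the tradeoff tangible. The numbers below are invented (a ~50 ms low-latency hop versus exponential mixing delays averaging two minutes), and the matching rule is far cruder than real correlation attacks, but the gap it exposes is the real one:

import random

def match_rate(entry_times, exit_times, window=0.1):
    # Crude correlation test: fraction of entry packets that have some exit
    # packet within `window` seconds after them. Real attacks use far
    # stronger statistics; this only illustrates the latency tradeoff.
    exits = sorted(exit_times)
    hits = sum(1 for t in entry_times if any(t < e <= t + window for e in exits))
    return hits / len(entry_times)

random.seed(7)
entry = sorted(random.uniform(0, 60) for _ in range(200))    # 200 packets in a minute

tor_like = [t + random.uniform(0.03, 0.08) for t in entry]   # ~50 ms transit delay
mix_like = [t + random.expovariate(1 / 120) for t in entry]  # mean 2 min mixing delay

print(match_rate(entry, tor_like))  # 1.0: every packet pairs up within 100 ms
print(match_rate(entry, mix_like))  # near chance level: timing structure destroyed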

Tor sits at the low-latency end of the spectrum and accepts the GPA threat. Mixnets sit at the high-latency end and aren't usable for browsing. The choice depends on the threat model:

  • Threat model: local passive observers, casual surveillance, commercial advertising tracking. Low-latency systems (VPN, Tor) are sufficient and the right tool.
  • Threat model: GPA, intelligence agencies, long-term traffic correlation against a specific target. High-latency systems (mix networks for messaging, dead drops for the most sensitive cases) are required. Low-latency systems are insufficient.

The Track 4 modules on traffic analysis (traffic-analysis-fundamentals — coming soon), padding strategies (padding-strategies-and-cover-traffic — coming soon), and mix networks (mix-networks-loopix-nym — coming soon) go deep on this tradeoff. For now: notice that the tradeoff exists and that "Tor is anonymous" is incomplete unless you specify whose observation you're talking about.

Composition failures across layers

Most real-world anonymity failures don't come from broken ciphers or correlated-circuit attacks. They come from composition mistakes — the user did something at the application layer that linked their anonymous network presence to their real-world identity. The cryptographic transport worked perfectly; it just didn't matter because the leak was somewhere else.

The composition-failure catalog:

Application-layer identity linkage. You log into your real-name email account through Tor. The transport is anonymous; the application destroys the anonymity by carrying your identity into the session.

Browser state. Cookies from your non-Tor browsing get sent through the Tor connection because you used the same browser. Or: your browser's autofill helpfully fills in your real name and email on a Tor-accessed form. Tor Browser exists specifically to isolate browser state, but if you bypass it, anonymity collapses.

DNS leaks. A misconfigured client resolves DNS locally before connecting through Tor. The destination domain is leaked to the local DNS resolver, which is observed by your ISP. Tor saw nothing wrong; the application-layer DNS broke the anonymity. SOCKS5h vs. SOCKS5 matters here; see the Tor module.
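
The socks5 vs. socks5h distinction is easy to demonstrate. A sketch using Python's requests library (pip install requests[socks]), assuming a local Tor SOCKS port at 127.0.0.1:9050:

import requests

# socks5://  — requests resolves the hostname LOCALLY, then proxies the TCP
#              stream. Your OS resolver (and so your ISP) sees the lookup.
leaky = {"https": "socks5://127.0.0.1:9050"}

# socks5h:// — the hostname is handed to the proxy unresolved; Tor resolves
#              it at the exit, and no local DNS query is ever made.
safe = {"https": "socks5h://127.0.0.1:9050"}

r = requests.get("https://example.com", proxies=safe, timeout=30)
print(r.status_code)

Watch port 53 with tcpdump while switching between the two schemes and the leak is directly visible.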

Time-of-day patterns. You only access your anonymous account during your work-day timezone. Long-term observation of the destination (or the guard) reveals timezone patterns that narrow the anonymity set.

Writing style. Stylometric analysis can identify authorship of text from a few thousand words. Two anonymous accounts with the same writing style are linkable to each other; both linkable to a non-anonymous account whose author wrote in the same style.

Application fingerprinting. TLS JA3/JA4 fingerprints, HTTP/2 settings, and browser fingerprints can identify the specific software stack a user is on. If Alice always uses Firefox 122 with custom settings X, Y, Z over Tor, that combination may be unique. See ja3-ja4-tls-fingerprinting.
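
JA3 itself is simple enough to sketch: an MD5 over five comma-separated ClientHello fields, each field a dash-joined list of the decimal values the client offered. The sample values below are placeholders, not a real browser's offer:

import hashlib

def ja3(tls_version, ciphers, extensions, curves, point_formats):
    # JA3 string: SSLVersion,Ciphers,Extensions,EllipticCurves,PointFormats
    # with "-" joining values inside a field and "," separating the fields.
    fields = [str(tls_version)] + [
        "-".join(str(v) for v in part)
        for part in (ciphers, extensions, curves, point_formats)
    ]
    return hashlib.md5(",".join(fields).encode()).hexdigest()

# Placeholder values for illustration only — not a real client's offer.
print(ja3(771, [4865, 4866, 4867], [0, 23, 65281], [29, 23, 24], [0]))

Any stable difference in those offered lists — a nonstandard cipher order, one extra extension — changes the hash and shrinks the crowd you blend into.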

Out-of-band channels. You email someone "I'll connect to you over Tor in an hour." The email is observed; the timing of the Tor connection matches; an adversary correlates them.

OS-level information disclosure. OS prefetch data, swap files containing decrypted memory, telemetry sent to OS vendors, software-update metadata — all can leak information that bridges anonymous and identified contexts.

Physical-world correlations. Connecting to Tor only when at home with a fixed IP geolocation. Using a unique device whose hardware identifiers are knowable. Buying an "anonymous" hotspot with a credit card.

The pattern: anonymity is a property of the entire system from physical layer to social behavior. Strengthening one layer doesn't help if a weaker layer is the actual leak. A useful threat model enumerates the layers and asks "where is the weakest link?" — usually it's not the cryptographic transport.

Worked comparison: VPN, Tor, and mixnet

To make all this concrete, walk through a single message — Alice sends an HTTPS request to example.com — under three transport choices, against four adversary classes.

Adversary               | Direct  | VPN     | Tor     | Mixnet
─────────────────────── | ─────── | ─────── | ─────── | ───────
Local LAN observer      | dest IP | VPN IP  | guard   | mix entry
                        | + SNI   | only    | only    | only
ISP                     | dest IP | VPN IP  | guard   | mix entry
                        | + SNI   | only    | only    | only
Destination service     | src IP  | VPN exit| Tor exit| mixnet exit
                        | (real)  | IP      | IP      | IP (delayed)
GPA correlating both    | direct  | trivial | possible| hard
ends                    | linkage | linkage | (timing)| (mixed)

Reading the table:

Direct (no anonymity tool):

  • Local LAN observer sees: destination IP and TLS SNI. The fact that you're contacting example.com is plain.
  • ISP sees: same, plus DNS lookup, packet timing, total volume.
  • Destination service sees: your real source IP, full request content.
  • GPA sees: same as ISP, with additional information from observing more network paths.

VPN:

  • Local LAN observer sees: encrypted UDP/TCP to the VPN provider's IP. The destination is hidden.
  • ISP sees: same — encrypted tunnel to VPN provider.
  • Destination service sees: the VPN provider's exit IP, full request content.
  • GPA observing both your home connection to the VPN and the VPN's connection to the destination: trivial linkage by timing and volume. The VPN's encryption doesn't break this; the GPA correlates the two endpoints of the tunnel.
  • The VPN provider sees: your real source IP and your destination. The VPN is a single point of trust.

Tor:

  • Local LAN observer sees: encrypted TLS to the guard relay (looks like generic HTTPS, but Tor traffic has recognizable patterns).
  • ISP sees: same — connection to the guard relay.
  • Destination service sees: the Tor exit's IP (which is a known Tor exit; the destination knows you're using Tor).
  • GPA observing both the user-to-guard traffic and the exit-to-destination traffic: timing-correlation attack succeeds in many cases. Cells are encrypted at every hop, but timing patterns survive. This is the GPA threat Tor explicitly does not defend against.
  • No single Tor relay sees both your IP and your destination. The trust is distributed across the three relay positions.

Mixnet (Loopix-style):

  • Local LAN observer sees: encrypted traffic to a mix node, plus periodic cover traffic if the mixnet pads. Hard to distinguish "actively communicating" from "idle."
  • ISP sees: same.
  • Destination service sees: the mixnet exit IP; receives the message at a delayed time (minutes to hours after Alice sent it).
  • GPA observing all mix-node traffic: the mixing operations (delays, batching, reordering, cover traffic) make timing correlation statistically hard. Not impossible — it depends on how thorough the mixing is and how persistent the GPA's observation — but the threshold for confident correlation is much higher than for Tor.
  • No single mix node sees both endpoints; like Tor, trust is distributed.

Notice what each tool actually changes. The VPN moves trust from your ISP to the VPN provider but doesn't defend against GPAs at all. Tor distributes trust across three relays and defends against most adversaries except GPAs. Mixnets defend against GPAs at the cost of usability for interactive applications. None is "more anonymous" in a vacuum; each has a specific threat-model fit.

Threat-model writing as an engineering skill

Writing a threat model is the discipline anonymity engineering rests on. The format doesn't matter as much as the discipline of doing it. A working format:

THREAT MODEL — [Scenario name]

ASSETS (what we're protecting)
- Identity of the user
- Connection to the destination
- Content of the connection
- Timing of the connection
- Any application-layer credentials in use

ADVERSARIES (who we're protecting against)
- [Adversary A]: capabilities, observable points, motivation
- [Adversary B]: ...
- [Adversary C]: ...

OBSERVABLE EVENTS (what each adversary sees)
- Adversary A sees: [list]
- Adversary B sees: [list]
- Adversary C sees: [list]

ACCEPTABLE FAILURE MODES
- It is acceptable that [adversary A] knows [some specific thing], because [...]
- It is unacceptable that [adversary B] knows [some specific thing], because [...]

DEFENSES
- [Defense 1] addresses [observation X by adversary Y]
- [Defense 2] addresses [observation Z by adversary W]

RESIDUAL RISK
- After defenses, [adversary B] can still observe [residual things]; we accept this because [...]

A worked example (the whistleblower scenario from earlier):

THREAT MODEL — Government-document upload by domestic source

ASSETS
- Source's real-world identity
- The fact that source is communicating with the offshore platform
- Content of the documents being uploaded
- Timing of the upload (could be correlated with internal access)

ADVERSARIES
- Domestic ISP (passive, retains logs, subject to government request)
- Domestic state telecom monitoring (active, can do DPI and selective blocking,
  may have GPA-like coverage of external links)
- Offshore platform (presumed honest but coercible if compelled by the
  government's diplomatic channels)
- Local network at the source's workplace (passive, monitored by IT)

OBSERVABLE EVENTS
- Domestic ISP sees: encrypted traffic from source's home IP to some destination,
  packet sizes and timing, possibly TLS SNI (depends on tooling)
- State telecom sees: same as ISP, plus may have visibility into upstream
  transit links and external connections
- Offshore platform sees: source IP of incoming connection (which we want
  to be a Tor exit, not the source's real IP), application-layer identity
  (which we want to be pseudonymous)
- Workplace network sees: nothing if the upload happens from home

ACCEPTABLE FAILURE MODES
- It is acceptable that domestic ISP knows source uses Tor (Tor usage alone
  is not direct evidence of intent in this jurisdiction).
- It is acceptable that the offshore platform knows the upload came from a
  Tor exit (this is the design).

UNACCEPTABLE FAILURE MODES
- It is unacceptable that the source's real IP appears anywhere downstream
  of Tor's exit relays.
- It is unacceptable that the upload timing can be correlated with internal
  document access at the source's workplace.
- It is unacceptable that any persistent identifier links the upload to the
  source's other online activity.

DEFENSES
- Use Tor with bridges + obfs4 to obscure Tor usage from domestic monitoring.
- Upload from a personal device, not workplace devices, with timing decorrelated
  from internal document access (delay days, not minutes).
- Use Tor Browser default configuration; do not log in or carry browser state.
- Use a fresh pseudonym at the offshore platform (no email/phone verification
  that could link to identifiable accounts).
- Strip metadata from documents before upload.

RESIDUAL RISK
- A sufficiently determined GPA correlating Tor entry traffic with offshore
  platform receive timing might still link the upload. Mitigation: delay
  between source's Tor session and platform receive timing such that
  correlation is statistically weak; for highest-stakes documents, prefer
  high-latency mixnets or dead drops over Tor.

This format isn't sacred. What matters is that you've enumerated who you're hiding from, what they can see, what you accept losing, and what defenses address what observations. Without this, you cannot evaluate whether a specific tool is sufficient. With it, the tool selection becomes a series of straightforward "does mechanism X defeat observation Y by adversary Z?" questions.
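
That last question is mechanical enough to encode. One way to keep the coverage check honest is a small script — the field names and observation IDs below are invented for illustration, not any standard schema:

from dataclasses import dataclass, field

@dataclass
class Defense:
    name: str
    addresses: list[str]  # IDs of the observations this defense is meant to defeat

@dataclass
class ThreatModel:
    scenario: str
    observations: dict[str, str]  # observation ID -> "adversary X sees Y"
    unacceptable: list[str]       # observation IDs that MUST be defeated
    defenses: list[Defense] = field(default_factory=list)

    def uncovered(self):
        # Unacceptable observations that no defense claims to address are
        # residual risk: accept them explicitly or design them away.
        covered = {obs for d in self.defenses for obs in d.addresses}
        return [obs for obs in self.unacceptable if obs not in covered]

tm = ThreatModel(
    scenario="Document upload by domestic source",
    observations={
        "isp-dst": "ISP sees destination of outbound connections",
        "gpa-timing": "GPA correlates entry and exit timing",
    },
    unacceptable=["isp-dst", "gpa-timing"],
    defenses=[Defense("Tor with obfs4 bridges", addresses=["isp-dst"])],
)
print(tm.uncovered())  # ['gpa-timing'] -> still open; document it as residual risk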

The rest of Track 4 — traffic analysis, padding, mixnets, fingerprinting, steganography — gives you the mechanism-side understanding to populate the "DEFENSES" section. This module gave you the structure to know what defenses you actually need.

Hands-on exercise

Build an observability map for one web request.

Tools: traceroute, tcpdump, text editor. Runtime: 20 minutes.

On a Linux or macOS machine on your normal network, in one terminal:

sudo tcpdump -nn -i any -w /tmp/observability.pcap "host example.com or port 53"

In another terminal:

traceroute example.com
curl -v https://example.com > /dev/null

Stop tcpdump after curl completes. Then open the pcap (in Wireshark, or tcpdump -r /tmp/observability.pcap | head -40).

For each packet you observe, record what each of the following observers sees:

Packet          | Local LAN sees               | ISP sees | Destination sees
─────────────── | ──────────────────────────── | ──────── | ──────────────────────────
DNS query       | example.com lookup, your IP  | same     | nothing (DNS isn't to it)
TCP SYN         | src+dst IP, src port         | same     | your src IP, src port
TLS ClientHello | src+dst IP, SNI=example.com  | same     | full TLS handshake
TLS app data    | src+dst IP, sizes, timing    | same     | decrypted application data
TCP teardown    | src+dst IP, sizes            | same     | TCP close

The point isn't the table per se — it's the act of writing it. Most engineers haven't actually thought through what's observable at each hop; the exercise of mapping it explicitly reveals where the surprises are.
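
If you'd rather script the first column than eyeball Wireshark, a rough helper using scapy (pip install scapy) prints the fields a passive LAN observer reads without any keys. Extracting the TLS SNI takes scapy's TLS layer and is omitted here:

from scapy.all import rdpcap, IP, TCP, DNS

for pkt in rdpcap("/tmp/observability.pcap"):
    if IP not in pkt:
        continue
    # Everything on this line is visible to anyone on the same LAN segment.
    line = f"{float(pkt.time):12.3f}  {pkt[IP].src} -> {pkt[IP].dst}  len={len(pkt)}"
    if pkt.haslayer(DNS) and pkt[DNS].qd is not None:
        line += f"  DNS: {pkt[DNS].qd.qname.decode()}"  # the looked-up name, in cleartext
    elif TCP in pkt:
        line += f"  TCP dport={pkt[TCP].dport}"
    print(line)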

Stretch: repeat the exercise conceptually (without actually running it) for the same request through a VPN, then through Tor, then through a hypothetical mixnet. What changes in each row?

Write a threat model for a sample whistleblower workflow.

Tools: plain text, the template above. Runtime: 15 minutes.

Take this scenario:

A worker at a financial firm has discovered evidence that the company is laundering money for sanctioned entities. They want to send the evidence to investigative journalists at an international newspaper. They cannot risk being identified by the company or by the company's legal/PR firm.

Write a threat model using the template. Specifically enumerate:

  • The assets (their identity, the connection, the content).
  • The adversaries (the company, the company's IT department, possibly the company's lawyers via subpoena power, possibly state-level adversaries if the laundering involves governments).
  • What each adversary observes.
  • What's acceptable to leak versus what's catastrophic.
  • The defenses that would address the unacceptable leaks.
  • The residual risk after defenses.

Don't optimize for a "right answer." Optimize for clarity about who can see what. The point is to internalize the discipline.

Common misconceptions and traps

"Encryption implies anonymity." It implies confidentiality of content. The IP-level metadata, packet timing, packet sizes, DNS lookups, and TLS SNI fields are still visible to network observers even when the application content is encrypted. Anonymity requires hiding identity-and-relationship information that encryption doesn't address.

"Tor solves every privacy problem once enabled." Tor solves the transport-level network anonymity problem against most adversaries except GPAs. It doesn't solve application-layer linkage (logging in, browser state, account history), doesn't solve traffic-analysis attacks against website fingerprinting, doesn't solve operational mistakes (using Tor only at certain times of day from certain locations), and doesn't make a destination that knows you're using Tor treat you as anonymous.

"A large user base guarantees anonymity." Anonymity-set size is necessary but not sufficient. What matters is the size after accounting for behavioral and fingerprintable distinguishability. A user with a unique browser fingerprint within a million-user system has effective anonymity-set size 1.

"Global passive adversaries are unrealistic so they can be ignored." Some are documented (Snowden disclosures); some are inferred from intelligence-budget and capability disclosures. The honest engineering answer is "GPAs are real for some users but not for most." Defenses against GPAs (high-latency mixnets, dead drops) are expensive, so design choice depends on whether your threat model includes them. The engineer's job is to be explicit about whether GPAs are in or out of scope, not to dismiss them.

"Threat modeling is paperwork." In anonymity engineering it's the difference between rational defense selection and theater. A team that writes good threat models picks defenses that match observations; a team that doesn't picks defenses based on which tools are popular this year. The latter often gets compromised by adversaries the team didn't bother to enumerate.

"More layers of defense is always more anonymous." Layering defenses without understanding what each one addresses creates complexity that can backfire. VPN + Tor + sing-box stacked together can introduce more identifying patterns (more software, more configuration surface, more potential for observable misconfiguration) than any single tool used correctly. Each layer should address a specific observation by a specific adversary; layering for its own sake is not a defense, it's a dependency.

"My adversary doesn't have those resources." Maybe true, maybe not. Threat model carefully and document your assumptions; revisit them when news suggests assumptions have changed. Adversary capabilities accumulate over time (better fingerprinting tools, more taps installed, more legal compulsion authority), and a defense designed against 2020-era adversaries may not work against 2026-era ones.

Wrapping up

Threat models are how anonymity engineering keeps itself honest. Without them, "anonymity" collapses into product marketing and folklore. With them, you can reason cleanly about which defenses address which observations by which adversaries — and accept the residual risk that always remains.

The vocabulary (anonymity, unlinkability, unobservability, pseudonymity, anonymity sets) gives you precise terms for properties that English mushes together. The adversary classification (passive, active, GPA) gives you a framework for thinking about who can do what. The composition-failure catalog reminds you that the cryptographic transport is rarely where leaks actually happen — application-layer behavior, browser state, and operational patterns are usually the actual leak.

The rest of Track 4 builds on this foundation: traffic analysis is what passive observers do to break low-latency anonymity; padding and cover traffic are defenses against traffic analysis; mixnets are the high-latency systems that take padding to its logical conclusion; browser and TCP/IP fingerprinting are the application-layer leaks that compose with transport leaks; steganography is the unobservability frontier. Each module assumes you know how to ask "anonymous against whom?" before evaluating a tool. That discipline is what this module trained.

The next module (traffic-analysis-fundamentals — coming soon) goes deep into what passive observers can actually infer from encrypted-but-observable traffic, and why timing and volume are the most powerful identifiers in low-latency systems.
