Encrypted Transport · Part 4 of 7 · Anonymity Engineering · 32 min read · advanced

Tor, onion routing, and circuit-level anonymity

Tor from the transport up: cells, telescoping circuits, guards, exits, directory authorities, and why Tor is not just a VPN with extra hops.

Tor is not a VPN. It runs over TCP and uses encryption, so it superficially resembles one, and casual writeups call it "a VPN with three hops" or "extra-anonymous internet." Both descriptions are wrong in ways that matter for understanding what Tor actually protects against and what it doesn't.

A VPN moves your TCP connections through a single encrypted tunnel terminating at one operator. The operator can see what you do; the VPN provides confidentiality from the network in between, plus a different exit IP. Trust is concentrated in the operator.

Tor is a TCP-stream anonymity overlay. It splits trust across three independently-operated relays. No single relay knows both who you are and what you're doing. The encryption is layered — wrapped in onion-shaped layers, hence the name — so each relay can only peel its own layer. The system is designed so that compromising any one relay reveals nothing useful about the user's activity, and even compromising two relays is not enough as long as they aren't the right two.

This module is the architectural treatment of Tor: how the protocol is built, why it looks the way it does, and what the design choices cost in performance and what they pay for in anonymity. We're not going to repeat practical user advice — that already lives in tor-technical-users-guide. Here we'll walk through fixed-size cells, telescoping circuit construction, directory authorities, the guard/middle/exit role separation, the cryptography that makes onion routing work, the unavoidable leakage Tor accepts as the price of low latency, and why all of this is fundamentally different from running a TCP connection through a VPN concentrator.

Learning objectives

  1. Explain Tor as a circuit-based anonymity overlay distinct from any VPN, with different threat goals and structural compromises.
  2. Describe how fixed-size cells, telescoping circuit construction, guard nodes, and directory authorities fit together to implement onion routing.
  3. Distinguish the relay roles — guard, middle, exit — and explain what each knows and does not know about a circuit.
  4. Explain why Tor's transport and anonymity properties differ fundamentally from a simple encrypted tunnel, and what residual leakage Tor accepts.

Why Tor is not a VPN

The "VPN with extra hops" framing fails on multiple fronts. Worth enumerating:

Trust model. A VPN has one operator. They have logs (or claim not to). They see your real IP and the destinations you visit. You trade trust in your ISP for trust in them. Tor distributes trust across three independent operators, none of whom is supposed to see both your real IP and your destinations. The threat model is "no single party can deanonymize you," which is a categorically different security property than "trust the operator."

Routing model. A VPN tunnels arbitrary IP packets (in the TUN model) or Ethernet frames (TAP). Tor moves only TCP streams — UDP doesn't traverse the network at all. There is no DNS forwarding by default; clients use SOCKS, and Tor itself does the destination resolution at the exit. ICMP, raw sockets, custom protocols — all unsupported. This is a deliberate scope limitation: anonymity properties are easier to reason about for one transport class than for "everything IP can carry."

Path properties. A VPN's path is one encrypted hop terminating at one operator. Tor's path is three relays, with the path constructed by the client (not chosen by any server), changing periodically, and using telescoping construction so that no relay learns the full path. Path diversity — picking relays in different jurisdictions, ASes, and operator families — is part of the anonymity story; a VPN has no equivalent concept.

Latency profile. A VPN adds one hop's worth of latency. Tor adds at least three hops' worth, plus the geographic distance between them, plus the load-induced queueing at each relay, plus circuit construction overhead at the start of each new TCP connection. Real-world Tor browsing latencies are commonly 500-2000ms on first page load. This is not a limitation that can be optimized away — it's structural to the design.

Identity model. A VPN has accounts. You log in. The operator knows you. Tor has no accounts — clients don't authenticate to the network. Anyone can use Tor. Anyone can run a relay. Anyone can run a directory mirror. This permissionless model is what makes the anonymity property credible (no central party to subpoena), and it's why operating Tor at scale is a fundamentally different engineering problem than operating a VPN.

Adversary model. VPN adversaries are typically the network in between (your ISP, public WiFi, transit providers). Tor's adversary model is broader: the adversary can run relays, can observe parts of the network, can correlate traffic across relays they observe, can run timing attacks, can perform website fingerprinting against guards. The defenses are different because the attackers are different.

So when you see "Tor is just a VPN with three hops," correct it. Tor is a separate system with separate threat goals. Some of those goals overlap with what a VPN provides (network confidentiality), some don't (trust distribution, anonymity from operators), and some Tor pursues that VPNs don't try to provide at all (defense against running adversary relays in the path).

Onion routing as layered path knowledge

The "onion" metaphor refers to the layered encryption: a packet bound for the destination through three relays is wrapped in three layers of encryption, the outermost peelable only by the first relay, the next only by the second, and so on. The destination receives the un-onioned plaintext (modulo any application-layer encryption the user added).

The structural property this creates is predecessor/successor knowledge only: each relay knows only the IP of the previous hop and the IP of the next hop. The first relay (the guard) knows the client's IP but not the destination. The middle relay knows neither — only the IPs of the guard and the exit. The exit knows the destination but not the client. As long as the guard and exit don't collude (and they shouldn't — Tor's path-selection logic avoids picking two relays from the same declared operator family or the same /16 network for one circuit), no single relay learns the (client → destination) pairing that would identify what the user is doing.

This is the fundamental anonymity property: divided knowledge. The encryption doesn't make the user anonymous on its own — what makes the user anonymous is that the knowledge of who-is-talking-to-whom is split across multiple non-colluding parties, and the encryption is what enforces that split.

The cryptographic mechanics of layering: when the client wants to send a cell to the exit through a circuit, it encrypts the cell payload with the exit's session key, then with the middle's session key, then with the guard's session key. The cell goes to the guard, which decrypts with its session key (peeling the outer layer), revealing a payload destined for the middle. The guard forwards to the middle, which decrypts with its session key, revealing a payload destined for the exit. The exit decrypts with its session key, revealing the final plaintext (or the next-hop IP for SOCKS-style forwarding to the actual destination).

Each layer uses AES in counter mode with a fresh per-direction key derived during circuit construction. The counter stream starts from a zero IV and runs continuously across cells; this is workable for Tor because each (circuit, direction) pair gets a fresh AES key, so keystream reuse never happens within a key's lifetime.
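
To make the layering concrete, here is a minimal sketch in Python using the pyca/cryptography package — the key names, the zero IV, and the lack of real cell framing are simplifications, not Tor's actual KDF or format:

# Minimal onion-layering sketch; not Tor's real KDF, framing, or digests.
# Requires: pip install cryptography
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def ctr(key: bytes) -> Cipher:
    # A zero IV is safe here only because every key is fresh and used once,
    # mirroring Tor's fresh per-circuit, per-direction session keys.
    return Cipher(algorithms.AES(key), modes.CTR(b"\x00" * 16))

# One session key per hop, as if derived during circuit construction.
k_guard, k_middle, k_exit = (os.urandom(16) for _ in range(3))

payload = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"

# The client wraps: innermost layer for the exit, outermost for the guard.
cell = payload
for key in (k_exit, k_middle, k_guard):
    cell = ctr(key).encryptor().update(cell)

# Each relay peels exactly one layer with its own session key.
for key in (k_guard, k_middle, k_exit):
    cell = ctr(key).decryptor().update(cell)

assert cell == payload  # the exit recovers the plaintext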

There is ongoing work to move relay-cell encryption to authenticated (AEAD) constructions, which would give each cell proper per-cell authentication instead of relying on the truncated running digest described below. The security argument is largely unchanged — only the primitives — but the migration matters operationally because the legacy AES-CTR-plus-running-digest scheme is malleable (the basis of "tagging" attacks) and more brittle against implementation bugs.

Cells and multiplexed streams

Tor's wire-protocol unit is a cell: a fixed-size frame of (originally 512, now 514) bytes carrying either circuit-control commands or relay data. Fixed size is critical for two reasons:

  1. Size-based traffic analysis becomes harder. Variable-length packets leak information about content (a small TCP payload looks like a small Tor cell, a large TCP payload looks like several Tor cells). Fixed-size cells force the adversary observing a relay to count cells rather than infer content size, which is a coarser observation.
  2. Multiplexing is uniform. Multiple TCP streams can share one circuit, with each stream's data chopped into cells and interleaved on the wire. Fixed cell size means scheduling decisions don't depend on payload size — a fairness scheduler at the relay can treat all cells uniformly.

The cell format (Tor 0.4.x):

| circuit ID  |  4 bytes   identifies which circuit this cell belongs to
| command     |  1 byte    what kind of cell: CREATE, RELAY, DESTROY, etc.
| payload     |  509 bytes fixed; variable-length cell types (e.g. VERSIONS) carry an explicit length field instead

The 4-byte circuit ID lets a single TLS connection between two relays carry many circuits in parallel — relays don't open one TLS connection per circuit. This dramatically reduces TCP/TLS overhead at relays that handle thousands of simultaneous circuits.
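
To internalize the framing, it helps to pack a cell by hand. A sketch in Python, assuming link protocol v4's 4-byte circuit IDs (the RELAY command value 3 comes from tor-spec; the payload content is a stand-in):

import struct

CELL_LEN, PAYLOAD_LEN = 514, 509
CMD_RELAY = 3  # RELAY cell command value per tor-spec

def pack_cell(circ_id: int, command: int, payload: bytes) -> bytes:
    # 4-byte circuit ID, 1-byte command, payload zero-padded to 509 bytes.
    assert len(payload) <= PAYLOAD_LEN
    return struct.pack(">IB", circ_id, command) + payload.ljust(PAYLOAD_LEN, b"\x00")

cell = pack_cell(0xC123, CMD_RELAY, b"(already-onion-encrypted relay payload)")
assert len(cell) == CELL_LEN
circ_id, command = struct.unpack(">IB", cell[:5])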

Inside a RELAY cell, the payload is further structured:

| relay command  |  1 byte    BEGIN, DATA, END, EXTEND, CONNECTED, etc.
| recognized     |  2 bytes   set to 0 by sender, used by relays to detect "this is for me"
| stream ID      |  2 bytes   identifies which stream within the circuit
| digest         |  4 bytes   running hash of all cells on this circuit, for integrity
| length         |  2 bytes   payload length within the cell
| data           |  498 bytes payload (or less if length is smaller)

The "recognized" field is a check used by intermediate relays to figure out whether the cell they're decrypting is meant for them or just passing through. After applying their layer of decryption, a relay checks whether the recognized field is 0 and the digest matches the running hash. If both, the cell is for this relay. If not, forward to the next hop. This is how telescoping circuit construction works without each relay having to know the full circuit length in advance.

Stream multiplexing matters operationally. A single Tor circuit can carry tens or hundreds of TCP streams simultaneously — every TCP connection from a Tor browser session is, by default, going through the same circuit. The reasons:

  • Circuit construction is expensive. Building a circuit means three rounds of Diffie-Hellman with three different relays, all of which is per-circuit work. Reusing one circuit across many streams amortizes that cost.
  • Pattern obscuration. If each stream got its own circuit, the relay-level pattern of "user opens 30 TCP connections in five seconds" (typical for loading a webpage with many resources) would be very visible on the network. Multiplexing them onto one circuit collapses that into a single circuit with many cells.
  • Linkability mitigation between unrelated activities. Tor periodically retires circuits — by default a circuit stops accepting new streams roughly 10 minutes after it is first used — so a user's morning browsing session and afternoon browsing session are on different circuits and harder to correlate. If each TCP connection got its own circuit, the rotation guarantee would be moot.

The downside of multiplexing: a slow stream (a long-running download, a chat connection) can hold a circuit open well past its rotation interval, and any future browsing through the same circuit inherits that circuit's history. Tor mitigates with circuit isolation policies — different "stream isolation" settings put high-value streams on separate circuits. For onion services and certain identity-sensitive use cases, the Tor client makes circuit-isolation decisions automatically.
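
Stream isolation is controllable from torrc. A minimal sketch using documented SocksPort isolation flags (the port numbers are arbitrary):

# torrc sketch: two SOCKS listeners with different isolation policies.
SocksPort 9050
# High-value traffic: force a separate circuit per destination host and port.
SocksPort 9150 IsolateDestAddr IsolateDestPort

Applications pointed at port 9150 never share a circuit across destinations; everything on 9050 gets Tor's default isolation behavior.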

Telescoping circuit construction

Building a circuit means establishing a session key with each relay in the path. The naïve approach would be to hand each relay the next-hop information up front: "Hey guard, here's middle's address; once you connect to middle, hand them exit's address." This works but has two problems. First, the guard learns the middle's identity, which is unnecessary if we want to keep guard knowledge minimal. Second, it doesn't give per-hop forward secrecy: if the client's long-term key is compromised later, all those handshakes can be retroactively broken.

The telescoping construction solves both problems. The client builds the circuit one hop at a time:

  1. Client → Guard. The client opens a TLS connection to the guard. Inside that connection, it sends a CREATE2 cell containing an ephemeral Curve25519 public key. The guard responds with CREATED2, containing its own ephemeral pubkey. Both sides derive a session key via the ntor handshake (Curve25519-DH-based, defined in Tor proposal 216). The client now has a session key with the guard.

  2. Client → (through Guard) → Middle. The client constructs a RELAY EXTEND2 cell addressed to the middle, encrypts it with the guard session key, and sends it through the circuit. The guard decrypts the outer layer, sees an EXTEND2 instruction, opens a connection to the middle, sends a CREATE2 to the middle on the client's behalf, gets the CREATED2 back, and forwards it (encrypted with the guard session key) back to the client. The client now has a session key with the middle, derived via the guard but not knowable to the guard (the ntor handshake runs end-to-end between client and middle; the guard forwards only public DH values, which reveal nothing about the derived key).

  3. Client → (through Guard) → (through Middle) → Exit. Repeat: the client sends a RELAY EXTEND2 addressed to the exit, encrypted in two layers (middle's key, then guard's key). The guard peels its layer and forwards to the middle. The middle peels its layer, sees an EXTEND2 instruction, and extends the circuit to the exit. The exit's CREATED2 response comes back through the same path, peeling layers in reverse order. The client now has a session key with the exit.

After three EXTEND2 round trips, the client has three session keys. All future cells on this circuit are encrypted three times (or decrypted three times in the reverse direction).
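
A toy version of the per-hop key agreement, in the style of ntor, using X25519 and HKDF from the pyca/cryptography package. The real handshake also binds the relay's long-term identity key, its onion key, and a protocol string into the KDF input, and produces an authentication tag; this sketch captures only the ephemeral-DH core:

# Toy per-hop handshake (one run per CREATE2/EXTEND2); not real ntor.
# Requires: pip install cryptography
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def kdf(shared_secret: bytes) -> bytes:
    # 32 bytes of key material; real ntor expands into per-direction keys.
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"toy-ntor").derive(shared_secret)

client_eph = X25519PrivateKey.generate()  # pubkey travels in CREATE2/EXTEND2
relay_eph = X25519PrivateKey.generate()   # pubkey returns in CREATED2

client_key = kdf(client_eph.exchange(relay_eph.public_key()))
relay_key = kdf(relay_eph.exchange(client_eph.public_key()))
assert client_key == relay_key  # both ends now hold the same session key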

The properties this gets:

  • Per-hop forward secrecy. Each session key is derived from per-hop ephemeral DH. Compromising the long-term identity keys later doesn't decrypt past circuits.
  • Predecessor/successor knowledge only. Each relay learned the IP of the next hop only when it had to (during the EXTEND2 it processed), and never learns hops further along the path.
  • Identity authentication. The ntor handshake authenticates the relay using its long-term identity key (Ed25519 in modern Tor, RSA-1024 in legacy). The client knows it's talking to the relay specified in the consensus, not an MITM.
  • Reliable circuit aborts. If middle is offline or refuses extension, the EXTEND2 failure propagates back through the guard to the client, and the client picks a different middle and retries — without having committed any state to the failed middle.

Construction takes three round trips minimum (one per hop), and each successive round trip traverses a longer path. Over Tor's typical inter-relay latencies (50-200ms per hop), that adds up to several hundred milliseconds of circuit setup before a single byte of user data can flow. This is why circuits are reused across streams — the construction cost would be unbearable per-stream.

Directory authorities and consensus

Clients need to know which relays exist before they can build circuits. The naive answers — DNS, a tracker server, a P2P discovery layer — all have failure modes that conflict with Tor's threat model. Whatever distributes the relay list must itself be trustworthy, censorship-resistant, and consistent across clients.

Tor's answer is the directory authorities: a small set (nine, at the time of writing) of independently-operated, well-known servers run by trusted members of the Tor community (academics, activists, sysadmins with long-standing reputations). Each authority maintains a list of all relays it has heard from, with their measured properties (bandwidth, uptime, exit policy, geographic location, whether they're a guard candidate, whether they're an exit, etc.).

Once per hour, the authorities run a vote-and-consensus protocol:

  1. Each authority publishes its current view of the relay set.
  2. Each authority downloads the others' views.
  3. They run a deterministic merging algorithm — for each relay flag, a majority vote determines the consensus value; numeric attributes like bandwidth take a median instead. (A toy version of the flag merge is sketched after this list.)
  4. They sign the resulting consensus document with their authority signing keys.
  5. The consensus is published to a network of directory caches.
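
The flag merge mentioned in step 3 is easy to sketch — the real computation covers many more attribute types, medians for numeric values, and tie-break rules:

# Toy consensus flag merge: a relay gets a flag iff a majority of the
# authorities that voted asserted that flag for it.
from collections import Counter

def merge_flags(votes: list[set]) -> set:
    majority = len(votes) // 2 + 1
    counts = Counter(flag for vote in votes for flag in vote)
    return {flag for flag, n in counts.items() if n >= majority}

# Nine authorities vote on one relay:
votes = [{"Fast", "Running", "Guard"}] * 5 + [{"Fast", "Running"}] * 4
assert merge_flags(votes) == {"Fast", "Running", "Guard"}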

Clients fetch the consensus from any directory cache (not directly from authorities, to spread load). Any client can verify the consensus signature against a small set of authority identity public keys baked into the Tor binary at compile time. As long as a majority of authorities are honest, the consensus is trustworthy.

This design has interesting properties:

  • No single point of failure for the network view. A majority of authorities must collude to produce a forged consensus. In practice the authority operators are diverse jurisdictionally and don't have a common employer.
  • Censoring the consensus is hard. Directory caches are everywhere; bridges allow clients to fetch the consensus through obfuscated channels even if direct cache access is blocked.
  • Network state is verifiable, not just trusted. Clients verify signatures, not just trust whatever they receive.
  • Authorities are an attractive target. Compromising 5 of 9 authorities would let an attacker forge a consensus listing only attacker-controlled relays. This is why authority operators are vetted carefully and authority signing keys are kept on dedicated hardware.

The consensus contains, for each relay: identity fingerprint, IP and port, supported protocols, advertised bandwidth, measured bandwidth (different from advertised, computed by the bandwidth-measurement subsystem to detect lying relays), assigned flags (Guard, Exit, BadExit, HSDir, V2Dir, Authority, etc.), and exit policy. Clients use this information to make path selection decisions: pick a guard with the Guard flag, pick a middle that isn't in the same /16 or operator family as the guard, pick an exit whose exit policy permits the destination port, etc.
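
Those constraints are straightforward to express in code. A sketch over a made-up relay record — real Tor additionally weights every choice by measured bandwidth and applies many more exclusion rules:

# Toy path selection: filter by flags, family, /16, and exit policy.
import random
from dataclasses import dataclass

@dataclass
class Relay:
    nickname: str
    ip: str
    flags: set
    family: str
    exit_ports: set  # ports this relay's exit policy accepts

def slash16(ip: str) -> str:
    return ".".join(ip.split(".")[:2])

def pick_path(relays, dest_port):
    guard = random.choice([r for r in relays if "Guard" in r.flags])
    middle = random.choice([r for r in relays if r is not guard
                            and r.family != guard.family
                            and slash16(r.ip) != slash16(guard.ip)])
    exit_ = random.choice([r for r in relays if "Exit" in r.flags
                           and dest_port in r.exit_ports
                           and r not in (guard, middle)
                           and r.family not in (guard.family, middle.family)])
    return guard, middle, exit_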

The consensus changes slowly. Most relays appear in many consecutive consensuses; new relays appear gradually as authorities measure them; relays that go offline drop out after a few consensus periods of unreachability. Clients keep their copy of the consensus for as long as it remains valid (a few hours), then refetch.

Guards, middles, exits

The three relay positions in a circuit — guard, middle, exit — have different roles and different operational realities.

Guards are the relays clients connect to first. The guard is the only relay that learns the client's real IP. This makes the guard the highest-risk position from a deanonymization perspective: an attacker who runs the guard sees the client. To limit the damage from running guards, Tor uses guard rotation discipline:

  • Each client picks a small set of guards (historically three; modern Tor effectively routes through a single primary guard, keeping the others as backups) and uses only those for an extended period (months). This is counter-intuitive — you might think rotating guards frequently would be safer.
  • The reason: if you rotate guards constantly, eventually you'll randomly pick an attacker-controlled guard. With sticky guards, either your guards are not attacker-controlled (and you're safe long-term) or your guards are attacker-controlled (and you're consistently exposed, but your exposure doesn't accumulate over time).
  • The math is lottery-style probability: if 1% of guard-flagged relays are malicious and you rotate every day, after a year you've used roughly 365 guards and the chance you've used at least one bad one is very high. With sticky guards, the chance is ~3% (three guards from a 1%-bad pool). The sketch after this list works out the arithmetic.
  • Guards earn the Guard flag by being long-running, well-connected, and high-bandwidth. Bad-faith operators have to maintain a relay for months before earning it.
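
The guard arithmetic, worked out:

# P(at least one malicious guard) = 1 - (1 - p)^n, for malicious fraction p.
p = 0.01  # assume 1% of guard-flagged capacity is attacker-controlled

rotate_daily = 1 - (1 - p) ** 365  # a new guard every day for a year
sticky = 1 - (1 - p) ** 3          # three long-lived guards

print(f"rotate daily for a year: {rotate_daily:.1%}")  # ~97.4%
print(f"three sticky guards:     {sticky:.1%}")        # ~3.0%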

Middles are the relays in the middle position. Middles don't learn the client's IP (they only see the guard) and don't learn the destination (they only see the exit). Middles have the easiest job from a privacy-leakage standpoint — what they observe is "circuits flowing through me," with no usable identifying information about either end. The middle position is the easiest one for "I want to help Tor without taking on legal exposure" relay operators.

Exits are the relays that actually send traffic out to the public internet. The exit sees the destination IP and the cleartext content of any non-end-to-end-encrypted connection (unencrypted HTTP, plaintext email protocols, etc.). HTTPS protects the content from the exit, but DNS lookups and TLS SNI fields still leak destination information.

Operating an exit relay carries real legal exposure. The exit's IP is what destination services see; if a Tor user does something illegal, the exit's IP is what shows up in logs. Most exit operators field abuse complaints regularly; some get raided occasionally; the legal protections (essentially: "I'm running a relay, not the endpoint") are real but require legal sophistication to invoke. As a result, there are far fewer exits than guards or middles in the network, and the exit set is geographically and jurisdictionally concentrated in countries with lenient laws on intermediary liability (Germany, Netherlands, Sweden, parts of the US, etc.).

The exit's exit policy is also part of the consensus. An exit can declare it permits HTTP and HTTPS only (typical for "reduced exit" relays that limit abuse exposure), or all common ports, or essentially anything. Clients choose exits whose policy permits the user's destination port — picking a wrong-policy exit and getting an "exit policy violation" cell wastes a circuit.
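
Exit policies are ordered accept/reject rules where the first match wins. A simplified matcher — real policies also match IP addresses and port ranges, which this toy ignores:

# Simplified exit-policy check: first matching rule decides.
def exit_allows(policy: str, port: int) -> bool:
    for rule in policy.split(","):
        verb, _, pattern = rule.strip().partition(" ")
        _addr, _, ports = pattern.partition(":")
        if ports == "*" or port == int(ports):
            return verb == "accept"
    return False  # default-reject if no rule matched

reduced_exit = "accept *:80, accept *:443, reject *:*"
assert exit_allows(reduced_exit, 443) is True
assert exit_allows(reduced_exit, 25) is False  # SMTP refused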

Onion services preview

The discussion above assumes the destination is on the public internet — Tor is acting as a transport from the client through the network to a normal IP destination. Tor also supports onion services (formerly "hidden services"), where the destination itself is on the Tor network and never reveals an IP address.

The architecture is more elaborate. An onion service:

  1. Generates a long-term identity keypair. Its onion address (xxx.onion) is derived from the public key.
  2. Selects a small set of introduction points — Tor relays at which the service will accept incoming connection requests. The service builds Tor circuits to each introduction point and tells them "I'm reachable through this circuit; if anyone wants to talk to me, send the request through here."
  3. Publishes a service descriptor to the HSDir subset of relays (those with the HSDir flag). The descriptor lists the introduction points and contains a public key for the service. The descriptor is itself anonymized — the service publishes through a Tor circuit, so the HSDirs don't learn the service's IP.

A client wanting to connect to xxx.onion:

  1. Fetches the descriptor for xxx.onion from the relevant HSDirs (the responsible HSDirs are chosen by hashing the onion address against the consensus).
  2. Picks a rendezvous point (any Tor relay) and builds a circuit to it.
  3. Sends an INTRODUCE1 cell to one of the service's introduction points (through a Tor circuit), telling the introduction point "the rendezvous is at <relay>, please tell the service to come meet me there. Here's the secret cookie they need to use."
  4. The introduction point relays this to the service over the service's pre-existing circuit to the introduction point.
  5. The service builds a Tor circuit to the rendezvous point and sends the cookie. The rendezvous point matches the cookie against the client's circuit and joins them.
  6. Client and service now have a shared circuit through the rendezvous point. The total path is: client → 3 client-side hops → rendezvous → 3 service-side hops → service. Six hops total.

The properties:

  • The service's IP is never revealed. The service only ever connects out to Tor relays, never accepts incoming connections from the public internet.
  • The client's IP is not revealed to the service. Standard Tor anonymity for the client side.
  • Connections are end-to-end encrypted to the service. No exit is involved (six hops, no exit position) — the rendezvous point doesn't peel the application-layer encryption.
  • Identity is a public key, not an IP. Onion addresses are derived from public keys; v3 onion addresses (the current standard) embed a 256-bit Ed25519 public key directly.

The onion service v3 protocol (specified at length in rend-spec-v3.txt) is one of Tor's most architecturally interesting subsystems. It uses Ed25519 keys, blinded subkey derivation for forward-secret descriptors, and a hash-ring-based HSDir selection algorithm to make censoring or enumerating onion services difficult.
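
The address derivation itself is compact enough to show in full, following rend-spec-v3: the 56-character address is the base32 encoding of the 32-byte Ed25519 public key, a 2-byte checksum, and a version byte.

# v3 onion address derivation per rend-spec-v3:
#   address = base32(pubkey || checksum[:2] || version) + ".onion"
#   checksum = SHA3-256(".onion checksum" || pubkey || version)
import base64
import hashlib
import os

def onion_v3_address(ed25519_pubkey: bytes) -> str:
    version = b"\x03"
    checksum = hashlib.sha3_256(
        b".onion checksum" + ed25519_pubkey + version).digest()[:2]
    body = base64.b32encode(ed25519_pubkey + checksum + version)
    return body.decode().lower() + ".onion"

# A random 32-byte stand-in for a real Ed25519 public key:
print(onion_v3_address(os.urandom(32)))  # 56 base32 chars + ".onion"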

Why Tor stays low latency

Tor is a low-latency anonymity system, in contrast to high-latency mix networks. The distinction matters:

A traditional mix network (Mixmaster, Mixminion) achieves very strong anonymity by batching messages, randomly delaying them, padding them to constant size, and reordering them at each mix node. Anonymity comes from the fact that an observer of any one mix node sees a batch of messages go in and a batch of messages come out, with no usable timing or volume correlation between input and output. The cost: latencies of minutes to hours per message. Mix networks are appropriate for email, not for browsing.

Tor consciously chose to be low-latency to support interactive use — web browsing, IRC, SSH, remote desktop. The cost: by not batching, padding, or delaying, Tor leaks substantial timing and volume information. An adversary who can observe both the entry guard and the exit (or the entry guard and the destination, or any two points where a circuit is visible) can correlate timing and volume patterns to link the two — even though the cells in between are encrypted.

This is end-to-end correlation, and it is Tor's fundamental anonymity limit. The Tor design paper acknowledges this directly: Tor does not protect against a global passive adversary capable of observing both ends of a circuit simultaneously. Such an adversary doesn't need to break any cryptography to deanonymize users — they correlate timing and volume patterns and the math just works out.
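
To see why no cryptanalysis is needed, consider a toy correlation: sample per-second cell counts at the guard and at the exit side, then compare the two series. Real attacks use far stronger statistics and handle jitter; this sketch only shows the principle:

# Toy end-to-end correlation on per-second cell counts. The burst
# pattern survives the layered encryption between the two vantage points.
import random
from statistics import correlation  # Python 3.10+

random.seed(7)
user_bursts = [random.choice([0, 0, 1, 5, 20]) for _ in range(60)]
# Same traffic seen at the exit one second later, with measurement noise:
exit_view = [0] + [max(0, c + random.randint(-1, 1)) for c in user_bursts[:-1]]
unrelated = [random.choice([0, 0, 1, 5, 20]) for _ in range(60)]

print(correlation(user_bursts[:-1], exit_view[1:]))  # high: same circuit
print(correlation(user_bursts, unrelated))           # near zero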

The argument for accepting this trade is utilitarian. Mix networks with strong anonymity are unusable for browsing; nobody uses them; the anonymity is theoretical. Tor with weaker anonymity is usable; millions of people use it; the herd effect itself contributes to anonymity (any one user's traffic is mixed with everyone else's). A practical low-latency system that gets used is more anonymizing in the real world than a strong high-latency system that doesn't.

There has been ongoing research and limited deployment of additional padding and timing defenses in Tor — circuit-level padding to mask burst patterns, "padding negotiation" to defend against website fingerprinting attacks, longer circuits and "vanguards" for onion services to defend against guard discovery attacks. These help against specific attacks but don't change the fundamental low-latency limit. Anyone who tells you Tor is bulletproof against a global adversary is misrepresenting the design.

What Tor leaks anyway

The full leakage list is long; here are the categories that matter most for thinking about what Tor protects against:

Application-layer metadata. If you load an unencrypted HTTP page over Tor, the exit sees the URL, headers, content. Cookies, login credentials, anything in the request — all visible to the exit. The mitigation is HTTPS — the exit then only sees the destination IP and the SNI.

DNS. A misconfigured client that resolves DNS locally before connecting through Tor leaks the destination to the local DNS resolver and the network in between. Tor Browser handles this correctly (DNS happens at the exit). Custom Tor configurations using SOCKS need to use SOCKS5h, not SOCKS5 — the h means "hostname resolution at the proxy."
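
The difference shows up directly in client configuration. A sketch using the requests library (needs the PySocks extra; assumes a local Tor SocksPort listening on 9050):

# socks5h:// hands the hostname to Tor so resolution happens at the exit;
# socks5:// would resolve locally first and leak the lookup to your resolver.
# Requires: pip install requests[socks]
import requests

proxies = {
    "http": "socks5h://127.0.0.1:9050",
    "https": "socks5h://127.0.0.1:9050",
}
r = requests.get("https://check.torproject.org/api/ip", proxies=proxies)
print(r.json())  # e.g. {"IsTor": true, "IP": "<exit relay IP>"}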

TLS SNI. Even with HTTPS, the SNI field in the ClientHello is plaintext (until Encrypted Client Hello is widely deployed). The exit sees www.example.com even if the URL after the connection is encrypted. Network observers between exit and destination see the same.

Timing patterns. Loading a specific webpage produces a specific pattern of cell counts, request bursts, and response timings. Website fingerprinting attacks classify circuits by which website they're visiting based on these patterns alone, without any access to circuit content. Defenses include circuit-level padding and timing perturbation; these help but don't eliminate the attack.

Volume. A circuit carrying a lot of data and a circuit carrying a little are visibly different. Long-running connections (downloads, video streams) are particularly identifiable.

Exit IP visibility to the destination. The destination sees the exit's IP. Some destinations block all known Tor exits (the consensus is public, so the exit list is enumerable). Some allow Tor but show a CAPTCHA. Some treat Tor traffic differently (different anti-fraud rules, different content). This isn't a leakage of identity per se, but it is a usability and behavioral consequence.

Browser fingerprinting. Tor Browser tries hard to make every Tor user's browser fingerprint identical (same User-Agent, same screen resolution, no fonts beyond a standard set, no JavaScript timing precision), but application-layer fingerprinting is an arms race. See browser-fingerprint-hardening for the broader fingerprinting threat model.

Time-of-day patterns. A user who only connects to Tor during their workday is identifiable as "someone in a particular timezone with this work schedule" by an observer of the guard. Long-term observation of the guard reveals usage patterns.

Linked stream metadata. Even with circuit isolation, application-layer linkage (logging into the same account from different circuits) defeats Tor's anonymity by tying the circuits to the same identity at the application layer. Tor cannot protect against the user behaving in identifiable ways above the transport.

The summary: Tor protects you from the network in between learning what you're doing. It does not protect you from your own application behavior, from end-to-end correlation, from a sufficiently global adversary, or from destinations that deanonymize through application-layer means. Tor is a transport for anonymity engineering, not an anonymity proof by itself. The deeper threat-model material lives in Track 4 — see threat-models-for-network-anonymity (coming soon) when those modules go up.

Hands-on exercise

Inspect a Tor circuit at the conceptual level.

Tools: Tor Browser (download from torproject.org). Runtime: 10 minutes.

Open Tor Browser. Visit a normal HTTPS site (e.g., https://check.torproject.org). Click the lock icon or shield icon and find the "Circuit for this site" view. You should see three relays listed: guard, middle, exit. Then:

  • Note the country code or IP for each. Are any two relays in the same country? (Tor tries to avoid this; sometimes it can't.)
  • Open a different site in a new tab. Is the circuit the same? (Likely not — Tor Browser isolates circuits per first-party site, so a different site typically gets its own circuit.)
  • Click "New Tor circuit for this site" — what changes? Which relays change?
  • For each relay in the circuit, ask yourself: what does this relay see? What can it not see? Specifically:
    • The guard: client IP (yes), destination (no), application content (no, it's encrypted three layers deep)
    • The middle: client IP (no), destination (no), application content (no)
    • The exit: client IP (no), destination (yes), application content (yes if HTTP, no if HTTPS)

Stretch: visit an .onion site (e.g., https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/) and compare the circuit display. Onion service circuits look different — the path is six hops total (client → 3 hops → rendezvous → 3 hops → service), and there's no exit relay in the conventional sense.

Map cells to streams conceptually.

Tools: text/diagram analysis. Runtime: 10 minutes.

Sketch on paper or in text:

Circuit C123 between client and exit, with 3 active TCP streams:

Stream S1 (HTTP request, 450 bytes):
  -> 1 RELAY DATA cell (498 bytes payload, mostly used)

Stream S2 (HTTP request, 60 bytes):
  -> 1 RELAY DATA cell (498 bytes payload, mostly padding)

Stream S3 (long download, 100 KB):
  -> ~ 200 RELAY DATA cells

On the wire between client and guard, all of these cells are
interleaved, each labeled with circuit ID C123 and a stream ID
(S1, S2, or S3) inside the encrypted relay payload.

The guard sees:
  - circuit ID C123 (yes, in cleartext header)
  - stream IDs (no, those are inside the encrypted payload)
  - which TCP streams are in flight (no, can only count cells)

The middle sees:
  - circuit ID C123
  - the cells, but two layers of encryption deeper than the guard

Ask yourself: if a relay in the path can count cells but can't see stream IDs, what observations can it make about the user's activity? (Burst patterns, cell volume, timing — enough for some traffic-analysis attacks, not enough for direct content recovery.)

Common misconceptions and traps

"Tor is just a VPN with three hops." As argued at length above, no. Tor's trust model, routing model, identity model, and adversary model are all different from a VPN's. The "three hops" framing misses the point of trust distribution and the structural anonymity argument.

"More hops always means more anonymity." Tor's path length of three is a deliberate design choice, not a default that should be increased. Adding more hops adds latency and provides marginal additional anonymity benefit — three hops is enough to ensure no single relay knows both ends. Six-hop paths (which onion services effectively use) are required by the rendezvous architecture, not chosen for extra anonymity. Most attacks against Tor that succeed are not defeated by adding more hops.

"The exit node sees nothing because traffic is onion-encrypted." The exit sees the cleartext of whatever the application protocol leaks. If you're loading HTTP, the exit sees the full content. If you're loading HTTPS, the exit sees the destination IP and SNI but not the encrypted content. Onion encryption protects content from the network between client and exit; it does not protect content from the exit itself.

"Tor Browser discipline is optional because Tor handles everything." Tor handles transport. Application behavior is your responsibility. Logging into a personally-identifiable account through Tor links the Tor circuit to your identity at the application layer. Downloading and opening files outside the Tor Browser sandbox can reveal your real IP through DNS resolution. Running JavaScript-heavy sites can leak fingerprintable information. Tor Browser exists specifically to bundle the transport with sane application defaults; bypass any of those defaults and you can lose anonymity at the application layer.

"Low-latency anonymity means traffic analysis is solved." It absolutely doesn't. Tor explicitly trades anonymity strength against usability. End-to-end correlation, website fingerprinting, and timing-based attacks are real and known. The design paper acknowledges them. Tor's anonymity argument is "useful in practice for many threat models" not "mathematically unbreakable."

"Tor relays are anonymous." The relays themselves are public. Their IPs, fingerprints, bandwidth, and uptime are in the consensus. What's anonymous is the users, not the infrastructure. (Onion services attempt to anonymize the service infrastructure too, via the rendezvous architecture.)

"You can't run a website behind Tor." Onion services let you do exactly that. The service runs on a normal machine but only accepts connections through the Tor network via introduction and rendezvous points. The IP of the host machine is never revealed (assuming correct configuration; misconfigured services have leaked their real IPs through verbose error pages, side-channel timing, or co-located services on the same IP).

"Bridges are just secret relays." Bridges are unlisted relays whose existence isn't in the public consensus. Their purpose is to help users in censored networks reach Tor — censors can't easily block what they can't enumerate. Bridges plus pluggable transports (obfs4, meek, snowflake) provide both unlisted endpoints and obfuscated transport, defeating both static-blocklist censorship and basic protocol-fingerprinting censorship. Bridges are not a different security tier; circuits built through bridges have the same three-hop structure as any other circuit.

"Tor is illegal." Tor is legal in most jurisdictions including all democracies. Operating Tor relays is legal in essentially all countries; running an exit can attract legal scrutiny because of the abuse-source association, but is not itself illegal. Use of Tor is restricted or punished in some authoritarian states (China, Iran, Russia in some senses), where the issue is censorship and political control, not anonymity per se.

Wrapping up

Tor is a TCP-stream anonymity overlay built from fixed-size cells multiplexed over telescoping circuits, with relay roles separated to enforce divided knowledge across guard/middle/exit positions, with a directory-authority consensus that makes the network state verifiable, and with a deliberate low-latency stance that accepts known traffic-analysis attacks in exchange for being usable for interactive applications. The whole design is shaped by the trust-distribution thesis: no single relay should learn both who you are and what you do.

Understanding Tor at this architectural level is what lets you reason about which threat models it actually addresses, which application behaviors it can't protect against, and where it sits relative to other tools in the privacy stack. The next module (sing-box-and-xray-architecture — coming soon) leaves the anonymity-overlay model and looks at modern censorship-evasion stacks, where the goal is not anonymity but reachability — getting traffic through hostile networks that recognize and block standard VPN protocols.
