Evasion · Part 1 of 7 · Anonymity Engineering · 13 min read · advanced

Pluggable transports: the obfs lineage

obfs4, meek, Snowflake, and the history of transport-layer evasive design as adversaries moved from passive filtering to active probing.

The censorship-evasion arms race has played out most visibly in the Tor ecosystem's pluggable-transport project. Tor itself is recognizable on the wire — its fixed cell sizes, its specific TLS configuration, its known relay IPs — and a censor who blocks Tor traffic can do so with reasonable accuracy. The pluggable-transport (PT) framework, defined in Tor's pluggable-transport specification, lets the Tor client wrap its traffic in different obfuscation protocols that try to look like something other than Tor.

This module walks the history of pluggable transports — obfs2, obfs3, obfs4, meek, Snowflake — as a designed sequence of responses to evolving adversary capabilities. Each transport addresses a specific threat: passive identification at first, then statistical fingerprinting, then active probing. The arc shows what happens when a censor and circumventor co-evolve over a decade of pressure.

This is the opening module of Track 6. The thesis: there is no permanent solution; pluggable transports are a sequence of moves in a game whose terms keep shifting. Understanding the lineage is what lets you reason about why current transports look the way they do and predict what comes next.

Learning objectives

  1. Explain what pluggable transports do — wrap Tor traffic in different observable shapes — and what they don't change about Tor's anonymity properties.
  2. Distinguish the obfs lineage (obfs2, obfs3, obfs4) by the specific threats each addressed.
  3. Explain meek's domain-fronting model and why it depended on cooperative collateral-damage infrastructure.
  4. Describe Snowflake's WebRTC-volunteer-proxy model and the operational tradeoffs it accepts.

What pluggable transports do (and don't do)

A pluggable transport sits between the Tor client and the bridge. The client speaks Tor protocol normally; the PT serializes Tor cells into something else (random-looking bytes, HTTP requests, WebRTC packets); the bridge runs the PT in reverse to recover Tor cells.

The PT changes:

  • The observable wire format (sizes, framing, content patterns).
  • The endpoint structure (specific bridge IPs vs. CDN domains vs. WebRTC-discovered peers).
  • The detection profile against passive classification and active probing.

The PT does not change:

  • Tor's three-hop path structure.
  • Tor's anonymity properties (which depend on the relay topology, not the entry transport).
  • The destination's view of traffic (if the user visits HTTPS sites, the destination still sees Tor exit IPs).

So a PT is a transport-layer wrapper, not a different anonymity system. The threat model is "circumvent transport-level censorship and detection of Tor." The anonymity-from-relays story is unchanged.
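On the client side, selecting a transport is a torrc configuration choice. A minimal obfs4 client setup looks like this; the bridge line's address, fingerprint, and cert value are placeholders for what BridgeDB hands you:

```
UseBridges 1
ClientTransportPlugin obfs4 exec /usr/bin/obfs4proxy
Bridge obfs4 192.0.2.10:443 <FINGERPRINT> cert=<CERT> iat-mode=0
```

Tor launches the obfs4proxy binary, hands it the bridge details, and speaks ordinary Tor protocol through it; the PT handles everything on the wire.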

obfs2 (2012): the first attempt

obfs2 was the first deployed pluggable transport. The design: derive an encryption key from seed values the endpoints exchange in the clear at the start of the connection, then send the encrypted Tor traffic over a TCP connection. The intent: defeat passive identification of Tor by making the byte content look random rather than Tor-shaped. Because that key material crossed the wire unprotected, a passive observer could derive the same keys.

The result: obfs2 was distinguishable through statistical analysis. The encryption produced uniformly random bytes; legitimate protocols don't have uniform random byte distributions. A simple entropy test on the first packet of a connection could distinguish obfs2 from normal traffic.
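The entropy test a censor can run is simple enough to sketch. A minimal version, using Shannon entropy over byte frequencies (the thresholds a real censor would use are not public; this is illustrative):

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Bits per byte; 8.0 is the uniform-random maximum."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A plaintext protocol banner has low per-byte entropy;
# obfs2's encrypted output sits near the 8-bit maximum.
banner = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>"
random_like = os.urandom(1024)

print(round(shannon_entropy(banner), 2))       # low: ordinary protocol text
print(round(shannon_entropy(random_like), 2))  # near 8: suspiciously random
```

A censor doesn't need to decrypt anything: a first packet whose entropy is near the maximum, on a connection that matches no known protocol, is already a strong filter.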

obfs2 also had no active-probe resistance. A censor who suspected an obfs2 bridge could send the obfs2 client preface; if the bridge accepted it, the bridge was confirmed.

obfs2 was deprecated quickly. It taught the design community: random bytes alone aren't sufficient — they're a recognizable pattern in their own right.

obfs3 (2013): better randomness, same structural issue

obfs3 used a Diffie-Hellman variant (UniformDH) to derive a per-session key, so a passive observer could no longer recover the keys from the handshake as with obfs2. The byte distribution was still uniformly random, just keyed per session.

The structural issue persisted: uniform-random bytes are themselves identifiable as not-normal-protocol. obfs3 stood out from legitimate traffic just as obfs2 had.

The Great Firewall began blocking obfs3 within months of deployment. The lesson: per-session secrets don't fix the "looks suspiciously random" problem; the byte distribution itself is the signature.

obfs4 (2014): the active-probe-resistant generation

obfs4 was the design that addressed both passive and active threats. Key elements:

Per-bridge secret key. Each bridge generates a public key and a node ID; the bridge advertises these out of band (in BridgeDB or via custom distribution channels). A client that doesn't have the secret can't construct a valid handshake. This addresses active probing — a censor who tries connecting to a suspected obfs4 bridge without the secret gets nothing useful back.
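The gating idea fits in a few lines. This is a toy HMAC gate, not obfs4's actual ntor-based handshake; it only illustrates why a probe without the out-of-band secret learns nothing:

```python
import hmac
import hashlib
import os

def client_hello(bridge_secret: bytes) -> bytes:
    """Client proves knowledge of the out-of-band bridge secret.
    Toy stand-in for obfs4's ntor handshake, not the real protocol."""
    nonce = os.urandom(16)
    tag = hmac.new(bridge_secret, nonce, hashlib.sha256).digest()
    return nonce + tag

def bridge_accepts(bridge_secret: bytes, hello: bytes) -> bool:
    """Without a valid tag the bridge stays silent, so to a prober
    it behaves like any unresponsive service."""
    nonce, tag = hello[:16], hello[16:]
    expected = hmac.new(bridge_secret, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

secret = os.urandom(32)
assert bridge_accepts(secret, client_hello(secret))   # real client gets in
assert not bridge_accepts(secret, os.urandom(48))     # censor's probe fails
```

The essential property: the accept/reject decision depends on a secret that never appears on the wire, so probing cannot manufacture a confirming response.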

Uniform randomness without tells. obfs4's output is still uniformly random, but every deterministic marker is gone: the handshake's public keys are encoded with Elligator2 so even they are indistinguishable from random bytes, and packet lengths are drawn from a per-bridge distribution rather than echoing Tor's cell sizes. Statistical tests that flagged obfs2/obfs3 produce ambiguous results on obfs4, leaving the censor with only the weak signal "high-entropy traffic," which matches too much legitimate encrypted traffic to block outright.

IAT (inter-arrival time) randomization. obfs4 can introduce random delays to its traffic to defeat timing-based fingerprinting. Optional but supported.

TCP-only. obfs4 runs over TCP and presents itself as unidentified encrypted traffic. It doesn't try to look like a specific protocol; it tries to look like generic encrypted traffic that isn't obviously suspicious.

obfs4 has been the workhorse Tor pluggable transport for over a decade. It's still deployed; it still defeats the simpler censorship deployments. Sophisticated censors (China's GFW especially) have developed behavioral classifiers that flag obfs4 bridges through patterns other than the byte distribution — port usage, lack of co-hosted real services, predictable responses to specific probes. The arms race continues; obfs4 remains useful but no longer perfect.

meek (2014+): domain fronting with collateral damage

meek took a different approach: rather than trying to look like generic encrypted traffic, look exactly like HTTPS traffic to a major cloud service.

The design:

  • The Tor client opens an HTTPS connection to a major CDN; the TLS SNI names an innocuous, high-value domain on that CDN, and this is all the local network sees.
  • Inside the encrypted HTTPS body, the request's Host: header names the meek reflector (e.g., meek.azureedge.net) hosted on the same CDN.
  • The CDN routes on the inner Host: header (this was the domain-fronting trick, covered in the next module).
  • The reflector forwards the request to a Tor bridge and relays the reply; responses come back through the same path.
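The core of the trick is simply that the name in the TLS ClientHello and the name in the HTTP Host: header differ. A sketch of the request construction (domain names are illustrative placeholders; this builds the bytes, it doesn't perform the TLS connection):

```python
def build_fronted_request(front_domain: str, reflector_host: str,
                          body: bytes) -> tuple[str, bytes]:
    """Return (sni, raw_request). The censor sees only `sni` in the
    TLS ClientHello; the Host: header travels inside the encrypted
    record. Domain names here are hypothetical placeholders."""
    request = (
        f"POST / HTTP/1.1\r\n"
        f"Host: {reflector_host}\r\n"          # routes inside the CDN
        f"Content-Length: {len(body)}\r\n"
        f"Connection: keep-alive\r\n"
        f"\r\n"
    ).encode() + body
    return front_domain, request

sni, req = build_fronted_request(
    "cdn.example.com", "meek-reflector.example.net", b"<tor cells>")
```

In a real client the `sni` value would be passed as `server_hostname` to the TLS layer while `req` goes inside the encrypted channel; the mismatch is invisible to the network.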

The strategic point: blocking meek required blocking the entire CDN domain (e.g., azureedge.net). The collateral damage of blocking a major CDN was too high for most censors; meek persisted as a circumvention option for years.

The fall: starting in 2018, major CDN providers (AWS, Google Cloud, Azure) changed their policies to disallow domain fronting. The technical mechanism was disabled. meek persists in narrow forms but isn't the high-volume circumvention transport it once was. The full story is in domain-fronting-the-rise-fall-remnant (the next module).

Snowflake (2018+): WebRTC volunteer proxies

Snowflake takes the volunteer-distributed approach. The design:

Volunteer proxies. Anyone running a Snowflake browser extension or website snippet contributes a temporary proxy. Volunteers are typically users running Tor Browser; their browsers act as one-shot bridges for other users behind censorship.

WebRTC transport. The Tor client connects to a volunteer proxy via WebRTC, the same protocol used by Google Meet and similar video-call services. WebRTC traffic is hard to block without breaking video calling.

Domain-fronted broker. The client uses a broker server to find available volunteer proxies. The broker connection itself uses domain fronting (or a similar masking technique) to be reachable from censored networks.

No fixed bridges. Unlike obfs4 where each bridge is a stable IP that can be discovered and blocked, Snowflake's "bridges" are volunteer browsers that come and go. There's no enumerable list of all bridges; new volunteers appear continuously.

The strategic point: Snowflake makes bridge enumeration much harder. Even if a censor blocks individual proxies they discover, new ones replace them faster than the censor can block.

The operational tradeoffs:

  • WebRTC is heavy; Snowflake has higher overhead than obfs4.
  • Volunteer churn creates connection instability; users may need to reconnect frequently.
  • The broker is a single point of attack (though domain-fronting makes it harder to block).
  • Volunteer proxies have variable bandwidth and reliability.

Snowflake has become one of Tor's most-used pluggable transports in highly censored regions where obfs4 and meek are blocked.

Why active probing pushed designs

The pivot from obfs2/obfs3 to obfs4 was specifically driven by active probing. A censor who suspected an obfs2 or obfs3 bridge could probe it: connect, send the protocol's known opening bytes, see if the bridge responded. If yes, confirmed bridge → block. The probabilistic uncertainty of "this looks suspicious" became deterministic "this is a bridge."

obfs4's secret-gated handshake removed the probing vector. Without the per-bridge secret, a probe gets no useful response. The censor's confidence stops at "this looks suspicious" and they have to decide whether to block on suspicion (with the cost of false positives on legitimate traffic).

This was the lesson generalized to modern designs (REALITY, naïveproxy, HTTPT): probe resistance requires that a server's behavior toward clients who lack the secret be indistinguishable from that of some innocuous service. The byte-pattern game is necessary but not sufficient; the state-machine game matters more.

What makes meek strategically different

meek's contribution wasn't byte-pattern obfuscation; it was infrastructure economics. The technical mechanism (domain fronting) wasn't novel — it was just a clever exploitation of how cloud-CDN routing worked. The strategic insight was that cloud CDNs were too valuable to block, so blocking the protocol required blocking the cloud, which censors wouldn't do.

For five years (2014-2019), meek was effective for exactly this reason. CDN providers had no incentive to disable fronting because they didn't want to be involved in the censorship debate at all. Censors had no good way to block meek without unacceptable collateral damage.

Then provider policies changed (2018-2019), and the strategic value of meek collapsed. The lesson: strategic-mechanism transports depend on third-party policies that can change. obfs4-style transports depend only on cryptography and adversary capabilities; they degrade gradually. meek-style transports depend on policy and can fail abruptly.

What remains hard

Even with the obfs4 + Snowflake combination, several adversary capabilities still create problems:

Bridge enumeration. obfs4's secret-gated handshake blunts direct probe confirmation, but a censor can still harvest bridge addresses through the same distribution channels users rely on (requesting bridges from BridgeDB at scale, collecting shared bridge lines), and a server that silently absorbs arbitrary bytes without ever responding is itself a behavioral oddity worth flagging. Over time, the censor accumulates a bridge list.

Behavioral classification. Even if individual handshakes are indistinguishable, behavioral patterns (long-lived TLS connections, specific traffic-volume shapes) can flag suspected obfs4 bridges with statistical confidence.
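What such a behavioral classifier might look at can be sketched as a feature checklist; the features and thresholds here are illustrative guesses, not any censor's actual model:

```python
def suspicion_score(duration_s: float, bytes_up: int, bytes_down: int,
                    port: int) -> int:
    """Toy behavioral classifier: count how many flow-level features
    deviate from typical HTTPS browsing. Features and thresholds are
    hypothetical, for illustration only."""
    score = 0
    if duration_s > 3600:                        # web flows are rarely hour-long
        score += 1
    if port == 443 and bytes_up > bytes_down:    # browsing is download-heavy
        score += 1
    if min(bytes_up, bytes_down) > 10_000_000:   # sustained bulk both ways
        score += 1
    return score
```

No single feature is conclusive, which is the point: the censor aggregates weak signals over time and blocks once confidence crosses a policy threshold.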

Capacity bottlenecks. Volunteer-distributed transports like Snowflake depend on volunteer availability. In some regions, the volunteer pool is small and can be saturated.

Adversary patience. Censors have time. A user who needs Tor access today is in a tight spot; the censor with multi-year horizons can systematically chip away at bridge availability.

The current design responses, which later Track 6 modules cover, each address a specific gap in the obfs lineage; none is a complete solution.

Hands-on exercise

PT layering pseudocode.

Tools: notes. Runtime: 10 minutes.

Sketch the Tor client + obfs4 layering:

[Tor application logic] → produces Tor cells (514 bytes each)
   ↓
[obfs4 client]
   - takes Tor cells
   - applies obfs4 framing + padding + per-session encryption
   - emits bytes over TCP socket
   ↓
[TCP socket to bridge]
   - bridge IP and port
   - bytes look like generic encrypted traffic
   ↓
[bridge runs obfs4 server in reverse]
   - recovers Tor cells
   - forwards to Tor relay
   ↓
[Tor relay accepts cells like any other Tor connection]
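The same layering can be rendered as runnable pseudocode. This is a toy: the counter-mode SHA-256 keystream and length-prefixed framing stand in for obfs4's real NaCl-based construction, but the round trip shows what each layer does:

```python
import os
from hashlib import sha256

CELL = 514  # Tor cell size on the wire (link protocol v4+)

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (counter-mode SHA-256); real obfs4 uses NaCl."""
    out, ctr = b"", 0
    while len(out) < length:
        out += sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:length]

def obfs_wrap(cells: list[bytes], key: bytes) -> bytes:
    """Frame each cell with a length prefix plus random padding, then
    encrypt, hiding the fixed 514-byte cell pattern."""
    stream = b""
    for cell in cells:
        pad = os.urandom(os.urandom(1)[0] % 64)        # 0-63 padding bytes
        body = len(cell).to_bytes(2, "big") + cell + pad
        stream += len(body).to_bytes(2, "big") + body
    ks = keystream(key, len(stream))
    return bytes(a ^ b for a, b in zip(stream, ks))

def obfs_unwrap(data: bytes, key: bytes) -> list[bytes]:
    """Bridge side: decrypt, walk the frames, recover Tor cells."""
    ks = keystream(key, len(data))
    stream = bytes(a ^ b for a, b in zip(data, ks))
    cells, i = [], 0
    while i < len(stream):
        body_len = int.from_bytes(stream[i:i + 2], "big"); i += 2
        cell_len = int.from_bytes(stream[i:i + 2], "big")
        cells.append(stream[i + 2:i + 2 + cell_len])
        i += body_len                                   # skip cell + padding
    return cells

cells = [os.urandom(CELL) for _ in range(3)]
key = os.urandom(32)
assert obfs_unwrap(obfs_wrap(cells, key), key) == cells
```

Random padding means the same cells wrap to different wire bytes on each call, so no static byte-pattern fingerprint survives.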

Identify what each layer hides:

  • obfs4 framing: hides Tor's fixed cell-size pattern.
  • Per-session encryption: prevents the same Tor cell from producing the same obfs4 bytes (no static fingerprint).
  • Padding: hides cell-count patterns.
  • IAT randomization: hides cell-timing patterns.
  • Bridge-as-TCP-server: hides Tor relay structure (the bridge looks like any TCP server).

What does this not hide?

  • Bridge IP (still observable; can be blocked once known).
  • Long-lived TCP connection patterns (different from typical web traffic).
  • Total bandwidth and timing (different from typical patterns).

Transport strategy comparison table.

Transport  | Threat addressed                                  | Cost                                    | Failure mode
obfs2      | Plain-Tor passive identification                  | Low                                     | Defeated by entropy analysis
obfs3      | obfs2's threats, plus passive key recovery        | Low                                     | Same byte-distribution issue
obfs4      | Passive + active probing                          | Moderate; bridge management             | Behavioral classification; bridge enumeration
meek       | All passive + active, for HTTPS targets           | High; depends on CDN policies           | CDN policy change collapses the mechanism
Snowflake  | Bridge enumeration (countered by volunteer churn) | Volunteer availability; WebRTC overhead | Broker availability; behavioral classification

The pattern: each transport addresses what the previous failed against; each has new failure modes that future transports try to address. There's no permanent winner.

Common misconceptions and traps

"Pluggable transports change Tor's anonymity." They change the transport-layer observability of the connection to the bridge. The relay-based anonymity (three hops, divided knowledge) is unchanged.

"obfs4 makes Tor undetectable." It makes Tor harder to detect with simple methods. Sophisticated censors with active probing and behavioral classification can still identify obfs4 bridges with substantial accuracy, especially over time.

"meek is dead because of provider changes." meek-the-transport persists for some specific CDN/provider configurations. The strategic mass-availability of domain fronting collapsed; the technical capability still exists in narrow contexts.

"Snowflake is a Tor replacement." Snowflake is a Tor pluggable transport; it carries Tor traffic. The user is still using Tor; the transport just changes what bridge connection looks like.

"More obfuscation always helps." More obfuscation costs bandwidth, latency, and complexity. The right level depends on the adversary; over-engineering against weak adversaries is wasteful.

Wrapping up

Pluggable transports are a sequence of responses to evolving censorship capabilities. obfs2 and obfs3 addressed passive identification (and failed at active probing). obfs4 added active-probe resistance via per-bridge secrets. meek exploited domain fronting until the provider policies changed. Snowflake distributed bridges via volunteer WebRTC, addressing enumeration but accepting volunteer churn.

There is no permanent solution; there is only the next move in an ongoing game. Understanding the obfs lineage is what lets you reason about current transports — REALITY, naïveproxy, Hysteria, refraction networking — as moves in the same game with different specific tactics.

The next module (domain-fronting-the-rise-fall-remnant — coming soon) covers the meek-related domain-fronting story in depth: how cross-layer naming made it work, what cloud provider policy changes destroyed it, and what remnants survive.
