Anonymity Engineering · 8 min read

JA3 and JA4 TLS fingerprints, explained

How JA3 and JA4 fingerprint the TLS ClientHello, what they're good for, and why they are correlation signals rather than identities.

TLS fingerprinting is about client shape, not client content.

That sounds obvious once you say it out loud, but a lot of bad explanations still treat JA3 and JA4 like they are somehow reading encrypted traffic after the fact. They are not. They work because the interesting part happens before the encrypted session is established, inside the TLS ClientHello.

That is why fingerprinting survives changing IPs, new certificates, CDNs, and all the other surface churn people like to point at when they want to sound reassuring.

If the client keeps speaking TLS in the same shape, the observer still has something to correlate.

Why the ClientHello is enough

Before application data is encrypted, the client has to introduce itself. It proposes a TLS version, advertises cipher suites, declares extensions, and signals other preferences about how it wants the session to work.

That opening message is full of structure.
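That structure is literal: the ClientHello is a length-prefixed binary message that anyone on the path can walk. Here is a minimal sketch, assuming a single unfragmented TLS record and using only the standard library. The function name parse_client_hello is illustrative and does not come from any library.

```python
import struct

def parse_client_hello(record: bytes) -> dict:
    """Extract version, cipher suites, and extension types from one TLS record.

    Assumes the record holds a complete ClientHello (no fragmentation handling).
    """
    content_type, _record_version, record_len = struct.unpack_from("!BHH", record, 0)
    if content_type != 0x16:
        raise ValueError("not a TLS handshake record")
    body = record[5:5 + record_len]
    if body[0] != 0x01:
        raise ValueError("not a ClientHello")

    pos = 4  # skip handshake type (1 byte) and handshake length (3 bytes)
    (legacy_version,) = struct.unpack_from("!H", body, pos)
    pos += 2 + 32  # version field, then 32 bytes of client random

    session_id_len = body[pos]
    pos += 1 + session_id_len

    (ciphers_len,) = struct.unpack_from("!H", body, pos)
    pos += 2
    ciphers = list(struct.unpack_from(f"!{ciphers_len // 2}H", body, pos))
    pos += ciphers_len

    compression_len = body[pos]
    pos += 1 + compression_len

    extensions = []
    if pos < len(body):  # extensions are optional in older hellos
        (ext_total,) = struct.unpack_from("!H", body, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", body, pos)
            extensions.append(ext_type)  # keep wire order; skip the payload
            pos += 4 + ext_len

    return {
        "legacy_version": legacy_version,
        "ciphers": ciphers,
        "extensions": extensions,
    }
```

Everything JA3 and JA4 consume comes out of a walk like this one. No keys are involved, because nothing at this point in the handshake is encrypted.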

Different clients tend to produce different structure:

  • browser families
  • runtime libraries
  • malware frameworks
  • headless automation stacks
  • custom clients built on Go, Java, Python, curl, or OpenSSL defaults

JA3 and JA4 turn that structure into compact labels.

The right mental model is not "this fingerprint identifies Alice." It is "this fingerprint identifies the kind of TLS speaker that showed up."

That distinction matters a lot.

JA3 in one screen

The canonical JA3 README defines JA3 as a method of TLS client fingerprinting based on the ClientHello. It uses five fields:

SSLVersion,Cipher,SSLExtension,EllipticCurve,EllipticCurvePointFormat

Within each field the values are written in decimal and joined with dashes, the five fields are joined with commas, and the resulting string is hashed with MD5 to yield a compact 32-character hexadecimal fingerprint.

The important thing here is not the MD5. People get distracted by that.

JA3 is not using MD5 as a cryptographic integrity primitive. It is using MD5 as a short, stable label for a structured string. You are not relying on its collision resistance the way you would at a security boundary. You are using it as a convenient index for clustering and comparison.

In that sense, a JA3 fingerprint is closer to a naming scheme than a proof.
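The five-field construction described above can be sketched in a few lines. This is a minimal illustration, not the reference implementation. The function names are made up for this example, and the input values are arbitrary sample numbers rather than a real client's fingerprint.

```python
import hashlib

def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input: five comma-separated fields, values dash-joined in decimal."""
    fields = [[version], ciphers, extensions, curves, point_formats]
    return ",".join("-".join(str(v) for v in field) for field in fields)

def ja3_hash(ja3: str) -> str:
    """MD5 is used purely as a compact label for the string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()

# Illustrative values only (GREASE values are assumed to be removed already).
example = ja3_string(771, [4865, 4866, 49199], [0, 10, 11, 13], [29, 23], [0])
print(example)            # 771,4865-4866-49199,0-10-11-13,29-23,0
print(ja3_hash(example))  # 32 hex characters
```

An empty field simply leaves its slot blank, so a hello with no extensions still yields four commas.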

The original project also defined JA3S, which fingerprints the server-side response pattern. That gives you a client-and-server pair, which can be useful when the point is not just "what client is this?" but "what client-server combination keeps recurring?"
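A minimal sketch of that pairing idea, assuming you already have a (JA3, JA3S) tuple per session. JA3S hashes three ServerHello fields (version, the chosen cipher, extensions) in the same decimal, comma-and-dash style. The helper names are hypothetical.

```python
import hashlib
from collections import Counter

def ja3s_hash(version, cipher, extensions):
    """JA3S input: ServerHello version, the single chosen cipher, extension types."""
    text = f"{version},{cipher}," + "-".join(str(e) for e in extensions)
    return hashlib.md5(text.encode("ascii")).hexdigest()

def recurring_pairs(sessions, min_count=2):
    """sessions: iterable of (ja3, ja3s) tuples. Returns pairs seen at least min_count times."""
    counts = Counter(sessions)
    return {pair: n for pair, n in counts.items() if n >= min_count}
```

A pair that recurs across unrelated destinations is the interesting case. It suggests the same client software talking to the same kind of server, whatever the IPs say.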

Why JA3 became useful so fast

JA3 spread because it gave defenders something the IP layer could not.

An attacker can rotate:

  • source IPs
  • domains
  • certificates
  • hosting providers

But changing the TLS stack or its exact handshake behavior is often more expensive, less convenient, or simply forgotten. The Salesforce engineering write-up made this point early: the same client often keeps producing the same fingerprint across many different destinations.

That turns the handshake into a correlation signal.

For operators, the common uses are:

  • bot detection
  • malware clustering
  • anomaly detection
  • checking whether a claimed browser actually behaves like that browser

If a request says "I am Chrome on macOS" and the TLS shape looks more like a stock automation library, that mismatch is interesting even before you inspect anything else.

JA3's limits are not subtle

JA3 is useful. JA3 is also old enough that its rough edges are well understood.

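The biggest practical issue is that JA3 is heavily influenced by raw ordering. If the order of ciphers or extensions changes, the fingerprint changes. That makes it more fragile than people expect: Chrome, for example, began randomizing the order of its ClientHello extensions in 2023, which scatters a single browser build across many JA3 hashes.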

This gets worse when you account for:

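  • GREASE values, which have to be stripped before hashing or the fingerprint changes from one connection to the next
  • implementation churn between browser versions
  • protocol evolution
  • differences between connecting by domain name and by IP address, where the SNI extension comes and goes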

It is not that JA3 stops working. It is that engineers start over-reading it.
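The ordering and GREASE problems are easy to reproduce. This is a minimal sketch with made-up extension lists. The helper names are illustrative, and the GREASE test follows the value pattern in RFC 8701 (0x0a0a, 0x1a1a, and so on up to 0xfafa).

```python
import hashlib

def is_grease(value: int) -> bool:
    """GREASE values have two identical bytes whose low nibble is 0xA."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)

def ordered_label(values):
    """JA3-style: hash the values in the order they appeared on the wire."""
    text = "-".join(str(v) for v in values if not is_grease(v))
    return hashlib.md5(text.encode("ascii")).hexdigest()

def sorted_label(values):
    """Order-insensitive variant: sort before hashing."""
    kept = sorted(v for v in values if not is_grease(v))
    text = "-".join(str(v) for v in kept)
    return hashlib.md5(text.encode("ascii")).hexdigest()

# Same extension set, different wire order, different GREASE draw.
first = [0x0A0A, 0, 23, 65281, 10, 11]
second = [0x3A3A, 11, 10, 0, 65281, 23]

print(ordered_label(first) == ordered_label(second))  # False
print(sorted_label(first) == sorted_label(second))    # True
```

Nothing about the client changed between the two hellos, yet an order-sensitive label treats them as two different clients.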

The JA3 project README itself now notes that Salesforce no longer actively maintains the project and points readers toward FoxIO's newer work. That is a good hint that the field moved on for a reason.

So when people ask whether JA3 is "still useful," the honest answer is:

  • yes, as a familiar correlation signal
  • no, not as the final word on modern TLS fingerprinting

JA4 is the cleanup pass

JA4's technical details are best read as an attempt to keep the value of TLS fingerprinting while reducing some of JA3's unnecessary volatility.

The compact format looks like this:

t13d1516h2_8daaf6152771_e5627efa2ab1

That string is dense, but the construction is more deliberate than it looks.

JA4 begins with compact metadata about:

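  • transport type (t for TLS over TCP, q for QUIC, d for DTLS)
  • TLS version (13 for TLS 1.3)
  • whether SNI is present (d, a domain destination) or absent (i, typically a bare IP)
  • two-digit counts of cipher suites and extensions
  • the first and last characters of the first ALPN value (h2 in the example above)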

Then it sorts the cipher list and the extension list and hashes each with SHA-256, truncated to 12 hex characters. That design choice matters because it reduces dependence on raw ordering noise. JA4 is trying to preserve what is stable about client behavior while stripping out variation that produces too many meaningless fingerprint changes.

That is the central improvement.

Another important design choice is what JA4 keeps out of the hashes. The SNI and ALPN extensions are excluded from the hashed extension list and are represented only in the metadata section, because hashing them would create unnecessary churn between otherwise similar clients depending on whether they talk to domains, IPs, or different application protocols.

Put simply, JA4 is more opinionated about what should count as client identity and what should count as environment noise.
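Putting the pieces together, here is a minimal sketch of the JA4 layout as described in FoxIO's public documentation. It simplifies several things: the caller supplies the TLS version (the spec prefers the highest value in supported_versions), only TLS version codes are mapped, and non-alphanumeric ALPN values are not handled. The function and parameter names are assumptions for illustration, and the input lists are sample values. Check the spec before relying on the output.

```python
import hashlib

TLS_VERSIONS = {0x0304: "13", 0x0303: "12", 0x0302: "11", 0x0301: "10"}

def is_grease(value: int) -> bool:
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)

def short_hash(text: str) -> str:
    """First 12 hex characters of SHA-256, or twelve zeros for an empty list."""
    if not text:
        return "0" * 12
    return hashlib.sha256(text.encode("ascii")).hexdigest()[:12]

def ja4(transport, version, sni_present, ciphers, extensions, alpn, sig_algs):
    ciphers = [c for c in ciphers if not is_grease(c)]
    extensions = [e for e in extensions if not is_grease(e)]
    first_alpn = alpn[0] if alpn else ""
    alpn_tag = first_alpn[0] + first_alpn[-1] if first_alpn else "00"
    part_a = (
        transport
        + TLS_VERSIONS.get(version, "00")
        + ("d" if sni_present else "i")
        + f"{min(len(ciphers), 99):02d}"
        + f"{min(len(extensions), 99):02d}"  # count still includes SNI and ALPN
        + alpn_tag
    )
    cipher_text = ",".join(f"{c:04x}" for c in sorted(ciphers))
    # SNI (0x0000) and ALPN (0x0010) are left out of the hashed extension list.
    hashed_exts = sorted(e for e in extensions if e not in (0x0000, 0x0010))
    ext_text = ",".join(f"{e:04x}" for e in hashed_exts)
    if sig_algs:  # signature algorithms keep their wire order
        ext_text += "_" + ",".join(f"{s:04x}" for s in sig_algs)
    return f"{part_a}_{short_hash(cipher_text)}_{short_hash(ext_text)}"

sample_ciphers = [0x0A0A, 0x1301, 0x1302, 0x1303, 0xC02B, 0xC02F, 0xC02C, 0xC030,
                  0xCCA9, 0xCCA8, 0xC013, 0xC014, 0x009C, 0x009D, 0x002F, 0x0035]
sample_extensions = [0x0000, 0x0005, 0x000A, 0x000B, 0x000D, 0x0010, 0x0012, 0x0015,
                     0x0017, 0x001B, 0x0023, 0x002B, 0x002D, 0x0033, 0x4469, 0xFF01]
sample_sig_algs = [0x0403, 0x0804, 0x0401, 0x0503, 0x0805, 0x0501, 0x0806, 0x0601]

print(ja4("t", 0x0304, True, sample_ciphers, sample_extensions,
          ["h2", "http/1.1"], sample_sig_algs))  # starts with t13d1516h2_
```

Shuffling sample_ciphers or sample_extensions leaves the result unchanged, which is exactly the volatility JA3 could not absorb.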

What engineers actually do with JA3 and JA4

The boring real-world use is usually better than the dramatic one.

Most teams are not using JA3 or JA4 to declare "this exact person is malicious." They are using them to answer softer, more operationally useful questions:

  • Does this client cluster with known automation?
  • Did a browser claim arrive with a non-browser TLS shape?
  • Did a traffic family suddenly change handshake behavior?
  • Is the same odd client pattern now coming from many networks?
  • Does the observed handshake fit the story the user-agent and app layer are telling?

That last one matters a lot.

Fingerprinting becomes much more powerful when combined with consistency checks. A claimed Chrome browser paired with a curl-like or Go net/http-like TLS fingerprint is not proof of abuse, but it is a perfectly reasonable reason to look closer.

A simple workflow might be:

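# List the ClientHello messages (handshake type 1) in the capture
tshark -r capture.pcap -Y 'tls.handshake.type == 1'
# Replay the same capture through Zeek, which writes ssl.log to the current directory
zeek -r capture.pcap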

Then compare the extracted ClientHellos or derived fingerprints against what the application layer claims to be.

Or, operationally:

Claimed browser: Chrome
Observed TLS fingerprint: curl-like or Go net/http-like
Conclusion: investigate automation or impersonation
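The same check can be written as a small function. This is a sketch under loud assumptions: the KNOWN_FAMILIES table is a hypothetical lookup you would have to build from your own traffic, its keys are placeholders rather than real fingerprints, and the User-Agent parsing is deliberately crude.

```python
# Hypothetical fingerprint-to-family table. Keys are placeholders, not real JA3/JA4 values.
KNOWN_FAMILIES = {
    "fp-chrome-example": "chrome",
    "fp-curl-example": "curl",
    "fp-go-example": "go-net-http",
}

def claimed_family(user_agent: str) -> str:
    """Very rough User-Agent classification, for illustration only."""
    ua = user_agent.lower()
    if "curl/" in ua:
        return "curl"
    if "go-http-client" in ua:
        return "go-net-http"
    if "chrome/" in ua:
        return "chrome"
    return "unknown"

def check_consistency(user_agent: str, fingerprint: str) -> str:
    """Return 'consistent', 'investigate', or 'no-signal'. Never a verdict."""
    claimed = claimed_family(user_agent)
    observed = KNOWN_FAMILIES.get(fingerprint, "unknown")
    if "unknown" in (claimed, observed):
        return "no-signal"
    return "consistent" if claimed == observed else "investigate"

ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Chrome/120.0.0.0 Safari/537.36"
print(check_consistency(ua, "fp-curl-example"))  # investigate
```

The return values are deliberately soft. A mismatch raises the priority of a closer look. It does not block anyone on its own.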

This is where the topic overlaps nicely with /blog/browser-fingerprint-hardening. HTTP headers, JS-visible browser traits, and TLS handshake shape are all parts of the same consistency story.

What JA3 and JA4 are not

They are not identities.

This sounds like a nitpick until you watch people turn a good correlation feature into a bad certainty machine.

A fingerprint can be:

  • shared by many legitimate clients
  • reproduced by tooling
  • altered by library upgrades
  • reshaped by intermediaries or runtime differences
  • made noisier by transport changes

So the right operator stance is: treat JA3 and JA4 as signals, not verdicts.
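One way to keep yourself honest is to measure how widely a fingerprint is shared before reading anything into it. Here is a minimal sketch, assuming IPv4 events as (fingerprint, source_ip) tuples and using /24 as a rough stand-in for "network". Both choices are illustrative, and the function name is hypothetical.

```python
import ipaddress
from collections import defaultdict

def spread_by_fingerprint(events):
    """Count distinct /24 networks per fingerprint.

    events: iterable of (fingerprint, source_ip) tuples, IPv4 only.
    """
    networks = defaultdict(set)
    for fingerprint, source_ip in events:
        net = ipaddress.ip_network(f"{source_ip}/24", strict=False)
        networks[fingerprint].add(net)
    return {fp: len(nets) for fp, nets in networks.items()}

events = [
    ("fp-a", "198.51.100.7"),
    ("fp-a", "198.51.100.9"),   # same /24 as the line above
    ("fp-a", "203.0.113.5"),
    ("fp-b", "192.0.2.1"),
]
print(spread_by_fingerprint(events))  # {'fp-a': 2, 'fp-b': 1}
```

A fingerprint seen from thousands of unrelated networks is probably a popular browser build. A rare one that suddenly appears from many networks at once is a cluster worth naming.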

This matters especially in privacy and anonymity work. If you are reading about TLS fingerprints because you care about being harder to classify, the lesson is not "one fingerprint equals one user." The lesson is that your client stack leaves a behavioral silhouette before encryption hides the payload.

That is why articles like /blog/active-probing-defense and /blog/xray-reality-vs-wireguard end up touching the same territory. Traffic classification starts earlier than many people expect.

Why JA4 is showing up more often now

JA4 is no longer just a niche spec somebody put on GitHub. Platforms are operationalizing it. AWS announced JA4 fingerprinting support in WAF, which is a strong signal that TLS-shape classification is now normal defensive infrastructure rather than researcher trivia.

That does not mean JA4 "replaces JA3 everywhere." It means the field is converging on the idea that cleaner, lower-volatility fingerprints are more useful at scale.

The FoxIO JA4 repository also makes an important licensing point: the TLS client fingerprinting part of JA4 is open-source under BSD 3-Clause, while the broader JA4+ suite has different licensing. That matters if you are deciding what to adopt in internal tooling.

The operator opinion

The most useful way to think about JA3 and JA4 is not as magical surveillance dust, and not as useless vendor jargon. They are shorthand for how a TLS client introduces itself before the encrypted session begins.

JA3 got popular because that shorthand was operationally valuable. JA4 matters because people learned where JA3 was too fragile and tried to keep the good part while reducing the noisy part.

If you are defending systems, that means:

  • cluster behavior, do not worship single hits
  • compare TLS fingerprints with app-layer claims
  • assume drift over time
  • prefer correlation stories over identity stories

If you are building privacy-sensitive systems, the lesson is the mirror image:

  • changing IPs is not enough
  • changing domains is not enough
  • encrypting payloads is not enough if the handshake shape remains easy to classify

That is the value of these fingerprints. They remind you that "encrypted" does not mean "featureless."

And that is also why good network OPSEC starts before the first byte of application data ever moves.