Key derivation: HKDF and friends
Why one secret becomes many keys: HKDF extract-then-expand, PBKDF2 vs Argon2id, salts, domain separation, and the failure mode of reusing keys across contexts.
A real protocol never uses a single key. TLS 1.3 derives at least four traffic keys from a single handshake secret. WireGuard derives separate transmit and receive keys per peer per direction. Signal derives a fresh key per message. The thing that turns "we have shared randomness" into "we have a fit-for-purpose set of cryptographic keys" is a key derivation function (KDF). Two distinct families exist: KDFs for deriving keys from already-high-entropy secrets (HKDF), and KDFs for deriving keys from low-entropy passwords (PBKDF2, scrypt, Argon2id). Confusing them — using a fast HKDF on passwords, or running Argon2 on a TLS handshake secret — is a class of bug worth recognizing.
Prerequisites
- Module 2.3 — Hash functions and message authentication. HKDF is built on HMAC; understanding HMAC's properties is the floor.
- Module 2.4 — Asymmetric crypto. Diffie-Hellman shared secrets are the canonical input to HKDF.
Learning objectives
By the end of this module you should be able to:
- Explain why raw shared secrets and passwords are not directly suitable as encryption keys.
- Describe HKDF's two-stage extract-then-expand model and articulate the role of salt, info/context strings, and pseudorandom keys.
- Compare PBKDF2, scrypt, and Argon2id at the threat-model level — what each is designed to slow down, and what attacker capabilities each assumes.
- Trace how TLS 1.3 derives client/server handshake and traffic secrets from one Diffie-Hellman output.
- Recognize KDF misuse patterns: reusing salts, omitting context strings, using HKDF on passwords, using password KDFs on transport secrets.
Why key material needs shaping
A raw secret — the output of an X25519 exchange, a master key from a hardware token, a shared symmetric key — is rarely directly usable as an encryption key. Several reasons:
Length mismatch. A Diffie-Hellman shared secret over Curve25519 is exactly 32 bytes. AES-128 wants a 16-byte key; AES-256 wants 32. HMAC-SHA-256 wants a key up to 64 bytes. The application probably wants several keys at once. The raw secret doesn't fit any specific protocol's needs without shaping.
Distribution non-uniformity. A DH shared secret is a value in a specific algebraic group. Its byte representation has subtle structural properties — leading zeros, bias toward small numbers in some curves. A symmetric cipher expects a uniformly random key; passing in a structured value violates that assumption. Even if it works in practice, the security analysis breaks down.
Context conflation. A protocol typically needs multiple keys: client-side transmit, server-side transmit, MAC key, exporter secrets. If the protocol uses the same raw secret for all of them, a flaw in one usage compromises all the others. Domain separation — explicit per-purpose key derivation — is what prevents this.
Compromise containment. If one derived key is leaked through a side channel, the master secret should remain safe. HKDF's design ensures that knowing a derived key doesn't reveal the input secret or other derived keys.
A KDF turns "some secret input" into "a structured family of keys, each fit for a specific purpose." Modern protocols rely on this transformation; ad-hoc concatenation of secrets and labels does not have the same security properties.
HKDF's extract-then-expand model
HKDF (HMAC-based Key Derivation Function) is the canonical KDF for high-entropy inputs. Its design splits the work into two clearly-separated stages:
Extract absorbs the input secret into a pseudorandom key (PRK) of fixed length:
PRK = HMAC(salt, IKM)
IKM is the input keying material (the DH secret, the master key). salt is a public value that may be empty. The HMAC operation strips structural patterns from the input and produces a value that's effectively a uniformly-random PRK of the hash's output size (32 bytes for HMAC-SHA-256).
Expand generates output keys from the PRK with context-binding info:
T(1) = HMAC(PRK, info || 0x01)
T(2) = HMAC(PRK, T(1) || info || 0x02)
... (continues until enough bytes)
output = T(1) || T(2) || ...
The expand stage chains HMAC blocks, feeding the previous block, the info string, and a one-byte counter back into each HMAC call. Concatenating the blocks yields output of any requested length. The same PRK with different info strings produces unrelated output keys.
The split is engineering, not just academic: the extract stage handles "make the input look random"; the expand stage handles "produce as many keys as you need with explicit context." Each stage has a clear job and clear inputs.
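The two stages are small enough to write out with nothing but the standard library's hmac module. This is a sketch for study, not a replacement for a vetted implementation; it checks itself against RFC 5869's first published test vector:

```python
"""HKDF-SHA256 from scratch, following RFC 5869 -- for study only."""
import hashlib
import hmac


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """Extract: concentrate the input keying material into a fixed-size PRK."""
    if not salt:
        salt = bytes(32)  # RFC 5869 default: a hash-length string of zeros
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Expand: chain HMAC blocks of (previous block || info || counter)."""
    assert length <= 255 * 32  # RFC 5869 output limit for SHA-256
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# RFC 5869 Appendix A, test case 1.
ikm = bytes([0x0B] * 22)
salt = bytes(range(0x0D))          # 0x00..0x0c
info = bytes(range(0xF0, 0xFA))    # 0xf0..0xf9
prk = hkdf_extract(salt, ikm)
okm = hkdf_expand(prk, info, 42)
print(okm.hex())  # matches the RFC's expected OKM
```

Running it reproduces the RFC's expected 42-byte OKM, which is a useful sanity check before trusting any hand-rolled chaining logic.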
In code, the API:
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes

derived_key = HKDF(
    algorithm=hashes.SHA256(),
    length=32,
    salt=b"some-public-salt",              # optional
    info=b"my-protocol/v1/client-tx-key",  # context-binding
).derive(input_keying_material)
Three things to keep straight:
- The salt is public. Its job is uniqueness, not secrecy. A unique salt per session ensures that two sessions with the same input material produce unrelated PRKs.
- The info is the domain separator. Two derivations with different info strings produce unrelated keys. This is how a single PRK becomes client_handshake_traffic_secret, server_handshake_traffic_secret, exporter_master_secret, and so on.
- The length controls how much output is squeezed out. HKDF can produce up to 255 × hash_output_size bytes per call (8160 bytes for SHA-256).
Domain separation and context binding
The single most underappreciated security property HKDF provides is domain separation. Without it, the same secret used in two contexts can be confused: a key derived for one purpose can end up accepted in another, and an attacker who can swap byte streams between contexts exploits exactly that gap.
Concrete example: a protocol uses a shared secret to derive a key for AES-GCM message encryption AND to derive a key for HMAC-based authenticator tokens. If both derivations are hash(secret) without context labels, an attacker who can convince the system to compute either operation on attacker-chosen bytes might recover information about the other. The domain separation that HKDF's info provides — "this is for AES-GCM message encryption, version 1, channel A" vs "this is for HMAC tokens, version 1" — eliminates the cross-context confusion.
The pragmatic protocol-design rule: every derived key gets its own info string identifying the protocol, version, and purpose. A label like "my-app/v2/server-tx-key" is unambiguous; "key1" is not.
This is also why composing two protocols that share a secret needs care. A shared secret used for both TLS and a custom application-level handshake should derive its keys via HKDF with disjoint info prefixes — TLS already does this internally, and your custom handshake should pick a prefix that won't collide with TLS's.
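The failure is easy to demonstrate with a minimal single-block HKDF-Expand written on stdlib HMAC (the PRK, labels, and key names here are illustrative, not from any real protocol):

```python
"""Show how omitting the info/context label collapses two 'different' keys."""
import hashlib
import hmac

prk = bytes(32)  # fixed stand-in for an HKDF-Extract output


def expand(prk: bytes, info: bytes) -> bytes:
    """Single-block HKDF-Expand -- enough for one 32-byte key."""
    return hmac.new(prk, info + b"\x01", hashlib.sha256).digest()


# Without domain separation: both 'purposes' get identical bytes.
enc_key = expand(prk, b"")
mac_key = expand(prk, b"")
print("no labels, keys equal:", enc_key == mac_key)    # the bug

# With domain separation: unrelated keys from the same PRK.
enc_key = expand(prk, b"my-app/v1/aes-gcm-key")
mac_key = expand(prk, b"my-app/v1/hmac-token-key")
print("with labels, keys equal:", enc_key == mac_key)
```

The first pair is byte-for-byte identical; the labeled pair is unrelated. The entire cost of domain separation is choosing a string.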
Password-based key derivation
A password is fundamentally different from a high-entropy secret. A 32-byte DH secret has 256 bits of entropy. A typical user password has maybe 30 bits. An attacker who knows the password's distribution (English words, common substitutions, length 8-12) can guess it in millions of tries; an offline attacker with a hash of the password can guess at GPU speed (10^10 attempts per second).
Password-based KDFs (PBKDFs) are designed to make this expensive. Their job is not to "make a key from a password" — that's trivial. Their job is to make each guess slow, so that an attacker testing 10 billion guesses per second is brought down to thousands or hundreds.
Three generations of PBKDFs:
PBKDF2 (PKCS #5 v2.0, 1999; now RFC 8018). Iterates HMAC-SHA-256 (or another hash) thousands to hundreds of thousands of times. The iteration count is the security parameter. Adequate against 1999-era CPUs; nearly useless against modern GPUs and ASICs, which parallelize hash iterations efficiently. PBKDF2 is legacy: still used in some standards (WPA-PSK, some PKCS formats), but not the right choice for new designs.
scrypt (2009; RFC 7914, 2016). Memory-hard: requires significant memory per password attempt, not just CPU cycles. Because each guess needs its own large working set, an attacker can't parallelize guesses on memory-bandwidth-limited GPUs the way they can with PBKDF2. scrypt was the first widely-deployed memory-hard KDF; it has been used in cryptocurrency proof-of-work and in some password-storage systems.
Argon2 (RFC 9106, 2021). Winner of the 2015 Password Hashing Competition. Three variants: Argon2d (data-dependent memory, fastest), Argon2i (data-independent, side-channel-resistant), Argon2id (hybrid, the recommended default). Memory-hard like scrypt, with parameter knobs for memory size, iteration count, and parallelism. The 2026 default for new password-storage and password-based key derivation.
Typical parameters for Argon2id in 2026:
- Memory: 64 MB to 1 GB per derivation.
- Iterations: 1-3 (the memory cost dominates, not iteration count).
- Parallelism: 1 (don't trade memory hardness for parallel speedup unless you understand the implications).
These parameters are tuned to take 100-500 ms on the server's hardware. The user notices the delay only at login; an offline attacker pays the same cost per guess, so a rig that could test billions of fast-hash guesses per second is brought down to a crawl.
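The "tune to your hardware" step is a small calibration loop. To stay dependency-free, this sketch calibrates stdlib PBKDF2's iteration count to a wall-clock budget; the same shape applies to Argon2id, where you would scale the memory and time costs instead (the budget and probe numbers are illustrative):

```python
"""Calibrate a KDF cost parameter to a per-login wall-clock budget (sketch)."""
import hashlib
import time

TARGET_MS = 200  # per-login budget on this server (illustrative)


def measure_ms(iterations: int) -> float:
    """Time one PBKDF2 derivation at the given iteration count."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"password", b"\x00" * 16, iterations)
    return (time.perf_counter() - start) * 1000


# Time a small probe, then scale linearly to hit the budget.
probe_iters = 10_000
probe_ms = measure_ms(probe_iters)
tuned_iters = max(1, int(probe_iters * TARGET_MS / probe_ms))
print(f"probe: {probe_ms:.1f} ms at {probe_iters} iterations")
print(f"tuned iteration count: {tuned_iters}")
```

Re-run the calibration whenever the hardware changes; a parameter chosen for a 2019 server is too cheap on a 2026 one.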
A password-storage system in 2026 should:
- Use Argon2id for new password hashing.
- Migrate existing PBKDF2 hashes by re-hashing on next successful login (the user enters the password, you verify against the old hash, then store a fresh Argon2id hash).
- Salt every password individually. Salts are not secret; their job is to make rainbow tables ineffective. A unique random salt per password (16+ bytes) is standard.
- Tune parameters to match hardware. A modern server might budget 200 ms per login attempt.
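The migrate-on-login flow above can be sketched as follows. To keep the sketch dependency-free, the "new" hasher is stdlib PBKDF2 standing in for Argon2id (in real code, swap in argon2-cffi's PasswordHasher); the record format and all names are illustrative:

```python
"""Migrate-on-login: upgrade a legacy password record to a stronger scheme."""
import hashlib
import hmac
import os


def legacy_hash(password: bytes, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password, salt, 10_000)  # old scheme


def new_hash(password: bytes, salt: bytes) -> bytes:
    # Stand-in for Argon2id; in production this would be argon2-cffi.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 600_000)


def login(record: dict, password: bytes) -> bool:
    """Verify against whatever scheme the record uses; upgrade legacy on success."""
    hasher = legacy_hash if record["scheme"] == "legacy" else new_hash
    candidate = hasher(password, record["salt"])
    if not hmac.compare_digest(candidate, record["hash"]):
        return False
    if record["scheme"] == "legacy":
        # The plaintext password is in hand and just verified:
        # re-hash it under the new scheme with a fresh salt.
        record["salt"] = os.urandom(16)
        record["hash"] = new_hash(password, record["salt"])
        record["scheme"] = "new"
    return True


# Seed a legacy record, then log in once to trigger the upgrade.
salt = os.urandom(16)
record = {"scheme": "legacy", "salt": salt, "hash": legacy_hash(b"hunter2", salt)}
print("login ok:", login(record, b"hunter2"))
print("scheme after login:", record["scheme"])
```

The key design point: the upgrade happens only inside a successful login, because that is the one moment the plaintext password is legitimately available to re-hash.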
What HKDF is not for
A common confusion: using HKDF for password storage. Don't.
HKDF assumes the input is high-entropy. It's a fast operation — a few HMACs total. It does no work to slow down brute-force attacks. If you HKDF a password and store the output, an attacker can run HKDF on every password guess at the same speed you can. There's no slowdown.
Conversely, Argon2id is wasteful for derivation from a high-entropy secret. The DH shared secret is already 256 bits; you don't need to spend 200 ms hashing it. HKDF in microseconds is correct.
The clean rule:
- High-entropy input (DH secret, master key, hardware-derived key) → HKDF (or HMAC-based KDF).
- Low-entropy input (user password, PIN) → Argon2id (or scrypt for legacy).
Mixing them up is a common protocol-design bug. Either the system is too slow to use (Argon2 on every TLS handshake) or trivially brute-forceable (HKDF on passwords).
KDFs inside real protocols
A whirlwind tour of where you encounter KDFs:
TLS 1.3. Every key in the connection comes from HKDF. The handshake secret is HKDF-Extract'd from the DH shared secret. Client and server handshake traffic secrets are HKDF-Expand'd with different labels. Then the master secret is derived. Then client/server application traffic secrets. Then per-record nonces. The entire key schedule (Module 1.11) is HKDF.
Noise protocol framework (next module). Uses HKDF (specifically HMAC-based "MixKey") to evolve a chained set of keys as messages are exchanged.
WireGuard. Derives transmit and receive keys per peer-direction-pair from a triple-DH operation, with HKDF binding everything together.
HPKE (RFC 9180, Hybrid Public Key Encryption — the framework underlying Encrypted Client Hello and OHTTP). HKDF is the labeled derivation primitive.
TLS 1.3 PSK and 0-RTT. When using a pre-shared key (e.g., for session resumption), HKDF derives the early traffic keys from the PSK plus per-session randomness.
The unifying observation: every modern protocol uses HKDF as the bridge from "we have a shared secret" to "we have the actual keys we'll use." Understanding HKDF is understanding most of the modern key-management story.
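To make the TLS 1.3 case concrete, here is a sketch of HKDF-Expand-Label, the labeled wrapper around HKDF-Expand that RFC 8446 section 7.1 uses for every derivation in the key schedule. The labels are real TLS 1.3 labels; the secret and transcript are fake stand-ins:

```python
"""TLS 1.3-style HKDF-Expand-Label (RFC 8446 sec. 7.1), sketched on stdlib HMAC."""
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def hkdf_expand_label(secret: bytes, label: str, context: bytes, length: int) -> bytes:
    # HkdfLabel = uint16 length || opaque label ("tls13 " + Label) || opaque context
    full_label = b"tls13 " + label.encode()
    hkdf_label = (
        length.to_bytes(2, "big")
        + bytes([len(full_label)]) + full_label
        + bytes([len(context)]) + context
    )
    return hkdf_expand(secret, hkdf_label, length)


handshake_secret = bytes(32)  # stand-in; really HKDF-Extract over the DH output
transcript_hash = hashlib.sha256(b"fake transcript").digest()
c_hs = hkdf_expand_label(handshake_secret, "c hs traffic", transcript_hash, 32)
s_hs = hkdf_expand_label(handshake_secret, "s hs traffic", transcript_hash, 32)
print("client/server handshake secrets differ:", c_hs != s_hs)
```

Note how the label and the transcript hash both feed into the info: the derived keys are bound not just to a purpose string but to everything both parties have said so far in the handshake.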
Failure modes
Several patterns to recognize:
Reusing salt across passwords. Defeats the purpose of salting; rainbow tables become viable. Salt must be unique per password.
Omitting info strings in HKDF. Two unrelated uses of HKDF on the same PRK with empty info produce the same output — which the attacker can exploit to confuse one role for another. Always specify a unique info string per derivation.
Using HKDF for password storage. Fast operation, no slowdown, attackers brute-force at full speed. Use Argon2id.
Using Argon2 for transport keys. Slow operation, no benefit (the input is high-entropy already), wastes CPU on every connection. Use HKDF.
Inventing custom KDFs by concatenation. hash(secret || "client_tx_key") looks fine but lacks a security proof. Specifically, for Merkle–Damgård hashes, this construction is vulnerable to length extension. HKDF (built on HMAC, which resists length extension) is a drop-in replacement without these issues.
Re-deriving keys on every operation. HKDF is fast but not free. Derive keys once at handshake/setup time and reuse them; don't HKDF per packet.
The clean engineering principle: use a standard library's HKDF for high-entropy inputs and a standard library's Argon2id for passwords. Don't invent custom KDFs; don't mix the two.
Hands-on exercise
Exercise 1 — Derive multiple labeled keys from one secret with HKDF
"""Derive separate client and server traffic keys from one shared secret."""
import os

from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes

# Pretend this is a 32-byte X25519 shared secret.
shared_secret = os.urandom(32)
salt = b"my-protocol/v1/handshake-salt"

# Derive a client-side transmit key.
client_tx_key = HKDF(
    algorithm=hashes.SHA256(),
    length=32,
    salt=salt,
    info=b"my-protocol/v1/client-tx-key",
).derive(shared_secret)

# Derive a server-side transmit key from the SAME shared secret.
server_tx_key = HKDF(
    algorithm=hashes.SHA256(),
    length=32,
    salt=salt,
    info=b"my-protocol/v1/server-tx-key",
).derive(shared_secret)

print(f"client_tx_key: {client_tx_key.hex()}")
print(f"server_tx_key: {server_tx_key.hex()}")
print(f"different: {client_tx_key != server_tx_key}")
The two keys are completely unrelated even though they came from the same shared secret. The only difference in the inputs is the info string. This is the domain-separation property that lets a single Diffie-Hellman exchange produce all the keys a TLS connection needs.
Stretch: add four more derivations — client RX key, server RX key, client MAC key, server MAC key — all from the same shared secret with different info strings. Confirm all six are pairwise different. This is essentially TLS 1.3's key schedule shape.
Exercise 2 — Hash a password two ways
"""Compare PBKDF2 and Argon2id for password hashing."""
import os
import time

from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
from cryptography.hazmat.primitives import hashes

password = b"correct-horse-battery-staple"
salt = os.urandom(16)

# PBKDF2-SHA256 with 600,000 iterations (the OWASP 2023 recommendation).
start = time.time()
pbkdf2_hash = PBKDF2HMAC(
    algorithm=hashes.SHA256(),
    length=32,
    salt=salt,
    iterations=600_000,
).derive(password)
pbkdf2_ms = (time.time() - start) * 1000

# Argon2id with libsodium-default-style parameters.
try:
    from argon2 import PasswordHasher

    ph = PasswordHasher(time_cost=3, memory_cost=64 * 1024, parallelism=1)
    start = time.time()
    argon2_hash = ph.hash(password.decode())
    argon2_ms = (time.time() - start) * 1000
    print(f"PBKDF2 (600k iter):       {pbkdf2_ms:>6.1f} ms, output {pbkdf2_hash.hex()[:16]}...")
    print(f"Argon2id (3 iter, 64 MB): {argon2_ms:>6.1f} ms, output {argon2_hash[:32]}...")
except ImportError:
    print(f"PBKDF2 (600k iter): {pbkdf2_ms:>6.1f} ms, output {pbkdf2_hash.hex()[:16]}...")
    print("install argon2-cffi to compare against Argon2id")
Run it. Both take ~100-300 ms on a modern laptop. The output values are completely different (different algorithms with different parameter spaces).
The interesting comparison isn't the output bytes; it's the attacker cost. PBKDF2 with 600k iterations costs the attacker ~600k SHA-256 invocations per guess — fast on a GPU farm. Argon2id with 64 MB of memory and 3 iterations costs the attacker ~3 passes over 64 MB of memory plus computation per guess — slow, and the GPU farm can't massively parallelize because each concurrent guess needs its own 64 MB working set. Argon2id is much harder to brute-force at scale.
Common misconceptions
"A shared secret is already a key, so a KDF is optional polish." A raw shared secret has unknown distribution and lacks domain separation. HKDF turns it into uniformly-random fit-for-purpose key material with explicit context binding. Skipping HKDF means using a structurally weaker construction that will eventually be exploited.
"PBKDF2 and Argon2 are the same except one is newer." They have different attacker-cost models. PBKDF2 is CPU-iteration-hard, which is parallelizable on GPUs and ASICs. Argon2 is memory-hard, which is much harder to parallelize. The cost-per-guess for a real attacker can differ by 100× or more.
"Salt must be secret." Salt needs to be unique, not secret. Its purpose is preventing rainbow tables — a precomputed hash of every password under every common salt would defeat naive password storage, but a unique random salt per password makes it ineffective. Public salts (in the database alongside the hash) are fine.
"HKDF is for passwords." No. HKDF is for high-entropy inputs (DH secrets, master keys). Using HKDF for passwords is dangerous because HKDF is fast — attackers brute-force at full speed.
"One KDF output can be reused everywhere." Without domain separation (the info field), the same PRK used in two contexts can be confused. Always use a unique info string per derivation.
Further reading
- RFC 5869 — HKDF. The HKDF specification. Short, readable, definitive.
- RFC 9106 — Argon2. The current Argon2 specification, including parameter recommendations.
- RFC 8018 — PKCS #5: Password-Based Cryptography (PBKDF2). The legacy PBKDF2 spec.
- OWASP Password Storage Cheat Sheet. Practical parameter guidance for Argon2id, PBKDF2, scrypt.
- Hugo Krawczyk, Cryptographic Extraction and Key Derivation: The HKDF Scheme, CRYPTO 2010. The paper underlying RFC 5869. Worth reading once for the design rationale.
- Niels Provos and David Mazières, A Future-Adaptable Password Scheme, 1999 (the bcrypt paper). Historical context for password hashing and the parameter-tuning approach Argon2 generalizes.
The next module — Noise protocol framework — picks up from "we have a KDF" and shows how protocols like WireGuard's NoiseIK chain primitives into modular, formal-verification-friendly handshake patterns.