Encrypted Transport · Part 1 of 7 · 17 min read · advanced

IPsec, the original VPN

IPsec from first principles: ESP vs AH, transport vs tunnel mode, IKEv2's role, why it dominates enterprise gateways and why everyone else fled to WireGuard.

IPsec is the most-deployed VPN protocol nobody likes. Almost every site-to-site link between corporate networks runs over it. Many of a cellular phone's connections to the carrier core are protected by IPsec without the user ever seeing it. Hardware vendors (Cisco, Juniper, Fortinet, Palo Alto) sell tens of billions of dollars of IPsec gateways every year. And almost every engineer who has had to configure it from scratch comes away wishing they could replace it with WireGuard. The protocol is genuinely powerful: it operates at the IP layer, supports almost arbitrary topologies, and has decades of standards work behind it. The cost is operational complexity that newer protocols deliberately reject. This module walks through what IPsec actually is, how its pieces fit together, and why the engineering tradeoff has tilted away from it for new deployments while keeping it firmly entrenched where it already lives.

Learning objectives

By the end of this module you should be able to:

  1. Explain IPsec as an architectural layer for protecting IP packets, not as one product or one tunnel.
  2. Distinguish AH from ESP, transport mode from tunnel mode, and IKEv2 from the data-plane protocols it negotiates.
  3. Trace the lifecycle of a Security Association — selector, SPI, anti-replay window, rekey — and explain how IKEv2 manages them.
  4. Diagnose why IPsec is powerful yet operationally unpopular compared with WireGuard or TLS-based VPNs, and articulate where it still wins.
  5. Read an ESP packet capture and identify what's encrypted, what's authenticated, and what's visible to network observers.

Why IPsec was the original answer

When the IETF set out to add cryptographic protection to the Internet in the early 1990s, the obvious choice was to do it at the IP layer. Every IP packet would be optionally encryptable by the routing infrastructure or by endpoints; higher-layer protocols (TCP, UDP, application) wouldn't need to know anything about it. The end-to-end packet model would survive intact, just with cryptographic guarantees added.

IPsec is the result of that design intent. It's not a single protocol — it's an architecture (RFC 4301) that defines how IP packets get cryptographically protected in transit, plus a set of protocols (ESP, AH, IKEv2) that implement specific pieces of the architecture.

The architectural choice was pure: protect at the IP layer, support both end-to-end and gateway-to-gateway use, support arbitrary higher-layer protocols. The implementation choice was less pure: extensive optionality at every layer (modes, algorithms, key-exchange options, selectors, NAT traversal variants) to satisfy every deployment scenario. The result is the most-flexible VPN protocol in widespread use, and also the hardest to configure correctly.

When TLS arrived for application-layer encryption in the mid-1990s and became universal for web traffic, IPsec retreated to its remaining niches: site-to-site network connections and remote-access VPNs. Cellular networks settled on IPsec for several always-on protected paths, such as signalling between phones and the operator's infrastructure. Hardware vendors built dedicated IPsec offload silicon. The protocol became entrenched in enterprise routers and carrier gear, where it remains.

IPsec architecture, not just packets

The architecture document, RFC 4301, defines the conceptual model:

Security Policy Database (SPD). Per-host policy: for each combination of source IP / destination IP / protocol / port, decide whether the traffic should be protected, dropped, or bypassed. The SPD is consulted on every packet by the IP forwarding plane.

Security Association Database (SAD). Active cryptographic state: each entry holds a Security Association (SA), meaning keys, sequence numbers, anti-replay window, and the algorithms in use. Identified by an SPI (Security Parameters Index, 32 bits). RFC 4301 lets a unicast SA be looked up by SPI alone, though many implementations also key on the destination IP and the protocol (ESP or AH).

Selectors. Tuples that identify which traffic an SA applies to: source/destination IP, protocol, port. When a packet arrives, the SPD is consulted; if a matching policy says "protect," the SAD is consulted to find the active SA; the packet is encapsulated and sent.

Mode. Whether the SA protects the original IP packet inside a new outer header (tunnel mode) or just protects the payload of the original packet (transport mode).

Direction. Each SA is unidirectional. A bidirectional protected channel is two SAs.

This abstraction is more involved than newer VPN protocols expose. A WireGuard peer just has "the other endpoint and the keys"; an IPsec peer has SPDs, SADs, selectors, modes, directions, and lifetime parameters per SA. The complexity buys flexibility — you can have one IPsec stack protect dozens of distinct flows with different policies — at the cost of configuration surface that operators have to understand and maintain.
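To make the SPD idea concrete, here is a minimal Python sketch of an ordered policy lookup. The `Policy` class, the `spd_lookup` function, and the example prefixes are hypothetical names chosen for illustration; a real SPD lives in the kernel and supports richer selectors (port ranges, source ports, name-based entries).

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Policy:
    src: str               # source prefix, e.g. "10.10.0.0/16"
    dst: str               # destination prefix
    proto: Optional[int]   # IP protocol number, None = any
    dport: Optional[int]   # destination port, None = any
    action: str            # "PROTECT", "BYPASS" or "DISCARD"


def spd_lookup(spd: List[Policy], src: str, dst: str, proto: int, dport: int) -> str:
    """Return the action of the first policy whose selector matches the packet."""
    for p in spd:
        if ip_address(src) not in ip_network(p.src):
            continue
        if ip_address(dst) not in ip_network(p.dst):
            continue
        if p.proto is not None and p.proto != proto:
            continue
        if p.dport is not None and p.dport != dport:
            continue
        return p.action
    # RFC 4301 treats traffic that matches no policy as discarded.
    return "DISCARD"


# Illustrative policy set: protect site-to-site traffic, let DNS bypass IPsec.
SPD = [
    Policy("10.10.0.0/16", "10.20.0.0/16", None, None, "PROTECT"),
    Policy("0.0.0.0/0", "0.0.0.0/0", 17, 53, "BYPASS"),
]
```

A packet from 10.10.1.5 to 10.20.3.4 hits the first entry and would then be handed to the SAD lookup; anything that matches neither entry is dropped. The ordering matters, which is one reason SPD configuration is easy to get subtly wrong.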

AH versus ESP

Two data-plane protocols carry the protected packets:

AH (Authentication Header, RFC 4302). Provides integrity and authentication, no confidentiality. The header sits between the IP header and the upper-layer protocol; it includes a sequence number, the SPI, and an integrity check value computed over the packet. AH also covers the immutable fields of the IP header, including the source and destination addresses; mutable fields such as TTL and the header checksum are excluded from the check.

ESP (Encapsulating Security Payload, RFC 4303). Provides confidentiality and integrity. The packet's original payload (and, in tunnel mode, the original IP header) is encrypted under an SA's symmetric key; an ESP header (SPI + sequence number) precedes the encrypted region; an ESP trailer (padding + pad length + next-header byte) follows the payload and is encrypted with it; the Integrity Check Value (ICV, which is the AEAD tag for combined-mode ciphers) comes last.

In practice, ESP wins almost all real deployments. Reasons:

  • ESP-with-AEAD provides both confidentiality and integrity in one operation. AH provides only integrity, with no confidentiality, which is rarely what people actually want.
  • AH protects outer IP header fields including the source IP. NAT rewrites the source IP, which invalidates AH's integrity check. A NAT cannot repair the check because it does not hold the SA keys, so AH and NAT are fundamentally incompatible; ESP works through NAT (with NAT-T encapsulation, see below).
  • ESP is what hardware accelerates. Routers and NICs have ESP-specific offload paths; AH is a niche feature.

So when someone says "an IPsec connection," they usually mean ESP plus IKEv2 plus tunnel mode. AH technically still exists in the standard but you'll rarely see it in production.

Transport mode versus tunnel mode

ESP can wrap a packet in two ways:

Transport mode. The original IP header is preserved. The ESP header sits between the IP header and the original payload (TCP, UDP, ICMP). Only the payload is encrypted; the IP header fields (source, destination, etc.) are visible.

[IP][ESP header][TCP][app data][ESP trailer][ESP auth]
                <------- encrypted -------->
    <----------- authenticated ------------>

Used for host-to-host protection: two endpoints encrypting their direct traffic without changing the routing topology. Less common in deployment because the visible original IP addresses leak which hosts are talking.

Tunnel mode. The original IP packet (header + payload) is encrypted in its entirety. A new outer IP header is prepended, addressed to the IPsec gateway. The original packet is invisible until decryption.

[outer IP][ESP header][[inner IP][TCP][app data]][ESP trailer][ESP auth]
                      <------------- encrypted -------------->
          <----------------- authenticated ------------------>

Used for gateway-to-gateway VPNs: each side has an IPsec gateway, packets to remote networks are encrypted into tunnel-mode ESP, sent over the public internet to the remote gateway, decrypted, and forwarded into the remote network. This is the canonical "VPN" mental model.

Most enterprise deployments use tunnel mode. Most cellular deployments mix both modes for different control-plane purposes.
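The tunnel-mode layout above translates directly into per-packet overhead. The following Python sketch does the arithmetic under stated assumptions: an IPv4 outer header without options (20 bytes), an AES-GCM ESP SA with an 8-byte IV and 16-byte ICV, and the minimum padding needed to keep the trailer aligned to 4 bytes. The function name and defaults are illustrative; real stacks may pad more, and other ciphers have different IV and ICV sizes.

```python
def esp_tunnel_overhead(inner_len: int, nat_t: bool = False,
                        iv_len: int = 8, icv_len: int = 16, align: int = 4) -> int:
    """Bytes added to an inner IP packet by tunnel-mode ESP (assumed AES-GCM sizes)."""
    outer_ip = 20                    # IPv4 header, no options
    udp = 8 if nat_t else 0          # NAT-T wraps ESP in UDP/4500
    esp_header = 8                   # SPI (4) + sequence number (4)
    trailer = 2                      # pad length (1) + next header (1)
    # Minimum padding so inner packet + padding + trailer ends on an `align` boundary.
    pad = (-(inner_len + trailer)) % align
    return outer_ip + udp + esp_header + iv_len + pad + trailer + icv_len


overhead_plain = esp_tunnel_overhead(1400)              # 1400-byte inner packet
overhead_nat_t = esp_tunnel_overhead(1400, nat_t=True)  # same packet through NAT-T
```

Under these assumptions a 1400-byte inner packet grows by 56 bytes, or 64 with NAT-T, which is why tunnel MTUs are typically set well below the path MTU.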

Security Associations and SPIs

A Security Association is a one-way encrypted channel. Each direction needs its own SA. A bidirectional VPN connection is therefore two SAs.

Each SA includes:

  • SPI (Security Parameters Index, 32-bit). The receiver-chosen identifier that lets it look up the right SA on inbound packets.
  • Destination IP. Commonly combined with the SPI to identify the SA at the receiver (RFC 4301 allows unicast lookup by SPI alone).
  • Cryptographic keys. Encryption key, integrity key (or AEAD key for combined modes).
  • Algorithm choices. Specific cipher (AES-128-GCM, ChaCha20-Poly1305), specific integrity (for non-AEAD modes).
  • Sequence number. Monotonically incremented per packet for replay protection.
  • Anti-replay window. The receiver tracks recently-seen sequence numbers to detect replays.
  • Lifetime. After N bytes or T seconds, the SA must be rekeyed.

The SPI is what the receiver sees on incoming packets and uses to look up the SA. The receiver chose the SPI when the SA was negotiated and recorded it in its SAD; the sender simply writes that value into every packet it sends on the SA.

When a packet arrives:

  1. Look up SA by (destination IP, SPI).
  2. If no SA found, drop.
  3. Verify sequence number against anti-replay window. Drop if duplicate or too old.
  4. Decrypt and verify integrity tag. Drop if tag fails.
  5. Update anti-replay window.
  6. Hand the inner packet to forwarding (tunnel mode) or up the stack (transport mode).
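Steps 3 and 5 are usually implemented as a sliding bitmap. The Python sketch below shows one common way to do it, assuming a 64-packet window and 32-bit sequence numbers without extended sequence number (ESN) handling; the class and method names are hypothetical.

```python
class ReplayWindow:
    """Sliding-window anti-replay check for one inbound SA (illustrative sketch)."""

    def __init__(self, size: int = 64):
        self.size = size
        self.top = 0      # highest sequence number accepted so far
        self.bitmap = 0   # bit i set => sequence number (top - i) already seen

    def check(self, seq: int) -> bool:
        """Step 3: would this sequence number be acceptable? Does not change state."""
        if seq == 0:
            return False                  # ESP sequence numbers start at 1
        if seq > self.top:
            return True                   # newer than anything seen so far
        offset = self.top - seq
        if offset >= self.size:
            return False                  # fell off the left edge: too old
        return ((self.bitmap >> offset) & 1) == 0   # reject duplicates

    def update(self, seq: int) -> None:
        """Step 5: record seq, only after the integrity check has passed."""
        if seq > self.top:
            shift = seq - self.top
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.top = seq
        else:
            self.bitmap |= 1 << (self.top - seq)
```

Splitting `check` from `update` mirrors the ordering in the list: the window is only advanced once the packet has been authenticated, so a forged packet with a huge sequence number cannot push legitimate traffic out of the window.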

The state per SA is significant — keys, sequence number, anti-replay window — and is held in the kernel for performance. Modern implementations support hundreds of thousands of concurrent SAs per gateway.

IKEv2 as the control plane

IKEv2 (Internet Key Exchange version 2) is the control protocol that establishes Security Associations. It runs on UDP/500 (or UDP/4500 with NAT-T), exchanges keys via Diffie-Hellman, authenticates the peers, and negotiates which algorithms the data-plane SAs will use.

The IKEv2 exchange flow:

IKE_SA_INIT. First two messages. The initiator sends its algorithm proposals, a Diffie-Hellman public value for the group it expects the responder to accept, and a nonce. The responder picks one proposal and returns its own DH value and nonce. The result is an IKE SA: a control-plane SA used to protect subsequent IKEv2 messages.

IKE_AUTH. Next two messages, encrypted under the IKE SA. Peers exchange identity (certificate, PSK, etc.) and prove possession of the corresponding authentication material. They then negotiate the first Child SA — the actual data-plane ESP SA.

CREATE_CHILD_SA. Used to create additional data-plane SAs or to rekey existing ones without restarting the IKE conversation.

INFORMATIONAL. Various housekeeping: dead-peer detection, deletion of SAs, error messages.

Two SAs come out of the initial exchanges (IKE_SA_INIT plus IKE_AUTH): the IKE SA (control plane) and one Child SA (data plane, itself a pair of unidirectional ESP SAs, one per direction). Most deployments use only one Child SA per peer; deployments that need multiple data flows (e.g., multiple traffic selectors with different policies) create additional Child SAs.

The complexity of IKEv2 is in the negotiation. Each side proposes a list of acceptable algorithms (cipher, integrity, DH group, PRF); they negotiate down to a common set. Identity is verified via certificates (RSA or ECDSA signatures), PSKs, or EAP. The two sides need not use the same method; a common remote-access pattern is a certificate for the gateway and EAP for the client. Child SAs can have their own selectors, lifetimes, and rekey behaviors.

NAT traversal

The original IPsec assumed clean end-to-end IP. NAT broke that assumption thoroughly. Two issues:

AH is incompatible with NAT because it integrity-protects the source IP that NAT rewrites. AH-protected packets through NAT fail integrity and are dropped. Most deployments use ESP only for this reason.

ESP through NAT also has problems. ESP (IP protocol 50) has no port numbers, so a port-translating NAT has nothing to multiplex on. Some NATs try to track flows by SPI instead; others, designed only for TCP/UDP, mistranslate or drop ESP packets outright.

The fix is NAT-Traversal (NAT-T, RFC 3948): encapsulate ESP inside UDP. The packet becomes:

[outer IP][UDP/4500][ESP][...]

Now the NAT sees a normal UDP/4500 flow. It translates source port and IP normally. The receiving IPsec stack strips the UDP header and processes the ESP packet inside.
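Because IKE and ESP share UDP/4500 once NAT-T is active, the receiver has to tell them apart. RFC 3948 does this with a 4-byte all-zero "non-ESP marker" in front of IKE messages (an ESP SPI is never zero) and a 1-byte 0xFF keepalive. The Python sketch below classifies a UDP/4500 payload on that basis; the function name and return strings are hypothetical.

```python
def classify_udp4500(payload: bytes) -> str:
    """Classify the payload of a UDP/4500 datagram per the RFC 3948 framing rules."""
    if payload == b"\xff":
        return "keepalive"      # NAT keepalive: keeps the NAT mapping alive, then dropped
    if len(payload) >= 4 and payload[:4] == b"\x00\x00\x00\x00":
        return "ike"            # non-ESP marker: the rest is an IKE message
    if len(payload) >= 8:
        return "esp"            # first 4 bytes are a non-zero SPI, then the sequence number
    return "invalid"            # too short to be any of the above
```

In a real stack the "esp" branch strips nothing further: the UDP payload is already a complete ESP packet and goes straight to the SA lookup.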

NAT detection happens automatically during IKE_SA_INIT. Each side sends NAT-detection payloads containing a hash of the IP address and port it believes the packet is using; the receiver recomputes the hash over the address and port it actually observed. If the two don't match, a NAT was in the path, and both sides switch to UDP/4500 encapsulation.
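The detection check itself is small. RFC 7296 defines the NAT-detection payload data as a SHA-1 digest over the two IKE SPIs, the IP address, and the port. The Python sketch below is IPv4-only and uses hypothetical function names and documentation addresses; it shows the comparison, not a full IKE implementation.

```python
import hashlib
import socket
import struct


def nat_detection_hash(spi_i: bytes, spi_r: bytes, ip: str, port: int) -> bytes:
    """SHA-1 over SPIi | SPIr | IPv4 address | port (network byte order)."""
    addr = socket.inet_aton(ip)                 # 4 bytes; IPv6 is not handled in this sketch
    return hashlib.sha1(spi_i + spi_r + addr + struct.pack("!H", port)).digest()


def nat_in_path(received_hash: bytes, spi_i: bytes, spi_r: bytes,
                observed_ip: str, observed_port: int) -> bool:
    """True if the sender's view of its address differs from what we observed."""
    return received_hash != nat_detection_hash(spi_i, spi_r, observed_ip, observed_port)


# Example values (made up): the initiator sits at a private address behind a NAT.
spi_i = bytes.fromhex("1122334455667788")
spi_r = bytes(8)                                # responder SPI is zero in the first message
sent = nat_detection_hash(spi_i, spi_r, "192.168.1.10", 500)
behind_nat = nat_in_path(sent, spi_i, spi_r, "203.0.113.7", 1024)
```

Hashing rather than sending the raw address means the private address is not exposed directly, while still letting the peer notice that something rewrote it.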

Most modern IPsec deployments use NAT-T by default because so many endpoints are behind NAT. The cost is 8 bytes of UDP header per packet.

Algorithm agility and operational complexity

IKEv2's algorithm-negotiation surface is huge. Each side advertises:

  • Encryption algorithms (AES-128-CBC, AES-256-CBC, AES-128-GCM, AES-256-GCM, ChaCha20-Poly1305, etc.).
  • Integrity algorithms (HMAC-SHA-256, HMAC-SHA-384, AES-GMAC, etc.).
  • Pseudorandom functions (PRF-HMAC-SHA-256, etc.).
  • Diffie-Hellman groups (Group 14 = MODP-2048, Group 15 = MODP-3072, Group 19 = ECP-256, Group 20 = ECP-384, Group 31 = X25519, etc.).
  • Authentication methods (PSK, RSA signatures, ECDSA signatures, EAP).

Two implementations have to end up with a common choice in all five dimensions. The first four are negotiated in the proposal; the authentication method is configured per peer and has to be acceptable to the other side. The matrix of "what does this peer support" makes interop testing notoriously painful. An AES-256-CBC vs AES-128-GCM mismatch produces an opaque NO_PROPOSAL_CHOSEN error that gives no hint about which dimension caused the failure.
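The negotiation can be pictured as an intersection per transform type. The Python sketch below is a simplification with hypothetical names: real IKEv2 responders choose among whole proposals rather than mixing transforms freely, and the real error does not say which type failed. The sketch names the failing type precisely to show the information operators wish they had.

```python
from typing import Dict, List


def choose_proposal(initiator: Dict[str, List[str]],
                    responder: Dict[str, List[str]]) -> Dict[str, str]:
    """Per transform type, pick the first initiator choice the responder accepts."""
    chosen = {}
    for kind, offered in initiator.items():
        accepted = responder.get(kind, [])
        match = next((alg for alg in offered if alg in accepted), None)
        if match is None:
            # A real peer would only report NO_PROPOSAL_CHOSEN, without the detail.
            raise LookupError(f"NO_PROPOSAL_CHOSEN (no common {kind})")
        chosen[kind] = match
    return chosen


# Illustrative configurations for two gateways.
site_a = {"encr": ["aes256gcm16", "aes128gcm16"],
          "prf": ["prfsha384", "prfsha256"],
          "dh": ["x25519", "ecp256"]}
site_b = {"encr": ["aes128gcm16"],
          "prf": ["prfsha256", "prfsha384"],
          "dh": ["ecp256", "x25519"]}
agreed = choose_proposal(site_a, site_b)
```

Here the two sites settle on AES-128-GCM, PRF-HMAC-SHA-384, and X25519. Remove `aes128gcm16` from either list and the whole exchange fails, even though every other dimension still overlaps.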

This is one of the operational reasons engineers dislike IPsec. Every interop session involves trial-and-error to find a common configuration; configuration files have to be kept in sync between gateways; algorithm negotiation failures are vague.

WireGuard's response was to fix the algorithms entirely, with no negotiation: ChaCha20-Poly1305, BLAKE2s, and X25519. One choice. No matrix. The trade-off is no algorithm agility: if Curve25519 is ever broken, every WireGuard installation has to be upgraded to a new protocol version. In practice this is acceptable because Curve25519 hasn't been broken, and switching to a new fixed algorithm set is no worse than the IPsec version-bump cycle.

Why operators fled to simpler systems

The honest summary of IPsec's deployment story:

strongSwan and other mature stacks make IPsec work well at scale, with extensive deployment experience. Production cellular cores, enterprise gateways, and government networks all run IPsec reliably.

The cost is operational expertise. Configuring IPsec correctly requires understanding SPDs, SADs, selectors, modes, algorithms, NAT-T, IKEv2 negotiation, and rekey behavior. Most generalist engineers don't have that expertise; the protocol's combinatorial complexity is a real burden.

For new deployments, the engineering tradeoff has tilted toward WireGuard:

  • WireGuard's configuration is one ini file with a half-dozen lines per peer.
  • WireGuard's protocol is fixed; no negotiation matrix.
  • WireGuard's performance is comparable to or better than IPsec on commodity hardware.
  • WireGuard's security analysis is tractable (it's Noise IK with specific primitives).

For existing deployments, IPsec stays. The installed base of routers, gateways, and cellular cores can't be replaced overnight, and IPsec is what they speak.

The honest 2026 guidance: use WireGuard for new VPN deployments unless you have a specific requirement IPsec uniquely satisfies (hardware acceleration on legacy gear, mandated by compliance, peering with infrastructure that only speaks IPsec).

What IPsec still does exceptionally well

Several scenarios where IPsec remains the right choice:

Site-to-site at scale. A corporate network with dozens of branch offices, each with a hardware IPsec gateway, talking to a central HQ over IPsec tunnels. The hardware vendors have decades of operational experience and tooling for this.

Cellular operator infrastructure. Interfaces between phones and the carrier core, between carriers, and between base stations and the centralized core commonly run over IPsec, and 3GPP standards call for it on several of them.

Standards-mandated environments. Government, financial services, and regulated industries with explicit compliance requirements often need IPsec specifically because it's the standardized choice.

Hardware-offloaded throughput. Modern enterprise routers do IPsec-AES at line rate (10+ Gbps) via dedicated silicon. WireGuard is software-only on most platforms (it ships in the mainline Linux kernel, but there is no widespread hardware acceleration yet).

Multi-protocol selectors. IPsec can protect specific flows (this src/dst pair, this protocol, this port) while leaving others unprotected. WireGuard tunnels everything in the AllowedIPs range; you can't selectively protect within a tunnel without additional firewalling.

For each of these scenarios, IPsec is the right answer. For everything else in 2026, WireGuard is.

Hands-on exercise

Exercise 1 — Inspect an ESP packet structure

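If you have an IPsec-using setup, capture some traffic. The second clause of the filter also catches NAT-T, where ESP (and IKE) ride inside UDP/4500:

sudo tcpdump -ni any -w /tmp/esp.pcap 'esp or udp port 4500'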

Open in Wireshark. Each ESP packet has:

  • Outer IP header (visible).
  • ESP header: SPI (4 bytes) + sequence number (4 bytes), all visible.
  • Encrypted payload (opaque to observers without keys).
  • ESP trailer (padding + pad length + next-header byte, encrypted).
  • ESP authentication trailer (the ICV, typically 12 or 16 bytes; for AEAD ciphers this is the tag).

In tunnel mode, the encrypted payload begins with the inner IP header, followed by the inner protocol's data. In transport mode, the encrypted payload is the original transport segment (the TCP/UDP/etc. header plus its data) directly.
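The only fields you can decode without keys are the first eight bytes. As a minimal sketch, the Python function below splits a raw ESP packet (everything after the outer IP header, or after the UDP header with NAT-T) into SPI, sequence number, and the opaque remainder. The function name and the sample bytes are made up for illustration.

```python
import struct
from typing import Tuple


def parse_esp_header(esp: bytes) -> Tuple[int, int, bytes]:
    """Return (spi, sequence_number, opaque_rest) from a raw ESP packet."""
    if len(esp) < 8:
        raise ValueError("ESP packet shorter than its 8-byte header")
    spi, seq = struct.unpack("!II", esp[:8])   # two 32-bit big-endian integers
    # Everything after the header is IV, ciphertext, and ICV: opaque without the SA keys.
    return spi, seq, esp[8:]


# Made-up sample: SPI 0xc1a2b3c4, sequence number 42, 32 opaque bytes.
sample = bytes.fromhex("c1a2b3c40000002a") + bytes(32)
spi, seq, rest = parse_esp_header(sample)
```

Watching the sequence number climb across captured packets, and the SPI change when an SA is rekeyed, is a quick way to connect the capture back to the SA lifecycle described earlier.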

Observable to a network observer:

  • Source/destination IPs (the gateway endpoints, in tunnel mode).
  • Packet sizes (gives traffic-pattern hints).
  • Timing.

Hidden:

  • Inner source/destination (in tunnel mode).
  • Application protocol.
  • Application data.

This is the protocol's confidentiality story.

Exercise 2 — Read a minimal strongSwan config

# /etc/swanctl/conf.d/site-to-site.conf
connections {
    site-a-to-b {
        local_addrs = 198.51.100.10
        remote_addrs = 198.51.100.20
        version = 2
        proposals = aes256gcm16-prfsha384-x25519
        local {
            id = site-a@example.com
            auth = pubkey
            certs = site-a.cert.pem
        }
        remote {
            id = site-b@example.com
            auth = pubkey
        }
        children {
            site-a-to-b {
                local_ts = 10.10.0.0/16
                remote_ts = 10.20.0.0/16
                esp_proposals = aes256gcm16
                start_action = trap
            }
        }
    }
}

Reading this config:

  • The IKE SA uses AES-256-GCM, SHA-384 PRF, X25519 DH.
  • Each side has a certificate identifying it.
  • The Child SA (children block) protects traffic between 10.10.0.0/16 (local side) and 10.20.0.0/16 (remote side).
  • The data-plane SA uses AES-256-GCM for ESP.
  • start_action = trap means the kernel will install the policy and the SA will be created on first matching packet.

Map the config fields to the IPsec concepts: traffic selectors (local_ts, remote_ts), authentication method (auth = pubkey plus the cert), algorithm proposals (IKE and ESP separately), tunnel mode (the swanctl default for a child, and implied here because local_ts differs from local_addrs: the protected subnets sit behind the gateways).

A WireGuard equivalent of this config is roughly half as many lines. That's the operational difference in concrete terms.

Common misconceptions

"IPsec is one protocol." It's an architecture (RFC 4301) plus several protocols (ESP, AH, IKEv2) plus extensive negotiation surface. Saying "use IPsec" without specifying which mode and which algorithm is like saying "use TLS" without picking a version.

"AH plus ESP is the normal secure choice." AH is rare in production. NAT incompatibility, NAT-T's lack of AH support, and the redundancy with ESP-AEAD have pushed almost all deployments to ESP-only.

"IKEv2 is just the same thing as IPsec." IKEv2 is the control plane that negotiates SAs. ESP is the data plane that protects packets. They're separate and either could in principle be replaced by something else. Some setups use IKE-less keying via static configuration; some research has explored IKEv2 with non-IPsec data planes.

"IPsec is obsolete because WireGuard exists." IPsec is entrenched in enterprise gateways, cellular cores, and hardware-accelerated infrastructure. It's not going away. WireGuard is the better choice for new deployments where simplicity matters more than flexibility.

"The problem with IPsec is only bad UX." The deeper issue is genuine combinatorial complexity. Modes × algorithms × selectors × NAT-T × IKEv2 options × authentication methods × certificate ecosystems. Each axis is justified by some real use case; the sum is daunting.

Further reading

  1. RFC 4301 — Security Architecture for the Internet Protocol. The canonical IPsec architecture.
  2. RFC 4303 — IP Encapsulating Security Payload (ESP). The data-plane protocol.
  3. RFC 7296 — Internet Key Exchange Protocol Version 2 (IKEv2). The control plane.
  4. RFC 3948 — UDP Encapsulation of IPsec ESP Packets. NAT-Traversal.
  5. strongSwan documentation. The most-used open-source IPsec stack; their docs are the working engineer's reference.
  6. Doraswamy and Harkins, IPsec: The New Security Standard for the Internet, Intranets, and Virtual Private Networks, 2nd ed. The classic textbook treatment. It predates the RFC 4301 and IKEv2 revisions, but the conceptual material still holds up.

The next module — OpenVPN: the friendly compromise — covers the protocol that bridged IPsec's enterprise model and modern simpler designs, and why it remained popular long after IPsec's complexity became a complaint.