Networking Fundamentals · Part 6 of 12 · Network Hardening · 16 min read · introductory

UDP, the simplest transport

UDP from first principles: datagram semantics, the 8-byte header, why DNS / QUIC / RTP / metrics protocols choose it, and when 'almost nothing' is the right answer.

UDP is the transport protocol that does almost nothing, and the things it doesn't do are the entire reason it exists. Where TCP gives you a reliable, ordered, congestion-controlled byte stream, UDP gives you a slightly-prettier IP packet with two ports tacked on. The header is 8 bytes. The state machine has zero states. The receiver has no idea whether the sender wants a reply, a retransmission, or anything else. This sounds like a deficiency until you notice that almost every protocol that has to make timing or recovery decisions of its own — DNS, QUIC, RTP, metrics shippers, online games, STUN, WebRTC — built itself on UDP precisely to escape what TCP imposes. This module is the foundational pass: enough to read a UDP capture, write a working echo server in fifteen lines, and have a defensible opinion about when "almost nothing" is exactly the right answer.

Prerequisites

  • Module 1.5 — The IP forwarding plane. UDP rides on IP; understanding how IP routing and adjacency work is the floor below this discussion.

Learning objectives

By the end of this module you should be able to:

  1. Decode a UDP datagram header byte by byte and explain what the protocol deliberately does not provide.
  2. Contrast UDP's connectionless message-oriented transport with TCP's connection-oriented byte-stream semantics.
  3. Identify which application protocols built themselves on UDP and articulate the engineering reason in each case.
  4. Build and observe a minimal UDP echo pair with python3 plus tcpdump, and recognize what the loss of a single datagram looks like to the application.
  5. Recognize UDP's failure modes — silent loss, duplication, reordering, oversized fragmentation — and know which ones the application must handle itself.

What UDP leaves out on purpose

The cleanest way to understand UDP is to list everything it doesn't do, and notice that every omission corresponds to something the application is expected to handle (or not need) itself:

  • No connection setup or teardown. UDP has no SYN/SYN-ACK/ACK, no handshake. A sender simply sends; the receiver simply receives.
  • No reliability. A lost datagram is gone. UDP makes no attempt to retransmit, and no one notifies anyone.
  • No ordering. Datagrams may arrive in a different order than they were sent. UDP preserves no sequence semantics.
  • No flow control. A sender will happily flood a slow receiver until the receiver's kernel drops datagrams.
  • No congestion control. A UDP sender does not adapt its rate based on network conditions. If the network is congested, UDP keeps blasting.
  • No state. A UDP socket holds no per-flow state in the kernel beyond the local port number and the application's interest.

What UDP does provide is also short:

  • Port-based demultiplexing. Source and destination ports let multiple applications share one IP address.
  • A length field. The datagram knows how big it is.
  • A checksum over the header and payload (and a small pseudo-header taken from IP), so corrupted datagrams can be detected and dropped.

That's it. UDP is a thin demultiplexing wrapper around IP packets. The receiver's kernel hands the application a buffer; the application interprets the bytes. The sender's kernel takes a buffer and packs it into one IP packet with a UDP header.

The omissions are not bugs. They are the protocol's reason for existing. Many application classes need precisely this — a way to address a port on a remote host without inheriting TCP's machinery, because they're going to do their own recovery, timing, or multiplexing in user space.

Datagram semantics

The most important practical difference between UDP and TCP is message boundary preservation.

When you send() 100 bytes on a TCP socket and then send() another 100 bytes, the receiver might recv() and get 200 bytes in one call. Or 50 bytes the first time and 150 the next. TCP delivers a byte stream; the receiving application has to do its own framing.

When you sendto() 100 bytes on a UDP socket and then sendto() another 100 bytes, the receiver does two recvfrom() calls and gets exactly one 100-byte datagram per call. Each sendto() produces exactly one datagram on the wire (modulo IP fragmentation, which we'll get to). Each recvfrom() returns exactly one datagram or blocks waiting for one.

This is the datagram model: discrete messages, atomic at the transport layer. The application thinks in messages; the kernel preserves the boundaries.

The model maps cleanly onto request/response protocols (DNS), event protocols (RTP packets, syslog UDP, telemetry shippers), and any communication where the application's natural unit is "one message" rather than "a stream of bytes." For a stream-oriented application — bulk file transfer, an HTTP/1.1 connection — the byte-stream model TCP provides is a better fit. For a message-oriented one, datagrams are exactly the right abstraction.

Two practical consequences:

  • A single datagram is atomic. Either the whole thing arrives or none of it does. The receiver never gets half a datagram.
  • An oversized datagram is at risk. If the application tries to send a UDP datagram larger than the path MTU, the IP layer will fragment it. The receiver will reassemble — or, if any fragment is lost, the receiver gets nothing. UDP applications that send large messages are gambling on every fragment surviving. Most modern UDP-based protocols work hard to keep each datagram below the path MTU: DNS-over-UDP was traditionally capped at 512 bytes, the widely adopted EDNS0 default is now 1232 bytes (chosen for IPv6 safety), and QUIC negotiates a path-compatible MTU.
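The boundary preservation is easy to see in a few lines of Python — a minimal sketch using two sockets on the loopback interface (the kernel picks an ephemeral port for the receiver):

```python
import socket

# Receiver: bind to an ephemeral loopback port.
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))
addr = recv_sock.getsockname()

# Sender: two sendto() calls produce two distinct datagrams.
send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(b"A" * 100, addr)
send_sock.sendto(b"B" * 100, addr)

# Each recvfrom() returns exactly one datagram, even though the
# buffer could hold both. A TCP recv() here might return 200 bytes
# in one call; UDP never merges messages.
first, _ = recv_sock.recvfrom(65535)
second, _ = recv_sock.recvfrom(65535)
print(len(first), len(second))
```

On loopback both lengths come back as 100 — the kernel preserves the two sendto() boundaries regardless of how large the receive buffer is.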

The UDP header byte by byte

The UDP header is a glorious 8 bytes:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             Length            |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          (Payload)                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Field by field:

  • Source Port (16 bits). Identifies the application that sent the datagram. RFC 768 makes it optional — a sender may set it to zero when no reply is expected — but in practice every kernel populates it so the receiver has an address for replies.
  • Destination Port (16 bits). Identifies the receiving application. Together with the destination IP, this is the demultiplexing key.
  • Length (16 bits). Total length of the UDP datagram including header, in bytes. Maximum value 65,535, but practical limits from the underlying IP packet size and path MTU constrain real datagrams to ~1472 bytes on Ethernet (1500 IP MTU minus 20 IPv4 header minus 8 UDP header).
  • Checksum (16 bits). Computed over a pseudo-header (source IP, dest IP, protocol number, length) plus the UDP header plus the payload. Detects corruption in transit. In IPv4 the checksum is technically optional (set to all zeros to disable), but most stacks compute it. In IPv6 the checksum is mandatory because IPv6 has no header checksum of its own.

The pseudo-header is a quirk worth understanding. The checksum field doesn't only cover the UDP header and payload; it covers a "fake" prefix that includes the source and destination IP addresses. This catches a specific class of bug: a packet that gets misrouted and arrives at the wrong host (different destination IP) will fail the checksum, because the receiver computes the checksum using its own IP rather than the IP the sender intended. Without the pseudo-header, a misrouted packet could pass checksum and be delivered to the wrong application.
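The arithmetic is concrete enough to sketch. Below is an illustrative IPv4 UDP checksum built from the RFC 1071 one's-complement sum; the addresses and ports are made-up example values:

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length data
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def udp_checksum(src_ip: str, dst_ip: str, udp_bytes: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus UDP header and payload."""
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, len(udp_bytes)))   # 17 = IPPROTO_UDP
    return internet_checksum(pseudo + udp_bytes)

# Build a header with the checksum field zeroed, compute, then fill it in.
payload = b"hello"
length = 8 + len(payload)
header = struct.pack("!HHHH", 12345, 53, length, 0)
csum = udp_checksum("192.0.2.1", "192.0.2.2", header + payload)
```

A receiver verifies by summing the pseudo-header plus the datagram as received (checksum included); a correct datagram folds to zero. One wire-format wrinkle the sketch skips: in IPv4, a checksum that computes to zero is transmitted as 0xFFFF, because all-zeros means "no checksum."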

No handshake, no guarantees

When a UDP packet arrives at a host:

  1. The kernel demultiplexes by destination port.
  2. If a socket is bound to that port and is willing to receive, the datagram is queued.
  3. The application calls recvfrom() to read it.

If no socket is bound, the kernel typically returns ICMP "Destination Unreachable, port unreachable" to the source. (Some hosts suppress or rate-limit this for security reasons, in which case the datagram is silently dropped and the sender learns nothing.)

The application gets exactly one chance per datagram. If the network drops a datagram, neither sender nor receiver is automatically informed. If the application needs to know about loss, it must build that into its protocol — typically with sequence numbers and explicit acknowledgments. DNS does this implicitly: the client retries after a timeout. QUIC does it explicitly: every QUIC packet is ack'd, and lost packets are recovered by sender-side retransmission, all inside the QUIC layer above UDP.
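That client-driven recovery is only a few lines on top of UDP. A sketch — the attempt count and backoff schedule here are illustrative choices, not prescribed by any protocol:

```python
import socket

def request_with_retry(sock, msg, addr, attempts=3, timeout=1.0):
    """Send a request and wait for a reply, retransmitting on timeout.

    UDP itself never retransmits; the application does, with backoff.
    """
    for attempt in range(attempts):
        sock.settimeout(timeout * (2 ** attempt))   # exponential backoff
        sock.sendto(msg, addr)
        try:
            data, _ = sock.recvfrom(2048)
            return data
        except socket.timeout:
            continue                                # retry the whole request
    raise TimeoutError(f"no reply from {addr} after {attempts} attempts")
```

DNS stub resolvers do essentially this with their own timeout schedules; QUIC replaces the whole-request retry with per-packet acknowledgment and retransmission inside its own layer.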

Reordering is similar. If two datagrams take different paths through the network and arrive out of order, the receiver gets them out of order. The application either accepts that (event protocols often do) or includes sequence numbers to detect and reorder.

Duplication is the same. A datagram that traverses two paths and reaches the receiver twice is delivered twice. The application either tolerates duplicates (idempotent operations) or detects and discards them.
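All three failure modes reduce to the same receiver-side bookkeeping once the application stamps each datagram with a sequence number. A sketch — the classification names are made up for illustration:

```python
def classify_arrival(seq: int, state: dict) -> str:
    """Classify a datagram by its application-level sequence number.

    state["next"] holds the next sequence number the receiver expects.
    """
    expected = state["next"]
    if seq == expected:
        state["next"] = seq + 1
        return "in-order"
    if seq > expected:
        state["next"] = seq + 1          # skip forward; the gap was lost or is late
        return "gap"
    return "duplicate-or-reordered"      # already past this point: dup or late arrival

state = {"next": 0}
print([classify_arrival(s, state) for s in (0, 1, 3, 2, 3)])
```

What the application does with each classification is the protocol design: RTP-style media might play through a gap, a reliable protocol would buffer and request recovery, and an idempotent one can simply drop duplicates.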

The unifying theme: UDP makes the application responsible for reliability, ordering, and deduplication when those properties matter. For protocols where they don't matter — fire-and-forget telemetry, lossy real-time media — that's a feature. For protocols where they do matter, the application has to build them on top of UDP, which is exactly what QUIC, RTP, and many internal RPC protocols do.

Why applications still choose UDP

UDP's appeal becomes obvious when you look at what specific protocols want from their transport.

DNS. A query is one packet, a response is one packet, the client retries after a timeout if no response arrives. The whole protocol fits in a few hundred bytes. Establishing a TCP connection — three-way handshake, possibly TLS, then teardown — would multiply round trips. UDP gets DNS done in 1 RTT for the typical case. (DNS-over-TCP and DNS-over-TLS exist for specific cases — large responses, integrity-sensitive contexts — but the bulk of DNS in 2026 is still UDP.)

RTP and real-time media. Voice, video, and live broadcast tolerate occasional loss but cannot tolerate retransmission. By the time TCP retransmitted a lost packet, the audio frame it carried would already be in the listener's past. RTP runs on UDP, accepts that some packets will be lost, and uses application-layer techniques (forward error correction, packet loss concealment) to mask the loss in the user's experience.

QUIC. Designed to replace TCP for HTTP/3, QUIC needed several things TCP couldn't easily provide: encrypted transport headers, stream multiplexing without head-of-line blocking, fast connection establishment with 0-RTT for resumed sessions, and the ability to evolve without negotiating with every middlebox on the internet. Building it on UDP was the only path: TCP's behavior is baked into kernel implementations everywhere, but UDP is a thin demultiplexing layer the QUIC user-space library can put almost anything on top of. This is why QUIC traffic shows up in captures as UDP — but the actual transport semantics live in QUIC.

STUN and TURN. WebRTC's NAT-traversal helpers are inherently message-oriented and need to work across NATs that may not handle TCP cleanly. UDP is the obvious fit.

Metrics shippers. StatsD's classic UDP wire protocol fits the pattern: applications emit metric events as fire-and-forget UDP datagrams, accepting that some will be lost in exchange for an extremely cheap send path. If a metric collector goes down, applications don't block; they just lose those particular metric points.

Online games. Per-tick state updates are time-sensitive. A retransmitted update that arrives 200 ms late is worse than not arriving at all. UDP plus an application-layer reliability scheme that retransmits only the most-recent state is the standard pattern.

The unifying theme across all of these: the application has its own opinions about reliability, ordering, and timing. UDP gets out of the way so the application can implement its own opinions, rather than having TCP's opinions imposed on top.

When UDP is the wrong choice

The misuse pattern is "I'll use UDP because it's faster" without thinking through what that means. UDP is faster only because it doesn't do things — and if your application then has to do those same things in user space, you've moved the work, not eliminated it.

Three concrete anti-patterns:

Bulk transfer over UDP without congestion control. A file-transfer protocol on UDP that doesn't implement TCP-equivalent congestion control will either underperform (if the application is conservative) or be a bad network citizen (if it isn't). Modern bulk-transfer protocols built on UDP (Aspera's FASP, bulk transfers over QUIC) include sophisticated congestion control. Naive ones do not.

Request/response that secretly reinvents TCP. If you find yourself adding sequence numbers, ACKs, retransmission timers, ordering, and connection state to your UDP protocol, you've reinvented TCP — usually badly, with corner cases TCP solved decades ago. Sometimes the answer is "use TCP." Sometimes it's "use QUIC, which already solved this." Almost never is it "let's keep growing this hand-rolled UDP protocol."

Long-lived flows over NATs. UDP's statelessness means NATs guess. A NAT creates a translation entry on the first outbound UDP packet, then expires it after an idle timeout — often around 30 seconds, sometimes a few minutes for flows the NAT considers established, and varying wildly between vendors. If your application sends infrequently, the NAT entry expires, and inbound packets are dropped because the NAT no longer knows where to send them. The fix is application-layer keepalives — periodic do-nothing packets that keep the NAT entry alive. Without them, UDP across NAT is unreliable in a way that has nothing to do with the protocol itself.
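A keepalive is nothing more than a timer and a throwaway datagram. A sketch — the 20-second default is an assumption; pick an interval comfortably below your NAT's observed idle timeout:

```python
import threading
import time

def start_keepalive(sock, addr, interval=20.0):
    """Periodically send a 1-byte datagram so the NAT's translation
    entry for this flow never goes idle long enough to expire."""
    def loop():
        while True:
            sock.sendto(b"\x00", addr)   # content is irrelevant; the peer ignores it
            time.sleep(interval)
    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t
```

The peer has to tolerate (and ignore) the keepalive byte, so this belongs in the application protocol's design rather than bolted on afterward.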

If you find yourself reaching for UDP, ask which property of TCP you're trying to escape. If the answer is concrete (head-of-line blocking, 3-way handshake latency, kernel-controlled retransmission), UDP plus an application-layer protocol may be right. If the answer is "I heard UDP is faster," you probably want TCP.

Hands-on exercise

Exercise 1 — Build a tiny UDP echo pair

Save the server as udp_echo_server.py:

import socket

PORT = 9999

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("127.0.0.1", PORT))
print(f"udp echo listening on 127.0.0.1:{PORT}")

while True:
    data, addr = s.recvfrom(2048)
    print(f"recv {len(data)} bytes from {addr}: {data!r}")
    s.sendto(data, addr)

And the client as udp_echo_client.py:

import socket
import sys

PORT = 9999
msg = (sys.argv[1] if len(sys.argv) > 1 else "hello").encode()

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.settimeout(2.0)

s.sendto(msg, ("127.0.0.1", PORT))
try:
    data, addr = s.recvfrom(2048)
    print(f"got {len(data)} bytes from {addr}: {data!r}")
except socket.timeout:
    print("no reply within 2s — datagram lost or server not running")

Open two terminals. In Terminal 1, run the server:

python3 udp_echo_server.py

In Terminal 2, run the client:

python3 udp_echo_client.py "hello world"

You should see the client's message echoed back. Now stop the server (Ctrl-C) and rerun the client. Observe the timeout: the client got no reply, so it raised socket.timeout after 2 seconds. UDP gave neither sender nor receiver any indication of why — the application has to decide what to do (retry, give up, raise to the user).

Stretch task: modify the server to drop every third request. The client will see two echoes succeed, then a 2-second timeout, then two more, etc. There is no automatic retransmission. The application would have to add one if it cared.
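One way to write that lossy server — a sketch in which a simple counter stands in for real network loss:

```python
import itertools

def lossy_echo_server(sock, drop_every=3, max_requests=None):
    """Echo server that silently drops every drop_every-th request.

    From the client's perspective a dropped request is indistinguishable
    from real network loss: no reply, no error, just a timeout.
    """
    for count in itertools.count(1):
        if max_requests is not None and count > max_requests:
            return
        data, addr = sock.recvfrom(2048)
        if count % drop_every == 0:
            continue                      # "lose" the datagram: send nothing
        sock.sendto(data, addr)
```

Bind a socket to 127.0.0.1:9999 and pass it to lossy_echo_server to reproduce the two-echoes-then-timeout pattern with the unmodified client.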

Exercise 2 — Capture the datagram exchange

In Terminal 3, while the client and server are running, capture loopback traffic on port 9999:

sudo tcpdump -ni lo0 -X 'udp port 9999'

(On Linux, the loopback interface is usually lo; on macOS, lo0.)

You'll see the request and reply, including their UDP headers. Each line of output is one datagram. Notice:

  • Source and destination ports.
  • The length field matches the size of the UDP header plus payload.
  • The payload (printed in hex and ASCII by -X) matches the application-level message.

Now run the client with a much longer message:

python3 udp_echo_client.py "$(python3 -c 'print("A"*4000)')"

The client sends a 4000-byte UDP datagram. Loopback's MTU is large enough to handle this without IP fragmentation, but on a real network this would exceed Ethernet's 1500-byte MTU and the IP layer would fragment the datagram into multiple IP packets. If any fragment were lost, reassembly would never complete — and to UDP, this looks like a single lost datagram. The application sees nothing useful and has to time out.

This is the operational reality behind "keep UDP datagrams under the path MTU" — which is why DNS-over-UDP traditionally caps at 512 bytes, and why DNS responses larger than that fall back to TCP.
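An application that would rather fail loudly than gamble on fragments can check sizes before sending. A sketch using the conservative Ethernet/IPv4 arithmetic from the header section (1500 − 20 − 8 = 1472):

```python
ETHERNET_MTU = 1500
IPV4_HEADER = 20          # without options
UDP_HEADER = 8
MAX_SAFE_PAYLOAD = ETHERNET_MTU - IPV4_HEADER - UDP_HEADER   # 1472 bytes

def send_unfragmented(sock, payload: bytes, addr, limit: int = MAX_SAFE_PAYLOAD):
    """Refuse to send a payload that would fragment on a plain-Ethernet path."""
    if len(payload) > limit:
        raise ValueError(
            f"{len(payload)}-byte payload exceeds the {limit}-byte safe limit; "
            "it would be fragmented at the IP layer")
    sock.sendto(payload, addr)
```

The real path MTU can be smaller than plain Ethernet's (tunnels, PPPoE), so treat the limit as a ceiling rather than a guarantee — which is why QUIC negotiates a path-compatible MTU instead of assuming one.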

Common misconceptions

"UDP is unreliable, therefore useless." Many applications want explicit control over recovery and timing. DNS, QUIC, RTP, online games, and metrics shippers all chose UDP because TCP's reliability machinery imposes the wrong defaults for what they're doing. UDP isn't useless; it's a different shape of usable.

"UDP is faster because it has no checksum." UDP does have a checksum (it's the 4-byte field in the header). The "speed" is in the omitted state and recovery machinery — no handshake, no retransmission timers, no congestion-window state — not in the checksum. The checksum itself takes nanoseconds and is computed by hardware on most modern NICs.

"One UDP socket means one peer." A UDP socket bound to a port can receive datagrams from any source. The application sees the source address with each recvfrom(). This is what makes UDP servers naturally able to handle thousands of clients on one socket — there's no per-client state in the kernel.

"If a UDP datagram is too large, the application will just get the rest later." Datagram boundaries are atomic. An oversized datagram is fragmented at the IP layer, and if any fragment is lost, the receiver gets nothing. There is no "rest" — it's all-or-nothing per datagram. Send sizes that fit the path MTU.

"If my protocol uses UDP, I don't need congestion control." Responsible UDP-based protocols still need to avoid being bad network citizens. Sending at line rate without backing off when the network is congested is a recipe for collapsing other people's TCP flows on the same path. QUIC includes congestion control. RTP applications use feedback mechanisms. Even StatsD's UDP traffic is rate-limited by the application sending it. "UDP doesn't have congestion control" is a description of the protocol, not a license for the application to ignore the network.

Further reading

  1. RFC 768 — User Datagram Protocol. The 1980 spec. The protocol is so small the original RFC is still the best primary source — three pages.
  2. RFC 8085 — UDP Usage Guidelines. What responsible UDP-using applications should do for congestion control, message sizing, and behavior across NATs.
  3. RFC 9000 — QUIC: A UDP-Based Multiplexed and Secure Transport. The headline modern UDP-built transport. Worth skimming once even if you don't implement QUIC, because it's a master class in what's actually possible to build over UDP.
  4. RFC 3550 — RTP: A Transport Protocol for Real-Time Applications. The classic UDP-based real-time protocol. The contrast with TCP semantics is sharpest here.
  5. Larry Peterson and Bruce Davie, Computer Networks: A Systems Approach, book.systemsapproach.org. Good treatment of when datagram vs stream semantics fit which application.

The next module — TCP at the wire level — covers the byte-stream alternative in detail, and explains exactly which costs UDP is letting you skip.