Bits, signals, and the physical layer
The physical layer from first principles: bits vs symbols, line encoding, clock recovery, noise, bandwidth, and why software engineers should care.
The reason a modern computer network behaves the way it does — why retransmissions exist, why MTU is 1500 and not 65,536, why a fiber link still has measurable latency from one end of a building to the other — is that all of it has to ride on top of physical media. Bits in a CPU register are mathematical; bits on a wire are voltages, photons, or radio waves with finite propagation, finite bandwidth, and unavoidable noise. The protocol stack as a whole is a long argument about how to paper over that gap. You will write better software, debug performance problems faster, and design more honest systems if you understand the floor that everything else is built on.
This module is the floor.
Prerequisites
None. We assume you can count in binary and take a logarithm. Nothing else.
Learning objectives
By the end of this module you should be able to:
- Distinguish cleanly between bits, symbols, signals, frames, and packets, and explain why mixing them up causes bad analysis.
- Compare line-coding schemes such as NRZ, NRZI, and Manchester on the dimensions that matter — clock recovery, bandwidth efficiency, DC balance — and predict which one will fail under what conditions.
- Calculate, at a back-of-the-envelope level, how a medium's bandwidth and signal-to-noise ratio bound the data rate it can carry.
- Decompose end-to-end latency into propagation, serialization, queueing, and processing, and identify which one dominates a given link.
- Trace a specific upper-layer artifact (a 1,500-byte MTU, a 56-bit Ethernet preamble, a CRC trailer) back to the physical-layer constraint that produced it.
Why the physical layer still matters to software people
There is a recurring habit in software engineering culture of treating the wire as somebody else's problem. The OSI diagram puts a tidy line between the physical and data-link layers, the kernel hides everything below the socket API, and most of the time it works. Then a transfer that should take 30 seconds takes 8 minutes, and the mailing list fills up with diagnoses of "weird congestion" or "TCP just being slow," none of which turn out to be the real cause. The real cause is almost always at the bottom: a noisy radio link silently dropping framing bytes, a duplex mismatch on a switched port causing late collisions and silent frame drops, a transceiver that's heat-cycled past its tolerance and is corrupting one symbol in a million.
Three categories of issue collapse onto physical-layer reasoning:
- Capacity questions. "Why won't my 10 Gb/s link give me 10 Gb/s?" Almost always because Shannon's bound, framing overhead, encoding overhead (PAM-4, 64b/66b), or serialization across the link's actual symbol clock take a slice off the top.
- Latency questions. "Why does this packet take 12 ms when the wire is only 100 km long?" Propagation in fiber is about 5 µs/km; the rest is serialization, queueing, and middlebox processing.
- Reliability questions. "Why am I seeing checksum failures only at peak traffic?" Because crosstalk, electromagnetic interference, or temperature drift have pushed the bit error rate above what the line code's built-in margins can hide.
You don't need to design transceivers to understand networks, any more than you need to design filesystems to understand databases. But you do need to know that a transceiver exists, what it cares about, and which questions it answers — and which questions you're wasting time investigating elsewhere.
Bits are abstract; signals are what the medium actually carries
Inside your CPU, a bit is a mathematical entity: the value at index i of an array, true or false. Inside the wire, there is no such thing. A wire carries a voltage that varies continuously over time. A fiber strand carries a stream of photons whose intensity rises and falls. A radio link carries an electromagnetic field oscillating at carrier frequency, modulated by the transmitter's input.
The protocol design problem at the physical layer is: how do you map a sequence of mathematical bits onto a continuous-time signal, in a way the receiver can reliably reverse?
To talk about this without confusing ourselves, three terms need to be kept apart:
- A bit is the unit of information. It has no physical form. A 256-byte payload contains 2,048 bits no matter how those bits get transmitted.
- A symbol is one element of the transmitted alphabet on the medium. In a simple two-level scheme, one symbol carries one bit. In a four-level scheme like PAM-4, one symbol carries two bits. In a 16-QAM radio modulation, one symbol carries four bits.
- A signal is the actual continuous-time waveform that conveys those symbols. The signal has amplitude, frequency, and phase characteristics that the medium imposes on it; the receiver's job is to recover the symbol stream from the signal, then the bit stream from the symbols.
The unit bits per second (bps) is what the application sees. The unit baud (symbols per second) is what the modulator sees. They are equal only when one symbol carries one bit. On a modern 1000BASE-T Ethernet link the symbol rate is 125 megabaud on each of four wire pairs, and each symbol carries 2 bits per pair — 8 bits per symbol period across the cable — so the effective rate is 1 Gbps. Not understanding the difference is the source of every confused argument about whether "9600 baud" means 9600 bits per second.
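To keep the two units straight, the arithmetic is worth having at your fingertips. A minimal sketch (a hypothetical helper, not part of any standard tooling):

def bit_rate_bps(symbol_rate_baud: float, bits_per_symbol: float, lanes: int = 1) -> float:
    """Bit rate = symbols per second x bits per symbol x parallel lanes."""
    return symbol_rate_baud * bits_per_symbol * lanes

# 1000BASE-T: 125 Mbaud per pair, 2 bits per symbol, 4 pairs driven at once.
print(bit_rate_bps(125e6, 2, 4))   # 1000000000.0 -> 1 Gbps
# A "9600 baud" link with one bit per symbol really is 9600 bps.
print(bit_rate_bps(9600, 1))       # 9600.0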
The bit/symbol/signal hierarchy is the entire reason higher-layer protocols have framing. Above the wire, the receiving stack reconstructs bits from symbols, frames from bits, packets from frames, and a byte stream from packets. Each layer trusts the one below to deliver units that match its expectations. Each layer lives or dies on whether the layer below is correctly delineating its symbols.
Encoding a bitstream onto a medium
Suppose your medium is a copper twisted pair and you want to transmit the bit string 110100110.... You have to pick a way to map each bit to something the wire can convey.
The simplest scheme is NRZ (Non-Return-to-Zero). A 1 is a high voltage, a 0 is a low voltage, held for one bit period. To send 1101001, you draw a waveform that's high, high, low, high, low, low, high. NRZ is dirt simple, has no per-bit overhead, and uses the minimum bandwidth a binary scheme can use. Early serial standards like RS-232 are essentially NRZ at heart.
NRZ has a fatal weakness for high-speed signaling: long runs of identical bits produce no transitions. A million consecutive 1s in NRZ is just a flat high voltage for a million bit periods. The receiver, which is supposed to sample the signal every bit period, has nothing to lock onto. Tiny clock drift between transmitter and receiver — a few parts per million is a lot at gigabit rates — accumulates over those million bits and the receiver eventually samples in the wrong place.
NRZI (Non-Return-to-Zero Inverted) helps a bit. Instead of "1=high, 0=low," NRZI encodes "1=transition, 0=no transition." So a 1 toggles the line state and a 0 keeps it. This guarantees that any sequence of 1s produces transitions. But you still get long flat stretches when the data has a lot of 0s, so NRZI alone isn't a complete answer.
Manchester encoding solves clock recovery the brute-force way: it puts a transition in every single bit period. A 1 is a low-to-high transition mid-bit, a 0 is a high-to-low transition mid-bit. (There are conventions that flip these definitions; the principle is the same.) The receiver always sees one transition per bit, which gives it a continuous timing signal regardless of payload.
Manchester is what 10BASE-T Ethernet (10 Mbps) uses. It's also why 10BASE-T occupies 20 MHz of bandwidth on the wire to carry 10 Mbps of data: each bit needs two signal halves. That doubling is the cost of self-clocking.
Above 100 Mbps, Manchester becomes prohibitive — you'd need 200 MHz to carry 100 Mbps, and copper twisted pair starts losing signal integrity that high. So 100BASE-TX moves to 4B/5B + MLT-3: every 4 data bits get encoded as a 5-bit code group chosen to guarantee enough transitions, then transmitted as a three-level (MLT-3) signal with restricted state changes. Gigabit and faster links use even denser schemes (8B/10B, 64B/66B, PAM-4) trading higher per-symbol complexity for tighter bandwidth efficiency.
The pattern across all of these: higher line speeds force more sophisticated coding to keep clock recovery alive without burning bandwidth. Manchester was right for 10 Mbps; it would be wrong for 10 Gbps.
Clock recovery and why transitions are precious
It is tempting to imagine the receiver "reading" 1s and 0s the way you read characters on this page. That is not what happens. The receiver samples a continuous-time signal at periodic intervals — one sample (or a small set of samples) per symbol time — and decides what symbol was transmitted. To do that, it needs to know when each symbol time begins and ends.
The transmitter and receiver have separate clocks. They are nominally the same frequency but, in practice, drift relative to each other because of temperature, manufacturing tolerance, and crystal aging. A typical commercial oscillator is rated at ±100 parts per million — meaning over 1 million bit periods, the receiver might be 100 bit-times off from the transmitter. At gigabit rates, 100 bit-times is 100 nanoseconds, and being even one bit-time off is fatal.
The receiver compensates by running a clock-recovery loop. The loop watches the incoming waveform, picks out signal transitions, and adjusts the local sampling phase so that samples land in the middle of each symbol period. This works only if there are transitions in the signal often enough for the loop to track drift.
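To put numbers on "often enough," the drift arithmetic is simple (the half-bit-period tolerance here is an illustrative assumption; real receivers have their own sampling margins):

def bits_until_slip(clock_offset_ppm: float, tolerance_bit_periods: float = 0.5) -> float:
    """Bit periods until accumulated clock drift exceeds the sampling tolerance.

    A relative offset of N ppm accumulates N bit-periods of error per million bit-periods.
    """
    return tolerance_bit_periods / (clock_offset_ppm * 1e-6)

# Two +/-100 ppm oscillators can be up to 200 ppm apart in the worst case:
print(bits_until_slip(200))   # 2500.0 bits before sampling drifts half a bit period
print(bits_until_slip(100))   # 5000.0 bits with a 100 ppm relative offset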
Hence: transitions are precious. Every line-coding scheme is, partly, a strategy for guaranteeing transitions. Manchester does it by per-bit construction. 4B/5B does it by table-lookup substitution that excludes patterns with too many consecutive zeros. 64B/66B does it by scrambling — XOR'ing the data with a pseudo-random sequence so that statistically there are always enough transitions, then framing the scrambled stream with a known synchronization header.
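A sketch of the scrambling idea (the tap positions follow the 1 + x³⁹ + x⁵⁸ polynomial associated with 64B/66B, but the sync headers, bit ordering, and descrambler of the real standard are all omitted):

import random

def scramble(bits, taps=(39, 58), seed=2024):
    # Self-synchronizing (multiplicative) scrambler sketch. Each output bit is
    # the data bit XORed with two earlier *output* bits; taps (39, 58) mirror
    # the 1 + x^39 + x^58 polynomial. Everything else here is simplified.
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(max(taps))]   # randomly seeded register
    out = []
    for b in bits:
        o = b ^ state[taps[0] - 1] ^ state[taps[1] - 1]
        out.append(o)
        state = [o] + state[:-1]        # newest output bit enters the shift register
    return out

# Pathological input: 64 zero bits, which plain NRZ would render as a flat line.
print("".join(map(str, scramble([0] * 64))))   # pseudo-random-looking, transitions throughout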
This is also why the Ethernet preamble exists. Before the actual frame, the transmitter sends 56 bits of alternating 1s and 0s, which after Manchester encoding produces a steady string of transitions. The receiver uses these to lock its clock-recovery loop before the real data starts. The 8 bits after the preamble — the start frame delimiter (10101011) — are the receiver's "now you can start reading" signal.
Bandwidth, noise, and the two famous ceilings
Two pieces of math govern how much data you can push through a physical channel. They sound abstract; the engineering implications are very concrete.
Nyquist's bound (1928) says that a channel of bandwidth B hertz can carry at most 2B distinct symbol changes per second. So a 100 MHz copper channel cannot transmit more than 200 megabaud, no matter how clever your modulation. Bandwidth is a hard wall.
Shannon's capacity theorem (1948) says that the maximum bit rate over a channel of bandwidth B with signal-to-noise ratio S/N is:
C = B × log₂(1 + S/N)
The two combined tell you what's possible. Take a noisy 4 kHz analog telephone line with S/N = 100 (20 dB):
C = 4000 × log₂(1 + 100) ≈ 4000 × 6.66 ≈ 26.6 kbps
Real V.34 modems hit ~33.6 kbps, and V.90/V.92 reached 56 kbps downstream by exploiting a digital path all the way to the telco's end of the analog loop. They were squeezing the absolute last drop out of Shannon's bound, and that's exactly the era when home internet stalled at "dial-up speed" — there was no more bound to push.
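The capacity arithmetic is worth being able to reproduce on demand. A minimal sketch (note that S/N in the formula is a linear power ratio, not decibels; the 20 MHz radio channel is just an illustrative comparison):

import math

def shannon_capacity_bps(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon limit: C = B * log2(1 + S/N), with S/N as a linear power ratio."""
    return bandwidth_hz * math.log2(1 + snr_linear)

def db_to_linear(db: float) -> float:
    return 10 ** (db / 10)

# The 4 kHz telephone line with 20 dB SNR from the worked example above:
print(shannon_capacity_bps(4000, db_to_linear(20)) / 1e3)   # ~26.6 kbps
# A 20 MHz radio channel at 25 dB SNR, for comparison:
print(shannon_capacity_bps(20e6, db_to_linear(25)) / 1e6)   # ~166 Mbps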
Three operational consequences:
- Bandwidth costs money. Doubling a wireless link's bandwidth requires doubling the spectrum, which is regulated and expensive. Most upgrades come from improving SNR (better antennas, lower-noise amplifiers, error correction) or from packing more bits per symbol.
- You cannot beat Shannon by being clever. You can only get within some margin of it. Modern coding (LDPC, polar codes) gets the gap down to a fraction of a dB. Past that, the only lever is widening the channel.
- Noise is a constant tax. You don't reduce noise; you accept it and design around it. The interesting variable is the channel coding's overhead — how many redundant bits you spend per useful bit so that the receiver can reconstruct the truth despite errors.
A useful intuition: if a vendor advertises a single-link rate that is wildly above what Nyquist + reasonable SNR allows, they're either using more channels than they're admitting (MIMO, multiple fibers, multiple frequencies) or lying.
Error detection lives here before reliability lives above
The link will lose bits. Cosmic rays, thermal noise, a dirty fiber connector, a stray RF source — the bit error rate of any real channel is non-zero. A typical short copper Ethernet link has a bit error rate around 10⁻¹², which sounds vanishingly small until you do the math on a 10 Gbps link: that's one bit error roughly every 100 seconds.
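That "one error every 100 seconds" figure is plain arithmetic, and it's useful to be able to redo it for any link (a hypothetical helper; the radio-link BER is illustrative):

def seconds_between_bit_errors(ber: float, link_rate_bps: float) -> float:
    """Expected time between bit errors = 1 / (BER * bits transmitted per second)."""
    return 1.0 / (ber * link_rate_bps)

print(seconds_between_bit_errors(1e-12, 10e9))    # 100.0 s on a 10 Gbps link
print(seconds_between_bit_errors(1e-12, 100e9))   # 10.0 s on a 100 Gbps link
print(seconds_between_bit_errors(1e-8, 54e6))     # ~1.85 s on a noisy radio link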
If the application stack has to retransmit on every error, you'd burn most of your throughput on retries. So the data-link layer adds detection at the bottom: a value computed over the bits as they're transmitted, included with the frame, recomputed on receive. If the recomputation matches, the frame is accepted; otherwise it's discarded.
The two main techniques you'll see:
- Parity is the cheapest. One bit per byte, set so the total number of 1s is even (or odd, by convention). Detects any single-bit error per byte. Misses every two-bit error. Useful in memory and very short links; rarely sufficient by itself for networks.
- CRC (cyclic redundancy check) is the workhorse of every link layer in modern use. The transmitter treats the frame as a polynomial over GF(2), divides by a fixed generator polynomial, and appends the remainder. The receiver does the same division on the received bits and checks that the remainder is zero. CRC-32 (Ethernet's choice) detects all burst errors up to 32 bits long and statistically catches longer ones with very high probability. It costs 4 bytes per frame.
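You don't need to implement the polynomial division to watch detection work. Python's zlib exposes a CRC-32 built on the same generator polynomial Ethernet uses, which is enough for a sketch (a NIC does this in hardware, with its own bit-ordering details):

import zlib

frame = bytes(range(64))                       # stand-in for a 64-byte frame
fcs = zlib.crc32(frame)                        # transmitter computes and appends this

print(zlib.crc32(frame) == fcs)                # True  -- clean frame, accepted

corrupted = bytearray(frame)
corrupted[31] ^= 0x04                          # flip one bit in the middle
print(zlib.crc32(bytes(corrupted)) == fcs)     # False -- frame discarded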
Detection is cheaper than correction for general-purpose networks because most frames arrive uncorrupted. Correction (forward error correction, FEC) makes sense when retransmission is expensive — long-distance fiber, satellite links, optical disks, hard drives. There, the receiver pays per-frame computational cost up front to fix errors without round-trip retransmission. For an Ethernet LAN with 0.1 ms RTT, retransmission via TCP is just cheaper.
The split between detection at the link layer and reliability at the transport layer is one of the cleanest examples of the end-to-end principle: put expensive guarantees only where they're needed, and only at the layer where they can be done correctly. Detection at the wire is necessary because the wire is the only place you can tell that a specific frame was corrupted. Reliability at TCP is necessary because the wire can't tell you that an entire frame was lost between two routers — only the endpoints can.
Duplex, collision domains, and historical Ethernet compromises
Until the mid-1990s, Ethernet ran on a shared coaxial cable that every host on the LAN tapped into. There was exactly one shared signal path carrying traffic in both directions, and exactly one host could transmit at a time without garbling everyone else's reception. This is half-duplex operation: send or receive, not both, and certainly not multiple senders at once.
The mediating protocol was CSMA/CD (Carrier Sense Multiple Access with Collision Detection):
- Each host listens to the medium. If it's quiet, the host transmits.
- If two hosts transmit at nearly the same instant — neither yet hearing the other's signal — they will both detect the resulting garbled waveform as a collision, abort transmission, send a jam signal, and back off for a randomly chosen amount of time before retrying.
This works, sort of, on a small LAN. As the number of hosts on a shared cable grows, collisions become more frequent and effective throughput plummets. Worse, the maximum cable length is bounded by the requirement that a sender can detect a collision before it stops transmitting, which means the round-trip propagation delay must be less than the minimum frame transmission time. That's why classic Ethernet has a minimum frame size of 64 bytes and a maximum collision domain length of about 2.5 km: the numbers are chosen so the math works out.
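You can check that the numbers work out with round figures (real 10 Mbps Ethernet budgets extra time for repeaters and the jam signal, so treat this as a lower bound rather than the standard's derivation):

def min_frame_bits(link_rate_bps: float, domain_length_m: float,
                   propagation_m_per_s: float = 2e8) -> float:
    """A sender must still be transmitting when a collision from the far end
    gets back to it, so the frame must outlast the round-trip propagation time."""
    round_trip_s = 2 * domain_length_m / propagation_m_per_s
    return link_rate_bps * round_trip_s

# 10 Mbps over a 2,500 m collision domain:
print(min_frame_bits(10e6, 2500) / 8)   # 31.25 bytes of pure propagation budget
# The standard's 64-byte minimum covers this plus repeater and processing delays.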
Switched, full-duplex Ethernet — what you actually use today — eliminates collisions in the common case. Each host has its own dedicated link to the switch, with separate physical pairs for transmit and receive, so it can send and receive simultaneously and never has to compete for the medium. CSMA/CD is still in the standards for compatibility but is essentially never invoked on modern networks.
The 64-byte minimum frame size, however, is still in the standards. If you've ever wondered why short Ethernet frames are padded with zeros to reach 64 bytes, that's the historical reason: the protocol needs a minimum frame size to make the original collision-detection math work. Switched Ethernet inherited the constraint and ships it forward in every spec from 100BASE-TX onward.
This is a common pattern. Decisions made for one set of physical-layer constraints freeze into upper-layer formats, and they stay there for decades after the original constraints are gone, because the cost of changing the format is much higher than the cost of carrying a few legacy bytes.
Propagation delay, serialization delay, and why "latency" is not one number
When a packet travels across a network, four distinct delays contribute to its end-to-end latency:
- Propagation delay — the time the leading edge of the signal takes to travel through the medium. Bounded by the speed of light in that medium: roughly 200,000 km/s in fiber and copper (about 2/3 of vacuum c). On a 1,000 km link, propagation alone is 5 ms one-way.
- Serialization delay — the time to clock the entire frame onto the wire at the link's bit rate. Equal to frame_size_in_bits / link_rate_in_bps. This depends only on the frame and the link speed, not the path length.
- Queueing delay — time the packet spends waiting in router or switch buffers because something else is using the outgoing interface. This is the most variable component and the one congestion control is trying to manage.
- Processing delay — time inside each device to look up the destination, decrement TTL, recompute checksums. On modern hardware this is on the order of microseconds.
Serialization delay matters more on slow links than it does on fast ones. A 1500-byte frame is 12,000 bits. On a 10 Mbps link that takes 1.2 ms; on a 1 Gbps link, 12 µs; on a 10 Gbps link, 1.2 µs. On a 56 kbps modem, the same frame took 215 ms to serialize — a noticeable chunk of human-perceptible delay just to get the packet onto the wire.
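Putting propagation and serialization together for a single link makes the crossover visible (queueing and processing depend on load, so they're left as inputs you would have to measure):

def one_way_link_latency_s(distance_km: float, link_rate_bps: float,
                           frame_bytes: int = 1500,
                           queueing_s: float = 0.0,
                           processing_s: float = 0.0) -> float:
    """Propagation at ~200,000 km/s plus serialization, plus whatever you measured."""
    propagation = distance_km / 200_000
    serialization = frame_bytes * 8 / link_rate_bps
    return propagation + serialization + queueing_s + processing_s

# 1,000 km of fiber at 1 Gbps: propagation (5 ms) dwarfs serialization (12 µs).
print(one_way_link_latency_s(1000, 1e9) * 1e3)   # ~5.012 ms
# 1 km at 10 Mbps: serialization (1.2 ms) dwarfs propagation (5 µs).
print(one_way_link_latency_s(1, 10e6) * 1e3)     # ~1.205 ms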
This is why "fat pipe, low latency" is an actual thing rather than a tautology. A 100 Mbps cross-country fiber link has 10× the propagation delay of a 1 Gbps short-haul link, but its per-packet serialization is 10× higher too. The latency you experience for a small interactive packet is dominated by propagation; the latency for a large bulk packet is dominated by serialization.
The split also explains why bandwidth-delay product is the key sizing parameter for buffers. A connection on a 100 ms RTT × 1 Gbps path has 12.5 MB "in flight" at steady state. The receive window has to be at least that large or the connection idles waiting for ACKs. We saw this constraint already in Module 1.7's discussion of TCP window scaling; this is where it comes from.
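The 12.5 MB figure is one multiplication, but it's the multiplication that sizes every window and buffer discussion later, so here it is as a sketch:

def bandwidth_delay_product_bytes(link_rate_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep the path full for one round trip."""
    return link_rate_bps * rtt_s / 8

print(bandwidth_delay_product_bytes(1e9, 0.100) / 1e6)   # 12.5 MB: 1 Gbps x 100 ms RTT
print(bandwidth_delay_product_bytes(10e9, 0.001) / 1e6)  # 1.25 MB: 10 Gbps x 1 ms RTT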
Media types and their engineering tradeoffs
The three families of physical media for networking each make different tradeoffs:
Twisted pair copper. Cheap, easy to terminate, decent noise rejection from the twisting itself, distance-limited. CAT-6A handles 10 Gbps over 100 m. Beyond that, signal attenuation and crosstalk between adjacent pairs in the same cable become unmanageable. Twisted pair is what almost every desk-to-switch and switch-to-server link in a building actually is.
Optical fiber. Higher bandwidth-distance product than anything copper can achieve, lower attenuation per meter, immune to electromagnetic interference. Single-mode fiber routinely runs tens of kilometers per span without amplification, and thousands of kilometers with amplifiers. The cost is operational: connectors must be polished and clean, splicing requires fusion equipment, and dust kills throughput. Fiber is what every backbone link, every metro-area link, and an increasing fraction of building-scale links actually is.
Radio. Maximum operational flexibility — no cable to pull. Massive engineering complexity at the protocol level, because the medium is shared, the propagation environment changes every time someone moves, and regulatory bodies care a great deal about which frequencies you transmit on at what power. Wi-Fi 6/7 squeezes amazing throughput out of 5 GHz and 6 GHz bands. Cellular networks juggle hundreds of users in a single cell. The price is that most of the protocol stack is rewriting itself constantly to deal with multipath fading, collision avoidance, and handoff.
A useful heuristic for which medium to use: copper if the run is short and indoor, fiber if it's long or runs through electrical-noise hell, radio if you can't lay a cable. Most real networks combine all three.
From waveform to frame boundary
The receiving NIC's job, after clock recovery and symbol decoding, is to find frame boundaries in the bit stream. Without framing, the bits coming off the wire are an undifferentiated sequence; it's the framing that lets the data-link layer hand discrete units up to the network layer.
The Ethernet preamble we mentioned earlier is one approach: send a known synchronization pattern before each frame, then signal "frame starts now" with a unique start-frame delimiter. The receiver locks its clock on the preamble, watches for the SFD, and reads the frame contents that follow.
Older serial protocols use a different approach. PPP/HDLC framing delimits frames with a special flag byte (0x7E, 01111110). Any occurrence of that exact byte inside the frame's payload is escaped via byte stuffing or bit stuffing: the transmitter inserts an extra bit or escape sequence to break up the pattern, and the receiver removes it on the way back up. The result is that the flag byte appears only at frame boundaries.
Flag-based framing is good for variable-length frames over a continuous bitstream where there's no preamble. It costs one byte per frame plus statistically small stuffing overhead. It's why dial-up PPP and HDLC-based serial links can operate over media that don't lend themselves to a precise per-frame preamble.
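A minimal byte-stuffing sketch in the spirit of PPP's HDLC-like framing (RFC 1662 uses flag 0x7E and escape 0x7D with an XOR of 0x20; this follows that convention but skips the FCS and the async control-character map):

FLAG, ESC, XOR_MASK = 0x7E, 0x7D, 0x20

def stuff(payload: bytes) -> bytes:
    """Escape flag/escape bytes so the flag value only appears at frame edges."""
    out = bytearray([FLAG])
    for b in payload:
        if b in (FLAG, ESC):
            out += bytes([ESC, b ^ XOR_MASK])
        else:
            out.append(b)
    out.append(FLAG)
    return bytes(out)

def unstuff(frame: bytes) -> bytes:
    """Reverse the escaping; assumes one well-formed frame between two flags."""
    body, out, i = frame[1:-1], bytearray(), 0
    while i < len(body):
        if body[i] == ESC:
            out.append(body[i + 1] ^ XOR_MASK)
            i += 2
        else:
            out.append(body[i])
            i += 1
    return bytes(out)

payload = bytes([0x01, 0x7E, 0x02, 0x7D, 0x03])   # contains both special values
framed = stuff(payload)
print(framed.hex())                               # 7e017d5e027d5d037e
print(unstuff(framed) == payload)                 # True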
The general lesson is that physical-layer encoding decisions shape framing decisions one level up. Whether you use a preamble or a flag byte; how big the minimum frame size has to be; whether the receiver knows when a frame ends because it's been told the length or because it sees a stop pattern — all of those choices ripple up from how the wire actually works.
The next module picks up from here. With clock recovery and frame boundaries solved, we're ready to look at what the data-link layer does with framed bytes: addressing, switching, and the journey from a single physical link to a building-scale LAN.
Hands-on exercise
Exercise 1 — Encode the same bitstring three ways
Save the following as encodings.py and run it.
"""Print waveform transitions for NRZ, NRZI, and Manchester encoding."""
INPUT = "11011010001110010110010010100110" # 32-bit demo input
def nrz(bits: str) -> str:
    # NRZ: high for 1, low for 0, held for the whole bit period
    return "".join("H" if b == "1" else "L" for b in bits)

def nrzi(bits: str) -> str:
    # NRZI: 1 = transition, 0 = no transition
    out = []
    state = "L"
    for b in bits:
        if b == "1":
            state = "H" if state == "L" else "L"
        out.append(state)
    return "".join(out)

def manchester(bits: str) -> str:
    # Manchester: 1 = low->high in mid-bit; 0 = high->low in mid-bit
    return "".join("LH" if b == "1" else "HL" for b in bits)
print(f"input: {INPUT}")
print(f"NRZ: {nrz(INPUT)}")
print(f"NRZI: {nrzi(INPUT)}")
print(f"Manchester: {manchester(INPUT)}")
Run it and look at the outputs. Notice that the longest unbroken stretch in the NRZ output is the longest run of identical bits in the input (here, three 0s gives LLL). NRZI breaks up runs of 1s but leaves runs of 0s flat. Manchester always alternates within each bit — every position has a transition.
Stretch task: modify the script to inject one symbol error every 100 symbols (flip a character). Then write a tiny "receiver" that re-derives the bits, and measure how each encoding fares. Manchester recovers transitions even from corrupted symbols; NRZ doesn't.
Exercise 2 — Measure serialization delay
"""Compute serialization delay for various frame sizes and link rates."""
FRAME_SIZES_BYTES = [64, 512, 1500, 9000] # min Ethernet, mid, max Ethernet, jumbo
LINK_RATES_BPS = [10e6, 100e6, 1e9, 10e9, 100e9]
def serialization_delay(frame_bytes: int, link_bps: float) -> float:
    """Return seconds to clock the frame onto the wire."""
    return (frame_bytes * 8) / link_bps

for size in FRAME_SIZES_BYTES:
    print(f"frame: {size} bytes")
    for rate in LINK_RATES_BPS:
        seconds = serialization_delay(size, rate)
        print(f"  {rate/1e9:>7.2f} Gbps: {seconds*1e6:>10.3f} µs")
Run it. Notice the columns: at 10 Gbps, even a jumbo frame serializes in 7.2 µs. At 10 Mbps, a max-size Ethernet frame takes 1.2 ms to clock out — long enough to dominate end-to-end latency on a low-bandwidth interactive session. Now imagine a 33.6 kbps modem: a 1500-byte frame is 357 ms, which is why dial-up internet "felt slow" in a way that a 1 Mbps DSL link did not, even though the propagation delays were similar.
Common misconceptions
"Bits per second and baud are the same." Only when one symbol carries one bit. Modern links almost universally carry multiple bits per symbol (PAM-4 carries 2, 16-QAM carries 4, 64-QAM carries 6), and the symbol rate is much lower than the bit rate. Confusing the two leads to wrong predictions about bandwidth requirements.
"The physical layer just makes the wire faster." The physical layer also constrains error rates, shapes framing decisions one layer up, sets practical MTU behavior, determines duplex options, and enforces minimum frame sizes that long outlive the original justification. Treating the physical layer as a black-box "speed knob" misses every one of those effects.
"Manchester is strictly better because it is self-clocking." Self-clocking is great. The cost is doubling the signal bandwidth used per bit. At 10 Mbps over CAT-3 copper, that was a fine trade. At 10 Gbps over CAT-6A, doubling bandwidth is impossible — that's why higher-speed Ethernet uses denser and trickier encodings instead.
"Latency is just propagation time." Propagation, serialization, queueing, and processing all contribute. On long links, propagation dominates. On slow links, serialization dominates. On busy links, queueing dominates. On overloaded middleboxes, processing dominates. Calling any one of these "the latency" oversimplifies in a way that breaks down the moment you try to optimize.
"Error correction always beats retransmission." Forward error correction makes sense when retransmission is expensive — long-haul fiber, satellite, optical disks. On a low-RTT general-purpose IP network, simple detection plus higher-layer retransmission is cheaper, simpler, and more flexible. Sticking FEC where it isn't needed wastes throughput on permanent overhead.
Further reading
- Claude Shannon, A Mathematical Theory of Communication, 1948. The original capacity argument. Read it once for the historical perspective; the engineering intuition you'll keep is C = B × log₂(1 + S/N).
- Larry Peterson and Bruce Davie, Computer Networks: A Systems Approach, systemsapproach.org. The "Encoding" and "Links" chapters are the cleanest systems-level explanation of framing, encoding, and link-layer tradeoffs in print or online.
- RFC 1661 — The Point-to-Point Protocol (PPP) and RFC 1662 — PPP in HDLC-like Framing. A small, complete example of how framing, control, and per-link configuration get layered onto a generic point-to-point medium. Worth reading for the engineering choices, not just the mechanics.
- Robert Metcalfe and David Boggs, Ethernet: Distributed Packet Switching for Local Computer Networks, CACM 1976. The original Ethernet paper. Frames the design decisions that, half a century later, are still embedded in every LAN you'll ever debug.
The next module — Ethernet and MAC addressing — picks up where this one leaves off. Now that you know how a bit becomes a symbol becomes a waveform and back, we'll look at what happens once a frame's bits arrive at a switch.
How a router actually forwards a packet: longest-prefix match, FIB lookup, adjacency resolution, TTL/Hop Limit, fragmentation, ICMP feedback, and the data/control/management plane split.