system-design.io

OSI Model and TCP/IP Fundamentals

A Comprehensive Guide for Software Engineering Interviews

Part 1: The Real World Already Has Layers

Before we talk about networking models, let’s talk about something we’ve all experienced: sending a package overseas.

Imagine you’re in Milwaukee, Wisconsin, and you want to ship a birthday gift to a friend in Tokyo, Japan. You don’t think about cargo plane fuel logistics or international customs law — you just walk into a UPS store and hand over the package. The shipping company handles everything in between.

But behind the scenes, something remarkable is happening. Your package moves through a chain of specialized handlers, each responsible for exactly one job:

You write a note, wrap the gift, and seal the box (Data preparation)
UPS at the counter labels it, weighs it, and assigns a tracking number (Addressing)
The local UPS facility routes it to the right regional hub (Routing)
The cargo airline physically moves the package across the ocean (Physical transport)
Japanese customs inspects and clears the package (Protocol enforcement)
Local courier in Tokyo delivers it to your friend’s door (Final delivery)

Your friend unwraps it in reverse order — first accepting it from the courier, then removing packaging layers, then finally reading your note.

This is the essence of layered network communication. Each layer is a specialist. Each layer only talks to the layer directly above or below it. The sender wraps the data going down through the layers. The receiver unwraps the data going up through the layers. The term for this wrapping process is encapsulation, and we’ll come back to it repeatedly.

This mental model is everything. Once you internalize this package-shipping analogy, the OSI model stops being a memorization exercise and starts being obvious.

Part 2: The Problem That Created the Layers

The Internet Before Standards Was Chaos

In the early 1970s, different computers could barely talk to each other. IBM systems used one protocol, DEC systems used another, and connecting them required custom software that had to be rewritten every time a new vendor entered the market. If you upgraded your hardware, your networking code broke. This was an era of vendor lock-in so severe it was almost comical.

The fundamental pain point was this: networking concerns were tangled together. Code that handled physical signal transmission was mixed with code that handled message formatting, error checking, and application logic. Changing one thing broke everything else.

The engineers at the time recognized a crucial insight: these are different problems. Physical transmission is a completely different concern from error correction, which is completely different from routing, which is completely different from application protocol design. They needed separation of concerns — a principle software engineers still preach today.

The fix was the OSI Model — a conceptual framework proposed by the International Organization for Standardization (ISO) in 1984. The goal wasn’t to be a practical implementation. It was to be a reference model — a shared vocabulary and a clean separation of responsibilities that would let different vendors build interoperable systems.

Think of the OSI Model like a building blueprint. The blueprint doesn’t build the house, but every construction worker uses it as the shared reference for what each team is responsible for.

Part 3: The OSI Model — Seven Layers of Responsibility

The OSI Model divides network communication into seven distinct layers. Each layer provides services to the layer above it and relies on services from the layer below it. No layer skips its neighbor.

Let’s build this up from the bottom — from raw physics to human-facing applications.

+-----------------------------------------------------------+
|                    APPLICATION (Layer 7)                  |
|          HTTP, FTP, SMTP, DNS, SSH                        |
+-----------------------------------------------------------+
|                   PRESENTATION (Layer 6)                  |
|          SSL/TLS, JPEG, ASCII, Encryption                 |
+-----------------------------------------------------------+
|                     SESSION (Layer 5)                     |
|          Session Management, NetBIOS, RPC                 |
+-----------------------------------------------------------+
|                    TRANSPORT (Layer 4)                    |
|          TCP, UDP  —  Port Numbers, Reliability           |
+-----------------------------------------------------------+
|                     NETWORK (Layer 3)                     |
|          IP, ICMP, Routers  —  Logical Addressing         |
+-----------------------------------------------------------+
|                   DATA LINK (Layer 2)                     |
|          Ethernet, MAC Addresses, Switches                |
+-----------------------------------------------------------+
|                    PHYSICAL (Layer 1)                     |
|          Cables, Radio Waves, Bits, Electrical Signals    |
+-----------------------------------------------------------+

     Data flows DOWN on the sender side →
     Data flows UP on the receiver side ←

In the diagram above, notice the layer numbers increase from bottom to top. When you see interview questions reference “Layer 3 routing” or “Layer 4 protocols,” they’re referring to specific rows in this stack. The bottom three layers (1–3) deal with how to move data. The top three layers (5–7) deal with how to present data. Layer 4 is the critical bridge between the two worlds.

Layer 1: Physical — The Raw Signal

This layer is concerned with raw bits traveling over a physical medium. It has absolutely no concept of meaning, structure, or error correction. Its only job is: how do we represent a 0 and a 1 on this wire, fiber, or radio wave?

Physical layer technologies include:

Copper cables (Ethernet) — voltage changes represent bits
Fiber optic cables — light pulses represent bits
Wi-Fi (802.11) — radio frequency signals represent bits
Bluetooth, USB, DSL — each with their own physical encoding

The physical layer defines things like:

Voltage levels (what counts as a 1 vs. a 0)
Pin layouts on connectors
Cable specifications (Cat5e, Cat6, fiber types)
Transmission speeds in bits per second

A useful way to think of it: Layer 1 is the road (or the ocean). It doesn’t care what cargo travels on it.

Layer 2: Data Link — From Bits to Frames

Now we hit our first major “fix.” The Physical layer can transmit bits, but it has no idea where a signal starts or ends, and it has no mechanism to detect if a bit got corrupted in transit.

The Data Link layer solves this with a concept called a frame — a structured package of bits with a defined start, an end, and a checksum for error detection. It also introduces MAC (Media Access Control) addresses — unique hardware identifiers burned into every network interface card (NIC) at the factory.

  A Single Ethernet Frame:

  +----------+----------+------+--------+--------+----------+
  | Preamble | Dest MAC | Src  |  Type  |  Data  |   FCS    |
  |  7 bytes |  6 bytes | MAC  | 2 bytes|(varies)| 4 bytes  |
  |          |          |6 bytes|       |        |(Checksum)|
  +----------+----------+------+--------+--------+----------+

  FCS = Frame Check Sequence (error detection)
  MAC = 48-bit hardware address like  00:1A:2B:3C:4D:5E

In this diagram, the FCS (Frame Check Sequence) is a CRC checksum calculated from the frame contents. If the receiver recalculates the checksum and it doesn’t match, the frame is discarded. Notice that MAC addresses here are 6-byte (48-bit) values — these are burned into hardware, not assigned by software.

The Data Link layer is also responsible for media access control — deciding who gets to transmit on a shared medium at any given moment. This is why it’s split into two sub-layers: the MAC sub-layer (hardware addressing and access control) and the LLC sub-layer (logical link control, flow and error management).

Key Data Link layer devices and protocols:

Ethernet (IEEE 802.3) — the dominant wired LAN standard
Wi-Fi (IEEE 802.11) — the dominant wireless LAN standard
Switches — Layer 2 devices that forward frames based on MAC addresses
ARP (Address Resolution Protocol) — maps IP addresses to MAC addresses (technically lives at the boundary of Layer 2/3)

Layer 3: Network — Routing Across the Globe

Here’s the problem Layer 2 can’t solve: MAC addresses are local. They work within a single network segment, but a MAC address in Milwaukee has no meaning to a router in Tokyo. We need a globally routable addressing scheme that can scale to billions of devices.

Enter IP addresses and the Network layer.

The Network layer introduces logical addressing — IP addresses that can be hierarchically organized and routed across the entire internet. It’s also responsible for path selection — figuring out the best route from the source network to the destination network.

           Milwaukee              Chicago              Tokyo
          +--------+            +--------+           +--------+
          |  Your  |            |        |           | Your   |
          |  PC    +---Router---+ Router +---Router--+ Friend |
          |        |            |        |           |        |
          +--------+            +--------+           +--------+
          192.168.1.5         10.0.0.1             203.0.113.42

          Layer 3 figures out: Milwaukee → Chicago → Tokyo
          Layer 2 handles each individual hop: Milwaukee → Chicago

In this diagram, notice how the path is a sequence of hops. Each router knows about nearby networks and uses routing tables to forward packets toward the destination. The IP address stays the same end-to-end, but the MAC address changes at every hop.

Key concepts at Layer 3:

IP (Internet Protocol) — IPv4 (32-bit) and IPv6 (128-bit) addressing
Packets — the unit of data at this layer (frames become packets)
Routers — Layer 3 devices that forward packets based on IP addresses
ICMP — Internet Control Message Protocol (ping lives here)
Subnetting — dividing IP ranges into smaller networks
TTL (Time To Live) — a counter that decrements at each hop, preventing packets from looping forever

The Network layer does best-effort delivery. It will try to get your packet there, but it makes no promises. Packets can be lost, reordered, or duplicated. Fixing that is someone else’s job — and that “someone else” is Layer 4.

Layer 4: Transport — Reliability and Process Identification

This layer is arguably the most important for software engineers to deeply understand. It solves two critical problems that Layers 1–3 leave completely unaddressed:

Problem 1: Which application should receive this data? A single computer might be running a web browser, an email client, and a streaming app simultaneously — all on the same IP address. How does the OS know which application gets which incoming packet?

Problem 2: How do we guarantee delivery? The Network layer is “best-effort.” How do we build a reliable, ordered communication channel on top of an unreliable substrate?

The Layer 4 solution to Problem 1 is port numbers — 16-bit numbers (0–65535) that identify specific processes on a machine. An IP address gets you to the right house; a port number gets you to the right apartment.

  Incoming packet arrives at IP: 192.168.1.5

  +----------------------+
  |   IP: 192.168.1.5    |
  +----------+-----------+
             |
     +--------+--------+
     |                 |
  Port 80           Port 443         Port 25         Port 22
  (HTTP)            (HTTPS)          (SMTP)          (SSH)
     |                 |
  +--+---+          +--+---+
  | Web  |          | Web  |
  |Browser|         |Server|
  +------+          +------+

In this diagram, the OS acts like a building receptionist — it looks at the destination port number and delivers the data to the right “apartment” (process). Well-known ports (0–1023) are reserved for standard services.

The Layer 4 solution to Problem 2 is the choice between two protocols: TCP (guaranteed delivery, ordered) and UDP (fast, but unreliable). We’ll explore these in depth shortly.

Layers 5, 6, and 7: Session, Presentation, and Application

As we climb to the top three layers, the separation becomes less crisp in practice. In the real-world TCP/IP implementation, these three layers are largely merged into the Application layer. But understanding them conceptually is still valuable.

Layer 5 — Session: Manages the lifecycle of a communication session. This layer establishes, maintains, and terminates connections between two applications. Think of it as managing the difference between “we’re mid-conversation” and “this conversation is over.” RPC (Remote Procedure Call) and NetBIOS operate here.

Layer 6 — Presentation: Handles data format translation — the “lingua franca” problem. If one machine stores text as ASCII and another as EBCDIC, someone needs to translate. SSL/TLS encryption lives conceptually at this layer (though in practice it’s implemented in the Application layer). JPEG, MPEG, and other encoding formats are also Presentation layer concerns.

Layer 7 — Application: This is where user-facing protocols live. HTTP, HTTPS, FTP, SMTP (email), DNS, SSH — all of these are Application layer protocols. This layer is what software engineers interact with most directly every day.

  A developer's daily interaction with the OSI stack:

  You write this:
  +-------------------------------------------------+
  |  fetch("https://api.example.com/data")          |   Layer 7 (HTTP)
  +-------------------------------------------------+
           |
           v  The browser and OS handle everything below this line
  +-------------------------------------------------+
  |  TLS Handshake, Certificate Verification       |   Layer 6 (Presentation)
  +-------------------------------------------------+
  |  TCP Connection State Management               |   Layer 5 (Session)
  +-------------------------------------------------+
  |  TCP Segments, Port 443                        |   Layer 4 (Transport)
  +-------------------------------------------------+
  |  IP Routing to api.example.com (93.184.x.x)   |   Layer 3 (Network)
  +-------------------------------------------------+
  |  Ethernet Frame to default gateway MAC         |   Layer 2 (Data Link)
  +-------------------------------------------------+
  |  Electrical/radio signal on your Wi-Fi card    |   Layer 1 (Physical)
  +-------------------------------------------------+

Notice in this diagram that as a developer you only consciously interact with Layer 7 — the HTTP fetch call. But this single line of code triggers a cascade through all seven layers below it. Each layer adds its own header before passing the data down.

Part 4: Encapsulation and Decapsulation — The Wrapping Paper Metaphor

Here’s where our gift-wrapping analogy comes alive technically.

As data travels down the OSI stack on the sender’s side, each layer adds its own header (and sometimes a trailer) to the data it receives. This is called encapsulation.

  SENDER SIDE (data flows down, headers added):

  Layer 7:  [ Application Data (HTTP request)           ]
                            |
                            v  Layer 4 adds TCP header
  Layer 4:  [ TCP Header | Application Data             ]  ← "Segment"
                            |
                            v  Layer 3 adds IP header
  Layer 3:  [ IP Header | TCP Header | Application Data ]  ← "Packet"
                            |
                            v  Layer 2 adds Ethernet header + trailer
  Layer 2:  [ ETH Hdr | IP Header | TCP Header | Data | ETH Trailer ] ← "Frame"
                            |
                            v  Layer 1 transmits as bits
  Layer 1:  10110010101110001010... (raw bits on the wire)


  RECEIVER SIDE (data flows up, headers stripped):

  Layer 1 → receives bits
  Layer 2 → strips Ethernet header, passes packet up
  Layer 3 → strips IP header, passes segment up
  Layer 4 → strips TCP header, passes data up
  Layer 7 → receives the original HTTP request ✓

The key insight here is that each layer on the receiver side “peels” exactly one layer of wrapping. Layer 2 on the receiver doesn’t know or care what’s inside the IP packet — it just verifies the frame checksum and strips the Ethernet header. This isolation is what makes the layered model so powerful.

The unit of data at each layer has a specific name:

Layer	Unit Name
Layer 7–5	Message / Data
Layer 4	Segment (TCP) or Datagram (UDP)
Layer 3	Packet
Layer 2	Frame
Layer 1	Bits

Part 5: The TCP/IP Model — The Practical Alternative

The OSI model is the theoretical standard. The TCP/IP model is the actual internet.

Developed by DARPA in the 1970s as part of the ARPANET project (the precursor to the modern internet), the TCP/IP model predates the OSI model. When OSI was published in 1984, TCP/IP was already deployed and running. The internet ran on TCP/IP, and the internet won.

The TCP/IP model is a 4-layer model that collapses the OSI layers into a more pragmatic grouping:

       OSI Model (7 layers)        TCP/IP Model (4 layers)
  +---------------------------+  +---------------------------+
  |  Layer 7: Application    |  |                           |
  +---------------------------+  |   Application Layer       |
  |  Layer 6: Presentation   |  |   (HTTP, FTP, DNS, SMTP)  |
  +---------------------------+  |                           |
  |  Layer 5: Session        |  +---------------------------+
  +---------------------------+  |                           |
  |  Layer 4: Transport      |  |   Transport Layer         |
  +---------------------------+  |   (TCP, UDP)              |
  |  Layer 3: Network        |  +---------------------------+
  +---------------------------+  |   Internet Layer          |
  |  Layer 2: Data Link      |  |   (IP, ICMP, ARP)         |
  +---------------------------+  +---------------------------+
  |  Layer 1: Physical       |  |   Network Access Layer    |
  +---------------------------+  |   (Ethernet, Wi-Fi, etc.) |
                                 +---------------------------+

In this comparison, notice that the TCP/IP model merges OSI Layers 5, 6, and 7 into a single Application layer. It also merges Layers 1 and 2 into the Network Access layer. The Internet Layer maps directly to OSI Layer 3, and the Transport layer maps directly to OSI Layer 4.

The practical reason for this collapse: the top three OSI layers were theoretically clean but hard to implement as truly distinct modules. In practice, protocols like HTTP handle session management, data formatting, and application logic all in one. The TCP/IP model acknowledges this reality.

For interviews, you need to know both models, but you’ll spend most of your time explaining TCP/IP behavior.

Part 6: TCP — The Reliable Workhorse

The Transmission Control Protocol (TCP) is one of the most elegant pieces of engineering in computer science. Its job is to create the illusion of a reliable, ordered, bi-directional byte stream on top of an unreliable, packet-switched network.

Let’s break down exactly how it achieves this.

The Three-Way Handshake — Establishing a Connection

Before a single byte of application data is sent, TCP requires both sides to agree they’re ready to communicate. This is the three-way handshake (SYN → SYN-ACK → ACK).

  Client (Milwaukee)                    Server (Tokyo)
       |                                      |
       |  ---- SYN (seq=100) ------------->   |   "Hey, I want to talk.
       |                                      |    My starting sequence
       |                                      |    number is 100."
       |                                      |
       |  <--- SYN-ACK (seq=300, ack=101) --- |   "Got it! I'll start at
       |                                      |    300. I'm expecting your
       |                                      |    next byte to be 101."
       |                                      |
       |  ---- ACK (ack=301) ------------->   |   "Perfect. I'm expecting
       |                                      |    your next byte at 301."
       |                                      |
       |         CONNECTION ESTABLISHED       |
       |                                      |
       |  ====  Application Data Flows  ====  |

In this diagram, notice that each side generates its own Initial Sequence Number (ISN) — a random starting value. The randomness is a security feature that prevents attackers from predicting sequence numbers and injecting malicious packets. The ack number always means “I’ve received everything up to X-1, and I’m now expecting byte X.”

Why three steps and not two? Because both sides need to confirm they can send AND receive. A two-way handshake would only prove one side can send. Three steps prove both sides have bidirectional communication capability.

Reliability Through Sequence Numbers and ACKs

TCP numbers every byte it sends. The receiver sends back ACK (acknowledgment) messages confirming which bytes it has received. If the sender doesn’t receive an ACK within a timeout period, it retransmits the data.

  Reliable Delivery Example:

  Sender                          Receiver
    |                                 |
    |  --[Segment 1: bytes 1-100]--> |  ✓ Received
    |  --[Segment 2: bytes 101-200]->|  ✓ Received
    |  --[Segment 3: bytes 201-300]->|  ✗ LOST in network
    |  --[Segment 4: bytes 301-400]->|  ✓ Received (but buffered)
    |                                 |
    |  <----[ACK 201]--------------- |  "I have 1-200, waiting for 201"
    |                                 |
    |      timeout...                 |
    |                                 |
    |  --[Segment 3: bytes 201-300]->|  ✓ Re-received
    |                                 |
    |  <----[ACK 401]--------------- |  "I now have 1-400!"

Notice in this diagram that Segment 4 arrived out of order but the receiver buffered it. TCP reassembles segments into the correct order before passing data to the application layer. The application only ever sees a clean, ordered byte stream — it has no idea about the chaos happening at the network layer.

Flow Control — Don’t Overwhelm the Receiver

What if the sender is a superfast server and the receiver is a slow phone on 3G? The sender could flood the receiver’s buffer.

TCP solves this with the receive window (rwnd) — a field in every TCP header that tells the sender “I currently have this many bytes of buffer space available.” The sender will never transmit more unacknowledged data than the receiver’s window allows.

Congestion Control — Don’t Overwhelm the Network

Flow control protects the receiver. Congestion control protects the network itself.

TCP uses algorithms like AIMD (Additive Increase, Multiplicative Decrease) and the congestion window (cwnd) to probe for available bandwidth without causing network congestion. The key phases are:

Slow Start: Begin conservatively, doubling the congestion window each RTT
Congestion Avoidance: Once a threshold is reached, increase linearly
Fast Retransmit / Fast Recovery: On detecting loss, cut back aggressively and recover

The Four-Way Termination — Closing a Connection

Closing a TCP connection gracefully requires four steps because each side closes its half of the connection independently.

  Client                              Server
    |                                   |
    |  ---- FIN (I'm done sending) -->  |   Client closes its send half
    |                                   |
    |  <--- ACK ----------------------  |   Server acknowledges
    |                                   |
    |  <--- FIN (I'm done sending) ---  |   Server closes its send half
    |                                   |
    |  ---- ACK ----------------------> |   Client acknowledges
    |                                   |
    |         CONNECTION CLOSED         |

Notice that after the client sends its FIN, it enters a TIME_WAIT state and waits 2×MSL (Maximum Segment Lifetime, typically 60 seconds) before fully closing. This ensures any delayed packets from the old connection don’t corrupt a new connection on the same port.

Part 7: UDP — The Speed Demon

Now that we understand TCP’s elaborate machinery, we can appreciate why UDP exists: sometimes all of that overhead is the enemy.

UDP (User Datagram Protocol) strips the transport layer down to its bare minimum:

Send data to an IP:port
Add a short 8-byte header (source port, destination port, length, checksum)
That’s it. No handshake, no acknowledgment, no retransmission, no ordering

  TCP Header (20+ bytes):                  UDP Header (8 bytes):
  +--------+--------+                      +--------+--------+
  |  Src   |  Dest  |                      |  Src   |  Dest  |
  |  Port  |  Port  |                      |  Port  |  Port  |
  +--------+--------+                      +--------+--------+
  |    Sequence     |                      |  Length| Chksum |
  |     Number      |                      +--------+--------+
  +-----------------+                      |                  |
  |  Acknowledgment |                      |   That's it.     |
  |     Number      |                      |                  |
  +-----------------+                      +------------------+
  | Offset|Flags|Win|
  +-----------------+
  |Checksum|Urgent |
  +-----------------+
  | Options (var)   |
  +-----------------+

The TCP header can be 20–60 bytes. The UDP header is always exactly 8 bytes. For a DNS query that might be 50 bytes of data, TCP’s overhead is significant. UDP’s isn’t.

When We Choose UDP Over TCP

We can group UDP use cases by the nature of their tolerance for loss:

Loss-tolerant, latency-sensitive applications:

Real-time video/audio streaming (Zoom, YouTube Live) — a dropped frame is better than a frozen stream
Online gaming — a missed position update is better than noticeable lag
VoIP (Voice over IP) — a brief audio glitch is better than pausing mid-sentence

One-shot request-response applications:

DNS — a query is one packet; easier to retry than to maintain a TCP connection
DHCP — IP address assignment at boot time
SNMP — network monitoring queries

Applications that implement their own reliability:

QUIC (HTTP/3) — Google’s protocol that runs over UDP and implements its own reliability at the application layer, gaining more control over retransmission behavior
WebRTC — browser-to-browser real-time communication

Part 8: IP Addressing — The Postal System of the Internet

Understanding IP addressing is foundational to understanding routing, subnetting, NAT, and countless other networking concepts.

IPv4 — The Classic

An IPv4 address is a 32-bit number, conventionally written as four decimal numbers (octets) separated by dots. Each octet represents 8 bits (0–255).

  192   .   168   .   1   .   5
  11000000  10101000  00000001  00000101

This gives us a theoretical address space of 2³² = 4,294,967,296 addresses. Which sounds like a lot — until the internet grew to have billions of connected devices. IPv4 address exhaustion is real, and it’s why we have NAT and why IPv6 was invented.

Subnetting — Dividing the Address Space

IP addresses have two logical parts: the network prefix and the host identifier. The subnet mask (or CIDR notation) tells us where the boundary is.

  192.168.1.0/24  (CIDR notation)

  Address:  192.168.1.   5
            +-----------+ +-+
            Network part   Host part
            (24 bits)      (8 bits)

  Network:  192.168.1.0
  Broadcast: 192.168.1.255
  Usable hosts: 192.168.1.1 — 192.168.1.254 (254 hosts)

The /24 in CIDR notation means the first 24 bits are the network part. A /24 network has 8 bits for hosts, giving 2⁸ = 256 addresses, but 2 are reserved (network address and broadcast address), leaving 254 usable host addresses.

Special IP Ranges

Some IP ranges are reserved and never route on the public internet:

Range	Purpose
`10.0.0.0/8`	Private network (large orgs)
`172.16.0.0/12`	Private network (medium orgs)
`192.168.0.0/16`	Private network (home/small office)
`127.0.0.0/8`	Loopback (`127.0.0.1` is “localhost”)
`0.0.0.0/8`	“This network” — used in routing
`255.255.255.255`	Limited broadcast

NAT — The Address Space Extender

Network Address Translation (NAT) is what lets your entire home use one public IP address from your ISP while internally using private IP addresses.

  Your Home Network              |  Public Internet
                                 |
  192.168.1.5 (your PC)         |
  192.168.1.6 (your phone)  ----|--- 73.45.12.100 (your public IP)
  192.168.1.7 (your tablet)     |
  192.168.1.8 (your TV)         |
           ↓                    |
         Router                 |
      (NAT device)              |

Your router maintains a NAT table that maps (private IP + port) → (public IP + different port). When a packet returns from the internet, the router looks up the NAT table and forwards it to the correct internal device. From the internet’s perspective, all four devices look like one machine.

Part 9: DNS — The Internet’s Phone Book

No discussion of networking fundamentals is complete without DNS (Domain Name System). It’s the protocol that converts human-readable names like google.com into IP addresses like 142.250.80.46.

  DNS Resolution Process:

  Your Browser         Local DNS          Root        TLD (.com)    Authoritative
  (Milwaukee)          Resolver           Server       Server       Server
      |                    |                |              |             |
      |-- "google.com?" -->|                |              |             |
      |                    |-- "google?" -->|              |             |
      |                    |                |--"Try .com"->|             |
      |                    |                |  server list |             |
      |                    |<-- .com servers|              |             |
      |                    |                          "Try Google's NS"  |
      |                    |------------------------------------------>  |
      |                    |<-- 142.250.80.46 ---------------------------  |
      |<-- 142.250.80.46 --|                |              |             |
      |                    |                |              |             |
      |  connects to 142.250.80.46:443      |              |             |

This diagram shows the DNS resolution chain. The process is called “iterative resolution” when the resolver does the legwork itself. Notice the result is cached at each step — the next time anyone on your local network asks for google.com, the resolver answers immediately from its cache without going to the root servers.

Part 10: Practical Python — Seeing the Stack in Action

Theory is powerful, but there’s nothing like writing code to make it concrete. Let’s explore some practical Python networking to solidify these concepts.

Exploring TCP — The Socket API

The socket API is the programming interface that lets applications interact with Layer 4. When we create a TCP socket, we’re explicitly working with Layer 4 constructs.

import socket
import time

def tcp_client_demo():
    """
    Demonstrates the TCP three-way handshake lifecycle through Python.
    The socket.connect() call triggers SYN → SYN-ACK → ACK internally.
    """
    host = "example.com"
    port = 80  # HTTP port

    # AF_INET = IPv4 addressing (Layer 3)
    # SOCK_STREAM = TCP (Layer 4, reliable stream)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        
        print(f"[1] Resolving {host} via DNS...")
        ip = socket.gethostbyname(host)
        print(f"    DNS resolved {host} → {ip}")

        print(f"[2] Initiating TCP three-way handshake to {ip}:{port}...")
        start = time.time()
        sock.connect((ip, port))  # SYN → SYN-ACK → ACK happens here
        rtt = (time.time() - start) * 1000
        print(f"    Connection established! Round-trip time: {rtt:.2f}ms")

        # Send an HTTP/1.1 GET request (Application Layer 7)
        http_request = (
            f"GET / HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"Connection: close\r\n\r\n"
        )
        print(f"[3] Sending HTTP GET request (Layer 7 data)...")
        sock.sendall(http_request.encode())

        # Receive response
        response = b""
        while chunk := sock.recv(4096):
            response += chunk

        print(f"[4] Received {len(response)} bytes")
        print(f"    First line: {response.split(b'\\n')[0].decode().strip()}")

        print(f"[5] Context manager closes socket (TCP FIN → ACK → FIN → ACK)")

tcp_client_demo()

Exploring UDP — Fire and Forget

import socket

def udp_dns_query_demo():
    """
    DNS queries use UDP. This demonstrates the UDP send-and-hope model.
    No connection is established — we just fire the packet and wait.
    """
    # AF_INET = IPv4, SOCK_DGRAM = UDP (no connection, no guarantee)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(2.0)  # UDP has no built-in timeout, we set our own
        
        # Google's public DNS server
        dns_server = ("8.8.8.8", 53)
        
        # Minimal DNS query for "google.com" (A record)
        # This is a raw DNS packet — in production, use the `dnspython` library
        dns_query = bytes([
            0x12, 0x34,  # Transaction ID
            0x01, 0x00,  # Flags: standard query
            0x00, 0x01,  # Questions: 1
            0x00, 0x00,  # Answers: 0
            0x00, 0x00,  # Authority: 0
            0x00, 0x00,  # Additional: 0
            # Encoded domain: 6google3com0
            0x06, 0x67, 0x6f, 0x6f, 0x67, 0x6c, 0x65,  # "google"
            0x03, 0x63, 0x6f, 0x6d, 0x00,               # "com"
            0x00, 0x01,  # Type: A (IPv4 address)
            0x00, 0x01,  # Class: IN (Internet)
        ])
        
        print("[1] Sending UDP DNS query (no handshake, no connection)...")
        sock.sendto(dns_query, dns_server)
        
        try:
            response, server = sock.recvfrom(512)
            print(f"[2] Received {len(response)} bytes from {server}")
            print(f"    (If this timed out, the packet was simply lost — UDP doesn't retry)")
        except socket.timeout:
            print("[2] Timed out. UDP doesn't retransmit. We'd need to retry manually.")

udp_dns_query_demo()

Introspecting Your Own Network Stack

import socket
import struct

def inspect_network_info():
    """
    Demonstrates how to programmatically explore your network configuration.
    Maps to our Layer 2 (MAC) and Layer 3 (IP) concepts.
    """
    hostname = socket.gethostname()
    local_ip = socket.gethostbyname(hostname)
    
    print(f"Hostname: {hostname}")
    print(f"Local IP (Layer 3): {local_ip}")
    
    # Get all network interfaces and their addresses
    for info in socket.getaddrinfo(hostname, None):
        family, type_, proto, canonname, sockaddr = info
        if family == socket.AF_INET:
            print(f"IPv4 Address: {sockaddr[0]}")
        elif family == socket.AF_INET6:
            print(f"IPv6 Address: {sockaddr[0]}")
    
    # Demonstrate port number concepts
    well_known_ports = {
        20: "FTP Data",     21: "FTP Control",
        22: "SSH",          23: "Telnet",
        25: "SMTP",         53: "DNS",
        80: "HTTP",         443: "HTTPS",
        3306: "MySQL",      5432: "PostgreSQL",
        6379: "Redis",      27017: "MongoDB",
    }
    
    print("\nWell-Known Port Numbers (Layer 4 identifiers):")
    for port, service in well_known_ports.items():
        print(f"  Port {port:5d}: {service}")

inspect_network_info()

Part 11: Common Interview Questions — Decoded

Now that we’ve built the foundation layer by layer, let’s revisit the most common interview questions and explain exactly what’s being tested.

“What happens when you type google.com in a browser?”

This is a full-stack question that tests whether you can walk down the OSI model coherently. The ideal answer covers:

DNS Resolution (Layer 7/Application) — Browser checks cache, OS checks /etc/hosts, then queries local DNS resolver, which queries the DNS hierarchy
TCP Connection (Layer 4/Transport) — Three-way handshake to port 443
TLS Handshake (Layer 6/Presentation) — Certificate exchange, cipher negotiation, session key establishment
HTTP Request (Layer 7/Application) — GET request sent over the TLS-secured connection
IP Routing (Layer 3/Network) — Packets routed through ISP infrastructure
Ethernet Frames (Layer 2/Data Link) — Your router forwards frames to the ISP
Physical Transmission (Layer 1) — Electrical signals or photons on fiber

“What’s the difference between TCP and UDP?”

Group your answer by three dimensions:

Dimension	TCP	UDP
Connection	Connection-oriented (3-way handshake)	Connectionless
Reliability	Guaranteed delivery, ordered, no duplicates	Best-effort, no guarantees
Speed/Overhead	Higher overhead, lower throughput	Lower overhead, higher throughput
Use Cases	HTTP, HTTPS, FTP, email, file transfer	DNS, streaming, gaming, VoIP

“What’s the difference between OSI and TCP/IP?”

OSI is the theoretical reference model (7 layers, ISO standard, 1984). TCP/IP is the practical implementation model (4 layers, what the actual internet uses). OSI helps us reason and talk about networking. TCP/IP is what runs on every device.

“What’s the difference between a switch and a router?”

Switch = Layer 2 device. Forwards frames based on MAC addresses. Operates within a single network (LAN). Creates a table of MAC→port mappings.
Router = Layer 3 device. Forwards packets based on IP addresses. Routes traffic between different networks. Maintains routing tables.

A useful memory trick: switches think in MAC addresses (hardware), routers think in IP addresses (logical).

Part 12: The Bigger Picture — Why This Matters Beyond the Interview

The OSI and TCP/IP models aren’t just interview trivia. They’re mental models that make you a better engineer in your daily work.

When you’re debugging a production incident, knowing the layers helps you isolate the problem:

Can’t reach the server at all? Likely Layer 1–3 (physical, routing, or firewall)
Connection refused? Layer 4 (the process isn’t listening on that port)
TLS certificate error? Layer 6 (presentation/security)
404 Not Found? Layer 7 (application routing)

When you’re designing a microservices architecture, the transport choice (TCP vs. UDP) has real implications for your service mesh and load balancers.

When you’re optimizing for latency, understanding TCP’s handshake cost is why HTTP/2 and HTTP/3 were designed to multiplex multiple streams over a single connection — avoiding the overhead of establishing new TCP connections for every request.

When you’re working with WebSockets or gRPC, you’re operating at the precise boundary between Layer 4 and Layer 7, and understanding both layers deeply makes you a much more effective debugger.

The engineers who designed the internet understood one timeless principle: complex systems are tamed by clean abstractions. The OSI model is a masterclass in that principle. Each layer knows exactly what it’s responsible for, communicates through a well-defined interface, and doesn’t care about the implementation details of its neighbors.

That’s not just good networking. That’s good software engineering.

Quick Reference — The Cheat Sheet

  OSI Layer  |  Name         |  Unit    |  Key Protocols      |  Devices
  -----------+---------------+----------+---------------------+----------
  Layer 7    |  Application  |  Data    |  HTTP, DNS, SMTP    |  Proxy
  Layer 6    |  Presentation |  Data    |  TLS, JPEG, ASCII   |  —
  Layer 5    |  Session      |  Data    |  RPC, NetBIOS       |  —
  Layer 4    |  Transport    |  Segment |  TCP, UDP           |  Firewall
  Layer 3    |  Network      |  Packet  |  IP, ICMP, OSPF     |  Router
  Layer 2    |  Data Link    |  Frame   |  Ethernet, ARP      |  Switch
  Layer 1    |  Physical     |  Bits    |  802.3, DSL, Wi-Fi  |  Hub, NIC

  Memory Aid:  "All People Seem To Need Data Processing"
               (Application, Presentation, Session, Transport,
                Network, Data Link, Physical)