Networking Fundamentals
Core networking concepts for cloud infrastructure engineers — from OSI layers and TCP/IP internals to SDN, overlay networks, and real-world troubleshooting tools.
OSI Model — Cloud & DevOps Perspective
The OSI model provides a layered framework for understanding where issues occur. In cloud environments, most problems live at layers 3–7, but lower layers surface in on-prem/hybrid scenarios.
Layer 1 — Physical
Cloud relevance: Fiber optic links, transceiver types, cable specs for Direct Connect / Interconnect. Rarely visible in cloud but critical for colocation and dedicated connectivity.
What breaks here: Bad SFP modules, fiber bends, attenuation. Tools: Physical inspection, optical power meters, vendor NOC tickets.
Layer 2 — Data Link
Cloud relevance: MAC addressing, VLANs, 802.1Q tagging (used in Direct Connect VIFs, Interconnect VLAN attachments), ARP, spanning tree (STP).
What breaks here: ARP storms, duplicate MACs, VLAN misconfiguration. Tools: arp -n, arping, tcpdump, switch port analysis.
Layer 3 — Network
Cloud relevance: IP routing, CIDR blocks, route tables, security groups (IP-level), BGP, OSPF. The most critical layer for cloud networking — VPC route tables operate here.
What breaks here: Route table misconfigurations, missing routes, blackhole routes, MTU mismatches causing fragmentation. Tools: ping, traceroute, ip route, mtr.
Layer 4 — Transport
Cloud relevance: TCP/UDP port filtering (security groups, NACLs, firewall rules), connection tracking, TCP state management, health checks (TCP-based NLB). Load balancers operate at L4 (NLB) or L7 (ALB).
What breaks here: Port blocked by security group, TIME_WAIT exhaustion, SYN floods, asymmetric routing breaking stateful connections. Tools: netstat, ss, tcpdump, nmap.
Layer 5 — Session
Cloud relevance: TLS session establishment, session resumption (TLS tickets/session IDs), SSH sessions. Less distinctly modeled in modern stacks — often merged with L4/L6.
What breaks here: Session timeouts, TLS session cache mismatches. Tools: openssl s_client, application logs.
Layer 6 — Presentation
Cloud relevance: TLS encryption/decryption (SSL termination at ALB/GCP HTTPS LB), data encoding (JSON, Protobuf), compression (gzip, Brotli). SSL/TLS offloading happens at this layer.
What breaks here: Certificate errors, cipher mismatch, encoding bugs. Tools: openssl s_client -connect host:443, curl -v.
Layer 7 — Application
Cloud relevance: HTTP/HTTPS routing (ALB rules, GCP URL maps), DNS, gRPC, WebSocket, API Gateway. WAF rules (AWS WAF, Cloud Armor) inspect at this layer.
What breaks here: HTTP 4xx/5xx errors, incorrect Host headers, broken API routing, WAF false positives. Tools: curl -v, browser DevTools, application logs, WAF logs.
TCP/IP Stack
TCP vs UDP
| Feature | TCP | UDP |
|---|---|---|
| Connection | Connection-oriented (3-way handshake) | Connectionless |
| Reliability | Guaranteed delivery, ordering, retransmission | Best-effort, no guarantee |
| Overhead | Higher (headers, ACKs, state) | Lower (8-byte header) |
| Use cases | HTTP, HTTPS, SSH, databases, file transfer | DNS, NTP, video streaming, gaming, VoIP, QUIC |
| Cloud examples | ALB, RDS, SSH bastion, HTTPS APIs | NLB (UDP mode), Route 53, CloudFront (HTTP/3 over QUIC) |
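The practical difference shows up with netcat; a sketch, assuming a reachable host at the placeholder address 10.0.1.10:

```shell
# TCP probe: nc completes a real 3-way handshake, so "succeeded" is definitive
nc -zv 10.0.1.10 443
# UDP probe (-u): connectionless, so "open" only means no ICMP port-unreachable
# came back; UDP scan results are best-effort, just like the protocol itself
nc -zuv 10.0.1.10 53
```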
TCP Three-Way Handshake
# TCP connection establishment
Client Server
| |
|--- SYN (seq=x) ------------->| Client initiates
| |
|<-- SYN-ACK (seq=y, ack=x+1) -| Server acknowledges + its own SYN
| |
|--- ACK (ack=y+1) ----------->| Client acknowledges server
| |
|=== DATA TRANSFER ============|
# Connection teardown (4-way)
Client Server
|--- FIN ---------------------->|
|<-- ACK ----------------------|
|<-- FIN ----------------------|
|--- ACK ---------------------->|
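Both the handshake and the teardown can be watched live; a sketch assuming root and an interface named eth0:

```shell
# Show only control packets (SYN / FIN / RST) so the handshake and
# teardown flags stand out; -n skips DNS lookups
tcpdump -i eth0 -n 'tcp port 443 and (tcp[tcpflags] & (tcp-syn|tcp-fin|tcp-rst) != 0)'
# In another terminal, trigger a connection to generate the packets:
curl -s -o /dev/null https://example.com/
```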
TCP Connection States
| State | Description | Troubleshooting note |
|---|---|---|
| LISTEN | Server waiting for incoming connections | Service is up and bound to port |
| SYN_SENT | Client sent SYN, waiting for SYN-ACK | Many here = connectivity/firewall issue |
| SYN_RECV | Server received SYN, sent SYN-ACK | SYN flood if excessive |
| ESTABLISHED | Connection is active and data flowing | Normal active connections |
| FIN_WAIT_1/2 | Active close initiated | Closing connection |
| TIME_WAIT | 2*MSL wait after close (typically 60s) | High count = port exhaustion risk |
| CLOSE_WAIT | Remote side closed, local hasn't | Application not calling close() — likely a bug |
TIME_WAIT can exhaust ephemeral ports (default: 32768–60999). Fix with net.ipv4.tcp_tw_reuse=1 or persistent connections / connection pooling.
# Check TCP states
ss -s
netstat -an | awk '/tcp/ {print $6}' | sort | uniq -c | sort -rn
# Tune TIME_WAIT behavior (Linux)
sysctl net.ipv4.tcp_tw_reuse # allow reuse for outgoing connections
sysctl net.ipv4.ip_local_port_range # view ephemeral port range
# Set in /etc/sysctl.conf:
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535   # widen carefully: avoid ranges your services listen on
IP Addressing
CIDR Notation & Subnetting
| CIDR | Subnet Mask | Hosts | Common Use |
|---|---|---|---|
| /8 | 255.0.0.0 | 16,777,214 | Class A (10.0.0.0/8) |
| /16 | 255.255.0.0 | 65,534 | VPC CIDR block |
| /20 | 255.255.240.0 | 4,094 | Large subnet |
| /24 | 255.255.255.0 | 254 | Standard subnet per AZ |
| /28 | 255.255.255.240 | 14 | AWS PrivateLink endpoints |
| /32 | 255.255.255.255 | 1 | Host route / security group reference |
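The host counts follow directly from the prefix length; a quick shell sanity check:

```shell
# Usable IPv4 hosts = 2^(32 - prefix) - 2 (network + broadcast reserved)
# (/31 and /32 are special cases: 2 and 1 usable addresses respectively)
hosts() { echo $(( (1 << (32 - $1)) - 2 )); }
hosts 24   # 254
hosts 20   # 4094
hosts 16   # 65534
```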
RFC 1918 Private Ranges
10.0.0.0/8 # 10.x.x.x — Class A private (16M addresses) — preferred for VPCs
172.16.0.0/12 # 172.16.x.x to 172.31.x.x — Class B private
192.168.0.0/16 # 192.168.x.x — Class C private (home/small office)
# Loopback
127.0.0.0/8 # localhost
# Link-local (APIPA)
169.254.0.0/16 # includes the cloud metadata endpoint 169.254.169.254 (AWS and GCP)
# AWS VPC recommended planning:
# - 10.0.0.0/8 split into /16 per environment
# - 10.0.0.0/16 → production
# - 10.1.0.0/16 → staging
# - 10.2.0.0/16 → development
# - 10.100.0.0/16 → shared services / hub
IPv6 Overview & Dual-Stack
# IPv6 address format: 8 groups of 4 hex digits
2001:0db8:85a3:0000:0000:8a2e:0370:7334
2001:db8:85a3::8a2e:370:7334 # compressed (:: = consecutive zero groups)
# Special addresses
::1 # loopback (equivalent to 127.0.0.1)
fe80::/10 # link-local (equivalent to 169.254.x.x)
2001:db8::/32 # documentation range (RFC 3849)
::/0 # default route (all IPv6)
# AWS dual-stack VPC
# Assign /56 IPv6 CIDR to VPC, /64 to each subnet
# Enable "Assign IPv6 address on creation" per subnet
# Update route tables: ::/0 via IGW
# GCP dual-stack
gcloud compute networks subnets create my-subnet \
--stack-type=IPV4_IPV6 \
--ipv6-access-type=EXTERNAL \
--region=us-central1
Key Protocols
DNS
Resolution Flow
Recursive resolver (ISP/8.8.8.8) queries on behalf of client, caches results. Authoritative nameserver holds the actual zone records and returns definitive answers.
# DNS resolution chain for api.example.com:
Client → Recursive Resolver (e.g. 8.8.8.8)
→ Root nameservers (13 named root servers, a–m.root-servers.net, anycast)
→ TLD nameservers (.com)
→ Authoritative NS for example.com (Route 53 / Cloud DNS)
→ Returns A record (e.g. 203.0.113.45)
# Inspect DNS resolution
dig api.example.com +trace # full resolution trace
dig @8.8.8.8 api.example.com A # query specific resolver
dig api.example.com ANY # all record types (many servers now minimize/refuse ANY per RFC 8482)
nslookup -type=MX example.com # MX records
host -t TXT example.com # TXT records
# TTL: Time To Live (seconds)
# Low TTL (60s) → fast failover, more DNS queries, higher cost
# High TTL (3600s+) → cached longer, slower propagation
# Best practice: lower TTL 24h before any migration
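To see what resolvers are actually caching, the TTL is the second field of dig's answer; querying twice shows the cache counting down (example.com as used above):

```shell
# Second column = seconds the record may still be cached
dig +noall +answer example.com A
# Run again a few seconds later against the same resolver:
# the TTL decreases until the resolver re-fetches from the authoritative NS
dig +noall +answer example.com A
```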
HTTP/HTTPS Methods, Status Codes & Headers
# Common HTTP methods
GET # Retrieve resource — safe, idempotent
POST # Create resource — not idempotent
PUT # Replace resource — idempotent
PATCH # Partial update
DELETE # Remove resource — idempotent
HEAD # Headers only (no body) — used by health checks
OPTIONS # CORS preflight
# Status code families
1xx: Informational (100 Continue, 101 Switching Protocols)
2xx: Success (200 OK, 201 Created, 204 No Content)
3xx: Redirect (301 Moved Permanently, 302 Found, 304 Not Modified)
4xx: Client Error (400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 429 Too Many Requests)
5xx: Server Error (500 Internal, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout)
# Key request headers
Host: api.example.com
Authorization: Bearer eyJhbGci...
Content-Type: application/json
Accept: application/json
X-Forwarded-For: 203.0.113.10 # original client IP (set by LB)
X-Request-ID: uuid-here
# Key response headers
Content-Type: application/json
Cache-Control: max-age=3600, public
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Content-Type-Options: nosniff
Access-Control-Allow-Origin: https://app.example.com
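A sketch for inspecting and setting these headers in practice (api.example.com is a placeholder):

```shell
# Response headers only, via a HEAD request
curl -sI https://api.example.com/health
# Response headers via a real GET (some endpoints treat HEAD differently)
curl -s -D - -o /dev/null https://api.example.com/health
# Send custom request headers
curl -s -H 'X-Request-ID: test-123' -H 'Accept: application/json' \
  https://api.example.com/health
```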
TLS Handshake
# TLS 1.3 Handshake (simplified)
Client Server
|--- ClientHello ----------------->| (supported ciphers, TLS version, key share)
|<-- ServerHello + Certificate ----| (chosen cipher, cert, key share)
|<-- EncryptedExtensions ----------|
|<-- Finished ---------------------|
|--- Finished ---------------------| (session keys derived)
|=== Encrypted Application Data ===|
# Verify TLS certificate and chain
openssl s_client -connect api.example.com:443 -servername api.example.com
openssl s_client -connect api.example.com:443 | openssl x509 -noout -dates -subject
# Check cipher and protocol version
openssl s_client -connect api.example.com:443 -tls1_3
# Test with curl
curl -v --tlsv1.3 https://api.example.com/health 2>&1 | grep -E 'SSL|TLS|certificate'
ICMP & ARP
# ICMP - Internet Control Message Protocol
# Type 0: Echo Reply (ping response)
# Type 3: Destination Unreachable (network/host/port unreachable)
# Type 8: Echo Request (ping)
# Type 11: Time Exceeded (TTL=0, used by traceroute)
ping -c 4 -s 1472 -M do 10.0.1.10 # MTU check: 1472 data + 8 ICMP + 20 IP = 1500, don't fragment
ping -c 4 -s 8972 -M do 10.0.1.10 # jumbo-frame check (fails if path MTU < 9000)
# ARP - Address Resolution Protocol (L2 → L3 mapping)
arp -n # show ARP cache
arping -I eth0 10.0.1.1 # ARP ping (find MAC for IP)
ip neigh show # modern ARP cache display
# Gratuitous ARP: host announcing its own IP→MAC mapping
# Used by: virtual IP failover (keepalived, VRRP), VM migration
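To see gratuitous ARP in action, assuming root, an interface eth0, and a VIP of 10.0.1.15 (all placeholders), iputils arping can send the announcement and tcpdump can observe failover events:

```shell
# Send a gratuitous ARP announcing our own IP-to-MAC mapping (iputils arping -U)
arping -U -I eth0 -c 3 10.0.1.15
# Watch ARP on the wire; VIP failovers show up as gratuitous replies
tcpdump -i eth0 -n arp
```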
Network Troubleshooting Toolkit
Connectivity & Path Analysis
# ping — basic reachability, latency, packet loss
ping -c 10 8.8.8.8
ping -c 5 -s 1400 10.0.1.10 # test specific packet size
# traceroute — path and per-hop latency (uses TTL increment + ICMP Time Exceeded)
traceroute 8.8.8.8
traceroute -T -p 443 api.example.com # TCP traceroute on port 443 (bypass ICMP blocks)
tracepath 8.8.8.8 # similar, no root needed
# mtr — continuous traceroute with statistics (best for identifying intermittent loss)
mtr --report --report-cycles 20 8.8.8.8
mtr -T -P 443 api.example.com # TCP mode for firewalled paths
# KEY: Look for packet loss that persists at a hop AND all subsequent hops
Port Scanning & Service Discovery
# nmap — port scanning and service fingerprinting
nmap -p 22,80,443 10.0.1.0/24 # scan specific ports on subnet
nmap -sV -p 443 api.example.com # version detection
nmap -sn 10.0.1.0/24 # ping sweep (host discovery only)
nmap --open -p 0-1024 10.0.1.10 # show only open ports
# nc (netcat) — lightweight port testing
nc -zv 10.0.1.10 443 # test if port is open
nc -zv -w 3 10.0.1.10 5432 # with 3s timeout (PostgreSQL)
echo "GET / HTTP/1.0" | nc api.example.com 80
Packet Capture
# tcpdump — packet capture and analysis
tcpdump -i eth0 -n # capture all traffic on eth0
tcpdump -i any host 10.0.1.10 # traffic to/from specific host
tcpdump -i eth0 port 443 # HTTPS traffic
tcpdump -i eth0 'tcp[tcpflags] & tcp-syn != 0' # SYN packets only
tcpdump -i eth0 -w /tmp/capture.pcap # write to file for Wireshark
tcpdump -r /tmp/capture.pcap -n # read capture file
# Practical: capture HTTP requests
tcpdump -i eth0 -A -s 0 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
# Wireshark display filters (when analyzing .pcap)
# http.response.code == 500 → filter 500 errors
# tcp.analysis.retransmission → retransmissions
# dns.flags.rcode != 0 → DNS errors
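The same display filters work headlessly with tshark (part of the Wireshark package), if it is installed:

```shell
# Filter 500 responses out of a capture file
tshark -r /tmp/capture.pcap -Y 'http.response.code == 500'
# List retransmissions with just the endpoints
tshark -r /tmp/capture.pcap -Y 'tcp.analysis.retransmission' \
  -T fields -e ip.src -e ip.dst
```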
Socket & Connection Inspection
# netstat (legacy, widely available)
netstat -tlnp # listening TCP sockets with PID
netstat -an # all connections with numeric addresses
netstat -s # protocol statistics
# ss (modern replacement for netstat)
ss -tlnp # listening TCP sockets with PID
ss -tnp state established # established connections
ss -s # summary statistics
ss -tnp 'dport = :443' # connections to port 443
# Check what's listening on a port
lsof -i :443
fuser 443/tcp
HTTP & TLS Debugging
# curl — versatile HTTP debugging
curl -v https://api.example.com/health # verbose with TLS details
curl -sk https://api.example.com/health # skip cert verification
curl -H "Host: api.example.com" http://10.0.1.10/ # override Host header
curl -w "\n\ntime_namelookup: %{time_namelookup}\ntime_connect: %{time_connect}\ntime_starttransfer: %{time_starttransfer}\ntotal: %{time_total}\n" \
-o /dev/null -s https://api.example.com/health # timing breakdown
# openssl s_client — TLS inspection
openssl s_client -connect api.example.com:443 -servername api.example.com
openssl s_client -connect api.example.com:443 < /dev/null 2>&1 | grep -E 'subject|issuer|expire|verify'
openssl s_client -connect api.example.com:443 -showcerts # show full chain
Cloud Networking Fundamentals
Software-Defined Networking (SDN)
Cloud networks are fully software-defined: the physical underlay carries traffic, while a virtual overlay provides tenant isolation and programmable topology.
Control Plane vs Data Plane
Control plane: Makes routing decisions. In cloud: the hypervisor/SDN controller programs virtual switch rules. Examples: AWS VPC route tables, GCP Cloud Router.
Data plane: Forwards packets based on programmed rules. Happens in hardware (ASICs/FPGAs) or software (OVS). AWS uses Nitro system; GCP uses Andromeda SDN.
Overlay Networks — VXLAN & Geneve
# VXLAN (Virtual Extensible LAN) — RFC 7348
# Encapsulates L2 frames in UDP/IP (port 4789)
# 24-bit VNI (VXLAN Network Identifier) = 16M segments
# Used by: Kubernetes (Flannel, Calico), OpenStack, AWS EKS
# Geneve (Generic Network Virtualization Encapsulation)
# More flexible than VXLAN — variable-length options header
# Used by: AWS Gateway Load Balancer, Open vSwitch / OVN
# AWS Gateway Load Balancer encapsulates inspected traffic in Geneve on UDP port 6081
# VXLAN header structure:
# Outer: Ethernet | IP | UDP (4789) | VXLAN Header (VNI) | Inner: Ethernet | IP | Payload
# Inspect VXLAN traffic
tcpdump -i eth0 port 4789
# Show VXLAN interfaces
ip -d link show type vxlan
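For a feel of what the overlay does, a VXLAN interface can be created by hand; a sketch assuming root, a physical NIC eth0, and arbitrary VNI/addresses:

```shell
# Create a VXLAN device with VNI 100 over eth0 (standard UDP port 4789)
ip link add vxlan100 type vxlan id 100 dstport 4789 dev eth0
ip addr add 192.168.100.1/24 dev vxlan100
ip link set vxlan100 up
# Account for ~50 bytes of encapsulation overhead on the inner MTU
ip link set vxlan100 mtu 1450
```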
East-West vs North-South Traffic
| Traffic Type | Direction | Examples | Security consideration |
|---|---|---|---|
| North-South | Client ↔ Internet / on-prem | User → ALB → App, VPN tunnel | Internet-facing, perimeter security, WAF |
| East-West | Service ↔ Service (within cloud) | App → DB, microservice → microservice | Lateral movement risk — micro-segmentation critical |
Network Performance
Bandwidth, Latency & Throughput
Key Metrics Defined
Bandwidth: Maximum data rate (Gbps). Like the width of a pipe.
Latency: Time for a packet to travel source → destination (ms). Round-trip time (RTT) ≈ 2× one-way latency on symmetric paths.
Throughput: Actual data transferred per second. Limited by min(bandwidth, TCP_window/RTT).
BDP (Bandwidth-Delay Product): bandwidth × RTT = bytes in flight. TCP buffer must be at least BDP for full utilization.
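Worked example: a 10 Gbit/s link with 50 ms RTT needs a surprisingly large buffer to stay full:

```shell
# BDP = bandwidth (bytes/s) * RTT (s), computed in integer arithmetic
bw_bits_per_s=10000000000   # 10 Gbit/s
rtt_ms=50
bdp_bytes=$(( bw_bits_per_s / 8 * rtt_ms / 1000 ))
echo "$bdp_bytes"           # 62500000 -> ~62.5 MB must be in flight to fill the pipe
```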
MTU & Jumbo Frames
# Standard Ethernet MTU: 1500 bytes
# Jumbo frames: 9000 bytes (common in AWS, GCP)
# VXLAN overhead: ~50 bytes → effective MTU 1450 for overlay
# Test MTU with ping (no-fragment flag)
ping -c 3 -M do -s 1472 10.0.1.10 # 1472 data + 8 ICMP + 20 IP headers = 1500 MTU
ping -c 3 -M do -s 8972 10.0.1.10 # test jumbo frames
# Show interface MTU
ip link show eth0
# Set MTU
ip link set eth0 mtu 9000
# AWS: Enable jumbo frames on Nitro instances (up to 9001 bytes)
# GCP: Default MTU 1460, configurable up to 8896 (VPC setting)
TCP Window Scaling & Connection Pooling
# TCP window scaling (RFC 7323, which obsoletes RFC 1323) — critical for high-BDP paths
sysctl net.ipv4.tcp_window_scaling # should be 1 (enabled)
sysctl net.core.rmem_max # max receive buffer
sysctl net.core.wmem_max # max send buffer
# Recommended for high-throughput cloud workloads:
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
# Connection pooling: reuse TCP connections instead of opening new ones
# Benefits: eliminates handshake latency, reduces TIME_WAIT, reduces file descriptor usage
# Implementation: database connection pools (PgBouncer for PostgreSQL, ProxySQL for MySQL)
# HTTP: keep-alive connections, HTTP/2 multiplexing
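curl demonstrates pooling at the HTTP level: within one invocation it keeps the connection open, so the second request skips both the TCP and TLS handshakes (api.example.com is a placeholder):

```shell
# Look for "Re-using existing connection" before the second request
curl -v -o /dev/null -o /dev/null \
  https://api.example.com/a https://api.example.com/b 2>&1 \
  | grep -Ei 'connected to|re-using'
```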
Network Security
Stateful vs Stateless Firewall Inspection
| Feature | Stateful (e.g. SG, iptables conntrack) | Stateless (e.g. NACL, ACL) |
|---|---|---|
| Connection tracking | Yes — tracks state, allows return traffic automatically | No — each packet evaluated independently |
| Performance | Slightly higher CPU (state table lookup) | Faster, but must explicitly allow return traffic |
| Ephemeral ports | Handled automatically | Must explicitly allow 1024-65535 for return traffic |
| AWS equivalent | Security Groups | Network ACLs |
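On Linux, the state table behind stateful filtering is directly visible; a sketch assuming root and the conntrack utility (package conntrack or conntrack-tools) installed:

```shell
# Dump the kernel's connection-tracking table
conntrack -L | head
# Current entries vs. the table's ceiling; exhaustion drops new connections
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max
```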
Network Segmentation & Micro-Segmentation
- Perimeter: Internet Gateway, NAT, WAF (AWS WAF / Cloud Armor) for north-south
- Segment: VPC isolation per environment; private subnets for data tier
- Host: Security groups / firewall rules scoped to minimum required ports
- Micro-segmentation: Kubernetes NetworkPolicy, service mesh (Istio mTLS) for east-west
- Zero Trust: No implicit trust based on network location — authenticate every request
# iptables — Linux host firewall (the same connection-tracking model underlies cloud security groups)
iptables -L -n -v # list all rules with stats
iptables -L INPUT -n --line-numbers # INPUT chain with rule numbers
iptables -A INPUT -p tcp --dport 443 -j ACCEPT # allow HTTPS
iptables -A INPUT -s 10.0.0.0/8 -j ACCEPT # allow from private range
iptables -P INPUT DROP # default deny
# nftables (modern replacement)
nft list ruleset
nft add rule ip filter input tcp dport 443 accept
Layer-by-layer triage:
L1/L2: Can you ARP? → arping
L3: Can you ping? → ping, ip route get <dst>
L4: Can you connect to the port? → nc -zv host port, ss -tlnp
L7: Is the application responding? → curl -v, openssl s_client