Socket programming

IP, port, TCP/UDP

TCP byte stream and protocols

TCP stands for Transmission Control Protocol. It gives the application a continuous stream of bytes with no internal boundaries: the receiver may get the bytes in differently sized chunks than the sender wrote. Interpreting this byte stream is the job of an application protocol, a set of rules for making sense of the bytes, including how to split them into messages.
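
One common way for an application protocol to split the stream into messages is to prefix each message with its length. The sketch below assumes a 4-byte big-endian length header; the function names `encode_message` and `decode_messages` are made up for illustration and are not part of any library.

```python
import struct
from typing import List, Tuple

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian payload length


def encode_message(payload: bytes) -> bytes:
    """Prefix the payload with its length so the receiver can find its end."""
    return HEADER.pack(len(payload)) + payload


def decode_messages(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Extract every complete message from buffer.

    Returns the list of complete payloads and the leftover bytes, which
    belong to a message that has not fully arrived yet.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        if len(buffer) - start < length:
            break  # the body is incomplete; wait for more bytes
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]
```

A receiver would keep appending whatever `recv()` returns to the leftover bytes and call `decode_messages` again, since a single read may contain half a message or several messages at once.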

Data serialisation maps objects (the contents of a message) to bytes; deserialisation maps bytes back to objects. Libraries for formats such as JSON or Protobuf handle both directions for you.
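
As a minimal sketch of that round trip, using Python's standard `json` module: the message fields (`type`, `text`, `seq`) and the helper names are invented for this example.

```python
import json


def serialise(message: dict) -> bytes:
    """Object -> bytes: JSON text encoded as UTF-8."""
    return json.dumps(message).encode("utf-8")


def deserialise(data: bytes) -> dict:
    """Bytes -> object: decode UTF-8, then parse the JSON text."""
    return json.loads(data.decode("utf-8"))


# Hypothetical message used only for illustration.
message = {"type": "chat", "text": "hi", "seq": 1}
wire_bytes = serialise(message)
restored = deserialise(wire_bytes)
```

Note that serialisation alone does not tell the receiver where one message ends in a TCP stream; it still needs a framing rule from the application protocol.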

Networking from a Programmer’s perspective

Layers of protocols

Abstract idea: Network protocols are divided into layers. A lower layer can contain a higher layer as payload, and the higher layer adds new functions.

Reality: Ethernet contains IP, IP contains UDP or TCP, UDP or TCP contains application protocols.
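
To make "a lower layer contains a higher layer as payload" concrete, here is a rough sketch that wraps an application payload in a UDP header by hand. The header layout (source port, destination port, length, checksum, each 16 bits) is the real UDP format, but the helper names are hypothetical and the checksum is simply left at zero. Real programs never build this header themselves; the operating system does it.

```python
import struct

# UDP header: source port, destination port, total length, checksum (16 bits each).
UDP_HEADER = struct.Struct("!HHHH")


def wrap_udp(payload: bytes, src_port: int, dst_port: int) -> bytes:
    """Put the application payload inside a UDP datagram."""
    length = UDP_HEADER.size + len(payload)  # length covers header + payload
    checksum = 0  # simplification for this sketch: checksum not computed
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def unwrap_udp(datagram: bytes):
    """Recover the port numbers and the application payload."""
    src_port, dst_port, length, _checksum = UDP_HEADER.unpack_from(datagram)
    return src_port, dst_port, datagram[UDP_HEADER.size:length]
```

An IP packet would carry the bytes returned by `wrap_udp` as its own payload, and an Ethernet frame would carry that IP packet in turn.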

We can also divide the layers by function:

The layer of small, discrete messages (IP): When downloading a large file, network hardware can only process smaller units called IP packets—it cannot store the entire file at once. This is why the lowest layer operates on packets. TCP, a higher layer protocol, handles the task of reassembling these packets into complete application data.
The layer of multiplexing (port number): Multiple apps can share the same network on a single computer. How does the computer know which packet belongs to which app? Sorting incoming packets by app is called demultiplexing. The layer directly above IP (UDP or TCP) adds 16-bit port numbers to distinguish different apps. Each app must claim an unused local port number before it can send or receive data. The computer uses a 4-tuple to identify a “flow” of information: (src_ip, src_port, dst_ip, dst_port).
The layer of reliable and ordered bytes (TCP): TCP provides a layer of reliable, ordered bytes on top of IP packets; it handles retransmission and reordering automatically.
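
The port numbers and the 4-tuple described above can be observed with a TCP connection over the loopback interface. This is a minimal sketch, assuming a machine where `127.0.0.1` is available; the function name `loopback_flow` is made up, and binding to port 0 asks the operating system to pick any unused port.

```python
import socket


def loopback_flow():
    """Open a TCP connection to ourselves and return the flow's 4-tuple
    as seen from each end: (src_ip, src_port, dst_ip, dst_port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS claim an unused port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _peer = listener.accept()
    try:
        # The client's local address is the source, its peer the destination.
        client_view = client.getsockname() + client.getpeername()
        # On the server side the roles are mirrored: its peer is the source.
        server_view = server_side.getpeername() + server_side.getsockname()
        return client_view, server_view
    finally:
        client.close()
        server_side.close()
        listener.close()


client_view, server_view = loopback_flow()
print(client_view)  # the port numbers differ from run to run
```

Both ends describe the same flow, so the two tuples should match. The client never chose a port explicitly; the operating system claimed an unused local port for it when the connection was made.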

TCP/IP Model

Network protocols layered by function:

	Subject	Function
Higher	TCP	Reliable & ordered bytes
↕	Port in TCP/UDP	Multiplex to programs
Lower	IP	Small, discrete messages

What’s actually relevant to us: the layers above IP are what we care about. Applications use TCP or UDP, either directly by rolling their own protocol, or indirectly by using an implementation of a well-known protocol. Both TCP and UDP are carried inside IP packets, and practically every application protocol is built on top of one of the two.
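
For contrast with TCP's byte stream, here is a small sketch of using UDP directly. Each `sendto` produces one datagram and each `recvfrom` returns one datagram, so message boundaries are preserved, but delivery is not guaranteed. The function name `udp_round_trip` and the 2-second timeout are choices made for this example.

```python
import socket


def udp_round_trip(message: bytes) -> bytes:
    """Send one UDP datagram to ourselves over loopback and read it back."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick an unused port
    receiver.settimeout(2.0)  # UDP may drop packets, so never wait forever
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(message, receiver.getsockname())
        data, _sender_addr = receiver.recvfrom(65535)
        return data
    finally:
        sender.close()
        receiver.close()


reply = udp_round_trip(b"hello")
```

There is no connection setup and no retransmission here; an application that rolls its own protocol on UDP has to add whatever reliability it needs.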

Ethernet sits below IP. It is also packet-based, but it uses a different type of address (the MAC address).