Networking is about one computer sending a message to another computer. In the IP world, this message is called a packet. A packet is much like a postcard in the postal service: the postal service takes a postcard addressed to someone and delivers it. Similarly, when a computer asks the network to deliver a packet to another computer, it needs to provide a network address. This network address is called an IP address.
An Internet Protocol address is a numerical label such as 192.0.2.1 that is assigned to a device connected to a computer network that uses the Internet Protocol for communication. An IP address serves two main functions: network interface identification and location addressing. Internet Protocol version 4 (IPv4) defines an IP address as a 32-bit number. However, because of the growth of the Internet and the depletion of available IPv4 addresses, a new version of IP (IPv6) uses 128 bits for the IP address. Source
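To make the 32-bit and 128-bit sizes concrete, here is a minimal sketch using Python's standard ipaddress module. The IPv4 address is the one quoted above; the IPv6 address 2001:db8::1 is an assumption chosen from the documentation range and does not appear in the original text.

```python
import ipaddress

# 192.0.2.1 comes from the paragraph above.
# 2001:db8::1 is an assumed example address from the IPv6 documentation range.
v4 = ipaddress.ip_address("192.0.2.1")
v6 = ipaddress.ip_address("2001:db8::1")

# max_prefixlen is the number of bits in an address of that version.
print(v4.version, v4.max_prefixlen, int(v4))  # 4 32 3221225985
print(v6.version, v6.max_prefixlen)           # 6 128
```

The dotted notation is only a human-friendly way of writing the underlying number, which is why int(v4) yields a single integer.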
For a packet to get from the source computer to the destination computer, the network has to understand the IP address, much as the postal service understands a mailing address. For example, if the destination address is in the same postal code as the sender’s, the mail probably never leaves the neighbourhood. But if the destination mailing address is in a different postal code, the postal service has to rely on some mechanism to forward the mail toward its destination. Network routing is very similar. If a packet is sent from 192.168.0.2 to 192.168.0.5, the two computers are likely close to each other and the routing is simple, probably just a single hop to the target computer. However, if a packet is sent from 192.168.1.2 to 184.108.40.206, it probably requires many hops across a complicated network.
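The "same postal code" test can be sketched with the ipaddress module. The helper name same_network and the /24 prefix length are assumptions made for illustration; the text above does not state which prefix these networks actually use.

```python
import ipaddress


def same_network(src: str, dst: str, prefix_len: int = 24) -> bool:
    """Return True if dst lies inside the network that src belongs to.

    prefix_len=24 is an assumed prefix length, not something the text states.
    """
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{src}/{prefix_len}", strict=False)
    return ipaddress.ip_address(dst) in network


print(same_network("192.168.0.2", "192.168.0.5"))      # True
print(same_network("192.168.1.2", "184.108.40.206"))   # False
```

When the check is true, the sender can usually reach the destination directly; otherwise it hands the packet to a router, which is the "some mechanism" of the postal analogy.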
A postcard usually carries the name of the recipient, because more than one person may live at a particular mailing address. In the computer world, many processes on one computer may use the network. If all we have is a network address, the packet can only be delivered to the target computer; there is no way to decide which process should receive it. This is solved by the use of a network port. The combination is usually written in the form ip:port, for example, 220.127.116.11:80.
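As a small illustration of the ip:port form, the sketch below splits such a string into the (host, port) pair that Python's socket functions expect. The function name parse_endpoint is hypothetical, and the sketch assumes an IPv4 address or hostname; bracketed IPv6 literals are not handled.

```python
from typing import Tuple


def parse_endpoint(endpoint: str) -> Tuple[str, int]:
    """Split 'ip:port' into (host, port). Hypothetical helper for illustration."""
    host, sep, port_text = endpoint.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected ip:port, got {endpoint!r}")
    port = int(port_text)  # raises ValueError if the port is not a number
    if not 0 <= port <= 65535:
        # Port numbers are 16-bit values.
        raise ValueError(f"port out of range: {port}")
    return host, port


print(parse_endpoint("220.127.116.11:80"))  # ('220.127.116.11', 80)
```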
Transmission Control Protocol (TCP)
TCP is a layer on top of the Internet Protocol (IP), and the two are used together so often that they are referred to as “TCP/IP”.
Similar to the postal service, networking protocols set an upper limit on how many bytes can be sent in a single IP packet. TCP establishes a connection to a particular port on the target computer, and the desired data can then be sent as multiple packets, each within that upper limit. TCP ensures that all the data arrives at the destination in the proper order by chopping it up into individually numbered IP packets. If a packet is lost in transit, it can be resent. TCP also has mechanisms to notice when packets have not been getting through for an extended period of time and to notify you of a “broken” connection.
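The chopping and reordering described above can be illustrated with a toy model. This is not how TCP is implemented (the operating system does this work, and real TCP also handles acknowledgements and retransmission); the function names and the 8-byte payload limit are assumptions chosen to keep the example small.

```python
import random


def segment(data: bytes, max_payload: int):
    """Split data into (sequence_number, chunk) pairs.

    The sequence number is the byte offset of the chunk within the data.
    """
    return [
        (offset, data[offset:offset + max_payload])
        for offset in range(0, len(data), max_payload)
    ]


def reassemble(segments) -> bytes:
    """Rebuild the original data, tolerating reordering and duplicates."""
    by_sequence = dict(segments)  # a duplicate segment overwrites itself
    return b"".join(by_sequence[seq] for seq in sorted(by_sequence))


message = b"The quick brown fox jumps over the lazy dog"
segments = segment(message, max_payload=8)

random.shuffle(segments)       # simulate out-of-order arrival
segments.append(segments[0])   # simulate a duplicated segment

print(reassemble(segments) == message)  # True
```

Because every chunk carries its own number, the receiver can put the data back in order no matter how the packets arrived.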
TCP provides reliable, ordered, and error-checked delivery of a stream of octets (bytes) between applications running on hosts communicating via an IP network. Major internet applications such as the World Wide Web, email, remote administration, and file transfer rely on TCP, which is part of the Transport Layer of the TCP/IP suite. SSL/TLS often runs on top of TCP. TCP is connection-oriented, and a connection between client and server is established before data can be sent. The server must be listening (passive open) for connection requests from clients before a connection is established. Three-way handshake (active open), retransmission, and error detection adds to reliability but lengthens latency. Applications that do not require reliable data stream service may use the User Datagram Protocol (UDP), which provides a connectionless datagram service that prioritizes time over reliability. TCP employs network congestion avoidance. However, there are vulnerabilities to TCP, including denial of service, connection hijacking, TCP veto, and reset attack. Source
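The passive open and active open mentioned above map onto a few socket calls. The following is a minimal sketch that runs both ends on the loopback address inside one process; the function names and the upper-casing "echo" behaviour are assumptions made for illustration.

```python
import socket
import threading


def serve_one_client(server_sock: socket.socket) -> None:
    """Accept a single connection and reply with the data in upper case."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def tcp_round_trip(payload: bytes) -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server_sock.listen(1)               # passive open: wait for clients
        port = server_sock.getsockname()[1]

        worker = threading.Thread(target=serve_one_client, args=(server_sock,))
        worker.start()

        # Active open: the three-way handshake happens inside this call.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(payload)
            # A single recv is enough for this tiny payload; TCP is a byte
            # stream, so larger transfers need a loop.
            reply = client.recv(1024)

        worker.join()
    return reply


print(tcp_round_trip(b"hello"))  # b'HELLO'
```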
In computer networking, the User Datagram Protocol (UDP) is one of the core members of the Internet protocol suite. With UDP, computer applications can send messages, in this case referred to as datagrams, to other hosts on an Internet Protocol (IP) network. Prior communications are not required in order to set up communication channels or data paths. UDP uses a simple connectionless communication model with a minimum of protocol mechanisms. UDP provides checksums for data integrity, and port numbers for addressing different functions at the source and destination of the datagram. It has no handshaking dialogues, and thus exposes the user’s program to any unreliability of the underlying network; there is no guarantee of delivery, ordering, or duplicate protection. If error-correction facilities are needed at the network interface level, an application may instead use Transmission Control Protocol (TCP) or Stream Control Transmission Protocol (SCTP) which are designed for this purpose. UDP is suitable for purposes where error checking and correction are either not necessary or are performed in the application; UDP avoids the overhead of such processing in the protocol stack. Time-sensitive applications often use UDP because dropping packets is preferable to waiting for packets delayed due to retransmission, which may not be an option in a real-time system. The protocol was designed by David P. Reed in 1980 and formally defined in RFC 768. Source
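For contrast with TCP, here is a minimal UDP sketch over the loopback address. Note that there is no listen, accept, or connect step: the sender simply addresses a datagram to an ip:port pair. The function name is hypothetical. On loopback the datagram arrives in practice, but UDP itself gives no such guarantee, which is why the receiver sets a timeout.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(5)           # give up if the datagram never arrives

        # No handshake: just send one datagram to the receiver's address.
        sender.sendto(payload, receiver.getsockname())

        data, _sender_addr = receiver.recvfrom(2048)
    return data


print(udp_round_trip(b"ping"))  # b'ping'
```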
Domain Name System (DNS)
The destination computer can be addressed by its IP address on the network. However, IP addresses are hard for users to remember. In addition, as computers and networks are upgraded over time, a computer is likely to be assigned a different IP address, and it would then be inaccessible at its old address. Thus, a “phone book” called the Domain Name System (DNS) is used to solve these issues.
The dig utility can be used to query DNS. In the following example, the domain name google.com is eventually translated to the IP address 18.104.22.168.
    $ dig google.com
    [..]
    ;; QUESTION SECTION:
    ;google.com.            IN  A

    ;; ANSWER SECTION:
    google.com.     222     IN  A   22.214.171.124
    [..]
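Programs usually perform the same lookup through the operating system's resolver rather than by running dig. The sketch below uses Python's socket.getaddrinfo; the helper name resolve is hypothetical. It resolves localhost so that it runs without network access; passing google.com instead would perform a real DNS query, and the addresses returned will vary.

```python
import socket


def resolve(hostname: str) -> list:
    """Return the sorted, de-duplicated IP addresses for a hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the IP address is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


# "localhost" keeps the example offline; try "google.com" for a real lookup.
print(resolve("localhost"))
```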
Hypertext Transfer Protocol (HTTP)
The Hypertext Transfer Protocol is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, where hypertext documents include hyperlinks to other resources that the user can easily access, for example by a mouse click or by tapping the screen in a web browser. Source
When we type a web address, such as http://example.com/hello, into a web browser, the browser first consults DNS to translate the domain name into an IP address. Then the web browser establishes a TCP connection to that IP address on port 80, which is the default “well known” port for HTTP. Once the connection is established, the web browser sends a GET message to the server to ask for a particular resource. The server replies with an OK message and the requested content.
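The connect, send GET, receive reply sequence can be reproduced by hand over a plain TCP socket. The sketch below is self-contained: it starts a throwaway server from Python's http.server module on the loopback address and sends the request itself. HelloHandler, raw_get, and demo are hypothetical names, and the server picks a free port rather than port 80 so that no special privileges are needed.

```python
import http.server
import socket
import threading


class HelloHandler(http.server.BaseHTTPRequestHandler):
    """Throwaway server for the demo: answers every GET with 'Hello world'."""

    def do_GET(self):
        body = b"Hello world"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def raw_get(host: str, port: int, path: str) -> bytes:
    """Send a GET request over a TCP socket and return the raw response."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Accept: */*\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request)
        chunks = []
        while True:  # read until the server closes the connection
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def demo() -> bytes:
    server = http.server.HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return raw_get("127.0.0.1", server.server_address[1], "/hello")
    finally:
        server.shutdown()
        server.server_close()


print(demo().decode("iso-8859-1"))
```

The request is nothing more than a few lines of text sent over the TCP connection, terminated by an empty line.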
curl is a commonly used CLI tool that you can use to issue GET requests.
    $ curl -v http://126.96.36.199/hello
    * About to connect() to 188.8.131.52 port 80 (#0)
    *   Trying 184.108.40.206...
    * Connected to 220.127.116.11 (18.104.22.168) port 80 (#0)
    > GET /hello HTTP/1.1
    > User-Agent: curl/7.29.0
    > Host: 22.214.171.124
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Server: nginx/1.17.10
    < Date: Sun, 13 Mar 2022 01:50:19 GMT
    < Content-Type: text/html; charset=utf-8
    < Content-Length: 15
    < Connection: keep-alive
    <
    * Connection #0 to host 126.96.36.199 left intact
    Hello world
Here we use the curl utility to issue a simple GET request. In this case, we don’t have DNS set up, so we issue the GET request to the target server by its IP address directly. curl establishes a TCP connection to the default port 80 at that address and then sends a GET message for the resource /hello. Finally, a 200 OK response is received, along with the content Hello world.
In the example output, there are additional lines in both the request and the response that look like Name: Value. These are called headers, and they convey additional information about the request and response. For example, the request contains the header Accept: */*, which means the client can accept the response in any format. In the response, we see the header Content-Type: text/html; charset=utf-8, which is the server telling the client that the body of the response is HTML text encoded as UTF-8.
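The Name: Value structure is simple enough to parse by hand. The sketch below splits a raw response into its status line, headers, and body; parse_response is a hypothetical helper, and the sample bytes are an abbreviated, hand-written version of the response in the curl output, not a capture of real traffic. Real clients should rely on an HTTP library, which also handles cases such as repeated headers.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into (status_line, headers, body)."""
    # An empty line separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Abbreviated sample based on the curl output above.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: nginx/1.17.10\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"\r\n"
    b"Hello world"
)

status_line, headers, body = parse_response(sample)
print(status_line)              # HTTP/1.1 200 OK
print(headers["content-type"])  # text/html; charset=utf-8
print(body)                     # b'Hello world'
```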