When you think about it, the same is true in the networking world. For two systems, especially ones that speak entirely different languages, as might be the case with a Linux system trying to communicate with a Windows system, there must be standards of behavior and communication. In the early days of the Internet, back when it was still called the Arpanet in the late '60s and early '70s, many more operating systems were around than might seem to be the case today. Although there still are many, once you start factoring in larger systems, the day-to-day experience of the vast majority of people is with three operating systems: Windows, macOS, and Linux. Two of those come from the same root operating system – Unix. However, they have just enough differences even today that protocols are important to make sure every conversation takes place smoothly.
Most of the time, when there is a conversation about protocols, you will hear someone refer to layers. This is because protocols are generally placed into stacks to explain how they relate to one another. Every type of communication on a network will involve multiple protocols across multiple layers, though each protocol is generally only aware of its own layer. There is one exception to that, but we'll get to it later in this chapter. Network protocols are mapped into two stacks. One is a generic model, and the other, TCP/IP, is a description of a set of protocols specifically designed to work together. Even the TCP/IP protocols can be mapped into the generic model, however.
Regardless of which way you think about the protocols, one important factor to keep in mind is that every layer only ever talks to its own layer on the other side. If you think about writing someone a letter, you can conceive of how this operates. You write a letter, you put it in an envelope, seal the envelope, address it, put a stamp on it, and then put it in the mailbox. For every action you put into pulling the letter together, there is a corresponding action on the receiving end. Your post office on the sending end determines how the envelope should get to the recipient by looking at the ZIP code. The sending post office has no interest in anything inside the envelope and really doesn't have any interest in the street address or the name of the recipient.
Let's say that the letter you are sending is to someone at a business. The address you have placed on the envelope is for the business. Once the envelope reaches the destination post office (the one that owns the ZIP code), the postal workers there have to look at the street address in order to determine which truck to put it on for delivery. The person driving the truck and delivering the mail doesn't look at the ZIP code because it's irrelevant – the truck only delivers to a single ZIP code. Likewise, the name on the envelope is also irrelevant; the only important part is the street address. Once the envelope gets to the business and lands in the mail room or with the receptionist, or whoever gets the mail when it arrives, that person will look at the name on the envelope and deliver it. The recipient then gets the letter, opens it, and reads the contents.
The same is true when we talk about protocol stacks. At every point during the process of sending and receiving, there is a specific piece of information that is intended for and handled by a specific person or target. The ZIP code tells the sending post office how to get to the destination. The street address tells the receiving post office how to get to the destination. The name on the envelope tells the receiving party who the letter is actually destined for, and in the end, the letter is probably only meaningful in any way to the recipient. None of these parties has much interest in looking at the other information because it doesn't help them to do their job. Certainly, each party can see the rest of the information (except, perhaps, the contents of the letter), but they only focus on the information they actually need. You will see this repeated over and over as we start talking about the different protocol stacks and then the specific protocols from the TCP/IP suite of protocols.
An essential concept that you should understand before we get started is encapsulation. Regardless of which communications stack you are referring to, data passes from one layer to another. Each layer distinguishes itself by adding some data associated with that layer before passing the result on to the next layer down. This process is called encapsulation. Going back to our mail example, the letter is encapsulated inside the envelope and then the person's name is added to the envelope. After that, the street address and then finally the ZIP code (since the city/town and state are just the long form of the ZIP, they are redundant) are added. This addressing information encapsulates the information that comes before, though in a less obvious way than you will get from the IP addresses and other forms of address discussed below.
On the receiving end, the communication goes through de-encapsulation by removing the headers that were added on the sending end before the data is sent to the next layer up the stack. You will see this process of encapsulation as we start talking about the two different models and then, more concretely, when we start looking at the different protocols in operation.
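The round trip described above can be sketched in a few lines of Python. This is only an illustration built on the mail example, not a real protocol: the function names `encapsulate` and `decapsulate`, the sample name and addresses, and the one-byte length prefix in front of each header are all assumptions made so the receiver can tell where a header ends.

```python
from __future__ import annotations


def encapsulate(payload: bytes, headers: list[bytes]) -> bytes:
    """Wrap the payload, one layer at a time, on the way down the stack.

    Each layer prepends its own header (hypothetical format: one length
    byte followed by the header bytes), so the last header added ends up
    outermost.
    """
    data = payload
    for header in headers:
        data = len(header).to_bytes(1, "big") + header + data
    return data


def decapsulate(data: bytes, layer_count: int) -> tuple[list[bytes], bytes]:
    """Strip headers on the way up the stack, outermost header first."""
    headers = []
    for _ in range(layer_count):
        length = data[0]                      # size of this layer's header
        headers.append(data[1:1 + length])    # the header this layer reads
        data = data[1 + length:]              # what is handed to the next layer up
    return headers, data


# Sending side: the name is added first, then the street address, then the ZIP code.
wire = encapsulate(b"Dear Sam", [b"Sam Jones", b"12 Main St", b"05401"])

# Receiving side: the ZIP code comes off first and the name last.
headers, letter = decapsulate(wire, 3)
```

Notice that the headers come off in the reverse of the order in which they went on, which mirrors the way the ZIP code is the first thing read and the recipient's name the last.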
Open Systems Interconnection (OSI) Model
In the 1970s, a number of communication protocols were in use, including the nascent TCP on the Arpanet as well as Systems Network Architecture (SNA) from IBM, DECnet from Digital Equipment Corporation, and many others. The International Organization for Standardization (ISO) decided a single model was needed to fit all communication protocols. In 1977, the ISO made use of work done by the Honeywell Corporation to create an abstract model describing different functions used in communications systems. By 1983, it had merged its standard with a similar standard by the International Telegraph and Telephone Consultative Committee (CCITT) to create the current Open Systems Interconnection (OSI) model.
NOTE
The name “ISO” is a compromise that recognizes the different abbreviations across the three languages used within ISO. It is based on the Greek isos, meaning equal.
The OSI model consists of seven separate and distinct layers, each describing a particular set of functions and behaviors. Although every protocol used for communication will fit into one of these seven layers, not all communication streams will make use of all seven layers. Some types of communication are far simpler than others and may not need some of the higher layers of the protocol stack, depending on the intention of the communication. You can see a representation of the OSI model, drawn as a stack of boxes, in Figure 2.1.
Figure 2.1: The Open Systems Interconnection seven-layer model.
We will go through the model from the bottom to the top, as though we were reading a message off the wire. At the very bottom of the stack, at layer 1, is the physical layer. The physical layer includes all of the tangible components that you can touch – cabling, network interfaces, and the actual signaling medium, whether it's light or electrical. Since the name is pretty straightforward and descriptive, this one will be the easiest to remember and keep straight.
The next one up is the data link layer, layer 2. The data link layer is how systems on the same physical network communicate. For every layer in the stack, there is generally a way to differentiate communication streams – a way of addressing. At layer 2, this is the Media Access Control (MAC) address. The MAC address is attached directly to the network interface, which is why it is sometimes called the physical address. The data link layer makes sure that devices on the same physical network can communicate reliably with one another. If you are using a switch on your network, the switch is operating at layer 2 because it makes use of the MAC address to determine where to send network messages.
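To make the switch's use of MAC addresses a little more concrete, here is a minimal Python sketch of a switch deciding where to send a frame. The class name `LearningSwitch`, the method `handle_frame`, and the sample addresses are hypothetical. The sketch also assumes the common behavior in which a switch learns which port each source address was seen on and sends a frame out every other port when it has not yet seen the destination; real switches add details, such as aging out old entries, that are left out here.

```python
class LearningSwitch:
    """Simplified layer 2 switch: forwards frames based only on MAC addresses."""

    def __init__(self, port_count):
        self.ports = list(range(port_count))
        self.mac_table = {}  # MAC address -> port where it was last seen

    def handle_frame(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame should be sent out of."""
        # Remember which port the sender is attached to.
        self.mac_table[src_mac] = in_port

        # Known destination: send the frame to that one port only.
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]

        # Unknown destination: send it out every port except the one it came in on.
        return [port for port in self.ports if port != in_port]


switch = LearningSwitch(port_count=4)

# First frame: the destination has not been seen yet, so it goes to ports 1, 2, and 3.
first = switch.handle_frame(0, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02")

# The reply teaches the switch where the second system is; it goes only to port 0.
reply = switch.handle_frame(1, "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01")

# Now both systems are known, so traffic goes to exactly one port.
second = switch.handle_frame(0, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02")
```

Nothing in this decision looks past the layer 2 addresses, which is the sense in which the switch operates at layer 2.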
NOTE
The MAC address is six bytes and it is