TCP (Transmission Control Protocol) for Intermediates

David Piscitello

 

Author’s Note:

I recently wrote a Fundamentals column on TCP for WatchGuard Technologies. This paper investigates TCP at a more advanced level. People who have read the WatchGuard piece and were not challenged may appreciate this paper. Thanks to Lisa Phifer and A. Lyman Chapin for their respective reviews.


The transport layer is the basic end-to-end (host-to-host) building block of the Internet (TCP/IP) architecture and communications. Protocols above the transport layer concentrate on distributed application processing, and protocols below the transport layer concentrate on the transmission, routing, and forwarding of application data. The Transmission Control Protocol (TCP, RFC 793) is a reliable delivery service designed to provide “robustness in spite of unreliable communications media” and “data transfer that is reliable, ordered, full-duplex, and flow controlled.”

 

TCP Functions

To provide reliable data delivery, TCP must:

 

·       Deliver data submitted by sending application processes without loss,

·       Prevent duplication of data,

·       Preserve the order of bytes of data submitted,

·       Detect and correct[1] corruption (e.g., bit-level errors) introduced into the data stream by the network, and

·       Regulate the flow of data across the TCP connection (flow control), so that a sender does not overrun the receiver; this also helps limit network congestion, which often results in packet loss.
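
The error-detection duty in the list above rests on the 16-bit checksum carried in every TCP header. Here is a minimal Python sketch of that one's-complement sum. The names `internet_checksum` and `verifies` are mine, and the sketch leaves out the pseudo-header (addresses, protocol, length) that a real TCP also covers, so treat it as an illustration rather than a conforming implementation.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def verifies(data: bytes, checksum: int) -> bool:
    """A receiver recomputes the checksum and compares it with the one sent."""
    return internet_checksum(data) == checksum


payload = b"hello, tcp"
sent = internet_checksum(payload)

corrupted = b"hellp, tcp"                                # one changed byte
reordered = payload[2:4] + payload[0:2] + payload[4:]    # two 16-bit words swapped
```

In this sketch `verifies(corrupted, sent)` is false, but `verifies(reordered, sent)` is true: the sum does not depend on word order, which is one concrete reason the checksum catches most, but not all, errors.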

 

TCP has some additional features that help certain applications perform well:


·       Push, which allows a sending application to signal to both sending and receiving TCP processes (hereafter simply called “TCP”) that the data in this send call must be delivered immediately to the receiving application process. In the absence of “push,” TCP waits (up to a configurable timeout) to fill a segment before actually sending the data.

·       Urgent data, an interrupt data service whereby a sending application process may request that data marked “urgent” be processed quickly by the receiving upper-layer protocol process. Note that Urgent is a signal from the sending TCP process to the receiving TCP process: it’s not an application-to-application signal.

 

TCP treats a data transfer as a continuous stream of bytes (octets), delimited into segments.

All TCP segments have the same format, illustrated below:

 

Figure 1: TCP Header Format

                                   

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |           |U|A|P|R|S|F|                               |
   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
   |       |           |G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Data Segment                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
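
To make the layout in Figure 1 concrete, here is a short Python sketch that unpacks the fixed 20-byte part of the header. The `TCPHeader` type, the `parse_tcp_header` function, and the sample port and sequence values are illustrative choices of mine; options and data are ignored.

```python
import struct
from typing import NamedTuple


class TCPHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    data_offset: int      # header length in 32-bit words
    flags: dict
    window: int
    checksum: int
    urgent_pointer: int


# Bit positions of the six control flags shown in Figure 1.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment: bytes) -> TCPHeader:
    """Unpack the fixed 20-byte TCP header (fields are in network byte order)."""
    src, dst, seq, ack, off_flags, window, checksum, urg = struct.unpack(
        "!HHIIHHHH", segment[:20])
    data_offset = off_flags >> 12                 # top 4 bits of the 16-bit word
    flags = {name: bool(off_flags & bit) for name, bit in FLAG_BITS.items()}
    return TCPHeader(src, dst, seq, ack, data_offset, flags,
                     window, checksum, urg)


# A hand-built SYN segment: port 49152 to port 21, sequence number 1000.
syn = struct.pack("!HHIIHHHH", 49152, 21, 1000, 0, (5 << 12) | 0x02, 8192, 0, 0)
header = parse_tcp_header(syn)
```

A data offset of 5 means the header is five 32-bit words (20 bytes) long, that is, it carries no options.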

 

Ports, Sockets, Connections

Let’s consider Internet addressing for a moment, and how it affects TCP. IPv4 addresses are 32-bit numbers that uniquely identify a network interface of a host on an IP network. Port numbers commonly identify applications. RFC 793 explains that a port, paired with an IP address, forms a TCP connection endpoint identifier, also known as a socket address. Sockets are the two endpoints of the connection. Applications typically reach sockets through a sockets API, which lets them communicate with the network stack at some exposed layer.

 

Internet port number assignment commonly follows a client/server paradigm. Hosts that support Internet services such as FTP, DNS, HTTP, and SSL listen on well-known port numbers, 16-bit values permanently assigned to identify a registered Internet application. Originally, these values were documented in an Internet standard called Assigned Numbers; they are now maintained in an online registry. A client application associates with, or binds to, a TCP port number that is typically allocated from the unused and unassigned (ephemeral) ports.

 

When a client application opens a connection to a server application, the socket addresses involved are {client’s IP address, client’s ephemeral TCP port number} and {server’s IP address, well-known TCP port number for desired Internet service}. The client will always send to the well-known destination port, but the server may create a new socket for the accepted connection so that it can listen (again) for another incoming connection. This is where the difference between the socket address and the socket is easy to miss. Look at the FTP/TCP dialog and you’ll see that the Destination Port remains the same for all messages from the client to the server.
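
The two socket addresses can be observed with the Python sockets API. The sketch below runs both ends on the loopback interface and binds the server to port 0 (an OS-chosen free port) instead of a well-known port, since binding to well-known ports usually needs special privileges; those choices are mine, made only to keep the example self-contained.

```python
import socket

# Server: bind to an OS-chosen port on loopback and listen.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))        # port 0 asks the OS for any free port
listener.listen(1)
server_addr = listener.getsockname()   # stands in for {server IP, well-known port}

# Client: connecting lets the OS pick an ephemeral source port.
client = socket.create_connection(server_addr)

# accept() returns a NEW socket for this connection; the listener keeps listening.
conn, peer_addr = listener.accept()

client_local = client.getsockname()    # {client IP, ephemeral port}
client_remote = client.getpeername()   # {server IP, server port}
conn_local = conn.getsockname()        # same server port as the listener

for s in (client, conn, listener):
    s.close()
```

Note that `conn` is a different socket object from `listener`, yet `conn_local` carries the same port number: the new socket does not change the destination port the client sends to.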

 

TCP has three phases of operation: connection establishment, data transfer, and connection release. I’ve posted a LAN packet analysis of an FTP session I captured so the masochists among you can follow along in bit-level detail; the FTP application uses TCP for reliable delivery.

 

TCP Connection Establishment

TCP operates as a pair of independent byte streams of data between upper-layer protocols. The Synchronize stream (SYN) establishes the beginning of the byte stream in each direction of information flow. TCP segments involved in connection establishment have the SYN flag set to one (1). During Synchronize, TCPs encode the following information in a TCP SYN header:

 

 

A responding TCP acknowledges receipt of the SYN segment and commonly synchronizes a byte stream in the return (responder-to-initiator) direction at the same time, by generating a single TCP segment with both the SYN and ACK flags set to one (1). This process, called piggybacking, improves protocol efficiency.

 

When composing a SYN/ACK segment, the responding TCP:

·       Sets the acknowledgment number to the value of the next sequence number the responder expects to receive (in this case, the ISN+1);

 

TCP’s Three-Way Handshake

When the initiating TCP receives the SYN/ACK segment, it knows the SYN it sent was delivered. With these two messages alone, however, the responding TCP can’t know for certain that the SYN/ACK segment was delivered.  TCP uses an additional message from the initiator – a TCP segment with the ACK flag set – to confirm that the data stream has been synchronized in the responder-to-initiator direction. This message sequence is referred to as a three-way handshake.
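
The sequence and acknowledgment numbers exchanged in the three-way handshake can be traced with a small Python model. This is a sketch on paper, not a network program: the dictionary layout, the function name `three_way_handshake`, and the sample ISN values are all assumptions of mine.

```python
MOD = 2 ** 32   # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments as simple dicts."""
    syn = {"flags": {"SYN"}, "seq": client_isn}
    syn_ack = {"flags": {"SYN", "ACK"},
               "seq": server_isn,
               "ack": (syn["seq"] + 1) % MOD}       # the SYN consumes one sequence number
    ack = {"flags": {"ACK"},
           "seq": (client_isn + 1) % MOD,
           "ack": (syn_ack["seq"] + 1) % MOD}       # confirms the responder's SYN
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
for segment in segments:
    print(sorted(segment["flags"]), segment["seq"], segment.get("ack"))
```

Each side acknowledges the other's ISN plus one, so after the third segment both directions of the byte stream are synchronized.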

 

Any TCP segment having the ACK flag set may also contain application data. For example, if the responder indicated a non-zero initial Window in the SYN/ACK segment, the initiator can piggyback up to the responder’s initial window number of bytes of data in the ACK segment that completes the handshake.

TCP Data Transfer

In non-loss situations, an application submits data to the local, sending TCP. The sending TCP sends those data “at its convenience” (seriously, that’s what RFC 793 says . . .). Typically, the sender attempts to fill a maximum segment size (MSS) packet before sending (unless a PUSH is invoked).

 

TCP uses the 32-bit sequence number to reconstruct the application data stream. The sequence number identifies the relative position of the first byte contained in a TCP segment, and hence the position of the data in this segment, with respect to other data segments of the application stream. The push flag, if set to one (1), signals the communicating TCPs to immediately deliver all data processed prior to and including the segment containing the push to the application (basically, push overrides TCP’s attempt to fill a maximum-segment-sized packet before sending).

 

When piggybacking is used, TCP must also attend to data being transmitted in the return direction. The acknowledgment flag, if set to one (1), indicates that the acknowledgment sequence number and window are significant, and must be used to process data segments of the return stream. Here, the TCP acknowledgment number identifies the next expected byte in the data stream, and the 16-bit TCP window indicates the amount of data the receiver is willing to accept in the future TCP segment(s); this value is added to the acknowledgment sequence number to determine the send window.
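
The send-window arithmetic described above is short enough to write out. In this Python sketch the function names `send_window` and `may_send` are mine; the modulo arithmetic reflects the fact that sequence numbers are 32-bit values that wrap around.

```python
MOD = 2 ** 32


def send_window(ack_number: int, window: int) -> tuple:
    """Return (first, limit): bytes numbered first .. limit-1 may be sent."""
    return ack_number, (ack_number + window) % MOD


def may_send(seq: int, ack_number: int, window: int) -> bool:
    """True if byte `seq` lies inside the window, allowing for wraparound."""
    return (seq - ack_number) % MOD < window


first, limit = send_window(ack_number=1001, window=4096)
```

With an acknowledgment number of 1001 and a window of 4096, the sender may transmit bytes 1001 through 5096; byte 5097 must wait for a later acknowledgment to move the window forward.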

 

To detect and recover from packet loss, duplication, and modification during its transfer across an IP network, TCP relies on a mechanism called positive acknowledgment and retransmission upon timeout.

A sending TCP runs a separate retransmission timer for each TCP data segment. If the retransmission timer expires and no acknowledgment packet has arrived indicating successful delivery of the segment to the receiver, TCP assumes the data segment was lost, arrived corrupted (failed checksum), or was delivered to the wrong address. In such cases, TCP resends the data segment and restarts the retransmission timer for this segment. RFC 793 suggests two resend strategies: if the retransmission timer expires, the sending TCP may resend the unacknowledged segment (first-only retransmission), or it may resend all the data segments on the retransmission queue (batch retransmission).
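
The two resend strategies can be compared with a toy retransmission queue. This Python sketch is a simplification under stated assumptions: the `Unacked` record, the function names, and the timer values are hypothetical, only the oldest segment's timer is checked, and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass
class Unacked:
    seq: int          # sequence number of the first byte in the segment
    data: bytes
    deadline: float   # time at which this segment's retransmission timer expires


def on_ack(queue: list, ack_number: int) -> list:
    """Drop segments whose bytes all lie below the cumulative acknowledgment."""
    return [s for s in queue if s.seq + len(s.data) > ack_number]


def to_resend(queue: list, now: float, strategy: str = "first-only") -> list:
    """Pick the segments to resend once the oldest segment's timer has expired."""
    if not queue or queue[0].deadline > now:
        return []                       # nothing outstanding, or timer still running
    if strategy == "first-only":
        return queue[:1]
    if strategy == "batch":
        return list(queue)
    raise ValueError(f"unknown strategy: {strategy}")


queue = [Unacked(1001, b"a" * 100, deadline=3.0),
         Unacked(1101, b"b" * 100, deadline=3.1),
         Unacked(1201, b"c" * 100, deadline=3.2)]
queue = on_ack(queue, 1101)             # the first segment has been acknowledged

first_only = [s.seq for s in to_resend(queue, now=3.5)]
batch = [s.seq for s in to_resend(queue, now=3.5, strategy="batch")]
```

After the timer expires, the first-only strategy resends one segment (1101) while the batch strategy resends everything still queued (1101 and 1201).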

 

The receiving TCP may apply one of two acceptance strategies. If an in-order data-acceptance strategy is used, the receiving TCP accepts only data that arrive in byte-sequence order and discards all other data. The receiving TCP returns an acknowledgment to the sender and makes the byte stream available to the upper-layer protocol process as it arrives. If an in-window data-acceptance strategy is employed, the receiving TCP maintains segments containing bytes that arrive out of order separately from those that have arrived in order and examines newly arrived data segments to determine whether the next expected byte in the ordered stream of bytes has arrived. If so, the receiving TCP adds this segment’s worth of bytes to the end of the byte stream that had previously arrived in order and looks at the out-of-order stream to see whether additional bytes may now be appended to the end of the in-order stream. The TCP returns an acknowledgment and makes the accumulated stream of in-order bytes available to the application.
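
The in-window acceptance strategy amounts to a small reassembly buffer. The Python sketch below is one possible shape for it, not how any particular TCP does it: the `Reassembler` class is hypothetical, it does not enforce the window limit, it ignores sequence-number wraparound, and it assumes parked segments line up exactly with the next expected byte.

```python
class Reassembler:
    """In-window acceptance: hold out-of-order segments until the gap is filled."""

    def __init__(self, next_expected: int):
        self.next_expected = next_expected   # next in-order byte (the ACK number)
        self.out_of_order = {}               # seq -> data, kept apart from in-order data
        self.delivered = bytearray()         # in-order bytes available to the application

    def receive(self, seq: int, data: bytes) -> int:
        """Process one segment and return the acknowledgment number to send."""
        if seq > self.next_expected:
            self.out_of_order[seq] = data    # arrived early: park it
        elif seq + len(data) > self.next_expected:
            # Segment starts at (or overlaps) the next expected byte.
            self._append(data[self.next_expected - seq:])
            # Drain parked segments that have now become contiguous.
            while self.next_expected in self.out_of_order:
                self._append(self.out_of_order.pop(self.next_expected))
        # Otherwise the segment is a pure duplicate: nothing new to deliver.
        return self.next_expected

    def _append(self, data: bytes) -> None:
        self.delivered += data
        self.next_expected += len(data)


receiver = Reassembler(next_expected=1001)
ack_after_early = receiver.receive(1006, b"world")   # out of order: ACK stays at 1001
ack_after_fill = receiver.receive(1001, b"hello")    # gap filled: ACK jumps to 1011
```

An in-order acceptance strategy would be the same class without the `out_of_order` dictionary: the early segment would simply be discarded and would have to be retransmitted.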

 

An explicit acknowledgment is returned in a TCP segment (potentially “piggybacked” with data flowing in the opposite direction). The acknowledgment sequence number X indicates that all bytes up to but not including X have been received, and the next byte expected is at sequence number X. The segment window indicates the number of bytes the receiver is willing to accept (beginning with sequence number X).

Acknowledgment packets reflect only what has been received in sequence; they do not acknowledge data packets that arrived successfully but out of sequence.

 

When an acknowledgment packet arrives, the sending TCP may choose to resend all unacknowledged data from sequence number X up to the maximum permitted by the segment window. In theory, this “batch” retransmission strategy results in more traffic but possibly less delay. Alternatively, the sending TCP may resend only the data segment containing the first unacknowledged byte. This negates the benefit of a large window and may increase delay, but it is preferred because it introduces less traffic into the network. Batch retransmission strategies are generally regarded as bad ideas, since their excessive retransmission of segments is likely to contribute to network congestion.

 

Connection Release (Refusal) in TCP

 

TCP offers two forms of connection release: graceful and abrupt.

 

Graceful close in TCP is an orderly shutdown process, requiring that all data transmitted in both directions be acknowledged before the connection may be closed. When an application has finished sending data and wishes to close the TCP connection, its local TCP sends a TCP segment with the FIN flag set to one (1). The sequence number is set to the value of the last byte transmitted. The receiving TCP must acknowledge receipt of the last byte but is not required to close its half of the connection; it may continue to transfer data. The initiator of the FIN segment must dutifully acknowledge all data received until it receives a TCP segment with the FIN and ACK flags set to one (1) and the acknowledgment set to the sequence number of the last byte received from the FIN segment initiator. Upon receiving the FIN/ACK segment, the FIN initiator returns an ACK segment, completing a three-way “good-bye” handshake.
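
The "half-closed" state described above is visible from the sockets API. In this Python sketch, run over the loopback interface, the client finishes sending with `shutdown(SHUT_WR)` and the server keeps transferring data afterward. The helper `read_until_eof` and the message contents are mine, and it is the operating system's TCP, not this code, that actually emits the FIN and ACK segments.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
client = socket.create_connection(listener.getsockname())
server, _ = listener.accept()


def read_until_eof(sock: socket.socket) -> bytes:
    """Collect bytes until the peer closes its sending direction."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:                    # empty read: the peer has finished sending
            return b"".join(chunks)
        chunks.append(chunk)


client.sendall(b"request")
client.shutdown(socket.SHUT_WR)          # client is done sending but can still receive

request = read_until_eof(server)         # server sees the end of the client's stream
server.sendall(b"late reply")            # the server's half is still open
server.close()                           # now the server closes its half too

reply = read_until_eof(client)
client.close()
listener.close()
```

The client still receives `b"late reply"` after it has stopped sending, which is the behavior the graceful close is designed to permit.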

 

Abrupt release indicates that something seriously wrong has occurred. TCP composes a segment with the RST flag set to one (1), returns this to the peer for this TCP stream, and shuts down.

 

Connection refusal occurs in TCP when the responding TCP cannot establish a TCP connection or when the SYN packet received is in error. To refuse a TCP connection, the called TCP sets the RST and ACK flags to one (1), and sets the acknowledgment sequence number to the initiator’s ISN +1.
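
From an application's point of view, a refused connection typically surfaces as an error from the connect call. The Python sketch below tries to connect to a loopback port on which nothing is listening. The helper names are mine, and the outcome assumes the local TCP answers the SYN with a reset and that no firewall silently drops it and no other process grabs the port in between.

```python
import socket


def find_unused_port() -> int:
    """Ask the OS for a free port, then release it so nothing is listening there."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]


def try_connect(port: int) -> str:
    """Attempt a TCP connection to the given loopback port."""
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=5):
            return "connected"
    except ConnectionRefusedError:
        return "refused"                 # the peer's TCP answered the SYN with a reset


outcome = try_connect(find_unused_port())
```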

 

Conclusions and Additional Reading

Much of the success of the Internet can be attributed to TCP (with no slight intended to its companion network protocol, IP). A large part of the reason TCP has been so successful is that a steady stream of incredibly smart people has used TCP as the basis for research into efficient communications. During the course of 20+ years of research, TCP has been used for reliable delivery over everything from amateur radio frequencies and water droplets to multi-gigabit optical links. As a result of this extensive and elaborate work, the original RFC is complemented by extensions that have made TCP a remarkably flexible and adaptable transport protocol. Early work by David Clark (Silly Window Syndrome, RFC 813) and landmark work by Van Jacobson (Congestion Avoidance and Control) improved TCP’s efficiency. Subsequent standards work culminated in extensions for selective acknowledgment, MTU discovery, and window management. TCP remains under scrutiny by the research community.

 

I can only scratch the surface of the features of TCP here. You might find Chapter 12 of my book, Open Systems Networking (out of print, available for download), helpful, along with dozens of equally good books on TCP/IP.



[1] Lyman Chapin made the following observation during his review: “The TCP checksum detects most (but not all) errors, and the correction is retransmission, which means that some underlying network failures or defects can’t be corrected by TCP (for example, a link-level driver that always fails when it receives a particular string of 0 and 1 bits). There’s no forward error correction or other mechanism for recovering from failures that can’t be overcome by simple retransmission.”