When you drop a letter into a slot at a post office, you are committing an act of faith: you hope your letter will be delivered to its destination. When a Linux application uses the User Datagram Protocol (UDP), it is committing a very similar act.
When C function names appear in the text, the parenthesized number indicates the manual section in which the function is described. This is consistent with standard Unix documentation. For example, read(2) indicates that you can use the command
man 2 read
to read the online man page that describes the function.
Unlike TCP, which requires a structured handshake to open a connection and system resources to keep the connection open, UDP just sends data packets, one at a time, with no preliminaries, postliminaries, or fanfare of any kind — not even any confirmation that a data packet has actually been sent, let alone received. Much of TCP’s other complexity (such as the slow-start algorithm, Nagle packet-stuffing algorithm, and lost-packet recovery algorithm) is likewise absent from UDP.
The absence of an error-recovery capability makes UDP the simplest of all the TCP/IP protocols. But, as always, simplicity has its price, and in this case, the price is paid by the application. Under the TCP protocol, “the system” handles recoveries from packet losses; with UDP’s no-frills service, the application has to do this particular dirty job itself.
Or not. Some applications can afford to lose packets with little ill effect; one such application is Voice-Over-IP (VoIP) — the loss of a packet will cause a “hiccup” in the audio stream, but as long as there are not too many of them the human loses little understanding.
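For an application that cannot shrug off a lost packet, the usual do-it-yourself recovery is a timeout followed by a resend. The following is a minimal sketch of that idea, not a complete protocol: the function names, the retry count, and the timeout are all illustrative assumptions, and a real application would also tag each request so that a late reply is not mistaken for the answer to a newer request.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send one request and wait for one reply, resending
 * the request each time the wait times out. Returns the reply length, or
 * -1 if every attempt failed or timed out.
 */
ssize_t udp_request(int fd, const struct sockaddr *dst, socklen_t dstlen,
                    const void *req, size_t reqlen,
                    void *reply, size_t replycap,
                    int max_tries, int timeout_ms)
{
    for (int attempt = 0; attempt < max_tries; attempt++) {
        /* UDP never retransmits, so the application resends the request. */
        if (sendto(fd, req, reqlen, 0, dst, dstlen) < 0)
            return -1;

        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int ready = poll(&pfd, 1, timeout_ms);
        if (ready < 0)
            return -1;              /* poll(2) itself failed */
        if (ready > 0)
            return recv(fd, reply, replycap, 0);
        /* ready == 0: timed out; treat it as a lost packet and try again */
    }
    return -1;                      /* gave up after max_tries attempts */
}

/*
 * Demonstration helper: a UDP socket bound to an unused loopback port.
 * The chosen address is stored in *addr. Returns the descriptor or -1.
 */
int udp_loopback_socket(struct sockaddr_in *addr)
{
    socklen_t len = sizeof *addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;             /* let the kernel choose a port */
    if (bind(fd, (struct sockaddr *)addr, sizeof *addr) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A quick way to try this without a server is to make the socket talk to itself: create it with udp_loopback_socket() and pass the returned address as the destination, so that the request comes straight back as the "reply". Pointing the request at a bound socket that never answers exercises the timeout path instead.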
If the user application is like an F-14 fighter jet, then sending a UDP message is like firing a Phoenix missile: point, shoot, and forget. The application puts together the data package, inserts the target IP address, and calls the appropriate outgoing-message function. The system returns a status telling the application only that the message has been accepted for transmission; it says nothing about how far the packet actually gets. If the packet reaches its destination, fine. If it doesn’t, the application (not the protocol) gets to decide what to do about it.
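In C, the point-shoot-forget sequence comes down to socket(2) and sendto(2). The sketch below assumes IPv4 and a dotted-quad destination; the function name udp_fire is made up for illustration. The return value reports only how many bytes the local system accepted, which is all the feedback UDP offers.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: fire one datagram at ip:port and forget about it.
 * Returns the number of bytes handed to the kernel, or -1 on error.
 */
ssize_t udp_fire(const char *ip, unsigned short port,
                 const void *data, size_t len)
{
    struct sockaddr_in dst;

    /* Insert the target address and port, in network byte order. */
    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -1;                  /* not a valid dotted-quad address */

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* No connection setup: the first call already launches the packet. */
    ssize_t sent = sendto(fd, data, len, 0,
                          (struct sockaddr *)&dst, sizeof dst);
    close(fd);
    return sent;
}
```

Calling udp_fire("127.0.0.1", 9, "hello", 5) on a Linux system normally reports success whether or not anything is listening on that port, which is exactly the act of faith described above.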
For applications, receiving a UDP packet is just a matter of waiting for a missile to land in the backyard. Applications don’t anticipate what the missile may be carrying, and deal only with missiles that hit the target. (Unlike real artillery projectiles, stray UDP “missiles” never cause collateral damage. They just nosedive into a bit-bucket.)
“Smart” missiles depend on their computer brains, and UDP depends on the “rocket science” of the Internet Protocol (IP) in general — and packet routing in particular — to deliver the message payloads as best it can. This means that the support code for UDP transmission is limited to preparing the UDP header and launching the data packets. As you’ll see in the associated commentary, UDP reception is somewhat trickier, because a receiving application has to call the bind(2) function as a prerequisite to catching the packets in its back yard.
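The receiving side can be sketched in a few lines as well. This example assumes IPv4; udp_listen and udp_catch are illustrative names, not functions from any library. The call to bind(2) is what claims a port, so that arriving packets have a backyard to land in, and recvfrom(2) then reports who fired each one.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a UDP socket bound to the given port on all
 * local addresses. Port 0 asks the kernel to pick a free port.
 * Returns the descriptor, or -1 on error.
 */
int udp_listen(unsigned short port)
{
    struct sockaddr_in local;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(port);

    /* Without bind(2), datagrams for this port have nowhere to land. */
    if (bind(fd, (struct sockaddr *)&local, sizeof local) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: block until one datagram arrives, copy up to cap
 * bytes into buf, and record the sender's address in *from.
 * Returns the number of bytes received, or -1 on error.
 */
ssize_t udp_catch(int fd, void *buf, size_t cap, struct sockaddr_in *from)
{
    socklen_t fromlen = sizeof *from;
    return recvfrom(fd, buf, cap, 0, (struct sockaddr *)from, &fromlen);
}
```

The address stored in *from is the same source address and port described for the packet header, so a service can answer by passing it straight back to sendto(2).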
UDP is used primarily for activities that require only infrequent interaction, where “infrequent” means that the interactions take place minutes, hours, or even days apart. One very common use of UDP transmissions is in the Domain Name System (DNS), which uses a distributed database and distributed servers to associate a human-friendly name (such as www.linux.org) with an IP address (such as 18.104.22.168 — the associated IP address at the time this chapter is being written). Using a library routine in a Linux system, an application discovers a name binding by sending a UDP packet to a DNS server and then waiting for a reply that contains the desired information. For example, a File Transfer Protocol (FTP) client may use UDP to send a human-readable address to a DNS server and get back an IP address. An FTP server may also use UDP to send an IP address to the DNS server and retrieve the human-readable address. Many contemporary systems use this technique (lookups in the in-addr.arpa. or ip6.arpa. domain) as a security tool.
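On a current Linux system, the library routines for these lookups are getaddrinfo(3) and getnameinfo(3). The sketch below shows a forward and a reverse lookup; the wrapper names are illustrative assumptions. The resolver decides for itself where the answer comes from (local files such as /etc/hosts, a cache, or a DNS server), so the application never sees the UDP exchange directly.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper, forward lookup: host name to the text form of its
 * first IPv4 address. Returns 0 on success, -1 on failure.
 */
int name_to_ipv4(const char *name, char *out, socklen_t outlen)
{
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_DGRAM;
    if (getaddrinfo(name, NULL, &hints, &res) != 0)
        return -1;

    /* NI_NUMERICHOST: format the address as text, no further lookup. */
    int rc = getnameinfo(res->ai_addr, res->ai_addrlen,
                         out, outlen, NULL, 0, NI_NUMERICHOST);
    freeaddrinfo(res);
    return rc == 0 ? 0 : -1;
}

/*
 * Hypothetical helper, reverse lookup: IPv4 address text to host name.
 * Returns 0 on success, -1 if the text is not an address or no name exists.
 */
int ipv4_to_name(const char *ip, char *out, socklen_t outlen)
{
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;
    hints.ai_flags = AI_NUMERICHOST;    /* parse only; no forward query */
    if (getaddrinfo(ip, NULL, &hints, &res) != 0)
        return -1;

    /* NI_NAMEREQD: fail rather than echo the numeric address back. */
    int rc = getnameinfo(res->ai_addr, res->ai_addrlen,
                         out, outlen, NULL, 0, NI_NAMEREQD);
    freeaddrinfo(res);
    return rc == 0 ? 0 : -1;
}
```

An FTP server that wants to log a client by name would use something like ipv4_to_name(); whether that call succeeds depends entirely on what the reverse zone for the address contains.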
UDP is the only protocol that’s used when information is sent to two or more destination computers, regardless of whether the recipients are limited to a group of registered computers (multicasting) or consist of every computer on a network (broadcasting).
Because of its simplicity, UDP has also found favor with TCP/IP application protocols such as Trivial File Transfer Protocol (TFTP), Bootstrap Protocol (BOOTP), and Dynamic Host Configuration Protocol (DHCP), which are all used at computer startup. In fact, BOOTP and DHCP must use UDP, because they rely on the broadcast mechanism to reach their respective servers or relay agents, which provide essential system-initialization information.
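Linux makes an application ask for permission before it broadcasts: the socket option SO_BROADCAST has to be switched on first. The following sketch assumes IPv4 and the limited broadcast address 255.255.255.255; the function names are illustrative only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a UDP socket that is allowed to broadcast.
 * Returns the descriptor, or -1 on error.
 */
int udp_broadcast_socket(void)
{
    int on = 1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* Without SO_BROADCAST, sending to a broadcast address is refused. */
    if (setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof on) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: send one datagram to every host on the local
 * network at the given port. Returns bytes sent, or -1 on error.
 */
ssize_t udp_broadcast(int fd, unsigned short port,
                      const void *data, size_t len)
{
    struct sockaddr_in dst;

    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    dst.sin_addr.s_addr = htonl(INADDR_BROADCAST);  /* 255.255.255.255 */
    return sendto(fd, data, len, 0, (struct sockaddr *)&dst, sizeof dst);
}
```

A DHCP client, for instance, would aim such a datagram at UDP port 67, where DHCP servers and relay agents listen. Whether the send succeeds depends on the host having a usable network interface and route.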
However, these three application-bootstrap protocols aren’t always loaded from a hard disk drive. Sometimes, they’re implemented in a computer’s built-in read-only memory (ROM). This is the arrangement that provides information for thin clients — network computers that don’t have their own individual hard drives. DHCP lets such a computer discover its IP address; BOOTP tells the computer where to find a load image; and TFTP actually transfers the load image to the computer. After the load image is installed, other TCP protocols can be used to finish the loading process. Note also that many of the related routing-support IP functions don’t apply to the bootstrap applications; therefore, a ROM-based implementation of IP and UDP can be very simple indeed.
UDP is also used in the original Network File System (NFS) implementations, which date from the distant past of networking, now hardly remembered, when UDP transfers were faster than TCP transfers and had lower system overhead. True, older systems tended to transmit smaller blocks of data (usually 512 bytes) in each read or write operation, and the data transfers took place across comparatively small high-speed communications circuits (LANs and campus WANs). But UDP’s packet-exchange model also fits well with the Remote Procedure Call (RPC) protocol (still used within NFS), which lets one computer execute a function call requested by another, remote, computer.
Finally, UDP is used extensively by some routers to exchange information about the ever-mutating topology of the Internet. When such a router or a host wants to tell its neighbors about changes in the Internet, it employs the Routing Information Protocol (RIP), which in turn uses UDP to transmit updated routing-table information.
However, not all router protocols use UDP. Here are three that don’t:
Open Shortest Path First (OSPF): Has its own IP protocol number (89) and runs directly over IP, using neither TCP nor UDP.
Border Gateway Protocol (BGP): Uses TCP (port 179), not only because the amount of information BGP needs to transfer is greater than UDP can easily carry, but also because BGP-based routers sometimes want to use the TCP keep-alive feature to tell when a data link has dropped.
Intermediate System to Intermediate System (IS-IS): Belongs to the ISO protocol family, which is a different family altogether. It uses IP protocol number 124.
In summary, although the use of UDP is far from universal, this protocol is very handy for systems in which only occasional exchanges are required, and when the implementation has to be compact. UDP is also used in broadcast-based applications and in some router-to-router communications.
The UDP packet format is shown in Figure 1.
The definitions of the fields in the UDP packet are:
IP header: the UDP packet is encapsulated in an IP packet, so the packet starts with the IP header — see Chapter 7 for details.
Source port: the source port — selected more-or-less at random by the originating system — tells the receiving service where to send any replies.
Destination port: a host, identified by an IP address in the IP header, can have a number of UDP services operating. The 16-bit destination port identifies which service should receive this packet. Table 1 shows some of the well-known service destination ports; a complete list is found at /etc/services on any modern Unix/Linux system.
Table 1 — Sample of UDP service ports
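Length: the length, in octets, of the UDP portion of the packet (UDP header plus payload); in the diagram, from h+0 to n.
Checksum: the checksum for the UDP portion of the packet, covering the number of octets given in the Length field. The calculation also includes a pseudo-header made up of the source address, destination address, and protocol number from the IP header, plus the UDP length.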
UDP Payload: zero or more octets of data to be delivered.
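The fields above can be sketched in code. The example below lays out the four 16-bit header fields in network byte order and computes the checksum as RFC 768 describes it: a 16-bit one's-complement sum over a pseudo-header, the UDP header, and the payload. It is an illustration of the packet format under stated assumptions, not the Linux implementation. The function names are hypothetical, IPv4 is assumed, and addresses are passed as 32-bit values in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store 16-bit and 32-bit values in network (big-endian) byte order. */
static void put16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xFF);
}

static void put32(uint8_t *p, uint32_t v)
{
    put16(p, (uint16_t)(v >> 16));
    put16(p + 2, (uint16_t)(v & 0xFFFF));
}

/* Add len bytes to acc as big-endian 16-bit words; pad an odd byte with 0. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t acc)
{
    while (len > 1) {
        acc += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len)
        acc += (uint32_t)p[0] << 8;
    return acc;
}

/* Folded one's-complement sum over pseudo-header, UDP header, and payload. */
static uint16_t udp_sum(const uint8_t hdr[8], uint32_t src_ip, uint32_t dst_ip,
                        const uint8_t *payload, uint16_t payload_len)
{
    uint8_t pseudo[12];
    put32(pseudo, src_ip);
    put32(pseudo + 4, dst_ip);
    pseudo[8] = 0;
    pseudo[9] = 17;                             /* IP protocol number of UDP */
    put16(pseudo + 10, (uint16_t)(8 + payload_len));

    uint32_t acc = sum16(pseudo, sizeof pseudo, 0);
    acc = sum16(hdr, 8, acc);
    acc = sum16(payload, payload_len, acc);
    while (acc >> 16)                           /* fold carries back in */
        acc = (acc & 0xFFFF) + (acc >> 16);
    return (uint16_t)acc;
}

/* Fill in the 8-byte UDP header, checksum included. */
void udp_build_header(uint8_t hdr[8], uint32_t src_ip, uint32_t dst_ip,
                      uint16_t src_port, uint16_t dst_port,
                      const uint8_t *payload, uint16_t payload_len)
{
    assert(payload_len <= 65535 - 8);           /* Length is only 16 bits */
    put16(hdr + 0, src_port);
    put16(hdr + 2, dst_port);
    put16(hdr + 4, (uint16_t)(8 + payload_len)); /* header plus payload */
    put16(hdr + 6, 0);                          /* zero while summing */

    uint16_t csum = (uint16_t)~udp_sum(hdr, src_ip, dst_ip, payload, payload_len);
    if (csum == 0)
        csum = 0xFFFF;                          /* 0 means "no checksum" */
    put16(hdr + 6, csum);
}

/* Returns 1 if the checksum stored in hdr matches the data, 0 otherwise. */
int udp_checksum_ok(const uint8_t hdr[8], uint32_t src_ip, uint32_t dst_ip,
                    const uint8_t *payload, uint16_t payload_len)
{
    return udp_sum(hdr, src_ip, dst_ip, payload, payload_len) == 0xFFFF;
}
```

A receiver runs the same sum over the packet with the checksum field left in place; an intact packet always folds to 0xFFFF, which is what udp_checksum_ok() tests. For a two-octet payload, the Length field written by udp_build_header() is 10: eight octets of header plus two of data.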
Because of UDP’s bare-bones nature, the amount of code devoted to it in Linux is comparatively small. From start to finish, not including the utilities common to all TCP/IP applications, it spans only about 1,150 lines. [Ed. Note: to be confirmed for 2022]
Suggestions and error reports are welcome.
Send them to:
ipstacks (at) satchell (dot) net
Copyright © 2022 Stephen Satchell, Reno NV USA