Configurable Options

We've seen many features of TCP/IP that we've had to describe with the qualifier "it depends on the configuration." Typical examples are whether or not UDP checksums are enabled (Section 11.3), whether destination IP addresses with the same network ID but a different subnet ID are local or nonlocal (Section 18.4), and whether directed broadcasts are forwarded or not (Section 12.3). Indeed, many operating characteristics of a given TCP/IP implementation can be modified by the system administrator.

This appendix lists some of the configurable options for the various TCP/IP implementations that have been used throughout the text. As you might expect, every vendor does things differently from all others. Nevertheless, this appendix gives an idea of the types of parameters different implementations allow one to modify. A few options that are highly implementation specific, such as the low-water mark for the memory buffer pool, are not described.

These variables are described for informational purposes only. Their names, default values, or interpretation can change from one release to the next. Always check your vendor's documentation (or bug them for adequate documentation) for the final word on these variables.

This appendix does not cover the initialization that takes place every time the system is bootstrapped: the initialization of each network interface using ifconfig (setting the IP address, the subnet mask, etc.), entering static routes into the routing table, and the like. Instead, this appendix focuses on the configuration options that affect how TCP/IP operates.

E.1 BSD/386 Version 1.0

This system is an example of the "classical" BSD configuration that has been used since 4.2BSD. Since the source code is distributed with the system, configuration options are specified by the administrator, and the kernel is recompiled. There are two types of options: constants that are defined in the kernel configuration file (see the config(8) manual page), and variable initializations in various C source files. Brave and knowledgeable administrators can also change the values of these C variables in either the running kernel or the kernel's disk image, using a debugger, to avoid rebuilding the kernel. Here are the constants that can be changed in the kernel's configuration file.

IPFORWARDING
The value of this constant initializes the kernel variable ipforwarding. If 0 (default), IP datagrams are not forwarded. If 1, forwarding is always enabled.

GATEWAY
If defined, causes IPFORWARDING to be set to 1. Additionally, defining this constant causes certain system tables (the ARP cache and the routing table) to be larger.

SUBNETSARELOCAL
The value of this constant initializes the kernel variable subnetsarelocal. If 1 (default), a destination IP address with the same network ID as the sending host but a different subnet ID is considered local. If 0, only destination IP addresses on an attached subnet are considered local. This is summarized in Figure E.1.

Network IDs   Subnet IDs   subnetsarelocal = 1   subnetsarelocal = 0   Comment
same          same         local                 local                 always local
same          different    local                 nonlocal              depends on configuration
different     -            nonlocal              nonlocal              always nonlocal

Figure E.1 Interpretation of subnetsarelocal kernel variable.

This affects the MSS selected by TCP. When sending to local destinations, TCP chooses the MSS based on the MTU of the outgoing interface. When sending to nonlocal destinations, TCP uses the variable tcp_mssdflt as the MSS.
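
The decision in Figure E.1 and its effect on the MSS can be sketched in a few lines of Python. This is only an illustration, not kernel code: the function names are hypothetical, the network ID is taken from the classful (class A, B, or C) prefix of the interface address, and the MSS for a local destination is assumed to be the interface MTU minus 40 bytes (a 20-byte IP header and a 20-byte TCP header, no options).

```python
import ipaddress

# Assumption: 20-byte IP header + 20-byte TCP header, no options.
IP_TCP_HEADER_BYTES = 40


def classful_prefix_len(addr):
    """Prefix length of the class A, B, or C network that contains addr."""
    first_octet = int(addr) >> 24
    if first_octet < 128:
        return 8    # class A
    if first_octet < 192:
        return 16   # class B
    return 24       # class C (classes D and E are ignored in this sketch)


def is_local(interface, destination, subnetsarelocal=1):
    """Apply the rules of Figure E.1 to one interface and one destination."""
    iface = ipaddress.IPv4Interface(interface)
    dest = ipaddress.IPv4Address(destination)
    if dest in iface.network:
        return True                     # same network ID, same subnet ID
    prefix = classful_prefix_len(iface.ip)
    network = ipaddress.IPv4Network(f"{iface.ip}/{prefix}", strict=False)
    if dest in network:
        return subnetsarelocal == 1     # same network ID, different subnet ID
    return False                        # different network ID


def select_mss(interface, destination, mtu, subnetsarelocal=1, tcp_mssdflt=512):
    """MSS from the interface MTU for local peers, tcp_mssdflt otherwise."""
    if is_local(interface, destination, subnetsarelocal):
        return mtu - IP_TCP_HEADER_BYTES
    return tcp_mssdflt
```

With an illustrative interface address of 140.252.13.33/27 on a link with an MTU of 1500, select_mss("140.252.13.33/27", "140.252.1.4", 1500) returns 1460, while the same call with subnetsarelocal=0 returns 512.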

IPSENDREDIRECTS
The value of this constant initializes the kernel variable ipsendredirects. If 1 (default), the host will send ICMP redirects when forwarding IP datagrams. If 0, ICMP redirects are not sent.

DIRECTED_BROADCAST
If 1 (default), received datagrams whose destination address is the directed broadcast address of an attached interface are forwarded as a link-layer broadcast. If 0, these datagrams are silently discarded.

The following variables can also be modified. These variables are spread throughout different files in the /usr/src/sys/netinet directory.

tcprexmtthresh
The number of consecutive ACKs that triggers the fast retransmit and fast recovery algorithm. The default value is 3.

tcp_ttl
The default value for the TTL field for TCP segments. Default value is 60.

tcp_mssdflt
The default TCP MSS for nonlocal destinations. Default value is 512.

tcp_keepidle
Number of 500-ms clock ticks before sending a keepalive probe. Default value is 14400 (2 hours).

tcp_keepintvl
Number of 500-ms clock ticks between successive keepalive probes, when no response is received. Default value is 150 (75 seconds).

tcp_sendspace
The default size of the TCP send buffer. Default value is 4096.

tcp_recvspace
The default size of the TCP receive buffer. This affects the window size that is offered. Default value is 4096.

udpcksum
If nonzero, UDP checksums are calculated for outgoing UDP datagrams, and incoming UDP datagrams containing nonzero checksums have their checksum verified. If 0, outgoing UDP datagrams do not contain a checksum, and no checksum verification is performed on incoming UDP datagrams, even if the sender calculated a checksum. Default is 1.

udp_ttl
The default value for the TTL field in UDP datagrams. Default value is 30.

udp_sendspace
The default size of the UDP send buffer. Defines the maximum UDP datagram that can be sent. Default is 9216.

udp_recvspace
The default size of the UDP receive buffer. The default is 41600, allowing for 40 1024-byte datagrams.
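
The arithmetic behind some of these defaults is easy to check. The short Python sketch below converts the keepalive values from 500-ms clock ticks into seconds and reproduces the udp_recvspace default. The helper names are hypothetical, and the 16 bytes of per-datagram overhead is an assumption (the size of the sender's socket address structure queued with each datagram in Berkeley-derived code).

```python
TICK_MS = 500  # length of one clock tick for the keepalive variables


def ticks_to_seconds(ticks, tick_ms=TICK_MS):
    """Convert a count of clock ticks into seconds."""
    return ticks * tick_ms / 1000


def udp_recvspace_for(datagrams, datagram_bytes, addr_overhead=16):
    """Buffer size for a number of datagrams, each queued with the sender's
    address (addr_overhead is an assumption, see the text above)."""
    return datagrams * (datagram_bytes + addr_overhead)


tcp_keepidle = 14400
tcp_keepintvl = 150

print(ticks_to_seconds(tcp_keepidle) / 3600)   # 2.0 hours before the first probe
print(ticks_to_seconds(tcp_keepintvl))         # 75.0 seconds between probes
print(udp_recvspace_for(40, 1024))             # 41600
```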

E.2 SunOS 4.1.3

The method used with SunOS 4.1.3 is similar to what we saw with BSD/386. Since most of the kernel sources are not distributed, all the C variable initializations are contained in a single C source file that is provided.

The administrator's kernel configuration file (see the config(8) manual page) can define the following variables. After the configuration file is modified, a new kernel must be made and the system rebooted.

IPFORWARDING
The value of this constant initializes the kernel variable ip_forwarding. If -1, IP datagrams are never forwarded. Furthermore, the variable is never changed. If 0 (default), IP datagrams are not forwarded, but the variable's value is changed to 1 if multiple interfaces are up. If 1, forwarding is always enabled.

SUBNETSARELOCAL
The value of the kernel variable ip_subnetsarelocal is initialized from this constant. If 1 (default), a destination IP address with the same network ID as the sending host but a different subnet ID is considered local. If 0, only destination IP addresses on an attached subnet are considered local. This is summarized in Figure E.1. When sending to local destinations, TCP chooses the MSS based on the MTU of the outgoing interface. When sending to nonlocal destinations, TCP uses the variable tcp_default_mss.

IPSENDREDIRECTS
The value of this constant initializes the kernel variable ip_sendredirects. If 1 (default), the host will send ICMP redirects when forwarding IP datagrams. If 0, ICMP redirects are not sent.

DIRECTED_BROADCAST
The value of this constant initializes the kernel variable ip_dirbroadcast. If 1 (default), received datagrams whose destination address is the directed broadcast address of an attached interface are forwarded as a link-layer broadcast. If 0, these datagrams are silently discarded.
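
The three-valued IPFORWARDING constant described above is the least obvious of these options, so a small model may help. The Python sketch below is only an illustration of the rule as stated in the text. The function names and the interfaces_up argument are hypothetical, and "multiple interfaces" is taken to mean more than one.

```python
def sunos_ip_forwarding(initial, interfaces_up):
    """Return the value ip_forwarding settles at for a given IPFORWARDING.

    initial:       the IPFORWARDING constant (-1, 0, or 1)
    interfaces_up: number of interfaces that are up (hypothetical input)
    """
    if initial not in (-1, 0, 1):
        raise ValueError("IPFORWARDING must be -1, 0, or 1")
    if initial == 0 and interfaces_up > 1:
        return 1        # forwarding is switched on automatically
    return initial      # -1 and 1 are never changed


def forwards(ip_forwarding):
    """IP datagrams are forwarded only when the variable is 1."""
    return ip_forwarding == 1
```

A host built with IPFORWARDING set to 0 therefore becomes a router as soon as a second interface comes up, whereas a value of -1 prevents this.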

The file /usr/kvm/sys/netinet/in_proto.c defines the following variables that can be changed. Once these variables are changed, a new kernel must be made and rebooted.

tcp_default_mss
The default TCP MSS for nonlocal destinations. Default value is 512.

tcp_sendspace
The default size of the TCP send buffer. Default value is 4096.

tcp_recvspace
The default size of the TCP receive buffer. This affects the window size that is offered. Default value is 4096.

tcp_keeplen
A keepalive probe to a 4.2BSD host must contain a single byte of data to get a response. Set the variable to 1 for compatibility with these older implementations. Default value is 1.

tcp_ttl
The default value for the TTL field for TCP segments. Default value is 60.

tcp_nodelack
If nonzero, ACKs are not delayed. Default value is 0.

tcp_keepidle
Number of 500-ms clock ticks before sending a keepalive probe. Default value is 14400 (2 hours).

tcp_keepintvl
Number of 500-ms clock ticks between successive keepalive probes, when no response is received. Default value is 150 (75 seconds).

udp_cksum
If nonzero, UDP checksums are calculated for outgoing UDP datagrams, and incoming UDP datagrams containing nonzero checksums have their checksum verified. If 0, outgoing UDP datagrams do not contain a checksum, and no checksum verification is performed on incoming UDP datagrams, even if the sender calculated a checksum. Default is 0.

udp_ttl
The default value for the TTL field in UDP datagrams. Default value is 60.

udp_sendspace
The default size of the UDP send buffer. Defines the maximum UDP datagram that can be sent. Default is 9000.

udp_recvspace
The default size of the UDP receive buffer. The default is 18000, allowing for two 9000-byte datagrams.

E.3 System V Release 4

The TCP/IP configuration of SVR4 is similar to the previous two systems, but fewer options are available. In the file /etc/conf/pack.d/ip/space.c two constants can be defined, and the kernel must then be rebuilt and rebooted.

IPFORWARDING
The value of this constant initializes the kernel variable ipforwarding. If 0 (default), IP datagrams are not forwarded. If 1, forwarding is always enabled.

IPSENDREDIRECTS
The value of this constant initializes the kernel variable ipsendredirects. If 1 (default), the host will send ICMP redirects when forwarding IP datagrams. If 0, ICMP redirects are not sent.

Many of the variables that we've described in the previous two sections are defined in the kernel, but one must patch the kernel to modify them. For example, there is a variable named tcp_keepidle with a value of 14400.

E.4 Solaris 2.2

Solaris 2.2 is typical of the newer Unix systems that provide a program for the administrator to run to change the configuration options of the TCP/IP system. This allows reconfiguration without having to modify source files and rebuild a kernel.

The configuration program is ndd(1). We can run the program to see what parameters we can examine or modify in the UDP module:
solaris % ndd /dev/udp \?
udp_wroff_extra        (read and write)
udp_def_ttl            (read and write)
udp_first_anon_port    (read and write)
udp_trust_optlen       (read and write)
udp_do_checksum        (read and write)
udp_status             (read only)

There are five modules we can specify: /dev/ip, /dev/icmp, /dev/arp, /dev/udp, and /dev/tcp. The question mark argument (which we have to prevent the shell from interpreting by preceding it with a backslash) tells the program to list all the parameters for that module. An example that queries the value of a variable is:

solaris % ndd /dev/tcp tcp_mss_def
536

To change the value of a variable we need superuser privilege and type:

solaris # ndd -set /dev/ip ip_forwarding 0
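
A script that audits a system's settings might start from the parameter listing printed by the question mark argument. The Python sketch below parses a listing of that shape into a dictionary. The function name is hypothetical, and the line format is assumed from the UDP listing shown earlier in this section (a parameter name followed by its access mode in parentheses), not from any vendor specification.

```python
import re

# One parameter per line: a name, optional whitespace, then the access mode.
LINE = re.compile(
    r"^(?P<name>[A-Za-z0-9_]+)\s*\((?P<access>read and write|read only)\)\s*$"
)


def parse_ndd_listing(text):
    """Map each parameter name to True if it is writable, False otherwise."""
    params = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:                       # lines of any other shape are skipped
            params[match.group("name")] = match.group("access") == "read and write"
    return params


listing = """\
udp_wroff_extra        (read and write)
udp_def_ttl            (read and write)
udp_first_anon_port    (read and write)
udp_trust_optlen       (read and write)
udp_do_checksum        (read and write)
udp_status             (read only)
"""

params = parse_ndd_listing(listing)
read_only = sorted(name for name, writable in params.items() if not writable)
print(read_only)   # ['udp_status']
```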

These variables can be divided into three categories:

  1. Configuration variables that a system administrator can change (e.g., ip_forwarding).
  2. Status variables that can only be displayed (e.g., the ARP cache). Normally this information is provided in an easier-to-understand format by the commands ifconfig, netstat, and arp.
  3. Debugging variables intended for those with kernel source code. Enabling some of these generates kernel debug output at runtime, which can degrade performance.

We now describe the parameters in each module. All parameters are read-write, unless marked "(Read only)." The read-only parameters are the status variables from case 2 above. We also mark the "(Debug)" variables from case 3. Unless otherwise noted, all the timing variables are specified in milliseconds, which differs from the other systems that normally specify times as some number of 500-ms clock ticks.

/dev/ip

ip_cksum_choice
(Debug) Selects between two independent implementations of the IP checksum algorithm.

ip_debug
(Debug) Enables printing of debug output by the kernel, if greater than 0. Larger values generate more output. Default is 0.

ip_def_ttl
Default TTL for outgoing IP datagrams, if not specified by transport layer. Default is 255.

ip_forward_directed_broadcasts
If 1 (default), received datagrams whose destination address is the directed broadcast address of an attached interface are forwarded as a link-layer broadcast. If 0, these datagrams are silently discarded.

ip_forward_src_routed
If 1 (default), received datagrams containing a source route option are forwarded. If 0, these datagrams are discarded.

ip_forwarding
Specifies whether the system forwards incoming IP datagrams: 0 means never forward, 1 means always forward, and 2 (default) means only forward when two or more interfaces are up.

ip_icmp_return_data_bytes
The number of bytes of data beyond the IP header that are returned in an ICMP error. Default is 64.

ip_ignore_delete_time
(Debug) Minimum lifetime of an IP routing table entry (IRE). Default is 30 seconds. (This parameter is in seconds, not milliseconds.)

ip_ill_status
(Read only) Displays the status of each IP lower layer data structure. There is one lower layer structure for each interface.

ip_ipif_status
(Read only) Displays the status of each IP interface data structure (IP address, subnet mask, etc.). There is one of these structures for each interface.

ip_ire_cleanup_interval
(Debug) The interval at which the IP routing table entries are scanned for possible deletions. Default is 30000 ms (30 seconds).

ip_ire_flush_interval
The interval at which ARP information is unconditionally flushed from the IP routing table. Default is 1200000 ms (20 minutes).

ip_ire_pathmtu_interval
The interval at which the path MTU discovery algorithm tries to increase the MTU. Default is 30000 ms (30 seconds).

ip_ire_redirect_interval
The interval at which IP routing table entries that are from ICMP redirects are deleted. Default is 60000 ms (60 seconds).

ip_ire_status
(Read only) Displays all the IP routing table entries.

ip_local_cksum
If 0 (default), IP does not calculate the IP checksum or the higher layer protocol checksum (i.e., TCP, UDP, ICMP, or IGMP) for datagrams sent or received through the loopback interface. If 1, these checksums are calculated.

ip_mrtdebug
(Debug) Enables printing of debug output concerning multicast routing by the kernel, if 1. Default is 0.

ip_path_mtu_discovery
If 1 (default), path MTU discovery is performed by IP. If 0, IP never sets the "don't fragment" bit in outgoing datagrams.

ip_respond_to_address_mask
If 0 (default), the host does not respond to ICMP address mask requests. If 1, it does respond.

ip_respond_to_echo_broadcast
If 1 (default), the host responds to ICMP echo requests that are sent to a broadcast address. If 0, it does not respond.

ip_respond_to_timestamp
If 0 (default), the host does not respond to ICMP timestamp requests. If 1, the host responds.

ip_respond_to_timestamp_broadcast
If 0 (default), the host does not respond to ICMP timestamp requests that are sent to a broadcast address. If 1, it does respond.

ip_rput_pullups
(Debug) Count of the number of buffers from the network interface driver that needed to be pulled up to access the full IP header. Initialized to 0 at bootstrap time, and can be reset to 0.

ip_send_redirects
If 1 (default), the host sends ICMP redirects when acting as a router. If 0, these are not sent.

ip_send_source_quench
If 1 (default), the host generates ICMP source quench errors when incoming datagrams are discarded. If 0, these are not generated.

ip_wroff_extra
(Debug) Number of bytes of extra space to allocate in buffers for IP headers. Default is 32.

/dev/icmp

icmp_bsd_compat
(Debug) If 1 (default), the length field in the IP header of received datagrams is adjusted to exclude the length of the IP header. This is compatible with Berkeley-derived implementations and is for applications reading raw IP or raw ICMP packets. If 0, the length field is not changed.

icmp_def_ttl
The default TTL for outgoing ICMP messages. Default is 255.

icmp_wroff_extra
(Debug) Number of bytes of extra space to allocate in buffers for IP options and data-link headers. Default is 32.

/dev/arp

arp_cache_report
(Read only) The ARP cache.

arp_cleanup_interval
The interval after which ARP entries are discarded from ARP's cache. Default is 300000 ms (5 minutes). (IP maintains its own cache of completed ARP translations; see ip_ire_flush_interval.)

arp_debug
(Debug) If 1, enables printing of debug output by the ARP driver. Default is 0.

/dev/udp

udp_def_ttl
The default TTL for outgoing UDP datagrams. Default value is 255.

udp_do_checksum
If 1 (default), UDP checksums are calculated for outgoing UDP datagrams. If 0, outgoing UDP datagrams do not contain a checksum. (Unlike most other implementations, this UDP checksum flag does not affect incoming datagrams. If a received datagram has a nonzero checksum, it is always verified.)

udp_largest_anon_port
Largest port number to allocate for UDP ephemeral ports. Default is 65535.

udp_smallest_anon_port
Starting port number to allocate for UDP ephemeral ports. Default is 32768.

udp_smallest_nonpriv_port
A process requires superuser privilege to assign itself a port number less than this. Default is 1024.

udp_status
(Read only) The status of all local UDP end points: local IP address and port, foreign IP address and port.

udp_trust_optlen
(Debug) No longer used.

udp_wroff_extra
(Debug) Number of bytes of extra space to allocate in buffers for IP options and data-link headers. Default is 32.
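
The difference between this module's udp_do_checksum and the Berkeley-style udpcksum flag described in Section E.1 can be summarized with a small model. The Python sketch below is only an illustration of the two policies as described in the text. The function names and the always_verify argument are hypothetical.

```python
def outgoing_has_checksum(flag):
    """Both styles: outgoing datagrams carry a checksum only if the flag is nonzero."""
    return flag != 0


def verify_incoming(received_checksum, flag, always_verify=False):
    """Decide whether an incoming datagram's checksum is verified.

    received_checksum: checksum field of the datagram; 0 means the sender
                       did not calculate one
    flag:              value of udpcksum (BSD) or udp_do_checksum (Solaris)
    always_verify:     True models the Solaris behavior, where the flag does
                       not affect incoming datagrams
    """
    if received_checksum == 0:
        return False                    # nothing to verify
    return always_verify or flag != 0


# With checksums disabled, only the Solaris-style policy still verifies
# a datagram whose sender calculated a checksum.
print(verify_incoming(0x1A2B, flag=0))                      # False
print(verify_incoming(0x1A2B, flag=0, always_verify=True))  # True
```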

/dev/tcp

tcp_close_wait_interval
The 2MSL value: the time spent in the TIME_WAIT state. Default is 240000 ms (4 minutes).

tcp_conn_grace_period
(Debug) Additional time added to the timer interval when sending a SYN. Default is 500 ms.

tcp_conn_req_max
The maximum number of pending connection requests queued for any listening end point. Default is 5.

tcp_cwnd_max
The maximum value of the congestion window. Default is 32768.

tcp_debug
(Debug) If 1, enables printing of debug output by TCP. Default is 0.

tcp_deferred_ack_interval
The time to wait before sending a delayed ACK. Default is 50 ms.

tcp_dupack_fast_retransmit
The number of consecutive duplicate ACKs that triggers the fast retransmit, fast recovery algorithm. Default is 3.

tcp_eager_listeners
(Debug) If 1 (default), TCP completes the three-way handshake before returning a new connection to an application with a pending passive open. This is the way most TCP implementations operate. If 0, TCP passes an incoming connection request (received SYN) to the application, and does not complete the three-way handshake until the application accepts the connection. (Setting this to 0 might break many existing applications.)

tcp_ignore_path_mtu
(Debug) If 1, path MTU discovery ignores received ICMP fragmentation needed messages. If 0 (default), path MTU discovery is enabled for TCP.

tcp_ip_abort_cinterval
The total retransmit timeout value when TCP is performing an active open. Default is 240000 ms (4 minutes).

tcp_ip_abort_interval
The total retransmit timeout value for a TCP connection after it is established. Default is 120000 ms (2 minutes).

tcp_ip_notify_cinterval
The timeout value when TCP is performing an active open after which TCP notifies IP to find a new route. Default is 10000 ms (10 seconds).

tcp_ip_notify_interval
The timeout value for an established connection after which TCP notifies IP to find a new route. Default is 10000 ms (10 seconds).

tcp_ip_ttl
The TTL to use for outgoing TCP segments. Default is 255.

tcp_keepalive_interval
The time that a connection must be idle before a keepalive probe is sent. Default is 7200000 ms (2 hours).

tcp_largest_anon_port
Largest port number to allocate for TCP ephemeral ports. Default is 65535.

tcp_maxpsz_multiplier
(Debug) Specifies the multiple of the MSS into which the stream head packetizes the application's write data. Default is 1.

tcp_mss_def
Default MSS for nonlocal destinations. Default is 536.

tcp_mss_max
The maximum MSS. Default is 65495.

tcp_mss_min
The minimum MSS. Default is 1.

tcp_naglim_def
(Debug) Maximum value of the per-connection Nagle algorithm threshold. Default is 65535. The per-connection value starts out as the minimum of the MSS and this value. The per-connection value is set to 1 by the TCP_NODELAY socket option, which disables the Nagle algorithm.

tcp_old_urp_interpretation
(Debug) If 1 (default), the older (but more common) BSD interpretation of the urgent pointer is used: it points 1 byte beyond the last byte of urgent data. If 0, the Host Requirements RFC interpretation is used; it points to the last byte of urgent data.

tcp_rcv_push_wait
(Debug) Maximum number of bytes received without the PUSH flag set before the data is passed to the application. Default is 16384.

tcp_rexmit_interval_initial
(Debug) Initial retransmit timeout interval. Default is 500 ms.

tcp_rexmit_interval_max
(Debug) Maximum retransmit timeout interval. Default is 60000 ms (60 seconds).

tcp_rexmit_interval_min
(Debug) Minimum retransmit timeout interval. Default is 200 ms.

tcp_rwin_credit_pct
(Debug) Percentage of receive window that must be buffered before flow control is checked on every received segment. Default is 50%.

tcp_smallest_anon_port
Starting port number to allocate for TCP ephemeral ports. Default is 32768.

tcp_smallest_nonpriv_port
A process requires superuser privilege to assign itself a port number less than this. Default is 1024.

tcp_snd_lowat_fraction
(Debug) If nonzero, the send buffer low-water mark is the send buffer size divided by this value. Default is 0 (disabled).

tcp_status
(Read only) Information on all TCP connections.

tcp_sth_rcv_hiwat
(Debug) If nonzero, the value to set the stream head high-water mark to. Default is 0.

tcp_sth_rcv_lowat
(Debug) If nonzero, the value to set the stream head low-water mark to. Default is 0.

tcp_wroff_xtra
(Debug) Number of bytes of extra space to allocate in buffers for IP options and data-link headers. Default is 32.
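
Two of the parameters above, tcp_naglim_def and tcp_snd_lowat_fraction, are not used directly but yield per-connection values. The Python sketch below restates those two rules as given in the descriptions. The function names are hypothetical, integer division is an assumption, and None stands for "parameter disabled" since the text does not say which low-water mark applies in that case.

```python
def nagle_threshold(mss, tcp_naglim_def=65535, nodelay=False):
    """Per-connection Nagle threshold.

    Starts as the smaller of the MSS and tcp_naglim_def, and is set to 1
    when the TCP_NODELAY socket option disables the Nagle algorithm.
    """
    if nodelay:
        return 1
    return min(mss, tcp_naglim_def)


def send_lowat(send_buffer_size, tcp_snd_lowat_fraction=0):
    """Send buffer low-water mark derived from tcp_snd_lowat_fraction.

    Returns None when the parameter is 0 (disabled).
    """
    if tcp_snd_lowat_fraction == 0:
        return None
    return send_buffer_size // tcp_snd_lowat_fraction


print(nagle_threshold(1460))                 # 1460
print(nagle_threshold(1460, nodelay=True))   # 1
print(send_lowat(8192, 4))                   # 2048
```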

E.5 AIX 3.2.2

AIX 3.2.2 allows network options to be set at runtime using the no command. It can display the value of an option, set the value of an option, or set an option value back to its default. For example, to display an option we type:

aix % no -o udp_ttl
udp_ttl = 30

The following options can be modified.

arpt_killc
The time (in minutes) before an inactive completed ARP entry is removed. Default is 20.

ipforwarding
If 1 (default), IP datagrams are always forwarded. If 0, forwarding is disabled.

ipfragttl
The time to live (in seconds) for IP fragments awaiting reassembly. Default is 60.

ipsendredirects
If 1 (default), the host will send ICMP redirects when forwarding IP datagrams. If 0, ICMP redirects are not sent.

loop_check_sum
If 1 (default), the IP checksum is calculated for datagrams sent through the loopback interface. If 0, this checksum is not calculated.

nonlocsrcroute
If 1 (default), received datagrams containing a source route option are forwarded. If 0, these datagrams are discarded.

subnetsarelocal
If 1 (default), a destination IP address with the same network ID as the sending host but a different subnet ID is considered local. If 0, only destination IP addresses on an attached subnet are considered local. This is summarized in Figure E.1. When sending to local destinations, TCP chooses the MSS based on the MTU of the outgoing interface. When sending to nonlocal destinations, TCP uses the default (536) as the MSS.

tcp_keepidle
Number of 500-ms clock ticks before sending a keepalive probe. Default value is 14400 (2 hours).

tcp_keepintvl
Number of 500-ms clock ticks between successive keepalive probes, when no response is received. Default value is 150 (75 seconds).

tcp_recvspace
The default size of the TCP receive buffer. This affects the window size that is offered. Default value is 16384.

tcp_sendspace
The default size of the TCP send buffer. Default value is 16384.

tcp_ttl
The default value for the TTL field for TCP segments. Default value is 60.

udp_recvspace
The default size of the UDP receive buffer. The default is 41600, allowing for 40 1024-byte datagrams.

udp_sendspace
The default size of the UDP send buffer. Defines the maximum UDP datagram that can be sent. Default is 9216.

udp_ttl
The default value for the TTL field in UDP datagrams. Default value is 30.

E.6 4.4BSD

4.4BSD is the first of the Berkeley releases to provide dynamic configuration for numerous kernel parameters. The sysctl(8) command is used. The names for the parameters were chosen to look like MIB names from SNMP. To examine a parameter we type:

vangogh % sysctl net.inet.ip.forwarding
net.inet.ip.forwarding = 1

To change a parameter we need superuser privilege and then type:

vangogh # sysctl -w net.inet.ip.ttl=128
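
Because the parameter names are hierarchical, the output of the query command is easy to process in a script. The Python sketch below parses one line of the form shown in the transcript above and splits the name into its MIB-like components. The function names are hypothetical, and integer values are assumed since every parameter in this section is numeric. The same "name = value" shape also appears in the AIX no command output shown in Section E.5.

```python
def parse_setting(line):
    """Split a 'name = value' line into (name, integer value)."""
    name, sep, value = line.partition("=")
    if not sep:
        raise ValueError(f"not a 'name = value' line: {line!r}")
    return name.strip(), int(value.strip())


def mib_path(name):
    """Split a dotted parameter name into its components."""
    return tuple(name.split("."))


name, value = parse_setting("net.inet.ip.forwarding = 1")
print(mib_path(name), value)   # ('net', 'inet', 'ip', 'forwarding') 1
```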

The following parameters can be changed.

net.inet.ip.forwarding
If 0 (default), IP datagrams are not forwarded. If 1, forwarding is enabled.

net.inet.ip.redirect
If 1 (default), the host will send ICMP redirects when forwarding IP datagrams. If 0, ICMP redirects are not sent.

net.inet.ip.ttl
The default TTL for both TCP and UDP. The default is 64.

net.inet.icmp.maskrepl
If 0 (default), the host does not respond to ICMP address mask requests. If 1, it does respond.

net.inet.udp.checksum
If 1 (default), UDP checksums are calculated for outgoing UDP datagrams, and incoming UDP datagrams containing nonzero checksums have their checksum verified. If 0, outgoing UDP datagrams do not contain a checksum, and no checksum verification is performed on incoming UDP datagrams, even if the sender calculated a checksum.

Additionally, numerous variables that we've described earlier in this appendix are scattered among various source files (tcp_keepidle, subnetsarelocal, etc.) and can be modified.