tcp - TCP protocol.
#include <sys/socket.h>
#include <netinet/in.h>
tcp_socket = socket(PF_INET, SOCK_STREAM, 0);
This is an implementation of the TCP protocol defined in RFC793,
RFC1122 and RFC2001 with the NewReno and SACK extensions. It provides
a reliable, stream oriented, full duplex connection between two sockets
on top of ip(7). TCP guarantees that the data arrives in order and
retransmits lost packets. It generates and checks a per packet checksum
to catch transmission errors. TCP does not preserve record boundaries.
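Because record boundaries are not preserved, a reader usually has to loop until it has collected the number of bytes it expects. The following is a minimal sketch of that idea, not part of any system interface: recv_exact() and demo_stream_semantics() are hypothetical names, and the self-check uses an AF_UNIX stream socket pair as a stand-in for a TCP connection, since both are byte streams.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly len bytes from a stream socket. A single recv() may return
 * fewer bytes than requested, so keep reading until the buffer is full.
 * Returns 0 on success, -1 on error or if the peer closes early. */
int recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);

        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal, retry */
        if (n <= 0)
            return -1;          /* error or end of stream */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical self-check: two separate writes are read back as one run
 * of four bytes; the reader cannot tell where the first write ended. */
int demo_stream_semantics(void)
{
    int sv[2];
    char buf[4];
    int rc = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (send(sv[0], "ab", 2, 0) == 2 && send(sv[0], "cd", 2, 0) == 2 &&
        recv_exact(sv[1], buf, sizeof(buf)) == 0 &&
        memcmp(buf, "abcd", 4) == 0)
        rc = 0;
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Applications that need message framing would typically send a length prefix or a delimiter and use a loop like this to reassemble each message.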
A fresh TCP socket has no remote or local address and is not fully
specified. To create an outgoing TCP connection use connect(2) to
establish a connection to another TCP socket. To receive new incoming
connections bind(2) the socket first to a local address and port and
then call listen(2) to put the socket into listening state. After that
a new socket for each incoming connection can be accepted using
accept(2). A socket which has had accept or connect successfully
called on it is fully specified and may transmit data. Data cannot be
transmitted on listening or not yet connected sockets.
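The sequence described above (bind, listen and accept on one side, connect on the other) can be sketched as follows. This is only an illustration under stated assumptions: listen_loopback(), connect_loopback() and demo_connection() are hypothetical helper names, both ends run in one process over 127.0.0.1, and the kernel picks the port.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a new TCP socket to 127.0.0.1 on a kernel-chosen port and put it
 * into listening state. Stores the port (host byte order) in *port.
 * Returns the listening descriptor, or -1 on failure. */
int listen_loopback(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd;

    assert(port != NULL);
    fd = socket(PF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);           /* 0 = let the kernel choose */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Create an outgoing connection to 127.0.0.1:port. Returns fd or -1. */
int connect_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(PF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical self-check: connect, accept, and pass one byte across. */
int demo_connection(void)
{
    unsigned short port = 0;
    char c = 0;
    int rc = -1;
    int lfd = listen_loopback(&port);
    int cfd = lfd < 0 ? -1 : connect_loopback(port);
    int afd = cfd < 0 ? -1 : accept(lfd, NULL, NULL);

    if (afd >= 0 && send(cfd, "x", 1, 0) == 1 &&
        recv(afd, &c, 1, 0) == 1 && c == 'x')
        rc = 0;
    if (afd >= 0)
        close(afd);
    if (cfd >= 0)
        close(cfd);
    if (lfd >= 0)
        close(lfd);
    return rc;
}
```

Note that data is exchanged on the descriptor returned by accept(2), not on the listening socket itself.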
Linux 2.2 supports the RFC1323 TCP high performance extensions. This
includes large TCP windows to support links with high latency or bandwidth.
In order to make use of them, the send and receive buffer sizes
must be increased. They can be set globally with the
net.core.wmem_default and net.core.rmem_default sysctls, or on individual
sockets by using the SO_SNDBUF and SO_RCVBUF socket options. The
maximum sizes for socket buffers are limited by the global
net.core.rmem_max and net.core.wmem_max sysctls. See socket(7) for more
information.
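A per-socket buffer request might look like the sketch below. The helper names request_buffers() and demo_buffers() are hypothetical, and the 256 KiB figure is an arbitrary example value. The kernel may adjust or clamp the requested size (for instance against the rmem_max and wmem_max limits), so the sketch reads the effective values back instead of assuming the request was honoured as given.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask for send and receive buffers of the given size, then report the
 * sizes the kernel actually settled on. Returns 0 on success, -1 on error. */
int request_buffers(int fd, int bytes, int *sndbuf, int *rcvbuf)
{
    socklen_t len;

    assert(sndbuf != NULL && rcvbuf != NULL);
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0 ||
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    len = sizeof(*sndbuf);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, sndbuf, &len) < 0)
        return -1;
    len = sizeof(*rcvbuf);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, rcvbuf, &len) < 0)
        return -1;
    return 0;
}

/* Hypothetical self-check on a fresh, unconnected TCP socket. */
int demo_buffers(void)
{
    int snd = 0, rcv = 0, rc = -1;
    int fd = socket(PF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (request_buffers(fd, 256 * 1024, &snd, &rcv) == 0 &&
        snd > 0 && rcv > 0)
        rc = 0;
    close(fd);
    return rc;
}
```

The sketch sets the sizes on a socket that is not yet connected, on the assumption that the window parameters are negotiated when the connection is established.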
TCP supports urgent data. Urgent data is used to signal the receiver
that some important message is part of the data stream and that it
should be processed as soon as possible. To send urgent data specify
the MSG_OOB option to send(2). When urgent data is received, the kernel
sends a SIGURG signal to the reading process or to the process or
process group that has been set for the socket using the SIOCSPGRP or
FIOSETOWN ioctls. When the SO_OOBINLINE socket option is enabled,
urgent data is put into the normal data stream (and can be tested for
by the SIOCATMARK ioctl); otherwise it can be received only when the
MSG_OOB flag is set for recvmsg(2).
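A small sketch of sending and receiving one urgent byte, assuming SO_OOBINLINE is left disabled. The helpers tcp_pair() and demo_urgent() are hypothetical and exist only for this illustration; both ends of the connection live in one process on 127.0.0.1. The receiver waits with poll(2) for POLLPRI instead of relying on SIGURG, which keeps the example free of signal handling.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: build a connected TCP pair over 127.0.0.1.
 * fds[0] is the connecting side, fds[1] the accepted side. */
static int tcp_pair(int fds[2])
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd;

    assert(fds != NULL);
    lfd = socket(PF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));     /* port 0: kernel picks one */
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    fds[0] = socket(PF_INET, SOCK_STREAM, 0);
    if (fds[0] >= 0 &&
        bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        listen(lfd, 1) == 0 &&
        getsockname(lfd, (struct sockaddr *)&addr, &len) == 0 &&
        connect(fds[0], (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        (fds[1] = accept(lfd, NULL, NULL)) >= 0) {
        close(lfd);
        return 0;
    }
    if (fds[0] >= 0)
        close(fds[0]);
    close(lfd);
    return -1;
}

/* Send one urgent byte and read it back with MSG_OOB on the other end. */
int demo_urgent(void)
{
    int fds[2];
    struct pollfd pfd;
    char c = 0;
    int rc = -1;

    if (tcp_pair(fds) < 0)
        return -1;
    if (send(fds[0], "!", 1, MSG_OOB) == 1) {
        pfd.fd = fds[1];
        pfd.events = POLLPRI;           /* urgent data pending */
        pfd.revents = 0;
        if (poll(&pfd, 1, 2000) == 1 && (pfd.revents & POLLPRI) &&
            recv(fds[1], &c, 1, MSG_OOB) == 1 && c == '!')
            rc = 0;
    }
    close(fds[0]);
    close(fds[1]);
    return rc;
}
```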
TCP is built on top of IP (see ip(7)). The address formats defined by
ip(7) apply to TCP. TCP only supports point-to-point communication;
broadcasting and multicasting are not supported.
These sysctls can be accessed by the /proc/sys/net/ipv4/* files or with
the sysctl(2) interface. In addition, most IP sysctls also apply to
TCP; see ip(7).
tcp_window_scaling
Enable RFC1323 TCP window scaling.
tcp_sack
Enable RFC2018 TCP Selective Acknowledgements.
tcp_timestamps
Enable RFC1323 TCP timestamps.
tcp_fin_timeout
How many seconds to wait for a final FIN packet before the
socket is forcibly closed. This is strictly a violation of the
TCP specification, but required to prevent denial-of-service
attacks.
tcp_keepalive_probes
Maximum number of TCP keep-alive probes to send before giving up.
Keep-alives are sent only when the SO_KEEPALIVE socket option is
enabled.
tcp_keepalive_time
The number of seconds after no data has been transmitted before
a keep-alive will be sent on a connection. The default is 10800
seconds (3 hours).
tcp_max_ka_probes
How many keep-alive probes are sent per slow timer run. To prevent
bursts, this value should not be set too high.
tcp_stdurg
Enable the strict RFC793 interpretation of the TCP urgent-pointer
field. The default is to use the BSD-compatible interpretation
of the urgent pointer, pointing to the first byte
after the urgent data. The RFC793 interpretation is to have it
point to the last byte of urgent data. Enabling this option may
lead to interoperability problems.
tcp_syncookies
Enable TCP syncookies. The kernel must be compiled with
CONFIG_SYN_COOKIES. Syncookies protect a socket from overload
when too many connection attempts arrive. When syncookies are
enabled, client machines may no longer be able to detect an
overloaded machine by means of a short timeout.
tcp_max_syn_backlog
Length of the per-socket backlog queue. As of Linux 2.2, the
backlog specified in listen(2) only specifies the length of the
backlog queue of already established sockets. The maximum queue
of sockets not yet established (in SYN_RECV state) per listen
socket is set by this sysctl. When more connection requests
arrive, Linux starts to drop packets. When syncookies are
enabled the packets are still answered and this value is effectively
ignored.
tcp_retries1
Defines how many times an answer to a TCP connection request is
retransmitted before giving up.
tcp_retries2
Defines how many times a TCP packet is retransmitted in established
state before giving up.
tcp_syn_retries
Defines how many times to try to send an initial SYN packet to a
remote host before giving up and returning an error. Must be below
255. This is only the timeout for outgoing connections; for
incoming connections the number of retransmits is defined by
tcp_retries1.
tcp_retrans_collapse
Try to send full-sized packets during retransmit. This is used
to work around TCP bugs in some stacks.
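The sysctls listed above can be inspected from a program by reading the corresponding file under /proc/sys/net/ipv4/. The sketch below assumes a mounted /proc file system and a sysctl whose first field is an integer; read_ipv4_sysctl() is a hypothetical helper name, not a library function.

```c
#include <assert.h>
#include <stdio.h>

/* Read the first integer from /proc/sys/net/ipv4/<name> into *value.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_ipv4_sysctl(const char *name, long *value)
{
    char path[256];
    FILE *f;
    int n, ok;

    assert(name != NULL && value != NULL);
    n = snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);
    if (n < 0 || n >= (int)sizeof(path))
        return -1;                      /* name too long */
    f = fopen(path, "r");
    if (f == NULL)
        return -1;                      /* not present on this system */
    ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

For example, read_ipv4_sysctl("tcp_fin_timeout", &v) would store the timeout in v on systems that provide that file. Which sysctls exist depends on the kernel version and configuration, so callers should treat -1 as "not available" and not as a fatal error.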
To set or get a TCP socket option, call getsockopt(2) to read or
setsockopt(2) to write the option with the option level argument set to
SOL_TCP. In addition, most SOL_IP socket options are valid on TCP
sockets. For more information see ip(7).
TCP_NODELAY
Turn the Nagle algorithm off. This means that packets are always
sent as soon as possible and no unnecessary delays are introduced,
at the cost of more packets in the network. Expects an
integer boolean flag.
TCP_MAXSEG
Set or receive the maximum segment size for outgoing TCP packets.
If this option is set before connection establishment, it
also changes the MSS value announced to the other end in the
initial packet. Values greater than the interface MTU are
ignored and have no effect.
TCP_CORK
If enabled, don't send out partial frames. All queued partial
frames are sent when the option is cleared again. This is useful
for prepending headers before calling sendfile(2), or for
throughput optimization. This option cannot be combined with
TCP_NODELAY.
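Setting and reading back one of the options above could look like the following sketch. It passes IPPROTO_TCP as the level, which has the same numeric value as SOL_TCP on Linux and is declared by <netinet/in.h>; the helper names set_tcp_flag(), get_tcp_int() and demo_nodelay() are hypothetical.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Set an integer boolean TCP option such as TCP_NODELAY. */
int set_tcp_flag(int fd, int option, int on)
{
    return setsockopt(fd, IPPROTO_TCP, option, &on, sizeof(on));
}

/* Read an integer TCP option such as TCP_NODELAY or TCP_MAXSEG. */
int get_tcp_int(int fd, int option, int *value)
{
    socklen_t len;

    assert(value != NULL);
    len = sizeof(*value);
    return getsockopt(fd, IPPROTO_TCP, option, value, &len);
}

/* Hypothetical self-check: turn the Nagle algorithm off on a fresh socket,
 * confirm the flag reads back as set, and query the current MSS value. */
int demo_nodelay(void)
{
    int flag = 0, mss = 0, rc = -1;
    int fd = socket(PF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (set_tcp_flag(fd, TCP_NODELAY, 1) == 0 &&
        get_tcp_int(fd, TCP_NODELAY, &flag) == 0 && flag != 0 &&
        get_tcp_int(fd, TCP_MAXSEG, &mss) == 0)
        rc = 0;
    close(fd);
    return rc;
}
```

The same set_tcp_flag() call could be used with TCP_CORK, setting it before queuing a header and clearing it once the remaining data has been written.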
These ioctls can be accessed using ioctl(2). The correct syntax is:
int value;
error = ioctl(tcp_socket, ioctl_type, &value);
FIONREAD or TIOCINQ
Returns the amount of queued unread data in the receive buffer.
Argument is a pointer to an integer.
SIOCATMARK
Returns true when all urgent data has already been received
by the user program. This is used together with SO_OOBINLINE.
Argument is a pointer to an integer for the test result.
TIOCOUTQ
Returns the amount of unsent data in the socket send queue in
the passed integer value pointer. Unfortunately, the implementation
of this ioctl is buggy in all known versions of Linux and
instead returns the free space (effectively buffer size minus
bytes used including metadata) in the send queue. This will be
fixed in future Linux versions. If you use TIOCOUTQ, please
include a runtime test for both behaviors for correct function
on future releases and other Unixes.
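The queue ioctls above can be exercised as in the sketch below. The helpers tcp_pair() and demo_queue_ioctls() are hypothetical and only serve this illustration, which connects two sockets in one process over 127.0.0.1. Given the caveat about TIOCOUTQ, the sketch only checks that the call succeeds and does not interpret the returned number.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: build a connected TCP pair over 127.0.0.1.
 * fds[0] is the connecting side, fds[1] the accepted side. */
static int tcp_pair(int fds[2])
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd;

    assert(fds != NULL);
    lfd = socket(PF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));     /* port 0: kernel picks one */
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    fds[0] = socket(PF_INET, SOCK_STREAM, 0);
    if (fds[0] >= 0 &&
        bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        listen(lfd, 1) == 0 &&
        getsockname(lfd, (struct sockaddr *)&addr, &len) == 0 &&
        connect(fds[0], (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        (fds[1] = accept(lfd, NULL, NULL)) >= 0) {
        close(lfd);
        return 0;
    }
    if (fds[0] >= 0)
        close(fds[0]);
    close(lfd);
    return -1;
}

/* Send five bytes, wait until they are readable, then query both queues. */
int demo_queue_ioctls(void)
{
    int fds[2], unread = -1, outq = -1, rc = -1;
    struct pollfd pfd;

    if (tcp_pair(fds) < 0)
        return -1;
    pfd.fd = fds[1];
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (send(fds[0], "hello", 5, 0) == 5 &&
        poll(&pfd, 1, 2000) == 1 &&
        ioctl(fds[1], FIONREAD, &unread) == 0 && unread == 5 &&
        ioctl(fds[0], TIOCOUTQ, &outq) == 0 && outq >= 0)
        rc = 0;
    close(fds[0]);
    close(fds[1]);
    return rc;
}
```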
When a network error occurs, TCP tries to resend the packet. If it
doesn't succeed after some time, either ETIMEDOUT or the last received
error on this connection is reported.
Some applications require a quicker error notification. This can be
enabled with the SOL_IP level IP_RECVERR socket option. When this
option is enabled, all incoming errors are immediately passed to the
user program. Use this option with care - it makes TCP less tolerant
to routing changes and other normal network conditions.
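Enabling the option could be sketched as follows, assuming a Linux system where <netinet/in.h> declares IP_RECVERR. The level is given as IPPROTO_IP, which has the same numeric value as SOL_IP on Linux; enable_recverr() and demo_recverr() are hypothetical helper names.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask for immediate delivery of incoming errors on this socket. */
int enable_recverr(int fd)
{
    int one = 1;

    return setsockopt(fd, IPPROTO_IP, IP_RECVERR, &one, sizeof(one));
}

/* Hypothetical self-check: enable the option and read it back. */
int demo_recverr(void)
{
    int value = 0, rc = -1;
    socklen_t len = sizeof(value);
    int fd = socket(PF_INET, SOCK_STREAM, 0);

    assert(sizeof(value) == sizeof(int));
    if (fd < 0)
        return -1;
    if (enable_recverr(fd) == 0 &&
        getsockopt(fd, IPPROTO_IP, IP_RECVERR, &value, &len) == 0 &&
        value != 0)
        rc = 0;
    close(fd);
    return rc;
}
```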
When an error that occurred during connection setup is detected in a
socket write, SIGPIPE is raised only when the SO_KEEPALIVE socket
option is set.
TCP has no real out-of-band data; it has urgent data. In Linux this
means that if the other end sends newer out-of-band data, the older
urgent data is inserted as normal data into the stream (even when
SO_OOBINLINE is not set). This differs from BSD-based stacks.
Linux uses the BSD compatible interpretation of the urgent pointer
field by default. This violates RFC1122, but is required for interoperability
with other stacks. It can be changed by the tcp_stdurg sysctl.
EPIPE
The other end closed the socket unexpectedly or a read is executed
on a shut down socket.
ETIMEDOUT
The other end didn't acknowledge retransmitted data after some
time.
EAFNOTSUPPORT
Passed socket address type in sin_family was not AF_INET.
Any errors defined for ip(7) or the generic socket layer may also be
returned for TCP.
Not all errors are documented.
IPv6 is not described.
Transparent proxy options are not described.
The sysctls are new in Linux 2.2. IP_RECVERR is a new feature in Linux
2.2. TCP_CORK is new in 2.2.
socket(7), socket(2), ip(7), sendmsg(2), recvmsg(2)
RFC793 for the TCP specification.
RFC1122 for the TCP requirements and a description of the Nagle algorithm.
RFC2581 for some TCP algorithms.
Linux Man Page 1999-04-25 TCP(7)