Changes

5,198 bytes added ,  15:31, 8 August 2021
use template:opt
This page describes how we handle established TCP connections.

== TCP repair mode in kernel ==

The <code>TCP_REPAIR</code> socket option was added in kernel 3.5 to help with C/R for TCP sockets.

When this option is set, the socket is switched into a special mode in which an action performed on it
does not trigger the usual protocol exchange, but instead directly puts the socket
into the state it is expected to be in at the end of a successfully finished operation.

For example, calling <code>connect()</code> on a socket in repair mode just changes its state to <code>ESTABLISHED</code>,
with the peer address set as requested.
The <code>bind()</code> call forcibly binds the socket to the given address (ignoring any potential conflicts).
The <code>close()</code> call closes the socket without any transient <code>FIN_WAIT</code>/<code>TIME_WAIT</code>/etc. states;
the socket is silently killed.
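
The calls described above can be sketched roughly as follows. This is an illustration under stated assumptions, not CRIU's actual code: the helper names <code>tcp_repair_set</code> and <code>tcp_repair_connect</code> are hypothetical, IPv4 is assumed, and switching repair mode on needs <code>CAP_NET_ADMIN</code>, so an unprivileged caller gets <code>EPERM</code>.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef TCP_REPAIR
#define TCP_REPAIR 19 /* fallback: value from the kernel's linux/tcp.h */
#endif

/* Hypothetical helper: switch repair mode on (1) or off (0).
 * Returns 0 on success, -1 with errno set (EPERM without CAP_NET_ADMIN). */
int tcp_repair_set(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_REPAIR, &on, sizeof(on));
}

/* Hypothetical helper: re-create an established connection without a
 * handshake. While repair mode is on, bind() and connect() only update
 * the socket state. Returns the new socket, or -1 with errno set. */
int tcp_repair_connect(const struct sockaddr_in *local,
                       const struct sockaddr_in *peer)
{
    int fd;

    assert(local != NULL && peer != NULL);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    if (tcp_repair_set(fd, 1) < 0 ||
        bind(fd, (const struct sockaddr *)local, sizeof(*local)) < 0 ||
        connect(fd, (const struct sockaddr *)peer, sizeof(*peer)) < 0 ||
        tcp_repair_set(fd, 0) < 0) {
        int saved = errno;

        close(fd); /* close() may overwrite errno */
        errno = saved;
        return -1;
    }
    return fd; /* ESTABLISHED, with the requested local and peer address */
}
```

A real restore would also set sequence numbers, queues and options between <code>connect()</code> and leaving repair mode; the sketch only shows the state switch itself.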

=== Sequences ===

To restore the connection properly, <code>bind()</code> and <code>connect()</code> are not enough. One also needs to restore the
TCP sequence numbers. To do so, the <code>TCP_REPAIR_QUEUE</code> and <code>TCP_QUEUE_SEQ</code> options were introduced.

The former selects which queue (input or output) will be repaired, and the latter gets or sets the sequence number. Note that
setting the sequence is only possible on a socket in the <code>CLOSE</code> state.
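
A minimal sketch of reading and writing a sequence number with these two options. The helper names are hypothetical, and the code assumes that the libc's <code>netinet/tcp.h</code> exposes the repair constants and that the socket is already in repair mode.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h> /* TCP_REPAIR_QUEUE, TCP_QUEUE_SEQ, TCP_*_QUEUE */
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical helper: choose the queue that later repair calls act on. */
int tcp_repair_select_queue(int fd, int queue)
{
    assert(queue == TCP_RECV_QUEUE || queue == TCP_SEND_QUEUE);
    return setsockopt(fd, IPPROTO_TCP, TCP_REPAIR_QUEUE,
                      &queue, sizeof(queue));
}

/* Read the sequence number of the given queue (dump side). */
int tcp_repair_get_seq(int fd, int queue, uint32_t *seq)
{
    socklen_t len = sizeof(*seq);

    if (tcp_repair_select_queue(fd, queue) < 0)
        return -1;
    return getsockopt(fd, IPPROTO_TCP, TCP_QUEUE_SEQ, seq, &len);
}

/* Set the sequence number of the given queue (restore side).
 * The socket must still be in the CLOSE state, i.e. before connect(). */
int tcp_repair_set_seq(int fd, int queue, uint32_t seq)
{
    if (tcp_repair_select_queue(fd, queue) < 0)
        return -1;
    return setsockopt(fd, IPPROTO_TCP, TCP_QUEUE_SEQ, &seq, sizeof(seq));
}
```

On restore one would call <code>tcp_repair_set_seq()</code> once for <code>TCP_SEND_QUEUE</code> and once for <code>TCP_RECV_QUEUE</code> before the repaired <code>connect()</code>.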

=== Packets in queue ===

Once the queue to repair is selected as described above, one can call the <code>recv()</code> or <code>send()</code> syscalls on the repaired socket. Both calls
result in peeking or poking data from/to the respective queue. This sounds funny, but yes, for a repaired socket one
can receive the outgoing queue and send to the incoming one. Using the <code>MSG_PEEK</code> flag for <code>recv()</code> is required.

=== Options ===

There are 4 options that are negotiated by the socket at the connecting stage. These are

* mss_clamp -- the maximum size of the segment the peer is ready to accept
* snd_wscale -- the scale factor for a window
* sack -- whether selective acks are permitted or not
* tstamp -- whether timestamps on packets are supported

All four can be read with <code>getsockopt()</code> calls on a socket, and the <code>TCP_REPAIR_OPTIONS</code> sockoption was introduced to restore them.

== Timestamp ==
"The sender's timestamp clock is used as a source of monotonic non-decreasing values to stamp the segments" (RFC 7323). The Linux kernel uses the jiffies counter as the TCP timestamp.

<code>#define tcp_time_stamp          ((__u32)(jiffies))</code>

The <code>TCP_TIMESTAMP</code> option was added to compensate for the difference between jiffies counters when a connection is migrated to another host. When a connection is dumped, criu calls <code>getsockopt(TCP_TIMESTAMP)</code> to get the current timestamp; on restore it calls <code>setsockopt(TCP_TIMESTAMP)</code> to set this timestamp as the starting point.

== Checkpoint and restore TCP connection ==

With the above sockoptions, dumping and restoring a TCP connection becomes possible. Criu just reads the socket
state and restores it back, letting the protocol resurrect the data sequence.

One thing to note here: while the socket is closed between dump and restore, the connection should be "locked", i.e.
no packets from the peer should enter the stack, otherwise the kernel will send an RST. To achieve this, a simple
netfilter rule is configured that drops all packets from the peer to the socket we're dealing with. This rule sits
in the host netfilter tables after the criu dump command finishes, and it should still be there when you issue the
criu restore one. The locking method can be specified using the {{opt|--network-lock}} option.

Another thing to note: on restore, the IP address that was used by the connection should be available.
This is automatically so if restore happens on the same box as dump. In case of hand-made live migration the
IP address should be copied too.

That said, the command line option {{opt|--tcp-established}} should be used when calling criu to explicitly state that the
caller is aware of this "transitional" state of the netfilter.

If the target process lives in a NET namespace, connection locking happens the other way. Instead of
per-connection iptables rules, the "network-lock"/"network-unlock" [[action scripts]] are called so that the user
can isolate the whole netns from the network. Typically this is done by downing the respective veth pair end.

== States ==
=== TCP_SYN_SENT ===
There is only one difference from TCP_ESTABLISHED: we have to restore the socket and disable repair mode before calling <code>connect()</code>. The kernel will then send a single SYN packet with the same initial sequence number and set the TCP_SYN_SENT state for the socket.

=== Half-closed sockets ===
A socket is half-closed when it has sent or received a FIN packet. These sockets are in one of these states: TCP_FIN_WAIT1, TCP_FIN_WAIT2, TCP_CLOSING, TCP_LAST_ACK, TCP_CLOSE_WAIT. To restore these states, we restore the socket into the TCP_ESTABLISHED state, then call <code>shutdown(SHUT_WR)</code> if the socket had sent a FIN packet, and send a fake FIN packet if the socket had received one. For example, to restore the TCP_FIN_WAIT1 state we call <code>shutdown(SHUT_WR)</code>; to restore the TCP_FIN_WAIT2 state we additionally send a fake ACK to the FIN packet.
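
The per-state decisions for half-closed sockets can be written down as a small lookup. This is only a sketch of the rule stated above: the <code>STEP_*</code> flags and function names are hypothetical, the flags do not encode the order in which the steps are applied, and injecting the fake FIN or ACK packet is left out entirely.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/tcp.h> /* TCP_ESTABLISHED, TCP_FIN_WAIT1, ... */
#include <sys/socket.h>

/* Hypothetical flags naming the extra steps after restoring ESTABLISHED. */
enum {
    STEP_SHUTDOWN_WR = 1 << 0, /* we had sent a FIN: call shutdown(SHUT_WR) */
    STEP_FAKE_FIN    = 1 << 1, /* peer had sent a FIN: inject a fake FIN */
    STEP_FAKE_ACK    = 1 << 2, /* our FIN had been acked: inject a fake ACK */
};

/* Map a saved TCP state to the steps needed to get back into it. */
int half_closed_steps(int state)
{
    switch (state) {
    case TCP_FIN_WAIT1:
        return STEP_SHUTDOWN_WR;
    case TCP_FIN_WAIT2:
        return STEP_SHUTDOWN_WR | STEP_FAKE_ACK;
    case TCP_CLOSING:
    case TCP_LAST_ACK:
        return STEP_SHUTDOWN_WR | STEP_FAKE_FIN;
    case TCP_CLOSE_WAIT:
        return STEP_FAKE_FIN;
    default:
        return 0; /* not a half-closed state */
    }
}

/* Apply the part that needs no packet injection: re-send our FIN. */
int restore_fin_sent(int fd, int state)
{
    assert(fd >= 0);
    if (half_closed_steps(state) & STEP_SHUTDOWN_WR)
        return shutdown(fd, SHUT_WR);
    return 0;
}
```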

== See also ==
* [[Simple TCP pair]]
* [[TCP repair TODO]]
* [[CLI/opt/--tcp-close|Dropping the connection]]

== External links ==
* http://lwn.net/Articles/495304/

[[Category:Under the hood]]
[[Category:Sockets]]
[[Category: Editor help needed]]
