Line 26: |
Line 26: |
| </pre> | | </pre> |
| | | |
− | Now let's remember, that a file can be opened multiple times in one task, this is happens when you e.g. start a shell. One of the <code>/dev/tty</code> or alike files will sit under 0, 1 and 2 descriptors. Not a big deal, we just expand the <code>struct fd</code> | + | Now let's remember that a file can be opened multiple times in one task; this happens, e.g., when you start a shell. One of the <code>/dev/tty</code> or similar files will sit under descriptors 0, 1 and 2. Not a big deal, we just expand the <code>struct fd</code> |
| | | |
| <pre> | | <pre> |
Line 46: |
Line 46: |
| close(fd); | | close(fd); |
| </pre> | | </pre> |
| + | |
| + | The next thing to handle is a file shared between tasks. This is also very typical: once you have called <code>open()</code> and then <code>fork()</code>, the file becomes shared. But what if a file is shared between two processes, neither of which is an ancestor of the other? There are two ways of restoring this; CRIU uses the most straightforward one -- it [http://linux.die.net/man/3/cmsg sends file descriptors] between processes. |
| + | |
| + | This requires some complication in the structures we use |
| + | |
| + | <pre> |
| + | struct pid_fd { |
| + | int pid; |
| + | int fd; |
| + | }; |
| + | |
| + | struct fd { |
| + | struct file *file; |
| + | int n_fds; |
| + | struct pid_fd *tgt_fds; |
| + | } *fd; |
| + | </pre> |
| + | |
| + | and in the code, which now consists of two parts -- one that opens a file and sends it to others, and another that just receives it. We will come back to this below; for now, let's enjoy the code we have at the moment: |
| + | |
| + | <pre> |
| + | /* tmp is the temporary descriptor; fd is the struct fd pointer declared above */ |
| + | int tmp, i, pid = getpid(); |
| + | |
| + | if (pid == file_opener(fd)) { |
| + | tmp = open_a_file(fd->file); |
| + | |
| + | for (i = 0; i < fd->n_fds; i++) { |
| + | if (fd->tgt_fds[i].pid == pid) |
| + | dup2(tmp, fd->tgt_fds[i].fd); /* our own target */ |
| + | else |
| + | send_fd(tmp, &fd->tgt_fds[i]); /* someone else's target */ |
| + | } |
| + | |
| + | close(tmp); |
| + | } else { |
| + | for (i = 0; i < fd->n_fds; i++) { |
| + | if (fd->tgt_fds[i].pid != pid) |
| + | continue; |
| + | |
| + | tmp = recv_fd(); |
| + | dup2(tmp, fd->tgt_fds[i].fd); |
| + | close(tmp); |
| + | } |
| + | } |
| + | </pre> |
| + | |
| + | Please note that all <code>tgt_fds</code> belonging to some task are opened by a different one and then sent to the real owner in the order they appear in the array. The receiver does the same -- it receives the fds in the same order, so this algorithm puts files into the proper descriptors. |
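The ordering argument can be checked with a small sketch. The helper name <code>expected_fds</code> is hypothetical and not part of CRIU; it only assumes the <code>struct pid_fd</code> layout shown earlier. It walks the array the same way both sides do and lists, in order, the descriptor numbers a given task should end up with, so the k-th received file goes to the k-th entry of the result.

```c
#include <stddef.h>

struct pid_fd {
    int pid;
    int fd;
};

/*
 * Hypothetical helper: collect, in array order, the descriptor numbers
 * that task `pid` owns in `tgt`. The sender skips to the same entries in
 * the same order, so the k-th descriptor received belongs in out[k].
 * `out` must have room for n entries. Returns the number of entries found.
 */
size_t expected_fds(const struct pid_fd *tgt, size_t n, int pid, int *out)
{
    size_t i, k = 0;

    for (i = 0; i < n; i++)
        if (tgt[i].pid == pid)
            out[k++] = tgt[i].fd;

    return k;
}
```

For the array <code>{1,0}, {2,5}, {1,1}, {2,7}</code>, task 2 gets two files, the first for descriptor 5 and the second for descriptor 7.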
| + | |
| + | There are several interesting things about the above code snippet. |
| + | |
| + | First, the <code>send_fd</code> and <code>recv_fd</code> routines cannot work using one socket for all tasks -- a descriptor sent to task <code>pid</code> should reach ''this'' task, not some arbitrary one that the kernel woke up earlier on data arrival. That said, we have to create at least one socket per task to receive descriptors. |
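For reference, here is a minimal sketch of how such routines could look on top of <code>SCM_RIGHTS</code> ancillary messages. It is not CRIU's actual code: the signatures are an assumption (the socket is passed explicitly, unlike in the pseudocode above), and error handling is reduced to returning -1.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Sketch: pass descriptor `fd` over AF_UNIX socket `sock`. Returns 0 or -1. */
int send_fd(int sock, int fd)
{
    char dummy = '*';               /* at least one byte of real data is sent */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {                         /* union keeps the buffer cmsghdr-aligned */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;   /* payload is an array of descriptors */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Sketch: receive one descriptor from `sock`. Returns the new fd or -1. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    /* The kernel has installed a new descriptor in the receiving task */
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received number is generally different from the sender's one, which is why the receiver still has to <code>dup2()</code> it into the proper slot.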
| + | |
| + | Second, files can be shared in a tricky manner, so that task A may have one file shared with task B and some other file shared with task C. If the "who opens a file" voting selects B and C for the respective files, they will have to send descriptors to A in proper coordination with each other. This coordination can be simplified if we create sockets not just per-pid, but per-(pid, fd). This is what CRIU does, and this is how it does that. |
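One way to get a socket per (pid, fd) pair is to derive the socket address from the pair itself, so that a sender can compute where to send without any extra negotiation. The sketch below is only an illustration: the helper <code>fd_sock_addr</code> and the <code>restore-fd-%d-%d</code> name format are hypothetical, and it assumes Linux abstract-namespace unix sockets.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: fill `addr` with an abstract-namespace unix address
 * that is unique for the (pid, fd) pair. Returns the address length to pass
 * to bind() or sendto(), or 0 if the name does not fit.
 */
socklen_t fd_sock_addr(struct sockaddr_un *addr, int pid, int fd)
{
    int n;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;

    /* sun_path[0] stays '\0' (abstract namespace); the name follows it */
    n = snprintf(addr->sun_path + 1, sizeof(addr->sun_path) - 1,
                 "restore-fd-%d-%d", pid, fd);
    if (n < 0 || (size_t)n >= sizeof(addr->sun_path) - 1)
        return 0;

    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}
```

With such a scheme the receiver would bind one datagram socket per expected descriptor, and each opener would send to the address computed from the target's pid and fd, so B and C never compete for the same socket of A.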
| | | |
| [[Category:Under the hood]] | | [[Category:Under the hood]] |