Line 15: |
Line 15: |
| | | |
| For virtual filesystems like proc or sysfs there is another way for such files to appear: the removal of the object represented by a file on this FS. In particular, if we open some file in /proc/$pid and the respective task dies, the path of the opened file gets removed while the file itself stays alive (though any attempt to read from it reports an ENOENT error). | | For virtual filesystems like proc or sysfs there is another way for such files to appear: the removal of the object represented by a file on this FS. In particular, if we open some file in /proc/$pid and the respective task dies, the path of the opened file gets removed while the file itself stays alive (though any attempt to read from it reports an ENOENT error). |
| + | |
| + | == What CRIU does about it == |
| + | |
| + | === Detection === |
| + | |
| + | First of all, CRIU has to detect that this situation has occurred. Modulo some [[filesystems pecularities|filesystem peculiarities]], this is done as follows. |
| + | |
| + | First, we [[dumping files|get the files]] from the target process via a unix socket. Then, for each of them, we get the file's name via /proc by calling <code>readlink</code> on the /proc/self/fd/$fd path. Note that we readlink the ''self'' FD, which gives us a file name we can work with. Next we <code>fstat()</code> the same self file descriptor. |
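The two lookups described above can be sketched in a few lines of Python. This is an illustration only, not CRIU's code (CRIU is written in C); the helper name <code>fd_name_and_stat</code> is hypothetical, and the descriptor is assumed to be already held by the calling process on a Linux system with /proc mounted.

```python
import os


def fd_name_and_stat(fd):
    """Return (name, stat_result) for a descriptor owned by this process.

    Hypothetical helper for illustration; not part of CRIU.
    """
    # /proc/self/fd/<fd> is a symlink whose target is the path the kernel
    # has recorded for the open file.
    name = os.readlink(f"/proc/self/fd/{fd}")
    # fstat() describes the open file itself, whatever happened to its name.
    return name, os.fstat(fd)
```

The returned <code>st_nlink</code>, <code>st_dev</code> and <code>st_ino</code> values are the ones the following checks rely on.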
| + | |
| + | If the <code>st_nlink</code> field is zero, then the file is fully deleted from the system. Since no filesystem allows creating a name for such a file again, we have no choice but to put the file itself into the images. So we generate a so-called ''ghost file'' in the image directory and copy the file contents into it. |
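A minimal sketch of the ghost-file idea, under stated assumptions: the function name <code>dump_ghost</code> and the plain byte-for-byte copy are hypothetical, and the on-disk format CRIU actually uses for ghost files is not described here. The sketch only shows that an unlinked file can still be read through its descriptor and saved elsewhere.

```python
import os


def dump_ghost(fd, ghost_path, chunk=1 << 16):
    """Copy the contents of a fully unlinked file into ghost_path.

    Hypothetical helper; returns the number of bytes copied.
    """
    st = os.fstat(fd)
    if st.st_nlink != 0:
        raise ValueError("file still has a name, not a ghost candidate")
    copied = 0
    with open(ghost_path, "wb") as out:
        while True:
            # pread() takes an explicit offset, so the descriptor's own
            # file position is left untouched.
            data = os.pread(fd, chunk, copied)
            if not data:
                break
            out.write(data)
            copied += len(data)
    return copied
```

Reading with <code>pread</code> rather than <code>read</code> is a design choice of this sketch: the file position of the descriptor is part of the state being dumped, so the copy should not move it.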
| + | |
| + | But what happens if the link count is not zero? Then we should check that the name we got from proc is one by which we can actually see this file. So we call <code>stat()</code> on this name and compare its <code>st_dev</code> and <code>st_ino</code> fields with those obtained from the earlier fstat() call. If they match, the file is alive and we can just dump its name. If they don't, the name we got references some other file and we fail the dump. This can be handled, but the situation is quite rare, so we decided to implement support for it later. |
| + | |
| + | But there's also a third possibility: the <code>stat()</code> call could fail with an ENOENT error, which means that the file has names, but the one it was opened by has been removed. In ''this'' situation we cannot just save the file name in the images, since this name is no longer alive. Neither can we dump the file as a ghost, as the same file can be accessed by some other name. And, as was said, there's no way to find this other name. Fortunately, in this case filesystems allow creating a new name for the file, so CRIU calls the <code>linkat</code> system call to create a temporary name for this file on the disk and saves this name in the image. This is called ''link-remap''. Since this manoeuvre modifies the filesystem, CRIU requires the special option ''--link-remap'' to be passed to allow this behaviour. On restore, the link-remap names are removed after the files are restored. |
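The decision tree from this section (ghost, alive, mismatch, link-remap) can be sketched as follows. Everything here is an assumption made for illustration: the names <code>classify</code> and <code>link_remap</code> are hypothetical, the sketch strips the <code> (deleted)</code> suffix that the kernel appends to the readlink result of an unlinked name, and it links through the /proc/self/fd path with <code>AT_SYMLINK_FOLLOW</code>, which may differ from the exact <code>linkat</code> arguments CRIU passes.

```python
import ctypes
import os

AT_FDCWD = -100            # Linux value from <fcntl.h>
AT_SYMLINK_FOLLOW = 0x400  # Linux value from <fcntl.h>
DELETED_SUFFIX = " (deleted)"

_libc = ctypes.CDLL(None, use_errno=True)


def classify(fd):
    """Return (state, name); state is 'ghost', 'alive', 'mismatch' or 'remap'."""
    name = os.readlink(f"/proc/self/fd/{fd}")
    # Simplification: a file really named "... (deleted)" is not handled.
    if name.endswith(DELETED_SUFFIX):
        name = name[: -len(DELETED_SUFFIX)]
    fst = os.fstat(fd)
    if fst.st_nlink == 0:
        return "ghost", name       # no names left: contents must be saved
    try:
        st = os.stat(name)
    except FileNotFoundError:
        return "remap", name       # other names exist, but this one is gone
    if (st.st_dev, st.st_ino) == (fst.st_dev, fst.st_ino):
        return "alive", name       # the name still leads to the same file
    return "mismatch", name        # the name now refers to another file


def link_remap(fd, new_name):
    """Give the open file a new name by linking through its /proc entry."""
    rc = _libc.linkat(AT_FDCWD, f"/proc/self/fd/{fd}".encode(),
                      AT_FDCWD, os.fsencode(new_name), AT_SYMLINK_FOLLOW)
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), new_name)
```

A caller would dump only the name for <code>alive</code>, fail on <code>mismatch</code>, save the contents for <code>ghost</code>, and call <code>link_remap</code> for <code>remap</code>; the new name must live on the same filesystem as the file.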
| + | |
| + | Please note that a file may have been opened via many removed names, and for each of them the link-remap name should point to the same file, so while dumping and restoring CRIU keeps track of these name-to-inode mappings. |
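One way to picture this bookkeeping is a table keyed by the device and inode numbers, so that every removed name of the same file resolves to one link-remap name. The class name <code>RemapTable</code> and the <code>link_remap.N</code> naming scheme below are hypothetical and chosen only for this sketch; they are not taken from CRIU.

```python
class RemapTable:
    """Maps (st_dev, st_ino) to the single link-remap name for that inode.

    Hypothetical structure for illustration; not CRIU's implementation.
    """

    def __init__(self, prefix="link_remap"):
        self._names = {}
        self._prefix = prefix

    def name_for(self, st_dev, st_ino):
        key = (st_dev, st_ino)
        if key not in self._names:
            # First time this inode is seen: allocate a fresh name.
            self._names[key] = f"{self._prefix}.{len(self._names) + 1}"
        # Later lookups for the same inode reuse the same name.
        return self._names[key]
```

With such a table, two descriptors opened via different removed names of one file get the same remap name, so only one temporary link has to be created and later removed.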
| + | |
| + | === Virtual filesystems === |
| + | |
| + | For proc CRIU does a slightly different trick. When we see a dead name in /proc we can neither link() a new name nor create a ghost file. Instead, we remember the PID of the process that died and on restore create a temporary task with the desired PID, which gets killed right after all its open()-ers are restored. |
| + | |
| + | [[Category: Under the hood]] |