CRIU - User contributions [en] (https://criu.org/api.php?action=feedcontributions&user=Alak&feedformat=atom)<br />
2024-03-19T01:12:17Z, MediaWiki 1.35.6<br />
<br />
Restorer v2 (https://criu.org/index.php?title=Restorer_v2&diff=5074), 2020-07-23T16:38:31Z<br />
<p>Alak: Grammar fix.</p>
<hr />
<div>The current way of restoring from images is quite straightforward -- we just fork the necessary number of tasks, then each of them starts "dressing" itself based on the data from image files. Sometimes shared resources appear, e.g. files or sessions inherited on fork, or CLONE_XXX objects. For each of these we write code that handles only that type of resource -- files are sent to co-owners via unix sockets, CLONE_XXX stuff is pre-created at the fork stage. Sometimes more interesting dependencies arise, for example [[invisible files]]. These are also handled as special cases.<br />
<br />
None of this is nice. We believe that the restoring process can (and should) look better. Like this.<br />
<br />
== Concept ==<br />
The result that we want to achieve can be described as a graph of objects in which each edge carries a specific "relationship" meaning. This graph should be generated using a set of pre-defined rules, e.g.<br />
<br />
* fork() -- an object of type "task" may create a copy of itself; each node "linked" to it also gets "linked" to the new object<br />
* open() -- a new object of type "file" appears in the graph, "linked" to the "task" object that called open()<br />
<br />
And so on.<br />
<br />
Another way of describing this is as a string of sequences like <code>{o,t,[l]*}</code>, where <code>o</code> is the ID of an object, <code>t</code> is its type, and <code>[l]*</code> is an array of links to other objects. The generation rules can look like<br />
<br />
* fork() -- <code>({(*),task,[(*)]}) -> \1,{$new,task,([parent=\2,\3])}</code>, which means that any task can create a copy of itself that has the original as its parent (link) and shares the rest of its objects<br />
* open() -- <code>{(*),task,(*)} -> {\1,task,[file=$new,\2]},{$new,file,[]}</code>, which means that any task can create a new object of type file linked to it<br />
<br />
The restoring process then is: given a final graph or string and the set of generating rules, find a sequence of rule applications that produces it.<br />
<br />
This task (in its second formulation) resembles finding a derivation in a context-sensitive grammar.<br />
<br />
== Examples ==<br />
=== Process tree, sessions and groups ===<br />
<br />
If we consider only three IDs, we can define a task as a non-terminal of the form <code>{PGS[Ki]}</code>, where <code>P</code> is the task PID, <code>G</code> is the task PGID, <code>S</code> is the task session ID and <code>[Ki]</code> is the list of children.<br />
<br />
The initial tree would look like<br />
<br />
<pre><br />
{111}<br />
</pre><br />
<br />
and the rules would look like below.<br />
<br />
<pre><br />
FORK: {T(..)(.*)} -> {T\1N\2}|{N\1} // T forks N<br />
EXIT: {I(..)(.*)}(.*){X(..)(.*)} -> {I\1\2\5}\3 // X exits and all its kids are reparented to I (init)<br />
SETSID: {S(.)(.)(.*)} -> {SSS\3} // S becomes a new session leader (and, as with setsid(), a new group leader)<br />
SETPGID(0): {P(.)(.)(.*)} -> {PP\2\3} // P becomes a new group leader<br />
SETPGID(G): {GGS(.*)}(.*){P(.)S(.*)} -> {GGS\1}\2{PGS\4}<br />
{P(.)S(.*)}(.*){GGS(.*)} -> {PGS\2}\3{GGS\4} // P joins group of alive G<br />
</pre><br />
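<br />
The search described in the Concept section can be sketched for these rules as a plain breadth-first search. This is only an illustration, not CRIU code: the tuple encoding of a task, the names <code>successors</code> and <code>find_sequence</code>, and the pruning by expected parent are all assumptions made here. The EXIT rule is left out to keep the state space small, and two checks that the kernel applies are added to the rules (setsid() fails if a process group with the caller's PID already exists, and a session leader cannot change its group).<br />
<br />
```python
from collections import deque

# A task is (pid, pgid, sid, parent_pid); a state is a frozenset of tasks.
INIT = frozenset({(1, 1, 1, 0)})  # the initial tree {111}


def successors(state, parent_of):
    """Yield (step, new_state) for each rule applicable in `state`."""
    alive = {t[0] for t in state}
    for task in state:
        pid, pgid, sid, parent = task
        rest = state - {task}
        # FORK: the child inherits PGID and SID. Only children that the
        # target tree expects from this parent are created (pruning).
        for new, wanted_parent in parent_of.items():
            if new not in alive and wanted_parent == pid:
                yield ("fork", pid, new), state | {(new, pgid, sid, pid)}
        # SETSID: fails if a process group with ID == pid already exists.
        if all(t[1] != pid for t in state):
            yield ("setsid", pid), rest | {(pid, pid, pid, parent)}
        # SETPGID(0): a session leader cannot change its group.
        if sid != pid and pgid != pid:
            yield ("setpgid", pid, pid), rest | {(pid, pid, sid, parent)}
        # SETPGID(G): join the group of an alive leader G in the same session.
        for g, g_pgid, g_sid, _ in state:
            if (g != pid and g_pgid == g and g_sid == sid
                    and sid != pid and pgid != g):
                yield ("setpgid", pid, g), rest | {(pid, g, sid, parent)}


def find_sequence(target, max_states=100_000):
    """Breadth-first search for a shortest rule sequence from INIT to target."""
    target = frozenset(target)
    parent_of = {pid: parent for pid, _, _, parent in target}
    queue, seen = deque([(INIT, [])]), {INIT}
    while queue and len(seen) <= max_states:
        state, steps = queue.popleft()
        if state == target:
            return steps
        for step, nxt in successors(state, parent_of):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + [step]))
    return None  # unreachable without EXIT, or the search budget ran out


# Task 2 is a session leader forked by init; task 3 is its child.
target = {(1, 1, 1, 0), (2, 2, 2, 1), (3, 2, 2, 2)}
print(find_sequence(target))
```
<br />
For this target the search prints <code>[('fork', 1, 2), ('setsid', 2), ('fork', 2, 3)]</code>: task 2 has to call setsid() before forking task 3, because task 3 could not join session 2 afterwards.<br />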
<br />
[[Category: Plans]]<br />
[[Category: Thinkers]]</div>
Alak<br />
<br />
External bind mounts (https://criu.org/index.php?title=External_bind_mounts&diff=5067), 2020-06-25T13:52:12Z<br />
<p>Alak: Possible false double negative</p>
<hr />
<div>__TOC__<br />
<br />
One typical external resource encountered when dumping a container (especially LXC/Docker) is a mount point whose root lies outside the container's root. This situation was originally meant to be handled by [[plugins]], but it turned out to be common enough to warrant a built-in way of handling it.<br />
<br />
== What is an external bind mount ==<br />
<br />
Creating one is as simple as<br />
<br />
 mkdir -p /root/bar<br />
mount --bind /foo /root/bar<br />
chroot /root<br />
<br />
That's it. From now on, /bar is a mountpoint whose root (the source) is not directly accessible from inside the chroot.<br />
<br />
If you look at the /proc/$pid/mountinfo file of a task that sees such a mount, you will see something like<br />
<br />
 23 11 8:3 /root / ... - ext4 /dev/sda1 ...<br />
 34 23 8:3 /foo /bar ... - ext4 /dev/sda1 ...<br />
<br />
Columns 4 and 5 are the root and the mountpoint, respectively. You can see that / is the /root directory of the /dev/sda1 device, and that /bar is a mountpoint whose root is the /foo directory of the same device.<br />
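<br />
Reading those two columns programmatically could look like the sketch below. The sample text is modelled on the lines above with the elided fields filled by placeholder values, and <code>external_candidates</code> is a hypothetical heuristic written for this illustration (same device, root not under the root of the mount at /); it is not the check CRIU itself performs, and it ignores the octal escapes that mountinfo uses for special characters in paths.<br />
<br />
```python
def parse_mountinfo(text):
    """Parse mountinfo text into dicts holding the first five columns."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        mounts.append({
            "mnt_id": int(fields[0]),
            "parent_id": int(fields[1]),
            "dev": fields[2],
            "root": fields[3],        # column 4
            "mountpoint": fields[4],  # column 5
        })
    return mounts


def is_under(path, base):
    """True if `path` equals `base` or lies below it."""
    base = base.rstrip("/")
    return path == base or path.startswith(base + "/")


def external_candidates(mounts):
    """Mountpoints whose root lies outside the root of the mount at "/".

    Hypothetical heuristic for illustration only.
    """
    top = next(m for m in mounts if m["mountpoint"] == "/")
    return [m["mountpoint"] for m in mounts
            if m["dev"] == top["dev"] and not is_under(m["root"], top["root"])]


# Placeholder options ("rw") stand in for the elided fields.
SAMPLE = """\
23 11 8:3 /root / rw - ext4 /dev/sda1 rw
34 23 8:3 /foo /bar rw - ext4 /dev/sda1 rw
"""

mounts = parse_mountinfo(SAMPLE)
print(external_candidates(mounts))
```
<br />
On the sample this reports only /bar, since its root /foo is not below /root.<br />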
<br />
== How to teach CRIU to dump them ==<br />
<br />
By default CRIU doesn't dump such mountpoints, because it would have no way to restore them -- the root of such a mount lies outside the scope of what CRIU dumps. In the logs you would see a message like<br />
<br />
34:/bar doesn't have a proper root mount<br />
<br />
which means that the mountpoint /bar has an inaccessible root.<br />
<br />
To dump and restore them, use the <code>--external mnt[KEY]:VAL</code> option, which sets up a mapping for the roots of external mounts.<br />
<br />
On dump, KEY is a mountpoint inside the container, and the corresponding VAL is a string that will be written into the image as the mountpoint's root value.<br />
<br />
On restore, KEY is the value from the image (the VAL given on dump), and VAL is the path on the host that will be bind-mounted into the container (at the mountpoint path stored in the image).<br />
<br />
For example, if we want to dump the task above we should call<br />
<br />
criu dump ... --external mnt[/bar]:barmount<br />
<br />
The word <code>barmount</code> is an arbitrary identifier that will be put in the image file instead of the original root path:<br />
<br />
criu show -f mountpoints.img -F mnt_id,root,mountpoint<br />
mnt_id: 0x22 root: barmount mountpoint: /bar<br />
<br />
On restore, we should tell CRIU where to bind-mount <code>barmount</code> from, like this:<br />
<br />
criu restore ... --external mnt[barmount]:/foo<br />
<br />
With this, CRIU will bind-mount /foo onto the proper mountpoint.<br />
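<br />
The two-stage mapping (mountpoint to identifier on dump, identifier to host path on restore) can be generated from a single table. The helper below is a hypothetical convenience, not part of CRIU; the function name and the <code>extmnt0</code>-style identifiers are assumptions, and only the <code>--external mnt[KEY]:VAL</code> syntax comes from the text above.<br />
<br />
```python
def external_mount_args(bind_mounts):
    """Build matching option lists for `criu dump` and `criu restore`.

    bind_mounts maps a mountpoint inside the container to the host path
    that should be bind-mounted there on restore.
    """
    dump, restore = [], []
    for n, (mountpoint, host_path) in enumerate(sorted(bind_mounts.items())):
        key = f"extmnt{n}"  # arbitrary identifier stored in the image
        dump += ["--external", f"mnt[{mountpoint}]:{key}"]
        restore += ["--external", f"mnt[{key}]:{host_path}"]
    return dump, restore


dump_opts, restore_opts = external_mount_args({"/bar": "/foo"})
print("criu dump ...", *dump_opts)
print("criu restore ...", *restore_opts)
```
<br />
Passing the options as separate list items (for example to <code>subprocess.run</code>) avoids having to quote the square brackets for the shell.<br />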
<br />
== Auto detection ==<br />
<br />
In case one wants CRIU to autodetect and dump all the external bind mounts, and there is no need to change host mount points on restore, one can use a special syntax:<br />
<br />
criu dump ... --external mnt[]:''flags''<br />
<br />
Note that there is nothing inside the square brackets. The optional <code>:''flags''</code> argument can contain the following characters:<br />
<br />
; <code>m</code><br />
: Also enable dumping of external master mounts (as in <code>mount --make-slave</code>)<br />
; <code>s</code><br />
: Also enable dumping of external shared mounts (as in <code>mount --make-shared</code>)<br />
<br />
By default, neither master nor shared external mounts are dumped (if any are found, the dump is aborted). Note that if <code>''flags''</code> are not given, the colon is optional.<br />
<br />
=== Examples ===<br />
<br />
criu dump ... --external 'mnt[]'<br />
<br />
Auto-detect and dump all external bind mounts.<br />
<br />
criu dump ... --external 'mnt[]:s'<br />
<br />
Auto-detect and dump all external bind mounts, including the shared ones.<br />
<br />
criu dump ... --external 'mnt[]:sm'<br />
<br />
Auto-detect and dump all external bind mounts, including the shared and the master ones.<br />
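<br />
To make the two forms of the option value explicit, here is a small sketch of how a wrapper script might tell them apart. It is not CRIU's own parser; the function name and the returned dictionary layout are assumptions made for this illustration.<br />
<br />
```python
import re

# mnt[KEY]:VAL, or mnt[] with optional :flags
_MNT = re.compile(r"^mnt\[(?P<key>[^\]]*)\](?::(?P<val>.*))?$")


def parse_external_mnt(arg):
    """Split an `--external` mount argument into its parts."""
    m = _MNT.match(arg)
    if m is None:
        raise ValueError(f"not an external mount argument: {arg!r}")
    key, val = m.group("key"), m.group("val") or ""
    if key == "":
        # Auto-detection form: only the flags 'm' and 's' are described.
        unknown = set(val) - {"m", "s"}
        if unknown:
            raise ValueError(f"unknown flags: {sorted(unknown)}")
        return {"auto": True, "master": "m" in val, "shared": "s" in val}
    return {"auto": False, "key": key, "val": val}


print(parse_external_mnt("mnt[]:sm"))
print(parse_external_mnt("mnt[/bar]:barmount"))
```
<br />
An empty key selects auto-detection, in which case the part after the colon is read as flags rather than as a mapping value.<br />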
<br />
== Sharing ==<br />
<br />
External bind mounts can have either external or internal sharing. Consider the following example:<br />
<br />
# Preparation<br />
unshare -m --propagation private<br />
mkdir /external_mount_sharing_test<br />
mount -t tmpfs tmpfs /external_mount_sharing_test/<br />
mount --make-private /external_mount_sharing_test/<br />
cd /external_mount_sharing_test<br />
# Source of external mount<br />
mkdir external_mount<br />
mount -t tmpfs tmpfs-external external_mount/<br />
mount --make-shared external_mount/<br />
cat /proc/$$/mountinfo | grep external<br />
# 811 755 0:60 / /external_mount_sharing_test rw,relatime - tmpfs tmpfs rw<br />
# 812 811 0:62 / /external_mount_sharing_test/external_mount rw,relatime shared:290 - tmpfs tmpfs-external rw<br />
<br />
# Switch to CT mntns<br />
unshare -m --propagation unchanged sh<br />
mkdir root<br />
mount -t tmpfs tmpfs-root root/<br />
mkdir root/external_sharing root/internal_sharing root/proc<br />
<br />
# Create external mount<br />
mount --bind external_mount/ root/external_sharing<br />
mount --bind external_mount/ root/internal_sharing<br />
mount --make-private root/internal_sharing<br />
mount --make-shared root/internal_sharing<br />
<br />
# More preparations<br />
mount --bind /proc root/proc<br />
cd root<br />
mkdir bin lib64<br />
SH=$(which sh)<br />
cp $SH bin<br />
cp $(ldd $SH | grep "/lib64" | sed 's/^.*\(\/lib64\S*\)\s.*$/\1/') lib64<br />
CAT=$(which cat)<br />
cp $CAT bin<br />
cp $(ldd $CAT | grep "/lib64" | sed 's/^.*\(\/lib64\S*\)\s.*$/\1/') lib64<br />
PATH=$PATH:/bin<br />
chroot . sh<br />
cat /proc/$$/mountinfo<br />
# 843 841 0:63 / / rw,relatime - tmpfs tmpfs-root rw<br />
# 861 843 0:62 / /external_sharing rw,relatime shared:290 - tmpfs tmpfs-external rw<br />
# 898 843 0:62 / /internal_sharing rw,relatime shared:349 - tmpfs tmpfs-external rw<br />
# 899 843 0:5 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw<br />
<br />
Mounts 812 (on the host) and 861 (in the container) belong to the same shared group (290): this is external sharing. Mount 898 has its own local shared group (349): this is internal sharing.<br />
<br />
Before [https://github.com/checkpoint-restore/criu/pull/906 #906], this external/internal sharing state was detected for auto-detected external mounts only, but it is needed for manual external mounts too. Moreover, the same applies to manual external slave mounts: they can be external or internal slaves as well.<br />
<br />
So a mount is considered to have external sharing if the mount namespace of CRIU contains mounts of the same shared group, and it is considered an external slave if there is no master mount for it in the CT mount namespaces.<br />
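<br />
The detection rule can be illustrated on the mountinfo output from the example. The sketch below only models the rule as stated here; it is not the CRIU implementation, and the function names and label strings are assumptions. It relies on the <code>shared:N</code> and <code>master:N</code> tags that mountinfo places in the optional fields before the <code>-</code> separator.<br />
<br />
```python
def propagation(line):
    """Return (mnt_id, mountpoint, shared_group, master_group) for a line."""
    fields = line.split()
    sep = fields.index("-")  # optional fields sit between column 6 and "-"
    tags = dict(f.split(":", 1) for f in fields[6:sep] if ":" in f)
    return int(fields[0]), fields[4], tags.get("shared"), tags.get("master")


def classify(host_text, ct_text):
    """Label each container mount by the kind of sharing/slavery it has."""
    host = [propagation(l) for l in host_text.splitlines() if l.strip()]
    ct = [propagation(l) for l in ct_text.splitlines() if l.strip()]
    host_shared = {shared for _, _, shared, _ in host if shared}
    ct_shared = {shared for _, _, shared, _ in ct if shared}
    result = {}
    for _, mountpoint, shared, master in ct:
        labels = []
        if shared:
            labels.append("external sharing" if shared in host_shared
                          else "internal sharing")
        if master:
            # No master mount inside the container: the master is external.
            labels.append("internal slave" if master in ct_shared
                          else "external slave")
        result[mountpoint] = labels
    return result


HOST = """\
811 755 0:60 / /external_mount_sharing_test rw,relatime - tmpfs tmpfs rw
812 811 0:62 / /external_mount_sharing_test/external_mount rw,relatime shared:290 - tmpfs tmpfs-external rw
"""

CT = """\
843 841 0:63 / / rw,relatime - tmpfs tmpfs-root rw
861 843 0:62 / /external_sharing rw,relatime shared:290 - tmpfs tmpfs-external rw
898 843 0:62 / /internal_sharing rw,relatime shared:349 - tmpfs tmpfs-external rw
899 843 0:5 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
"""

labels = classify(HOST, CT)
for mountpoint, kinds in labels.items():
    print(mountpoint, kinds)
```
<br />
Here /external_sharing is labelled as external sharing because group 290 is also present on the host side, while /internal_sharing (group 349) is labelled as internal sharing.<br />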
<br />
== Old days ==<br />
<br />
For now the same behavior is configured with the <code>--ext-mount-map KEY:VAL</code> option. Soon this option will be [[deprecation|deprecated]].<br />
<br />
[[Category:HOWTO]]<br />
[[Category:External]]</div>
Alak