| Line 168: | Line 168: |
| | == External Checkpoint Restore == | | == External Checkpoint Restore == |
| | | | |
| − | {{Note| External C/R was done as proof-of-concept. Its use is discouraged and the helper script mentioned below will be deprecated in the near future.}} | + | {{Note| External C/R was done as proof-of-concept. Its use is highly discouraged.}} |
| | | | |
| − | This approach is called external because it's happening external to the
| + | Although it's not recommended, you can learn more about using CRIU without integrating with Docker at [[Docker_External]]. |
| − | Docker daemon. After checkpoint, the Docker daemon thinks that the
| |
| − | container has exited. After restore, the Docker daemon doesn't know that
| |
| − | the container is running again. Therefore, commands such as
| |
| − | <code>docker ps, stop, kill</code> and <code>logs</code>
| |
| − | will not work correctly.
| |
| − | | |
| − | Starting with CRIU 1.3, it is possible to checkpoint and restore a
| |
| − | process tree running inside a Docker container. However, it's
| |
| − | important to note that Docker needs native support for checkpoint
| |
| − | and restore in order to maintain its parent-child relationship and
| |
| − | to correctly keep track of container states. In other words, while
| |
| − | CRIU can C/R a process tree, the restored tree will not become a
| |
| − | child of Docker and, from Docker's point of view, the container's
| |
| − | state will remain "Exited" (even after successful restore).
| |
| − | | |
| − | It's important to re-emphasize that by checkpointing and restoring
| |
| − | a Docker container, we mean C/R of a process tree running inside a
| |
| − | container, excluding the Docker daemon itself. As CRIU currently
| |
| − | does not support nested PID namespaces, the C/R process tree cannot
| |
| − | include the Docker daemon which runs in the global PID namespace.
| |
| − | | |
| − | === Command Line Options ===
| |
| − | | |
| − | In addition to the usual CRIU command line options used when
| |
| − | checkpointing and restoring a process tree, the following command
| |
| − | line options are needed for Docker containers.
| |
| − | | |
| − | ==== <code>--root</code> ====
| |
| − | | |
| − | This option has been used in the past only for restore operations
| |
| − | that wanted to change the root of the mount namespace. It was not
| |
| − | used for checkpoint operations.
| |
| − | | |
| − | However, because Docker by default uses the AUFS graph driver and
| |
| − | the AUFS module in the kernel reveals branch pathnames in
| |
| − | <code>/proc/''pid''/map_files</code>, option <code>--root</code>
| |
| − | is used to specify the root of the
| |
| − | mount namespace. Once the kernel AUFS module is fixed, it won't
| |
| − | be necessary to specify this option anymore.
| |
| − | | |
| − | ==== <code>--ext-mount-map</code> ====
| |
| − | | |
| − | This option is used to specify the path of the external bind mounts.
| |
| − | Docker sets up <code>/etc/{hostname,hosts,resolv.conf}</code> as targets with
| |
| − | source files outside the container's mount namespace. Older versions
| |
| − | of Docker also bind mount <code>/.dockerinit</code>.
| |
| − | | |
| − | For example, assuming the default Docker configuration, <code>/etc/hostname</code>
| |
| − | in the container's mount namespace is bind mounted from the source
| |
| − | at <code>/var/lib/docker/containers/''container_id''/hostname</code>.
| |
| − | | |
| − | ==== <code>--manage-cgroups</code> ====
| |
| − | | |
| − | When a process tree exits after a checkpoint operation, the cgroups
| |
| − | that Docker had created for the container are removed. This option
| |
| − | is needed during restore to move the process tree into its cgroups,
| |
| − | re-creating them if necessary.
| |
| − | | |
| − | ==== <code>--evasive-devices</code> ====
| |
| − | | |
| − | Docker bind mounts <code>/dev/null</code> on <code>/dev/stdin</code> for detached containers
| |
| − | (i.e., <code>docker run -d ...</code>). Since earlier versions of Docker used
| |
| − | <code>/dev/null</code> in the global namespace, this option tells CRIU to treat
| |
| − | the global <code>/dev/null</code> and the container <code>/dev/null</code> as the same device.
| |
| − | | |
| − | ==== <code>--inherit-fd</code> ====
| |
| − | | |
| − | For native C/R support, this option tells CRIU to let the restored process "inherit"
| |
| − | its specified file descriptor (instead of restoring from checkpoint).
| |
| − | | |
| − | === Restore Prework for External C/R ===
| |
| − | | |
| − | Docker supports many storage drivers (AKA graph drivers) including
| |
| − | AUFS, Btrfs, ZFS, DeviceMapper, OverlayFS, and VFS. The user can
| |
| − | specify the desired storage driver via the <code>DOCKER_DRIVER</code>
| |
| − | environment variable or the <code>-s (--storage-driver)</code> command
| |
| − | line option.
| |
| − | | |
| − | Currently C/R can only be done on containers using either AUFS, OverlayFS, or VFS.
| |
| − | In the following example, we assume AUFS.
| |
| − | | |
| − | When Docker notices that the container has exited (due to CRIU dump),
| |
| − | it dismantles the container's filesystem. We need to set up the container's
| |
| − | filesystem again before attempting to restore.
| |
| − | | |
| − | === An External C/R Example ===
| |
| − | | |
| − | Below is an example to show C/R operations for a shell script that
| |
| − | continuously appends a number to a file. You can use <code>tail -f</code> to
| |
| − | see the process in action.
| |
| − | | |
| − | As you will see below, after restore, the process's parent is PID
| |
| − | 1 (init), not Docker. Also, although the process has been successfully
| |
| − | restored, Docker still thinks that the container has exited.
| |
| − | | |
| − | To set up the container's AUFS filesystem before restore, its branch
| |
| − | information should be saved before checkpointing the container.
| |
| − | For convenience, however, AUFS branch information is saved in the
| |
| − | dump.log file. So we can examine dump.log to set up the filesystem
| |
| − | again.
| |
| − | | |
| − | For brevity, the 64-character long container ID is replaced by the
| |
| − | string <container_id> in the following lines.
| |
| − | | |
| − | <pre>
| |
| − | $ docker run -d busybox:latest /bin/sh -c 'i=0; while true; do echo $i >> /foo; i=$(expr $i + 1); sleep 3; done'
| |
| − | <container_id>
| |
| − | $
| |
| − | $ docker ps
| |
| − | CONTAINER ID IMAGE COMMAND CREATED STATUS
| |
| − | 168aefb8881b busybox:latest "/bin/sh -c 'i=0; 6 seconds ago Up 4 seconds
| |
| − | $
| |
| − | $ sudo criu dump -o dump.log -v4 -t 17810 \
| |
| − | -D /tmp/img/<container_id> \
| |
| − | --root /var/lib/docker/aufs/mnt/<container_id> \
| |
| − | --ext-mount-map /etc/resolv.conf:/etc/resolv.conf \
| |
| − | --ext-mount-map /etc/hosts:/etc/hosts \
| |
| − | --ext-mount-map /etc/hostname:/etc/hostname \
| |
| − | --ext-mount-map /.dockerinit:/.dockerinit \
| |
| − | --manage-cgroups \
| |
| − | --evasive-devices
| |
| − | $
| |
| − | $ sudo grep successful /tmp/img/<container_id>/dump.log
| |
| − | (00.020103) Dumping finished successfully
| |
| − | $
| |
| − | $ docker ps -a
| |
| − | CONTAINER ID IMAGE COMMAND CREATED STATUS
| |
| − | 168aefb8881b busybox:latest "/bin/sh -c 'i=0; 6 minutes ago Exited (-1) 4 minutes ago
| |
| − | $
| |
| − | $ sudo mount -t aufs -o br=\
| |
| − | /var/lib/docker/aufs/diff/<container_id>:\
| |
| − | /var/lib/docker/aufs/diff/<container_id>-init:\
| |
| − | /var/lib/docker/aufs/diff/a9eb172552348a9a49180694790b33a1097f546456d041b6e82e4d7716ddb721:\
| |
| − | /var/lib/docker/aufs/diff/120e218dd395ec314e7b6249f39d2853911b3d6def6ea164ae05722649f34b16:\
| |
| − | /var/lib/docker/aufs/diff/42eed7f1bf2ac3f1610c5e616d2ab1ee9c7290234240388d6297bc0f32c34229:\
| |
| − | /var/lib/docker/aufs/diff/511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158:\
| |
| − | none /var/lib/docker/aufs/mnt/<container_id>
| |
| − | $
| |
| − | $ sudo criu restore -o restore.log -v4 -d \
| |
| − | -D /tmp/img/<container_id> \
| |
| − | --root /var/lib/docker/aufs/mnt/<container_id> \
| |
| − | --ext-mount-map /etc/resolv.conf:/var/lib/docker/containers/<container_id>/resolv.conf \
| |
| − | --ext-mount-map /etc/hosts:/var/lib/docker/containers/<container_id>/hosts \
| |
| − | --ext-mount-map /etc/hostname:/var/lib/docker/containers/<container_id>/hostname \
| |
| − | --ext-mount-map /.dockerinit:/var/lib/docker/init/dockerinit-1.0.0 \
| |
| − | --manage-cgroups \
| |
| − | --evasive-devices
| |
| − | $
| |
| − | $ sudo grep successful /tmp/img/<container_id>/restore.log
| |
| − | (00.424428) Restore finished successfully. Resuming tasks.
| |
| − | $
| |
| − | $ ps -ef | grep /bin/sh
| |
| − | root 18580 1 0 12:38 ? 00:00:00 /bin/sh -c i=0; while true; do echo $i >> /foo; i=$(expr $i + 1); sleep 3; done
| |
| − | $
| |
| − | $ docker ps -a
| |
| − | CONTAINER ID IMAGE COMMAND CREATED STATUS
| |
| − | 168aefb8881b busybox:latest "/bin/sh -c 'i=0; 7 minutes ago Exited (-1) 5 minutes ago
| |
| − | $
| |
| − | </pre>
| |
| − | | |
| − | === External C/R Helper Script ===
| |
| − | | |
| − | As seen in the above examples, the CRIU command line for checkpointing and
| |
| − | restoring a Docker container is pretty long. For restore, there is also
| |
| − | an additional step to set up the root filesystem before invoking CRIU.
| |
| − | | |
| − | To automate the C/R process, there is a helper script in the contrib
| |
| − | subdirectory of CRIU sources, called <code>docker_cr.sh</code>. In addition to
| |
| − | invoking CRIU, this helper script sets up the root filesystem for AUFS,
| |
| − | OverlayFS, and VFS for restore.
| |
| − | | |
| − | With <code>docker_cr.sh</code>, all you have to provide is the container ID.
| |
| − | If you don't specify a container ID, <code>docker_cr.sh</code> will list all running
| |
| − | containers and prompt you to choose one. Also, as shown in the help
| |
| − | output below, by setting the appropriate environment variable, it's
| |
| − | possible to tell <code>docker_cr.sh</code> which Docker and CRIU binaries to use,
| |
| − | where Docker's home directory is, and where CRIU should save and look
| |
| − | for its image files.
| |
| − | | |
| − | <pre>
| |
| − | # docker_cr.sh --help
| |
| − | Usage:
| |
| − | docker_cr.sh -c|-r [-hv] [<container_id>]
| |
| − | -c, --checkpoint checkpoint container
| |
| − | -h, --help print help message
| |
| − | -r, --restore restore container
| |
| − | -v, --verbose enable verbose mode
| |
| − | | |
| − | Environment:
| |
| − | DOCKER_HOME (default /var/lib/docker)
| |
| − | CRIU_IMG_DIR (default /var/lib/docker/criu_img)
| |
| − | DOCKER_BINARY (default docker)
| |
| − | CRIU_BINARY (default criu)
| |
| − | </pre>
| |
| − | | |
| − | Below is an example to checkpoint and restore Docker container 4397:
| |
| − | | |
| − | <pre>
| |
| − | # docker_cr.sh -c 4397
| |
| − | dump successful
| |
| − | # docker_cr.sh -r 4397
| |
| − | restore successful
| |
| − | </pre>
| |
| − | | |
| − | Optionally, you can specify <code>-v</code> to see the commands that <code>docker_cr.sh</code>
| |
| − | executes. For example:
| |
| − | | |
| − | <pre>
| |
| − | # docker_cr.sh -c -v 40d3
| |
| − | docker binary: docker
| |
| − | criu binary: criu
| |
| − | image directory: /var/lib/docker/criu_img/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
| |
| − | container root directory: /var/lib/docker/aufs/mnt/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
| |
| − | | |
| − | criu dump -v4 -D /var/lib/docker/criu_img/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf -o dump.log \
| |
| − | --manage-cgroups --evasive-devices \
| |
| − | --ext-mount-map /etc/resolv.conf:/etc/resolv.conf \
| |
| − | --ext-mount-map /etc/hosts:/etc/hosts \
| |
| − | --ext-mount-map /etc/hostname:/etc/hostname \
| |
| − | --ext-mount-map /.dockerinit:/.dockerinit \
| |
| − | -t 5991 --root /var/lib/docker/aufs/mnt/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
| |
| − | | |
| − | dump successful
| |
| − | (00.020827) Dumping finished successfully
| |
| − | | |
| − | # docker_cr.sh -r -v 40d3
| |
| − | docker binary: docker
| |
| − | criu binary: criu
| |
| − | image directory: /var/lib/docker/criu_img/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
| |
| − | container root directory: /var/lib/docker/aufs/mnt/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
| |
| − | | |
| − | mount -t aufs -o
| |
| − | /var/lib/docker/aufs/diff/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
| |
| − | /var/lib/docker/aufs/diff/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf-init
| |
| − | /var/lib/docker/aufs/diff/a9eb172552348a9a49180694790b33a1097f546456d041b6e82e4d7716ddb721
| |
| − | /var/lib/docker/aufs/diff/120e218dd395ec314e7b6249f39d2853911b3d6def6ea164ae05722649f34b16
| |
| − | /var/lib/docker/aufs/diff/42eed7f1bf2ac3f1610c5e616d2ab1ee9c7290234240388d6297bc0f32c34229
| |
| − | /var/lib/docker/aufs/diff/511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158
| |
| − | none
| |
| − | /var/lib/docker/aufs/mnt/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
| |
| − | | |
| − | criu restore -v4 -D /var/lib/docker/criu_img/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf \
| |
| − | -o restore.log --manage-cgroups --evasive-devices \
| |
| − | --ext-mount-map /etc/resolv.conf:/var/lib/docker/containers/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf/resolv.conf \
| |
| − | --ext-mount-map /etc/hosts:/var/lib/docker/containers/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf/hosts \
| |
| − | --ext-mount-map /etc/hostname:/var/lib/docker/containers/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf/hostname \
| |
| − | --ext-mount-map /.dockerinit:/var/lib/docker/init/dockerinit-1.0.0 \
| |
| − | -d --root /var/lib/docker/aufs/mnt/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf \
| |
| − | --pidfile /var/lib/docker/criu_img/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf/restore.pid
| |
| − | | |
| − | restore successful
| |
| − | (00.408807) Restore finished successfully. Resuming tasks.
| |
| − | | |
| − | root 6206 1 1 10:49 ? 00:00:00 /bin/sh -c i=0; while true; do echo $i >> /foo; i=$(expr $i + 1); sleep 3; done
| |
| − | </pre>
| |
| − | | |
| − | | |
| − | [[Category:HOWTO]]
| |
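As a rough illustration of how the restore-time options in the removed text fit together, the shell sketch below only prints a <code>criu restore</code> command line for a given container ID; it does not invoke CRIU or mount anything. The function name <code>build_restore_cmd</code> is hypothetical. The AUFS mount path, the <code>DOCKER_HOME</code> and <code>CRIU_IMG_DIR</code> defaults, and the option list are taken from the examples above. The <code>/.dockerinit</code> mapping, which only older Docker versions need, is left out.

```shell
#!/bin/sh
# Sketch only: print (do not run) a criu restore command line for external C/R.
# Assumes the AUFS storage driver and the default directory layout shown above.
# build_restore_cmd is a hypothetical name, not part of CRIU or docker_cr.sh.

build_restore_cmd() {
    cid="$1"
    docker_home="${DOCKER_HOME:-/var/lib/docker}"
    img_dir="${CRIU_IMG_DIR:-$docker_home/criu_img}/$cid"

    # Start with the common options, the image directory, and the container root.
    set -- criu restore -o restore.log -v4 -d \
        -D "$img_dir" \
        --root "$docker_home/aufs/mnt/$cid"

    # Map each external bind mount target to its source outside the container.
    for f in resolv.conf hosts hostname; do
        set -- "$@" --ext-mount-map "/etc/$f:$docker_home/containers/$cid/$f"
    done

    set -- "$@" --manage-cgroups --evasive-devices
    echo "$@"
}

build_restore_cmd "<container_id>"
```

Printing the command first makes it easy to compare against the hand-written restore invocation above before running anything as root.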