Changes

4,490 bytes removed ,  21:07, 24 January 2017
no edit summary
Line 1: Line 1:  
This HOWTO page describes how to checkpoint and restore a Docker container.
 
This HOWTO page describes how to checkpoint and restore a Docker container.
   −
== Background ==
+
== Introduction ==
   −
1. External C/R: Using CRIU directly on the command line as it's typically
+
Docker wants to manage the full lifecycle of processes running inside one of its containers, which makes it important for CRIU and Docker to work closely together when checkpointing and restoring a container. This is being achieved by adding the ability to checkpoint and restore directly into Docker itself, powered under the hood by CRIU. This integration is a work in progress, and its status is outlined below.
done. This is called external because it's happening external to the
  −
Docker daemon.  After checkpoint, the Docker daemon thinks that the
  −
container has exited. After restore, the Docker daemon doesn't know that
  −
the container is running again.  Therefore, commands such as ''docker ps''
  −
and ''docker logs'' will not work correctly.
     −
External C/R was done as a proof-of-concept.
+
== Docker Experimental ==
   −
2. Native C/R: Using ''docker checkpoint'' and ''docker restore'' commands.
+
Checkpoint & Restore is now available in the ''experimental'' runtime mode for Docker. Simply start your Docker daemon with '''--experimental''' to enable the feature.
Because the Docker daemon is involved in both checkpoint and restore,
  −
its notion of the container state will be consistent and commands such as
  −
''docker ps'' and ''docker logs'' will work.
     −
Native C/R is work in progress, say pre-alpha quality.  You can
+
=== Dependencies ===
watch this short demo
  −
[https://www.youtube.com/watch?v=HFt9v6yqsXo video]
  −
to see how it works.  Source files for Docker 1.5 C/R are at this
  −
[https://github.com/SaiedKazemi/docker/tree/cr repo].
  −
Work is underway to integrate C/R into the new libcontainer.
     −
== External C/R ==
+
In addition to installing version 1.13 of Docker, you need '''CRIU''' version 2.0 or later installed on your system. You also need some shared libraries; the ones you will most likely have to install are '''libprotobuf-c''' and '''libnl-3'''. Here is the output of <code>ldd</code> on my system:
   −
Starting with CRIU 1.3, it's possible to checkpoint and restore a
+
$ ldd `which criu`
process tree running inside a Docker container.  However, it's
+
    linux-vdso.so.1 => (0x00007ffc09fda000)
important to note that Docker needs native support for checkpoint
+
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fd28b2c7000)
and restore in order to maintain its parent-child relationship and
+
    libprotobuf-c.so.0 => /usr/lib/x86_64-linux-gnu/libprotobuf-c.so.0 (0x00007fd28b0b7000)
to correctly keep track of container states. In other words, while
+
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd28aeb2000)
CRIU can C/R a process tree, the restored tree will not become a
+
    libnl-3.so.200 => /lib/x86_64-linux-gnu/libnl-3.so.200 (0x00007fd28ac98000)
child of Docker and, from Docker's point of view, the container's
+
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd28a8d3000)
state will remain "Exited" (even after successful restore).
+
    /lib64/ld-linux-x86-64.so.2 (0x000056386bb38000)
 +
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd28a5cc000)
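The version requirement above can be checked from a script. The following is a minimal sketch, not part of the Docker or CRIU tooling: the helper name <code>version_ge</code> is made up for this example, it relies on <code>sort -V</code> from GNU coreutils, and it assumes that <code>criu --version</code> prints a line of the form <code>Version: X.Y</code>.

```shell
#!/bin/sh
# version_ge A B: succeeds when dotted version A is at least B.
# Hypothetical helper; relies on version sort (`sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="2.0"   # minimum CRIU version named on this page

if command -v criu >/dev/null 2>&1; then
    # Assumption: the output contains a line such as "Version: 2.0".
    found="$(criu --version 2>/dev/null | awk '/^Version:/ {print $2; exit}')"
    if [ -n "$found" ] && version_ge "$found" "$required"; then
        echo "criu $found is new enough"
    else
        echo "criu ${found:-unknown} is older than $required, or its version could not be parsed"
    fi
else
    echo "criu not found in PATH"
fi
```

If the version line has a different shape on your system, adjust the <code>awk</code> pattern accordingly.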
   −
Work is in progress to add native checkpoint and restore support
+
=== checkpoint ===
to Docker.  Once ready, specific commands (for example, "docker
  −
checkpoint" and "docker restore") will use CRIU to do the actual
  −
C/R operations while Docker continues to maintain its parent-child
  −
relationship and container states.
     −
It's important to re-emphasize that by checkpointing and restoring
+
There's a top-level <code>checkpoint</code> sub-command in Docker, which lets you create a new checkpoint, and list or delete existing checkpoints. These checkpoints are stored and managed by Docker, unless you specify a custom storage path.
a Docker container, we mean C/R of a process tree running inside a
  −
container, excluding the Docker daemon itself.  As CRIU currently
  −
does not support nested PID namespaces, the C/R process tree cannot
  −
include the Docker daemon which runs in the global PID namespace.
     −
== Command Line Options ==
+
Here's an example of creating a checkpoint, from a container that simply logs an integer in a loop.
   −
In addition to the usual CRIU command line options used when
+
First, we create a container:
checkpointing and restoring a process tree, the following command
  −
line options are needed for Docker containers.
     −
=== <code>--root</code> ===
+
$ docker run -d --name looper --security-opt seccomp:unconfined busybox  \
 +
          /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
   −
This option has been used in the past only for restore operations
+
You can verify the container is running by printing its logs:
that wanted to change the root of the mount namespace.  It was not
  −
used for checkpoint operations.
     −
However, because Docker by default uses the AUFS graph driver and
+
  $ docker logs looper
the AUFS module in the kernel reveals branch pathnames in
  −
/proc/<pid>/map_files, --root is used to specify the root of the
  −
mount namespace. Once the kernel AUFS module is fixed, it won't
  −
be necessary to specify this option anymore.
     −
=== <code>--ext-mount-map</code> ===
+
If you do this a few times you'll notice the integer increasing. Now, we checkpoint the container:
   −
This option is used to specify the path of the external bind mounts.
+
  $ docker checkpoint create looper checkpoint1
Docker sets up /etc/{hostname,hosts,resolv.conf} as targets with
  −
source files outside the container's mount namespace. Older versions
  −
of Docker also bind mount /.dockerinit.
     −
For example, assuming the default Docker configuration, /etc/hostname
+
You should see that the process is no longer running, and if you print the logs a few times no new logs will be printed.
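Comparing the last logged integer at two points in time is one way to confirm that the counter has stopped. The sketch below is only an illustration: the helper names <code>last_counter</code> and <code>counter_advanced</code> are invented here, and canned text stands in for the output of <code>docker logs looper</code> so that the snippet runs without a container.

```shell
#!/bin/sh
# last_counter: print the last line of a log read from stdin.
last_counter() {
    tail -n 1
}

# counter_advanced BEFORE AFTER: succeeds when AFTER is numerically greater.
counter_advanced() {
    [ "$2" -gt "$1" ]
}

# Canned log text stands in for two `docker logs looper` snapshots.
before="$(printf '0\n1\n2\n' | last_counter)"
after="$(printf '0\n1\n2\n' | last_counter)"

if counter_advanced "$before" "$after"; then
    echo "still running"
else
    echo "no new output"
fi
```

Against a real container, replace each <code>printf</code> with <code>docker logs looper</code> and sleep for a few seconds between the two snapshots.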
in the container's mount namespace is bind mounted from the source
  −
at /var/lib/docker/containers/<container_id>/hostname.
     −
=== <code>--manage-cgroups</code> ===
+
=== restore ===
   −
When a process tree exits after a checkpoint operation, the cgroups
+
Unlike creating a checkpoint, restoring from a checkpoint is just a flag provided to the normal container '''start''' call. Here's an example:
that Docker had created for the container are removed.  This option
  −
is needed during restore to move the process tree into its cgroups,
  −
re-creating them if necessary.
     −
=== <code>--evasive-devices</code> ===
+
$ docker start --checkpoint checkpoint1 looper
   −
Docker bind mounts /dev/null on /dev/stdin for detached containers
+
If we then print the logs, you should see they start from where we left off and continue to increase.  
(i.e., docker run -d ...).  Since earlier versions of Docker used
  −
/dev/null in the global namespace, this option tells CRIU to treat
  −
the global /dev/null and the container /dev/null as the same device.
     −
== Restore Prework ==
+
==== Restoring into a '''new''' container ====
   −
As mentioned earlier, by default Docker uses AUFS to set up the
+
Beyond the straightforward case of checkpointing and restoring the same container, it's also possible to checkpoint one container, and then restore the checkpoint into a completely different container. This is done by providing a custom storage path with the <code>--checkpoint-dir</code> option. Here's a slightly revised example from before:
container's filesystem.  When Docker notices that the process has
  −
exited (due to criu dump), it dismantles the filesystem. We need
  −
to set up the filesystem again before attempting to restore.
     −
== An Example ==
+
$ docker run -d --name looper2 --security-opt seccomp:unconfined busybox \
 +
          /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
 +
 +
# wait a few seconds to give the container an opportunity to print a few lines, then
 +
$ docker checkpoint create --checkpoint-dir=/tmp looper2 checkpoint2
 +
 +
$ docker create --name looper-clone --security-opt seccomp:unconfined busybox \
 +
          /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
 +
 +
$ docker start --checkpoint-dir=/tmp --checkpoint=checkpoint2 looper-clone
   −
Below is an example to show C/R operations for a shell script that
  −
continuously appends a number to a file.  You can use tail -f to
  −
see the process in action.
     −
As you will see below, after restore, the process's parent is PID
+
You should be able to print the logs from <code>looper-clone</code> and see that they start from wherever the logs of <code>looper2</code> end.
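The checkpoint and start steps for restoring into a different container can be wrapped in a small function. This is a sketch under stated assumptions: <code>run</code>, <code>migrate_checkpoint</code> and the <code>DRY_RUN</code> switch are hypothetical names, and the clone container is assumed to have been created already with <code>docker create</code>. By default the script only prints the Docker commands it would run.

```shell
#!/bin/sh
# Sketch only: `run`, `migrate_checkpoint` and DRY_RUN are hypothetical.
# DRY_RUN=1 (the default here) prints commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# migrate_checkpoint SOURCE CLONE NAME DIR
# Checkpoints SOURCE into DIR under NAME, then starts CLONE from it.
migrate_checkpoint() {
    src="$1"; clone="$2"; name="$3"; dir="$4"
    run docker checkpoint create --checkpoint-dir="$dir" "$src" "$name" &&
    run docker start --checkpoint-dir="$dir" --checkpoint="$name" "$clone"
}

migrate_checkpoint looper2 looper-clone checkpoint2 /tmp
```

Set <code>DRY_RUN=0</code> in the environment to execute the commands instead of printing them.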
1 (init), not Docker.  Also, although the process has been successfully
  −
restored, Docker still thinks that the container has exited.
     −
To set up the container's AUFS filesystem before restore, its branch
+
=== usage ===
information should be saved before checkpointing the container.
  −
For convenience, however, AUFS branch information is saved in the
  −
dump.log file.  So we can examine dump.log to set up the filesystem
  −
again.
     −
For brevity, the 64-character long container ID is replaced by the
+
Checkpoint
string <container_id> in the following lines.
     −
<pre>
+
  # docker checkpoint create --help
$ docker run -d busybox:latest /bin/sh -c 'i=0; while true; do echo $i >> /foo; i=$(expr $i + 1); sleep 3; done'
+
  Usage: docker checkpoint create [OPTIONS] CONTAINER CHECKPOINT
<container_id>
+
 
$
+
  Create a checkpoint from a running container
$ docker ps
+
 
CONTAINER ID  IMAGE          COMMAND          CREATED        STATUS
+
  Options:
168aefb8881b  busybox:latest  "/bin/sh -c 'i=0; 6 seconds ago  Up 4 seconds
+
      --checkpoint-dir string  Use a custom checkpoint storage directory
$
+
      --help                    Print usage
$ sudo criu dump -o dump.log -v4 -t 17810 \
+
      --leave-running           Leave the container running after checkpoint
-D /tmp/img/<container_id> \
+
 
--root /var/lib/docker/aufs/mnt/<container_id> \
+
Restore
--ext-mount-map /etc/resolv.conf:/etc/resolv.conf \
+
 
--ext-mount-map /etc/hosts:/etc/hosts \
+
  # docker start --help
--ext-mount-map /etc/hostname:/etc/hostname \
+
  Usage: docker start [OPTIONS] CONTAINER [CONTAINER...]
--ext-mount-map /.dockerinit:/.dockerinit \
  −
--manage-cgroups \
  −
--evasive-devices
  −
$
  −
$ sudo grep successful /tmp/img/<container_id>/dump.log
  −
(00.020103) Dumping finished successfully
  −
$
  −
$ docker ps -a
  −
CONTAINER ID  IMAGE           COMMAND          CREATED        STATUS
  −
168aefb8881b  busybox:latest  "/bin/sh -c 'i=0; 6 minutes ago  Exited (-1) 4 minutes ago
  −
$
  −
$ sudo mount -t aufs -o br=\
  −
/var/lib/docker/aufs/diff/<container_id>:\
  −
/var/lib/docker/aufs/diff/<container_id>-init:\
  −
/var/lib/docker/aufs/diff/a9eb172552348a9a49180694790b33a1097f546456d041b6e82e4d7716ddb721:\
  −
/var/lib/docker/aufs/diff/120e218dd395ec314e7b6249f39d2853911b3d6def6ea164ae05722649f34b16:\
  −
/var/lib/docker/aufs/diff/42eed7f1bf2ac3f1610c5e616d2ab1ee9c7290234240388d6297bc0f32c34229:\
  −
/var/lib/docker/aufs/diff/511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158:\
  −
none /var/lib/docker/aufs/mnt/<container_id>
  −
$
  −
$ sudo criu restore -o restore.log -v4 -d
  −
-D /tmp/img/<container_id> \
  −
--root /var/lib/docker/aufs/mnt/<container_id> \
  −
--ext-mount-map /etc/resolv.conf:/var/lib/docker/containers/<container_id>/resolv.conf \
  −
--ext-mount-map /etc/hosts:/var/lib/docker/containers/<container_id>/hosts \
  −
--ext-mount-map /etc/hostname:/var/lib/docker/containers/<container_id>/hostname \
  −
--ext-mount-map /.dockerinit:/var/lib/docker/init/dockerinit-1.0.0 \
  −
--manage-cgroups \
  −
--evasive-devices
  −
$
  −
$ sudo grep successful /tmp/img/<container_id>/restore.log
  −
(00.424428) Restore finished successfully. Resuming tasks.
  −
$
  −
$ ps -ef | grep /bin/sh
  −
root    18580    1  0 12:38 ?        00:00:00 /bin/sh -c i=0; while true; do echo $i >> /foo; i=$(expr $i + 1); sleep 3; done
  −
$
  −
$ docker ps -a
  −
CONTAINER ID  IMAGE          COMMAND          CREATED        STATUS
  −
168aefb8881b  busybox:latest  "/bin/sh -c 'i=0; 7 minutes ago  Exited (-1) 5 minutes ago
  −
$
  −
</pre>
     −
== Help Script ==
+
  Start one or more stopped containers
   −
As seen in the above examples, the CRIU command line for checkpointing and
+
  Options:
restoring a Docker container is pretty long.  For restore, there is also
+
  -a, --attach                  Attach STDOUT/STDERR and forward signals
an additional step to set up the root filesystem before invoking CRIU.
+
      --checkpoint string      Restore from this checkpoint
 +
      --checkpoint-dir string  Use a custom checkpoint storage directory
 +
      --detach-keys string      Override the key sequence for detaching a container
 +
      --help                    Print usage
 +
  -i, --interactive            Attach container's STDIN
   −
To automate the C/R process, there is a helper script in the contrib
+
== Integration Status ==
subdirectory of CRIU sources, called docker_cr.sh.  In addition to
  −
invoking CRIU, this helper script sets up the root filesystem for AUFS,
  −
UnionFS, and VFS for restore.
     −
With docker_cr.sh, all you have to provide is the container ID.
+
CRIU has already been integrated into the lower level components that power Docker, namely '''runc''' and '''containerd'''. The final step in the process is to integrate with Docker itself. You can track the status of that process in [https://github.com/docker/docker/pull/22049 this pull request].
If you don't specify a container ID, docker_cr.sh will list all running
  −
containers and prompt you to choose one. Also, as shown in the help
  −
output below, by setting the appropriate environment variable, it's
  −
possible to tell docker_cr.sh which Docker and CRIU binaries to use,
  −
where Docker's home directory is, and where CRIU should save and look
  −
for its image files.
     −
<pre>
+
== Compatibility Notes ==
# docker_cr.sh --help
  −
Usage:
  −
docker_cr.sh -c|-r [-hv] [<container_id>]
  −
-c, --checkpoint checkpoint container
  −
-h, --help print help message
  −
-r, --restore restore container
  −
-v, --verbose enable verbose mode
     −
Environment:
+
The latest versions of the Docker integration require CRIU 2.0 or later to work correctly. Depending on the storage driver Docker uses and other factors, there may be further compatibility issues; known ones are listed here.
DOCKER_HOME (default /var/lib/docker)
  −
CRIU_IMG_DIR (default /var/lib/docker/criu_img)
  −
DOCKER_BINARY (default docker)
  −
CRIU_BINARY (default criu)
  −
</pre>
     −
Below is an example to checkpoint and restore Docker container 4397:
+
=== TTY ===
   −
<pre>
+
Checkpointing an interactive container is currently not supported.  
# docker_cr.sh -c 4397
  −
dump successful
  −
# docker_cr.sh -r 4397
  −
restore successful
  −
</pre>
     −
Optionally, you can specify -v to see the commands that docker_cr.sh
+
=== Seccomp ===
executes.  For example:
     −
<pre>
+
You'll notice that all of the above examples disable Docker's default seccomp support. In order to use seccomp, you'll need a newer version of the kernel. '''Update needed with exact version.'''
# docker_cr.sh -c -v 40d3
  −
docker binary: docker
  −
criu binary: criu
  −
image directory: /var/lib/docker/criu_img/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
  −
container root directory: /var/lib/docker/aufs/mnt/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
     −
criu dump -v4 -D /var/lib/docker/criu_img/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf -o dump.log --manage-cgroups --evasive-devices --ext-mount-map /etc/resolv.conf:/etc/resolv.conf --ext-mount-map /etc/hosts:/etc/hosts --ext-mount-map /etc/hostname:/etc/hostname --ext-mount-map /.dockerinit:/.dockerinit -t 5991 --root /var/lib/docker/aufs/mnt/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
+
=== OverlayFS ===
   −
dump successful
+
Older kernels have OverlayFS bugs that report the wrong mnt_id in /proc/<pid>/fdinfo/<fd> and the wrong symlink target path for /proc/<pid>/fd/<fd>. These bugs were fixed in kernel v4.2-rc2. The following small kernel patches fix the mount ID and symlink target path issues:
(00.020827) Dumping finished successfully
     −
# docker_cr.sh -r -v 40d3
+
* {{torvalds.git|155e35d4da}} by David Howells
docker binary: docker
+
* {{torvalds.git|df1a085af1}} by David Howells
criu binary: criu
+
* {{torvalds.git|f25801ee46}} by David Howells
image directory: /var/lib/docker/criu_img/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
+
* {{torvalds.git|4bacc9c923}} by David Howells
container root directory: /var/lib/docker/aufs/mnt/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
+
* {{torvalds.git|9391dd00d1}} by Al Viro
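Before patching, it may be worth checking whether the running kernel is already at 4.2 or later. The following is a rough sketch: <code>kernel_at_least</code> is a hypothetical helper that compares only the major and minor numbers of a release string such as the one printed by <code>uname -r</code>. A distribution kernel with an older version number may still carry backported fixes, so treat a negative result as a hint rather than proof.

```shell
#!/bin/sh
# kernel_at_least RELEASE MAJOR MINOR
# Hypothetical helper: succeeds when RELEASE is at least MAJOR.MINOR.
kernel_at_least() {
    release="${1%%-*}"            # 3.19.0-15-generic -> 3.19.0
    major="${release%%.*}"        # 3
    rest="${release#*.}"          # 19.0
    minor="${rest%%.*}"           # 19
    major="${major%%[!0-9]*}"     # strip any trailing non-digits
    minor="${minor%%[!0-9]*}"
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if kernel_at_least "$(uname -r)" 4 2; then
    echo "kernel version is 4.2 or later"
else
    echo "kernel version is older than 4.2; the OverlayFS patches may be needed"
fi
```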
   −
mount -t aufs -o
+
Assuming that you are running Ubuntu Vivid (Linux kernel 3.19), here is how you can patch your kernel:
/var/lib/docker/aufs/diff/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
  −
/var/lib/docker/aufs/diff/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf-init
  −
/var/lib/docker/aufs/diff/a9eb172552348a9a49180694790b33a1097f546456d041b6e82e4d7716ddb721
  −
/var/lib/docker/aufs/diff/120e218dd395ec314e7b6249f39d2853911b3d6def6ea164ae05722649f34b16
  −
/var/lib/docker/aufs/diff/42eed7f1bf2ac3f1610c5e616d2ab1ee9c7290234240388d6297bc0f32c34229
  −
/var/lib/docker/aufs/diff/511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158
  −
none
  −
/var/lib/docker/aufs/mnt/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf
     −
criu restore -v4 -D /var/lib/docker/criu_img/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf -o restore.log --manage-cgroups --evasive-devices --ext-mount-map /etc/resolv.conf:/var/lib/docker/containers/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf/resolv.conf --ext-mount-map /etc/hosts:/var/lib/docker/containers/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf/hosts --ext-mount-map /etc/hostname:/var/lib/docker/containers/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf/hostname --ext-mount-map /.dockerinit:/var/lib/docker/init/dockerinit-1.0.0 -d --root /var/lib/docker/aufs/mnt/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf --pidfile /var/lib/docker/criu_img/40d363f564e00a2f893579fa012a200e475dcf8df47f2a22b7dd0860ffc3d7bf/restore.pid
+
<pre>
 +
git clone  git://kernel.ubuntu.com/ubuntu/ubuntu-vivid.git
 +
cd ubuntu-vivid
 +
git remote add torvalds  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
 +
git remote update
   −
restore successful
+
git cherry-pick 155e35d4da
(00.408807) Restore finished successfully. Resuming tasks.
+
git cherry-pick df1a085af1
 +
git cherry-pick f25801ee46
 +
git cherry-pick 4bacc9c923
 +
git cherry-pick 9391dd00d1
   −
root      6206    1  1 10:49 ?        00:00:00 /bin/sh -c i=0; while true; do echo $i >> /foo; i=$(expr $i + 1); sleep 3; done
+
cp /boot/config-$(uname -r) .config
 +
make olddefconfig
 +
make -j 8 bzImage modules
 +
sudo make install modules_install
 +
sudo reboot
 
</pre>
 
</pre>
    +
=== Async IO ===
 +
 +
If you are using a kernel older than 3.19 and your container uses AIO, you need the following AIO kernel patches from 3.19:
 +
 +
* {{torvalds.git|bd9b51e79c}} by Al Viro
 +
* {{torvalds.git|e4a0d3e720}} by Pavel Emelyanov
 +
 +
== External Checkpoint Restore ==
 +
 +
{{Note| External C/R was done as proof-of-concept.  Its use is highly discouraged.}}
   −
[[Category:HOWTO]]
+
Although it's not recommended, you can also learn more about using CRIU without integrating with Docker: [[Docker_External]].