This HOWTO page describes how to checkpoint and restore a Docker container.
Docker wants to manage the full lifecycle of processes running inside one of its containers, which makes it important for CRIU and Docker to work closely together when checkpointing and restoring a container. This is being achieved by adding the ability to checkpoint and restore directly into Docker itself, powered under the hood by CRIU. This integration is a work in progress, and its status is outlined below.
The easiest way to try CRIU and Docker together is to install this pre-compiled version of Docker. It's based on Docker 1.10 and built with the DOCKER_EXPERIMENTAL build tag.
To install, download the docker-1.10.0-dev binary to your system. You'll need to start a Docker daemon from this binary, and then you can use the same binary to communicate with that daemon. To start the daemon, run a command like this:
$ docker-1.10.0-dev daemon -D --graph=/var/lib/docker-dev --host unix:///var/run/docker-dev.sock
The graph and host options will prevent colliding with an existing installation of Docker, but you can replace your existing docker if desired. In another shell, you can then connect to that daemon:
$ docker-1.10.0-dev --host unix:///var/run/docker-dev.sock run -d busybox top
In addition to downloading the binary above (or compiling one yourself), you need CRIU version 2.0 or later installed on your system. You also need some shared libraries. The ones you are most likely to have to install are libprotobuf-c and libnl-3. Here's the output of ldd on my system:
$ ldd `which criu`
    linux-vdso.so.1 => (0x00007ffc09fda000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fd28b2c7000)
    libprotobuf-c.so.0 => /usr/lib/x86_64-linux-gnu/libprotobuf-c.so.0 (0x00007fd28b0b7000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd28aeb2000)
    libnl-3.so.200 => /lib/x86_64-linux-gnu/libnl-3.so.200 (0x00007fd28ac98000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd28a8d3000)
    /lib64/ld-linux-x86-64.so.2 (0x000056386bb38000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd28a5cc000)
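The two prerequisites above (CRIU 2.0 or later, and all of its shared libraries resolvable) can be checked with a short script. This is only a sketch: the helper names version_ge, extract_version, and missing_libs are made up for illustration, it assumes that criu --version prints a dotted version number somewhere in its output, and it relies on sort -V being available.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Uses version sort (sort -V), so 2.10 is treated as newer than 2.9.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the first dotted version number found in the text $1.
# Assumption: the version appears as digits separated by dots.
extract_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Read ldd output on stdin and print the libraries reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

if command -v criu >/dev/null 2>&1; then
    found=$(extract_version "$(criu --version 2>/dev/null)")
    if version_ge "$found" 2.0; then
        echo "criu $found is new enough"
    else
        echo "criu $found is older than 2.0"
    fi
    # Any line printed here names a shared library that still needs installing.
    ldd "$(command -v criu)" | missing_libs
else
    echo "criu not found in PATH"
fi
```

If the last step prints nothing, ldd resolved every library that the criu binary links against.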
Creating a checkpoint is a top-level Docker command with this new version of Docker. Here's an example that simply logs an integer in a loop. From this point forward, commands are shown using docker instead of docker-1.10.0-dev, but if you have not installed this version globally you can use the latter.
First, we create a container:
$ docker run -d --name looper --security-opt seccomp:unconfined busybox \
         /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
You can verify the container is running by printing its logs:
$ docker logs looper
If you do this a few times you'll notice the integer increasing. Now, we checkpoint the container:
$ docker checkpoint looper
You should see that the process is no longer running, and if you print the logs a few times no new logs will be printed.
Like checkpoint, restore is a top level command in this version of Docker. Continuing our example, let's restore the same container:
$ docker restore looper
If you then print the logs, you should see that they start from where they left off and continue to increase.
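One way to confirm that the restored container really continued from the checkpoint, rather than restarting from 0, is to check that the logged integers form an unbroken sequence. The sketch below assumes the log contains only the integers printed by the loop, one per line; is_contiguous is a hypothetical helper name, not part of Docker or CRIU.

```shell
# Read one integer per line on stdin. Succeed only if each value is exactly
# one greater than the value before it (no gap and no restart from 0).
is_contiguous() {
    awk 'NR > 1 && $1 != prev + 1 { bad = 1 }
         { prev = $1 }
         END { exit bad + 0 }'
}

# Demonstration on sample data; with a real container you would pipe in
# the output of `docker logs looper` instead of the printf below.
if printf '0\n1\n2\n3\n' | is_contiguous; then
    echo "sequence is unbroken"
fi
```

A sequence such as 0, 1, 2, 0, 1 would fail the check, which is what you would expect to see if the process had been started fresh instead of restored.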
Restoring into a new container
Beyond the straightforward case of checkpointing and restoring the same container, it's also possible to checkpoint one container and then restore the checkpoint into a completely different container. Right now that is done with the --force option, in conjunction with the --image-dir option. Here's a slightly revised example from before:
$ docker run -d --name looper2 --security-opt seccomp:unconfined busybox \
         /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'

# wait a few seconds to give the container an opportunity to print a few lines, then
$ docker checkpoint --image-dir=/tmp/checkpoint1 looper2

$ docker create --name looper-force --security-opt seccomp:unconfined busybox \
         /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'

$ docker restore --force=true --image-dir=/tmp/checkpoint1 looper-force
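Because the restore step depends on the files written to the --image-dir directory, it can help to confirm that the directory exists and is not empty before calling docker restore. The helper below is a hypothetical convenience (dir_has_files is not a Docker command), and /tmp/checkpoint1 is simply the path used in the example above.

```shell
# Succeed if directory $1 exists and contains at least one entry.
dir_has_files() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

image_dir=/tmp/checkpoint1
if dir_has_files "$image_dir"; then
    echo "$image_dir is not empty; ready to restore"
else
    echo "$image_dir is missing or empty; run docker checkpoint first"
fi
```

The check only looks for the presence of entries; it does not validate that they form a usable checkpoint.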
You should be able to print the logs from looper-force and see that they start from wherever the logs of looper2 were when it was checkpointed.
# docker checkpoint --help
Usage: docker checkpoint [OPTIONS] CONTAINER

Checkpoint one or more running containers

  --help            Print usage
  --image-dir       directory for storing checkpoint image files
  --leave-running   leave the container running after checkpoint
  --work-dir        directory for storing log file
# docker restore --help
Usage: docker restore [OPTIONS] CONTAINER

Restore one or more checkpointed containers

  --force       bypass checks for current container state
  --help        Print usage
  --image-dir   directory to restore image files from
  --work-dir    directory for restore log
More detailed instructions on running checkpoint/restore with Docker in version 1.12 will be coming in the future, but in the meantime, you must build the version of Docker available in the docker-checkpoint-restore branch of Boucher's fork of Docker, available here. Make sure to build with the env
The command line interface has changed from the 1.10 version.
docker checkpoint is now an umbrella command for a few checkpoint operations. To create a checkpoint, use the docker checkpoint create command, which takes a container and a checkpoint_id as non-optional arguments. Example:
docker checkpoint create my_container my_first_checkpoint
Restoring a container is now performed as an option to docker start. Although you typically create and start a container in a single step using docker run, under the hood this is actually two steps: docker create followed by docker start. You can also call start on a container that was previously running and has since been stopped or killed. That looks something like this:
docker start --checkpoint my_first_checkpoint my_container
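The two commands above can be combined into a small wrapper. The sketch below is illustrative only: run and checkpoint_cycle are hypothetical helper names, and DRY_RUN defaults to 1 so that the script prints the commands instead of executing them unless you explicitly set DRY_RUN=0.

```shell
# DRY_RUN=1 (the default here) prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

# Print or execute a command, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# checkpoint_cycle CONTAINER CHECKPOINT_ID
# Create a checkpoint, then start the container again from that checkpoint.
checkpoint_cycle() {
    run docker checkpoint create "$1" "$2" &&
    run docker start --checkpoint "$2" "$1"
}

checkpoint_cycle my_container my_first_checkpoint
```

Keeping the dry-run default makes it easy to review the exact command lines before running them against a real daemon.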
CRIU has already been integrated into the lower level components that power Docker, namely runc and containerd. The final step in the process is to integrate with Docker itself. You can track the status of that process in this pull request.
The latest versions of the Docker integration require at least version 2.0 of CRIU in order to work correctly. Additionally, depending on the storage driver used by Docker and on other factors, there may be further compatibility issues, which we will attempt to list here.
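Since compatibility can depend on the storage driver, it is useful to know which one the daemon is using. The sketch below pulls it out of the plain-text output of docker info, assuming that output contains a line of the form "Storage Driver: <name>"; storage_driver is a hypothetical helper name.

```shell
# Read `docker info` output on stdin and print the storage driver name.
storage_driver() {
    sed -n 's/^[[:space:]]*Storage Driver:[[:space:]]*//p' | head -n 1
}

if command -v docker >/dev/null 2>&1; then
    driver=$(docker info 2>/dev/null | storage_driver)
    echo "storage driver: ${driver:-unknown}"
else
    echo "docker not found in PATH"
fi
```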
Checkpointing an interactive container is currently not supported.
You'll notice that all of the above examples disable Docker's default seccomp support. In order to use seccomp, you'll need a newer version of the Kernel. **Update Needed with Exact Version**
There is a bug in OverlayFS that reports the wrong mnt_id in /proc/<pid>/fdinfo/<fd> and the wrong symlink target path for /proc/<pid>/fd/<fd>. Fortunately, these bugs have been fixed in kernel v4.2-rc2. The following small kernel patches fix the mount ID and symlink target path issues:
155e35d4da by David Howells
df1a085af1 by David Howells
f25801ee46 by David Howells
4bacc9c923 by David Howells
9391dd00d1 by Al Viro
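To decide whether these patches are needed at all, you can compare the running kernel's release string against 4.2. This is a rough sketch: kernel_at_least is a hypothetical helper, it looks only at the major and minor numbers (so it cannot tell 4.2-rc1 from 4.2-rc2), and it cannot detect a distribution kernel that has had the fixes backported.

```shell
# kernel_at_least RELEASE MAJOR MINOR
# Succeed if RELEASE (e.g. "3.19.0-25-generic") is at least MAJOR.MINOR.
# Note: plain sh has no local variables, so major/rest/minor are global.
kernel_at_least() {
    major=${1%%.*}            # text before the first dot
    rest=${1#*.}              # text after the first dot
    minor=${rest%%[!0-9]*}    # leading digits of the remainder
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if kernel_at_least "$(uname -r)" 4 2; then
    echo "kernel $(uname -r) is 4.2 or newer"
else
    echo "kernel $(uname -r) is older than 4.2; the OverlayFS patches may be needed"
fi
```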
Assuming that you are running Ubuntu Vivid (Linux kernel 3.19), here is how you can patch your kernel:
git clone git://kernel.ubuntu.com/ubuntu/ubuntu-vivid.git
cd ubuntu-vivid
git remote add torvalds git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git remote update
git cherry-pick 155e35d4da
git cherry-pick df1a085af1
git cherry-pick f25801ee46
git cherry-pick 4bacc9c923
git cherry-pick 9391dd00d1
cp /boot/config-$(uname -r) .config
make olddefconfig
make -j 8 bzImage modules
sudo make install modules_install
sudo reboot
If you are using a kernel older than 3.19 and your container uses AIO, you need the following AIO kernel patches from 3.19:
External Checkpoint Restore
Note: External C/R was done as a proof of concept. Its use is highly discouraged.
Although it's not recommended, you can also learn more about using CRIU without integrating with docker: Docker_External.