Difference between revisions of "Docker"


Latest revision as of 09:06, 12 October 2021

This article describes the status of CRIU integration with Docker, and how to use it.

Docker Experimental

Naturally, Docker wants to manage the full lifecycle of processes running inside its containers, so CRIU should be run by Docker (rather than separately). This feature is available in the experimental mode for Docker (since Docker 1.13, so every later version, like Docker 17.03, should work).

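To enable experimental features (incl. CRIU), add the "experimental" key to the daemon configuration and restart Docker. If /etc/docker/daemon.json does not exist yet, something like this works (if the file already exists, add "experimental": true to its JSON object by hand instead, because appending a second object makes the file invalid):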

echo "{\"experimental\": true}" >> /etc/docker/daemon.json
systemctl restart docker
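Because a blind append can corrupt an existing configuration, a small guard may help. The following is only a sketch: the function name enable_experimental is hypothetical, and it handles just the cases that can be decided without a JSON parser.

```shell
# Hypothetical helper (not part of Docker): enable_experimental [PATH]
# Creates the config when it is missing or empty, and refuses to touch
# a file that already contains other settings.
enable_experimental() {
    conf="${1:-/etc/docker/daemon.json}"
    if [ ! -s "$conf" ]; then
        # No file (or an empty one): safe to create it from scratch.
        printf '{"experimental": true}\n' > "$conf"
        echo "created"
    elif grep -q '"experimental"[[:space:]]*:[[:space:]]*true' "$conf"; then
        # Nothing to do, the key is already set.
        echo "already-enabled"
    else
        # Appending would leave two JSON objects in the file; merge by hand.
        echo "needs-manual-merge"
        return 1
    fi
}

# Usage (as root):
#   enable_experimental && systemctl restart docker
```

When the helper reports needs-manual-merge, edit the existing JSON object so that it also contains "experimental": true.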

In addition to having a recent version of Docker, you need CRIU 2.0 or later installed on your system (see Installation for more info).

checkpoint

There's a top-level checkpoint sub-command in Docker, which lets you create a new checkpoint, and list or delete existing checkpoints. These checkpoints are stored and managed by Docker, unless you specify a custom storage path.

Here's an example of creating a checkpoint, from a container that simply logs an integer in a loop.

First, we create a container:

$ docker run -d --name looper busybox /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'

You can verify the container is running by printing its logs:

$ docker logs looper

If you do this a few times you'll notice the integer increasing. Now, we checkpoint the container:

$ docker checkpoint create looper checkpoint1

The container should now be stopped: docker ps no longer lists it, and printing the logs a few times shows no new lines.
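The "no new logs" check can also be scripted. This is a minimal sketch built only on docker logs; the function names log_lines and logs_stopped are hypothetical, and the default three-second wait is an arbitrary choice.

```shell
# Count the log lines a container has produced so far.
log_lines() {
    docker logs "$1" 2>&1 | wc -l | tr -d ' '
}

# logs_stopped NAME [SECONDS]
# Succeeds when no new log lines appear within the given interval.
logs_stopped() {
    before=$(log_lines "$1")
    sleep "${2:-3}"
    after=$(log_lines "$1")
    [ "$before" -eq "$after" ]
}

# Usage:
#   docker checkpoint create looper checkpoint1
#   logs_stopped looper && echo "looper is frozen"
```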

restore

Unlike creating a checkpoint, restoring from a checkpoint is just a flag provided to the normal container start call. Here's an example:

$ docker start --checkpoint checkpoint1 looper

If you then print the logs, you should see that they pick up where they left off and continue to increase.
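To confirm that the counter really continues rather than restarting from zero, the last logged value can be compared before and after the restore. The sketch below assumes the looper container from the example above, whose log lines are plain integers; the function names last_value and roundtrip are hypothetical.

```shell
# Print the most recent log line of a container.
last_value() {
    docker logs --tail 1 "$1" 2>/dev/null
}

# roundtrip NAME CHECKPOINT [SECONDS]
# Checkpoints the container, restores it, waits a little, and succeeds
# when the counter after the restore is larger than before.
roundtrip() {
    name=$1
    cp=$2
    docker checkpoint create "$name" "$cp" >/dev/null || return 1
    before=$(last_value "$name")
    docker start --checkpoint "$cp" "$name" >/dev/null || return 1
    sleep "${3:-3}"
    after=$(last_value "$name")
    echo "before=$before after=$after"
    [ "$after" -gt "$before" ]
}

# Usage:
#   roundtrip looper checkpoint1
```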

Restoring into a new container

Beyond the straightforward case of checkpointing and restoring the same container, it's also possible to checkpoint one container, and then restore the checkpoint into a completely different container. This is done by providing a custom storage path with the --checkpoint-dir option. Here's a slightly revised example from before:

$ docker run -d --name looper2 --security-opt seccomp:unconfined busybox \
         /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'

# wait a few seconds to give the container an opportunity to print a few lines, then
$ docker checkpoint create --checkpoint-dir=/tmp looper2 checkpoint2

$ docker create --name looper-clone --security-opt seccomp:unconfined busybox \
         /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'

$ docker start --checkpoint-dir=/tmp --checkpoint=checkpoint2 looper-clone


You should be able to print the logs from looper-clone and see that they start from wherever the logs of looper2 end.
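The three steps above can be wrapped into one function. This is a sketch, not an established tool: the name clone_from_checkpoint and the cp-NAME checkpoint naming are hypothetical, and a fresh mktemp directory replaces /tmp only to avoid name clashes between runs.

```shell
# clone_from_checkpoint SRC DST IMAGE [COMMAND...]
# Checkpoints SRC into a private directory, creates DST from the same
# image and command, then starts DST from that checkpoint.
clone_from_checkpoint() {
    src=$1
    dst=$2
    image=$3
    shift 3
    dir=$(mktemp -d) || return 1
    docker checkpoint create --checkpoint-dir="$dir" "$src" "cp-$src" || return 1
    docker create --name "$dst" --security-opt seccomp:unconfined \
        "$image" "$@" || return 1
    docker start --checkpoint-dir="$dir" --checkpoint="cp-$src" "$dst"
}

# Usage:
#   clone_from_checkpoint looper2 looper-clone busybox \
#       /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
```

The new container should be created with the same image and command as the source, as in the example above.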

Passing additional options

Configuration files can be used to set additional CRIU options when performing checkpoint/restore of Docker containers. These options should be added in the file /etc/criu/runc.conf (in order to overwrite the ones set by runc/Docker). Note that the options stored in ~/.criu/default.conf or /etc/criu/default.conf will be overwritten by the ones set via RPC by Docker.

For example, in order to checkpoint and restore a container with established TCP connections, CRIU requires the --tcp-established option to be set. However, this option is off by default, and it is currently not possible to change this behaviour via the command-line interface of Docker. The feature can be enabled by adding tcp-established to the file /etc/criu/runc.conf. Note that for this functionality to work, the version of runc (https://github.com/opencontainers/runc) must be recent enough to include commit e157963 (https://github.com/opencontainers/runc/commit/e157963054e1be28bcd6612f15df1ea561c62571).
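Adding the option by hand is a one-line edit; if it is done from a provisioning script, it helps to make the step idempotent. A minimal sketch follows. The function name add_criu_option is hypothetical, and it assumes one option per line, written without the leading dashes, as described above.

```shell
# add_criu_option OPTION [FILE]
# Appends OPTION to the CRIU configuration file unless an identical
# line is already present.
add_criu_option() {
    opt=$1
    conf=${2:-/etc/criu/runc.conf}
    mkdir -p "$(dirname "$conf")" || return 1
    touch "$conf" || return 1
    # -x matches whole lines only, -F treats the option as a fixed string.
    grep -qxF -- "$opt" "$conf" || printf '%s\n' "$opt" >> "$conf"
}

# Usage (as root):
#   add_criu_option tcp-established
```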

An alternative solution is to use Podman (https://podman.io/), which supports specifying --tcp-established on the command line.

Synopsis

Checkpoint

  # docker checkpoint create --help
  Usage: docker checkpoint create [OPTIONS] CONTAINER CHECKPOINT

  Create a checkpoint from a running container

  Options:
        --checkpoint-dir string   Use a custom checkpoint storage directory
        --help                    Print usage
        --leave-running           Leave the container running after checkpoint
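The --leave-running option makes it possible to take a snapshot without stopping the workload. The sketch below names each checkpoint after the current time; the function name snapshot and the snap- prefix are hypothetical.

```shell
# snapshot NAME
# Creates a timestamped checkpoint of a running container without
# stopping it, and prints the checkpoint name.
snapshot() {
    name=$1
    cp="snap-$(date +%Y%m%d-%H%M%S)"
    docker checkpoint create --leave-running "$name" "$cp" >/dev/null || return 1
    echo "$cp"
}

# Usage:
#   cp=$(snapshot looper)            # take a snapshot
#   docker checkpoint ls looper      # list existing checkpoints
#   docker checkpoint rm looper "$cp"  # delete one again
```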

Restore

  # docker start --help
  Usage: docker start [OPTIONS] CONTAINER [CONTAINER...]

  Start one or more stopped containers

  Options:
    -a, --attach                  Attach STDOUT/STDERR and forward signals
        --checkpoint string       Restore from this checkpoint
        --checkpoint-dir string   Use a custom checkpoint storage directory
        --detach-keys string      Override the key sequence for detaching a container
        --help                    Print usage
    -i, --interactive             Attach container's STDIN

Compatibility Notes

The latest versions of the Docker integration require at least version 2.0 of CRIU in order to work correctly. Additionally, depending on the storage driver used by Docker and other factors, there may be further compatibility issues; the known ones are listed here.

TTY

Checkpointing an interactive container is supported by CRIU, runc and containerd, but not yet enabled in Docker. See PR 38405 (https://github.com/moby/moby/pull/38405) for more information.

Seccomp

You'll notice that the examples above that restore into a new container disable Docker's default seccomp profile (--security-opt seccomp:unconfined). In order to use seccomp, you'll need a newer version of the kernel; the exact version required has not been pinned down yet.

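OverlayFS

There is a bug in OverlayFS that reports the wrong mnt_id in /proc/<pid>/fdinfo/<fd> and the wrong symlink target path for /proc/<pid>/fd/<fd>. Fortunately, these bugs have been fixed in kernel v4.2-rc2. The following small kernel patches (commits in torvalds/linux.git) fix the mount ID and symlink target path issues:

* 155e35d4da by David Howells
* df1a085af1 by David Howells
* f25801ee46 by David Howells
* 4bacc9c923 by David Howells
* 9391dd00d1 by Al Viro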

Assuming that you are running Ubuntu Vivid (Linux kernel 3.19), here is how you can patch your kernel:

git clone git://kernel.ubuntu.com/ubuntu/ubuntu-vivid.git
cd ubuntu-vivid
git remote add torvalds git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git remote update

git cherry-pick 155e35d4da
git cherry-pick df1a085af1
git cherry-pick f25801ee46
git cherry-pick 4bacc9c923
git cherry-pick 9391dd00d1

cp /boot/config-$(uname -r) .config
make olddefconfig
make -j 8 bzImage modules
sudo make install modules_install
sudo reboot
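If the same steps are run against a tree that may already contain some of the fixes, the cherry-picks can be made conditional. This sketch relies on git merge-base --is-ancestor to test whether an upstream commit is already part of the current history; the function name pick_missing is hypothetical. Note that a commit that was cherry-picked earlier has a different hash and is not detected this way.

```shell
# pick_missing COMMIT...
# Cherry-picks each commit unless it is already an ancestor of HEAD.
# Stops at the first cherry-pick that fails (for example on a conflict).
pick_missing() {
    for c in "$@"; do
        if git merge-base --is-ancestor "$c" HEAD 2>/dev/null; then
            echo "skip $c"
        else
            git cherry-pick "$c" || return 1
            echo "picked $c"
        fi
    done
}

# Usage, inside the kernel tree after "git remote update":
#   pick_missing 155e35d4da df1a085af1 f25801ee46 4bacc9c923 9391dd00d1
```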

Async IO

If you are using a kernel older than 3.19 and your container uses AIO, you need the following AIO kernel patches from 3.19:

* bd9b51e79c by Al Viro
* e4a0d3e720 by Pavel Emelyanov
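Whether the running kernel is new enough for the AIO fixes (3.19) or the OverlayFS fixes (4.2) can be checked from the output of uname -r. The following is a rough sketch: the function name kernel_at_least is hypothetical, it compares only the major and minor numbers, and it cannot see fixes that a distribution has backported to an older kernel.

```shell
# kernel_at_least MAJOR MINOR [RELEASE]
# Succeeds when RELEASE (default: the running kernel) is at least
# MAJOR.MINOR. RELEASE looks like "3.19.0-80-generic".
kernel_at_least() {
    want_major=$1
    want_minor=$2
    release=${3:-$(uname -r)}
    major=${release%%.*}
    rest=${release#*.}
    # Keep only the leading digits of what follows the first dot.
    minor=${rest%%[!0-9]*}
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "${minor:-0}" -ge "$want_minor" ]; }
}

# Usage:
#   kernel_at_least 3 19 || echo "AIO patches may be needed"
#   kernel_at_least 4 2  || echo "OverlayFS patches may be needed"
```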

External checkpoint/restore

Although it's not recommended, you can also learn more about using CRIU without integrating with Docker. See Docker External for more info.