This article describes how to perform checkpoint-restore for an LXC container.
Preparing a Linux Container
Requirements
- A console should be disabled (lxc.console = none)
- udev should not run inside containers (mv /sbin/udevd{,.bcp})
Preparing a host environment
- Mount cgroupfs
$ mount -t cgroup c /cgroup
- Create a network bridge
# cat /etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
DELAY=5
NM_CONTROLLED=n
$ cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE="eth0"
NM_CONTROLLED="no"
ONBOOT="yes"
BRIDGE=br0
Create and start a container
- Download an OpenVZ template and extract it.
$ mkdir -p test-lxc test-lxc-root
$ curl http://download.openvz.org/template/precreated/centos-6-x86_64.tar.gz | tar -xz -C test-lxc
- Create config files
$ cat ~/test-lxc.conf
lxc.console=none
lxc.utsname = test-lxc
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.name = eth0
lxc.mount = /root/test-lxc/etc/fstab
lxc.rootfs = /root/test-lxc-root/
lxc.console = none
lxc.tty = 0
$ cat /root/test-lxc/etc/fstab
none /root/test-lxc-root/dev/pts devpts defaults 0 0
none /root/test-lxc-root/proc proc defaults 0 0
none /root/test-lxc-root/sys sysfs defaults 0 0
none /root/test-lxc-root/dev/shm tmpfs defaults 0 0
- Register the container
$ lxc-create -n test-lxc -f test-lxc.conf
- Start the container
$ mount --bind test-lxc test-lxc-root/
$ lxc-start -n test-lxc
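The dump command described later needs the host-side PID of the container's init. One possible way to obtain it is to parse the output of lxc-info. The sketch below assumes that your LXC version prints a line of the form "pid: 1234" (or "PID: 1234"); the helper name parse_init_pid is made up for this example.

```shell
#!/bin/sh
# parse_init_pid: read lxc-info output on stdin and print the number found
# on the "pid:" line. The line format is an assumption; check what your
# LXC version actually prints.
parse_init_pid() {
    sed -n 's/^[Pp][Ii][Dd]:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Intended use on the host (needs LXC and a running container):
#   init_pid=$(lxc-info -n test-lxc | parse_init_pid)
```

If the line is not found, the function prints nothing, so an empty init_pid means the output format differs from the one assumed here.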
Checkpoint and restore an LXC Container
Preparations
Besides installing criu, check that the iproute2 utility (ip) is v3.6.0 or higher.
If your distribution ships an older version, you can clone the iproute2 git repo or download the tarball and compile it manually. To tell criu where the proper ip tool is, set the CR_IP_TOOL environment variable.
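The version requirement can be checked with a small script. This is a minimal sketch, assuming that `ip -V` prints a string containing "iproute2-X.Y.Z" (some older releases print a snapshot date instead, in which case the check reports failure and you have to verify the version by hand) and that `sort -V` is available. The function names and the path in the final comment are hypothetical.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A is >= B (relies on `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_ip_tool PATH: succeed when the ip binary at PATH reports >= 3.6.0.
# Parsing the `ip -V` output this way is an assumption about its format.
check_ip_tool() {
    ver=$("$1" -V 2>/dev/null | sed -n 's/.*iproute2-\([0-9][0-9.]*\).*/\1/p')
    [ -n "$ver" ] && version_ge "$ver" 3.6.0
}

check_ip_tool "${CR_IP_TOOL:-ip}" ||
    echo "could not confirm that ip is v3.6.0 or higher" >&2

# If a newer ip was built by hand, point criu at it (hypothetical path):
#   export CR_IP_TOOL=/opt/iproute2/ip/ip
```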
Dump and restore
Dumping an LXC container means dumping the subtree of processes starting from the container's init, plus all the namespaces it uses. Restoring is symmetrical. The way an LXC container works imposes some additional requirements on criu usage.
- In order to properly isolate the container from unwanted network communication during checkpoint/restore you should provide a script for locking/unlocking the container network (see below)
- When restoring a container with a veth device you may specify a name for the host-side veth device
- In order to checkpoint and restore live TCP connections you should use the --tcp-established option
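A lock/unlock script for the first requirement could look roughly like the sketch below. It relies on criu passing the current action to the script in the CRTOOLS_SCRIPT_ACTION environment variable. Detaching the host-side veth device from the bridge is only one possible way to lock the network, and the VETH_NAME, BRIDGE and DRY_RUN variables, as well as the default device name, are assumptions made for this example.

```shell
#!/bin/sh
# net_command ACTION VETH BRIDGE: print the command for the given criu action.
# Prints nothing for actions this script does not handle.
net_command() {
    case "$1" in
        network-lock)   echo "ip link set dev $2 nomaster" ;;
        network-unlock) echo "ip link set dev $2 master $3" ;;
    esac
}

# VETH_NAME and BRIDGE are hypothetical variables exported by the caller.
cmd=$(net_command "${CRTOOLS_SCRIPT_ACTION:-}" "${VETH_NAME:-veth-test-lxc}" "${BRIDGE:-br0}")

if [ -n "$cmd" ]; then
    # DRY_RUN defaults to 1 in this sketch, so nothing is changed on the host
    # unless the caller sets DRY_RUN=0.
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "would run: $cmd"
    else
        $cmd
    fi
fi
```

The script does nothing and exits successfully for every other action, so criu can call it at each stage without side effects.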
Typically a container dump command will look like
criu dump
--tcp-established # allow for TCP connections dump
-n net -n mnt -n ipc -n pid # dump all the namespaces container uses
--action-script "net-script.sh" # use net-script.sh to lock/unlock networking
-D dump/ -o dump.log # set images dir to dump/ and put logs into dump.log file
-t ${init-pid} # start dumping from task ${init-pid}. It should be container's init
and a restore command like
criu restore
--tcp-established
-n net -n mnt -n ipc -n pid
--action-script "net-script.sh"
--veth-pair eth0=${veth-name} # when restoring a veth link use ${veth-name} for host-side device end
--root ${path} # path to container root. It should be a root of a (bind)mount
-D dump/ -o restore.log
-t ${init-pid}
We also find it useful to use the --restore-detached option for restore to make the container reparent to init rather than hang on a criu process launched from the shell. Another useful option is --pidfile: with it you can find out the host-side pid of the container init after restore.
Also note that there is a bug in how LXC prepares the /dev filesystem for a container, which sometimes makes it impossible to dump a container. The --evasive-devices option can help.
More details on the options mentioned can be found on the Usage and Advanced usage pages.
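The dump and restore commands above can be wrapped in two shell functions so that the same arguments are used for both steps. This is a sketch under assumptions: the function names, the DRY_RUN switch and the pidfile location are invented for illustration, and the criu options are the ones listed in this article, so check them against the criu version you have installed.

```shell
#!/bin/sh
# run: print the command when DRY_RUN=1 (the default in this sketch),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

# dump_container INIT_PID IMAGES_DIR
dump_container() {
    run criu dump --tcp-established \
        -n net -n mnt -n ipc -n pid \
        --action-script net-script.sh \
        -D "$2" -o dump.log -t "$1"
}

# restore_container INIT_PID IMAGES_DIR ROOT VETH_NAME
# The pidfile path is a made-up example location.
restore_container() {
    run criu restore --tcp-established \
        -n net -n mnt -n ipc -n pid \
        --action-script net-script.sh \
        --veth-pair "eth0=$4" --root "$3" \
        --restore-detached --pidfile "$2/init.pid" \
        -D "$2" -o restore.log -t "$1"
}
```

With DRY_RUN left at its default, calling dump_container 1234 dump/ only prints the command line, which is a convenient way to review it before running it as root with DRY_RUN=0.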
Example
We have an application test for dumping/restoring an LXC container. You may look at it to better understand how to dump and restore your container with criu.
This test contains two scripts:
- run.sh
- This is the main script, which runs criu twice: once to dump the CT and once to restore it. It contains working commands for dumping and restoring a container.
- network-script.sh
- This one is used to lock and unlock CT's network as described above.
FAQ
- CRIU supports a restricted number of file systems: proc, sysfs, devtmpfs, tmpfs, binfmt_misc. All unsupported file systems must be unmounted or handled by plugins.
Error (mount.c:737): FS mnt /sys/fs/pstore dev 0x18 root / unsupported
- /dev/console isn't supported yet. You can try removing "lxc.cgroup.devices.allow = c 5:1 rwm" from the CT config.
Error (tty.c:203): tty: Can't obtain ptmx index: Inappropriate ioctl for device
- Netlink sockets with pending messages are not supported. You can try again later.
Error (sk-netlink.c:77): The socket has data to read
- Quite often CRIU complains about external bind mounts like this:
... doesn't have a proper root mount
In that case, consider using the --ext-mount-map option.
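For the first FAQ entry, it can help to list the container's mounts before dumping. The sketch below filters a mounts table (in the format of /proc/PID/mounts) against the file system list given above. The function name is hypothetical, and the filter is deliberately strict: it flags everything outside that list, including the root file system, so the output is a list of mounts to review rather than a list of certain failures.

```shell
#!/bin/sh
# unsupported_mounts: read a mounts table on stdin (device, mount point,
# fs type, ...) and print "mountpoint fstype" for every entry whose type
# is not in the list from this article.
unsupported_mounts() {
    awk '$3 !~ /^(proc|sysfs|devtmpfs|tmpfs|binfmt_misc)$/ { print $2, $3 }'
}

# Intended use on the host, where init_pid is the container init's PID:
#   unsupported_mounts < "/proc/$init_pid/mounts"
```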