This article describes how to perform checkpoint-restore for an LXC container.
Preparing a Linux Container
Requirements
- A console should be disabled (lxc.console = none)
- udev should not run inside containers (mv /sbin/udevd{,.bcp})
Preparing a host environment
- Mount cgroupfs
$ mount -t cgroup c /cgroup
- Create a network bridge
# cat /etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
DELAY=5
NM_CONTROLLED=n
$ cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE="eth0"
NM_CONTROLLED="no"
ONBOOT="yes"
BRIDGE=br0
Create and start a container
- Download an OpenVZ template and extract it.
$ mkdir -p test-lxc
$ curl http://download.openvz.org/template/precreated/centos-6-x86_64.tar.gz | tar -xz -C test-lxc
- Create config files
$ cat ~/test-lxc.conf
lxc.console=none
lxc.utsname = test-lxc
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.name = eth0
lxc.mount = /root/test-lxc/etc/fstab
lxc.rootfs = /root/test-lxc-root/
lxc.console = none
lxc.tty = 0
$ cat /root/test-lxc/etc/fstab
none /root/test-lxc-root/dev/pts devpts defaults 0 0
none /root/test-lxc-root/proc    proc   defaults 0 0
none /root/test-lxc-root/sys     sysfs  defaults 0 0
none /root/test-lxc-root/dev/shm tmpfs  defaults 0 0
- Register the container
$ lxc-create -n test-lxc -f test-lxc.conf
- Start the container
$ mount --bind test-lxc test-lxc-root/
$ lxc-start -n test-lxc
Checkpoint and restore an LXC Container
Preparations
In addition to installing criu, check that the iproute2 utility (ip) is v3.6.0 or higher.
If it is older, you can clone the iproute2 git repo or download the tarball and compile it manually. To tell criu where the proper ip tool is, set the CR_IP_TOOL environment variable.
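The version check and the CR_IP_TOOL setting can be sketched as below. This is only an illustration: the helper names version_ge and select_ip_tool are hypothetical, the version string is passed in by hand because the output format of `ip -V` differs between iproute2 releases, and GNU `sort -V` is assumed to be available.

```shell
#!/bin/sh
# Succeeds if version $1 >= version $2 (dotted numeric versions).
# Assumes GNU sort with -V (version sort) is available.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Hypothetical helper: export CR_IP_TOOL if the given ip binary is new enough.
# $1 = path to the ip binary, $2 = its version string (for example "3.8.0").
select_ip_tool() {
    if version_ge "$2" "3.6.0"; then
        CR_IP_TOOL="$1"
        export CR_IP_TOOL
        echo "using $1 (iproute2 $2)"
    else
        echo "iproute2 $2 at $1 is older than 3.6.0" >&2
        return 1
    fi
}
```

For a self-built binary this might be called as `select_ip_tool /path/to/iproute2/ip/ip 3.8.0`, where the path is a placeholder for wherever the build landed.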
Dump and restore
Dumping an LXC container means dumping a subtree of processes starting from the container's init, plus all the namespaces it uses. Restoring is symmetrical. The way LXC containers work imposes some additional requirements on criu usage.
- To properly isolate the container from unwanted network communication during checkpoint/restore, you should provide a script for locking/unlocking the container network (see below)
- When restoring a container with a veth device, you may specify a name for the host-side veth device
- To checkpoint and restore live TCP connections, you should use the --tcp-established option
Typically a container dump command will look like
criu dump --tcp-established                 # allow for TCP connections dump
          -n net -n mnt -n ipc -n pid       # dump all the namespaces container uses
          --action-script "net-script.sh"   # use net-script.sh to lock/unlock networking
          -D dump/ -o dump.log              # set images dir to dump/ and put logs into dump.log file
          -t ${init-pid}                    # start dumping from task ${init-pid}. It should be container's init
and the restore command like
criu restore --tcp-established
             -n net -n mnt -n ipc -n pid
             --action-script "net-script.sh"
             --veth-pair eth0=${veth-name}   # when restoring a veth link use ${veth-name} for host-side device end
             --root ${path}                  # path to container root. It should be a root of a (bind)mount
             -D dump/ -o restore.log         # read images from the same dir the dump wrote to
             -t ${init-pid}
We also find it useful to pass the --restore-detached option on restore, so that the container is reparented to init rather than hanging on a criu process launched from the shell. Another useful option is --pidfile: with it you can find out the host-side pid of the container's init after restore.
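Reading the pid back after a detached restore could look like the sketch below. It assumes the file given to --pidfile contains a single decimal pid; the function name read_init_pid and the pidfile path in the usage line are hypothetical.

```shell
#!/bin/sh
# Print the pid stored in the pidfile given as $1.
# Fails if the file is unreadable or does not hold a plain decimal number.
read_init_pid() {
    [ -r "$1" ] || { echo "cannot read pidfile $1" >&2; return 1; }
    pid=$(cat "$1")
    case "$pid" in
        ''|*[!0-9]*)
            echo "unexpected content in $1" >&2
            return 1
            ;;
    esac
    echo "$pid"
}
```

After a restore run with `--restore-detached --pidfile /path/to/restore.pid`, something like `init_pid=$(read_init_pid /path/to/restore.pid)` would give the pid to use for later inspection of the container.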
Also note that there is a bug in how LXC prepares the /dev filesystem for a container, which sometimes makes it impossible to dump a container. The --evasive-devices option can help.
More details on the options mentioned can be found on the Usage and Advanced usage pages.
Example
We have an application test for dumping/restoring an LXC container. You may look at it to better understand how to dump and restore your container with criu.
This test contains two scripts:
- run.sh: the main script, which runs criu twice, once to dump and once to restore the CT. It contains working commands for dumping and restoring a container.
- network-script.sh: used to lock and unlock the CT's network, as described above.
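A lock/unlock script of this kind might be structured as in the sketch below; it is not the script shipped with the test. criu passes the current action to action scripts in the CRTOOLS_SCRIPT_ACTION environment variable. Everything else here is an assumption: the host-side device name in VETH, the RUN dry-run switch, and the choice of iptables physdev rules, which only see bridged traffic when bridge netfilter (bridge-nf-call-iptables) is enabled on the host.

```shell
#!/bin/sh
# Assumption: name of the host-side veth device of the container.
VETH="${VETH:-veth-test-lxc}"
# Set RUN=echo to print the commands instead of executing them.
RUN="${RUN:-}"

# Dispatch on the action name that criu provides.
net_action() {
    case "$1" in
        network-lock)
            # Drop bridged traffic entering or leaving through the veth device.
            $RUN iptables -I FORWARD -m physdev --physdev-in "$VETH" -j DROP
            $RUN iptables -I FORWARD -m physdev --physdev-out "$VETH" -j DROP
            ;;
        network-unlock)
            # Remove the rules added by network-lock.
            $RUN iptables -D FORWARD -m physdev --physdev-in "$VETH" -j DROP
            $RUN iptables -D FORWARD -m physdev --physdev-out "$VETH" -j DROP
            ;;
        *)
            # All other actions are ignored in this sketch.
            ;;
    esac
}

net_action "${CRTOOLS_SCRIPT_ACTION:-}"
```

Running it by hand with `RUN=echo CRTOOLS_SCRIPT_ACTION=network-lock sh net-script.sh` prints the rules it would install, which is a cheap way to check the device name before a real dump.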
FAQ
- CRIU supports a limited number of file systems: proc, sysfs, devtmpfs, tmpfs, binfmt_misc. All unsupported file systems must be unmounted or handled by plugins.
Error (mount.c:737): FS mnt /sys/fs/pstore dev 0x18 root / unsupported
- /dev/console isn't supported yet. You can try removing the "lxc.cgroup.devices.allow = c 5:1 rwm" line from the CT config.
Error (tty.c:203): tty: Can't obtain ptmx index: Inappropriate ioctl for device
- Netlink sockets with pending messages are not supported. You can try again later.
Error (sk-netlink.c:77): The socket has data to read
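For the first FAQ item, offending mounts can be spotted before dumping with a sketch like the following. The supported list is copied from that item; the function name list_unsupported is hypothetical, the input is assumed to be in the format of /proc/<pid>/mounts, and skipping the root mount is an assumption of this sketch.

```shell
#!/bin/sh
# File-system types named as supported in the FAQ item.
SUPPORTED="proc sysfs devtmpfs tmpfs binfmt_misc"

# Print "mountpoint fstype" for every entry of the mounts table in $1
# whose type is not in SUPPORTED. The root mount ("/") is skipped.
list_unsupported() {
    while read -r _dev mnt fstype _rest; do
        [ "$mnt" = "/" ] && continue
        case " $SUPPORTED " in
            *" $fstype "*) ;;
            *) echo "$mnt $fstype" ;;
        esac
    done < "$1"
}
```

Called as `list_unsupported /proc/<pid>/mounts` with the host-side pid of the container's init, it lists candidates to unmount, such as the /sys/fs/pstore mount from the error message.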