LXC

Revision as of 14:06, 26 February 2014 by Avagin (talk | contribs) (→‎FAQ)

This article describes how to perform checkpoint-restore for an LXC container.

Preparing a Linux Container

Requirements

  • A console should be disabled (lxc.console = none)
  • udev should not run inside containers (mv /sbin/udevd{,.bcp})

Preparing a host environment

  • Mount cgroupfs
$ mount -t cgroup c /cgroup
  • Create a network bridge
# cat /etc/sysconfig/network-scripts/ifcfg-br0 
DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
DELAY=5
NM_CONTROLLED=n
$ cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE="eth0"
NM_CONTROLLED="no"
ONBOOT="yes"
BRIDGE=br0

Create and start a container

  • Download an OpenVZ template and extract it.
curl http://download.openvz.org/template/precreated/centos-6-x86_64.tar.gz | tar -xz -C test-lxc
  • Create config files
$ cat ~/test-lxc.conf 
lxc.console=none
lxc.utsname = test-lxc
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.name = eth0
lxc.mount = /root/test-lxc/etc/fstab
lxc.rootfs = /root/test-lxc-root/
lxc.console = none
lxc.tty = 0
$ cat /root/test-lxc/etc/fstab
none /root/test-lxc-root/dev/pts devpts defaults 0 0
none /root/test-lxc-root/proc    proc   defaults 0 0
none /root/test-lxc-root/sys     sysfs  defaults 0 0
none /root/test-lxc-root/dev/shm tmpfs  defaults 0 0
  • Register the container
$ lxc-create -n test-lxc -f test-lxc.conf
  • Start the container
$ mount --bind test-lxc test-lxc-root/
$ lxc-start -n test-lxc

Checkpoint and restore an LXC Container

Preparations

You not only need to install the criu, but also check that the iproute2 utility (ip) is v3.6.0 or higher.

You can clone the git repo or download the tarball to compile it manually. In order to tell to criu where the proper ip tool is set the CR_IP_TOOL environment variable.

Dump and restore

Dumping and restoring an LXC contianer means -- dumping a subtree of processes starting from container init plus all kinds of namespaces. Restoring is symmetrical. The way LXC container works imposes some more requirements on criu usage.

  • In order to properly isolate container from unwanted networking communication during checkpoint/restore you should provide a script for locking/unlocking the container network (see below)
  • When restoring a container with veth device you may specify a name for the host-side veth device
  • In order to checkpoint and restore alive TCP connections you should use the --tcp-established option

Typically a container dump command will look like

criu dump 
    --tcp-established                # allow for TCP connections dump
    -n net -n mnt -n ipc -n pid      # dump all the namespaces container uses
    --action-script "net-script.sh"  # use net-script.sh to lock/unlock networking
    -D dump/ -o dump.log             # set images dir to dump/ and put logs into dump.log file
    -t ${init-pid}                   # start dumping from task ${init-pid}. It should be container's init

and restore command like

criu restore
   --tcp-established
   -n net -n mnt -n ipc -n pid
   --action-script "net-script.sh"
   --veth-pair eth0=${veth-name}     # when restoring a veth link use ${veth-name} for host-side device end
   --root ${path}                    # path to container root. It should be a root of a (bind)mount
   -D data/ -o restore.log
   -t ${init-pid}

We also find it useful to use the --restore-detached option for restore to make contianer reparent to init rather than hanging on a criu process launched from shell. Another useful option is the --pidfile one -- you will be able to find out the host-side pid of a container init after restore.

Also note, that there's a BUG in how LXC prepares the /dev filesystem for a container which sometimes makes it impossible to dump and container. The --evasive-devices option can help.

More details on the option mentioned can be found in Usage and Advanced usage pages.

Example

We have an application test for dumping/restoring an LXC Container. You may look at it for better understanding how to dump and restore your container with criu.

This test contains two scripts:

run.sh
This is the main script, which executes criu two times for dumping and restoring CT. It contains a working commands for dumping and restoring a container.
network-script.sh
This one is used to lock and unlock CT's network as described above.

FAQ

  • CRIU supports restricted number of file systems: proc, sysfs, devtmpfs, tmpfs, binfmt_misc. All unsupported file systems must be umounted or handled by plugins.
Error (mount.c:737): FS mnt /sys/fs/pstore dev 0x18 root / unsupported
  • /dev/console isn't supported yet. You can try to remove the "lxc.cgroup.devices.allow = c 5:1 rwm" from a CT config.
Error (tty.c:203): tty: Can't obtain ptmx index: Inappropriate ioctl for device
  • Netlink sockets with pending messages are not supported. You can try again later.
Error (sk-netlink.c:77): The socket has data to read