LXC
This article describes how to perform checkpoint-restore for an LXC container.
Preparing a Linux Container
Requirements
- A console should be disabled (lxc.console = none)
- udev should not run inside containers (mv /sbin/udevd{,.bcp})
Preparing a host environment
- Mount cgroupfs
$ mount -t cgroup c /cgroup
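Note that the mount point has to exist; if /cgroup is not there yet, create it first:
$ mkdir -p /cgroup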
- Create a network bridge
$ cat /etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
DELAY=5
NM_CONTROLLED=no

$ cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE="eth0"
NM_CONTROLLED="no"
ONBOOT="yes"
BRIDGE=br0
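With the ifcfg files in place, the bridge can be brought up by restarting the network (this assumes a Red Hat-style network service, which the ifcfg paths above imply):
$ service network restart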
Create and start a container
- Download an OpenVZ template and extract it.
curl http://download.openvz.org/template/precreated/centos-6-x86_64.tar.gz | tar -xz -C test-lxc
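The tar -C option expects the target directory to exist, so the directories used in this walkthrough can be created beforehand, e.g.:
$ mkdir -p test-lxc test-lxc-root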
- Create config files
$ cat ~/test-lxc.conf
lxc.console = none
lxc.utsname = test-lxc
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.name = eth0
lxc.mount = /root/test-lxc/etc/fstab
lxc.rootfs = /root/test-lxc-root/
$ cat /root/test-lxc/etc/fstab
none /root/test-lxc-root/dev/pts devpts defaults 0 0
none /root/test-lxc-root/proc    proc   defaults 0 0
none /root/test-lxc-root/sys     sysfs  defaults 0 0
none /root/test-lxc-root/dev/shm tmpfs  defaults 0 0
- Register the container
$ lxc-create -n test-lxc -f test-lxc.conf
- Start the container
$ mount --bind test-lxc test-lxc-root/
$ lxc-start -n test-lxc
Checkpoint and restore an LXC Container
Preparations
You not only need to install the crtools, but also to check that the iproute2 utility (ip) is v3.6.0 or higher. If your distribution ships an older version, you can clone the git repo or download the tarball and compile it manually. To tell crtools where the proper ip tool is, set the CR_IP_TOOL environment variable.
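For example, assuming an iproute2 source tree has already been cloned or unpacked (the path is illustrative):
$ cd iproute2 && make
$ export CR_IP_TOOL=$PWD/ip/ip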
Dump and restore
Dumping and restoring an LXC container means dumping a subtree of processes starting from the container's init, plus all kinds of namespaces. Restoring is symmetrical. The way an LXC container works imposes some additional requirements on crtools usage:
- In order to properly isolate the container from unwanted network communication during checkpoint/restore you should provide a script for locking/unlocking the container network (see below)
- When restoring a container with a veth device you may specify a name for the host-side veth device
- In order to checkpoint and restore live TCP connections you should use the --tcp-established option
Typically a container dump command will look like
crtools dump --tcp-established           # allow for TCP connections dump
        -n net -n mnt -n ipc -n pid      # dump all the namespaces the container uses
        --action-script "net-script.sh"  # use net-script.sh to lock/unlock networking
        -D dump/ -o dump.log             # set images dir to dump/ and put logs into dump.log file
        -t ${init-pid}                   # start dumping from task ${init-pid}; it should be the container's init
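The ${init-pid} here is the pid of the container's init as seen from the host. One way to find it, assuming cgroupfs is mounted on /cgroup as above and LXC created a cgroup named after the container (the exact path may differ between LXC versions), is:
$ head -n 1 /cgroup/test-lxc/tasks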
and a restore command like
crtools restore --tcp-established
        -n net -n mnt -n ipc -n pid
        --action-script "net-script.sh"
        --veth-pair eth0=${veth-name}    # when restoring the veth link, use ${veth-name} for the host-side device end
        --root ${path}                   # path to the container root; it should be the root of a (bind)mount
        -D dump/ -o restore.log          # take images from dump/ and put logs into restore.log file
        -t ${init-pid}
We also find it useful to use the --restore-detached option for restore, to make the container reparent to init rather than hang on a crtools process launched from the shell. Another useful option is the --pidfile one: with it you can find out the host-side pid of the container's init after restore.
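For example (test-lxc-init.pid is an arbitrary file name), adding
--restore-detached --pidfile test-lxc-init.pid
to the restore command above lets you read the init pid back afterwards with
$ cat test-lxc-init.pid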
Also note that there is a BUG in how LXC prepares the /dev filesystem for a container, which sometimes makes it impossible to dump a container. The --evasive-devices option can help here.
More details on the options mentioned can be found on the Usage and Advanced usage pages.
Example
We have an application test for dumping/restoring an LXC container. You may look at it to better understand how to dump and restore your container with crtools.
This test contains two scripts:
- run.sh
- This is the main script, which runs crtools two times, to dump and then restore the CT. It contains working commands for dumping and restoring a container.
- network-script.sh
- This one is used to lock and unlock the CT's network, as described above.
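For reference, a minimal sketch of such a lock/unlock script is shown below. It assumes the action script receives the current action via the CRTOOLS_SCRIPT_ACTION environment variable and that the container's traffic can be blocked with iptables rules on the host-side veth device; the device name and the exact rules are illustrative, see the real network-script.sh in the test for details.
#!/bin/sh
# Illustrative lock/unlock script invoked by crtools via --action-script.
# The current action is assumed to be passed in CRTOOLS_SCRIPT_ACTION.
VETH=veth-test-lxc                  # host-side veth device of the container (assumed name)
case "$CRTOOLS_SCRIPT_ACTION" in
network-lock)
	# block all traffic flowing through the container's veth device
	iptables -I FORWARD -i $VETH -j DROP
	iptables -I FORWARD -o $VETH -j DROP
	;;
network-unlock)
	# remove the blocking rules inserted above
	iptables -D FORWARD -i $VETH -j DROP
	iptables -D FORWARD -o $VETH -j DROP
	;;
esac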