Difference between revisions of "LXC"

From CRIU
m (Remove unneeded namespace args, these are autodetected.)
Line 1: Line 1:
This article describes how to perform checkpoint-restore for an LXC container. There are two ways to do this: the first part of the article describes how to dump a container with LXC directly, and the second part describes how to do it using tools under development in LXC upstream.
+
== Requirements ==
  
== Preparing a Linux Container ==
+
You should have built and installed a recent (>= 1.3.1) version of CRIU.
  
=== Requirements ===
+
== Checkpointing and restoring a container ==
 
 
* A console should be disabled (<code>lxc.console = none</code>)
 
* udev should not run inside containers (<code>mv /sbin/udevd{,.bcp}</code>)
 
 
 
=== Preparing a host environment ===
 
 
 
* Mount cgroupfs
 
$ mount -t cgroup c /cgroup
 
 
 
* Create a network bridge
 
# cat /etc/sysconfig/network-scripts/ifcfg-br0
 
DEVICE=br0
 
TYPE=Bridge
 
BOOTPROTO=dhcp
 
ONBOOT=yes
 
DELAY=5
 
NM_CONTROLLED=n
 
$ cat /etc/sysconfig/network-scripts/ifcfg-eth0
 
DEVICE="eth0"
 
NM_CONTROLLED="no"
 
ONBOOT="yes"
 
BRIDGE=br0
 
 
 
=== Create and start a container ===
 
* Download an OpenVZ template and extract it.
 
<pre><nowiki>curl http://download.openvz.org/template/precreated/centos-6-x86_64.tar.gz | tar -xz -C test-lxc
 
</nowiki></pre>
 
* Create config files
 
$ cat ~/test-lxc.conf
 
lxc.console=none
 
lxc.utsname = test-lxc
 
lxc.network.type = veth
 
lxc.network.flags = up
 
lxc.network.link = br0
 
lxc.network.name = eth0
 
lxc.mount = /root/test-lxc/etc/fstab
 
lxc.rootfs = /root/test-lxc-root/
 
lxc.console = none
 
lxc.tty = 0
 
 
 
$ cat /root/test-lxc/etc/fstab
 
none /root/test-lxc-root/dev/pts devpts defaults 0 0
 
none /root/test-lxc-root/proc    proc  defaults 0 0
 
none /root/test-lxc-root/sys    sysfs  defaults 0 0
 
none /root/test-lxc-root/dev/shm tmpfs  defaults 0 0
 
 
 
* Register the container
 
$ lxc-create -n test-lxc -f test-lxc.conf
 
 
 
* Start the container
 
$ mount --bind test-lxc test-lxc-root/
 
$ lxc-start -n test-lxc
 
 
 
== Checkpoint and restore an LXC Container ==
 
 
 
=== Preparations ===
 
 
 
You need not only to [[Installation | install]] criu, but also to check that the iproute2 utility (<code>ip</code>) is v3.6.0 or higher.
 
 
 
You can clone the [http://git.kernel.org/?p=linux/kernel/git/shemminger/iproute2.git;a=summary git repo] or download the [http://kernel.org/pub/linux/utils/net/iproute2/ tarball] to compile it manually. To tell criu where the proper ip tool is, set the <code>CR_IP_TOOL</code> environment variable.
 
 
 
=== Dump and restore ===
 
Dumping and restoring an LXC container means dumping a subtree of processes starting from the container's init, plus all kinds of namespaces.
 
Restoring is symmetrical. The way an LXC container works imposes some additional requirements on criu usage.
 
 
 
* To properly isolate the container from unwanted network communication during checkpoint/restore, you should provide a script for locking/unlocking the container network (see below)
 
* When restoring a container with a veth device, you may specify a name for the host-side veth device
 
* To checkpoint and restore live TCP connections, you should use the <code>--tcp-established</code> option
 
 
 
Typically a container dump command will look like
 
<pre>
 
criu dump
 
    --tcp-established                # allow for TCP connections dump
 
    --action-script "net-script.sh"  # use net-script.sh to lock/unlock networking
 
    -D dump/ -o dump.log            # set images dir to dump/ and put logs into dump.log file
 
    -t ${init-pid}                  # start dumping from task ${init-pid}. It should be container's init
 
</pre>
 
and restore command like
 
<pre>
 
criu restore
 
  --tcp-established
 
  --action-script "net-script.sh"
 
  --veth-pair eth0=${veth-name}    # when restoring a veth link use ${veth-name} for host-side device end
 
  --root ${path}                    # path to container root. It should be a root of a (bind)mount
 
  -D data/ -o restore.log
 
  -t ${init-pid}
 
</pre>
 
 
 
We also find it useful to pass the <code>--restore-detached</code> option on restore, so that the container is reparented to init rather than hanging on a criu process launched from the shell. Another useful option is <code>--pidfile</code>, which lets you find out the host-side pid of the container's init after restore.
 
 
 
Also note that there is a bug in how LXC prepares the /dev filesystem for a container, which sometimes makes it impossible to dump a container. The <code>--evasive-devices</code> option can help.
 
 
 
More details on the options mentioned can be found on the [[Usage]] and [[Advanced usage]] pages.
 
 
 
=== Example ===
 
We have [http://git.criu.org/?p=crtools.git;a=tree;f=test/app-emu/lxc;hb=HEAD an application test] for dumping/restoring an LXC container. You may look at it to better understand how to dump and restore your container with criu.
 
 
 
This test contains two scripts:
 
;[http://git.criu.org/?p=crtools.git;a=blob;f=test/app-emu/lxc/run.sh;hb=HEAD run.sh]
 
:This is the main script, which executes ''criu'' twice, once to dump and once to restore the CT. It contains working commands for dumping and restoring a container.
 
 
 
;[http://git.criu.org/?p=crtools.git;a=blob;f=test/app-emu/lxc/network-script.sh;hb=HEAD network-script.sh]
 
: This one is used to lock and unlock CT's network as described above.
 
 
 
===FAQ===
 
* CRIU supports a restricted number of file systems: proc, sysfs, devtmpfs, tmpfs, binfmt_misc. All unsupported file systems must be unmounted or handled by plugins.
 
Error (mount.c:737): FS mnt /sys/fs/pstore dev 0x18 root / unsupported
 
 
 
* /dev/console isn't supported yet. You can try removing the "lxc.cgroup.devices.allow = c 5:1 rwm" line from the CT config.
 
Error (tty.c:203): tty: Can't obtain ptmx index: Inappropriate ioctl for device
 
 
 
* Netlink sockets with pending messages are not supported. You can try again later.
 
Error (sk-netlink.c:77): The socket has data to read
 
 
 
* Quite often CRIU can complain about [[external bind mounts]] like this
 
  ... doesn't have a proper root mount
 
consider using the <code>--ext-mount-map</code> option
 
 
 
== Using LXC Upstream tools ==
 
  
 
LXC upstream has begun to integrate checkpoint/restore support through the lxc-checkpoint tool. Although this functionality is not in any released version of LXC yet, you can check out the development version on Ubuntu by doing:
 
Line 168: Line 49:
 
ssh ubuntu@$(sudo lxc-info -i -H -n u1)
 
 
</pre>
 
</pre>
 +
 +
===Troubleshooting===
 +
* <pre>Error (sk-netlink.c:77): The socket has data to read</pre>
 +
Netlink sockets with pending messages are not supported. Usually if you run the dump again it will succeed.
 +
 +
* <pre>Error (mount.c:805): fusectl isn't empty: 8388625.</pre>
 +
Dumping of fuse filesystems is currently not supported. Empty the container's /sys/fs/fuse/connections and try again.
 +
 +
* <pre>Error (mount.c:517): Mount 58 (master_id: 12 shared_id: 0) has unreachable sharing</pre>
 +
criu doesn't yet support shared mountpoints as lxc does; make sure your rootfs is on a non-shared mount.
  
 
[[Category: HOWTO]]
 

Revision as of 19:05, 19 September 2014

Requirements

You should have built and installed a recent (>= 1.3.1) version of CRIU.
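
The version requirement can be checked from a script. The following is a minimal sketch, not part of CRIU itself: it assumes that <code>criu --version</code> prints a line of the form "Version: X.Y.Z" and that <code>sort -V</code> (GNU coreutils version sort) is available on the host. The helper name <code>version_ge</code> is hypothetical.

```shell
#!/bin/sh
# Minimum CRIU version named in the requirement above.
required="1.3.1"

# version_ge A B: succeed when version A >= version B.
# Relies on `sort -V` (GNU coreutils version sort), an assumption about the host.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumes `criu --version` prints a line such as "Version: 1.3.1".
if command -v criu >/dev/null 2>&1; then
    installed=$(criu --version 2>/dev/null | awk '/^Version:/ {print $2; exit}')
    if version_ge "${installed:-0}" "$required"; then
        echo "criu $installed is recent enough"
    else
        echo "criu ${installed:-unknown} is older than $required" >&2
    fi
fi
```

Version sort is used instead of a plain string comparison so that, for example, 1.10 is treated as newer than 1.3.1.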

Checkpointing and restoring a container

LXC upstream has begun to integrate checkpoint/restore support through the lxc-checkpoint tool. Although this functionality is not in any released version of LXC yet, you can check out the development version on Ubuntu by doing:

sudo add-apt-repository ppa:ubuntu-lxc/daily
sudo apt-get update
sudo apt-get install lxc

Next, create a container:

sudo lxc-create -t ubuntu -n u1 -- -r trusty -a amd64

Then append the following lines to its config:

sudo tee -a /var/lib/lxc/u1/config << EOF
# hax for criu
lxc.console = none
lxc.tty = 0
lxc.cgroup.devices.deny = c 5:1 rwm
EOF
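
Appending with <code>tee -a</code> adds the lines again each time the command is rerun. If that matters, a sketch along the following lines adds each setting only when it is missing. The function names <code>append_once</code> and <code>add_criu_settings</code> are hypothetical helpers for illustration, not LXC tools.

```shell
#!/bin/sh
# append_once FILE LINE: append LINE to FILE unless an identical line is already there.
append_once() {
    grep -qxF -e "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# add_criu_settings FILE: add the three settings from the article to FILE.
add_criu_settings() {
    append_once "$1" "lxc.console = none"
    append_once "$1" "lxc.tty = 0"
    append_once "$1" "lxc.cgroup.devices.deny = c 5:1 rwm"
}

# Example (needs write access to the config, so run it as root):
#   add_criu_settings /var/lib/lxc/u1/config
```

The match is on the whole line and is literal (<code>grep -xF</code>), so a setting written with different spacing would not be detected and would be appended again.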

Finally, start and checkpoint the container:

sudo lxc-start -n u1
sleep 5s  # let the container get to a more interesting state
sudo lxc-checkpoint -s -D /tmp/checkpoint -n u1

At this point, the container's state is stored in /tmp/checkpoint, and the filesystem is in /var/lib/lxc/u1/rootfs. You can restore the container by doing:

sudo lxc-checkpoint -r -D /tmp/checkpoint -n u1

Then get the container's IP address and ssh in:

ssh ubuntu@$(sudo lxc-info -i -H -n u1)
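
The checkpoint step can fail transiently, for example when a netlink socket in the container has pending data, and running the dump again usually succeeds. A retry wrapper is one way to automate that. This is a minimal sketch; <code>retry</code> is a hypothetical helper, and the attempt count and delay are arbitrary choices.

```shell
#!/bin/sh
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts. Returns CMD's last exit status.
retry() {
    max=$1; delay=$2; shift 2
    attempt=1
    while :; do
        "$@" && return 0
        status=$?
        [ "$attempt" -ge "$max" ] && return "$status"
        echo "attempt $attempt failed (exit $status), retrying" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example, using the checkpoint command from the article:
#   retry 3 2 sudo lxc-checkpoint -s -D /tmp/checkpoint -n u1
```

Retrying only helps with transient failures; errors such as an unsupported filesystem will fail the same way on every attempt.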

Troubleshooting

  • Error (sk-netlink.c:77): The socket has data to read

Netlink sockets with pending messages are not supported. Usually if you run the dump again it will succeed.

  • Error (mount.c:805): fusectl isn't empty: 8388625.

Dumping of fuse filesystems is currently not supported. Empty the container's /sys/fs/fuse/connections and try again.

  • Error (mount.c:517): Mount 58 (master_id: 12 shared_id: 0) has unreachable sharing

criu doesn't yet support shared mountpoints as lxc does; make sure your rootfs is on a non-shared mount.
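
To see how a mount is propagated, you can inspect the optional fields of /proc/self/mountinfo, where the kernel records tags such as <code>shared:N</code> and <code>master:N</code>. The sketch below prints those tags for a given mount point; <code>mount_propagation</code> is a hypothetical helper, and the label "private" for a mount without tags is a choice made here for readability.

```shell
#!/bin/sh
# mount_propagation MOUNTPOINT [MOUNTINFO]: print the propagation tags of
# MOUNTPOINT as listed in MOUNTINFO (default /proc/self/mountinfo).
# Prints "private" when the mount has no optional fields; fails if not found.
mount_propagation() {
    awk -v mp="$1" '
        $5 == mp {
            tags = ""
            # Optional fields start at field 7 and end at the "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++)
                tags = tags (tags == "" ? "" : " ") $i
            found = 1
        }
        END {
            if (!found) exit 1
            print (tags == "" ? "private" : tags)
        }
    ' "${2:-/proc/self/mountinfo}"
}

# Example: check the mount point that holds the container rootfs, e.g.
#   mount_propagation /
```

If several mounts are stacked on the same mount point, the last entry in the file is the one reported.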