Live migration

Revision as of 18:01, 14 May 2013 by Xemul

The criu utility can be used to live-migrate applications or containers. This page is a short HOWTO describing the procedure.

Migration sequence

To live-migrate an application or a container, make sure that the files that are, or can be, accessed by the processes you're migrating are available on both nodes, source and destination. This can be achieved either by using a shared file system such as NFS, GlusterFS or Ceph, or by using rsync to copy the files from one box to the other. The rest of this article assumes that the file system is the same on both sides.
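For the rsync variant, a minimal sketch might look like the following. The function name `sync_tree` and its two arguments are hypothetical, and the flag set is only an assumption about what a typical file tree needs (ownership, hard links, ACLs and extended attributes); criu itself does not prescribe it.

```shell
# Hypothetical helper: mirror a directory tree to the same path on the
# destination node, so that both sides see identical files.
#   $1 - directory used by the tasks being migrated
#   $2 - destination host, as accepted by ssh/rsync
sync_tree() {
    dir=${1%/}    # drop one trailing slash, if any
    # -a keeps modes, owners and timestamps; -H hard links; -A ACLs;
    # -X extended attributes; --numeric-ids avoids remapping uid/gid by name.
    rsync -aHAX --numeric-ids "$dir/" "$2:$dir/"
}
```

The trailing slashes make rsync copy the contents of the directory rather than nest it one level deeper on the destination.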

To live-migrate tasks, follow these steps:

Dump

Take the tasks you're about to migrate and dump them into a directory, asking criu to leave them in the stopped state after the dump:

[src]# criu dump --tree <pid> --images-dir <path-to-existing-directory> --leave-stopped

The directory you put the images into can reside on the shared file system if you're using one. In this case you can skip the Copy step and proceed to Restore.
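Before copying or restoring, it can be useful to confirm that the dumped tasks really were left stopped. The following is a minimal sketch, assuming a Linux /proc file system; the helper names `task_state` and `is_stopped` are hypothetical and not part of criu.

```shell
# Print the one-letter state of task $1 (R, S, T, ...) from /proc/<pid>/status.
task_state() {
    [ -r "/proc/$1/status" ] || return 1
    while read -r key value rest; do
        if [ "$key" = "State:" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Succeed only if task $1 exists and is in the stopped (T) state.
is_stopped() {
    [ "$(task_state "$1")" = "T" ]
}
```

Calling `is_stopped` with the PID that was passed to `criu dump --tree` checks only the root task; the same check can be repeated for the other tasks in the tree.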

Copy

Copy the images to the destination node:

[src]# scp -r <path-to-images-dir> <dst>:/<path-to-images>

Restore

Go to the destination node and restore the apps from the images there:

[dst]# criu restore --tree <pid> --images-dir <path-to-images>
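Since criu restores tasks under their original PIDs, a restore that does not go into a fresh PID namespace can fail when one of those PIDs is already taken on the destination node. A minimal pre-flight sketch, assuming Linux /proc and using the hypothetical helper names `pid_is_free` and `restore_tree`, could look like this; it checks only the root PID of the tree, not every task in it.

```shell
# Succeed if no task with PID $1 currently exists on this node.
pid_is_free() {
    [ ! -e "/proc/$1" ]
}

# Hypothetical wrapper around the restore command shown above.
#   $1 - PID of the root task, as it was on the source node
#   $2 - directory holding the copied images
restore_tree() {
    [ -d "$2" ] || { echo "no images directory: $2" >&2; return 1; }
    pid_is_free "$1" || { echo "PID $1 is already in use" >&2; return 1; }
    criu restore --tree "$1" --images-dir "$2"
}
```

Failing early with a clear message is the only design goal here; criu would report a PID conflict on its own, but later and inside its restore log.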

Kill

If everything went OK, you can return to the source node and kill the stopped tasks there.

[src]# FIXME put command here
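The command for this step is left open above. One possible approach, given here only as a sketch, is to walk the process tree through /proc and send SIGKILL to every task in it, children first. The helper names `children_of` and `kill_tree` are hypothetical, and the sketch assumes the tree is still stopped, so no new children can appear while it runs.

```shell
# Print the PIDs whose parent is $1, found by scanning PPid in /proc/*/status.
children_of() {
    for status in /proc/[0-9]*/status; do
        [ -r "$status" ] || continue
        while read -r key value rest; do
            if [ "$key" = "PPid:" ]; then
                if [ "$value" = "$1" ]; then
                    pid=${status#/proc/}
                    echo "${pid%/status}"
                fi
                break
            fi
        done < "$status"
    done
}

# Kill the whole tree rooted at $1, descendants before their parents.
kill_tree() {
    for child in $(children_of "$1"); do
        kill_tree "$child"
    done
    kill -KILL "$1"
}
```

Running `kill_tree` with the PID that was given to `criu dump --tree` removes the stopped originals. SIGKILL is used because it is delivered to stopped tasks as well.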

Notes

With this approach the application's memory ends up stored twice, once in the images directory on the source node and once in its copy on the destination node, which may consume a lot of space. CRIU can perform disk-less migration to address this.

Another issue with this way of doing live migration is that the tasks remain frozen while their memory is being copied to the remote host. If there is a lot of memory, this freeze time can be long. CRIU can shorten it by doing iterative migration.