Note: The main article about live migration is here. The article below just describes how it can be done.
The criu utility can be used to live-migrate applications or containers. This page is a short HOWTO describing the procedure.
To live-migrate an application or a container, make sure that the files the migrated processes access (or may access) are available on both nodes, source and destination. This can be achieved either with a shared file system such as NFS, GlusterFS or Ceph, or by using rsync to copy the files from one box to the other. The rest of this article assumes that the file system is the same on both sides.
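When no shared file system is available, the rsync approach could look roughly like the sketch below. The function name sync_files and its arguments are hypothetical, and the sketch assumes the directory should live at the same path on both nodes and that ssh access to the destination is already set up.

```shell
#!/bin/sh
# Hypothetical helper: mirror a directory to the same path on the
# destination node so that both sides see identical files.
# Usage: sync_files <directory> <dst-host>
sync_files() {
    # -a (archive) keeps permissions, ownership, timestamps and symlinks.
    # The trailing slashes make rsync copy the directory's contents
    # rather than nest the directory inside the target.
    rsync -a "$1/" "$2:$1/"
}
```

For example, `sync_files /srv/app node2` would copy /srv/app to node2:/srv/app. Run it before the dump, and again after it if the tasks may have modified files in the meantime.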
To live-migrate tasks, follow these steps:
Take the tasks you're about to migrate and dump them into some directory, asking criu to leave them in the stopped state after the dump:
[src]# criu dump --tree <pid> --images-dir <path-to-existing-directory> --leave-stopped
The directory you put the images into can reside on the shared file system if you're using one. In this case you can skip the copy step and proceed to the restore.
Copy the images to the destination node:
[src]# scp -r <path-to-images-dir> <dst>:/<path-to-images>
Go to the destination node and restore the apps from the images:
[dst]# criu restore --images-dir <path-to-images>
If everything went OK, you can return to the source node and kill the stopped tasks there:
[src]# FIXME put command here
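The whole sequence (dump, copy, restore, kill) can be sketched as a pair of shell functions. This is only an illustration under several assumptions that do not come from this article: the function names kill_tree and migrate are hypothetical, root can ssh to the destination without a password, the parent of the images directory exists on both nodes, pgrep is installed, and the restore is run with --restore-detached so that criu returns once the tasks are running.

```shell
#!/bin/sh
# Kill a process and all of its descendants, children first.
# SIGKILL is delivered to stopped tasks as well.
kill_tree() {
    for child in $(pgrep -P "$1"); do
        kill_tree "$child"
    done
    kill -KILL "$1"
}

# Usage: migrate <pid> <dst-host> <images-dir>
migrate() {
    pid=$1 dst=$2 images=$3

    # criu expects an existing directory for the images.
    mkdir -p "$images" || return 1

    # Dump the process tree and leave the tasks stopped on the source.
    criu dump --tree "$pid" --images-dir "$images" --leave-stopped || return 1

    # Copy the images directory to the same path on the destination.
    scp -r "$images" "$dst:$(dirname "$images")/" || return 1

    # Restore on the destination from the copied images.
    ssh "$dst" criu restore --images-dir "$images" --restore-detached || return 1

    # Only after a successful restore, remove the stopped originals.
    kill_tree "$pid"
}
```

A call such as `migrate 1234 node2 /var/tmp/criu-images` would then run all four steps. If any step fails, the function returns early and the stopped tasks stay on the source node, where they can be resumed with `kill -CONT`.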
- The image directories will contain two copies of the applications' memory, which may take up a lot of space. CRIU can perform disk-less migration to address this.
- Another issue with this approach is that the tasks remain frozen while their memory is copied to the remote host. If there is a lot of memory, this freeze time can be long. CRIU can shorten it by doing iterative migration.
- If you're live-migrating a shell job, remember that the --shell-job option must be used at both stages, dump and restore. See more details about shell jobs here.
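For the shell-job case, the dump and restore commands from the steps above would each gain the --shell-job option. A minimal sketch follows; the wrapper names dump_shell_job and restore_shell_job are hypothetical and exist only to show the two invocations side by side.

```shell
#!/bin/sh
# Usage: dump_shell_job <pid> <images-dir>      (on the source node)
dump_shell_job() {
    criu dump --tree "$1" --images-dir "$2" --leave-stopped --shell-job
}

# Usage: restore_shell_job <images-dir>         (on the destination node)
restore_shell_job() {
    # The same option has to be repeated here, otherwise the restore
    # does not match the way the images were dumped.
    criu restore --images-dir "$1" --shell-job
}
```

The restore of a shell job is normally started from a terminal on the destination node, since the restored tasks attach to it.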