The criu utility can be used to perform live migration of applications or containers. This page is a short HOWTO describing the procedure.
Note: Live migration of containers is covered in a separate main article.
Migration sequence
To live-migrate an application or a container, make sure that every file the migrated processes access (or may access) is available on both nodes, source and destination. This can be achieved either with a shared file system such as NFS, GlusterFS, or Ceph, or by using rsync to copy the files from one box to the other. The rest of this article assumes that the file system is the same on both sides.
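If no shared file system is available, the rsync variant could look roughly like the sketch below. The helper names run and sync_files, the DRY_RUN variable, and the choice of the -a (archive) option are assumptions made for illustration; adjust them to your setup. By default the sketch only prints the command it would run.

```shell
#!/bin/sh
# Print the command when DRY_RUN is 1 (the default here), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# sync_files PATH DST_HOST
# Mirror PATH to the same location on the destination node.
# The trailing slashes make rsync copy the directory contents
# rather than nesting the directory inside itself on the other side.
sync_files() {
    src_path=$1
    dst_host=$2
    run rsync -a "$src_path/" "$dst_host:$src_path/"
}
```

For example, sync_files /srv/app dst prints the rsync invocation; DRY_RUN=0 sync_files /srv/app dst actually performs the copy.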
To live-migrate tasks, perform the following steps:
Dump
Take the tasks you are about to migrate and dump them into some directory, asking criu to leave them in the stopped state after the dump:
[src]# criu dump --tree <pid> --images-dir <path-to-existing-directory> --leave-stopped
The directory you put the images in can reside on the shared file system if you are using one. In that case you can skip the Copy step and proceed to Restore.
Copy
Copy the images to the destination node:
[src]# scp -r <path-to-images-dir> <dst>:/<path-to-images>
Restore
Go to the destination node and restore the apps from the images:
[dst]# criu restore --images-dir <path-to-images>
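The Dump, Copy and Restore steps can be chained in a small script. What follows is only a sketch under several assumptions: the helper names run and migrate and the DRY_RUN variable are hypothetical, the images directory uses the same path on both nodes, the destination is reachable over ssh as root, and --restore-detached is added so that criu does not stay in the foreground of the ssh session. Restore is invoked with --images-dir only, since the images already describe the process tree. By default the script just prints the commands.

```shell
#!/bin/sh
# Print the command when DRY_RUN is 1 (the default here), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# migrate PID IMAGES_DIR DST_HOST
migrate() {
    pid=$1      # root of the process tree on the source node
    images=$2   # images directory, same path on both nodes (assumption)
    dst=$3      # destination host

    # criu dump expects the images directory to exist already.
    run mkdir -p "$images" || return 1
    # Dump: the tasks stay stopped on the source afterwards.
    run criu dump --tree "$pid" --images-dir "$images" --leave-stopped || return 1
    # Copy: place the directory under the same parent on the destination.
    run scp -r "$images" "$dst:$(dirname "$images")" || return 1
    # Restore on the destination node.
    run ssh "$dst" criu restore --images-dir "$images" --restore-detached
}
```

Calling migrate 1234 /tmp/criu-images dst prints the four commands in order, which is a convenient way to review them before running the script with DRY_RUN=0.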
Kill
If everything went OK, you can return to the source node and kill the stopped tasks there.
[src]# FIXME put command here
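One possible way to carry out this step is sketched below. It assumes a Linux /proc file system, and the helper names children_of and kill_tree are hypothetical. Because the tasks were left stopped, they cannot fork while the tree is being walked, and SIGKILL is delivered even to stopped processes.

```shell
#!/bin/sh
# children_of PID
# Print the PIDs whose parent is PID, found by scanning /proc (Linux only).
children_of() {
    for status in /proc/[0-9]*/status; do
        [ -r "$status" ] || continue
        child=${status#/proc/}
        child=${child%/status}
        # Each line of the status file looks like "Key:<tab>value".
        while read -r key value; do
            if [ "$key" = "PPid:" ]; then
                if [ "$value" = "$1" ]; then
                    echo "$child"
                fi
                break
            fi
        done < "$status" 2>/dev/null || :
    done
}

# kill_tree PID
# Kill PID and all of its descendants, deepest processes first.
kill_tree() {
    for child in $(children_of "$1"); do
        kill_tree "$child"
    done
    kill -KILL "$1" 2>/dev/null || :
}
```

On the source node you would then call kill_tree with the same PID that was passed to criu dump --tree.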
Notes
The image directories would contain two copies of the applications' memory, which may take a lot of space. CRIU can perform disk-less migration to address this.
Another issue with this way of doing live migration is that the tasks remain frozen while their memory is copied to the remote host. If there is a lot of memory, this freeze time can be long. CRIU can shorten it by doing iterative migration.
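To give an idea of what the iterative approach looks like on the source side, here is a rough sketch built on criu pre-dump, which saves memory while letting the tasks keep running, so that the final dump only has to write the pages changed since the last round. The helper names run and iterative_dump, the DRY_RUN variable, the numbered directory layout and the number of rounds are all assumptions made for illustration. Copying each round's images to the destination is left out. By default the commands are only printed.

```shell
#!/bin/sh
# Print the command when DRY_RUN is 1 (the default here), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# iterative_dump PID BASE_DIR ROUNDS
# Do ROUNDS pre-dumps into BASE_DIR/1, BASE_DIR/2, ... and then the
# final dump into BASE_DIR/final.
iterative_dump() {
    pid=$1
    base=$2
    rounds=$3
    i=1
    parent=
    while [ "$i" -le "$rounds" ]; do
        run mkdir -p "$base/$i" || return 1
        # --prev-images-dir is given relative to --images-dir and is only
        # added when an earlier round exists.
        run criu pre-dump --tree "$pid" --images-dir "$base/$i" --track-mem \
            ${parent:+--prev-images-dir "../$parent"} || return 1
        parent=$i
        i=$((i + 1))
    done
    run mkdir -p "$base/final" || return 1
    # Final dump: the tasks are frozen only while the remaining pages are written.
    run criu dump --tree "$pid" --images-dir "$base/final" --track-mem \
        --leave-stopped ${parent:+--prev-images-dir "../$parent"}
}
```

For instance, iterative_dump 1234 /tmp/img 2 prints two pre-dump rounds followed by the final dump, each one pointing at the previous round's directory.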