Changes

1,826 bytes added, 12:39, 13 January 2019
m
→‎Restore: Using -t with criu restore is obsoleted
Line 1: Line 1: −
The <code>crtools</code> utility can be used to perform live migration of apps or containers. This page is a sort of HOWTO describing this.
+
{{Note|The main article about live migration is [[P.Haul|here]]. The article below is just a description of how it can be done.}}
   −
== Migration sequence ==
+
Live migration attempts to provide a seamless transfer of service between physical machines without impacting client processes or applications.
 +
 
 +
The <code>criu</code> utility can be used to perform live migration of apps or containers. This page is a HOWTO describing the procedure.
 +
 
 +
== Prerequisites ==
    
In order to live-migrate an application or a container, make sure that the files that are or can be accessed by the processes you're migrating are available on both nodes, source and destination. This can be achieved either by using a shared file system such as NFS, GlusterFS or Ceph, or by using <code>rsync</code> to copy files from one box to another. The rest of this article assumes that the file system is the same on both sides.
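When no shared file system is available, the <code>rsync</code> option mentioned above could look roughly like the sketch below. The function name, the <code>DRY_RUN</code> switch and the example paths are illustrative assumptions, not part of CRIU; the sketch only mirrors one directory and assumes SSH access to the destination.

```shell
#!/bin/sh
# Hypothetical helper (not part of CRIU): mirror the application's files
# to the same path on the destination node.
# Set DRY_RUN=1 to print the command instead of running it.
sync_app_files() {
    app_dir=$1
    dst_host=$2
    # -a preserves permissions, ownership, timestamps and symlinks;
    # --delete removes destination files that no longer exist on the source.
    # The trailing slashes make rsync copy the directory contents in place.
    set -- rsync -a --delete "$app_dir/" "$dst_host:$app_dir/"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, <code>sync_app_files /srv/app dst</code> run on the source node would copy <code>/srv/app</code> to the host <code>dst</code>; both names are placeholders.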
   −
In order to live migrate tasks you should do these steps:
+
Another thing to take care of is networking. The general rule is that the IP addresses your application uses should be available on the destination host. The reason is that when restoring TCP sockets, CRIU will try to bind() and connect() them back using their original credentials, and if the requested IP address is not available on the destination side, the respective system call will fail. Also, during migration the connections are locked by CRIU, and there are two options here.
 +
 
 +
The first is when your app shares networking with the host. In this case CRIU locks connections using iptables rules, so you should make sure the rules are available on the destination side. The second is when the app lives in a net namespace (a container). In this case CRIU will call [[action scripts]] to lock the network, and it's up to you how to lock it. In the case of [[Docker]], the daemon handles this through the libnetwork library.
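Before migrating, it may be worth checking that the addresses the application uses are actually configured on the destination. The sketch below is one possible way to do that with iproute2; the function names are hypothetical, and it only checks that the address is assigned to some interface, not that routing or iptables rules match.

```shell
#!/bin/sh
# Reads `ip -o addr show` output on stdin and succeeds if ADDR is assigned.
addr_in_list() {
    awk -v addr="$1" '
        # In one-line mode, field 3 is the family and field 4 is ADDRESS/PREFIX.
        $3 == "inet" || $3 == "inet6" {
            split($4, parts, "/")
            if (parts[1] == addr) found = 1
        }
        END { exit !found }
    '
}

# Checks the local host; intended to be run on the destination node.
addr_available() {
    ip -o addr show | addr_in_list "$1"
}
```

On the destination, <code>addr_available 192.168.1.5 || echo "address missing"</code> would flag an address that restored sockets could not bind to; the address here is only an example.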
 +
 
 +
That said, in order to live migrate tasks, follow these steps:
    
=== Dump ===
 
Take tasks you're about to migrate and dump them into some place, asking <code>crtools</code> to leave them in stopped state after dump:
+
Take the tasks you're about to migrate and dump them to some location, asking <code>criu</code> to leave them in the stopped state after the dump:
   −
  [src]# crtools dump --tree <pid> --images-dir <path-to-existing-directory> --leave-stopped
+
  [src]# criu dump --tree <pid> --images-dir <path-to-existing-directory> --leave-stopped
    
The directory you put the images in can reside on the shared file system if you're using one. In this case you can skip the Copy step and proceed to Restore.
Line 22: Line 30:  
Go to the destination node and restore the apps from the images there:
   −
  [dst]# crtools restore --tree <pid> --images-dir <path-to-images>
+
  [dst]# criu restore --images-dir <path-to-images>
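The Dump and Restore commands above can be wrapped in small shell functions, for instance as in the sketch below. The function names and the <code>DRY_RUN</code> switch are hypothetical; the <code>criu</code> options are the ones shown on this page. Note that restore takes no <code>--tree</code> option, in line with this edit's summary.

```shell
#!/bin/sh
# Hypothetical wrapper: set DRY_RUN=1 to print commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Source node: dump the process tree rooted at PID into IMAGES_DIR,
# leaving the tasks stopped after the dump.
dump_tree() {
    # The images directory has to exist before the dump starts.
    run mkdir -p "$2" &&
    run criu dump --tree "$1" --images-dir "$2" --leave-stopped
}

# Destination node: restore the tasks from the images in IMAGES_DIR.
restore_tree() {
    run criu restore --images-dir "$1"
}
```

With <code>DRY_RUN=1</code>, calling <code>dump_tree</code> with a PID and a directory prints the two commands it would run, which is a cheap way to review them before touching live tasks.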
    
=== Kill ===
 
Line 31: Line 39:  
== Notes ==
 
   −
The directories with images would contain two copies of applications memory, which may be space-consuming. The CRIU can perform [[disk-less migration]] to address this.
+
* The image directories will contain two copies of the application's memory, which may consume a lot of space. CRIU can perform [[disk-less migration]] to address this.
 +
 
 +
* Another issue with this way of doing live migration is that tasks remain frozen while their memory is copied to the remote host. If there is a lot of memory, this freeze time can be long. CRIU can reduce it by doing [[iterative migration]].
 +
 
 +
* If you're live migrating a shell job, remember that the <code>--shell-job</code> option must be used on both stages, dump and restore. See more details about shell jobs [[Advanced usage|here]].
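For a shell job, the pair of commands could be generated as in the sketch below, so that <code>--shell-job</code> cannot be forgotten on one side. The function name is hypothetical and the function only prints the commands, using the same <code>[src]</code> and <code>[dst]</code> prompts as the steps above.

```shell
#!/bin/sh
# Hypothetical helper: print the dump and restore commands for a shell job.
# Both stages carry --shell-job.
shell_job_commands() {
    pid=$1
    images_dir=$2
    echo "[src]# criu dump --tree $pid --images-dir $images_dir --leave-stopped --shell-job"
    echo "[dst]# criu restore --images-dir $images_dir --shell-job"
}
```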
 +
 
 +
== See also ==
 +
 
 +
* [[P.Haul]]
 +
* [[Iterative migration]]
 +
* [[Disk-less migration]]
 +
* [[Lazy migration]]
 +
* [[Page server]]
 +
 
 +
[[Category: HOWTO]]
 +
[[Category: Live migration]]