Difference between revisions of "P.Haul"

P.Haul is an extension to CRIU that makes [[live migration]] with CRIU possible. The effort started as a set of [[Py-P.Haul|Python scripts]], but because integrating the Python code proved too complex, it was rewritten in Go. The sources now live in the [https://github.com/checkpoint-restore/go-criu go-criu] repository.

== Description ==

The P.Haul library consists of a pair of Go types: one runs on the source node, the other on the destination. Users import the package into their own projects and call its functions directly. No CLI is provided (yet).

=== Configuration ===

Both the source and the destination should create a <code>PhaulConfig</code> object that configures the client and the server. Its fields are:

* <code>Pid</code> -- the pid of the process subtree to live migrate
* <code>Memfd</code> -- the file descriptor through which CRIU will send the processes' memory contents
* <code>Wdir</code> -- the path where CRIU can put intermediate files (images, logs, etc.)

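As an illustration of the fields listed above, here is a minimal sketch of such a configuration object. The <code>PhaulConfig</code> struct is declared locally as a stand-in; the field types are assumptions, and <code>Validate</code> is a hypothetical helper that is not part of the library.

```go
package main

import (
	"errors"
	"fmt"
)

// PhaulConfig is a local stand-in for the configuration object described
// above. The field names come from the text; the field types are assumptions.
type PhaulConfig struct {
	Pid   int    // pid of the process subtree to live migrate
	Memfd int    // descriptor through which CRIU sends memory contents
	Wdir  string // directory for images, logs and other intermediate files
}

// Validate is a hypothetical helper (not part of the library) that rejects
// obviously unusable configurations before a migration is attempted.
func (c PhaulConfig) Validate() error {
	switch {
	case c.Pid <= 0:
		return errors.New("pid must be a positive process id")
	case c.Memfd < 0:
		return errors.New("memfd must be a valid file descriptor")
	case c.Wdir == "":
		return errors.New("wdir must name a working directory")
	}
	return nil
}

func main() {
	cfg := PhaulConfig{Pid: 1234, Memfd: 5, Wdir: "/tmp/phaul-wdir"}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("would migrate pid %d using %s\n", cfg.Pid, cfg.Wdir)
}
```

Both sides would fill in their own copy of the object; the pid, descriptor number and path used here are placeholders.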
=== Destination ===

The destination process calls the <code>phaul.MakePhaulServer</code> routine, which returns a handler (and a Go error). Its argument is the <code>PhaulConfig</code> object described above.

=== Source ===

The source calls the <code>phaul.MakePhaulClient</code> routine, which also returns a handler (and a Go error). Its arguments are more complex.

The first is the <code>PhaulLocal</code> interface, which has a single method called <code>DumpCopyRestore</code>. This method is called once the P.Haul client and server agree that all preparations (pre-dumps) are done and it is time to perform the full dump, copy the images and perform the full restore. It is up to the go-phaul caller to implement this method, because dumping processes is very engine-specific: OpenVZ, Docker and LXC, for example, all have [[CLI|different ways]] of invoking the <code>criu dump</code> operation. The method accepts:

* <code>criu.Criu</code> -- a handler to the Criu object from the [[go wrappers]], through which the client may invoke the dump action
* the <code>PhaulConfig</code> object
* <code>last_client_images_path</code> -- a string denoting where the most recent dumps are; it is needed to configure the [[incremental dumps]] for this final step

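The shape of such an implementation can be sketched as follows. All types here are local stand-ins so the example is self-contained: <code>Criu</code> replaces the real <code>criu.Criu</code> handler, the exact method signature is an assumption, and <code>engineLocal</code> with its three hooks is a hypothetical structure, not something the library provides.

```go
package main

import "fmt"

// Criu is a local stand-in for the criu.Criu handler from the Go wrappers.
type Criu struct{}

// PhaulConfig mirrors the fields listed in the Configuration section.
type PhaulConfig struct {
	Pid   int
	Memfd int
	Wdir  string
}

// PhaulLocal mirrors the one-method interface described above; the exact
// parameter types are assumptions.
type PhaulLocal interface {
	DumpCopyRestore(cr *Criu, cfg PhaulConfig, lastClientImagesPath string) error
}

// engineLocal is a hypothetical implementation. Each engine (OpenVZ, Docker,
// LXC, ...) would plug in its own way of dumping, copying and restoring.
type engineLocal struct {
	dump       func(cr *Criu, cfg PhaulConfig, parentImages string) (string, error)
	copyImages func(imagesDir string) error
	restore    func(imagesDir string) error
}

// DumpCopyRestore runs the three final steps in order and stops at the
// first failure.
func (e engineLocal) DumpCopyRestore(cr *Criu, cfg PhaulConfig, lastClientImagesPath string) error {
	// The previous pre-dump directory acts as the parent of the final dump.
	dir, err := e.dump(cr, cfg, lastClientImagesPath)
	if err != nil {
		return fmt.Errorf("final dump: %w", err)
	}
	if err := e.copyImages(dir); err != nil {
		return fmt.Errorf("copy images: %w", err)
	}
	if err := e.restore(dir); err != nil {
		return fmt.Errorf("restore: %w", err)
	}
	return nil
}

// demo wires the hooks to functions that only record the order of the steps.
func demo() ([]string, error) {
	var steps []string
	var local PhaulLocal = engineLocal{
		dump: func(_ *Criu, cfg PhaulConfig, _ string) (string, error) {
			steps = append(steps, "dump")
			return cfg.Wdir + "/final", nil
		},
		copyImages: func(string) error { steps = append(steps, "copy"); return nil },
		restore:    func(string) error { steps = append(steps, "restore"); return nil },
	}
	cfg := PhaulConfig{Pid: 1234, Memfd: 5, Wdir: "/tmp/phaul-wdir"}
	err := local.DumpCopyRestore(&Criu{}, cfg, "/tmp/phaul-wdir/pre-dump-2")
	return steps, err
}

func main() {
	steps, err := demo()
	if err != nil {
		fmt.Println("migration step failed:", err)
		return
	}
	fmt.Println("steps run:", steps)
}
```

Keeping the three steps as separate hooks is only one possible design; a real caller might equally write one method per engine.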
Next comes the <code>PhaulRemote</code> interface, a set of methods that the client wants to call on the server object. It is up to the caller to provide the RPC mechanism for this. In the phaul test, for example, the server handler is passed as this argument as is.

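The in-process case mentioned above, where the server handler itself serves as the remote, might look like the sketch below. The method names <code>StartIter</code> and <code>StopIter</code> are assumptions made for illustration, and <code>server</code> and <code>runIterations</code> are hypothetical stand-ins; a real deployment would put an RPC layer between the two nodes.

```go
package main

import (
	"errors"
	"fmt"
)

// PhaulRemote is a local stand-in for the interface described above.
// The method names are assumptions made for this sketch.
type PhaulRemote interface {
	StartIter() error
	StopIter() error
}

// server is a hypothetical stand-in for the destination-side handler.
type server struct {
	running bool
	iters   int
}

func (s *server) StartIter() error {
	if s.running {
		return errors.New("iteration already running")
	}
	s.running = true
	return nil
}

func (s *server) StopIter() error {
	if !s.running {
		return errors.New("no iteration running")
	}
	s.running = false
	s.iters++
	return nil
}

// runIterations drives the remote through n start/stop rounds, the way a
// client might do around each pre-dump.
func runIterations(r PhaulRemote, n int) error {
	for i := 0; i < n; i++ {
		if err := r.StartIter(); err != nil {
			return fmt.Errorf("start iteration %d: %w", i, err)
		}
		if err := r.StopIter(); err != nil {
			return fmt.Errorf("stop iteration %d: %w", i, err)
		}
	}
	return nil
}

func main() {
	srv := &server{}
	// In-process case: the server handler is passed directly as the remote.
	if err := runIterations(srv, 3); err != nil {
		fmt.Println("remote call failed:", err)
		return
	}
	fmt.Println("completed iterations:", srv.iters)
}
```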
The last argument is the already familiar <code>PhaulConfig</code> object.

After these preparations, call <code>client.Migrate()</code>.

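Putting the pieces of this section together, the call order can be sketched as below. Everything is declared locally so the example runs on its own: the functions reuse the names from the text but are simplified stand-ins, not the library code, and the signatures, the <code>StartIter</code>/<code>StopIter</code> method names and the single-round <code>Migrate</code> body are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the library types named in the text.
type Criu struct{}

type PhaulConfig struct {
	Pid   int
	Memfd int
	Wdir  string
}

type PhaulLocal interface {
	DumpCopyRestore(cr *Criu, cfg PhaulConfig, lastClientImagesPath string) error
}

type PhaulRemote interface {
	StartIter() error // method names are assumptions
	StopIter() error
}

type PhaulServer struct{ cfg PhaulConfig }

func (s *PhaulServer) StartIter() error { return nil }
func (s *PhaulServer) StopIter() error  { return nil }

type PhaulClient struct {
	local  PhaulLocal
	remote PhaulRemote
	cfg    PhaulConfig
}

// MakePhaulServer is a stand-in: it only checks the config and wraps it.
func MakePhaulServer(cfg PhaulConfig) (*PhaulServer, error) {
	if cfg.Wdir == "" {
		return nil, errors.New("server needs a working directory")
	}
	return &PhaulServer{cfg: cfg}, nil
}

// MakePhaulClient is a stand-in taking the three arguments described above.
func MakePhaulClient(l PhaulLocal, r PhaulRemote, cfg PhaulConfig) (*PhaulClient, error) {
	if l == nil || r == nil {
		return nil, errors.New("client needs both a local and a remote")
	}
	return &PhaulClient{local: l, remote: r, cfg: cfg}, nil
}

// Migrate here runs one placeholder round and then the final step.
func (c *PhaulClient) Migrate() error {
	if err := c.remote.StartIter(); err != nil {
		return err
	}
	if err := c.remote.StopIter(); err != nil {
		return err
	}
	return c.local.DumpCopyRestore(&Criu{}, c.cfg, c.cfg.Wdir)
}

// printLocal is a hypothetical PhaulLocal that only reports what it would do.
type printLocal struct{}

func (printLocal) DumpCopyRestore(_ *Criu, cfg PhaulConfig, last string) error {
	fmt.Printf("dump pid %d, copy images, restore (parent images: %s)\n", cfg.Pid, last)
	return nil
}

func run() error {
	// Destination side.
	srv, err := MakePhaulServer(PhaulConfig{Wdir: "/tmp/phaul-dst"})
	if err != nil {
		return err
	}
	// Source side; the server handler doubles as the remote (no RPC).
	cln, err := MakePhaulClient(printLocal{}, srv,
		PhaulConfig{Pid: 1234, Memfd: 5, Wdir: "/tmp/phaul-src"})
	if err != nil {
		return err
	}
	return cln.Migrate()
}

func main() {
	if err := run(); err != nil {
		fmt.Println("migration failed:", err)
	}
}
```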
== Further development plans ==

Right now phaul implements [[iterative migration]]: it performs several pre-dumps, then tells the caller to do the final dump-copy-restore steps. Note that it is up to the caller to copy the images generated by the last criu call to the destination node.

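The iterative scheme described above can be sketched as a simple loop. This is not the library's code: the <code>migrate</code> function, the callback types, the directory naming and the two thresholds are all assumptions chosen for illustration.

```go
package main

import "fmt"

// preDumpFunc performs one pre-dump into dir, using parent as the previous
// images directory (empty on the first round), and reports how many memory
// pages it wrote.
type preDumpFunc func(dir, parent string) (pagesWritten int, err error)

// finalFunc stands for the caller's dump-copy-restore step.
type finalFunc func(lastImagesPath string) error

const (
	maxIters = 5  // assumed cap on pre-dump rounds
	minPages = 64 // assumed "little enough left to freeze" threshold
)

// migrate pre-dumps until a round writes fewer than minPages pages or the
// cap is reached, then hands the last images directory to the final step.
// It returns the number of pre-dump rounds performed.
func migrate(wdir string, preDump preDumpFunc, final finalFunc) (int, error) {
	parent := ""
	iters := 0
	for iters < maxIters {
		dir := fmt.Sprintf("%s/pre-dump-%d", wdir, iters)
		pages, err := preDump(dir, parent)
		if err != nil {
			return iters, fmt.Errorf("pre-dump %d: %w", iters, err)
		}
		iters++
		parent = dir
		if pages < minPages {
			break
		}
	}
	if err := final(parent); err != nil {
		return iters, fmt.Errorf("final dump-copy-restore: %w", err)
	}
	return iters, nil
}

func main() {
	// Simulated workload: every round leaves a quarter as many dirty pages.
	dirty := 4096
	iters, err := migrate("/tmp/phaul-wdir",
		func(dir, parent string) (int, error) {
			written := dirty
			dirty /= 4
			return written, nil
		},
		func(last string) error {
			fmt.Println("final step uses parent images in", last)
			return nil
		})
	if err != nil {
		fmt.Println("migration failed:", err)
		return
	}
	fmt.Println("pre-dump rounds:", iters)
}
```

The loop only decides when to stop pre-dumping; copying the final images to the destination still happens inside the caller's final step.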
To improve on the above we want to:

* Add [[lazy migration]] support
* Add [[Image_cache/proxy_TODO|automatic images transfer]]
* Add API for FS migration (if necessary)
* Fix [[Py-P.Haul]] to use this library as a core

== Git ==
https://github.com/checkpoint-restore/go-criu

[[Category: Live migration]]
[[Category: New features]]

Latest revision as of 18:51, 4 November 2018
