Changes

899 bytes added, 22:49, 11 September 2018

Another crazy idea (dima)

Line 1: Line 1:

−

This is a set of ideas how criu can be used

+

This is a set of ideas for how CRIU can be used.

−

== Container [[live migration]] ==

+

== Container live migration ==

−

This is the use case from what the whole checkpoint/restore project appeared. Container is checkpointed, then the image is copied on another box, then restored. From the remote observer point of view the container is just frozen for a while. ~~You can find~~ more ~~details on this scenario~~ [[~~LXC | here~~]]

+

This is the use case from which the whole checkpoint/restore project grew. A container is checkpointed, its image is copied to another box, and the container is restored there. From a remote observer's point of view, the container is just frozen for a while.

+

''For more info, see [[:Category:live migration]].''
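
The checkpoint, copy, restore sequence described in this section can be sketched as three command lines. This is a minimal illustration, not a complete migration tool: it only builds the argument lists and runs nothing. The helper name <code>migration_commands</code>, the host name and the paths are hypothetical, and using <code>rsync</code> for the copy step is an assumption. The <code>criu</code> options shown (<code>--tree</code>, <code>--images-dir</code>, <code>--tcp-established</code>, <code>--restore-detached</code>) are regular CRIU command-line options.

```python
import shlex


def migration_commands(pid, image_dir, dest_host):
    """Build the three steps of a checkpoint/copy/restore migration.

    Nothing is executed here; the caller decides how to run each step.
    """
    image_dir = image_dir.rstrip("/")
    # Step 1 (source box): checkpoint the process tree rooted at `pid`.
    dump = ["criu", "dump", "--tree", str(pid),
            "--images-dir", image_dir, "--tcp-established"]
    # Step 2 (source box): copy the image directory to the other box.
    # rsync over ssh is an assumption; any file transfer would do.
    copy = ["rsync", "-a", image_dir + "/", f"{dest_host}:{image_dir}/"]
    # Step 3 (destination box): restore from the copied images.
    restore = ["criu", "restore", "--images-dir", image_dir,
               "--tcp-established", "--restore-detached"]
    return [dump, copy, restore]


if __name__ == "__main__":
    # Hypothetical PID, path and host, for illustration only.
    for step in migration_commands(1234, "/var/lib/criu/ct1", "box2.example"):
        print(shlex.join(step))
```

A real container migration would typically need more than this (for example the container's file system must be available on the destination box), so treat the three lists as the skeleton of the scenario only.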

== Slow-boot services speed up ==

Line 11: Line 13:

A rough preliminary measurement shows that the VNC server + Eclipse start time drops from ~29 seconds to ~1.5 seconds.

−

== ~~Reboot-less~~ kernel upgrade ==

+

''Main article: [[slow-boot services speed up]].''

+

== Seamless kernel upgrade ==

When replacing the kernel on a box, we can do it without stopping critical activity: checkpoint the critical services, replace the kernel (e.g. using kexec), then restore the services. In a perfect world the applications' memory shouldn't be written to a disk image, but should rather be kept in RAM.

+

''Main article: [[Seamless kernel upgrade]].''

== Networking load balancing ==

−

Not the whole project, but the [[~~TCP_connection~~|TCP repair]] can be used to offload an app-level request handling on another box.

+

Not the whole project, but [[TCP connection|TCP repair]] alone can be used to offload app-level request handling to another box.

== HPC issues ==

Line 30: Line 36:

Suspending a screen session and restoring it on another box might be interesting.

Suspending some X app (browser?) and restoring it later is also worth thinking about, but requires knowledge of the X protocol.

+

''Main article: [[X applications]].''

== Processes duplication ==

Line 42: Line 50:

With CRIU one can save a series of an app's states (all but the first incremental) and later revert to any of them. The "apply-images" item from the TODO list should help to revert the state faster, especially if the memory changes tracker state is available.

+

One example of where such a snapshot might be useful is debugging: one might need to bring an application into a "desired" state quickly, and having a dump taken at that state would speed things up.

+

''Main article: [[Incremental dumps]].''
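
A series of states in which all but the first are incremental could be produced roughly as follows. This is a sketch under assumptions: the helper name <code>snapshot_commands</code> and the numbered directory layout are hypothetical, and only argument lists are built. It relies on CRIU's <code>--track-mem</code> and <code>--prev-images-dir</code> options (the previous directory is given relative to the current image directory) and on <code>--leave-running</code> so that the app keeps running after each snapshot.

```python
import os


def snapshot_commands(pid, base_dir, count):
    """Build `criu dump` argument lists for `count` successive snapshots.

    Snapshot N is stored in <base_dir>/<N>; every snapshot after the
    first refers to its predecessor, so only changed memory is written.
    """
    commands = []
    for n in range(count):
        image_dir = os.path.join(base_dir, str(n))
        cmd = ["criu", "dump", "--tree", str(pid),
               "--images-dir", image_dir,
               "--leave-running",  # keep the app alive after dumping
               "--track-mem"]      # turn on memory changes tracking
        if n > 0:
            # Path to the parent snapshot, relative to image_dir.
            cmd += ["--prev-images-dir", os.path.join("..", str(n - 1))]
        commands.append(cmd)
    return commands
```

Reverting to state N would then be a <code>criu restore</code> pointed at directory N, once the still-running original has been stopped.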

== Move "forgotten" applications into "screen" ==

Line 54: Line 66:

If a service has hung but needs to be restarted quickly, it's possible to take a dump of it, restart it, and later debug why it hung using the restored copy.

+

== Fault-tolerant systems ==

+

With CRIU it's possible to periodically duplicate a process onto another box. This requires the [[applying images]] facility.

+

== Update dryrun ==

+

Before updating the kernel or system libraries, one may duplicate system services into a VM that has the updates applied and check that they continue to run OK. If this test passes, the real system update can be done.

+

== Zero downtime crash restore ==

+

Checkpoint a critical service from /proc/vmcore in the crash kernel and migrate it to another machine.

+

[[Category:Using]]

+

[[Category:Editor help needed]]

Dsafonov


Usage scenarios