Line 1: |
Line 1: |
− | This is a set of ideas how criu can be used | + | This is a set of ideas for how CRIU can be used. |
| | | |
− | == Container [[live migration]] == | + | == Container live migration == |
| | | |
− | This is the use case from what the whole checkpoint/restore project appeared. Container is checkpointed, then the image is copied on another box, then restored. From the remote observer point of view the container is just frozen for a while. You can find more details on this scenario [[LXC | here]] | + | This is the use case from which the whole checkpoint/restore project originated. A container is checkpointed, the image is copied to another box, and the container is restored there. From a remote observer's point of view, the container is just frozen for a while. |
| + | |
| + | ''For more info, see [[:Category:live migration]].'' |
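The checkpoint, copy, restore sequence described above can be sketched as three command lines. This is a minimal illustration and not the project's migration tooling: the helper names (`dump_cmd`, `restore_cmd`, `migration_plan`), the host name, and the use of `rsync` and `ssh` for the copy and remote restore steps are assumptions made for the example. It only builds the commands and does not run them, since a real container migration needs root privileges and further options that depend on the workload.

```python
import shlex
from typing import List


def dump_cmd(pid: int, images_dir: str) -> List[str]:
    """Build a `criu dump` command for the process tree rooted at pid."""
    return ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir]


def restore_cmd(images_dir: str) -> List[str]:
    """Build a `criu restore` command that reads images from images_dir."""
    return ["criu", "restore", "--images-dir", images_dir]


def migration_plan(pid: int, images_dir: str, host: str,
                   remote_dir: str) -> List[List[str]]:
    """Return the three steps: checkpoint, copy images, restore remotely.

    Assumption: images are copied with rsync and the restore is started
    over ssh; any other transport would work just as well.
    """
    copy = ["rsync", "-a", images_dir.rstrip("/") + "/",
            f"{host}:{remote_dir}"]
    remote_restore = ["ssh", host, shlex.join(restore_cmd(remote_dir))]
    return [dump_cmd(pid, images_dir), copy, remote_restore]


if __name__ == "__main__":
    # Hypothetical PID, directory, and host, for illustration only.
    for step in migration_plan(1234, "/tmp/img", "box2", "/tmp/img"):
        print(shlex.join(step))
```

Printing the plan instead of executing it keeps the sketch safe to run on a machine without CRIU installed; each list could be passed to `subprocess.run` once the options have been adapted to the real setup.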
| | | |
| == Slow-boot services speed up == | | == Slow-boot services speed up == |
Line 11: |
Line 13: |
| We have a rough preliminary measurement showing that VNC server + eclipse start time drops from ~29 seconds to ~1.5 seconds. | | We have a rough preliminary measurement showing that VNC server + eclipse start time drops from ~29 seconds to ~1.5 seconds. |
| | | |
− | == Reboot-less kernel upgrade == | + | ''Main article: [[slow-boot services speed up]].'' |
| + | |
| + | == Seamless kernel upgrade == |
| | | |
| When replacing a kernel on a box, we can do it without stopping critical activity: checkpoint it, replace the kernel (e.g. using kexec), then restore the services. In a perfect world, the applications' memory shouldn't be written to a disk image but should rather be kept in RAM. | | When replacing a kernel on a box, we can do it without stopping critical activity: checkpoint it, replace the kernel (e.g. using kexec), then restore the services. In a perfect world, the applications' memory shouldn't be written to a disk image but should rather be kept in RAM. |
| + | |
| + | ''Main article: [[Seamless kernel upgrade]].'' |
| | | |
| == Networking load balancing == | | == Networking load balancing == |
| | | |
− | Not the whole project, but the [[TCP_connection|TCP repair]] can be used to offload an app-level request handling on another box. | + | Not the whole project, but the [[TCP connection|TCP repair]] can be used to offload app-level request handling to another box. |
| | | |
| == HPC issues == | | == HPC issues == |
Line 30: |
Line 36: |
| Suspending a screen session and restoring it on another box might be interesting. | | Suspending a screen session and restoring it on another box might be interesting. |
| Suspending some X app (browser?) and restoring it later is also worth thinking about, but requires knowledge of the X protocol. | | Suspending some X app (browser?) and restoring it later is also worth thinking about, but requires knowledge of the X protocol. |
| + | |
| + | ''Main article: [[X applications]].'' |
| | | |
| == Processes duplication == | | == Processes duplication == |
Line 42: |
Line 50: |
| | | |
| With CRIU one can save a series of an app's states (all but the first being incremental) and later revert to any of them. The "apply-images" item from the TODO list should help to revert the state faster, especially if the memory changes tracker state is available. | | With CRIU one can save a series of an app's states (all but the first being incremental) and later revert to any of them. The "apply-images" item from the TODO list should help to revert the state faster, especially if the memory changes tracker state is available. |
| + | |
| + | One example where such a snapshot might be useful is debugging. One might need to bring an application into a "desired" state fast, and having a dump taken at that state would speed things up. |
| + | |
| + | ''Main article: [[Incremental dumps]].'' |
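A series of states like the one described above could be produced with commands along these lines. This is a sketch under assumptions, not a definitive recipe: the helper names (`snapshot_cmd`, `snapshot_series`) and the numbered directory layout are invented for the example, and the `--prev-images-dir` value is passed relative to the current images directory, which is the convention assumed here. The code only builds the command lines; running them requires CRIU and suitable privileges.

```python
import os
from typing import List, Optional


def snapshot_cmd(pid: int, images_dir: str,
                 prev_images_dir: Optional[str] = None) -> List[str]:
    """Build a `criu dump` command that leaves the task running.

    --track-mem turns on memory changes tracking so that a later
    snapshot can store only the pages changed since this one.
    """
    cmd = ["criu", "dump", "--tree", str(pid),
           "--images-dir", images_dir,
           "--leave-running", "--track-mem"]
    if prev_images_dir is not None:
        # Assumption: the parent snapshot is referenced by a path
        # relative to the current images directory.
        rel = os.path.relpath(prev_images_dir, start=images_dir)
        cmd += ["--prev-images-dir", rel]
    return cmd


def snapshot_series(pid: int, base_dir: str, count: int) -> List[List[str]]:
    """Commands for `count` snapshots: one full, the rest incremental."""
    cmds = []
    prev = None
    for i in range(count):
        cur = os.path.join(base_dir, str(i))  # hypothetical layout: base/0, base/1, ...
        cmds.append(snapshot_cmd(pid, cur, prev))
        prev = cur
    return cmds


if __name__ == "__main__":
    # Hypothetical PID and base directory, for illustration only.
    for cmd in snapshot_series(42, "/snap", 3):
        print(" ".join(cmd))
```

Each images directory has to exist before its command is run; the sketch leaves directory creation and error handling to the caller.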
| | | |
| == Move "forgotten" applications into "screen" == | | == Move "forgotten" applications into "screen" == |
Line 54: |
Line 66: |
| | | |
| If a service has hung but needs to be restarted quickly, it's possible to take a dump of it, restart it, and debug why it hung later, using its restored copy. | | If a service has hung but needs to be restarted quickly, it's possible to take a dump of it, restart it, and debug why it hung later, using its restored copy. |
| + | |
| + | == Fault-tolerant systems == |
| + | |
| + | With CRIU it's possible to periodically duplicate a process on another box. This requires the [[applying images]] facility. |
| + | |
| + | == Update dryrun == |
| + | |
| + | Before updating a kernel or system libraries, one may duplicate system services into a VM that has the updates applied and check that they continue to run OK. If this test passes, the real system update can be done. |
| + | |
| + | == Zero downtime crash restore == |
| + | |
| + | Checkpoint a critical service from /proc/vmcore in the crash kernel and migrate it to another machine. |
| + | |
| + | [[Category:Using]] |
| + | [[Category:Editor help needed]] |