Comparison to other CR projects
Latest revision as of 19:20, 9 December 2015
This page tries to explain the differences between CRIU and other C/R solutions.
DMTCP
DMTCP implements checkpoint/restore of a process at the library level. This means that to C/R an application, you must launch it with the DMTCP library (dynamically) linked from the very beginning. When launched this way, the DMTCP library intercepts a number of library calls from the application, builds a shadow database of information about the process's internals, and then forwards each request down to glibc/kernel. The gathered information is later used to create an image of the application. With this approach, one can only dump applications known to run successfully with the DMTCP libraries, and DMTCP does not provide proxies for all kernel APIs (for example, inotify is known to be unsupported). Another implication of this approach is a potential performance cost due to the proxying of requests.
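The intercept, record, and forward pattern described above can be illustrated with a minimal Python sketch. This is not DMTCP code: the class `ShadowFdTable` and its methods are hypothetical names invented for illustration, and only file descriptors are tracked.

```python
import os


class ShadowFdTable:
    """Hypothetical wrapper layer: records each descriptor, then forwards the call."""

    def __init__(self):
        # fd -> (path, flags): the "shadow database" a checkpoint would read.
        self.shadow = {}

    def open(self, path, flags, mode=0o600):
        fd = os.open(path, flags, mode)  # forward the request to the real call
        self.shadow[fd] = (path, flags)  # remember what a restore would need
        return fd

    def close(self, fd):
        os.close(fd)
        self.shadow.pop(fd, None)

    def snapshot(self):
        """Return the recorded state plus the current offset of each descriptor."""
        return {
            fd: {"path": path, "flags": flags,
                 "offset": os.lseek(fd, 0, os.SEEK_CUR)}
            for fd, (path, flags) in self.shadow.items()
        }
```

A descriptor opened by calling `os.open` directly, bypassing the wrapper, never appears in `snapshot()`. That is the sketch's analogue of a kernel API for which the library provides no proxy.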
Restoring a set of processes is also tricky, as it frequently requires restoring an object with a predefined ID, and the kernel provides no API to do so for several kinds of objects. For example, the kernel cannot fork a process with a desired PID. To address that, DMTCP fools a process by intercepting the getpid() library call and returning a fake PID value to the application. Such behavior is dangerous, as the application might see the wrong files in the /proc filesystem if it tries to access them via its PID.
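The /proc hazard can be made concrete with a small Python sketch. It assumes a Linux system with /proc mounted; the function names `is_consistent` and `check_current_process` are made up for this example. It compares the PID reported to the application with the entry that /proc/self resolves to, which is the PID as the kernel sees it.

```python
import os


def is_consistent(reported_pid, proc_self_target):
    """True if the PID the application was told matches the /proc/self target."""
    return str(reported_pid) == proc_self_target


def check_current_process():
    """Compare getpid() with /proc/self; return None if /proc is unavailable."""
    try:
        target = os.readlink("/proc/self")  # e.g. "12345" on Linux
    except OSError:
        return None
    return is_consistent(os.getpid(), target)
```

If a wrapper library returned a virtual PID from getpid(), the two values would differ, and a path such as /proc/<pid>/maps built from the reported PID could name a different process or none at all.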
CRIU, on the other hand, doesn't require any libraries to be pre-loaded. It can checkpoint and restore an arbitrary application, as long as the kernel provides all the needed facilities. Kernel support for some CRIU features was added only recently, which means that a recent kernel version might be required.
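As a rough illustration of the kernel-version requirement, the following Python sketch compares a kernel release string against a minimum version. The default of 3.11 is the figure given in the comparison table on this page; the helper `kernel_at_least` is a hypothetical name, and a version comparison alone does not prove that every needed facility is enabled in a given kernel build.

```python
import platform
import re


def kernel_at_least(release, minimum=(3, 11)):
    """Return True if a release string such as '5.15.0-91-generic' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    # platform.release() returns the running kernel's release string on Linux.
    print(kernel_at_least(platform.release()))
```

CRIU also ships a `criu check` command that probes the running kernel directly, which is a better indicator than the version number.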
BLCR
Berkeley Lab Checkpoint/Restart (BLCR) is a part of the Scalable Systems Software Suite, developed by the Future Technologies Group at Lawrence Berkeley National Lab under SciDAC funding from the United States Department of Energy. It is an open-source, system-level checkpointer designed with High Performance Computing (HPC) applications in mind, in particular CPU- and memory-intensive batch-scheduled MPI jobs. BLCR is implemented as a GPL-licensed loadable kernel module for Linux 2.4.x and 2.6.x kernels on the x86, x86_64, PPC/PPC64, and ARM architectures, plus a small LGPL-licensed library.
PinLIT / PinPlay
PinLIT (Pin-Long Instruction Trace) is a checkpointing tool built on top of Intel's proprietary Pin binary instrumentation tool; it is described on page 48 of Cristiano Pereira's PhD thesis. It records the processor's (big) architectural register state and all pages of memory that contain application and shared library code, reducing checkpoint size by storing only the memory used during a desired interval.
PinPlay, or the Program Record/Replay Toolkit, appears to be the successor of, or a new name for, PinLIT.
Both tools appear primarily focused on reducing benchmark runtime on slow computer architecture simulators, leveraging sampling algorithms such as SimPoint.
OpenVZ (in-kernel)
Legacy OpenVZ (RHEL4-, RHEL5-, and RHEL6-based kernels) has in-kernel checkpoint/restore; the sources can be found in kernel/cpt/.
CKPT (in-kernel)
(In-kernel) Linux Checkpoint/Restart was a project from around 2008 to around 2010 to implement checkpoint/restart of Linux processes.
CRIU, DMTCP, BLCR, OpenVZ comparison table
“Looks/seems like yes/no”: I found only unverified message(s) saying “yes”/“no”.
“Not yet”: it is officially planned, or I found no reason why it can’t be done.
Feature | CRIU | DMTCP | BLCR | OpenVZ |
---|---|---|---|---|
Arch | x86_64, ARM, AArch64, PPC64le | x86, x86_64, ARM | x86, x86_64, PPC/PPC64, ARM | x86, x86_64 |
OS | Linux | Linux | Linux | Linux |
Uses standard kernel? | Yes, provided it's 3.11 or later | Yes | Yes, just needs to load module | No. OpenVZ kernel is required |
Can be used without preloading special libraries before app start? | Yes | No | No | Yes |
Can be used as non-root user? | Yes, but a user can only manipulate their own tasks | Yes | Yes | No |
Can run unmodified programs? | Yes | Yes | No. Statically linked and/or threaded apps are unsupported. | Yes |
Can run unprepared tasks? | Yes | No. It preloads the DMTCP library. That library runs before the main() routine and creates a second thread, the checkpoint thread. The checkpoint thread then creates a socket to the DMTCP coordinator and registers itself. The checkpoint thread also creates a signal handler. | No. CR shall notify processes when a checkpoint is about to occur (before the kernel takes the checkpoint) to allow the processes to prepare themselves accordingly. | Yes |
Retains behavior of the c/r-ed programs? | Yes (but see What can change after C/R) | No, because of wrappers on system calls | No, because of wrappers on system calls | Yes |
Live migration | Yes, even if kernel, libs, etc. are newer. Can use memory changes tracking to decrease freeze time | Yes, if both kernels are recent | Yes, but only if all components are the same. If even the prelinked addresses differ, it will not restore; however, it can save all used libraries and localization files to restore the program on a different machine | Yes |
Containers | Yes, LXC and OpenVZ containers | No. It doesn't support namespaces, so it probably can’t dump containers | Looks like no | Yes |
Parallel/distributed computation libraries | No (planned) | Yes. OpenMPI, MPICH2, OpenMP, Cilk are already supported and Infiniband is in progress | Yes. Cray MPI, Intel MPI, LAM/MPI, MPICH-V, MPICH2, MVAPICH, Open MPI, SGI MPT | Yes |
Possible to C/R gdb together with the debugged app? | No, because they are using the same interface | Yes | No | Yes |
X Window apps (KDE, GNOME, etc.) | Yes, via VNC | Yes, via VNC | Looks like no | Yes, via VNC |
Ways to invoke from custom software | Yes, RPC and C API | Yes, plugins and API | Not yet | Yes, via ioctl calls |
Unix sockets | Yes | Yes | No | Yes |
UDP sockets | Yes, both IPv4 and IPv6 | Not yet. The DMTCP developers have had no requests for this | Not yet | Yes |
TCP sockets | Yes | Yes | Not yet | Yes |
Established TCP connection | Yes | No, but you can write a simple DMTCP plugin that tells DMTCP how you want to reconnect on restart | No | Yes |
Infiniband | No | Not yet, development is about halfway done | No | No |
Multithread support | Yes | Yes | Yes | Yes |
Multiprocess | Yes | Yes | Yes | Yes |
Process groups and sessions | Yes | Yes | Not yet | Yes |
Zombies | Yes | No | No | Yes |
Namespaces | Yes | No | No | Yes |
Ptraced programs | No | Yes | No | Yes |
System V IPC | Yes | Yes | No | Yes |
Memory mappings | Yes, all kinds | Yes | Partial | Yes |
Pipes | Yes | Yes | Not yet | Yes |
Terminals | Yes, but only Unix98 PTYs | Yes | Yes | Yes |
Non-POSIX files (inotify, signalfd, eventfd, etc.) | Yes, inotify, fanotify, epoll, signalfd, eventfd | Yes, epoll, eventfd, signalfd are already supported and inotify will be supported in the future | Looks like no | Yes |
Timers | Yes | No. Any counter or timer active since the beginning of a process will consider the restarted process to be a new process. | Yes | Yes |
Shared resources (files, mm, etc.) | Yes. SysVIPC, files, fd table and memory | Yes. System V shared memory (shmget, etc.), mmap-based shared memory, shared sockets, pipes, file descriptors | No, but support for shared mmap regions is planned | Yes |
Block devices | No | Looks like yes | No | No |
Character devices | Yes, only /dev/null, /dev/zero, etc. are supported | Yes, looks like null and zero are supported | Yes, /dev/null and /dev/zero | Yes |
Capture the contents of open files | Yes, if the file is unlinked | Looks like no | Not yet | Yes |
Sources
DMTCP:
- http://dmtcp.sourceforge.net/
- http://dmtcp.sourceforge.net/papers/dmtcp.pdf
- http://www.ccs.neu.edu/home/gene/papers/ccgrid06.pdf
- http://research.cs.wisc.edu/htcondor/CondorWeek2010/condor-presentations/cooperman-dmtcp.pdf
- http://dmtcp.sourceforge.net/papers/mtcp.pdf
BLCR:
- https://upc-bugs.lbl.gov/blcr/doc/html/
- https://ftg.lbl.gov/assets/projects/CheckpointRestart/Pubs/LBNL-49659.pdf
- https://ftg.lbl.gov/assets/projects/CheckpointRestart/Pubs/blcr.pdf
- https://ftg.lbl.gov/assets/projects/CheckpointRestart/Pubs/checkpointSurvey-020724b.pdf
- https://ftg.lbl.gov/assets/projects/CheckpointRestart/Pubs/lacsi-2003.pdf
- https://ftg.lbl.gov/assets/projects/CheckpointRestart/Pubs/LBNL-60520.pdf