Line 22: |
Line 22: |
| |- | | |- |
| | crtools || New images format || medium (v2) || - || See [[what's bad with V1 images]] | | | crtools || New images format || medium (v2) || - || See [[what's bad with V1 images]] |
− | |-
| |
− | | crtools || Make [[RPC]] "swrk" mode public || medium || xemul || There's an "swrk" action in CRIU [[Usage|CLI API]]. This turns CRIU into service worker accepting RPC commands. This mode is not documented. Need to standartify one and make public.
| |
| |- | | |- |
| | kernel/crtools || Tune the start-time of tasks || medium || - || When we restore tasks their start-time goes forward (since we create the new task effectively). Need to address this somehow, most likely with the [[time namespace]]. | | | kernel/crtools || Tune the start-time of tasks || medium || - || When we restore tasks their start-time goes forward (since we create the new task effectively). Need to address this somehow, most likely with the [[time namespace]]. |
Line 34: |
Line 32: |
| |- | | |- |
| | kernel || Make pipes swappable || hard || - || When [[Memory dumping and restoring|pre-dumping]] memory we pull all the task's memory into pipe with vmsplice and then send it via network splicing the pages into socket. During this period all the memory is effectively pinned as pages in pipe are not swappable. | | | kernel || Make pipes swappable || hard || - || When [[Memory dumping and restoring|pre-dumping]] memory we pull all the task's memory into pipe with vmsplice and then send it via network splicing the pages into socket. During this period all the memory is effectively pinned as pages in pipe are not swappable. |
− | |-
| |
− | | crtools || Deduplication for shmem [[memory dumps]] || easy || - || We have dedup action and --auto-dedup option for dump/restore which only works for pid pagemaps. Need the same for shmem.
| |
| |- | | |- |
| | kernel/crtools || Adjust per-task/-container timers offsets || medium || - || Absolute timers differ on different nodes. When live migrating a task/container this difference may (and will) screw the timers up. | | | kernel/crtools || Adjust per-task/-container timers offsets || medium || - || Absolute timers differ on different nodes. When live migrating a task/container this difference may (and will) screw the timers up. |
Line 42: |
Line 38: |
| |- | | |- |
| | crtools || Show what was left in the system after dump || easy || - || When we use [[Invisible files|--link-remap]] option or [[TCP connection|--tcp-established]] one CRIU leaves some traces in the system, in particular -- temporary hard links in the former case and iptables rules in the latter. Need some way to show these to the user. | | | crtools || Show what was left in the system after dump || easy || - || When we use [[Invisible files|--link-remap]] option or [[TCP connection|--tcp-established]] one CRIU leaves some traces in the system, in particular -- temporary hard links in the former case and iptables rules in the latter. Need some way to show these to the user. |
− | |-
| |
− | | crtools || Leases support || easy || - || The F_SETLEASE/F_GETLEASE API is not currently supported, but doesn't differ much from regular locks.
| |
| |- | | |- |
| | kernel/crtools || Put call to mmap into VDSO || easy || Cyrill || To put the [[parasite code]] into target process we modify its code to call the <code>mmap()</code> system call (and the unmodify it back) and put the parasite into new area. Oleg Nesterov suggests not to patch victim, but to always have one on VDSO. | | | kernel/crtools || Put call to mmap into VDSO || easy || Cyrill || To put the [[parasite code]] into target process we modify its code to call the <code>mmap()</code> system call (and the unmodify it back) and put the parasite into new area. Oleg Nesterov suggests not to patch victim, but to always have one on VDSO. |
Line 52: |
Line 46: |
| |- | | |- |
| | crtools || Rollback tree state || medium || - || When we checkpointed process tree with -R option (let them run after checkpoint) we might want to return the tasks into checkpointed state on the same machine. Currently this can only be done by killing the processes and restoring them from scratch. If we could ask CRIU to restore the images ''into'' the ready processes that could speed things up, especially if carefully caring about [[memory changes tracking]]. | | | crtools || Rollback tree state || medium || - || When we checkpointed process tree with -R option (let them run after checkpoint) we might want to return the tasks into checkpointed state on the same machine. Currently this can only be done by killing the processes and restoring them from scratch. If we could ask CRIU to restore the images ''into'' the ready processes that could speed things up, especially if carefully caring about [[memory changes tracking]]. |
− | |-
| |
− | | crtools || AIO with pending events || medium || - || When we dump AIO ring we check it not to contain events inside and abort the dump otherwise. Need to dump events too and put them back on restore.
| |
| |- | | |- |
| | crtools || Restore arbitrary mountpoints tree || hard || - || Linux kernel can construct tricky knows with [[mount points]]. We don't support arbitrary configuration of such things, only those that are in active use by software. Need to fix them up. | | | crtools || Restore arbitrary mountpoints tree || hard || - || Linux kernel can construct tricky knows with [[mount points]]. We don't support arbitrary configuration of such things, only those that are in active use by software. Need to fix them up. |
Line 60: |
Line 52: |
| |- | | |- |
| | crtools || [[Lazy migration]] using [[userfaultfd]] || medium || xemul || Lazy migration is when we move all the tasks on another node, but leave theirs memory on the source one. Not to allow tasks read garbage from empty address space we protect all of it as inaccessible. When tasks start reading/writing the mem they got page-fault-ed. With the userfaultfd technology it can be possible to intercept the #PF, pull the page from source node and map it into expected address. | | | crtools || [[Lazy migration]] using [[userfaultfd]] || medium || xemul || Lazy migration is when we move all the tasks on another node, but leave theirs memory on the source one. Not to allow tasks read garbage from empty address space we protect all of it as inaccessible. When tasks start reading/writing the mem they got page-fault-ed. With the userfaultfd technology it can be possible to intercept the #PF, pull the page from source node and map it into expected address. |
− | |-
| |
− | | crtools || Dump tasks from cgroup || easy || - || Currently criu dumps a subtree from given pid. It makes sense to request CRIU to dump a set of tasks from given cgroup.
| |
| |- | | |- |
| | crtools || Speed up [[logging]] || medium || Cyrill || Synchronous formatting and writes into log files slow things down. On the other hand turning logs off make it impossible to troubleshoot. | | | crtools || Speed up [[logging]] || medium || Cyrill || Synchronous formatting and writes into log files slow things down. On the other hand turning logs off make it impossible to troubleshoot. |
| |- | | |- |
| | crtools || Sanitize [[logging]] messages || hard || - || Currently log messages are printed w/o any logic, it's hard to analize what has happened when CRIU fails. Need to improve that by, e.g. categorizing images and [[When C/R fails|explaining them]] in more details. | | | crtools || Sanitize [[logging]] messages || hard || - || Currently log messages are printed w/o any logic, it's hard to analize what has happened when CRIU fails. Need to improve that by, e.g. categorizing images and [[When C/R fails|explaining them]] in more details. |
− | |-
| |
− | | crtools || Support OFD posix locks || easy || - || These are still rarely used, but exist. Might make sense to support them in advance, it looks like kernel API allows for that.
| |
| |- | | |- |
| | crtools || Optimize kcmp calls || medium || - || CRIU build [[kcmp trees]] to find out IDs of such objects as MM, FDT and others. Currently we kcmp all tasks to get the ID, but we can improve that by pre-generating ID based on objects that live on MM, FS, etc. If pre-ID of two tasks matches, then we call kcmp, if not -- objects ''are'' different. | | | crtools || Optimize kcmp calls || medium || - || CRIU build [[kcmp trees]] to find out IDs of such objects as MM, FDT and others. Currently we kcmp all tasks to get the ID, but we can improve that by pre-generating ID based on objects that live on MM, FS, etc. If pre-ID of two tasks matches, then we call kcmp, if not -- objects ''are'' different. |
− | |-
| |
− | | crtools || Shmem changes tracking || medium || - || Memory changes tracker works on anonymous private mappings only. Anonymous shmem is not in active use by server applications, so we don't support one currently. Supporting it should be done by tracking the changes from all the tasks that ''could'' write into the segment. For anon shared memory and sysvipc segment inside IPC namespace this works reliably.
| |
| |- | | |- |
| | crtools || Page transfer filters || medium || - || The page-xfer engine just splices the pages from stealing pipes into socket. Packing or encrypting the data would be nice. Maybe it's purely for [[P.Haul]]? | | | crtools || Page transfer filters || medium || - || The page-xfer engine just splices the pages from stealing pipes into socket. Packing or encrypting the data would be nice. Maybe it's purely for [[P.Haul]]? |
| |- | | |- |
| | crtools || [[FUSE]] mount points || hard || - || When dumping mountpoints we explicitly check the filesystem mounted. The thing is -- not all filesystems can be just ignored on dump. E.g. FUSE mount involves a user-space daemon that is responsible for the files tree contents. If we just kill one on dump we might not be able to restore it. Need to special-care one. | | | crtools || [[FUSE]] mount points || hard || - || When dumping mountpoints we explicitly check the filesystem mounted. The thing is -- not all filesystems can be just ignored on dump. E.g. FUSE mount involves a user-space daemon that is responsible for the files tree contents. If we just kill one on dump we might not be able to restore it. Need to special-care one. |
− | |-
| |
− | | crtools || 32-bit tasks || hard || Cyrill || For x86 we only dump and restore 64-bit tasks. Doing 32-bit should also be done, but keep in mind, that not only 64-bit tree OR 32-bit tree should be supported. There can be mixed 64-and-32-bit trees out there and CRIU should support those too.
| |
| |- | | |- |
| | crtools || Modify restored resources run-time in [[CRIT]] daemon || medium || - || Sometimes it might make sense to tune the objects from images on restore. E.g. change the IP address of sockets from task above or fix file paths to be "chroot-ed". The best solution seems to be in launching CRIT in daemon mode, telling it what images and how to modify and teaching CRIU to "filter" the pb objects read from images through this daemon. | | | crtools || Modify restored resources run-time in [[CRIT]] daemon || medium || - || Sometimes it might make sense to tune the objects from images on restore. E.g. change the IP address of sockets from task above or fix file paths to be "chroot-ed". The best solution seems to be in launching CRIT in daemon mode, telling it what images and how to modify and teaching CRIU to "filter" the pb objects read from images through this daemon. |