Difference between revisions of "Google Summer of Code Ideas"
(add memory tracking with userfaultfd-WP) |
|||
Line 173: | Line 173: | ||
* Mentor: Cyrill Gorcunov <gorcunov@gmail.com> | * Mentor: Cyrill Gorcunov <gorcunov@gmail.com> | ||
* Suggested by: Cyrill Gorcunov <gorcunov@gmail.com> | * Suggested by: Cyrill Gorcunov <gorcunov@gmail.com> | ||
+ | |||
+ | === Memory changes tracking with userfaultfd-WP === | ||
+ | |||
+ | '''Summary:''' add ability to track memory changes of the snapshotted processes using userfaultfd-WP | ||
+ | |||
+ | Currently CRIU uses the [[Memory_changes_tracking|soft-dirty]] mechanism in the Linux kernel to track memory changes. | ||
+ | This mechanism can be complemented or even completely replaced with the recently proposed userfaultfd-WP. | ||
+ | |||
+ | Userfaultfd allows paging to be implemented in userspace. It allows an application to receive notifications about page faults and provide the desired memory contents for the faulting pages. In the current upstream kernels only missing page faults are supported, but there is ongoing work to allow notifications for write faults as well. Using these notifications it would be possible to precisely track memory accesses during pre-dump iterations, and this approach may prove more efficient than soft-dirty. | ||
+ | |||
+ | '''Links:''' | ||
+ | * [https://www.kernel.org/doc/html/latest/admin-guide/mm/userfaultfd.html Userfaultfd] | ||
+ | * [https://github.com/xzpeter/linux/tree/uffd-wp-merged Userfaultfd-WP] | ||
+ | * [https://www.kernel.org/doc/html/latest/admin-guide/mm/soft-dirty.html?highlight=soft%20dirty Soft-Dirty] | ||
+ | |||
+ | '''Details:''' | ||
+ | * Skill level: most probably advanced? | ||
+ | * Language: C | ||
+ | * Mentor: Mike Rapoport <rppt@linux.ibm.com> | ||
+ | * Suggested by: Mike Rapoport <rppt@linux.ibm.com> |
Revision as of 07:48, 22 January 2019
Google Summer of Code (GSoC) is a global program that offers post-secondary students an opportunity to be paid for contributing to an open source project over a three-month period.
This page contains project ideas for the upcoming Google Summer of Code.
Suggested ideas
Summary: extend post-copy memory restore and migration to support shared memory and hugetlbfs.
CRIU relies on the userfaultfd mechanism in the Linux kernel to implement demand paging in userspace and allow post-copy (or lazy) memory migration. However, this support is currently limited to anonymous private memory mappings, although the kernel also supports shared memory areas and hugetlbfs-backed memory.
Shared memory support for lazy migration can be added to CRIU without kernel modifications, while proper handling of hugetlbfs would require userfaultfd callbacks for the fallocate(FALLOC_FL_PUNCH_HOLE) and madvise(MADV_REMOVE) system calls.
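As a rough illustration of the kernel side this idea builds on, the sketch below registers one page of shared anonymous memory with userfaultfd and fills it with UFFDIO_COPY. It is not CRIU code: the function name `uffd_shared_page_demo` is hypothetical, and to stay single-threaded it populates the page before anything touches it instead of waiting for a real fault notification.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical demo: register a MAP_SHARED page for missing-page
 * tracking and supply its contents from userspace.
 * Returns the first byte of the filled page, or -1 if userfaultfd
 * is unavailable (sysctl, seccomp, old kernel) or any step fails.
 */
static int uffd_shared_page_demo(void)
{
    size_t psz = (size_t)sysconf(_SC_PAGESIZE);
    struct uffdio_api api;
    struct uffdio_register reg;
    struct uffdio_copy copy;
    char *area = MAP_FAILED, *src = MAP_FAILED;
    int ret = -1;

    int uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
    if (uffd < 0)
        return -1;

    /* Handshake: agree on the API version, request no extra features. */
    memset(&api, 0, sizeof(api));
    api.api = UFFD_API;
    if (ioctl(uffd, UFFDIO_API, &api) < 0)
        goto out;

    /* Shared anonymous memory is shmem-backed in the kernel. */
    area = mmap(NULL, psz, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    src = mmap(NULL, psz, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (area == MAP_FAILED || src == MAP_FAILED)
        goto out;
    memset(src, 'A', psz); /* stands in for page data fetched lazily */

    memset(&reg, 0, sizeof(reg));
    reg.range.start = (unsigned long)area;
    reg.range.len = psz;
    reg.mode = UFFDIO_REGISTER_MODE_MISSING;
    if (ioctl(uffd, UFFDIO_REGISTER, &reg) < 0)
        goto out;

    /* Atomically install the page contents into the registered range. */
    memset(&copy, 0, sizeof(copy));
    copy.dst = (unsigned long)area;
    copy.src = (unsigned long)src;
    copy.len = psz;
    if (ioctl(uffd, UFFDIO_COPY, &copy) < 0)
        goto out;

    ret = area[0];
out:
    if (area != MAP_FAILED)
        munmap(area, psz);
    if (src != MAP_FAILED)
        munmap(src, psz);
    close(uffd);
    return ret;
}
```

A real lazy-restore daemon would instead read `struct uffd_msg` events from the descriptor and answer each fault with UFFDIO_COPY; the registration and copy steps are the same.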
Links:
Details:
- Skill level: most probably advanced?
- Language: C
- Mentor: Mike Rapoport <rppt@linux.ibm.com>
- Suggested by: Mike Rapoport <rppt@linux.ibm.com>
Optimize logging engine
Summary: TODO: Short description of the project
TODO: Detailed description of the project.
Links:
- Wiki links to relevant material
- External links to mailing lists or web sites
Details:
- Skill level: beginner or intermediate or advanced
- Language: C
- Mentor: Andrei Vagin <avagin@gmail.com>
- Suggested by: Andrei Vagin <avagin@gmail.com>
Add support for checkpoint/restore of cgroups v2
Summary: TODO: Short description of the project
TODO: Detailed description of the project.
Links:
- https://github.com/checkpoint-restore/criu/issues/252
- Wiki links to relevant material
- External links to mailing lists or web sites
Details:
- Skill level: beginner or intermediate or advanced
- Language: C
- Suggested by: Person who suggested the idea
Add support for checkpoint/restore of CORK-ed UDP socket
Summary: Support C/R of corked UDP socket
There is a UDP_CORK option for UDP sockets. As the udp(7) man page says:
If this option is enabled, then all data output on this socket is accumulated into a single datagram that is transmitted when the option is disabled. This option should not be used in code intended to be portable.
Currently criu refuses to dump such sockets, so this is effectively a bug. Supporting them will require extending the kernel API to allow criu to read back the write queue of the socket (see how it is done for TCP sockets, for example). The queue is then written into the image and restored into the socket (with the CORK bit set too).
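To make the corked state concrete, here is a minimal sketch of an application using UDP_CORK on Linux. The helper name `corked_send_demo` and the loopback setup are assumptions made for illustration. Between the two `setsockopt()` calls the payload sits in the socket's write queue, which is the data criu would have to read back and restore.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/udp.h> /* UDP_CORK */
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical demo: send two pieces through a corked UDP socket to a
 * loopback receiver. Returns the size of the first datagram received,
 * or -1 on any error.
 */
static ssize_t corked_send_demo(void)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    struct timeval tv = { .tv_sec = 1, .tv_usec = 0 };
    char buf[64];
    ssize_t n = -1;
    int on = 1, off = 0;

    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx < 0 || tx < 0)
        goto out;

    /* Bind the receiver to an ephemeral loopback port. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;
    if (getsockname(rx, (struct sockaddr *)&addr, &alen) < 0)
        goto out;
    /* Do not block forever if the datagram never arrives. */
    if (setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        goto out;
    if (connect(tx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    /* Cork: subsequent writes accumulate in the socket's write queue. */
    if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &on, sizeof(on)) < 0)
        goto out;
    if (send(tx, "hello ", 6, 0) != 6)
        goto out;
    if (send(tx, "world", 5, 0) != 5)
        goto out;
    /* A dump at this point would have to save the queued payload. */

    /* Uncork: the accumulated data leaves as a single datagram. */
    if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &off, sizeof(off)) < 0)
        goto out;

    n = recv(rx, buf, sizeof(buf), 0);
out:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```

The receiver sees one datagram carrying both writes rather than two separate ones, which is why restoring only the CORK flag without the queued data would lose information.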
Links:
Details:
- Skill level: intermediate (+linux kernel)
- Language: C
- Mentor: Pavel Emelianov <xemul@virtuozzo.com>
- Suggested by: Pavel Emelianov <xemul@virtuozzo.com>
Optimize the pre-dump algorithm
Summary: Optimize the pre-dump algorithm to avoid pinning too much memory in RAM
The current pre-dump mode writes task memory contents into image files without stopping the task for too long. It does this by stopping the task, infecting it and draining all the memory into a set of pipes. Then the task is cured and resumed, and the pipes' contents are written into images (or sent to a page server). This approach puts heavy stress on the memory subsystem: keeping all the memory in pipes creates a lot of unreclaimable memory (pages in pipes are not swappable), and the number of pipes can itself be huge (as one pipe cannot store more than a fixed amount of data).
We can try to use the process_vm_readv() syscall to mitigate all of the above. To do this we need to allocate a temporary buffer in criu, then walk the target process's memory, copying it piece by piece into the buffer, flush the data into the image (or page server), and repeat.
Ideally there should be a sys_splice_process_vm() syscall in the kernel that does the same as process_vm_readv() but vmsplices the data.
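The buffer-based approach described above could look roughly like the following sketch. The helper name `drain_range`, the `CHUNK_SIZE` value and the use of a plain file descriptor as the image or page-server sink are all assumptions for illustration, not existing criu interfaces.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h> /* process_vm_readv */
#include <unistd.h>

#define CHUNK_SIZE 4096 /* hypothetical size of the temporary buffer */

/*
 * Hypothetical helper: copy len bytes starting at addr in process pid
 * to out_fd, using one bounded buffer instead of a set of pipes.
 * Returns the number of bytes copied, or -1 on error.
 */
static ssize_t drain_range(pid_t pid, uintptr_t addr, size_t len, int out_fd)
{
    char buf[CHUNK_SIZE];
    size_t done = 0;

    while (done < len) {
        size_t want = len - done < CHUNK_SIZE ? len - done : CHUNK_SIZE;
        struct iovec local = { .iov_base = buf, .iov_len = want };
        struct iovec remote = { .iov_base = (void *)(addr + done),
                                .iov_len = want };

        /* Copy one piece of the target's memory into our buffer. */
        ssize_t got = process_vm_readv(pid, &local, 1, &remote, 1, 0);
        if (got <= 0)
            return -1;

        /* Flush the piece to the sink before reading the next one. */
        ssize_t off = 0;
        while (off < got) {
            ssize_t w = write(out_fd, buf + off, (size_t)(got - off));
            if (w < 0)
                return -1;
            off += w;
        }
        done += (size_t)got;
    }
    return (ssize_t)done;
}
```

Only `CHUNK_SIZE` bytes are held at any time, at the cost of one extra copy per chunk. That extra copy is what the proposed splice-style syscall would avoid.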
Links:
- https://github.com/checkpoint-restore/criu/issues/351
- Memory dumping and restoring, Memory changes tracking
- process_vm_readv(2) vmsplice(2) RFC for splice_process_vm syscall
Details:
- Skill level: advanced
- Language: C
- Mentor: Pavel Emelianov <xemul@virtuozzo.com>
- Suggested by: Pavel Emelianov <xemul@virtuozzo.com>
Anonymize image files
Summary: Teach CRIT to remove sensitive information from images
When reporting a bug, it may not be acceptable for the reporter to send us raw images, as they may contain sensitive data. We need to teach CRIT to "anonymise" images for publication.
List of data to shred:
- Memory contents. For the sake of investigation, all the memory contents can simply be removed. The sizes of the pages*.img files alone are enough.
- Paths to files. Here we should keep the relations between paths. The simplest way seems to be replacing file names with "random" (or sequential) strings, while making sure the mapping stays 1:1. Note that file paths may also sit in sk-unix.img.
- Registers.
- Process names. (But relations should be kept).
- Contents of streams, i.e. pipe/fifo data, sk-queue, tcp-stream, tty data.
- Ghost files.
- Tarballs with tmpfs-s.
- IP addresses in sk-inet-s, ip tool dumps and net*.img.
Links:
- https://github.com/checkpoint-restore/criu/issues/360
- CRIT, Images
- External links to mailing lists or web sites
Details:
- Skill level: beginner
- Language: Python
- Mentor: Pavel Emelianov <xemul@virtuozzo.com>
- Suggested by: Pavel Emelianov <xemul@virtuozzo.com>
Porting CRIT functionality to Go
Summary: TODO: Short description of the project
TODO: Detailed description of the project.
Links:
- Wiki links to relevant material
- External links to mailing lists or web sites
Details:
- Skill level: beginner or intermediate or advanced
- Language: Go
- Mentor: Adrian Reber <areber@redhat.com>
- Suggested by: Adrian Reber <areber@redhat.com>
Implement diskless migration
Summary: Investigate and implement so-called diskless migration.
By diskless we mean a case where the images generated by the checkpoint procedure do not sit on storage at all, but are instead collected by the criu service on the destination machine and read from memory once the restore procedure is initiated. More importantly, the transferred memory should be deduplicated on the fly and premapped at some preliminary address. The restore procedure then just remaps the data to its proper position without copying page data at all.
This task is still under discussion, and this section is a placeholder.
Details:
- Skill level: expert
- Language: C
- Mentor: Cyrill Gorcunov <gorcunov@gmail.com>
- Suggested by: Cyrill Gorcunov <gorcunov@gmail.com>
Memory changes tracking with userfaultfd-WP
Summary: add ability to track memory changes of the snapshotted processes using userfaultfd-WP
Currently CRIU uses the soft-dirty mechanism in the Linux kernel to track memory changes. This mechanism can be complemented or even completely replaced with the recently proposed userfaultfd-WP.
Userfaultfd allows paging to be implemented in userspace. It allows an application to receive notifications about page faults and provide the desired memory contents for the faulting pages. In the current upstream kernels only missing page faults are supported, but there is ongoing work to allow notifications for write faults as well. Using these notifications it would be possible to precisely track memory accesses during pre-dump iterations, and this approach may prove more efficient than soft-dirty.
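Links:
- Userfaultfd: https://www.kernel.org/doc/html/latest/admin-guide/mm/userfaultfd.html
- Userfaultfd-WP: https://github.com/xzpeter/linux/tree/uffd-wp-merged
- Soft-Dirty: https://www.kernel.org/doc/html/latest/admin-guide/mm/soft-dirty.html?highlight=soft%20dirty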
Details:
- Skill level: most probably advanced?
- Language: C
- Mentor: Mike Rapoport <rppt@linux.ibm.com>
- Suggested by: Mike Rapoport <rppt@linux.ibm.com>