Difference between revisions of "Google Summer of Code Ideas"

From CRIU
(Post-copy for shared memory and hugetlbfs)
(Support sparse ghosts)
 
(45 intermediate revisions by 7 users not shown)
Line 3: Line 3:
 
This page contains project ideas for upcoming Google Summer of Code.
 
This page contains project ideas for upcoming Google Summer of Code.
  
== Suggested ideas ==
+
== Contacts ==
  
 +
Please contact the respective mentor for the idea you are interested in. For general questions feel free to send an email to the [mailto:criu@openvz.org mailing list] or write in [https://gitter.im/save-restore/criu gitter].
  
=== Post-copy for shared memory and hugetlbfs ===
+
== Project ideas ==
 +
 
 +
=== Support sparse ghosts ===
 
   
 
   
'''Summary:''' extend post-copy memory restore and migration to support shared memory and hugetlbfs.
 
  
CRIU relies on the [[Userfaultfd]] mechanism in the Linux kernel to implement demand paging in userspace and allow post-copy (or lazy) memory [[Lazy_migration|migration]]. However, this support is currently limited to anonymous private memory mappings, although the kernel also supports shared memory areas and hugetlbfs-backed memory.
+
When criu dumps processes it also dumps the files they have opened. It does this by saving the names by which the files are accessible. But sometimes a file has no name, for example when a task opened the file and then removed it. To dump such a file criu cannot save its name (because the name no longer exists). Instead criu saves the whole file. This is called a "ghost file". Since saving the whole file is very expensive (it means copying lots of data on disk), criu limits the maximum size of a ghost file. This limit is also a problem, because there are "sparse" files that are large in size but small in terms of real disk usage. The goal of the task is to support sparse ghost files, i.e. limit the size of the ghost not by its length but by its disk usage and, when copying the data, detect the used blocks and save only those.
  
The shared memory support for lazy migration can be added to CRIU without kernel modifications, while proper handling of hugetlbfs would require userfaultfd callbacks for [http://man7.org/linux/man-pages/man2/fallocate.2.html fallocate(PUNCH_HOLE)] and [http://man7.org/linux/man-pages/man2/madvise.2.html madvise(MADV_REMOVE)] system calls.
+
 
 +
'''Links:'''
 
   
 
   
'''Links:'''
+
*[https://en.wikipedia.org/wiki/Sparse_file Sparse files]
* [https://www.kernel.org/doc/html/latest/admin-guide/mm/userfaultfd.html Userfaultfd]
+
*[[Dumping files]]
* [https://www.kernel.org/doc/html/latest/admin-guide/mm/hugetlbpage.html hugetlbfs]
+
*[[Invisible files]]
 +
*[https://www.kernel.org/doc/html/latest/filesystems/fiemap.html Fiemap ioctl]
 +
 
 
'''Details:'''
 
'''Details:'''
* Skill level: most probably advanced?
+
* Skill level: intermediate
 
* Language: C
 
* Language: C
* Mentor: Mike Rapoport <rppt@linux.ibm.com>
+
* Mentor: Pavel Emelyanov <xemul@openvz.org>
* Suggested by: Mike Rapoport <rppt@linux.ibm.com>
+
* Suggested by: Pavel Emelyanov <xemul@openvz.org>
  
 
=== Optimize logging engine ===
 
=== Optimize logging engine ===
 
   
 
   
'''Summary:''' TODO: Short description of the project
+
'''Summary:''' CRIU writes a lot of logs while doing its job. Logging is done with the plain fprintf function. The logs are typically useless, but ''if'' some operation fails, they are the only way to find the reason for the failure.
+
 
TODO: Detailed description of the project.
+
At the same time the printf family of functions is known to take some time to work: they need to scan the format string for % specifiers and then convert the arguments into strings. Comparing criu dump with and without logs shows a notable time difference (15%-20%), so speeding up logging will help improve criu performance.
+
 
'''Links:'''
+
One solution to the problem might be binary logging. The difficulty with binary logs is the amount of effort needed to convert existing logs to binary form. Preferably, the switch to binary logging either keeps existing log() calls intact or provides some automation to convert them.
* Wiki links to relevant material
 
* External links to mailing lists or web sites
 
 
'''Details:'''
 
* Skill level: beginner or intermediate or advanced
 
* Language: C
 
* Mentor: Andrei Vagin <avagin@gmail.com>
 
* Suggested by: Andrei Vagin <avagin@gmail.com>
 
  
 +
One way to keep log() calls intact might be a pre-compilation pass over the sources. In this pass each <code>log(fmt, ...)</code> call gets translated into a call to a binary log function that saves the <code>fmt</code> identifier and copies all the args ''as is'' into the log file. The binary log decode utility, required in this case, should then find the fmt string by its ID in the log file and print the resulting message.
  
=== Add support for checkpoint/restore of cgroups v2 ===
 
 
'''Summary:''' TODO: Short description of the project
 
 
TODO: Detailed description of the project.
 
 
 
'''Links:'''
 
'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/252
+
* [[Better logging]]
* Wiki links to relevant material
 
* External links to mailing lists or web sites
 
 
   
 
   
 
'''Details:'''
 
'''Details:'''
* Skill level: beginner or intermediate or advanced
+
* Skill level: intermediate
* Language: C
+
* Language: C, though decoder/preprocessor can be in any language
* Suggested by: Person who suggested the idea
+
* Mentor: Pavel Emelyanov <xemul@openvz.org>
 
+
* Suggested by: Andrei Vagin <avagin@gmail.com>
  
 
=== Add support for checkpoint/restore of CORK-ed UDP socket ===
 
=== Add support for checkpoint/restore of CORK-ed UDP socket ===
Line 86: Line 76:
 
* Suggested by: Pavel Emelianov <xemul@virtuozzo.com>
 
* Suggested by: Pavel Emelianov <xemul@virtuozzo.com>
  
=== Optimize the pre-dump algorithm ===
+
 
 +
=== Use eBPF to lock and unlock the network ===
 +
 +
'''Summary:''' Use eBPF instead of the external iptables-restore tool for network lock and unlock.
 +
 
 +
During checkpointing and restoring CRIU locks the network to make sure no network packets are accepted by the network stack during the time the process is checkpointed. Currently CRIU calls out to iptables-restore to create and delete the corresponding iptables rules. Another approach which avoids calling out to the external binary iptables-restore would be to directly inject eBPF rules. There have been reports from users that iptables-restore fails in some way and eBPF could avoid this external dependency.
 +
 
 +
'''Links:'''
 +
* https://www.criu.org/TCP_connection#Checkpoint_and_restore_TCP_connection
 +
* https://github.com/systemd/systemd/blob/master/src/core/bpf-firewall.c
 +
 
 +
'''Details:'''
 +
* Skill level: intermediate
 +
* Language: C
 +
* Mentor: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>
 +
* Suggested by: Adrian Reber <areber@redhat.com>
 +
 
 +
=== IOUring support ===
 +
The io_uring Asynchronous I/O (AIO) framework is a new Linux I/O interface, first introduced in upstream Linux kernel version 5.1 (March 2019). It provides a low-latency and feature-rich interface for applications that require AIO functionality.
 +
 
 +
'''Links:'''
 +
* https://blogs.oracle.com/linux/an-introduction-to-the-io_uring-asynchronous-io-framework
 +
* https://github.com/axboe/liburing
 +
 
 +
'''Details:'''
 +
* Skill level: expert (+linux kernel)
 +
* Suggested by: Pavel Emelyanov <xemul@openvz.org>
 +
 
 +
=== CGroup-v2 support ===
 +
 
 +
'''Summary:''' cgroup is a mechanism to organize processes hierarchically and distribute system resources along the hierarchy in a controlled and configurable manner. cgroup v2 is a new version of the cgroup file system. Unlike v1, cgroup v2 has only a single hierarchy. CRIU has to dump/restore a container cgroup hierarchy along with all per-cgroup options. The cgroup v2 support in CRIU has to be compatible with Docker, containerd and cri-o.
 +
 
 +
'''Links:'''
 +
* [[CGroups]]
 +
* https://github.com/checkpoint-restore/criu/issues/252
 +
* https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html
 +
'''Details:'''
 +
* Skill level: intermediate
 +
* Language: C
 +
* Mentor: Andrei Vagin <avagin@gmail.com>
 +
* Suggested by: Andrei Vagin <avagin@gmail.com>
 +
 
 +
=== Dump shmem in user-mode (unprivileged-mode) ===
 +
 
 +
CRIU uses /proc/pid/map_files to dump and restore anonymous shared memory regions, but map_files is restricted to the global CAP_SYS_ADMIN capability. In most cases, it is possible to dump/restore a shared memory region without map_files, and we need to implement this in CRIU.
 +
 
 +
'''Links:'''
 +
* [[User-mode]]
 +
 
 +
'''Details:'''
 +
* Skill level: intermediate
 +
* Language: C
 +
* Suggested by: Andrei Vagin <avagin@gmail.com>
 +
* Suggested by: Pavel Emelyanov <xemul@openvz.org>
 +
 
 +
=== Porting crit functionalities in GO ===
 
   
 
   
'''Summary:''' Optimize the pre-dump algorithm to avoid pinning too much memory in RAM
+
'''Summary:''' Implement image view and manipulation in Go
 
   
 
   
Current [[CLI/cmd/pre-dump|pre-dump]] mode is used to write task memory contents into image
+
CRIU's checkpoint images are stored on disk using protobuf. For easier analysis of checkpoint files CRIU has a tool called [[CRIT|CRiu Image Tool (CRIT)]]. It can display/decode CRIU image files from binary protobuf to JSON as well as encode JSON files back to the binary format. With closer integration of CRIU in container runtimes it becomes important to be able to view the CRIU output files, either for manipulation before restoring or for reading checkpoint statistics (memory pages written to disk, memory pages skipped, process downtime).
files w/o stopping the task for too long. It does this by stopping the task, infecting it and
 
draining all the memory into a set of pipes. Then the task is cured and resumed and the pipes'
 
contents is written into images (maybe a [[page server]]). This approach creates a big stress
 
on memory subsystem, as keeping all the memory in pipes creates a lot of unreclaimable memory
 
(pages in pipes are not swappable), as well as the number of pipes themselves can be huge (as
 
one pipe doesn't store more than a fixed certain amount of data).
 
  
We can try to use the process_vm_readv() syscall to mitigate all of the above. To do this we
+
Currently CRIT is implemented in Python. For easier integration in other Go projects it is important to have image manipulation and analysis available from Go. This means we need a Go-based library to read/modify/write/encode/decode CRIU's image files. Based on this library a Go-based implementation of CRIT would be useful.
need to allocate a temporary buffer in criu, then walk the target process vm by copying the
 
memory piece-by-piece into it, then flush the data into image (or page server), then repeat.
 
  
Ideally there should be sys_splice_process_vm() syscall in the kernel, that does the same as
+
'''Links:'''
the process_vm_readv does, but vmsplices the data
+
* [[CRIT]]
 +
* Possible use case see LXD: https://github.com/lxc/lxd/blob/cb55b1c5a484a43e0c21c6ae8c4a2e30b4d45be3/lxd/migrate_container.go#L179
 +
* https://github.com/lxc/lxd/pull/4072
 +
* https://github.com/checkpoint-restore/go-criu/blob/master/phaul/stats.go
 
   
 
   
'''Links:'''
+
'''Details:'''
* https://github.com/checkpoint-restore/criu/issues/351
+
* Skill level: beginner
* [[Memory dumping and restoring]], [[Memory changes tracking]]
+
* Language: Go
* [http://man7.org/linux/man-pages/man2/process_vm_readv.2.html process_vm_readv(2)] [http://man7.org/linux/man-pages/man2/vmsplice.2.html vmsplice(2)] [https://lkml.org/lkml/2017/11/22/527 RFC for splice_process_vm syscall]
+
* Mentor: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>
 +
* Suggested by: Adrian Reber <areber@redhat.com>
 +
 
 +
== Suspended project ideas ==
 +
 
 +
Listed here are tasks that seem suitable for GSoC, but currently do not have anybody to mentor them.
 +
 
 +
=== Add support for SPFS ===
 
   
 
   
 +
'''Summary:''' The SPFS is a special filesystem that allows checkpoint and restore of such things as NFS and FUSE
 +
 +
NFS support is already implemented in Virtuozzo CRIU, but it would be very beneficial to port it to mainline CRIU. The important part of it is the need to implement the integration of Stub-Proxy File System (SPFS) with LXC/yet_another_containers_environment.
 +
 +
'''Links'''
 +
* https://github.com/checkpoint-restore/criu/issues/60
 +
* https://github.com/checkpoint-restore/criu/issues/53
 +
* https://github.com/skinsbursky/spfs
 +
* https://patchwork.criu.org/series/137/
 +
 
'''Details:'''
 
'''Details:'''
* Skill level: advanced
+
* Skill level: expert
 
* Language: C
 
* Language: C
* Mentor: Pavel Emelianov <xemul@virtuozzo.com>
+
* Mentor: Alexander Mikhalitsyn <alexander@mihalicyn.com> / <alexander.mikhalitsyn@virtuozzo.com>
* Suggested by: Pavel Emelianov <xemul@virtuozzo.com>
+
* Suggested by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
 +
 
  
=== Anonymize image files ===
+
=== Anonymise image files ===
 
   
 
   
 
'''Summary:''' Teach [[CRIT]] to remove sensitive information from images
 
'''Summary:''' Teach [[CRIT]] to remove sensitive information from images
Line 134: Line 192:
 
   
 
   
 
'''Links:'''
 
'''Links:'''
 +
* [[Anonymize image files]]
 
* https://github.com/checkpoint-restore/criu/issues/360
 
* https://github.com/checkpoint-restore/criu/issues/360
 
* [[CRIT]], [[Images]]
 
* [[CRIT]], [[Images]]
Line 144: Line 203:
 
* Suggested by: Pavel Emelianov <xemul@virtuozzo.com>
 
* Suggested by: Pavel Emelianov <xemul@virtuozzo.com>
  
=== Porting crit functionalities in GO ===
+
[[Category:GSoC]]
+
[[Category:Development]]
'''Summary:''' TODO: Short description of the project
 
 
TODO: Detailed description of the project.
 
 
 
'''Links:'''
 
* Wiki links to relevant material
 
* External links to mailing lists or web sites
 
 
'''Details:'''
 
* Skill level: beginner or intermediate or advanced
 
* Language: Go
 
* Mentor: Adrian Reber <areber@redhat.com>
 
* Suggested by: Adrian Reber <areber@redhat.com>
 

Latest revision as of 14:09, 8 April 2021

Google Summer of Code (GSoC) is a global program that offers post-secondary students an opportunity to be paid for contributing to an open source project over a three month period.

This page contains project ideas for upcoming Google Summer of Code.

Contacts

Please contact the respective mentor for the idea you are interested in. For general questions feel free to send an email to the mailing list or write in gitter.

Project ideas

Support sparse ghosts

When criu dumps processes it also dumps the files they have opened. It does this by saving the names by which the files are accessible. But sometimes a file has no name, for example when a task opened the file and then removed it. To dump such a file criu cannot save its name (because the name no longer exists). Instead criu saves the whole file. This is called a "ghost file". Since saving the whole file is very expensive (it means copying lots of data on disk), criu limits the maximum size of a ghost file. This limit is also a problem, because there are "sparse" files that are large in size but small in terms of real disk usage. The goal of the task is to support sparse ghost files, i.e. limit the size of the ghost not by its length but by its disk usage and, when copying the data, detect the used blocks and save only those.
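
The idea of "detect the used blocks and save only those" can be sketched with lseek(2) and its SEEK_DATA/SEEK_HOLE modes. This is only an illustration under stated assumptions: the helper names ghost_data_bytes() and make_sparse_ghost() are hypothetical and not part of CRIU, and the sketch uses lseek rather than the FIEMAP ioctl. On filesystems that do not track holes the whole file is reported as data.

```c
#define _GNU_SOURCE /* for SEEK_DATA and SEEK_HOLE */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3 /* Linux values, used if the headers hide them */
#define SEEK_HOLE 4
#endif

/*
 * Hypothetical helper: sum the lengths of all data extents in the
 * first `size` bytes of fd. Returns -1 on error.
 */
static off_t ghost_data_bytes(int fd, off_t size)
{
	off_t total = 0, pos = 0;

	assert(size >= 0);
	while (pos < size) {
		off_t data = lseek(fd, pos, SEEK_DATA);
		if (data < 0) {
			if (errno == ENXIO)
				break; /* only a hole remains up to EOF */
			return -1;
		}
		if (data >= size)
			break;
		off_t hole = lseek(fd, data, SEEK_HOLE);
		if (hole < 0)
			return -1;
		if (hole > size)
			hole = size;
		/* A real dump would copy the range [data, hole) here. */
		total += hole - data;
		pos = hole;
	}
	return total;
}

/*
 * Hypothetical demo setup: an open file whose name is already removed
 * (a "ghost"), `size` bytes long with only 4 bytes of real data.
 */
static int make_sparse_ghost(off_t size)
{
	char path[] = "/tmp/ghost-XXXXXX";
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(fd, size) < 0 || pwrite(fd, "data", 4, size / 2) != 4) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With such a helper, the ghost size limit could be compared against the value returned by ghost_data_bytes() instead of the file length reported by fstat().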


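Links:

  • Sparse files (https://en.wikipedia.org/wiki/Sparse_file)
  • Dumping files
  • Invisible files
  • Fiemap ioctl (https://www.kernel.org/doc/html/latest/filesystems/fiemap.html)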

Details:

  • Skill level: intermediate
  • Language: C
  • Mentor: Pavel Emelyanov <xemul@openvz.org>
  • Suggested by: Pavel Emelyanov <xemul@openvz.org>

Optimize logging engine

Summary: CRIU writes a lot of logs while doing its job. Logging is done with the plain fprintf function. The logs are typically useless, but if some operation fails, they are the only way to find the reason for the failure.

At the same time the printf family of functions is known to take some time to work: they need to scan the format string for % specifiers and then convert the arguments into strings. Comparing criu dump with and without logs shows a notable time difference (15%-20%), so speeding up logging will help improve criu performance.

One solution to the problem might be binary logging. The difficulty with binary logs is the amount of effort needed to convert existing logs to binary form. Preferably, the switch to binary logging either keeps existing log() calls intact or provides some automation to convert them.

One way to keep log() calls intact might be a pre-compilation pass over the sources. In this pass each log(fmt, ...) call gets translated into a call to a binary log function that saves the fmt identifier and copies all the args as is into the log file. The binary log decode utility, required in this case, should then find the fmt string by its ID in the log file and print the resulting message.
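
A minimal sketch of what the translated call and the decoder might look like. Everything here is an assumption made for illustration: the record layout, the names binlog_write() and binlog_decode(), the fixed limit of two 64-bit integer arguments, and the hand-written format table (which the pre-compilation pass would generate in practice).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical format table; a pre-compilation pass would generate it. */
static const char *const fmt_table[] = {
	"Dumping task %llu\n",
	"Mapped %llu pages at %llx\n",
};

#define BINLOG_MAX_ARGS 2

/* One record: format ID, argument count, then raw 64-bit arguments. */
struct binlog_rec {
	uint32_t fmt_id;
	uint32_t nargs;
	uint64_t args[BINLOG_MAX_ARGS];
};

/*
 * Hot path: copy the ID and the arguments as is, with no string
 * formatting. Returns the number of bytes written, or 0 on error.
 */
static size_t binlog_write(void *buf, size_t len, uint32_t fmt_id,
			   uint32_t nargs, const uint64_t *args)
{
	struct binlog_rec rec = { .fmt_id = fmt_id, .nargs = nargs };

	assert(args || !nargs);
	if (nargs > BINLOG_MAX_ARGS || len < sizeof(rec))
		return 0;
	if (nargs)
		memcpy(rec.args, args, nargs * sizeof(args[0]));
	memcpy(buf, &rec, sizeof(rec));
	return sizeof(rec);
}

/*
 * Decoder: find the format string by its ID and only now run printf.
 * Unused trailing arguments are zero and are ignored by snprintf.
 */
static int binlog_decode(const void *buf, size_t len, char *out, size_t out_len)
{
	struct binlog_rec rec;

	if (len < sizeof(rec))
		return -1;
	memcpy(&rec, buf, sizeof(rec));
	if (rec.fmt_id >= sizeof(fmt_table) / sizeof(fmt_table[0]) ||
	    rec.nargs > BINLOG_MAX_ARGS)
		return -1;
	return snprintf(out, out_len, fmt_table[rec.fmt_id],
			(unsigned long long)rec.args[0],
			(unsigned long long)rec.args[1]);
}
```

In this sketch the cost of parsing the format string moves entirely into the decoder, which runs only when somebody actually needs to read the log.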

Links:

  • Better logging

Details:

  • Skill level: intermediate
  • Language: C, though decoder/preprocessor can be in any language
  • Mentor: Pavel Emelyanov <xemul@openvz.org>
  • Suggested by: Andrei Vagin <avagin@gmail.com>

Add support for checkpoint/restore of CORK-ed UDP socket

Summary: Support C/R of corked UDP socket

There is a UDP_CORK option for sockets. As the man page says:

    If this option is enabled, then all data output on this socket
    is accumulated into a single datagram that is transmitted when
    the option is disabled.  This option should not be used in
    code intended to be portable.

Currently criu refuses to dump this case, so it's effectively a bug. Supporting this will require extending the kernel API to allow criu to read back the write queue of the socket (see how it's done for TCP sockets, for example). Then the queue is written into the image and is restored into the socket (with the CORK bit set too).
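
For reference, this is roughly how an application ends up with a corked socket, and how the cork flag itself can be read back. The helper names are hypothetical and not CRIU code. Note that the sketch only queries the flag: reading the queued data back is exactly the part that needs the kernel API extension described above.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/udp.h> /* UDP_CORK */
#include <sys/socket.h>
#include <unistd.h>

/* Set or clear UDP_CORK; clearing it transmits the pending datagram. */
static int udp_set_cork(int fd, int on)
{
	return setsockopt(fd, IPPROTO_UDP, UDP_CORK, &on, sizeof(on));
}

/* Read the cork flag back. Returns 1 or 0, or -1 on error. */
static int udp_is_corked(int fd)
{
	int on = 0;
	socklen_t len = sizeof(on);

	if (getsockopt(fd, IPPROTO_UDP, UDP_CORK, &on, &len) < 0)
		return -1;
	return on != 0;
}

/*
 * Send two pieces as a single datagram on a connected UDP socket.
 * Between the first send() and the uncork the data sits in the
 * socket's write queue, which is the state criu cannot dump today.
 */
static int udp_send_corked(int fd, const void *a, size_t alen,
			   const void *b, size_t blen)
{
	assert(a && b);
	if (udp_set_cork(fd, 1) < 0)
		return -1;
	if (send(fd, a, alen, 0) < 0 || send(fd, b, blen, 0) < 0)
		return -1;
	return udp_set_cork(fd, 0);
}
```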

Links:

Details:

  • Skill level: intermediate (+linux kernel)
  • Language: C
  • Mentor: Pavel Emelianov <xemul@virtuozzo.com>
  • Suggested by: Pavel Emelianov <xemul@virtuozzo.com>


Use eBPF to lock and unlock the network

Summary: Use eBPF instead of the external iptables-restore tool for network lock and unlock.

During checkpointing and restoring CRIU locks the network to make sure no network packets are accepted by the network stack during the time the process is checkpointed. Currently CRIU calls out to iptables-restore to create and delete the corresponding iptables rules. Another approach which avoids calling out to the external binary iptables-restore would be to directly inject eBPF rules. There have been reports from users that iptables-restore fails in some way and eBPF could avoid this external dependency.

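Links:

  • https://www.criu.org/TCP_connection#Checkpoint_and_restore_TCP_connection
  • https://github.com/systemd/systemd/blob/master/src/core/bpf-firewall.c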

Details:

  • Skill level: intermediate
  • Language: C
  • Mentor: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>
  • Suggested by: Adrian Reber <areber@redhat.com>

IOUring support

The io_uring Asynchronous I/O (AIO) framework is a new Linux I/O interface, first introduced in upstream Linux kernel version 5.1 (March 2019). It provides a low-latency and feature-rich interface for applications that require AIO functionality.

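Links:

  • https://blogs.oracle.com/linux/an-introduction-to-the-io_uring-asynchronous-io-framework
  • https://github.com/axboe/liburing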

Details:

  • Skill level: expert (+linux kernel)
  • Suggested by: Pavel Emelyanov <xemul@openvz.org>

CGroup-v2 support

Summary: cgroup is a mechanism to organize processes hierarchically and distribute system resources along the hierarchy in a controlled and configurable manner. cgroup v2 is a new version of the cgroup file system. Unlike v1, cgroup v2 has only a single hierarchy. CRIU has to dump/restore a container cgroup hierarchy along with all per-cgroup options. The cgroup v2 support in CRIU has to be compatible with Docker, containerd and cri-o.
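
A likely first step is telling whether a given mount point is a cgroup v2 hierarchy at all. One common way, sketched below, is to compare the filesystem magic returned by statfs(2) with CGROUP2_SUPER_MAGIC. The helper name is_cgroup2() is hypothetical, and this is not a description of how CRIU detects cgroup versions.

```c
#include <assert.h>
#include <sys/vfs.h> /* statfs */

#ifndef CGROUP2_SUPER_MAGIC
#define CGROUP2_SUPER_MAGIC 0x63677270 /* value from linux/magic.h */
#endif

/*
 * Hypothetical helper: returns 1 if path lives on a cgroup2
 * filesystem, 0 if it does not, -1 if statfs() fails.
 */
static int is_cgroup2(const char *path)
{
	struct statfs st;

	assert(path);
	if (statfs(path, &st) < 0)
		return -1;
	return st.f_type == CGROUP2_SUPER_MAGIC;
}
```

A caller would typically pass the conventional mount point, for example is_cgroup2("/sys/fs/cgroup"), and fall back to the v1 code paths when the result is 0.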

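Links:

  • CGroups
  • https://github.com/checkpoint-restore/criu/issues/252
  • https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html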

Details:

  • Skill level: intermediate
  • Language: C
  • Mentor: Andrei Vagin <avagin@gmail.com>
  • Suggested by: Andrei Vagin <avagin@gmail.com>

Dump shmem in user-mode (unprivileged-mode)

CRIU uses /proc/pid/map_files to dump and restore anonymous shared memory regions, but map_files is restricted to the global CAP_SYS_ADMIN capability. In most cases, it is possible to dump/restore a shared memory region without map_files, and we need to implement this in CRIU.

Links:

  • User-mode

Details:

  • Skill level: intermediate
  • Language: C
  • Suggested by: Andrei Vagin <avagin@gmail.com>
  • Suggested by: Pavel Emelyanov <xemul@openvz.org>

Porting crit functionalities in GO

Summary: Implement image view and manipulation in Go

CRIU's checkpoint images are stored on disk using protobuf. For easier analysis of checkpoint files CRIU has a tool called CRiu Image Tool (CRIT). It can display/decode CRIU image files from binary protobuf to JSON as well as encode JSON files back to the binary format. With closer integration of CRIU in container runtimes it becomes important to be able to view the CRIU output files, either for manipulation before restoring or for reading checkpoint statistics (memory pages written to disk, memory pages skipped, process downtime).

Currently CRIT is implemented in Python. For easier integration in other Go projects it is important to have image manipulation and analysis available from Go. This means we need a Go-based library to read/modify/write/encode/decode CRIU's image files. Based on this library a Go-based implementation of CRIT would be useful.

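Links:

  • CRIT
  • Possible use case see LXD: https://github.com/lxc/lxd/blob/cb55b1c5a484a43e0c21c6ae8c4a2e30b4d45be3/lxd/migrate_container.go#L179
  • https://github.com/lxc/lxd/pull/4072
  • https://github.com/checkpoint-restore/go-criu/blob/master/phaul/stats.go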

Details:

  • Skill level: beginner
  • Language: Go
  • Mentor: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>
  • Suggested by: Adrian Reber <areber@redhat.com>

Suspended project ideas

Listed here are tasks that seem suitable for GSoC, but currently do not have anybody to mentor them.

Add support for SPFS

Summary: The SPFS is a special filesystem that allows checkpoint and restore of such things as NFS and FUSE

NFS support is already implemented in Virtuozzo CRIU, but it would be very beneficial to port it to mainline CRIU. The important part of it is the need to implement the integration of Stub-Proxy File System (SPFS) with LXC/yet_another_containers_environment.

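Links:

  • https://github.com/checkpoint-restore/criu/issues/60
  • https://github.com/checkpoint-restore/criu/issues/53
  • https://github.com/skinsbursky/spfs
  • https://patchwork.criu.org/series/137/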

Details:

  • Skill level: expert
  • Language: C
  • Mentor: Alexander Mikhalitsyn <alexander@mihalicyn.com> / <alexander.mikhalitsyn@virtuozzo.com>
  • Suggested by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>


Anonymise image files

Summary: Teach CRIT to remove sensitive information from images

When reporting a bug it may not be acceptable for the reporter to send us raw images, as they may contain sensitive data. CRIT needs to be taught to "anonymise" images for publication.

List of data to shred:

  • Memory contents. For the sake of investigation, all the memory contents can be just removed. The sizes of the pages*.img files alone are enough.
  • Paths to files. Here we should keep the paths' relations to each other. The simplest way seems to be replacing file names with "random" (or sequential) strings, BUT (!) making sure this mapping is 1:1. Note that file paths may also sit in sk-unix.img.
  • Registers.
  • Process names. (But relations should be kept).
  • Contents of streams, i.e. pipe/fifo data, sk-queue, tcp-stream, tty data.
  • Ghost files.
  • Tarballs with tmpfs-s.
  • IP addresses in sk-inet-s, ip tool dumps and net*.img.
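
The 1:1 path mapping from the list above can be sketched as follows. The project itself targets Python (CRIT), but the mapping logic does not depend on the language, so it is shown here in C. The names anon_map, anon_index() and anon_path() are hypothetical, the table size is arbitrary, and the sketch assumes absolute paths. Each path component is replaced by a sequential token, so two files in the same directory still share a prefix after anonymisation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ANON_MAX  64 /* arbitrary table size for the sketch */
#define ANON_NAME 64 /* longest component name, including NUL */

struct anon_map {
	char names[ANON_MAX][ANON_NAME];
	int n;
};

/* Same name in, same index out; new names get the next index. */
static int anon_index(struct anon_map *m, const char *name)
{
	for (int i = 0; i < m->n; i++)
		if (strcmp(m->names[i], name) == 0)
			return i;
	if (m->n == ANON_MAX || strlen(name) >= ANON_NAME)
		return -1;
	strcpy(m->names[m->n], name);
	return m->n++;
}

/* Rewrite an absolute path component by component, e.g. /a/b -> /n0/n1. */
static int anon_path(struct anon_map *m, const char *path, char *out, size_t len)
{
	size_t used = 0;

	assert(m && path && out);
	if (len == 0)
		return -1;
	out[0] = '\0';
	while (*path) {
		char comp[ANON_NAME];
		size_t clen;
		int idx, w;

		if (*path == '/') { /* skip separators */
			path++;
			continue;
		}
		clen = strcspn(path, "/");
		if (clen >= sizeof(comp))
			return -1;
		memcpy(comp, path, clen);
		comp[clen] = '\0';
		idx = anon_index(m, comp);
		if (idx < 0)
			return -1;
		w = snprintf(out + used, len - used, "/n%d", idx);
		if (w < 0 || (size_t)w >= len - used)
			return -1;
		used += (size_t)w;
		path += clen;
	}
	return 0;
}
```

With a fresh map, /etc/passwd becomes /n0/n1 and /etc/shadow becomes /n0/n2: the names are gone, but it is still visible that both files live in the same directory.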

Links:

  • Anonymize image files
  • https://github.com/checkpoint-restore/criu/issues/360
  • CRIT, Images

Details:

  • Skill level: beginner
  • Language: Python
  • Mentor: Pavel Emelianov <xemul@virtuozzo.com>
  • Suggested by: Pavel Emelianov <xemul@virtuozzo.com>