Difference between revisions of "Google Summer of Code Ideas"

From CRIU

Revision as of 12:14, 16 January 2019

Google Summer of Code (GSoC) is a global program that offers post-secondary students an opportunity to be paid for contributing to an open source project over a three-month period.

This page contains project ideas for the upcoming Google Summer of Code.

Suggested ideas

Post-copy for shared memory and hugetlbfs

Summary: TODO: Short description of the project

TODO: Detailed description of the project.

Links:

  • Wiki links to relevant material
  • External links to mailing lists or web sites

Details:

  • Skill level: beginner or intermediate or advanced
  • Language: C
  • Mentor: Mike Rapoport <rppt@linux.ibm.com>
  • Suggested by: Mike Rapoport <rppt@linux.ibm.com>


Optimize logging engine

Summary: TODO: Short description of the project

TODO: Detailed description of the project.

Links:

  • Wiki links to relevant material
  • External links to mailing lists or web sites

Details:

  • Skill level: beginner or intermediate or advanced
  • Language: C
  • Mentor: Andrei Vagin <avagin@gmail.com>
  • Suggested by: Andrei Vagin <avagin@gmail.com>


Add support for checkpoint/restore of cgroups v2

Summary: TODO: Short description of the project

TODO: Detailed description of the project.

Links:

Details:

  • Skill level: beginner or intermediate or advanced
  • Language: C
  • Suggested by: Person who suggested the idea


Add support for checkpoint/restore of CORK-ed UDP socket

Summary: Support C/R of corked UDP socket

There is a UDP_CORK option for UDP sockets. As the udp(7) man page says:

    If this option is enabled, then all data output on this socket
    is accumulated into a single datagram that is transmitted when
    the option is disabled.  This option should not be used in
    code intended to be portable.

Currently CRIU refuses to dump a socket in this state, so this is effectively a bug. Supporting it requires extending the kernel API so that CRIU can read back the write queue of the socket (see how this is done for TCP sockets, for example). The queue is then written into the image and restored into the socket (with the CORK bit set too).
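To make the corked state concrete, here is a minimal sketch of what an application does with UDP_CORK. It is not CRIU code: the helper name corked_datagram_size() and the loopback setup are assumptions made for illustration only. Between the two send() calls and the final setsockopt() the data sits in the socket's write queue, which is the state CRIU currently cannot dump.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/udp.h>	/* UDP_CORK */
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send two chunks through a corked UDP socket over
 * loopback and return the size of the first datagram the receiver gets,
 * or -1 on error. Payloads longer than the 256-byte buffer are truncated.
 */
static ssize_t corked_datagram_size(const char *a, const char *b)
{
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
	char buf[256];
	ssize_t ret = -1;
	int on = 1, off = 0;
	int rx, tx;

	assert(a != NULL && b != NULL);

	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx < 0 || tx < 0)
		goto out;

	/* Bind the receiver to 127.0.0.1 and let the kernel pick a port. */
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) ||
	    getsockname(rx, (struct sockaddr *)&addr, &alen) ||
	    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) ||
	    connect(tx, (struct sockaddr *)&addr, sizeof(addr)))
		goto out;

	/* Cork the sender: writes are queued, not transmitted. */
	if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &on, sizeof(on)))
		goto out;
	if (send(tx, a, strlen(a), 0) < 0 || send(tx, b, strlen(b), 0) < 0)
		goto out;

	/* Uncork: the queued data leaves as a single datagram. */
	if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &off, sizeof(off)))
		goto out;

	ret = recv(rx, buf, sizeof(buf), 0);
out:
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return ret;
}
```

Calling corked_datagram_size("hello ", "world") on Linux should report one 11-byte datagram rather than two separate ones.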

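Links:

  • https://github.com/checkpoint-restore/criu/issues/409
  • Sockets, TCP connection
  • UDP cork explained: https://groups.google.com/forum/#!topic/comp.os.linux.networking/Uz8PYiTCZSg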

Details:

  • Skill level: intermediate (+linux kernel)
  • Language: C
  • Mentor: Pavel Emelianov <xemul@virtuozzo.com>
  • Suggested by: Pavel Emelianov <xemul@virtuozzo.com>

Optimize the pre-dump algorithm

Summary: Optimize the pre-dump algorithm to avoid pinning too much memory in RAM

The current pre-dump mode suffers from several issues:

  • It keeps all the memory in pipes, and the number of pipes can be huge because the size of a single pipe is limited
  • The memory kept in pipes is unreclaimable for that period
  • It stops and infects tasks in order to drain memory from them

We can try to use the process_vm_readv() syscall to mitigate all of the above. Benefits:

  • No pipes, just copy data into a temporary buffer and send it
  • Memory is always reclaimable
  • No infection is needed, just freeze, reset the tracker and proceed

Ideally there should be a sys_splice_process_vm() syscall in the kernel that does the same as process_vm_readv() but vmsplices the data.
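The "copy data into a temporary buffer" step could look roughly like the sketch below. It uses the existing process_vm_readv() syscall; the helper names copy_from_task() and demo_self_read() are hypothetical and not part of CRIU. The demo reads from the calling process itself so that it needs no second task; a real pre-dump would pass the PID of a frozen task and the addresses reported by the memory tracker.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for process_vm_readv() */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy len bytes found at address `remote` in task
 * `pid` into the local buffer. Returns the number of bytes copied or -1.
 */
static ssize_t copy_from_task(pid_t pid, const void *remote,
			      void *local, size_t len)
{
	struct iovec liov = { .iov_base = local, .iov_len = len };
	struct iovec riov = { .iov_base = (void *)remote, .iov_len = len };

	assert(local != NULL && remote != NULL);
	return process_vm_readv(pid, &liov, 1, &riov, 1, 0);
}

/*
 * Demo on our own address space. Returns 1 if the copy matches the
 * source, 0 if it does not, and -1 if the syscall failed (for example
 * when it is blocked by a seccomp policy).
 */
static int demo_self_read(void)
{
	const char src[] = "dirty page contents";
	char dst[sizeof(src)] = { 0 };
	ssize_t n;

	n = copy_from_task(getpid(), src, dst, sizeof(src));
	if (n < 0)
		return -1;
	return n == (ssize_t)sizeof(src) &&
	       memcmp(src, dst, sizeof(src)) == 0;
}
```

Because the data is copied rather than pinned in pipe buffers, the source pages stay reclaimable, at the cost of one extra copy; that extra copy is what the proposed splice variant would avoid.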

Links:

Details:

  • Skill level: advanced
  • Language: C
  • Mentor: Pavel Emelianov <xemul@virtuozzo.com>
  • Suggested by: Pavel Emelianov <xemul@virtuozzo.com>

Anonymize image files

Summary: Teach CRIT to remove sensitive information from images

When reporting a bug, it may not be acceptable for the reporter to send us raw images, as they may contain sensitive data. We need to teach CRIT to "anonymize" images for publication.

List of data to shred:

  • Memory contents. For the sake of investigation, all the memory contents can simply be removed; only the sizes of the pages*.img files are needed.
  • Paths to files. Here we should keep the relations between paths. The simplest way seems to be replacing file names with "random" (or sequential) strings, but (!) taking care that this mapping is 1:1. Note that file paths may also sit in sk-unix.img.
  • Registers.
  • Process names. (But relations should be kept).
  • Contents of streams, i.e. pipe/fifo data, sk-queue, tcp-stream, tty data.
  • Ghost files.
  • Tarballs with tmpfs-s.
  • IP addresses in sk-inet-s, ip tool dumps and net*.img.
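The 1:1 path mapping mentioned in the list can be sketched as follows. CRIT itself is written in Python; the C below only illustrates the idea, and every name in it (struct name_map, map_component(), anonymize_path(), the "n<id>" naming scheme) is an assumption rather than existing CRIT code. Each path component gets a sequential replacement that is reused on every later occurrence, so directory relations between paths survive.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAMES 64
#define NAME_LEN  64

/* Original path components; the array index is the replacement id. */
struct name_map {
	char orig[MAX_NAMES][NAME_LEN];
	int nr;
};

/* Return a stable id for a component, adding it on first sight; -1 if full. */
static int map_component(struct name_map *m, const char *name, size_t len)
{
	int i;

	if (len >= NAME_LEN)
		return -1;
	for (i = 0; i < m->nr; i++)
		if (strlen(m->orig[i]) == len && !memcmp(m->orig[i], name, len))
			return i;
	if (m->nr == MAX_NAMES)
		return -1;
	memcpy(m->orig[m->nr], name, len);
	m->orig[m->nr][len] = '\0';
	return m->nr++;
}

/* Rewrite path into out, replacing each component by "n<id>". 0 on success. */
static int anonymize_path(struct name_map *m, const char *path,
			  char *out, size_t outsz)
{
	size_t used = 0;

	assert(outsz > 0);
	out[0] = '\0';
	while (*path) {
		size_t len = strcspn(path, "/");
		int n;

		if (len == 0) {
			/* A '/' separator is copied as is. */
			n = snprintf(out + used, outsz - used, "/");
			path++;
		} else {
			int id = map_component(m, path, len);

			if (id < 0)
				return -1;
			n = snprintf(out + used, outsz - used, "n%d", id);
			path += len;
		}
		if (n < 0 || (size_t)n >= outsz - used)
			return -1;	/* output buffer too small */
		used += (size_t)n;
	}
	return 0;
}
```

With a zero-initialized map, "/etc/passwd" becomes "/n0/n1" and "/etc/shadow" becomes "/n0/n2": the shared directory keeps a shared name while the original names are gone. The same map would have to be applied to every image that stores paths, including sk-unix.img.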

Links:

Details:

  • Skill level: beginner
  • Language: Python
  • Mentor: Pavel Emelianov <xemul@virtuozzo.com>
  • Suggested by: Pavel Emelianov <xemul@virtuozzo.com>

Porting CRIT functionality to Go

Summary: TODO: Short description of the project

TODO: Detailed description of the project.

Links:

  • Wiki links to relevant material
  • External links to mailing lists or web sites

Details:

  • Skill level: beginner or intermediate or advanced
  • Language: Go
  • Mentor: Adrian Reber <areber@redhat.com>
  • Suggested by: Adrian Reber <areber@redhat.com>