| Line 3: | Line 3: |
| | This page contains project ideas for upcoming Google Summer of Code. | | This page contains project ideas for upcoming Google Summer of Code. |
| | | | |
| − | == Contacts == | + | == Contact == |
| | | | |
| − | Please contact the respective mentor for the idea you are interested in. For general questions feel free to send an email to the [mailto:criu@openvz.org mailing list] or write in [https://gitter.im/save-restore/criu gitter].
| + | First, make sure to go through the [[GSoC Students Recommendations]]. Once you have built CRIU locally and successfully checkpointed and restored (C/R) a simple process, please contact the respective mentor for the idea you are interested in. For general questions, feel free to send an email to the [mailto:criu@lists.linux.dev mailing list] or write in [https://gitter.im/save-restore/criu gitter]. |
| | | | |
| | == Project ideas == | | == Project ideas == |
| | | | |
| − | === Add support for checkpoint/restore of CORK-ed UDP socket === | + | === Add support for memory compression === |
| − |
| |
| − | '''Summary:''' Support C/R of corked UDP socket
| |
| − |
| |
| − | There's UDP_CORK option for sockets. As man page says:
| |
| − | <pre>
| |
| − | If this option is enabled, then all data output on this socket
| |
| − | is accumulated into a single datagram that is transmitted when
| |
| − | the option is disabled. This option should not be used in
| |
| − | code intended to be portable.
| |
| − | </pre>
| |
| − | | |
| − | Currently criu refuses to dump this case, so it's effectively a bug. Supporting
| |
| − | this will need extending the kernel API to allow criu read back the write queue
| |
| − | of the socket (see [[TCP connection|how it's done]] for TCP sockets, for example). Then
| |
| − | the queue is written into the image and is restored into the socket (with the CORK
| |
| − | bit set too).
| |
| | | | |
| − | '''Links:''' | + | '''Summary:''' Support compression for page images |
| − | * https://github.com/checkpoint-restore/criu/issues/409
| |
| − | * https://github.com/criupatchwork/criu/commit/a532312
| |
| − | * [[Sockets]], [[TCP connection]]
| |
| − | * [[https://groups.google.com/forum/#!topic/comp.os.linux.networking/Uz8PYiTCZSg UDP cork explained]]
| |
| | | | |
| − | '''Details:''' | + | We would like to support compression of memory page files |
| − | * Skill level: intermediate (+linux kernel)
| + | in CRIU using one of the fastest algorithms (it's a matter |
| − | * Language: C
| + | of discussion which one to choose!). |
| − | * Expected size: 350 hours
| |
| − | * Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Andrei Vagin <avagin@gmail.com>
| |
| − | | |
| − | === Add support for pidfd file descriptors ===
| |
| − | | |
| − | '''Summary:''' Support C/R of pidfd descriptors
| |
| | | | |
| − | There is pidfd_open syscall which allows opening
| + | This task does not require any Linux kernel modifications |
| − | a special PID file descriptor. A user can send a signal to
| + | and its scope is limited to CRIU itself. At the same time, it's |
| − | the process (pidfd_send_signal syscall), wait for the process
| + | complex enough, as we need to touch the memory dump/restore code path |
| − | (poll() on pidfd).
| + | in CRIU and also handle many corner cases, such as the page server. |
| − | | |
| − | At the moment CRIU can't dump processes that have pidfd's opened. | |
| − | | |
| − | '''Links:'''
| |
| − | * https://lwn.net/Articles/801319/
| |
| − | * https://lwn.net/Articles/794707/
| |
| − | * https://github.com/torvalds/linux/blob/v5.16/kernel/fork.c#L1877
| |
| | | | |
| | '''Details:''' | | '''Details:''' |
| Line 59: | Line 26: |
| | * Language: C | | * Language: C |
| | * Expected size: 350 hours | | * Expected size: 350 hours |
| − | * Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Christian Brauner <christian@brauner.io> | + | * Suggested by: Andrei Vagin <avagin@gmail.com> |
| − | * Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
| + | * Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Alexander Mikhalitsyn <alexander@mihalicyn.com>, Andrei Vagin <avagin@gmail.com> |
| | | | |
| | === Use eBPF to lock and unlock the network === | | === Use eBPF to lock and unlock the network === |
| Line 136: | Line 103: |
| | * Language: C | | * Language: C |
| | * Expected size: 350 hours | | * Expected size: 350 hours |
| − | * Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>, Prajwal S N <prajwalnadig21@gmail.com> | + | * Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com> |
| | * Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | | * Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> |
| | | | |
| − | === Kubernetes operator for managing container checkpoints ===
| |
| | | | |
| − | '''Summary:''' Develop a Kubernetes operator that automates the management of container checkpoints | + | === Add support for arm64 Guarded Control Stack (GCS) === |
| | + | |
| | + | '''Summary:''' Support arm64 Guarded Control Stack (GCS) |
| | + | |
| | + | The arm64 Guarded Control Stack (GCS) feature provides support for |
| | + | hardware protected stacks of return addresses, intended to provide |
| | + | hardening against return oriented programming (ROP) attacks and to make |
| | + | it easier to gather call stacks for applications such as profiling (taken from [1]). |
| | + | We would like to support arm64 Guarded Control Stack (GCS) in CRIU, which means |
| | + | that CRIU should be able to checkpoint/restore applications using GCS. |
| | | | |
| − | Container checkpointing has recently been introduced as an alpha feature in Kubernetes.
| + | This task should not require any Linux kernel modifications |
| − | To enable this feature, the kubelet API was extended with an endpoint that enables the
| + | but will require a lot of effort to understand the Linux kernel and |
| − | creation of checkpoints for individual containers. By default, all container checkpoints
| + | glibc support patches. We have a good example of support for |
| − | are stored as tar archives in <code>/var/lib/kubelet/checkpoints</code> using the following
| + | x86 shadow stack [4]. |
| − | file name format: <code>checkpoint-<pod-name>_<namespace-name>-<container-name>-<timestamp>.tar</code>.
| |
| − | However, the current implementation does not provide a mechanism for limiting the number
| |
| − | of checkpoints, which may lead to filling up all existing disk space. This project aims to | |
| − | develop a Kubernetes operator that automates the management of checkpoints and provides
| |
| − | a garbage collection mechanism to discard obsolete checkpoints.
| |
| | | | |
| | '''Links:''' | | '''Links:''' |
| − | * https://github.com/checkpoint-restore/checkpoint-restore-operator | + | * [1] kernel support https://lore.kernel.org/all/20241001-arm64-gcs-v13-0-222b78d87eee@kernel.org |
| − | * https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api/ | + | * [2] libc support https://inbox.sourceware.org/libc-alpha/20250117174119.3254972-1-yury.khrustalev@arm.com |
| − | * https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/ | + | * [3] libc tests https://inbox.sourceware.org/libc-alpha/20250210114538.1723249-1-yury.khrustalev@arm.com |
| − | * https://kubernetes.io/blog/2023/03/10/forensic-container-analysis/
| + | * [4] x86 support https://github.com/checkpoint-restore/criu/pull/2306 |
| − | * https://github.com/kubernetes/kubernetes/pull/115888
| + | |
| − | * https://github.com/kubernetes/enhancements/issues/2008 | |
| − | | |
| | '''Details:''' | | '''Details:''' |
| − | * Skill level: intermediate | + | * Skill level: expert (a lot of moving parts: Linux kernel / libc / CRIU) |
| − | * Language: Go | + | * Language: C |
| | * Expected size: 350 hours | | * Expected size: 350 hours |
| − | * Mentors: Adrian Reber <areber@redhat.com>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Prajwal S N <prajwalnadig21@gmail.com> | + | * Suggested by: Mike Rapoport <rppt@kernel.org> |
| − | * Suggested by: Adrian Reber
| + | * Mentors: Mike Rapoport <rppt@kernel.org>, Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com> |
| | | | |
| | == Suspended project ideas == | | == Suspended project ideas == |
| Line 248: | Line 216: |
| | * Skill level: beginner | | * Skill level: beginner |
| | * Language: Python | | * Language: Python |
| | + | |
| | + | === Add support for checkpoint/restore of CORK-ed UDP socket === |
| | + | |
| | + | '''Summary:''' Support C/R of corked UDP socket |
| | + | |
| | + | There's a UDP_CORK option for sockets. As the man page says: |
| | + | <pre> |
| | + | If this option is enabled, then all data output on this socket |
| | + | is accumulated into a single datagram that is transmitted when |
| | + | the option is disabled. This option should not be used in |
| | + | code intended to be portable. |
| | + | </pre> |
| | + | |
| | + | Currently criu refuses to dump this case, so it's effectively a bug. Supporting |
| | + | this will require extending the kernel API to allow criu to read back the write queue |
| | + | of the socket (see [[TCP connection|how it's done]] for TCP sockets, for example). Then |
| | + | the queue is written into the image and is restored into the socket (with the CORK |
| | + | bit set too). |
| | + | |
| | + | '''Notes:''' |
| | + | |
| | + | There have already been a few (3) attempts at solving this problem: |
| | + | |
| | + | * UDP_REPAIR approach didn't succeed: https://lore.kernel.org/netdev/721a2e32-c930-ad6b-5055-631b502ed11b@gmail.com/, https://lore.kernel.org/netdev/?q=udp_repair |
| | + | * eBPF (CRIB) approach: the socket queue iterator was not merged: https://lore.kernel.org/netdev/AM6PR03MB5848EDA002E3D7EACA7C6BDA99A52@AM6PR03MB5848.eurprd03.prod.outlook.com/, and there are general objections to the CRIB approach: https://lore.kernel.org/bpf/CAHk-=wjLWFa3i6+Tab67gnNumTYipj_HuheXr2RCq4zn0tCTzA@mail.gmail.com/ |
| | + | |
| | + | There is still one idea we haven't tried: since UDP allows packets to be lost in transit, on restore we could somehow mark the socket to drop all data queued before UNCORK. This way we would not need to restore the contents of the send queue of CORK-ed UDP sockets. |
| | + | |
| | + | '''Links:''' |
| | + | * https://github.com/checkpoint-restore/criu/issues/409 |
| | + | * https://github.com/criupatchwork/criu/commit/a532312 |
| | + | * [[Sockets]], [[TCP connection]] |
| | + | * [https://groups.google.com/forum/#!topic/comp.os.linux.networking/Uz8PYiTCZSg UDP cork explained] |
| | + | |
| | + | '''Details:''' |
| | + | * Skill level: intermediate (+linux kernel) |
| | + | * Language: C |
| | + | * Expected size: 350 hours |
| | + | * Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>, Andrei Vagin <avagin@gmail.com> |
| | | | |
| | | | |