Changes

Line 9: Line 9:  
== Project ideas ==
 
   −
=== Automated Policy-Driven Checkpointing Operator for Kubernetes ===
+
=== Kubernetes Operator for Automated Checkpointing ===
    
'''Summary:''' Extend the Checkpoint/Restore Operator with support for automated policy-based checkpointing.
 
Line 23: Line 23:  
* Language: Go
 
 
* Expected size: 350 hours
 
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Viktória Spišaková <spisakova@ics.muni.cz>, Adrian Reber <areber@redhat.com>
+
* Mentors: Prajwal S N <prajwalnadig21@gmail.com>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>
   −
=== Add support for memory compression ===
+
=== Forensic Checkpointing Framework for Kubernetes ===
+
 
'''Summary:''' Support compression for page images
+
Kubernetes provides a highly dynamic and ephemeral environment where workloads can start and disappear very quickly and are continuously being rescheduled across different nodes in the cluster.
+
One of the key challenges with forensic investigations in Kubernetes is capturing and preserving the evidence during security incidents. This project aims to address this problem by developing a framework for efficiently capturing and preserving the state of all running applications in a container at a specific point in time, along with the associated container configurations and metadata. These artifacts would allow investigators to accurately reconstruct the events, create a timeline, and analyze security incidents without impacting the running cluster. This is an important step towards enabling forensic readiness for Kubernetes, where cluster administrators proactively ensure the environments are prepared to collect and preserve evidence before a security incident occurs.
We would like to support memory page files compression
+
 
in CRIU using one of the fastest algorithms (it's a matter
+
'''Links:'''
of discussion which one to choose!).
+
* https://github.com/checkpoint-restore/checkpointctl
 +
* [https://fosdem.org/2026/events/attachments/F9RANH-forensic-snapshots-in-kubernetes/slides/267371/fosdem_2_4dh73ni.pdf Investigating Security Incidents with Forensic Snapshots in Kubernetes]
 +
* [https://www.cncf.io/reports/cloud-native-security-whitepaper/ Cloud Native Security Whitepaper]
 +
* [https://media.defense.gov/2022/Aug/29/2003066362/-1/-1/0/CTR_KUBERNETES_HARDENING_GUIDANCE_1.2_20220829.PDF Kubernetes Hardening Guide]
 +
 
 +
'''Details:'''
 +
* Skill level: intermediate
 +
* Language: Go
 +
* Expected size: 350 hours
 +
* Mentors: Lorena Goldoni <lory.goldoni@gmail.com>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>
 +
 
 +
=== Enabling Checkpoint/Restore of Rootless Containers ===
 +
 
 +
[https://rootlesscontaine.rs/ Rootless containers] are containers that can be created, run, and managed by unprivileged users. Container engines such as Podman natively support running containers in a rootless mode to improve security and usability. While checkpoint/restore functionality is already available for rootful containers and unprivileged checkpointing is possible with the <code>CAP_CHECKPOINT_RESTORE</code> capability, container engines do not yet support native checkpointing of containers running in rootless mode. This project aims to explore and address the remaining challenges required to enable unprivileged checkpoint/restore for rootless containers.
 +
 
 +
'''Links:'''
 +
* https://github.com/checkpoint-restore/criu/pull/1930
 +
* https://github.com/torvalds/linux/commit/124ea650d3072b005457faed69909221c2905a1f
 +
* https://src.fedoraproject.org/rpms/criu/pull-request/10#request_diff
   −
This task does not require any Linux kernel modifications
  −
and scope is limited to CRIU itself. At the same time it's
  −
complex enough, as we need to touch the memory dump/restore code path
  −
in CRIU and also handle many corner cases, such as the page server.
  −
   
'''Details:'''
 
 
* Skill level: intermediate
 
* Language: C
+
* Language: C, Go
 
* Expected size: 350 hours
 
* Suggested by: Andrei Vagin <avagin@gmail.com>
+
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Alexander Mikhalitsyn <alexander@mihalicyn.com>, Andrei Vagin <avagin@gmail.com>
+
 
 +
=== Checkpointing of POSIX message queues ===
   −
=== Files on detached mounts ===
+
'''Summary:''' Add support for checkpoint/restore of POSIX message queues
   −
'''Summary:''' Initial support of open files on "detached" mounts
+
POSIX message queues are a widely used inter-process communication mechanism. Message queues are implemented as files on a virtual filesystem (mqueue), where a file descriptor (message queue descriptor) is used to perform operations such as sending or receiving messages. To support checkpoint/restore of POSIX message queues, we need a kernel interface (similar to [https://github.com/checkpoint-restore/criu/commit/8ce9e947051e43430eb2ff06b96dddeba467b4fd MSG_PEEK]) that would enable the retrieval of messages from a queue without removing them. This project aims to implement such an interface that allows retrieving all messages and their priorities from a POSIX message queue.
 +
 
 +
'''Links:'''
 +
* https://github.com/checkpoint-restore/criu/issues/2285
 +
* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/ipc/mqueue.c
 +
* https://www.man7.org/tlpi/download/TLPI-52-POSIX_Message_Queues.pdf
 +
 
 +
'''Details:'''
 +
* Skill level: intermediate
 +
* Language: C
 +
* Expected size: 350 hours
 +
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
 +
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
   −
When criu dumps a process with an open fd on a file, it gets the mount identifier (mnt_id) via /proc/<pid>/fdinfo/<fd>, so that criu knows exactly which mount the file was originally opened from. This way criu can restore the fd by opening the same file from the topologically equivalent mount in the restored mount tree.
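
The mnt_id lookup described above can be sketched in a few lines. This is only an illustration in Python (used for brevity), not CRIU's actual code: the helper names <code>parse_fdinfo</code> and <code>mnt_id_of_fd</code> and the <code>SAMPLE</code> text are hypothetical, and the sketch assumes the usual "key:<TAB>value" layout of fdinfo files on Linux.

```python
def parse_fdinfo(text):
    """Turn the 'key:<TAB>value' lines of an fdinfo file into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def mnt_id_of_fd(fd, pid="self"):
    """Return the mount ID the kernel reports for an open fd (Linux only)."""
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:
        return int(parse_fdinfo(f.read())["mnt_id"])


# Hypothetical fdinfo content; real values differ per system.
SAMPLE = "pos:\t0\nflags:\t0100000\nmnt_id:\t29\nino:\t1234\n"

if __name__ == "__main__":
    print(parse_fdinfo(SAMPLE)["mnt_id"])
```

The returned ID only helps if a mount with the same ID is also listed in /proc/<pid>/mountinfo.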
+
=== Add support for SCM_CREDENTIALS / SCM_PIDFD and friends ===
   −
Restoring the fd from the right mount can be important in different cases: for instance, if the process later wants to resolve paths relative to the fd (resolving from the same file on a different mount can lead to different resolved paths), or if the process wants to check the path to the file via /proc/<pid>/fd/<fd>.
+
'''Summary:''' Support for SCM_CREDENTIALS / SCM_PIDFD
   −
But we have a problem finding on which mount we need to reopen the file at restore if we only know mnt_id but can't find this mnt_id in /proc/<pid>/mountinfo.
+
SCM_CREDENTIALS and SCM_PIDFD are types of SCM (Socket-level Control Messages). They play a crucial role
 +
in systemd and many other user space applications. This project is about adding support for these
 +
SCMs so that CRIU can properly save and restore them. There is existing code in the OpenVZ CRIU fork,
 +
see [1] and [2]. The first goal is to port this code properly, cover it with extensive tests, and
 +
ensure that SCM_PIDFD / SO_PEERPIDFD are handled correctly. We also expect to cover socket options such as
 +
SO_PASSRIGHTS and SO_PASSPIDFD.
   −
The mountinfo file shows the mount tree topology of the current mntns: parent-child relations, sharing group information, mountpoint and fs root information. If we don't see mnt_id in it, we don't know anything about this mount.
+
There is an extra source of complexity here: pidfds can be "stale" (see PIDFD_STALE in the Linux kernel),
 +
and we need to ensure that we properly cover those cases.
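
To make the state in question concrete, the sketch below shows how a message queued on a Unix socket carries the sender's credentials as SCM_CREDENTIALS ancillary data, which is the kind of data a checkpoint would have to save and re-attach on restore. It is a minimal Linux-only illustration in Python (used for brevity), not CRIU code; the function name <code>received_credentials</code> is hypothetical, and the sketch assumes the <code>struct ucred</code> layout of pid, uid, gid.

```python
import os
import socket
import struct

# struct ucred { pid_t pid; uid_t uid; gid_t gid; } on Linux
UCRED = struct.Struct("iII")


def received_credentials():
    """Send one datagram over a socketpair and return the (pid, uid, gid)
    that the kernel attached to it as SCM_CREDENTIALS."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with sender, receiver:
        # Ask the kernel to attach sender credentials to received messages.
        receiver.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)
        sender.send(b"ping")
        _data, ancdata, _flags, _addr = receiver.recvmsg(
            16, socket.CMSG_SPACE(UCRED.size)
        )
        for level, ctype, cdata in ancdata:
            if level == socket.SOL_SOCKET and ctype == socket.SCM_CREDENTIALS:
                return UCRED.unpack(cdata[:UCRED.size])
    return None


if __name__ == "__main__":
    print(received_credentials())
```

Here the sender is the current process, so the credentials match its own pid, uid, and gid. For a message still sitting in a queue at dump time, the original sender may differ from the process that owns the socket or may no longer exist.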
   −
This can happen in two cases
+
'''Links:'''
 +
* [1] openvz-criu https://bitbucket.org/openvz/criu.ovz/history-node/918653a0a343194385592d7b50b5bd7a8fbe1cc1/criu/sk-unix.c?at=hci-dev
 +
* [2] openvz-criu https://bitbucket.org/openvz/criu.ovz/history-node/918653a0a343194385592d7b50b5bd7a8fbe1cc1/criu/sk-queue.c?at=hci-dev
 +
* [3] Linux kernel https://github.com/torvalds/linux/commit/5e2ff6704a275be009be8979af17c52361b79b89
 +
* [4] Linux kernel https://github.com/torvalds/linux/commit/c679d17d3f2d895b34e660673141ad250889831f
   −
* 1) external mount or file - if the file was opened from, e.g., the host, its mount would not be visible in the container's mountinfo
+
'''Details:'''
* 2) mount was lazily unmounted
+
* Skill level: intermediate / advanced
 +
* Language: C
 +
* Expected size: 350 hours
 +
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
 +
* Mentors: Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com>
   −
In case of 1) we have criu options to help criu handle external dependencies.
+
=== Integrate with Live Update Orchestrator (LUO) ===
   −
In case of 2), or if no options are provided, criu can't resolve mnt_id in mountinfo and fails.
+
'''Summary:''' Integrate with Live Update Orchestrator (LUO)
   −
'''Solution:'''
+
Live Update Orchestrator (LUO) is a framework for Linux kernel
We can handle 2) by resolving major/minor via fstat and using name_to_handle_at and open_by_handle_at to open the same file on any other available mount from the same superblock (same major/minor) in the container. This gives us fd2, which refers to the same file as fd but on an existing mount, so we can dump it as usual and mark it as "detached" in the image. On restore, criu then knows where to find this file, but instead of just opening fd2 from the restored mount, we create a temporary bind mount and lazily unmount it right after the open, making the file appear as a file on a detached mount.
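
The first step of this approach, finding candidate mounts on the same superblock, can be sketched as follows. This is a rough Python illustration (used for brevity), not CRIU's implementation: the helper names and the <code>SAMPLE</code> mountinfo text are hypothetical, and the file-handle step (name_to_handle_at / open_by_handle_at) is left out.

```python
import os


def dev_of_fd(fd):
    """Return (major, minor) of the device holding the file behind fd, via fstat."""
    st = os.fstat(fd)
    return (os.major(st.st_dev), os.minor(st.st_dev))


def parse_mountinfo(text):
    """Parse the leading fields of /proc/<pid>/mountinfo lines:
    mount ID, parent ID, major:minor, root, mount point."""
    mounts = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue
        major, minor = parts[2].split(":")
        mounts.append({
            "mnt_id": int(parts[0]),
            "dev": (int(major), int(minor)),
            "root": parts[3],
            "mountpoint": parts[4],
        })
    return mounts


def mounts_on_same_superblock(dev, mounts):
    """Keep only the mounts whose major:minor matches the fd's device."""
    return [m for m in mounts if m["dev"] == dev]


# Hypothetical mountinfo content; real files have more mounts and fields.
SAMPLE = (
    "22 1 8:1 / / rw,relatime - ext4 /dev/sda1 rw\n"
    "30 22 8:2 / /data rw,relatime - ext4 /dev/sda2 rw\n"
    "31 22 8:2 /sub /mnt/bind rw,relatime - ext4 /dev/sda2 rw\n"
)

if __name__ == "__main__":
    for m in mounts_on_same_superblock((8, 2), parse_mountinfo(SAMPLE)):
        print(m["mnt_id"], m["root"], m["mountpoint"])
```

A matching major/minor is only a first filter. A bind mount whose root is a subdirectory (mount 31 in the sample) exposes only part of the filesystem, so the file may not be reachable through it.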
+
live updates (via kexec). The idea behind it is to provide a kernel
 +
and user space API to save specific system resources across
 +
kexec reboot.
   −
Known problems with this approach:
+
This research project explores how CRIU can be integrated with LUO.
 +
For example, if a user is running memcached on a node, the current
 +
approach would require a full CRIU dump, then saving the entire
 +
process memory to disk, followed by restoring it after the
 +
kernel live update.
   −
* Stat on btrfs gives wrong major/minor
+
Instead, CRIU could be extended to leverage the LUO API. When instructed,
* File handles do not work everywhere
+
it could preserve selected memory regions directly across the kexec reboot,
* File handles can return fd2 on a deleted file or on another hard link; this needs special handling.
+
avoiding a full disk dump and significantly accelerating the restore process
 +
after the kernel update.
   −
Additionally (optional part):
+
'''Links:'''
We can export real major/minor in fdinfo (kernel).
+
* [1] LUO kernel documentation https://docs.kernel.org/core-api/liveupdate.html
We can think of a new kernel interface to get a mount's major/minor and root (shift from fsroot) for detached mounts; if we have it, we don't need the file handle hack to find the file on another mount (see the fsinfo or getvalues kernel patches on LKML; can we add this info there?).
+
* [2] LUO memfd doc https://docs.kernel.org/mm/memfd_preservation.html
    
'''Details:'''
 
* Skill level: intermediate
+
* Skill level: intermediate / advanced
 
* Language: C
 
 
* Expected size: 350 hours
 
* Mentor: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
+
* Suggested by: Andrei Vagin <avagin@gmail.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
+
* Mentors: Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com>
 +
 
 +
=== Optimize COW memory dumping ===
 +
 
 +
'''Summary:''' Optimize COW memory dumping
 +
 
 +
The Linux kernel memory management subsystem is highly optimized not only for performance, but also to minimize unnecessary memory consumption. A key example of this is how the kernel handles private VMAs when user space invokes the fork() system call.
 +
 
 +
Rather than duplicating the entire VMA tree along with all memory contents, the kernel creates optimized copies of inherited VMAs using the Copy-on-Write (COW) mechanism. When a process writes to a page within a COW-ed VMA, a write page fault occurs, and the kernel creates a private copy of that page before applying the modification. However, if the page is only read, no copying is performed.
 +
 
 +
This approach significantly improves fork() performance and can dramatically reduce memory usage in many workloads.
 +
 
 +
In CRIU, when dumping VMAs and their associated memory pages, this COW optimization is not currently taken into account during the dump phase. As a result, for COW-backed VMAs, CRIU may generate multiple copies of identical memory pages in the dump image.
   −
=== Checkpointing of POSIX message queues ===
+
During restore, however, CRIU explicitly handles this situation (see [1] and [2]) and attempts to reconstruct COW relationships inside the kernel. This step is critical: without it, a checkpoint/restore (C/R) cycle could lead to a substantial increase in memory consumption for the same process tree. For example, a workload that originally consumed 500 MiB could expand to 800 MiB after restore, which is clearly unacceptable.
   −
'''Summary:''' Add support for checkpoint/restore of POSIX message queues
+
This project aims to improve the dumping algorithm so that it avoids producing multiple unnecessary copies of identical pages belonging to COW-ed VMAs.
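
The redundancy in question can be illustrated with a toy model. The sketch below is not CRIU's dumping algorithm and not a proposed design: it only counts how many pages a naive per-process dump would store versus how many distinct (address, content) pairs exist when a child shares most pages with its parent. Python is used for brevity, and all names and the tiny four-page layout are hypothetical.

```python
import hashlib

PAGE_SIZE = 4096


def naive_page_count(dumps):
    """Pages written if every process dumps all of its pages independently."""
    return sum(len(pages) for pages in dumps.values())


def distinct_page_count(dumps):
    """Pages needed if identical pages at the same virtual address are
    stored only once across the process tree."""
    seen = set()
    for pages in dumps.values():
        for vaddr, content in pages.items():
            seen.add((vaddr, hashlib.sha256(content).digest()))
    return len(seen)


def make_example():
    """Parent with four pages; the child inherited them via fork()
    and has written to exactly one of them."""
    parent = {i * PAGE_SIZE: bytes([i]) * PAGE_SIZE for i in range(4)}
    child = dict(parent)
    child[2 * PAGE_SIZE] = b"\xff" * PAGE_SIZE  # the one page that was COW-copied
    return {"parent": parent, "child": child}


if __name__ == "__main__":
    dumps = make_example()
    print(naive_page_count(dumps), distinct_page_count(dumps))
```

In a real dump, comparing page contents would be only one possible way to detect sharing. Which mechanism to use is part of what the project would need to work out.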
   −
POSIX message queues are a widely used inter-process communication mechanism. Message queues are implemented as files on a virtual filesystem (mqueue), where a file descriptor (message queue descriptor) is used to perform operations such as sending or receiving messages. To support checkpoint/restore of POSIX message queues, we need a kernel interface (similar to [https://github.com/checkpoint-restore/criu/commit/8ce9e947051e43430eb2ff06b96dddeba467b4fd MSG_PEEK]) that would enable the retrieval of messages from a queue without removing them. This project aims to implement such an interface that allows retrieving all messages and their priorities from a POSIX message queue.
+
The project requires some understanding of Linux memory management internals and CRIU’s architecture. We strongly encourage GSoC contributors to study references [1] and [2] and experiment with the relevant code paths before applying. We are happy to answer questions and provide guidance along the way.
    
'''Links:'''
 
* https://github.com/checkpoint-restore/criu/issues/2285
+
* [1] preparing COW VMAs https://github.com/checkpoint-restore/criu/blob/c180188db036f8ea4c08bfee28cbcdbdd52cdfc3/criu/mem.c#L878
* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/ipc/mqueue.c
+
* [2] private vma content restore cow case https://github.com/checkpoint-restore/criu/blob/c180188db036f8ea4c08bfee28cbcdbdd52cdfc3/criu/mem.c#L1219
* https://www.man7.org/tlpi/download/TLPI-52-POSIX_Message_Queues.pdf
      
'''Details:'''
 
* Skill level: intermediate
+
* Skill level: intermediate / advanced
 
* Language: C
 
 
* Expected size: 350 hours
 
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
+
* Suggested by: Andrei Vagin <avagin@gmail.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
+
* Mentors: Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com>
    
== Suspended project ideas ==
 