Changes

Coordinated Checkpointing of Distributed Applications
Line 135:
* Suggested by: Mike Rapoport <rppt@kernel.org>
 
* Mentors: Mike Rapoport <rppt@kernel.org>, Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com>
 
=== Coordinated checkpointing of distributed applications ===

'''Summary:''' Enable coordinated container checkpointing with Kubernetes.

Checkpointing support was recently introduced in Kubernetes, where the
smallest deployable unit is a Pod (a group of containers). Kubernetes is often
used to deploy applications that are distributed across multiple nodes.
However, checkpointing such distributed applications requires a coordination
mechanism to synchronize the checkpoint and restore operations. To address this
challenge, we have developed a tool called <code>criu-coordinator</code>
that relies on the action-script functionality of CRIU to enable synchronization
in distributed environments. This project aims to extend the tool so that it
integrates seamlessly with the checkpointing functionality of Kubernetes.

'''Links:'''
* https://github.com/checkpoint-restore/criu-coordinator
* https://lpc.events/event/18/contributions/1803/
* https://sched.co/1YeT4
* https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/

'''Details:'''
* Skill level: intermediate
* Language: Rust / Go / C
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>
* Suggested by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
    
== Suspended project ideas ==
 