Difference between revisions of "Main Page"

From CRIU
Line 108:
<div class="m_center">
== Academic Research ==
- {{:User talk:Wenhuizhang}}
+ {{:Academic Research}}
</div>

Revision as of 03:49, 26 February 2023

Download
Tarball: criu-4.1.tar.gz
Version: 4.1 "CRISCV"
Released: 25 Mar 2025
GIT tag: v4.1
Installation
Usage
Releases

Welcome to CRIU, a project to implement checkpoint/restore functionality for Linux.

Checkpoint/Restore In Userspace, or CRIU (pronounced kree-oo, IPA: /krɪʊ/, Russian: криу), is a software tool for Linux. It can freeze a running container (or an individual application) and checkpoint its state to disk. The saved data can then be used to restore the application and run it exactly as it was at the time of the freeze. This functionality makes application or container live migration, snapshots, remote debugging, and many other things possible.
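
The freeze, checkpoint, and restore cycle described above can be sketched from a script. The following is a minimal illustration, not an official recipe: the helper names `criu_dump_argv` and `criu_restore_argv`, the PID 1234, and the directory `/tmp/ckpt` are placeholders chosen for this example. It only builds the command lines for the `criu dump` and `criu restore` actions, using the `-t` (process tree), `-D` (images directory), and `--shell-job` options, and prints them without running anything.

```python
import shlex
from typing import List


def criu_dump_argv(pid: int, images_dir: str, shell_job: bool = True) -> List[str]:
    """Build the command line that checkpoints the process tree rooted at pid.

    Hypothetical helper for illustration; it does not run CRIU itself.
    """
    argv = ["criu", "dump", "-t", str(pid), "-D", images_dir]
    if shell_job:
        # Needed when the task was started from a shell and shares its terminal.
        argv.append("--shell-job")
    return argv


def criu_restore_argv(images_dir: str, shell_job: bool = True) -> List[str]:
    """Build the command line that restores a tree from images in images_dir."""
    argv = ["criu", "restore", "-D", images_dir]
    if shell_job:
        argv.append("--shell-job")
    return argv


if __name__ == "__main__":
    # Placeholder PID and directory; substitute real values before use.
    print(shlex.join(criu_dump_argv(1234, "/tmp/ckpt")))
    print(shlex.join(criu_restore_argv("/tmp/ckpt")))
```

To actually perform the checkpoint, the resulting list can be passed to `subprocess.run(argv, check=True)`. This typically requires elevated privileges and an existing images directory.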

CRIU started as a project of Virtuozzo and has grown with tremendous help from the community. It is currently integrated into OpenVZ, LXC/LXD, Docker, Podman, and other software, and is packaged for many Linux distributions.


Using

Getting packages for your distribution
Or try manual installation to have CRIU on your system

CLI, RPC and C API
Three ways to start using the C/R functionality. More info about APIs.
Usage scenarios
Ideas for how CRIU can be used (some are crazy indeed)
Category:HOWTO
A collection of real-world examples of how to use CRIU; some are complex, some are not. The HOWTO on dumping a simple loop might be the best one to start with. There is also a set of asciinema recordings of real-life examples.
FAQ & When C/R fails
A sort of troubleshooting guide
What can change after C/R
CRIU cannot (yet) save and restore every single bit of a task's state. This page describes which bits, among those visible through the standard kernel API, may change.
What cannot be checkpointed
What an application could do to make CRIU refuse to dump it.
Contacts
Ways to communicate with the CRIU community

Developing

If you're interested in CRIU development, please subscribe to the criu mailing list: https://lists.openvz.org/mailman/listinfo/criu

Images
Description of the image file format
Plugins
CRIU can call user-provided plugins
Upstream kernel commits
Mainline kernel commits tracker
Recent commits
CRIU tool repository commits
Manpages
Tracker for kernel man-pages commits
ZDTM Test Suite
Zero downtime test suite
TODO
Current TODO list
User namespace
Implementing user namespace support
Postulates
What to keep in mind when writing new code
Code coverage results
Shows how a ZDTM run covers the CRIU code paths
How to submit patches


Academic Research

  • GPU CRIU

[arXiv:2502.16631 '25] CRIUgpu: Transparent Checkpointing of GPU-Accelerated Workloads

[SoCC '24] On-demand and Parallel Checkpoint/Restore for GPU Applications

[EuroSys '24] Just-In-Time Checkpointing: Low Cost Error Recovery from Deep Learning Training Failures

[arXiv '23] PARALLELGPUOS: A Concurrent OS-level GPU Checkpoint and Restore System using Validated Speculation

[SC-W '23] Checkpoint/Restart for CUDA Kernels

[arXiv:2202.07848 '22] Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads

[Wiley '21] Cricket: A virtualization layer for distributed execution of CUDA applications with checkpoint/restart support

[EuroSys '20] Balancing efficiency and fairness in heterogeneous GPU clusters for deep learning

[HPEC '20] Using Container Migration for HPC Workloads Resilience




  • CRIU for Migration

[APNet '24] Software-based Live Migration for Containerized RDMA

[SEATED '24] Live Migration of Multi-Container Kubernetes Pods in Multi-Cluster Serverless Edge Systems

[ICT '24] Packet Buffering to Minimize Service Downtime and Packet Loss During Redundancy Switchover

[SIGMOD/PODS '24] Demonstration of ElasticNotebook: Migrating Live Computational Notebook States

[ICDCS '24] Dapper: A Lightweight and Extensible Framework for Live Program State Rewriting

[Cloud '24] FastMig: Leveraging FastFreeze to Establish Robust Service Liquidity in Cloud 2.0

[CCGRID '24] Workload-Aware Live Migratable Cloud Instance Detector

[VLDB '23] ElasticNotebook: Enabling Live Migration for Computational Notebooks

[SRDS '23] Transparent Fault Tolerance for Stateful Applications in Kubernetes with Checkpoint/Restore

[ICFEC '23] Migration of Isolated Application Across Heterogeneous Edge Systems

[TNSM '23] Design, Modeling, and Implementation of Robust Migration of Stateful Edge Microservices

[WORDS '23] Evicting for the greater good: The case for Reactive Checkpointing in serverless computing

[Cloud Summit '23] Microservice Debugging with Checkpoint-Restart

[ICC '23] Processing-Aware Migration Model for Stateful Edge Microservices

[DRONES '23] A Dynamic Checkpoint Interval Decision Algorithm for Live Migration-Based Drone-Recovery System

[arXiv:2301.05861 '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level

[TOCS '22] H-Container: Enabling Heterogeneous-ISA Container Migration in Edge Computing

[VEE '22] Portkey: hypervisor-assisted container migration in nested cloud environments

[ICPADS '22] A Container Pre-copy Migration Method Based on Dirty Page Prediction and Compression

[NetSoft '22] Demonstration of Containerized Central Unit Live Migration in 5G Radio Access Network

[ATC '22] RRC: Responsive Replicated Containers

[HAL '22] Good Shepherds Care For Their Cattle: Seamless Pod Migration in Geo-Distributed Kubernetes

[ATC '21] MigrOS: Transparent Live-Migration Support for Containerised RDMA Applications

[WoWMoM '21] Extending the QUIC Protocol to Support Live Container Migration at the Edge

[MobileCloud '20] Docker Container Deployment in Distributed Fog Infrastructures with Checkpoint/Restart


  • CRIU Acceleration

[EuroSys '24] Pronghorn: Effective Checkpoint Orchestration for Serverless Hot-Starts

[FGCS '24] Prebaking runtime environments to improve the FaaS cold start latency

[Middleware '23] DynaCut: A Framework for Dynamic and Adaptive Program Customization

[Virginia Tech '23] CRIU-RTX: Remote Thread eXecution using Checkpoint/Restore in Userspace

[Virginia Tech '23] HetMigrate: Secure and Efficient Cross-architecture Process Live Migration

[OSDI '23] No Provisioned Concurrency: Fast RDMA-codesigned Remote Fork for Serverless Computing

[SC '22] Out of hypervisor (OoH): efficient dirty page tracking in userspace using hardware virtualization features

[JNCA '22] iContainer: Consecutive checkpointing with rapid resilience for immortal container-based services

[VLSI '21] Standard-compliant parallel SystemC simulation of loosely-timed transaction level models: From baremetal to Linux-based applications support

[Middleware '20] Prebaking Functions to Warm the Serverless Cold Start

[MEMSYS '19] Fast in-memory CRIU for docker containers

[MCHPC '19] Optimizing Post-Copy Live Migration with System-Level Checkpoint Using Fabric-Attached Memory



  • CRIU Security

[APSys '24] Towards Efficient End-to-End Encryption for Container Checkpointing Systems

[eBPF '24] Custom Page Fault Handling With eBPF

[ARES '24] Don't, Stop, Drop, Pause: Forensics of CONtainer CheckPOINTs (ConPoint)

[ATC '22] RRC: Responsive Replicated Containers

[NDSS '22] FitM: Binary-Only Coverage-Guided Fuzzing for Stateful Network Protocols

[SYSTEX '22] Transparent, Cross-ISA Enclave Offloading

[IPDPS '20] Fault-Tolerant Containers Using NiLiCon


  • CRIU for Database

[Journal of Cloud Computing '24] MDB-KCP: persistence framework of in-memory database with CRIU-based container checkpoint in Kubernetes

[VLDB '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level

[VLDB '23] ElasticNotebook: Enabling Live Migration for Computational Notebooks

[arXiv:2301.05861 '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level

[EuroSys '21] On-demand-fork: a microsecond fork for memory-intensive and latency-sensitive applications

Misc