Line 2:
Line 2:
[SoCC '24] On-demand and Parallel Checkpoint/Restore for GPU Applications
[EuroSys '24] Just-In-Time Checkpointing: Low Cost Error Recovery from Deep Learning Training Failures
[arXiv '23] PARALLELGPUOS: A Concurrent OS-level GPU Checkpoint and Restore System using Validated Speculation
[SC-W '23] Checkpoint/Restart for CUDA Kernels
[arXiv:2202.07848 '22] Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads
[Wiley '21] Cricket: A virtualization layer for distributed execution of CUDA applications with checkpoint/restart support
[EuroSys '20] Balancing efficiency and fairness in heterogeneous GPU clusters for deep learning
[HPEC '20] Using Container Migration for HPC Workloads Resilience
Line 22:
Line 35:
[Cloud '24] FastMig: Leveraging FastFreeze to Establish Robust Service Liquidity in Cloud 2.0
[CCGRID '24] Workload-Aware Live Migratable Cloud Instance Detector
[VLDB '23] ElasticNotebook: Enabling Live Migration for Computational Notebooks
[SRDS '23] Transparent Fault Tolerance for Stateful Applications in Kubernetes with Checkpoint/Restore
[ICFEC '23] Migration of Isolated Application Across Heterogeneous Edge Systems
[TNSM '23] Design, Modeling, and Implementation of Robust Migration of Stateful Edge Microservices
[WORDS '23] Evicting for the greater good: The case for Reactive Checkpointing in serverless computing
[Cloud Summit '23] Microservice Debugging with Checkpoint-Restart
[ICC '23] Processing-Aware Migration Model for Stateful Edge Microservices
[DRONES '23] A Dynamic Checkpoint Interval Decision Algorithm for Live Migration-Based Drone-Recovery System
[arXiv:2301.05861 '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level
[TOCS '22] H-Container: Enabling Heterogeneous-ISA Container Migration in Edge Computing
[VEE '22] Portkey: hypervisor-assisted container migration in nested cloud environments
[ICPADS '22] A Container Pre-copy Migration Method Based on Dirty Page Prediction and Compression
[NetSoft '22] Demonstration of Containerized Central Unit Live Migration in 5G Radio Access Network
[ATC '22] RRC: Responsive Replicated Containers
[HAL '22] Good Shepherds Care For Their Cattle: Seamless Pod Migration in Geo-Distributed Kubernetes
[ATC '21] MigrOS: Transparent Live-Migration Support for Containerised RDMA Applications
[WoWMoM '21] Extending the QUIC Protocol to Support Live Container Migration at the Edge
[MobileCloud '20] Docker Container Deployment in Distributed Fog Infrastructures with Checkpoint/Restart
* CRIU Acceleration
[EuroSys '24] Pronghorn: Effective Checkpoint Orchestration for Serverless Hot-Starts
[FGCS '24] Prebaking runtime environments to improve the FaaS cold start latency
[Middleware '23] DynaCut: A Framework for Dynamic and Adaptive Program Customization
[Virginia Tech '23] CRIU-RTX: Remote Thread eXecution using Checkpoint/Restore in Userspace
[Virginia Tech '23] HetMigrate: Secure and Efficient Cross-architecture Process Live Migration
[OSDI '23] No Provisioned Concurrency: Fast RDMA-codesigned Remote Fork for Serverless Computing
[SC '22] Out of hypervisor (OoH): efficient dirty page tracking in userspace using hardware virtualization features
[JNCA '22] iContainer: Consecutive checkpointing with rapid resilience for immortal container-based services
[VLSI '21] Standard-compliant parallel SystemC simulation of loosely-timed transaction level models: From baremetal to Linux-based applications support
[Middleware '20] Prebaking Functions to Warm the Serverless Cold Start
[MEMSYS '19] Fast in-memory CRIU for docker containers
[MCHPC '19] Optimizing Post-Copy Live Migration with System-Level Checkpoint Using Fabric-Attached Memory
Line 37:
Line 114:
[ATC '22] RRC: Responsive Replicated Containers
[NDSS '22] FitM: Binary-Only Coverage-Guided Fuzzing for Stateful Network Protocols
[SYSTEX '22] Transparent, Cross-ISA Enclave Offloading
[IPDPS '20] Fault-Tolerant Containers Using NiLiCon
Line 44:
Line 127:
[VLDB '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level
[VLDB '23] ElasticNotebook: Enabling Live Migration for Computational Notebooks
[arXiv:2301.05861 '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level
[EuroSys '21] On-demand-fork: a microsecond fork for memory-intensive and latency-sensitive applications