Changes

3,646 bytes added, Wednesday at 02:55
m
no edit summary
Line 2: Line 2:
[SoCC '24] On-demand and Parallel Checkpoint/Restore for GPU Applications
+
+ [EuroSys '24] Just-In-Time Checkpointing: Low Cost Error Recovery from Deep Learning Training Failures
[arXiv '23] PARALLELGPUOS: A Concurrent OS-level GPU Checkpoint and Restore System using Validated Speculation
[SC-W '23] Checkpoint/Restart for CUDA Kernels
+
+ [arXiv:2202.07848 '22] Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads
+
+ [Wiley '21] Cricket: A virtualization layer for distributed execution of CUDA applications with checkpoint/restart support
+
+ [EuroSys '20] Balancing efficiency and fairness in heterogeneous GPU clusters for deep learning
+
+ [HPEC '20] Using Container Migration for HPC Workloads Resilience
+
+
+
Line 22: Line 35:
[Cloud '24] FastMig: Leveraging FastFreeze to Establish Robust Service Liquidity in Cloud 2.0
+
+ [CCGRID '24] Workload-Aware Live Migratable Cloud Instance Detector
+
+ [VLDB '23] ElasticNotebook: Enabling Live Migration for Computational Notebooks
+
+ [SRDS '23] Transparent Fault Tolerance for Stateful Applications in Kubernetes with Checkpoint/Restore
+
+ [ICFEC '23] Migration of Isolated Application Across Heterogeneous Edge Systems
+
+ [TNSM '23] Design, Modeling, and Implementation of Robust Migration of Stateful Edge Microservices
+
+ [WORDS '23] Evicting for the greater good: The case for Reactive Checkpointing in serverless computing
+
+ [Cloud Summit '23] Microservice Debugging with Checkpoint-Restart
+
+ [ICC '23] Processing-Aware Migration Model for Stateful Edge Microservices
+
+ [DRONES '23] A Dynamic Checkpoint Interval Decision Algorithm for Live Migration-Based Drone-Recovery System
+
+ [arXiv:2301.05861 '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level
[TOCS '22] H-Container: Enabling Heterogeneous-ISA Container Migration in Edge Computing
[VEE '22] Portkey: hypervisor-assisted container migration in nested cloud environments
+
+ [ICPADS '22] A Container Pre-copy Migration Method Based on Dirty Page Prediction and Compression
+
+ [NetSoft '22] Demonstration of Containerized Central Unit Live Migration in 5G Radio Access Network
+
+ [ATC '22] RRC: Responsive Replicated Containers
+
+ [HAL '22] Good Shepherds Care For Their Cattle: Seamless Pod Migration in Geo-Distributed Kubernetes
+
+ [ATC '21] MigrOS: Transparent Live-Migration Support for Containerised RDMA Applications
+
+ [WoWMoM '21] Extending the QUIC Protocol to Support Live Container Migration at the Edge
+
+ [MobileCloud '20] Docker Container Deployment in Distributed Fog Infrastructures with Checkpoint/Restart
+
+
+
+ * CRIU Acceleration
+
+ [EuroSys '24] Pronghorn: Effective Checkpoint Orchestration for Serverless Hot-Starts
+
+ [FGCS '24] Prebaking runtime environments to improve the FaaS cold start latency
+
+ [Middleware '23] DynaCut: A Framework for Dynamic and Adaptive Program Customization
+
+ [Virginia Tech '23] CRIU-RTX: Remote Thread eXecution using Checkpoint/Restore in Userspace
+
+ [Virginia Tech '23] HetMigrate: Secure and Efficient Cross-architecture Process Live Migration
+
+ [OSDI '23] No Provisioned Concurrency: Fast RDMA-codesigned Remote Fork for Serverless Computing
+
+ [SC '22] Out of hypervisor (OoH): efficient dirty page tracking in userspace using hardware virtualization features
+
+ [JNCA '22] iContainer: Consecutive checkpointing with rapid resilience for immortal container-based services
+
+ [VLSI '21] Standard-compliant parallel SystemC simulation of loosely-timed transaction level models: From baremetal to Linux-based applications support
+
+ [Middleware '20] Prebaking Functions to Warm the Serverless Cold Start
+
+ [MEMSYS '19] Fast in-memory CRIU for docker containers
+
+ [MCHPC '19] Optimizing Post-Copy Live Migration with System-Level Checkpoint Using Fabric-Attached Memory
+
+
Line 37: Line 114:
[ATC '22] RRC: Responsive Replicated Containers
+
+ [NDSS '22] FitM: Binary-Only Coverage-Guided Fuzzing for Stateful Network Protocols
+
+ [SYSTEX '22] Transparent, Cross-ISA Enclave Offloading
+
+ [IPDPS '20] Fault-Tolerant Containers Using NiLiCon
Line 44: Line 127:
[VLDB '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level
+
+ [VLDB '23] ElasticNotebook: Enabling Live Migration for Computational Notebooks
+
+ [arXiv:2301.05861 '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level
[EuroSys '21] On-demand-fork: a microsecond fork for memory-intensive and latency-sensitive applications