Academic Research
Last edited by Wenhuizhang
Latest revision as of 02:55, 18 December 2024
- GPU CRIU
[SoCC '24] On-demand and Parallel Checkpoint/Restore for GPU Applications
[EuroSys '24] Just-In-Time Checkpointing: Low Cost Error Recovery from Deep Learning Training Failures
[arXiv '23] PARALLELGPUOS: A Concurrent OS-level GPU Checkpoint and Restore System using Validated Speculation
[SC-W '23] Checkpoint/Restart for CUDA Kernels
[arXiv:2202.07848 '22] Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads
[Wiley '21] Cricket: A virtualization layer for distributed execution of CUDA applications with checkpoint/restart support
[EuroSys '20] Balancing efficiency and fairness in heterogeneous GPU clusters for deep learning
[HPEC '20] Using Container Migration for HPC Workloads Resilience
- CRIU for Migration
[APNet '24] Software-based Live Migration for Containerized RDMA
[SEATED '24] Live Migration of Multi-Container Kubernetes Pods in Multi-Cluster Serverless Edge Systems
[ICT '24] Packet Buffering to Minimize Service Downtime and Packet Loss During Redundancy Switchover
[SIGMOD/PODS '24] Demonstration of ElasticNotebook: Migrating Live Computational Notebook States
[ICDCS '24] Dapper: A Lightweight and Extensible Framework for Live Program State Rewriting
[Cloud '24] FastMig: Leveraging FastFreeze to Establish Robust Service Liquidity in Cloud 2.0
[CCGRID '24] Workload-Aware Live Migratable Cloud Instance Detector
[VLDB '23] ElasticNotebook: Enabling Live Migration for Computational Notebooks
[SRDS '23] Transparent Fault Tolerance for Stateful Applications in Kubernetes with Checkpoint/Restore
[ICFEC '23] Migration of Isolated Application Across Heterogeneous Edge Systems
[TNSM '23] Design, Modeling, and Implementation of Robust Migration of Stateful Edge Microservices
[WORDS '23] Evicting for the greater good: The case for Reactive Checkpointing in serverless computing
[Cloud Summit '23] Microservice Debugging with Checkpoint-Restart
[ICC '23] Processing-Aware Migration Model for Stateful Edge Microservices
[DRONES '23] A Dynamic Checkpoint Interval Decision Algorithm for Live Migration-Based Drone-Recovery System
[arXiv:2301.05861 '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level
[TOCS '22] H-Container: Enabling Heterogeneous-ISA Container Migration in Edge Computing
[VEE '22] Portkey: hypervisor-assisted container migration in nested cloud environments
[ICPADS '22] A Container Pre-copy Migration Method Based on Dirty Page Prediction and Compression
[NetSoft '22] Demonstration of Containerized Central Unit Live Migration in 5G Radio Access Network
[ATC '22] RRC: Responsive Replicated Containers
[HAL '22] Good Shepherds Care For Their Cattle: Seamless Pod Migration in Geo-Distributed Kubernetes
[ATC '21] MigrOS: Transparent Live-Migration Support for Containerised RDMA Applications
[WoWMoM '21] Extending the QUIC Protocol to Support Live Container Migration at the Edge
[MobileCloud '20] Docker Container Deployment in Distributed Fog Infrastructures with Checkpoint/Restart
- CRIU Acceleration
[EuroSys '24] Pronghorn: Effective Checkpoint Orchestration for Serverless Hot-Starts
[FGCS '24] Prebaking runtime environments to improve the FaaS cold start latency
[Middleware '23] DynaCut: A Framework for Dynamic and Adaptive Program Customization
[Virginia Tech '23] CRIU-RTX: Remote Thread eXecution using Checkpoint/Restore in Userspace
[Virginia Tech '23] HetMigrate: Secure and Efficient Cross-architecture Process Live Migration
[OSDI '23] No Provisioned Concurrency: Fast RDMA-codesigned Remote Fork for Serverless Computing
[SC '22] Out of hypervisor (OoH): efficient dirty page tracking in userspace using hardware virtualization features
[JNCA '22] iContainer: Consecutive checkpointing with rapid resilience for immortal container-based services
[VLSI '21] Standard-compliant parallel SystemC simulation of loosely-timed transaction level models: From baremetal to Linux-based applications support
[Middleware '20] Prebaking Functions to Warm the Serverless Cold Start
[MEMSYS '19] Fast in-memory CRIU for docker containers
[MCHPC '19] Optimizing Post-Copy Live Migration with System-Level Checkpoint Using Fabric-Attached Memory
- CRIU Security
[APSys '24] Towards Efficient End-to-End Encryption for Container Checkpointing Systems
[eBPF '24] Custom Page Fault Handling With eBPF
[ARES '24] Don't, Stop, Drop, Pause: Forensics of CONtainer CheckPOINTs (ConPoint)
[ATC '22] RRC: Responsive Replicated Containers
[NDSS '22] FitM: Binary-Only Coverage-Guided Fuzzing for Stateful Network Protocols
[SYSTEX '22] Transparent, Cross-ISA Enclave Offloading
[IPDPS '20] Fault-Tolerant Containers Using NiLiCon
- CRIU for Database
[Journal of Cloud Computing '24] MDB-KCP: persistence framework of in-memory database with CRIU-based container checkpoint in Kubernetes
[VLDB '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level
[VLDB '23] ElasticNotebook: Enabling Live Migration for Computational Notebooks
[arXiv:2301.05861 '23] Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level
[EuroSys '21] On-demand-fork: a microsecond fork for memory-intensive and latency-sensitive applications