Kubernetes
[Figure: Overview of container checkpoint/restore in Kubernetes.]
Latest revision as of 12:51, 19 November 2025
Container checkpointing was introduced as an alpha feature in Kubernetes v1.25 and graduated to beta in Kubernetes v1.30. This functionality allows running containers to be transparently checkpointed to persistent storage and later restored to resume execution, or migrated across nodes and clusters.
The content of container checkpoints can be further analyzed with the checkpointctl tool (https://github.com/checkpoint-restore/checkpointctl). This makes it possible to perform forensic analysis after security incidents (e.g., suspected compromise, data exfiltration) or application failures by inspecting the saved process memory, open files, sockets, and execution context captured in the checkpoint.
This feature is developed as a community-driven effort by the Kubernetes Checkpoint/Restore Working Group (https://github.com/kubernetes/community/tree/master/wg-checkpoint-restore). If you want to get more involved by contributing to Kubernetes, join our mailing list (https://groups.google.com/a/kubernetes.io/g/wg-checkpoint-restore) and the #wg-checkpoint-restore Slack channel (https://kubernetes.slack.com/messages/wg-checkpoint-restore).
Kubelet Checkpoint API
This functionality is exposed through a node-local kubelet checkpoint API (https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api/), which is enabled by default since Kubernetes v1.30. A checkpoint can be triggered by sending an HTTP POST request to the kubelet as follows:
curl -X POST "https://localhost:10250/checkpoint/<Namespace>/<Pod>/<Container>"
Triggering this kubelet API requests the creation of a checkpoint from the container runtime (e.g., containerd or CRI-O). In turn, the container runtime requests a checkpoint from the low-level runtime (e.g., runc), which invokes CRIU.
Once the checkpoint has been created, it is saved in /var/lib/kubelet/checkpoints as a tar archive named checkpoint-<pod>_<namespace>-<container>-<timestamp>.tar.
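The endpoint layout and the archive naming scheme described above can be captured in two small shell helpers. This is only a minimal sketch: the function names checkpoint_url and checkpoint_glob and the KUBELET_ADDR variable are hypothetical and introduced here for illustration; the host, port, and path layout are the ones given in the text.

```shell
# Hypothetical helpers (not part of Kubernetes); KUBELET_ADDR is an assumed
# override for the kubelet address used in the text.
KUBELET_ADDR="${KUBELET_ADDR:-https://localhost:10250}"

# Print the kubelet checkpoint endpoint for a container.
# Usage: checkpoint_url <namespace> <pod> <container>
checkpoint_url() {
    printf '%s/checkpoint/%s/%s/%s\n' "$KUBELET_ADDR" "$1" "$2" "$3"
}

# Print a glob pattern for the archives written for a container.
# Note the order in the file name: pod first, then namespace.
# Usage: checkpoint_glob <namespace> <pod> <container>
checkpoint_glob() {
    printf '/var/lib/kubelet/checkpoints/checkpoint-%s_%s-%s-*.tar\n' "$2" "$1" "$3"
}
```

For example, checkpoint_url default counters counter prints the URL that would be passed to curl -X POST.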
Usage Example
1. Creating a Pod with a single container
cat > pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: counters
spec:
  containers:
  - name: counter
    image: busybox:latest
    command: ['sh', '-c', 'i=0; while true; do echo $i; i=$((i+1)); sleep 1; done']
EOF
kubectl apply -f pod.yaml
The following command can be used to verify that the container is running:
kubectl logs -f -c counter counters
2. Creating client-admin.crt and client-admin.key files
These certificate and key files will be used to authenticate requests to the checkpoint API:
kubectl config view --raw --minify -o jsonpath='{.users[0].user.client-certificate-data}' \
| base64 -d > client-admin.crt
kubectl config view --raw --minify -o jsonpath='{.users[0].user.client-key-data}' \
| base64 -d > client-admin.key
chmod 600 client-admin.key
3. Creating a checkpoint of the running container
Note that the --insecure option is necessary for curl to accept the kubelet's self-signed certificate.
curl --insecure \
  --cert client-admin.crt \
  --key client-admin.key \
  -X POST "https://localhost:10250/checkpoint/default/counters/counter"
Once the checkpoint has been created, it should be available at /var/lib/kubelet/checkpoints/checkpoint-<pod>_<namespace>-<container>-<timestamp>.tar
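When several checkpoints of the same container exist, a script usually needs the path of the most recent one. The following sketch relies on the naming scheme shown above and on the timestamp format seen in the examples on this page (for instance 2025-11-07T11:58:58Z), which sorts chronologically when sorted as text. The function name latest_checkpoint is hypothetical.

```shell
# Print the newest checkpoint archive for a container; return non-zero if
# there is none. Hypothetical helper, assuming timestamps of the form
# YYYY-MM-DDThh:mm:ssZ in the archive name.
# Usage: latest_checkpoint <dir> <namespace> <pod> <container>
latest_checkpoint() {
    local dir="$1" ns="$2" pod="$3" ctr="$4" newest="" f
    # Requiring a date right after the container name avoids matching
    # containers whose name merely starts with "<container>-".
    for f in "$dir/checkpoint-${pod}_${ns}-${ctr}-"[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T*.tar; do
        [ -e "$f" ] || continue   # the glob matched nothing
        newest="$f"               # globs expand in sorted order, so the last match is the newest
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

Because /var/lib/kubelet/checkpoints is typically readable only by root, a call such as latest_checkpoint /var/lib/kubelet/checkpoints default counters counter may need to run in a root shell.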
Forensic Analysis
Once a container checkpoint has been created, its content can be analyzed with the help of the checkpointctl tool.
Overview of Checkpoints
checkpointctl provides list and show commands that display an overview of checkpoints stored in /var/lib/kubelet/checkpoints.
$ sudo checkpointctl list
Listing checkpoints in path: /var/lib/kubelet/checkpoints/
NAMESPACE  POD       CONTAINER  ENGINE      TIME CHECKPOINTED    CHECKPOINT NAME
---------  ---       ---------  ------      -----------------    ---------------
default    counters  counter    containerd  07 Nov 25 11:58 UTC  checkpoint-counters_default-counter-2025-11-07T11:58:58Z.tar
default    counters  counter    containerd  07 Nov 25 12:09 UTC  checkpoint-counters_default-counter-2025-11-07T12:09:07Z.tar
default    counters  counter    containerd  07 Nov 25 12:30 UTC  checkpoint-counters_default-counter-2025-11-07T12:30:00Z.tar

$ sudo checkpointctl show /var/lib/kubelet/checkpoints/checkpoint-counters_default-counter-2025-11-07T11:58:58Z.tar
Displaying container checkpoint data from /var/lib/kubelet/checkpoints/checkpoint-counters_default-counter-2025-11-07T11:58:58Z.tar

CONTAINER  IMAGE                             ID            RUNTIME                CREATED               ENGINE      CHKPT SIZE  ROOT FS DIFF SIZE
---------  -----                             --            -------                -------               ------      ----------  -----------------
counter    docker.io/library/busybox:latest  52d907dc8f75  io.containerd.runc.v2  2025-11-07T11:48:41Z  containerd  306.8 KiB   270 B
Low-level Analysis
The checkpointctl inspect command can be used to perform low-level analysis of the checkpoint data.
$ sudo checkpointctl inspect --files --ps-tree --metadata /var/lib/kubelet/checkpoints/checkpoint-counters_default-counter-2025-11-07T11:58:58Z.tar
Displaying container checkpoint tree view from /var/lib/kubelet/checkpoints/checkpoint-counters_default-counter-2025-11-07T11:58:58Z.tar
counter
├── Image: docker.io/library/busybox:latest
├── ID: 52d907dc8f75c8a60b366c2fca70839b9505c9da909ef4ae4f90a1c59ccd69ba
├── Runtime: io.containerd.runc.v2
├── Created: 2025-11-07T11:48:41Z
├── Checkpointed: 2025-11-07T11:58:58Z
├── Engine: containerd
├── Checkpoint size: 306.8 KiB
│   └── Memory pages size: 292.0 KiB
├── Root FS diff size: 270 B
├── Metadata
│   ├── Pod name: counters
│   ├── Kubernetes namespace: default
│   └── Annotations
│       ├── io.kubernetes.cri.sandbox-name: counters
│       ├── io.kubernetes.cri.sandbox-namespace: default
│       ├── io.kubernetes.cri.sandbox-uid: 430de1f2-cb7b-4c96-8ea7-ba51d335845f
│       ├── io.kubernetes.cri.container-name: counter
│       ├── io.kubernetes.cri.container-type: container
│       ├── io.kubernetes.cri.image-name: busybox:latest
│       └── io.kubernetes.cri.sandbox-id: ee53903d4146165817d0a95e3cfd95340cb9f3bc1852ff28031e43d97e765d88
└── Process tree
    └── [1] sh
        ├── Open files
        │   ├── [REG 0] /dev/null
        │   ├── [PIPE 1] pipe[4398338]
        │   ├── [PIPE 2] pipe[4398339]
        │   ├── [cwd] /
        │   └── [root] /
        └── [623] sleep
            └── Open files
                ├── [REG 0] /dev/null
                ├── [PIPE 1] pipe[4398338]
                ├── [PIPE 2] pipe[4398339]
                ├── [cwd] /
                └── [root] /
Memory Forensics
The checkpointctl memparse command can be used to analyze the memory pages of individual processes in the container checkpoint. When used without any options, this command will display a table with an overview of the processes: their names, IDs, and memory sizes. The --pid option can be used to specify a process to analyze. The --search and --search-regex options can be used to search for a string or regex pattern in the memory pages.
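As an illustration of how the options above combine, the following wrapper searches the memory of one process for a string. It is a sketch only: the function name search_checkpoint_memory and the CHECKPOINTCTL override variable are hypothetical, and the exact argument order accepted by checkpointctl is an assumption; only the memparse subcommand and the --pid and --search options come from the text.

```shell
# Hypothetical wrapper around "checkpointctl memparse".
# CHECKPOINTCTL is an assumed override, e.g. to point at a specific binary.
# Usage: search_checkpoint_memory <archive> <pid> <string>
search_checkpoint_memory() {
    # Passing --search-regex instead of --search would match a regex pattern.
    "${CHECKPOINTCTL:-checkpointctl}" memparse --pid "$2" --search "$3" "$1"
}
```

For the checkpoint shown above, search_checkpoint_memory <archive> 1 <string> would search the memory pages of the sh process (PID 1).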
Restoring Container within Kubernetes
To restore a checkpointed container in Kubernetes, the checkpoint archive must first be converted into an OCI image that can be pushed to a registry.
Creating an OCI Image from a Checkpoint
The checkpointctl build command creates an OCI image from a checkpoint archive. It extracts container metadata from the checkpoint and uses Buildah (https://github.com/containers/buildah) to create an annotated image, so that the container runtime recognizes that the image contains a checkpoint.
checkpointctl build checkpoint.tar quay.io/foo/bar:latest
Once the image has been created, it can be pushed to a container registry:
buildah push quay.io/foo/bar:latest
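When many checkpoints are converted, reusing the latest tag makes it hard to tell images apart. One possible convention, shown here purely as an assumption and not as something checkpointctl prescribes, is to derive the tag from the archive file name. The function name checkpoint_image_ref is hypothetical.

```shell
# Derive an image reference from a checkpoint archive name.
# Hypothetical helper; the tagging convention is an illustrative assumption.
# Usage: checkpoint_image_ref <repository> <archive-path>
checkpoint_image_ref() {
    local repo="$1" name tag
    name=$(basename "$2" .tar)                 # strip directory and .tar suffix
    tag=$(printf '%s' "$name" | tr ':' '-')    # ':' is not allowed in image tags
    printf '%s:%s\n' "$repo" "$tag"
}

# Possible use with the commands shown above (not executed here):
#   ref=$(checkpoint_image_ref quay.io/foo/bar "$archive")
#   checkpointctl build "$archive" "$ref"
#   buildah push "$ref"
```

Image tags are limited to 128 characters, so very long pod or container names may need to be shortened.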
Restoring Container
To restore a container from a checkpoint, specify the OCI image containing the checkpoint in the container's image field. When creating a container, CRI-O and containerd detect OCI images with a checkpoint annotation and, instead of starting the container normally, restore it from the checkpoint. The following example shows how the YAML file used above can be modified to restore the container from a checkpoint:
cat > restore-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: counters
spec:
  containers:
  - name: counter
    image: quay.io/foo/bar:latest # Replace with checkpoint image URI
EOF
kubectl apply -f restore-pod.yaml
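A simple way to confirm that the counter container was restored rather than started from scratch is to look at the first value it prints: a fresh start begins at 0, while a restored container should continue from the checkpointed value. The sketch below assumes that the restored pod's log begins with the first value printed after the restore; the function name resumed_from_checkpoint is hypothetical.

```shell
# Succeed if the first line on stdin is an integer greater than zero.
# Hypothetical helper for the counter example used on this page.
resumed_from_checkpoint() {
    local first
    IFS= read -r first || return 1
    case "$first" in
        ''|*[!0-9]*) return 1 ;;   # empty, or not a plain non-negative integer
    esac
    [ "$first" -gt 0 ]
}

# Possible use (not executed here):
#   kubectl logs -c counter counters | resumed_from_checkpoint && echo "restored"
```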
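Related Publications, Talks & Blog Posts
- Research Papers
  - Kubernetes Scheduling with Checkpoint/Restore: Challenges and Open Problems (https://radostin.io/files/vspisakova-jsspp25.pdf)
  - CRIUgpu: Transparent Checkpointing of GPU-Accelerated Workloads (https://doi.org/10.48550/arXiv.2502.16631)
  - Towards Efficient End-to-End Encryption for Container Checkpointing Systems (https://doi.org/10.1145/3678015.3680477)
- KubeCon & CloudNative Talks
  - Efficient Transparent Checkpointing of AI ML Workloads in Kubernetes (https://kccnceu2025.sched.com/event/1tx7i)
  - End-to-End Encryption for Container Checkpointing in Kubernetes (https://sched.co/1dCVs)
  - Enabling Coordinated Checkpointing for Distributed HPC Applications (https://sched.co/1YeT4)
- Kubernetes Blog Articles
  - Forensic Container Analysis (https://kubernetes.io/blog/2023/03/10/forensic-container-analysis/)
  - Forensic Container Checkpointing in Kubernetes (https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/)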