Security Enhanced Linux

From CRIU
Latest revision as of 17:26, 2 July 2024

SELinux protects the file system and the host from attacks that originate inside a container.

The initial SELinux policy for containers was written for a tool called virt-sandbox (https://www.berrange.com/posts/2012/01/17/building-application-sandboxes-with-libvirt-lxc-kvm/), which used libvirt, specifically libvirt-lxc, to launch containers. The first type was called svirt_lxc_t and is not allowed to have network access. Its successor, svirt_lxc_net_t, allows full network access. The type for content that the svirt_lxc types can manage is named svirt_sandbox_file_t.

This SELinux policy was later adopted by Docker, and the aliases container_t and container_file_t were created.

The container policy is defined in the container-selinux package. By default, containers run with the SELinux type container_t, regardless of which container engine launched them (e.g. podman, cri-o, docker, buildah, moby).

SELinux only allows a container_t process to read, write, and execute files labeled container_file_t.

The Docker daemon and Podman usually run as container_runtime_t, and the default label for content in /var/lib/docker and /var/lib/containers is container_var_lib_t.

SELinux Security Policy Example

On systems running SELinux, all processes and files are labeled with security-relevant information known as the SELinux context. For files, the context can be viewed with the ls -Z command; for processes, it can be viewed with the ps -Z command.
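
The same information can also be read programmatically. The following is a minimal sketch, assuming a Linux host where file contexts are stored in the security.selinux extended attribute and process contexts are exposed under /proc/<pid>/attr/current. The helper names file_context and process_context are hypothetical, and on a host running a different security module the process helper may return that module's label instead.

```python
import os


def file_context(path):
    """Return the SELinux context stored on a file, or None if unavailable."""
    try:
        # Roughly what `ls -Z` displays for the file.
        raw = os.getxattr(path, "security.selinux")
    except (OSError, AttributeError):
        # No such file, no label, or a platform without xattr support.
        return None
    return raw.rstrip(b"\0").decode()


def process_context(pid="self"):
    """Return the security context of a process, or None if unavailable."""
    try:
        # Roughly what `ps -Z` displays for the process.
        with open(f"/proc/{pid}/attr/current", "rb") as f:
            raw = f.read()
    except OSError:
        return None
    return raw.rstrip(b"\0\n").decode() or None


if __name__ == "__main__":
    print("this process:", process_context())
    print("/etc/passwd :", file_context("/etc/passwd"))
```

Both helpers return None instead of raising, so the sketch can also be run on hosts without SELinux.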


Example: system_u:system_r:container_t:s0:c356,c371

Description                    Label
SELinux user                   system_u
SELinux role                   system_r
Shared type                    container_t
Sensitivity level (MLS)        s0
Unique category pair (MCS)     c356,c371
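
The breakdown above can be sketched as a small parser. This is an illustration only: SELinuxContext and parse_context are hypothetical names, and the sketch assumes the simple user:role:type:level[:categories] form shown in the example. It does not handle level ranges or category ranges.

```python
from typing import NamedTuple, Tuple


class SELinuxContext(NamedTuple):
    user: str
    role: str
    setype: str
    sensitivity: str
    categories: Tuple[str, ...]


def parse_context(context: str) -> SELinuxContext:
    """Split a context such as 'system_u:system_r:container_t:s0:c356,c371'."""
    # The first three fields never contain ':'; the rest is the level.
    user, role, setype, level = context.split(":", 3)
    # The level is a sensitivity, optionally followed by ':' and categories.
    sensitivity, _, cats = level.partition(":")
    categories = tuple(cats.split(",")) if cats else ()
    return SELinuxContext(user, role, setype, sensitivity, categories)


if __name__ == "__main__":
    print(parse_context("system_u:system_r:container_t:s0:c356,c371"))
```

Two containers that share the type container_t are still kept apart by their differing category pairs, which is why the categories are kept as a separate field here.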

Using correct SELinux label to parasite socket

On a system with SELinux enabled, the socket used for communication between the parasite daemon and the main CRIU process needs to be correctly labeled.

In the case of Podman, CRIU is started from runc and runs as container_runtime_t. The parasite code runs with the same context as the container process (container_t).

CRIU interacts with the parasite code via a Unix socket, and allowing a container process to connect via a socket to something outside of the container is not desirable. Thus, CRIU first obtains the context of the root container process and tells SELinux to label the created socket with the same label as the root container process.

For this to work, the correct SELinux policies must be installed. On Fedora-based systems, they are part of the container-selinux package.

Note that the current implementation assumes that all processes to be checkpointed by CRIU are labeled with the same SELinux context, which is the default behaviour for most container engines.

If a child process has a different label, additional SELinux policies might be required.
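
The labeling step described above can be sketched as follows. This is not CRIU's code (CRIU is written in C). It is a minimal illustration, assuming the kernel's /proc/thread-self/attr/sockcreate interface, which sets the label applied to sockets subsequently created by the calling thread. The function names and the sockcreate_path parameter are hypothetical, and the socket type is chosen arbitrarily. On a host without SELinux, or without the required policy, the write is expected to fail with OSError.

```python
import os
import socket

# Assumption: per-thread attribute that labels sockets created afterwards.
SOCKCREATE = "/proc/thread-self/attr/sockcreate"


def label_of(pid):
    """Read the current context of a process, e.g. the container's root process."""
    with open(f"/proc/{pid}/attr/current", "rb") as f:
        return f.read().rstrip(b"\0\n").decode()


def _write_attr(path, value):
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, value)
    finally:
        os.close(fd)


def create_labeled_socket(label, sockcreate_path=SOCKCREATE):
    """Create a Unix socket while `label` is the socket creation context."""
    _write_attr(sockcreate_path, label.encode())
    try:
        return socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    finally:
        # An empty write is assumed to restore the default labeling behaviour.
        _write_attr(sockcreate_path, b"")
```

A caller would typically combine the two helpers, for example create_labeled_socket(label_of(root_pid)), where root_pid stands for the PID of the root container process.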

Checkpoint and restore any SELinux process label

For a successful container checkpoint and restore on an SELinux-enabled host, the restored container must have the same process context as before checkpointing.

During dump, CRIU stores the label of each process so that it can be restored. For processes started from the command line, which usually run as unconfined_t, this just works. For containers, an additional policy is needed, which is provided by the latest container-selinux package. This policy allows CRIU (when running as container_runtime_t) to transition the restored process to container_t.

Restoring a process that runs under systemd's control (unconfined_service_t) without additional policies is likely to fail because CRIU will not be allowed to change the context of the restored process.

For every checkpoint/restore use case on SELinux-enabled systems other than container processes and command-line/shell processes, a dyntransition permission must be granted between the old and new security contexts.

Restoring a multi-threaded process with SELinux

SELinux does not always support changing the process context of a multi-threaded process. Changing the context of a running multi-threaded process is allowed only if the new security context is bounded by the old security context.

To restore a process without requiring the new security context to be bounded by the old one, CRIU sets the SELinux process context before creating the threads. Thus, all threads are created with the process context of the main process.
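
The ordering matters more than the mechanism, and it can be illustrated with a short sketch. This is not how CRIU is implemented (CRIU does this in C during restore). The sketch assumes that a process changes its own context by writing to /proc/thread-self/attr/current, and the names restore_threads and current_path are hypothetical.

```python
import os
import threading

# Assumption: writing here changes the context of the calling thread.
PROC_CURRENT = "/proc/thread-self/attr/current"


def restore_threads(label, thread_bodies, current_path=PROC_CURRENT):
    """Switch to `label` first, then start one thread per callable."""
    # Step 1: change the context while the process is still single-threaded,
    # so the "bounded by the old context" restriction does not apply.
    fd = os.open(current_path, os.O_WRONLY)
    try:
        os.write(fd, label.encode())
    finally:
        os.close(fd)

    # Step 2: only now create the threads; they start with the new context.
    threads = [threading.Thread(target=body) for body in thread_bodies]
    for t in threads:
        t.start()
    return threads
```

Reversing the two steps would require changing the context of an already multi-threaded process, which is the case the bounded-context rule restricts.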