Changes

4,305 bytes added ,  23:42, 1 April 2019
Describe C/R support for SELinux enabled systems based on commits 8eb4309, 796da06 and e86c2e9
Line 1: Line 1: −
Here's what we've found so far:
+
SELinux protects the file system and the host from attacks that originate inside a container.
   −
* Mapping a parasite code into unprivileged process with <code>ANON | EXEC | READ</code> flag set hits the default selinux restrictions.
+
The initial SELinux policy for containers was written for a tool called virt-sandbox, which used libvirt (specifically libvirt-lxc) to launch containers.
 +
This first type was called <code>svirt_lxc_t</code> and it is not allowed to have network access.
 +
The successor of <code>svirt_lxc_t</code> is called <code>svirt_lxc_net_t</code> and allows full network access.
 +
The type for content that the <code>svirt_lxc</code> types could manage is named <code>svirt_sandbox_file_t</code>.
 +
 
 +
This SELinux policy was later adopted by Docker and the aliases <code>container_t</code> and <code>container_file_t</code> were created.
 +
 
 +
The container policy is defined in the [https://github.com/containers/container-selinux container-selinux] package.
 +
By default, containers run with the SELinux type <code>container_t</code>, regardless of which container engine launched them (e.g. podman, cri-o, docker, buildah, moby).
 +
 
 +
SELinux only allows a <code>container_t</code> to read/write/execute files labeled <code>container_file_t</code>.
 +
 
 +
The Docker daemon and Podman usually run as <code>container_runtime_t</code>, and the default label for content in <code>/var/lib/docker</code> and <code>/var/lib/containers</code> is <code>container_var_lib_t</code>.
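
The types named above are the third field of a full SELinux context string of the form <code>user:role:type[:level]</code>. The following is a minimal sketch, not part of CRIU, of how the type can be extracted from such a string. The names <code>Context</code> and <code>parse_context</code> are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class Context(NamedTuple):
    """Fields of an SELinux context string (illustrative helper, not a CRIU type)."""
    user: str
    role: str
    type: str
    level: str  # MLS/MCS part; empty if the context has none


def parse_context(raw: str) -> Context:
    """Split 'user:role:type[:level]' into its fields.

    Kernel interfaces may terminate the string with a NUL byte or a
    newline, so both are stripped before splitting.
    """
    # Split at most three times: the level itself may contain ':' (e.g. 's0:c1,c2').
    parts = raw.strip("\x00\n").split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"not an SELinux context: {raw!r}")
    level = parts[3] if len(parts) == 4 else ""
    return Context(parts[0], parts[1], parts[2], level)


# Example: the type of a typical container process context.
# parse_context("system_u:system_r:container_t:s0:c1,c2").type == "container_t"
```
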
 +
 
 +
== Using the correct SELinux label for the parasite socket ==
 +
 
 +
On a system with SELinux enabled, the socket used for communication between the parasite daemon and the main CRIU process needs to be correctly labeled.
 +
 
 +
In the case of Podman, CRIU is started by runc and runs as <code>container_runtime_t</code>.
 +
The parasite code will be running with the same context as the container process (<code>container_t</code>).
 +
 
 +
CRIU interacts with the parasite code via a Unix socket, and allowing a container process to connect via a socket to the outside of the container is not desirable. Thus, CRIU first obtains the context of the root container process and tells SELinux to label the created socket with the same label as the root container process.
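
The two steps described above can be sketched as follows. This is an illustration under stated assumptions, not CRIU's actual code: it uses the kernel's procfs attribute files (<code>/proc/PID/attr/current</code> and <code>/proc/thread-self/attr/sockcreate</code>) directly, and the function names and the <code>proc_root</code> parameter are hypothetical. The choice of <code>SOCK_STREAM</code> is also only an assumption made for the example.

```python
import os
import socket


def read_process_label(pid: int, proc_root: str = "/proc") -> str:
    """Return the SELinux context of process `pid` as reported by procfs."""
    path = os.path.join(proc_root, str(pid), "attr", "current")
    with open(path, "rb") as f:
        # The kernel may terminate the context with a NUL byte or newline.
        return f.read().rstrip(b"\x00\n").decode()


def set_socket_create_label(label: str, proc_root: str = "/proc") -> None:
    """Ask SELinux to label sockets created by this thread with `label`."""
    path = os.path.join(proc_root, "thread-self", "attr", "sockcreate")
    # Unbuffered, so the whole context is handed over in a single write.
    with open(path, "wb", buffering=0) as f:
        f.write(label.encode() + b"\x00")


def create_labeled_socket(root_pid: int, proc_root: str = "/proc") -> socket.socket:
    """Create a Unix socket carrying the label of the root container process."""
    label = read_process_label(root_pid, proc_root)
    set_socket_create_label(label, proc_root)
    # Sockets created from now on by this thread get the requested label.
    return socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
```

The <code>proc_root</code> parameter only exists so that the sketch can be tried against a fake directory tree. On a real system the write requires suitable SELinux permissions, and a real implementation would likely also reset the socket creation context afterwards so that later sockets are not affected.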
 +
 
 +
For this to work, the correct SELinux policies must be installed. On Fedora-based systems they are part of the [https://github.com/containers/container-selinux/commit/a2fc0309642a6430525cdaebcb3f1c8efde45fe2 container-selinux] package.
 +
 
 +
Note that the current implementation assumes that all processes to be checkpointed by CRIU are labeled with the same SELinux context, which is the default behaviour for most container engines.
 +
 
 +
If a child process has a different label, additional SELinux policies might be required.
 +
 
 +
== Checkpoint and restore any SELinux process label ==
 +
 
 +
For a successful container checkpoint and restore on an SELinux-enabled host, the restored container must have the same process context as before checkpointing.
 +
 
 +
During dump, CRIU stores the process label so that it can be restored. For processes started from the command line, which usually run as <code>unconfined_t</code>, this just works. For containers
 +
an additional policy is needed, which is provided by the latest [https://github.com/containers/container-selinux/commit/a2fc0309642a6430525cdaebcb3f1c8efde45fe2 container-selinux] package. This policy allows CRIU (when running as <code>container_runtime_t</code>) to transition the restored process to <code>container_t</code>.
 +
 
 +
Restoring a process that is running under systemd's control (<code>unconfined_service_t</code>) without additional policies is likely to fail because CRIU will not be allowed to change the context of the restored process.
 +
 
 +
For every checkpoint/restore use case on SELinux-enabled systems other than container processes and command-line/shell processes, a ''dyntransition'' permission must be granted between the old and new security contexts.
 +
 
 +
== Restoring a multi-threaded process with SELinux ==
 +
 
 +
SELinux does not always support changing the process context of a multi-threaded process. The context change of a running multi-threaded process is allowed only if the new security context is [https://selinuxproject.org/page/TypeRules#typebounds_Rule bounded] by the old security context.
 +
 
 +
To be able to restore a process without the need to have the new security context bounded by the old security context, CRIU sets the SELinux process context before creating the threads. Thus, all threads are created with the process context of the main process.
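
The ordering described above can be illustrated with a small sketch. It is not CRIU's restore code: it assumes that the context is changed by writing to the procfs file <code>/proc/thread-self/attr/current</code>, and the function name <code>set_context_then_spawn</code> and the <code>proc_root</code> parameter are hypothetical.

```python
import os
import threading
from typing import Callable


def set_context_then_spawn(label: str,
                           n_threads: int,
                           worker: Callable[[int], None],
                           proc_root: str = "/proc") -> None:
    """Change the process context first, then create the threads."""
    # Step 1: the process is still single-threaded here, so the context
    # change does not need the new context to be bounded by the old one.
    path = os.path.join(proc_root, "thread-self", "attr", "current")
    with open(path, "wb", buffering=0) as f:
        f.write(label.encode() + b"\x00")

    # Step 2: threads created from now on start with the new context.
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Reversing the two steps would mean changing the context of a process that is already multi-threaded, which is the case SELinux restricts. On a real system the write in step 1 additionally requires the ''dyntransition'' permission between the old and new contexts.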
    
[[Category: Plans]]
 
 
[[Category: Creds]]
 