Changes

Shared memory (edit)

Revision as of 19:37, 2 September 2016

693 bytes added , 19:37, 2 September 2016

→‎Restore: OK this is the way it works. Perhaps we need a picture here.

Line 20: Line 20:

== Restore ==

−

~~Upon~~ restore, CRIU already knows which mappings are shared, ~~and the trick is~~ to ~~restore them~~ as ~~such~~.

+

During the restore, CRIU already knows which mappings are shared, so they need to be restored as shared.

−

~~For that~~, ~~two different approaches~~ are ~~used~~, ~~depending on the availability~~.

+

To restore file mappings, no tricks are needed: the files are opened and mmap()ed with the MAP_SHARED flag set.

−

~~The common part is, between the processes sharing a mapping~~, ~~the one with the lowest PID~~

+

Anonymous memory mappings, though, need some extra work to be restored as shared. Here is how it is done.

−

~~among the group performs the actual <code>mmap()</code>~~, ~~while all the others wait~~

−

~~for the mapping~~ to ~~appear and, once it's available, use~~ it.

−

~~=== memfd ===~~

+

Among all the processes sharing a mapping, the one with the lowest PID

+

(see [[postulates]]) is assigned to be the mapping creator. The creator's task is to obtain a mapping

+

file descriptor, restore the mapping data, and signal all the other processes that it's ready.

+

During this process, all the other processes are waiting.

−

~~Linux kernel v3.17 adds~~ a ~~[http://man7.org/linux/man-pages/man2/memfd_create.2.html memfd_create()]~~

+

First, the creator needs to obtain a file descriptor for the mapping. To achieve this, two different

−

~~syscall~~. ~~CRIU restore checks if it is available from the running kernel;~~ it ~~yes~~, ~~it is~~ used.

+

approaches are used, depending on what the running kernel supports.

−

~~FIXME how~~

+

In case the [http://man7.org/linux/man-pages/man2/memfd_create.2.html memfd_create()]

+

syscall is available (Linux kernel v3.17+), it is used to obtain a file descriptor.

+

Next, <code>ftruncate()</code> is called to set the proper size of the mapping.
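
The two calls above can be sketched as follows. This is a minimal illustration, not CRIU's code: the helper name <code>shmem_fd_create()</code> and its parameters are hypothetical, and the syscall is invoked through <code>syscall(2)</code> so that the sketch also builds against C libraries that lack a <code>memfd_create()</code> wrapper.

```c
#define _GNU_SOURCE
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a CRIU function): returns a file descriptor
 * backing a shared memory region of `size` bytes, or -1 on error.
 */
int shmem_fd_create(const char *name, off_t size)
{
	/* memfd_create() needs Linux v3.17+; flags are left at 0 here. */
	int fd = (int)syscall(SYS_memfd_create, name, 0U);

	if (fd < 0)
		return -1;

	/* A fresh memfd has size 0, so grow it to the mapping size. */
	if (ftruncate(fd, size) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would then pass the returned descriptor to <code>mmap()</code> with <code>MAP_SHARED</code>, for example after <code>shmem_fd_create("demo", 8192)</code>.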

−

~~HOW: The memfd in question~~ is ~~created in~~ the ~~task with lowest PID~~ (~~see [[postulates]]~~) ~~among those having this shmem segment~~

+

If <code>memfd_create()</code> is not available, the alternative approach is used.

−

~~mapped, then criu waits for the others~~ to get ~~this~~ file ~~by opening~~ the ~~creator's~~ /proc/~~pid~~/fd/ ~~link.~~

+

First, mmap() is called to create a mapping. Next, a file in <code>/proc/self/map_files/</code>

−

~~Afterwards all~~ the ~~files just mmap() this descriptor into their address space~~.

+

is opened to get a file descriptor for the mapping. The limitation of this method is that,

+

due to security concerns, /proc/$PID/map_files/ is not available for processes that

+

live inside a user namespace, so it is impossible to use it if there

+

are any user namespaces in the dump.

−

~~=== /proc/$PID/map_files/ ===~~

+

Once the creator has the file descriptor, it mmap()s it and restores its content from

−

+

the dump (using memcpy()). The creator then unmaps the mapping (note that the file

−

~~This method~~ is ~~used if memfd is not~~ available. ~~The limitation is~~, ~~/proc/$PID/map_files/ is not available~~

+

descriptor is still available). Next, it calls futex(FUTEX_WAKE) to signal all the

−

~~for users inside user namespaces~~ (~~due to security concerns~~)~~, so it's not possible~~ to ~~use it if there~~

+

waiting processes that the mapping file descriptor is ready.

−

~~are any user namespaces in~~ the ~~dump~~.

−

~~FIXME how~~

+

All the other processes that need this mapping wait on futex(FUTEX_WAIT). Once the

+

wait is over, they open the creator's /proc/$CREATOR_PID/fd/$FD file to get the

+

mapping file descriptor.
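
The hand-off between the creator and a waiting process might look roughly like the sketch below. It is an illustration under stated assumptions rather than CRIU's implementation: the function name <code>demo_creator_and_waiter()</code>, the payload string, the single-page size and the use of a parent/child pair are all made up for the example, and the futex word lives in a separate shared anonymous page created before <code>fork()</code>.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* glibc does not export a futex() function, so go through syscall(2). */
static long futex(uint32_t *uaddr, int op, uint32_t val)
{
	return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Hypothetical demo: returns 0 if the waiter saw the creator's data. */
int demo_creator_and_waiter(void)
{
	const char payload[] = "restored";
	const size_t size = 4096;
	int ok = 0, status;

	/* Futex word visible to both processes: 0 = not ready, 1 = ready. */
	uint32_t *ready = mmap(NULL, sizeof(*ready), PROT_READ | PROT_WRITE,
			       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (ready == MAP_FAILED)
		return -1;

	pid_t creator = getpid();
	int fd = (int)syscall(SYS_memfd_create, "shmem-demo", 0U);
	if (fd < 0 || ftruncate(fd, size) < 0)
		return -1;

	pid_t child = fork();
	if (child < 0)
		return -1;

	if (child == 0) {
		/* Waiter: drop the inherited fd so it must come from /proc. */
		char path[64];

		close(fd);
		while (__atomic_load_n(ready, __ATOMIC_ACQUIRE) == 0)
			futex(ready, FUTEX_WAIT, 0);

		snprintf(path, sizeof(path), "/proc/%d/fd/%d", (int)creator, fd);
		int peer_fd = open(path, O_RDWR);
		if (peer_fd < 0)
			_exit(1);
		char *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
			       peer_fd, 0);
		close(peer_fd);
		if (p == MAP_FAILED)
			_exit(2);
		_exit(memcmp(p, payload, sizeof(payload)) == 0 ? 0 : 3);
	}

	/* Creator: fill the region through a temporary mapping, then unmap. */
	char *tmp = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (tmp != MAP_FAILED) {
		memcpy(tmp, payload, sizeof(payload));
		munmap(tmp, size);
		ok = 1;
	}

	/* Publish readiness and wake every waiter, even if the fill failed. */
	__atomic_store_n(ready, 1, __ATOMIC_RELEASE);
	futex(ready, FUTEX_WAKE, INT_MAX);

	if (waitpid(child, &status, 0) < 0)
		return -1;
	close(fd);
	munmap(ready, sizeof(*ready));
	return (ok && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The waiter re-checks the futex word in a loop because <code>FUTEX_WAIT</code> may return early, for instance when the word has already changed before the call.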

−

~~HOW: The same technique as with memfd is used~~, ~~with two exceptions. First is that~~ creator ~~calls~~ mmap()

+

Finally, all the processes (including the creator itself) call mmap() to create the

−

~~not memfd_create~~() and ~~creates the shared memory at once. Then it waits~~ for ~~the others to open its~~

+

needed mapping (note that mmap() arguments such as length, offset and flags may

−

~~/proc/pid/map_files/ link. After opening "the others" mmap~~() ~~one to their address space just~~ as if

+

differ for different processes), and close() the mapping file descriptor as it is

−

~~they would have done it with memfd descriptor~~.

+

no longer needed.
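
As a small illustration of the last step, the sketch below maps one descriptor twice with different length, offset and protection, then closes the descriptor while both mappings stay usable. The function name <code>demo_two_views()</code> and the two-page layout are assumptions for the example; in a real restore the two mappings would belong to different processes.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns 0 when a write through one mapping is
 * visible through a second mapping created with different arguments.
 */
int demo_two_views(void)
{
	long page = sysconf(_SC_PAGESIZE);
	int fd = (int)syscall(SYS_memfd_create, "views-demo", 0U);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, 2 * page) < 0) {
		close(fd);
		return -1;
	}

	/* View 1: both pages, readable and writable. */
	char *whole = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
			   MAP_SHARED, fd, 0);
	/* View 2: only the second page, read-only. */
	char *tail = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, page);

	/* The descriptor is no longer needed once the mappings exist. */
	close(fd);
	if (whole == MAP_FAILED || tail == MAP_FAILED)
		return -1;

	strcpy(whole + page, "hello");
	int rc = strcmp(tail, "hello") == 0 ? 0 : -1;

	munmap(whole, 2 * page);
	munmap(tail, page);
	return rc;
}
```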

== Changes tracking ==

Kir

Bureaucrats, Administrators

1,072

edits
