https://criu.org/api.php?action=feedcontributions&user=Sungjaecho97&feedformat=atomCRIU - User contributions [en]2024-03-28T20:31:06ZUser contributionsMediaWiki 1.35.6https://criu.org/index.php?title=Images&diff=4522Images2018-01-25T04:44:09Z<p>Sungjaecho97: /* Images with PB data */</p>
<hr />
<div>The criu utility dumps the state of processes/containers into a set of image files. This article describes their format.<br />
<br />
== Types of image files ==<br />
<br />
CRIU images can be in one of the following formats:<br />
<br />
* criu-specific images in Google Protocol Buffers format (PB format)<br />
* criu-specific images containing binary data<br />
* image files in 3rd-party formats (a.k.a. raw images)<br />
<br />
== Images in criu-specific format ==<br />
<br />
All criu-specific image files begin with one or two 32-bit magic cookies. The first cookie identifies the type of file (see below); the second, optional cookie identifies the sub-type of image. In PB-format images the magic is followed by zero or more entries of the same type (not necessarily the same size!); each entry is preceded by a 32-bit entry size value (which does not count this 32-bit value itself). Each entry may optionally be followed by an extra payload, which depends on the entry type.<br />
<br />
Currently there are 3 types of images:<br />
<br />
; Inventory file<br />
: This is the image file describing the set. It doesn't have sub-type magic.<br />
<br />
; Image file<br />
: Regular image. Most of the text below is about these files.<br />
<br />
; Auxiliary file<br />
: A file that is not an image, but which criu generates and which happens to be in protobuf format too. For now only the stats and irmap cache files are of this type. They also have a sub-type magic.<br />
<br />
In other words, protocol-buffers image files look like:<br />
<br />
<pre><br />
IMAGE_FILE ::= MAGIC [MAGIC_2] { ENTRY }<br />
ENTRY ::= SIZE PAYLOAD [ EXTRA ]<br />
PAYLOAD ::= "message encoded in ProtocolBuffer format"<br />
EXTRA ::= "arbitrary blob, depends on the PAYLOAD contents"<br />
<br />
MAGIC ::= "32 bit integer"<br />
MAGIC_2 ::= "32 bit integer"<br />
SIZE ::= "32 bit integer, equals the PAYLOAD length"<br />
</pre><br />
<br />
Or, visualized as a table:<br />
<br />
{| class="wikitable"<br />
|-<br />
! Type !! Size, bytes<br />
|-<br />
| Magic || 4<br />
|-<br />
| Size0 || 4<br />
|-<br />
| Message0 || Size0<br />
|-<br />
| ... || ...<br />
|-<br />
| SizeN || 4<br />
|-<br />
| MessageN || SizeN<br />
|}<br />
<br />
The number of entries in an image file depends on the type of file.<br />
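
The grammar above can be sketched as a small reader. This is only an illustration, not CRIU's own code: the function name <code>read_pb_image</code>, the <code>has_sub_magic</code> flag, the little-endian byte order and the magic values in the usage lines are all assumptions made for the example. Payloads are returned as raw bytes; decoding them needs the message classes generated from the .proto files.

```python
import io
import struct


def read_pb_image(stream, has_sub_magic=True):
    """Parse MAGIC [MAGIC_2] { SIZE PAYLOAD } from a binary stream.

    Assumption: 32-bit values are little-endian. Extra payloads are not
    handled here, so this only suits images whose entries have none.
    """
    def read_u32():
        raw = stream.read(4)
        if len(raw) == 0:
            return None  # clean end of file
        if len(raw) != 4:
            raise ValueError("truncated 32-bit value")
        return struct.unpack("<I", raw)[0]

    magic = read_u32()
    if magic is None:
        raise ValueError("empty image")
    magic_2 = None
    if has_sub_magic:
        magic_2 = read_u32()
        if magic_2 is None:
            raise ValueError("missing sub-type magic")

    entries = []
    while True:
        size = read_u32()
        if size is None:
            break  # array images are simply read up to EOF
        payload = stream.read(size)
        if len(payload) != size:
            raise ValueError("truncated payload")
        entries.append(payload)
    return magic, magic_2, entries


# Usage with a synthetic image; both magic values are placeholders.
blob = struct.pack("<II", 0x11111111, 0x22222222)
blob += struct.pack("<I", 3) + b"abc"
blob += struct.pack("<I", 0)
magic, magic_2, entries = read_pb_image(io.BytesIO(blob))
```

A single-entry image is the special case where the returned list has exactly one element.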
<br />
=== Images with PB data ===<br />
<br />
Such images can be one of:<br />
<br />
; Array image files<br />
: These files can hold any number of entries. Read the image file up to EOF to find out the exact number.<br />
<br />
; Single-entry image files<br />
: In these files exactly one entry is stored.<br />
<br />
The file type can be determined from the magic. The entries are described in the Protocol Buffers language in the respective .proto files, which reside in the <code>images/</code> directory of the source tree.<br />
<br />
{|class="wikitable sortable"<br />
|-<br />
! name<br />
! type<br />
! description<br />
! extra payload<br />
! describing proto file<br />
|-<br />
| inventory || single-entry || Top level description of images || - || inventory.proto<br />
|-<br />
| fdinfo || array || [[Fdinfo-engine|Open file descriptors]] || - || fdinfo.proto<br />
|-<br />
| reg-files || array || Paths to [[:Category:Files|files]] opened with <code>open(2)</code> syscall || - || regfile.proto<br />
|-<br />
| eventfd || array || Eventfd file information || - || eventfd.proto<br />
|-<br />
| eventpoll || array || Eventpoll file information || - || eventpoll.proto<br />
|-<br />
| eventpoll-tfd || array || Target file descriptors of eventpoll fds (merged into above) || - || eventpoll.proto<br />
|-<br />
| inotify || array || Inotify file information || - || fsnotify.proto<br />
|-<br />
| inotify-wd || array || Watch descriptors of inotify fds (merged into above) || - || fsnotify.proto<br />
|-<br />
| signalfd || array || signalfd info || - || signalfd.proto<br />
|-<br />
| core || single-entry || Core process info (name, sigmask, itimers, etc.) and arch-dependent information (registers, etc.) || - || core.proto<br />
|-<br />
| mm || single-entry || [[Memory dumping and restoring|Address space]] information (VMAs, segments, exe file, etc.) || - || mm.proto<br />
|-<br />
| pipes || array || Pipes information || - || pipe.proto<br />
|-<br />
| pipes-data || array || Contents of pipes || <code>entry.bytes</code> bytes of data sitting in a pipe || pipe-data.proto<br />
|-<br />
| fifo || array || FIFO information || - || fifo.proto<br />
|-<br />
| fifo-data || array || Contents of FIFOs || same as in pipes-data || pipe-data.proto<br />
|-<br />
| pstree || array || Process [[tree after restore|tree linkage]] || - || pstree.proto<br />
|-<br />
| ids || single-entry || IDs of objects (mm, files, sighand, etc.) and namespaces || - || core.proto<br />
|-<br />
| sigacts || array || Signal handling map || - || sa.proto<br />
|-<br />
| unixsk || array || [[Unix sockets]] || - || sk-unix.proto<br />
|-<br />
| inetsk || array || PF_INET sockets, both IPv4 and IPv6 || - || sk-inet.proto<br />
|-<br />
| sk-queues || array || Contents of socket queues || <code>entry.length</code> bytes of data, one entry per packet || sk-packet.proto<br />
|-<br />
| itimers || array || Interval timers state (merged into core image) || - || timer.proto<br />
|-<br />
| creds || single-entry || Task credentials: uids, gids, caps, etc. || - || creds.proto<br />
|-<br />
| fs || single-entry || Chroot and chdir information || - || fs.proto<br />
|-<br />
| remap-fpath || array || File paths remaps (e.g. for [[invisible files]]) || - || remap-file-path.proto<br />
|-<br />
| ghost-file || single-entry || Ghost [[invisible files]] || The contents of the file follow the entry, up to EOF || ghost-file.proto<br />
|-<br />
| tcp-stream || single-entry || [[TCP connection]] state (including data in queues) || <code>entry.inq_len</code> bytes of in-queue data followed by <code>entry.outq_len</code> bytes of out-queue data || tcp-stream.proto<br />
|-<br />
| mountpoints || array || [[Mountpoints]] information || - || mnt.proto<br />
|-<br />
| utsns || single-entry || Uname nodename and domainname of a UTS namespace || - || utsns.proto<br />
|-<br />
| tty || array || Information about opened [[TTYs]] || - || tty.proto<br />
|-<br />
| tty-info || array || Termios and similar stuff about [[TTYs]] || - || tty.proto<br />
|-<br />
| packetsk || array || Info about PF_PACKET sockets || - || packet-sock.proto<br />
|-<br />
| netdev || array || Info about [[:Category:Network|network]] devices || - || netdev.proto<br />
|}<br />
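
For images with an extra payload, the reader has to consume the blob that follows each entry before it can find the next size field. A minimal sketch, under the same little-endian assumption as above: <code>read_entries_with_extra</code> and the <code>extra_len</code> callback are hypothetical names. In a real reader the callback would decode the protobuf message and return a field such as <code>entry.bytes</code> for pipes-data; the one-byte toy payload used below is a stand-in, not protobuf.

```python
import io
import struct


def read_entries_with_extra(stream, extra_len):
    """Read { SIZE PAYLOAD EXTRA } entries from a stream positioned after the magics.

    extra_len(payload) must return how many extra bytes follow that entry.
    """
    entries = []
    while True:
        raw = stream.read(4)
        if not raw:
            break  # end of file
        if len(raw) != 4:
            raise ValueError("truncated size field")
        (size,) = struct.unpack("<I", raw)  # assumption: little-endian
        payload = stream.read(size)
        if len(payload) != size:
            raise ValueError("truncated payload")
        n_extra = extra_len(payload)
        extra = stream.read(n_extra)
        if len(extra) != n_extra:
            raise ValueError("truncated extra payload")
        entries.append((payload, extra))
    return entries


# Toy data: the payload is one byte holding the length of the extra blob.
body = struct.pack("<I", 1) + b"\x03" + b"abc"
body += struct.pack("<I", 1) + b"\x00"
parsed = read_entries_with_extra(io.BytesIO(body), lambda payload: payload[0])
```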
<br />
=== Images with memory dumps ===<br />
<br />
''Main article: [[memory dumps]]''.<br />
<br />
Anonymous memory contents (both private and shared) are stored in two types of images:<br />
<br />
; Pagemap files<br />
: These files contain info about which virtual regions are populated with data. The file is a set of protobuf messages.<br />
{{Note| Even though pagemap is an array kind of image (and could be counted among the previous type), its first PB message is of type pagemap_head and all the following ones are of type pagemap_entry.}}<br />
<br />
; Pages files<br />
: These contain 4k pages that are to be put into the memory according to the pagemap.<br />
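
How the two files fit together can be sketched as follows. This is an illustration under stated assumptions: each pagemap entry is reduced to a hypothetical <code>(vaddr, nr_pages)</code> pair, and pages are assumed to sit in the pages file back to back, in the order the pagemap entries list them. See [[memory dumps]] for the actual rules.

```python
PAGE_SIZE = 4096  # the 4k page size mentioned above


def place_pages(pagemap, pages):
    """Map each dumped page to its virtual address.

    pagemap: iterable of (vaddr, nr_pages) pairs (hypothetical reduced form)
    pages:   raw bytes of the pages file
    Returns a dict {virtual address: page bytes}.
    """
    memory = {}
    offset = 0
    for vaddr, nr_pages in pagemap:
        for i in range(nr_pages):
            chunk = pages[offset:offset + PAGE_SIZE]
            if len(chunk) != PAGE_SIZE:
                raise ValueError("pages file is shorter than the pagemap says")
            memory[vaddr + i * PAGE_SIZE] = chunk
            offset += PAGE_SIZE
    return memory


# Three pages: two go to a region at 0x1000, one to a region at 0x10000.
pages_file = b"A" * PAGE_SIZE + b"B" * PAGE_SIZE + b"C" * PAGE_SIZE
layout = place_pages([(0x1000, 2), (0x10000, 1)], pages_file)
```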
<br />
== Raw images ==<br />
<br />
These images contain data that were collected by criu with the help of some external tools.<br />
<br />
{|class="wikitable sortable"<br />
|-<br />
! Name<br />
! Tool supporting the format<br />
! Description<br />
|-<br />
| ifaddr || ip from iproute2 || IP addresses on network devices<br />
|-<br />
| route || ip from iproute2 || Routing tables<br />
|-<br />
| tmpfs || tar + gzip || Contents of a tmpfs filesystem<br />
|}<br />
<br />
== Notes about protobuf ==<br />
We have a registered field number (1018) for [https://developers.google.com/protocol-buffers/docs/proto#options custom options] of all kinds. See protobuf/opts.proto for more info.<br />
<br />
== See also ==<br />
<br />
* [[CRIT]]: a tool to decode images to a human readable format<br />
* [[What's bad with V1 images]]<br />
* [[Image field merging]]<br />
* [[Memory dumps]]<br />
<br />
[[Category:Development]]<br />
[[Category:Images]]<br />
[[Category:Outdated]]</div>Sungjaecho97https://criu.org/index.php?title=Shared_memory&diff=4516Shared memory2018-01-10T08:02:26Z<p>Sungjaecho97: /* Checkpoint */ twice 'the'</p>
<hr />
<div>Every process has one or more memory mappings, i.e. regions of virtual memory it is allowed to use.
Some of these mappings can be shared between several processes, and they are called shared mappings.
In other words, these are shared '''anonymous (not file-based) memory mappings'''.
This article describes some intricacies of handling such mappings.
<br />
== Checkpoint ==<br />
<br />
During checkpointing, CRIU needs to figure out all the shared mappings in order to dump them as such.

It does so by calling <code>fstatat()</code> on each entry found in <code>/proc/$PID/map_files/</code>,
noting the ''device:inode'' pair of the structure returned by <code>fstatat()</code>. Now, if several processes
have a mapping with the same ''device:inode'' pair, this mapping is marked as shared between these processes
and dumped as such.
<br />
Note that <code>fstatat()</code> works because the kernel actually creates a hidden<br />
tmpfs file, not visible from any tmpfs mounts, but accessible via its<br />
<code>/proc/$PID/map_files/</code> entry.<br />
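
The grouping step can be sketched like this. It is an illustration, not CRIU's code: <code>find_shared_mappings</code> is a hypothetical name, and the input is assumed to be one <code>(pid, st_dev, st_ino)</code> record per <code>map_files</code> entry, already collected with <code>fstatat()</code>.

```python
from collections import defaultdict


def find_shared_mappings(records):
    """Group mappings by their device:inode pair.

    records: iterable of (pid, st_dev, st_ino) tuples, one per map_files entry.
    Returns {(st_dev, st_ino): sorted list of pids} for pairs seen in 2+ processes.
    """
    owners = defaultdict(set)
    for pid, dev, ino in records:
        owners[(dev, ino)].add(pid)
    return {key: sorted(pids) for key, pids in owners.items() if len(pids) > 1}


# Synthetic data: processes 100 and 101 share the mapping with inode 7001.
records = [(100, 5, 7001), (101, 5, 7001), (101, 5, 7002), (102, 5, 7003)]
shared = find_shared_mappings(records)
```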
<br />
Dumping a mapping means two things:<br />
* writing an entry into the process's mm.img file;
* storing the actual mapping data (contents).
For shared mappings, the contents are stored into a pair of image files: pagemap-shmem.img and pages.img.
For details, see [[Memory dumps]].<br />
<br />
Note that different processes can map different parts of a shared memory segment.
In this case, CRIU first collects mapping offsets and lengths from all the processes
to determine the total segment size, then reads the contents of all the parts
from the respective processes.
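
A minimal sketch of the size computation, assuming each process contributes a hypothetical <code>(offset, length)</code> pair in bytes for the part of the segment it maps. The total size is then the furthest end reached by any part.

```python
def segment_size(parts):
    """Return the total segment size covered by (offset, length) parts."""
    return max((offset + length for offset, length in parts), default=0)


# One process maps the first two pages, another maps three pages from offset 4096.
total = segment_size([(0, 8192), (4096, 12288)])
```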
<br />
== Restore ==<br />
<br />
During the restore, CRIU already knows which mappings are shared, so they need to be<br />
restored as such. Here is how it is done.<br />
<br />
Among all the processes sharing a mapping, the one with the lowest PID
(see [[postulates]]) is assigned to be the mapping creator. The creator's task is to obtain a mapping
file descriptor, restore the mapping data, and signal all the other processes that it's ready.
During this process, all the other processes are waiting.

First, the creator needs to obtain a file descriptor for the mapping. To achieve this, two different
approaches are used, depending on availability.
<br />
If the [http://man7.org/linux/man-pages/man2/memfd_create.2.html memfd_create()]
syscall is available (Linux kernel v3.17+), it is used to obtain a file descriptor.
Next, <code>ftruncate()</code> is called to set the proper size of the mapping.
<br />
If <code>memfd_create()</code> is not available, an alternative approach is used.
First, mmap() is called to create a mapping. Next, a file in <code>/proc/self/map_files/</code>
is opened to get a file descriptor for the mapping. The limitation of this method is that,
due to security concerns, /proc/$PID/map_files/ is not available to processes that
live inside a user namespace, so it cannot be used if there
are any user namespaces in the dump.
<br />
Once the creator has the file descriptor, it mmap()s it and restores its content from
the dump (using memcpy()). The creator then unmaps the mapping (note the file
descriptor is still available). Next, it calls futex(FUTEX_WAKE) to signal all the
waiting processes that the mapping file descriptor is ready.
<br />
All the other processes that need this mapping wait on futex(FUTEX_WAIT). Once the<br />
wait is over, they open the creator's /proc/$CREATOR_PID/fd/$FD file to get the<br />
mapping file descriptor.<br />
<br />
Finally, all the processes (including the creator itself) call mmap() to create a<br />
needed mapping (note that mmap() arguments such as length, offset and flags may<br />
differ for different processes), and close() the mapping file descriptor as it is<br />
no longer needed.<br />
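
The creator's side of the <code>memfd_create()</code> path can be sketched in a single process as below. This is an illustration only: the function names are hypothetical, the futex signalling and the <code>/proc/$CREATOR_PID/fd/$FD</code> step are left out, and the temporary-file fallback is merely a stand-in so the sketch runs where <code>memfd_create()</code> is missing. It is not the <code>map_files</code> fallback described above.

```python
import mmap
import os
import tempfile


def create_segment_fd(size):
    """Obtain a file descriptor backing a shared segment of the given size."""
    try:
        fd = os.memfd_create("shmem-sketch")  # Linux, Python 3.8+
    except (AttributeError, OSError):
        # Stand-in for this sketch only; CRIU's real fallback is different.
        fd, path = tempfile.mkstemp()
        os.unlink(path)
    os.ftruncate(fd, size)  # set the proper size of the mapping
    return fd


def restore_contents(fd, data):
    """Map the segment, copy the dumped data in, then unmap it again."""
    with mmap.mmap(fd, len(data)) as mapping:  # shared mapping by default on Unix
        mapping[:] = data
    # The mapping is gone here, but the file descriptor is still usable.


def map_part(fd, offset, length):
    """Create the mapping a process needs; offset and length may differ per process."""
    return mmap.mmap(fd, length, offset=offset)


page = mmap.ALLOCATIONGRANULARITY
dumped = b"a" * page + b"b" * page
fd = create_segment_fd(len(dumped))
restore_contents(fd, dumped)
second_page = map_part(fd, page, page)  # maps only the second page
first_bytes = second_page[:4]
second_page.close()
os.close(fd)  # the descriptor is no longer needed once everything is mapped
```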
<br />
== Changes tracking ==<br />
<br />
For [[iterative migration]] it's very useful to track changes in memory. Until CRIU v2.5, changes were tracked for anonymous memory only, but now shared memory can be tracked as well. To achieve this, CRIU scans the pagemaps of all the shmem segment owners (as it does for anonymous memory) and then ANDs the collected soft-dirty bits.

Change tracking prompted the developers to implement [[memory images deduplication]] for shmem segments as well.
<br />
== Dumping present pages ==<br />
<br />
When dumping the contents of shared memory, CRIU does not dump all of the data. Instead, it determines which pages contain
data, and only dumps those pages. This is done similarly to how regular [[memory dumping and restoring]] works, i.e. by looking
for PRESENT or SWAPPED bits in owners' pagemap entries.
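
The bit test can be sketched as follows. The bit positions come from the Linux kernel's documented <code>/proc/PID/pagemap</code> layout (bit 63: page present, bit 62: page swapped); the constant and function names are chosen for this example, and the entries are assumed to have been read already as 64-bit integers.

```python
PME_PRESENT = 1 << 63  # page is present in RAM
PME_SWAP = 1 << 62     # page is swapped out


def pages_to_dump(pagemap_entries):
    """Return the indices of pages whose entry has PRESENT or SWAPPED set."""
    wanted = PME_PRESENT | PME_SWAP
    return [i for i, entry in enumerate(pagemap_entries) if entry & wanted]


# Page 0 was never touched, page 1 is present, page 2 is swapped out.
selected = pages_to_dump([0, PME_PRESENT | 0x1234, PME_SWAP])
```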
<br />
There is one particular feature of shared memory dumps worth mentioning. Sometimes, a shared memory page
can exist in the kernel, but it is not mapped to any process. CRIU detects such pages by calling mincore()
on the shmem segment, which reports back the page in-memory status. The mincore bitmap is then ANDed with
the per-process ones.
<br />
== See also ==<br />
<br />
* [[Memory dumping and restoring]]<br />
* [[Memory images deduplication]]<br />
<br />
[[Category:Memory]]<br />
[[Category:Under the hood]]<br />
[[Category:Editor help needed]]</div>Sungjaecho97https://criu.org/index.php?title=Cpuinfo&diff=4515Cpuinfo2018-01-09T08:17:28Z<p>Sungjaecho97: fix spelling</p>
<hr />
<div>Because CRIU allows live migration of containers (see [[live migration]] for details), the CPU a container has been run on may differ from the target CPU. For most software this is usually not a problem, but if a program is compiled with optimizations involving a particular CPU feature (say, AVX instructions), the lack of the feature on the destination machine will lead to an execution exception in the best-case scenario.

Therefore there should be a way to test whether the destination machine is capable of running the container to be migrated. This is the purpose of the <code>cpuinfo</code> command.
<br />
== Saving CPU capabilities into an image file ==<br />
<br />
CRIU does not write CPU capabilities into an image by default (for the sake of speed). Instead, one has to run CRIU as:
<br />
criu cpuinfo dump<br />
<br />
The command creates a ''cpuinfo'' image file, containing information about the current CPU and some bits representing the supported capabilities.<br />
<br />
== Testing CPU capabilities ==<br />
<br />
To check whether the capabilities saved in the ''cpuinfo'' image file match those of the current CPU, one should run:
<br />
criu cpuinfo check<br />
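
The idea behind such a check can be sketched as a set comparison: every feature the source CPU offered must also be offered by the destination. This is only a conceptual illustration. The function name and the feature names are made up for the example, and it does not reflect how criu encodes or compares capabilities in the ''cpuinfo'' image.

```python
def missing_features(source_features, destination_features):
    """Return the features present on the source CPU but absent on the destination."""
    return sorted(set(source_features) - set(destination_features))


# A container dumped on a CPU with AVX cannot safely move to one without it.
missing = missing_features({"sse2", "avx"}, {"sse2", "sse4_2"})
```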
<br />
== Checkpoint/Restore with CPU capabilities ==<br />
<br />
While by default CRIU does not save CPU capabilities in the image file, one can pass the <code>--cpu-cap</code> option to force CRIU to save and check CPU capabilities on dump and restore, respectively.
<br />
[[Category:API]]</div>Sungjaecho97