Memory dumps

From CRIU

Revision as of 06:10, 31 August 2016

This page describes the way memory is stored in the image files.

Overview

Process mappings are stored in mm.img images, but those describe only the virtual memory areas. The data inside those areas is stored in the pairs of files described below.

The memory dumps contain the contents of individual pages (4k each) and the virtual address at which each piece of data belongs. These images are not connected to the VMA list in mm.img at all; data ends up in the proper locations purely because the addresses match.

The following gets into memory dumps:

  • Present pages from anonymous private mappings
  • Present pages from anonymous shared mappings
  • Private (copied) pages from file private mappings

Images structure

Memory dumps are stored in two images.

Pagemap
This is a list of entries, each of which is a pair: the address in memory where the data should go and the number of pages it covers.
Pages
This is a plain sequence of 4k entries, each one a full page of data.

Example

Suppose the pagemap contains two entries:

  { 0x1000000, 4 }
  { 0xCF000000, 8 }

In this case the pages image should hold 12 pages, i.e. be 48k in size. The first 4 pages (16k, the first pagemap entry) would be read from the image and put at address 0x1000000, occupying space up to the 0x1000000 + 4 * 4096 = 0x1004000 address. The last 8 pages (32k, the 2nd pagemap entry) would be read and put at the 0xCF000000 address.
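
The arithmetic above can be checked with a short Python sketch. It assumes, as the example suggests, that pages are stored back-to-back in the pages image in the same order as the pagemap entries. The function name layout and the tuple representation are made up for illustration and are not part of CRIU.

```python
PAGE_SIZE = 4096  # the 4k page size used throughout this article

def layout(pagemap):
    """For each (vaddr, nr_pages) entry, compute where its data lives.

    Returns a list of (vaddr_start, vaddr_end, offset_in_pages_image)
    tuples and the total size of the pages image in bytes.
    Assumption: pages are stored contiguously, in pagemap order.
    """
    regions = []
    offset = 0
    for vaddr, nr_pages in pagemap:
        size = nr_pages * PAGE_SIZE
        regions.append((vaddr, vaddr + size, offset))
        offset += size
    return regions, offset

# The two entries from the example above
pagemap = [(0x1000000, 4), (0xCF000000, 8)]
regions, total = layout(pagemap)

for start, end, off in regions:
    print(f"{start:#x} ... {end:#x} <- pages image offset {off}")
print(f"pages image size: {total // 1024}k")
```

Under that assumption, the second entry's data starts 16k into the pages image, right after the 4 pages of the first entry.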

Stacked images

When incremental dumps are performed, for every iteration a parent symlink is created, and images become dependent on their respective parent(s).

With stacked images, a 3rd field called in_parent appears in the pagemap. It is a boolean flag which, when set, means that the data for the given pagemap entry is available from a parent image. The search in the parent uses the same algorithm: first the pagemap is resolved, then the data is found in pages. The parent's data (either complete or partial) may in turn be found in its own parent images.

Naturally, the bottom image (the one with no parent link) must have no in_parent bits set.

Example

Consider a pagemap from the previous example with the in_parent bit set for one entry:

  { 0x1000000, 4, in_parent }
  { 0xCF000000, 8 }

In this case, the pages image would be only 32k in size, since the first 4 pages are to be found in the parent. Thus the parent pagemap image should contain one or more pagemap entries covering the 0x1000000 ... 0x1004000 area, for example:

  { 0x1000000, 2 }
  { 0x1002000, 2, in_parent }

This, in turn, means that the first 2 pages from this range are available from the parent pages image file, and the last 2 should be looked up deeper, i.e. in the grand-parent pagemaps.
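
The lookup through stacked images can be sketched in Python as follows. This is only an illustration of the algorithm described above, not CRIU's actual code: the Image class and the find_page function are hypothetical, and the grand-parent pagemap is an assumed one, since the article does not show it. The sketch also assumes that entries with in_parent set take no space in the local pages image, which matches the 32k size computed above.

```python
PAGE_SIZE = 4096

class Image:
    """Hypothetical in-memory model of one pagemap image and its parent link."""
    def __init__(self, pagemap, parent=None):
        # pagemap: list of (vaddr, nr_pages, in_parent) tuples
        self.pagemap = pagemap
        self.parent = parent

def find_page(image, vaddr, depth=0):
    """Return (depth, page_index): how many parent links were followed
    and the index of the page inside that image's pages file."""
    index = 0  # running position within this image's pages file
    for start, nr_pages, in_parent in image.pagemap:
        end = start + nr_pages * PAGE_SIZE
        if start <= vaddr < end:
            if in_parent:
                if image.parent is None:
                    raise LookupError("in_parent set in the bottom image")
                # Same algorithm again, one level deeper
                return find_page(image.parent, vaddr, depth + 1)
            return depth, index + (vaddr - start) // PAGE_SIZE
        if not in_parent:
            # Only locally stored pages occupy the pages file (assumption)
            index += nr_pages
    raise LookupError(f"no pagemap entry covers {vaddr:#x}")

# Assumed grand-parent: holds the last 2 pages of the range itself
grandparent = Image([(0x1002000, 2, False)])
parent = Image([(0x1000000, 2, False), (0x1002000, 2, True)], grandparent)
top = Image([(0x1000000, 4, True), (0xCF000000, 8, False)], parent)

for addr in (0x1000000, 0x1003000, 0xCF000000):
    print(f"{addr:#x} ->", find_page(top, addr))
```

Here depth 0 is the latest image, 1 is its parent and 2 is the grand-parent. The bottom image never follows a parent link, which is why it must have no in_parent bits set.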

Deduplication

See Memory images deduplication.

See also