Difference between revisions of "Memory dumping and restoring"
Revision as of 17:32, 28 January 2015
Basic C/R
Dumping
Currently memory dumping depends on 3 big technologies:
- /proc/pid/smaps file and the /proc/pid/map_files/ directory of links are used to determine
  - the memory areas in use by the task
  - which file is mapped (if any)
  - the shared memory "identifier" used to resolve the MAP_SHARED areas
- /proc/pid/pagemap file that reveals important flags
  - present indicates that the physical page is there. Non-present pages are not dumped.
  - anonymous, for MAP_FILE | MAP_PRIVATE mappings, indicates that the page in question has already been COW-ed from the file's copy. Non-anonymous pages are not dumped, as they are still in sync with the file.
  - soft-dirty bit is used for tracking memory changes
- Ptrace SEIZE, which is used to grab pages from the task's VM into a pipe (with vmsplice)
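The pagemap flags listed above can be illustrated with a small decoder. This is only a sketch, not CRIU's code: the bit positions (63 for present, 61 for file-page or shared-anon, 55 for soft-dirty) follow the kernel's pagemap documentation, while the names `PageFlags`, `decode_entry` and `should_dump` are hypothetical, and the real dump decision has more cases than the two rules described here.

```python
from typing import NamedTuple

# Bit positions of a 64-bit /proc/pid/pagemap entry (per kernel pagemap docs).
PM_SOFT_DIRTY = 1 << 55   # page was written to since soft-dirty bits were cleared
PM_FILE = 1 << 61         # page is a file page or shared-anon
PM_PRESENT = 1 << 63      # page is present in RAM


class PageFlags(NamedTuple):
    present: bool
    anonymous: bool
    soft_dirty: bool


def decode_entry(entry: int) -> PageFlags:
    """Decode the flags discussed in the text from one pagemap entry."""
    return PageFlags(
        present=bool(entry & PM_PRESENT),
        anonymous=not (entry & PM_FILE),
        soft_dirty=bool(entry & PM_SOFT_DIRTY),
    )


def should_dump(entry: int, file_private: bool) -> bool:
    """Apply the two rules from the list (hypothetical helper).

    - non-present pages are never dumped
    - in a MAP_FILE | MAP_PRIVATE area, only already COW-ed (anonymous)
      pages are dumped; the rest are still in sync with the file
    """
    flags = decode_entry(entry)
    if not flags.present:
        return False
    if file_private and not flags.anonymous:
        return False
    return True
```

On a live system the entries would be read as 8-byte little-endian integers from /proc/pid/pagemap, one per virtual page; that part is left out to keep the sketch independent of the running kernel.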
The last item, grabbing pages through ptrace, deserves a closer explanation. To drain memory from a task, we first generate a bitmap of the pages that need to be dumped (using smaps, map_files and pagemap from proc). Then we create a set of pipes to put the pages into. Then we infect the process with parasite code which, in turn, receives the pipes and vmsplice-s the required pages into them. Finally, we splice the pages from the pipes into image files.
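As an illustration of the first step, a per-page bitmap can be collapsed into contiguous (address, length) ranges, similar in spirit to the iovec entries that vmsplice takes. The sketch below is an assumption about how such a bitmap might be prepared, not CRIU's actual implementation; `coalesce_pages` is a hypothetical name and the 4096-byte page size is assumed.

```python
from typing import List, Sequence, Tuple

PAGE_SIZE = 4096  # assumed page size for this sketch


def coalesce_pages(base: int, bitmap: Sequence[int]) -> List[Tuple[int, int]]:
    """Turn a bitmap of pages to dump into (start_address, length) ranges.

    bitmap[i] is truthy when page i of the area starting at `base`
    must be dumped. Runs of consecutive pages become a single range.
    """
    ranges: List[Tuple[int, int]] = []
    run_start = None
    for index, wanted in enumerate(bitmap):
        if wanted and run_start is None:
            run_start = index                      # a new run begins here
        elif not wanted and run_start is not None:
            length = (index - run_start) * PAGE_SIZE
            ranges.append((base + run_start * PAGE_SIZE, length))
            run_start = None                       # the run has ended
    if run_start is not None:                      # run reaches the end of the area
        length = (len(bitmap) - run_start) * PAGE_SIZE
        ranges.append((base + run_start * PAGE_SIZE, length))
    return ranges
```

For example, `coalesce_pages(0x1000, [1, 1, 0, 1])` yields two ranges: two pages starting at 0x1000 and one page at 0x4000. Fewer, larger ranges mean fewer entries to hand over to the code that pushes pages into the pipes.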
Restoring
Restoring is pretty straightforward, as during restore CRIU morphs itself into the target task. Two things are worth mentioning.
- COW
  Anonymous private mappings might have pages shared between tasks until they get COW-ed. To restore this, CRIU pre-restores those pages before forking the child processes and mremap-s them in the final stage.
- Shared memory
  Those areas are implemented in the kernel as a pseudo file on a hidden tmpfs mount. So on restore we just determine who will create the shared area and who will attach to it (see the postulates). Then the creator mmap-s the region and the others open the /proc/pid/map_files/ link. On recent kernels, however, we use the newer memfd system call (memfd_create), which does a similar thing but also works with user namespaces. Briefly: the creator creates the memfd, and all the others get it via the /proc/pid/fd link, which is not as strict as map_files.
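For the COW item, the behaviour that restore has to reproduce can be observed with a private anonymous mapping created before a fork. This is a small demonstration of the kernel semantics only; it does not show CRIU's pre-restore or its mremap step, and `cow_demo` is a hypothetical name.

```python
import mmap
import os
from typing import Tuple


def cow_demo() -> Tuple[bytes, bytes]:
    """Return (parent_view, child_view) of a private page after the child writes."""
    # Private anonymous mapping, populated before the fork.
    region = mmap.mmap(-1, mmap.PAGESIZE,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    region[:6] = b"parent"

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the page is shared with the parent until this write,
        # which triggers copy-on-write and gives the child its own copy.
        os.close(read_fd)
        region[:6] = b"child!"
        os.write(write_fd, region[:6])
        os._exit(0)

    # Parent: collect what the child saw, then look at its own copy.
    os.close(write_fd)
    child_view = os.read(read_fd, 6)
    os.close(read_fd)
    os.waitpid(pid, 0)
    parent_view = region[:6]
    region.close()
    return parent_view, child_view
```

The parent still sees its original bytes while the child sees its own write, which is why pages that were populated before the fork have to exist before the children are forked on restore.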
Non-linear mappings
Non-linear mappings are currently not supported; the dump fails if any are present.