Iterative migration

Revision as of 18:00, 28 January 2015 by Xemul

This page describes how to reduce the freeze time of an application by using memory change tracking to perform pre-copy memory migration.

Note: It is assumed that you have already read the Live migration article.

Migration sequence

The steps below look like those in regular live migration, but include one or more pre-dump stages.

Pre-dump

Take the tasks you are about to migrate and pre-dump them into a directory. Unlike a regular dump, a pre-dump leaves the tasks running.

[src]# criu pre-dump --tree <pid> --images-dir <path-to-existing-directory-A>

The directory with images can be on shared storage, or you can use disk-less migration to avoid the Copy step.

Now you can either proceed to the next step and do a regular dump, or perform the pre-dump step again. In the latter case, pre-dump generates another set of pre-dump images containing only the memory that has changed since the previous pre-dump. Doing several pre-dump iterations may reduce the amount of data written at the dump stage and thus shorten the freeze time.

Note that if you perform more than one pre-dump step, each step must write to its own directory: pass a different --images-dir to every pre-dump and dump step, and point each step at the previous step's directory with --prev-images-dir.
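The directory chaining described above can be sketched as a small helper that prints the command sequence for review rather than running it. This is only an illustration: the function name iterative_dump_cmds and the directory names pre-N and final are assumptions made for this example, while the criu options are the ones used on this page.

```shell
# Sketch: print the commands for <count> pre-dump iterations followed by
# the final dump. Directory names (pre-1, pre-2, ..., final) are
# illustrative assumptions, not something criu requires.
# Usage: iterative_dump_cmds <pid> <images-root> <count>
iterative_dump_cmds() {
    pid=$1
    root=$2
    count=$3
    prev=""     # name of the previous step's directory, empty on first step
    i=1
    while [ "$i" -le "$count" ]; do
        echo "mkdir -p $root/pre-$i"
        if [ -z "$prev" ]; then
            # First pre-dump: there is no previous image set to refer to.
            echo "criu pre-dump --tree $pid --images-dir $root/pre-$i"
        else
            # Later pre-dumps reference the previous directory, given
            # relative to the directory being written.
            echo "criu pre-dump --tree $pid --images-dir $root/pre-$i --prev-images-dir ../$prev"
        fi
        prev="pre-$i"
        i=$((i + 1))
    done
    echo "mkdir -p $root/final"
    if [ -z "$prev" ]; then
        echo "criu dump --tree $pid --images-dir $root/final --leave-stopped"
    else
        echo "criu dump --tree $pid --images-dir $root/final --prev-images-dir ../$prev --leave-stopped"
    fi
}
```

For example, iterative_dump_cmds 1234 /var/tmp/imgs 2 prints two pre-dump commands and one dump command, each preceded by the mkdir that creates its directory. Every step after the first references the directory of the step before it.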

Dump

Now do a regular dump of your processes.

[src]# criu dump --tree <pid> --images-dir <path-to-existing-directory-B> \
 --prev-images-dir <path-to-directory-A-relative-to-B> --leave-stopped

Note that:

  1. this dump works faster than one without pre-dump, because it only saves the memory that has changed since the last pre-dump;
  2. the --prev-images-dir option should contain the path to the directory with pre-dump images, relative to the directory where the dump images will be put.
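As an illustration of the relative path in item 2, the value for --prev-images-dir can be derived from two absolute paths. The sketch below assumes GNU coreutils realpath is available; the paths in A and B are made-up examples.

```shell
# Illustrative directories (assumptions, not taken from this page).
A=/var/tmp/imgs/A   # pre-dump images
B=/var/tmp/imgs/B   # final dump images

# GNU realpath: -m accepts paths that do not exist yet,
# --relative-to prints A as seen from B.
prev=$(realpath -m --relative-to="$B" "$A")
echo "$prev"        # prints ../A
```

The result is what would be passed as --prev-images-dir together with --images-dir "$B".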

Copy

Copy images to the destination node:

[src]# scp -r <path-to-images-dir> <dst>:/<path-to-images>

Restore

On the destination node restore the apps from images:

[dst]# criu restore --tree <pid> --images-dir <path-to-images>

Kill

If everything went OK, you can kill the stopped tasks on the source node:

[src]# FIXME put command here
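The command for this step is not given above. One possible approach, offered only as a sketch and not as the documented procedure, is to send SIGKILL to the dumped process and all of its descendants. The helper names list_tree and kill_tree are hypothetical, and pgrep (from procps) is assumed to be installed.

```shell
# Hypothetical helper: print all descendants of a PID, then the PID itself.
list_tree() {
    for child in $(pgrep -P "$1"); do
        list_tree "$child"
    done
    echo "$1"
}

# Hypothetical helper: kill the whole tree rooted at the given PID.
# SIGKILL is delivered to stopped tasks too. The tasks are stopped after
# "criu dump --leave-stopped", so the tree cannot change while it is listed.
kill_tree() {
    # Word splitting of the PID list is intentional here.
    kill -KILL $(list_tree "$1")
}
```

On the source node this would be run as kill_tree <pid>, with the same <pid> that was passed to criu dump --tree.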