Disk-less migration

From CRIU

When performing live migration, CRIU writes image files containing the applications' memory to storage that the user provides. If these images are large, migration is delayed considerably because the data has to be copied several times. In some situations it is also desirable to avoid the storage altogether, so as not to add load to it. This article describes, step by step, how to do live migration without putting images on disk.

The process

Preparation

Prepare a tmpfs mount on both sides to hold the images other than those with the applications' memory. These images are typically very small and will not create significant memory pressure on the nodes.

dst# mount -t tmpfs none <dir>
src# mount -t tmpfs none <dir>
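Since the whole point is to keep images off disk, it can be worth confirming that <dir> really is a tmpfs mount before going further. The following is a minimal sketch, not part of CRIU: the function name is hypothetical, it assumes the Linux /proc/mounts layout, and it expects <dir> to be written exactly as it appears in the mount table (no trailing slash, no spaces in the path).

```shell
# Succeeds if <dir> is a tmpfs mount point according to the mount table.
# The table path is a parameter only so the check can be tried on a sample file.
is_tmpfs_mount() {
    dir=$1
    table=${2:-/proc/mounts}
    # Fields: device mountpoint fstype options dump pass.
    # The last matching entry wins, since later mounts cover earlier ones.
    awk -v d="$dir" '$2 == d { fs = $3 } END { exit !(fs == "tmpfs") }' "$table"
}
```

It could be used on either node as `is_tmpfs_mount <dir> || echo "<dir> is not on tmpfs"`.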

Run page server

Launch a page server on the destination node. The page server accepts pages from criu and puts them into the tmpfs mount. Since the applications are about to run on the destination node, it has to be able to hold this much memory anyway. The source node does not have to store these images.

dst# criu page-server --images-dir <dir> --port <port>

The page server now waits for incoming connections and writes the applications' memory to <dir>. When doing iterative migration, you can make the page server drop duplicated pages automatically by using the --auto-dedup option. See the incremental dumps article for details.

criu dump

Dump the applications as you would for a regular live migration, adding the options that tell criu where the page server is:

src# criu dump --tree <pid> --images-dir <dir> --leave-stopped --page-server --address <dst> --port <port>

Copy images

Copy the remaining images to the destination node:

src# scp -r <dir>/* <dst-node>:<dir>/

As mentioned before, the CRIU images copied here (everything but the processes' memory) are relatively small, so this does not take long.
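At this point <dir> on the destination should hold both halves of the dump: the pages written by the page server and the small images copied from the source. A quick sanity check before restoring might look like the sketch below. The function name is hypothetical, and the file names inventory.img and pages-*.img are assumed from CRIU's usual image naming rather than stated in this article.

```shell
# Succeeds if <dir> looks like it holds a complete set of images.
images_ready() {
    dir=$1
    if [ ! -f "$dir/inventory.img" ]; then
        echo "missing $dir/inventory.img (expected from the copy step)" >&2
        return 1
    fi
    # An unmatched glob stays literal, so test the first expansion result.
    set -- "$dir"/pages-*.img
    if [ ! -f "$1" ]; then
        echo "no pages-*.img in $dir (expected from the page server)" >&2
        return 1
    fi
}
```

On the destination this would be run as `images_ready <dir>` just before the restore step.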

criu restore

Restore the applications. By now the page server should have exited (check its exit code), and the images with pages are already in <dir>.

dst# criu restore --images-dir <dir>
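The steps up to this point have an ordering that is easy to get wrong by hand: the page server must be listening before the dump starts, and it must have exited cleanly before the restore. The sketch below strings them together from the source node. It is only an illustration under several assumptions: the function name is hypothetical, root ssh access to the destination is set up without a password prompt, <dir> is the same tmpfs path on both nodes, and the one-second sleep is a crude stand-in for properly waiting until the port accepts connections. Only criu options already shown in this article are used.

```shell
# Run on the source node: migrate <pid> <dir> <dst> <port>
migrate() {
    pid=$1; dir=$2; dst=$3; port=$4

    # Page server on the destination. ssh stays attached inside this
    # background job, so the job's exit status is the page server's.
    # -n keeps the backgrounded ssh from reading our stdin.
    ssh -n "$dst" criu page-server --images-dir "$dir" --port "$port" &
    server_job=$!

    # Crude: give the page server a moment to start listening.
    sleep 1

    # Dump on the source; memory pages go to the page server.
    if ! criu dump --tree "$pid" --images-dir "$dir" --leave-stopped \
            --page-server --address "$dst" --port "$port"; then
        # Stops the local ssh; the remote side may need cleaning up by hand.
        kill "$server_job" 2>/dev/null || true
        return 1
    fi

    # The page server should exit once the dump is over. A non-zero
    # status means the pages on the destination cannot be trusted.
    wait "$server_job" || return 1

    # Copy the small non-memory images, then restore.
    scp -r "$dir"/* "$dst:$dir/" || return 1
    ssh "$dst" criu restore --images-dir "$dir"
}
```

The function returns whatever the final criu restore returns, so a caller can decide whether it is safe to continue with cleanup.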

Cleanup tmpfs

Unmount the tmpfs with the old images; it is no longer required.

dst# umount <dir>

Kill processes on source

Kill the applications on the source node, as they are now running on the destination.

src# FIXME
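The command for this step is left open above. Because the dump was done with --leave-stopped, the dumped tasks are stopped and cannot fork, so one possible approach is to walk the process tree under <pid> and send SIGKILL to every member. The following is a sketch of that idea, not a documented CRIU procedure: the function names are hypothetical, and the parsing assumes the usual Linux /proc/<pid>/stat layout.

```shell
# Print the PIDs whose parent is $1, by scanning /proc/<pid>/stat.
children_of() {
    for stat in /proc/[0-9]*/stat; do
        # Layout: "pid (comm) state ppid ...". comm may contain spaces or
        # ")", so match greedily up to the last ") " before state and ppid.
        sed 's/^\([0-9]*\) .*) [A-Za-z] \([0-9]*\) .*/\1 \2/' "$stat" 2>/dev/null
    done | awk -v p="$1" '$2 == p { print $1 }'
}

# Print $1 and all of its descendants, one PID per line.
list_tree() {
    echo "$1"
    for child in $(children_of "$1"); do
        list_tree "$child"
    done
}

# Send SIGKILL to the whole tree rooted at $1.
kill_tree() {
    # Word splitting of the PID list is intentional.
    kill -KILL $(list_tree "$1")
}
```

On the source this would be `kill_tree <pid>` with the same <pid> that was given to criu dump, and only after the restore on the destination has succeeded.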

See also