Page server

Page server is a component of CRIU that allows user memory to be copied (rather than dumped) to a destination system during live migration. It is also used for lazy migration.

Rationale

For process tree migration, the largest part of the transferred data is the memory used by the processes. Optimizing this memory transfer is therefore beneficial.

 
Memory migration without page server

Without the page server, migrating the user memory pages consists of:

  • dumping (writing) the memory to files on disk;
  • reading the files and sending the data over the network to the destination host;
  • receiving files on the destination host, writing to files on disk;
  • restoring by reading the files into memory.

In other words, all the memory is written to disk twice and read from disk twice. This incurs significant I/O overhead and slows down the migration. The overhead is multiplied further when using criu pre-dump (such as for iterative migration), because in that case memory is dumped several times rather than once.

One way to mitigate the disk I/O overhead is to use tmpfs, a filesystem that uses RAM as storage (see Disk-less migration for details). This eliminates the disk I/O, but not the double read/write. The page server mechanism was implemented to eliminate that as well.

Operation

 
Memory migration with page server

When using the page server, memory is dumped not to disk but directly to the network, which eliminates any disk reads/writes on the source. On the destination system, a page server runs, receiving the data from the network and writing it to files on disk or tmpfs.

Note that the page server is only used to migrate user memory, i.e. the pagemap.img and pages.img files (see memory dumps if you don't know what that means). Everything else, including, say, process mappings (mm.img), is dumped in the traditional manner and should be moved over separately when doing migration.
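Once criu dump has finished on the source, the non-memory images still sitting in the source images directory have to reach the destination by some other means. The following is a minimal sketch of that step, assuming rsync over SSH is available (any file transfer tool would do) and assuming the files should land in the same directory the page server wrote to. The function name copy_remaining_images, the RSYNC override variable, and the host and paths are hypothetical.

```shell
#!/bin/sh
# Sketch only: copy the images that did NOT go through the page server.
# RSYNC can be overridden; rsync itself is an assumption, not a CRIU requirement.
RSYNC="${RSYNC:-rsync}"

# copy_remaining_images SRC_DIR DST_HOST DST_DIR
copy_remaining_images() {
    src_dir=$1
    dst_host=$2
    dst_dir=$3
    # The trailing slashes make rsync copy the directory contents,
    # not the directory itself.
    "$RSYNC" -a "$src_dir"/ "$dst_host:$dst_dir"/
}
```

For example, copy_remaining_images /var/tmp/criu-img dst.example.org /var/tmp/criu-img would be run on the source after the dump completes.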

Pages deduplication

Main article: Memory images deduplication.

When the iterative memory dumping feature (criu pre-dump) is used, memory is sent to the page server several times. The page server can automatically deduplicate pages by punching holes in parent images where the child image replaces an existing page. This functionality is turned on by the --auto-dedup option of the criu page-server command.

Usage

The obvious use case for the page server is live migration. In most cases, you can use P.Haul, which hides the details from you. Otherwise, read on.

Running page server

First, run a page server on a destination node:

[dst]# criu page-server --images-dir $DIR --port $PORT [--auto-dedup] [...]

Note that criu page-server is a "one-shot" service, meaning you need to run it for every migration, and it exits as soon as all pages are transferred (or on an error). It also makes sense to check its exit code to make sure there was no error.

The options are:

--images-dir $DIR
A directory to write memory images to. To speed things up, you might want to use tmpfs (see disk-less migration).
--port $PORT
A port number to listen on.
--auto-dedup
Perform auto deduplication of images. Useful with iterative memory dumps.

Some other options might also be useful, such as:

--ps-socket $FD
Use the provided file descriptor as the socket for the incoming connection. In this case --address and --port are ignored. Useful for intercepting page-server traffic, e.g. to add encryption or authentication.
-o|--logfile $FILE
A file to write the log to.
-v$N|-vvv[..]
Set logging level (verbosity). For example, -v4 (or -vvvv, which means the same) enables tons of debug info.
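Because the page server is a one-shot service, a wrapper script on the destination typically starts it once per migration and acts on the exit code. The following is a minimal sketch of that pattern using only the options listed above. The function name run_page_server, the CRIU override variable, and the log file name are hypothetical.

```shell
#!/bin/sh
# Sketch only: start a one-shot page server and report its exit code.
# CRIU can be overridden to point at a specific criu binary.
CRIU="${CRIU:-criu}"

# run_page_server IMAGES_DIR PORT
run_page_server() {
    img_dir=$1
    port=$2
    mkdir -p "$img_dir" || return 1
    # Blocks until all pages are received or an error occurs.
    if "$CRIU" page-server --images-dir "$img_dir" --port "$port" \
            -o page-server.log -v4; then
        echo "page server finished, images in $img_dir"
    else
        rc=$?
        echo "page server failed with exit code $rc" >&2
        return "$rc"
    fi
}
```

A call such as run_page_server /var/tmp/criu-img 1234 would be issued on the destination before the dump is started on the source; --auto-dedup can be added to the command line when iterative dumps are used.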

criu pre-dump/dump

On the source system, run criu pre-dump or criu dump as usual (see live migration, iterative migration etc.). To use the page server, you need to specify a few additional arguments:

--page-server
Send pages to a page server (rather than writing them to files on disk).
--address $IPADDR
IP address of the page server (destination host).
--port $PORT
Port number the page server is listening on.
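Put together, a dump on the source that sends its memory pages to a waiting page server might look like the sketch below. It combines the usual --tree and --images-dir options of criu dump with the three arguments described above. The function name dump_via_page_server, the CRIU override variable, and the example address are hypothetical.

```shell
#!/bin/sh
# Sketch only: dump a process tree, sending memory pages to a page server.
# CRIU can be overridden to point at a specific criu binary.
CRIU="${CRIU:-criu}"

# dump_via_page_server PID IMAGES_DIR DST_ADDR PORT
dump_via_page_server() {
    pid=$1
    img_dir=$2
    dst_addr=$3
    port=$4
    mkdir -p "$img_dir" || return 1
    # Non-memory images are still written to img_dir on the source;
    # only the memory pages go over the network.
    "$CRIU" dump --tree "$pid" --images-dir "$img_dir" \
        --page-server --address "$dst_addr" --port "$port"
}
```

For example, dump_via_page_server 4242 /var/tmp/criu-img 192.0.2.10 1234 would be run on the source once the page server is listening on the destination. The same three page server arguments apply to criu pre-dump.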

Video

To view an example video of using a page server, please go to https://asciinema.org/a/15847.

Limitations

  • Currently, it only works over TCP.
  • There is no encryption and no compression; data is passed over the network as is.
