Probing C/R

Sometimes CRIU is unable to checkpoint or restore a process for various reasons. This article describes a way to check in advance whether checkpoint/restore will work in a given case.

Rationale

A process might not be dumpable at all (see what cannot be checkpointed), or there might be problems that can be fixed. One way to check whether checkpoint/restore will work is to run criu dump with --leave-running on the source system, copy the dump over, and then run criu restore on the destination.

The problems with the above approach are:

  • If the restore succeeds, two copies of the same processes will be running on two machines, which may have disastrous effects;
  • A lot of memory is dumped to disk and transferred to the destination system, which is unnecessary for a test run.

To solve these problems, the --check-only option was introduced as an easier and faster way to test whether dump/restore will work.

Usage

Source system:

# criu dump --check-only -D /tmp/cp -t <PID>
Only checking if requested operation will succeed
...
# rsync -a /tmp/cp dest-system:/tmp

Destination system:

# criu restore -D /tmp/cp
Checking mode enabled
...

Upon restore, CRIU detects that the checkpoint is a check-only one and automatically performs the restore in check-only mode, so there is no need to supply the option explicitly.
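The three steps above can be chained in a small wrapper script. The following is only a sketch under stated assumptions: the function name probe_cr, the CRIU/RSYNC/SSH override variables and the return codes are hypothetical and not part of CRIU, and it assumes root access on both systems plus key-based ssh to the destination. It uses only the commands shown above.

```shell
#!/bin/sh
# Hypothetical helper, not part of CRIU: probe_cr <pid> <dest-host> [image-dir]
# CRIU, RSYNC and SSH may be overridden, e.g. to stub them out for a dry run.
: "${CRIU:=criu}" "${RSYNC:=rsync}" "${SSH:=ssh}"

probe_cr() {
    pid=$1
    dest=$2
    dir=${3:-/tmp/cp}   # no trailing slash, so rsync copies the directory itself

    # The image directory must exist before criu writes to it.
    mkdir -p "$dir" || return 1

    # Step 1: check-only dump on the source system.
    $CRIU dump --check-only -D "$dir" -t "$pid" || {
        echo "check-only dump failed for PID $pid" >&2
        return 1
    }

    # Step 2: copy the (small) check-only image to the same parent directory
    # on the destination.
    $RSYNC -a "$dir" "$dest:$(dirname "$dir")" || {
        echo "copy to $dest failed" >&2
        return 2
    }

    # Step 3: restore on the destination; check-only mode is picked up
    # from the image, so no extra option is passed.
    $SSH "$dest" criu restore -D "$dir" || {
        echo "check-only restore failed on $dest" >&2
        return 3
    }

    echo "C/R probe passed for PID $pid"
}
```

After sourcing the script, a call such as probe_cr <PID> dest-system runs the whole probe, and the return code tells which step failed.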

Restore-only

The --check-only switch can also be used when restoring from a full checkpoint, to see whether the restore will succeed while making sure the process does not actually start running:

Destination system:

# criu restore --check-only -D /tmp/cp
Only checking if requested operation will succeed
Checking mode enabled
...

Currently, only the existing checks (e.g., the binary size check) are run in check-only mode, but additional checks could be added, for example:

  • checksums of binaries
  • checksums of used libraries
  • available memory
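
Until such checks exist in CRIU, a binary checksum comparison can be approximated by hand. The sketch below is an illustration only and is not what CRIU does: the helper names file_sha256 and same_binary are hypothetical, and it assumes sha256sum from coreutils is available on both systems.

```shell
#!/bin/sh
# Hypothetical manual pre-check, not part of CRIU.

# Print only the SHA-256 digest of a file.
file_sha256() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# Succeed if the local file matches a digest recorded on the other system.
# $1: path on this system, $2: digest computed on the source system
same_binary() {
    [ "$(file_sha256 "$1")" = "$2" ]
}
```

On the source system, file_sha256 /proc/<PID>/exe gives the digest of the running binary. On the destination, same_binary called with the path of the installed binary and that digest succeeds only if the two files are identical.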