Probing C/R

Revision as of 16:20, 20 October 2016 by Kir (No need for a header repeating the article name)

Sometimes CRIU fails to checkpoint or restore a process. The process may be impossible to dump at all (see What cannot be checkpointed), or there may be problems that can be fixed. One way to find out is to run 'criu dump' with --leave-running and then run 'criu restore'. If either operation fails, there is most likely a problem with checkpointing or restoring.

This approach has two drawbacks. First, a lot of memory may already have been dumped to disk and transferred to the destination system, none of which is needed just to test for a restore failure. Second, if the restore succeeds, the source process has been told to keep running (--leave-running), so the process is now running on both the source and the destination system, which may be undesirable. To avoid this situation and to provide an easier way to test whether 'criu dump' and 'criu restore' will work, this patch introduces the --check-only option:

    source system:
     # criu dump --check-only -D /tmp/cp -t <PID>
     Only checking if requested operation will succeed
     # rsync -a /tmp/cp dest-system:/tmp
   
    destination system:
     # criu restore -D /tmp/cp
     Checking mode enabled

CRIU detects whether a checkpoint is a 'check-only' checkpoint; if it is, the restore automatically runs in --check-only mode.
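The two-system sequence above could be wrapped in a small shell function. The following is only a sketch under several assumptions: the function name probe_cr and its exit codes are hypothetical, criu is assumed to be installed under the same name on both systems, running the restore over ssh is an assumption (the example above runs it by hand on the destination), and --check-only is assumed to behave as described on this page.

```shell
# Hypothetical helper, not part of CRIU: probe a dump/restore cycle.
# Usage: probe_cr <PID> <destination host> [image directory]
probe_cr() {
    probe_pid="$1"
    probe_dest="$2"
    probe_dir="${3:-/tmp/cp}"

    mkdir -p "$probe_dir" || return 1

    # Source system: check-only dump, as in the example above.
    if ! criu dump --check-only -D "$probe_dir" -t "$probe_pid"; then
        echo "probe: dump check failed for PID $probe_pid" >&2
        return 1
    fi

    # Copy the image directory to the same parent path on the destination.
    if ! rsync -a "$probe_dir" "$probe_dest:$(dirname "$probe_dir")"; then
        echo "probe: transfer to $probe_dest failed" >&2
        return 2
    fi

    # Destination system: CRIU is expected to recognize the check-only
    # checkpoint by itself. Running criu over ssh is an assumption.
    if ! ssh "$probe_dest" "criu restore -D '$probe_dir'"; then
        echo "probe: restore check failed on $probe_dest" >&2
        return 3
    fi

    echo "probe: dump and restore checks passed"
}
```

It would be called as root, for example as `probe_cr <PID> dest-system /tmp/cp`; the distinct return values (1, 2, 3) indicate which of the three steps failed.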

The --check-only switch can also be used on a full checkpoint to see whether the restore will succeed while making sure that the process does not start running:

    destination system:
     # criu restore --check-only -D /tmp/cp
     Only checking if requested operation will succeed
     Checking mode enabled
   

Right now only the existing checks (e.g., the binary size check) are run in 'check-only' mode, but additional checks could be added, such as:

  • checksums of binaries
  • checksums of used libraries
  • available memory
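None of these additional checks exists in CRIU according to this page. As a rough illustration of what they might cover, the sketch below gathers the same information from outside CRIU using /proc. All function names are hypothetical, and the approach (reading /proc/<PID>/maps, /proc/<PID>/status and /proc/meminfo) is an assumption about how such checks could be approximated, not a description of how CRIU would implement them.

```shell
# Hypothetical helpers, not part of CRIU.

# Print SHA-256 checksums of the executable and of every file that is
# mapped executable into the process (this covers the shared libraries).
# Limitation: paths containing spaces are not handled.
checksum_process_files() {
    cs_pid="$1"
    {
        readlink "/proc/$cs_pid/exe"
        # Field 2 holds the permissions, field 6 the backing file path.
        awk '$2 ~ /x/ && $6 ~ /^\// { print $6 }' "/proc/$cs_pid/maps"
    } | sort -u | while IFS= read -r cs_file; do
        if [ -f "$cs_file" ]; then
            sha256sum "$cs_file"
        fi
    done
}

# Print the resident set size of a process in kB.
process_rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Print the kernel's estimate of available memory in kB.
available_memory_kb() {
    awk '/^MemAvailable:/ { print $2 }' /proc/meminfo
}
```

On the source system the checksums could be saved with `checksum_process_files <PID> > files.sha256` (a hypothetical file name) and, after copying that file, verified on the destination with `sha256sum -c files.sha256`. Comparing `process_rss_kb <PID>` on the source with `available_memory_kb` on the destination gives only a coarse estimate of whether the restore has enough memory.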