Difference between revisions of "Probing C/R"


Latest revision as of 20:17, 20 October 2016

Sometimes CRIU is unable to checkpoint or restore a process, for various reasons. This article describes a way to check whether checkpoint/restore will work for a given case.

Rationale

It might be the case that a process cannot be dumped at all (see what cannot be checkpointed), or that there are some problems which can be fixed. One way to check whether checkpoint/restore will work is to run criu dump with --leave-running on the source system, copy the dump over, then run criu restore on the destination.
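The test run described above can be sketched as a small shell script. This is only a minimal sketch, not an official CRIU tool: the variables CRIU and IMG_DIR and the function names are made up for illustration, and it assumes rsync and ssh access to the destination system.

```shell
#!/bin/sh
# Illustrative names (not CRIU conventions): CRIU is the binary to run,
# IMG_DIR is the directory that holds the checkpoint images.
CRIU=${CRIU:-criu}
IMG_DIR=${IMG_DIR:-/tmp/cp}

# Dump the process tree rooted at PID $1, but keep it running on the source.
dump_leave_running() {
    mkdir -p "$IMG_DIR" &&
    "$CRIU" dump --leave-running -D "$IMG_DIR" -t "$1"
}

# Copy the images to host $1 (same path as on the source) and restore there.
copy_and_restore() {
    rsync -a "$IMG_DIR" "$1:$(dirname "$IMG_DIR")" &&
    ssh "$1" "$CRIU" restore -D "$IMG_DIR"
}
```

A run would then look like dump_leave_running <PID> followed by copy_and_restore dest-system.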

The problems with the above approach are:

  • If restore succeeds, there will be two sets of the same processes running on two machines, which may have disastrous effects;
  • A lot of memory is dumped to disk and transferred to the destination system, which is not necessary for a test run.

To solve these problems, the --check-only option was introduced as an easier and faster way to test whether dump/restore will work.

Usage

Source system:

# criu dump --check-only -D /tmp/cp -t <PID>
Only checking if requested operation will succeed
...
# rsync -a /tmp/cp dest-system:/tmp

Destination system:

# criu restore -D /tmp/cp
Checking mode enabled
...

Upon restore, CRIU will see that the checkpoint is a check-only one and will automatically perform the restore in check-only mode, so there is no need to supply the option explicitly.
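The two steps above could be chained into a single probe, for example as follows. This is a hedged sketch: probe_cr, CRIU and IMG_DIR are hypothetical names, and the script assumes that criu exits with a non-zero status when a check fails and that the destination is reachable over ssh.

```shell
#!/bin/sh
# Illustrative names (not CRIU conventions).
CRIU=${CRIU:-criu}
IMG_DIR=${IMG_DIR:-/tmp/cp}

# probe_cr PID DEST
# Returns 0 only if the check-only dump on this host and the (automatically
# check-only) restore on DEST both succeed.
probe_cr() {
    pid=$1
    dest=$2
    mkdir -p "$IMG_DIR" || return 1
    "$CRIU" dump --check-only -D "$IMG_DIR" -t "$pid" ||
        { echo "dump check failed" >&2; return 1; }
    rsync -a "$IMG_DIR" "$dest:$(dirname "$IMG_DIR")" ||
        { echo "copying images failed" >&2; return 1; }
    # No --check-only here: the images themselves mark the checkpoint as
    # check-only, as described in the text.
    ssh "$dest" "$CRIU" restore -D "$IMG_DIR" ||
        { echo "restore check failed" >&2; return 1; }
    echo "C/R probe passed"
}
```

Since the function only reports success or failure, it can be used as a guard before starting a real migration.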

Restore-only

One can also use the --check-only switch on a full checkpoint to see whether the restore will succeed, while making sure the process will not start running:

Destination system:

# criu restore --check-only -D /tmp/cp
Only checking if requested operation will succeed
Checking mode enabled
...

Currently only the existing checks (e.g., the binary size check) are run in check-only mode, but additional checks could be added, for example:

  • checksums of binaries
  • checksums of used libraries
  • available memory
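
To illustrate what such checks would compare, the same information can be gathered outside of CRIU with standard tools. The following is a hypothetical sketch and not something CRIU does in check-only mode: the function names and the PROC variable are invented for this example, and it assumes a Linux /proc layout and the sha256sum utility.

```shell
#!/bin/sh
# PROC is an illustrative override for the procfs mount point.
PROC=${PROC:-/proc}

# Print the SHA-256 checksum of the executable of process $1.
binary_checksum() {
    sha256sum "$PROC/$1/exe" | cut -d' ' -f1
}

# Print "checksum  path" for every shared library mapped by process $1.
# Paths containing spaces are not handled in this sketch.
library_checksums() {
    awk '$6 ~ /\.so/ { print $6 }' "$PROC/$1/maps" | sort -u |
    while IFS= read -r lib; do
        sha256sum "$lib"
    done
}

# Print the memory currently available on this system, in kB.
available_memory_kb() {
    awk '/^MemAvailable:/ { print $2 }' "$PROC/meminfo"
}
```

Running the first two functions for the process on the source system and for the same paths on the destination, then comparing the output, would show whether the binaries and libraries match; the third gives a figure to compare against the memory usage of the process.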