ZDTM test suite

From CRIU
Latest revision as of 12:40, 14 September 2018

ZDTM stands for Zero DownTime Migration. The ZDTM test suite was originally developed to test how OpenVZ live migration works. CRIU also uses this test suite.

The suite consists of many small atomic tests. Each test case creates a process, puts it into some specific state (opens a file, maps a memory segment, puts data in a pipe, etc.), then asks to be checkpointed and restored. Upon restoring, it checks that the state was preserved (the file is still open, the memory is still mapped, the pipe contains what was put into it, etc.).
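To make the "set up state, get checkpointed and restored, verify state" pattern concrete, here is a minimal sketch in Python. It is not code from the suite: the function name and the `checkpoint_restore` hook are hypothetical, and the hook is only a placeholder for the point where an external tool would checkpoint and restore the process.

```python
import os


def pipe_state_survives(checkpoint_restore=lambda: None):
    """Toy version of one atomic test: data put in a pipe must still be there.

    `checkpoint_restore` is a hypothetical hook marking where checkpoint and
    restore would happen; by default it does nothing.
    """
    payload = b"data put into the pipe"
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, payload)      # put the process into a known state
        checkpoint_restore()             # the process would be dumped and restored here
        data = os.read(read_fd, len(payload))
        return data == payload           # verify that the state was preserved
    finally:
        os.close(read_fd)
        os.close(write_fd)


print("PASS" if pipe_state_survives() else "FAIL")
```

Each real test follows the same three steps, just with a different kind of state.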

Running

Preparation

Install libaio-devel (RPM) or libaio-dev (DEB). For the test launcher zdtm.py you need PyYAML (RPM) or python-yaml (DEB).

There is also a known issue with Btrfs, which spoils the dev_t values of files and sockets. Not all tests work on it, so it is better to use another filesystem.

All zdtm tests

You can run all the tests with:

      test/zdtm.py run -a

This runs the tests in the basic variation: run the test, checkpoint, restore, check the results.

There are more variations, each selected by an option to zdtm.py:

--nocr
Start tests and check results, omitting checkpoint/restore steps. Used to check that the tests themselves are working.
--norst
Start the test, then checkpoint it while leaving the test running, then check the results. Used to check that the checkpoint itself is not destructive.
--iter <number>
Start the test, then checkpoint and restore it <number> times. Used to check that after a restore the tests are still in a checkpointable state.
--pre <number>
Start the test, then do <number> pre-dumps, then checkpoint, restore and check the results. Used to check that pre-dumps work.
--page-server
Run tests, but send dumps (and pre-dumps) through the page server.
--sibling
Run tests, but restore in the so-called sibling mode. Used by LXC and Docker.
--snaps (in conjunction with --pre)
Instead of pre-dumps do full dumps.
--user (only works with --norst)
Check how criu works when run as a non-root user.
--join-ns
Restore tests and join existing namespace.
--empty-ns
Restore tests in empty net namespace.
--dedup
Auto-deduplicate images on iterations.
--noauto-dedup
Manually deduplicate images on iterations.
--stop
Check that the --leave-stopped option stops the process tree.
--fault
Test fault injection.
--sat
Generate CRIU strace-s for sat tool (restore is fake, images are kept).
--sbs
Do step-by-step execution, asking user for keypress to continue.
--freezecg
Use freeze cgroup (path:state).
--rpc
Run CRIU via RPC rather than CLI.
--remote
Use automatic images transfer.
--parallel
Run tests in parallel.
--dry-run
Don't run tests, just pretend to.
--script
Add script to be notified by CRIU.
--keep-img
Whether or not to keep images after test. Possible values: always, never, failed (default).
--report
Generate summary report in directory.
--keep-going
Keep running tests in spite of failures.
--ignore-taint
Don't care about a non-zero kernel taint flag.
--lazy-pages
Restore pages on demand. In this case the dump works "normally", the restore skips the memory pages that can be handled on demand, and the lazy-pages daemon handles the page faults and reads the memory contents from the image files.
--lazy-migrate
Allows testing of post-copy migration when running the ns or uns flavor. It cannot run with the host flavor because during post-copy migration the migrated tasks must exist on both the source and the destination.
--remote-lazy-pages
Simulates post-copy memory migration. Here again, the dump works "normally", but the lazy-pages daemon does not read data from the image files; instead it requests it "over the network" from the page-server. It is the page-server that reads memory pages from the images and simulates what 'dump --lazy-pages' would have done.
--title
Specify a test suite title.
--show-stats
Show CRIU Statistics.
--criu-bin
Path to CRIU binary (default: ../criu/criu).
--crit-bin
Path to crit binary (default: ../crit/crit).
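If you launch the suite from a script, the two constraints noted in the list above (--snaps goes with --pre, --user only works with --norst) can be checked before anything is started. The following is a minimal sketch: the helper name `build_run_all` is hypothetical, and it assumes that variation options can simply be appended after `run -a`. It only builds the argument list and does not start zdtm.py.

```python
def build_run_all(options=()):
    """Build an argv list for `test/zdtm.py run -a` plus variation options.

    `options` is a sequence of flag strings or (flag, value) pairs,
    e.g. ["--page-server", ("--pre", 2)].
    """
    flags = [opt[0] if isinstance(opt, tuple) else opt for opt in options]

    # Constraints taken from the option descriptions on this page.
    if "--snaps" in flags and "--pre" not in flags:
        raise ValueError("--snaps is used in conjunction with --pre")
    if "--user" in flags and "--norst" not in flags:
        raise ValueError("--user only works with --norst")

    cmd = ["test/zdtm.py", "run", "-a"]
    for opt in options:
        if isinstance(opt, tuple):
            flag, value = opt
            cmd += [flag, str(value)]
        else:
            cmd.append(opt)
    return cmd


print(build_run_all([("--pre", 2), "--page-server"]))
# ['test/zdtm.py', 'run', '-a', '--pre', '2', '--page-server']
```

The resulting list can be handed to `subprocess.run(cmd)` from the directory that contains `test/zdtm.py`.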

Flavors

Each test can be executed in up to three flavors: h, ns and uns. The h (host) flavor runs the test on the host, and criu checkpoints and restores it there. The ns (namespace) flavor runs the test in all namespaces except the user one, and criu checkpoints and restores it together with those namespaces. The uns (user namespace) flavor is like ns, but with the user namespace included as well.

By default, tests are run in every flavor they support, but you can choose a flavor with the -f flavor option.
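The flavor rules can be written down as a small sketch. The helper names are hypothetical; the only facts taken from this page are the three flavor names, the -f option, and the restriction that --lazy-migrate cannot run with the host flavor.

```python
ALL_FLAVORS = ("h", "ns", "uns")


def usable_flavors(test_flavors=ALL_FLAVORS, lazy_migrate=False):
    """Return the flavors a test could run in, in the order h, ns, uns.

    `test_flavors` are the flavors the test itself supports. With
    `lazy_migrate=True` the host flavor is dropped, as described for
    the --lazy-migrate option.
    """
    flavors = [f for f in ALL_FLAVORS if f in test_flavors]
    if lazy_migrate:
        flavors = [f for f in flavors if f != "h"]
    return flavors


def flavor_args(flavor):
    """Return the command line fragment that selects a single flavor."""
    if flavor not in ALL_FLAVORS:
        raise ValueError("unknown flavor: %r" % (flavor,))
    return ["-f", flavor]


print(usable_flavors(lazy_migrate=True))  # ['ns', 'uns']
print(flavor_args("uns"))                 # ['-f', 'uns']
```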

Specific tests

You can run a single test with:

      zdtm.py run -t test-name

The simplest test is zdtm/static/env00. It has the bare minimum a process might have: a little bit of memory, a signal handler and one open file. If CRIU cannot C/R this test, it surely cannot do anything else.

You can also run only the test programs themselves manually by issuing the

      make test-name.pid

command. After you have finished checkpointing and restoring the test, run

      make test-name.out

and check the contents of the test-name.out file.
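Checking the .out file by hand can also be scripted. The sketch below assumes that a successful test leaves the word PASS somewhere in its .out file; this page does not state that, so treat it as an assumption and adjust the check to what your .out files actually contain. The helper name `out_file_passed` is hypothetical.

```python
from pathlib import Path


def out_file_passed(out_path):
    """Return True if the given .out file contains the token PASS.

    Assumption: a passing test writes the word PASS to its .out file.
    The file is split on whitespace so that any prefix on the line
    (timestamps, pids) does not matter.
    """
    text = Path(out_path).read_text(errors="replace")
    return "PASS" in text.split()
```

For example, `out_file_passed("test-name.out")` could be called right after `make test-name.out` finishes.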

If you don't want to deal with this by hand, you can use the zdtm.py run command. When launched with the -a option, it runs all the tests one by one. A specific test can be selected with a command line argument. The list command lists the tests it can run.

See also

ZDTM API