Difference between revisions of "ZDTM test suite"

Latest revision as of 12:40, 14 September 2018

ZDTM stands for Zero DownTime Migration. The ZDTM test suite was developed to test how OpenVZ live migration works. CRIU also uses this test suite.

The suite consists of many small atomic tests. Each test case creates a process, puts it into some specific state (opens a file, maps a memory segment, puts data in a pipe, etc.), then asks to be checkpointed and restored. After the restore, it checks that the state was preserved (the file is still open, the memory is still mapped, the pipe still contains what was put into it, etc.).

Running

Preparation

Install libaio-devel (RPM) or libaio-dev (DEB). For the test launcher zdtm.py you need PyYAML (RPM) or python-yaml (DEB).

There is also a known issue with BTRFS spoiling dev_t values for files and sockets. Not all tests work on it, so it is better to use another file system.

All zdtm tests

You can run all the tests with:

      test/zdtm.py run -a

This runs the tests in the basic variation: start the test, checkpoint, restore, and check the results.

There are more variations, each selected by an option to zdtm.py:

--nocr: Start tests and check results, omitting checkpoint/restore steps. Used to check that the tests themselves are working.

--norst: Start the test, then checkpoint it, leaving the test running, then check the results. Used to check that checkpointing itself is not destructive.

--iter <number>: Start the test, then checkpoint and restore it <number> times. Used to check that tests are still in a checkpointable state after restore.

--pre <number>: Start the test, then do <number> pre-dumps, then checkpoint, restore and check the results. Used to check that pre-dumps work.

--page-server: Run tests, but dumps (and pre-dumps) will go through the page server.

--sibling: Run tests, but restore in a so-called sibling mode. Used by LXC and Docker.

--snaps (in conjunction with --pre): Instead of pre-dumps do full dumps.

--user (only works with --norst): Check how CRIU works when run as a non-root user.

--join-ns: Restore tests and join an existing namespace.

--empty-ns: Restore tests in an empty net namespace.

--dedup: Auto-deduplicate images on iterations.

--noauto-dedup: Manually deduplicate images on iterations.

--stop: Check that the --leave-stopped option stops the process tree.

--fault: Test fault injection.

--sat: Generate CRIU strace-s for sat tool (restore is fake, images are kept).

--sbs: Do step-by-step execution, asking the user for a keypress to continue.

--freezecg: Use freeze cgroup (path:state).

--rpc: Run CRIU via RPC rather than CLI.

--remote: Use automatic images transfer.

--parallel: Run tests in parallel.

--dry-run: Don't run tests, just pretend to.

--script: Add script to be notified by CRIU.

--keep-img: Whether or not to keep images after test. Possible values: always, never, failed (default).

--report: Generate summary report in directory.

--keep-going: Keep running tests in spite of failures.

--ignore-taint: Don't care about a non-zero kernel taint flag.

--lazy-pages: Restore pages on demand. In this case the dump works "normally", the restore skips the memory pages that can be handled on demand, and the lazy-pages daemon handles the page faults and reads the memory contents from the image files.

--lazy-migrate: Allows testing of post-copy migration when running the ns or uns flavor. It cannot run with the host flavor because during post-copy migration the migrated tasks should exist both on the source and the destination.

--remote-lazy-pages: Simulates post-copy memory migration. Here again, the dump works "normally", but the lazy-pages daemon does not read data from the image files; instead it requests the data "over the network" from the page server. It is the page server that reads memory pages from the images and simulates what 'dump --lazy-pages' would have done.

--title: Specify a test suite title.

--show-stats: Show CRIU Statistics.

--criu-bin: Path to CRIU binary (default: ../criu/criu).

--crit-bin: Path to crit binary (default: ../crit/crit).
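
The options above can be combined on one command line. As a rough illustration, here is a minimal Python sketch that assembles a zdtm.py command and checks the two constraints stated in the list (--snaps goes with --pre, --user only works with --norst). The helper name zdtm_run_cmd and its parameters are hypothetical and are not part of CRIU or zdtm.py; the sketch only builds and prints the command, it does not run it.

```python
import shlex

def zdtm_run_cmd(test=None, options=(), launcher="test/zdtm.py"):
    """Build an argv list for 'zdtm.py run' (hypothetical helper, not part of CRIU)."""
    opts = list(options)
    # Only the "--name" entries are option names; values such as "2" are skipped.
    names = {o for o in opts if o.startswith("--")}
    # Constraints taken from the option list above.
    if "--snaps" in names and "--pre" not in names:
        raise ValueError("--snaps is meant to be used in conjunction with --pre")
    if "--user" in names and "--norst" not in names:
        raise ValueError("--user only works with --norst")
    # '-t test-name' selects one test, '-a' runs all of them.
    target = ["-t", test] if test else ["-a"]
    return [launcher, "run"] + target + opts

cmd = zdtm_run_cmd("zdtm/static/env00", ["--pre", "2", "--page-server"])
print(shlex.join(cmd))
# prints: test/zdtm.py run -t zdtm/static/env00 --pre 2 --page-server
```

The resulting list could be passed to subprocess.run from the CRIU source tree; checking the constraints before launching just gives an earlier, clearer error.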

Flavors

Each test can be executed in up to 3 flavors: h, ns and uns. The first is the host flavor: the test runs on the host and CRIU checkpoints and restores it. ns is the namespace flavor: the test runs in all namespaces except the user one, and CRIU checkpoints and restores the test together with all those namespaces. uns is the user namespace flavor: it is like ns, but with a user namespace as well.

By default, tests are run in all the flavors they support, but you can choose a flavor with the -f flavor option.
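
To make the flavor choice concrete, the following Python sketch maps the three flavor names to their meaning and produces the -f arguments for one of them. It also encodes the restriction from the option list that --lazy-migrate cannot run with the host flavor. The names FLAVORS and flavor_args are hypothetical and exist only for this illustration.

```python
# Flavor names and meanings as described in the text above.
FLAVORS = {"h": "host", "ns": "namespace", "uns": "user namespace"}

def flavor_args(flavor, options=()):
    """Return the '-f' arguments for one flavor (hypothetical helper)."""
    if flavor not in FLAVORS:
        raise ValueError(f"unknown flavor {flavor!r}, expected one of {sorted(FLAVORS)}")
    # --lazy-migrate is documented as not usable with the host flavor.
    if flavor == "h" and "--lazy-migrate" in options:
        raise ValueError("--lazy-migrate needs the ns or uns flavor")
    return ["-f", flavor]

print(flavor_args("ns"))
# prints: ['-f', 'ns']
```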

Specific tests

You can run a single test with

      zdtm.py run -t test-name

The simplest test is zdtm/static/env00. It has the bare minimum a process might have -- a little bit of memory, a signal handler and one open file. If CRIU cannot C/R this test, it surely cannot do anything else.

You can also run just the test programs themselves manually by issuing the

      make test-name.pid

command. After you have finished checkpointing and restoring it, you should run

      make test-name.out

and check the contents of the test-name.out file.

If you would rather not do this by hand, you can use the zdtm.py run command. When launched with the -a option, it runs all the tests one by one. A specific test can be selected with a command-line argument (-t). The list command lists the tests it can run.
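
The manual flow described above (make test-name.pid, checkpoint/restore, make test-name.out, inspect the file) can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names manual_steps and verdict are hypothetical, and the result markers are assumed to be a PASS token on success and a token starting with FAIL on failure. The text above does not specify the format of the .out file, so check a real file before relying on this.

```python
def manual_steps(test_name):
    """Commands for the manual flow, in order (hypothetical helper)."""
    return [
        ["make", f"{test_name}.pid"],  # start the test program
        # ... checkpoint and restore the test process with CRIU here ...
        ["make", f"{test_name}.out"],  # produce the result file
    ]

def verdict(out_text):
    """Classify the contents of a test-name.out file.

    Assumption: a successful test writes a PASS token and a failed one
    writes a token starting with FAIL; anything else is reported as unknown.
    """
    tokens = out_text.split()
    if any(t.startswith("FAIL") for t in tokens):
        return "fail"
    if "PASS" in tokens:
        return "pass"
    return "unknown"

print(manual_steps("env00"))
print(verdict("PASS\n"))
```

Returning "unknown" for anything unexpected keeps the sketch from reporting success on an empty or truncated result file.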

@@ Line 1: / Line 1: @@
-{| class="wikitable"
+'''ZDTM''' stands for '''Z'''ero '''D'''own'''T'''ime '''M'''igration. '''ZDTM test suite''' was developed for testing how OpenVZ live migration works. CRIU also uses this test suite.
-|-
-! Name
+The suite consists of many small atomic tests. Each test case creates a process, puts it into some specific state (opens a file, maps a memory segment, puts data in a pipe, etc.), then asks to be checkpointed and restored. Upon restoring, it checks that the state was preserved (file is still opened, memory is still mapped, pipe contains what was put into it, etc).
-! Status
-|-
+== Running ==
-! colspan="2" | Static
-|-
+=== Preparation ===
-|busyloop00||+
-|-
+Install <code>libaio-devel</code> (RPM) or <code>libaio-dev</code> (DEB). For the test launcher <code>zdtm.py</code> you need <code>PyYAML</code> (RPM) or <code>python-yaml</code> (DEB).
-|cwd00||+
-|-
+There's also a known issue with BTRFS spoiling <code>dev_t</code> values for files and sockets! Not all tests will work on it, so it's better to use other FS.
-|deleted_dev||
-|-
+=== All zdtm tests ===
-|deleted_unix_sock||
-|-
+You can run all the tests by:
-|env00||+
+       test/zdtm.py run -a
-|-
-|fifo||
+This would run the tests in basic variation -- run the test, checkpoint, restore, check for results.
-|-
-|fifo_wronly||
+There are more variations, each is an option to the zdtm.py. Here they are:
-|-
-|file_attr||
+;--nocr
-|-
+: Start tests and check results, omitting checkpoint/restore steps. Used to check that the tests themselves are working.
-|fpu00||
-|-
+;--norst
-|futex||+
+: Start the test, then checkpoint it leaving the tests running, then check the results. Used to check that checkpoint itself is not destructive.
-|-
-|inotify_system||
+;--iter <number>
-|-
+: Start the test, then checkpoint and restore it the <number> of times. Used to check that after restore tests are in checkpoint-able state.
-|link10||
-|-
+;--pre <number>
-|maps00||+
+: Start the test, then do a <number> of pre-dumps, then checkpoint, restore and check results. Used to check that pre-dumps work.
-|-
-|mmx00||
+;--page-server
-|-
+: Run tests, but dumps (and pre-dumps) will go through the [[page server]].
-|mprotect00||+
-|-
+;--sibling
-|msgque||
+: Run tests, but restore in a so-called ''sibling'' mode. Used by LXC and Docker.
-|-
-|mtime_mmap||+
+;--snaps (in conjunction with --pre)
-|-
+: Instead of pre-dumps do full dumps.
-|overmount_dev||
-|-
+;--user (only works with --norst)
-|overmount_fifo||
+: Check how criu works when run from non-root.
-|-
-|overmount_file||
+;--join-ns
-|-
+: Restore tests and join existing namespace.
-|overmount_sock||
-|-
+;--empty-ns
-|pid00||FAIL: pid00.c:37: ppid != getppid()
+: Restore tests in empty net namespace.
-|-
-|pipe00||+
+;--snaps
-|-
+: Instead of pre-dumps do full dumps.
-|route_rules||
-|-
+;--dedup
-|shm||+
+: Auto-deduplicate images on iterations.
-|-
-|sleeping00||+
+;--noauto-dedup
-|-
+: Manual deduplicate images on iterations.
-|socket_aio||
-|-
+;--stop
-|sse00||
+: Check that --leave-stopped option stops process tree.
-|-
-|sse20||
+;--fault
-|-
+: Test fault injection.
-|timers||
-|-
+;--sat
-|umask00||
+: Generate CRIU strace-s for sat tool (restore is fake, images are kept).
-|-
-|unbound_sock||
+;--sbs
-|-
+: Do step-by-step execution, asking user for keypress to continue.
-|unlink_fifo||
-|-
+;--freezecg
-|unlink_fifo_wronly||
+: Use freeze cgroup (path:state).
-|-
-|unlink_fstat00||
+;--rpc
-|-
+: Run CRIU via RPC rather than CLI.
-|unlink_fstat01||
-|-
+;--remote
-|unlink_largefile||
+: Use [[Image_cache/proxy_TODO|automatic images transfer]].
-|-
-|wait00||+
+;--parallel
-|-
+: Run test in parallel.
-|write_read00||+
-|-
+;--dry-run
-|write_read01||+
+: Don't run tests, just pretend to.
-|-
-|write_read02||+
+;--script
-|-
+: Add script to be notified by CRIU.
-|write_read10||
-|-
+;--keep-img
-|zombie00||+
+: Whether or not to keep images after test. Possible values: <code>always</code>, <code>never</code>, <code>failed</code> (default).
-|-
-|socket_listen||+
+;--report
-|-
+: Generate summary report in directory.
-! colspan="2" | Streaming
-|-
+;--keep-going
-|fifo_dyn||
+: Keep running tests in spite of failures.
-|-
-|fifo_loop||
+;--ignore-taint
-|-
+: Don't care about a non-zero kernel taint flag.
-|file_aio||
-|-
+;--lazy-pages
-|netlink00||
+: Restore pages on demand. In this case dump works "normally", restore skips the memory pages that can be handled on demand and lazy-pages daemon handles the page faults and reads the memory contents from the image files.
-|-
-|pipe_loop00||+
+;--lazy-migrate
-|-
+: Allows testing of post-copy migration when running <code>ns</code> or <code>uns</code> flavor. It cannot run with the host flavor because during post-copy migration the migrated tasks should exist both on the source and the destination.
-|pipe_shared00||+
-|-
+;--remote-lazy-pages
-|socket_loop00||
+: Simulates post-copy memory migration. Here again, the dump works "normally", but the lazy-pages daemon does not read data from the image files but requests it "over the network" from the page-server. It is the page-server that reads memory pages from the images and simulates what 'dump --lazy-pages' would have done.
-|-
-|unix_sock||
+;--title
-|-
+: Specify a test suite title.
-! colspan="2" | Transition
-|-
+;--show-stats
-|epoll||
+: Show CRIU [[Statistics]].
-|-
-|file_read||+
+;--criu-bin
-|-
+: Path to CRIU binary (default: <code>../criu/criu</code>).
-|ipc||
-|-
+;--crit-bin
-|ptrace||
+: Path to crit binary (default: <code>../crit/crit</code>).
-|}
+=== Flavors ===
+Each test can be executed in up to 3 flavors: <code>h</code>, <code>ns</code> and <code>uns</code>. The first is ''host'' flavor, the test is run on host and criu c/r-s it. <code>ns</code> is ''namespace'' flavor, tests are run in all namespaces but the user one, criu c/r-s test with all the namespaces. The <code>uns</code> is ''user namespace'', it's like <code>ns</code> but with user namespace in the game.
+By default tests are run on all flavors they can, but one can choose a flavor with the <code>-f ''flavor''</code> option.
+=== Specific tests ===
+You can run a single test with
+       zdtm.py run -t ''test-name''
+The simplest test is <code>zdtm/static/env00</code>. It has the bare minimum a process might have -- a little bit of memory, a signal handler and one opened file. If CRIU cannot C/R '''this''' test, it surely cannot do anything else.
+You can also run only the test proggies themselves manually by issuing the
+       make ''test-name''.pid
+command. After you've done c/r-ing it you should run
+       make ''test-name''.out
+and check for the ''test-name''.out file contents.
+If you don't want to mess with this, you can use the <code>zdtm.py run</code> script. When launched with the <code>-a</code> option, it runs all the tests one-by-one. The exact test can be specified by a command line argument. The <code>list</code> command lists the tests it can run.
+== See also ==
+[[ZDTM API]]
+[[Category:Development]]
+[[Category:Testing]]