Performance research
Written here are performance issues found.
Timing stats of live migration of a small container with 11 tasks is
- Total time ~3.5 seconds
- Frozen time ~3.0 seconds
- Pre-dump stages ~0.5 seconds each
- Restore time ~1.9 seconds
- Images transfer time ~0.3 seconds
Below is the list of issues found
Images IO
Too many write-s and read-s. Each small object issues a write/read request into file. Need to improve.
Dump
Surprisingly, but the mem-drain time is not the biggest. It's "only" ~0.02 seconds. There are places in code that take longer.
parse_smaps
Time spent in this routine is up to 0.2 seconds on dump. This one exploits /proc heavily. For a container with 11 tasks the syscall stats look like
834 read 1451 fstat 1462 close 1642 openat
while opens and stats happen on
193 openat(4, "map-symlink", O_RDONLY) = -1 ENOENT (No such file or directory) 1438 openat(4, "map-symlink", O_RDONLY) = 5 11 openat(AT_FDCWD, "/proc/$pid/map_files", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 4
parasite_fuxup_vdso
The 2nd longest routine. It takes ~ 0.07 seconds on dump.
Too many readlink-s
Some files are readlinked 3-4 times in a row.
Restore
Fork vs VMA restore
We restore task's mappings before it goes forking to handle COW. This effectively serializes forking.
Restoring VMAs
There are 4 stages in VMA restore. Relative times of each are below
- Reading images 1%
- Mapping huuge premap area << 1%
- (Re-)mapping sub-areas 73%
- Filling area with data 26%
The 3rd stage has two parts. With timings:
- Opening filemap fd 85%
- Maping vma 15%
Parse maps instead of smaps
When parsing self maps for restorer hint, it's enough to read smaller /proc/self/maps
Perf stats
Here's how perf tree looks like
10.69% criu [.] pb_write_one | --- pb_write_one | |--18.74%-- dump_task_rlims | dump_one_task | cr_dump_tasks | main | __libc_start_main | |--18.63%-- dump_task_mappings | dump_one_task | cr_dump_tasks | main | __libc_start_main | |--18.51%-- parasite_dump_sigacts_seized | dump_one_task | cr_dump_tasks | main | __libc_start_main | |--9.51%-- dump_one_task | cr_dump_tasks | main | __libc_start_main | |--9.30%-- dump_one_reg_file | dump_task_exe_link | dump_task_mm | dump_one_task | cr_dump_tasks | main | __libc_start_main | |--9.20%-- dump_task_mm | dump_one_task | cr_dump_tasks | main | __libc_start_main | |--8.57%-- dump_task_fs | dump_one_task | cr_dump_tasks | main | __libc_start_main | --7.54%-- write_pagemap_loc page_xfer_dump_pages __parasite_dump_pages_seized parasite_dump_pages_seized dump_one_task cr_dump_tasks main __libc_start_main
8.86% criu [.] generate_iovs | --- generate_iovs __parasite_dump_pages_seized parasite_dump_pages_seized dump_one_task cr_dump_tasks main __libc_start_main
8.59% criu [.] should_dump_page | --- should_dump_page | |--81.22%-- generate_iovs | __parasite_dump_pages_seized | parasite_dump_pages_seized | dump_one_task | cr_dump_tasks | main | __libc_start_main | --18.78%-- __parasite_dump_pages_seized parasite_dump_pages_seized dump_one_task cr_dump_tasks main __libc_start_main
5.01% criu [.] is_vma_range_fmt | --- is_vma_range_fmt | |--80.15%-- parse_smaps | collect_mappings | dump_one_task | cr_dump_tasks | main | __libc_start_main | --19.85%-- collect_mappings dump_one_task cr_dump_tasks main __libc_start_main
3.04% criu [.] print_on_level | --- print_on_level | |--33.73%-- parasite_wait_ack | __parasite_wait_daemon_ack | parasite_execute_daemon | parasite_fixup_vdso | dump_one_task | cr_dump_tasks | main | __libc_start_main | |--33.28%-- parasite_start_daemon | parasite_infect_seized | dump_one_task | cr_dump_tasks | main | __libc_start_main | --32.99%-- __parasite_wait_daemon_ack parasite_drain_fds_seized dump_task_files_seized dump_one_task cr_dump_tasks main __libc_start_main
3.03% criu [.] open_image_at | --- open_image_at | |--66.87%-- cr_fdset_open_range | cr_task_fdset_open | dump_one_task | cr_dump_tasks | main | __libc_start_main | --33.13%-- dump_task_files_seized dump_one_task cr_dump_tasks main __libc_start_main
3.00% criu [.] parse_vmflags | --- parse_vmflags | |--66.59%-- parse_smaps | collect_mappings | dump_one_task | cr_dump_tasks | main | __libc_start_main | --33.41%-- collect_mappings dump_one_task cr_dump_tasks main __libc_start_main
3.00% criu [.] vma_opt_str | --- vma_opt_str | |--66.31%-- pr_vma | | | |--50.06%-- dump_task_mappings | | dump_one_task | | cr_dump_tasks | | main | | __libc_start_main | | | --49.94%-- collect_mappings | dump_one_task | cr_dump_tasks | main | __libc_start_main | --33.69%-- collect_mappings dump_one_task cr_dump_tasks main __libc_start_main
2.19% criu [.] do_open_proc | --- do_open_proc | |--36.18%-- parse_pid_stat_small | seize_task | get_children | collect_task | collect_pstree | cr_dump_tasks | main | __libc_start_main | |--31.99%-- parse_threads | collect_threads | collect_task | get_children | collect_task | collect_pstree | cr_dump_tasks | main | __libc_start_main | --31.84%-- seize_task get_children collect_task collect_pstree cr_dump_tasks main __libc_start_main
2.03% criu [.] parse_smaps | --- parse_smaps collect_mappings dump_one_task cr_dump_tasks main __libc_start_main
2.02% criu [.] try_add_page_to | --- try_add_page_to try_add_page page_pipe_add_page generate_iovs __parasite_dump_pages_seized parasite_dump_pages_seized dump_one_task cr_dump_tasks main __libc_start_main
2.02% criu [.] ptrace_peek_area | --- ptrace_peek_area ptrace_swap_area __parasite_execute_syscall syscall_seized mmap_seized parasite_map_exchange parasite_infect_seized dump_one_task cr_dump_tasks main __libc_start_main
2.01% criu [.] collect_mappings | --- collect_mappings dump_one_task cr_dump_tasks main __libc_start_main
2.01% criu [.] __parasite_dump_pages_seized | --- __parasite_dump_pages_seized parasite_dump_pages_seized dump_one_task cr_dump_tasks main __libc_start_main
2.00% criu [.] get_task_regs | --- get_task_regs parasite_start_daemon parasite_infect_seized dump_one_task cr_dump_tasks main __libc_start_main
2.00% criu [.] parasite_args_s | --- parasite_args_s | |--50.05%-- parasite_dump_pages_seized | dump_one_task | cr_dump_tasks | main | __libc_start_main | --49.95%-- dump_one_task cr_dump_tasks main __libc_start_main
2.00% criu [.] parasite_stop_on_syscall | --- parasite_stop_on_syscall parasite_unmap parasite_cure_remote parasite_cure_seized dump_one_task cr_dump_tasks main __libc_start_main
1.97% criu [.] open_pid_proc | --- open_pid_proc | |--51.54%-- parasite_fixup_vdso | dump_one_task | cr_dump_tasks | main | __libc_start_main | --48.46%-- do_open_proc dump_task_fs dump_one_task cr_dump_tasks main __libc_start_main
1.55% criu [.] dump_one_file | --- dump_one_file dump_task_files_seized dump_one_task cr_dump_tasks main __libc_start_main
1.13% criu [.] alloc_cr_fdset | --- alloc_cr_fdset cr_task_fdset_open dump_one_task cr_dump_tasks main __libc_start_main
1.03% criu [.] unseize_task_and_threads | --- unseize_task_and_threads pstree_switch_state cr_dump_tasks main __libc_start_main
1.02% criu [.] timeval_accumulate | --- timeval_accumulate timing_stop __parasite_dump_pages_seized parasite_dump_pages_seized dump_one_task cr_dump_tasks main __libc_start_main
1.01% criu [.] parasite_dump_misc_seized | --- parasite_dump_misc_seized dump_one_task cr_dump_tasks main __libc_start_main
1.01% criu [.] parse_posix_timers | --- parse_posix_timers dump_one_task cr_dump_tasks main __libc_start_main
1.01% criu [.] INIT_LIST_HEAD | --- INIT_LIST_HEAD __parasite_dump_pages_seized parasite_dump_pages_seized dump_one_task cr_dump_tasks main __libc_start_main