Changes

no edit summary
Line 20: Line 20:  
* Skill level: most probably advanced?
 
* Skill level: most probably advanced?
 
* Language: C
 
* Language: C
* Mentor: Mike Rapoport <rppt@linux.ibm.com>
+
* Mentor: Mike Rapoport <rppt@linux.ibm.com> / Andrei Vagin <avagin@gmail.com>
 
* Suggested by: Mike Rapoport <rppt@linux.ibm.com>
 
* Suggested by: Mike Rapoport <rppt@linux.ibm.com>
   Line 40: Line 40:  
* Skill level: intermediate
 
* Skill level: intermediate
 
* Language: C, though decoder/preprocessor can be in any language
 
* Language: C, though decoder/preprocessor can be in any language
* Mentor: Andrei Vagin <avagin@gmail.com>, Pavel Emelyanov <xemul@openvz.org>
+
* Mentor: Andrei Vagin <avagin@gmail.com> / Pavel Emelyanov <xemul@openvz.org>
 
* Suggested by: Andrei Vagin <avagin@gmail.com>
 
* Suggested by: Andrei Vagin <avagin@gmail.com>
   Line 78: Line 78:  
Current [[CLI/cmd/pre-dump|pre-dump]] mode is used to write task memory contents into image
 
Current [[CLI/cmd/pre-dump|pre-dump]] mode is used to write task memory contents into image
 
files w/o stopping the task for too long. It does this by stopping the task, infecting it and
 
files w/o stopping the task for too long. It does this by stopping the task, infecting it and
draining all the memory into a set of pipes. Then the task is cured and resumed and the pipes'
+
draining all the memory into a set of pipes. Then the task is cured and resumed, and the pipes'
contents is written into images (maybe a [[page server]]). This approach creates a big stress
+
contents are written into images (maybe a [[page server]]). Unfortunately, this approach creates
on memory subsystem, as keeping all the memory in pipes creates a lot of unreclaimable memory
+
a big stress on the memory subsystem, as keeping all memory in pipes creates a lot of unreclaimable
(pages in pipes are not swappable), as well as the number of pipes themselves can be buge (as
+
memory (pages in pipes are not swappable), as well as the number of pipes themselves can be huge, as
one pipe doesn't store more than a fixed certain amount of data).
+
one pipe doesn't store more than a fixed amount of data (see pipe(7) man page).
   −
We can try to use sys_read_process_vm() syscall to mitigate all of the above. To do this we
+
A solution for this problem is to use a sys_read_process_vm() syscall, which will mitigate
need to allocate a temporary buffer in criu, then walk the target process vm by copying the
+
all of the above. To do this we need to allocate a temporary buffer in criu, then walk the
memory piece-by-piece into it, then flush the data into image (or page server), then repeat.
+
target process vm by copying the memory piece-by-piece into it, then flush the data into image
 +
(or page server), and repeat.
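
The buffer-walk-flush loop described above can be sketched roughly as follows. This is only an illustration, not CRIU code: it assumes the syscall the text calls sys_read_process_vm() is the process_vm_readv(2) listed under Links, and the names drain_range and CHUNK_SIZE are hypothetical.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical size of the temporary buffer; a real tool would tune this. */
#define CHUNK_SIZE 4096

/*
 * Copy [addr, addr + len) of process `pid` into `out_fd` (an image file
 * or a page-server socket) through one fixed-size buffer.
 * Returns 0 on success, -1 on any read or write error.
 */
static int drain_range(pid_t pid, unsigned long addr, size_t len, int out_fd)
{
	char *buf = malloc(CHUNK_SIZE);
	int ret = -1;

	if (!buf)
		return -1;

	while (len > 0) {
		size_t want = len < CHUNK_SIZE ? len : CHUNK_SIZE;
		struct iovec local = { .iov_base = buf, .iov_len = want };
		struct iovec remote = { .iov_base = (void *)addr, .iov_len = want };
		ssize_t got, off = 0;

		/* Copy one piece of the target's memory into our buffer. */
		got = process_vm_readv(pid, &local, 1, &remote, 1, 0);
		if (got <= 0)
			goto out;

		/* Flush the piece before reusing the buffer. */
		while (off < got) {
			ssize_t w = write(out_fd, buf + off, got - off);
			if (w < 0)
				goto out;
			off += w;
		}

		addr += got;
		len -= got;
	}
	ret = 0;
out:
	free(buf);
	return ret;
}
```

Only CHUNK_SIZE bytes are held at any time, in ordinary swappable memory, instead of the whole task memory being pinned in pipes. A caller would invoke drain_range() once per mapped region of the target task.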
    
Ideally there should be sys_splice_process_vm() syscall in the kernel, that does the same as
 
Ideally there should be sys_splice_process_vm() syscall in the kernel, that does the same as
Line 92: Line 93:  
   
 
   
 
'''Links:'''
 
'''Links:'''
 +
* [[Memory pre dump]]
 
* https://github.com/checkpoint-restore/criu/issues/351
 
* https://github.com/checkpoint-restore/criu/issues/351
 
* [[Memory dumping and restoring]], [[Memory changes tracking]]
 
* [[Memory dumping and restoring]], [[Memory changes tracking]]
* [http://man7.org/linux/man-pages/man2/process_vm_readv.2.html process_vm_readv(2)] [http://man7.org/linux/man-pages/man2/vmsplice.2.html vmsplice(2)] [https://lkml.org/lkml/2017/11/22/527 RFC for splice_process_vm syscall]
+
* [http://man7.org/linux/man-pages/man2/process_vm_readv.2.html process_vm_readv(2)] [http://man7.org/linux/man-pages/man2/vmsplice.2.html vmsplice(2)] [https://lkml.org/lkml/2018/1/9/32 RFC for splice_process_vm syscall]
 
   
 
   
 
'''Details:'''
 
'''Details:'''
Line 120: Line 122:  
   
 
   
 
'''Links:'''
 
'''Links:'''
 +
* [[Anonymize image files]]
 
* https://github.com/checkpoint-restore/criu/issues/360
 
* https://github.com/checkpoint-restore/criu/issues/360
 
* [[CRIT]], [[Images]]
 
* [[CRIT]], [[Images]]
Line 149: Line 152:  
* Mentor: Adrian Reber <areber@redhat.com>
 
* Mentor: Adrian Reber <areber@redhat.com>
 
* Suggested by: Adrian Reber <areber@redhat.com>
 
* Suggested by: Adrian Reber <areber@redhat.com>
  −
=== Implement diskless migration ===
  −
  −
'''Summary:''' Need to investigate and implement that named diskless migration.
  −
  −
By diskless we imply a case where all images generated by checkpoint procedure do not sit on storage at all but rather get collected by the criu service on a destination machine, and read from memory later once restore procedure is initiated. More importantly the memory transferred should be deduplicated on the fly and premapped at some preliminary address. Later the restore procedure just remap data to proper positions without copying page data at all.
  −
  −
This task is under the question still and the section is like a placeholder.
  −
  −
'''Details:'''
  −
* Skill level: expert
  −
* Language: C
  −
* Mentor: Cyrill Gorcunov <gorcunov@gmail.com>
  −
* Suggested by: Cyrill Gorcunov <gorcunov@gmail.com>
      
=== Memory changes tracking with userfaultfd-WP ===
 
=== Memory changes tracking with userfaultfd-WP ===
Line 185: Line 174:  
* Mentor: Mike Rapoport <rppt@linux.ibm.com>
 
* Mentor: Mike Rapoport <rppt@linux.ibm.com>
 
* Suggested by: Mike Rapoport <rppt@linux.ibm.com>
 
* Suggested by: Mike Rapoport <rppt@linux.ibm.com>
 +
 +
=== Use eBPF to lock and unlock the network ===
 +
 +
'''Summary:''' Use eBPF instead of the external iptables-restore tool for network lock and unlock.
 +
 +
During checkpointing and restoring CRIU locks the network to make sure no network packets are accepted by the network stack during the time the process is checkpointed. Currently CRIU calls out to iptables-restore to create and delete the corresponding iptables rules. Another approach which avoids calling out to the external binary iptables-restore would be to directly inject eBPF rules. There have been reports from users that iptables-restore fails in some way and eBPF could avoid this external dependency.
 +
 +
'''Links:'''
 +
* https://www.criu.org/TCP_connection#Checkpoint_and_restore_TCP_connection
 +
* https://github.com/systemd/systemd/blob/master/src/core/bpf-firewall.c
 +
 +
'''Details:'''
 +
* Skill level: intermediate
 +
* Language: C
 +
* Mentor: Adrian Reber <areber@redhat.com>
 +
* Suggested by: Adrian Reber <areber@redhat.com>
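
As a rough illustration of what "injecting eBPF rules" could mean, the sketch below shows a pair of cgroup socket-buffer programs that drop every packet. It assumes a cgroup-attached design similar to the systemd bpf-firewall.c linked above; how CRIU would load and attach such programs is not specified here. The function names and the NET_LOCK_* constants are hypothetical.

```c
#include <linux/bpf.h>
#include <stddef.h>

/* Minimal stand-in for the SEC() macro that libbpf's bpf_helpers.h provides. */
#define SEC(name) __attribute__((section(name), used))

/* Verdicts for cgroup skb programs: 1 lets the packet through, 0 drops it. */
enum { NET_LOCK_DROP = 0, NET_LOCK_PASS = 1 };

/* Drop every packet arriving at sockets in the attached cgroup. */
SEC("cgroup_skb/ingress")
int net_lock_ingress(struct __sk_buff *skb)
{
	(void)skb; /* the packet contents are irrelevant: everything is dropped */
	return NET_LOCK_DROP;
}

/* Drop every packet sent by sockets in the attached cgroup. */
SEC("cgroup_skb/egress")
int net_lock_egress(struct __sk_buff *skb)
{
	(void)skb;
	return NET_LOCK_DROP;
}
```

In this sketch, locking the network would amount to attaching both programs to the cgroup of the checkpointed tasks, and unlocking to detaching them, with no external binary involved. To produce loadable bytecode the file would be built with a BPF-capable compiler, for example clang with the bpf target.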
    
[[Category:GSoC]]
 
[[Category:GSoC]]
 
[[Category:Development]]
 
[[Category:Development]]