Collected on this page are the design notes on supporting userfaultfd in CRIU.
The restore action should accept yet another API switch: option
- Draining remote pages would require synchronization between criu dump and criu restore. This is P.Haul's responsibility, so the best solution here appears to be making it work via the page server and setting up the connection with RPCs.
After restore, tasks should have lazy VMAs backed by userfaultfd. The fd itself should be sent to CRIU (daemon?) before resume and then closed. It is CRIU that will monitor the UFFD events and repopulate the tasks' address space. It should be able to get pages from both remote and local images.
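The fd handoff described above can be sketched with standard SCM_RIGHTS descriptor passing over a UNIX socket. This is only an illustration, not CRIU code: the helper names (`send_fd`, `recv_fd`, `handoff_demo`) are hypothetical, and a pipe stands in for the real userfaultfd descriptor so the sketch runs without any special kernel support.

```python
import array
import os
import socket


def send_fd(sock, fd, tag=b"uffd"):
    """Send one descriptor as SCM_RIGHTS ancillary data (hypothetical helper)."""
    sock.sendmsg(
        [tag],
        [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))],
    )


def recv_fd(sock):
    """Receive one descriptor; returns (tag, fd)."""
    fds = array.array("i")
    msg, ancdata, _flags, _addr = sock.recvmsg(16, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any truncated trailing bytes in the control message.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    return msg, fds[0]


def handoff_demo():
    """Task side sends its fd and closes it; daemon side keeps using it."""
    task_sock, daemon_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # Assumption: a pipe read end stands in for the userfaultfd descriptor.
    rfd, wfd = os.pipe()
    os.write(wfd, b"fault event")
    os.close(wfd)

    send_fd(task_sock, rfd)
    os.close(rfd)  # the task no longer holds the descriptor

    tag, daemon_fd = recv_fd(daemon_sock)
    data = os.read(daemon_fd, 64)  # the daemon's copy is still valid

    os.close(daemon_fd)
    task_sock.close()
    daemon_sock.close()
    return tag, data
```

The point of the sketch is that the descriptor stays usable on the receiving side after the sender closes its own copy, which is what lets the monitoring process handle events for tasks that no longer hold the fd.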
The daemon should just use the local page-read engine and read pages from images.
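As a rough illustration of what a local page read involves, the sketch below resolves a faulting address to page contents. It assumes, purely for the example, that the image describes memory as a sorted list of `(vaddr, nr_pages)` runs and stores the page contents back-to-back in the same order. The `LocalPageReader` class and its layout are hypothetical and do not reproduce CRIU's page-read engine.

```python
import bisect

PAGE_SIZE = 4096  # assumed page size for the sketch


class LocalPageReader:
    """Hypothetical reader: maps a virtual address to a page stored in a file."""

    def __init__(self, pagemap, pages_file):
        # pagemap: list of (vaddr, nr_pages), sorted by vaddr, non-overlapping.
        self.runs = list(pagemap)
        self.starts = [vaddr for vaddr, _ in self.runs]
        # File offset of each run, given back-to-back storage in pagemap order.
        self.offsets = []
        off = 0
        for _, nr_pages in self.runs:
            self.offsets.append(off)
            off += nr_pages * PAGE_SIZE
        self.pages_file = pages_file

    def read_page(self, addr):
        """Return the page containing addr, or None if the image lacks it."""
        addr &= ~(PAGE_SIZE - 1)  # round down to the page boundary
        i = bisect.bisect_right(self.starts, addr) - 1
        if i < 0:
            return None
        vaddr, nr_pages = self.runs[i]
        if addr >= vaddr + nr_pages * PAGE_SIZE:
            return None
        self.pages_file.seek(self.offsets[i] + (addr - vaddr))
        return self.pages_file.read(PAGE_SIZE)
```

A fault at any address inside a run is rounded down to its page, located by binary search, and read from the matching file offset. Addresses outside every run return `None`, which a caller could treat as "not in the local images".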
- The page-read engine should be patched so it can talk to the remote host (page server with --page-server option?) on the other end.
- The source node should get pages from the dumped tasks and send them to the destination node.
- The protocol should support out-of-order pages and background page pushing (sending pages before the process demands them).
- Only MAP_PRIVATE | MAP_ANONYMOUS mappings will be supported in the first version due to kernel constraints.
- Userfault is known not to map one page into two places, so COW-ed pages will get COW-ed.
- Andrea (the author) states that UFFDIO_REMAP might be slow compared to UFFDIO_COPY. It probably makes sense to copy data into tasks rather than move it.
- Unmaps and mremaps can screw things up. We either have to create a uffd per VMA or add events for such operations.
- Forks are even worse: the child will just populate its memory with zero pages :(
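For the page-transfer protocol noted above (out-of-order pages plus background pushing), the source-node scheduling can be sketched as follows. This is a toy model under stated assumptions: the `PageSource` class and its methods are hypothetical, pages are kept in a plain dict, and the background order (lowest address first) is an arbitrary choice for the example.

```python
from collections import deque


class PageSource:
    """Hypothetical source-node side: demanded pages first, the rest pushed later."""

    def __init__(self, pages):
        self.pages = dict(pages)  # addr -> contents not yet sent
        self.demand = deque()     # addresses requested by the destination

    def request(self, addr):
        """Record an on-demand request from the destination."""
        if addr in self.pages and addr not in self.demand:
            self.demand.append(addr)

    def next_page(self):
        """Return (addr, data) for the next page to send, or None when drained."""
        # Demanded pages jump the queue, so pages go out of address order.
        while self.demand:
            addr = self.demand.popleft()
            if addr in self.pages:
                return addr, self.pages.pop(addr)
        # No pending demand: push remaining pages in the background.
        if self.pages:
            addr = min(self.pages)
            return addr, self.pages.pop(addr)
        return None
```

For example, with pages at 0x1000, 0x2000 and 0x3000 and a single request for 0x3000, `next_page()` yields 0x3000 first and then 0x1000 and 0x2000 as background pushes. Each page is sent exactly once, so the receiver only needs to accept pages in whatever order they arrive.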