Changes

Line 83: Line 83:  
## <code>flags</code> includes <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code>
 
## <code>flags</code> includes <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code>
   −
=== the process is not inside the rseq critical section ===
+
=== Not in RSEQ Critical Section ===
   −
Simplest case. Process just have <code>struct rseq</code> registered in the kernel but currently instruction pointer (IP) not inside CS.
+
This is the simplest case. The process has <code>struct rseq</code> registered in the kernel, but its instruction pointer (IP) is currently not inside a critical section (CS).
   −
==== Dump ====
+
==== Checkpoint ====
 
We only need to determine where the <code>struct rseq</code> is and dump its address, length, and signature.
 
We only need to determine where the <code>struct rseq</code> is and dump its address, length, and signature.
 
To achieve that, we use the ptrace request <code>PTRACE_GET_RSEQ_CONFIGURATION</code> (refer to the <code>dump_thread_rseq</code> function).
 
To achieve that, we use the ptrace request <code>PTRACE_GET_RSEQ_CONFIGURATION</code> (refer to the <code>dump_thread_rseq</code> function).
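As a rough illustration of the request named above, the following sketch reads the rseq configuration of an already-attached, stopped thread. It is not CRIU's <code>dump_thread_rseq</code>: the names <code>rseq_conf_sketch</code> and <code>read_rseq_conf</code> are hypothetical, the struct is a local mirror of the kernel's <code>struct ptrace_rseq_configuration</code>, and the fallback request number is only used when the libc headers do not provide it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Fallback for older libc headers; the request exists since Linux 5.13. */
#ifndef PTRACE_GET_RSEQ_CONFIGURATION
#define PTRACE_GET_RSEQ_CONFIGURATION 0x420f
#endif

/* Hypothetical local mirror of the kernel's struct ptrace_rseq_configuration. */
struct rseq_conf_sketch {
	uint64_t rseq_abi_pointer; /* userspace address of struct rseq */
	uint32_t rseq_abi_size;    /* registered length */
	uint32_t signature;        /* signature passed at registration */
	uint32_t flags;
	uint32_t pad;
};

/*
 * Reads the rseq configuration of thread `tid`, which must already be
 * ptrace-attached and stopped. Returns 0 on success, -1 on any failure
 * (no such thread, not traced, kernel without support, ...).
 */
static int read_rseq_conf(pid_t tid, struct rseq_conf_sketch *conf)
{
	long ret;

	assert(conf != NULL);
	/* For this request, addr carries the buffer size and data the buffer. */
	ret = ptrace(PTRACE_GET_RSEQ_CONFIGURATION, tid,
		     (void *)sizeof(*conf), conf);
	return ret == (long)sizeof(*conf) ? 0 : -1;
}
```

On success, <code>rseq_abi_pointer</code>, <code>rseq_abi_size</code> and <code>signature</code> correspond to the address, length, and signature that have to be stored in the image.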
Line 94: Line 94:  
We need to take data about the <code>struct rseq</code> from the image (see images/rseq.proto) and register it from the parasite context using the <code>rseq</code> syscall (take a look at <code>restore_rseq</code> in criu/pie/restorer.c).
 
We need to take data about the <code>struct rseq</code> from the image (see images/rseq.proto) and register it from the parasite context using the <code>rseq</code> syscall (take a look at <code>restore_rseq</code> in criu/pie/restorer.c).
   −
=== inside CS: <code>flags</code> is <code>0</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code> ===
+
=== Inside RSEQ Critical Section ===
   −
The process was caught with IP inside CS. Can we act as before? So, dump <code>struct rseq</code> address, restore it, and so on. No, we can't.
+
When a process is being checkpointed while its instruction pointer is inside an RSEQ critical section, CRIU preserves the instruction pointer exactly as it was at checkpoint time.
The reason is that CRIU saves IP as it was during the dump. But the rseq semantic is to jump to abort handler if CS execution was interrupted.
+
However, RSEQ semantics require that when execution of a critical section is interrupted, the kernel redirects execution to the associated abort handler. This applies when the <code>flags</code> value is <code>0</code>, <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code>, or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>: on an interrupting event that the flags do not exempt, the kernel moves the instruction pointer to the abort handler of the critical section. Restoring the process with its instruction pointer unchanged therefore violates RSEQ semantics and can lead to incorrect behavior or application crashes. To address this, CRIU explicitly adjusts the instruction pointer to match the kernel's behavior.
In this particular case we have <code>flags</code> equal to <code>0</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>
  −
it means that if CS will be interrupted by the preeption, migration (<code>0</code>) or migration (<code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code>) or preemption (<code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>)
  −
the kernel will fixup IP of the process to the abort handler address.
     −
When we dump the process using CRIU it will just save IP as it was and restore it. That's a serious problem and this may break the user application (even cause crash!).
+
The logic responsible for this is implemented in the <code>fixup_thread_rseq</code> function:
 
  −
Lets see <code>fixup_thread_rseq</code> function:
   
<pre>
 
<pre>
 
if (task_in_rseq(rseq_cs, TI_IP(core))) {
 
if (task_in_rseq(rseq_cs, TI_IP(core))) {
Line 131: Line 126:  
</pre>
 
</pre>
   −
It checks that process IP inside CS and fixes it up to the abort handler IP as the kernel does.
+
This code detects when a thread's instruction pointer lies within an RSEQ critical section and, unless <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code> is set, rewrites the instruction pointer to the abort handler address. By doing so, CRIU mirrors the kernel's rseq fixup behavior and ensures that the restored process resumes execution in a semantically correct state.
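The check and fixup described here can be sketched as a pure function. This is a minimal illustration, not the CRIU source: <code>cs_sketch</code>, <code>ip_in_cs</code> and <code>fixup_ip</code> are hypothetical names, the struct mirrors the layout of the kernel's <code>struct rseq_cs</code>, and the flag bits follow the values in <code>linux/rseq.h</code>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits with the same values as in the kernel's <linux/rseq.h>. */
#define CS_NO_RESTART_ON_PREEMPT (1U << 0)
#define CS_NO_RESTART_ON_SIGNAL  (1U << 1)
#define CS_NO_RESTART_ON_MIGRATE (1U << 2)

/* Hypothetical mirror of the kernel's struct rseq_cs layout. */
struct cs_sketch {
	uint32_t version;
	uint32_t flags;
	uint64_t start_ip;           /* first instruction of the CS */
	uint64_t post_commit_offset; /* length of the CS in bytes */
	uint64_t abort_ip;           /* abort handler address */
};

/* True if ip lies in [start_ip, start_ip + post_commit_offset). */
static bool ip_in_cs(const struct cs_sketch *cs, uint64_t ip)
{
	/* Unsigned subtraction wraps for ip < start_ip, so one compare suffices. */
	return ip - cs->start_ip < cs->post_commit_offset;
}

/* Returns the IP the thread should resume at after restore. */
static uint64_t fixup_ip(const struct cs_sketch *cs, uint64_t ip)
{
	assert(cs != NULL);
	if (!ip_in_cs(cs, ip))
		return ip; /* outside the CS: nothing to fix */
	if (cs->flags & CS_NO_RESTART_ON_SIGNAL)
		return ip; /* IP is left as it was */
	return cs->abort_ip; /* mirror the kernel: jump to the abort handler */
}
```

For example, with <code>start_ip = 0x1000</code>, <code>post_commit_offset = 0x20</code> and <code>abort_ip = 0x2000</code>, an IP of <code>0x1010</code> is rewritten to <code>0x2000</code>, while <code>0x1020</code> (one past the end of the section) is left unchanged.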
   −
==== Dump ====
+
==== Checkpoint ====
 
We need to determine where the <code>struct rseq</code> is and dump its address, length, and signature.
 
We need to determine where the <code>struct rseq</code> is and dump its address, length, and signature.
 
To achieve that, we use the ptrace request <code>PTRACE_GET_RSEQ_CONFIGURATION</code> (refer to the <code>dump_thread_rseq</code> function).
 
To achieve that, we use the ptrace request <code>PTRACE_GET_RSEQ_CONFIGURATION</code> (refer to the <code>dump_thread_rseq</code> function).