Better logging

Logging in CRIU should obey the following rules

  1. Default logging should provide enough data for typical investigation of "can't dump"/"can't restore"
  2. It should be possible to add developer-only debugging to see some more details
  3. It should be possible to shut the logger up completely (maybe by specifying the log file as /dev/null)

Binary loggingEdit

The Binary Logging UtilityEdit

The Previous ImplementationEdit

The logging facility available in criu uses printf family functions to store the logs into the file. The issue with the printf family functions is that they take much time to execute. These functions first scan the input string for format specifiers. The arguments passed are then typecast to string and then copied into the input string. This log file is also very often left unused unless some failure occurs.

The Current ImplementationEdit

The binary logger works such that it dumps the raw input string with all its arguments into a file. This implementation saves the time required to scan the input string and also the time consumed in coping these arguments into the string. It initially stores all the information in a buffer, and after the buffer fills up, its contents are dumped into a file. This file can later be passed into the decoder, which works independently of the criu execution, and can be read whenever needed.

The binary logging thus saves the CPU execution time during the running of the process, and since these logs are rarely used unless some failure occurs, this conserves a significant amount of time. Although the overall time spent by the binary logger, when running with the decoder, is similar to the regular printer's time.

Working of Binary LoggerEdit

Criu currently supports both, logging using the regular printer and logging in binary. Adding a -b or --binary-log option at the end of a criu command allows it to use the binary printer and log in binary.

The simple loop page shows how to dump a trivial program. To store the logs generated during this process in binary, these are the changes are needed:

Dump it:

# criu dump -t 2221 -vvv -o dump.log -b && echo OK 


# criu dump -t 2221 -vvv -o dump.log --binary-log && echo OK


# criu restore -d -vvv -o restore.log --b && echo OK


# criu restore -d -vvv -o restore.log --binary-log && echo OK

The DecoderEdit

The decoder is placed in the crit folder. The decoder takes the input as the file that needs to be processed and the file where the output needs to be presented. The decoder reads the file and passes the format string and the arguments through a printf family function. The logs are then processed in this function, and the output is presented in the output file provided. Whenever the decoder encounters logs that are not printed by the flog printer into the file, it merely copies it into the output file as it is.

To Decode:

# crit/decoder dump.log decoded_dump.log && echo OK

Improvements and Pending tasksEdit

This project was added to criu as a part of GSoC 2020. These are a few issues faced and left unresolved during the duration.

Aarch64 and arm issueEdit

Currently, the binary logging is not supported in the aarch64 architecture and the arm architecture. The decoder collects all the arguments in an array. This array is then typecast to void* and passed in the place of va_list to vsnprintf (The printf family function used to format this raw data). This va_list is accepted as type void* in all other architectures but causes issues with the aarch64 and arm architecture, causing the decoder to fail.

_Generic function issueEdit

Currently, the binary logger is supported for GCC versions greater than C11. This limitation is caused due to the usage of _Generic function in the binary logging utility. This function is needed to return the type of the argument passed as input to the printf. This function was added to GCC and is supported from version C11 and forth.

Add timestamps and PIDEdit

Currently, the logs are being stored in the file without any other formatting added to it. The logs printed via the regular printer, in addition to the expected print statements, add timestamp and PID into the log file. This feature is yet to be added to the binary printer.

Optimize using .rodataEdit

To further optimize the binary logger's performance, the string being stored in the file during logging could be referenced using .rodata.

Log messages structuringEdit

CRIU generates tons of messages. The most critical to troubleshooting is the pr_err-s. Need to introduce types of errors to make troubleshooting easier. E.g. we have the guide what to do when C/R fails. Need to introduce error types into it and print them before the error text. Suggested types of failures are:

  • Restore errors
    • Resource ID conflict (mostly valid for non-containers case)
      • PID mismatch
      • File with the name we need exists
      • CGroups exist (?)
    • Missing item on restore
  • Dump errors
  • Common errors
    • Out of memory
    • Access denied / permission denied
    • Other unexpected/unhandled syscall error
    • Error reading/writing images
    • Proc file format error (do we really expect this thing?)