(Not) interrupting background tasks with Ctrl+C



When using R interactively from the command line, one can interrupt the current computation using the Ctrl+C key combination and enter a new command. This works both in a Unix terminal and in the Windows console with Rterm. Such a computation may be implemented in R or C, and may be executing an external command while waiting for the result, e.g. via system(,wait=TRUE).

However, in R 4.3 and earlier, Ctrl+C also interrupts background tasks, e.g. those executed via system(,wait=FALSE) or pipe(). This was reported in PR#17764 for Unix, but it turned out to happen on Windows as well. Such background tasks do not prevent the user from entering a new R command at the REPL (read-eval-print loop). Often they do not produce any output and the user may not even be aware of them. Such tasks are not interrupted in other systems with a REPL, including the Unix shell. The problem has been fixed in R-devel, the development version of R.

The problem

This text abstracts out some details in the interest of readability.

When the user presses Ctrl+C, some processes receive a signal from the operating system. The signal may be ignored, in which case nothing happens. Otherwise it is handled by a default or a non-default handler. The default handler terminates the process. A non-default handler may be provided by the application.

R, when running interactively with the REPL, doesn’t want to terminate in response to Ctrl+C. It hence has its own handler. The handler just takes a note that an interrupt is pending and lets the main computation continue. R then checks for a pending interrupt at points where it is safe to respond to one, but not so often that the checking itself slows down the computation, and if there is one, responds to it. In practice this means R needs to check for pending interrupts from time to time in long-running loops. It also has to handle blocking calls to the OS specially, so that it doesn’t take too long to, e.g., interrupt a computation.

R itself may, however, also be run to execute an R script. It can be run from another instance of R via system(,wait=TRUE) or from other applications. In such cases, R should be terminated by Ctrl+C. But when run, say, via system(,wait=FALSE) or similarly from another application, it should not be terminated by Ctrl+C (this is the case from the bug report). Hence, there needs to be an application-independent way to communicate to the child process what Ctrl+C should do, and this should be inherited further by its child processes.

Such a way of communication is provided by the operating system. The parent process may be able to arrange that its given child process (and its children) do not receive the signal for Ctrl+C. Also, the parent process may be able to arrange that its given child process (and its children) will ignore the signal for the interrupt.

For this to work, applications need to cooperate. By default, they do. When an application does not set any signal handler and leaves the flag on ignoring the signal alone, it inherits the intended behavior: either it terminates in response to the interrupt via the default handler action, or it continues executing as the signal does not arrive or is ignored.

Applications such as R that set their own interrupt handler have to be more careful. They need to avoid installing or enabling their handler when the inherited flag says the signal should be ignored. And they need to ensure the flag is set correctly for their child processes.

On Unix

The information that a signal is ignored is encoded by a special handler named SIG_IGN. A real signal handler itself cannot be inherited, because it lives in the address space of the parent process, but when it is set to SIG_IGN, it is inherited.

One can find out whether the signal is ignored using sigaction(), and when it is ignored, one should not set any custom handler. The older signal() call returns the previous signal handler when setting a new one, so one should immediately restore the old one if it was SIG_IGN. A good source on signal handling on Unix is the GNU libc documentation, which also mentions this principle. R was fixed to do this.

The SIGINT signal is sent by the terminal in response to Ctrl+C to the foreground process group it controls. All processes in the group receive the signal. Each process is in exactly one process group and by default, its child processes are in the same group.

R’s system(,wait=FALSE), as documented, runs the background process using a Unix (POSIX) shell /bin/sh, running the given command with & appended to it. Typically a shell invoked this way will have job control disabled (a.k.a. monitor mode disabled), which means that its child process (the command) will execute in the same process group but with the SIGINT signal ignored. So although the child still receives the signal for Ctrl+C, it is not interrupted by it.

This mechanism would have been sufficient to ensure that Ctrl+C does not interrupt background tasks in R if all applications followed the rule of respecting an ignored SIGINT. But not all of them do, older versions of R included, so further hardening is needed.

When the Unix shell job control is enabled, the shell executes background tasks in a new process group. Hence, they don’t receive a signal when Ctrl+C is pressed, so they are not interrupted even when not abiding by the rule. The implementation of R’s system(,wait=FALSE) cannot use /bin/sh with the job control enabled, but it can arrange for the child process (the /bin/sh) to run in a new process group.

R already had the key pieces in place to run a command in a new process group, because it already had to do so for system(,timeout>0) via a re-implementation of C/POSIX system(). This code has been slightly generalized and now even system(,wait=FALSE) runs the child processes in a new process group.

While in principle the problem is the same with pipe(), making that more robust required more work. The implementation of pipe() used the C/POSIX calls popen() and pclose(), which do not allow requesting a new process group. Therefore, popen() and pclose() had to be re-implemented inside the R code base.

On Windows

The information that a signal for Ctrl+C (and some other events) is ignored is inherited by child processes, but there is no documented way to obtain it from the operating system. It is stored in the PEB (Process Environment Block), under process parameters, in ConsoleFlags, but while some terminal implementations use it, it is not part of the public API.

However, unlike Unix, Windows keeps this information separately from the actual signal handler. It is hence safe to set a handler (via SetConsoleCtrlHandler) unconditionally. The handler will only be used when the signal is not ignored. And the handler itself, indeed, will not be inherited, because it is an address in the address space of the process.

Sometimes, however, an application such as R may need to ensure that it won’t be interrupted by Ctrl+C. R.exe is a separate application, which executes Rterm.exe. When R is meant to be used interactively with a REPL, it should not be terminated by Ctrl+C. In other words, Rterm.exe should take care of interrupting the current computation and R.exe should do nothing (but definitely not terminate) on Ctrl+C. This used to be implemented by ignoring the signal in R.exe and then unignoring it in Rterm.exe (via SetConsoleCtrlHandler(NULL,)). This corrupted the inherited yet unretrievable flag telling whether the signal was ignored or not.

R has been fixed so that to “ignore” Ctrl+C, R.exe now installs its own handler for the signal. The handler only returns TRUE, which has the same effect as ignoring the signal, but it does not corrupt the inherited flag. The dangerous calls to SetConsoleCtrlHandler(NULL,) have been removed. I believe this is a pattern that should be followed on Windows, but I did not find any recommendation to this effect in Microsoft documentation or elsewhere.

In addition, when R executes a background process that should not be interrupted via Ctrl+C (e.g. system(,wait=FALSE)), it needs to ensure that the child ignores the signal. This can be done via a process creation flag CREATE_NEW_PROCESS_GROUP and R does that now.

However, process groups on Windows are not the same thing as on Unix. The flag doesn’t “group” processes, it only ensures that the child executes with the signal ignored. This property is inherited by child processes, but any child process may change it using SetConsoleCtrlHandler(NULL,), so it would then be interrupted by Ctrl+C again. Windows does allow actually “grouping” processes, into “jobs”, but that does not help in this case.

R on Windows already used its own implementation of an alternative to C system() for executing external processes. This was extended to use CREATE_NEW_PROCESS_GROUP with background processes (e.g. system(,wait=FALSE)).

More work was again needed to fix pipe(). R with Rterm as the console used the C popen() and pclose() calls, which did not allow ignoring the interrupt in the child processes. R with a GUI (e.g. Rgui) already used its own implementation of pipe(). That implementation had to be generalized and is now used on the console as well.

Summary

Additional technical details on these changes can be found on R Bugzilla and in the source code. Any regressions related to the execution of background processes should be reported so that they can be fixed before the next R release.

A known limitation on Windows is that applications directly calling SetConsoleCtrlHandler(NULL,) may still be terminated by Ctrl+C even when not desirable. This includes applications linked against the Cygwin/Msys2 runtime (so also Rtools).

Users on Unix may observe differences in behavior of job control or when background processes read from or write to the terminal, but these should be far less annoying than Ctrl+C killing background processes.