Where A.S. are asynchronous signals, D.C. is deferred cancellation, and A.C. is asynchronous cancellation.
In the previous post, I discussed asynchronous versus deferred cancellation in POSIX threads, and the issues that make it hard to use asynchronous cancellation well. I also mentioned that almost no functions are async-cancel-safe. What if you want to cheat and get the behavior of asynchronous cancellation without having to follow the rules?
Enter asynchronous signals. In particular, I’m thinking of signals sent to a specific thread using pthread_kill, but really the signal could be coming from another source, like pressing the interrupt or quit key on a terminal.
Suppose the main flow of execution in a thread uses only async-signal-safe functions. These are not to be confused with async-cancel-safe functions, of which only three exist; the set of async-signal-safe functions is relatively large and includes lots of powerful tools. Cancellation mode is left as deferred (the default).
Now suppose you have a signal handler that interrupts the main flow of execution and calls pthread_testcancel. Despite pthread_testcancel being async-signal-unsafe, this is legal because no other async-signal-unsafe function was interrupted by the signal.
Under this setup, a call to pthread_cancel followed by sending the signal gives you the equivalent of asynchronous cancellation, but rather than being restricted to only calling async-cancel-safe functions, it seems it’s now legal to call arbitrary async-signal-safe functions.
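A minimal sketch of this setup in C might look like the following. The names cancel_probe, worker, emulate_async_cancel, worker_ready and ticks are made up for illustration, SIGUSR1 is an arbitrary choice of signal, and the start-up flag is only there so the signal is not sent before the worker has entered its loop. Whether the standard really permits this is exactly what is in question, so treat it as a demonstration of the idea rather than a recommended technique.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int worker_ready;       /* hypothetical start-up flag */
static volatile unsigned long ticks;  /* stand-in for real work */

/* Signal handler: act on a pending cancellation request, if there is one. */
static void cancel_probe(int sig)
{
    (void)sig;
    pthread_testcancel();
}

/* Main flow of the worker: no cancellation points and no
 * async-signal-unsafe calls, only a flag store and arithmetic. */
static void *worker(void *arg)
{
    (void)arg;
    atomic_store(&worker_ready, 1);
    for (;;)
        ticks++;
    return NULL; /* not reached */
}

/* Returns 1 if the worker ended by cancellation, 0 otherwise. */
static int emulate_async_cancel(void)
{
    struct sigaction sa = { .sa_handler = cancel_probe };
    pthread_t t;
    void *result = NULL;
    int rc;

    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    rc = pthread_create(&t, NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;

    /* Wait until the worker is inside its main flow. */
    while (!atomic_load(&worker_ready))
        sched_yield();

    pthread_cancel(t);         /* request stays pending: type is deferred */
    pthread_kill(t, SIGUSR1);  /* handler runs in t and acts on the request */
    pthread_join(t, &result);
    return result == PTHREAD_CANCELED;
}
```

The worker never reaches a cancellation point on its own, so without the signal the cancellation request would stay pending and the join would never return. On implementations that behave as described above, emulate_async_cancel() should return 1.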
On the one hand, this seems to be an argument that the concept of async-cancel-safety is misguided, and that async-signal-safety should be the condition for which functions an application can call when in asynchronous cancellation mode.
However, let’s take it a step further. pthread_testcancel is not async-signal-safe, but close is, and close is also a cancellation point. So, close(-1) (a no-op, aside from setting errno) is an async-signal-safe version of pthread_testcancel! Now, we’re no longer restricted to only calling async-signal-safe functions in the main flow of execution, since the signal handler is async-signal-safe. But this is obviously wrong. Cancelling a thread while it’s in the middle of malloc or printf is not something that’s intended to work.
On the other hand, this brings up serious concerns for applications which are not trying to get around the async-cancel-safety rules, but which just happen to call cancellation points from their signal handlers. Doing so will cause the interrupted code to get cancelled exactly as if asynchronous cancellation had been enabled. And this is dangerous and generally unwanted.
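To make the hazard concrete, here is a hedged sketch of a handler written with no thought of cancellation: it only logs with write, which is async-signal-safe but also a cancellation point. The names log_signal, on_cancel, worker, cancelled_via_handler, in_main_flow and cleanup_ran are hypothetical, and SIGUSR1 is an arbitrary choice. The cleanup handler exists only so the demo can observe that the interrupted code really was cancelled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>
#include <unistd.h>

static atomic_int in_main_flow;       /* hypothetical flags for the demo */
static atomic_int cleanup_ran;
static volatile unsigned long ticks;  /* stand-in for real work */

/* An innocent-looking handler: it saves errno and logs one line.
 * write is async-signal-safe, but it is also a cancellation point. */
static void log_signal(int sig)
{
    static const char msg[] = "got signal\n";
    int saved_errno = errno;
    ssize_t n;

    (void)sig;
    n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
    errno = saved_errno;
}

/* Cleanup handler: records that cancellation unwound the worker. */
static void on_cancel(void *arg)
{
    (void)arg;
    atomic_store(&cleanup_ran, 1);
}

/* The worker never calls a cancellation point itself. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_cleanup_push(on_cancel, NULL);
    atomic_store(&in_main_flow, 1);
    for (;;)
        ticks++;             /* imagine a data structure update here */
    pthread_cleanup_pop(0);  /* not reached; pairs with the push above */
    return NULL;
}

/* Returns 1 if the worker was cancelled from inside the handler. */
static int cancelled_via_handler(void)
{
    struct sigaction sa = { .sa_handler = log_signal };
    pthread_t t;
    void *result = NULL;
    int rc;

    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    rc = pthread_create(&t, NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;

    while (!atomic_load(&in_main_flow))
        sched_yield();

    pthread_cancel(t);         /* pending: the worker is in deferred mode */
    pthread_kill(t, SIGUSR1);  /* write in the handler acts on the request */
    pthread_join(t, &result);
    return result == PTHREAD_CANCELED && atomic_load(&cleanup_ran);
}
```

Nothing in worker asked for asynchronous cancellation, yet it gets cancelled at an arbitrary point in its loop. If that loop were updating shared state, the state would be left half-modified.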
A well-behaved application would like to just call pthread_setcancelstate in its signal handlers to set PTHREAD_CANCEL_DISABLE for the duration of the signal handler. But that’s in general not possible, since pthread_setcancelstate is async-signal-unsafe. This leaves me with the conclusion that, unless/until some improvement or clarification is added to the standard, applications need to ensure that signal handlers containing cancellation points do not get run in threads that are potential targets of cancellation.
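One way to arrange that, sketched here under assumptions, is to block the signal in any thread that might be cancelled. A new thread inherits the signal mask of its creator, so blocking the signal around pthread_create is enough. The helper create_with_signal_blocked and the names risky_handler, worker, blocked_signal_demo, handler_runs, worker_ready and stop_worker are hypothetical, and SIGUSR1 is again an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>
#include <unistd.h>

static atomic_int handler_runs, worker_ready, stop_worker;

/* Stand-in for a handler that contains a cancellation point. */
static void risky_handler(int sig)
{
    int saved_errno = errno;

    (void)sig;
    atomic_fetch_add(&handler_runs, 1);
    close(-1);  /* fails harmlessly, but close is a cancellation point */
    errno = saved_errno;
}

/* A cancellable worker with no cancellation points of its own. */
static void *worker(void *arg)
{
    (void)arg;
    atomic_store(&worker_ready, 1);
    while (!atomic_load(&stop_worker))
        ;
    return NULL;
}

/* Create a thread that starts with signo blocked. */
static int create_with_signal_blocked(pthread_t *t, int signo,
                                      void *(*fn)(void *), void *arg)
{
    sigset_t block, old;
    int rc;

    sigemptyset(&block);
    sigaddset(&block, signo);
    pthread_sigmask(SIG_BLOCK, &block, &old);
    rc = pthread_create(t, NULL, fn, arg);  /* new thread inherits the mask */
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return rc;
}

/* Returns 1 if the worker finished normally and the handler never ran. */
static int blocked_signal_demo(void)
{
    struct sigaction sa = { .sa_handler = risky_handler };
    pthread_t t;
    void *result = PTHREAD_CANCELED;
    int rc;

    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    rc = create_with_signal_blocked(&t, SIGUSR1, worker, NULL);
    assert(rc == 0);
    (void)rc;

    while (!atomic_load(&worker_ready))
        sched_yield();

    pthread_cancel(t);         /* pending until a cancellation point */
    pthread_kill(t, SIGUSR1);  /* stays pending: blocked in the worker */
    atomic_store(&stop_worker, 1);
    pthread_join(t, &result);
    return result != PTHREAD_CANCELED && atomic_load(&handler_runs) == 0;
}
```

With the signal blocked, the handler never runs in the worker, so the pending cancellation request can only be acted on at a cancellation point the worker reaches by itself. Here it reaches none and returns normally.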
I’ve filed issues #615 and #622 on the Austin Group bug tracker. Ultimately I think this just comes down to an omission in the standard of any text to forbid this madness, hopefully an omission which can be quickly and easily resolved. But it’s provided some nice insight into the non-obvious complexity of cancellation and its interaction with other features of the POSIX standard with which it was probably never intended to be used.