“[…] Note that the Interrupt exception dates back to the original SML90 language definition. It was excluded from the SML97 version to avoid its malign impact on ML program semantics, but without providing a viable alternative.” [The Isabelle/ML manual, 2010]
For the program to recover from an interrupt:
Memprof_limits
interrupts tasks by raising an exception at some allocation point chosen randomly by the statmemprof runtime, if some limit is reached. It is in general not possible to reason about where this exception will arise nor to handle it locally: there are far too many possible locations in the code. Such exceptions are called asynchronous exceptions or interrupts [Reppy, 1990].
The advantage of asynchrony is that the user code always remains responsive to memprof-limits interruptions, without requiring user code nor libraries it depends on to be reorganised to actively poll memprof-limits. The downside is that the asynchronous interrupt is akin to other exceptions denoting unexpected conditions causing the failure of the whole task at hand, such as Stdlib.Out_of_memory
, Stdlib.Stack_overflow
, Stdlib.Assert_failure
… The recovery from these unexpected exceptions is subject to some discipline, which we describe now.
In short:
Catching a specific asynchronous interrupt is not reliable and in general hides bugs. The exception could have been caught in the user code with the intention of re-raising it, but another unexpected exception arose in the meanwhile, causing the original exception to be discarded. Yet it may very well be that the first exception arising in the wrong place was the cause of the second one, in which case the intention was to catch the first exception. In addition, exceptions are sometimes wrapped, such as inside the Stdlib.Fun.Finally_raised
and Dynlink.Error
exceptions in the OCaml stdlib, in order to augment them with contextual information.
The indiscernability of asynchronous interrupts was noticed in Standard ML, which has attempted to solve it by representing all causes of interruption (out of memory, stack overflow, Ctrl+C…) with a single exception Interrupt
. While Standard ML's notion of dedicated interrupt has never been part of OCaml, we can emulate it as a programmer discipline, by never matching on specific interrupts, and using the catch-all _
instead. This amounts to treating all asynchronous interrupts, bugs, and uncaught exceptions as being equivalent at that boundary.
Memprof-limits handles this for you. In order to reliably tell whether a computation has been interrupted, it records instead with a flag whether it has previously raised an exception due to this particular limit instance.
The generic form to handle the return value is the following.
match Memprof_limits.limit_global_memory task with
| Ok result -> …
| Error exn1 -> (* limit reached *) …
| exception exn2 -> (* some other error *) …
The Error
case is encountered whenever the computation of task
has been interrupted by the corresponding limit being reached, whether the task has returned normally for some reason (exception caught and discarded), has let through the exception raised by memprof-limits, or has raised any other exception instead. In the latter case, this exception is exn1
above, the latest exception raised in the task after interruption. exn1
can be discarded assuming that the monitored computation was properly isolated (see further below).
If the limit has not been reached, no exception arising from the task is caught by memprof-limits (exn2
above).
Software relying on interrupts, such as Isabelle or Coq, have found necessary to request from users that they re-raise interrupts promptly. For instance, the Isabelle/ML manual states:
“Isabelle/ML user code needs to terminate promptly on interruption, without guessing at its meaning to the system infrastructure. Temporary handling of interrupts for cleanup of global resources etc. needs to be followed immediately by re-raising of the original exception.”
For this purpose, both Isabelle/ML and the OCaml libraries of Coq offer predicates to test whether an exception is an interrupt, so that the user can act accordingly.
OCaml does not have a built-in notion of interrupt and therefore does not provide such a predicate. However, another property comes save memprof-limits: if it happened that an interrupt was not re-raised promptly, then the task would be interrupted again soon.
It is still possible for adversarial code to systematically catch and discard all interrupts. In practice, memprof-limits expects some cooperation from the task to not do anything like this. But as far as bugs are concerned, further interruptions are always imminent.
Nevertheless, ideally one should not to use the catch-all _
with exceptions without re-raising the exception promptly, and one should prefer combinators such as Stdlib.Fun.protect
or Memprof_limits.Masking.with_resource
, which ensure that the interrupt is not discarded. The use of the predicate Memprof_limits.is_interrupted
inside when-clauses of catch-alls lets you avoid catching interrupts.
We have mentioned that memprof-limits catches all exceptions if a limit is reached, which amounts to treating bugs and uncaught exceptions as interrupts too. In fact the memprof-limits functions that execute tasks act as an isolation boundary, similar to a toplevel prompt or a thread.
One aspect of isolation concerns the consistency of state in the program. An exception raised asynchronously can corrupt the program state if it occurs in the middle of a block meant to maintain some invariant. Then, it is the responsibility of the person catching the interrupt to make sure that no corrupt state escapes the boundary. This is done in two fashion:
The typical example of external state is indeed resource allocation which must be paired with deallocation. The advice for exception-safety [David Abrahams, 2000] must be applied, see Memprof_limits.Masking.with_resource
.
A particular case concerns modifications to some state that in case of failure must be invalidated in order to prevent further access by other threads, or rolled back to a previous valid value. This can be achieved with an invariant-enforcing guard placed around critical sections using Memprof_limits.Masking.with_resource
.
Invalidation: State invalidation can be seen in the Rust language, where the release action associated with a mutex is responsible not only for unlocking the mutex, but also for “poisoning” it in case of an exception, meaning that the data it protects cannot be accessed normally any further [The Rust Team, 2017]. This strategy amounts to communicating the interruption to other users of the same state.
Rollback: Rather than invalidating state, the strategy used in the Coq proof assistant is to contain the consequence of interrupts (e.g. Ctrl-C), so they are not witnessed by later users of the shared state, and the software can continue normally as before. This is done by setting the value back to a previous known-valid value.
To sum up, the task itself can be written without reasoning about interrupts: 1) corrupted internal state disappears along with the task, 2) external state shared with the task must already be provided by the monitor through interrupt-safe abstractions.
Memprof-limits offers Memprof_limits.Masking.with_resource
, which can be used to clean-up resources reliably and to implement interrupt-safe abstractions. Checking whether clean-up was triggered by an interrupt (to implement rollback or poisoning) can be done using Memprof_limits.is_interrupted
.
An example of an unwind-safe shared data structure is given, using a Mutex wrapper which communicates the interruption to other threads using invalidation.
Caveat: Memprof-limits only support well-bracketed resources at the moment. This is known to be too limited in some situations.
Caution: The case of global values can be seen as escaping the control of the monitor. Ideally, all global values should be interrupt-safe by default. In the OCaml stdlib, some global state is interrupt-safe, but it is currently undocumented which one is and which one is not.
Notions of exception safety go way back and seem to be in perpetual rediscovery. Futher reading: Armstrong (2003), Siedersleben (2006), Duffy (2016), Snoyman (2018). Important milestones are C++ RAII [Abrahams, 2000] and Rust's extension of Armstrong's fail-fast technique to shared mutable state (e.g. [Alex Crichton & the Rust team, 2015]).