statistical (memprof-limits.statistical)

Global memory limits

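$P_t(n)$ : probability of triggering a callback after $n$ allocated words.
$S$ : average number of allocated words between two callback runs, that is, the inverse of the sampling rate ($S = 10^4$ words at the default rate of $10^{-4}$).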

One has:

$$P_t(n) \geq 1-e^{-\frac{n}{S}}$$

Thus, once the memory limit is reached, on a 64-bit system with the default sampling rate:

there is a probability of more than 64% that the function has been interrupted after 80 KiB of allocations,
there is a probability of less than $10^{-9}$ that the function has not been interrupted after 1.62 MiB of allocations,
there is a probability of less than $10^{-14}$ that the function has not been interrupted after 2.5 MiB of allocations,
there is a probability of less than $10^{-50}$ that the function has not been interrupted after 8.8 MiB of allocations.
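
The figures above can be checked numerically. The following is a minimal sketch, assuming $S = 10^4$ words (the inverse of the default sampling rate $10^{-4}$) and 8-byte words; the function names are illustrative and not part of memprof-limits.

```python
import math

WORD_BYTES = 8      # assumption: 64-bit platform, 8 bytes per word
S = 10_000          # assumption: 1 / default sampling rate (1e-4)
KIB, MIB = 1024, 1024 ** 2

def words(n_bytes):
    """Convert a size in bytes to a (possibly fractional) number of words."""
    return n_bytes / WORD_BYTES

def prob_triggered_lower_bound(n_words, s=S):
    """Lower bound 1 - exp(-n/S) on the probability of a callback within n words."""
    return -math.expm1(-n_words / s)

def prob_not_triggered_upper_bound(n_words, s=S):
    """Upper bound exp(-n/S) on the probability of no callback within n words."""
    return math.exp(-n_words / s)

# Probability of having been interrupted after 80 KiB of allocations.
print(prob_triggered_lower_bound(words(80 * KIB)))

# Probability of not having been interrupted, compared to each quoted bound.
for size_mib, bound in [(1.62, 1e-9), (2.5, 1e-14), (8.8, 1e-50)]:
    miss = prob_not_triggered_upper_bound(words(size_mib * MIB))
    print(size_mib, miss, miss < bound)
```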

Allocation limits

$l$ : limit chosen by the user
$k$ : number of memprof callback runs needed to interrupt the function

We have $k = \frac{l}{S}$.

Probability of being interrupted

$P_a(n)$ : probability of being interrupted after $n$ allocated words. It is given by the cumulative binomial distribution: writing $p = 1/S$ for the probability that a given allocated word triggers a callback run, one has, in terms of the regularized incomplete beta function $I$:

$$P_a(n) = I_p(k,n-k+1)$$
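
The identity $I_p(k, n-k+1) = \Pr[X \geq k]$ for $X$ binomially distributed with parameters $(n, p)$ means that $P_a(n)$ can be evaluated without a special-function library. Below is a minimal sketch that sums the upper tail directly, assuming $p$ is the per-word sampling probability ($10^{-4}$ by default); the helper names are hypothetical and not part of memprof-limits.

```python
import math

def log_binom_pmf(i, n, p):
    """Log of P(X = i) for X ~ Binomial(n, p), computed via lgamma."""
    return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p))

def prob_interrupted(n, k, p):
    """P_a(n) = P(X >= k), the upper tail of Binomial(n, p)."""
    if k <= 0:
        return 1.0
    if n < k:
        return 0.0  # fewer than k words cannot trigger k callback runs
    tail = math.fsum(math.exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))
    return min(1.0, tail)

# With k = 1 the tail reduces to the probability of at least one callback.
print(prob_interrupted(10_000, 1, 1e-4), 1 - (1 - 1e-4) ** 10_000)
```

Summing the upper tail directly, rather than computing one minus the lower tail, keeps very small probabilities from being lost to rounding.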

Figure: 'Cumulative probability of being interrupted relative to the chosen limit'. If the graph is missing, it is because odoc does not support packaging images yet. If you are online, please visit https://guillaume.munch.name/software/ocaml/memprof-limits/statistical.html; otherwise, please compile the documentation using `make doc`.

Accuracy of limit

$t$ : target safe probability
$N$ : maximum number of safe allocations, that is, the number of allocations that can be performed while the cumulative probability of being interrupted remains less than $t$.

A good lower bound for $N$ is estimated for various values of $k$, and the error $(l-N)/N$ is given below for $t = 10^{-9}$, $t = 4\cdot10^{-15}$, and $t = 10^{-50}$. (For reference, $10^{-9}$ is roughly the probability of winning a lottery, and $10^{-50}$ is considered implausible by physicists' standards.)

Figure: 'Accuracy of limit for a target safe allocation'.
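
One way to obtain such an $N$ numerically is to bisect on $n$, since the probability of being interrupted does not decrease as $n$ grows. The sketch below assumes the binomial model with per-word probability $p = 10^{-4}$ and a limit $l$ that is a multiple of $S = 1/p$; the function names are hypothetical, and the result is only an illustration of the definition, not the method used to produce the graphs.

```python
import math

def prob_interrupted(n, k, p):
    """Upper tail P(X >= k) of X ~ Binomial(n, p), summed term by term."""
    if k <= 0:
        return 1.0
    if n < k:
        return 0.0
    # First term P(X = k), computed in log space to avoid overflow.
    log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                + k * math.log(p) + (n - k) * math.log1p(-p))
    term = math.exp(log_term)
    ratio = p / (1.0 - p)
    total = 0.0
    for i in range(k, n + 1):
        total += term
        # Recurrence: P(X = i+1) = P(X = i) * (n - i) / (i + 1) * p / (1 - p).
        term *= (n - i) / (i + 1) * ratio
        if term < 1e-18 * total and i > n * p:
            break  # past the mode, the remaining terms are negligible
    return min(1.0, total)

def max_safe_allocations(limit, t, p=1e-4):
    """Largest n such that P_a(n) < t, for k = limit * p callback runs."""
    k = round(limit * p)  # k = l / S with S = 1 / p
    if k < 1:
        raise ValueError("limit is smaller than one expected callback run")
    lo, hi = k - 1, limit  # P_a(lo) = 0 < t
    if prob_interrupted(hi, k, p) < t:
        return hi  # even `limit` allocations stay below the target
    while hi - lo > 1:  # invariant: P_a(lo) < t <= P_a(hi)
        mid = (lo + hi) // 2
        if prob_interrupted(mid, k, p) < t:
            lo = mid
        else:
            hi = mid
    return lo

limit = 200_000  # l = 200 kw
n_safe = max_safe_allocations(limit, t=4e-15)
print(n_safe, (limit - n_safe) / n_safe)  # N and the error (l - N) / N
```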

The same data gives an indicative value of $l$ for a given $N$.

Figure: 'Allocation limit for a target safe allocation'.

The allocation limit is reasonably accurate (i.e. $l$ is less than an order of magnitude greater than $N$) starting at around $N = 20\,\textrm{kw}$ (thousands of words), which, for a target safe probability of $4\cdot10^{-15}$, corresponds to a limit of around $l = 200\,\textrm{kw}$. Allocation limits $l \leq 60\,\textrm{kw}$, on the other hand, are probably too inaccurate to be useful.

Impact of the sampling value

This data is given for the default sampling rate. When memprof is used for profiling via the provided Memprof module, the user's sampling rate is used instead. However, memprof-limits refuses to run with a sampling rate lower than the default one. As a consequence, the limits can only get more accurate, not less, and the chosen $N$ remains a safe allocation number.

From a theoretical point of view, one may wonder whether it is useful to increase the default rate. Below is the minimal sampling rate for a target safe allocation, assuming $l$ is chosen an order of magnitude greater than $N$.

Figure: 'Minimal sampling rate for a target safe allocation (l/N = 10)'.

The default sampling rate ($10^{-4}$) is one among several possible choices that provide reasonable accuracy without affecting performance. Nevertheless, feedback on whether users need to be able to select a higher (or lower) sampling rate is welcome.