Statistical analysis of Memprof-limits

By default, p=10^{-4} samples per word.

We have S = \frac{1}{p} = 10\,\textrm{kw} (by default). That is, S is about 39.0 kiB of allocations on 32-bit and 78.1 kiB on 64-bit.

Global memory limits

P_t(n): probability of triggering a callback after n allocated words.

One has:

P_t(n) \geq 1-e^{-\frac{n}{S}}

Thus, once the memory limit is reached, on 64-bit:

Allocation limits

We have k = \frac{l}{S}.

Probability of being interrupted

P_a(n): probability of being interrupted after n allocations. It is given by the cumulative binomial distribution, that is, one has in terms of the regularized incomplete beta function I:

P_a(n) = I_p(k,n-k+1)
Graph entitled 'Cumulative probability of being
interrupted relative to the chosen limit'—if the graph is missing it
is because odoc does not support packaging images yet. If you are
online please visit
https://guillaume.munch.name/software/ocaml/memprof-limits/statistical.html,
otherwise please compile the documentation using `make doc`.

Accuracy of limit

A good lower bound for N is estimated for various values of k, and the error (l-N)/N is given below for t = 10^{-9}, t = 4\cdot10^{-15}, and t = 10^{-50}. (10^{-9} is the probability to win the lottery, 10^{-50} is considered implausible by physicists' standards.)

Graph entitled 'Accuracy of limit for a target safe allocation'—if the graph is missing it
is because odoc does not support packaging images yet. If you are
online please visit
https://guillaume.munch.name/software/ocaml/memprof-limits/statistical.html,
otherwise please compile the documentation using `make doc`.

The same data gives us an indicative value for l for a given N.

Graph entitled 'Allocation limit for a target safe
allocation'—if the graph is missing it is because odoc does not
support packaging images yet. If you are online please visit
https://guillaume.munch.name/software/ocaml/memprof-limits/statistical.html,
otherwise please compile the documentation using `make doc`.

The allocation limit is reasonably accurate (i.e. l is less than an order of magnitude greater than N) starting at around N = 20\,\textrm{kw}, that is, for a target safe probability of 4\cdot10^{-15}, around a limit of l = 200\,\textrm{kw}. Allocation limits l \leq 60\,\textrm{kw} on the other hand are probably too inaccurate to be useful.

Impact of the sampling value

This data is given for the default sampling rate. When memprof is used for profiling via the provided Memprof module, the user's sampling rate is used instead. But, memprof-limits will refuse to run with sampling rates less than the default one. As a consequence, the limits can only get more accurate, not less, such that the chosen N remains a safe allocation number.

From a theoretical point of view, one can wonder whether it is useful to increase the default rate. Below is the minimal sampling rate for a target safe allocation assuming l is chosen an order of magnitude greater than N.

Graph entitled 'Minimal sampling rate for a target
safe allocation (l/N = 10)'—if the graph is missing it is because odoc
does not support packaging images yet. If you are online please visit
https://guillaume.munch.name/software/ocaml/memprof-limits/statistical.html,
otherwise please compile the documentation using `make doc`.

The default sampling rate (10^{-4}) is one among several possible choices that provide reasonable accuracy without affecting performance. Nevertheless, feedback regarding the need of being able to select a greater (or lower) sampling rate is welcome.