Through three examples, we have seen some of the limitations of a purely probabilistic modeling of epistemic uncertainties, especially when so little data is available that no specific distribution can be associated with the random variables. These limitations have motivated the development of additional approaches, which we shall review in the following sections. The first such approach is simply an extension of probability theory: the probability box approach.
1.4. Probability box theory (p-boxes)
Probability box (or p-box) theory has been developed to deal with cases where the probability distribution $\mathbb{P}_X$ of a random variable X cannot be accurately determined, thus combining an aleatory component and an epistemic component of uncertainty. Within the framework of probability box theory, the distribution function $F_X$ associated with the unknown distribution $\mathbb{P}_X$ can only be bounded (exactly or at a given level of confidence): the left bound is a distribution function that will be denoted $\overline{F}$ and the right bound is a distribution function denoted $\underline{F}$. This is illustrated in Figure 1.5.
Any distribution function lying between $\overline{F}$ and $\underline{F}$ can thus represent the probability distribution of X. The spacing between $\overline{F}$ and $\underline{F}$ represents the magnitude of the epistemic uncertainty associated with the lack of knowledge of the distribution of X. If $\overline{F}$ and $\underline{F}$ are superimposed, there is no epistemic uncertainty and the aleatory uncertainty is represented by the distribution function $F_X = \overline{F} = \underline{F}$.
Figure 1.5. Illustration of a probability box
A probability box can be interpreted in two different ways: as bounds on the cumulative probability for a given value of x, or as bounds on the values of x for a given probability level. In the example in Figure 1.5, one can read that the probability that X is less than 3 lies between 0.05 and 0.3. We can also read that the 95% quantile of X lies between x = 8 and x = 17.
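The two readings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the p-box below is hypothetical (two normal distribution functions with the same spread, chosen for convenience), it does not reproduce Figure 1.5, and the names `upper`, `lower`, `probability_bounds` and `quantile_bounds` are introduced here for the example.

```python
from statistics import NormalDist

# Hypothetical p-box (an assumption for illustration, NOT Figure 1.5):
# both bounds are normal distribution functions with the same spread.
upper = NormalDist(mu=4.0, sigma=2.0)  # left bound: largest cumulative probability
lower = NormalDist(mu=7.0, sigma=2.0)  # right bound: smallest cumulative probability


def probability_bounds(x):
    """Vertical reading: bounds on P(X <= x) for a given value x."""
    return lower.cdf(x), upper.cdf(x)


def quantile_bounds(p):
    """Horizontal reading: bounds on the p-quantile of X."""
    # The left bound yields the smallest quantile, the right bound the largest.
    return upper.inv_cdf(p), lower.inv_cdf(p)


p_lo, p_hi = probability_bounds(3.0)
q_lo, q_hi = quantile_bounds(0.95)
print(f"P(X <= 3) lies in [{p_lo:.3f}, {p_hi:.3f}]")
print(f"95% quantile lies in [{q_lo:.2f}, {q_hi:.2f}]")
```

Because the two bounds of this hypothetical p-box differ only by a shift of their mean, the width of every quantile interval equals that shift. A real p-box generally has a width that varies with the probability level.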
From a formal point of view, a probability box can be defined, adopting the formalism of Ferson et al. (2015), as follows:
DEFINITION 1.15.– Let $\overline{F}$ and $\underline{F}$ be non-decreasing functions from ℝ to [0,1] satisfying the condition $\underline{F}(x) \le \overline{F}(x), \forall x \in \mathbb{R}$. Let $[\underline{F}, \overline{F}]$ denote the set of all non-decreasing functions F from ℝ to [0,1] satisfying $\underline{F}(x) \le F(x) \le \overline{F}(x), \forall x \in \mathbb{R}$. $[\underline{F}, \overline{F}]$ is then called a “probability box”.
Note that this approach can be seen as an extension of imprecise probabilities (Walley 1991) to distribution functions and, consequently, to probability distributions. Several variants are derived from this concept of a probability box. For example, in addition to bounding the distribution function F with $\overline{F}$ and $\underline{F}$, one can impose additional constraints such as:
– the mean of the probability distributions associated with F must lie within a given interval;
– the variance of the probability distributions associated with F must lie within a given interval;
– the probability laws associated with F must belong to a certain class of distributions.
There are also different types of probability boxes:
– Distributional probability boxes: the probability distribution is assumed to be known and it is only the parameters of the distribution that are not accurately known, thus generating different possible distribution functions (but all of the same type of distribution). Such an approach will be illustrated in Chapter 7 of this book.
– Confidence bands: instead of treating the bounds of the distribution function as absolute, the bounding can be stated at a given level of confidence. When an empirical distribution function is obtained from a sample, several results, for example the inequalities of Dvoretzky et al. (1956), make it possible to bound, at a given confidence level, the true distribution function from which the sample was drawn. These approaches, known as confidence bands, are closely related to the notion of probability boxes.
– C-boxes: this approach can be seen as an extension of the confidence bands making it possible to work on structures modeling all levels of confidence at the same time.
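As an illustration of the confidence band variant, the following sketch builds a band around an empirical distribution function. It rests on an assumption not stated in the text: it uses the Dvoretzky–Kiefer–Wolfowitz inequality in its usual two-sided form with Massart's constant, for which the half-width at confidence level 1 − α is sqrt(ln(2/α) / (2n)). The function names and the synthetic Gaussian sample are hypothetical.

```python
import math
import random


def dkw_half_width(n, alpha):
    """Half-width of the two-sided DKW band at confidence level 1 - alpha."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))


def dkw_band(sample, alpha=0.05):
    """Return the sorted sample and the band values at each sorted point."""
    xs = sorted(sample)
    n = len(xs)
    eps = dkw_half_width(n, alpha)
    # Empirical distribution function evaluated at each sorted sample point.
    ecdf = [(i + 1) / n for i in range(n)]
    # Clip the band to [0, 1], since it bounds a distribution function.
    lower = [max(0.0, f - eps) for f in ecdf]
    upper = [min(1.0, f + eps) for f in ecdf]
    return xs, lower, upper


random.seed(0)  # synthetic data, for illustration only
sample = [random.gauss(0.0, 1.0) for _ in range(10)]
xs, lower, upper = dkw_band(sample, alpha=0.05)

# The half-width shrinks only as 1 / sqrt(n).
for n in (10, 100, 1000):
    print(n, round(dkw_half_width(n, 0.05), 3))
```

With only 10 samples and α = 0.05, the half-width is about 0.43, so the band covers most of the interval [0,1] at every point. Bounds obtained from very small samples are therefore very loose.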
Probability box-based approaches have been applied in many areas of engineering. In the context of mechanics, we can cite the analysis of the buckling load of the Ariane 5 fairing (Oberguggenberger et al. 2009). Roy and Balch (2012) applied probability boxes to predict the thrust of a supersonic nozzle, while Zhang et al. (2011) applied them to the analysis of a finite element lattice.
Probability box-based approaches are generally well suited to analyses in which aleatory and epistemic uncertainties are simultaneously present. Their major disadvantage lies in the difficulty of obtaining the bounding functions $\overline{F}$ and $\underline{F}$ in practice, especially when little data is available (a small number of samples, or no samples at all and only expert opinions). Indeed, when few samples are available, the bounds that can usually be obtained on the distribution function are too wide to be useful in practice. Sometimes there are no samples at all (that is, there have been no formalized experiments that can be traced and exploited) and only expert opinions can provide information on the existing uncertainties. How can the bounds $\overline{F}$ and $\underline{F}$ be constructed from these expert opinions? Various approaches, which we shall review later, and, in particular, the Dempster–Shafer theory, which is conceptually quite close to probability boxes, have sought to overcome