4.2.1 Pixel‐wise Decomposition
We start with the concept of pixel-wise image decomposition, which is designed to understand the contribution of a single pixel of an image x to the prediction f(x) made by a classifier f in an image classification task. We would like to find out, separately for each image x, which pixels contribute to what extent to a positive or negative classification result. In addition, we want to express this extent quantitatively by a measure. We assume that the classifier has real-valued outputs with mapping f : ℝ^V → ℝ^1 such that f(x) > 0 denotes the presence of the learned structure. We are interested in finding out the contribution of each input pixel x_d of an input image x to a particular prediction f(x). The important constraint specific to classification is that the contribution is measured relative to the state of maximal uncertainty with respect to classification, which is represented by the set of root points f(x₀) = 0. One possible way is to decompose the prediction f(x) as a sum of terms over the separate input dimensions x_d:

f(x) ≈ ∑_{d=1}^{V} R_d    (4.1)
Here, the qualitative interpretation is that R_d < 0 contributes evidence against the presence of a structure that is to be classified, whereas R_d > 0 contributes evidence for its presence. More generally, positive values denote positive contributions and negative values negative contributions.
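To make Eq. (4.1) concrete, the following minimal sketch computes such a decomposition for the special case of a bias-free linear classifier, where R_d = w_d x_d satisfies Eq. (4.1) exactly; the weights, data, and variable names are illustrative assumptions, not taken from the text.

import numpy as np

# Sketch of Eq. (4.1) for a bias-free linear classifier f(x) = w.x,
# where the decomposition R_d = w_d * x_d is exact (toy data, assumed setup).
rng = np.random.default_rng(0)
V = 6                        # number of input dimensions (pixels)
w = rng.normal(size=V)       # classifier weights
x = rng.normal(size=V)       # one flattened input image

f_x = w @ x                  # prediction f(x)
R = w * x                    # pixel-wise relevances R_d

assert np.isclose(R.sum(), f_x)                  # Eq. (4.1): relevances sum to f(x)
print("evidence for:    ", np.where(R > 0)[0])   # R_d > 0 supports the structure
print("evidence against:", np.where(R < 0)[0])   # R_d < 0 speaks against it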
LRP: Returning to multilayer ANNs, we will introduce LRP as a concept defined by a set of constraints. In its general form, the concept assumes that the classifier can be decomposed into several layers of computation, which is a structure used in deep NNs. The first layer contains the inputs, the pixels of the image; the last layer is the real-valued prediction output of the classifier f. The l-th layer is modeled as a vector z = (z_d^{(l)})_{d=1}^{V(l)} with dimensionality V(l). LRP assumes that we have a relevance score R_d^{(l+1)} for each dimension z_d^{(l+1)} of the vector z at layer l+1. The idea is to find a relevance score R_d^{(l)} for each dimension z_d^{(l)} of the vector z at layer l, closer to the input, such that the following holds:

f(x) = ⋯ = ∑_{d ∈ l+1} R_d^{(l+1)} = ∑_{d ∈ l} R_d^{(l)} = ⋯ = ∑_{d} R_d^{(1)}    (4.2)
Iterating Eq. (4.2) from the last layer, which is the classifier output f(x), back to the input layer x consisting of image pixels then yields the desired Eq. (4.1). The relevance scores of the input layer serve as the sum decomposition sought in Eq. (4.1). In the following, we will derive further constraints beyond Eqs. (4.1) and (4.2) and motivate them by examples. A decomposition satisfying Eq. (4.2) is per se neither unique, nor is it guaranteed to yield a meaningful interpretation of the classifier prediction.
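To illustrate the iteration just described, the sketch below propagates relevance through a tiny bias-free ReLU network using the basic z-rule from the LRP literature, R_j^{(l)} = ∑_k (a_j w_jk / z_k) R_k^{(l+1)}; the architecture, the seed, and the helper name lrp_step are assumptions made for illustration only.

import numpy as np

# Toy two-layer bias-free ReLU network (assumed setup, not from the text).
rng = np.random.default_rng(1)
W1 = rng.normal(size=(6, 4))     # input -> hidden weights (V = 6 pixels)
W2 = rng.normal(size=(4, 1))     # hidden -> output weights (scalar f(x))

x  = rng.normal(size=6)
a1 = np.maximum(0.0, x @ W1)     # hidden activations (ReLU)
f  = float(a1 @ W2)              # prediction f(x), the last layer

def lrp_step(a, W, R_upper, eps=1e-9):
    """Redistribute upper-layer relevance onto the activations a (z-rule)."""
    z = a @ W                                         # z_k = sum_j a_j w_jk
    s = R_upper / np.where(z >= 0, z + eps, z - eps)  # stabilized R_k / z_k
    return a * (W @ s)                                # R_j = a_j sum_k w_jk s_k

R2 = np.array([f])               # start at the output: total relevance = f(x)
R1 = lrp_step(a1, W2, R2)        # relevance of the hidden layer
R0 = lrp_step(x,  W1, R1)        # relevance of the input pixels, as in Eq. (4.1)

print(f, R1.sum(), R0.sum())     # Eq. (4.2): all three sums (approximately) agree

Because this toy network has no bias terms, the total relevance is conserved from layer to layer up to the numerical stabilizer eps.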
As an example, suppose we have one layer. The inputs are x ∈ ℝ^V. We use a linear classifier with some arbitrary and dimension-specific feature space mapping φ_d and a bias b:

f(x) = b + ∑_{d=1}^{V} α_d φ_d(x_d)    (4.3)
Let us define the relevance for the second layer trivially as

R_1^{(2)} = f(x)    (4.4)

Then one possible layer-wise relevance propagation formula is to define the relevance R^{(1)} for the inputs x as

R_d^{(1)} = f(x)/V
This clearly satisfies Eqs. (4.1) and (4.2); however, the relevances R_d^{(1)} of all input dimensions have the same sign as the prediction f(x). In terms of the pixel-wise decomposition interpretation, all inputs point toward the presence of a structure if f(x) > 0 and toward its absence if f(x) < 0. For many classification problems this is not a realistic interpretation. As a solution, for this example we define an alternative:

R_d^{(1)} = α_d φ_d(x_d) + b/V
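To see the difference between the two decompositions, the following sketch evaluates both on the one-layer classifier of Eq. (4.3); the choice φ_d = tanh for every d, the bias value, and the toy data are assumptions for illustration.

import numpy as np

# One-layer example of Eq. (4.3) with an assumed feature mapping phi_d = tanh.
rng   = np.random.default_rng(2)
V     = 5
alpha = rng.normal(size=V)           # weights alpha_d
b     = 0.3                          # bias
phi   = np.tanh                      # same mapping for every d (simplification)

x   = rng.normal(size=V)
f_x = b + np.sum(alpha * phi(x))     # prediction, Eq. (4.3)

R_trivial = np.full(V, f_x / V)      # R_d = f(x)/V: same sign as f(x) everywhere
R_alt     = alpha * phi(x) + b / V   # alternative: signs may differ per pixel

print(np.isclose(R_trivial.sum(), f_x), np.isclose(R_alt.sum(), f_x))  # True True
print(np.sign(R_trivial), np.sign(R_alt))

Both variants satisfy the sum constraint of Eq. (4.1), but only the alternative can assign an individual pixel a sign that deviates from the sign of f(x).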