the relationships between the three units are:
– natural unit: 1 nat = log2(e) = 1/loge(2) ≈ 1.44 bits of information;
– decimal unit: 1 dit = log2(10) = 1/log10(2) ≈ 3.32 bits of information.
They are pseudo-units without dimension.
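As a quick numerical check of the conversion factors above, the following sketch computes them directly (the constant names are ours, not from the text):

```python
import math

# Conversion factors between information units (constant names are ours):
NAT_TO_BITS = 1 / math.log(2)   # 1 nat = log2(e) = 1/loge(2) ~ 1.4427 bits
DIT_TO_BITS = math.log2(10)     # 1 dit = log2(10) = 1/log10(2) ~ 3.3219 bits

print(round(NAT_TO_BITS, 2))  # 1.44
print(round(DIT_TO_BITS, 2))  # 3.32
```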
2.3.1. Entropy of a source
Let a stationary memoryless source S produce independent random events (symbols) si, belonging to a predetermined set [S] = [s1, s2, ..., sN]. Each symbol si has a given probability pi, with:

Σ_{i=1}^{N} pi = 1
The source S is then characterized by the set of probabilities [P] = [p1, p2, ..., pN]. We are now interested in the average amount of information delivered by this source, that is to say, the information resulting from the whole set of symbols it can produce, each taken into account with its probability of occurrence. This average amount of information from the source S is called the “entropy H(S) of the source”.
It is therefore defined by:

H(S) = − Σ_{i=1}^{N} pi log2(pi)   (bits of information/symbol)   [2.15]
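A minimal sketch of the entropy definition [2.15] in Python (the function name is ours):

```python
import math

def entropy(probs):
    """Entropy H(S) = -sum(p_i * log2(p_i)), in bits of information per symbol.

    The convention 0 * log2(0) = 0 is applied for zero probabilities.
    """
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit/symbol
print(entropy([0.25] * 4))   # 2.0 bits/symbol
```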
2.3.2. Fundamental lemma
Let two probability partitions on S:

[P] = [p1, p2, ..., pN] and [Q] = [q1, q2, ..., qN], with Σ_{i} pi = Σ_{i} qi = 1

we have the inequality:

− Σ_{i=1}^{N} pi log2(pi) ≤ − Σ_{i=1}^{N} pi log2(qi)   [2.16]

Indeed, since loge(x) ≤ x − 1, ∀x positive real, then:

Σ_{i=1}^{N} pi loge(qi/pi) ≤ Σ_{i=1}^{N} pi (qi/pi − 1) = Σ_{i} qi − Σ_{i} pi = 0

which, after dividing by loge(2), gives [2.16].
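The lemma can be checked numerically on an arbitrary pair of distributions; equality holds only when the two partitions coincide (function names are ours):

```python
import math

def entropy(p):
    # left-hand side of [2.16]: -sum p_i log2 p_i
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_term(p, q):
    # right-hand side of [2.16]: -sum p_i log2 q_i
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q))

p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

print(entropy(p) <= cross_term(p, q))                  # True: the lemma holds
print(abs(entropy(p) - cross_term(p, p)) < 1e-12)      # True: equality when Q = P
```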
2.3.3. Properties of entropy
– Positive: since 0 ≤ pi ≤ 1 (with the convention 0 · log2(0) = 0).
– Continuous: because it is a sum of continuous functions “log” of each pi.
– Symmetric: relative to all the variables pi.
– Upper bounded: entropy has a maximum value Hmax = log2(N), obtained for a uniform law: pi = 1/N, ∀i.
– Additive: let S1 and S2 be two independent sources; then:

H(S1, S2) = H(S1) + H(S2)   [2.17]
2.3.4. Examples of entropy
2.3.4.1. Two-event entropy (Bernoulli’s law)
Figure 2.1. Entropy of a two-event source
With [P] = [p, 1 − p], the entropy is H = −p log2(p) − (1 − p) log2(1 − p). The maximum of the entropy is obtained for p = 1/2, where H = 1 bit of information per symbol.
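The two-event (Bernoulli) entropy can be sketched as follows (the function name is ours):

```python
import math

def binary_entropy(p):
    """Entropy of a two-event source with [P] = [p, 1 - p], in bits."""
    if p in (0.0, 1.0):
        return 0.0  # convention: 0 * log2(0) = 0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))   # 1.0 -- the maximum, reached at p = 1/2
print(binary_entropy(0.9))   # less than 1 bit for any p != 1/2
```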
2.3.4.2. Entropy of an alphabetic source with (26 + 1) characters
– For a uniform law: ⟹ H = log2(27) = 4.75 bits of information per character.
– In the French language (according to a statistical study): ⟹ H = 3.98 bits of information per character.
Thus, a text of 100 characters provides an amount of information of 100 × 3.98 = 398 bits.
The non-uniformity of the probabilities thus causes a loss of 475 − 398 = 77 bits of information over these 100 characters.
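The figures above can be reproduced directly (the French-language entropy is the measured value quoted in the text):

```python
import math

h_uniform = math.log2(27)   # uniform 27-character alphabet: ~4.75 bits/character
h_french = 3.98             # measured entropy of French text, bits/character

print(round(h_uniform, 2))                       # 4.75
print(round(100 * h_french))                     # 398 bits per 100 characters
print(round(100 * h_uniform - 100 * h_french))   # loss: 77 bits per 100 characters
```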
2.4. Information rate and redundancy of a source
The information rate of a source is defined by:

Ht = H(S)/τ   (bits of information/second)   [2.18]

where τ is the average duration of emission of a symbol.

The redundancy of a source is defined as follows:

R = (Hmax − H(S))/Hmax = 1 − H(S)/Hmax   [2.19]
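A short numerical sketch of [2.18] and [2.19], using the 27-character French source from the previous example; the mean symbol duration is an assumed figure for illustration:

```python
import math

h_source = 3.98           # bits of information per character (French text)
h_max = math.log2(27)     # ~4.75 bits/character for the uniform law
tau = 0.01                # ASSUMED mean symbol duration: 10 ms per character

rate = h_source / tau                  # information rate [2.18], bits/second
redundancy = 1 - h_source / h_max      # redundancy [2.19], dimensionless

print(round(rate))              # 398 bits of information per second
print(round(redundancy, 3))     # fraction of capacity "wasted" by the source
```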
2.5. Discrete channels and entropies
Between the source of information and the destination, there is the medium through which information is transmitted. This medium, including the equipment necessary for transmission, is called the transmission channel (or simply the channel).
Let us consider a discrete, stationary and memoryless channel (discrete: the alphabets of the symbols at the input and at the output are discrete).
Figure 2.2. Basic transmission system based on a discrete channel. For a color version of this figure, see www.iste.co.uk/assad/digital1.zip
We denote:
– [X] = [x1, x2, ..., xn]: the set of all the symbols at the input of the channel;
– [Y] = [y1, y2, ..., ym]: the set of all the symbols at the output of the channel;
– [P(X)] = [p(x1), p(x2), ..., p(xn)]: the vector of probabilities of the symbols at the input of the channel;
– [P(Y)] = [p(y1), p(y2), ..., p(ym)]: the vector of probabilities of the symbols at the output of the channel.
Because of the perturbations, the space [Y] can be different from the space [X], and the probabilities P(Y) can be different from the probabilities P(X).
We define a product space [X · Y] and we introduce the matrix of the probabilities of the joint input-output symbols [P(X, Y)]:

[P(X, Y)] = [p(xi, yj)],   i = 1, ..., n;   j = 1, ..., m   [2.20]

We deduce, from this matrix of probabilities, the marginal probabilities:

p(xi) = Σ_{j=1}^{m} p(xi, yj)   [2.21]

p(yj) = Σ_{i=1}^{n} p(xi, yj)   [2.22]
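The marginalizations [2.21] and [2.22] can be sketched on a small hypothetical joint matrix (the 2×2 values below are ours, chosen only for illustration):

```python
# Hypothetical 2x2 joint probability matrix [P(X, Y)] for a binary channel:
p_xy = [
    [0.4, 0.1],   # p(x1, y1), p(x1, y2)
    [0.1, 0.4],   # p(x2, y1), p(x2, y2)
]

# Marginal probabilities, equations [2.21] and [2.22]:
p_x = [sum(row) for row in p_xy]                         # p(xi) = sum_j p(xi, yj)
p_y = [sum(row[j] for row in p_xy) for j in range(2)]    # p(yj) = sum_i p(xi, yj)

print([round(v, 1) for v in p_x])   # [0.5, 0.5]
print([round(v, 1) for v in p_y])   # [0.5, 0.5]
```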
We then define the following entropies:
– the entropy of the source:

H(X) = − Σ_{i=1}^{n} p(xi) log2 p(xi)   [2.23]

– the entropy of variable Y at the output of the transmission channel:

H(Y) = − Σ_{j=1}^{m} p(yj) log2 p(yj)   [2.24]

– the entropy of the two joint input-output variables (X, Y):

H(X, Y) = − Σ_{i=1}^{n} Σ_{j=1}^{m} p(xi, yj) log2 p(xi, yj)   [2.25]
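The three entropies [2.23]–[2.25] can be computed from a joint matrix; the 2×2 values below are a hypothetical noisy binary channel of our own choosing:

```python
import math

def h(probs):
    # Entropy in bits, with the convention 0 * log2(0) = 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint matrix [P(X, Y)] for a noisy binary channel:
p_xy = [[0.4, 0.1],
        [0.1, 0.4]]

p_x = [sum(row) for row in p_xy]                        # marginals [2.21]
p_y = [sum(row[j] for row in p_xy) for j in range(2)]   # marginals [2.22]
flat = [p for row in p_xy for p in row]                 # all joint probabilities

print(round(h(p_x), 3))    # H(X)    [2.23]
print(round(h(p_y), 3))    # H(Y)    [2.24]
print(round(h(flat), 3))   # H(X, Y) [2.25]
```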
2.5.1. Conditional entropies
Because of the disturbances in the transmission