Artificial Intelligence and Quantum Computing for Advanced Wireless Networks. Savo G. Glisic.

Information about the work:

Author:	Savo G. Glisic
Publisher:	John Wiley & Sons Limited
Genre:	Programs
ISBN:	9781119790310


$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) \langle x_i, x \rangle + b.$

This is the so‐called support vector expansion; that is, w can be completely described as a linear combination of the training patterns x_i. In a sense, the complexity of a function’s representation by SVs is independent of the dimensionality of the input space X, and depends only on the number of SVs. Moreover, note that the complete algorithm can be described in terms of dot products between the data. Even when evaluating f(x), we need not compute w explicitly. These observations will come in handy for the formulation of a nonlinear extension.
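As a small illustration of the point that f(x) can be evaluated without ever forming w, the following sketch computes the expansion from dot products alone and compares it with the value obtained from an explicitly built w. The patterns, coefficients, and offset are hypothetical numbers chosen for the example, not values from the text.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def predict_dual(coeffs, patterns, b, x):
    """f(x) = sum_i (alpha_i - alpha_i^*) <x_i, x> + b, using dot products only."""
    return sum(c * dot(xi, x) for c, xi in zip(coeffs, patterns)) + b


def explicit_w(coeffs, patterns):
    """w = sum_i (alpha_i - alpha_i^*) x_i, built here only for comparison."""
    dim = len(patterns[0])
    return [sum(c * xi[k] for c, xi in zip(coeffs, patterns)) for k in range(dim)]


# Hypothetical support vectors in R^2 and their coefficients alpha_i - alpha_i^*.
patterns = [(1.0, 2.0), (0.0, -1.0), (3.0, 0.5)]
coeffs = [0.5, -0.25, 0.1]
b = 0.2
x = (2.0, 1.0)

f_dual = predict_dual(coeffs, patterns, b, x)          # never touches w
f_primal = dot(explicit_w(coeffs, patterns), x) + b    # forms w first
```

Both routes give the same value; the cost of the first depends on the number of support vectors rather than on how w is represented.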

Computing b: Parameter b can be computed by exploiting the so‐called Karush–Kuhn–Tucker (KKT) conditions, which state that at the solution the product between dual variables and constraints has to vanish, giving α_i(ε + ξ_i − y_i + 〈w, x_i〉 + b) = 0, α_i^*(ε + ξ_i^* + y_i − 〈w, x_i〉 − b) = 0, (C − α_i)ξ_i = 0, and (C − α_i^*)ξ_i^* = 0. This allows us to draw several useful conclusions:

1 Only samples (x_i, y_i) with corresponding α_i = C or α_i^* = C lie outside the ε‐insensitive tube.

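2 α_i α_i^* = 0; that is, there can never be a pair of dual variables α_i, α_i^* that are both simultaneously nonzero. This allows us to conclude that

(4.51) $\varepsilon - y_i + \langle w, x_i \rangle + b \ge 0$ and $\xi_i = 0$ if $\alpha_i < C$,

(4.52) $\varepsilon - y_i + \langle w, x_i \rangle + b \le 0$ if $\alpha_i > 0$.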

In conjunction with an analogous analysis on α_i^*, we have for b

(4.53)
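A practical consequence of the KKT conditions is that any sample whose dual variable lies strictly between 0 and C determines b exactly: for 0 < α_i < C the slack ξ_i vanishes and the first condition gives b = y_i − 〈w, x_i〉 − ε, while for 0 < α_i^* < C one gets b = y_i − 〈w, x_i〉 + ε. The sketch below applies this to a one‐dimensional toy problem. The function name `estimate_b`, the tolerance, the averaging over candidates, and all numbers are assumptions made for illustration.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def estimate_b(w, X, y, alpha, alpha_star, C, eps, tol=1e-9):
    """Average the b values implied by samples with a dual variable strictly in (0, C)."""
    candidates = []
    for xi, yi, a, a_s in zip(X, y, alpha, alpha_star):
        if tol < a < C - tol:
            # xi_i = 0 and eps - y_i + <w, x_i> + b = 0
            candidates.append(yi - dot(w, xi) - eps)
        if tol < a_s < C - tol:
            # xi_i^* = 0 and eps + y_i - <w, x_i> - b = 0
            candidates.append(yi - dot(w, xi) + eps)
    if not candidates:
        raise ValueError("no dual variable strictly inside (0, C); b is only bounded")
    return sum(candidates) / len(candidates)


# Hypothetical 1-D problem whose solution is w = 2, b = 1 with eps = 0.1.
X = [[1.0], [2.0], [3.0]]
y = [2.9, 5.1, 7.0]            # lower tube edge, upper tube edge, inside the tube
alpha = [0.0, 2.0, 0.0]
alpha_star = [2.0, 0.0, 0.0]
C, eps = 10.0, 0.1

# w follows from the support vector expansion w = sum_i (alpha_i - alpha_i^*) x_i.
w = [sum((a - a_s) * xi[0] for a, a_s, xi in zip(alpha, alpha_star, X))]
b = estimate_b(w, X, y, alpha, alpha_star, C, eps)
```

Averaging over all qualifying samples is a common way to reduce numerical noise; in exact arithmetic every candidate yields the same b.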

Kernels: We are interested in making the SV algorithm nonlinear. This could be achieved, for instance, by simply preprocessing the training patterns x_i with a map Φ: X → F into some feature space F, as already described in Chapter 2, and then applying the standard SV regression algorithm. Let us have a brief look at the example given in Figure 2.8 of Chapter 2. There we had quadratic features in ℝ², with the map Φ: ℝ² → ℝ³ given by Φ(x₁, x₂) = (x₁², √2 x₁x₂, x₂²). It is understood that the subscripts in this case refer to the components of x ∈ ℝ². Training a linear SV machine on the preprocessed features would yield a quadratic function, as indicated in Figure 2.8. Although this approach seems reasonable in the particular example above, it can easily become computationally infeasible for both polynomial features of higher order and higher dimensionality.
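To get a feel for how quickly explicit polynomial features grow, one can count the distinct monomials of degree p in d input variables, which is the binomial coefficient C(d + p − 1, p). The short sketch below evaluates this count; the 256‐dimensional input with degree 5 is an illustrative assumption, not a case taken from the text.

```python
from math import comb


def num_monomials(d, p):
    """Number of distinct monomials of exact degree p in d variables."""
    return comb(d + p - 1, p)


small = num_monomials(2, 2)    # the quadratic map from R^2 used above
large = num_monomials(256, 5)  # hypothetical: degree-5 features of a 256-dimensional input
```

For d = 2 and p = 2 the count is 3, matching the map into ℝ³; for the larger hypothetical case the feature vector would have billions of components.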

Implicit mapping via kernels: Clearly this approach is not feasible, and we have to find a computationally cheaper way. The key observation [96] is that for the feature map of the above example we have

(4.54) $\langle (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2), ({x'_1}^2, \sqrt{2}\, x'_1 x'_2, {x'_2}^2) \rangle = \langle x, x' \rangle^2$
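The identity in Eq. (4.54) is easy to check numerically. The sketch below computes the dot product in the three‐dimensional feature space and compares it with the squared dot product in the input space; the two test points are arbitrary values chosen for the example.

```python
from math import isclose, sqrt


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit quadratic feature map R^2 -> R^3."""
    x1, x2 = x
    return (x1 * x1, sqrt(2.0) * x1 * x2, x2 * x2)


def k_quadratic(x, xp):
    """Kernel k(x, x') = <x, x'>^2, evaluated in the input space."""
    return dot(x, xp) ** 2


x, xp = (1.0, 2.0), (3.0, -1.0)
lhs = dot(phi(x), phi(xp))   # dot product after mapping to feature space
rhs = k_quadratic(x, xp)     # same quantity without ever computing phi
agree = isclose(lhs, rhs, rel_tol=1e-12, abs_tol=1e-12)
```

The right‐hand side needs one dot product in ℝ² and a squaring, regardless of how large the implied feature space is.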

As noted in the previous section, the SV algorithm only depends on the dot products between patterns x_i. Hence, it suffices to know k(x, x′) ≔ 〈Φ(x), Φ(x′)〉 rather than Φ explicitly, which allows us to restate the SV optimization problem:

(4.55) equation
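The body of this equation is not reproduced in the text at hand. For orientation only, the kernelized ε‐SVR dual is commonly written in the following standard form; it is given here as an assumption about what the restated problem looks like, using the same symbols as the KKT conditions and Eq. (4.56).

```latex
\begin{aligned}
\text{maximize}\quad
  & -\tfrac{1}{2}\sum_{i,j=1}^{p}(\alpha_i-\alpha_i^*)(\alpha_j-\alpha_j^*)\,k(x_i,x_j)
    \;-\;\varepsilon\sum_{i=1}^{p}(\alpha_i+\alpha_i^*)
    \;+\;\sum_{i=1}^{p} y_i(\alpha_i-\alpha_i^*) \\
\text{subject to}\quad
  & \sum_{i=1}^{p}(\alpha_i-\alpha_i^*) = 0,
    \qquad \alpha_i,\ \alpha_i^* \in [0, C].
\end{aligned}
```

The only change relative to the linear problem is that every dot product 〈x_i, x_j〉 is replaced by k(x_i, x_j).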

Now the expansion for f in Eq. (4.50) may be written as w = Σ_{i=1}^{p} (α_i − α_i^*) Φ(x_i) and

(4.56) $f(x) = \sum_{i=1}^{p} (\alpha_i - \alpha_i^*)\, k(x_i, x) + b.$

The difference from the linear case is that w is no longer given explicitly. Also, note that in the nonlinear setting, the optimization problem corresponds to finding the flattest function in the feature space, not in the input space.
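A minimal sketch of Eq. (4.56) follows, with the kernel passed in as a function so that the same predictor covers both the linear case and the quadratic example. The support vectors, coefficients, and offset are hypothetical values, and the helper names are chosen for the example.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(x, xp):
    """k(x, x') = <x, x'>, which recovers the linear SV expansion."""
    return dot(x, xp)


def quadratic_kernel(x, xp):
    """k(x, x') = <x, x'>^2, the kernel of the quadratic feature map."""
    return dot(x, xp) ** 2


def predict(coeffs, patterns, b, x, kernel):
    """f(x) = sum_i (alpha_i - alpha_i^*) k(x_i, x) + b; w is never formed."""
    return sum(c * kernel(xi, x) for c, xi in zip(coeffs, patterns)) + b


# Hypothetical support vectors and coefficients alpha_i - alpha_i^*.
patterns = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
coeffs = [0.5, -0.5, 0.25]
b = 0.1
x = (2.0, 3.0)

f_lin = predict(coeffs, patterns, b, x, linear_kernel)
f_quad = predict(coeffs, patterns, b, x, quadratic_kernel)
```

Swapping the kernel changes the function class (linear versus quadratic in x) while the prediction code and its cost per support vector stay the same.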

4.4.3 Combination of Fuzzy Models and SVR

Given observation data from an unknown system, data‐driven methods aim to construct a decision function f(x) that can serve as an approximation of the system. As seen from the previous sections, both fuzzy models and SVR are employed to describe the decision function. Fuzzy models characterize the system by a collection of interpretable if‐then rules, and a general fuzzy model that consists of a set of rules with the following structure will be used here:

(4.57)

Here, parameter d is the dimension of the antecedent variables x = [x₁, x₂, … , x_d]^T, R_i is the i‐th rule in the rule base, and A_i1 , … , A_ipx are fuzzy sets defined for the respective antecedent variable. The rule consequent g_i(x, β_i) is a function of the inputs with parameters β_i . Parameter c is the number of fuzzy rules. By modification of Eq.
