Hybrid learning algorithm: The parameter ρ in Eq. (4.65) is not small enough in many situations. To obtain a better approximation, a hybrid learning algorithm that combines a linear ridge regression algorithm with the gradient descent method can be used to adjust Θ and θ0i according to the experience of Eq. (4.62) [105]. The performance measure is defined as

$$E = \sum_{k=1}^{n}\left[(1-\alpha)\,e_1^{2}(x_k) + \alpha\,e_2^{2}(x_k)\right] \qquad (4.66)$$
where α is a weighting parameter that defines the relative trade-off between the squared error loss and the experienced loss, and e1(xk) = yk − fFM(xk), e2(xk) = fSVR(xk) − fFM(xk). Thus, the first term characterizes the error between the desired output and the actual output, and the second term measures the difference between the actual output and the experienced output of the SVR. Each epoch of the hybrid learning algorithm is therefore composed of a forward pass and a backward pass, which apply the linear ridge regression algorithm and the gradient descent method, respectively, to minimize E over the parameters θ0i and Θ: the θ0i are identified by linear ridge regression in the forward pass, and Θ is updated by gradient descent in the backward pass. In addition, it is assumed that Gaussian membership functions are employed, and thus Θj is referred to as σj.
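As a minimal illustration of this performance measure, the sketch below (performance_measure and its argument names are illustrative) evaluates Eq. (4.66) for NumPy arrays holding yk, fFM(xk), and fSVR(xk) over a batch of samples:

```python
import numpy as np

def performance_measure(y, f_fm, f_svr, alpha):
    """Composite loss of Eq. (4.66): an alpha-weighted trade-off between
    the squared error loss and the experienced loss."""
    e1 = y - f_fm      # e1(xk) = yk - fFM(xk): desired vs. actual output
    e2 = f_svr - f_fm  # e2(xk) = fSVR(xk) - fFM(xk): experienced vs. actual output
    return (1.0 - alpha) * np.sum(e1**2) + alpha * np.sum(e2**2)
```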
Using Eqs. (4.62) and (4.63), and defining φi(xk) as the normalized firing strength of the ith fuzzy rule, so that the fuzzy model output is linear in the consequent parameters,

$$f_{FM}(x_k) = \sum_{i=1}^{c'} \theta_{0i}\,\varphi_i(x_k), \qquad (4.67)$$
at the minimum point of Eq. (4.66) all derivatives with respect to θ0i should vanish:

$$\frac{\partial E}{\partial \theta_{0m}} = -2\sum_{k=1}^{n}\left[(1-\alpha)\,e_1(x_k) + \alpha\,e_2(x_k)\right]\varphi_m(x_k) = 0 \qquad (4.68)$$
These conditions can be rewritten in the form of normal equations:
$$\sum_{k=1}^{n}\left[(1-\alpha)\,y_k + \alpha f_{SVR}(x_k)\right]\varphi_m(x_k) = \sum_{i=1}^{c'}\theta_{0i}\sum_{k=1}^{n}\varphi_i(x_k)\,\varphi_m(x_k) \qquad (4.69)$$
where m = 1, …, c′. This is a standard problem that forms the basis of linear regression, and the most widely used formula for estimating θ = [θ01 θ02 ⋯ θ0c′]T is given by the ridge regression algorithm:
$$\theta = \left(\Psi^{T}\Psi + \delta I\right)^{-1}\Psi^{T}\tilde{y} \qquad (4.70)$$
where δ is a positive scalar, and
$$\Psi = \left[\psi(x_1)\ \psi(x_2)\ \cdots\ \psi(x_n)\right]^{T}, \qquad \tilde{y} = \left[\tilde{y}_1\ \tilde{y}_2\ \cdots\ \tilde{y}_n\right]^{T}, \qquad \tilde{y}_k = (1-\alpha)\,y_k + \alpha f_{SVR}(x_k) \qquad (4.71)$$
where
ψ(xk) = [φ1(xk), φ2(xk), ⋯, φc′(xk)]T, k = 1, …, n.
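A minimal sketch of the forward pass, assuming NumPy; forward_pass and its argument names are illustrative, with Psi the n × c′ matrix whose kth row is ψ(xk)T:

```python
import numpy as np

def forward_pass(Psi, y, f_svr, alpha, delta):
    """Forward pass, Eq. (4.70): ridge-regression estimate of
    theta = [theta_01, theta_02, ..., theta_0c']^T.

    Psi   : (n, c') matrix with rows psi(x_k)^T, Eq. (4.71)
    y     : (n,) desired outputs y_k
    f_svr : (n,) experienced SVR outputs f_SVR(x_k)
    alpha : trade-off weight of Eq. (4.66)
    delta : positive ridge scalar
    """
    y_tilde = (1.0 - alpha) * y + alpha * f_svr  # blended target of Eq. (4.71)
    c = Psi.shape[1]
    # theta = (Psi^T Psi + delta I)^{-1} Psi^T y_tilde, via a linear solve
    return np.linalg.solve(Psi.T @ Psi + delta * np.eye(c), Psi.T @ y_tilde)
```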
In the backward pass, the error rates propagate backward and the σj are updated by the gradient descent method. The derivatives of E with respect to σj are

$$\frac{\partial E}{\partial \sigma_j} = -2\sum_{k=1}^{n}\left[(1-\alpha)\,e_1(x_k) + \alpha\,e_2(x_k)\right]\frac{\partial f_{FM}(x_k)}{\partial \sigma_j} \qquad (4.72)$$

where ∂fFM(xk)/∂σj follows from the Gaussian membership functions of Eq. (4.63), and each width is then updated as σj ← σj − η ∂E/∂σj with learning rate η.
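A complementary sketch of the backward pass, under the assumption that fFM is a zero-order fuzzy model with normalized Gaussian memberships; fuzzy_output, backward_pass, and all argument names are illustrative, and dE/dσj of Eq. (4.72) is approximated here by central finite differences rather than the analytic expression:

```python
import numpy as np

def fuzzy_output(X, centers, sigma, theta):
    """Stand-in for the fuzzy model of Eqs. (4.62)-(4.63):
    f_FM(x) = sum_i theta_0i * phi_i(x), with phi_i(x) the normalized
    Gaussian firing strength of Eq. (4.67)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # (n, c')
    mu = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian memberships, width sigma_j per rule
    phi = mu / mu.sum(axis=1, keepdims=True)  # normalization
    return phi @ theta

def backward_pass(X, y, f_svr, centers, sigma, theta, alpha, eta, h=1e-6):
    """Backward pass: one gradient-descent step on the widths sigma_j,
    with dE/dsigma_j of Eq. (4.72) estimated by central differences."""
    grad = np.zeros_like(sigma)
    for j in range(sigma.size):
        for sign in (+1.0, -1.0):
            s = sigma.copy()
            s[j] += sign * h
            f = fuzzy_output(X, centers, s, theta)
            E = (1.0 - alpha) * np.sum((y - f) ** 2) + alpha * np.sum((f_svr - f) ** 2)
            grad[j] += sign * E
        grad[j] /= 2.0 * h
    return sigma - eta * grad  # gradient-descent update of the widths
```

One epoch of the hybrid algorithm then chains the two passes, theta = forward_pass(Psi, y, f_svr, alpha, delta) followed by sigma = backward_pass(...), and the epochs are repeated until E stops decreasing.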