When the number of relations is very large, r‐GCN (relational graph convolutional network) [7] introduces two kinds of regularization to reduce the number of parameters for modeling relations: basis decomposition and block‐diagonal decomposition. With the basis decomposition, each $W_r$ is defined as follows:

      (5.7) $W_r = \sum_{b=1}^{B} a_{rb} V_b$

      Here each $W_r$ is a linear combination of basis transformations $V_b$ with coefficients $a_{rb}$. In the block‐diagonal decomposition, r‐GCN defines each $W_r$ through the direct sum over a set of low‐dimensional matrices; this variant needs more parameters than the basis decomposition.
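
      As a minimal NumPy sketch of the basis decomposition in Eq. (5.7), the snippet below composes a relation‐specific weight matrix from shared basis matrices; the sizes, the basis tensor V, and the coefficients a are illustrative placeholders rather than values from the r‐GCN reference implementation.

```python
import numpy as np

# Basis decomposition of Eq. (5.7): W_r = sum_b a_rb * V_b.
# d, B, num_relations and the random parameters are illustrative.
d, B, num_relations = 16, 4, 100

V = np.random.randn(B, d, d)              # shared basis transformations V_b
a = np.random.randn(num_relations, B)     # coefficients a_rb, one row per relation

def relation_weight(r):
    # Only B*d*d + num_relations*B parameters instead of num_relations*d*d.
    return np.tensordot(a[r], V, axes=1)  # W_r = sum_b a_rb * V_b, shape (d, d)

W_5 = relation_weight(5)                  # weight matrix for relation r = 5
```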

      Dynamic graphs: These have a static graph structure and dynamic input signals. To capture both kinds of information, the diffusion convolutional recurrent NN (DCRNN) [8] and spatial‐temporal graph convolutional networks (STGCN) [9] first collect spatial information with GNNs and then feed the outputs into a sequence model such as a sequence‐to‐sequence model or a convolutional neural network (CNN). On the other hand, structural‐recurrent NN [10] and STGCN [11] collect spatial and temporal messages at the same time: they extend the static graph structure with temporal connections so that traditional GNNs can be applied to the extended graphs.
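
      The “spatial first, then temporal” pattern used by DCRNN and STGCN can be sketched roughly as follows: a one‐hop graph aggregation is applied at each time step, and the resulting sequence is fed to a simple recurrent update. This is only a schematic stand‐in for the actual architectures, and all names and sizes (A_hat, W_g, W_x, W_h) are assumptions for illustration.

```python
import numpy as np

# Schematic "spatial then temporal" pipeline on a static graph with dynamic signals.
N, F, T, H = 5, 3, 10, 8                       # nodes, features, time steps, hidden size
A = (np.random.rand(N, N) > 0.5).astype(float)
np.fill_diagonal(A, 0)
A_hat = A + np.eye(N)                          # add self-loops
D_inv = np.diag(1.0 / A_hat.sum(1))
X_seq = np.random.randn(T, N, F)               # dynamic node signals on a static graph

W_g = np.random.randn(F, H)                    # spatial (graph) weights
W_x = np.random.randn(H, H)                    # recurrent input weights
W_h = np.random.randn(H, H)                    # recurrent state weights

h = np.zeros((N, H))
for t in range(T):
    spatial = np.tanh(D_inv @ A_hat @ X_seq[t] @ W_g)  # GNN step: gather spatial information
    h = np.tanh(spatial @ W_x + h @ W_h)               # sequence-model step over time
```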

      5.1.2 Propagation Types

      The propagation step and the output step of the model compute the hidden states of nodes (or edges). Several major modifications have been made to the propagation step relative to the original GNN model, whereas in the output step a simple feedforward neural network is the most popular setting. The variants use different aggregators to gather information from each node’s neighbors and specific updaters to update the nodes’ hidden states.

      Convolution has been generalized to the graph domain as well. Advances in this direction are often categorized as spectral approaches and non‐spectral (spatial) approaches. Spectral approaches work with a spectral representation of the graphs.

      Spectral network: The spectral network was proposed in Bruna et al. [12]. The convolution operation is defined in the Fourier domain by computing the eigendecomposition of the graph Laplacian. For the basics of these techniques, see Appendix 5.A. The operation can be defined as the multiplication of a signal $x \in \mathbb{R}^{N}$ (a scalar for each node) with a filter $g_\theta = \mathrm{diag}(\theta)$ parameterized by $\theta \in \mathbb{R}^{N}$:

      (5.8) $g_\theta \star x = U g_\theta(\Lambda) U^{T} x$

      where U is the matrix of the eigenvectors of the normalized graph Laplacian $L = I_N - D^{-1/2} A D^{-1/2} = U \Lambda U^{T}$ (D is the degree matrix, and A is the adjacency matrix of the graph), with a diagonal matrix of its eigenvalues Λ.
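
      A short NumPy sketch of Eq. (5.8) on a toy graph follows; the graph, the signal x, and the filter parameters theta are illustrative, not taken from [12].

```python
import numpy as np

# Eq. (5.8) on a toy 4-node ring graph: filter a signal in the graph Fourier domain.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(1)))
L = np.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt     # normalized Laplacian L = I - D^{-1/2} A D^{-1/2}

Lam, U = np.linalg.eigh(L)                      # L = U diag(Lam) U^T
x = np.random.randn(4)                          # one scalar signal value per node
theta = np.random.randn(4)                      # free spectral filter g_theta = diag(theta)

y = U @ np.diag(theta) @ U.T @ x                # g_theta * x = U g_theta U^T x
```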

      ChebyshevNet [13] truncates $g_\theta(\Lambda)$ in terms of Chebyshev polynomials $T_k(x)$ up to the Kth order as

      (5.9) $g_{\theta'} \star x \approx \sum_{k=0}^{K} \theta'_k T_k(\tilde{L})\, x$

      where $\tilde{L} = \frac{2}{\lambda_{\max}} L - I_N$, and parameter $\lambda_{\max}$ denotes the largest eigenvalue of L. $\theta' \in \mathbb{R}^{K}$ is now a vector of Chebyshev coefficients. The Chebyshev polynomials are defined by the recurrence $T_k(x) = 2xT_{k-1}(x) - T_{k-2}(x)$, with $T_0(x) = 1$ and $T_1(x) = x$. It can be observed that the operation is K‐localized since it is a Kth‐order polynomial in the Laplacian.
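
      The Chebyshev recurrence can be applied directly to the signal, as in the following sketch of Eq. (5.9); the function name cheb_filter and the coefficient vector theta_prime are illustrative, and L and x can be taken from the previous spectral‐network sketch.

```python
import numpy as np

def cheb_filter(L, x, theta_prime):
    """K-th order Chebyshev filtering of Eq. (5.9); expects len(theta_prime) >= 2."""
    lam_max = np.linalg.eigvalsh(L).max()
    L_tilde = 2.0 * L / lam_max - np.eye(L.shape[0])    # rescale spectrum into [-1, 1]
    T_prev, T_curr = x, L_tilde @ x                     # T_0(L~) x = x,  T_1(L~) x = L~ x
    out = theta_prime[0] * T_prev + theta_prime[1] * T_curr
    for k in range(2, len(theta_prime)):
        T_next = 2.0 * L_tilde @ T_curr - T_prev        # T_k = 2 L~ T_{k-1} - T_{k-2}
        out = out + theta_prime[k] * T_next
        T_prev, T_curr = T_curr, T_next
    return out                                          # K-localized: mixes only K-hop neighbours

# Example with the L and x of the previous sketch and K = 3:
# y = cheb_filter(L, x, theta_prime=np.random.randn(4))
```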

      Graph convolutional network (GCN) [14]: This limits the layer‐wise convolution operation to K = 1 to alleviate the problem of overfitting on local neighborhood structures for graphs with very wide node degree distributions. By approximating $\lambda_{\max} \approx 2$, the equation simplifies to

      (5.10) $g_{\theta'} \star x \approx \theta'_0 x + \theta'_1 (L - I_N)x = \theta'_0 x - \theta'_1 D^{-1/2} A D^{-1/2} x$
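
      A minimal sketch of this first‐order filter on a toy graph is given below; the graph and the two remaining parameters theta0 and theta1 are illustrative.

```python
import numpy as np

# First-order simplification of Eq. (5.10) on a toy 3-node graph.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(1)))
S = d_inv_sqrt @ A @ d_inv_sqrt               # D^{-1/2} A D^{-1/2}
x = np.random.randn(3)
theta0, theta1 = 0.7, -0.7                    # the two remaining filter parameters

y = theta0 * x - theta1 * S @ x               # theta'_0 x - theta'_1 D^{-1/2} A D^{-1/2} x
```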

      Non‐spectral approaches define convolutions directly on the graph, operating on spatially close neighbors. The major challenge with non‐spectral approaches is defining the convolution operation with differently sized neighborhoods and maintaining the local invariance of CNNs.

      Neural fingerprints (FP) [15] use different weight matrices for nodes with different degrees,

      (5.12) $x = h_v^{t-1} + \sum_{i=1}^{|N_v|} h_i^{t-1}, \qquad h_v^{t} = \sigma\!\left(x W_t^{|N_v|}\right)$

      where $W_t^{|N_v|}$ is the weight matrix for nodes with degree $|N_v|$ at layer t. The main drawback of the method is that it cannot be applied to large‐scale graphs with a wide range of node degrees, since a separate weight matrix is required for each degree.
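
      The degree‐dependent weighting of Eq. (5.12) can be sketched as follows; the graph, the hidden size, the per‐degree weight dictionary W_deg, and the choice σ = tanh are illustrative assumptions.

```python
import numpy as np

# Eq. (5.12): one weight matrix per node degree; sigma is taken as tanh.
N, d_hidden, max_degree = 4, 8, 3
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H_prev = np.random.randn(N, d_hidden)                     # h_v^{t-1} for every node v
W_deg = {k: np.random.randn(d_hidden, d_hidden)           # W_t^{|N_v|}: one matrix per degree
         for k in range(1, max_degree + 1)}

H_next = np.zeros_like(H_prev)
for v in range(N):
    neigh = np.nonzero(A[v])[0]
    x = H_prev[v] + H_prev[neigh].sum(axis=0)             # x = h_v^{t-1} + sum_i h_i^{t-1}
    H_next[v] = np.tanh(x @ W_deg[len(neigh)])            # weight matrix chosen by node degree
```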

      Diffusion‐convolutional neural networks (DCNNs) were proposed in [16]. In DCNN, transition matrices are used to define the neighborhood of each node. For node classification, the model has the form

      (5.13) $H = f\left(W^{c} \odot P^{*} X\right)$

      where X is an N × F tensor of input features (N is the number of nodes, and F is the number of features). $P^{*}$ is an N × K × N tensor that contains the power series $\{P, P^{2}, \ldots, P^{K}\}$ of matrix P, and P is the degree‐normalized transition matrix obtained from the graph adjacency matrix A. Each entity is transformed to a diffusion‐convolutional representation, a K × F matrix defined by K hops of graph diffusion over F features, which is then combined element‐wise with a K × F weight matrix $W^{c}$ and passed through a nonlinear activation function f. Finally, H (which is N × K × F) denotes the diffusion representations of each node in the graph. For graph classification, DCNN simply takes the average of the nodes’ representations,

      (5.14) $H = f\left(W^{c} \odot \frac{1_N^{T} P^{*} X}{N}\right)$

      where $1_N$ is an N × 1 vector of ones. DCNN can also be applied to edge classification tasks, which requires converting edges to nodes and augmenting the adjacency matrix.
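
      The diffusion‐convolutional representations of Eqs. (5.13) and (5.14) can be sketched as follows; the toy graph, the features X, the weights W_c, and the choice f = tanh are illustrative, with the element‐wise product applied as in the text.

```python
import numpy as np

# Eqs. (5.13)-(5.14): diffusion-convolutional representations; f is taken as tanh.
N, F, K = 4, 3, 2
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)                 # degree-normalized transition matrix
P_star = np.stack([np.linalg.matrix_power(P, k)      # power series {P, P^2, ..., P^K}
                   for k in range(1, K + 1)])        # shape (K, N, N)
X = np.random.randn(N, F)                            # N x F input features
W_c = np.random.randn(K, F)                          # K x F weight matrix W^c

# Node classification, Eq. (5.13): H[v] is the K x F diffusion representation of node v.
H = np.tanh(W_c * np.einsum('knm,mf->nkf', P_star, X))            # shape (N, K, F)

# Graph classification, Eq. (5.14): average the diffusion representations over all nodes.
H_graph = np.tanh(W_c * np.einsum('knm,mf->kf', P_star, X) / N)   # shape (K, F)
```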