where the first matrix is denoted as A, and * is the convolution operator.
The MATLAB command B = im2col(A, [2 2]) gives the B matrix, which is an expanded version of A:
(3.82)
Note that the first column of B corresponds to the first 2 × 2 region in A, in column‐first order, corresponding to $(i^{l+1}, j^{l+1}) = (0, 0)$. Similarly, the second through last columns of B correspond to the regions in A with $(i^{l+1}, j^{l+1})$ equal to (1, 0), (0, 1), (1, 1), (0, 2), and (1, 2), respectively. That is, the MATLAB im2col function explicitly expands the elements required for each individual convolution into one column of the matrix B. The transpose, $B^T$, is called the im2row expansion of A. If we vectorize the convolution kernel itself (in the same column‐first order) into the vector $(1, 1, 1, 1)^T$, we find that the product $B^T (1, 1, 1, 1)^T$ is a vector whose entries are the individual convolution outputs.
If we reshape this resulting vector properly, we get the exact convolution result matrix in Eq. (3.81).
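The expansion and the matrix-vector product described above can be sketched in plain Python. The matrix A below is an illustrative 3 × 4 example chosen here (an assumption, not necessarily the matrix of Eq. (3.81)), and the helper `im2col` is a hypothetical pure-Python stand-in for the sliding-window behavior of MATLAB's im2col, not MATLAB's implementation.

```python
def im2col(A, kh, kw):
    """Sliding-window expansion of a 2-D list A.

    Each column of the returned matrix B is one kh x kw patch of A,
    read in column-first order. Patches are also visited column-first.
    """
    rows, cols = len(A), len(A[0])
    out_h, out_w = rows - kh + 1, cols - kw + 1
    columns = []
    for j in range(out_w):          # output column index
        for i in range(out_h):      # output row index
            patch = [A[i + di][j + dj] for dj in range(kw) for di in range(kh)]
            columns.append(patch)
    # columns[k] is the k-th column of B; rearrange into a list of rows.
    B = [[col[r] for col in columns] for r in range(kh * kw)]
    return B, out_h, out_w


# Illustrative input (assumed values) and the all-ones 2 x 2 kernel.
A = [[1, 2, 3, 1],
     [4, 5, 6, 1],
     [7, 8, 9, 1]]
B, out_h, out_w = im2col(A, 2, 2)
kernel_vec = [1, 1, 1, 1]           # kernel vectorized column-first

# B^T times the kernel vector: one convolution output per column of B.
result_vec = [sum(B[r][k] * kernel_vec[r] for r in range(len(kernel_vec)))
              for k in range(len(B[0]))]

# Reshape column-first into an out_h x out_w matrix.
result = [[result_vec[j * out_h + i] for j in range(out_w)]
          for i in range(out_h)]
print(result)
```

For this assumed A, B has six columns, one for each of the six 2 × 2 regions, and reshaping the six-element product in column-first order gives a 2 × 3 result matrix.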
If $D^l > 1$ (that is, $x^l$ has more than one channel, e.g., the three‐channel RGB image in Figure 3.24), the expansion operator can first expand the first channel of $x^l$, then the second, and so on, until all $D^l$ channels are expanded. The expanded channels are stacked together; that is, one row in the im2row expansion has $H \times W \times D^l$ elements, rather than $H \times W$.
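A minimal sketch of this channel-by-channel stacking, in plain Python, is given below. The function name `im2row`, the nested-list layout `x[d][i][j]` (channel, row, column), and the sample values are all assumptions made for illustration; stride 1 and no padding are assumed as well.

```python
def im2row(x, kh, kw):
    """Multi-channel im2row expansion.

    x is a list of D channels, each a 2-D list of the same size.
    Each returned row holds one kh x kw patch from every channel,
    channel after channel, each patch read in column-first order.
    """
    D = len(x)
    h_in, w_in = len(x[0]), len(x[0][0])
    out_h, out_w = h_in - kh + 1, w_in - kw + 1
    rows = []
    for j in range(out_w):
        for i in range(out_h):
            row = []
            for d in range(D):              # first channel, then the second, ...
                for dj in range(kw):
                    for di in range(kh):
                        row.append(x[d][i + di][j + dj])
            rows.append(row)
    return rows


# Two channels of size 2 x 3 (assumed values), expanded with a 2 x 2 window.
x = [
    [[1, 2, 3], [4, 5, 6]],          # channel 0
    [[10, 20, 30], [40, 50, 60]],    # channel 1
]
phi = im2row(x, 2, 2)
for row in phi:
    print(row)
```

Each row here has 2 × 2 × 2 = 8 elements: the four entries of the patch from channel 0 followed by the four entries of the same patch from channel 1.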
Suppose $x^l$ is a third‐order tensor in
For example, dividing q by HW and taking the integer part of the quotient tells us which channel ($d^l$) the element belongs to.
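The index arithmetic can be sketched as follows. The layout is an assumption consistent with the channel-by-channel, column-first stacking described earlier: with zero-based indices, the column index is taken to be q = d × H × W + j × H + i, where (i, j) is the offset inside the H × W kernel window and d is the channel. The function name `decode_column_index` is hypothetical.

```python
def decode_column_index(q, H, W):
    """Recover (i, j, d) from a zero-based column index q.

    Assumes the layout q = d * H * W + j * H + i, i.e. channels are
    stacked one after another and each patch is stored column-first.
    """
    d, rem = divmod(q, H * W)   # integer part of q / (HW) is the channel
    j, i = divmod(rem, H)       # remainder splits into column and row offsets
    return i, j, d


# With a 2 x 2 kernel, columns 0..3 belong to channel 0 and 4..7 to channel 1.
print(decode_column_index(0, 2, 2))
print(decode_column_index(5, 2, 2))
print(decode_column_index(7, 2, 2))
```

Under this assumed layout, q = 5 with H = W = 2 decodes to channel 1, column offset 0, row offset 1.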
We can use the standard vec operator to convert the set of convolution kernels f (an order‐4 tensor) into a matrix. Starting from one kernel, which can be vectorized into a vector in
(3.85)
with
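The Kronecker product: Given two matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$, the Kronecker product $A \otimes B$ is an $mp \times nq$ matrix, defined as the block matrix
$$
A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix}. \tag{3.86}
$$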
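As a small illustration of the block-matrix definition, the following plain-Python sketch builds the Kronecker product entry by entry. The function name `kron` and the sample matrices are assumptions chosen for the example; numerical libraries provide their own routines for this.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows.

    For A of size m x n and B of size p x q, entry (r, c) of the
    mp x nq result is A[r // p][c // q] * B[r % p][c % q], which is
    exactly the block layout [a_ij * B].
    """
    m, n = len(A), len(A[0])
    p, q = len(B), len(B[0])
    return [[A[r // p][c // q] * B[r % p][c % q] for c in range(n * q)]
            for r in range(m * p)]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
K = kron(A, B)
for row in K:
    print(row)
```

Here both inputs are 2 × 2, so the result is 4 × 4, and its top-left 2 × 2 block is 1 · B while its bottom-right block is 4 · B.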
The Kronecker product has the following properties that will be useful for us: