Table 1.8 Comparison of ResNet-50 and ResNeXt-50 (32 × 4d).
1.2.10 MobileNets
MobileNets V1 [10], proposed by Google, replaces standard convolutions with depthwise separable convolutions, which reduces model size and computational complexity. A depthwise separable convolution is a depthwise convolution followed by a pointwise convolution: a single filter is applied to each input channel (for an RGB input, each colour channel), and a 1 × 1 pointwise convolution then combines the outputs of the depthwise convolution. Batch normalization (BN) and ReLU are applied after each convolution. The whole architecture consists of 30 layers built from the following sequence: (1) convolutional layer with stride 2, (2) depthwise layer, (3) pointwise layer, (4) depthwise layer with stride 2, and (5) pointwise layer. The advantage of MobileNets is that it requires fewer parameters and less computation (fewer multiplications and additions) than a network built from standard convolutions. Figure 1.11 shows the architecture of MobileNets, and Table 1.11 lists its parameters.
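The saving from depthwise separable convolution can be illustrated with a small calculation. The sketch below is not taken from [10]; it assumes bias-free convolutions, and the function names `standard_conv_cost` and `depthwise_separable_cost` are hypothetical. It compares a standard 3 × 3 convolution with a depthwise separable pair of the same kernel size, using the 32-to-64-channel stage at 112 × 112 resolution listed in Table 1.11.

```python
def standard_conv_cost(k, c_in, c_out, h_out, w_out):
    """Return (parameters, multiply-adds) of a k x k standard convolution.

    Assumption: no bias terms are counted.
    """
    params = k * k * c_in * c_out
    mult_adds = params * h_out * w_out  # every weight is used once per output position
    return params, mult_adds


def depthwise_separable_cost(k, c_in, c_out, h_out, w_out):
    """Return (parameters, multiply-adds) of a depthwise + pointwise pair.

    Depthwise: one k x k filter per input channel.
    Pointwise: a 1 x 1 convolution mapping c_in channels to c_out channels.
    Assumption: no bias terms are counted.
    """
    depthwise_params = k * k * c_in
    pointwise_params = c_in * c_out
    params = depthwise_params + pointwise_params
    mult_adds = params * h_out * w_out
    return params, mult_adds


# Example stage: 3 x 3 depthwise on 32 channels, then 1 x 1 pointwise to 64 channels,
# with a 112 x 112 output feature map (stride 1).
std_params, std_madds = standard_conv_cost(3, 32, 64, 112, 112)
sep_params, sep_madds = depthwise_separable_cost(3, 32, 64, 112, 112)

# The ratio equals 1 / c_out + 1 / k**2 for any such pair.
ratio = sep_params / std_params
print(f"standard: {std_params} params, {std_madds} multiply-adds")
print(f"separable: {sep_params} params, {sep_madds} multiply-adds")
print(f"separable / standard = {ratio:.4f}")
```

For this stage the separable form needs 2,336 weights instead of 18,432, a ratio of 1/64 + 1/9 ≈ 0.127, and the multiply-add count shrinks by the same factor.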
Figure 1.9 Architecture of SE-ResNet.
1.3 Application of CNN to IVD Detection
Mader [11] proposed V-Net for the detection of IVD. Bateson [12] proposed a method that embeds domain-invariant prior knowledge and employs ENet to segment IVD. Other works that deserve special mention for the detection and segmentation of IVD from 3D spine MRI include the following: Zeng [13] used a CNN; Chang Liu [14] utilized a 2.5D multi-scale FCN; Gao [15] presented a 2D CNN and DenseNet; Jose [17] presented an HD-UNet asym model; and Claudia Iriondo [16] used a VNet-based 3D connected component analysis algorithm.
Table 1.9 Comparison of ResNet-50, ResNeXt-50, and SE-ResNeXt-50 (32 × 4d).
Figure 1.10 Architecture of DenseNet.
1.4 Comparison With State-of-the-Art Segmentation Approaches for Spine T2W Images
This work discusses the various CNN architectures that have been employed for the segmentation of spine MRI. The architectures differ in several factors, such as the number of layers, the number of filters, whether padding is used, and whether striding is present. Segmentation performance is evaluated using metrics such as the Dice Similarity Coefficient (DSC) and the Mean Absolute Surface Distance (MASD), and the experimental results are shown in Table 1.12. DSC is computed for the first three works, and the CNN developed by Zeng et al. achieves 90.64%. DenseNet produces approximately similar segmentations as measured by MASD, Mean Localisation Distance (MLD), and Mean Dice Similarity Coefficient (MDSC).
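For reference, DSC measures the overlap between a predicted mask and a ground-truth mask as twice the size of their intersection divided by the sum of their sizes. The following is a minimal sketch, assuming binary masks flattened to equal-length sequences. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the works compared here.

```python
def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient between two flattened binary masks.

    DSC = 2 * |A intersect B| / (|A| + |B|), ranging from 0 (no overlap)
    to 1 (identical masks).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")

    # Count voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    if total == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy example: six voxels, three foreground voxels in each mask, two shared.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 1, 1, 0, 1, 0]
dsc = dice_coefficient(predicted, reference)
print(f"DSC = {dsc:.4f}")
```

In the toy example the masks share two of their three foreground voxels each, so DSC = 2 · 2 / (3 + 3) ≈ 0.667. A DSC reported as a percentage, as in Table 1.12, is this value multiplied by 100.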
1.5 Conclusion
In this chapter, we discussed various CNN architectural models and their parameters. In the first phase, architectures such as LeNet, AlexNet, VGGnet, GoogleNet, ResNet, ResNeXt, SENet, DenseNet, and MobileNet were studied. In the second phase, the application of CNNs to the segmentation of IVD was presented. A comparison with state-of-the-art segmentation approaches for spine T2W images was also presented. From the experimental results, it is clear that the 2.5D multi-scale FCN outperforms all other models. As future work, existing models could be modified to obtain optimized results.
Table 1.10 Comparison of DenseNet.
Figure 1.11 Architecture of MobileNets.
Table 1.11 Various parameters of MobileNets.
Type/Stride | Filter shape | Input size |
Conv / s2 | 3 × 3 × 3 × 32 | 224 × 224 × 3 |
Conv dw / s1 | 3 × 3 × 32 dw | 112 × 112 × 32 |
Conv / s1 | 1 × 1 × 32 × 64 | 112 × 112 × 32 |
Conv dw / s2 | 3 × 3 × 64 dw | 112 × 112 × 64 |
Conv / s1 | 1 × 1 × 64 × 128 | 56 × 56 × 64 |
Conv dw / s1 | 3 × 3 × 128 dw | 56 × 56 × 128 |
Conv / s1 | 1 × 1 × 128 × 128 | 56 × 56 × 128 |
Conv dw / s2 | 3 × 3 × 128 dw | 56 × 56 × 128 |
Conv / s1 | 1 × 1 × 128 × 256 | 28 × 28 × 128 |
Conv dw / s1 | 3 × 3 × 256 dw | 28 × 28 × 256 |
Conv / s1 |
|