Computational Analysis and Deep Learning for Medical Care. Group of authors.

Information about the work:

Author:	Group of authors
Publisher:	John Wiley & Sons Limited
Genre:	Programs
isbn:	9781119785736


1       Input Layer  -        -  (227,227,3)  0           0      -           relu     -
2       CONV1        11 × 11  4  (55,55,96)   34,848      96     34,944      relu     105,415,200
3       POOL1        3 × 3    2  (27,27,96)   0           0      0           relu     -
4       CONV2        5 × 5    1  (27,27,256)  614,400     256    614,656     relu     111,974,400
5       POOL2        3 × 3    2  (13,13,256)  0           0      0           relu     -
6       CONV3        3 × 3    1  (13,13,384)  884,736     384    885,120     relu     149,520,384
7       CONV4        3 × 3    1  (13,13,384)  1,327,104   384    1,327,488   relu     112,140,288
8       CONV5        3 × 3    1  (13,13,256)  884,736     256    884,992     relu     74,760,192
9       POOL3        3 × 3    2  (6,6,256)    0           0      0           relu     -
10      FC           -        -  9,216        37,748,736  4,096  37,752,832  relu     37,748,736
11      FC           -        -  4,096        16,777,216  4,096  16,781,312  relu     16,777,216
12      FC           -        -  4,096        4,096,000   1,000  4,097,000   relu     4,096,000
OUTPUT  FC           -        -  1,000        -           -      0           softmax  -
-       -            -        -  -            -           -      62,378,344 (Total)  -  -
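The weight, bias, and parameter counts in the table above can be checked with a short calculation. The following Python sketch assumes that a convolutional layer has kernel × kernel × input channels × output channels weights and one bias per output channel, and that the spatial output size follows the usual (input + 2 × padding − kernel) / stride + 1 rule. The function names are illustrative and do not come from the book.

```python
def conv_params(kernel, in_channels, out_channels):
    """Return (weights, biases, total) for a square-kernel convolution."""
    weights = kernel * kernel * in_channels * out_channels
    biases = out_channels  # one bias per output channel (assumption)
    return weights, biases, weights + biases


def fc_params(n_in, n_out):
    """Return (weights, biases, total) for a fully connected layer."""
    weights = n_in * n_out
    return weights, n_out, weights + n_out


def out_size(in_size, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (in_size + 2 * padding - kernel) // stride + 1


# Layers in the order they appear in the table (pooling layers have no parameters).
layers = [
    conv_params(11, 3, 96),     # CONV1
    conv_params(5, 96, 256),    # CONV2
    conv_params(3, 256, 384),   # CONV3
    conv_params(3, 384, 384),   # CONV4
    conv_params(3, 384, 256),   # CONV5
    fc_params(9216, 4096),      # first FC layer, input 6 * 6 * 256 = 9,216
    fc_params(4096, 4096),      # second FC layer
    fc_params(4096, 1000),      # third FC layer
]
total = sum(params for _, _, params in layers)

print(conv_params(11, 3, 96))  # weights, biases, total for CONV1
print(out_size(227, 11, 4))    # CONV1 spatial size
print(total)                   # sum of the parameter column
```

Under these assumptions the sketch reproduces the CONV1 row (34,848 weights, 96 biases, 34,944 parameters), the 227 → 55 → 27 spatial sizes of CONV1 and POOL1, and the 62,378,344 total.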


Figure 1.3 Architecture of ZFNet.

ZFNet uses the cross-entropy loss function, the ReLU activation function, and batch stochastic gradient descent. It was trained on 1.3 million images using a GTX 580 GPU, which took 12 days. The ZFNet architecture consists of five convolutional layers interleaved with three max-pooling layers, followed by three fully connected layers and a softmax layer, as shown in Figure 1.3. Table 1.4 traces a 224 × 224 × 3 input image through each layer and lists the filter size, window size, stride, and padding values for each layer. The ImageNet top-5 error improved from 16.4% to 11.7%.
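The training ingredients named above (ReLU, softmax with cross-entropy loss, and a gradient descent update) can be illustrated with a minimal pure-Python sketch. It is a generic illustration under simple assumptions, not the ZFNet training code; the function names and the learning rate are hypothetical.

```python
import math


def relu(x):
    """ReLU activation: pass positive values, clamp negatives to zero."""
    return max(0.0, x)


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy loss for one example whose true class index is `target`."""
    return -math.log(softmax(logits)[target])


def sgd_step(params, grads, lr):
    """One plain gradient descent update: p <- p - lr * g."""
    return [p - lr * g for p, g in zip(params, grads)]


# With equal scores for 4 classes, every class gets probability 1/4,
# so the loss equals ln(4).
loss = cross_entropy([0.0, 0.0, 0.0, 0.0], target=2)
print(loss, math.log(4))
```

In batch stochastic gradient descent, the gradients passed to an update such as `sgd_step` would be averaged over a mini-batch of training images rather than computed from a single example.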

1.2.4 VGGNet

Simonyan and Zisserman [4] introduced VGGNet for the ImageNet Challenge in 2014. VGGNet-16 consists of 16 layers and accepts a 224 × 224 × 3 RGB image as input, preprocessed by subtracting the global mean from each pixel. The image is then fed through a series of 13 convolutional layers, each using a small 3 × 3 receptive field with same padding and a stride of 1. AlexNet and ZFNet place a max-pooling layer after individual convolutional layers. VGGNet instead stacks convolutional layers with 3 × 3 filters without a max-pooling layer between them; a stack of three such layers covers an effective receptive field of 7 × 7 and is more effective than a single layer with a 5 × 5 receptive field. As the spatial size decreases, the depth increases. Each max-pooling layer uses a 2 × 2 pixel window with a stride of 2. The convolutional stack is followed by three fully connected layers: the first two have 4,096 neurons each, and the third is the output layer with 1,000 neurons, since the ILSVRC classification task has 1,000 classes. The final layer is a softmax layer. Training was carried out on four Nvidia Titan Black GPUs for 2–3 weeks, with the ReLU nonlinearity as the activation function. The small filters reduce the number of parameters per layer; the network has 138 million parameters in total (522 MB). The test-set top-5 error rate during the competition was 7.1%. Figure 1.4 shows
