The main contributions of this work are as follows.
An ML-based approach is used to process several TEMVIs, namely EV, ENV, LV, SARS-CoV-2 and ZV.
The approach applies four classification techniques, LR, NN, kNN and NB, to this processing.
These techniques are compared using CA as the performance metric.
This work is carried out using Orange3-3.24.1.
The rest of the chapter is organized as follows. Section 1.2 describes related works, Section 1.3 describes the methodology for the processing of TEMVIs, Section 1.4 presents the results and discussion, and Section 1.5 concludes the chapter.
1.2 Related Works
Researchers and scientists have introduced many approaches for processing virus images and other images across a wide variety of real-world applications [1–55]. Some of these works are described as follows. Singh et al. [2] review several ML and image processing techniques for the detection and classification of paddy leaf diseases. Al-Kasassbeh et al. [5] focus on an ML-based feature selection mechanism for the classification of malware. Yang et al. [6] focus on a sequence embedding-based ML mechanism for the prediction of human-virus protein–protein interactions. Dey et al. [7] focus on ML-based techniques for sequence-based prediction of viral-host interactions between human proteins and SARS-CoV-2. Karanja et al. [9] focus on ML-based techniques and image texture features for the analysis of internet of things malware. Muda et al. [14] combine k-means clustering with an NB classification mechanism for intrusion detection. Trishan et al. [17] focus on ML-based classifiers such as NB, k-nearest neighbors and random forest to detect Hepatitis A, B, C and E viruses. Kaur [19] focuses on ML-based approaches such as kNN and NB for the detection of credit card fraud. Goyal [20] focuses on an NB model based on an enhanced kNN classification mechanism for the prediction of breast cancer. Wahid et al. [22] analyze the performance of several ML-based techniques for the classification of microscopic bacteria images. Ito et al. [27] focus on a convolutional NN mechanism for the detection of virus particles in transmission electron microscopy (TEM) images. Devan et al. [28] focus on a transfer learning mechanism to detect herpesvirus capsids in TEM images.
1.3 Methodology
In this work, the ML-based classification techniques [10, 11, 14–16] LR, NN, kNN and NB are used to classify several TEMVIs, namely EV, ENV, LV, SARS-CoV-2 and ZV. The LR technique predicts the probability of a target (dependent) variable. Generally, this target variable is dichotomous: the data are coded as 1 for yes or success and 0 for no or failure. An LR model predicts the dependent variable from its relationship with one or more independent variables. The NN technique uses a network of functions to translate a data input of one form into the required output of another form. It is organized into layers of neurons, where each layer receives inputs from the previous layers and passes its outputs to the following layers. In this way, it can transform complex data inputs into a representation that a computer can process. The kNN technique stores all the available data and classifies new data points on the basis of a similarity measure. It takes the k closest training examples in the feature space as input and generates a class membership as output. The NB technique applies Bayes' theorem under the assumption that the presence of a particular feature in a class is unrelated to the presence of any other feature, so every pair of features is treated as independent. It predicts the membership probability for each class, and the class with the highest probability is taken as the most likely class.
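Of the four techniques, the kNN rule is the simplest to write out directly. The following is a minimal sketch in plain Python, not the Orange implementation: the function name `knn_predict` and the two-dimensional feature vectors are hypothetical stand-ins for real image embeddings, and the sketch assumes Euclidean distance with one equal vote per neighbor.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=5):
    """Return the majority class among the k training points closest to query."""
    # Euclidean distance from the query to every training vector.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the k smallest distances; each neighbor casts one equal vote.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors standing in for image embeddings.
train_X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
train_y = ["EV", "EV", "EV", "ZV", "ZV", "ZV"]

prediction = knn_predict(train_X, train_y, (0.1, 0.1), k=3)
print(prediction)
```

With k = 3, the query point lies inside the first cluster, so all three nearest neighbors carry the same label and the vote is unanimous.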
In this work, the TEMVIs are first given as input to Orange3-3.24.1 [56]. Next, image embedding is carried out, taking the TEMVIs as input and generating embeddings or skipped TEMVIs as outputs. Several embedders, such as Inception v3, SqueezeNet (local), VGG-16, VGG-19, Painters, DeepLoc and Openface, can be used for image embedding; SqueezeNet (local) is used here. Then, test and score calculation is carried out on the image embedding data by applying the LR, NN, kNN and NB techniques separately to compute CA values. For LR, the regularization type and strength are Ridge (L2) and C = 1, respectively. For NN, the neurons in hidden layers, activation function, solver method, regularization and maximal number of iterations are 100, ReLU, Adam, α = 0.0001 and 100, respectively, with replicable training enabled. For kNN, the number of neighbors, metric and weight are 5, Euclidean and uniform, respectively. The test and score calculation takes data, test data, learner and preprocessor as inputs and generates evaluation results and predictions as outputs. Afterwards, a confusion matrix is generated to represent the classification results of each technique. The confusion matrix takes the evaluation results from test and score as input and generates data or selected data as outputs. Figure 1.1 describes the methodology. The steps involved in this work are as follows.
Steps for TEMVIs Classification
Step 1: Input several categories of TEMVIs such as EV, ENV, LV, SARS-CoV-2 and ZV.
Step 2: Perform image embedding on the input TEMVIs.
Step 3: Perform test and score calculation on the image embedding data, applying the LR, NN, kNN and NB techniques separately to compute CA values.
Step 4: Create a confusion matrix to represent the classification results of each technique.
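The quantities produced in Steps 3 and 4 can be illustrated with a short sketch in plain Python. It assumes that CA is the fraction of images whose predicted class matches the actual class, and that the confusion matrix has one row per actual class and one column per predicted class. The label lists are hypothetical and are not results from this work; the function names are likewise illustrative.

```python
from collections import Counter

CLASSES = ["EV", "ENV", "LV", "SARS-CoV-2", "ZV"]


def classification_accuracy(actual, predicted):
    """Fraction of samples whose predicted label equals the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def confusion_matrix(actual, predicted, classes=CLASSES):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


# Hypothetical labels for eight images (illustration only).
actual = ["EV", "EV", "ENV", "LV", "SARS-CoV-2", "ZV", "ZV", "LV"]
predicted = ["EV", "ENV", "ENV", "LV", "SARS-CoV-2", "ZV", "LV", "LV"]

ca = classification_accuracy(actual, predicted)
matrix = confusion_matrix(actual, predicted)

print(f"CA = {ca:.2f}")
for name, row in zip(CLASSES, matrix):
    print(f"{name:>10}: {row}")
```

The diagonal of the matrix counts correctly classified images, so dividing its sum by the total number of images gives the same CA value.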