To build an efficient machine learning project in healthcare, there are various steps to do such as data gathering, data wrangling, analyze data, train the model, test the model, and deployment, as shown in Figure 3.3. Sickness treatment has ordinary influence for healthcare physicians, and impeccable diagnosis at the right time is very important for a patient [2]. Compared to the previous approach, machine learning first builds the model and then presents the first reliable and accurate predictions for model construction without defining patient characteristics.
There are various machine learning algorithms for thyroid detection, some of which are as follows.
3.5.1 Decision Tree Algorithm
This algorithm used the divide-and-conquer method to construct a decision tree to solve the classification problem using decision-making trees [8]. These form a model based on decisions that relate to features in the data set and very fast to train. Examples of these types of models include random forests and conditional decision trees. The goal is to create a model that predicts the accuracy of thyroid disease using target variables, i.e., TSH by using simple decision rules derived from data features, i.e., T3 and T4.
This algorithm works on the basis of input and output variable (x, y) that is specified in a label set of pairs as follows.
The algorithm is to learn the mapping function from the input variable x to the output variable y, which is given the label set of the input output pair
In Equation (3.1), T represents the training set and n represents the number of training samples.
3.5.2 Support Vector Machines
This is machine learning algorithm that is used for text categorization, image segmentation that uses classification algorithm, and regression and detection of outlier. To implement this in healthcare, sampling is divided among training and testing [9]. This algorithm aims to isolate diseases and then work through a hyperplane. This algorithm used the training data as input and separated the graph of the data in the class as output in the hyperplane [10]. Let us consider classification task such as {ui, vi} where i = 1....n ui are data points, ui ϵ Sd and vi are labels. The data points and labels are displaced through the hyperplane with wtx + b = 0, where w represents a D-dimensional coefficient vector that is normal to the hyperplane and b represents an offset from the origin.
3.5.3 Random Forest
This machine learning algorithm is used to estimate hierarchical variables using a classification algorithm and to assess disease risk that evaluates a function that helps doctors to make medical decisions. The training time of random forest is less as compared to other algorithms. In healthcare, this algorithm is used for disease trends and disease risks that can be identified by analyzing the patient medical records.
3.5.4 Logistic Regression
Logistic regression is a supervised learning algorithm that is used to estimate target variables. The nature of the target or dependent variable is dichotomous, meaning that yes or no, there will be only two possible classes. In healthcare, logistic regression is used to predict a patient’s readmitted whether a patient is readmitted to a hospital or not. It can be divided into two classes: either the patient is not admitted or is not readmitted. Logistic regression can be used to classify whether a person will be prone to cancer due to environmental variables such as smoking habit, highway, and drinking alcohol [18].
3.5.5 Naïve Bayes
Naïve Bayes algorithm is used for prediction of disease. This algorithm trains label data sets and for this they must be trained on label data sets. This algorithm works on the basis of prior probability. The prior probability is the probability of disease that is based on its symptoms and is conducted on a data set.
This algorithm is used to predict the disease based on the maximum value between classes and that class will represent its disease or will be selected [19].
ML has contributed a considerable number of disciplines in recent years including healthcare, vision, and natural language processing. There are several machine learning approaches that are analyzed and used for the diagnosis of thyroid disease. The analysis shows that all the papers use different machine learning technologies and show different accuracy. In most research paper, it suggests that logistic regression and decision tree have obtained better accuracy than other algorithms, as shown in Figure 3.4.
Figure 3.4 Analysis of machine learning approach on thyroid.
3.6 Conclusion
The prevalence of thyroid disease in the Earth is still worrisome today, which is seen as a major threat to human life and leading to increased research. Thyroid and thyroid cancers occur mostly in women with a ratio of 3:1 compared to men. Various machine learning approaches have been implemented to predict or detect thyroid disease so that treatment for it is less complex and will increase the patient’s chances of recovery. There is a need to develop machine learning algorithms to analyze the effects of thyroid and thyroid cancer which require the minimum parameters of an individual to detect the thyroid and keeps both the time and money of the patient.
References
1. Priyanka, A prevalence of thyroid dysfunction among young females of urban and rural population in and around Bangalore. Indian J. Appl. Res., 9, 11, 37–38, November – 2019.
2. Chaubey, G., Bisen, D., Arjaria, S., Yadav, V., Thyroid Disease Prediction Using Machine Learning Approaches. Natl. Acad. Sci., 44, 3, 233–238, 2020.
3. Ma, L., Ma, C., Liu, Y., Wang, X., Thyroid Diagnosis from SPECT Images Using Convolutional Neural Network with Optimization. Comput. Intell. Neurosci., Article ID 6212759, 11 pages, https://doi.org/10.1155/2019/6212759, 2019.
4. Yadav, D.C. and Pal, S., Discovery of Hidden Pattern in Thyroid Disease by Machine Learning Algorithms. Indian J. Public Health Res. Dev., 11, 1, 61–66, 2020.
5. Reverter, J.L., Rosas-Allende, I., Puig-Jove, C., Zafon, C., Megia, A., Castells, I., Pizarro, E., Puig-Domingo, M., Luisa Granada, M., Prognostic Significance of Thyroglobulin Antibodies in Differentiated Thyroid Cancer. J. Thyroid Res., Article ID 8312628, 6 pages, https://doi.org/10.1155/2020/8312628, 2020.
6. Thyroid cancer, Patient Care & Health Information Diseases & Conditions, 2020, https://www.mayoclinic.org/diseases-conditions/thyroid-cancer/symptoms-causes/syc-20354161.
7. Beam, A.L., Big Data and Machine Learning in Healthcare, American Medical Association, 2018.
8. Jongboa, O.A., Development of an ensemble approach to chronic kidney disease diagnosis. Sci. Afr., 8, 1–8, 2020, https://doi.org/10.1016/j.sciaf.2020.e00456.
9. Shailaja, K., Machine Learning in Healthcare: