2. Veracity: Veracity concerns the legitimacy and trustworthiness of data. It is not only a matter of whether the data are accurate, but also of the capacity to process and interpret them. In healthcare, data veracity underpins correct diagnoses, appropriate treatments and prescriptions, and reliably established health outcomes.
3. Volume: Volume refers, without a doubt, to the sheer amount of data. To process massive quantities of text, audio, video, and large-format images, existing data processing platforms and techniques must be strengthened. Personal information, radiology images, personal medical records, genomic data, and biometric sensor readings, among other things, are gradually being integrated into healthcare databases. All of this information adds significantly to a database's size and complexity.
4. Velocity: Velocity refers to the rate at which data are produced every second. The information burst of social media has brought about a wide range of new and interesting data. Healthcare data that were traditionally recorded on paper, such as notes on overall health condition and growth, various X-ray images, and written reports, are now being generated at a dramatically increasing rate.
5. Value: Value is what big data ultimately embodies. In big data analytics, what matters most is whether the benefits of collecting and analyzing the data outweigh the costs of doing so. In healthcare, the creation of value for patients should determine how all other actors in the system are compensated, and the primary goal of healthcare delivery must be to maximize value for patients.
1.3 Areas of Big Data Analytics in Medicine
It is of critical importance to pay attention to the multitude of events, both physiological and pathological, that impact health. Because these events occur simultaneously and are expressed across multiple (systemic) aspects of the body, accurate clinical evaluation depends on understanding the interactions between different cardiorespiratory parameters (such as minute ventilation and blood pressure). As a result, understanding and predicting disease necessitates integrated collection of both structured and unstructured data, drawing on the enormous spectrum of clinical and non-clinical sources to create a more thorough picture of the disease. Big data analytics has recently made its entrance into the healthcare industry, and medical researchers are excited about this entirely new line of research. Researchers are studying healthcare data with respect to both the data itself and the taxonomy of useful analytics that can be performed on it.
Figure 1.3 Areas of big data analytics in medicine.
Expanding on this, three areas of big data analytics in medicine are discussed in this chapter. These three research areas do not comprehensively showcase the many ways big data analytics is applied in medicine; rather, they provide a collection of loosely defined use cases in which big data analytics is being employed, as shown in Figure 1.3.
1.3.1 Genomics
In [2] the author notes that the estimated cost of sequencing the human genome, a map of some 30,000 to 35,000 genes, has dropped significantly in the past few years. On the grand scale, and as in computational biology generally, developing genome-scale solutions applicable to public health can have implications for current and future public health policies and services. In 2013, the researchers in [3] claimed that the most important factors in making genomic recommendations workable in a clinical setting are the cost and time needed to put them in place. Examples include P4 medicine (predictive, preventive, personalized, and participatory), which aims to acquire information on 100,000 individuals over more than two decades, and research using integrated personal omics profiles, referred to as personal omics. In [4] the author suggested seeking solutions with regard to the following four aspects:
1. Developing scalable genome-scale data states
2. Use of tools
3. Clinical states
4. Data challenges in target validation and integration for a big data project.
The P4 project is making strides by acquiring tools to help with handling massive datasets; building on this, it has developed continuous monitoring tools that aid in understanding a subject's condition and in obtaining new information, and it is moving forward in its search for medication delivery and analytical tools. Everything that is known about a person's physiology and physiological states is summarized into a personal omics profile, which is used to identify and detail the subject's medical state [5]. Although arriving at an actionable course of action at the level of care may be one of the most difficult aspects, many improvements at the clinical level can be pursued, however arduous they may be. According to [6], a great deal of high-resolution data is required for exploration, discovery, and the implementation of novel approaches. These aspects of big data necessitate the use of novel data analytics.
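To give a rough sense of that scale, the short Python sketch below estimates raw sequence volume for a P4-sized cohort; the ~100 GB per 30x whole genome is a commonly cited ballpark assumed here for illustration, not a figure from this chapter.

```python
# Ballpark raw-data volume for a P4-scale cohort.
# The ~100 GB per 30x whole genome is an assumed, commonly cited
# approximation, not a figure taken from this chapter.

gb_per_genome = 100      # raw sequence data for one 30x whole genome (approx.)
cohort_size = 100_000    # individuals targeted by the P4 effort

total_pb = gb_per_genome * cohort_size / 1e6  # GB -> PB
print(f"~{total_pb:.0f} PB of raw sequence data for the full cohort")
```

Storage on the order of petabytes for a single cohort is one reason the project's tooling emphasis falls on scalable data handling.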
1.3.2 Signal Processing
Like medical images, medical signals pose volume and velocity challenges, most notably during continuous, high-resolution acquisition and storage from the plethora of monitors connected to each patient. Physiological signals also pose a dimensionality challenge, in that they extend in both time and space. To derive the most usable and appropriate responses from physiological data, one must be aware of the circumstances affecting the measurements, and continuous, rigorous monitoring of the relevant variables must be established to ensure effective use and robustness.
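To make the volume side of this concrete, the following minimal Python sketch estimates how much raw waveform data a single monitored patient generates per day; the sampling rates and sample sizes are typical, assumed values rather than figures from this chapter.

```python
# Rough estimate of raw waveform volume for one continuously monitored
# patient. Sampling rates and bytes per sample are illustrative
# assumptions, not figures from this chapter.

SECONDS_PER_DAY = 24 * 60 * 60

# (signal, sampling rate in Hz, bytes per sample)
signals = [
    ("ECG", 500, 2),            # high-resolution electrocardiogram
    ("ABP", 125, 2),            # arterial blood pressure waveform
    ("SpO2 pleth", 125, 2),     # pulse oximetry plethysmogram
    ("Respiration", 62.5, 2),   # impedance respiration waveform
]

total = 0
for name, hz, nbytes in signals:
    daily = hz * nbytes * SECONDS_PER_DAY
    total += daily
    print(f"{name:>12}: {daily / 1e6:7.1f} MB/day")

print(f"{'Total':>12}: {total / 1e6:7.1f} MB/day per patient")
```

Even these modest per-signal rates, multiplied across a ward of patients and months of retention, push storage requirements into the terabyte range.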
Currently, healthcare systems rely on a patchwork of disparate continuous monitoring devices that use single physiological waveforms or discretized vital signs to generate alerts in the event of overt events [7]. However, such simplistic approaches to developing and implementing alarm systems are inherently unreliable, and the sheer volume of alarms may result in "alarm fatigue" for caregivers and patients alike [8, 9]. In this context, the capacity for new medical knowledge discovery is constrained by prior work that has frequently fallen short of fully exploiting high-dimensional time series data. In [10] the authors suggested that these alarm mechanisms frequently fail because they rely on isolated sources of information and lack context regarding the patient's true physiological condition from a broader, more comprehensive perspective. As a result, improved and more comprehensive approaches to studying interactions and correlations among multimodal clinical time series data are required. This is critical because research consistently demonstrates that humans are unable to reason about changes affecting more than two signals [11, 12].
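As a loose illustration of the difference between such isolated alarms and a context-aware rule, the sketch below compares a single-threshold alert with one that requires two synthetic vital signs to deviate together over a sustained window; the signal names, thresholds, and window length are hypothetical, not validated clinical criteria.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic vitals sampled once per second (illustrative values only).
heart_rate = 75 + 5 * rng.standard_normal(600)    # beats/min
systolic_bp = 120 + 8 * rng.standard_normal(600)  # mmHg

# Naive single-signal alarm: fires on any isolated excursion,
# including transient noise, which drives alarm fatigue.
naive_alarms = heart_rate > 90

# Simple multimodal rule (a sketch, not a clinical criterion):
# require sustained tachycardia AND concurrent hypotension across
# most of a 30-second window before alarming.
window = 30
hr_high = np.convolve(heart_rate > 90, np.ones(window), "same") >= window * 0.8
bp_low = np.convolve(systolic_bp < 100, np.ones(window), "same") >= window * 0.8
contextual_alarms = hr_high & bp_low

print("naive alarm seconds:     ", int(naive_alarms.sum()))
print("contextual alarm seconds:", int(contextual_alarms.sum()))
```

Because the multimodal rule demands agreement between signals over time, transient noise in either one no longer triggers an alert, which is the kind of context [10] argues is missing from isolated alarm mechanisms.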
1.3.3 Image Processing
Medical images are a valuable source of data, frequently used for diagnosis, treatment evaluation, and planning [13]. Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Photoacoustic Imaging (PI), Molecular Imaging (MI), Positron Emission Tomography (PET), and Sonography are all established clinical imaging techniques. However, medical image data ranges from a few megabytes for a single study (e.g., histology images) to hundreds of megabytes per study (e.g., a thin-slice CT study comprising 2,500+ scans [14]). Such data requires a large storage area if it is to be held for extended periods of time, and because decision support must often be completed on the fly, the algorithms involved must be quick and precise to deliver practical benefit.
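As a back-of-the-envelope check on those figures, the Python sketch below estimates raw storage for such a thin-slice CT study; the 512 x 512 matrix and 2-byte pixels are common CT conventions assumed for illustration.

```python
# Back-of-the-envelope storage estimate for a thin-slice CT study.
# The 512 x 512 matrix and 2 bytes per pixel are typical CT conventions,
# assumed here for illustration rather than quoted from this chapter.

rows, cols = 512, 512     # typical CT slice matrix
bytes_per_pixel = 2       # 12-16 bit pixel values stored in 2 bytes
num_slices = 2500         # per the 2,500+ scans figure above

slice_mb = rows * cols * bytes_per_pixel / 1e6
study_gb = slice_mb * num_slices / 1e3

print(f"per slice: {slice_mb:.2f} MB")
print(f"per study: {study_gb:.2f} GB uncompressed")
```

Uncompressed, such a study exceeds a gigabyte; lossless compression, which for CT typically achieves roughly 2-3x, brings archived sizes down toward the hundreds-of-megabytes range cited above. Even though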