The velocity challenge concerns not only the rate at which data arrives from multiple sources but also the rate at which it has to be processed and analyzed when real‐time analysis is required. For example, if fraudulent activity is suspected in a credit card transaction, the transaction has to be declined in real time.
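To make the real‐time requirement concrete, the following is a minimal Python sketch of a decision taken as each transaction arrives, rather than in a later batch job. The rule used here (too many transactions on one card within a short sliding window) and the names Transaction, VelocityCheck, max_per_window, and window_seconds are illustrative assumptions; production fraud systems combine many more signals.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Transaction:
    card_id: str
    amount: float
    timestamp: float  # seconds since some reference point


class VelocityCheck:
    """Hypothetical rule: decline when a card exceeds a transaction count per window."""

    def __init__(self, max_per_window=3, window_seconds=60.0):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._recent = {}  # card_id -> deque of recent timestamps

    def decide(self, tx):
        history = self._recent.setdefault(tx.card_id, deque())
        # Drop timestamps that have fallen out of the sliding window.
        while history and tx.timestamp - history[0] > self.window_seconds:
            history.popleft()
        history.append(tx.timestamp)
        # The decision is returned immediately, before the next transaction arrives.
        return "DECLINE" if len(history) > self.max_per_window else "APPROVE"
```

Each call to decide touches only the short per-card history, so the cost of a decision does not grow with the total volume of data already received.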
1.9.4 Data Storage
The volume of data contributed by social media, mobile Internet, online retailers, and so forth, is massive and beyond the handling capacity of traditional databases. This calls for a highly scalable storage mechanism that can meet the increasing demand. The storage mechanism should be capable of accommodating growing data that is complex in nature. When the data volume is known in advance, the required storage capacity can be predetermined. With streaming data, however, the required capacity is not known in advance. Hence, a storage mechanism capable of accommodating streaming data is required. Data storage should also be reliable and fault tolerant.
Stored data has to be retrievable at a later point in time. This data may be the purchase history of a customer, previous issues of a magazine, employee details of a company, Twitter feeds, images captured by a satellite, patient records in a hospital, financial transactions of a bank customer, and so forth. When a business analyst has to evaluate the improvement in a company's sales, she has to compare the sales of the current year with those of the previous year. Hence, data has to be stored and retrieved to perform the analysis.
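The sales comparison described above depends on the previous year's figures still being retrievable. A minimal Python sketch, assuming the stored history is available as a mapping from year to total sales (the function name and the figures in the usage note are hypothetical):

```python
def year_over_year_growth(sales_by_year, current_year):
    """Return the percentage change in sales from the previous year to current_year."""
    previous = sales_by_year[current_year - 1]  # requires last year's data to be stored
    current = sales_by_year[current_year]
    return (current - previous) / previous * 100.0
```

For example, with stored totals of 200.0 for 2022 and 250.0 for 2023, year_over_year_growth({2022: 200.0, 2023: 250.0}, 2023) gives 25.0. If the 2022 record had not been retained, the lookup would fail and the comparison could not be made.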
1.9.5 Data Privacy
Privacy is yet another concern that grows with data volume. Inappropriate access to personal data, electronic health records (EHRs), and financial transactions is a social problem that greatly affects the privacy of users. Data has to be shared in a way that limits the extent of disclosure while ensuring that what is shared is sufficient to extract business knowledge. Who is granted access to the data, how much of it they can access, and when they can access it should be predetermined to ensure that the data is protected. Hence, there should be deliberate access control at the various stages of the big data life cycle, namely data collection, storage, and management and analysis. Research on big data cannot be performed without actual data, and consequently the issue of data openness and sharing is crucial. Data sharing is tightly coupled with data privacy and security. Big data service providers hand over huge volumes of data to professionals for analysis, which may affect data privacy. Financial transactions contain details of business processes and credit card details. Such sensitive information should be well protected before the data is delivered for analysis.
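One way to limit disclosure before data is handed over for analysis is to mask or pseudonymize the sensitive fields while leaving the rest of the record usable. The Python sketch below is illustrative only: the field names card_number and customer_id, the helper names, and the choice of a salted SHA-256 digest are assumptions, and a salted hash is pseudonymization rather than full anonymization.

```python
import hashlib


def mask_card_number(card_number):
    """Keep only the last four digits; replace the other digits with '*'."""
    digits = [c for c in card_number if c.isdigit()]
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


def pseudonymize(value, salt):
    """Replace an identifier with a salted digest so records can still be joined."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def prepare_for_analysis(record, salt):
    """Return a copy of the record that is safer to share; the original is unchanged."""
    shared = dict(record)
    shared["card_number"] = mask_card_number(record["card_number"])
    shared["customer_id"] = pseudonymize(record["customer_id"], salt)
    return shared
```

Because the same customer_id and salt always produce the same pseudonym, analysts can still group transactions by customer without seeing the real identifier, provided the salt itself is kept secret.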
1.10 Big Data Applications
Banking and Securities – Credit/debit card fraud detection, warning for securities fraud, credit risk reporting, customer data analytics.
Healthcare sector – Storing the patient data and analyzing the data to detect various medical ailments at an early stage.
Marketing – Analyzing customer purchase history to reach the right customers in order to market newly launched products.
Web analysis – Social media data, data from search engines, and so forth, are analyzed to deliver advertisements based on users’ interests.
Call center analytics – Big data technology is used to identify the recurring problems and staff behavior patterns by capturing and processing the call content.
Agriculture – Sensors are used by biotechnology firms to optimize crop efficiency. Big data technology is used in analyzing the sensor data.
Smartphones – The facial recognition feature of smartphones is used to unlock the phone and to retrieve information about a person from data previously stored on the device.
1.11 Big Data Use Cases
1.11.1 Health Care
To cope with the massive flood of information generated at high velocity, medical institutions are looking for a breakthrough that will help them enhance their health care services and create a successful business model. Health care executives believe that adopting innovative business technologies will reduce the cost patients incur for health care and help institutions provide higher‐quality medical services. But the challenges of integrating patient data that is large, complex, and rapidly growing hamper their efforts to improve clinical performance and convert these assets into business value.
Hadoop, a big data framework, plays a major role in health care by making big data storage and processing less expensive and highly available, giving doctors more insight. With the advent of big data technologies, doctors can monitor the health of patients who live far from the hospital by having them wear watch‐like devices. The devices send reports on the patients’ health, and when an issue arises or a patient’s health deteriorates, the device automatically alerts the doctor.
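The alerting behavior of such a wearable can be sketched as a simple threshold check on each reading. The Python below is a hedged illustration: the VitalReading fields, the threshold values, and the alert wording are assumptions chosen for the example and are not clinical guidance.

```python
from dataclasses import dataclass


@dataclass
class VitalReading:
    patient_id: str
    heart_rate: int  # beats per minute
    spo2: int        # blood oxygen saturation, percent


# Hypothetical limits for illustration only.
HEART_RATE_RANGE = (50, 120)
MIN_SPO2 = 92


def check_reading(reading):
    """Return a list of alert messages; an empty list means nothing to report."""
    alerts = []
    low, high = HEART_RATE_RANGE
    if not low <= reading.heart_rate <= high:
        alerts.append(f"{reading.patient_id}: heart rate {reading.heart_rate} bpm out of range")
    if reading.spo2 < MIN_SPO2:
        alerts.append(f"{reading.patient_id}: SpO2 {reading.spo2}% below minimum")
    return alerts
```

In a deployed system the returned alerts would be forwarded to the doctor, and the limits would typically be set per patient rather than fixed globally.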
With the development of health care information technology, patient data can be electronically captured, stored, and moved across the world, and health care can be provided with greater efficiency in diagnosing and treating patients and with tremendously improved quality of service. Health care today is increasingly evidence based, which means analyzing a patient’s health care records from heterogeneous sources such as EHRs, clinical text, biomedical signals, sensing data, biomedical images, and genomic data, and inferring the patient’s health from the analysis. The biggest challenge in health care is to store, access, organize, validate, and analyze this massive and complex data; the challenge is even greater because the data is generated at an ever‐increasing speed. The need for real‐time, computationally intensive analysis of patient data generated in the ICU is also increasing. Big data technologies have evolved as a solution to these critical issues, providing real‐time solutions and enabling advanced health care facilities. The major benefits of big data in health care are preventing disease, identifying modifiable risk factors, and preventing ailments from becoming serious. Its major applications are medical decision support, administrator decision support, personal health management, and public epidemic alerts.
Big data gathered from heterogeneous sources is analyzed to find patterns that can point to ways of curing an ailment and preventing its occurrence in the future.
1.11.2 Telecom
Big data promotes growth and increases profitability across telecom by optimizing the quality of service. It is used to analyze network traffic, to analyze call data in real time to detect fraudulent behavior, and to allow call center representatives to modify a subscriber’s plan immediately on request. Insight gained by analyzing customer behavior and usage is used to develop new plans and services that increase profitability, that is, to provide personalized service based on consumer interest.
Telecom operators can analyze customer preferences and behavior so that a recommendation engine can match plans to customers’ price preferences and offer better add‐ons. Operators can lower the cost of retaining existing customers and identify cross‐selling opportunities to improve or maintain the average revenue per customer and reduce churn. Big data analytics can further be used to improve customer care services. Automated procedures, based on an understanding of customers’ repeated calls about specific issues, can be put in place to provide faster resolution. Delivering better customer service than competitors can be a key strategy for attracting customers to the operator’s brand. Big data technology optimizes business strategy by setting new business models and higher business targets. Analyzing the sales history of existing products and services allows operators to predict the outcome or revenue of new services or products to be launched.
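As a rough illustration of matching plans to price preferences, the Python sketch below picks the cheapest plan that covers a customer's observed usage without exceeding a stated budget. The Plan fields, the function name, and the selection rule are assumptions made for this example; a real recommendation engine would draw on far richer behavioral data.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    monthly_price: float
    data_gb: float


def recommend_plan(plans, avg_usage_gb, max_price):
    """Return the cheapest plan covering the usage within budget, or None."""
    candidates = [
        p for p in plans
        if p.data_gb >= avg_usage_gb and p.monthly_price <= max_price
    ]
    if not candidates:
        return None  # no plan fits; an operator might offer an add-on instead
    return min(candidates, key=lambda p: p.monthly_price)
```

Given hypothetical plans Basic (10 per month, 5 GB), Plus (20 per month, 20 GB), and Max (40 per month, 100 GB), a customer averaging 12 GB with a budget of 25 would be matched to Plus.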
Network performance, the operator’s major concern, can be improved with big data analytics by identifying the underlying issue and performing real‐time troubleshooting to fix the issue. Marketing