
The power of big data mining to improve the health care system in the United Arab Emirates



Collecting and analyzing data has become crucial for many sectors, including the health care sector, where a large amount of data is generated daily. Over time, the amount and complexity of this data increase substantially. Consequently, it is considered big data that cannot be stored or analyzed conveniently unless advanced technologies are incorporated. Recent advances in technology have opened new opportunities to use big data analysis to track a patient’s record and health, but they have also posed new challenges in maintaining data privacy and security in the healthcare sector.


This systematic review aims to give new researchers insights into big data use in health care systems and its issues or to advise academics interested in investigating the prospects and tackling the challenges of big data implementation in rising nations like the UAE. This study uses a systematic methodology to examine big data's role and efficacy in UAE health care.


The research follows the methodology of PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) for reporting the reviews and evaluating the randomized trials. Furthermore, the Critical Appraisal Checklist for PRISMA 2009 was applied for the research.


The study concludes that the healthcare systems in the United Arab Emirates can be improved through big data; however, the country's authorities must recognize that developing efficient frameworks for performance and quality assessment of the new health care system is essential. This goal can be achieved by integrating big data and health informatics with the help of IT specialists, health care managers, and stakeholders. Data privacy, data storage, data structure, data ownership, and governance were the most often expressed concerns.

Contribution to knowledge

By discussing numerous issues and presenting solutions linked with big data, the current study contributes substantially to the knowledge of big data and its integration into health care systems in the UAE.


Technology has become a vital part of daily life, and the adoption of new technologies has drastically altered how we live. It has made our lives more comfortable and efficient, regardless of age, level of knowledge, or reason for using it. We are experiencing technological wonders in the shape of smartphones, the Internet of Things (IoT), robotic surgery, and applications that use Artificial Intelligence (AI). IoT is transforming healthcare systems and administration by improving patient care, treatment results, costs, provider workflows, performance, and patient experience. Healthcare IoT also has problems: IoT-enabled devices gather a lot of data, including sensitive information, which raises security issues [41]. Rich healthcare data frequently includes survival information, and analyzing it is crucial since it can save lives and improve the quality of life [41].

We are shifting from regular technology use to more complicated technologies that are connected via a robust internet and frequently generate vast amounts of data. With the growth of data from numerous mobile networks, cloud computing systems, health applications, and electronic medical records, there is an increased need for a comprehensive approach to maintaining and updating information. Managing the expanding data and knowledge about patients and the relevant health care activities is becoming more challenging due to the data's speed, amount, and complexity. Kumar and Manjula reported that health care facilities generate an abundant amount of data each day that is centered around patients, medicines, treatments, diseases, research, and other similar factors [29]. To manage this data more efficiently, modern health care units choose to digitize patient-related data. Worldwide, medical institutes have shifted from the traditional paper-based medical file to the electronic medical record, which helps in managing patient information, lab tests, medications, and medical imaging.

Electronic medical records (EMR) are considered to be an essential, rich platform containing patient information. EMR captures all demographic data, lab results, radiology images, and free-text notations. That collective information is beneficial as a database for many longitudinal studies. Mining data from EMR can help understand disease signs and symptoms and the progression of a particular disease. It also improves clinical knowledge and understanding of a specific phenomenon and assists in clinical trials, disease management, and therapeutic trials [15, 19].

Further, it assists in predicting disease progression, comorbidities, and mortalities [32]. Data comes from various sources, including electronic medical files, home sensors, and wearable devices. Together, these sources generate a massive amount of data known as big data. Big data refers to massive data, although the term has no universally accepted definition. The oldest definition is provided by Laney, who observed that (big) data was growing in three different dimensions, namely: volume, velocity, and variety (known as the three V’s) [24]. This definition has been expanded by Demchenko et al., who define big data by five V’s: volume, velocity, variety, veracity, and value [18]. Volume refers to the amount of massively generated data that requires a unique storage format. Velocity means the high speed at which data is generated from different resources. Variety implies the complexity of data, which ranges from numerical data to text notation or a (radiological) image. Finally, veracity refers to the accuracy of the data, and value evaluates the quality of data [35].

Wang expands on big data and defines it as a data set that cannot be analyzed by a standard computerized method [44]. Big data is segregated into structured, unstructured, and semi-structured forms. Structured data can be stored, accessed, and processed in a specific format. It is an already-segregated and dedicated form of easily retrievable and readable data. Unstructured data, in contrast, has no such explicit structure. As stated by Wu and Lin, this type of data poses multiple challenges in terms of processing as well as retrieving valuable information from it [45]. The limitations of processing resources include data-related, language, relationship identification, and technical issues [2, 3]. The durability and efficiency of data variety and data skew were evaluated using a broad range of simulated and real-world healthcare datasets [3,4,5, 22]. A typical example of this data is data that comes from heterogeneous sources, meaning a combination of text files, images, and videos. Data heterogeneity is due to the mixing of structured and unstructured data, having its roots in various quantitative or qualitative platforms. The quantitative data sources include laboratory tests, images, sensor data, and gene arrays. The qualitative data sources include demographics and textual information [34]. One of the critical challenges in this regard is related to the accuracy and trustworthiness of the data, since the credibility of the data may be challenged when it comes from unmanaged sources. Approximate query processing can help in two ways. First, it can be employed as a preprocessing step for medical record linkage across multiple databases. Second, aggregate queries may be answered with approximations.

Preprocessing may be used in multi-database systems to locate patient records, which is the initial stage in linking them. When gathering aggregate data, imprecise answers may be adequate; at the least, they may prompt further inquiry. Such estimation is critical for multi-database query design and optimization [3, 4, 48]. Health care systems need to analyze unstructured and semi-structured data to get the full benefits of big data technology. The extraction and retrieval of big data may be subject to challenges related to social and legal technicalities. These social and legal issues might arise from problems associated with data ownership, privacy, identification, and governance [27].
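The idea that an approximate answer to an aggregate query may be adequate can be illustrated with a minimal sketch. It assumes uniform random sampling as the approximation technique, which the cited works do not necessarily use; the function name `approximate_mean` and the length-of-stay values are hypothetical and synthetic, not taken from any study.

```python
import random
import statistics


def approximate_mean(values, sample_size, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, standard_error). Sampling is an assumed
    approximation technique chosen for illustration only.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    estimate = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / sample_size ** 0.5
    return estimate, std_error


# Synthetic, illustrative data: lengths of stay in days for 100,000 visits.
data_rng = random.Random(42)
lengths_of_stay = [data_rng.randint(1, 14) for _ in range(100_000)]

exact = statistics.fmean(lengths_of_stay)
estimate, std_error = approximate_mean(lengths_of_stay, sample_size=1_000)

# Report the estimate with a rough 95% interval next to the exact answer.
print(f"exact={exact:.2f} approx={estimate:.2f} +/- {1.96 * std_error:.2f}")
```

The sample touches 1% of the records, and the standard error indicates how far the estimate is likely to be from the exact value, which is the kind of trade-off a query planner would weigh.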

Big data and health care system within the United Arab Emirates

The United Arab Emirates (UAE) health care system is operated by government-funded health services and a rapidly growing private health sector. The standards of the health services provided by both sectors are acceptable. The healthcare industry of the UAE is realizing the potential of big data analysis, which can transform the health care system (refer to Fig. 1). According to Bani-issa et al., one driver of such developments is the inactive lifestyle among residents, which has led to an increase in chronic diseases such as diabetes [11]. Several regions in the Middle East, including the UAE, have implemented or are considering implementing health care insurance, which in turn requires analyzing the large volume of health data generated from claims. The UAE introduced a standardized insurance coding system to deal with the situation and improve process efficiency. Insurers in the UAE are pricing premiums based on little historical data due to the lack of big data analysis tools and the sophisticated nature of big data. The availability of big data will enable insurers to paint a clear picture of health care in the region and to predict the validity of claims accurately [8, 33].

Fig. 1 Big data sources in health care (Source: Authors)

The UAE's vision is to provide world-class healthcare by 2021, and the government’s direction is to foster innovation in the healthcare system to achieve its vision. Many strategies have been explored to ensure that people are provided with a high-quality care system and to implement SDGs, particularly Goal 3 (ensuring healthy lives and promoting well-being for all ages) [39].

With advanced technology in the UAE, smart government, and public services, big data helps provide a large database within the country, especially in the healthcare sector, which can assist in better understanding the population's health and providing the necessary services. To meet the government's objective of providing world-class healthcare and ensuring sustainability in delivering health and well-being to everyone in the UAE, big data mining and data analysis may help improve services and health programs to create a healthier, happier population. However, despite the availability of big data in the UAE and the government's interest in innovation and in exploiting big data, there are constraints and a shortage of published research on big data mining in the UAE, particularly in the health care system. There is no standardized government approach or policy regarding big data mining or storage, even though a large amount of data is generated daily by different entities within the UAE. The UAE open data policy was launched in 2018, as per the UN eGovernment Survey, to help access data without restriction [43]. However, by 2020, not all data was accessible, and there remain restrictions on available data from the entities [38, 40].

Much of the reviewed literature agrees on the beneficial role of big data mining in enhancing the health care system. Yet, there is still no unified process or solution for big data mining or for how to make it possible. This research will help in understanding the importance of harnessing big data and utilizing it to enhance the health care system, and in identifying its challenges and limitations.

Big data and sustainable development goals

The UN developed a 2030 strategy to fight poverty, ensure equity among people and address global challenges through 17 sustainable goals [42]. According to Wu, policymakers, decision-makers, and investors need factual, accurate, and real-time data to adopt the appropriate policy decisions to accomplish the Sustainable Development Goals (SDGs). They then need to be able to check the impact of the policy, which can be achieved through the analysis of big data from different sources. Similarly, a report released by United Nations states that the big data revolution can contribute to Sustainable Development Goals (SDG) by providing accurate and reliable data and analyzing the data to develop policy and plans to achieve SDG 2030 [43]. However, the main concerns were the inadequacy in technology adoption among all countries and data privacy and transparency regardless of the industry [8, 10]. For instance, medical records are vital to medical care and include sensitive personal data; therefore, keeping electronic medical records private is a crucial difficulty [8, 33].

Furthermore, blockchain can hold accessible, immutable, tamper-proof medical data [16]. Therefore, doctors and nurses would use a Big data analysis can provide the ability to monitor the progress toward achieving SDGs by 2030. Big data analysis can be more cost-effective and faster in tracking SDGs than, for example, monitoring poverty by traditional methods such as questionnaires or interviews, which can be ineffective and time-consuming and require significant effort [13, 36].

The focus on health within the SDGs is stated in Goal 3, to “ensure healthy lives and promote well-being for all at all ages” [42]. Big data can help in providing precise and clear information about health. Barrett et al. (2013) make this point by adopting big data analysis to understand population behavior and social and environmental factors [12]. This will help in managing population health, preventing disease, and targeting subpopulations by having accurate and real-time data. Accurate or approximate processing in health care systems has been associated with hospital statistics deemed critical for assessing performance and ensuring safe and dependable healthcare delivery [6,7,8]. Data quality is described by correctness, validity, reliability, completeness, legibility, timeliness, accessibility, usefulness, and secrecy [5, 20]. All data are susceptible to missing values, bias, measurement inaccuracy, and human input and processing errors. These difficulties are technical, behavioral, and organizational [20]. Therefore, big data analysis can help achieve SDGs by promoting well-being and chronic disease prevention.

Research gap

As discussed in the literature, big data is emerging as a great source of improvement in different sectors of the world, especially for countries adopting advanced health care systems. Most developed countries have recognized the importance of big data and have shown interest in improving the health care system through the collection and analysis of big data [14, 32]. The UAE is an example of a nation whose healthcare systems are up to date and equipped with modern health facilities. However, limited research has considered whether, despite existing challenges of data security, data classification, data modeling, data storage, data accommodation, and technology incorporation, the integration of big data and health care can emerge as a sustainable system. Implementing big data analysis will be difficult in countries similar to the UAE, with a high population and complex health care systems. The motivation for this systematic review is to give new researchers insights into the field of big data usage in health care systems, together with its associated challenges, and to serve as a guide for researchers interested in exploring the opportunities and solving the challenges of big data implementation in emerging countries such as the UAE.

This research study aims to gain further insight by using a systematic approach to review the role, effectiveness, and evaluation of big data in the area of health care within the UAE. The research questions addressed through the systematic review are:

  • RQ1: What is the role of big data in the health care system?

  • RQ2: What are the potential opportunities to enhance quality-of-care services through integrating big data in the health care system?

  • RQ3: How to best understand the challenges of implementing and using big data technologies?

With RQ1, we can investigate the role of big data usage in improving health care systems and their industry. Moreover, RQ2 allows us to find the existing solutions to enhance the quality of care services in health care systems using big data. RQ3, on the other hand, gives an insight into how various challenges can be mitigated to enhance the implementation of big data technologies in health care systems.


The current research follows the methodology of PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) for reporting the reviews and evaluating the role and usefulness of big data in health care systems for emerging countries such as the UAE. Following Tricco et al. [32], the critical appraisal PRISMA checklist 2009 was used to assess the quality of the selected research articles [26, 28, 30]. Figure 2 shows the flow of information through the different phases of a systematic review for this study. However, limited access to a wide range of journals, owing to publication or access-control restrictions, constrained the overview of the literature available during the study period. Furthermore, restricting the review to papers published in English excludes valuable articles published in other languages.

Fig. 2 Flow of information through the different phases of a systematic review [26]

Inclusion criteria

The article searching methodology discussed above resulted in thousands of research articles. Instead of considering all of them, the most relevant research articles were segregated from those less relevant. The titles, abstracts, and keywords of the research articles were screened, and those articles which discussed the relation of big data to health care systems were separated for full-text review. The screening process was made more efficient by removing duplicate research articles at the eligibility phase of PRISMA.
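The duplicate removal and title, abstract, and keyword screening described above can be sketched as follows. This is only an illustration under stated assumptions: the record fields, the search terms, the helper names (`normalize_title`, `remove_duplicates`, `screen`), and the three sample records are all hypothetical, and the sketch removes duplicates before screening for simplicity, which may differ from the order used in the study.

```python
import re


def normalize_title(title):
    """Lower-case a title and collapse punctuation and whitespace so that
    trivially different copies of the same article compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def remove_duplicates(records):
    """Keep the first record seen for each normalized title."""
    seen = set()
    unique = []
    for record in records:
        key = normalize_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def screen(records, term_groups):
    """Keep records whose title, abstract, or keywords mention at least
    one term from every group in `term_groups`."""
    kept = []
    for record in records:
        text = " ".join(
            [record["title"], record.get("abstract", "")]
            + record.get("keywords", [])
        ).lower()
        if all(any(term in text for term in group) for group in term_groups):
            kept.append(record)
    return kept


# Hypothetical records standing in for database search results.
records = [
    {"title": "Big Data in Health Care: A Review", "keywords": ["EMR"]},
    {"title": "Big data in health care - a review", "keywords": ["EMR"]},
    {"title": "Big data for retail logistics", "abstract": "Supply chains."},
]
term_groups = [("big data",), ("health care", "healthcare")]

deduplicated = remove_duplicates(records)
screened = screen(deduplicated, term_groups)

# Counts per stage, analogous to the 'n' values of a PRISMA flow diagram.
flow = {
    "identified": len(records),
    "after_duplicates_removed": len(deduplicated),
    "eligible_for_full_text": len(screened),
}
print(flow)
```

Keeping the per-stage counts in one dictionary makes it straightforward to report them alongside the flow diagram.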

A systematic review of the studies was conducted, and the articles were judged according to their objectives, methodology and study design, use of authentic data sources, validity and reliability of the study method, analysis, and consideration of ethical issues. Further, comprehensiveness of the description of the findings and outcomes, the appropriateness of the tools used for data mining, the suitability of the qualitative methodology, the use of valid research designs, and clearly stated research findings were considered in the review. Linkage to the current study questions and objective was taken into consideration. Only one study was conducted on big data within the context of the UAE. The study used quantitative data to better understand the topic, with the same context of the UAE as the current study.

Exclusion criteria

Research articles were excluded based on the following criteria: quantitative studies; surveys and focus groups from research other than health care; feasibility studies; work environments other than health care; the data collection techniques adopted; editorials and short reports; and articles that were not published in reputable outlets such as international journals. Assessing the research articles against these criteria further narrowed the number of articles included in the current research and ensured that the remaining articles are the most relevant and high-quality manuscripts, which makes the findings and outcomes of the study authentic and reliable. Research that focused on computerized or digital tools for analyzing big data, or on technical details about transferring, processing, storing, cleaning, and analyzing big data, was also excluded, as these topics are not among the research objectives and technical details would not add value to the current study. Studies of Artificial Intelligence (AI) algorithms and of the role of big data in AI were excluded as well.

Quality assessment and processing steps

To evaluate each article in terms of quality, the Critical Appraisal PRISMA Checklist 2009 (refer to Table 1) was applied in the current research for qualitative studies. In addition, the quality of papers was judged according to their objectives, abstracts, research methods, and use of authentic data sources. The validity and reliability of the research design approaches adopted were evaluated, and ethical issues were considered. A thorough evaluation was also made of the comprehensiveness of the description of findings and outcomes, the appropriateness of the qualitative methodology, the use of valid research designs, and whether the research findings were clearly stated. The flow diagram of the PRISMA methodology adopted for the research is illustrated in Fig. 3, where the parameter ‘n’ represents the number of articles identified, and the PRISMA checklist write-up is in Table 2. Extended results are attached in Table 3.

Table 1 PRISMA 2009 Checklist
Fig. 3 PRISMA Results of Flow of information through the different phases of a systematic review

Table 2 PRISMA 2009 Checklist- Big Data Write-up
Table 3 PRISMA 2019 checklist – Results

Results and discussion

Big data in health care

The selection process is visualized in Fig. 3 as a PRISMA flow diagram, where the parameter ‘n’ represents the number of papers obtained from each stage of the review process, i.e., search results, duplicate removal, title and abstract screening, full-text screening, and final selected papers. The PRISMA checklist write-up is in Table 2, and the extended results of the articles are attached in Table 3.

RQ1: What is big data's role in the health care system?

Big data is now considered the gold standard of the new technological era, especially for institutions that encounter great quantities of data daily, such as those in the health care sector. Considering the importance of big data in healthcare, the content is analyzed to showcase how data is utilized in the healthcare sector of the UAE and the challenges faced by the healthcare industry in this regard.

Big data can improve the efficiency and effectiveness of health strategy and policy. As Gamache mentioned, big data analysis can accurately reflect the population and thereby shift policymaking from a patient-visit basis to more advanced (value-based) policies [21]. It was also noted by Auffray et al. and Balsam et al. that big data could help formulate a prevention and prediction strategy, in addition to improving the health of the population [9, 58]. Big data can help improve individual health progress and support the shift toward personalized medicine. It can predict personal health and improve clinicians' decisions [25]. In addition, big data can assist clinical trials in choosing suitable participants and make the process less expensive. Big data can also give more insight into drug safety and support the early detection and tracing of adverse drug side effects. Finally, big data can help monitor infectious disease trends and track cases among the population, which can assist in making the right decisions and acting immediately to limit the spread of contagious diseases.

It is mentioned by Zeng that big data integration with different behavioral and social factors can help in better understanding of health care and health disparities, to enhance the population health and reduce such disparities [47]. Zeng discusses two vital areas where big data can play an essential role in the health care system: it can help integrate and understand social factors impacting population health to improve the population’s quality of care [47].

The USA HITECH Act promotes adoption of the EMR, contributing to data generation and reducing health disparities. The same applies to the context of the UAE, where EMRs are adopted among various public and private sector authorities. The EMR is considered a new opportunity to create a vast amount of data for better understanding other demographic factors and their effects on a large population. Still, there is an issue related to the standardization of medical notes worldwide, which plays an essential role in better understanding and analyzing the data. Big data analysis can improve health care delivery and reduce unnecessary costs. Big data can help in better understanding disease progression and the side effects of medications and in ensuring that health care is delivered equally among the population. In addition to EMR data, integration with other devices such as home monitoring devices and smartphone applications may provide better insight into the population data.

Big data can improve public health surveillance and address disparities in the health care system. In the USA, the implementation of the Affordable Care Act and general insurance play an essential role in expanding the health care system and enhancing the accessibility of health care. These factors help acquire extensive data from the population across different socioeconomic backgrounds, making it possible to better understand population requirements and health status and to formulate policy and strategy accordingly. This approach will help allocate resources and enhance the health care system by utilizing big data analysis.

Big data opportunities in health care

RQ2: What are the potential opportunities to enhance quality-of-care services through integrating big data in the health care system?

Geographic information systems (GISs) can help in better understanding the population at risk and its health requirements and in making the proper intervention based on real-time data through big data analysis. One example provided by Gamache is the use of GIS to allocate vaccines targeting a particular population in response to an outbreak in a specific geographic area [21].

Social media data can be essential in understanding and monitoring population behavior and the spread of infectious diseases. For example, Young et al. analyzed 553,186,061 tweets and found a correlation between the prevalence of HIV and the geographic location of HIV-related tweets [46]. This can be expanded by analyzing the current status and utilizing the data for future prediction of cases using big data modeling, analysis of population behavior, and linkage to social media data. Future projections can help in making the proper intervention to reduce the spread of infections [47, 56, 58], reducing the cost of the disease burden on the community and health care system.

Zeng mentioned that big data modeling could be more accurate than traditional methods. In addition, big data modeling can include predictive modeling to forecast a disease occurrence and complications related to the particular disease [47], which can help in a better intervention, enhance the health outcome, and reduce the disease progression.

Challenges related to big data

RQ3: How to best understand the challenges of implementing and using big data technologies?

Many challenges are identified in the literature concerning the application of big data in the health care system. For example, the study performed by Auffray et al. [9] and Zeng [47] highlighted some of the key challenges concerning the development of utilizing big data. The most common areas of concern were data privacy [50], data storage [49, 56], data structure [57], data ownership, and governance [21].

Data security

Health care data are considered sensitive, and most literature agrees that big data raises a security issue. As several USA researchers mentioned, it is regarded as a challenge to process big data. The Health Insurance Portability and Accountability Act of 1996 (HIPAA) prevents client data handling without prior consent. The same applies to Article 379 of the UAE Penal Code, which requires prior permission to manage a client’s data. As Adibuzzaman et al. noted, a simple de-identification process is not a final solution [1]: de-identification alone still cannot fully protect client data.

Further, it is easy to re-identify the person by other location or demographic information. Data related to health issues are considered highly sensitive and private [55]. This is why specific regulation regarding accessibility and the availability of the data with consideration to the customer is required and prior consent mandatory, even if the data were to be coded and de-identified.
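Why removing names and IDs is not enough can be shown with a small sketch that measures k-anonymity over quasi-identifiers. k-anonymity is a technique brought in here for illustration and is not named in the reviewed studies; the field names and the three records below are synthetic assumptions.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that are
    indistinguishable on the given quasi-identifiers."""
    return min(group_sizes(records, quasi_identifiers).values())


def at_risk(records, quasi_identifiers):
    """Return records whose quasi-identifier combination is unique, i.e.
    records that could be singled out by location or demographics alone."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] == 1
    ]


# Synthetic "de-identified" records: direct identifiers are already removed.
records = [
    {"age_band": "30-39", "location": "Dubai", "sex": "F", "diagnosis": "diabetes"},
    {"age_band": "30-39", "location": "Dubai", "sex": "F", "diagnosis": "asthma"},
    {"age_band": "60-69", "location": "Ajman", "sex": "M", "diagnosis": "hypertension"},
]
quasi_identifiers = ["age_band", "location", "sex"]

k = k_anonymity(records, quasi_identifiers)
exposed = at_risk(records, quasi_identifiers)
print(f"k = {k}; records that can be singled out: {len(exposed)}")
```

A value of k equal to 1 means at least one person is unique on location and demographics, so the data set would need further generalization or suppression before sharing.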

Data storage

Data storage has been a concern in many studies due to data security. Legal issues differ between countries, and access to data is limited to research and government use, without a clear data storage and accessibility policy [51,52,53,54]. Big data cannot be stored by ordinary means, especially when it comes from different sources, including electronic medical records, monitoring devices, images, and lab tests.

Initiatives and solutions for data storage were discussed by Adibuzzaman et al. [1], including the platform Informatics for Integrating Biology and the Bedside (i2b2). It is a platform of more than 100 hospitals where patient data is de-identified and stored for research purposes since the hospital should use another software to transfer data. The author argues that this initiative does not permit patients to access their data [1]. Moreover, the hospital required much effort to de-identify the data and used a particular platform to transfer data. If the same system is to be applied, an additional budget is required, and the system will be limited to structured data. It is also mentioned by Dash et al. that data stored at the same time it is generated is less compared to if the data is transferred using another system [17]. Most of the evaluated authors agreed that structured data storage and analysis is more manageable than transferal to another platform. Cloud-based storage remains the ideal solution, yet the security issue remains a challenge to overcome.

Missing data and unstructured data

One of the common challenges in the reviewed literature is missing data in the electronic health record, together with unstructured and free-text data, which are very difficult to process and analyze. For example, the study conducted by Zeng mentioned that electronic health records lack social or behavioral data and that there was no standardization of the data format, leading to disparities in health information [47]. This can hinder big data analysis or make it impossible. In addition, the lack of data standardization makes data transfer and acquisition impossible [23, 33, 50].
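A first step toward quantifying the missing-data problem is a per-field completeness report. The sketch below is a minimal illustration; the field names and the four records are synthetic assumptions and do not come from any EMR system discussed in the literature.

```python
def completeness(records, fields):
    """Return, for each field, the fraction of records that hold a
    non-empty value (neither missing, None, nor an empty string)."""
    total = len(records)
    report = {}
    for field in fields:
        present = sum(1 for r in records if r.get(field) not in (None, ""))
        report[field] = present / total if total else 0.0
    return report


# Synthetic records: billing-oriented fields are filled in, while social
# and behavioral fields are mostly absent.
records = [
    {"diagnosis_code": "E11", "smoking_status": "never"},
    {"diagnosis_code": "I10", "smoking_status": ""},
    {"diagnosis_code": "J45", "smoking_status": None},
    {"diagnosis_code": "E11"},
]

report = completeness(records, ["diagnosis_code", "smoking_status"])
for field, fraction in report.items():
    print(f"{field}: {fraction:.0%} complete")
```

Fields with low completeness are candidates for exclusion, imputation, or targeted data collection before any large-scale analysis.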

Most electronic health records are designed to make diagnosis coding and billing more manageable and the information more explicit. Still, these are not yet advanced enough to support useful data analysis and linkage, as mentioned by Gamache [21]. One example of the failure of EMR for data analysis is the Medical Information Mart for Intensive Care (MIMIC III), which has collected data for more than 50,000 patients from Beth Israel Deaconess Hospital dating back to 2001. The researchers aimed to conduct studies to answer different questions, such as the drug–drug interaction between antihistamines and antidepressants. When they applied the selection criteria and checked the files, they were left with a very small sample, which is not representative. This issue can also apply to EMRs in the UAE. There are multiple EMR systems in the UAE with different software, there is no standardization of medical record notes, and systems are fragmented between local and federal authorities [33]. This complicates data transfer and analysis and should be taken into consideration.

Data ownership and governance

Health care information is considered sensitive data. Data ownership and sharing remain unclear and subject to negotiation among countries, as many studies have mentioned [8, 9, 33]. Privacy policies vary across Europe, and no single approach to big data sharing fits the existing policies of other countries. The European Commission's proposal for a General Data Protection Regulation (2012/0011(COD)) tried to synchronize the fragmented health care systems to make data available and usable among different European countries. One suggestion was to share data across a blockchain, where all transactions would be recorded and data accessibility would be monitored; however, this has not yet been adopted because security was a concern. This would not be the case in the UAE, as policy and federal law are unified, and data control and accessibility could be maintained at the federal level.
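The idea of recording every transaction so that data access can be monitored can be illustrated with a hash-chained access log. The sketch below is a single-party simplification, not a distributed blockchain: there is no consensus, networking, or encryption, and the class name, field names, and example requesters are assumptions made for illustration only.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict, List

GENESIS_HASH = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash: str, payload: Dict[str, str]) -> str:
    """Hash the previous entry's hash together with the current payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


@dataclass
class AccessLog:
    """Append-only log in which each entry commits to the one before it."""
    entries: List[dict] = field(default_factory=list)

    def record(self, requester: str, record_id: str, action: str) -> str:
        """Append one access event and return its hash."""
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        payload = {"requester": requester, "record_id": record_id, "action": action}
        entry = {"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
        self.entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(prev_hash, entry["payload"]):
                return False
            prev_hash = entry["hash"]
        return True


if __name__ == "__main__":
    log = AccessLog()
    log.record("hospital_a", "rec-001", "read")     # hypothetical requester and record
    log.record("researcher_b", "rec-001", "export")
    print(log.verify())
```

Because each entry embeds the hash of its predecessor, altering an earlier access event changes every later hash, which is what makes after-the-fact tampering detectable.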

Auffray et al. [9] mentioned that the USA is a “patient-driven economy” where patients own their data. This is a step forward and a promising approach for fragmented health care landscapes in which different healthcare systems coexist, as in the case of Europe and even the UAE. It would help patients own their digital data, but it requires a digital infrastructure and storage system to ensure the data is transferred to the cloud. It also requires the data to be in a structured format so that it can be accessed, understood, and analyzed easily [59,60,61]. Nevertheless, this approach can help enhance health tourism and a health-driven economy.

Adibuzzaman et al. [1] mentioned that data should be Findable, Accessible, Interoperable, and Reusable (FAIR). In addition, the data would need to be stored as open source, where researchers, stakeholders, and even patients have access to it while data protection and privacy are ensured. For example, data could be stored in a protected environment after patient identifiers have been properly de-identified, in compliance with privacy law.
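As an illustration of de-identifying patient identifiers before storage, the sketch below drops direct identifiers and replaces the patient ID with a keyed hash (HMAC-SHA256). This is pseudonymization rather than full anonymization, since quasi-identifiers such as dates or rare diagnoses can still allow re-identification. The field names, the example record, and the key handling are assumptions for illustration, not a method described in the reviewed studies.

```python
import hashlib
import hmac
from typing import Any, Dict

# Hypothetical direct identifiers; a real deployment would follow the applicable law.
DIRECT_IDENTIFIERS = {"name", "national_id", "phone"}


def pseudonymize_id(patient_id: str, secret_key: bytes) -> str:
    """Map a patient ID to a stable pseudonym using a keyed hash."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: Dict[str, Any], secret_key: bytes) -> Dict[str, Any]:
    """Drop direct identifiers and replace the patient ID with its pseudonym."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "patient_id"
    }
    cleaned["pseudonym"] = pseudonymize_id(record["patient_id"], secret_key)
    return cleaned


if __name__ == "__main__":
    key = b"demo-key-kept-outside-the-research-store"  # assumption: managed separately
    raw = {
        "patient_id": "784-0001",
        "name": "Example Patient",
        "national_id": "000-0000",
        "phone": "000",
        "diagnosis_code": "E11",
    }
    print(deidentify(raw, key))
```

Using the same key yields the same pseudonym for the same patient, so records can still be linked across datasets without exposing the original identifier; the key itself would need to be held outside the research environment.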

Conclusion and future recommendations

The UAE government remains aware of the power of big data, as shown by the establishment of the UAE Strategy for Artificial Intelligence. The Dubai government policy framework is intended to develop and implement a culture of data sharing and evidence-based decision-making in Dubai [38]. The study concludes that the healthcare systems in the UAE can be enhanced through big data; however, the authorities within the UAE must acknowledge that the development of efficient frameworks for the performance and quality assessment of the new health care system is significant. The said goal can be achieved via integrating big data and health informatics with the help of IT specialists, health care managers, and stakeholders.

Recommendations to use big data in the health sector in the UAE

Specific recommendations for big data handling in the health care sector

  • Formulate a unified EMR standard for medical notes so that medical note data can be processed and transferred quickly.

  • Incentivize health care providers to follow a high standard of EMR data entry, since data entry quality is critical and was a significant challenge in previous studies.

  • Public–private partnership is essential, and the private sector should be incentivized to share its data. The UAE faces the major challenge of population diversity, and most of the population does not seek medical care in the government health care sector. To overcome this discrepancy and ensure the data represents the whole population, data from the private sector should be accessible as well.

  • Agree on the data needed for real-time monitoring, such as data on infectious diseases or surveillance that requires intervention. Big data analysis will simplify and speed up intervention decisions, consequently enabling a better response from authorities to an emergency.

  • Utilize big data to take proactive measures and ensure the timely involvement of stakeholders to prevent disease in the case of non-communicable diseases. This will ultimately reduce management costs and the burden of chronic diseases.
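The first recommendation, a unified EMR standard, can be sketched as a mapping from system-specific exports to one shared record layout. The two source systems, their field names, and their date formats below are hypothetical and chosen only to show the idea; an actual standard would be defined by the relevant health authorities.

```python
from datetime import datetime
from typing import Any, Dict

# Hypothetical field names used by two EMR systems, mapped to one shared layout.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "system_a": {"pid": "patient_id", "dx": "diagnosis_code", "visit": "visit_date"},
    "system_b": {"PatientNo": "patient_id", "ICD10": "diagnosis_code",
                 "DateOfVisit": "visit_date"},
}

# Hypothetical date formats used by each system.
DATE_FORMATS: Dict[str, str] = {"system_a": "%d/%m/%Y", "system_b": "%Y-%m-%d"}


def normalize(record: Dict[str, Any], source: str) -> Dict[str, Any]:
    """Rename fields, convert the visit date to ISO 8601, and tidy the diagnosis code."""
    mapping = FIELD_MAPS[source]
    unified = {target: record.get(original) for original, target in mapping.items()}

    raw_date = unified["visit_date"]
    if raw_date is not None:
        parsed = datetime.strptime(raw_date, DATE_FORMATS[source])
        unified["visit_date"] = parsed.date().isoformat()

    code = unified["diagnosis_code"]
    if code is not None:
        unified["diagnosis_code"] = code.strip().upper()

    unified["source"] = source  # keep provenance for later quality checks
    return unified


if __name__ == "__main__":
    print(normalize({"pid": "17", "dx": " e11 ", "visit": "05/04/2021"}, "system_a"))
    print(normalize({"PatientNo": "17", "ICD10": "E11", "DateOfVisit": "2021-04-05"},
                    "system_b"))
```

Once records from different providers share one layout, they can be pooled for the real-time monitoring and proactive analysis described in the later recommendations.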

Conclusion and future research

The digital revolution has arrived, and it is impacting everyday life. There is a dire need to utilize existing health care system-related data, which is generated automatically every day. This big data can transform the healthcare system to improve patient care by proactively anticipating disease origins and implementing timely solutions to bridge the gaps in the existing healthcare system. It is time to think about innovative technological solutions to link and analyze data faster, understand the disparities between community health and public health while enhancing the overall health care system, and implement United Nations Sustainable Development Goal 3, ensuring healthy lives and promoting well-being for all at all ages.

Data availability

The authors confirm that the data supporting the findings of this study are available within the article via its supplementary materials.

References


  1. Adibuzzaman M, DeLaurentis P, Hill J, Benneyworth BD. Big data in healthcare–the promises, challenges, and opportunities from a research perspective: A case study with a model database. In: AMIA Annual Symposium Proceedings (Vol. 2017). American Medical Informatics Association. 2017, p. 384.

  2. Adnan K, Akbar R. Limitations of information extraction methods and techniques for heterogeneous unstructured big data. Int J Eng Business Manag. 2019.

  3. Ahmadvand H, Dargahi T, Foroutan F, Okorie P, Esposito F. Big data processing at the edge with data skew aware resource allocation. In: 2021 IEEE conference on network function virtualization and software-defined networks (NFV-SDN) (pp. 81–86). IEEE. 2021.

  4. Ahmadvand H, Foroutan F, Fathy M. DV-DVFS: merging data variety and DVFS technique to manage the energy consumption of big data processing. J Big Data. 2021;8(1):1–16.

  5. Ahmadvand H, Goudarzi M. Using data variety for efficient progressive big data processing in warehouse-scale computers. IEEE Comput Archit Lett. 2016;16(2):166–9.

  6. Ahmadvand H, Goudarzi M. SAIR: significance-aware approach to improve QoR of big data processing in case of budget constraint. J Supercomput. 2019;75(9):5760–81.

  7. Ahmadvand H, Goudarzi M, Foroutan F. Gapprox: using gallup approach for approximation in big data processing. J Big Data. 2019;6(1):1–24.

  8. AlMarzooqi FM, Moonesar IA, AlQutob R. Healthcare professional and user perceptions of eHealth data and record privacy in Dubai. Information. 2020;11(9):415.

  9. Auffray C, Balling R, Barroso I, Bencze L, Benson M, Bergeron J, Bernal-Delgado E, Blomberg N, Bock C, Conesa A, Del Signore S. Making sense of big data in health research: towards an EU action plan. Genome Med. 2016;8(1):1–13.

  10. Balador A, Bazzi A, Hernandez-Jayo U, de la Iglesia I, Ahmadvand H. A survey on vehicular communication for cooperative truck platooning application. Vehicular Commun. 2022;34:100460.

  11. Bani-issa W, Eldeirawi K, Al Tawil H. Perspectives on the attitudes of healthcare professionals toward diabetes in community health settings in United Arab Emirates. J Diabetes Mellitus. 2014;5(01):1.

  12. Barrett MA, Humblet O, Hiatt RA, Adler NE. Big data and disease prevention: from quantified self to quantified communities. Big data. 2013;1(3):168–75.

  13. Blumenstock J, Cadamuro G, On R. Predicting poverty and wealth from mobile phone metadata. Science. 2015;350(6264):1073–6.

  14. Catalyst NEJM. Healthcare big data and the promise of value-based care. NEJM Catalyst. 2018;4(1):89.

  15. Coorevits P, Sundgren M, Klein GO, Bahr A, Claerhout B, Daniel C, Dugas M, Dupont D, Schmidt A, Singleton P, De Moor G. Electronic health records: new opportunities for clinical research. J Intern Med. 2013;274(6):547–60.

  16. Dargahi T, Ahmadvand H, Alraja MN, Yu CM. Integration of blockchain with connected and autonomous vehicles: vision and challenge. ACM J Data Inform Quality (JDIQ). 2021;14(1):1–10.

  17. Dash S, Shakyawar SK, Sharma M, Kaushik S. Big data in healthcare: management, analysis and future prospects. J Big Data. 2019;6(1):1–25.

  18. Demchenko Y, Zhao Z, Grosso P, Wibisono A, De Laat C. Addressing big data challenges for scientific data infrastructure. In: 4th IEEE International Conference on Cloud Computing Technology and Science Proceedings (pp. 614–617). IEEE. 2012.

  19. Effoe VS, Katula JA, Kirk J, et al. The use of electronic medical records for recruitment in clinical trials: findings from the Lifestyle Intervention for Treatment of Diabetes trial. Trials. 2016;17:496.

  20. Endriyas M, Alano A, Mekonnen E, Ayele S, Kelaye T, Shiferaw M, Misganaw T, Samuel T, Hailemariam T, Hailu S. Understanding performance data: health management information system data accuracy in Southern Nations Nationalities and People’s Region, Ethiopia. BMC Health Serv Res. 2019;19(1):1–6.

  21. Gamache RK. Public and population health informatics: the bridging of big data to benefit communities. Yearb Med Inform. 2018;27(1):199.

  22. Gao Y, Zhou Y, Zhou B, Shi L, Zhang J. Handling data skew in MapReduce cluster by using partition tuning. J Healthcare Eng. 2017;2017:56.

  23. Goldstein BA, Navar AM, Pencina MJ, Ioannidis J. Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review. J Am Med Inform Assoc. 2017;24(1):198–208.

  24. Laney D. 3D data management: Controlling data volume, velocity and variety. META Group Research Note. 2001;6(70):1.

  25. Leyens L, Reumann M, Malats N, Brand A. Use of big data for drug development and for public and personal health and care. Genet Epidemiol. 2017;41(1):51–60.

  26. Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JP, Clarke M, Devereaux PJ, Kleijnen J, Moher D. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. J Clin Epidemiol. 2009;62(10):e1-34.

  27. Mittelstadt BD, Floridi L. The ethics of big data: current and foreseeable issues in biomedical contexts. Sci Eng Ethics. 2016;22(2):303–41.

  28. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med. 2009;151(4):264–9.

  29. Muni Kumar N, Manjula R. Role of Big data analytics in rural health care-A step towards svasth bharath. Int J Computer Sci Inform Technol. 2014;5(6):7172–8.

  30. Page MJ, Moher D, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, Shamseer L, Tetzlaff JM, Akl EA, Brennan SE, Chou R. PRISMA 2020 explanation and elaboration: updated guidance and exemplars for reporting systematic reviews. BMJ. 2021;372:67.

  31. Pastorino R, De Vito C, Migliara G, Glocker K, Binenbaum I, Ricciardi W, Boccia S. Benefits and challenges of Big Data in healthcare: an overview of the European initiatives. Eur J Public Health. 2019;29(3):23–7.

  32. Paxton C, Niculescu-Mizil A, Saria S. Developing predictive models using electronic medical records: challenges and pitfalls. AMIA … Annual Symposium proceedings. AMIA Symposium. 2013;2013:1109–15.

  33. Sarabdeen J, Moonesar IA. Privacy protection laws and public perception of data privacy: The case of Dubai e-health care services. Benchmarking Int J. 2018;34:8.

  34. Shelton T, Poorthuis A, Graham M, Zook M. Mapping the data shadows of Hurricane Sandy: Uncovering the sociospatial dimensions of ‘big data.’ Geoforum. 2014;52:167–79.

  35. Srikanth Thudumu PB. A comprehensive survey of anomaly detection techniques for high dimensional big data. J Big Data. 2020;7:42.

  36. Steele JE, Sundsøy PR, Pezzulo C, Alegana VA, Bird TJ, Blumenstock J, Bjelland J, Engø-Monsen K, De Montjoye YA, Iqbal AM, Hadiuzzaman KN. Mapping poverty using mobile phone and satellite data. J R Soc Interface. 2017;14(127):20160690.

  37. Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, Moher D, Peters MD, Horsley T, Weeks L, Hempel S. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467–73.

  38. UAE. Retrieved 12 11, 2020. 2019.

  39. UAE GOVERNMENT. Vision 2021 and health. Retrieved April 05, 2021, from 2020.

  40. UAE. The United Arab Emirates Government portal. 2022.

  41. Ukil A, Bandyopadhyay S, Puri C, Pal A. IoT healthcare analytics: The importance of anomaly detection. In: 2016 IEEE 30th international conference on advanced information networking and applications (AINA) (pp. 994–997). IEEE. 2016.

  42. United Nations. SDG, The Sustainable Development Goals, Retrieved April 05, 2020, 2015.

  43. United Nations. UN World Data Forum 2018 wraps up with launch of Dubai Declaration. Retrieved April 05, 2020. 2018.

  44. Wang H. Towards felicitous decision making: an overview on challenges and trends of Big Data. Inf Sci. 2016;367–368:747–65.

  45. Wu P, Lin. Unstructured big data analytics for retrieving e-commerce logistics knowledge. Telematics Inform. 2018;35(1):237–44.

  46. Young SD, Rivers C, Lewis B. Methods of using real-time social media technologies for detection and remote monitoring of HIV outcomes. Prev Med. 2014;63:112–5.

  47. Zeng X, Zhang Y, Kwong JS, Zhang C, Li S, Sun F, Niu Y, Du L. The methodological quality assessment tools for preclinical and clinical studies, systematic review and meta-analysis, and clinical practice guideline: a systematic review. J Evid Based Med. 2015;8(1):2–10.

  48. Zhang Q, Hansen D. Approximate processing for medical record linking and multidatabase analysis. Int J Healthcare Inform Syst Inform (IJHISI). 2007;2(4):59–72.

  49. Zhang X, Pérez-Stable EJ, Bourne PE, Peprah E, Duru OK, Breen N, Berrigan D, Wood F, Jackson JS, Wong DW, Denny J. Big data science: opportunities and challenges to address minority health and health disparities in the 21st century. Ethn Dis. 2017;27(2):95.

  50. Ristevski B, Chen M. Big data analytics in medicine and healthcare. J Integr Bioinform. 2018;15:3.

  51. Madanian S, Parry DT, Airehrour D, Cherrington M. mHealth and big-data integration: promises for healthcare system in India. BMJ Health Care Inform. 2019;26:1.

  52. Murphy S, Castro V, Mandl K. Grappling with the future use of big data for translational medicine and clinical care. Yearb Med Inform. 2017;26(01):96–102.

  53. Roca J, Tenyi A, Cano I. Paradigm changes for diagnosis: using big data for prediction. Clin Chem Lab Med (CCLM). 2019;57(3):317–27.

  54. Thompson ME, Dulin MF. Leveraging data analytics to advance personal, population, and system health: Moving beyond merely capturing services provided. N C Med J. 2019;80(4):214–8.

  55. Carney TJ, Kong AY. Leveraging health informatics to foster a smart systems response to health disparities and health equity challenges. J Biomed Inform. 2017;1(68):184–9.

  56. Beckmann JS, Lew D. Reconciling evidence-based medicine and precision medicine in the era of big data: challenges and opportunities. Genome Med. 2016;8(1):1–1.

  57. Kumar S, Singh M. Big data analytics for healthcare industry: impact, applications, and tools. Big Data Mining Analyt. 2018;2(1):48–57.

  58. Alkouz B, Al Aghbari Z, Abawajy JH. Tweetluenza: Predicting flu trends from twitter data. Big Data Mining Analyt. 2019;2(4):273–87.

  59. Gravili G, Manta F, Cristofaro CL, Reina R, Toma P. Value that matters: intellectual capital and big data to assess performance in healthcare. An empirical analysis on the European context. J Intell Capital. 2020;34:56.

  60. Gu D, Li J, Li X, Liang C. Visualizing the knowledge structure and evolution of big data research in healthcare informatics. Int J Med Informatics. 2017;1(98):22–32.

  61. Dhagarra D, Goswami M, Sarma PR, Choudhury A. Big Data and blockchain supported conceptual model for enhanced healthcare coverage: The Indian context. Bus Process Manag J. 2019;67:7.



The authors disclose receipt of the following financial support for the authorship, research, and/or publication of this article: The authors would like to acknowledge Mohammed Bin Rashid School of Government (MBRSG), Dubai, UAE, and the Alliance for Health Policy and Systems Research at the World Health Organization for financial support as part of the Knowledge to Policy (K2P) Center Mentorship Program [BIRD Project].

Author information

Authors and Affiliations



KH and IAM: Made a substantial contribution to all the sections and participated in the review, analysis, and interpretation. Involved in drafting the manuscript and revising it critically for important intellectual content. KH: Made a substantial contribution to study design. Participated in review, analysis, and interpretation. KH: Made a considerable contribution to background and method sections. KH: Made a significant contribution to the background and discussion sections. KH and IAM: Made a significant contribution to the discussion and conclusion sections. IAM: Made a substantial contribution to background, methods, and discussion sections. All authors give final approval for the version to be published and agree to be accountable for all aspects of the work. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Immanuel Azaad Moonesar.

Ethics declarations

Ethics approval and consent to participate

Studies involving animal subjects: No animal studies are presented in this manuscript.

Studies involving human subjects: No human studies are presented in this manuscript.

Inclusion of identifiable human data: No potentially identifiable human images or data is presented in this study.

Competing interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Alhajaj, K.E., Moonesar, I.A. The power of big data mining to improve the health care system in the United Arab Emirates. J Big Data 10, 12 (2023).
