  • Review Article
  • Published: 20 January 2022

AI in health and medicine

  • Pranav Rajpurkar (ORCID: orcid.org/0000-0002-8030-3727),
  • Emma Chen,
  • Oishi Banerjee &
  • Eric J. Topol (ORCID: orcid.org/0000-0002-1478-4729)

Nature Medicine volume 28, pages 31–38 (2022)

  • Computational biology and bioinformatics
  • Medical research

Artificial intelligence (AI) is poised to broadly reshape medicine, potentially improving the experiences of both clinicians and patients. We discuss key findings from a 2-year weekly effort to track and share key developments in medical AI. We cover prospective studies and advances in medical image analysis, which have reduced the gap between research and deployment. We also address several promising avenues for novel medical AI research, including non-image data sources, unconventional problem formulations and human–AI collaboration. Finally, we consider serious technical and ethical challenges in issues spanning from data scarcity to racial bias. As these challenges are addressed, AI’s potential may be realized, making healthcare more accurate, efficient and accessible for patients worldwide.

Gulshan, V. et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. J. Am. Med. Assoc. 316 , 2402–2410 (2016).

Article   Google Scholar  

Esteva, A. et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542 , 115–118 (2017).

Article   CAS   PubMed   PubMed Central   Google Scholar  

Rajpurkar, P. et al. Deep learning for chest radiograph diagnosis: a retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med. 15 , e1002686 (2018).

Article   PubMed   PubMed Central   Google Scholar  

Hannun, A. Y. et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat. Med. 25 , 65–69 (2019).

Wiens, J. et al. Do no harm: a roadmap for responsible machine learning for health care. Nat. Med. 25 , 1337–1340 (2019).

Article   CAS   PubMed   Google Scholar  

Kanagasingam, Y. et al. Evaluation of artificial intelligence-based grading of diabetic retinopathy in primary care. JAMA Netw. Open 1 , e182665 (2018).

Beede, E. et al. A human-centered evaluation of a deep learning system deployed in clinics for the detection of diabetic retinopathy. in Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems 1–12 (Association for Computing Machinery, 2020); https://dl.acm.org/doi/abs/10.1145/3313831.3376718

Kiani, A. et al. Impact of a deep learning assistant on the histopathologic classification of liver cancer. NPJ Digit. Med. 3 , 23 (2020).

Lin, H. et al. Diagnostic efficacy and therapeutic decision-making capacity of an artificial intelligence platform for childhood cataracts in eye clinics: a multicentre randomized controlled trial. EClinicalMedicine 9 , 52–59 (2019).

Gong, D. et al. Detection of colorectal adenomas with a real-time computer-aided system (ENDOANGEL): a randomised controlled study. Lancet Gastroenterol. Hepatol. 5 , 352–361 (2020).

Article   PubMed   Google Scholar  

Wang, P. et al. Effect of a deep-learning computer-aided detection system on adenoma detection during colonoscopy (CADe-DB trial): a double-blind randomised study. Lancet Gastroenterol. Hepatol. 5 , 343–351 (2020).

Hollon, T. C. et al. Near real-time intraoperative brain tumor diagnosis using stimulated Raman histology and deep neural networks. Nat. Med. 26 , 52–58 (2020).

Phillips, M. et al. Assessment of accuracy of an artificial intelligence algorithm to detect melanoma in images of skin lesions. JAMA Netw. Open 2 , e1913436 (2019).

Nimri, R. et al. Insulin dose optimization using an automated artificial intelligence-based decision support system in youths with type 1 diabetes. Nat. Med. 26 , 1380–1384 (2020).

Wijnberge, M. et al. Effect of a machine learning-derived early warning system for intraoperative hypotension vs. standard care on depth and duration of intraoperative hypotension during elective noncardiac surgery. J. Am. Med. Assoc. 323 , 1052–1060 (2020).

Wismüller, A. & Stockmaster, L. A prospective randomized clinical trial for measuring radiology study reporting time on Artificial Intelligence-based detection of intracranial hemorrhage in emergent care head CT. in Medical Imaging 2020: Biomedical Applications in Molecular, Structural, and Functional Imaging vol. 11317, 113170M (International Society for Optics and Photonics, 2020).

Liu, X. et al. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Br. Med. J. 370 , m3164 (2020).

Rivera, S. C. et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat. Med. 26 , 1351–1363 (2020).

Centers for Medicare & Medicaid Services. Medicare Program; Hospital Inpatient Prospective Payment Systems for Acute Care Hospitals and the Long-Term Care Hospital Prospective Payment System and Final Policy Changes and Fiscal Year 2021 Rates; Quality Reporting and Medicare and Medicaid Promoting Interoperability Programs Requirements for Eligible Hospitals and Critical Access Hospitals. Fed. Regist. 85 , 58432–59107 (2020).

Benjamens, S., Dhunnoo, P. & Meskó, B. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. NPJ Digit. Med. 3 , 118 (2020).

Wu, N. et al. Deep neural networks improve radiologists’ performance in breast cancer screening. IEEE Trans. Med. Imaging 39 , 1184–1194 (2020).

McKinney, S. M. et al. International evaluation of an AI system for breast cancer screening. Nature 577 , 89–94 (2020).

Ghorbani, A. et al. Deep learning interpretation of echocardiograms. NPJ Digit. Med. 3 , 10 (2020).

Ouyang, D. et al. Video-based AI for beat-to-beat assessment of cardiac function. Nature 580 , 252–256 (2020).

Ardila, D. et al. End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat. Med. 25 , 954–961 (2019).

Huynh, E. et al. Artificial intelligence in radiation oncology. Nat. Rev. Clin. Oncol. 17 , 771–781 (2020).

Huang, P. et al. Prediction of lung cancer risk at follow-up screening with low-dose CT: a training and validation study of a deep learning method. Lancet Digit. Health 1 , e353–e362 (2019).

Kather, J. N. et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat. Med. 25 , 1054–1056 (2019).

Jackson, H. W. et al. The single-cell pathology landscape of breast cancer. Nature 578 , 615–620 (2020).

Campanella, G. et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med. 25 , 1301–1309 (2019).

Fu, Y. et al. Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis. Nat. Cancer 1 , 800–810 (2020).

Courtiol, P. et al. Deep learning-based classification of mesothelioma improves prediction of patient outcome. Nat. Med. 25 , 1519–1525 (2019).

Bera, K., Schalper, K. A., Rimm, D. L., Velcheti, V. & Madabhushi, A. Artificial intelligence in digital pathology: new tools for diagnosis and precision oncology. Nat. Rev. Clin. Oncol. 16 , 703–715 (2019).

Zhou, D. et al. Diagnostic evaluation of a deep learning model for optical diagnosis of colorectal cancer. Nat. Commun. 11 , 2961 (2020).

Zhao, S. et al. Magnitude, risk factors, and factors associated with adenoma miss rate of tandem colonoscopy: a systematic review and meta-analysis. Gastroenterology 156 , 1661–1674 (2019).

Freedman, D. et al. Detecting deficient coverage in colonoscopies. IEEE Trans. Med. Imaging 39 , 3451–3462 (2020).

Liu, H. et al. Development and validation of a deep learning system to detect glaucomatous optic neuropathy using fundus photographs. JAMA Ophthalmol. 137 , 1353–1360 (2019).

Milea, D. et al. Artificial intelligence to detect papilledema from ocular fundus photographs. N. Engl. J. Med. 382 , 1687–1695 (2020).

Wolf, R. M., Channa, R., Abramoff, M. D. & Lehmann, H. P. Cost-effectiveness of autonomous point-of-care diabetic retinopathy screening for pediatric patients with diabetes. JAMA Ophthalmol. 138 , 1063–1069 (2020).

Xie, Y. et al. Artificial intelligence for teleophthalmology-based diabetic retinopathy screening in a national programme: an economic analysis modelling study. Lancet Digit. Health 2 , e240–e249 (2020).

Arcadu, F. et al. Deep learning algorithm predicts diabetic retinopathy progression in individual patients. NPJ Digit. Med. 2 , 92 (2019).

Senior, A. W. et al. Improved protein structure prediction using potentials from deep learning. Nature 577 , 706–710 (2020).

Alley, E. C., Khimulya, G., Biswas, S., AlQuraishi, M. & Church, G. M. Unified rational protein engineering with sequence-based deep representation learning. Nat. Methods 16 , 1315–1322 (2019).

Gainza, P. et al. Deciphering interaction fingerprints from protein molecular surfaces using geometric deep learning. Nat. Methods 17 , 184–192 (2020).

Greener, J.G. et al. Deep learning extends de novo protein modelling coverage of genomes using iteratively predicted structural constraints. Nat. Commun. 10 , 3977 (2019).

Chabon, J. J. et al. Integrating genomic features for non-invasive early lung cancer detection. Nature 580 , 245–251 (2020).

Luo, H. et al. Circulating tumor DNA methylation profiles enable early diagnosis, prognosis prediction, and screening for colorectal cancer. Sci. Transl. Med. 12 , eaax7533 (2020).

Cristiano, S. et al. Genome-wide cell-free DNA fragmentation in patients with cancer. Nature 570 , 385–389 (2019).

Gussow, A. B. et al. Machine-learning approach expands the repertoire of anti-CRISPR protein families. Nat. Commun. 11 , 3784 (2020).

Wang, D. et al. Optimized CRISPR guide RNA design for two high-fidelity Cas9 variants by deep learning. Nat. Commun. 10 , 4284 (2019).

Bhattacharyya, R. P. et al. Simultaneous detection of genotype and phenotype enables rapid and accurate antibiotic susceptibility determination. Nat. Med. 25 , 1858–1864 (2019).

Stokes, J. M. et al. A deep learning approach to antibiotic discovery. Cell 181 , 475–483 (2020).

Zhavoronkov, A. et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 37 , 1038–1040 (2019).

Lee, J. et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36 , 1234–1240 (2020).

CAS   PubMed   Google Scholar  

Zhu, Y., Li, L., Lu, H., Zhou, A. & Qin, X. Extracting drug-drug interactions from texts with BioBERT and multiple entity-aware attentions. J. Biomed. Inform. 106 , 103451 (2020).

Smit, A. et al. CheXbert: Combining automatic labelers and expert annotations for accurate radiology report labeling using BERT. in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing 1500–1519 (2020).

Sarker, A., Gonzalez-Hernandez, G., Ruan, Y. & Perrone, J. Machine learning and natural language processing for geolocation-centric monitoring and characterization of opioid-related social media chatter. JAMA Netw. Open 2 , e1914672 (2019).

Claassen, J. et al. Detection of brain activation in unresponsive patients with acute brain injury. N. Engl. J. Med. 380 , 2497–2505 (2019).

Porumb, M., Stranges, S., Pescapè, A. & Pecchia, L. Precision medicine and artificial intelligence: a pilot study on deep learning for hypoglycemic events detection based on ECG. Sci. Rep. 10 , 170 (2020).

Attia, Z. I. et al. An artificial intelligence-enabled ECG algorithm for the identification of patients with atrial fibrillation during sinus rhythm: a retrospective analysis of outcome prediction. Lancet 394 , 861–867 (2019).

Chan, J., Raju, S., Nandakumar, R., Bly, R. & Gollakota, S. Detecting middle ear fluid using smartphones. Sci. Transl. Med. 11 , eaav1102 (2019).

Willett, F. R., Avansino, D. T., Hochberg, L. R., Henderson, J. M. & Shenoy, K. V. High-performance brain-to-text communication via handwriting. Nature 593 , 249–254 (2021).

Green, E. M. et al. Machine learning detection of obstructive hypertrophic cardiomyopathy using a wearable biosensor. NPJ Digit. Med. 2 , 57 (2019).

Thorsen-Meyer, H.-C. et al. Dynamic and explainable machine learning prediction of mortality in patients in the intensive care unit: a retrospective study of high-frequency data in electronic patient records. Lancet Digit. Health 2 , e179–e191 (2020).

Porter, P. et al. A prospective multicentre study testing the diagnostic accuracy of an automated cough sound centred analytic system for the identification of common respiratory disorders in children. Respir. Res. 20 , 81 (2019).

Tomašev, N. et al. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature 572 , 116–119 (2019).

Kehl, K. L. et al. Assessment of deep natural language processing in ascertaining oncologic outcomes from radiology reports. JAMA Oncol. 5 , 1421–1429 (2019).

Huang, S.-C., Pareek, A., Seyyedi, S., Banerjee, I. & Lungren, M. P. Fusion of medical imaging and electronic health records using deep learning: a systematic review and implementation guidelines. NPJ Digit. Med. 3 , 136 (2020).

Wang, C. et al. Quantitating the epigenetic transformation contributing to cholesterol homeostasis using Gaussian process. Nat. Commun. 10 , 5052 (2019).

Li, Y. et al. Inferring multimodal latent topics from electronic health records. Nat. Commun. 11 , 2536 (2020).

Tshitoyan, V. et al. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571 , 95–98 (2019).

Li, X. et al. Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis. Nat. Commun. 11 , 2338 (2020).

Amodio, M. et al. Exploring single-cell data with deep multitasking neural networks. Nat. Methods 16 , 1139–1145 (2019).

Urteaga, I., McKillop, M. & Elhadad, N. Learning endometriosis phenotypes from patient-generated data. NPJ Digit. Med. 3 , 88 (2020).

Brbić, M. et al. MARS: discovering novel cell types across heterogeneous single-cell experiments. Nat. Methods 17 , 1200–1206 (2020).

Seymour, C. W. et al. Derivation, validation, and potential treatment implications of novel clinical phenotypes for sepsis. J. Am. Med. Assoc. 321 , 2003–2017 (2019).

Article   CAS   Google Scholar  

Fries, J. A. et al. Weakly supervised classification of aortic valve malformations using unlabeled cardiac MRI sequences. Nat. Commun. 10 , 3111 (2019).

Jin, L. et al. Deep learning enables structured illumination microscopy with low light levels and enhanced speed. Nat. Commun. 11 , 1934 (2020).

Vishnevskiy, V. et al. Deep variational network for rapid 4D flow MRI reconstruction. Nat. Mach. Intell. 2 , 228–235 (2020).

Masutani, E. M., Bahrami, N. & Hsiao, A. Deep learning single-frame and multiframe super-resolution for cardiac MRI. Radiology 295 , 552–561 (2020).

Rana, A. et al. Use of deep learning to develop and analyze computational hematoxylin and eosin staining of prostate core biopsy images for tumor diagnosis. JAMA Netw. Open 3 , e205111 (2020).

Liu, X. et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit. Health 1 , e271–e297 (2019).

Chen, P.-H. C. et al. An augmented reality microscope with real-time artificial intelligence integration for cancer diagnosis. Nat. Med. 25 , 1453–1457 (2019).

Patel, B. N. et al. Human–machine partnership with artificial intelligence for chest radiograph diagnosis. NPJ Digit. Med. 2 , 111 (2019).

Sim, Y. et al. Deep convolutional neural network–based software improves radiologist detection of malignant lung nodules on chest radiographs. Radiology 294 , 199–209 (2020).

Park, A. et al. Deep learning–assisted diagnosis of cerebral aneurysms using the HeadXNet model. JAMA Netw. Open 2 , e195600 (2019).

Steiner, D. F. et al. Impact of deep learning assistance on the histopathologic review of lymph nodes for metastatic breast cancer. Am. J. Surg. Pathol. 42 , 1636–1646 (2018).

Jain, A. et al. Development and assessment of an artificial intelligence-based tool for skin condition diagnosis by primary care physicians and nurse practitioners in teledermatology practices. JAMA Netw. Open 4 , e217249 (2021).

Seah, J. C. Y. et al. Effect of a comprehensive deep-learning model on the accuracy of chest x-ray interpretation by radiologists: a retrospective, multireader multicase study. Lancet Digit. Health 3 , e496–e506 (2021).

Rajpurkar, P. et al. CheXaid: deep learning assistance for physician diagnosis of tuberculosis using chest x-rays in patients with HIV. NPJ Digit. Med. 3 , 115 (2020).

Kim, H.-E. et al. Changes in cancer detection and false-positive recall in mammography using artificial intelligence: a retrospective, multireader study. Lancet Digit. Health 2 , e138–e148 (2020).

Tschandl, P. et al. Human–computer collaboration for skin cancer recognition. Nat. Med. 26 , 1229–1234 (2020).

van der Laak, J., Litjens, G. & Ciompi, F. Deep learning in histopathology: the path to the clinic. Nat. Med. 27 , 775–784 (2021).

Willemink, M. J. et al. Preparing medical imaging data for machine learning. Radiology 295 , 4–15 (2020).

Irvin, J. et al. CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. in Proceedings of the AAAI Conference on Artificial Intelligence vol. 33, 590–597 (2019).

Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 , 195 (2019).

DeGrave, A. J., Janizek, J. D. & Lee, S.-I. AI for radiographic COVID-19 detection selects shortcuts over signal. Nat. Mach. Intell. 3 , 610–619 (2021).

Cutillo, C. M. et al. Machine intelligence in healthcare: perspectives on trustworthiness, explainability, usability, and transparency. NPJ Digit. Med. 3 , 47 (2020).

Sendak, M. P., Gao, M., Brajer, N. & Balu, S. Presenting machine learning model information to clinical end users with model facts labels. NPJ Digit. Med. 3 , 41 (2020).

Saporta, A. et al. Deep learning saliency maps do not accurately highlight diagnostically relevant regions for medical image interpretation. Preprint at medRxiv https://doi.org/10.1101/2021.02.28.21252634 (2021).

Ehsan, U. et al . The who in explainable AI: how AI background shapes perceptions of AI explanations. Preprint at https://arxiv.org/abs/2107.13509 (2021).

Reyes, M. et al. On the interpretability of artificial intelligence in radiology: Challenges and opportunities. Radio. Artif. Intell. 2 , e190043 (2020).

Liu, C. et al . On the replicability and reproducibility of deep learning in software engineering. Preprint at https://arxiv.org/abs/2006.14244 (2020).

Beam, A. L., Manrai, A. K. & Ghassemi, M. Challenges to the reproducibility of machine learning models in health care. J. Am. Med. Assoc. 323 , 305–306 (2020).

Gerke, S., Babic, B., Evgeniou, T. & Cohen, I. G. The need for a system view to regulate artificial intelligence/machine learning-based software as medical device. NPJ Digit. Med. 3 , 53 (2020).

Lee, C. S. & Lee, A. Y. Clinical applications of continual learning machine learning. Lancet Digit. Health 2 , e279–e281 (2020).

Food and Drug Administration. Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD): Discussion Paper and Request for Feedback (FDA, 2019).

Morley, J. et al. The debate on the ethics of AI in health care: a reconstruction and critical review. SSRN http://dx.doi.org/10.2139/ssrn.3486518 (2019.

Price, W. N., Gerke, S. & Cohen, I. G. Potential liability for physicians using artificial intelligence. J. Am. Med. Assoc. 322 , 1765–1766 (2019).

Larson, D. B., Magnus, D. C., Lungren, M. P., Shah, N. H. & Langlotz, C. P. Ethics of using and sharing clinical imaging data for artificial intelligence: a proposed framework. Radiology 295 , 675–682 (2020).

Kaissis, G. A., Makowski, M. R., Rückert, D. & Braren, R. F. Secure, privacy-preserving and federated machine learning in medical imaging. Nat. Mach. Intell. 2 , 305–311 (2020).

Larrazabal, A. J., Nieto, N., Peterson, V., Milone, D. H. & Ferrante, E. Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. Proc. Natl Acad. Sci. USA 117 , 12592–12594 (2020).

Vyas, D. A., Eisenstein, L. G. & Jones, D. S. Hidden in plain sight: reconsidering the use of race correction in clinical algorithms. N. Engl. J. Med. 383 , 874–882 (2020).

Obermeyer, Z., Powers, B., Vogeli, C. & Mullainathan, S. Dissecting racial bias in an algorithm used to manage the health of populations. Science 366 , 447–453 (2019).

Cirillo, D. et al. Sex and gender differences and biases in artificial intelligence for biomedicine and healthcare. NPJ Digit. Med. 3 , 81 (2020).

Download references

Acknowledgements

We thank A. Tamkin and N. Phillips for their feedback. E.J.T. receives funding support from US National Institutes of Health grant UL1TR002550.

Author information

These authors contributed equally: Pranav Rajpurkar, Emma Chen, Oishi Banerjee.

Authors and Affiliations

Department of Biomedical Informatics, Harvard University, Cambridge, MA, USA

Pranav Rajpurkar

Department of Computer Science, Stanford University, Stanford, CA, USA

Emma Chen & Oishi Banerjee

Scripps Translational Science Institute, San Diego, CA, USA

Eric J. Topol

Contributions

P.R. and E.J.T. conceptualized this Review. E.C., O.B. and P.R. were responsible for the design and synthesis of this Review. All authors contributed to writing and editing the manuscript.

Corresponding author

Correspondence to Eric J. Topol.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Medicine thanks Despina Kontos and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Karen O’Leary was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rajpurkar, P., Chen, E., Banerjee, O. et al. AI in health and medicine. Nat Med 28, 31–38 (2022). https://doi.org/10.1038/s41591-021-01614-0

Received: 23 July 2021

Accepted: 05 November 2021

Published: 20 January 2022

Issue Date: January 2022

DOI: https://doi.org/10.1038/s41591-021-01614-0


  • Collections
  • Health in South Asia
  • Women’s, children’s & adolescents’ health
  • News and views
  • BMJ Opinion
  • Rapid responses
  • Editorial staff
  • BMJ in the USA
  • BMJ in South Asia
  • Submit your paper
  • BMA members
  • Subscribers
  • Advertisers and sponsors

Explore BMJ

  • Our company
  • BMJ Careers
  • BMJ Learning
  • BMJ Masterclasses
  • BMJ Journals
  • BMJ Student
  • Academic edition of The BMJ
  • BMJ Best Practice
  • The BMJ Awards
  • Email alerts
  • Activate subscription

Information

  • Research article
  • Open access
  • Published: 10 April 2021

The role of artificial intelligence in healthcare: a structured literature review

  • Silvana Secinaro 1 ,
  • Davide Calandra 1 ,
  • Aurelio Secinaro 2 ,
  • Vivek Muthurangu 3 &
  • Paolo Biancone 1  

BMC Medical Informatics and Decision Making volume 21, Article number: 125 (2021)

Background/Introduction

Artificial intelligence (AI) in the healthcare sector is receiving attention from researchers and health professionals. Few previous studies have investigated this topic from a multi-disciplinary perspective, including accounting, business and management, decision sciences and health professions.

Methods

The structured literature review, with its reliable and replicable research protocol, allowed the researchers to extract 288 peer-reviewed papers from Scopus. The authors used qualitative and quantitative variables to analyse authors, journals, keywords, and collaboration networks among researchers. Additionally, the paper benefited from the Bibliometrix R software package.

Results

The investigation showed that the literature in this field is emerging. It focuses on health services management, predictive medicine, patient data and diagnostics, and clinical decision-making. The United States, China, and the United Kingdom contributed the highest number of studies. Keyword analysis revealed that AI can support physicians in making a diagnosis, predicting the spread of diseases and customising treatment paths.

Conclusions

The literature reveals several AI applications for health services and a stream of research that has not fully been covered. For instance, AI projects require skills and data quality awareness for data-intensive analysis and knowledge-based management. Insights can help researchers and health professionals understand and address future research on AI in the healthcare field.

Artificial intelligence (AI) generally applies to computational technologies that emulate mechanisms assisted by human intelligence, such as thought, deep learning, adaptation, engagement, and sensory understanding [ 1 , 2 ]. Some devices can execute a role that typically involves human interpretation and decision-making [ 3 , 4 ]. These techniques have an interdisciplinary approach and can be applied to different fields, such as medicine and health. AI has been involved in medicine since as early as the 1950s, when physicians made the first attempts to improve their diagnoses using computer-aided programs [ 5 , 6 ]. Interest and advances in medical AI applications have surged in recent years due to the substantially enhanced computing power of modern computers and the vast amount of digital data available for collection and utilisation [ 7 ]. AI is gradually changing medical practice. There are several AI applications in medicine that can be used in a variety of medical fields, such as clinical, diagnostic, rehabilitative, surgical, and predictive practices. Another critical area of medicine where AI is making an impact is clinical decision-making and disease diagnosis. AI technologies can ingest, analyse, and report large volumes of data across different modalities to detect disease and guide clinical decisions [ 3 , 8 ]. AI applications can deal with the vast amount of data produced in medicine and find new information that would otherwise remain hidden in the mass of medical big data [ 9 , 10 , 11 ]. These technologies can also identify new drugs for health services management and patient care treatments [ 5 , 6 ].

The willingness to apply AI is visible through a search of the primary research databases. As Meskò et al. [ 7 ] find, the technology could reduce care costs and repetitive operations, allowing the medical profession to focus on critical thinking and clinical creativity. As Cho et al. and Doyle et al. [ 8 , 9 ] add, the AI perspective is exciting; however, new studies will be needed to establish the efficacy and applications of AI in the medical field [ 10 ].

Our paper will also concentrate on AI strategies for healthcare from the accounting, business, and management perspectives. The authors used the structured literature review (SLR) method for its reliable and replicable research protocol [ 11 ] and selected bibliometric variables as sources of investigation. Bibliometric usage enables the recognition of the main quantitative variables of the study stream [ 12 ]. This method facilitates the detection of the required details of a particular research subject, including field authors, number of publications, keywords for interaction between variables (policies, properties and governance) and country data [ 13 ]. It also allows the application of the science mapping technique [ 14 ]. Our paper adopted the Bibliometrix R package and the biblioshiny web interface as tools of analysis [ 14 ].

The investigation offers the following insights for future researchers and practitioners:

  • Bibliometric information on 288 peer-reviewed English papers from the Scopus collection.
  • Identification of leading journals in this field, such as Journal of Medical Systems, Studies in Health Technology and Informatics, IEEE Journal of Biomedical and Health Informatics, and Decision Support Systems.
  • Qualitative and quantitative information on authors, including Lotka’s law, h-index, g-index, m-index, keyword, and citation data.
  • Research on specific countries to assess AI in the delivery and effectiveness of healthcare, citations, and networks within each region.
  • A topic dendrogram study that identifies five research clusters: health services management, predictive medicine, patient data, diagnostics, and finally, clinical decision-making.
  • An in-depth discussion that develops theoretical and practical implications for future studies.

The paper is organised as follows. Section  2 lists the main bibliometric articles in this field. Section  3 elaborates on the methodology. Section  4 presents the findings of the bibliometric analysis. Section  5 discusses the main elements of AI in healthcare based on the study results. Section  6 concludes the article with future implications for research.

Related works and originality

As suggested by Zupic and Čater [ 15 ], a research stream can be evaluated with bibliometric methods that can introduce objectivity and mitigate researcher bias. For this reason, bibliometric methods are attracting increasing interest among researchers as a reliable and impersonal research analytical approach [ 16 , 17 ]. Recently, bibliometrics has been an essential method for analysing and predicting research trends [ 18 ]. Table  1 lists other research that has used a similar approach in the research stream investigated.

The scientific articles reported show substantial differences in keywords and research topics that have been previously studied. The bibliometric analysis of Huang et al. [ 19 ] describes rehabilitative medicine using virtual reality technology. According to the authors, the primary goal of rehabilitation is to enhance and restore functional ability and quality of life for patients with physical impairments or disabilities. In recent years, many healthcare disciplines have been privileged to access various technologies that provide tools for both research and clinical intervention.

Hao et al. [ 20 ] focus on text mining in medical research. As reported, text mining reveals new, previously unknown information by using a computer to automatically extract information from different text resources. Text mining methods can be regarded as an extension of data mining to text data. Text mining is playing an increasingly significant role in processing medical information. Similarly, the studies by dos Santos et al. [ 21 ] focus on applying data mining and machine learning (ML) techniques to public health problems. As stated in this research, public health may be defined as the art and science of preventing diseases, promoting health, and prolonging life. Using data mining and ML techniques, it is possible to discover new information that otherwise would be hidden. These two studies are related to another topic: medical big data. According to Liao et al. [ 22 ], big data is a typical “buzzword” in the business and research community, referring to a great mass of digital data collected from various sources. In the medical field, we can obtain a vast amount of data (i.e., medical big data). Data mining and ML techniques can help deal with this information and provide helpful insights for physicians and patients. More recently, Choudhury et al. [ 23 ] provide a systematic review on the use of ML to improve the care of elderly patients, demonstrating eligible studies primarily in psychological disorders and eye diseases.

Tran et al. [ 2 ] focus on the global evolution of AI research in medicine. Their bibliometric analysis highlights trends and topics related to AI applications and techniques. As stated in Connelly et al.’s [ 24 ] study, robot-assisted surgeries have rapidly increased in recent years. Their bibliometric analysis demonstrates how robotic-assisted surgery has gained acceptance in different medical fields, such as urological, colorectal, cardiothoracic, orthopaedic, maxillofacial and neurosurgery applications. Additionally, the bibliometric analysis of Guo et al. [ 25 ] provides an in-depth study of AI publications through December 2019. The paper focuses on tangible AI health applications, giving researchers an idea of how algorithms can help doctors and nurses. A new stream of research related to AI is also emerging. In this sense, Choudhury and Asan’s [ 26 ] scientific contribution provides a systematic review of the AI literature to identify health risks for patients. They report on 53 studies involving technology for clinical alerts, clinical reports, and drug safety. Considering the considerable interest within this research stream, this analysis differs from the current literature for several reasons. It aims to provide in-depth discussion, considering mainly the business, management, and accounting fields and not dealing only with medical and health profession publications.

Additionally, our analysis aims to provide a bibliometric analysis of variables such as authors, countries, citations and keywords to guide future research perspectives for researchers and practitioners, as similar analyses have done for several publications in other research streams [ 15 , 16 , 27 ]. In doing so, we use a different database, Scopus, that is typically adopted in social sciences fields. Finally, our analysis will propose and discuss a dominant framework of variables in this field, and our analysis will not be limited to AI application descriptions.

Methodology

This paper evaluated AI in healthcare research streams using the SLR method [ 11 ]. As suggested by Massaro et al. [ 11 ], an SLR enables the study of the scientific corpus of a research field, including the scientific rigour, reliability and replicability of operations carried out by researchers. As suggested by many scholars, the methodology uses qualitative and quantitative variables to highlight the most prominent authors, journals and keywords, combining a systematic literature review with bibliometric analysis [ 27 , 28 , 29 , 30 ]. Although most widely used in business and management [ 16 , 31 ], the SLR is also used in the health sector, following the same philosophy with which it was originally conceived [ 32 , 33 ]. A methodological analysis of previously published articles reveals that the most frequently used steps are as follows [ 28 , 31 , 34 ]:

defining research questions;

writing the research protocol;

defining the research sample to be analysed;

developing codes for analysis; and

critically analysing, discussing, and identifying a future research agenda.

Considering the above premises, the authors believe that an SLR is the best method because it combines scientific validity, replicability of the research protocol and connection between multiple inputs.

As stated by the methodological paper, the first step is research question identification. For this purpose, we benefit from the analysis of Zupic and Čater [ 15 ], who provide several research questions for future researchers to link the study of authors, journals, keywords and citations. Therefore, RQ1 is “What are the most prominent authors, journal keywords and citations in the field of the research study?” Additionally, as suggested by Haleem et al. [ 35 ], new technologies, including AI, are changing the medical field in unexpected timeframes, requiring studies in multiple areas. Therefore, RQ2 is “How does artificial intelligence relate to healthcare, and what is the focus of the literature?” Then, as discussed by Massaro et al. [ 36 ], RQ3 is “What are the research applications of artificial intelligence for healthcare?”.

The first research question aims to define the qualitative and quantitative variables of the knowledge flow under investigation. The second research question seeks to determine the state of the art and applications of AI in healthcare. Finally, the third research question aims to help researchers identify practical and theoretical implications and future research ideas in this field.

The second fundamental step of the SLR is writing the research protocol [ 11 ]. Table  2 indicates the currently known literature elements, uniquely identifying the research focus, motivations and research strategy adopted and the results providing a link with the following points. Additionally, to strengthen the analysis, our investigation benefits from the PRISMA statement methodological article [ 37 ]. Although the SLR is a validated method for systematic reviews and meta-analyses, we believe that the workflow provided may benefit the replicability of the results [ 37 , 38 , 39 , 40 ]. Figure  1 summarises the researchers’ research steps, indicating that there are no results that can be referred to as a meta-analysis.

Figure 1. PRISMA workflow. Source: Authors’ elaboration on Liberati et al. [ 37 ].

The third step is to specify the search strategy and search database. Our analysis is based on the search string “Artificial Intelligence” OR “AI” AND “Healthcare” with a focus on “Business, Management, and Accounting”, “Decision Sciences”, and “Health professions”. As suggested by [ 11 , 41 ] and motivated by [ 42 ], keywords can be selected through a top-down approach by identifying a large search field and then focusing on particular sub-topics. The paper uses data retrieved from the Scopus database, a multi-disciplinary database, which allowed the researchers to identify critical articles for scientific analysis [ 43 ]. Additionally, Scopus was selected based on Guo et al.’s [ 25 ] limitations, which suggest that “future studies will apply other databases, such as Scopus, to explore more potential papers” . The research focuses on articles and reviews published in peer-reviewed journals for their scientific relevance [ 11 , 16 , 17 , 29 ] and does not include the grey literature, conference proceedings or books/book chapters. Articles written in any language other than English were excluded [ 2 ]. For transparency and replicability, the analysis was conducted on 11 January 2021. Using this research strategy, the authors retrieved 288 articles. To strengthen the study's reliability, we publicly provide the full bibliometric extract on the Zenodo repository [ 44 , 45 ].
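As an illustration only, the inclusion and exclusion criteria described above can be sketched in Python. The `Record` fields and the `select` helper are hypothetical and do not reflect the Scopus export format. The sketch also assumes that the search string is read as ("Artificial Intelligence" OR "AI") AND "Healthcare", which the text does not state explicitly.

```python
from dataclasses import dataclass

# Hypothetical record layout; a real Scopus export uses different field names.
@dataclass
class Record:
    title: str
    keywords: tuple
    subject_area: str
    doc_type: str
    language: str

AI_TERMS = {"artificial intelligence", "ai"}
SUBJECT_AREAS = {
    "business, management, and accounting",
    "decision sciences",
    "health professions",
}
DOC_TYPES = {"article", "review"}  # grey literature, proceedings and books are excluded


def matches(record):
    """Return True if a record satisfies the criteria described in the text."""
    keywords = {k.strip().lower() for k in record.keywords}
    # Assumed grouping: (AI term) AND healthcare.
    has_ai = bool(keywords & AI_TERMS)
    has_healthcare = "healthcare" in keywords
    return (
        has_ai
        and has_healthcare
        and record.subject_area.lower() in SUBJECT_AREAS
        and record.doc_type.lower() in DOC_TYPES
        and record.language.lower() == "english"
    )


def select(records):
    """Keep only the records that match, preserving their order."""
    return [r for r in records if matches(r)]
```

Matching on whole keywords rather than substrings avoids counting words that merely contain the letters "ai".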

The fourth research phase is defining the code framework that initiates the analysis of the variables. The study will identify the following:

descriptive information of the research area;

source analysis [ 16 ];

author and citation analysis [ 28 ];

keywords and network analysis [ 14 ]; and

geographic distribution of the papers [ 14 ].

The final research phase is the article’s discussion and conclusion, where implications and future research trends will be identified.

At the research team level, the information is analysed with the statistical software R-Studio and the Bibliometrix package [ 15 ], which allows scientific analysis of the results obtained through the multi-disciplinary database.

The analysis of bibliometric results starts with a description of the main bibliometric statistics with the aim of answering RQ1, What are the most prominent authors, journal keywords and citations in the field of the research study?, and RQ2, How does artificial intelligence relate to healthcare, and what is the focus of the literature? Therefore, the following elements were thoroughly analysed: (1) type of document; (2) annual scientific production; (3) scientific sources; (4) source growth; (5) number of articles per author; (6) author’s dominance ranking; (7) author’s h-index, g-index, and m-index; (8) author’s productivity; (9) author’s keywords; (10) topic dendrogram; (11) a factorial map of the document with the highest contributions; (12) article citations; (13) country production; (14) country citations; (15) country collaboration map; and (16) country collaboration network.

Main information

Table  3 shows the information on 288 peer-reviewed articles published between 1992 and January 2021 extracted from the Scopus database. The articles come from 136 sources and contain 946 keywords; the number of keywords plus, referring to the number of keywords that frequently appear in an article’s title, was 2329. The analysis period covered 28 years and 1 month of scientific production and included an annual growth rate of 5.12%. However, the most significant increase in published articles occurred in the past three years (please see Fig.  2 ). On average, each article had 3.56 authors. Finally, the collaboration index (CI), which was calculated as the total number of authors of multi-authored articles/total number of multi-authored articles, was 3.97 [ 46 ].
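The descriptive indicators above can be illustrated with a minimal Python sketch. The function names and the toy data are hypothetical. The collaboration index follows the definition given in the text, whereas the growth-rate function assumes a compound annual growth rate, which may differ from the formula used by the Bibliometrix package.

```python
def authors_per_article(author_lists):
    """Mean number of authors over all articles."""
    return sum(len(authors) for authors in author_lists) / len(author_lists)


def collaboration_index(author_lists):
    """Total authors of multi-authored articles / number of multi-authored articles."""
    multi = [authors for authors in author_lists if len(authors) > 1]
    if not multi:
        return 0.0
    return sum(len(authors) for authors in multi) / len(multi)


def compound_annual_growth_rate(first_count, last_count, n_years):
    """Percentage growth per year, assuming compound growth (an assumption)."""
    return ((last_count / first_count) ** (1 / n_years) - 1) * 100


# Toy corpus: one single-authored and two multi-authored articles.
corpus = [["A"], ["B", "C"], ["D", "E", "F", "G"]]
ci = collaboration_index(corpus)  # (2 + 4) / 2
```

Single-authored articles are left out of the CI by construction, which is why the CI (3.97) is higher than the overall mean number of authors per article (3.56).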

Figure 2. Annual scientific production. Source: Authors’ elaboration.

Table  4 shows the top 20 sources related to the topic. The Journal of Medical Systems is the most relevant source, with twenty-one published articles. This journal's main topics are the foundations, functionality, interfaces, implementation, impacts, and evaluation of medical technologies. Another relevant source is Studies in Health Technology and Informatics, with eleven articles. This journal aims to extend scientific knowledge related to biomedical technologies and medical informatics research. Based on recent publications, both journals deal with cloud computing, machine learning, and AI as a disruptive healthcare paradigm. The IEEE Journal of Biomedical and Health Informatics investigates technologies in health care, life sciences, and biomedicine applications from a broad perspective. The next journal, Decision Support Systems, aims to analyse how these technologies support decision-making from a multi-disciplinary view, considering business and management. Therefore, the analysis of the journals revealed that we are dealing with an interdisciplinary research field. This conclusion is confirmed, for example, by the presence of purely medical journals, journals dedicated to the technological growth of healthcare, and journals with a long-term perspective such as Futures.

The distribution frequency of the articles (Fig.  3 ) indicates the journals dealing with the topic and related issues. Between 2008 and 2012, a significant growth in the number of publications on the subject is noticeable. Note, however, that the graph shows the results of a Loess regression, which takes the number of articles and the publication year of each journal under analysis as variables. This method allows the fitted function to be unbounded; that is, it can take values below zero when the data are close to zero. It contributes to a better visual result and highlights the discontinuity in the publication periods [ 47 ].

Figure 3. Source growth. Source: Authors’ elaboration.

Finally, Fig.  4 provides an analytical perspective on factor analysis for the most cited papers. As indicated in the literature [ 48 , 49 ], using factor analysis to discover the most cited papers allows for a better understanding of the scientific world’s intellectual structure. For example, our research makes it possible to consider certain publications that effectively analyse subject specialisation. For instance, Santosh’s [ 50 ] article addresses the new paradigm of AI with ML algorithms for data analysis and decision support in the COVID-19 period, setting a benchmark in terms of citations by researchers. Moving on to the application, an article by Shickel et al. [ 51 ] begins with the belief that the healthcare world currently has much health and administrative data. In this context, AI and deep learning will support medical and administrative staff in extracting data, predicting outcomes, and learning medical representations. Finally, in the same line of research, Baig et al. [ 52 ], with a focus on wearable patient monitoring systems (WPMs), conclude that AI and deep learning may be landmarks for continuous patient monitoring and support for healthcare delivery.

Figure 4. Factorial map of the most cited documents.

This section identifies the most cited authors of articles on AI in healthcare. It also identifies the authors’ keywords, dominance factor (DF) ranking, h-index, productivity, and total number of citations. Table  5 identifies the authors and their publications in the top 20 rankings. As the table shows, Bushko R.G. has the highest number of publications: four papers. He is the editor-in-chief of Future of Health Technology, a scientific journal that aims to develop a clear vision of the future of health technology. Then, several authors each wrote three papers. For instance, Liu C. is a researcher active in the topic of ML and computer vision, and Sharma A. from Emory University Atlanta in the USA is a researcher with a clear focus on imaging and translational informatics. Some other authors have two publications each. While some authors have published as primary authors, most have published as co-authors. Hence, in the next section, we measure the contributory power of each author by investigating the DF ranking through the number of elements.

Authors’ dominance ranking

The dominance factor (DF) is a ratio measuring the fraction of multi-authored articles in which an author acts as the first author [ 53 ]. Several bibliometric studies use the DF in their analyses [ 46 , 54 ]. The DF ranking calculates an author’s dominance in producing articles. The DF is calculated by dividing the number of an author’s multi-authored papers as the first author (Nmf) by the author's total number of multi-authored papers (Nmt). This is omitted in the single-author case due to the constant value of 1 for single-authored articles. This formulation could lead to some distortions in the results, especially in fields where the first author is entered by surname alphabetical order [ 55 ].

The mathematical equation for the DF is:

\[ DF = \frac{N_{mf}}{N_{mt}} \]

where \(N_{mf}\) and \(N_{mt}\) are the counts Nmf and Nmt defined above.

Table  6 lists the top 20 DF rankings. The data in the table show a low level of articles per author, either for first-authored or multi-authored articles. The results demonstrate that we are dealing with an emerging topic in the literature. Additionally, as shown in the table, Fox J. and Longoni C. are the most dominant authors in the field.
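A minimal sketch of the DF calculation, following the definition given above, is shown below. The function name and the author labels are hypothetical, and the toy data are not taken from Table 6.

```python
from collections import defaultdict


def dominance_factors(author_lists):
    """Return {author: Nmf / Nmt} computed over multi-authored articles only.

    Nmf: multi-authored articles with the author in first position.
    Nmt: all multi-authored articles that include the author.
    Authors who only have single-authored articles are omitted.
    """
    n_mf = defaultdict(int)
    n_mt = defaultdict(int)
    for authors in author_lists:
        if len(authors) < 2:
            continue  # single-authored articles are excluded by definition
        n_mf[authors[0]] += 1
        for name in set(authors):
            n_mt[name] += 1
    return {name: n_mf.get(name, 0) / total for name, total in n_mt.items()}


# Toy data with hypothetical author labels.
papers = [
    ["Author A", "Author B"],
    ["Author B", "Author A", "Author C"],
    ["Author A", "Author C"],
    ["Author D"],
]
df = dominance_factors(papers)
```

In this toy example, Author A is first author on two of three multi-authored articles, giving a DF of about 0.67, while Author C is never first author and has a DF of 0.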

Authors’ impact

Table  7 shows the impact of authors in terms of the h-index [ 56 ] (i.e., the productivity and citation impact of a researcher), g-index [ 57 ] (i.e., the distribution of citations received by a researcher's publications), m-index [ 58 ] (i.e., the h-index value per year), total citations, total papers and years of scientific publication. The h-index was introduced in the literature as a metric for the objective comparison of scientific results and depends on the number of publications and their impact [ 59 ]. The results show that the 20 most relevant authors have an h-index between 1 and 2. For the practical interpretation of the data, the authors considered data published by the London School of Economics [ 60 ]. In the social sciences, that analysis shows values of 7.6 for economic publications by professors and researchers who had been active for several years. The low values found here therefore suggest that the youthfulness of the research area has attracted young researchers and professors. At the same time, new indicators have emerged over the years to diversify the logic of the h-index. For example, the g-index gives more weight to highly cited articles, considering that a single article can generate a large share of an author's citations. The m-index, on the other hand, divides the h-index by the number of years since the author's first publication.

The analysis, also considering the total number of citations, the number of papers published and the year of starting to publish, thus confirms that we are facing an expanding research flow.
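The three indicators can be illustrated with a short Python sketch based on their standard definitions. The function names are hypothetical, the citation counts are toy values, and the number of active years is passed in explicitly because conventions for counting it differ between tools.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g (at most the number of papers) such that the top g papers
    together have at least g squared citations."""
    ranked = sorted(citations, reverse=True)
    g = 0
    running_total = 0
    for rank, cites in enumerate(ranked, start=1):
        running_total += cites
        if running_total >= rank * rank:
            g = rank
    return g


def m_index(citations, years_active):
    """h-index divided by the number of years since the first publication."""
    return h_index(citations) / years_active


# Toy citation counts for one hypothetical author.
cites = [10, 8, 5, 4, 3]
```

For these toy counts the h-index is 4, because four papers have at least four citations each, and the g-index is 5, because the five papers together have 30 citations, which is at least 25.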

Authors’ productivity

Figure  5 shows Lotka’s law. This mathematical formulation originated in 1926 to describe the publication frequency by authors in a specific research field [ 61 ]. In practice, the law states that the number of authors making a given number of contributions in a given period is a fraction of the number who make a single contribution [ 14 , 61 ].

Figure 5. Lotka’s law.

The mathematical relationship is expressed as an inverse power law in the following way:

\[ y_x = \frac{C}{x^n} \]

where \(y_x\) is the number of authors producing x articles in each research field, and C and n are constants that can be estimated from the data.

The figure's results are in line with Lotka's results, with an average of two publications per author in a given research field. In addition, the figure shows the percentage of authors. Our results lead us to state that we are dealing with a young and growing research field, even with this analysis. Approximately 70% of the authors had published only their first research article. Only approximately 20% had published two scientific papers.
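One simple way to estimate the constants C and n is an ordinary least-squares fit on the log-transformed relationship log y_x = log C - n log x. The sketch below takes that approach as an assumption; it is not necessarily the estimation procedure used by the Bibliometrix package, and the function name and data are hypothetical.

```python
import math


def fit_lotka(author_counts):
    """Estimate (C, n) in y_x = C / x**n by least squares in log-log space.

    author_counts maps x (articles per author) to y_x (number of authors).
    Entries with zero authors are skipped because log(0) is undefined.
    """
    points = [
        (math.log(x), math.log(y)) for x, y in author_counts.items() if y > 0
    ]
    k = len(points)
    mean_x = sum(lx for lx, _ in points) / k
    mean_y = sum(ly for _, ly in points) / k
    sxx = sum((lx - mean_x) ** 2 for lx, _ in points)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in points)
    slope = sxy / sxx
    n = -slope  # the slope in log-log space is -n
    c = math.exp(mean_y - slope * mean_x)
    return c, n


# Synthetic data generated from y_x = 100 / x**2, so the fit should recover it.
synthetic = {x: 100 / x**2 for x in (1, 2, 3, 4)}
c_hat, n_hat = fit_lotka(synthetic)
```

An exponent n close to 2 corresponds to Lotka's original formulation; at least two distinct values of x are needed for the fit to be defined.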

Authors’ keywords

This section provides information on the relationship between the keywords artificial intelligence and healthcare . This analysis is essential to determine the research trend, identify gaps in the discussion on AI in healthcare, and identify the fields that can be interesting as research areas [ 42 , 62 ].

Table  8 highlights the total number of keywords per author in the top 20 positions. The ranking is based on the following elements: healthcare, artificial intelligence, and clinical decision support system . Keyword analysis confirms the scientific area of reference. In particular, we deduce the definition as “Artificial intelligence is the theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages” [ 2 , 63 ]. Panch et al. [ 4 ] find that these technologies can be used in different business and management areas. After the first keyword, the analysis reveals AI applications and related research such as machine learning and deep learning.

Additionally, data mining and big data are a step forward in implementing exciting AI applications. According to our specific interest, if we applied AI in healthcare, we would achieve technological applications to help and support doctors and medical researchers in decision-making. The link between AI and decision-making is the reason why we find, in the seventh position, the keyword clinical decision support system . AI techniques can unlock clinically relevant information hidden in the massive amount of data that can assist clinical decision-making [ 64 ]. If we analyse the following keywords, we find other elements related to decision-making and support systems.
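The keyword ranking and the links between keywords discussed above can be illustrated by counting keyword frequencies and pairwise co-occurrences. This is a minimal sketch with hypothetical function names and toy keyword lists; it does not reproduce the Bibliometrix implementation.

```python
from collections import Counter
from itertools import combinations


def normalise(keywords):
    """Lower-case and de-duplicate the keywords of one article."""
    return {k.strip().lower() for k in keywords}


def keyword_frequencies(keyword_lists):
    """Number of articles in which each keyword appears."""
    counts = Counter()
    for keywords in keyword_lists:
        counts.update(normalise(keywords))
    return counts


def keyword_cooccurrence(keyword_lists):
    """Number of articles in which each unordered keyword pair appears together."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Sorting makes each pair appear in one canonical order.
        pairs.update(combinations(sorted(normalise(keywords)), 2))
    return pairs


# Toy author-keyword lists for three hypothetical articles.
docs = [
    ["Artificial Intelligence", "Healthcare"],
    ["healthcare", "machine learning", "artificial intelligence"],
    ["Healthcare", "clinical decision support system"],
]
freq = keyword_frequencies(docs)
cooc = keyword_cooccurrence(docs)
```

A co-occurrence count of this kind is the usual starting point for keyword networks and for the hierarchical clustering behind a topic dendrogram.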

The TreeMap below (Fig.  6 ) highlights the combination of possible keywords representing AI and healthcare.

Figure 6. Keywords treemap.

The topic dendrogram in Fig.  7 represents the hierarchical order and the relationship between the keywords generated by hierarchical clustering [ 42 ]. The cut in the figure and the vertical lines facilitate an investigation and interpretation of the different clusters. As stated by Andrews [ 48 ], the figure is not intended to find the perfect level of associations between clusters. However, it aims to estimate the approximate number of clusters to facilitate further discussion.

Figure 7. Topic dendrogram.

The research stream of AI in healthcare is divided into two main strands. The blue strand focuses on medical information systems and the internet. Some papers are related to healthcare organisations, such as the Internet of Things, meaning that healthcare organisations use AI to support health services management and data analysis. AI applications are also used to improve diagnostic and therapeutic accuracy and the overall clinical treatment process [ 2 ]. If we consider the second block, the red one, three different clusters highlight separate aspects of the topic. The first could be explained as AI and ML predictive algorithms. Through AI applications, it is possible to obtain a predictive approach that can ensure that patients are better monitored. This also allows a better understanding of risk perception for doctors and medical researchers. In the second cluster, the most frequent words are decisions , information system , and support system . This means that AI applications can support doctors and medical researchers in decision-making. Information coming from AI technologies can be used to consider difficult problems and support a more straightforward and rapid decision-making process. In the third cluster, it is vital to highlight that the ML model can deal with vast amounts of data. From those inputs, it can return outcomes that can optimise the work of healthcare organisations and scheduling of medical activities.

Furthermore, the word cloud in Fig.  8 highlights aspects of AI in healthcare, such as decision support systems, decision-making, health services management, learning systems, ML techniques and diseases. The figure depicts how AI is linked to healthcare and how it is used in medicine.

Figure 8. Word cloud.

Figure 9 represents the search trends based on the keywords analysed. The research started in 2012. First, it identified research topics related to clinical decision support systems. This topic was recurrent during the following years. Interestingly, in 2018, studies investigated AI and natural language processing as possible tools to manage patients and administrative elements. Finally, a new research stream considers AI's role in fighting COVID-19 [ 65 , 66 ].

Figure 9. Keywords frequency.

Table 9 represents the number of citations from other articles within the top 20 rankings. The analysis allows the benchmark studies in the field to be identified [ 48 ]. For instance, Burke et al. [ 67 ] wrote the most cited paper, which analyses efficient nurse rostering methodologies and critically evaluates tangible interdisciplinary solutions that also include AI. The second most cited article, by Ahmed and Alkhamis [ 68 ], proposes a data-driven optimisation methodology to determine the optimal number of healthcare staff to optimise patients' productivity. Finally, the third most cited article lays the groundwork for developing deep learning by considering diverse health and administrative information [ 51 ].

This section analyses the diffusion of AI in healthcare around the world. It highlights countries to show the geographies of this research. It includes all published articles, the total number of citations, and the collaboration network. The following sub-sections start with an analysis of the total number of published articles.

Country total articles

Figure 10 and Table 10 display the countries where AI in healthcare has been considered. The USA tops the list of countries with the maximum number of articles on the topic (215). It is followed by China (83), the UK (54), India (51), Australia (54), and Canada (32). It is immediately evident that the theme has developed on different continents, highlighting a growing interest in AI in healthcare. The figure shows that many areas, such as Russia, Eastern Europe and Africa (except for Algeria, Egypt, and Morocco), have still not engaged in this scientific debate.

Country publications and collaboration map

This section discusses articles on AI in healthcare in terms of single or multiple publications in each country. It also aims to observe collaboration and networking between countries. Table 11 and Fig. 10 highlight the average citations by country and show that the UK, the USA, and Kuwait have a higher average number of citations than other countries. Italy, Spain and New Zealand have the most significant number of citations.

Figure 10. Articles per country.

Figure  11 depicts global collaborations. The blue colour on the map represents research cooperation among nations. Additionally, the pink border linking states indicates the extent of collaboration between authors. The primary cooperation between nations is between the USA and China, with two collaborative articles. Other collaborations among nations are limited to a few papers.

Figure 11. Collaboration map.

Artificial intelligence for healthcare: applications

This section aims to strengthen the research scope by answering RQ3: What are the research applications of artificial intelligence for healthcare?

Benefiting from the topical dendrogram, researchers will provide a development model based on four relevant variables [ 69 , 70 ]. AI has been a disruptive innovation in healthcare [ 4 ]. With its sophisticated algorithms and several applications, AI has assisted doctors and medical professionals in the domains of health information systems, geocoding health data, epidemic and syndromic surveillance, predictive modelling and decision support, and medical imaging [ 2 , 9 , 10 , 64 ]. Furthermore, the researchers considered the bibliometric analysis to identify four macro-variables dominant in the field and used them as authors' keywords. Therefore, the following sub-sections aim to explain the debate on applications in healthcare for AI techniques. These elements are shown in Fig.  12 .

Figure 12. Dominant variables for AI in healthcare.

Health services management

One of the notable aspects of AI techniques is potential support for comprehensive health services management. These applications can support doctors, nurses and administrators in their work. For instance, an AI system can provide health professionals with constant, possibly real-time medical information updates from various sources, including journals, textbooks, and clinical practices [ 2 , 10 ]. These applications' strength is becoming even more critical in the COVID-19 period, during which information exchange is continually needed to properly manage the pandemic worldwide [ 71 ]. Other applications involve coordinating information tools for patients and enabling appropriate inferences for health risk alerts and health outcome prediction [ 72 ]. AI applications allow, for example, hospitals and all health services to work more efficiently for the following reasons:

Clinicians can access data immediately when they need it.

Nurses can ensure better patient safety while administering medication.

Patients can stay informed and engaged in their care by communicating with their medical teams during hospital stays.

Additionally, AI can contribute to optimising logistics processes, for instance, by supplying drugs and equipment through a just-in-time system based entirely on predictive algorithms [ 73 , 74 ]. Interesting applications can also support the training of personnel working in health services. This evidence could be helpful in bridging the gap between urban and rural health services [ 75 ]. Finally, health services management could benefit from AI to leverage the multiplicity of data in electronic health records by predicting data heterogeneity across hospitals and outpatient clinics, checking for outliers, performing clinical tests on the data, unifying patient representation, improving future models that can predict diagnostic tests and analyses, and creating transparency with benchmark data for analysing services delivered [ 51 , 76 ].

Predictive medicine

Another relevant topic is AI applications for disease prediction, diagnosis and treatment, outcome prediction and prognosis evaluation [ 72 , 77 ]. Because AI can identify meaningful relationships in raw data, it can support diagnosis, treatment and outcome prediction in many medical situations [ 64 ]. It allows medical professionals to embrace the proactive management of disease onset. Additionally, predictions are possible for identifying risk factors and drivers for each patient to help target healthcare interventions for better outcomes [ 3 ]. AI techniques can also help design and develop new drugs, monitor patients and personalise patient treatment plans [ 78 ]. Doctors benefit from having more time and concise data to make better patient decisions. Automatic learning through AI could disrupt medicine, allowing prediction models to be created for drugs and exams that monitor patients over their whole lives [ 79 ].

Clinical decision-making

One of the main topics to emerge from the keyword analysis is that AI applications could support doctors and medical researchers in the clinical decision-making process. According to Jiang et al. [ 64 ], AI can help physicians make better clinical decisions or even replace human judgement in healthcare-specific functional areas. According to Bennett and Hauser [ 80 ], algorithms can benefit clinical decisions by accelerating the process and the amount of care provided, positively impacting the cost of health services. Therefore, AI technologies can support medical professionals in their activities and simplify their jobs [ 4 ]. Finally, as Redondo and Sandoval [ 81 ] find, algorithmic platforms can provide virtual assistance to help doctors understand the semantics of language and learn to solve business process queries as a human being would.

Patient data and diagnostics

Another challenging topic related to AI applications is patient data and diagnostics. AI techniques can help medical researchers deal with the vast amount of data from patients (i.e., medical big data ). AI systems can manage data generated from clinical activities, such as screening, diagnosis, and treatment assignment. In this way, health personnel can learn similar subjects and associations between subject features and outcomes of interest [ 64 ].

These technologies can analyse raw data and provide helpful insights that can be used in patient treatments. They can also help doctors in the diagnostic process; for example, a high-speed body scan makes it simpler to obtain an overall image of a patient's condition. AI technology can then recreate a 3D map of the patient's body.

In terms of data, interesting research perspectives are emerging. For instance, we observed the emergence of a stream of research on patient data management and protection related to AI applications [ 82 ].

For diagnostics, AI techniques can make a difference in rehabilitation therapy and surgery. Numerous robots have been designed to support and manage such tasks. Rehabilitation robots physically support and guide, for example, a patient’s limb during motor therapy [ 83 ]. For surgery, AI has a vast opportunity to transform surgical robotics through devices that can perform semi-automated surgical tasks with increasing efficiency. The final aim of this technology is to automate procedures to negate human error while maintaining a high level of accuracy and precision [ 84 ]. Finally, the COVID-19 period has led to increased remote patient diagnostics through telemedicine that enables remote observation of patients and provides physicians and nurses with support tools [ 66 , 85 , 86 ].

This study aims to provide a bibliometric analysis of publications on AI in healthcare, focusing on accounting, business and management, decision sciences and health profession studies. Using the SLR method of Massaro et al. [ 11 ], we provide a reliable and replicable research protocol for future studies in this field. Additionally, we investigate the trend of scientific publications on the subject, unexplored information, future directions, and implications using the science mapping workflow. Our analysis provides interesting insights.

In terms of bibliometric variables, the four leading journals, Journal of Medical Systems , Studies in Health Technology and Informatics , IEEE Journal of Biomedical and Health Informatics , and Decision Support Systems , are optimal locations for the publication of scientific articles on this topic. These journals deal mainly with healthcare, medical information systems, and applications such as cloud computing, machine learning, and AI. Additionally, in terms of h-index, Bushko R.G. and Liu C. are the most productive and impactful authors in this research stream. Burke et al.’s [ 67 ] contribution is the most cited with an analysis of nurse rostering using new technologies such as AI. Finally, in terms of keywords, co-occurrence reveals some interesting insights. For instance, researchers have found that AI has a role in diagnostic accuracy and helps in the analysis of health data by comparing thousands of medical records, experiencing automatic learning with clinical alerts, efficient management of health services and places of care, and the possibility of reconstructing patient history using these data.
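Because the author ranking above relies on the h-index, a short worked example may help readers unfamiliar with the measure. The sketch below uses the standard definition (the largest h such that h of an author's papers have at least h citations each); the citation counts are hypothetical and do not correspond to any author named in this study.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the paper at this rank still has enough citations
        else:
            break  # every later paper has even fewer citations
    return h


# Hypothetical citation counts for one author's six papers.
example_author = [25, 8, 5, 3, 3, 1]
print(h_index(example_author))
```

In this example only three papers have at least three citations each, so the author's h-index is 3 even though one paper is highly cited. This shows why the index rewards sustained impact across several papers over a single influential contribution.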

Second, this paper finds five cluster analyses in healthcare applications: health services management, predictive medicine, patient data, diagnostics, and finally, clinical decision-making. These technologies can also contribute to optimising logistics processes in health services and allowing a better allocation of resources.

Third, the authors analysing the research findings and the issues under discussion strongly support AI's role in decision support. These applications, however, are demonstrated by creating a direct link to data quality management and the technology awareness of health personnel [ 87 ].

The importance of data quality for the decision-making process

Several authors have analysed AI in the healthcare research stream, but in this case, the authors focus on other literature that includes business and decision-making processes. In this regard, the analysis of the search flow reveals a double view of the literature. On the one hand, some contributions belong to the positivist literature and embrace future applications and implications of technology for health service management, data analysis and diagnostics [ 6 , 80 , 88 ]. On the other hand, some investigations also aim to understand the darker sides of technology and its impact. For example, as Carter [ 89 ] states, the impact of AI is multi-sectoral; its development, however, calls for action to protect personal data. Similarly, Davenport and Kalakota [ 77 ] focus on the ethical implications of using AI in healthcare. According to the authors, intelligent machines raise issues of accountability, transparency, and permission, especially in automated communication with patients. Our analysis does not indicate a marked strand of the literature; therefore, we argue that the discussion of elements such as the transparency of technology for patients is essential for the development of AI applications.

A large part of our results shows that, at the application level, AI can be used to improve medical support for patients (Fig.  11 ) [ 64 , 82 ]. However, we believe that, as indicated by Kalis et al. [ 90 ] on the pages of Harvard Business Review, the management of costly back-office problems should also be addressed.

The potential of algorithms includes data analysis. There is an immense quantity of data accessible now, which carries the possibility of providing information about a wide variety of medical and healthcare activities [ 91 ]. With the advent of modern computational methods, computer learning and AI techniques, there are numerous possibilities [ 79 , 83 , 84 ]. For example, AI makes it easier to turn data into concrete and actionable observations to improve decision-making, deliver high-quality patient treatment, adapt to real-time emergencies, and save more lives on the clinical front. In addition, AI makes it easier to leverage capital to develop systems and facilities and reduce expenses at the organisational level [ 78 ]. Studying contributions to the topic, we noticed that data accuracy was included in the debate, indicating that a high standard of data will benefit decision-making practitioners [ 38 , 77 ]. AI techniques are an essential instrument for studying data and the extraction of medical insight, and they may assist medical researchers in their practices. Using computational tools, healthcare stakeholders may leverage the power of data not only to evaluate past data ( descriptive analytics ) but also to forecast potential outcomes ( predictive analytics ) and to define the best actions for the present scenario ( prescriptive analytics ) [ 78 ]. The current abundance of evidence makes it easier to provide a broad view of patient health; doctors should have access to the correct details at the right time and location to provide the proper treatment [ 92 ].
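The distinction between descriptive, predictive and prescriptive analytics can be illustrated with a deliberately small Python sketch. Everything in it is an assumption made for the example: the weekly admission figures are invented, the forecast is a simple least-squares trend line, and the staffing ratio of eight patients per nurse is a placeholder, not a clinical recommendation.

```python
import math
from statistics import mean

# Hypothetical weekly emergency admissions (illustrative values only).
admissions = [100, 104, 108, 112, 116, 120]

# Descriptive analytics: summarise what has already happened.
average_admissions = mean(admissions)


def fit_line(ys):
    """Ordinary least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    xs = range(len(ys))
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


# Predictive analytics: extrapolate the fitted trend one week ahead.
slope, intercept = fit_line(admissions)
forecast = slope * len(admissions) + intercept

# Prescriptive analytics: turn the forecast into a recommended action,
# here a staffing level under an assumed patients-per-nurse ratio.
PATIENTS_PER_NURSE = 8  # assumption for illustration
nurses_needed = math.ceil(forecast / PATIENTS_PER_NURSE)

print(average_admissions, forecast, nurses_needed)
```

The three steps use the same data but answer different questions: what happened, what is likely to happen next, and what should be done about it. Real systems would replace the trend line with a validated model and the fixed ratio with operational constraints.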

Will medical technology de-skill doctors?

Further reflection concerns the skills of doctors. Studies have shown that healthcare personnel are progressively being exposed to technology for different purposes, such as collecting patient records or diagnosis [ 71 ]. This is demonstrated by the keywords (Fig.  6 ) that focus on technology and the role of decision-making with new innovative tools. In addition, the discussion expands with Lu [ 93 ], which indicates that the excessive use of technology could hinder doctors’ skills and clinical procedures' expansion. Among the main issues arising from the literature is the possible de-skilling of healthcare staff due to reduced autonomy in decision-making concerning patients [ 94 ]. Therefore, the challenges and discussion we uncovered in Fig.  11 are expanded by also considering the ethical implications of technology and the role of skills.

Implications

Our analysis also has multiple theoretical and practical implications.

In terms of theoretical contribution, this paper extends the previous results of Connelly et al., dos Santos et al., Hao et al., Huang et al., Liao et al. and Tran et al. [ 2 , 19 , 20 , 21 , 22 , 24 ] in considering AI in terms of clinical decision-making and data management quality.

In terms of practical implications, this paper aims to create a fruitful discussion with healthcare professionals and administrative staff on how AI can be at their service to increase work quality. Furthermore, this investigation offers a broad comprehension of bibliometric variables of AI techniques in healthcare. It can contribute to advancing scientific research in this field.

Limitations

Like any other, our study has some limitations that could be addressed by more in-depth future studies. For example, using only one research database, such as Scopus, could be limiting. Further analysis could also investigate the PubMed, IEEE, and Web of Science databases individually and holistically, especially the health parts. In addition, the use of search terms such as "Artificial Intelligence" OR "AI" AND "Healthcare" could be too general and exclude interesting studies. Moreover, although we analysed 288 peer-reviewed scientific papers, because the research topic is new, the analysis of conference papers could return interesting results for future researchers. Additionally, as this is a young research area, the analysis will be subject to recurrent obsolescence as multiple new research investigations are published. Finally, although bibliometric analysis has limited the subjectivity of the analysis [ 15 ], the verification of recurring themes could lead to different results by indicating areas of significant interest not listed here.
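One practical reason the search string deserves care is that a query mixing OR and AND without parentheses can be read in two ways, and the reading depends on the database's operator precedence. The following Python sketch models the two readings with set operations on four invented records; the record names and terms are hypothetical, and the sketch makes no claim about which reading any particular database applies.

```python
# Hypothetical records: each maps to the terms found in its title,
# abstract or keywords (illustrative only).
records = {
    "paper_a": {"artificial intelligence", "healthcare"},
    "paper_b": {"ai", "healthcare"},
    "paper_c": {"artificial intelligence", "finance"},
    "paper_d": {"ai", "agriculture"},
}


def has(term):
    """Return the ids of all records that contain the given term."""
    return {rid for rid, terms in records.items() if term in terms}


# Reading 1: ("Artificial Intelligence" OR "AI") AND "Healthcare"
or_first = (has("artificial intelligence") | has("ai")) & has("healthcare")

# Reading 2: "Artificial Intelligence" OR ("AI" AND "Healthcare")
and_first = has("artificial intelligence") | (has("ai") & has("healthcare"))

print(sorted(or_first))
print(sorted(and_first))
```

Under the second reading the finance paper is retrieved even though it has nothing to do with healthcare. Writing the parentheses explicitly in the search protocol removes the ambiguity and makes the query easier to replicate.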

Future research avenues

Concerning future research perspectives, researchers believe that an analysis of the overall amount that a healthcare organisation should pay for AI technologies could be helpful. If these technologies are essential for health services management and patient treatment, governments should invest and contribute to healthcare organisations' modernisation. New investment funds could be made available in the healthcare world, as in the European case with the Next Generation EU programme or national investment programmes [ 95 ]. Additionally, this should happen especially in the poorest countries around the world, where there is a lack of infrastructure and services related to health and medicine [ 96 ]. On the other hand, it might be interesting to evaluate additional profits generated by healthcare organisations with AI technologies compared to those that do not use such technologies.

Further analysis could also identify why some parts of the world have not conducted studies in this area. It would be helpful to carry out a comparative analysis between countries active in this research field and countries that are not currently involved. It would make it possible to identify variables affecting AI technologies' presence or absence in healthcare organisations. The results of collaboration between countries also present future researchers with the challenge of greater exchanges between researchers and professionals. Therefore, further research could investigate the difference in vision between professionals and academics.

In the accounting, business, and management research area, there is currently a lack of quantitative analysis of the costs and profits generated by healthcare organisations that use AI technologies. Therefore, research in this direction could further increase our understanding of the topic and the number of healthcare organisations that can access technologies based on AI. Finally, as suggested in the discussion section, more interdisciplinary studies are needed to strengthen AI links with data quality management and AI and ethics considerations in healthcare.

In pursuing the philosophy of Massaro et al.’s [ 11 ] methodological article, we have climbed on the shoulders of giants, hoping to provide a bird's-eye view of the AI literature in healthcare. We performed this study with a bibliometric analysis aimed at discovering authors, countries of publication and collaboration, and keywords and themes. We found a fast-growing, multi-disciplinary stream of research that is attracting an increasing number of authors.

The research, therefore, adopts a quantitative approach to the analysis of bibliometric variables and a qualitative approach to the study of recurring keywords, which has allowed us to demonstrate strands of literature that are not purely positive. There are currently some limitations that will affect future research potential, especially in ethics, data governance and the competencies of the health workforce.

Availability of data and materials

All the data are retrieved from public scientific platforms.

Tagliaferri SD, Angelova M, Zhao X, Owen PJ, Miller CT, Wilkin T, et al. Artificial intelligence to improve back pain outcomes and lessons learnt from clinical classification approaches: three systematic reviews. NPJ Digit Med. 2020;3(1):1–16.

Tran BX, Vu GT, Ha GH, Vuong Q-H, Ho M-T, Vuong T-T, et al. Global evolution of research in artificial intelligence in health and medicine: a bibliometric study. J Clin Med. 2019;8(3):360.

Hamid S. The opportunities and risks of artificial intelligence in medicine and healthcare [Internet]. 2016 [cited 2020 May 29]. http://www.cuspe.org/wp-content/uploads/2016/09/Hamid_2016.pdf

Panch T, Szolovits P, Atun R. Artificial intelligence, machine learning and health systems. J Glob Health. 2018;8(2):020303.

Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of artificial intelligence for computer-assisted drug discovery. Chem Rev. 2019;119(18):10520–94.

Burton RJ, Albur M, Eberl M, Cuff SM. Using artificial intelligence to reduce diagnostic workload without compromising detection of urinary tract infections. BMC Med Inform Decis Mak. 2019;19(1):171.

Meskò B, Drobni Z, Bényei E, Gergely B, Gyorffy Z. Digital health is a cultural transformation of traditional healthcare. Mhealth. 2017;3:38.

Cho B-J, Choi YJ, Lee M-J, Kim JH, Son G-H, Park S-H, et al. Classification of cervical neoplasms on colposcopic photography using deep learning. Sci Rep. 2020;10(1):13652.

Doyle OM, Leavitt N, Rigg JA. Finding undiagnosed patients with hepatitis C infection: an application of artificial intelligence to patient claims data. Sci Rep. 2020;10(1):10521.

Shortliffe EH, Sepúlveda MJ. Clinical decision support in the era of artificial intelligence. JAMA. 2018;320(21):2199–200.

Massaro M, Dumay J, Guthrie J. On the shoulders of giants: undertaking a structured literature review in accounting. Account Auditing Account J. 2016;29(5):767–801.

Junquera B, Mitre M. Value of bibliometric analysis for research policy: a case study of Spanish research into innovation and technology management. Scientometrics. 2007;71(3):443–54.

Casadesus-Masanell R, Ricart JE. How to design a winning business model. Harvard Business Review [Internet]. 2011 Jan 1 [cited 2020 Jan 8]. https://hbr.org/2011/01/how-to-design-a-winning-business-model

Aria M, Cuccurullo C. bibliometrix: an R-tool for comprehensive science mapping analysis. J Informetr. 2017;11(4):959–75.

Zupic I, Čater T. Bibliometric methods in management and organization. Organ Res Methods. 2015;1(18):429–72.

Secinaro S, Calandra D. Halal food: structured literature review and research agenda. Br Food J. 2020. https://doi.org/10.1108/BFJ-03-2020-0234 .

Rialp A, Merigó JM, Cancino CA, Urbano D. Twenty-five years (1992–2016) of the international business review: a bibliometric overview. Int Bus Rev. 2019;28(6):101587.

Zhao L, Dai T, Qiao Z, Sun P, Hao J, Yang Y. Application of artificial intelligence to wastewater treatment: a bibliometric analysis and systematic review of technology, economy, management, and wastewater reuse. Process Saf Environ Prot. 2020;1(133):169–82.

Huang Y, Huang Q, Ali S, Zhai X, Bi X, Liu R. Rehabilitation using virtual reality technology: a bibliometric analysis, 1996–2015. Scientometrics. 2016;109(3):1547–59.

Hao T, Chen X, Li G, Yan J. A bibliometric analysis of text mining in medical research. Soft Comput. 2018;22(23):7875–92.

dos Santos BS, Steiner MTA, Fenerich AT, Lima RHP. Data mining and machine learning techniques applied to public health problems: a bibliometric analysis from 2009 to 2018. Comput Ind Eng. 2019;1(138):106120.

Liao H, Tang M, Luo L, Li C, Chiclana F, Zeng X-J. A bibliometric analysis and visualization of medical big data research. Sustainability. 2018;10(1):166.

Choudhury A, Renjilian E, Asan O. Use of machine learning in geriatric clinical care for chronic diseases: a systematic literature review. JAMIA Open. 2020;3(3):459–71.

Connelly TM, Malik Z, Sehgal R, Byrnes G, Coffey JC, Peirce C. The 100 most influential manuscripts in robotic surgery: a bibliometric analysis. J Robot Surg. 2020;14(1):155–65.

Guo Y, Hao Z, Zhao S, Gong J, Yang F. Artificial intelligence in health care: bibliometric analysis. J Med Internet Res. 2020;22(7):e18228.

Choudhury A, Asan O. Role of artificial intelligence in patient safety outcomes: systematic literature review. JMIR Med Inform. 2020;8(7):e18599.

Forliano C, De Bernardi P, Yahiaoui D. Entrepreneurial universities: a bibliometric analysis within the business and management domains. Technol Forecast Soc Change. 2021;1(165):120522.

Secundo G, Del Vecchio P, Mele G. Social media for entrepreneurship: myth or reality? A structured literature review and a future research agenda. Int J Entrep Behav Res. 2020;27(1):149–77.

Dal Mas F, Massaro M, Lombardi R, Garlatti A. From output to outcome measures in the public sector: a structured literature review. Int J Organ Anal. 2019;27(5):1631–56.

Baima G, Forliano C, Santoro G, Vrontis D. Intellectual capital and business model: a systematic literature review to explore their linkages. J Intellect Cap. 2020. https://doi.org/10.1108/JIC-02-2020-0055 .

Dumay J, Guthrie J, Puntillo P. IC and public sector: a structured literature review. J Intellect Cap. 2015;16(2):267–84.

Dal Mas F, Garcia-Perez A, Sousa MJ, Lopes da Costa R, Cobianchi L. Knowledge translation in the healthcare sector. A structured literature review. Electron J Knowl Manag. 2020;18(3):198–211.

Mas FD, Massaro M, Lombardi R, Biancuzzi H. La performance nel settore pubblico tra misure di out-put e di outcome. Una revisione strutturata della letteratura. ejvcbp. 2020;1(3):16–29.

Dumay J, Cai L. A review and critique of content analysis as a methodology for inquiring into IC disclosure. J Intellect Cap. 2014;15(2):264–90.

Haleem A, Javaid M, Khan IH. Current status and applications of Artificial Intelligence (AI) in medical field: an overview. Curr Med Res Pract. 2019;9(6):231–7.

Paul J, Criado AR. The art of writing literature review: what do we know and what do we need to know? Int Bus Rev. 2020;29(4):101717.

Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JPA, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Med. 2009;6(7):e1000100.

Biancone PP, Secinaro S, Brescia V, Calandra D. Data quality methods and applications in health care system: a systematic literature review. Int J Bus Manag. 2019;14(4):p35.

Secinaro S, Brescia V, Calandra D, Verardi GP, Bert F. The use of micafungin in neonates and children: a systematic review. ejvcbp. 2020;1(1):100–14.

Bert F, Gualano MR, Biancone P, Brescia V, Camussi E, Martorana M, et al. HIV screening in pregnant women: a systematic review of cost-effectiveness studies. Int J Health Plann Manag. 2018;33(1):31–50.

Levy Y, Ellis TJ. A systems approach to conduct an effective literature review in support of information systems research. Inf Sci Int J Emerg Transdiscipl. 2006;9:181–212.

Chen G, Xiao L. Selecting publication keywords for domain analysis in bibliometrics: a comparison of three methods. J Informet. 2016;10(1):212–23.

Falagas ME, Pitsouni EI, Malietzis GA, Pappas G. Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses. FASEB J. 2007;22(2):338–42.

Sicilia M-A, Garcìa-Barriocanal E, Sànchez-Alonso S. Community curation in open dataset repositories: insights from zenodo. Procedia Comput Sci. 2017;1(106):54–60.

Secinaro S, Calandra D, Secinaro A, Muthurangu V, Biancone P. Artificial Intelligence for healthcare with a business, management and accounting, decision sciences, and health professions focus [Internet]. Zenodo; 2021 [cited 2021 Mar 7]. https://zenodo.org/record/4587618#.YEScpl1KiWh .

Elango B, Rajendran D. Authorship trends and collaboration pattern in the marine sciences literature: a scientometric Study. Int J Inf Dissem Technol. 2012;1(2):166–9.

Jacoby WG. Loess: a nonparametric, graphical tool for depicting relationships between variables. 2000.

Andrews JE. An author co-citation analysis of medical informatics. J Med Libr Assoc. 2003;91(1):47–56.

Acknowledgements

The authors are grateful to the Editor-in-Chief for the suggestions and to all the reviewers who spent part of their time providing constructive feedback on our research article.

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Authors and affiliations.

Department of Management, University of Turin, Turin, Italy

Silvana Secinaro, Davide Calandra & Paolo Biancone

Ospedale Pediatrico Bambino Gesù, Rome, Italy

Aurelio Secinaro

Institute of Child Health, University College London, London, UK

Vivek Muthurangu

Contributions

SS and PB, Supervision; Validation, writing, AS and VM; Formal analysis, DC and AS; Methodology, DC; Writing; DC, SS and AS; conceptualization, VM, PB; validation, VM, PB. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Davide Calandra .

Ethics declarations

Ethical approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Secinaro, S., Calandra, D., Secinaro, A. et al. The role of artificial intelligence in healthcare: a structured literature review. BMC Med Inform Decis Mak 21 , 125 (2021). https://doi.org/10.1186/s12911-021-01488-9

Received : 24 December 2020

Accepted : 01 April 2021

Published : 10 April 2021

DOI : https://doi.org/10.1186/s12911-021-01488-9

  • Artificial intelligence
  • Patient data

BMC Medical Informatics and Decision Making

ISSN: 1472-6947

  • Research article
  • Open access
  • Published: 19 March 2019

Machine learning in medicine: a practical introduction

  • Jenni A. M. Sidey-Gibbons 1 &
  • Chris J. Sidey-Gibbons   ORCID: orcid.org/0000-0002-4732-7305 2 , 3 , 4  

BMC Medical Research Methodology volume 19, Article number: 64 (2019)

Following visible successes on a wide range of predictive tasks, machine learning techniques are attracting substantial interest from medical researchers and clinicians. We address the need for capacity development in this area by providing a conceptual introduction to machine learning alongside a practical guide to developing and evaluating predictive algorithms using freely-available open source software and public domain data.

We demonstrate the use of machine learning techniques by developing three predictive models for cancer diagnosis using descriptions of nuclei sampled from breast masses. These algorithms include regularized Generalized Linear Model (GLM) regression, Support Vector Machines (SVMs) with a radial basis function kernel, and single-layer Artificial Neural Networks. The publicly-available dataset describing the breast mass samples (N = 683) was randomly split into evaluation (n = 456) and validation (n = 227) samples.

We trained algorithms on data from the evaluation sample before they were used to predict the diagnostic outcome in the validation dataset. We compared the predictions made on the validation datasets with the real-world diagnostic decisions to calculate the accuracy, sensitivity, and specificity of the three models. We explored the use of averaging and voting ensembles to improve predictive performance. We provide a step-by-step guide to developing algorithms using the open-source R statistical programming environment.

The trained algorithms were able to classify cell nuclei with high accuracy (.94–.96), sensitivity (.97–.99), and specificity (.85–.94). Maximum accuracy (.96) and area under the curve (.97) were achieved using the SVM algorithm. Prediction performance increased marginally (accuracy = .97, sensitivity = .99, specificity = .95) when algorithms were arranged into a voting ensemble.

Conclusions

We use a straightforward example to demonstrate the theory and practice of machine learning for clinicians and medical researchers. The principles which we demonstrate here can be readily applied to other complex tasks including natural language processing and image recognition.

Driven by an increase in computational power, storage, memory, and the generation of staggering volumes of data, computers are being used to perform a wide range of complex tasks with impressive accuracy. Machine learning (ML) is the name given to both the academic discipline and collection of techniques which allow computers to undertake complex tasks. As an academic discipline, ML comprises elements of mathematics, statistics, and computer science. Machine learning is the engine which is helping to drive advances in the development of artificial intelligence. It is impressively employed in both academia and industry to drive the development of ‘intelligent products’ with the ability to make accurate predictions using diverse sources of data [ 1 ]. To date, the key beneficiaries of the 21st-century explosion in the availability of big data, ML, and data science have been industries which were able to collect these data and hire the necessary staff to transform their products. The learning methods developed in and for these industries offer tremendous potential to enhance medical research and clinical care, especially as providers increasingly employ electronic health records.

Two areas which may benefit from the application of ML techniques in the medical field are diagnosis and outcome prediction. This includes a possibility for the identification of high risk for medical emergencies such as relapse or transition into another disease state. ML algorithms have recently been successfully employed to classify skin cancer using images with comparable accuracy to a trained dermatologist [ 2 ] and to predict the progression from pre-diabetes to type 2 diabetes using routinely-collected electronic health record data [ 3 ].

Machine learning is increasingly employed in combination with Natural Language Processing (NLP) to make sense of unstructured text data. By combining ML with NLP techniques, researchers have been able to derive new insights from comments from clinical incident reports [ 4 ], social media activity [ 5 , 6 ], doctor performance feedback [ 7 ], and patient reports after successful cancer treatments [ 8 ]. Automatically generated information from unstructured data could be exceptionally useful not only in order to gain insight into quality, safety, and performance, but also for early diagnosis. Recently, an automated analysis of free speech collected during in-person interviews resulted in the ability to predict transition to psychosis with perfect accuracy in a group of high-risk youths [ 9 ].

Machine learning will also play a fundamental role in the development of learning healthcare systems. Learning healthcare systems describe environments which align science, informatics, incentives, and culture for continuous improvement and innovation. In a practical sense, these systems, which could occur on any scale from small group practices to large national providers, will combine diverse data sources with complex ML algorithms. The result will be a continuous source of data-driven insights to optimise biomedical research, public health, and health care quality improvement [ 10 ].

Machine learning

Machine learning techniques are based on algorithms – sets of mathematical procedures which describe the relationships between variables. This paper will explain the process of developing (known as training) and validating an algorithm to predict the malignancy of a sample of breast tissue based on its characteristics. Though algorithms work in different ways depending on their type, there are notable commonalities in the way in which they are developed. Though the complexities of ML algorithms may appear esoteric, they often bear more than a subtle resemblance to conventional statistical analyses.

Given the commonalities shared between statistical and ML techniques, the boundary between the two may seem fuzzy or ill-defined. One way to delineate these bodies of approaches is to consider their primary goals. The goal of statistical methods is inference: to reach conclusions about populations or derive scientific insights from data which are collected from a representative sample of that population. Though many statistical techniques, such as linear and logistic regression, are capable of creating predictions about new data, the motivator of their use as a statistical methodology is to make inferences about relationships between variables. For example, if we were to create a model which described the relationship between clinical variables and mortality following organ transplant surgery, we would need to have insight into the factors which distinguish low mortality risk from high if we were to develop interventions to improve outcomes and reduce mortality in the future. In statistical inference, therefore, the goal is to understand the relationships between variables.

Conversely, in the field of ML, the primary concern is an accurate prediction: the ‘what’ rather than the ‘how’. For example, in image recognition, the relationship between the individual features (pixels) and the outcome is of little relevance if the prediction is accurate. This is a critical facet of ML techniques as the relationships between many inputs, such as pixels in images or video and geo-location, are complex and usually non-linear. It is exceptionally difficult to describe in a coherent way the relationships between predictors and outcomes both when the relationships are non-linear and when there are a large number of predictors, each of which makes a small individual contribution to the model.

Fortunately for the medical field, many relationships of interest are reasonably straightforward, such as those between body mass index and diabetes risk or tobacco use and lung cancer. Because of this, their interaction can often be reasonably well explained using relatively simple models. In many popular applications of ML, such as optimizing navigation, translating documents, and identifying objects in videos, understanding the relationship between features and outcomes is of less importance. This allows the use of complex non-linear algorithms. Given this key difference, it might be useful for researchers to consider that algorithms exist on a continuum between those algorithms which are easily interpretable (i.e., Auditable Algorithms) and those which are not (i.e., Black Boxes), presented visually in Fig. 1.

figure 1

The complexity/interpretability trade-off in machine learning tools

Interesting questions remain as to when a conventionally statistical technique becomes an ML technique. In this work, we will show that some computational enhancements to traditional statistical techniques, such as elastic net regression, make these algorithms perform well with big data. However, a fuller discussion of the similarities and differences between ML and conventional statistics is beyond the purview of the current paper. Interested readers are directed to materials which develop the ideas discussed here [ 11 ]. It should also be acknowledged that whilst the ‘Black Box’ concept does generally apply to models which utilize non-linear transformations, such as neural networks, work is being carried out to facilitate feature identification in complex algorithms [ 12 ].

The majority of ML methods can be categorised into two types of learning techniques: those which are supervised and those which are unsupervised. Both are introduced in the following sections.

Supervised learning

Supervised ML refers to techniques in which a model is trained on a range of inputs (or features) which are associated with a known outcome. In medicine, this might represent training a model to relate a person’s characteristics (e.g., height, weight, smoking status) to a certain outcome (onset of diabetes within five years, for example). Once the algorithm is successfully trained, it will be capable of making outcome predictions when applied to new data. Predictions which are made by models trained using supervised learning can be either discrete (e.g., positive or negative, benign or malignant) or continuous (e.g., a score from 0 to 100).

A model which produces discrete categories (sometimes referred to as classes) is referred to as a classification algorithm. Examples of classification algorithms include those which predict whether a tumour is benign or malignant, or establish whether comments written by a patient convey a positive or negative sentiment [ 2 , 6 , 13 ]. In practice, classification algorithms return the probability of a class (between 0 for impossible and 1 for definite). Typically, we would transform any probability greater than .50 into a class of 1, but this threshold may be altered to improve algorithm performance as required. This paper provides an example of a classification algorithm in which a diagnosis is predicted.
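The thresholding step described above can be sketched in a few lines of base R. This is only an illustration: the probabilities are made-up values, and `classify` is a hypothetical helper name rather than anything used in the paper.

```r
# Hypothetical predicted probabilities of malignancy for six samples
# (illustrative values, not taken from the paper).
prob <- c(0.05, 0.48, 0.51, 0.93, 0.30, 0.75)

# Convert probabilities to classes: 1 if above the threshold, otherwise 0.
classify <- function(p, threshold = 0.5) {
  as.integer(p > threshold)
}

default_class  <- classify(prob)         # conventional .50 threshold
cautious_class <- classify(prob, 0.25)   # lower threshold assigns class 1 more often

print(default_class)
print(cautious_class)
```

Lowering the threshold trades specificity for sensitivity, which may be preferable when a missed malignant case is costlier than a false alarm.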

A model which returns a prediction of a continuous value is known as a regression algorithm. The use of the term regression in ML varies from its use in statistics, where regression is often used to refer to both binary outcomes (i.e., logistic regression) and continuous outcomes (i.e., linear regression). In ML, an algorithm which is referred to as a regression algorithm might be used to predict an individual’s life expectancy or tolerable dose of chemotherapy.

Supervised ML algorithms are typically developed using a dataset which contains a number of variables and a relevant outcome. For some tasks, such as image recognition or language processing, the variables (which would be pixels or words) must be processed by a feature selector. A feature selector picks identifiable characteristics from the dataset which then can be represented in a numerical matrix and understood by the algorithm. In the examples above, a feature may be the colour of a pixel in an image or the number of times that a word appears in a given text. Using the same examples, outcomes may be whether an image shows a malignant or benign tumour or whether transcribed interview responses indicate predisposition to a mental health condition.
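As a rough sketch of what such a numerical representation might look like for text, the base R snippet below counts how often each word appears in two made-up comments. The comments and all variable names are assumptions for illustration; a real text mining study would normally use a dedicated package.

```r
# Two short, made-up comments standing in for free-text patient feedback.
comments <- c("the doctor was kind and the nurse was kind",
              "the wait was long")

# Split each comment into lower-case words.
tokens <- strsplit(tolower(comments), "[^a-z]+")

# The vocabulary is every distinct word across all comments.
vocab <- sort(unique(unlist(tokens)))

# Build a document-term matrix: one row per comment, one column per word,
# each cell holding the number of times the word occurs in the comment.
dtm <- t(sapply(tokens, function(words) table(factor(words, levels = vocab))))
rownames(dtm) <- paste0("comment_", seq_along(comments))

print(dtm)
```

Each column of `dtm` is a feature in the sense used above, and most cells are zero, which is typical of matrices built from text.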

Once a dataset has been organised into features and outcomes, an ML algorithm may be applied to it. The algorithm is iteratively improved to reduce the error of prediction using an optimization technique.

Note that, when training ML algorithms, it is possible to over-fit the algorithm to the nuances of a specific dataset, resulting in a prediction model that does not generalise well to new data. The risk of over-fitting can be mitigated using various techniques. Perhaps the most straightforward approach, which will be employed in this work, is to split our dataset into two segments: a training segment and a testing segment. Each segment contains a randomly-selected proportion of the features and their related outcomes. The algorithm is first fitted to the training segment, which allows it to associate certain features, or characteristics, with a specific outcome; this is known as training the algorithm. Once training is completed, the algorithm is applied to the features in the testing dataset without their associated outcomes. The predictions made by the algorithm are then compared to the known outcomes of the testing dataset to establish model performance. This is a necessary step to increase the likelihood that the algorithm will generalise well to new data. This process is illustrated graphically in Fig. 2.

figure 2

Overview of supervised learning. a Training b Validation c Application of algorithm to new data
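The training and validation stages can be sketched with base R alone. The example below is a minimal illustration under stated assumptions: the data are simulated, the seed is arbitrary, and an ordinary (unregularised) logistic regression fitted with `glm` stands in for the algorithms used later in the paper.

```r
set.seed(1)  # arbitrary seed for reproducibility

# Simulated data: one feature that is genuinely related to a binary outcome.
n <- 400
feature <- rnorm(n)
outcome <- rbinom(n, size = 1, prob = plogis(2 * feature))
dat <- data.frame(feature, outcome)

# (a) Training: fit the model on a random two-thirds of the rows.
train_idx <- sample(seq_len(n), size = round(2 / 3 * n))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]
model <- glm(outcome ~ feature, data = train, family = binomial)

# (b) Validation: predict outcomes for the held-out rows from features alone,
# then compare the predictions with the known outcomes.
accuracy <- function(model, newdata) {
  predicted <- as.integer(predict(model, newdata, type = "response") > 0.5)
  mean(predicted == newdata$outcome)
}
cat("training accuracy:", accuracy(model, train), "\n")
cat("testing accuracy: ", accuracy(model, test), "\n")
```

A training accuracy that is much higher than the testing accuracy would be a warning sign of over-fitting.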

Unsupervised Machine Learning

In contrast with supervised learning, unsupervised learning does not involve a predefined outcome. In unsupervised learning, patterns are sought by algorithms without any input from the user. Unsupervised techniques are thus exploratory and used to find undefined patterns or clusters which occur within datasets. These techniques are often referred to as dimension reduction techniques and include processes such as principal component analysis, latent Dirichlet allocation and t-Distributed Stochastic Neighbour Embedding (t-SNE) [ 14 – 16 ]. Unsupervised learning techniques are not discussed at length in this work, which focusses primarily on supervised ML. However, unsupervised methods are sometimes employed in conjunction with the methods used in this paper to reduce the number of features in an analysis, and are thereby worth mentioning. By compressing the information in a dataset into fewer features, or dimensions, issues including multicollinearity or high computational cost may be avoided. A visual illustration of an unsupervised dimension reduction technique is given in Fig. 3. In this figure, the raw data (represented by various shapes in the left panel) are presented to the algorithm which then groups the data into clusters of similar data points (represented in the right panel). Note that data which do not have sufficient commonality to the clustered data are typically excluded, thereby reducing the number of features within the dataset.

figure 3

A visual illustration of an unsupervised dimension reduction technique

In a similar way to the supervised learning algorithms described earlier, unsupervised learning techniques share many similarities with statistical techniques which will be familiar to medical researchers. Unsupervised learning techniques make use of algorithms similar to those used for clustering and dimension reduction in traditional statistics. Those familiar with Principal Component Analysis and factor analysis will already be familiar with many of the techniques used in unsupervised learning.
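To make the idea of compressing features concrete, the sketch below runs a principal component analysis with base R's `prcomp`. It uses R's built-in `iris` measurements purely as a convenient stand-in; that dataset is our assumption and plays no part in the paper.

```r
# Use the four numeric measurements from R's built-in iris data as features.
features <- iris[, 1:4]

# Principal component analysis on centred, scaled features.
pca <- prcomp(features, center = TRUE, scale. = TRUE)

# Proportion of the total variance captured by each component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Keep only the first two components: four features compressed into two.
reduced <- pca$x[, 1:2]
print(dim(reduced))
```

The reduced matrix could then be passed to a supervised algorithm in place of the original features.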

What this paper will achieve

This paper provides a pragmatic example using supervised ML techniques to derive classifications from a dataset containing multiple inputs. The first algorithm we introduce, the regularized logistic regression, is very closely related to multivariate logistic regression. It is distinguished primarily by the use of a regularization function which both reduces the number of features in the model and attenuates the magnitude of their coefficients. Regularization is, therefore, suitable for datasets which contain many variables and missing data (known as high sparsity datasets), such as the term-document matrices which are used to represent text in text mining studies.

The second algorithm, a Support Vector Machine (SVM), gained popularity among the ML community for its high performance deriving accurate predictions in situations where the relationship between features and the outcome is non-linear. It uses a mathematical transformation known as the kernel trick, which we describe in more detail below.

Finally, we introduce an Artificial Neural Network (ANN), whose complex architecture and heavily modifiable parameters have led to its widespread use in many challenging applications, including image and video recognition. The addition of speciality neural networks, such as recurrent or convolutional networks, to ANNs has resulted in impressive performance on a range of tasks. Being highly parametrized models, ANNs are prone to over-fitting. Their performance may be improved using a regularization technique, such as DropConnect.

The ultimate goal of this manuscript is to imbue clinicians and medical researchers with a foundational understanding of what ML is and how it may be used, as well as the practical skills to develop, evaluate, and compare their own algorithms to solve prediction problems in medicine.

How to follow this paper

We provide a conceptual introduction alongside practical instructions using code written for the R Statistical Programming Environment, which may be easily modified and applied to other classification or regression tasks. This code will act as a framework upon which researchers can develop their own ML studies. The models presented here may be fitted to diverse types of data and are, with minor modifications, suitable for analysing text and images.

This paper is divided into sections which describe the typical stages of a ML analysis: preparing data, training algorithms, validating algorithms, assessing algorithm performance, and applying new data to the trained models.

Throughout the paper, examples of the R code used to run the analyses are presented. The code is given in full in Additional file 1. The data which were used for these analyses are available in Additional file 2.

The dataset used in this work is the Breast Cancer Wisconsin Diagnostic Data Set. This dataset is publicly available from the University of California Irvine (UCI) Machine Learning Repository [ 17 ]. It consists of characteristics, or features, of cell nuclei taken from breast masses which were sampled using fine-needle aspiration (FNA), a common diagnostic procedure in oncology. The clinical samples used to form this dataset were collected from January 1989 to November 1991. Relevant features from digitised images of the FNA samples were extracted through the methods described in Refs. [ 13 , 18 , 19 ]. An example of one of the digitised images from an FNA sample is given in Fig.  4 .

figure 4

An example of an image of a breast mass from which dataset features were extracted

A total of 699 samples were used to create this dataset. This number will be referred to as the number of instances. Each instance has an I.D. number, diagnosis, and set of features attributed to it. While the Sample I.D. is unique to that instance, the diagnosis, listed as class in the dataset, can either be malignant or benign, depending on whether the FNA was found to be cancerous or not. In this dataset, 241 instances were diagnosed as malignant, and 458 instances were found to be benign. Malignant cases have a class of four, and benign cases have a class of two. This class, or diagnosis, is the outcome of the instance.

The features of the dataset are characteristics identified or calculated from each FNA image. There are nine features in this dataset, and each is valued on a scale of 1 to 10 for a particular instance, 1 being the closest to benign and 10 being the most malignant [ 18 ]. Features range from descriptors of cell characteristics, such as Uniformity of Cell Size and Uniformity of Cell Shape, to more complex cytological characteristics such as Clump Thickness and Marginal Adhesion. All nine features, along with the Instance No., Sample I.D., and Class are listed in Table 1. The full dataset is a matrix of 699 × 12 (one instance number, one sample I.D., nine features, and one outcome per instance).

This dataset is simple and therefore computationally efficient. The relatively low number of features and instances means that the analysis provided in this paper can be conducted using most modern PCs without long computing times. Although the principles are the same as those described throughout the rest of this paper, using large datasets to train machine learning algorithms can be computationally intensive and, in some cases, require many days to complete. The principles illustrated here apply to datasets of any size.

The R Statistical Programming Language is an open-source tool for statistics and programming which was developed as an extension of the S language. R is supported by a large community of active users and hosts several excellent packages for ML which are both flexible and easy to use. R is a computationally efficient language which is readily comprehensible without special training in computer science. The R language is similar to many other statistical programming languages, including MATLAB, SAS, and STATA. Packages for R are arranged into different task views on the Comprehensive R Archive Network. The Machine Learning and Statistical Learning task view currently lists almost 100 packages dedicated to ML.

Many, if not most, R users access the R environment using RStudio, an open-source integrated development environment (IDE) which is designed to make working in R more straightforward. We recommend that readers of the current paper download the latest version of both R and RStudio and access the environment through the RStudio application. Both R and RStudio are free to use and available under an open-source license.

Conducting a machine learning analysis

The following section will take you through the necessary steps of an ML analysis using the Wisconsin Cancer dataset.

1. Importing and preparing the dataset.

2. Training the ML algorithms.

3. Testing the ML algorithms.

4. Assessing sensitivity, specificity and accuracy of the algorithms.

5. Plotting receiver operating characteristic curves.

6. Applying new data to the trained models.

1. Importing and preparing the dataset.

The dataset can be downloaded directly from the UCI repository using the code in Fig.  5 .

figure 5

Import the data and label the columns
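The figure itself is not reproduced here, so the following is only a sketch of what such an import step might look like in base R. It assumes the raw file is comma-separated with no header row and eleven columns (sample I.D., nine features, class); the column spellings, the `import_wbc` helper, and the three sample rows are our own illustrative choices, not the paper's code.

```r
# Column names follow the dataset description in the text; the exact spelling
# of each name is our own choice, not taken from the paper's Fig. 5.
col_names <- c("sample_id", "clump_thickness", "uniformity_cell_size",
               "uniformity_cell_shape", "marginal_adhesion",
               "single_epithelial_cell_size", "bare_nuclei",
               "bland_chromatin", "normal_nucleoli", "mitoses", "class")

# Hypothetical helper: read a comma-separated file with no header row.
# `source` may be a local path, a URL, or a text connection.
import_wbc <- function(source) {
  read.csv(source, header = FALSE, col.names = col_names,
           stringsAsFactors = FALSE)
}

# Illustrative rows in the same layout as the file (not the real download).
sample_rows <- c("1000025,5,1,1,1,2,1,3,1,1,2",
                 "1002945,5,4,4,5,7,10,3,2,1,2",
                 "1057013,8,4,5,1,2,?,7,3,1,4")
wbc <- import_wbc(textConnection(sample_rows))
str(wbc)
```

Because one row contains a ‘?’, R reads that whole column as text; this is why the missing-value marker needs to be re-scored before analysis.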

We first modify the data by re-scoring missing data from ‘?’ to NA, removing any rows with missing data and re-scoring the class variables from 2 and 4 to 0 and 1, where 0 indicates the tumour was benign and 1 indicates that it was malignant. Recall that a dataset with many missing data points is referred to as a sparse dataset. In this dataset there are a small number of cases (n = 16) with at least one missing value. To simplify the analytical steps, we will remove these cases, using the code in Fig. 6.

figure 6

Remove missing items and restore the outcome data
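As an illustration of the three cleaning steps described above, the following base R sketch applies them to a small made-up data frame. The object names, column names and values are assumptions chosen for illustration; this is not the UCI data or the code shown in Fig. 6.

```r
# Toy stand-in for the imported data; all values are invented for illustration.
toy <- data.frame(
  thickness   = c(5, 3, 8, 1),
  bare_nuclei = c("1", "?", "10", "2"),
  class       = c(2, 2, 4, 2),
  stringsAsFactors = FALSE
)

# Step 1: re-score '?' as NA, then convert the column to numeric
toy$bare_nuclei[toy$bare_nuclei == "?"] <- NA
toy$bare_nuclei <- as.numeric(toy$bare_nuclei)

# Step 2: remove any rows with at least one missing value
toy_clean <- na.omit(toy)

# Step 3: re-score the outcome from 2/4 to 0/1 (0 = benign, 1 = malignant)
toy_clean$class <- ifelse(toy_clean$class == 4, 1, 0)
```

In this toy example one of the four rows contains a missing value, so three rows remain after cleaning.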

Datasets used for supervised ML are most easily represented in a matrix similar to the way Table  1 is presented. The n columns are populated with the n −1 features, with the single remaining column containing the outcome. Each row contains an individual instance. The features which make up the training dataset may also be described as inputs or variables and are denoted in code as x . The outcomes may be referred to as the label or the class and are denoted using y .

Recall that a supervised algorithm must be trained on one portion of the data and evaluated on a separate portion in order to assess how well it generalises to new data. The code in Fig. 7 will divide the dataset into the two required segments: one, containing 67% of the dataset, to be used for training; and the other, containing the remaining 33%, to be used for evaluation.

figure 7

Split the data into training and testing datasets
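A random 67%/33% split of this kind can be sketched in base R as follows. The simulated matrix x, the outcome y and the seed are assumptions for illustration only; the object names x_train, y_train, x_test and y_test follow those used in the text.

```r
set.seed(42)  # arbitrary seed so that the split is reproducible

# Hypothetical feature matrix (9 features) and binary outcome for 100 cases
n <- 100
x <- matrix(rnorm(n * 9), nrow = n, ncol = 9)
y <- rbinom(n, size = 1, prob = 0.35)

# Randomly assign 67% of the row indices to the training set
train_idx <- sample(seq_len(n), size = round(0.67 * n))

x_train <- x[train_idx, , drop = FALSE]
y_train <- y[train_idx]

# The remaining 33% of the rows are held back for evaluation
x_test <- x[-train_idx, , drop = FALSE]
y_test <- y[-train_idx]
```

Sampling row indices, rather than taking the first 67% of rows, guards against any ordering in the original file leaking into the split.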

2. Training the ML algorithms

Now that we have arranged our dataset into a suitable format, we may begin training our algorithms. The ML algorithms which we will use are listed below and detailed in the following section.

Logistic regression using Generalised Linear Models (GLMs) with \(\mathscr {L}_{1}\) Least Absolute Shrinkage and Selection Operator (LASSO) regularisation.

Support Vector Machines (SVMs) with a radial basis function (RBF) kernel.

Artificial Neural Networks (ANNs) with a single hidden layer.

Regularised regression using Generalised Linear Models (GLMs)

Regularised Generalised Linear Models (GLMs) have demonstrated excellent performance in some complex learning problems, including predicting individual traits from on-line digital footprints [ 20 ], classifying open-text reports of doctors’ performance [ 7 ], and identifying prostate cancer by desorption electro-spray ionization mass spectrometric imaging of small metabolites and lipids [ 21 ].

When fitting GLMs using datasets which have a large number of features and substantial sparsity, model performance may be increased when the contribution of each of the included features to the model is reduced (or penalised) using regularisation, a process which also reduces the risk of over-fitting. Regularisation effectively reduces both the number of coefficients in the model and their magnitudes, making it especially suitable for big datasets that may have more features than instances. In this example, feature selection is guided by the Least Absolute Shrinkage and Selection Operator (LASSO). Other forms of regularisation are available, including Ridge Regression and the Elastic Net (which is a linear blend of both Ridge and LASSO regularisation) [ 22 ]. An accessible, up-to-date summary of LASSO and other regularisation techniques is given in Ref [ 23 ].

Regularised GLMs are operationalised in R using the glmnet package [ 24 ]. The code below demonstrates how the GLM algorithm is fitted to the training dataset. In the glmnet package, the type of regularisation is chosen using a numerical value referred to as alpha. An alpha value of 1 selects LASSO regularisation, whereas an alpha value of 0 selects Ridge regularisation; a value between 0 and 1 selects a linear blend of the two techniques known as the Elastic Net [ 22 ].
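The roles of alpha and of the regularisation parameter can be made explicit by writing out the penalised objective. The form below is the one given in the glmnet documentation, written here without observation weights; the symbols N (number of instances), \(\ell\) (the negative log-likelihood contribution of one instance), \(\beta_{0}\) (intercept) and \(\beta\) (coefficients) are notation introduced for this illustration.

\[
\min_{\beta_{0},\,\beta}\; \frac{1}{N}\sum_{i=1}^{N} \ell\!\left(y_{i},\, \beta_{0} + x_{i}^{\top}\beta\right) \;+\; \lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_{2}^{2} \;+\; \alpha\,\lVert\beta\rVert_{1}\right]
\]

Setting \(\alpha = 1\) leaves only the \(\mathscr {L}_{1}\) (LASSO) term, setting \(\alpha = 0\) leaves only the squared \(\mathscr {L}_{2}\) (Ridge) term, and \(\lambda \ge 0\) controls the overall strength of the penalty.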

n-fold cross-validation is used to ascertain the optimal value of lambda ( λ ), the regularisation parameter. The value of λ which minimizes prediction error is stored in the glm_model$lambda.min object. The larger the λ value, the greater the effect of regularisation upon the number of features in the model and their respective coefficients. Figure 8 shows the effect of different levels of log( λ ). The optimal value of log( λ ) is indicated using the vertical broken line (shown here at x = -5.75). The rightmost dotted line indicates the most parsimonious value of log( λ ) for which the error is within 1 standard error of the minimum. Note that the random nature of cross-validation means that values of log( λ ) may differ slightly between analyses. The integers given above Fig. 8 (0-9) relate to the number of features included in the model. The code shown in Fig. 9 fits the GLM algorithm to the data and extracts the minimum value of λ and the weights of the coefficients.

figure 8

Regression coefficients for the GLM model. The figure shows the coefficients for the 9 model features for different values of log( λ ). log( λ ) values are given on the lower x-axis and the number of features in the model is displayed above the figure. As the size of log( λ ) decreases, the number of variables in the model (i.e. those with a nonzero coefficient) increases, as does the magnitude of each coefficient. The vertical dotted line indicates the value of log( λ ) at which the accuracy of the predictions is maximized.

figure 9

Fit the GLM model to the data and extract the coefficients and minimum value of lambda

Figure  10 shows the cross-validation curves for different levels of log( λ ). This figure can be plotted using the code in Fig.  11 .

figure 10

Cross-validation curves for the GLM model. The figure shows the cross-validation curves as the red dots with upper and lower standard deviation shown as error bars

figure 11

Plot the cross-validation curves for the GLM algorithm

Figure 8 shows the magnitude of the coefficients for each of the variables within the model for different values of log( λ ). The vertical dotted line indicates the value of log( λ ) which minimises the mean squared error established during cross-validation. This figure can be augmented with a dotted vertical line indicating the value of log( λ ) using the abline() function, shown in Fig. 12 .

figure 12

Plot the coefficients and their magnitudes

Support Vector Machines (SVMs)

Support Vector Machine (SVM) classifiers operate by separating the two classes using a linear decision boundary called a hyperplane. The hyperplane is placed at a location that maximises the distance between the hyperplane and the nearest instances of each class [ 25 ].

Fig. 13 depicts an example of a linear hyperplane that perfectly separates the two classes. Maximising the width of the decision boundary is intended to optimise the generalisability of the model to new data. In real-world examples, it may not be possible to adequately separate the two classes using a linear hyperplane. Rather than employ a non-linear separator such as a high-order polynomial, SVM techniques use a method to transform the feature space such that the classes do become linearly separable. This technique, known as the kernel trick, is demonstrated in Fig. 14 .

figure 13

An SVM hyperplane. The hyperplane maximises the width of the decision boundary between the two classes

figure 14

The kernel trick. The kernel trick modifies the feature space, allowing separation of the classes with a linear hyperplane

Fig. 14 shows an example of two classes that are not separable using a linear separator. By projecting the data to \(X^{2}\), they become linearly separable using the y = 5 hyperplane. A popular method for kernel transformation in high-dimensional space is the radial basis function (RBF).
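The idea of projecting the data so that a linear boundary becomes possible can be sketched with a one-dimensional toy example in base R. The data values are invented, and the threshold of 5 is chosen only to mirror the description above; an SVM with an RBF kernel performs a comparable transformation implicitly rather than computing the new feature by hand.

```r
# One-dimensional toy data: class 1 lies at both extremes and class 0 in the
# middle, so no single cut-point on x separates the two classes.
x <- c(-4, -3, -1, 0, 1, 3, 4)
y <- c( 1,  1,  0, 0, 0, 1, 1)

# Project each point into a second dimension by squaring it
x2 <- x^2

# In the projected space, a horizontal line at x2 = 5 separates the classes
pred <- as.numeric(x2 > 5)

# Check that every case falls on the correct side of the line
all(pred == y)
```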

The SVM algorithm is fitted to the data using a function, given in Fig.  15 , which is arranged in a similar way to the regularised regression shown above.

figure 15

Fit the SVM algorithm to the data

SVMs which attempt to fit separating hyperplanes following different feature space transformations can be explored further by setting the kernel argument to “linear”, “radial”, “polynomial”, or “sigmoid”.

Artificial Neural Networks (ANNs)

Artificial Neural Networks (ANNs) are algorithms which are loosely modelled on the neuronal structure observed in the mammalian cortex. Neural networks are arranged with a number of input neurons, which represent the information taken from each of the features in the dataset and which feed into any number of hidden layers before passing to an output layer in which the final decision is presented. As information passes through the ’neurons’, or nodes, it is multiplied by the weight of each connection, summed together with a constant bias term, and transformed by an activation function. The activation function applies a non-linear transformation using a simple equation shown in Eq. 1 .
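To make the computation at a single node concrete, the base R sketch below assumes a logistic (sigmoid) activation function, which is one common choice among several. The inputs, weights and bias are arbitrary illustrative values and are not estimated from the dataset.

```r
# Logistic (sigmoid) activation: maps any real number into the interval (0, 1)
sigmoid <- function(z) 1 / (1 + exp(-z))

inputs  <- c(0.5, -1.2,  3.0)  # illustrative feature values
weights <- c(0.8,  0.1, -0.4)  # illustrative connection weights
bias    <- 0.2                 # constant bias term

# Weighted sum of the inputs plus the bias ...
z <- sum(weights * inputs) + bias

# ... passed through the non-linear activation function
node_output <- sigmoid(z)
```

Training a network amounts to adjusting the weights and biases of every node so that the outputs of the final layer match the known outcomes as closely as possible.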

During training, the prediction errors are fed back through the network and the weight of each neural connection is altered until the error level is minimised, a process known as backpropagation [ 26 ].

Deep Neural Networks (DNNs) refers to neural networks which have many hidden layers. Deep learning, which may utilise DNNs, has produced impressive results when employed in complex tasks using very high dimensional data, such as image recognition [ 27 ] and computer-assisted diagnosis of melanoma [ 2 ].

DNNs are heavily parametrised and, as a result, can be prone to over-fitting. Regularisation can, as with the GLM algorithm described above, be used to prevent this. Other strategies to improve performance can include dropout regularisation, where some number of randomly-selected units are omitted from the hidden layers during training [ 28 ].

The code in Fig. 16 demonstrates how to fit a neural network. This is straightforward, requiring the x and y datasets to be defined, as well as the number of units in the hidden layer using the size argument.

figure 16

Fit the ANN algorithm to the data

3. Testing the ML algorithms

In order to test the performance of the trained algorithms, it is necessary to compare the predictions which each algorithm makes on data other than the data upon which it was trained with the true outcomes for those data, which are known to us but were not exposed to the algorithm. To accomplish this in the R programming environment, we create a vector of model predictions using the x_test matrix, which can be compared to the y_test vector to establish performance metrics. This is easily achievable using the predict() function, which is included in the stats package in the R distribution. The nnet package contains a minor modification to the predict() function, and as such the type argument is set to ‘raw’, rather than ‘response’, for the neural network. This code is given in Fig. 17 .

figure 17

Extract predictions from the trained models on the new data

4. Assessing the sensitivity, specificity and accuracy of the algorithms

Machine learning algorithms for classification are typically evaluated using simple methodologies that will be familiar to many medical researchers and clinicians. In the current study, we will use sensitivity, specificity, and accuracy to evaluate the performance of the three algorithms. Sensitivity is the proportion of true positives that are correctly identified by the test, specificity is the proportion of true negatives that are correctly identified by the test and the accuracy is the proportion of the times which the classifier is correct [ 29 ]. Equations used to calculate sensitivity, specificity, and accuracy are given below.
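Writing TP, TN, FP and FN for the numbers of true positives, true negatives, false positives and false negatives respectively (abbreviations introduced here for convenience), the standard definitions of these three quantities are:

\[
\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{Specificity} = \frac{TN}{TN + FP}, \qquad
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
\]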

Confusion matrices

Data from classifiers are often represented in a confusion matrix in which the classifications made by the algorithm (e.g., pred_y_svm ) are compared to the true classifications (which the algorithms were blinded to) in the dataset (i.e., y_test ). Once populated, the confusion matrix provides all of the information needed to calculate sensitivity, specificity, and accuracy manually. An example of an unpopulated confusion matrix is demonstrated in Table  2 .

Confusion matrices can be easily created in R using the caret package. The confusionMatrix() function creates a confusion matrix and calculates sensitivity, specificity, and accuracy. The confusionMatrix() function requires binary predictions, whereas the predict() function used earlier produces a vector of continuous values between 0 and 1, in which a larger value reflects greater certainty that the sample was positive. Before evaluating a binary classifier, a cut-off threshold must be decided upon. The round() function used in the code shown in Fig. 18 effectively sets a threshold of >.50 for a positive prediction by rounding values ≤.50 down to 0 and values >.50 up to 1. While this is sufficient for this teaching example, users may wish to evaluate the optimal threshold for a positive prediction as this may differ from .50. The populated confusion matrix for this example is shown in Table 3 and is displayed alongside sensitivity, specificity, and accuracy.

figure 18

Create confusion matrices for the three algorithms
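The same quantities can also be computed by hand, which may help to show what the confusion matrix contains. The base R sketch below uses invented predicted probabilities and outcomes for eight cases; the object names are assumptions, and the table() function is used here in place of the caret function described above.

```r
# Hypothetical predicted probabilities and true outcomes for eight cases
pred_prob <- c(0.95, 0.80, 0.40, 0.10, 0.65, 0.30, 0.05, 0.55)
y_true    <- c(1,    1,    1,    0,    0,    0,    0,    1)

# Apply the .50 threshold by rounding to 0 or 1
pred_class <- round(pred_prob)

# Cross-tabulate the predicted classes against the true classes
cm <- table(Predicted = factor(pred_class, levels = c(0, 1)),
            Actual    = factor(y_true,    levels = c(0, 1)))

TP <- cm["1", "1"]  # predicted positive, truly positive
TN <- cm["0", "0"]  # predicted negative, truly negative
FP <- cm["1", "0"]  # predicted positive, truly negative
FN <- cm["0", "1"]  # predicted negative, truly positive

sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
accuracy    <- (TP + TN) / sum(cm)
```

With these invented values there are three true positives, three true negatives, one false positive and one false negative, so sensitivity, specificity and accuracy are all .75.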

5. Plotting receiver operating characteristic curves

Receiver operating characteristic (ROC) curves can be drawn with the pROC package using the code in Fig. 19 . An example output is given in Fig. 20 . These curves illustrate the relationship between the model’s sensitivity (plotted on the y -axis) and specificity (plotted on the x -axis). The grey diagonal line reflects as-good-as-chance performance, and any curves which are plotted to the left of that line are performing better than chance. Interpretation of ROC curves is facilitated by calculating the area under each curve (AUC) [ 30 ]. The AUC gives a single value which can be interpreted as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. In this example all models perform very well but the SVM algorithm shows the best performance, with AUC = .97 compared to the ANN (AUC = .95) and the LASSO-regularized regression (AUC = .94).

figure 19

Draw receiver operating characteristic curves and calculate the area under them

figure 20

Receiver Operating Characteristics curves
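For readers who wish to see where an AUC value comes from, the base R sketch below computes it from ranks (the Mann-Whitney formulation) rather than with the pROC package. The predicted probabilities and outcomes are invented for illustration.

```r
# Hypothetical predicted probabilities and true outcomes for eight cases
pred_prob <- c(0.95, 0.80, 0.40, 0.10, 0.65, 0.30, 0.05, 0.55)
y_true    <- c(1,    1,    1,    0,    0,    0,    0,    1)

n_pos <- sum(y_true == 1)
n_neg <- sum(y_true == 0)

# Rank all predictions from lowest to highest (ties receive average ranks)
r <- rank(pred_prob)

# Rank-based AUC: the proportion of positive-negative pairs in which the
# positive case has the higher predicted probability
auc <- (sum(r[y_true == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

Here 14 of the 16 possible positive-negative pairs are ordered correctly, giving an AUC of .875.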

6. Applying new data to the trained models

Despite many similarities, ML is differentiated from statistical inference by its focus on predicting real-life outcomes from new data. As such, we develop models not to infer the relationships between variables but rather to produce reliable predictions from new data (though, as we have demonstrated, prediction and inference are not mutually exclusive).

In order to use the trained models to make predictions from data we need to construct either a vector (if there is a single new case) or a matrix (if there are multiple new cases). We need to ensure that the new data are entered into the model in the same order as the x_train and x_test matrices. In this case, we need to enter new data in the order of thickness , cell size , cell shape , adhesion , epithelial size , bare nuclei , bland chromatin , normal nucleoli , and mitoses . The code in Fig. 21 demonstrates how these data are represented in a manner that allows them to be processed by the trained model. Note that all three algorithms return predictions that suggest there is a near-certainty that this particular sample is malignant.

figure 21

Apply new data to the trained and validated algorithm
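Arranging new cases in the required feature order can be sketched in base R as below. The underscore-separated column names and all of the values are assumptions for illustration; in practice the names and order should match whatever was used for the x_train matrix.

```r
# Feature names in the order listed in the text above
feature_names <- c("thickness", "cell_size", "cell_shape", "adhesion",
                   "epithelial_size", "bare_nuclei", "bland_chromatin",
                   "normal_nucleoli", "mitoses")

# A single new case: one row and nine columns (values are invented)
new_case <- matrix(c(8, 7, 8, 5, 5, 10, 9, 10, 1),
                   nrow = 1, dimnames = list(NULL, feature_names))

# Several new cases: add one row per case, keeping the same column order
new_cases <- rbind(new_case,
                   c(2, 1, 1, 1, 2, 1, 3, 1, 1))
```

A matrix built this way can then be passed to the predict() function of each trained model in place of x_test.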

Additional ML techniques

Reducing prediction error: the case for ensembles

When working to maximise the performance of a predictive model, it can be beneficial to group different algorithms together to create a more robust prediction in a process known as ensemble learning [ 24 ]. There are too many ensemble techniques to adequately summarize here, but more information can be found in Ref. [ 23 ].

The principle of ensemble learning can be demonstrated using an un-weighted voting algorithm with R code. The code in Fig. 22 can be used to demonstrate the process of developing both an averaging and a voting algorithm.

figure 22

Create predictions from the ensemble
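The difference between averaging and voting can be illustrated with a small base R sketch. The three vectors of predicted probabilities are invented, and the object names are assumptions rather than those used in Fig. 22.

```r
# Hypothetical predicted probabilities from three models for five test cases
pred_glm <- c(0.90, 0.20, 0.60, 0.40, 0.10)
pred_svm <- c(0.80, 0.30, 0.40, 0.70, 0.20)
pred_ann <- c(0.95, 0.10, 0.70, 0.45, 0.05)
preds <- cbind(pred_glm, pred_svm, pred_ann)

# Averaging: take the mean probability for each case, then apply the threshold
avg_prob  <- rowMeans(preds)
avg_class <- as.numeric(avg_prob > 0.5)

# Un-weighted voting: each model casts one vote and the majority wins
votes      <- rowSums(preds > 0.5)
vote_class <- as.numeric(votes >= 2)
```

The two approaches need not agree. For the fourth case, only one model votes positive, so the voting ensemble predicts 0, whereas the mean probability is just above .50, so the averaging ensemble predicts 1.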

Natural language processing

Another common use for classification algorithms is in Natural Language Processing (NLP), the branch of ML in which computers are taught to interpret linguistic data. One popular example of NLP is in sentiment analysis, which involves ML algorithms trained to classify texts into different categories relating to the sentiment they convey; usually positive, negative, or neutral. We will give an overview of how features can be extracted from text and then used in the framework we have introduced above.

A linguistic dataset (also known as a corpus ) comprises a number of distinct documents . The documents can be broken down into smaller tokens of text, such as the individual words contained within. These tokens can be used as the features in a ML analysis as demonstrated above. In such an analysis, we arrange the x_train matrix such that the rows represent the individual documents and the tokenized features are represented in the columns. This arrangement for linguistic analysis is known as a term-document matrix (TDM).

In its most basic form, each row of the TDM records which words were used in a document. In this case, the width of a TDM is equal to the number of unique words in the entire corpus and, for each document, the value of any given cell will be either 0 if the word does not appear in that document or 1 if it does (a count of occurrences may be stored instead). Arranging a corpus this way leads to two issues: firstly, that the majority of the matrix likely contains zero values (an issue known as sparsity ); and secondly, that many of the documents contain the most common words in a language (e.g., “the”, “a”, or “and”) which are not very informative in analysis. Refining the TDM using a technique known as term frequency-inverse document frequency (TF-IDF) weighting can reduce the value of certain common words in the matrix which may be less informative and increase the value of less common words, which may be more informative. It is also possible to remove uninformative words using a pre-defined dictionary known as a stop words dictionary.
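These steps can be sketched in base R without any text-mining package. The three documents and the two-word stop list are invented for illustration, and the TF-IDF weighting shown (term frequency multiplied by the natural log of the inverse document frequency) is one common variant; packages such as tm may use a different logarithm or normalisation.

```r
# Three tiny made-up documents
docs <- c("the care was good",
          "the wait was long",
          "good care good staff")

# Tokenise each document into lower-case words
tokens <- strsplit(tolower(docs), "\\s+")
vocab  <- sort(unique(unlist(tokens)))

# Count matrix: one row per document, one column per unique word
counts <- t(vapply(tokens,
                   function(tk) as.integer(table(factor(tk, levels = vocab))),
                   integer(length(vocab))))
colnames(counts) <- vocab

# Remove stop words using a small illustrative dictionary
stop_words <- c("the", "was")
counts <- counts[, !(colnames(counts) %in% stop_words), drop = FALSE]

# TF-IDF: within-document term frequency multiplied by
# log(number of documents / number of documents containing the term)
tf    <- counts / rowSums(counts)
idf   <- log(nrow(counts) / colSums(counts > 0))
tfidf <- sweep(tf, 2, idf, `*`)
```

Words that appear in only one document (such as "wait") receive a larger weight than words shared across documents (such as "care").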

In a TDM, words can be tokenized individually, known as unigrams , or as groups of sequential words, known as n-grams , where n is the number of words extracted in the token (i.e., bi-gram or tri-gram extraction). Such extraction can mitigate issues caused by grammatical nuances such as negation (e.g., “I never said she stole my money.”). Some nuances are more difficult to analyse robustly, especially those used commonly in spoken language, such as emphasis or sarcasm. For example, the sentence above about the stolen money could have at least 7 different meanings depending on where the emphasis was placed.
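Bi-gram and tri-gram extraction can be sketched in base R using the example sentence above. The helper function make_ngrams() is a hypothetical name introduced for this illustration and is not part of any package.

```r
sentence <- "I never said she stole my money"
words    <- strsplit(tolower(sentence), "\\s+")[[1]]

# Build n-grams by joining each word to the n - 1 words that follow it
make_ngrams <- function(words, n) {
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(starts,
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

bigrams  <- make_ngrams(words, 2)
trigrams <- make_ngrams(words, 3)
```

The bi-gram "never said" keeps the negation attached to the verb, which a unigram representation would lose.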

A TDM can be easily developed in R using the tools provided in the tm package. In Table  4 , we demonstrate a simple uniGram (single word) TDM without TF-IDF weighting.

The code in Fig. 23 demonstrates the process for creating a term-document matrix for a vector of open-text comments called ’comments’. Modifications are made to the open-text comments, including the removal of punctuation and weighting using the TF-IDF technique. The final matrix, which is saved to an object named ’x’, could then be linked to a vector of outcomes ‘y’ and used to train and validate machine learning algorithms using the process described above (listings 3 to 11).

figure 23

Create a term document matrix

Once created, documents in the TDM can be combined with a vector of outcomes using the cbind() function, as shown in Table  4 , and processed in the same way as demonstrated in Fig.  7 . Interested readers can explore the informative tm package documentation to learn more about term-document matrices [ 31 ].

When trained on a proportion of the dataset, the three algorithms were able to classify cell nuclei in the remainder of the dataset with high accuracy (.94–.96), sensitivity (.97–.99), and specificity (.85–.94). Though each algorithm performed well individually, maximum accuracy (.96) and area under the curve (.97) were achieved using the SVM algorithm (see Table 3 ).

Model performance was marginally increased when the three algorithms were arranged into a voting ensemble, with an overall accuracy of .97, sensitivity of .99 and specificity of .95 (see the attached R Code for further details).

Machine learning has the potential to transform the way that medicine works [ 32 ]; however, increased enthusiasm has hitherto not been met by increased access to training materials aimed at the knowledge and skill sets of medical practitioners.

In this paper, we introduce basic ML concepts within a context which medical researchers and clinicians will find familiar and accessible. We demonstrate three commonly-used algorithms (a regularized generalized linear model, support vector machines (SVMs), and an artificial neural network) to classify tumour biopsies with high accuracy as either benign or malignant. Our results show that all algorithms can perform with high accuracy, sensitivity, and specificity despite substantial differences in the way that the algorithms work. The best-performing algorithm, the SVM, is very similar to the method demonstrated by Wolberg and Mangasarian who used different versions of the same dataset with fewer observations to achieve similar results [ 18 , 33 ]. It is noteworthy that the LASSO-regularized linear regression also performed exceptionally well whilst preserving the ability to understand which features were guiding the predictions (see Table 5 ). In contrast, the archetypal ’black box’ of the heavily-parametrized neural network could not improve classification accuracy.

In parallel to our analysis, we demonstrate techniques which can be applied with a commonly-used and open-source programming software (the R environment) which does not require prior experience with command-line computing. The presented code is designed to be re-usable and easily adaptable, so that readers may apply these techniques to their own datasets. With some modification, the same code may be used to develop linguistic classifiers or object recognition algorithms using open-text or image-based data respectively. Though the R environment now provides many options for advanced ML analyses, including deep learning, the framework of the code can be easily translated to other programming languages, such as Python, if desired. After working through the examples in this paper, we suggest that users apply their knowledge to problems within their own datasets. Doing so will elucidate specific issues which need to be overcome and will form a foundation for continued learning in this area. Further information can be obtained from any number of excellent textbooks, websites, and online courses. Additional practice datasets can be obtained from the University of California Irvine Machine Learning Repository which, at the time of writing, includes an additional 334 datasets suitable for classification tasks, including 35 which contain open-text data [ 17 ].

Further, this paper acts to demystify ML and endow clinicians and researchers without previous ML experience with the ability to critically evaluate these techniques. This is particularly important because without a clear understanding of the way in which algorithms are trained, medical practitioners are at risk of relying too heavily on these tools which might not always perform as expected. In their paper demonstrating a multi-surface pattern separation technique using a similar dataset, Wolberg and Mangasarian stress the importance of training algorithms on data which does not itself contain errors; their model was unable to achieve perfect performance as the sample in the dataset appeared to have been incorrectly extracted from an area beyond the tumour. The oft-told parable of the failure of the Google Flu Trends model offers an accessible example of the risks and consequences posed by a lack of understanding of ML models deployed ostensibly to improve health [ 34 ]. In short, the Google Flu Trends model was not generalizable over time as the Google Search data it was trained on was temporally sensitive. Looking to applications of ML beyond the medical field offers further insight into some risks that these algorithms might engender. For example, concerns have been raised about predictive policing algorithms and, in particular, the risk of entrenching certain prejudices in an algorithm which may be apparent in police practice. Though the evidence of whether predictive policing algorithms lead to biases in practice is unclear [ 35 ], it stands to reason that if biases exist in routine police work then models taught to recognize patterns in routinely collected data would have no means to exclude these biases when making predictions about future crime risk. Similar bias-based risks have been identified in some areas of medical practice and, if left unchecked, threaten the ethical use of data-driven automation in those areas [ 36 ]. An understanding of the way ML algorithms are trained is essential to minimize and mitigate the risks of entrenching biases in predictive algorithms in medicine.

The approach which we have taken in this paper entails some notable strengths and weaknesses. We have chosen to use a publicly-available dataset which contains a relatively small number of inputs and cases. The data is arranged in such a way that will allow those trained in medical disciplines to easily draw parallels between familiar statistical and novel ML techniques. Additionally, the compact dataset enables short computational times on almost all modern computers. A caveat of this approach is that many of the nuances and complexities of ML analyses, such as sparsity or high dimensionality, are not well represented in the data. Despite the omission of these common features of an ML dataset, we are confident that users who have worked through the examples given here with the code provided in the appendix will be well-placed to further develop their skills working on more complex datasets using the scalable code framework which we provide. In addition, this dataset usefully demonstrates an important principle of ML: more complex algorithms do not necessarily beget more useful predictions.

We look toward a future of medical research and practice greatly enhanced by the power of ML. In the provision of this paper, we hope that the enthusiasm for new and transformative ML techniques is tempered by a critical appreciation for the way in which they work and the risks that they could pose.

Abbreviations

ANN: Artificial neural network

AUC: Area under the curve

FNA: Fine needle aspiration

GLM: Generalized linear model

IDE: Integrated development environment

LASSO: Least absolute shrinkage and selection operator

RBF: Radial basis function

ROC: Receiver operating characteristic

SVM: Support vector machine

TF-IDF: Term frequency-inverse document frequency

TDM: Term document matrix

t-SNE: t-distributed stochastic neighbor embedding

UCI: University of California, Irvine

Jordan MI, Mitchell TM. Machine learning: Trends, perspectives, and prospects. Sci (NY). 2015; 349(6245):255–60. https://doi.org/10.1126/science.aaa8415 .

Article   CAS   Google Scholar  

Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017; 542(7639):115–8. https://doi.org/10.1038/nature21056 .

Anderson J, Parikh J, Shenfeld D. Reverse Engineering and Evaluation of Prediction Models for Progression to Type 2 Diabetes: Application of Machine Learning Using Electronic Health Records. J Diabetes. 2016.

Ong M-S, Magrabi F, Coiera E. Automated identification of extreme-risk events in clinical incident reports. J Am Med Inform Assoc. 2012; 19(e1):e110–e18.

Article   Google Scholar  

Greaves F, Ramirez-Cano D, Millett C, Darzi A, Donaldson L. Use of sentiment analysis for capturing patient experience from free-text comments posted online,. J Med Internet Res. 2013; 15(11):239. https://doi.org/10.2196/jmir.2721 .

Hawkins JB, Brownstein JS, Tuli G, Runels T, Broecker K, Nsoesie EO, McIver DJ, Rozenblum R, Wright A, Bourgeois FT, Greaves F. Measuring patient-perceived quality of care in US hospitals using Twitter,. BMJ Qual Saf. 2016; 25(6):404–13. https://doi.org/10.1136/bmjqs-2015-004309 .

Gibbons C, Richards S, Valderas JM, Campbell J. Supervised Machine Learning Algorithms Can Classify Open-Text Feedback of Doctor Performance With Human-Level Accuracy,. J Med Internet Res. 2017; 19(3):65. https://doi.org/10.2196/jmir.6533 .

Wagland R, Recio-Saucedo A, Simon M, Bracher M, Hunt K, Foster C, Downing A, Glaser A, Corner J. Development and testing of a text-mining approach to analyse patients’ comments on their experiences of colorectal cancer care. Qual Saf BMJ. 2015:2015–004063. https://doi.org/10.1136/bmjqs-2015-004063 .

Bedi G, Carrillo F, Cecchi GA, Slezak DF, Sigman M, Mota NB, Ribeiro S, Javitt DC, Copelli M, Corcoran CM. Automated analysis of free speech predicts psychosis onset in high-risk youths. npj Schizophr. 2015; 1(1):15030. https://doi.org/10.1038/npjschz.2015.30 .

Friedman CP, Wong AK, Blumenthal D. Achieving a Nationwide Learning Health System. Sci Transl Med. 2010; 2(57):57–29.

Beam A, Kohane I. Big Data and Machine Learning in Health Care. J Am Med Assoc. 2018; 319(13):1317–8.

Lei T, Barzilay R, Jaakkola T. Rationalizing Neural Predictions. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’16: 2016. p. 1135–1144. https://doi.org/10.1145/2939672.2939778 .

Mangasarian OL, Street WN, Wolberg WH. Breast Cancer Diagnosis and Prognosis via Linear Programming: AAAI; 1994, pp. 83 - 86.

Jolliffe I, Jolliffe I. Principal Component Analysis. In: Wiley StatsRef: Statistics Reference Online. Chichester: John Wiley & Sons, Ltd: 2014.

Google Scholar  

Blei DM, Ng AY, Jordan MI. Latent Dirichlet Allocation. J Mach Learn Res. 2003; 3(Jan):993–1022.

Maaten Lvd, Hinton G. Visualizing Data using t-SNE. J Mach Learn Res. 2008; 9(Nov):2579–605.

Lichman M. UCI Machine Learning Repository: Breast Cancer Wisconsin (Diagnostic) Data Set. 2014. http://archive.ics.uci.edu/ml . Accessed 8 Aug 2017.

Wolberg WH, Mangasariant OL. Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proc Natl Acad Sci USA. 1990; 87:9193–6.

Bennett KP. Decision tree construction via linear programming: University of Wisconsin-Madison Department of Computer Sciences; 1992, pp. 97–101.

Kosinski M, Stillwell D, Graepel T. Private traits and attributes are predictable from digital records of human behavior. Proc Natl Acad Sci. 2013; 110(15):5802–5. https://doi.org/10.1073/pnas.1218772110 .

Banerjee S, Zare RN, Tibshirani RJ, Kunder CA, Nolley R, Fan R, Brooks JD, Sonn GA. Diagnosis of prostate cancer by desorption electrospray ionization mass spectrometric imaging of small metabolites and lipids. Proc Natl Acad Sci U S A. 2017; 114(13):3334–9. https://doi.org/10.1073/pnas.1700677114 .

Zou H, Zou H, Hastie T. Regularization and variable selection via the Elastic Net. J R Stat Soc Ser B. 2005; 67:301–20.

Efron B, Hastie T. Computer Age Statistical Inference, 1st edn. Cambridge: Cambridge University Press; 2016.

Book   Google Scholar  

Hastie T, Tibshirani R, Friedman J. Elements of statistical learning. 2001; 1(10). New York: Springer series in statistics.

Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995; 20(3):273–97. https://doi.org/10.1007/BF00994018 .

Hecht-Nielsen. Theory of the backpropagation neural network. 1989:593–605. https://doi.org/10.1109/IJCNN.1989.118638 .

Krizhevsky A, Sutskever I, Hinton GE. ImageNet Classification with Deep Convolutional Neural Networks. In: Advances in neural information processing systems: 2012. p. 1097–1105.

Dahl GE, Sainath TN, Hinton GE. Improving deep neural networks for LVCSR using rectified linear units and dropout. 2013:8609–8613. https://doi.org/10.1109/ICASSP.2013.6639346 .

Martin Bland J, Altman D. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986; 327(8476):307–10. https://doi.org/10.1016/S0140-6736(86)90837-8 .

Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve,. Radiology. 1982; 143(1):29–36. https://doi.org/10.1148/radiology.143.1.7063747 .

Meyer D, Hornik K, Fienerer I. Text mining infrastructure in R. J Stat Softw. 2008; 25(5):1–54.

Darcy AM, Louie AK, Roberts LW. Machine Learning and the Profession of Medicine. J Am Med Assoc. 2016; 315(6):551. https://doi.org/10.1001/jama.2015.18421 .

Wolberg WH, Street WN, Mangasarian OL. Machine learning techniques to diagnose breast cancer from image-processed nuclear features of fine needle aspirates. Cancer Lett. 1994; 77(2-3):163–71. https://doi.org/10.1016/0304-3835(94)90099-X .

Lazer D, Kennedy R, King G, Vespignani A. The Parable of Google Flu: Traps in Big Data Analysis. Science. 2014; 343(6176):1203–5. https://doi.org/10.1126/science.1248506 .

Brantingham PJ, Valasik M, Mohler GO. Does Predictive Policing Lead to Biased Arrests? Results From a Randomized Controlled Trial. Stat Public Policy. 2018; 5(1):1–6. https://doi.org/10.1080/2330443X.2018.1438940 .

Haider AH, Chang DC, Efron DT, Haut ER, Crandall M, Cornwell EE. Race and Insurance Status as Risk Factors for Trauma Mortality. Arch Surg. 2008; 143(10):945. https://doi.org/10.1001/archsurg.143.10.945 .

Acknowledgments

We acknowledge and thank the investigators, scientists, and developers who have contributed to the scientific community by making their data, code, and software freely available. We thank our colleagues in Cambridge, Boston, and beyond who provided critical insight into this work.

CSG was funded by National Institute for Health Research Trainees Coordinating Centre Fellowships (NIHR-PDF-2014-07-028 and NIHR-CDF-2017-10-19). The funders had no role in the design or execution of this study.

Availability of data and materials

In this manuscript we use de-identified data from a public repository [ 17 ]. The data are included on the BMC Med Res Method website. As such, ethical approval was not required.

Author information

Authors and affiliations.

Department of Engineering, University of Cambridge, Trumpington Street, Cambridge, CB2 1PZ, UK

Jenni A. M. Sidey-Gibbons

Department of Surgery, Harvard Medical School, 25 Shattuck Street, Boston, 01225, Massachusetts, USA

Chris J. Sidey-Gibbons

Department of Surgery, Brigham and Women’s Hospital, 75 Francis Street, Boston, 01225, Massachusetts, USA

University of Cambridge Psychometrics Centre, Trumpington Street, Cambridge, CB2 1AG, UK

Contributions

JSG contributed to the conception and design of the work, interpretation of data and presentation of results, and drafted the manuscript. CSG contributed to the conception and design of the work, conducted the analyses, and drafted the manuscript. Both JSG and CSG approve of the final versions and agree to be accountable for their own contributions. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Chris J. Sidey-Gibbons .

Ethics declarations

Ethics approval and consent to participate, consent for publication.

All contributing parties consent for the publication of this work.

Competing interests

The authors report no competing interests relating to this work.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1.

Breast Cancer Wisconsin Dataset. Anonymised dataset used in this work. (CSV 24.9 kb)

Additional file 2

R Markdown Supplementary Material. R Code accompanying the work described in this paper and its output. (PDF 207 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article.

Sidey-Gibbons, J., Sidey-Gibbons, C. Machine learning in medicine: a practical introduction. BMC Med Res Methodol 19 , 64 (2019). https://doi.org/10.1186/s12874-019-0681-4

Received : 11 June 2018

Accepted : 14 February 2019

Published : 19 March 2019

DOI : https://doi.org/10.1186/s12874-019-0681-4

  • Medical informatics
  • Classification
  • Supervised machine learning
  • Programming languages
  • Computer-assisted
  • Decision making

BMC Medical Research Methodology

ISSN: 1471-2288

Medical image analysis based on deep learning approach

  • Published: 06 April 2021
  • Volume 80 , pages 24365–24398, ( 2021 )

  • Muralikrishna Puttagunta 1 &
  • S. Ravi   ORCID: orcid.org/0000-0001-7267-9233 1  

Medical imaging plays a significant role in clinical applications, including procedures for the early detection, monitoring, diagnosis, and treatment evaluation of various medical conditions. A basic understanding of the principles and implementations of artificial neural networks and deep learning is essential for understanding medical image analysis in computer vision. The Deep Learning Approach (DLA) to medical image analysis has emerged as a fast-growing research field and has been widely used in medical imaging to detect the presence or absence of disease. This paper presents the development of artificial neural networks and a comprehensive analysis of DLA, which delivers promising medical imaging applications. Most DLA implementations concentrate on X-ray images, computerized tomography, mammography images, and digital histopathology images. The paper provides a systematic review of articles on the classification, detection, and segmentation of medical images based on DLA. This review guides researchers in considering appropriate changes to medical image analysis based on DLA.

1 Introduction

In the health care system, there has been a dramatic increase in demand for medical imaging services, e.g., radiography, endoscopy, Computed Tomography (CT), Mammography (MG), ultrasound, Magnetic Resonance Imaging (MRI), Magnetic Resonance Angiography (MRA), nuclear medicine imaging, Positron Emission Tomography (PET), and pathological tests. Moreover, medical images are often challenging and time-consuming to analyze, a problem compounded by the shortage of radiologists.

Artificial Intelligence (AI) can address these problems. Machine Learning (ML) is an application of AI in which systems function without being explicitly programmed: they learn from data and make predictions or decisions based on past data. ML uses three learning approaches, namely supervised learning, unsupervised learning, and semi-supervised learning. Traditional ML techniques involve extracting features, and selecting suitable features for a specific problem requires a domain expert. Deep learning (DL) techniques solve the problem of feature selection. DL is a subset of ML that can automatically extract essential features from raw input data [ 88 ]. The concept of DL algorithms was introduced from cognitive and information theories. In general, DL has two properties: (1) multiple processing layers that can learn distinct features of data through multiple levels of abstraction, and (2) unsupervised or supervised learning of feature representations on each layer. A large number of recent review papers have highlighted the capabilities of advanced DLA in medical fields such as MRI [ 8 ], Radiology [ 96 ], Cardiology [ 11 ], and Neurology [ 155 ].

Different forms of DLA were borrowed from the field of computer vision and applied to specific medical image analysis tasks. Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) are examples of supervised DL algorithms. Unsupervised learning algorithms have also been studied in medical image analysis; these include Deep Belief Networks (DBNs), Restricted Boltzmann Machines (RBMs), Autoencoders, and Generative Adversarial Networks (GANs) [ 84 ]. DLA is generally applied to detect an abnormality and classify a specific type of disease. When DLA is applied to medical images, CNNs are ideally suited for classification, segmentation, object detection, registration, and other tasks [ 29 , 44 ]. A CNN is an artificial visual neural network structure, based on the convolution operation, used for medical image pattern recognition. Deep learning (DL) applications in medical images are visualized in Fig.  1 .

figure 1

a X-ray image with pulmonary masses [ 121 ]; b CT image with lung nodule [ 82 ]; c digitized histopathological tissue image [ 132 ]

2 Neural networks

2.1 History of neural networks

The study of artificial neural networks and deep learning derives from the effort to create a computer system that simulates the human brain [ 33 ]. In the early 1940s, the neurophysiologist Warren McCulloch and the mathematician Walter Pitts [ 97 ] developed a primitive neural network based on what was then known about biological neural structure. In 1949, the book “Organization of Behavior” [ 100 ] was the first to describe the process of updating synaptic weights, which is now referred to as the Hebbian Learning Rule. In 1958, Frank Rosenblatt’s [ 127 ] landmark paper defined the structure of a neural network called the perceptron for the binary classification task.

In 1962, Widrow [ 172 ] introduced a device called the Adaptive Linear Neuron (ADALINE), implementing the design in hardware. The limitations of perceptrons were emphasized by Minsky and Papert (1969) [ 98 ]. The concept of the backward propagation of errors for training purposes was discussed by Werbos (1974) [ 171 ]. In 1979, Fukushima [ 38 ] designed an artificial neural network called the Neocognitron, with multiple pooling and convolution layers. In 1989, Yann LeCun [ 71 ] combined CNN with backpropagation to effectively perform the automated recognition of handwritten digits. One of the most important breakthroughs in deep learning occurred in 2006, when Hinton et al. [ 9 ] implemented the Deep Belief Network, with several layers of Restricted Boltzmann Machines, greedily training one layer at a time in an unsupervised fashion. Figure 2 shows important advancements in the history of neural networks that led to the deep learning era.

figure 2

Demonstrations of significant developments in the history of neural networks [ 33 , 134 ]

2.2 Artificial neural networks

Artificial Neural Networks (ANNs) form the basis for most DLA. An ANN is a computational model whose performance characteristics are similar to those of biological neural networks. It comprises simple processing units, called neurons or nodes, that are interconnected by weighted links. A biological neuron can be described mathematically as in Eq. ( 1 ). Figure 3 shows the simplest artificial neural model, known as the perceptron.

figure 3

Perceptron [ 77 ]
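
As a minimal sketch of the perceptron in Fig. 3, the following Python snippet computes a weighted sum of the inputs plus a bias and passes it through a step activation. The function name, the step threshold at zero, and the example weights (chosen so that the unit computes a logical AND) are illustrative assumptions, not values taken from the figure.

```python
def perceptron_output(x, w, b):
    """Weighted sum of the inputs plus a bias, followed by a step activation."""
    net = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return 1 if net >= 0 else 0


# Hypothetical weights and bias that make the unit behave like a logical AND.
w_and, b_and = [1.0, 1.0], -1.5
outputs = [perceptron_output([a, c], w_and, b_and) for a in (0, 1) for c in (0, 1)]
print(outputs)
```

Only the input (1, 1) pushes the net input above zero, so the unit fires for that pattern alone.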

2.3 Training a neural network with Backpropagation (BP)

In neural networks, learning is modeled as an iterative optimization of the weights to minimize a loss function. The weights are modified according to the network's performance on a set of examples belonging to the training set. The training procedure consists of a forward phase and a backward phase: an activation function is selected for forward propagation, and BP is used to update the weights. The BP algorithm helps a multilayer feed-forward neural network (FFNN) learn input-output mappings from training samples [ 16 ]. Forward propagation and backpropagation are explained for a neural network with one hidden layer in the following algorithm.

The backpropagation algorithm for a neural network with one hidden layer is as follows.

Step 1: Initialize all weights to small random values.

Step 2: While the stopping condition is false, do steps 3 through 10.

Step 3: For each training pair \( \left(x_1,y_1\right),\dots ,\left(x_n,y_n\right) \), do steps 4 through 9.

Feed-forward propagation:

Step 4: Each input unit \( X_i \) ( i = 1, 2, …, n ) receives the input signal \( x_i \) and sends this signal to all hidden units in the layer above.

Step 5: Each hidden unit \( Z_j \) ( j = 1, …, p ) computes its net input \( z_{j\_in}=b_j+\sum_{i=1}^n w_{ij}x_i \), applies the activation function \( z_j=f\left(z_{j\_in}\right) \), and transmits the result to the output units.

Step 6: Each output unit \( Y_k \) ( k = 1, …, m ) computes its net input \( y_{k\_in}=b_k+\sum_{j=1}^p z_j w_{jk} \) and its activation \( y_k=f\left(y_{k\_in}\right) \).

Backpropagation:

Step 7: For the input training pattern \( \left(x_1,x_2,\dots ,x_n\right) \) with corresponding output pattern \( \left(y_1,y_2,\dots ,y_m\right) \), let \( \left(t_1,t_2,\dots ,t_m\right) \) be the target pattern. Each output neuron computes its error term \( \delta_k=\left(t_k-y_k\right){f}^{\prime}\left(y_{k\_in}\right) \).

Step 8: Each hidden neuron computes its error information term \( \delta_j \), using the \( \delta_k \) of the output neurons obtained in the previous step: \( \delta_j={f}^{\prime}\left(z_{j\_in}\right)\sum_{k=1}^m \delta_k w_{jk} \).

Step 9: Update the weights and biases using the following formulas, where η is the learning rate.

Each output unit \( Y_k \) ( k = 1, 2, …, m ) updates its weights ( j = 1, …, p ) and bias: \( w_{jk}(new)=w_{jk}(old)+\eta \delta_k z_j \); \( b_k(new)=b_k(old)+\eta \delta_k \).

Each hidden unit \( Z_j \) ( j = 1, 2, …, p ) updates its weights ( i = 1, …, n ) and bias: \( w_{ij}(new)=w_{ij}(old)+\eta \delta_j x_i \); \( b_j(new)=b_j(old)+\eta \delta_j \).

Step 10: Test the stopping condition.
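
The steps above can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the authors' code: it uses a sigmoid activation (so that f′(net) = f(net)(1 − f(net))), a hypothetical 2-3-1 network, a single training pair, and a fixed number of iterations as the stopping condition. Names such as `train_step` are invented for the example.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def forward(x, w_ih, b_h, w_ho, b_o):
    """Feed-forward phase: return hidden activations z and outputs y."""
    z = [sigmoid(b_h[j] + sum(w_ih[i][j] * x[i] for i in range(len(x))))
         for j in range(len(b_h))]
    y = [sigmoid(b_o[k] + sum(z[j] * w_ho[j][k] for j in range(len(z))))
         for k in range(len(b_o))]
    return z, y


def train_step(x, t, w_ih, b_h, w_ho, b_o, eta):
    """Backpropagation phase for one training pair; updates weights in place."""
    z, y = forward(x, w_ih, b_h, w_ho, b_o)
    # Error terms are computed from the old weights before any update.
    delta_k = [(t[k] - y[k]) * y[k] * (1.0 - y[k]) for k in range(len(y))]
    delta_j = [z[j] * (1.0 - z[j]) * sum(delta_k[k] * w_ho[j][k] for k in range(len(y)))
               for j in range(len(z))]
    for j in range(len(z)):
        for k in range(len(y)):
            w_ho[j][k] += eta * delta_k[k] * z[j]
    for k in range(len(y)):
        b_o[k] += eta * delta_k[k]
    for i in range(len(x)):
        for j in range(len(z)):
            w_ih[i][j] += eta * delta_j[j] * x[i]
    for j in range(len(z)):
        b_h[j] += eta * delta_j[j]


def squared_error(t, y):
    return 0.5 * sum((t_k - y_k) ** 2 for t_k, y_k in zip(t, y))


random.seed(0)
n, p, m = 2, 3, 1  # assumed layer sizes: inputs, hidden units, outputs
w_ih = [[random.uniform(-0.5, 0.5) for _ in range(p)] for _ in range(n)]
b_h = [0.0] * p
w_ho = [[random.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(p)]
b_o = [0.0] * m
x, t = [1.0, 0.0], [1.0]

error_before = squared_error(t, forward(x, w_ih, b_h, w_ho, b_o)[1])
for _ in range(200):  # fixed iteration count stands in for the stopping condition
    train_step(x, t, w_ih, b_h, w_ho, b_o, eta=0.5)
error_after = squared_error(t, forward(x, w_ih, b_h, w_ho, b_o)[1])
print(error_before, error_after)
```

Both error terms are computed before any weight is changed, so the hidden-layer error uses the old output weights, as the algorithm requires.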

2.4 Activation function

The activation function is the mechanism by which artificial neurons process and transfer information [ 42 ]. Various activation functions can be used in neural networks, depending on the characteristics of the application. Activation functions are typically non-linear and differentiable. Differentiability is important mainly when training a neural network with the gradient descent method. Some widely used activation functions are listed in Table 1 .
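
To illustrate why differentiability matters for gradient descent, the sketch below defines three commonly used activation functions together with their derivatives. The selection (sigmoid, tanh, ReLU) is an assumption made for the example and is not necessarily the set listed in Table 1. The ReLU derivative at exactly zero is set to 0 by convention.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def sigmoid_prime(a):
    s = sigmoid(a)
    return s * (1.0 - s)


def tanh_prime(a):
    return 1.0 - math.tanh(a) ** 2


def relu(a):
    return max(0.0, a)


def relu_prime(a):
    # ReLU is not differentiable at 0; using 0 there is a common convention.
    return 1.0 if a > 0 else 0.0


for a in (-2.0, 0.0, 2.0):
    print(a, sigmoid(a), sigmoid_prime(a), math.tanh(a), tanh_prime(a), relu(a), relu_prime(a))
```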

3 Deep learning

Deep learning is a subset of machine learning that deals with the development of deep neural networks inspired by biological neural networks in the human brain.

3.1 Autoencoder

The Autoencoder (AE) [ 128 ] is a deep learning model that exemplifies the principle of unsupervised representation learning, as depicted in Fig.  4a . AE is useful when the input data contain far more unlabelled than labelled samples. AE encodes the input x into a lower-dimensional space z. The encoded representation is then decoded into an approximate reconstruction  x ′ of the input x through one hidden layer z.

figure 4

a Autoencoder [ 187 ] b Restricted Boltzmann Machine with n hidden and m visible units [ 88 ] c Deep Belief Networks [ 88 ]

Basic AE consists of three main steps:

Encode: Convert the input vector \( x\in {\mathbb{R}}^m \) into the hidden layer \( h\in {\mathbb{R}}^n \) by \( h=f\left( wx+b\right) \), where \( w\in {\mathbb{R}}^{n\times m} \) and \( b\in {\mathbb{R}}^n \). Here m and n are the dimensions of the input vector and of the hidden state, respectively. The dimension of the hidden layer h is smaller than that of x , and f is an activation function.

Decode: Based on the above h , reconstruct the input vector as z by \( z={f}^{\prime}\left({w}^{\prime }h+{b}^{\prime}\right) \), where \( {w}^{\prime}\in {\mathbb{R}}^{m\times n} \) and \( {b}^{\prime}\in {\mathbb{R}}^m \). The function f ′ is the same as the above activation function.

Calculate square error: \( L_{recons}\left(x,z\right)={\left\Vert x-z\right\Vert}^2 \), which is the reconstruction error cost function. The reconstruction error is minimized by optimizing the cost function (2).
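
The three steps can be sketched as follows. This is a minimal illustration, assuming a sigmoid activation for both encoder and decoder, an input dimension m = 4, a hidden dimension n = 2, and arbitrary hand-picked weights. None of these values come from the paper, and no training is performed.

```python
import math


def sigmoid_vec(values):
    return [1.0 / (1.0 + math.exp(-a)) for a in values]


def affine(w, v, b):
    """Compute w v + b for a matrix w (list of rows), vector v, and bias b."""
    return [sum(w_rc * v_c for w_rc, v_c in zip(row, v)) + b_r
            for row, b_r in zip(w, b)]


def encode(x, w, b):
    return sigmoid_vec(affine(w, x, b))


def decode(h, w_prime, b_prime):
    return sigmoid_vec(affine(w_prime, h, b_prime))


def reconstruction_error(x, z):
    return sum((x_i - z_i) ** 2 for x_i, z_i in zip(x, z))


x = [0.9, 0.1, 0.8, 0.2]                      # input, m = 4
w = [[0.1, -0.2, 0.3, 0.0],
     [0.0, 0.4, -0.1, 0.2]]                   # encoder weights, shape n x m
b = [0.0, 0.0]
w_prime = [[0.5, -0.3], [0.2, 0.1],
           [-0.4, 0.6], [0.3, 0.3]]           # decoder weights, shape m x n
b_prime = [0.0, 0.0, 0.0, 0.0]

h = encode(x, w, b)
z = decode(h, w_prime, b_prime)
loss = reconstruction_error(x, z)
print(h, z, loss)
```

Training an autoencoder amounts to adjusting the encoder and decoder parameters so that this loss decreases over the training set.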

Another unsupervised representation-learning algorithm is the Stacked Autoencoder (SAE). The SAE comprises autoencoder layers stacked on top of each other, where the output of each layer is wired to the input of the next layer. A Denoising Autoencoder (DAE) was introduced by Vincent et al. [ 159 ]. The DAE is trained to reconstruct the input from a version corrupted with random noise. The Variational Autoencoder (VAE) [ 66 ] modifies the encoder so that the latent vector space used to represent the images follows a unit Gaussian distribution. There are two losses in this model: a mean squared error and the Kullback-Leibler divergence loss, which determines how closely the latent variable matches the unit Gaussian distribution. Sparse autoencoders [ 106 ] and variational autoencoders have applications in unsupervised learning, semi-supervised learning, and segmentation.

3.2 Restricted Boltzmann machine

A Restricted Boltzmann Machine (RBM) is a Markov Random Field (MRF) associated with a two-layer undirected probabilistic generative model, as shown in Fig. 4b . An RBM contains visible (input) units v and hidden (output) units  h . A significant feature of this model is that there are no direct connections between any two visible units or between any two hidden units. In binary RBMs, the random variables ( v ,  h ) take values \( \left(v,h\right)\in {\left\{0,1\right\}}^{m+n} \). Like the general Boltzmann machine [ 50 ], the RBM is an energy-based model. The energy of the state { v ,  h } is defined as (3)

\( E\left(v,h\right)=-\sum_{i=1}^n\sum_{j=1}^m w_{ij}h_i v_j-\sum_{j=1}^m b_j v_j-\sum_{i=1}^n c_i h_i \)

where \( v_j \) and \( h_i \) are the binary states of visible unit j  ∈ {1, 2, …, m } and hidden unit i  ∈ {1, 2, …, n }, \( b_j \) and \( c_i \) are the biases of the visible and hidden units, and \( w_{ij} \) is the symmetric interaction term between the units \( v_j \) and \( h_i \). The joint probability of ( v ,  h ) is given by the Gibbs distribution in Eq. ( 4 )

\( p\left(v,h\right)=\frac{1}{Z}{e}^{-E\left(v,h\right)} \)

Z is a “partition function” obtained by summing over all possible pairs of visible v and hidden h (5).

\( Z=\sum_{v,h}{e}^{-E\left(v,h\right)} \)

Because there are no connections within a layer, the conditional distributions p ( h |  v ) and p ( v |  h ) factorize as (6)

\( p\left(h|v\right)=\prod_{i=1}^n p\left(h_i|v\right),\kern1em p\left(v|h\right)=\prod_{j=1}^m p\left(v_j|h\right) \)

For a binary RBM, the conditional distributions of the visible and hidden units are given by (7) and (8)

\( p\left(v_j=1|h\right)=\sigma \left(\sum_{i=1}^n w_{ij}h_i+b_j\right) \)

\( p\left(h_i=1|v\right)=\sigma \left(\sum_{j=1}^m w_{ij}v_j+c_i\right) \)

where σ( · ) is the sigmoid function.

The RBM parameters \( \left(w_{ij},b_j,c_i\right) \) are efficiently learned using the contrastive divergence learning method [ 150 ]. A batch version of k-step contrastive divergence learning (CD-k) is given in the algorithm below [ 36 ]

figure d
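
A minimal sketch of one batch CD-k update for a binary RBM is given below. It follows the usual formulation of contrastive divergence: k steps of Gibbs sampling starting from each training vector, followed by an update based on the difference between data-driven and sample-driven statistics. The learning rate `eta`, the averaging over the batch, the zero initialization, and all function names are assumptions made for illustration.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def hidden_probs(v, w, c):
    """p(h_i = 1 | v) for every hidden unit i; w[i][j] links h_i and v_j."""
    return [sigmoid(c[i] + sum(w[i][j] * v[j] for j in range(len(v))))
            for i in range(len(c))]


def visible_probs(h, w, b):
    """p(v_j = 1 | h) for every visible unit j."""
    return [sigmoid(b[j] + sum(w[i][j] * h[i] for i in range(len(h))))
            for j in range(len(b))]


def sample(probs, rng):
    return [1 if rng.random() < p else 0 for p in probs]


def cd_k_update(batch, w, b, c, k, eta, rng):
    """One CD-k update over a batch; parameters are modified in place."""
    n, m = len(c), len(b)
    dw = [[0.0] * m for _ in range(n)]
    db = [0.0] * m
    dc = [0.0] * n
    for v0 in batch:
        vk = list(v0)
        for _ in range(k):  # k steps of block Gibbs sampling
            hk = sample(hidden_probs(vk, w, c), rng)
            vk = sample(visible_probs(hk, w, b), rng)
        p0 = hidden_probs(v0, w, c)
        pk = hidden_probs(vk, w, c)
        for i in range(n):
            for j in range(m):
                dw[i][j] += p0[i] * v0[j] - pk[i] * vk[j]
            dc[i] += p0[i] - pk[i]
        for j in range(m):
            db[j] += v0[j] - vk[j]
    scale = eta / len(batch)  # averaging over the batch is an assumption
    for i in range(n):
        for j in range(m):
            w[i][j] += scale * dw[i][j]
        c[i] += scale * dc[i]
    for j in range(m):
        b[j] += scale * db[j]


rng = random.Random(0)
batch = [[1, 1, 0, 0], [0, 0, 1, 1]]      # toy binary data, m = 4 visible units
w = [[0.0] * 4 for _ in range(2)]         # n = 2 hidden units
b = [0.0] * 4
c = [0.0] * 2
for _ in range(10):
    cd_k_update(batch, w, b, c, k=1, eta=0.1, rng=rng)
print(w, b, c)
```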

3.3 Deep belief networks

The Deep Belief Network (DBN) proposed by Hinton et al. [ 51 ] is a non-convolutional model that can extract features and learn a deep hierarchical representation of training data. DBNs are generative models constructed by stacking multiple RBMs. A DBN is a hybrid model: the top two layers form an RBM, and the remaining layers form a directed generative model. A DBN has one visible layer v and a series of hidden layers \( h^{(1)},h^{(2)},\dots ,h^{(l)} \), as shown in Fig. 4c . The DBN models the joint distribution between the observed units v and the l  hidden layers \( h^{(k)} \) ( k  = 1, …, l ) as (9)

where \( v=h^{(0)} \) and \( P\left(h^{(k)}|h^{(k+1)}\right) \) is the conditional distribution (10) for layer k given the units of layer k  + 1.

A DBN has l weight matrices \( W^{(1)},\dots ,W^{(l)} \) and l  + 1 bias vectors \( b^{(0)},\dots ,b^{(l)} \). \( P\left(h^{(l)},h^{(l-1)}\right) \) is the joint distribution of the top-level RBM (11).

The probability distribution of DBN is given by Eq. ( 12 )

3.4 Convolutional neural networks (CNN)

CNNs are a distinct family of deep learning models and the principal artificial visual network for identifying patterns in medical images. The CNN family is primarily inspired by knowledge of the animal visual cortex [ 55 , 116 ]. The major problem with a fully connected feed-forward neural network is that, even for shallow architectures, the number of neurons may be very high, which makes such networks impractical for image applications. The CNN reduces the number of parameters, which allows a network to be deeper with fewer parameters.

CNNs are designed around three architectural ideas: shared weights, local receptive fields, and spatial sub-sampling [ 70 ]. The essential element of a CNN is the handling of unstructured data through the convolution operation. Convolution of the input signal  x ( t ) with the filter signal  h ( t ) creates an output signal y ( t ) that may reveal more information than the input signal itself. The 1D convolution of discrete signals x ( t ) and h ( t ) is (13)

\( y(t)=x(t)\ast h(t)=\sum_{\tau =-\infty}^{\infty }x\left(\tau \right)h\left(t-\tau \right) \)

A digital image \( x\left(n_1,n_2\right) \) is a 2-D discrete signal. The convolution of images \( x\left(n_1,n_2\right) \) and \( h\left(n_1,n_2\right) \) is (14)

\( y\left(n_1,n_2\right)=\sum_{k_1=0}^{M-1}\sum_{k_2=0}^{N-1}x\left(k_1,k_2\right)h\left(n_1-k_1,n_2-k_2\right) \)

where 0 ≤  n 1  ≤  M  − 1, 0 ≤  n 2  ≤  N  − 1.
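
The discrete convolutions described above can be sketched directly from their definitions. The functions below compute the "full" convolution of finite signals, treating samples outside the given range as zero; the function names and the toy inputs are illustrative assumptions. Note that most deep learning libraries implement the closely related cross-correlation (without flipping the kernel) under the name convolution.

```python
def conv1d_full(x, h):
    """Full 1D discrete convolution y[n] = sum_k x[k] * h[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(x)):
            if 0 <= n - k < len(h):
                y[n] += x[k] * h[n - k]
    return y


def conv2d_full(x, h):
    """Full 2D discrete convolution of image x with kernel h (lists of rows)."""
    rows = len(x) + len(h) - 1
    cols = len(x[0]) + len(h[0]) - 1
    y = [[0.0] * cols for _ in range(rows)]
    for n1 in range(rows):
        for n2 in range(cols):
            for k1 in range(len(x)):
                for k2 in range(len(x[0])):
                    if 0 <= n1 - k1 < len(h) and 0 <= n2 - k2 < len(h[0]):
                        y[n1][n2] += x[k1][k2] * h[n1 - k1][n2 - k2]
    return y


signal = conv1d_full([1, 2, 3], [0, 1, 0.5])
shifted = conv2d_full([[1, 2], [3, 4]], [[0, 1]])  # kernel that shifts the image right
print(signal)
print(shifted)
```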

The function of the convolution layer is to detect local features x l from input feature maps x l  − 1 using kernels k l by convolution operation (*) i.e. x l  − 1  ∗  k l . This convolution operation is repeated for every convolutional layer subject to non-linear transform (15)

where \( {k}_{mn}^{(l)} \) represents the weights between feature map  m at layer l  − 1 and feature map n at layer l , \( {x}_m^{\left(l-1\right)} \) represents feature map  m of layer l  − 1, and \( {x}_n^{(l)} \) is feature map n of layer l . \( {b}_m^{(l)} \) is the bias parameter, f (.) is the non-linear activation function, and  M l  − 1 denotes a set of feature maps. A CNN significantly reduces the number of parameters compared with a fully connected neural network because of local connectivity and weight sharing. The depth, zero-padding, and stride are three hyperparameters that control the volume of the convolution layer output.
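
To make the effect of these hyperparameters and of weight sharing concrete, the sketch below uses the standard output-size relation (W − K + 2P)/S + 1 for an input of width W, kernel size K, zero-padding P, and stride S. This relation is a commonly used result and is not stated in the text above. The parameter-count helpers and the example sizes are likewise illustrative assumptions.

```python
def conv_output_size(input_size, kernel_size, padding, stride):
    """Spatial size of a convolution output along one dimension."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


def conv_layer_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per output feature map (shared across positions)."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels


def dense_layer_params(n_inputs, n_outputs):
    """Weights plus one bias per output unit in a fully connected layer."""
    return (n_inputs + 1) * n_outputs


side = conv_output_size(28, 3, 0, 1)                 # 28x28 input, 3x3 kernel
shared = conv_layer_params(3, 1, 8)                  # 8 feature maps
dense = dense_layer_params(28 * 28, 8 * side * side)  # same output volume, no sharing
print(side, shared, dense)
```

For the same output volume, the convolution layer needs only a few dozen parameters, whereas the fully connected layer needs millions.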

A pooling layer follows the convolutional layer to subsample the feature maps. The goal of the pooling layers is to achieve spatial invariance by reducing the spatial dimension of the feature maps for the next convolution layer. Max pooling and average pooling are the two pooling operations commonly used for downsampling. Let M be the size of the pooling region, let the elements in the pooling region be \( x_j=\left(x_1,x_2,\dots ,x_{M\times M}\right) \), and let the output after pooling be \( x_i \). Max pooling and average pooling are described in Eqs. ( 16 ) and ( 17 ), respectively.

\( x_i=\max_{1\le j\le M\times M}x_j \)

\( x_i=\frac{1}{M\times M}\sum_{j=1}^{M\times M}x_j \)
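
As a small worked example of the two pooling operations, the sketch below applies non-overlapping 2 × 2 max pooling and average pooling to a 4 × 4 feature map. The function name, the choice of stride equal to the region size, and the input values are assumptions made for illustration.

```python
def pool2d(feature_map, size, mode="max"):
    """Non-overlapping size x size pooling (stride equal to the region size)."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            region = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                row.append(max(region))
            else:
                row.append(sum(region) / len(region))
        pooled.append(row)
    return pooled


feature_map = [[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 0, 5, 6],
               [1, 3, 7, 8]]
max_pooled = pool2d(feature_map, 2, mode="max")
avg_pooled = pool2d(feature_map, 2, mode="avg")
print(max_pooled)
print(avg_pooled)
```

Both operations halve each spatial dimension here: max pooling keeps the strongest response in each region, while average pooling smooths over it.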

The max-pooling method selects the strongest invariant feature in a pooling region, whereas the average pooling method takes the average of all the features in the pooling region. Thus, max pooling retains texture information and can lead to faster convergence, while average pooling retains background information [ 133 ]. Spatial pyramid pooling [ 48 ], stochastic pooling [ 175 ], Def-pooling [ 109 ], multi-activation pooling [ 189 ], and detail-preserving pooling [ 130 ] are other pooling techniques in the literature. A fully connected layer is used at the end of the CNN model. Fully connected layers perform like a traditional neural network [ 174 ]. After the pooling layers, the feature maps of the previous layer are flattened into a vector and passed to the fully connected layers, which output an N-dimensional vector, where N is the number of classes.

The first successful CNN, the seven-layer LeNet-5, was developed by Yann LeCun in the 1990s for handwritten digit recognition. Krizhevsky et al. [ 68 ] proposed AlexNet, a deep convolutional neural network composed of 5 convolutional and 3 fully connected layers. AlexNet replaced the sigmoid activation function with the ReLU activation function to make model training easier.

K. Simonyan and A. Zisserman proposed VGG-16 [ 143 ], which has 13 convolutional and 3 fully connected layers. The Visual Geometry Group (VGG) released a series of CNNs: VGG-11, VGG-13, VGG-16, and VGG-19. The group's main intention was to understand how the depth of convolutional networks affects the accuracy of image classification and recognition models. The largest variant, VGG-19, has 16 convolutional layers and 3 fully connected layers, while the smallest, VGG-11, has 8 convolutional layers and 3 fully connected layers. The last three fully connected layers are the same across all VGG variants.

Szegedy et al. [ 151 ] proposed GoogLeNet, an image classification network consisting of 22 layers. The main idea behind GoogLeNet is the introduction of inception layers. Each inception layer convolves its input in parallel using different filter sizes. Kaiming He et al. [ 49 ] proposed the ResNet architecture, which has 33 convolutional layers and one fully connected layer. Many models had adopted multiple hidden layers and extremely deep neural networks, but such models were found to suffer from vanishing or exploding gradients. To mitigate the vanishing gradient problem, skip layers (shortcut connections) were introduced. DenseNet, developed by Gao Huang et al. [ 54 ], consists of several dense blocks, with transition blocks placed between adjacent dense blocks. Each layer in a dense block applies batch normalization, followed by a ReLU and a 3 × 3 convolution. The transition blocks are made of batch normalization, 1 × 1 convolution, and average pooling.

Compared to state-of-the-art handcrafted feature detectors, CNNs are an efficient technique for detecting the features of an object and achieving good classification performance. A drawback of CNNs is that the spatial relationships, size, perspective, and orientation of features are not taken into account. To overcome the loss of information caused by the pooling operation, Capsule Networks (CapsNet) are used to retain spatial information and the most significant features [ 129 ]. Special neurons, called capsules, can efficiently detect distinct information. The capsule network consists of four main components: matrix multiplication, scalar weighting of the input, a dynamic routing algorithm, and a squashing function.

3.5 Recurrent neural networks (RNN)

An RNN is a class of neural networks used for processing sequential data. The structure of the RNN shown in Fig.  5a is like an FFNN, with the difference that recurrent connections are introduced among hidden nodes. In a generic RNN model at time t , the recurrent hidden unit h t receives input activation from the present data x t and the previous hidden state  h t  − 1 . The output y t is calculated from the hidden state h t . This can be represented by Eqs. ( 18 ) and ( 19 ) as

\( h_t=f\left(w_{hx}x_t+w_{hh}h_{t-1}+b_h\right) \)

\( y_t=f\left(w_{yh}h_t+b_y\right) \)

figure 5

a Recurrent Neural Networks [ 163 ] b Long Short-Term Memory [ 163 ] c Generative Adversarial Networks [ 64 ]

Here f is a non-linear activation function, w hx is the weight matrix between the input and hidden layers, w hh is the matrix of recurrent weights between the hidden layer and itself, w yh is the weight matrix between the hidden and output layers, and b h and b y are biases that allow each node to learn an offset. While the RNN is a simple and efficient model, it is in practice difficult to train properly. The Real-Time Recurrent Learning (RTRL) algorithm [ 173 ] and Back Propagation Through Time (BPTT) [ 170 ] are used to train RNNs. Training with these methods frequently fails because of the vanishing (multiplication of many small values) or exploding (multiplication of many large values) gradient problem [ 10 , 112 ]. Hochreiter and Schmidhuber (1997) designed a new RNN model named Long Short-Term Memory (LSTM) that overcomes error backflow problems with the aid of a specially designed memory cell [ 52 ]. Figure 5b shows an LSTM cell, which is typically configured with three gates: input gate g t , forget gate  f t , and output gate  o t . These gates add or remove information from the cell.

An LSTM can be represented with the following Eqs. ( 20 ) to ( 25 )
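
A single LSTM time step can be sketched as follows. The sketch assumes the commonly used formulation with three sigmoid gates (input g t , forget f t , output o t ), a tanh candidate cell state, and weights applied to the concatenation of x t and h t − 1 . The parameter names (`w_g`, `w_f`, `w_o`, `w_c` and the matching biases) are hypothetical, and the exact form may differ from the paper's Eqs. (20) to (25).

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def affine(w, v, b):
    """Compute w v + b for a matrix w (list of rows), vector v, and bias b."""
    return [sum(w_rc * v_c for w_rc, v_c in zip(row, v)) + b_r
            for row, b_r in zip(w, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; returns the new hidden state h_t and cell state c_t."""
    v = list(x_t) + list(h_prev)  # concatenate input and previous hidden state
    g = [sigmoid(a) for a in affine(params["w_g"], v, params["b_g"])]   # input gate
    f = [sigmoid(a) for a in affine(params["w_f"], v, params["b_f"])]   # forget gate
    o = [sigmoid(a) for a in affine(params["w_o"], v, params["b_o"])]   # output gate
    c_tilde = [math.tanh(a) for a in affine(params["w_c"], v, params["b_c"])]
    c_t = [f[i] * c_prev[i] + g[i] * c_tilde[i] for i in range(len(c_prev))]
    h_t = [o[i] * math.tanh(c_t[i]) for i in range(len(c_t))]
    return h_t, c_t


# Toy configuration: 2 inputs, 1 hidden unit, all parameters zero.
zero_w = [[0.0, 0.0, 0.0]]
params = {name: zero_w for name in ("w_g", "w_f", "w_o", "w_c")}
params.update({name: [0.0] for name in ("b_g", "b_f", "b_o", "b_c")})
h_t, c_t = lstm_step([1.0, -1.0], [0.0], [2.0], params)
print(h_t, c_t)
```

With all parameters at zero, every gate equals 0.5 and the candidate state is 0, so the cell state is simply halved. This shows how the forget gate scales the information carried over from the previous step.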

3.6 Generative adversarial networks (GAN)

Generative Adversarial Networks (GANs), introduced by Goodfellow et al. [ 43 ], are one class of deep generative models. GANs are neural networks that can generate synthetic images that closely imitate the original images. In the GAN shown in Fig. 5c , two neural networks, a generator and a discriminator, are trained simultaneously. The generator G generates counterfeit data samples that aim to “fool” the discriminator  D , while the discriminator attempts to correctly distinguish the true and false samples. In mathematical terms, D and G play a two-player minimax game with the cost function of (26) [ 64 ].

\( \min_G\max_D V\left(D,G\right)={E}_{x\sim {p}_{data}(x)}\left[\log D(x)\right]+{E}_{z\sim {p}_z(z)}\left[\log \left(1-D\left(G(z)\right)\right)\right] \)

Here, x represents the original image and z is a noise vector of random numbers. p data ( x ) and p z ( z ) are the probability distributions of x and  z , respectively.  D ( x ) represents the probability that x comes from the actual data p data ( x ) rather than the generated data, and 1 −  D ( G ( z )) is the probability that a sample generated from p z ( z ) is judged to be generated. The expectation over x from the real data distribution  p data is expressed by \( {E}_{x\sim {p}_{data}(x)} \) and the expectation over z sampled from noise is \( {E}_{z\sim {p}_z(z)} \). The training goal for the discriminator is to maximize this objective, while the training objective for the generator is to minimize the term log (1 −  D ( G ( z ))). The most common uses of GANs in medical image analysis are data augmentation (generating new data) and image-to-image translation [ 107 ]. The trustworthiness of the generated data, unstable training, and the evaluation of generated data are three major drawbacks of GANs that might hinder their acceptance in the medical community [ 183 ].
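
The minimax objective can be illustrated numerically. The sketch below estimates the value function from discriminator outputs on a few real and generated samples; the function name and the probability values are made up for the example. When the discriminator cannot tell real from generated samples and outputs 0.5 everywhere, the value equals −log 4.

```python
import math


def value_function(d_real, d_fake):
    """Sample estimate of E[log D(x)] + E[log(1 - D(G(z)))].

    d_real holds D(x) for real samples; d_fake holds D(G(z)) for generated ones.
    """
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term


# Discriminator that cannot distinguish real from generated samples.
v_confused = value_function([0.5, 0.5], [0.5, 0.5])
# Discriminator that separates the two sets well (hypothetical outputs).
v_confident = value_function([0.9, 0.8], [0.1, 0.2])
print(v_confused, v_confident)
```

A better discriminator raises the value, which is what it maximizes, while the generator lowers it by producing samples for which D ( G ( z )) is close to 1.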

3.7 U-Net

Ronneberger et al. [ 126 ] proposed the CNN-based U-Net architecture for the segmentation of biomedical image data. The architecture consists of a contracting path (left side) to capture context and a symmetric expansive path (right side) that enables precise localization. U-Net is a generalized DLA used for quantification tasks such as cell detection and shape measurement in medical image data [ 34 ].

3.8 Software frameworks

Several software frameworks are available for implementing DLA, and they are regularly updated as new approaches and ideas emerge. DLA builds on many levels of mathematical principles based on probability, linear algebra, calculus, and numerical computation. Deep learning frameworks include Theano, TensorFlow, Caffe, CNTK, Torch, Neon, pylearn, etc. [ 138 ]. Python is probably the most commonly used programming language for DL. As of 2019, PyTorch and TensorFlow were the most widely used libraries for research. Table 2 shows the analysis of various deep learning frameworks based on the core language and supported interface language.

4 Use of deep learning in medical imaging

4.1 X-ray image

Chest radiography is widely used to diagnose heart pathologies and lung diseases such as tuberculosis, atelectasis, consolidation, pleural effusion, pneumothorax, and hyper cardiac inflation. X-ray imaging is accessible, affordable, and involves a relatively low radiation dose, which makes it a powerful tool for mass screening [ 14 ]. Table 3 presents a description of the DL methods used for X-ray image analysis.

S. Hwang et al. [ 57 ] proposed the first deep CNN-based tuberculosis screening system with a transfer learning technique. Rajaraman et al. [ 119 ] proposed modality-specific ensemble learning for the detection of abnormalities in chest X-rays (CXRs). The predictions of the individual models are combined using various ensemble techniques to minimize prediction variance, and class-selective relevance mapping (CRM) is used to visualize the abnormal regions in the CXR images. Loey et al. [ 90 ] proposed a GAN with deep transfer learning for COVID-19 detection in CXR images. The GAN was used to generate additional CXR images because COVID-19 data were scarce. Waheed et al. [ 160 ] proposed CovidGAN, a model based on the Auxiliary Classifier Generative Adversarial Network (ACGAN), to produce synthetic CXR images for COVID-19 detection. S. Rajaraman and S. Antani [ 120 ] introduced weakly labeled data augmentation to enlarge the training dataset and improve COVID-19 detection performance in CXR images.
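Why combining model predictions reduces prediction variance can be seen in a small simulation. This is a sketch under strong assumptions, not the ensemble scheme of any cited work: the five "models" are simulated as independent noisy estimates of a hypothetical ideal probability, and they are combined by simple unweighted averaging.

```python
import numpy as np

rng = np.random.default_rng(0)
true_prob = 0.7  # hypothetical "ideal" abnormality probability

# Simulate 5 models whose predictions scatter independently around the
# ideal value, for 1000 images (rows: models, columns: images).
model_preds = np.clip(true_prob + rng.normal(0.0, 0.1, size=(5, 1000)), 0.0, 1.0)

single_var = model_preds[0].var()     # variance of one model's predictions
ensemble = model_preds.mean(axis=0)   # simple (unweighted) averaging
ensemble_var = ensemble.var()         # variance after averaging
```

With independent errors, averaging N models shrinks the variance by roughly a factor of N; real models trained on the same data have correlated errors, so the gain in practice is smaller.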

4.2 Computerized tomography (CT)

CT uses computers and rotating X-ray equipment to create cross-sectional images of the body. CT scans show the soft tissues, blood vessels, and bones in different parts of the body. CT has a high detection ability, reveals small lesions, and provides a more detailed assessment. CT examinations are frequently used for pulmonary nodule identification [ 93 ]. The detection of malignant pulmonary nodules is fundamental to the early diagnosis of lung cancer [ 102 , 142 ]. Table 4 summarizes the latest deep learning developments in the study of CT image analysis.

Li et al. (2016) [ 74 ] proposed a deep CNN for the detection of three types of nodules: semisolid, solid, and ground-glass opacity. Balagourouchetty et al. [ 5 ] proposed a GoogLeNet-based ensemble FCNet classifier for liver lesion classification, in which the basic GoogLeNet architecture is modified in three ways for feature extraction. Masood et al. [ 95 ] proposed the multidimensional Region-based Fully Convolutional Network (mRFCN) for lung nodule detection and classification and achieved a classification accuracy of 97.91%. In lung nodule detection, future work is the detection of micronodules (smaller than 3 mm) without loss of sensitivity and accuracy. Zhao and Zeng (2019) [ 190 ] proposed a DLA based on supervised MSS U-Net and 3D U-Net to automatically segment kidneys and kidney tumors from CT images. During the COVID-19 pandemic, Fan et al. [ 35 ] and Li et al. [ 79 ] used deep learning-based techniques for COVID-19 detection from CT images.

4.3 Mammography (MG)

Breast cancer is one of the world's leading causes of cancer death among women. MG is a reliable tool and the most common modality for early detection of breast cancer. MG is a low-dose X-ray imaging method used to visualize the breast structure for the detection of breast diseases [ 40 ]. Detecting breast cancer on screening mammography is a difficult image classification task because tumors occupy only a small part of the breast image. Analyzing breast lesions from MG involves three steps: detection, segmentation, and classification [ 139 ].

The automatic detection and classification of masses at an early stage in MG remains an active research topic. Over the past decade, DLA has made significant progress on breast cancer detection and classification. Table 5 summarizes the latest DLA developments in the study of mammogram image analysis.

Fonseca et al. [ 37 ] proposed a breast composition classification according to the ACR standard, using a CNN for feature extraction. Wang et al. [ 161 ] proposed a twelve-layer CNN to detect breast arterial calcifications (BACs) in mammogram images for risk assessment of coronary artery disease. Ribli et al. [ 124 ] developed a CAD system based on Faster R-CNN for the detection and classification of benign and malignant lesions on a mammogram image without any human involvement. Wu et al. [ 176 ] presented a deep CNN trained and evaluated on over 1,000,000 mammogram images for breast cancer screening exam classification. Conant et al. [ 26 ] developed a deep CNN-based AI system to detect calcified and soft-tissue lesions in digital breast tomosynthesis (DBT) images. Kang et al. [ 62 ] introduced the fuzzy fully connected layer (FFCL) architecture, which fuses fuzzy rules with a traditional CNN for semantic BI-RADS scoring. The proposed FFCL framework achieved superior results in BI-RADS scoring for both triple-class and multi-class classification.

4.4 Histopathology

Histopathology is the study of human tissue on a glass slide under a microscope to identify diseases such as kidney cancer, lung cancer, breast cancer, and so on. Staining is used in histopathology to visualize and highlight specific parts of the tissue [ 45 ]. For example, Hematoxylin and Eosin (H&E) staining gives a dark purple color to the nucleus and a pink color to other structures. Over the last century, the H&E stain has played a key role in the diagnosis of different pathologies and in cancer diagnosis and grading. The most recent imaging modality is digital pathology.

Deep learning is emerging as an effective method in the analysis of histopathology images, including nucleus detection, image classification, cell segmentation, tissue segmentation, etc. [ 178 ]. Tables 6 and 7 summarize the latest deep learning developments in pathology. The latest development in digital pathology image analysis is the introduction of whole slide imaging (WSI), which digitizes glass slides with stained tissue sections at high resolution. Dimitriou et al. [ 30 ] reviewed the challenges of analyzing multi-gigabyte WSI images when building deep learning models. A. Serag et al. [ 135 ] discuss different public “Grand Challenges” that have driven innovations using DLA in computational pathology.

4.5 Other images

Endoscopy is a nonsurgical procedure in which a long tube is inserted directly into the body for detailed visual examination of an internal organ or tissue. Endoscopy is useful for studying several systems inside the human body, such as the gastrointestinal tract, the respiratory tract, the urinary tract, and the female reproductive tract [ 60 , 101 ]. Du et al. [ 31 ] reviewed the applications of deep learning in the analysis of gastrointestinal endoscopy images. Wireless capsule endoscopy (WCE) is a revolutionary device for direct, painless, and non-invasive inspection of the gastrointestinal (GI) tract to detect and diagnose GI diseases (ulcer, bleeding). Soffer et al. [ 145 ] performed a systematic analysis of the existing literature on the implementation of deep learning in WCE. The first deep learning-based framework for the detection of hookworm in WCE images was proposed by He et al. [ 46 ]. It integrates two CNNs, one for edge extraction and one for hookworm classification. Since tubular structures are crucial elements for hookworm detection, the edge extraction network was used for tubular region detection. Yoon et al. [ 185 ] developed a CNN model for early gastric cancer (EGC) identification and prediction of invasion depth; the depth of tumor invasion in EGC is a significant factor in deciding the method of treatment. To classify endoscopic images as EGC or non-EGC, the authors employed a VGG-16 model. Nakagawa et al. [ 105 ] applied a CNN-based DL technique to enhance the diagnostic assessment of oesophageal wall invasion using endoscopy. J. Choi et al. [ 22 ] discuss the future aspects of DL in endoscopy.

Positron Emission Tomography (PET) is a nuclear imaging tool that generally relies on the injection of specific radioactive tracers to visualize molecular-level activities within tissues. T. Wang et al. [ 168 ] reviewed applications of machine learning in PET attenuation correction (PET AC) and low-count PET reconstruction. The authors discussed the advantages of deep learning over conventional machine learning in PET imaging applications. A. J. Reader et al. [ 123 ] reviewed how deep learning can be used in PET image reconstruction, either directly or as a part of traditional reconstruction methods.

5 Discussion

The primary purpose of this paper is to review numerous publications on deep learning applications in medical imaging. Classification, detection, and segmentation are essential tasks in medical image processing [ 144 ]. Training deep neural networks for specific medical tasks requires a large amount of labeled data, but in the medical field datasets with even thousands of labeled samples are often unavailable. This issue is alleviated by a technique called transfer learning. Two transfer learning approaches are popular and widely applied: using a pre-trained network as a fixed feature extractor, and fine-tuning a pre-trained network. In classification, deep learning models assign images to two or more classes. In detection, deep learning models identify tumors and organs in medical images. In segmentation, deep learning models delineate the region of interest in medical images for further processing.
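The difference between the two transfer learning approaches can be sketched with a toy model. This is an illustrative assumption, not a medical imaging pipeline: random weights `W_pre` stand in for a pre-trained feature layer, the data are synthetic, and the hypothetical `train` function either keeps that layer frozen (fixed feature extractor) or also updates it (fine-tuning).

```python
import numpy as np

def train(X, y, W_feat, fine_tune=False, lr=0.1, steps=200):
    """Train a logistic head on top of a one-layer ReLU feature extractor."""
    W_feat = W_feat.copy()                   # never modify the caller's weights
    w_head = np.zeros(W_feat.shape[1])       # new task-specific classifier
    for _ in range(steps):
        pre = X @ W_feat                     # pre-activations, shape (n, hidden)
        h = np.maximum(pre, 0.0)             # ReLU features
        logits = np.clip(h @ w_head, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-logits))    # predicted probabilities
        err = (p - y) / len(y)               # gradient of mean cross-entropy w.r.t. logits
        grad_head = h.T @ err
        if fine_tune:                        # fine-tuning: also update the features
            grad_pre = np.outer(err, w_head) * (pre > 0)
            W_feat -= lr * (X.T @ grad_pre)
        w_head -= lr * grad_head             # both approaches train the head
    return W_feat, w_head

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))                 # synthetic inputs
y = (X[:, 0] > 0).astype(float)              # synthetic binary labels
W_pre = rng.normal(size=(5, 8))              # stand-in for pre-trained weights

W_frozen, head_frozen = train(X, y, W_pre, fine_tune=False)
W_tuned, head_tuned = train(X, y, W_pre, fine_tune=True)
```

In the frozen case only the head changes, which suits very small datasets; fine-tuning adapts the features as well, at the cost of a higher risk of overfitting when labeled data are scarce.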

5.1 Segmentation

Deep learning has been widely used for medical image segmentation, and several articles have documented its progress in the area. Segmentation of breast tissue using deep learning alone has been successfully implemented [ 104 ]. Xing et al. [ 179 ] used a CNN to acquire the initial shape of the nucleus and then isolated the actual nucleus using a deformable pattern. Qu et al. [ 118 ] suggested a deep learning approach that could segment individual nuclei and classify each as a tumor, lymphocyte, or stroma nucleus. Pinckaers and Litjens [ 115 ] show on a colon gland segmentation dataset (GlaS) that Neural Ordinary Differential Equations (NODE) can be used within the U-Net framework to get better segmentation results. Sun (2019) [ 149 ] developed a deep learning architecture for gastric cancer segmentation that shows the advantage of combining multi-scale modules with specific convolution operations. U-Net, shown in Fig. 6, is the most commonly used network for segmentation.

figure 6

U-Net architecture for segmentation, comprising encoder (downsampling) and decoder (upsampling) sections [ 135 ]

5.2 Detection

The main challenge for lesion detection methods is that they can produce many false positives while still missing a substantial proportion of true positives. Deep learning methods have been applied to tuberculosis detection in [ 53 , 57 , 58 , 91 , 119 ]. Pulmonary nodule detection using deep learning has been successfully applied in [ 82 , 108 , 136 , 157 ].
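The trade-off between false positives and missed lesions depends on the decision threshold applied to a detector's candidate scores. The sketch below uses made-up scores and labels and a hypothetical helper, `detection_counts`, to show how a strict threshold misses lesions while a lenient one admits false positives.

```python
import numpy as np

def detection_counts(scores, is_lesion, threshold):
    """Count true and false positives among candidates scored by a detector."""
    scores = np.asarray(scores)
    is_lesion = np.asarray(is_lesion, dtype=bool)
    flagged = scores >= threshold                 # candidates reported as lesions
    tp = int(np.sum(flagged & is_lesion))         # real lesions that were flagged
    fp = int(np.sum(flagged & ~is_lesion))        # non-lesions that were flagged
    sensitivity = tp / int(is_lesion.sum())       # fraction of real lesions found
    return tp, fp, sensitivity

# Hypothetical candidate scores and ground-truth labels (1 = real lesion).
scores = [0.95, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20]
is_lesion = [1, 1, 0, 1, 0, 0, 0]

strict = detection_counts(scores, is_lesion, threshold=0.7)
lenient = detection_counts(scores, is_lesion, threshold=0.35)
```

With the strict threshold the detector reports no false positives but finds only two of the three lesions; with the lenient threshold it finds all three at the cost of two false positives.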

Shin et al. [ 141 ] discussed the effect of pre-trained CNN architectures and transfer learning on the identification of enlarged thoracoabdominal lymph nodes and the diagnosis of interstitial lung disease on CT scans, and found transfer learning to be helpful even though natural images differ from medical images. Litjens et al. [ 85 ] introduced a CNN for the identification of prostate cancer in biopsy specimens and of breast cancer metastases in sentinel lymph nodes. The CNN has four convolution layers for feature extraction and three classification layers. Ribli et al. [ 124 ] proposed a Faster R-CNN model that detects mammography lesions and classifies them as benign or malignant; it finished second in the Digital Mammography DREAM Challenge. Figure 7 shows the VGG architecture for detection.

figure 7

CNN architecture for detection [ 144 ]

An object detection framework for medical images named Clustering CNN (CLU-CNNs) was proposed by Z. Li et al. [ 76 ]. CLU-CNNs used Agglomerative Nesting Clustering Filtering (ANCF) and BN-IN Net to reduce the computational cost of processing medical images. Image saliency detection aims at locating the most eye-catching regions in a given scene [ 21 , 78 ]. It also acts as a pre-processing tool in different applications, including video saliency detection [ 17 , 18 ], object recognition, and object tracking [ 20 ]. Saliency maps are a commonly used tool for determining which areas of the input image are most important to the prediction of a trained CNN [ 92 ]. N. T. Arun et al. [ 4 ] evaluated the performance of several popular saliency methods on the RSNA Pneumonia Detection dataset and found that GradCAM was sensitive to the model parameters and model architecture.
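The idea behind a saliency map can be illustrated with the simplest gradient-based variant. The sketch below is not Grad-CAM and does not use a CNN: it assumes a hypothetical logistic model with score \( \sigma(w \cdot x + b) \) and takes the absolute gradient of that score with respect to each pixel.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def saliency_map(image, weights, bias=0.0):
    """Absolute gradient of a logistic score with respect to each input pixel."""
    p = sigmoid(np.sum(weights * image) + bias)
    grad = p * (1.0 - p) * weights   # d sigmoid(w.x + b) / dx, by the chain rule
    return np.abs(grad)

rng = np.random.default_rng(0)
image = rng.random((4, 4))           # synthetic 4x4 "image"
weights = np.zeros((4, 4))
weights[1:3, 1:3] = 2.0              # hypothetical model that only uses the centre
sal = saliency_map(image, weights)
```

The map is non-zero only where the model's output actually depends on the pixel, which is the property saliency methods try to expose for deep networks.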

5.3 Classification

In classification tasks, deep learning techniques based on CNN have seen several advancements. The success of CNN in image classification has led researchers to investigate its usefulness as a diagnostic method for identifying and characterizing pulmonary nodules in CT images. The classification of lung nodules using deep learning [ 74 , 108 , 117 , 141 ] has also been successfully implemented.

Breast parenchymal density is an important indicator of breast cancer risk. DL algorithms used for density assessment can significantly reduce the burden on radiologists. Breast density classification using DL has been successfully implemented [ 37 , 59 , 72 , 177 ]. Ionescu et al. [ 59 ] introduced a CNN-based method to predict the Visual Analog Score (VAS) for breast density estimation. Figure 8 shows the AlexNet architecture for classification.

Alcoholism, or alcohol use disorder (AUD), affects the brain, and the resulting changes in brain structure can be observed using neuroimaging. S. H. Wang et al. [ 162 ] proposed a 10-layer CNN for the AUD problem using dropout, batch normalization, and PReLU techniques; the model obtained a sensitivity of 97.73%, a specificity of 97.69%, and an accuracy of 97.71%. Cerebral microbleeds (CMBs) are small chronic brain hemorrhages that can result in cognitive impairment, long-term disability, and neurologic dysfunction. Early-stage identification of CMBs is therefore essential for prompt treatment. S. Wang et al. [ 164 ] proposed a transfer learning-based DenseNet to detect CMBs. The DenseNet-based model attained an accuracy of 97.71% (Fig. 8 ).
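Sensitivity, specificity, and accuracy, as reported for the classifiers above, all derive from the four cells of a binary confusion matrix. The counts in this sketch are invented purely to show the arithmetic and are not taken from any cited study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # fraction of positives detected
    specificity = tn / (tn + fp)                 # fraction of negatives cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # fraction of all cases correct
    return sensitivity, specificity, accuracy

# Hypothetical counts: 100 diseased and 100 healthy subjects.
sens, spec, acc = classification_metrics(tp=90, fn=10, tn=80, fp=20)
```

Here the sensitivity is 0.90, the specificity 0.80, and the accuracy 0.85; when the two classes are balanced, as in this example, accuracy is the average of the other two.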

figure 8

CNN architecture for classification [ 144 ]

5.4 Limitations and challenges

The application of deep learning algorithms to medical imaging is fascinating, but many challenges are slowing progress. One limitation to the adoption of DL in medical image analysis is inconsistency in the data itself (resolution, contrast, signal-to-noise), typically caused by differing procedures in clinical practice [ 113 ]. The non-standardized acquisition of medical images is another limitation. The major challenge is limited data: compared to other datasets, medical data are extremely difficult to share. Medical data privacy is both a sociological and a technological issue that needs to be addressed from both viewpoints. Annotation is another major challenge. Building DLA requires a large amount of annotated data, but labeling medical images requires radiologists' domain knowledge, so annotating adequate medical data is time-consuming, and this need for comprehensive annotations limits the applicability of deep learning. Semi-supervised learning could be used to combine the existing labeled data with vast amounts of unlabeled data and so alleviate the issue of “limited labeled data”. Another way to address “data scarcity” is to develop few-shot learning algorithms that need a considerably smaller amount of data. Despite the successes of DL technology, many restrictions and obstacles remain in the medical field. Whether DL can reduce medical costs, increase medical efficiency, and improve patient satisfaction has not yet been adequately verified. Clinical trials are therefore needed to demonstrate the efficacy of deep learning methods, and guidelines should be developed for deep learning applications in medical image analysis.
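One simple form of semi-supervised learning is self-training with pseudo-labels. The sketch below is an illustration under stated assumptions rather than a method from the cited literature: the data are synthetic two-dimensional points, the classifier is a nearest-centroid rule, and the names `centroids`, `predict`, and `self_train` are hypothetical.

```python
import numpy as np

def centroids(X, y):
    """Mean feature vector of each class (classes 0 and 1)."""
    return np.stack([X[y == k].mean(axis=0) for k in (0, 1)])

def predict(X, cents):
    """Nearest-centroid labels plus the distance of every point to each centroid."""
    d = np.linalg.norm(X[:, None, :] - cents[None, :, :], axis=2)
    return d.argmin(axis=1), d

def self_train(X_lab, y_lab, X_unlab, margin=1.0):
    """Fit on labeled data, pseudo-label confident unlabeled points, then refit."""
    cents = centroids(X_lab, y_lab)
    pseudo, d = predict(X_unlab, cents)
    confident = np.abs(d[:, 0] - d[:, 1]) > margin   # clearly closer to one class
    X_all = np.vstack([X_lab, X_unlab[confident]])
    y_all = np.concatenate([y_lab, pseudo[confident]])
    return centroids(X_all, y_all), int(confident.sum())

rng = np.random.default_rng(0)
# 400 unlabeled points from two synthetic classes; y_true is used only for checking.
X_unlab = np.vstack([rng.normal(-2, 1, (200, 2)), rng.normal(2, 1, (200, 2))])
y_true = np.repeat([0, 1], 200)
# Only four labeled points are available.
X_lab = np.array([[-2.5, -1.0], [-1.0, -2.0], [1.5, 2.5], [2.5, 1.0]])
y_lab = np.array([0, 0, 1, 1])

cents, n_pseudo = self_train(X_lab, y_lab, X_unlab)
```

The refitted centroids are estimated from many more points than the four labeled ones, which is the benefit semi-supervised learning aims for; the confidence margin limits the risk of reinforcing wrong pseudo-labels.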

6 Conclusion and future directions

Medical imaging is a primary source of the information needed for clinical decisions. This paper discusses new algorithms and strategies in the area of deep learning. This brief introduction to DLA in medical image analysis has two objectives. The first is to introduce the field of deep learning and the associated theory. The second is to provide a general overview of medical image analysis using DLA. The paper began with the history of neural networks since the 1940s and ended with recent breakthroughs of DL algorithms in medical applications. Several supervised and unsupervised DL algorithms were discussed first, including auto-encoders, recurrent networks, CNNs, and restricted Boltzmann machines. Several optimization techniques and frameworks in this area, including Caffe, TensorFlow, Theano, and PyTorch, were then discussed. After that, the most successful DL methods were reviewed in various medical image applications, including classification, detection, and segmentation. Applications of the RBM network are rarely published in the medical image analysis literature. In classification and detection, CNN-based models have achieved good results and are the most commonly used. Several solutions to medical challenges already exist. However, there are still several issues in medical image processing that need to be addressed with deep learning. Many current DL implementations are supervised algorithms, while deep learning is slowly moving to unsupervised and semi-supervised learning to handle real-world data without manual human labels.

DLA can support clinical decisions for next-generation radiologists. DLA can automate radiologist workflow and facilitate decision-making for inexperienced radiologists. DLA is intended to aid physicians by automatically identifying and classifying lesions to provide a more precise diagnosis. DLA can help physicians minimize medical errors and increase efficiency in medical image analysis. DL-based automated diagnosis from medical images is expected to be widely used for patient treatment in the next few decades. Therefore, physicians and scientists should seek the best ways to provide better care to patients with the help of DLA. One potential direction for future research in medical image analysis is the design of deep neural network architectures, since improvements in network structure have a direct impact on medical image analysis. Manual design of a DL model structure requires rich knowledge; hence neural architecture search will probably replace manual design [ 73 ]. The design of various activation functions is also a meaningful future research direction. Radiation therapy is crucial for cancer treatment, and different medical imaging modalities play a critical role in treatment planning. Radiomics is defined as the extraction of high-throughput features from medical images [ 28 ]. In the future, deep-learning analysis of radiomics will be a promising tool in clinical research for clinical diagnosis, drug development, and treatment selection for cancer patients. Due to limited annotated medical data, unsupervised, weakly supervised, and reinforcement learning methods are emerging research areas in DL for medical image analysis. Overall, deep learning, a new and fast-growing field, offers various obstacles as well as opportunities and solutions for a range of medical image applications.

Abadi M et al. (2016) TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, [Online]. Available: http://arxiv.org/abs/1603.04467 .

Abbas A, Abdelsamea MM, Gaber MM (2020) Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network, pp. 1–9, [Online]. Available: http://arxiv.org/abs/2003.13815 .

Apostolopoulos ID, Mpesiana TA (2020) Covid-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks, Phys Eng Sci Med, pp. 1–6, DOI: https://doi.org/10.1007/s13246-020-00865-4 .

Arun NT et al. (2020) Assessing the validity of saliency maps for abnormality localization in medical imaging, pp. 1–5, [Online]. Available: http://arxiv.org/abs/2006.00063 .

Balagourouchetty L, Pragatheeswaran JK, Pottakkat B, G R (2019) GoogLeNet based ensemble FCNet classifier for focal liver lesion diagnosis, IEEE J Biomed Heal Inf, vol. 2194, no. c, pp. 1–1, DOI: https://doi.org/10.1109/jbhi.2019.2942774 .

Bastien F et al. (2012) Theano: new features and speed improvements, pp. 1–10, [Online]. Available: http://arxiv.org/abs/1211.5590 .

Basu S, Mitra S, Saha N (2020) Deep Learning for Screening COVID-19 using Chest X-Ray Images, pp. 1–6, [Online]. Available: http://arxiv.org/abs/2004.10507 .

Bauer S, Wiest R, Nolte LP, Reyes M (2013) A survey of MRI-based medical image analysis for brain tumor studies. Phys Med Biol 58(13):1–44. https://doi.org/10.1088/0031-9155/58/13/R97

Bengio Y, Lamblin P, Popovici D, Larochelle H (2006) Greedy layer-wise training of deep networks. In: The 19th International Conference on Neural Information Processing Systems(NIPS’06), pp 153–160. https://doi.org/10.5555/2976456.2976476

Bengio Y, Simard P, Frasconi P (1994) Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Netw 5(2):157–166

Bizopoulos P, Koutsouris D (2019) Deep learning in cardiology. IEEE Rev Biomed Eng 12(c):168–193. https://doi.org/10.1109/RBME.2018.2885714

Bulten W, Litjens G (2018) Unsupervised Prostate Cancer Detection on H&E using Convolutional Adversarial Autoencoders, [Online]. Available: http://arxiv.org/abs/1804.07098 .

Cai H et al. (2019) Breast Microcalcification Diagnosis Using Deep Convolutional Neural Network from Digital Mammograms, Comput Math Methods Med, vol. 2019, DOI: https://doi.org/10.1155/2019/2717454 .

Candemir S, Rajaraman S, Thoma G, Antani S (2018) Deep learning for grading cardiomegaly severity in chest x-rays : an investigation. In: 2018 IEEE Life Sciences Conference (LSC), pp 109–113. https://doi.org/10.1109/LSC.2018.8572113

Capizzi G, Lo Sciuto G, Napoli C, Połap D (2020) Small Lung Nodules Detection based on Fuzzy-Logic and Probabilistic Neural Network with Bio-inspired Reinforcement Learning, IEEE Trans Fuzzy Syst, vol. PP, no. XX, p. 1. https://doi.org/10.1109/TFUZZ.2019.2952831 .

Chen DS, Jain RC (1994) A robust back propagation learning algorithm for function approximation. IEEE Trans. Neural Networks 5(3):467–479. https://doi.org/10.1109/72.286917

Chen C, Li S, Qin H, Pan Z, Yang G (2018) Bilevel feature learning for video saliency detection. IEEE Trans Multimed 20(12):3324–3336. https://doi.org/10.1109/TMM.2018.2839523

Chen C, Li S, Wang Y, Qin H, Hao A (2017) Video saliency detection via spatial-temporal fusion and low-rank coherency diffusion. IEEE Trans Image Process 26(7):3156–3170. https://doi.org/10.1109/TIP.2017.2670143

Chen H, Qi X, Yu L, Dou Q, Qin J, Heng PA (2017) DCAN: deep contour-aware networks for object instance segmentation from histology images. Med Image Anal 36:135–146. https://doi.org/10.1016/j.media.2016.11.004

Chen C, Wang G, Peng C, Zhang X, Qin H (2020) Improved robust video saliency detection based on long-term spatial-temporal information. IEEE Trans Image Process 29:1090–1100. https://doi.org/10.1109/TIP.2019.2934350

Chen C, Wei J, Peng C, Zhang W, Qin H (2020) Improved saliency detection in RGB-D images using two-phase depth estimation and selective deep fusion. IEEE Trans Image Process 29:4296–4307. https://doi.org/10.1109/TIP.2020.2968250

Choi J, Shin K, Jung J, Bae HJ, Kim DH, Byeon JS, Kim N (2020) Convolutional neural network technology in endoscopic imaging: artificial intelligence for endoscopy. Clin Endosc 53(2):117–126. https://doi.org/10.5946/ce.2020.054

Chougrad H, Zouaki H, Alheyane O (2018) Deep convolutional neural networks for breast cancer screening. Comput Methods Prog Biomed 157:19–30. https://doi.org/10.1016/j.cmpb.2018.01.011

Clevert DA, Unterthiner T, Hochreiter S (2016) Fast and accurate deep network learning by exponential linear units (ELUs). In: 4th International Conference on Learning Representations, ICLR 2016, pp 1–14

Collobert R, Kavukcuoglu K, Farabet C (2011) Torch7: A matlab-like environment for machine learning, BigLearn, NIPS Work, pp. 1–6, [Online]. Available: http://infoscience.epfl.ch/record/192376/files/Collobert_NIPSWORKSHOP_2011.pdf .

Conant EF et al (2019) Improving Accuracy and Efficiency with Concurrent Use of Artificial Intelligence for Digital Breast Tomosynthesis. Radiol Artif Intell 1(4):e180096. https://doi.org/10.1148/ryai.2019180096

Coudray N, Ocampo PS, Sakellaropoulos T, Narula N, Snuderl M, Fenyö D, Moreira AL, Razavian N, Tsirigos A (2018) Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat Med 24(10):1559–1567. https://doi.org/10.1038/s41591-018-0177-5

Dercle L, Henry T, Carré A, Paragios N, Deutsch E, Robert C (2020) Reinventing radiation therapy with machine learning and imaging bio-markers (radiomics): State-of-the-art, challenges and perspectives, Methods, no. May, pp. 0–1, DOI: https://doi.org/10.1016/j.ymeth.2020.07.003 .

Dhillon A, Verma GK (2019) Convolutional neural network: a review of models, methodologies, and applications to object detection. Prog Artif Intell, DOI: https://doi.org/10.1007/s13748-019-00203-0 .

Dimitriou N, Arandjelović O, Caie PD (2019) Deep Learning for Whole Slide Image Analysis: An Overview. Front Med 6(November):1–7. https://doi.org/10.3389/fmed.2019.00264

Du W et al (2019) Review on the applications of deep learning in the analysis of gastrointestinal endoscopy images. IEEE Access 7:142053–142069. https://doi.org/10.1109/ACCESS.2019.2944676

Dugas C, Bengio Y, Bélisle F, Nadeau C, Garcia R (2000) Incorporating second-order functional knowledge for better option pricing. In: 13th International Conference on Neural Information Processing Systems (NIPS’00), pp 451–457. https://doi.org/10.5555/3008751.3008817

Eberhart RC, Dobbins RW (1990) Early neural network development history: the age of Camelot. IEEE Eng Med Biol Mag 9(3):15–18. https://doi.org/10.1109/51.59207

Falk T, Mai D, Bensch R, Çiçek Ö, Abdulkadir A, Marrakchi Y, Böhm A, Deubner J, Jäckel Z, Seiwald K, Dovzhenko A, Tietz O, Dal Bosco C, Walsh S, Saltukoglu D, Tay TL, Prinz M, Palme K, Simons M, Diester I, Brox T, Ronneberger O (2019) U-net: deep learning for cell counting, detection, and morphometry. Nat Methods 16(1):67–70. https://doi.org/10.1038/s41592-018-0261-2

Fan D-P et al. (2020) Inf-Net: Automatic COVID-19 Lung Infection Segmentation from CT Scans, pp. 1–10, [Online]. Available: http://arxiv.org/abs/2004.14133 .

Fischer A, Igel C (2014) Training restricted Boltzmann machines: an introduction. Pattern Recogn 47(1):25–39. https://doi.org/10.1016/j.patcog.2013.05.025

Fonseca P et al (2015) Automatic breast density classification using a convolutional neural network architecture search procedure. Med Imaging 2015 Comput Diagnosis 9414(c):941428. https://doi.org/10.1117/12.2081576

Fukushima K (1980) Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern 36(4):193–202. https://doi.org/10.1007/BF00344251

Gadermayr M, Gupta L, Appel V, Boor P, Klinkhammer BM, Merhof D (2019) Generative adversarial networks for facilitating stain-independent supervised and unsupervised segmentation: a study on kidney histology. IEEE Trans Med Imaging 38(10):2293–2302. https://doi.org/10.1109/TMI.2019.2899364

Gardezi SJS, Elazab A, Lei B, Wang T (2019) Breast cancer detection and diagnosis using mammographic data: systematic review. J Med Internet Res 21(7):1–22. https://doi.org/10.2196/14464

Geras KJ et al. (2017) High-Resolution Breast Cancer Screening with Multi-View Deep Convolutional Neural Networks, pp. 1–9, [Online]. Available: http://arxiv.org/abs/1703.07047 .

Goodfellow I, Bengio Y, Courville A (2016) Deep learning, DOI: https://doi.org/10.1038/nmeth.3707

Goodfellow IJ et al (2014) Generative adversarial nets. Adv Neural Inf Process Syst 3(January):2672–2680

Greenspan H, Van Ginneken B, Summers RM (2016) Guest editorial deep learning in medical imaging: overview and future promise of an exciting new technique. IEEE Trans Med Imaging 35(5):1153–1159. https://doi.org/10.1109/TMI.2016.2553401

Gurcan MN, Boucheron LE, Can A, Madabhushi A, Rajpoot NM, Yener B (2009) Histopathological image analysis: a review. IEEE Rev Biomed Eng 2:147–171. https://doi.org/10.1109/RBME.2009.2034865

He JY, Wu X, Jiang YG, Peng Q, Jain R (2018) Hookworm detection in wireless capsule endoscopy images with deep learning. IEEE Trans Image Process 27(5):2379–2392. https://doi.org/10.1109/TIP.2018.2801119

He K, Zhang X, Ren S., Sun J (2015) Delving deep into rectifiers: Surpassing human-level performance on imagenet classification, Proc IEEE Int Conf Comput Vis, vol. 2015 Inter, pp 1026–1034, DOI: https://doi.org/10.1109/ICCV.2015.123 .

He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916

He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2016(Decem):770–778. https://doi.org/10.1109/CVPR.2016.90

Hinton G (2014) Boltzmann Machines, Encycl Mach Learn Data Min, no. 1, pp. 1–7, DOI: https://doi.org/10.1007/978-1-4899-7502-7_31-1 .

Hinton GE, Osindero S, Teh Y-W (2006) A fast learning algorithm for deep belief nets. Neural Comput 18:1527–1554. https://doi.org/10.1162/neco.2006.18.7.1527

Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735

Hooda R, Mittal A, Sofat S (2019) Automated TB classification using ensemble of deep architectures. Multimed Tools Appl 78(22):31515–31532. https://doi.org/10.1007/s11042-019-07984-5

Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. Proc - 30th IEEE Conf Comput Vis Pattern Recognition, CVPR 2017 2017(Janua):2261–2269. https://doi.org/10.1109/CVPR.2017.243

Hubel DH, Wiesel TN (1962) Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J Physiol 160(1):106–154. https://doi.org/10.1113/jphysiol.1962.sp006837

Huynh BQ, Li H, Giger ML (2016) Digital mammographic tumor classification using transfer learning from deep convolutional neural networks. J Med Imaging 3(3):034501. https://doi.org/10.1117/1.jmi.3.3.034501

Hwang S, Kim H-E, Jeong J, Kim H-J (2016) A novel approach for tuberculosis screening based on deep convolutional neural networks. Med Imaging 2016 Comput Diagnosis 9785:97852W. https://doi.org/10.1117/12.2216198

Hwang EJ, Park S, Jin KN, Kim JI, Choi SY, Lee JH, Goo JM, Aum J, Yim JJ, Park CM, Deep Learning-Based Automatic Detection Algorithm Development and Evaluation Group, Kim DH, Woo W, Choi C, Hwang IP, Song YS, Lim L, Kim K, Wi JY, Oh SS, Kang MJ (2019) Development and validation of a deep learning–based automatic detection algorithm for active pulmonary tuberculosis on chest radiographs. Clin Infect Dis 69(5):739–747. https://doi.org/10.1093/cid/ciy967

Ionescu GV et al (2019) Prediction of reader estimates of mammographic density using convolutional neural networks. J Med Imaging 6(03):1. https://doi.org/10.1117/1.jmi.6.3.031405

Jani KK, Srivastava R (2019) A survey on medical image analysis in capsule endoscopy. Curr Med Imaging Rev 15(7):622–636. https://doi.org/10.2174/1573405614666181102152434

Jia Y et al. (2014) Caffe: Convolutional architecture for fast feature embedding,” MM 2014 – Proc 2014 ACM Conf Multimed , pp. 675–678, DOI: https://doi.org/10.1145/2647868.2654889 .

Kang C, Yu X, Wang SH, Guttery DS, Pandey HM, Tian Y, Zhang YD (2020) A heuristic neural network structure relying on fuzzy logic for images scoring. IEEE Trans Fuzzy Syst 6706(c):1–1. https://doi.org/10.1109/tfuzz.2020.2966163 45

S. Karthik, R. Srinivasa Perumal, and P. V. S. S. R. Chandra Mouli, “Breast cancer classification using deep neural networks,” Knowl Comput Its Appl Knowl Manip Process Tech Vol. 1, pp. 227–241, 2018, DOI: https://doi.org/10.1007/978-981-10-6680-1_12

Kazeminia S et al. (2020) GANs for Medical Image Analysis,” Artif Intell Med, p. 104262, DOI: https://doi.org/10.1016/j.jece.2020.104262 .

Kim EK, Kim HE, Han K, Kang BJ, Sohn YM, Woo OH, Lee CW (2018) Applying data-driven imaging biomarker in mammography for breast Cancer screening: preliminary study. Sci Rep 8(1):1–8. https://doi.org/10.1038/s41598-018-21215-1

Kingma DP, Welling M Auto-encoding variational bayes. In: 2nd International Conference on Learning, ICLR 2014, vol 2014, pp 1–14

Klambauer G, Unterthiner T, Mayr A, Hochreiter S (2017) Self-normalizing neural networks. Adv Neural Inf Process Syst 2017(Decem):972–981

Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: The 25th International Conference on Neural Information Processing Systems, pp 1097–1105. https://doi.org/10.1145/3065386

Kyono T, Gilbert FJ, van der Schaar M (2018) MAMMO: A Deep Learning Solution for Facilitating Radiologist-Machine Collaboration in Breast Cancer Diagnosis, pp. 1–18, [Online]. Available: http://arxiv.org/abs/1811.02661 .

LeCun Y, Bengio Y (1998) Convolutional networks for images, speech, and time-series. In: The handbook of brain theory and neural networks, pp 255–258

LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to digit recognition. Neural Comput 1(4):541–551

Lehman CD, Yala A, Schuster T, Dontchos B, Bahl M, Swanson K, Barzilay R (2019) Mammographic breast density assessment using deep learning: clinical implementation. Radiology 290(1):52–58. https://doi.org/10.1148/radiol.2018180694

Lei T, Wang R, Wan Y, Du X, Meng H, Nandi AK (2020) Medical Image Segmentation Using Deep Learning: A survey, vol. 171, pp. 17–31, DOI: https://doi.org/10.1007/978-3-030-32606-7_2 .

Li W, Cao P, Zhao D, Wang J (2016) Pulmonary Nodule Classification with Deep Convolutional Neural Networks on Computed Tomography Images, Comput Math Methods Med, vol. 2016, DOI: https://doi.org/10.1155/2016/6215085 .

Li X, Chen H, Qi X, Dou Q, Fu CW, Heng PA (2018) H-DenseUNet: hybrid densely connected UNet for liver and tumor segmentation from CT volumes. IEEE Trans Med Imaging 37(12):2663–2674. https://doi.org/10.1109/TMI.2018.2845918

Li Z, Dong M, Wen S, Hu X, Zhou P, Zeng Z (2019) CLU-CNNs: Object detection for medical images. Neurocomputing 350(May):53–59. https://doi.org/10.1016/j.neucom.2019.04.028

Li Y, Huang C, Ding L, Li Z, Pan Y, Gao X (2019) Deep learning in bioinformatics: introduction, application, and perspective in the big data era. Methods 166:4–21. https://doi.org/10.1016/j.ymeth.2019.04.008

Li Y, Li S, Chen C, Hao A, Qin H (2020) A Plug-and-play Scheme to Adapt Image Saliency Deep Model for Video Data, IEEE Trans Circuits Syst Video Technol, no. Xx, pp. 1–1, DOI: https://doi.org/10.1109/tcsvt.2020.3023080 .

Li L, Qin L, Yin Y, Wang X et al (2019) Artificial Intelligence Distinguishes COVID-19 from Community Acquired Pneumonia on Chest CT. Radiology 2020:1–5. https://doi.org/10.1007/s10489-020-01714-3

Li C, Wang X, Liu W, Latecki LJ, Wang B, Huang J (2019) Weakly supervised mitosis detection in breast histopathology images using concentric loss. Med Image Anal 53:165–178. https://doi.org/10.1016/j.media.2019.01.013

Liang Q, Nan Y, Coppola G, Zou K, Sun W, Zhang D, Wang Y, Yu G (2019) Weakly supervised biomedical image segmentation by reiterative learning. IEEE J Biomed Heal Inf 23(3):1205–1214. https://doi.org/10.1109/JBHI.2018.2850040

Liao F, Liang M, Li Z, Hu X, Song S (2019) Evaluate the malignancy of pulmonary nodules using the 3-D deep leaky Noisy-OR network. IEEE Trans Neural Netw Learn Syst 30(11):3484–3495. https://doi.org/10.1109/TNNLS.2019.2892409

Lin H, Chen H, Graham S, Dou Q, Rajpoot N, Heng PA (2019) Fast ScanNet: fast and dense analysis of multi-Gigapixel whole-slide images for Cancer metastasis detection. IEEE Trans Med Imaging 38(8):1948–1958. https://doi.org/10.1109/TMI.2019.2891305

Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, van der Laak JAWM, van Ginneken B, Sánchez CI (2017) A survey on deep learning in medical image analysis. Med Image Anal 42(1995):60–88. https://doi.org/10.1016/j.media.2017.07.005

Litjens G et al (2016) Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Sci Rep 6(January):1–11. https://doi.org/10.1038/srep26286

Little WA (1974) The existence of persistent states in the brain. Math Biosci 19(1–2):101–120. https://doi.org/10.1016/0025-5564(74)90031-5

Little WA, Shaw GL (1978) Analytic study of the memory storage capacity of a neural network. Math Biosci 39(3–4):281–290. https://doi.org/10.1016/0025-5564(78)90058-5

Liu W, Wang Z, Liu X, Zeng N, Liu Y, Alsaadi FE (2017) A survey of deep neural network architectures and their applications. Neurocomputing 234(November 2016):11–26. https://doi.org/10.1016/j.neucom.2016.12.038

Lo SLJLMFMCSMSC, Lo SCB, Lou SLA, Chien MV, Mun SK (1995) Artificial convolution neural network techniques and applications for lung nodule detection. IEEE Trans Med Imaging 14(4):711–718. https://doi.org/10.1109/42.476112

Loey M, Smarandache F, Khalifa NEM (2020) Within the lack of chest COVID-19 X-ray dataset: A novel detection model based on GAN and deep transfer learning, Symmetry (Basel)., vol. 12, no. 4, DOI: https://doi.org/10.3390/SYM12040651 .

Lopes UK, Valiati JF (2017) Pre-trained convolutional neural networks as feature extractors for tuberculosis detection. Comput Biol Med 89(August):135–143. https://doi.org/10.1016/j.compbiomed.2017.08.001

Ma G, Li S, Chen C, Hao A, Qin H (2020) Stage-wise salient object detection in 360 omnidirectional image via object-level Semantical saliency ranking. IEEE Trans Vis Comput Graph 26:3535–3545. https://doi.org/10.1109/tvcg.2020.3023636

Ma J, Song Y, Tian X, Hua Y, Zhang R, Wu J (2020) Survey on deep learning for pulmonary medical imaging. Front Med 14(4):450–469. https://doi.org/10.1007/s11684-019-0726-4

Maas AL, Hannun AY, Ng AY (2013) Rectifier nonlinearities improve neural network acoustic models. In: The 30th International Conference on Machine Learning, vol 30

Masood A, Sheng B, Yang P, Li P, Li H, Kim J, Feng DD (2020) Automated decision support system for lung Cancer detection and classification via enhanced RFCN with multilayer fusion RPN. IEEE Trans Ind Inf 3203(c):1–1. https://doi.org/10.1109/tii.2020.2972918 7801

Mazurowski MA, Buda M, Saha A, Bashir MR (2019) Deep learning in radiology: an overview of the concepts and a survey of the state of the art with a focus on MRI. J Magn Reson Imaging 49(4):939–954. https://doi.org/10.1002/jmri.26534

Mcculloch WS, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5:115–133. https://doi.org/10.1007/BF02478259

Minsky M, Papert S (1969) Perceptrons: an introduction to computational geometry, vol 522. MIT Press, Cambridge MA, pp 20–522. https://doi.org/10.1016/S0019-9958(70)90409-2

Book   MATH   Google Scholar  

Mittal A, Hooda R, Sofat S (2018) LF-SegNet : a fully convolutional encoder – decoder network for segmenting lung fields from chest, Wirel Pers Commun, DOI: https://doi.org/10.1007/s11277-018-5702-9

Morris RGM, Hebb DO (1949) The Organization of Behavior, Wiley: New York; 1949,” Brain Res Bull, vol. 50, no. 5–6, p. 437, DOI: https://doi.org/10.1016/S0361-9230(99)00182-3 .

Münzer B, Schoeffmann K, Böszörmenyi L (2018) Content-based processing and analysis of endoscopic images and videos: a survey. Multimed Tools Appl 77(1):1323–1362. https://doi.org/10.1007/s11042-016-4219-z

Murphy A, Skalski M, Gaillard F (2018) The utilisation of convolutional neural networks in detecting pulmonary nodules: a review. Br J Radiol 91(1090):1–6. https://doi.org/10.1259/bjr.20180028

Murphy K et al. (2019) Computer aided detection of tuberculosis on chest radiographs: An evaluation of the CAD4TB v6 system, pp. 1–11, [Online]. Available: http://arxiv.org/abs/1903.03349 .

Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines. Proc 27th Int Conf Mach Learn (ICML-10), 807–814 33(5):807–814

Nakagawa K, Ishihara R, Aoyama K, Ohmori M (2019) Classification for invasion depth of esophageal squamous cell carcinoma using a deep neural network compared with experienced endoscopists. Gastrointest Endosc 90(3):407–414. https://doi.org/10.1016/j.gie.2019.04.245

Ng A (2011) Sparse autoencoder. CS294A Lect. Notes 72:1–19

Nie D, Trullo R, Lian J, Wang L, Petitjean C, Ruan S, Wang Q, Shen D (2018) Medical image synthesis with deep convolutional adversarial networks. IEEE Trans Biomed Eng 65(12):2720–2730. https://doi.org/10.1109/TBME.2018.2814538

Onishi Y et al. (2019) Automated Pulmonary Nodule Classification in Computed Tomography Images Using a Deep Convolutional Neural Network Trained by Generative Adversarial Networks, Biomed Res Int, vol. 2019, DOI: https://doi.org/10.1155/2019/6051939 .

Ouyang W et al (2015) DeepID-Net: Deformable deep convolutional neural networks for object detection. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 07–12(June):2403–2412. https://doi.org/10.1109/CVPR.2015.7298854

Ozturk T, Talo M, Yildirim EA, Baloglu UB, Yildirim O, Rajendra Acharya U (2020) Automated detection of COVID-19 cases using deep neural networks with X-ray images. Comput Biol Med 121(April):103792. https://doi.org/10.1016/j.compbiomed.2020.103792

Pang S, Zhang Y, Ding M, Wang X, Xie X (2020) A deep model for lung Cancer type identification by densely connected convolutional networks and adaptive boosting. IEEE Access 8:4799–4805. https://doi.org/10.1109/ACCESS.2019.2962862

Pascanu R, Mikolov T, Bengio Y (2013) On the difficulty of training recurrent neural networks. 30th Int Conf Mach Learn ICML 2013(PART 3):2347–2355

Perone CS, Cohen-Adad J (2019) Promises and limitations of deep learning for medical image segmentation. J Med Artif Intell 2:1–1. https://doi.org/10.21037/jmai.2019.01.01

Pezeshk A, Hamidian S, Petrick N, Sahiner B (2018) 3D convolutional neural networks for automatic detection of pulmonary nodules in chest CT. IEEE J Biomed Heal Inf PP(c):1. https://doi.org/10.1109/JBHI.2018.2879449

Pinckaers H, Litjens G (2019) Neural Ordinary Differential Equations for Semantic Segmentation of Individual Colon Glands, no. NeurIPS, [Online]. Available: http://arxiv.org/abs/1910.10470 .

Poggio T, Serre T (2013) Models of visual cortex. Scholarpedia 8(4):3516. https://doi.org/10.4249/scholarpedia.3516

Qiang Y, Ge L, Zhao X, Zhang X, Tang X (2017) Pulmonary nodule diagnosis using dual-modal supervised autoencoder based on extreme learning machine. Expert Syst 34(6):1–12. https://doi.org/10.1111/exsy.12224

Qu H et al (2019) Joint Segmentation and fine -grained classification of nuclei in histopathology images. In: 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), pp 900–904. https://doi.org/10.1109/ISBI.2019.8759457

Rajaraman S, Antani SK (2020) Modality-specific deep learning model ensembles toward improving TB detection in chest radiographs. IEEE Access 8:27318–27326. https://doi.org/10.1109/ACCESS.2020.2971257

Rajaraman S, Antani S (2020) Weakly labeled data augmentation for deep learning: a study on COVID-19 detection in chest X-rays. Diagnostics 10(6):1–17. https://doi.org/10.3390/diagnostics10060358

Rajpurkar P, Irvin J, Ball RL, Zhu K, Yang B, Mehta H, Duan T, Ding D, Bagul A, Langlotz CP, Patel BN, Yeom KW, Shpanskaya K, Blankenberg FG, Seekins J, Amrhein TJ, Mong DA, Halabi SS, Zucker EJ, Ng AY, Lungren MP (2018) Deep learning for chest radiograph diagnosis: a retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med 15(11):1–17. https://doi.org/10.1371/journal.pmed.1002686

Rajpurkar P et al. (2017) CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning, pp. 3–9, [Online]. Available: http://arxiv.org/abs/1711.05225 .

Reader AJ, Corda G, Mehranian A, da Costa-Luis C, Ellis S, Schnabel JA (2020) Deep learning for PET image reconstruction. IEEE Trans Radiat Plasma Med Sci 7311(1):1–1. https://doi.org/10.1109/trpms.2020.3014786 25

Ribli D, Horváth A, Unger Z, Pollner P, Csabai I (2018) Detecting and classifying lesions in mammograms with deep learning. Sci Rep 8(1):1–7. https://doi.org/10.1038/s41598-018-22437-z

Rodríguez-Ruiz A, Krupinski E, Mordang JJ, Schilling K, Heywang-Köbrunner SH, Sechopoulos I, Mann RM (2019) Detection of breast cancer with mammography: effect of an artificial intelligence support system. Radiology 290(3):1–10. https://doi.org/10.1148/radiol.2018181371

Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. Lect Notes Comput Sci (including Subser Lect. Notes Artif Intell Lect Notes Bioinformatics) 9351:234–241. https://doi.org/10.1007/978-3-319-24574-4_28

Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65(6):386–408. https://doi.org/10.1037/h0042519

Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(9):533–536

Sabour S, Frosst N, Hinton GE (2017) Dynamic routing between capsules. Adv Neural Inf Process Syst 2017-Decem(Nips):3857–3867

Saeedan F, Weber N, Goesele M, Roth S (2018) Detail-Preserving Pooling in Deep Networks,” Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit , no. June, pp. 9108–9116, DOI: https://doi.org/10.1109/CVPR.2018.00949 .

Sahiner B, Heang-Ping Chan, Petrick N, Datong Wei, Helvie MA, Adler DD, Goodsitt MM (1996) Classification of mass and normal breast tissue: a convolution neural network classifier with spatial domain and texture images. IEEE Trans Med Imaging 15(5):598–610. https://doi.org/10.1109/42.538937

Sari CT, Gunduz-Demir C (2019) Unsupervised feature extraction via deep learning for Histopathological classification of Colon tissue images. IEEE Trans Med Imaging 38(5):1139–1149. https://doi.org/10.1109/TMI.2018.2879369

Scherer D, Müller A, Behnke S (2010) Evaluation of pooling operations in convolutional architectures for object recognition. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 6354 LNCS(PART 3):92–101. https://doi.org/10.1007/978-3-642-15825-4_10

Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117. https://doi.org/10.1016/j.neunet.2014.09.003

Serag A et al (2019) Translational AI and Deep Learning in Diagnostic Pathology. Front Med 6(October):1–15. https://doi.org/10.3389/fmed.2019.00185

Setio AAA, Ciompi F, Litjens G, Gerke P, Jacobs C, van Riel SJ, Wille MMW, Naqibullah M, Sanchez CI, van Ginneken B (2016) Pulmonary nodule detection in CT images: false positive reduction using multi-view convolutional networks. IEEE Trans Med Imaging 35(5):1160–1169. https://doi.org/10.1109/TMI.2016.2536809

Shah A, Kadam E, Shah H, Shinde S, Shingade S (2016) Deep residual networks with exponential linear unit. ACM Int Conf Proceeding Ser 21–24(Sept):59–65. https://doi.org/10.1145/2983402.2983406

Shatnawi A, Al-Bdour G, Al-Qurran R, Al-Ayyoub M (2018) A comparative study of open source deep learning frameworks. 2018 9th Int Conf Inf Commun Syst ICICS 2018 2018-Janua:72–77. https://doi.org/10.1109/IACS.2018.8355444

Shen L, Margolies LR, Rothstein JH, Fluder E, McBride R, Sieh W (2019) Deep learning to improve breast Cancer detection on screening mammography. Sci Rep 9(1):1–13. https://doi.org/10.1038/s41598-019-48995-4

Shickel B, Tighe PJ, Bihorac A, Rashidi P (2017) Deep EHR : A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record, vol. 2194, no. c, pp. 1–17, DOI: https://doi.org/10.1109/JBHI.2017.2767063 .

Shin HC, Roth HR, Gao M, Lu L, Xu Z, Nogues I, Yao J, Mollura D, Summers RM (2016) Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 35(5):1285–1298. https://doi.org/10.1109/TMI.2016.2528162

Siegel RL, Miller KD, Jemal A (2019) Cancer statistics, 2019. CA Cancer J Clin 69(1):7–34. https://doi.org/10.3322/caac.21551

Simonyan K, Zisserman (2015) A Very deep convolutional networks for large-scale image recognition. In: 3rd International Conference on Learning Representations, ICLR 2015, pp 1–14

Soffer S, Ben-Cohen A, Shimon O, Amitai MM, Greenspan H, Klang E (2019) Convolutional neural networks for radiologic images: a Radiologist’s guide. Radiology 290(3):590–606. https://doi.org/10.1148/radiol.2018180547

Soffer S, Klang E, Shimon O, Nachmias N, Eliakim R (2020) Deep learning for wireless capsule endoscopy : a systematic review and meta-analysis. Gastrointest Endosc 92(4):831–839.e8. https://doi.org/10.1016/j.gie.2020.04.039

Song TH, Sanchez V, Eidaly H, Rajpoot NM (2019) Simultaneous cell detection and classification in bone marrow histology images. IEEE J Biomed Heal Inf 23(4):1469–1476. https://doi.org/10.1109/JBHI.2018.2878945

Song Y, Tan EL, Jiang X, Cheng JZ, Ni D, Chen S, Lei B, Wang T (2017) Accurate cervical cell segmentation from overlapping clumps in pap smear images. IEEE Trans Med Imaging 36(1):288–300. https://doi.org/10.1109/TMI.2016.2606380

Souza JC, Bandeira Diniz JO, Ferreira JL, França da Silva GL, Corrêa Silva A, de Paiva AC (2019) An automatic method for lung segmentation and reconstruction in chest X-ray using deep neural networks. Comput Methods Prog Biomed 177:285–296. https://doi.org/10.1016/j.cmpb.2019.06.005

Sun M, Zhang G, Dang H, Qi X, Zhou X, Chang Q (2019) Accurate gastric Cancer segmentation in digital pathology images using deformable convolution and multi-scale embedding networks. IEEE Access 7:75530–75541. https://doi.org/10.1109/ACCESS.2019.2918800

Swersky K, Chen B, Marlin B, de Freitas N (2010) A tutorial on stochastic approximation algorithms for training restricted Boltzmann machines and deep belief nets,” 2010 Inf Theory Appl Work ITA 2010, Conf Proc, pp. 80–89, DOI: https://doi.org/10.1109/ITA.2010.5454138 .

Szegedy C, Reed S, Sermanet P, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: The IEEE conference on Computer Vision and Pattern Recognition (CVPR), pp 1–9. https://doi.org/10.1109/CVPR.2015.7298594

Tabibu S, Vinod PK, Jawahar CV (2019) Pan-renal cell carcinoma classification and survival prediction from histopathology images using deep learning. Sci Rep 9(1):1–9. https://doi.org/10.1038/s41598-019-46718-3

The Theano Development Team et al. (2016) Theano: A Python framework for fast computation of mathematical expressions, pp. 1–19, [Online]. Available: http://arxiv.org/abs/1605.02688 .

Valkonen M, Isola J, Ylinen O, Muhonen V, Saxlin A, Tolonen T, Nykter M, Ruusuvuori P (2020) Cytokeratin-supervised deep learning for automatic recognition of epithelial cells in breast cancers stained for ER, PR, and Ki-67. IEEE Trans Med Imaging 39(2):534–542. https://doi.org/10.1109/TMI.2019.2933656

Valliani AAA, Ranti D, Oermann EK (2019) Deep learning and neurology: a systematic review. Neurol Ther 8(2):351–365. https://doi.org/10.1007/s40120-019-00153-8

Van Eycke YR, Balsat C, Verset L, Debeir O, Salmon I, Decaestecker C (2018) Segmentation of glandular epithelium in colorectal tumours to automatically compartmentalise IHC biomarker quantification: a deep learning approach. Med Image Anal 49:35–45. https://doi.org/10.1016/j.media.2018.07.004

van Ginneken B, Setio AAA, Jacobs C, Ciompi F (2015) Off-the-shelf convolutional neural network features for pulmonary nodule detection in computed tomography scans. In: 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI), pp 286–289. https://doi.org/10.1109/ISBI.2015.7163869

Vedaldi A, Lenc K (2015) MatConvNet: Convolutional neural networks for MATLAB, MM 2015 – Proc 2015 ACM Multimed Conf, pp. 689–692, DOI: https://doi.org/10.1145/2733373.2807412 .

Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol PA (2010) Stacked denoising autoencoders: learning useful representations in a deep network with a local Denoising criterion. J Mach Learn Res 11:3371–3408

MathSciNet   MATH   Google Scholar  

Waheed A, Goyal M, Gupta D, Khanna A, Al-Turjman F, Pinheiro PR (2020) CovidGAN: data augmentation using auxiliary classifier GAN for improved Covid-19 detection. IEEE Access 8:91916–91923. https://doi.org/10.1109/ACCESS.2020.2994762

Wang J, Ding H, Bidgoli FA, Zhou B, Iribarren C, Molloi S, Baldi P (2017) Detecting cardiovascular disease from mammograms with deep learning. IEEE Trans Med Imaging 36(5):1172–1181. https://doi.org/10.1109/TMI.2017.2655486

Wang SH, Muhammad K, Hong J, Sangaiah AK, Zhang YD (2020) Alcoholism identification via convolutional neural network based on parametric ReLU, dropout, and batch normalization. Neural Comput & Applic 32(3):665–680. https://doi.org/10.1007/s00521-018-3924-0

Wang H, Raj B (2017) On the Origin of Deep Learning,” pp. 1–72, [Online]. Available: http://arxiv.org/abs/1702.07800 .

Wang S, Tang C, Sun J, Zhang Y (2019) Cerebral micro-bleeding detection based on densely connected neural network. Front Neurosci 13(MAY):1–11. https://doi.org/10.3389/fnins.2019.00422

Wang L, Wong A (2020) COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest X-Ray Images, pp. 1–12, [Online]. Available: http://arxiv.org/abs/2003.09871 .

Wang Y, Yan F, Lu X, Zheng G, Zhang X, Wang C, Zhou K, Zhang Y, Li H, Zhao Q, Zhu H, Chen F, Gao C, Qing Z, Ye J, Li A, Xin X, Li D, Wang H, Yu H, Cao L, Zhao C, Deng R, Tan L, Chen Y, Yuan L, Zhou Z, Yang W, Shao M, Dou X, Zhou N, Zhou F, Zhu Y, Lu G, Zhang B (2019) IILS: intelligent imaging layout system for automatic imaging report standardization and intra-interdisciplinary clinical workflow optimization. EBioMedicine 44:162–181. https://doi.org/10.1016/j.ebiom.2019.05.040

Wang X et al (2019) Weakly Supervised Deep Learning for Whole Slide Lung Cancer Image Analysis. IEEE Trans Cybern PP:1–13. https://doi.org/10.1109/tcyb.2019.2935141

Wang T et al (2020) Machine learning in quantitative PET: A review of attenuation correction and low-count image reconstruction methods. Phys Medica 76(March):294–306. https://doi.org/10.1016/j.ejmp.2020.07.028

Wei JW, Tafe LJ, Linnik YA, Vaickus LJ, Tomita N, Hassanpour S (2019) Pathologist-level classification of histologic patterns on resected lung adenocarcinoma slides with deep neural networks. Sci Rep 9(1):1–8. https://doi.org/10.1038/s41598-019-40041-7

Werbos PJ (1990) Backpropagation through time: what it does and how to do it. Proc IEEE 78(10):1550–1560. https://doi.org/10.1109/5.58337

Werbose J (1974) Beyond regression: new tools for prediction and analysis in the behavioral

Widrow B, Hoff ME (1962) Associative Storage and Retrieval of Digital Information in Networks of Adaptive ‘Neurons. Biol Prototypes Synth Syst:160–160. https://doi.org/10.1007/978-1-4684-1716-6_25

Williams RJ, David Z (1995) Gradient-based learning algorithms for recurrent networks and their computational complexity. In: Back-propagation: theory, architectures and applications. L. Erlbaum Associates Inc, pp 433–486

Wu J (2017) Convolutional Neural Networks. Med Imaging Inf Sci 34(2):109–111. https://doi.org/10.11318/mii.34.109

Wu H, Gu X (2015) Max-pooling dropout for regularization of convolutional neural networks. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 9489:46–54. https://doi.org/10.1007/978-3-319-26532-2_6

Wu N, Phang J, Park J, Shen Y, Huang Z, Zorin M, Jastrzebski S, Fevry T, Katsnelson J, Kim E, Wolfson S, Parikh U, Gaddam S, Lin LLY, Ho K, Weinstein JD, Reig B, Gao Y, Toth H, Pysarenko K, Lewin A, Lee J, Airola K, Mema E, Chung S, Hwang E, Samreen N, Kim SG, Heacock L, Moy L, Cho K, Geras KJ (2019) Deep neural networks improve radiologists’ performance in breast Cancer screening. IEEE Trans Med Imaging 39:1–1. https://doi.org/10.1109/tmi.2019.2945514 1194

Wu N et al (2018) Breast density classification with deep convolutional neural networks. ICASSP, IEEE Int Conf Acoust Speech Signal Process - Proc 2018-April:6682–6686. https://doi.org/10.1109/ICASSP.2018.8462671

Xing F, Xie Y, Su H, Liu F, Yang L (2018) Deep learning in microscopy image analysis: a survey. IEEE Trans Neural Netw Learn Syst 29(10):4550–4568. https://doi.org/10.1109/TNNLS.2017.2766168

Xing F, Xie Y, Yang L (2016) An automatic learning-based framework for robust nucleus segmentation. IEEE Trans Med Imaging 35(2):550–566. https://doi.org/10.1109/TMI.2015.2481436

Xu B, Wang N, Chen T, Li M (2015) Empirical Evaluation of Rectified Activations in Convolutional Network , [Online]. Available: http://arxiv.org/abs/1505.00853 .

Xu S, Wu H, Bie R (2019) CXNet-m1: anomaly detection on chest X-rays with image-based deep learning. IEEE Access 7(c):4466–4477. https://doi.org/10.1109/ACCESS.2018.2885997

Xu J, Xiang L, Liu Q, Gilmore H, Wu J, Tang J, Madabhushi A (2016) Stacked sparse autoencoder (SSAE) for nuclei detection on breast cancer histopathology images. IEEE Trans Med Imaging 35(1):119–130. https://doi.org/10.1109/TMI.2015.2458702

Yi X, Walia E, Babyn P (2019) Generative adversarial network in medical imaging: A review,” Med Image Anal, vol. 58, DOI: https://doi.org/10.1016/j.media.2019.101552 .

Yi F, Yang L, Wang S, Guo L, Huang C, Xie Y, Xiao G (2018) Microvessel prediction in H&E Stained Pathology Images using fully convolutional neural networks. BMC Bioinform 19(1):1–9. https://doi.org/10.1186/s12859-018-2055-z

Yoon HJ et al (2019) A Lesion-Based Convolutional Neural Network Improves Endoscopic Detection and Depth Prediction of Early Gastric Cancer. J Clin Med 8(9):1310. https://doi.org/10.3390/jcm8091310

Yu D et al (2014) An Introduction to Computational Networks and the Computational Network Toolkit. INTERSPEECH, Microsoft Research

Zhang S, Zhang S, Wang B, Habetler TG (2020) Deep learning algorithms for bearing fault diagnostics - a comprehensive review. IEEE Access 8:29857–29881. https://doi.org/10.1109/ACCESS.2020.2972859

Zhang X et al (2017) Whole mammogram image classification with convolutional neural networks. Proc - 2017 IEEE Int Conf Bioinforma Biomed BIBM 2017 2017-Janua(Cc):700–704. https://doi.org/10.1109/BIBM.2017.8217738

Zhao Q, Lyu S, Zhang B, Feng W (2018) Multiactivation pooling method in convolutional neural networks for image recognition. Wirel Commun Mob Comput 2018:1–16. https://doi.org/10.1155/2018/8196906

Zhao W, Zeng Z (2019) Multi Scale Supervised 3D U-Net for Kidney and Tumor Segmentation,, DOI: https://doi.org/10.24926/548719.007 .

Download references

Author information

Authors and affiliations.

Department of Computer Science, School of Engineering and Technology, Pondicherry University, Pondicherry, India

Muralikrishna Puttagunta & S. Ravi

Corresponding author

Correspondence to S. Ravi .

About this article

Puttagunta, M., Ravi, S. Medical image analysis based on deep learning approach. Multimed Tools Appl 80 , 24365–24398 (2021). https://doi.org/10.1007/s11042-021-10707-4

Received : 25 August 2020

Revised : 28 November 2020

Accepted : 10 February 2021

Published : 06 April 2021

Issue Date : July 2021

DOI : https://doi.org/10.1007/s11042-021-10707-4

  • Deep learning
  • Convolutional neural networks
  • Medical images
  • Segmentation
  • Classification

77 interesting medical research topics for 2024

Last updated

25 November 2023

Reviewed by

Brittany Ferri, PhD, OTR/L

Medical research is the gateway to improved patient care and expanded treatment options. However, finding a relevant and compelling research topic can be challenging.

Use this article as a jumping-off point to select an interesting medical research topic for your next paper or clinical study.

  • How to choose a medical research topic

When choosing a research topic, it’s essential to consider a couple of things. What topics interest you? What unanswered questions do you want to address?

During the decision-making and brainstorming process, a few tips can help you pick the right medical research topic:

Focus on a particular field of study

The best medical research is specific to a particular area. Generalized studies are often too broad to produce meaningful results, so we advise picking a specific niche early in the process. 

Maybe a certain topic interests you, or your industry knowledge reveals areas of need.

Look into commonly researched topics

Once you’ve chosen your research field, do some preliminary research. What have other academics done in their papers and projects? 

With this overview of existing work, you can focus on specific topics that interest you without accidentally duplicating an existing project. This groundwork will also help you uncover gaps in the literature, which may be beneficial areas for research.

Get curious and ask questions

Now you can get curious. Ask questions that start with why, how, or what. These questions are the starting point of your project design and will act as your guiding light throughout the process. 

For example: 

What impact does pollution have on children’s lung function in inner-city neighborhoods? 

Why is pollution-based asthma on the rise? 

How can we address pollution-induced asthma in young children? 

  • 77 medical research topics worth exploring in 2024

Need some research inspiration for your upcoming paper or clinical study? We’ve compiled a list of 77 topical and in-demand medical research ideas. Let’s take a look. 

  • Exciting new medical research topics

If you want to study cutting-edge topics, here are some exciting options:

COVID-19 and long COVID symptoms

Since 2020, COVID-19 has been a hot-button topic in medicine, along with the long-term symptoms in those with a history of COVID-19. 

Examples of COVID-19-related research topics worth exploring include:

The long-term impact of COVID-19 on cardiac and respiratory health

COVID-19 vaccination rates

The evolution of COVID-19 symptoms over time

New variants and strains of the COVID-19 virus

Changes in social behavior and public health regulations amid COVID-19

Vaccinations

Finding ways to cure or reduce the disease burden of chronic infectious diseases is a crucial research area. Vaccination is a powerful option and a great topic to research. 

Examples of vaccination-related research topics include:

mRNA vaccines for viral infections

Biomaterial vaccination capabilities

Vaccination rates based on location, ethnicity, or age

Public opinion about vaccination safety 

Artificial tissue fabrication

With the need for donor organs increasing, finding ways to fabricate artificial bioactive tissues (and possibly organs) is a popular research area. 

Examples of artificial tissue-related research topics you can study include:

The viability of artificially printed tissues

Tissue substrate and building block material studies

The ethics and efficacy of artificial tissue creation

  • Medical research topics for medical students

For many medical students, research is a big driver for entering healthcare. If you’re a medical student looking for a research topic, here are some great ideas to work from:

Sleep disorders

Poor sleep quality is a growing problem, and it can significantly impact a person’s overall health. 

Examples of sleep disorder-related research topics include:

How stress affects sleep quality

The prevalence and impact of insomnia on patients with mental health conditions

Possible triggers for sleep disorder development

The impact of poor sleep quality on psychological and physical health

How melatonin supplements impact sleep quality

Alzheimer’s and dementia 

Cognitive conditions like dementia and Alzheimer’s disease are on the rise worldwide. They currently have no cure. As a result, research about these topics is in high demand. 

Examples of dementia-related research topics you could explore include:

The prevalence of Alzheimer’s disease in a chosen population

Early onset symptoms of dementia

Possible triggers or causes of cognitive decline with age

Treatment options for dementia-like conditions

The mental and physical burden of caregiving for patients with dementia

  • Lifestyle habits and public health

Modern lifestyles have profoundly impacted the average person’s daily habits, and plenty of interesting topics explore their effects. 

Examples of lifestyle and public health-related research topics include:

The nutritional intake of college students

The impact of chronic work stress on overall health

The rise of upper back and neck pain from laptop use

Prevalence and cause of repetitive strain injuries (RSI)

  • Controversial medical research paper topics

Medical research is a hotbed of controversial topics, content, and areas of study. 

If you want to explore a more niche (and attention-grabbing) concept, here are some controversial medical research topics worth looking into:

The benefits and risks of medical cannabis

Depending on where you live, the legalization and use of cannabis for medical conditions is controversial for the general public and healthcare providers.

Examples of medical cannabis-related research topics that might grab your attention include:

The legalization process of medical cannabis

The impact of cannabis use on developmental milestones in youth users

Cannabis and mental health diagnoses

CBD’s impact on chronic pain

Prevalence of cannabis use in young people

The impact of maternal cannabis use on fetal development 

Understanding how THC impacts cognitive function

Human genetics

The Human Genome Project identified, mapped, and sequenced the genes of the human genome. Its completion in 2003 opened up a world of exciting and controversial studies in human genetics.

Examples of human genetics-related research topics worth delving into include:

Medical genetics and the incidence of genetic-based health disorders

Behavioral genetics differences between identical twins

Genetic risk factors for neurodegenerative disorders

Machine learning technologies for genetic research

Sexual health studies

Human sexuality and sexual health are important (yet often stigmatized) medical topics that need new research and analysis.

As a diverse field ranging from sexual orientation studies to sexual pathophysiology, examples of sexual health-related research topics include:

The incidence of sexually transmitted infections within a chosen population

Mental health conditions within the LGBTQIA+ community

The impact of untreated sexually transmitted infections

Access to safe sex resources (condoms, dental dams, etc.) in rural areas

  • Health and wellness research topics

Human wellness and health are trendy topics in modern medicine as more people are interested in finding natural ways to live healthier lifestyles. 

If this field of study interests you, here are some big topics in the wellness space:

Gluten sensitivity

Gluten allergies and intolerances have risen over the past few decades. If you’re interested in exploring this topic, the conditions you can study range in severity from mild gastrointestinal symptoms to full-blown anaphylaxis. 

Some examples of gluten sensitivity-related research topics include:

The pathophysiology and incidence of Celiac disease

Early onset symptoms of gluten intolerance

The prevalence of gluten allergies within a set population

Gluten allergies and the incidence of other gastrointestinal health conditions

Pollution and lung health

Living in large urban cities means regular exposure to high levels of pollutants. 

As more people become interested in protecting their lung health, examples of impactful lung health and pollution-related research topics include:

The extent of pollution in densely packed urban areas

The prevalence of pollution-based asthma in a set population

Lung capacity and function in young people

The benefits and risks of steroid therapy for asthma

Pollution risks based on geographical location

Plant-based diets

Plant-based diets, such as vegan diets, are an emerging trend in healthcare, but the research supporting them is still limited. 

If you’re interested in learning more about the potential benefits or risks of holistic, diet-based medicine, examples of plant-based diet research topics to explore include:

Vegan and plant-based diets as part of disease management

Potential risks and benefits of specific plant-based diets

Plant-based diets and their impact on body mass index

The effect of diet and lifestyle on chronic disease management

Health supplements

Supplements are a multi-billion dollar industry. Many health-conscious people take supplements, including vitamins, minerals, herbal medicine, and more. 

Examples of health supplement-related research topics worth investigating include:

Omega-3 fish oil safety and efficacy for cardiac patients

The benefits and risks of regular vitamin D supplementation

Health supplementation regulation and product quality

The impact of social influencer marketing on consumer supplement practices

Analyzing added ingredients in protein powders

  • Healthcare research topics

Working within the healthcare industry means you have insider knowledge and opportunity. Maybe you’d like to research the overall system, administration, and inherent biases that disrupt access to quality care. 

While these topics are essential to explore, it is important to note that these studies usually require approval and oversight from an Institutional Review Board (IRB). This ensures the study is ethical and does not harm any subjects. 

For this reason, the IRB sets protocols that require additional planning, so consider this when mapping out your study’s timeline. 

Here are some examples of trending healthcare research areas worth pursuing:

The pros and cons of electronic health records

The rise of electronic healthcare charting and records has forever changed how medical professionals and patients interact with their health data. 

Examples of electronic health record-related research topics include:

The number of medication errors reported during a software switch

Nurse sentiment analysis of electronic charting practices

Ethical and legal studies into encrypting and storing personal health data

Inequities within healthcare access

Many barriers inhibit people from accessing the quality medical care they need. These issues result in health disparities and injustices. 

Examples of research topics about health inequities include:

The impact of social determinants of health in a set population

Early- and late-stage cancer diagnosis in urban vs. rural populations

Affordability of life-saving medications

Health insurance limitations and their impact on overall health

Diagnostic and treatment rates across ethnicities

People who belong to an ethnic minority are more likely to experience barriers and restrictions when trying to receive quality medical care. This is due to systemic healthcare racism and bias. 

As a result, diagnostic and treatment rates in minority populations are a hot-button field of research. Examples of ethnicity-based research topics include:

Cancer biopsy rates in BIPOC women

The prevalence of diabetes in Indigenous communities

Access inequalities in women’s health preventative screenings

The prevalence of undiagnosed hypertension in Black populations

  • Pharmaceutical research topics

Large pharmaceutical companies are incredibly interested in investing in research to learn more about potential cures and treatments for diseases. 

If you’re interested in building a career in pharmaceutical research, here are a few examples of in-demand research topics:

Cancer treatment options

Clinical research is in high demand as pharmaceutical companies explore novel cancer treatment options outside of chemotherapy and radiation. 

Examples of cancer treatment-related research topics include:

Stem cell therapy for cancer

Oncogenic gene dysregulation and its impact on disease

Cancer-causing viral agents and their risks

Treatment efficacy based on early vs. late-stage cancer diagnosis

Cancer vaccines and targeted therapies

Immunotherapy for cancer

Pain medication alternatives

Historically, opioid medications were the primary treatment for short- and long-term pain. But, with the opioid epidemic getting worse, the need for alternative pain medications has never been more urgent. 

Examples of pain medication-related research topics include:

Opioid withdrawal symptoms and risks

Early signs of pain medication misuse

Anti-inflammatory medications for pain control

  • Identify trends in your medical research with Dovetail

Are you interested in contributing life-changing research? Today’s medical research is part of the future of clinical patient care. 

As your go-to resource for speedy and accurate data analysis, we are proud to partner with healthcare researchers to innovate and improve the future of healthcare.



How to Write a Medical Research Paper

Last Updated: February 5, 2024

This article was co-authored by Chris M. Matsko, MD. Dr. Chris M. Matsko is a retired physician based in Pittsburgh, Pennsylvania. With over 25 years of medical research experience, Dr. Matsko was awarded the Pittsburgh Cornell University Leadership Award for Excellence. He holds a BS in Nutritional Science from Cornell University and an MD from the Temple University School of Medicine in 2007. Dr. Matsko earned a Research Writing Certification from the American Medical Writers Association (AMWA) in 2016 and a Medical Writing & Editing Certification from the University of Chicago in 2017. wikiHow marks an article as reader-approved once it receives enough positive feedback. In this case, 89% of readers who voted found the article helpful, earning it our reader-approved status. This article has been viewed 202,526 times.

Writing a medical research paper is similar to writing other research papers in that you want to use reliable sources, write in a clear and organized style, and offer a strong argument for all conclusions you present. In some cases the research you discuss will be data you have actually collected to answer your research questions. Understanding proper formatting, citations, and style will help you write an informative and respected paper.

Researching Your Paper

Step 1 Decide on a topic.

  • Pick something that really interests you to make the research more fun.
  • Choose a topic that has unanswered questions and propose solutions.

Step 2 Determine what kind of research paper you are going to write.

  • Quantitative studies consist of original research performed by the writer. These research papers will need to include sections like Hypothesis (or Research Question), Previous Findings, Method, Limitations, Results, Discussion, and Application.
  • Synthesis papers review the research already published and analyze it. They find weaknesses and strengths in the research, apply it to a specific situation, and then indicate a direction for future research.

Step 3 Research your topic thoroughly.

  • Keep track of your sources. Write down all publication information necessary for citation: author, title of article, title of book or journal, publisher, edition, date published, volume number, issue number, page number, and anything else pertaining to your source. A program like EndNote can help you keep track of your sources.
  • Take detailed notes as you read. Paraphrase information in your own words or, if you copy directly from the article or book, put the copied text in quotation marks so it is clearly a direct quote and you avoid plagiarism.
  • Be sure to keep all of your notes with the correct source.
  • Your professor and librarians can also help you find good resources.
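
If you prefer to keep your notes digitally, the record-keeping advice above can be sketched as a small data structure. This is only an illustrative sketch, not a replacement for a reference manager: the `Source` class, its field names, and the `missing_fields` helper are hypothetical names chosen to mirror the publication details listed above, and the example values are placeholders rather than a real reference.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Source:
    """One bibliographic record; fields mirror the publication details listed above."""
    authors: list[str]
    article_title: str
    container_title: str              # title of the book or journal
    year: int
    publisher: Optional[str] = None
    edition: Optional[str] = None
    volume: Optional[str] = None
    issue: Optional[str] = None
    pages: Optional[str] = None
    notes: list[str] = field(default_factory=list)    # paraphrases in your own words
    quotes: list[str] = field(default_factory=list)   # text copied word for word

    def add_note(self, text: str) -> None:
        """Store a paraphrased note attached to this source."""
        self.notes.append(text)

    def add_quote(self, text: str) -> None:
        """Store copied wording wrapped in quotation marks so it is never mistaken for a paraphrase."""
        self.quotes.append(f'"{text}"')


def missing_fields(source: Source) -> list[str]:
    """Return the names of optional citation fields that are still empty."""
    optional = ["publisher", "edition", "volume", "issue", "pages"]
    return [name for name in optional if getattr(source, name) is None]


# Placeholder values for illustration only.
example = Source(
    authors=["A. Author"],
    article_title="Placeholder article title",
    container_title="Placeholder Journal",
    year=2020,
    volume="12",
)
example.add_quote("exact wording copied from the article")
print(missing_fields(example))
```

Keeping notes and quotes inside the same record as the publication details is one way to make sure every note stays with the correct source.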

Step 4 Organize your notes.

  • Keep all of your notes in a physical folder or in a digitized form on the computer.
  • Start to form the basic outline of your paper using the notes you have collected.

Writing Your Paper

Step 1 Outline your paper.

  • Start with bullet points and then add in notes you've taken from references that support your ideas. [1]
  • A common way to format research papers is to follow the IMRAD format. This dictates the structure of your paper in the following order: Introduction, Methods, Results, and Discussion. [2]
  • The outline is just the basic structure of your paper. Don't worry if you have to rearrange a few times to get it right.
  • Ask others to look over your outline and get feedback on the organization.
  • Know the audience you are writing for and adjust your style accordingly. [3]

Step 2 Know the required format.

  • Use a standard font type and size, such as Times New Roman 12 point font.
  • Double-space your paper.
  • If necessary, create a cover page. Most schools require a cover page of some sort. Include your main title, running title (often a shortened version of your main title), author's name, course name, and semester.

Step 3 Compile your results.

  • Break up information into sections and subsections and address one main point per section.
  • Include any figures or data tables that support your main ideas.
  • For a quantitative study, state the methods used to obtain results.

Step 4 Write the conclusion and discussion.

  • Clearly state and summarize the main points of your research paper.
  • Discuss how this research contributes to the field and why it is important. [4]
  • Highlight potential applications of the theory if appropriate.
  • Propose future directions that build upon the research you have presented. [5]
  • Keep the introduction and discussion short, and spend more time explaining the methods and results.

Step 5 Write the introduction.

  • State why the problem is important to address.
  • Discuss what is currently known and what is lacking in the field.
  • State the objective of your paper.
  • Keep the introduction short.

Step 6 Write the abstract.

  • Highlight the purpose of the paper and the main conclusions.
  • State why your conclusions are important.
  • Be concise in your summary of the paper.
  • Show that you have a solid study design and a high-quality data set.
  • Abstracts are usually one paragraph and between 250 and 500 words.
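
The length guideline above is easy to check automatically. The snippet below is a minimal sketch that assumes words are separated by whitespace and paragraphs by blank lines; the function name `check_abstract` and its default limits are hypothetical and simply restate the range given above, so adjust them to whatever your journal or instructor requires.

```python
def check_abstract(abstract: str, low: int = 250, high: int = 500) -> dict:
    """Report the word count and paragraph count of an abstract against simple limits."""
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = [p for p in abstract.split("\n\n") if p.strip()]
    # Words are assumed to be whitespace-separated tokens.
    word_count = len(abstract.split())
    return {
        "word_count": word_count,
        "within_limits": low <= word_count <= high,
        "single_paragraph": len(paragraphs) == 1,
    }


if __name__ == "__main__":
    draft = "Background and purpose of the study. " * 50  # placeholder text, 300 words
    print(check_abstract(draft))
```

A word processor's built-in word count does the same job; the point is simply to check the limit before you submit.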

Step 7 Cite while you write.

  • Unless otherwise directed, use the American Medical Association (AMA) style guide to properly format citations.
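  • Add citations at the end of a sentence to indicate that you are using someone else's idea. Use these throughout your research paper as needed. In AMA style, in-text citations are superscript numerals that correspond to the numbered entries in your reference list; if you are directed to use an author-date style instead, they include the author's last name, year of publication, and page number.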
  • Compile your reference list and add it to the end of your paper.
  • Use a citation program if you have access to one to simplify the process.

Step 8 Edit your research paper.

  • Continually revise your paper to make sure it is structured in a logical way.
  • Proofread your paper for spelling and grammatical errors.
  • Make sure you are following the proper formatting guidelines provided for the paper.
  • Have others read your paper to proofread and check for clarity. Revise as needed.

  • Ask your professor for help if you are stuck or confused about any part of your research paper. They are familiar with the style and structure of papers and can provide you with more resources.
  • Refer to your professor's specific guidelines. Some instructors modify parts of a research paper to better fit their assignment. Others may request supplementary details, such as a synopsis for your research project.
  • Set aside blocks of time specifically for writing each day.

  • Do not plagiarize. Plagiarism is using someone else's work, words, or ideas and presenting them as your own. It is important to cite all sources in your research paper, both through internal citations and on your reference page.

  • [1] http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3178846/
  • [2] http://owl.excelsior.edu/research-and-citations/outlining/outlining-imrad/
  • [3] http://china.elsevier.com/ElsevierDNN/Portals/7/How%20to%20write%20a%20world-class%20paper.pdf
  • [4] http://intqhc.oxfordjournals.org/content/16/3/191
  • [5] http://www.ruf.rice.edu/~bioslabs/tools/report/reportform.html#form

About This Article

Chris M. Matsko, MD

To write a medical research paper, research your topic thoroughly and compile your data. Next, organize your notes and create a strong outline that breaks up the information into sections and subsections, addressing one main point per section. Write the results and discussion sections first to go over your findings, then write the introduction to state your objective and provide background information. Finally, write the abstract, which concisely summarizes the article by highlighting the main points.


Susan D. Boulware, MD

Contact information.

  • Office 203.785.5809
  • Appt 877.925.3637
  • Clinic Fax 203.764.9149

Patient Care Location

  • YNHH Specialty Clinic 1 Long Wharf Drive, Ste 2 New Haven, CT 06511 Appointments : 203.785.4081

Mailing Address

  • Pediatric Endocrinology & Diabetes

1 Long Wharf Drive

New Haven, CT 06511

United States

Dr. Boulware has an interest in disorders of growth and development and has published several papers on the metabolic actions of IGF-1, a hormone critical to childhood growth. She provides care to children with general disorders of the endocrine system (pituitary, thyroid, adrenal, testes or ovaries). She is the pediatric endocrinologist in the interdisciplinary Yale Clinic for Children with Differences in Sex Development, and she is the Medical Director of the Interdisciplinary Yale Gender Program, which offers care to gender-expansive youth.

Education & Training

  • Resident University of Texas Medical School (1987)
  • MD University of Texas at San Antonio (1984)
  • Fellow Yale University School of Medicine

Departments & Organizations

  • Dean's Advisory Council for LGBTQI Affairs
  • Directories
  • Pediatric Gender Program
  • Research Interest
  • Yale Medicine
Guest Essay

In Medicine, the Morally Unthinkable Too Easily Comes to Seem Normal

A photograph of two forceps, placed handle to tip against each other.

By Carl Elliott

Dr. Elliott teaches medical ethics at the University of Minnesota. He is the author of the forthcoming book “The Occasional Human Sacrifice: Medical Experimentation and the Price of Saying No,” from which this essay is adapted.

Here is the way I remember it: The year is 1985, and a few medical students are gathered around an operating table where an anesthetized woman has been prepared for surgery. The attending physician, a gynecologist, asks the group: “Has everyone felt a cervix? Here’s your chance.” One after another, we take turns inserting two gloved fingers into the unconscious woman’s vagina.

Had the woman consented to a pelvic exam? Did she understand that when the lights went dim she would be treated like a clinical practice dummy, her genitalia palpated by a succession of untrained hands? I don’t know. Like most medical students, I just did as I was told.

Last month the Department of Health and Human Services issued new guidance requiring written informed consent for pelvic exams and other intimate procedures performed under anesthesia. Much of the force behind the new requirement came from distressed medical students who saw these pelvic exams as wrong and summoned the courage to speak out.

Whether the guidance will actually change clinical practice I don’t know. Medical traditions are notoriously difficult to uproot, and academic medicine does not easily tolerate ethical dissent. I doubt the medical profession can be trusted to reform itself.

What is it that leads a rare individual to say no to practices that are deceptive, exploitative or harmful when everyone else thinks they are fine? For a long time I assumed that saying no was mainly an issue of moral courage. The relevant question was: If you are a witness to wrongdoing, will you be brave enough to speak out?

But then I started talking to insiders who had blown the whistle on abusive medical research. Soon I realized that I had overlooked the importance of moral perception. Before you decide to speak out about wrongdoing, you have to recognize it for what it is.

This is not as simple as it seems. Part of what makes medical training so unsettling is how often you are thrust into situations in which you don’t really know how to behave. Nothing in your life up to that point has prepared you to dissect a cadaver, perform a rectal exam or deliver a baby. Never before have you seen a psychotic patient involuntarily sedated and strapped to a bed or a brain-dead body wheeled out of a hospital room to have its organs harvested for transplantation. Your initial reaction is often a combination of revulsion, anxiety and self-consciousness.

To embark on a career in medicine is like moving to a foreign country where you do not understand the customs, rituals, manners or language. Your main concern on arrival is how to fit in and avoid causing offense. This is true even if the local customs seem backward or cruel. What’s more, this particular country has an authoritarian government and a rigid status hierarchy where dissent is not just discouraged but also punished. Living happily in this country requires convincing yourself that whatever discomfort you feel comes from your own ignorance and lack of experience. Over time, you learn how to assimilate. You may even come to laugh at how naïve you were when you first arrived.

A rare few people hang onto that discomfort and learn from it. When Michael Wilkins and William Bronston started working at the Willowbrook State School in Staten Island as young doctors in the early 1970s, they found thousands of mentally disabled children condemned to the most horrific conditions imaginable: naked children rocking and moaning on concrete floors in puddles of their own urine; an overpowering stench of illness and filth; a research unit where children were deliberately infected with hepatitis A and B.

“It was truly an American concentration camp,” Dr. Bronston told me. Yet when he and Dr. Wilkins tried to enlist Willowbrook doctors and nurses to reform the institution, they were met with indifference or hostility. It seemed as if no one else on the medical staff could see what they saw. It was only when Dr. Wilkins went to a reporter and showed the world what was happening behind the Willowbrook walls that anything began to change.

When I asked Dr. Bronston how it was possible for doctors and nurses to work at Willowbrook without seeing it as a crime scene, he told me it began with the way the institution was structured and organized. “Medically secured, medically managed, doctor-validated,” he said. Medical professionals just accommodated themselves to the status quo. “You get with the program because that’s what you’re being hired to do,” he said.

One of the great mysteries of human behavior is how institutions create social worlds where unthinkable practices come to seem normal. This is as true of academic medical centers as it is of prisons and military units. When we are told about a horrific medical research scandal, we assume that we would see it just as the whistle-blower Peter Buxtun saw the Tuskegee syphilis study: an abuse so shocking that only a sociopath could fail to perceive it.

Yet it rarely happens this way. It took Mr. Buxtun seven years to convince others to see the abuses for what they were. It has taken other whistle-blowers even longer. Even when the outside world condemns a practice, medical institutions typically insist that the outsiders don’t really understand.

According to Irving Janis, a Yale psychologist who popularized the notion of groupthink, the forces of social conformity are especially powerful in organizations that are driven by a deep sense of moral purpose. If the aims of the organization are righteous, its members feel, it is wrong to put barriers in the way.

This observation helps explain why academic medicine not only defends researchers accused of wrongdoing but also sometimes rewards them. Many of the researchers responsible for the most notorious abuses in recent medical history (the Tuskegee syphilis study, the Willowbrook hepatitis studies, the Cincinnati radiation studies, the Holmesburg prison studies) were celebrated with professional accolades even after the abuses were first called out.

The culture of medicine is notoriously resistant to change. During the 1970s, it was thought that the solution to medical misconduct was formal education in ethics. Major academic medical centers began establishing bioethics centers and programs throughout the 1980s and ’90s, and today virtually every medical school in the country requires ethics training.

Yet it is debatable whether that training has had any effect. Many of the most egregious ethical abuses in recent decades have taken place in medical centers with prominent bioethics programs, such as the University of Pennsylvania, Duke University, Columbia University and Johns Hopkins University, as well as my own institution, the University of Minnesota.

One could be forgiven for concluding that the only way the culture of medicine will change is if changes are forced on it from the outside — by oversight bodies, legislators or litigators. For example, many states have responded to the controversy over pelvic exams by passing laws banning the practice unless the patient has explicitly given consent.

You may find it hard to understand how pelvic exams on unconscious women without their consent could seem like anything but a terrible invasion. Yet a central aim of medical training is to transform your sensibility. You are taught to steel yourself against your natural emotional reactions to death and disfigurement; to set aside your customary views about privacy and shame; to see the human body as a thing to be examined, tested and studied.

One danger of this transformation is that you will see your colleagues and superiors do horrible things and be afraid to speak up. But the more subtle danger is that you will no longer see what they are doing as horrible. You will just think: This is the way it is done.

Carl Elliott (@FearLoathingBTX) teaches medical ethics at the University of Minnesota. He is the author of the forthcoming book “The Occasional Human Sacrifice: Medical Experimentation and the Price of Saying No,” from which this essay is adapted.


