Literature Reviews: Types of Clinical Study Designs

Types of Study Designs

Meta-Analysis - A way of combining data from many different research studies. A meta-analysis is a statistical process that combines the findings from individual studies. Example: Anxiety outcomes after physical activity interventions: meta-analysis findings. Conn V. Nurs Res. 2010 May-Jun;59(3):224-31.

Systematic Review - A summary of the clinical literature. A systematic review is a critical assessment and evaluation of all research studies that address a particular clinical issue. The researchers use an organized method of locating, assembling, and evaluating a body of literature on a particular topic using a set of specific criteria. A systematic review typically includes a description of the findings of the collection of research studies. The systematic review may also include a quantitative pooling of data, called a meta-analysis. Example: Complementary and alternative medicine use among women with breast cancer: a systematic review. Wanchai A, Armer JM, Stewart BR. Clin J Oncol Nurs. 2010 Aug;14(4):E45-55.

Randomized Controlled Trial - A controlled clinical trial that randomly (by chance) assigns participants to two or more groups. There are various methods to randomize study participants to their groups. Example: Meditation or exercise for preventing acute respiratory infection: a randomized controlled trial. Barrett B, et al. Ann Fam Med. 2012 Jul-Aug;10(4):337-46.

Cohort Study (Prospective Observational Study) - A clinical research study in which people who presently have a certain condition or receive a particular treatment are followed over time and compared with another group of people who are not affected by the condition. Example: Smokeless tobacco cessation in South Asian communities: a multi-centre prospective cohort study. Croucher R, et al. Addiction. 2012 Dec;107 Suppl 2:45-52.

Case-control Study - Case-control studies begin with the outcomes and do not follow people over time. Researchers choose people with a particular outcome (the cases) and a comparison group without it (the controls), then interview both groups or check their records to ascertain what different experiences they had. They compare the odds of having had an experience among those with the outcome to the odds among those without the outcome. Example: Non-use of bicycle helmets and risk of fatal head injury: a proportional mortality, case-control study. Persaud N, et al. CMAJ. 2012 Nov 20;184(17):E921-3.

Cross-sectional Study - The observation of a defined population at a single point in time or time interval. Exposure and outcome are determined simultaneously. Example: Fasting might not be necessary before lipid screening: a nationally representative cross-sectional study. Steiner MJ, et al. Pediatrics. 2011 Sep;128(3):463-70.

Case Reports and Series - A report on a series of patients with an outcome of interest. No control group is involved. Example: Students mentoring students in a service-learning clinical supervision experience: an educational case report. Lattanzi JB, et al. Phys Ther. 2011 Oct;91(10):1513-24.

Ideas, Editorials, Opinions - Put forth by experts in the field. Example: Health and health care for the 21st century: for all the people. Koop CE. Am J Public Health. 2006 Dec;96(12):2090-2.

Animal Research Studies - Studies conducted using animal subjects. Example: Intranasal leptin reduces appetite and induces weight loss in rats with diet-induced obesity (DIO). Schulz C, Paulus K, Jöhren O, Lehnert H. Endocrinology. 2012 Jan;153(1):143-53.

Test-tube Lab Research - "Test tube" experiments conducted in a controlled laboratory setting.

Adapted from Study Designs. In NICHSR Introduction to Health Services Research: a Self-Study Course.  http://www.nlm.nih.gov/nichsr/ihcm/06studies/studies03.html and Glossary of EBM Terms. http://www.cebm.utoronto.ca/glossary/index.htm#top  

Study Design Terminology

Bias - Any deviation of results or inferences from the truth, or processes leading to such deviation. Bias can result from several sources: one-sided or systematic variations in measurement from the true value (systematic error); flaws in study design; deviation of inferences, interpretations, or analyses based on flawed data or data collection; etc. There is no sense of prejudice or subjectivity implied in the assessment of bias under these conditions.

Case Control Studies - Studies which start with the identification of persons with a disease of interest and a control (comparison, referent) group without the disease. The relationship of an attribute to the disease is examined by comparing diseased and non-diseased persons with regard to the frequency or levels of the attribute in each group.

Causality - The relating of causes to the effects they produce. Causes are termed necessary when they must always precede an effect and sufficient when they initiate or produce an effect. Any of several factors may be associated with the potential disease causation or outcome, including predisposing factors, enabling factors, precipitating factors, reinforcing factors, and risk factors.

Control Groups - Groups that serve as a standard for comparison in experimental studies. They are similar in relevant characteristics to the experimental group but do not receive the experimental intervention.

Controlled Clinical Trials - Clinical trials involving one or more test treatments, at least one control treatment, specified outcome measures for evaluating the studied intervention, and a bias-free method for assigning patients to the test treatment. The treatment may be drugs, devices, or procedures studied for diagnostic, therapeutic, or prophylactic effectiveness. Control measures include placebos, active medicines, no-treatment, dosage forms and regimens, historical comparisons, etc. When randomization using mathematical techniques, such as the use of a random numbers table, is employed to assign patients to test or control treatments, the trials are characterized as Randomized Controlled Trials.

Cost-Benefit Analysis - A method of comparing the cost of a program with its expected benefits in dollars (or other currency). The benefit-to-cost ratio is a measure of total return expected per unit of money spent. This analysis generally excludes consideration of factors that are not measured ultimately in economic terms. Cost effectiveness compares alternative ways to achieve a specific set of results.

Cross-Over Studies - Studies comparing two or more treatments or interventions in which the subjects or patients, upon completion of the course of one treatment, are switched to another. In the case of two treatments, A and B, half the subjects are randomly allocated to receive these in the order A, B and half to receive them in the order B, A. A criticism of this design is that effects of the first treatment may carry over into the period when the second is given.

Cross-Sectional Studies - Studies in which the presence or absence of disease or other health-related variables is determined in each member of the study population or in a representative sample at one particular time. This contrasts with Longitudinal Studies, in which subjects are followed over a period of time.

Double-Blind Method - A method of studying a drug or procedure in which both the subjects and investigators are kept unaware of who is actually getting which specific treatment.

Empirical Research - The study, based on direct observation, use of statistical records, interviews, or experimental methods, of actual practices or the actual impact of practices or policies.

Evaluation Studies - Works consisting of studies determining the effectiveness or utility of processes, personnel, and equipment.

Genome-Wide Association Study - An analysis comparing the allele frequencies of all available (or a whole genome representative set of) polymorphic markers in unrelated patients with a specific symptom or disease condition, and those of healthy controls to identify markers associated with a specific disease or condition.

Intention to Treat Analysis - Strategy for the analysis of a Randomized Controlled Trial that compares patients in the groups to which they were originally randomly assigned, regardless of the treatment they actually received.

Logistic Models - Statistical models which describe the relationship between a qualitative dependent variable (that is, one which can take only certain discrete values, such as the presence or absence of a disease) and an independent variable. A common application is in epidemiology for estimating an individual's risk (probability of a disease) as a function of a given risk factor.
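As a rough illustration of the definition above, the sketch below shows how a logistic model with a single risk factor turns a value of that factor into a probability of disease. The function name and the coefficient values are hypothetical, chosen only to make the arithmetic visible; they do not come from any real study.

```python
import math

def logistic_risk(intercept: float, coefficient: float, x: float) -> float:
    """Probability of disease under a one-variable logistic model.

    The log-odds are assumed to be linear in the risk factor x:
    log(p / (1 - p)) = intercept + coefficient * x
    """
    log_odds = intercept + coefficient * x
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients, for illustration only.
intercept, coefficient = -2.0, 0.5

for x in (0, 2, 4, 6):
    print(f"risk factor = {x}: estimated risk = {logistic_risk(intercept, coefficient, x):.3f}")

# Under this model, each one-unit increase in x multiplies the odds by exp(coefficient).
odds_multiplier = math.exp(coefficient)
```

The output is always between 0 and 1, which is why this form is used when the dependent variable is the presence or absence of a disease.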

Longitudinal Studies - Studies in which variables relating to an individual or group of individuals are assessed over a period of time.

Lost to Follow-Up - Study subjects in cohort studies whose outcomes are unknown, e.g., because they could not or did not wish to attend follow-up visits.

Matched-Pair Analysis - A type of analysis in which subjects in a study group and a comparison group are made comparable with respect to extraneous factors by individually pairing study subjects with the comparison group subjects (e.g., age-matched controls).

Meta-Analysis - Works consisting of studies using a quantitative method of combining the results of independent studies (usually drawn from the published literature) and synthesizing summaries and conclusions which may be used to evaluate therapeutic effectiveness, plan new studies, etc. It is often an overview of clinical trials. It is usually called a meta-analysis by the author or sponsoring body and should be differentiated from reviews of literature.
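One common way to combine results quantitatively is inverse-variance (fixed-effect) pooling, in which more precise studies get more weight. The sketch below assumes that approach; it is only one of several pooling methods, and the effect estimates and standard errors are made-up values for illustration.

```python
import math

def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted average of study effect estimates.

    Each study is weighted by 1 / SE^2, so studies with smaller
    standard errors contribute more to the pooled estimate.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Hypothetical effect estimates and standard errors from three studies.
estimates = [0.30, 0.10, 0.25]
standard_errors = [0.10, 0.20, 0.15]

pooled, pooled_se = pool_fixed_effect(estimates, standard_errors)
print(f"pooled effect = {pooled:.3f} (SE {pooled_se:.3f})")
```

The pooled standard error is smaller than that of any single study, which is the main statistical reason for combining studies.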

Numbers Needed To Treat - Number of patients who need to be treated in order to prevent one additional bad outcome. It is the inverse of Absolute Risk Reduction.
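The relationship between the two quantities can be sketched in a few lines. Absolute Risk Reduction (ARR) is taken here as the event rate in the control group minus the event rate in the treated group; the rates used are hypothetical.

```python
def number_needed_to_treat(control_event_rate: float, treated_event_rate: float) -> float:
    """Return NNT = 1 / ARR, where ARR = control rate - treated rate."""
    arr = control_event_rate - treated_event_rate
    if arr <= 0:
        raise ValueError("no absolute risk reduction: NNT is not defined")
    return 1.0 / arr

# Hypothetical trial: 20% of control patients and 15% of treated patients
# experience the bad outcome, so ARR = 0.05.
nnt = number_needed_to_treat(0.20, 0.15)
print(f"NNT is about {round(nnt)}")
```

In this made-up example, about 20 patients would need to be treated to prevent one additional bad outcome.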

Odds Ratio - The ratio of two odds. The exposure-odds ratio for case control data is the ratio of the odds in favor of exposure among cases to the odds in favor of exposure among noncases. The disease-odds ratio for a cohort or cross section is the ratio of the odds in favor of disease among the exposed to the odds in favor of disease among the unexposed. The prevalence-odds ratio refers to an odds ratio derived cross-sectionally from studies of prevalent cases.
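For case-control data, the exposure-odds ratio described above can be computed directly from a 2x2 table. The sketch below uses invented counts; the function and argument names are illustrative only.

```python
def exposure_odds_ratio(exposed_cases: int, unexposed_cases: int,
                        exposed_controls: int, unexposed_controls: int) -> float:
    """Odds of exposure among cases divided by odds of exposure among noncases."""
    odds_among_cases = exposed_cases / unexposed_cases
    odds_among_controls = exposed_controls / unexposed_controls
    return odds_among_cases / odds_among_controls

# Hypothetical 2x2 table:
#              exposed   unexposed
# cases           30         70
# controls        10         90
odds_ratio = exposure_odds_ratio(30, 70, 10, 90)
print(f"odds ratio = {odds_ratio:.2f}")
```

An odds ratio above 1 means exposure was more common among cases than among controls; a value of 1 means no association.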

Patient Selection - Criteria and standards used for the determination of the appropriateness of the inclusion of patients with specific conditions in proposed treatment plans and the criteria used for the inclusion of subjects in various clinical trials and other research protocols.

Predictive Value of Tests - In screening and diagnostic tests, the probability that a person with a positive test is a true positive (i.e., has the disease), is referred to as the predictive value of a positive test; whereas, the predictive value of a negative test is the probability that the person with a negative test does not have the disease. Predictive value is related to the sensitivity and specificity of the test.
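The link between predictive value, sensitivity, and specificity also depends on how common the disease is in the tested population (prevalence). The sketch below works this out for hypothetical test characteristics; none of the numbers describe a real test.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (positive predictive value, negative predictive value).

    Works with expected proportions of the tested population in each
    cell of the test-result-by-disease table.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Hypothetical test: 90% sensitive, 95% specific, disease prevalence 1%.
ppv, npv = predictive_values(0.90, 0.95, 0.01)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With a rare disease, even a fairly accurate test yields a low positive predictive value, because false positives from the large healthy group outnumber true positives.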

Prospective Studies - Observation of a population for a sufficient number of persons over a sufficient number of years to generate incidence or mortality rates subsequent to the selection of the study group.

Qualitative Studies - Research that derives data from observation, interviews, or verbal interactions and focuses on the meanings and interpretations of the participants.

Quantitative Studies - Quantitative research is research that uses numerical analysis.

Random Allocation - A process involving chance used in therapeutic trials or other research endeavor for allocating experimental subjects, human or animal, between treatment and control groups, or among treatment groups. It may also apply to experiments on inanimate objects.
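A minimal sketch of one chance-based allocation procedure: shuffle the participant list, then deal participants out to the groups in turn so that group sizes stay balanced. This is only one of several randomization methods, and the participant IDs and group names are hypothetical.

```python
import random

def allocate(participants, groups=("treatment", "control"), seed=None):
    """Randomly allocate participants to groups of (nearly) equal size."""
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)          # chance alone decides the order
    allocation = {group: [] for group in groups}
    for index, participant in enumerate(shuffled):
        allocation[groups[index % len(groups)]].append(participant)
    return allocation

# Hypothetical participant IDs.
participant_ids = [f"P{n:02d}" for n in range(1, 11)]
allocation = allocate(participant_ids, seed=2024)
for group, members in allocation.items():
    print(group, members)
```

Real trials typically generate and conceal the allocation sequence before enrollment so that investigators cannot influence who enters which group.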

Randomized Controlled Trial - Clinical trials that involve at least one test treatment and one control treatment, concurrent enrollment and follow-up of the test- and control-treated groups, and in which the treatments to be administered are selected by a random process, such as the use of a random-numbers table.

Reproducibility of Results - The statistical reproducibility of measurements (often in a clinical context), including the testing of instrumentation or techniques to obtain reproducible results. The concept includes reproducibility of physiological measurements, which may be used to develop rules to assess probability or prognosis, or response to a stimulus; reproducibility of occurrence of a condition; and reproducibility of experimental results.

Retrospective Studies - Studies used to test etiologic hypotheses in which inferences about an exposure to putative causal factors are derived from data relating to characteristics of persons under study or to events or experiences in their past. The essential feature is that some of the persons under study have the disease or outcome of interest and their characteristics are compared with those of unaffected persons.

Sample Size - The number of units (persons, animals, patients, specified circumstances, etc.) in a population to be studied. The sample size should be big enough to have a high likelihood of detecting a true difference between two groups.
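To show what "big enough" can mean in practice, the sketch below uses a standard normal-approximation formula for comparing the means of two equal-sized groups: n per group = 2 × ((z for significance + z for power) × SD / difference)². The difference, standard deviation, significance level, and power shown are assumptions chosen for illustration; other designs and outcome types need different formulas.

```python
import math
from statistics import NormalDist

def sample_size_per_group(difference: float, std_dev: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per group to detect a difference in means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * std_dev / difference) ** 2
    return math.ceil(n)

# Hypothetical planning values: detect a 5-point difference when the
# outcome has a standard deviation of 10 points.
n_per_group = sample_size_per_group(difference=5, std_dev=10)
print(f"about {n_per_group} participants per group")
```

Halving the difference to be detected roughly quadruples the required sample size, which is why small expected effects call for large studies.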

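Sensitivity and Specificity - Binary classification measures to assess test results. Sensitivity, or recall rate, is the proportion of people with the condition whom the test correctly identifies as positive (true positives). Specificity is the proportion of people without the condition whom the test correctly identifies as negative (true negatives).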
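Both measures can be read off a table of test results against true disease status. The counts below are hypothetical and serve only to show the calculation.

```python
def sensitivity_specificity(true_pos: int, false_neg: int,
                            true_neg: int, false_pos: int):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    sensitivity = true_pos / (true_pos + false_neg)   # among people WITH the condition
    specificity = true_neg / (true_neg + false_pos)   # among people WITHOUT the condition
    return sensitivity, specificity

# Hypothetical results: 50 people with the condition (45 test positive)
# and 200 without it (180 test negative).
sensitivity, specificity = sensitivity_specificity(
    true_pos=45, false_neg=5, true_neg=180, false_pos=20
)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```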

Single-Blind Method - A method in which either the observer(s) or the subject(s) is kept ignorant of the group to which the subjects are assigned.

Time Factors - Elements of limited time intervals, contributing to particular results or situations.

Source:  NLM MeSH Database

  • Last Updated: Dec 29, 2023 11:41 AM
  • URL: https://research.library.gsu.edu/litrev

How to Write a Literature Review | Guide, Examples, & Templates

Published on January 2, 2023 by Shona McCombes. Revised on September 11, 2023.

What is a literature review? A literature review is a survey of scholarly sources on a specific topic. It provides an overview of current knowledge, allowing you to identify relevant theories, methods, and gaps in the existing research that you can later apply to your paper, thesis, or dissertation topic.

There are five key steps to writing a literature review:

  • Search for relevant literature
  • Evaluate sources
  • Identify themes, debates, and gaps
  • Outline the structure
  • Write your literature review

A good literature review doesn’t just summarize sources: it analyzes, synthesizes, and critically evaluates to give a clear picture of the state of knowledge on the subject.

Table of contents

  • What is the purpose of a literature review
  • Examples of literature reviews
  • Step 1 – Search for relevant literature
  • Step 2 – Evaluate and select sources
  • Step 3 – Identify themes, debates, and gaps
  • Step 4 – Outline your literature review’s structure
  • Step 5 – Write your literature review
  • Free lecture slides
  • Other interesting articles
  • Frequently asked questions

When you write a thesis, dissertation, or research paper, you will likely have to conduct a literature review to situate your research within existing knowledge. The literature review gives you a chance to:

  • Demonstrate your familiarity with the topic and its scholarly context
  • Develop a theoretical framework and methodology for your research
  • Position your work in relation to other researchers and theorists
  • Show how your research addresses a gap or contributes to a debate
  • Evaluate the current state of research and demonstrate your knowledge of the scholarly debates around your topic

Writing literature reviews is a particularly important skill if you want to apply for graduate school or pursue a career in research. We’ve written a step-by-step guide that you can follow below.

Writing literature reviews can be quite challenging! A good starting point could be to look at some examples, depending on what kind of literature review you’d like to write.

  • Example literature review #1: “Why Do People Migrate? A Review of the Theoretical Literature” (Theoretical literature review about the development of economic migration theory from the 1950s to today.)
  • Example literature review #2: “Literature review as a research methodology: An overview and guidelines” (Methodological literature review about interdisciplinary knowledge acquisition and production.)
  • Example literature review #3: “The Use of Technology in English Language Learning: A Literature Review” (Thematic literature review about the effects of technology on language acquisition.)
  • Example literature review #4: “Learners’ Listening Comprehension Difficulties in English Language Learning: A Literature Review” (Chronological literature review about how the concept of listening skills has changed over time.)

You can also check out our templates with literature review examples and sample outlines at the links below.

Download Word doc Download Google doc

Before you begin searching for literature, you need a clearly defined topic.

If you are writing the literature review section of a dissertation or research paper, you will search for literature related to your research problem and questions.

Make a list of keywords

Start by creating a list of keywords related to your research question. Include each of the key concepts or variables you’re interested in, and list any synonyms and related terms. You can add to this list as you discover new keywords in the process of your literature search.

  • Social media, Facebook, Instagram, Twitter, Snapchat, TikTok
  • Body image, self-perception, self-esteem, mental health
  • Generation Z, teenagers, adolescents, youth

Search for relevant sources

Use your keywords to begin searching for sources. Some useful databases to search for journals and articles include:

  • Your university’s library catalogue
  • Google Scholar
  • Project Muse (humanities and social sciences)
  • Medline (life sciences and biomedicine)
  • EconLit (economics)
  • Inspec (physics, engineering and computer science)

You can also use boolean operators to help narrow down your search.
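As a rough sketch of how boolean operators combine keywords, the snippet below joins synonyms for the same concept with OR and joins the different concepts with AND, using the example keywords listed earlier. The helper function is hypothetical, and exact query syntax (quoting, truncation, field tags) varies by database, so check the help pages of the one you are searching.

```python
def build_query(concept_groups):
    """Join synonyms with OR inside parentheses, then join concepts with AND."""
    clauses = []
    for terms in concept_groups:
        # Put multi-word phrases in quotes so they are searched as a phrase.
        quoted = [f'"{term}"' if " " in term else term for term in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)

concept_groups = [
    ["social media", "Instagram", "Snapchat", "TikTok"],
    ["body image", "self-esteem"],
    ["Generation Z", "teenagers", "adolescents"],
]

print(build_query(concept_groups))
```

OR broadens the search within a concept, while AND narrows the results to sources that mention every concept.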

Make sure to read the abstract to find out whether an article is relevant to your question. When you find a useful book or article, you can check the bibliography to find other relevant sources.

You likely won’t be able to read absolutely everything that has been written on your topic, so it will be necessary to evaluate which sources are most relevant to your research question.

For each publication, ask yourself:

  • What question or problem is the author addressing?
  • What are the key concepts and how are they defined?
  • What are the key theories, models, and methods?
  • Does the research use established frameworks or take an innovative approach?
  • What are the results and conclusions of the study?
  • How does the publication relate to other literature in the field? Does it confirm, add to, or challenge established knowledge?
  • What are the strengths and weaknesses of the research?

Make sure the sources you use are credible, and make sure you read any landmark studies and major theories in your field of research.

You can use our template to summarize and evaluate sources you’re thinking about using. Click on either button below to download.

Take notes and cite your sources

As you read, you should also begin the writing process. Take notes that you can later incorporate into the text of your literature review.

It is important to keep track of your sources with citations to avoid plagiarism. It can be helpful to make an annotated bibliography, where you compile full citation information and write a paragraph of summary and analysis for each source. This helps you remember what you read and saves time later in the process.

To begin organizing your literature review’s argument and structure, be sure you understand the connections and relationships between the sources you’ve read. Based on your reading and notes, you can look for:

  • Trends and patterns (in theory, method or results): do certain approaches become more or less popular over time?
  • Themes: what questions or concepts recur across the literature?
  • Debates, conflicts and contradictions: where do sources disagree?
  • Pivotal publications: are there any influential theories or studies that changed the direction of the field?
  • Gaps: what is missing from the literature? Are there weaknesses that need to be addressed?

This step will help you work out the structure of your literature review and (if applicable) show how your own research will contribute to existing knowledge.

  • Most research has focused on young women.
  • There is an increasing interest in the visual aspects of social media.
  • But there is still a lack of robust research on highly visual platforms like Instagram and Snapchat—this is a gap that you could address in your own research.

There are various approaches to organizing the body of a literature review. Depending on the length of your literature review, you can combine several of these strategies (for example, your overall structure might be thematic, but each theme is discussed chronologically).

Chronological

The simplest approach is to trace the development of the topic over time. However, if you choose this strategy, be careful to avoid simply listing and summarizing sources in order.

Try to analyze patterns, turning points and key debates that have shaped the direction of the field. Give your interpretation of how and why certain developments occurred.

Thematic

If you have found some recurring central themes, you can organize your literature review into subsections that address different aspects of the topic.

For example, if you are reviewing literature about inequalities in migrant health outcomes, key themes might include healthcare policy, language barriers, cultural attitudes, legal status, and economic access.

Methodological

If you draw your sources from different disciplines or fields that use a variety of research methods, you might want to compare the results and conclusions that emerge from different approaches. For example:

  • Look at what results have emerged in qualitative versus quantitative research
  • Discuss how the topic has been approached by empirical versus theoretical scholarship
  • Divide the literature into sociological, historical, and cultural sources

Theoretical

A literature review is often the foundation for a theoretical framework. You can use it to discuss various theories, models, and definitions of key concepts.

You might argue for the relevance of a specific theoretical approach, or combine various theoretical concepts to create a framework for your research.

Like any other academic text, your literature review should have an introduction, a main body, and a conclusion. What you include in each depends on the objective of your literature review.

The introduction should clearly establish the focus and purpose of the literature review.

Depending on the length of your literature review, you might want to divide the body into subsections. You can use a subheading for each theme, time period, or methodological approach.

As you write, you can follow these tips:

  • Summarize and synthesize: give an overview of the main points of each source and combine them into a coherent whole
  • Analyze and interpret: don’t just paraphrase other researchers — add your own interpretations where possible, discussing the significance of findings in relation to the literature as a whole
  • Critically evaluate: mention the strengths and weaknesses of your sources
  • Write in well-structured paragraphs: use transition words and topic sentences to draw connections, comparisons and contrasts

In the conclusion, you should summarize the key findings you have taken from the literature and emphasize their significance.

When you’ve finished writing and revising your literature review, don’t forget to proofread thoroughly before submitting.

This article has been adapted into lecture slides that you can use to teach your students about writing a literature review.

Scribbr slides are free to use, customize, and distribute for educational purposes.

Open Google Slides Download PowerPoint

If you want to know more about the research process, methodology, research bias, or statistics, make sure to check out some of our other articles with explanations and examples.

  • Sampling methods
  • Simple random sampling
  • Stratified sampling
  • Cluster sampling
  • Likert scales
  • Reproducibility

Statistics

  • Null hypothesis
  • Statistical power
  • Probability distribution
  • Effect size
  • Poisson distribution

Research bias

  • Optimism bias
  • Cognitive bias
  • Implicit bias
  • Hawthorne effect
  • Anchoring bias
  • Explicit bias

A literature review is a survey of scholarly sources (such as books, journal articles, and theses) related to a specific topic or research question.

It is often written as part of a thesis, dissertation, or research paper, in order to situate your work in relation to existing knowledge.

There are several reasons to conduct a literature review at the beginning of a research project:

  • To familiarize yourself with the current state of knowledge on your topic
  • To ensure that you’re not just repeating what others have already done
  • To identify gaps in knowledge and unresolved problems that your research can address
  • To develop your theoretical framework and methodology
  • To provide an overview of the key findings and debates on the topic

Writing the literature review shows your reader how your work relates to existing research and what new insights it will contribute.

The literature review usually comes near the beginning of your thesis or dissertation. After the introduction, it grounds your research in a scholarly field and leads directly to your theoretical framework or methodology.

A literature review is a survey of credible sources on a topic, often used in dissertations, theses, and research papers. Literature reviews give an overview of knowledge on a subject, helping you identify relevant theories and methods, as well as gaps in existing research. Literature reviews are set up similarly to other academic texts, with an introduction, a main body, and a conclusion.

An annotated bibliography is a list of source references that has a short description (called an annotation) for each of the sources. It is often assigned as part of the research process for a paper.

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

McCombes, S. (2023, September 11). How to Write a Literature Review | Guide, Examples, & Templates. Scribbr. Retrieved April 9, 2024, from https://www.scribbr.com/dissertation/literature-review/

Research Methods and Design

Literature Review

A literature review is a discussion of the literature (also known as the "research" or "scholarship") surrounding a certain topic. A good literature review doesn't simply summarize the existing material, but provides thoughtful synthesis and analysis. The purpose of a literature review is to orient your own work within an existing body of knowledge. A literature review may be written as a standalone piece or be included in a larger body of work.

You can read more about literature reviews, what they entail, and how to write one, using the resources below. 

Am I the only one struggling to write a literature review?

Dr. Zina O'Leary explains the misconceptions and struggles students often have with writing a literature review. She also provides step-by-step guidance on writing a persuasive literature review.

An Introduction to Literature Reviews

Dr. Eric Jensen, Professor of Sociology at the University of Warwick, and Dr. Charles Laurie, Director of Research at Verisk Maplecroft, explain how to write a literature review, and why researchers need to do so. Literature reviews can be stand-alone research or part of a larger project. They communicate the state of academic knowledge on a given topic, specifically detailing what is still unknown.

This is the first video in a whole series about literature reviews. You can find the rest of the series in our SAGE database, Research Methods:

Videos

Videos covering research methods and statistics

To log in from SAGE, click Institution, then Access via Your Institution, then find and select City University of Seattle.

Identify Themes and Gaps in Literature (with real examples) | Scribbr

Finding connections between sources is key to organizing the arguments and structure of a good literature review. In this video, you'll learn how to identify themes, debates, and gaps between sources, using examples from real papers.

4 Tips for Writing a Literature Review's Intro, Body, and Conclusion | Scribbr

While each review will be unique in its structure, based on both the existing body of literature and the overall goals of your own paper, dissertation, or research, this video from Scribbr does a good job of simplifying the goals of writing a literature review for those who are new to the process. In this video, you’ll learn what to include in each section, as well as 4 tips for the main body illustrated with an example.

  • Literature Review This chapter in SAGE's Encyclopedia of Research Design describes the types of literature reviews and scientific standards for conducting literature reviews.
  • UNC Writing Center: Literature Reviews This handout from the Writing Center at UNC will explain what literature reviews are and offer insights into the form and construction of literature reviews in the humanities, social sciences, and sciences.
  • Purdue OWL: Writing a Literature Review The overview of literature reviews comes from Purdue's Online Writing Lab. It explains the basic why, what, and how of writing a literature review.

Organizational Tools for Literature Reviews

One of the most daunting aspects of writing a literature review is organizing your research. There are a variety of strategies that you can use to help you in this task. We've highlighted just a few ways writers keep track of all that information! You can use a combination of these tools or come up with your own organizational process. The key is choosing something that works with your own learning style.

Citation Managers

Citation managers are great tools, in general, for organizing research, but can be especially helpful when writing a literature review. You can keep all of your research in one place, take notes, and organize your materials into different folders or categories. Read more about citation managers here:

  • Manage Citations & Sources

Concept Mapping

Some writers use concept mapping (sometimes called flow or bubble charts or "mind maps") to help them visualize the ways in which the research they found connects.

There is no right or wrong way to make a concept map. There are a variety of online tools that can help you create a concept map or you can simply put pen to paper. To read more about concept mapping, take a look at the following help guides:

  • Using Concept Maps From Williams College's guide, Literature Review: A Self-guided Tutorial

Synthesis Matrix

A synthesis matrix is a chart you can use to help you organize your research into thematic categories. Organizing your research into a matrix, like the examples below, can help you visualize the ways in which your sources connect.

  • Walden University Writing Center: Literature Review Matrix Find a variety of literature review matrix examples and templates from Walden University.
  • Writing A Literature Review and Using a Synthesis Matrix An example synthesis matrix created by NC State University Writing and Speaking Tutorial Service Tutors. If you would like a copy of this synthesis matrix in a different format, like a Word document, please ask a librarian. CC-BY-SA 3.0
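For writers who keep their notes electronically, the idea of a synthesis matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source names, themes, and notes are hypothetical placeholders, and the helper names `build_matrix` and `render` are made up for this example rather than taken from any of the guides linked above.

```python
from collections import defaultdict

# Hypothetical notes: (source, theme, what the source says about the theme).
notes = [
    ("Source A", "Theme 1", "Supports the claim"),
    ("Source B", "Theme 1", "Reports mixed results"),
    ("Source A", "Theme 2", "Uses survey data"),
    ("Source C", "Theme 2", "Uses interviews"),
]


def build_matrix(notes):
    """Group notes into {theme: {source: note}} so each theme becomes one row."""
    matrix = defaultdict(dict)
    for source, theme, note in notes:
        matrix[theme][source] = note
    return {theme: dict(row) for theme, row in matrix.items()}


def render(matrix, sources):
    """Return the matrix as pipe-separated text, one row per theme."""
    lines = [" | ".join(["Theme"] + sources)]
    for theme, row in matrix.items():
        # A dash marks a source that has nothing to say about this theme.
        lines.append(" | ".join([theme] + [row.get(s, "-") for s in sources]))
    return "\n".join(lines)


matrix = build_matrix(notes)
sources = sorted({source for source, _, _ in notes})
print(render(matrix, sources))
```

Reading across a row shows how the sources relate on one theme, and a dash in a cell points to a source that does not address that theme at all.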
  • Last Updated: Feb 6, 2024 9:20 AM

Creative Commons License

Purdue Online Writing Lab Purdue OWL® College of Liberal Arts

Writing a Literature Review

This page is brought to you by the OWL at Purdue University. When printing this page, you must include the entire legal notice.

Copyright ©1995-2018 by The Writing Lab & The OWL at Purdue and Purdue University. All rights reserved. This material may not be published, reproduced, broadcast, rewritten, or redistributed without permission. Use of this site constitutes acceptance of our terms and conditions of fair use.

A literature review is a document or section of a document that collects key sources on a topic and discusses those sources in conversation with each other (also called synthesis). The lit review is an important genre in many disciplines, not just literature (i.e., the study of works of literature such as novels and plays). When we say “literature review” or refer to “the literature,” we are talking about the research (scholarship) in a given field. You will often see the terms “the research,” “the scholarship,” and “the literature” used mostly interchangeably.

Where, when, and why would I write a lit review?

There are a number of different situations where you might write a literature review, each with slightly different expectations; different disciplines, too, have field-specific expectations for what a literature review is and does. For instance, in the humanities, authors might include more overt argumentation and interpretation of source material in their literature reviews, whereas in the sciences, authors are more likely to report study designs and results in their literature reviews; these differences reflect these disciplines’ purposes and conventions in scholarship. You should always look at examples from your own discipline and talk to professors or mentors in your field to be sure you understand your discipline’s conventions, for literature reviews as well as for any other genre.

A literature review can be a part of a research paper or scholarly article, usually falling after the introduction and before the research methods sections. In these cases, the lit review just needs to cover scholarship that is important to the issue you are writing about; sometimes it will also cover key sources that informed your research methodology.

Lit reviews can also be standalone pieces, either as assignments in a class or as publications. In a class, a lit review may be assigned to help students familiarize themselves with a topic and with scholarship in their field, get an idea of the other researchers working on the topic they’re interested in, find gaps in existing research in order to propose new projects, and/or develop a theoretical framework and methodology for later research. As a publication, a lit review usually is meant to help make other scholars’ lives easier by collecting and summarizing, synthesizing, and analyzing existing research on a topic. This can be especially helpful for students or scholars getting into a new research area, or for directing an entire community of scholars toward questions that have not yet been answered.

What are the parts of a lit review?

Most lit reviews use a basic introduction-body-conclusion structure; if your lit review is part of a larger paper, the introduction and conclusion pieces may be just a few sentences while you focus most of your attention on the body. If your lit review is a standalone piece, the introduction and conclusion take up more space and give you a place to discuss your goals, research methods, and conclusions separately from where you discuss the literature itself.

Introduction:

  • An introductory paragraph that explains what your working topic and thesis are
  • A forecast of key topics or texts that will appear in the review
  • Potentially, a description of how you found sources and how you analyzed them for inclusion and discussion in the review (more often found in published, standalone literature reviews than in lit review sections in an article or research paper)

Body:

  • Summarize and synthesize: Give an overview of the main points of each source and combine them into a coherent whole
  • Analyze and interpret: Don’t just paraphrase other researchers – add your own interpretations where possible, discussing the significance of findings in relation to the literature as a whole
  • Critically Evaluate: Mention the strengths and weaknesses of your sources
  • Write in well-structured paragraphs: Use transition words and topic sentences to draw connections, comparisons, and contrasts.

Conclusion:

  • Summarize the key findings you have taken from the literature and emphasize their significance
  • Connect it back to your primary research question

How should I organize my lit review?

Lit reviews can take many different organizational patterns depending on what you are trying to accomplish with the review. Here are some examples:

  • Chronological : The simplest approach is to trace the development of the topic over time, which helps familiarize the audience with the topic (for instance if you are introducing something that is not commonly known in your field). If you choose this strategy, be careful to avoid simply listing and summarizing sources in order. Try to analyze the patterns, turning points, and key debates that have shaped the direction of the field. Give your interpretation of how and why certain developments occurred (as mentioned previously, this may not be appropriate in your discipline — check with a teacher or mentor if you’re unsure).
  • Thematic : If you have found some recurring central themes that you will continue working with throughout your piece, you can organize your literature review into subsections that address different aspects of the topic. For example, if you are reviewing literature about women and religion, key themes can include the role of women in churches and the religious attitude towards women.
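  • Methodological : If your sources use a variety of research methods, you can compare the results and conclusions that emerge from the different approaches. For example:
      • Qualitative versus quantitative research
      • Empirical versus theoretical scholarship
      • Divide the research by sociological, historical, or cultural sources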
  • Theoretical : In many humanities articles, the literature review is the foundation for the theoretical framework. You can use it to discuss various theories, models, and definitions of key concepts. You can argue for the relevance of a specific theoretical approach or combine various theoretical concepts to create a framework for your research.

What are some strategies or tips I can use while writing my lit review?

Any lit review is only as good as the research it discusses; make sure your sources are well-chosen and your research is thorough. Don’t be afraid to do more research if you discover a new thread as you’re writing. More info on the research process is available in our "Conducting Research" resources .

As you’re doing your research, create an annotated bibliography (see our page on this type of document). Much of the information used in an annotated bibliography can also be used in a literature review, so you’ll be not only partially drafting your lit review as you research, but also developing your sense of the larger conversation going on among scholars, professionals, and any other stakeholders in your topic.

Usually you will need to synthesize research rather than just summarizing it. This means drawing connections between sources to create a picture of the scholarly conversation on a topic over time. Many student writers struggle to synthesize because they feel they don’t have anything to add to the scholars they are citing; here are some strategies to help you:

  • It often helps to remember that the point of these kinds of syntheses is to show your readers how you understand your research, to help them read the rest of your paper.
  • Writing teachers often say synthesis is like hosting a dinner party: imagine all your sources are together in a room, discussing your topic. What are they saying to each other?
  • Look at the in-text citations in each paragraph. Are you citing just one source for each paragraph? This usually indicates summary only. When you have multiple sources cited in a paragraph, you are more likely to be synthesizing them (not always, but often).
  • Read more about synthesis here.

The most interesting literature reviews are often written as arguments (again, as mentioned at the beginning of the page, this is discipline-specific and doesn’t work for all situations). Often, the literature review is where you can establish your research as filling a particular gap or as relevant in a particular way. You have some chance to do this in your introduction in an article, but the literature review section gives a more extended opportunity to establish the conversation in the way you would like your readers to see it. You can choose the intellectual lineage you would like to be part of and whose definitions matter most to your thinking (mostly humanities-specific, but this goes for sciences as well). In addressing these points, you argue for your place in the conversation, which tends to make the lit review more compelling than a simple reporting of other sources.

Research Methods

Literature Review

  • What is a Literature Review?
  • What is NOT a Literature Review?
  • Purposes of a Literature Review
  • Types of Literature Reviews
  • Literature Reviews vs. Systematic Reviews
  • Systematic vs. Meta-Analysis

A literature review is a comprehensive survey of the works published in a particular field of study or line of research, usually over a specific period of time, in the form of an in-depth, critical bibliographic essay or annotated list in which attention is drawn to the most significant works.

Also, we can define a literature review as the collected body of scholarly works related to a topic:

  • Summarizes and analyzes previous research relevant to a topic
  • Includes scholarly books and articles published in academic journals
  • Can be a specific scholarly paper or a section in a research paper

The objective of a literature review is to find previously published scholarly works relevant to a specific topic:

  • Help gather ideas or information
  • Keep up to date in current trends and findings
  • Help develop new questions

A literature review is important because it:

  • Explains the background of research on a topic.
  • Demonstrates why a topic is significant to a subject area.
  • Helps focus your own research questions or problems
  • Discovers relationships between research studies/ideas.
  • Suggests unexplored ideas or populations
  • Identifies major themes, concepts, and researchers on a topic.
  • Tests assumptions; may help counter preconceived ideas and remove unconscious bias.
  • Identifies critical gaps, points of disagreement, or potentially flawed methodology or theoretical approaches.
  • Indicates potential directions for future research.

All content in this section is from Literature Review Research from Old Dominion University 

Keep in mind that a literature review is NOT:

Not an essay 

Not an annotated bibliography  in which you summarize each article that you have reviewed.  A literature review goes beyond basic summarizing to focus on the critical analysis of the reviewed works and their relationship to your research question.

Not a research paper   where you select resources to support one side of an issue versus another.  A lit review should explain and consider all sides of an argument in order to avoid bias, and areas of agreement and disagreement should be highlighted.

A literature review serves several purposes. For example, it

  • provides thorough knowledge of previous studies; introduces seminal works.
  • helps focus one’s own research topic.
  • identifies a conceptual framework for one’s own research questions or problems; indicates potential directions for future research.
  • suggests previously unused or underused methodologies, designs, quantitative and qualitative strategies.
  • identifies gaps in previous studies; identifies flawed methodologies and/or theoretical approaches; avoids replication of mistakes.
  • helps the researcher avoid repetition of earlier research.
  • suggests unexplored populations.
  • determines whether past studies agree or disagree; identifies controversy in the literature.
  • tests assumptions; may help counter preconceived ideas and remove unconscious bias.

As Kennedy (2007) notes*, it is important to think of knowledge in a given field as consisting of three layers. First, there are the primary studies that researchers conduct and publish. Second are the reviews of those studies that summarize and offer new interpretations built from and often extending beyond the original studies. Third, there are the perceptions, conclusions, opinions, and interpretations that are shared informally and become part of the lore of the field. In composing a literature review, it is important to note that it is often this third layer of knowledge that is cited as "true" even though it often has only a loose relationship to the primary studies and secondary literature reviews.

Given this, while literature reviews are designed to provide an overview and synthesis of pertinent sources you have explored, there are several approaches to how they can be done, depending upon the type of analysis underpinning your study. Listed below are definitions of types of literature reviews:

Argumentative Review      This form examines literature selectively in order to support or refute an argument, deeply imbedded assumption, or philosophical problem already established in the literature. The purpose is to develop a body of literature that establishes a contrarian viewpoint. Given the value-laden nature of some social science research [e.g., educational reform; immigration control], argumentative approaches to analyzing the literature can be a legitimate and important form of discourse. However, note that they can also introduce problems of bias when they are used to make summary claims of the sort found in systematic reviews.

Integrative Review      Considered a form of research that reviews, critiques, and synthesizes representative literature on a topic in an integrated way such that new frameworks and perspectives on the topic are generated. The body of literature includes all studies that address related or identical hypotheses. A well-done integrative review meets the same standards as primary research in regard to clarity, rigor, and replication.

Historical Review      Few things rest in isolation from historical precedent. Historical reviews are focused on examining research throughout a period of time, often starting with the first time an issue, concept, theory, or phenomenon emerged in the literature, then tracing its evolution within the scholarship of a discipline. The purpose is to place research in a historical context to show familiarity with state-of-the-art developments and to identify the likely directions for future research.

Methodological Review      A review does not always focus on what someone said [content], but how they said it [method of analysis]. This approach provides a framework of understanding at different levels (i.e. those of theory, substantive fields, research approaches and data collection and analysis techniques), enables researchers to draw on a wide variety of knowledge ranging from the conceptual level to practical documents for use in fieldwork in the areas of ontological and epistemological consideration, quantitative and qualitative integration, sampling, interviewing, data collection and data analysis, and helps highlight many ethical issues which we should be aware of and consider as we go through our study.

Systematic Review      This form consists of an overview of existing evidence pertinent to a clearly formulated research question, which uses pre-specified and standardized methods to identify and critically appraise relevant research, and to collect, report, and analyse data from the studies that are included in the review. Typically it focuses on a very specific empirical question, often posed in a cause-and-effect form, such as "To what extent does A contribute to B?"

Theoretical Review      The purpose of this form is to concretely examine the corpus of theory that has accumulated in regard to an issue, concept, theory, or phenomenon. The theoretical literature review helps establish what theories already exist, the relationships between them, and to what degree the existing theories have been investigated, and helps develop new hypotheses to be tested. Often this form is used to help establish a lack of appropriate theories or reveal that current theories are inadequate for explaining new or emerging research problems. The unit of analysis can focus on a theoretical concept or a whole theory or framework.

* Kennedy, Mary M. "Defining a Literature."  Educational Researcher  36 (April 2007): 139-147.

All content in this section is from The Literature Review created by Dr. Robert Larabee USC

Robinson, P. and Lowe, J. (2015),  Literature reviews vs systematic reviews.  Australian and New Zealand Journal of Public Health, 39: 103-103. doi: 10.1111/1753-6405.12393

What's in the name? The difference between a Systematic Review and a Literature Review, and why it matters . By Lynn Kysh from University of Southern California

Systematic review or meta-analysis?

A  systematic review  answers a defined research question by collecting and summarizing all empirical evidence that fits pre-specified eligibility criteria.

A  meta-analysis  is the use of statistical methods to summarize the results of these studies.

Systematic reviews, just like other research articles, can be of varying quality. They are a significant piece of work (the Centre for Reviews and Dissemination at York estimates that a team will take 9-24 months), and to be useful to other researchers and practitioners they should have:

  • clearly stated objectives with pre-defined eligibility criteria for studies
  • explicit, reproducible methodology
  • a systematic search that attempts to identify all studies
  • assessment of the validity of the findings of the included studies (e.g. risk of bias)
  • systematic presentation, and synthesis, of the characteristics and findings of the included studies

Not all systematic reviews contain meta-analysis. 

Meta-analysis is the use of statistical methods to summarize the results of independent studies. By combining information from all relevant studies, meta-analysis can provide more precise estimates of the effects of health care than those derived from the individual studies included within a review.  More information on meta-analyses can be found in  Cochrane Handbook, Chapter 9 .

A meta-analysis goes beyond critique and integration and conducts secondary statistical analysis on the outcomes of similar studies.  It is a systematic review that uses quantitative methods to synthesize and summarize the results.
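As a rough illustration of how pooling can yield a more precise estimate than any single study, the sketch below applies inverse-variance (fixed-effect) weighting, one common pooling approach. The effect sizes and standard errors are invented for the example and do not come from any study cited in this guide, and the function name `pool_fixed_effect` is an assumption of this sketch. Real meta-analyses also need to consider heterogeneity and random-effects models, which are not shown here.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    # Each study is weighted by 1 / SE^2, so more precise studies count more.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Invented effect sizes and standard errors for three hypothetical studies.
effects = [0.30, 0.10, 0.20]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = pool_fixed_effect(effects, std_errors)
print(f"pooled effect = {pooled:.3f}, standard error = {pooled_se:.3f}")
```

With these made-up inputs the pooled standard error is smaller than the smallest individual standard error, which is the sense in which combining studies gives a more precise estimate.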

An advantage of a meta-analysis is the ability to be completely objective in evaluating research findings.  Not all topics, however, have sufficient research evidence to allow a meta-analysis to be conducted.  In that case, an integrative review is an appropriate strategy. 

Some of the content in this section is from Systematic reviews and meta-analyses: step by step guide created by Kate McAllister.

  • Last Updated: Aug 21, 2023 4:07 PM
  • URL: https://guides.lib.udel.edu/researchmethods

Literature Reviews: Getting Started

What is a literature review?

A literature review is an overview of the available research for a specific scientific topic. Literature reviews summarize existing research to answer a review question, provide context for new research, or identify important gaps in the existing body of literature.

An incredible amount of academic literature is published each year: by some estimates, over two million articles.

Sorting through and reviewing that literature can be complicated, so this Research Guide provides a structured approach to make the process more manageable.

THIS GUIDE IS AN OVERVIEW OF THE LITERATURE REVIEW PROCESS:

  • Getting Started (asking a research question | defining scope)
  • Choosing a Type of Review
  • Searching the Literature
  • Organizing the Literature
  • Writing the Literature Review (analyzing | synthesizing)

A literature search is a systematic search of the scholarly sources in a particular discipline. A literature review is the analysis, critical evaluation and synthesis of the results of that search. During this process you will move from a review of the literature to a review for your research. Your synthesis of the literature is your unique contribution to research.

WHO IS THIS RESEARCH GUIDE FOR?

— those new to reviewing the literature

— those who need a refresher or a deeper understanding of writing literature reviews

You may need to do a literature review as a part of a course assignment, a capstone project, a master's thesis, a dissertation, or as part of a journal article. No matter the context, a literature review is an essential part of the research process. 

Literature Review Process

A chart detailing the steps of the literature review process. The steps include: choose review type, develop research question, create search strategy (contact subject librarians in the library for help with these steps), identify databases, perform literature search, read, evaluate, and organize literature and iterate if necessary, synthesize concepts in literature, then write the literature review.

Purpose of a Literature Review

What is the purpose of a literature review?

A literature review is typically performed for a specific reason. Even when it is given as an assignment, the goal of the literature review will be one or more of the following:

  • To communicate a project's novelty by identifying a research gap

  • To provide an overview of research issues, methodologies, or results relevant to the field
  • To explore the volume and types of available studies
  • To establish familiarity with current research before carrying out a new project
  • To resolve conflicts amongst contradictory previous studies

Reviewing the literature helps you understand a research topic and develop your own perspective.

A LITERATURE REVIEW IS NOT:

  • An annotated bibliography – which is a list of annotated citations to books, articles and documents that includes a brief description and evaluation for each entry
  • A literary review – which is a critical discussion of the merits and weaknesses of a literary work
  • A book review – which is a critical discussion of the merits and weaknesses of a particular book

Attribution

Thanks to Librarian Jamie Niehof at the University of Michigan for providing permission to reuse and remix this Literature Reviews guide.

The Library's Subject Specialists are happy to help with your literature reviews!  Find your Subject Specialist here . 

If you have questions about this guide, contact Librarians Matt Upson, Dr. Frances Alvarado-Albertorio, or Clarke Iakovakis.

  • Last Updated: Apr 4, 2024 4:51 PM
  • URL: https://info.library.okstate.edu/literaturereviews

Study Design 101: Systematic Review

A document often written by a panel that provides a comprehensive review of all relevant studies on a particular clinical or health-related topic/question. The systematic review is created after reviewing and combining all the information from both published and unpublished studies (focusing on clinical trials of similar treatments) and then summarizing the findings.

Advantages

  • Exhaustive review of the current literature and other sources (unpublished studies, ongoing research)
  • Less costly to review prior studies than to create a new study
  • Less time required than conducting a new study
  • Results can be generalized and extrapolated into the general population more broadly than individual studies
  • More reliable and accurate than individual studies
  • Considered an evidence-based resource

Disadvantages

  • Very time-consuming
  • May not be easy to combine studies

Design pitfalls to look out for

Studies included in systematic reviews may be of varying study designs, but should collectively be studying the same outcome.

Is each study included in the review studying the same variables?

Some reviews may group and analyze studies by variables such as age and gender, factors that were not allocated to participants.

Do the analyses in the systematic review fit the variables being studied in the original studies?

Fictitious Example

Does the regular wearing of ultraviolet-blocking sunscreen prevent melanoma? An exhaustive literature search was conducted, resulting in 54 studies on sunscreen and melanoma. Each study was then evaluated to determine whether the study focused specifically on ultraviolet-blocking sunscreen and melanoma prevention; 30 of the 54 studies were retained. The thirty studies were reviewed and showed a strong positive relationship between daily wearing of sunscreen and a reduced diagnosis of melanoma.
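The screening step in this fictitious example can be sketched as a simple filter over study records. Everything in the sketch is assumed for illustration: the `Study` fields, the `screen` helper, and the records themselves, which are synthetic and arranged only so that 30 of the 54 meet both criteria, matching the counts in the example. In practice, reviewers judge each study against written inclusion criteria rather than two yes/no flags.

```python
from dataclasses import dataclass


@dataclass
class Study:
    study_id: int
    uv_blocking_sunscreen: bool  # does the study examine UV-blocking sunscreen?
    melanoma_prevention: bool    # does the study examine melanoma prevention?


def screen(studies):
    """Keep only the studies that meet both inclusion criteria."""
    return [s for s in studies if s.uv_blocking_sunscreen and s.melanoma_prevention]


# Synthetic records: studies 1-30 meet both criteria, 31-40 meet only the
# first, and 41-54 meet only the second.
studies = [Study(i, i <= 40, i <= 30 or i > 40) for i in range(1, 55)]

retained = screen(studies)
print(f"{len(retained)} of {len(studies)} studies retained")
```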

Real-life Examples

Yang, J., Chen, J., Yang, M., Yu, S., Ying, L., Liu, G., ... Liang, F. (2018). Acupuncture for hypertension. The Cochrane Database of Systematic Reviews, 11 (11), CD008821. https://doi.org/10.1002/14651858.CD008821.pub2

This systematic review analyzed twenty-two randomized controlled trials to determine whether acupuncture is a safe and effective way to lower blood pressure in adults with primary hypertension. Due to the low quality of evidence in these studies and lack of blinding, it is not possible to link any short-term decrease in blood pressure to the use of acupuncture. Additional research is needed to determine if there is an effect due to acupuncture that lasts at least seven days.

Parker, H.W. and Vadiveloo, M.K. (2019). Diet quality of vegetarian diets compared with nonvegetarian diets: a systematic review. Nutrition Reviews. https://doi.org/10.1093/nutrit/nuy067

This systematic review was interested in comparing the diet quality of vegetarian and non-vegetarian diets. Twelve studies were included. Vegetarians more closely met recommendations for total fruit, whole grains, seafood and plant protein, and sodium intake. In nine of the twelve studies, vegetarians had higher overall diet quality compared to non-vegetarians. These findings may explain better health outcomes in vegetarians, but additional research is needed to remove any possible confounding variables.

Related Terms

Cochrane Database of Systematic Reviews

A highly regarded database of systematic reviews prepared by The Cochrane Collaboration, an international group of individuals and institutions who review and analyze the published literature.

Exclusion Criteria

The set of conditions that cause some individuals to be excluded from a study (e.g., other health conditions, taking specific medications). Since systematic reviews seek to include all relevant studies, exclusion criteria are not generally utilized in this situation.

Inclusion Criteria

The set of conditions that studies must meet to be included in the review (or for individual studies - the set of conditions that participants must meet to be included in the study; often comprises age, gender, disease type and status, etc.).

Now test yourself!

1. Systematic Reviews are similar to Meta-Analyses, except they do not include a statistical analysis quantitatively combining all the studies.

a) True b) False

2. The panels writing Systematic Reviews may include which of the following publication types in their review?

a) Published studies b) Unpublished studies c) Cohort studies d) Randomized Controlled Trials e) All of the above

Creative Commons License

  • Last Updated: Sep 25, 2023 10:59 AM
  • URL: https://guides.himmelfarb.gwu.edu/studydesign101

  • Open access
  • Published: 06 December 2022

What improves access to primary healthcare services in rural communities? A systematic review

  • Zemichael Gizaw,
  • Tigist Astale &
  • Getnet Mitike Kassie

BMC Primary Care volume 23, Article number: 313 (2022)

To compile key strategies from the international experiences to improve access to primary healthcare (PHC) services in rural communities. Different innovative approaches have been practiced in different parts of the world to improve access to essential healthcare services in rural communities. Systematically collecting and combining best experiences all over the world is important to suggest effective strategies to improve access to healthcare in developing countries. Accordingly, this systematic review of literature was undertaken to identify key approaches from international experiences to enhance access to PHC services in rural communities.

All published and unpublished qualitative and/or mixed method studies conducted to improve access to PHC services were searched for in MEDLINE, Scopus, Web of Science, WHO Global Health Library, and Google Scholar. Articles published in languages other than English, citations with no abstracts and/or full texts, and duplicate studies were excluded. We included all articles available in the electronic databases regardless of publication year. We assessed the methodological quality of the included studies using the Mixed Methods Appraisal Tool (MMAT) version 2018 to minimize the risk of bias. Data were extracted using the JBI mixed methods data extraction form. Data were analyzed qualitatively using an emergent thematic analysis approach to identify key concepts and code them into related, non-mutually exclusive themes.

Our analysis of 110 full-text articles resulted in ten key strategies to improve access to PHC services. Community health programs or community-directed interventions, school-based healthcare services, student-led healthcare services, outreach services or mobile clinics, family health program, empanelment, community health funding schemes, telemedicine, working with traditional healers, working with non-profit private sectors and non-governmental organizations including faith-based organizations are the key strategies identified from international experiences.

This review identified key strategies from international experiences to improve access to PHC services in rural communities. These strategies can play roles in achieving universal health coverage and reducing disparities in health outcomes among rural communities and enabling them to get healthcare when and where they want.


Introduction

Universal health coverage (UHC) aims to expand services and eliminate barriers to access. The World Health Organization (WHO) defines UHC as access to key promotive, preventive, curative and rehabilitative health services for all at an affordable cost, with equity in access ensured. The term universal has been described as the State's legal obligation to provide healthcare to all its citizens, with particular attention to ensuring that all poor and excluded groups are included [ 1 , 2 , 3 ].

Strengthening primary healthcare (PHC) is the most comprehensive, reliable and productive approach to improving people's physical, mental and social well-being, and PHC is a pillar of a sustainable health system for UHC and the health-related sustainable development goals [ 4 , 5 ]. Despite tremendous progress over the last decades, people in all parts of the world still have unaddressed health needs [ 6 , 7 ]. Many people, particularly the poor, people living in rural areas and those in vulnerable circumstances, face challenges in remaining healthy [ 8 ].

Geographical and financial inaccessibility, inadequate funding, inconsistent medication supply, and equipment and personnel shortages have left the reach, availability and effect of PHC services in many countries disappointingly limited [ 9 , 10 ]. The recent Astana Declaration recognized that those aspects of PHC need to change to adapt adequately to current and emerging threats to the healthcare system. The declaration stated that a need-based, comprehensive, cost-effective, accessible, efficient and sustainable healthcare system is needed for disadvantaged and rural populations, delivered in more local and convenient settings so that people receive care when and where they want it [ 8 ].

Different innovative approaches have been practiced in different parts of the world to improve access to essential healthcare services in rural communities. Systematically collecting and combining best experiences all over the world is important to suggest effective strategies to improve access to healthcare in developing countries. Accordingly, this systematic review of literature was undertaken to identify key approaches from international experiences to enhance access to PHC services in rural communities. The findings of this systematic literature review can be used by healthcare professionals, researchers and policy makers to improve healthcare service delivery in rural communities.

Methodology

Research question

What improves access to PHC services in rural communities? We used the PICO (population, issue/intervention, comparison/contrast, and outcome) construct to develop the search question [ 11 ]. The population is rural or remote communities in developing countries who have limited access to healthcare services. We also extended the population to developed countries to capture experiences from both developing and developed countries. The issue/intervention is the implementation of different community-based health interventions to improve access to essential healthcare services. In this systematic review, we focused on PHC services, mainly essential or basic healthcare services, community or public health services, and health promotion or health education. Primary healthcare is “a health care system that addressed social, economic, and political causes of poor health promotes health though health services at the primary care level enhances health of the community” [ 12 ]. Comparison/contrast is not applicable to this review. The outcome is improved access to essential healthcare services.

Outcome measures

The outcome of this review is access to PHC services, such as preventive, promotive, curative, rehabilitative, and palliative health services which are affordable, convenient or acceptable, and available to all who need care.

Criteria for considering studies for this review

All published and unpublished qualitative and/or mixed method studies conducted to improve access to PHC services were included. Reports from government and from international or national organizations were also included; we selected organizations whose primary mission is health or the promotion of community health. We included articles based on these eligibility criteria: context or scope of studies (access to PHC services), article type (primary studies), and publication language (English). Articles published in languages other than English, citations with no abstracts and/or full texts, reviews, and duplicate studies were excluded. We included all articles available in the electronic databases regardless of publication year; time of publication was not used as a screening criterion.

Information sources and search strategy

We searched for relevant articles in MEDLINE, Scopus, Web of Science, WHO Global Health Library, and Google Scholar to access all forms of evidence. An initial search of MEDLINE was undertaken, followed by analysis of the text words contained in the title and abstract, and of the index terms used to describe articles. We used the aforementioned performance indicators of PHC delivery and the PICO construct described above to choose keywords. A second search using all identified keywords and index terms was then undertaken across all included databases. Thirdly, the reference lists of all identified articles were searched for additional studies. The full electronic search strategy for MEDLINE, a major database used for this review, is included as a supplementary file (Additional file 1 : Appendix 1).

Study selection and assessment of methodological quality

Search results from the different electronic databases were exported to EndNote reference manager version 7 to remove duplicates. Two independent reviewers (ZG and BA) screened the records. An initial screening of titles and abstracts was done based on the PICO criteria and language of publication. A secondary screening of full-text papers was done for studies retained at the initial screening phase. We then assessed the full-text articles against the inclusion and exclusion criteria. The reviewers met to discuss the eligibility assessment. The interrater agreement was 90%. Where ratings differed, we resolved disagreements by consensus. We used the PRISMA flow diagram to summarize the study selection process.
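The text reports an interrater agreement of 90% but does not say which statistic was used. As an illustration only, the sketch below assumes simple percent agreement (the share of records on which both reviewers made the same decision). The function name and the ten screening decisions are hypothetical and are not data from the review.

```python
def percent_agreement(ratings_a, ratings_b):
    """Share of records (in %) on which two reviewers made the same decision."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both reviewers must rate the same non-empty set of records")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100 * matches / len(ratings_a)


# Hypothetical include/exclude decisions for ten records (NOT data from the review).
reviewer_1 = ["include", "exclude", "include", "include", "exclude",
              "exclude", "include", "exclude", "include", "include"]
reviewer_2 = ["include", "exclude", "include", "exclude", "exclude",
              "exclude", "include", "exclude", "include", "include"]

agreement = percent_agreement(reviewer_1, reviewer_2)
print(f"Percent agreement: {agreement:.0f}%")
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's kappa would be an alternative, and the source does not indicate which was intended.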

Methodological quality of the included studies was assessed using the Mixed Methods Appraisal Tool (MMAT) version 2018 [ 13 ]. As the MMAT user guide clearly indicates, calculating an overall score from the ratings of each criterion is discouraged. Instead, it advises presenting the ratings of each criterion in more detail to better convey the quality of the included studies. Each criterion was therefore rated according to the detailed explanations in the guideline. Almost all the included full-text articles fulfilled the criteria, and all were found to be of good quality.

Data extraction

We independently extracted data from the papers included in the review using the JBI mixed methods data extraction form. This form is only used for reviews that follow a convergent integrated approach, i.e. integration of qualitative data and quantitative data [ 14 ]. The data extraction form was piloted on randomly selected papers and modified accordingly. One reviewer extracted the data from the included studies and the second reviewer checked the extracted data. Disagreements were resolved by discussion between the two reviewers. The following information was extracted from each included study: list of authors, year of publication, study area, population of interest, study type, methods, focus of the study, main findings, authors’ conclusion, and limitations of the study.

Synthesis of findings

The included full-text articles were analyzed qualitatively using an emergent thematic analysis approach to identify key concepts and code them into related, non-mutually exclusive themes. Themes are strategies mentioned or discussed in the included records to improve access to PHC services. Themes were identified manually by reading the included records repeatedly. We then synthesized each theme by comparing the discussion and conclusion sections of the included articles.

Systematic review registration number

The protocol of this review is registered in PROSPERO (the registration number is: CRD42019132592) to avoid unplanned duplication and to enable comparison of reported review methods with what was planned in the protocol. It is available at https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42019132592 .

Schematic of the systematic review and reporting of the search

We used PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2009 checklist [ 15 ] for reporting of this systematic review.

Study selection

The search strategy identified 1148 titles and abstracts [914 from PubMed (Table 1 ) and 234 from other sources] as of 10 March 2022. After removing duplicates, 900 records remained. Following assessment by title and abstract, 485 records were excluded because they did not meet the criteria described in the methods section. An additional 256 records were discarded because they did not discuss the outcome of interest well or were systematic reviews. The full text of the remaining 159 records was examined in more detail. Of these, 49 studies did not meet the inclusion criteria described in the methods section. One hundred ten records met the inclusion criteria and were included in the systematic review or synthesis (Fig.  1 ).
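The arithmetic behind the selection counts reported above can be checked with a short sketch. All input numbers are taken from the paragraph; the variable names are illustrative, and the number of duplicates removed is derived here (it is not stated in the text).

```python
# Records identified, as reported in the study selection paragraph.
identified = {"PubMed": 914, "other sources": 234}
total_identified = sum(identified.values())

# Records remaining after duplicate removal (reported directly).
after_deduplication = 900
# Derived value: the text does not state this number explicitly.
duplicates_removed = total_identified - after_deduplication

# Exclusions before full-text review.
excluded_title_abstract = 485
excluded_outcome_or_review = 256
full_text_assessed = (
    after_deduplication - excluded_title_abstract - excluded_outcome_or_review
)

# Exclusions at full-text review.
excluded_full_text = 49
included = full_text_assessed - excluded_full_text

print(f"Identified:             {total_identified}")
print(f"Duplicates removed:     {duplicates_removed}")
print(f"Full texts assessed:    {full_text_assessed}")
print(f"Included in synthesis:  {included}")
```

The derived totals match the reported figures of 1148 identified, 159 assessed in full text, and 110 included.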

Figure 1. Study selection flow diagram

Of the 900 articles resulting from the search, 110 (12.2%) met the inclusion criteria. The included full-text articles were published between 1993 and 2021. Ninety-two (83.6%) of the included full-text articles were research articles, 5 (4.5%) were technical reports, 3 (2.7%) were perspectives, 4 (3.6%) were discussion papers, 3 (2.7%) were dissertations or theses, 2 (1.8%) were commentaries, and 1 (0.9%) was a book. Thirty-six (33%) and 29 (26%) of the included full-text articles were conducted in Africa and North America, respectively (Fig.  2 ).
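As a consistency check on the figures in this paragraph, the sketch below recomputes each percentage from the reported counts. The counts come from the text; the dictionary keys and the helper name `pct` are illustrative choices.

```python
# Counts of included articles by publication type, as reported in the text.
publication_types = {
    "research article": 92,
    "technical report": 5,
    "perspective": 3,
    "discussion paper": 4,
    "dissertation or thesis": 3,
    "commentary": 2,
    "book": 1,
}
included = 110   # articles included in the review
screened = 900   # records remaining after duplicate removal


def pct(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# The publication types should account for every included article.
total_by_type = sum(publication_types.values())

shares = {name: pct(count, included) for name, count in publication_types.items()}
inclusion_rate = pct(included, screened)

for name, share in shares.items():
    print(f"{name}: {publication_types[name]} ({share}%)")
print(f"Inclusion rate: {inclusion_rate}%")
```

The counts by type sum to 110, and the recomputed shares agree with the percentages reported in the paragraph.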

Figure 2. Regions where the included full-text articles were conducted

Key strategies identified

The analysis of 110 full-text articles resulted in 10 themes. The themes are key strategies to improve access to PHC services in rural communities. The key strategies identified are community health programs or community-directed healthcare interventions, school-based healthcare services, student-led healthcare services, outreach services or mobile clinics, family health program, empanelment, community health funding schemes, telemedicine, promoting the role of traditional medicine, working with non-profit private sectors and non-governmental organizations (NGOs) including faith-based organizations (Table 2 ).

Description of strategies

a. Community health programs or community-directed healthcare interventions

Twenty-four (21.8%) of the full-text articles included in this review discussed that community health programs (CHPs) or community-directed healthcare interventions are among the best strategies for providing basic health and medical care close to the community to increase access to and coverage of essential health services. Community health programs are locally based health promotion, disease prevention, and treatment programs typically available to communities in need, and the community-directed intervention strategy is an approach in which communities themselves direct the planning and implementation of intervention delivery. Rural communities, especially in developing countries, have no healthcare facilities within a short distance and have less chance of receiving healthcare from doctors, health officers, nurses or midwives. In response to this critical problem, many countries have been investing heavily in community-based primary healthcare to bring services to rural and remote areas where most of the population lives. Community health programs include construction of health posts or community health centers close to the community and deployment of community health workers (CHWs), such as health extension workers, to reach every village; CHWs play a prominent role as the gatekeepers of healthcare in rural communities. Community participation remains crucial in the identification of health problems, the planning or design of health interventions, and their implementation, which enhances need-based and demand-driven provision of health services while promoting sustainability and ownership (Additional file 2 : Appendix 2, Table A1).

b. School-based primary healthcare

In this review, 9 of the 110 (8.2%) included full-text articles pointed out that school-based healthcare services can be effective in improving access to PHC services. School-based health services are health programs that offer healthcare to children and youth either in a school or on school grounds and are usually staffed according to school community needs and resources. School-based health services provide a variety of healthcare services to underserved children, youth and vulnerable populations in a convenient and accessible environment. Access to comprehensive health services via schools leads to improved access to healthcare (Additional file 3 : Appendix 3, Table A2).

c. Student-led healthcare services

In this review, 5 of the 110 (4.5%) full-text articles discussed that using medical and health science students as healthcare service providers can ease problems related to the shortage of health professionals in the rural healthcare system and can play an appreciable role in reducing healthcare access problems in rural communities. Student-led healthcare services are developed through consultation between universities and local health providers and are purposefully designed clinical placements with a focus on clinical educational activities for pre-registration students. Student-led clinics link students, healthcare professionals, community-based organizations, universities, and communities. In this approach, students can gain practical experience in an interdisciplinary setting and through exposure to a community with unique and severe needs (Additional file 4 : Appendix 4, Table A3).

d. Outreach services or mobile clinics

In this systematic literature review, 18 of the 110 (16.4%) included studies discussed that outreach services or mobile clinics in primary care and rural hospital settings can improve access to PHC services in rural communities. Mobile outreach service is defined as healthcare services provided by a mobile team of trained providers, from a higher-level health facility to lower-level health facilities or to locally available community facilities that are not used for clinical services, such as schools, health posts, or other community structures. Outreach services improve access to specialists and hospital-based services, strengthen connections between specialists and PHC providers, and give the benefits of consultations in primary care settings. Specialist outreach services have the potential to overcome access barriers faced by disadvantaged rural and remote communities. Furthermore, community-based mobile clinics can be effective in uncovering illness and in directing patients to a healthcare home (Additional file 5 : Appendix 5, Table A4).

e. Family health program

Four (3.6%) of the included full-text articles discussed that the family health program (FHP) is a highly cost-effective tool for improving access to healthcare services in deprived areas (such as rural communities). The FHP is designed to provide primary care as well as the prevention and early treatment of communicable and non-communicable diseases in defined populations. It has evolved into a robust approach to providing primary care for defined populations by deploying interdisciplinary healthcare teams. The nucleus of each team includes a physician, a nurse, a nurse assistant, and full-time community health agents. This approach is effective in improving access to healthcare and eliminating health disparities (Additional file 6 : Appendix 6, Table A5).

f. Empanelment

This systematic review of literature identified empanelment (also known as rostering) as one of the best strategies for proactively providing coordinated primary healthcare towards achieving universal health coverage. Empanelment is a continuous, iterative set of processes that identify and assign populations to facilities, care teams, or primary care providers who have a responsibility to know their assigned population. It enables health systems to improve health outcomes and to reduce costs. Empanelment establishes a point of care for individuals and simultaneously holds primary healthcare providers and care teams accountable for actively managing care for a specific group of individuals (Additional file 7 : Appendix 7, Table A6).

g. Community health funding schemes

In this systematic review of literature, 11 (10%) of the included articles discussed that community health funding schemes such as community-based health insurance (CBHI) increase access to healthcare services in low-income rural communities. Community-based health insurance schemes are usually voluntary and characterized by community members pooling funds to offset the cost of healthcare. Moreover, this approach is effective in mobilizing domestic resources for health at low income levels. For low-income countries, community health financing has a modest ability to increase the total amount of funds for healthcare. A properly structured community health financing system can significantly improve efficiency, reduce the cost of healthcare, improve quality and health outcomes, and pool risks. Community-financing schemes could improve preventive services and reduce the incidence of diseases. They could also improve people’s access to healthcare and the quality of services, thus improving their health status. Community health financing could also improve risk pooling and reduce health-induced impoverishment. Community health insurance has potential positive impacts on health and social security (Additional file 8 : Appendix 8, Table A7).

h. Telemedicine

In this review, 13 of the 110 (11.8%) articles discussed that telemedicine is one of the solutions for rural subspecialty healthcare delivery. Telemedicine can be defined as the use of technology (computers, video, phone, messaging) by a medical professional to diagnose and treat patients in a remote location. The provision of subspecialty services using telemedicine to a remote and medically underserved population provides improved access to subspecialty care. Telemedicine brings sustainable healthcare to rural populations. The use of information and communication technologies in support of health and health-related fields, including healthcare services, health surveillance, health education, and health research, has the potential to greatly improve health service efficiency and to expand or scale up treatment delivery to thousands of patients in rural populations (Additional file 9 : Appendix 9, Table A8).

i. Promoting the role of traditional medicine

Seven (6.4%) of the included articles showed that incorporating traditional healers into the public health system addresses the healthcare needs of people with limited access to allopathic medicine. Traditional medicine is the sum total of the knowledge, skill, and practices based on the theories, beliefs, and experiences indigenous to different cultures, whether explicable or not, used in the maintenance of health as well as in the prevention, diagnosis, improvement or treatment of physical and mental illness. Knowledge about traditional medicine has a catalyzing effect in meeting health sector development objectives. Integrating traditional medicine into national health systems, in combination with national policy and regulation for products, practices and providers, can enhance access to PHC services in remote populations (Additional file 10 : Appendix 10, Table A9).

j. Working with non-profit private sectors and non-governmental organizations

In this systematic review, 15 of the 110 (13.6%) included articles revealed that working with non-profit private sectors and NGOs strengthens the healthcare system. Involving the non-profit private sectors, faith-based organizations (FBOs), and NGOs in health system strengthening eventually contributes to creating a healthcare system with increased efficiency, more equity and good governance in health. International and local NGOs have endeavored to fill the gaps in access to healthcare services, research and advocacy. Non-profit private sectors and NGOs have a key role in improving health in low- and middle-income countries. With networks that reach even the most remote communities, many FBOs are well positioned to promote demand for and access to healthcare services. Partnership among FBOs is critical in increasing access to healthcare services and in ensuring sustainability by influencing behaviors at the community, family and individual level. Faith-based organizations play an integral role in the healthcare system by increasing health-seeking behaviors and delivering supportive services that address common access and cultural barriers (Additional file 11 : Appendix 11, Table A10).

This systematic literature review found that community health programs or community-directed healthcare interventions, school-based healthcare services, student-led healthcare services, outreach services or mobile clinics, family health program, empanelment, community health funding schemes, telehealth, integrative medicine, and working with non-profit private sectors and NGOs are key strategies to improve access to PHC services in rural communities. The identified strategies address the four major pillars of primary healthcare (i.e., community participation, inter-sectoral coordination, appropriate technology, and support mechanism made available) [ 126 ]. They are also effective in improving access to healthcare services for rural communities, in addressing workforce shortages, and in building the knowledge and skills of the local health workforce in the rural healthcare system. The ability of a healthcare system to meet the health needs of the population depends largely on the knowledge, skills, motivation and deployment of the people responsible for organizing and delivering health services. The results of this review can strengthen the health information system, which is a core element of the healthcare system that ensures community engagement through the dissemination and use of timely and reliable health information for rural populations. This review also suggests strategies to narrow the health disparities among rural populations, which are wide in most Low- and Middle-Income Countries (LMICs). Healthcare services are usually disproportionately concentrated in major urban areas. As a result, rural communities face growing health disparities, largely attributed to weak policies, inefficiencies, and poor leadership and governance in the healthcare system.

This review identified that community health programs or community-directed healthcare interventions address health disparities by ensuring equitable access to health resources in communities where health equity is limited by socioeconomic and geographical factors. Community health programs include identifying and prioritizing public health problems in a specific geographic area; designing and implementing public health interventions (such as establishing community health centers, mobile clinics, and outreach programs); providing services (such as health education, screenings, social support, and counseling), and deploying community health workers to promote healthy behaviors; advocating for improved care for populations at risk; and working with stakeholders to address community healthcare needs [ 16 , 17 , 18 , 127 , 128 , 129 , 130 ]. The community-oriented PHC model, which is socially responsive medicine, makes a healthcare system more rational, accountable, appropriate, and socially relevant to the public. Consequently, this model serves as a paradigm for reforming healthcare systems. Community-directed interventions can be considered a realistic means of increasing the accessibility of interventions at community level in rural areas [ 32 , 33 , 34 , 35 , 36 , 37 , 38 ]. This approach works best where there are cultural barriers to implementing interventions, because it is effective in developing ownership in the community. In-service and on-the-job training for community health workers, close supervision and government support, and program evaluation are very important for strengthening the community health program [ 131 , 132 , 133 ].

This review identified that school-based PHC services are effective strategies to improve access to PHC services. School-based health services provide a variety of healthcare services to children, youth and vulnerable populations in a convenient and accessible environment, which indirectly improves leadership and governance. Science teachers and home room teachers play important roles in implementing this strategy. It has an impact on delivering preventive care such as immunizations, managing chronic illnesses and providing reproductive health services for adolescents. Comprehensive health services via schools improve access to healthcare information [ 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 ]. Access to school around the world increased drastically in the last century [ 134 ]. This high schooling rate is a good opportunity to provide healthcare services to school learners in accessible places and to disseminate health messages to families. Prior research suggests that school-based healthcare services increase access to healthcare by increasing utilization of primary care, prevention services, and health maintenance visits [ 135 , 136 ]. To sustain the impact, science teachers, home room teachers, school principals, students, communities, community health workers, and other interested parties should be included in the school-based healthcare system as main actors or promoters. Health and education sectors should work in collaboration with these actors to plan, implement and monitor progress. School-based healthcare services are preferable where the schooling rate is high and access to healthcare institutions is limited. This strategy is also an alternative in areas where the health-seeking behavior of the community is low.

The use of medical and health science students in the rural healthcare system was identified as a key strategy to minimize health inequalities in rural communities that arise from shortages in the health workforce and the distribution of healthcare resources [ 49 , 50 , 51 , 52 , 53 ]. Student-led health intervention is an alternative approach for providing essential healthcare services to communities where there is a shortage of healthcare workers [ 137 , 138 ]. Students have opportunities to learn professional skills and competencies while providing healthcare services to the community. Moreover, benefits for student learning include increased communication, collaboration, and leadership skills [ 53 , 139 ]. Student-led health intervention also enables increased access to services, more time for assessments and treatments, increased depth of health teaching, holistic and integrated healthcare, and free health supports [ 140 , 141 , 142 , 143 ]. However, the use of medical and health science students in the rural healthcare system may raise ethical and competency issues. Supporting strategies such as close supervision, clear protocols, and the inclusion of senior experts in the team should be considered.

This systematic review of literature found that outreach services or mobile clinics can improve access to PHC service delivery in rural populations [ 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , 68 , 69 ]. In developing countries, the largest proportion of people live in rural areas where doctor services are not available. Rural communities travel to major cities to get specialist services. This reflects a desire for closer integration between primary and secondary care. Specialist outreach services or mobile clinics have become one of the effective solutions for reducing health disparities, improving access to healthcare services, and building the capacity of the local healthcare workforce. This strategy is preferable where tertiary or referral level hospitals carry high loads and where there is high patient leakage in the referral system [ 63 , 64 , 65 , 66 , 67 , 68 , 69 ]. However, implementation may not be easy. It needs a well-established healthcare system and budget. Moreover, the efficiency of care may be lower compared with hospital-based care, and the effect on patients’ health outcomes might be small [ 56 , 57 , 61 ]. Irregular specialist visits in rural areas may not have real impact unless the services are sustained with strong commitment at national and local levels. Outreach activities should be included in health policies with strong leadership and healthcare financing, and private initiatives must be encouraged to maintain the activities over time.

This review revealed that the FHP is a highly effective tool for improving health in rural communities. The FHP has provided a new, more robust model of primary healthcare services designed to provide accessible, first-contact, comprehensive, and whole-person care that is coordinated with other healthcare services. It has led to improved availability of, access to, and use of health services, and to improved health indicators, such as reduced infant mortality, improved detection of cases of neglected diseases, and reduced health disparities [ 73 , 144 , 145 , 146 ]. The FHP deploys interdisciplinary healthcare teams. Each team includes a physician, a nurse, a nurse assistant, and full-time community health agents. Family health teams are organized geographically. The teams are responsible for delivering public health interventions [ 72 , 74 ]. The family health program is an alternative strategy in the rural healthcare system where there are inequities in access to care, high hospitalization rates, low health-seeking behavior in the community, and poor case detection and reporting systems. Despite these remarkable achievements, the FHP faces some challenges, including difficulties in recruiting and retaining doctors trained appropriately to deliver primary healthcare, large variations in the quality of local care, patchy integration of primary care services with existing secondary and tertiary care, and slow adoption of the FHP in large populations [ 147 ].

In this review, empanelment has been identified as one of the best strategies to deliver coordinated primary healthcare towards achieving universal health coverage [ 76 , 77 , 78 , 79 ]. The goal of empanelment is to provide people-centered healthcare services based on people's needs and to ensure that every established patient receives optimal care, whether or not he or she regularly visits healthcare centers. Major activities in this approach include assigning all patients to a healthcare provider panel, updating panel assignments on a regular basis, and using panel data to educate and track patients [ 79 ]. Empanelment enables healthcare systems to improve patient experiences, reduce costs, and improve health outcomes. Empanelment is an effective strategy to deliver four key functions: first-contact accessibility, continuity, comprehensiveness, and coordination [ 148 ]. Effective empanelment requires taking responsibility for the health of a target population, including providing healthcare services based on their health status, which is an important step in moving towards people-centered integrated healthcare [ 79 ].

This review identified that community health funding schemes such as community-based health insurance (CBHI) increase access to healthcare in low-income rural communities. Moreover, this approach is effective in mobilizing domestic resources for health at low income levels [ 80 , 81 , 82 , 83 , 84 , 85 , 86 , 87 , 88 , 89 , 90 ]. Community-based health insurance is an emerging strategy to provide financial protection against the cost of illness. It is an effective strategy to improve access to quality health services for low-income rural households [ 149 ]. The existence of social capital in the community is a determining factor for the effectiveness of CBHI, as social capital has a positive effect on the community's demand for insurance [ 150 , 151 ]. Moreover, solidarity and trust among members are key principles for the good functioning of a CBHI scheme. Solidarity and trust motivate members who are susceptible to risk to pool their resources for common use [ 149 , 152 , 153 ]. Factors that influence people’s decision to join CBHI schemes include the affordability of premiums or contributions, the technical arrangements made by the scheme management, the timing of contribution collection, trust in the integrity and competence of the CBHI managers, the quality of care offered through the CBHI, and accessibility across different population groups [ 154 , 155 ].

In this review, telemedicine has been identified as one of the many possible solutions for rural subspecialty healthcare delivery. Telemedicine is a vital technological tool to increase healthcare access, improve care delivery systems, engage in culturally competent outreach, and support health workforce development and health information systems [ 91 , 92 , 93 , 94 , 95 , 96 , 97 , 98 , 99 , 100 ]. Telemedicine can be a great alternative to the traditional healthcare system in situations such as diagnosis of common medical problems; inquiries about home treatment of various medical issues; post-treatment check-ins or follow-up for chronic care; holidays, weekends, late nights, or any other time when regular medical care is not available; patients' inability to leave the house; and patients' lack of regular access to relevant medical expertise in their geographic area. However, technological issues are a challenge for telemedicine, especially in developing countries. General problems of Internet connectivity and access to infrastructure can reduce the benefits of this strategy. Costs associated with technology can also be a barrier. Furthermore, health technology requires human capacity to use it. Therefore, strengthening information communication technologies (ICT) and building human capacity in ICT are important to address the health needs of rural communities.

This systematic review of the literature identified that promoting the role of TM helps solve problems of access to allopathic medicine. Integrating TM into the health system will result in increased coverage of and access to healthcare services. The role of complementary and alternative medicine in health is undisputed, particularly in light of its role in health promotion and well-being. It also supports local health workforces [ 104 , 105 , 106 , 107 , 108 , 109 ]. Incorporating traditional healers into the public health system addresses healthcare needs [ 156 , 157 ]. However, integrating TM into the public healthcare system is challenging. It is generally believed that TM defies scientific procedures in terms of objectivity, measurement, codification, and classification [ 157 ]. If TM is integrated, it is unclear who would train medical doctors on the ontology, epistemology, and efficacy of TM in modern medicine [ 157 ]. For these reasons, some scholars suggest that both TM and modern medicine be allowed to operate and develop independently of one another [ 158 , 159 ]. Another fundamental challenge to TM is the widely reported cases of fake healers and healings [ 157 ]. Generally, this strategy is more feasible in areas where formal training in integrative medicine is available. Even though integration is challenging, the health sector can use traditional healers as health educators or health promoters by providing training and continuous support. Traditional healers can also serve as facilitators in community-directed approaches. In general, TM can be used in the primary healthcare system where there is no access to allopathic medicine and when conventional medicine is ineffective in treating a disease [ 160 ].

Working with the non-profit private sector and NGOs has been identified as an effective strategy to strengthen the healthcare system in developing countries [ 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 ]. Because governments in developing countries are challenged to meet the health needs of their populations owing to financial constraints, limited human resources, and weak health infrastructure, the private sector (especially the non-profit private sector) and non-governmental organizations can help expand access to healthcare services through their resources, expertise, and infrastructure. However, the presence of an NGO in the operation may contribute to unrealistic expectations of health services, negatively affecting perceptions of those services [ 113 ]. Moreover, in many instances NGOs have reportedly allocated funds only to disease-specific projects (vertical programming) rather than to broad-based investments (horizontal programming) [ 161 ]. There are also concerns that donor expenditures in developing countries are not only unsustainable but also inadequate given the enormous healthcare burden [ 161 , 162 , 163 , 164 ]. To avoid unrealistic expectations and dissatisfaction, and to increase and sustain the population’s trust in the organization, NGOs should operate in a manner that is as integrated as possible within the existing structure and should work close to the populations they serve, with services anchored in the community. Moreover, faith-based organizations contribute to health beyond psychological and spiritual care, for example through disease prevention, health education or promotion, and community health development [ 119 , 120 , 121 , 122 , 123 , 124 ]. Religious organizations can reach all segments of rural populations. Therefore, integrating PHC services, especially health education and promotion, disease prevention, and community health development, with religious organizations intensifies the delivery of healthcare services. Working with FBOs is one of the best approaches where cultural and faith-based barriers are common and in areas where access problems are often related to a lack of providers. However, religious organizations need intensive training on health promotion and the health system to enable them to respond to local contexts within the framework of national policies. Moreover, there should be strong partnerships with government agencies to sustain the effort [ 165 , 166 , 167 , 168 ].

Contribution of this review

Various studies have reported one or more strategies to improve access to primary healthcare services. However, the strategies reported by individual studies have not been compiled, and there is a lack of pooled evidence on effective strategies to improve access to the healthcare system. This systematic literature review was therefore conducted to compile effective strategies to improve access to healthcare services in rural communities. The review suggests key strategies to improve access to PHC services in rural communities. These strategies are implementable in countries that suffer from shortages of health workers and healthcare financing because all of them use locally available opportunities. Local healthcare systems therefore need to scan the opportunities available in their locality for implementing the suggested strategies and to integrate the strategies into the healthcare system to sustain their impact. Healthcare providers, researchers, and policy makers could use the results of this systematic literature review to increase access to healthcare services in hard-to-reach areas. As the strategies are compiled from the experiences of different countries (developed and least developed countries), contextual differences, such as socio-economic, cultural, institutional, and geographical challenges, might affect adoption of the identified strategies. Moreover, some of the experiences come from only one or two countries. Therefore, strategy developers and implementers need to consider these contextual challenges and variations when adopting and implementing the strategies.

Strengths and limitations of the study

As a strength, this systematic review explored the best international experiences (from both developed and developing countries) in primary healthcare service delivery and identified ten key approaches to improve access to PHC services in rural communities. We also searched for relevant published and unpublished articles, dissertations or theses, discussion papers, and perspectives from a wide range of sources, such as MEDLINE, Scopus, Web of Science, WHO Global Health Library, and Google Scholar.

As a limitation, we relied entirely on electronic databases to search for relevant articles. We did not include locally available printed records. We also applied a language limit and excluded articles published in languages other than English. We believe we could have found more relevant articles if we had had access to printed records and had included articles published in languages other than English. Furthermore, since the strategies are compiled from the experiences of different countries (developed and least developed countries), contextual differences, such as socio-economic, cultural, institutional, and geographical challenges, might affect adoption of the identified strategies. There was also limited information in some articles, especially reports, with which to rate their methodological quality. Readers should also note that our review might have missed some important work on improving access to PHC services and that the identified strategies are not the only strategies to improve access to PHC services. There might be other effective strategies that are not included in this review. In addition, generalizability might be affected since some of the experiences come from only one or two countries. Moreover, this review focuses on access, not on the quality of care delivered.

This review identified key strategies from international experiences to improve access to PHC services in rural communities. These strategies are effective in improving access to healthcare services in rural or remote communities. They can also play a role in achieving UHC, reducing disparities in health outcomes, and enabling rural communities to get healthcare when and where they want it. Therefore, incorporating the key strategies suggested by this review into the healthcare system is useful to enhance PHC services and to minimize the impact of health disparities in rural communities. However, the identified strategies may not be easy to implement. Implementing and maintaining these strategies over time requires increasing the number and capacity of human resources for health; strengthening the healthcare financing system; improving the availability of medicines and supplies; working with different partners and communities; establishing monitoring and evaluation systems; ensuring strong and committed leadership; and encouraging private initiatives. Moreover, policy makers, program planners, and implementers who want to use the findings of this review should be aware that these are not the only effective strategies to improve access to primary healthcare services.

Availability of data and materials

All the extracted data are included in the manuscript.

Abbreviations

CBHI: Community-based health insurance

FBOs: Faith-based organizations

FHP: Family health program

ICT: Information communication technologies

MMAT: Mixed methods appraisal tool

NGOs: Non-governmental organizations

PHC: Primary healthcare

PHCPI: Primary Health Care Performance Initiative

PICo: Population, phenomena of interest and context

TM: Traditional medicine

UHC: Universal health coverage

Hampton MB, Kettle AJ, Winterbourn CC. Inside the neutrophil phagosome: oxidants, myeloperoxidase, and bacterial killing. Blood. 1998;92(9):3007–17.

Kirby M. The right to health fifty years on: Still skeptical? Health Hum Rights. 1999;4(1):6–25.

O’Connell T, Rasanathan K, Chopra M. What does universal health coverage mean? The Lancet. 2014;383(9913):277–9.

White F. Primary health care and public health: foundations of universal health systems. Med Princ Pract. 2015;24(2):103–16.

Sanders D, Nandi S, Labonté R, Vance C, Van Damme W. From primary health care to universal health coverage—one step forward and two steps back. The Lancet. 2019;394(10199):619–21.

Brezzi M, Luongo P. Regional Disparities In Access To Health Care. 2016.

Hartley D. Rural health disparities, population health, and rural culture. Am J Public Health. 2004;94(10):1675–8.

Walraven G. The 2018 Astana declaration on primary health care, is it useful? J Glob Health. 2019;9(1).

Gillam S. Is the declaration of Alma Ata still relevant to primary health care? BMJ (Clinical research ed). 2008;336(7643):536–8.

Tollman S, Doherty J, Mulligan JA. General Primary Care. In: Jamison DT, Breman JG, Measham AR, Alleyne G, Claeson M, Evans DB, Jha P, Mills A, Musgrove P, editors. Disease Control Priorities in Developing Countries. Washington: World Bank The International Bank for Reconstruction and Development/The World Bank Group; 2006. Available at https://www.ncbi.nlm.nih.gov/books/NBK11789/pdf/Bookshelf_NBK11789.pdf .

Stern C, Jordan Z, McArthur A. Developing the review question and inclusion criteria. AJN The Am J Nurs. 2014;114(4):53–6.

World Health Organization. Closing the gap in a generation. Commission on Social Determinants of Health Final Report. 2008. Available at https://www.who.int/social_determinants/final_report/csdh_finalreport_2008.pdf . Accessed on 22 March 2022.

Hong QN, Pluye P, Fàbregues S, Bartlett G, Boardman F, Cargo M, Dagenais P, Gagnon M-P, Griffiths F, Nicolau B, O’Cathain A. Mixed methods appraisal tool (MMAT), version 2018. Canada: IC Canadian Intellectual Property Office, Industry; 2018. Available at https://mixedmethodsappraisaltoolpublicpbworks.com/w/file/fetch/127916259/MMAT_2018_criteria-manual_2018-08-01_ENG.pdf .

JBI Manual for Evidence Synthesis. Appendix 8.1 JBI Mixed Methods Data Extraction Form following a Convergent Integrated Approach. Available at https://jbi-global-wiki.refined.site/space/MANUAL/3318284375/Appendix+8.1+JBI+Mixed+Methods+Data+Extraction+Form+following+a+Convergent+Integrated+Approach . Accessed on 12 August 2021. 

Moher D, Liberati A, Tetzlaff J, Altman DG, Group P. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097.

Assefa Y, Gelaw YA, Hill PS, Taye BW, Van Damme W. Community health extension program of Ethiopia, 2003–2018: successes and challenges toward universal coverage for primary healthcare services. Glob Health. 2019;15(1):1–11.

Admassie A, Abebaw D, Woldemichael AD. Impact evaluation of the Ethiopian health services extension programme. J Dev Eff. 2009;1(4):430–49.

Yitayal M, Berhane Y, Worku A, Kebede Y. The community-based Health Extension Program significantly improved contraceptive utilization in West Gojjam Zone, Ethiopia. J Multidiscip Healthc. 2014;7:201.

Croke K, Mengistu AT, O’Connell SD, Tafere K. The impact of a health facility construction campaign on health service utilisation and outcomes: analysis of spatially linked survey and facility location data in Ethiopia. BMJ Glob Health. 2020;5(8):e002430.

Arwal S. Health Posts in Afghanistan. J Gen Practice. 2015;3(213):2.

Negussie A, Girma G. Is the role of Health Extension Workers in the delivery of maternal and child health care services a significant attribute? The case of Dale district, southern Ethiopia. BMC Health Serv Res. 2017;17(1):1–8.

Than KK, Mohamed Y, Oliver V, Myint T, La T, Beeson JG, Luchters S. Prevention of postpartum haemorrhage by community-based auxiliary midwives in hard-to-reach areas of Myanmar: a qualitative inquiry into acceptability and feasibility of task shifting. BMC Pregnancy Childbirth. 2017;17(1):1–10.

Medhanyie A, Spigt M, Kifle Y, Schaay N, Sanders D, Blanco R, GeertJan D, Berhane Y. The role of health extension workers in improving utilization of maternal health services in rural areas in Ethiopia: a cross sectional study. BMC Health Serv Res. 2012;12(1):1–9.

Sakeah E, McCloskey L, Bernstein J, Yeboah-Antwi K, Mills S, Doctor HV. Can community health officer-midwives effectively integrate skilled birth attendance in the community-based health planning and services program in rural Ghana? Reprod Health. 2014;11(1):1–13.

Sarmento DR. Traditional birth attendance (TBA) in a health system: what are the roles, benefits and challenges: a case study of incorporated TBA in Timor-Leste. Asia Pac Fam Med. 2014;13(1):1–9.

Rahmawati R, Bajorek B. A Community Health Worker-Based Program for Elderly People with Hypertension in Indonesia: A Qualitative Study, 2013. Prev Chronic Dis. 2015;12:E175.

Feltner FJ, Ely GE, Whitler ET, Gross D, Dignan M. Effectiveness of community health workers in providing outreach and education for colorectal cancer screening in Appalachian Kentucky. Soc Work Health Care. 2012;51(5):430–40.

Hughes MM, Yang E, Ramanathan D, Benjamins MR. Community-based diabetes community health worker intervention in an underserved Chicago population. J Community Health. 2016;41(6):1249–56.

Panday S, Bissell P, Van Teijlingen E, Simkhada P. The contribution of female community health volunteers (FCHVs) to maternity care in Nepal: a qualitative study. BMC Health Serv Res. 2017;17(1):1–11.

Datiko DG, Lindtjørn B. Health extension workers improve tuberculosis case detection and treatment success in southern Ethiopia: a community randomized trial. PLoS ONE. 2009;4(5):e5443.

le Roux KW, Almirol E, Rezvan PH, Le Roux IM, Mbewu N, Dippenaar E, Stansert-Katzen L, Baker V, Tomlinson M, Rotheram-Borus M. Community health workers impact on maternal and child health outcomes in rural South Africa–a non-randomized two-group comparison study. BMC Public Health. 2020;20(1):1–14.

Witmer A, Seifer SD, Finocchio L, Leslie J, O’Neil EH. Community health workers: integral members of the health care work force. Am J Public Health. 1995;85(8 Pt 1):1055–8.

Wright RA. Community-oriented primary care. The cornerstone of health care reform. JAMA. 1993;269(19):2544–7.

Makaula P, Bloch P, Banda HT, Mbera GB, Mangani C, de Sousa A, Nkhono E, Jemu S, Muula AS. Primary Health Care in rural Malawi - a qualitative assessment exploring the relevance of the community-directed interventions approach. BMC Health Serv Res. 2012;12:328.

Katabarwa MN, Habomugisha P, Richards FO Jr, Hopkins D. Community-directed interventions strategy enhances efficient and effective integration of health care delivery and development activities in rural disadvantaged communities of Uganda. Trop Med Int Health : TM & IH. 2005;10(4):312–21.

Madon S, Malecela MN, Mashoto K, Donohue R, Mubyazi G, Michael E. The role of community participation for sustainable integrated neglected tropical diseases and water, sanitation and hygiene intervention programs: A pilot project in Tanzania. Soc Sci Med. 2018;202:28–37.

Okeibunor JC, Orji BC, Brieger W, Ishola G, Otolorin E, Rawlins B, Ndekhedehe EU, Onyeneho N, Fink G. Preventing malaria in pregnancy through community-directed interventions: evidence from Akwa Ibom State, Nigeria. Malaria J. 2011;10:227.

Brieger WR, Sommerfeld JU, Amazigo UV. The Potential for Community-Directed Interventions: Reaching Underserved Populations in Africa. Int Q Community Health Educ. 2015;35(4):295–316.

Braimah JA, Sano Y, Atuoye KN, Luginaah I. Access to primary health care among women: the role of Ghana’s community-based health planning and services policy. Prim Health Care Res Dev. 2019;20:e82.

Kaplan DW, Brindis CD, Phibbs SL, Melinkovich P, Naylor K, Ahlstrand K. A comparison study of an elementary school–based health center: effects on health care access and use. Arch Pediatr Adolesc Med. 1999;153(3):235–43.

Allison MA, Crane LA, Beaty BL, Davidson AJ, Melinkovich P, Kempe A. School-based health centers: improving access and quality of care for low-income adolescents. Pediatrics. 2007;120(4):e887–94.

Keeton V, Soleimanpour S, Brindis CD. School-based health centers in an era of health care reform: Building on history. Curr Probl Pediatr Adolesc Health Care. 2012;42(6):132–56.

Brindis CD, Klein J, Schlitt J, Santelli J, Juszczak L, Nystrom RJ. School-based health centers: Accessibility and accountability. J Adolesc Health. 2003;32(6):98–107.

Hutchinson P, Carton TW, Broussard M, Brown L, Chrestman S. Improving adolescent health through school-based health centers in post-Katrina New Orleans. Child Youth Serv Rev. 2012;34(2):360–8.

Paschall MJ, Bersamin M. School-based health centers, depression, and suicide risk among adolescents. Am J Prev Med. 2018;54(1):44–50.

Minguez M, Santelli JS, Gibson E, Orr M, Samant S. Reproductive health impact of a school health center. J Adolesc Health. 2015;56(3):338–44.

Gibson EJ, Santelli JS, Minguez M, Lord A, Schuyler AC. Measuring school health center impact on access to and quality of primary care. J Adolesc Health. 2013;53(6):699–705.

Bozigar M. A Cross-Sectional Survey to Evaluate Potential for Partnering With School Nurses to Promote Human Papillomavirus Vaccination. Prev Chronic Dis. 2020;17:E111.

Suen J, Attrill S, Thomas JM, Smale M, Delaney CL, Miller MD. Effect of student-led health interventions on patient outcomes for those with cardiovascular disease or cardiovascular disease risk factors: a systematic review. BMC Cardiovasc Disord. 2020;20(1):1–10.

Atuyambe LM, Baingana RK, Kibira SP, Katahoire A, Okello E, Mafigiri DK, Ayebare F, Oboke H, Acio C, Muggaga K. Undergraduate students’ contributions to health service delivery through community-based education. BMC Med Educ. 2016;16:123.

Stuhlmiller CM, Tolchard B. Developing a student-led health and wellbeing clinic in an underserved community: collaborative learning, health outcomes and cost savings. BMC Nurs. 2015;14(1):1–8.

Campbell DJ, Gibson K, O’Neill BG, Thurston WE. The role of a student-run clinic in providing primary care for Calgary’s homeless populations: a qualitative study. BMC Health Serv Res. 2013;13(1):1–6.

Simpson SA, Long JA. Medical student-run health clinics: important contributors to patient care and medical education. J Gen Intern Med. 2007;22(3):352–6.

Gruen RL, O’Rourke IC, Bailie RS, d’Abbs PH, O’Brien MM, Verma N. Improving access to specialist care for remote Aboriginal communities: evaluation of a specialist outreach service. Med J Aust. 2001;174(10):507–11.

Gruen RL, Weeramanthri T, Bailie R. Outreach and improved access to specialist services for indigenous people in remote Australia: the requirements for sustainability. J Epidemiol Community Health. 2002;56(7):517–21.

Gruen RL, Bailie RS, Wang Z, Heard S, O’Rourke IC. Specialist outreach to isolated and disadvantaged communities: a population-based study. The Lancet. 2006;368(9530):130–8.

Bond M, Bowling A, Abery A, McClay M, Dickinson E. Evaluation of outreach clinics held by specialists in general practice in England. J Epidemiol Community Health. 2000;54(2):149–56.

Irani M, Dixon M, Dean JD. Care closer to home: past mistakes, future opportunities. J R Soc Med. 2007;100(2):75–7.

Bailey JJ, Black ME, Wilkin D. Specialist outreach clinics in general practice. BMJ (Clinical research ed). 1994;308(6936):1083–6.

De Roodenbeke E, Lucas S, Rouzaut A, Bana F. Outreach services as a strategy to increase access to health workers in remote and rural areas. Geneva: WHO; 2011.

Bowling A, Stramer K, Dickinson E, Windsor J, Bond M. Evaluation of specialists’ outreach clinics in general practice in England: process and acceptability to patients, specialists, and general practitioners. J Epidemiol Community Health. 1997;51(1):52–61.

Spencer N. Consultant paediatric outreach clinics–a practical step in integration. Arch Dis Child. 1993;68(4):496–500.

Aljasir B, Alghamdi MS. Patient satisfaction with mobile clinic services in a remote rural area of Saudi Arabia. East Mediterr Health J. 2010;16(10):1085–90.

Lee EJ, O’Neal S. A mobile clinic experience: nurse practitioners providing care to a rural population. J Pediatr Health Care. 1994;8(1):12–7.

Cone PH, Haley JM. Mobile clinics in Haiti, part 1: Preparing for service-learning. Nurse Educ Pract. 2016;21:1–8.

Diaz-Perez Mde J, Farley T, Cabanis CM. A program to improve access to health care among Mexican immigrants in rural Colorado. J Rural Health. 2004;20(3):258–64.

Hill C, Zurakowski D, Bennet J, Walker-White R, Osman JL, Quarles A, Oriol N. Knowledgeable Neighbors: a mobile clinic model for disease prevention and screening in underserved communities. Am J Public Health. 2012;102(3):406–10.

Edgerley LP, El-Sayed YY, Druzin ML, Kiernan M, Daniels KI. Use of a community mobile health van to increase early access to prenatal care. Matern Child Health J. 2007;11(3):235–9.

Peters G, Doctor H, Afenyadu G, Findley S, Ager A. Mobile clinic services to serve rural populations in Katsina State, Nigeria: perceptions of services and patterns of utilization. Health Policy Plan. 2014;29(5):642–9.

Neke NM, Gadau G, Wasem J. Policy makers’ perspective on the provision of maternal health services via mobile health clinics in Tanzania—Findings from key informant interviews. PLoS ONE. 2018;13(9):e0203588.

Padmadas SS, Johnson FA, Leone T, Dahal GP. Do mobile family planning clinics facilitate vasectomy use in Nepal? Contraception. 2014;89(6):557–63.

Macinko J, Harris MJ. Brazil’s family health strategy—delivering community-based primary care in a universal health system. N Engl J Med. 2015;372(23):2177–81.

Macinko J, Lima Costa MF. Access to, use of and satisfaction with health services among adults enrolled in Brazil’s Family Health Strategy: evidence from the 2008 National Household Survey. Tropical Med Int Health. 2012;17(1):36–42.

Dourado I, Oliveira VB, Aquino R, Bonolo P, Lima-Costa MF, Medina MG, Mota E, Turci MA, Macinko J. Trends in primary health care-sensitive conditions in Brazil: the role of the Family Health Program (Project ICSAP-Brazil). Medical care. 2011;49:577–84.

Aquino R, De Oliveira NF, Barreto ML. Impact of the family health program on infant mortality in Brazilian municipalities. Am J Public Health. 2009;99(1):87–93.

Chong P-N, Tang WE. Transforming primary care—the way forward with the TEAMS2 approach. Fam Pract. 2019;36(3):369–70.

Primary Health Care Performance Initiative (PHCPI). Improvement strategies model: Population health management: Empanelment. Available at https://improvingphc.org/sites/default/files/Empanelment%20-%20v1.2%20-%20last%20updated%2012.13.2019.pdf . Accessed on 18 March 2022.

McGough P, Chaudhari V, El-Attar S, Yung P. A health system’s journey toward better population health through empanelment and panel management. Healthcare. 2018;6(66):1–9.

Bearden T, Ratcliffe HL, Sugarman JR, Bitton A, Anaman LA, Buckle G, Cham M, Quan DCW, Ismail F, Jargalsaikhan B. Empanelment: A foundational component of primary health care. Gates Open Res. 2019;3:1654.

Hsiao WC. Unmet health needs of two billion: is community financing a solution? 2001.

Wang W, Temsah G, Mallick L. The impact of health insurance on maternal health care utilization: evidence from Ghana, Indonesia and Rwanda. Health Policy Plan. 2017;32(3):366–75.

Atnafu DD, Tilahun H, Alemu YM. Community-based health insurance and healthcare service utilisation, North-West, Ethiopia: a comparative, cross-sectional study. BMJ Open. 2018;8(8):e019613.

USAID. Ethiopia’s Community-based Health Insurance: A Step on the Road to Universal Health Coverage. Available at https://www.hfgproject.org/ethiopias-community-based-health-insurance-step-road-universal-health-coverage/ . Accessed on 18 March 2022.

Blanchet NJ, Fink G, Osei-Akoto I. The effect of Ghana’s National Health Insurance Scheme on health care utilisation. Ghana Med J. 2012;46(2):76–84.

Nshakira-Rukundo E, Mussa EC, Nshakira N, Gerber N, von Braun J. Impact of community-based health insurance on utilisation of preventive health services in rural Uganda: a propensity score matching approach. Int J Health Econ Manag. 2021;21(2):203–27.

Mwaura JW, Pongpanich S. Access to health care: the role of a community based health insurance in Kenya. Pan Afr Med J. 2012;12(1):35.

Jutting JP. The Impact Of Health Insurance On The Access To Health Care And Financial Protection In Rural Developing Countries: The Example of Senegal. HNP discussion paper series. World Bank, Washington, DC. © World Bank. 2011. https://openknowledge.worldbank.org/handle/10986/13774 . License: CC BY 3.0 IGO.

Balamiento NC. The impact of social health insurance on healthcare utilization outcomes: evidence from the indigent program of the Philippine National Health Insurance. International Institute of Social Studies. 2018. Available at https://thesis.eur.nl/pub/46445/Balamiento,%20Neeanne_MA_2017_18%20_ECD.pdf . Accessed 30 Nov 2022.

Farrell CM, Gottlieb A. The effect of health insurance on health care utilization in the justice-involved population: United States, 2014–2016. Am J Public Health. 2020;110(S1):S78–84.

Thuong NTT. Impact of health insurance on healthcare utilisation patterns in Vietnam: a survey-based analysis with propensity score matching method. BMJ Open. 2020;10(10):e040062.

Custodio R, Gard AM, Graham G. Health information technology: addressing health disparity by improving quality, increasing access, and developing workforce. J Health Care Poor Underserved. 2009;20(2):301–7.

Meier CA, Fitzgerald MC, Smith JM. eHealth: extending, enhancing, and evolving health care. Annu Rev Biomed Eng. 2013;15:359–82.

Anstey Watkins JOT, Goudge J, Gomez-Olive FX, Griffiths F. Mobile phone use among patients and health workers to enhance primary healthcare: A qualitative study in rural South Africa. Soc Sci Med. 2018;198:139–47.

Kuntalp M, Akar O. A simple and low-cost Internet-based teleconsultation system that could effectively solve the health care access problems in underserved areas of developing countries. Comput Methods Programs Biomed. 2004;75(2):117–26.

Price M, Yuen EK, Goetter EM, Herbert JD, Forman EM, Acierno R, Ruggiero KJ. mHealth: a mechanism to deliver more accessible, more effective mental health care. Clin Psychol Psychother. 2014;21(5):427–36.

Bashshur RL, Shannon GW, Krupinski EA, Grigsby J, Kvedar JC, Weinstein RS, Sanders JH, Rheuban KS, Nesbitt TS, Alverson DC, et al. National telemedicine initiatives: essential to healthcare reform. Telemed J E Health. 2009;15(6):600–10.

Norton SA, Burdick AE, Phillips CM, Berman B. Teledermatology and underserved populations. Arch Dermatol. 1997;133(2):197–200.

Raza T, Joshi M, Schapira RM, Agha Z. Pulmonary telemedicine–a model to access the subspecialist services in underserved rural areas. Int J Med Informatics. 2009;78(1):53–9.

Shouneez YH. Smartphone hearing screening in mHealth assisted community-based primary care. UPSpace Institutional Repository, Department of Liberary Service. Dissertation (MCommPath)--University of Pretoria. 2016. Available at http://hdl.handle.net/2263/53477 . Accessed 17 Mar 2022.

Marcin JP, Ellis J, Mawis R, Nagrampa E, Nesbitt TS, Dimand RJ. Using telemedicine to provide pediatric subspecialty care to children with special health care needs in an underserved rural community. Pediatrics. 2004;113(1 Pt 1):1–6.

Olu O, Muneene D, Bataringaya JE, Nahimana M-R, Ba H, Turgeon Y, Karamagi HC, Dovlo D. How can digital health technologies contribute to sustainable attainment of universal health coverage in Africa? A perspective. Front Public Health. 2019;7:341.

Ryan MH, Yoder J, Flores SK, Soh J, Vanderbilt AA. Using health information technology to reach patients in underserved communities: A pilot study to help close the gap with health disparities. Global J Health Sci. 2016;8(6):86.

Buckwalter KC, Davis LL, Wakefield BJ, Kienzle MG, Murray MA. Telehealth for elders and their caregivers in rural communities. Fam Community Health. 2002;25(3):31–40.

WHO Regional Committee for Africa. Promoting the role of traditional medicine in health systems: a strategy for the African Region. World Health Organization. Regional Office for Africa. Available at http://www.who.int/iris/handle/10665/95467. .

Mishra SR, Neupane D, Kallestrup P. Integrating complementary and alternative medicine into conventional health care system in developing countries: an example of Amchi. J Evid-Based Complementary Altern Med. 2015;20(1):76–9.

Mbwambo ZH, Mahunnah RL, Kayombo EJ. Traditional health practitioner and the scientist: bridging the gap in contemporary health research in Tanzania. Tanzan Health Res Bull. 2007;9(2):115–20.

Poudyal AK, Jimba M, Murakami I, Silwal RC, Wakai S, Kuratsuji T. A traditional healers’ training model in rural Nepal: strengthening their roles in community health. Trop Med Int Health : TM & IH. 2003;8(10):956–60.

Payyappallimana U. Role of Traditional Medicine in Primary Health Care: An Overview of Perspectives and Challenges. Yokohama J Social Sciences. 2009;14(6):723–43.

Kange’ethe SM. Traditional healers as caregivers to HIV/AIDS clients and other terminally challenged persons in Kanye community home-based care programme (CHBC), Botswana. SAHARA J. 2009;6(2):83–91.

Habtom GK. Integrating traditional medical practice with primary healthcare system in Eritrea. J Complement Integr Med. 2015;12(1):71–87.

Ejaz I, Shaikh BT, Rizvi N. NGOs and government partnership for health systems strengthening: a qualitative study presenting viewpoints of government, NGOs and donors in Pakistan. BMC Health Serv Res. 2011;11(1):1–7.

Wu FS. International non-governmental actors in HIV/AIDS prevention in China. Cell Res. 2005;15(11):919–22.

Biermann O, Eckhardt M, Carlfjord S, Falk M, Forsberg BC. Collaboration between non-governmental organizations and public services in health–a qualitative case study from rural Ecuador. Glob Health Action. 2016;9(1):32237.

Mercer A, Khan MH, Daulatuzzaman M, Reid J. Effectiveness of an NGO primary health care programme in rural Bangladesh: evidence from the management information system. Health Policy Plan. 2004;19(4):187–98.

Baqui AH, Rosecrans AM, Williams EK, Agrawal PK, Ahmed S, Darmstadt GL, Kumar V, Kiran U, Panwar D, Ahuja RC. NGO facilitation of a government community-based maternal and neonatal health programme in rural India: improvements in equity. Health Policy Plan. 2008;23(4):234–43.

Ricca J, Kureshy N, LeBan K, Prosnitz D, Ryan L. Community-based intervention packages facilitated by NGOs demonstrate plausible evidence for child mortality impact. Health Policy Plan. 2014;29(2):204–16.

Ahmed N, DeRoeck D, Sadr-Azodi N. Private sector engagement and contributions to immunisation service delivery and coverage in Sudan. BMJ Glob Health. 2019;4(2):e001414.

Edimond BJ. The Contribution of Non-Governmental Organizations in Delivery of Basic Health Services in Partnership with Local Government. Doctoral Dissertation, Uganda Martyrs University. 2014.

Chand S, Patterson J: Faith-Based Models for Improving Maternal and Newborn Health. IMA World Health and ActionAid International USA, 2007 Available at https://imaworldhealthorg/wp-content/uploads/2014/06/faith_based_models_for_improving_maternal_and_newborn_health.pdf

Magezi V. Churchdriven primary health care: Models for an integrated church and community primary health care in Africa (a case study of the Salvation Army in East Africa). HTS Teologiese Studies/ Theological Studies. 2018;74(2):4365.

Villatoro AP, Dixon E, Mays VM. Faith-based organizations and the Affordable Care Act: Reducing Latino mental health care disparities. Psychol Serv. 2016;13(1):92–104.

Levin J. Faith-based initiatives in health promotion: history, challenges, and current partnerships. American journal of health promotion : AJHP. 2014;28(3):139–41.

Green A, Shaw J, Dimmock F, Conn C. A shared mission? Changing relationships between government and church health services in Africa. Int J Health Plann Manage. 2002;17(4):333–53.

Bandy G, Crouch A. Building from common foundations : the World Health Organization and faith-based organizations in primary healthcare. World Health Organization; 2008. Available at https://apps.who.int/iris/handle/10665/43884 . Accessed 16 Mar 2022.

Zahnd WE, Jenkins WD, Shackelford J, Lobb R, Sanders J, Bailey A. Rural cancer screening and faith community nursing in the era of the Affordable Care Act. J Health Care Poor Underserved. 2018;29(1):71–80.

Wagle K. Primary Health Care (PHC): History, Principles, Pillars, Elements & Challenges. Global Health, 2020. Available at https://www.publichealthnotes.com/primary-health-care-phc-history-principles-pillars-elements-challenges/ . Accessed 4 June 2022.

Bhatt J, Bathija P. Ensuring access to quality health care in vulnerable communities. Acad Med. 2018;93(9):1271.

Arvey SR, Fernandez ME. Identifying the core elements of effective community health worker programs: a research agenda. Am J Public Health. 2012;102(9):1633–7.

Pennel CL, McLeroy KR, Burdine JN, Matarrita-Cascante D, Wang J. Community health needs assessment: potential for population health improvement. Popul Health Manag. 2016;19(3):178–86.

Chudgar RB, Shirey LA, Sznycer-Taub M, Read R, Pearson RL, Erwin PC. Local health department and academic institution linkages for community health assessment and improvement processes: a national overview and local case study. J Public Health Manag Pract. 2014;20(3):349–55.

Desta FA, Shifa GT, Dagoye DW, Carr C, Van Roosmalen J, Stekelenburg J, Nedi AB, Kols A, Kim YM. Identifying gaps in the practices of rural health extension workers in Ethiopia: a task analysis study. BMC Health Serv Res. 2017;17(1):1–9.

Lehmann U, Sanders D. Community health workers: what do we know about them. The state of the evidence on programmes, activities, costs and impact on health outcomes of using community health workers Geneva: World Health Organization; 2007. Available at https://www.hrhresourcecenter.org/node/1587.html . Accessed 17 Mar 2022.

Chen N, Raghavan M, Albert J, McDaniel A, Otiso L, Kintu R, West M, Jacobstein D. The community health systems reform cycle: strengthening the integration of community health worker programs through an institutional reform perspective. Global Health: Sci Practice. 2021;9(Supplement 1):S32–46.

Roser M, Ortiz-Ospina E: Global rise of education. Our World in Data 2017. Available at https://ourworldindata.org/global-rise-of-education . Accessed on 29 May 2019.

Santelli J, Morreale M, Wigton A, Grason H. School health centers and primary care for adolescents: a review of the literature. J Adolesc Health. 1996;18(5):357–66.

Wade TJ, Mansour ME, Guo JJ, Huentelman T, Line K, Keller KN. Access and utilization patterns of school-based health centers at urban and rural elementary and middle schools. Public Health Reports. 2008;123(6):739–50.

Johnson I, Hunter L, Chestnutt IG. Undergraduate students’ experiences of outreach placements in dental secondary care settings. Eur J Dent Educ. 2012;16(4):213–7.

Ndira S, Ssebadduka D, Niyonzima N, Sewankambo N, Royall J. Tackling malaria, village by village: a report on a concerted information intervention by medical students and the community in Mifumi Eastern Uganda. Afr Health Sci. 2014;14(4):882–8.

Frakes K-a, Brownie S, Davies L, Thomas JB, Miller M-E, Tyack Z. Capricornia Allied Health Partnership (CAHP): a case study of an innovative model of care addressing chronic disease through a regional student-assisted clinic. Aust Health Rev. 2014;38(5):483–6.

Frakes KA, Brownie S, Davies L, Thomas J, Miller ME, Tyack Z. The sociodemographic and health-related characteristics of a regional population with chronic disease at an interprofessional student-assisted clinic in Q ueensland C apricornia A llied H ealth P artnership. Aust J Rural Health. 2013;21(2):97–104.

Frakes K-A, Tyzack Z, Miller M, Davies L, Swanston A, Brownie S. The Capricornia Project: Developing and implementing an interprofessional student-assisted allied health clinic. 2011.

Frakes K-A, Brownie S, Davies L, Thomas J, Miller M-E, Tyack Z. Experiences from an interprofessional student-assisted chronic disease clinic. J Interprof Care. 2014;28(6):573–5.

Schutte T, Tichelaar J, Dekker RS, van Agtmael MA, de Vries TP, Richir MC. Learning in student-run clinics: A systematic review. Med Educ. 2015;49(3):249–63.

Paim J, Travassos C, Almeida C, Bahia L, Macinko J. The Brazilian health system: history, advances, and challenges. The Lancet. 2011;377(9779):1778–97.

Rocha R, Soares RR. Evaluating the impact of community-based health interventions: evidence from Brazil’s Family Health Program. Health Econ. 2010;19(S1):126–58.

Rasella D, Harhay MO, Pamponet ML, Aquino R, Barreto ML. Impact of primary health care on mortality from heart and cerebrovascular diseases in Brazil: a nationwide analysis of longitudinal data. BMJ (Clinical research ed). 2014;349:g4014.

Harris M. Brazil’s Family Health Programme: A cost effective success that higher income countries could learn from. BMJ: Br Med J. 2010;341(7784):1171–2.

Starfield B. Is primary care essential? The lancet. 1994;344(8930):1129–33.

Donfouet HPP, Mahieu P-A. Community-based health insurance and social capital: a review. Heal Econ Rev. 2012;2(1):1–5.

Zhang L, Wang H, Wang L, Hsiao W. Social capital and farmer’s willingness-to-join a newly established community-based health insurance in rural China. Health Policy. 2006;76(2):233–42.

Donfouet HPP. Essombè J-RE, Mahieu P-A, Malin E: Social capital and willingness-to-pay for community-based health insurance in rural Cameroon. Global J Health Sci. 2011;3(1):142.

Grunau J. Exploring people’s motivation to join or not to join the community-based health insurance’Sina Passenang’in Sotouboua, Togo. 2013.

Gitahi JW. Innovative Healthcare Financing and Equity through Community Based Health Insurance Schemes (CBHHIS) In Kenya. United States International University-Africa Digital Repository. Available at http://erepo.usiu.ac.ke/11732/3654 . Accessed 18 May 2022.

Carrin G, Waelkens MP, Criel B. Community-based health insurance in developing countries: a study of its contribution to the performance of health financing systems. Tropical Med Int Health. 2005;10(8):799–811.

Umeh CA, Feeley FG. Inequitable access to health care by the poor in community-based health insurance programs: a review of studies from low-and middle-income countries. Global Health: Science And Practice. 2017;5(2):299–314.

Odebiyi AI. Western trained nurses’ assessment of the different categories of traditional healers in southwestern Nigeria. Int J Nurs Stud. 1990;27(4):333–42.

Abdullahi AA. Trends and challenges of traditional medicine in Africa. Afr J Tradit Complement Altern Med : AJTCAM. 2011;8(5 Suppl):115–23.

Taye OR. Yoruba Traditional Medicine and the Challenge of Integration. The J Pan Afr Studies. 2009;3(3):73–90.

Konadu K. Medicine and Anthropology in Twentieth Century Africa: Akan Medicine and Encounters with (Medical) Anthropology. African Studies Quarterly. 2008;10(2 & 3).

Benzie IF, Wachtel-Galor S: Herbal medicine: biomolecular and clinical aspects. 2nd Ed. 2011. Available at https://www.crcpress.com/Herbal-Medicine-Biomolecular-and-Clinical-Aspects-Second-Edition/Benzie-Wachtel-Galor/p/book/9781439807132 . Accessed 21 May 2022.

Ejughemre U. Donor support and the impacts on health system strengthening in sub-saharan africa: assessing the evidence through a review of the literature. Am J Public Health Res. 2013;1(7):146–51.

Seppey M, Ridde V, Touré L, Coulibaly A. Donor-funded project’s sustainability assessment: a qualitative case study of a results-based financing pilot in Koulikoro region. Mali Globalization and health. 2017;13(1):1–15.

Shaw RP, Wang H, Kress D, Hovig D. Donor and domestic financing of primary health care in low income countries. Health Systems & Reform. 2015;1(1):72–88.

Gotsadze G, Chikovani I, Sulaberidze L, Gotsadze T, Goguadze K, Tavanxhi N. The challenges of transition from donor-funded programs: results from a theory-driven multi-country comparative case study of programs in Eastern Europe and Central Asia supported by the Global Fund. Global Health: Science and Practice. 2019;7(2):258–72.

Ascroft J, Sweeney R, Samei M, Semos I, Morgan C. Strengthening church and government partnerships for primary health care delivery in Papua New Guinea: Lessons from the international experience. Health policy and health finance knowledge hub Working paper series. 2011(16).

Campbell MK, Hudson MA, Resnicow K, Blakeney N, Paxton A, Baskin M. Church-based health promotion interventions: evidence and lessons learned. Annu Rev Public Health. 2007;28:213–34.

Olivier J, Wodon Q. The role of faith-inspired health care providers in Sub-Saharan Africa and public private partnerships: Strengthening the Evidence for faith-inspired health engagement in Africa, Volume 1. Health, Nutrition and Population (HNP) Discussion Paper Series 76223v1. Available at https://documents1.worldbank.org/curated/en/851911468203673017 . Accessed 20 May 2022.

Schumann C, Stroppa A, Moreira-Almeida A. The contribution of faith-based health organisations to public health. Int Psychiatry. 2011;8(3):62–4.


Acknowledgements

The authors would like to thank IPHC-E for funding this review.

This review was funded by the International Institute for Primary Health Care-Ethiopia (IPHC-E).

Author information

Authors and affiliations

Department of Environmental and Occupational Health and Safety, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia

Zemichael Gizaw

International Institute for Primary Health Care- Ethiopia, Ethiopian Public Health Institute, Addis Ababa, Ethiopia

Tigist Astale & Getnet Mitike Kassie


Contributions

ZG prepared the manuscript. TA and GMK critically reviewed the protocol and manuscript. All the authors read and approved the final manuscript.

Corresponding author

Correspondence to Zemichael Gizaw.

Ethics declarations

Ethics approval and consent to participate

A systematic review does not require ethics approval.

Consent for publication

This manuscript does not contain any individual person’s data.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Search strategy. MEDLINE (PubMed).

Additional file 2: Appendix 2: Table A1. Description of full-text articles which discussed community health programs or community-directed interventions as a strategy to improve PHC service delivery in rural communities.

Additional file 3: Appendix 3: Table A2. Description of full-text articles which discussed school-based healthcare services as a strategy to improve PHC service delivery in rural communities.

Additional file 4: Appendix 4: Table A3. Description of full-text articles which discussed student-led healthcare services as a strategy to improve PHC service delivery in rural communities.

Additional file 5: Appendix 5: Table A4. Description of full-text articles which discussed outreach services or mobile clinics as a strategy to improve PHC service delivery in rural communities.

Additional file 6: Appendix 6: Table A5. Description of full-text articles which discussed family health program as a strategy to improve PHC service delivery in rural communities.

Additional file 7: Appendix 7: Table A6. Description of full-text articles which discussed empanelment as a strategy to improve PHC service delivery in rural communities.

Additional file 8: Appendix 9: Table A8. Description of full-text articles which discussed telemedicine or mobile health as a strategy to improve PHC service delivery in rural communities.

Additional file 9: Appendix 8: Table A7. Description of full-text articles which discussed community health funding schemes as a strategy to improve PHC service delivery in rural communities.

Additional file 10: Appendix 10: Table A9. Description of full-text articles which discussed promoting the role of working with traditional healers as a strategy to improve PHC service delivery in rural communities.

Additional file 11: Appendix 11: Table A10. Description of full-text articles which discussed working with non-profit private sectors and non-governmental organizations as a strategy to improve PHC service delivery in rural communities.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Gizaw, Z., Astale, T. & Kassie, G.M. What improves access to primary healthcare services in rural communities? A systematic review. BMC Prim. Care 23, 313 (2022). https://doi.org/10.1186/s12875-022-01919-0


Received: 09 August 2022

Accepted: 18 November 2022

Published: 06 December 2022

DOI: https://doi.org/10.1186/s12875-022-01919-0


  • Access to PHC services
  • Rural communities
  • Key strategies to improve access to PHC services

BMC Primary Care

ISSN: 2731-4553


Original research article

Impact of industrial policy on urban green innovation: empirical evidence of China’s national high-tech zones based on double machine learning

www.frontiersin.org

  • College of Economics and Management, Taiyuan University of Technology, Taiyuan, China

Effective industrial policies need to be implemented, particularly aligning with environmental protection goals to drive the high-quality growth of China’s economy in the new era. Setting up national high-tech zones falls under the purview of both regional and industrial policies. Using panel data from 163 prefecture-level cities in China from 2007 to 2019, this paper empirically analyzes the impact of national high-tech zones on the level of urban green innovation and its underlying mechanisms. It utilizes the national high-tech zones as a quasi-natural experiment and employs a double machine learning model. The study findings reveal that the policy for national high-tech zones greatly enhances urban green innovation. This conclusion remains consistent even after adjusting the measurement method, empirical samples, and controlling for other policy interferences. The findings from the heterogeneity analysis reveal that the impact of the national high-tech zone policy on green innovation exhibits significant regional heterogeneity, with a particularly significant effect in the central and western regions. Among cities, there is a notable push for green innovation levels in second-tier, third-tier, and fourth-tier cities. The moderating effect results indicate that, at the current stage of development, transportation infrastructure primarily exerts a negative moderating effect on how the national high-tech zone policy impacts the level of urban green innovation. This research provides robust empirical evidence for informing the optimization of the industrial policy of China and the establishment of a future ecological civilization system.

1 Introduction

The Chinese economy currently focuses on high-quality development rather than rapid growth. Its traditional demographic and resource advantages are gradually diminishing, making the earlier extensive development model, which relied on excessive resource input and consumption, unsustainable. Simultaneously, resource depletion, environmental pollution, and carbon emissions are growing more severe (Wang F. et al., 2022). Consequently, pursuing a mutually beneficial equilibrium between the economy and the environment has emerged as a critical concern in China’s economic growth. Green innovation, the integration of innovation with sustainable development ideas, is progressively gaining significance as China reshapes its economic development strategy and addresses resource and environmental constraints. Given the objectives outlined in the “3060 Plan” for carbon peaking and carbon neutrality, a green and innovative development trajectory that pairs heightened innovation with environmental preservation has become pivotal to China’s contemporary economic progress.

Industrial policy is a pivotal form of government intervention in market-driven resource allocation and a means of correcting structural disparities. The government orchestrates it to bolster industrial expansion and operational effectiveness. In contrast to Western industrial policies, those in China are predominantly crafted within the administrative framework and promulgated through administrative regulations. Over an extended period, numerous industrial policies have been devised in response to regional disparities in industrial development. These policies aim to identify new growth opportunities in diverse regions, focusing on optimizing and upgrading industrial structures. These strategies have been implemented at various administrative levels, from the central government to local authorities (Sun and Sun, 2015). As a distinctive regional economic policy in China, the national high-tech zone represents one of the foremost supportive measures a city can acquire at the national level. Its crucial role involves facilitating the diffusion and advancement of regional economic growth. Over more than three decades, it has evolved into the primary platform through which China executes its strategy of concentrating on high-tech industries and fostering innovation-driven development. Concurrently, the national high-tech zone, operating as a geographically focused policy customized for a specific region (Cao, 2019), enhances the precision of policy support for the industries under its purview, covering a more limited range of municipalities, counties, and regions.
In contrast to conventional regional industrial policies, the industry-focused policy within national high-tech zones prioritizes comprehensive resource allocation advice and economic foundations to maximize synergy and promote the long-term sustainable growth of the regional economy. This represents a significant paradigm shift in location-based policies within the framework of the new development concept. Its inception embodies a combination of central authorization, high-level strategic planning, local grassroots decision-making, and innovative system development. In recent years, driven by the dual-carbon objective, national high-tech zones have proactively promoted environmentally friendly innovation. Nevertheless, given the proliferation of new industrial policies and the escalating complexity of the policy framework, has the setting up of national high-tech zones genuinely raised the level of urban green innovation relative to conventional regional industrial policies? What are the underlying mechanisms? Moreover, given the variations among cities, have the industrial policy tools within the national high-tech zones been employed judiciously and adaptably? What are the concrete practical outcomes? Investigating these matters has become a significant task for government, industry, and academia.

2 Literature review and research hypothesis

2.1 Literature review

From an industrial policy perspective, the setting up of national high-tech zones embodies the intersection of regional and industrial policies. Domestic and international academic research on national high-tech zones primarily centers on economic activities and innovation. The economic impact of national high-tech zones encompasses a wide range of factors, including their influence on total factor productivity (Tan and Zhang, 2018; Wang and Liu, 2023), foreign trade (Alder et al., 2016), industrial structure upgrades (Yuan and Zhu, 2018), and economic growth (Liu and Zhao, 2015; Huang and Fernández-Maldonado, 2016; Wang Z. et al., 2022). Regarding innovation, numerous researchers have confirmed the positive effects of national high-tech zones on company innovation (Vásquez-Urriago et al., 2014; Díez-Vial and Fernández-Olmos, 2017; Wang and Xu, 2020); nevertheless, a few scholars have disagreed (Hong et al., 2016; Sosnovskikh, 2017). In general, the consensus among scholars is that setting up national high-tech zones significantly fosters regional innovation. This consensus is supported by evidence on various aspects of innovation, including innovation efficiency (Park and Lee, 2004; Chandrashekar and Bala Subrahmanya, 2017), the agglomeration effect (De Beule and Van Beveren, 2012), and innovation capability (Yang and Guo, 2020). The existing literature predominantly examines the relationship between the setting up of national high-tech zones, innovation, and economic outcomes. However, the rise of the digital economy, notably industrial digitization, has accentuated the limitations of the traditional innovation paradigm. These shortcomings, such as inadequate attention to the social importance and sustainability of innovation, have become apparent in recent years.
As the primary driver of sustainable development, green innovation represents a potent avenue for achieving both economic benefits and environmental value (Weber et al., 2014). It differs from other forms of innovation in its potential to facilitate the transformation of development modes, reshape economic structures, and address pollution prevention and control challenges. In the context of green innovation, using a difference-in-differences (double-difference) approach, Wang et al. (2020) pointed out that national high-tech zones enhance the effectiveness of urban green innovation, but that this effect is significant only in the eastern region.

Furthermore, scholars have also explored the mechanisms underlying the innovation effects of national high-tech zones. For example, Cattapan et al. (2012) focused on science parks in Italy and found that technology transfer services positively influence product innovation. Albahari et al. (2017) confirmed that the involvement of higher education institutions in advancing corporate innovation within technology and science parks has a beneficial moderating effect. Taking the moderating effect of spatial agglomeration as a basis, Li WH. et al. (2022) found that industrial agglomeration has a significantly unfavorable moderating influence on the effectiveness of performance transformation in national high-tech zones. Multiple studies have examined the regulatory framework of the national high-tech zone industrial policy and urban innovation. However, in the age of rapidly expanding new infrastructure, infrastructure construction is concentrated on information technologies such as blockchain, big data, cloud computing, artificial intelligence, and the Internet. Further research is needed to explore whether traditional infrastructure, particularly transportation infrastructure, can promote urban green innovation. Transportation infrastructure has consistently been vital in fostering economic expansion, integrating regional resources, and facilitating coordinated development (Behrens et al., 2007; Zhang et al., 2018; Pokharel et al., 2021). Therefore, it is necessary to investigate whether transportation infrastructure can continue to encourage urban green innovation in the digital economy.

In summary, the existing literature has extensively examined the influence of national high-tech zones on economic growth and innovation from various levels and perspectives, establishing a solid foundation and offering valuable insights for this study. Nonetheless, previous studies have frequently overlooked the impact of national high-tech zones on urban green innovation, a gap this paper aims to address. Further exploration is needed to understand how the industrial policy framework links national high-tech zones to urban green innovation, and the research model and methodology also need refinement. On this basis, this paper discusses the industrial policy effects of national high-tech zones from the perspective of urban green innovation to enrich and expand the existing research.

In contrast to earlier research, the marginal contribution of this paper lies in three dimensions: 1) Most scholars have focused on the effects of national high-tech zones on economic activity and innovation, with less emphasis on green innovation and few studies that take the level of green innovation as their perspective. This work extends the existing research on national high-tech zones as an industrial policy. 2) Regarding methodology, the Double Machine Learning (DML) approach is used to evaluate the policy effects of national high-tech zones, leveraging the advantages of machine learning algorithms for high-dimensional and non-parametric prediction. This approach circumvents the problems of model specification bias and the “curse of dimensionality” encountered in traditional econometric models (Chernozhukov et al., 2018), enhancing the credibility of the research findings. 3) By introducing transportation infrastructure as a moderator variable, this study investigates the mechanism through which national high-tech zones affect urban green innovation, offering suggestions for maximizing the policy impact of these zones.
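To make the DML idea concrete, the following is a minimal sketch of the cross-fitted "partialling-out" estimator for a partially linear model, the simplest setting discussed by Chernozhukov et al. (2018). It is an illustration under stated assumptions, not the specification used in this paper: the data are simulated, there is a single control variable, the nuisance learner is a plain k-nearest-neighbour average, and the names `knn_predict`, `dml_partially_linear`, and `simulate` are hypothetical.

```python
import math
import random


def knn_predict(train_x, train_y, query_x, k=15):
    """Predict E[y | x] at each query point as the mean of the k nearest training outcomes."""
    preds = []
    for q in query_x:
        nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - q))[:k]
        preds.append(sum(train_y[i] for i in nearest) / len(nearest))
    return preds


def dml_partially_linear(y, d, x, n_folds=3, k=15, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(x) + noise."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]

    y_res = [0.0] * n
    d_res = [0.0] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        train_x = [x[i] for i in train]
        query_x = [x[i] for i in held_out]
        # Nuisance functions are fitted on the other folds only (cross-fitting).
        y_hat = knn_predict(train_x, [y[i] for i in train], query_x, k)
        d_hat = knn_predict(train_x, [d[i] for i in train], query_x, k)
        for j, i in enumerate(held_out):
            y_res[i] = y[i] - y_hat[j]  # outcome with the effect of x removed
            d_res[i] = d[i] - d_hat[j]  # treatment with the effect of x removed

    # Regress the outcome residuals on the treatment residuals (no intercept).
    numerator = sum(dr * yr for dr, yr in zip(d_res, y_res))
    denominator = sum(dr * dr for dr in d_res)
    return numerator / denominator


def simulate(n=600, theta=2.0, seed=42):
    """Simulated data (an assumption): binary policy dummy d, one confounder x."""
    rng = random.Random(seed)
    x = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    d = [1.0 if rng.random() < 1.0 / (1.0 + math.exp(-xi)) else 0.0 for xi in x]
    y = [theta * di + math.sin(2.0 * xi) + xi * xi + rng.gauss(0.0, 0.5)
         for di, xi in zip(d, x)]
    return y, d, x


if __name__ == "__main__":
    y, d, x = simulate()
    theta_hat = dml_partially_linear(y, d, x)
    print(f"DML estimate of theta: {theta_hat:.3f} (true value 2.0)")
```

In this sketch the policy effect is recovered without specifying the functional form of the confounding term, which is the property the paragraph above attributes to DML. In applied work the nearest-neighbour learner would typically be replaced by a more flexible algorithm suited to many control variables.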

2.2 Theoretical analysis and hypotheses

2.2.1 National high-tech zones’ industrial policies and urban green innovation

As one of the vehicles for implementing industrial policy at the national level, national high-tech zones serve as effective driving forces for enhancing China’s regional innovation capacity and its contribution to economic growth ( Xu et al., 2022 ). Green innovation is a novel form of innovation activity that balances the competing goals of environmental preservation and technological advancement, supporting high-quality economic growth by alleviating the strain on resources and the environment ( Li, 2015 ). National high-tech zones affect urban green innovation through three main channels. Firstly, there is an innovation compensation effect. National high-tech zones, established according to the government’s strategic planning, receive special treatment in areas such as land, taxation, financing, and credit, and serve as pioneering special zones and experimental fields for promoting high-quality regional development. When the government offers R&D subsidies to enterprises engaged in green innovation activities within the zones, enterprises are inclined to respond positively to this policy support and raise their level of green innovation as a means of seeking external legitimacy ( Fang et al., 2021 ), thereby contributing to the advancement of urban green innovation. Secondly, there is an industrial restructuring effect. Strict regulation of businesses with high emissions, high energy consumption, and high pollution is another aspect of the national high-tech zone program. Consequently, such businesses must optimize their industrial structure to access the benefits available within the park, resulting in the gradual transformation and upgrading of high-energy-consumption industries towards green practices and further contributing to regional green innovation.
Based on Porter’s hypothesis, the green and low-carbon requirements of the park policy increase the production costs for polluting industries, prompting polluting enterprises to upgrade their existing technology and adopt green innovation practices. Lastly, based on the theory of industrial agglomeration, the national high-tech zones’ industrial policy facilitates the concentration of innovative talents to a certain extent, resulting in intensified competition in the green innovation market. Increased competition fosters the sharing of knowledge, technology, and talent, stimulating a market environment where the survival of the fittest prevails ( Melitz and Ottaviano, 2008 ). These increase the effectiveness of urban green innovation, helping to propel urban green innovation forward. Furthermore, the infrastructure development within the national high-tech zones establishes a favorable physical environment for enterprises to engage in creative endeavors. Also, it enables the influx of high-quality innovation capital from foreign sources, complementing the inherent characteristics of national high-tech zones that attract such capital and concentrate green innovation resources, ultimately resulting in both environmental and economic benefits. Based on the above analysis, Hypothesis 1 is proposed:

Hypothesis 1. Implementing industrial policies in national high-tech zones enhances levels of urban green innovation.

2.2.2 Heterogeneity analysis

Given the variations in economic foundations, industrial structures, and population distributions across regions, regional development strategies also differ ( Chen and Zheng, 2008 ). Theoretically, whether regions are delineated by administrative boundaries or by geographic location, the impact of national high-tech zone industrial policy on urban green innovation should be realized through strategies that align with each region’s existing industrial structure. Compared to the central and western regions, the eastern region exhibits greater innovation capacity and dynamism owing to its developed economy, good infrastructure, and advanced management concepts and technologies, combined with a relatively high initial endowment of green innovation factors. Given the diminishing marginal effect of green innovation, industrial policy in national high-tech zones acts as “icing on the cake” in the eastern region, but as “charcoal sent in the snow” (that is, timely help) in the central and western regions. In other words, the effect of national high-tech zones on urban green innovation may be weaker in the eastern region than in the central and western regions. The literature confirms that establishing national high-tech zones yields a stronger technology agglomeration effect in the less developed central and western regions ( Liu and Zhao, 2015 ), leading to a more substantial impact on the level of urban green innovation there.

Moreover, local governments consider economic development, industrial structure, and infrastructure levels when establishing national high-tech zones. These factors serve as the foundation for regional classification to address variations in regional quality and to compensate for gaps in theoretical research on the link between national high-tech zone industrial policy implementation and urban green innovation. Consequently, the execution of industrial policies in national high-tech zones relies on other vital factors influencing urban green innovation. Significant variations exist in economic development and infrastructure levels among cities of different grades ( Luo and Wang, 2023 ). Generally, cities with higher rankings exhibit strong economic growth and infrastructure, contrasting those with lower rankings. Consequently, the effect of establishing a national high-tech zone on green innovation may vary across different city grades. Thus, considering the disparities across city rankings, we delve deeper into identifying the underlying reasons for regional diversity in the green innovation outcomes of industrial policies implemented in national high-tech zones based on city grades. Based on the above analysis, Hypothesis 2 is proposed:

Hypothesis 2. There is regional heterogeneity and city-level heterogeneity in the impact of national high-tech zone policies on the level of urban green innovation.

2.2.3 The moderating effect of transportation infrastructure

The implementation of industrial policies and the flow of innovation factors both rely on transport infrastructure as a carrier and linkage. Generally, enhanced transportation infrastructure facilitates the absorption of local factors and improves resource allocation efficiency, thereby influencing the spatial redistribution of production factors such as labor, resources, and technology across cities. Enhanced transportation infrastructure also fosters the development of more robust and advanced innovation networks ( Fritsch and Slavtchev, 2011 ). Banister and Berechman (2001) highlighted that transportation infrastructure exhibits network properties that are fundamental to its agglomeration or diffusion effects. From this perspective, robust infrastructure affects various economic activities, including interregional labor mobility, factor agglomeration, and knowledge exchange among firms, thereby expediting the spillover of green technological innovations ( Yu et al., 2013 ). In turn, this could positively moderate the influence of national high-tech zone policies on green innovation. On the other hand, while transportation infrastructure supports the implementation of national high-tech zone policies, it also brings negative impacts, including high pollution, high emissions, and fragmentation of the ecological landscape. Improving transportation infrastructure can also lead to a “relative congestion effect” in national high-tech zones. This phenomenon, observed in specific regions, refers to the excessive concentration of similar enterprises across different links of the same industrial chain, which intensifies the competition for innovation resources and makes it difficult for enterprises in the region to allocate their limited innovation resources to technological research and development ( Li et al., 2015 ). As a result, the level of green innovation is lowered.
Therefore, at the current stage of development, the role of transportation infrastructure is likely to be complex. When the level of transport infrastructure is moderate, it supports the promotion of urban green innovation through national high-tech zone policies. However, the moderating effect of transport infrastructure may also be negative. Based on the above analysis, Hypothesis 3 is proposed:

Hypothesis 3. Transportation infrastructure moderates the relationship between national high-tech zones and levels of urban green innovation.

3 Research design

3.1 Model setting

This research explores the impact of the industrial policies of national high-tech zones on the level of urban green innovation. Many related studies use traditional causal inference models to assess the impact of such policies. However, these models have several limitations in application. For instance, the commonly used difference-in-differences model imposes stringent requirements on the sample data through its parallel trend test. Although the synthetic control approach can create a virtual control group that satisfies parallel trends, it is limited to addressing the “one-to-many” problem and requires excluding groups with extreme values. In propensity score matching, the selection of matching variables is subjective ( Zhang and Li, 2023 ). To address the limitations of conventional causal inference models, scholars have started to apply machine learning to causal inference ( Chernozhukov et al., 2018 ; Knittel and Stolper, 2021 ). Machine learning algorithms excel at accurate prediction, which supports a less biased assessment of the effect on the target variable.

In contrast to traditional machine learning algorithms, DML, which was formally proposed in 2018 ( Chernozhukov et al., 2018 ), offers a more robust approach to causal inference by mitigating bias through residual modeling. Some scholars already use DML to assess causality in economic phenomena. For instance, Hull and Grodecka-Messi (2022) examined the effects of local taxation, crime, education, and public services on migration using DML in the context of Swedish cities between 2010 and 2016. These existing research findings serve as valuable references for this study. Compared to traditional causal inference models, DML offers distinct advantages in variable selection and model estimation ( Zhang and Li, 2023 ). Moreover, in the context of urban green innovation in China, relationships between variables are likely to be non-linear, so a traditional linear regression model may produce biased estimates, whereas DML can effectively avoid problems such as model specification bias. Based on this, the present study employs a DML model to evaluate the policy effect of establishing national high-tech zones.

3.1.1 Double machine learning framework

Before applying the DML algorithm, this paper follows Chernozhukov et al. (2018) and constructs a partially linear DML model, as depicted in Eq. 1 below:

$$\ln GI_{it} = \theta_0\, Zone_{it} + g(X_{it}) + U_{it}, \qquad E\left[U_{it} \mid Zone_{it}, X_{it}\right] = 0 \tag{1}$$

where $i$ represents the city, $t$ represents the year, and $\ln GI_{it}$ represents the explained variable, which in this paper is the green innovation level of the city. $Zone_{it}$ represents the disposal (treatment) variable, which in this case is the national high-tech zone policy variable. It takes a value of 1 after the implementation of the pilot and 0 otherwise. $\theta_0$ is the disposal (treatment) coefficient that is the focus of this paper. $X_{it}$ represents the set of high-dimensional control variables. Machine learning algorithms are used to estimate the specific form $\hat{g}(X_{it})$ of $g(X_{it})$, whereas $U_{it}$, which has a conditional mean of 0, stands for the error term. $n$ represents the sample size. Direct estimation of Eq. 1 provides the following estimate of the disposal coefficient:

$$\hat{\theta}_0 = \left(\frac{1}{n}\sum_{i \in I,\, t \in T} Zone_{it}^2\right)^{-1} \frac{1}{n}\sum_{i \in I,\, t \in T} Zone_{it}\left(\ln GI_{it} - \hat{g}(X_{it})\right) \tag{2}$$

We can further explore the estimation bias by combining Eqs 1, 2, as depicted in Eq. 3 below:

$$\sqrt{n}\left(\hat{\theta}_0 - \theta_0\right) = a + b \tag{3}$$

where $a = \left(\frac{1}{n}\sum_{i \in I,\, t \in T} Zone_{it}^2\right)^{-1} \frac{1}{\sqrt{n}}\sum_{i \in I,\, t \in T} Zone_{it} U_{it}$ follows a normal distribution with mean 0, and $b = \left(\frac{1}{n}\sum_{i \in I,\, t \in T} Zone_{it}^2\right)^{-1} \frac{1}{\sqrt{n}}\sum_{i \in I,\, t \in T} Zone_{it}\left[g(X_{it}) - \hat{g}(X_{it})\right]$. It is important to note that DML uses machine learning with regularization to estimate the specific functional form $\hat{g}(X_{it})$. Regularization prevents the estimates from having excessive variance, but it inevitably introduces a “regularization bias.” Specifically, $\hat{g}(X_{it})$ converges to $g(X_{it})$ at the rate $n^{-\varphi_g}$, which is slower than $n^{-1/2}$ (that is, $n^{-\varphi_g} > n^{-1/2}$). As $n$ tends to infinity, $b$ therefore also tends to infinity, and $\hat{\theta}_0$ has difficulty converging to $\theta_0$. To expedite convergence and ensure unbiasedness of the disposal coefficient estimates in small samples, an auxiliary regression is constructed as follows:

$$Zone_{it} = m(X_{it}) + V_{it}, \qquad E\left[V_{it} \mid X_{it}\right] = 0 \tag{4}$$

where $m(X_{it})$ represents the regression function of the disposal variable on the high-dimensional control variables; its specific form $\hat{m}(X_{it})$ also has to be estimated using a machine learning algorithm. Additionally, $V_{it}$ represents the error term with a conditional mean of 0.
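The two-regression logic above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses the closely related partialling-out variant, in which both the outcome and the treatment are residualized on the controls with cross-fitted random forests and the outcome residuals are then regressed on the treatment residuals. The function names `dml_plr` and `simulate`, the forest hyperparameters, and the synthetic data are all assumptions made for illustration. Five folds are used on the assumption that a 1:4 split means one part held out for prediction and four parts used for training.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold


def dml_plr(y, d, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + U."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    y_res = np.empty_like(y)
    d_res = np.empty_like(d)
    folds = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    for train, test in folds.split(X):
        # Fit both nuisance functions on the training folds only, then
        # residualize the held-out fold (cross-fitting).
        l_hat = RandomForestRegressor(
            n_estimators=100, min_samples_leaf=20, random_state=seed
        )
        m_hat = RandomForestRegressor(
            n_estimators=100, min_samples_leaf=20, random_state=seed
        )
        y_res[test] = y[test] - l_hat.fit(X[train], y[train]).predict(X[test])
        d_res[test] = d[test] - m_hat.fit(X[train], d[train]).predict(X[test])
    # Regress outcome residuals on treatment residuals.
    theta = np.sum(d_res * y_res) / np.sum(d_res ** 2)
    # Heteroskedasticity-robust standard error from the orthogonal score.
    psi = d_res * (y_res - theta * d_res)
    se = np.sqrt(np.mean(psi ** 2) / len(y)) / np.mean(d_res ** 2)
    return theta, se


def simulate(n=2000, theta=0.3, seed=0):
    """Synthetic data with one confounder; purely illustrative, not the paper's data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    # The first control drives both treatment assignment and the outcome.
    d = (0.5 * X[:, 0] + rng.normal(size=n) > 0).astype(float)
    y = theta * d + 0.5 * X[:, 0] + np.cos(X[:, 1]) + rng.normal(scale=0.5, size=n)
    return y, d, X


y, d, X = simulate()
theta_hat, se_hat = dml_plr(y, d, X)
print(f"theta_hat = {theta_hat:.3f} (se = {se_hat:.3f})")
```

Because each observation's residuals come from models that never saw that observation, overfitting in the forests does not feed directly into the final coefficient, which is the role the auxiliary regression plays in the text.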

3.1.2 The test of the moderating effect within the DML framework

This study investigates how the national high-tech zone industrial policy influences the urban green innovation. It incorporates moderating variables within the DML framework, drawing on the testing procedure outlined by Jiang (2022) , and integrates it with the practice of He et al. (2022) , as outlined below:

Equation 5 is based on Eq. 1 with the addition of the variables $\ln tra_{it}$ and $Zone_{it} \times \ln tra_{it}$, where $\ln tra_{it}$ represents the moderating variable, which in this paper is transportation infrastructure, and $Zone_{it} \times \ln tra_{it}$ represents the interaction term of the moderating variable and the disposal variable. The variables $\ln tra_{it}$ and $Zone_{it}$ are added to the high-dimensional control variables $X_{it}$, and the rest of the variables in Eq. 5 are identical to those in Eq. 1. $\theta_1$ represents the disposal coefficient of interest.
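One specification consistent with this description, written out here as an assumption rather than as the authors' exact Eqs 5, 6, treats the interaction term as the variable of interest and absorbs $Zone_{it}$ and $\ln tra_{it}$ into the controls $X_{it}$:

$$\ln GI_{it} = \theta_1 \left(Zone_{it} \times \ln tra_{it}\right) + g(X_{it}) + U_{it}, \qquad E\left[U_{it} \mid Zone_{it} \times \ln tra_{it}, X_{it}\right] = 0$$

$$Zone_{it} \times \ln tra_{it} = m(X_{it}) + V_{it}, \qquad E\left[V_{it} \mid X_{it}\right] = 0$$

Under this reading, a positive $\hat{\theta}_1$ would indicate that transportation infrastructure strengthens the policy effect, and a negative $\hat{\theta}_1$ that it weakens it.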

3.2 Variable selection

3.2.1 Dependent variable: level of urban green innovation (lngi)

Many scholars use indicators such as the number of patent applications or patent authorizations to assess the degree of urban innovation. More precisely, the number of patent applications measures technological innovation effort, whereas authorized patents have undergone strict examination and reflect the achievements and capacity of scientific and technological innovation more directly. Thus, following Zhou and Shen (2020) and Li X. et al. (2022), this paper uses the count of authorized green invention patents in each prefecture-level city to indicate the level of green innovation. For the empirical study, the natural logarithm of the count of authorized green patents plus 1 is taken.

3.2.2 Disposal variable: dummy variables for national high-tech zones (Zone)

The value of the national high-tech zone dummy variable is determined by the city in question and by the list of national high-tech zones released by China’s Ministry of Science and Technology. If a national high-tech zone was established in the city by 2017, the value is set to 1 for the year in which the zone was established and for all subsequent years. Otherwise, it is set to 0.
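The coding rule for the dummy can be sketched in a few lines of pandas. The city labels, establishment years, and the helper name `add_zone_dummy` below are hypothetical and serve only to show the staggered switch from 0 to 1.

```python
import pandas as pd

# Hypothetical city-year panel covering the sample period 2007 to 2019.
panel = pd.DataFrame(
    [(city, year) for city in ["A", "B", "C"] for year in range(2007, 2020)],
    columns=["city", "year"],
)
# Hypothetical establishment years; city C never receives a zone (control group).
established = {"A": 2010, "B": 2015}


def add_zone_dummy(panel, established):
    """Set zone = 1 from the establishment year onward, 0 otherwise."""
    out = panel.copy()
    start = out["city"].map(established)  # NaN for cities without a zone
    # Comparisons with NaN are False, so never-treated cities stay at 0.
    out["zone"] = (out["year"] >= start).astype(int)
    return out


print(add_zone_dummy(panel, established).groupby("city")["zone"].sum())
```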

3.2.3 Moderating variable: transportation infrastructure (lntra)

Previous studies have shown that highway freight accounts for 75% of China’s total freight transport ( Li and Tang, 2015 ). Highway transportation infrastructure has a significant influence on the evolution of the Chinese economy, and its development and improvement are crucial for modern transportation. Following Wu (2019), this paper uses the ratio of roadway mileage (measured in kilometers) to population as a measure of the quality of the transportation system.

3.2.4 Control variables

(1) Foreign direct investment (lnfdi): There is general agreement among academics that foreign direct investment (FDI) significantly influences urban green innovation, as FDI provides expertise in management, human resources, and cutting-edge industrial technology ( Luo et al., 2021 ). Thus, it is necessary to control for the level of FDI. This paper measures it as the ratio of foreign investment to local GDP (in million yuan).

(2) Financial development level (lnfd): Innovation in science and technology is greatly aided by finance. For the green innovation-driven strategy to advance, it is imperative that funding for science and technology innovation be strengthened. The amount of capital raised for innovation is strongly impacted by the state of urban financial development ( Zhou and Du, 2021 ). Thus, this paper uses the loan balance to GDP ratio as an indicator.

(3) Human capital (lnhum): Highly skilled human capital is essential for cities to drive green innovation. Generally, highly qualified human capital significantly boosts green innovation ( Ansaris et al., 2016 ). Therefore, human capital is measured by the proportion of people in the city who have completed a bachelor’s degree or above.

(4) Industrial structure (lnind): Generally, the secondary industry in China is the primary source of pollution, and there is a significant impact of industrial structure on green innovation ( Qiu et al., 2023 ). The metric used in this paper is the secondary industry-to-GDP ratio for the area.

(5) Regional economic development level (lnagdp): A region’s level of economic growth is indicative of the material foundation for urban green innovation and influences the growth of green innovation in the region ( Bo et al., 2020 ). This research uses the annual gross domestic product per capita as a measurement.

3.3 Data source

By 2017, China had established 157 national high-tech zones in total. In line with the study’s objectives, the sample is adjusted and screened as follows. The study’s sample period spans from 2007 to 2019. To lessen the impact on the test results of cities whose high-tech zones were founded before 2007, 57 national high-tech zones that were created prior to 2000 are omitted. Because high-tech zones in county-level cities have a limited capacity to promote urban green innovation, 8 high-tech zones located in county-level cities are excluded, as are 4 high-tech zones with severely missing data. The remaining 88 high-tech zones are distributed across 83 prefecture-level cities, because some cities host more than one zone. As a result, 83 cities are selected as the experimental group for this study. Additionally, a control group of 80 cities was selected from among those that did not have high-tech zones by the end of 2019, resulting in a final sample size of 163 cities. This paper collects green patent data for each city from the China Green Patent Statistical Report published by the State Intellectual Property Office. The author compiled the list of national high-tech zones and their years of establishment from the official government website. The remaining data primarily originate from the China Urban Statistical Yearbook (2007–2019), the EPS database, and the official websites of the respective cities’ Bureaus of Statistics. Missing values were filled by linear interpolation. To mitigate heteroskedasticity in the model, all variables except the disposal variable are log-transformed. Table 1 shows the descriptive analysis of the variables.
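The last two preparation steps, linear interpolation of missing values and the logarithmic transformation, can be sketched as follows. The column names (`fdi`, `green_patents`), the toy values, and the helper `prepare_panel` are assumptions for illustration; interpolating within each city separately is also an assumption, since the text does not say how the interpolation was grouped.

```python
import numpy as np
import pandas as pd


def prepare_panel(panel, ratio_cols, patent_col="green_patents"):
    """Interpolate gaps within each city, then log-transform the variables."""
    out = panel.sort_values(["city", "year"]).reset_index(drop=True)
    # Interpolate city by city so values never leak across cities.
    out[ratio_cols] = out.groupby("city")[ratio_cols].transform(
        lambda s: s.interpolate(method="linear")
    )
    for col in ratio_cols:
        out["ln" + col] = np.log(out[col])
    # Patent counts can be zero, so ln(count + 1) is used for the outcome.
    out["lngi"] = np.log1p(out[patent_col])
    return out


# Toy data: city A has one missing FDI ratio and one year with zero patents.
toy = pd.DataFrame({
    "city": ["A", "A", "A", "B", "B", "B"],
    "year": [2007, 2008, 2009, 2007, 2008, 2009],
    "fdi": [1.0, np.nan, 3.0, 2.0, 2.0, 2.0],
    "green_patents": [0, 9, 99, 4, 5, 6],
})
clean = prepare_panel(toy, ["fdi"])
print(clean[["city", "year", "fdi", "lnfdi", "lngi"]])
```

The screening arithmetic in the text is also easy to verify: 157 − 57 − 8 − 4 = 88 zones remain, and 83 treated cities plus 80 control cities give 163 cities.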

Table 1 . Descriptive analysis.

4 Empirical analysis

4.1 National high-tech zones’ policy effects on urban green innovation

This study uses the DML model to estimate the impact of industrial policies implemented in national high-tech zones on the level of urban green innovation. Following the approach of Zhang and Li (2023), the sample is split in a ratio of 1:4, the random forest algorithm is used for prediction, and Eq. (1) is combined with Eq. (4) for the regression. Table 2 presents the results with and without controlling for time and city effects. The estimated treatment effects in the four columns are 0.376, 0.293, 0.396, and 0.268, respectively, each of which is significant at the 1% level. Thus, Hypothesis 1 is supported.

Table 2 . Benchmark regression results.

4.2 Robustness tests

4.2.1 Eliminating the influence of extreme values

To reduce the impact of extreme values on the estimation outcomes, all variables in the benchmark regression, excluding the disposal variable, are winsorized at the upper and lower 1% quantiles and, separately, at the upper and lower 5% quantiles. Values below the lower quantile or above the upper quantile are replaced by the respective quantile value, and the regressions are then re-estimated. Table 3 demonstrates that treating extreme values in this way does not substantially alter the findings of this study.
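The tail-replacement step described above is commonly called winsorization. A minimal NumPy sketch is given below; the function name `winsorize` and the toy input are illustrative assumptions, and the quantile interpolation rule is NumPy's default rather than anything stated in the text.

```python
import numpy as np


def winsorize(x, lower=0.01, upper=0.99):
    """Replace values beyond the given quantiles with the quantile values."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.quantile(x, [lower, upper])
    return np.clip(x, lo, hi)


values = np.arange(101, dtype=float)      # toy variable: 0, 1, ..., 100
w1 = winsorize(values, 0.01, 0.99)        # 1% tails on each side
w5 = winsorize(values, 0.05, 0.95)        # 5% tails on each side
print(w1.min(), w1.max(), w5.min(), w5.max())
```

Unlike trimming, this keeps every observation in the sample, so the panel stays balanced for the subsequent regressions.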

Table 3 . Extreme values removal results.

4.2.2 Considering province-time interaction fixed effects

Since provinces are critical administrative units in the governance system of the Chinese government, cities within the same province often share similarities in policy environment and location characteristics. Therefore, to account for province-specific changes over time, this study adds province-time interaction fixed effects to the benchmark regression. Table 4 presents the regression results. After accounting for the correlation between city characteristics within the same province, national high-tech zone policies continue to have a significant influence on urban green innovation at the 1% level.
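One simple way to represent province-time interaction fixed effects is a set of dummy variables, one per province-year cell, appended to the control variables. The sketch below shows that construction; the column names, the `py` prefix, and the helper `add_province_year_dummies` are hypothetical, and the text does not state that the authors built the fixed effects this way.

```python
import pandas as pd


def add_province_year_dummies(panel):
    """Append one 0/1 column for every province-year combination."""
    key = panel["province"].astype(str) + "_" + panel["year"].astype(str)
    dummies = pd.get_dummies(key, prefix="py", dtype=int)
    return pd.concat([panel, dummies], axis=1)


toy = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "province": ["P1", "P1", "P2", "P2"],
    "year": [2007, 2008, 2007, 2008],
})
print(add_province_year_dummies(toy).filter(like="py_"))
```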

Table 4 . The addition of province and time fixed effects interaction terms.

4.2.3 Excluding other policy disturbances

The estimated effect of national high-tech zones on urban green innovation is susceptible to the influence of concurrent policies. To ensure an accurate estimate of the policy effect, this study accounts for other comparable policies implemented during the same period. Since 2007, other policies, including the development of “smart cities,” have been successively implemented alongside the national high-tech zone policy. Therefore, this study incorporates a policy dummy variable for “smart cities” in the benchmark regression. The specific regression findings are shown in Table 5. After controlling for the impact of concurrent policies, the policy effect of national high-tech zones remains significant.

Table 5 . Results of removing the impact of parallel policies.

4.2.4 Resetting the DML model

To mitigate the potential bias that the settings of the DML model may introduce into the conclusions, their robustness is assessed in the following ways. First, the sample split ratio of the DML model is adjusted from 1:4 to 1:2 to examine the potential impact of the split ratio on the conclusions. Second, the random forest algorithm used for prediction is replaced in turn with lasso regression, gradient boosting, and neural networks to investigate the potential influence of the prediction algorithm. Third, the benchmark regression is built on a partially linear model, and the choice of model form involves subjective decisions. Therefore, DML is also used to construct a more general interactive model in order to assess the influence of the model setting on the conclusions. The main and auxiliary regressions are modified as follows:

Combining Eqs ( 7 ), ( 8 ) for the regression, the interactive model yielded estimated coefficients for the disposition effect:

The results of Eq. ( 9 ) are shown in column (5) of Table 6 , and all regression results obtained from the modified DML models are presented in Table 6 .
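For reference, the interactive regression model in Chernozhukov et al. (2018) lets the treatment enter the outcome equation non-separably. Written in this paper's notation, and offered as an assumption about the form of Eqs 7 to 9 rather than a verbatim reproduction, it reads:

$$\ln GI_{it} = g(Zone_{it}, X_{it}) + U_{it}, \qquad E\left[U_{it} \mid Zone_{it}, X_{it}\right] = 0$$

$$Zone_{it} = m(X_{it}) + V_{it}, \qquad E\left[V_{it} \mid X_{it}\right] = 0$$

$$\theta_0 = E\left[g(1, X_{it}) - g(0, X_{it})\right]$$

Here $\theta_0$ is the average treatment effect, obtained by comparing the predicted outcome with and without a zone for the same controls, so no linear or additive form is imposed on how the policy enters.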

Table 6 . Results of resetting the DML model.

The findings indicate that the sample split ratio in the DML model, the prediction algorithm used, or the model estimation approach does not impact the conclusion that the national high-tech zone policy raises urban areas’ level of green innovation. These factors only modify the magnitude of the policy effect to some degree.

4.3 Heterogeneity analysis

4.3.1 Regional heterogeneity

The sample cities were further divided into eastern, central, and western regions, following the three major economic subregions, to examine regional variations in the effects of national high-tech zone policies on urban green innovation; the results are presented in Table 7 . National high-tech zone policies do not have a statistically significant effect on urban green innovation in the eastern region, but they have a considerable positive influence in the central and western regions. The lack of statistical significance in the east may arise because establishing national high-tech zones there also creates obstacles to the growth of urban green innovation, such as resource strain and environmental pollution. Given the relatively underdeveloped economies and industrial structures of the central and western regions, and consistent with the preceding theoretical analysis, establishing national high-tech zones acts as a crucial catalyst there, significantly boosting urban green innovation levels. Furthermore, the central government emphasizes that the establishment of national high-tech zones should take regional resource endowments and local conditions into account and implement tailored policies. The central and western regions possess geographic locations and natural conditions that make them well suited to developing solar energy, wind energy, and other forms of green energy. Compared to the central region, the national high-tech zone initiative has a more pronounced impact on promoting urban green innovation in the western region. While the western region’s urban innovation environment still needs further optimization, the policy has a stronger incentive effect in this region owing to its greater development potential, the positive transformation of its industrial structure, and increased policy support from the state, including the development strategy for the western region.

Table 7 . Heterogeneity test results for different regions.

4.3.2 Urban hierarchical heterogeneity

The New Tier 1 Cities Institute’s ‘2020 City Business Charm Ranking’ is the basis for this study, with the sample cities categorized into Tier 1 (New Tier 1), Tier 2, Tier 3, Tier 4, and Tier 5. Table 8 presents the regression findings for each of the groups.

Table 8 . Heterogeneity test results for different classes of cities.

The results in Table 8 reveal significant heterogeneity at the city level in the effects of national high-tech zones on urban green innovation. In particular, the coefficient for first-tier cities is not statistically significant, which may be due to the small sample size. The coefficient for fifth-tier cities is also not statistically significant, which could be attributed to their relatively weak economies and underdeveloped infrastructure. Additionally, because of their limited level of development, fifth-tier cities may have a relatively homogeneous industrial structure, dominated by traditional industries or agriculture and lacking a diversified industrial layout. National high-tech zones have therefore not greatly aided the development of green innovation in these cities. In contrast, national high-tech zone policies in second-tier, third-tier, and fourth-tier cities have a significant positive impact on green innovation. Despite the lower level of economic development in fourth-tier cities compared to second-tier and third-tier cities, national high-tech zones have the most pronounced impact on green innovation in fourth-tier cities. This could be attributed to the ongoing industrial transformation of fourth-tier cities, which are still in the technology diffusion and imitation stage, allowing their national high-tech zones to maintain a high marginal effect. Thus, Hypothesis 2 is supported.

5 Further analysis

According to the empirical findings, establishing national high-tech zones significantly raises the level of urban green innovation. It is therefore essential to understand the mechanisms behind this positive effect. This paper constructs a moderating effect test model using Eqs 5 , 6 and provides a detailed discussion by introducing transportation infrastructure as a moderating variable.

The empirical results for the moderating effect of transportation infrastructure are shown in Table 9 . The interaction term Zone*lntra is significantly negative at the 5% level, suggesting that transportation infrastructure negatively moderates the impact of national high-tech zone policies on the level of urban green innovation. This result deviates from the general expectation, but it is consistent with the complex role that transportation infrastructure plays in modern economic development, as discussed in the theoretical analysis above. One possible explanation is that the green innovation benefits generated by the national high-tech zone policy at the current stage are insufficient to compensate for the adverse effects of excessive resource consumption and environmental pollution caused by the construction of the zones. Furthermore, transportation infrastructure can lead to an excessive concentration of similar enterprises in the high-tech zones. This concentration creates a relative crowding effect and intensifies competition among enterprises. It diminishes their inclination to collaborate on and invest in green innovation and hinders their technological research and development activities. Moreover, the excessive clustering of similar enterprises implies a lack of diversity in green innovation activities among businesses located in national high-tech zones. This results in duplicated green innovation outputs and hinders the advancement of green innovation. Thus, Hypothesis 3 is supported.

Table 9 . Empirical results of moderating effects.

6 Conclusion and policy recommendations

6.1 Conclusion

Based on panel data from 163 prefecture-level cities in China from 2007 to 2019, this paper analyzes the net effect of establishing national high-tech zones on urban green innovation using the double machine learning model. The results are as follows. Firstly, the national high-tech zone policy significantly raises the level of local green innovation, and this result remains robust after accounting for various factors that could affect the estimates. Secondly, the national high-tech zone policy has a positive impact on the level of urban green innovation in the central and western regions, whereas its impact in the eastern region is not statistically significant. The initiative has a stronger impact on the level of urban green innovation in the western region than in the central region. Across city levels, the high-tech zone policy has a more substantial impact on the level of green innovation in fourth-tier cities than in second-tier and third-tier cities. Thirdly, the moderating effect test shows that the construction of transportation infrastructure weakens the promotional effect of national high-tech zones on urban green innovation.

6.2 Policy recommendations

In order that national high-tech zones can better promote China’s high-quality development, this paper proposes the following policy recommendations:

(1) Urban green innovation in China depends on accelerating the setting up of national high-tech zones and creating an atmosphere that supports innovation. Establishing national high-tech zones as testbeds for high-quality development and green innovation has significantly elevated urban green innovation. Thus, cities can efficiently foster urban green innovation by supporting the development of national high-tech zones. Cities that have already established national high-tech zones should further encourage enterprises within these zones to increase their investment in research and development. They should also proceed to foster the leadership of national high-tech zones for urban green innovation, assuming the role of pilot cities as models and leaders. Additionally, it is essential to establish mechanisms for cooperation and synergy between the pilot cities and their neighboring cities to promote collective green development in the region.

(2) Expanding the pilot program and implementing tailored policies based on local conditions are essential. Industrial policies on national high-tech zones have differing effects on urban green innovation. Regions should leverage their comparative advantages, consider both the common and the unique aspects of urban development, and foster a stable and sustainable green innovation ecosystem. The western and central regions should prioritize constructing and enhancing new infrastructure and bolster support for the high-tech green industry. The western region should seize the opportunity presented by national policies that prioritize support, quicken the rate of environmental innovation, and progressively bridge the gap with the eastern and central regions in various aspects. Furthermore, second-tier, third-tier, and fourth-tier cities should build on the advantages of national high-tech zone policies to keep green innovation at a high level. Regions facing challenges in green innovation, particularly fifth-tier cities, should learn from the development experiences of advanced regions with national high-tech zones to compensate for their deficiencies in green innovation.

(3) Highlighting the importance of transportation regulation and enhancing collaboration in green innovation is crucial. Firstly, transportation infrastructure should be maximized to strengthen coordination and cooperation among regions, facilitate the smooth movement of innovative talents across regions, and facilitate the rational sharing of innovative resources, collectively enhancing green innovation. Additionally, attention ought to be given to the industrial clustering effect of parks to prevent the wastage of resources and inefficiencies resulting from the excessive clustering of similar industries. Efforts should be focused on effectively harnessing the latent potential of crucial transportation infrastructure areas as long-term drivers of development, promptly mitigating the negative impact of transportation infrastructure construction, and gradually achieving the synergistic promotion of the setting up of national high-tech zones and the raising of urban levels of green innovation, among other overarching objectives.

6.3 Limitations and future research

Our study has some limitations because the research is conducted in the institutional context of China. For example, not all countries are suited to implementing similar industrial policies that develop the economy while protecting the environment. Nevertheless, we believe the study is relevant, as it encourages closer attention to environmental protection from an industrial policy perspective. Moreover, the research process itself has certain limitations. Firstly, the urban green innovation index was constructed from the quantity of green patent authorizations. Future studies could examine other dimensions of green innovation, such as the quality of green patents granted. Secondly, the paper employs machine learning techniques for causal inference. Subsequent investigations could delve further into the potential applications of machine learning algorithms in environmental sciences to maximize the benefits of innovative research methodologies.

Data availability statement

The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

Author contributions

WC: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing–review and editing. YJ: Conceptualization, Data curation, Formal Analysis, Investigation, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing–original draft, Writing–review and editing. BT: Investigation, Project administration, Writing–review and editing.

Funding

The authors declare that financial support was received for the research, authorship, and/or publication of this article. This research was supported by the Youth Fund for Humanities and Social Science research of Ministry of Education (20YJC790004).

Acknowledgments

The authors are grateful to the editors and the reviewers for their insightful comments.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Albahari, A., Pérez-Canto, S., Barge-Gil, A., and Modrego, A. (2017). Technology parks versus science parks: does the university make the difference? Technol. Forecast. Soc. Change 116, 13–28. doi:10.1016/j.techfore.2016.11.012

CrossRef Full Text | Google Scholar

Alder, S., Shao, L., and Zilibotti, F. (2016). Economic reforms and industrial policy in a panel of Chinese cities. J. Econ. Growth 21, 305–349. doi:10.1007/s10887-016-9131-x

Ansaris, M., Ashrafi, S., and Jebellie, H. (2016). The impact of human capital on green innovation. Industrial Manag. J. 8 (2), 141–162. doi:10.22059/imj.2016.60653

Banister, D., and Berechman, Y. (2001). Transport investment and the promotion of economic growth. J. Transp. Geogr. 9 (3), 209–218. doi:10.1016/s0966-6923(01)00013-8

Behrens, K., Lamorgese, A. R., Ottaviano, G. I., and Tabuchi, T. (2007). Changes in transport and non-transport costs: local vs global impacts in a spatial network. Regional Sci. Urban Econ. 37 (6), 625–648. doi:10.1016/j.regsciurbeco.2007.08.003

Bo, W., Yongzhong, Z., Lingshan, C., and Xing, Y. (2020). Urban green innovation level and decomposition of its determinants in China. Sci. Res. Manag. 41 (8), 123. doi:10.19571/j.cnki.1000-2995.2020.08.013

Cao, Q. F. (2019). The latest researches on place based policy and its implications for the construction of xiong’an national new district. Sci. Technol. Prog. Policy 36 (2), 36–43. (in Chinese).

Google Scholar

Cattapan, P., Passarelli, M., and Petrone, M. (2012). Brokerage and SME innovation: an analysis of the technology transfer service at area science park, Italy. Industry High. Educ. 26 (5), 381–391. doi:10.5367/ihe.2012.0119

Chandrashekar, D., and Bala Subrahmanya, M. H. (2017). Absorptive capacity as a determinant of innovation in SMEs: a study of Bengaluru high-tech manufacturing cluster. Small Enterp. Res. 24 (3), 290–315. doi:10.1080/13215906.2017.1396491

Chen, M., and Zheng, Y. (2008). China's regional disparity and its policy responses. China & World Econ. 16 (4), 16–32. doi:10.1111/j.1749-124x.2008.00119.x

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., et al. (2018). Double/debiased machine learning for treatment and structural parameters. Econ. J. 21 (1), C1–C68. doi:10.1111/ectj.12097

De Beule, F., and Van Beveren, I. (2012). Does firm agglomeration drive product innovation and renewal? An application for Belgium. Tijdschr. Econ. Soc. Geogr. 103 (4), 457–472. doi:10.1111/j.1467-9663.2012.00715.x

Díez-Vial, I., and Fernández-Olmos, M. (2017). The effect of science and technology parks on firms’ performance: how can firms benefit most under economic downturns? Technol. Analysis Strategic Manag. 29 (10), 1153–1166. doi:10.1080/09537325.2016.1274390

Fang, Z., Kong, X., Sensoy, A., Cui, X., and Cheng, F. (2021). Government’s awareness of environmental protection and corporate green innovation: a natural experiment from the new environmental protection law in China. Econ. Analysis Policy 70, 294–312. doi:10.1016/j.eap.2021.03.003

Fritsch, M., and Slavtchev, V. (2011). Determinants of the efficiency of regional innovation systems. Reg. Stud. 45 (7), 905–918. doi:10.1080/00343400802251494

He, J. A., Peng, F. P., and Xie, X. Y. (2022). Mixed-ownership reform, political connection and enterprise innovation: based on the double/unbiased machine learning method. Sci. Technol. Manag. Res. 42 (11), 116–126. (in Chinese).

Hong, J., Feng, B., Wu, Y., and Wang, L. (2016). Do government grants promote innovation efficiency in China's high-tech industries? Technovation 57, 4–13. doi:10.1016/j.technovation.2016.06.001

Huang, W. J., and Fernández-Maldonado, A. M. (2016). High-tech development and spatial planning: comparing The Netherlands and Taiwan from an institutional perspective. Eur. Plan. Stud. 24 (9), 1662–1683. doi:10.1080/09654313.2016.1187717

Hull, I., and Grodecka-Messi, A. (2022). Measuring the impact of taxes and public services on property values: a double machine learning approach . arXiv preprint arXiv:2203.14751.

Jiang, T. (2022). Mediating effects and moderating effects in causal inference. China Ind. Econ. 5, 100–120. doi:10.19581/j.cnki.ciejournal.2022.05.005

Knittel, C. R., and Stolper, S. (2021). Machine learning about treatment effect heterogeneity: the case of household energy use. Nashv. TN 37203, 440–444. doi:10.1257/pandp.20211090

Li, H., and Tang, L. (2015). Transportation infrastructure investment, spatial spillover effect and enterprise inventory. Manag. World 4, 126–136. doi:10.19744/j.cnki.11-1235/f.2015.04.012

Li, W. H., Liu, F., and Liu, T. S. (2022a). Can national high-tech zones improve the urban innovation efficiency? an empirical test based on the effect of spatial agglomeration regulation. Manag. Rev. 34 (5), 93. doi:10.14120/j.cnki.cn11-5057/f.2022.05.007

Li, X. (2015). Analysis and outlook of the related researches on green innovation. R&D Manag. 27 (2), 1–11. doi:10.13581/j.cnki.rdm.2015.02.001

Li, X., Shao, X., Chang, T., and Albu, L. L. (2022b). Does digital finance promote the green innovation of China's listed companies? Energy Econ. 114, 106254. doi:10.1016/j.eneco.2022.106254

Li, X. P., Li, P., Lu, D. G., and Jiang, F. T. (2015). Economic agglomeration, selection effects and firm productivity. J. Manag. World 4, 25–37+51. (in Chinese). doi:10.19744/j.cnki.11-1235/f.2015.04.004

Liu, R. M., and Zhao, R. J. (2015). Does the national high-tech zone promote regional economic development? A verification based on differences-in-differences method. J. Manag. World 8, 30–38. doi:10.19744/j.cnki.11-1235/f.2015.08.005

Luo, R., and Wang, Q. M. (2023). Does the construction of national demonstration logistics park produce economic growth effect? Econ. Surv. 40 (1), 47–56. doi:10.15931/j.cnki.1006-1096.2023.01.015

Luo, Y., Salman, M., and Lu, Z. (2021). Heterogeneous impacts of environmental regulations and foreign direct investment on green innovation across different regions in China. Sci. total Environ. 759, 143744. doi:10.1016/j.scitotenv.2020.143744

PubMed Abstract | CrossRef Full Text | Google Scholar

Melitz, M. J., and Ottaviano, G. I. (2008). Market size, trade, and productivity. Rev. Econ. Stud. 75 (1), 295–316. doi:10.1111/j.1467-937x.2007.00463.x

Park, S. C., and Lee, S. K. (2004). The regional innovation system in Sweden: a study of regional clusters for the development of high technology. Ai Soc. 18 (3), 276–292. doi:10.1007/s00146-003-0277-7

Pokharel, R., Bertolini, L., Te Brömmelstroet, M., and Acharya, S. R. (2021). Spatio-temporal evolution of cities and regional economic development in Nepal: does transport infrastructure matter? J. Transp. Geogr. 90, 102904. doi:10.1016/j.jtrangeo.2020.102904

Qiu, Y., Wang, H., and Wu, J. (2023). Impact of industrial structure upgrading on green innovation: evidence from Chinese cities. Environ. Sci. Pollut. Res. 30 (2), 3887–3900. doi:10.1007/s11356-022-22162-1

Sosnovskikh, S. (2017). Industrial clusters in Russia: the development of special economic zones and industrial parks. Russ. J. Econ. 3 (2), 174–199. doi:10.1016/j.ruje.2017.06.004

Sun, Z., and Sun, J. C. (2015). The effect of Chinese industrial policy: industrial upgrading or short-term economic growth. China Ind. Econ. 7, 52–67. (in Chinese). doi:10.19581/j.cnki.ciejournal.2015.07.004

Tan, J., and Zhang, J. (2018). Does national high-tech development zones promote the growth of urban total factor productivity? —based on" quasi-natural experiments" of 277 cities. Res. Econ. Manag. 39 (9), 75–90. doi:10.13502/j.cnki.issn1000-7636.2018.09.007

Vásquez-Urriago, Á. R., Barge-Gil, A., Rico, A. M., and Paraskevopoulou, E. (2014). The impact of science and technology parks on firms’ product innovation: empirical evidence from Spain. J. Evol. Econ. 24, 835–873. doi:10.1007/s00191-013-0337-1

Wang, F., Dong, M., Ren, J., Luo, S., Zhao, H., and Liu, J. (2022a). The impact of urban spatial structure on air pollution: empirical evidence from China. Environ. Dev. Sustain. 24, 5531–5550. doi:10.1007/s10668-021-01670-z

Wang, M., and Liu, X. (2023). The impact of the establishment of national high-tech zones on total factor productivity of Chinese enterprises. China Econ. 18 (3), 68–93. doi:10.19602/j.chinaeconomist.2023.05.04

Wang, Q., She, S., and Zeng, J. (2020). The mechanism and effect identification of the impact of National High-tech Zones on urban green innovation: based on a DID test. China Popul. Resour. Environ. 30 (02), 129–137.

Wang, W. S., and Xu, T. S. (2020). A research on the impact of national high-teach zone establishment on enterprise innovation performance. Econ. Surv. 37 (6), 76–87. doi:10.15931/j.cnki.1006-1096.20201010.001

Wang, Z., Yang, Y., and Wei, Y. (2022b). Has the construction of national high-tech zones promoted regional economic growth? empirical research from prefecture-level cities in China. Sustainability 14 (10), 6349. doi:10.3390/su14106349

Weber, M., Driessen, P. P., and Runhaar, H. A. (2014). Evaluating environmental policy instruments mixes; a methodology illustrated by noise policy in The Netherlands. J. Environ. Plan. Manag. 57 (9), 1381–1397. doi:10.1080/09640568.2013.808609

Wu, Y. B. (2019). Does fiscal decentralization promote technological innovation. Mod. Econ. Sci. 41, 13–25.

Xu, S. D., Jiang, J., and Zheng, J. (2022). Has the establishment of national high-tech zones promoted industrial Co-Agglomeration? an empirical test based on difference in difference method. Inq. into Econ. Issues 11, 113–127. (in Chinese).

Yang, F., and Guo, G. (2020). Fuzzy comprehensive evaluation of innovation capability of Chinese national high-tech zone based on entropy weight—taking the northern coastal comprehensive economic zone as an example. J. Intelligent Fuzzy Syst. 38 (6), 7857–7864. doi:10.3233/jifs-179855

Yu, N., De Jong, M., Storm, S., and Mi, J. (2013). Spatial spillover effects of transport infrastructure: evidence from Chinese regions. J. Transp. Geogr. 28, 56–66. doi:10.1016/j.jtrangeo.2012.10.009

Yuan, H., and Zhu, C. L. (2018). Do national high-tech zones promote the transformation and upgrading of China’s industrial structure. China Ind. Econ. 8, 60–77. doi:10.19581/j.cnki.ciejournal.2018.08.004

Zhang, T., Chen, L., and Dong, Z. (2018). Highway construction, firm dynamics and regional economic efficiency. China Ind. Econ. 1, 79–99. doi:10.19581/j.cnki.ciejournal.20180115.003

Zhang, T., and Li, J. C. (2023). Network infrastructure, inclusive green growth, and regional inequality: from causal inference based on double machine learning. J. Quantitative Technol. Econ. 40 (4), 113–135. doi:10.13653/j.cnki.jqte.20230310.005

Zhou, L., and Shen, K. (2020). National city group construction and green innovation. China Popul. Resour. Environ. 30 (8), 92–99.

Zhou, X., and Du, J. (2021). Does environmental regulation induce improved financial development for green technological innovation in China? J. Environ. Manag. 300, 113685. doi:10.1016/j.jenvman.2021.113685

Keywords: national high-tech zone, industrial policy, green innovation, heterogeneity analysis, moderating effect, double machine learning

Citation: Cao W, Jia Y and Tan B (2024) Impact of industrial policy on urban green innovation: empirical evidence of China’s national high-tech zones based on double machine learning. Front. Environ. Sci. 12:1369433. doi: 10.3389/fenvs.2024.1369433

Received: 12 January 2024; Accepted: 15 March 2024; Published: 04 April 2024.

Copyright © 2024 Cao, Jia and Tan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yu Jia, [email protected]

  • Open access
  • Published: 06 April 2024

Prevention of alcohol exposed pregnancies in Europe: the FAR SEAS guidelines

  • Carla Bruguera 1 ,
  • Lidia Segura-García 1 ,
  • Katarzyna Okulicz-Kozaryn 2 ,
  • Claudia Gandin 3 ,
  • Silvia Matrai 4 ,
  • Fleur Braddick 4 ,
  • Marta Zin-Sędek 5 ,
  • Luiza Slodownik 5 ,
  • Emanuele Scafato 3 &
  • Joan Colom 1  

BMC Pregnancy and Childbirth volume 24, Article number: 246 (2024)

Introduction

Drinking during pregnancy is the leading cause of birth defects and child developmental disorders in Europe. The adverse effects of drinking during pregnancy may include physical, behavioural and cognitive problems, known collectively as fetal alcohol spectrum disorders (FASD). Evidence-based comprehensive recommendations at the European level on how to implement preventive and treatment policies to reduce alcohol-exposed pregnancies are needed. FAR SEAS, a tendered service contract (number 20187106) awarded by the European Commission, aimed at developing guidelines to respond to this knowledge gap.

FAR SEAS recommendations were built on (1) a two-phase review of interventions, (2) an international expert consultation, and (3) a pilot study on prevention of FASD conducted in the Mazovia region of Poland. The review of interventions included nineteen electronic open access databases, several repositories of grey literature, a key informant consultation covering most European Union (EU) countries, and an additional guidelines search. After triangulating sources, 94 records were collected. Experts contributed to the design of the research questions, addressing the gaps in the literature and reviewing the recommendations formulated. The Polish pilot added nuances from real world practice to the formulated recommendations, resulting in the final set of guidelines for dissemination.

The FAR SEAS Guidelines comprise 23 recommendations grouped into the topic areas of policies, communication strategies, screening, brief intervention and referral to treatment, treatment, and social services. The recommendations highlight the need to respect women’s autonomy and avoid discrimination and stigmatization; to use universal screening for women of childbearing age, including detection of other psychosocial risks (such as domestic violence); and to provide individualized, comprehensive and multidisciplinary supportive interventions for those who require them, such as those with alcohol use disorders, including women’s partners. Policies to prevent FASD should be multicomponent, and public health communication should combine information about the risks together with self-efficacy messages to promote changes.

Conclusions

The FAR SEAS guidelines are a tool to support policy-makers and service managers in implementing effective programmes to reduce prenatal alcohol exposure among general and at-risk population groups. FASD prevention has to involve comprehensive and multi-level evidence-based policies and practice, with services and activities tailored to the needs of women at differing levels of risk, and with due attention to reducing stigma.

Women have traditionally consumed alcohol in lower amounts and less frequently than men. However, the gender gap has been decreasing and even disappearing in some EU countries [ 1 ]. Alcohol use is associated with breast cancer [ 2 ], alcoholic hepatitis [ 3 ], heart disease [ 4 ] and brain damage [ 5 ]. The risk for alcohol-related harms is higher when women are pregnant, as the adverse effects of drinking during pregnancy may include physical, behavioural and cognitive problems, known collectively as FASD. The global prevalence of FASD [ 6 , 7 ] was estimated to be 7.7 per 1000 population (95% CI, 4.9–11.7 per 1000 population), with the European Region having the highest overall prevalence at 19.8 per 1000 population (95% CI, 14.1–28.0 per 1000 population) [ 8 ]. Overall, FASD is preventable by abstaining from drinking alcohol during pregnancy, or by effective contraception use when drinking [ 9 ].

In the context of FASD prevention policies, the EU Strategy to support Member States in reducing alcohol related harm [ 10 ] requested governments to raise awareness of the risks of drinking during pregnancy. The need for evidence-based policies to reduce alcohol related harms was also stressed. These guidelines are the response from the FAR SEAS project (Fetal Alcohol Reduction and exchange of European knowledge after the Standard European Alcohol Survey) to the absence of evidence-based guidelines at the European level on how to implement preventive interventions or treatment practices to reduce alcohol-exposed pregnancies. FAR SEAS is a tendered service contract (number 20187106) awarded by the European Commission, under the EU health programme.

The aim of the FAR SEAS guidelines is to provide the best available evidence to prevent and reduce alcohol consumption in women of child-bearing age, particularly in pregnant women, in an easily accessible and comprehensive format.

Target audience and population

The recommendations in the FAR SEAS guidelines are primarily aimed at policy makers, health service managers, health-care providers and social workers who are in contact with pregnant women and women of child-bearing age. Among health-care providers, we expect these guidelines to be useful to gynaecologists and obstetricians, physicians or primary health care professionals, midwives and nurses, and psychologists and psychiatrists working in the addiction field.

The target population to reach is women of child-bearing age, especially those who are pregnant, and includes those in vulnerable situations and disadvantaged groups or who have alcohol-related problems.

Evidence compilation

Evidence was collected in 4 stages:

1. Literature review of peer-reviewed papers.
2. Consultation of experts.
3. Grey literature.
4. Pilot study in Poland.

Initially, a literature review was performed. Nineteen electronic open access databases were searched for relevant articles, guidelines and policies related to pregnant women and alcohol use. The databases searched were: African Journals Online (AJOL), CiNii, Cochrane, Cordis, Cuiden, Dialnet Plus, ENFISPO, InDICEs, Joanna Briggs Institute EBP Database, LILACS, Open Grey, Psicodoc, Psychology & behavioral sciences collection, PsycINFO, PubMed, SciELo, Scopus, WoS BIOSIS Preview and WoS Core Collection. The search strategy was developed by an expert bibliographer from the University of Barcelona in consultation with experts from the Hospital Clínic de Barcelona, GENCAT (Generalitat de Catalunya) and PARPA (State Agency for the Prevention of Alcohol-Related Problems).

The strategy was organized around the following three main keyword topics:

  • Pregnant women: (pregnant OR gestation OR prenatal) AND
  • Alcohol consumption: (alcohol drinking OR alcohol consumption OR alcoholics OR alcoholism) AND
  • Types of documents relevant for the review: (Practice Guideline OR Clinical trial OR Guideline OR Health Promotion OR Health Planning Guidelines OR Advance Directive Adherence OR Benchmarking OR Clinical Trials OR Program Development OR Program Evaluation OR Government Programs OR Preventive Health Services OR Education OR Safety Management OR Social Validity, Research OR Family Planning Services OR Healthy People Programs).
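The three keyword topics above combine into a single Boolean query: terms are joined with OR within a topic, and the topics are joined with AND. A minimal sketch of that composition follows. The helper name `or_group` and the variable names are hypothetical, and the exact syntax accepted by each database (field tags, quoting, truncation) is not described in the text and would need adapting.

```python
def or_group(terms):
    """Join the terms of one keyword topic with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


# Keyword topics exactly as listed in the search strategy.
pregnant_women = ["pregnant", "gestation", "prenatal"]
alcohol_consumption = [
    "alcohol drinking", "alcohol consumption", "alcoholics", "alcoholism",
]
document_types = [
    "Practice Guideline", "Clinical trial", "Guideline", "Health Promotion",
    "Health Planning Guidelines", "Advance Directive Adherence",
    "Benchmarking", "Clinical Trials", "Program Development",
    "Program Evaluation", "Government Programs",
    "Preventive Health Services", "Education", "Safety Management",
    "Social Validity, Research", "Family Planning Services",
    "Healthy People Programs",
]

# Topics are combined with AND, so a record must match all three.
query = " AND ".join(
    or_group(topic)
    for topic in (pregnant_women, alcohol_consumption, document_types)
)
print(query)
```

Keeping each topic as a list makes it easy to document and rerun the strategy when terms are added or a database requires a different syntax.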

Inclusion criteria:

  • All geographical areas.
  • Sample N > 6.
  • All languages.
  • Research that followed equity and bioethical principles.
  • Peer reviewed articles.

Exclusion criteria:

  • Experimental research with animals.
  • FASD diagnosis or management.

Secondly, we undertook consultations with informants from key European and international expert networks such as EUFAS (European Federation of Addiction Societies), EUFASD (European FASD Alliance), INEBRIA (International Network on Brief Interventions for Alcohol and Other Drugs), APN (Alcohol Policy Network) and RARHA (Reducing Alcohol Related Harm), and from 3 public health organizations: the European Monitoring Centre for Drugs and Drug Addiction (EMCDDA), the World Health Organization (WHO) and the EU’s Consumer, Health, Agriculture and Food Executive Agency (CHAFEA). These informants provided further documents (published articles or grey literature).

The third step was to search relevant European and international repositories of the grey literature (e.g., reports and other documents from institutions or public health organizations such as the EMCDDA, WHO, and a key network: the European Fetal Alcohol Spectrum Disorders Alliance (EUFASD)). An additional search of guidelines was performed as well as a further round of expert consultations to address gaps in the evidence. Finally, secondary sources emerging from the documents reviewed were also included.

Evidence assessment

Data extracted were assessed to ensure a robust body of evidence:

  • Randomised controlled trials (RCTs) were analysed using RoB2 [ 11 ], the revised Cochrane risk-of-bias tool for RCTs. Studies with one or more high-risk domains were excluded.
  • Systematic reviews (SRs) were assessed using AMSTAR [ 12 ]. Reviews that scored less than 4 items were excluded.
  • Guidelines were evaluated using the Rigour of Development criteria of the AGREE II instrument [ 13 ]. Guidelines scoring less than 70% were excluded.
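The three exclusion thresholds above can be expressed as a simple filter. The sketch below is illustrative only: the `Record` class, the `is_excluded` function and the example records are hypothetical, and the scores are assumed to have been produced beforehand by reviewers applying RoB2, AMSTAR and AGREE II.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    design: str   # "rct", "sr" or "guideline"
    score: float  # meaning depends on design, see is_excluded


def is_excluded(record):
    """Apply the exclusion threshold that matches the study design."""
    if record.design == "rct":
        # score = number of RoB2 domains judged high risk
        return record.score >= 1
    if record.design == "sr":
        # score = number of AMSTAR items scored
        return record.score < 4
    if record.design == "guideline":
        # score = AGREE II Rigour of Development score in percent
        return record.score < 70
    raise ValueError(f"unknown design: {record.design}")


# Hypothetical records, not taken from the review.
records = [
    Record("Trial A", "rct", 0),
    Record("Trial B", "rct", 1),
    Record("Review C", "sr", 4),
    Record("Review D", "sr", 3),
    Record("Guideline E", "guideline", 70),
    Record("Guideline F", "guideline", 65),
]

retained = [r.title for r in records if not is_excluded(r)]
print(retained)
```

Records that sit exactly on a threshold (4 AMSTAR items, 70% on AGREE II) are retained, since the text excludes only those scoring less than the threshold.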

Two investigators evaluated the guidelines and a sample of RCTs and systematic reviews. A template (see Table 12 in Appendix 1) was made with the different dimensions to be evaluated, according to study design. Each researcher evaluated independently; they then met to compare scores and reach agreement by consensus. The evidence was then rated using the criteria described in the Scottish Intercollegiate Guidelines Network (SIGN) ‘A guideline developer’s handbook’ [ 14 ] (see Table 1 in Appendix 1).

Experts’ assessment

The Guidelines Development Group (see Table 2 in Appendix 1 ) used the evidence rating, plus further evidence on harms, benefits, values, preferences, resource use, and feasibility, to write a set of recommendations for each question of the topics.

The strength, relevance and transferability of each recommendation were assessed and determined based on the level of agreement with each recommendation by the internal reviewers and external experts (see Table 3 in Appendix 1).

Experts were selected according to their expertise in the FASD and public health fields and relevant experience. They were asked to provide feedback on the scope, aims and questions answered by these guidelines, as well as the recommendations formulated and the final version of the document.

In order to improve and explore the validity of those recommendations that presented evidence gaps, feedback was collected via a bespoke online questionnaire, which elicited opinions from members of EUFASD (European FASD Alliance) and the Kettil Bruun Society, a society for social and epidemiological research on alcohol.

Finally, lessons learned from the pilot study of an FASD prevention programme [ 15 ] in Mazovia (Poland), also in the context of FAR SEAS, introduced further nuances to the recommendations formulated.

The pilot project was implemented in four municipalities in Poland, recruited based on the willingness of care centres to participate, intention to address this perinatal issue, and commitment to promote the participation of multi-disciplinary professionals from different services (social, addiction, and psychology). The pilot study protocol was assessed and approved by the ethics committee of the Polish State Agency for the Prevention of Alcohol-Related Problems (formerly PARPA, now KCPU), and all participants (or their legal guardian) gave informed consent for their participation and data use in line with the EU GDPR. After the initial training based on motivational interviewing and the CHOICES programme [ 16 ], professionals recruited women of child-bearing age (pregnant and not pregnant) in their local communities, screened them for alcohol risk using standardised scales, and allocated participants ( n  = 441) to groups for low- (69%), moderate- (23%) or high-risk (7%) of alcohol-exposed pregnancy. The professionals also provided interventions tailored to the needs of each participant, ranging from short feedback, to brief intervention, to motivational interviewing sessions (to change one or both of two high-risk behaviours – risky alcohol use and/or no contraception use), to referral to other specialists. The results indicated positive changes in the key outcome variables: risky alcohol consumption, contraception use and visiting a gynaecologist, as well as in associated psychosocial risk factors (decreases in cigarette and drug use, domestic violence and depressive symptoms). Full details of the FAR SEAS Polish pilot study can be found in the resulting publication [ 15 ].

The 2626 records identified in the review were screened to exclude 2524 (682 duplicates + 1842 records that did not meet inclusion criteria) leaving 102 records included (see Fig. 1 for PRISMA chart and Table 11 in Appendix 1 for records included).
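The record flow reported above can be checked with a few lines of arithmetic. The variable names are illustrative; the figures are those stated in the text.

```python
# Figures as reported for the review.
identified = 2626
duplicates = 682
not_meeting_criteria = 1842

# Excluded records are the duplicates plus those failing inclusion criteria.
excluded = duplicates + not_meeting_criteria
included = identified - excluded

print(f"excluded: {excluded}, included: {included}")
```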

Twenty-three recommendations were formulated based on the literature reviews, several rounds of expert consultations, and the pilot study deployed in the Mazovian region of Poland. The recommendations were organised into the following 7 topics: (1) Organizational, strategic and policy changes required to properly address the needs of pregnant women and women of child-bearing age who are at risk of having, or who already have, an alcohol-related problem (see Table 4 in Appendix 1); (2) Strategies and best practices for promoting and raising awareness of the risks of drinking alcohol during pregnancy (see Table 5 in Appendix 1); (3) Validated tools to screen alcohol use and maternal risk factors and further assess alcohol-related problems among pregnant women and women of child-bearing age in health and social care settings (see Table 6 in Appendix 1); (4) Preventive interventions for pregnant women and women of child-bearing age at risk of having alcohol-related problems (see Table 7 in Appendix 1); (5) Treatment interventions for pregnant women and women of child-bearing age at risk of having alcohol-related problems (see Tables 8.1 and 8.2 in Appendix 1); (6) Social measures for pregnant women and women of child-bearing age at risk of having alcohol-related problems (see Table 9 in Appendix 1); (7) Implementation, training and evaluation strategies for prevention activities (see Table 10 in Appendix 1).

Details on the formulation of specific recommendations and incorporation of lessons from the pilot can be seen in the FAR SEAS Deliverable report on this output (D13).

The final FAR SEAS guidelines were delivered in various formats addressing different beneficiaries (from an extensive deliverable report addressing technical policy makers and practitioners, to a laypersons’ guide to inform the general audience), for dissemination through scientific and public health channels throughout the final months of the FAR SEAS contract. These documents are available from the authors of this paper or FAR SEAS contract coordinators, pending publication on the EC web pages.

It is well-established in scientific literature that there is no safe level of alcohol use during pregnancy [ 9 ] and that abstinence or effective contraception are the only responsible professional recommendations.

Robust evidence on which interventions are most effective in reducing alcohol-exposed pregnancies is scarce. However, despite the lack of high-quality studies, recommendations need to be made in order to support women in avoiding alcohol while pregnant, and to protect the health of their future children.

These guidelines have been built upon the foundation of the WHO guidelines [ 17 ]. However, they specifically target alcohol-related issues and extend beyond clinical considerations to encompass public health perspectives.

Four main recommendations can be drawn from the FAR SEAS project:

Multicomponent policies that range from primary prevention to harm reduction strategies need to be developed that consider respect for women’s autonomy and protection from discrimination.

In order to promote changes, it is recommended that public health communication combine information about risks and self-efficacy messages.

Universal screening is recommended (rather than targeted screening, which is open to bias and stigmatisation of certain groups) for women of childbearing age, including the detection of other psychosocial risks that can correlate with alcohol use during pregnancy, as were included in the FAR SEAS protocol, and accompanied by individualized interventions for those who require it, including their partners.

Women with alcohol use disorders need comprehensive and multidisciplinary support also addressing psychosocial risks such as poverty or violence.

This paper has some limitations. The recommendations included here were formulated according to the available evidence, but evidence regarding the prevention of alcohol consumption among women of child-bearing age in general, and pregnant women in particular, is not strong. Given this, the contributions arising from the analysis of expert opinion are even more important.

Although the initial scope of the FAR SEAS project was broadly multi-sectoral, the evidence found for FASD prevention was generally focused on health services and professionals, which represents a further limitation of the evidence review. However, the FAR SEAS Polish pilot showed promising results regarding the involvement of social services and social workers in the prevention and management of alcohol exposed pregnancies, which has been incorporated into the guidelines. The role of social care professionals should be further explored and assessed to synthesise new evidence around this sector, to inform guidelines and policy making, especially given that women in vulnerable social situations may be at higher risk of alcohol exposed pregnancies.

Data availability

Data supporting this article are available from the corresponding author on reasonable request.

The European Institute of Women’s Health. Women and Alcohol in the EU. https://eurohealth.ie/women-and-alcohol-in-the-eu/ .

Shield KD, Soerjomataram I, Rehm J. Alcohol use and breast cancer: a critical review. Alcoholism: Clin Experimental Res. 2016;40(6):1166–81. PMID: 27130687.

Guy J, Peters M. Liver disease in women: The influence of gender on epidemiology, natural history, and patient outcomes. Gastroenterology & Hepatology 9(10):633–639, 2013. PMID: 24764777.

Erol A, Karpyak V. Sex and gender-related differences in alcohol use and its consequences: Contemporary knowledge and future research considerations. Drug and Alcohol Dependence 156:1–13, 2015. PMID: 26371405.

Hommer DW. Male and female sensitivity to alcohol–induced brain damage. Bethesda, MD: National Institute on Alcohol Abuse and Alcoholism, 2004. https://pubs.niaaa.nih.gov/publications/arh27-2/181-185.htm . Accessed June 19, 2019.

Hoyme HE, May PA, Kalberg WO, et al. A practical clinical approach to diagnosis of fetal alcohol spectrum disorders: clarification of the 1996 Institute of Medicine criteria. Pediatrics. 2005;115(1):39–47.

Stratton K, Howe C, Battaglia F. Fetal alcohol syndrome: diagnosis, epidemiology, prevention, and treatment. Washington, DC: National Academy; 1996.

Popova S, Charness ME, Burd L, Crawford A, Hoyme HE, Mukherjee RA, ... & Elliott EJ. Fetal alcohol spectrum disorders. Nature Rev Disease Primers. 2023;9(1):11. https://www.nature.com/articles/s41572-023-00420-x .

Jacobsen B, Lindemann C, Petzina R, Verthein U. The universal and primary prevention of foetal alcohol spectrum disorders (FASD): a systematic review. J Prev. 2022;43(3):297–316.

European Commission. (2006). An EU strategy to support Member States in reducing alcohol related harm. https://ec.europa.eu/health/ph_determinants/life_style/alcohol/documents/alcohol_com_625_en.pdf .

RoB 2. A revised Cochrane risk-of-bias tool for randomized trials. Accessible at: https://methods.cochrane.org/bias/resources/rob-2-revised-cochrane-risk-bias-tool-randomized-trials .

AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. Accessible at: https://amstar.ca/docs/AMSTARguideline.pdf .

Brouwers M, Kho ME, Browman GP, Burgers JS, Cluzeau F, Feder G, Fervers B, Graham ID, Grimshaw J, Hanna S, Littlejohns P, Makarski J, Zitzelsberger L, for the AGREE Next Steps Consortium. AGREE II: Advancing guideline development, reporting and evaluation in healthcare. Can Med Assoc J. 2010. Available online July 5, 2010. https://doi.org/10.1503/cmaj.090449 .

Scottish Intercollegiate Guidelines Network (SIGN). ‘A guideline developer’s handbook’. Accessible at: https://www.sign.ac.uk/assets/sign50_2011.pdf .

Okulicz-Kozaryn K, Segura-García L, Bruguera C, Braddick F, Zin-Sędek M, Gandin C, Słodownik-Przybyłek L, Scafato E, Ghirini S, Colom J, Matrai S. Reducing the risk of prenatal alcohol exposure and FASD through social services: promising results from the FAR SEAS pilot project. Front Psychiatry. 2023;14:1243904. https://doi.org/10.3389/fpsyt.2023.1243904 . PMID: 37779625; PMCID: PMC10540837.

Ceperich SD, Ingersoll KS. Motivational interviewing + feedback intervention to reduce alcohol-exposed pregnancy risk among college binge drinkers: determinants and patterns of response. J Behav Med. 2011;34(5):381–95.

WHO. (2014) Guidelines for the identification and management of substance use and substance use disorders in pregnancy. Retrieved on July 2, 2020, from https://apps.who.int/iris/bitstream/handle/10665/107130/9789241548731_eng.pdf;jsessionid=A6556244BD91CF2E4CAE55A07ACA33E7?sequence=1 .

Acknowledgements

This work was produced under the FAR SEAS service contract (Fetal Alcohol Reduction and EU Knowledge Exchange after SEAS – www.far-seas.eu , Contract No. 20187106) with the Health and Digital Executive Agency (HaDEA) acting under the mandate from the European Commission (DG SANTE). The information and any opinion set out in this article reflects the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained therein.

EU Health Programme 2014–2020, European Commission, 20187106.

Author information

Authors and affiliations.

Subdirectorate General of Addictions, HIV, STI and Viral Hepatitis, Public Health Agency of Catalonia (GENCAT), Barcelona, Spain

Carla Bruguera, Lidia Segura-García & Joan Colom

Institute of Mother and Child, Warsaw, Poland

Katarzyna Okulicz-Kozaryn

National Observatory on Alcohol, National Center on Addiction and Doping, Istituto Superiore di Sanità (ISS), Rome, Italy

Claudia Gandin & Emanuele Scafato

CLÍNIC Foundation for Biomedical Research (FCRB), Barcelona, Spain

Silvia Matrai & Fleur Braddick

Polish National Centre for Addiction Prevention (KCPU), Warsaw, Poland

Marta Zin-Sędek & Luiza Slodownik

Contributions

FB, SM and KO-K acquired funding for the project. CB, LS, FB, SM, CG, ES contributed to conception and design of the study. CB and LS wrote the first draft of the manuscript and FB carried out the editing. MZ and JC together with the other authors contributed to manuscript revision, read, and approved the submitted version.

Corresponding author

Correspondence to Carla Bruguera .

Ethics declarations

Ethics approval and consent to participate.

The pilot study protocol was assessed and approved by the ethics committee of the Polish State Agency for the Prevention of Alcohol-Related Problems (formerly PARPA, now KCPU).

Consent for publication

Not applicable.

Conflict of interest

Authors have no conflict of interest to declare.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Bruguera, C., Segura-García, L., Okulicz-Kozaryn, K. et al. Prevention of alcohol exposed pregnancies in Europe: the FAR SEAS guidelines. BMC Pregnancy Childbirth 24 , 246 (2024). https://doi.org/10.1186/s12884-024-06452-9

Received : 07 November 2023

Accepted : 26 March 2024

Published : 06 April 2024

DOI : https://doi.org/10.1186/s12884-024-06452-9

  • Prenatal alcohol exposure
  • Fetal alcohol spectrum disorder (FASD)

BMC Pregnancy and Childbirth

ISSN: 1471-2393

NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Lau F, Kuziemsky C, editors. Handbook of eHealth Evaluation: An Evidence-based Approach [Internet]. Victoria (BC): University of Victoria; 2017 Feb 27.

Chapter 9. Methods for Literature Reviews

Guy Paré and Spyros Kitsiou.

9.1. Introduction

Literature reviews play a critical role in scholarship because science remains, first and foremost, a cumulative endeavour ( vom Brocke et al., 2009 ). As in any academic discipline, rigorous knowledge syntheses are becoming indispensable in keeping up with an exponentially growing eHealth literature, assisting practitioners, academics, and graduate students in finding, evaluating, and synthesizing the contents of many empirical and conceptual papers. Among other methods, literature reviews are essential for: (a) identifying what has been written on a subject or topic; (b) determining the extent to which a specific research area reveals any interpretable trends or patterns; (c) aggregating empirical findings related to a narrow research question to support evidence-based practice; (d) generating new frameworks and theories; and (e) identifying topics or questions requiring more investigation ( Paré, Trudel, Jaana, & Kitsiou, 2015 ).

Literature reviews can take two major forms. The most prevalent one is the “literature review” or “background” section within a journal paper or a chapter in a graduate thesis. This section synthesizes the extant literature and usually identifies the gaps in knowledge that the empirical study addresses ( Sylvester, Tate, & Johnstone, 2013 ). It may also provide a theoretical foundation for the proposed study, substantiate the presence of the research problem, justify the research as one that contributes something new to the cumulated knowledge, or validate the methods and approaches for the proposed study ( Hart, 1998 ; Levy & Ellis, 2006 ).

The second form of literature review, which is the focus of this chapter, constitutes an original and valuable work of research in and of itself ( Paré et al., 2015 ). Rather than providing a base for a researcher’s own work, it creates a solid starting point for all members of the community interested in a particular area or topic ( Mulrow, 1987 ). The so-called “review article” is a journal-length paper which has an overarching purpose to synthesize the literature in a field, without collecting or analyzing any primary data ( Green, Johnson, & Adams, 2006 ).

When appropriately conducted, review articles represent powerful information sources for practitioners looking for state-of-the-art evidence to guide their decision-making and work practices ( Paré et al., 2015 ). Further, high-quality reviews become frequently cited pieces of work which researchers seek out as a first clear outline of the literature when undertaking empirical studies ( Cooper, 1988 ; Rowe, 2014 ). Scholars who track and gauge the impact of articles have found that review papers are cited and downloaded more often than any other type of published article ( Cronin, Ryan, & Coughlan, 2008 ; Montori, Wilczynski, Morgan, Haynes, & Hedges, 2003 ; Patsopoulos, Analatos, & Ioannidis, 2005 ). The reason for their popularity may be the fact that reading the review enables one to have an overview, if not a detailed knowledge of the area in question, as well as references to the most useful primary sources ( Cronin et al., 2008 ). Although they are not easy to conduct, the commitment to complete a review article provides a tremendous service to one’s academic community ( Paré et al., 2015 ; Petticrew & Roberts, 2006 ). Most, if not all, peer-reviewed journals in the fields of medical informatics publish review articles of some type.

The main objectives of this chapter are fourfold: (a) to provide an overview of the major steps and activities involved in conducting a stand-alone literature review; (b) to describe and contrast the different types of review articles that can contribute to the eHealth knowledge base; (c) to illustrate each review type with one or two examples from the eHealth literature; and (d) to provide a series of recommendations for prospective authors of review articles in this domain.

9.2. Overview of the Literature Review Process and Steps

As explained in Templier and Paré (2015) , there are six generic steps involved in conducting a review article:

  • formulating the research question(s) and objective(s),
  • searching the extant literature,
  • screening for inclusion,
  • assessing the quality of primary studies,
  • extracting data, and
  • analyzing data.

Although these steps are presented here in sequential order, one must keep in mind that the review process can be iterative and that many activities can be initiated during the planning stage and later refined during subsequent phases ( Finfgeld-Connett & Johnson, 2013 ; Kitchenham & Charters, 2007 ).

Formulating the research question(s) and objective(s): As a first step, members of the review team must appropriately justify the need for the review itself ( Petticrew & Roberts, 2006 ), identify the review’s main objective(s) ( Okoli & Schabram, 2010 ), and define the concepts or variables at the heart of their synthesis ( Cooper & Hedges, 2009 ; Webster & Watson, 2002 ). Importantly, they also need to articulate the research question(s) they propose to investigate ( Kitchenham & Charters, 2007 ). In this regard, we concur with Jesson, Matheson, and Lacey (2011) that clearly articulated research questions are key ingredients that guide the entire review methodology; they underscore the type of information that is needed, inform the search for and selection of relevant literature, and guide or orient the subsequent analysis.

Searching the extant literature: The next step consists of searching the literature and making decisions about the suitability of material to be considered in the review ( Cooper, 1988 ). There exist three main coverage strategies. First, exhaustive coverage means an effort is made to be as comprehensive as possible in order to ensure that all relevant studies, published and unpublished, are included in the review and, thus, conclusions are based on this all-inclusive knowledge base. The second type of coverage consists of presenting materials that are representative of most other works in a given field or area. Often authors who adopt this strategy will search for relevant articles in a small number of top-tier journals in a field ( Paré et al., 2015 ). In the third strategy, the review team concentrates on prior works that have been central or pivotal to a particular topic. This may include empirical studies or conceptual papers that initiated a line of investigation, changed how problems or questions were framed, introduced new methods or concepts, or engendered important debate ( Cooper, 1988 ).

Screening for inclusion: The following step consists of evaluating the applicability of the material identified in the preceding step ( Levy & Ellis, 2006 ; vom Brocke et al., 2009 ). Once a group of potential studies has been identified, members of the review team must screen them to determine their relevance ( Petticrew & Roberts, 2006 ). A set of predetermined rules provides a basis for including or excluding certain studies. This exercise requires a significant investment on the part of researchers, who must ensure enhanced objectivity and avoid biases or mistakes. As discussed later in this chapter, for certain types of reviews there must be at least two independent reviewers involved in the screening process and a procedure to resolve disagreements must also be in place ( Liberati et al., 2009 ; Shea et al., 2009 ).

Assessing the quality of primary studies: In addition to screening material for inclusion, members of the review team may need to assess the scientific quality of the selected studies, that is, appraise the rigour of the research design and methods. Such formal assessment, which is usually conducted independently by at least two coders, helps members of the review team refine which studies to include in the final sample, determine whether or not the differences in quality may affect their conclusions, or guide how they analyze the data and interpret the findings ( Petticrew & Roberts, 2006 ). Ascribing quality scores to each primary study or considering through domain-based evaluations which study components have or have not been designed and executed appropriately makes it possible to reflect on the extent to which the selected study addresses possible biases and maximizes validity ( Shea et al., 2009 ).

Extracting data: The following step involves gathering or extracting applicable information from each primary study included in the sample and deciding what is relevant to the problem of interest ( Cooper & Hedges, 2009 ). Indeed, the type of data that should be recorded mainly depends on the initial research questions ( Okoli & Schabram, 2010 ). However, important information may also be gathered about how, when, where and by whom the primary study was conducted, the research design and methods, or qualitative/quantitative results ( Cooper & Hedges, 2009 ).

Analyzing and synthesizing data: As a final step, members of the review team must collate, summarize, aggregate, organize, and compare the evidence extracted from the included studies. The extracted data must be presented in a meaningful way that suggests a new contribution to the extant literature ( Jesson et al., 2011 ). Webster and Watson (2002) warn researchers that literature reviews should be much more than lists of papers and should provide a coherent lens to make sense of extant knowledge on a given topic. There exist several methods and techniques for synthesizing quantitative (e.g., frequency analysis, meta-analysis) and qualitative (e.g., grounded theory, narrative analysis, meta-ethnography) evidence ( Dixon-Woods, Agarwal, Jones, Young, & Sutton, 2005 ; Thomas & Harden, 2008 ).
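The screening step described above calls for at least two independent reviewers and a procedure to resolve disagreements. As a minimal sketch of how such decisions might be compared, the example below sorts records into agreed inclusions, agreed exclusions, and conflicts that need resolution. The record identifiers, the decisions, and the function name are hypothetical, and the chapter does not prescribe any particular tool or data format.

```python
def compare_screening(reviewer_a, reviewer_b):
    """Split records into agreed includes, agreed excludes, and conflicts.

    Each argument maps a record ID to True (include) or False (exclude).
    Both reviewers are assumed to have screened the same set of records.
    """
    included, excluded, conflicts = [], [], []
    for record_id in sorted(reviewer_a):
        decision_a = reviewer_a[record_id]
        decision_b = reviewer_b[record_id]
        if decision_a != decision_b:
            conflicts.append(record_id)   # needs discussion or a third reviewer
        elif decision_a:
            included.append(record_id)
        else:
            excluded.append(record_id)
    return included, excluded, conflicts


# Hypothetical screening decisions from two independent reviewers.
reviewer_a = {"rec1": True, "rec2": False, "rec3": True, "rec4": False}
reviewer_b = {"rec1": True, "rec2": False, "rec3": False, "rec4": False}

included, excluded, conflicts = compare_screening(reviewer_a, reviewer_b)
print(included, excluded, conflicts)
```

Keeping conflicts in a separate list makes the disagreement-resolution step explicit, rather than silently defaulting to one reviewer's decision.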

9.3. Types of Review Articles and Brief Illustrations

EHealth researchers have at their disposal a number of approaches and methods for making sense out of existing literature, all with the purpose of casting current research findings into historical contexts or explaining contradictions that might exist among a set of primary research studies conducted on a particular topic. Our classification scheme is largely inspired by Paré and colleagues’ (2015) typology. Below we present and illustrate those review types that we feel are central to the growth and development of the eHealth domain.

9.3.1. Narrative Reviews

The narrative review is the “traditional” way of reviewing the extant literature and is skewed towards a qualitative interpretation of prior knowledge ( Sylvester et al., 2013 ). Put simply, a narrative review attempts to summarize or synthesize what has been written on a particular topic but does not seek generalization or cumulative knowledge from what is reviewed ( Davies, 2000 ; Green et al., 2006 ). Instead, the review team often undertakes the task of accumulating and synthesizing the literature to demonstrate the value of a particular point of view ( Baumeister & Leary, 1997 ). As such, reviewers may selectively ignore or limit the attention paid to certain studies in order to make a point. In this rather unsystematic approach, the selection of information from primary articles is subjective, lacks explicit criteria for inclusion and can lead to biased interpretations or inferences ( Green et al., 2006 ). There are several narrative reviews in the particular eHealth domain, as in all fields, which follow such an unstructured approach ( Silva et al., 2015 ; Paul et al., 2015 ).

Despite these criticisms, this type of review can be very useful in gathering together a volume of literature in a specific subject area and synthesizing it. As mentioned above, its primary purpose is to provide the reader with a comprehensive background for understanding current knowledge and highlighting the significance of new research ( Cronin et al., 2008 ). Faculty like to use narrative reviews in the classroom because they are often more up to date than textbooks, provide a single source for students to reference, and expose students to peer-reviewed literature ( Green et al., 2006 ). For researchers, narrative reviews can inspire research ideas by identifying gaps or inconsistencies in a body of knowledge, thus helping researchers to determine research questions or formulate hypotheses. Importantly, narrative reviews can also be used as educational articles to bring practitioners up to date with certain topics or issues ( Green et al., 2006 ).

Recently, there have been several efforts to introduce more rigour in narrative reviews that will elucidate common pitfalls and bring changes into their publication standards. Information systems researchers, among others, have contributed to advancing knowledge on how to structure a “traditional” review. For instance, Levy and Ellis (2006) proposed a generic framework for conducting such reviews. Their model follows the systematic data processing approach comprised of three steps, namely: (a) literature search and screening; (b) data extraction and analysis; and (c) writing the literature review. They provide detailed and very helpful instructions on how to conduct each step of the review process. As another methodological contribution, vom Brocke et al. (2009) offered a series of guidelines for conducting literature reviews, with a particular focus on how to search and extract the relevant body of knowledge. Last, Bandara, Miskon, and Fielt (2011) proposed a structured, predefined and tool-supported method to identify primary studies within a feasible scope, extract relevant content from identified articles, synthesize and analyze the findings, and effectively write and present the results of the literature review. We highly recommend that prospective authors of narrative reviews consult these useful sources before embarking on their work.

Darlow and Wen (2015) provide a good example of a highly structured narrative review in the eHealth field. These authors synthesized published articles that describe the development process of mobile health (m-health) interventions for patients’ cancer care self-management. As in most narrative reviews, the scope of the research questions being investigated is broad: (a) how development of these systems is carried out; (b) which methods are used to investigate these systems; and (c) what conclusions can be drawn as a result of the development of these systems. To provide clear answers to these questions, a literature search was conducted on six electronic databases and Google Scholar. The search was performed using several terms and free text words, combining them in an appropriate manner. Four inclusion and three exclusion criteria were utilized during the screening process. Both authors independently reviewed each of the identified articles to determine eligibility and extract study information. A flow diagram shows the number of studies identified, screened, and included or excluded at each stage of study selection. In terms of contributions, this review provides a series of practical recommendations for m-health intervention development.

9.3.2. Descriptive or Mapping Reviews

The primary goal of a descriptive review is to determine the extent to which a body of knowledge in a particular research topic reveals any interpretable pattern or trend with respect to pre-existing propositions, theories, methodologies or findings ( King & He, 2005 ; Paré et al., 2015 ). In contrast with narrative reviews, descriptive reviews follow a systematic and transparent procedure, including searching, screening and classifying studies ( Petersen, Vakkalanka, & Kuzniarz, 2015 ). Indeed, structured search methods are used to form a representative sample of a larger group of published works ( Paré et al., 2015 ). Further, authors of descriptive reviews extract from each study certain characteristics of interest, such as publication year, research methods, data collection techniques, and direction or strength of research outcomes (e.g., positive, negative, or non-significant) in the form of frequency analysis to produce quantitative results ( Sylvester et al., 2013 ). In essence, each study included in a descriptive review is treated as the unit of analysis and the published literature as a whole provides a database from which the authors attempt to identify any interpretable trends or draw overall conclusions about the merits of existing conceptualizations, propositions, methods or findings ( Paré et al., 2015 ). In doing so, a descriptive review may claim that its findings represent the state of the art in a particular domain ( King & He, 2005 ).
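The frequency analysis mentioned above, in which each study is the unit of analysis, can be illustrated with a short sketch. The coded studies, the field names, and the `frequency_table` helper below are all hypothetical; they show only one plausible way of tabulating characteristics and are not drawn from any of the cited reviews.

```python
from collections import Counter

# Hypothetical coded studies: one dictionary per study in the review sample.
studies = [
    {"year": 2010, "method": "survey", "outcome": "positive"},
    {"year": 2011, "method": "survey", "outcome": "positive"},
    {"year": 2011, "method": "case study", "outcome": "negative"},
    {"year": 2012, "method": "experiment", "outcome": "non-significant"},
]


def frequency_table(studies, field):
    """Count how often each value of `field` occurs across the studies.

    Returns a dict mapping each value to (count, percentage of all studies),
    ordered from most to least frequent.
    """
    counts = Counter(study[field] for study in studies)
    total = len(studies)
    return {
        value: (n, round(100 * n / total, 1))
        for value, n in counts.most_common()
    }


method_table = frequency_table(studies, "method")
outcome_table = frequency_table(studies, "outcome")
print(method_table)
print(outcome_table)
```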

In the fields of health sciences and medical informatics, reviews that focus on examining the range, nature and evolution of a topic area are described by Anderson, Allen, Peckham, and Goodwin (2008) as mapping reviews . Like descriptive reviews, the research questions are generic and usually relate to publication patterns and trends. There is no preconceived plan to systematically review all of the literature although this can be done. Instead, researchers often present studies that are representative of most works published in a particular area and they consider a specific time frame to be mapped.

An example of this approach in the eHealth domain is offered by DeShazo, Lavallie, and Wolf (2009). The purpose of this descriptive or mapping review was to characterize publication trends in the medical informatics literature over a 20-year period (1987 to 2006). To achieve this ambitious objective, the authors performed a bibliometric analysis of medical informatics citations indexed in MEDLINE using publication trends, journal frequencies, impact factors, Medical Subject Headings (MeSH) term frequencies, and characteristics of citations. Findings revealed that there were over 77,000 medical informatics articles published during the covered period in numerous journals and that the average annual growth rate was 12%. The MeSH term analysis also suggested a strong interdisciplinary trend. Finally, average impact scores increased over time with two notable growth periods. Overall, patterns in research outputs that seem to characterize the historic trends and current components of the field of medical informatics suggest it may be a maturing discipline (DeShazo et al., 2009).
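The text does not state how the average annual growth rate was calculated. One common way to express such a figure is the compound annual growth rate, sketched below under that assumption. The function name and the publication counts are hypothetical and are not taken from DeShazo et al. (2009).

```python
def compound_annual_growth_rate(first_count, last_count, intervals):
    """Return the constant yearly growth rate that turns `first_count`
    into `last_count` over `intervals` year-to-year steps."""
    if first_count <= 0 or intervals <= 0:
        raise ValueError("first_count and intervals must be positive")
    return (last_count / first_count) ** (1 / intervals) - 1


# Hypothetical yearly publication counts, not figures from the review:
# 100 articles in the first year and 121 two years later.
rate = compound_annual_growth_rate(first_count=100, last_count=121, intervals=2)
print(f"{rate:.1%}")
```

Note that a period from 1987 to 2006 spans 19 year-to-year intervals, so `intervals` would be 19 if the same formula were applied to that period.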

9.3.3. Scoping Reviews

Scoping reviews attempt to provide an initial indication of the potential size and nature of the extant literature on an emergent topic (Arksey & O’Malley, 2005; Daudt, van Mossel, & Scott, 2013 ; Levac, Colquhoun, & O’Brien, 2010). A scoping review may be conducted to examine the extent, range and nature of research activities in a particular area, determine the value of undertaking a full systematic review (discussed next), or identify research gaps in the extant literature ( Paré et al., 2015 ). In line with their main objective, scoping reviews usually conclude with the presentation of a detailed research agenda for future works along with potential implications for both practice and research.

Unlike narrative and descriptive reviews, the whole point of scoping the field is to be as comprehensive as possible, including grey literature (Arksey & O’Malley, 2005). Inclusion and exclusion criteria must be established to help researchers eliminate studies that are not aligned with the research questions. It is also recommended that at least two independent coders review abstracts yielded from the search strategy and then the full articles for study selection ( Daudt et al., 2013 ). The synthesized evidence from content or thematic analysis is relatively easy to present in tabular form (Arksey & O’Malley, 2005; Thomas & Harden, 2008 ).

One of the most highly cited scoping reviews in the eHealth domain was published by Archer, Fevrier-Thomas, Lokker, McKibbon, and Straus (2011). These authors reviewed the existing literature on personal health record (PHR) systems including design, functionality, implementation, applications, outcomes, and benefits. Seven databases were searched from 1985 to March 2010. Several search terms relating to PHRs were used during this process. Two authors independently screened titles and abstracts to determine inclusion status. A second screen of full-text articles, again by two independent members of the research team, ensured that the studies described PHRs. All in all, 130 articles met the criteria and their data were extracted manually into a database. The authors concluded that although there is a large amount of survey, observational, cohort/panel, and anecdotal evidence of PHR benefits and satisfaction for patients, more research is needed to evaluate the results of PHR implementations. Their in-depth analysis of the literature signalled that there is little solid evidence from randomized controlled trials or other studies through the use of PHRs. Hence, they suggested that more research is needed that addresses the current lack of understanding of optimal functionality and usability of these systems, and how they can play a beneficial role in supporting patient self-management (Archer et al., 2011).

9.3.4. Forms of Aggregative Reviews

Healthcare providers, practitioners, and policy-makers are nowadays overwhelmed with large volumes of information, including research-based evidence from numerous clinical trials and evaluation studies, assessing the effectiveness of health information technologies and interventions ( Ammenwerth & de Keizer, 2004 ; Deshazo et al., 2009 ). It is unrealistic to expect that all these disparate actors will have the time, skills, and necessary resources to identify the available evidence in the area of their expertise and consider it when making decisions. Systematic reviews that involve the rigorous application of scientific strategies aimed at limiting subjectivity and bias (i.e., systematic and random errors) can respond to this challenge.

Systematic reviews attempt to aggregate, appraise, and synthesize in a single source all empirical evidence that meets a set of previously specified eligibility criteria in order to answer a clearly formulated and often narrow research question on a particular topic of interest to support evidence-based practice (Liberati et al., 2009). They adhere closely to explicit scientific principles (Liberati et al., 2009) and rigorous methodological guidelines (Higgins & Green, 2008) aimed at reducing random and systematic errors that can lead to deviations from the truth in results or inferences. The use of explicit methods allows systematic reviews to aggregate a large body of research evidence, assess whether effects or relationships are in the same direction and of the same general magnitude, explain possible inconsistencies between study results, and determine the strength of the overall evidence for every outcome of interest based on the quality of included studies and the general consistency among them (Cook, Mulrow, & Haynes, 1997). The main procedures of a systematic review involve:

  • Formulating a review question and developing a search strategy based on explicit inclusion criteria for the identification of eligible studies (usually described in the context of a detailed review protocol).
  • Searching for eligible studies using multiple databases and information sources, including grey literature sources, without any language restrictions.
  • Selecting studies, extracting data, and assessing risk of bias in a duplicate manner using two independent reviewers to avoid random or systematic errors in the process.
  • Analyzing data using quantitative or qualitative methods.
  • Presenting results in summary of findings tables.
  • Interpreting results and drawing conclusions.

Many systematic reviews, but not all, use statistical methods to combine the results of independent studies into a single quantitative estimate or summary effect size. Known as meta-analyses , these reviews use specific data extraction and statistical techniques (e.g., network, frequentist, or Bayesian meta-analyses) to calculate from each study by outcome of interest an effect size along with a confidence interval that reflects the degree of uncertainty behind the point estimate of effect ( Borenstein, Hedges, Higgins, & Rothstein, 2009 ; Deeks, Higgins, & Altman, 2008 ). Subsequently, they use fixed or random-effects analysis models to combine the results of the included studies, assess statistical heterogeneity, and calculate a weighted average of the effect estimates from the different studies, taking into account their sample sizes. The summary effect size is a value that reflects the average magnitude of the intervention effect for a particular outcome of interest or, more generally, the strength of a relationship between two variables across all studies included in the systematic review. By statistically combining data from multiple studies, meta-analyses can create more precise and reliable estimates of intervention effects than those derived from individual studies alone, when these are examined independently as discrete sources of information.
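
The pooling step can be sketched in a few lines. The example below is a minimal illustration, not the method of any review cited here: it assumes inverse-variance weights (larger studies usually have smaller variances, so they receive more weight), uses Cochran's Q and I² to describe statistical heterogeneity, and uses the DerSimonian-Laird estimator for the between-study variance in the random-effects model. All function names and the effect sizes are hypothetical.

```python
import math


def pool_fixed(effects, variances):
    """Inverse-variance fixed-effect estimate and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, 1.0 / total


def heterogeneity(effects, variances):
    """Cochran's Q, I^2 (percent), and DerSimonian-Laird tau^2."""
    weights = [1.0 / v for v in variances]
    pooled, _ = pool_fixed(effects, variances)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, i2, tau2


def pool_random(effects, variances):
    """Random-effects estimate: add tau^2 to each study's variance."""
    _, _, tau2 = heterogeneity(effects, variances)
    return pool_fixed(effects, [v + tau2 for v in variances])


def ci95(estimate, variance):
    """Approximate 95% confidence interval under a normal assumption."""
    half_width = 1.96 * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


# Hypothetical effect sizes (e.g., log risk ratios) and their variances.
effects = [0.2, 0.4]
variances = [0.01, 0.01]
fixed_estimate, fixed_variance = pool_fixed(effects, variances)
random_estimate, random_variance = pool_random(effects, variances)
low, high = ci95(random_estimate, random_variance)
```

With these inputs both models give the same point estimate, but the random-effects variance is larger because it also reflects between-study variation, which widens the confidence interval.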

The review by Gurol-Urganci, de Jongh, Vodopivec-Jamsek, Atun, and Car (2013) on the effects of mobile phone messaging reminders for attendance at healthcare appointments is an illustrative example of a high-quality systematic review with meta-analysis. Missed appointments are a major cause of inefficiency in healthcare delivery with substantial monetary costs to health systems. These authors sought to assess whether mobile phone-based appointment reminders delivered through Short Message Service (SMS) or Multimedia Messaging Service (MMS) are effective in improving rates of patient attendance and reducing overall costs. To this end, they conducted a comprehensive search on multiple databases using highly sensitive search strategies without language or publication-type restrictions to identify all RCTs that are eligible for inclusion. In order to minimize the risk of omitting eligible studies not captured by the original search, they supplemented all electronic searches with manual screening of trial registers and references contained in the included studies. Study selection, data extraction, and risk of bias assessments were performed independently by two coders using standardized methods to ensure consistency and to eliminate potential errors. Findings from eight RCTs involving 6,615 participants were pooled into meta-analyses to calculate the magnitude of effects that mobile text message reminders have on the rate of attendance at healthcare appointments compared to no reminders and phone call reminders.

Meta-analyses are regarded as powerful tools for deriving meaningful conclusions. However, there are situations in which it is neither reasonable nor appropriate to pool studies together using meta-analytic methods simply because there is extensive clinical heterogeneity between the included studies or variation in measurement tools, comparisons, or outcomes of interest. In these cases, systematic reviews can use qualitative synthesis methods such as vote counting, content analysis, classification schemes and tabulations, as an alternative approach to narratively synthesize the results of the independent studies included in the review. This form of review is known as qualitative systematic review.

A rigorous example of one such review in the eHealth domain is presented by Mickan, Atherton, Roberts, Heneghan, and Tilson (2014) on the use of handheld computers by healthcare professionals and their impact on access to information and clinical decision-making. In line with the methodological guidelines for systematic reviews, these authors: (a) developed and registered with PROSPERO (www.crd.york.ac.uk/prospero/) an a priori review protocol; (b) conducted comprehensive searches for eligible studies using multiple databases and other supplementary strategies (e.g., forward searches); and (c) subsequently carried out study selection, data extraction, and risk of bias assessments in a duplicate manner to eliminate potential errors in the review process. Heterogeneity between the included studies in terms of reported outcomes and measures precluded the use of meta-analytic methods. Consequently, the authors resorted to using narrative analysis and synthesis to describe the effectiveness of handheld computers on accessing information for clinical knowledge, adherence to safety and clinical quality guidelines, and diagnostic decision-making.

In recent years, the number of systematic reviews in the field of health informatics has increased considerably. Systematic reviews with discordant findings can cause great confusion and make it difficult for decision-makers to interpret the review-level evidence ( Moher, 2013 ). Therefore, there is a growing need for appraisal and synthesis of prior systematic reviews to ensure that decision-making is constantly informed by the best available accumulated evidence. Umbrella reviews , also known as overviews of systematic reviews, are tertiary types of evidence synthesis that aim to accomplish this; that is, they aim to compare and contrast findings from multiple systematic reviews and meta-analyses ( Becker & Oxman, 2008 ). Umbrella reviews generally adhere to the same principles and rigorous methodological guidelines used in systematic reviews. However, the unit of analysis in umbrella reviews is the systematic review rather than the primary study ( Becker & Oxman, 2008 ). Unlike systematic reviews that have a narrow focus of inquiry, umbrella reviews focus on broader research topics for which there are several potential interventions ( Smith, Devane, Begley, & Clarke, 2011 ). A recent umbrella review on the effects of home telemonitoring interventions for patients with heart failure critically appraised, compared, and synthesized evidence from 15 systematic reviews to investigate which types of home telemonitoring technologies and forms of interventions are more effective in reducing mortality and hospital admissions ( Kitsiou, Paré, & Jaana, 2015 ).

9.3.5. Realist Reviews

Realist reviews are theory-driven interpretative reviews developed to inform, enhance, or supplement conventional systematic reviews by making sense of heterogeneous evidence about complex interventions applied in diverse contexts in a way that informs policy decision-making ( Greenhalgh, Wong, Westhorp, & Pawson, 2011 ). They originated from criticisms of positivist systematic reviews which centre on their “simplistic” underlying assumptions ( Oates, 2011 ). As explained above, systematic reviews seek to identify causation. Such logic is appropriate for fields like medicine and education where findings of randomized controlled trials can be aggregated to see whether a new treatment or intervention does improve outcomes. However, many argue that it is not possible to establish such direct causal links between interventions and outcomes in fields such as social policy, management, and information systems where for any intervention there is unlikely to be a regular or consistent outcome ( Oates, 2011 ; Pawson, 2006 ; Rousseau, Manning, & Denyer, 2008 ).

To circumvent these limitations, Pawson, Greenhalgh, Harvey, and Walshe (2005) have proposed a new approach for synthesizing knowledge that seeks to unpack the mechanism of how “complex interventions” work in particular contexts. The basic research question — what works? — which is usually associated with systematic reviews changes to: what is it about this intervention that works, for whom, in what circumstances, in what respects and why? Realist reviews have no particular preference for either quantitative or qualitative evidence. As a theory-building approach, a realist review usually starts by articulating likely underlying mechanisms and then scrutinizes available evidence to find out whether and where these mechanisms are applicable ( Shepperd et al., 2009 ). Primary studies found in the extant literature are viewed as case studies which can test and modify the initial theories ( Rousseau et al., 2008 ).

The main objective pursued in the realist review conducted by Otte-Trojel, de Bont, Rundall, and van de Klundert (2014) was to examine how patient portals contribute to health service delivery and patient outcomes. The specific goals were to investigate how outcomes are produced and, most importantly, how variations in outcomes can be explained. The research team started with an exploratory review of background documents and research studies to identify ways in which patient portals may contribute to health service delivery and patient outcomes. The authors identified six main ways which represent “educated guesses” to be tested against the data in the evaluation studies. These studies were identified through a formal and systematic search in four databases between 2003 and 2013. Two members of the research team selected the articles using a pre-established list of inclusion and exclusion criteria and following a two-step procedure. The authors then extracted data from the selected articles and created several tables, one for each outcome category. They organized information to bring forward those mechanisms where patient portals contribute to outcomes and the variation in outcomes across different contexts.

9.3.6. Critical Reviews

Lastly, critical reviews aim to provide a critical evaluation and interpretive analysis of existing literature on a particular topic of interest to reveal strengths, weaknesses, contradictions, controversies, inconsistencies, and/or other important issues with respect to theories, hypotheses, research methods or results ( Baumeister & Leary, 1997 ; Kirkevold, 1997 ). Unlike other review types, critical reviews attempt to take a reflective account of the research that has been done in a particular area of interest, and assess its credibility by using appraisal instruments or critical interpretive methods. In this way, critical reviews attempt to constructively inform other scholars about the weaknesses of prior research and strengthen knowledge development by giving focus and direction to studies for further improvement ( Kirkevold, 1997 ).

Kitsiou, Paré, and Jaana (2013) provide an example of a critical review that assessed the methodological quality of prior systematic reviews of home telemonitoring studies for chronic patients. The authors conducted a comprehensive search on multiple databases to identify eligible reviews and subsequently used a validated instrument to conduct an in-depth quality appraisal. Results indicate that the majority of systematic reviews in this particular area suffer from important methodological flaws and biases that impair their internal validity and limit their usefulness for clinical and decision-making purposes. To this end, they provide a number of recommendations to strengthen knowledge development towards improving the design and execution of future reviews on home telemonitoring.

9.4. Summary

Table 9.1 outlines the main types of literature reviews that were described in the previous sub-sections and summarizes the main characteristics that distinguish one review type from another. It also includes key references to methodological guidelines and useful sources that can be used by eHealth scholars and researchers for planning and developing reviews.

Table 9.1. Typology of Literature Reviews (adapted from Paré et al., 2015).

As shown in Table 9.1, each review type addresses different kinds of research questions or objectives, which subsequently define and dictate the methods and approaches that need to be used to achieve the overarching goal(s) of the review. For example, in the case of narrative reviews, there is greater flexibility in searching and synthesizing articles (Green et al., 2006). Researchers are often relatively free to use a diversity of approaches to search, identify, and select relevant scientific articles, describe their operational characteristics, present how the individual studies fit together, and formulate conclusions. On the other hand, systematic reviews are characterized by their high level of systematicity, rigour, and use of explicit methods, based on an “a priori” review plan that aims to minimize bias in the analysis and synthesis process (Higgins & Green, 2008). Some reviews are exploratory in nature (e.g., scoping/mapping reviews), whereas others may be conducted to discover patterns (e.g., descriptive reviews) or involve a synthesis approach that may include the critical analysis of prior research (Paré et al., 2015). Hence, to select the most appropriate type of review, it is critical to know, before embarking on a review project, why the research synthesis is being conducted and which methods are best aligned with the pursued goals.

9.5. Concluding Remarks

In light of the increased use of evidence-based practice and research generating stronger evidence (Grady et al., 2011; Lyden et al., 2013), review articles have become essential tools for summarizing, synthesizing, integrating or critically appraising prior knowledge in the eHealth field. As mentioned earlier, when rigorously conducted, review articles represent powerful information sources for eHealth scholars and practitioners looking for state-of-the-art evidence. The typology of literature reviews we used herein will allow eHealth researchers, graduate students and practitioners to gain a better understanding of the similarities and differences between review types.

We must stress that this classification scheme does not privilege any specific type of review as being of higher quality than another ( Paré et al., 2015 ). As explained above, each type of review has its own strengths and limitations. Having said that, we realize that the methodological rigour of any review — be it qualitative, quantitative or mixed — is a critical aspect that should be considered seriously by prospective authors. In the present context, the notion of rigour refers to the reliability and validity of the review process described in section 9.2. For one thing, reliability is related to the reproducibility of the review process and steps, which is facilitated by a comprehensive documentation of the literature search process, extraction, coding and analysis performed in the review. Whether the search is comprehensive or not, whether it involves a methodical approach for data extraction and synthesis or not, it is important that the review documents in an explicit and transparent manner the steps and approach that were used in the process of its development. Next, validity characterizes the degree to which the review process was conducted appropriately. It goes beyond documentation and reflects decisions related to the selection of the sources, the search terms used, the period of time covered, the articles selected in the search, and the application of backward and forward searches ( vom Brocke et al., 2009 ). In short, the rigour of any review article is reflected by the explicitness of its methods (i.e., transparency) and the soundness of the approach used. We refer those interested in the concepts of rigour and quality to the work of Templier and Paré (2015) which offers a detailed set of methodological guidelines for conducting and evaluating various types of review articles.

To conclude, our main objective in this chapter was to demystify the various types of literature reviews that are central to the continuous development of the eHealth field. It is our hope that our descriptive account will serve as a valuable source for those conducting, evaluating or using reviews in this important and growing domain.

  • Ammenwerth E., de Keizer N. An inventory of evaluation studies of information technology in health care. Trends in evaluation research, 1982-2002. International Journal of Medical Informatics. 2004; 44 (1):44–56. [ PubMed : 15778794 ]
  • Anderson S., Allen P., Peckham S., Goodwin N. Asking the right questions: scoping studies in the commissioning of research on the organisation and delivery of health services. Health Research Policy and Systems. 2008; 6 (7):1–12. [ PMC free article : PMC2500008 ] [ PubMed : 18613961 ] [ CrossRef ]
  • Archer N., Fevrier-Thomas U., Lokker C., McKibbon K. A., Straus S.E. Personal health records: a scoping review. Journal of American Medical Informatics Association. 2011; 18 (4):515–522. [ PMC free article : PMC3128401 ] [ PubMed : 21672914 ]
  • Arksey H., O’Malley L. Scoping studies: towards a methodological framework. International Journal of Social Research Methodology. 2005; 8 (1):19–32.
  • A systematic, tool-supported method for conducting literature reviews in information systems. Paper presented at the Proceedings of the 19th European Conference on Information Systems (ECIS 2011); June 9 to 11; Helsinki, Finland. 2011.
  • Baumeister R. F., Leary M.R. Writing narrative literature reviews. Review of General Psychology. 1997; 1 (3):311–320.
  • Becker L. A., Oxman A.D. In: Cochrane handbook for systematic reviews of interventions. Higgins J. P. T., Green S., editors. Hoboken, NJ: John Wiley & Sons, Ltd; 2008. Overviews of reviews; pp. 607–631.
  • Borenstein M., Hedges L., Higgins J., Rothstein H. Introduction to meta-analysis. Hoboken, NJ: John Wiley & Sons Inc; 2009.
  • Cook D. J., Mulrow C. D., Haynes B. Systematic reviews: Synthesis of best evidence for clinical decisions. Annals of Internal Medicine. 1997; 126 (5):376–380. [ PubMed : 9054282 ]
  • Cooper H., Hedges L.V. In: The handbook of research synthesis and meta-analysis. 2nd ed. Cooper H., Hedges L. V., Valentine J. C., editors. New York: Russell Sage Foundation; 2009. Research synthesis as a scientific process; pp. 3–17.
  • Cooper H. M. Organizing knowledge syntheses: A taxonomy of literature reviews. Knowledge in Society. 1988; 1 (1):104–126.
  • Cronin P., Ryan F., Coughlan M. Undertaking a literature review: a step-by-step approach. British Journal of Nursing. 2008; 17 (1):38–43. [ PubMed : 18399395 ]
  • Darlow S., Wen K.Y. Development testing of mobile health interventions for cancer patient self-management: A review. Health Informatics Journal. 2015 (online before print). [ PubMed : 25916831 ] [ CrossRef ]
  • Daudt H. M., van Mossel C., Scott S.J. Enhancing the scoping study methodology: a large, inter-professional team’s experience with Arksey and O’Malley’s framework. BMC Medical Research Methodology. 2013; 13 :48. [ PMC free article : PMC3614526 ] [ PubMed : 23522333 ] [ CrossRef ]
  • Davies P. The relevance of systematic reviews to educational policy and practice. Oxford Review of Education. 2000; 26 (3-4):365–378.
  • Deeks J. J., Higgins J. P. T., Altman D.G. In: Cochrane handbook for systematic reviews of interventions. Higgins J. P. T., Green S., editors. Hoboken, NJ: John Wiley & Sons, Ltd; 2008. Analysing data and undertaking meta-analyses; pp. 243–296.
  • Deshazo J. P., Lavallie D. L., Wolf F.M. Publication trends in the medical informatics literature: 20 years of “Medical Informatics” in MeSH. BMC Medical Informatics and Decision Making. 2009; 9 :7. [ PMC free article : PMC2652453 ] [ PubMed : 19159472 ] [ CrossRef ]
  • Dixon-Woods M., Agarwal S., Jones D., Young B., Sutton A. Synthesising qualitative and quantitative evidence: a review of possible methods. Journal of Health Services Research and Policy. 2005; 10 (1):45–53. [ PubMed : 15667704 ]
  • Finfgeld-Connett D., Johnson E.D. Literature search strategies for conducting knowledge-building and theory-generating qualitative systematic reviews. Journal of Advanced Nursing. 2013; 69 (1):194–204. [ PMC free article : PMC3424349 ] [ PubMed : 22591030 ]
  • Grady B., Myers K. M., Nelson E. L., Belz N., Bennett L., Carnahan L. … Guidelines Working Group. Evidence-based practice for telemental health. Telemedicine Journal and E Health. 2011; 17 (2):131–148. [ PubMed : 21385026 ]
  • Green B. N., Johnson C. D., Adams A. Writing narrative literature reviews for peer-reviewed journals: secrets of the trade. Journal of Chiropractic Medicine. 2006; 5 (3):101–117. [ PMC free article : PMC2647067 ] [ PubMed : 19674681 ]
  • Greenhalgh T., Wong G., Westhorp G., Pawson R. Protocol–realist and meta-narrative evidence synthesis: evolving standards (RAMESES). BMC Medical Research Methodology. 2011; 11 :115. [ PMC free article : PMC3173389 ] [ PubMed : 21843376 ]
  • Gurol-Urganci I., de Jongh T., Vodopivec-Jamsek V., Atun R., Car J. Mobile phone messaging reminders for attendance at healthcare appointments. Cochrane Database System Review. 2013; 12 :CD007458. [ PMC free article : PMC6485985 ] [ PubMed : 24310741 ] [ CrossRef ]
  • Hart C. Doing a literature review: Releasing the social science research imagination. London: SAGE Publications; 1998.
  • Higgins J. P. T., Green S., editors. Cochrane handbook for systematic reviews of interventions: Cochrane book series. Hoboken, NJ: Wiley-Blackwell; 2008.
  • Jesson J., Matheson L., Lacey F.M. Doing your literature review: traditional and systematic techniques. Los Angeles & London: SAGE Publications; 2011.
  • King W. R., He J. Understanding the role and methods of meta-analysis in IS research. Communications of the Association for Information Systems. 2005; 16 :1.
  • Kirkevold M. Integrative nursing research — an important strategy to further the development of nursing science and nursing practice. Journal of Advanced Nursing. 1997; 25 (5):977–984. [ PubMed : 9147203 ]
  • Kitchenham B., Charters S. EBSE Technical Report Version 2.3. Keele & Durham, UK: Keele University & University of Durham; 2007. Guidelines for performing systematic literature reviews in software engineering.
  • Kitsiou S., Paré G., Jaana M. Systematic reviews and meta-analyses of home telemonitoring interventions for patients with chronic diseases: a critical assessment of their methodological quality. Journal of Medical Internet Research. 2013; 15 (7):e150. [ PMC free article : PMC3785977 ] [ PubMed : 23880072 ]
  • Kitsiou S., Paré G., Jaana M. Effects of home telemonitoring interventions on patients with chronic heart failure: an overview of systematic reviews. Journal of Medical Internet Research. 2015; 17 (3):e63. [ PMC free article : PMC4376138 ] [ PubMed : 25768664 ]
  • Levac D., Colquhoun H., O’Brien K. K. Scoping studies: advancing the methodology. Implementation Science. 2010; 5 (1):69. [ PMC free article : PMC2954944 ] [ PubMed : 20854677 ]
  • Levy Y., Ellis T.J. A systems approach to conduct an effective literature review in support of information systems research. Informing Science. 2006; 9 :181–211.
  • Liberati A., Altman D. G., Tetzlaff J., Mulrow C., Gøtzsche P. C., Ioannidis J. P. A. et al. Moher D. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: Explanation and elaboration. Annals of Internal Medicine. 2009; 151 (4):W-65. [ PubMed : 19622512 ]
  • Lyden J. R., Zickmund S. L., Bhargava T. D., Bryce C. L., Conroy M. B., Fischer G. S. et al. McTigue K. M. Implementing health information technology in a patient-centered manner: Patient experiences with an online evidence-based lifestyle intervention. Journal for Healthcare Quality. 2013; 35 (5):47–57. [ PubMed : 24004039 ]
  • Mickan S., Atherton H., Roberts N. W., Heneghan C., Tilson J.K. Use of handheld computers in clinical practice: a systematic review. BMC Medical Informatics and Decision Making. 2014; 14 :56. [ PMC free article : PMC4099138 ] [ PubMed : 24998515 ]
  • Moher D. The problem of duplicate systematic reviews. British Medical Journal. 2013; 347 (5040) [ PubMed : 23945367 ] [ CrossRef ]
  • Montori V. M., Wilczynski N. L., Morgan D., Haynes R. B., Hedges T. Systematic reviews: a cross-sectional study of location and citation counts. BMC Medicine. 2003; 1 :2. [ PMC free article : PMC281591 ] [ PubMed : 14633274 ]
  • Mulrow C. D. The medical review article: state of the science. Annals of Internal Medicine. 1987; 106 (3):485–488. [ PubMed : 3813259 ] [ CrossRef ]
  • Evidence-based information systems: A decade later. Proceedings of the European Conference on Information Systems; 2011. Retrieved from http://aisel.aisnet.org/cgi/viewcontent.cgi?article=1221&context=ecis2011
  • Okoli C., Schabram K. A guide to conducting a systematic literature review of information systems research. SSRN Electronic Journal. 2010.
  • Otte-Trojel T., de Bont A., Rundall T. G., van de Klundert J. How outcomes are achieved through patient portals: a realist review. Journal of American Medical Informatics Association. 2014; 21 (4):751–757. [ PMC free article : PMC4078283 ] [ PubMed : 24503882 ]
  • Paré G., Trudel M.-C., Jaana M., Kitsiou S. Synthesizing information systems knowledge: A typology of literature reviews. Information & Management. 2015; 52 (2):183–199.
  • Patsopoulos N. A., Analatos A. A., Ioannidis J.P. A. Relative citation impact of various study designs in the health sciences. Journal of the American Medical Association. 2005; 293 (19):2362–2366. [ PubMed : 15900006 ]
  • Paul M. M., Greene C. M., Newton-Dame R., Thorpe L. E., Perlman S. E., McVeigh K. H., Gourevitch M.N. The state of population health surveillance using electronic health records: A narrative review. Population Health Management. 2015; 18 (3):209–216. [ PubMed : 25608033 ]
  • Pawson R. Evidence-based policy: a realist perspective. London: SAGE Publications; 2006.
  • Pawson R., Greenhalgh T., Harvey G., Walshe K. Realist review—a new method of systematic review designed for complex policy interventions. Journal of Health Services Research & Policy. 2005; 10 (Suppl 1):21–34. [ PubMed : 16053581 ]
  • Petersen K., Vakkalanka S., Kuzniarz L. Guidelines for conducting systematic mapping studies in software engineering: An update. Information and Software Technology. 2015; 64 :1–18.
  • Petticrew M., Roberts H. Systematic reviews in the social sciences: A practical guide. Malden, MA: Blackwell Publishing Co; 2006.
  • Rousseau D. M., Manning J., Denyer D. Evidence in management and organizational science: Assembling the field’s full weight of scientific knowledge through syntheses. The Academy of Management Annals. 2008; 2 (1):475–515.
  • Rowe F. What literature review is not: diversity, boundaries and recommendations. European Journal of Information Systems. 2014; 23 (3):241–255.
  • Shea B. J., Hamel C., Wells G. A., Bouter L. M., Kristjansson E., Grimshaw J. et al. Boers M. AMSTAR is a reliable and valid measurement tool to assess the methodological quality of systematic reviews. Journal of Clinical Epidemiology. 2009; 62 (10):1013–1020. [ PubMed : 19230606 ]
  • Shepperd S., Lewin S., Straus S., Clarke M., Eccles M. P., Fitzpatrick R. et al. Sheikh A. Can we systematically review studies that evaluate complex interventions? PLoS Medicine. 2009; 6 (8):e1000086. [ PMC free article : PMC2717209 ] [ PubMed : 19668360 ]
  • Silva B. M., Rodrigues J. J., de la Torre Díez I., López-Coronado M., Saleem K. Mobile-health: A review of current state in 2015. Journal of Biomedical Informatics. 2015; 56 :265–272. [ PubMed : 26071682 ]
  • Smith V., Devane D., Begley C., Clarke M. Methodology in conducting a systematic review of systematic reviews of healthcare interventions. BMC Medical Research Methodology. 2011; 11 (1):15. [ PMC free article : PMC3039637 ] [ PubMed : 21291558 ]
  • Sylvester A., Tate M., Johnstone D. Beyond synthesis: re-presenting heterogeneous research literature. Behaviour & Information Technology. 2013; 32 (12):1199–1215.
  • Templier M., Paré G. A framework for guiding and evaluating literature reviews. Communications of the Association for Information Systems. 2015; 37 (6):112–137.
  • Thomas J., Harden A. Methods for the thematic synthesis of qualitative research in systematic reviews. BMC Medical Research Methodology. 2008; 8 (1):45. [ PMC free article : PMC2478656 ] [ PubMed : 18616818 ]
  • Reconstructing the giant: on the importance of rigour in documenting the literature search process. Paper presented at the Proceedings of the 17th European Conference on Information Systems (ECIS 2009); Verona, Italy. 2009.
  • Webster J., Watson R.T. Analyzing the past to prepare for the future: Writing a literature review. Management Information Systems Quarterly. 2002; 26 (2):11.
  • Whitlock E. P., Lin J. S., Chou R., Shekelle P., Robinson K.A. Using existing systematic reviews in complex systematic reviews. Annals of Internal Medicine. 2008; 148 (10):776–782. [ PubMed : 18490690 ]

This publication is licensed under a Creative Commons License, Attribution-Noncommercial 4.0 International License (CC BY-NC 4.0): see https://creativecommons.org/licenses/by-nc/4.0/

  • Cite this Page Paré G, Kitsiou S. Chapter 9 Methods for Literature Reviews. In: Lau F, Kuziemsky C, editors. Handbook of eHealth Evaluation: An Evidence-based Approach [Internet]. Victoria (BC): University of Victoria; 2017 Feb 27.

Serious games in high-stakes assessment contexts: a systematic literature review into the game design principles for valid game-based performance assessment

  • Research Article
  • Open access
  • Published: 08 April 2024


  • Aranka Bijl (ORCID: orcid.org/0000-0001-5745-1396),
  • Bernard P. Veldkamp,
  • Saskia Wools &
  • Sebastiaan de Klerk

The systematic literature review (1) investigates whether ‘serious games’ provide a viable solution to the limitations posed by traditional high-stakes performance assessments and (2) aims to synthesize game design principles for the game-based performance assessment of professional competencies. In total, 56 publications were included in the final review, targeting knowledge, motor skills and cognitive skills and further narrowed down to teaching, training or assessing professional competencies. Our review demonstrates that serious games are able to provide an environment and task authentic to the target competency. Collected in-game behaviors indicate that serious games are able to elicit behavior that is related to a candidates’ ability level. Progress feedback and freedom of gameplay in serious games can be implemented to provide an engaging and enjoyable environment for candidates. Few studies examined adaptivity and some examined serious games without an authentic environment or task. Overall, the review gives an overview of game design principles for game-based performance assessment. It highlights two research gaps regarding authenticity and adaptivity and concludes with three implications for practice.


In the years since their first introduction (ca. 1950s), videogames have only increased in popularity. In education, videogames are already widely applied as tools to support students in learning (cf. Boyle et al., 2016 ; Ifenthaler et al., 2012 ; Young et al., 2012 ). In contrast, less research has been done on the use of videogames as summative assessment environments, even though administering (high-stakes) summative assessments through games has several advantages.

First, videogames can be used to administer standardized assessments that provide richer data about candidate ability in comparison to traditional standardized assessments (e.g., multiple-choice tests; Schwartz & Arena, 2013 ; Shaffer & Gee, 2012 ; Shute & Rahimi, 2021 ). Second, assessment through videogames gives considerable freedom in recreating real-life criterion situations, which allows for authentic, situated assessment even when this is not feasible in the real working environment (Bell et al., 2008 ; Dörner et al., 2016 ; Fonteneau et al., 2020 ; Harteveld, 2011 ; Kirriemur & McFarlane, 2004 ; Michael & Chen, 2006 ). Third, videogames can offer candidates a more enjoyable test experience by providing an engaging environment where they are given a high degree of autonomy (Boyle et al., 2012 ; Jones, 1998 ; Mavridis & Tsiatsos, 2017 ). Finally, videogames allow for assessment through in-game behaviors (i.e., stealth assessment), which intends to make assessment less salient for candidates and lets them retain engagement (Shute & Ke, 2012 ; Shute et al., 2009 ).

The benefits above highlight why videogames are viable assessment environments, irrespective of the specific level of cognitive achievement (e.g., those depicted in Bloom’s revised taxonomy; Krathwohl, 2002 ). Moreover, the possibility of immersing candidates in complex, situated contexts makes them especially interesting for higher-order learning outcomes such as problem solving and critical thinking (Dede, 2009 ; Shute & Ke, 2012 ). Therefore, videogames may provide a solution to the validity threats associated with traditional high-stakes performance assessments: an assessment type to evaluate competencies through a construct-relevant task in the context for which it is intended (Lane & Stone, 2006 ; Messick, 1994 ; Stecher, 2010 ), often used for the purpose of vocational certification.

The first validity threat associated with high-stakes performance assessments is the prevalence of test anxiety among candidates (Lane & Stone, 2006 ; Messick, 1994 ; Stecher, 2010 ), which is shown to be negatively correlated to test performance (von der Embse et al., 2018 ; von der Embse & Witmer, 2014 ). Although some debate exists about the causal relationship between the two (Jerrim, 2022 ; von der Embse et al., 2018 ), it is apparent that candidates who experience test anxiety are unfairly disadvantaged in high-stakes assessment contexts.

The second threat identified is caused by a need for high-stakes performance assessment both to be standardized to ensure objectivity and fairness (AERA et al., 2014 ; Kane, 2006 ) and to include a construct-relevant task (e.g., writing an essay, participating in a roleplay; Lane & Stone, 2006 ; Messick, 1994 ). While neither requirement rules out adaptivity (e.g., adaptive testing and open-ended assessments), the combination often restricts us to using a linear performance task that is not adaptable to candidate ability level. The potential mismatch between task difficulty and the ability level of candidates poses two disadvantages. First, the mismatch can frustrate candidates, which negatively affects their test performance (Wainer, 2000 ). Second, candidates likely receive fewer tasks that align with their ability level, which negatively affects test reliability and efficiency (Burr et al., 2023 ). High-stakes performance assessments would thus benefit from adaptive testing that is personalized and appropriately difficult, allowing candidates to be challenged enough to retain engagement (Burr et al., 2023 ; Malone & Lepper, 1987 ; Van Eck, 2006 ) while assessors are able to determine efficiently and reliably whether the candidate is at the required level (Burr et al., 2023 ; Davey, 2011 ). Additionally, adaptive testing allows for more personalized (end-of-assessment) feedback that could further boost candidate performance (Burr et al., 2023 ; Martin & Lazendic, 2018 ).

The third threat identified in high-stakes performance assessment is a lack of assessment authenticity. Logically, assessment would be administered best in the authentic context (i.e., the workplace in the case of professional competencies). This leads to a high degree of fidelity: how closely the assessment environment mirrors reality (Alessi, 1988, as cited in Gulikers et al., 2004 ). Unfortunately, this is not attainable for competencies that are dangerous or unethical to carry out (Bell et al., 2008 ; Williams-Bell et al., 2015 ). Another concern is that workplace assessments depend largely on the specific workplace in which they are carried out. This would lead to considerable variation between candidates, not only in testing conditions but also in the construct relevance of the tasks they are evaluated on (Baartman & Gulikers, 2017 ). Because authenticity of physical context and authenticity of task are two dimensions required for mobilizing the competencies of interest (Gulikers et al., 2004 ), there is a need to achieve authenticity in other ways. Authenticity is also related to transfer: applying what is learned to new contexts. The higher the alignment between assessment and reality, the more likely it is that competence transfers to professional practice.

The fourth threat identified is inconsistency between raters in scoring candidate performance. Traditional high-stakes performance assessments are often accompanied by rubrics to evaluate candidate performance; however, inconsistencies in how rubrics are interpreted and used lead to construct-irrelevant variance (Lane & Stone, 2006 ; Wools et al., 2010 ). In this study, the aim is to investigate whether ‘serious games’ (SGs), defined as those “used for purposes other than mere entertainment” (Susi et al., 2007 ; p. 1), provide a viable solution to this and the other limitations posed by traditional high-stakes performance assessments.

The most important characteristic of games is that they are played with a clear goal in mind. Many games have a predetermined goal, but other games allow players to define their own objectives (Charsky, 2010 ; Prensky, 2001 ). Goals are given structure by the provision of rules, choices, and feedback (Lameras et al., 2017 ). First, rules direct players towards the goal by placing restrictions on gameplay (Charsky, 2010 ). Second, choices enable players to make decisions, for example to choose between different strategies to attain the goal (Charsky, 2010 ). The extent to which rules restrict gameplay is closely related to the choices players have in the game (Charsky, 2010 ). Thus, rules and choices seem to be on two ends of a continuum that determines the linearity of a game. Linearity is defined as the extent to which players are given freedom of gameplay (Kim & Shute, 2015 ; Rouse, 2004 ). The third characteristic, feedback, is a well-studied topic in the field of education. In education, the main purpose of feedback is to help students gain insight into their learning and bring their understanding to the level of the learning goals (Hattie & Timperley, 2007 ; Shute, 2008 ; van der Kleij et al., 2012 ). In games, feedback is used in a similar way to guide players towards the goal, as well as to facilitate interactivity (Prensky, 2001 ). Feedback in games is provided in many modalities and gives players information about how they are progressing and where they stand with regard to the goal, for instance whether their actions have brought them closer to the goal or further away. Games are made up of a collection of game mechanics that define the game and determine how it is played (Rouse, 2004 ; Schell, 2015 ). In other words, game mechanics are how the defining features of games are translated into gameplay. To illustrate, game mechanics that provide feedback to players can include hints, gaining or losing lives, progress bars, dashboards, currencies and/or progress trees (Lameras et al., 2017 ).

When designing a game-based performance assessment, two things should be considered carefully for each professional competency: the information that should be collected about candidates to inform competence, and the design of the tasks that fulfill this information need. One way to do so is through the use of the evidence-centered design (ECD) framework (cf. Mislevy & Riconscente, 2006 ). The ECD framework is a systematic approach to test development that relies on evidentiary arguments to move from a candidate’s behavior on a task to inferences about candidate ability. It is beyond the scope of the current study to examine the design of game content in relation to the target professional competencies. In this systematic literature review, the aim is to determine which game mechanics could help overcome the validity threats associated with high-stakes performance assessments and are suitable for use in such assessments.

Previous research on game design has been done for instructional SGs (e.g., dos Santos & Fraternali, 2016 ; Gunter et al., 2008 ). For SGs used in high-stakes performance assessments, the potential effect of game mechanics on the validity of inferences should also be considered. For instance, choices in game design can affect correlations between in-game behavior and player ability (Kim & Shute, 2015 ). Moreover, game mechanics exist that are likely to introduce construct-irrelevant variance when used in high-stakes performance assessments. To illustrate, when direct feedback about performance (e.g., points, lives, feedback messages) is given to players, at least part of the variance in test scores would be explained by the type and amount of feedback a candidate has received.

Establishing design principles for SGs for high-stakes performance assessment is important for several reasons. First, such an overview allows future developers of such assessments to make more informed choices regarding game design. Second, combining and organizing the insights gained from the available empirical evidence advances the knowledge framework around the implementation of high-stakes performance assessment through games. Existing reviews on the use of games address learning (e.g., Boyle et al., 2016 ; Connolly et al., 2012 ; Young et al., 2012 ) or are targeted at specific professional domains (e.g., Gao et al., 2019 ; Gorbanev et al., 2018 ; Graafland et al., 2012 ; Wang et al., 2016 ). Nevertheless, a research gap remains: to our knowledge, no systematic literature review addresses the high-stakes performance assessment of professional competencies. To this end, this study begins by identifying the available literature on SGs targeted at professional competencies; then extracts the implemented game mechanics that could help to overcome the validity threats associated with high-stakes performance assessment; and finally synthesizes game design principles for game-based performance assessment in high-stakes contexts.

The scope of the current review is limited to professional competencies specifically catered to a vocation (e.g., construction hazard recognition). More generic professional competencies (e.g., programming) are not taken into consideration, as the context in which they are used can also fall outside of secondary vocational and higher education. Additionally, there is a growing body of literature that recognizes the potential of in-game behavior as a source of information about ability level in the context of game-based learning (e.g., Chen et al., 2020 ; Kim & Shute, 2015 ; Shute et al., 2009 ; Wang et al., 2015 ; Westera et al., 2014 ). As the relationship between in-game behavior and candidate ability is of equal importance in assessment, the scope of the current review includes SGs that focus not only on assessment, but also teaching and training of professional competencies.

The following section describes the procedure followed in conducting the current systematic literature review. First, a description of the inclusion criteria and search terms is given. This is followed by a description of the selection process and data extraction, together with an evaluation of the objectivity of the inclusion and quality criteria. Then, the search and selection results are presented, where two further categorizations of the included studies are operationalized: the type of competency and how a successful SG is defined.

Following the guidelines described in Systematic Reviews in the Social Sciences (Petticrew & Roberts, 2005 ), the protocol below gives a description and the rationale behind the review along with a description of how different studies were identified, analyzed, and synthesized.

Databases and search terms

The databases that include most publications from the field of educational measurement ( Education Resources Information Center (ERIC) , PsycInfo , Scopus , and Web of Science) were consulted for the literature search using the following search terms:

Serious game : (serious gam* or game-based assess* or game-based learn* or game-based train*) and

Quality measure : (perform* or valid* or effect* or affect*)
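As an illustration of how the two term groups above combine, the following minimal Python sketch joins the terms within each group with OR and the groups with AND. The helper names are hypothetical, and the generic Boolean syntax is an assumption: field codes and wildcard rules differ between ERIC, PsycInfo, Scopus, and Web of Science, and the exact strings submitted to each database are not reported here.

```python
# Term groups as reported in the review; helper names are hypothetical.
SERIOUS_GAME_TERMS = [
    "serious gam*",
    "game-based assess*",
    "game-based learn*",
    "game-based train*",
]
QUALITY_TERMS = ["perform*", "valid*", "effect*", "affect*"]


def or_group(terms):
    """Join the terms of one group with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*groups):
    """Join several OR-groups with AND (generic Boolean syntax, assumed)."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(SERIOUS_GAME_TERMS, QUALITY_TERMS)
print(query)
```

A publication matches only if it contains at least one term from each group, which is why the quality-measure group narrows rather than widens the result set.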

Inclusion criteria and selection process

The initial search results were narrowed down by selecting only publications that were published in English and in a scientific, peer-reviewed journal. To be included, studies were required to report on the empirical research results of a study that (1) focused on a digital SG used for teaching, training, or assessment of one or more professional competencies specific to a work setting, (2) was conducted in secondary vocational education, higher education or vocational settings, and (3) included a measure to assess the dependent variable related to the quality of the SG. Studies were excluded when the focus was on simulations; while simulations and SGs play overlapping roles in the acquisition of professional competencies, these modalities represent distinct types of digital environments.

All results from the databases were exported to Endnote X9 (The EndNote Team, 2013 ) for screening. The selection process was conducted in three rounds. First, duplicates and alternative document types (e.g., editorials, conference proceedings, letters) were removed, and the remaining publications were screened based on the titles and abstracts; publications were removed when the title or abstract mentioned features of the study incompatible with the inclusion criteria (e.g., primary school, rehabilitation, systematic literature review). Second, titles and abstracts of the remaining results were screened again. When the title or abstract lacked information, the full article was inspected. To illustrate, some titles and abstracts did not mention the target population, or whether the game was digital, or whether the professional competency was specific to a work setting. Finally, full-text articles were screened for full compliance with the inclusion criteria. Data was extracted from those publications.

The objectivity of the inclusion criteria was determined by blinded double classification on two occasions. On the first occasion, after the removal of duplicates and alternative document types, 30 randomly selected publications were independently double-classified by an expert in the field of educational measurement based on the title and abstract. An agreement rate of 93% with a Cohen’s Kappa coefficient of .81 translated to a near perfect inter-rater reliability (Landis & Koch, 1977 ). On the second occasion, a random selection of 32 publications considered for data extraction was blindly double-classified based on the full text by a master’s student in educational measurement, which resulted in an agreement rate of 97% with a near perfect Cohen’s Kappa coefficient (.94; Landis & Koch, 1977 ).
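For readers less familiar with the statistic, Cohen’s Kappa corrects the observed agreement rate p_o for the agreement p_e expected by chance: kappa = (p_o − p_e) / (1 − p_e). The Python sketch below shows the computation and the Landis and Koch (1977) interpretation bands. The function names and the toy ratings are hypothetical and do not reproduce the classification data of this review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who classified the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must classify the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items given the same class by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per class.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / n**2
    if p_e == 1:
        return 1.0  # both raters used a single, identical class throughout
    return (p_o - p_e) / (1 - p_e)


def landis_koch_label(kappa):
    """Verbal label for a kappa value following Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy data (hypothetical): ten publications, one disagreement.
rater_a = ["include"] * 5 + ["exclude"] * 5
rater_b = ["include"] * 4 + ["exclude"] * 6
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2), landis_koch_label(kappa))
```

Because kappa discounts chance agreement, a 90% raw agreement rate in the toy data yields a noticeably lower coefficient; the same effect explains why the 93% agreement reported above corresponds to a coefficient of .81.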

To assess the comprehensiveness of the systematic review and identify additional relevant studies, snowballing was conducted by backward and forward reference searching in Web of Science . For publications not available on Web of Science , snowballing was done in Scopus .

Data extraction

For the publications included, data was extracted systematically by means of a data extraction form (Supplementary Information SI1). The data extraction form includes: (1) general information, (2) details on the professional competency and research design, (3) serious game (SG) specifics and (4) a quality checklist.

The quality checklist contains 12 closed questions with three response options: the criterion is met (1), the criterion is met partly (.5), and the criterion is not met (0). Studies that scored 7 or below were considered to be of poor quality and were excluded. Studies that scored between 7.5 and 9.5 were considered to be of medium quality, while studies with scores 10 or above were considered to be of good quality (denoted with an asterisk in the data selection table; Supplementary Information SI2). These categories were determined by piloting the study quality checklist on two publications that were included, based on the inclusion criteria: one that was considered to be of a poor quality and one that was considered to be of good quality. The scores obtained by those studies were set as the lower and upper threshold, respectively.

As this systematic literature review is focused on the extraction of game mechanics to inform game design principles, all articles included in the review needed to obtain a score of at least .5 on the criterion that the game is discussed in enough detail. When publications explicitly refer to external sources for additional information, information from those sources was included in the data extraction form as well.
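The scoring rules above can be summarized in a short Python sketch. It assumes only what is stated in the text: 12 items scored 0, .5, or 1; totals of 7 or below are poor, 7.5 to 9.5 medium, and 10 or above good; and the game-description item must score at least .5. The function names and the position of the game-description item in the checklist (game_detail_index) are hypothetical.

```python
VALID_RESPONSES = {0, 0.5, 1}  # not met, partly met, met
N_ITEMS = 12


def classify_quality(item_scores):
    """Map 12 checklist scores to 'poor', 'medium', or 'good'."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"Expected {N_ITEMS} item scores.")
    if any(score not in VALID_RESPONSES for score in item_scores):
        raise ValueError("Each item must be scored 0, 0.5, or 1.")
    total = sum(item_scores)  # multiples of 0.5 are exact in floating point
    if total <= 7:
        return "poor"
    if total < 10:  # totals move in half-point steps, so this covers 7.5 to 9.5
        return "medium"
    return "good"


def is_included(item_scores, game_detail_index):
    """Apply both rules: not of poor quality, and game described in enough detail.

    game_detail_index is a hypothetical position of the game-description item.
    """
    return (classify_quality(item_scores) != "poor"
            and item_scores[game_detail_index] >= 0.5)


print(classify_quality([1] * 7 + [0.5] + [0] * 4))  # total 7.5
```

Note that a study can reach a medium or good total and still be excluded if the game-description item scores 0, which matches the separate exclusion reason reported in the selection results.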

Blinded double coding to determine the reliability of the quality criteria for inclusion was done by the same raters described above. Twenty-four randomly selected publications from the final review were included, with a varying overlap between three raters. The assigned scores were translated to the corresponding class (i.e., poor, medium, and good) to calculate the agreement rate. The rates ranged between 82 and 93%, corresponding to Cohen’s Kappa coefficients between substantial and near perfect (.66–.88; Landis & Koch, 1977 ; Table  1 ).

Search and selection results

In the PRISMA flow diagram of the publication selection process (Fig.  1 ; Moher et al., 2009 ), the two rounds in which titles and abstracts were screened for eligibility are combined. The databases were consulted on the 21st of December 2020 and yielded a total of 6,128 publications. After the removal of duplicates, 3,160 publications were left. On the basis of the inclusion criteria, another 2,981 publications were excluded from the review. In total, data was extracted from 179 publications. During the examination of the full-text articles, 129 studies were excluded due to insufficient quality (n = 42), lack of a detailed game description (n = 6), unavailability of the article (n = 5), not classifying the application as a game (n = 10) and an overall mismatch with the inclusion criteria (n = 66). In total, 50 publications were included. Snowballing was conducted in November of 2021 and resulted in the inclusion of six additional studies. In total, 56 publications were included in the final review.

Fig. 1 PRISMA flow diagram of inclusion in the systematic literature review. PRISMA: preferred reporting items for systematic reviews and meta-analyses
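The counts reported for the selection process can be checked with simple arithmetic. The Python sketch below uses only the numbers given in the text; the variable names are hypothetical, and the first-round removal count is derived here rather than reported.

```python
# Counts as reported in the search and selection results.
records_identified = 6128
remaining_after_first_round = 3160
excluded_on_inclusion_criteria = 2981
full_text_exclusions = {
    "insufficient quality": 42,
    "lack of a detailed game description": 6,
    "article unavailable": 5,
    "application not classified as a game": 10,
    "overall mismatch with inclusion criteria": 66,
}
added_by_snowballing = 6

# Derived values (the first one is not stated in the text).
removed_in_first_round = records_identified - remaining_after_first_round
data_extracted = remaining_after_first_round - excluded_on_inclusion_criteria
excluded_at_full_text = sum(full_text_exclusions.values())
included_from_databases = data_extracted - excluded_at_full_text
final_included = included_from_databases + added_by_snowballing

print(removed_in_first_round, data_extracted, excluded_at_full_text,
      included_from_databases, final_included)
```

The derived values agree with the reported 179 publications for data extraction, 129 full-text exclusions, 50 database inclusions, and 56 publications in the final review.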

Categorization of selected studies

Competency types.

Professional competencies are acquired and assessed in different ways. Given the variety of professional competencies, there is no universal game design that is likely to be beneficial across the board (Wouters et al., 2009 ). Other researchers (e.g., Young et al., 2012 ) even suggest that game design principles should not be generalized across games, contexts or competencies. While more content-related game design principles likely need to be defined per context, this review is conducted with the idea that generic game design principles exist that can be successfully used in multiple contexts. In that sense, the aim is to provide a starting point from where more context-specific SGs can be designed, for example through the use of ECD.

The review is organized according to the type of professional competency that is evaluated rather than the content of the SG under investigation, as this provides an idea of what researchers expect to train or assess within the SG. Different distinctions between competencies can be made. For example, Wouters et al. ( 2009 ) distinguish between cognitive, motor, affective, and communicative competencies. Moreover, Harteveld ( 2011 ) distinguishes between knowledge, skills, and attitudes. These taxonomies served as a basis to inductively categorize the targeted professional competencies into knowledge, motor skills, and cognitive skills.

The knowledge category includes studies that focus on, for instance, declarative knowledge (i.e., fact-based) or procedural knowledge (i.e., how to do something), such as the procedural steps involved in cardiopulmonary resuscitation (CPR). The motor skills category refers to motor behaviors (i.e., movements); for CPR, an example would be compression depth. The cognitive skills category encompasses skills such as reasoning, planning, and decision making, for example the recognition of situations that require CPR.

Successful SGs

The scope of this systematic literature review is limited to SGs that are shown to be successful in teaching, training, or the assessment of professional competencies. As research methodologies differ between studies, there is a need to define what characterizes a successful SG. An SG used in teaching or training was deemed successful when a significant improvement in the targeted professional competency was found (e.g., through an external validated measure of the competency). Some studies compared an active control group and an experimental group that additionally received an SG (e.g., Boada et al., 2015 ; Dankbaar et al., 2016 ; Graafland et al., 2017 ; see Supplementary Information SI2 for a full account): an SG was not deemed successful in the current results when the two groups showed comparable results. An SG used for assessment was deemed successful when (1) research results showed a significant relationship between the SG and a validated measure of the targeted competency, or (2) the SG was shown to accurately distinguish between different competency levels.

The studies included in the review are discussed in two ways. First, descriptives of the included studies are given in terms of the degree to which games were successful in teaching, training, or assessment of professional competencies, the professional domains, and the competency types. Then, the game mechanics associated with the potential solutions to the validity threats in traditional performance assessment are presented.

Descriptives of the included studies

The final review includes 56 studies, published between 2006 and 2020 (consult Supplementary Information SI2 for a more detailed overview). No noteworthy differences were found between the SGs that aimed to teach, train, and assess professional competencies. Therefore, the results for the SGs included in the review are presented collectively.

Serious games with successful results

Divided over the type of professional competency evaluated, 84%, 83%, and 100% reported research results showing the SG was successful for cognitive skills, knowledge, and motor skills respectively (Table  2 ). Of the studies included in the systematic review, three studies found mixed effects of the SG under investigation between competency types (i.e., Luu et al., 2020 ; Phungoen et al., 2020 ; Tan et al., 2017 ).

Professional domains and competency types

The studies included in the review can be divided over seven professional domains (Table  3 ). These are further separated into professional competencies (see Supplementary Information SI2 for a full account). Examples include history taking (Alyami et al., 2019 ), crisis management (Steinrücke et al., 2020 ) and cultural understanding (Brown et al., 2018 ). Furthermore, the studies included in the review can be divided into three competency types: cognitive skills (n = 21), knowledge (n = 31), and motor skills (n = 4). An important note is that some studies evaluate the SG on more than one competency type, thus the sum of these categories is greater than the total number of studies included.

Game mechanics

The following section discusses the inclusion of game mechanics—all design choices within the game—for the SGs discussed in the studies included in the review. Following the aim of the current paper, the game mechanics discussed are selected for having the potential to (1) mediate the validity threats associated with traditional performance assessments, and (2) be appropriate for implementing in a game-based performance assessment.

Authenticity

Authenticity in the SGs is divided into two dimensions: authenticity of the physical context and authenticity of the task. First, an example of a physical context that was not representative of the real working environment was found for all three competency types (Table  4 ). Regarding the SGs targeted at cognitive skills, this was the case for Effic’ Asthme (Fonteneau et al., 2020 ). In this SG, the target population (medical students) would normally manage pediatric asthma exacerbations in a hospital setting. The game environment used is, however, the virtual bedroom of a child. Regarding the SGs targeted at knowledge, Alyami et al. ( 2019 ) implemented the game Metaphoria to teach history taking content to medical students. Here, the game environment is inside a pyramid within a fantasy world. The final SG using a game environment that does not resemble the real working environment, within the motor skills competency type, was studied by Jalink et al. ( 2014 ). In this SG, laparoscopic skills are trained by having players perform tasks in an underground mining environment.

Second, of the studies for which task authenticity could be determined, all but four included an authentic task for the professional competency targeted (Table  5 ). Examples of a task that was not authentic were found for all three competency types. Two SGs that targeted cognitive skills did not include an authentic task (Brown et al., 2018 ; Chee et al., 2019 ) as a result of implementing role reversals. Within these SGs, the players played a reversed role, and thus the task did not correspond to the task in the real working environment. One SG targeting knowledge did not include an authentic task (Alyami et al., 2019 ). In Metaphoria , the task for players is to interpret visual metaphors in relation to symptoms, whereas the target professional competency was history taking content. Finally, in the SG studied by Drummond et al. ( 2017 ), targeting motor skills, the professional competency under investigation was not represented authentically within the game, as the navigation was through point-and-click.

Unobtrusive data collection

For all three competency types, studies were found that use in-game data to make inferences about player ability (Table  6 ). While other studies did mention the collection of in-game behaviors, the results were limited to those that assessed the appropriateness of using the data in the assessment of competencies.

Different measures of in-game behaviors were found. First, 12 SGs determine competency by comparing player performance to some predetermined target, sometimes also translated to a score. In the game VERITAS (Veracity Education and Reactance Instruction through Technology and Applied Skills; Miller et al., 2019 ), for instance, players are assessed on whether they accurately judge a statement given by a character in the game to be true or false. Second, seven SGs use time spent (i.e., completion time or playing time) as a measure of performance. For example, in the SG Wii Laparoscopy (Jalink et al., 2014 ), completion time is used to assess performance. This performance metric in the game showed a high correlation with performance on a validated measure for laparoscopic skills, but it should be noted that time penalties were included for mistakes made during the task. Finally, the use of log data was found in one SG targeted at cognitive skills (Steinrücke et al., 2020 ). In the Dilemma Game, in-game measures collected during gameplay were found to have promising relationships with competency levels.

In SGs, the difficulty level can be adapted in two ways: independent of the actions of players or dependent on the actions of players (Table 7). Whereas SGs that varied in difficulty level were found for professional competencies related to both knowledge and motor skills, none were found for professional competencies related to cognitive skills. Three SGs were found that adjust difficulty level based on player actions; however, none of the SGs adjusts the difficulty level down based on player actions. Three studies evaluated SGs where difficulty level was varied independent of player actions. Regarding the SGs targeted at knowledge, players either received fixed assignments (Boada et al., 2015) or were able to set the difficulty level prior to gameplay (Taillandier & Adam, 2018). The SG studied by Asadipour et al. (2017), targeting motor skills, increased challenge by building up the flying speed during the game as well as random generation of coins, but this was independent of player ability. Two SGs targeted at knowledge did mention difficulty levels, but not how they were adjusted. The SG Metaphoria (Alyami et al., 2019) included three difficulty levels. The SG Sustainability Challenge (Dib & Adamo-Villani, 2014) became more challenging as players progressed to higher levels, but it is not clear when or how this was done.
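To make the distinction concrete, the sketch below shows one possible rule for adapting difficulty dependent on player actions. It is a simplified assumption for illustration only and does not reproduce the mechanism of any reviewed SG; the function name, the three-level cap, and the all-success and all-failure thresholds are hypothetical. By default the level only increases, mirroring the finding that none of the SGs lowered difficulty; the optional allow_decrease flag shows what a downward adjustment could look like.

```python
from __future__ import annotations


def next_difficulty(level: int, recent_results: list[bool],
                    max_level: int = 3, allow_decrease: bool = False) -> int:
    """Pick the next difficulty level from recent task outcomes (assumed rule).

    Moves up one level when all recent tasks succeeded; moves down one level
    when all of them failed, but only if allow_decrease is set.
    """
    if not recent_results:
        return level  # no evidence yet, keep the current level
    if all(recent_results):
        return min(level + 1, max_level)
    if allow_decrease and not any(recent_results):
        return max(level - 1, 1)
    return level


# Hypothetical player who solved the last three tasks at level 1.
print(next_difficulty(1, [True, True, True]))
```

Difficulty that varies independent of player actions would instead ignore recent_results entirely, for example by following a fixed schedule or a level chosen before gameplay.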

Test anxiety

As described earlier, games are able to provide a more enjoyable testing experience by providing an engaging environment with a high degree of autonomy. Therefore, the ways in which the game characteristics (feedback, rules, and choices) are expressed in the studies included in the review are discussed below. To avoid confusion with linearity of assessment, the expression freedom of gameplay is used to describe the interaction between rules and choices.

First, seven examples were found where players are given feedback unrelated to performance (Table 8). Feedback was given through a dashboard (Perini et al., 2018), remaining resources (Calderón et al., 2018; Taillandier & Adam, 2018), remaining time (Calderón et al., 2018; Dankbaar et al., 2017a, 2017b; Mohan et al., 2014), or remaining tasks (Jalink et al., 2014).

Second, all but two of the studies included in the review include game mechanics that give some freedom of gameplay (Table 9). For cognitive skills and knowledge, game mechanics included the choice between multiple options (n = 14 for both), the inclusion of interactive elements (n = 8 for both), and the possibility for free exploration (n = 5 and n = 8, respectively). Two examples of customization were found: Dib and Adamo-Villani (2014) gave players the choice of avatar, whereas Alyami et al. (2019) allowed for a custom name. For the SGs that target motor skills, freedom of gameplay was given through control over the movements. For three out of four SGs in this category, special controllers were developed to give players authentic control over the movements in the game. This was not the case for Drummond et al. (2017), as their game did not explicitly train CPR; however, the researchers did assess its effect on motor skills.

Included studies

The final review included 56 studies. Of these, many reported positive results. This suggests that SGs are often successful in teaching, training, or assessing professional competencies, but could also point to a publication bias of positive results. As similar reviews to the current one (e.g., Connolly et al., 2012 ; Randel et al., 1992 ; Vansickle, 1986 ; Wouters et al., 2009 ) draw on similar databases, it is difficult to establish what is true. Some studies found mixed results for different competency types, suggesting that different approaches are warranted. Therefore, game mechanics in SGs for different competency types are discussed separately.

The review included few studies on SGs targeting motor skills compared to those targeting cognitive skills and knowledge. The low number of SGs for motor skills could be due to the need for specialized equipment to create an SG targeting motor skills. For example, Wii Laparoscopy (Jalink et al., 2014 ) is played using controllers that are specifically designed for the game. Not only does it require an extra investment, it also affects the ease of large scale implementation. There is no indication that motor skills cannot be assessed through SGs: four out of five studies have shown positive effects, both in learning effectiveness and assessment accuracy. Despite this, the benefits may only outweigh the added costs in situations where it is unfeasible to perform the professional competency in the real working environment.

Focusing on game mechanics for the authenticity of the physical context and the task, the results indicate that SGs are able to provide both. It should be noted that, while SGs are able to simulate the physical context and task with high fidelity, authenticity remains a matter of perception (Gulikers et al., 2008 ). The review focused only on those SGs that were successful when compared to validated measures of the targeted professional competency. Since these measures are considered to be accurate proxies for workplace performance, the transfer to the real working environment is likely to have been made. For all three competency types, examples were found for SGs that did not include an authentic physical context or authentic task, while still mobilizing competencies of interest. Even though the number of SGs in these categories is quite small, it does indicate that it is possible to assess professional competencies without an authentic environment or task.

The in-game measures most often used in the included SGs are those that indicate how well a player did in comparison to some standard or target. This suggests that SGs are able to elicit behavior in players that is dependent on their ability level in the target professional competency. Since the accuracy measures varied depending on the professional competency, an investigation is warranted to determine which in-game measures are indicative of ability per situation. Evidentiary frameworks such as the ECD framework can provide guidance in determining which data could be used to make inferences about candidate ability. Despite the promising results, more research should be done on the informational value of log data before claims can be made.

Some examples of studies were found in which adaptivity was implemented. In particular, some promising relationships between in-game behaviors and ability level were found. In traditional (high-stakes) testing, adaptivity has already been implemented successfully (Martin & Lazendic, 2018; Straetmans & Eggen, 2007). There are, however, professional competencies for which ability levels cannot be differentiated: one is either able to perform them or not. For such competencies, adaptivity does not have an added benefit. In contrast, for professional competencies where it is possible to differentiate ability levels, adaptivity should be considered.

Considering the appropriateness of game mechanics for high-stakes assessment, feedback considered in the current review was limited to progress feedback. This adds a fourth type of feedback to the feedback already recognized for assessment: knowledge of correct response, elaborated feedback, and delayed knowledge of results (van der Kleij et al., 2012). Although the small number of SGs that incorporated progress feedback affects the generalizability of the finding, it does indicate that feedback about progress may be the most appropriate solution.

Freedom of gameplay

A variety of game mechanics implemented in the SGs included in the review fulfill freedom of gameplay. While some studies did not elaborate on the choices given in the game, common ways players are given freedom are through choice options, interactive elements, and freedom to explore. These game mechanics were found in various studies, which raises the possibility that these findings can be generalized to new SGs targeted at assessing professional competencies. Other game mechanics related to freedom of gameplay were also found in a smaller capacity. Thus, further research should shed light on their generalizability. Moreover, the freedom of gameplay provided to the player plays a substantial role in shaping overall player experience and behavior (Kim & Shute, 2015 ; Kirginas & Gouscos, 2017 ). Therefore, future research should shed further light on whether different game mechanics influence players in different ways.

Limitations

Although the current systematic literature review provides a useful overview of the game design principles for game-based performance assessment of professional competencies, some limitations are identified.

First, the review covered a substantial number of studies from the healthcare domain. This may be because the medical field consists of many higher order standardized tasks, which may be particularly suitable to SGs. Although the large contribution of studies in the healthcare domain could limit the generalizability to other domains, the results of this systematic review were quite uniform; no indication was found that SGs in healthcare employed different game mechanics. Moreover, there is a growing popularity of SGs in healthcare education (Wang et al., 2016), resulting in a higher number of available studies compared to other professional domains. It is advisable to regard the current results as a starting point for game design principles for game-based performance assessment. Further research into the generalizability of game design principles across professional domains is warranted.

The second limitation is true for all systematic literature reviews: it is a cross section of the literature and may not present the full picture. The inclusion of studies is dependent on what is available in the search databases, what is accessible, and what keywords are included in the literature. Likely due to this limitation, only studies published from 2006 are included in the review, while the use of SGs dates back much further (Randel et al., 1992 ; Vansickle, 1986 ). To minimize the omission of relevant literature, snowballing was conducted on the final selection of studies. This method allowed for including related and potentially relevant studies. In total, six additional publications were included through this method out of the 2,370 considered.

After snowballing, an assessment of why these additionally included studies were not found through the search resulted in various insights. First, three studies used the term (educational) video game in their publication on SGs (Duque et al., 2008; Jalink et al., 2014; Mohan et al., 2017). Including this term in the original search would have resulted in too many hits outside of the scope of the current review. Second, Moreno-Ger et al. (2010) used the term simulation to describe the application, but refer to the application as game-like. As simulations fall outside of the scope of the current review, the absence of this study in the initial search cannot be attributed to a gap in the search terms. Third, the publication from Blanié et al. (2020) was probably not found due to a mismatch in search terms related to the quality measure. Additional search terms such as impact or improve could have been included. As only one additional study was found that presented this issue, it is unlikely to have had a great effect on the outcome of the review. Finally, it is unclear why the study by Fonteneau et al. (2020) was not found through the initial search, as it showed a match with the search terms used in the current review. Perhaps this misclassification can be ascribed to the search databases queried.

Finally, many of the studies included in the review compare SGs to other, non-digital or digital, alternatives in terms of learning. These types of studies often include many confounding variables (Cook, 2005 ). This is because a comparison is done between interventions that are different in more ways than one. These differences affect the results in different ways: positive, negative, or even through an interaction with other features.

Suggestions for future research

Besides providing interesting insights, the current review also has implications for research. First, the review identified SGs successful in teaching, training, or assessment that did not authentically represent the physical context or task. In this review, however, too few examples were found to generalize the findings. Second, while some studies were found in which the SG's difficulty was adaptive, more studies should be conducted on the implementation of adaptivity within SGs, in particular on how in-game behavior can be used to match the difficulty level to the ability level of the candidates. Third, fantasy is included in many games (Charsky, 2010; Prensky, 2001) and is regarded as one of the reasons for playing them (Boyle et al., 2016). By including fantasy elements in game-based performance assessments, assessment can become even more engaging and enjoyable and candidates can become even less aware of being assessed. For learning, it has been suggested that fantasy should be closely connected to the learning content (Gunter et al., 2008; Malone, 1981), but further research might explore whether this holds for SGs used for the (high-stakes) assessment of professional competencies. Furthermore, while fantasy elements may blur the direct link between the SG and the professional practice, in-game behavior may still have a clear relationship with professional competencies (Kim & Shute, 2015; Simons et al., 2021). More research into the effect of authenticity on the measurement validity of SGs in assessing professional competencies is warranted.

Implications for practice

Based on the results of the review, four recommendations can be made for practice. First, regardless of the competency type, design the SG in such a way that both the task and the context are authentic. The results have shown that SGs are able to provide a representation of the physical context and task that is authentic to the professional competency under investigation. Thus, in situations where the physical context or assessment task is difficult to represent in traditional performance assessments, SGs can provide a solution. At the same time, non-authentic (fantasy) contexts and tasks should be investigated further before being implemented in high-stakes performance assessment.

Second, ensure that in-game behavior within the SG is collected. This review has synthesized additional evidence for the potential of in-game behavior as a source of information about ability level. That being said, the in-game behavior that can be used to inform ability level is dependent on both the professional competency of interest and game design. While no generalized design principles regarding the collection of gameplay data can be given, evidentiary frameworks (e.g., ECD) can be used to determine which in-game behavior can be used to infer ability level. This is ultimately connected to the implementation of adaptivity. While a limited number of SGs were found that implemented adaptivity, the potential to unobtrusively collect data about ability level underscores a missed opportunity for the wider implementation of adaptivity in SGs. Taken together with the successful implementation of adaptive testing in traditional high-stakes assessments (Martin & Lazendic, 2018; Straetmans & Eggen, 2007), a third recommendation would be to implement adaptivity where appropriate.

Finally, this review gives an overview of the game mechanics for high-stakes game-based performance assessment with little risk of affecting validity. To provide freedom of gameplay in SGs targeted at cognitive skills and knowledge, include free exploration, interactive elements, and a choice between multiple options. For motor skills, giving control over movements is a, perhaps straightforward, game design principle. Furthermore, feedback in SGs for high-stakes performance assessments can be given in the form of progress feedback, which is different from traditional types of feedback in education (van der Kleij et al., 2012) but has potential to satisfy feedback as a game mechanic. These recommendations, intended for game developers, may prove useful in designing future SGs for the (high-stakes) assessment of professional competencies.

In-text citations

American Educational Research Association (AERA), American Psychological Association (APA), & National Council on Measurement in Education (NCME). (2014). Standards for educational and psychological testing . American Educational Research Association.

Baartman, L., & Gulikers, J. (2017). Assessment in Dutch vocational education: Overview and tensions of the past 15 years. In E. De Bruijn, S. Billet, & J. Onstenk (Eds.), Enhancing teaching and learning in the Dutch vocational education system: Reforms enacted (pp. 245–266). Springer.

Bell, B. S., Kanar, A. M., & Kozlowski, S. W. J. (2008). Current issues and future directions in simulation-based training in North America. The International Journal of Human Resource Management, 19 (8), 1416–1434. https://doi.org/10.1080/09585190802200173

Boyle, E., Hainey, T., Connolly, T. M., Gray, G., Earp, J., Ott, M., Lim, T., Ninaus, M., Ribeiro, C., & Pereira, J. (2016). An update to the systematic literature review of empirical evidence on the impacts and outcomes of computer games and serious games. Computers & Education, 94 , 178–192. https://doi.org/10.1016/j.compedu.2015.11.003

Boyle, E. A., Connolly, T. M., Hainey, T., & Boyle, J. M. (2012). Engagement in digital entertainment games: A systematic review. Computers in Human Behavior, 28 (3), 771–780. https://doi.org/10.1016/j.chb.2011.11.020

Burr, S., Gale, T., Kisielewska, J., Millin, P., Pêgo, J., Pinter, G., Robinson, I., & Zahra, D. (2023). A narrative review of adaptive testing and its application to medical education. MedEdPublish . https://doi.org/10.12688/mep.19844.1

Charsky, D. (2010). From edutainment to serious games: A change in the use of game characteristics. Games and Culture, 5 (2), 177–198. https://doi.org/10.1177/1555412009354727

Chen, F., Cui, Y., & Chu, M.-W. (2020). Utilizing game analytics to inform and validate digital game-based assessment with evidence-centered game design: A case study. International Journal of Artificial Intelligence in Education, 30 (3), 481–503. https://doi.org/10.1007/s40593-020-00202-6

Connolly, T. M., Boyle, E. A., MacArthur, E., Hainey, T., & Boyle, J. M. (2012). A systematic literature review of empirical evidence on computer games and serious games. Computers & Education, 59 (2), 661–686. https://doi.org/10.1016/j.compedu.2012.03.004

Cook, D. A. (2005). The research we still are not doing: An agenda for the study of computer-based learning. Academic Medicine, 80 (6), 541–548. https://doi.org/10.1097/00001888-200506000-00005

Davey, T. (2011). A guide to computer adaptive testing systems . Council of Chief State School Officers.

Dede, C. (2009). Immersive interfaces for engagement and learning. Science, 323 (5910), 66–69. https://doi.org/10.1126/science.1167311

Dörner, R., Göbel, S., Effelsberg, W., & Wiemeyer, J. (2016). Introduction. In R. Dörner, S. Göbel, W. Effelsberg, & J. Wiemeyer (Eds.), Serious games: Foundations, concepts and practice (pp. 1–34). Springer.

dos Santos, A. D., & Fraternali, P. (2016). A Comparison of methodological frameworks for digital learning game design. Lecture notes in computer science games and learning alliance. Springer.

Gao, Y., Gonzalez, V. A., & Yiu, T. W. (2019). The effectiveness of traditional tools and computer-aided technologies for health and safety training in the construction sector: A systematic review. Computers & Education, 138 , 101–115. https://doi.org/10.1016/j.compedu.2019.05.003

Gorbanev, I., Agudelo-Londoño, S., González, R. A., Cortes, A., Pomares, A., Delgadillo, V., Yepes, F. J., & Muñoz, Ó. (2018). A systematic review of serious games in medical education: Quality of evidence and pedagogical strategy. Medical Education Online, 23 (1), Article 1438718. https://doi.org/10.1080/10872981.2018.1438718

Graafland, M., Schraagen, J. M., & Schijven, M. P. (2012). Systematic review of serious games for medical education and surgical skills training. British Journal of Surgery, 99 (10), 1322–1330. https://doi.org/10.1002/bjs.8819

Gulikers, J. T. M., Bastiaens, T. J., & Kirschner, P. A. (2004). A five-dimensional framework for authentic assessment. Educational Technology Research and Development, 52 (3), 67. https://doi.org/10.1007/BF02504676

Gulikers, J. T. M., Bastiaens, T. J., Kirschner, P. A., & Kester, L. (2008). Authenticity is in the eye of the beholder: Student and teacher perceptions of assessment authenticity. Journal of Vocational Education and Training, 60 (4), 401–412. https://doi.org/10.1080/13636820802591830

Gunter, G. A., Kenny, R. F., & Vick, E. H. (2008). Taking educational games seriously: using the RETAIN model to design endogenous fantasy into standalone educational games. Educational Technology Research and Development, 56 (5), 511–537. https://doi.org/10.1007/s11423-007-9073-2

Harteveld, C. (2011). Foundations. Triadic Game design: balancing reality, meaning and play (pp. 31–93). Springer.

Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77 (1), 81–112. https://doi.org/10.3102/003465430298487

Ifenthaler, D., Eseryel, D., & Ge, X. (2012). Assessment in game-based learning: Foundations, innovations, and perspectives . Springer.

Jerrim, J. (2022). Test anxiety: Is it associated with performance in high-stakes examinations? Oxford Review of Education . https://doi.org/10.1080/03054985.2022.2079616

Jones, M. G. (1998). Creating engagement in computer-based learning environments . https://www.yumpu.com/en/document/read/18776351/creating-engagement-in-computer-based-learning-environments

Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Praeger Publishers.

Kim, Y. J., & Shute, V. J. (2015). The interplay of game elements with psychometric qualities, learning, and enjoyment in game-based assessment. Computers & Education, 87 , 340–356. https://doi.org/10.1016/j.compedu.2015.07.009

Kirginas, S., & Gouscos, D. (2017). Exploring the impact of freeform gameplay on players’ experience: an experiment with maze games at varying levels of freedom of movement. International Journal of Serious Games . https://doi.org/10.17083/ijsg.v4i4.175

Kirriemuir, J., & McFarlane, A. (2004). Literature review in games and learning . Sage.

Krathwohl, D. R. (2002). A revision of Bloom’s taxonomy: An overview. Theory Into Practice, 41 (4), 212–218. https://doi.org/10.1207/s15430421tip4104_2

Lameras, P., Arnab, S., Dunwell, I., Stewart, C., Clarke, S., & Petridis, P. (2017). Essential features of serious games design in higher education: Linking learning attributes to game mechanics. British Journal of Educational Technology, 48 (4), 972–994. https://doi.org/10.1111/bjet.12467

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33 (1), 159–174.

Lane, S., & Stone, C. A. (2006). Performance assessment. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 387–431). Praeger Publishers.

Malone, T. W. (1981). Toward a theory of intrinsically motivating instruction. Cognitive Science, 5 (4), 333–369. https://doi.org/10.1207/s15516709cog0504_2

Malone, T. W., & Lepper, M. R. (1987). Making learning fun: A taxonomy of intrinsic motivations for learning. In R. E. Snow & M. J. Farr (Eds.), Aptitude, learning, and instruction: Conative and affective process analyses (pp. 223–253). Lawrence Erlbaum Associates, Inc.

Martin, A. J., & Lazendic, G. (2018). Computer-adaptive testing: Implications for students’ achievement, motivation, engagement, and subjective test experience. Journal of Educational Psychology, 110 , 27–45. https://doi.org/10.1037/edu0000205

Mavridis, A., & Tsiatsos, T. (2017). Game-based assessment: Investigating the impact on test anxiety and exam performance. Journal of Computer Assisted Learning, 33 (2), 137–150. https://doi.org/10.1111/jcal.12170

Messick, S. (1994). Alternative modes of assessment, uniform standards of validity. ETS Research Report Series, 1994 (2), i–22. https://doi.org/10.1002/j.2333-8504.1994.tb01634.x

Michael, D., & Chen, S. (2006). Serious games: Games that educate, train, and inform . Muska & Lipman/Premier-Trade.

Mislevy, R. J., & Riconscente, M. M. (2006). Evidence-centered assessment design. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development (pp. 61–90). Lawrence Erlbaum Associates.

Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & The PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLOS Medicine, 6 (7), e1000097. https://doi.org/10.1371/journal.pmed.1000097

Petticrew, M., & Roberts, H. (2005). Systematic reviews in the social sciences: A practical guide. Blackwell Publishing . https://doi.org/10.1002/9780470754887

Prensky, M. (2001). Fun, play and games: What makes games engaging? In M. Prensky (Ed.), Digital game-based learning (pp. 16–47). McGraw-Hill.

Randel, J. M., Morris, B. A., Wetzel, C. D., & Whitehill, B. V. (1992). The effectiveness of games for educational purposes: A review of recent research. Simulation & Gaming, 23 (3), 261–276. https://doi.org/10.1177/1046878192233001

Rouse, R. (2004). Game design: Theory and practice (2nd ed.). Jones and Bartlett Publishers, Inc.

Schell, J. (2015). The art of game design: A book of lenses (2nd ed.). CRC Press.

Schwartz, D. L., & Arena, D. (2013). Measuring what matters most: Choice-based assessment for the digital age . The MIT Press.

Shaffer, D. W., & Gee, J. P. (2012). The right kind of GATE: Computer games and the future of assessment. In M. C. Mayrath, J. Clarke-Midura, & D. H. Robinson (Eds.), Technology-based assessments for 21st century skills: Theoretical and practical implications from modern research (pp. 211–228). Information Age Publishing.

Shute, V. J. (2008). Focus on formative feedback. Review of Educational Research, 78 (1), 153–189. https://doi.org/10.3102/0034654307313795

Shute, V. J., & Ke, F. (2012). Games, learning, and assessment. In D. Ifenthaler, D. Eseryel, & X. Ge (Eds.), Assessment in game-based learning: foundations, innovations, and perspectives (pp. 43–58). Springer.

Shute, V. J., & Rahimi, S. (2021). Stealth assessment of creativity in a physics video game. Computers in Human Behavior, 116 , Article 106647. https://doi.org/10.1016/j.chb.2020.106647

Shute, V. J., Ventura, M., Bauer, M., & Zapata-Rivera, D. (2009). Melding the power of serious games and embedded assessment to monitor and foster learning: Flow and grow. In U. Ritterfeld, M. J. Cody, & P. Vorderer (Eds.), Serious games: Mechanisms and effects (pp. 295–321). Routledge.

Simons, A., Wohlgenannt, I., Weinmann, M., & Fleischer, S. (2021). Good gamers, good managers? A proof-of-concept study with Sid Meier’s Civilization. Review of Managerial Science, 15 (4), 957–990. https://doi.org/10.1007/s11846-020-00378-0

Stecher, B. (2010). Performance assessment in an era of standards-based educational accountability . Stanford University, Stanford Center for Opportunity Policy in Education.

Straetmans, G. J. J. M., & Eggen, T. J. H. M. (2007). WISCAT-pabo: computergestuurd adaptief toetspakket rekenen. Onderwijsinnovatie, 2017 (3), 17–27.

Susi, T., Johannesson, J., & Backlund, P. (2007). Serious game—An overview [IKI Technical Reports] . https://www.diva-portal.org/smash/get/diva2:2416/FULLTEXT01.pdf

The EndNote Team. (2013). EndNote (Version X9) Clarivate. https://endnote.com/

van der Kleij, F. M., Eggen, T. J. H. M., Timmers, C. F., & Veldkamp, B. P. (2012). Effects of feedback in a computer-based assessment for learning. Computers & Education, 58 (1), 263–272. https://doi.org/10.1016/j.compedu.2011.07.020

Van Eck, R. (2006). Digital game-based learning: It’s not just the digital natives who are restless. Educause Review, 41 (2), 16–30.

Vansickle, R. L. (1986). A quantitative review of research on instructional simulation gaming: A twenty-year perspective. Theory & Research in Social Education, 14 (3), 245–264. https://doi.org/10.1080/00933104.1986.10505525

von der Embse, N., Jester, D., Roy, D., & Post, J. (2018). Test anxiety effects, predictors, and correlates: A 30-year meta-analytic review. Journal of Affective Disorders, 227 , 483–493. https://doi.org/10.1016/j.jad.2017.11.048

von der Embse, N., & Witmer, S. E. (2014). High-stakes accountability: Student anxiety and large-scale testing. Journal of Applied School Psychology, 30 (2), 132–156. https://doi.org/10.1080/15377903.2014.888529

Wainer, H. (2000). Introduction and history. In H. Wainer, N. J. Dorans, R. Eignor, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer (2nd ed., pp. 1–21). Lawrence Erlbaum Associates Inc.

Wang, L., Shute, V., & Moore, G. R. (2015). Lessons learned and best practices of stealth assessment. International Journal of Gaming and Computer-Mediated Simulations, 7 (4), 66–87. https://doi.org/10.4018/ijgcms.2015100104

Wang, R., DeMaria, S., Jr., Goldberg, A., & Katz, D. (2016). A systematic review of serious games in training health care professionals. Simulation in Healthcare: The Journal of the Society for Simulation in Healthcare, 11 (1), 41–51. https://doi.org/10.1097/sih.0000000000000118

Westera, W., Nadolski, R., & Hummel, H. (2014). Serious gaming analytics—What students’ log files tell us about gaming and learning. International Journal of Serious Games, 1 (2), 35–50. https://doi.org/10.17083/ijsg.v1i2.9

Williams-Bell, F. M., Kapralos, B., Hogue, A., Murphy, B. M., & Weckman, E. J. (2015). Using serious games and virtual simulation for training in the fire service: A review. Fire Technology, 51 , 553–584. https://doi.org/10.1007/s10694-014-0398-1

Wools, S., Eggen, T., & Sanders, P. (2010). Evaluation of validity and validation by means of the argument-based approach. Cadmo . https://doi.org/10.3280/cad2010-001007

Wouters, P., van der Spek, E. D., & van Oostendorp, H. (2009). Current practices in serious game research: A review from a learning outcomes perspective. In T. Connolly, M. Stansfield, & L. Boyle (Eds.), Games-based learning advancements for multi-sensory human computer interfaces: Techniques and effective practices (pp. 232–250). IGI Global.

Young, M. F., Slota, S., Cutter, A. B., Jalette, G., Mullin, G., Lai, B., Simeoni, Z., Tran, M., & Yukhymenko, M. (2012). Our princess is in another castle: A review of trends in serious gaming for education. Review of Educational Research, 82 (1), 61–89. https://doi.org/10.3102/0034654312436980

Studies included in the systematic review

Adams, A., Hart, J., Iacovides, I., Beavers, S., Oliveira, M., & Magroudi, M. (2019). Co-created evaluation: Identifying how games support police learning. International Journal of Human-Computer Studies, 132 , 34–44. https://doi.org/10.1016/j.ijhcs.2019.03.009

Aksoy, E. (2019). Comparing the effects on learning outcomes of tablet-based and virtual reality–based serious gaming modules for basic life support training: Randomized trial. JMIR Serious Games, 7 (2), Article e13442. https://doi.org/10.2196/13442

Albert, A., Hallowell, M. R., Kleiner, B., Chen, A., & Golparvar-Fard, M. (2014). Enhancing construction hazard recognition with high-fidelity augmented virtuality. Journal of Construction Engineering and Management, 140 (7), Article 04014024. https://doi.org/10.1061/(ASCE)CO.1943-7862.0000860

Alyami, H., Alawami, M., Lyndon, M., Alyami, M., Coomarasamy, C., Henning, M., Hill, A., & Sundram, F. (2019). Impact of using a 3D visual metaphor serious game to teach history-taking content to medical students: Longitudinal mixed methods pilot study. JMIR Serious Games, 7 (3), Article e13748. https://doi.org/10.2196/13748

Ameerbakhsh, O., Maharaj, S., Hussain, A., & McAdam, B. (2019). A comparison of two methods of using a serious game for teaching marine ecology in a university setting. International Journal of Human-Computer Studies, 127, 181–189. https://doi.org/10.1016/j.ijhcs.2018.07.004

Asadipour, A., Debattista, K., & Chalmers, A. (2017). Visuohaptic augmented feedback for enhancing motor skill acquisition. The Visual Computer, 33 (4), 401–411. https://doi.org/10.1007/s00371-016-1275-3

Barab, S. A., Scott, B., Siyahhan, S., Goldstone, R., Ingram-Goble, A., Zuiker, S. J., & Warren, S. (2009). Transformational play as a curricular scaffold: Using videogames to support science education. Journal of Science Education and Technology, 18 (4), 305–320. https://doi.org/10.1007/s10956-009-9171-5

Benda, N. C., Kellogg, K. M., Hoffman, D. J., Fairbanks, R. J., & Auguste, T. (2020). Lessons learned from an evaluation of serious gaming as an alternative to mannequin-based simulation technology: Randomized controlled trial. JMIR Serious Games, 8 (3), Article e21123. https://doi.org/10.2196/21123

Bindoff, I., Ling, T., Bereznicki, L., Westbury, J., Chalmers, L., Peterson, G., & Ollington, R. (2014). A computer simulation of community pharmacy practice for educational use. American Journal of Pharmaceutical Education, 78 (9), Article 168. https://doi.org/10.5688/ajpe789168

Binsubaih, A., Maddock, S., & Romano, D. (2006). A serious game for traffic accident investigators. Interactive Technology and Smart Education, 3 (4), 329–346. https://doi.org/10.1108/17415650680000071

Blanié, A., Amorim, M. A., & Benhamou, D. (2020). Comparative value of a simulation by gaming and a traditional teaching method to improve clinical reasoning skills necessary to detect patient deterioration: A randomized study in nursing students. BMC Medical Education, 20 (1), Article 53. https://doi.org/10.1186/s12909-020-1939-6

Boada, I., Rodriguez-Benitez, A., Garcia-Gonzalez, J. M., Olivet, J., Carreras, V., & Sbert, M. (2015). Using a serious game to complement CPR instruction in a nurse faculty. Computer Methods and Programs in Biomedicine, 122 (2), 282–291. https://doi.org/10.1016/j.cmpb.2015.08.006

Brown, D. E., Moenning, A., Guerlain, S., Turnbull, B., Abel, D., & Meyer, C. (2018). Design and evaluation of an avatar-based cultural training system. The Journal of Defense Modeling and Simulation, 16 (2), 159–174. https://doi.org/10.1177/1548512918807593

Buttussi, F., Pellis, T., Cabas Vidani, A., Pausler, D., Carchietti, E., & Chittaro, L. (2013). Evaluation of a 3D serious game for advanced life support retraining. International Journal of Medical Informatics, 82 (9), 798–809. https://doi.org/10.1016/j.ijmedinf.2013.05.007

Calderón, A., Ruiz, M., & O’Connor, R. V. (2018). A serious game to support the ISO 21500 standard education in the context of software project management. Computer Standards & Interfaces, 60, 80–92. https://doi.org/10.1016/j.csi.2018.04.012

Chan, W. Y., Qin, J., Chui, Y. P., & Heng, P. A. (2012). A serious game for learning ultrasound-guided needle placement skills. IEEE Transactions on Information Technology in Biomedicine, 16 (6), 1032–1042. https://doi.org/10.1109/titb.2012.2204406

Chang, C., Kao, C., Hwang, G., & Lin, F. (2020). From experiencing to critical thinking: A contextual game-based learning approach to improving nursing students’ performance in electrocardiogram training. Educational Technology Research and Development, 68 (3), 1225–1245. https://doi.org/10.1007/s11423-019-09723-x

Chee, E. J. M., Prabhakaran, L., Neo, L. P., Carpio, G. A. C., Tan, A. J. Q., Lee, C. C. S., & Liaw, S. Y. (2019). Play and learn with patients—Designing and evaluating a serious game to enhance nurses’ inhaler teaching techniques: A randomized controlled trial. Games for Health Journal, 8 (3), 187–194. https://doi.org/10.1089/g4h.2018.0073

Chon, S., Timmermann, F., Dratsch, T., Schuelper, N., Plum, P., Berlth, F., Datta, R. R., Schramm, C., Haneder, S., Späth, M. R., Dübbers, M., Kleinert, J., Raupach, T., Bruns, C., & Kleinert, R. (2019). Serious games in surgical medical education: A virtual emergency department as a tool for teaching clinical reasoning to medical students. JMIR Serious Games, 7 (1), Article e13028. https://doi.org/10.2196/13028

Cook, N. F., McAloon, T., O’Neill, P., & Beggs, R. (2012). Impact of a web based interactive simulation game (PULSE) on nursing students’ experience and performance in life support training—A pilot study. Nurse Education Today, 32 (6), 714–720. https://doi.org/10.1016/j.nedt.2011.09.013

Cowley, B., Fantato, M., Jennett, C., Ruskov, M., & Ravaja, N. (2014). Learning when serious: Psychophysiological evaluation of a technology-enhanced learning game. Journal of Educational Technology & Society, 17 (1), 3–16.

Creutzfeldt, J., Hedman, L., & Felländer-Tsai, L. (2012). Effects of pre-training using serious game technology on CPR performance—An exploratory quasi-experimental transfer study. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 20 (1), Article 79. https://doi.org/10.1186/1757-7241-20-79

Creutzfeldt, J., Hedman, L., Medin, C., Heinrichs, W. L., & Felländer-Tsai, L. (2010). Exploring virtual worlds for scenario-based repeated team training of cardiopulmonary resuscitation in medical students. Journal of Medical Internet Research, 12 (3), Article e38. https://doi.org/10.2196/jmir.1426

Dankbaar, M. E. W., Alsma, J., Jansen, E. E. H., van Merrienboer, J. J. G., van Saase, J. L. C. M., & Schuit, S. C. E. (2016). An experimental study on the effects of a simulation game on students’ clinical cognitive skills and motivation. Advances in Health Sciences Education, 21 (3), 505–521. https://doi.org/10.1007/s10459-015-9641-x

Dankbaar, M. E. W., Bakhuys Roozeboom, M., Oprins, E. A. P. B., Rutten, F., van Merrienboer, J. J. G., van Saase, J. L. C. M., & Schuit, S. C. E. (2017a). Preparing residents effectively in emergency skills training with a serious game. Simulation in Healthcare, 12 (1), 9–16. https://doi.org/10.1097/sih.0000000000000194

Dankbaar, M. E. W., Richters, O., Kalkman, C. J., Prins, G., ten Cate, O. T. J., van Merrienboer, J. J. G., & Schuit, S. C. E. (2017). Comparative effectiveness of a serious game and an e-module to support patient safety knowledge and awareness. BMC Medical Education, 17 (1), Article 30. https://doi.org/10.1186/s12909-016-0836-5

de Sena, D. P., Fabrício, D. D., da Silva, V. D., Bodanese, L. C., & Franco, A. R. (2019). Comparative evaluation of video-based on-line course versus serious game for training medical students in cardiopulmonary resuscitation: A randomised trial. PLOS ONE, 14 (4), Article e0214722. https://doi.org/10.1371/journal.pone.0214722

Dib, H., & Adamo-Villani, N. (2014). Serious sustainability challenge game to promote teaching and learning of building sustainability. Journal of Computing in Civil Engineering, 28 (5), Article A4014007. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000357

Diehl, L. A., Souza, R. M., Gordan, P. A., Esteves, R. Z., & Coelho, I. C. M. (2017). InsuOnline, an electronic game for medical education on insulin therapy: A randomized controlled trial with primary care physicians. Journal of Medical Internet Research, 19 (3), Article e72. https://doi.org/10.2196/jmir.6944

Drummond, D., Delval, P., Abdenouri, S., Truchot, J., Ceccaldi, P., Plaisance, P., Hadchouel, A., & Tesnière, A. (2017). Serious game versus online course for pretraining medical students before a simulation-based mastery learning course on cardiopulmonary resuscitation: A randomised controlled study. European Journal of Anaesthesiology, 34 (12), 836–844. https://doi.org/10.1097/EJA.0000000000000675

Duque, G., Fung, S., Mallet, L., Posel, N., & Fleiszer, D. (2008). Learning while having fun: The use of video gaming to teach geriatric house calls to medical students. Journal of the American Geriatrics Society, 56 (7), 1328–1332. https://doi.org/10.1111/j.1532-5415.2008.01759.x

Fonteneau, T., Billion, E., Abdoul, C., Le, S., Hadchouel, A., & Drummond, D. (2020). Simulation game versus multiple choice questionnaire to assess the clinical competence of medical students: Prospective sequential trial. Journal of Medical Internet Research, 22 (12), Article e23254. https://doi.org/10.2196/23254

Gerard, J. M., Scalzo, A. J., Borgman, M. A., Watson, C. M., Byrnes, C. E., Chang, T. P., Auerbach, M., Kessler, D. O., Feldman, B. L., Payne, B. S., Nibras, S., Chokshi, R. K., & Lopreiato, J. O. (2018). Validity evidence for a serious game to assess performance on critical pediatric emergency medicine scenarios. Simulation in Healthcare, 13 (3), 168–180. https://doi.org/10.1097/SIH.0000000000000283

Graafland, M., Bemelman, W. A., & Schijven, M. P. (2014). Prospective cohort study on surgeons’ response to equipment failure in the laparoscopic environment. Surgical Endoscopy, 28 (9), 2695–2701. https://doi.org/10.1007/s00464-014-3530-x

Graafland, M., Bemelman, W. A., & Schijven, M. P. (2017). Game-based training improves the surgeon’s situational awareness in the operation room: A randomized controlled trial. Surgical Endoscopy, 31 (10), 4093–4101. https://doi.org/10.1007/s00464-017-5456-6

Hannig, A., Lemos, M., Spreckelsen, C., Ohnesorge-Radtke, U., & Rafai, N. (2013). Skills-O-Mat: Computer supported interactive motion- and game-based training in mixing alginate in dental education. Journal of Educational Computing Research, 48 (3), 315–343. https://doi.org/10.2190/EC.48.3.c

Hummel, H. G. K., van Houcke, J., Nadolski, R. J., van der Hiele, T., Kurvers, H., & Löhr, A. (2011). Scripted collaboration in serious gaming for complex learning: Effects of multiple perspectives when acquiring water management skills. British Journal of Educational Technology, 42 (6), 1029–1041. https://doi.org/10.1111/j.1467-8535.2010.01122.x

Jalink, M. B., Goris, J., Heineman, E., Pierie, J. P., & ten Cate Hoedemaker, H. O. (2014). Construct and concurrent validity of a Nintendo Wii video game made for training basic laparoscopic skills. Surgical Endoscopy, 28 (2), 537–542. https://doi.org/10.1007/s00464-013-3199-6

Katz, D., Zerillo, J., Kim, S., Hill, B., Wang, R., Goldberg, A., & DeMaria, S. (2017). Serious gaming for orthotopic liver transplant anesthesiology: A randomized control trial. Liver Transplantation, 23 (4), 430–439. https://doi.org/10.1002/lt.24732

Knight, J. F., Carley, S., Tregunna, B., Jarvis, S., Smithies, R., de Freitas, S., Dunwell, I., & Mackway-Jones, K. (2010). Serious gaming technology in major incident triage training: A pragmatic controlled trial. Resuscitation, 81 (9), 1175–1179. https://doi.org/10.1016/j.resuscitation.2010.03.042

LeFlore, J. L., Anderson, M., Zielke, M. A., Nelson, K. A., Thomas, P. E., Hardee, G., & John, L. D. (2012). Can a virtual patient trainer teach student nurses how to save lives—Teaching student nurses about pediatric respiratory diseases. Simulation in Healthcare, 7 (1), 10–17. https://doi.org/10.1097/SIH.0b013e31823652de

Li, K., Hall, M., Bermell-Garcia, P., Alcock, J., Tiwari, A., & González-Franco, M. (2017). Measuring the learning effectiveness of serious gaming for training of complex manufacturing tasks. Simulation & Gaming, 48 (6), 770–790. https://doi.org/10.1177/1046878117739929

Luu, C., Talbot, T. B., Fung, C. C., Ben-Isaac, E., Espinoza, J., Fischer, S., Cho, C. S., Sargsyan, M., Korand, S., & Chang, T. P. (2020). Development and performance assessment of a digital serious game to assess multi-patient care skills in a simulated pediatric emergency department. Simulation & Gaming, 51 (4), 550–570. https://doi.org/10.1177/1046878120904984

Middeke, A., Anders, S., Schuelper, M., Raupach, T., & Schuelper, N. (2018). Training of clinical reasoning with a serious game versus small-group problem-based learning: A prospective study. PLOS ONE, 13 (9), Article e0203851. https://doi.org/10.1371/journal.pone.0203851

Miller, C. H., Dunbar, N. E., Jensen, M. L., Massey, Z. B., Lee, Y., Nicholls, S. B., Anderson, C., Adams, A. S., Cecena, J. E., Thompson, W. M., & Wilson, S. N. (2019). Training law enforcement officers to identify reliable deception cues with a serious digital game. International Journal of Game-Based Learning, 9 (3), 1–22. https://doi.org/10.4018/IJGBL.2019070101

Mohan, D., Angus, D. C., Ricketts, D., Farris, C., Fischhoff, B., Rosengart, M. R., Yealy, D. M., & Barnato, A. E. (2014). Assessing the validity of using serious game technology to analyze physician decision making. PLOS ONE, 9 (8), Article e105445. https://doi.org/10.1371/journal.pone.0105445

Mohan, D., Farris, C., Fischhoff, B., Rosengart, M. R., Angus, D. C., Yealy, D. M., Wallace, D. J., & Barnato, A. E. (2017). Efficacy of educational video game versus traditional educational apps at improving physician decision making in trauma triage: Randomized controlled trial. BMJ, 359, Article j5416. https://doi.org/10.1136/bmj.j5416

Mohan, D., Fischhoff, B., Angus, D. C., Rosengart, M. R., Wallace, D. J., Yealy, D. M., Farris, C., Chang, C. H., Kerti, S., & Barnato, A. E. (2018). Serious games may improve physician heuristics in trauma triage. Proceedings of the National Academy of Sciences, 115 (37), 9204–9209. https://doi.org/10.1073/pnas.1805450115

Moreno-Ger, P., Torrente, J., Bustamante, J., Fernandez-Galaz, C., Fernandez-Manjon, B., & Comas-Rengifo, M. D. (2010). Application of a low-cost web-based simulation to improve students’ practical skills in medical education. International Journal of Medical Informatics, 79 (6), 459–467. https://doi.org/10.1016/j.ijmedinf.2010.01.017

Perini, S., Luglietti, R., Margoudi, M., Oliveira, M., & Taisch, M. (2018). Learning and motivational effects of digital game-based learning (DGBL) for manufacturing education—The life cycle assessment (LCA) game. Computers in Industry, 102, 40–49. https://doi.org/10.1016/j.compind.2018.08.005

Phungoen, P., Promto, S., Chanthawatthanarak, S., Maneepong, S., Apiratwarakul, K., Kotruchin, P., & Mitsungnern, T. (2020). Precourse preparation using a serious smartphone game on advanced life support knowledge and skills: Randomized controlled trial. Journal of Medical Internet Research, 22 (3), Article e16987. https://doi.org/10.2196/16987

Steinrücke, J., Veldkamp, B. P., & de Jong, T. (2020). Information literacy skills assessment in digital crisis management training for the safety domain: Developing an unobtrusive method. Frontiers in Education, 5 (140), Article 140. https://doi.org/10.3389/feduc.2020.00140

Su, C. (2016). The effects of students’ learning anxiety and motivation on the learning achievement in the activity theory based gamified learning environment. Eurasia Journal of Mathematics, Science and Technology Education, 13, 1229–1258. https://doi.org/10.12973/eurasia.2017.00669a

Taillandier, F., & Adam, C. (2018). Games ready to use: A serious game for teaching natural risk management. Simulation & Gaming, 49 (4), 441–470. https://doi.org/10.1177/1046878118770217

Tan, A. J. Q., Lee, C. C. S., Lin, P. Y., Cooper, S., Lau, L. S. T., Chua, W. L., & Liaw, S. Y. (2017). Designing and evaluating the effectiveness of a serious game for safe administration of blood transfusion: A randomized controlled trial. Nurse Education Today, 55, 38–44. https://doi.org/10.1016/j.nedt.2017.04.027

Zualkernan, I. A., Husseini, G. A., Loughlin, K. F., Mohebzada, J. G., & El Gami, M. (2013). Remote labs and game-based learning for process control. Chemical Engineering Education, 47 (3), 179–188.

Author information

Authors and affiliations.

eX:plain, Department of Applied Research, P.O. Box 1230, 3800 BE, Amersfoort, The Netherlands

Aranka Bijl

Faculty of Behavioural, Management and Social Sciences, Cognition, Data and Education, University of Twente, P.O. Box 217, 7500 AE, Enschede, The Netherlands

Aranka Bijl & Bernard P. Veldkamp

Cito, Department of Research and Innovation, P.O. Box 1034, 6801 MG, Arnhem, The Netherlands

Aranka Bijl, Saskia Wools & Sebastiaan de Klerk

Corresponding author

Correspondence to Aranka Bijl.

Ethics declarations

Conflict of interest.

We have no conflict of interest to disclose.

Rights and permissions.

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Bijl, A., Veldkamp, B.P., Wools, S. et al. Serious games in high-stakes assessment contexts: a systematic literature review into the game design principles for valid game-based performance assessment. Education Tech Research Dev (2024). https://doi.org/10.1007/s11423-024-10362-0

Accepted: 24 February 2024

Published: 08 April 2024

DOI: https://doi.org/10.1007/s11423-024-10362-0

  • Systematic literature review
  • Serious games
  • Professional competencies
  • Performance assessment
  • Game design principles

COMMENTS

  1. Literature Reviews: Types of Clinical Study Designs

    Systematic Review A summary of the clinical literature. A systematic review is a critical assessment and evaluation of all research studies that address a particular clinical issue. The researchers use an organized method of locating, assembling, and evaluating a body of literature on a particular topic using a set of specific criteria.

  2. Literature review as a research methodology: An ...

    As mentioned previously, there are a number of existing guidelines for literature reviews. Depending on the methodology needed to achieve the purpose of the review, all types can be helpful and appropriate to reach a specific goal (for examples, please see Table 1). These approaches can be qualitative, quantitative, or have a mixed design depending on the phase of the review.

  3. How to Write a Literature Review

    Examples of literature reviews. Step 1 - Search for relevant literature. Step 2 - Evaluate and select sources. Step 3 - Identify themes, debates, and gaps. Step 4 - Outline your literature review's structure. Step 5 - Write your literature review.

  4. Study designs: Part 7

    Study designs: Part 7 - Systematic reviews. In this series on research study designs, we have so far looked at different types of primary research designs which attempt to answer a specific question. In this segment, we discuss systematic review, which is a study design used to summarize the results of several primary research studies.

  5. Literature Review and Research Design

    This book looks at literature review in the process of research design, and how to develop a research practice that will build skills in reading and writing about research literature—skills that remain valuable in both academic and professional careers. Literature review is approached as a process of engaging with the discourse of scholarly ...

  6. Literature Review Research Design

    This chapter addresses the literature review research design's peculiarities, characteristics, and significant fallacies. Conducting and writing poor literature reviews is one way to lower academic work's value. State-of-the-art literature reviews are valuable and publishable scholarly documents. Too many new scholars think that empirical ...

  7. Methodological Approaches to Literature Review

    A literature review is defined as "a critical analysis of a segment of a published body of knowledge through summary, classification, and comparison of prior research studies, reviews of literature, and theoretical articles" (The Writing Center, University of Wisconsin-Madison, 2022). A literature review is an integrated analysis, not just a summary of scholarly work on a specific topic.

  8. Reviewing the literature: choosing a review design

    The purpose of a review of healthcare literature is primarily to summarise the knowledge around a specific question or topic, or to make recommendations that can support health professionals and organisations make decisions about a specific intervention or care issue. In addition, reviews can highlight gaps in knowledge to guide future research.

  9. Literature Review

    Literature Review. A literature review is a discussion of the literature (aka. the "research" or "scholarship") surrounding a certain topic. A good literature review doesn't simply summarize the existing material, but provides thoughtful synthesis and analysis. The purpose of a literature review is to orient your own work within an existing ...

  10. Writing a literature review

    Writing a literature review requires a range of skills to gather, sort, evaluate and summarise peer-reviewed published data into a relevant and informative unbiased narrative. ... Evaluation of the quality of studies and assessment of factors, such as study design, data collection, data analysis and interpretation and the conclusions drawn by ...

  11. Designing Literature Reviews as a Research Project

    The article "Review Research as Scientific Inquiry" promotes purpose-method-fit and offers guidance on how to improve the rigor and impact of review research. While literature reviewing is part of any research project, reviews as a stand-alone research project seek to address academic, practice or policy problems using prior research as data sources.

  12. How-to conduct a systematic literature review: A quick guide for

    Method details: Overview. A Systematic Literature Review (SLR) is a research methodology to collect, identify, and critically analyze the available research studies (e.g., articles, conference proceedings, books, dissertations) through a systematic procedure [12]. An SLR updates the reader with current literature about a subject [6]. The goal is to review critical points of current knowledge on a ...

  13. Writing a Literature Review

    The lit review is an important genre in many disciplines, not just literature (i.e., the study of works of literature such as novels and plays). When we say "literature review" or refer to "the literature," we are talking about the research (scholarship) in a given field. You will often see the terms "the research," "the ...

  14. Clinical research study designs: The essentials

    Introduction. In clinical research, our aim is to design a study, which would be able to derive a valid and meaningful scientific conclusion using appropriate statistical methods that can be translated to the "real world" setting. Before choosing a study design, one must establish aims and objectives of the study, and choose an appropriate target population that is most representative of ...

  15. Study designs: Part 1

    The study design used to answer a particular research question depends on the nature of the question and the availability of resources. In this article, which is the first part of a series on "study designs," we provide an overview of research study designs and their classification. The subsequent articles will focus on individual designs.

  16. Literature Review Research

    Literature Review is a comprehensive survey of the works published in a particular field of study or line of research, usually over a specific period of time, in the form of an in-depth, critical bibliographic essay or annotated list in which attention is drawn to the most significant works. Also, we can define a literature review as the collected body of scholarly works related to a topic:

  17. Literature Review and Research Design A Guide to Effective Research

    Literature review is approached as a process of engaging with the discourse of scholarly communities that will help graduate researchers refine, define, and express their own scholarly vision and voice. This orientation on research as an exploratory practice, rather than merely a series of predetermined steps in a systematic method, allows the ...

  18. PDF 13 Literature Review Research Design

    13.2 Particularities of Literature Review Research Design. In this section, we specifically address the elements that make literature review research a discrete research design differentiated from others. Next to the characteristics of literature review research, we address the main issues and decisions to be made within this research design ...

  19. Getting Started

    A literature review is an overview of the available research for a specific scientific topic. Literature reviews summarize existing research to answer a review question, provide context for new research, or identify important gaps in the existing body of literature. An incredible amount of academic literature is published each year, by estimates over two million articles.

  20. Research Guides: Study Design 101: Systematic Review

    This systematic review was interested in comparing the diet quality of vegetarian and non-vegetarian diets. Twelve studies were included. Vegetarians more closely met recommendations for total fruit, whole grains, seafood and plant protein, and sodium intake. In nine of the twelve studies, vegetarians had higher overall diet quality compared to ...

  21. PDF Reviewing the literature: choosing a review design

    The purpose of a review of healthcare literature is primarily to summarise the knowledge around a specific question or topic, or to make recommendations that can support health professionals and organisations make decisions about a specific intervention or care issue. In addition, reviews can highlight gaps in knowledge to guide future research.

  22. Literature Reviews, Theoretical Frameworks, and Conceptual Frameworks

    Common questions concern determining which literature pertains to the topic of study or the role of the literature review in the design of the study. This section addresses such questions broadly while providing general guidance for writing a narrative literature review that evaluates the most pertinent studies. ... On the centrality of the ...

  23. What improves access to primary healthcare services in rural

    To compile key strategies from the international experiences to improve access to primary healthcare (PHC) services in rural communities. Different innovative approaches have been practiced in different parts of the world to improve access to essential healthcare services in rural communities. Systematically collecting and combining best experiences all over the world is important to suggest ...

  24. Literature Review Research Design

    Literature review research. part of every research paper (staggered design with literature review as one stage) and stand-alone research design. aims at summarizing the existing body of knowledge and identifying the gaps in it. different forms of literature review research design available to address the objective.

  25. Frontiers

    2 Literature review and research hypothesis 2.1 Literature review. When considering industrial policy, the setting up national high-tech zones embodies the intersection of regional and industrial policies. Domestic and international academic research concerning setting up national high-tech zones primarily centers on economic activities and ...

  26. Prevention of alcohol exposed pregnancies in Europe: the FAR SEAS

    The 2626 records identified in the review were screened to exclude 2524 (682 duplicates + 1842 records that did not meet inclusion criteria), leaving 102 records included (see Fig. 1 for PRISMA chart and Table 11 in Appendix 1 for records included). Twenty-three recommendations were formulated based on the literature reviews, several rounds of experts consultations, and the pilot study ...

  27. Chapter 9 Methods for Literature Reviews

    Literature reviews can take two major forms. The most prevalent one is the "literature review" or "background" section within a journal paper or a chapter in a graduate thesis. This section synthesizes the extant literature and usually identifies the gaps in knowledge that the empirical study addresses (Sylvester, Tate, & Johnstone, 2013).

  28. Topologies in the Internet of Medical Things (IoMT), literature review

    The bibliographic review is a fundamental phase in a research project, and it must guarantee that the most relevant information in the field of study is obtained. Our main objective was to know the works related to the Internet of medical things, from now on (IoMT). We analyzed a total of 535 articles searched in association for Computing Machinery in Adelante ACM, Web of Science and Scopus ...

  29. Serious games in high-stakes assessment contexts: a ...

    The systematic literature review (1) investigates whether 'serious games' provide a viable solution to the limitations posed by traditional high-stakes performance assessments and (2) aims to synthesize game design principles for the game-based performance assessment of professional competencies. In total, 56 publications were included in the final review, targeting knowledge, motor skills ...

  30. Information

    This study presents a systematic literature review (SLR) aimed at identifying and analyzing primary studies that propose a roadmap for the implementation of a BI system in HEIs. The objectives of the SLR are to identify and characterize (i) the strategic objectives that underlie decision making, activities, processes, and information in HEIs ...