Korean J Anesthesiol. 2018 Apr;71(2)

Introduction to systematic review and meta-analysis

1 Department of Anesthesiology and Pain Medicine, Inje University Seoul Paik Hospital, Seoul, Korea

2 Department of Anesthesiology and Pain Medicine, Chung-Ang University College of Medicine, Seoul, Korea

Systematic reviews and meta-analyses present results by combining and analyzing data from different studies conducted on similar research topics. In recent years, systematic reviews and meta-analyses have been actively performed in various fields including anesthesiology. These research methods are powerful tools that can overcome the difficulties of performing large-scale randomized controlled trials. However, the inclusion of biased studies, or of studies whose quality of evidence has not been properly assessed, could yield misleading results. Therefore, various guidelines have been suggested for conducting systematic reviews and meta-analyses to help standardize them and improve their quality. Nonetheless, accepting the conclusions of the many published studies without understanding the underlying meta-analysis can be dangerous. Therefore, this article provides clinicians with an accessible introduction to performing and understanding meta-analyses.

Introduction

A systematic review collects all possible studies related to a given topic and design, and reviews and analyzes their results [ 1 ]. During the systematic review process, the quality of studies is evaluated, and a statistical meta-analysis of the study results is conducted on the basis of their quality. A meta-analysis is a valid, objective, and scientific method of analyzing and combining different results. Usually, in order to obtain more reliable results, a meta-analysis is mainly conducted on randomized controlled trials (RCTs), which have a high level of evidence [ 2 ] ( Fig. 1 ). Since 1999, various papers have presented guidelines for reporting meta-analyses of RCTs. Following the Quality of Reporting of Meta-analyses (QUORUM) statement [ 3 ], and the appearance of registers such as Cochrane Library’s Methodology Register, a large number of systematic literature reviews have been registered. In 2009, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [ 4 ] was published, and it greatly helped standardize and improve the quality of systematic reviews and meta-analyses [ 5 ].

Fig. 1. Levels of evidence.

In anesthesiology, the importance of systematic reviews and meta-analyses has been highlighted, and they provide diagnostic and therapeutic value to various areas, including not only perioperative management but also intensive care and outpatient anesthesia [6–13]. Systematic reviews and meta-analyses include various topics, such as comparing various treatments of postoperative nausea and vomiting [ 14 , 15 ], comparing general anesthesia and regional anesthesia [ 16 – 18 ], comparing airway maintenance devices [ 8 , 19 ], comparing various methods of postoperative pain control (e.g., patient-controlled analgesia pumps, nerve block, or analgesics) [ 20 – 23 ], comparing the precision of various monitoring instruments [ 7 ], and meta-analysis of dose-response in various drugs [ 12 ].

Thus, literature reviews and meta-analyses are being conducted in diverse medical fields, and their importance is highlighted because they help extract accurate, good-quality information from the flood of data being produced. However, a lack of understanding of systematic reviews and meta-analyses can lead to incorrect outcomes being derived from the review and analysis processes, and readers who indiscriminately accept the results of the many published meta-analyses may be misled. Therefore, in this review, we aim to describe the contents and methods used in systematic reviews and meta-analyses in a way that is easy to understand for future authors and readers of systematic reviews and meta-analyses.

Study Planning

It is easy to confuse systematic reviews and meta-analyses. A systematic review is an objective, reproducible method of finding answers to a certain research question by collecting all available studies related to that question and reviewing and analyzing their results. A meta-analysis differs from a systematic review in that it uses statistical methods on estimates from two or more different studies to form a pooled estimate [1]. Following a systematic review, if it is not possible to form a pooled estimate, the review can be published as is without progressing to a meta-analysis; however, if the extracted data allow a pooled estimate, a meta-analysis can be attempted. Systematic reviews and meta-analyses usually proceed according to the flowchart presented in Fig. 2. We explain each of the stages below.

Fig. 2. Flowchart illustrating a systematic review.

Formulating research questions

A systematic review attempts to gather all available empirical research by using clearly defined, systematic methods to obtain answers to a specific question. A meta-analysis is the statistical process of analyzing and combining results from several similar studies. Here, the definition of the word "similar" is not made clear, but when selecting a topic for a meta-analysis, it is essential to ensure that the different studies present data that can be combined. If the studies contain combinable data on the same topic, a meta-analysis can be performed using data from as few as two studies. However, study selection via a systematic review is a precondition for performing a meta-analysis, and it is important to clearly define the Population, Intervention, Comparison, Outcomes (PICO) parameters that are central to evidence-based research. In addition, selection of the research topic should be based on logical evidence, and it is important to select a topic that is familiar to readers but for which the evidence has not yet been clearly confirmed [24].

Protocols and registration

In systematic reviews, prior registration of a detailed research plan is very important. In order to make the research process transparent, primary/secondary outcomes and methods are set in advance, and in the event of changes to the methods, other researchers and readers are informed of when, how, and why. Many studies are registered with an organization such as PROSPERO (http://www.crd.york.ac.uk/PROSPERO/), and the registration number is recorded when reporting the study, so that the protocol defined at the planning stage can be shared.

Defining inclusion and exclusion criteria

Information is included on the study design, patient characteristics, publication status (published or unpublished), language used, and research period. If there is a discrepancy between the number of patients included in the study and the number of patients included in the analysis, this needs to be clearly explained while describing the patient characteristics, to avoid confusing the reader.

Literature search and study selection

In order to secure a proper basis for evidence-based research, it is essential to perform a broad search that includes as many studies as possible that meet the inclusion and exclusion criteria. Typically, three bibliographic databases are used: Medline, Embase, and the Cochrane Central Register of Controlled Trials (CENTRAL). For domestic studies, the Korean databases KoreaMed, KMBASE, and RISS4U may be included. Effort is required to identify not only published studies but also abstracts, ongoing studies, and studies awaiting publication. Among the studies retrieved in the search, the researchers remove duplicates, select studies that meet the inclusion/exclusion criteria based on the abstracts, and then make the final selection based on the full text. In order to maintain transparency and objectivity throughout this process, study selection is conducted independently by at least two investigators. When opinions are inconsistent, the disagreement is resolved through debate or by a third reviewer. The methods for this process also need to be planned in advance. It is essential to ensure the reproducibility of the literature selection process [25].

Quality of evidence

However well planned the systematic review or meta-analysis is, if the quality of evidence in the included studies is low, the quality of the meta-analysis decreases and incorrect results can be obtained [26]. Even when using randomized studies with a high quality of evidence, evaluating the quality of evidence precisely helps determine the strength of recommendations in the meta-analysis. One method of evaluating the quality of evidence in non-randomized studies is the Newcastle-Ottawa Scale, provided by the Ottawa Hospital Research Institute 1). However, this review focuses mostly on meta-analyses that use randomized studies.

If the Grading of Recommendations, Assessment, Development and Evaluations (GRADE) system ( http://www.gradeworkinggroup.org/ ) is used, the quality of evidence is evaluated on the basis of the study limitations, inaccuracies, incompleteness of outcome data, indirectness of evidence, and risk of publication bias, and this is used to determine the strength of recommendations [ 27 ]. As shown in Table 1 , the study limitations are evaluated using the “risk of bias” method proposed by Cochrane 2) . This method classifies bias in randomized studies as “low,” “high,” or “unclear” on the basis of the presence or absence of six processes (random sequence generation, allocation concealment, blinding participants or investigators, incomplete outcome data, selective reporting, and other biases) [ 28 ].

The Cochrane Collaboration’s Tool for Assessing the Risk of Bias [ 28 ]

Data extraction

Two different investigators extract data based on the objectives and form of the study; thereafter, the extracted data are reviewed. Since the size and format of each variable differ across studies, the size and format of the outcomes also differ, and slight changes may be required when combining the data [29]. If differences in the size and format of the outcome variables make it difficult to combine the data, such as the use of different evaluation instruments or different evaluation timepoints, the analysis may be limited to a systematic review. The investigators resolve differences of opinion by debate, and if they fail to reach a consensus, a third reviewer is consulted.

Data Analysis

The aim of a meta-analysis is to derive a conclusion with greater power and accuracy than could be achieved in any individual study. Therefore, before analysis, it is crucial to evaluate the direction of effect, size of effect, homogeneity of effects among studies, and strength of evidence [30]. Thereafter, the data are reviewed qualitatively and quantitatively. If it is determined that the different research outcomes cannot be combined, all the results and characteristics of the individual studies are displayed in a table or in a descriptive form; this is referred to as a qualitative review. A meta-analysis is a quantitative review, in which the clinical effectiveness is evaluated by calculating the weighted pooled estimate for the interventions in at least two separate studies.

The pooled estimate is the outcome of the meta-analysis, and is typically explained using a forest plot (Figs. 3 and 4). The black squares in the forest plot represent the odds ratios (ORs) and 95% confidence intervals in each study. The area of each square represents the weight given to that study in the meta-analysis. The black diamond represents the OR and 95% confidence interval calculated across all the included studies. The bold vertical line represents a lack of therapeutic effect (OR = 1); if the confidence interval includes OR = 1, it means no significant difference was found between the treatment and control groups.

Fig. 3. Forest plot analyzed by two different models using the same data. (A) Fixed-effect model. (B) Random-effect model. The figure depicts individual trials as filled squares with the relative sample size and the solid line as the 95% confidence interval of the difference. The diamond shape indicates the pooled estimate and uncertainty for the combined effect. The vertical line indicates the treatment group shows no effect (OR = 1). Moreover, if the confidence interval includes 1, then the result shows no evidence of difference between the treatment and control groups.

Fig. 4. Forest plot representing homogeneous data.
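To make the mapping between these plot elements and the underlying numbers concrete, the following minimal sketch draws a forest plot with Python and matplotlib. All study names, odds ratios, confidence intervals, weights, and the pooled estimate are hypothetical illustrations, not values taken from Figs. 3 or 4.

# Minimal forest-plot sketch using hypothetical data (not the values from Figs. 3-4).
import numpy as np
import matplotlib.pyplot as plt

studies = ["Study A", "Study B", "Study C"]           # hypothetical trials
or_     = np.array([0.80, 1.10, 0.65])                # odds ratios
ci_low  = np.array([0.55, 0.70, 0.40])                # lower 95% CI limits
ci_high = np.array([1.16, 1.73, 1.06])                # upper 95% CI limits
weights = np.array([35, 40, 25])                      # percentage weight of each study
pooled, pooled_low, pooled_high = 0.85, 0.68, 1.06    # hypothetical pooled OR and 95% CI

fig, ax = plt.subplots()
y = np.arange(len(studies), 0, -1)                    # one row per study, top to bottom
ax.scatter(or_, y, s=weights * 10, marker="s", color="black")   # squares sized by weight
ax.hlines(y, ci_low, ci_high, color="black")                    # 95% confidence intervals
ax.scatter([pooled], [0], marker="D", s=120, color="black")     # diamond: pooled estimate
ax.hlines(0, pooled_low, pooled_high, color="black")
ax.axvline(1.0, linewidth=1)                          # OR = 1: no treatment effect
ax.set_xscale("log")                                  # ORs are conventionally shown on a log scale
ax.set_yticks(list(y) + [0])
ax.set_yticklabels(studies + ["Pooled"])
ax.set_xlabel("Odds ratio (log scale)")
plt.show()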

Dichotomous variables and continuous variables

In data analysis, outcome variables can be considered broadly in terms of dichotomous variables and continuous variables. When combining data from continuous variables, the mean difference (MD) and standardized mean difference (SMD) are used ( Table 2 ).

Summary of Meta-analysis Methods Available in RevMan [ 28 ]

The MD is the absolute difference in mean values between the groups, and the SMD is the mean difference between groups divided by the standard deviation. When results are presented in the same units, the MD can be used, but when results are presented in different units, the SMD should be used. When the MD is used, the combined units must be shown. A value of “0” for the MD or SMD indicates that the effects of the new treatment method and the existing treatment method are the same. A value lower than “0” means the new treatment method is less effective than the existing method, and a value greater than “0” means the new treatment is more effective than the existing method.
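As a minimal illustration of this distinction, the sketch below computes the MD and a basic SMD (Cohen's d, using a pooled standard deviation) from hypothetical group summaries; note that meta-analysis software such as RevMan typically reports a small-sample-corrected SMD (Hedges' g).

import math

def mean_difference(mean_trt, mean_ctl):
    # MD: absolute difference in means; only meaningful when both groups use the same units.
    return mean_trt - mean_ctl

def standardized_mean_difference(mean_trt, sd_trt, n_trt, mean_ctl, sd_ctl, n_ctl):
    # SMD (Cohen's d): mean difference divided by the pooled standard deviation.
    pooled_sd = math.sqrt(((n_trt - 1) * sd_trt**2 + (n_ctl - 1) * sd_ctl**2)
                          / (n_trt + n_ctl - 2))
    return (mean_trt - mean_ctl) / pooled_sd

# Hypothetical example: pain scores on the same 0-10 scale, so the MD can be used.
print(mean_difference(3.1, 4.0))                                 # -0.9 points
# Hypothetical example: outcomes on different scales, so the SMD is needed.
print(standardized_mean_difference(3.1, 1.2, 50, 4.0, 1.5, 48))  # about -0.66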

When combining data for dichotomous variables, the OR, risk ratio (RR), or risk difference (RD) can be used. The RR and RD can be used for RCTs, quasi-experimental studies, or cohort studies, and the OR can be used for other case-control studies or cross-sectional studies. However, because the OR is difficult to interpret, using the RR and RD, if possible, is recommended. If the outcome variable is a dichotomous variable, it can be presented as the number needed to treat (NNT), which is the minimum number of patients who need to be treated in the intervention group, compared to the control group, for a given event to occur in at least one patient. Based on Table 3 , in an RCT, if x is the probability of the event occurring in the control group and y is the probability of the event occurring in the intervention group, then x = c/(c + d), y = a/(a + b), and the absolute risk reduction (ARR) = x − y. NNT can be obtained as the reciprocal, 1/ARR.

Calculation of the Number Needed to Treat in the Dichotomous table
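Using the notation above (a and b are the numbers of patients with and without the event in the intervention group, c and d in the control group), a minimal sketch of the ARR and NNT calculation with hypothetical counts:

def arr_and_nnt(a, b, c, d):
    # a, b: events / non-events in the intervention group; c, d: in the control group.
    y = a / (a + b)      # event probability in the intervention group
    x = c / (c + d)      # event probability in the control group
    arr = x - y          # absolute risk reduction
    nnt = 1 / arr        # number needed to treat (undefined when arr == 0)
    return arr, nnt

# Hypothetical counts: 10/100 events with the intervention vs. 25/100 with control.
arr, nnt = arr_and_nnt(10, 90, 25, 75)
print(f"ARR = {arr:.2f}, NNT = {nnt:.1f}")   # ARR = 0.15, NNT = 6.7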

Fixed-effect models and random-effect models

In order to analyze effect size, two types of models can be used: a fixed-effect model or a random-effect model. A fixed-effect model assumes that the effect of treatment is the same, and that variation between results in different studies is due to random error. Thus, a fixed-effect model can be used when the studies are considered to have the same design and methodology, or when the variability in results within a study is small, and the variance is thought to be due to random error. Three common methods are used for weighted estimation in a fixed-effect model: 1) inverse variance-weighted estimation 3) , 2) Mantel-Haenszel estimation 4) , and 3) Peto estimation 5) .
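A minimal sketch of inverse variance-weighted fixed-effect pooling, assuming each study is summarized by an effect estimate (here a log odds ratio) and its standard error; the three studies and their numbers are hypothetical:

import numpy as np

def fixed_effect_pool(effects, ses):
    # Inverse variance-weighted fixed-effect pooled estimate and its standard error.
    effects, ses = np.asarray(effects, float), np.asarray(ses, float)
    w = 1.0 / ses**2                           # weight = inverse of the within-study variance
    pooled = np.sum(w * effects) / np.sum(w)
    pooled_se = np.sqrt(1.0 / np.sum(w))
    return pooled, pooled_se

# Hypothetical log odds ratios and standard errors from three trials.
log_or = [-0.22, 0.10, -0.43]
se = [0.20, 0.15, 0.30]
pooled, pooled_se = fixed_effect_pool(log_or, se)
print(np.exp(pooled),                          # pooled OR
      np.exp(pooled - 1.96 * pooled_se),       # lower 95% CI limit
      np.exp(pooled + 1.96 * pooled_se))       # upper 95% CI limit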

A random-effect model assumes heterogeneity between the studies being combined, and these models are used when the studies are assumed to be different, even if a heterogeneity test does not show a significant result. Unlike a fixed-effect model, a random-effect model assumes that the size of the effect of treatment differs among studies. Thus, differences in variation among studies are thought to be due not only to random error but also to between-study variability in results. Therefore, weight does not decrease greatly for studies with a small number of patients. Among the methods for weighted estimation in a random-effect model, the DerSimonian and Laird method 6), the simplest, is mostly used for dichotomous variables, while inverse variance-weighted estimation is used for continuous variables, as with fixed-effect models. These four methods are all available in Review Manager software (The Cochrane Collaboration, UK), and are described in a study by Deeks et al. [31] (Table 2). However, when the number of studies included in the analysis is less than 10, the Hartung-Knapp-Sidik-Jonkman method 7) can better reduce the risk of type 1 error than the DerSimonian and Laird method [32].
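For comparison, a minimal sketch of the DerSimonian and Laird estimator: the between-study variance tau² is estimated from Cochran's Q and added to each study's within-study variance before weighting. The inputs are the same hypothetical log odds ratios used above.

import numpy as np

def dersimonian_laird_pool(effects, ses):
    # Random-effect pooled estimate using the DerSimonian and Laird tau^2 estimator.
    effects, ses = np.asarray(effects, float), np.asarray(ses, float)
    w = 1.0 / ses**2
    fixed = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - fixed)**2)             # Cochran's Q
    df = len(effects) - 1
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)                    # between-study variance, truncated at 0
    w_star = 1.0 / (ses**2 + tau2)                   # random-effect weights
    pooled = np.sum(w_star * effects) / np.sum(w_star)
    pooled_se = np.sqrt(1.0 / np.sum(w_star))
    return pooled, pooled_se, tau2

print(dersimonian_laird_pool([-0.22, 0.10, -0.43], [0.20, 0.15, 0.30]))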

Fig. 3 shows the results of analyzing outcome data using a fixed-effect model (A) and a random-effect model (B). As shown in Fig. 3, while the results from large studies are weighted more heavily in the fixed-effect model, studies are given relatively similar weights irrespective of study size in the random-effect model. Although identical data were analyzed, as shown in Fig. 3, the significant result in the fixed-effect model was no longer significant in the random-effect model. One representative example of the small study effect in a random-effect model is the meta-analysis by Li et al. [33]. In a large-scale study, intravenous injection of magnesium was unrelated to acute myocardial infarction, but in the random-effect model, which included numerous small studies, the small study effect resulted in an association being found between intravenous injection of magnesium and myocardial infarction. This small study effect can be controlled for by using a sensitivity analysis, which is performed to examine the contribution of each of the included studies to the final meta-analysis result. In particular, when heterogeneity is suspected in the study methods or results, a sensitivity analysis that changes certain data or analytical methods makes it possible to verify whether such changes affect the robustness of the results, and to examine the causes of such effects [34].

Heterogeneity

A homogeneity test examines whether the degree of heterogeneity among the effect sizes calculated from several studies is greater than would be expected from sampling error alone. This makes it possible to test whether the effect sizes calculated from the different studies are the same. Three approaches can be used to assess homogeneity: 1) the forest plot, 2) Cochran's Q test (chi-squared), and 3) the Higgins I² statistic. In the forest plot, as shown in Fig. 4, greater overlap between the confidence intervals indicates greater homogeneity. For the Q statistic, when the P value of the chi-squared test, calculated from the forest plot in Fig. 4, is less than 0.1, it is considered to show statistical heterogeneity and a random-effect model can be used. Finally, I² can be used [35].

I², calculated as I² = 100% × (Q − df) / Q (where Q is Cochran's Q statistic and df its degrees of freedom), takes a value between 0% and 100%. A value less than 25% is considered to show strong homogeneity, a value of around 50% is moderate, and a value greater than 75% indicates strong heterogeneity.
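A minimal sketch of the Q test and the I² statistic described above, reusing the hypothetical effect estimates from the pooling examples; the P value comes from the chi-squared distribution with k − 1 degrees of freedom.

import numpy as np
from scipy import stats

def heterogeneity(effects, ses):
    # Cochran's Q, its chi-squared P value, and the Higgins I^2 statistic.
    effects, ses = np.asarray(effects, float), np.asarray(ses, float)
    w = 1.0 / ses**2
    pooled = np.sum(w * effects) / np.sum(w)          # fixed-effect pooled estimate
    q = np.sum(w * (effects - pooled)**2)
    df = len(effects) - 1
    p_value = stats.chi2.sf(q, df)                    # P < 0.1 is taken to suggest heterogeneity
    i2 = max(0.0, 100.0 * (q - df) / q) if q > 0 else 0.0
    return q, p_value, i2

print(heterogeneity([-0.22, 0.10, -0.43], [0.20, 0.15, 0.30]))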

Even when the data cannot be shown to be homogeneous, a fixed-effect model can be used, ignoring the heterogeneity, or all the study results can be presented individually without being combined. However, in many cases, a random-effect model is applied, as described above, and a subgroup analysis or meta-regression analysis is performed to explain the heterogeneity. In a subgroup analysis, the data are divided into subgroups that are expected to be homogeneous, and these subgroups are analyzed separately. This needs to be planned in the predetermined protocol before starting the meta-analysis. A meta-regression analysis is similar to a normal regression analysis, except that the heterogeneity between studies is modeled. This process involves performing a regression analysis of the pooled estimate on covariates at the study level, and so it is usually not considered when the number of studies is less than 10. Here, univariate and multivariate regression analyses can both be considered.
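As an illustration only, the sketch below fits a univariate meta-regression as a weighted least-squares regression of study effect sizes on one hypothetical study-level covariate (mean patient age). A real analysis would use many more than three studies, as noted above, and would usually also model residual between-study variance as in a random-effect model.

import numpy as np

def meta_regression(effects, ses, covariate):
    # Weighted least-squares intercept and slope of effect size on a study-level covariate.
    y = np.asarray(effects, float)
    x = np.asarray(covariate, float)
    w = 1.0 / np.asarray(ses, float)**2            # inverse-variance weights
    X = np.column_stack([np.ones_like(x), x])      # design matrix: intercept + covariate
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta                                    # [intercept, slope]

# Hypothetical: does the log odds ratio vary with mean patient age?
print(meta_regression([-0.22, 0.10, -0.43], [0.20, 0.15, 0.30], [55, 62, 48]))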

Publication bias

Publication bias is the most common type of reporting bias in meta-analyses. This refers to the distortion of meta-analysis outcomes due to the higher likelihood of publication of statistically significant studies rather than non-significant studies. In order to test the presence or absence of publication bias, first, a funnel plot can be used ( Fig. 5 ). Studies are plotted on a scatter plot with effect size on the x-axis and precision or total sample size on the y-axis. If the points form an upside-down funnel shape, with a broad base that narrows towards the top of the plot, this indicates the absence of a publication bias ( Fig. 5A ) [ 29 , 36 ]. On the other hand, if the plot shows an asymmetric shape, with no points on one side of the graph, then publication bias can be suspected ( Fig. 5B ). Second, to test publication bias statistically, Begg and Mazumdar’s rank correlation test 8) [ 37 ] or Egger’s test 9) [ 29 ] can be used. If publication bias is detected, the trim-and-fill method 10) can be used to correct the bias [ 38 ]. Fig. 6 displays results that show publication bias in Egger’s test, which has then been corrected using the trim-and-fill method using Comprehensive Meta-Analysis software (Biostat, USA).
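A minimal sketch of Egger's regression test as described in footnote 9): each study's standard normal deviate (effect estimate divided by its standard error) is regressed on its precision (1/standard error), and an intercept far from zero suggests funnel plot asymmetry. The simulated effects and standard errors are hypothetical.

import numpy as np
from scipy import stats

def eggers_test(effects, ses):
    # Egger's test: regress effect/SE on 1/SE; return the intercept and its two-sided P value.
    effects, ses = np.asarray(effects, float), np.asarray(ses, float)
    y = effects / ses                              # standard normal deviate of each study
    x = 1.0 / ses                                  # precision
    X = np.column_stack([np.ones_like(x), x])      # intercept + precision
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    n, k = len(y), 2
    resid = y - X @ beta
    sigma2 = np.sum(resid**2) / (n - k)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t = beta[0] / np.sqrt(cov[0, 0])               # t statistic for the intercept
    p = 2 * stats.t.sf(abs(t), n - k)
    return beta[0], p

# Ten hypothetical studies with simulated effects and standard errors.
rng = np.random.default_rng(0)
ses = rng.uniform(0.1, 0.5, 10)
effects = rng.normal(-0.2, ses)
print(eggers_test(effects, ses))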

Fig. 5. Funnel plot showing the effect size on the x-axis and sample size on the y-axis as a scatter plot. (A) Funnel plot without publication bias. The individual plots are broader at the bottom and narrower at the top. (B) Funnel plot with publication bias. The individual plots are located asymmetrically.

Fig. 6. Funnel plot adjusted using the trim-and-fill method. White circles: comparisons included. Black circles: imputed comparisons added by the trim-and-fill method. White diamond: pooled observed log risk ratio. Black diamond: pooled imputed log risk ratio.

Result Presentation

When reporting the results of a systematic review or meta-analysis, the analytical content and methods should be described in detail. First, a flowchart is displayed with the literature search and selection process according to the inclusion/exclusion criteria. Second, a table is shown with the characteristics of the included studies. A table should also be included with information related to the quality of evidence, such as GRADE ( Table 4 ). Third, the results of data analysis are shown in a forest plot and funnel plot. Fourth, if the results use dichotomous data, the NNT values can be reported, as described above.

The GRADE Evidence Quality for Each Outcome

N: number of studies, ROB: risk of bias, PON: postoperative nausea, POV: postoperative vomiting, PONV: postoperative nausea and vomiting, CI: confidence interval, RR: risk ratio, AR: absolute risk.

When Review Manager software (The Cochrane Collaboration, UK) is used for the analysis, two types of P values are given. The first is the P value from the z-test, which tests the null hypothesis that the intervention has no effect. The second P value is from the chi-squared test, which tests the null hypothesis for a lack of heterogeneity. The statistical result for the intervention effect, which is generally considered the most important result in meta-analyses, is the z-test P value.

A common mistake when reporting results is, given a z-test P value greater than 0.05, to say there was "no statistical significance" or "no difference." When evaluating statistical significance in a meta-analysis, a P value lower than 0.05 can be explained as "a significant difference in the effects of the two treatment methods." However, the P value may appear non-significant whether or not there is a difference between the two treatment methods. In such a situation, it is better to state that "there was no strong evidence for an effect," and to present the P value and confidence intervals. Another common mistake is to think that a smaller P value indicates a more significant effect. In meta-analyses of large-scale studies, the P value is affected more by the number of studies and patients included than by the significance of the results; therefore, care should be taken when interpreting the P values from a meta-analysis.

When performing a systematic literature review or meta-analysis, if the quality of the included studies is not properly evaluated or if proper methodology is not strictly applied, the results can be biased and the outcomes can be incorrect. However, when systematic reviews and meta-analyses are properly implemented, they can yield powerful results that would usually require large-scale RCTs, which are difficult to perform as individual studies. As our understanding of evidence-based medicine increases and its importance is better appreciated, the number of systematic reviews and meta-analyses will keep increasing. However, indiscriminate acceptance of the results of all these meta-analyses can be dangerous, and hence, we recommend that their results be interpreted critically on the basis of an accurate understanding of the methods.

1) http://www.ohri.ca .

2) http://methods.cochrane.org/bias/assessing-risk-bias-included-studies .

3) The inverse variance-weighted estimation method is useful if the number of studies is small with large sample sizes.

4) The Mantel-Haenszel estimation method is useful if the number of studies is large with small sample sizes.

5) The Peto estimation method is useful if the event rate is low or one of the two groups shows zero incidence.

6) The most popular and simplest statistical method used in Review Manager and Comprehensive Meta-analysis software.

7) Alternative random-effect model meta-analysis that has more adequate error rates than does the common DerSimonian and Laird method, especially when the number of studies is small. However, even with the Hartung-Knapp-Sidik-Jonkman method, when there are fewer than five studies with very unequal sizes, extra caution is needed.

8) The Begg and Mazumdar rank correlation test uses the correlation between the ranks of effect sizes and the ranks of their variances [ 37 ].

9) The degree of funnel plot asymmetry as measured by the intercept from the regression of standard normal deviates against precision [ 29 ].

10) If there are more small studies on one side, we expect the suppression of studies on the other side. Trimming yields the adjusted effect size and reduces the variance of the effects by adding the original studies back into the analysis as a mirror image of each study.

1.2.2  What is a systematic review?

A systematic review attempts to collate all empirical evidence that fits pre-specified eligibility criteria in order to answer a specific research question. It uses explicit, systematic methods that are selected with a view to minimizing bias, thus providing more reliable findings from which conclusions can be drawn and decisions made (Antman 1992, Oxman 1993). The key characteristics of a systematic review are:

  • a clearly stated set of objectives with pre-defined eligibility criteria for studies;
  • an explicit, reproducible methodology;
  • a systematic search that attempts to identify all studies that would meet the eligibility criteria;
  • an assessment of the validity of the findings of the included studies, for example through the assessment of risk of bias; and
  • a systematic presentation, and synthesis, of the characteristics and findings of the included studies.

Many systematic reviews contain meta-analyses. Meta-analysis is the use of statistical methods to summarize the results of independent studies (Glass 1976). By combining information from all relevant studies, meta-analyses can provide more precise estimates of the effects of health care than those derived from the individual studies included within a review (see Chapter 9, Section 9.1.3 ). They also facilitate investigations of the consistency of evidence across studies, and the exploration of differences across studies.

Introduction to Systematic Reviews


What is a Systematic Review?

Knowledge synthesis is a term used to describe the method of synthesizing results from individual studies and interpreting these results within the larger body of knowledge on the topic. It requires highly structured, transparent and reproducible methods using quantitative and/or qualitative evidence. Systematic reviews, meta-analyses, scoping reviews, rapid reviews, narrative syntheses, and practice guidelines, among others, are all forms of knowledge synthesis.

A systematic review differs from an ordinary literature review in that it uses a comprehensive, methodical, transparent and reproducible search strategy to ensure conclusions are as unbiased and as close to the truth as possible. The Cochrane Handbook for Systematic Reviews of Interventions defines a systematic review as:

"A systematic review attempts to identify, appraise and synthesize all the empirical evidence that meets pre-specified eligibility criteria to answer a given research question. Researchers conducting systematic reviews use explicit methods aimed at minimizing bias, in order to produce more reliable findings that can be used to inform decision making [...] This involves: the a priori specification of a research question; clarity on the scope of the review and which studies are eligible for inclusion; making every effort to find all relevant research and to ensure that issues of bias in included studies are accounted for; and analysing the included studies in order to draw conclusions based on all the identified research in an impartial and objective way." ( Chapter 1: Starting a review )


  • Next: Lane Research Services >>
  • Last Updated: Nov 1, 2023 2:50 PM
  • URL: https://laneguides.stanford.edu/systematicreviews


What is a Systematic Review?


A systematic review attempts to collate all empirical evidence that fits pre-specified eligibility criteria in order to answer a specific research question. The key characteristics of a systematic review are:

  • a clearly defined question with inclusion and exclusion criteria;
  • a rigorous and systematic search of the literature;
  • two phases of screening (blinded, at least two independent screeners);
  • data extraction and management;
  • analysis and interpretation of results;
  • risk of bias assessment of included studies;
  • and report for publication.



Systematic Reviews & Evidence Synthesis Methods


What is a Systematic Review?

A systematic review gathers, assesses, and synthesizes all available empirical research on a specific question using a comprehensive search method with an aim to minimize bias.

Or, put another way:

A systematic review begins with a specific research question. Authors of the review gather and evaluate all experimental studies that address the question. Bringing together the findings of these separate studies allows the review authors to make new conclusions from what has been learned.

*The key characteristics of a systematic review are:

  • A clearly stated set of objectives with pre-defined eligibility criteria for studies;
  • An explicit, reproducible methodology;
  • A systematic search that attempts to identify all relevant research;
  • A critical appraisal of the included studies;
  • A clear and objective synthesis and presentation of the characteristics and findings of the included studies.

*Lasserson T, Thomas J, Higgins JPT. Chapter 1: Starting a review. In Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors).  Cochrane Handbook for Systematic Reviews of Interventions  version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook .

What is the difference between an evidence synthesis and a systematic review? A systematic review is one type of evidence synthesis; indeed, any literature review is a form of evidence synthesis.

Systematic reviews are usually done as a team project, requiring cooperation and a commitment of (lots of) time and effort over an extended period. You will need at least 3 people and, depending on the scope of the project and the size of the database result sets, you should plan for 6-24 months from start to completion.

Things to Know Before You Begin . . .

Run exploratory searches on the topic to get a sense of the plausibility of your project.

A systematic review requires a research question that is already well covered in the primary literature. If there has been little previous work on the topic, there will be little to analyze and conclusions will be hard to draw.

A narrowly-focused research question may add little to the knowledge of the field of study.

Make sure someone else has not already written a recent systematic review on your topic and that a similar systematic review project is not already in progress.

Team members will need to use research databases for searching the literature. If these databases are not available through library subscriptions or freely available, their use may require payment or travel.

It is extremely important to develop a protocol for your project.

Tools such as a reference manager and a screening tool will save time.  



Systematic Reviews (in the Health Sciences)


A systematic review is a research method that attempts to identify, appraise and synthesize all the empirical evidence that meets pre-specified eligibility criteria to answer a specific research question. Researchers conducting systematic reviews use explicit, systematic methods that are selected with a view to minimizing bias, in order to produce more reliable findings that inform decision making.

Systematic reviews should be conducted by a team of researchers, at least one of whom has significant knowledge of research in the subject area; they cannot be done alone. Systematic reviews follow established standards and methodologies for their conduct and reporting.

  • What is a Systematic Review? (About Cochrane Reviews)

Not all questions are appropriate for a systematic review. Depending on your question, another type of review, such as a scoping review or literature review, may be more appropriate. 

A systematic review research question should be a well-formulated, clearly defined clinical question, commonly in PICO format.


  • Decision Tree: What Type of Review is Right for You?
  • Interactive Tool: Which Review is Right for You?
  • PICO: Asking Focused Questions
  • A typology of reviews: an analysis of 14 review types and associated methodologies.



Cochrane Training

Chapter 1: Starting a review

Toby J Lasserson, James Thomas, Julian PT Higgins

Key Points:

  • Systematic reviews address a need for health decision makers to be able to access high quality, relevant, accessible and up-to-date information.
  • Systematic reviews aim to minimize bias through the use of pre-specified research questions and methods that are documented in protocols, and by basing their findings on reliable research.
  • Systematic reviews should be conducted by a team that includes domain expertise and methodological expertise, who are free of potential conflicts of interest.
  • People who might make – or be affected by – decisions around the use of interventions should be involved in important decisions about the review.
  • Good data management, project management and quality assurance mechanisms are essential for the completion of a successful systematic review.

Cite this chapter as: Lasserson TJ, Thomas J, Higgins JPT. Chapter 1: Starting a review. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook .

1.1 Why do a systematic review?

Systematic reviews were developed out of a need to ensure that decisions affecting people’s lives can be informed by an up-to-date and complete understanding of the relevant research evidence. With the volume of research literature growing at an ever-increasing rate, it is impossible for individual decision makers to assess this vast quantity of primary research to enable them to make the most appropriate healthcare decisions that do more good than harm. By systematically assessing this primary research, systematic reviews aim to provide an up-to-date summary of the state of research knowledge on an intervention, diagnostic test, prognostic factor or other health or healthcare topic. Systematic reviews address the main problem with ad hoc searching and selection of research, namely that of bias. Just as primary research studies use methods to avoid bias, so should summaries and syntheses of that research.

A systematic review attempts to collate all the empirical evidence that fits pre-specified eligibility criteria in order to answer a specific research question. It uses explicit, systematic methods that are selected with a view to minimizing bias, thus providing more reliable findings from which conclusions can be drawn and decisions made (Antman et al 1992, Oxman and Guyatt 1993). Systematic review methodology, pioneered and developed by Cochrane, sets out a highly structured, transparent and reproducible methodology (Chandler and Hopewell 2013). This involves: the a priori specification of a research question; clarity on the scope of the review and which studies are eligible for inclusion; making every effort to find all relevant research and to ensure that issues of bias in included studies are accounted for; and analysing the included studies in order to draw conclusions based on all the identified research in an impartial and objective way.

This Handbook is about systematic reviews on the effects of interventions, and specifically about methods used by Cochrane to undertake them. Cochrane Reviews use primary research to generate new knowledge about the effects of an intervention (or interventions) used in clinical, public health or policy settings. They aim to provide users with a balanced summary of the potential benefits and harms of interventions and give an indication of how certain they can be of the findings. They can also compare the effectiveness of different interventions with one another and so help users to choose the most appropriate intervention in particular situations. The primary purpose of Cochrane Reviews is therefore to inform people making decisions about health or health care.

Systematic reviews are important for other reasons. New research should be designed or commissioned only if it does not unnecessarily duplicate existing research (Chalmers et al 2014). Therefore, a systematic review should typically be undertaken before embarking on new primary research. Such a review will identify current and ongoing studies, as well as indicate where specific gaps in knowledge exist, or evidence is lacking; for example, where existing studies have not used outcomes that are important to users of research (Macleod et al 2014). A systematic review may also reveal limitations in the conduct of previous studies that might be addressed in the new study or studies.

Systematic reviews are important, often rewarding and, at times, exciting research projects. They offer the opportunity for authors to make authoritative statements about the extent of human knowledge in important areas and to identify priorities for further research. They sometimes cover issues high on the political agenda and receive attention from the media. Conducting research with these impacts is not without its challenges, however, and completing a high-quality systematic review is often demanding and time-consuming. In this chapter we introduce some of the key considerations for potential review authors who are about to start a systematic review.

1.2 What is the review question?

Getting the research question right is critical for the success of a systematic review. Review authors should ensure that the review addresses an important question to those who are expected to use and act upon its conclusions.

We discuss the formulation of questions in detail in Chapter 2 . For a question about the effects of an intervention, the PICO approach is usually used, which is an acronym for Population, Intervention, Comparison(s) and Outcome. Reviews may have additional questions, for example about how interventions were implemented, economic issues, equity issues or patient experience.

To ensure that the review addresses a relevant question in a way that benefits users, it is important to ensure wide input. In most cases, question formulation should therefore be informed by people with various relevant – but potentially different – perspectives (see Chapter 2, Section 2.4 ).

1.3 Who should do a systematic review?

Systematic reviews should be undertaken by a team. Indeed, Cochrane will not publish a review that is proposed to be undertaken by a single person. Working as a team not only spreads the effort, but ensures that tasks such as the selection of studies for eligibility, data extraction and rating the certainty of the evidence will be performed by at least two people independently, minimizing the likelihood of errors. First-time review authors are encouraged to work with others who are experienced in the process of systematic reviews and to attend relevant training.

Review teams must include expertise in the topic area under review. Topic expertise should not be overly narrow, to ensure that all relevant perspectives are considered. Perspectives from different disciplines can help to avoid assumptions or terminology stemming from an over-reliance on a single discipline. Review teams should also include expertise in systematic review methodology, including statistical expertise.

Arguments have been made that methodological expertise is sufficient to perform a review, and that content expertise should be avoided because of the risk of preconceptions about the effects of interventions (Gøtzsche and Ioannidis 2012). However, it is important that both topic and methodological expertise is present to ensure a good mix of skills, knowledge and objectivity, because topic expertise provides important insight into the implementation of the intervention(s), the nature of the condition being treated or prevented, the relationships between outcomes measured, and other factors that may have an impact on decision making.

A Cochrane Review should represent an independent assessment of the evidence and avoiding financial and non-financial conflicts of interest often requires careful management. It will be important to consider if there are any relevant interests that may constitute a conflict of interest. There are situations where employment, holding of patents and other financial support should prevent people joining an author team. Funding of Cochrane Reviews by commercial organizations with an interest in the outcome of the review is not permitted. To ensure that any issues are identified early in the process, authors planning Cochrane Reviews should consult the Conflict of Interest Policy . Authors should make complete declarations of interest before registration of the review, and refresh these annually thereafter until publication and just prior to publication of the protocol and the review. For authors of review updates, this must be done at the time of the decision to update the review, annually thereafter until publication, and just prior to publication. Authors should also update declarations of interest at any point when their circumstances change.

1.3.1 Involving consumers and other stakeholders

Because the priorities of decision makers and consumers may be different from those of researchers, it is important that review authors consider carefully what questions are important to these different stakeholders. Systematic reviews are more likely to be relevant to a broad range of end users if they are informed by the involvement of people with a range of experiences, in terms of both the topic and the methodology (Thomas et al 2004, Rees and Oliver 2017). Engaging consumers and other stakeholders, such as policy makers, research funders and healthcare professionals, increases relevance, promotes mutual learning, improves uptake and decreases research waste.

Mapping out all potential stakeholders specific to the review question is a helpful first step to considering who might be invited to be involved in a review. Stakeholders typically include: patients and consumers; consumer advocates; policy makers and other public officials; guideline developers; professional organizations; researchers; funders of health services and research; healthcare practitioners, and, on occasion, journalists and other media professionals. Balancing seniority, credibility within the given field, and diversity should be considered. Review authors should also take account of the needs of resource-poor countries and regions in the review process (see Chapter 16 ) and invite appropriate input on the scope of the review and the questions it will address.

It is established good practice to ensure that consumers are involved and engaged in health research, including systematic reviews. Cochrane uses the term ‘consumers’ to refer to a wide range of people, including patients or people with personal experience of a healthcare condition, carers and family members, representatives of patients and carers, service users and members of the public. In 2017, a Statement of Principles for consumer involvement in Cochrane was agreed. This seeks to change the culture of research practice to one where both consumers and other stakeholders are joint partners in research from planning, conduct, and reporting to dissemination. Systematic reviews that have had consumer involvement should be more directly applicable to decision makers than those that have not (see online Chapter II ).

1.3.2 Working with consumers and other stakeholders

Methods for working with consumers and other stakeholders include surveys, workshops, focus groups and involvement in advisory groups. Decisions about what methods to use will typically be based on resource availability, but review teams should be aware of the merits and limitations of such methods. Authors will need to decide who to involve and how to provide adequate support for their involvement. This can include financial reimbursement, the provision of training, and stating clearly expectations of involvement, possibly in the form of terms of reference.

While a small number of consumers or other stakeholders may be part of the review team and become co-authors of the subsequent review, it is sometimes important to bring in a wider range of perspectives and to recognize that not everyone has the capacity or interest in becoming an author. Advisory groups offer a convenient approach to involving consumers and other relevant stakeholders, especially for topics in which opinions differ. Important points to ensure successful involvement include the following.

  • The review team should co-ordinate the input of the advisory group to inform key review decisions.
  • The advisory group’s input should continue throughout the systematic review process to ensure relevance of the review to end users is maintained.
  • Advisory group membership should reflect the breadth of the review question, and consideration should be given to involving vulnerable and marginalized people (Steel 2004) to ensure that conclusions on the value of the interventions are well-informed and applicable to all groups in society (see Chapter 16 ).

Templates such as terms of reference, job descriptions, or person specifications for an advisory group help to ensure clarity about the task(s) required and are available from INVOLVE . The website also gives further information on setting and organizing advisory groups. See also the Cochrane training website for further resources to support consumer involvement.

1.4 The importance of reliability

Systematic reviews aim to be an accurate representation of the current state of knowledge about a given issue. As understanding improves, the review can be updated. Nevertheless, it is important that the review itself is accurate at the time of publication. There are two main reasons for this imperative for accuracy. First, health decisions that affect people’s lives are increasingly taken based on systematic review findings. Current knowledge may be imperfect, but decisions will be better informed when taken in the light of the best of current knowledge. Second, systematic reviews form a critical component of legal and regulatory frameworks; for example, drug licensing or insurance coverage. Here, systematic reviews also need to hold up as auditable processes for legal examination. As systematic reviews need to be both correct, and be seen to be correct, detailed evidence-based methods have been developed to guide review authors as to the most appropriate procedures to follow, and what information to include in their reports to aid auditability.

1.4.1 Expectations for the conduct and reporting of Cochrane Reviews

Cochrane has developed methodological expectations for the conduct, reporting and updating of systematic reviews of interventions (MECIR) and their plain language summaries ( Plain Language Expectations for Authors of Cochrane Summaries ; PLEACS). Developed collaboratively by methodologists and Cochrane editors, they are intended to describe the desirable attributes of a Cochrane Review. The expectations are not all relevant at the same stage of review conduct, so care should be taken to identify those that are relevant at specific points during the review. Different methods should be used at different stages of the review in terms of the planning, conduct, reporting and updating of the review.

Each expectation has a title, a rationale and an elaboration. For the purposes of publication of a review with Cochrane, each has the status of either ‘mandatory’ or ‘highly desirable’. Items described as mandatory are expected to be applied, and if they are not then an appropriate justification should be provided; failure to implement such items may be used as a basis for deciding not to publish a review in the Cochrane Database of Systematic Reviews (CDSR). Items described as highly desirable should generally be implemented, but there are reasonable exceptions and justifications are not required.

All MECIR expectations for the conduct of a review are presented in the relevant chapters of this Handbook . Expectations for reporting of completed reviews (including PLEACS) are described in online Chapter III . The recommendations provided in the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) Statement have been incorporated into the Cochrane reporting expectations, ensuring compliance with the PRISMA recommendations and summarizing attributes of reporting that should allow a full assessment of the methods and findings of the review (Moher et al 2009).

1.5 Protocol development

Preparing a systematic review is complex and involves many judgements. To minimize the potential for bias in the review process, these judgements should be made as far as possible in ways that do not depend on the findings of the studies included in the review. Review authors’ prior knowledge of the evidence may, for example, influence the definition of a systematic review question, the choice of criteria for study eligibility, or the pre-specification of intervention comparisons and outcomes to analyse. It is important that the methods to be used should be established and documented in advance (see MECIR Box 1.5.a , MECIR Box 1.5.b and MECIR Box 1.5.c ).

Publication of a protocol for a review that is written without knowledge of the available studies reduces the impact of review authors’ biases, promotes transparency of methods and processes, reduces the potential for duplication, allows peer review of the planned methods before they have been completed, and offers an opportunity for the review team to plan resources and logistics for undertaking the review itself. All chapters in the Handbook should be consulted when drafting the protocol. Since systematic reviews are by their nature retrospective, an element of knowledge of the evidence is often inevitable. This is one reason why non-content experts such as methodologists should be part of the review team (see Section 1.3 ). Two exceptions to the retrospective nature of a systematic review are a meta-analysis of a prospectively planned series of trials and some living systematic reviews, as described in Chapter 22 .

The review question should determine the methods used in the review, and not vice versa. The question may concern a relatively straightforward comparison of one treatment with another; or it may necessitate plans to compare different treatments as part of a network meta-analysis, or assess differential effects of an intervention in different populations or delivered in different ways.

The protocol sets out the context in which the review is being conducted. It presents an opportunity to develop ideas that are foundational for the review. This concerns, most explicitly, definition of the eligibility criteria such as the study participants and the choice of comparators and outcomes. The eligibility criteria may also be defined following the development of a logic model (or an articulation of the aspects of an existing logic model that the review is addressing) to explain how the intervention might work (see Chapter 2, Section 2.5.1 ).

MECIR Box 1.5.a Relevant expectations for conduct of intervention reviews

A key purpose of the protocol is to make plans to minimize bias in the eventual findings of the review. Reliable synthesis of available evidence requires a planned, systematic approach. Threats to the validity of systematic reviews can come from the studies they include or the process by which reviews are conducted. Biases within the studies can arise from the method by which participants are allocated to the intervention groups, awareness of intervention group assignment, and the collection, analysis and reporting of data. Methods for examining these issues should be specified in the protocol. Review processes can generate bias through a failure to identify an unbiased (and preferably complete) set of studies, and poor quality assurance throughout the review. The availability of research may be influenced by the nature of the results (i.e. reporting bias). To reduce the impact of this form of bias, searching may need to include unpublished sources of evidence (Dwan et al 2013) ( MECIR Box 1.5.b ).

MECIR Box 1.5.b Relevant expectations for the conduct of intervention reviews

Developing a protocol for a systematic review has benefits beyond reducing bias. Investing effort in designing a systematic review will make the process more manageable and help to inform key priorities for the review. Defining the question, referring to it throughout, and using appropriate methods to address the question focuses the analysis and reporting, ensuring the review is most likely to inform treatment decisions for funders, policy makers, healthcare professionals and consumers. Details of the planned analyses, including investigations of variability across studies, should be specified in the protocol, along with methods for interpreting the results through the systematic consideration of factors that affect confidence in estimates of intervention effect ( MECIR Box 1.5.c ).

MECIR Box 1.5.c Relevant expectations for conduct of intervention reviews

While the intention should be that a review will adhere to the published protocol, changes in a review protocol are sometimes necessary. This is also the case for a protocol for a randomized trial, which must sometimes be changed to adapt to unanticipated circumstances such as problems with participant recruitment, data collection or event rates. While every effort should be made to adhere to a predetermined protocol, this is not always possible or appropriate. It is important, however, that changes in the protocol should not be made based on how they affect the outcome of the research study, whether it is a randomized trial or a systematic review. Post hoc decisions made when the impact on the results of the research is known, such as excluding selected studies from a systematic review, or changing the statistical analysis, are highly susceptible to bias and should therefore be avoided unless there are reasonable grounds for doing this.

Enabling access to a protocol through publication (all Cochrane Protocols are published in the CDSR ) and registration on the PROSPERO register of systematic reviews reduces duplication of effort and research waste, and promotes accountability. Changes to the methods outlined in the protocol should be transparently declared.

This Handbook provides details of the systematic review methods developed or selected by Cochrane. They are intended to address the need for rigour, comprehensiveness and transparency in preparing a Cochrane systematic review. All relevant chapters – including those describing procedures to be followed in the later stages of the review – should be consulted during the preparation of the protocol. A more specific description of the structure of Cochrane Protocols is provided in online Chapter II .

1.6 Data management and quality assurance

Systematic reviews should be replicable, and retaining a record of the inclusion decisions, data collection, transformations or adjustment of data will help to establish a secure and retrievable audit trail. They can be operationally complex projects, often involving large research teams operating in different sites across the world. Good data management processes are essential to ensure that data are not inadvertently lost, facilitating the identification and correction of errors and supporting future efforts to update and maintain the review. Transparent reporting of review decisions enables readers to assess the reliability of the review for themselves.

Review management software, such as Covidence and EPPI-Reviewer , can be used to assist data management and maintain consistent and standardized records of decisions made throughout the review. These tools offer a central repository for review data that can be accessed remotely throughout the world by members of the review team. They record independent assessment of studies for inclusion, risk of bias and extraction of data, enabling checks to be made later in the process if needed. Research has shown that even experienced reviewers make mistakes and disagree with one another on risk-of-bias assessments, so it is particularly important to maintain quality assurance here, despite its cost in terms of author time. As more sophisticated information technology tools begin to be deployed in reviews (see Chapter 4, Section 4.6.6.2 and Chapter 22, Section 22.2.4 ), it is increasingly apparent that all review data – including the initial decisions about study eligibility – have value beyond the scope of the individual review. For example, review updates can be made more efficient through (semi-) automation when data from the original review are available for machine learning.

1.7 Chapter information

Authors: Toby J Lasserson, James Thomas, Julian PT Higgins

Acknowledgements: This chapter builds on earlier versions of the Handbook . We would like to thank Ruth Foxlee, Richard Morley, Soumyadeep Bhaumik, Mona Nasser, Dan Fox and Sally Crowe for their contributions to Section 1.3 .

Funding: JT is supported by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care North Thames at Barts Health NHS Trust. JPTH is a member of the NIHR Biomedical Research Centre at University Hospitals Bristol NHS Foundation Trust and the University of Bristol. JPTH received funding from National Institute for Health Research Senior Investigator award NF-SI-0617-10145. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

1.8 References

Antman E, Lau J, Kupelnick B, Mosteller F, Chalmers T. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts: treatment for myocardial infarction. JAMA 1992; 268 : 240–248.

Chalmers I, Bracken MB, Djulbegovic B, Garattini S, Grant J, Gulmezoglu AM, Howells DW, Ioannidis JP, Oliver S. How to increase value and reduce waste when research priorities are set. Lancet 2014; 383 : 156–165.

Chandler J, Hopewell S. Cochrane methods – twenty years experience in developing systematic review methods. Systematic Reviews 2013; 2 : 76.

Dwan K, Gamble C, Williamson PR, Kirkham JJ, Reporting Bias Group. Systematic review of the empirical evidence of study publication bias and outcome reporting bias: an updated review. PloS One 2013; 8 : e66844.

Gøtzsche PC, Ioannidis JPA. Content area experts as authors: helpful or harmful for systematic reviews and meta-analyses? BMJ 2012; 345 .

Macleod MR, Michie S, Roberts I, Dirnagl U, Chalmers I, Ioannidis JP, Al-Shahi Salman R, Chan AW, Glasziou P. Biomedical research: increasing value, reducing waste. Lancet 2014; 383 : 101–104.

Moher D, Liberati A, Tetzlaff J, Altman D, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Medicine 2009; 6 : e1000097.

Oxman A, Guyatt G. The science of reviewing research. Annals of the New York Academy of Sciences 1993; 703 : 125–133.

Rees R, Oliver S. Stakeholder perspectives and participation in reviews. In: Gough D, Oliver S, Thomas J, editors. An Introduction to Systematic Reviews . 2nd ed. London: Sage; 2017. p. 17–34.

Steel R. Involving marginalised and vulnerable people in research: a consultation document (2nd revision). INVOLVE; 2004.

Thomas J, Harden A, Oakley A, Oliver S, Sutcliffe K, Rees R, Brunton G, Kavanagh J. Integrating qualitative research with trials in systematic reviews. BMJ 2004; 328 : 1010–1012.


Systematic Reviews in the Social Sciences (Boston University Libraries)

What is a systematic review?


"A systematic review attempts to collate all empirical evidence that fits pre-specified eligibility criteria in order to answer a specific research question.  It  uses explicit, systematic methods that are selected with a view to minimizing bias, thus providing more reliable findings from which conclusions can be drawn and decisions made   (Antman 1992, Oxman 1993) . The key characteristics of a systematic review are:

a clearly stated set of objectives with pre-defined eligibility criteria for studies;

an explicit, reproducible methodology;

a systematic search that attempts to identify all studies that would meet the eligibility criteria;

an assessment of the validity of the findings of the included studies, for example through the assessment of risk of bias; and

a systematic presentation, and synthesis, of the characteristics and findings of the included studies".

Cochrane Handbook for Systematic Reviews of Interventions . (March 2011)

(Original author, Meredith Kirkpatrick, 2021)

Difference between a systematic review and a literature review

Kysh, Lynn (2013): Difference between a systematic review and a literature review. Figshare. https://doi.org/10.6084/m9.figshare.766364.v1



Systematic Reviews (UVM Libraries)


Definition of Systematic Review

"A systematic review attempts to identify, appraise, and synthesize all the empirical evidence that meets pre-specified eligibility criteria to answer a given research question. Researchers conducting systematic reviews use explicit methods aimed at minimizing bias, in order to produce more reliable findings that can be used to inform decision making."

- About Cochrane Reviews, Cochrane Library

Systematic reviews are part of a larger category of research methodologies known as evidence syntheses or knowledge syntheses. While many types of evidence syntheses exist, these are the methodologies that the Dana Health Sciences Library is prepared to support and collaborate on.

Review Methods

There are many types of reviews, and choosing the right one can be challenging. The typology cited below (adapted from Grant and Booth) provides a general overview of six popular review types. If you are still unsure of which one to choose, please try the Right Review tool, which asks you a series of questions to help you determine which review methodology might be suitable for your project. When you are finished, please feel free to discuss the results with your librarian.

Chart adapted from: Grant MJ, Booth A.  A typology of reviews: an analysis of 14 review types and associated methodologies . Health Info Libr J. 2009 Jun;26(2):91-108. doi: 10.1111/j.1471-1842.2009.00848.x.

Standards & Guidance References

Here are some standards and additional research synthesis methods papers to get you started:

Chandler, J., Churchill, R., Higgins, J., Lasserson, T., & Tovey, D. (2013).  Methodological standards for the conduct of new Cochrane Intervention Reviews, Version 2.3.  Available from  http://www.editorial-unit.cochrane.org/mecir .

European Network for Health Technology Assessment. (2019) Guideline:  Process of information retrieval for systematic reviews and health technology assessments on clinical effectiveness Version 2.0.  DEC 2019. Available from:  https://www.eunethta.eu/wp-content/uploads/2020/01/EUnetHTA_Guideline_Information_Retrieval_v2-0.pdf . 

Higgins, J., Green, S., & (editors). (2011).  Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0  [updated March 2011].  J. Higgins & S. Green (Eds.), Available from:  http://handbook.cochrane.org/ .  

IOM (Institute of Medicine). (2011).  Finding What Works in Health Care: Standards for Systematic Reviews.  Available from:  https://www.nap.edu/catalog/13059/finding-what-works-in-health-care-standards-for-systematic-reviews .

Liberati, A., Altman, D. G., Tetzlaff, J., Mulrow, C., Gotzsche, P. C., Ioannidis, J. P., . . . Moher, D. (2009).  The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration.  PLoS Medicine, 6(7), e1000100. doi: 10.1371/journal.pmed.1000100. Available from:  https://journals.plos.org/plosmedicine/article/file?id=10.1371/journal.pmed.1000100&type=printable . 

Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & Prisma Group. (2009).  Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement.  PLoS Medicine, 6(7), e1000097. doi: 10.1371/journal.pmed.1000097. Available from:  https://journals.plos.org/plosmedicine/article/file?id=10.1371/journal.pmed.1000097&type=printable . 

Moher D., Shamseer L., Clarke M., Ghersi D., Liberati A., Petticrew M., . . . PRISMA-P Group (2015).  Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement.  Systematic Reviews, 4(1), doi:10.1186/2046-4053-4-1. Available from:  https://www.bmj.com/content/bmj/349/bmj.g7647.full.pdf .  

Rader, T., Mann, M., Stansfield, C., Cooper, C., & Sampson, M. (2013).  Methods for documenting systematic review searches: a discussion of common issues.  Research Synthesis Methods, Article first published online: 8 OCT 2013. doi: 10.1002/jrsm.1097. Available from:  https://onlinelibrary.wiley.com/doi/epdf/10.1002/jrsm.1097 .

Rethlefsen, M. L., Murad, M., & Livingston, E. H. (2014).  Engaging medical librarians to improve the quality of review articles.  JAMA, 312(10), 999-1000. doi: 10.1001/jama.2014.9263. Available from:  https://jamanetwork.com/journals/jama/fullarticle/1902238 . 

Shamseer, L., Moher, D., Clarke, M., Ghersi, D., Liberati, A. D., Petticrew, M., . . . PRISMA-P Group. (2015).  Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015: elaboration and explanation.  BMJ, 349, g7647. doi: 10.1136/bmj.g7647. Available from:  https://www.bmj.com/content/bmj/349/bmj.g7647.full.pdf .

Umscheid, C. A. (2013).  A Primer on Performing Systematic Reviews and Meta-analyses.  Clinical Infectious Diseases, 57(5), 725-734. doi: 10.1093/cid/cit333. Available from:  https://academic.oup.com/cid/article/57/5/725/311245 . 

This website has been adapted from the Northwestern University Galter Health Sciences Library's Systematic Reviews guide.

  • Systematic review
  • Open access
  • Published: 16 April 2020

A systematic review of empirical studies examining mechanisms of implementation in health

  • Cara C. Lewis 1 , 2 , 3 ,
  • Meredith R. Boyd 4 ,
  • Callie Walsh-Bailey 1 , 5 ,
  • Aaron R. Lyon 3 ,
  • Rinad Beidas 6 ,
  • Brian Mittman 7 ,
  • Gregory A. Aarons 8 ,
  • Bryan J. Weiner 9 &
  • David A. Chambers 10  

Implementation Science, volume 15, Article number 21 (2020)


Abstract

Background

Understanding the mechanisms of implementation strategies (i.e., the processes by which strategies produce desired effects) is important for research to understand why a strategy did or did not achieve its intended effect, and it is important for practice to ensure strategies are designed and selected to directly target determinants or barriers. This study is a systematic review to characterize how mechanisms are conceptualized and measured, how they are studied and evaluated, and how much evidence exists for specific mechanisms.

Methods

We systematically searched PubMed and CINAHL Plus for implementation studies published between January 1990 and August 2018 that included the terms “mechanism,” “mediator,” or “moderator.” Two authors independently reviewed title and abstracts and then full texts for fit with our inclusion criteria of empirical studies of implementation in health care contexts. Authors extracted data regarding general study information, methods, results, and study design and mechanisms-specific information. Authors used the Mixed Methods Appraisal Tool to assess study quality.

Results

Search strategies produced 2277 articles, of which 183 were included for full-text review. From these, 39 articles were included for data extraction, plus an additional seven articles hand-entered from the only other review of implementation mechanisms (total = 46 included articles). Most included studies employed quantitative methods (73.9%), while 10.9% were qualitative and 15.2% were mixed methods. Nine unique versions of models testing mechanisms emerged. Fifty-three percent of the studies met half or fewer of the quality indicators. The majority of studies (84.8%) only met three or fewer of the seven criteria stipulated for establishing mechanisms.

Conclusions

Researchers have undertaken a multitude of approaches to pursue mechanistic implementation research, but our review revealed substantive conceptual, methodological, and measurement issues that must be addressed in order to advance this critical research agenda. To move the field forward, there is need for greater precision to achieve conceptual clarity, attempts to generate testable hypotheses about how and why variables are related, and use of concrete behavioral indicators of proximal outcomes in the case of quantitative research and more directed inquiry in the case of qualitative research.


Contributions to the literature statement

This is the first systematic review of implementation mechanisms across health that assesses the quality of studies and the extent to which they offer evidence in support of establishing mechanisms of implementation.

We summarize nine examples of models for evaluating mechanisms.

We offer conceptual, theoretical, and methodological guidance for the field to contribute to the study of implementation mechanisms.

Background

Implementation research is the scientific evaluation of strategies or methods used to support the integration of evidence-based practices or programs (EBPs) into healthcare settings to enhance the quality and effectiveness of services [ 1 ]. There is mounting evidence that multi-faceted or blended implementation strategies are necessary (i.e., a discrete strategy is insufficient) [ 2 , 3 ], but we have a poor understanding of how and why these strategies work. Mechanistic research in implementation science is in an early phase of development. As of 2016, there were only nine studies included in one systematic review of implementation mediators (see footnote 1) specific to the field of mental health. A mediator is an intervening variable that may statistically account for the relation between an implementation strategy and an outcome. We define the term mechanism as a process or event through which an implementation strategy operates to affect one or more implementation outcomes (see Table 1 for key terms and definitions used throughout this manuscript). Mechanisms offer causal pathways explaining how strategies operate to achieve desired outcomes, like changes in care delivery. Some researchers conflate moderators, mediators, and mechanisms [ 6 ], using the terms interchangeably [ 7 ]. Mediators and moderators can point toward mechanisms, but they are not themselves mechanisms, as they are typically insufficient to explain exactly how change came about.

In addition to these linguistic inconsistencies and lack of conceptual clarity, there is little attention paid to the criteria for establishing a mechanistic relation. Originally Bradford-Hill [ 8 ], and more recently Kazdin [ 4 ], offered at least seven criteria for establishing mechanisms of psychosocial treatments that are equally relevant to implementation strategies: strong association, specificity, consistency, experimental manipulation, timeline, gradient, and plausibility or coherence (see Table 2 for definitions). Taken together, these criteria can guide study designs for building the case for mechanisms over time. In lieu of such criteria, disparate models and approaches for investigating mechanisms are likely to exist that make synthesizing findings across studies quite challenging. Consequently, the assumption that more strategies will achieve better results is likely to remain, driving costly and imprecise approaches to implementation.

Understanding the mechanisms of implementation strategies, defined as the processes by which strategies produce desired effects [ 4 , 8 ], is important for both research and practice. For research, it is important to specify and examine mechanisms of implementation strategies, especially in the case of null studies, in order to understand why a strategy did or did not achieve its intended effect. For practice, it is crucial to understand mechanisms so that strategies are designed and selected to directly target implementation determinants or barriers. In the absence of this kind of intentional, a priori matching (i.e., strategy targets determinant), it is possible that the “wrong” (or perhaps less potent) strategy will be deployed. This phenomenon of mismatched strategies and determinants was quite prevalent among the 22 tailored improvement intervention studies included in Bosch et al.’s [ 9 ] multiple case study analysis. Upon examining the timing of determinant identification and the degree to which included studies informed tailoring of the type versus the content of the strategies using determinant information, they discovered frequent determinant-strategy mismatch across levels of analysis (e.g., clinician-level strategies were used to address barriers that were at the organizational level) [ 9 ]. Perhaps what is missing is a clear articulation of implementation mechanisms to inform determinant-strategy matching. We argue that, ultimately, knowledge of mechanisms would help to create a more rational, efficient bundle of implementation strategies that fit specific contextual challenges.

Via a systematic review, we sought to understand how mechanisms are conceptualized and measured, how they are studied (by characterizing the wide array of models and designs used to evaluate mechanisms) and evaluated (by applying Kazdin’s seven criteria), and how much evidence exists for specific mechanisms. In doing so, we offer a rich characterization of the current state of the evidence. In reflecting on this evidence, we provide recommendations for future research to optimize their contributions to mechanistic implementation science.

Methods

Search protocol

The databases, PubMed and CINAHL Plus, were chosen because of their extensive collection of over 32 million combined citations of medical, nursing and allied health, and life science journals, as well as inclusiveness of international publications. We searched both databases in August 2018 for empirical studies published between January 1990 and August 2018 testing candidate mechanisms of implementation strategies. This starting date was selected given that the concept of evidence-based practice/evidence-based treatment/evidence-based medicine first gained prominence in the 1990’s with the field of implementation science following in response to a growing consciousness of the research to practice gap [ 10 , 11 ]. The search terms were based on input from all authors, who represent a variety of methodological and content expertise related to implementation science, and were reviewed by a librarian; see Table 3 for all search terms. The search string consisted of three levels with terms reflecting (1) implementation science, (2) evidence-based practice (EBP), and (3) mechanism. We adopted Kazdin’s [ 4 ] definition of mechanisms, which he indicates form the basis of an effect. Due to the diversity of definitions that exist in the literature, the term “mechanism” was supplemented with the terms “mediator” and “moderator” to ensure all relevant studies were collected.
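To illustrate the structure described above (terms OR-ed within a level, and the three levels AND-ed together), the following sketch assembles a Boolean query in Python. The specific terms shown are illustrative placeholders, not the authors' full search string from Table 3.

```python
# Illustrative sketch only: terms within each level are OR-ed, and the three
# levels are AND-ed. The placeholder terms below are not the authors' full list.
implementation_terms = ["implementation", "dissemination", "knowledge translation"]
ebp_terms = ["evidence-based practice", "evidence-based treatment", "evidence-based medicine"]
mechanism_terms = ["mechanism", "mediator", "moderator"]

def or_group(terms):
    """Quote each term and join the level into a parenthesized OR group."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

query = " AND ".join(or_group(level) for level in
                     (implementation_terms, ebp_terms, mechanism_terms))
print(query)
```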

Study inclusion and exclusion criteria

Studies were included if they were considered an empirical implementation study (i.e., original data collection) and statistically tested or qualitatively explored mechanisms, mediators, or moderators. We did not include dissemination studies given the likely substantive differences between strategies, mechanisms, and outcomes. Specifically, we align with the distinction made between dissemination and implementation put forth by the National Institutes of Health program announcement for Dissemination and Implementation Research in Health that describes dissemination as involving distribution of evidence to a target audience (i.e., communication of evidence) and implementation as involving use of strategies to integrate evidence into target settings (i.e., use of evidence in practice) [ 12 ]. However, the word “dissemination” was included in our search terms because of the tendency of some researchers to use “implementation” and “dissemination” interchangeably. Studies were excluded if they were not an implementation study, used the terms “mediator,” “moderator,” or “mechanism” in a different context (i.e., conflict mediator), did not involve the implementation of an EBP, or were a review, concept paper, or opinion piece rather than original research. All study designs were considered. Only studies in English were assessed. See Additional File 1 for exclusion criteria and definitions. We strategically cast a wide net and limited our exclusions so as to characterize the broad range of empirical studies of implementation mechanisms.

Citations generated from the search of PubMed and CINAHL were loaded into EPPI Reviewer 4, an online software program used for conducting literature reviews [ 13 ]. Duplicate citations were identified for removal via the duplicate checking function in EPPI and via manual searching. Two independent reviewers (MRB, CWB) screened the first ten citations on title and abstract for inclusion. They then met to clarify inclusion and exclusion criteria with the authorship team, as well as add additional criteria if necessary, and clarify nuances of the inclusion/exclusion coding system (see Additional File 1 for exclusion criteria and definitions). The reviewers met once a week to compare codes and resolve discrepancies through discussion. If discrepancies could not be easily resolved through discussion among the two reviewers, the first author (CCL) made a final determination. During full text review, additional exclusion coding was applied for criteria that could not be discerned from the abstract; articles were excluded at this phase if they only mentioned the study of mechanisms in the discussion or future directions. Seven studies from the previous systematic review of implementation mechanisms [ 14 ] were added to our study for data extraction; these studies likely did not appear in our review due to differences in the search strategy in that the review undertaken by Williams hand searched published reviews of implementation strategies in mental health.

Study quality assessment

The methodological quality of included studies was assessed using the Mixed Methods Appraisal Tool (MMAT-version 2018) [ 15 ]. This tool has been utilized in over three dozen systematic reviews in the health sciences. The MMAT includes two initial screening criteria that assess for the articulation of a clear research question/objective and for the appropriateness of the data collected to address the research question. Studies must receive a “yes” on both screening criteria in order to be included. The tool contains a subset of questions to assess for quality for each study type—qualitative, quantitative, and mixed methods. Table 4 summarizes the questions by which studies were evaluated, such as participant recruitment and relevance and quality of measures. Per the established approach to MMAT application, a series of four questions specific to each study design type are assigned a dichotomous “yes” or “no” answer. Studies receive 25 percentage points for each “yes” response. Higher percentages reflect higher quality, with 100% indicating all quality criteria were met. The MMAT was applied by the third author (CWB). The first author (CCL) checked the first 15% of included studies and, based on reaching 100% agreement on the application of the rating criteria, the primary reviewer then applied the tool independently to the remaining studies.
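As a minimal sketch of the scoring arithmetic described above (four design-specific criteria, 25 percentage points per "yes"), the snippet below computes a study's MMAT percentage; the criterion labels are hypothetical and are not the official MMAT wording.

```python
# Minimal sketch of the MMAT scoring arithmetic: each "yes" is worth 25
# percentage points, so a study scores 0, 25, 50, 75, or 100.
def mmat_percentage(criteria_met: dict) -> int:
    return 25 * sum(1 for met in criteria_met.values() if met)

# Hypothetical study meeting three of four design-specific criteria:
example = {
    "participants_representative": True,
    "measures_appropriate": True,
    "complete_outcome_data": False,
    "confounders_accounted_for": True,
}
print(mmat_percentage(example))  # 75
```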

Data extraction and synthesis

Data extraction focused on several categories: study information/ background (i.e., country, setting, and sample), methods (i.e., theories that informed study, measures used, study design, analyses used, proposed mediation model), results (i.e., statistical relations between proposed variables of the mediation model tested), and criteria for establishing mechanisms (based on the seven listed in Table 2 [ 4 ];). All authors contributed to the development of data extraction categories that were applied to the full text of included studies. One reviewer (MRB) independently extracted relevant data and the other reviewer (CWB) checked the results for accuracy, with the first author (CCL) addressing any discrepancies or questions, consistent with the approach of other systematic reviews [ 61 ]. Extracted text demonstrating evidence of study meeting (or not meeting) each criterion for establishing a mechanism was further independently coded as “1” reflecting “criterion met” or “0” reflecting “criterion not met” by MRB and checked by CWB. Again, discrepancies and questions were resolved by the first author (CCL). Technically, mechanisms were considered “established” if all criteria were met. See Additional File 2 for PRISMA checklist for this study.
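The extraction logic for the mechanism criteria can be expressed as a short sketch: each of the seven criteria is coded 1 ("criterion met") or 0 ("criterion not met"), and a mechanism is treated as established only when every criterion is met. The example dictionary below is hypothetical, not data from an included study.

```python
# Sketch of the criteria coding described above: 1 = criterion met, 0 = not met.
KAZDIN_CRITERIA = (
    "strong_association", "specificity", "consistency",
    "experimental_manipulation", "timeline", "gradient",
    "plausibility_or_coherence",
)

def criteria_summary(codes: dict) -> tuple:
    met = sum(codes.get(criterion, 0) for criterion in KAZDIN_CRITERIA)
    return met, met == len(KAZDIN_CRITERIA)  # "established" only if all 7 are met

hypothetical_study = {"plausibility_or_coherence": 1, "strong_association": 1, "timeline": 1}
print(criteria_summary(hypothetical_study))  # (3, False)
```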

Results

The search of PubMed and CINAHL Plus yielded 2277 studies for title and abstract screening, of which 447 were duplicates, and 183 moved on to full-text review for eligibility. Excluded studies were most frequently eliminated due to the use of mechanism in a different context (i.e., to refer to a process, technique, or system for achieving results of something other than implementation strategies). After full article review, 39 studies were deemed suitable for inclusion in this review. Two of the included studies appeared in the only other systematic review of implementation mechanisms in mental health settings [ 14 ]. For consistency and comprehensiveness, the remaining seven studies from the previously published review were added to the current systematic review for a total of 46 studies (see footnote 2). See Fig. 1 for a PRISMA Flowchart of the screening process and results.

Fig. 1 Mechanisms of Implementation Systematic Review PRISMA Flowchart

Study characteristics

Setting, sampling, and interventions

Table 5 illustrates the characteristics of the 46 included studies. Twenty-five studies (54.3%) were completed in the USA, while 21 studies were conducted in other countries (e.g., Australia, Canada, Netherlands, UK). Settings were widely variable; studies occurred in behavioral health (e.g., community mental health, residential facilities) or substance abuse facilities most frequently (21.7%), followed by hospitals (15.2%), multiple sites across a health care system (15.2%), schools (15.2%), primary care clinics (10.9%), and Veterans Affairs facilities (8.7%). Sampling occurred at multiple ecological levels, including patients (17.4%), providers (65.2%), and organizations (43.5%). Seventeen (40.0%) studies examined the implementation of a complex psychosocial intervention (e.g., cognitive behavioral therapy [ 42 , 56 ], multisystemic therapy [ 25 , 26 , 58 ]).

Study design

Our review included six qualitative (10.9%), seven mixed methods (15.2%), and 34 quantitative studies (73.9%). The most common study design was quantitative non-randomized/observational (21 studies; 45.7%), of which 11 were cross-sectional. There were 13 (28.3%) randomized studies included in this review. Twenty-nine studies (63.0%) were longitudinal (i.e., included more than one data collection time point for the sample).

Study quality

Table 4 shows the results of the MMAT quality assessment. Scores for the included studies ranged from 25 to 100%. Six studies (13.0%) received a 25% rating based on the MMAT criteria [ 15 ], 17 studies (40.0%) received 50%, 21 studies (45.7%) received 75%, and only three studies (6.5%) scored 100%. The most frequent weaknesses were the lack of discussion on researcher influence in qualitative and mixed methods studies, lack of clear description of randomization approach utilized in the randomized quantitative studies, and subthreshold rates for acceptable response or follow-up in non-randomized quantitative studies.

Study design and evaluation of mechanisms

Theories, models, and frameworks

Twenty-seven (58.7%) of the studies articulated their plan to evaluate mechanisms, mediators, or moderators in their research aims or hypotheses; the remaining studies included this as a secondary analysis. Thirty-five studies (76.1%) cited a theory, framework, or model as the basis or rationale for their evaluation. The diffusion of innovations theory [ 63 , 64 ] was most frequently cited, appearing in nine studies (19.6%), followed by the theory of planned behavior [ 65 ], appearing in seven studies (15.2%). The most commonly cited frameworks were the theoretical domains framework (five studies; 10.9%) [ 66 ] and Promoting Action on Research in Health Services (PARiHS) [ 67 ] (three studies; 6.5%).

Ecological levels

Four studies (8.7%) incorporated theories or frameworks that focused exclusively on a single ecological level; two focusing on leadership, one at the organizational level, and one at the systems level. There was some discordance between the theories that purportedly informed studies and the potential mechanisms of interest, as 67.4% of candidate mechanisms or mediators were at the intrapersonal level, while 30.4% were at the interpersonal level, and 21.7% at the organizational level. There were no proposed mechanisms at the systems or policy level. Although 12 studies (26.1%) examined mechanisms or mediators across multiple ecological levels, few explicitly examined multilevel relationships (e.g., multiple single-level mediation models were tested in one study).

Measurement and analysis

The vast majority of studies (38, 82.6%) utilized self-report measures as the primary means of assessing the mechanism, and 13 of these studies (28.3%) utilized focus groups and/or interviews as a primary measure, often in combination with other self-report measures such as surveys. Multiple regression constituted the most common analytic approach for assessing mediators or moderators, utilized by 25 studies (54.3%), albeit applied in a variety of ways. Twelve studies (26.1%) utilized hierarchical linear modeling (HLM) and six studies (13.0%) utilized structural equation modeling (SEM); see Table 6 for a complete breakdown. Studies that explicitly tested mediators employed diverse approaches, including Baron and Kenny’s ( N = 8, 17.4%) causal steps approach [ 78 ], Preacher and Hayes’ ( N = 3, 6.5%) approach of bias-corrected bootstrapping to estimate the significance of a mediated effect (i.e., computing significance for the product of coefficients) [ 95 , 126 ], and Sobel’s ( N = 4, 8.9%) approach of estimating the standard error of the product of coefficients, often using structural equation modeling [ 79 ]. Only one study tested a potential moderator, citing Raudenbush’s approach [ 80 , 82 ]. Two other studies included a potential moderator in their conceptual frameworks, but did not explicitly test moderation.
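For reference, the product-of-coefficients (Sobel) test mentioned above takes the following standard form, where a is the estimated path from the independent variable X to the mediator M, b is the path from M to the outcome Y adjusting for X, and s_a and s_b are their standard errors:

```latex
% Standard first-order Sobel test for the indirect effect a*b
\[
\widehat{ab} = \hat{a}\,\hat{b}, \qquad
z_{\text{Sobel}} = \frac{\hat{a}\,\hat{b}}{\sqrt{\hat{b}^{2} s_{a}^{2} + \hat{a}^{2} s_{b}^{2}}}
\]
```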

Emergent mechanism models

There was substantial variation in the models that emerged from the studies included in this review. Table 7 represents variables considered in mediating or moderating models across studies (or identified as candidate mediators, moderators, or mechanisms in the case of qualitative studies). Additional file 3 depicts the unique versions of models tested and their associated studies. We attempted to categorize variables as either (a) an independent variable ( X ) impacting a dependent variable; (b) a dependent variable ( Y ), typically the outcome of interest for a study; or (c) an intervening variable ( M ), a putative mediator in most cases, though three studies tested potential moderators. We further specified variables as representing a strategy, determinant, and outcome; see Table 1 for definitions (see footnote 3).

Common model types

The most common model type (29; 63.0%) was one in which X was a determinant, M was also a determinant, and Y was an implementation outcome variable (determinant ➔ determinant ➔ implementation outcome). For example, Beenstock et al. [ 36 ] tested a model in which propensity to act (determinant) was evaluated as a mediator explaining the relation between main place of work (determinant) and referral to smoking cessation services (outcome). Just less than half the studies (22; 47.8%) included an implementation strategy in their model, of which 16 (34.8%) evaluated a mediation model in which an implementation strategy was X , a determinant was the candidate M , and an implementation outcome was Y (strategy ➔ determinant ➔ implementation outcome); ten of these studies experimentally manipulated the relation between the implementation strategy and determinant. An example of this more traditional mediation model is a study by Atkins and colleagues [ 21 ] which evaluated key opinion leader support and mental health practitioner support (determinants) as potential mediators of the relation between training and consultation (strategy) and adoption of the EBP (implementation outcome). Five studies included a mediation model in which X was an implementation strategy, Y was a clinical outcome, and M was an implementation outcome (strategy ➔ implementation outcome ➔ clinical outcome) [ 25 , 26 , 28 , 29 , 31 ].

Notable exceptions to model types

While the majority of quantitative studies tested a three-variable model, there were some notable exceptions. Several studies tested multiple three-variable models that held the independent variable and mediator constant but varied the dependent variable. Several others held the independent and dependent variables constant but tested several candidate mediators.

Qualitative studies

Five studies included in this review utilized qualitative methods to explore potential mechanisms or mediators of change, though only one explicitly stated this goal in their aims [ 17 ]. Three studies utilized a comparative case study design incorporating a combination of interviews, focus groups, observation, and document review, whereas two studies employed a cross-sectional descriptive design. Although three of the five studies reported their analytic design was informed by a theory or previously established model, only one study included an interview guide in which items were explicitly linked to theory [ 19 ]. All qualitative studies explored relations between multiple ecological levels, drawing connections between intra and interpersonal behavioral constructs and organization or system level change.

Criteria for establishing mechanisms of change

Finally, with respect to the seven criteria for establishing mechanisms of change, plausibility/coherence (i.e., a logical explanation of how the mechanism operates that incorporates relevant research findings) was the most frequently fulfilled criterion, met by 42 studies (91.3%). Although 20 studies (43.5%), of which 18 were quantitative, provided statistical evidence of a strong association between the dependent and independent variables, only 13 (28.2%) studies experimentally manipulated an implementation strategy or the proposed mediator or mechanism. Further, there was only one study that attempted to demonstrate a dose-response relation between mediators and outcomes. Most included studies (39; 84.8%) fulfilled three or fewer criteria, and only one study fulfilled six of the seven requirements for demonstrating a mechanism of change; see Table 8 .

Discussion

Observations regarding mechanistic research in implementation science

Mechanism-focused implementation research is in an early phase of development, with only 46 studies identified in our systematic review across health disciplines broadly. Consistent with the field of implementation science, no single discipline is driving the conduct of mechanistic research, and a diverse array of methods (quantitative, qualitative, mixed methods) and designs (e.g., cross-sectional survey, longitudinal non-randomized, longitudinal randomized, etc.) have been used to examine mechanisms. Just over one-third of studies ( N = 16; 34.8%) evaluated a mediation model with the implementation strategy as the independent variable, determinant as a putative mediator, and implementation outcome as the dependent variable. Although this was the most commonly reported model, we would expect a much higher proportion of studies testing mechanisms of implementation strategies given the ultimate goal of precise selection of strategies targeting key mechanisms of change. Studies sometimes evaluated models in which the determinant was the independent variable, another determinant was the putative mediator, and an implementation outcome was the dependent variable ( N = 11; 23.9%). These models suggest an interest in understanding the cascading effect of changes in context on key outcomes, but without manipulating or evaluating an implementation strategy as the driver of observed change. Less common (only 5, 10.9%) were more complex models in which multiple mediators and outcomes and different levels of analyses were tested (e.g., [ 37 , 39 ]), even though this level of complexity is likely to characterize the reality of typical implementation contexts. Although there were several quantitative studies that did observe significant relations pointing toward a mediator, none met all criteria for establishing a mechanism.

Less than one-third of the studies experimentally manipulated the strategy-mechanism linkage. As the field progresses, we anticipate many more tests of this nature, which will allow us to discern how strategies exert their effect on outcomes of interest. However, implementation science will continue to be challenged by the costly nature of the type of experimental studies that would be needed to establish this type of evidence. Fortunately, methodological innovations that capitalize on recently funded implementation trials to engage in multilevel mediation modeling hold promise for the next iteration of mechanistic implementation research [ 14 , 127 ]. As this work unfolds, a number of scenarios are possible. For example, it is likely the case that multiple strategies can target the same mechanism; that a single strategy can target multiple mechanisms; and that mechanisms across multiple levels of analysis must be engaged for a given strategy to influence an outcome of interest. Accordingly, we expect great variability in model testing will continue and that more narrowly focused efforts will remain important contributions so long as shared conceptualization of mechanisms and related variables is embraced, articulated, and rigorously tested. As with other fields, we observed great variability in the degree to which mechanisms (and related variables of interest) were appropriately specified, operationalized, and measured. Because of this misspecification, coupled with the overall lack of high-quality studies (only three met 100% of the quality criteria) and the diversity in study methods, strategies tested, and mediating or moderating variables under consideration, we were unable to synthesize the findings across studies to point toward promising mechanisms.

The need for greater conceptual clarity and methodological advancements

Despite the important advances that the studies included in this review represent, there are clear conceptual and methodological issues that need to be addressed to allow future research to more systematically establish mechanisms. Table 1 offers a list of key terms and definitions for the field to consider. We suggest the term “mechanism” be used to reflect a process or event through which an implementation strategy operates to affect desired implementation outcomes . Consistent with existing criteria [ 4 ], mechanisms can only be confidently established via carefully designed (i.e., longitudinal; experimentally manipulated) empirical studies demonstrating a strong association, and ideally a dose-response relation, between an intervening variable and outcome (e.g., via qualitative data or mediation or moderator analyses) that are supported by very specific theoretical propositions observed consistently across multiple studies. We found the term “mediator” to be most frequently used in this systematic review, which can point toward a mechanism, but without consideration of these full criteria, detection of a mediator reflects a missed opportunity to contribute more meaningfully to the mechanisms literature.

Interestingly, nearly half of the studies (43.5%) treated a variable that many would conceptualize as a “determinant” as the independent variable in at least one proposed or tested mediation pathway. Presumably, if researchers are exploring the impact of a determinant on another determinant and then on an outcome, there must be a strategy (or action) that caused the change in the initial determinant. Or, it is possible that researchers are simply interested in the natural associations among these determinants to identify promising points of leverage. This is a prime example of how variable or overlapping use of concepts (i.e., calling all factors of interest “determinants”) becomes particularly problematic and undermines the capacity of the field to accumulate knowledge across studies in the service of establishing mechanisms. We contend that it is important to differentiate among these concepts and to use more meaningful terms such as preconditions, putative mechanisms, and proximal and distal outcomes, all of which were under-specified in the majority of the included studies. Several authors from our team have articulated an approach to building causal pathway diagrams [ 128 ] that clarifies that preconditions are necessary factors for a mechanism to be activated and proximal outcomes are the immediate result of a strategy that is realized only because the specific mechanism was activated. We conceptualize distal outcomes as the eight implementation outcomes articulated by Proctor and colleagues [ 129 ]. Disentangling these concepts can help characterize why strategies fail to exert an impact on an outcome of interest. Examples of each follow in the section below.

Conceptual and methodological recommendations for future research

Hypothesis generation

With greater precision among these concepts, the field can also generate and test more specific hypotheses about how and why key variables are related. This begins with laying out mechanistic research questions (e.g., How does a network intervention, like a learning collaborative, influence provider attitudes?) and generating theory-driven hypotheses. For instance, a testable hypothesis may be that learning collaboratives [strategy] operate through sharing [mechanism] of positive experiences with a new practice to influence provider attitudes [outcome]. As another example, clinical decision support [strategy] may act through helping the provider to remember [mechanism] to administer a screener [proximal outcome] and flagging this practice before an encounter may not allow the mechanism to be activated [precondition]. Finally, organizational strategy development [strategy] may have an effect because it means prioritizing competing demands [mechanism] to generate a positive implementation climate [proximal outcome]. Research questions that allow for specific mechanism-focused hypotheses have the potential to expedite the rate at which effective implementation strategies are identified.

Implementation theory

Ultimately, theory is necessary to drive hypotheses, explain implementation processes, and effectively inform implementation practice by providing guidance about when and in what contexts specific implementation strategies should or should not be used. Implementation theories can offer mechanisms that extend across levels of analysis (e.g., intrapersonal, interpersonal, organizational, community, macro policy [ 130 ]). However, there is a preponderance of frameworks and process models, with few theories in existence. Given that implementation is a process of behavior change at its core, in lieu of implementation-specific theories, many researchers draw upon classic theories from psychology, decision science, and organizational literatures, for instance. Because of this, the majority of the identified studies explored intrapersonal-level mechanisms, driven by their testing of social psychological theories such as the theory of planned behavior [ 65 ] and social cognitive theory [ 76 , 77 , 99 ]. Nine studies cited the diffusion of innovations [ 63 , 64 ] as a theory guiding their mechanism investigation, which does extend beyond intrapersonal to emphasize interpersonal, and to some degree community level mechanisms, although we did not see this materialize in the included study analyses [ 63 , 64 , 65 , 76 , 77 ]. Moving forward, developing and testing theory is critical for advancing the study of implementation mechanisms because theories (implicitly or explicitly) tend to identify putative mechanisms instead of immutable determinants.

Measurement

Inadequate measurement has the potential to undermine our ability to advance this area of research. Our coding indicated that mechanisms were assessed almost exclusively via self-report (questionnaire, interview, focus group) suggesting that researchers conceptualize the diverse array of mechanisms to be latent constructs and not directly observable. This may indeed be appropriate, given that mechanisms are typically processes like learning and reflecting that occur within an individual and it is their proximal outcomes that are directly observable (e.g., knowledge acquisition, confidence, perceived control). However, conceptual, theoretical, and empirical work is needed to (a) articulate the theorized mechanisms for the 70+ strategies and proximal outcomes [ 128 ], (b) identify measures of implementation mechanisms and evaluate their psychometric evidence base [ 131 ] and pragmatic qualities [ 132 ], and (c) attempt to identify and rate or develop objective measures of proximal outcomes for use in real-time experimental manipulations of mechanism-outcome pairings.

Quantitative analytic approaches

The multilevel interrelations of factors implicated in an implementation process also call for sophisticated quantitative and qualitative methods to uncover mechanisms. With respect to quantitative methods, it was surprising that the Baron and Kenny [ 78 ] approach to mediation testing remains most prevalent, even though most studies are statistically underpowered to use this approach, and that the other most common approach (i.e., the Sobel test [ 79 ]) relies on the assumption that the sampling distribution of the mediation effect is normal [ 14 , 133 ]; neither of these issues was addressed in any of the 12 included studies that used these methods. Williams [ 14 ] suggests the product of coefficients approach [ 134 , 135 ] is more appropriate for mediation analysis because it is a highly general approach to both single and multi-level mediation models that minimizes type I error rates, maximizes statistical power, and enhances accuracy of confidence intervals [ 14 ]. The application of moderated mediation models and mediated moderation models will allow for a nuanced understanding of the complex interrelations among factors implicated in an implementation process.
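As a minimal, hedged sketch of the product-of-coefficients approach paired with a resampling-based test (here a plain percentile bootstrap rather than the bias-corrected variant, and a single-level model on simulated data), the snippet below estimates an indirect effect ab and its confidence interval. Real implementation studies would typically use dedicated mediation or multilevel software and account for clustering (e.g., providers nested within organizations).

```python
# Sketch: percentile-bootstrap test of an indirect effect a*b in a simple
# single-level mediation model, using simulated data (not data from this review).
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.binomial(1, 0.5, n).astype(float)   # implementation strategy (0/1)
m = 0.5 * x + rng.normal(size=n)            # putative mediator (e.g., a determinant)
y = 0.4 * m + 0.1 * x + rng.normal(size=n)  # implementation outcome

def indirect_effect(x, m, y):
    a = np.polyfit(x, m, 1)[0]  # path a: X -> M
    b = np.linalg.lstsq(np.column_stack([m, x, np.ones(len(x))]),
                        y, rcond=None)[0][0]  # path b: M -> Y, adjusting for X
    return a * b

boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)  # resample rows with replacement
    boot.append(indirect_effect(x[idx], m[idx], y[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"ab = {indirect_effect(x, m, y):.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
```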

Qualitative analytic approaches

Because this was the first review of implementation mechanisms across health disciplines, we believed it was important to be inclusive with respect to methods employed. Qualitative studies are important to advancing research on implementation mechanisms in part because they offer a data collection method in lieu of having an established measure to assess mechanisms quantitatively. Qualitative research is important for informing measure development work, but also for theory development given the richness of the data that can be gleaned. Qualitative inquiry can be more directive by developing hypotheses and generating interview guides to directly test mechanisms. Diagramming and tracing causal linkages can be informed by qualitative inquiry in a structured way that is explicit with regard to how the data informs our understanding of mechanisms. This kind of directed qualitative research is called for in the United Kingdom’s MRC Guidance for Process Evaluation [ 136 ]. We encourage researchers internationally to adopt this approach as it would importantly advance us beyond the descriptive studies that currently dominate the field.

Limitations

There are several limitations to this study. First, we took an efficient approach to coding study quality when applying the MMAT: although evaluating study quality was a strength, the majority of studies were assessed by only one research specialist. Second, we may have overlooked relevant process evaluations conducted in the UK, where MRC guidance stipulates the examination of mechanisms, if those mechanisms were described using terms not included in our search string. Third, although we identified several realist reviews, we did not include them in our systematic review because they conceptualize mechanisms differently than this review does [137]. That is, realist synthesis posits that interventions are theories that imply specific mechanisms of action, rather than separating mechanisms from the implementation strategies/interventions themselves [138]. Including the realist operationalization would therefore have further confused an already disharmonized literature with respect to mechanism terminology, although ultimately it will be important to synthesize findings from realist reviews with standard implementation mechanism evaluations. Fourth, our characterization of the models tested in the identified studies may not reflect what the researchers intended, given our attempt to offer conceptual consistency across studies, although we did contact corresponding authors when we needed clarification on their studies. Finally, because of the diversity of study designs and methods and the inconsistent use of relevant terms, we were unable to synthesize across studies or report on any robustly established mechanisms.

This study represents the first systematic review of implementation mechanisms in health. Our inclusive approach yielded 46 qualitative, quantitative, and mixed methods studies, none of which met all seven criteria (i.e., strong association, specificity, consistency, experimental manipulation, timeline, gradient, and plausibility or coherence) deemed critical for empirically establishing mechanisms. We found nine unique versions of models that attempted to uncover mechanisms, only six of which explored mediators of implementation strategies. The results indicated inconsistent use of relevant terms (e.g., mechanisms, determinants), for which we offer guidance to achieve precision, and we encourage greater specificity in articulating research questions and hypotheses that allow for careful testing of causal relations among variables of interest. Implementation science will benefit from both quantitative and qualitative research that is more explicit in its attempt to uncover mechanisms. In doing so, our research will allow us to test the assumption that more is better and to move toward parsimony in both standardized and tailored approaches to implementation.

A mediator can point toward a mechanism because it is an intervening variable that may account (statistically) for the relation between the independent variable (strategy) and the dependent variable (implementation outcome), revealing one possible causal pathway for the observed effect [4]. Compared to mediators, mechanisms are conceptualized as more precise in their description of the operations underlying causal processes [5].
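In the standard single-mediator formulation (a textbook decomposition, not one drawn from the cited studies), the indirect effect is the product of the strategy-to-mediator path a and the mediator-to-outcome path b:

```latex
% X = implementation strategy, M = mediator, Y = implementation outcome
\begin{align*}
  M &= i_1 + aX + e_1\\
  Y &= i_2 + c'X + bM + e_2\\
  \text{indirect effect} &= a \cdot b, \qquad \text{total effect } c = c' + a \cdot b
\end{align*}
```

A mechanism claim goes beyond estimating a and b by specifying the process through which the strategy produces the change captured by M.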

Key differences from Williams' [14] search method are important to note. Williams first conducted a broad search for randomized controlled trials concerning implementation or dissemination of evidence-based therapies, and only after screening references against these criteria did Williams narrow the search to studies that specifically addressed mediators. By contrast, the present method included mediators/moderators/mechanisms as terms in the initial search string. Additionally, Williams hand-searched the references included in four previous reviews of implementation strategies in mental health.

We refer to variables in the way the study authors did, even where we might have approached their conceptualization differently.

Abbreviations

EBP: Evidence-based practice

MMAT: Mixed methods appraisal tool

PARIHS: Promoting Action on Research in Health Services

HLM: Hierarchical linear modeling

SEM: Structural equation modeling

Eccles MP, Mittman BS. Welcome to implementation science. Implement Sci. 2006;1(1):1.

Powell BJ, McMillen JC, Proctor EK, Carpenter CR, Griffey RT, Bunger AC, et al. A compilation of strategies for implementing clinical innovations in health and mental health. Med Care Res Rev. 2012;69(2):123–57.

Powell BJ, Waltz TJ, Chinman MJ, Damschroder LJ, Smith JL, Matthieu MM, et al. A refined compilation of implementation strategies: results from the Expert Recommendations for Implementing Change (ERIC) project. Implement Sci. 2015;10:21.

Kazdin AE. Mediators and mechanisms of change in psychotherapy research. Annu Rev Clin Psychol. 2007;3(1):1–27.

Kraemer HC, Wilson GT, Fairburn CG, Agras WS. Mediators and moderators of treatment effects in randomized clinical trials. Arch Gen Psychiatry. 2002;59(10):877–83.

Gerring J. Social science methodology: a criterial framework. Cambridge: Cambridge University Press; 2001.

Frazier PA, Tix AP, Barron KE. Testing moderator and mediator effects in counseling psychology research. US: American Psychological Association; 2004. p. 115–34.

Hill AB. The Environment and Disease: Association or Causation? Proc R Soc Med. 1965;58:295–300.

Bosch M, van der Weijden T, Wensing M, Grol R. Tailoring quality improvement interventions to identified barriers: a multiple case analysis. J Eval Clin Pract. 2007;13(2):161–8.

Claridge JA, Fabian TC. History and development of evidence-based medicine. World J Surg. 2005;29(5):547–53.

Cook SC, Schwartz AC, Kaslow NJ. Evidence-Based psychotherapy: advantages and challenges. Neurotherapeutics. 2017;14(3):537–45.

Dissemination and Implementation Research in Health (R01 Clinical Trial Optional). National Institutes of Health (NIH); 2019. https://grants.nih.gov/grants/guide/pa-files/PAR-19-274.html .

Thomas J, Brunton J, Graziosi S. EPPI-Reviewer 4: software for research synthesis. EPPI-Centre Software. London: Social Science Research Unit, UCL Institute of Education; 2010.

Williams NJ. Multilevel mechanisms of implementation strategies in mental health: integrating theory, research, and practice. Adm Policy Ment Health. 2016;43(5):783–98.

Hong QN, Pluye P, Fabregues S, Bartlett G, Boardman F, Cargo M, et al. Mixed Methods Appraisal Tool (MMAT). Montreal, Canada: McGill University; 2018. Available from: http://mixedmethodsappraisaltoolpublic.pbworks.com/w/file/fetch/127916259/MMAT_2018_criteria-manual_2018-08-01_ENG.pdf .

Bardosh KL, Murray M, Khaemba AM, Smillie K, Lester R. Operationalizing mHealth to improve patient care: a qualitative implementation science evaluation of the WelTel texting intervention in Canada and Kenya. Global Health. 2017;13(1):87.

Brewster AL, Curry LA, Cherlin EJ, Talbert-Slagle K, Horwitz LI, Bradley EH. Integrating new practices: a qualitative study of how hospital innovations become routine. Implement Sci. 2015;10:168.

Carrera PM, Lambooij MS. Implementation of out-of-office blood pressure monitoring in the netherlands: from clinical guidelines to patients’ adoption of innovation. Medicine. 2015;94(43):e1813.

Frykman M, Hasson H, Muntlin Athlin Å, von Thiele Schwarz U. Functions of behavior change interventions when implementing multi-professional teamwork at an emergency department: a comparative case study. BMC Health Serv Res. 2014;14:218.

Wiener-Ogilvie S, Huby G, Pinnock H, Gillies J, Sheikh A. Practice organisational characteristics can impact on compliance with the BTS/SIGN asthma guideline: qualitative comparative case study in primary care. BMC Fam Pract. 2008;9:32.

Atkins MS, Frazier SL, Leathers SJ, Graczyk PA, Talbott E, Jakobsons L, et al. Teacher key opinion leaders and mental health consultation in low-income urban schools. J Consult Clin Psychol. 2008;76(5):905–8.

Baer JS, Wells EA, Rosengren DB, Hartzler B, Beadnell B, Dunn C. Agency context and tailored training in technology transfer: a pilot evaluation of motivational interviewing training for community counselors. J Subst Abuse Treat. 2009;37(2):191–202.

Bonetti D, Eccles M, Johnston M, Steen N, Grimshaw J, Baker R, et al. Guiding the design and selection of interventions to influence the implementation of evidence-based practice: an experimental simulation of a complex intervention trial. Soc Sci Med. 2005;60(9):2135–47.

Garner BR, Godley SH, Bair CML. The impact of pay-for-performance on therapists’ intentions to deliver high quality treatment. J Subst Abuse Treat. 2011;41(1):97–103.

Glisson C, Schoenwald SK, Hemmelgarn A, Green P, Dukes D, Armstrong KS, et al. Randomized trial of MST and ARC in a two-level evidence-based treatment implementation strategy. J Consult Clin Psychol. 2010;78(4):537–50.

Holth P, Torsheim T, Sheidow AJ, Ogden T, Henggeler SW. Intensive quality assurance of therapist adherence to behavioral interventions for adolescent substance use problems. J Child Adolesc Subst Abuse. 2011;20(4):289–313.

Lee H, Hall A, Nathan N, Reilly KL, Seward K, Williams CM, et al. Mechanisms of implementing public health interventions: a pooled causal mediation analysis of randomised trials. Implement Sci. 2018;13(1):42.

Lochman JE, Boxmeyer C, Powell N, Qu L, Wells K, Windle M. Dissemination of the coping power program: importance of intensity of counselor training. J Consult Clin Psychol. 2009;77(3):397–409.

Rapkin BD, Weiss E, Lounsbury D, Michel T, Gordon A, Erb-Downward J, et al. Reducing Disparities in cancer screening and prevention through community-based participatory research partnerships with local libraries: a comprehensive dynamic trial. Am J Community Psychol. 2017;60(1-2):145–59.

Rohrbach LA, Graham JW, Hansen WB. Diffusion of a school-based substance abuse prevention program: predictors of program implementation. Prev Med. 1993;22(2):237–60.

Seys D, Bruyneel L, Sermeus W, Lodewijckx C, Decramer M, Deneckere S, et al. Teamwork and adherence to recommendations explain the effect of a care pathway on reduced 30-day readmission for patients with a COPD exacerbation. COPD. 2018;15(2):157–64.

Williams NJ, Glisson C. The role of organizational culture and climate in the dissemination and implementation of empirically-supported treatments for youth. In: Dissemination and implementation of evidence based practices in child and adolescent mental health. New York: Oxford University Press; 2014. p. 61–81.

Williams NJ, Glisson C, Hemmelgarn A, Green P. Mechanisms of change in the ARC Organizational strategy: increasing mental health clinicians' EBP adoption through improved organizational culture and capacity. Adm Policy Ment Health. 2017;44(2):269–83.

Aarons GA, Sommerfeld DH, Walrath-Greene CM. Evidence-based practice implementation: the impact of public versus private sector organization type on organizational support, provider attitudes, and adoption of evidence-based practice. Implement Sci. 2009;4:83.

Becker SJ, Squires DD, Strong DR, Barnett NP, Monti PM, Petry NM. Training opioid addiction treatment providers to adopt contingency management: a prospective pilot trial of a comprehensive implementation science approach. Subst Abus. 2016;37(1):134–40.

Beenstock J, Sniehotta FF, White M, Bell R, Milne EMG, Araujo-Soares V. What helps and hinders midwives in engaging with pregnant women about stopping smoking? A cross-sectional survey of perceived implementation difficulties among midwives in the North East of England. Implement Sci. 2012;7:36.

Beets MW, Flay BR, Vuchinich S, Acock AC, Li KK, Allred C. School climate and teachers' beliefs and attitudes associated with implementation of the positive action program: a diffusion of innovations model. Prev Sci. 2008;9(4):264–75.

Bonetti D, Johnston M, Clarkson J, Turner S. Applying multiple models to predict clinicians' behavioural intention and objective behaviour when managing children's teeth. Psychol Health. 2009;24(7):843–60.

Chou AF, Vaughn TE, McCoy KD, Doebbeling BN. Implementation of evidence-based practices: applying a goal commitment framework. Health Care Manage Rev. 2011;36(1):4–17.

Chambers D, Simpson L, Neta G, UvT S, Percy-Laurry A, Aarons GA, et al. Proceedings from the 9th annual conference on the science of dissemination and implementation. Implementation Sci. 2017;12(1):48.

David P, Schiff M. Self-efficacy as a mediator in bottom-up dissemination of a Research-supported intervention for young, traumatized children and their families. J Evid Inf Soc Work. 2017;14(2):53–69.

Edmunds JM, Read KL, Ringle VA, Brodman DM, Kendall PC, Beidas RS. Sustaining clinician penetration, attitudes and knowledge in cognitive-behavioral therapy for youth anxiety. Implement Sci. 2014;9.

Gnich W, Sherriff A, Bonetti D, Conway DI, Macpherson LMD. The effect of introducing a financial incentive to promote application of fluoride varnish in dental practice in Scotland: a natural experiment. Implement Sci. 2018;13(1):95.

Guerrero EG, Frimpong J, Kong Y, Fenwick K, Aarons GA. Advancing theory on the multilevel role of leadership in the implementation of evidence-based health care practices. Health Care Manage Rev. 2018.

Huis A, Holleman G, van Achterberg T, Grol R, Schoonhoven L, Hulscher M. Explaining the effects of two different strategies for promoting hand hygiene in hospital nurses: a process evaluation alongside a cluster randomised controlled trial. Implement Sci. 2013;8:41.

Little MA, Pokhrel P, Sussman S, Rohrbach LA. The process of adoption of evidence-based tobacco use prevention programs in California schools. Prev Sci. 2015;16(1):80–9.

Llasus L, Angosta AD, Clark M. Graduating baccalaureate students' evidence-based practice knowledge, readiness, and implementation. J Nurs Educ. 2014;53(Suppl 9):S82–9.

Nelson TD, Steele RG. Predictors of practitioner self-reported use of evidence-based practices: practitioner training, clinical setting, and attitudes toward research. Adm Policy Ment Health. 2007;34(4):319–30.

Potthoff S, Presseau J, Sniehotta FF, Johnston M, Elovainio M, Avery L. Planning to be routine: habit as a mediator of the planning-behaviour relationship in healthcare professionals. Implement Sci. 2017;12(1):24.

Presseau J, Grimshaw JM, Tetroe JM, Eccles MP, Francis JJ, Godin G, et al. A theory-based process evaluation alongside a randomised controlled trial of printed educational messages to increase primary care physicians' prescription of thiazide diuretics for hypertension [ISRCTN72772651]. Implement Sci. 2016;11(1):121.

Simmonds MJ, Derghazarian T, Vlaeyen JW. Physiotherapists' knowledge, attitudes, and intolerance of uncertainty influence decision making in low back pain. Clin J Pain. 2012;28(6):467–74.

Stockdale SE, Rose D, Darling JE, Meredith LS, Helfrich CD, Dresselhaus TR, et al. Communication among team members within the patient-centered medical home and patient satisfaction with providers: the mediating role of patient-provider communication. Med Care. 2018;56(6):491–6.

Wanless SB, Rimm-Kaufman SE, Abry T, Larsen RA, Patton CL. Engagement in training as a mechanism to understanding fidelity of implementation of the responsive classroom approach. Prev Sci. 2015;16(8):1107–16.

Armson H, Roder S, Elmslie T, Khan S, Straus SE. How do clinicians use implementation tools to apply breast cancer screening guidelines to practice? Implement Sci. 2018;13(1):79.

Birken SA, Lee S-YD, Weiner BJ, Chin MH, Chiu M, Schaefer CT. From strategy to action: how top managers’ support increases middle managers’ commitment to innovation implementation in healthcare organizations. Health Care Manage Rev. 2015;40(2):159–68.

Kauth MR, Sullivan G, Blevins D, Cully JA, Landes RD, Said Q, et al. Employing external facilitation to implement cognitive behavioral therapy in VA clinics: a pilot study. Implement Sci. 2010;5(1):75.

Lukas CV, Mohr DC, Meterko M. Team effectiveness and organizational context in the implementation of a clinical innovation. Qual Manag Health Care. 2009;18(1):25–39.

Panzano PC, Sweeney HA, Seffrin B, Massatti R, Knudsen KJ. The assimilation of evidence-based healthcare innovations: a management-based perspective. J Behav Health Serv Res. 2012;39(4):397–416.

Rangachari P, Madaio M, Rethemeyer RK, Wagner P, Hall L, Roy S, et al. The evolution of knowledge exchanges enabling successful practice change in two intensive care units. Health Care Manage Rev. 2015;40(1):65–78.

Shrubsole K, Worrall L, Power E, O'Connor DA. The acute aphasia implementation study (AAIMS): a pilot cluster randomized controlled trial. Int J Lang Commun Disord. 2018;53(5):1021–56.

Scott SD, Albrecht L, O'Leary K, Ball GD, Hartling L, Hofmeyer A, et al. Systematic review of knowledge translation strategies in the allied health professions. Implement Sci. 2012;7:70.

Yamada J, Squires JE, Estabrooks CA, Victor C, Stevens B; CIHR Team in Children's Pain. The role of organizational context in moderating the effect of research use on pain outcomes in hospitalized children: a cross sectional study. BMC Health Serv Res. 2017;17(1):68.

Rogers E. Diffusion of innovations. 4th ed. New York: Free Press; 1995.

Rogers E. Diffusion of Innovations. 3rd ed. New York: Free Press; 1983.

Ajzen I. The theory of planned behavior. Organ Behav Hum Decis Process. 1991;50(2):179–211.

Michie S, Johnston M, Abraham C, Lawton R, Parker D, Walker A, et al. Making psychological theory useful for implementing evidence based practice: a consensus approach. Qual Saf Health Care. 2005;14(1):26–33.

Kitson A, Harvey G, McCormack B. Enabling the implementation of evidence based practice: a conceptual framework. Qual Health Care. 1998;7(3):149–58.

Damschroder LJ, Aron DC, Keith RE, Kirsh SR, Alexander JA, Lowery JC. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement Sci. 2009;4:50.

Klein KJ, Sorra JS. The challenge of innovation implementation. Acad Manage Rev. 1996;21(4):1055–80.

Davis FD. Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly. 1989;13(3):319–40.

Thompson RS, Higgins CA, Howell JM. Personal computing: toward a conceptual model of utilization. MIS Quarterly. 1991;15(1):125–43.

Braksick LW. Unlock behavior, unleash profits: developing leadership behavior that drives profitability in your organization. New York, NY: McGraw-Hill; 2007.

Johnson J, Dakens L, Edwards P, Morse N. SwitchPoints: culture change on the fast track to business success. Hoboken, NJ: John Wiley & Sons; 2008.

Hedeker D, Gibbons RD. Longitudinal data analysis. New York, NY: Wiley; 2006.

Krull JL, MacKinnon DP. Multilevel modeling of individual and group level mediated effects. Multivariate Behav Res. 2001;36(2):249–77.

Bandura A. Self-efficacy: the exercise of control. New York: Macmillan; 1997.

Bandura A. Exercise of human agency through collective efficacy. Curr Dir Psychol Sci. 2000;9(3):75–8.

Baron RM, Kenny DA. The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J Pers Soc Psychol. 1986;51(6):1173–82.

Sobel ME. Asymptotic confidence intervals for indirect effects in structural equation models. In: Leinhart S, editor. Sociological Methodology. San Francisco: Jossey-Bass; 1982.

Raudenbush SW, Bryk AS, Cheong YF, Congdon RT. HLM7: hierarchical linear and nonlinear modeling. Chicago: Scientific Software International; 2004.

Hosmer DW, Lemeshow S. Applied logistic regression. New York, NY: John Wiley & Sons; 1989.

Raudenbush SW, Bryk A, Congdon RT. HLM 6. Lincolnwood, IL: Scientific Software International; 2005.

Singer JD, Willet JB. Applied longitudinal data analysis: modeling change and event occurrence. New York, NY: Oxford University Press; 2003.

Cane J, O'Connor D, Michie S. Validation of the theoretical domains framework for use in behaviour change and implementation research. Implement Sci. 2012;7:37.

Imai K, Keele L, Tingley D. A general approach to causal mediation analysis. Psychol Methods. 2010;15(4):309–34.

van Buuren S, Groothuis-Oudshoorn K. mice: multivariate imputation by chained equations in R. J Stat Softw. 2010:1–68.

Rogers EM. Diffusion of innovations. 5th ed. New York, NY: Free Press; 2003.

Raudenbush SW, Liu X. Statistical power and optimal design for multisite randomized trials. Psychol Methods. 2000;5(2):199–213.

Allison PD. Event history analysis. Thousand Oaks, CA: SAGE Publications; 1984.

Yuk Fai C, Randall PF, Stephen WR. Efficiency and robustness of alternative estimators for two- and three-level models: the case of NAEP. J Educ Behav Stat. 2001;26(4):411–29.

Hox JJ, Maas CJM. The accuracy of multilevel structural equation modeling with pseudobalanced groups and small samples. Struct Equ Model. 2001;8(2):157–74.

Zhang Z, Zyphur MJ, Preacher KJ. Testing multilevel mediation using hierarchical linear models: problems and solutions. Organizational Research Methods. 2009;12(4):695–719.

Scott WR. Institutions and Organizations. Thousand Oaks, CA: Sage; 2001.

Eisenberger R, Huntington R, Hutchison S, Sowa D. Perceived organizational support. Journal of Applied Psychology. 1986;71:500–7.

Preacher KJ, Hayes AF. SPSS and SAS procedures for estimating indirect effects in simple mediation models. Behav Res Methods Instrum Comput. 2004;36(4):717–31.

Chen HT. Theory-driven evaluations. In: Reynolds HJ, Walber HJ, editors. Advances in educational productivity: evaluation research for educational productivity. 7th ed. Bingley, UK: Emerald Group Publishing Limited; 1998.

Marsh HW, Hau KT, Balla JR, Grayson D. Is More Ever Too Much? The Number of indicators per factor in confirmatory factor analysis. Multivariate Behav Res. 1998;33(2):181–220.

Bandalos DL, Finney SJ. Item parceling issues in structural equation modeling. In: Marcoulides GA, editor. New developments and techniques in structural equation modeling. Mahwah, NJ: Erlbaum; 2001. p. 269–96.

Bandura A. Health promotion from the perspective of social cognitive theory. Psychol Health. 1998;13(4):623–49.

Blackman D. Operant conditioning: an experimental analysis of behaviour. London, UK: Methuen; 1974.

Gollwitzer PM. Implementation intentions: strong effects of simple plans. Am Psychol. 1999;54:493–503.

Leventhal H, Nerenz D, Steele DJ. Illness representations and coping with health threats. In: Baum A, Taylor SE, Singer JE, editors. Handbook of psychology and health, volume 4: social psychological aspects of health. Hillsdale, NJ: Lawrence Erlbaum; 1984. p. 219–51.

Weinstein N. The precaution adoption process. Health Psychol. 1988;7:355–86.

Prochaska JO, DiClemente CC. Stages and processes of self-change of smoking: toward an integrative model of change. J Consult Clin Psychol. 1983;51(3):390–5.

Landy FJ, Becker W. Motivation theory reconsidered. In: Cumming LL, Staw BM, editors. Research in organizational behavior. Greenwich, CT: JAI Press; 1987.

Locke EA, Latham GP. Building a practically useful theory of goal setting and task motivation: a 35-year odyssey. Am Psychol. 2002;57(9):705–17.

Kennedy P. A guide to econometrics. Cambridge, MA: MIT Press; 2003.

Jöreskog KG, Sörbom D. LISREL 8: User's reference guide. Lincolnwood, IL: Scientific Software International; 1996.

Valente TW. Social network thresholds in the diffusion of innovations. Social Networks. 1996;18:69–89.

Hayes AF. Beyond Baron and Kenny: Statistical mediation analysis in the new millennium. Communication Monographs. 2009;76:408–20.

Aarons GA, Hurlburt M, Horwitz SM. Advancing a conceptual model of evidence-based practice implementation in public service sectors. Adm Policy Ment Health. 2011;38(1):4–23.

Raudenbush SW, Bryk AS. Hierarchical linear models. Thousand Oaks: Sage; 2002.

Bryk AS, Raudenbush SW. Hierarchical linear models. Newbury Park, CA: Sage; 1992.

Muthén LK, Muthén BO. Mplus user's guide. 7th ed. Los Angeles, CA: Muthén & Muthén; 2012. Available from: https://www.statmodel.com/download/usersguide/Mplus%20user%20guide%20Ver_7_r3_web.pdf .

Bentler PM. On tests and indices for evaluating structural models. Personal Individ Differ. 2007;42(5):825–9.

MacKinnon DP, Fairchild AJ, Fritz MS. Mediation analysis. Annu Rev Psychol. 2007;58:593–614.

Graham I, Logan J, Harrison M, Straus S, Tetroe J, Caswell W, et al. Lost in knowledge translation: time for a map? J Contin Educ Health Prof. 2006;26.

Epstein S. Cognitive-experiential self-theory. In: Pervin LA, editor. Handbook of personality: theory and research. New York: Guilford; 1990. p. 165–92.

Karlson KB, Holm A, Breen R. Comparing Regression coefficients between same-sample Nested models using logit and probit: a new method. Sociological Methodology. 2012;42(1):274–301.

Rycroft-Malone J, Kitson A, Harvey G, McCormack B, Seers K, Titchen A, et al. Ingredients for change: revisiting a conceptual framework. BMJ Qual Saf. 2002;11(2):174–80.

Yukl G, Gordon A, Taber T. A hierarchical taxonomy of leadership behavior: integrating a half century of behavior research. J Leadersh Organ Stud. 2002;9(1):15–32.

Shrout PE, Bolger N. Mediation in experimental and nonexperimental studies: new procedures and recommendations. Psychol Methods. 2002;7(4):422–45.

Fixsen DL, Naoom SF, Blase KA, Friedman RM. Implementation research: a synthesis of the literature; 2005.

Frambach R. An integrated model of organizational adoption and diffusion of innovations. Eur J Mark. 1993;27(5):22–41.

Institute of Medicine (IOM). Crossing the quality chasm: a new health system for the 21st century. Washington, DC: Institute of Medicine, National Academy Press; 2001.

Preacher KJ, Hayes AF. Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behav Res Methods. 2008;40(3):879–91.

Stahmer AC, Suhrheinrich J, Schetter PL, McGee HE. Exploring multi-level system factors facilitating educator training and implementation of evidence-based practices (EBP): a study protocol. Implement Sci. 2018;13(1):3.

Lewis CC, Klasnja P, Powell BJ, Lyon AR, Tuzzio L, Jones S, et al. From classification to causality: advancing understanding of mechanisms of change in implementation science. Front Public Health. 2018;6:136.

Proctor E, Silmere H, Raghavan R, Hovmand P, Aarons G, Bunger A, et al. Outcomes for implementation research: conceptual distinctions, measurement challenges, and research agenda. Adm Policy Ment Health. 2011;38(2):65–76.

Weiner BJ, Lewis MA, Clauser SB, Stitzenberg KB. In search of synergy: strategies for combining interventions at multiple levels. J Natl Cancer Inst Monogr. 2012;2012(44):34–41.

Lewis CC, Weiner BJ, Stanick C, Fischer SM. Advancing implementation science through measure development and evaluation: a study protocol. Implement Sci. 2015;10:102.

Powell BJ, Stanick CF, Halko HM, Dorsey CN, Weiner BJ, Barwick MA, et al. Toward criteria for pragmatic measurement in implementation research and practice: a stakeholder-driven approach using concept mapping. Implement Sci. 2017;12(1):118.

Wu AD, Zumbo BD. Understanding and using mediators and moderators. Soc Indic Res. 2007;87(3):367.

MacKinnon DP, Lockwood CM, Hoffman JM, West SG, Sheets V. A comparison of methods to test mediation and other intervening variable effects. Psychol Methods. 2002;7(1):83.

Pituch KA, Murphy DL, Tate RL. Three-level models for indirect effects in school- and class-randomized experiments in education. J Exp Educ. 2009;78(1):60–95.

Moore GF, Audrey S, Barker M, Bond L, Bonell C, Hardeman W, et al. Process evaluation of complex interventions: Medical Research Council guidance. BMJ : British Medical Journal. 2015;350:h1258.

Pawson R, Manzano-Santaella A. A realist diagnostic workshop. Evaluation. 2012;18(2):176–91.

Pawson R, Greenhalgh T, Harvey G, Walshe K. Realist synthesis: an introduction. Manchester, UK: ESRC Research Methods Programme, University of Manchester; 2004.

Acknowledgments

Not applicable.

Availability of data and material

The authors are willing to share the raw data tables that informed the summary tables included in this manuscript.

Funding

This project was supported by grant number R13HS025632 from the Agency for Healthcare Research and Quality. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality.

Author information

Authors and affiliations.

Kaiser Permanente Washington Health Research Institute, 1730 Minor Avenue, Suite 1600, Seattle, WA, 98101, USA

Cara C. Lewis & Callie Walsh-Bailey

Department of Psychological and Brain Sciences, Indiana University, 1101 E 10th Street, Bloomington, IN, 47405, USA

Cara C. Lewis

Department of Psychiatry and Behavioral Sciences, School of Medicine, University of Washington, 1959 NE Pacific Avenue, Seattle, WA, 98195, USA

Cara C. Lewis & Aaron R. Lyon

Department of Psychology, University of California Los Angeles, 1177 Franz Hall, 502 Portola Plaza, Los Angeles, CA, 90095, USA

Meredith R. Boyd

Brown School, Washington University in St. Louis, 1 Brookings Drive, St. Louis, MO, 63130, USA

Callie Walsh-Bailey

Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, 3535 Market Street, Philadelphia, PA, 19104, USA

Rinad Beidas

Department of Research and Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Avenue, Pasadena, CA, 91101, USA

Brian Mittman

Department of Psychiatry, School of Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA

Gregory A. Aarons

Department of Health Services, University of Washington, 1959 NE Pacific Street, Seattle, WA, 98195, USA

Bryan J. Weiner

Division of Cancer Control and Population Science, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD, 20850, USA

David A. Chambers

Contributions

CCL conceptualized the larger study and articulated the research questions with all coauthors. CCL, MRB, and CWB designed the approach with feedback from all coauthors. MRB and CWB executed the systematic search with oversight and checking by CCL. MRB led the data extraction and CWB led the study appraisal. All authors contributed to the discussion and reviewed and approved the manuscript.

Corresponding author

Correspondence to Cara C. Lewis .

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1: Figure S1.

Inclusion and Exclusion Criteria and Definitions.

Additional file 2.

PRISMA 2009 Checklist.

Additional file 3.

Emergent Mechanism Models.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Lewis, C.C., Boyd, M.R., Walsh-Bailey, C. et al. A systematic review of empirical studies examining mechanisms of implementation in health. Implementation Sci 15, 21 (2020). https://doi.org/10.1186/s13012-020-00983-3

Received: 12 November 2019

Accepted: 12 March 2020

Published: 16 April 2020

DOI: https://doi.org/10.1186/s13012-020-00983-3

Keywords

  • Determinant
  • Implementation
  • Causal model

Writing Motivation in School: a Systematic Review of Empirical Research in the Early Twenty-First Century

  • Review Article
  • Published: 19 June 2020
  • Volume 33, pages 213–247 (2021)

  • Ana Camacho   ORCID: orcid.org/0000-0001-5328-966X 1 ,
  • Rui A. Alves   ORCID: orcid.org/0000-0002-1657-8945 1 &
  • Pietro Boscolo 2  

Motivation is a catalyst of writing performance in school. In this article, we report a systematic review of empirical studies on writing motivation conducted in school settings, published between 2000 and 2018 in peer-reviewed journals. We aimed to (1) examine how motivational constructs have been defined in writing research; (2) analyze group differences in writing motivation; (3) unveil effects of motivation on writing performance; (4) gather evidence on teaching practices supporting writing motivation; and (5) examine the impact of digital tools on writing motivation. Through database and hand searches, we located 82 articles that met eligibility criteria. Articles were written in English, focused on students in grades 1–12, and included at least one quantitative or qualitative measure of writing motivation. Across the 82 studies, 24 motivation-related constructs were identified. In 46% of the cases, these constructs were unclearly defined or not defined. Studies showed that overall girls were more motivated to write than boys. Most studies indicated moderate positive associations between motivation and writing performance measures. Authors also examined how students’ writing motivation was influenced by teaching practices, such as handwriting instruction, self-regulated strategy development instruction, and collaborative writing. Digital tools were found to have a positive effect on motivation. Based on this review, we suggest that to move the field forward, researchers need to accurately define motivational constructs; give further attention to understudied motivational constructs; examine both individual and contextual factors; conduct longitudinal studies; identify evidence-based practices that could inform professional development programs for teachers; and test long-term effects of digital tools.


*Nolen, S. B. (2007). Young children’s motivation to read and write: Development in social contexts. Cognition and Instruction, 25 (2), 219–270. https://doi.org/10.1080/07370000701301174 .

Nolen, A. L. (2009). The content of educational psychology: An analysis of top ranked journals from 2003 through 2007. Educational Psychology Review, 21 (3), 279–289. https://doi.org/10.1007/s10648-009-9110-2 .

*Olinghouse, N. G., & Graham, S. (2009). The relationship between the discourse knowledge and the writing performance of elementary-grade students. Journal of Educational Psychology, 101 (1), 37–50. https://doi.org/10.1037/a0013462 .

Ouzzani, M., Hammady, H., Fedorowicz, Z., & Elmagarmid, A. (2016). Rayyan—A web and mobile app for systematic reviews. Systematic Reviews, 5 (1), 1–10. https://doi.org/10.1186/s13643-016-0384-4 .

Pajares, F. (2003). Self-efficacy beliefs, motivation, and achievement in writing: A review of the literature. Reading & Writing Quarterly, 19 (2), 139–158. https://doi.org/10.1080/10573560308222 .

*Pajares, F. (2007). Empirical properties of a scale to assess writing self-efficacy in school contexts. Measurement and Evaluation in Counseling and Development, 39 (4). https://doi.org/10.1080/07481756.2007.11909801 .

*Pajares, F., & Cheong, Y. F. (2003). Achievement goal orientations in writing: A developmental perspective. International Journal of Educational Research, 39 (4), 437–455. https://doi.org/10.1016/j.ijer.2004.06.008 .

*Pajares, F., & Valiante, G. (2001). Gender differences in writing motivation and achievement of middle school students: A function of gender orientation? Contemporary Educational Psychology, 26 (3), 366–381. https://doi.org/10.1006/ceps.2000.1069 .

*Pajares, F., Britner, S. L., & Valiante, G. (2000). Relation between achievement goals and self-beliefs of middle school students in writing and science. Contemporary Educational Psychology, 25 (4), 406–422. https://doi.org/10.1006/ceps.1999.1027 .

*Pajares, F., Hartley, J., & Valiante, G. (2001). Response format in writing self-efficacy assessment: Greater discrimination increases prediction. Measurement and Evaluation in Counseling and Development, 33 (4), 214–221.

*Pajares, F., Johnson, M. J., & Usher, E. L. (2007). Sources of writing self-efficacy beliefs of elementary, middle, and high school students. Research in the Teaching of English, 42 (1), 104–120.

*Perry, N. E., Nordby, C. J., & VandeKamp, K. O. (2003). Promoting self-regulated reading and writing at home and school. The Elementary School Journal, 103 (4), 317–338. https://doi.org/10.1086/499729 .

Peterson, E. G., & Cohen, J. (2019). A case for domain-specific curiosity in mathematics. Educational Psychology Review, Advance online publication, 31 (4), 807–832. https://doi.org/10.1007/s10648-019-09501-4 .

Pintrich, P. R. (1994). Continuities and discontinuities: Future directions for research in educational psychology. Educational Psychologist, 29 , 137–148.

Pintrich, P. R., & Schunk, D. D. (2002). Motivation in education: Theory, research, and applications (2nd ed.). Columbus, OH: Merrill.

*Potter, E. F., McCormick, C. B., & Busching, B. A. (2001). Academic and life goals: Insights from adolescent writers. The High School Journal, 85 (1), 45–55. https://doi.org/10.1353/hsj.2001.0018 .

Renninger, K. A., Hidi, S., & Krapp, A. (1992). The role of interest in learning and development . Hillsdale, NJ: Erlbaum.

*Rosário, P., Högemann, J., Núñez, J. C., Vallejo, G., Cunha, J., Oliveira, V., Fuentes S. Rodrigues, C. (2017). Writing week-journals to improve the writing quality of fourth-graders’ compositions. Reading and Writing: An Interdisciplinary Journal, 30 (5), 1009–1032. https://doi.org/10.1007/s11145-016-9710-4 .

Ryan, R. M., & Deci, E. L. (2000). Intrinsic and extrinsic motivations: Classic definitions and new directions. Contemporary Educational Psychology, 25 (1), 54–67. https://doi.org/10.1006/ceps.1999.1020 .

Saddler, B. (2012). Motivating writers: Theory and interventions. Reading & Writing Quarterly, 28 (1), 1–4. https://doi.org/10.1080/10573569.2012.632727 .

*Sessions, L., Kang, M. O., & Womack, S. (2016). The neglected “R”: Improving writing instruction through iPad apps. TechTrends: Linking Research and Practice to Improve Learning, 60 (3), 218–225. https://doi.org/10.1007/s11528-016-0041-8 .

*Smith, E. V., Jr., Wakely, M. B., De Kruif, R. E. L., & Swartz, C. W. (2003). Optimizing rating scales for self-efficacy (and other) research. Educational and Psychological Measurement, 63 (3), 369–391. https://doi.org/10.1177/0013164403063003002 .

Talsma, K., Schüza, B., Schwarzerc, R., & Norrisa, K. (2018). I believe, therefore I achieve (and vice versa): A meta-analytic cross-lagged panel analysis of self-efficacy and academic performance. Learning and Individual Differences , (61), 136–150. https://doi.org/10.1016/j.lindif.2017.11.015 .

Tollefson, J. (2018). China declared world’s largest producer of scientific articles. Nature, 553 (7689), 390–390. https://doi.org/10.1038/d41586-018-00927-4 .

Troia, G. A., Shankland, R. K., & Wolbers, K. A. (2012). Motivation research in writing: Theoretical and empirical considerations. Reading & Writing Quarterly, 28 (1), 5–28. https://doi.org/10.1080/10573569.2012.632729 .

*Troia, G. A., Harbaugh, A. G., Shankland, R. K., Wolbers, K. A., & Lawrence, A. M. (2013). Relationships between writing motivation, writing activity, and writing performance: Effects of grade, sex, and ability. Reading and Writing: An Interdisciplinary Journal, 26 (1), 17–44. https://doi.org/10.1007/s11145-012-9379-2 .

United Nations Educational, Scientific and Cultural Organization, UNESCO. (2011). UNESCO and education: Everyone has the right to education . Paris: UNESCO.

Valentine, J. C., DuBois, D. L., & Cooper, H. (2004). The relation between self-beliefs and academic achievement: A meta-analytic review. Educational Psychologist, 39 (2), 111–133. https://doi.org/10.1207/s15326985ep3902_3 .

*Villalón, R., Mateos, M., & Cuevas, I. (2015). High school boys’ and girls’ writing conceptions and writing self-efficacy beliefs: What is their role in writing performance? Educational Psychology, 35 (6), 653-674.

Weiner, B. (1985). An attributional theory of achievement, motivation and emotion. Psychological Review, 92 (4), 548–573. https://doi.org/10.1007/978-1-4612-4948-1 .

Wentzel, K., & Miele, D. (2016). Overview. In K. Wentzel & D. Miele (Eds.), Handbook of motivation at school (pp. 1–8). New York: Routledge.

Wigfield, A. (1994). Expectancy-value theory of achievement motivation: A developmental perspective. Educational Psychology Review, 6 (1), 49–78. https://doi.org/10.1007/bf02209024 .

*Wijekumar, K., Graham, S., Harris, K. R., Lei, P.-W., Barkel, A., Aitken, A., Ray A. Houston, J. (2018). The roles of writing knowledge, motivation, strategic behaviors, and skills in predicting elementary students’ persuasive writing from source material. Reading and Writing: An Interdisciplinary Journal , 32 6, 1431, 1457. https://doi.org/10.1007/s11145-018-9836-7

*Wilson, J., & Czik, A. (2016). Automated essay evaluation software in English Language Arts classrooms: Effects on teacher feedback, student motivation, and writing quality. Computers & Education, 100 , 94–109. https://doi.org/10.1016/j.compedu.2016.05.004 .

*Yarrow, F., & Topping, K. J. (2001). Collaborative writing: The effects of metacognitive prompting and structured peer interaction. British Journal of Educational Psychology, 71 (2), 261–282. https://doi.org/10.1348/000709901158514 .

*Yilmaz Soylu, M., Zeleny, M. G., Zhao, R., Bruning, R. H., Dempsey, M. S., & Kauffman, D. F. (2017). Secondary students' writing achievement goals: Assessing the mediating effects of mastery and performance goals on writing self-efficacy, affect, and writing achievement. Frontiers in Psychology, 8 . https://doi.org/10.3389/fpsyg.2017.01406 .

*Zumbrunn, S., Marrs, S., & Mewborn, C. (2016). Toward a better understanding of student perceptions of writing feedback: A mixed methods study. Reading and Writing: An Interdisciplinary Journal, 29 (2), 349–370. https://doi.org/10.1007/s11145-015-9599-3 .

Download references

Acknowledgements

The authors would like to thank Steve Graham and Hilde Van Keer for invaluable comments on earlier versions of this article. The authors also thank Mariana Silva for contributing to the study quality assessment.

This work was supported by a grant attributed to the first author from the Portuguese Foundation for Science and Technology (grant SFRH/BD/116281/2016).

Author information

Authors and Affiliations

Faculty of Psychology and Education Sciences, University of Porto, Rua Alfredo Allen, 4200-135, Porto, Portugal

Ana Camacho & Rui A. Alves

Department of Developmental Psychology and Socialization, University of Padova, Padova, Italy

Pietro Boscolo


Corresponding author

Correspondence to Ana Camacho.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic Supplementary Material

(PDF 804 kb)


About this article

Camacho, A., Alves, R.A. & Boscolo, P. Writing Motivation in School: a Systematic Review of Empirical Research in the Early Twenty-First Century. Educ Psychol Rev 33 , 213–247 (2021). https://doi.org/10.1007/s10648-020-09530-4

Published: 19 June 2020

Issue Date: March 2021

DOI: https://doi.org/10.1007/s10648-020-09530-4


  • Systematic review
  • Teaching practices

Empirical Research: Defining, Identifying, & Finding

Searching for Empirical Research


Where Do I Find Empirical Research?


Because empirical research refers to the method of investigation rather than a method of publication, it can be published in a number of places. In many disciplines empirical research is most commonly published in scholarly, peer-reviewed journals . Putting empirical research through the peer review process helps ensure that the research is high quality. 

Finding Peer-Reviewed Articles

You can find peer-reviewed articles in a general web search along with a lot of other types of sources. However, these specialized tools are more likely to find peer-reviewed articles:

  • Library databases
  • Academic search engines such as Google Scholar

Common Types of Articles That Are Not Empirical

However, just finding an article in a peer-reviewed journal is not enough to say it is empirical, since not all the articles in a peer-reviewed journal will be empirical research or even peer reviewed. Knowing how to quickly identify some types of non-empirical research articles in peer-reviewed journals can help speed up your search.

  • Theoretical or conceptual articles: peer-reviewed articles that systematically discuss and propose abstract concepts and methods for a field without primary data collection.
  • Example: Grosser, K. & Moon, J. (2019). CSR and feminist organization studies: Towards an integrated theorization for the analysis of gender issues.
  • Literature review articles: peer-reviewed articles that systematically describe, summarize, and often categorize and evaluate previous research on a topic without collecting new data.
  • Example: Heuer, S. & Willer, R. (2020). How is quality of life assessed in people with dementia? A systematic literature review and a primer for speech-language pathologists.
  • Note: empirical research articles will have a literature review section as part of the Introduction, but in an empirical research article the literature review exists to give context to the empirical research, which is the primary focus of the article. In a literature review article, the literature review is the focus.
  • While these articles are not empirical, they are often a great source of information on previous empirical research on a topic, with citations to find that research.
  • Editorials, commentaries, and letters: non-peer-reviewed articles where the authors discuss their thoughts on a particular topic without data collection and a systematic method. There are a few differences between these types of articles:
  • Editorials are written by the editors or guest editors of the journal.
  • Example: Naples, N. A., Mauldin, L., & Dillaway, H. (2018). From the guest editors: Gender, disability, and intersectionality.
  • Commentaries and opinion pieces are written by guest authors. The journal may have a non-peer-reviewed process for authors to submit these articles, and the editors of the journal may invite authors to write opinion articles.
  • Example: García, J. J.-L., & Sharif, M. Z. (2015). Black lives matter: A commentary on racism and public health.
  • Letters to the editor are written by the readers of a journal, often in response to an article previously published in the journal.
  • Example: Nathan, M. (2013). Letters: Perceived discrimination and racial/ethnic disparities in youth problem behaviors.
  • Book and product reviews: non-peer-reviewed articles that describe and evaluate books, products, services, and other things the audience of the journal would be interested in.
  • Example: Robinson, R. & Green, J. M. (2020). Book review: Microaggressions and traumatic stress: Theory, research, and clinical treatment.

How Do I Find More Empirical Research in My Search?

Even once you know how to recognize empirical research and where it is published, it would be nice to improve your search results so that more empirical research shows up for your topic.

There are two major ways to find the empirical research in a database search:

  • Use built-in database tools to limit results to empirical research.
  • Include search terms that help identify empirical research.
  • Open access
  • Published: 19 April 2024

Person-centered care assessment tool with a focus on quality healthcare: a systematic review of psychometric properties

  • Lluna Maria Bru-Luna 1 ,
  • Manuel Martí-Vilar 2 ,
  • César Merino-Soto 3 ,
  • José Livia-Segovia 4 ,
  • Juan Garduño-Espinosa 5 &
  • Filiberto Toledano-Toledano 5 , 6 , 7  

BMC Psychology volume 12, Article number: 217 (2024)


Abstract

Background

The person-centered care (PCC) approach plays a fundamental role in ensuring quality healthcare. The Person-Centered Care Assessment Tool (P-CAT) is one of the shortest and simplest tools currently available for measuring PCC. The objective of this study was to conduct a systematic review of the evidence in validation studies of the P-CAT, taking the “Standards” as a frame of reference.

Methods

First, a systematic literature review was conducted following the PRISMA method. Second, a systematic descriptive literature review of validity tests was conducted following the “Standards” framework. The search strategy and information sources were obtained from the Cochrane, Web of Science (WoS), Scopus and PubMed databases. With regard to the eligibility criteria and selection process, a protocol was registered in PROSPERO (CRD42022335866), and articles had to meet criteria for inclusion in the systematic review.

Results

A total of seven articles were included. Empirical evidence indicates that these validations offer a high number of sources related to test content, internal structure for dimensionality and internal consistency. A moderate number of sources pertain to internal structure in terms of test-retest reliability and the relationship with other variables. There is little evidence of response processes, internal structure in measurement invariance terms, and test consequences.

Conclusions

The various validations of the P-CAT are not framed in a structured, valid, theory-based procedural framework like the “Standards” are. This can affect clinical practice because people’s health may depend on it. The findings of this study show that validation studies continue to focus on the types of validity traditionally studied and overlook interpretation of the scores in terms of their intended use.

Background

Person-centered care (PCC)

Quality care for people with chronic diseases, functional limitations, or both has become one of the main objectives of medical and care services. The person-centered care (PCC) approach is an essential element not only in achieving this goal but also in providing high-quality health maintenance and medical care [ 1 , 2 , 3 ]. In addition to guaranteeing human rights, PCC provides numerous benefits to both the recipient and the provider [ 4 , 5 ]. Additionally, PCC includes a set of necessary competencies for healthcare professionals to address ongoing challenges in this area [ 6 ]. PCC includes the following elements [ 7 ]: an individualized, goal-oriented care plan based on individuals’ preferences; an ongoing review of the plan and the individual’s goals; support from an interprofessional team; active coordination among all medical and care providers and support services; ongoing information exchange, education and training for providers; and quality improvement through feedback from the individual and caregivers.

There is currently a growing body of literature on the application of PCC. A good example is McCormack’s widely known mid-range theory [ 8 ], an internationally recognized theoretical framework for PCC and how it is operationalized in practice, which serves as a guide for care practitioners and researchers in hospital settings. In this framework, PCC is conceived of as “an approach to practice that is established through the formation and fostering of therapeutic relationships between all care providers, service users, and others significant to them, underpinned by values of respect for persons, [the] individual right to self-determination, mutual respect, and understanding” [ 9 ].

Thus, as established by PCC, it is important to emphasize that reference to the person who is the focus of care refers not only to the recipient but also to everyone involved in a care interaction [ 10 , 11 ]. PCC ensures that professionals are trained in relevant skills and methodology since, as discussed above, carers are among the agents who have the greatest impact on the quality of life of the person in need of care [ 12 , 13 , 14 ]. Furthermore, due to the high burden of caregiving, it is essential to account for caregivers’ well-being. In this regard, studies on professional caregivers are beginning to suggest that the provision of PCC can produce multiple benefits for both the care recipient and the caregiver [ 15 ].

Despite a considerable body of literature and the frequent inclusion of the term in health policy and research [ 16 ], PCC involves several complications. There is no standard consensus on the definition of this concept [ 17 ], which creates problems in areas such as efficacy assessment [ 18 , 19 ]. In addition, the difficulty of measuring the subjectivity involved in identifying the dimensions of PCC and the infrequent use of standardized measures are acute issues [ 20 ]. These limitations motivated the creation of the Person-Centered Care Assessment Tool (P-CAT; [ 21 ]), which emerged from the need for a brief, economical, easily applied, versatile and comprehensive assessment instrument to provide valid and reliable measures of PCC for research purposes [ 21 ].

Person-centered care assessment tool (P-CAT)

There are several instruments that can measure PCC from different perspectives (i.e., the caregiver or the care recipient) and in different contexts (e.g., hospitals and nursing homes). However, from a practical point of view, the P-CAT is one of the shortest and simplest tools and contains all the essential elements of PCC described in the literature. It was developed in Australia to measure the approach of long-term residential settings to older people with dementia, although it is increasingly used in other healthcare settings, such as oncology units [ 22 ] and psychiatric hospitals [ 23 ].

Due to the brevity and simplicity of its application, the versatility of its use in different medical and care contexts, and its potential emic characteristics (i.e., constructs that can be cross-culturally applicable with a reasonably similar structure and interpretation; [ 24 ]), the P-CAT is one of the tests most widely used by professionals to measure PCC [ 25 , 26 ]. Since its creation, it has been adapted in countries separated by wide cultural and linguistic differences, such as Norway [ 27 ], Sweden [ 28 ], China [ 29 ], South Korea [ 30 ], Spain [ 25 ], and Italy [ 31 ].

The P-CAT comprises 13 items rated on a 5-point ordinal scale (from “strongly disagree” to “strongly agree”), with high scores indicating a high degree of person-centeredness. The scale consists of three dimensions: person-centered care (7 items), organizational support (4 items) and environmental accessibility (2 items). In the original study ( n  = 220; [ 21 ]), the internal consistency of the instrument yielded satisfactory values for the total scale ( α  = 0.84) and good test-retest reliability ( r  = .66) at one-week intervals. A reliability generalization study conducted in 2021 [ 32 ] that estimated the internal consistency of the P-CAT and analyzed possible factors that could affect it revealed that the mean α value for the 25 meta-analysis samples (some of which were part of the validations included in this study) was 0.81, and the only variable that had a statistically significant relationship with the reliability coefficient was the mean age of the sample. With respect to internal structure validity, three factors (56% of the total variance) were obtained, and content validity was assessed by experts, literature reviews and stakeholders [ 33 ].

Although not explicitly stated, the apparent commonality between validation studies of different versions of the P-CAT may reflect a decades-old, influential validity framework that differentiates three categories: content validity, construct validity, and criterion validity [ 34 , 35 ]. However, a reformulation of the validity of the P-CAT within a modern framework, which would provide a different definition of validity, has not been performed.

Scale validity

Traditionally, validation is a process focused on the psychometric properties of a measurement instrument [ 36 ]. In the early 20th century, with the frequent use of standardized measurement tests in education and psychology, two definitions emerged: the first defined validity as the degree to which a test measures what it intends to measure, while the second described the validity of an instrument in terms of the correlation it presents with a variable [ 35 ].

However, in the past century, validity theory has evolved, leading to the understanding that validity should be based on specific interpretations for an intended purpose. It should not be limited to empirically obtained psychometric properties but should also be supported by the theory underlying the construct measured. Thus, to speak of classical or modern validity theory suggests an evolution in the classical or modern understanding of the concept of validity. Therefore, a classical approach (called classical test theory, CTT) is specifically differentiated from a modern approach. In general, recent concepts associated with a modern view of validity are based on (a) a unitary conception of validity and (b) validity judgments based on inferences and interpretations of the scores of a measure [ 37 , 38 ]. This conceptual advance in the concept of validity led to the creation of a guiding framework for obtaining evidence to support the use and interpretation of the scores obtained by a measure [ 39 ].

This purpose is addressed by the Standards for Educational and Psychological Testing (“Standards”), a guide created by the American Educational Research Association (AERA), the American Psychological Association (APA) and the National Council on Measurement in Education (NCME) in 2014 with the aim of providing guidelines to assess the validity of the interpretations of scores of an instrument based on their intended use. Two conceptual aspects stand out in this modern view of validity: first, validity is a unitary concept centered on the construct; second, validity is defined as “the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests” [ 37 ]. Thus, the “Standards” propose several sources that serve as a reference for assessing different aspects of validity. The five sources of validity evidence are as follows [ 37 ]: test content, response processes, internal structure, relations to other variables and consequences of testing. According to AERA et al. [ 37 ], test content validity refers to the relationship of the administration process, subject matter, wording and format of test items to the construct they are intended to measure. It is measured predominantly with qualitative methods but without excluding quantitative approaches. The validity of the responses is based on analysis of the cognitive processes and interpretation of the items by respondents and is measured with qualitative methods. Internal structure validity is based on the interrelationship between the items and the construct and is measured by quantitative methods. Validity in terms of the relationship with other variables is based on comparison between the variable that the instrument intends to measure and other theoretically relevant external variables and is measured by quantitative methods. Finally, validity based on the results of the test analyzes consequences, both intended and unintended, that may be due to a source of invalidity. It is measured mainly by qualitative methods.

Thus, although validity plays a fundamental role in providing a strong scientific basis for interpretations of test scores, validation studies in the health field have traditionally focused on content validity, criterion validity and construct validity and have overlooked the interpretation and use of scores [ 34 ].

The “Standards” are considered a suitable validity theory-based procedural framework for reviewing the validity of questionnaires due to their ability to analyze sources of validity from both qualitative and quantitative approaches and their evidence-based method [ 35 ]. Nevertheless, due to a lack of knowledge or the lack of a systematic description protocol, very few instruments to date have been reviewed within the framework of the “Standards” [ 39 ].

Current study

Although the P-CAT is one of the most widely used instruments by professionals and has seven validations [ 25 , 27 , 28 , 29 , 30 , 31 , 40 ], no analysis has been conducted of its validity within the framework of the “Standards”. That is, empirical evidence of the validity of the P-CAT has not been obtained in a way that helps to develop a judgment based on a synthesis of the available information.

A review of this type is critical given that some methodological issues seem to have not been resolved in the P-CAT. For example, although the multidimensionality of the P-CAT was identified in the study that introduced it, Bru-Luna et al. [ 32 ] recently stated that in adaptations of the P-CAT [ 25 , 27 , 28 , 29 , 30 , 40 ], the total score is used for interpretation and multidimensionality is disregarded. Thus, the multidimensionality of the original study was apparently not replicated. Bru-Luna et al. [ 32 ] also indicated that the internal structure validity of the P-CAT is usually underreported due to a lack of sufficiently rigorous approaches to establish with certainty how its scores are calculated.

The validity of the P-CAT, specifically its internal structure, appears to be unresolved. Nevertheless, substantive research and professional practice point to this measure as relevant to assessing PCC. This perception is contestable and judgment-based and may not be sufficient to assess the validity of the P-CAT from a cumulative and synthetic angle based on preceding validation studies. An adequate assessment of validity requires a model to conceptualize validity followed by a review of previous studies of the validity of the P-CAT using this model.

Therefore, the main purpose of this study was to conduct a systematic review of the evidence provided by P-CAT validation studies while taking the “Standards” as a framework.

Methods

The present study comprises two distinct but interconnected procedures. First, a systematic literature review was conducted following the PRISMA method ( [ 41 ]; Additional file 1; Additional file 2) with the aim of collecting all validations of the P-CAT that have been developed. Second, a systematic description of the validity evidence for each of the P-CAT validations found in the systematic review was developed following the “Standards” framework [ 37 ]. The work of Hawkins et al. [ 39 ], the first study to review validity sources according to the guidelines proposed by the “Standards”, was also used as a reference. Both provided conceptual and pragmatic guidance for organizing and classifying validity evidence for the P-CAT.

The procedure conducted in the systematic review is described below, followed by the procedure for examining the validity studies.

Systematic review

Search strategy and information sources

Initially, the Cochrane database was searched with the aim of identifying systematic reviews of the P-CAT. When no such reviews were found, subsequent preliminary searches were performed in the Web of Science (WoS), Scopus and PubMed databases. These databases play a fundamental role in recent scientific literature since they are the main sources of published articles that undergo high-quality content and editorial review processes [ 42 ]. The search strategy was as follows: the original P-CAT article [ 21 ] was located, after which all articles that cited it through 2021 were identified and analyzed. This approach ensured the inclusion of all validations. No articles were excluded on the basis of language to avoid language bias [ 43 ]. Moreover, to reduce the effects of publication bias, a complementary search in Google Scholar was also performed to allow the inclusion of “gray” literature [ 44 ]. Finally, a manual search was performed through a review of the references of the included articles to identify other articles that met the search criteria but were not present in any of the aforementioned databases.
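The collection and deduplication step implied by this citation-based search can be sketched in a few lines of code. The fragment below is only an illustration of the general idea, not the authors' actual workflow; the record fields (doi, title, source) and the sample data are hypothetical.

```python
from typing import Dict, List


def deduplicate(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep one copy of each citing article found across databases.

    Records are matched on DOI when available and otherwise on a
    normalized title; both field names are illustrative assumptions.
    """
    seen = set()
    unique = []
    for rec in records:
        key = rec.get("doi") or rec.get("title", "").strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical records exported from WoS, Scopus, PubMed and Google Scholar
records = [
    {"doi": "10.1000/example1", "title": "P-CAT validation A", "source": "WoS"},
    {"doi": "10.1000/example1", "title": "P-CAT validation A", "source": "Scopus"},
    {"doi": "", "title": "P-CAT validation B", "source": "Google Scholar"},
]
print(len(deduplicate(records)))  # 2 unique records
```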

This process was conducted by one of the authors and corroborated by another using the Covidence tool [ 45 ]. A third author was consulted in case of doubt.

Eligibility criteria and selection process

The protocol was registered in PROSPERO, and the search was conducted according to these criteria. The identification code is CRD42022335866.

The articles had to meet the following criteria for inclusion in the systematic review: (a) a methodological approach to P-CAT validations, (b) experimental or quasi-experimental studies, (c) studies with any type of sample, and (d) studies in any language. We discarded studies that met at least one of the following exclusion criteria: (a) systematic reviews or bibliometric reviews of the instrument or meta-analyses or (b) studies published after 2021.

Data collection process

After the articles were selected, the most relevant information was extracted from each article. Fundamental data were recorded in an Excel spreadsheet for each of the sections: introduction, methodology, results and discussion. Information was also recorded about the limitations mentioned in each article as well as the practical implications and suggestions for future research.

Given the aim of the study, information was collected about the sources of validity of each study, including test content (judges’ evaluation, literature review and translation), response processes, internal structure (factor analysis, design, estimator, factor extraction method, factors and items, interfactor R, internal replication, effect of the method, and factor loadings), relationships with other variables (convergent, divergent, concurrent and predictive validity), and consequences of measurement.

Description of the validity study

To assess the validity of the studies, an Excel table was used. Information was recorded for the seven articles included in the systematic review. The data were extracted directly from the texts of the articles and included information about the authors, the year of publication, the country where each P-CAT validation was produced and each of the five standards proposed in the “Standards” [ 37 ].

The validity source related to internal structure was divided into three sections to record information about dimensionality (e.g., factor analysis, design, estimator, factor extraction method, factors and items, interfactor R, internal replication, effect of the method, and factor loadings), reliability expression (i.e., internal consistency and test-retest) and the study of factorial invariance according to the groups into which it was divided (e.g., sex, age, profession) and the level of study (i.e., metric, intercepts). This approach allowed much more information to be obtained than relying solely on source validity based on internal structure. This division was performed by the same researcher who performed the previous processes.
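As an illustration of how such an extraction table can be organized, the sketch below defines one record per validation study with one field per source of evidence; all field names are assumptions made for the example and do not reproduce the authors' actual spreadsheet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidityRecord:
    """One row of the evidence-extraction table (fields are illustrative)."""
    authors: str
    year: int
    country: str
    test_content: Optional[str] = None       # translation, expert judges, literature
    response_processes: Optional[str] = None
    dimensionality: Optional[str] = None     # factor analysis, design, estimator...
    reliability: Optional[str] = None        # internal consistency, test-retest
    invariance: Optional[str] = None         # groups and level (metric, intercepts)
    other_variables: Optional[str] = None    # convergent, divergent, concurrent...
    consequences: Optional[str] = None


# Example entry (values are placeholders, not extracted data)
row = ValidityRecord(authors="Example et al.", year=2015, country="Norway",
                     test_content="translation by bilingual experts")
```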

Results

Study selection and study characteristics

The systematic review process was developed according to the PRISMA methodology [ 41 ].

The WoS, Scopus, PubMed and Google Scholar databases were searched on February 12, 2022 and yielded a total of 485 articles. Of these, 111 were found in WoS, 114 in Scopus, 43 in PubMed and 217 in Google Scholar. In the first phase, the title and abstracts of all the articles were read. In this first screening, 457 articles were eliminated because they did not include studies with a methodological approach to P-CAT validation and one article was excluded because it was the original P-CAT article. This resulted in a total of 27 articles, 19 of which were duplicated in different databases and, in the case of Google Scholar, within the same database. This process yielded a total of eight articles that were evaluated for eligibility by a complete reading of the text. In this step, one of the articles was excluded due to a lack of access to the full text of the study [ 31 ] (although the original manuscript was found, it was impossible to access the complete content; in addition, the authors of the manuscript were contacted, but no reply was received). Finally, a manual search was performed by reviewing the references of the seven studies, but none were considered suitable for inclusion. Thus, the review was conducted with a total of seven articles.
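The screening figures reported above can be checked with simple arithmetic; the following sketch merely restates the counts given in the text.

```python
# Records identified per database (as reported in the text)
identified = {"WoS": 111, "Scopus": 114, "PubMed": 43, "Google Scholar": 217}
total = sum(identified.values())            # 485

after_screening = total - 457 - 1           # non-validation studies and the original P-CAT article
after_deduplication = after_screening - 19  # duplicates across and within databases
included = after_deduplication - 1          # full text unavailable (Italian version)

print(total, after_screening, after_deduplication, included)  # 485 27 8 7
```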

Of the seven studies, six were original validations in other languages. These included Norwegian [ 27 ], Swedish [ 28 ], Chinese (which has two validations [ 29 , 40 ]), Spanish [ 25 ], and Korean [ 30 ]. The study by Selan et al. [ 46 ] included a modification of the Swedish version of the P-CAT and explored the psychometric properties of both versions (i.e., the original Swedish version and the modified version).

The item selection and screening process are illustrated in detail in Fig.  1 .

Fig. 1 PRISMA 2020 flow diagram for new systematic reviews including database searches

Validity analysis

To provide a clear overview of the validity analyses, Table  1 descriptively shows the percentages of articles that provide information about the five standards proposed by the “Standards” guide [ 37 ].

The table shows a high number of validity sources related to test content and internal structure in relation to dimensionality and internal consistency, followed by a moderate number of sources for test-retest and relationship with other variables. A rate of 0% is observed for validity sources related to response processes, invariance and test consequences. Below, different sections related to each of the standards are shown, and the information is presented in more detail.
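The percentages summarized in Table 1 can be recovered from the study-level results reported in the following subsections. The short sketch below restates those counts for the seven included studies; the real table may group or label the evidence somewhat differently.

```python
n_studies = 7
reported = {
    "test content": 7,
    "response processes": 0,
    "internal structure: dimensionality": 7,
    "internal structure: internal consistency": 7,
    "internal structure: test-retest": 4,
    "internal structure: invariance": 0,
    "relations to other variables": 4,
    "consequences of testing": 0,
}
for source, k in reported.items():
    print(f"{source}: {k}/{n_studies} = {100 * k / n_studies:.0f}%")
```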

Evidence based on test content

The first standard, which focused on test content, was met by all articles (100%). Translation, which refers to the equivalence of content between the original language and the target language, was met in the six articles that conducted validation in another language and/or culture. These studies reported that the validations were translated by bilingual experts and/or experts in the area of care. In addition, three studies [ 25 , 29 , 40 ] reported that the translation process followed International Test Commission guidelines, such as those of Beaton et al. [ 47 ], Guillemin [ 48 ], Hambleton et al. [ 49 ], and Muñiz et al. [ 50 ]. Evaluation by judges, which addressed the relevance, clarity and importance of the content, was divided into two categories: expert evaluation (a panel of expert judges for each of the areas to consider in the evaluation instrument) and experiential evaluation (potential participants testing the test). The first type of evaluation occurred in three of the articles [ 28 , 29 , 46 ], while the other occurred in two [ 25 , 40 ]. Only one of the studies [ 29 ] reported that the scale contained items reflecting the dimensions described in the literature. The validity evidence related to the test content presented in each article can be found in Table  2 .

Evidence based on response processes

The second standard, related to the validity of the response process, was obtained according to the “Standards” from the analysis of individual responses: “questioning test takers about their performance strategies or response to particular items (…), maintaining records that monitor the development of a response to a writing task (…), documentation of other aspects of performance, like eye movement or response times…” [ 37 ] (p. 15). With respect to this type of validity, none of the articles provided such evidence.

Evidence based on internal structure

The third standard, validity related to internal structure, was divided into three sections. First, the dimensionality of each study was examined in terms of factor analysis, design, estimator, factor extraction method, factors and items, interfactor R, internal replication, effect of the method, and factor loadings. Le et al. [ 40 ] conducted an exploratory-confirmatory design while Sjögren et al. [ 28 ] conducted a confirmatory-exploratory design to assess construct validity using confirmatory factor analysis (CFA) and investigated it further using exploratory factor analysis (EFA). The remaining articles employed only a single form of factor analysis: three employed EFA, and two employed CFA. Regarding the next point, only three of the articles reported the factor extraction method used, including Kaiser’s eigenvalue criterion, the scree plot test, parallel analysis and Velicer’s MAP test. Instrument validations yielded a total of two factors in five of the seven articles, while one yielded a single dimension [ 25 ] and the other yielded three dimensions [ 29 ], as in the original instrument. The interfactor R was reported only in the study by Zhong and Lou [ 29 ], whereas in the study by Martínez et al. [ 25 ], it could be easily obtained since it consisted of only one dimension. Internal replication was also calculated in the Spanish validation by randomly splitting the sample into two to test the correlations between factors. The effect of the method was not reported in any of the articles. This information is presented in Table  3 in addition to a summary of the factor loadings.
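Two of the factor-retention criteria mentioned above, Kaiser's eigenvalue rule and parallel analysis, can be illustrated with a short simulation. This is a generic sketch on artificial data, not a reanalysis of any P-CAT sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated responses: 300 respondents x 13 items sharing one common factor (illustrative only)
n, p = 300, 13
data = rng.normal(size=(n, p)) + rng.normal(size=(n, 1))


def eigenvalues(x: np.ndarray) -> np.ndarray:
    """Eigenvalues of the item correlation matrix, in descending order."""
    return np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]


obs = eigenvalues(data)

# Kaiser's rule: retain components with eigenvalues greater than 1
kaiser_k = int(np.sum(obs > 1))

# Parallel analysis: retain components whose eigenvalues exceed the 95th
# percentile of eigenvalues obtained from random data of the same size
random_eigs = np.array([eigenvalues(rng.normal(size=(n, p))) for _ in range(200)])
threshold = np.percentile(random_eigs, 95, axis=0)
parallel_k = int(np.sum(obs > threshold))  # simplified count, ignores ordering

print(kaiser_k, parallel_k)
```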

The second section examined reliability. All the studies reported internal consistency, in every case using Cronbach’s α coefficient for both the total scale and the subscales; McDonald’s ω coefficient was not used in any study. Four of the seven articles assessed test-retest reliability. Martínez et al. [ 25 ] conducted the retest after a period of seven days, while Le et al. [ 40 ] and Rokstad et al. [ 27 ] performed it between one and two weeks later and Sjögren et al. [ 28 ] allowed approximately two weeks to pass after the initial test.

The third section analyzes the calculation of invariance, which was not reported in any of the studies.

Evidence based on relationships with other variables

In the fourth standard, based on validity according to the relationship with other variables, the articles that reported it used only convergent validity (i.e., it was hypothesized that the variables related to the construct measured by the test—in this case, person-centeredness—were positively or negatively related to another construct). Discriminant validity hypothesizes that the variables related to the PCC construct are not correlated in any way with any other variable studied. No article (0%) measured discriminant evidence, while four (57%) measured convergent evidence [ 25 , 29 , 30 , 46 ]. Convergent validity was obtained through comparisons with instruments such as the Person-Centered Climate Questionnaire–Staff Version (PCQ-S), the Staff-Based Measures of Individualized Care for Institutionalized Persons with Dementia (IC), the Caregiver Psychological Elder Abuse Behavior Scale (CPEAB), the Organizational Climate (CLIOR) and the Maslach Burnout Inventory (MBI). In the case of Selan et al. [ 46 ], convergent validity was assessed on two items considered by the authors as “crude measures of person-centered care (i.e., external constructs) giving an indication of the instruments’ ability to measure PCC” (p. 4). Concurrent validity, which measures the degree to which the results of one test are or are not similar to those of another test conducted at more or less the same time with the same participants, and predictive validity, which allows predictions to be established regarding behavior based on comparison between the values of the instrument and the criterion, were not reported in any of the studies.

Evidence based on the consequences of testing

The fifth and final standard was related to the consequences of the test. It analyzed the consequences, both intended and unintended, of applying the test to a given sample. None of the articles presented explicit or implicit evidence of this.

The last two sources of validity can be seen in Table  4 .

Table  5 shows the results of the set of validity tests for each study according to the described standards.

Discussion

The main purpose of this article is to analyze the evidence of validity in different validation studies of the P-CAT. To gather all existing validations, a systematic review of all literature citing this instrument was conducted.

The publication of validation studies of the P-CAT has been constant over the years. Since the publication of the original instrument in 2010, seven validations have been published in other languages (taking into account the Italian version by Brugnolli et al. [ 31 ], which could not be included in this study) as well as a modification of one of these versions. The very unequal distribution of validations between languages and countries is striking. A recent systematic review [ 51 ] revealed that in Europe, the countries where the PCC approach is most widely used are the United Kingdom, Sweden, the Netherlands, Northern Ireland, and Norway. It has also been shown that the neighboring countries seem to exert an influence on each other due to proximity [ 52 ] such that they tend to organize healthcare in a similar way, as is the case for Scandinavian countries. This favors the expansion of PCC and explains the numerous validations we found in this geographical area.

Although this approach is conceived as an essential element of healthcare for most governments [ 53 ], PCC varies according to the different definitions and interpretations attributed to it, which can cause confusion in its application (e.g., between Norway and the United Kingdom [ 54 ]). Moreover, facilitators of or barriers to implementation depend on the context and level of development of each country, and financial support remains one of the main factors in this regard [ 53 ]. This fact explains why PCC is not globally widespread among all territories. In countries where access to healthcare for all remains out of reach for economic reasons, the application of this approach takes a back seat, as does the validation of its assessment tools. In contrast, in a large part of Europe or in countries such as China or South Korea that have experienced decades of rapid economic development, patients are willing to be involved in their medical treatment and enjoy more satisfying and efficient medical experiences and environments [ 55 ], which facilitates the expansion of validations of instruments such as the P-CAT.

Regarding validity testing, the guidelines proposed by the “Standards” [ 37 ] were followed. According to the analysis of the different validations of the P-CAT instrument, none of the studies used a structured validity theory-based procedural framework for conducting validation. The most frequently reported validity tests were on the content of the test and two of the sections into which the internal structure was divided (i.e., dimensionality and internal consistency).

In the present article, the most cited source of validity in the studies was the content of the test because most of the articles were validations of the P-CAT in other languages, and the authors reported that the translation procedure was conducted by experts in all cases. In addition, several of the studies employed International Test Commission guidelines, such as those by Beaton et al. [ 47 ], Guillemin [ 48 ], Hambleton et al. [ 49 ], and Muñiz et al. [ 50 ]. Several studies also assessed the relevance, clarity and importance of the content.

The third source of validity, internal structure, was the next most often reported, although it appeared unevenly among the three sections into which this evidence was divided. Dimensionality and internal consistency were reported in all studies, followed by test-retest consistency. In relation to the first section, factor analysis, a total of five EFAs and four CFAs were presented in the validations. Traditionally, EFA has been used in research to assess dimensionality and identify key psychological constructs, although this approach involves a number of inconveniences, such as difficulty testing measurement invariance and incorporating latent factors into subsequent analyses [ 56 ] or the major problem of factor loading matrix rotation [ 57 ]. Studies eventually began to employ CFA, a technique that overcame some of these obstacles [ 56 ] but had other drawbacks; for example, the strict requirement of zero cross-loadings often does not fit the data well, and misspecification of zero loadings tends to produce distorted factors [ 57 ]. Recently, exploratory structural equation modeling (ESEM) has been proposed. This technique is widely recommended both conceptually and empirically to assess the internal structure of psychological tools [ 58 ] since it overcomes the limitations of EFA and CFA in estimating their parameters [ 56 , 57 ].

The next section, reliability, was reported in all the studies using Cronbach’s α reliability coefficient. Reliability is defined as a combination of systematic and random influences that determine the observed scores on a psychological test. Reporting the reliability measure ensures that item-based scores are consistent, that the tool’s responses are replicable and that they are not modified solely by random noise [ 59 , 60 ]. Currently, the most commonly employed reliability coefficient in studies with a multi-item measurement scale (MIMS) is Cronbach’s α [ 60 , 61 ].

Cronbach’s α [ 62 ] is based on numerous strict assumptions (e.g., the test must be unidimensional, factor loadings must be equal for all items and item errors should not covary) to estimate internal consistency. These assumptions are difficult to meet, and their violation may produce small reliability estimates [ 60 ]. One of the alternative measures to α that is increasingly recommended by the scientific literature is McDonald’s ω [ 63 ], a composite reliability measure. This coefficient is recommended for congeneric scales in which tau equivalence is not assumed. It has several advantages. For example, estimates of ω are usually robust when the estimated model contains more factors than the true model, even with small samples, and in the presence of skewed univariate item distributions ω produces smaller biases than those found when using α [ 59 ].
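For reference, the two coefficients discussed here are commonly written as follows for a unidimensional scale with k items (standard textbook formulations, not taken from the reviewed studies), where the σ² terms are the item and total-score variances, λ are standardized factor loadings, and θ are item error variances:

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^2_i}{\sigma^2_X}\right),
\qquad
\omega = \frac{\left(\sum_{i=1}^{k}\lambda_i\right)^{2}}
              {\left(\sum_{i=1}^{k}\lambda_i\right)^{2} + \sum_{i=1}^{k}\theta_i}
```

Under tau equivalence (equal loadings) the two coefficients coincide; when loadings differ, α is typically a lower bound on ω.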

The test-retest method was the next most commonly reported internal structure section in these studies. This type of reliability considers the consistency of the scores of a test between two measurements separated by a period [ 64 ]. It is striking that test-retest consistency does not have a prevalence similar to that of internal consistency since, unlike internal consistency, test-retest consistency can be assessed for practically all types of patient-reported outcomes. It is even considered by some measurement experts to report reliability with greater relevance than internal consistency since it plays a fundamental role in the calculation of parameters for health measures [ 64 ]. However, the literature provides little guidance regarding the assessment of this type of reliability.

The internal structure section that was least frequently reported in the studies in this review was invariance. A lack of invariance refers to a difference between scores on a test that is not explained by group differences in the structure it is intended to measure [ 65 ]. The invariance of the measure should be emphasized as a prerequisite in comparisons between groups since “if scale invariance is not examined, item bias may not be fully recognized and this may lead to a distorted interpretation of the bias in a particular psychological measure” [ 65 ].
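The levels of invariance alluded to here (and in the data extraction, e.g., "metric, intercepts") are usually formulated as increasingly constrained multigroup factor models. The notation below is a generic sketch of those constraints, with g indexing groups, and is not drawn from the reviewed studies:

```latex
x_{ig} = \tau_{g} + \Lambda_{g}\,\xi_{ig} + \varepsilon_{ig}
\quad\text{(configural: same factor pattern in every group)}

\Lambda_{g} = \Lambda \quad\text{(metric: equal loadings across groups)}

\Lambda_{g} = \Lambda,\;\; \tau_{g} = \tau \quad\text{(scalar: equal loadings and intercepts)}
```

Only when at least metric (and usually scalar) invariance holds can group comparisons of scores or factor means be interpreted without bias.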

Evidence related to other variables was the next most reported source of validity in the studies included in this review. Specifically, the four studies that reported this evidence did so according to convergent validity and cited several instruments. None of the studies included evidence of discriminant validity, although this may be because there are currently several obstacles related to the measurement of this type of validity [ 66 ]. On the one hand, different definitions are used in the applied literature, which makes its evaluation difficult; on the other hand, the literature on discriminant validity focuses on techniques that require the use of multiple measurement methods, which often seem to have been introduced without sufficient evidence or are applied randomly.

Validity related to response processes was not reported by any of the studies. There are several methods to analyze this validity. These methods can be divided into two groups: “those that directly access the psychological processes or cognitive operations (think aloud, focus group, and interviews), compared to those which provide indirect indicators which in turn require additional inference (eye tracking and response times)” [ 38 ]. However, this validity evidence has traditionally been reported less frequently than others in most studies, perhaps because there are fewer clear and accepted practices on how to design or report these studies [ 67 ].

Finally, the consequences of testing were not reported in any of the studies. There is debate regarding this source of validity, with two main opposing streams of thought. On the one hand, some authors [ 68 , 69 ] suggest that consequences that appear after the application of a test should not derive from any source of test invalidity and that “adverse consequences only undermine the validity of an assessment if they can be attributed to a problem of fit between the test and the construct” (p. 6). In contrast, Cronbach [ 69 , 70 ] notes that adverse social consequences that may result from the application of a test may call into question the validity of the test. However, the potential risks that may arise from the application of a test should be minimized in any case, especially in regard to health assessments. To this end, it is essential that this aspect be assessed by instrument developers and that the experiences of respondents be protected through the development of comprehensive and informed practices [ 39 ].

This work is not without limitations. First, not all published validation studies of the P-CAT, such as the Italian version by Brugnolli et al. [ 31 ], were available. These studies could have provided relevant information. Second, many sources of validity could not be analyzed because the studies provided scant or no data, such as response processes [ 25 , 27 , 28 , 29 , 30 , 40 , 46 ], relationships with other variables [ 27 , 28 , 40 ], consequences of testing [ 25 , 27 , 28 , 29 , 30 , 40 , 46 ], or invariance [ 25 , 27 , 28 , 29 , 30 , 40 , 46 ] in the case of internal structure and interfactor R [ 27 , 28 , 30 , 40 , 46 ], internal replication [ 27 , 28 , 29 , 30 , 40 , 46 ] or the effect of the method [ 25 , 27 , 28 , 29 , 30 , 40 , 46 ] in the case of dimensionality. In the future, it is hoped that authors will become aware of the importance of validity, as shown in this article and many others, and provide data on unreported sources so that comprehensive validity studies can be performed.

The present work also has several strengths. The search was extensive: studies were obtained from three different databases, including WoS, one of the most widely used and authoritative databases in the world, which covers a large number and variety of articles and whose indexing is curated by a human editorial team rather than being fully automated [ 71 , 72 , 73 ]. In addition, to limit publication bias, gray-literature search engines such as Google Scholar were used to avoid excluding unpublished research [ 44 ]. Finally, linguistic bias was reduced by not restricting the search to articles published in only one or two languages, thus avoiding the overrepresentation of studies in one language and the underrepresentation of those in others [ 43 ].

Conclusions

Validity is understood as the degree to which evidence and theory support the interpretations of instrument scores for their intended use [ 37 ]. From this perspective, the existing validations of the P-CAT have not been presented within a structured, theory-based procedural framework such as that provided by the “Standards.” After integrating and analyzing the results, it was observed that these validation reports offer a large amount of validity evidence related to test content and to internal structure in terms of dimensionality and internal consistency; a moderate amount for internal structure in terms of test-retest reliability and for relationships with other variables; and very little for response processes, internal structure in terms of invariance, and the consequences of testing.

Validity plays a fundamental role in ensuring a sound scientific basis for test interpretations because it indicates the extent to which the data provided by the test support the intended use. This can affect clinical practice, as people’s health may depend on it. In this sense, the “Standards” constitute a suitable, theory-based procedural framework for studying this modern conception of questionnaire validity and should be taken into account in future research in this area.

Although the P-CAT is one of the most widely used instruments for assessing PCC, this study shows that its validity has rarely been examined comprehensively. The developers of measurement instruments applied in healthcare settings, on which the health and quality of life of many people may depend, should use this validity framework so that the intended purpose of the measurement is clearly reflected. This is important because the fairness of the decisions healthcare professionals make in daily clinical practice may depend on the validity evidence available. Through a more extensive study of validity that includes the interpretation of scores in terms of their intended use, the applicability of the P-CAT, an instrument initially developed for long-term care homes for elderly people, could be extended to other care settings. However, the findings of this study show that validation studies continue to focus on traditionally studied types of validity and overlook the interpretation of scores in terms of their intended use.

Data availability

All data relevant to the study were included in the article or uploaded as additional files. Additional template data extraction forms are available from the corresponding author upon reasonable request.

Abbreviations

AERA: American Educational Research Association
APA: American Psychological Association
CFA: Confirmatory factor analysis
Organizational Climate
Caregiver Psychological Elder Abuse Behavior Scale
EFA: Exploratory factor analysis
ESEM: Exploratory structural equation modeling
Staff-based Measures of Individualized Care for Institutionalized Persons with Dementia
MBI: Maslach Burnout Inventory
Multi-item measurement scale
ML: Maximum likelihood
NCME: National Council on Measurement in Education
P-CAT: Person-Centered Care Assessment Tool
PCC: Person-centered care
PCQ-S: Person-Centered Climate Questionnaire–Staff Version
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
International Register of Systematic Review Protocols
“Standards”: Standards for Educational and Psychological Testing
WLSMV: Weighted least square mean and variance adjusted
WoS: Web of Science

References

Institute of Medicine. Crossing the quality chasm: a new health system for the 21st century. Washington, DC: National Academy; 2001.


International Alliance of Patients’ Organizations. What is patient-centred healthcare? A review of definitions and principles. 2nd ed. London, UK: International Alliance of Patients’ Organizations; 2007.

World Health Organization. WHO global strategy on people-centred and integrated health services: interim report. Geneva, Switzerland: World Health Organization; 2015.

Britten N, Ekman I, Naldemirci Ö, Javinger M, Hedman H, Wolf A. Learning from Gothenburg model of person centred healthcare. BMJ. 2020;370:m2738.


Van Diepen C, Fors A, Ekman I, Hensing G. Association between person-centred care and healthcare providers’ job satisfaction and work-related health: a scoping review. BMJ Open. 2020;10:e042658.


Ekman N, Taft C, Moons P, Mäkitalo Å, Boström E, Fors A. A state-of-the-art review of direct observation tools for assessing competency in person-centred care. Int J Nurs Stud. 2020;109:103634.

American Geriatrics Society Expert Panel on Person-Centered Care. Person-centered care: a definition and essential elements. J Am Geriatr Soc. 2016;64:15–8.


McCormack B, McCance TV. Development of a framework for person-centred nursing. J Adv Nurs. 2006;56:472–9.

McCormack B, McCance T. Person-centred practice in nursing and health care: theory and practice. Chichester, England: Wiley; 2016.

Nolan MR, Davies S, Brown J, Keady J, Nolan J. Beyond person-centred care: a new vision for gerontological nursing. J Clin Nurs. 2004;13:45–53.

McCormack B, McCance T. Person-centred nursing: theory, models and methods. Oxford, UK: Wiley-Blackwell; 2010.


Abraha I, Rimland JM, Trotta FM, Dell’Aquila G, Cruz-Jentoft A, Petrovic M, et al. Systematic review of systematic reviews of non-pharmacological interventions to treat behavioural disturbances in older patients with dementia. The SENATOR-OnTop series. BMJ Open. 2017;7:e012759.

Anderson K, Blair A. Why we need to care about the care: a longitudinal study linking the quality of residential dementia care to residents’ quality of life. Arch Gerontol Geriatr. 2020;91:104226.

Bauer M, Fetherstonhaugh D, Haesler E, Beattie E, Hill KD, Poulos CJ. The impact of nurse and care staff education on the functional ability and quality of life of people living with dementia in aged care: a systematic review. Nurse Educ Today. 2018;67:27–45.

Smythe A, Jenkins C, Galant-Miecznikowska M, Dyer J, Downs M, Bentham P, et al. A qualitative study exploring nursing home nurses’ experiences of training in person centred dementia care on burnout. Nurse Educ Pract. 2020;44:102745.

McCormack B, Borg M, Cardiff S, Dewing J, Jacobs G, Janes N, et al. Person-centredness– the ‘state’ of the art. Int Pract Dev J. 2015;5:1–15.

Wilberforce M, Challis D, Davies L, Kelly MP, Roberts C, Loynes N. Person-centredness in the care of older adults: a systematic review of questionnaire-based scales and their measurement properties. BMC Geriatr. 2016;16:63.

Rathert C, Wyrwich MD, Boren SA. Patient-centered care and outcomes: a systematic review of the literature. Med Care Res Rev. 2013;70:351–79.

Sharma T, Bamford M, Dodman D. Person-centred care: an overview of reviews. Contemp Nurse. 2016;51:107–20.

Ahmed S, Djurkovic A, Manalili K, Sahota B, Santana MJ. A qualitative study on measuring patient-centered care: perspectives from clinician-scientists and quality improvement experts. Health Sci Rep. 2019;2:e140.

Edvardsson D, Fetherstonhaugh D, Nay R, Gibson S. Development and initial testing of the person-centered Care Assessment Tool (P-CAT). Int Psychogeriatr. 2010;22:101–8.

Tamagawa R, Groff S, Anderson J, Champ S, Deiure A, Looyis J, et al. Effects of a provincial-wide implementation of screening for distress on healthcare professionals’ confidence and understanding of person-centered care in oncology. J Natl Compr Canc Netw. 2016;14:1259–66.

Degl’ Innocenti A, Wijk H, Kullgren A, Alexiou E. The influence of evidence-based design on staff perceptions of a supportive environment for person-centered care in forensic psychiatry. J Forensic Nurs. 2020;16:E23–30.

Hulin CL. A psychometric theory of evaluations of item and scale translations: fidelity across languages. J Cross Cult Psychol. 1987;18:115–42.

Martínez T, Suárez-Álvarez J, Yanguas J, Muñiz J. Spanish validation of the person-centered Care Assessment Tool (P-CAT). Aging Ment Health. 2016;20:550–8.

Martínez T, Martínez-Loredo V, Cuesta M, Muñiz J. Assessment of person-centered care in gerontology services: a new tool for healthcare professionals. Int J Clin Health Psychol. 2020;20:62–70.

Rokstad AM, Engedal K, Edvardsson D, Selbaek G. Psychometric evaluation of the Norwegian version of the person-centred Care Assessment Tool. Int J Nurs Pract. 2012;18:99–105.

Sjögren K, Lindkvist M, Sandman PO, Zingmark K, Edvardsson D. Psychometric evaluation of the Swedish version of the person-centered Care Assessment Tool (P-CAT). Int Psychogeriatr. 2012;24:406–15.

Zhong XB, Lou VW. Person-centered care in Chinese residential care facilities: a preliminary measure. Aging Ment Health. 2013;17:952–8.

Tak YR, Woo HY, You SY, Kim JH. Validity and reliability of the person-centered Care Assessment Tool in long-term care facilities in Korea. J Korean Acad Nurs. 2015;45:412–9.

Brugnolli A, Debiasi M, Zenere A, Zanolin ME, Baggia M. The person-centered Care Assessment Tool in nursing homes: psychometric evaluation of the Italian version. J Nurs Meas. 2020;28:555–63.

Bru-Luna LM, Martí-Vilar M, Merino-Soto C, Livia J. Reliability generalization study of the person-centered Care Assessment Tool. Front Psychol. 2021;12:712582.

Edvardsson D, Innes A. Measuring person-centered care: a critical comparative review of published tools. Gerontologist. 2010;50:834–46.

Hawkins M, Elsworth GR, Nolte S, Osborne RH. Validity arguments for patient-reported outcomes: justifying the intended interpretation and use of data. J Patient Rep Outcomes. 2021;5:64.

Sireci SG. On the validity of useless tests. Assess Educ Princ Policy Pract. 2016;23:226–35.

Hawkins M, Elsworth GR, Osborne RH. Questionnaire validation practice: a protocol for a systematic descriptive literature review of health literacy assessments. BMJ Open. 2019;9:e030753.

American Educational Research Association, American Psychological Association. National Council on Measurement in Education. Standards for educational and psychological testing. Washington, DC: American Educational Research Association; 2014.

Padilla JL, Benítez I. Validity evidence based on response processes. Psicothema. 2014;26:136–44.


Hawkins M, Elsworth GR, Hoban E, Osborne RH. Questionnaire validation practice within a theoretical framework: a systematic descriptive literature review of health literacy assessments. BMJ Open. 2020;10:e035974.

Le C, Ma K, Tang P, Edvardsson D, Behm L, Zhang J, et al. Psychometric evaluation of the Chinese version of the person-centred Care Assessment Tool. BMJ Open. 2020;10:e031580.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Int J Surg. 2021;88:105906.

Falagas ME, Pitsouni EI, Malietzis GA, Pappas G. Comparison of PubMed, Scopus, web of Science, and Google Scholar: strengths and weaknesses. FASEB J. 2008;22:338–42.

Grégoire G, Derderian F, Le Lorier J. Selecting the language of the publications included in a meta-analysis: is there a tower of Babel bias? J Clin Epidemiol. 1995;48:159–63.

Arias MM. Aspectos metodológicos Del metaanálisis (1). Pediatr Aten Primaria. 2018;20:297–302.

Covidence. Covidence systematic review software. Veritas Health Innovation, Australia. 2014. https://www.covidence.org/ . Accessed 28 Feb 2022.

Selan D, Jakobsson U, Condelius A. The Swedish P-CAT: modification and exploration of psychometric properties of two different versions. Scand J Caring Sci. 2017;31:527–35.

Beaton DE, Bombardier C, Guillemin F, Ferraz MB. Guidelines for the process of cross-cultural adaptation of self-report measures. Spine (Phila Pa 1976). 2000;25:3186–91.

Guillemin F. Cross-cultural adaptation and validation of health status measures. Scand J Rheumatol. 1995;24:61–3.

Hambleton R, Merenda P, Spielberger C. Adapting educational and psychological tests for cross-cultural assessment. Mahwah, NJ: Lawrence Erlbaum Associates; 2005.

Muñiz J, Elosua P, Hambleton RK. International test commission guidelines for test translation and adaptation: second edition. Psicothema. 2013;25:151–7.

Rosengren K, Brannefors P, Carlstrom E. Adoption of the concept of person-centred care into discourse in Europe: a systematic literature review. J Health Organ Manag. 2021;35:265–80.

Alharbi T, Olsson LE, Ekman I, Carlström E. The impact of organizational culture on the outcome of hospital care: after the implementation of person-centred care. Scand J Public Health. 2014;42:104–10.

Bensbih S, Souadka A, Diez AG, Bouksour O. Patient centered care: focus on low and middle income countries and proposition of new conceptual model. J Med Surg Res. 2020;7:755–63.

Stranz A, Sörensdotter R. Interpretations of person-centered dementia care: same rhetoric, different practices? A comparative study of nursing homes in England and Sweden. J Aging Stud. 2016;38:70–80.

Zhou LM, Xu RH, Xu YH, Chang JH, Wang D. Inpatients’ perception of patient-centered care in Guangdong province, China: a cross-sectional study. Inquiry. 2021. https://doi.org/10.1177/00469580211059482 .

Marsh HW, Morin AJ, Parker PD, Kaur G. Exploratory structural equation modeling: an integration of the best features of exploratory and confirmatory factor analysis. Annu Rev Clin Psychol. 2014;10:85–110.

Asparouhov T, Muthén B. Exploratory structural equation modeling. Struct Equ Model Multidiscip J. 2009;16:397–438.

Cabedo-Peris J, Martí-Vilar M, Merino-Soto C, Ortiz-Morán M. Basic empathy scale: a systematic review and reliability generalization meta-analysis. Healthc (Basel). 2022;10:29–62.

Flora DB. Your coefficient alpha is probably wrong, but which coefficient omega is right? A tutorial on using R to obtain better reliability estimates. Adv Methods Pract Psychol Sci. 2020;3:484–501.

McNeish D. Thanks coefficient alpha, we’ll take it from here. Psychol Methods. 2018;23:412–33.

Hayes AF, Coutts JJ. Use omega rather than Cronbach’s alpha for estimating reliability. But… Commun Methods Meas. 2020;14:1–24.

Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16:297–334.

McDonald R. Test theory: a unified approach. Mahwah, NJ: Erlbaum; 1999.

Polit DF. Getting serious about test-retest reliability: a critique of retest research and some recommendations. Qual Life Res. 2014;23:1713–20.

Ceylan D, Çizel B, Karakaş H. Testing destination image scale invariance for intergroup comparison. Tour Anal. 2020;25:239–51.

Rönkkö M, Cho E. An updated guideline for assessing discriminant validity. Organ Res Methods. 2022;25:6–14.

Hubley A, Zumbo B. Response processes in the context of validity: setting the stage. In: Zumbo B, Hubley A, editors. Understanding and investigating response processes in validation research. Cham, Switzerland: Springer; 2017. pp. 1–12.

Messick S. Validity of performance assessments. In: Philips G, editor. Technical issues in large-scale performance assessment. Washington, DC: Department of Education, National Center for Education Statistics; 1996. pp. 1–18.

Moss PA. The role of consequences in validity theory. Educ Meas Issues Pract. 1998;17:6–12.

Cronbach L. Five perspectives on validity argument. In: Wainer H, editor. Test validity. Hillsdale, MI: Erlbaum; 1988. pp. 3–17.

Birkle C, Pendlebury DA, Schnell J, Adams J. Web of Science as a data source for research on scientific and scholarly activity. Quant Sci Stud. 2020;1:363–76.

Bramer WM, Rethlefsen ML, Kleijnen J, Franco OH. Optimal database combinations for literature searches in systematic reviews: a prospective exploratory study. Syst Rev. 2017;6:245.

Web of Science Group. Editorial selection process. Clarivate. 2024. https://clarivate.com/webofsciencegroup/solutions/%20editorial-selection-process/ . Accessed 12 Sept 2022.


Acknowledgements

The authors thank the assistants who helped with information processing and literature searching.

This work is one of the results of research project HIM/2015/017/SSA.1207, “Effects of mindfulness training on psychological distress and quality of life of the family caregiver”. Principal investigator: Filiberto Toledano-Toledano, Ph.D. The present research was funded by federal funds for health research and was approved by the Commissions of Research, Ethics and Biosafety (Comisiones de Investigación, Ética y Bioseguridad), Hospital Infantil de México Federico Gómez, National Institute of Health. The source of federal funds had no role in the study design, data collection, analysis, or interpretation, or in decisions regarding publication.

Author information

Authors and Affiliations

Departamento de Educación, Facultad de Ciencias Sociales, Universidad Europea de Valencia, 46010, Valencia, Spain

Lluna Maria Bru-Luna

Departamento de Psicología Básica, Universitat de València, Blasco Ibáñez Avenue, 21, 46010, Valencia, Spain

Manuel Martí-Vilar

Departamento de Psicología, Instituto de Investigación de Psicología, Universidad de San Martín de Porres, Tomás Marsano Avenue 242, Lima 34, Perú

César Merino-Soto

Instituto Central de Gestión de la Investigación, Universidad Nacional Federico Villarreal, Carlos Gonzalez Avenue 285, 15088, San Miguel, Perú

José Livia-Segovia

Unidad de Investigación en Medicina Basada en Evidencias, Hospital Infantil de México Federico Gómez Instituto Nacional de Salud, Dr. Márquez 162, 06720, Doctores, Cuauhtémoc, Mexico

Juan Garduño-Espinosa & Filiberto Toledano-Toledano

Unidad de Investigación Multidisciplinaria en Salud, Instituto Nacional de Rehabilitación Luis Guillermo Ibarra Ibarra, México-Xochimilco 289, Arenal de Guadalupe, 14389, Tlalpan, Mexico City, Mexico

Filiberto Toledano-Toledano

Dirección de Investigación y Diseminación del Conocimiento, Instituto Nacional de Ciencias e Innovación para la Formación de Comunidad Científica, INDEHUS, Periférico Sur 4860, Arenal de Guadalupe, 14389, Tlalpan, Mexico City, Mexico


Contributions

L.M.B.L. conceptualized the study, collected the data, performed the formal analysis, wrote the original draft, and reviewed and edited the subsequent drafts. M.M.V. collected the data and reviewed and edited the subsequent drafts. C.M.S. collected the data, performed the formal analysis, wrote the original draft, and reviewed and edited the subsequent drafts. J.L.S. collected the data, wrote the original draft, and reviewed and edited the subsequent drafts. J.G.E. collected the data and reviewed and edited the subsequent drafts. F.T.T. conceptualized the study and reviewed and edited the subsequent drafts. L.M.B.L. conceptualized the study and reviewed and edited the subsequent drafts. M.M.V. conceptualized the study and reviewed and edited the subsequent drafts. C.M.S. reviewed and edited the subsequent drafts. J.G.E. reviewed and edited the subsequent drafts. F.T.T. conceptualized the study; provided resources, software, and supervision; wrote the original draft; and reviewed and edited the subsequent drafts.

Corresponding author

Correspondence to Filiberto Toledano-Toledano .

Ethics declarations

Ethics approval and consent to participate

The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Commissions of Research, Ethics and Biosafety (Comisiones de Investigación, Ética y Bioseguridad), Hospital Infantil de México Federico Gómez, National Institute of Health. HIM/2015/017/SSA.1207, “Effects of mindfulness training on psychological distress and quality of life of the family caregiver”.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Bru-Luna, L.M., Martí-Vilar, M., Merino-Soto, C. et al. Person-centered care assessment tool with a focus on quality healthcare: a systematic review of psychometric properties. BMC Psychol 12 , 217 (2024). https://doi.org/10.1186/s40359-024-01716-7


Received: 17 May 2023

Accepted: 07 April 2024

Published: 19 April 2024

DOI: https://doi.org/10.1186/s40359-024-01716-7




Prevalence of Mental Health Disorders Among Individuals Experiencing Homelessness: A Systematic Review and Meta-Analysis

  • 1 Department of Psychiatry, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  • 2 Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  • 3 Faculty of Social Work, University of Calgary, Calgary, Alberta, Canada
  • 4 Mathison Centre for Mental Health Research and Education, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  • 5 Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  • 6 Department of Electrical and Software Engineering, University of Calgary, Calgary, Alberta, Canada
  • 7 Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada

Question   What is the prevalence of mental health disorders among people experiencing homelessness?

Findings   In this systematic review and meta-analysis, the prevalence of current and lifetime mental health disorders among people experiencing homelessness was high, with male individuals exhibiting a significantly higher lifetime prevalence of any mental health disorder compared to female individuals.

Meaning   These findings demonstrate that most people experiencing homelessness have mental health disorders, with current and lifetime prevalence generally much greater than that observed in general community samples.

Importance   Several factors may place people with mental health disorders, including substance use disorders, at increased risk of experiencing homelessness and experiencing homelessness may also increase the risk of developing mental health disorders. Meta-analyses examining the prevalence of mental health disorders among people experiencing homelessness globally are lacking.

Objective   To determine the current and lifetime prevalence of mental health disorders among people experiencing homelessness and identify associated factors.

Data Sources   A systematic search of electronic databases (PubMed, MEDLINE, PsycInfo, Embase, Cochrane, CINAHL, and AMED) was conducted from inception to May 1, 2021.

Study Selection   Studies investigating the prevalence of mental health disorders among people experiencing homelessness aged 18 years and older were included.

Data Extraction and Synthesis   Data extraction was completed using standardized forms in Covidence. All extracted data were reviewed for accuracy by consensus between 2 independent reviewers. Random-effects meta-analysis was used to estimate the prevalence (with 95% CIs) of mental health disorders in people experiencing homelessness. Subgroup analyses were performed by sex, study year, age group, region, risk of bias, and measurement method. Meta-regression was conducted to examine the association between mental health disorders and age, risk of bias, and study year.

Main Outcomes and Measures   Current and lifetime prevalence of mental health disorders among people experiencing homelessness.

Results   A total of 7729 citations were retrieved, with 291 undergoing full-text review and 85 included in the final review (N = 48 414 participants, 11 154 [23%] female and 37 260 [77%] male). The current prevalence of mental health disorders among people experiencing homelessness was 67% (95% CI, 55-77), and the lifetime prevalence was 77% (95% CI, 61-88). Male individuals exhibited a significantly higher lifetime prevalence of mental health disorders (86%; 95% CI, 74-92) than female individuals (69%; 95% CI, 48-84). Prevalence estimates were also obtained for several specific disorders, including any substance use disorder (44%), antisocial personality disorder (26%), major depression (19%), schizophrenia (7%), and bipolar disorder (8%).

Conclusions and Relevance   The findings demonstrate that most people experiencing homelessness have mental health disorders, with higher prevalences than those observed in general community samples. Specific interventions are needed to support the mental health needs of this population, including close coordination of mental health, social, and housing services and policies to support people experiencing homelessness with mental disorders.
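
To illustrate the pooling step described under Data Extraction and Synthesis, the following Python sketch combines study-level prevalences with a DerSimonian-Laird random-effects model on the logit scale. The function name and the five-study dataset are hypothetical; the actual review used dedicated software and more elaborate models, so this is only a minimal sketch of the general technique.

import numpy as np

def pooled_prevalence(events, n):
    """DerSimonian-Laird random-effects pooling of prevalences on the logit scale."""
    events, n = np.asarray(events, float), np.asarray(n, float)
    p = events / n
    y = np.log(p / (1 - p))            # logit-transformed prevalences
    v = 1 / events + 1 / (n - events)  # approximate variance of each logit

    w = 1 / v                          # fixed-effect weights
    y_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - y_fixed) ** 2) # Cochran's Q (heterogeneity statistic)
    k = len(y)

    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) # between-study variance (DL estimator)

    w_re = 1 / (v + tau2)              # random-effects weights
    y_re = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1 / np.sum(w_re))

    expit = lambda x: 1 / (1 + np.exp(-x))
    return expit(y_re), (expit(y_re - 1.96 * se), expit(y_re + 1.96 * se))

# Hypothetical example: five studies reporting cases of any mental health disorder.
prev, ci = pooled_prevalence(events=[120, 85, 200, 60, 150],
                             n=[180, 140, 290, 100, 220])
print(f"pooled prevalence = {prev:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}")

Working on the logit scale keeps the pooled estimate and its confidence limits inside the 0-1 range; other transformations (for example, the Freeman-Tukey double arcsine) are also common in prevalence meta-analyses.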


Barry R , Anderson J , Tran L, et al. Prevalence of Mental Health Disorders Among Individuals Experiencing Homelessness : A Systematic Review and Meta-Analysis . JAMA Psychiatry. Published online April 17, 2024. doi:10.1001/jamapsychiatry.2024.0426
