
Open Access

Peer-reviewed

Research Article

Recent quantitative research on determinants of health in high income countries: A scoping review

Vladimira Varbanova
Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Visualization, Writing – original draft, Writing – review & editing
* E-mail: [email protected]

Philippe Beutels
Roles: Conceptualization, Data curation, Funding acquisition, Project administration, Resources, Supervision, Validation, Visualization, Writing – review & editing

Affiliation: Centre for Health Economics Research and Modelling Infectious Diseases, Vaccine and Infectious Disease Institute, University of Antwerp, Antwerp, Belgium

Published: September 17, 2020
https://doi.org/10.1371/journal.pone.0239031

Identifying determinants of health and understanding their role in health production constitutes an important research theme. We aimed to document the state of recent multi-country research on this theme in the literature.

We followed the PRISMA-ScR guidelines to systematically identify, triage and review literature (January 2013—July 2019). We searched for studies that performed cross-national statistical analyses aiming to evaluate the impact of one or more aggregate level determinants on one or more general population health outcomes in high-income countries. To assess in which combinations and to what extent individual (or thematically linked) determinants had been studied together, we performed multidimensional scaling and cluster analysis.

Sixty studies were selected from an original yield of 3686. Life expectancy and overall mortality were the most widely used population health indicators, while determinants came from the areas of healthcare, culture, politics, socio-economics, environment, labor, fertility, demographics, life-style, and psychology. The family of regression models was the predominant statistical approach. Results from our multidimensional scaling showed that a relatively tight core of determinants has received much attention, as main covariates of interest or controls, whereas most other determinants were studied in very limited contexts. We consider findings from these studies regarding the importance of any given health determinant inconclusive at present. Across a multitude of model specifications, different country samples, and varying time periods, effects fluctuated between statistically significant and non-significant, and between beneficial and detrimental to health.

Conclusions

We conclude that efforts to understand the underlying mechanisms of population health are far from settled, and the present state of research on the topic leaves much to be desired. It is essential that future research considers multiple factors simultaneously and takes advantage of more sophisticated methodology, both for quantifying health and for analyzing the influence of determinants.

Citation: Varbanova V, Beutels P (2020) Recent quantitative research on determinants of health in high income countries: A scoping review. PLoS ONE 15(9): e0239031. https://doi.org/10.1371/journal.pone.0239031

Editor: Amir Radfar, University of Central Florida, UNITED STATES

Received: November 14, 2019; Accepted: August 28, 2020; Published: September 17, 2020

Copyright: © 2020 Varbanova, Beutels. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting Information files.

Funding: This study (and VV) is funded by the Research Foundation Flanders ( https://www.fwo.be/en/ ), FWO project number G0D5917N, award obtained by PB. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Identifying the key drivers of population health is a core subject in public health and health economics research. Between-country comparative research on the topic is challenging. In order to be relevant for policy, it requires disentangling different interrelated drivers of “good health”, each having different degrees of importance in different contexts.

“Good health”–physical and psychological, subjective and objective–can be defined and measured using a variety of approaches, depending on which aspect of health is the focus. A major distinction can be made between health measurements at the individual level or some aggregate level, such as a neighborhood, a region or a country. In view of this, a great diversity of specific research topics exists on the drivers of what constitutes individual or aggregate “good health”, including those focusing on health inequalities, the gender gap in longevity, and regional mortality and longevity differences.

The current scoping review focuses on determinants of population health. Stated as such, this topic is quite broad. Indeed, we are interested in the very general question of what methods have been used to make the most of increasingly available region or country-specific databases to understand the drivers of population health through inter-country comparisons. Existing reviews indicate that researchers thus far tend to adopt a narrower focus. Usually, attention is given to only one health outcome at a time, with further geographical and/or population [ 1 , 2 ] restrictions. In some cases, the impact of one or more interventions is at the core of the review [ 3 – 7 ], while in others it is the relationship between health and just one particular predictor, e.g., income inequality, access to healthcare, government mechanisms [ 8 – 13 ]. Some relatively recent reviews on the subject of social determinants of health [ 4 – 6 , 14 – 17 ] have considered a number of indicators potentially influencing health as opposed to a single one. One review defines “social determinants” as “the social, economic, and political conditions that influence the health of individuals and populations” [ 17 ] while another refers even more broadly to “the factors apart from medical care” [ 15 ].

In the present work, we aimed to be more inclusive, setting no limitations on the nature of possible health correlates, as well as making use of a multitude of commonly accepted measures of general population health. The goal of this scoping review was to document the state of the art in the recent published literature on determinants of population health, with a particular focus on the types of determinants selected and the methodology used. In doing so, we also report the main characteristics of the results these studies found. The materials collected in this review are intended to inform our (and potentially other researchers’) future analyses on this topic. Since the production of health is subject to the law of diminishing marginal returns, we focused our review on those studies that included countries where a high standard of wealth has been achieved for some time, i.e., high-income countries belonging to the Organisation for Economic Co-operation and Development (OECD) or Europe. Adding similar reviews for other country income groups is of limited interest to the research we plan to do in this area.

Methods

Because this review focuses on data and methods rather than results, a formal protocol was not registered prior to undertaking it; the procedure nevertheless followed the guidelines of the PRISMA statement for scoping reviews (PRISMA-ScR) [ 18 ].

Search strategy

We focused on multi-country studies investigating the potential associations between any aggregate-level (region/city/country) determinant and general measures of population health (e.g., life expectancy, mortality rate).

Within the query itself, we listed well-established population health indicators as well as the six world regions defined by the World Health Organization (WHO). We searched in publication titles only, in order to keep the number of hits manageable and the proportion of broadly relevant abstracts around 10% (based on a series of time-focused trial runs). The search strategy was developed iteratively between the two authors and is presented in S1 Appendix. The search was performed by VV in PubMed and Web of Science on the 16th of July, 2019, without any language restrictions, and with a start date set to the 1st of January, 2013, as we were interested in the latest developments in this area of research.

Eligibility criteria

Records obtained via the search methods described above were screened independently by the two authors. Consistency between inclusion/exclusion decisions was approximately 90%, and the 43 instances where uncertainty existed were resolved through discussion. Articles were included subject to meeting all of the following requirements:

  • (a) the paper was a full published report of an original empirical study investigating the impact of at least one aggregate-level (city/region/country) factor on at least one health indicator (or self-reported health) of the general population (the only admissible “sub-populations” were those based on gender and/or age);
  • (b) the study employed statistical techniques (calculating correlations, at the very least) and was not purely descriptive or theoretical in nature;
  • (c) the analysis involved at least two countries, or at least two regions or cities (or another aggregate level) in at least two different countries;
  • (d) the health outcome was not differentiated according to some socio-economic factor and thus studied in terms of inequality (with the exception of gender and age differentiations);
  • (e) mortality, in case it was one of the health indicators under investigation, was strictly “total” or “all-cause” (no cause-specific or determinant-attributable mortality).

Data extraction

The following pieces of information were extracted in an Excel table from the full text of each eligible study (primarily by VV, consulting with PB in case of doubt): health outcome(s), determinants, statistical methodology, level of analysis, results, type of data, data sources, time period, countries. The evidence is synthesized according to these extracted data (often directly reflected in the section headings), using a narrative form accompanied by a “summary-of-findings” table and a graph.

Search and selection

The initial yield contained 4583 records, reduced to 3686 after removal of duplicates ( Fig 1 ). Based on title and abstract screening, 3271 records were excluded because they focused on specific medical condition(s) or specific populations (based on morbidity or some other factor), dealt with intervention effectiveness, with theoretical or non-health related issues, or with animals or plants. Of the remaining 415 papers, roughly half were disqualified upon full-text consideration, mostly due to using an outcome not of interest to us (e.g., health inequality), measuring and analyzing determinants and outcomes exclusively at the individual level, performing analyses one country at a time, employing indices that are a mixture of both health indicators and health determinants, or not utilizing potential health determinants at all. After this second stage of the screening process, 202 papers were deemed eligible for inclusion. This group was further dichotomized according to level of economic development of the countries or regions under study, using membership of the OECD or Europe as a reference “cut-off” point. Sixty papers were judged to include high-income countries, and the remaining 142 included either low- or middle-income countries or a mix of both these levels of development. The rest of this report outlines findings in relation to high-income countries only, reflecting our own primary research interests. Nonetheless, we chose to report our search yield for the other income groups for two reasons. First, to gauge the relative interest in applied published research for these different income levels; and second, to enable other researchers with a focus on determinants of health in other countries to use the extraction we made here.

Fig 1. Flow diagram of the record search and selection process.

https://doi.org/10.1371/journal.pone.0239031.g001

Health outcomes

The most frequent population health indicator, life expectancy (LE), was present in 24 of the 60 studies. Apart from “life expectancy at birth” (representing the average life-span a newborn is expected to have if current mortality rates remain constant), also called “period LE” by some [ 19 , 20 ], we also encountered LE at 40 years of age [ 21 ], at 60 [ 22 ], and at 65 [ 21 , 23 , 24 ]. In two papers, the age-specificity of life expectancy (be it at birth or another age) was not stated [ 25 , 26 ].

Some studies considered male and female LE separately [ 21 , 24 , 25 , 27 – 33 ]. Such gender-specific analysis was also common [ 28 – 30 , 34 – 38 ] for the second most commonly used health index, the “total”, “overall”, or “all-cause” mortality rate (MR), included in 22 of the 60 studies. In addition to gender, this index was also sometimes broken down according to age group [ 30 , 39 , 40 ], as well as gender-age group [ 38 ].

While the majority of studies under review here focused on a single health indicator, 23 out of the 60 studies made use of multiple outcomes, although these outcomes were always considered one at a time, and sometimes not all of them fell within the scope of our review. An easily discernible group of indices that typically went together [ 25 , 37 , 41 ] was that of neonatal (deaths occurring within 28 days postpartum), perinatal (fetal or early neonatal / first-7-days deaths), and post-neonatal (deaths between the 29th day and completion of one year of life) mortality. More often than not, these indices were also accompanied by “stand-alone” indicators, such as infant mortality (deaths within the first year of life; our third most common index, found in 16 of the 60 studies), maternal mortality (deaths during pregnancy or within 42 days of termination of pregnancy), and child mortality rates. Child mortality has conventionally been defined as mortality within the first 5 years of life, and is thus often also called “under-5 mortality”. Nonetheless, Pritchard & Wallace used the term “child mortality” to denote deaths of children younger than 14 years [ 42 ].

As previously stated, inclusion criteria did allow for self-reported health status to be used as a general measure of population health. Within our final selection of studies, seven utilized some form of subjective health as an outcome variable [ 25 , 43 – 48 ]. Additionally, the Health Human Development Index [ 49 ], healthy life expectancy [ 50 ], old-age survival [ 51 ], potential years of life lost [ 52 ], and disability-adjusted life expectancy [ 25 ] were also used.

We note that while in most cases the indicators mentioned above (and/or the covariates considered, see below) were taken in their absolute or logarithmic form, as a (typically annual) number, sometimes they were used in the form of differences, change rates, averages over a given time period, or even z-scores of rankings [ 19 , 22 , 40 , 42 , 44 , 53 – 57 ].
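For readers who work with such panels, the following minimal pandas sketch illustrates these transformations on a hypothetical annual life-expectancy series; the series and all its values are invented for demonstration and are not taken from any reviewed study.

```python
# Minimal pandas sketch of the transformations mentioned above, applied to a
# hypothetical annual life-expectancy series; all values are invented.
import pandas as pd

le = pd.Series([80.0, 80.3, 80.5, 80.4, 80.9],
               index=pd.period_range("2010", periods=5, freq="Y"))

diffs = le.diff()                        # year-on-year differences
rates = le.pct_change()                  # change rates
avg3 = le.rolling(window=3).mean()       # averages over a 3-year period
zscores = (le - le.mean()) / le.std()    # z-scores

print(pd.DataFrame({"LE": le, "diff": diffs, "rate": rates,
                    "3yr-avg": avg3, "z": zscores}))
```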

Regions, countries, and populations

Despite our decision to confine this review to high-income countries, some variation in the countries and regions studied was still present. Selection was most often conditioned on membership of the European Union (or the European continent more generally) and the OECD, although, judging by the instances where members were explicitly listed, typically not all member nations were included in a given study. Stated reasons for omitting certain nations included data unavailability [ 30 , 45 , 54 ] or inconsistency [ 20 , 58 ], too low a Gross Domestic Product (GDP) [ 40 ], differences in economic development and political stability relative to the rest of the sampled countries [ 59 ], and national population too small [ 24 , 40 ]. On the other hand, the rationales for selecting a group of countries included having similar above-average infant mortality [ 60 ], similar healthcare systems [ 23 ], and being randomly drawn from a social spending category [ 61 ]. Some researchers were interested explicitly in a specific geographical region, such as Eastern Europe [ 50 ], Central and Eastern Europe [ 48 , 60 ], the Visegrad (V4) group [ 62 ], or the Asia/Pacific area [ 32 ]. In certain instances, national regions or cities, rather than countries, constituted the units of investigation [ 31 , 51 , 56 , 62 – 66 ]. In two particular cases, a mix of countries and cities was used [ 35 , 57 ]. In another two [ 28 , 29 ], due to the long time periods under study, some of the included countries no longer exist. Finally, besides “European” and “OECD”, the terms “developed”, “Western”, and “industrialized” were also used to describe the group of selected nations [ 30 , 42 , 52 , 53 , 67 ].

As stated above, it was the health status of the general population that we were interested in, and during screening we made a concerted effort to exclude research using data based on a more narrowly defined group of individuals. All studies included in this review adhere to this general rule, albeit with two caveats. First, as cities (even neighborhoods) were the unit of analysis in three of the selected studies [ 56 , 64 , 65 ], the populations under investigation there are more accurately described as general urban, rather than simply general. Second, health indicators were often stratified by gender and/or age; on this basis we also admitted one study that, due to its specific research question, focused on men and women of early retirement age [ 35 ] and another that considered adult males only [ 68 ].

Data types and sources

A great diversity of sources was utilized for data collection purposes. The accessible reference databases of the OECD ( https://www.oecd.org/ ), WHO ( https://www.who.int/ ), World Bank ( https://www.worldbank.org/ ), United Nations ( https://www.un.org/en/ ), and Eurostat ( https://ec.europa.eu/eurostat ) were among the top choices. Other international databases included the Human Mortality Database [ 30 , 39 , 50 ], Transparency International [ 40 , 48 , 50 ], Quality of Government [ 28 , 69 ], World Income Inequality [ 30 ], the International Labor Organization [ 41 ], and the International Monetary Fund [ 70 ]. A number of national databases were referred to as well, for example the US Bureau of Statistics [ 42 , 53 ], Korean Statistical Information Services [ 67 ], Statistics Canada [ 67 ], the Australian Bureau of Statistics [ 67 ], and Health New Zealand Tobacco Control and Health New Zealand Food and Nutrition [ 19 ]. Well-known surveys, such as the World Values Survey [ 25 , 55 ], the European Social Survey [ 25 , 39 , 44 ], the Eurobarometer [ 46 , 56 ], the European Values Survey [ 25 ], and the European Statistics of Income and Living Condition Survey [ 43 , 47 , 70 ], were used as data sources, too. Finally, in some cases [ 25 , 28 , 29 , 35 , 36 , 41 , 69 ], built-for-purpose datasets from previous studies were re-used.

In most of the studies, the level of the data (and analysis) was national. The exceptions were six papers that dealt with Nomenclature of Territorial Units for Statistics (NUTS2) regions [ 31 , 62 , 63 , 66 ], otherwise defined areas [ 51 ], or cities [ 56 ], and seven others that were multilevel designs utilizing both country- and region-level data [ 57 ], individual- and city- or country-level data [ 35 ], individual- and country-level data [ 44 , 45 , 48 ], individual- and neighborhood-level data [ 64 ], and city-region- (NUTS3) and country-level data [ 65 ]. In parallel, the data type was predominantly longitudinal, with only a few studies using purely cross-sectional data [ 25 , 33 , 43 , 45 – 48 , 50 , 62 , 67 , 68 , 71 , 72 ], albeit in four of those [ 43 , 48 , 68 , 72 ] two separate points in time were taken (resulting in a kind of “double cross-section”), while in another the averages across survey waves were used [ 56 ].

In studies using longitudinal data, the length of the covered time periods varied greatly. Although this was almost always less than 40 years, in one study it covered the entire 20th century [ 29 ]. Longitudinal data, typically in the form of annual records, were sometimes transformed before usage. For example, some researchers considered data points at 5- [ 34 , 36 , 49 ] or 10-year [ 27 , 29 , 35 ] intervals instead of the traditional one-year interval, or took averages over 3-year periods [ 42 , 53 , 73 ]. In one study concerned with the effect of the Great Recession, all data were in a “recession minus expansion change in trends” form [ 57 ]. Furthermore, there were a few instances where two different time periods were compared to each other [ 42 , 53 ] or where the data were divided into two to four (possibly overlapping) periods which were then analyzed separately [ 24 , 26 , 28 , 29 , 31 , 65 ]. Lastly, owing to data availability issues, discrepancies between the time points or periods of data on the different variables were occasionally observed [ 22 , 35 , 42 , 53 – 55 , 63 ].

Health determinants

Together with other essential details, Table 1 lists the health correlates considered in the selected studies. Several general categories for these correlates can be discerned, including health care, political stability, socio-economics, demographics, psychology, environment, fertility, life-style, culture, labor. All of these, directly or implicitly, have been recognized as holding importance for population health by existing theoretical models of (social) determinants of health [ 74 – 77 ].

Table 1. Health determinants and other key characteristics of the studies selected for review.

https://doi.org/10.1371/journal.pone.0239031.t001

It is worth noting that in a few studies just a single aggregate-level covariate was investigated in relation to a health outcome of interest to us. In one instance this was life satisfaction [ 44 ], in another welfare system typology [ 45 ]; gender inequality [ 33 ], austerity level [ 70 , 78 ], and deprivation [ 51 ] also appeared as sole covariates. Most often, though, such exclusive attention went to GDP [ 27 , 29 , 46 , 57 , 65 , 71 ]. It was often the case that research had a more particular focus. Among others, minimum wages [ 79 ], hospital payment schemes [ 23 ], cigarette prices [ 63 ], social expenditure [ 20 ], residents’ dissatisfaction [ 56 ], income inequality [ 30 , 69 ], and work leave [ 41 , 58 ] took center stage. Whenever variables outside of these specific areas were also included, they were usually identified as confounders or controls, moderators or mediators.

We visualized the combinations in which the different determinants have been studied in Fig 2 , which was obtained via multidimensional scaling and a subsequent cluster analysis (details outlined in S2 Appendix ). It depicts the spatial positioning of each determinant relative to all others, based on the number of times the effects of each pair of determinants have been studied simultaneously. When interpreting Fig 2 , one should keep in mind that determinants marked with an asterisk represent, in fact, collectives of variables.
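For orientation, a minimal sketch of this kind of procedure is given below; the authors' exact settings are in S2 Appendix and are not reproduced here. The determinant names are taken from Fig 2, but the co-occurrence counts are invented for demonstration.

```python
# Illustrative sketch only: multidimensional scaling plus hierarchical clustering
# on a pairwise co-occurrence matrix, as described in the text. Counts are invented.
import numpy as np
from sklearn.manifold import MDS
from scipy.cluster.hierarchy import linkage, fcluster

determinants = ["GDP", "HE", "EDU", "UR", "inc.ineq"]
# counts[i, j] = number of studies in which determinants i and j appeared together
counts = np.array([
    [ 0, 14, 13, 12,  9],
    [14,  0,  9,  8,  5],
    [13,  9,  0,  7,  4],
    [12,  8,  7,  0,  3],
    [ 9,  5,  4,  3,  0],
])

# Frequently paired determinants should sit close together, so convert
# co-occurrence counts into dissimilarities.
dissim = counts.max() - counts.astype(float)
np.fill_diagonal(dissim, 0.0)

# Two-dimensional MDS on the precomputed dissimilarity matrix.
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dissim)

# Average-linkage clustering on the same dissimilarities (condensed form).
condensed = dissim[np.triu_indices_from(dissim, k=1)]
clusters = fcluster(linkage(condensed, method="average"), t=2, criterion="maxclust")

for name, (x, y), c in zip(determinants, coords, clusters):
    print(f"{name:9s} x={x:6.2f} y={y:6.2f} cluster={c}")
```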

Fig 2. Spatial positioning of determinants based on how often each pair was studied together.

Groups of determinants are marked by asterisks (see S1 Table in S1 Appendix ). Diminishing color intensity reflects a decrease in the total number of “connections” for a given determinant. Noteworthy pairwise “connections” are emphasized via lines (solid-dashed-dotted indicates decreasing frequency). Grey contour lines encircle groups of variables that were identified via cluster analysis. Abbreviations: age = population age distribution, associations = membership in associations, AT-index = atherogenic-thrombogenic index, BR = birth rate, CAPB = Cyclically Adjusted Primary Balance, civilian-labor = civilian labor force, C-section = Cesarean delivery rate, credit-info = depth of credit information, dissatisf = residents’ dissatisfaction, distrib.orient = distributional orientation, EDU = education, eHealth = eHealth index at GP-level, exch.rate = exchange rate, fat = fat consumption, GDP = gross domestic product, GFCF = Gross Fixed Capital Formation/Creation, GH-gas = greenhouse gas, GII = gender inequality index, gov = governance index, gov.revenue = government revenues, HC-coverage = healthcare coverage, HE = health(care) expenditure, HHconsump = household consumption, hosp.beds = hospital beds, hosp.payment = hospital payment scheme, hosp.stay = length of hospital stay, IDI = ICT development index, inc.ineq = income inequality, industry-labor = industrial labor force, infant-sex = infant sex ratio, labor-product = labor production, LBW = low birth weight, leave = work leave, life-satisf = life satisfaction, M-age = maternal age, marginal-tax = marginal tax rate, MDs = physicians, mult.preg = multiple pregnancy, NHS = Nation Health System, NO = nitrous oxide emissions, PM10 = particulate matter (PM10) emissions, pop = population size, pop.density = population density, pre-term = pre-term birth rate, prison = prison population, researchE = research&development expenditure, school.ref = compulsory schooling reform, smoke-free = smoke-free places, SO = sulfur oxide emissions, soc.E = social expenditure, soc.workers = social workers, sugar = sugar consumption, terror = terrorism, union = union density, UR = unemployment rate, urban = urbanization, veg-fr = vegetable-and-fruit consumption, welfare = welfare regime, Wwater = wastewater treatment.

https://doi.org/10.1371/journal.pone.0239031.g002

Distances between determinants in Fig 2 are indicative of determinants’ “connectedness” with each other. While the statistical procedure called for higher dimensionality of the model, for demonstration purposes we show here a two-dimensional solution. This simplification unfortunately comes with a caveat. Taking smoking as an example, it would appear to stand at a much greater distance from GDP than it does from alcohol. In reality, however, smoking was considered together with alcohol consumption [ 21 , 25 , 26 , 52 , 68 ] in just as many studies as it was with GDP [ 21 , 25 , 26 , 52 , 59 ]: five. To compensate for this shortcoming, we have emphasized the strongest pairwise links. Solid lines connect GDP with health expenditure (HE), unemployment rate (UR), and education (EDU), indicating that the effect of GDP on health, taking into account the effects of the other three determinants as well, was evaluated in 12 to 16 of the 60 studies included in this review. Tracing the dashed lines, we can also tell that GDP appeared jointly with income inequality, and HE together with either EDU or UR, in 8 to 10 of our selected studies. Finally, some weaker but still noteworthy “connections” between variables are displayed via the dotted lines.

The fact that all notable pairwise “connections” are concentrated within a relatively small region of the plot may be interpreted as low overall “connectedness” among the determinants studied. GDP is the most widely investigated determinant in relation to general population health. Its total number of “connections” is disproportionately high (159) compared to its runner-up, HE (113 “connections”), followed by EDU (90) and UR (86). In fact, all of these determinants could be thought of as outliers, given that none of the remaining factors has a total count of pairings above 52. This decrease in individual determinants’ overall “connectedness” can be tracked on the graph via the change of color intensity as we move outwards from the symbolic center of GDP and its closest “co-determinants”, finally reaching the other extreme: the ten indicators (welfare regime, household consumption, compulsory school reform, life satisfaction, government revenues, literacy, research expenditure, multiple pregnancy, Cyclically Adjusted Primary Balance, and residents’ dissatisfaction; in white) whose effects on health were only studied in isolation.

Lastly, we point to the few small but stable clusters of covariates encircled by the grey bubbles on Fig 2 . These groups of determinants were identified as “close” by both statistical procedures used for the production of the graph (see details in S2 Appendix ).

Statistical methodology

There was great variation in the level of statistical detail reported. Some authors provided too vague a description of their analytical approach, necessitating some inference in this section.

The issue of missing data is a challenging reality in this field of research, yet only 12 of the 60 studies under review explained how they dealt with it. Among those that did, three general approaches to handling missingness can be identified, listed here in increasing level of sophistication: case-wise deletion, i.e., removal of countries from the sample [ 20 , 45 , 48 , 58 , 59 ]; (linear) interpolation [ 28 , 30 , 34 , 58 , 59 , 63 ]; and multiple imputation [ 26 , 41 , 52 ].
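To make the three approaches concrete, here is a minimal sketch on a hypothetical country-year panel; all column names and values are invented, and scikit-learn's IterativeImputer is shown as one possible model-based imputer, not a method taken from any of the reviewed studies.

```python
# The three approaches to missing data, on an invented country-year panel.
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

df = pd.DataFrame({
    "country": ["A"] * 4 + ["B"] * 4,
    "year": [2010, 2011, 2012, 2013] * 2,
    "life_exp": [80.1, np.nan, 80.5, 80.7, 78.2, 78.4, np.nan, 78.9],
    "gdp_pc": [40.0, 41.0, np.nan, 43.0, 30.0, np.nan, 31.5, 32.0],
})

# 1. Case-wise deletion: drop any country with incomplete records.
complete = df.groupby("country").filter(lambda g: g.notna().all().all())

# 2. Linear interpolation within each country's time series.
interp = df.copy()
interp[["life_exp", "gdp_pc"]] = (
    df.groupby("country")[["life_exp", "gdp_pc"]]
      .transform(lambda s: s.interpolate(limit_direction="both"))
)

# 3. Model-based imputation; rerunning with sample_posterior=True under several
# random seeds and pooling the analyses would amount to multiple imputation.
imp = IterativeImputer(sample_posterior=True, random_state=0)
imputed = df.copy()
imputed[["life_exp", "gdp_pc"]] = imp.fit_transform(df[["life_exp", "gdp_pc"]])
```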

Correlations (Pearson, Spearman, or unspecified) were the only technique applied with respect to the health outcomes of interest in eight analyses [ 33 , 42 – 44 , 46 , 53 , 57 , 61 ]. Among the more advanced statistical methods, the family of regression models proved to be, by and large, predominant. Before examining this more closely, we note the techniques that were, in a way, “unique” within this selection of studies: meta-analyses were performed (random and fixed effects, respectively) on the reduced form and 2-sample two-stage least squares (2SLS) estimations done within countries [ 39 ]; difference-in-differences (DiD) analysis was applied in one case [ 23 ]; dynamic time-series methods, among which co-integration, impulse-response function (IRF), and panel vector autoregressive (VAR) modeling, were utilized in one study [ 80 ]; longitudinal generalized estimating equation (GEE) models were developed on two occasions [ 70 , 78 ]; hierarchical Bayesian spatial models [ 51 ] and spatial autoregressive regression [ 62 ] were also implemented.

Purely cross-sectional data analyses were performed in eight studies [ 25 , 45 , 47 , 50 , 55 , 56 , 67 , 71 ]. These consisted of linear regression (assumed ordinary least squares (OLS)), generalized least squares (GLS) regression, and multilevel analyses. However, six other studies that used longitudinal data in fact followed a cross-sectional design, applying regression at multiple time points separately [ 27 , 29 , 36 , 48 , 68 , 72 ].

Apart from these “multi-point cross-sectional studies”, some other simplistic approaches to longitudinal data analysis were found, involving calculating and regressing 3-year averages of both the response and the predictor variables [ 54 ], taking the average of a few data points (i.e., survey waves) [ 56 ], or using difference scores over 10-year [ 19 , 29 ] or unspecified time intervals [ 40 , 55 ].

Moving further in the direction of more sensible longitudinal data usage, we turn to the methods widely known among (health) economists as “panel data analysis” or “panel regression”. Most often seen were models with fixed effects for country/region and sometimes also time-point (occasionally including a country-specific trend as well), with robust standard errors for the parameter estimates to take into account correlations among clustered observations [ 20 , 21 , 24 , 28 , 30 , 32 , 34 , 37 , 38 , 41 , 52 , 59 , 60 , 63 , 66 , 69 , 73 , 79 , 81 , 82 ]. The Hausman test [ 83 ] was sometimes mentioned as the tool used to decide between fixed and random effects [ 26 , 49 , 63 , 66 , 73 , 82 ]. A few studies considered the latter more appropriate for their particular analyses, with some further specifying that (feasible) GLS estimation was employed [ 26 , 34 , 49 , 58 , 60 , 73 ]. Apart from these two types of models, the first differences method was encountered once as well [ 31 ]. Across all, the error terms were sometimes assumed to come from a first-order autoregressive process (AR(1)), i.e., they were allowed to be serially correlated [ 20 , 30 , 38 , 58 – 60 , 73 ], and lags of (typically) predictor variables were included in the model specification, too [ 20 , 21 , 37 , 38 , 48 , 69 , 81 ]. Lastly, a somewhat different approach to longitudinal data analysis was undertaken in four studies [ 22 , 35 , 48 , 65 ] in which multilevel–linear or Poisson–models were developed.
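As a concrete reference point, the sketch below fits a two-way fixed-effects panel regression with country-clustered standard errors via the least-squares-dummy-variable route in statsmodels; the data, variable names, and effect sizes are all invented for illustration, not taken from any reviewed study.

```python
# Two-way fixed-effects panel regression with country-clustered standard errors.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
countries, years = list("ABCDEFGH"), list(range(2000, 2015))
df = pd.DataFrame([(c, y) for c in countries for y in years],
                  columns=["country", "year"])
df["gdp"] = rng.normal(10, 1, len(df))
df["unemp"] = rng.normal(7, 2, len(df))
df["life_exp"] = 70 + 0.8 * df["gdp"] - 0.2 * df["unemp"] + rng.normal(0, 1, len(df))

# C(country) and C(year) dummies absorb time-invariant national traits and
# common period shocks; clustering allows serial correlation within countries.
fe = smf.ols("life_exp ~ gdp + unemp + C(country) + C(year)", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["country"]})
print(fe.params[["gdp", "unemp"]])
```

A Hausman test would adjudicate between this fixed-effects specification and a random-effects alternative, such as the mixed model sketched further below.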

Regardless of the exact techniques used, most studies included in this review presented multiple model applications within their main analysis. None attempted to formally compare models in order to identify the “best”, even if goodness-of-fit statistics were occasionally reported. As indicated above, many studies investigated women’s and men’s health separately [ 19 , 21 , 22 , 27 – 29 , 31 , 33 , 35 , 36 , 38 , 39 , 45 , 50 , 51 , 64 , 65 , 69 , 82 ], and covariates were often tested one at a time, including other covariates only incrementally [ 20 , 25 , 28 , 36 , 40 , 50 , 55 , 67 , 73 ]. Furthermore, there were a few instances where analyses within countries were performed as well [ 32 , 39 , 51 ] or where the full time period of interest was divided into a few sub-periods [ 24 , 26 , 28 , 31 ]. There were also cases where different statistical techniques were applied in parallel [ 29 , 55 , 60 , 66 , 69 , 73 , 82 ], sometimes as a form of sensitivity analysis [ 24 , 26 , 30 , 58 , 73 ]. However, the most common approach to sensitivity analysis was to re-run models with somewhat different samples [ 39 , 50 , 59 , 67 , 69 , 80 , 82 ]. Other strategies included different categorization of variables or adding (more/other) controls [ 21 , 23 , 25 , 28 , 37 , 50 , 63 , 69 ], using an alternative main covariate measure [ 59 , 82 ], including lags for predictors or outcomes [ 28 , 30 , 58 , 63 , 65 , 79 ], using weights [ 24 , 67 ] or alternative data sources [ 37 , 69 ], or using non-imputed data [ 41 ].

Findings

As the methods, and not the findings, are the main focus of the current review, and because generic checklists cannot discern the underlying quality in this application field (see also below), we opted to pool all reported findings together, regardless of individual study characteristics or particular outcome(s) used, and speak generally of positive and negative effects on health. For this summary we adopted the 0.05 significance level and only considered results from multivariate analyses. Strictly birth-related factors are omitted, since these potentially relate only to the group of infant mortality indicators and not to any of the other general population health measures.

Starting with the determinants most often studied: higher GDP levels [ 21 , 26 , 27 , 29 , 30 , 32 , 43 , 48 , 52 , 58 , 60 , 66 , 67 , 73 , 79 , 81 , 82 ], higher health [ 21 , 37 , 47 , 49 , 52 , 58 , 59 , 68 , 72 , 82 ] and social [ 20 , 21 , 26 , 38 , 79 ] expenditures, higher education [ 26 , 39 , 52 , 62 , 72 , 73 ], lower unemployment [ 60 , 61 , 66 ], and lower income inequality [ 30 , 42 , 53 , 55 , 73 ] were found to be significantly associated with better population health on a number of occasions. In addition, there was also some evidence that democracy [ 36 ] and freedom [ 50 ], higher work compensation [ 43 , 79 ], distributional orientation [ 54 ], cigarette prices [ 63 ], gross national income [ 22 , 72 ], labor productivity [ 26 ], exchange rates [ 32 ], marginal tax rates [ 79 ], vaccination rates [ 52 ], total fertility [ 59 , 66 ], fruit and vegetable [ 68 ], fat [ 52 ] and sugar consumption [ 52 ], as well as greater depth of credit information [ 22 ] and a higher percentage of civilian labor force [ 79 ], longer work leaves [ 41 , 58 ], more physicians [ 37 , 52 , 72 ], nurses [ 72 ], and hospital beds [ 79 , 82 ], and also membership in associations, perceived corruption, and societal trust [ 48 ] were beneficial to health. Higher nitrous oxide (NO) levels [ 52 ], longer average hospital stay [ 48 ], deprivation [ 51 ], dissatisfaction with healthcare and the social environment [ 56 ], corruption [ 40 , 50 ], smoking [ 19 , 26 , 52 , 68 ], alcohol consumption [ 26 , 52 , 68 ] and illegal drug use [ 68 ], poverty [ 64 ], a higher percentage of industrial workers [ 26 ], Gross Fixed Capital creation [ 66 ], an older population [ 38 , 66 , 79 ], gender inequality [ 22 ], and fertility [ 26 , 66 ] were detrimental.

It is important to point out that the above-mentioned effects could not be considered stable either across or within studies. Very often, statistical significance of a given covariate fluctuated between the different model specifications tried out within the same study [ 20 , 49 , 59 , 66 , 68 , 69 , 73 , 80 , 82 ], testifying to the importance of control variables and multivariate research (i.e., analyzing multiple independent variables simultaneously) in general. Furthermore, conflicting results were observed even with regards to the “core” determinants given special attention, so to speak, throughout this text. Thus, some studies reported negative effects of health expenditure [ 32 , 82 ], social expenditure [ 58 ], GDP [ 49 , 66 ], and education [ 82 ], and positive effects of income inequality [ 82 ] and unemployment [ 24 , 31 , 32 , 52 , 66 , 68 ]. Interestingly, one study [ 34 ] differentiated between temporary and long-term effects of GDP and unemployment, alluding to possibly much greater complexity of the association with health. It is also worth noting that some gender differences were found, with determinants being more influential for males than for females, or only having statistically significant effects for male health [ 19 , 21 , 28 , 34 , 36 , 37 , 39 , 64 , 65 , 69 ].

Discussion

The purpose of this scoping review was to examine recent quantitative work on the topic of multi-country analyses of determinants of population health in high-income countries.

Measuring population health via relatively simple mortality-based indicators still seems to be the state of the art. What is more, these indicators are routinely considered one at a time, instead of, for example, employing existing statistical procedures to devise a more general, composite, index of population health, or using some of the established indices, such as disability-adjusted life expectancy (DALE) or quality-adjusted life expectancy (QALE). Although strong arguments for their wider use were already voiced decades ago [ 84 ], such summary measures surface only rarely in this research field.

On a related note, the greater data availability and accessibility that we enjoy today does not automatically equate to data quality. Nonetheless, this is routinely assumed in aggregate-level studies; we almost never encountered a discussion of the topic. The non-trivial issue of data missingness, too, goes largely underappreciated. With all the recent methodological advancements in this area [ 85 – 88 ], there is no excuse for ignoring it; and still, too few of the reviewed studies tackled the matter in any adequate fashion.

Much optimism can be gained from the abundance of different determinants that have attracted researchers’ attention in relation to population health. We took a visual approach to these determinants and presented a graph that links spatial distances between determinants with the frequencies of their being studied together. To facilitate interpretation, we grouped some variables, which resulted in some loss of finer detail. Nevertheless, the graph is helpful in exemplifying how many effects continue to be studied in a very limited context, if any. Since in reality no factor acts in isolation, this oversimplification threatens to render the whole exercise meaningless from the outset. The importance of multivariate analysis cannot be stressed enough. While there is no single “best method” to recommend, and appropriate techniques vary according to the specifics of the research question and the characteristics of the data at hand [ 89 – 93 ], in the future, in addition to abandoning simplistic univariate approaches, we hope to see a shift from the currently dominant fixed effects to the more flexible random/mixed effects models [ 94 ], as well as wider application of more sophisticated methods, such as principal component regression, partial least squares, covariance structure models (e.g., structural equations), canonical correlations, time-series analysis, and generalized estimating equations.
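For readers unfamiliar with the recommended shift, a minimal random-intercept sketch using statsmodels follows; the panel, variable names, and coefficients are invented for illustration only.

```python
# A minimal random-intercept (mixed-effects) counterpart to fixed-effects panel
# models, using statsmodels MixedLM; the panel and variable names are invented.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "country": np.repeat(list("ABCDEFGH"), 15),
    "gdp": rng.normal(10, 1, 120),
})
country_effects = np.repeat(rng.normal(0, 1, 8), 15)  # random intercepts
df["life_exp"] = 70 + 0.8 * df["gdp"] + country_effects + rng.normal(0, 0.5, 120)

# groups= defines the clustering; re_formula="~gdp" would add random slopes.
me = smf.mixedlm("life_exp ~ gdp", data=df, groups=df["country"]).fit()
print(me.params["gdp"])
```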

Finally, there are some limitations of the current scoping review. We searched the two main databases for published research in medical and non-medical sciences (PubMed and Web of Science) since 2013, thus potentially excluding publications and reports that are not indexed in these databases, as well as older indexed publications. These choices were guided by our interest in the most recent (i.e., the current state-of-the-art) and arguably the highest-quality research (i.e., peer-reviewed articles, primarily in indexed non-predatory journals). Furthermore, despite holding a critical stance with regards to some aspects of how determinants-of-health research is currently conducted, we opted out of formally assessing the quality of the individual studies included. The reason for that is two-fold. On the one hand, we are unaware of the existence of a formal and standard tool for quality assessment of ecological designs. And on the other, we consider trying to score the quality of these diverse studies (in terms of regional setting, specific topic, outcome indices, and methodology) undesirable and misleading, particularly since we would sometimes have been rating the quality of only a (small) part of the original studies—the part that was relevant to our review’s goal.

Our aim was to investigate the current state of research on the very broad and general topic of population health, specifically, the way it has been examined in a multi-country context. We learned that data treatment and analytical approach were, in the majority of these recent studies, ill-equipped or insufficiently transparent to provide clarity regarding the underlying mechanisms of population health in high-income countries. Whether due to methodological shortcomings or the inherent complexity of the topic, research so far fails to provide any definitive answers. It is our sincere belief that with the application of more advanced analytical techniques this continuous quest could come to fruition sooner.

Supporting information

S1 Checklist. Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist.

https://doi.org/10.1371/journal.pone.0239031.s001

S1 Appendix.

https://doi.org/10.1371/journal.pone.0239031.s002

S2 Appendix.

https://doi.org/10.1371/journal.pone.0239031.s003

  • 75. Dahlgren G, Whitehead M. Policies and Strategies to Promote Equity in Health. Stockholm, Sweden: Institute for Future Studies; 1991.
  • 76. Brunner E, Marmot M. Social Organization, Stress, and Health. In: Marmot M, Wilkinson RG, editors. Social Determinants of Health. Oxford, England: Oxford University Press; 1999.
  • 77. Najman JM. A General Model of the Social Origins of Health and Well-being. In: Eckersley R, Dixon J, Douglas B, editors. The Social Origins of Health and Well-being. Cambridge, England: Cambridge University Press; 2001.
  • 85. Carpenter JR, Kenward MG. Multiple Imputation and its Application. New York: John Wiley & Sons; 2013.
  • 86. Molenberghs G, Fitzmaurice G, Kenward MG, Verbeke G, Tsiatis AA. Handbook of Missing Data Methodology. Boca Raton: Chapman & Hall/CRC; 2014.
  • 87. van Buuren S. Flexible Imputation of Missing Data. 2nd ed. Boca Raton: Chapman & Hall/CRC; 2018.
  • 88. Enders CK. Applied Missing Data Analysis. New York: Guilford; 2010.
  • 89. Searle SR, Casella G, McCulloch CE. Variance Components. New York: John Wiley & Sons; 1992.
  • 90. Agresti A. Foundations of Linear and Generalized Linear Models. Hoboken, New Jersey: John Wiley & Sons; 2015.
  • 91. Leyland AH, Goldstein H, editors. Multilevel Modelling of Health Statistics. Chichester: John Wiley & Sons; 2001.
  • 92. Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G, editors. Longitudinal Data Analysis. New York: Chapman & Hall/CRC; 2008.
  • 93. Härdle WK, Simar L. Applied Multivariate Statistical Analysis. 4th ed. Berlin, Heidelberg: Springer; 2015.

  • Open access
  • Published: 11 December 2020

Quantifying and addressing the prevalence and bias of study designs in the environmental and social sciences

  • Alec P. Christie   ORCID: orcid.org/0000-0002-8465-8410 1 ,
  • David Abecasis   ORCID: orcid.org/0000-0002-9802-8153 2 ,
  • Mehdi Adjeroud 3 ,
  • Juan C. Alonso   ORCID: orcid.org/0000-0003-0450-7434 4 ,
  • Tatsuya Amano   ORCID: orcid.org/0000-0001-6576-3410 5 ,
  • Alvaro Anton   ORCID: orcid.org/0000-0003-4108-6122 6 ,
  • Barry P. Baldigo   ORCID: orcid.org/0000-0002-9862-9119 7 ,
  • Rafael Barrientos   ORCID: orcid.org/0000-0002-1677-3214 8 ,
  • Jake E. Bicknell   ORCID: orcid.org/0000-0001-6831-627X 9 ,
  • Deborah A. Buhl 10 ,
  • Just Cebrian   ORCID: orcid.org/0000-0002-9916-8430 11 ,
  • Ricardo S. Ceia   ORCID: orcid.org/0000-0001-7078-0178 12 , 13 ,
  • Luciana Cibils-Martina   ORCID: orcid.org/0000-0002-2101-4095 14 , 15 ,
  • Sarah Clarke 16 ,
  • Joachim Claudet   ORCID: orcid.org/0000-0001-6295-1061 17 ,
  • Michael D. Craig 18 , 19 ,
  • Dominique Davoult 20 ,
  • Annelies De Backer   ORCID: orcid.org/0000-0001-9129-9009 21 ,
  • Mary K. Donovan   ORCID: orcid.org/0000-0001-6855-0197 22 , 23 ,
  • Tyler D. Eddy 24 , 25 , 26 ,
  • Filipe M. França   ORCID: orcid.org/0000-0003-3827-1917 27 ,
  • Jonathan P. A. Gardner   ORCID: orcid.org/0000-0002-6943-2413 26 ,
  • Bradley P. Harris 28 ,
  • Ari Huusko 29 ,
  • Ian L. Jones 30 ,
  • Brendan P. Kelaher 31 ,
  • Janne S. Kotiaho   ORCID: orcid.org/0000-0002-4732-784X 32 , 33 ,
  • Adrià López-Baucells   ORCID: orcid.org/0000-0001-8446-0108 34 , 35 , 36 ,
  • Heather L. Major   ORCID: orcid.org/0000-0002-7265-1289 37 ,
  • Aki Mäki-Petäys 38 , 39 ,
  • Beatriz Martín 40 , 41 ,
  • Carlos A. Martín 8 ,
  • Philip A. Martin 1 , 42 ,
  • Daniel Mateos-Molina   ORCID: orcid.org/0000-0002-9383-0593 43 ,
  • Robert A. McConnaughey   ORCID: orcid.org/0000-0002-8537-3695 44 ,
  • Michele Meroni 45 ,
  • Christoph F. J. Meyer   ORCID: orcid.org/0000-0001-9958-8913 34 , 35 , 46 ,
  • Kade Mills 47 ,
  • Monica Montefalcone 48 ,
  • Norbertas Noreika   ORCID: orcid.org/0000-0002-3853-7677 49 , 50 ,
  • Carlos Palacín 4 ,
  • Anjali Pande 26 , 51 , 52 ,
  • C. Roland Pitcher   ORCID: orcid.org/0000-0003-2075-4347 53 ,
  • Carlos Ponce 54 ,
  • Matt Rinella 55 ,
  • Ricardo Rocha   ORCID: orcid.org/0000-0003-2757-7347 34 , 35 , 56 ,
  • María C. Ruiz-Delgado 57 ,
  • Juan J. Schmitter-Soto   ORCID: orcid.org/0000-0003-4736-8382 58 ,
  • Jill A. Shaffer   ORCID: orcid.org/0000-0003-3172-0708 10 ,
  • Shailesh Sharma   ORCID: orcid.org/0000-0002-7918-4070 59 ,
  • Anna A. Sher   ORCID: orcid.org/0000-0002-6433-9746 60 ,
  • Doriane Stagnol 20 ,
  • Thomas R. Stanley 61 ,
  • Kevin D. E. Stokesbury 62 ,
  • Aurora Torres 63 , 64 ,
  • Oliver Tully 16 ,
  • Teppo Vehanen   ORCID: orcid.org/0000-0003-3441-6787 65 ,
  • Corinne Watts 66 ,
  • Qingyuan Zhao 67 &
  • William J. Sutherland 1 , 42  

Nature Communications volume 11, Article number: 6377 (2020)


Subjects:

  • Environmental impact
  • Scientific community
  • Social sciences

Building trust in science and evidence-based decision-making depends heavily on the credibility of studies and their findings. Researchers employ many different study designs that vary in their risk of bias to evaluate the true effect of interventions or impacts. Here, we empirically quantify, on a large scale, the prevalence of different study designs and the magnitude of bias in their estimates. Randomised designs and controlled observational designs with pre-intervention sampling were used by just 23% of intervention studies in biodiversity conservation, and 36% of intervention studies in social science. We demonstrate, through pairwise within-study comparisons across 49 environmental datasets, that these types of designs usually give less biased estimates than simpler observational designs. We propose a model-based approach to combine study estimates that may suffer from different levels of study design bias, discuss the implications for evidence synthesis, and how to facilitate the use of more credible study designs.

Similar content being viewed by others

quantitative research scholarly article

Citizen science in environmental and ecological sciences

quantitative research scholarly article

Improving quantitative synthesis to achieve generality in ecology

quantitative research scholarly article

Empirical evidence of widespread exaggeration bias and selective reporting in ecology

Introduction

The ability of science to reliably guide evidence-based decision-making hinges on the accuracy and credibility of studies and their results 1 , 2 . Well-designed, randomised experiments are widely accepted to yield more credible results than non-randomised, ‘observational studies’ that attempt to approximate and mimic randomised experiments 3 . Randomisation is a key element of study design that is widely used across many disciplines because of its ability to remove confounding biases (through random assignment of the treatment or impact of interest 4 , 5 ). However, ethical, logistical, and economic constraints often prevent the implementation of randomised experiments, whereas non-randomised observational studies have become popular as they take advantage of historical data for new research questions, larger sample sizes, less costly implementation, and more relevant and representative study systems or populations 6 , 7 , 8 , 9 . Observational studies nevertheless face the challenge of accounting for confounding biases without randomisation, which has led to innovations in study design.

We define ‘study design’ as an organised way of collecting data. Importantly, we distinguish between data collection and statistical analysis (as opposed to other authors 10 ) because of the belief that bias introduced by a flawed design is often much more important than bias introduced by statistical analyses. This was emphasised by Light, Singer & Willet 11 (p. 5): “You can’t fix by analysis what you bungled by design…”; and Rubin 3 : “Design trumps analysis.” Nevertheless, the importance of study design has often been overlooked in debates over the inability of researchers to reproduce the original results of published studies (so-called ‘reproducibility crises’ 12 , 13 ) in favour of other issues (e.g., p-hacking 14 and Hypothesizing After Results are Known or ‘HARKing’ 15 ).

To demonstrate the importance of study designs, we can use the following decomposition of estimation error (Equation (1)) 16 :

Estimation error = Design bias + Modelling bias + Statistical noise  (1)

This decomposition shows that even if we improve the quality of modelling and analysis (to reduce modelling bias through a better bias-variance trade-off 17 ) or increase sample size (to reduce statistical noise), we cannot remove the intrinsic bias introduced by the choice of study design (design bias) unless we collect the data in a different way. The importance of study design in determining the levels of bias in study results therefore cannot be overstated.

For the purposes of this study we consider six commonly used study designs; differences and connections can be visualised in Fig.  1 . There are three major components that allow us to define these designs: randomisation, sampling before and after the impact of interest occurs, and the use of a control group.

Fig. 1

A hypothetical study set-up is shown where the abundance of birds in three impact and control replicates (e.g., fields represented by blocks in a row) is monitored before and after an impact (e.g., ploughing) that occurs in year zero. Different colours represent each study design and illustrate how replicates are sampled. Approaches for calculating an estimate of the true effect of the impact for each design are also shown, along with synonyms from different disciplines.

Of the non-randomised observational designs, the Before-After Control-Impact (BACI) design uses a control group and samples before and after the impact occurs (i.e., in the ‘before-period’ and the ‘after-period’). Its rationale is to explicitly account for pre-existing differences between the impact group (exposed to the impact) and control group in the before-period, which might otherwise bias the estimate of the impact’s true effect 6 , 18 , 19 .

The BACI design improves upon several other commonly used observational study designs, of which there are two uncontrolled designs: After, and Before-After (BA). An After design monitors an impact group in the after-period, while a BA design compares the state of the impact group between the before- and after-periods. Both designs can be expected to yield poor estimates of the impact’s true effect (large design bias; Equation (1)) because changes in the response variable could have occurred without the impact (e.g., due to natural seasonal changes; Fig.  1 ).

The other observational design is Control-Impact (CI), which compares the impact group and control group in the after-period (Fig.  1 ). This design may suffer from design bias introduced by pre-existing differences between the impact group and control group in the before-period; bias that the BACI design was developed to account for 20 , 21 . These differences have many possible sources, including experimenter bias, logistical and environmental constraints, and various confounding factors (variables that change the propensity of receiving the impact), but can be adjusted for through certain data pre-processing techniques such as matching and stratification 22 .

Among the randomised designs, the most commonly used are counterparts to the observational CI and BACI designs: Randomised Control-Impact (R-CI) and Randomised Before-After Control-Impact (R-BACI) designs. The R-CI design, often termed ‘Randomised Controlled Trials’ (RCTs) in medicine and hailed as the ‘gold standard’ 23 , 24 , removes any pre-impact differences in a stochastic sense, resulting in zero design bias (Equation ( 1 )). Similarly, the R-BACI design should also have zero design bias, and the impact group measurements in the before-period could be used to improve the efficiency of the statistical estimator. No randomised equivalents exist of After or BA designs as they are uncontrolled.
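To make the designs' estimators concrete, the toy sketch below computes each design's point estimate following the approaches outlined in Fig. 1; the bird-abundance numbers are invented for demonstration.

```python
# Toy arithmetic for the point estimates each design yields, following the
# approaches sketched in Fig. 1; the bird-abundance numbers are invented.
import numpy as np

impact_before  = np.array([10, 12, 11])   # impact replicates, before-period
impact_after   = np.array([6, 7, 5])      # impact replicates, after-period
control_before = np.array([11, 10, 12])   # control replicates, before-period
control_after  = np.array([10, 11, 9])    # control replicates, after-period

after_est = impact_after.mean()                                # After design
ba_est = impact_after.mean() - impact_before.mean()            # BA design
ci_est = impact_after.mean() - control_after.mean()            # CI design
baci_est = ((impact_after.mean() - impact_before.mean())
            - (control_after.mean() - control_before.mean()))  # BACI (DiD form)

# R-CI and R-BACI share the CI and BACI arithmetic; randomising which replicates
# receive the impact is what removes design bias, not the estimator itself.
print(after_est, ba_est, ci_est, baci_est)
```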

It is important to briefly note that there is debate over two major statistical methods that can be used to analyse data collected using BACI and R-BACI designs, and which is superior at reducing modelling bias 25 (Equation (1)). These statistical methods are: (i) Differences in Differences (DiD) estimator; and (ii) covariance adjustment using the before-period response, which is an extension of Analysis of Covariance (ANCOVA) for generalised linear models — herein termed ‘covariance adjustment’ (Fig.  1 ). These estimators rely on different assumptions to obtain unbiased estimates of the impact’s true effect. The DiD estimator assumes that the control group response accurately represents the impact group response had it not been exposed to the impact (‘parallel trends’ 18 , 26 ) whereas covariance adjustment assumes there are no unmeasured confounders and linear model assumptions hold 6 , 27 .
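A minimal sketch of the two estimators on a hypothetical long-format BACI dataset, using statsmodels; the column names and response values are invented for illustration.

```python
# DiD versus covariance adjustment for (R-)BACI data: 'replicate' identifies the
# sampling unit, 'impact' flags the impact group, 'after' flags the after-period.
import pandas as pd
import statsmodels.formula.api as smf

panel = pd.DataFrame({
    "replicate": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "impact":    [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "after":     [0, 1] * 6,
    "y":         [10, 6, 12, 7, 11, 5, 11, 10, 10, 11, 12, 9],
})

# (i) Differences in Differences: the interaction coefficient estimates the
# effect, under the parallel-trends assumption.
did = smf.ols("y ~ impact * after", data=panel).fit()
print("DiD estimate:", did.params["impact:after"])

# (ii) Covariance adjustment: regress the after-period response on group
# membership, adjusting for each replicate's before-period response (assumes no
# unmeasured confounders and that the linear model holds).
wide = (panel.pivot_table(index=["replicate", "impact"], columns="after", values="y")
             .reset_index())
wide.columns = ["replicate", "impact", "y_before", "y_after"]
anc = smf.ols("y_after ~ impact + y_before", data=wide).fit()
print("Covariance-adjustment estimate:", anc.params["impact"])
```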

From both theory and Equation (1), with similar sample sizes, randomised designs (R-BACI and R-CI) are expected to be less biased than controlled observational designs with sampling in the before-period (BACI), which in turn should be superior to observational designs without sampling in the before-period (CI) or without a control group (BA and After designs 7 , 28 ). Between the randomised designs, we might expect an R-BACI design to perform better than an R-CI design, because utilising extra data from before the impact may improve the efficiency of the statistical estimator by explicitly characterising pre-existing differences between the impact group and control group.

Given the likely differences in bias associated with different study designs, concerns have been raised over the use of poorly designed studies in several scientific disciplines 7,29–35. Some disciplines, such as the social and medical sciences, commonly compare results obtained by randomised and non-randomised designs within a single study 36–38 or between multiple studies (between-study comparisons 39–41) specifically to understand the influence of study design on research findings. However, within-study comparisons are limited in their scope (e.g., a single study 42,43), and between-study comparisons can be confounded by variability in context or study populations 44. Overall, we lack quantitative estimates of the prevalence of different study designs and of the levels of bias associated with their results.

In this work, we aim first to quantify the prevalence of different study designs in the social and environmental sciences. To fill this knowledge gap, we take advantage of summaries of several thousand biodiversity-conservation intervention studies in the Conservation Evidence database 45 (www.conservationevidence.com) and of social intervention studies in systematic reviews by the Campbell Collaboration (www.campbellcollaboration.org). We then quantify the levels of bias in estimates obtained by different study designs (R-BACI, R-CI, BACI, BA, and CI) by applying a hierarchical model to approximately 1,000 within-study comparisons across 49 raw environmental datasets from a range of fields. We show that R-BACI, R-CI, and BACI designs are poorly represented in studies testing biodiversity-conservation and social interventions, and that these designs tend to give less biased estimates than simpler observational designs. We propose a model-based approach to combining study estimates that may suffer from different levels of study design bias, and we discuss the implications for evidence synthesis and how to facilitate the use of more credible study designs.

Prevalence of study designs

We found that the biodiversity-conservation (Conservation Evidence) and social-science (Campbell Collaboration) literatures had similarly high proportions of intervention studies that used CI designs and After designs, but low proportions that used R-BACI, BACI, or BA designs (Fig. 2). Slightly higher proportions of intervention studies used R-CI designs in social-science systematic reviews than in the biodiversity-conservation literature (Fig. 2). The R-BACI, R-CI, and BACI designs together made up 23% of intervention studies for biodiversity conservation and 36% of intervention studies for social science.

Fig. 2: Intervention studies from the biodiversity-conservation literature were screened from the Conservation Evidence database (n = 4,260 studies) and studies from the social-science literature were screened from 32 Campbell Collaboration systematic reviews (n = 1,009 studies; note that studies excluded by these reviews on the basis of their study design were still counted). Percentages for the social-science literature were calculated for each systematic review (blue data points) and then averaged across all 32 systematic reviews (blue bars and black vertical lines represent means and 95% confidence intervals, respectively). Percentages for the biodiversity-conservation literature are absolute values (green bars) calculated from the entire Conservation Evidence database (after excluding any reviews). Source data are provided as a Source Data file. BA, before-after; CI, control-impact; BACI, before-after-control-impact; R-BACI, randomised BACI; R-CI, randomised CI.

Influence of different study designs on study results

In non-randomised datasets, we found that estimates of BACI (with covariance adjustment) and CI designs were very similar, while point estimates for most other designs often differed substantially in their magnitude and sign. We found similar results in randomised datasets for R-BACI (with covariance adjustment) and R-CI designs. For roughly 30% of responses, in both non-randomised and randomised datasets, study design estimates differed in their statistical significance (i.e., p < 0.05 versus p ≥ 0.05), except for estimates of (R-)BACI (with covariance adjustment) and (R-)CI designs (Table 1; Fig. 3). It was rare for the 95% confidence intervals of different designs’ estimates not to overlap, except when comparing estimates of BA designs to those of (R-)BACI (with covariance adjustment) and (R-)CI designs (Table 1). It was even rarer for estimates of different designs to have significantly different signs (i.e., one estimate with entirely negative confidence intervals versus one with entirely positive confidence intervals; Table 1, Fig. 3). Overall, point estimates often differed greatly in their magnitude and, to a lesser extent, in their sign between study designs, but differed less once the uncertainty around point estimates was accounted for, except in terms of their statistical significance.

Fig. 3: t-statistics were obtained from two-sided t-tests of estimates obtained by each design for different responses in each dataset using Generalised Linear Models (see Methods). For randomised datasets, BACI and CI axis labels refer to R-BACI and R-CI designs (denoted by ‘R-’). DiD, Difference in Differences; CA, covariance adjustment. Lines at t-statistic values of ±1.96 denote boundaries between cells, and colours of points indicate differences in direction and statistical significance (p < 0.05; grey = same sign and significance, orange = same sign but different significance, red = different sign and significance). Numbers refer to the number of responses in each cell. Source data are provided as a Source Data file. BA, Before-After; CI, Control-Impact; BACI, Before-After-Control-Impact.

Levels of bias in estimates of different study designs

We modelled study design bias using a random effect across datasets in a hierarchical Bayesian model; σ is the standard deviation of the bias term, and, assuming bias is randomly distributed across datasets with mean zero, larger values of σ indicate a greater magnitude of bias (see Methods). We found that, for randomised datasets, estimates of both R-BACI (using covariance adjustment; CA) and R-CI designs were affected by negligible amounts of bias (very small values of σ; Table 2). When the R-BACI design used the DiD estimator, it suffered from slightly more bias (slightly larger values of σ), whereas the BA design had very high bias when applied to randomised datasets (very large values of σ; Table 2). There was a highly positive correlation between the estimates of R-BACI (using covariance adjustment) and R-CI designs (Ω[R-BACI CA, R-CI] was close to 1; Table 2). Estimates of R-BACI using the DiD estimator were also positively correlated with estimates of R-BACI using covariance adjustment and with R-CI designs (moderately positive mean values of Ω[R-BACI CA, R-BACI DiD] and Ω[R-BACI DiD, R-CI]; Table 2).

For non-randomised datasets, controlled designs (BACI and CI) were substantially less biased (far smaller values of σ) than the uncontrolled BA design (Table 2). A BACI design using the DiD estimator was slightly less biased than a BACI design using covariance adjustment, which was, in turn, slightly less biased than the CI design (Table 2).

Standard errors estimated by the hierarchical Bayesian model were reasonably accurate for the randomised datasets (see λ in Methods and Table 2), whereas there was some underestimation of standard errors and lack of fit for the non-randomised datasets.

Our approach provides a principled way to quantify the levels of bias associated with different study designs. We found that randomised study designs (R-BACI and R-CI) and observational BACI designs are poorly represented in the environmental and social sciences; collectively, descriptive case studies (the After design), the uncontrolled observational BA design, and the controlled observational CI design made up a substantially greater proportion of intervention studies (Fig. 2). And yet R-BACI, R-CI, and BACI designs were found to be quantifiably less biased than the other observational designs.

As expected, the R-CI and R-BACI designs (using a covariance adjustment estimator) performed well; the R-BACI design using a DiD estimator performed slightly less well, probably because the differencing of pre-impact data by this estimator may introduce additional statistical noise compared to covariance adjustment, which controls for these data using a lagged regression variable. Of the observational designs, the BA design performed very poorly (when analysing both randomised and non-randomised data), as expected, being uncontrolled and therefore prone to severe design bias 7,28. The CI design also tended to be more biased than the BACI design (using a DiD estimator) due to pre-existing differences between the impact and control groups. For BACI designs, we recommend that the underlying assumptions of the DiD and CA estimators are carefully considered before choosing to apply them to data collected for a specific research question 6,27. Their levels of bias were negligibly different, and their known bracketing relationship suggests they will typically give estimates with the same sign, although their tendency to over- or underestimate the true effect will depend on how well the underlying assumptions of each are met (most notably, parallel trends for DiD and no unmeasured confounders for CA; see Introduction) 6,27. Overall, these findings demonstrate the power of large within-study comparisons to directly quantify differences in the levels of bias associated with different designs.

We must acknowledge that the assumptions of our hierarchical model (that the bias for each design j is on average zero and normally distributed) cannot be verified without gold-standard randomised experiments, and that, for observational designs, the model was overdispersed (potentially due to underestimation of statistical error by GLM(M)s or positively correlated design biases). The exact values from our hierarchical model should therefore be treated with appropriate caution, and future research is needed to refine and improve our approach so that these biases can be quantified more precisely. Responses within datasets may also not be independent, as multiple species could interact; the estimates analysed by our hierarchical model are therefore statistically dependent on each other, and although we tried to account for this using a correlation matrix (see Methods, Equation (3)), this is a limitation of our model. We must also recognise that we collated datasets using non-systematic searches 46,47, so our analysis potentially exaggerates the intrinsic biases of observational designs (i.e., our data may disproportionately reflect situations where the BACI design was chosen to account for confounding factors). We nevertheless show that researchers were wise to use the BACI design, because it was less biased than CI and BA designs across a wide range of datasets from various environmental systems and locations. Without undertaking costly and time-consuming pre-impact sampling and pilot studies, researchers are also unlikely to know the levels of bias that could affect their results. Finally, we did not consider sample size, but it is likely that researchers might use larger sample sizes for CI and BA designs than for BACI designs. This is, however, unlikely to affect our main conclusions, because larger sample sizes could increase type I errors (the false positive rate) by yielding more precise but biased estimates of the true effect 28.

Our analyses provide several empirically supported recommendations for researchers designing future studies to assess an impact of interest. First, using a controlled and/or randomised design (where possible) was shown to strongly reduce the level of bias in study estimates. Second, when observational designs must be used (because randomisation is not feasible or is too costly), we urge researchers to choose the BACI design over other observational designs, and, when that is not possible, to choose the CI design over the uncontrolled BA design. We acknowledge that limited resources, short funding timescales, and ethical or logistical constraints 48 may force researchers to use the CI design (if randomisation and pre-impact sampling are impossible) or the BA design (if appropriate controls cannot be found 28). To facilitate the use of less biased designs, longer-term investments in research effort and funding are required 43. Far greater emphasis on study design in statistical education 49, and better training and collaboration between researchers, practitioners, and methodologists, is needed to improve the design of future studies; for example, potentially improving the CI design by pairing or matching the impact group and control group 22, or improving the BA design using regression discontinuity methods 48,50. Where the choice of study design is limited, researchers must transparently communicate the limitations and uncertainty associated with their results.

Our findings also have wider implications for evidence synthesis, specifically the exclusion of certain observational study designs from syntheses (the ‘rubbish in, rubbish out’ concept 51,52). We believe that observational designs should be included in systematic reviews and meta-analyses, but that careful adjustments are needed to account for their potential biases. Exclusion of observational studies often results from subjective, checklist-based ‘Risk of Bias’ or quality assessments of studies (e.g., AMSTAR 2 53, ROBINS-I 54, or GRADE 55) that are not data-driven and often neglect to identify the actual direction, or quantify the magnitude, of the possible bias introduced by observational studies when rating the quality of a review’s recommendations. We also found that only a small proportion of studies used randomised designs (R-CI or R-BACI) or observational BACI designs (Fig. 2), suggesting that systematic reviews and meta-analyses risk excluding a substantial proportion of the literature and limiting the scope of their recommendations if such exclusion criteria are used 32,56,57. This problem is compounded by the fact that, at least in conservation science, studies using randomised or BACI designs are strongly concentrated in Europe, Australasia, and North America 31. Systematic reviews that rely on these few types of study designs are therefore likely to fail to provide decision-makers outside of these regions with the locally relevant recommendations that they prefer 58. The COVID-19 pandemic has highlighted the difficulties of making locally relevant evidence-based decisions using studies conducted in different countries with different demographics and cultures, and on patients of different ages, ethnicities, genetics, and underlying health issues 59. This problem is also acute for decision-makers working on biodiversity conservation in tropical regions, where the need for conservation is arguably greatest (i.e., where most of Earth’s biodiversity exists 60) but where decision-makers either have to rely on very few well-designed studies that are not locally relevant (i.e., that have low generalisability) or on more studies that are locally relevant but less well designed 31,32. Either option could lead decision-makers to take ineffective or inefficient decisions. In the long term, improving the quality and coverage of scientific evidence and evidence syntheses across the world will help solve these issues, but shorter-term solutions for synthesising patchy evidence bases are required.

Our work furthers sorely needed research on how to combine evidence from studies that vary greatly in their design. Our approach is an alternative to conventional meta-analyses, which tend to weight studies only by their sample size or the inverse of their variance 61; when studies vary greatly in their design, simply weighting by inverse variance or sample size is unlikely to account for the different levels of bias introduced by different study designs (see Equation (1)). For example, a BA study could receive a larger weight than a BACI study if it had lower variance, despite our results suggesting that a BA study usually suffers from greater design bias. Our model provides a principled way to weight studies by both their variance and the likely amount of bias introduced by their study design; it is therefore a form of ‘bias-adjusted meta-analysis’ 62–66. However, instead of relying on elicitation of subjective expert opinions on the bias of each study, we provide a data-driven, empirical quantification of study biases, an important step that has been called for to improve such meta-analytic approaches 65,66.

Future research is needed to refine our methodology, but our empirically grounded form of bias-adjusted meta-analysis could be implemented as follows: (1) collate studies of the same true effect, their effect size estimates, standard errors, and the type of study design used; (2) enter these data into our hierarchical model, in which effect size estimates share the same intercept (the true causal effect), a random effect term for design bias (whose variance is estimated by the method we used), and a random effect term for statistical noise (whose variance is estimated from the reported standard errors of studies); (3) fit this model and estimate the shared intercept, i.e., the true effect. Heuristically, this can be thought of as weighting studies by both their design bias and their sampling variance, and it could be implemented on a dynamic meta-analysis platform (such as metadataset.com 67). This approach has substantial potential to develop evidence synthesis in fields (such as biodiversity conservation 31,32) with patchy evidence bases, where reliably synthesising findings from studies that vary greatly in their design is a fundamental and unavoidable challenge.
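
Heuristically (a simplified sketch rather than the full hierarchical model), the resulting pooling resembles precision weighting in which each study's sampling variance is inflated by the bias variance estimated for its study design; the function below and its argument names are illustrative assumptions:

```r
# Heuristic sketch of bias-adjusted pooling: each study's weight combines
# its sampling variance (se_hat^2) with the estimated bias variance
# (sigma_design^2) of its study design from the hierarchical model.
bias_adjusted_pool <- function(beta_hat, se_hat, sigma_design) {
  w <- 1 / (se_hat^2 + sigma_design^2)  # down-weight noisy or bias-prone designs
  sum(w * beta_hat) / sum(w)            # precision-weighted pooled effect
}

# e.g., one BA study with a large assumed bias SD and two BACI studies
# with smaller bias SDs (illustrative numbers only)
bias_adjusted_pool(beta_hat = c(0.8, 0.3, 0.25),
                   se_hat = c(0.1, 0.2, 0.15),
                   sigma_design = c(0.5, 0.1, 0.1))
```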

Our study has highlighted an often-overlooked aspect of debates over scientific reproducibility: that the credibility of studies is fundamentally determined by study design. Testing the effectiveness of conservation and social interventions is undoubtedly of great importance given the current challenges facing biodiversity and society in general and the serious need for more evidence-based decision-making 1,68. And yet our findings suggest that quantifiably less biased study designs are poorly represented in the environmental and social sciences. Greater methodological training of researchers and funding for intervention studies, as well as stronger collaboration between methodologists and practitioners, are needed to facilitate the use of less biased study designs. Better communication and reporting of the uncertainty associated with different study designs is also needed, as well as more meta-research (the study of research itself) to improve standards of study design 69. Our hierarchical model provides a principled way to combine studies that use a variety of study designs and vary greatly in their risk of bias, enabling us to make more efficient use of patchy evidence bases. Ultimately, we hope that researchers and practitioners testing interventions will think carefully about the types of study designs they use, and we encourage the evidence-synthesis community to embrace alternative methods for combining evidence from heterogeneous sets of studies to improve our ability to inform evidence-based decision-making in all disciplines.

Quantifying the use of different designs

We compared the use of different study designs in the literature that quantitatively tested interventions, contrasting the fields of biodiversity conservation (4,260 studies collated by Conservation Evidence 45) and social science (1,009 studies found by 32 systematic reviews produced by the Campbell Collaboration: www.campbellcollaboration.org).

Conservation Evidence is a database of intervention studies, each of which has quantitatively tested a conservation intervention (e.g., sowing strips of wildflower seeds on farmland to benefit birds), and is continuously updated through comprehensive, manual searches of conservation journals across a wide range of fields in biodiversity conservation (e.g., amphibian, bird, peatland, and farmland conservation 45). To obtain the proportion of studies that used each design, we extracted the type of study design for each study in the database in 2019; study designs were classified using a standardised set of criteria (Table 3), and reviews were not included. We checked whether the designs reported in the database accurately reflected the designs in the original publications and found that, for a random subset of 356 studies, 95.1% were accurately described.

Each systematic review produced by the Campbell Collaboration collates and analyses studies that test a specific social intervention; we collated systematic reviews that tested a variety of social interventions across several fields in the social sciences, including education, crime and justice, international development, and social welfare (Supplementary Data 1). We retrieved systematic reviews produced by the Campbell Collaboration by searching their website (www.campbellcollaboration.org) for reviews published between 2013 and 2019 (as of 8th September 2019); we limited the date range because we could not examine every review. As we were interested in the use of study designs in the wider social-science literature, we considered only reviews (32 in total) that contained sufficient information on the numbers of included and excluded studies that used different study designs. Studies may be excluded from systematic reviews for several reasons, such as their relevance to the scope of the review (e.g., testing a relevant intervention) and their study design. We counted excluded studies only if the sole reason for their exclusion was their study design; that is, the review clearly reported that the study was excluded because it used a particular study design, and not for any other reason, such as its relevance to the review’s research questions. We calculated the proportion of studies that used each design in each systematic review (using the same criteria as for the biodiversity-conservation literature; see Table 3) and then averaged these proportions across all systematic reviews.

Within-study comparisons of different study designs

We wanted to make direct within-study comparisons between the estimates obtained by different study designs (e.g., see 38,70,71 for single within-study comparisons) for many different studies. If a dataset was collected using a BACI design, subsets of the data can be used to mimic the use of other study designs: a BA design using only data for the impact group, and a CI design using only data collected after the impact occurred. Similarly, if data were collected using an R-BACI design, subsets of the data can be used to mimic a BA design and an R-CI design. Collating BACI and R-BACI datasets would therefore allow us to make direct within-study comparisons of the estimates obtained by these designs.
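
As a minimal sketch (assuming a data frame baci with illustrative columns treatment and period), the simpler designs can be mimicked by subsetting:

```r
# Mimic simpler designs from a (R-)BACI dataset by subsetting:
ba_data <- subset(baci, treatment == "Impact")  # BA: impact group only,
                                                # compared before vs after
ci_data <- subset(baci, period == "After")      # (R-)CI: after-period only,
                                                # impact vs control
```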

We collated BACI and R-BACI datasets by searching the Web of Science Core Collection 72, which included the following citation indexes: Science Citation Index Expanded (SCI-EXPANDED) 1900-present; Social Sciences Citation Index (SSCI) 1900-present; Arts & Humanities Citation Index (A&HCI) 1975-present; Conference Proceedings Citation Index - Science (CPCI-S) 1990-present; Conference Proceedings Citation Index - Social Science & Humanities (CPCI-SSH) 1990-present; Book Citation Index - Science (BKCI-S) 2008-present; Book Citation Index - Social Sciences & Humanities (BKCI-SSH) 2008-present; Emerging Sources Citation Index (ESCI) 2015-present; Current Chemical Reactions (CCR-EXPANDED) 1985-present (includes Institut National de la Propriete Industrielle structure data back to 1840); and Index Chemicus (IC) 1993-present. The search terms used were [‘BACI’] OR [‘Before-After Control-Impact’], and the search was conducted on 18th December 2017. Our search returned 674 results, which we refined by selecting only ‘Article’ as the document type and using only the following Web of Science categories: ‘Ecology’, ‘Marine Freshwater Biology’, ‘Biodiversity Conservation’, ‘Fisheries’, ‘Oceanography’, ‘Forestry’, ‘Zoology’, ‘Ornithology’, ‘Biology’, ‘Plant Sciences’, ‘Entomology’, ‘Remote Sensing’, ‘Toxicology’, and ‘Soil Science’. This left 579 results, which we then restricted to articles published since 2002 (15 years prior to the search) to give us a realistic opportunity of obtaining the raw datasets, reducing the number to 542. We were able to access the abstracts of 521 studies and excluded any that did not test the effect of an environmental intervention or threat using an R-BACI or BACI design with response measures related to the abundance (e.g., density, counts, biomass, cover), reproduction (reproductive success), or size (body length, body mass) of animals or plants. Many studies did not test a relevant metric (e.g., they measured species richness), did not use a BACI or R-BACI design, or did not test the effect of an intervention or threat; this left 96 studies, for all of which we contacted the corresponding authors to ask for the raw dataset. We were able to fully access 54 raw datasets, but upon closer inspection we found that three of these either did not use a BACI design, did not use the metrics we specified, or did not provide sufficient data for our analyses. This left 51 datasets in total, which we used in our preliminary analyses (Supplementary Data 2).

All the datasets were originally collected to evaluate the effect of an environmental intervention or impact. Most contained multiple response variables (e.g., different measures for different species, such as abundance or density for species A, B, and C). Within a dataset, we use the term “response” to refer to the estimation of the true effect of an impact on one response variable. There were 1,968 responses in total across the 51 datasets. We excluded 932 responses (resulting in the exclusion of one dataset) where one or more of the four time-period and treatment subsets (Before Control, Before Impact, After Control, and After Impact) consisted entirely of zero measurements, or where two or more of these subsets had more than 90% zero measurements. We also excluded one further dataset, as it was the only one that did not contain repeated measurements at sites in both the before- and after-periods. These exclusions were necessary to generate reliable standard errors when modelling the data. We modelled the remaining 1,036 responses from across 49 datasets (Supplementary Table 1).

We applied each study design to the appropriate components of each dataset using Generalised Linear Models (GLMs 73,74) because of their generality and their ability to implement the statistical estimators of many different study designs. The model structure of each GLM was adjusted for each response in each dataset based on the study design specified, the response measure, and the dataset structure (Supplementary Table 2). We quantified the effect of the time period for the BA design (After vs Before the impact) and the effect of the treatment type for the CI and R-CI designs (Impact vs Control) on the response variable (Supplementary Table 2). For BACI and R-BACI designs, we implemented two statistical estimators: (1) a DiD estimator that estimated the true effect using an interaction term between time and treatment type; and (2) a covariance adjustment estimator that estimated the true effect using a term for the treatment type with a lagged variable (Supplementary Table 2).

As there were large numbers of responses, we used general a priori rules to specify models for each response; this may have led to some model misspecification but was unlikely to have substantially affected our pairwise comparisons of estimates obtained by different designs. The error family of each GLM was specified based on the nature of the measure used and on preliminary data exploration: count measures (e.g., abundance) = poisson; density measures (e.g., biomass or abundance per unit area) = quasipoisson, as data for these measures tended to be overdispersed; percentage measures (e.g., percentage cover) = quasibinomial; and size measures (e.g., body length) = gaussian.
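
As a minimal sketch (with an assumed measure_type label per response, and the log link used throughout, as described in the next paragraph), these rules map onto R error families as follows:

```r
# A priori family rules from the text; measure_type labels are illustrative.
choose_family <- function(measure_type) {
  switch(measure_type,
         count      = poisson(link = "log"),       # e.g., abundance counts
         density    = quasipoisson(link = "log"),  # tended to be overdispersed
         percentage = quasibinomial(link = "log"), # e.g., percentage cover
         size       = gaussian(link = "log"))      # e.g., body length
}
```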

We treated each year or season in which data were collected as independent observations, because the implementation of a seasonal term in models is likely to vary on a case-by-case basis; this will depend on the research questions posed by each study and was not feasible for us to consider given the large number of responses we were modelling. The log link function was used in all models to generate a standardised log response ratio as an estimate of the true effect for each response; a fixed-effect coefficient (a variable named treatment status; Supplementary Table 2) was used to estimate the log response ratio 61. If a response had at least ten ‘sites’ (independent sampling units) and at least two measurements per site on average, we used random effects of subsample (replicates within a site) nested within site to capture the dependence within sites and subsamples (i.e., a Generalised Linear Mixed Model, or GLMM 73,74, was implemented instead of a GLM); otherwise, we fitted a GLM with only the fixed effects (Supplementary Table 2).
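
A rough sketch of this decision rule, assuming illustrative column names (y, treatment_status, site, subsample) and using lme4 for the mixed models:

```r
library(lme4)  # glmer() for GLMMs

# Sketch of the GLM-vs-GLMM rule described above; thresholds follow the text.
fit_response <- function(d, fam) {
  n_sites <- length(unique(d$site))
  mean_obs_per_site <- nrow(d) / n_sites
  if (n_sites >= 10 && mean_obs_per_site >= 2) {
    # GLMM: subsample nested within site as random intercepts
    # (note: glmer() does not accept quasi- families; MASS::glmmPQL is one option)
    glmer(y ~ treatment_status + (1 | site/subsample), data = d, family = fam)
  } else {
    # Too few sites or replicates: fixed effects only
    glm(y ~ treatment_status, data = d, family = fam)
  }
}
```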

We fitted all models using R version 3.5.1 75 and the packages lme4 76 and MASS 77. Code to replicate all analyses is available (see the Data availability and Code availability sections). We compared the estimates obtained using each study design (both point estimates and estimates with associated standard errors) by their magnitude and sign.

A model-based quantification of the bias in study design estimates

We used a hierarchical Bayesian model motivated by the decomposition in Equation (1) to quantify the bias in different study design estimates. This model takes the estimated effects of impacts and their standard errors as inputs. Let \(\hat\beta_{ij}\) be the estimator of the true effect in response \(i\) using design \(j\) and \(\hat\sigma_{ij}\) be its estimated standard error from the corresponding GLM or GLMM. Our hierarchical model assumes:

\[\hat\beta_{ij} = \beta_i + \gamma_{ij} + \varepsilon_{ij}, \qquad \gamma_{ij} \sim \mathcal{N}\!\left(0, \sigma_j^2\right), \qquad (2)\]

where \(\beta_i\) is the true effect for response \(i\), \(\gamma_{ij}\) is the bias of design \(j\) in response \(i\), and \(\varepsilon_{ij}\) is the sampling noise of the statistical estimator. Although \(\gamma_{ij}\) technically incorporates both the design bias and any misspecification (modelling) bias due to using GLMs or GLMMs (Equation (1)), we expect the modelling bias to be much smaller than the design bias 3,11. We assume the statistical errors \(\boldsymbol{\varepsilon}_i\) within a response are related to the estimated standard errors through the following joint distribution:

\[\boldsymbol{\varepsilon}_i \sim \mathcal{N}\!\left(\mathbf{0},\ \lambda^2\,\mathrm{diag}(\hat{\boldsymbol{\sigma}}_i)\,{\Omega}\,\mathrm{diag}(\hat{\boldsymbol{\sigma}}_i)\right), \qquad (3)\]

where \({\Omega}\) is the correlation matrix for the different estimators in the same response and \(\lambda\) is a scaling factor to account for possible over- or under-estimation of the standard errors.

This model effectively quantifies the bias of design \(j\) through the value of \(\sigma_j\) (larger values = more bias), while accounting for within-response correlations using the correlation matrix \({\Omega}\) and for possible under-estimation of the standard errors using \(\lambda\). We ensured that the prior distributions we used had very large variances, so that they would have a very small effect on the posterior distribution; accordingly, we placed disperse priors on the variance parameters.

We fitted the hierarchical Bayesian model in R version 3.5.1 using the Bayesian inference package rstan 78.
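
A condensed sketch of such a model in rstan is given below; it is a simplified illustration that omits the within-response correlation matrix \({\Omega}\), and all variable names and priors are assumptions rather than the exact code used in this study:

```r
library(rstan)

# Simplified sketch of the hierarchical bias model (omits Omega).
stan_code <- "
data {
  int<lower=1> N;                  // number of design-level estimates
  int<lower=1> R;                  // number of responses
  int<lower=1> J;                  // number of study designs
  int<lower=1, upper=R> resp[N];   // response index of each estimate
  int<lower=1, upper=J> design[N]; // design index of each estimate
  vector[N] beta_hat;              // estimated effects from the GLM(M)s
  vector<lower=0>[N] se_hat;       // their estimated standard errors
}
parameters {
  vector[R] beta;                  // true effect per response
  vector<lower=0>[J] sigma;        // design-bias SD per design
  vector[N] gamma;                 // realised design biases
  real<lower=0> lambda;            // scaling of reported standard errors
}
model {
  sigma ~ cauchy(0, 5);            // assumed weakly informative priors
  lambda ~ cauchy(0, 5);
  gamma ~ normal(0, sigma[design]);
  beta_hat ~ normal(beta[resp] + gamma, lambda * se_hat);
}
"
# fit <- stan(model_code = stan_code,
#             data = list(N = N, R = R, J = J, resp = resp,
#                         design = design, beta_hat = beta_hat, se_hat = se_hat))
```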

Data availability

All data analysed in the current study are available from Zenodo: https://doi.org/10.5281/zenodo.3560856. Source data are provided with this paper.

Code availability

All code used in the current study is available from Zenodo: https://doi.org/10.5281/zenodo.3560856.

References

1. Donnelly, C. A. et al. Four principles to make evidence synthesis more useful for policy. Nature 558, 361–364 (2018).
2. McKinnon, M. C., Cheng, S. H., Garside, R., Masuda, Y. J. & Miller, D. C. Sustainability: map the evidence. Nature 528, 185–187 (2015).
3. Rubin, D. B. For objective causal inference, design trumps analysis. Ann. Appl. Stat. 2, 808–840 (2008).
4. Peirce, C. S. & Jastrow, J. On small differences in sensation. Mem. Natl Acad. Sci. 3, 73–83 (1884).
5. Fisher, R. A. Statistical Methods for Research Workers (Oliver and Boyd, 1925).
6. Angrist, J. D. & Pischke, J.-S. Mostly Harmless Econometrics: An Empiricist’s Companion (Princeton University Press, 2008).
7. de Palma, A. et al. Challenges with inferring how land-use affects terrestrial biodiversity: study design, time, space and synthesis. In Next Generation Biomonitoring: Part 1, 163–199 (Elsevier Ltd., 2018).
8. Sagarin, R. & Pauchard, A. Observational approaches in ecology open new ground in a changing world. Front. Ecol. Environ. 8, 379–386 (2010).
9. Shadish, W. R., Cook, T. D. & Campbell, D. T. Experimental and Quasi-Experimental Designs for Generalized Causal Inference (Houghton Mifflin, 2002).
10. Rosenbaum, P. R. Design of Observational Studies. Vol. 10 (Springer, 2010).
11. Light, R. J., Singer, J. D. & Willett, J. B. By Design: Planning Research on Higher Education (Harvard University Press, 1990).
12. Ioannidis, J. P. A. Why most published research findings are false. PLOS Med. 2, e124 (2005).
13. Open Science Collaboration. Estimating the reproducibility of psychological science. Science 349, aac4716 (2015).
14. John, L. K., Loewenstein, G. & Prelec, D. Measuring the prevalence of questionable research practices with incentives for truth telling. Psychol. Sci. 23, 524–532 (2012).
15. Kerr, N. L. HARKing: hypothesizing after the results are known. Personal. Soc. Psychol. Rev. 2, 196–217 (1998).
16. Zhao, Q., Keele, L. J. & Small, D. S. Comment: will competition-winning methods for causal inference also succeed in practice? Stat. Sci. 34, 72–76 (2019).
17. Friedman, J., Hastie, T. & Tibshirani, R. The Elements of Statistical Learning. Vol. 1 (Springer Series in Statistics, 2001).
18. Underwood, A. J. Beyond BACI: experimental designs for detecting human environmental impacts on temporal variations in natural populations. Mar. Freshw. Res. 42, 569–587 (1991).
19. Stewart-Oaten, A. & Bence, J. R. Temporal and spatial variation in environmental impact assessment. Ecol. Monogr. 71, 305–339 (2001).
20. Eddy, T. D., Pande, A. & Gardner, J. P. A. Massive differential site-specific and species-specific responses of temperate reef fishes to marine reserve protection. Glob. Ecol. Conserv. 1, 13–26 (2014).
21. Sher, A. A. et al. Native species recovery after reduction of an invasive tree by biological control with and without active removal. Ecol. Eng. 111, 167–175 (2018).
22. Imbens, G. W. & Rubin, D. B. Causal Inference in Statistics, Social, and Biomedical Sciences (Cambridge University Press, 2015).
23. Greenhalgh, T. How to Read a Paper: The Basics of Evidence Based Medicine (John Wiley & Sons, Ltd, 2019).
24. Salmond, S. S. Randomized controlled trials: methodological concepts and critique. Orthopaedic Nursing 27 (2008).
25. Geijzendorffer, I. R. et al. How can global conventions for biodiversity and ecosystem services guide local conservation actions? Curr. Opin. Environ. Sustainability 29, 145–150 (2017).
26. Dimick, J. B. & Ryan, A. M. Methods for evaluating changes in health care policy. JAMA 312, 2401 (2014).
27. Ding, P. & Li, F. A bracketing relationship between difference-in-differences and lagged-dependent-variable adjustment. Political Anal. 27, 605–615 (2019).
28. Christie, A. P. et al. Simple study designs in ecology produce inaccurate estimates of biodiversity responses. J. Appl. Ecol. 56, 2742–2754 (2019).
29. Watson, M. et al. An analysis of the quality of experimental design and reliability of results in tribology research. Wear 426–427, 1712–1718 (2019).
30. Kilkenny, C. et al. Survey of the quality of experimental design, statistical analysis and reporting of research using animals. PLoS ONE 4, e7824 (2009).
31. Christie, A. P. et al. The challenge of biased evidence in conservation. Conserv. Biol. https://doi.org/10.1111/cobi.13577 (2020).
32. Christie, A. P. et al. Poor availability of context-specific evidence hampers decision-making in conservation. Biol. Conserv. 248, 108666 (2020).
33. Moscoe, E., Bor, J. & Bärnighausen, T. Regression discontinuity designs are underutilized in medicine, epidemiology, and public health: a review of current and best practice. J. Clin. Epidemiol. 68, 132–143 (2015).
34. Goldenhar, L. M. & Schulte, P. A. Intervention research in occupational health and safety. J. Occup. Med. 36, 763–778 (1994).
35. Junker, J. et al. A severe lack of evidence limits effective conservation of the World’s primates. BioScience https://doi.org/10.1093/biosci/biaa082 (2020).
36. Altindag, O., Joyce, T. J. & Reeder, J. A. Can nonexperimental methods provide unbiased estimates of a breastfeeding intervention? A within-study comparison of peer counseling in Oregon. Evaluation Rev. 43, 152–188 (2019).
37. Chaplin, D. D. et al. The internal and external validity of the regression discontinuity design: a meta-analysis of 15 within-study comparisons. J. Policy Anal. Manag. 37, 403–429 (2018).
38. Cook, T. D., Shadish, W. R. & Wong, V. C. Three conditions under which experiments and observational studies produce comparable causal estimates: new findings from within-study comparisons. J. Policy Anal. Manag. 27, 724–750 (2008).
39. Ioannidis, J. P. A. et al. Comparison of evidence of treatment effects in randomized and nonrandomized studies. J. Am. Med. Assoc. 286, 821–830 (2001).
40. dos Santos Ribas, L. G., Pressey, R. L., Loyola, R. & Bini, L. M. A global comparative analysis of impact evaluation methods in estimating the effectiveness of protected areas. Biol. Conserv. 246, 108595 (2020).
41. Benson, K. & Hartz, A. J. A comparison of observational studies and randomized, controlled trials. N. Engl. J. Med. 342, 1878–1886 (2000).
42. Smokorowski, K. E. et al. Cautions on using the Before-After-Control-Impact design in environmental effects monitoring programs. Facets 2, 212–232 (2017).
43. França, F. et al. Do space-for-time assessments underestimate the impacts of logging on tropical biodiversity? An Amazonian case study using dung beetles. J. Appl. Ecol. 53, 1098–1105 (2016).
44. Duvendack, M., Hombrados, J. G., Palmer-Jones, R. & Waddington, H. Assessing ‘what works’ in international development: meta-analysis for sophisticated dummies. J. Dev. Effectiveness 4, 456–471 (2012).
45. Sutherland, W. J. et al. Building a tool to overcome barriers in research-implementation spaces: the Conservation Evidence database. Biol. Conserv. 238, 108199 (2019).
46. Gusenbauer, M. & Haddaway, N. R. Which academic search systems are suitable for systematic reviews or meta-analyses? Evaluating retrieval qualities of Google Scholar, PubMed, and 26 other resources. Res. Synth. Methods 11, 181–217 (2020).
47. Konno, K. & Pullin, A. S. Assessing the risk of bias in choice of search sources for environmental meta-analyses. Res. Synth. Methods 11, 698–713 (2020).
48. Butsic, V., Lewis, D. J., Radeloff, V. C., Baumann, M. & Kuemmerle, T. Quasi-experimental methods enable stronger inferences from observational data in ecology. Basic Appl. Ecol. 19, 1–10 (2017).
49. Brownstein, N. C., Louis, T. A., O’Hagan, A. & Pendergast, J. The role of expert judgment in statistical inference and evidence-based decision-making. Am. Statistician 73, 56–68 (2019).
50. Hahn, J., Todd, P. & van der Klaauw, W. Identification and estimation of treatment effects with a regression-discontinuity design. Econometrica 69, 201–209 (2001).
51. Slavin, R. E. Best evidence synthesis: an intelligent alternative to meta-analysis. J. Clin. Epidemiol. 48, 9–18 (1995).
52. Slavin, R. E. Best-evidence synthesis: an alternative to meta-analytic and traditional reviews. Educ. Researcher 15, 5–11 (1986).
53. Shea, B. J. et al. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ 358, 1–8 (2017).
54. Sterne, J. A. C. et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ 355, i4919 (2016).
55. Guyatt, G. et al. GRADE guidelines: 11. Making an overall rating of confidence in effect estimates for a single outcome and for all outcomes. J. Clin. Epidemiol. 66, 151–157 (2013).
56. Davies, G. M. & Gray, A. Don’t let spurious accusations of pseudoreplication limit our ability to learn from natural experiments (and other messy kinds of ecological monitoring). Ecol. Evolution 5, 5295–5304 (2015).
57. Lortie, C. J., Stewart, G., Rothstein, H. & Lau, J. How to critically read ecological meta-analyses. Res. Synth. Methods 6, 124–133 (2015).
58. Gutzat, F. & Dormann, C. F. Exploration of concerns about the evidence-based guideline approach in conservation management: hints from medical practice. Environ. Manag. 66, 435–449 (2020).
59. Greenhalgh, T. Will COVID-19 be evidence-based medicine’s nemesis? PLOS Med. 17, e1003266 (2020).
60. Barlow, J. et al. The future of hyperdiverse tropical ecosystems. Nature 559, 517–526 (2018).
61. Gurevitch, J. & Hedges, L. V. Statistical issues in ecological meta-analyses. Ecology 80, 1142–1149 (1999).
62. Stone, J. C., Glass, K., Munn, Z., Tugwell, P. & Doi, S. A. R. Comparison of bias adjustment methods in meta-analysis suggests that quality effects modeling may have less limitations than other approaches. J. Clin. Epidemiol. 117, 36–45 (2020).
63. Rhodes, K. M. et al. Adjusting trial results for biases in meta-analysis: combining data-based evidence on bias with detailed trial assessment. J. R. Stat. Soc. Ser. A (Stat. Soc.) 183, 193–209 (2020).
64. Efthimiou, O. et al. Combining randomized and non-randomized evidence in network meta-analysis. Stat. Med. 36, 1210–1226 (2017).
65. Welton, N. J., Ades, A. E., Carlin, J. B., Altman, D. G. & Sterne, J. A. C. Models for potentially biased evidence in meta-analysis using empirically based priors. J. R. Stat. Soc. Ser. A (Stat. Soc.) 172, 119–136 (2009).
66. Turner, R. M., Spiegelhalter, D. J., Smith, G. C. S. & Thompson, S. G. Bias modelling in evidence synthesis. J. R. Stat. Soc. Ser. A (Stat. Soc.) 172, 21–47 (2009).
67. Shackelford, G. E. et al. Dynamic meta-analysis: a method of using global evidence for local decision making. bioRxiv 2020.05.18.078840, https://doi.org/10.1101/2020.05.18.078840 (2020).
68. Sutherland, W. J., Pullin, A. S., Dolman, P. M. & Knight, T. M. The need for evidence-based conservation. Trends Ecol. Evolution 19, 305–308 (2004).
69. Ioannidis, J. P. A. Meta-research: why research on research matters. PLOS Biol. 16, e2005468 (2018).
70. LaLonde, R. J. Evaluating the econometric evaluations of training programs with experimental data. Am. Econ. Rev. 76, 604–620 (1986).
71. Long, Q., Little, R. J. & Lin, X. Causal inference in hybrid intervention trials involving treatment choice. J. Am. Stat. Assoc. 103, 474–484 (2008).
72. Thomson Reuters. ISI Web of Knowledge. http://www.isiwebofknowledge.com (2019).
73. Stroup, W. W. Generalized Linear Mixed Models: Modern Concepts, Methods and Applications (CRC Press, 2012).
74. Bolker, B. M. et al. Generalized linear mixed models: a practical guide for ecology and evolution. Trends Ecol. Evolution 24, 127–135 (2009).
75. R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2019).
76. Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).
77. Venables, W. N. & Ripley, B. D. Modern Applied Statistics with S (Springer, 2002).
78. Stan Development Team. RStan: the R interface to Stan. R package version 2.19.3 (2020).


Acknowledgements

We are grateful to the following people and organisations for contributing datasets to this analysis: P. Edwards, G.R. Hodgson, H. Welsh, J.V. Vieira, authors of van Deurs et al. 2012, T. M. Grome, M. Kaspersen, H. Jensen, C. Stenberg, T. K. Sørensen, J. Støttrup, T. Warnar, H. Mosegaard, Axel Schwerk, Alberto Velando, Dolores River Restoration Partnership, J.S. Pinilla, A. Page, M. Dasey, D. Maguire, J. Barlow, J. Louzada, Jari Florestal, R.T. Buxton, C.R. Schacter, J. Seoane, M.G. Conners, K. Nickel, G. Marakovich, A. Wright, G. Soprone, CSIRO, A. Elosegi, L. García-Arberas, J. Díez, A. Rallo, Parks and Wildlife Finland, Parc Marin de la Côte Bleue. Author funding sources: T.A. was supported by the Grantham Foundation for the Protection of the Environment, Kenneth Miller Trust and Australian Research Council Future Fellowship (FT180100354); W.J.S. and P.A.M. were supported by Arcadia, MAVA, and The David and Claudia Harding Foundation; A.P.C. was supported by the Natural Environment Research Council via Cambridge Earth System Science NERC DTP (NE/L002507/1); D.A. was funded by Portugal national funds through the FCT – Foundation for Science and Technology, under the Transitional Standard – DL57 / 2016 and through the strategic project UIDB/04326/2020; M.A. acknowledges Koniambo Nickel SAS, and particularly Gregory Marakovich and Andy Wright; J.C.A. was funded through by Dirección General de Investigación Científica, projects PB97-1252, BOS2002-01543, CGL2005-04893/BOS, CGL2008-02567 and Comunidad de Madrid, as well as by contract HENARSA-CSIC 2003469-CSIC19637; A.A. was funded by Spanish Government: MEC (CGL2007-65176); B.P.B. was funded through the U.S. Geological Survey and the New York City Department of Environmental Protection; R.B. was funded by Comunidad de Madrid (2018-T1/AMB-10374); J.A.S. and D.A.B. were funded through the U.S. Geological Survey and NextEra Energy; R.S.C. was funded by the Portuguese Foundation for Science and Technology (FCT) grant SFRH/BD/78813/2011 and strategic project UID/MAR/04292/2013; A.D.B. was funded through the Belgian offshore wind monitoring program (WINMON-BE), financed by the Belgian offshore wind energy sector via RBINS—OD Nature; M.K.D. was funded by the Harold L. Castle Foundation; P.M.E. was funded by the Clackamas County Water Environment Services River Health Stewardship Program and the Portland State University Student Watershed Research Project; T.D.E., J.P.A.G. and A.P. were supported by funding from the New Zealand Department of Conservation (Te Papa Atawhai) and from the Centre for Marine Environmental & Economic Research, Victoria University of Wellington, New Zealand; F.M.F. was funded by CNPq-CAPES grants (PELD site 23 403811/2012-0, PELD-RAS 441659/2016-0, BEX5528/13-5 and 383744/2015-6) and BNP Paribas Foundation (Climate & Biodiversity Initiative, BIOCLIMATE project); B.P.H. was funded by NOAA-NMFS sea scallop research set-aside program awards NA16FM1031, NA06FM1001, NA16FM2416, and NA04NMF4720332; A.L.B. was funded by the Portuguese Foundation for Science and Technology (FCT) grant FCT PD/BD/52597/2014, Bat Conservation International student research fellowship and CNPq grant 160049/2013-0; L.C.M. acknowledges Secretaría de Ciencia y Técnica (UNRC); R.A.M. acknowledges Alaska Fisheries Science Center, NOAA Fisheries, and U.S. Department of Commerce for salary support; C.F.J.M. was funded by the Portuguese Foundation for Science and Technology (FCT) grant SFRH/BD/80488/2011; R.R. 
was funded by the Portuguese Foundation for Science and Technology (FCT) grant PTDC/BIA-BIC/111184/2009, by Madeira’s Regional Agency for the Development of Research, Technology and Innovation (ARDITI) grant M1420-09-5369-FSE-000002 and by a Bat Conservation International student research fellowship; J.C. and S.S. were funded by the Alabama Department of Conservation and Natural Resources; A.T. was funded by the Spanish Ministry of Education with a Formacion de Profesorado Universitario (FPU) grant AP2008-00577 and Dirección General de Investigación Científica, project CGL2008-02567; C.W. was funded by Strategic Science Investment Funding of the Ministry of Business, Innovation and Employment, New Zealand; J.S.K. acknowledges Boreal Peatland LIFE (LIFE08 NAT/FIN/000596), Parks and Wildlife Finland and Kone Foundation; J.J.S.S. was funded by the Mexican National Council on Science and Technology (CONACYT 242558); N.N. was funded by The Carl Tryggers Foundation; I.L.J. was funded by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada; D.D. and D.S. were funded by the French National Research Agency via the “Investment for the Future” program IDEALG (ANR-10-BTBR-04) and by the ALGMARBIO project; R.C.P. was funded by CSIRO and whose research was also supported by funds from the Great Barrier Reef Marine Park Authority, the Fisheries Research and Development Corporation, the Australian Fisheries Management Authority, and Queensland Department of Primary Industries (QDPI). Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government. The scientific results and conclusions, as well as any views or opinions expressed herein, are those of the author(s) and do not necessarily reflect those of NOAA or the Department of Commerce.

Author information

Authors and affiliations

Conservation Science Group, Department of Zoology, University of Cambridge, The David Attenborough Building, Downing Street, Cambridge, CB3 3QZ, UK

Alec P. Christie, Philip A. Martin & William J. Sutherland

Centre of Marine Sciences (CCMar), Universidade do Algarve, Campus de Gambelas, 8005-139, Faro, Portugal

David Abecasis

Institut de Recherche pour le Développement (IRD), UMR 9220 ENTROPIE & Laboratoire d’Excellence CORAIL, Université de Perpignan Via Domitia, 52 avenue Paul Alduy, 66860, Perpignan, France

Mehdi Adjeroud

Museo Nacional de Ciencias Naturales, CSIC, Madrid, Spain

Juan C. Alonso & Carlos Palacín

School of Biological Sciences, University of Queensland, Brisbane, 4072, QLD, Australia

Tatsuya Amano

Education Faculty of Bilbao, University of the Basque Country (UPV/EHU), Sarriena z/g, E-48940, Leioa, Basque Country, Spain

Alvaro Anton

U.S. Geological Survey, New York Water Science Center, 425 Jordan Rd., Troy, NY, 12180, USA

Barry P. Baldigo

Universidad Complutense de Madrid, Departamento de Biodiversidad, Ecología y Evolución, Facultad de Ciencias Biológicas, c/ José Antonio Novais, 12, E-28040, Madrid, Spain

Rafael Barrientos & Carlos A. Martín

Durrell Institute of Conservation and Ecology (DICE), School of Anthropology and Conservation, University of Kent, Canterbury, CT2 7NR, UK

Jake E. Bicknell

U.S. Geological Survey, Northern Prairie Wildlife Research Center, Jamestown, ND, 58401, USA

Deborah A. Buhl & Jill A. Shaffer

Northern Gulf Institute, Mississippi State University, 1021 Balch Blvd, John C. Stennis Space Center, Mississippi, 39529, USA

Just Cebrian

MARE – Marine and Environmental Sciences Centre, Dept. Life Sciences, University of Coimbra, Coimbra, Portugal

Ricardo S. Ceia

CFE – Centre for Functional Ecology, Dept. Life Sciences, University of Coimbra, Coimbra, Portugal

Departamento de Ciencias Naturales, Universidad Nacional de Río Cuarto (UNRC), Córdoba, Argentina

Luciana Cibils-Martina

CONICET, Buenos Aires, Argentina

Marine Institute, Rinville, Oranmore, Galway, Ireland

Sarah Clarke & Oliver Tully

National Center for Scientific Research, PSL Université Paris, CRIOBE, USR 3278 CNRS-EPHE-UPVD, Maison des Océans, 195 rue Saint-Jacques, 75005, Paris, France

Joachim Claudet

School of Biological Sciences, University of Western Australia, Nedlands, WA, 6009, Australia

Michael D. Craig

School of Environmental and Conservation Sciences, Murdoch University, Murdoch, WA, 6150, Australia

Sorbonne Université, CNRS, UMR 7144, Station Biologique, F.29680, Roscoff, France

Dominique Davoult & Doriane Stagnol

Flanders Research Institute for Agriculture, Fisheries and Food (ILVO), Ankerstraat 1, 8400, Ostend, Belgium

Annelies De Backer

Marine Science Institute, University of California Santa Barbara, Santa Barbara, CA, 93106, USA

Mary K. Donovan

Hawaii Institute of Marine Biology, University of Hawaii at Manoa, Honolulu, HI, 96822, USA

Baruch Institute for Marine & Coastal Sciences, University of South Carolina, Columbia, SC, USA

Tyler D. Eddy

Centre for Fisheries Ecosystems Research, Fisheries & Marine Institute, Memorial University of Newfoundland, St. John’s, Canada

School of Biological Sciences, Victoria University of Wellington, P O Box 600, Wellington, 6140, New Zealand

Tyler D. Eddy, Jonathan P. A. Gardner & Anjali Pande

Lancaster Environment Centre, Lancaster University, LA1 4YQ, Lancaster, UK

Filipe M. França

Fisheries, Aquatic Science and Technology Laboratory, Alaska Pacific University, 4101 University Dr., Anchorage, AK, 99508, USA

Bradley P. Harris

Natural Resources Institute Finland, Manamansalontie 90, 88300, Paltamo, Finland

Department of Biology, Memorial University, St. John’s, NL, A1B 2R3, Canada

Ian L. Jones

National Marine Science Centre and Marine Ecology Research Centre, Southern Cross University, 2 Bay Drive, Coffs Harbour, 2450, Australia

Brendan P. Kelaher

Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland

Janne S. Kotiaho

School of Resource Wisdom, University of Jyväskylä, Jyväskylä, Finland

Centre for Ecology, Evolution and Environmental Changes – cE3c, Faculty of Sciences, University of Lisbon, 1749-016, Lisbon, Portugal

Adrià López-Baucells, Christoph F. J. Meyer & Ricardo Rocha

Biological Dynamics of Forest Fragments Project, National Institute for Amazonian Research and Smithsonian Tropical Research Institute, 69011-970, Manaus, Brazil

Granollers Museum of Natural History, Granollers, Spain

Adrià López-Baucells

Department of Biological Sciences, University of New Brunswick, PO Box 5050, Saint John, NB, E2L 4L5, Canada

Heather L. Major

Voimalohi Oy, Voimatie 23, Voimatie, 91100, Ii, Finland

Aki Mäki-Petäys

Natural Resources Institute Finland, Paavo Havaksen tie 3, 90014 University of Oulu, Oulu, Finland

Fundación Migres CIMA Ctra, Cádiz, Spain

Beatriz Martín

Intergovernmental Oceanographic Commission of UNESCO, Marine Policy and Regional Coordination Section Paris 07, Paris, France

BioRISC, St. Catharine’s College, Cambridge, CB2 1RL, UK

Philip A. Martin & William J. Sutherland

Departamento de Ecología e Hidrología, Universidad de Murcia, Campus de Espinardo, 30100, Murcia, Spain

Daniel Mateos-Molina

RACE Division, Alaska Fisheries Science Center, National Marine Fisheries Service, NOAA, 7600 Sand Point Way NE, Seattle, WA, 98115, USA

Robert A. McConnaughey

European Commission, Joint Research Centre (JRC), Ispra, VA, Italy

Michele Meroni

School of Science, Engineering and Environment, University of Salford, Salford, M5 4WT, UK

Christoph F. J. Meyer

Victorian National Park Association, Carlton, VIC, Australia

Department of Earth, Environment and Life Sciences (DiSTAV), University of Genoa, Corso Europa 26, 16132, Genoa, Italy

Monica Montefalcone

Department of Ecology, Swedish University of Agricultural Sciences, Uppsala, Sweden

Norbertas Noreika

Chair of Plant Health, Institute of Agricultural and Environmental Sciences, Estonian University of Life Sciences, Tartu, Estonia

Biosecurity New Zealand – Tiakitanga Pūtaiao Aotearoa, Ministry for Primary Industries – Manatū Ahu Matua, 66 Ward St, PO Box 40742, Wallaceville, New Zealand

Anjali Pande

National Institute of Water & Atmospheric Research Ltd (NIWA), 301 Evans Bay Parade, Greta Point Wellington, New Zealand

CSIRO Oceans & Atmosphere, Queensland Biosciences Precinct, 306 Carmody Road, St. Lucia, QLD, 4067, Australia

C. Roland Pitcher

Museo Nacional de Ciencias Naturales, CSIC, José Gutiérrez Abascal 2, E-28006, Madrid, Spain

Carlos Ponce

Fort Keogh Livestock and Range Research Laboratory, 243 Fort Keogh Rd, Miles City, Montana, 59301, USA

Matt Rinella

CIBIO-InBIO, Research Centre in Biodiversity and Genetic Resources, University of Porto, Vairão, Portugal

Ricardo Rocha

Departamento de Sistemas Físicos, Químicos y Naturales, Universidad Pablo de Olavide, ES-41013, Sevilla, Spain

María C. Ruiz-Delgado

El Colegio de la Frontera Sur, A.P. 424, 77000, Chetumal, QR, Mexico

Juan J. Schmitter-Soto

Division of Fish and Wildlife, New York State Department of Environmental Conservation, 625 Broadway, Albany, NY, 12233-4756, USA

Shailesh Sharma

Department of Biological Sciences, University of Denver, Denver, CO, USA

Anna A. Sher

U.S. Geological Survey, Fort Collins Science Center, Fort Collins, CO, 80526, USA

Thomas R. Stanley

School for Marine Science and Technology, University of Massachusetts Dartmouth, New Bedford, MA, USA

Kevin D. E. Stokesbury

Georges Lemaître Earth and Climate Research Centre, Earth and Life Institute, Université Catholique de Louvain, 1348, Louvain-la-Neuve, Belgium

Aurora Torres

Center for Systems Integration and Sustainability, Department of Fisheries and Wildlife, Michigan State University, East Lansing, MI, 48823, USA

Natural Resources Institute Finland, Latokartanonkaari 9, 00790, Helsinki, Finland

Teppo Vehanen

Manaaki Whenua – Landcare Research, Private Bag 3127, Hamilton, 3216, New Zealand

Corinne Watts

Statistical Laboratory, Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Wilberforce Road, Cambridge, CB3 0WB, UK

Qingyuan Zhao


Contributions

A.P.C., T.A., P.A.M., Q.Z., and W.J.S. designed the research; A.P.C. wrote the paper; D.A., M.A., J.C.A., A.A., B.P.B., R.B., J.B., D.A.B., J.C., R.S.C., L.C.M., S.C., J.C., M.D.C., D.D., A.D.B., M.K.D., T.D.E., P.M.E., F.M.F., J.P.A.G., B.P.H., A.H., I.L.J., B.P.K., J.S.K., A.L.B., H.L.M., A.M., B.M., C.A.M., D.M., R.A.M., M.M., C.F.J.M., K.M., M.M., N.N., C.P., A.P., C.R.P., C.P., M.R., R.R., M.C.R., J.J.S.S., J.A.S., S.S., A.A.S., D.S., K.D.E.S., T.R.S., A.T., O.T., T.V., and C.W. contributed datasets for analyses. All authors reviewed, edited, and approved the manuscript.

Corresponding author

Correspondence to Alec P. Christie.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Casper Albers, Samuel Scheiner, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

  • Supplementary Information
  • Peer Review File
  • Description of Additional Supplementary Information
  • Supplementary Data 1
  • Supplementary Data 2
  • Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article

Christie, A.P., Abecasis, D., Adjeroud, M. et al. Quantifying and addressing the prevalence and bias of study designs in the environmental and social sciences. Nat Commun 11, 6377 (2020). https://doi.org/10.1038/s41467-020-20142-y


Received: 29 January 2020

Accepted: 13 November 2020

Published: 11 December 2020

DOI: https://doi.org/10.1038/s41467-020-20142-y


This article is cited by

Robust language-based mental health assessments in time and space through social media.

  • Siddharth Mangalik
  • Johannes C. Eichstaedt
  • H. Andrew Schwartz

npj Digital Medicine (2024)

Is there a “difference-in-difference”? The impact of scientometric evaluation on the evolution of international publications in Egyptian universities and research centres

  • Mona Farouk Ali

Scientometrics (2024)

Quantifying research waste in ecology

  • Marija Purgar
  • Tin Klanjscek
  • Antica Culina

Nature Ecology & Evolution (2022)

Assessing assemblage-wide mammal responses to different types of habitat modification in Amazonian forests

  • Paula C. R. Almeida-Maués
  • Anderson S. Bueno
  • Ana Cristina Mendes-Oliveira

Scientific Reports (2022)

Mitigating impacts of invasive alien predators on an endangered sea duck amidst high native predation pressure

  • Kim Jaatinen
  • Ida Hermansson

Oecologia (2022)


Quantitative and Qualitative Research: An Overview of Approaches

  • First Online: 03 January 2022

  • Euclid Seeram

In Chap. 1, the nature and scope of research were outlined and included an overview of quantitative and qualitative research and a brief description of research designs. In this chapter, both quantitative and qualitative research will be described in a little more detail with respect to essential features and characteristics. Furthermore, the research designs used in each of these approaches will be reviewed. Finally, this chapter will conclude with examples of published quantitative and qualitative research in medical imaging and radiation therapy.



Author information

Authors and Affiliations

Medical Imaging and Radiation Sciences, Monash University, Melbourne, VIC, Australia

Euclid Seeram

Faculty of Science, Charles Sturt University, Bathurst, NSW, Australia

Medical Radiation Sciences, Faculty of Health, University of Canberra, Canberra, ACT, Australia


Editor information

Editors and Affiliations

Medical Imaging, Faculty of Health, University of Canberra, Burnaby, BC, Canada

Faculty of Health, University of Canberra, Canberra, ACT, Australia

Robert Davidson

Brookfield Health Sciences, University College Cork, Cork, Ireland

Andrew England

Mark F. McEntee

Rights and permissions

Reprints and permissions

Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Seeram, E. (2021). Quantitative and Qualitative Research: An Overview of Approaches. In: Seeram, E., Davidson, R., England, A., McEntee, M.F. (eds) Research for Medical Imaging and Radiation Sciences. Springer, Cham. https://doi.org/10.1007/978-3-030-79956-4_2


DOI: https://doi.org/10.1007/978-3-030-79956-4_2

Published: 03 January 2022

Publisher Name: Springer, Cham

Print ISBN: 978-3-030-79955-7

Online ISBN: 978-3-030-79956-4

eBook Packages: Biomedical and Life Sciences; Biomedical and Life Sciences (R0)



A Practical Guide to Writing Quantitative and Qualitative Research Questions and Hypotheses in Scholarly Articles

Affiliations

  • 1 Department of General Education, Graduate School of Nursing Science, St. Luke's International University, Tokyo, Japan. [email protected].
  • 2 Department of Biological Sciences, Messiah University, Mechanicsburg, PA, USA.
  • PMID: 35470596
  • PMCID: PMC9039193
  • DOI: 10.3346/jkms.2022.37.e121

The development of research questions and the subsequent hypotheses are prerequisites to defining the main research purpose and specific objectives of a study. Consequently, these objectives determine the study design and research outcome. The development of research questions is a process based on knowledge of current trends, cutting-edge studies, and technological advances in the research field. Excellent research questions are focused and require a comprehensive literature search and in-depth understanding of the problem being investigated. Initially, research questions may be written as descriptive questions which could be developed into inferential questions. These questions must be specific and concise to provide a clear foundation for developing hypotheses. Hypotheses are more formal predictions about the research outcomes. These specify the possible results that may or may not be expected regarding the relationship between groups. Thus, research questions and hypotheses clarify the main purpose and specific objectives of the study, which in turn dictate the design of the study, its direction, and outcome. Studies developed from good research questions and hypotheses will have trustworthy outcomes with wide-ranging social and health implications.
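
The distinction the abstract draws between descriptive and inferential questions can be made concrete with a short worked example. The following sketch is purely illustrative: the scores, group labels, and variable names are hypothetical, and SciPy is assumed to be available; it is not material from the article itself. A descriptive question is answered by summarizing the sample, while an inferential question is answered by testing a hypothesis about the relationship between groups.

```python
# Minimal illustrative sketch: descriptive vs. inferential research questions.
# All figures below are hypothetical; SciPy is assumed to be installed.
from statistics import mean, stdev

from scipy import stats

# Hypothetical outcome scores for two study groups
control = [4.1, 3.8, 5.0, 4.4, 3.9, 4.7]       # e.g., usual care
intervention = [3.2, 2.9, 3.8, 3.1, 3.5, 2.7]  # e.g., new protocol

# Descriptive question: "What is the mean score in each group?"
print(f"control:      mean={mean(control):.2f}, sd={stdev(control):.2f}")
print(f"intervention: mean={mean(intervention):.2f}, sd={stdev(intervention):.2f}")

# Inferential question: "Do the group means differ?"
# Null hypothesis: no difference between group means.
t_stat, p_value = stats.ttest_ind(control, intervention)
print(f"two-sample t-test: t={t_stat:.2f}, p={p_value:.4f}")
```

The hypothesis here plays exactly the role the abstract describes: it is a formal prediction about the relationship between groups that the test then supports or fails to support.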

Keywords: Hypotheses; Qualitative Research; Quantitative Research; Research Questions.

© 2022 The Korean Academy of Medical Sciences.


Conflict of interest statement

The authors have no potential conflicts of interest to disclose.

Fig. 1. General flow for constructing effective research questions and hypotheses prior to conducting research.

Fig. 2. Algorithm for building research question and hypothesis in quantitative research, and illustrative example…

Fig. 3. Algorithm for building research question and hypothesis in qualitative research, and illustrative example…




Can J Infect Dis. 1997 Mar-Apr; 8(2)

Broadening horizons: Integrating quantitative and qualitative research

Health research usually employs quantitative, often experimental, methods to study clinical conditions and outcomes. The use of qualitative methods in this type of research is much less common. However, we contend that qualitative research, in combination with quantitative research, can play an important role in generating an improved understanding of disease, health and health care.

Data collected in qualitative research are usually in narrative rather than numerical form, such as the transcript of an unstructured, in-depth interview. Analysis of qualitative data organizes, summarizes and interprets these nonnumerical observations. The goal of qualitative research is the development of concepts that help clarify phenomena in natural, rather than experimental, settings, giving due emphasis to the meanings, experiences and views of all the participants being studied. For example, to understand why some members of ethnic minorities have refused tuberculosis treatment, qualitative, culturally sensitive interviews may be much more informative than standardized quantitative interviews.

Both quantitative and qualitative research have weaknesses that to some degree are compensated for by the strengths of the other. Quantitative research is very well suited to establishing cause-and-effect relationships, to testing hypotheses and to determining the opinions, attitudes and practices of a large population, whereas qualitative research lends itself very well to developing hypotheses and theories and to describing processes such as decision making or communication processes. Quantitative research generates factual, reliable outcome data that are usually generalizable to some larger populations, and qualitative research produces rich, detailed and valid process data based on the participant’s, rather than the investigator’s, perspectives and interpretations ( 1 ).

Quantitative research is usually deductive, relying on experimental and survey methods to test specific hypotheses based on general principles. Qualitative research, by contrast, is usually inductive, building and expanding theories concerning relationships among phenomena. In the actual practice of scientific research, theory and research interact through a never-ending cycle of deduction, induction, deduction, induction and so forth ( 2 ). By combining quantitative and qualitative methods, a degree of comprehensiveness may be achieved that neither approach, if used alone, can achieve. For example, to target populations of children who are not being immunized for common childhood infectious diseases, it is critical to quantify the existence of a low rate of immunization. However, to intervene to rectify the identified problem, it is important to explore why parents are not having their children vaccinated. Qualitative interviews are most appropriate for this purpose.
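
To make the quantitative half of the immunization example concrete, a minimal sketch of "quantifying the existence of a low rate of immunization" might look like the following. Everything in it is a hypothetical assumption for illustration: the region names, counts, the 90% coverage target, and the flagging rule are not data from any study.

```python
# Minimal illustrative sketch: quantifying immunization coverage by region.
# All names, counts, and the 90% target below are hypothetical assumptions.
from math import sqrt

surveys = {  # region: (children immunized, children surveyed)
    "Region A": (430, 500),
    "Region B": (355, 500),
    "Region C": (480, 500),
}
TARGET = 0.90  # assumed coverage target

for region, (immunized, surveyed) in surveys.items():
    rate = immunized / surveyed
    # 95% confidence interval via the normal approximation to the binomial
    half_width = 1.96 * sqrt(rate * (1 - rate) / surveyed)
    below = rate + half_width < TARGET  # entire interval sits below target
    status = "below target" if below else "at/near target"
    print(f"{region}: {rate:.1%} (95% CI ±{half_width:.1%}) -> {status}")
```

A survey of this kind identifies where coverage is low; the follow-up question of why parents are not having their children vaccinated is the part the paragraph above assigns to qualitative interviews.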

The nature of inquiry is similar in both quantitative and qualitative research: it is couched in the human desire to understand and explain behaviour and events, their components, antecedents, corollaries and consequences. If differences among researchers exist, it is not because they aspire to different ends, but because they have operationalized their methods for reaching those ends differently ( 3 ).

Even though the two approaches differ, neither is necessarily inferior to the other. However, qualitative research is often considered to lack scientific rigour. Unfortunately, the standard strategies used to enhance validity, reliability and objectivity in quantitative research are not always relevant to qualitative research. Fortunately, more attention is being paid to strategies to enhance the quality of the data and interpretations collected through qualitative research ( 4 , 5 ). An example of a commonly used approach is triangulation, which refers to data collection in which evidence is deliberately sought from a wide range of different, independent sources and often by different means (for instance, comparing oral testimony with written records) ( 4 ).

The rigid demarcation between the two types of research has not encouraged interaction between the two camps. However, there are some good reasons for combining qualitative and quantitative research. First, a researcher may wish to explore an issue to understand what the relevant variables are or to develop hypotheses that can then be studied or tested in quantitative research. This way of combining the two approaches is also used in the development of scales or questionnaires. For example, qualitative techniques, such as observation, in-depth interviews or focus groups, can provide a description and understanding of a situation or behaviour. At their most basic, these techniques can be used simply to discover the most comprehensible terms or words to use in a subsequent survey questionnaire ( 6 ). Second, qualitative research may also follow quantitative research with the aim of explaining the quantitative results. For example, designing and evaluating an effective health campaign promoting influenza vaccinations faces multimethod challenges. To determine whether the campaign works so that the strategy can be effectively used again, it is not only important to identify how many people received shots, but also why and how they decided to get vaccinated (ie, linking process to outcome). Third, qualitative and quantitative research can be combined to enhance the validity of the results, much the same as in triangulation, but now using both quantitative and qualitative approaches, for their combined strength, rather than using one method to validate the result of the other. For example, overall validity would be enhanced through the use of a multimethods approach in instances where measuring the technical accuracy of a diagnostic or treatment intervention was important alongside an understanding of patient response to or acceptance of such a diagnostic test or treatment protocol ( 7 ).

Several barriers may prevent the further development of integrated research, such as lack of expertise, time and funding, mutual prejudices, and publication bias. If methodological integration is going to progress, a number of changes in the current environment are needed. These include acceptance and refinement of the underlying paradigms of qualitative and quantitative research, recognition by funding agencies of the need for both perspectives and a willingness to allocate sufficient resources, awareness by editorial boards of the importance of publishing multimethods research, training of researchers in both paradigms, encouragement of teamwork and promotion of mutual acceptance and respect by the adherents of each approach.

Given the multiple challenges facing health researchers today, broadening the horizons of health research to embrace the use and benefits of both quantitative and qualitative analysis is a methodological advance that must be supported and nurtured.

