Chapter 7: Nonexperimental Research

Overview of Nonexperimental Research

Learning Objectives

  • Define nonexperimental research, distinguish it clearly from experimental research, and give several examples.
  • Explain when a researcher might choose to conduct nonexperimental research as opposed to experimental research.

What Is Nonexperimental Research?

Nonexperimental research is research that lacks the manipulation of an independent variable, random assignment of participants to conditions or orders of conditions, or both.

In a sense, it is unfair to define this large and diverse set of approaches collectively by what they are not. But doing so reflects the fact that most researchers in psychology consider the distinction between experimental and nonexperimental research to be an extremely important one. The distinction matters because although experimental research can provide strong evidence that changes in an independent variable cause differences in a dependent variable, nonexperimental research generally cannot. As we will see, however, this inability does not mean that nonexperimental research is less important than experimental research or inferior to it in any general sense.

When to Use Nonexperimental Research

As we saw in Chapter 6, experimental research is appropriate when the researcher has a specific research question or hypothesis about a causal relationship between two variables, and it is possible, feasible, and ethical to manipulate the independent variable and randomly assign participants to conditions or to orders of conditions. It stands to reason, therefore, that nonexperimental research is appropriate (even necessary) when these conditions are not met. There are many situations in which nonexperimental research is preferred.

  • The research question or hypothesis can be about a single variable rather than a statistical relationship between two variables (e.g., How accurate are people’s first impressions?).
  • The research question can be about a noncausal statistical relationship between variables (e.g., Is there a correlation between verbal intelligence and mathematical intelligence?).
  • The research question can be about a causal relationship, but the independent variable cannot be manipulated or participants cannot be randomly assigned to conditions or orders of conditions (e.g., Does damage to a person’s hippocampus impair the formation of long-term memory traces?).
  • The research question can be broad and exploratory, or it can be about what it is like to have a particular experience (e.g., What is it like to be a working mother diagnosed with depression?).

Again, the choice between the experimental and nonexperimental approaches is generally dictated by the nature of the research question. If it is about a causal relationship and involves an independent variable that can be manipulated, the experimental approach is typically preferred. Otherwise, the nonexperimental approach is preferred. But the two approaches can also be used to address the same research question in complementary ways. For example, nonexperimental studies establishing that there is a relationship between watching violent television and aggressive behaviour have been complemented by experimental studies confirming that the relationship is a causal one (Bushman & Huesmann, 2001) [1]. Similarly, after his original study, Milgram conducted experiments to explore the factors that affect obedience. He manipulated several independent variables, such as the distance between the experimenter and the participant, the distance between the participant and the confederate, and the location of the study (Milgram, 1974) [2].

Types of Nonexperimental Research

Nonexperimental research falls into three broad categories: single-variable research, correlational and quasi-experimental research, and qualitative research. First, research can be nonexperimental because it focuses on a single variable rather than a statistical relationship between two variables. Although there is no widely shared term for this kind of research, we will call it single-variable research. Milgram’s original obedience study was nonexperimental in this way. He was primarily interested in one variable (the extent to which participants obeyed the researcher when he told them to shock the confederate), and he observed all participants performing the same task under the same conditions. The study by Loftus and Pickrell described at the beginning of this chapter is also a good example of single-variable research. The variable was whether participants “remembered” having experienced mildly traumatic childhood events (e.g., getting lost in a shopping mall) that they had not actually experienced but that the researchers asked them about repeatedly. In this particular study, nearly a third of the participants “remembered” at least one event. (As with Milgram’s original study, this study inspired several later experiments on the factors that affect false memories.)

As these examples make clear, single-variable research can answer interesting and important questions. What it cannot do, however, is answer questions about statistical relationships between variables. This is a point that beginning researchers sometimes miss. Imagine, for example, a group of research methods students interested in the relationship between children’s being the victim of bullying and the children’s self-esteem. The first thing that is likely to occur to these researchers is to obtain a sample of middle-school students who have been bullied and then to measure their self-esteem. But this design would be a single-variable study with self-esteem as the only variable. Although it would tell the researchers something about the self-esteem of children who have been bullied, it would not tell them what they really want to know, which is how the self-esteem of children who have been bullied compares with the self-esteem of children who have not. Is it lower? Is it the same? Could it even be higher? To answer this question, their sample would also have to include middle-school students who have not been bullied, thereby introducing another variable.
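The bullying example can be made concrete with a small sketch. The scores below are invented purely for illustration (they are not data from any study), and the function names are hypothetical. The point is only that a single group supports a description, whereas a comparison requires a second group and therefore a second variable.

```python
from statistics import mean

# Hypothetical self-esteem scores (higher = more self-esteem), invented for illustration.
bullied = [18, 22, 20, 17, 23]
not_bullied = [24, 26, 21, 25, 24]

def describe(scores):
    """Single-variable summary: what one group can tell us on its own."""
    return {"n": len(scores), "mean": mean(scores)}

def mean_difference(group_a, group_b):
    """Comparison across two levels of a second variable (bullied vs. not bullied)."""
    return mean(group_a) - mean(group_b)

# A single-variable study stops here: a description of bullied children only.
single_variable_result = describe(bullied)

# Answering "how does it compare?" needs the second group as well.
comparison_result = mean_difference(bullied, not_bullied)

print(single_variable_result)
print(comparison_result)
```

Note that `describe(bullied)` alone gives no way to say whether the mean is low, typical, or high; only the second call addresses the students’ actual question.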

Research can also be nonexperimental because it focuses on a statistical relationship between two variables but does not include the manipulation of an independent variable, random assignment of participants to conditions or orders of conditions, or both. This kind of research takes two basic forms: correlational research and quasi-experimental research. In correlational research, the researcher measures the two variables of interest with little or no attempt to control extraneous variables and then assesses the relationship between them. A research methods student who finds out whether each of several middle-school students has been bullied and then measures each student’s self-esteem is conducting correlational research. In quasi-experimental research, the researcher manipulates an independent variable but does not randomly assign participants to conditions or orders of conditions. For example, a researcher might start an antibullying program (a kind of treatment) at one school and compare the incidence of bullying at that school with the incidence at a similar school that has no antibullying program.

The final way in which research can be nonexperimental is that it can be qualitative. The types of research we have discussed so far are all quantitative, referring to the fact that the data consist of numbers that are analyzed using statistical techniques. In  qualitative research , the data are usually nonnumerical and therefore cannot be analyzed using statistical techniques. Rosenhan’s study of the experience of people in a psychiatric ward was primarily qualitative. The data were the notes taken by the “pseudopatients”—the people pretending to have heard voices—along with their hospital records. Rosenhan’s analysis consists mainly of a written description of the experiences of the pseudopatients, supported by several concrete examples. To illustrate the hospital staff’s tendency to “depersonalize” their patients, he noted, “Upon being admitted, I and other pseudopatients took the initial physical examinations in a semipublic room, where staff members went about their own business as if we were not there” (Rosenhan, 1973, p. 256). [3] Qualitative data has a separate set of analysis tools depending on the research question. For example, thematic analysis would focus on themes that emerge in the data or conversation analysis would focus on the way the words were said in an interview or focus group.

Internal Validity Revisited

Recall that internal validity is the extent to which the design of a study supports the conclusion that changes in the independent variable caused any observed differences in the dependent variable.  Figure 7.1  shows how experimental, quasi-experimental, and correlational research vary in terms of internal validity. Experimental research tends to be highest because it addresses the directionality and third-variable problems through manipulation and the control of extraneous variables through random assignment. If the average score on the dependent variable in an experiment differs across conditions, it is quite likely that the independent variable is responsible for that difference. Correlational research is lowest because it fails to address either problem. If the average score on the dependent variable differs across levels of the independent variable, it  could  be that the independent variable is responsible, but there are other interpretations. In some situations, the direction of causality could be reversed. In others, there could be a third variable that is causing differences in both the independent and dependent variables. Quasi-experimental research is in the middle because the manipulation of the independent variable addresses some problems, but the lack of random assignment and experimental control fails to address others. Imagine, for example, that a researcher finds two similar schools, starts an antibullying program in one, and then finds fewer bullying incidents in that “treatment school” than in the “control school.” There is no directionality problem because clearly the number of bullying incidents did not determine which school got the program. However, the lack of random assignment of children to schools could still mean that students in the treatment school differed from students in the control school in some other way that could explain the difference in bullying.
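The third-variable problem described above can be illustrated with a short simulation. This is a minimal sketch under an assumed model, not an analysis of real data: a hypothetical third variable `z` influences both `x` and `y`, while `x` has no effect on `y` at all. The variable names, sample size, and noise levels are all assumptions chosen for illustration.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 5000

# Assumed model: z drives both x and y; x never enters the equation for y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

# x and y are clearly correlated even though neither causes the other.
r_xy = pearson_r(x, y)

# Removing z's contribution from each variable leaves only independent noise.
r_without_z = pearson_r(
    [xi - zi for xi, zi in zip(x, z)],
    [yi - zi for yi, zi in zip(y, z)],
)

print(round(r_xy, 2), round(r_without_z, 2))
```

In a correlational study the researcher sees only `x` and `y`, so the sizeable `r_xy` is all there is to interpret. In the simulation the third variable is known, and removing it makes the relationship vanish, which is exactly the alternative explanation that a correlational design cannot rule out.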
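
Figure 7.1 Internal Validity of Correlational, Quasi-Experimental, and Experimental Studies. Experiments are generally high in internal validity, quasi-experiments lower, and correlational studies lower still.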

Notice also in Figure 7.1 that there is some overlap in the internal validity of experiments, quasi-experiments, and correlational studies. For example, a poorly designed experiment that includes many confounding variables can be lower in internal validity than a well-designed quasi-experiment with no obvious confounding variables. Internal validity is also only one of several validities that one might consider, as noted in Chapter 5.

Key Takeaways

  • Nonexperimental research is research that lacks the manipulation of an independent variable, control of extraneous variables through random assignment, or both.
  • There are three broad types of nonexperimental research. Single-variable research focuses on a single variable rather than a relationship between variables. Correlational and quasi-experimental research focus on a statistical relationship but lack manipulation or random assignment. Qualitative research focuses on broader research questions, typically involves collecting large amounts of data from a small number of participants, and analyses the data nonstatistically.
  • In general, experimental research is high in internal validity, correlational research is low in internal validity, and quasi-experimental research is in between.

Discussion: For each of the following studies, decide which type of research design it is and explain why.

  • A researcher conducts detailed interviews with unmarried teenage fathers to learn about how they feel and what they think about their role as fathers and summarizes their feelings in a written narrative.
  • A researcher measures the impulsivity of a large sample of drivers and looks at the statistical relationship between this variable and the number of traffic tickets the drivers have received.
  • A researcher randomly assigns patients with low back pain either to a treatment involving hypnosis or to a treatment involving exercise. She then measures their level of low back pain after 3 months.
  • A college instructor gives weekly quizzes to students in one section of his course but no weekly quizzes to students in another section to see whether this has an effect on their test performance.

[1] Bushman, B. J., & Huesmann, L. R. (2001). Effects of televised violence on aggression. In D. Singer & J. Singer (Eds.), Handbook of children and the media (pp. 223–254). Thousand Oaks, CA: Sage.
[2] Milgram, S. (1974). Obedience to authority: An experimental view. New York, NY: Harper & Row.
[3] Rosenhan, D. L. (1973). On being sane in insane places. Science, 179, 250–258.

Nonexperimental research: Research that lacks the manipulation of an independent variable, random assignment of participants to conditions or orders of conditions, or both.

Single-variable research: Research that focuses on a single variable rather than a statistical relationship between two variables.

Correlational research: The researcher measures the two variables of interest with little or no attempt to control extraneous variables and then assesses the relationship between them.

Quasi-experimental research: The researcher manipulates an independent variable but does not randomly assign participants to conditions or orders of conditions.

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.

6.1 Overview of Non-Experimental Research

Learning Objectives

  • Define non-experimental research, distinguish it clearly from experimental research, and give several examples.
  • Explain when a researcher might choose to conduct non-experimental research as opposed to experimental research.

What Is Non-Experimental Research?

Non-experimental research is research that lacks the manipulation of an independent variable. Rather than manipulating an independent variable, researchers conducting non-experimental research simply measure variables as they naturally occur (in the lab or real world).

Most researchers in psychology consider the distinction between experimental and non-experimental research to be an extremely important one. This is because although experimental research can provide strong evidence that changes in an independent variable cause differences in a dependent variable, non-experimental research generally cannot. As we will see, however, this inability to make causal conclusions does not mean that non-experimental research is less important than experimental research.

When to Use Non-Experimental Research

As we saw in the last chapter, experimental research is appropriate when the researcher has a specific research question or hypothesis about a causal relationship between two variables, and it is possible, feasible, and ethical to manipulate the independent variable. It stands to reason, therefore, that non-experimental research is appropriate (even necessary) when these conditions are not met. There are many situations in which non-experimental research is preferred, including when:

  • the research question or hypothesis relates to a single variable rather than a statistical relationship between two variables (e.g., how accurate are people’s first impressions?).
  • the research question pertains to a non-causal statistical relationship between variables (e.g., is there a correlation between verbal intelligence and mathematical intelligence?).
  • the research question is about a causal relationship, but the independent variable cannot be manipulated or participants cannot be randomly assigned to conditions or orders of conditions for practical or ethical reasons (e.g., does damage to a person’s hippocampus impair the formation of long-term memory traces?).
  • the research question is broad and exploratory, or is about what it is like to have a particular experience (e.g., what is it like to be a working mother diagnosed with depression?).

Again, the choice between the experimental and non-experimental approaches is generally dictated by the nature of the research question. Recall the three goals of science are to describe, to predict, and to explain. If the goal is to explain and the research question pertains to causal relationships, then the experimental approach is typically preferred. If the goal is to describe or to predict, a non-experimental approach will suffice. But the two approaches can also be used to address the same research question in complementary ways. For example, after his original (non-experimental) obedience study, Milgram conducted experiments to explore the factors that affect obedience. He manipulated several independent variables, such as the distance between the experimenter and the participant, the distance between the participant and the confederate, and the location of the study (Milgram, 1974) [1].

Types of Non-Experimental Research

Non-experimental research falls into three broad categories: cross-sectional research, correlational research, and observational research. 

First, cross-sectional research involves comparing two or more pre-existing groups of people. What makes this approach non-experimental is that there is no manipulation of an independent variable and no random assignment of participants to groups. Imagine, for example, that a researcher administers the Rosenberg Self-Esteem Scale to 50 American college students and 50 Japanese college students. Although this “feels” like a between-subjects experiment, it is a cross-sectional study because the researcher did not manipulate the students’ nationalities. As another example, if we wanted to compare the memory test performance of a group of cannabis users with a group of non-users, this would be considered a cross-sectional study because for ethical and practical reasons we would not be able to randomly assign participants to the cannabis user and non-user groups. Rather, we would need to compare these pre-existing groups, which could introduce a selection bias (the groups may differ in other ways that affect their responses on the dependent variable). For instance, cannabis users are more likely to use alcohol and other drugs, and these differences may account for differences in the dependent variable across groups, rather than cannabis use per se.

Cross-sectional designs are commonly used by developmental psychologists who study aging and by researchers interested in sex differences. Using this design, developmental psychologists compare groups of people of different ages (e.g., young adults aged 18 to 25 versus older adults aged 60 to 75) on various dependent variables (e.g., memory, depression, life satisfaction). Of course, the primary limitation of using this design to study the effects of aging is that differences between the groups other than age may account for differences in the dependent variable. For instance, differences between the groups may reflect the generation that people come from (a cohort effect) rather than a direct effect of age. For this reason, longitudinal studies, in which one group of people is followed as they age, offer a superior means of studying the effects of aging. As noted, cross-sectional designs are also commonly used to study sex differences. Since researchers cannot practically or ethically manipulate the sex of their participants, they must rely on cross-sectional designs to compare groups of men and women on different outcomes (e.g., verbal ability, substance use, depression). Using these designs, researchers have discovered that men are more likely than women to suffer from substance abuse problems, while women are more likely than men to suffer from depression. But with this design it is unclear what is causing these differences; for example, it is unclear whether they are due to environmental factors like socialization or biological factors like hormones.

When researchers use a participant characteristic to create groups (nationality, cannabis use, age, sex), the independent variable is usually referred to as an experimenter-selected independent variable (as opposed to the experimenter-manipulated independent variables used in experimental research). Figure 6.1 shows data from a hypothetical study on the relationship between whether people make a daily list of things to do (a “to-do list”) and stress. Notice that it is unclear whether this is an experiment or a cross-sectional study because it is unclear whether the independent variable was manipulated by the researcher or simply selected by the researcher. If the researcher randomly assigned some participants to make daily to-do lists and others not to, then the independent variable was experimenter-manipulated and it is a true experiment. If the researcher simply asked participants whether they made daily to-do lists or not, then the independent variable is experimenter-selected and the study is cross-sectional. The distinction is important because if the study was an experiment, then it could be concluded that making the daily to-do lists reduced participants’ stress. But if it was a cross-sectional study, it could only be concluded that these variables are statistically related. Perhaps being stressed has a negative effect on people’s ability to plan ahead. Or perhaps people who are more conscientious are more likely to make to-do lists and less likely to be stressed. The crucial point is that what defines a study as experimental or cross-sectional is not the variables being studied, nor whether the variables are quantitative or categorical, nor the type of graph or statistics used to analyze the data. It is how the study is conducted.

Figure 6.1  Results of a Hypothetical Study on Whether People Who Make Daily To-Do Lists Experience Less Stress Than People Who Do Not Make Such Lists
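The difference between an experimenter-selected and an experimenter-manipulated independent variable can be sketched as a simulation of the to-do-list example. Everything here is an assumption made for illustration: the function name `simulate`, the sample size, and above all the model, in which conscientiousness lowers stress and making a to-do list has no effect whatsoever.

```python
import random
from statistics import mean

def simulate(assignment, n=5000, seed=1):
    """Return mean stress of list makers minus mean stress of non-makers.

    assignment: "selected" (conscientious people choose to make lists)
                or "random" (a coin flip decides who makes lists).
    """
    rng = random.Random(seed)
    list_makers, non_makers = [], []
    for _ in range(n):
        conscientiousness = rng.gauss(0, 1)
        # Assumed model: stress depends on conscientiousness only.
        # To-do lists have zero true effect on stress.
        stress = -conscientiousness + rng.gauss(0, 1)
        if assignment == "random":
            makes_list = rng.random() < 0.5      # experimenter-manipulated
        else:
            makes_list = conscientiousness > 0   # experimenter-selected
        (list_makers if makes_list else non_makers).append(stress)
    return mean(list_makers) - mean(non_makers)

diff_selected = simulate("selected")
diff_random = simulate("random")

print(round(diff_selected, 2), round(diff_random, 2))
```

Under self-selection the list makers look much less stressed, even though lists do nothing in this model; under random assignment the difference is close to zero. The variables and the statistic are identical in both runs. Only how the groups were formed differs, which is the point of the passage above.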

Second, the most common type of non-experimental research conducted in psychology is correlational research. Correlational research is considered non-experimental because it focuses on the statistical relationship between two variables but does not include the manipulation of an independent variable. More specifically, in correlational research, the researcher measures two continuous variables with little or no attempt to control extraneous variables and then assesses the relationship between them. As an example, a researcher interested in the relationship between self-esteem and school achievement could collect data on students’ self-esteem and their GPAs to see if the two variables are statistically related. Correlational research is very similar to cross-sectional research, and sometimes these terms are used interchangeably. The distinction that will be made in this book is that, rather than comparing two or more pre-existing groups of people as is done with cross-sectional research, correlational research involves correlating two continuous variables (groups are not formed and compared).
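As a rough sketch of what “assessing the relationship” between two continuous variables can look like, the following computes a Pearson correlation coefficient for the self-esteem and GPA example. The scores are invented for illustration only, and Pearson’s r is one common choice of statistic rather than the only one a researcher might use.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    if len(xs) != len(ys):
        raise ValueError("each participant needs a score on both variables")
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: one self-esteem score and one GPA per student.
self_esteem = [15, 18, 21, 24, 27, 30]
gpa = [2.1, 2.8, 2.6, 3.2, 3.0, 3.7]

r = pearson_r(self_esteem, gpa)
print(round(r, 2))
```

Both variables are simply measured for every student; no groups are formed and nothing is manipulated, so even a large value of `r` describes a statistical relationship, not a causal one.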

Third, observational research is non-experimental because it focuses on making observations of behavior in a natural or laboratory setting without manipulating anything. Milgram’s original obedience study was non-experimental in this way. He was primarily interested in the extent to which participants obeyed the researcher when he told them to shock the confederate, and he observed all participants performing the same task under the same conditions. The study by Loftus and Pickrell described at the beginning of this chapter is also a good example of observational research. The variable was whether participants “remembered” having experienced mildly traumatic childhood events (e.g., getting lost in a shopping mall) that they had not actually experienced but that the researchers asked them about repeatedly. In this particular study, nearly a third of the participants “remembered” at least one event. (As with Milgram’s original study, this study inspired several later experiments on the factors that affect false memories.)

The types of research we have discussed so far are all quantitative, referring to the fact that the data consist of numbers that are analyzed using statistical techniques. But as you will learn in this chapter, many observational research studies are more qualitative in nature. In  qualitative research , the data are usually nonnumerical and therefore cannot be analyzed using statistical techniques. Rosenhan’s observational study of the experience of people in a psychiatric ward was primarily qualitative. The data were the notes taken by the “pseudopatients”—the people pretending to have heard voices—along with their hospital records. Rosenhan’s analysis consists mainly of a written description of the experiences of the pseudopatients, supported by several concrete examples. To illustrate the hospital staff’s tendency to “depersonalize” their patients, he noted, “Upon being admitted, I and other pseudopatients took the initial physical examinations in a semi-public room, where staff members went about their own business as if we were not there” (Rosenhan, 1973, p. 256) [2] . Qualitative data has a separate set of analysis tools depending on the research question. For example, thematic analysis would focus on themes that emerge in the data or conversation analysis would focus on the way the words were said in an interview or focus group.

Internal Validity Revisited

Recall that internal validity is the extent to which the design of a study supports the conclusion that changes in the independent variable caused any observed differences in the dependent variable. Figure 6.2 shows how experimental, quasi-experimental, and non-experimental (correlational) research vary in terms of internal validity. Experimental research tends to be highest in internal validity because the use of manipulation (of the independent variable) and control (of extraneous variables) helps to rule out alternative explanations for the observed relationships. If the average score on the dependent variable in an experiment differs across conditions, it is quite likely that the independent variable is responsible for that difference. Non-experimental (correlational) research is lowest in internal validity because these designs fail to use manipulation or control. Quasi-experimental research (which will be described in more detail in a subsequent chapter) is in the middle because it contains some, but not all, of the features of a true experiment. For instance, it may fail to use random assignment to assign participants to groups or fail to use counterbalancing to control for potential order effects. Imagine, for example, that a researcher finds two similar schools, starts an anti-bullying program in one, and then finds fewer bullying incidents in that “treatment school” than in the “control school.” While a comparison is being made with a control condition, the lack of random assignment of children to schools could still mean that students in the treatment school differed from students in the control school in some other way that could explain the difference in bullying (e.g., there may be a selection effect).

Figure 6.2 Internal Validity of Correlational, Quasi-Experimental, and Experimental Studies. Experiments are generally high in internal validity, quasi-experiments lower, and correlational studies lower still.

Notice also in  Figure 6.2  that there is some overlap in the internal validity of experiments, quasi-experiments, and correlational studies. For example, a poorly designed experiment that includes many confounding variables can be lower in internal validity than a well-designed quasi-experiment with no obvious confounding variables. Internal validity is also only one of several validities that one might consider, as noted in Chapter 5.

Key Takeaways

  • Non-experimental research is research that lacks the manipulation of an independent variable.
  • There are three broad types of non-experimental research: cross-sectional research, which compares two or more pre-existing groups of people; correlational research, which focuses on statistical relationships between variables that are measured but not manipulated; and observational research, in which participants are observed and their behavior is recorded without the researcher interfering or manipulating any variables.
  • In general, experimental research is high in internal validity, correlational research is low in internal validity, and quasi-experimental research is in between.

Discussion: For each of the following studies, decide which type of research design it is and explain why.

  • A researcher conducts detailed interviews with unmarried teenage fathers to learn about how they feel and what they think about their role as fathers and summarizes their feelings in a written narrative.
  • A researcher measures the impulsivity of a large sample of drivers and looks at the statistical relationship between this variable and the number of traffic tickets the drivers have received.
  • A researcher randomly assigns patients with low back pain either to a treatment involving hypnosis or to a treatment involving exercise. She then measures their level of low back pain after 3 months.
  • A college instructor gives weekly quizzes to students in one section of his course but no weekly quizzes to students in another section to see whether this has an effect on their test performance.

[1] Milgram, S. (1974). Obedience to authority: An experimental view. New York, NY: Harper & Row.
[2] Rosenhan, D. L. (1973). On being sane in insane places. Science, 179, 250–258.


Overview of Non-Experimental Research

Rajiv S. Jhangiani; I-Chant A. Chiang; Carrie Cuttler; and Dana C. Leighton

Learning Objectives

  • Define non-experimental research, distinguish it clearly from experimental research, and give several examples.
  • Explain when a researcher might choose to conduct non-experimental research as opposed to experimental research.

What Is Non-Experimental Research?

Non-experimental research is research that lacks the manipulation of an independent variable. Rather than manipulating an independent variable, researchers conducting non-experimental research simply measure variables as they naturally occur (in the lab or real world).

Most researchers in psychology consider the distinction between experimental and non-experimental research to be an extremely important one. This is because although experimental research can provide strong evidence that changes in an independent variable cause differences in a dependent variable, non-experimental research generally cannot. As we will see, however, this inability to make causal conclusions does not mean that non-experimental research is less important than experimental research. It is simply used when experimental research cannot be carried out.

When to Use Non-Experimental Research

As we saw in the last chapter, experimental research is appropriate when the researcher has a specific research question or hypothesis about a causal relationship between two variables, and it is possible, feasible, and ethical to manipulate the independent variable. It stands to reason, therefore, that non-experimental research is appropriate (even necessary) when these conditions are not met. There are many situations in which non-experimental research is preferred, including when:

  • the research question or hypothesis relates to a single variable rather than a statistical relationship between two variables (e.g., how accurate are people’s first impressions?).
  • the research question pertains to a non-causal statistical relationship between variables (e.g., is there a correlation between verbal intelligence and mathematical intelligence?).
  • the research question is about a causal relationship, but the independent variable cannot be manipulated or participants cannot be randomly assigned to conditions or orders of conditions for practical or ethical reasons (e.g., does damage to a person’s hippocampus impair the formation of long-term memory traces?).
  • the research question is broad and exploratory, or is about what it is like to have a particular experience (e.g., what is it like to be a working mother diagnosed with depression?).

Again, the choice between the experimental and non-experimental approaches is generally dictated by the nature of the research question. Recall that the three goals of science are to describe, to predict, and to explain. If the goal is to explain and the research question pertains to causal relationships, then the experimental approach is typically preferred. If the goal is to describe or to predict, a non-experimental approach is appropriate. But the two approaches can also be used to address the same research question in complementary ways. For example, in Milgram’s original (non-experimental) obedience study, he was primarily interested in one variable (the extent to which participants obeyed the researcher when he told them to shock the confederate), and he observed all participants performing the same task under the same conditions. However, Milgram subsequently conducted experiments to explore the factors that affect obedience. He manipulated several independent variables, such as the distance between the experimenter and the participant, the distance between the participant and the confederate, and the location of the study (Milgram, 1974) [1].

Types of Non-Experimental Research

Non-experimental research falls into two broad categories: correlational research and observational research. 

The most common type of non-experimental research conducted in psychology is correlational research. Correlational research is considered non-experimental because it focuses on the statistical relationship between two variables but does not include the manipulation of an independent variable. More specifically, in correlational research, the researcher measures two variables with little or no attempt to control extraneous variables and then assesses the relationship between them. As an example, a researcher interested in the relationship between self-esteem and school achievement could collect data on students’ self-esteem and their GPAs to see whether the two variables are statistically related.
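To make the idea of "assessing the relationship" concrete, the following is a minimal sketch of how a researcher might compute a Pearson correlation coefficient for the self-esteem and GPA example. The scores are invented purely for illustration, the helper name `pearson_r` is hypothetical, and Pearson's r is only one of several statistics a researcher might choose.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable does not vary")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data, invented for illustration only (one pair per student).
self_esteem = [12, 15, 18, 20, 22, 25]
gpa = [2.1, 2.8, 2.6, 3.2, 3.0, 3.7]

r = pearson_r(self_esteem, gpa)
print(f"r = {r:.2f}")
```

Note that both variables are simply measured; nothing is manipulated. A large value of r would show that the variables are statistically related, but it would not show that one causes the other.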

Observational research is non-experimental because it focuses on making observations of behavior in a natural or laboratory setting without manipulating anything. Milgram’s original obedience study was non-experimental in this way. He was primarily interested in the extent to which participants obeyed the researcher when he told them to shock the confederate, and he observed all participants performing the same task under the same conditions. The study by Loftus and Pickrell described at the beginning of this chapter is also a good example of observational research. The variable was whether participants “remembered” having experienced mildly traumatic childhood events (e.g., getting lost in a shopping mall) that they had not actually experienced but that the researchers asked them about repeatedly. In this particular study, nearly a third of the participants “remembered” at least one event. (As with Milgram’s original study, this study inspired several later experiments on the factors that affect false memories.)

Cross-Sectional, Longitudinal, and Cross-Sequential Studies

When psychologists wish to study change over time (for example, when developmental psychologists wish to study aging), they usually take one of three non-experimental approaches: cross-sectional, longitudinal, or cross-sequential. Cross-sectional studies involve comparing two or more pre-existing groups of people (e.g., children at different stages of development). What makes this approach non-experimental is that there is no manipulation of an independent variable and no random assignment of participants to groups. Using this design, developmental psychologists compare groups of people of different ages (e.g., young adults aged 18–25 years versus older adults aged 60–75 years) on various dependent variables (e.g., memory, depression, life satisfaction). Of course, the primary limitation of using this design to study the effects of aging is that differences between the groups other than age may account for differences in the dependent variable. For instance, differences between the groups may reflect the generation that people come from (a cohort effect) rather than a direct effect of age.

For this reason, longitudinal studies, in which one group of people is followed over time as they age, offer a superior means of studying the effects of aging. However, longitudinal studies are by definition more time-consuming and so require a much greater investment on the part of the researcher and the participants.

A third approach, known as cross-sequential studies, combines elements of both cross-sectional and longitudinal studies. Rather than measuring differences between people in different age groups or following the same people over a long period of time, researchers adopting this approach choose a shorter period of time during which they follow people in different age groups. For example, they might measure changes over a ten-year period among participants who at the start of the study fall into the following age groups: 20 years old, 30 years old, 40 years old, 50 years old, and 60 years old. This design is advantageous because the researcher reaps the immediate benefits of being able to compare the age groups after the first assessment. Further, by following the different age groups over time, they can subsequently determine whether the original differences they found across the age groups are due to true age effects or cohort effects.
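The logic of separating age effects from cohort effects can be sketched by laying out the assessment schedule for the example above. The starting ages come from the text; the choice of exactly two assessments (at year 0 and year 10) and the function names are assumptions made for this illustration.

```python
# Starting ages come from the example in the text. The two assessment
# points (year 0 and year 10) are an assumption made for this sketch.
START_AGES = [20, 30, 40, 50, 60]
WAVES = [0, 10]  # years since the start of the study


def build_schedule(start_ages, waves):
    """Return one record per (cohort, wave) with the age at that assessment."""
    return [
        {"cohort": start, "wave": wave, "age": start + wave}
        for start in start_ages
        for wave in waves
    ]


def same_age_different_cohort(schedule):
    """Pair assessments in which two different cohorts are measured at the same age."""
    pairs = []
    for a in schedule:
        for b in schedule:
            if a["age"] == b["age"] and a["cohort"] < b["cohort"]:
                pairs.append((a, b))
    return pairs


schedule = build_schedule(START_AGES, WAVES)
pairs = same_age_different_cohort(schedule)
for younger, older in pairs:
    print(
        f"age {younger['age']}: cohort starting at {younger['cohort']} "
        f"(year {younger['wave']}) vs. cohort starting at {older['cohort']} "
        f"(year {older['wave']})"
    )
```

Each printed pair holds age constant while the cohort differs, so a difference on the dependent variable within a pair points toward a cohort effect. Comparing one cohort with itself across the two waves is the longitudinal part of the design, and comparing cohorts within a single wave is the cross-sectional part.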

The types of research we have discussed so far are all quantitative, referring to the fact that the data consist of numbers that are analyzed using statistical techniques. But as you will learn in this chapter, many observational research studies are more qualitative in nature. In qualitative research, the data are usually nonnumerical and therefore cannot be analyzed using statistical techniques. Rosenhan’s observational study of the experience of people in psychiatric wards was primarily qualitative. The data were the notes taken by the “pseudopatients” (the people pretending to have heard voices), along with their hospital records. Rosenhan’s analysis consists mainly of a written description of the experiences of the pseudopatients, supported by several concrete examples. To illustrate the hospital staff’s tendency to “depersonalize” their patients, he noted, “Upon being admitted, I and other pseudopatients took the initial physical examinations in a semi-public room, where staff members went about their own business as if we were not there” (Rosenhan, 1973, p. 256) [2]. Qualitative data are analyzed with a separate set of tools, chosen according to the research question. For example, thematic analysis focuses on themes that emerge in the data, whereas conversation analysis focuses on the way words were said in an interview or focus group.

Internal Validity Revisited

Recall that internal validity is the extent to which the design of a study supports the conclusion that changes in the independent variable caused any observed differences in the dependent variable. Figure 6.1 shows how experimental, quasi-experimental, and non-experimental (correlational) research vary in terms of internal validity. Experimental research tends to be highest in internal validity because the use of manipulation (of the independent variable) and control (of extraneous variables) helps to rule out alternative explanations for the observed relationships. If the average score on the dependent variable in an experiment differs across conditions, it is quite likely that the independent variable is responsible for that difference. Non-experimental (correlational) research is lowest in internal validity because these designs fail to use manipulation or control. Quasi-experimental research (which will be described in more detail in a subsequent chapter) falls in the middle because it contains some, but not all, of the features of a true experiment. For instance, it may fail to use random assignment to assign participants to groups or fail to use counterbalancing to control for potential order effects. Imagine, for example, that a researcher finds two similar schools, starts an anti-bullying program in one, and then finds fewer bullying incidents in that “treatment school” than in the “control school.” Although a comparison is being made with a control condition, the inability to randomly assign children to schools could still mean that students in the treatment school differed from students in the control school in some other way that could explain the difference in bullying (e.g., there may be a selection effect).

Figure 6.1 Internal Validity of Correlational, Quasi-Experimental, and Experimental Studies. Experiments are generally high in internal validity, quasi-experiments lower, and correlational studies lower still.

Notice also in Figure 6.1 that there is some overlap in the internal validity of experiments, quasi-experiments, and correlational (non-experimental) studies. For example, a poorly designed experiment that includes many confounding variables can be lower in internal validity than a well-designed quasi-experiment with no obvious confounding variables. Internal validity is also only one of several validities that one might consider, as noted in Chapter 5.

  • Milgram, S. (1974). Obedience to authority: An experimental view. New York, NY: Harper & Row.
  • Rosenhan, D. L. (1973). On being sane in insane places. Science, 179, 250–258.

Non-experimental research: Research that lacks the manipulation of an independent variable.

Correlational research: Research that is non-experimental because it focuses on the statistical relationship between two variables but does not include the manipulation of an independent variable.

Observational research: Research that is non-experimental because it focuses on recording systematic observations of behavior in a natural or laboratory setting without manipulating anything.

Cross-sectional studies: Studies that involve comparing two or more pre-existing groups of people (e.g., children at different stages of development).

Cohort effect: Differences between groups that reflect the generation that people come from rather than a direct effect of age.

Longitudinal studies: Studies in which one group of people is followed over time as they age.

Cross-sequential studies: Studies in which researchers follow people in different age groups over a shorter period of time.

Overview of Non-Experimental Research Copyright © 2022 by Rajiv S. Jhangiani; I-Chant A. Chiang; Carrie Cuttler; and Dana C. Leighton is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.


7.1 Overview of Nonexperimental Research

Learning Objectives

  • Define nonexperimental research, distinguish it clearly from experimental research, and give several examples.
  • Explain when a researcher might choose to conduct nonexperimental research as opposed to experimental research.

What Is Nonexperimental Research?

Nonexperimental research is research that lacks the manipulation of an independent variable, random assignment of participants to conditions or orders of conditions, or both.

In a sense, it is unfair to define this large and diverse set of approaches collectively by what they are not. But doing so reflects the fact that most researchers in psychology consider the distinction between experimental and nonexperimental research to be an extremely important one. This is because while experimental research can provide strong evidence that changes in an independent variable cause differences in a dependent variable, nonexperimental research generally cannot. As we will see, however, this does not mean that nonexperimental research is less important than experimental research or inferior to it in any general sense.

When to Use Nonexperimental Research

As we saw in Chapter 6 “Experimental Research,” experimental research is appropriate when the researcher has a specific research question or hypothesis about a causal relationship between two variables, and when it is possible, feasible, and ethical to manipulate the independent variable and randomly assign participants to conditions or to orders of conditions. It stands to reason, therefore, that nonexperimental research is appropriate, even necessary, when these conditions are not met. There are many ways in which this can be the case.

  • The research question or hypothesis can be about a single variable rather than a statistical relationship between two variables (e.g., How accurate are people’s first impressions?).
  • The research question can be about a noncausal statistical relationship between variables (e.g., Is there a correlation between verbal intelligence and mathematical intelligence?).
  • The research question can be about a causal relationship, but the independent variable cannot be manipulated or participants cannot be randomly assigned to conditions or orders of conditions (e.g., Does damage to a person’s hippocampus impair the formation of long-term memory traces?).
  • The research question can be broad and exploratory, or it can be about what it is like to have a particular experience (e.g., What is it like to be a working mother diagnosed with depression?).

Again, the choice between the experimental and nonexperimental approaches is generally dictated by the nature of the research question. If it is about a causal relationship and involves an independent variable that can be manipulated, the experimental approach is typically preferred. Otherwise, the nonexperimental approach is preferred. But the two approaches can also be used to address the same research question in complementary ways. For example, nonexperimental studies establishing that there is a relationship between watching violent television and aggressive behavior have been complemented by experimental studies confirming that the relationship is a causal one (Bushman & Huesmann, 2001). Similarly, after his original study, Milgram conducted experiments to explore the factors that affect obedience. He manipulated several independent variables, such as the distance between the experimenter and the participant, the distance between the participant and the confederate, and the location of the study (Milgram, 1974).

Types of Nonexperimental Research

Nonexperimental research falls into three broad categories: single-variable research, correlational and quasi-experimental research, and qualitative research. First, research can be nonexperimental because it focuses on a single variable rather than a statistical relationship between two variables. Although there is no widely shared term for this kind of research, we will call it single-variable research. Milgram’s original obedience study was nonexperimental in this way. He was primarily interested in one variable (the extent to which participants obeyed the researcher when he told them to shock the confederate), and he observed all participants performing the same task under the same conditions. The study by Loftus and Pickrell described at the beginning of this chapter is also a good example of single-variable research. The variable was whether participants “remembered” having experienced mildly traumatic childhood events (e.g., getting lost in a shopping mall) that they had not actually experienced but that the researchers asked them about repeatedly. In this particular study, nearly a third of the participants “remembered” at least one event. (As with Milgram’s original study, this study inspired several later experiments on the factors that affect false memories.)

As these examples make clear, single-variable research can answer interesting and important questions. What it cannot do, however, is answer questions about statistical relationships between variables. This is a point that beginning researchers sometimes miss. Imagine, for example, a group of research methods students interested in the relationship between children’s being the victim of bullying and the children’s self-esteem. The first thing that is likely to occur to these researchers is to obtain a sample of middle-school students who have been bullied and then to measure their self-esteem. But this would be a single-variable study with self-esteem as the only variable. Although it would tell the researchers something about the self-esteem of children who have been bullied, it would not tell them what they really want to know, which is how the self-esteem of children who have been bullied compares with the self-esteem of children who have not. Is it lower? Is it the same? Could it even be higher? To answer this question, their sample would also have to include middle-school students who have not been bullied.
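The difference between the two designs can be illustrated with a small sketch. All of the scores below are invented purely for illustration, and the variable names are hypothetical. With only the bullied group, the researchers can describe that group; the comparison they actually care about becomes possible only once the second group is included.

```python
from statistics import mean

# Hypothetical self-esteem scores, invented for illustration only.
bullied = [14, 17, 15, 19, 16]
not_bullied = [20, 18, 22, 19, 21]

# Single-variable study: only the bullied group is sampled, so the
# result is a description of that group and nothing more.
bullied_mean = mean(bullied)
print(f"Mean self-esteem of bullied students: {bullied_mean:.1f}")

# Including students who have not been bullied adds a second variable
# (bullied vs. not bullied), which makes a comparison possible.
difference = mean(bullied) - mean(not_bullied)
print(f"Difference (bullied minus not bullied): {difference:.1f}")
```

Even with both groups, the comparison remains nonexperimental, because bullying is measured rather than manipulated.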

Research can also be nonexperimental because it focuses on a statistical relationship between two variables but does not include the manipulation of an independent variable, random assignment of participants to conditions or orders of conditions, or both. This kind of research takes two basic forms: correlational research and quasi-experimental research. In correlational research, the researcher measures the two variables of interest with little or no attempt to control extraneous variables and then assesses the relationship between them. A research methods student who finds out whether each of several middle-school students has been bullied and then measures each student’s self-esteem is conducting correlational research. In quasi-experimental research, the researcher manipulates an independent variable but does not randomly assign participants to conditions or orders of conditions. For example, a researcher might start an antibullying program (a kind of treatment) at one school and compare the incidence of bullying at that school with the incidence at a similar school that has no antibullying program.

The final way in which research can be nonexperimental is that it can be qualitative. The types of research we have discussed so far are all quantitative, referring to the fact that the data consist of numbers that are analyzed using statistical techniques. In qualitative research, the data are usually nonnumerical and are analyzed using nonstatistical techniques. Rosenhan’s study of the experience of people in a psychiatric ward was primarily qualitative. The data were the notes taken by the “pseudopatients” (the people pretending to have heard voices), along with their hospital records. Rosenhan’s analysis consists mainly of a written description of the experiences of the pseudopatients, supported by several concrete examples. To illustrate the hospital staff’s tendency to “depersonalize” their patients, he noted, “Upon being admitted, I and other pseudopatients took the initial physical examinations in a semipublic room, where staff members went about their own business as if we were not there” (Rosenhan, 1973, p. 256).

Internal Validity Revisited

Recall that internal validity is the extent to which the design of a study supports the conclusion that changes in the independent variable caused any observed differences in the dependent variable. Figure 7.1 shows how experimental, quasi-experimental, and correlational research vary in terms of internal validity. Experimental research tends to be highest because it addresses the directionality and third-variable problems through manipulation and the control of extraneous variables through random assignment. If the average score on the dependent variable in an experiment differs across conditions, it is quite likely that the independent variable is responsible for that difference. Correlational research is lowest because it fails to address either problem. If the average score on the dependent variable differs across levels of the independent variable, it could be that the independent variable is responsible, but there are other interpretations. In some situations, the direction of causality could be reversed. In others, there could be a third variable that is causing differences in both the independent and dependent variables. Quasi-experimental research is in the middle because the manipulation of the independent variable addresses some problems, but the lack of random assignment and experimental control fails to address others. Imagine, for example, that a researcher finds two similar schools, starts an antibullying program in one, and then finds fewer bullying incidents in that “treatment school” than in the “control school.” There is no directionality problem because clearly the number of bullying incidents did not determine which school got the program. However, the lack of random assignment of children to schools could still mean that students in the treatment school differed from students in the control school in some other way that could explain the difference in bullying.

Figure 7.1 Experiments are generally high in internal validity, quasi-experiments lower, and correlational studies lower still.

Notice also in Figure 7.1 that there is some overlap in the internal validity of experiments, quasi-experiments, and correlational studies. For example, a poorly designed experiment that includes many confounding variables can be lower in internal validity than a well-designed quasi-experiment with no obvious confounding variables.

Key Takeaways

  • Nonexperimental research is research that lacks the manipulation of an independent variable, control of extraneous variables through random assignment, or both.
  • There are three broad types of nonexperimental research. Single-variable research focuses on a single variable rather than a relationship between variables. Correlational and quasi-experimental research focus on a statistical relationship but lack manipulation or random assignment. Qualitative research focuses on broader research questions, typically involves collecting large amounts of data from a small number of participants, and analyzes the data nonstatistically.
  • In general, experimental research is high in internal validity, correlational research is low in internal validity, and quasi-experimental research is in between.

Discussion: For each of the following studies, decide which type of research design it is and explain why.

  • A researcher conducts detailed interviews with unmarried teenage fathers to learn about how they feel and what they think about their role as fathers and summarizes their feelings in a written narrative.
  • A researcher measures the impulsivity of a large sample of drivers and looks at the statistical relationship between this variable and the number of traffic tickets the drivers have received.
  • A researcher randomly assigns patients with low back pain either to a treatment involving hypnosis or to a treatment involving exercise. She then measures their level of low back pain after 3 months.
  • A college instructor gives weekly quizzes to students in one section of his course but no weekly quizzes to students in another section to see whether this has an effect on their test performance.

Bushman, B. J., & Huesmann, L. R. (2001). Effects of televised violence on aggression. In D. Singer & J. Singer (Eds.), Handbook of children and the media (pp. 223–254). Thousand Oaks, CA: Sage.

Milgram, S. (1974). Obedience to authority: An experimental view. New York, NY: Harper & Row.

Rosenhan, D. L. (1973). On being sane in insane places. Science, 179, 250–258.

Research Methods in Psychology Copyright © 2016 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.

Non-Experimental Research

28 Overview of Non-Experimental Research

Learning objectives.

  • Define non-experimental research, distinguish it clearly from experimental research, and give several examples.
  • Explain when a researcher might choose to conduct non-experimental research as opposed to experimental research.

What Is Non-Experimental Research?

Non-experimental research  is research that lacks the manipulation of an independent variable. Rather than manipulating an independent variable, researchers conducting non-experimental research simply measure variables as they naturally occur (in the lab or real world).

Most researchers in psychology consider the distinction between experimental and non-experimental research to be an extremely important one. This is because although experimental research can provide strong evidence that changes in an independent variable cause differences in a dependent variable, non-experimental research generally cannot. As we will see, however, this inability to make causal conclusions does not mean that non-experimental research is less important than experimental research. It is simply used in cases where experimental research is not able to be carried out.

When to Use Non-Experimental Research

As we saw in the last chapter , experimental research is appropriate when the researcher has a specific research question or hypothesis about a causal relationship between two variables—and it is possible, feasible, and ethical to manipulate the independent variable. It stands to reason, therefore, that non-experimental research is appropriate—even necessary—when these conditions are not met. There are many times in which non-experimental research is preferred, including when:

  • the research question or hypothesis relates to a single variable rather than a statistical relationship between two variables (e.g., how accurate are people’s first impressions?).
  • the research question pertains to a non-causal statistical relationship between variables (e.g., is there a correlation between verbal intelligence and mathematical intelligence?).
  • the research question is about a causal relationship, but the independent variable cannot be manipulated or participants cannot be randomly assigned to conditions or orders of conditions for practical or ethical reasons (e.g., does damage to a person’s hippocampus impair the formation of long-term memory traces?).
  • the research question is broad and exploratory, or is about what it is like to have a particular experience (e.g., what is it like to be a working mother diagnosed with depression?).

Again, the choice between the experimental and non-experimental approaches is generally dictated by the nature of the research question. Recall the three goals of science are to describe, to predict, and to explain. If the goal is to explain and the research question pertains to causal relationships, then the experimental approach is typically preferred. If the goal is to describe or to predict, a non-experimental approach is appropriate. But the two approaches can also be used to address the same research question in complementary ways. For example, in Milgram’s original (non-experimental) obedience study, he was primarily interested in one variable—the extent to which participants obeyed the researcher when he told them to shock the confederate—and he observed all participants performing the same task under the same conditions. However,  Milgram subsequently conducted experiments to explore the factors that affect obedience. He manipulated several independent variables, such as the distance between the experimenter and the participant, the participant and the confederate, and the location of the study (Milgram, 1974) [1] .

Types of Non-Experimental Research

Non-experimental research falls into two broad categories: correlational research and observational research. 

The most common type of non-experimental research conducted in psychology is correlational research. Correlational research is considered non-experimental because it focuses on the statistical relationship between two variables but does not include the manipulation of an independent variable. More specifically, in correlational research , the researcher measures two variables with little or no attempt to control extraneous variables and then assesses the relationship between them. As an example, a researcher interested in the relationship between self-esteem and school achievement could collect data on students’ self-esteem and their GPAs to see if the two variables are statistically related.

Observational research  is non-experimental because it focuses on making observations of behavior in a natural or laboratory setting without manipulating anything. Milgram’s original obedience study was non-experimental in this way. He was primarily interested in the extent to which participants obeyed the researcher when he told them to shock the confederate and he observed all participants performing the same task under the same conditions. The study by Loftus and Pickrell described at the beginning of this chapter is also a good example of observational research. The variable was whether participants “remembered” having experienced mildly traumatic childhood events (e.g., getting lost in a shopping mall) that they had not actually experienced but that the researchers asked them about repeatedly. In this particular study, nearly a third of the participants “remembered” at least one event. (As with Milgram’s original study, this study inspired several later experiments on the factors that affect false memories).

Cross-Sectional, Longitudinal, and Cross-Sequential Studies

When psychologists wish to study change over time (for example, when developmental psychologists wish to study aging) they usually take one of three non-experimental approaches: cross-sectional, longitudinal, or cross-sequential. Cross-sectional studies involve comparing two or more pre-existing groups of people (e.g., children at different stages of development). What makes this approach non-experimental is that there is no manipulation of an independent variable and no random assignment of participants to groups. Using this design, developmental psychologists compare groups of people of different ages (e.g., young adults spanning from 18-25 years of age versus older adults spanning 60-75 years of age) on various dependent variables (e.g., memory, depression, life satisfaction). Of course, the primary limitation of using this design to study the effects of aging is that differences between the groups other than age may account for differences in the dependent variable. For instance, differences between the groups may reflect the generation that people come from (a cohort effect ) rather than a direct effect of age. For this reason, longitudinal studies , in which one group of people is followed over time as they age, offer a superior means of studying the effects of aging. However, longitudinal studies are by definition more time consuming and so require a much greater investment on the part of the researcher and the participants. A third approach, known as cross-sequential studies , combines elements of both cross-sectional and longitudinal studies. Rather than measuring differences between people in different age groups or following the same people over a long period of time, researchers adopting this approach choose a smaller period of time during which they follow people in different age groups. 
For example, they might measure changes over a ten-year period among participants who at the start of the study fall into the following age groups: 20 years old, 30 years old, 40 years old, 50 years old, and 60 years old. This design is advantageous because the researcher reaps the immediate benefits of being able to compare the age groups after the first assessment. Further, by following the different age groups over time, they can subsequently determine whether the original differences they found across the age groups are due to true age effects or cohort effects.

The types of research we have discussed so far are all quantitative, referring to the fact that the data consist of numbers that are analyzed using statistical techniques. But as you will learn in this chapter, many observational research studies are more qualitative in nature. In  qualitative research , the data are usually nonnumerical and therefore cannot be analyzed using statistical techniques. Rosenhan’s observational study of the experience of people in psychiatric wards was primarily qualitative. The data were the notes taken by the “pseudopatients”—the people pretending to have heard voices—along with their hospital records. Rosenhan’s analysis consists mainly of a written description of the experiences of the pseudopatients, supported by several concrete examples. To illustrate the hospital staff’s tendency to “depersonalize” their patients, he noted, “Upon being admitted, I and other pseudopatients took the initial physical examinations in a semi-public room, where staff members went about their own business as if we were not there” (Rosenhan, 1973, p. 256) [2] . Qualitative data has a separate set of analysis tools depending on the research question. For example, thematic analysis would focus on themes that emerge in the data or conversation analysis would focus on the way the words were said in an interview or focus group.

Internal Validity Revisited

Recall that internal validity is the extent to which the design of a study supports the conclusion that changes in the independent variable caused any observed differences in the dependent variable. Figure 6.1 shows how experimental, quasi-experimental, and non-experimental (correlational) research vary in terms of internal validity. Experimental research tends to be highest in internal validity because the use of manipulation (of the independent variable) and control (of extraneous variables) helps to rule out alternative explanations for the observed relationships. If the average score on the dependent variable in an experiment differs across conditions, it is quite likely that the independent variable is responsible for that difference. Non-experimental (correlational) research is lowest in internal validity because these designs fail to use manipulation or control. Quasi-experimental research (which will be described in more detail in a subsequent chapter) falls in the middle because it contains some, but not all, of the features of a true experiment. For instance, it may fail to use random assignment to assign participants to groups or fail to use counterbalancing to control for potential order effects. Imagine, for example, that a researcher finds two similar schools, starts an anti-bullying program in one, and then finds fewer bullying incidents in that “treatment school” than in the “control school.” While a comparison is being made with a control condition, the inability to randomly assign children to schools could still mean that students in the treatment school differed from students in the control school in some other way that could explain the difference in bullying (e.g., there may be a selection effect).

Figure 6.1 Internal Validity of Correlational, Quasi-Experimental, and Experimental Studies. Experiments are generally high in internal validity, quasi-experiments lower, and correlational studies lower still.

Notice also in  Figure 6.1 that there is some overlap in the internal validity of experiments, quasi-experiments, and correlational (non-experimental) studies. For example, a poorly designed experiment that includes many confounding variables can be lower in internal validity than a well-designed quasi-experiment with no obvious confounding variables. Internal validity is also only one of several validities that one might consider, as noted in Chapter 5.

  • Milgram, S. (1974). Obedience to authority: An experimental view . New York, NY: Harper & Row. ↵
  • Rosenhan, D. L. (1973). On being sane in insane places. Science, 179 , 250–258. ↵

Non-experimental research: Research that lacks the manipulation of an independent variable.

Correlational research: Research that is non-experimental because it focuses on the statistical relationship between two variables but does not include the manipulation of an independent variable.

Observational research: Research that is non-experimental because it focuses on recording systematic observations of behavior in a natural or laboratory setting without manipulating anything.

Cross-sectional studies: Studies that involve comparing two or more pre-existing groups of people (e.g., children at different stages of development).

Cohort effect: Differences between the groups that reflect the generation that people come from rather than a direct effect of age.

Longitudinal studies: Studies in which one group of people is followed over time as they age.

Cross-sequential studies: Studies in which researchers follow people in different age groups over a shorter period of time.

Research Methods in Psychology Copyright © 2019 by Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler, & Dana C. Leighton is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Social Sci LibreTexts

6: Non-Experimental Research


  • Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler, & Dana C. Leighton
  • Kwantlen Polytechnic U., Washington State U., & Texas A&M U.—Texarkana

In this chapter we look more closely at non-experimental research. We begin with a general definition of non-experimental research, along with a discussion of when and why non-experimental research is more appropriate than experimental research. We then look separately at three important types of non-experimental research: cross-sectional research, correlational research, and observational research.

  • 6.1: Prelude to Nonexperimental Research What do the following classic studies have in common? Stanley Milgram found that about two thirds of his research participants were willing to administer dangerous shocks to another person just because they were told to by an authority figure (Milgram, 1963). Elizabeth Loftus and Jacqueline Pickrell showed that it is relatively easy to “implant” false memories in people by repeatedly asking them about childhood events that did not actually happen to them (Loftus & Pickrell, 1995).
  • 6.2: Overview of Non-Experimental Research Most researchers in psychology consider the distinction between experimental and non-experimental research to be an extremely important one. This is because although experimental research can provide strong evidence that changes in an independent variable cause differences in a dependent variable, non-experimental research generally cannot. As we will see, however, this inability to make causal conclusions does not mean that non-experimental research is less important than experimental research.
  • 6.3: Correlational Research Correlational research is a type of non-experimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are many reasons that researchers interested in statistical relationships between variables would choose to conduct a correlational study rather than an experiment.
  • 6.4: Complex Correlation As we have already seen, researchers conduct correlational studies rather than experiments when they are interested in noncausal relationships or when they are interested in causal relationships but the independent variable cannot be manipulated for practical or ethical reasons. In this section, we look at some approaches to complex correlational research that involve measuring several variables and assessing the relationships among them.
  • 6.5: Qualitative Research Quantitative researchers typically start with a focused research question or hypothesis, collect a small amount of data from a large number of individuals, describe the resulting data using statistical techniques, and draw general conclusions about some large population. Although this method is by far the most common approach to conducting empirical research in psychology, there is an important alternative called qualitative research.
  • 6.6: Observational Research Observational research is used to refer to several different types of non-experimental studies in which behavior is systematically observed and recorded. The goal of observational research is to describe a variable or set of variables. The goal is to obtain a snapshot of specific characteristics of an individual, group, or setting. Observational research is non-experimental because nothing is manipulated or controlled, and as such we cannot arrive at causal conclusions using this approach.
  • 6.7: Non-Experimental Research (Summary) Key Takeaways and Exercises for the chapter on Non-Experimental Research.

Thumbnail: An example of data produced by data dredging, showing a correlation between the number of letters in a spelling bee's winning word (red curve) and the number of people in the United States killed by venomous spiders (black curve). (CC BY 4.0 International; Tyler Vigen - Spurious Correlations).


Nonexperimental research is research that lacks manipulation of an independent variable and/or random assignment of participants to conditions. While the distinction between experimental and nonexperimental research is considered important, it does not mean that nonexperimental research is less important or inferior to experimental research (Price, Jhangiani & Chiang, 2015).

When to use nonexperimental research

Often it is not possible, feasible, and/or ethical to manipulate the independent variable, nor to randomly assign participants to conditions or to orders of conditions. In such cases, nonexperimental research is more appropriate and often necessary. Price, et al. (2015) provide the following examples that demonstrate when the research question is better answered with non-experimental methods:

  • The research question or hypothesis contains a single variable rather than a statistical relationship between two variables (e.g., How accurate are people’s first impressions?).
  • The research question involves a non-causal statistical relationship between variables (e.g., is there a correlation between verbal intelligence and mathematical intelligence?).
  • The research question involves a causal relationship, but the independent variable cannot be manipulated, or participants cannot be randomly assigned to conditions or orders of conditions (e.g., Does damage to a person’s hippocampus impair the formation of long-term memory traces?).
  • The research question is broad and exploratory, or explores a particular experience (e.g., what is it like to be a working mother diagnosed with depression?).

As demonstrated above, it is the nature of the research question that guides the choice between experimental and non-experimental approaches. However, this is not to suggest that a research project cannot contain elements of both an experiment and a non-experiment. For example, nonexperimental studies that establish a relationship between two variables can be explored further in an experimental study to confirm or refute the causal nature of the relationship (Price, Jhangiani & Chiang, 2015).

Types of nonexperimental research

In social sciences it is often the case that a true experimental approach is inappropriate and unethical. For example, conducting a true experiment may require the researcher to deny needed treatment to a patient, which is clearly an ethical issue. Furthermore, it might not be equitable or ethical to provide a large financial or other reward to members of an experimental group, as can occur in a true experiment.

There are three types of non-experimental research: cross-sectional, correlational, and observational. In the following sections we explore each of these three types.

Research Methods, Data Collection and Ethics Copyright © 2020 by Valerie Sheppard is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.



Research Methods for the Social Sciences: An Introduction Copyright © 2020 by Valerie Sheppard is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Using Non-experimental Data to Estimate Treatment Effects

Elizabeth A. Stuart

2 Johns Hopkins Bloomberg School of Public Health, Baltimore

Sue M. Marcus

3 Mount Sinai School of Medicine, New York

Marcela V. Horvitz-Lennon

4 University of Pittsburgh School of Medicine, Department of Psychiatry, Pittsburgh

Robert D. Gibbons

5 University of Illinois at Chicago, Chicago

Sharon-Lise T. Normand

6 Harvard Medical School and Harvard School of Public Health, Boston

While much psychiatric research is based on randomized controlled trials (RCTs), where patients are randomly assigned to treatments, sometimes RCTs are not feasible. This paper describes propensity score approaches, which are increasingly used for estimating treatment effects in non-experimental settings. The primary goal of propensity score methods is to create sets of treated and comparison subjects who look as similar as possible, in essence replicating a randomized experiment, at least with respect to observed patient characteristics. A study to estimate the metabolic effects of antipsychotic medication in a sample of Florida Medicaid beneficiaries with schizophrenia illustrates methods.

Introduction

While much psychiatric research is based on randomized controlled trials (RCTs), where patients are randomly assigned to treatments, sometimes RCTs are not feasible. Ethical concerns might preclude randomization, such as randomizing subjects to smoke, or it may be impractical, such as when the treatment of interest is widely available and commonly used. When RCTs are unethical or infeasible, a carefully constructed non-experimental study can be used to estimate treatment effects. While non-experimental studies are disadvantaged by lack of randomization, the study costs may be lower, the study sample may be broader, and follow-up may be longer, as compared to an RCT ( 1 , 2 ).

The primary challenge for estimation of treatment effects is the identification of subjects who are as similar as possible on all background characteristics other than the treatment of interest. By virtue of randomization, RCTs ensure, on average, the treatment and comparison groups are similar on background characteristics, measured and unmeasured. In non-experimental studies, there is no such guarantee. Treatment and comparison groups may systematically differ on factors that also affect the outcome, a problem referred to as “selection bias.” Selection bias leads to confounding, “a situation in which the estimated intervention effect is biased because of some difference between the comparison groups apart from the planned interventions such as baseline characteristics, prognostic factors, or concomitant interventions. For a factor to be a confounder, it must differ between the comparison groups and predict the outcome of interest” ( 3 ).

Numerous design and analytical strategies are available to account for measured confounders but the major limitation is the potential for unmeasured confounders. Well-designed non-experimental studies make good use of measured confounders by creating treatment groups that look as similar as possible on the measured characteristics. Researchers then assume that, given comparability (or balance) between the groups on measured confounders, there are no measured or unmeasured differences, other than treatment received. This assumption has many names: “unconfounded treatment assignment,” “no hidden bias,” “ignorable treatment assignment,” or “selection on observables” ( 4 – 6 ).

We describe approaches that, through the careful design and analysis of non-experimental studies, create balance between treatment groups. The key idea is to use relatively recently developed techniques, known as propensity score methods, to ensure that the treatment and comparison subjects are as similar as possible. The goal is to replicate a randomized experiment, at least with respect to the measured confounders, by making the treatment and comparison groups look as if they could have been randomly assigned to the groups, in the sense of having similar distributions of the confounders. We describe the five key stages to this process ( Table 1 ). A study that compares atypical and conventional antipsychotic medications with regard to their effect on adverse metabolic outcomes (dyslipidemia, Type II diabetes, and obesity) ( 16 ) illustrates the methods. The study uses data from Florida Medicaid beneficiaries (18 to 64 years), diagnosed with schizophrenia and continuously enrolled from 1997 to 2001. Although the bulk of the evidence on the causal associations of antipsychotics comes from studies using U.S. and U.K. administrative and medical databases, RCTs have been used to assess the metabolic effects of antipsychotic drugs (e.g., 17 , 18 ). Findings of these RCTs, however, are not regarded as representative of the adverse events of these drugs as used in routine practice. A possible exception is the CATIE trial ( 19 ), an effectiveness trial in that other than the randomization, every other aspect of the care was naturalistic. Conduct of this type of trial is costly and generally unfeasible.

Recommended Steps in Analyzing Non-Experimental Studies

I. Defining the Treatment and Comparison Groups

The first step involves clearly specifying the treatment of interest, and identifying individuals who experienced that treatment. One way to address this is to consider what treatment would be randomized if randomization were possible. For example, we could randomly assign patients to receive an atypical medication. We then need to select an appropriate comparison condition. Because this study investigates the metabolic effects of atypical antipsychotics, the relevant question is whether the comparison of interest is another type of medication, no medication, or either. Virtually all subjects with schizophrenia during this time frame are treated with some type of antipsychotic agent and thus the key clinical question is not whether the patient should receive an antipsychotic medication, but rather, which type of antipsychotic medication should be used. We compare atypical antipsychotics (specifically, clozapine, olanzapine, quetiapine, and risperidone) to conventional antipsychotics (specifically, chlorpromazine, trifluoperazine, fluphenazine, perphenazine, thioridazine, haloperidol, and thiothixene). We use Medicaid claims data so that atypical (conventional) antipsychotic users are those subjects who filled at least one prescription for an atypical (conventional) antipsychotic. Prescribing information is unavailable and so only subjects who were written an antipsychotic prescription and filled it are included. As in an intent-to-treat analysis, we only know that the prescription was filled and not whether the medication was actually taken.

The next consideration is identification of confounders: factors that have previously been found to be associated with receipt of atypical antipsychotics and/or with metabolic outcomes. Key confounders in the Medicaid study include demographic and clinical variables, listed in Table 1 , such as sex, age, race, and medical comorbidities. A good study will have a large set of measured confounders so that the assumption of no hidden bias is likely to be satisfied.

Once the treatment group, comparison group, and potential confounders are identified, researchers need to identify data on those groups and the confounders. The particular data elements necessary are: subjects, some of whom received the treatment (atypical antipsychotics) and others the comparison condition (conventional antipsychotics), an indicator for which subject is in which group, potential confounders, and outcomes. Confounders are measured before treatment assignment to ensure that they are not affected by the treatment ( 20 , 21 ) and outcomes are measured after treatment assignment, to ensure temporal ordering. In the Medicaid study, we determined periods during which an individual had some minimal exposure to an antipsychotic drug, at least 6 months of Medicaid enrollment preceding treatment initiation (from which we obtained the covariate information), and a 12-month follow-up period to examine incidence of metabolic outcomes. Often it is not possible to have truly longitudinal data, and researchers instead use cross-sectional data where assumptions regarding the time ordering of the variables being measured are made. We analyze one measurement occasion for each subject, measured 12 months following antipsychotic initiation. See the paper by Marcus et al. in this series for methods for estimating causal effects with multiple outcome occasions ( 22 ).

II. Creating the Groups for Comparison

Table 2 (Columns 1–3) compares the means of the potential confounders between atypical and conventional antipsychotic users. The differences in percentages (for binary variables) or standardized differences (for continuous variables) are also reported. The standardized difference is the difference in means divided by the standard deviation of the confounder among the full set of conventional users ( 1 , 11 , 23 ). We then multiply by 100 to express the difference as a percentage. The conventional users are older on average (by 26% of a standard deviation) and more likely to be African American (34% vs. 24%), as compared to the atypical users. Because of these differences between the groups, comparing the raw outcomes between the two treatment groups would result in bias ( 24 ). Statistical adjustments are required to deal with the differences in the observed confounders.
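The balance measures described above can be sketched in a few lines of Python. This is only an illustration: the function names and the small data lists are invented here and are not values from Table 2, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def standardized_difference(treated, comparison):
    """Difference in means divided by the comparison group's standard
    deviation, multiplied by 100 to express it as a percentage."""
    return 100.0 * (mean(treated) - mean(comparison)) / stdev(comparison)


def percentage_difference(treated, comparison):
    """Difference in percentages for a binary (0/1) confounder."""
    return 100.0 * (mean(treated) - mean(comparison))


# Hypothetical data, invented for illustration only.
atypical_age = [30, 35, 40, 45, 50]
conventional_age = [40, 45, 50, 55, 60]
atypical_flag = [1, 0, 0, 0]        # 25% have the binary characteristic
conventional_flag = [1, 1, 0, 0]    # 50% have the binary characteristic

age_gap = standardized_difference(atypical_age, conventional_age)
flag_gap = percentage_difference(atypical_flag, conventional_flag)
print(round(age_gap, 1), flag_gap)
```

A negative value means the treated (atypical) group has the lower mean or percentage on that confounder.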

Table 2. Characteristics of Individuals Taking Atypical vs. Conventional Antipsychotics

Ideally we want to compare atypical and conventional users who have “exactly” the same values for all the confounders. Assuming no unmeasured confounders, any difference in the outcomes could then be attributed to the treatment. However, exact matching on all of the covariates is often infeasible given the large number of covariates and relatively small number of subjects available. In the Medicaid study, if we were to make each of our 11 confounders binary, we would have 2048 (= $2^{11}$) distinct strata and need to have both atypical and conventional antipsychotic users in each. Because this is not feasible, a reasonable strategy is to make the “distributions” of the confounders similar between the atypical and conventional antipsychotic users (e.g., similar age, similar race, similar chronic medical comorbidity status). There are several general strategies to create comparable groups.

Regression adjustment

A common approach to adjusting for confounders is regression adjustment, whereby the treatment effect is estimated by regressing the outcome of interest on an indicator for the treatment received and the set of confounders. The coefficient on the treatment indicator provides an estimate of the treatment effect ( Table 3 , Column 1). 2 A drawback of this approach is that if the atypical and conventional groups are very different on the observed covariates (e.g., with over a 25% standard deviation difference on average age, as seen in Table 2 ), the regression adjustment relies heavily on the particular model form and extrapolates between the two groups ( 24 , 25 ). Why does this pose a problem? First, the regression approach will provide a prediction of what would have happened to atypical users had they instead used conventional antipsychotics using information from a set of conventional users who are very different from, e.g., older than, those atypical users. Second, in most cases, the regression approach assumes a linear relationship between the measured covariates and the outcome of interest—an assumption that may not be true and is often difficult to test. Third, the output of standard regression analysis provides no information regarding covariate balance between the two treatment groups. Other approaches avoid these problems by ensuring that the comparisons are made between groups that are similar.
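As a rough sketch of regression adjustment, the Python below regresses an outcome on an intercept, a treatment indicator, and one confounder using ordinary least squares, and reads off the coefficient on the treatment indicator. Everything here is an assumption made for illustration: the helper names are hypothetical, the outcome is continuous rather than the binary metabolic outcomes of the study, and the data are constructed so that the true treatment effect is 3.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def regression_adjusted_effect(outcome, treated, confounders):
    """Least-squares fit of outcome on [1, treatment, confounders...];
    returns the coefficient on the treatment indicator."""
    rows = [[1.0, float(t)] + [float(v) for v in x]
            for t, x in zip(treated, confounders)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)[1]


# Hypothetical data: outcome = 2 + 3 * treated + 0.5 * age, with no noise.
treated = [1, 1, 1, 0, 0, 0]
age = [30, 40, 50, 45, 55, 65]
outcome = [2 + 3 * t + 0.5 * a for t, a in zip(treated, age)]

naive = sum(outcome[:3]) / 3 - sum(outcome[3:]) / 3
adjusted = regression_adjusted_effect(outcome, treated, [[a] for a in age])
print(naive, round(adjusted, 6))
```

In this toy data the treated subjects are younger, so the raw difference in means has the wrong sign, while the adjusted coefficient recovers the built-in effect. That recovery depends on the linear model being correct, which is the reliance on model form that the paragraph above warns about.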

Table 3. Estimated Absolute Risk (%) of Adverse Metabolic Outcomes of Atypical Compared to Conventional Antipsychotic Medication Use. P-value in parentheses. Numbers greater than 0 indicate higher risk for individuals taking atypical antipsychotics.

Propensity score methods

A useful tool to achieve comparable confounder distributions is the “propensity score,” defined as the probability of receiving the treatment given the measured covariates ( 6 ). A property of the propensity score makes it possible to select subjects based on their similarity with respect to the propensity score (a single number summary of the covariates, similar to a comorbidity score) in order to achieve comparability on all the measured confounders , rather than having to consider each confounder separately. If a group of subjects have similar propensity scores, then they have similar probabilities of receiving the treatment, given the measured confounders. Within a small range of propensity score values, the atypical and conventional users should only differ randomly on the measured confounders, in essence replicating a randomized experiment.

Because the true propensity score for each subject is unknown, it is estimated with a model, such as a logistic regression, predicting treatment received given the measured confounders. Each subject’s propensity score is their predicted probability of receiving the treatment, generated from the model. The diagnostics for propensity score estimation are not the standard logistic regression diagnostics, as concern is not with the parameter estimates or predictive ability of the model. Rather, the success of a propensity score model (and subsequent matching or stratification procedure) is determined by the covariate balance achieved.
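The estimation step can be illustrated with a small logistic regression fitted by plain gradient ascent. This is a sketch under stated assumptions, not the fitting procedure used in the study: the function names, learning rate, iteration count, and the single standardized covariate are all hypothetical choices made for the example.

```python
import math


def fit_logistic(x_rows, t, learning_rate=0.5, iterations=2000):
    """Fit a logistic regression by gradient ascent on the log-likelihood.
    Returns [intercept, coefficient_1, ...]."""
    k = len(x_rows[0]) + 1
    beta = [0.0] * k
    n = len(t)
    for _ in range(iterations):
        grad = [0.0] * k
        for row, ti in zip(x_rows, t):
            features = [1.0] + list(row)
            z = sum(b * f for b, f in zip(beta, features))
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(k):
                grad[j] += (ti - p) * features[j]
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(x_rows, beta):
    """Predicted probability of receiving the treatment for each subject."""
    scores = []
    for row in x_rows:
        z = beta[0] + sum(b * v for b, v in zip(beta[1:], row))
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores


# Hypothetical standardized ages and treatment indicators (1 = treated).
age_z = [[-1.5], [-1.0], [-0.5], [0.0], [0.5], [1.0], [1.5], [-0.8]]
treated = [1, 1, 0, 1, 0, 1, 0, 1]

beta = fit_logistic(age_z, treated)
scores = propensity_scores(age_z, beta)
print([round(s, 2) for s in scores])
```

Consistent with the text, the fitted coefficients are not the point of the exercise; the scores matter only to the extent that matching or weighting on them produces balanced covariates.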

Nearest neighbor matching

One of the simplest ways of ensuring the comparability of groups is to select for each treated individual the comparison individual with the closest propensity score 3 ( 26 ). We illustrate a 1:1 matching algorithm where one conventional antipsychotic user is selected for each atypical antipsychotic user. Variations on this algorithm include selecting multiple matches for each atypical user, matching atypical users to a variable number of conventional users ( 27 ), and prioritizing certain variables ( 12 ). For example, if there are a large number of potential control subjects relative to the number of treated, it may be possible to get 2 or 3 good matches for each treated individual, which will increase the precision of estimates without sacrificing much balance ( 27 , 28 ). In our study, because the numbers of conventional and atypical users are nearly equal, we used matching with replacement, meaning that each conventional user could be used as a match multiple times ( 29 ).
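A minimal sketch of 1:1 nearest neighbor matching with replacement is shown below. The propensity scores are invented for illustration, and the function name is hypothetical; ties are broken by taking the first closest comparison subject, which is an arbitrary choice.

```python
from collections import Counter


def match_with_replacement(treated_scores, comparison_scores):
    """For each treated subject, return the index of the comparison subject
    with the closest propensity score. A comparison subject may be chosen
    more than once."""
    matches = []
    for score in treated_scores:
        closest = min(
            range(len(comparison_scores)),
            key=lambda j: abs(comparison_scores[j] - score),
        )
        matches.append(closest)
    return matches


# Hypothetical propensity scores, invented for illustration only.
treated_scores = [0.80, 0.75, 0.60, 0.55]
comparison_scores = [0.78, 0.58, 0.20, 0.10]

matches = match_with_replacement(treated_scores, comparison_scores)
times_matched = Counter(matches)
unmatched = [j for j in range(len(comparison_scores)) if j not in times_matched]
print(matches, dict(times_matched), unmatched)
```

Here two comparison subjects are each used twice, and the two with low propensity scores are never selected, so they carry no weight in the outcome analysis.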

Figure 1 Panel A illustrates the resulting matches in the Medicaid study, with 1,809 conventional users matched to the 3,384 atypical users. The x-axis reflects the propensity scores; the y-axis is used to group the subjects into atypical (treated) vs. conventional (control), and matched vs. unmatched; the vertical spread of the symbols within each grouping is done to show the symbols more clearly. The figure shows the relative weight different subjects receive in the analyses of the outcomes, with the relative size of the symbols reflecting the number of times a subject was matched. Thus, conventional users selected as a match multiple times have larger symbols. The goal is to see good “overlap” between the propensity scores of the atypical and conventional users, which we have. However, there are quite a few conventional users with low propensity scores who are left unmatched. This illustrates a common drawback of nearest neighbor matching, in that sometimes subjects are unmatched, including some with propensity scores similar to those in the other group.

Figure 1. Results of 1:1 nearest neighbor matching with replacement and subclassification. Propensity scores on x-axis; y-axis used to group subjects into atypical (treated) vs. conventional (control) and matched vs. unmatched. Matched subjects in black; unmatched in grey. The relative sizes of the diamonds reflect the relative weights subjects receive. Propensity score predicts atypical use given covariates; higher values indicate a higher likelihood of using atypical antipsychotics as compared to conventional antipsychotics. In Panel B, vertical lines indicate subclass dividers.

A second approach, inverse probability of treatment weighting (IPTW), avoids this problem by using data from all subjects ( 9 , 13 , 30 ). The idea of IPTW is similar to that of survey sampling weights, where individuals in a survey sample are weighted by their inverse probabilities of selection so that they represent the full population from which the sample was selected. In our setting we treat each of the treatment groups (the atypical users and the conventional users) as a separate sample, and weight each up to the “population,” which in this case is all study subjects. Each subject receives a weight that is the inverse of the probability of being in the group they are actually in. However, instead of having known survey sampling probabilities, we use the estimated propensity scores. In particular, atypical users are weighted by one over their probability of receiving an atypical antipsychotic (the propensity score), and conventional users are weighted by one over their probability of receiving a conventional antipsychotic (one minus the propensity score). In the Medicaid study, the conventional users with low probabilities of receiving a conventional antipsychotic will receive relatively large weights, because they look more similar to the atypical users and thus provide good information about what would happen to the atypical users if they had instead taken conventional antipsychotics.
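The weights just described can be written out directly. The sketch below uses invented treatment indicators, propensity scores, and outcomes; the weighted difference in outcome proportions is included only to show how the weights might enter a simple comparison, not as the regression model used in the study.

```python
def iptw_weights(treated, ps):
    """Treated subjects get 1/ps; comparison subjects get 1/(1 - ps)."""
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p) for t, p in zip(treated, ps)]

def weighted_risk_difference(treated, outcome, weights):
    """Weighted outcome proportion among treated minus that among comparison subjects."""
    def weighted_mean(group):
        num = sum(w * y for t, y, w in zip(treated, outcome, weights) if t == group)
        den = sum(w for t, w in zip(treated, weights) if t == group)
        return num / den
    return weighted_mean(1) - weighted_mean(0)

# Hypothetical data: 1 = atypical user, 0 = conventional user.
treated = [1, 1, 0, 0]
ps = [0.8, 0.25, 0.8, 0.25]
outcome = [1, 0, 1, 0]

w = iptw_weights(treated, ps)
rd = weighted_risk_difference(treated, outcome, w)
```

Note that the conventional user with a propensity score of 0.8 receives the largest weight, since that subject resembles the atypical users most closely.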

Subclassification

Subclassification, also called stratification, is a method that also uses all subjects, by forming groups (subclasses) of individuals with similar propensity scores ( 31 ). In the Medicaid study the subclasses were created to have approximately the same number of subjects taking atypical antipsychotics (about 565); the number of conventional users in each subclass ranges from 287 to 933 ( Figure 1 Panel B; Table 4 ). Because of the properties of propensity scores described above, within each subclass, the subjects look similar on the measured confounders.
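One way to form such subclasses is to cut the propensity score at quantiles of the treated group's scores, so that each subclass holds about the same number of treated subjects. The sketch below assumes that rule; the function names and the propensity scores are hypothetical.

```python
def subclass_cutpoints(ps_treated, n_subclasses):
    """Cut points that place roughly equal numbers of treated subjects in each subclass."""
    s = sorted(ps_treated)
    n = len(s)
    return [s[(k * n) // n_subclasses] for k in range(1, n_subclasses)]

def assign_subclass(p, cutpoints):
    """Index (0-based) of the subclass containing propensity score p."""
    return sum(p >= c for c in cutpoints)

# Hypothetical propensity scores.
ps_treated = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
ps_control = [0.05, 0.15, 0.25, 0.35, 0.55]

cuts = subclass_cutpoints(ps_treated, 3)
treated_subclass = [assign_subclass(p, cuts) for p in ps_treated]
control_subclass = [assign_subclass(p, cuts) for p in ps_control]
```

With this rule the number of comparison subjects per subclass is free to vary, as it does in the Medicaid study.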

Table 4. Estimated Absolute Risk (%) of Adverse Metabolic Outcomes of Atypical Compared to Conventional Antipsychotic Medication Use Stratified by Propensity Score Subclass. P-value in parentheses. Numbers greater than 0 indicate higher risk for atypical users.

Is it better to match or to stratify/weight? The answer depends on whether the investigator is more concerned about bias or about having enough power to detect an effect. Matching approaches are often used when it is important to reduce differences between treatment groups as much as possible; consequently, not all subjects are used, which reduces the total sample size available to detect differences. While subclassification and weighting retain all subjects (generally yielding some efficiency gain), there is a risk of making comparisons between individuals who are not as alike as desired.

III. Assessing Potential Confounding

How do we know if the atypical and conventional groups are “similar,” at least on the measured covariates? After using one of the approaches described above, the crucial next step is to check the resulting “balance:” the similarity of the confounders between the treatment and comparison groups. Common (and sometimes misguided) measures used for balance checks are standard hypothesis tests, such as t-tests. The danger in using test statistics is that they conflate changes in balance with changes in the sample size; comparing p-values before and after matching can be misleading, implying that balance has improved when in fact it has not ( 1 , 11 ).

A good balance measure, and the one we suggest, is the standardized difference in means. This is most appropriate for continuous variables. A general rule of thumb is that an acceptable standardized difference is less than 10% ( 11 ). Differences larger than 10% roughly imply that 8% or more of the area covered by atypical and conventional users combined is not overlapping. 4 For binary variables the absolute value of the difference in proportions is examined. These measures are generally calculated both in the full dataset ( Table 2 , Column 3), as well as in the dataset after applying one of the propensity score methods described above ( Table 2 , Column 4); if the propensity score method was successful the standardized differences and differences in proportions should be smaller than they were in the original data set. After 1:1 matching ( Table 2 , Column 4) the largest standardized difference is 3%, which is a good situation. Similar balance was achieved with weighting and subclassification. In contrast, the largest standardized difference prior to matching was 26%, which is clearly an unacceptable situation. In some cases adequate balance may not be achieved with the available data. This is an indication that estimating the treatment effect with that data may be unreliable. It may be necessary to add interactions of the measured covariates in the propensity score model, seek additional data sources, or reconsider the question of interest.
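A balance check of this kind might be computed as follows. The text does not say which standard deviation goes in the denominator, so the sketch assumes a pooled standard deviation (the square root of the average of the two group variances); some authors use the treated group's standard deviation instead. The covariate values are invented.

```python
import math
import statistics

def standardized_difference(x_treated, x_control):
    """Difference in means divided by a pooled standard deviation (an assumed choice)."""
    mean_diff = statistics.mean(x_treated) - statistics.mean(x_control)
    pooled_sd = math.sqrt((statistics.variance(x_treated) + statistics.variance(x_control)) / 2)
    return mean_diff / pooled_sd

def difference_in_proportions(b_treated, b_control):
    """Absolute difference in proportions for a binary (0/1) covariate."""
    return abs(sum(b_treated) / len(b_treated) - sum(b_control) / len(b_control))

# Hypothetical covariate values (e.g., age) in each group.
age_treated = [70, 72, 68, 75, 71]
age_control = [65, 66, 70, 64, 67]

std_diff_pct = 100 * standardized_difference(age_treated, age_control)
balanced = abs(std_diff_pct) < 10  # the 10% rule of thumb from the text
```

The same functions would be applied once to the full dataset and once to the matched, weighted, or subclassified data, with the second set of values expected to be smaller.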

IV. Estimating the Treatment Effect

Once adequate balance is achieved, the next step is to estimate the treatment effect. Note that this is the first time that the outcome is used; the propensity score method itself is not selected or implemented using the metabolic outcome measures, beyond the idea of selecting confounders that may be correlated with the outcome(s).

One method of estimating the treatment effect is to regress the outcomes for subjects in the original (unmatched) dataset on the measured confounders. In the antipsychotic study, we estimated a linear regression, where the coefficient of the atypical antipsychotic variable represents the increase (or decrease) in risk for atypical users. The results of this approach are shown in Table 3 , Column 1, where atypical antipsychotic use increases the risk of dyslipidemia and of obesity. This regression is easy to conduct, but has the drawbacks discussed above, particularly when the treatment groups are far apart based on the covariates. However, despite these limitations of regression adjustment in general, in fact, combining it with the propensity score methods described above has been found to be a very effective approach ( 10 , 32 – 34 ), and we use that approach for the remaining methods.

Outcome analysis after 1:1 nearest neighbor matching is very straightforward. With paired data and binary outcomes, a natural method is McNemar’s test. McNemar’s test indicates a statistically significant adverse effect of atypical antipsychotics on obesity (χ 2 = 14.61 on 1 degree of freedom; p = 0.0001): 5% of the 3,384 pairs had discordant outcomes and in 65% of the discordant pairs, the atypical subjects had obesity.
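McNemar's test uses only the discordant pairs. The sketch below assumes the continuity-corrected form of the statistic. The counts are hypothetical: they are chosen to approximate the rounded percentages reported above (about 169 discordant pairs, about 65% with the atypical user obese), so they will not reproduce the published statistic exactly.

```python
import math

def mcnemar_test(b, c):
    """Continuity-corrected McNemar test.

    b: discordant pairs in which only the treated subject has the outcome
    c: discordant pairs in which only the comparison subject has the outcome
    Returns (chi-square statistic, two-sided p-value on 1 degree of freedom).
    """
    chi2 = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Illustrative counts only (assumed, not taken from the study data).
chi2, p_value = mcnemar_test(b=110, c=59)
```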

Alternatively, any analysis that would have been conducted on the full dataset can instead be conducted on the matched dataset ( 10 ). We estimated a regression model with each metabolic outcome predicted by whether someone took an atypical antipsychotic and the measured confounders, using the matched sample. Because the matching was done with replacement, the regression analysis was run using weights to account for that design ( 12 ). We find that atypical antipsychotics increased the risk of obesity, but not dyslipidemia or Type II diabetes ( Table 3 , Column 2), consistent with the results found using McNemar’s test.

After constructing IPTW weights, the effect estimate is obtained by estimating a weighted regression model using the IPTW weights ( 13 ). The results are consistent with those of the standard regression adjustment, indicating increased risk of dyslipidemia and obesity for those taking atypical antipsychotics ( Table 3 , Column 3).

With subclassification, treatment effects are first estimated separately within each subclass. Because of the potential for residual bias when the subclasses are relatively large, it is particularly important to estimate these effects using regression adjustment within each subclass, controlling for the confounders ( 13 ). If the treatment effects are similar across subclasses, it may make sense to combine the subclass-specific estimates to obtain an overall estimate. The results for the antipsychotic study do not indicate substantial treatment differences across subclasses ( Table 4 ). After combining the subclass results by taking a precision-weighted average of the effects within each subclass, we find that the overall effects are similar to those from the simple regression adjustment and from weighting ( Table 3 , Column 4). An advantage of the subclassification approach is that it permits non-linear associations in the effects across the subclasses.
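The precision-weighted average mentioned above weights each subclass estimate by the inverse of its squared standard error. A minimal sketch, using made-up subclass estimates and standard errors rather than the values in Table 4:

```python
import math

def precision_weighted_average(estimates, std_errors):
    """Combine subclass-specific estimates using inverse-variance weights.

    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se

# Hypothetical risk differences (%) and standard errors for six subclasses.
subclass_effects = [2.1, 3.4, 1.8, 2.9, 4.0, 2.5]
subclass_ses = [1.5, 1.2, 1.4, 1.1, 1.6, 1.3]

overall, overall_se = precision_weighted_average(subclass_effects, subclass_ses)
```

Subclasses with more precise estimates pull the overall estimate toward themselves, and the pooled standard error is smaller than any single subclass standard error.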

Selection of matching versus subclassification or weighting involves a bias/variance trade-off. One-to-one matching generally yields more closely matched samples and thus lower bias, but higher variance because of the smaller sample size used. The better balance generally obtained by matching also sometimes yields smaller point estimates of effects. In our example, the lack of a statistically significant finding on dyslipidemia when using 1:1 matching but a significant finding when using other approaches appears to be a result of a combination of these factors. In comparison with the effect on obesity, the effect on dyslipidemia is much weaker: for dyslipidemia, 53% of the discordant pairs had an atypical user with dyslipidemia (χ 2 = 2.613 on 1 degree of freedom; p = 0.11), whereas for obesity, 65% of the discordant pairs had an atypical user with obesity. The discrepancy in results also indicates the value in assessing sensitivity by trying a few different approaches; those that yield the best covariate balance should be used ( 10 ).

V. Assessing Unmeasured Confounding

The final question in any non-experimental study is how sensitive the results are to a potential unmeasured confounder. We illustrate an approach that determines how strongly related to the decision to fill an atypical antipsychotic medication an unmeasured confounder would have to be to make the observed effect go away (i.e., lose statistical significance) ( 35 ). We illustrate the approach using the matched pairs from 1:1 matching and the obesity outcome. Table 5 indicates that for two subjects who appear similar on the measured covariates, if their odds of filling an atypical antipsychotic medication differ by a factor of 1.5 or larger, then the treatment effect becomes statistically insignificant. The size of these odds needs to be interpreted in the context of the particular problem. In our analyses, the largest observed odds ratio was 1.75 (95% CI: 1.55, 1.98), reflecting an increased odds of receiving an atypical antipsychotic for white subjects relative to black subjects. Given the size of this observed odds ratio, the small number of confounders available in the data, and the sensitivity of the results at an odds of 1.5, we are cautious in concluding that atypical antipsychotic use increases the risk of obesity compared to conventional antipsychotic use. These results need to be replicated in other studies.
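For matched pairs with a binary outcome, a sensitivity analysis of this kind is commonly computed as a Rosenbaum-style bound: if hidden bias could change the odds of treatment within a pair by at most a factor Γ, the worst-case one-sided p-value is a binomial tail probability with success probability Γ/(1 + Γ) among the discordant pairs. The source does not spell out its computation, so the sketch below is an assumption about the method, and the counts are the same illustrative approximations used for McNemar's test rather than the study data.

```python
import math

def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def sensitivity_p_value(n_discordant, n_treated_with_outcome, gamma):
    """Upper bound on the one-sided p-value when hidden bias may multiply
    the within-pair odds of treatment by up to gamma."""
    p_plus = gamma / (1.0 + gamma)
    return binomial_upper_tail(n_discordant, n_treated_with_outcome, p_plus)

# Illustrative counts only (assumed): 169 discordant pairs, 110 with the treated subject obese.
bounds = {g: sensitivity_p_value(169, 110, g) for g in (1.0, 1.25, 1.5, 1.75)}
```

With these assumed counts the bound is far below .05 at Γ = 1 (no hidden bias) and above .05 at Γ = 1.5, so the conclusion would change somewhere in between.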

Table 5. Sensitivity of atypical antipsychotic effect on obesity to an unmeasured confounder. Sensitivity parameter represents the odds by which individuals with the same measured confounders differ in receiving atypical antipsychotics due to hidden bias. P-values shown are 1-sided; the sum of p-values > .05 indicates the odds of atypical use that would change the conclusions of the study in terms of making the effect insignificant. For the risk of obesity, this occurs at a value of 1.5.

VI. Discussion

This paper has provided an overview of the approaches for estimating treatment effects with non-experimental data, with a focus on propensity score methods that ensure comparison of similar individuals. While in this study the propensity score approaches gave results similar to those of traditional regression adjustment, we can have more confidence because of the balance obtained by the matching, weighting, and subclassification methods. The methods generally imply increased risk of dyslipidemia and obesity for individuals on atypical antipsychotics and no increased risk of Type II diabetes. However, we should interpret these results with caution, as the effect on dyslipidemia was sensitive to the particular method used and even the (stronger) effect on obesity is potentially sensitive to an unmeasured confounder.

There are a number of complications that researchers may encounter when designing an observational study. The first is missing data: rarely do researchers measure all of the variables of interest for all study subjects. If there are not many patterns of missing data, a first solution is to estimate separate propensity scores for each missing data pattern ( 6 ). A second approach is to include missing data indicators in the propensity score model; this will essentially match individuals on both the observed values (when possible) and on the patterns of missingness ( 36 , 37 ). A third approach is to use multiple imputation and undertake the propensity score matching and outcome analysis separately within each multiply imputed dataset ( 38 ).

A second complication involves questions where the treatment of interest is not a simple binary comparison. Interest might be in the effect of different types or dosages of antipsychotic medications. Two solutions exist in this type of setting. First, if scientifically interesting, focus can be shifted to a binary comparison, for example comparing low vs. high doses. Second, a new area of methodological research has developed generalized propensity scores for use with non-binary treatments ( 5 , 16 , 39 ).

A final concern with any non-experimental study is that of unmeasured confounding: there may be some unmeasured variable related to both which treatment an individual receives and their outcome. Using propensity score approaches to deal with measured confounders is an important step, but there is always concern about effects of unmeasured confounders. One approach to assess whether this could be a problem is to examine an outcome that should not be affected by the treatment of interest; if an effect is actually found, that may indicate the presence of unmeasured confounding. We have also illustrated here a statistical sensitivity analysis, which can be used to assess how important such an unmeasured confounder may be with respect to the study conclusions.

What are the primary lessons? When reading a study that uses non-experimental data, readers should:

  • Consider whether the results are plausible ( 40 ),
  • Examine whether the groups being compared are similar on the relevant variables,
  • Consider whether there are potentially important confounders that were not measured.

When estimating treatment effects using non-experimental methods, researchers should:

  • Be clear about the treatment and comparison conditions,
  • Identify data that has a large set of potential confounders measured,
  • Ensure comparisons are made using similar individuals by using one of the propensity score methods described above.

In conclusion, propensity score approaches such as matching, weighting, and subclassification are an important step forward in the estimation of treatment effects using observational data. Whenever treatment effects are estimated using non-experimental studies, particular care should be taken to ensure that the comparison is being done using treated and comparison subjects who are as similar as possible; propensity scores are one way of doing so. Propensity score methods can thus help researchers, as well as users of that research, to have more confidence in the resulting study findings.

Acknowledgments

Dr. Stuart’s effort was supported by the Center for Prevention and Early Intervention, jointly funded by the National Institute of Mental Health (NIMH) and the National Institute on Drug Abuse (Grant MH066247; PI: N. Ialongo). Dr. Normand’s effort was supported by Grant MH61434 from NIMH. Dr. Gibbons’ effort was supported by NIMH Grant R56-MH078580, and Dr. Horvitz-Lennon’s by NIMH Grant P50-MH073469. The authors are indebted to Larry Zaborski, MS, Harvard Medical School, for earlier programming help and to Richard Frank, PhD, Harvard Medical School, for generously providing the Medicaid data.

2 Although our outcomes are binary we present results from a linear regression model. This was for comparability with the analyses described for the propensity score approaches with weights. If a logistic regression model is used, the difference in absolute risk can be obtained by comparing predictions of the outcomes for the full sample under each of the treatment conditions. In this study the results are virtually identical. Section IV provides more detail.

3 Often the matches are based on the logits (the log-odds of the predicted probabilities) because the logits have better statistical properties.

4 The 10% threshold is a small effect size using Cohen’s effect size criteria ( 21 ).

Non-Experimental Comparative Effectiveness Research: How to Plan and Conduct a Good Study

  • Pharmacoepidemiology (T Stürmer, Section Editor)
  • Published: 04 October 2014
  • Volume 1 , pages 206–212, ( 2014 )

  • Vera Ehrenstein 1 ,
  • Christian F. Christiansen 1 ,
  • Morten Schmidt 1 &
  • Henrik T. Sørensen 1  

Knowledge about the benefit-to-harm balance of alternative treatment options is central to high-quality patient care. In contrast to the traditional hierarchy of evidence, led by randomized designs, the emerging consensus is to move away from judging a study’s validity based only on randomization. Ethical, practical, and financial considerations dictate that most epidemiologic research be non-experimental. That includes studies of effectiveness and safety of treatments. We provide a non-technical overview of essential prerequisites for high-quality comparative effectiveness research from the standpoint of clinical epidemiology, keeping in mind potentially divergent agendas of investigators and other stakeholders. We discuss the essentials of study planning, implementation, and publication of results. Our focus is on non-experimental studies that generate evidence addressing different dimensions of harm–benefit profiles of therapies. Bias minimization strategies, transparency, and independence in reporting are the guiding principles of comparative effectiveness research, whose ultimate goal is to improve patient care and public health.

Introduction

Knowledge about the benefit-to-harm balance of alternative treatment options is central to high-quality patient care. Traditionally, the experiment [randomized controlled trial (RCT) or natural experiment] has been at the top of the ‘hierarchy of evidence’ as the gold standard for evidence-based medicine, especially for therapeutic choices [ 1 ]. Bias-reducing features of the RCTs—random treatment assignment with the expectation of zero net confounding at baseline; restriction to uniform patient populations; blinding; and standardized data collection (all combined with underlying statistical theory)—are ways to maximize internal validity. In contrast to the traditional hierarchy of evidence [ 1 ], the emerging consensus among clinical epidemiologists is to move away from judging a study’s validity based only on its design type [ 2 – 5 ]. This consensus arises from an appreciation that some purported benefits of experimental designs are not always realized in practice (e.g., the baseline prognostic balance achieved by randomization is often upset during follow-up). Nor do the internally valid results of RCTs apply in all settings of routine clinical care because of the inevitable validity–generalizability tradeoffs of RCTs [ 2 – 5 , 6 ••, 7 – 9 ]. As well, ethical, practical, and financial considerations dictate that most epidemiologic research be observational, including studies of comparative effectiveness and comparative safety of treatments [ 10 ]. Thus, observational studies comparing treatments are increasingly advocated and implemented [ 6 ••, 11 ]. Novel designs that combine advantages of randomized and non-randomized approaches (such as lowering the tradeoff between internal and external validity in pragmatic trials or reliance on new-user designs [ 12 , 13 ••]) help mitigate the disadvantages of both approaches, aiding the acceptance of non-experimental methods in the clinical research community. 
Modern design and analytic approaches to reducing or quantifying systematic errors in observational research include propensity score methods, marginal structural models, instrumental variables, external adjustment, and bias analyses [ 2 , 12 , 14 – 19 ]. Choosing and correctly implementing study design is a prerequisite for subsequent valid application of different analytic techniques.

Although clinicians have routinely compared harms and benefits of treatments for their patients in an informal way, the concept of systematic comparative effectiveness research (CER) is relatively new. For example, the 2008 edition of the Dictionary of Epidemiology did not yet contain an entry for CER [ 20 ]. In 2009, the Institute of Medicine defined CER as “generation and synthesis of evidence that compares the benefits or harms of alternative methods to prevent, diagnose, treat, and monitor clinical conditions, or to improve the delivery of care” [ 21 ]. CER thus encompasses studies (1) directly or indirectly comparing safety and/or effectiveness of active treatments for the same indication; (2) carried out in routine clinical practice; and (3) aiming to help clinicians, regulators, and policy makers to make evidence-based decisions. In addition to scientific aims, CER studies initiated outside academic institutions may have explicit practical goals, including formulation of guidelines, standards of care, safety regulations, or reimbursement policies [ 22 ]. Thus, clinical decision making and policy are much more prominent in planning CER studies in non-academic settings than in conventional investigator-initiated studies in academia [ 22 , 23 ].

Guidelines relevant to CER have been published by several authorities [ 8 , 9 , 24 – 28 ], with some of these publications eliciting critique and calls for harmonization [ 29 , 30 •]. Investigators embarking on a CER study should start by consulting the Guidelines for Good Pharmacoepidemiology Practice (GPP), maintained by the International Society for Pharmacoepidemiology (ISPE) [ 30 •]. The Good Research for Comparative Effectiveness (GRACE) principles specify the following questions to be considered when assessing study quality [ 25 ]: (1) whether the study plans (including research questions, main comparisons, outcomes, etc.) were specified before the study was conducted; (2) whether the study was conducted and analyzed in a manner consistent with good practices and reported in sufficient detail for evaluation and replication; and (3) how valid the interpretation of the CER study is for the population of interest, assuming sound methods and appropriate follow-up.

With these questions in mind, we provide a non-technical overview of essential prerequisites for high-quality CER studies from a clinical epidemiology standpoint, keeping in mind the potentially divergent agendas of investigators and other stakeholders. We discuss the essentials of study planning, implementation, and publication of results, focusing on observational studies that generate evidence addressing different dimensions of the harm–benefit profiles of therapies.

Study Planning

The Stakeholders and the Aim

The aim of a CER study should be clearly and unambiguously defined and should meet criteria for good research, e.g., the FINER [ 31 ] or PICOTS [ 32 ] criteria. The FINER criteria state that the proposed research should be feasible (in terms of number of patients and sources of data, technical expertise, expenditure of time and money, and manageable scope); interesting (to the clinical community as well as the investigator); novel (in terms of extending and improving previous research); ethical; and relevant (to scientific knowledge, clinical health policies, or future research). The parameters for good research to be considered according to PICOTS include the population (condition(s), disease severity and stage, co-morbidities, and patient demographics), the intervention (dosage, frequency, and method of administration), the comparator (placebo, usual care, or active control), the outcome (morbidity, mortality, or quality of life), the timing (duration of follow-up), and the setting (primary, specialty, inpatient, and co-interventions).

The CER study proposal should also explicitly list study initiators, sponsors, and other stakeholders, and potential conflicts of interest. Stakeholders are individuals, organizations, or communities who have a direct interest in the process and outcomes of a study [ 22 , 23 , 33 ]. Stakeholders who might be involved in a CER study include industry (in voluntary or regulator-imposed post-authorization safety or effectiveness studies [ 34 •]), regulators (e.g., European Medicines Agency (EMA), US Food and Drug Administration), and governments—in different combinations [ 22 ]. Patient engagement in reviewing merits of research proposals is becoming increasingly common, and may serve to increase relevance to patient care of CER and clinical research [ 35 ].

An investigator contemplating a CER study initiated by a pharmaceutical company should always consider underlying motivations. These could include concern about safety signals emerging from spontaneous reporting, a wish to study disease risk in the general population or in specific groups of patients before a new treatment enters the market, or a regulator-imposed post-authorization monitoring. To eliminate concerns about hidden agendas that might otherwise compromise the integrity of a CER study, any potential conflict of interest among investigators or participating institutions should be fully disclosed.

It is important to note that collaboration with industry does not per se threaten study validity. If there is an agenda (hidden or obvious), university-based researchers are in a better position than for-profit contract research organizations to uphold and enforce principles and procedures protecting study validity. Academically based investigators are backed by institutional mandates for independence and the obligation to publish results of all studies in journals with independent peer review. Unless they are providing direct gainful consultancy services to the pharmaceutical industry, academic researchers are typically salaried employees who do not directly benefit financially from ‘landing’ a lucrative pharmaceutical contract. Since such a contract is executed between institutions rather than individuals, the financial gain of an individual academic investigator is limited (source: Susanne Kudsk, Legal Advisor, Aarhus University, personal communication). As well, conducting a poor study under pressure from a sponsor affects an investigator’s reputation [ 29 ]. If experts from academia refuse to collaborate with industry on CER studies, they may be replaced by potentially less skillful, less scrupulous, or less independent investigators [ 36 ].

The Contract

Collaboration between academic institutions and regulators, government, and/or industry sponsors should be governed by a professional contract, which is crucial for both the researcher and the sponsor. A contract is a formal agreement establishing the ‘rules of the game’: what is to be done, by whom, when, and at what cost. In international environments, the country whose laws will govern the contract should be clearly specified. A contract serves as a master document to be consulted in case of disputes. It should be executed by the researcher’s institution to avoid conflicts and charges of corruption that could arise, were the researcher to receive payment directly from the sponsor.

The type of contract depends on the sponsor’s role. It can take the form of (1) a grant for investigator-initiated studies with no substantial involvement by the sponsor; (2) a cooperative agreement in which the investigator and the sponsor collaborate on the project and both contribute funding and intellectual content; or (3) a contract for sponsor-initiated studies with substantial involvement by the sponsor.

The contract should regulate the interests of both the investigator (and his/her institution) and the sponsor. It should describe the parties, the purpose of the research, the definition of the project, deliverables, schedule, subcontracting, contributions and obligations of the parties, distribution and transfer of rights, confidentiality, and consequences of ending the collaboration.

The contract must ensure that the researcher and the researcher’s institution are free to use the findings in future research and teaching. The researcher also should have the unrestricted right to publish the research findings. In most cases, the sponsor may require a period of time (e.g., 30 or 60 days) to review and comment on a manuscript arising from contract research before submission for publication. Both parties must be willing to negotiate the manuscript’s content and phrasing, but the researcher should have the final say. In special circumstances, the sponsor may postpone publication for up to 6 months, for instance, to apply for a patent. However, this is a rare occasion in CER, in which timely publication of results with a public health impact has high priority. In addition, publication should not be postponed by adverse event reporting, which is usually not possible or appropriate based on aggregate results from a non-experimental CER study using databases [ 30 •].

Assessing Study Feasibility

CER studies are increasingly conducted using secondary data sources, such as healthcare databases, which rely on routine data collected for other purposes. This raises the question whether the data relevant to the study aim are measured or measured well in the candidate data source. A feasibility study conducted ahead of the main effort may help secure data access, estimate study size, or evaluate background rates of the target condition. A feasibility study may also help establish referral and hospitalization patterns to assess the potential role of selection bias or confounding by indication. At our institution, we routinely evaluate the validity of study algorithms before using them in CER studies. For example, we evaluated the validity of an algorithm to identify osteonecrosis of the jaw and serious infections [ 37 – 39 ] before conducting regulator-imposed industry-sponsored comparative safety studies of antiresorptive agents [ 40 ]. While the validity of the algorithm used to identify serious infection was high in hospital records, the algorithm to identify osteonecrosis of the jaw performed poorly and necessitated primary data collection [ 41 ]. Thus, a feasibility study helps estimate whether—and to what extent—existing data must be supplemented with primary data collection. In addition, a pilot study can help in estimating associated costs and in planning appropriate resources. If data from several different databases are to be combined in a CER study, a pilot study may help determine whether all databases measure equally well what they purport to measure. For example, pilot studies may compare estimates of incidence of well-characterized conditions, examine sources of any unexpected variation, and adjust the methodology (see Avillach et al. [ 42 ] and Coloma et al. [ 43 ] for illustration of this approach).
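As an illustration of the kind of algorithm validation described above, the following minimal sketch computes a positive predictive value (PPV) with a Wilson score confidence interval. The function name and the counts are hypothetical and are not taken from the validation studies cited here; the sketch assumes that a sample of algorithm-flagged cases has been checked against medical records.

```python
import math


def ppv_with_wilson_ci(confirmed, flagged, z=1.96):
    """Return (ppv, lower, upper) for `confirmed` true cases among `flagged` cases.

    Uses the Wilson score interval; z=1.96 corresponds to a 95% interval.
    """
    if flagged <= 0 or not 0 <= confirmed <= flagged:
        raise ValueError("need 0 <= confirmed <= flagged and flagged > 0")
    p = confirmed / flagged
    denom = 1 + z**2 / flagged
    centre = (p + z**2 / (2 * flagged)) / denom
    half_width = z * math.sqrt(p * (1 - p) / flagged + z**2 / (4 * flagged**2)) / denom
    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical chart review: 90 of 100 algorithm-flagged cases confirmed.
ppv, lower, upper = ppv_with_wilson_ci(confirmed=90, flagged=100)
print(f"PPV = {ppv:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A low PPV, or a wide interval around it, would be one signal that the existing data need to be supplemented with primary data collection.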

Review of the Skills of Team Members

For a CER study to be well-conducted, the investigator should be mindful of whether the research team covers the spectrum of required expertise and skills. Multidisciplinary CER study teams usually include pharmacoepidemiologists, biostatisticians, pharmacologists, and clinicians. Access to legal advice and project management are also essential to a well-conducted CER study. For multi-institutional studies, it may be efficient to outsource certain administrative or IT tasks. Furthermore, since many comparative effectiveness studies address major and pressing clinical and legal issues, it is important to select participating investigators who can meet tight deadlines without compromising research quality.

International Collaboration

If the required skills and resources are not present within the local team, international collaboration with leading experts in relevant fields can help ensure high quality of a CER study. Moreover, data from a single country/data system may be insufficient to address all study objectives, to achieve sufficient sample size, or to achieve sufficient generalizability. In some instances, collaboration between at least two different countries may be a condition for funding: for example, the EMA routinely requires use of data from two or more EU Member States in its commissioned research [ 44 ]. Finally, investigators whose institutional or national policies proscribe direct collaboration with industry may contribute to CER as subcontractors within international collaborations [ 40 ]. Decisions about the number of required databases can be formalized in the study protocol, as recently described [ 45 ].

Study Implementation

Protocol and statistical analysis plan.

After study feasibility is established, study sources identified, study teams assembled, and the contract signed, a study protocol is developed or finalized as the first step of study implementation. Several guidelines for the structure and components of CER protocols have been proposed [ 13 ••, 27 , 46 , 47 ]. The user guide developed for the United States Agency for Healthcare Research and Quality is comprehensive yet readable and contains contributions by highly reputed experts [ 13 ••].

Protocol writers should strive to create a detailed and transparent guide to the conduct of the study. The protocol must define the primary, secondary, and potential exploratory study objectives. Protocol writing is an iterative process that helps raise and address methodological issues. Protocol-related challenges of studies based on multinational secondary data sources require an adequate description of diverse data systems and measurement of study variables extracted from diverse sources (such as general practice-based databases, claims databases, and/or national registries). These sources may have different mechanisms for generating records, which affect data validity and completeness as well as interpretation of results.

In multinational studies, it is crucial to involve all participants in writing and revising the study protocol. In regulator-imposed post-authorization studies, the marketing authorization holder may initiate writing of the protocol according to prespecified formats, working with data custodians in participating countries to harmonize data-generating mechanisms. The protocol should be reviewed by clinicians with relevant expertise and with experience treating patients in a given health system; by statisticians with practical expertise in data-generating mechanisms, data flow, and data architecture; and by epidemiologists who can foresee the implications of data idiosyncrasies for interpretation of results.

For observational studies, including CER studies, the protocol should contain clear provisions for efforts to rule out methodological threats to validity, including selection bias, information bias, confounding, and chance. Use of automated health records (claims, patient and disease registries, medical record databases, and insurance databases) has become a mainstay of CER [ 8 , 9 , 25 , 48 ]. Thus, investigators have large amounts of routinely collected data on large numbers of individuals but limited control of data collection. In an era of automated databases, it is essential to consider how selection bias, confounding by indication, data quality, misclassification, and medical surveillance bias are to be handled [ 49 •]. Some traditional epidemiologic ‘mantras’ [ 50 ] may not apply in CER settings. One example is the dilution of estimates by non-differential misclassification of exposure, frequently invoked to defend ‘conservative estimates’ in studies of non-pharmaceutical exposures. Dilution of estimates in CER studies is, as in any other study, a potential public health hazard if exposure measurement instruments and definitions are so poor that they lower the strength of a safety signal beyond detection, resulting in continued use of a potentially unsafe agent. CER study protocols must specify ways to avoid dilution of the effect by inclusion of outcome measures that have high specificity. Another example is the challenge of confounding by indication when comparing treated with untreated patients; however, in CER studies comparing two different drugs with the same indication, this problem is often reduced considerably.
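The dilution of estimates by non-differential exposure misclassification can be made concrete with a small worked example. The sketch below uses invented counts and an invented exposure instrument (70% sensitivity, 90% specificity, applied identically to cases and non-cases); none of these numbers come from the studies cited here.

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts under imperfect exposure measurement."""
    observed_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    observed_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return observed_exposed, observed_unexposed


def risk_ratio(cases_exposed, total_exposed, cases_unexposed, total_unexposed):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exposed / total_exposed) / (cases_unexposed / total_unexposed)


# Hypothetical true cohort: 40/1000 events among exposed, 20/1000 among unexposed.
TRUE_CASES = (40, 20)
TRUE_TOTALS = (1000, 1000)
SENSITIVITY, SPECIFICITY = 0.7, 0.9  # assumed, same for cases and non-cases

true_rr = risk_ratio(TRUE_CASES[0], TRUE_TOTALS[0], TRUE_CASES[1], TRUE_TOTALS[1])

# Non-differential: the same error rates apply to cases and to the whole cohort.
obs_cases = misclassify(*TRUE_CASES, SENSITIVITY, SPECIFICITY)
obs_totals = misclassify(*TRUE_TOTALS, SENSITIVITY, SPECIFICITY)
observed_rr = risk_ratio(obs_cases[0], obs_totals[0], obs_cases[1], obs_totals[1])

print(f"true RR = {true_rr:.2f}, observed RR = {observed_rr:.2f}")
```

With these assumed inputs the risk ratio moves from 2.0 toward the null, to 1.5, which shows how a real safety signal can be weakened by poor exposure measurement alone.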

The planned statistical analysis should be described in sufficient detail in the study protocol. However, the comprehensive description of statistical procedures may require a separate document, the Statistical Analysis Plan (SAP). As the SAP is a guide for the study statistician, he/she should be involved in its preparation and must approve it. The SAP closely follows the study protocol and is developed after the protocol is finalized. The SAP contains a detailed description of sampling and analytic procedures, and many sections of the SAP will be lifted verbatim for use in the statistical analysis section of the study report or a published paper. Analysis of data from different international sources may be country-based or pooled. Development of common data models is quickly becoming the standard approach. Different approaches to combining international data have been described and are beyond the scope of this paper [ 40 – 43 , 45 , 51 , 52 ••, 53 , 54 ••, 55 , 56 ].
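One of the simpler ways to combine country-based results is a fixed-effect, inverse-variance meta-analysis of site-specific effect estimates. The sketch below is only an illustration under stated assumptions: the function name and the input estimates are hypothetical, standard errors are back-calculated from 95% confidence limits, and no heterogeneity between sites is modeled. It is not the method of any particular study cited here.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% confidence interval


def pool_fixed_effect(site_estimates):
    """Pool ratio estimates given as (ratio, ci_lower, ci_upper) per site.

    Works on the log scale and weights each site by the inverse of its variance.
    Returns (pooled_ratio, ci_lower, ci_upper).
    """
    weight_sum = 0.0
    weighted_log_sum = 0.0
    for ratio, ci_lower, ci_upper in site_estimates:
        log_ratio = math.log(ratio)
        std_error = (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z_95)
        weight = 1 / std_error**2
        weight_sum += weight
        weighted_log_sum += weight * log_ratio
    pooled_log = weighted_log_sum / weight_sum
    pooled_se = math.sqrt(1 / weight_sum)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )


# Hypothetical hazard ratios from three national databases.
sites = [(1.40, 0.95, 2.06), (1.25, 0.90, 1.74), (1.60, 0.80, 3.20)]
pooled, lower, upper = pool_fixed_effect(sites)
print(f"pooled HR = {pooled:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Pooling individual-level data or using a common data model, as mentioned above, are alternatives that require more harmonization but allow uniform confounder adjustment.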

Transparency and methodological rigor are necessary features of the protocol and the SAP. The CER protocol must be in place before the study commences. In some situations, e.g., in some regulator-imposed studies, a protocol must be in place before the drug under study enters the market. By definition, such a protocol is not informed by crucial aspects of real-life drug utilization, including whether the drug will be distributed in inpatient or outpatient settings (and therefore measurable in outpatient prescription databases) and how fast drug uptake occurs. Therefore, amendments to the protocol are often necessary as real-life aspects of drug use become apparent. Protocol amendments should be justified, scientifically sound, agreed upon by all study stakeholders, and meticulously documented [ 57 ]. CER protocols and all amendments may need approval by a regulator. The EMA publishes the protocols of imposed post-authorization studies in its ENCePP (European Network of Centres for Pharmacoepidemiology and Pharmacovigilance) register of studies [ 58 ]. Researchers should consider registration of any CER study; for example, non-ENCePP studies can be registered in the ENCePP registry.

Interacting with the Sponsor

Professional interaction with the sponsor is important in both investigator- and sponsor-initiated studies, depending on contributions agreed on before study initiation. Formal channels of communication (e.g., frequency of investigator meetings, teleconferences, and updates) should be agreed upon in advance. Informal communication with sponsor employees is less regulated. Pharmaceutical companies often have dedicated research, development, and/or safety departments that are separated from the sales department in order to reduce conflicts of interest.

The sponsor may contribute important background knowledge to a CER study, which can be useful in formulating the research question (e.g., nature of potential adverse events from ongoing RCTs). However, during the conduct of the study, communication may be more informative than interactive. While the researcher and the sponsor should share a fundamental interest in improving health for patients, they may have different interests that should be kept in mind during interactions. Respectful communication is required, as research findings should not be influenced by the sponsor. Still, the sponsor may have a particular interest in getting as much information as possible, as research findings may have a major impact on approval, labeling, and sale of the company’s products.

Publication of Results

The publication potential of CER studies is attractive to academia-based researchers and may serve as an important motivator for expert clinicians and methodologists to contribute their efforts. The investigators should be free to publish all results stemming from CER research, and this right should be delineated in the contract. Sponsor employees should co-author the publications, provided they fulfill the authorship criteria [ 59 ]. Several scientific publications may stem from a single CER study, with different author constellations. Even if it seems redundant, it is worth circulating the ICMJE (International Committee of Medical Journal Editors) authorship criteria before drafting a manuscript to ensure that all aspiring authors understand and are prepared to fulfill their expected contributions. Results should be transparently reported and judiciously interpreted, including honest discussion of study limitations. Current reporting guidelines [ 60 ], especially the STROBE (STrengthening the Reporting of OBservational studies in Epidemiology) statement for observational studies, and the upcoming RECORD (REporting of studies Conducted using Observational Routinely collected Data) guidelines for reporting studies conducted using routinely collected data [ 61 •], will help determine the type of information that needs to be included in the planned report.

In conclusion, methodological rigor, clear rules, transparency in communication, and independence in reporting are the guiding principles of observational CER, with the ultimate goal of improving patient care and public health.

Papers of particular interest, published recently, have been highlighted as: • Of importance •• Of major importance

Fletcher RH, Fletcher SW, Fletcher GS. Clinical epidemiology: the essentials. 5th ed. Philadelphia: Wolters Kluwer/Lippincott Williams & Wilkins Health; 2014.

Hernan MA, Alonso A, Logan R, et al. Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease. Epidemiology. 2008;19(6):766–79.

Hernan MA, Hernandez-Diaz S, Robins JM. Randomized trials analyzed as observational studies. Ann Intern Med. 2013;159(8):560–2.

Sorensen HT, Lash TL, Rothman KJ. Beyond randomized controlled trials: a critical comparison of trials with nonrandomized studies. Hepatology. 2006;44(5):1075–82.

Rothman KJ. Six persistent research misconceptions. J Gen Intern Med. 2014;29(7):1060–4.

Sox HC, Goodman SN. The methods of comparative effectiveness research. Annu Rev Public Health. 2012;33:425–45. This review provides a concise and comprehensive overview of methods used in CER and its key elements, with focus on issues relevant in observational settings.

Haynes B. Can it work? Does it work? Is it worth it? BMJ. 1999;319(7211):652–3.

Dreyer NA. Making observational studies count: shaping the future of comparative effectiveness research. Epidemiology. 2011;22(3):295–7.

Sturmer T, Jonsson Funk M, Poole C, Brookhart MA. Nonexperimental comparative effectiveness research using linked healthcare databases. Epidemiology. 2011;22(3):298–301.

Holve E, Pittman P. A First Look at the Volume and Cost of Comparative Effectiveness Research in the United States. AcademyHealth. 2009. http://www.academyhealth.org/files/publications/CERMonograph09.pdf .

Sox HC. Comparative effectiveness research: a progress report. Ann Intern Med. 2010;153(7):469–72.

Sturmer T, Schneeweiss S, Brookhart MA, Rothman KJ, Avorn J, Glynn RJ. Analytic strategies to adjust confounding using exposure propensity scores and disease risk scores: nonsteroidal antiinflammatory drugs and short-term mortality in the elderly. Am J Epidemiol. 2005;161(9):891–8.

Velentgas P, Dreyer NA, Nourjah P, Smith SR, Torchia MM, editors. Developing a protocol for observational comparative effectiveness research: a user’s guide. AHRQ publication no. 12(13)-EHC099. Rockville: Agency for Healthcare Research and Quality; 2013. http://www.effectivehealthcare.ahrq.gov/Methods-OCER.cfm . A well-referenced, comprehensive, modern, and methodologically sound manual for those writing CER protocols. Of particular value is the reference material to state-of-the-art analytic techniques and advice on study design decisions, using real-life examples.

Schneeweiss S. Sensitivity analysis and external adjustment for unmeasured confounders in epidemiologic database studies of therapeutics. Pharmacoepidemiol Drug Saf. 2006;15(5):291–303.

Braitman LE, Rosenbaum PR. Rare outcomes, common treatments: analytic strategies using propensity scores. Ann Intern Med. 2002;137(8):693–5.

Brookhart MA, Rassen JA, Schneeweiss S. Instrumental variable methods in comparative safety and effectiveness research. Pharmacoepidemiol Drug Saf. 2010;19(6):537–54.

Brookhart MA, Sturmer T, Glynn RJ, Rassen J, Schneeweiss S. Confounding control in healthcare database research: challenges and potential approaches. Med Care. 2010;48(6 Suppl):S114–20.

Lash TL, Fox MP, Fink AK. Applying quantitative bias analysis to epidemiologic data. Dordrecht: Springer; 2009.

Garabedian LF, Chu P, Toh S, Zaslavsky AM, Soumerai SB. Potential bias of instrumental variable analyses for observational comparative effectiveness research. Ann Intern Med. 2014;161(2):131–8.

Porta MS, editor. A dictionary of epidemiology. 5th ed. Oxford: Oxford University Press; 2008.

Sox HC, Greenfield S. Comparative effectiveness research: a report from the Institute of Medicine. Ann Intern Med. 2009;151(3):203–5.

Smith SR. Introduction. In: Velentgas P, Dreyer NA, Nourjah P, Smith SR, Torchia MM, editors. Developing a protocol for observational comparative effectiveness research: a user’s guide. AHRQ publication no 12(13)-EHC099. Rockville: Agency for Healthcare Research and Quality; 2013. p. 1–6.

Smith SR. Study objectives and questions. In: Velentgas P, Dreyer NA, Nourjah P, Smith SR, Torchia MM, editors. Developing a protocol for observational comparative effectiveness research: a user’s guide. AHRQ publication no 12(13)-EHC099. Rockville: Agency for Healthcare Research and Quality; 2013. p. 7–20.

Berger ML, Mamdani M, Atkins D, Johnson ML. Good research practices for comparative effectiveness research: defining, reporting and interpreting nonrandomized studies of treatment effects using secondary data sources: the ISPOR Good Research Practices for Retrospective Database Analysis Task Force Report--Part. Value Health. 2009;12(8):1044–52.

Dreyer NA, Schneeweiss S, McNeil BJ, et al. GRACE principles: recognizing high-quality observational studies of comparative effectiveness. Am J Manag Care. 2010;16(6):467–71.

Johnson ML, Crown W, Martin BC, Dormuth CR, Siebert U. Good research practices for comparative effectiveness research: analytic methods to improve causal inference from nonrandomized studies of treatment effects using secondary data sources: the ISPOR Good Research Practices for Retrospective Database Analysis Task Force Report--Part II. Value Health. 2009;12(8):1062–73.

The European Medicines Agency. Guidance for the format and content of the protocol of non-interventional post-authorisation safety studies. http://www.ema.europa.eu/docs/en_GB/document_library/Other/2012/10/WC500133174.pdf . Accessed 28 Jul 2012.

Cox E, Martin BC, Van Staa T, Garbe E, Siebert U, Johnson ML. Good research practices for comparative effectiveness research: approaches to mitigate bias and confounding in the design of nonrandomized studies of treatment effects using secondary data sources: the International Society for Pharmacoeconomics and Outcomes Research Good Research Practices for Retrospective Database Analysis Task Force Report--Part I. Value Health. 2009;12(8):1053–61.

Sturmer T, Carey T, Poole C. ISPOR Health Policy Council proposed good research practices for comparative effectiveness research: benefit or harm? Value Health. 2009;12(8):1042–3.

International Society for Pharmacoepidemiology. Guidelines for Good Pharmacoepidemiology Practices (GPP). https://www.pharmacoepi.org/resources/guidelines_08027.cfm . Accessed 31 May 2014. A current industry standard for conducting pharmacoepidemiology and pharmacovigilance studies.

Hulley SB, Cummings SR, Browner WS, Grady DG, Newman TB. Designing clinical research. 4th ed. Philadelphia: Lippincott, Williams and Wilkins; 2013.

Whitlock EP, Lopez SA, Chang S, Helfand M, Eder M, Floyd N. AHRQ series paper 3: identifying, selecting, and refining topics for comparative effectiveness systematic reviews: AHRQ and the effective health-care program. J Clin Epidemiol. 2010;63(5):491–501.

Deverka PA, Lavallee DC, Desai PJ, et al. Stakeholder participation in comparative effectiveness research: defining a framework for effective engagement. J Comp Eff Res. 2012;1(2):181–94.

European Medicines Agency. Post-authorisation safety studies (PASS). http://www.ema.europa.eu/ema/index.jsp?curl=pages/regulation/document_listing/document_listing_000377.jsp&mid=WC0b01ac058066e979 . Accessed 28 Jul 2014. Guide on type of CER studies from the European regulator.

Fleurence RL, Forsythe LP, Lauer M, et al. Engaging patients and stakeholders in research proposal review: the patient-centered outcomes research institute. Ann Intern Med. 2014;161(2):122–30.

Lash TL. Plenary lecture: The future of epidemiology - where do we go from here? European Congress of Epidemiology (EUROEPI); 11–13 Aug 2013; Aarhus, Denmark.

Bergdahl J, Jarnbring F, Ehrenstein V, et al. Evaluation of an algorithm ascertaining cases of osteonecrosis of the jaw in the Swedish National Patient Register. Clin Epidemiol. 2013;5:1–7.

Gammelager H, Svaerke C, Noerholt SE, et al. Validity of an algorithm to identify osteonecrosis of the jaw in women with postmenopausal osteoporosis in the Danish National Registry of Patients. Clin Epidemiol. 2013;5:263–7.

Holland-Bill L, Xu H, Sørensen HT, et al. Positive predictive value of primary inpatient discharge diagnoses of infection among cancer patients in the Danish National Registry of Patients. Ann Epidemiol. 2014;24(8):593–597.e18.

Xue F, Ma H, Stehman-Breen C, et al. Design and methods of a postmarketing pharmacoepidemiology study assessing long-term safety of Prolia® (denosumab) for the treatment of postmenopausal osteoporosis. Pharmacoepidemiol Drug Saf. 2013;22(10):1107–14.

Schiodt M, Wexell CL, Herlofson BB, Giltvedt KM, Norholt SE, Ehrenstein V. Existing Data Sources for Clinical Epidemiology: Scandinavian Cohort for Osteonecrosis of the Jaw – Work in Progress and Challenges. Clinical Epidemiol 2014 (in press).

Avillach P, Coloma PM, Gini R, et al. Harmonization process for the identification of medical events in eight European healthcare databases: the experience from the EU-ADR project. J Am Med Inform Assoc. 2013;20(1):184–92.

Coloma PM, Schuemie MJ, Trifiro G, et al. Combining electronic healthcare databases in Europe to allow for large-scale drug safety monitoring: the EU-ADR Project. Pharmacoepidemiol Drug Saf. 2011;20(1):1–11.

Ehrenstein V, Hernandez RK, Ulrichsen SP, et al. Rosiglitazone use and post-discontinuation glycaemic control in two European countries, 2000-2010. BMJ Open. 2013;3(9):e003424.

Maro JC, Brown JS, Kulldorff M. Medical product safety surveillance: how many databases to use? Epidemiology. 2013;24(5):692–9.

Berger ML, Dreyer N, Anderson F, Towse A, Sedrakyan A, Normand SL. Prospective observational studies to assess comparative effectiveness: the ISPOR good research practices task force report. Value Health. 2012;15(2):217–30.

ENCePP Guide on Methodological Standards in Pharmacoepidemiology. Section 9.1: Comparative effectiveness research. http://www.encepp.eu/standards_and_guidances/methodologicalGuide9_1.shtml . Accessed 1 Jun 2014.

Schneeweiss S. Developments in post-marketing comparative effectiveness research. Clin Pharmacol Ther. 2007;82(2):143–56.

Sørensen HT, Baron JA. Medical databases. In: Olsen J, Saracci R, Trichopoulos D, editors. Teaching epidemiology: a guide for teachers in epidemiology, public health and clinical medicine. 4th edn. Oxford: Oxford University Press; (in press). An overview of database research, which has become a CER mainstay.

Lash TL, Fink AK. Re: “Neighborhood environment and loss of physical function in older adults: evidence from the Alameda County Study” [letter]. Am J Epidemiol. 2003;157(5):472–3.

Gagne JJ, Glynn RJ, Rassen JA, et al. Active safety monitoring of newly marketed medications in a distributed data network: application of a semi-automated monitoring system. Clin Pharmacol Ther. 2012;92(1):80–6.

Gagne JJ, Wang SV, Rassen JA, Schneeweiss S. A modular, prospective, semi-automated drug safety monitoring system for use in a distributed data environment. Pharmacoepidemiol Drug Saf. 2014;23(6):619–27. A guide for conducting drug safety monitoring involving databases from different countries. The authors demonstrate the feasibility of a semi-automated prospective monitoring approach.

Platt R, Davis R, Finkelstein J, et al. Multicenter epidemiologic and health services research on therapeutics in the HMO Research Network Center for Education and Research on Therapeutics. Pharmacoepidemiol Drug Saf. 2001;10(5):373–7.

Toh S, Gagne JJ, Rassen JA, Fireman BH, Kulldorff M, Brown JS. Confounding adjustment in comparative effectiveness research conducted within distributed research networks. Med Care. 2013;51(8 Suppl 3):S4–10. A critical assessment of different confounding adjustment applications for observational CER studies conducted within distributed research networks, including analysis of patient-level data, case-centered logistic regression of risk set data, analysis of aggregated data, and meta-analysis of site-specific effect estimates.

Kieler H, Artama M, Engeland A, et al. Selective serotonin reuptake inhibitors during pregnancy and risk of persistent pulmonary hypertension in the newborn: population based cohort study from the five Nordic countries. BMJ. 2012;344:d8012.

Harcourt SE, Smith GE, Elliot AJ, et al. Use of a large general practice syndromic surveillance system to monitor the progress of the influenza A(H1N1) pandemic 2009 in the UK. Epidemiol Infect. 2012;140(1):100–5.

Chalkidou K, Anderson G. Comparative Effectiveness Research: International Experiences and Implications for the United States. http://www.academyhealth.org/files/publications/CER_International_Experience_09%20(3).pdf . Accessed 1 Jun 2014.

ENCePP. The EU PAS Register. http://www.encepp.eu/encepp_studies/indexRegister.shtml . Accessed 28 Jul 2014.

ICMJE. Defining the role of authors and contributors. http://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html . Accessed 22 Jul 2014.

EQUATOR Network: Enhancing the QUAlity and Transparency Of health Research. http://www.equator-network.org/ . Accessed 22 Jul 2014.

Langan SM, Benchimol EI, Guttmann A, et al. Setting the RECORD straight: developing a guideline for the REporting of studies Conducted using Observational Routinely collected Data. Clin Epidemiol. 2013;5:29–31. A guideline specifically addressing issues of reporting results of studies stemming from automated databases.

Acknowledgments

The authors each report being a salaried employee of Aarhus University (Aarhus, Denmark). Aarhus University receives (and administers) research grants from various pharmaceutical companies and the European Medicines Agency. V. Ehrenstein, C.F. Christiansen, M. Schmidt, and H.T. Sørensen do not receive research grants or consultant fees from pharmaceutical companies.

Compliance with Ethics Guidelines

Conflict of interest.

V. Ehrenstein, C.F. Christiansen, M. Schmidt, and H.T. Sørensen all declare no conflicts of interest.

Human and Animal Rights and Informed Consent

All studies by the authors involving animal and/or human subjects were performed after approval by the appropriate institutional review boards. When required, written informed consent was obtained from all participants.

Author information

Authors and affiliations.

Department of Clinical Epidemiology, Aarhus University Hospital, Olof Palmes Allé 43-45, 8200, Aarhus, Denmark

Vera Ehrenstein, Christian F. Christiansen, Morten Schmidt & Henrik T. Sørensen

Corresponding author

Correspondence to Vera Ehrenstein.

About this article

Ehrenstein, V., Christiansen, C.F., Schmidt, M. et al. Non-Experimental Comparative Effectiveness Research: How to Plan and Conduct a Good Study. Curr Epidemiol Rep 1, 206–212 (2014). https://doi.org/10.1007/s40471-014-0021-5

Published: 04 October 2014

Issue Date: December 2014

DOI: https://doi.org/10.1007/s40471-014-0021-5

  • Comparative effectiveness research
  • Database research
  • Epidemiology
  • Evidence-based medicine
  • Observational research
  • Pharmacoepidemiology
  • Post-authorization study

Data Analysis in Research: Types & Methods

Content Index

  • What is data analysis in research?
  • Why analyze data in research?
  • Types of data in research
  • Finding patterns in the qualitative data
  • Methods used for data analysis in qualitative research
  • Preparing data for analysis
  • Methods used for data analysis in quantitative research
  • Considerations in research data analysis

What is data analysis in research?

Definition of data analysis in research: According to LeCompte and Schensul, research data analysis is a process used by researchers to reduce data to a story and interpret it to derive insights. The data analysis process helps reduce a large chunk of data into smaller fragments that make sense.

Three essential things occur during the data analysis process. The first is data organization. The second is data reduction through summarization and categorization, which helps find patterns and themes in the data for easy identification and linking. The third is the data analysis itself, which researchers do in both top-down and bottom-up fashion.

On the other hand, Marshall and Rossman describe data analysis as a messy, ambiguous, and time-consuming but creative and fascinating process through which a mass of collected data is brought to order, structure and meaning.

We can say that “the data analysis and data interpretation is a process representing the application of deductive and inductive logic to the research and data analysis.”

Researchers rely heavily on data as they have a story to tell or research problems to solve. It starts with a question, and data is nothing but an answer to that question. But what if there is no question to ask? It is possible to explore data even without a problem; we call this ‘Data Mining’, and it often reveals interesting patterns within the data that are worth exploring.

Irrespective of the type of data researchers explore, their mission and their audience’s vision guide them to find the patterns that shape the story they want to tell. One of the essential things expected from researchers while analyzing data is to stay open and remain unbiased toward unexpected patterns, expressions, and results. Remember that data analysis sometimes tells the most unforeseen yet exciting stories, ones that were not expected when the analysis began. Therefore, rely on the data you have at hand and enjoy the journey of exploratory research.

Every kind of data describes things by assigning a specific value to them. For analysis, you need to organize these values, then process and present them in a given context, to make them useful. Data can take different forms; here are the primary data types.

  • Qualitative data: When the data consists of words and descriptions, we call it qualitative data. Although you can observe this data, it is subjective and harder to analyze in research, especially for comparison. Example: anything describing taste, experience, texture, or an opinion is considered qualitative data. This type of data is usually collected through focus groups, personal qualitative interviews, qualitative observation, or open-ended questions in surveys.
  • Quantitative data: Any data expressed in numbers or numerical figures is called quantitative data. This type of data can be distinguished into categories, grouped, measured, calculated, or ranked. Example: answers to questions about age, rank, cost, length, weight, scores, and so on all fall under this type of data. You can present such data in graphs and charts or apply statistical analysis methods to it. Outcomes Measurement Systems (OMS) questionnaires in surveys are a significant source of numeric data.
  • Categorical data: This is data presented in groups. An item included in categorical data cannot belong to more than one group. Example: a person responding to a survey by describing their lifestyle, marital status, smoking habit, or drinking habit provides categorical data. A chi-square test is a standard method used to analyze this data.
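To make the chi-square test mentioned for categorical data more concrete, here is a minimal sketch in plain Python. The survey counts, the `observed` table, and the `chi_square_statistic` helper are all invented for illustration; in practice a statistics package would normally do this work.

```python
# Hypothetical survey counts (invented): rows are smoking habit,
# columns are marital status.
observed = {
    "smoker":     {"single": 20, "married": 30},
    "non-smoker": {"single": 30, "married": 120},
}

def chi_square_statistic(table):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())

    statistic = 0.0
    for r in rows:
        for c in cols:
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (table[r][c] - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(cols) - 1)
    return statistic, dof

chi2, dof = chi_square_statistic(observed)
print(f"chi-square = {chi2:.2f}, degrees of freedom = {dof}")
```

With these made-up counts the statistic is 8.0 with one degree of freedom, which exceeds the usual 5% critical value of 3.84, so the two habits would not look independent in this sample.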

Data analysis in qualitative research

Data analysis in qualitative research works a little differently from the analysis of numerical data, because qualitative data is made up of words, descriptions, images, objects, and sometimes symbols. Getting insight from such complicated information is a complicated process. Hence it is typically used for exploratory research and data analysis.

Although there are several ways to find patterns in textual information, a word-based method is the most relied-upon and widely used technique for research and data analysis. Notably, the data analysis process in qualitative research is largely manual: researchers usually read the available data and look for repetitive or commonly used words.

For example, while studying data collected from African countries to understand the most pressing issues people face, researchers might find that “food” and “hunger” are the most commonly used words and highlight them for further analysis.
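The word-based method can be sketched in a few lines of Python. The responses, the tiny stop-word list, and the `word_frequencies` helper below are invented for illustration and are not taken from any real study.

```python
import re
from collections import Counter

# Hypothetical open-ended responses (invented for illustration).
responses = [
    "Food is scarce and hunger is common in our village.",
    "We worry about hunger every dry season.",
    "Food prices keep rising, and clean water is hard to find.",
]

# A tiny hand-picked stop-word list; real projects use a fuller one.
STOP_WORDS = {"is", "and", "in", "our", "we", "about", "every", "to", "the", "a"}

def word_frequencies(texts):
    """Count how often each non-stop word appears across all texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

frequencies = word_frequencies(responses)
top_words = frequencies.most_common(2)  # the two most frequent words
print(top_words)
```

A count like this only points to candidate themes; the researcher still has to read the surrounding text to interpret them.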

Keyword in context is another widely used word-based technique. In this method, the researcher tries to understand a concept by analyzing the context in which participants use a particular keyword.

For example, researchers studying the concept of ‘diabetes’ amongst respondents might analyze the context of when and how each respondent has used or referred to the word ‘diabetes.’
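As a rough illustration of the keyword-in-context idea, the sketch below pulls out a few words on either side of each occurrence of a keyword. The `keyword_in_context` function, its `window` parameter, and the interview excerpt are hypothetical.

```python
import re

def keyword_in_context(text, keyword, window=3):
    """Return each occurrence of `keyword` with up to `window` words on each side."""
    words = re.findall(r"[\w']+", text)
    hits = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            hits.append((left, word, right))
    return hits

# Hypothetical interview excerpt (invented for illustration).
excerpt = (
    "My mother was diagnosed with diabetes last year. "
    "Since then I think about diabetes whenever I plan meals."
)
contexts = keyword_in_context(excerpt, "diabetes")
for left, word, right in contexts:
    print(f"{left} [{word}] {right}")
```

Reading the left and right contexts side by side makes it easier to see whether respondents talk about the keyword as a diagnosis, a worry, a daily routine, and so on.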

The scrutiny-based technique is another highly recommended text analysis method used to identify patterns in qualitative data. Compare and contrast is the most widely used method under this technique; it examines how pieces of text are similar to or different from each other.

For example, to find out the “importance of a resident doctor in a company,” the collected data is divided into people who think it is necessary to hire a resident doctor and those who think it is unnecessary. Compare and contrast is well suited to analyzing polls with single-answer question types.

Metaphors can be used to reduce the data pile and find patterns in it so that it becomes easier to connect data with theory.

Variable Partitioning is another technique used to split variables so that researchers can find more coherent descriptions and explanations from the enormous data.

There are several techniques to analyze data in qualitative research; here are some commonly used methods:

  • Content Analysis: This is a widely accepted and frequently employed technique for data analysis in research methodology. It can be used to analyze documented information in text, images, and sometimes physical items. When and where to use this method depends on the research questions.
  • Narrative Analysis: This method is used to analyze content gathered from various sources such as personal interviews, field observation, and surveys. Most of the time, the stories or opinions shared by people are examined to find answers to the research questions.
  • Discourse Analysis: Like narrative analysis, discourse analysis is used to analyze interactions with people. However, this method considers the social context in which the communication between the researcher and respondent takes place. In addition, discourse analysis takes the respondent’s lifestyle and day-to-day environment into account when deriving any conclusion.
  • Grounded Theory: When you want to explain why a particular phenomenon happened, grounded theory is a good choice for analyzing qualitative data. Grounded theory is applied to study data about a host of similar cases occurring in different settings. Researchers using this method might alter explanations or produce new ones until they arrive at a conclusion.

Data analysis in quantitative research

The first stage in research and data analysis is to prepare the data for analysis so that the nominal data can be converted into something meaningful. Data preparation consists of the phases below.

Phase I: Data Validation

Data validation is done to understand whether the collected data sample meets the pre-set standards or is a biased sample. It is divided into four stages:

  • Fraud: To ensure an actual human being records each response to the survey or the questionnaire
  • Screening: To make sure each participant or respondent is selected or chosen in compliance with the research criteria
  • Procedure: To ensure ethical standards were maintained while collecting the data sample
  • Completeness: To ensure that the respondent has answered all the questions in an online survey or, in an interview, that the interviewer has asked all the questions devised in the questionnaire.

Phase II: Data Editing

More often than not, an extensive research data sample comes loaded with errors. Respondents sometimes fill in fields incorrectly or skip them accidentally. Data editing is the process in which researchers confirm that the provided data is free of such errors. They need to conduct basic checks and outlier checks to edit the raw data and make it ready for analysis.
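The kind of checks described here can be sketched as follows. The records, the field names, and the two helper functions are invented for illustration, and the outlier rule used (Tukey's 1.5 × IQR fences) is just one common choice, not a method prescribed by this article.

```python
import statistics

# Hypothetical raw survey records (invented); None marks a skipped field.
records = [
    {"id": 1, "age": 34, "score": 7},
    {"id": 2, "age": None, "score": 8},
    {"id": 3, "age": 29, "score": 6},
    {"id": 4, "age": 31, "score": 7},
    {"id": 5, "age": 420, "score": 9},  # likely a typing error
    {"id": 6, "age": 38, "score": 5},
    {"id": 7, "age": 27, "score": 6},
    {"id": 8, "age": 45, "score": 8},
    {"id": 9, "age": 33, "score": 7},
]

def incomplete_ids(rows, required=("age", "score")):
    """Return the ids of records with at least one required field missing."""
    return [row["id"] for row in rows
            if any(row.get(field) is None for field in required)]

def outlier_ids(rows, field, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] for the given field."""
    values = [row[field] for row in rows if row.get(field) is not None]
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [row["id"] for row in rows
            if row.get(field) is not None and not (low <= row[field] <= high)]

missing = incomplete_ids(records)
suspicious = outlier_ids(records, "age")
print("incomplete:", missing, "possible outliers:", suspicious)
```

Flagged records are candidates for review rather than automatic deletion; whether to correct, keep, or drop them is a judgment the researcher still has to make.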

Phase III: Data Coding

Of the three phases, this is the most critical phase of data preparation. It involves grouping survey responses and assigning values to them. For example, if a survey is completed with a sample size of 1,000, the researcher might create age brackets to distinguish respondents based on their age. It then becomes easier to analyze small data buckets than to deal with one massive data pile.
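Here is a small sketch of the age-bracket coding described above. The bracket boundaries, the `code_age` helper, and the sample ages are assumptions made up for the example.

```python
from collections import Counter

# Hypothetical bracket boundaries; choose ones that suit the research question.
AGE_BRACKETS = [(18, 24), (25, 34), (35, 44), (45, 54), (55, 64)]

def code_age(age):
    """Map a raw age to a bracket label such as '25-34'."""
    for low, high in AGE_BRACKETS:
        if low <= age <= high:
            return f"{low}-{high}"
    return "65+" if age >= 65 else "under 18"

ages = [19, 23, 31, 35, 47, 52, 68, 29, 41, 33]  # invented sample
bracket_counts = Counter(code_age(age) for age in ages)
print(bracket_counts)
```

Once every response carries a bracket label, later steps such as frequency counts or cross-tabulation can work with a handful of groups instead of many distinct ages.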

After the data is prepared for analysis, researchers can use different research and data analysis methods to derive meaningful insights. Statistical analysis is the most favored approach for numerical data. In statistical analysis, distinguishing between categorical data and numerical data is essential: categorical data involves distinct categories or labels, while numerical data consists of measurable quantities. Statistical methods fall into two groups: descriptive statistics, used to describe data, and inferential statistics, which help in comparing the data.

Descriptive statistics

This method is used to describe the basic features of many types of data in research. It presents the data in a meaningful way so that patterns in the data start making sense. However, descriptive analysis does not support conclusions beyond the data at hand; any further conclusions still rest on the hypotheses researchers have formulated so far. Here are a few major types of descriptive analysis methods.

Measures of Frequency

  • Count, Percent, Frequency
  • It is used to denote how often a particular event occurs.
  • Researchers use it when they want to showcase how often a response is given.
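As a minimal illustration of count and percent, the sketch below tallies a short list of invented survey answers.

```python
from collections import Counter

answers = ["yes", "no", "yes", "yes", "unsure", "no", "yes", "yes"]  # invented

counts = Counter(answers)                      # count per response
total = sum(counts.values())
percentages = {answer: 100 * n / total for answer, n in counts.items()}

for answer, n in counts.most_common():
    print(f"{answer}: {n} ({percentages[answer]:.1f}%)")
```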

Measures of Central Tendency

  • Mean, Median, Mode
  • These measures are widely used to summarize a distribution by its central point.
  • Researchers use this method when they want to showcase the most common or the average response.
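The three measures can be computed with Python's standard `statistics` module. The test scores below are invented for the example.

```python
import statistics

scores = [72, 85, 85, 90, 64, 78, 85, 95]  # invented test scores

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value of the sorted scores
mode_score = statistics.mode(scores)      # most frequently occurring score

print(mean_score, median_score, mode_score)
```

Here the mean (81.75) sits below the median and mode (both 85) because the low score of 64 pulls the average down, which is one reason to report more than one measure.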

Measures of Dispersion or Variation

  • Range, Variance, Standard deviation
  • Range = the difference between the highest and lowest values.
  • Variance and standard deviation are based on the differences between the observed scores and the mean.
  • These measures identify the spread of scores by stating intervals.
  • Researchers use this method to show how spread out the data is. It helps them judge how far the data is spread and how strongly that spread affects the mean.
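A short sketch of these measures using the standard `statistics` module follows. The scores are invented, and the population versions of variance and standard deviation are used here as an assumption; for a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` would be the usual choice.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # invented scores with a mean of 5

value_range = max(scores) - min(scores)  # highest minus lowest value
variance = statistics.pvariance(scores)  # mean squared deviation from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance

print(value_range, variance, std_dev)
```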

Measures of Position

  • Percentile ranks, Quartile ranks
  • It relies on standardized scores, helping researchers identify the relationship between different scores.
  • It is often used when researchers want to compare scores with the average.
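Percentile ranks and quartiles can be sketched as below. The scores and the `percentile_rank` helper are invented, and "percentage of scores at or below a value" is only one of several definitions of percentile rank in use.

```python
import statistics

def percentile_rank(scores, value):
    """Percentage of scores at or below `value` (one common definition)."""
    return 100 * sum(s <= value for s in scores) / len(scores)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # invented

rank_of_80 = percentile_rank(scores, 80)

# Quartile cut points; the "inclusive" method treats the data as the
# whole population of interest rather than a sample from it.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

print(rank_of_80, q1, q2, q3)
```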

In quantitative research, descriptive analysis often gives absolute numbers, but those numbers alone are never sufficient to demonstrate the rationale behind them. It is therefore necessary to think about which method of research and data analysis best suits your survey questionnaire and the story you want to tell. For example, the mean is the best way to demonstrate students’ average scores in schools. It is better to rely on descriptive statistics when researchers intend to keep the research or outcome limited to the provided sample without generalizing it. For example, when you want to compare the average voting done in two different cities, descriptive statistics are enough.

Descriptive analysis is also called a ‘univariate analysis’ since it is commonly used to analyze a single variable.

Inferential statistics

Inferential statistics are used to make predictions about a larger population after research and data analysis of a sample collected from that population. For example, you can ask about 100 audience members at a movie theater whether they like the movie they are watching. Researchers then use inferential statistics on the collected sample to reason that about 80-90% of people like the movie.
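One way to arrive at a range like the one in the movie example is a confidence interval for a proportion. The sketch below assumes that 85 of the 100 people asked said they liked the movie (a number invented here) and uses the simple normal-approximation interval; the function name is hypothetical.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a population proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Assumption for illustration: 85 of 100 respondents liked the movie.
low, high = proportion_confidence_interval(85, 100)
print(f"Estimated share who like the movie: {low:.0%} to {high:.0%}")
```

The interval only says something about the wider audience if the 100 people were sampled in a way that represents it; a convenience sample from a single screening may not.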

Here are two significant areas of inferential statistics.

  • Estimating parameters: It takes statistics from the sample research data and uses them to say something about the population parameter.
  • Hypothesis test: It is about using sampled research data to answer the survey research questions. For example, researchers might want to understand whether a recently launched shade of lipstick is well received, or whether multivitamin capsules help children perform better at games.

These are sophisticated analysis methods used to showcase the relationship between different variables instead of describing a single variable. They are often used when researchers want something beyond absolute numbers to understand the relationship between variables.

Here are some of the commonly used methods for data analysis in research.

  • Correlation: When researchers are not conducting experimental or quasi-experimental research but want to understand the relationship between two or more variables, they opt for correlational research methods.
  • Cross-tabulation: Also called contingency tables, cross-tabulation is used to analyze the relationship between multiple variables. Suppose the provided data has age and gender categories presented in rows and columns. A two-dimensional cross-tabulation makes analysis straightforward by showing the number of males and females in each age category.
  • Regression analysis: To understand the strength of the relationship between variables, researchers commonly turn to regression analysis, which is also a type of predictive analysis. In this method, you have an essential factor called the dependent variable and one or more independent variables. You try to find out the impact of the independent variables on the dependent variable. The values of both independent and dependent variables are assumed to be ascertained in an error-free, random manner.
  • Frequency tables: A frequency table lists each value or category of a variable together with the number of times it occurs, which makes the distribution of responses easy to read.
  • Analysis of variance: This statistical procedure tests the degree to which two or more groups vary or differ in an experiment. A considerable degree of variation between groups, relative to the variation within them, suggests that the research findings are significant. In many contexts, ANOVA testing and variance analysis are used interchangeably.
  • Researchers must have the necessary research skills to analyze and manipulate the data, and should be trained to demonstrate a high standard of research practice. Ideally, researchers should possess more than a basic understanding of the rationale for selecting one statistical method over another to obtain better data insights.
  • Research and data analytics projects usually differ by scientific discipline; therefore, getting statistical advice at the beginning of analysis helps in designing a survey questionnaire, selecting data collection methods, and choosing samples.
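Two of the methods listed above, cross-tabulation and regression analysis, can be sketched in plain Python. All data here is invented, the `fit_line` helper is hypothetical, and the regression covers only the simplest case of one independent variable fitted by ordinary least squares.

```python
from collections import Counter

# Invented respondents: (age category, gender).
respondents = [
    ("18-34", "female"), ("18-34", "male"), ("18-34", "female"),
    ("35-54", "male"), ("35-54", "male"), ("35-54", "female"),
    ("55+", "female"), ("55+", "male"),
]

# Cross-tabulation: number of respondents in each (age category, gender) cell.
crosstab = Counter(respondents)

def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Invented, perfectly linear data (y = 2x + 1) so the result is easy to check.
ad_spend = [1, 2, 3, 4, 5]   # independent variable
sales = [3, 5, 7, 9, 11]     # dependent variable
slope, intercept = fit_line(ad_spend, sales)

print(crosstab[("18-34", "female")], slope, intercept)
```

Real survey data will not fall on a straight line, so the fitted slope should be read together with a measure of fit and with the caveat that, in nonexperimental data, a slope describes an association rather than a proven cause.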

  • The primary aim of data research and analysis is to derive insights that are unbiased. Any mistake or bias in collecting data, selecting an analysis method, or choosing an audience sample will lead to a biased inference.
  • No amount of sophistication in research data and analysis is enough to rectify poorly defined objective outcome measurements. Whether the design is at fault or the intentions are unclear, a lack of clarity might mislead readers, so avoid the practice.
  • The motive behind data analysis in research is to present accurate and reliable data. As far as possible, avoid statistical errors, and find a way to deal with everyday challenges like outliers, missing data, data altering, data mining, or developing graphical representations.

The sheer amount of data generated daily is frightening, especially now that data analysis has taken center stage. In 2018, the total data supply amounted to 2.8 trillion gigabytes. Hence, it is clear that enterprises willing to survive in a hypercompetitive world must possess an excellent capability to analyze complex research data, derive actionable insights, and adapt to new market needs.

QuestionPro is an online survey platform that empowers organizations in data analysis and research and provides them a medium to collect data by creating appealing surveys.

What is non-experimental research: Definition, types & examples

Defne Çobanoğlu

The experimentation method is very useful for getting information on a specific subject. However, when experimenting is not possible or practical, there is another way of collecting data for those interested. It's a non-experimental way, to say the least.

In this article, we have gathered information on non-experimental research, clearly defined what it is and when one should use it, and listed the types of non-experimental research. We also gave some useful examples to paint a better picture. Let us get started. 

  • What is non-experimental research?

Non-experimental research is a type of research design that is based on observation and measuring instead of experimentation with randomly assigned participants.

What characterizes this research design is that it lacks the manipulation of independent variables. Because of this, non-experimental research is based on naturally occurring conditions, with no external interventions. Therefore, researchers using this method must rely heavily on interviews, surveys, or case studies.

  • When to use non-experimental research?

An experiment is done when a researcher is investigating the relationship between one or two phenomena and has a theory or hypothesis on the relationship between two variables that are involved. The researcher can carry out an experiment when it is ethical, possible, and feasible to do one.

However, when an experiment cannot be done because of a limitation, researchers opt for a non-experimental research design. Non-experimental research is considered preferable in some conditions, including:

  • When the manipulation of the independent variable is not possible because of ethical or practical concerns.
  • When the subjects of an experimental design cannot be randomly assigned to treatments.
  • When the research question is too extensive or relates to a general experience.
  • When researchers want to do preliminary research before investing in more extensive research.
  • When the research question is about the statistical relationship between variables, but in a noncausal context.
  • Characteristics of non-experimental research

Non-experimental research has some characteristics that clearly define the framework of this research method. They provide a clear distinction between experimental design and non-experimental design. Let us see some of them:

  • Non-experimental research does not involve the manipulation of variables.
  • The aim of this research type is to explore the factors as they naturally occur.
  • This method is used when experimentation is not possible because of ethical or practical reasons.
  • Instead of creating a sample or participant group, the existing groups or natural thresholds are used during the research.
  • This research method is not about finding causality between two variables.
  • Most studies are done on past events or historical occurrences to make sense of specific research questions.
  • Types of non-experimental research

What makes research non-experimental research is the fact that the researcher does not manipulate the factors, does not randomly assign the participants, and observes the existing groups. But this research method can also be divided into different types. These types are:

Correlational research:

In correlational studies, the researcher does not manipulate the variables and is not interested in controlling extraneous variables. They only observe and assess the relationship between the variables. For example, a researcher examines students’ daily study hours and their overall academic performance. A positive correlation between study hours and academic performance suggests a statistical association.
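To show what such a correlation looks like in numbers, here is a small sketch that computes Pearson's correlation coefficient. The study hours, the grades, and the `pearson_r` helper are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

study_hours = [1, 2, 3, 4, 5, 6]       # invented daily study hours
grades = [55, 62, 60, 71, 78, 85]      # invented overall grades

r = pearson_r(study_hours, grades)
print(f"r = {r:.2f}")
```

A value of r close to +1 indicates a strong positive association in this sample. Because nothing was manipulated or randomly assigned, it does not show that studying longer causes higher grades.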

Quasi-experimental research:

In quasi-experimental research, the researcher does not randomly assign the participants into two groups. Because you cannot deliberately deprive someone of treatment, the researcher uses natural thresholds or dividing points. For example, examining students from two different high schools with different education methods.

Cross-sectional research:

In cross-sectional research, the researcher studies and compares a portion of a population at the same time. It does not involve random assignment or any outside manipulation. For example, a study on smokers and non-smokers in a specific area.

Observational research:

In observational research, the researcher once again does not manipulate any aspect of the study, and their main focus is observation of the participants. For example, a researcher might observe a group of children playing in a playground.

  • Non-experimental research examples

Non-experimental research is a good way of collecting information and exploring relationships between variables. It can be used in numerous fields, including social sciences, economics, psychology, education, and market research. When gathering information using secondary research is not enough and an experiment cannot be done, this method can bring out new information.

Non-experimental research example #1

Imagine a researcher who wants to see the connection between mobile phone usage before bedtime and the amount of sleep adults get in a night. They can gather a group of individuals and ask them questions about the details of their day, the frequency and duration of their phone usage, their quality of sleep, and so on, and then analyze the findings.

Non-experimental research example #2

Imagine a researcher who wants to explore job satisfaction levels among employees and the factors that correlate with them. The researcher can gather information about the employees’ ages, sexes, positions in the company, working patterns, demographic information, and so on.

The research provides the researcher with all the information to make an analysis to identify correlations and patterns. Then, it is possible for researchers and administrators to make informed predictions.

  • Frequently asked questions about non-experimental research

When not to use non-experimental research?

There are some situations where non-experimental research is not suitable or not the best choice. For example, non-experimental research is not designed to establish causality; therefore, if the researcher wants to explore a causal relationship between two variables, this method is not for them. Also, if control over the variables is extremely important to the test of a theory, then experimentation is a more appropriate option.

What is the difference between experimental and non-experimental research?

Experimental research is an example of primary research where the researcher takes control of all the variables, randomly assigns the participants into different groups, and studies them in a pre-determined environment to test a hypothesis. 

On the contrary, non-experimental research does not intervene in any way and only observes and studies the participants in their natural environments to make sense of a phenomenon.

What makes a quasi-experiment a non-experiment?

Like true experimentation, quasi-experimental research also aims to explore a cause-and-effect relationship between independent and dependent variables. However, in quasi-experimental research, the participants are not randomly assigned to groups. They are assigned based on non-random criteria.

Is a survey a non-experimental study?

Yes, because the main purpose of a survey or questionnaire is to collect information from participants without outside interference, a survey is a non-experimental study. Surveys are used by researchers when experimentation is not possible because of ethical reasons, but first-hand data is needed.

What is non-experimental data?

Non-experimental data is data collected by researchers using non-experimental methods such as observations, interpretation, and interactions. Non-experimental data can be either qualitative or quantitative, depending on the situation.

Advantages of non-experimental research

Non-experimental research has its positive sides that a researcher should have in mind when going through a study. They can start their research by going through the advantages. These advantages are:

  • It is used to observe and analyze past events.
  • This method is more affordable than a true experiment.
  • As the researcher can adapt the methods during the study, this research type is more flexible than an experimental study.
  • This method allows the researchers to answer specific questions.

Disadvantages of non-experimental research

Even though non-experimental research has its advantages, it also has some disadvantages a researcher should be mindful of. Here are some of them:

  • The findings of non-experimental research cannot be generalized to the whole population. Therefore, it has low external validity.
  • This research is used to explore only a single variable.
  • Non-experimental research designs are prone to researcher bias and may not produce neutral results.
  • Final words

A non-experimental study differs from an experimental study in that there is no intervention or change of internal or extraneous elements. It is a smart way to collect information without the limitations of experimentation. These limitations could be about ethical or practical problems. When you cannot do proper experimentation, your other option is to study existing conditions and groups to draw conclusions. This is a non-experimental design.

In this article, we have gathered information on non-experimental research to shed light on the details of this research method. If you are thinking of doing a study, make sure to have this information in mind. And lastly, do not forget to visit our articles on other research methods and so much more!

Defne is a content writer at forms.app. She is also a translator specializing in literary translation. Defne loves reading, writing, and translating professionally and as a hobby. Her expertise lies in survey research, research methodologies, content writing, and translation.

  • Open access
  • Published: 10 April 2024

A paclitaxel-hyaluronan conjugate (ONCOFID-P-B™) in patients with BCG-unresponsive carcinoma in situ of the bladder: a dynamic assessment of the tumor microenvironment

  • Anna Tosi 1,
  • Beatrice Parisatto 2,
  • Enrico Gaffo 3,
  • Stefania Bortoluzzi 3 &
  • Antonio Rosato (ORCID: orcid.org/0000-0002-5263-8386) 1, 2

Journal of Experimental & Clinical Cancer Research, volume 43, Article number: 109 (2024)

The intravesical instillation of the paclitaxel-hyaluronan conjugate ONCOFID-P-B™ in patients with bacillus Calmette-Guérin (BCG)-unresponsive bladder carcinoma in situ (CIS; NCT04798703 phase I study) induced complete response (CR) rates of 75% and 40% after 12 weeks of intensive phase and 12 months of maintenance phase, respectively. The aim of this study was to provide a detailed description of the tumor microenvironment (TME) of ONCOFID-P-B™-treated BCG-unresponsive bladder CIS patients enrolled in the NCT04798703 phase I study, in order to identify predictive biomarkers of response.

The composition and spatial interactions of tumor-infiltrating immune cells and the expression of the most relevant hyaluronic acid (HA) receptors on cancer cells, were analyzed in biopsies from the 20 patients enrolled in the NCT04798703 phase I study collected before starting ONCOFID-P-B™ therapy (baseline), and after the intensive and the maintenance phases. Clinical data were correlated with cell densities, cell distribution and cell interactions. Associations between immune populations or HA receptors expression and outcome were analyzed using univariate Cox regression and log-rank analysis.

In baseline biopsies, patients achieving CR after the intensive phase had a lower density of intra-tumoral CD8+ cytotoxic T lymphocytes (CTL), but also fewer interactions between CTL and macrophages or T-regulatory cells, as compared to non-responders (NR). NR expressed higher levels of the HA receptors CD44v6, ICAM-1 and RHAMM. The intra-tumoral macrophage density was positively correlated with the expression of the pro-metastatic and aggressive variant CD44v6, and the combined score of intra-tumoral macrophage density and CD44v6 expression had an AUC of 0.85 (95% CI 0.68–1.00) for patient response prediction.

Conclusions

The clinical response to ONCOFID-P-B™ in bladder CIS likely relies on several components of the TME, and the combined evaluation of intra-tumoral macrophage density and CD44v6 expression is a potentially new predictive biomarker for patient response. Overall, our data allow us to advance a potential rationale for combinatorial treatments targeting the immune infiltrate, such as immune checkpoint inhibitors, to make bladder CIS more responsive to ONCOFID-P-B™ treatment.

The standard therapy for bladder carcinoma in situ (CIS) is represented by intravesical instillation of Bacillus Calmette-Guerin (BCG) that, however, can lead to intolerance or unresponsiveness [1].

Paclitaxel (PTX) is an antimitotic agent active against many cancers, including bladder cancer, but with several drawbacks [2]. To overcome PTX clinical limitations, one strategy is the conjugation of PTX to hyaluronic acid (HA) as a carrier, which offers several advantages in terms of biocompatibility, tolerability and solubility [3]. Indeed, HA-drug conjugates can efficiently bind to cancer cells overexpressing HA receptors, and exert strong antiproliferative and cytotoxic activity [4]. Among HA receptors, the most studied is CD44, a transmembrane glycoprotein overexpressed in several tumors. CD44 is encoded by 19 exons, with 9 of them undergoing alternative splicing and generating CD44 variants (CD44v), each of them activating different signalling pathways that in turn lead to distinct functions [5]. A PTX-HA formulation, namely ONCOFID-P-B™, has been reported to significantly increase CD44-dependent cellular uptake of the chemotherapy moiety in bladder cancer cell lines [4]. Moreover, ONCOFID-P-B™ has already been tested in BCG-refractory patients with bladder CIS [6], demonstrating high tolerability and achieving a 60% complete response (CR) rate following 6 weekly intravesical instillations. These positive observations were confirmed in the NCT04798703 phase I study [7], where the intravesical administration of ONCOFID-P-B™ for 12 consecutive weeks (intensive phase, IP) led to a 75% CR rate. Patients with a CR at this time point underwent a subsequent maintenance phase (MP) of 12 monthly instillations and at month 15 the CR rate was still 40%, thus supporting further clinical development for ONCOFID-P-B™. However, potentially predictive biomarkers of treatment response to ONCOFID-P-B™ still need to be identified.

In the present study, we provide a detailed description of the tumor microenvironment (TME) of ONCOFID-P-B™-treated BCG-unresponsive bladder CIS patients enrolled in the NCT04798703 study. In particular, we focused on the composition and spatial interactions of tumor-infiltrating immune cells, and evaluated the expression of the most relevant HA receptors on bladder cancer cells. Moreover, the study provided a unique opportunity to monitor the therapy-induced changes in immune cell composition and HA receptors expression, ultimately leading to the identification of biological markers predictive of response to ONCOFID-P-B™ treatment.

Patient samples

Based on the clinical study protocol [ 7 ], biopsy samples of urothelial mucosa were collected during cystoscopy from 20 subjects with BCG-unresponsive CIS +/−Ta-T1, before ONCOFID-P-B™ treatment (baseline) and after the 12-week IP. Patients who achieved a CR (defined as a negative cystoscopy, including negative biopsy of the urothelium, and negative cytology) after the IP entered a subsequent MP of 12 monthly instillations. In these patients, additional biopsies were collected every 3 months to assess the duration of response (Fig.  1 ). Patients were classified as non-responders (NR) when the drug was discontinued or when a positive cystoscopy or cytology confirmed persistent CIS, disease progression or relapse. Disease-free survival (DFS) was calculated from the beginning of treatment to relapse or to the end of the treatment protocol, whichever came first. After the IP, 15/20 patients achieved a CR and, among these, 8 patients still had a CR after the MP. The response rate of the study was 40% (8/20), and the median DFS was 12 months (95% CI 2.5-21.4). The clinical characterization of these patients has been previously reported [ 7 ]. Specimens were fixed in 10% buffered formalin solution for 24 hours and paraffin embedded. Only biopsies containing tumor tissue confirmed by a pathologist were considered and analyzed. The study was conducted in accordance with Good Clinical Practice Guidelines, the World Medical Association Declaration of Helsinki, and the directives of the Committee of the Ministers of EU member states on the use of samples of human origin for research. All patients provided written informed consent. The trial protocol and all amendments were approved by the competent ethical committee at each participating institution [ 7 ].
Four normal bladder samples from the Body Donation Program of the Institute of Human Anatomy of the University of Padova [ 8 ] were collected from anonymous donors who died from causes not attributable to bladder cancer and who matched the gender and age characteristics of the ONCOFID-P-B™-treated patients; the samples were prepared in the same way as the tumor biopsies (Supplementary_Table_ 1 ).

figure 1

Trial profile and sample collection timeline. Created with BioRender.com

Multiplex immunofluorescence

The immune TME and the expression of the most relevant HA receptors on cancer cells were analyzed by multiplex immunofluorescence (mIF) on sequential 4 μm-thick formalin-fixed paraffin-embedded (FFPE) tumor tissue sections using the Opal Polaris 7-Color Automated IHC Detection Kit (Akoya Biosciences, Marlborough, MA, USA). Two custom 9-color staining panels and one 4-color panel were designed to characterize the subsets of tumor-infiltrating immune cells and the expression pattern of HA receptors on cancer cells. For each marker in the panels, the staining conditions were optimized using monoplex-stained slides from positive control tissues, and then re-examined in a multiplex-stained bladder cancer slide. FFPE tumor sections were stained on the BOND-RX autostainer (Leica Microsystems, Wetzlar, Germany); staining conditions are described in Table  1 . At the end of the staining protocols, slides were mounted with the ProLong Diamond antifade mountant (ThermoFisher Scientific, Waltham, MA, USA).

Multispectral imaging

Multiplex-stained slides were acquired using the multispectral microscope Mantra Workstation 2.0 (Akoya Biosciences) at 20X magnification, considering only areas comprising tumor cells. The inForm Image Analysis software (Akoya Biosciences) was used to unmix and analyze multispectral images, and to create algorithms of analysis through its training with a selection of representative fields, as previously reported [ 9 , 10 , 11 , 12 ]. The pan-cytokeratin (CK) staining was used to differentiate infiltrating immune cells within the tumor areas and in the surrounding stroma in the tissue segmentation step. Then, single cells were segmented by nuclear counterstaining, and co-localized cell surface or intracellular markers were used to determine cell phenotypes. For the first, second and third panel we generated four, six and three different algorithms of analysis, respectively, and the relative algorithms were applied in the batch analysis of all acquired multispectral images of the same panel. Cell density data were calculated as the sum of the cells positive for a specific marker, divided by the area analyzed from the same tissue slide. Cell density and cell percentage results refer to the total area analyzed (tumor plus stroma), the intra-tumoral area only or the peri-tumoral stroma only, as indicated.
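As an illustration of the density calculation described above, the following Python sketch sums the marker-positive cells over the analyzed fields of one slide and divides by the total analyzed area. It is not the inForm workflow itself: the function name, the per-field input layout and the example numbers are assumptions made only for this example.

```python
def cell_density(fields):
    """Return the density (cells/mm^2) of marker-positive cells for one slide.

    `fields` is a list of (positive_cell_count, area_mm2) tuples, one per
    analyzed image field. This layout is hypothetical, not an inForm export.
    """
    total_cells = sum(count for count, _ in fields)
    total_area = sum(area for _, area in fields)
    if total_area <= 0:
        raise ValueError("analyzed area must be positive")
    return total_cells / total_area


# Hypothetical example: three fields acquired from the same slide.
example_fields = [(12, 0.25), (30, 0.50), (8, 0.25)]
density = cell_density(example_fields)  # 50 cells over 1.0 mm^2 of tissue
```

The same function could be applied separately to the intra-tumoral and stromal areas, provided counts and areas are restricted to the compartment of interest.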

Spatial metrics analyses

To assess the topological arrangement of immune cells in the bladder cancer microenvironment and cell-to-cell interactions, spatial metrics between cells were calculated using phenoptrReports (add-ins for R Studio from Akoya Biosciences). In particular, the nearest neighbor analysis, which calculates the average distance between each feature’s centroid and the centroid of its nearest neighbor, was used to analyze the mean distance between different cell subtypes. Moreover, the count within analysis was employed to calculate, for each pair of phenotypes, the number of cells of one phenotype having a cell of another phenotype within a specified radius. Since a distance of 20-25 μm between two cell subtypes is considered indicative of an enhanced probability of cell-to-cell contact [ 13 ], we calculated the number of reference cells present within a 20-25 μm radius of a cell with a different phenotype, and expressed it as a percentage of the total number of reference cells. This normalization was used because the absolute number of cell-to-cell contacts might be viewed as an epiphenomenon of cellular density; expressing contacts as a percentage corrects for the different numbers of immune cells present in the bladder TME.
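The two spatial metrics can be sketched in a few lines of Python. This is a minimal illustration operating on cell centroid coordinates in μm, not the phenoptrReports implementation; the function names, the brute-force distance search and the 20 μm default radius are assumptions chosen for this example.

```python
import math


def mean_nearest_neighbor_distance(from_cells, to_cells):
    """Mean distance (um) from each cell in `from_cells` to its nearest
    cell in `to_cells`. Cells are (x, y) centroid coordinates in um."""
    if not from_cells or not to_cells:
        raise ValueError("both phenotypes need at least one cell")
    nearest = [min(math.dist(a, b) for b in to_cells) for a in from_cells]
    return sum(nearest) / len(nearest)


def percent_within_radius(reference_cells, other_cells, radius_um=20.0):
    """Percentage of reference cells that have at least one cell of the
    other phenotype within `radius_um` (normalized 'count within')."""
    if not reference_cells:
        raise ValueError("no reference cells")
    hits = sum(
        1
        for a in reference_cells
        if any(math.dist(a, b) <= radius_um for b in other_cells)
    )
    return 100.0 * hits / len(reference_cells)


# Hypothetical centroids: CD8+ cells as reference, Treg as the other phenotype.
cd8_cells = [(0.0, 0.0), (100.0, 0.0), (10.0, 0.0)]
treg_cells = [(0.0, 15.0), (300.0, 300.0)]
pct_cd8_near_treg = percent_within_radius(cd8_cells, treg_cells)  # 2 of 3 cells
```

Because the result is a percentage of the reference population, two samples with very different CD8+ cell counts remain comparable, which is the motivation given in the text for the normalization.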

Gene expression analysis

Total RNA was extracted from 4 μm-thick FFPE tumor samples obtained before starting ONCOFID-P-B™ treatment, using the RNeasy FFPE kit (Qiagen, Hilden, Germany). RNA quantification was performed with a NanoDrop 1000 spectrophotometer (ThermoFisher Scientific), and RNA integrity and quality were evaluated with the Bioanalyzer 2100 (Agilent, Santa Clara, CA, USA). The PanCancer Immune Profiling panel (NanoString Technologies, Seattle, WA, USA) was used to measure the expression of 770 immune-related genes covering innate and adaptive immune responses [ 14 ]. The panel included 20 housekeeping genes, 8 negative controls and 6 synthetic positive controls. The samples were processed according to the manufacturer’s instructions using the kits provided by NanoString Technologies. Sample RNA was hybridized with panel probes for 19 hours at 65 °C, and the complexes were then processed on the nCounter FLEX platform (NanoString Technologies). Cartridges were scanned at 555 fields of view. Gene expression data were analyzed with the nSolver 4.0 Software (NanoString Technologies), and a quality check was performed. Raw data were normalized as the ratio of each expression value to the geometric mean of all housekeeping genes on the panel. Data were then log2-transformed. The nCounter Advanced Analysis module V.2.0.134 software (NanoString Technologies) was used for differential expression analysis, and to obtain scores for cell type profiling and signature analysis, based on the expression of predefined genes.
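The normalization step can be illustrated with a short Python sketch: each count is divided by the geometric mean of the housekeeping genes of the same sample, and the ratio is log2-transformed. This only mirrors the description in the text and is not the nSolver procedure, which may include further steps; the function name and the gene labels (`HK_A`, `GENE_X`, and so on) are hypothetical.

```python
import math
from statistics import geometric_mean


def normalize_log2(raw_counts, housekeeping_genes):
    """Return log2(count / geometric mean of housekeeping counts) for every
    non-housekeeping gene of one sample.

    `raw_counts` maps gene name -> raw count; all counts must be positive,
    otherwise the ratio or the logarithm is undefined.
    """
    if any(count <= 0 for count in raw_counts.values()):
        raise ValueError("counts must be positive")
    reference = geometric_mean(raw_counts[g] for g in housekeeping_genes)
    return {
        gene: math.log2(count / reference)
        for gene, count in raw_counts.items()
        if gene not in housekeeping_genes
    }


# Hypothetical sample with two housekeeping genes (geometric mean = 8).
sample = {"HK_A": 4, "HK_B": 16, "GENE_X": 32, "GENE_Y": 2}
normalized = normalize_log2(sample, ["HK_A", "HK_B"])
```

In this toy sample, `GENE_X` ends up two log2 units above the housekeeping reference and `GENE_Y` two units below it.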

Analysis of the bladder cancer dataset from The Cancer Genome Atlas (TCGA)

The bladder cancer (BLCA) dataset in the TCGA repository consists of 404 patient samples for which RNA-seq and clinical data are available [ 15 ]. Patient survival data were obtained from the supplementary material of Liu et al. [ 16 ], as suggested by Idogawa and colleagues [ 17 ]. The normalized expression of transcripts in the TCGA bladder cancer dataset was retrieved from the FIREBROWSE web utility ( http://firebrowse.org/?cohort=BLCA ) [ 18 ] and from the Broad Institute TCGA Genome Data Analysis Center ( https://doi.org/10.7908/C11G0KM9 ). To evaluate the intra-tumoral macrophage density, we applied the CIBERSORTx deconvolution method [ 19 ], which estimates the fraction of 22 immune cell types from the expression profiles of bulk RNA-seq samples, including three types of CD68+ macrophages (M0, M1, and M2). We computed gene expression profiles (GEPs) of the TCGA BLCA samples from the normalized transcript isoform expression data. The GEPs were uploaded to the CIBERSORTx online utility, and the resulting immune cell fractions were used to estimate macrophage infiltration. All the analyses were performed with the R programming language v4.3.2. Survival analysis was conducted using the Cox proportional hazards regression model implemented in the survival v3.5-7 R package. Further, the following packages were used: kableExtra v1.4.0, finalfit v1.0.7, riskRegression v2023.12.21, condsurv v1.0.0, tidycmprsk v1.0.0, gtsummary v1.7.2, ggsurvfit v1.0.0, lubridate v1.9.3, ggpubr v0.6.0, ggthemes v5.0.0, ggridges v0.5.5, ggplot2 v3.4.4.

Statistical analysis

All statistical analyses were carried out using GraphPad Prism software (version 7.0) and IBM SPSS Statistics (version 28). Clinical data were correlated with cell densities, cell distribution and cell interactions analyzed at each time point. The non-parametric two-tailed Mann-Whitney test was used to compare variables between two groups. The Wilcoxon rank-sum test was used to compare the level of immune markers before and after the treatment. Statistical differences in HA receptors expression over time were determined with repeated measures 2-way ANOVA and the Holm-Sidak multiple comparison post hoc test. To investigate the association between immune factors and patient outcome, median values were used to dichotomize immune variables into subgroups; then, the Kaplan-Meier method was used to estimate survival curves, and the log-rank test was used to test differences between groups. Moreover, univariate Cox regression modelling for proportional hazards was used to calculate the hazard ratio (HR) and 95% confidence interval (CI) for the association between dichotomized immune variables and patient outcome. For the correlation analyses, the non-parametric Spearman’s correlation coefficient (r) was calculated. Differences in gene expression between tumor samples from responding and non-responding patients were assessed using the t-test. The combined score of intra-tumoral macrophage density and CD44v6 expression was calculated from the estimated coefficient of each variable in a bivariate logistic model for complete response at the end of the treatment: combined score = 1.73 × intra-tumoral CD68 density (0 = low; 1 = high) + 1.212 × CD44v6 expression (0 = low; 1 = high). The performance of the combined score was estimated by determining the area under the receiver operating characteristic (ROC) curve (AUC). All reported p -values are two-sided and p  ≤ 0.05 was considered statistically significant.
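To make the combined score concrete, the sketch below dichotomizes two variables at their median, applies the two coefficients reported in the text (1.73 and 1.212), and computes an AUC as the fraction of correctly ordered patient pairs. Several details are assumptions for illustration only: values strictly above the median are called "high", ties in the score count as half a concordant pair, and the caller decides which outcome group is expected to score higher. The analysis in the paper was run in GraphPad Prism and SPSS, not with this code.

```python
from statistics import median

COEF_CD68 = 1.73     # coefficient reported in the text for CD68 density
COEF_CD44V6 = 1.212  # coefficient reported in the text for CD44v6 expression


def dichotomize(values):
    """Return 1 for values above the median, else 0 (assumed convention)."""
    cutoff = median(values)
    return [1 if v > cutoff else 0 for v in values]


def combined_score(cd68_high, cd44v6_high):
    """Combined score from the two dichotomized (0/1) variables."""
    return COEF_CD68 * cd68_high + COEF_CD44V6 * cd44v6_high


def auc(scores_group_a, scores_group_b):
    """Probability that a random patient from group A scores higher than a
    random patient from group B; ties count as 0.5."""
    concordant = 0.0
    for a in scores_group_a:
        for b in scores_group_b:
            concordant += 1.0 if a > b else 0.5 if a == b else 0.0
    return concordant / (len(scores_group_a) * len(scores_group_b))


# Hypothetical scores for two outcome groups (not study data).
group_a = [combined_score(1, 1), combined_score(1, 0)]
group_b = [combined_score(0, 0), combined_score(1, 0)]
example_auc = auc(group_a, group_b)
```

With two binary inputs the score can take only four values (0, 1.212, 1.73 and 2.942), so the ROC curve has at most a handful of steps.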

The immune TME composition differs between CR patients and non-responders (NR)

We first compared the composition of the immune TME at baseline between patients who did or did not achieve a CR after the IP (CR IP and NR IP , respectively). The densities of CD4+ T cells, B lymphocytes (CD20+ cells) and tumor-associated macrophages (TAMs; CD68+CD163- and CD68+CD163+ cells) were comparable between the two patient groups, while natural killer cells (CD56+) and neutrophils (neutrophil elastase+) were negligible in all patients (Supplementary_Figure_ 1 ). Intriguingly, NR IP patients exhibited a higher intra-tumoral infiltration of CD8+ cytotoxic T lymphocytes (CTLs) than CR IP patients (Fig.  2 a). Nevertheless, in NR IP patients a higher percentage of these CTLs was in proximity to CD4+FoxP3+ T-regulatory cells (Treg; Fig. 2 b) or TAMs (Fig. 2 c-e). Additionally, the analysis of the number and type of cell interactions at increasing distances from the tumor margin revealed that CR IP patients had fewer CTLs in proximity to the tumor edge, but their interactions with immunosuppressive subsets (Treg and TAMs) remained constantly low (Fig. 2 f). Conversely, in NR IP patients the CTLs located within 0 to 20 μm of the tumor edge were more numerous but also closer to Treg or TAMs, and these interactions progressively decreased only with increasing distance from the tumor edge (Fig. 2 f).

figure 2

Characterization of the immune infiltrate in bladder CIS at baseline in patients who did or did not achieve a CR after the intensive phase. a Intra-tumoral density (cells/mm 2 ) of CD8+ T cells. b (Left) Representative image of a bladder CIS sample stained with the first mIF panel. In the crop, the proximity between CD8+ cells (magenta staining) and a CD4+FoxP3+ cell (white and green staining) is highlighted. Original magnification × 20. (Right) Percentage of CD8+ T cells within a radius of 20 μm from CD4+FoxP3+ Treg cells within the tumor regions. c (Left) Representative image of a bladder CIS sample stained with the first mIF panel. The color code is the same as in (b). In the crop, the proximity between a CD68+CD163+ macrophage (grey and orange staining) and CD8+ cells (magenta staining) is highlighted. Original magnification × 20. (Right) Percentage of CD163+ M2-polarized macrophages within a radius of 20 μm from CD8+ T lymphocytes. d Percentage of CD68+CD163- macrophages within a radius of 20 μm from CD8+ T lymphocytes within the tumor regions. e Mean distance (μm) between each CD68+CD163- macrophage and the nearest CD8+ T lymphocyte. Significantly different data are represented by * p  < 0.05. Floating box extends from 25th to 75th percentiles, line through the box indicates median, and bars extend from the smallest to largest values. f Schematic representation of the analysis of the number and type of cell interactions at increasing distances from the tumor margin. Bubble graphs show the percentage of cell-to-cell interactions (dimension of the bubbles) at increasing distances from the tumor margin in CR IP and NR IP patients

In the biopsies collected after the IP, the TME of CR IP patients was enriched in CD4+FoxP3- T cells as compared to NR IP patients (Fig.  3 a,b). Moreover, the mean distance between CD4+ T lymphocytes and CK+ cells (Fig. 3 c) or CD8+ T cells (Fig. 3 d) was shorter in CR IP patients. On the other hand, NR IP patients presented more abundant CD68+CD163+ TAMs in the stromal compartment (Fig. 3 e), and closer interactions between macrophages and CK+ cells (Fig. 3 f) or CTLs (Fig. 3 g), as compared to CR IP patients.

figure 3

The impact of the TME contexture on patient response to ONCOFID-P-B™. a-g Characterization of the immune infiltrate in bladder CIS collected after the intensive phase in patients who did or did not achieve a CR after the intensive phase. h-k Characterization of the immune infiltrate in bladder CIS collected at baseline in patients who did or did not achieve a CR at the end of the 15-month study. Significantly different data are represented by * p  < 0.05, ** p  < 0.01. Floating box extends from 25th to 75th percentiles, line through the box indicates median, and bars extend from the smallest to largest values. l-p Kaplan-Meier survival curves for disease-free survival according to the immune cell composition and cell-to-cell interactions at baseline in ONCOFID-P-B™-treated bladder CIS patients. The median cut-off of each immune variable was used to separate high and low infiltrated groups. Log-rank p values, hazard ratios (HR) and 95% confidence intervals (CI) are reported in each graph

Finally, we compared the TME of the biopsies collected at baseline according to the clinical response reached at the end of the study (CR end and NR end ). In this case, both intra-tumoral Treg and macrophages were more abundant in NR end samples (Fig. 3 h,i), with macrophages being closer to tumor cells (Fig. 3 j) and CTLs (Fig. 3 k).

A higher Treg/T cell ratio and a shorter mean distance between CTLs and Treg were associated with a shorter DFS (Fig. 3 l,m). Furthermore, a longer DFS was associated with an overall lower density of macrophages (Fig. 3 n) and, in particular, of intra-tumoral TAMs (Fig. 3 o). Moreover, a higher percentage of tumor cells in close proximity to macrophages was associated with a shorter DFS (Fig. 3 p). Collectively, these observations suggest that it is not the mere presence of CD8+ T cells within the TME, but rather the interactions between these T lymphocytes and immunosuppressive cells, which limit their anti-tumoral activity, that play a key role in bladder CIS progression. Moreover, we identified a key negative predictive role for Treg and TAMs in the patient response to ONCOFID-P-B™.

HA receptors expression differs between responding and non-responding patients

We examined the expression and distribution of the principal HA receptors, namely CD44 as the standard isoform (CD44s) and its most represented variants (CD44v3, CD44v6, CD44v9), ICAM-1 and RHAMM (Fig.  4 ). The staining revealed differential expression patterns among HA receptors in bladder CIS: CD44s was preferentially expressed in the basal urothelial cell layer and lamina propria, CD44v6 in the basal urothelial cell layer, while CD44v3 and CD44v9 were strongly expressed in the basal and intermediate urothelial cell layers (Fig. 4 a). ICAM-1 and RHAMM were distributed throughout the urothelium (Fig. 4 b).

figure 4

Assessment of HA receptors expression and distribution in bladder CIS samples. a Representative 9-color multispectral image of the second mIF panel. Markers and color codes are indicated in the figure. Single markers assessment of the CD44 variants is depicted around the merged image. b Representative 4-color multispectral image of the third mIF panel. Markers and color codes are indicated in the figure. Single markers assessment of the ICAM-1 and RHAMM is depicted on the right of the merged image. Original magnification 20x

In all baseline biopsies we found the expression of at least one CD44 isoform on tumor tissue, with the different variants being often co-expressed by cancer cells and CD44v9 the most represented (Fig.  5 a). ICAM-1 and RHAMM were expressed in the majority of tumor cells (Fig. 5 a). Only a small proportion of CD44 isoform-expressing cancer cells was also positive for the Ki-67 proliferation marker (Supplementary_Figure_ 2 a).

figure 5

Correlation between HA receptors expression at baseline and patient response to ONCOFID-P-B™. a Percentage of tumor cells expressing each HA receptor at baseline. b-c Expression (counts/mm 2 ) of each HA receptor at baseline according to clinical response (b) after the 12 weekly ONCOFID-P-B™ instillations of the intensive phase and (c) at the end of the 15-month study. Significantly different data are represented by * p  < 0.05, ** p  < 0.01. Floating box extends from 25th to 75th percentiles, line through the box indicates median, and bars extend from the smallest to largest values. d Kaplan-Meier survival curves for disease-free survival according to the expression of CD44v6 at baseline in ONCOFID-P-B™-treated bladder CIS patients. The median cut-off of each variable was used to separate high and low groups. Log-rank p values, hazard ratios (HR) and 95% confidence intervals (CI) are reported in the graph. e Correlation between CD44v6 expression and the density of total (Spearman r = 0.5193 95% CI 0.07 to 0.79; p  = 0.022) or intra-tumoral CD68+ macrophages (r = 0.54 95% CI 0.09 to 0.80; p  = 0.016). f Kaplan-Meier curves for disease-free survival stratifying patients according to the expression of CD44v6 and the intra-tumoral CD68 density. The median value of each variable was used as cut-off to identify high and low subgroups. Log-rank p values, hazard ratios (HR) and 95% confidence intervals (CI) are reported in each graph. g Receiver operating characteristic (ROC) curve showing the performance of the combined CD44v6 expression and intra-tumoral CD68+ cell density in predicting patient response to ONCOFID-P-B™

Among the CD44 isoforms considered independently, NR IP turned out to express significantly higher levels of CD44v6 as compared to CR IP patients (Fig. 5 b). Moreover, NR IP showed a higher expression of both ICAM-1 and RHAMM, as compared to CR IP patients (Fig. 5 b). This trend was maintained in CR end and NR end patients (Fig. 5 c). Accordingly, the only variable with a predictive value was CD44v6, as patients with a high CD44v6 expression had a shorter DFS as compared to patients with a low expression of the isoform (Fig. 5 d and Supplementary_Figure_ 2 b).

Moreover, CD44v6 expression correlated directly with the density of CD68+ macrophages (Spearman r = 0.5193 95% CI 0.07 to 0.79; p  = 0.022), and in particular with that of intra-tumoral macrophages (Spearman r = 0.54 95% CI 0.09 to 0.80; p  = 0.016; Fig. 5 e). Thus, patients were divided into two groups (CD44v6 low /CD68 low versus CD44v6 high /CD68 high ) depending on CD44v6 expression level and intra-tumoral CD68+ macrophage density, and the groups were then compared for DFS. CD44v6 high /CD68 high patients had a significantly worse prognosis than CD44v6 low /CD68 low patients (Fig. 5 f). The derived integrated score had an AUC of 0.85 (95%CI 0.68–1.00) for patient response prediction (Fig. 5 g). Therefore, the combined evaluation of both CD44v6 expression and intra-tumoral macrophage density provided a biomarker with increased predictive value for patient response to ONCOFID-P-B™.

ONCOFID-P-B™ modulates immune cell populations and HA receptors expression

We then analyzed the changes in immune subsets and HA receptors expression on tumor cells induced by ONCOFID-P-B™ treatment. CTL, Treg and B lymphocyte densities were minimally modified by ONCOFID-P-B™ treatment in both CR end and NR end patients (Fig.  6 a-c). On the other hand, CD4+FoxP3- T cells increased after the IP only in responding patients, and returned to the basal level after the MP (Fig. 6 d). Conversely, while ONCOFID-P-B™ treatment induced very limited variations in macrophage densities, we observed a trend toward an increase in TAMs when patients relapsed during the MP (NR MP ) (Fig. 6 e,f).

figure 6

Changes in immune subsets and HA receptors expression on tumor cells induced by ONCOFID-P-B™ treatment. Density (cells/mm 2 ) of ( a-f ) immune cell populations and ( g-l ) tumor cells expressing each HA receptor at baseline, after the intensive phase (IP) and during or after the maintenance phase (MP) in responding and non-responding patients. Significantly different data are represented by * p  < 0.05 and ** p  < 0.01. Floating box extends from 25th to 75th percentiles, line through the box indicates median, and bars extend from the smallest to largest values

Regarding HA receptors expression, all CD44 isoforms were affected by the treatment in both patient groups, a feature that likely reflects a direct interaction between ONCOFID-P-B™ and HA receptors. Indeed, the density of tumor cells expressing CD44s was significantly increased in CR end patients after the MP (Fig. 6 g), while CD44v3 progressively decreased throughout the treatment protocol (Fig. 6 h). Moreover, CD44v6 and CD44v9 were increased in both patient groups at the end of the treatment, albeit not significantly (Fig. 6 i,j). In contrast, the changes in ICAM-1 and RHAMM expression induced by ONCOFID-P-B™ were very limited in both responding and non-responding patients (Fig. 6 k,l).

In CR end patients, a normal bladder immune contexture is re-established after the maintenance phase

Four normal bladder samples were also collected, stained and analyzed to compare their immune infiltrate and HA receptors expression pattern with those observed in bladder CIS samples. Baseline bladder CIS specimens had a higher density of Treg, B lymphocytes and macrophages as compared to normal bladder (Fig.  7 a). Normal epithelial cells stained moderately for CD44v9 and RHAMM, low for CD44s and ICAM-1, and negligibly for CD44v3 and CD44v6 isoforms (Fig. 7 a).

figure 7

Comparison of immune TME and HA receptors expression between bladder CIS and normal bladder samples. a Density (cells/mm2) of immune cell populations and tumor cells expressing each HA receptor in bladder CIS collected at baseline and in normal bladders. b Density (cells/mm2) of immune cell populations and tumor cells expressing each HA receptor in bladder CIS samples collected after the IP and in normal bladders. Bladder CIS patients were grouped according to the clinical response (NR IP : patients who did not respond to ONCOFID-P-B™ treatment after the intensive phase; CR end : patients with a complete pathological response at the end of the 15-month study; NR MP : patients who relapsed during the maintenance phase). c Density (cells/mm2) of immune cell populations and tumor cells expressing each HA receptor in bladder CIS samples collected after the MP and in normal bladders. Bladder CIS patients were grouped according to the clinical response (CR end : patients with a complete pathological response at the end of the 15-month study; NR MP : patients who relapsed during the maintenance phase). Significantly different data are represented by * p  < 0.05 and ** p  < 0.01. Floating box extends from 25th to 75th percentiles, line through the box indicates median, and bars extend from the smallest to largest values

The immune contexture of bladder CIS samples collected after the IP appeared similar to that observed in normal bladders (Fig. 7 b). Similarly, in CR end and NR MP patients, the differences in the expression of HA receptors relative to normal bladders tended to diminish, with the exception of CD44v3, which remained elevated (Fig. 7 b). However, in NR IP patients the expression of CD44v3, CD44v9, ICAM-1 and RHAMM was higher than in CR end patients and normal bladders (Fig. 7 b).

Finally, in biopsies collected during or after the MP, the density of infiltrating immune cells in CR end patients was comparable to that in normal bladders (Fig. 7 c). Conversely, in NR MP patients, Treg and macrophages remained elevated as compared to normal bladders (Fig. 7 c). Moreover, we observed a trend toward a higher expression of HA receptors in CR end and NR MP patients as compared to normal bladders (Fig. 7 c).

Gene-based cell types and signatures are differentially expressed between CR end and NR end patients, and between patients with high or low CD44v6 expression

We investigated potential differences in gene expression in baseline biopsies between CR end and NR end patients. Because very little tumor tissue was available, only 3 CR end and 3 NR end samples passed the quality controls and were therefore considered for the subsequent gene expression analysis. In CR end patients, a trend toward a higher expression of the ALCAM, ITGAE and CXCL16 genes was observed (Supplementary_Figure_ 3 a). Based on the expression of the predefined cell-type and signature-associated genes present in the panel, we found a trend toward a higher expression of genes associated with CD8 T cells, Th1 cells, Treg cells, exhausted CD8 cells and macrophages in NR end patients as compared to CR end patients (Supplementary_Figure_ 3 b). Moreover, several gene signatures were differentially regulated between CR end and NR end patients (Supplementary_Figure_ 3 c). To validate these results, the TME of these selected patients was analyzed in terms of immune cell infiltration and spatial distribution, and HA receptors expression. A trend toward a higher infiltration of Treg cells and TAMs was found in NR end patients as compared to CR end patients (Supplementary_Figure_ 3 d), as well as a higher percentage of CD8+ T cells in close proximity to Treg cells and TAMs (Supplementary_Figure_ 3 e). Moreover, NR end patients showed a trend toward a higher expression of HA receptors, except for CD44v9 (Supplementary_Figure_ 3 f).

In addition, we stratified the 6 patients according to the expression of CD44v6 (higher or lower than the median), and found that patients with a higher expression of CD44v6 had a lower ratio between TILs-related and exhausted CD8-related genes (Supplementary_Figure_ 3 g). Moreover, in patients with higher CD44v6 levels, genes related to regulation, chemokines, macrophage functions and T cell functions were overexpressed as compared to patients with lower CD44v6 levels. Conversely, genes related to transporter functions, tumor-inflammation signature, cytotoxicity, antigen processing and adhesion were downregulated in patients with higher expression of CD44v6 (Supplementary_Figure_ 3 h).

The combined evaluation of CD44v6 coding transcript and estimated macrophage infiltration is an independent prognostic biomarker in the TCGA bladder cancer cohort

We considered the whole TCGA BLCA dataset, which includes both clinical and RNA-seq data of 404 patients, to correlate the expression of the transcript encoding the CD44v6 isoform and the CD68+ cell fraction in tumor samples with the clinical features of patients (Fig.  8 a). Of note, the deconvolution analysis of bulk RNA-seq samples using CIBERSORTx allowed us to estimate the fraction of 22 immune cell types, including three subsets of CD68+ macrophages (M0, M1, and M2).

figure 8

The prognostic value of CD44v6 transcript and estimated intra-tumoral macrophages in the TCGA Bladder Cancer (BLCA) cohort. a Flowchart of the analysis performed on the TCGA BLCA clinical and transcript expression data. b , c Kaplan-Meier survival curves for overall survival of the TCGA BLCA samples considering combined expression levels of CD44v6 transcript and b ) estimated total fraction of CD68+ cells or c ) the M0 macrophage fraction. High and low levels were computed according to the median in the dataset

In the TCGA cohort, stage I tumors were quite rare (5 cases, 1.2%) and were therefore grouped with stage II cancers (overall 131 cases, 32.4%), while stages III and IV accounted for 34.7 and 32.9%, respectively (Supplementary_Table_ 2 ).

Patient stratification according to the median expression value of the CD44v6 isoform-coding transcript showed that high expression of CD44v6 was associated with a non-significant trend toward increased risk (HR 1.22, 95% CI 0.91-1.65, p  = 0.179; Supplementary_Table_ 2 ). Moreover, we stratified patients according to the median value of all estimated CD68+ macrophages in the tumor samples (M0 + M1 + M2). In univariate analysis, CD68 high cases had a significantly worse prognosis than CD68 low patients (HR 1.61, 95% CI 1.19-2.18, p  = 0.002) (Supplementary_Table_ 2 ). Notably, the combination of the two factors showed that patients with CD44v6 high /CD68 high had a significantly worse prognosis than CD44v6 low /CD68 low patients (HR 2.02, 95% CI 1.30-3.15, p = 0.002; Fig. 8 b and Supplementary_Table_ 2 ), with a median survival of 19.4 months versus 86.8 months, respectively. The prognostic value of the combination remained significant in multivariate analysis that also considered tumor stage and patient age (HR 1.85, range 1.18-2.89, p  = 0.007), both significant predictors of outcome [ 20 ] (Supplementary_Table_ 2 and Supplementary_Figure_ 4 ).
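The stratification and the median survival comparison can be sketched as follows. The example sums the three estimated macrophage fractions per sample, splits both variables at their median, and includes a plain Kaplan-Meier estimator from which a median survival time can be read. It is a toy illustration in Python, not the R workflow used in the study; the function names, the "strictly above the median is high" rule and all numbers are assumptions.

```python
from statistics import median


def split_high_low(values):
    """Label each value 'high' if above the median, else 'low' (assumed rule)."""
    cutoff = median(values)
    return ["high" if v > cutoff else "low" for v in values]


def kaplan_meier(times, events):
    """Return [(time, survival)] at each event time.
    `events` holds 1 for an observed event and 0 for censoring."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical samples: CD44v6 transcript level and (M0, M1, M2) fractions.
cd44v6 = [1.0, 5.0, 2.0, 6.0]
macrophages = [(0.05, 0.01, 0.02), (0.20, 0.05, 0.05),
               (0.02, 0.01, 0.01), (0.15, 0.05, 0.10)]
cd68_total = [sum(fractions) for fractions in macrophages]
groups = [f"{a}/{b}" for a, b in
          zip(split_high_low(cd44v6), split_high_low(cd68_total))]

# Hypothetical follow-up in months for one group (1 = event, 0 = censored).
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 1, 1, 0, 1])
toy_median = median_survival(curve)
```

Only the high/high and low/low groups are contrasted in the text; samples with discordant labels would form two further groups under this scheme.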

Since the deconvolution analysis allowed us to estimate the fraction of three distinct macrophage populations (M0, M1 and M2), we investigated the prognostic role of these subtypes in more detail. We observed that the M0 population was linked with an increased risk (HR = 2.07, 95% CI 1.52-2.82, p  < 0.001), whereas the M1 and the rarest M2 macrophage fractions did not have prognostic relevance (Supplementary_Table_ 3 ). When the evaluation of CD44v6 transcript expression and M0 macrophages was combined, patients with CD44v6 high /M0 high had a significantly higher risk than patients with CD44v6 low /M0 low (HR 2.53, 95% CI 1.64-3.92, p < 0.001; Fig. 8 c and Supplementary_Table_ 3 ), and this risk was even higher than that obtained when considering the CD68+ cells altogether. The risk remained significantly higher in multivariate analysis with age and stage (HR 2.21, 95% CI 1.42-3.46, p < 0.001) (Supplementary_Table_ 3 and Supplementary_Figure_ 5 ).

The treatment strategy with ONCOFID-P-B™ relies on the interaction of HA with its receptors, followed by internalization of the conjugate within tumor cells and the subsequent release of paclitaxel in its active form. In this regard, CD44 is a therapeutically interesting HA receptor, since it is known to be upregulated in cancer-initiating or metastasizing cells and to be involved in the epithelial-mesenchymal transition (EMT), cancer cell survival and drug resistance [ 21 , 22 ]. However, this rational approach to CD44 targeting by HA may be complicated by i) the alternative splicing of CD44, which produces multiple variants that differ in their affinity for HA and activate different signalling pathways [ 5 ], and ii) the unclear relation between CD44 expression, HA binding and internalization [ 23 , 24 ].

Evidence on the role of CD44 in tumor-related clinical outcomes is still contradictory [ 25 , 26 , 27 , 28 , 29 ], and further investigations are therefore required to fully clarify the specific role of the different CD44v in patient prognosis. Indeed, CD44 functions are regulated by a delicate balance of different factors, such as a minimal degree of glycosylation and optimal density values, above which the internalization process may slow down [ 30 ]. Moreover, HA internalization is a complex phenomenon of endocytic recognition, likely mediated by the formation of protein complexes [ 31 ]. In this regard, we identified CD44v6 expression in bladder CIS patients at baseline as a negative predictive factor for response to ONCOFID-P-B™ treatment, in line with the observation that HT-29 tumor cells, which highly express the CD44v6 isoform, have a poorer internalization ability [ 23 ]. This is likely due to the formation of protein complexes with c-Met and Hepatocyte Growth Factor, which reduce CD44v6 endocytic performance while activating a signalling pathway leading to cancer invasiveness and metastatic spread [ 32 , 33 ]. Moreover, CD44v6 is considered a critical marker of cancer-initiating/stem cells, given its role in niche formation [ 34 ], apoptosis resistance [ 35 ], EMT [ 36 ], and tumor progression and metastatic invasion [ 37 ]. Accordingly, a negative prognostic role for the CD44v6 isoform has been described for bladder, lung, breast, gastric and colon cancers [ 38 , 39 , 40 , 41 , 42 ]. All these features suggest that CD44v6-expressing bladder CIS may be intrinsically more aggressive and less susceptible to ONCOFID-P-B™ because of a reduced uptake of the conjugate.

Beyond HA receptor expression, the crosstalk between immune components and neoplastic cells is crucial for tumor progression. We report a negative predictive role for intra-tumoral TAMs in response to ONCOFID-P-B™ treatment, and a positive correlation between CD44v6 expression and intra-tumoral macrophage density. Accordingly, the combined evaluation of CD44v6 expression and intra-tumoral TAMs revealed a stronger predictive value for the stratification of patients with a high risk of recurrence. These observations are in line with Rao et al., who uncovered a reciprocal interaction between TAMs and CD44-positive colorectal cancer (CRC) cells during tumorigenesis [ 43 ]. Indeed, CD44-positive cells were found to promote the secretion of high levels of osteopontin (OPN) by macrophages; OPN in turn binds to CD44 expressed by tumor cells, promoting clonal growth via the activation of the JNK pathway, invasion and metastasis. The authors also showed that the combination of OPN and CD44v6 transcripts negatively correlated with CRC patient survival. Accordingly, the interaction between macrophage-secreted OPN and CD44v6 has been reported to drive cancer progression and metastasis in several cancers [ 44 , 45 , 46 ]. Moreover, OPN promotes stem cell-like properties and radiation resistance in adjacent tumor cells via activation of CD44 signalling [ 47 ]. Thus, perturbing the OPN–CD44 axis has been proposed as a therapeutic strategy to treat patients with metastatic bladder cancer [ 48 ]. Additionally, myeloid- and tumor cell–released OPN acts as an immune checkpoint that suppresses CTL activation and confers host tumor immune tolerance and immune evasion [ 49 ]. Taken together, these data may explain the apparently contradictory result that non-responders have more intra-tumoral CTL than responders. In this regard, Baras et al. observed that a favorable association between the level of CD8+ T cells and the outcome of patients with bladder cancer can depend on the presence of other immune cell populations, including FoxP3+ Treg cells [ 50 ]. Accordingly, we found that a higher fraction of CD8+ T cells in NR patients were close to TAMs and FoxP3+ cells, which leads us to assume that macrophages and Tregs act as immunosuppressive populations limiting CTL anti-tumor activity and patient response to ONCOFID-P-B™.

Additionally, we demonstrated that the combined evaluation of the CD44v6-coding transcript and macrophages, in particular the M0 subtype, also has prognostic value in the TCGA BLCA dataset. In ovarian cancer and glioblastoma, transcriptomic and proteomic profiling demonstrated that M0 macrophages show high expression of M2 markers and a transcriptional profile more similar to that of M2 macrophages [ 51 , 52 ]. Moreover, M0 macrophages have been found to be one of the cell subsets most strongly associated with poor outcome in breast cancer [ 53 ], prostate cancer [ 54 ], lung adenocarcinoma [ 55 ], and bladder cancer [ 56 , 57 ]. In addition, Wei and colleagues reported that M0 macrophages secrete OPN, which acts as a chemokine for pro-tumoral monocytes and macrophages (i.e. M0 and M2) in glioblastoma [ 58 ]. Notably, almost all patients included in the TCGA BLCA dataset had tumor stages higher than stage I; therefore, results from this analysis support the concept that the combined evaluation of macrophages plus the CD44v6 isoform can also be adopted as a prognostic biomarker in urothelial cancer cohorts with advanced tumor stages, strengthening the potential of our findings.

Collectively, our results suggest that the complex reciprocal interactions between HA receptors on tumor cells, immunosuppressive cells/molecules and CTL infiltrating the TME play a key role in the clinical response to ONCOFID-P-B™ in bladder CIS patients. We also analyzed the variations induced by ONCOFID-P-B™ in the TME, taking into consideration that our patient cohort was not treatment-naïve but had already received BCG within 6 months of the start of ONCOFID-P-B™ therapy. Indeed, the effects of BCG on the TME were likely still appreciable in the baseline bladder CIS samples, since these appeared more infiltrated than normal bladder samples. In this regard, BCG therapy has been reported to exert pleiotropic effects, including the enhancement of the effector functions of tumor-specific CD4+ T cells [ 59 ]. Interestingly, CD4+ T lymphocytes further increased and were closer to epithelial or CD8+ T cells in CR patients after the IP. These are all signs of an activated and tumor-specific immune response that could have been directly fostered by ONCOFID-P-B™ action, as the HA moiety has intrinsic immunomodulatory effects. Similarly, Kates et al. reported a more pronounced infiltration of antitumor immunophenotypes in two BCG-naïve non-muscle invasive bladder cancer patients responding to a microparticle docetaxel experimental drug [ 60 ].

Like other cancers associated with long-term carcinogenic exposure, such as non-small cell lung cancer and melanoma, urothelial bladder cancer is known to harbor a relatively high tumor mutational burden (TMB) [ 61 ]. High TMB is associated with benefit from immunotherapy with BCG in non-muscle invasive bladder cancer [ 62 ]. Moreover, urothelial carcinomas with high TMB exhibit several molecular defects that could be exploited for combinatorial treatments [ 63 ]. A meta-analysis interrogating datasets of 33 cancer types from TCGA revealed that CD44 expression is negatively associated with TMB in bladder cancer [ 64 ]. In light of these observations, it would have been interesting to include such information in our work, but this genomic analysis was precluded by the paucity of the available biopsy material. Nevertheless, this limitation could be prospectively overcome by analysing the samples from an ongoing phase III, single-arm clinical study that aims to evaluate the efficacy and safety of ONCOFID-P-B™ administered intravesically to patients with BCG-unresponsive CIS of the bladder with or without Ta-T1 papillary disease (NCT05024773).

Overall, our data highlight the powerful intrinsic activity of ONCOFID-P-B™ and allow us to advance a potential rationale for combinatorial treatments with different drugs: i) targeting the macrophage-CD44 axis or depleting the macrophage/Treg compartment could increase tumor surveillance by CD8+ T cells and make bladder CIS more responsive to ONCOFID-P-B™ treatment; ii) the combined instillation of both BCG and ONCOFID-P-B™ in bladder CIS patients might result in a synergistic effect that improves clinical activity, because HA binding by effector T cells may help their recruitment to inflammatory sites and may improve their survival and function [ 65 ]; iii) since PD-1/PD-L1 checkpoint expression has been reported to increase in BCG-resistant patients [ 66 ], a checkpoint blockade therapy could remove the immunosuppressive constraints in the TME and allow ONCOFID-P-B™ to be effective even in NR patients.

In conclusion, we propose that a thorough analysis of both HA receptors and the immune TME can provide more informative hints to predict bladder CIS response to ONCOFID-P-B™. This is exemplified in particular by the combined evaluation of intra-tumoral macrophage density and CD44v6 expression, a potentially new biomarker that showed high sensitivity and specificity for response prediction and that can be easily reproduced by classical immunohistochemistry in the clinical setting. Although the combined score we propose appears promising, we are aware that it requires further validation in a larger cohort of patients, such as those enrolled in the ongoing phase III, single-arm NCT05024773 clinical study.

Availability of data and materials

The data that support the findings of this study are available from Fidia Farmaceutici SpA but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Fidia Farmaceutici SpA.

Abbreviations

Carcinoma in situ

Bacillus Calmette-Guerin

Hyaluronic acid

CD44 variants

Complete response

Intensive phase

Maintenance phase

Tumor microenvironment

Non-responders

Disease-free survival

Multiplex Immunofluorescence

Formalin-fixed paraffin-embedded

Pan-cytokeratin

Hazard ratio

Confidence interval

Area under the receiving operator curve

Tumor-associated macrophages

Cytotoxic T lymphocytes

T-regulatory cells

Epithelial-mesenchymal transition

Colorectal cancer

Osteopontin

Bladder Cancer dataset

Gene expression profiling

Tumor mutational burden

Han J, Gu X, Li Y, Wu Q. Mechanisms of BCG in the treatment of bladder cancer-current understanding and the prospect. Biomed Pharmacother. 2020;129

Vitiello A, Ferrara F, Lasala R, Zovi A. Precision medicine in the treatment of locally advanced or metastatic urothelial Cancer: new molecular targets and pharmacological therapies. Cancers (Basel). 2022;14:1–14.

Huang G, Huang H. Application of hyaluronic acid as carriers in drug delivery. https://doi.org/10.1080/10717544.2018.1450910 .

Montagner IM, et al. Paclitaxel-hyaluronan hydrosoluble bioconjugate: mechanism of action in human bladder cancer cell lines. Urol Oncol. 2013;31:1261–9.

Chen C, Zhao S, Karnad A, Freeman JW. The biology and role of CD44 in cancer progression: therapeutic implications. J Hematol Oncol. 2018;111(11):1–23.

Bassi PF, et al. Paclitaxel-hyaluronic acid for intravesical therapy of bacillus Calmette-Guérin refractory carcinoma in situ of the bladder: results of a phase I study. J Urol. 2011;185:445–9.

Hurle R, et al. Oncofid-P-B: a novel treatment for BCG unresponsive carcinoma in situ (CIS) of the bladder: results of a prospective European multicentre study at 15 months from treatment start. Urol Oncol Semin Orig Investig. 2021; https://doi.org/10.1016/j.urolonc.2021.07.007 .

Porzionato A, et al. Quality management of body donation program at the University of Padova. Anat Sci Educ. 2012;5:264–72.

Tosi A, et al. The immune cell landscape of metastatic uveal melanoma correlates with overall survival. J Exp Clin Cancer Res. 2021;40:1–17.

Tosi, A. et al. Reduced Interleukin-17-Expressing Cells in Cutaneous Melanoma. 1–17 (2021).

Dieci MV, et al. Neoadjuvant chemotherapy and immunotherapy in luminal B-like breast Cancer: results of the phase II GIADA trial. Clin Cancer Res. 2022;28:308–17.

Narducci MG, et al. Reduction of T lymphoma cells and immunological invigoration in a patient concurrently affected by melanoma and Sezary syndrome treated with Nivolumab. Front Immunol. 2020;11

Carstens JL, et al. Spatial computation of intratumoral T cells correlates with survival of patients with pancreatic cancer. Nat Commun. 2017;8:15095.

Tosi A, et al. The immune microenvironment of HPV-positive and HPV-negative oropharyngeal squamous cell carcinoma: a multiparametric quantitative and spatial analysis unveils a rationale to target treatment-naïve tumors with immune checkpoint inhibitors. J Exp Clin Cancer Res. 2021;41:279.

Robertson AG, et al. Comprehensive molecular characterization of muscle-invasive bladder Cancer. Cell. 2017;171:540–556.e25.

Liu J, et al. An integrated TCGA Pan-Cancer clinical data resource to drive high-quality survival outcome analytics. Cell. 2018;173:400–416.e11.

Idogawa M, et al. Dead or alive? Pitfall of survival analysis with TCGA datasets. Cancer Biol Ther. 2021;22:527–8.

Deng M, Brägelmann J, Kryukov I, Saraiva-Agostinho N, Perner S. FirebrowseR: an R client to the broad Institute’s firehose pipeline. Database. 2017;2017:1–6.

Newman AM, et al. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat Biotechnol. 2019;37:773–82.

Kluth LA, et al. Prognostic and prediction tools in bladder Cancer: a comprehensive review of the literature. Eur Urol. 2015;68:238–53.

Yaghobi Z, et al. The role of CD44 in cancer chemoresistance: a concise review. Eur J Pharmacol. 2021;903:174147.

Misra S, et al. Hyaluronan-CD44 interactions as potential targets for cancer therapy. FEBS J. 2011;278:1429–43.

Rios De La Rosa JM, et al. Binding and internalization in receptor-targeted carriers: the complex role of CD44 in the uptake of hyaluronic acid-based nanoparticles (siRNA delivery). Adv Healthc Mater. 2019;8:1901182.

Rios de la Rosa JM, Tirella A, Gennari A, Stratford IJ, Tirelli N. The CD44-mediated uptake of hyaluronic acid-based carriers in macrophages. Adv Healthc Mater. 2017;6:1–11.

Roosta Y, Sanaat Z, Nikanfar AR, Dolatkhah R, Fakhrjou A. Predictive value of CD44 for prognosis in patients with breast Cancer. Asian Pac J Cancer Prev. 2020;21:2561–7.

Zanjani LS, et al. Increased expression of CD44 is associated with more aggressive behavior in clear cell renal cell carcinoma. Biomark Med. 2017;12:45–61. https://doi.org/10.2217/bmm-2017-0142 .

Liu Y, Wu T, Lu D, Zhen J, Zhang L. CD44 overexpression related to lymph node metastasis and poor prognosis of pancreatic cancer. Int J Biol Markers. 2018;33:308–13.

Bartakova A, et al. CD44 as a cancer stem cell marker and its prognostic value in patients with ovarian carcinoma. J Obstet Gynaecol (Lahore). 2018;38:110–4.

Tsidulko AY, et al. Prognostic relevance of NG2/CSPG4, CD44 and Ki-67 in patients with glioblastoma. Tumour Biol. 2017;39

Skelton TP, Zeng C, Nocks A, Stamenkovic I. Glycosylation provides both stimulatory and inhibitory effects on cell surface and soluble CD44 binding to hyaluronan. J Cell Biol. 1998;140:431–46.

Thankamony SP, Knudson W. Acylation of CD44 and its association with lipid rafts are required for receptor and Hyaluronan endocytosis. J Biol Chem. 2006;281:34601–9.

Orian-Rousseau V, Chen L, Sleeman JP, Herrlich P, Ponta H. CD44 is required for two consecutive steps in HGF/c-met signaling. Genes Dev. 2002;16:3074–86.

Hasenauer S, et al. Internalization of met requires the co-receptor CD44v6 and its link to ERM proteins. PLoS ONE. 2013;8:e62357.

Prasetyanti PR, et al. Regulation of stem cell self-renewal and differentiation by Wnt and notch are conserved throughout the adenoma-carcinoma sequence in the colon. Mol Cancer. 2013;12:126.

Vlashi E, Pajonk F. Cancer stem cells, Cancer cell plasticity and radiation therapy. Semin Cancer Biol. 2015;0:28.

Wells A, Chao YL, Grahovac J, Wu Q, Lauffenburger DA. Epithelial and mesenchymal phenotypic switchings modulate cell motility in metastasis. Front Biosci (Landmark Ed). 2011;16:815–37.

ElShamy WM, Duhé RJ. Overview: cellular plasticity, cancer stem cells and metastasis. Cancer Lett. 2013;341:2–8.

Omran OM, Ata HS. CD44s and CD44v6 in diagnosis and prognosis of human bladder cancer. Ultrastruct Pathol. 2012;36:145–52.

Jiang H, Zhao W, Shao W. Prognostic value of CD44 and CD44v6 expression in patients with non-small cell lung cancer: meta-analysis. Tumour Biol. 2014;358(35):7383–9.

Qiao GL, Song LN, Deng ZF, Chen Y, Ma LJ. Prognostic value of CD44v6 expression in breast cancer: a meta-analysis. Onco Targets Ther. 2018;11:5451–7.

Fang M, et al. CD44 and CD44v6 are correlated with gastric Cancer progression and poor patient prognosis: evidence from 42 studies. Cell Physiol Biochem. 2016;40:567–78.

Ma L, Dong L, Chang P. CD44v6 engages in colorectal cancer progression. Cell Death Dis. 2018;101(10):1–13.

Rao G, et al. Reciprocal interactions between tumor-associated macrophages and CD44-positive cancer cells via osteopontin/CD44 promote tumorigenicity in colorectal cancer. Clin Cancer Res. 2013;19:785–97.

Shi J, Zhou Z, Di W, Li N. Correlation of CD44v6 expression with ovarian cancer progression and recurrence. BMC Cancer. 2013;13:1–10.

Khan SA, et al. Enhanced cell surface CD44 variant (v6, v9) expression by osteopontin in breast cancer epithelial cells facilitates tumor cell migration: novel post-transcriptional, post-translational regulation. Clin Exp Metastasis. 2006;228(22):663–73.

Sun SJ, et al. Integrin β3 and CD44 levels determine the effects of the OPN-a splicing variant on lung cancer cell growth. Oncotarget. 2016;7:55572–84.

Pietras A, et al. Osteopontin-CD44 signaling in the glioma perivascular niche enhances cancer stem cell phenotypes and promotes aggressive tumor growth. Cell Stem Cell. 2014;14:357–69.

Ahmed M, et al. An Osteopontin/CD44 Axis in RhoGDI2-mediated metastasis suppression. Cancer Cell. 2016;30:432–43.

Klement JD, et al. An osteopontin/CD44 immune checkpoint controls CD8+ T cell activation and tumor immune evasion. J Clin Invest. 2018;128:5549–60.

Baras AS, et al. The ratio of CD8 to Treg tumor-infiltrating lymphocytes is associated with response to cisplatin-based neoadjuvant chemotherapy in patients with muscle invasive urothelial carcinoma of the bladder. Oncoimmunology. 2016;5:1–7.

Zhang Q, et al. Apoptotic SKOV3 cells stimulate M0 macrophages to differentiate into M2 macrophages and promote the proliferation and migration of ovarian cancer cells by activating the ERK signaling pathway. Int J Mol Med. 2020;45:10–22.

Gabrusiewicz, K. et al. Glioblastoma-infiltrated innate immune cells resemble M0 macrophage phenotype. JCI Insight 1, 0–19 (2016).

Ali HR, Chlon L, Pharoah PDP, Markowetz F, Caldas C. Patterns of immune infiltration in breast Cancer and their clinical implications: a gene-expression-based retrospective study. PLoS Med. 2016;13:1–24.

Jairath NK, et al. Tumor immune microenvironment clusters in localized prostate adenocarcinoma: prognostic impact of macrophage enriched/plasma cell non-enriched subtypes. J Clin Med. 2020;9:1–13.

Liu X, et al. The prognostic landscape of tumor-infiltrating immune cell and immunomodulators in lung cancer. Biomed Pharmacother. 2017;95:55–61.

Li P, et al. Identification of an immune-related risk signature correlates with Immunophenotype and predicts anti-PD-L1 efficacy of urothelial Cancer. Front Cell Dev Biol. 2021;9:1–11.

Lin J, et al. A robust 11-genes prognostic model can predict overall survival in bladder cancer patients based on five cohorts. Cancer Cell Int. 2020;20:1–14.

Wei J, et al. Osteopontin mediates glioblastoma-associated macrophage infiltration and is a potential therapeutic target. J Clin Invest. 2019;129:137–49.

Antonelli AC, Binyamin A, Hohl TM, Glickman MS, Redelman-Sidi G. Bacterial immunotherapy for cancer induces CD4-dependent tumor-specific immunity through tumor-intrinsic interferon-γ signaling. Proc Natl Acad Sci USA. 2020;117:18627–37.

Kates M, et al. Phase 1 / 2 trial results of a large surface area microparticle docetaxel for the treatment of high-risk nonmuscle-invasive bladder. Cancer. 2022;208:822–9.

Chalmers ZR, et al. Analysis of 100,000 human cancer genomes reveals the landscape of tumor mutational burden. Genome Med. 2017;9:1–14.

Bastos DA, et al. Genomic biomarkers and underlying mechanism of benefit from BCG immunotherapy in non-muscle invasive bladder Cancer. Bl Cancer. 2020;6:171–86.

Voutsadakis IA. Urothelial bladder carcinomas with high tumor mutation burden have a better prognosis and targetable molecular defects beyond immunotherapies. Curr Oncol. 2022;29:1390–407.

Chen S, Zhang S, Chen S, Ma F. The prognostic value and immunological role of CD44 in pan-cancer study. Sci Rep. 2023;13:1–13.

Lee-Sayer SSM, et al. The where, when, how and why of hyaluronan binding by immune cells. Front Immunol. 2015;6:1–12.

Audisio A, et al. New perspectives in the medical treatment of non-muscle-invasive bladder Cancer: immune checkpoint inhibitors and beyond. Cells. 2022;11

Acknowledgements

We acknowledge Dr. R. Hurle, Dr. O. De Cobelli, Dr. L. Cecchini, Dr. C. Llorente and Dr. C. Hernandez, who participated in the phase I clinical trial NCT04798703 and provided bioptic samples. We also acknowledge Prof. R. De Caro and Dr. A. Emmi for providing samples of healthy bladders.

Open access funding provided by Università degli Studi di Padova. This study was funded by a specific grant to AR from Fidia Farmaceutici SpA, which also participated in review and approval of the manuscript.

Author information

Authors and affiliations.

Immunology and Molecular Oncology Diagnostics, Veneto Institute of Oncology IOV-IRCCS, Via Gattamelata 64, 35128, Padova, Italy

Anna Tosi & Antonio Rosato

Department of Surgery, Oncology and Gastroenterology, University of Padova, Via Gattamelata 64, 35128, Padova, Italy

Beatrice Parisatto & Antonio Rosato

Department of Molecular Medicine, University of Padova, Padova, Italy

Enrico Gaffo & Stefania Bortoluzzi

Contributions

AT and AR conceived the study. AT and BP performed multiplex immunofluorescence and gene expression experiments. EG and SB performed bioinformatic analyses. AT and BP analyzed and interpreted the data. AT and AR drafted the manuscript. All authors read and approved the final version of the manuscript.

Corresponding authors

Correspondence to Anna Tosi or Antonio Rosato .

Ethics declarations

Ethics approval and consent to participate.

The trial protocol and all amendments were approved by the competent ethical committee at each participating institution [ 7 ]; all patients provided written informed consent.

Consent for publication

Not applicable.

Competing interests

This study was supported by a specific grant from Fidia Farmaceutici SpA to AR.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Tosi, A., Parisatto, B., Gaffo, E. et al. A paclitaxel-hyaluronan conjugate (ONCOFID-P-B™) in patients with BCG-unresponsive carcinoma in situ of the bladder: a dynamic assessment of the tumor microenvironment. J Exp Clin Cancer Res 43 , 109 (2024). https://doi.org/10.1186/s13046-024-03028-5

Received : 12 December 2023

Accepted : 26 March 2024

Published : 10 April 2024

DOI : https://doi.org/10.1186/s13046-024-03028-5

  • Bladder cancer
  • Macrophages
  • Predictive biomarkers

Journal of Experimental & Clinical Cancer Research

ISSN: 1756-9966

data analysis in non experimental research

A comparative experimental study on the collection and analysis of DNA samples from under fingernail material

Elif Yüksel, Sukriye Karadayı, Tulin Ozbek, Beytullah Karadayı, A comparative experimental study on the collection and analysis of DNA samples from under fingernail material, Forensic Sciences Research, 2024, owae025, https://doi.org/10.1093/fsr/owae025

In cases of murder and rape where there is physical contact between the perpetrator and the victim, analysis of the victim's nail material is quite valuable. Foreign DNA detected in the fingernail material does not necessarily belong to the perpetrator, but when it does, it may provide useful findings for solving the case. Fingernail material collected after the incident often yields mixed DNA. The efficiency of sample collection procedures is therefore of particular importance, as mixed samples may pose problems in the interpretation of the autosomal STR analyses used to identify the individual or individuals. The aim of this study is to compare three different fingernail material collection procedures (thick-tipped swabbing, thin-tipped swabbing and nail clipping) to determine the most efficient sample collection procedure and to contribute to routine investigations to identify the assailant in forensic cases. In our study, under fingernail material was collected from 12 volunteer couples by the three methods. To help compare the efficiency of the three methods, the profiles obtained were classified based on the number of female and male alleles detected. Among the STR profiles obtained, nail clipping yielded a 'High level DNA mixture' (a profile containing 12 or more female alleles) in 58.3% (n = 7) of samples, whereas 75% (n = 9) of the samples taken with cotton-toothpick (thin-tipped) swabs yielded a 'Full Male Profile'. In conclusion, our study shows that cotton-toothpick (thin-tipped) swabs are the most efficient of the three fingernail material collection procedures for determining the male DNA profile. We suggest that using thin-tipped swabs produced to a specific standard, instead of the standard-size swabs commonly used in routine crime investigations to identify a perpetrator from fingernail material, may improve the efficiency of processing the nails and evaluating the evidence.

  • Online ISSN 2471-1411
  • Print ISSN 2096-1790
  • Copyright © 2024 Academy of Forensic Science

High Energy Physics - Phenomenology

Title: A global analysis of $\pi^0$, $K_S^0$ and $\eta$ fragmentation functions with BESIII data

Abstract: In this research, we conduct a global QCD analysis of fragmentation functions (FFs) for neutral pions ($\pi^0$), neutral kaons ($K_S^0$), and eta mesons ($\eta$), utilizing world data of single inclusive hadron production in $e^+e^-$ annihilation, including the most recent BESIII data at low collision energy, to test the operational region of QCD collinear factorization for single inclusive hadron production. We find that the QCD-based analysis at next-to-next-to-leading order in perturbative QCD with parameterized higher-twist effects can explain both the existing high-energy world data and the new BESIII measurements, while the latter cannot be explained with existing FFs extracted from high-energy data. To investigate the higher-twist contributions to this discrepancy, a direct functional approach is employed, providing a testing framework for characterizing the experimental results over a wide range of energy scales, from low to high, thus extending the classical theoretical models to the BESIII domain.

ORIGINAL RESEARCH article

Effects of COVID-19-targeted non-pharmaceutical interventions on pediatric hospital admissions in North Italian hospitals, 2017 to 2022: a quasi-experimental study interrupted time-series analysis

Giuseppe Maglietta

  • 1 Clinical and Epidemiological Research Unit, University Hospital of Parma, Parma, Italy
  • 2 Pediatric Clinic, IRCCS Azienda Ospedaliera Universitaria di Bologna, Bologna, Italy
  • 3 Pediatric Emergency Unit, IRCCS Azienda Ospedaliera Universitaria di Bologna, Bologna, Italy
  • 4 Pediatric Intensive Care Unit, IRCCS Azienda Ospedaliera Universitaria di Bologna, Bologna, Italy
  • 5 Pediatrics and Neonatology Unit, Ravenna Hospital, AUSL Romagna, Ravenna, Italy
  • 6 Paediatrics Unit, Santa Maria Nuova Hospital, AUSL-IRCCS of Reggio Emilia, Reggio Emilia, Italy
  • 7 Pediatrics Unit, Department of Medical and Surgical Sciences of Mothers, Children and Adults, University of Modena and Reggio Emilia, Modena, Italy
  • 8 Pediatrics and Neonatology Unit, Guglielmo da Saliceto Hospital, Piacenza, Italy
  • 9 Department of Medicine and Surgery, University of Parma, Parma, Italy
  • 10 Pediatric Clinic, University of Ferrara, Ferrara, Italy
  • 11 Pediatric Unit, Pavullo Hospital, AUSL Modena, Modena, Italy
  • 12 Paediatrics Unit, Maggiore Hospital, Bologna, Italy
  • 13 Pediatric Clinic, Rimini Hospital, AUSL Romagna, Rimini, Italy
  • 14 Pediatric Unit, G.B. Morgagni – L. Pierantoni Hospital, AUSL Romagna, Forlì, Italy
  • 15 Pediatric Unit, AUSL Romagna, Cesena, Italy
  • 16 Pediatric Clinic, University Hospital of Parma, Parma, Italy
  • 17 Department of Medicine and Surgery, University of Parma, Parma, Italy

Background: The use of Non-Pharmaceutical Interventions (NPIs), such as lockdowns, social distancing and school closures, against the COVID-19 epidemic is debated, particularly for the possible negative effects on vulnerable populations, including children and adolescents. This study therefore aimed to quantify the impact of NPIs on the trend of pediatric hospitalizations during 2 years of pandemic compared to the previous 3 years, also considering two pandemic phases according to the type of adopted NPIs.

Methods: This is a multicenter, quasi-experimental before-after study conducted in 12 hospitals of the Emilia-Romagna Region, Northern Italy, with NPI implementation as the intervention event. The 3 years preceding the beginning of NPI implementation (in March 2020) constituted the pre-pandemic phase. The subsequent 2 years were further subdivided into a school closure phase (up to September 2020) and a subsequent mitigation measures phase with less stringent restrictions. School closure was chosen as delimitation as it particularly concerns young people. Interrupted Time Series (ITS) regression analysis was applied to calculate Hospitalization Rate Ratios (HRR) on the diagnostic categories exhibiting the greatest variation. ITS allows the estimation of changes attributable to an intervention, both in terms of immediate (level change) and sustained (slope change) effects, while accounting for pre-intervention secular trends.

Results: Overall, in the 60 months of the study there were 84,368 cases. Compared to the pre-pandemic years, statistically significant decreases in hospitalizations of 35% and 19% were observed during school closure and in the following mitigation measures phase, respectively. The greatest reduction was recorded for “Respiratory Diseases,” whereas the “Mental Disorders” category exhibited a significant increase during mitigation measures. ITS analysis confirmed a large reduction in level during school closure for Respiratory Diseases (HRR 0.19, 95%CI 0.08–0.47) and a similar but smaller significant reduction when mitigation measures were enacted. The level for Mental Disorders decreased significantly during school closure (HRR 0.50, 95%CI 0.30–0.82) but increased by 28% during mitigation measures (HRR 1.28, 95%CI 0.98–1.69).

Conclusion: Our findings provide information on the impact of COVID-19 NPIs which may inform public health policies in future health crises, plan effective control and preventative interventions and target resources where needed.

1 Introduction

The SARS-CoV-2 epidemic has had few medical consequences for children and adolescents, as the incidence of severe forms of COVID-19 in the pediatric population was low and symptoms of infection were generally mild ( 1 , 2 ). However, young people were deeply affected by the restrictive measures imposed globally to reduce transmission, such as quarantine, lockdown, and social distancing, often referred to as Non-Pharmaceutical Interventions (NPIs), which considerably changed their daily lives ( 3 ). They were confined at home for long periods, with limited opportunity for learning and reduced peer contact, together with adults who were often anxious or psychologically stressed by the circumstances, which added to their own discomfort ( 3 , 4 ). School closure, enforced in many countries with different durations, was particularly relevant for these age groups, as school is where children and adolescents spend most of their time and have the opportunity for both social interaction and intellectual stimulation ( 5 ).

The debate on the pros and cons of population-wide restrictions enacted during the COVID-19 pandemic is ongoing. On the one hand, data seem to support the positive effects of NPIs ( 6 – 10 ), particularly in terms of control of virus spread and consequent reduction in mortality ( 10 ). On the other hand, some authors emphasize a range of “side effects” of NPIs, including economic, educational, and health repercussions, disproportionately affecting more vulnerable populations, including children, with little health benefit ( 11 ). To manage future health crises, therefore, it is crucial that these strategies are further assessed to inform future pandemic policy and avoid past mistakes ( 12 ).

The timing and intensity of NPIs against COVID-19 all over the world varied greatly according to local situations ( 7 ). Italy, starting from the Northern regions, was the first European country to be affected by the pandemic ( 13 ), and enacted very aggressive restrictive policies, including one of the longest school closures in the world ( 14 ).

The analysis of hospitalization trends can provide valuable insights into the repercussions of different restrictions adopted over time, needed to prepare for future pandemics. In particular, to estimate the effectiveness of population-level health interventions that have been implemented at a clearly defined point in time, Interrupted Time Series (ITS) regression analysis is the recommended method ( 15 ). However, the majority of research on this topic is monocentric ( 16 – 19 ), is restricted to specific pediatric age classes or considers all ages including adults ( 17 , 19 – 26 ), focuses on specific diagnoses ( 9 , 16 , 17 , 19 , 21 , 25 , 27 ), only looks at Emergency Department (ED) visits ( 16 , 17 , 19 , 21 , 23 , 26 , 28 , 29 ), or addresses the time period immediately following the pandemic onset without evaluating ongoing effects ( 18 , 25 , 30 ).

We therefore aimed to quantify the impact of NPIs adopted to prevent or control COVID-19 transmission on the trend of hospitalizations, in 12 hospitals in the Emilia-Romagna Region, Northern Italy, during the 2 years following the start of the pandemic, compared with the previous 3 years, considering two pandemic phases according to the type of adopted NPIs.

2 Materials and methods

2.1 Study design and setting

This is a multicenter, quasi-experimental controlled before-after study, conducted to estimate the change in pediatric hospital admissions during the COVID-19 pandemic compared to the previous period. For disease categories exhibiting the greatest variations, we investigated the effect during school closure and in the subsequent phase when schools were re-opened and mitigation measures were implemented.

This study was conducted in the Emilia-Romagna Region, Northern Italy, which has an overall pediatric population (from 0 to 17 years) of 673,818 subjects (year 2020) ( 31 ), who were potentially affected by NPIs.

The overall study period covered from March 2017 to February 2022 (60 months), defining the implementation of NPIs as an intervention event.

2.2 Intervention

National lockdown in Italy was imposed from March 11 through May 4th, 2020, after which economic and social activities were gradually resumed. Restrictions were relaxed over the summer and then reintroduced gradually to counter the second wave of the pandemic. On November 6th, 2020, the Italian Government enforced a three-tiered restriction system on a regional basis, using periodic risk assessments by the Ministry of Health ( 32 ). Italy also enforced one of the longest school closures in the world ( 14 ). Educational institutions of any grade were shut down from late February up to September 2020, after which schools were reopened and mitigation measures were kept in place, such as mask wearing and reduced student social contact, as well as mandatory distance learning for at least 75% of the time in high schools ( 32 ). On March 31, 2022, the state of emergency ended in Italy.

In this study, the beginning of NPI implementation was used as delimitation, defining the 3 years prior to March 2020 (from March 2017 to February 2020, 36 months) as the pre-COVID19 phase (PC). Since school closure is thought to have had a more direct impact on young people than other NPIs, the subsequent 2 years were further subdivided into a school closure phase (SC), from March 2020 to September 2020 (7 months) and a mitigation measures phase (MM), from October 2020 to February 2022 (17 months).

2.3 Participants

We analyzed data from 12 of the 15 (80%) hospitals in the Emilia-Romagna Region, which provided complete data throughout the study duration. These centers had a catchment area of 574,760 minor inhabitants in 2020 (equal to 85% of the Emilia Romagna region), comprising 211/269 (78%) pediatric beds. Included subjects were patients aged between 0 and 17 years, hospitalized in the considered time frame. Healthy new-borns were excluded from the analysis.

2.4 Data sources

Study data were anonymously extracted from the electronic hospital discharge forms (eHDFs), contained in the administrative databases of the Emilia-Romagna Regional Health Trust, and included the following: age, sex, dates of admission and discharge, main diagnosis and up to five secondary diagnoses (i.e., any conditions existing at admission or occurring during hospitalization which influence treatment or length of stay). The diagnoses were coded according to the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM).

2.5 Statistical analysis

As outcome variables, we considered the monthly frequency of hospitalizations, total and by ICD-9-CM category (the first three characters), during the 60 months considered by the study. To identify which major ICD-9-CM categories had the greatest impact, Standardized Hospitalization Rates (SHR) per 100,000 person-years were used, taking the resident population of Europe in 2020 (the intermediate year of the 5 considered in this study) as the standard ( 33 ) and adjusting for age and sex. For each diagnostic category, we measured how each time period changed with respect to an earlier phase (SC vs. PC, MM vs. PC and MM vs. SC) by estimating the Standardized Hospitalization Rate Ratios (SHRR) and their 95% Confidence Intervals (95% CI). To investigate the effect of NPIs, the ICD-9-CM categories exhibiting the greatest change were assessed using ITS regression analysis. This segmented approach allows the estimation of changes attributable to an intervention in terms of overall (time trend), immediate (change in level) and sustained (increase or decrease in slope) effects, while accounting for pre-intervention secular trends. Since ITS regression models were applied to count data over time, overdispersion was also evaluated using a graphical diagnostic plot and an overdispersion test. We modeled admissions using a Poisson generalized linear model; when the p -value from the Chi-square test of the Stata “estat gof” command was less than 0.05, the model was switched from Poisson to quasi-Poisson by specifying the scale (x2) parameter. Seasonality components were also included in the ITS models to capture recurring cyclical patterns in admissions. Winter was defined as January/February/March, Spring as April/May/June, Summer as July/August/September and Autumn as October/November/December. In all ITS models, the annual population of the considered provinces was used as an offset, allowing the hospitalization rate to be estimated.
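
To make the segmented design concrete, the following minimal Python sketch builds the regressors that an ITS model of this kind could use for the 60 study months. The phase lengths and season definitions come from the text; the column names, the function names `season_of` and `build_its_rows`, and the exact parameterization (one level term and one slope term per pandemic phase, Summer as reference season) are illustrative assumptions, not the authors' Stata specification.

```python
# Phase lengths from the study text: pre-COVID (PC), school closure (SC),
# mitigation measures (MM). The column layout below is an assumption.
PHASES = [("PC", 36), ("SC", 7), ("MM", 17)]


def season_of(month):
    """Season as defined in the text (Winter = January to March, and so on)."""
    return ("Winter", "Spring", "Summer", "Autumn")[(month - 1) // 3]


def build_its_rows(start_year=2017, start_month=3):
    """Return one dict of regressors per study month (60 rows in total)."""
    rows = []
    year, month, t = start_year, start_month, 0
    for phase, n_months in PHASES:
        for k in range(n_months):  # k = months elapsed since the phase began
            season = season_of(month)
            rows.append({
                "year": year,
                "month": month,
                "phase": phase,
                "time": t,  # underlying secular trend
                "level_sc": int(phase == "SC"),  # immediate level change in SC
                "level_mm": int(phase == "MM"),  # immediate level change in MM
                "slope_sc": k if phase == "SC" else 0,  # sustained slope change
                "slope_mm": k if phase == "MM" else 0,
                # Summer is the reference season (hypothetical coding choice)
                "winter": int(season == "Winter"),
                "spring": int(season == "Spring"),
                "autumn": int(season == "Autumn"),
            })
            t += 1
            month += 1
            if month > 12:
                year, month = year + 1, 1
    return rows


rows = build_its_rows()
print(len(rows), rows[36]["year"], rows[36]["month"], rows[36]["phase"])
```

Under these assumptions, a Poisson model with a log link would take the form log E[admissions] = log(population) + β0 + β1·time + β2·level_sc + β3·slope_sc + (further terms), where log(population) is the offset and each exponentiated coefficient is read as a rate ratio.
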
Post-hoc sensitivity analyses were conducted to investigate the impact of children aged 0–1 years old on HRR estimates from ITS modeling, since we assumed that a very small proportion of children in this age group attends day-care. This is an important factor since the school closure is one of the main NPIs under study. All statistical analyses were centralized and performed with STATA (StataCorp. 2023. Stata Statistical Software: Release 18. College Station, TX: StataCorp LLC).
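
The Poisson versus quasi-Poisson choice described above rests on how far the observed counts scatter around the fitted values. The sketch below, written in Python rather than Stata, shows one common way to quantify this: the Pearson chi-square statistic divided by the residual degrees of freedom. The function names and the example numbers are hypothetical; the authors' actual test was run with Stata's own commands.

```python
import math


def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    Values well above 1 suggest overdispersion relative to a Poisson model.
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / dof


def scale_standard_errors(std_errors, dispersion):
    """Quasi-Poisson adjustment: inflate Poisson SEs by sqrt(dispersion)."""
    factor = math.sqrt(dispersion)
    return [se * factor for se in std_errors]


# Hypothetical monthly counts and fitted means, for illustration only
phi = pearson_dispersion([120, 95, 150, 80], [110, 100, 130, 95], n_params=2)
print(round(phi, 2), scale_standard_errors([0.05, 0.10], phi))
```

A quasi-Poisson fit leaves the point estimates unchanged and only widens the standard errors, which is why the switch affects confidence intervals but not the HRR values themselves.
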

3 Results

Overall, in the 5 years of the study and in the 12 participating centers, there were 84,368 cases. Case demographics are shown in Table 1 for each of the three phases: PC, SC, and MM. The sample of admissions was made up of 57.0% males, with the predominant age group being between 0 and 1 year (38.0%). As expected, the hospitalization rate decreased considerably when school closure was enforced with respect to the pre-pandemic time period (2,548 vs. 3,915 per 100,000 person-years). Supplementary Table S1 shows the standardized hospitalization rates by type of primary diagnosis, from highest to lowest.

Table 1 . Demographic and clinical characteristics of the analyzed sample.

Figure 1 shows the comparisons in terms of SHRR, overall and for individual ICD9-CM categories, between SC, MM and PC. Overall, a statistically significant decrease in hospitalizations with respect to pre-pandemic rates was observed both in SC (−35%, SHRR 0.65, 95%CI: 0.64–0.67) and MM (−19%, SHRR 0.81, 0.80–0.83), while a 25% increase (SHRR 1.25, 1.22–1.28) was recorded in MM with respect to SC.
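
The percentages quoted alongside each rate ratio follow from simple arithmetic: a ratio r corresponds to a relative change of (r − 1) × 100%. The short Python sketch below reproduces the three overall figures; the function name `percent_change` is an illustrative choice.

```python
def percent_change(rate_ratio):
    """Convert a rate ratio into a percentage change from the reference."""
    return (rate_ratio - 1.0) * 100.0


# Overall SHRR values reported in the text
for label, ratio in [("SC vs. PC", 0.65), ("MM vs. PC", 0.81), ("MM vs. SC", 1.25)]:
    print(f"{label}: SHRR {ratio:.2f} -> {percent_change(ratio):+.0f}%")
```

As a rough consistency check, dividing the two ratios that share the PC baseline gives 0.81 / 0.65 ≈ 1.25, which matches the reported MM vs. SC ratio.
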

Figure 1 . Forest plot of Standardized Hospitalization Rate Ratios (SHRR). Rates are reported per 100,000 person-years and are age- and sex-standardized, using the European resident population in 2020 as the standard. SHR, standardized hospitalization rate.

Considering individual ICD-9-CM diagnoses, a generalized reduction was detected during SC for all categories. The greatest reduction (−73%, SHRR 0.27, 95%CI 0.25–0.29) occurred in the “Respiratory Diseases” category, which exhibited the highest frequency of hospitalizations (approximately 4,000 cases/year in the 3 years before the pandemic). In MM, the reduction compared to PC persisted, although less prominent. Only the Mental Disorders category showed a large increase (51%, SHRR 1.51, 95%CI 1.30–1.75).

To measure NPI effects, ITS regression analysis was carried out on overall hospital admissions and on the two categories which stood out for the greatest variation (Respiratory Diseases and Mental Disorders).

Results of the ITS analysis are presented in the following paragraphs.

3.1 Any hospitalization

As shown in Figure 2 and Table 2 , we observed a highly significant decrease in hospitalizations in SC (level change, HRR 0.44, 95%CI 0.35–0.55) and in MM (although of lesser impact, HRR 0.65, 95%CI 0.57–0.75) compared to PC. Unlike the constant hospitalization rate recorded throughout the 3 years before the pandemic, immediately after the collapse of admissions an increasing trend occurred, particularly in SC (slope change, 11% per month, HRR 1.11, 95%CI 1.06–1.16), but also to a lesser extent in MM (slope change, 2% per month, HRR 1.02, 95%CI 1.01–1.03). Hospitalization rates returned to pre-pandemic levels only in autumn 2021 (18 months since the start of the pandemic).

Figure 2 . Monthly hospitalization rate for any disease, with trend line from ITS regression analysis. PC, pre-COVID19 phase; SC, School closure phase; MM, Mitigation measures phase.

Table 2 . Interrupted time series analysis results on hospitalizations.

3.2 Respiratory diseases

The most frequent types of respiratory diseases as primary diagnosis are shown in Supplementary Table S2 . This category, which contributed the most to the hospitalization decline, exhibited in SC a statistically significant 81% reduction in admissions in terms of level change (HRR 0.19, 95%CI 0.08–0.47), and a 17% increase in the monthly slope (HRR 1.17, 95%CI 0.97–1.42). A similar but less pronounced decrease was seen during MM, with a statistically significant reduction in terms of level change (HRR 0.26, 95%CI 0.16–0.41), and a 7% increase in the monthly slope (HRR 1.07, 95%CI 1.03–1.11). The seasonality component analysis showed statistically significant increases from autumn to spring compared to summer ( Table 3 and Figure 3A ).

Table 3 . Interrupted time series analysis results on hospitalizations for Respiratory Diseases and Mental Disorders categories.

Figure 3 . Monthly hospitalization rate for respiratory diseases (A) and mental disorders (B), with trend line from ITS regression analysis. PC, pre-COVID19 phase; SC, School closure phase; MM, Mitigation measures phase.

3.3 Mental disorders

As evident in Figure 3B , although hospitalizations in this category underwent a substantial decrease at the start of SC, we observed a sharply increasing trend until MM, when hospitalization rates exceeded pre-pandemic levels. ITS analysis ( Table 3 ) detected in SC a statistically significant 50% reduction in level (HRR 0.50, 95%CI 0.30–0.82) and a borderline statistically significant 11% increase in the monthly slope (HRR 1.11, 95%CI 1.00–1.23), compared to PC. Comparing MM with the pre-pandemic situation, a 28% increase in level was observed (HRR 1.28, 95%CI 0.98–1.69), while the monthly slope remained unchanged (HRR 1.01, 95%CI 0.99–1.03). Finally, comparing MM vs. SC, we recorded a strong increase in level of about 2.6 times (HRR 2.59, 95%CI 1.55–4.34). The seasonality component analysis showed a statistically significant increase in admissions during spring (HRR 1.18, 95%CI 1.00–1.40) versus summer ( Table 3 and Figure 3B ).

3.4 Subgroup analysis by sex and age

We performed subgroup analyses considering gender and age categories. Gender differences were not found for Respiratory Diseases, whereas for Mental Disorders the increase in MM vs. PC seemed to be significantly stronger in females vs. males (MM vs. PC: 1.66, 1.19–2.33 vs. 1.28, 0.98–1.69, respectively) ( Supplementary Figures S1, S2 and Supplementary Tables S4, S5 ). Although incidence rates of Respiratory Diseases differed between ages 0–5 and 12–17, HRR estimates did not exhibit relevant differences ( Supplementary Figures S3, S4 and Supplementary Tables S6, S7 ). Concerning Mental Disorders, in the 12–17 age subgroup, the HRR of level change in MM increased from a non-statistically significant 1.28 ( p  = 0.071) to a highly statistically significant 1.66 ( p  < 0.001), even though the slope change was almost absent and identical (HRR 1.01) ( Supplementary Figure S5 and Supplementary Table S8 ).
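
The link between a reported confidence interval and its p-value can be illustrated with a short Python sketch. It assumes a Wald-type interval that is symmetric on the log scale, which is an approximation and not necessarily how the authors' software derived its p-values; the function name `p_value_from_ratio_ci` is hypothetical.

```python
import math


def p_value_from_ratio_ci(ratio, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided p-value from a ratio and its 95% CI.

    Assumes the interval was built as exp(log(ratio) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p


# Mental Disorders, level change MM vs. PC, as reported in the text
z, p = p_value_from_ratio_ci(1.28, 0.98, 1.69)
print(f"z = {z:.2f}, p = {p:.3f}")
```

For HRR 1.28 (95%CI 0.98–1.69) this approximation gives a p-value between 0.07 and 0.08, close to the reported p = 0.071 and consistent with an interval that includes 1; small differences are expected because the published estimates are rounded.
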

3.5 Sensitivity analyses

Supplementary Tables S9, S10 display the results of sensitivity analyses performed excluding children aged 0–1 years old, who represent about 38% of overall hospital admissions. Considering both admissions for any cause and for the Respiratory Diseases and Mental Disorders categories, sensitivity analyses did not reveal differences sufficient to suggest that the proportion of children aged 0 to 1 year significantly skewed the results of our analyses.

4 Discussion

This is the first European study on the impact of COVID-19 NPIs on the trend of pediatric hospitalizations conducted in a wide area severely hit by the pandemic, covering an extended pandemic period (24 months). The use of appropriate analysis through ITS regression makes our findings and corresponding conclusions reliable. In fact, existing research on the effects of NPIs mostly consists of modeling studies, implying a lack of empirical, real-world data, or uses descriptive statistics on admission trends ( 34 ).

Overall, our results showed that the number of pediatric hospital admissions dropped by more than 50% in the first months of the lockdown period, and then began to rise, achieving pre-pandemic hospitalization levels only 2 years later. This considerable, long-lasting reduction appears to be mainly determined by a decrease in the occurrence of infectious diseases (the most frequent cause for hospitalization in children), mainly affecting the respiratory system. However, these results may also be due to a change in health-seeking behaviors of parents, who might have chosen not to attend hospital with their sick children for fear of contagion ( 9 , 35 ). Moreover, the decrease may be attributed to a tendency to avoid hospitalizing children with minor health problems. Supporting this hypothesis is the fact that admissions for childhood neoplasms remained constant, suggesting that healthcare services were maintained for severe illnesses. A similar observation was made by Wang et al. ( 9 ), who found a 55% reduction in admissions for all-cause respiratory diseases, in line with our finding, and a smaller reduction in admissions for childhood neoplasms.

Interesting results emerged from ITS analyses conducted on the two disease categories exhibiting the largest variation, which recorded opposite trends. For Respiratory Diseases, we observed a marked reduction in hospitalizations that persisted throughout school closure and for most of the subsequent period when less stringent mitigation measures were enforced, in the absence of typical seasonal epidemic peaks. Conversely, for Mental Disorders an immediate decline in admissions was detected in the first 2 months of lockdown, followed by an upward trend averaging 11% per month. These trends need to be further investigated using hospitalization data recorded in the following years, to understand whether the effects persist, or whether at the end of the pandemic hospitalizations return to pre-pandemic levels.

Some plausible reasons for these results exist. Regarding Respiratory Diseases, the drop in admissions is likely to be related to the impact of mask-wearing, hand washing, and social distancing on the interruption of person-to-person viral or bacterial transmission, as also discussed by Wang et al. ( 9 ). The reduction may also be partly due to a “virus interference phenomenon” among respiratory viruses, whereby infection with one virus can partially prevent or inhibit infection with another virus in the same host ( 36 ). The contribution of this factor is however likely to be marginal compared to the absence of influenza epidemics and other respiratory infections following social distancing, which has been reported and discussed in the literature ( 37 , 38 ). Concerning mental health, the negative effects may have taken longer to manifest, but once developed they may not resolve easily even if restrictions are lifted, instead requiring considerable time and specific care ( 3 ).

The results of this study can contribute to the current debate on the benefits and harms of individual NPIs, which is not a simple one, partly because it is hard to separate the impact of one measure from that of other interventions introduced simultaneously. Concerning the pediatric population in particular, it would be essential to elucidate the role of school closures in controlling pandemic spread ( 39 ). Recent reviews ( 12 , 34 , 40 ) suggested that measures implemented in the school setting may have limited the number or proportion of cases and deaths among adults, and delayed the progression of the pandemic. This seems to contrast with a report on data from Sweden, where school closure was reserved for upper secondary schools only, indicating that the number of deaths per population unit was lower than in most other high-income countries that applied stringent school closure policies ( 41 ). On the other hand, the literature also highlights negative consequences of these measures on children’s health and education. As reported by UNICEF ( 42 ), school closures disrupted the provision of educational (and in some cases health and nutritional) services, increased emotional distress and mental health problems, and prevented access to a wide range of school-provided services, including school meals, monitoring of health and welfare, social skills training, and services targeted to children with special needs. Furthermore, as schools moved online, impoverished children experienced dramatic educational setbacks contributing to inequalities and long-term hardship ( 42 ).

Within the current debate, our findings also highlight that evaluating the trade-offs between positive and negative consequences of NPI implementation during pandemics is a complex task. In particular, as commented above, the decrease in hospitalizations for Respiratory Diseases after the beginning of the outbreak may be due both to the hesitancy in attending hospitals, certainly an undesired effect, and to the reduction of respiratory infections due to lockdown measures, a welcome benefit.

One of the main strengths of this research lies in the use of ITS analysis, one of the strongest evaluative designs when randomization is not possible ( 15 ). Furthermore, the study involves numerous hospitals, which makes the results robust and increases their generalizability. Also, the analyzed data concern the first European area hit by the pandemic, where aggressive restrictive measures were adopted from the start of the outbreak and maintained for an extended period; they are restricted to one endpoint (pediatric hospitalizations) and include both COVID and non-COVID hospitalizations. Finally, the study covers a wide timeframe, longer than most similar research, which made it possible to assess the long-term impact of NPIs.

This study has some limitations. Firstly, data were taken from hospital administrative databases and were not collected prospectively for this research. However, data quality is presumed to be similar across the years we compared; thus, this aspect should not affect interpretation. Secondly, we did not attempt to discriminate between new and recurrent hospitalizations. Such discrimination would be important to understand whether the observed changes were due to the onset of a new condition or to the exacerbation of existing problems. Thirdly, since the analysis used data collected retrospectively without a formal power analysis, we cannot exclude the risk of false negative findings in the case of low-prevalence diagnoses. Lastly, we did not attempt to investigate the potential role of the different waves of SARS-CoV-2 variants that were predominant in the 2 years covered by the study, because it was not an objective of our research. This may have led to an overestimation of the effect of NPIs on hospital admissions.

5 Conclusion

The results of this and other studies on the impact of COVID-19 NPIs on children provide information needed to guide and target interventions in the event of future pandemics, and to plan the allocation of resources where they are needed most. However, the different plausible interpretations of our findings make it difficult to inform about the trade-offs between benefits and negative consequences of NPI strategies during pandemics. Rigorous research should be conducted to understand whether the reduction in pediatric hospital admissions we observed over a two-year period has affected child and adolescent health. Meta-analyses are needed to quantify the contribution to observed effects of individual mitigation actions, to better determine the appropriateness of their introduction, timing and intensity.

Data availability statement

The data analyzed in this study is subject to the following licenses/restrictions: the raw data supporting the conclusions of this article will be made available by the authors upon a motivated request to the corresponding author. Requests to access these datasets should be directed to [email protected] .

Ethics statement

The studies involving humans were approved by AVEN (Area Vasta Emilia Nord) Ethics Committee. The studies were conducted in accordance with the local legislation and institutional requirements. The ethics committee/institutional review board waived the requirement of written informed consent for participation from the participants or the participants’ legal guardians/next of kin because a waiver for informed consent was obtained from the Italian Data Protection Authority (Garante della Privacy) on feasibility grounds (more than 80,000 subjects would have had to be contacted).

Collaborators

Collective authors who fulfill all four ICMJE authorship criteria.

Francesca Diodati, Chiara Maria Palo: Clinical and Epidemiological Research Unit, University Hospital of Parma, Parma, Italy; Angela Miniaci, Luca Bertelli: Pediatric Clinic, IRCCS Azienda Ospedaliera Universitaria di Bologna, Bologna, Italy; Giovanni Biserni: Pediatric Emergency Unit, IRCCS Azienda Ospedaliera Universitaria di Bologna, Bologna, Italy; Angela Troisi, Alessandra Iacono: Pediatrics and Neonatology Unit, Ravenna Hospital, AUSL Romagna, Ravenna, Italy; Federico Bonvicini, Domenico Bartolomeo, Andrea Trombetta: Paediatrics Unit, Santa Maria Nuova Hospital, AUSL-IRCCS of Reggio Emilia, Reggio Emilia, Italy; Tommaso Zini: Pediatrics Unit, Department of Medical and Surgical Sciences of Mothers, Children and Adults, University of Modena and Reggio Emilia, Modena, Italy; Nicoletta de Paulis: Pediatrics and Neonatology Unit, Guglielmo da Saliceto Hospital, Piacenza, Italy; Cristina Forest: Pediatric Clinic, University of Ferrara, Ferrara, Italy; Battista Guidi: Pediatric Unit, Pavullo Hospital, AUSL Modena, Pavullo, Italy; Francesca Di Florio: Paediatrics Unit, Maggiore Hospital, Bologna, Italy; Enrico Valletta, Francesco Accomando: Pediatric Unit, G.B. Morgagni - L. Pierantoni Hospital, AUSL Romagna, Forlì; Greta Ramundo, Alberto Argentiero, Valentina Fainardi, Michela Deolmi: Pediatric Clinic, University Hospital, Department of Medicine and Surgery, University of Parma, Parma, Italy.

Author contributions

GM: Formal analysis, Writing – original draft, Writing – review & editing, Data curation, Methodology. MP: Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing. CC: Formal analysis, Writing – original draft, Writing – review & editing, Conceptualization, Validation. AP: Investigation, Resources, Writing – review & editing. ML: Investigation, Resources, Writing – review & editing. FC: Investigation, Resources, Writing – review & editing. FM: Investigation, Resources, Writing – review & editing. AF: Investigation, Resources, Writing – review & editing. LI: Investigation, Resources, Writing – review & editing. GB: Investigation, Resources, Writing – review & editing. AS: Investigation, Resources, Writing – review & editing. AM: Investigation, Resources, Writing – review & editing. CG: Investigation, Resources, Writing – review & editing. GV: Investigation, Resources, Writing – review & editing. MA: Investigation, Resources, Writing – review & editing. MS: Investigation, Resources, Writing – review & editing. SE: Conceptualization, Investigation, Project administration, Supervision, Writing – review & editing.

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpubh.2024.1393677/full#supplementary-material

1. Curatola, A, Ferretti, S, Gatto, A, Valentini, P, Giugno, G, Della Marca, G, et al. The effects of COVID-19 pandemic on Italian school-aged children: sleep-related difficulties and trauma reactions. J Child Neurol . (2022) 37:568–74. doi: 10.1177/08830738221096194

PubMed Abstract | Crossref Full Text | Google Scholar

2. Garazzino, S, Montagnani, C, Donà, D, Meini, A, Felici, E, Vergine, G, et al. Multicentre Italian study of SARS-CoV-2 infection in children and adolescents, preliminary data as at 10 April 2020. Eurosurveillance . (2020) 25:2000600. doi: 10.2807/1560-7917.ES.2020.25.18.2000600

3. Esposito, S, Giannitto, N, Squarcia, A, Neglia, C, Argentiero, A, Minichetti, P, et al. Development of psychological problems among adolescents during school closures because of the COVID-19 lockdown phase in Italy: a cross-sectional survey. Front Pediatr . (2021) 8:628072. doi: 10.3389/fped.2020.628072

4. Stracke, M, Heinzl, M, Müller, AD, Gilbert, K, Thorup, AAE, Paul, JL, et al. Mental health is a family affair—systematic review and meta-analysis on the associations between mental health problems in parents and children during the COVID-19 pandemic. IJERPH . (2023) 20:4485. doi: 10.3390/ijerph20054485

5. Raffetti, E, and Di Baldassarre, G. Do the benefits of school closure outweigh its costs? IJERPH . (2022) 19:2500. doi: 10.3390/ijerph19052500

6. Siqueira, CAS, Freitas, YNL, Cancela, MC, Carvalho, M, Oliveras-Fabregas, A, and de Souza, DLB. The effect of lockdown on the outcomes of COVID-19 in Spain: an ecological study. PLoS One . (2020) 15:e0236779. doi: 10.1371/journal.pone.0236779

Crossref Full Text | Google Scholar

7. Tsou, HH, Kuo, SC, Lin, YH, Hsiung, CA, Chiou, HY, Chen, WJ, et al. A comprehensive evaluation of COVID-19 policies and outcomes in 50 countries and territories. Sci Rep . (2022) 12:8802. doi: 10.1038/s41598-022-12853-7

8. Torres, AR, Rodrigues, AP, Sousa-Uva, M, Kislaya, I, Silva, S, Antunes, L, et al. Impact of stringent non-pharmaceutical interventions applied during the second and third COVID-19 epidemic waves in Portugal, 9 November 2020 to 10 February 2021: an ecological study. Eurosurveillance . (2022) 27:2100497. doi: 10.2807/1560-7917.ES.2022.27.23.2100497

9. Wang, X, Xu, H, Chu, P, Zeng, Y, Tian, J, Song, F, et al. Effects of COVID-19-targeted nonpharmaceutical interventions on children’s respiratory admissions in China: a national multicenter time series study. Int J Infect Dis . (2022) 124:174–80. doi: 10.1016/j.ijid.2022.10.009

10. Mendes, JM, and Coelho, PS. The effect of non-pharmaceutical interventions on COVID-19 outcomes: a heterogeneous age-related generalisation of the SEIR model. Infect Dis Model . (2023) 8:742–68. doi: 10.1016/j.idm.2023.05.009

11. Schippers, MC, Ioannidis, JPA, and Joffe, AR. Aggressive measures, rising inequalities, and mass formation during the COVID-19 crisis: an overview and proposed way forward. Front Public Health . (2022) 10:950965. doi: 10.3389/fpubh.2022.950965

12. Soriano-Arandes, A, Brett, A, Buonsenso, D, Emilsson, L, de la Fuente, GI, Gkentzi, D, et al. Policies on children and schools during the SARS-CoV-2 pandemic in Western Europe. Front Public Health . (2023) 11:1175444. doi: 10.3389/fpubh.2023.1175444

13. Caminiti, C, Maglietta, G, Meschi, T, Ticinesi, A, Silva, M, and Sverzellati, N. Effects of the COVID-19 epidemic on hospital admissions for non-communicable diseases in a large Italian university-hospital: a descriptive case-series study. JCM . (2021) 10:880. doi: 10.3390/jcm10040880

14. Bertoletti, A, Soncin, M, Cannistrà, M, and Agasisti, T. The educational effects of emergency remote teaching practices—the case of COVID-19 school closure in Italy. PLoS One . (2023) 18:e0280494. doi: 10.1371/journal.pone.0280494

15. Lopez Bernal, J, Cummins, S, and Gasparrini, A. Interrupted time series regression for the evaluation of public health interventions: a tutorial. Int J Epidemiol . (2016) 46:dyw098–355. doi: 10.1093/ije/dyw098

16. Hernández-Calle, D, Andreo-Jover, J, Curto-Ramos, J, García Martínez, D, Valor, LV, Juárez, G, et al. Pediatric mental health emergency visits during the COVID-19 pandemic. Scand J Child Adolescent Psychiatry Psychol . (2022) 10:53–7. doi: 10.2478/sjcapp-2022-0005

17. Arakelyan, M, Emond, JA, and Leyenaar, JK. Suicide and self-harm in youth presenting to a US rural hospital during COVID-19. Hosp Pediatr . (2022) 12:e336–42. doi: 10.1542/hpeds.2022-006635

18. Alongi, A, D’Aiuto, F, Montomoli, C, and Borrelli, P. Impact of the first year of the COVID-19 pandemic on pediatric emergency department attendance in a tertiary center in South Italy: an interrupted time-series analysis. Healthcare . (2023) 11:1638. doi: 10.3390/healthcare11111638

19. Rebbe, R, Reddy, J, Kuelbs, CL, Huang, JS, and Putnam-Hornstein, E. The impact of COVID-19 on infant maltreatment emergency department and inpatient medical encounters. J Pediatr . (2023) 262:113582. doi: 10.1016/j.jpeds.2023.113582

20. Ahmadi, S, Kazemi-Karyani, A, Badiee, N, Byford, S, Mohammadi, A, Piroozi, B, et al. The impact of COVID-19 pandemic on hospital admissions for nine diseases in Iran: insight from an interrupted time series analysis. Cost Eff Resour Alloc . (2022) 20:58. doi: 10.1186/s12962-022-00394-9

21. Gutiérrez-Sacristán, A, Serret-Larmande, A, Hutch, MR, Sáez, C, Aronow, BJ, Bhatnagar, S, et al. Hospitalizations associated with mental health conditions among adolescents in the US and France during the COVID-19 pandemic. JAMA Netw Open . (2022) 5:e2246548. doi: 10.1001/jamanetworkopen.2022.46548

22. Beaudry, G, Drouin, O, Gravel, J, Smyrnova, A, Bender, A, Orri, M, et al. A comparative analysis of pediatric mental health-related emergency department utilization in Montréal, Canada, before and during the COVID-19 pandemic. Ann Gen Psychiatry . (2022) 21:17. doi: 10.1186/s12991-022-00398-y

23. Schranz, M, Boender, TS, Greiner, T, Kocher, T, Wagner, B, Greiner, F, et al. Changes in emergency department utilisation in Germany before and during different phases of the COVID-19 pandemic, using data from a national surveillance system up to June 2021. BMC Public Health . (2023) 23:799. doi: 10.1186/s12889-023-15375-7

24. Côté-Corriveau, G, Luu, TM, Lewin, A, Brousseau, É, Ayoub, A, Blaser, C, et al. Hospitalization for child maltreatment and other types of injury during the COVID-19 pandemic. Child Abuse Negl . (2023) 140:106186. doi: 10.1016/j.chiabu.2023.106186

25. Matsumoto, N, Kadowaki, T, Takanaga, S, Shigeyasu, Y, Okada, A, and Yorifuji, T. Longitudinal impact of the COVID-19 pandemic on the development of mental disorders in preadolescents and adolescents. BMC Public Health . (2023) 23:1308. doi: 10.1186/s12889-023-16228-z

26. Lopes, S, Soares, P, Santos Sousa, J, Rocha, JV, Boto, P, and Santana, R. Effect of the COVID-19 pandemic on the frequency of emergency department visits in Portugal: an interrupted time series analysis until July 2021. JACEP Open . (2023) 4:e12864. doi: 10.1002/emp2.12864

27. Milliren, CE, Richmond, TK, and Hudgins, JD. Emergency department visits and hospitalizations for eating disorders during the COVID-19 pandemic. Pediatrics . (2023) 151:e2022058198. doi: 10.1542/peds.2022-058198

28. Amado, V, Moller, J, Couto, MT, Wallis, L, and Laflamme, L. Effect of the COVID-19 pandemic on emergency department attendances for pediatric injuries in Mozambique’s central hospitals: an interrupted time series and a comparison within the restriction periods between 2019 and 2020. Trauma Surg Acute Care Open . (2023) 8:e001062. doi: 10.1136/tsaco-2022-001062

29. Negriff, S, Huang, BZ, Sharp, AL, and DiGangi, M. The impact of stay-at-home orders on the rate of emergency department child maltreatment diagnoses. Child Abuse Negl . (2022) 132:105821. doi: 10.1016/j.chiabu.2022.105821

30. Schroeder, AR, Dahlen, A, Purington, N, Alvarez, F, Brooks, R, Destino, L, et al. Healthcare utilization in children across the care continuum during the COVID-19 pandemic. PLoS One . (2022) 17:e0276461. doi: 10.1371/journal.pone.0276461

31. ISTAT. Demo – Statistiche demografiche. (2020) Available at: https://demo.istat.it/app (Accessed February 16, 2024).

Google Scholar

32. Manica, M, Guzzetta, G, Riccardo, F, Valenti, A, Poletti, P, Marziano, V, et al. Impact of tiered restrictions on human activities and the epidemiology of the second wave of COVID-19 in Italy. Nat Commun . (2021) 12:4570. doi: 10.1038/s41467-021-24832-z

33. Database. Eurostat. (2020) Available at: https://ec.europa.eu/eurostat/web/main/data/database (Accessed February 16, 2024)

34. Krishnaratne, S, Littlecott, H, Sell, K, Burns, J, Rabe, JE, Stratil, JM, et al. Measures implemented in the school setting to contain the COVID-19 pandemic. Cochrane Database Syst Rev . (2022) 2022:CD015029. doi: 10.1002/14651858.CD015029

35. Lim, E, Mistry, RD, Battersby, A, Dockerty, K, Koshy, A, Chopra, MN, et al. “How to recognize if your child is seriously ill” during COVID-19 lockdown: an evaluation of parents’ confidence and health-seeking behaviors. Front Pediatr . (2020) 8:580323. doi: 10.3389/fped.2020.580323

36. Zhou, Q, Hu, J, Hu, W, Li, H, and Lin, GZ. Interrupted time series analysis using the ARIMA model of the impact of COVID-19 on the incidence rate of notifiable communicable diseases in China. BMC Infect Dis . (2023) 23:375. doi: 10.1186/s12879-023-08229-5

37. Sun, J, Shi, Z, and Xu, H. Non-pharmaceutical interventions used for COVID-19 had a major impact on reducing influenza in China in 2020. J Travel Med . (2020) 27:taaa064. doi: 10.1093/jtm/taaa064

38. Steffen, R, Lautenschlager, S, and Fehr, J. Travel restrictions and lockdown during the COVID-19 pandemic—impact on notified infectious diseases in Switzerland. J Travel Med . (2020) 27:taaa180. doi: 10.1093/jtm/taaa180

39. Esposito, S, Cotugno, N, and Principi, N. Comprehensive and safe school strategy during COVID-19 pandemic. Ital J Pediatr . (2021) 47:6. doi: 10.1186/s13052-021-00960-6

40. Hume, S, Brown, SR, and Mahtani, KR. School closures during COVID-19: an overview of systematic reviews. BMJ EBM . (2023) 28:164–74. doi: 10.1136/bmjebm-2022-112085

41. Björkman, A, Gisslén, M, Gullberg, M, and Ludvigsson, J. The Swedish COVID-19 approach: a scientific dialogue on mitigation policies. Front Public Health . (2023) 11:1206732. doi: 10.3389/fpubh.2023.1206732

42. United Nations Children’s Fund (UNICEF). (2020) Framework for Reopening Schools. Available at: https://www.unicef.org/documents/framework-reopening-schools (Accessed February 16, 2024)

Keywords: COVID-19 epidemiology, non-pharmaceutical intervention (NPI), quasi-experimental design, observational study, Interrupted Time Series (ITS) regression analysis, time series analysis, diseases of the respiratory system, Mental Disorders

Citation: Maglietta G, Puntoni M, Caminiti C, Pession A, Lanari M, Caramelli F, Marchetti F, De Fanti A, Iughetti L, Biasucci G, Suppiej A, Miceli A, Ghizzi C, Vergine G, Aricò M, Stella M, Esposito S on behalf of Emilia-Romagna Paediatric COVID-19 network (2024) Effects of COVID-19-targeted non-pharmaceutical interventions on pediatric hospital admissions in North Italian hospitals, 2017 to 2022: a quasi-experimental study interrupted time-series analysis. Front. Public Health . 12:1393677. doi: 10.3389/fpubh.2024.1393677

Received: 29 February 2024; Accepted: 25 March 2024; Published: 18 April 2024.

Copyright © 2024 Maglietta, Puntoni, Caminiti, Pession, Lanari, Caramelli, Marchetti, De Fanti, Iughetti, Biasucci, Suppiej, Miceli, Ghizzi, Vergine, Aricò, Stella, Esposito and on behalf of Emilia-Romagna Paediatric COVID-19 network. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Caterina Caminiti, [email protected]

† The members of the network are listed under Collaborators at the end of the article

‡ These authors have contributed equally to this work and share first authorship

What’s happening at the U.S.-Mexico border in 7 charts

Immigrants walk toward the Rio Grande to cross into Del Rio, Texas, on Sept. 23, 2021, from Ciudad Acuna, Mexico.

The U.S. Border Patrol reported more than 1.6 million encounters with migrants along the U.S.-Mexico border in the 2021 fiscal year, more than quadruple the number of the prior fiscal year and the highest annual total on record.

The number of encounters had fallen to just over 400,000 in fiscal 2020 as the coronavirus outbreak slowed migration across much of the world. But encounters at the southwest border rebounded sharply in fiscal 2021 and ultimately eclipsed the previous annual high recorded in fiscal 2000, according to recently published data from U.S. Customs and Border Protection (CBP), the federal agency that encompasses the Border Patrol.

Migrant encounters refer to two distinct kinds of events: expulsions, in which migrants are immediately expelled to their home country or last country of transit, and apprehensions, in which migrants are detained in the United States, at least temporarily.

Since the onset of the coronavirus pandemic, most encounters have resulted in expulsion from the U.S., unlike before the pandemic, when the vast majority ended in apprehension instead. The Trump administration began expelling migrants in March 2020 under a public health order aimed at limiting the spread of COVID-19. The Biden administration has continued to expel migrants under the same order.

Below is a closer look at the shifting dynamics at the southwest border, based on the recent CBP statistics. Most of these statistics refer to federal fiscal years, which run from Oct. 1 to Sept. 30, as opposed to calendar years. It’s also important to note that encounters refer to events, not people, and that some migrants are encountered more than once.

This Pew Research Center analysis examines changing migration patterns at the U.S.-Mexico border, based on current and historical data from U.S. Customs and Border Protection (CBP). The analysis is based on migrant encounters – a common but only partial indicator of how many people enter the United States illegally in a given year.

Encounters refer to two distinct kinds of events: expulsions, in which migrants are immediately expelled to their home country or last country of transit, and apprehensions, in which migrants are detained in the U.S., at least temporarily. Since March 2020, encounter statistics have included expulsions carried out under Title 42, a public health order aimed at limiting the spread of COVID-19. Encounter statistics prior to March 2020 include apprehensions only.

It is important to note that encounters refer to events, not people, and that some migrants are encountered more than once. In fact, repeat border crossers have accounted for a sizable proportion of total encounters in recent years. As a result, the number of encounters overstates the number of distinct individuals involved.

Most of the findings in this analysis refer to federal fiscal years, which run from Oct. 1 to Sept. 30, as opposed to calendar years. Due to data limitations, not all findings in this analysis cover the same time period. CBP statistics on total southwest border encounters are available for the 1960-2021 period, for example, while statistics on the demographic profile of those being encountered are available only for the 2013-2021 period.

This analysis only includes encounters reported by the U.S. Border Patrol. It excludes encounters reported by the Office of Field Operations.
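
Because the figures in this analysis are grouped by fiscal year, it can help to see how a calendar date maps to one. The following is a minimal sketch, not part of the original analysis; the function name `federal_fiscal_year` is hypothetical, and it assumes the usual convention that a fiscal year is named for the calendar year in which it ends.

```python
from datetime import date


def federal_fiscal_year(d: date) -> int:
    """Return the U.S. federal fiscal year that contains date d.

    Assumption: the fiscal year runs Oct. 1 through Sept. 30 and is
    named for the calendar year in which it ends, so
    Oct. 1, 2020 through Sept. 30, 2021 is fiscal 2021.
    """
    # October, November and December belong to the next fiscal year.
    return d.year + 1 if d.month >= 10 else d.year


if __name__ == "__main__":
    for d in (date(2020, 9, 30), date(2020, 10, 1), date(2021, 7, 15)):
        print(d.isoformat(), "-> fiscal", federal_fiscal_year(d))
```

Under this convention, an encounter recorded in July 2021 and one recorded in October 2020 both count toward fiscal 2021.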

Southwest border encounters increased to their highest recorded level in fiscal 2021. The Border Patrol reported 1,659,206 encounters with migrants at the U.S.-Mexico border last fiscal year, narrowly exceeding the prior highs of 1,643,679 in 2000 and 1,615,844 in 1986.

A line graph showing that migrant encounters at the U.S.-Mexico border reached their highest level on record in 2021

The large number of encounters in fiscal 2021 dwarfed the total during the last major wave of migration at the southwest border, which occurred in fiscal 2019. The Border Patrol recorded 851,508 encounters that year.

While the number of encounters was the highest on record last fiscal year, the number of individuals encountered was considerably lower. That’s because more than a quarter of all migrant encounters at U.S. borders in both fiscal 2021 and fiscal 2020 (27% and 26%, respectively) involved repeat crossers, according to CBP statistics. By comparison, the proportion of repeat border crossers was much lower in the 2019 fiscal year (7%), before the Border Patrol began regularly expelling migrants during the coronavirus outbreak. (These recidivism statistics include encounters at all U.S. borders. While separate statistics for only the U.S.-Mexico border are not available, encounters at the southwest border have accounted for more than 97% of total encounters in recent years.)

A line graph showing that more than 1 million southwest border encounters in 2021 involved people from countries other than Mexico

A record number of encounters in fiscal 2021 involved people from countries other than Mexico. Mexico was the single most common origin country for migrants encountered at the border in fiscal 2021. The Border Patrol reported 608,037 encounters with Mexican nationals last year, accounting for 37% of the total. The remaining 1,051,169 encounters, or 63%, involved people from countries other than Mexico – by far the highest total for non-Mexican nationals in CBP records dating back to 2000.

Most of the encounters with non-Mexicans in fiscal 2021 involved people from the Northern Triangle countries of Honduras, Guatemala and El Salvador. There were 308,931 encounters with people from Honduras last fiscal year (representing 19% of all encounters), 279,033 with people from Guatemala (17%) and 95,930 with people from El Salvador (6%). The Northern Triangle region has been a major source of migration at the U.S.-Mexico border in recent years.
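
The shares quoted above follow directly from the encounter counts. As an illustrative sketch only (the variable and function names are hypothetical, and rounding to the nearest whole percent is an assumption about how the figures were reported), the arithmetic can be reproduced like this:

```python
# Fiscal 2021 Border Patrol encounter counts as quoted in the text.
TOTAL_FY2021 = 1_659_206
ENCOUNTERS_BY_ORIGIN = {
    "Mexico": 608_037,
    "Honduras": 308_931,
    "Guatemala": 279_033,
    "El Salvador": 95_930,
}


def share_pct(count: int, total: int) -> int:
    """Share of the total, rounded to the nearest whole percent."""
    return round(100 * count / total)


# Everything that is not Mexico is counted as "other countries".
non_mexican = TOTAL_FY2021 - ENCOUNTERS_BY_ORIGIN["Mexico"]

if __name__ == "__main__":
    for origin, count in ENCOUNTERS_BY_ORIGIN.items():
        print(f"{origin}: {share_pct(count, TOTAL_FY2021)}%")
    print(f"Non-Mexican: {non_mexican:,} ({share_pct(non_mexican, TOTAL_FY2021)}%)")
```

Subtracting the Mexican count from the total gives the 1,051,169 non-Mexican encounters cited earlier.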

Encounters soared in fiscal 2021 for some countries that have not historically been common sources of migration at the U.S.-Mexico border. The number of encounters involving people from Ecuador, for example, increased more than eightfold, from 11,861 in fiscal 2020 to 95,692 in fiscal 2021. There were also stark increases in encounters involving people from Brazil (from 6,946 to 56,735), Nicaragua (from 2,123 to 49,841), Venezuela (from 1,227 to 47,752), Haiti (from 4,395 to 45,532) and Cuba (from 9,822 to 38,139).

A line graph showing that encounters with migrants from some countries rose dramatically in 2021

Economic, social and political instability in some of these countries likely played a role in the spike in encounters at the U.S.-Mexico border last fiscal year. In Ecuador, widespread economic problems and the COVID-19 pandemic have led many migrants to make the journey north. Haiti, meanwhile, has faced a number of challenges in recent years, ranging from natural disasters to the assassination of its president in July 2021.

Related: Biden administration widens scope of Temporary Protected Status for immigrants

The increase in encounters at the U.S.-Mexico border didn’t just involve people from Latin America or the Caribbean region. The number of encounters involving people from Romania rose from 266 in fiscal 2020 to 4,029 in fiscal 2021, while the number involving people from Turkey increased from 67 to 1,366.

A line graph showing that border encounters with single adults, families and unaccompanied children all increased in 2021

Migrant encounters increased across demographic groups in fiscal 2021, but single adults continued to account for the large majority. Encounters with unaccompanied children rose from 30,557 in fiscal 2020 to 144,834 in fiscal 2021, while encounters with people traveling in families increased from 52,230 to 451,087.

By far the largest number and share of encounters involved single adults. There were 1,063,285 encounters with single adults in fiscal 2021, up from 317,864 the year before. More than six-in-ten encounters (64%) involved single adults, though that was down from 79% in fiscal 2020.
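
A count can rise sharply while its share of the total falls, which is what happened with single adults. The sketch below is illustrative rather than part of the original analysis: the names are hypothetical, and it assumes the three demographic groups are exhaustive, so that each year's total is simply their sum.

```python
# Encounter counts by demographic group, as quoted in the text.
GROUPS = {
    2020: {"single adults": 317_864, "unaccompanied children": 30_557, "families": 52_230},
    2021: {"single adults": 1_063_285, "unaccompanied children": 144_834, "families": 451_087},
}


def total_for(year: int) -> int:
    """Sum of the three groups (assumed to cover all encounters)."""
    return sum(GROUPS[year].values())


def single_adult_share(year: int) -> int:
    """Single adults as a whole-percent share of that year's total."""
    return round(100 * GROUPS[year]["single adults"] / total_for(year))


if __name__ == "__main__":
    for year in (2020, 2021):
        print(f"fiscal {year}: total {total_for(year):,}, "
              f"single adults {single_adult_share(year)}%")
```

Under that assumption, the fiscal 2021 groups sum to the 1,659,206 total reported above, and the fiscal 2020 groups sum to just over 400,000.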

Migrant encounters more than doubled in every sector along the U.S.-Mexico border in fiscal 2021. The largest numerical increase occurred in the Rio Grande Valley sector, where there were 549,077 encounters last fiscal year, up from 90,206 the year before. But the largest proportional increase occurred in the Yuma sector, where encounters increased thirteenfold, from 8,804 in fiscal 2020 to 114,488 in fiscal 2021.
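
The distinction between the largest numerical increase and the largest proportional increase can be made concrete with the two sectors named above. This is a hedged sketch with hypothetical names; it covers only the two sectors for which the text gives counts, not all nine.

```python
# (fiscal 2020, fiscal 2021) encounter counts for the two sectors quoted in the text.
SECTORS = {
    "Rio Grande Valley": (90_206, 549_077),
    "Yuma": (8_804, 114_488),
}


def absolute_increase(before: int, after: int) -> int:
    """Numerical change between the two years."""
    return after - before


def fold_change(before: int, after: int) -> float:
    """How many times larger the later count is than the earlier one."""
    return after / before


if __name__ == "__main__":
    for name, (fy2020, fy2021) in SECTORS.items():
        print(f"{name}: +{absolute_increase(fy2020, fy2021):,} encounters, "
              f"{fold_change(fy2020, fy2021):.1f}x the fiscal 2020 level")
```

Of these two sectors, the Rio Grande Valley has the larger numerical increase, while Yuma, starting from a much smaller base, has the larger fold change.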

A map showing that migrant encounters more than doubled in all nine southwest border sectors in 2021

Since the coronavirus outbreak began, most migrant encounters have resulted in expulsion from the U.S., rather than apprehension within the country. In March 2020, the administration of former President Donald Trump invoked Title 42, a public health order allowing the Border Patrol to expel migrants immediately in an effort to control the domestic spread of the coronavirus. President Joe Biden’s administration has continued to expel migrants under Title 42, though to a lesser extent than the Trump administration.

A bar chart showing that most migrant encounters during COVID-19 have ended in expulsion, but less so in recent months

About two-thirds (66%) of all migrant encounters ended in expulsion between April 2020, the first full month after Title 42 was invoked, and September 2021, the end of the 2021 fiscal year. The remaining 34% resulted in apprehension. But the share of encounters resulting in expulsion has decreased under the Biden administration. In September 2021, 54% of encounters ended in expulsion, down from 74% in February 2021, the first full month after Biden took office.

A chart showing that southwest border encounters have often peaked in March, but pattern has changed since 2013

Seasonal migration patterns have changed in recent years. Since 2000, border encounters have typically peaked in the spring – most often in March – before declining during the hot summer months, when migration journeys become more perilous. But the pattern has changed since 2013, with the annual peak occurring in months other than March. July was the peak month in fiscal 2021, with the number of encounters (200,658) far exceeding the total recorded in March (169,216), even though temperatures in July are typically much higher.

Note: This is an update of a post originally published on April 10, 2019.

COMMENTS

  1. Overview of Nonexperimental Research

    Key Takeaways. Nonexperimental research is research that lacks the manipulation of an independent variable, control of extraneous variables through random assignment, or both. There are three broad types of nonexperimental research. Single-variable research focuses on a single variable rather than a relationship between variables.

  2. 6.1 Overview of Non-Experimental Research

    Non-experimental research is research that lacks the manipulation of an independent variable. Rather than manipulating an independent variable, researchers conducting non-experimental research simply measure variables as they naturally occur (in the lab or real world). ... Qualitative data has a separate set of analysis tools depending on the ...

  3. Quantitative Research with Nonexperimental Designs

    There are two main types of nonexperimental research designs: comparative design and correlational design. In comparative research, the researcher examines the differences between two or more groups on the phenomenon that is being studied. For example, studying gender difference in learning mathematics is a comparative research.

  4. 1.6: Non-Experimental Research

    When to Use Non-Experimental Research. As we saw earlier, experimental research is appropriate when the researcher has a specific research question or hypothesis about a causal relationship between two variables—and it is possible, feasible, and ethical to manipulate the independent variable.It stands to reason, therefore, that non-experimental research is appropriate—even necessary—when ...

  5. PDF Non-experimental study designs: The basics and recent advances

    So when we can't randomize…the role of design for non-experimental studies. •Should use the same spirit of design when analyzing non-experimental data, where we just see that some people got the treatment and others the control •Helps articulate 1) the causal question, and 2) the timing of covariates, exposure, and outcomes.

  6. Overview of Non-Experimental Research

    Non-experimental research falls into two broad categories: correlational research and observational research. ... Qualitative data has a separate set of analysis tools depending on the research question. For example, thematic analysis would focus on themes that emerge in the data or conversation analysis would focus on the way the words were ...

  7. 6.1: Overview of Non-Experimental Research

    Non-experimental research is research that lacks the manipulation of an independent variable. Rather than manipulating an independent variable, researchers conducting non-experimental research simply measure variables as they naturally occur (in the lab or real world). Most researchers in psychology consider the distinction between experimental ...

  8. 7.1 Overview of Nonexperimental Research

    Key Takeaways. Nonexperimental research is research that lacks the manipulation of an independent variable, control of extraneous variables through random assignment, or both. There are three broad types of nonexperimental research. Single-variable research focuses on a single variable rather than a relationship between variables.

  9. Overview of Non-Experimental Research

    Types of Non-Experimental Research. Non-experimental research falls into two broad categories: correlational research and observational research. ... For example, thematic analysis would focus on themes that emerge in the data or conversation analysis would focus on the way the words were said in an interview or focus group.

  10. 6.2: Overview of Non-Experimental Research

    When to Use Non-Experimental Research. As we saw in the last chapter, experimental research is appropriate when the researcher has a specific research question or hypothesis about a causal relationship between two variables—and it is possible, feasible, and ethical to manipulate the independent variable.It stands to reason, therefore, that non-experimental research is appropriate—even ...

  11. 6: Non-Experimental Research

    6.3: Correlational Research. Correlational research is a type of non-experimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are many reasons that researchers interested in statistical ...
