Correlational Research | When & How to Use

Published on July 7, 2021 by Pritha Bhandari. Revised on June 22, 2023.

A correlational research design investigates relationships between variables without the researcher controlling or manipulating any of them.

A correlation reflects the strength and/or direction of the relationship between two (or more) variables. The direction of a correlation can be either positive or negative.

Table of contents

  • Correlational vs. experimental research
  • When to use correlational research
  • How to collect correlational data
  • How to analyze correlational data
  • Correlation and causation
  • Other interesting articles
  • Frequently asked questions about correlational research

Correlational and experimental research both use quantitative methods to investigate relationships between variables. But there are important differences in data collection methods and the types of conclusions you can draw.

Correlational research is ideal for gathering data quickly from natural settings. That helps you generalize your findings to real-life situations in an externally valid way.

There are a few situations where correlational research is an appropriate choice.

To investigate non-causal relationships

You want to find out if there is an association between two variables, but you don’t expect to find a causal relationship between them.

Correlational research can provide insights into complex real-world relationships, helping researchers develop theories and make predictions.

To explore causal relationships between variables

You think there is a causal relationship between two variables, but it is impractical, unethical, or too costly to conduct experimental research that manipulates one of the variables.

Correlational research can provide initial indications or additional support for theories about causal relationships.

To test new measurement tools

You have developed a new instrument for measuring your variable, and you need to test its reliability or validity.

Correlational research can be used to assess whether a tool consistently or accurately captures the concept it aims to measure.

There are many different methods you can use in correlational research. In the social and behavioral sciences, the most common data collection methods for this type of research include surveys, observations, and secondary data.

It’s important to carefully choose and plan your methods to ensure the reliability and validity of your results. You should carefully select a representative sample so that your data reflects the population you’re interested in without research bias.

In survey research, you can use questionnaires to measure your variables of interest. You can conduct surveys online, by mail, by phone, or in person.

Surveys are a quick, flexible way to collect standardized data from many participants, but it’s important to ensure that your questions are worded in an unbiased way and capture relevant insights.

Naturalistic observation

Naturalistic observation is a type of field research where you gather data about a behavior or phenomenon in its natural environment.

This method often involves recording, counting, describing, and categorizing actions and events. Naturalistic observation can include both qualitative and quantitative elements, but to assess correlation, you collect data that can be analyzed quantitatively (e.g., frequencies, durations, scales, and amounts).

Naturalistic observation lets you easily generalize your results to real-world contexts, and you can study experiences that aren’t replicable in lab settings. But data analysis can be time-consuming and unpredictable, and researcher bias may skew the interpretations.

Secondary data

Instead of collecting original data, you can also use data that has already been collected for a different purpose, such as official records, polls, or previous studies.

Using secondary data is inexpensive and fast, because data collection is complete. However, the data may be unreliable, incomplete or not entirely relevant, and you have no control over the reliability or validity of the data collection procedures.

After collecting data, you can statistically analyze the relationship between variables using correlation or regression analyses, or both. You can also visualize the relationships between variables with a scatterplot.

Different types of correlation coefficients and regression analyses are appropriate for your data based on their levels of measurement and distributions.

Correlation analysis

Using a correlation analysis, you can summarize the relationship between variables into a correlation coefficient: a single number that describes the strength and direction of the relationship between variables.

The Pearson product-moment correlation coefficient, also known as Pearson’s r, is commonly used for assessing a linear relationship between two quantitative variables.

Correlation coefficients are usually found for two variables at a time, but you can use a multiple correlation coefficient for three or more variables.
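To make the idea of a correlation coefficient concrete, here is a minimal Python sketch that computes Pearson’s r from its textbook formula (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the study-hours data are made up for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied and exam score for six students
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 70, 72, 80]

r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

With these made-up numbers, r comes out close to +1, which describes a strong positive linear relationship. It says nothing about whether studying causes higher scores.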

Regression analysis

With a regression analysis, you can predict how much a change in one variable will be associated with a change in the other variable. The result is a regression equation that describes the line on a graph of your variables.

You can use this equation to predict the value of one variable based on the given value(s) of the other variable(s). It’s best to perform a regression analysis after testing for a correlation between your variables.
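As a rough illustration of a regression equation, the sketch below fits a straight line by ordinary least squares and uses it for a prediction. It assumes a simple linear regression with one predictor; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx                      # change in y per one-unit change in x
    intercept = mean_y - slope * mean_x    # the line passes through the two means
    return slope, intercept

# Hypothetical data: hours studied and exam score for six students
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 70, 72, 80]

slope, intercept = fit_line(hours, scores)
predicted = intercept + slope * 4.5        # predicted score for 4.5 hours
print(f"score = {intercept:.1f} + {slope:.2f} * hours")
print(f"predicted score at 4.5 hours: {predicted:.1f}")
```

The prediction is made inside the range of the observed data. Predicting far outside that range assumes the same linear pattern continues, which the data cannot confirm.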

It’s important to remember that correlation does not imply causation. Just because you find a correlation between two things doesn’t mean you can conclude that one of them causes the other, for a few reasons.

Directionality problem

If two variables are correlated, it could be because one of them is a cause and the other is an effect. But the correlational research design doesn’t allow you to infer which is which. To err on the side of caution, researchers don’t conclude causality from correlational studies.

Third variable problem

A confounding variable is a third variable that influences other variables to make them seem causally related even though they are not. Instead, there are separate causal links between the confounder and each variable.

In correlational research, there’s limited or no researcher control over extraneous variables. Even if you statistically control for some potential confounders, there may still be other hidden variables that disguise the relationship between your study variables.
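The third variable problem can be illustrated with a small simulation. In this sketch, a hypothetical confounder `z` influences both `x` and `y`, which have no direct link to each other. The partial correlation, which statistically controls for `z`, is one standard way to check whether the association survives. All names and data here are simulated assumptions, and the approach only works for confounders you have actually measured.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation between x and y after controlling for z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(42)                      # fixed seed for repeatable output
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]      # the confounder
x = [zi + rng.gauss(0, 1) for zi in z]       # x depends on z only
y = [zi + rng.gauss(0, 1) for zi in z]       # y depends on z only

r_xy = pearson_r(x, y)
r_xy_given_z = partial_r(x, y, z)
print(f"r(x, y)             = {r_xy:.2f}")
print(f"r(x, y | z removed) = {r_xy_given_z:.2f}")
```

Here `x` and `y` are clearly correlated, yet the partial correlation is close to zero, because the whole association runs through `z`.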

Although a correlational study can’t demonstrate causation on its own, it can help you develop a causal hypothesis that’s tested in controlled experiments.

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research.

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design, you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design, you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity.

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions. The Pearson product-moment correlation coefficient (Pearson’s r) is commonly used to assess a linear relationship between two quantitative variables.
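One common alternative to Pearson’s r is Spearman’s rank correlation, which is often used for ordinal data or for relationships that are monotonic but not linear. The sketch below computes it as Pearson’s r applied to ranks, with tied values sharing their average rank. The helper names and the data are assumptions for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data with a monotonic but non-linear relationship (y = x cubed)
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print(f"Pearson's r    = {pearson_r(x, y):.3f}")
print(f"Spearman's rho = {spearman_rho(x, y):.3f}")
```

Because `y` always increases with `x`, the rank correlation is perfect, while Pearson’s r falls short of 1 because the points do not lie on a straight line.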

Cite this Scribbr article

Bhandari, P. (2023, June 22). Correlational Research | When & How to Use. Scribbr. Retrieved April 5, 2024, from https://www.scribbr.com/methodology/correlational-research/

Correlational Research | Guide, Design & Examples

Published on 5 May 2022 by Pritha Bhandari. Revised on 5 December 2022.

A correlational research design investigates relationships between variables without the researcher controlling or manipulating any of them.

A correlation reflects the strength and/or direction of the relationship between two (or more) variables. The direction of a correlation can be either positive or negative.

Table of contents

  • Correlational vs experimental research
  • When to use correlational research
  • How to collect correlational data
  • How to analyse correlational data
  • Correlation and causation
  • Frequently asked questions about correlational research

Correlational and experimental research both use quantitative methods to investigate relationships between variables. But there are important differences in how data is collected and the types of conclusions you can draw.

Correlational research is ideal for gathering data quickly from natural settings. That helps you generalise your findings to real-life situations in an externally valid way.

There are a few situations where correlational research is an appropriate choice.

To investigate non-causal relationships

You want to find out if there is an association between two variables, but you don’t expect to find a causal relationship between them.

Correlational research can provide insights into complex real-world relationships, helping researchers develop theories and make predictions.

To explore causal relationships between variables

You think there is a causal relationship between two variables, but it is impractical, unethical, or too costly to conduct experimental research that manipulates one of the variables.

Correlational research can provide initial indications or additional support for theories about causal relationships.

To test new measurement tools

You have developed a new instrument for measuring your variable, and you need to test its reliability or validity.

Correlational research can be used to assess whether a tool consistently or accurately captures the concept it aims to measure.

There are many different methods you can use in correlational research. In the social and behavioural sciences, the most common data collection methods for this type of research include surveys, observations, and secondary data.

It’s important to carefully choose and plan your methods to ensure the reliability and validity of your results. You should carefully select a representative sample so that your data reflects the population you’re interested in without bias.

In survey research, you can use questionnaires to measure your variables of interest. You can conduct surveys online, by post, by phone, or in person.

Surveys are a quick, flexible way to collect standardised data from many participants, but it’s important to ensure that your questions are worded in an unbiased way and capture relevant insights.

Naturalistic observation

Naturalistic observation is a type of field research where you gather data about a behaviour or phenomenon in its natural environment.

This method often involves recording, counting, describing, and categorising actions and events. Naturalistic observation can include both qualitative and quantitative elements, but to assess correlation, you collect data that can be analysed quantitatively (e.g., frequencies, durations, scales, and amounts).

Naturalistic observation lets you easily generalise your results to real-world contexts, and you can study experiences that aren’t replicable in lab settings. But data analysis can be time-consuming and unpredictable, and researcher bias may skew the interpretations.

Secondary data

Instead of collecting original data, you can also use data that has already been collected for a different purpose, such as official records, polls, or previous studies.

Using secondary data is inexpensive and fast, because data collection is complete. However, the data may be unreliable, incomplete, or not entirely relevant, and you have no control over the reliability or validity of the data collection procedures.

After collecting data, you can statistically analyse the relationship between variables using correlation or regression analyses, or both. You can also visualise the relationships between variables with a scatterplot.

Different types of correlation coefficients and regression analyses are appropriate for your data based on their levels of measurement and distributions.

Correlation analysis

Using a correlation analysis, you can summarise the relationship between variables into a correlation coefficient: a single number that describes the strength and direction of the relationship between variables.

The Pearson product-moment correlation coefficient, also known as Pearson’s r, is commonly used for assessing a linear relationship between two quantitative variables.

Correlation coefficients are usually found for two variables at a time, but you can use a multiple correlation coefficient for three or more variables.

Regression analysis

With a regression analysis, you can predict how much a change in one variable will be associated with a change in the other variable. The result is a regression equation that describes the line on a graph of your variables.

You can use this equation to predict the value of one variable based on the given value(s) of the other variable(s). It’s best to perform a regression analysis after testing for a correlation between your variables.

It’s important to remember that correlation does not imply causation. Just because you find a correlation between two things doesn’t mean you can conclude that one of them causes the other, for a few reasons.

Directionality problem

If two variables are correlated, it could be because one of them is a cause and the other is an effect. But the correlational research design doesn’t allow you to infer which is which. To err on the side of caution, researchers don’t conclude causality from correlational studies.

Third variable problem

A confounding variable is a third variable that influences other variables to make them seem causally related even though they are not. Instead, there are separate causal links between the confounder and each variable.

In correlational research, there’s limited or no researcher control over extraneous variables. Even if you statistically control for some potential confounders, there may still be other hidden variables that disguise the relationship between your study variables.

Although a correlational study can’t demonstrate causation on its own, it can help you develop a causal hypothesis that’s tested in controlled experiments.

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research.

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design, you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design, you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity.

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions. The Pearson product-moment correlation coefficient (Pearson’s r) is commonly used to assess a linear relationship between two quantitative variables.

Cite this Scribbr article

Bhandari, P. (2022, December 05). Correlational Research | Guide, Design & Examples. Scribbr. Retrieved 2 April 2024, from https://www.scribbr.co.uk/research-methods/correlational-research-design/

What is correlation analysis?

Last updated 11 May 2023. Reviewed by Miroslav Damyanov.

Correlation analysis is a staple of data analytics. It’s a commonly used method to measure the relationship between two variables. It helps researchers understand the extent to which changes to the value in one variable are associated with changes to the value in the other. 

Correlations are often misused and misunderstood, especially in the insight industry. Below is a helpful guide to help you understand the basics and mechanics of correlation analysis. 

  • Definition of correlation analysis

Correlation analysis, also known as bivariate analysis, is a statistical method primarily used to identify and explore linear relationships between two variables and then determine the strength and direction of that relationship. It’s mainly used to spot patterns within datasets.

It’s worth noting that correlation doesn't equate to causation. In essence, you cannot infer a cause-and-effect relationship between the two variables from a correlation analysis alone. However, you can determine the size and direction of the relationship.

  • Strength of the correlation

The degree of association in correlation analysis is measured by a correlation coefficient. The Pearson correlation, which is denoted by r , is the most commonly used coefficient. The correlation coefficient quantifies the degree of linear association between two variables and can take values between -1 and +1.

No correlation: This is when the value of r is zero.

Low degree: A small correlation is when r lies between 0 and ±0.29.

Moderate degree: If the value of the correlation coefficient is between ±0.30 and ±0.49, then there’s a medium correlation.

High degree: When the correlation coefficient takes a value from ±0.50 up to, but not including, ±1, it indicates a strong correlation.

Perfect: A perfect correlation occurs when the value of r is exactly ±1, indicating that all data points fall on a straight line: as one variable increases, the other variable either increases (if positive) or decreases (if negative).

These cut-offs are a common rule of thumb rather than fixed standards.
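The categories above can be written as a small lookup. This is only a sketch of the rule of thumb described here, with the function name `describe_strength` chosen for illustration. It treats anything below 0.30 in absolute value as low, and only an absolute value of exactly 1 as perfect.

```python
def describe_strength(r):
    """Label the strength of a correlation coefficient using rule-of-thumb cut-offs."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    size = abs(r)          # strength ignores the sign; the sign gives the direction
    if size == 0:
        return "none"
    if size < 0.30:
        return "low"
    if size < 0.50:
        return "moderate"
    if size < 1:
        return "high"
    return "perfect"

for value in (0.0, -0.2, 0.35, -0.75, 1.0):
    print(f"r = {value:+.2f}: {describe_strength(value)}")
```

Because the function works on the absolute value, r = -0.75 and r = +0.75 are equally strong and differ only in direction.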

  • Direction of the correlation

You can also identify the direction of the linear relationship between two variables by the correlation coefficient's sign. 

Positive correlation

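A positive coefficient (r above 0) indicates a positive correlation, meaning that as one variable increases, the other tends to increase too. Scores from +0.5 to +1 indicate a strong positive correlation.

Negative correlation

A negative coefficient (r below 0) indicates a negative correlation, meaning that as one variable increases, the other tends to decrease. Scores from -0.5 to -1 indicate a strong negative correlation.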

No correlation

If the correlation coefficient is 0, it means there’s no linear relationship between the two variables being analyzed, although a non-linear relationship may still exist. It's worth noting that increasing the sample size gives a more precise estimate of the correlation.

Significance of the correlation 

Once we learn about the strength and direction of the correlation, it’s critical to evaluate whether the observed correlation is likely to have occurred by chance or whether it’s a real relationship between the two variables. Therefore, we need to test the correlation for significance. The most common method for determining the significance of a correlation coefficient is by conducting a hypothesis test. 

The hypothesis test (t-test) helps us decide whether the value of the population correlation coefficient ρ is "close to zero" or "significantly different from zero." We decide this based on the sample correlation coefficient (r) and the sample size (n).

As with other hypothesis tests, the significance level is set first, generally at 5%. If the t-test yields a p-value below 5%, we can conclude that the correlation coefficient is significantly different from zero. In that case, we simply say that the correlation coefficient is "significant." Otherwise, we don’t have enough evidence to conclude that there’s a true linear relationship between the two variables.

In general, the larger the absolute value of the correlation coefficient (r) and the larger the sample size (n), the more likely it is that the correlation is statistically significant. However, it's important to remember that a significant correlation doesn’t necessarily imply causation between the two variables.
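For readers who want to see the mechanics, the standard test statistic for a Pearson correlation is t = r × √(n − 2) / √(1 − r²), with n − 2 degrees of freedom. The sketch below computes it in Python for two hypothetical samples that share the same r but differ in size. The function name is an assumption, and the critical values in the comments are the usual two-tailed 5% values from a t table.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    if n < 3:
        raise ValueError("at least 3 observations are needed")
    if abs(r) >= 1:
        raise ValueError("the statistic is undefined for a perfect correlation")
    return r * math.sqrt((n - 2) / (1 - r * r))

# Same sample correlation, two hypothetical sample sizes
t_small = correlation_t_statistic(0.5, 10)   # 8 degrees of freedom
t_large = correlation_t_statistic(0.5, 30)   # 28 degrees of freedom

# Two-tailed 5% critical values from a standard t table:
# about 2.306 for 8 degrees of freedom, about 2.048 for 28
print(f"n = 10: t = {t_small:.3f}, significant: {abs(t_small) > 2.306}")
print(f"n = 30: t = {t_large:.3f}, significant: {abs(t_large) > 2.048}")
```

The same r of 0.5 falls short of significance with 10 observations but passes with 30, which is the sample-size effect described above. If SciPy is available, `scipy.stats.pearsonr` returns r together with a p-value, so the table lookup is not needed.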

  • What factors affect a correlation analysis?

Below are the factors you must consider when planning a correlation analysis:

Performing a correlation analysis is only appropriate if there’s evidence of a linear relationship between the quantitative variables. You can use a scatter plot to assess linearity. If the points don’t roughly follow a straight line, a correlation analysis isn’t recommended.

Make sure you draw a scatter plot, since it helps you spot outliers, heteroscedasticity, and non-linear relationships at a glance.

Avoid analyzing correlations when the data are repeated measures of the same variable from the same individual, whether taken at the same or at different time points.

The required sample size should be determined a priori.
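The linearity requirement is easy to underestimate, so here is a small Python sketch of what can go wrong. In this made-up dataset, `y` is completely determined by `x` (it is simply x squared), yet Pearson’s r is zero because the relationship is not linear.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical U-shaped data: y depends entirely on x, but not linearly
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r = pearson_r(x, y)
print(f"r = {r:.2f}")
```

A scatter plot of these points would show the U shape immediately, which is why plotting the data first is recommended.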

  • Uses of correlation analysis

Correlation analysis is primarily used to quantify the degree to which two variables relate. Researchers compute a correlation coefficient that tells them how closely changes in one variable track changes in the other. It summarizes the linear relationship between the two variables.

Correlation analysis is used by marketers to evaluate the efficiency of a marketing campaign by monitoring and analyzing customers' reactions to various marketing tactics. As such, they can better understand and serve their customers. 

Another use of correlation analysis is among data scientists and experts tasked with data monitoring. They can use correlation analysis for root cause analysis and to minimize Time To Detection (TTD) and Time To Remediation (TTR).

When two anomalies or unusual events happen simultaneously or at the same rate, the correlation between them can help point to the likely cause of an issue. As a result, users incur a lower cost from the issue if they can understand and fix it sooner using correlation analysis.

  • What is the business value of correlation analysis?

Correlation analysis has numerous business values, including identifying potential inputs for more complex analyses and testing for future changes while holding other factors constant. 

Additionally, businesses can use correlation analysis to understand the relationship between two variables. This type of analysis is easy to interpret, as it focuses on how one variable varies in relation to another.

One of the primary business values of correlation analysis is its ability to identify hidden issues within a company. For example, if there’s a positive correlation between customers looking at reviews for a particular product and whether or not they purchase it, this could indicate a place where testing can provide more information. 

By testing whether increasing the number of people who look at positive product reviews leads to an increase in purchases, businesses can develop hypotheses to improve their products and services.

Correlation analysis can also help businesses diagnose problems with multiple regression models. For instance, if a multivariate or multiple regression model isn’t producing the expected results or if independent variables are not truly independent, correlation analysis can help discover these issues.
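One way to check whether independent variables are truly independent is to compute the correlation between every pair of predictors and flag the pairs that move together too closely. The Python sketch below does this for three hypothetical predictors. The variable names, the data, and the 0.8 cut-off are all assumptions for illustration, not a fixed standard.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical predictors for a sales model
predictors = {
    "ad_spend": [10, 20, 30, 40, 50, 60],
    "site_visits": [110, 205, 290, 410, 500, 620],
    "discount": [5, 0, 10, 5, 0, 10],
}
THRESHOLD = 0.8   # assumed cut-off for "too strongly correlated"

flagged = []
for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
    r = pearson_r(a, b)
    print(f"{name_a} vs {name_b}: r = {r:+.2f}")
    if abs(r) > THRESHOLD:
        flagged.append((name_a, name_b))

print("Strongly correlated predictor pairs:", flagged)
```

In this made-up example, ad spend and site visits are flagged, which suggests they carry largely overlapping information and may make a regression model's coefficients hard to interpret.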

In digital environments, correlations can be especially helpful in fueling different hypotheses that can then be rapidly tested. This is because the testing can be low risk and not require a significant investment of time or money. 

With the abundance of data available to businesses, they must be careful in selecting the variables they’ll analyze. By doing so, they can uncover previously hidden relationships between variables and gain insights that can help them make data-driven decisions. 

  • Correlation ≠ causation

As previously stated, correlation doesn't strictly imply causation, even when you identify a significant relationship by correlation analysis techniques. You can’t determine the cause by the analysis.

A significant relationship implies that there’s more to understand. It suggests that there may be underlying and extraneous factors that you must explore further to look for a cause. Even though a causal relationship may exist, it would be irresponsible for researchers to use the correlation results as proof of it.

  • Example of correlation analysis

A real-life example of correlation analysis is health improvement vs. medical dose reductions. Medical researchers can use a correlation study in clinical trials to better understand how a newly developed drug is related to patients' outcomes.

If patients' health tends to improve the more regularly they take the drug, there’s a positive correlation. If their health tends to deteriorate, there’s a negative correlation. If health shows no systematic change with drug use, there’s no correlation between the two variables (health and the drug).

What is the difference between correlation and correlation analysis?

Correlation shows us the direction and strength of a relationship between two variables. It’s expressed numerically by the correlation coefficient. Correlation analysis, on the other hand, is a statistical test that reveals the relationship between two variables/datasets.

What are correlation and regression?

Regression and correlation are the most popular methods used to examine the linear relationship between two quantitative variables. Correlation measures how strong the relationship is between a pair of variables, while regression is used to describe the relationship as an equation. 

What is the purpose of correlation?

Correlation analysis can help you to identify possible inputs for a more refined analysis. You can also use it to test for future changes while holding other things constant. The whole purpose of using correlations in research is to determine which variables are connected.

7.2 Correlational Research

Learning Objectives

  • Define correlational research and give several examples.
  • Explain why a researcher might choose to conduct correlational research rather than experimental research or another type of nonexperimental research.

What Is Correlational Research?

Correlational research is a type of nonexperimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are essentially two reasons that researchers interested in statistical relationships between variables would choose to conduct a correlational study rather than an experiment. The first is that they do not believe that the statistical relationship is a causal one. For example, a researcher might evaluate the validity of a brief extraversion test by administering it to a large group of participants along with a longer extraversion test that has already been shown to be valid. This researcher might then check to see whether participants’ scores on the brief test are strongly correlated with their scores on the longer one. Neither test score is thought to cause the other, so there is no independent variable to manipulate. In fact, the terms independent variable and dependent variable do not apply to this kind of research.

The other reason that researchers would choose to use a correlational study rather than an experiment is that the statistical relationship of interest is thought to be causal, but the researcher cannot manipulate the independent variable because it is impossible, impractical, or unethical. For example, Allen Kanner and his colleagues thought that the number of “daily hassles” (e.g., rude salespeople, heavy traffic) that people experience affects the number of physical and psychological symptoms they have (Kanner, Coyne, Schaefer, & Lazarus, 1981). But because they could not manipulate the number of daily hassles their participants experienced, they had to settle for measuring the number of daily hassles—along with the number of symptoms—using self-report questionnaires. Although the strong positive relationship they found between these two variables is consistent with their idea that hassles cause symptoms, it is also consistent with the idea that symptoms cause hassles or that some third variable (e.g., neuroticism) causes both.

A common misconception among beginning researchers is that correlational research must involve two quantitative variables, such as scores on two extraversion tests or the number of hassles and number of symptoms people have experienced. However, the defining feature of correlational research is that the two variables are measured—neither one is manipulated—and this is true regardless of whether the variables are quantitative or categorical. Imagine, for example, that a researcher administers the Rosenberg Self-Esteem Scale to 50 American college students and 50 Japanese college students. Although this “feels” like a between-subjects experiment, it is a correlational study because the researcher did not manipulate the students’ nationalities. The same is true of the study by Cacioppo and Petty comparing college faculty and factory workers in terms of their need for cognition. It is a correlational study because the researchers did not manipulate the participants’ occupations.

Figure 7.2 “Results of a Hypothetical Study on Whether People Who Make Daily To-Do Lists Experience Less Stress Than People Who Do Not Make Such Lists” shows data from a hypothetical study on the relationship between whether people make a daily list of things to do (a “to-do list”) and stress. Notice that it is unclear whether this is an experiment or a correlational study because it is unclear whether the independent variable was manipulated. If the researcher randomly assigned some participants to make daily to-do lists and others not to, then it is an experiment. If the researcher simply asked participants whether they made daily to-do lists, then it is a correlational study. The distinction is important because if the study was an experiment, then it could be concluded that making the daily to-do lists reduced participants’ stress. But if it was a correlational study, it could only be concluded that these variables are statistically related. Perhaps being stressed has a negative effect on people’s ability to plan ahead (the directionality problem). Or perhaps people who are more conscientious are more likely to make to-do lists and less likely to be stressed (the third-variable problem). The crucial point is that what defines a study as experimental or correlational is not the variables being studied, nor whether the variables are quantitative or categorical, nor the type of graph or statistics used to analyze the data. It is how the study is conducted.

Figure 7.2 Results of a Hypothetical Study on Whether People Who Make Daily To-Do Lists Experience Less Stress Than People Who Do Not Make Such Lists

Data Collection in Correlational Research

Again, the defining feature of correlational research is that neither variable is manipulated. It does not matter how or where the variables are measured. A researcher could have participants come to a laboratory to complete a computerized backward digit span task and a computerized risky decision-making task and then assess the relationship between participants’ scores on the two tasks. Or a researcher could go to a shopping mall to ask people about their attitudes toward the environment and their shopping habits and then assess the relationship between these two variables. Both of these studies would be correlational because no independent variable is manipulated. However, because some approaches to data collection are strongly associated with correlational research, it makes sense to discuss them here. The two we will focus on are naturalistic observation and archival data. A third, survey research, is discussed in its own chapter.

Naturalistic Observation

Naturalistic observation is an approach to data collection that involves observing people’s behavior in the environment in which it typically occurs. Thus naturalistic observation is a type of field research (as opposed to a type of laboratory research). It could involve observing shoppers in a grocery store, children on a school playground, or psychiatric inpatients in their wards. Researchers engaged in naturalistic observation usually make their observations as unobtrusively as possible so that participants are often not aware that they are being studied. Ethically, this is considered to be acceptable if the participants remain anonymous and the behavior occurs in a public setting where people would not normally have an expectation of privacy. Grocery shoppers putting items into their shopping carts, for example, are engaged in public behavior that is easily observable by store employees and other shoppers. For this reason, most researchers would consider it ethically acceptable to observe them for a study. On the other hand, one of the arguments against the ethicality of the naturalistic observation of “bathroom behavior” discussed earlier in the book is that people have a reasonable expectation of privacy even in a public restroom and that this expectation was violated.

Researchers Robert Levine and Ara Norenzayan used naturalistic observation to study differences in the “pace of life” across countries (Levine & Norenzayan, 1999). One of their measures involved observing pedestrians in a large city to see how long it took them to walk 60 feet. They found that people in some countries walked reliably faster than people in other countries. For example, people in the United States and Japan covered 60 feet in about 12 seconds on average, while people in Brazil and Romania took close to 17 seconds.

Because naturalistic observation takes place in the complex and even chaotic “real world,” there are two closely related issues that researchers must deal with before collecting data. The first is sampling. When, where, and under what conditions will the observations be made, and who exactly will be observed? Levine and Norenzayan described their sampling process as follows:

Male and female walking speed over a distance of 60 feet was measured in at least two locations in main downtown areas in each city. Measurements were taken during main business hours on clear summer days. All locations were flat, unobstructed, had broad sidewalks, and were sufficiently uncrowded to allow pedestrians to move at potentially maximum speeds. To control for the effects of socializing, only pedestrians walking alone were used. Children, individuals with obvious physical handicaps, and window-shoppers were not timed. Thirty-five men and 35 women were timed in most cities. (p. 186)

Precise specification of the sampling process in this way makes data collection manageable for the observers, and it also provides some control over important extraneous variables. For example, by making their observations on clear summer days in all countries, Levine and Norenzayan controlled for effects of the weather on people’s walking speeds.

The second issue is measurement. What specific behaviors will be observed? In Levine and Norenzayan’s study, measurement was relatively straightforward. They simply measured out a 60-foot distance along a city sidewalk and then used a stopwatch to time participants as they walked over that distance. Often, however, the behaviors of interest are not so obvious or objective. For example, researchers Robert Kraut and Robert Johnston wanted to study bowlers’ reactions to their shots, both when they were facing the pins and then when they turned toward their companions (Kraut & Johnston, 1979). But what “reactions” should they observe? Based on previous research and their own pilot testing, Kraut and Johnston created a list of reactions that included “closed smile,” “open smile,” “laugh,” “neutral face,” “look down,” “look away,” and “face cover” (covering one’s face with one’s hands). The observers committed this list to memory and then practiced by coding the reactions of bowlers who had been videotaped. During the actual study, the observers spoke into an audio recorder, describing the reactions they observed. Among the most interesting results of this study was that bowlers rarely smiled while they still faced the pins. They were much more likely to smile after they turned toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication.

Naturalistic observation has revealed that bowlers tend to smile when they turn away from the pins and toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication.

sieneke toering – bowling big lebowski style – CC BY-NC-ND 2.0.

When the observations require a judgment on the part of the observers, as in Kraut and Johnston’s study, this process is often described as coding. Coding generally requires clearly defining a set of target behaviors. The observers then categorize participants individually in terms of which behavior they have engaged in and the number of times they engaged in each behavior. The observers might even record the duration of each behavior. The target behaviors must be defined in such a way that different observers code them in the same way. This is the issue of interrater reliability. Researchers are expected to demonstrate the interrater reliability of their coding procedure by having multiple raters code the same behaviors independently and then showing that the different observers are in close agreement. Kraut and Johnston, for example, video recorded a subset of their participants’ reactions and had two observers independently code them. The two observers agreed on which reactions were exhibited 97% of the time, indicating good interrater reliability.
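The agreement figure described above can be computed as simple percent agreement. The following is a minimal sketch, not Kraut and Johnston's actual procedure; the function name and the ten coded observations are hypothetical.

```python
def percent_agreement(codes_a, codes_b):
    """Share of observations that two raters coded identically (0.0 to 1.0)."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both raters must code the same observations")
    if not codes_a:
        raise ValueError("at least one coded observation is required")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical codes assigned by two raters to the same ten bowlers.
rater_1 = ["open smile", "neutral face", "laugh", "look down", "closed smile",
           "neutral face", "open smile", "look away", "face cover", "laugh"]
rater_2 = ["open smile", "neutral face", "laugh", "look down", "closed smile",
           "neutral face", "closed smile", "look away", "face cover", "laugh"]

agreement = percent_agreement(rater_1, rater_2)
print(f"{agreement:.0%}")  # prints 90%
```

Percent agreement does not correct for agreement that would occur by chance; statistics such as Cohen's kappa are often reported for that reason.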

Archival Data

Another approach to correlational research is the use of archival data, which are data that have already been collected for some other purpose. An example is a study by Brett Pelham and his colleagues on “implicit egotism,” the tendency for people to prefer people, places, and things that are similar to themselves (Pelham, Carvallo, & Jones, 2005). In one study, they examined Social Security records to show that women with the names Virginia, Georgia, Louise, and Florence were especially likely to have moved to the states of Virginia, Georgia, Louisiana, and Florida, respectively.

As with naturalistic observation, measurement can be more or less straightforward when working with archival data. For example, counting the number of people named Virginia who live in various states based on Social Security records is relatively straightforward. But consider a study by Christopher Peterson and his colleagues on the relationship between optimism and health using data that had been collected many years before for a study on adult development (Peterson, Seligman, & Vaillant, 1988). In the 1940s, healthy male college students had completed an open-ended questionnaire about difficult wartime experiences. In the late 1980s, Peterson and his colleagues reviewed the men’s questionnaire responses to obtain a measure of explanatory style—their habitual ways of explaining bad events that happen to them. More pessimistic people tend to blame themselves and expect long-term negative consequences that affect many aspects of their lives, while more optimistic people tend to blame outside forces and expect limited negative consequences. To obtain a measure of explanatory style for each participant, the researchers used a procedure in which all negative events mentioned in the questionnaire responses, and any causal explanations for them, were identified and written on index cards. These were given to a separate group of raters who rated each explanation in terms of three separate dimensions of optimism-pessimism. These ratings were then averaged to produce an explanatory style score for each participant. The researchers then assessed the statistical relationship between the men’s explanatory style as college students and archival measures of their health at approximately 60 years of age. The primary result was that the more optimistic the men were as college students, the healthier they were as older men. Pearson’s r was +.25.

This is an example of content analysis, a family of systematic approaches to measurement using complex archival data. Just as naturalistic observation requires specifying the behaviors of interest and then noting them as they occur, content analysis requires specifying keywords, phrases, or ideas and then finding all occurrences of them in the data. These occurrences can then be counted, timed (e.g., the amount of time devoted to entertainment topics on the nightly news show), or analyzed in a variety of other ways.
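The counting step can be sketched in a few lines. This is only an illustration of keyword counting, assuming single-word keywords and a made-up transcript; real content analysis usually involves trained raters and richer coding schemes.

```python
import re
from collections import Counter


def count_keywords(text, keywords):
    """Count whole-word, case-insensitive occurrences of each keyword."""
    words = re.findall(r"[a-z']+", text.lower())
    tally = Counter(words)
    # Counter returns 0 for words that never occur.
    return {kw: tally[kw.lower()] for kw in keywords}


# Hypothetical snippet of a news transcript.
transcript = ("Tonight: the election results are in. In entertainment news, "
              "a new film opens. More election coverage after the break.")

counts = count_keywords(transcript, ["election", "entertainment", "weather"])
print(counts)  # {'election': 2, 'entertainment': 1, 'weather': 0}
```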

Key Takeaways

  • Correlational research involves measuring two variables and assessing the relationship between them, with no manipulation of an independent variable.
  • Correlational research is not defined by where or how the data are collected. However, some approaches to data collection are strongly associated with correlational research. These include naturalistic observation (in which researchers observe people’s behavior in the context in which it normally occurs) and the use of archival data that were already collected for some other purpose.

Discussion: For each of the following, decide whether it is most likely that the study described is experimental or correlational and explain why.

  • An educational researcher compares the academic performance of students from the “rich” side of town with that of students from the “poor” side of town.
  • A cognitive psychologist compares the ability of people to recall words that they were instructed to “read” with their ability to recall words that they were instructed to “imagine.”
  • A manager studies the correlation between new employees’ college grade point averages and their first-year performance reports.
  • An automotive engineer installs different stick shifts in a new car prototype, each time asking several people to rate how comfortable the stick shift feels.
  • A food scientist studies the relationship between the temperature inside people’s refrigerators and the amount of bacteria on their food.
  • A social psychologist tells some research participants that they need to hurry over to the next building to complete a study. She tells others that they can take their time. Then she observes whether they stop to help a research assistant who is pretending to be hurt.

Kanner, A. D., Coyne, J. C., Schaefer, C., & Lazarus, R. S. (1981). Comparison of two modes of stress measurement: Daily hassles and uplifts versus major life events. Journal of Behavioral Medicine, 4, 1–39.

Kraut, R. E., & Johnston, R. E. (1979). Social and emotional messages of smiling: An ethological approach. Journal of Personality and Social Psychology, 37, 1539–1553.

Levine, R. V., & Norenzayan, A. (1999). The pace of life in 31 countries. Journal of Cross-Cultural Psychology, 30, 178–205.

Pelham, B. W., Carvallo, M., & Jones, J. T. (2005). Implicit egotism. Current Directions in Psychological Science, 14, 106–110.

Peterson, C., Seligman, M. E. P., & Vaillant, G. E. (1988). Pessimistic explanatory style is a risk factor for physical illness: A thirty-five year longitudinal study. Journal of Personality and Social Psychology, 55, 23–27.

Research Methods in Psychology Copyright © 2016 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.

Quantitative Research Methods

Correlation is the relationship or association between two variables. There are multiple ways to measure correlation, but the most common is Pearson's correlation coefficient (r), which tells you the strength and direction of the linear relationship between two variables. The value of r ranges from -1 to 1 (0 indicates no linear relationship). Values of r closer to -1 or 1 indicate a stronger relationship, and values closer to 0 indicate a weaker relationship. Because Pearson's coefficient only picks up on linear relationships, and there are many other ways for variables to be associated, it's always best to plot your variables on a scatter plot so that you can visually inspect them for other types of association.

  • Correlation Penn State University tutorial
  • Correlation and Causation Australian Bureau of Statistics Article

Spurious Relationships

It's important to remember that correlation does not always indicate causation. Two variables can be correlated without either variable causing the other. For instance, ice cream sales and drownings might be correlated, but that doesn't mean that ice cream causes drownings—instead, both ice cream sales and drownings increase when the weather is hot. Relationships like this are called spurious correlations.

  • Spuriousness Harvard Business Review article.
  • New Evidence for Theory of The Stork A satirical article demonstrating the dangers of confusing correlation with causation.
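The ice cream example can be illustrated with a small sketch. The numbers below are invented so that both variables depend on temperature and on nothing else they share. The partial correlation used here is a standard technique that the guide itself does not mention; it is included only to show the association vanishing once temperature is held constant.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y after controlling for z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical daily values: both outcomes rise with temperature.
temperature = [15, 18, 21, 24, 27, 30, 33]
ice_cream = [160, 160, 220, 240, 280, 280, 340]
drownings = [3, 3, 3, 5, 5, 7, 9]

raw = pearson_r(ice_cream, drownings)
controlled = partial_r(ice_cream, drownings, temperature)
print(f"raw r = {raw:.2f}, controlling for temperature r = {controlled:.2f}")
```

In this constructed data set the raw correlation is strong, while the correlation that remains after accounting for temperature is essentially zero.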

Regression is a statistical method for estimating the relationship between two or more variables. In theory, regression can be used to predict the value of one variable (the dependent variable) from the value of one or more other variables (the independent variable/s or predictor/s). There are many different types of regression, depending on the number of variables and the properties of the data that one is working with, and each makes assumptions about the relationship between the variables. (For instance, most types of regression assume that the variables have a linear relationship.) Therefore, it is important to understand the assumptions underlying the type of regression that you use and how to properly interpret its results. Because regression will always output a relationship, whether or not the variables are truly causally associated, it is also important to carefully select your predictor variables.

  • A Refresher on Regression Analysis Harvard Business Review article.
  • Introductory Business Statistics - Regression

Simple Linear Regression

Simple linear regression estimates a linear relationship between one dependent variable and one independent variable.

  • Simple Linear Regression Tutorial Penn State University Tutorial
  • Statistics 101: Linear Regression, The Very Basics YouTube video from Brandon Foltz.
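A minimal sketch of simple linear regression by ordinary least squares, assuming that is the estimation method intended. The function name and data are hypothetical.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations of one independent and one dependent variable.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```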

Multiple Linear Regression

Multiple linear regression estimates a linear relationship between one dependent variable and two or more independent variables.

  • Multiple Linear Regression Tutorial Penn State University Tutorial
  • Multiple Regression Basics NYU course materials.
  • Statistics 101: Multiple Linear Regression, The Very Basics YouTube video from Brandon Foltz.
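A rough sketch of multiple linear regression using only the standard library, assuming ordinary least squares solved through the normal equations. In practice a statistics package would be used; the helper names and data here are hypothetical, and the data were built to follow y = 1 + 2·x1 + 3·x2 exactly.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (e.g., collinear predictors)")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_multiple(predictors, y):
    """Return [intercept, b1, b2, ...] for rows of predictor values."""
    rows = [[1.0] + list(p) for p in predictors]  # leading 1 for the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2.
predictors = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [9, 8, 19, 18, 26]

coefficients = fit_multiple(predictors, y)
print([round(c, 6) for c in coefficients])
```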

If you do a subject search for Regression Analysis you'll see that the library has over 200 books about regression.  Select books are listed below.  Also, note that econometrics texts will often include regression analysis and other related methods.  

  • Last Updated: Aug 18, 2023 11:55 AM
  • URL: https://guides.library.duq.edu/quant-methods

Research-Methodology

Correlation Analysis

Methods of correlation and regression can be used in order to analyze the extent and the nature of relationships between different variables. Correlation analysis is used to understand the nature of relationships between two individual variables. For example, if we aim to study the impact of foreign direct investment (FDI) on the level of economic growth in Vietnam, then two variables can be specified as the amounts of FDI and GDP for the same period.

Correlation coefficient ‘r’ is calculated through the following formula:

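\[
r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]\left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
\]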

Where, x and y are values of variables, and n is size of the sample.
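A minimal sketch of the calculation, assuming the formula meant here is the common computational form of Pearson's r, r = (nΣxy − ΣxΣy) / √([nΣx² − (Σx)²][nΣy² − (Σy)²]). The FDI and GDP figures are invented purely for illustration and are not real data for Vietnam.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from raw sums, where n is the sample size."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical yearly values in arbitrary units.
fdi = [2, 4, 5, 7, 9]
gdp = [10, 14, 15, 19, 24]

r = pearson_r(fdi, gdp)
print(f"r = {r:.3f}")
```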

The value of correlation coefficient can be interpreted in the following manner:

If ‘r’ is equal to 1, then there is perfect positive correlation between the two variables;

If ‘r’ is equal to -1, then there is perfect negative correlation between the two variables;

If ‘r’ is equal to zero, then there is no linear correlation between the two variables.

In practical terms, the closer the value of ‘r’ is to 1, the stronger the positive association between FDI and GDP growth in Vietnam. Similarly, the closer a negative value of ‘r’ is to -1, the stronger the negative association. If ‘r’ is equal to zero, then FDI and GDP change show no linear association within the given sample. In every case, ‘r’ describes association only and does not by itself show that FDI has an impact on GDP.

The most popular forms of correlation analysis used in business studies include Pearson product-moment correlation, Spearman rank correlation, and autocorrelation.

The Pearson product-moment correlation is calculated as the ratio of the sample covariance of the two variables to the product of their standard deviations, and it indicates the strength of a linear relationship. The Pearson coefficient is not robust, because strong nonlinear relationships between the variables are not recognized. It is also not resistant, because it is sensitive to outlying points.

Spearman rank correlation requires the values of each variable to be sorted and assigned ranks, with rank 1 given to the lowest value. When a value appears more than once, the tied values each receive the average of the ranks they would otherwise occupy. The coefficient is then the Pearson correlation computed on the ranks.
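The ranking rule, including average ranks for ties, can be sketched as follows. The sketch computes Spearman's coefficient as the Pearson correlation of the ranks; the function names and data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (lowest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's coefficient: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
rho = spearman_rho([10, 20, 20, 30], [1, 3, 2, 5])
print(f"rho = {rho:.3f}")
```

Because only ranks are used, a relationship that is monotonic but not linear (for example, y equal to the cube of x) still yields a coefficient of 1.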

Autocorrelation (serial correlation) is the correlation between values of the same variable at different times. The autocorrelation coefficient is calculated by applying the Pearson product-moment formula to the series and a lagged (time-shifted) copy of itself. Because an unshifted series correlates perfectly with itself, the autocorrelation function begins with a coefficient of 1 at lag 0.
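A minimal sketch of this lagged calculation, applying Pearson's formula to the overlapping parts of a series and its shifted copy. The function names and the alternating series are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def autocorrelation(series, lag):
    """Pearson's r between the series and a copy of itself shifted by `lag` steps."""
    if not 0 <= lag <= len(series) - 2:
        raise ValueError("lag must leave at least two overlapping points")
    return pearson_r(series[:len(series) - lag], series[lag:])


# Hypothetical series that alternates between two values.
series = [1, 3, 1, 3, 1, 3, 1, 3]
for lag in range(3):
    print(lag, round(autocorrelation(series, lag), 3))
```

For this series the coefficient is 1 at lag 0, -1 at lag 1, and 1 at lag 2. Time-series software often uses a slightly different estimator based on the mean and variance of the whole series, so values can differ a little from this sketch.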

The correlation coefficient ‘r’ illustrated above is just a mathematical formula, and you don’t have to calculate it manually. For a bachelor’s degree dissertation, most supervisors accept correlation tests that have been run on a simple Excel spreadsheet. For master’s or PhD level studies, on the other hand, you will have to use more advanced statistical software such as SPSS or NCSS for your correlation analysis.

Correlation analysis as a research method offers a range of advantages. This method allows data analysis from many subjects simultaneously. Moreover, correlation analysis can study a wide range of variables and their interrelations. On the negative side, a correlation finding does not indicate causation, i.e., a cause-and-effect relationship.

John Dudovskiy

6.2 Correlational Research

Learning objectives.

  • Define correlational research and give several examples.
  • Explain why a researcher might choose to conduct correlational research rather than experimental research or another type of non-experimental research.
  • Interpret the strength and direction of different correlation coefficients.
  • Explain why correlation does not imply causation.

What Is Correlational Research?

Correlational research is a type of non-experimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are many reasons that researchers interested in statistical relationships between variables would choose to conduct a correlational study rather than an experiment. The first is that they do not believe that the statistical relationship is a causal one or are not interested in causal relationships. Recall that two goals of science are to describe and to predict, and the correlational research strategy allows researchers to achieve both. Specifically, this strategy can be used to describe the strength and direction of the relationship between two variables, and if there is a relationship, researchers can use scores on one variable to predict scores on the other (using a statistical technique called regression).

Another reason that researchers would choose to use a correlational study rather than an experiment is that the statistical relationship of interest is thought to be causal, but the researcher cannot manipulate the independent variable because it is impossible, impractical, or unethical. For example, while I might be interested in the relationship between how frequently people use cannabis and their memory abilities, I cannot ethically manipulate how frequently people use cannabis. As such, I must rely on the correlational research strategy: I must simply measure the frequency of cannabis use, measure memory abilities using a standardized test of memory, and then determine whether frequency of cannabis use is statistically related to memory test performance.

Correlation is also used to establish the reliability and validity of measurements. For example, a researcher might evaluate the validity of a brief extraversion test by administering it to a large group of participants along with a longer extraversion test that has already been shown to be valid. This researcher might then check to see whether participants’ scores on the brief test are strongly correlated with their scores on the longer one. Neither test score is thought to cause the other, so there is no independent variable to manipulate. In fact, the terms independent variable and dependent variable do not apply to this kind of research.

Another strength of correlational research is that it is often higher in external validity than experimental research. Recall that there is typically a trade-off between internal validity and external validity. As greater controls are added to experiments, internal validity is increased, but often at the expense of external validity. In contrast, correlational studies typically have low internal validity because nothing is manipulated or controlled, but they often have high external validity. Since nothing is manipulated or controlled by the experimenter, the results are more likely to reflect relationships that exist in the real world.

Finally, extending upon this trade-off between internal and external validity, correlational research can help to provide converging evidence for a theory. If a theory is supported by a true experiment that is high in internal validity as well as by a correlational study that is high in external validity, then the researchers can have more confidence in the validity of their theory. As a concrete example, correlational studies establishing that there is a relationship between watching violent television and aggressive behavior have been complemented by experimental studies confirming that the relationship is a causal one (Bushman & Huesmann, 2001). These converging results provide strong evidence that there is a real relationship (indeed a causal relationship) between watching violent television and aggressive behavior.

Data Collection in Correlational Research

Again, the defining feature of correlational research is that neither variable is manipulated. It does not matter how or where the variables are measured. A researcher could have participants come to a laboratory to complete a computerized backward digit span task and a computerized risky decision-making task and then assess the relationship between participants’ scores on the two tasks. Or a researcher could go to a shopping mall to ask people about their attitudes toward the environment and their shopping habits and then assess the relationship between these two variables. Both of these studies would be correlational because no independent variable is manipulated. 

Correlations Between Quantitative Variables

Correlations between quantitative variables are often presented using scatterplots. Figure 6.3 shows some hypothetical data on the relationship between the amount of stress people are under and the number of physical symptoms they have. Each point in the scatterplot represents one person’s score on both variables. For example, the circled point in Figure 6.3 represents a person whose stress score was 10 and who had three physical symptoms. Taking all the points into account, one can see that people under more stress tend to have more physical symptoms. This is a good example of a positive relationship, in which higher scores on one variable tend to be associated with higher scores on the other. A negative relationship is one in which higher scores on one variable tend to be associated with lower scores on the other. There is a negative relationship between stress and immune system functioning, for example, because higher stress is associated with lower immune system functioning.

Figure 6.3 Scatterplot Showing a Hypothetical Positive Relationship Between Stress and Number of Physical Symptoms. The circled point represents a person whose stress score was 10 and who had three physical symptoms. Pearson’s r for these data is +.51.

The strength of a correlation between quantitative variables is typically measured using a statistic called Pearson’s Correlation Coefficient (or Pearson’s r). As Figure 6.4 shows, Pearson’s r ranges from −1.00 (the strongest possible negative relationship) to +1.00 (the strongest possible positive relationship). A value of 0 means there is no relationship between the two variables. When Pearson’s r is 0, the points on a scatterplot form a shapeless “cloud.” As its value moves toward −1.00 or +1.00, the points come closer and closer to falling on a single straight line. Correlation coefficients near ±.10 are considered small, values near ±.30 are considered medium, and values near ±.50 are considered large. Notice that the sign of Pearson’s r is unrelated to its strength. Pearson’s r values of +.30 and −.30, for example, are equally strong; it is just that one represents a moderate positive relationship and the other a moderate negative relationship. With the exception of reliability coefficients, most correlations that we find in psychology are small or moderate in size. The website http://rpsychologist.com/d3/correlation/, created by Kristoffer Magnusson, provides an excellent interactive visualization of correlations that permits you to adjust the strength and direction of a correlation while witnessing the corresponding changes to the scatterplot.

Figure 6.4 Range of Pearson’s r, From −1.00 (Strongest Possible Negative Relationship), Through 0 (No Relationship), to +1.00 (Strongest Possible Positive Relationship)
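To make the statistic concrete, here is a minimal sketch of how Pearson's r can be computed by hand: the summed cross-products of the deviations from each mean, divided by the square root of the product of the summed squared deviations. The function name `pearson_r` and the stress and symptom scores are invented for illustration; they are not the data plotted in Figure 6.3.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations (numerator of r)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable (denominator of r)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for six people (illustrative only)
stress = [2, 4, 5, 7, 10, 12]
symptoms = [1, 1, 3, 2, 3, 5]

r = pearson_r(stress, symptoms)
print(f"Pearson's r = {r:+.2f}")

# Reversing the direction of one variable flips the sign but not the strength.
r_flipped = pearson_r(stress, [-s for s in symptoms])
print(f"Sign-flipped r = {r_flipped:+.2f}")
```

Because the sign only encodes direction, the two printed values have the same absolute size.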

There are two common situations in which the value of Pearson’s  r  can be misleading. Pearson’s  r  is a good measure only for linear relationships, in which the points are best approximated by a straight line. It is not a good measure for nonlinear relationships, in which the points are better approximated by a curved line. Figure 6.5, for example, shows a hypothetical relationship between the amount of sleep people get per night and their level of depression. In this example, the line that best approximates the points is a curve—a kind of upside-down “U”—because people who get about eight hours of sleep tend to be the least depressed. Those who get too little sleep and those who get too much sleep tend to be more depressed. Even though Figure 6.5 shows a fairly strong relationship between depression and sleep, Pearson’s  r  would be close to zero because the points in the scatterplot are not well fit by a single straight line. This means that it is important to make a scatterplot and confirm that a relationship is approximately linear before using Pearson’s  r . Nonlinear relationships are fairly common in psychology, but measuring their strength is beyond the scope of this book.

Figure 6.5 Hypothetical Nonlinear Relationship Between Sleep and Depression
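The point about nonlinear relationships can be checked with a small sketch. The data below are made up so that depression is lowest at eight hours of sleep and rises symmetrically on either side; the variable names and the quadratic formula are assumptions chosen only to produce a clean curve.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical curved relationship: depression is lowest at 8 hours of sleep.
sleep_hours = [4, 5, 6, 7, 8, 9, 10, 11, 12]
depression = [(h - 8) ** 2 + 2 for h in sleep_hours]

# Across the whole range the rising and falling halves cancel out.
r_all = pearson_r(sleep_hours, depression)

# Looking only at people who sleep 8 hours or less, the relationship is strongly negative.
r_short_sleepers = pearson_r(sleep_hours[:5], depression[:5])

print(f"all sleepers:   r = {r_all:+.2f}")
print(f"8 hours or less: r = {r_short_sleepers:+.2f}")
```

The relationship here is perfectly predictable, yet Pearson's r over the full range is essentially zero, which is why the scatterplot should be inspected first.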

The other common situation in which the value of Pearson’s r can be misleading is when one or both of the variables have a limited range in the sample relative to the population. This problem is referred to as restriction of range. Assume, for example, that there is a strong negative correlation between people’s age and their enjoyment of hip hop music as shown by the scatterplot in Figure 6.6. Pearson’s r here is −.77. However, if we were to collect data only from 18- to 24-year-olds (represented by the shaded area of Figure 6.6), then the relationship would seem to be quite weak. In fact, Pearson’s r for this restricted range of ages is 0. It is a good idea, therefore, to design studies to avoid restriction of range. For example, if age is one of your primary variables, then you can plan to collect data from people of a wide range of ages. Because restriction of range is not always anticipated or easily avoidable, however, it is good practice to examine your data for possible restriction of range and to interpret Pearson’s r in light of it. (There are also statistical methods to correct Pearson’s r for restriction of range, but they are beyond the scope of this book.)

Figure 6.6 Hypothetical Data Showing How a Strong Overall Correlation Can Appear to Be Weak When One Variable Has a Restricted Range. The overall correlation here is −.77, but the correlation for the 18- to 24-year-olds (in the blue box) is 0.
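Restriction of range can also be demonstrated with simulated data. The sketch below invents a population in which enjoyment drops by 0.1 points for each year of age, with some random noise added; the slope, noise level, sample size, and seed are all assumptions chosen for illustration and are not the data behind Figure 6.6.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population aged 18 to 80: enjoyment falls with age, plus noise.
ages = [rng.uniform(18, 80) for _ in range(2000)]
enjoyment = [10 - 0.1 * age + rng.gauss(0, 1.5) for age in ages]
r_full = pearson_r(ages, enjoyment)

# Restrict the sample to 18- to 24-year-olds only.
young = [(a, e) for a, e in zip(ages, enjoyment) if a <= 24]
r_young = pearson_r([a for a, _ in young], [e for _, e in young])

print(f"all ages (n={len(ages)}):   r = {r_full:+.2f}")
print(f"ages 18-24 (n={len(young)}): r = {r_young:+.2f}")
```

The underlying relationship is identical in both samples; only the range of ages differs, and that alone is enough to shrink the correlation in the restricted group.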

Correlation Does Not Imply Causation

You have probably heard repeatedly that “Correlation does not imply causation.” An amusing example of this comes from a 2012 study that showed a positive correlation (Pearson’s r = 0.79) between the per capita chocolate consumption of a nation and the number of Nobel prizes awarded to citizens of that nation [2] . It seems clear, however, that this does not mean that eating chocolate causes people to win Nobel prizes, and it would not make sense to try to increase the number of Nobel prizes won by recommending that parents feed their children more chocolate.

There are two reasons that correlation does not imply causation. The first is called the directionality problem. Two variables, X and Y, can be statistically related because X causes Y or because Y causes X. Consider, for example, a study showing that whether or not people exercise is statistically related to how happy they are, such that people who exercise are happier on average than people who do not. This statistical relationship is consistent with the idea that exercising causes happiness, but it is also consistent with the idea that happiness causes exercise. Perhaps being happy gives people more energy or leads them to seek opportunities to socialize with others by going to the gym.

The second reason that correlation does not imply causation is called the third-variable problem. Two variables, X and Y, can be statistically related not because X causes Y, or because Y causes X, but because some third variable, Z, causes both X and Y. For example, the fact that nations that have won more Nobel prizes tend to have higher chocolate consumption probably reflects geography in that European countries tend to have higher rates of per capita chocolate consumption and invest more in education and technology (once again, per capita) than many other countries in the world. Similarly, the statistical relationship between exercise and happiness could mean that some third variable, such as physical health, causes both of the others. Being physically healthy could cause people to exercise and cause them to be happier. Correlations that are a result of a third variable are often referred to as spurious correlations.
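The third-variable problem can be illustrated with a short simulation. In the sketch below, a made-up variable Z drives both X and Y, and X has no effect on Y at all. The final step uses the standard first-order partial correlation formula to remove Z's influence; the variable names, noise levels, and seed are assumptions for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(7)  # fixed seed so the sketch is repeatable
n = 5000

# Z causes both X and Y; X and Y never influence each other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

r_xy = pearson_r(x, y)
r_xz = pearson_r(x, z)
r_yz = pearson_r(y, z)

# Partial correlation of X and Y, controlling for Z.
r_xy_given_z = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

print(f"r(X, Y)                   = {r_xy:+.2f}")
print(f"r(X, Y) controlling for Z = {r_xy_given_z:+.2f}")
```

X and Y are strongly correlated, but the correlation all but disappears once Z is held constant, which is the signature of a spurious correlation.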

Some excellent and funny examples of spurious correlations can be found at http://www.tylervigen.com  (Figure 6.7  provides one such example).

Figure 6.7 Example of a Spurious Correlation. Source: http://tylervigen.com/spurious-correlations (CC-BY 4.0)

“Lots of Candy Could Lead to Violence”

Although researchers in psychology know that correlation does not imply causation, many journalists do not. One website about correlation and causation, http://jonathan.mueller.faculty.noctrl.edu/100/correlation_or_causation.htm , links to dozens of media reports about real biomedical and psychological research. Many of the headlines suggest that a causal relationship has been demonstrated when a careful reading of the articles shows that it has not because of the directionality and third-variable problems.

One such article is about a study showing that children who ate candy every day were more likely than other children to be arrested for a violent offense later in life. But could candy really “lead to” violence, as the headline suggests? What alternative explanations can you think of for this statistical relationship? How could the headline be rewritten so that it is not misleading?

As you have learned by reading this book, there are various ways that researchers address the directionality and third-variable problems. The most effective is to conduct an experiment. For example, instead of simply measuring how much people exercise, a researcher could bring people into a laboratory and randomly assign half of them to run on a treadmill for 15 minutes and the rest to sit on a couch for 15 minutes. Although this seems like a minor change to the research design, it is extremely important. Now if the exercisers end up in more positive moods than those who did not exercise, it cannot be because their moods affected how much they exercised (because it was the researcher who determined how much they exercised). Likewise, it cannot be because some third variable (e.g., physical health) affected both how much they exercised and what mood they were in (because, again, it was the researcher who determined how much they exercised). Thus experiments eliminate the directionality and third-variable problems and allow researchers to draw firm conclusions about causal relationships.
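A small simulation can show why random assignment matters. In this sketch, exercise has no effect on mood by construction; mood depends only on a made-up health score. When people choose for themselves whether to exercise, healthier people are more likely to do so, and a mood gap appears anyway. When a coin flip decides, the gap vanishes. All names, effect sizes, and the seed are assumptions for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is repeatable
n = 10000

def mood_gap(exercised, moods):
    """Mean mood of exercisers minus mean mood of non-exercisers."""
    yes = [m for e, m in zip(exercised, moods) if e]
    no = [m for e, m in zip(exercised, moods) if not e]
    return mean(yes) - mean(no)

# Third variable: physical health. Mood depends on health only, never on exercise.
health = [rng.gauss(0, 1) for _ in range(n)]
mood = [h + rng.gauss(0, 1) for h in health]

# Correlational study: healthier people are more likely to choose to exercise.
chose_exercise = [h + rng.gauss(0, 1) > 0 for h in health]

# Experiment: the researcher assigns exercise by coin flip, ignoring health.
assigned_exercise = [rng.random() < 0.5 for _ in range(n)]

gap_observed = mood_gap(chose_exercise, mood)
gap_randomized = mood_gap(assigned_exercise, mood)

print(f"self-selected exercise: mood gap = {gap_observed:+.2f}")
print(f"randomly assigned:      mood gap = {gap_randomized:+.2f}")
```

Random assignment breaks the link between health and exercise, so any remaining mood difference in an experiment could be attributed to the exercise itself.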

Key Takeaways

  • Correlational research involves measuring two variables and assessing the relationship between them, with no manipulation of an independent variable.
  • Correlation does not imply causation. A statistical relationship between two variables,  X  and  Y , does not necessarily mean that  X  causes  Y . It is also possible that  Y  causes  X , or that a third variable,  Z , causes both  X  and  Y .
  • While correlational research cannot be used to establish causal relationships between variables, it does allow researchers to achieve many other important objectives (establishing reliability and validity, providing converging evidence, describing relationships, and making predictions).
  • Correlation coefficients can range from −1 to +1. The sign indicates the direction of the relationship between the variables and the numerical value indicates the strength of the relationship.

Exercises

1. Discussion: For each of the following, decide whether the study described is most likely experimental or correlational, and explain why.

  • A cognitive psychologist compares the ability of people to recall words that they were instructed to “read” with their ability to recall words that they were instructed to “imagine.”
  • A manager studies the correlation between new employees’ college grade point averages and their first-year performance reports.
  • An automotive engineer installs different stick shifts in a new car prototype, each time asking several people to rate how comfortable the stick shift feels.
  • A food scientist studies the relationship between the temperature inside people’s refrigerators and the amount of bacteria on their food.
  • A social psychologist tells some research participants that they need to hurry over to the next building to complete a study. She tells others that they can take their time. Then she observes whether they stop to help a research assistant who is pretending to be hurt.

2. Practice: For each of the following statistical relationships, decide whether the directionality problem is present and think of at least one plausible third variable.

  • People who eat more lobster tend to live longer.
  • People who exercise more tend to weigh less.
  • College students who drink more alcohol tend to have poorer grades.

[1] Bushman, B. J., & Huesmann, L. R. (2001). Effects of televised violence on aggression. In D. Singer & J. Singer (Eds.), Handbook of children and the media (pp. 223–254). Thousand Oaks, CA: Sage.
[2] Messerli, F. H. (2012). Chocolate consumption, cognitive function, and Nobel laureates. New England Journal of Medicine, 367, 1562–1564.

What is Correlation Analysis? A Definition and Explanation

Emily James

Correlation analysis is a topic that few people might remember from statistics lessons in school, but that most insights professionals will know as a staple of data analytics. However, correlations are frequently misunderstood and misused, even in the insights industry, for a number of reasons. So here is a helpful guide to the basics of correlation analysis, with a few links along the way.

Definition of Correlation Analysis

Correlation analysis is a statistical method that is used to discover if there is a relationship between two variables/datasets, and how strong that relationship may be.

In terms of market research, this means that correlation analysis is used to analyse quantitative data gathered from research methods such as surveys and polls, to identify whether there are any significant connections, patterns, or trends between the two.

Essentially, correlation analysis is used for spotting patterns within datasets. A positive correlation means that both variables tend to increase together, while a negative correlation means that as one variable increases, the other tends to decrease.

Correlation Coefficients

There are three commonly used coefficients for measuring statistical correlation, named after Spearman, Kendall, and Pearson. This guide writes the end result of each as ‘r’ (elsewhere, Spearman’s coefficient is often written ρ or r_s, and Kendall’s τ). Spearman’s Rank and Pearson’s Coefficient are the two most widely used, and the choice between them depends on the type of data researchers have to hand:

Spearman’s Rank Correlation Coefficient

This coefficient is used to see if there is any significant relationship between the two datasets, and operates under the assumption that the data being used is ordinal, which here means that the numbers do not indicate quantity, but rather signify the position or place of the subject’s standing (e.g. 1st, 2nd, 3rd, etc.).

This coefficient requires a table of data which displays the raw data, its ranks, and the difference between the two ranks. The squared differences between the two ranks are summed to calculate the coefficient, and plotting the data on a scatter graph will indicate whether there is a positive correlation, negative correlation, or no correlation at all between the two variables. The constraint that this coefficient works under is -1 ≤ r ≤ +1, where a result of 0 would mean that there was no relation between the ranks whatsoever. For more information on Spearman’s Rank Correlation Coefficient, there is a great document explaining the process here.
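The table-based process described above can be sketched in a few lines. This example uses the textbook shortcut formula, 1 − 6Σd² / (n(n² − 1)), where d is the difference between the two ranks for each subject; that formula is only exact when there are no tied values. The function names and the satisfaction and spend figures are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman's rank coefficient via the no-ties shortcut formula."""
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    # Squared difference between the two ranks for each subject
    d_squared = [(a - b) ** 2 for a, b in zip(rank_x, rank_y)]
    return 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))

# Hypothetical survey data: satisfaction score and monthly spend for five customers
satisfaction = [3.1, 4.5, 2.2, 4.9, 3.8]
spend = [120, 200, 90, 260, 210]

print("satisfaction ranks:", ranks(satisfaction))
print("spend ranks:       ", ranks(spend))
print("Spearman's rank coefficient:", round(spearman_rho(satisfaction, spend), 2))
```

When the data contain ties, the usual approach is to assign average ranks and compute Pearson's coefficient on the ranks instead of using the shortcut.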

Pearson Product-Moment Coefficient

This is the most widely used correlation analysis formula, which measures the strength of the ‘linear’ relationship between the raw data from both variables, rather than their ranks. This is a dimensionless coefficient, meaning that its value does not depend on the units in which the variables are measured, which is one reason why this coefficient is the first formula researchers try.

However, if the relationship between the data is not linear, this coefficient will not accurately represent the relationship between the two variables, and Spearman’s Rank should be considered instead (it captures relationships that consistently rise or fall even when they do not follow a straight line). Pearson’s coefficient requires the relevant data to be entered into a table similar to that of Spearman’s Rank but without the ranks, and the result will be in the numerical form that all correlation coefficients share: -1 ≤ r ≤ +1.
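As a rough illustration of the difference, the sketch below applies both coefficients to made-up data in which one variable doubles every time the other goes up by one. The relationship always rises, but it is far from a straight line. Spearman's coefficient is computed here as Pearson's coefficient applied to the ranks; all names and values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson's coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

# Hypothetical curved but always-increasing relationship
weeks = list(range(1, 11))
signups = [2 ** w for w in weeks]

pearson = pearson_r(weeks, signups)
spearman = pearson_r(ranks(weeks), ranks(signups))

print(f"Pearson:  {pearson:.2f}")
print(f"Spearman: {spearman:.2f}")
```

Pearson's coefficient understates the relationship because the points bend away from any straight line, while Spearman's reaches its maximum because the rank order is perfectly preserved.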

When to Use

The two methods outlined above are to be used according to whether there are parameters associated with the data gathered. The two terms to watch out for are:

  • Parametric: (Pearson’s Coefficient) Where the data must be handled in relation to the parameters of populations or probability distributions. Typically used with quantitative data already set out within said parameters.
  • Nonparametric: (Spearman’s Rank) Where no assumptions can be made about the probability distribution. Typically used with ranked (ordinal) data, but can be used with quantitative data if Pearson’s Coefficient proves inadequate.

In cases when both are applicable, statisticians recommend using parametric methods such as Pearson’s Coefficient, because they tend to be more precise. But that doesn’t mean the non-parametric methods should be discounted, particularly when there isn’t much data or when the assumptions behind the parametric method are in doubt.

Interpreting Results

Typically, the best way to gain a generalised but more immediate interpretation of the results of a set of data is to visualise it on a scatter graph such as these:

Positive Correlation

Any score from +0.5 to +1 indicates a strong positive correlation, which means that both variables increase together. The line of best fit, or the trend line, is placed to best represent the data on the graph. In this case, it follows the data points upwards to indicate the positive correlation.

Negative Correlation

Any score from -0.5 to -1 indicates a strong negative correlation, which means that as one variable increases, the other tends to decrease. The line of best fit can be seen here to indicate the negative correlation. In these cases it slopes downwards from left to right.

No Correlation

Very simply, a score of 0 indicates that there is no linear correlation, or relationship, between the two variables.

The larger the sample size, the more accurate the result. This holds no matter which formula is used: the more data that is input into the formula, the more reliable the end result will be.

Outliers or anomalies must be accounted for in both correlation coefficients. Using a scatter graph is the easiest way of identifying any anomalies that may have occurred, and running the correlation analysis twice (with and without anomalies) is a great way to assess the strength of the influence of the anomalies on the analysis. If anomalies are present, Spearman’s Rank coefficient may be used instead of Pearson’s Coefficient, as the ranking system makes this formula far less sensitive to extreme values.
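Here is a minimal sketch of the "run it twice" check, using invented data in which eight ordinary points show almost no relationship and a single extreme point is added. The helper names and all values are hypothetical, and the Spearman calculation uses the no-ties shortcut formula.

```python
import math

def pearson_r(xs, ys):
    """Pearson's coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman's rank coefficient via the no-ties shortcut formula."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Eight ordinary (hypothetical) data points with no clear pattern
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 3, 6, 2, 7, 4, 8, 1]

# The same data with one anomaly added far from the rest
x_with = x + [30]
y_with = y + [30]

pearson_without = pearson_r(x, y)
pearson_with = pearson_r(x_with, y_with)
spearman_with = spearman_rho(x_with, y_with)

print(f"Pearson without anomaly: {pearson_without:+.2f}")
print(f"Pearson with anomaly:    {pearson_with:+.2f}")
print(f"Spearman with anomaly:   {spearman_with:+.2f}")
```

A single anomaly is enough to turn a negligible Pearson result into an apparently strong one, while the rank-based coefficient moves far less.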

Correlation ≠ Causation

While a significant relationship may be identified by correlation analysis techniques , correlation does not imply causation. The cause cannot be determined by the analysis, nor should this conclusion be attempted. The significant relationship implies that there is more to understand and that there are extraneous or underlying factors that should be explored further in order to search for a cause. While it is possible that a causal relationship exists, it would be remiss of any researcher to use the correlation results as proof of this existence.

The cause of any relationship that may be discovered through correlation analysis is for the researcher to determine through other means of analysis. Statistics such as the coefficient of determination quantify how much of the variation in one variable is accounted for by the other, but they do not identify a cause either. However, there is a great amount of value that correlation analysis can provide; for example, the strength of the dependency between the variables can be estimated, which can help firms estimate the cost and sale of a product or service.

In essence, the uses for and applications of correlation-based statistical analyses allow researchers to identify which aspects and variables are dependent on each other, the result of which can generate actionable insights as they are, or starting points for further investigations and deeper insights.

About Emily James

As a professional copywriter, Emily brings our global vision to life through a broad range of industry-leading content.

Quantitative Data Analysis With SPSS

16 Quantitative Analysis with SPSS: Correlation

Mikaila Mariel Lemonik Arthur

So far in this text, we have only looked at relationships involving at least one discrete variable. But what if we want to explore relationships between two continuous variables? Correlation is a tool that lets us do just that. [1] The way correlation works is detailed in the chapter on Correlation and Regression ; this chapter, then, will focus on how to produce scatterplots (the graphical representations of the data upon which correlation procedures are based); bivariate correlations and correlation matrices (which can look at many variables, but only two at a time); and partial correlations (which enable the analyst to examine a bivariate correlation while controlling for a third variable).

A screenshot of the dialog for selecting the type of scatterplot. Use arrow keys to move between simple scatter, matrix scatter, and other options not described in this text. Use tab to move to the Define button, Cancel, or Help.

Scatterplots

To produce a scatterplot, go to Graphs → Legacy Dialogs → Scatter/Dot (Alt+G, Alt+L, Alt+S), as shown in Figure 13 in the chapter on Quantitative Analysis with SPSS: Univariate Analysis . Choose “Simple Scatter” for a scatterplot with two variables, as shown in Figure 1.

A screenshot of the dialog for a simple scatterplot. Alt+Y accesses the Y Axis variable and Alt+X accesses the X axis variable. Other options include Alt+S for Set Markers By, Alt+C for Label Cases By, Alt+W for Panel By Rows, Alt+L for Panel by Columns, Alt+T for the titles dialog, Alt+O for the Options dialog, and Alt+U to toggle Use Chart Specifications from (after which Alt+F allows you to load a file with the chart specifications). Under Titles, Alt+L for Line 1 of the title, Alt+N for line 2, Alt+S for the subtitle, Alt+1 for line 1 of the footnote, Alt+2 for line 2 of the footnote. Under Options, most will typically be greyed out, but if not, Alt+X for exclude cases listwise, Alt+V for exclude cases variable by variable, Alt+D to toggle Display groups defined by missing values, Alt+S to toggle Display chart with case labels; Alt+E for Display error bars, Alt+C for Error Bars Represent Confidence intervals (Alt+L allows you to specify the level), Alt+A for Error Bars represent Standard error (with Alt+M allowing you to specify the Multiplier), and Alt+N for Error Bars represent Standard Deviation (with Alt+M allowing you to specify the multiplier).

This brings up the dialog for creating a scatterplot, as shown in Figure 2. The independent variable is placed in the X Axis box, as it is a graphing convention to always put the independent variable on the X axis (you can remember this because X comes before Y, therefore X is the independent variable and Y is the dependent variable, and X goes on the X axis while Y goes on the Y axis). Then the dependent variable is placed in the Y Axis box.

There are a variety of other options in the simple scatter dialog, but most are rarely used. In a small dataset, Label Cases by allows you to specify a variable that will be used to label the dots in the scatterplot (for instance, in a database of states you could label the dots with the 2-letter state code).

Once the scatterplot is set up with the independent and dependent variables, click OK to continue. The scatterplot will then appear in the output. In this case, we have used the independent variable AGE and the dependent variable CARHR to look at whether there is a relationship between the respondent’s age and how many hours they spend in a car per week. The resulting scatterplot is shown in Figure 3.

A scatterplot with age of respondent on the X axis and how many hours, in a typical week, the respondent spends in a car or other motor vehicle, not including public transit, on the Y axis. The graph shows values clustered under 10 hours, but with many outliers above 60 and 3 at or above 80 (the highest at about 90). Outliers are distributed broadly but fewer appear at the oldest ages.

In some scatterplots, it is easy to observe the relationship between the variables. In others, like the one in Figure 3, the pattern of dots is too complex to make it possible to really see the relationship. A tool to help analysts visualize the relationship is the line of best fit, as discussed in the chapter on Correlation and Regression. This line is the line mathematically calculated to lie as close as possible to the dots overall (in the usual least-squares sense, it minimizes the sum of the squared vertical distances between the dots and the line). To add the line of best fit, sometimes called the regression line or the fit line, to your scatterplot, go to the scatterplot in the output window and double-click on it. This will open up the Chart Editor window. Then go to Elements → Fit Line at Total, as shown in Figure 4. This will bring up the Properties window. Under the Fit Line tab, be sure the Linear button is selected; click apply if needed and close out.

A screenshot of the process for adding a regression line to a scatterplot. Begin by using tab and arrow keys to navigate to the scatterplot in the output window. Then, once the output window is selected, click enter. Alt+M opens the Elements menu (if broken, use Alt+O and then the right arrow); Alt+F the fit line at total dialog. This brings up the properties window; Alt+L selects Linear and Alt+A applies (you may not need to apply if you have not had to change the selection of line type). There are other types of fit lines, like Quadratic (Alt+Q) and Cubic (Alt+U) but they are beyond the scope of the chapter.
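For readers curious about what a linear fit line involves, the following is a minimal Python sketch of an ordinary least-squares line, the standard method for a straight line of best fit. It is offered as an illustration of the idea, not as a description of SPSS's internal code, and the function name and the small age and car-hours dataset are made up.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how the y deviations move with the x deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical respondents: age and weekly hours spent in a car
age = [22, 30, 41, 55, 63, 74]
car_hours = [9, 8, 8, 6, 7, 4]

intercept, slope = fit_line(age, car_hours)
print(f"y = {intercept:.2f} + ({slope:.3f})x")
```

A negative slope here means that predicted car hours fall as age rises, which is the same reading applied to the SPSS output.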

Doing so will add a line with an equation to the scatterplot, as shown in Figure 5. [2] From looking at the line, we can see that as age goes up, time spent in the car per week goes down, but only slightly. The equation confirms this. As shown in the graph, the equation for this line is y = 9.04 − 0.05x. This equation tells us that the line crosses the y axis at 9.04 and that the line goes down 0.05 hours per week in the car for every one year that age goes up (that’s about 3 minutes).

This image is the same as Figure 3 but with the addition of a regression line and an equation, y=9.04-0.05x, and the Rsquared linear of 0.010.
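The arithmetic behind reading this equation can be sketched as follows. The intercept, slope, and R-squared values are the ones reported for Figure 5; the function and variable names are assumptions for illustration. The last step relies on the fact that, for a line fitted with a single predictor, R-squared is the square of Pearson's r and r takes the sign of the slope.

```python
import math

# Values reported on the scatterplot in Figure 5
INTERCEPT = 9.04   # predicted hours per week at age 0 (where the line crosses the y axis)
SLOPE = -0.05      # change in hours per week for each additional year of age
R_SQUARED = 0.010  # "R squared linear" shown on the chart

def predicted_car_hours(age):
    """Predicted weekly hours in a car for a respondent of the given age."""
    return INTERCEPT + SLOPE * age

# Convert the slope from hours to minutes per year of age
minutes_per_year = SLOPE * 60

# Recover Pearson's r: same sign as the slope, magnitude sqrt(R squared)
r = math.copysign(math.sqrt(R_SQUARED), SLOPE)

for age in (20, 40, 60):
    print(f"age {age}: {predicted_car_hours(age):.2f} hours per week")
print(f"change per year of age: {minutes_per_year:.0f} minutes")
print(f"Pearson's r: {r:.2f}")
```

An r of this size is small, which matches the observation that time in the car goes down with age "but only slightly."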

What if we are interested in a whole bunch of different variables? It would take a while to produce scatterplots for each pair of variables. But there is an option for producing them all at once, if smaller and a bit harder to read. This is a scatterplot matrix. To produce a scatterplot matrix, go to Graphs → Legacy Dialogs → Scatter/Dot (Alt+G, Alt+L, Alt+S), as in Figure 1. But this time, choose Matrix from the dialog that appears.

In the Scatterplot Matrix dialog, select all of the variables you are interested in and put them in the Matrix Variables box, and then click OK. The many other options here, as in the case of the simple scatterplot, are rarely used.

The scatterplot matrix will then be produced. As you can see in Figure 7, the scatterplot matrix involves a series of smaller scatterplots, one for each pair of variables specified. Here we specified CARHR and AGE, the two variables we were already using, and added REALINC, the respondent’s family’s income in real (inflation-adjusted) dollars. It is possible, using the same instructions detailed above, to add lines of best fit to the little scatterplots in the scatterplot matrix. Note that each little scatterplot appears twice, once with the variable on the x-axis and once with the variable on the y-axis. You only need to pay attention to one version of each pair of scatterplots.

A scatterplot matrix showing scatterplots of the relationships between hours spent in the car per week and age; hours spent in the car per week and real family income, and real family income and age. None provide a visual that makes it possible to easily identify the relationship.

Keep in mind that while you can include discrete variables in a scatterplot, the resulting scatterplot will be very hard to read as most of the dots will just be stacked on top of each other. See Figure 8 for an example of a scatterplot matrix that uses some binary and ordinal variables so you are aware of what to expect in such circumstances. Here, we are looking at the relationships between pairs of the three variables real family income, whether the respondent works for themselves or someone else, and how they would rate their family income from the time that they were 16 in comparison to that of others. As you can see, including discrete variables in a scatterplot produces a series of stripes which are not very useful for analytical purposes.

A scatterplot matrix looking at real family income, whether the respondent works for themselves (yes/no), and how the respondent would rate their family income compared to others at age 16 (ordinal). The scatterplots basically display stripes, not groupings of dots that would make it possible to observe a relationship.

Correlation

Scatterplots can help us visualize the relationships between our variables. But they cannot tell us whether the patterns we observe are statistically significant or how strong the relationships are. For this, we turn to correlation, as discussed in the chapter on Correlation and Regression. Correlations are bivariate in nature; in other words, each correlation looks at the relationship between two variables. However, as with the scatterplot matrix discussed above, we can produce a correlation matrix with results for a series of pairs of variables all shown in one table.

A screenshot of the bivariate correlation dialog. Alt+V moves to the variables box. Alt+N toggles the Pearson coefficient; Alt+K the Kendall's tau-b, and Alt+S the Spearman; Alt+T selects two-tailed and Alt+L selects one-tailed. Alt+F toggles Flag significant correlations. There is an option to show only lower triangle but it must be accessed via tab. Alt+O opens the options menu, under which Alt+M produces means and standard deviations. There are various other tools and options which are less frequently used.

To produce a correlation matrix, go to Analyze → Correlate → Bivariate (Alt+A, Alt+C, Alt+B). Put all of the variables of interest in the Variables box. Be sure Flag significant correlations is checked and select your correlation coefficient. Note that the dialog provides the option of three different correlation coefficients, Pearson, Kendall’s tau-b, and Spearman. The first, Pearson, is used when looking at the relationship between two continuous variables; the other two are used when looking at the relationship between two ordinal variables. [3] In most cases, you will want the two-tailed test of significance. Under options, you can request that means and standard deviations are also produced. When your correlation is set up, as shown in Figure 8, click OK to produce it. The results will be as shown in Table 1 (the order of variables in the table is determined by the order in which they were entered into the bivariate correlation dialog).

As in the scatterplot matrix above, each correlation appears twice, so you only need to look at half of the table, either above or below the diagonal. Note that the diagonal shows the correlation of each variable with itself, so each diagonal cell reports a perfect 1 along with the number of cases with valid responses on that variable. For each pair of variables, the correlation matrix includes the N, or number of respondents included in the analysis; the Sig. (2-tailed), or the p value of the correlation; and the Pearson Correlation, which is the measure of association in this analysis. The Pearson Correlation is starred to further indicate the significance level. The direction, indicated by a + or – sign, tells us whether the relationship is direct or inverse. Therefore, for each pair of variables, you can determine the significance, strength, and direction of the relationship. Taking the results in Table 1 one variable pair at a time, we can thus conclude that:

  • The relationship between age and family income is not significant. (We could say there is a weak positive association, but since this association is not significant, we often do not comment on it.)
  • The relationship between time spent in a car per week and family income is significant at the p<0.05 level. It is a weak negative relationship—in other words, as family income goes up, time spent in a car each week goes down, but only a little bit.
  • The relationship between time spent in a car per week and age is significant at the p<0.001 level. It is a weak negative relationship; in other words, as age goes up, time spent in a car each week goes down.
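For readers who want to see what the correlation matrix is computing, here is a minimal sketch in plain Python rather than SPSS. The data values and variable names are hypothetical and are not the survey data behind Table 1; the sketch reports only Pearson's r, not the significance test.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dev_x = [v - mean_x for v in x]
    dev_y = [v - mean_y for v in y]
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    return cross / math.sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))

def correlation_matrix(data):
    """Return {(row, col): r} for every ordered pair of variables in data."""
    return {(a, b): pearson_r(data[a], data[b]) for a in data for b in data}

# Hypothetical values for illustration only; these are not the survey data.
data = {
    "age": [25, 32, 47, 51, 62, 70],
    "car_hours": [12, 10, 9, 7, 6, 3],
    "income": [30, 45, 52, 48, 60, 41],
}

matrix = correlation_matrix(data)
for row in data:
    print(row.ljust(10), "  ".join(f"{matrix[(row, col)]:+.3f}" for col in data))
```

As in the SPSS output, the diagonal is 1 and the table is symmetric, so only one triangle needs to be read.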

Partial Correlation

Partial correlation analysis is an analytical procedure designed to allow you to examine the association between two continuous variables while controlling for a third variable. Remember that when we control for a variable, what we are doing is holding that variable constant so we can see what the relationship between our independent and dependent variables would look like without the influence of the third variable on that relationship.
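One way to see what "holding the third variable constant" means numerically is the standard first-order partial correlation formula, which combines the three pairwise Pearson correlations. The sketch below is an illustration in Python, not a description of SPSS internals, and the input correlations are hypothetical.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical pairwise correlations, for illustration only.
controlled = partial_r(r_xy=0.50, r_xz=0.60, r_yz=0.70)
print(round(controlled, 3))

# If z is unrelated to both x and y, controlling for it changes nothing.
unchanged = partial_r(r_xy=0.30, r_xz=0.0, r_yz=0.0)
print(unchanged)
```

In the first call, much of the apparent x–y association is shared with z, so the partial correlation is considerably smaller than the bivariate one.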

Once you’ve developed a hypothesis about the relationship between the independent, dependent, and control or intervening variable and run appropriate descriptive statistics, the first step in partial correlation analysis is to run a regular bivariate correlation with all of your variables, as shown above, and interpret your results.

A screenshot of the partial correlation dialog box. Alt+V moves to the variables box, while Alt+C moves to the Controlling for box. Alt+T selects a two-tailed test of significance, while Alt+N selects a one-tailed test. Alt+D toggles Display actual significance level. Alt+O opens the Options dialog, under which Alt+M produces means and standard deviations.

After running and interpreting the results of your bivariate correlation matrix, the next step is to produce the partial correlation by going to Analyze → Correlate → Partial (Alt+A, Alt+C, Alt+R). Place the independent and dependent variables in the Variables box, and the control variable in the Controlling for box, as shown in Figure 9. Note that the partial correlation assumes continuous variables and will only produce the Pearson correlation. The resulting partial correlation table will look much like the original bivariate correlation table, but will show that the third variable has been controlled for, as shown in Table 2. To interpret the results of the partial correlation, begin by looking at the significance and association displayed and interpret them as usual.

To interpret the results, we again look at significance, strength, and direction. Here, we find that the relationship is significant at the p<0.001 level and it is a weak negative relationship. As age goes up, time spent in a car each week goes down.

After interpreting the results of the bivariate correlation, compare the value of the measure of association in the correlation to that in the partial correlation to see how they differ. Keep in mind that we ignore the + or – sign when we do this, just considering the actual number (the absolute value). In this case, then, we would be comparing 0.100 from the bivariate correlation to 0.106 from the partial correlation. The number in the partial correlation is just a little bit higher. So what does this mean?

Interpreting Partial Correlation Coefficients

To determine how to interpret the results of your partial correlation, figure out which of the following criteria applies:

  • If the correlation between x and y is smaller in the bivariate correlation than in the partial correlation: the third variable is a suppressor variable. This means that when we don’t control for the third variable, the relationship between x and y seems smaller than it really is. So, for example, if I give you an exam with a very strict time limit to see if how much time you spend in class predicts your exam score, the exam time limit might suppress the relationship between class time and exam scores. In other words, if we control for the time limit on the exam, your time in class might better predict your exam score.
  • If the correlation between x and y is bigger in the bivariate correlation than in the partial correlation, this means that the third variable is a mediating variable. This is another way of saying that it is an intervening variable; in other words, the relationship between x and y seems larger than it really is because some other variable z intervenes in the relationship between x and y to change the nature of that relationship. So, for example, if we are interested in the relationship between how tall you are and how good you are at basketball, we might find a strong relationship. However, if we added the additional variable of how many hours a week you practice shooting hoops, we might find the relationship between height and basketball skill is much diminished.
  • It is additionally possible for the direction of the relationship to change. So, for example, we might find that there is a direct relationship between miles run and marathon performance, but if we add frequency of injuries, then running more miles might reduce your marathon performance.
  • If the value of Pearson’s r is the same or very similar in the bivariate and partial correlations, the third variable has little or no effect. In other words, the relationship between x and y is basically the same regardless of whether we consider the influence of the third variable, and thus we can conclude that the third variable does not really matter much and the relationship of interest remains the one between our independent and dependent variables.
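The decision rules in the list above can be written as a small function. This is a sketch, not a standard procedure: the function name and the `tolerance` cutoff for "the same or very similar" are assumptions made here, since the text gives no numeric threshold.

```python
def classify_third_variable(r_bivariate, r_partial, tolerance=0.005):
    """Label the control variable using the rules listed above.

    tolerance is a hypothetical cutoff for "the same or very similar";
    the text does not give a numeric threshold.
    """
    # A sign flip means the direction of the relationship changed.
    if r_bivariate * r_partial < 0:
        return "direction changed"
    # Compare absolute values, ignoring the + or - sign.
    difference = abs(r_partial) - abs(r_bivariate)
    if abs(difference) <= tolerance:
        return "little or no effect"
    return "suppressor" if difference > 0 else "mediating (intervening)"

# Coefficients from the worked example (0.100 bivariate, 0.106 partial, both negative).
print(classify_third_variable(-0.100, -0.106))
```

With the assumed tolerance, the worked example is labeled a suppressor; a looser tolerance would instead label it "little or no effect", which is why the cutoff is a judgment call.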

Finally, remember that significance still matters! If neither the bivariate correlation nor the partial correlation is significant, we cannot reject our null hypothesis and thus we cannot conclude that there is anything happening amongst our variables. If both the bivariate correlation and the partial correlation are significant, we can reject the null hypothesis and proceed according to the instructions for interpretation as discussed above. If the original bivariate correlation was not significant but the partial correlation was significant, we cannot reject the null hypothesis with regard to the relationship between our independent and dependent variables alone. However, we can reject the null hypothesis that there is no relationship between the variables as long as we are controlling for the third variable! If the original bivariate correlation was significant but the partial correlation was not significant, we can reject the null hypothesis with regard to the relationship between our independent and dependent variables, but we cannot reject the null hypothesis when considering the role of our third variable. While we can't be sure what is going on in such a circumstance, the analyst should conduct more analysis to try to see what the relationship between the control variable and the other variables of interest might be.
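The four significance combinations described above can be summarized as a lookup. The sketch below is illustrative only: the function name, the wording of the returned guidance, the 0.05 threshold, and the p values are all assumptions made here.

```python
def significance_guidance(bivariate_significant, partial_significant):
    """Map the four significance combinations described above to a next step."""
    if bivariate_significant and partial_significant:
        return "Reject the null; compare the two coefficients to interpret the third variable."
    if partial_significant:
        return "Reject the null only for the relationship with the third variable controlled."
    if bivariate_significant:
        return "Reject the null for the bivariate relationship only; investigate the control variable."
    return "Cannot reject the null hypothesis."

# alpha = 0.05 is an assumed threshold; the p values are hypothetical.
alpha = 0.05
bivariate_p, partial_p = 0.20, 0.01
print(significance_guidance(bivariate_p < alpha, partial_p < alpha))
```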

So, what about our example above? Well, the number in our partial correlation was higher, even if just a little bit, than the number in our bivariate correlation. This means that family income is a suppressor variable. In other words, when we do not control for family income, the relationship between age and time spent in the car seems smaller than it really is. But here is where we find the limits of what the computer can do to help us with our analysis—the computer cannot explain why controlling for income makes the relationship between age and time spent in the car larger. We have to figure that out ourselves. What do you think is going on here?

  • Choose two continuous variables of interest. Produce a scatterplot with regression line and describe what you see.
  • Choose three continuous variables of interest. Produce a scatterplot matrix for the three variables and describe what you see.
  • Using the same three continuous variables, produce a bivariate correlation matrix. Interpret your results, paying attention to statistical significance, direction, and strength.
  • Choose one of your three variables to use as a control variable. Write a hypothesis about how controlling for this variable will impact the relationship between the other two variables.
  • Produce a partial correlation. Interpret your results, paying attention to statistical significance, direction, and strength.
  • Compare the results of your partial correlation to the results from the correlation of those same two variables in the third exercise above (when the other variable is not controlled for). How have the results changed? What does that tell you about the impact of the control variable?

Media Attributions

  • scatter dot dialog © IBM SPSS is licensed under a All Rights Reserved license
  • simple scatter dialog © IBM SPSS is licensed under a All Rights Reserved license
  • scatter of carhrs and age © Mikaila Mariel Lemonik Arthur is licensed under a CC BY-NC-ND (Attribution NonCommercial NoDerivatives) license
  • scatter fit line © IBM SPSS is licensed under a All Rights Reserved license
  • scatter with line © Mikaila Mariel Lemonik Arthur is licensed under a CC BY-NC-ND (Attribution NonCommercial NoDerivatives) license
  • scatterplot matrix dialog © IBM SPSS is licensed under a All Rights Reserved license
  • matrix scatter © Mikaila Mariel Lemonik Arthur is licensed under a CC BY-NC-ND (Attribution NonCommercial NoDerivatives) license
  • scatter binary ordinal © Mikaila Mariel Lemonik Arthur is licensed under a All Rights Reserved license
  • bivariate correlation dialog © IBM SPSS is licensed under a All Rights Reserved license
  • partial correlation dialog © IBM SPSS is licensed under a All Rights Reserved license
  • [1] Note that the bivariate correlation procedures discussed in this chapter can also be used with ordinal variables when appropriate options are selected, as will be detailed below.
  • [2] It will also add the R²; see the chapter on Correlation and Regression for more on how to interpret this.
  • [3] A detailed explanation of each of these measures of association is found in the chapter An In-Depth Look At Measures of Association.

Line of best fit: The line that best minimizes the distance between itself and all of the points in a scatterplot.

Intervening variable: A variable hypothesized to intervene in the relationship between an independent and a dependent variable; in other words, a variable that is affected by the independent variable and in turn affects the dependent variable.

Social Data Analysis Copyright © 2021 by Mikaila Mariel Lemonik Arthur is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Correlation analysis

Using correlation analysis to identify linear relationships between two variables

What is correlation analysis?

Correlation analysis in research is a statistical method used to measure the strength of the linear relationship between two variables and compute their association. Simply put, correlation analysis quantifies how closely a change in one variable is accompanied by a change in the other.

A high correlation points to a strong relationship between the two variables, while a low correlation means that the variables are weakly related.

When it comes to market research, researchers use correlation analysis to analyze quantitative data collected through research methods like surveys and live polls. They try to identify the relationship, patterns, significant connections, and trends between two variables or datasets.

There is a positive correlation between two variables when an increase in one variable is accompanied by an increase in the other. On the other hand, a negative correlation means that when one variable increases, the other decreases, and vice versa.

The Correlation Coefficient

One of the statistical concepts that is most related to this type of analysis is the correlation coefficient.

The correlation coefficient measures the strength of the linear relationship between the variables involved in a correlation analysis. It is represented by the symbol r and is a unitless value between -1 and 1.
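For reference, the standard definition of the Pearson coefficient for n paired observations can be written as follows. The notation is introduced here and does not appear in the original text: x_i and y_i are the paired values, and x̄ and ȳ are their sample means.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator captures how the two variables vary together, and the denominator rescales that quantity so the result has no units and always lies between -1 and 1.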

If you want to delve into this topic, we recommend you consult our guide: Pearson Correlation Coefficient.

Example of correlation analysis

Correlation between two variables can be either a positive correlation, a negative correlation, or no correlation. Let's look at examples of each of these three types.

Positive correlation: A positive correlation between two variables means both the variables move in the same direction. An increase in one variable leads to an increase in the other variable and vice versa.

For example, spending more time on a treadmill burns more calories.

Negative correlation: A negative correlation between two variables means that the variables move in opposite directions. An increase in one variable leads to a decrease in the other variable and vice versa.

For example, increasing the speed of a vehicle decreases the time you take to reach your destination.

Weak/Zero correlation: No correlation exists when one variable does not affect the other.

For example, there is no correlation between the number of years of school a person has attended and the letters in his/her name.
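The three cases can be checked numerically with a short Python sketch. All numbers below are hypothetical and were chosen only to illustrate the sign of r; the travel times assume a fixed 100 km trip.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dev_x = [v - mean_x for v in x]
    dev_y = [v - mean_y for v in y]
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    return cross / math.sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))

# Hypothetical data for each of the three examples above.
treadmill_minutes = [10, 20, 30, 40]
calories_burned = [95, 210, 290, 405]

speed_kmh = [40, 60, 80, 100]
travel_minutes = [150, 100, 75, 60]  # time needed to cover 100 km at each speed

years_of_school = [10, 12, 14, 16]
letters_in_name = [5, 7, 7, 5]

r_positive = pearson_r(treadmill_minutes, calories_burned)
r_negative = pearson_r(speed_kmh, travel_minutes)
r_zero = pearson_r(years_of_school, letters_in_name)
print(f"{r_positive:+.2f} {r_negative:+.2f} {r_zero:+.2f}")
```

Note that the speed example is not a straight-line relationship, so r is strongly negative but not exactly -1; the coefficient only measures the linear part of an association.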

Uses of correlation analysis

Correlation analysis is used to study practical cases. Here, the researcher can't manipulate individual variables. For example, correlation analysis is used to measure the correlation between the patient's blood pressure and the medication used.

Marketers use it to measure the effectiveness of advertising. Researchers measure the increase/decrease in sales due to a specific marketing campaign.

Advantages of correlation analysis

In statistics, correlation refers to the fact that there is a link between various events. One of the tools to infer whether such a link exists is correlation analysis. Practical simplicity is undoubtedly one of its main advantages.

To perform reliable correlation analysis, it is essential to make in-depth observations of two variables, which improves the quality of the results. Some of the most notable benefits of correlation analysis are:

Awareness of the behavior between two variables: A correlation helps to identify the absence or presence of a relationship between two variables. It tends to be more relevant to everyday life.

A good starting point for research: It proves to be a good starting point when a researcher starts investigating relationships for the first time.

Uses for further studies: Researchers can identify the direction and strength of the relationship between two variables and then narrow the findings down in later studies.

Simple metrics: Research findings are simple to classify. The coefficient can range from -1.00 to 1.00, so there are only three broad potential outcomes of the analysis: a positive correlation, a negative correlation, or no correlation.


Quantitative Research – Methods, Types and Analysis

Quantitative Research

Quantitative research is a type of research that collects and analyzes numerical data to test hypotheses and answer research questions. This research typically involves a large sample size and uses statistical analysis to make inferences about a population based on the data collected. It often involves the use of surveys, experiments, or other structured data collection methods to gather quantitative data.

Quantitative Research Methods

Quantitative Research Methods are as follows:

Descriptive Research Design

Descriptive research design is used to describe the characteristics of a population or phenomenon being studied. This research method is used to answer the questions of what, where, when, and how. Descriptive research designs use a variety of methods such as observation, case studies, and surveys to collect data. The data is then analyzed using statistical tools to identify patterns and relationships.

Correlational Research Design

Correlational research design is used to investigate the relationship between two or more variables. Researchers use correlational research to determine whether a relationship exists between variables and to what extent they are related. This research method involves collecting data from a sample and analyzing it using statistical tools such as correlation coefficients.

Quasi-experimental Research Design

Quasi-experimental research design is used to investigate cause-and-effect relationships between variables. This research method is similar to experimental research design, but it lacks full control over the independent variable. Researchers use quasi-experimental research designs when it is not feasible or ethical to manipulate the independent variable.

Experimental Research Design

Experimental research design is used to investigate cause-and-effect relationships between variables. This research method involves manipulating the independent variable and observing the effects on the dependent variable. Researchers use experimental research designs to test hypotheses and establish cause-and-effect relationships.

Survey Research

Survey research involves collecting data from a sample of individuals using a standardized questionnaire. This research method is used to gather information on attitudes, beliefs, and behaviors of individuals. Researchers use survey research to collect data quickly and efficiently from a large sample size. Survey research can be conducted through various methods such as online, phone, mail, or in-person interviews.

Quantitative Research Analysis Methods

Here are some commonly used quantitative research analysis methods:

Statistical Analysis

Statistical analysis is the most common quantitative research analysis method. It involves using statistical tools and techniques to analyze the numerical data collected during the research process. Statistical analysis can be used to identify patterns, trends, and relationships between variables, and to test hypotheses and theories.

Regression Analysis

Regression analysis is a statistical technique used to analyze the relationship between one dependent variable and one or more independent variables. Researchers use regression analysis to identify and quantify the impact of independent variables on the dependent variable.
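As a minimal illustration of the one-predictor case, the sketch below fits a straight line by ordinary least squares in plain Python. The function name and the data are hypothetical; real analyses would normally use a statistics package and report standard errors as well.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Slope: co-variation of x and y divided by the variation in x.
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = sum((a - mean_x) ** 2 for a in x)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on the line y = 1 + 2x.
hours_studied = [1, 2, 3, 4, 5]
test_score = [3, 5, 7, 9, 11]

slope, intercept = fit_line(hours_studied, test_score)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

Here the slope estimates how much the dependent variable changes for each one-unit increase in the independent variable.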

Factor Analysis

Factor analysis is a statistical technique used to identify underlying factors that explain the correlations among a set of variables. Researchers use factor analysis to reduce a large number of variables to a smaller set of factors that capture the most important information.

Structural Equation Modeling

Structural equation modeling is a statistical technique used to test complex relationships between variables. It involves specifying a model that includes both observed and unobserved variables, and then using statistical methods to test the fit of the model to the data.

Time Series Analysis

Time series analysis is a statistical technique used to analyze data that is collected over time. It involves identifying patterns and trends in the data, as well as any seasonal or cyclical variations.

Multilevel Modeling

Multilevel modeling is a statistical technique used to analyze data that is nested within multiple levels. For example, researchers might use multilevel modeling to analyze data that is collected from individuals who are nested within groups, such as students nested within schools.

Applications of Quantitative Research

Quantitative research has many applications across a wide range of fields. Here are some common examples:

  • Market Research: Quantitative research is used extensively in market research to understand consumer behavior, preferences, and trends. Researchers use surveys, experiments, and other quantitative methods to collect data that can inform marketing strategies, product development, and pricing decisions.
  • Health Research: Quantitative research is used in health research to study the effectiveness of medical treatments, identify risk factors for diseases, and track health outcomes over time. Researchers use statistical methods to analyze data from clinical trials, surveys, and other sources to inform medical practice and policy.
  • Social Science Research: Quantitative research is used in social science research to study human behavior, attitudes, and social structures. Researchers use surveys, experiments, and other quantitative methods to collect data that can inform social policies, educational programs, and community interventions.
  • Education Research: Quantitative research is used in education research to study the effectiveness of teaching methods, assess student learning outcomes, and identify factors that influence student success. Researchers use experimental and quasi-experimental designs, as well as surveys and other quantitative methods, to collect and analyze data.
  • Environmental Research: Quantitative research is used in environmental research to study the impact of human activities on the environment, assess the effectiveness of conservation strategies, and identify ways to reduce environmental risks. Researchers use statistical methods to analyze data from field studies, experiments, and other sources.

Characteristics of Quantitative Research

Here are some key characteristics of quantitative research:

  • Numerical data: Quantitative research involves collecting numerical data through standardized methods such as surveys, experiments, and observational studies. This data is analyzed using statistical methods to identify patterns and relationships.
  • Large sample size: Quantitative research often involves collecting data from a large sample of individuals or groups in order to increase the reliability and generalizability of the findings.
  • Objective approach: Quantitative research aims to be objective and impartial in its approach, focusing on the collection and analysis of data rather than personal beliefs, opinions, or experiences.
  • Control over variables: Quantitative research often involves manipulating variables to test hypotheses and establish cause-and-effect relationships. Researchers aim to control for extraneous variables that may impact the results.
  • Replicable: Quantitative research aims to be replicable, meaning that other researchers should be able to conduct similar studies and obtain similar results using the same methods.
  • Statistical analysis: Quantitative research involves using statistical tools and techniques to analyze the numerical data collected during the research process. Statistical analysis allows researchers to identify patterns, trends, and relationships between variables, and to test hypotheses and theories.
  • Generalizability: Quantitative research aims to produce findings that can be generalized to larger populations beyond the specific sample studied. This is achieved through the use of random sampling methods and statistical inference.

Examples of Quantitative Research

Here are some examples of quantitative research in different fields:

  • Market Research: A company conducts a survey of 1000 consumers to determine their brand awareness and preferences. The data is analyzed using statistical methods to identify trends and patterns that can inform marketing strategies.
  • Health Research: A researcher conducts a randomized controlled trial to test the effectiveness of a new drug for treating a particular medical condition. The study involves collecting data from a large sample of patients and analyzing the results using statistical methods.
  • Social Science Research: A sociologist conducts a survey of 500 people to study attitudes toward immigration in a particular country. The data is analyzed using statistical methods to identify factors that influence these attitudes.
  • Education Research: A researcher conducts an experiment to compare the effectiveness of two different teaching methods for improving student learning outcomes. The study involves randomly assigning students to different groups and collecting data on their performance on standardized tests.
  • Environmental Research: A team of researchers conducts a study to investigate the impact of climate change on the distribution and abundance of a particular species of plant or animal. The study involves collecting data on environmental factors and population sizes over time and analyzing the results using statistical methods.
  • Psychology: A researcher conducts a survey of 500 college students to investigate the relationship between social media use and mental health. The data is analyzed using statistical methods to identify correlations and potential causal relationships.
  • Political Science: A team of researchers conducts a study to investigate voter behavior during an election. They use survey methods to collect data on voting patterns, demographics, and political attitudes, and analyze the results using statistical methods.

How to Conduct Quantitative Research

Here is a general overview of how to conduct quantitative research:

  • Develop a research question: The first step in conducting quantitative research is to develop a clear and specific research question. This question should be based on a gap in existing knowledge, and should be answerable using quantitative methods.
  • Develop a research design: Once you have a research question, you will need to develop a research design. This involves deciding on the appropriate methods to collect data, such as surveys, experiments, or observational studies. You will also need to determine the appropriate sample size, data collection instruments, and data analysis techniques.
  • Collect data: The next step is to collect data. This may involve administering surveys or questionnaires, conducting experiments, or gathering data from existing sources. It is important to use standardized methods to ensure that the data is reliable and valid.
  • Analyze data: Once the data has been collected, it is time to analyze it. This involves using statistical methods to identify patterns, trends, and relationships between variables. Common statistical techniques include correlation analysis, regression analysis, and hypothesis testing.
  • Interpret results: After analyzing the data, you will need to interpret the results. This involves identifying the key findings, determining their significance, and drawing conclusions based on the data.
  • Communicate findings: Finally, you will need to communicate your findings. This may involve writing a research report, presenting at a conference, or publishing in a peer-reviewed journal. It is important to clearly communicate the research question, methods, results, and conclusions to ensure that others can understand and replicate your research.
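As a minimal sketch of the "analyze data" step above, the following Python snippet computes a Pearson correlation coefficient and a least-squares regression line using only the standard library. The function names and the data (weekly study hours and test scores for six students) are hypothetical and chosen purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: weekly study hours and test scores for six students.
hours = [2, 4, 5, 7, 8, 10]
scores = [55, 60, 66, 70, 78, 85]

r = pearson_r(hours, scores)
slope, intercept = linear_fit(hours, scores)
print(f"r = {r:.3f}, score = {slope:.2f} * hours + {intercept:.2f}")
```

A coefficient close to +1 or -1 indicates a strong linear association, but on its own it does not show that one variable causes the other.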

When to use Quantitative Research

Here are some situations when quantitative research can be appropriate:

  • To test a hypothesis: Quantitative research is often used to test a hypothesis or a theory. It involves collecting numerical data and using statistical analysis to determine if the data supports or refutes the hypothesis.
  • To generalize findings: If you want to generalize the findings of your study to a larger population, quantitative research can be useful. This is because it allows you to collect numerical data from a representative sample of the population and use statistical analysis to make inferences about the population as a whole.
  • To measure relationships between variables: If you want to measure the relationship between two or more variables, such as the relationship between age and income, or between education level and job satisfaction, quantitative research can be useful. It allows you to collect numerical data on both variables and use statistical analysis to determine the strength and direction of the relationship.
  • To identify patterns or trends: Quantitative research can be useful for identifying patterns or trends in data. For example, you can use quantitative research to identify trends in consumer behavior or to identify patterns in stock market data.
  • To quantify attitudes or opinions: If you want to measure attitudes or opinions on a particular topic, quantitative research can be useful. It allows you to collect numerical data using surveys or questionnaires and analyze the data using statistical methods to determine the prevalence of certain attitudes or opinions.
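To illustrate how a survey result might be generalized to a population, here is a small sketch that estimates the prevalence of an opinion together with a 95% confidence interval. It uses the normal-approximation (Wald) interval, which is only one of several possible methods, and assumes a simple random sample. The survey counts are hypothetical.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    z = 1.96 corresponds to a 95% confidence level.
    Returns (estimate, lower bound, upper bound).
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Hypothetical survey: 310 of 500 respondents agree with a statement.
p, low, high = proportion_ci(310, 500)
print(f"estimated prevalence: {p:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The interval narrows as the sample size grows, which is one reason larger samples support more precise statements about the population.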

Purpose of Quantitative Research

The purpose of quantitative research is to systematically investigate and measure the relationships between variables or phenomena using numerical data and statistical analysis. The main objectives of quantitative research include:

  • Description: To provide a detailed and accurate description of a particular phenomenon or population.
  • Explanation: To explain the reasons for the occurrence of a particular phenomenon, such as identifying the factors that influence a behavior or attitude.
  • Prediction: To predict future trends or behaviors based on past patterns and relationships between variables.
  • Control: To identify the best strategies for controlling or influencing a particular outcome or behavior.

Quantitative research is used in many different fields, including social sciences, business, engineering, and health sciences. It can be used to investigate a wide range of phenomena, from human behavior and attitudes to physical and biological processes. The purpose of quantitative research is to provide reliable and valid data that can be used to inform decision-making and improve understanding of the world around us.

Advantages of Quantitative Research

There are several advantages of quantitative research, including:

  • Objectivity: Quantitative research is based on objective data and statistical analysis, which reduces the potential for bias or subjectivity in the research process.
  • Reproducibility: Because quantitative research involves standardized methods and measurements, it is more likely to be reproducible and reliable.
  • Generalizability: Quantitative research allows for generalizations to be made about a population based on a representative sample, which can inform decision-making and policy development.
  • Precision: Quantitative research allows for precise measurement and analysis of data, which can provide a more accurate understanding of phenomena and relationships between variables.
  • Efficiency: Quantitative research can be conducted relatively quickly and efficiently, especially when compared to qualitative research, which may involve lengthy data collection and analysis.
  • Large sample sizes: Quantitative research can accommodate large sample sizes, which can increase the representativeness and generalizability of the results.

Limitations of Quantitative Research

There are several limitations of quantitative research, including:

  • Limited understanding of context: Quantitative research typically focuses on numerical data and statistical analysis, which may not provide a comprehensive understanding of the context or underlying factors that influence a phenomenon.
  • Simplification of complex phenomena: Quantitative research often involves simplifying complex phenomena into measurable variables, which may not capture the full complexity of the phenomenon being studied.
  • Potential for researcher bias: Although quantitative research aims to be objective, there is still the potential for researcher bias in areas such as sampling, data collection, and data analysis.
  • Limited ability to explore new ideas: Quantitative research is often based on pre-determined research questions and hypotheses, which may limit the ability to explore new ideas or unexpected findings.
  • Limited ability to capture subjective experiences: Quantitative research is typically focused on objective data and may not capture the subjective experiences of individuals or groups being studied.
  • Ethical concerns: Quantitative research may raise ethical concerns, such as invasion of privacy or the potential for harm to participants.

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer

  • J Korean Med Sci
  • v.37(16); 2022 Apr 25


A Practical Guide to Writing Quantitative and Qualitative Research Questions and Hypotheses in Scholarly Articles

Edward Barroga

1 Department of General Education, Graduate School of Nursing Science, St. Luke’s International University, Tokyo, Japan.

Glafera Janet Matanguihan

2 Department of Biological Sciences, Messiah University, Mechanicsburg, PA, USA.

The development of research questions and the subsequent hypotheses are prerequisites to defining the main research purpose and specific objectives of a study. Consequently, these objectives determine the study design and research outcome. The development of research questions is a process based on knowledge of current trends, cutting-edge studies, and technological advances in the research field. Excellent research questions are focused and require a comprehensive literature search and in-depth understanding of the problem being investigated. Initially, research questions may be written as descriptive questions which could be developed into inferential questions. These questions must be specific and concise to provide a clear foundation for developing hypotheses. Hypotheses are more formal predictions about the research outcomes. These specify the possible results that may or may not be expected regarding the relationship between groups. Thus, research questions and hypotheses clarify the main purpose and specific objectives of the study, which in turn dictate the design of the study, its direction, and outcome. Studies developed from good research questions and hypotheses will have trustworthy outcomes with wide-ranging social and health implications.

INTRODUCTION

Scientific research is usually initiated by posing evidence-based research questions which are then explicitly restated as hypotheses. 1 , 2 The hypotheses provide directions to guide the study, solutions, explanations, and expected results. 3 , 4 Both research questions and hypotheses are essentially formulated based on conventional theories and real-world processes, which allow the inception of novel studies and the ethical testing of ideas. 5 , 6

It is crucial to have knowledge of both quantitative and qualitative research 2 as both types of research involve writing research questions and hypotheses. 7 However, these crucial elements of research are sometimes overlooked or, if not overlooked, framed without the forethought and meticulous attention they need. Planning and careful consideration are needed when developing quantitative or qualitative research, particularly when conceptualizing research questions and hypotheses. 4

There is a continuing need to support researchers in the creation of innovative research questions and hypotheses, as well as for journal articles that carefully review these elements. 1 When research questions and hypotheses are not carefully thought out, unethical studies and poor outcomes usually ensue. Carefully formulated research questions and hypotheses define well-founded objectives, which in turn determine the appropriate design, course, and outcome of the study. This article therefore aims to discuss in detail the various aspects of crafting research questions and hypotheses, with the goal of guiding researchers as they develop their own. Examples from the authors and peer-reviewed scientific articles in the healthcare field are provided to illustrate key points.

DEFINITIONS AND RELATIONSHIP OF RESEARCH QUESTIONS AND HYPOTHESES

A research question is what a study aims to answer after data analysis and interpretation. The answer is written at length in the discussion section of the paper. Thus, the research question gives a preview of the different parts and variables of the study meant to address the problem posed in the research question. 1 An excellent research question clarifies the research writing while facilitating understanding of the research topic, objective, scope, and limitations of the study. 5

On the other hand, a research hypothesis is an educated statement of an expected outcome. This statement is based on background research and current knowledge. 8 , 9 The research hypothesis makes a specific prediction about a new phenomenon 10 or a formal statement on the expected relationship between an independent variable and a dependent variable. 3 , 11 It provides a tentative answer to the research question to be tested or explored. 4

Hypotheses employ reasoning to predict a theory-based outcome. 10 These can also be developed from theories by focusing on components of theories that have not yet been observed. 10 The validity of hypotheses is often based on the testability of the prediction made in a reproducible experiment. 8

Conversely, hypotheses can also be rephrased as research questions. Several hypotheses based on existing theories and knowledge may be needed to answer a research question. Developing ethical research questions and hypotheses creates a research design that has logical relationships among variables. These relationships serve as a solid foundation for the conduct of the study. 4 , 11 Haphazardly constructed research questions can result in poorly formulated hypotheses and improper study designs, leading to unreliable results. Thus, the formulations of relevant research questions and verifiable hypotheses are crucial when beginning research. 12

CHARACTERISTICS OF GOOD RESEARCH QUESTIONS AND HYPOTHESES

Excellent research questions are specific and focused. These integrate collective data and observations to confirm or refute the subsequent hypotheses. Well-constructed hypotheses are based on previous reports and verify the research context. These are realistic, in-depth, sufficiently complex, and reproducible. More importantly, these hypotheses can be addressed and tested. 13

There are several characteristics of well-developed hypotheses. Good hypotheses are 1) empirically testable 7 , 10 , 11 , 13 ; 2) backed by preliminary evidence 9 ; 3) testable by ethical research 7 , 9 ; 4) based on original ideas 9 ; 5) grounded in evidence-based logical reasoning 10 ; and 6) predictable. 11 Good hypotheses can carry ethical and positive implications, indicating the presence of a relationship or effect relevant to the research theme. 7 , 11 These are initially developed from a general theory and branch into specific hypotheses by deductive reasoning. In the absence of a theory on which to base the hypotheses, inductive reasoning based on specific observations or findings forms more general hypotheses. 10

TYPES OF RESEARCH QUESTIONS AND HYPOTHESES

Research questions and hypotheses are developed according to the type of research, which can be broadly classified into quantitative and qualitative research. We provide a summary of the types of research questions and hypotheses under quantitative and qualitative research categories in Table 1 .

Research questions in quantitative research

In quantitative research, research questions inquire about the relationships among variables being investigated and are usually framed at the start of the study. These are precise and typically linked to the subject population, dependent and independent variables, and research design. 1 Research questions may also attempt to describe the behavior of a population in relation to one or more variables, or describe the characteristics of variables to be measured ( descriptive research questions ). 1 , 5 , 14 These questions may also aim to discover differences between groups within the context of an outcome variable ( comparative research questions ), 1 , 5 , 14 or elucidate trends and interactions among variables ( relationship research questions ). 1 , 5 We provide examples of descriptive, comparative, and relationship research questions in quantitative research in Table 2 .

Hypotheses in quantitative research

In quantitative research, hypotheses predict the expected relationships among variables. 15 Relationships that can be predicted include those 1) between a single dependent variable and a single independent variable (simple hypothesis) or 2) between two or more independent and dependent variables (complex hypothesis). 4 , 11 Hypotheses may also specify the expected direction to be followed and imply an intellectual commitment to a particular outcome (directional hypothesis). 4 On the other hand, hypotheses may not predict the exact direction and are used in the absence of a theory, or when findings contradict previous studies (non-directional hypothesis). 4 In addition, hypotheses can 1) define interdependency between variables (associative hypothesis), 4 2) propose an effect on the dependent variable from manipulation of the independent variable (causal hypothesis), 4 3) state that no relationship exists between two variables (null hypothesis), 4 , 11 , 15 4) replace the working hypothesis if rejected (alternative hypothesis), 15 5) explain the relationship of phenomena to possibly generate a theory (working hypothesis), 11 6) involve quantifiable variables that can be tested statistically (statistical hypothesis), 11 or 7) express a relationship whose interlinks can be verified logically (logical hypothesis). 11 We provide examples of simple, complex, directional, non-directional, associative, causal, null, alternative, working, statistical, and logical hypotheses in quantitative research, as well as the definition of quantitative hypothesis-testing research in Table 3 .
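As a hedged illustration of how some of these hypothesis types are commonly written in formal notation, suppose the relationship between two variables is summarized by a population correlation coefficient ρ. The choice of ρ as the parameter is our assumption for this sketch and is not specified in the article.

```latex
\begin{align*}
H_0 &: \rho = 0    && \text{(null hypothesis: no relationship)} \\
H_1 &: \rho \neq 0 && \text{(non-directional alternative hypothesis)} \\
H_1 &: \rho > 0    && \text{(directional alternative hypothesis: positive relationship)}
\end{align*}
```

A statistical test then asks whether the observed data are sufficiently unlikely under the null hypothesis to favor the chosen alternative.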

Research questions in qualitative research

Unlike research questions in quantitative research, research questions in qualitative research are usually continuously reviewed and reformulated. The central question and associated subquestions are stated more than the hypotheses. 15 The central question broadly explores a complex set of factors surrounding the central phenomenon, aiming to present the varied perspectives of participants. 15

There are varied goals for which qualitative research questions are developed. These questions can function in several ways, such as to 1) identify and describe existing conditions ( contextual research questions ); 2) describe a phenomenon ( descriptive research questions ); 3) assess the effectiveness of existing methods, protocols, theories, or procedures ( evaluation research questions ); 4) examine a phenomenon or analyze the reasons or relationships between subjects or phenomena ( explanatory research questions ); or 5) focus on unknown aspects of a particular topic ( exploratory research questions ). 5 In addition, some qualitative research questions provide new ideas for the development of theories and actions ( generative research questions ) or advance specific ideologies of a position ( ideological research questions ). 1 Other qualitative research questions may build on a body of existing literature and become working guidelines ( ethnographic research questions ). Research questions may also be broadly stated without specific reference to the existing literature or a typology of questions ( phenomenological research questions ), may be directed towards generating a theory of some process ( grounded theory questions ), or may address a description of the case and the emerging themes ( qualitative case study questions ). 15 We provide examples of contextual, descriptive, evaluation, explanatory, exploratory, generative, ideological, ethnographic, phenomenological, grounded theory, and qualitative case study research questions in qualitative research in Table 4 , and the definition of qualitative hypothesis-generating research in Table 5 .

Qualitative studies usually pose at least one central research question and several subquestions starting with How or What . These research questions use exploratory verbs such as explore or describe . These also focus on one central phenomenon of interest, and may mention the participants and research site. 15

Hypotheses in qualitative research

Hypotheses in qualitative research are stated in the form of a clear statement concerning the problem to be investigated. Unlike in quantitative research where hypotheses are usually developed to be tested, qualitative research can lead to both hypothesis-testing and hypothesis-generating outcomes. 2 When studies require both quantitative and qualitative research questions, this suggests an integrative process between both research methods wherein a single mixed-methods research question can be developed. 1

FRAMEWORKS FOR DEVELOPING RESEARCH QUESTIONS AND HYPOTHESES

Research questions followed by hypotheses should be developed before the start of the study. 1 , 12 , 14 It is crucial to develop feasible research questions on a topic that is interesting to both the researcher and the scientific community. This can be achieved by a meticulous review of previous and current studies to establish a novel topic. Specific areas are subsequently focused on to generate ethical research questions. The relevance of the research questions is evaluated in terms of clarity of the resulting data, specificity of the methodology, objectivity of the outcome, depth of the research, and impact of the study. 1 , 5 These aspects constitute the FINER criteria (i.e., Feasible, Interesting, Novel, Ethical, and Relevant). 1 Clarity and effectiveness are achieved if research questions meet the FINER criteria. In addition to the FINER criteria, Ratan et al. described focus, complexity, novelty, feasibility, and measurability for evaluating the effectiveness of research questions. 14

The PICOT and PEO frameworks are also used when developing research questions. 1 The following elements are addressed in these frameworks, PICOT: P-population/patients/problem, I-intervention or indicator being studied, C-comparison group, O-outcome of interest, and T-timeframe of the study; PEO: P-population being studied, E-exposure to preexisting conditions, and O-outcome of interest. 1 Research questions are also considered good if these meet the “FINERMAPS” framework: Feasible, Interesting, Novel, Ethical, Relevant, Manageable, Appropriate, Potential value/publishable, and Systematic. 14
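One way to see how the PICOT elements fit together is to treat them as fields of a small record and assemble a draft question from them. The sketch below is purely illustrative: the class name, the sentence template, and the example values are hypothetical and are not taken from the article or from any published study.

```python
from dataclasses import dataclass


@dataclass
class Picot:
    """The five PICOT elements of a research question."""
    population: str
    intervention: str
    comparison: str
    outcome: str
    timeframe: str

    def question(self) -> str:
        # Hypothetical sentence template; real questions need manual refinement.
        return (
            f"In {self.population}, does {self.intervention}, "
            f"compared with {self.comparison}, affect {self.outcome} "
            f"over {self.timeframe}?"
        )


# Hypothetical example values for illustration only.
draft = Picot(
    population="adults with hypertension",
    intervention="a nurse-led education program",
    comparison="usual care",
    outcome="systolic blood pressure",
    timeframe="six months",
)
print(draft.question())
```

Filling in each field explicitly makes it easier to notice when an element, such as the comparison group or the timeframe, has been left undefined.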

As we indicated earlier, research questions and hypotheses that are not carefully formulated result in unethical studies or poor outcomes. To illustrate this, we provide some examples of ambiguous research questions and hypotheses that result in unclear and weak research objectives in quantitative research ( Table 6 ) 16 and qualitative research ( Table 7 ) 17 , and how to transform these ambiguous research question(s) and hypothesis(es) into clear and good statements.

Notes to Table 6: a These statements were composed for comparison and illustrative purposes only. b These statements are direct quotes from Higashihara and Horiuchi. 16

Notes to Table 7: a This statement is a direct quote from Shimoda et al. 17 The other statements were composed for comparison and illustrative purposes only.

CONSTRUCTING RESEARCH QUESTIONS AND HYPOTHESES

To construct effective research questions and hypotheses, it is very important to 1) clarify the background and 2) identify the research problem at the outset of the research, within a specific timeframe. 9 Then, 3) review or conduct preliminary research to collect all available knowledge about the possible research questions by studying theories and previous studies. 18 Afterwards, 4) construct research questions to investigate the research problem. Identify variables to be assessed from the research questions 4 and make operational definitions of constructs from the research problem and questions. Thereafter, 5) construct specific deductive or inductive predictions in the form of hypotheses. 4 Finally, 6) state the study aims . This general flow for constructing effective research questions and hypotheses prior to conducting research is shown in Fig. 1 .

Fig. 1. General flow for constructing effective research questions and hypotheses prior to conducting research.

Research questions are used more frequently in qualitative research than objectives or hypotheses. 3 These questions seek to discover, understand, explore or describe experiences by asking “What” or “How.” The questions are open-ended to elicit a description rather than to relate variables or compare groups. The questions are continually reviewed, reformulated, and changed during the qualitative study. 3 In quantitative research, research questions are used more frequently in survey projects, whereas hypotheses are used more frequently in experiments that compare variables and their relationships.

Hypotheses are constructed based on the variables identified and as an if-then statement, following the template, ‘If a specific action is taken, then a certain outcome is expected.’ At this stage, some ideas regarding expectations from the research to be conducted must be drawn. 18 Then, the variables to be manipulated (independent) and influenced (dependent) are defined. 4 Thereafter, the hypothesis is stated and refined, and reproducible data tailored to the hypothesis are identified, collected, and analyzed. 4 The hypotheses must be testable and specific, 18 and should describe the variables and their relationships, the specific group being studied, and the predicted research outcome. 18 Hypotheses construction involves a testable proposition to be deduced from theory, and independent and dependent variables to be separated and measured separately. 3 Therefore, good hypotheses must be based on good research questions constructed at the start of a study or trial. 12

In summary, research questions are constructed after establishing the background of the study. Hypotheses are then developed based on the research questions. Thus, it is crucial to have excellent research questions to generate superior hypotheses. In turn, these would determine the research objectives and the design of the study, and ultimately, the outcome of the research. 12 Algorithms for building research questions and hypotheses are shown in Fig. 2 for quantitative research and in Fig. 3 for qualitative research.

Fig. 2. Algorithm for building research questions and hypotheses in quantitative research.

EXAMPLES OF RESEARCH QUESTIONS FROM PUBLISHED ARTICLES

  • EXAMPLE 1. Descriptive research question (quantitative research)
  • - Presents research variables to be assessed (distinct phenotypes and subphenotypes)
  • “BACKGROUND: Since COVID-19 was identified, its clinical and biological heterogeneity has been recognized. Identifying COVID-19 phenotypes might help guide basic, clinical, and translational research efforts.
  • RESEARCH QUESTION: Does the clinical spectrum of patients with COVID-19 contain distinct phenotypes and subphenotypes? ” 19
  • EXAMPLE 2. Relationship research question (quantitative research)
  • - Shows interactions between dependent variable (static postural control) and independent variable (peripheral visual field loss)
  • “Background: Integration of visual, vestibular, and proprioceptive sensations contributes to postural control. People with peripheral visual field loss have serious postural instability. However, the directional specificity of postural stability and sensory reweighting caused by gradual peripheral visual field loss remain unclear.
  • Research question: What are the effects of peripheral visual field loss on static postural control ?” 20
  • EXAMPLE 3. Comparative research question (quantitative research)
  • - Clarifies the difference among groups with an outcome variable (patients enrolled in COMPERA with moderate PH or severe PH in COPD) and another group without the outcome variable (patients with idiopathic pulmonary arterial hypertension (IPAH))
  • “BACKGROUND: Pulmonary hypertension (PH) in COPD is a poorly investigated clinical condition.
  • RESEARCH QUESTION: Which factors determine the outcome of PH in COPD?
  • STUDY DESIGN AND METHODS: We analyzed the characteristics and outcome of patients enrolled in the Comparative, Prospective Registry of Newly Initiated Therapies for Pulmonary Hypertension (COMPERA) with moderate or severe PH in COPD as defined during the 6th PH World Symposium who received medical therapy for PH and compared them with patients with idiopathic pulmonary arterial hypertension (IPAH) .” 21
  • EXAMPLE 4. Exploratory research question (qualitative research)
  • - Explores areas that have not been fully investigated (perspectives of families and children who receive care in clinic-based child obesity treatment) to have a deeper understanding of the research problem
  • “Problem: Interventions for children with obesity lead to only modest improvements in BMI and long-term outcomes, and data are limited on the perspectives of families of children with obesity in clinic-based treatment. This scoping review seeks to answer the question: What is known about the perspectives of families and children who receive care in clinic-based child obesity treatment? This review aims to explore the scope of perspectives reported by families of children with obesity who have received individualized outpatient clinic-based obesity treatment.” 22
  • EXAMPLE 5. Relationship research question (quantitative research)
  • - Defines interactions between dependent variable (use of ankle strategies) and independent variable (changes in muscle tone)
  • “Background: To maintain an upright standing posture against external disturbances, the human body mainly employs two types of postural control strategies: “ankle strategy” and “hip strategy.” While it has been reported that the magnitude of the disturbance alters the use of postural control strategies, it has not been elucidated how the level of muscle tone, one of the crucial parameters of bodily function, determines the use of each strategy. We have previously confirmed using forward dynamics simulations of human musculoskeletal models that an increased muscle tone promotes the use of ankle strategies. The objective of the present study was to experimentally evaluate a hypothesis: an increased muscle tone promotes the use of ankle strategies. Research question: Do changes in the muscle tone affect the use of ankle strategies ?” 23

EXAMPLES OF HYPOTHESES IN PUBLISHED ARTICLES

  • EXAMPLE 1. Working hypothesis (quantitative research)
  • - A hypothesis that is initially accepted for further research to produce a feasible theory
  • “As fever may have benefit in shortening the duration of viral illness, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response when taken during the early stages of COVID-19 illness .” 24
  • “In conclusion, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response . The difference in perceived safety of these agents in COVID-19 illness could be related to the more potent efficacy to reduce fever with ibuprofen compared to acetaminophen. Compelling data on the benefit of fever warrant further research and review to determine when to treat or withhold ibuprofen for early stage fever for COVID-19 and other related viral illnesses .” 24
  • EXAMPLE 2. Exploratory hypothesis (qualitative research)
  • - Explores particular areas deeper to clarify subjective experience and develop a formal hypothesis potentially testable in a future quantitative approach
  • “We hypothesized that when thinking about a past experience of help-seeking, a self distancing prompt would cause increased help-seeking intentions and more favorable help-seeking outcome expectations .” 25
  • “Conclusion
  • Although a priori hypotheses were not supported, further research is warranted as results indicate the potential for using self-distancing approaches to increasing help-seeking among some people with depressive symptomatology.” 25
  • EXAMPLE 3. Hypothesis-generating research to establish a framework for hypothesis testing (qualitative research)
  • “We hypothesize that compassionate care is beneficial for patients (better outcomes), healthcare systems and payers (lower costs), and healthcare providers (lower burnout). ” 26
  • “Compassionomics is the branch of knowledge and scientific study of the effects of compassionate healthcare. Our main hypotheses are that compassionate healthcare is beneficial for (1) patients, by improving clinical outcomes, (2) healthcare systems and payers, by supporting financial sustainability, and (3) HCPs, by lowering burnout and promoting resilience and well-being. The purpose of this paper is to establish a scientific framework for testing the hypotheses above . If these hypotheses are confirmed through rigorous research, compassionomics will belong in the science of evidence-based medicine, with major implications for all healthcare domains.” 26
  • EXAMPLE 4. Statistical hypothesis (quantitative research)
  • - An assumption is made about the relationship among several population characteristics (gender differences in sociodemographic and clinical characteristics of adults with ADHD). Validity is tested by statistical experiment or analysis (chi-square test, Student's t-test, and logistic regression analysis)
  • “Our research investigated gender differences in sociodemographic and clinical characteristics of adults with ADHD in a Japanese clinical sample. Due to unique Japanese cultural ideals and expectations of women's behavior that are in opposition to ADHD symptoms, we hypothesized that women with ADHD experience more difficulties and present more dysfunctions than men. We tested the following hypotheses: first, women with ADHD have more comorbidities than men with ADHD; second, women with ADHD experience more social hardships than men, such as having less full-time employment and being more likely to be divorced.” 27
  • “Statistical Analysis
  • (text omitted) Between-gender comparisons were made using the chi-squared test for categorical variables and Student's t-test for continuous variables…(text omitted). A logistic regression analysis was performed for employment status, marital status, and comorbidity to evaluate the independent effects of gender on these dependent variables.” 27
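
The statistical hypothesis in Example 4 is tested with between-group comparisons such as the chi-squared test for categorical variables. The following is a minimal sketch of that kind of test in Python, not the cited authors' analysis. The helper name `chi_square_2x2` is hypothetical, the counts are made up for illustration and are not data from the study, and no continuity correction is applied.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    `table` is [[a, b], [c, d]] of observed counts. Returns the
    chi-square statistic and its p-value (1 degree of freedom,
    no continuity correction).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # For 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, NOT taken from the cited study.
# Rows: women, men. Columns: full-time employed, not full-time employed.
counts = [[20, 30], [35, 15]]
chi2, p = chi_square_2x2(counts)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

A small p-value would lead a researcher to reject the null hypothesis that employment status is independent of gender in this made-up sample. In practice, a statistics package would normally be used, and a logistic regression would be needed to estimate the independent effect of gender while adjusting for other variables.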

EXAMPLES OF HYPOTHESIS AS WRITTEN IN PUBLISHED ARTICLES IN RELATION TO OTHER PARTS

  • EXAMPLE 1. Background, hypotheses, and aims are provided
  • “Pregnant women need skilled care during pregnancy and childbirth, but that skilled care is often delayed in some countries …(text omitted). The focused antenatal care (FANC) model of WHO recommends that nurses provide information or counseling to all pregnant women …(text omitted). Job aids are visual support materials that provide the right kind of information using graphics and words in a simple and yet effective manner. When nurses are not highly trained or have many work details to attend to, these job aids can serve as a content reminder for the nurses and can be used for educating their patients (Jennings, Yebadokpo, Affo, & Agbogbe, 2010) (text omitted). Importantly, additional evidence is needed to confirm how job aids can further improve the quality of ANC counseling by health workers in maternal care …(text omitted)” 28
  • “This has led us to hypothesize that the quality of ANC counseling would be better if supported by job aids. Consequently, a better quality of ANC counseling is expected to produce higher levels of awareness concerning the danger signs of pregnancy and a more favorable impression of the caring behavior of nurses.” 28
  • “This study aimed to examine the differences in the responses of pregnant women to a job aid-supported intervention during ANC visit in terms of 1) their understanding of the danger signs of pregnancy and 2) their impression of the caring behaviors of nurses to pregnant women in rural Tanzania.” 28
  • EXAMPLE 2. Background, hypotheses, and aims are provided
  • “We conducted a two-arm randomized controlled trial (RCT) to evaluate and compare changes in salivary cortisol and oxytocin levels of first-time pregnant women between experimental and control groups. The women in the experimental group touched and held an infant for 30 min (experimental intervention protocol), whereas those in the control group watched a DVD movie of an infant (control intervention protocol). The primary outcome was salivary cortisol level and the secondary outcome was salivary oxytocin level.” 29
  • “We hypothesize that at 30 min after touching and holding an infant, the salivary cortisol level will significantly decrease and the salivary oxytocin level will increase in the experimental group compared with the control group.” 29
  • EXAMPLE 3. Background, aim, and hypothesis are provided
  • “In countries where the maternal mortality ratio remains high, antenatal education to increase Birth Preparedness and Complication Readiness (BPCR) is considered one of the top priorities [1]. BPCR includes birth plans during the antenatal period, such as the birthplace, birth attendant, transportation, health facility for complications, expenses, and birth materials, as well as family coordination to achieve such birth plans. In Tanzania, although increasing, only about half of all pregnant women attend an antenatal clinic more than four times [4]. Moreover, the information provided during antenatal care (ANC) is insufficient. In the resource-poor settings, antenatal group education is a potential approach because of the limited time for individual counseling at antenatal clinics.” 30
  • “This study aimed to evaluate an antenatal group education program among pregnant women and their families with respect to birth-preparedness and maternal and infant outcomes in rural villages of Tanzania.” 30
  • “The study hypothesis was if Tanzanian pregnant women and their families received a family-oriented antenatal group education, they would (1) have a higher level of BPCR, (2) attend antenatal clinic four or more times, (3) give birth in a health facility, (4) have less complications of women at birth, and (5) have less complications and deaths of infants than those who did not receive the education.” 30

Research questions and hypotheses are crucial components of any type of research, whether quantitative or qualitative. These questions should be developed at the very beginning of the study. Excellent research questions lead to superior hypotheses, which, like a compass, set the direction of research and can often determine the successful conduct of the study. Many research studies have floundered because the development of research questions and subsequent hypotheses was not given the thought and meticulous attention needed. The development of research questions and hypotheses is an iterative process based on extensive knowledge of the literature and an insightful grasp of the knowledge gap. Focused, concise, and specific research questions provide a strong foundation for constructing hypotheses, which serve as formal predictions about the research outcomes. Research questions and hypotheses should therefore not be overlooked; they should be carefully thought out and constructed when planning research. This helps avoid unethical studies and poor outcomes by defining well-founded objectives that determine the design, course, and outcome of the study.

Disclosure: The authors have no potential conflicts of interest to disclose.

Author Contributions:

  • Conceptualization: Barroga E, Matanguihan GJ.
  • Methodology: Barroga E, Matanguihan GJ.
  • Writing - original draft: Barroga E, Matanguihan GJ.
  • Writing - review & editing: Barroga E, Matanguihan GJ.

COMMENTS

  1. Correlational Research

    Revised on June 22, 2023. A correlational research design investigates relationships between variables without the researcher controlling or manipulating any of them. A correlation reflects the strength and/or direction of the relationship between two (or more) variables. The direction of a correlation can be either positive or negative.

  2. Correlation Analysis

    Correlation analysis is a statistical method used to evaluate the strength and direction of the relationship between two or more variables. The correlation coefficient ranges from -1 to 1. A correlation coefficient of 1 indicates a perfect positive correlation. This means that as one variable increases, the other variable also increases.

  3. Correlational Research

    A correlational research design investigates relationships between variables without the researcher controlling or manipulating any of them. A correlation reflects the strength and/or direction of the relationship between two (or more) variables. The direction of a correlation can be either positive or negative. Positive correlation.

  4. What Is Correlation Analysis: Comprehensive Guide

    Correlation analysis is a staple of data analytics. It's a commonly used method to measure the relationship between two variables. It helps researchers understand the extent to which changes to the value in one variable are associated with changes to the value in the other. This analysis often applies to quantitative data collected through ...

  5. Conducting correlation analysis: important limitations and pitfalls

    The correlation coefficient is easy to calculate and provides a measure of the strength of linear association in the data. However, it also has important limitations and pitfalls, both when studying the association between two variables and when studying agreement between methods. These limitations and pitfalls should be taken into account when ...

  6. Correlational Study Overview & Examples

    A correlational study is a non-experimental design that evaluates only the correlation between variables. The researchers record measurements but do not control or manipulate the variables. Correlational research is a form of observational study. A correlation indicates that as the value of one variable increases, the other tends to change in a ...

  7. Correlation analysis in clinical and experimental studies

    Correlation analysis in clinical and experimental studies. It is common for researchers conducting clinical or biomedical studies to be interested in investigating whether the values of two or more quantitative variables change in conjunction in a given individual or object of study. In other words, whether when the value of one variable ...

  8. 7.2 Correlational Research

    Correlational research is a type of nonexperimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are essentially two reasons that researchers interested in statistical relationships between ...

  9. Regression and Correlation

    Quantitative Research Methods. Correlation is the relationship or association between two variables. There are multiple ways to measure correlation, but the most common is Pearson's correlation coefficient (r), which tells you the strength of the linear relationship between two variables. The value of r has a range of -1 to 1 (0 indicates no ...

  10. Correlation Analysis

    Correlation Analysis. Methods of correlation and regression can be used in order to analyze the extent and the nature of relationships between different variables. Correlation analysis is used to understand the nature of relationships between two individual variables. For example, if we aim to study the impact of foreign direct investment (FDI ...

  11. How Accurate Is Your Correlation? Different Methods Derive Different

    Understanding the association between theoretical constructs is at the heart of quantitative research. Researchers use correlation to understand how two or more variables are associated. Note that correlation does not imply causality, especially when it is applied to cross-sectional data (Alamer and Lee, 2021). Beyond this, in first-generation ...

  12. Correlational Research

    It should involve two or more variables that you want to investigate for a correlation. Choose the research method: Decide on the research method that will be most appropriate for your research question. The most common methods for correlational research are surveys, archival research, and naturalistic observation.

  13. 6.2 Correlational Research

    Correlational research is a type of non-experimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are many reasons that researchers interested in statistical relationships between variables ...

  14. What is Correlation Analysis? A Definition and Explanation

    Correlation analysis is a statistical method that is used to discover if there is a relationship between two variables/datasets, and how strong that relationship may be. In terms of market research, this means that correlation analysis is used to analyse quantitative data gathered from research methods such as surveys and polls, to identify ...

  15. Quantitative Analysis with SPSS: Correlation

    To produce a scatterplot, go to Graphs → Legacy Dialogs → Scatter/Dot (Alt+G, Alt+L, Alt+S), as shown in Figure 13 in the chapter on Quantitative Analysis with SPSS: Univariate Analysis. Choose "Simple Scatter" for a scatterplot with two variables, as shown in Figure 1. Figure 2. Simpler Scatter Dialog. This brings up the dialog for ...

  16. Correlation analysis

    Correlation analysis in research is a statistical method used to measure the strength of the linear relationship between two variables and compute their association. Simply put, correlation analysis measures how much one variable changes in association with changes in the other. A high correlation points to a strong relationship between the two ...

  17. (PDF) Usefulness of Correlation Analysis

    A simple correlation analysis measures the degree of closeness between two related variables. The correlation coefficient (r or R) as a measure provides information about closeness ...

  18. Quantitative Research

    Correlational research design is used to investigate the relationship between two or more variables. Researchers use correlational research to determine whether a relationship exists between variables and to what extent they are related. ... Statistical analysis: Quantitative research involves using statistical tools and techniques to analyze ...

  19. A Practical Guide to Writing Quantitative and Qualitative Research

    A research question is what a study aims to answer after data analysis and interpretation. The answer is written at length in the discussion section of the paper. ... In quantitative research, ... There is a positive correlation between the level of stress at the workplace and the number of suicides (population characteristics) among working ...

  20. What is Correlation Analysis? [Examples & How to Measure It]

    Correlation analysis in market research is a statistical method that identifies the strength of a relationship between two or more variables. In a nutshell, the process reveals patterns within a dataset's many variables. It's all about identifying relationships between variables, specifically in research. Using one of the several formulas ...

  21. Research on quantitative evaluation model of human error probability of

    Using bipolar 2-tuples as the Common Performance Condition (CPC) evaluation linguistics, the subjective and objective weights of CPC are calculated through the Analytic Hierarchy Process and the Criteria Importance Through Intercriteria Correlation (CRITIC), and then the combined weighting method is used to further obtain the comprehensive ...
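
Several of the entries above describe Pearson's correlation coefficient (r) as a measure of the strength and direction of a linear relationship, with values ranging from -1 to 1. The following is a minimal sketch of how r can be computed, using only the Python standard library. The function name `pearson_r` and all data values are hypothetical and chosen purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)

    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variation")

    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical measurements for illustration only.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases in step with x
falling = [10, 8, 6, 4, 2]   # decreases in step with x
noisy = [2, 1, 4, 3, 5]      # tends to increase with x, but not perfectly

for label, y in [("rising", rising), ("falling", falling), ("noisy", noisy)]:
    print(label, round(pearson_r(x, y), 3))
```

The sign of r gives the direction of the relationship and its magnitude gives the strength of the linear association. As the entries above note, a high r alone does not establish that one variable causes the other.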