
Correlational Research – Methods, Types and Examples

Correlational Research

Correlational Research is a type of research that examines the statistical relationship between two or more variables without manipulating them. It is a non-experimental research design that seeks to establish the degree of association or correlation between two or more variables.

Types of Correlational Research

Correlational research distinguishes three types of correlation:

Positive Correlation

A positive correlation occurs when two variables increase or decrease together. This means that as one variable increases, the other variable also tends to increase. Similarly, as one variable decreases, the other variable also tends to decrease. For example, there is a positive correlation between the amount of time spent studying and academic performance. The more time a student spends studying, the higher their academic performance is likely to be. Similarly, there is a positive correlation between a person’s age and their income level. As a person gets older, they tend to earn more money.

Negative Correlation

A negative correlation occurs when one variable increases while the other decreases. This means that as one variable increases, the other variable tends to decrease. Similarly, as one variable decreases, the other variable tends to increase. For example, there is a negative correlation between the number of hours spent watching TV and physical activity level. The more time a person spends watching TV, the less physically active they are likely to be. Similarly, there is a negative correlation between the amount of stress a person experiences and their overall happiness. As stress levels increase, happiness levels tend to decrease.

Zero Correlation

A zero correlation occurs when there is no linear relationship between two variables. This means that knowing the value of one variable tells you nothing about whether the other tends to be higher or lower. For example, there is essentially zero correlation between a person’s shoe size and their IQ score. The size of a person’s feet has no relationship to their level of intelligence. Similarly, a person’s height tells you nothing about their favorite color. The two variables are unrelated to each other.
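A coefficient of zero only rules out a straight-line trend, so the "unrelated" reading deserves some care. The sketch below uses made-up data and a hypothetical helper, `pearson_r`, to show a variable that is fully determined by another (y = x squared) yet has a Pearson coefficient of zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative data only: y depends entirely on x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

print(pearson_r(xs, ys))  # the positive and negative halves cancel out
```

A scatterplot of the same data would show a clear U shape, which is one reason to plot the data before interpreting a coefficient.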

Correlational Research Methods

Correlational research can be conducted using different methods, including:

Surveys

Surveys are a common method used in correlational research. Researchers collect data by asking participants to complete questionnaires or surveys that measure different variables of interest. Surveys are useful for exploring the relationships between variables such as personality traits, attitudes, and behaviors.

Observational Studies

Observational studies involve observing and recording the behavior of participants in natural settings. Researchers can use observational studies to examine the relationships between variables such as social interactions, group dynamics, and communication patterns.

Archival Data

Archival data involves using existing data sources such as historical records, census data, or medical records to explore the relationships between variables. Archival data is useful for investigating the relationships between variables that cannot be manipulated or controlled.

Experimental Design

Experimental design is not itself a correlational method, because it involves manipulating variables. It is the usual follow-up when a correlational study suggests a relationship worth testing. To establish cause and effect, researchers manipulate one (independent) variable while holding other variables constant and measure the effect on the dependent variable.

Meta-Analysis

Meta-analysis involves combining and analyzing the results of multiple studies to explore the relationships between variables across different contexts and populations. Meta-analysis is useful for identifying patterns and inconsistencies in the literature and can provide insights into the strength and direction of relationships between variables.
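One common way to combine correlations from several studies, not necessarily the one any given meta-analysis uses, is to convert each coefficient with Fisher's z-transform, average the z values weighted by sample size, and convert back. The sketch below assumes a simple fixed-effect model; the function name `pooled_correlation` and the three study results are hypothetical.

```python
import math

def pooled_correlation(studies):
    """Combine (r, n) pairs via Fisher's z, weighting each study by n - 3."""
    weighted_sum = 0.0
    total_weight = 0.0
    for r, n in studies:
        z = math.atanh(r)   # Fisher z-transform of the correlation
        weight = n - 3      # inverse of the approximate sampling variance of z
        weighted_sum += weight * z
        total_weight += weight
    return math.tanh(weighted_sum / total_weight)  # back-transform to r

# Hypothetical results from three studies: (correlation, sample size)
studies = [(0.30, 50), (0.45, 120), (0.20, 30)]
print(round(pooled_correlation(studies), 3))
```

Larger studies pull the pooled estimate toward their own result, which is the intended effect of the weighting.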

Data Analysis Methods

Correlational research data analysis methods depend on the type of data collected and the research questions being investigated. Here are some common data analysis methods used in correlational research:

Correlation Coefficient

A correlation coefficient is a statistical measure that quantifies the strength and direction of the relationship between two variables. The correlation coefficient ranges from -1 to +1, with -1 indicating a perfect negative correlation, +1 indicating a perfect positive correlation, and 0 indicating no correlation. Researchers use correlation coefficients to determine the degree to which two variables are related.
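As a rough illustration of how such a coefficient is obtained, the sketch below computes Pearson's r from its definition: the covariance of the two variables divided by the product of their standard deviations. The function name `pearson_r` and the study-hours and TV-hours figures are made up for this example, and other coefficients may suit other kinds of data better.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: hours studied vs. exam score (expected to rise together)
hours_studied = [1, 2, 3, 4, 5, 6]
exam_scores = [52, 55, 61, 64, 70, 73]

# Hypothetical data: daily TV hours vs. weekly active hours (expected to oppose)
tv_hours = [1, 2, 3, 4, 5, 6]
active_hours = [9, 8, 8, 5, 4, 2]

print("study vs. score:", round(pearson_r(hours_studied, exam_scores), 3))
print("TV vs. activity:", round(pearson_r(tv_hours, active_hours), 3))
```

The sign of the result gives the direction of the relationship and its distance from zero gives the strength.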

Scatterplots

A scatterplot is a graphical representation of the relationship between two variables. Each data point on the plot represents a single observation. The x-axis represents one variable, and the y-axis represents the other variable. The pattern of data points on the plot can provide insights into the strength and direction of the relationship between the two variables.

Regression Analysis

Regression analysis is a statistical method used to model the relationship between two or more variables. Researchers use regression analysis to predict the value of one variable based on the value of another variable. Regression analysis can help identify the strength and direction of the relationship between variables, as well as the degree to which one variable can be used to predict the other.
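For the simplest case, one predictor and one outcome, the prediction step can be sketched with ordinary least squares. The helper names `fit_line` and `predict` and the data are assumptions made for illustration; real analyses usually rely on a statistics package and check the model's assumptions first.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted y for a given x on the fitted line."""
    return intercept + slope * x

# Hypothetical data: hours studied vs. exam score
hours_studied = [1, 2, 3, 4, 5, 6]
exam_scores = [52, 55, 61, 64, 70, 73]

slope, intercept = fit_line(hours_studied, exam_scores)
print("predicted score after 7 hours:", round(predict(7, slope, intercept), 1))
```

The prediction describes an association in the sample. It does not show that studying longer causes the higher score.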

Factor Analysis

Factor analysis is a statistical method used to identify patterns among variables. Researchers use factor analysis to group variables into factors that are related to each other. Factor analysis can help identify underlying (latent) factors that account for the correlations observed among a set of variables.

Path Analysis

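Path analysis is a statistical method used to model the relationship between multiple variables. Researchers use path analysis to test whether a hypothesized causal model is consistent with the observed correlations and to estimate direct and indirect effects between variables. A good fit supports the model but does not by itself prove causation.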

Applications of Correlational Research

Correlational research has many practical applications in various fields, including:

  • Psychology: Correlational research is commonly used in psychology to explore the relationships between variables such as personality traits, behaviors, and mental health outcomes. For example, researchers may use correlational research to examine the relationship between anxiety and depression, or the relationship between self-esteem and academic achievement.
  • Education: Correlational research is useful in educational research to explore the relationships between variables such as teaching methods, student motivation, and academic performance. For example, researchers may use correlational research to examine the relationship between student engagement and academic success, or the relationship between teacher feedback and student learning outcomes.
  • Business: Correlational research can be used in business to explore the relationships between variables such as consumer behavior, marketing strategies, and sales outcomes. For example, marketers may use correlational research to examine the relationship between advertising spending and sales revenue, or the relationship between customer satisfaction and brand loyalty.
  • Medicine: Correlational research is useful in medical research to explore the relationships between variables such as risk factors, disease outcomes, and treatment effectiveness. For example, researchers may use correlational research to examine the relationship between smoking and lung cancer, or the relationship between exercise and heart health.
  • Social Science: Correlational research is commonly used in social science research to explore the relationships between variables such as socioeconomic status, cultural factors, and social behavior. For example, researchers may use correlational research to examine the relationship between income and voting behavior, or the relationship between cultural values and attitudes towards immigration.

Examples of Correlational Research

  • Psychology: Researchers might be interested in exploring the relationship between two variables, such as parental attachment and anxiety levels in young adults. The study could involve measuring levels of attachment and anxiety using established scales or questionnaires, and then analyzing the data to determine if there is a correlation between the two variables. This information could be useful in identifying potential risk factors for anxiety in young adults, and in developing interventions that could help improve attachment and reduce anxiety.
  • Education: In a correlational study in education, researchers might investigate the relationship between two variables, such as teacher engagement and student motivation in a classroom setting. The study could involve measuring levels of teacher engagement and student motivation using established scales or questionnaires, and then analyzing the data to determine if there is a correlation between the two variables. This information could be useful in identifying strategies that teachers could use to improve student motivation and engagement in the classroom.
  • Business: Researchers might explore the relationship between two variables, such as employee satisfaction and productivity levels in a company. The study could involve measuring levels of employee satisfaction and productivity using established scales or questionnaires, and then analyzing the data to determine if there is a correlation between the two variables. This information could be useful in identifying factors that could help increase productivity and improve job satisfaction among employees.
  • Medicine: Researchers might examine the relationship between two variables, such as smoking and the risk of developing lung cancer. The study could involve collecting data on smoking habits and lung cancer diagnoses, and then analyzing the data to determine if there is a correlation between the two variables. This information could be useful in identifying risk factors for lung cancer and in developing interventions that could help reduce smoking rates.
  • Sociology: Researchers might investigate the relationship between two variables, such as income levels and political attitudes. The study could involve measuring income levels and political attitudes using established scales or questionnaires, and then analyzing the data to determine if there is a correlation between the two variables. This information could be useful in understanding how socioeconomic factors can influence political beliefs and attitudes.

How to Conduct Correlational Research

Here are the general steps to conduct correlational research:

  • Identify the research question: Start by identifying the research question that you want to explore. It should involve two or more variables that you want to investigate for a correlation.
  • Choose the research method: Decide on the research method that will be most appropriate for your research question. The most common methods for correlational research are surveys, archival research, and naturalistic observation.
  • Choose the sample: Select the participants or data sources that you will use in your study. Your sample should be representative of the population you want to generalize the results to.
  • Measure the variables: Choose the measures that will be used to assess the variables of interest. Ensure that the measures are reliable and valid.
  • Collect the data: Collect the data from your sample using the chosen research method. Be sure to maintain ethical standards and obtain informed consent from your participants.
  • Analyze the data: Use statistical software to analyze the data and compute the correlation coefficient. This will help you determine the strength and direction of the correlation between the variables.
  • Interpret the results: Interpret the results and draw conclusions based on the findings. Consider any limitations or alternative explanations for the results.
  • Report the findings: Report the findings of your study in a research report or manuscript. Be sure to include the research question, methods, results, and conclusions.
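The analysis and interpretation steps can be sketched as follows. Besides the coefficient itself, it is common to ask whether a correlation of that size could plausibly arise by chance. A permutation test is one way to check this and is chosen here only because it needs no external library. The function names, the stress and happiness scores, and the number of shuffles are all assumptions made for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Share of random re-pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)   # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)   # break any real pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical questionnaire scores for eight participants
stress = [2, 4, 5, 7, 8, 10, 11, 13]
happiness = [9, 8, 8, 6, 5, 4, 3, 1]

r = pearson_r(stress, happiness)
p = permutation_p_value(stress, happiness)
print("r =", round(r, 3), "approximate p =", round(p, 4))
```

A small p-value suggests the association is unlikely to be a fluke of this sample. It says nothing about which variable, if either, causes the other.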

Purpose of Correlational Research

The purpose of correlational research is to examine the relationship between two or more variables. Correlational research allows researchers to identify whether there is a relationship between variables, and if so, the strength and direction of that relationship. This information can be useful for predicting and explaining behavior, and for identifying potential risk factors or areas for intervention.

Correlational research can be used in a variety of fields, including psychology, education, medicine, business, and sociology. For example, in psychology, correlational research can be used to explore the relationship between personality traits and behavior, or between early life experiences and later mental health outcomes. In education, correlational research can be used to examine the relationship between teaching practices and student achievement. In medicine, correlational research can be used to investigate the relationship between lifestyle factors and disease outcomes.

Overall, the purpose of correlational research is to provide insight into the relationship between variables, which can be used to inform further research, interventions, or policy decisions.

When to use Correlational Research

Here are some situations when correlational research can be particularly useful:

  • When experimental research is not possible or ethical: In some situations, it may not be possible or ethical to manipulate variables in an experimental design. In these cases, correlational research can be used to explore the relationship between variables without manipulating them.
  • When exploring new areas of research: Correlational research can be useful when exploring new areas of research or when researchers are unsure of the direction of the relationship between variables. Correlational research can help identify potential areas for further investigation.
  • When testing theories: Correlational research can be useful for testing theories about the relationship between variables. Researchers can use correlational research to examine the relationship between variables predicted by a theory, and to determine whether the theory is supported by the data.
  • When making predictions: Correlational research can be used to make predictions about future behavior or outcomes. For example, if there is a strong positive correlation between education level and income, one could predict that individuals with higher levels of education will have higher incomes.
  • When identifying risk factors: Correlational research can be useful for identifying potential risk factors for negative outcomes. For example, a study might find a positive correlation between drug use and depression, indicating that drug use could be a risk factor for depression.

Characteristics of Correlational Research

Here are some common characteristics of correlational research:

  • Examines the relationship between two or more variables: Correlational research is designed to examine the relationship between two or more variables. It seeks to determine if there is a relationship between the variables, and if so, the strength and direction of that relationship.
  • Non-experimental design: Correlational research is typically non-experimental in design, meaning that the researcher does not manipulate any variables. Instead, the researcher observes and measures the variables as they naturally occur.
  • Cannot establish causation: Correlational research cannot establish causation, meaning that it cannot determine whether one variable causes changes in another variable. Instead, it only provides information about the relationship between the variables.
  • Uses statistical analysis: Correlational research relies on statistical analysis to determine the strength and direction of the relationship between variables. This may include calculating correlation coefficients, regression analysis, or other statistical tests.
  • Observes real-world phenomena: Correlational research is often used to observe real-world phenomena, such as the relationship between education and income or the relationship between stress and physical health.
  • Can be conducted in a variety of fields: Correlational research can be conducted in a variety of fields, including psychology, sociology, education, and medicine.
  • Can be conducted using different methods: Correlational research can be conducted using a variety of methods, including surveys, observational studies, and archival studies.

Advantages of Correlational Research

There are several advantages of using correlational research in a study:

  • Allows for the exploration of relationships: Correlational research allows researchers to explore the relationships between variables in a natural setting without manipulating any variables. This can help identify possible relationships between variables that may not have been previously considered.
  • Useful for predicting behavior: Correlational research can be useful for predicting future behavior. If a strong correlation is found between two variables, researchers can use the value of one variable to predict the likely value of the other, without assuming that one causes the other.
  • Can be conducted in real-world settings: Correlational research can be conducted in real-world settings, which allows for the collection of data that is representative of real-world phenomena.
  • Can be less expensive and time-consuming than experimental research: Correlational research is often less expensive and time-consuming than experimental research, as it does not involve manipulating variables or creating controlled conditions.
  • Useful in identifying risk factors: Correlational research can be used to identify potential risk factors for negative outcomes. By identifying variables that are correlated with negative outcomes, researchers can develop interventions or policies to reduce the risk of negative outcomes.
  • Useful in exploring new areas of research: Correlational research can be useful in exploring new areas of research, particularly when researchers are unsure of the direction of the relationship between variables. By conducting correlational research, researchers can identify potential areas for further investigation.

Limitations of Correlational Research

Correlational research also has several limitations that should be taken into account:

  • Cannot establish causation: Correlational research cannot establish causation, meaning that it cannot determine whether one variable causes changes in another variable. This is because it is not possible to control all possible confounding variables that could affect the relationship between the variables being studied.
  • Directionality problem: The directionality problem refers to the difficulty of determining which variable is influencing the other. For example, a correlation may exist between happiness and social support, but it is not clear whether social support causes happiness, or whether happy people are more likely to have social support.
  • Third variable problem: The third variable problem refers to the possibility that a third variable, not included in the study, is responsible for the observed relationship between the two variables being studied.
  • Limited generalizability: Correlational research is often limited in terms of its generalizability to other populations or settings. This is because the sample studied may not be representative of the larger population, or because the variables studied may behave differently in different contexts.
  • Relies on self-reported data: Correlational research often relies on self-reported data, which can be subject to social desirability bias or other forms of response bias.
  • Limited in explaining complex behaviors: Correlational research is limited in explaining complex behaviors that are influenced by multiple factors, such as personality traits, situational factors, and social context.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer


Correlational Research | Guide, Design & Examples

Published on 5 May 2022 by Pritha Bhandari. Revised on 5 December 2022.

A correlational research design investigates relationships between variables without the researcher controlling or manipulating any of them.

A correlation reflects the strength and/or direction of the relationship between two (or more) variables. The direction of a correlation can be either positive or negative.

Table of contents

  • Correlational vs experimental research
  • When to use correlational research
  • How to collect correlational data
  • How to analyse correlational data
  • Correlation and causation
  • Frequently asked questions about correlational research

Correlational vs experimental research

Correlational and experimental research both use quantitative methods to investigate relationships between variables. But there are important differences in how data is collected and the types of conclusions you can draw.

When to use correlational research

Correlational research is ideal for gathering data quickly from natural settings. That helps you generalise your findings to real-life situations in an externally valid way.

There are a few situations where correlational research is an appropriate choice.

To investigate non-causal relationships

You want to find out if there is an association between two variables, but you don’t expect to find a causal relationship between them.

Correlational research can provide insights into complex real-world relationships, helping researchers develop theories and make predictions.

To explore causal relationships between variables

You think there is a causal relationship between two variables, but it is impractical, unethical, or too costly to conduct experimental research that manipulates one of the variables.

Correlational research can provide initial indications or additional support for theories about causal relationships.

To test new measurement tools

You have developed a new instrument for measuring your variable, and you need to test its reliability or validity.

Correlational research can be used to assess whether a tool consistently or accurately captures the concept it aims to measure.

How to collect correlational data

There are many different methods you can use in correlational research. In the social and behavioural sciences, the most common data collection methods for this type of research include surveys, observations, and secondary data.

It’s important to carefully choose and plan your methods to ensure the reliability and validity of your results. You should carefully select a representative sample so that your data reflects the population you’re interested in without bias.

Surveys

In survey research, you can use questionnaires to measure your variables of interest. You can conduct surveys online, by post, by phone, or in person.

Surveys are a quick, flexible way to collect standardised data from many participants, but it’s important to ensure that your questions are worded in an unbiased way and capture relevant insights.

Naturalistic observation

Naturalistic observation is a type of field research where you gather data about a behaviour or phenomenon in its natural environment.

This method often involves recording, counting, describing, and categorising actions and events. Naturalistic observation can include both qualitative and quantitative elements, but to assess correlation, you collect data that can be analysed quantitatively (e.g., frequencies, durations, scales, and amounts).

Naturalistic observation lets you easily generalise your results to real-world contexts, and you can study experiences that aren’t replicable in lab settings. But data analysis can be time-consuming and unpredictable, and researcher bias may skew the interpretations.

Secondary data

Instead of collecting original data, you can also use data that has already been collected for a different purpose, such as official records, polls, or previous studies.

Using secondary data is inexpensive and fast, because data collection is complete. However, the data may be unreliable, incomplete, or not entirely relevant, and you have no control over the reliability or validity of the data collection procedures.

How to analyse correlational data

After collecting data, you can statistically analyse the relationship between variables using correlation or regression analyses, or both. You can also visualise the relationships between variables with a scatterplot.

Different types of correlation coefficients and regression analyses are appropriate for your data based on their levels of measurement and distributions.
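To illustrate why the choice of coefficient matters, the sketch below computes Spearman's rank correlation, one widely used alternative for ordinal or non-normally distributed data. It is brought in here as an example and is not named in the text above. The helper names and the data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the ranks of the data."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data with a steadily rising but curved relationship
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]

print("Pearson: ", round(pearson_r(xs, ys), 3))
print("Spearman:", round(spearman_rho(xs, ys), 3))
```

Because the relationship is monotonic but not linear, the rank-based coefficient reports a perfect association while Pearson's r falls short of 1.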

Correlation analysis

Using a correlation analysis, you can summarise the relationship between variables into a correlation coefficient: a single number that describes the strength and direction of the relationship between variables. With this number, you’ll quantify the degree of the relationship between variables.

The Pearson product-moment correlation coefficient, also known as Pearson’s r, is commonly used for assessing a linear relationship between two quantitative variables.

Correlation coefficients are usually found for two variables at a time, but you can use a multiple correlation coefficient for three or more variables.
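For one outcome and two predictors, the multiple correlation coefficient can be computed from the three pairwise correlations. The sketch below uses the standard two-predictor formula. The function name and the input correlations are invented for illustration, and cases with more predictors need matrix methods not shown here.

```python
import math

def multiple_correlation(r_y1, r_y2, r_12):
    """Multiple R for outcome y with predictors 1 and 2.

    r_y1, r_y2: correlation of y with each predictor
    r_12: correlation between the two predictors (must not be +1 or -1)
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)

# Hypothetical pairwise correlations
print(round(multiple_correlation(r_y1=0.6, r_y2=0.5, r_12=0.3), 3))
```

The result is at least as large as the stronger of the two single correlations, since adding a predictor cannot reduce how well the outcome is accounted for in the sample.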

Regression analysis

With a regression analysis, you can predict how much a change in one variable will be associated with a change in the other variable. The result is a regression equation that describes the line on a graph of your variables.

You can use this equation to predict the value of one variable based on the given value(s) of the other variable(s). It’s best to perform a regression analysis after testing for a correlation between your variables.

Correlation and causation

It’s important to remember that correlation does not imply causation. Just because you find a correlation between two things doesn’t mean you can conclude one of them causes the other, for a few reasons.

Directionality problem

If two variables are correlated, it could be because one of them is a cause and the other is an effect. But the correlational research design doesn’t allow you to infer which is which. To err on the side of caution, researchers don’t conclude causality from correlational studies.

Third variable problem

A confounding variable is a third variable that influences other variables to make them seem causally related even though they are not. Instead, there are separate causal links between the confounder and each variable.

In correlational research, there’s limited or no researcher control over extraneous variables. Even if you statistically control for some potential confounders, there may still be other hidden variables that disguise the relationship between your study variables.
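One simple form of statistical control is the partial correlation, which estimates the association between two variables after removing the linear influence of a third. The sketch below uses the standard first-order formula. The function name, the variable labels, and the correlation values are hypothetical.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with the linear effect of z removed from both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical example: x and y correlate at 0.60, but both are strongly
# related to a third variable z (0.80 and 0.75 respectively).
adjusted = partial_correlation(r_xy=0.60, r_xz=0.80, r_yz=0.75)
print(round(adjusted, 3))
```

In this made-up case the association between x and y essentially disappears once z is accounted for. The adjustment only covers confounders that were actually measured, so unmeasured ones can still distort the result.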

Although a correlational study can’t demonstrate causation on its own, it can help you develop a causal hypothesis that’s tested in controlled experiments.

Frequently asked questions about correlational research

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research.

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design, you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design, you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity.

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions. The Pearson product-moment correlation coefficient (Pearson’s r) is commonly used to assess a linear relationship between two quantitative variables.

Cite this Scribbr article

Bhandari, P. (2022, December 05). Correlational Research | Guide, Design & Examples. Scribbr. Retrieved 14 May 2024, from https://www.scribbr.co.uk/research-methods/correlational-research-design/


Correlational Research: What it is with Examples

Use the correlational research method to conduct a correlational study and measure the statistical relationship between two variables.

Our minds can do some brilliant things. For example, they can memorize the jingle of a pizza truck. The louder the jingle, the closer the pizza truck is to us. Who taught us that? Nobody! We relied on our understanding and came to a conclusion. We don’t stop there, do we? If there are multiple pizza trucks in the area and each one has a different jingle, we would memorize them all and relate each jingle to its pizza truck.

This is precisely what correlational research does: it establishes a relationship between two variables, “jingle” and “distance of the truck” in this particular example. A correlational study looks for variables that seem to vary together. When you see one variable changing, you have a fair idea of how the other variable will change.

What is Correlational research?

Correlational research is a type of non-experimental research method in which a researcher measures two variables and assesses the statistical relationship between them without manipulating either variable or controlling extraneous variables. In statistical analysis, distinguishing between categorical data and numerical data is essential, as categorical data involves distinct categories or labels, while numerical data consists of measurable quantities.

Correlational Research Example

The correlation coefficient is a statistical measure of the strength and direction of the relationship between two variables. Its value lies between -1 and +1. When the correlation coefficient is close to +1, there is a strong positive correlation between the two variables. If the value is close to -1, there is a strong negative correlation between the two variables. When the value is close to zero, there is little or no linear relationship between the two variables.
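As a minimal sketch of how such a coefficient can be obtained, the Python snippet below computes Pearson's r, one common correlation coefficient, for the pizza truck example. The function name `pearson_r` and the distance and loudness figures are invented for illustration and do not come from any real study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return cov / sqrt(var_x * var_y)


# Invented data: distance to the truck (metres) and jingle loudness (decibels).
distance = [10, 20, 30, 40, 50]
loudness = [80, 74, 69, 61, 56]

r = pearson_r(distance, loudness)
print(f"r = {r:.3f}")  # a value close to -1: the farther the truck, the quieter the jingle
```

Because loudness falls as distance grows in this made-up data, the coefficient comes out close to -1, which is the negative correlation described above.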

Let us take an example to understand correlational research.

Suppose, hypothetically, that a researcher is studying the correlation between cancer and marriage. In this study, there are two variables: cancer and marriage. Let us say marriage has a negative association with cancer. This means that married people are less likely to develop cancer.

However, this doesn’t necessarily mean that marriage directly prevents cancer. In correlational research, it is not possible to establish what causes what. It is also a misconception that a correlational study must involve two quantitative variables. What defines it is that two variables are measured and neither is manipulated. This is true regardless of whether the variables are quantitative or categorical.

Types of correlational research

Mainly three types of correlational research have been identified:

1. Positive correlation: A positive relationship between two variables is when an increase in one variable is accompanied by an increase in the other, and a decrease in one is accompanied by a decrease in the other. For example, the amount of money a person has might positively correlate with the number of cars the person owns.

2. Negative correlation: A negative correlation is quite literally the opposite of a positive relationship. If there is an increase in one variable, the second variable will show a decrease, and vice versa.

For example, the level of education might negatively correlate with the crime rate: as a country’s education level rises, its crime rate tends to fall. Please note that this doesn’t mean that lack of education leads to crime. It may only mean that lack of education and crime share a common cause, such as poverty.

3. No correlation: In this third type, there is no correlation between the two variables. A change in one variable is not accompanied by a consistent change in the other. For example, being a millionaire and being happy might not be correlated: having more money does not necessarily go with being happier.

Characteristics of correlational research

Correlational research has three main characteristics. They are: 

  • Non-experimental: The correlational study is non-experimental. Researchers do not manipulate variables in order to test a hypothesis. The researcher only measures and observes the relationship between the variables without altering them or subjecting them to external conditioning.
  • Backward-looking: Correlational research only looks back at historical data and observes events in the past. Researchers use it to measure and spot historical patterns between two variables. A correlational study may show a positive relationship between two variables, but this can change in the future.
  • Dynamic: The patterns between two variables from correlational research are not fixed and can change over time. Two variables that had a negative correlation in the past can have a positive correlation in the future due to various factors.

Data collection

The distinctive feature of correlational research is that the researcher can’t manipulate either of the variables involved. It doesn’t matter how or where the variables are measured. A researcher could observe participants in a closed environment or a public setting.

Researchers use two data collection methods to collect information in correlational research.

01. Naturalistic observation

Naturalistic observation is a way of collecting data in which people’s behavior is observed in their natural environment, where they typically are. This method is a type of field research. A researcher might observe people in a grocery store, at the cinema, on a playground, or in similar places.

Researchers involved in this type of data collection usually make observations as unobtrusively as possible so that the participants are not aware that they are being observed; otherwise, they might not behave naturally.

Ethically, this method is acceptable if the participants remain anonymous and the study is conducted in a public setting, a place where people would not normally expect complete privacy. In the grocery store example, people can be observed while picking an item from the aisle and putting it in their shopping bags. This is ethically acceptable, which is why most researchers choose public settings for recording their observations. This data collection method can be both qualitative and quantitative.

02. Archival data

Another approach to correlational data is the use of archival data. Archival information is data that has been previously collected in the course of similar kinds of research. Archival data is usually made available through primary research.

In contrast to naturalistic observation, the information collected through archived data can be pretty straightforward. For example, counting the number of people named Richard in the various states of America based on social security records is relatively straightforward.

Correlational Research Designs: Types, Examples & Methods

busayo.longe

The human mind is a powerful tool that allows you to sift through seemingly unrelated variables and establish a connection with regard to the subject at hand. This skill is what comes into play when we talk about correlational research.

Correlational research is something that we do every day; think about how you establish a connection between the doorbell ringing at a particular time and the milkman’s arrival. As such, it is useful to understand the different types of correlational research that are available and, more importantly, how to go about them.

What is Correlational Research?

Correlational research is a type of research method that involves observing two variables in order to establish a statistically corresponding relationship between them. The aim of correlational research is to identify variables that are related to the extent that a change in one is accompanied by some change in the other.

This type of research is descriptive, unlike experimental research, which relies on manipulating variables to test a hypothesis. For example, correlational research may reveal the statistical relationship between income and relocation; that is, whether people who earn more are more or less likely to relocate.

What are the Types of Correlational Research?

Essentially, there are 3 types of correlational research which are positive correlational research, negative correlational research, and no correlational research. Each of these types is defined by peculiar characteristics. 

  • Positive Correlational Research

Positive correlational research is a research method involving 2 variables that are statistically corresponding, where an increase or decrease in one variable is accompanied by a like change in the other. An example is when an increase in workers’ remuneration goes along with an increase in the prices of goods and services and vice versa.

  • Negative Correlational Research

Negative correlational research is a research method involving 2 variables that are statistically opposite, where an increase in one of the variables is accompanied by a decrease in the other variable. An example of a negative correlation is when a rise in the prices of goods and services goes along with a decrease in demand and vice versa.

  • Zero Correlational Research

Zero correlational research is a type of correlational research that involves 2 variables that are not necessarily statistically connected. In this case, a change in one of the variables may not trigger a corresponding or alternate change in the other variable.

Zero correlational research caters for variables with vague statistical relationships. For example, wealth and patience can be variables under zero correlational research if they turn out to be statistically independent.

Sporadic change patterns that occur in variables with zero correlation are usually due to chance and not the result of any systematic relationship between them.

Correlational research can also be classified based on data collection methods. Based on these, there are 3 types of correlational research: Naturalistic observation research, survey research and archival research. 

What are the Data Collection Methods in Correlational research? 

Data collection methods in correlational research are the research methodologies adopted by persons carrying out correlational research in order to determine the linear statistical relationship between 2 variables. These data collection methods are used to gather information in correlational research. 

The 3 methods of data collection in correlational research are the naturalistic observation method, the archival data method, and the survey method. Each of these is explained in the following paragraphs.

  • Naturalistic Observation

Naturalistic observation is a correlational research methodology that involves observing people’s behaviors as shown in the natural environment where they exist, over a period of time. It is a type of field research method that involves the researcher paying close attention to the natural behavior patterns of the subjects under consideration.

This method is extremely demanding, as the researcher must take extra care to ensure that the subjects do not suspect that they are being observed; otherwise, they may deviate from their natural behavior patterns. It is best for all subjects under observation to remain anonymous in order to avoid a breach of privacy.

The major advantage of the naturalistic observation method is that it allows the researcher to fully observe the subjects (variables) in their natural state. However, it is a very expensive and time-consuming process, and the subjects can become aware of the observation at any time and change their behavior.

  • Archival Data

Archival data is a type of correlational research method that involves making use of already gathered information about the variables in correlational research. Since this method involves using data that is already gathered and analyzed, it is usually straight to the point. 

For this method of correlational research, the researcher makes use of earlier studies conducted by other researchers or the historical records of the variables being analyzed. This method helps a researcher to track already determined statistical patterns of the variables or subjects.

This method is less expensive, saves time and provides the researcher with more data to work with. However, it has the problem of data accuracy, as important information may be missing from previous research and the researcher has no control over the data collection process.

  • Survey Method

The survey method is the most common method of correlational research, especially in fields like psychology. It involves sampling participants, ideally at random, who then fill out a questionnaire centered on the variables of interest.

This method is very flexible as researchers can gather large amounts of data in very little time. However, it is subject to survey response bias and can also be affected by biased survey questions or under-representation of survey respondents or participants. 

These would be properly explained under data collection methods in correlational research. 

Examples of Correlational Research

Correlational research examples are numerous and highlight several instances where a correlational study may be carried out in order to determine the statistical behavioral trend with regards to the variables under consideration. Here are 3 case examples of correlational research. 

  • You want to know if wealthy people are less likely to be patient. From your experience, you believe that wealthy people are impatient. However, you want to establish a statistical pattern that proves or disproves your belief. In this case, you can carry out correlational research to identify a trend that links both variables. 
  • You want to know if there’s a correlation between how much people earn and the number of children that they have. You do not believe that people with more spending power have more children than people with less spending power. 

You think that how much people earn hardly determines the number of children that they have. Yet, carrying out correlational research on both variables could reveal any correlational relationship that exists between them. 

  • You believe that domestic violence causes a brain hemorrhage. You cannot carry out an experiment as it would be unethical to deliberately subject people to domestic violence. 

However, you can carry out correlational research to find out if victims of domestic violence suffer brain hemorrhage more than non-victims. 

What are the Characteristics of Correlational Research? 

  • Correlational Research is non-experimental

Correlational research is non-experimental as it does not involve manipulating variables using a scientific methodology in order to agree or disagree with a hypothesis. In correlational research, the researcher simply observes and measures the natural relationship between 2 variables; without subjecting either of the variables to external conditioning. 

  • Correlational Research is Backward-looking

Correlational research doesn’t take the future into consideration, as it only observes and measures the recent historical relationship that exists between 2 variables. In this sense, the statistical pattern resulting from correlational research is backward-looking and can cease to exist at any point going forward.

Correlational research observes and measures historical patterns between 2 variables such as the relationship between high-income earners and tax payment. Correlational research may reveal a positive relationship between the aforementioned variables but this may change at any point in the future. 

  • Correlational Research is Dynamic

Statistical patterns between 2 variables that result from correlational research are ever-changing. The correlation between 2 variables can change over time and, as such, it cannot be used as fixed data for further research.

For example, the 2 variables can have a negative correlational relationship for a period of time, maybe 5 years. After this time, the correlational relationship between them can become positive; as observed in the relationship between bonds and stocks. 

  • Data resulting from correlational research are not constant and cannot be used as a standard variable for further research. 
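One way to see how a correlation can change over time is to recompute it over successive windows of the data. The sketch below does this with a hand-written Pearson's r; the helper names `pearson_r` and `rolling_r` and the two series are invented for illustration, and they are not real bond or stock figures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rolling_r(xs, ys, window):
    """Pearson's r over each consecutive window of the two series."""
    return [
        pearson_r(xs[i:i + window], ys[i:i + window])
        for i in range(len(xs) - window + 1)
    ]


# Invented series: y moves against x in the first five periods
# and together with x in the last five.
x = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1, 1, 2, 3, 4, 5]

rs = rolling_r(x, y, window=5)
for start, value in enumerate(rs):
    print(f"periods {start + 1}-{start + 5}: r = {value:+.2f}")
```

In this made-up data the first window gives a perfect negative correlation and the last window a perfect positive one, so a single coefficient computed over all ten periods would hide the change.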

What is the Correlation Coefficient? 

A correlation coefficient is an important value in correlational research that indicates whether the inter-relationship between 2 variables is positive, negative or non-existent. It is usually represented by the symbol r and takes a value in the range from -1.0 to +1.0.

The strength of a correlation between quantitative variables is typically measured using a statistic called Pearson’s Correlation Coefficient (or Pearson’s r). A perfect positive correlation is indicated by a value of +1.0, a perfect negative correlation is indicated by a value of -1.0, while zero correlation is indicated by a value of 0.0.
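For reference, the standard sample formula for Pearson's r is written out below. The notation is an assumption added here, not taken from the article: x_i and y_i are the paired observations, and x-bar and y-bar are their sample means.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when the two variables tend to sit on the same side of their means and negative when they tend to sit on opposite sides; the denominator rescales the result so that it always falls between -1.0 and +1.0.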

It is important to note that a correlation coefficient only reflects the linear relationship between 2 variables; it does not capture non-linear relationships and cannot separate dependent and independent variables. The correlation coefficient helps you to determine the degree of statistical relationship that exists between variables. 
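Both limitations can be illustrated with a short sketch. In the invented data below, y is completely determined by x through a curved (quadratic) relationship, yet Pearson's r comes out as zero; and swapping the two arguments gives the same value, so the coefficient carries no information about which variable depends on which. The function name `pearson_r` and all numbers are assumptions made for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# A perfect but non-linear relationship: y is the square of x.
x = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [value ** 2 for value in x]
r_curve = pearson_r(x, y_curve)

# A roughly linear relationship, used to show that r is symmetric.
y_line = [1, 3, 4, 6, 9, 10, 13]
r_forward = pearson_r(x, y_line)
r_reverse = pearson_r(y_line, x)

print(f"curved relationship: r = {r_curve:.2f}")
print(f"linear relationship: r(x, y) = {r_forward:.3f}, r(y, x) = {r_reverse:.3f}")
```

A scatter plot is therefore worth checking before concluding from a near-zero r that two variables are unrelated.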

What are the Advantages of Correlational Research?

  • In cases where carrying out experimental research is unethical, correlational research  can be used to determine the relationship between 2 variables. For example, when studying humans, carrying out an experiment can be seen as unsafe or unethical; hence, choosing correlational research would be the best option. 
  • Through correlational research, you can easily determine the statistical relationship between 2 variables.
  • Carrying out correlational research is less time-consuming and less expensive than experimental research. This becomes a strong advantage when working with a minimum of researchers and funding or when keeping the number of variables in a study very low. 
  • Correlational research allows the researcher to carry out shallow data gathering using different methods such as a short survey. A short survey does not require the researcher to personally administer it so this allows the researcher to work with a few people. 

What are the Disadvantages of Correlational Research? 

  • Correlational research is limited in nature, as a single correlation coefficient only describes the statistical relationship between 2 variables. It cannot, on its own, describe how more than 2 variables relate to one another.
  • It does not account for cause and effect between 2 variables as it doesn’t highlight which of the 2 variables is responsible for the statistical pattern that is observed. For example, finding that education correlates positively with vegetarianism doesn’t explain whether being educated leads to becoming a vegetarian or whether vegetarianism leads to more education.
  • Reasons for either can be assumed, but until more research is done, causation can’t be determined. Also, a third, unknown variable might be causing both. For instance, living in the city of Detroit can lead to both education and vegetarianism.
  • Correlational research depends on past statistical patterns to determine the relationship between variables. As such, its data cannot be fully depended on for further research. 
  • In correlational research, the researcher has no control over the variables. Unlike experimental research, correlational research only allows the researcher to observe the variables for connecting statistical patterns without introducing a catalyst. 
  • The information received from correlational research is limited. Correlational research only shows the relationship between variables and does not equate to causation. 
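The third-variable problem mentioned in the list above can be sketched with a small simulation. Here a hidden variable z drives both x and y, which have no direct link to each other, and a partial correlation shows how the association shrinks once z is taken into account. Everything in the snippet is an assumption made for illustration: the variable names, the simulated data, the seed, and the use of a first-order partial correlation as the adjustment.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear influence of z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(42)  # fixed seed so the simulation is repeatable
n = 5000

# z is the hidden third variable; x and y each equal z plus independent noise.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)
r_xy_given_z = partial_r(r_xy, pearson_r(x, z), pearson_r(y, z))

print(f"r(x, y)            = {r_xy:.2f}")
print(f"r(x, y | z fixed)  = {r_xy_given_z:.2f}")
```

The raw correlation between x and y is clearly positive even though neither influences the other, while the partial correlation is close to zero. In a real study the third variable is often unknown or unmeasured, which is why the correlation alone cannot settle the question of cause.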

What are the Differences between Correlational and Experimental Research?  

  • Methodology

The major difference between correlational research and experimental research is methodology. In correlational research, the researcher looks for a statistical pattern linking 2 naturally-occurring variables while in experimental research, the researcher introduces a catalyst and monitors its effects on the variables. 

  • Observation

In correlational research, the researcher passively observes the phenomena and measures whatever relationship that occurs between them. However, in experimental research, the researcher actively observes phenomena after triggering a change in the behavior of the variables. 

In experimental research, the researcher introduces a catalyst and monitors its effects on the variables, that is, cause and effect. In correlational research, the researcher is not interested in cause and effect as it applies; rather, he or she identifies recurring statistical patterns connecting the variables in research. 

  • Number of Variables

Experimental research caters to an unlimited number of variables. Correlational research, on the other hand, caters to only 2 variables.

  • Experimental research is causative while correlational research is relational.
  • Correlational research is preliminary and almost always precedes experimental research. 
  • Unlike correlational research, experimental research allows the researcher to control the variables.

How to Use Online Forms for Correlational Research

One of the most popular methods of conducting correlational research is by carrying out a survey which can be made easier with the use of an online form. Surveys for correlational research involve generating different questions that revolve around the variables under observation and, allowing respondents to provide answers to these questions. 

Using an online form for your correlational research survey would help the researcher to gather more data in minimum time. In addition, the researcher would be able to reach out to more survey respondents than is feasible with printed correlational research survey forms.

In addition, the researcher would be able to swiftly process and analyze all responses in order to objectively establish the statistical pattern that links the variables in the research. Using an online form for correlational research also helps the researcher to minimize the cost incurred during the research period. 

To use an online form for a correlational research survey, you would need to sign up on a data-gathering platform like Formplus. Formplus allows you to create custom forms for correlational research surveys using the Formplus builder.

You can customize your correlational research survey form by adding background images, new color themes or your company logo to make it appear even more professional. In addition, Formplus also has a survey form template that you can edit for a correlational research study. 

You can create different types of survey questions including open-ended questions, rating questions, close-ended questions and multiple answers questions in your survey in the Formplus builder. After creating your correlational research survey, you can share the personalized link with respondents via email or social media.

Formplus also enables you to collect offline responses in your form.

Conclusion 

Correlational research enables researchers to establish the statistical pattern between 2 seemingly interconnected variables; as such, it is often the starting point for further research. It allows you to link 2 variables by observing their behaviors in the most natural state.

Unlike experimental research, correlational research does not emphasize the causative factor affecting 2 variables and this makes the data that results from correlational research subject to constant change. However, it is quicker, easier, less expensive and more convenient than experimental research. 

It is important to always keep the aim of your research in mind when choosing the best type of research to adopt. If you need to observe how the variables react to a change that you introduce, then experimental research is the best type to adopt.

It is best to conduct correlational research using an online correlational research survey form, as this makes the data-gathering process more convenient. Formplus is a great online data-gathering platform that you can use to create custom survey forms for correlational research.


7.2 Correlational Research

Learning Objectives

  • Define correlational research and give several examples.
  • Explain why a researcher might choose to conduct correlational research rather than experimental research or another type of nonexperimental research.

What Is Correlational Research?

Correlational research is a type of nonexperimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are essentially two reasons that researchers interested in statistical relationships between variables would choose to conduct a correlational study rather than an experiment. The first is that they do not believe that the statistical relationship is a causal one. For example, a researcher might evaluate the validity of a brief extraversion test by administering it to a large group of participants along with a longer extraversion test that has already been shown to be valid. This researcher might then check to see whether participants’ scores on the brief test are strongly correlated with their scores on the longer one. Neither test score is thought to cause the other, so there is no independent variable to manipulate. In fact, the terms independent variable and dependent variable do not apply to this kind of research.

The other reason that researchers would choose to use a correlational study rather than an experiment is that the statistical relationship of interest is thought to be causal, but the researcher cannot manipulate the independent variable because it is impossible, impractical, or unethical. For example, Allen Kanner and his colleagues thought that the number of “daily hassles” (e.g., rude salespeople, heavy traffic) that people experience affects the number of physical and psychological symptoms they have (Kanner, Coyne, Schaefer, & Lazarus, 1981). But because they could not manipulate the number of daily hassles their participants experienced, they had to settle for measuring the number of daily hassles—along with the number of symptoms—using self-report questionnaires. Although the strong positive relationship they found between these two variables is consistent with their idea that hassles cause symptoms, it is also consistent with the idea that symptoms cause hassles or that some third variable (e.g., neuroticism) causes both.

A common misconception among beginning researchers is that correlational research must involve two quantitative variables, such as scores on two extraversion tests or the number of hassles and number of symptoms people have experienced. However, the defining feature of correlational research is that the two variables are measured—neither one is manipulated—and this is true regardless of whether the variables are quantitative or categorical. Imagine, for example, that a researcher administers the Rosenberg Self-Esteem Scale to 50 American college students and 50 Japanese college students. Although this “feels” like a between-subjects experiment, it is a correlational study because the researcher did not manipulate the students’ nationalities. The same is true of the study by Cacioppo and Petty comparing college faculty and factory workers in terms of their need for cognition. It is a correlational study because the researchers did not manipulate the participants’ occupations.
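To make the point about categorical variables concrete, the sketch below codes membership in one of two groups as 0 or 1 and computes Pearson's r against a quantitative score (this special case is usually called the point-biserial correlation). The group labels and scores are invented for illustration and are not data from any study mentioned here; `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Categorical variable: group membership, coded 0 (group A) or 1 (group B).
# It is measured, not assigned by the researcher.
group = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
# Quantitative variable: invented questionnaire scores for the same ten people.
score = [20, 22, 19, 24, 21, 25, 27, 24, 28, 26]

scores_a = [s for g, s in zip(group, score) if g == 0]
scores_b = [s for g, s in zip(group, score) if g == 1]
mean_a = sum(scores_a) / len(scores_a)
mean_b = sum(scores_b) / len(scores_b)

r = pearson_r(group, score)
print(f"mean of group A = {mean_a:.1f}, mean of group B = {mean_b:.1f}")
print(f"point-biserial r = {r:.2f}")
```

A positive r here simply restates that the group coded 1 has the higher mean score. Because group membership was observed rather than manipulated, the analysis is correlational however the numbers are summarized.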

Figure 7.2 “Results of a Hypothetical Study on Whether People Who Make Daily To-Do Lists Experience Less Stress Than People Who Do Not Make Such Lists” shows data from a hypothetical study on the relationship between whether people make a daily list of things to do (a “to-do list”) and stress. Notice that it is unclear whether this is an experiment or a correlational study because it is unclear whether the independent variable was manipulated. If the researcher randomly assigned some participants to make daily to-do lists and others not to, then it is an experiment. If the researcher simply asked participants whether they made daily to-do lists, then it is a correlational study. The distinction is important because if the study was an experiment, then it could be concluded that making the daily to-do lists reduced participants’ stress. But if it was a correlational study, it could only be concluded that these variables are statistically related. Perhaps being stressed has a negative effect on people’s ability to plan ahead (the directionality problem). Or perhaps people who are more conscientious are more likely to make to-do lists and less likely to be stressed (the third-variable problem). The crucial point is that what defines a study as experimental or correlational is not the variables being studied, nor whether the variables are quantitative or categorical, nor the type of graph or statistics used to analyze the data. It is how the study is conducted.

Figure 7.2 Results of a Hypothetical Study on Whether People Who Make Daily To-Do Lists Experience Less Stress Than People Who Do Not Make Such Lists

Data Collection in Correlational Research

Again, the defining feature of correlational research is that neither variable is manipulated. It does not matter how or where the variables are measured. A researcher could have participants come to a laboratory to complete a computerized backward digit span task and a computerized risky decision-making task and then assess the relationship between participants’ scores on the two tasks. Or a researcher could go to a shopping mall to ask people about their attitudes toward the environment and their shopping habits and then assess the relationship between these two variables. Both of these studies would be correlational because no independent variable is manipulated. However, because some approaches to data collection are strongly associated with correlational research, it makes sense to discuss them here. The two we will focus on are naturalistic observation and archival data. A third, survey research, is discussed in its own chapter.

Naturalistic Observation

Naturalistic observation is an approach to data collection that involves observing people’s behavior in the environment in which it typically occurs. Thus naturalistic observation is a type of field research (as opposed to a type of laboratory research). It could involve observing shoppers in a grocery store, children on a school playground, or psychiatric inpatients in their wards. Researchers engaged in naturalistic observation usually make their observations as unobtrusively as possible so that participants are often not aware that they are being studied. Ethically, this is considered to be acceptable if the participants remain anonymous and the behavior occurs in a public setting where people would not normally have an expectation of privacy. Grocery shoppers putting items into their shopping carts, for example, are engaged in public behavior that is easily observable by store employees and other shoppers. For this reason, most researchers would consider it ethically acceptable to observe them for a study. On the other hand, one of the arguments against the ethicality of the naturalistic observation of “bathroom behavior” discussed earlier in the book is that people have a reasonable expectation of privacy even in a public restroom and that this expectation was violated.

Researchers Robert Levine and Ara Norenzayan used naturalistic observation to study differences in the “pace of life” across countries (Levine & Norenzayan, 1999). One of their measures involved observing pedestrians in a large city to see how long it took them to walk 60 feet. They found that people in some countries walked reliably faster than people in other countries. For example, people in the United States and Japan covered 60 feet in about 12 seconds on average, while people in Brazil and Romania took close to 17 seconds.

Because naturalistic observation takes place in the complex and even chaotic “real world,” there are two closely related issues that researchers must deal with before collecting data. The first is sampling. When, where, and under what conditions will the observations be made, and who exactly will be observed? Levine and Norenzayan described their sampling process as follows:

Male and female walking speed over a distance of 60 feet was measured in at least two locations in main downtown areas in each city. Measurements were taken during main business hours on clear summer days. All locations were flat, unobstructed, had broad sidewalks, and were sufficiently uncrowded to allow pedestrians to move at potentially maximum speeds. To control for the effects of socializing, only pedestrians walking alone were used. Children, individuals with obvious physical handicaps, and window-shoppers were not timed. Thirty-five men and 35 women were timed in most cities. (p. 186)

Precise specification of the sampling process in this way makes data collection manageable for the observers, and it also provides some control over important extraneous variables. For example, by making their observations on clear summer days in all countries, Levine and Norenzayan controlled for effects of the weather on people’s walking speeds.

The second issue is measurement. What specific behaviors will be observed? In Levine and Norenzayan’s study, measurement was relatively straightforward. They simply measured out a 60-foot distance along a city sidewalk and then used a stopwatch to time participants as they walked over that distance. Often, however, the behaviors of interest are not so obvious or objective. For example, researchers Robert Kraut and Robert Johnston wanted to study bowlers’ reactions to their shots, both when they were facing the pins and then when they turned toward their companions (Kraut & Johnston, 1979). But what “reactions” should they observe? Based on previous research and their own pilot testing, Kraut and Johnston created a list of reactions that included “closed smile,” “open smile,” “laugh,” “neutral face,” “look down,” “look away,” and “face cover” (covering one’s face with one’s hands). The observers committed this list to memory and then practiced by coding the reactions of bowlers who had been videotaped. During the actual study, the observers spoke into an audio recorder, describing the reactions they observed. Among the most interesting results of this study was that bowlers rarely smiled while they still faced the pins. They were much more likely to smile after they turned toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication.


Naturalistic observation has revealed that bowlers tend to smile when they turn away from the pins and toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication.

sieneke toering – bowling big lebowski style – CC BY-NC-ND 2.0.

When the observations require a judgment on the part of the observers, as in Kraut and Johnston’s study, this process is often described as coding. Coding generally requires clearly defining a set of target behaviors. The observers then categorize participants individually in terms of which behavior they have engaged in and the number of times they engaged in each behavior. The observers might even record the duration of each behavior. The target behaviors must be defined in such a way that different observers code them in the same way. This is the issue of interrater reliability. Researchers are expected to demonstrate the interrater reliability of their coding procedure by having multiple raters code the same behaviors independently and then showing that the different observers are in close agreement. Kraut and Johnston, for example, video recorded a subset of their participants’ reactions and had two observers independently code them. The two observers agreed on the reactions that were exhibited 97% of the time, indicating good interrater reliability.
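As a rough illustration of the agreement figure described above, the following sketch computes simple percent agreement between two observers. The codes are invented for the example (they reuse the category names from Kraut and Johnston's list but are not data from the study), and the function name `percent_agreement` is hypothetical.

```python
def percent_agreement(codes_a, codes_b):
    """Share of observations that two observers coded identically."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two non-empty code lists of equal length")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical codes for ten bowlers (illustration only, not study data).
observer_1 = ["open smile", "neutral face", "look down", "laugh", "closed smile",
              "neutral face", "look away", "open smile", "face cover", "neutral face"]
observer_2 = ["open smile", "neutral face", "look down", "laugh", "open smile",
              "neutral face", "look away", "open smile", "face cover", "neutral face"]

agreement = percent_agreement(observer_1, observer_2)
print(f"{agreement:.0%}")  # the observers differ on one of ten bowlers: 90%
```

The two lists must be in the same order, one entry per observed participant, so that each pair of codes refers to the same behavior.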

Archival Data

Another approach to correlational research is the use of archival data, which are data that have already been collected for some other purpose. An example is a study by Brett Pelham and his colleagues on “implicit egotism,” the tendency for people to prefer people, places, and things that are similar to themselves (Pelham, Carvallo, & Jones, 2005). In one study, they examined Social Security records to show that women with the names Virginia, Georgia, Louise, and Florence were especially likely to have moved to the states of Virginia, Georgia, Louisiana, and Florida, respectively.

As with naturalistic observation, measurement can be more or less straightforward when working with archival data. For example, counting the number of people named Virginia who live in various states based on Social Security records is relatively straightforward. But consider a study by Christopher Peterson and his colleagues on the relationship between optimism and health using data that had been collected many years before for a study on adult development (Peterson, Seligman, & Vaillant, 1988). In the 1940s, healthy male college students had completed an open-ended questionnaire about difficult wartime experiences. In the late 1980s, Peterson and his colleagues reviewed the men’s questionnaire responses to obtain a measure of explanatory style—their habitual ways of explaining bad events that happen to them. More pessimistic people tend to blame themselves and expect long-term negative consequences that affect many aspects of their lives, while more optimistic people tend to blame outside forces and expect limited negative consequences. To obtain a measure of explanatory style for each participant, the researchers used a procedure in which all negative events mentioned in the questionnaire responses, and any causal explanations for them, were identified and written on index cards. These were given to a separate group of raters who rated each explanation in terms of three separate dimensions of optimism-pessimism. These ratings were then averaged to produce an explanatory style score for each participant. The researchers then assessed the statistical relationship between the men’s explanatory style as college students and archival measures of their health at approximately 60 years of age. The primary result was that the more optimistic the men were as college students, the healthier they were as older men. Pearson’s r was +.25.

This is an example of content analysis, a family of systematic approaches to measurement using complex archival data. Just as naturalistic observation requires specifying the behaviors of interest and then noting them as they occur, content analysis requires specifying keywords, phrases, or ideas and then finding all occurrences of them in the data. These occurrences can then be counted, timed (e.g., the amount of time devoted to entertainment topics on the nightly news show), or analyzed in a variety of other ways.
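The counting step of a content analysis can be sketched in a few lines. This is only a minimal illustration, assuming single-word target terms and an invented response; it is not the rating procedure Peterson and his colleagues used, and the function name `count_terms` is hypothetical.

```python
import re
from collections import Counter


def count_terms(text, terms):
    """Count whole-word, case-insensitive occurrences of each target term."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {term: counts[term.lower()] for term in terms}


# Hypothetical questionnaire response and target terms (illustration only).
response = "It was my fault. I always fail, and I will always fail at everything."
print(count_terms(response, ["always", "fault", "never"]))
# {'always': 2, 'fault': 1, 'never': 0}
```

Multi-word phrases or broader ideas would need a different matching rule, and in practice often need human raters.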

Key Takeaways

  • Correlational research involves measuring two variables and assessing the relationship between them, with no manipulation of an independent variable.
  • Correlational research is not defined by where or how the data are collected. However, some approaches to data collection are strongly associated with correlational research. These include naturalistic observation (in which researchers observe people’s behavior in the context in which it normally occurs) and the use of archival data that were already collected for some other purpose.

Discussion: For each of the following, decide whether it is most likely that the study described is experimental or correlational and explain why.

  • An educational researcher compares the academic performance of students from the “rich” side of town with that of students from the “poor” side of town.
  • A cognitive psychologist compares the ability of people to recall words that they were instructed to “read” with their ability to recall words that they were instructed to “imagine.”
  • A manager studies the correlation between new employees’ college grade point averages and their first-year performance reports.
  • An automotive engineer installs different stick shifts in a new car prototype, each time asking several people to rate how comfortable the stick shift feels.
  • A food scientist studies the relationship between the temperature inside people’s refrigerators and the amount of bacteria on their food.
  • A social psychologist tells some research participants that they need to hurry over to the next building to complete a study. She tells others that they can take their time. Then she observes whether they stop to help a research assistant who is pretending to be hurt.

Kanner, A. D., Coyne, J. C., Schaefer, C., & Lazarus, R. S. (1981). Comparison of two modes of stress measurement: Daily hassles and uplifts versus major life events. Journal of Behavioral Medicine, 4, 1–39.

Kraut, R. E., & Johnston, R. E. (1979). Social and emotional messages of smiling: An ethological approach. Journal of Personality and Social Psychology, 37, 1539–1553.

Levine, R. V., & Norenzayan, A. (1999). The pace of life in 31 countries. Journal of Cross-Cultural Psychology, 30, 178–205.

Pelham, B. W., Carvallo, M., & Jones, J. T. (2005). Implicit egotism. Current Directions in Psychological Science, 14, 106–110.

Peterson, C., Seligman, M. E. P., & Vaillant, G. E. (1988). Pessimistic explanatory style is a risk factor for physical illness: A thirty-five year longitudinal study. Journal of Personality and Social Psychology, 55, 23–27.

Research Methods in Psychology Copyright © 2016 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.

6.2 Correlational Research

Learning objectives.

  • Define correlational research and give several examples.
  • Explain why a researcher might choose to conduct correlational research rather than experimental research or another type of non-experimental research.
  • Interpret the strength and direction of different correlation coefficients.
  • Explain why correlation does not imply causation.

What Is Correlational Research?

Correlational research is a type of non-experimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are many reasons that researchers interested in statistical relationships between variables would choose to conduct a correlational study rather than an experiment. The first is that they do not believe that the statistical relationship is a causal one or are not interested in causal relationships. Recall that two goals of science are to describe and to predict; the correlational research strategy allows researchers to achieve both of these goals. Specifically, this strategy can be used to describe the strength and direction of the relationship between two variables, and if there is a relationship between the variables, researchers can use scores on one variable to predict scores on the other (using a statistical technique called regression).

Another reason that researchers would choose to use a correlational study rather than an experiment is that the statistical relationship of interest is thought to be causal, but the researcher cannot manipulate the independent variable because it is impossible, impractical, or unethical. For example, while I might be interested in the relationship between how frequently people use cannabis and their memory abilities, I cannot ethically manipulate how frequently people use cannabis. As such, I must rely on the correlational research strategy; I must simply measure how frequently people use cannabis, measure their memory abilities using a standardized test of memory, and then determine whether the frequency of cannabis use is statistically related to memory test performance.

Correlation is also used to establish the reliability and validity of measurements. For example, a researcher might evaluate the validity of a brief extraversion test by administering it to a large group of participants along with a longer extraversion test that has already been shown to be valid. This researcher might then check to see whether participants’ scores on the brief test are strongly correlated with their scores on the longer one. Neither test score is thought to cause the other, so there is no independent variable to manipulate. In fact, the terms independent variable and dependent variable do not apply to this kind of research.

Another strength of correlational research is that it is often higher in external validity than experimental research. Recall that there is typically a trade-off between internal validity and external validity. As greater controls are added to experiments, internal validity is increased but often at the expense of external validity. In contrast, correlational studies typically have low internal validity because nothing is manipulated or controlled, but they often have high external validity. Since nothing is manipulated or controlled by the experimenter, the results are more likely to reflect relationships that exist in the real world.

Finally, extending upon this trade-off between internal and external validity, correlational research can help to provide converging evidence for a theory. If a theory is supported by a true experiment that is high in internal validity as well as by a correlational study that is high in external validity, then the researchers can have more confidence in the validity of their theory. As a concrete example, correlational studies establishing that there is a relationship between watching violent television and aggressive behavior have been complemented by experimental studies confirming that the relationship is a causal one (Bushman & Huesmann, 2001) [1]. These converging results provide strong evidence that there is a real relationship (indeed a causal relationship) between watching violent television and aggressive behavior.

Data Collection in Correlational Research

Again, the defining feature of correlational research is that neither variable is manipulated. It does not matter how or where the variables are measured. A researcher could have participants come to a laboratory to complete a computerized backward digit span task and a computerized risky decision-making task and then assess the relationship between participants’ scores on the two tasks. Or a researcher could go to a shopping mall to ask people about their attitudes toward the environment and their shopping habits and then assess the relationship between these two variables. Both of these studies would be correlational because no independent variable is manipulated. 

Correlations Between Quantitative Variables

Correlations between quantitative variables are often presented using scatterplots. Figure 6.3 shows some hypothetical data on the relationship between the amount of stress people are under and the number of physical symptoms they have. Each point in the scatterplot represents one person’s score on both variables. For example, the circled point in Figure 6.3 represents a person whose stress score was 10 and who had three physical symptoms. Taking all the points into account, one can see that people under more stress tend to have more physical symptoms. This is a good example of a positive relationship, in which higher scores on one variable tend to be associated with higher scores on the other. A negative relationship is one in which higher scores on one variable tend to be associated with lower scores on the other. There is a negative relationship between stress and immune system functioning, for example, because higher stress is associated with lower immune system functioning.


Figure 6.3 Scatterplot Showing a Hypothetical Positive Relationship Between Stress and Number of Physical Symptoms. The circled point represents a person whose stress score was 10 and who had three physical symptoms. Pearson’s r for these data is +.51.

The strength of a correlation between quantitative variables is typically measured using a statistic called Pearson’s Correlation Coefficient (or Pearson’s r). As Figure 6.4 shows, Pearson’s r ranges from −1.00 (the strongest possible negative relationship) to +1.00 (the strongest possible positive relationship). A value of 0 means there is no linear relationship between the two variables. When Pearson’s r is 0, the points on a scatterplot form a shapeless “cloud.” As its value moves toward −1.00 or +1.00, the points come closer and closer to falling on a single straight line. Correlation coefficients near ±.10 are considered small, values near ±.30 are considered medium, and values near ±.50 are considered large. Notice that the sign of Pearson’s r is unrelated to its strength. Pearson’s r values of +.30 and −.30, for example, are equally strong; it is just that one represents a moderate positive relationship and the other a moderate negative relationship. With the exception of reliability coefficients, most correlations that we find in psychology are small or moderate in size. The website http://rpsychologist.com/d3/correlation/, created by Kristoffer Magnusson, provides an excellent interactive visualization of correlations that permits you to adjust the strength and direction of a correlation while witnessing the corresponding changes to the scatterplot.


Figure 6.4 Range of Pearson’s r, From −1.00 (Strongest Possible Negative Relationship), Through 0 (No Relationship), to +1.00 (Strongest Possible Positive Relationship)
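To make the statistic concrete, here is a minimal sketch of how Pearson’s r can be computed from paired scores using its standard definition: the sum of the products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The scores are invented for illustration and are not the data behind the figures; the function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r for two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (how the variables vary together).
    dev_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (how much each variable varies on its own).
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # Note: a variable with no variation at all makes this division fail.
    return dev_products / sqrt(ss_x * ss_y)


# Hypothetical scores for eight people (illustration only).
stress = [2, 4, 5, 7, 8, 10, 12, 15]
symptoms = [1, 3, 2, 5, 3, 3, 6, 7]

r = pearson_r(stress, symptoms)
print(f"r = {r:+.2f}")  # the sign gives the direction, the size gives the strength
```

Reversing the direction of one variable (for example, scoring it so that high values mean low stress) would flip the sign of r without changing its absolute value.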

There are two common situations in which the value of Pearson’s r can be misleading. Pearson’s r is a good measure only for linear relationships, in which the points are best approximated by a straight line. It is not a good measure for nonlinear relationships, in which the points are better approximated by a curved line. Figure 6.5, for example, shows a hypothetical relationship between the amount of sleep people get per night and their level of depression. In this example, the line that best approximates the points is a curve (a kind of upside-down “U”) because people who get about eight hours of sleep tend to be the least depressed. Those who get too little sleep and those who get too much sleep tend to be more depressed. Even though Figure 6.5 shows a fairly strong relationship between depression and sleep, Pearson’s r would be close to zero because the points in the scatterplot are not well fit by a single straight line. This means that it is important to make a scatterplot and confirm that a relationship is approximately linear before using Pearson’s r. Nonlinear relationships are fairly common in psychology, but measuring their strength is beyond the scope of this book.


Figure 6.5 Hypothetical Nonlinear Relationship Between Sleep and Depression

The other common situation in which the value of Pearson’s r can be misleading is when one or both of the variables have a limited range in the sample relative to the population. This problem is referred to as restriction of range. Assume, for example, that there is a strong negative correlation between people’s age and their enjoyment of hip hop music as shown by the scatterplot in Figure 6.6. Pearson’s r here is −.77. However, if we were to collect data only from 18- to 24-year-olds (represented by the shaded area of Figure 6.6), then the relationship would seem to be quite weak. In fact, Pearson’s r for this restricted range of ages is 0. It is a good idea, therefore, to design studies to avoid restriction of range. For example, if age is one of your primary variables, then you can plan to collect data from people of a wide range of ages. Because restriction of range is not always anticipated or easily avoidable, however, it is good practice to examine your data for possible restriction of range and to interpret Pearson’s r in light of it. (There are also statistical methods to correct Pearson’s r for restriction of range, but they are beyond the scope of this book.)


Figure 6.6 Hypothetical Data Showing How a Strong Overall Correlation Can Appear to Be Weak When One Variable Has a Restricted Range. The overall correlation here is −.77, but the correlation for the 18- to 24-year-olds (in the blue box) is 0.

Correlation Does Not Imply Causation

You have probably heard repeatedly that “Correlation does not imply causation.” An amusing example of this comes from a 2012 study that showed a positive correlation (Pearson’s r = 0.79) between the per capita chocolate consumption of a nation and the number of Nobel prizes awarded to citizens of that nation [2]. It seems clear, however, that this does not mean that eating chocolate causes people to win Nobel prizes, and it would not make sense to try to increase the number of Nobel prizes won by recommending that parents feed their children more chocolate.

There are two reasons that correlation does not imply causation. The first is called the directionality problem. Two variables, X and Y, can be statistically related because X causes Y or because Y causes X. Consider, for example, a study showing that whether or not people exercise is statistically related to how happy they are, such that people who exercise are happier on average than people who do not. This statistical relationship is consistent with the idea that exercising causes happiness, but it is also consistent with the idea that happiness causes exercise. Perhaps being happy gives people more energy or leads them to seek opportunities to socialize with others by going to the gym. The second reason that correlation does not imply causation is called the third-variable problem. Two variables, X and Y, can be statistically related not because X causes Y, or because Y causes X, but because some third variable, Z, causes both X and Y. For example, the fact that nations that have won more Nobel prizes tend to have higher chocolate consumption probably reflects geography in that European countries tend to have higher rates of per capita chocolate consumption and invest more in education and technology (once again, per capita) than many other countries in the world. Similarly, the statistical relationship between exercise and happiness could mean that some third variable, such as physical health, causes both of the others. Being physically healthy could cause people to exercise and cause them to be happier. Correlations that are a result of a third variable are often referred to as spurious correlations.

Some excellent and funny examples of spurious correlations can be found at http://www.tylervigen.com (Figure 6.7 provides one such example).

Figure 6.7 Example of a Spurious Correlation. Source: http://tylervigen.com/spurious-correlations (CC-BY 4.0)

“Lots of Candy Could Lead to Violence”

Although researchers in psychology know that correlation does not imply causation, many journalists do not. One website about correlation and causation, http://jonathan.mueller.faculty.noctrl.edu/100/correlation_or_causation.htm , links to dozens of media reports about real biomedical and psychological research. Many of the headlines suggest that a causal relationship has been demonstrated when a careful reading of the articles shows that it has not because of the directionality and third-variable problems.

One such article is about a study showing that children who ate candy every day were more likely than other children to be arrested for a violent offense later in life. But could candy really “lead to” violence, as the headline suggests? What alternative explanations can you think of for this statistical relationship? How could the headline be rewritten so that it is not misleading?

As you have learned by reading this book, there are various ways that researchers address the directionality and third-variable problems. The most effective is to conduct an experiment. For example, instead of simply measuring how much people exercise, a researcher could bring people into a laboratory and randomly assign half of them to run on a treadmill for 15 minutes and the rest to sit on a couch for 15 minutes. Although this seems like a minor change to the research design, it is extremely important. Now if the exercisers end up in more positive moods than those who did not exercise, it cannot be because their moods affected how much they exercised (because it was the researcher who determined how much they exercised). Likewise, it cannot be because some third variable (e.g., physical health) affected both how much they exercised and what mood they were in (because, again, it was the researcher who determined how much they exercised). Thus experiments eliminate the directionality and third-variable problems and allow researchers to draw firm conclusions about causal relationships.
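The logic of random assignment can also be sketched as a simulation. The model below is purely hypothetical: mood is made to depend on physical health only, so exercise has no effect on mood at all. The two calls differ only in who decides whether a person exercises, and the function name `mood_gap` is invented for the example.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed so the run is repeatable
n = 4000


def mood_gap(assign):
    """Mean mood of exercisers minus mean mood of non-exercisers."""
    exercisers, others = [], []
    for _ in range(n):
        health = rng.gauss(0, 1)
        mood = health + rng.gauss(0, 1)  # mood depends on health only
        (exercisers if assign(health) else others).append(mood)
    return mean(exercisers) - mean(others)


# Correlational study: healthier people tend to choose to exercise.
self_selected = mood_gap(lambda health: health + rng.gauss(0, 1) > 0)

# Experiment: a coin flip decides who exercises, regardless of health.
randomized = mood_gap(lambda health: rng.random() < 0.5)

print(f"self-selected exercise: mood gap = {self_selected:+.2f}")
print(f"random assignment:      mood gap = {randomized:+.2f}")
```

When people choose for themselves, exercisers look happier because health drives both variables. When the researcher assigns exercise at random, health is balanced across the groups on average and the gap shrinks toward zero, which is the correct answer in this simulated world.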

Key Takeaways

  • Correlational research involves measuring two variables and assessing the relationship between them, with no manipulation of an independent variable.
  • Correlation does not imply causation. A statistical relationship between two variables, X and Y, does not necessarily mean that X causes Y. It is also possible that Y causes X, or that a third variable, Z, causes both X and Y.
  • While correlational research cannot be used to establish causal relationships between variables, correlational research does allow researchers to achieve many other important objectives (establishing reliability and validity, providing converging evidence, describing relationships and making predictions).
  • Correlation coefficients can range from −1 to +1. The sign indicates the direction of the relationship between the variables and the numerical value indicates the strength of the relationship.

1. Discussion: For each of the following, decide whether it is most likely that the study described is experimental or correlational and explain why.

  • A cognitive psychologist compares the ability of people to recall words that they were instructed to “read” with their ability to recall words that they were instructed to “imagine.”
  • A manager studies the correlation between new employees’ college grade point averages and their first-year performance reports.
  • An automotive engineer installs different stick shifts in a new car prototype, each time asking several people to rate how comfortable the stick shift feels.
  • A food scientist studies the relationship between the temperature inside people’s refrigerators and the amount of bacteria on their food.
  • A social psychologist tells some research participants that they need to hurry over to the next building to complete a study. She tells others that they can take their time. Then she observes whether they stop to help a research assistant who is pretending to be hurt.

2. Practice: For each of the following statistical relationships, decide whether the directionality problem is present and think of at least one plausible third variable.

  • People who eat more lobster tend to live longer.
  • People who exercise more tend to weigh less.
  • College students who drink more alcohol tend to have poorer grades.

[1] Bushman, B. J., & Huesmann, L. R. (2001). Effects of televised violence on aggression. In D. Singer & J. Singer (Eds.), Handbook of children and the media (pp. 223–254). Thousand Oaks, CA: Sage.
[2] Messerli, F. H. (2012). Chocolate consumption, cognitive function, and Nobel laureates. New England Journal of Medicine, 367, 1562–1564.


Non-Experimental Research

29 Correlational Research

Learning objectives.

  • Define correlational research and give several examples.
  • Explain why a researcher might choose to conduct correlational research rather than experimental research or another type of non-experimental research.
  • Interpret the strength and direction of different correlation coefficients.
  • Explain why correlation does not imply causation.

What Is Correlational Research?

Correlational research is a type of non-experimental research in which the researcher measures two variables (binary or continuous) and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are many reasons that researchers interested in statistical relationships between variables would choose to conduct a correlational study rather than an experiment. The first is that they do not believe that the statistical relationship is a causal one or are not interested in causal relationships. Recall that two goals of science are to describe and to predict; the correlational research strategy allows researchers to achieve both of these goals. Specifically, this strategy can be used to describe the strength and direction of the relationship between two variables, and if there is a relationship between the variables, researchers can use scores on one variable to predict scores on the other (using a statistical technique called regression, which is discussed further in the section on Complex Correlation in this chapter).

Another reason that researchers would choose to use a correlational study rather than an experiment is that the statistical relationship of interest is thought to be causal, but the researcher cannot manipulate the independent variable because it is impossible, impractical, or unethical. For example, while a researcher might be interested in the relationship between how frequently people use cannabis and their memory abilities, they cannot ethically manipulate how frequently people use cannabis. As such, they must rely on the correlational research strategy; they must simply measure how frequently people use cannabis, measure their memory abilities using a standardized test of memory, and then determine whether the frequency of cannabis use is statistically related to memory test performance.

Correlation is also used to establish the reliability and validity of measurements. For example, a researcher might evaluate the validity of a brief extraversion test by administering it to a large group of participants along with a longer extraversion test that has already been shown to be valid. This researcher might then check to see whether participants’ scores on the brief test are strongly correlated with their scores on the longer one. Neither test score is thought to cause the other, so there is no independent variable to manipulate. In fact, the terms independent variable and dependent variable do not apply to this kind of research.

Another strength of correlational research is that it is often higher in external validity than experimental research. Recall there is typically a trade-off between internal validity and external validity. As greater controls are added to experiments, internal validity is increased but often at the expense of external validity as artificial conditions are introduced that do not exist in reality. In contrast, correlational studies typically have low internal validity because nothing is manipulated or controlled but they often have high external validity. Since nothing is manipulated or controlled by the experimenter the results are more likely to reflect relationships that exist in the real world.

Finally, extending upon this trade-off between internal and external validity, correlational research can help to provide converging evidence for a theory. If a theory is supported by a true experiment that is high in internal validity as well as by a correlational study that is high in external validity then the researchers can have more confidence in the validity of their theory. As a concrete example, correlational studies establishing that there is a relationship between watching violent television and aggressive behavior have been complemented by experimental studies confirming that the relationship is a causal one (Bushman & Huesmann, 2001) [1] .

Does Correlational Research Always Involve Quantitative Variables?

A common misconception among beginning researchers is that correlational research must involve two quantitative variables, such as scores on two extraversion tests or the number of daily hassles and number of symptoms people have experienced. However, the defining feature of correlational research is that the two variables are measured—neither one is manipulated—and this is true regardless of whether the variables are quantitative or categorical. Imagine, for example, that a researcher administers the Rosenberg Self-Esteem Scale to 50 American college students and 50 Japanese college students. Although this “feels” like a between-subjects experiment, it is a correlational study because the researcher did not manipulate the students’ nationalities. The same is true of the study by Cacioppo and Petty comparing college faculty and factory workers in terms of their need for cognition. It is a correlational study because the researchers did not manipulate the participants’ occupations.

Figure 6.2 shows data from a hypothetical study on the relationship between whether people make a daily list of things to do (a “to-do list”) and stress. Notice that it is unclear whether this is an experiment or a correlational study because it is unclear whether the independent variable was manipulated. If the researcher randomly assigned some participants to make daily to-do lists and others not to, then it is an experiment. If the researcher simply asked participants whether they made daily to-do lists, then it is a correlational study. The distinction is important because if the study was an experiment, then it could be concluded that making the daily to-do lists reduced participants’ stress. But if it was a correlational study, it could only be concluded that these variables are statistically related. Perhaps being stressed has a negative effect on people’s ability to plan ahead (the directionality problem). Or perhaps people who are more conscientious are more likely to make to-do lists and less likely to be stressed (the third-variable problem). The crucial point is that what defines a study as experimental or correlational is not the variables being studied, nor whether the variables are quantitative or categorical, nor the type of graph or statistics used to analyze the data. What defines a study is how the study is conducted.


Data Collection in Correlational Research

Again, the defining feature of correlational research is that neither variable is manipulated. It does not matter how or where the variables are measured. A researcher could have participants come to a laboratory to complete a computerized backward digit span task and a computerized risky decision-making task and then assess the relationship between participants’ scores on the two tasks. Or a researcher could go to a shopping mall to ask people about their attitudes toward the environment and their shopping habits and then assess the relationship between these two variables. Both of these studies would be correlational because no independent variable is manipulated. 

Correlations Between Quantitative Variables

Correlations between quantitative variables are often presented using scatterplots . Figure 6.3 shows some hypothetical data on the relationship between the amount of stress people are under and the number of physical symptoms they have. Each point in the scatterplot represents one person’s score on both variables. For example, the circled point in Figure 6.3 represents a person whose stress score was 10 and who had three physical symptoms. Taking all the points into account, one can see that people under more stress tend to have more physical symptoms. This is a good example of a positive relationship , in which higher scores on one variable tend to be associated with higher scores on the other. In other words, they move in the same direction, either both up or both down. A negative relationship is one in which higher scores on one variable tend to be associated with lower scores on the other. In other words, they move in opposite directions. There is a negative relationship between stress and immune system functioning, for example, because higher stress is associated with lower immune system functioning.

Figure 6.3 Scatterplot Showing a Hypothetical Positive Relationship Between Stress and Number of Physical Symptoms

The strength of a correlation between quantitative variables is typically measured using a statistic called Pearson’s Correlation Coefficient (or Pearson’s r). As Figure 6.4 shows, Pearson’s r ranges from −1.00 (the strongest possible negative relationship) to +1.00 (the strongest possible positive relationship). A value of 0 means there is no linear relationship between the two variables. When Pearson’s r is 0, the points on a scatterplot form a shapeless “cloud.” As its value moves toward −1.00 or +1.00, the points come closer and closer to falling on a single straight line. Correlation coefficients near ±.10 are considered small, values near ±.30 are considered medium, and values near ±.50 are considered large. Notice that the sign of Pearson’s r is unrelated to its strength. Pearson’s r values of +.30 and −.30, for example, are equally strong; it is just that one represents a moderate positive relationship and the other a moderate negative relationship. With the exception of reliability coefficients, most correlations that we find in psychology are small or moderate in size. The website http://rpsychologist.com/d3/correlation/, created by Kristoffer Magnusson, provides an excellent interactive visualization of correlations that permits you to adjust the strength and direction of a correlation while witnessing the corresponding changes to the scatterplot.

Figure 6.4 Range of Pearson’s r, From −1.00 (Strongest Possible Negative Relationship), Through 0 (No Relationship), to +1.00 (Strongest Possible Positive Relationship)

There are two common situations in which the value of Pearson’s  r  can be misleading. Pearson’s  r  is a good measure only for linear relationships, in which the points are best approximated by a straight line. It is not a good measure for nonlinear relationships, in which the points are better approximated by a curved line. Figure 6.5, for example, shows a hypothetical relationship between the amount of sleep people get per night and their level of depression. In this example, the line that best approximates the points is a curve—a kind of upside-down “U”—because people who get about eight hours of sleep tend to be the least depressed. Those who get too little sleep and those who get too much sleep tend to be more depressed. Even though Figure 6.5 shows a fairly strong relationship between depression and sleep, Pearson’s  r  would be close to zero because the points in the scatterplot are not well fit by a single straight line. This means that it is important to make a scatterplot and confirm that a relationship is approximately linear before using Pearson’s  r . Nonlinear relationships are fairly common in psychology, but measuring their strength is beyond the scope of this book.

Figure 6.5 Hypothetical Nonlinear Relationship Between Sleep and Depression

The other common situation in which the value of Pearson’s r can be misleading is when one or both of the variables have a limited range in the sample relative to the population. This problem is referred to as restriction of range. Assume, for example, that there is a strong negative correlation between people’s age and their enjoyment of hip hop music as shown by the scatterplot in Figure 6.6. Pearson’s r here is −.77. However, if we were to collect data only from 18- to 24-year-olds (represented by the shaded area of Figure 6.6), then the relationship would seem to be quite weak. In fact, Pearson’s r for this restricted range of ages is 0. It is a good idea, therefore, to design studies to avoid restriction of range. For example, if age is one of your primary variables, then you can plan to collect data from people of a wide range of ages. Because restriction of range is not always anticipated or easily avoidable, however, it is good practice to examine your data for possible restriction of range and to interpret Pearson’s r in light of it. (There are also statistical methods to correct Pearson’s r for restriction of range, but they are beyond the scope of this book.)

Figure 6.6 Hypothetical Data Showing How a Strong Overall Correlation Can Appear to Be Weak When One Variable Has a Restricted Range

Correlation Does Not Imply Causation

You have probably heard repeatedly that “Correlation does not imply causation.” An amusing example of this comes from a 2012 study that showed a positive correlation (Pearson’s r = 0.79) between the per capita chocolate consumption of a nation and the number of Nobel prizes awarded to citizens of that nation [2] . It seems clear, however, that this does not mean that eating chocolate causes people to win Nobel prizes, and it would not make sense to try to increase the number of Nobel prizes won by recommending that parents feed their children more chocolate.

There are two reasons that correlation does not imply causation. The first is called the directionality problem. Two variables, X and Y, can be statistically related because X causes Y or because Y causes X. Consider, for example, a study showing that whether or not people exercise is statistically related to how happy they are, such that people who exercise are happier on average than people who do not. This statistical relationship is consistent with the idea that exercising causes happiness, but it is also consistent with the idea that happiness causes exercise. Perhaps being happy gives people more energy or leads them to seek opportunities to socialize with others by going to the gym. The second reason that correlation does not imply causation is called the third-variable problem. Two variables, X and Y, can be statistically related not because X causes Y, or because Y causes X, but because some third variable, Z, causes both X and Y. For example, the fact that nations that have won more Nobel prizes tend to have higher chocolate consumption probably reflects geography in that European countries tend to have higher rates of per capita chocolate consumption and invest more in education and technology (once again, per capita) than many other countries in the world. Similarly, the statistical relationship between exercise and happiness could mean that some third variable, such as physical health, causes both of the others. Being physically healthy could cause people to exercise and cause them to be happier. Correlations that are a result of a third variable are often referred to as spurious correlations.

Some excellent and amusing examples of spurious correlations can be found at http://www.tylervigen.com (Figure 6.7 provides one such example).


“Lots of Candy Could Lead to Violence”

Although researchers in psychology know that correlation does not imply causation, many journalists do not. One website about correlation and causation, http://jonathan.mueller.faculty.noctrl.edu/100/correlation_or_causation.htm , links to dozens of media reports about real biomedical and psychological research. Many of the headlines suggest that a causal relationship has been demonstrated when a careful reading of the articles shows that it has not because of the directionality and third-variable problems.

One such article is about a study showing that children who ate candy every day were more likely than other children to be arrested for a violent offense later in life. But could candy really “lead to” violence, as the headline suggests? What alternative explanations can you think of for this statistical relationship? How could the headline be rewritten so that it is not misleading?

As you have learned by reading this book, there are various ways that researchers address the directionality and third-variable problems. The most effective is to conduct an experiment. For example, instead of simply measuring how much people exercise, a researcher could bring people into a laboratory and randomly assign half of them to run on a treadmill for 15 minutes and the rest to sit on a couch for 15 minutes. Although this seems like a minor change to the research design, it is extremely important. Now if the exercisers end up in more positive moods than those who did not exercise, it cannot be because their moods affected how much they exercised (because it was the researcher who used random assignment to determine how much they exercised). Likewise, it cannot be because some third variable (e.g., physical health) affected both how much they exercised and what mood they were in. Thus experiments eliminate the directionality and third-variable problems and allow researchers to draw firm conclusions about causal relationships.

Media Attributions

  • Nicholas Cage and Pool Drownings © Tyler Vigen is licensed under a CC BY (Attribution) license

[1] Bushman, B. J., & Huesmann, L. R. (2001). Effects of televised violence on aggression. In D. Singer & J. Singer (Eds.), Handbook of children and the media (pp. 223–254). Thousand Oaks, CA: Sage.
[2] Messerli, F. H. (2012). Chocolate consumption, cognitive function, and Nobel laureates. New England Journal of Medicine, 367, 1562–1564.

Scatterplot: A graph that presents correlations between two quantitative variables, one on the x-axis and one on the y-axis. Scores are plotted at the intersection of the values on each axis.

Positive relationship: A relationship in which higher scores on one variable tend to be associated with higher scores on the other.

Negative relationship: A relationship in which higher scores on one variable tend to be associated with lower scores on the other.

Pearson’s Correlation Coefficient (or Pearson’s r): A statistic that measures the strength of a correlation between quantitative variables.

Restriction of range: When one or both variables have a limited range in the sample relative to the population, making the value of the correlation coefficient misleading.

Directionality problem: The problem where two variables, X and Y, are statistically related either because X causes Y, or because Y causes X, and thus the causal direction of the effect cannot be known.

Third-variable problem: Two variables, X and Y, can be statistically related not because X causes Y, or because Y causes X, but because some third variable, Z, causes both X and Y.

Spurious correlations: Correlations that are a result not of the two variables being measured, but rather because of a third, unmeasured, variable that affects both of the measured variables.

Research Methods in Psychology Copyright © 2019 by Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler, & Dana C. Leighton is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

  • Front Psychol

How Accurate Is Your Correlation? Different Methods Derive Different Results and Different Interpretations

1 Department of Foreign Languages, Hangzhou Dianzi University, Hangzhou, China

Majid Elahi Shirvan

2 Department of Foreign Languages, University of Bojnord, Bojnord, Iran

Abdullah Alamer

3 Department of English, Imam Mohammad Ibn Saud Islamic University, Hofuf, Saudi Arabia

4 Department of English, King Faisal University, Al-Ahsa, Saudi Arabia

Associated Data

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation. Requests to access the datasets should be directed to AA, [email protected] .

Assessing the association between conceptual constructs is at the heart of quantitative research in education and psychology. Researchers apply different methods to the data to obtain results about the correlation between a set of variables. However, the question remains: how accurate are the correlations obtained from these methods? Although various considerations should be taken to ensure accurate results, we focus on the types of analysis researchers apply to the data and discuss three methods most researchers use to obtain results about correlation. Particularly, we show how correlation results in bivariate correlation, confirmatory factor analysis (CFA), and exploratory structural equation modeling (ESEM) differ substantially in size. We observe that methods that assume independence of the items often generate inflated factor correlations whereas methods that relax this assumption present uninflated, thus more accurate correlations. Because factor correlations are inflated in bivariate correlation and CFA, the discriminant validity of the constructs is often unattainable. In these methods, the size of the correlation can be very large and biased. We discuss the reasons for this variation and suggest the type of correlation that researchers should select and report.

Introduction

Understanding the association between theoretical constructs is at the heart of quantitative research. Researchers use correlation to understand how two or more variables are associated. Note that correlation does not imply causality, especially when it is applied to cross-sectional data (Alamer and Lee, 2021). Beyond this, in first-generation analyses of correlations, which mainly involved bivariate correlation, the average or sum of the items’ scores (the manifest score) is used to represent the given construct or dimension in the assessment. However, as noted by Marsh et al. (2009), the dimensionality assumption that each item belongs to only one factor can inflate the magnitude of the correlation between variables. This limitation paved the way for the emergence of second-generation methods of correlation based on structural equation modeling (SEM), such as confirmatory factor analysis (CFA) and exploratory structural equation modeling (ESEM). Researchers can obtain correlations between latent variables in both CFA and ESEM, but empirical studies have highlighted significant differences between the two methods, which we explain in this study. We present empirical evidence that different methods can generate distinct correlation results, which eventually might change the interpretation of the results.

Literature Review

Measuring the Correlation Between Variables: First-Generation Methods

The relationship between variables is usually obtained by assessing how the measures or scales that represent the variables are correlated. Analysts rarely use single items to represent a complex phenomenon because single items cannot appropriately capture the complexity inherent in theoretical concepts (Dörnyei and Taguchi, 2009). Researchers utilize measurement scales to get details about the constructs under investigation. Typically, a few worded items (usually from three to ten) targeting a particular concept are used. In first-generation analyses (such as bivariate correlation, regression, and the t-test), these items are combined by averaging or summing their scores. This process is needed for such methods because it allows analysts to use one overall score that represents the construct in the analysis. Researchers then repeat this process for all subscales involved in the assessment. Obtaining total scores (manifest scores) of the items allows quantitative researchers to use correlation analysis (among other first-generation analyses). Nonetheless, Marsh et al. (2009) explain that manifest scores are derived from the assumption that items reflect only a single construct; this assumption can substantially inflate the sizes of correlations between the variables (the relationship between the items and their factor is discussed further in the subsequent section). Drawing on the same issue, Haenlein and Kaplan (2004) described the limitations of using first-generation techniques to examine correlation: they (i) postulate a simple model structure, (ii) require all variables to be observable (or to be obtained by averaging or summing item scores), and (iii) assume all variables are measured without measurement error. These issues have an unavoidable impact on the quality of the results of correlation (among other analyses).

Second-Generation Methods

Beyond bivariate correlation, researchers have started to endorse second-generation methods (Hair et al., 2022) that are built on structural equation modeling (SEM) to assess the associations between variables. Among these methods are confirmatory factor analysis (CFA) and exploratory structural equation modeling (ESEM) [see Alamer and Marsh (2022) and Alamer (2022b) for details and applied examples about ESEM]. CFA is a method that is used to understand the underlying factor structure of the constructs (Marsh et al., 2009; Morin et al., 2016). CFA has gained popularity in the field of second language acquisition (SLA) in the last few decades because it uses the advantages of SEM, a key feature that exploratory factor analysis (EFA) lacks. Because it builds on SEM functionality, CFA is able to provide goodness-of-fit indices, examine competing model specifications, correlate items’ error terms (when theory and analysis support that), and assess between-group measurement invariance. In fact, the label “exploratory” only appears to have been used for EFA after the invention of CFA (previously EFA was just called “factor analysis”) (Marsh et al., 2005). However, the label “exploratory” in EFA does not really imply that it should only be used for exploratory purposes; its statistical limitations are what prevented analysts from getting deeper results from EFA. For instance, in its basic form, EFA cannot generate goodness-of-fit indices, be used in a predictive model, or be tested for invariance across different groups of participants.

One key feature of CFA is that items load only on the factors they are hypothesized to load on. Thus, cross-loadings on other, untargeted factors are not allowed in CFA and are constrained to be zero. Early literature on EFA and CFA (Jöreskog, 1973) made the assumption that factors should be anchored in distinctive clusters of observed variables to constitute the latent variable. Nevertheless, this restrictive system of the measurement model has been challenged in the last decade (Marsh et al., 2009; Guay et al., 2015; Morin et al., 2016, 2020; Alamer, 2021a, 2022b; Alamer and Marsh, 2022). This is because conceptual constructs have certain levels of similarity; they can overlap, especially when they are conceptually related. Consider, for example, the measurement model reported in the Alamer and Marsh (2022) study, where two constructs, harmonious passion and obsessive passion, were involved in the analysis. To provide context to the example, harmonious passion reflects the strong desire to freely engage in language activity, whereas obsessive passion reflects controlled pressure combined with an uncontrollable urge to partake in the language activity. In the L2-Passion scale (Alamer and Marsh, 2022), an item in harmonious passion reads “the new things that I discover in English allow me to appreciate it even more” while an item in obsessive passion reads “learning English is the only thing that really turns me on.” One can see how these two items belong to two different types of passion, but each item also seems to carry significant true-score variance on the other (untargeted) type of passion. If the item on harmonious passion had no role at all to play in contributing to the meaning of the other factor, then why would the factor correlation be relatively high? Such an inflated factor correlation may be the result of the overly restrictive independent cluster representation of CFA.

With these observations in mind, why do analysts still prefer CFA even though it often produces unacceptable results both in the fit indices and factor correlation? Marsh et al. (2009) provide an answer to this question as they explain that “because of the recent dominance of CFA approaches to factor analysis, applied researchers have persisted with dubious approaches to CFA in the mistaken believe that EFA approaches were no longer acceptable. These misconceptions have been reinforced by the erroneous beliefs that many of the methodological advances associated with CFAs… are not possible when latent constructs are inferred on the basis of EFAs rather than CFAs” (p. 441).

Alternatively, research shows that certain levels of true scores can be relevant for other conceptually related constructs (Guay et al., 2015; Morin et al., 2016, 2020). This observation was confirmed by Alamer and Marsh (2022), who found that CFA fit indices were not acceptable for the L2-Passion scale. The researchers noted that CFA imposed a rather restrictive structure in that it ignores cross-loadings among the items of two conceptually related constructs, which, in turn, resulted in an inflated factor correlation. To solve this, the researchers applied the recently developed method, ESEM (explained next). They found that ESEM fit the measurement model better and provided an uninflated factor correlation.

Exploratory Structural Equation Modeling as an Alternative Method of Correlation

So, why does ESEM outperform CFA in empirical studies? In essence, ESEM shares a fundamental property with EFA in that both methods allow items to cross-load. However, they differ in that ESEM builds directly on SEM (as CFA does). Hence, all SEM features used in CFA have been successfully transferred (or brought back) to EFA. As research has shown (Marsh et al., 2009; Morin et al., 2020; Alamer, 2021b), conceptual as well as empirical evidence is in favor of allowing cross-loadings to be estimated, particularly when conceptually related constructs are involved in the measurement model. When cross-loadings are allowed to be estimated, the factor correlation appears to be unbiased (even when cross-loadings are very small), and model fit indices improve substantially (Marsh et al., 2020). Accordingly, and most importantly, the correlation obtained from ESEM is deemed more realistic and reflects a more accurate correlation magnitude in the population (Alamer, 2022b). We present readers with an applied example of the correlations generated from bivariate correlation, CFA, and ESEM. Figure 1 represents visually the differences in correlation between the three methods.

Figure 1. Visual representation of the correlation in bivariate correlation, CFA, and ESEM.
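The mechanism by which fixing cross-loadings to zero can inflate a factor correlation can be sketched with simple arithmetic. The sketch below is not the authors' analysis and does not estimate any model. It assumes hypothetical standardized loadings and a two-factor model with unit factor variances, computes the item correlation implied when small cross-loadings are present, and then asks what factor correlation would be needed to reproduce that same item correlation if the cross-loadings were forced to zero. Holding the target loadings fixed is a deliberate simplification; a real CFA would re-estimate every parameter.

```python
# Hypothetical standardized values, chosen only to illustrate the mechanism.
target_loading = 0.70   # loading of each item on its own factor
cross_loading = 0.15    # small loading on the other (non-target) factor
phi_esem = 0.50         # factor correlation when cross-loadings are free

def implied_item_correlation(l_i, c_i, l_j, c_j, phi):
    """Model-implied correlation between item i (targets factor 1) and
    item j (targets factor 2) in a two-factor model with unit factor variances."""
    # item i = l_i*F1 + c_i*F2 + error;  item j = c_j*F1 + l_j*F2 + error
    return l_i * c_j + l_i * l_j * phi + c_i * c_j * phi + c_i * l_j

r_items = implied_item_correlation(
    target_loading, cross_loading, target_loading, cross_loading, phi_esem
)

# CFA-style pattern: cross-loadings fixed to zero, so the same item
# correlation must be carried entirely by the factor correlation.
# Simplification: target loadings are held fixed, which a real CFA
# would not do (it would re-estimate every parameter).
phi_cfa = r_items / (target_loading * target_loading)

print(f"implied item correlation:           {r_items:.3f}")
print(f"factor correlation, free loadings:  {phi_esem:.2f}")
print(f"factor correlation, zero loadings:  {phi_cfa:.2f}")
```

Under these assumptions, even modest cross-loadings leave a substantial share of the item correlation unexplained once they are set to zero, and the factor correlation has to rise to absorb it. This is one way to read the contrast between the CFA and ESEM columns that Figure 1 depicts.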

Example From Real Data Using Self-Determination Theory in Second Language Scale

As a macro theory of motivation, self-determination theory (SDT) has been used in several life domains to examine what motivates individuals to follow their goals (Deci and Ryan, 2000). The theory posits the existence of two general types of motivation, autonomous motivation and controlled motivation, each with two sub-types of regulation. Autonomous motivation refers to the quality of individuals’ motivation being volitional. The first regulation under autonomous motivation is intrinsic regulation, which represents language learners’ inherent inclination toward carrying out the language tasks. The second, identified regulation, refers to the value and importance language learners attach to doing language tasks (Alamer and Lee, 2019; Alamer and Al Khateeb, 2021). On the other hand, two regulations, “introjected regulation” and “external regulation,” constitute the overarching construct of controlled motivation (Ryan and Deci, 2020). Introjected regulation refers to the inner and outer pressure that individuals experience to undertake learning activities. External regulation reflects the desire to learn and engage in language activity because of tangible and intangible rewards or the avoidance of punishment (Alamer and Almulhim, 2021).

To make our discussion about correlation concise, the example in this methodological paper reports only on the correlation between two variables, intrinsic regulation and identified regulation, under the global construct “autonomous motivation.” We draw on the empirical results reported in the study of Alamer (2022a), which tested the construct validity of the self-determination theory in second language (SDT-L2) scale (readers are referred to that study for more details about the data). The author assessed the factorial structure of the constructs via ESEM. He found support for the bifactor ESEM model over bifactor CFA in goodness-of-fit indices and meaningful factor loadings in both the specific and general factors. Among the four constructs of the SDT-L2 scale, two constructs, intrinsic regulation and identified regulation, are examined in the present study (see the Appendix for scale items). Each construct has 5 items (10 items in total) that are based on a 5-point Likert-type response format. The bivariate correlation between the variables reported in that study was r = 0.69, p < 0.001. The Alamer (2022a) study did not include the standard CFA and ESEM but only the bifactor solutions; thus we extend it by reporting the correlations from standard CFA and ESEM using the same dataset (readers are referred to that study for more details about the descriptive statistics and the sample). After running the analysis in Mplus 8.1, we found that CFA and ESEM resulted in distinct sizes of correlation (CFA r = 0.82, p < 0.001; ESEM r = 0.51, p < 0.001). Table 1 describes the differences between the three methods. Although the fit indices are not the focus of our discussion, we report them accordingly (CFA: χ² = 116.93, df = 34, p < 0.001, SRMR = 0.05, RMSEA = 0.11, RMSEA 95% CI [0.09, 0.13], CFI = 0.92, TLI = 0.90; ESEM: χ² = 81.72, df = 26, p < 0.001, SRMR = 0.03, RMSEA = 0.09, RMSEA 95% CI [0.09, 0.12], CFI = 0.95, TLI = 0.92).

Table 1. Factor correlations obtained from bivariate correlation, CFA, and ESEM.

Bivariate correlation: r = 0.69; CFA: r = 0.82; ESEM: r = 0.51.

All correlations are significant at p < 0.001.
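As a rough illustration of how the three reported correlations map onto effect-size labels, the sketch below applies the field-specific benchmarks commonly attributed to Plonsky and Oswald (2014) for correlations in L2 research (roughly r = 0.25 small, 0.40 medium, 0.60 large). The function name and the "below small" label are our own assumptions for illustration, and the benchmarks are treated here as hard cutoffs only for convenience.

```python
# Minimal sketch: label correlation magnitudes with L2-specific benchmarks.
# Assumption: cutoffs of 0.25 / 0.40 / 0.60 (small / medium / large), as commonly
# attributed to Plonsky and Oswald (2014); label_effect_size is a hypothetical helper.

def label_effect_size(r: float) -> str:
    """Return a verbal label for the absolute size of a correlation."""
    size = abs(r)
    if size >= 0.60:
        return "large"
    if size >= 0.40:
        return "medium"
    if size >= 0.25:
        return "small"
    return "below small"

# Correlations between intrinsic and identified regulation reported in the text.
correlations = {"bivariate": 0.69, "CFA": 0.82, "ESEM": 0.51}
labels = {method: label_effect_size(r) for method, r in correlations.items()}

for method, r in correlations.items():
    print(f"{method}: r = {r:.2f} ({labels[method]})")
```

Under these cutoffs the bivariate and CFA correlations are labeled large and the ESEM correlation medium, in line with the interpretation given in the text.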

The reduced factor correlation between the two variables in the ESEM can be said to reflect a more realistic, and thus more precise, representation of the association between intrinsic regulation and identified regulation. This is because, as noted by Morin et al. (2016, 2020), certain levels of true scores of items on the non-target factors should be expected and accepted in ESEM solutions. If we use the L2 guidelines to interpret our correlations (i.e., Plonsky and Oswald, 2014), we conclude that the bivariate correlation, and particularly the CFA, resulted in correlations that are large in size (very large in CFA), while the correlation in ESEM is reduced substantially to a medium effect size. The small difference between the CFA and bivariate correlations can be attributed to the fact that bivariate correlation aggregates the items of each factor into one summed or averaged score; thus, the results are not likely to be identical. Hence, it is clear that different analyses result in different magnitudes of correlation, and with different magnitudes come distinct interpretations of the results. The weaker correlation between the two regulations in ESEM represents an uninflated and unbiased result because the cross-loadings of their items are taken into account. More specifically, despite the fact that the two variables refer to different types of motivational regulation, they provide significant true scores on each other because both tap into and relate to the general construct “autonomous motivation” (Deci and Ryan, 2000; Alamer, 2021a). The cross-loadings of intrinsic regulation items on identified regulation are supported by the fact that items on intrinsic regulation contributed, albeit weakly, to the meaning of the construct of identified regulation, and vice versa. For example, an item on intrinsic regulation that reads “for the satisfaction I feel when I speak and write in English” cross-loaded on identified regulation [0.14, p > 0.05 as reported in Alamer (2021a)].
This cross-loading, albeit weak, can be said to be meaningful because certain levels of learning satisfaction can also be associated with self-growth and personal value as expressed in identified regulation (readers are referred to the original report for a fuller discussion of the cross-loadings). With such a flexible system, the factor correlation reduces to a more realistic level. Thus, ESEM relaxes the strong assumption of the independent clusters model of CFA, which assumes that all items have zero factor loadings on all factors other than the one they are hypothesized to relate to (Marsh et al., 2020). Consequently, fit indices in ESEM improve substantially compared with CFA. Such improvement may indicate that the factor correlation (among other results) in ESEM better represents the data as well.
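The mechanism described above can be sketched at the population level. The example below is not a CFA or ESEM fit and does not use the study's data; it builds a model-implied covariance matrix for two factors with five items each and computes the correlation between the two sum scores. All numeric values (main loadings of 0.70, cross-loadings of 0.15, a latent correlation of 0.50) and the function name are hypothetical, chosen only to show that ignoring cross-loadings pushes the observed association upward while the latent correlation stays fixed.

```python
import numpy as np

# Minimal sketch under hypothetical values: how item cross-loadings raise the
# correlation between two sum scores even though the latent correlation is fixed.

def composite_correlation(main: float, cross: float, phi: float, k: int = 5) -> float:
    """Correlation between two k-item sum scores implied by a two-factor model."""
    # Loading matrix: rows are items, columns are factors.
    lam = np.zeros((2 * k, 2))
    lam[:k, 0], lam[:k, 1] = main, cross   # items targeting factor 1
    lam[k:, 0], lam[k:, 1] = cross, main   # items targeting factor 2

    phi_mat = np.array([[1.0, phi], [phi, 1.0]])   # latent factor correlation matrix
    common = lam @ phi_mat @ lam.T                  # variance due to the factors
    theta = np.diag(1.0 - np.diag(common))          # unique variances: items have unit variance
    sigma = common + theta                          # model-implied item covariance matrix

    # Weight matrix that sums the k items of each scale.
    w = np.zeros((2 * k, 2))
    w[:k, 0] = 1.0
    w[k:, 1] = 1.0
    cov = w.T @ sigma @ w
    return float(cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1]))

r_clean = composite_correlation(main=0.70, cross=0.00, phi=0.50)
r_cross = composite_correlation(main=0.70, cross=0.15, phi=0.50)
print(f"no cross-loadings:   r = {r_clean:.2f}")
print(f"with cross-loadings: r = {r_cross:.2f}")
```

With these illustrative values, the sum-score correlation is about 0.41 without cross-loadings (attenuated by measurement error) and about 0.67 with them, which exceeds the latent correlation of 0.50. A model that forces the cross-loadings to zero has to absorb that shared item variance somewhere, and the factor correlation is one place it can go.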

Effects of Estimation and Rotation Methods

We want to highlight that using different rotation methods in ESEM may result in slightly different loadings, which might lead to different sizes of correlations. However, the two most commonly used rotations, “Target rotation” and “Oblimin rotation,” are recommended depending on the nature of the investigation [see Morin et al. (2020) for a fuller discussion], and their correlation results are often comparable. Another area that needs to be considered is the estimator used in ESEM. The estimator selected plays a role in estimating the path coefficients and factor correlations. The most common estimator is maximum likelihood (ML), but robust ML (MLR) is better suited when the data do not fully satisfy the normality assumption. Apart from ML and MLR, some estimators make no distributional assumptions about the observed variables and thus better suit ordinal data, such as diagonally weighted least squares (WLSMV, also called DWLS). Simulation studies have noted that WLSMV results in inflated factor correlations compared to MLR when the sample size is modest (N < 200) and the data are relatively non-normal [see Li (2016) for a fuller discussion]. It is recommended that researchers use MLR when normality is not substantially violated and the scale has 5 or more categories (as is common in Likert scale questionnaires), while the WLSMV estimator is justified when the scale has 4 categories or fewer (Shi and Maydeu-Olivares, 2020).
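The estimator recommendation above can be written as a small decision rule. The sketch below is only a restatement of that paragraph, not a feature of Mplus or any R package; the function name and argument names are hypothetical, and the case the paragraph does not cover (5 or more categories with substantially non-normal data) is deliberately left without a recommendation.

```python
# Minimal sketch of the estimator guidance in the paragraph above.
# suggest_estimator and its arguments are hypothetical names for illustration.

def suggest_estimator(n_categories: int, normality_substantially_violated: bool) -> str:
    """Return the estimator suggested by the rule of thumb described in the text."""
    if n_categories <= 4:
        # Few response categories: treat items as ordinal.
        return "WLSMV"
    if not normality_substantially_violated:
        # Five or more categories and roughly normal data: robust ML.
        return "MLR"
    # The text gives no explicit rule for this combination.
    return "no rule stated"

print(suggest_estimator(5, False))  # 5-point Likert items, as in the SDT-L2 example
print(suggest_estimator(4, False))
```

In practice the choice also depends on sample size, since the text notes that WLSMV may inflate factor correlations when N < 200 and the data are relatively non-normal.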

We also note that ESEM has a specification that assumes the co-existence of a global factor, called bifactor ESEM. In bifactor ESEM, specific factors and general factors are specified as orthogonal (Morin et al., 2016; Alamer, 2021b). That is, this type of model requires that correlations between all factors be constrained to zero [see Alamer (2021a) for an application of bifactor models]. Therefore, when the researchers’ goal is to evaluate factor correlations, they should first consider standard ESEM to obtain the correlation results. Then, they can proceed to bifactor models (if theory suggests doing so).

Summary and Recommendations

In this methodological paper, we have discussed three approaches that researchers commonly apply to obtain correlation results. Correlation is one of the most widely used quantitative analyses for understanding how L2 variables are interrelated. Beyond the layman’s belief that the significance test (i.e., whether the p-value is less than 0.05) is the ultimate objective of correlation (Plonsky and Oswald, 2014; Alamer and Lee, 2021), researchers need to select an approach that represents reality in the population as closely as possible. Our results with data from the self-determination theory in second language (SDT-L2) scale, specifically the association between intrinsic regulation and identified regulation, show that analyses that assume independent item loadings (e.g., bivariate correlation and CFA) provided biased factor correlations, thus negatively affecting the interpretation of the results. Looking beyond the correlations reported in the present study, one can find examples in the SLA literature of factor correlations that reach r = 0.90 in CFA (see, for example, Park, 2011), and many other studies report correlations that range between r = 0.70 and r = 0.90. Statistically, r = 0.90 is a very large magnitude and implies that the two factors share 81% of their variance (i.e., they are 81% similar), which empirically detracts from the discriminant validity of the factors. Arguably, it is difficult to assume a distinct meaning for factors when they share such a substantial amount of variance. We suggest that factor correlation should not exceed 0.70 in the measurement model because exceeding this cutoff value indicates that the factors share roughly half or more of their variance (0.70² = 0.49). When two such highly correlated factors are employed in a structural model, the solution is likely to face collinearity issues, which result in biased path coefficients.
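The shared-variance arithmetic behind the suggested cutoff can be checked directly. The sketch below squares each correlation to obtain the proportion of shared variance and, as an added illustration of the collinearity point, computes the variance inflation factor for the simple case of two correlated predictors, VIF = 1 / (1 - r²). The helper names are hypothetical, and the VIF is our own illustration rather than a statistic reported in the study.

```python
# Minimal sketch: shared variance (r squared) and the two-predictor variance
# inflation factor for the correlations discussed in the text.
# shared_variance and two_predictor_vif are hypothetical helper names.

def shared_variance(r: float) -> float:
    """Proportion of variance two variables share, given their correlation."""
    return r ** 2

def two_predictor_vif(r: float) -> float:
    """VIF when a model contains exactly two predictors correlated at r."""
    return 1.0 / (1.0 - r ** 2)

for r in (0.51, 0.69, 0.70, 0.82, 0.90):
    print(
        f"r = {r:.2f}: shared variance = {shared_variance(r):.0%}, "
        f"VIF = {two_predictor_vif(r):.2f}"
    )
```

At r = 0.90 the factors share 81% of their variance, at r = 0.70 they share 49%, and at the ESEM estimate of r = 0.51 they share about 26%.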

This observation also applies to bivariate correlation, albeit at a lower magnitude. Conversely, when ESEM is employed, the correlation results would better support the discriminant validity of the factors and can be said to represent the population more realistically. As such, instead of bivariate correlation or CFA, we suggest that researchers apply ESEM to understand how latent variables are associated. Note that discriminant validity is achieved not only through factor correlation but also through weak cross-loadings of the items on the untargeted factors [see Alamer and Marsh (2022) for more details]. Further, not all ESEM solutions will result in a substantially reduced factor correlation because this depends on the nature of the factors involved in the assessment. Nevertheless, ESEM often results in a reduced factor correlation relative to CFA.

In addition, our review does not cover every way of obtaining correlations between factors; instead, we have discussed and commented on the most widely used methods in the field. For example, researchers may obtain correlations from partial least squares SEM (PLS-SEM) (Hair et al., 2019; see also Henseler, 2020), and the results are likely to differ because the estimation method is different. We also want to highlight that we did not comment on the type of measurement scale or the normality of the data, as each requires particular consideration in the analysis. We instead focused on variables that are perceived as continuous or at least treated as interval, as is the case with the symmetric Likert scale (Hair et al., 2019). At the time of this publication, only Mplus and R can run ESEM, and we hope researchers will adopt this method in their research. A recent study that introduces ESEM to SLA research, with applied examples and the Mplus syntax for publicly available data, has been published (i.e., Alamer and Marsh, 2022). Among its many benefits, we think that ESEM can serve as an alternative analytical tool for understanding precisely how constructs and measures are correlated. L2 quantitative researchers would obtain more accurate results by adopting ESEM in future empirical studies that investigate the association between variables.

Data Availability Statement

Ethics Statement

The studies involving human participants were reviewed and approved by Imam Mohammad Ibn Saud Islamic University (IMSIU). The patients/participants provided their written informed consent to participate in this study.

Author Contributions

AA was responsible for the research design, data collection, and draft writing of this study. ME helped with the theory, literature review, and arrangement of the article. KS helped with the theory, review, and revision of the article. All authors contributed to the article and approved the submitted version.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Self-Determination Theory of Second Language Subscale (SDT-L2)*

Why are you learning English?

*Taken from Alamer (2022a) .

Funding Statement

This research was supported by a grant from the Zhejiang Philosophy and Social Science Foundation, Zhejiang, China, awarded to KS (No: 22NDJC084YB).

References

  • Alamer A. (2021a). Construct validation of self-determination theory in second language scale: the bifactor exploratory structural equation modeling approach. Front. Psychol. 12:732016. doi: 10.3389/fpsyg.2021.732016
  • Alamer A. (2021b). Grit and language learning: construct validation of L2-grit scale and its relation to later vocabulary knowledge. Educ. Psychol. 41, 544–562. doi: 10.1080/01443410.2020.1867076
  • Alamer A. (2022a). Basic psychological needs, motivational orientations, effort, and vocabulary knowledge: a comprehensive model. Stud. Second Lang. Acquis. 44, 164–184. doi: 10.1017/S027226312100005X
  • Alamer A. (2022b). Exploratory structural equation modeling (ESEM) and bifactor ESEM for construct validation purposes: guidelines and applied example. Res. Methods Appl. Linguist. 1, 1–13. doi: 10.1016/j.rmal.2022.100005
  • Alamer A., Al Khateeb A. (2021). Effects of using the WhatsApp application on language learners motivation: a controlled investigation using structural equation modelling. Comput. Assist. Lang. Learn. 1–27. doi: 10.1080/09588221.2021.1903042
  • Alamer A., Almulhim F. (2021). The interrelation between language anxiety and self-determined motivation; a mixed methods approach. Front. Educ. 6:618655. doi: 10.3389/feduc.2021.618655
  • Alamer A., Lee J. (2019). A motivational process model explaining L2 Saudi students’ achievement of English. System 87:102133. doi: 10.1016/j.system.2019.102133
  • Alamer A., Lee J. (2021). Language achievement predicts anxiety and not the other way around: a cross-lagged panel analysis approach. Lang. Teach. Res. 1–22. doi: 10.1177/13621688211033694
  • Alamer A., Marsh H. (2022). Exploratory structural equation modeling in second language research: an applied example using the dualistic model of passion. Stud. Second Lang. Acquis. 1–24. doi: 10.1017/S0272263121000863
  • Deci E. L., Ryan R. M. (2000). The “what” and “why” of goal pursuits: human needs and the self-determination of behavior. Psychol. Inq. 11, 227–268. doi: 10.1207/S15327965PLI1104_01
  • Dörnyei Z., Taguchi T. (2009). Questionnaires in Second Language Research: Construction, Administration, and Processing. Milton Park: Routledge.
  • Guay F., Morin A. J. S., Litalien D., Valois P., Vallerand R. J. (2015). Application of exploratory structural equation modeling to evaluate the academic motivation scale. J. Exp. Educ. 83, 51–82. doi: 10.1080/00220973.2013.876231
  • Haenlein M., Kaplan A. M. (2004). A beginner’s guide to partial least squares analysis. Understand. Stat. 3, 283–297. doi: 10.1207/s15328031us0304_4
  • Hair J. F., Black B., Babin B., Anderson R. E. (2019). Multivariate Data Analysis, 8th Edn. Boston, MA: Cengage Learning.
  • Hair J. F., Hult T., Ringle C. M., Sarstedt M. (2022). A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM), 3rd Edn. New York, NA: Sage.
  • Henseler J. (2020). Composite-Based Structural Equation Modeling: Analyzing Latent and Emergent Variables. New York, NY: Guilford Publications.
  • Jöreskog K. G. (1973). “A general method for estimating a linear structural equation system,” in Structural Equation Models in the Social Sciences, eds Goldberger A. S., Duncan O. D. (New York, NY: Seminar Press), 255–284.
  • Li C. H. (2016). Confirmatory factor analysis with ordinal data: comparing robust maximum likelihood and diagonally weighted least squares. Behav. Res. Methods 48, 936–949. doi: 10.3758/s13428-015-0619-7
  • Marsh H. W., Guo J., Dicke T., Parker P. D., Craven R. G. (2020). Confirmatory factor analysis (CFA), exploratory structural equation modeling (ESEM), and set-ESEM: optimal balance between goodness of fit and parsimony. Multivar. Behav. Res. 55, 102–119. doi: 10.1080/00273171.2019.1602503
  • Marsh H. W., Hau K.-T., Grayson D. (2005). “Goodness of fit in structural equation models,” in Contemporary Psychometrics: A Festschrift for Roderick P. McDonald, eds Maydeu-Olivares A., McArdle J. J. (Mahwah, NJ: Lawrence Erlbaum Associates Publishers), 275–340.
  • Marsh H. W., Muthén B., Asparouhov T., Lüdtke O., Robitzsch A., Morin A. J. S., et al. (2009). Exploratory structural equation modeling, integrating CFA and EFA: application to students’ evaluations of university teaching. Struct. Equ. Model. 16, 439–476. doi: 10.1080/10705510903008220
  • Morin A. J. S., Myers N. D., Lee S. (2020). “Modern factor analytic techniques: bifactor models, exploratory structural equation modeling and bifactor-ESEM,” in Handbook of Sport Psychology, 4th Edn, eds Tenenbaum G., Eklund R. C. (New York, NY: Wiley), 1044–1073. doi: 10.1002/9781119568124.ch51
  • Morin S., Arens K., Tran A., Caci H. (2016). Exploring sources of construct-relevant multidimensionality in psychiatric measurement: a tutorial and illustration using the composite scale of morningness. Int. J. Methods Psychiatr. Res. 25, 277–288. doi: 10.1002/mpr.1485
  • Park G. P. (2011). The validation process of the SILL: a confirmatory factor analysis. English Lang. Teach. 4, 21–27. doi: 10.1093/acprof:oso/9780195339888.003.0002
  • Plonsky L., Oswald F. (2014). How big is “big”? Interpreting effect sizes in L2 research. Lang. Learn. 64, 878–912. doi: 10.1111/lang.12079
  • Ryan R., Deci E. (2020). Intrinsic and extrinsic motivation from a self-determination theory perspective: definitions, theory, practices, and future directions. Contemp. Educ. Psychol. 61:101860. doi: 10.1016/j.cedpsych.2020.101860
  • Shi D., Maydeu-Olivares A. (2020). The effect of estimation methods on SEM fit indices. Educ. Psychol. Meas. 80, 421–445. doi: 10.1177/0013164419885164
  • Financial Institutions and Services
  • General Economics and Teaching
  • Health, Education, and Welfare
  • History of Economic Thought
  • International Economics
  • Labour and Demographic Economics
  • Law and Economics
  • Macroeconomics and Monetary Economics
  • Microeconomics
  • Public Economics
  • Urban, Rural, and Regional Economics
  • Welfare Economics
  • Browse content in Education
  • Adult Education and Continuous Learning
  • Care and Counselling of Students
  • Early Childhood and Elementary Education
  • Educational Equipment and Technology
  • Educational Strategies and Policy
  • Higher and Further Education
  • Organization and Management of Education
  • Philosophy and Theory of Education
  • Schools Studies
  • Secondary Education
  • Teaching of a Specific Subject
  • Teaching of Specific Groups and Special Educational Needs
  • Teaching Skills and Techniques
  • Browse content in Environment
  • Applied Ecology (Social Science)
  • Climate Change
  • Conservation of the Environment (Social Science)
  • Environmentalist Thought and Ideology (Social Science)
  • Natural Disasters (Environment)
  • Social Impact of Environmental Issues (Social Science)
  • Browse content in Human Geography
  • Cultural Geography
  • Economic Geography
  • Political Geography
  • Browse content in Interdisciplinary Studies
  • Communication Studies
  • Museums, Libraries, and Information Sciences
  • Browse content in Politics
  • African Politics
  • Asian Politics
  • Chinese Politics
  • Comparative Politics
  • Conflict Politics
  • Elections and Electoral Studies
  • Environmental Politics
  • European Union
  • Foreign Policy
  • Gender and Politics
  • Human Rights and Politics
  • Indian Politics
  • International Relations
  • International Organization (Politics)
  • International Political Economy
  • Irish Politics
  • Latin American Politics
  • Middle Eastern Politics
  • Political Methodology
  • Political Communication
  • Political Philosophy
  • Political Sociology
  • Political Theory
  • Political Behaviour
  • Political Economy
  • Political Institutions
  • Politics and Law
  • Public Administration
  • Public Policy
  • Quantitative Political Methodology
  • Regional Political Studies
  • Russian Politics
  • Security Studies
  • State and Local Government
  • UK Politics
  • US Politics
  • Browse content in Regional and Area Studies
  • African Studies
  • Asian Studies
  • East Asian Studies
  • Japanese Studies
  • Latin American Studies
  • Middle Eastern Studies
  • Native American Studies
  • Scottish Studies
  • Browse content in Research and Information
  • Research Methods
  • Browse content in Social Work
  • Addictions and Substance Misuse
  • Adoption and Fostering
  • Care of the Elderly
  • Child and Adolescent Social Work
  • Couple and Family Social Work
  • Developmental and Physical Disabilities Social Work
  • Direct Practice and Clinical Social Work
  • Emergency Services
  • Human Behaviour and the Social Environment
  • International and Global Issues in Social Work
  • Mental and Behavioural Health
  • Social Justice and Human Rights
  • Social Policy and Advocacy
  • Social Work and Crime and Justice
  • Social Work Macro Practice
  • Social Work Practice Settings
  • Social Work Research and Evidence-based Practice
  • Welfare and Benefit Systems
  • Browse content in Sociology
  • Childhood Studies
  • Community Development
  • Comparative and Historical Sociology
  • Economic Sociology
  • Gender and Sexuality
  • Gerontology and Ageing
  • Health, Illness, and Medicine
  • Marriage and the Family
  • Migration Studies
  • Occupations, Professions, and Work
  • Organizations
  • Population and Demography
  • Race and Ethnicity
  • Social Theory
  • Social Movements and Social Change
  • Social Research and Statistics
  • Social Stratification, Inequality, and Mobility
  • Sociology of Religion
  • Sociology of Education
  • Sport and Leisure
  • Urban and Rural Studies
  • Browse content in Warfare and Defence
  • Defence Strategy, Planning, and Research
  • Land Forces and Warfare
  • Military Administration
  • Military Life and Institutions
  • Naval Forces and Warfare
  • Other Warfare and Defence Issues
  • Peace Studies and Conflict Resolution
  • Weapons and Equipment

Music Education Research: An Introduction

12 Quantitative Descriptive and Correlational Research

  • Published: February 2023

This chapter presents research designs for descriptive and correlational quantitative research. Descriptive research designs are used to address the question “What is x?” Correlational research designs are used to address the question “How are things related?” In contrast to some experimental research designs, in these design types the primary area of interest under investigation is not manipulated by the researcher. Researchers investigating descriptive or correlational research questions commonly use surveys or observational methods to gather data. Surveys are an efficient method for gathering large amounts of information about such things as individuals’ experiences, beliefs, and attitudes. When designing a survey, researchers must consider many things, such as how long it will be and what it will cover. Observation is an important means of gathering data, as when researchers observe video recordings of teachers or students in various situations. Another approach to observational research is the experience sampling method (ESM). In ESM, participants are interrupted at random times throughout the day and asked to respond to questions concerning their experiences in real time. In other words, researchers ask participants what they are doing at the moment they are contacted.

Quantitative Research Methods

Correlation is the relationship or association between two variables. There are multiple ways to measure correlation, but the most common is Pearson's correlation coefficient (r), which tells you the strength and direction of the linear relationship between two variables. The value of r ranges from -1 to 1, where 0 indicates no linear relationship. Values of r closer to -1 or 1 indicate a stronger relationship, and values closer to 0 indicate a weaker one. Because Pearson's coefficient only picks up on linear relationships, and there are many other ways for variables to be associated, it's always best to plot your variables on a scatter plot so that you can visually inspect them for other types of association.

  • Correlation Penn State University tutorial
  • Correlation and Causation Australian Bureau of Statistics Article
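As a minimal sketch of the point above, the following Python computes Pearson's r from its textbook definition and compares a linear relationship with a curved one. The function name `pearson_r` and the data are hypothetical, chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: x is symmetric around zero
x = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * v + 1 for v in x]   # perfectly linear in x
curved = [v ** 2 for v in x]      # fully determined by x, but not linear

r_linear = pearson_r(x, linear)   # close to 1
r_curved = pearson_r(x, curved)   # close to 0 despite the exact dependence
```

The curved case is why a scatter plot matters: r is near zero even though y is completely determined by x.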

Spurious Relationships

It's important to remember that correlation does not always indicate causation. Two variables can be correlated without either variable causing the other. For instance, ice cream sales and drownings might be correlated, but that doesn't mean that ice cream causes drownings—instead, both ice cream sales and drownings increase when the weather is hot. Relationships like this are called spurious correlations.

  • Spuriousness Harvard Business Review article.
  • New Evidence for Theory of The Stork A satirical article demonstrating the dangers of confusing correlation with causation.
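The ice cream example can be illustrated with a small simulation. This is a sketch under stated assumptions: the numbers are invented, temperature is made the only driver of both variables, and the partial correlation formula is used as one common way to adjust for a third variable.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r, computed from sums of squares around the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def partial_r(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical simulation: hot weather raises both quantities independently
rng = random.Random(0)
temperature = [rng.uniform(10, 35) for _ in range(500)]
ice_cream_sales = [20 * t + rng.gauss(0, 50) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 1) for t in temperature]

raw_r = pearson_r(ice_cream_sales, drownings)                    # strongly positive
adjusted_r = partial_r(ice_cream_sales, drownings, temperature)  # near zero
```

In this simulated setting the raw correlation is strong, while the correlation that remains after accounting for temperature is close to zero.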

Regression is a statistical method for estimating the relationship between two or more variables. In theory, regression can be used to predict the value of one variable (the dependent variable) from the value of one or more other variables (the independent variable/s or predictor/s). There are many different types of regression, depending on the number of variables and the properties of the data that one is working with, and each makes assumptions about the relationship between the variables. (For instance, most types of regression assume that the variables have a linear relationship.) Therefore, it is important to understand the assumptions underlying the type of regression that you use and how to properly interpret its results. Because regression will always output a relationship, whether or not the variables are truly causally associated, it is also important to carefully select your predictor variables.

  • A Refresher on Regression Analysis Harvard Business Review article.
  • Introductory Business Statistics - Regression

Simple Linear Regression

Simple linear regression estimates a linear relationship between one dependent variable and one independent variable.

  • Simple Linear Regression Tutorial Penn State University Tutorial
  • Statistics 101: Linear Regression, The Very Basics YouTube video from Brandon Foltz.
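A minimal sketch of simple linear regression by ordinary least squares, written in plain Python. The function name `fit_line` and the study-hours data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the independent variable must vary")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied (independent) and exam score (dependent)
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
predicted_at_6_hours = intercept + slope * 6
```

The fitted slope is the estimated change in the dependent variable per one-unit change in the independent variable. It describes an association in the data and does not by itself show causation.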

Multiple Linear Regression

Multiple linear regression estimates a linear relationship between one dependent variable and two or more independent variables.

  • Multiple Linear Regression Tutorial Penn State University Tutorial
  • Multiple Regression Basics NYU course materials.
  • Statistics 101: Multiple Linear Regression, The Very Basics YouTube video from Brandon Foltz.
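As a rough sketch, multiple linear regression can be fitted by solving the normal equations. The helper names `solve` and `fit_multiple` and the data are hypothetical. In practice a statistics library would normally be used, since it also reports standard errors and diagnostics.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_multiple(rows, ys):
    """Least squares coefficients [intercept, b1, b2, ...] via (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data generated exactly as y = 1 + 2*x1 + 3*x2
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
outcome = [9, 8, 19, 18, 29]

coefficients = fit_multiple(predictors, outcome)  # approximately [1, 2, 3]
```

Each coefficient estimates the change in the dependent variable per unit change in one predictor while the other predictors are held fixed.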

If you do a subject search for Regression Analysis, you'll see that the library has over 200 books about regression. Note that econometrics texts will often include regression analysis and other related methods.

  • Last Updated: Aug 18, 2023 11:55 AM
  • URL: https://guides.library.duq.edu/quant-methods

The ultimate guide to quantitative data analysis

Numbers help us make sense of the world. We collect quantitative data on our speed and distance as we drive, the number of hours we spend on our cell phones, and how much we save at the grocery store.

Our businesses run on numbers, too. We spend hours poring over key performance indicators (KPIs) like lead-to-client conversions, net profit margins, and bounce and churn rates.

But all of this quantitative data can feel overwhelming and confusing. Lists and spreadsheets of numbers don’t tell you much on their own—you have to conduct quantitative data analysis to understand them and make informed decisions.

This guide explains what quantitative data analysis is and why it’s important, and gives you a four-step process to conduct a quantitative data analysis, so you know exactly what’s happening in your business and what your users need.

What is quantitative data analysis? 

Quantitative data analysis is the process of analyzing and interpreting numerical data. It helps you make sense of information by identifying patterns, trends, and relationships between variables through mathematical calculations and statistical tests. 

With quantitative data analysis, you turn spreadsheets of individual data points into meaningful insights to drive informed decisions. Columns of numbers from an experiment or survey transform into useful insights—like which marketing campaign asset your average customer prefers or which website factors are most closely connected to your bounce rate. 

Without analytics, data is just noise. Analyzing data helps you make decisions which are informed and free from bias.

What quantitative data analysis is not

But as powerful as quantitative data analysis is, it’s not without its limitations. It only gives you the what, not the why. For example, it can tell you how many website visitors or conversions you have on an average day, but it can’t tell you why users visited your site or made a purchase.

For the why behind user behavior, you need qualitative data analysis, a process for making sense of qualitative research like open-ended survey responses, interview clips, or behavioral observations. By analyzing non-numerical data, you gain useful contextual insights to shape your strategy, product, and messaging.

Quantitative data analysis vs. qualitative data analysis 

Let’s take an even deeper dive into the differences between quantitative data analysis and qualitative data analysis to explore what they do and when you need them.

The bottom line: quantitative data analysis and qualitative data analysis are complementary processes. They work hand-in-hand to tell you what’s happening in your business and why.  

💡 Pro tip: easily toggle between quantitative and qualitative data analysis with Hotjar Funnels.

The Funnels tool helps you visualize quantitative metrics like drop-off and conversion rates in your sales or conversion funnel to understand when and where users leave your website. You can break down your data even further to compare conversion performance by user segment.

Spot a potential issue? A single click takes you to relevant session recordings, where you see user behaviors like mouse movements, scrolls, and clicks. With this qualitative data to provide context, you'll better understand what you need to optimize to streamline the user experience (UX) and increase conversions.

Hotjar Funnels lets you quickly explore the story behind the quantitative data
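The drop-off and conversion rates mentioned above are simple ratios between consecutive funnel steps. The sketch below uses invented step names and visitor counts. It is not Hotjar's implementation, only an illustration of the arithmetic.

```python
def funnel_report(steps):
    """Conversion and drop-off rate between each pair of consecutive steps."""
    report = []
    for (name, count), (next_name, next_count) in zip(steps, steps[1:]):
        if count == 0:
            raise ValueError(f"step {name!r} has no visitors")
        conversion = next_count / count
        report.append({
            "from": name,
            "to": next_name,
            "conversion": conversion,
            "drop_off": 1 - conversion,
        })
    return report

# Hypothetical funnel: (step name, number of users who reached it)
steps = [
    ("Landing page", 1000),
    ("Product page", 400),
    ("Cart", 100),
    ("Purchase", 25),
]

report = funnel_report(steps)
overall_conversion = steps[-1][1] / steps[0][1]  # share of visitors who purchase
```

Here the largest drop-off is between the product page and the cart, and between the cart and purchase, which is where session recordings would be worth reviewing first.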

4 benefits of quantitative data analysis

There’s a reason product, web design, and marketing teams take time to analyze metrics: the process pays off big time. 

Four major benefits of quantitative data analysis include:

1. Make confident decisions 

With quantitative data analysis, you know you’ve got data-driven insights to back up your decisions. For example, if you launch a concept testing survey to gauge user reactions to a new logo design, and 92% of users rate it ‘very good’, you'll feel certain when you give the designer the green light.

Since you’re relying less on intuition and more on facts, you reduce the risks of making the wrong decision. (You’ll also find it way easier to get buy-in from team members and stakeholders for your next proposed project. 🙌)

2. Reduce costs

By crunching the numbers, you can spot opportunities to reduce spend. For example, if an ad campaign has lower-than-average click-through rates, you might decide to cut your losses and invest your budget elsewhere.

Or, by analyzing ecommerce metrics, like website traffic by source, you may find you’re getting very little return on investment from a certain social media channel, and scale back spending in that area.

3. Personalize the user experience

Quantitative data analysis helps you map the customer journey, so you get a better sense of customers’ demographics, what page elements they interact with on your site, and where they drop off or convert.

These insights let you better personalize your website, product, or communication, so you can segment ads, emails, and website content for specific user personas or target groups.

4. Improve user satisfaction and delight

Quantitative data analysis lets you see where your website or product is doing well and where it falls short for your users. For example, you might see stellar results from KPIs like time on page, but conversion rates for that page are low.

These quantitative insights encourage you to dive deeper into qualitative data to see why that’s happening—looking for moments of confusion or frustration on session recordings, for example—so you can make adjustments and optimize your conversions by improving customer satisfaction and delight.

💡 Pro tip: use Net Promoter Score® (NPS) surveys to capture quantifiable customer satisfaction data that’s easy for you to analyze and interpret.

With an NPS tool like Hotjar, you can create an on-page survey to ask users how likely they are to recommend you to others on a scale from 0 to 10. (And for added context, you can ask follow-up questions about why customers selected the rating they did—rich qualitative data is always a bonus!)

Hotjar graphs your quantitative NPS data to show changes over time
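For reference, the NPS arithmetic can be sketched in a few lines. The code assumes the usual convention, which the text above does not spell out: ratings of 9 or 10 count as promoters, 0 to 6 as detractors, and 7 or 8 as passives. The ratings are invented.

```python
def net_promoter_score(ratings):
    """NPS = percentage of promoters minus percentage of detractors."""
    if not ratings:
        raise ValueError("no ratings supplied")
    if any(not 0 <= r <= 10 for r in ratings):
        raise ValueError("ratings must be on the 0 to 10 scale")
    promoters = sum(1 for r in ratings if r >= 9)   # assumed cut-off: 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # assumed cut-off: 0 to 6
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical survey responses
ratings = [10, 9, 9, 8, 7, 6, 10, 3, 9, 8]
nps = net_promoter_score(ratings)  # 5 promoters, 2 detractors, 10 responses
```

The score ranges from -100 (all detractors) to 100 (all promoters), so it is a difference of percentages rather than an average rating.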

4 steps to effective quantitative data analysis 

Quantitative data analysis sounds way more intimidating than it actually is. Here’s how to make sense of your company’s numbers in just four steps:

1. Collect data

Before you can actually start the analysis process, you need data to analyze. This involves conducting quantitative research and collecting numerical data from various sources, including: 

  • Interviews or focus groups
  • Website analytics
  • Observations, from tools like heatmaps or session recordings
  • Questionnaires, like surveys or on-page feedback widgets

Just ensure the questions you ask in your surveys are close-ended questions—providing respondents with select choices to choose from instead of open-ended questions that allow for free responses.

Hotjar’s pricing plans survey template provides close-ended questions

2. Clean data

Once you’ve collected your data, it’s time to clean it up. Look through your results to find errors, duplicates, and omissions. Keep an eye out for outliers, too. Outliers are data points that differ significantly from the rest of the set—and they can skew your results if you don’t remove them.

By taking the time to clean your data set, you ensure your data is accurate, consistent, and relevant before it’s time to analyze. 
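A minimal cleaning pass might look like the sketch below. The record layout, the function names, and the 1.5 × IQR rule for flagging outliers are assumptions made for illustration. The guide itself does not prescribe a specific outlier rule.

```python
import statistics

def clean_rows(rows):
    """Drop rows with a missing value and exact duplicate rows, keeping order."""
    seen = set()
    cleaned = []
    for row in rows:
        _, value = row
        if value is None or row in seen:
            continue
        seen.add(row)
        cleaned.append(row)
    return cleaned

def flag_outliers(rows, k=1.5):
    """Return rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [value for _, value in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [row for row in rows if not low <= row[1] <= high]

# Hypothetical records: (user id, seconds on page)
raw = [
    ("u1", 12), ("u2", 14), ("u2", 14), ("u3", None), ("u4", 15),
    ("u5", 16), ("u6", 17), ("u7", 18), ("u8", 95), ("u9", 13),
]

cleaned = clean_rows(raw)          # duplicate "u2" row and the missing value removed
outliers = flag_outliers(cleaned)  # rows to review before deciding what to do
```

The sketch flags outliers rather than deleting them automatically, because an extreme value can be either an error or a genuine observation, and that call usually needs a person.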

3. Analyze and interpret data

At this point, your data’s all cleaned up and ready for the main event. This step involves crunching the numbers to find patterns and trends via mathematical and statistical methods. 

Two main branches of quantitative data analysis exist: 

  • Descriptive analysis: methods to summarize or describe attributes of your data set. For example, you may calculate key stats like distribution and frequency, or mean, median, and mode.
  • Inferential analysis: methods that let you draw conclusions from statistics, like analyzing the relationship between variables or making predictions. These methods include t-tests, cross-tabulation, and factor analysis. (For more detailed explanations and how-tos, head to our guide on quantitative data analysis methods.)

Then, interpret your data to determine the best course of action. What does the data suggest you do? For example, if your analysis shows a strong correlation between email open rate and time sent, you may explore optimal send times for each user segment.
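To make the descriptive measures named above concrete, here is a short sketch using Python's standard library. The data are invented, and the cross-tabulation shown is only the counting step, without any significance test.

```python
from collections import Counter
import statistics

# Hypothetical metric: minutes per session for seven users
session_minutes = [3, 5, 5, 2, 8, 5, 3]

summary = {
    "mean": statistics.mean(session_minutes),
    "median": statistics.median(session_minutes),
    "mode": statistics.mode(session_minutes),
    "frequency": Counter(session_minutes),  # how often each value occurs
}

# Hypothetical survey rows: (device, answered "would you buy again?")
responses = [
    ("mobile", "yes"), ("mobile", "no"), ("desktop", "yes"),
    ("mobile", "yes"), ("desktop", "no"), ("desktop", "yes"),
]

# Cross-tabulation: count of each (device, answer) combination
crosstab = Counter(responses)
```

Inferential methods such as a t-test would then ask whether a difference seen in tables like `crosstab` is larger than chance alone would explain.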

4. Visualize and share data

Once you’ve analyzed and interpreted your data, create easy-to-read, engaging data visualizations—like charts, graphs, and tables—to present your results to team members and stakeholders. Data visualizations highlight similarities and differences between data sets and show the relationships between variables.

Software can do this part for you. For example, the Hotjar Dashboard shows all of your key metrics in one place—and automatically creates bar graphs to show how your top pages’ performance compares. And with just one click, you can navigate to the Trends tool to analyze product metrics for different segments on a single chart. 

Hotjar Trends lets you compare metrics across segments

Discover rich user insights with quantitative data analysis

Conducting quantitative data analysis takes a little bit of time and know-how, but it’s much more manageable than you might think. 

By choosing the right methods and following clear steps, you gain insights into product performance and customer experience, and you’ll be well on your way to making better decisions and creating more customer satisfaction and loyalty.

FAQs about quantitative data analysis

What is quantitative data analysis?

Quantitative data analysis is the process of making sense of numerical data through mathematical calculations and statistical tests. It helps you identify patterns, relationships, and trends to make better decisions.

How is quantitative data analysis different from qualitative data analysis?

Quantitative and qualitative data analysis are both essential processes for making sense of quantitative and qualitative research.

Quantitative data analysis helps you summarize and interpret numerical results from close-ended questions to understand what is happening. Qualitative data analysis helps you summarize and interpret non-numerical results, like opinions or behavior, to understand why the numbers look like they do.

If you want to make strong data-driven decisions, you need both.

What are some benefits of quantitative data analysis?

Quantitative data analysis turns numbers into rich insights. Some benefits of this process include: 

  • Making more confident decisions
  • Identifying ways to cut costs
  • Personalizing the user experience
  • Improving customer satisfaction

What methods can I use to analyze quantitative data?

Quantitative data analysis has two branches: descriptive statistics and inferential statistics. 

Descriptive statistics provide a snapshot of the data’s features by calculating measures like mean, median, and mode. 

Inferential statistics, as the name implies, involve making inferences about what the data means. Dozens of methods exist for this branch of quantitative data analysis, but three commonly used techniques are:

  • T-tests
  • Cross tabulation
  • Factor analysis

ORIGINAL RESEARCH article

The causal effect of adipose tissue on Hodgkin's lymphoma: two-sample Mendelian randomization study and validation

Lihua Wu

  • Fujian Medical University Union Hospital, Fuzhou, China

The final, formatted version of the article will be published soon.

Background: Extensive research has been conducted on the correlation between adipose tissue and the risk of malignant lymphoma. Despite numerous observational studies exploring this connection, uncertainty remains regarding a causal relationship between adipose tissue and malignant lymphoma.

Methods: The increase or decrease in adipose tissue was represented by BMI. The BMI and malignant lymphoma genome-wide association studies (GWAS) used a summary dataset from the OPEN GWAS website. Single-nucleotide polymorphisms (SNPs) that met the criteria of P 0.8 were identified, while palindromic and outlier SNPs were excluded. Mendelian randomization (MR) analysis used five methods: the inverse-variance weighted (IVW) model, weighted median (WM), MR-Egger, simple mode, and weighted mode. Sensitivity assessments included Cochran's Q test, the MR-Egger intercept test, and leave-one-out analysis. Participants randomly selected by the National Center for Health Statistics (NHANES) and newly diagnosed HL patients at Fujian Medical University Union Hospital were used for external validation.

Results: The results of the MR analysis strongly supported the causal link between BMI and Hodgkin's lymphoma (HL). The research demonstrated that individuals with lower BMI face a significantly increased risk of developing HL, with a 91.65% higher risk (OR for IVW = 0.0835, 95% CI 0.0147–0.4733, P = 0.005). No signs of horizontal or directional pleiotropy were observed in the MR studies. The validation results aligned with the results from the MR analysis (OR = 0.871, 95% CI 0.826–0.918, P < 0.001). There was no causal relationship between BMI and non-Hodgkin's lymphoma (NHL).

Conclusions: The MR analysis demonstrated a direct correlation between lower BMI and HL. This suggests that a decrease in adipose tissue increases the risk of developing HL. Nevertheless, further research is essential to comprehensively understand the mechanism underlying this causal association.
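For readers unfamiliar with the inverse-variance weighted (IVW) method named in the abstract, the sketch below shows one common fixed-effect form of it: each SNP gives a Wald ratio (outcome effect divided by exposure effect), and the ratios are averaged with weights equal to the inverse of their approximate variances. This is a generic illustration, not the authors' pipeline. The SNP effect sizes are invented, and the first-order standard error used here is an assumption.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-SNP summary statistics."""
    ratios, weights = [], []
    for bx, by, se_y in zip(beta_exposure, beta_outcome, se_outcome):
        ratio = by / bx              # Wald ratio for this SNP
        se_ratio = se_y / abs(bx)    # first-order approximation (assumption)
        ratios.append(ratio)
        weights.append(1.0 / se_ratio ** 2)
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    return beta, se

# Hypothetical summary statistics for three SNPs
beta_bmi = [0.10, 0.20, 0.05]    # SNP effect on the exposure (BMI)
beta_hl = [-0.24, -0.52, -0.12]  # SNP effect on the outcome (log odds of HL)
se_hl = [0.05, 0.08, 0.04]       # standard errors of the outcome effects

beta_ivw, se_ivw = ivw_estimate(beta_bmi, beta_hl, se_hl)
odds_ratio = math.exp(beta_ivw)                # causal estimate on the OR scale
ci_low = math.exp(beta_ivw - 1.96 * se_ivw)    # approximate 95% CI bounds
ci_high = math.exp(beta_ivw + 1.96 * se_ivw)
```

An odds ratio below 1 from such an analysis means that higher genetically predicted exposure is associated with lower odds of the outcome, which is the direction reported in the abstract.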

Keywords: Hodgkin's lymphoma (HL), non-Hodgkin's lymphoma (NHL), adipose, body mass index (BMI), Mendelian randomization (MR)

Received: 14 Mar 2024; Accepted: 16 May 2024.

Copyright: © 2024 Wu, 廖, Guo and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Nainong Li, Fujian Medical University Union Hospital, Fuzhou, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

  • Open access
  • Published: 07 May 2024

Causal association between low vitamin D and polycystic ovary syndrome: a bidirectional Mendelian randomization study

  • Bingrui Gao 1 ,
  • Chenxi Zhang 1 ,
  • Deping Wang 1 , 2 ,
  • Bojuan Li 1 ,
  • Zhongyan Shan 1 ,
  • Weiping Teng 1 &
  • Jing Li   ORCID: orcid.org/0000-0002-3681-4095 1  

Journal of Ovarian Research volume 17, Article number: 95 (2024)

Recent studies have revealed the correlation between serum vitamin D (VD) level and polycystic ovary syndrome (PCOS), but the causality and specific mechanisms remain uncertain.

We aimed to investigate the cause-effect relationship between serum VD and PCOS, and the role of testosterone in the related pathological mechanisms.

We assessed the causality between serum VD and PCOS by using genome-wide association studies (GWAS) data in a bidirectional two-sample Mendelian randomization (TS-MR) analysis. Subsequently, a MR mediation analysis was conducted to examine the mediating action of testosterone in the causality between serum VD and PCOS. Ultimately, we integrated GWAS data with cis-expression quantitative loci (cis-eQTLs) data for gene annotation, and used the potentially related genes for functional enrichment analysis to assess the involvement of testosterone and the potential mechanisms.

TS-MR analysis showed that individuals with lower level of serum VD were more likely to develop PCOS (OR = 0.750, 95% CI: 0.587–0.959, P  = 0.022). MR mediation analysis uncovered indirect causal effect of serum VD level on the risk of PCOS via testosterone (OR = 0.983, 95% CI: 0.968–0.998, P  = 0.025). Functional enrichment analysis showed that several pathways may be involved in the VD-testosterone-PCOS axis, such as steroid hormone biosynthesis and autophagy process.

Our findings suggest that genetically predicted lower serum VD level may cause a higher risk of developing PCOS, which may be mediated by increased testosterone production.

Introduction

Vitamin D (VD) is an essential fat-soluble steroid hormone that is necessary for calcium-phosphate metabolism, bone homeostasis, cell differentiation, and immune system function. The prevalence of VD deficiency (VDD) in the population has gradually increased over the past few decades. VDD is associated with various diseases, including cardiovascular disease, inflammation, dyslipidemia, weight gain, and infectious diseases [1, 2]. Furthermore, a growing number of studies have indicated a potential link between serum VD status and women's reproductive health. Firstly, the biological function of VD is mediated via intracellular VD receptors (VDRs), which are distributed among various tissues, including the hypothalamus, pituitary, endometrium, and ovary [3, 4]. Secondly, VD participates in regulating genes associated with ovarian and placental functions [5, 6]. Together, this evidence suggests that serum VD plays a potentially significant role in female reproductive health.

Polycystic ovary syndrome (PCOS) is the most common endocrine disorder that affects women of reproductive age, with a reported global incidence of 20–25% [7, 8]. PCOS affects endometrial function and oocyte competence [9, 10], which leads to reproductive dysfunction in PCOS patients, including infertility, miscarriage, and pregnancy complications [11, 12, 13]. However, the exact pathogenesis of PCOS remains unclear. Prior observational studies have reported a correlation between serum VD and the risk of PCOS. A recent study revealed that serum VD concentrations were lower in women diagnosed with PCOS than in body mass index (BMI)-matched controls, suggesting that PCOS is correlated with reduced VD level regardless of BMI [14]. However, these studies can only show that a correlation exists; they cannot clarify causality. In addition, hyperandrogenemia is one of the diagnostic criteria for PCOS and affects 60–80% of patients [15]. Females are actually more sensitive to testosterone even though it is known as a male hormone [16]. Growing evidence shows that testosterone may play an important role between serum VD level and the risk of PCOS. Hahn et al. illustrated an association between serum VD level and the severity of hirsutism in individuals with PCOS [17]. The research conducted by Latic et al. indicates a negative correlation between serum VD level and testosterone production in patients with PCOS [18]. However, a study by Mesinovic et al. suggested no discernible correlation between serum VD level and androgen production in individuals with PCOS [19]. Moreover, a large observational study by Gallea et al. also showed an association between serum VD levels, insulin, and body weight among PCOS patients, but not specifically with hyperandrogenemia [20]. These differing results may arise because observational studies are susceptible to confounding factors and various biases [21]. Therefore, owing to the limitations of the study methodology, it is not clear whether testosterone production mediates the relationship between serum VD level and the risk of PCOS.

In recent years, Mendelian randomization (MR) analysis has been widely used as an epidemiological method in medical research. Firstly, MR analysis can minimize the impact of confounding factors and various biases on the results by simulating randomized controlled trials (RCTs) at the genetic level. Secondly, MR analysis can help determine causality while reducing the impact of reverse causality on the results of the study [22].

Thus, in this study, we first used bidirectional two-sample MR (TS-MR) analysis to investigate the cause-effect relationship between serum VD level and the risk of PCOS. Secondly, we performed a mediation MR analysis to test the mediating role of testosterone production between serum VD level and the risk of PCOS. Finally, we used bioinformatics analysis to assess the possible biological functions and molecular mechanisms between them.

Materials and methods

Study design of mendelian randomization study.

Our study explored the causal effect of serum VD level (the exposure) on the risk of developing PCOS (the outcome), and the effect of testosterone as a mediator between VD and PCOS, through bidirectional TS-MR analysis, multivariable MR (MVMR), and mediation MR analysis (Fig. 1). To ensure validity, the study needed to meet the following three crucial assumptions [23] (Fig. 1C): 1) the relevance (correlation) assumption: instrumental variables (IVs) must be robustly correlated with the exposure; 2) the independence assumption: IVs are not associated with potential confounders of the exposure or the outcome; and 3) the exclusion restriction assumption: IVs do not influence the outcome through pathways other than the exposure. This study followed the STROBE-MR checklist [24] (Table S1).

Fig. 1. Flowchart of the study. (A) Flowchart of the MR study; (B) flowchart of the bioinformatics study; (C) diagram of the MR assumptions of the association between VD and PCOS; (D) illustrative diagram for the mediation MR analysis framework. Abbreviations: MR, mendelian randomization; TS-MR, two-sample MR; VD, vitamin D; PCOS, polycystic ovary syndrome; IVW, inverse variance weighted; BMI, body mass index; FBG, fasting glucose; FI, fasting insulin; MVMR, multivariable MR; BT, bioavailable testosterone; SNPs, single-nucleotide polymorphisms

Data source and IVs selection of mendelian randomization study

We obtained data associated with VD from a large genome-wide association study (GWAS), conducted by Revez et al. in 2020, that identified 143 loci among 417,580 participants [25]. We accessed the summary data related to PCOS from a meta-analysis of the FinnGen and Estonian Biobank (EstBB) cohorts, which included 3609 cases and 229,788 controls [7]. Summary data related to bioavailable testosterone (BT) were obtained from the UK Biobank (UKB). Data on serum fasting glucose (FBG) levels were obtained from a UKB GWAS we conducted in 340,002 British participants [26]. Summary data on circulating concentrations of fasting insulin (FI) were obtained from the MAGIC GWAS, which included 151,013 participants [27]. Pooled data related to BMI were acquired from a GWAS meta-analysis within the GIANT consortium, encompassing 681,275 participants [28]. Details of the GWAS databases are summarized in Table S2.

In the bidirectional TS-MR analysis, single-nucleotide polymorphisms (SNPs) with genome-wide significance (P < 5 × 10⁻⁸) were first selected. These SNPs were matched against the SNP-outcome GWAS database, and SNPs that could not be matched were excluded. To minimize the effects of linkage disequilibrium, we conducted a clumping process with an r² threshold of 0.001 and a clumping window of 10,000 kb and excluded SNPs in linkage disequilibrium. Subsequently, we performed MR-PRESSO analysis to test for significant horizontal pleiotropy and to exclude outlier SNPs [29]. To ensure that the IVs were not affected by confounding variables, we searched PhenoScanner V2 [30] and deleted obesity-related SNPs associated with BMI and waist circumference (WC). Finally, 88 SNPs (VD on PCOS) and 2 SNPs (PCOS on VD) were used as IVs in the primary bidirectional TS-MR study. All SNPs exhibited an F statistic greater than 10. The variance explained by each SNP (R²) was calculated using the widely accepted formula [31, 32]. We used the same method to screen the SNPs required in the MR mediation analysis. All the IV SNPs are summarized in Tables S3–S7.
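The instrument-strength screen described above can be illustrated with a short sketch. The text does not print the formula it used, so this assumes one commonly used approximation, R² ≈ 2 · EAF · (1 − EAF) · β² (with β in standard-deviation units and EAF the effect allele frequency), and F = R²(N − 2)/(1 − R²). The function names and the example SNP values are hypothetical and are not taken from the study's supplementary tables.

```python
def variance_explained(beta, eaf):
    """Approximate R^2 for one SNP (assumed formula): 2 * EAF * (1 - EAF) * beta^2."""
    return 2.0 * eaf * (1.0 - eaf) * beta ** 2


def f_statistic(r2, n):
    """Single-instrument F statistic (assumed formula): R^2 * (N - 2) / (1 - R^2)."""
    return r2 * (n - 2) / (1.0 - r2)


# Hypothetical SNP: per-allele effect of 0.02 SD, effect allele frequency 0.30,
# evaluated at the exposure GWAS sample size quoted in the text (417,580).
r2 = variance_explained(beta=0.02, eaf=0.30)
f = f_statistic(r2, n=417_580)
print(f"R2 = {r2:.6f}, F = {f:.1f}, strong instrument: {f > 10}")
```

An F statistic above 10 is the conventional rule of thumb for treating an instrument as unlikely to suffer from weak instrument bias.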

Statistical analysis of mendelian randomization study

Initially, the primary analysis aimed to explore the causal relationship between VD and PCOS using bidirectional TS-MR analysis. We used Cochran's Q test to assess heterogeneity [33]; if there was no heterogeneity, we used the fixed-effects inverse variance weighted (IVW) method, and otherwise we used the random-effects IVW method [34]. Furthermore, to make the results more robust, and considering that obesity, abnormal insulin levels, and abnormal glucose values are common in patients with PCOS, we adjusted for genetically predicted BMI, FBG, and FI by MVMR to explore the direct causal effect of VD on PCOS.
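The fixed-effects versus random-effects choice described above can be sketched from per-SNP summary statistics. This is a minimal illustration and not the authors' R code: it assumes first-order weights (β_exposure / SE_outcome)², a multiplicative random-effects model in which the standard error is inflated by √(Q/(k − 1)) when that factor exceeds 1, and made-up input values.

```python
import math


def ivw(beta_exp, beta_out, se_out):
    """Inverse-variance weighted MR estimate with Cochran's Q (illustrative sketch)."""
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]        # Wald ratios
    weights = [(be / so) ** 2 for be, so in zip(beta_exp, se_out)]  # first-order weights
    w_sum = sum(weights)
    est = sum(w * r for w, r in zip(weights, ratios)) / w_sum
    q = sum(w * (r - est) ** 2 for w, r in zip(weights, ratios))    # Cochran's Q
    k = len(ratios)
    se_fixed = math.sqrt(1.0 / w_sum)
    # Multiplicative random effects: never shrink the SE below the fixed-effects value.
    inflation = math.sqrt(max(1.0, q / (k - 1))) if k > 1 else 1.0
    return {"estimate": est, "se_fixed": se_fixed,
            "se_random": se_fixed * inflation, "Q": q, "df": k - 1}


# Hypothetical three-SNP example in which every SNP implies the same effect.
result = ivw(beta_exp=[0.10, 0.20, 0.15],
             beta_out=[-0.03, -0.06, -0.045],
             se_out=[0.01, 0.02, 0.015])
print(result)
```

Q is compared with a chi-square distribution on k − 1 degrees of freedom; with no excess heterogeneity the two standard errors coincide, which corresponds to the fixed-effects case used in the primary analysis.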

Secondly, a stepwise MR analysis approach was used to examine whether there exist mediation effects of BT between VD and PCOS. To assess the direct causal effect between VD, BT, and PCOS, we performed an MVMR analysis using the MVMR R package [ 35 ]. Conditional F statistics were calculated for assessing the strength of the genetic instruments in MVMR analysis [ 36 ]. The product of the coefficients method [ 37 ] and the multivariate delta method [ 38 ] were used to calculate the indirect effects of VD on PCOS via mediator.
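As a rough illustration of the product-of-coefficients step, the indirect effect is the exposure-to-mediator estimate multiplied by the mediator-to-outcome estimate, and a first-order delta-method standard error can be attached to it. The sketch below assumes the two estimates are independent and uses the simple (Sobel-type) form of the delta method; the paper's multivariate delta method may differ in detail, and all input numbers are hypothetical.

```python
import math


def indirect_effect(a, se_a, b, se_b):
    """Product-of-coefficients indirect effect with a first-order delta-method SE.

    a: exposure -> mediator estimate; b: mediator -> outcome estimate (log-odds scale).
    Assumes a and b are estimated independently (no covariance term).
    """
    effect = a * b
    se = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    return effect, se


# Hypothetical inputs, for illustration only.
effect, se = indirect_effect(a=-0.05, se_a=0.02, b=0.30, se_b=0.10)
low, high = effect - 1.96 * se, effect + 1.96 * se
print(f"indirect log-OR = {effect:.4f} (95% CI {low:.4f} to {high:.4f})")
print(f"indirect OR = {math.exp(effect):.3f}")
```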

Sensitivity analysis of mendelian randomization study

The following tests were used as sensitivity analyses to assess the robustness of MR effect estimates to invalid genetic variants. Firstly, we applied the MR-Egger regression [39, 40], weighted median [41], and weighted mode [42] methods. MR-Egger regression can detect and account for horizontal pleiotropy, mainly through its intercept test [39, 40]. The weighted median can yield consistent estimates even when up to half of the information comes from invalid IVs [43]. The weighted mode groups SNPs into subsets with similar causal effects and computes the causal estimate from the subset with the largest number of SNPs [42]. Secondly, leave-one-out (LOO) analysis tests whether the results are driven by a single SNP [44]. Thirdly, as described above, we performed MR-PRESSO analysis [29] to identify potential horizontal pleiotropic outliers among the IVs that could bias the results, and we searched the PhenoScanner database [45] to remove obesity-related SNPs associated with BMI and WC.
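Two of these sensitivity checks are easy to sketch from per-SNP Wald ratios. The code below is a simplified illustration rather than the authors' pipeline: the weighted median follows the usual interpolation between the two ratios that straddle the 50% point of the cumulative weight, the leave-one-out step re-estimates a weighted mean with each SNP removed in turn, and the ratios and weights are hypothetical.

```python
def weighted_median(ratios, weights):
    """Weighted median of per-SNP ratio estimates (needs at least two SNPs)."""
    if len(ratios) < 2:
        raise ValueError("need at least two SNPs")
    order = sorted(range(len(ratios)), key=lambda i: ratios[i])
    b = [ratios[i] for i in order]
    w = [weights[i] for i in order]
    total = sum(w)
    cum, running = [], 0.0
    for wi in w:
        running += wi
        cum.append((running - 0.5 * wi) / total)  # mid-point cumulative weight
    below = max(j for j in range(len(cum)) if cum[j] < 0.5)
    if below == len(b) - 1:
        return b[below]
    frac = (0.5 - cum[below]) / (cum[below + 1] - cum[below])
    return b[below] + (b[below + 1] - b[below]) * frac


def leave_one_out(ratios, weights):
    """Weighted-mean estimate recomputed with each SNP dropped in turn."""
    estimates = []
    for i in range(len(ratios)):
        num = sum(w * r for j, (r, w) in enumerate(zip(ratios, weights)) if j != i)
        den = sum(w for j, w in enumerate(weights) if j != i)
        estimates.append(num / den)
    return estimates


# Hypothetical data: three concordant SNPs and one pleiotropic outlier.
ratios = [-0.32, -0.30, -0.28, 1.50]
weights = [1.0, 1.0, 1.0, 1.0]
print("weighted median:", weighted_median(ratios, weights))
print("leave-one-out:  ", leave_one_out(ratios, weights))
```

In this toy example the outlier pulls the plain weighted mean above zero, while the weighted median stays with the concordant SNPs, and the leave-one-out estimates shift sharply only when the outlier is dropped.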

All analyses were conducted using R version 4.2.0 (R Foundation for Statistical Computing, Vienna, Austria). P values below 0.05 were considered significant.

Bioinformatical analysis

We used the largest whole blood expression quantitative trait loci (eQTL) dataset from the eQTLGen consortium, which includes data on cis-eQTLs for 19,250 whole blood expressed genes from 31,684 individuals [46]. We combined the SNP data of VD-PCOS (n-SNP = 90) and VD-BT (n-SNP = 88) with the cis-eQTL data for gene annotation, respectively. Genes with P < 5 × 10⁻⁸ and FDR < 0.05 were screened as potentially relevant genes for VD-PCOS and VD-BT.

Subsequently, we used these potentially relevant genes for bioinformatics analyses, including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. GO analyses [47], covering biological process (BP), molecular function (MF), and cellular component (CC), are commonly used for large-scale functional enrichment studies. KEGG is a database that stores information about genomes, biological pathways, diseases, and drugs. We used the clusterProfiler, org.Hs.eg.db, and enrichplot packages in R to perform GO and KEGG enrichment analyses of the potentially relevant genes. GO entries and KEGG pathways with P < 0.05 were considered significant.
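Over-representation analyses of this kind are typically based on a hypergeometric test: given a background of N genes, of which K belong to a term, how surprising is it to see at least k term members among n selected genes? The sketch below illustrates that calculation in plain Python as an assumption about the underlying statistics, not as a reproduction of the R packages named above. The background size and gene-list size echo numbers quoted in the text, whereas the term size and overlap are hypothetical.

```python
from math import comb


def enrichment_p_value(N, K, n, k):
    """Upper-tail hypergeometric probability P(X >= k).

    N: background genes, K: background genes annotated to the term,
    n: genes in the study list, k: study-list genes annotated to the term.
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical term: 100 of 19,250 background genes, 5 of them among 147 selected genes.
p = enrichment_p_value(N=19_250, K=100, n=147, k=5)
print(f"enrichment P = {p:.3g}")
```

In practice the per-term P values are then adjusted for multiple testing across all terms examined.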

Causal effect between serum vitamin D and polycystic ovary syndrome

In our bidirectional TS-MR analysis, the numbers of IVs for VD on PCOS and PCOS on VD were 90 and 2, respectively. The F statistic of each SNP was greater than 10 (Table S3), indicating that the results were unlikely to be affected by weak instrument bias. The fixed-effects IVW method (Cochran's Q statistic = 81.42, P = 0.704) indicated that a genetically predicted higher level of VD led to a lower risk of developing PCOS after excluding obesity-associated SNPs (n = 90 SNPs, OR = 0.750, 95% CI: 0.587–0.959, P = 0.022) (Table 1). The MR-Egger, weighted median, and weighted mode methods all yielded estimates of similar magnitude and direction to the IVW method (Table 1). The scatter plot demonstrates the inhibitory effect of individual SNPs on PCOS (Fig. S1). Since the MR-Egger intercept P value was greater than 0.05 (Table S8) and the funnel plot (Fig. S2) was roughly symmetrical, there was no indication of horizontal pleiotropy. The LOO analyses indicated that no single SNP drove the main MR results (Fig. S3). The MR-PRESSO test did not identify any outlier SNPs. In contrast, the reverse TS-MR showed that genetically predicted risk of developing PCOS did not affect the VD level (fixed-effects IVW: n = 2 SNPs, OR = 1.004, 95% CI: 0.987–1.022, P = 0.640) (Table 1).
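The reported odds ratio, confidence interval, and P value for the main estimate can be cross-checked against one another. The sketch below assumes a normal approximation on the log-odds scale and a symmetric 95% interval, which is a common convention but is not stated explicitly in the text; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def p_from_or_ci(or_est, ci_low, ci_high, level=0.95):
    """Recover the SE, z score, and two-sided P value from an OR and its CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% CI
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(or_est) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return se, z, p


# Values reported for the effect of VD on PCOS (fixed-effects IVW).
se, z, p = p_from_or_ci(0.750, 0.587, 0.959)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, P = {p:.3f}")
```

Under these assumptions the recovered P value is close to 0.02, in line with the reported P = 0.022.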

We subsequently explored the direct effect of serum VD level on PCOS by MVMR, and the results of both Model 1 (adjusted for BMI) and Model 2 (adjusted for BMI, FBG, and FI) showed that the negative association between serum VD level and the risk of PCOS remained similar (Table 2). This supports the robustness of the TS-MR results.

Mendelian randomization mediation analysis

After excluding the outlier SNPs and obesity-related SNPs, MVMR analysis (adjusted for BT) revealed a direct causal effect of serum VD level (OR: 0.735, 95% CI: 0.552–0.978; P = 0.035) on the risk of developing PCOS (Table 3, Fig. 1D). In the following steps of the MR mediation analysis, we found strong evidence for a causal effect of serum VD level (β: −0.053, P = 0.026) on BT (Table 3). In addition, we found a causal relationship between BT and PCOS (OR: 1.378, 95% CI: 1.123–1.691; P = 0.002) (Table 3).

Taken together, we found a potential mediation pathway between VD and PCOS: an indirect causal effect of VD on PCOS risk via BT (θ3 × θ4) (OR: 0.983, 95% CI: 0.968–0.998; P = 0.025) (Table 3). The pathway mediated 5.96% of the total causal effect of VD on PCOS risk. Detailed estimates of direct and indirect causal effects can be found in Table 3.
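The indirect effect and the mediated proportion can be approximately re-derived from the rounded estimates quoted in this section. The sketch assumes the indirect effect is the product of the VD-to-BT coefficient and the log odds ratio of BT on PCOS, and that the mediated proportion is the indirect effect divided by the total TS-MR effect on the log-odds scale; the variable names are hypothetical.

```python
import math

beta_vd_bt = -0.053   # reported effect of VD on BT
or_bt_pcos = 1.378    # reported OR of BT on PCOS
or_total = 0.750      # reported total OR of VD on PCOS (TS-MR)

# Product of coefficients on the log-odds scale.
indirect_log_or = beta_vd_bt * math.log(or_bt_pcos)
indirect_or = math.exp(indirect_log_or)

# Share of the total effect that runs through the mediator.
proportion_mediated = indirect_log_or / math.log(or_total)

print(f"indirect OR = {indirect_or:.3f}")
print(f"proportion mediated = {100 * proportion_mediated:.1f}%")
```

With these rounded inputs the indirect OR comes out at 0.983 and the mediated share at roughly 5.9%, consistent with the reported 5.96% once rounding of the published estimates is taken into account.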

Bioinformatics study

The results of the MR study suggested that a reduced VD level may lead to the development of PCOS and that BT is a mediator between VD and PCOS, meaning that VD can ultimately influence the development of PCOS by affecting the production of testosterone. On this basis, we collected the IVs of VD-PCOS (n-SNPs = 90) and VD-BT (n-SNPs = 88) and combined each set with cis-eQTL data for gene annotation. Ultimately, 147 (VD-PCOS) and 164 (VD-BT) potentially relevant genes were annotated (Tables S9–S10). We then used these genes to perform GO and KEGG analyses.

Firstly, the potentially relevant genes of VD-PCOS were analyzed for enrichment. The results of GO analysis suggested that these genes were mainly related to androgen metabolic process, superoxide metabolic process, cell body membrane, and steroid dehydrogenase activity (Fig. 2A). The KEGG analysis was mainly enriched in autophagy, steroid biosynthesis, cytochrome P450 metabolic process, and vitamin digestion and absorption (Fig. 2C). Subsequently, potentially relevant genes associated with VD-BT were analyzed for enrichment. The results of GO analysis suggested that these genes were mainly associated with steroid metabolism, superoxide metabolism, autophagosome membrane, nuclear androgen receptor binding, and vitamin transmembrane transporter activity (Fig. 2B), and the KEGG analysis was mainly enriched for autophagy, steroid biosynthesis, vitamin digestion and absorption, and cholesterol metabolism (Fig. 2D). Full results of the enrichment analysis are shown in the additional file (Tables S11–S12).

Fig. 2. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis of potentially relevant genes. (A) The GO enrichment analysis for potentially relevant genes related to VD and PCOS; (B) the GO enrichment analysis for potentially relevant genes related to VD and BT; (C) the KEGG pathway analysis for potentially relevant genes related to VD and PCOS; (D) the KEGG pathway analysis for potentially relevant genes related to VD and BT. Abbreviations: VD, vitamin D; PCOS, polycystic ovary syndrome; BT, bioavailable testosterone; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes

In our bidirectional TS-MR analysis, we found that a higher serum VD level was causally associated with a lower risk of developing PCOS (OR = 0.750, 95% CI: 0.587–0.959, P = 0.022), whereas there was little evidence for a causal effect of PCOS risk on serum VD level. Furthermore, our MR mediation analysis indicated that testosterone can act as one of the mediating factors in the causal relationship between VD and PCOS (OR = 0.983, 95% CI: 0.968–0.998, P = 0.025). The mediating effect of testosterone was 5.96%. Finally, we used potentially relevant genes for GO and KEGG enrichment analysis to assess the involvement of testosterone and the potential biological and molecular mechanisms between them.

VD, a lipid-soluble vitamin, plays a pivotal role in numerous biological processes. Primarily synthesized endogenously through exposure to sunlight, it is also acquired, albeit to a lesser extent, from dietary sources [48]. VDD is considered a globally prevalent nutritional deficiency, with various studies reporting prevalence rates of 58–91% among infertile women [49]. A cross-sectional study encompassing 625 women diagnosed with PCOS and 217 control subjects revealed that Chinese women diagnosed with PCOS exhibited notably lower levels of VD than healthy controls [50]. A large observational study conducted by Krul-Poel et al. similarly demonstrated significantly diminished levels of VD among women in the PCOS group [51]. Recent research has demonstrated that women with PCOS exhibit lower serum concentrations of VD than BMI-matched controls, which implies that the level of VD is linked to PCOS irrespective of BMI [14]. In line with these observational studies, our research indicated that a higher serum VD level is a protective factor against PCOS. To eliminate the influence of obesity as a potential confounder, we excluded obesity-related SNPs in our TS-MR analysis. Subsequently, in our MVMR analyses, we adjusted for genetically predicted BMI, FBG, and FI to explore the direct causal relationship between VD and PCOS. These stringent measures enhance the credibility and robustness of our findings.

The precise mechanism through which serum VD operates on PCOS remains elusive. Hyperandrogenemia stands as a pivotal diagnostic criterion for PCOS. Numerous past studies have concentrated on exploring the correlation between serum VD and hyperandrogenemia in PCOS, yet the conclusions drawn from these studies have not reached a consensus. A study conducted by Latic N et al. revealed a negative correlation between serum VD level and testosterone in PCOS patients. Additionally, Menichini et al. demonstrated a positive impact of VD supplementation (4000 IU) on total testosterone [ 52 ]. However, a study by Mesinovic et al. suggested no discernible correlation between serum VD and androgens in individuals with PCOS [ 19 ]. Moreover, a large observational study by Gallea et al. also showcased associations between serum VD level, insulin, and body weight among PCOS patients but not specifically with hyperandrogenemia [ 20 ]. The inconsistencies observed in these findings might stem from variations in race, sample sizes, seasonal disparities, and the lifestyles of the included subjects. Our study, employing Mendelian randomization, effectively mitigated the impact of sample size, seasonal fluctuations, and diverse lifestyles on the outcomes. Furthermore, our research focused solely on individuals of European ethnicity, and we excluded BMI-related SNPs when incorporating instrumental variables, thereby significantly reducing BMI's potential confounding effect on the results. These measures ensured the robustness and reliability of our findings. Our results suggest that testosterone acts as a mediator between serum VD and PCOS, implying that serum VD may potentially contribute to the development of PCOS by influencing testosterone production.

The mechanism by which serum VD ultimately contributes to the development of PCOS by affecting testosterone remains unclear, but a possible explanation has been proposed. Serum VD heightens the activity of aromatase within the ovary, thereby fostering the conversion of androgens to estrogens and ultimately reducing androgen production [53]. Kinuta et al. demonstrated a marked reduction in aromatase activity within the ovaries of VDR knockout mice compared with the control group [54]. In addition, we performed bioinformatics analysis to explore further possible biological mechanisms. Firstly, the GO and KEGG analyses of potentially related genes of VD-PCOS showed that steroid biosynthetic process, androgen metabolic process, and nuclear androgen receptor binding were possible biological mechanisms underlying the causal relationship between serum VD level and PCOS. These results are consistent with our bidirectional TS-MR analysis, again suggesting that serum VD can ultimately influence the development of PCOS by modulating testosterone production. Subsequently, we subjected potentially relevant genes associated with VD-BT to bioinformatics analysis. The results suggested that autophagy and superoxide metabolism might be biological mechanisms linking serum VD and testosterone.

There are very few studies linking autophagy to PCOS, and their results suggest that the development of PCOS is closely related to the process of autophagy [55]. Texada et al. showed that autophagy can regulate steroid production by modulating cholesterol transport in endocrine cells [56]. In addition, the role of VD-mediated autophagy in disease has been extensively studied, and a basic study by Hu et al. showed that VD can regulate autophagy through VD receptors in gastric epithelial cells, which ultimately affects the pathogenic effects of H. pylori [57]. However, whether VD-mediated autophagy ultimately leads to PCOS remains unknown. The results of our bioinformatics analysis suggest that autophagy is likely one of the important mechanisms underlying the relationship between VD and PCOS.

Our study provides evidence that a lower serum VD level causes a higher risk of PCOS. PCOS can impair oocyte competence and endometrial function [9, 10] and can also cause adverse reproductive outcomes such as infertility, miscarriage, and premature delivery [12, 13]. VDD has been found to decrease the rates of ovulation and successful pregnancy in PCOS patients, leading to fewer live births [58]. In addition, serum VD level has been reported to be an independent predictor of live birth in PCOS patients who received ovulation induction [59]. Yasmine et al. reported that the endometrial thickness of PCOS patients may be improved after VD administration [60]. A recent meta-analysis showed that VD supplementation in women with PCOS could decrease the rates of early miscarriage and premature delivery [53]. The nuclear receptor of VD (VDR) and the 1,25(OH)2D3 membrane binding protein are expressed in both ovarian granulosa and theca cells [61, 62]. VD has been found to regulate the expression of enzymes in the VDR and ovary, ultimately regulating ovarian function [63]. One study showed that VDR mRNA expression was significantly lower in granulosa cells of women with PCOS [64], which may make PCOS patients more sensitive to VDD. Based on the above studies and ours, serum VD level should be monitored in the female population, especially in women of reproductive age, and timely VD administration in PCOS patients would help to improve their reproductive function and pregnancy outcomes.

Our research has several advantages. Firstly, this study supports a direct causal effect of serum VD level on the risk of PCOS through the TS-MR analysis method. This method avoids limitations commonly found in observational studies, thereby strengthening the reliability and validity of our findings. Secondly, we assessed the mediating function of testosterone in the relationship between serum VD and PCOS via MR mediation analysis, laying the groundwork for subsequent mechanistic studies. Finally, this is the first study to combine MR and bioinformatics analyses to explore the causal relationship and potential functional mechanisms between serum VD level, testosterone, and the risk of PCOS. Nonetheless, this study also has limitations. Firstly, our study could not capture dietary and sun exposure information that may affect serum VD level. Secondly, the use of exclusively European data in an MR analysis may limit generalizability to other ethnic populations, although it reduces the impact of ethnicity bias on the study outcomes. Finally, the absence of relevant data prevented us from independently exploring the relationship of serum VD2/D3 with the risk of PCOS, which warrants further investigation.

Conclusions

In conclusion, our studies confirm the causality between lower serum VD level and higher risk of PCOS. Furthermore, testosterone may act as a mediator between serum VD and PCOS. These findings emphasize the clinical importance of testing serum VD level and timely VD supplementation as possible primary prevention and treatment of PCOS.

Availability of data and materials

No datasets were generated or analysed during the current study.

Abbreviations

  • PCOS: Polycystic ovary syndrome
  • GWAS: Genome-wide association studies
  • TS-MR: Two-sample Mendelian randomization
  • cis-eQTLs: Cis-expression quantitative loci
  • VDD: VD deficiency
  • VDRs: VD receptors
  • BMI: Body mass index
  • MR: Mendelian randomization
  • MVMR: Multivariable MR
  • IVs: Instrumental variables
  • BT: Bioavailable testosterone
  • FBG: Fasting glucose
  • FI: Fasting insulin
  • SNPs: Single-nucleotide polymorphisms
  • WC: Waist circumference
  • IVW: Inverse variance weighted
  • LOO: Leave one out
  • GO: Gene ontology
  • KEGG: Kyoto Encyclopedia of Genes and Genomes
  • BP: Biological process
  • MF: Molecular function
  • CC: Cellular component

Holick MF. The vitamin D deficiency pandemic: Approaches for diagnosis, treatment and prevention. Rev Endocr Metab Disord. 2017;18:153–65.

Autier P, Boniol M, Pizot C, Mullie P. Vitamin D status and ill health: a systematic review. Lancet Diabetes Endocrinol. 2014;2:76–89.

Lerchbaum E, Obermayer-Pietsch B. Vitamin D and fertility: a systematic review. Eur J Endocrinol. 2012;166:765–78.

Irani M, Merhi Z. Role of vitamin D in ovarian physiology and its implication in reproduction: a systematic review. Fertil Steril. 2014;102:460-468.e3.

Parikh G, Varadinova M, Suwandhi P, Araki T, Rosenwaks Z, Poretsky L, et al. Vitamin D regulates steroidogenesis and insulin-like growth factor binding protein-1 (IGFBP-1) production in human ovarian cells. Horm Metab Res. 2010;42:754–7.

Du H, Daftary GS, Lalwani SI, Taylor HS. Direct regulation of HOXA10 by 1,25-(OH)2D3 in human myelomonocytic cells and human endometrial stromal cells. Mol Endocrinol. 2005;19:2222–33.

Tyrmi JS, Arffman RK, Pujol-Gualdo N, Kurra V, Morin-Papunen L, Sliz E, et al. Leveraging Northern European population history: novel low-frequency variants for polycystic ovary syndrome. Hum Reprod. 2022;37:352–65.

Bruni V, Capozzi A, Lello S. The Role of Genetics, Epigenetics and Lifestyle in Polycystic Ovary Syndrome Development: the State of the Art. Reprod Sci. 2022;29:668–79.

Palomba S, Daolio J, La Sala GB. Oocyte Competence in Women with Polycystic Ovary Syndrome. Trends Endocrinol Metab. 2017;28:186–98.

Palomba S, Piltonen TT, Giudice LC. Endometrial function in women with polycystic ovary syndrome: a comprehensive review. Hum Reprod Update. 2021;27:584–618.

Norman RJ, Dewailly D, Legro RS, Hickey TE. Polycystic ovary syndrome. Lancet. 2007;370:685–97.

Palomba S. Is fertility reduced in ovulatory women with polycystic ovary syndrome? An opinion paper. Hum Reprod. 2021;36:2421–8.

Palomba S, De Wilde MA, Falbo A, Koster MPH, La Sala GB, Fauser BCJM. Pregnancy complications in women with polycystic ovary syndrome. Hum Reprod Update. 2015;21:575–92.

Bacopoulou F, Kolias E, Efthymiou V, Antonopoulos CN, Charmandari E. Vitamin D predictors in polycystic ovary syndrome: a meta-analysis. Eur J Clin Invest. 2017;47:746–55.

Lejman-Larysz K, Golara A, Baranowska M, Kozłowski M, Guzik P, Szydłowska I, et al. Influence of Vitamin D on the Incidence of Metabolic Syndrome and Hormonal Balance in Patients with Polycystic Ovary Syndrome. Nutrients. 2023;15:2952.

Durdiakova J, Ostatnikova D, Celec P. Testosterone and its metabolites–modulators of brain functions. Acta Neurobiol Exp (Warsz). 2011;71:434–54.

Hahn S, Haselhorst U, Tan S, Quadbeck B, Schmidt M, Roesler S, et al. Low serum 25-hydroxyvitamin D concentrations are associated with insulin resistance and obesity in women with polycystic ovary syndrome. Exp Clin Endocrinol Diabetes. 2006;114:577–83.

Latic N, Erben RG. Vitamin D and Cardiovascular Disease, with Emphasis on Hypertension, Atherosclerosis, and Heart Failure. Int J Mol Sci. 2020;21:6483.

Mesinovic J, Teede HJ, Shorakae S, Lambert GW, Lambert EA, Naderpoor N, et al. The Relationship between Vitamin D Metabolites and Androgens in Women with Polycystic Ovary Syndrome. Nutrients. 2020;12:1219.

Gallea M, Granzotto M, Azzolini S, Faggian D, Mozzanega B, Vettor R, et al. Insulin and body weight but not hyperandrogenism seem involved in seasonal serum 25-OH-vitamin D3 levels in subjects affected by PCOS. Gynecol Endocrinol. 2014;30:739–45.

Smith GD, Lawlor DA, Harbord R, Timpson N, Day I, Ebrahim S. Clustered environments and randomized genes: a fundamental distinction between conventional and genetic epidemiology. Cardon L, editor. PLoS Med. 2007;4:e352.

Article   PubMed   PubMed Central   Google Scholar  

Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet. 2014;23:R89-98.

EPIC- InterAct Consortium, Burgess S, Scott RA, Timpson NJ, Davey Smith G, Thompson SG. Using published data in Mendelian randomization: a blueprint for efficient identification of causal risk factors. Eur J Epidemiol. 2015;30:543–52.

Article   PubMed Central   Google Scholar  

Skrivankova VW, Richmond RC, Woolf BAR, Yarmolinsky J, Davies NM, Swanson SA, et al. Strengthening the Reporting of Observational Studies in Epidemiology Using Mendelian Randomization: The STROBE-MR Statement. JAMA. 2021;326:1614.

Revez JA, Lin T, Qiao Z, Xue A, Holtz Y, Zhu Z, et al. Genome-wide association study identifies 143 loci associated with 25 hydroxyvitamin D concentration. Nat Commun. 2020;11:1647.

Mbatchou J, Barnard L, Backman J, Marcketta A, Kosmicki JA, Ziyatdinov A, et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat Genet. 2021;53:1097–103.

Chen J, Spracklen CN, Marenne G, Varshney A, Corbin LJ, Luan J, et al. The trans-ancestral genomic architecture of glycemic traits. Nat Genet. 2021;53:840–60.

Yengo L, Sidorenko J, Kemper KE, Zheng Z, Wood AR, Weedon MN, et al. Meta-analysis of genome-wide association studies for height and body mass index in ∼ 700000 individuals of European ancestry. Hum Mol Genet. 2018;27:3641–9.

Verbanck M, Chen C-Y, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018;50:693–8.

Mortada I. Hyperuricemia, Type 2 Diabetes Mellitus, and Hypertension: an Emerging Association. Curr Hypertens Rep. 2017;19:69.

Choi HK, McCormick N, Lu N, Rai SK, Yokose C, Zhang Y. Population Impact Attributable to Modifiable Risk Factors for Hyperuricemia. Arthritis Rheumatol. 2020;72:157–65.

Nakamura K, Sakurai M, Miura K, Morikawa Y, Yoshita K, Ishizaki M, et al. Alcohol intake and the risk of hyperuricaemia: a 6-year prospective study in Japanese men. Nutr Metab Cardiovasc Dis. 2012;22:989–96.

Greco MFD, Minelli C, Sheehan NA, Thompson JR. Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome. Stat Med. 2015;34:2926–40.

Article   Google Scholar  

Hemani G, Zheng J, Elsworth B, Wade KH, Haberland V, Baird D, et al. The MR-base platform supports systematic causal inference across the human phenome. eLife. 2018;7:e34408.

Sanderson E, Spiller W, Bowden J. Testing and correcting for weak and pleiotropic instruments in two-sample multivariable Mendelian randomization. Stat Med. 2021;40:5434–52.

Sanderson E, Davey Smith G, Windmeijer F, Bowden J. An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings. Int J Epidemiol. 2019;48:713–27.

VanderWeele TJ. Mediation Analysis: A Practitioner’s Guide. Annu Rev Public Health. 2016;37:17–32.

MacKinnon DP, Fairchild AJ, Fritz MS. Mediation analysis. Annu Rev Psychol. 2007;58:593–614.

Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol. 2015;44:512–25.

Burgess S, Thompson SG. Interpreting findings from Mendelian randomization using the MR-Egger method. Eur J Epidemiol. 2017;32:377–89.

Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Genet Epidemiol. 2016;40:304–14.

Hartwig FP, Davey Smith G, Bowden J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol. 2017;46:1985–98.

Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator - PubMed. Available from: https://pubmed.ncbi.nlm.nih.gov/27061298/ . [cited 2023 Feb 15].

Burgess S, Bowden J, Fall T, Ingelsson E, Thompson SG. Sensitivity Analyses for Robust Causal Inference from Mendelian Randomization Analyses with Multiple Genetic Variants. Epidemiology. 2017;28:30–42.

Kamat MA, Blackshaw JA, Young R, Surendran P, Burgess S, Danesh J, et al. PhenoScanner V2: an expanded tool for searching human genotype-phenotype associations. Bioinformatics. 2019;35:4851–3.

Võsa U, Claringbould A, Westra H-J, Bonder MJ, Deelen P, Zeng B, et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat Genet. 2021;53:1300–10.

The Gene Ontology Consortium. Gene Ontology Consortium: going forward. Nucleic Acids Res. 2015;43:D1049–56.

Yuan C, Qian ZR, Babic A, Morales-Oyarvide V, Rubinson DA, Kraft P, et al. Prediagnostic Plasma 25-Hydroxyvitamin D and Pancreatic Cancer Survival. J Clin Oncol. 2016;34:2899–905.

Cunningham TK, Allgar V, Dargham SR, Kilpatrick E, Sathyapalan T, Maguiness S, et al. Association of Vitamin D Metabolites With Embryo Development and Fertilization in Women With and Without PCOS Undergoing Subfertility Treatment. Front Endocrinol. 2019;10:13.

Shan C, Zhu Y, Yu J, Zhang Y, Wang Y, Lu N, et al. Low Serum 25-Hydroxyvitamin D Levels Are Associated With Hyperandrogenemia in Polycystic Ovary Syndrome: A Cross-Sectional Study. Front Endocrinol. 2022;13:894935.

Krul-Poel YHM, Koenders PP, Steegers-Theunissen RP, Ten Boekel E, Wee MMT, Louwers Y, et al. Vitamin D and metabolic disturbances in polycystic ovary syndrome (PCOS): a cross-sectional study. Narayanan R, editor. PLOS One. 2018;13:e0204748.

Menichini D, Facchinetti F. Effects of vitamin D supplementation in women with polycystic ovary syndrome: a review. Gynecol Endocrinol. 2020;36:1–5.

Yang M, Shen X, Lu D, Peng J, Zhou S, Xu L, et al. Effects of vitamin D supplementation on ovulation and pregnancy in women with polycystic ovary syndrome: a systematic review and meta-analysis. Front Endocrinol. 2023;14:1148556.

Kinuta K, Tanaka H, Moriwake T, Aya K, Kato S, Seino Y. Vitamin D is an important factor in estrogen biosynthesis of both female and male gonads. Endocrinology. 2000;141:1317–24.

Kumariya S, Ubba V, Jha RK, Gayen JR. Autophagy in ovary and polycystic ovary syndrome: role, dispute and future perspective. Autophagy. 2021;17:2706–33.

Texada MJ, Malita A, Rewitz K. Autophagy regulates steroid production by mediating cholesterol trafficking in endocrine cells. Autophagy. 2019;15:1478–80.

Hu W, Zhang L, Li MX, Shen J, Liu XD, Xiao ZG, et al. Vitamin D3 activates the autolysosomal degradation function against Helicobacter pylori through the PDIA3 receptor in gastric epithelial cells. Autophagy. 2019;15:707–25.

Butts SF, Seifer DB, Koelper N, Senapati S, Sammel MD, Hoofnagle AN, et al. Vitamin D Deficiency Is Associated With Poor Ovarian Stimulation Outcome in PCOS but Not Unexplained Infertility. J Clin Endocrinol Metab. 2019;104:369–78.

Pal L, Zhang H, Williams J, Santoro NF, Diamond MP, Schlaff WD, et al. Vitamin D Status Relates to Reproductive Outcome in Women With Polycystic Ovary Syndrome: Secondary Analysis of a Multicenter Randomized Controlled Trial. J Clin Endocrinol Metab. 2016;101:3027–35.

Abuzeid Y. Impact of Vitamin D Deficiency on Reproductive Outcome in infertile anovulatory women with polycystic ovary syndrome: a systematic literature review. Curr Dev Nutr. 2020;4:nzaa067_001.

Li S, Qi J, Sun Y, Gao X, Ma J, Zhao S. An integrated RNA-Seq and network study reveals that valproate inhibited progesterone production in human granulosa cells. J Steroid Biochem Mol Biol. 2021;214:105991.

Hrabia A, Kamińska K, Socha M, Grzesiak M. Vitamin D3 Receptors and Metabolic Enzymes in Hen Reproductive Tissues. Int J Mol Sci. 2023;24:17074.

Xu J, Lawson MS, Xu F, Du Y, Tkachenko OY, Bishop CV, et al. Vitamin D3 Regulates Follicular Development and Intrafollicular Vitamin D Biosynthesis and Signaling in the Primate Ovary. Front Physiol. 2018;9:1600.

Aghadavod E, Mollaei H, Nouri M, Hamishehkar H. Evaluation of Relationship between Body Mass Index with Vitamin D Receptor Gene Expression and Vitamin D Levels of Follicular Fluid in Overweight Patients with Polycystic Ovary Syndrome. Int J Fertil Steril. 2017;11:105–11.

CAS   PubMed   PubMed Central   Google Scholar  

Download references

Acknowledgements

We would like to express our sincere gratitude to the compilers of the GWAS summary dataset for their management of the data collection and data resources.

This work was supported by the General Program of the National Natural Science Foundation of China (grant number 81771741) and by Distinguished Professor funding from the Educational Department of Liaoning Province (grant number [2014]187) to JL.

Author information

Authors and affiliations.

Department of Endocrinology and Metabolism, The Institute of Endocrinology, NHC Key Laboratory of Diagnosis and Treatment of Thyroid Diseases, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning, 110000, P.R. China

Bingrui Gao, Chenxi Zhang, Deping Wang, Bojuan Li, Zhongyan Shan, Weiping Teng & Jing Li

Department of Endocrinology and Metabolism, Hongqi Hospital Affiliated to Mudanjiang Medical College, Mudanjiang, Heilongjiang, 157011, P.R. China

Deping Wang


Contributions

Designed the study: Jing Li, Bingrui Gao; Collected data: Bingrui Gao, Chenxi Zhang; Performed statistical analyses: Bingrui Gao, Deping Wang, Bojuan Li; Drafted the manuscript: Bingrui Gao; Supervised the study and reviewed the manuscript: Jing Li, Zhongyan Shan, Weiping Teng.

Corresponding author

Correspondence to Jing Li.

Ethics declarations

Ethics approval and consent to participate.

Our analysis used publicly available genome-wide association study (GWAS) summary statistics. No new data were collected, and no new ethical approval was required.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

Table S1. STROBE-MR Checklist; Table S2. Key characteristics of participating studies; Table S3. GWAS significant SNPs used as genetic instruments for VD level on PCOS; Table S4. GWAS significant SNPs used as genetic instruments for PCOS on VD level; Table S5. GWAS significant SNPs used as genetic instruments for VD level on BT; Table S6. GWAS significant SNPs used as genetic instruments for BT on PCOS; Table S7. GWAS significant SNPs used as genetic instruments for BT and VD level on PCOS; Table S8. Heterogeneity and directional pleiotropy test using MR-Egger intercepts; Table S9. Potentially relevant genes corresponding to IVs associated with VD and PCOS; Table S10. Potentially relevant genes corresponding to IVs associated with VD and PCOS; Table S11. GO and KEGG enrichment analysis for potentially relevant genes related to VD and PCOS; Table S12. GO and KEGG enrichment analysis for potentially relevant genes related to VD and BT; Figure S1. Scatter plot of the MR estimates for the association of VD level with PCOS; Figure S2. Funnel plot of the overall heterogeneity of the impact of VD on PCOS; Figure S3. Leave-one-out analysis of the impact of VD on PCOS.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Gao, B., Zhang, C., Wang, D. et al. Causal association between low vitamin D and polycystic ovary syndrome: a bidirectional mendelian randomization study. J Ovarian Res 17, 95 (2024). https://doi.org/10.1186/s13048-024-01420-5


Received: 23 February 2024

Accepted: 20 April 2024

Published: 07 May 2024

DOI: https://doi.org/10.1186/s13048-024-01420-5


  • Testosterone

Journal of Ovarian Research

ISSN: 1757-2215

quantitative research correlational example

COMMENTS

  1. Correlational Research

    For example, in psychology, correlational research can be used to explore the relationship between personality traits and behavior, or between early life experiences and later mental health outcomes. In education, correlational research can be used to examine the relationship between teaching practices and student achievement.

  2. Correlational Research

    Revised on June 22, 2023. A correlational research design investigates relationships between variables without the researcher controlling or manipulating any of them. A correlation reflects the strength and/or direction of the relationship between two (or more) variables. The direction of a correlation can be either positive or negative.

  3. Correlational Research

    Revised on 5 December 2022. A correlational research design investigates relationships between variables without the researcher controlling or manipulating any of them. A correlation reflects the strength and/or direction of the relationship between two (or more) variables. The direction of a correlation can be either positive or negative.

  4. Correlational Study Overview & Examples

    A correlational study is a non-experimental design that evaluates only the correlation between variables. The researchers record measurements but do not control or manipulate the variables. Correlational research is a form of observational study. A correlation indicates that as the value of one variable increases, the other tends to change in a ...

  5. Correlational Research: What it is with Examples

    It is a misconception that a correlational study must involve two quantitative variables. In reality, two variables are measured, but neither is manipulated. This is true regardless of whether the variables are quantitative or categorical. Types of correlational research. Mainly three types of correlational research have been identified: 1.

  6. Correlational Research Designs: Types, Examples & Methods

    Here are 3 case examples of correlational research. ... The strength of a correlation between quantitative variables is typically measured using a statistic called Pearson's Correlation Coefficient (or Pearson's r). A perfect positive correlation is indicated by a value of 1.0, a perfect negative correlation is indicated by a value of -1.0 while ...

  7. 7.2 Correlational Research

    Define correlational research and give several examples. ... A common misconception among beginning researchers is that correlational research must involve two quantitative variables, such as scores on two extraversion tests or the number of hassles and number of symptoms people have experienced. ... An example is a study by Brett Pelham and ...

  8. 6.2 Correlational Research

    Correlational research is a type of non-experimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are many reasons that researchers interested in statistical relationships between variables ...

  9. Correlational Research

    Correlational research is a type of non-experimental research in which the researcher measures two variables (binary or continuous) and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are many reasons that researchers interested in statistical ...

  10. Chapter 12 Methods for Correlational Studies

    Correlational studies aim to find out if there are differences in the characteristics of a population depending on whether or not its subjects have been exposed to an event of interest in the naturalistic setting. In eHealth, correlational studies are often used to determine whether the use of an eHealth system is associated with a particular set of user characteristics and/or quality of care ...

  11. PDF SURVEY AND CORRELATIONAL RESEARCH DESIGNS

    correlational designs. We begin this chapter with an introduction to the research design that was illustrated here: the survey research design. 8.1 An Overview of Survey Designs A nonexperimental research design used to describe an individual or a group by having participants complete a survey or questionnaire is called the survey research design.

  12. What Is Quantitative Research?

    Quantitative research methods. You can use quantitative research methods for descriptive, correlational or experimental research. In descriptive research, you simply seek an overall summary of your study variables. In correlational research, you investigate relationships between your study variables. In experimental research, you systematically examine whether there is a cause-and-effect ...

  13. How Accurate Is Your Correlation? Different Methods Derive Different

    Assessing the association between conceptual constructs is at the heart of quantitative research in education and psychology. Researchers apply different methods to the data to obtain results about the correlation between a set of variables. ... We show to the readers an applied example of correlation generated from bivariate ...

  14. PDF Chapter 3 QUANTITATIVE Workbook for a Correlational Study

    descriptive, correlational research. First, describe the nature of descriptive research and how it differs from experimental research. Next, describe the nature of correlational research. Explain how correlational research differs from other non-experimental quantitative designs, such as ex post facto and comparative designs.

  15. PDF A Correlational Study Examining the Relationship Between Restorative

    quantitative correlational research design was utilized for this study to examine the relationship between restorative practices and school climate. Findings from this study indicated a strong positive correlation between measures of restorative practice and school climate. Keywords: school discipline, school climate, restorative practices.

  16. A correlational study of the relationship between academic performance

    sample consisted of 25 females and 20 males. Both mother's age and father's age were found to correlate strongly with the three measures of academic performance, as determined by the Pearson Product Moment Correlation test, and correlations for all three measures were statistically significant. This study concluded that parental age is a

  17. 12 Quantitative Descriptive and Correlational Research

    Abstract: This chapter presents research designs for descriptive and correlational quantitative research. Descriptive research designs are used to address th ... and a study of expert studio teachers' teaching behaviors (e.g., Blackwell, 2020) are both examples of research topics wherein the primary purpose is to systematically describe a ...

  18. (PDF) Quantitative Correlational Research Study of Leadership

    The document is a proposal for doctoral research at the School of Advanced Studies of the University of Phoenix. The purpose of the quantitative correlational study is to profile the leadership of ...

  19. (PDF) Quantitative Research Designs

    Correlational Research design explored ... The study adopted a cross-sectional research design and quantitative research approach using a sample of 300 respondents from the six public hospitals in ...

  20. A Quantitative Correlational Study between Transformational Leadership

    This quantitative, non-experimental, correlational study involved utilizing two Likert-type scale survey instruments to measure leadership styles and employee job satisfaction.

  21. Regression and Correlation

    Quantitative Research Methods. Correlation is the relationship or association between two variables. There are multiple ways to measure correlation, but the most common is Pearson's correlation coefficient (r), which tells you the strength of the linear relationship between two variables. The value of r has a range of -1 to 1 (0 indicates no ...

  22. A Correlational Study on Students' Reading Interest ...

    A Correlational Study on Students' Reading Interest ... - ResearchGate

  23. Quantitative Data Analysis: A Complete Guide

    Here's how to make sense of your company's numbers in just four steps: 1. Collect data. Before you can actually start the analysis process, you need data to analyze. This involves conducting quantitative research and collecting numerical data from various sources, including: Interviews or focus groups.

  24. Chapter 1 TO 3

    a correlational study between ml and academic acknowledgement this research study would not have been possible without the support, help, advice, and affection ... (for example, live streaming, eSports broadcasts). According to global study, a big percentage of teens said they played games on a range of platforms last year, including personal ...

  25. The causal effect of adipose tissue on Hodgkin's lymphoma: two-sample

    Background: Extensive research has been conducted on the correlation between adipose tissue and the risk of malignant lymphoma. Despite numerous observational studies exploring this connection, uncertainty remains regarding a causal relationship between adipose tissue and malignant lymphoma. Methods: The increase or decrease in adipose tissue was represented by the height of BMI.

  26. Causal association between low vitamin D and polycystic ovary syndrome

    Background: Recent studies have revealed the correlation between serum vitamin D (VD) level and polycystic ovary syndrome (PCOS), but the causality and specific mechanisms remain uncertain. Objective: We aimed to investigate the cause-effect relationship between serum VD and PCOS, and the role of testosterone in the related pathological mechanisms. Methods: We assessed the causality between serum ...
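
Several of the entries above refer to Pearson's correlation coefficient (r) as the usual measure of the strength and direction of a linear relationship, with values ranging from -1 to 1. As a rough illustration only, the sketch below computes r from its textbook definition (the sum of products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the sample data are hypothetical and are not taken from any of the sources listed.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's r for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each observation from its variable's mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sum of cross-products and sums of squared deviations
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, invented for illustration only
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]
hours_of_tv = [5, 4, 3, 2, 1]

print(pearson_r(hours_studied, exam_score))   # positive: both rise together
print(pearson_r(hours_of_tv, exam_score))     # negative: one rises as the other falls
```

A value near 0 would indicate no linear relationship. As the entries above stress, a large |r| describes association only and does not by itself establish causation.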