17 Longitudinal Study Advantages and Disadvantages

A longitudinal study is a research design that requires repeated observations of the same variables over a period of time. These studies may be short-term examinations or designed to collect data over many years. In most situations they are treated as observational studies, although researchers can sometimes structure them as randomized experiments.

Most longitudinal studies are used in clinical psychology or social-personality research. They are useful for observing rapid fluctuations in emotions, thoughts, or behaviors between two baseline points. Some researchers use them to study life events, compare generational behaviors, or review developmental trends across individual lifetimes.

When they are observational, longitudinal studies observe the world without manipulating it in any way, which means they may have less power to detect causal relationships among the subjects being observed. Because repeated observations are made at the individual level, however, they have more power than other designs to remove time-invariant differences and to establish the temporal order of events.

The longest-running longitudinal study in the world today was started in 1921 by psychologist Lewis Terman. He wanted to investigate how highly intelligent children would develop as they turned into adults. The original study had over 1,000 participants, but that figure has dropped to under 200. Researchers plan to continue their work until there are no participants left.

These are the crucial pros and cons of longitudinal studies to review before setting up this form of panel study.

List of the Pros of Longitudinal Studies

1. This form of research is more flexible than other options. A longitudinal study will sometimes focus on a single data point when researchers begin observing their subjects, yet the data collected along the way can reveal unanticipated relationships or patterns that are meaningful in specific environments. Since most of these studies are not designed to be lengthy, there are more opportunities to pursue tangents here than in other research formats.

Researchers have an opportunity to pursue additional data points that were collected to determine whether a shift in focus is necessary to review a complete set of information. If something interesting appears in the material, longitudinal studies allow researchers to pursue it.

2. The accuracy rate of the data collected during longitudinal studies is high. When researchers use longitudinal studies to collect observational data, the accuracy of the information is high because everything occurs in real time. Although mistakes do happen because no one is perfect, the structure and foundation of this option limit the problems that can occur. This information is also useful for implementing changes that may be necessary to achieve the best possible outcome during an observational period.

3. This research method can identify unique developmental trends. When researchers pursue a short-term longitudinal study, then they are looking for answers to very specific questions. If a long-term model is developed, there is an opportunity to identify specific developmental trends that occur in various fields, including sociology, psychology, and general medicine.

Researchers using longitudinal studies have opportunities to track multiple generations in specific family groups while still collecting real-time data on all of the individuals being tracked to see how current decisions can influence future outcomes for some population demographics.

4. It allows for consistent use of the observational method. Collecting information is simpler with longitudinal studies because they almost always use the observational method. This structure makes it possible to collect consistent data samples at the individual level instead of relying on extrapolation or other indirect methods. The consistency of this approach also makes it possible to identify and exclude individual variations that could adversely impact outcomes, a problem that occurs with other research options.

5. Longitudinal studies allow unique and specific data points to be collected. Most research options provide a structure where data is available for collection over a short period, offering only a small window in which cause-and-effect examples can be observed. Longitudinal studies increase the amount of time available to researchers for data collection, sometimes dramatically. Some studies are measured in decades or centuries instead of days, weeks, or months. This process makes it possible to examine the macro- and micro-changes that occur in the various fields of humanity.

6. This process allows for higher levels of research validity. For any research project to be successful, there are rules and procedures that must be instituted from the very beginning so that all researchers follow the same path of data collection. This structure makes it possible for multiple people to collect similar information from different individuals because everyone follows the same set of processes. The result offers higher levels of validity because it is simpler to verify the data being developed from the direct observations of others.

7. There are three different types of longitudinal studies available for use. Researchers have access to three significant types of longitudinal studies to collect the information that they require. Panel studies are the first option, and they involve a sampling of a cross-section of individuals. Cohort studies are the second type, which involves the selection of a group based on specific events, such as their historical experience, household location, or place of birth.

The final option is called a retrospective study. This option looks at the past by reviewing historical information, such as medical records, to determine if there is a pattern in the data that is useful.

List of the Cons of Longitudinal Studies

1. The structure makes it possible for one person to change everything. Longitudinal studies rely heavily on the individual interpretations that researchers develop after making their observations. That makes it possible for personal bias, inexperience, or a mistake to inadvertently alter the data being collected in real time. This issue can render the information invalid without researchers realizing that the disadvantage is present in their work. Even if numerous people are involved with a project, a single person can disrupt potentially decades of work because of an incorrect (and possibly inadvertent) approach.

2. It is more expensive to perform longitudinal studies than other research methods. This disadvantage typically applies to research studies designed to collect information over longer periods of time. Because observations may last for several years (if not decades), the organizations behind the effort can find that their costs are up to 50% higher in some situations than with the other options that are available. Although the value of the research remains high, some may find the cost a significant barrier.

3. The information collected by researchers may have few controls. The real-time observational data that researchers collect during longitudinal studies is both informative and efficient from a cost perspective when looking at short-term situations. One of the problems that this method encounters is that the information being collected comes from a relatively small number of individuals. Unless it is built into the rules for collection, there may be no controls in place for environmental factors or cultural differences between the individuals involved.

4. It can be challenging for longitudinal research to adapt to changes. There is sometimes no follow-up to identify changes in thinking or operations when longitudinal studies are the primary basis of information collection. Researchers sometimes fail to compare attitudes, behaviors, or perceptions from one point in time to another. Most people change as time passes because they have more information available upon which to draw an opinion. Some people can be very different today than they were 10 years ago. Unless the structures are flexible enough to recognize and adapt to this situation, the data gathered may not be as useful as it should be.

5. Longitudinal studies often require a larger sample size. Researchers use longitudinal studies to recognize patterns and relationships, which means a large amount of data must be collected from numerous individual sources to draw meaningful connections to the topic under study. If a significant sample size is not available for the project, there may not be enough information to reach specific conclusions.

Even when there is enough data present for researchers to use, the sheer size of what they collect can require data mining efforts that can take time to sort out.

6. Some people do not authentically participate in longitudinal studies. As with any other form of research that is performed today, you will encounter individuals who behave artificially because they know they are part of a longitudinal study program. When this issue occurs, it becomes challenging for researchers to separate authentic emotions, thoughts, and behaviors from inauthentic ones. Some participants may try to behave in the ways they believe the researchers want, in order to create specific results.

A study by psychologist Robert S. Feldman at the University of Massachusetts found that 60% of people lie at least once during a 10-minute conversation, and the average person lies two to three times during that discussion. The content of the fibs varies between men and women: men tend to lie to make themselves look better, while women more often lie to make the person they are talking to feel good. Researchers must recognize this trait early to remove this potential disadvantage.

7. Longitudinal studies rely on the skill set of the researchers. The data that longitudinal studies collect is presented to researchers in real time, which means its usefulness relies on their individual skills. Those tasked with this job must follow a specific set of steps to ensure there is authenticity and value to what they observe. Even with step-by-step guidelines on how to perform the work, two different researchers may interpret the instructions differently, which can lead to an adverse result. Personal views of the information being collected can also impact the results in ways that are not useful.

8. The data that is collected from longitudinal studies may not be reliable. Although the goal of longitudinal studies is to identify patterns, inaccuracies in the information collected can lead to incorrect interpretations of choices, thoughts, and behaviors. All it takes is one piece of data to be inaccurate for the results to be impacted in negative ways. It is possible that the findings of the research could be invalidated by just one incorrect interpretation of a real-time result. That is why any conclusion made using this method is often taken with a “grain of salt” with regard to its viability.

9. There is a time element to consider with longitudinal studies. Researchers may find that it requires several years of direct observation before any meaningful data becomes available through longitudinal studies. Some relationships or observable behaviors may never occur even though it seems like they should, which means this time investment may never offer dividends. These studies must have the means to maintain continuously open lines of communication with all of the involved parties to ensure that the quality of the data remains high throughout the entire effort.

10. Longitudinal studies always offer a factor of unpredictability. Because the structure of longitudinal studies will follow the same individuals over an extended time period, what happens to each person outside of the scope of the research can have a direct impact on the eventual findings that researchers develop. Some people may choose to stop participating in the study altogether, which may reduce the validity of the final result when published. It is possible for some individuals or households to shift their demographic profile so that they are no longer viable candidates for the research. Unless these factors are included in the initial structure of the project, then the findings that are developed from the work could be invalid.

The pros and cons of longitudinal studies provide a valuable foundation of data that makes it possible to recognize long-term relationships, determine their value, and identify where healthy changes may be possible in numerous fields. There are certain risks with this process that may create unpredictable outcomes, but it is also through this research method that we will likely find new ways to transform numerous scientific and medical fields in the future.

Longitudinal Study Design

Julia Simkus

Editor at Simply Psychology

BA (Hons) Psychology, Princeton University

Julia Simkus is a graduate of Princeton University with a Bachelor of Arts in Psychology. She is currently pursuing a Master's Degree in Counseling for Mental Health and Wellness, beginning in September 2023. Julia's research has been published in peer-reviewed journals.


Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD, is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

A longitudinal study is a type of observational and correlational study that involves monitoring a population over an extended period of time. It allows researchers to track changes and developments in the subjects over time.

What is a Longitudinal Study?

In longitudinal studies, researchers do not manipulate any variables or interfere with the environment. Instead, they simply conduct observations on the same group of subjects over a period of time.

These research studies can last as short as a week or as long as multiple years or even decades. Unlike cross-sectional studies that measure a moment in time, longitudinal studies last beyond a single moment, enabling researchers to discover cause-and-effect relationships between variables.

They are beneficial for recognizing any changes, developments, or patterns in the characteristics of a target population. Longitudinal studies are often used in clinical and developmental psychology to study shifts in behaviors, thoughts, emotions, and trends throughout a lifetime.

For example, a longitudinal study could be used to examine the progress and well-being of children at critical age periods from birth to adulthood.

The Harvard Study of Adult Development is one of the longest longitudinal studies to date. Researchers in this study have followed the same group of men for over 80 years, observing psychosocial variables and biological processes associated with healthy aging and well-being in late life (see the Harvard Second Generation Study).

When designing longitudinal studies, researchers must consider issues like sample selection and generalizability, attrition and selectivity bias, effects of repeated exposure to measures, selection of appropriate statistical models, and coverage of the necessary timespan to capture the phenomena of interest.

Panel Study

  • A panel study is a type of longitudinal study design in which the same set of participants is measured repeatedly over time.
  • Data is gathered on the same variables of interest at each time point using consistent methods. This allows studying continuity and changes within individuals over time on the key measured constructs.
  • Prominent examples include national panel surveys on topics like health, aging, employment, and economics. Panel studies are a type of prospective study (a minimal sketch of the resulting data layout follows this list).
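
As a minimal illustration of that repeated-measures structure, the sketch below lays out hypothetical panel data in "long" format using the pandas library; every column name and value is invented for illustration:

```python
import pandas as pd

# Hypothetical panel data in long format: one row per participant per wave.
# Column names (person_id, wave, job_satisfaction, retire_intent) are illustrative.
panel = pd.DataFrame({
    "person_id":        [1, 1, 1, 2, 2, 2],
    "wave":             [1, 2, 3, 1, 2, 3],
    "job_satisfaction": [4.2, 3.9, 3.5, 3.1, 3.3, 3.0],
    "retire_intent":    [2.0, 2.4, 3.1, 4.0, 3.8, 4.2],
})

# The same variables are measured for the same people at every wave,
# which is what allows within-person change to be examined.
print(panel.pivot(index="person_id", columns="wave", values="job_satisfaction"))
```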

Cohort Study

  • A cohort study is a type of longitudinal study that samples a group of people sharing a common experience or demographic trait within a defined period, such as year of birth.
  • Researchers observe a population based on the shared experience of a specific event, such as birth, geographic location, or historical experience. These studies are typically used among medical researchers.
  • Cohorts are identified and selected at a starting point (e.g. birth, starting school, entering a job field) and followed forward in time. 
  • As they age, data is collected on cohort subgroups to determine their differing trajectories, for example, investigating how health outcomes diverge for groups born in the 1950s, 1960s, and 1970s.
  • Cohort studies do not require the same individuals to be assessed over time; they just require representation from the cohort.

Retrospective Study

  • In a retrospective study, researchers either collect data on events that have already occurred or use existing data from databases, medical records, or interviews to gain insights about a population.
  • This approach is appropriate when prospectively following participants from a past starting point is infeasible or unethical, for example, when studying the early origins of diseases that emerge later in life.
  • Retrospective studies efficiently provide a “snapshot summary” of the past in relation to present status. However, quality concerns with retrospective data make careful interpretation necessary when inferring causality. Memory biases and selective retention influence quality of retrospective data.

Strengths

Allows researchers to look at changes over time

Because longitudinal studies observe variables over extended periods of time, researchers can use their data to study developmental shifts and understand how certain things change as we age.

High validity

Since objectives and rules for long-term studies are established before data collection begins, these studies have high levels of validity.

Eliminates recall bias

Recall bias occurs when participants do not remember past events accurately or omit details from previous experiences. Because longitudinal studies collect data prospectively, as events occur, they largely avoid this problem.

Flexibility

The variables in longitudinal studies can change throughout the study. Even if the study was created to study a specific pattern or characteristic, the data collection could show new data points or relationships that are unique and worth investigating further.

Limitations

Costly and time-consuming.

Longitudinal studies can take months or years to complete, rendering them expensive and time-consuming. Because of this, researchers tend to have difficulty recruiting participants, leading to smaller sample sizes.

Large sample size needed

Longitudinal studies tend to be challenging to conduct because large samples are needed for any relationships or patterns to be meaningful. Researchers are unable to generate results if there is not enough data.

Participants tend to drop out

Not only is it a struggle to recruit participants, but subjects also tend to leave or drop out of the study due to various reasons such as illness, relocation, or a lack of motivation to complete the full study.

This tendency is known as selective attrition and can threaten the validity of an experiment. For this reason, researchers using this approach typically recruit many participants, expecting a substantial number to drop out before the end.

Report bias is possible

Longitudinal studies will sometimes rely on surveys and questionnaires, which could result in inaccurate reporting as there is no way to verify the information presented.

Examples of Longitudinal Studies

  • In a longitudinal study of post-institutionalised Romanian adoptees (Le Mare & Audet, 2006), data were collected for each child at three time points: at 11 months after adoption, at 4.5 years of age, and at 10.5 years of age. The first two sets of results showed that the adoptees were behind the non-institutionalised comparison group; however, by 10.5 years old there was no difference between the two groups. The Romanian orphans had caught up with the children raised in typical Canadian families.
  • The role of positive psychology constructs in predicting mental health and academic achievement in children and adolescents (Marques, Pais-Ribeiro, & Lopez, 2011)
  • The correlation between dieting behavior and the development of bulimia nervosa (Stice et al., 1998)
  • The stress of educational bottlenecks negatively impacting students’ wellbeing (Cruwys, Greenaway, & Haslam, 2015)
  • The effects of job insecurity on psychological health and withdrawal (Dekker & Schaufeli, 1995)
  • The relationship between loneliness, health, and mortality in adults aged 50 years and over (Luo et al., 2012)
  • The influence of parental attachment and parental control on early onset of alcohol consumption in adolescence (Van der Vorst et al., 2006)
  • The relationship between religion and health outcomes in medical rehabilitation patients (Fitchett et al., 1999)

Goals of Longitudinal Data and Longitudinal Research

The objectives of longitudinal data collection and research, as outlined by Baltes and Nesselroade (1979), are as follows; the first two objectives are illustrated with a short data sketch after the list:
  • Identify intraindividual change: Examine changes at the individual level over time, including long-term trends or short-term fluctuations. Requires multiple measurements and individual-level analysis.
  • Identify interindividual differences in intraindividual change: Evaluate whether changes vary across individuals and relate that variation to other variables. Requires repeated measures for multiple individuals plus relevant covariates.
  • Analyze interrelationships in change: Study how two or more processes unfold and influence each other over time. Requires longitudinal data on multiple variables and appropriate statistical models.
  • Analyze causes of intraindividual change: Identify factors or mechanisms that explain changes within individuals over time. For example, a researcher might want to understand what drives a person's mood fluctuations over days or weeks, or what leads to systematic gains or losses in cognitive abilities across the lifespan.
  • Analyze causes of interindividual differences in intraindividual change: Identify mechanisms that explain within-person changes and differences in changes across people. Requires repeated data on outcomes and covariates for multiple individuals plus dynamic statistical models.
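
As a rough, self-contained illustration of the first two objectives, the sketch below computes each hypothetical person's change between the first and last occasion (intraindividual change) and then summarizes how much those changes differ across people (interindividual differences in change); all names and values are made up:

```python
import pandas as pd

# Hypothetical repeated measures: one row per person per time point.
data = pd.DataFrame({
    "person": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "time":   [0, 1, 2, 0, 1, 2, 0, 1, 2],
    "score":  [10.0, 11.5, 13.0, 9.0, 9.2, 9.1, 12.0, 10.5, 9.5],
})

# Intraindividual change: within-person difference between last and first occasion.
change = (
    data.sort_values("time")
        .groupby("person")["score"]
        .agg(lambda s: s.iloc[-1] - s.iloc[0])
        .rename("change")
)
print(change)

# Interindividual differences in intraindividual change:
# how much do those within-person changes vary across people?
print("Mean change:", change.mean(), "SD of change:", change.std())
```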

How to Perform a Longitudinal Study

When beginning to develop your longitudinal study, you must first decide if you want to collect your own data or use data that has already been gathered.

Using already collected data will save you time, but it will be more limited than data you collect yourself. When collecting your own data, you can choose to conduct either a retrospective or a prospective study.

In a retrospective study, you are collecting data on events that have already occurred. You can examine historical information, such as medical records, in order to understand the past. In a prospective study, on the other hand, you are collecting data in real-time. Prospective studies are more common for psychology research.

Once you determine the type of longitudinal study you will conduct, you then must determine how, when, where, and on whom the data will be collected.

A standardized study design is vital for efficiently measuring a population. Once a study design is created, researchers must maintain the same study procedures over time to uphold the validity of the observation.

A schedule should be maintained, complete results should be recorded with each observation, and observer variability should be minimized.

Researchers must observe each subject under the same conditions to compare them. In this type of study design, each subject serves as their own control.

Methodological Considerations

Important methodological considerations include testing measurement invariance of constructs across time, appropriately handling missing data, and using accelerated longitudinal designs that sample different age cohorts over overlapping time periods.

Testing measurement invariance

Testing measurement invariance involves evaluating whether the same construct is being measured in a consistent, comparable way across multiple time points in longitudinal research.

This includes assessing configural, metric, and scalar invariance through confirmatory factor analytic approaches. Ensuring invariance gives more confidence when drawing inferences about change over time.

Missing data

Missing data can occur during initial sampling if certain groups are underrepresented or fail to respond.

Attrition over time is the main source – participants dropping out for various reasons. The consequences of missing data are reduced statistical power and potential bias if dropout is nonrandom.

Handling missing data appropriately in longitudinal studies is critical to reducing bias and maintaining power.

It is important to minimize attrition by tracking participants, keeping contact info up to date, engaging them, and providing incentives over time.

Techniques like maximum likelihood estimation and multiple imputation are better alternatives to older methods like listwise deletion. Assumptions about missing data mechanisms (e.g., missing at random) shape the analytic approaches taken.
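
As one hedged illustration of the multiple-imputation idea, the sketch below uses scikit-learn's IterativeImputer on made-up data; in a full analysis the substantive model would be fit to each completed data set and the estimates pooled (e.g., via Rubin's rules):

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the estimator)
from sklearn.impute import IterativeImputer

# Simulate wide-format longitudinal data with dropout: NaN marks a missing wave.
rng = np.random.default_rng(0)
wave1 = rng.normal(50, 10, size=200)
wave2 = wave1 + rng.normal(2, 5, size=200)
wave3 = wave2 + rng.normal(2, 5, size=200)
wave3[rng.random(200) < 0.3] = np.nan          # roughly 30% attrition at wave 3
df = pd.DataFrame({"wave1": wave1, "wave2": wave2, "wave3": wave3})

# Create several imputed data sets; sample_posterior=True adds the between-imputation
# variability that proper multiple imputation requires.
imputed_means = []
for m in range(5):
    imputer = IterativeImputer(sample_posterior=True, random_state=m)
    completed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
    imputed_means.append(completed["wave3"].mean())

# Pooled point estimate (a full analysis would also pool variances across imputations).
print("Pooled wave-3 mean:", np.mean(imputed_means))
```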

Accelerated longitudinal designs

Accelerated longitudinal designs purposefully create missing data across age groups.

Accelerated longitudinal designs strategically sample different age cohorts at overlapping periods. For example, assessing separate cohorts of 6th, 7th, and 8th graders at yearly intervals for three years covers 6th through 10th grade development with only three years of data collection, rather than following a single cohort across that entire five-year span.

This increases the speed and cost-efficiency of longitudinal data collection and enables the examination of age/cohort effects. Appropriate multilevel statistical models are required to analyze the resulting complex data structure.
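
The coverage logic can be sketched as a small grid; the cohort labels and grades below simply restate the example above, and no real data are involved:

```python
import pandas as pd

# Three cohorts start the study in 6th, 7th, and 8th grade and are assessed yearly
# for three years. Together they span 6th-10th grade in only three years of data
# collection, at the cost of planned "missingness" for grades a cohort never reaches.
start_grades = {"Cohort A": 6, "Cohort B": 7, "Cohort C": 8}
study_years = [1, 2, 3]

coverage = pd.DataFrame(
    {f"Year {y}": {c: g + (y - 1) for c, g in start_grades.items()} for y in study_years}
)
print(coverage)
#           Year 1  Year 2  Year 3
# Cohort A       6       7       8
# Cohort B       7       8       9
# Cohort C       8       9      10
```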

In addition to those considerations, optimizing the time lags between measurements, maximizing participant retention, and thoughtfully selecting analysis models that align with the research questions and hypotheses are also vital in ensuring robust longitudinal research.

So, careful methodology is key throughout the design and analysis process when working with repeated-measures data.

Cohort effects

A cohort refers to a group born in the same year or time period. Cohort effects occur when different cohorts show differing trajectories over time.

Cohort effects can bias results if not accounted for, especially in accelerated longitudinal designs which assume cohort equivalence.

Detecting cohort effects is important but can be challenging as they are confounded with age and time of measurement effects.

Cohort effects can also interfere with estimating other effects like retest effects. This happens because comparing groups to estimate retest effects relies on cohort equivalence.

Overall, researchers need to test for and control cohort effects which could otherwise lead to invalid conclusions. Careful study design and analysis is required.

Retest effects

Retest effects refer to gains in performance that occur when the same or similar test is administered on multiple occasions.

For example, familiarity with test items and procedures may allow participants to improve their scores over repeated testing above and beyond any true change.

Specific examples include:

  • Memory tests – Learning which items tend to be tested can artificially boost performance over time
  • Cognitive tests – Becoming familiar with the testing format and particular test demands can inflate scores
  • Survey measures – Remembering previous responses can bias future responses over multiple administrations
  • Interviews – Comfort with the interviewer and process can lead to increased openness or recall

To estimate retest effects, performance of retested groups is compared to groups taking the test for the first time. Any divergence suggests inflated scores due to retesting rather than true change.
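
A hedged sketch of that comparison, using simulated scores and an independent-samples t-test from SciPy; the group sizes, means, and the size of the retest gain are invented purely to illustrate the logic, and a real analysis would also need to address cohort equivalence as noted elsewhere:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated test scores at the second measurement occasion:
# the retested group has seen the test before; the first-time group has not.
retested   = rng.normal(loc=103, scale=15, size=120)   # true level 100 plus ~3 points of retest gain
first_time = rng.normal(loc=100, scale=15, size=120)   # no prior exposure

t_stat, p_value = stats.ttest_ind(retested, first_time)
print(f"Mean difference: {retested.mean() - first_time.mean():.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# A reliable positive difference favoring the retested group suggests score inflation
# from prior exposure rather than genuine intraindividual change.
```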

If unchecked in analysis, retest gains can be confused with genuine intraindividual change or interindividual differences.

This undermines the validity of longitudinal findings. Thus, testing and controlling for retest effects are important considerations in longitudinal research.

Data Analysis

Longitudinal data involves repeated assessments of variables over time, allowing researchers to study stability and change. A variety of statistical models can be used to analyze longitudinal data, including latent growth curve models, multilevel models, latent state-trait models, and more.

Latent growth curve models allow researchers to model intraindividual change over time. For example, one could estimate parameters related to individuals’ baseline levels on some measure, linear or nonlinear trajectory of change over time, and variability around those growth parameters. These models require multiple waves of longitudinal data to estimate.

Multilevel models are useful for hierarchically structured longitudinal data, with lower-level observations (e.g., repeated measures) nested within higher-level units (e.g., individuals). They can model variability both within and between individuals over time.
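
As a minimal sketch of this kind of multilevel growth model, assuming the Python statsmodels library and simulated long-format data (all variable names and values are invented for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_people, n_waves = 100, 4

# Simulate long-format data: each person has their own intercept and slope over time.
person = np.repeat(np.arange(n_people), n_waves)
time = np.tile(np.arange(n_waves), n_people)
intercepts = rng.normal(50, 5, n_people)[person]
slopes = rng.normal(1.0, 0.5, n_people)[person]
outcome = intercepts + slopes * time + rng.normal(0, 2, n_people * n_waves)
df = pd.DataFrame({"person": person, "time": time, "outcome": outcome})

# Random intercept and random slope for time, nested within person: the fixed effect
# of time is the average trajectory; the random effects capture between-person
# variability around it.
model = smf.mixedlm("outcome ~ time", df, groups=df["person"], re_formula="~time")
result = model.fit()
print(result.summary())
```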

Latent state-trait models decompose the covariance between longitudinal measurements into time-invariant trait factors, time-specific state residuals, and error variance. This allows separating stable between-person differences from within-person fluctuations.

There are many other techniques like latent transition analysis, event history analysis, and time series models that have specialized uses for particular research questions with longitudinal data. The choice of model depends on the hypotheses, timescale of measurements, age range covered, and other factors.

In general, these various statistical models allow investigation of important questions about developmental processes, change and stability over time, causal sequencing, and both between- and within-person sources of variability. However, researchers must carefully consider the assumptions behind the models they choose.

Longitudinal vs. Cross-Sectional Studies

Longitudinal studies and cross-sectional studies are two different observational study designs where researchers analyze a target population without manipulating or altering the natural environment in which the participants exist.

Yet, there are apparent differences between these two forms of study. One key difference is that longitudinal studies follow the same sample of people over an extended period of time, while cross-sectional studies look at the characteristics of different populations at a given moment in time.

Longitudinal studies tend to require more time and resources, but they can be used to detect cause-and-effect relationships and establish patterns among subjects.

On the other hand, cross-sectional studies tend to be cheaper and quicker but can only provide a snapshot of a point in time and thus cannot identify cause-and-effect relationships.

Both studies are valuable for psychologists to observe a given group of subjects. Still, cross-sectional studies are more beneficial for establishing associations between variables, while longitudinal studies are necessary for examining a sequence of events.

1. Are longitudinal studies qualitative or quantitative?

Longitudinal studies are typically quantitative. They collect numerical data from the same subjects to track changes and identify trends or patterns.

However, they can also include qualitative elements, such as interviews or observations, to provide a more in-depth understanding of the studied phenomena.

2. What’s the difference between a longitudinal and case-control study?

Case-control studies compare groups retrospectively and cannot be used to calculate relative risk. Longitudinal studies, though, can compare groups either retrospectively or prospectively.

In case-control studies, researchers study one group of people who have developed a particular condition and compare them to a sample without the disease.

Case-control studies are organized retrospectively around a single outcome or condition of interest, whereas longitudinal studies follow a large group of subjects forward over time and can examine many variables.

3. Does a longitudinal study have a control group?

Yes, a longitudinal study can have a control group. In such a design, one group (the experimental group) would receive treatment or intervention, while the other group (the control group) would not.

Both groups would then be observed over time to see if there are differences in outcomes, which could suggest an effect of the treatment or intervention.

However, not all longitudinal studies have a control group, especially observational ones that are not testing a specific intervention.

Baltes, P. B., & Nesselroade, J. R. (1979). History and rationale of longitudinal research. In J. R. Nesselroade & P. B. Baltes (Eds.), Longitudinal research in the study of behavior and development (pp. 1–39). Academic Press.

Cook, N. R., & Ware, J. H. (1983). Design and analysis methods for longitudinal research. Annual Review of Public Health, 4, 1–23.

Cruwys, T., Greenaway, K. H., & Haslam, S. A. (2015). The stress of passing through an educational bottleneck: A longitudinal study of psychology honours students. Australian Psychologist, 50(5), 372–381.

Dekker, S. W. A., & Schaufeli, W. B. (1995). The effects of job insecurity on psychological health and withdrawal: A longitudinal study. Australian Psychologist, 30(1), 57–63.

Fitchett, G., Rybarczyk, B., Demarco, G., & Nicholas, J. J. (1999). The role of religion in medical rehabilitation outcomes: A longitudinal study. Rehabilitation Psychology, 44, 333–353.

Harvard Second Generation Study. (n.d.). Harvard Second Generation Grant and Glueck Study. Harvard Study of Adult Development. Retrieved from https://www.adultdevelopmentstudy.org

Le Mare, L., & Audet, K. (2006). A longitudinal study of the physical growth and health of postinstitutionalized Romanian adoptees. Pediatrics & Child Health, 11(2), 85–91.

Luo, Y., Hawkley, L. C., Waite, L. J., & Cacioppo, J. T. (2012). Loneliness, health, and mortality in old age: A national longitudinal study. Social Science & Medicine, 74(6), 907–914.

Marques, S. C., Pais-Ribeiro, J. L., & Lopez, S. J. (2011). The role of positive psychology constructs in predicting mental health and academic achievement in children and adolescents: A two-year longitudinal study. Journal of Happiness Studies: An Interdisciplinary Forum on Subjective Well-Being, 12(6), 1049–1062.

Stice, E., Mazotti, L., Krebs, M., & Martin, S. (1998). Predictors of adolescent dieting behaviors: A longitudinal study. Psychology of Addictive Behaviors, 12(3), 195–205.

Thomas, L. (2020). What is a longitudinal study? Scribbr. Retrieved from https://www.scribbr.com/methodology/longitudinal-study/

Van der Vorst, H., Engels, R. C. M. E., Meeus, W., & Deković, M. (2006). Parental attachment, parental control, and early development of alcohol use: A longitudinal study. Psychology of Addictive Behaviors, 20(2), 107–116.

Further Information

  • Schaie, K. W. (2005). What can we learn from longitudinal studies of adult development?. Research in human development, 2 (3), 133-158.
  • Caruana, E. J., Roman, M., Hernández-Sánchez, J., & Solli, P. (2015). Longitudinal studies. Journal of thoracic disease, 7 (11), E537.


1.6: Longitudinal Research




Sometimes the goal of a psychological study may be to understand how people’s attitudes and behaviors change over time, or to determine what factors may predict future abilities.

These objectives can be accomplished using a longitudinal design —a research study where data are repeatedly collected from the same group of individuals for a period of time, whether it’s as short as a few weeks or months or as long as several decades.

For example, if a researcher wants to know whether college students' exercise routines change over the course of their first semester of college, she can use a longitudinal approach and ask students to repeatedly report their workout regimens. She may find that as students get more caught up in their studies, they go to the gym less often.

In addition, the same researcher may keep track of a group of people for twenty years, because she wants to explore how their exercise routines shift across their 20s, 30s and 40s. This approach allows her to best measure changes, within individuals, over time.

In this case, she may discover that those who enjoyed running outdoors in their 20s maintain blood pressure levels, display low amounts of stress, and are more likely to do yoga in their 40s.

While longitudinal research can provide informative results, the method also has its drawbacks. For instance, long-running studies can be very expensive and require a significant time-commitment from the research team and their participants.

Because of this commitment, attrition rates tend to be higher, meaning more participants drop out. For this reason, the researcher would have to recruit more individuals at the start of the study, expecting a certain number to drop out. Attrition may also cause the study's sample to be less representative of the population.

Despite its disadvantages, longitudinal research has the power to help us understand variation across human development and the lifespan.

One of the longest-running studies—following people over 80 years as opposed to comparing different groups at different ages—provides a robust measure of human growth—even revealing factors, like close relationships, that lead to people living healthy and happy lives.

Sometimes we want to see how people change over time, as in studies of human development and lifespan. When we test the same group of individuals repeatedly over an extended period of time, we are conducting longitudinal research. Longitudinal research is a research design in which data-gathering is administered repeatedly over an extended period of time. For example, we may survey a group of individuals about their dietary habits at age 20, retest them a decade later at age 30, and then again at age 40.

Let's consider another example. In recent years there has been significant growth in the popular support of same-sex marriage. Many studies on this topic break down survey participants into different age groups. In general, younger people are more supportive of same-sex marriage than are those who are older (Jones, 2013). Does this mean that as we age we become less open to the idea of same-sex marriage, or does this mean that older individuals have different perspectives because of the social climates in which they grew up? Longitudinal research is a powerful approach because the same individuals are involved in the research project over time, which means that the researchers need to be less concerned with differences among cohorts affecting the results of their study.

Often longitudinal studies are employed when researching various diseases in an effort to understand particular risk factors. Such studies often involve tens of thousands of individuals who are followed for several decades. Given the enormous number of people involved in these studies, researchers can feel confident that their findings can be generalized to the larger population. The Cancer Prevention Study-3 (CPS-3) is one of a series of longitudinal studies sponsored by the American Cancer Society aimed at determining predictive risk factors associated with cancer. When participants enter the study, they complete a survey about their lives and family histories, providing information on factors that might cause or prevent the development of cancer. Then every few years the participants receive additional surveys to complete. In the end, hundreds of thousands of participants will be tracked over 20 years to determine which of them develop cancer and which do not.

Clearly, this type of research is important and potentially very informative. For instance, earlier longitudinal studies sponsored by the American Cancer Society provided some of the first scientific demonstrations of the now well-established links between increased rates of cancer and smoking (American Cancer Society, n.d.).

As with any research strategy, longitudinal research is not without limitations. For one, these studies require an incredible time investment by the researcher and research participants. Given that some longitudinal studies take years, if not decades, to complete, the results will not be known for a considerable period of time. In addition to the time demands, these studies also require a substantial financial investment. Many researchers are unable to commit the resources necessary to see a longitudinal project through to the end.

Research participants must also be willing to continue their participation for an extended period of time, and this can be problematic. People move, get married and take new names, get ill, and eventually die. Even without significant life changes, some people may simply choose to discontinue their participation in the project. As a result, attrition rates, or the reduction in the number of research participants due to dropouts, are quite high in longitudinal studies and increase over the course of a project. For this reason, researchers using this approach typically recruit many participants, fully expecting that a substantial number will drop out before the end. As the study progresses, they continually check whether the sample still represents the larger population, and make adjustments as necessary.

This text is adapted from OpenStax, Psychology. OpenStax CNX.


Longitudinal Research: A Panel Discussion on Conceptual Issues, Research Design, and Statistical Techniques

All authors contributed equally to this article and the order of authorship is arranged arbitrarily. Correspondence concerning this article should be addressed to Mo Wang, Warrington College of Business, Department of Management, University of Florida, Gainesville, FL 32611. E-mail: [email protected]

Decision Editor: Donald Truxillo, PhD


Mo Wang, Daniel J. Beal, David Chan, Daniel A. Newman, Jeffrey B. Vancouver, Robert J. Vandenberg, Longitudinal Research: A Panel Discussion on Conceptual Issues, Research Design, and Statistical Techniques, Work, Aging and Retirement , Volume 3, Issue 1, 1 January 2017, Pages 1–24, https://doi.org/10.1093/workar/waw033


The goal of this article is to clarify the conceptual, methodological, and practical issues that frequently emerge when conducting longitudinal research, as well as in the journal review process. Using a panel discussion format, the current authors address 13 questions associated with 3 aspects of longitudinal research: conceptual issues, research design, and statistical techniques. These questions are intentionally framed at a general level so that the authors could address them from their diverse perspectives. The authors’ perspectives and recommendations provide a useful guide for conducting and reviewing longitudinal studies in work, aging, and retirement research.

An important meta-trend in work, aging, and retirement research is the heightened appreciation of the temporal nature of the phenomena under investigation and the important role that longitudinal study designs play in understanding them (e.g., Heybroek, Haynes, & Baxter, 2015 ; Madero-Cabib, Gauthier, & Le Goff, 2016 ; Wang, 2007 ; Warren, 2015 ; Weikamp & Göritz, 2015 ). This echoes the trend in more general research on work and organizational phenomena, where the discussion of time and longitudinal designs has evolved from explicating conceptual and methodological issues involved in the assessment of changes over time (e.g., McGrath & Rotchford, 1983 ) to the development and application of data analytic techniques (e.g., Chan, 1998 ; Chan & Schmitt, 2000 ; DeShon, 2012 ; Liu, Mo, Song, & Wang, 2016 ; Wang & Bodner, 2007 ; Wang & Chan, 2011 ; Wang, Zhou, & Zhang, 2016 ), theory rendering (e.g., Ancona et al. , 2001 ; Mitchell & James, 2001 ; Vancouver, Tamanini, & Yoder, 2010 ; Wang et al. , 2016 ), and methodological decisions in conducting longitudinal research (e.g., Beal, 2015 ; Bolger, Davis, & Rafaeli, 2003 ; Ployhart & Vandenberg, 2010 ). Given the importance of and the repeated call for longitudinal studies to investigate work, aging, and retirement-related phenomena (e.g., Fisher, Chaffee, & Sonnega, 2016 ; Wang, Henkens, & van Solinge, 2011 ), there is a need for more nontechnical discussions of the relevant conceptual and methodological issues. Such discussions would help researchers to make more informed decisions about longitudinal research and to conduct studies that would both strengthen the validity of inferences and avoid misleading interpretations.

In this article, using a panel discussion format, the authors address 13 questions associated with three aspects of longitudinal research: conceptual issues, research design, and statistical techniques. These questions, as summarized in Table 1 , are intentionally framed at a general level (i.e., not solely in aging-related research), so that the authors could address them from diverse perspectives. The goal of this article is to clarify the conceptual, methodological, and practical issues that frequently emerge in the process of conducting longitudinal research, as well as in the related journal review process. Thus, the authors’ perspectives and recommendations provide a useful guide for conducting and reviewing longitudinal studies—not only those dealing with aging and retirement, but also in the broader fields of work and organizational research.

Questions Regarding Longitudinal Research Addressed in This Article

Conceptual Issue Question 1: Conceptually, what is the essence of longitudinal research?

This is a fundamental question to ask given the confusion in the literature. It is common to see authors attribute their high confidence in their causal inferences to the longitudinal design they use. It is also common to see authors attribute greater confidence in their measurement because of using a longitudinal design. Less common, but with increasing frequency, authors claim to be examining the role of time in their theoretical models via the use of longitudinal designs. These different assumptions by authors illustrate the need for clarifying when specific attributions about longitudinal research are appropriate. Hence, a discussion of the essence of longitudinal research and what it provides is in order.

Oddly, definitions of longitudinal research are rare. One exception is a definition by Taris (2000) , who explained that longitudinal “data are collected for the same set of research units (which might differ from the sampling units/respondents) for (but not necessarily at) two or more occasions, in principle allowing for intra-individual comparison across time” (pp. 1–2). Perhaps more directly relevant for the current discussion of longitudinal research related to work and aging phenomena, Ployhart and Vandenberg (2010) defined “ longitudinal research as research emphasizing the study of change and containing at minimum three repeated observations (although more than three is better) on at least one of the substantive constructs of interest” (p. 97; italics in original). Compared to Taris (2000) , Ployhart and Vandenberg’s (2010) definition explicitly emphasizes change and encourages the collection of many waves of repeated measures. However, Ployhart and Vandenberg’s definition may be overly restrictive. For example, it precludes designs often classified as longitudinal such as the prospective design. In a prospective design, some criterion (i.e., presumed effect) is measured at Times 1 and 2, so that one can examine change in the criterion as a function of events (i.e., presumed causes) happening (or not) between the waves of data collection. For example, a researcher can use this design to assess the psychological and behavioral effects of retirement that occur before and after retirement. That is, psychological and behavioral variables are measured before and after retirement. Though not as internally valid as an experiment (which is not possible because we cannot randomly assign participants into retirement and non-retirement conditions), this prospective design is a substantial improvement over the typical design where the criteria are only measured at one time. This is because it allows one to more directly examine change in a criterion as a function of differences between events or person variables. Otherwise, one must draw inferences based on retrospective accounts of the change in criterion along with the retrospective accounts of the events; further, one may worry that the covariance between the criterion and person variables is due to changes in the criterion that are also changing the person. Of course, this design does not eliminate the possibility that changes in criterion may cause differences in events (e.g., changes observed in psychological and behavioral variables lead people to decide to retire).

In addition to longitudinal designs potentially having only two waves of data collection for a variable, there are certain kinds of criterion variables that need only one explicit measure at Time 2 in a 2-wave study. Retirement (or similarly, turnover) is an example. I say “explicit” because retirement is implicitly measured at Time 1. That is, if the units are in the working sample at Time 1, they have not retired. Thus, retirement at Time 2 represents change in working status. On the other hand, if retirement intentions is the criterion variable, repeated measures of this variable are important for assessing change. Repeated measures also enable the simultaneous assessment of change in retirement intentions and its alleged precursors; it could be that a variable like job satisfaction (a presumed cause of retirement intentions) is actually lowered after the retirement intentions are formed, perhaps in a rationalization process. That is, individuals first intend to retire and then evaluate over time their attitudes toward their present job. This kind of reverse causality process would not be detected in a design measuring job satisfaction at Time 1 and retirement intentions at Time 2.

Given the above, I opt for a much more straightforward definition of longitudinal research. Specifically, longitudinal research is simply research where data are collected over a meaningful span of time. A difference between this definition and the one by Taris (2000) is that this definition does not include the clause about examining intra-individual comparisons. Such designs can examine intra-individual comparisons, but again, this seems overly restrictive. That said, I do add a restriction to this definition, which is that the time span should be “meaningful.” This term is needed because time will always pass—that is, it takes time to complete questionnaires, do tasks, or observe behavior, even in cross-sectional designs. Yet, this passage of time likely provides no validity benefit. On the other hand, the measurement interval could last only a few seconds and still be meaningful. To be meaningful it has to support the inferences being made (i.e., improve the research’s validity). Thus, the essence of longitudinal research is to improve the validity of one’s inferences that cannot otherwise be achieved using cross-sectional research ( Shadish, Cook, & Campbell, 2002 ). The inferences that longitudinal research can potentially improve include those related to measurement (i.e., construct validity), causality (i.e., internal validity), generalizability (i.e., external validity), and quality of effect size estimates and hypothesis tests (i.e., statistical conclusion validity). However, the ability of longitudinal research to improve these inferences will depend heavily on many other factors, some of which might make the inferences less valid when using a longitudinal design. Increased inferential validity, particularly of any specific kind (e.g., internal validity), is not an inherent quality of the longitudinal design; it is a goal of the design. And it is important to know how some forms of the longitudinal design fall short of that goal for some inferences.

For example, consider a case where a measure of a presumed cause precedes a measure of a presumed effect, but over a time period across which one of the constructs in question does not likely change. Indeed, it is often questionable as to whether a gap of several months between the observations of many variables examined in research would change meaningfully over the interim, much less that the change in one preceded the change in the other (e.g., intention to retire is an example of this, as people can maintain a stable intention to retire for years). Thus, the design typically provides no real improvement in terms of internal validity. On the other hand, it does likely improve construct and statistical conclusion validity because it likely reduces common method bias effects found between the two variables ( Podsakoff et al., 2003 ).

Further, consider the case of the predictive validity design, where a selection instrument is measured from a sample of job applicants and performance is assessed some time later. In this case, common method bias is not generally the issue; external validity is. The longitudinal design improves external validity because the Time 1 measure is taken during the application process, which is the context in which the selection instrument will be used, and the Time 2 measure is taken after a meaningful time interval (i.e., after enough time has passed for performance to have stabilized for the new job holders). Again, however, internal validity is not much improved, which is fine given that prediction, not cause, is the primary concern in the selection context.

Another clear construct validity improvement gained by using longitudinal research is when one is interested in measuring change. A precise version of change measurement is assessing rate of change. When assessing the rate, time is a key variable in the analysis. To assess a rate one needs only two repeated measures of the variable of interest, though these measures should be taken from several units (e.g., individuals, groups, organizations) if measurement and sampling errors are present and perhaps under various conditions if systematic measurement error is possible (e.g., testing effect). Moreover, Ployhart and Vandenberg (2010) advocate at least three repeated measures because most change rates are not constant; thus, more than two observations will be needed to assess whether and how the rate changes (i.e., the shape of the growth curves). Indeed, three is hardly enough given noise in measurement and the commonality of complex processes (i.e., consider the opponent process example below).

Longitudinal research designs can, with certain precautions, improve one's confidence in inferences about causality. When this is the purpose, time does not need to be measured or included as a variable in the analysis, though the interval between measurements should be reported because rate of change and cause are related. For example, intervals can be too short, such that given the rate of an effect, the cause might not have had sufficient time to register on the effect. Alternatively, if intervals are too long, an effect might have triggered a compensating process that overshoots the original level, inverting the sign of the cause's effect. An example of this latter process is the opponent process (Solomon & Corbit, 1974). Figure 1 depicts this process, which refers to the response to an emotional stimulus. Specifically, the emotional response elicits an opponent process that, at its peak, returns the emotion toward the baseline and beyond. If the emotional response is collected when the peak opponent response occurs, it will look like the stimulus is having the opposite of its actual effect.

Figure 1. The opponent process effect demonstrated by Solomon and Corbit (1974).

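To make the timing concern concrete, here is a minimal simulation sketch in the spirit of Figure 1. The equations and parameter values are illustrative assumptions, not Solomon and Corbit's model: a fast primary emotional response is opposed by a slower process, so the apparent sign of the stimulus's effect depends on when affect is measured.

```python
import numpy as np

t = np.arange(0, 120)                                # time in arbitrary units
stimulus = ((t >= 10) & (t < 50)).astype(float)      # stimulus on from t=10 to t=50

a = np.zeros(t.size)       # primary (fast) emotional response
b = np.zeros(t.size)       # opponent (slow) process
affect = np.zeros(t.size)  # what a measure of emotion would pick up

for i in range(1, t.size):
    a[i] = a[i - 1] + 0.30 * (stimulus[i] - a[i - 1])   # rises and decays quickly
    b[i] = b[i - 1] + 0.05 * (a[i - 1] - b[i - 1])      # slowly tracks the primary process
    affect[i] = a[i] - b[i]

print("observed affect during the stimulus (t = 40):", round(affect[40], 2))   # positive
print("observed affect after stimulus offset (t = 60):", round(affect[60], 2))  # negative rebound
```

Measured at t = 60, the stimulus would appear to depress affect even though its direct effect is positive, which is exactly the interval problem described above.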

Most of the longitudinal research designs that improve internal validity are quasi-experimental (Shadish et al., 2002). For example, interrupted time series designs use repeated observations to assess trends before and after some manipulation or "natural experiment" to model possible maturation or maturation-by-selection effects (Shadish et al., 2002; Stone-Romero, 2010). Likewise, regression discontinuity designs (RDDs) use a pre-test to assign participants to the conditions prior to the manipulation and thus can use the pre-test value to model selection effects (Shadish et al., 2002; Stone-Romero, 2010). Interestingly, the RDD is not assessing change explicitly and thus is not susceptible to maturation threats, but it uses the timing of measurement in a meaningful way.
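As a rough illustration of the interrupted time series logic (not a reproduction of any particular study), the sketch below simulates a pre-existing maturation trend plus an interruption and recovers the level and slope changes with segmented regression; all variable names and values are assumptions for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
weeks = np.arange(24)
interruption = 12                                  # intervention occurs at week 12
post = (weeks >= interruption).astype(int)

y = (50 + 0.4 * weeks                              # pre-existing maturation trend
     + 4.0 * post                                  # immediate level shift at the interruption
     + 0.8 * post * (weeks - interruption)         # change in slope after the interruption
     + rng.normal(0, 1.5, size=weeks.size))

df = pd.DataFrame({
    "y": y,
    "time": weeks,
    "level": post,
    "slope_change": post * (weeks - interruption),
})

model = smf.ols("y ~ time + level + slope_change", data=df).fit()
print(model.params)   # estimates of the baseline trend, level shift, and slope change
```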

Panel (i.e., cohort) designs are also typically considered longitudinal. These designs measure all the variables of interest during each wave of data collection. I believe it was these kinds of designs that Ployhart and Vandenberg (2010) had in mind when they created their definition of longitudinal research. In particular, these designs can be used to assess rates of change and can improve causal inferences if done well. Specifically, to improve causal inferences with panel designs, researchers nearly always need at least three repeated measures of the hypothesized causes and effects. Consider the case of job satisfaction and intent to retire. If a researcher measures job satisfaction and intent to retire at Times 1 and 2 and finds that the Time 2 measures of job satisfaction and intent to retire are negatively related when the Time 1 states of the variables are controlled, the researcher still cannot tell which changed first (or if some third variable causes both to change in the interim). Unfortunately, three observations of each variable are only a slight improvement because it might be difficult, with just three waves, to obtain enough variance in changing attitudes and changing intentions to find anything significant. Indeed, the researcher might have better luck looking at actual retirement, which, as mentioned, only needs one observation. Still, two observations of job satisfaction are needed prior to the retirement to determine if changes in job satisfaction influence the probability of retirement.

Finally, on this point I would add that meaningful variance in time will often mean case-intensive designs (i.e., lots of observations of lots of variables over time per case; Bolger & Laurenceau, 2013 ; Wang et al. , 2016 ) because we will be more and more interested in assessing feedback and other compensatory processes, reciprocal relationships, and how dynamic variables change. In these cases, within-unit covariance will be much more interesting than between-unit covariance.

It is important to point out that true experimental designs are also a type of longitudinal research design by nature. This is because in an experimental design, an independent variable is manipulated before the measure of the dependent variable occurs. This time precedence (or lag) is critical for using experimental designs to achieve stronger causal inferences. Specifically, given that random assignment is used to generate experimental and control groups, researchers can assume that prior to the manipulation, the mean levels of the dependent variables, as well as the mean levels of the independent variables, are the same across experimental and control groups. Thus, by measuring the dependent variable after the manipulation, an experimental design reveals the change in the dependent variable as a function of the manipulated change in the independent variable. As such, the time lag between the manipulation and the measure of the dependent variable is indeed meaningful in the sense of achieving causal inference.

Conceptual Issue Question 2: What is the status of “time” in longitudinal research? Is “time” a general notion of the temporal dynamics in phenomena, or is “time” a substantive variable similar to other focal variables in the longitudinal study?

In longitudinal research, we are concerned with conceptualizing and assessing the changes over time that may occur in one or more substantive variables. A substantive variable refers to a measure of an intended construct of interest in the study. For example, in a study of newcomer adaptation (e.g., Chan & Schmitt, 2000 ), the substantive variables, whose changes over time we are interested in tracking, could be frequency of information seeking, job performance, and social integration. We could examine the functional form of the substantive variable’s change trajectory (e.g., linear or quadratic). We could also examine the extent to which individual differences in a growth parameter of the trajectory (e.g., the individual slopes of a linear trajectory) could be predicted from the initial (i.e., at Time 1 of the repeated measurement) values on the substantive variable, the values on a time-invariant predictor (e.g., personality trait), or the values on another time-varying variable (e.g., individual slopes of the linear trajectory of a second substantive variable in the study). The substantive variables are measures used to represent the study constructs. As measures of constructs, they have specific substantive content. We can assess the construct validity of the measure by obtaining relevant validity evidence. The evidence could be the extent to which the measure’s content represents the conceptual content of the construct (i.e., content validity) or the extent to which the measure is correlated with another established criterion measure representing a criterion construct that, theoretically, is expected to be associated with the measure (i.e., criterion-related validity).

“Time,” on the other hand, has a different ontological status from the substantive variables in the longitudinal study. There are at least three ways to describe how time is not a substantive variable similar to other focal variables in the longitudinal study. First, when a substantive construct is tracked in a longitudinal study for changes over time, time is not a substantive measure of a study construct. In the above example of newcomer adaptation study by Chan and Schmitt, it is not meaningful to speak of assessing the construct validity of time, at least not in the same way we can speak of assessing the construct validity of job performance or social integration measures. Second, in a longitudinal study, a time point in the observation period represents one temporal instance of measurement. The time point per se, therefore, is simply the temporal marker of the state of the substantive variable at the point of measurement. The time point is not the state or value of the substantive variable that we are interested in for tracking changes over time. Changes over time occur when the state of a substantive variable changes over different points of measurement. Finally, in a longitudinal study of changes over time, “time” is distinct from the substantive process that underlies the change over time. Consider a hypothetical study that repeatedly measured the levels of job performance and social integration of a group of newcomers for six time points, at 1-month intervals between adjacent time points over a 6-month period. Let us assume that the study found that the observed change over time in their job performance levels was best described by a monotonically increasing trajectory at a decreasing rate of change. The observed functional form of the performance trajectory could serve as empirical evidence for the theory that a learning process underlies the performance level changes over time. Let us further assume that, for the same group of newcomers, the observed change over time in their social integration levels was best described by a positive linear trajectory. This observed functional form of the social integration trajectory could serve as empirical evidence for a theory of social adjustment process that underlies the integration level changes over time. In this example, there are two distinct substantive processes of change (learning and social adjustment) that may underlie the changes in levels on the two respective study constructs (performance and social integration). There are six time points at which each substantive variable was measured over the same time period. Time, in this longitudinal study, was simply the medium through which the two substantive processes occur. Time was not an explanation. Time did not cause the occurrence of the different substantive processes and there was nothing in the conceptual content of the time construct that could, nor was expected to, explain the functional form or nature of the two different substantive processes. The substantive processes occur or unfold through time but they did not cause time to exist.

The way that growth modeling techniques analyze longitudinal data is consistent with the above conceptualization of time. For example, in latent growth modeling, time per se is not represented as a substantive variable in the analysis. Instead, a specific time point is coded as a temporal marker of the substantive variable (e.g., as basis coefficients in a latent growth model to indicate the time points in the sequence of repeated measurement at which the substantive variable was measured). The time-varying nature of the substantive variable is represented either at the individual level as the individual slopes or at the group level as the variance of the slope factor. It is the slopes and variance of slopes of the substantive variable that are being analyzed, and not time per se. The nature of the trajectory of change in the substantive variable is descriptively represented by the specific functional form of the trajectory that is observed within the time period of study. We may also include in the latent growth model other substantive variables, such as time-invariant predictors or time-varying correlates, to assess the strength of their associations with variance of the individual slopes of trajectory. These associations serve as validation and explanation of the substantive process of change in the focal variable that is occurring over time.
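For readers who want a concrete analogue, the sketch below fits a simple mixed-effects growth model to simulated data. It is only an approximation of the latent growth modeling described above, and the data, variable names, and effect sizes are assumptions chosen for illustration: the time codes merely mark measurement occasions, and the substantive quantities are the person-specific slopes and their association with a time-invariant predictor (here, a hypothetical extraversion score).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n_people, n_waves = 200, 6

person = np.repeat(np.arange(n_people), n_waves)
time = np.tile(np.arange(n_waves), n_people)                 # occasions coded 0..5
extraversion = np.repeat(rng.normal(0, 1, n_people), n_waves)

intercept_i = np.repeat(rng.normal(0, 1, n_people), n_waves)     # person-specific intercepts
slope_i = np.repeat(rng.normal(0.5, 0.2, n_people), n_waves)     # person-specific slopes

# Social integration grows over occasions; extraverts grow faster (Extraversion x time).
integration = (intercept_i + (slope_i + 0.3 * extraversion) * time
               + rng.normal(0, 0.5, n_people * n_waves))

df = pd.DataFrame({"id": person, "time": time,
                   "extraversion": extraversion, "integration": integration})

model = smf.mixedlm("integration ~ time * extraversion", data=df,
                    groups=df["id"], re_formula="~time").fit()
print(model.summary())   # the time:extraversion term reflects the change construct
```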

Many theories of change require the articulation of a change construct (e.g., learning, social adjustment—inferred from a slope parameter in a growth model). When specifying a change construct, the "time" variable is only used as a marker to track a substantive growth or change process. For example, when we say, "Extraversion × time interaction effect" on newcomer social integration, we really mean that Extraversion relates to the change construct of social adjustment (i.e., where social adjustment is operationalized as the slope parameter from a growth model of individuals' social integration over time). Likewise, when we say, "Conscientiousness × time² quadratic interaction effect" on newcomer task performance, we really mean that Conscientiousness relates to the change construct of learning (where learning is operationalized as the nonlinear slope of task performance over time).

This view of time brings up a host of issues with scaling and calibration of the time variable to adequately assess the underlying substantive change construct. For example, should work experience be measured in number of years in the job versus number of assignments completed ( Tesluk & Jacobs, 1998 )? Should the change construct be thought of as a developmental age effect, historical period effect, or birth cohort effect ( Schaie, 1965 )? Should the study of time in teams reflect developmental time rather than clock time, and thus be calibrated to each team’s lifespan ( Gersick, 1988 )? As such, although time is not a substantive variable itself in longitudinal research, it is important to make sure that the measurement of time matches the theory that specifies the change construct that is under study (e.g., aging, learning, adaptation, social adjustment).

I agree that time is typically not a substantive variable, but that it can serve as a proxy for substantive variables if the process is well-known. The example about learning by Chan is a case in point. Of course, well-known temporal processes are rare and I have often seen substantive power mistakenly given to time: For example, it is the process of oxidation, not the passage of time that is responsible for rust. However, there are instances where time plays a substantive role. For example, temporal discounting ( Ainslie & Haslam, 1992 ) is a theory of behavior that is dependent on time. Likewise, Vancouver, Weinhardt, and Schmidt’s (2010) theory of multiple goal pursuit involves time as a key substantive variable. To be sure, in that latter case the perception of time is a key mediator between time and its hypothetical effects on behavior, but time has an explicit role in the theory and thus should be considered a substantive variable in tests of the theory.

I was referring to objective time when explaining that time is not a substantive variable in longitudinal research and that it is instead the temporal medium through which a substantive process unfolds or a substantive variable changes its state. When we discuss theories of substantive phenomena or processes involving temporal constructs, such as temporal discounting, time urgency, or polychronicity related to multitasking or multiple goal pursuits, we are in fact referring to subjective time, which is the individual’s psychological experience of time. Subjective time constructs are clearly substantive variables. The distinction between objective time and subjective time is important because it provides conceptual clarity to the nature of the temporal phenomena and guides methodological choices in the study of time (for details, see Chan, 2014 ).

Conceptual Issue Question 3: What are the procedures, if any, for developing a theory of changes over time in longitudinal research? Given that longitudinal research purportedly addresses the limitations of cross-sectional research, can findings from cross-sectional studies be useful for the development of a theory of change?

To address this question, what follows is largely an application of some of the ideas presented by Mitchell and James (2001) and by Ployhart and Vandenberg (2010) in their respective publications. Thus, credit for the following should be given to those authors, and consultation of their articles as to specifics is highly encouraged.

Before we specifically address this question, it is important to understand our motive for asking it. Namely, as most succinctly stated by Mitchell and James (2001) , and repeated by, among others, Bentein and colleagues (2005) , Chan (2002 , 2010 ), and Ployhart and Vandenberg (2010) , there is an abundance of published research in the major applied psychology and organizational science journals in which the authors are not operationalizing through their research designs the causal relations among their focal independent, dependent, moderator, and mediator variables even though the introduction and discussion sections imply such causality. Mitchell and James (2001) used the published pieces in the most recent issues (at that time) of the Academy of Management Journal and Administrative Science Quarterly to anchor this point. At the crux of the problem is using designs in which time is not a consideration. As they stated so succinctly:

“At the simplest level, in examining whether an X causes a Y, we need to know when X occurs and when Y occurs. Without theoretical or empirical guides about when to measure X and Y, we run the risk of inappropriate measurement, analysis, and, ultimately, inferences about the strength, order, and direction of causal relationships” (italics added; Mitchell & James, 2001, p. 530).

When is key because it is at the heart of causality in its simplest form, as in the “cause must precede the effect” ( James, Mulaik, & Brett, 1982 ; Condition 3 of 10 for inferring causality, p. 36). Our casual glance at the published literature over the decade since Mitchell and James (2001) indicates that not much has changed in this respect. Thus, our motive for asking the current question is quite simple—“perhaps it’s ‘time’ to put these issues in front of us once more (pun intended), particularly given the increasing criticisms as to the meaningfulness of published findings from studies with weak methods and statistics” (e.g., statistical myths and urban legends, Lance & Vandenberg, 2009 ).

The first part of the question asks, “what are the procedures, if any, for developing a theory of change over time in longitudinal research?” Before addressing procedures per se, it is necessary first to understand some of the issues when incorporating change into research. Doing so provides a context for the procedures. Ployhart and Vandenberg (2010) noted four theoretical issues that should be addressed when incorporating change in the variables of interest across time. These were:

“To the extent possible, specify a theory of change by noting the specific form and duration of change and predictors of change.

Clearly articulate or graph the hypothesized form of change relative to the observed form of change.

Clarify the level of change of interest: group average change, intraunit change, or interunit differences in intraunit change.

Realize that cross-sectional theory and research may be insufficient for developing theory about change. You need to focus on explaining why the change occurs” (p. 103).

The interested reader is encouraged to consult Ployhart and Vandenberg (2010) as to the specifics underlying the four issues, but they were heavily informed by Mitchell and James (2001). Please note that, as one means of operationalizing time, Mitchell and James (2001) focused on time very broadly in the context of strengthening causal inferences about change across time in the focal variables. Thus, Ployhart and Vandenberg's (2010) argument, with its sole emphasis on change, is nested within the Mitchell and James (2001) perspective. I raise this point because it is in this vein that the four theoretical issues presented above have as their foundation the five theoretical issues addressed by Mitchell and James (2001). Specifically, first, we need to know the time lag between X and Y. How long after X occurs does Y occur? Second, X and Y have durations. Not all variables occur instantaneously. Third, X and Y may change over time. We need to know the rate of change. Fourth, in some cases we have dynamic relationships in which X and Y both change. The rate of change for both variables should be known, as well as how the X–Y relationship changes. Fifth, in some cases we have reciprocal causation: X causes Y and Y causes X. This situation requires an understanding of two sets of lags, durations, and possibly rates. The major point of both sets of authors is that these theoretical issues need to be addressed first in that they should be the key determinants in designing the overall study; that is, deciding upon the procedures to use.

Although Mitchell and James (2001, see p. 543) focused on informing procedures through theory in the broader context of time (e.g., drawing upon studies and research that may not be in our specific area of interest, going to the workplace and actually observing the causal sequence, etc.), our specific question focuses on change across time. In this respect, Ployhart and Vandenberg (2010, Table 1, p. 103) identified five methodological and five analytical procedural issues that should be informed by the nature of the change. These are:

“Methodological issues

1. Determine the optimal number of measurement occasions and their intervals to appropriately model the hypothesized form of change.

2. Whenever possible, choose samples most likely to exhibit the hypothesized form of change, and try to avoid convenience samples.

3. Determine the optimal number of observations, which in turn means addressing the attrition issue before conducting the study. Prepare for the worst (e.g., up to a 50% drop from the first to the last measurement occasion). In addition, whenever possible, try to model the hypothesized “cause” of missing data (ideally theorized and measured a priori) and consider planned missingness approaches to data collection.

4. Introduce time lags between intervals to address issues of causality, but ensure the lags are neither too long nor too short.

5. Evaluate the measurement properties of the variable for invariance (e.g., configural, metric) before testing whether change has occurred.

Analytical issues

1. Be aware of potential violations in statistical assumptions inherent in longitudinal designs (e.g., correlated residuals, nonindependence).

2. Describe how time is coded (e.g., polynomials, orthogonal polynomials) and why.

3. Report why you use a particular analytical method and its strengths and weaknesses for the particular study.

4. Report all relevant effect sizes and fit indices to sufficiently evaluate the form of change.

5. It is easy to ‘overfit’ the data; strive to develop a parsimonious representation of change.”
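As a small illustration of analytical issue 2 above (how time is coded), the following sketch contrasts raw polynomial time codes with orthogonal polynomial codes obtained via a QR decomposition; the six equally spaced occasions are an assumption for illustration.

```python
import numpy as np

occasions = np.arange(6, dtype=float)          # occasions 0, 1, ..., 5

# Raw polynomial coding: time and time**2 are highly collinear
raw = np.column_stack([occasions, occasions ** 2])

# Orthogonal polynomial coding: columns of Q beyond the constant are uncorrelated
basis = np.column_stack([np.ones_like(occasions), occasions, occasions ** 2])
q, _ = np.linalg.qr(basis)
orthogonal = q[:, 1:]                          # drop the constant column

print(np.corrcoef(raw[:, 0], raw[:, 1])[0, 1])                  # close to 1
print(np.corrcoef(orthogonal[:, 0], orthogonal[:, 1])[0, 1])    # essentially 0
```

Either coding describes the same trajectory, but the orthogonal version makes the linear and quadratic growth components easier to estimate and interpret separately.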

In summary, the major point from the above is to encourage researchers to develop a thorough conceptual understanding of time as it relates to defining the causal relationships between the focal variables of interest. We acknowledge that researchers are generally good at conceptualizing why their x-variables cause some impact on their y-variables. What is called for here goes beyond just understanding why; it requires forcing ourselves to be very specific about the timing between the variables. Doing so will result in stronger studies and ones in which our inferences from the findings can confidently include statements about causality—a level of confidence that is sorely lacking in most published studies today. As succinctly stated by Mitchell and James (2001), "With impoverished theory about issues such as when events occur, when they change, or how quickly they change, the empirical researcher is in a quandary. Decisions about when to measure and how frequently to measure critical variables are left to intuition, chance, convenience, or tradition. None of these are particularly reliable guides" (p. 533).

The latter quote serves as a segue to address the second part of our question, “Given that longitudinal research purportedly addresses the limitations of cross-sectional research, can findings from cross-sectional studies be useful for the development of a theory of change?” Obviously, the answer here is “it depends.” In particular, it depends on the design contexts around which the cross-sectional study was developed. For example, if the study was developed strictly following many of the principles for designing quasi-experiments in field settings spelled out by Shadish, Cook, and Campbell (2002) , then it would be very useful for developing a theory of change on the phenomenon of interest. Findings from such studies could inform decisions as to how much change needs to occur across time in the independent variable to see measurable change in the dependent variable. Similarly, it would help inform decisions as to what the baseline on the independent variable needs to be, and what amount of change from this baseline is required to impact the dependent variable. Another useful set of cross-sectional studies would be those developed for the purpose of verifying within field settings the findings from a series of well-designed laboratory experiments. Again, knowing issues such as thresholds, minimal/maximal values, and intervals or timing of the x -variable onset would be very useful for informing a theory of change. A design context that would be of little use for developing a theory of change is the case where a single cross-sectional study was completed to evaluate the conceptual premises of interest. The theory underlying the study may be useful, but the findings themselves would be of little use.

Few theories are not theories of change. Most, however, are not sufficiently specified. That is, they leave much to the imagination. Moreover, they often leave to the imagination the implications of the theory for behavior. My personal bias is that theories of change should generally be computationally rendered to reduce vagueness, provide a test of internal coherence, and support the development of predictions. One immediately obvious conclusion one will draw when attempting to create a formal computational theoretical model is that we have little empirical data on rates of change.

The procedures for developing a computational model are the following (Vancouver & Weinhardt, 2012; also see Wang et al., 2016). First, take variables from (a) existing theory (verbal or static mathematical theory), (b) qualitative studies, (c) deductive reasoning, or (d) some combination of these. Second, determine which variables are dynamic. Dynamic variables have "memory" in that they retain their value over time, changing only as a function of processes that move the value in one direction or another at some rate or some changing rate. Third, describe the processes that would affect these dynamic variables (if using existing theory, this likely involves other variables in the theory), or the rates and direction of change to the dynamic variables if the processes that affect the rates are beyond the theory. Fourth, represent formally (e.g., mathematically) the effect of the variables on each other. Fifth, simulate the model to see if it (a) works (e.g., no out-of-bounds values generated), (b) produces the phenomena the theory is presumed to explain, (c) produces patterns of data over time (trajectories; relationships) that match (or could be matched to) data, and (d) shows whether variance in exogenous variables (i.e., ones presumably not affected by other variables in the model) affects trajectories/relationships (called sensitivity analysis). For example, if we build a computational model to understand retirement timing, it will be critical to simulate the model to make sure that it generates predictions in a realistic way (e.g., the simulation should not generate too many cases where retirement happens after the person is 90 years old). It will also be important to see whether the predictions generated from the model match the actual empirical data (e.g., the average retirement age based on simulation should match the average retirement age in the target population) and whether the predictions are robust when the model's input factors take on a wide range of values.
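To illustrate these five steps, here is a deliberately toy computational model of retirement timing; its functional form, parameter values, and threshold are assumptions for demonstration, not a validated theory. It runs the simulation checks described above: the model stays in bounds, produces plausible retirement ages, and can be probed with a crude sensitivity analysis.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_agent(accumulation_rate=0.02):
    age = 55.0
    satisfaction = rng.uniform(0.2, 0.9)      # exogenous, fixed per person here
    intention = 0.0                           # dynamic variable with "memory"
    while age < 90:                           # guard against out-of-bounds runs
        # intention grows in proportion to dissatisfaction, plus noise
        intention += accumulation_rate * (1.0 - satisfaction) + rng.normal(0, 0.005)
        intention = max(intention, 0.0)
        if intention >= 1.0:                  # threshold: the person retires
            return age
        age += 1.0 / 12.0                     # one month per step
    return age                                # censored at 90 (a model check)

ages = np.array([simulate_agent() for _ in range(2000)])
print("mean simulated retirement age:", round(ages.mean(), 1))
print("share still working at 90 (should be small):", np.mean(ages >= 90))

# Crude sensitivity analysis: vary an exogenous parameter and watch the output
for rate in (0.01, 0.02, 0.04):
    ages = np.array([simulate_agent(rate) for _ in range(2000)])
    print(f"accumulation rate {rate}: mean retirement age = {ages.mean():.1f}")
```

In an actual application, the mean simulated retirement age and its distribution would be compared against empirical data for the target population before the model's predictions were taken seriously.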

As mentioned above, many theories of change require the articulation of a change construct (e.g., learning, aging, social adjustment—inferred from a slope parameter in a growth model). A change construct must be specified in terms of: (a) its theoretical content (e.g., what is changing when we say "learning" or "aging"?), (b) the form of change (linear vs. quadratic vs. cyclical), and (c) the rate of change (does the change process meaningfully occur over minutes vs. weeks?). One salient problem is how to develop theory about the form of change (linear vs. nonlinear/quadratic) and the rate of change (how fast?). For instance, a quadratic/nonlinear time effect can be due to a substantive process of diminishing returns to time (e.g., a learning curve), or to ceiling (or floor) effects (i.e., hitting the high end of a measurement instrument, past which it becomes impossible to see continued growth in the latent construct). Indeed, only a small fraction of the processes we study would turn out to be linear if we used more extended time frames in the longitudinal design. That is, most apparently linear processes result from the researcher zooming in on a nonlinear process in a way that truncates the time frame. This issue is directly linked to the presumed rate of change of a phenomenon (e.g., a process that looks nonlinear in a 3-month study might look linear in a 3-week study). So when we are called upon to theoretically justify why we hypothesize a linear effect instead of a nonlinear effect, we must derive a theory of what the passage of time means. This would involve three steps: (a) naming the substantive process for which time is a marker (e.g., see answers to Question #2 above), (b) theorizing the rate of this process (e.g., over weeks vs. months), which will be more fruitful if it hinges on related past empirical longitudinal research than if it hinges on armchair speculation about time (i.e., the appropriate theory development sequence here is "past data → theory → new data," and not simply "theory → new data"; the empirical origins of theory are an essential step), and (c) disavowing nonlinear forces (e.g., diminishing returns to time, periodicity) within the chosen time frame of the study.
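A brief numerical illustration of this "zoomed-in" problem, with an assumed functional form and rate: a saturating learning curve is fit with a straight line over a short versus a long study window, showing how easily a nonlinear process passes for linear when the time frame is truncated.

```python
import numpy as np

def learning_curve(weeks, rate=0.05):
    return 1.0 - np.exp(-rate * weeks)        # diminishing returns to time

def r_squared_of_linear_fit(x, y):
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    return 1.0 - residuals.var() / y.var()

short_window = np.linspace(0, 3, 20)           # a 3-week study
long_window = np.linspace(0, 104, 20)          # a 2-year (104-week) study

print("R^2 of a linear fit, 3-week window:",
      round(r_squared_of_linear_fit(short_window, learning_curve(short_window)), 4))
print("R^2 of a linear fit, 2-year window:",
      round(r_squared_of_linear_fit(long_window, learning_curve(long_window)), 4))
```

The short window yields a nearly perfect linear fit even though the underlying process is nonlinear; only the longer window reveals the curvature.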

Research Design Question 1: What are some of the major considerations that one should take into account before deciding to employ a longitudinal study design?

As with all research, the design needs to allow the researcher to address the research question. For example, if one is seeking to assess a change rate, one needs to ask if it is safe to assume that the form of change is linear. If not, one will need more than two waves or will need to use continuous sampling. One might also use a computational model to assess whether violations of the linearity assumption are important. The researcher needs to also have an understanding of the likely time frame across which the processes being examined occur. Alternatively, if the time frame is unclear, the researcher should sample continuously or use short intervals. If knowing the form of the change is desired, then one will need enough waves of data collection in which to comprehensively capture the changes.

If one is interested in assessing causal processes, more issues need to be considered. For example, what are the processes of interest? What are the factors affecting the processes or the rates of the processes? What is the form of the effect of these factors? And perhaps most important, what alternative process could be responsible for effects observed?

For example, consider proactive socialization (Morrison, 2002). The processes of interest are those involved in determining proactive information seeking. One observation is that the rate of proactive information seeking drops with the tenure of an employee (Chan & Schmitt, 2000). Moreover, the form of the drop is asymptotic to a floor (Vancouver, Tamanini et al., 2010). The uncertainty reduction model predicts that proactive information seeking will drop over time because knowledge increases (i.e., uncertainty decreases). An alternative explanation is that ego costs grow over time: one feels that one will look more foolish asking for information the longer one's tenure (Ashford, 1986). To distinguish these explanations for a drop in information seeking over time, one might want to look at whether the transparency of the reason to seek information would moderate the negative change trend of information seeking. For the uncertainty reduction model, transparency should not matter, but for the ego-based model, transparency and legitimacy of reason should matter. Of course, it might be that both processes are at work. As such, the researcher may need a computational model or two to help think through the effects of the various processes and whether the forms of the relationships depend on the processes hypothesized (e.g., Vancouver, Tamanini et al., 2010).

Research Design Question 2: Are there any design advantages of cross-sectional research that might make it preferable to longitudinal research? That is, what would be lost and what might be gained if a moratorium were placed on cross-sectional research?

Cross-sectional research is easier to conduct than longitudinal research, but it often estimates the wrong parameters. Interestingly, researchers typically overemphasize the first fact (the ease of cross-sectional research) and underemphasize the second (that cross-sectional studies estimate the wrong thing). Cross-sectional research has the advantage of allowing broader sampling of participants, because the studies are faster, cheaper, and impose less participant burden; it also allows broader sampling of constructs, because participant anonymity is possible in cross-sectional designs, which permits more honest and complete measurement of sensitive concepts such as counterproductive work behavior.

Also, when the theoretical process at hand has a very short time frame (e.g., minutes or seconds), then cross-sectional designs can be entirely appropriate (e.g., for factor analysis/measurement modeling, because it might only take a moment for a latent construct to be reflected in a survey response). Also, first-stage descriptive models of group differences (e.g., sex differences in pay; cross-cultural differences in attitudes; and other “black box” models that do not specify a psychological process) can be suggestive even with cross-sectional designs. Cross-sectional research can also be condoned in the case of a 2-study design wherein cross-sectional data are supplemented with lagged/longitudinal data.

But in the end, almost all psychological theories are theories of change (at least implicitly) [contrary to Ployhart and Vandenberg (2010), I tend to believe that "cross-sectional theory" does not actually exist—theories are inherently longitudinal, whereas models and evidence can be cross-sectional]. Thus, longitudinal and time-lagged designs are indispensable, because they allow researchers to begin answering four types of questions: (a) causal priority, (b) future prediction, (c) change, and (d) temporal external validity. To define and compare cross-sectional against longitudinal and time-lagged designs, I refer to Figure 2. Figure 2 displays three categories of discrete-time designs: cross-sectional (X and Y measured at the same time; Figure 2a), lagged (Y measured after X by a delay of duration t; Figure 2b), and longitudinal (Y measured at three or more points in time; Figure 2c) designs. First note that, across all time designs, a1 denotes the cross-sectional parameter (i.e., the correlation between X1 and Y1). In other words, if X is job satisfaction and Y is retirement intentions, a1 denotes the cross-sectional correlation between these two variables at t1. To understand the value (and limitations) of cross-sectional research, we will look at the role of the cross-sectional parameter (a1) in each of the Figure 2 models.

Figure 2. Time-based designs for two constructs, X and Y: (a) cross-sectional design, (b) lagged designs, (c) longitudinal designs.


For assessing causal priority, the lagged models and panel model are most relevant. The time-lagged b1 parameter (i.e., the correlation between X1 and Y2; e.g., predictive validity) aids in future prediction, but tells us little about causal priority. In contrast, the panel regression b1′ parameter from the cross-lagged panel regression (in Figure 2b) and the cross-lagged panel model (in Figure 2c) tells us more about causal priority from X to Y (Kessler & Greenberg, 1981; Shingles, 1985), and is a function of the b1 parameter and the cross-sectional a1 parameter [b1′ = (b1 − a1·r(Y1,Y2)) / (1 − a1²)]. For testing theories that X begets Y (i.e., X → Y), the lagged parameter b1′ can be extremely useful, whereas the cross-sectional parameter a1 is the wrong parameter (indeed, a1 is often negatively related to b1′). That is, a1 does not estimate X → Y, but it is usually negatively related to that estimate (via the above formula for b1′). Using the example of job satisfaction and retirement intentions, if we would like to know about the causal priority from job satisfaction to retirement intentions, we should at least measure both job satisfaction and retirement intentions at t1 and then measure retirement intentions at t2. Deriving the estimate for b1′ involves regressing retirement intentions at t2 on job satisfaction at t1, while controlling for the effect of retirement intentions at t1.
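The following sketch, using simulated data with assumed parameter values, shows that the regression estimate of b1′ (retirement intentions at t2 regressed on job satisfaction at t1, controlling for intentions at t1, with standardized variables) matches the correlation-based formula above.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 5000

x1 = rng.normal(size=n)                                     # e.g., job satisfaction at t1
y1 = 0.4 * x1 + rng.normal(scale=0.9, size=n)               # retirement intentions at t1
y2 = 0.5 * y1 + 0.2 * x1 + rng.normal(scale=0.8, size=n)    # retirement intentions at t2

z = lambda v: (v - v.mean()) / v.std()                      # standardize to work with correlations
x1, y1, y2 = z(x1), z(y1), z(y2)

a1 = np.corrcoef(x1, y1)[0, 1]                              # cross-sectional parameter
b1 = np.corrcoef(x1, y2)[0, 1]                              # lagged X1-Y2 correlation
r_y1y2 = np.corrcoef(y1, y2)[0, 1]

b1_prime_formula = (b1 - a1 * r_y1y2) / (1 - a1 ** 2)       # b1' = (b1 - a1*r(Y1,Y2)) / (1 - a1^2)

ols = sm.OLS(y2, sm.add_constant(np.column_stack([x1, y1]))).fit()
b1_prime_regression = ols.params[1]                         # coefficient on X1, controlling Y1

print(round(b1_prime_formula, 3), round(b1_prime_regression, 3))   # the two estimates match
```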

For future prediction, the autoregressive model and growth model in Figure 2c are most relevant. One illustrative empirical phenomenon is validity degradation, which means the X–Y correlation tends to shrink as the time interval between X and Y increases (Keil & Cortina, 2001). Validity degradation and patterns of stability have been explained via simplex autoregressive models (Hulin, Henry, & Noon, 1990; Humphreys, 1968; Fraley, 2002), which express the X–Y correlation as r(X1, Y(1+k)) = a1·g^k, where k is the number of time intervals separating X and Y. Notice the cross-sectional parameter a1 in this formula serves as a multiplicative constant in the time-lagged X–Y correlation, but is typically quite different from the time-lagged X–Y correlation itself. Using the example of extraversion and retirement intentions, validity degradation means that the effect of extraversion at t1 on the measure of retirement intentions is likely to decrease over time, depending on how stable retirement intentions are. Therefore, relying on a1 to gauge how well extraversion can predict future retirement intentions is likely to overestimate the predictive effect of extraversion.
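A few lines suffice to illustrate the validity degradation implied by this simplex formula (the a1 and g values are assumptions):

```python
a1, g = 0.40, 0.85    # assumed cross-sectional correlation and stability of Y

for k in range(0, 6):
    # predicted lagged correlation shrinks geometrically with the number of intervals k
    print(f"k = {k}: predicted r(X1, Y{1 + k}) = {a1 * g ** k:.3f}")
```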

Another pertinent model is the latent growth model (Chan, 1998; Ployhart & Hakel, 1998), which explains longitudinal data using a time intercept and slope. In the linear growth model in Figure 2, the cross-sectional a1 parameter is equal to the relationship between X1 and the Y intercept, when t1 = 0. I also note that from the perspective of the growth model, the validity degradation phenomenon (e.g., Hulin et al., 1990) simply means that X1 has a negative relationship with the Y slope. Thus, again, the cross-sectional a1 parameter merely indicates the initial state of the X and Y relationship in a longitudinal system, and will only offer a reasonable estimate of future prediction of Y under the rare conditions when g ≈ 1.0 in the autoregressive model (i.e., Y is extremely stable), or when i ≈ 0 in the growth model (i.e., X does not predict the Y slope; Figure 2c).

For studying change, I refer to the growth model (where both X and the Y intercept explain change in Y [or the Y slope]) and the coupled growth model (where the X intercept, Y intercept, change in X, and change in Y all interrelate) in Figure 2c. Again, in these models the cross-sectional a1 parameter is the relationship between the X and Y intercepts, when the slopes are specified with time centered at t1 = 0 (where t1 refers arbitrarily to any time point when the cross-sectional data were collected). In the same way that intercepts tell us very little about slopes (ceiling and floor effects notwithstanding), the cross-sectional a1 parameter tells us almost nothing about change parameters. Again, using the example of the job satisfaction and retirement intentions relationship, to understand change in retirement intentions over time, it is important to gauge the effects of the initial status of job satisfaction (i.e., the job satisfaction intercept) and change in job satisfaction (i.e., the job satisfaction slope) on change in retirement intentions (i.e., the slope of retirement intentions).

Finally, temporal external validity refers to the extent to which an effect observed at one point in time generalizes across other occasions. This includes longitudinal measurement equivalence (e.g., whether the measurement metric of the concept or the meaning of the concept may change over time; Schmitt, 1982), stability of bivariate relationships over time (e.g., job satisfaction relates more weakly to turnover when the economy is bad; Carsten & Spector, 1987), the stationarity of cross-lagged parameters across measurement occasions (b1′ = b2′; see the cross-lagged panel model in Figure 2c; e.g., Cole & Maxwell, 2003), and the ability to identify change as an effect of participant age/tenure/development—not an effect of birth/hire cohort or historical period (Schaie, 1965). Obviously, cross-sectional data have nothing to say about temporal external validity.

Should there be a moratorium on cross-sectional research? Because any single wave of a longitudinal design is itself cross-sectional data, a moratorium is not technically possible. However, there should be (a) an explicit acknowledgement of the different theoretical parameters in Figure 2, and (b) a general moratorium on treating the cross-sectional a1 parameter as though it implies causal priority (cf. the panel regression parameter b1′), future prediction (cf. the panel regression, autoregressive, and growth models), change (cf. the growth models), or temporal external validity. This recommendation is tantamount to a moratorium on cross-sectional research papers, because almost all theories imply the lagged and/or longitudinal parameters in Figure 2. As noted earlier, cross-sectional data are easier to get, but they estimate the wrong parameter.

I agree with Newman that most theories are about change or should be (i.e., we are interested in understanding processes and, of course, processes occur over time). I am also in agreement that cross-sectional designs are of almost no value for assessing theories of change. Therefore, I am interested in getting to a place where most research is longitudinal, and where top journals rarely publish papers with only a cross-sectional design. However, as Newman points out, some research questions can still be addressed using cross-sectional designs. Therefore, I would not support a moratorium on cross-sectional research papers.

Research Design Question 3: In a longitudinal study, how do we decide on the length of the interval between two adjacent time points?

This question needs to be addressed together with the question of how many time points of measurement to administer in a longitudinal study. It is well established that intra-individual changes cannot be adequately assessed with only two time points because (a) a two-point measurement by necessity produces a linear trajectory and therefore is unable to empirically detect the functional form of the true change trajectory and (b) time-related (random or correlated) measurement error and true change over time are confounded in the observed change in a two-point measurement situation (for details, see Chan, 1998; Rogosa, 1995; Singer & Willett, 2003). Hence, the minimum number of time points for assessing intra-individual change is three, but more than three is better for obtaining a more reliable and valid assessment of the change trajectory (Chan, 1998). However, this does not mean that a larger number of time points is always better or more accurate than a smaller number. Given that the total time period of study captures the change process of interest, the number of time points should be determined by the appropriate location of the time points. This brings us to the practical question of choosing the appropriate length of the interval between adjacent time points.

The correct length of the time interval between adjacent time points in a longitudinal study is critical because it directly affects the observed functional form of the change trajectory and in turn the inference we make about the true pattern of change over time ( Chan, 1998 ). What then should be the correct length of the time interval between adjacent time points in a longitudinal study? Put simply, the correct or optimal length of the time interval will depend on the specific substantive change phenomenon of interest. This means it is dependent on the nature of the substantive construct, its underlying process of change over time, and the context in which the change process is occurring which includes the presence of variables that influence the nature and rate of the change. In theory, the time interval for data collection is optimal when the time points are appropriately spaced in such a way that it allows the true pattern of change over time to be observed during the period of study. When the observed time interval is too short or too long as compared to the optimal time interval, true patterns of change will get masked or false patterns of change will get observed.

The problem is we almost never know what this optimal time interval is, even if we have a relatively sound theory of the change phenomenon. This is because our theories of research phenomena are often static in nature. Even when our theories are dynamic and focus on change processes, they are almost always silent on the specific length of the temporal dimension through which the substantive processes occur over time ( Chan, 2014 ).

In practice, researchers determine their choice of the length of the time interval in conjunction with the choice of number of time points and the choice of the length of the total time period of study. Based on my experiences as an author, reviewer, and editor, I suspect that these three choices are influenced by the specific resource constraints and opportunities faced by the researchers when designing and conducting the longitudinal study. Deviation from optimal time intervals probably occurs more frequently than we would like, since decisions on time intervals between measures in a study are often pragmatic and atheoretical. When we interpret findings from longitudinal studies, we should consider the possibility that the study may have produced patterns of results that led to wrong inferences because the study did not reflect the true changes over time.

Given that our theories of phenomena are not at the stage where we could specify the optimal time intervals, the best we could do now is to explicate the nature of the change processes and the effects of the influencing factors to serve as guides for decisions on time intervals, number of time points, and total time period of study. For example, in research on sense-making processes in newcomer adaptation, the total period of study often ranged from 6 months to 1 year, with 6 to 12 time points, equally spaced at time intervals of 1 or 2 months between adjacent time points. A much longer time interval and total time period, ranging from several months to several years, would be more appropriate for a change process that should take a longer time to manifest itself, such as development of cognitive processes or skill acquisition requiring extensive practice or accumulation of experiences over time. On the other extreme, a much shorter time interval and total time period, ranging from several hours to several days, will be appropriate for a change process that should take a short time to manifest itself such as activation or inhibition of mood states primed by experimentally manipulated events.

Research Design Question 4: As events occur in our daily life, our mental representations of these events may change as time passes. How can we determine the point(s) in time at which the representation of an event is appropriate? How can these issues be addressed through design and measurement in a study?

In some cases, longitudinal researchers will wish to know the nature and dynamics of one’s immediate experiences. In these cases, the items included at each point in time will simply ask participants to report on states, events, or behaviors that are relatively immediate in nature. For example, one might be interested in an employee’s immediate affective experiences, task performance, or helping behavior. This approach is particularly common for intensive, short-term longitudinal designs such as experience sampling methods (ESM; Beal & Weiss, 2003 ). Indeed, the primary objective of ESM is to capture a representative sample of points within one’s day to help understand the dynamic nature of immediate experience ( Beal, 2015 ; Csikszentmihalyi & Larson, 1987 ). Longitudinal designs that have longer measurement intervals may also capture immediate experiences, but more often will ask participants to provide some form of summary of these experiences, typically across the entire interval between each measurement occasion. For example, a panel design with a 6-month interval may ask participants to report on affective states, but include a time frame such as “since the last survey” or “over the past 6 months”, requiring participants to mentally aggregate their own experiences.

As one might imagine, there also are various designs and approaches that range between the end points of immediate experience and experiences aggregated over the entire interval. For example, an ESM study might examine one’s experiences since the last survey. These intervals obviously are close together in time, and therefore are conceptually similar to one’s immediate state; nevertheless, they do require both increased levels of recall and some degree of mental aggregation. Similarly, studies with a longer time interval (e.g., 6-months) might nevertheless ask about one’s relatively recent experiences (e.g., affect over the past week), requiring less in terms of recall and mental aggregation, but only partially covering the events of the entire intervening interval. As a consequence, these two approaches and the many variations in between form a continuum of abstraction containing a number of differences that are worth considering.

Differences in Stability

Perhaps the most obvious difference across this continuum of abstraction is that different degrees of aggregation are captured. As a result, items will reflect more or less stable estimates of the phenomenon of interest. Consider the hypothetical temporal break-down of helping behavior depicted in Figure 3 . No matter how unstable the most disaggregated level of helping behavior may appear, aggregations of these behaviors will always produce greater stability. So, asking about helping behavior over the last hour will produce greater observed variability (i.e., over the entire scale) than averages of helping behavior over the last day, week, month, or one’s overall general level. Although it is well-known that individuals do not follow a strict averaging process when asked directly about a higher level of aggregation (e.g., helping this week; see below), it is very unlikely that such deviations from a straight average will result in less stability at higher levels of aggregation.

Figure 3. Hypothetical variability of helping behavior at different levels of aggregation.


The reason why this increase in stability is likely to occur regardless of the actual process of mental aggregation is that presumably, as you move from shorter to longer time frames, you are estimating either increasingly stable aspects of an individual’s dispositional level of the construct, or increasingly stable features of the context (e.g., a consistent workplace environment). As you move from longer to shorter time frames you are increasingly estimating immediate instances of the construct or context that are influenced not only by more stable predictors, but also dynamic trends, cycles, and intervening events ( Beal & Ghandour, 2011 ). Notably, this stabilizing effect exists independently of the differences in memory and mental aggregation that are described below.
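A quick simulated analogue of the point illustrated in Figure 3 (the numbers are assumptions): the same stream of hourly helping behavior, aggregated to daily and weekly averages, yields progressively less variable, more stable estimates.

```python
import numpy as np

rng = np.random.default_rng(11)
hours_per_day, days = 8, 80
trait_level = 2.0                                        # the person's general level

hourly = trait_level + rng.normal(0, 1.0, size=hours_per_day * days)   # momentary fluctuations
daily = hourly.reshape(days, hours_per_day).mean(axis=1)               # daily averages
weekly = daily[: (days // 5) * 5].reshape(-1, 5).mean(axis=1)          # 5-day (weekly) averages

for label, series in [("hourly", hourly), ("daily", daily), ("weekly", weekly)]:
    print(f"{label:>6}: SD = {series.std():.2f}")        # SD shrinks as aggregation increases
```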

Differences in Memory

Fundamental in determining how people will respond to these different forms of questions is the nature of memory. Robinson and Clore (2002) provided an in-depth discussion of how we rely on different forms of memory when answering questions over different time frames. Although these authors focus on reports of emotion experiences, their conclusions are likely applicable to a much wider variety of self-reports. At one end of the continuum, reports of immediate experiences are direct, requiring only one’s interpretation of what is occurring and minimizing mental processes of recall.

Moving slightly down the continuum, we encounter items that ask about very recent episodes (e.g., “since the last survey” or “in the past 2 hours” in ESM studies). Here, Robinson and Clore (2002) note that we rely on what cognitive psychologists refer to as episodic memory. Although recall is involved, specific details of the episode in question are easily recalled with a high degree of accuracy. As items move further down the continuum toward summaries of experiences over longer periods of time (e.g., “since the last survey” in a longitudinal panel design), the details of particular relevant episodes are harder to recall and so responses are tinged to an increasing degree by semantic memory. This form of memory is based on individual characteristics (e.g., neurotic individuals might offer more negative reports) as well as well-learned situation-based knowledge (e.g., “my coworkers are generally nice people, so I’m sure that I’ve been satisfied with my interactions over this period of time”). Consequently, as the time frame over which people report increases, the nature of the information provided changes. Specifically, it is increasingly informed by semantic memory (i.e., trait and situation-based knowledge) and decreasingly informed by episodic memory (i.e., particular details of one’s experiences). Thus, researchers should be aware of the memory-related implications when they choose the time frame for their measures.

Differences in the Process of Summarizing

Aside from the role of memory in determining the content of these reports, individuals also summarize their experiences in a complex manner. For example, psychologists have demonstrated that even over a single episode, people tend not to base subjective summaries of the episode on its typical or average features. Instead, we focus on particular notable moments during the experience, such as its peak or its end state, and pay little attention to some aspects of the experience, such as its duration ( Fredrickson, 2000 ; Redelmeier & Kahneman, 1996 ). The result is that a mental summary of a given episode is unlikely to reflect actual averages of the experiences and events that make up the episode. Furthermore, when considering reports that span multiple episodes (e.g., over the last month or the interval between two measurements in a longitudinal panel study), summaries become even more complex. For example, recent evidence suggests that people naturally organize ongoing streams of experience into more coherent episodes largely on the basis of goal relevance ( Beal, Weiss, Barros, & MacDermid, 2005 ; Beal & Weiss, 2013 ; Zacks, Speer, Swallow, Braver, & Reynolds, 2007 ). Thus, how we interpret and parse what is going on around us connects strongly to our goals at the time. Presumably, this process helps us to impart meaning to our experiences and predict what might happen next, but it also influences the type of information we take with us from the episode, thereby affecting how we might report on this period of time.

Practical Differences

What then, can researchers take away from this information to help in deciding what sorts of items to include in longitudinal studies? One theme that emerges from the above discussion is that summaries over longer periods of time will tend to reflect more about the individual and the meanings he or she may have imparted to the experiences, events, and behaviors that have occurred during this time period, whereas shorter-term summaries or reports of more immediate occurrences are less likely to have been processed through this sort of interpretive filter. Of course, this is not to say that the more immediate end of this continuum is completely objective, as immediate perceptions are still host to many potential biases (e.g., attributional biases typically occur immediately); rather, immediate reports are more likely to reflect one’s immediate interpretation of events rather than an interpretation that has been mulled over and considered in light of an individual’s short- and long-term goals, dispositions, and broader worldview.

The particular choice of item type (i.e., immediate vs. aggregated experiences) that will be of interest to a researcher designing a longitudinal study should of course be determined by the nature of the research question. For example, if a researcher is interested in what Weiss and Cropanzano (1996) referred to as judgment-driven behaviors (e.g., a calculated decision to leave the organization), then capturing the manner in which individuals make sense of relevant work events is likely more appropriate, and so items that ask one to aggregate experiences over time may provide a better conceptual match than items asking about immediate states. In contrast, affect-driven behaviors or other immediate reactions to an event will likely be better served by reports that ask participants for minimal mental aggregations of their experiences (e.g., immediate or over small spans of time).

The issue of mental representations of events at particular points in time should always be discussed and evaluated within the research context of the conceptual questions on the underlying substantive constructs and change processes that may account for patterns of responses over time. Many of these conceptual questions are likely to relate to construct-oriented issues such as the location of the substantive construct on the state-trait continuum and the timeframe through which short-term or long-term effects on the temporal changes in the substantive construct are likely to be manifested (e.g., effects of stressors on changes in health). On the issue of aggregation of observations across time, I see it as part of a more basic question on whether an individual’s subjective experience on a substantive construct (e.g., emotional well-being) should be assessed using momentary measures (e.g., assessing the individual’s current emotional state, measured daily over the past 1 week) or retrospective global reports (e.g., asking the individual to report an overall assessment of his or her emotional state over the past 1 week). Each of the two measurement perspectives (i.e., momentary and global retrospective) has both strengths and limitations. For example, momentary measures are less prone to recall biases compared to global retrospective measures ( Kahneman, 1999 ). Global retrospective measures, on the other hand, are widely used in diverse studies for the assessment of many subjective experience constructs with a large database of evidence concerning the measure’s reliability and validity ( Diener, Inglehart, & Tay, 2013 ). In a recent article ( Tay, Chan, & Diener, 2014 ), my colleagues and I reviewed the conceptual, methodological, and practical issues in the debate between the momentary and global retrospective perspectives as applied to the research on subjective well-being. We concluded that both perspectives could offer useful insights and suggested a multiple-method approach that is sensitive to the nature of the substantive construct and specific context of use, but also called for more research on the use of momentary measures to obtain more evidence for their psychometric properties and practical value.

Research Design Question 5: What are the biggest practical hurdles to conducting longitudinal research? What are the ways to overcome them?

As noted earlier, practical hurdles are perhaps one of the main reasons why researchers choose cross-sectional rather than longitudinal designs. Although we have already discussed a number of these issues that must be faced when conducting longitudinal research, the following discussion emphasizes two hurdles that are ubiquitous, often difficult to overcome, and are particularly relevant to longitudinal designs.

Encouraging Continued Participation

Encouraging participation is a practical issue that likely faces all studies, irrespective of design; however, longitudinal studies raise special considerations given that participants must complete measurements on multiple occasions. Although there is a small literature that has examined this issue specifically (e.g., Fumagalli, Laurie, & Lynn, 2013 ; Groves et al. , 2006 ; Laurie, Smith, & Scott, 1999 ), it appears that the relevant factors are fairly similar to those noted for cross-sectional surveys. In particular, providing monetary incentives prior to completing the survey is a recommended strategy (though nonmonetary gifts can also be effective), with increased amounts resulting in increased participation rates, particularly as the burden of the survey increases ( Laurie & Lynn, 2008 ).

The impact of participant burden relates directly to the special considerations of longitudinal designs, as they are generally more burdensome. In addition, with longitudinal designs, the nature of the incentives used can vary over time, and can be tailored toward reducing attrition rates across the entire span of the survey ( Fumagalli et al. , 2013 ). For example, if the total monetary incentive is distributed across survey waves such that later waves have greater incentive amounts, and if this information is provided to participants at the outset of the study, then attrition rates may be reduced more effectively ( Martin & Loes, 2010 ); however, some research suggests that a larger initial payment is particularly effective at reducing attrition throughout the study ( Singer & Kulka, 2002 ).

In addition, the fact that longitudinal designs reflect an implicit relationship between the participant and the researchers over time suggests that incentive strategies that are considered less effective in cross-sectional designs (e.g., incentive contingent on completion) may be more effective in longitudinal designs, as the repeated assessments reflect a continuing reciprocal relationship. Indeed, there is some evidence that contingent incentives are effective in longitudinal designs ( Castiglioni, Pforr, & Krieger, 2008 ). Taken together, one potential strategy for incentivizing participants in longitudinal surveys would be to divide payment such that there is an initial relatively large incentive delivered prior to completing the first wave, followed by smaller, but increasing amounts that are contingent upon completion of each successive panel. Although this strategy is consistent with theory and evidence just discussed, it has yet to be tested explicitly.

Continued contact

One thing that does appear certain, particularly in longitudinal designs, is that incentives are only part of the picture. An additional factor that many researchers have emphasized is the need to maintain contact with participants throughout the duration of a longitudinal survey ( Laurie, 2008 ). Strategies here include obtaining multiple forms of contact information at the outset of the study and continually updating this information. From this information, researchers should make efforts to keep in touch with participants between measurement occasions (for panel studies) or on some form of ongoing basis (for ESM or other intensive designs). Laurie (2008) referred to these efforts as Keeping In Touch Exercises (KITEs) and suggested that they serve to increase belongingness and perhaps a sense of commitment to the survey effort, and have the additional benefit of obtaining updated contact and other relevant information (e.g., change of job).

Mode of Data Collection

General considerations.

In panel designs, relative to intensive designs discussed below, only a limited number of surveys are sought, and the interval between assessments is relatively large. Consequently, there is likely to be greater flexibility as to the particular methods chosen for presenting and recording responses. Although the benefits, costs, and deficiencies associated with traditional paper-and-pencil surveys are well-known, the use of internet-based surveys has evolved rapidly and so the implications of using this method have also changed. For example, early survey design technologies for internet administration were often complex and potentially costly. Simply adding items was sometimes a difficult task, and custom-formatted response options (e.g., sliding scales with specific end points, ranges, and tick marks) were often unattainable. Currently available web-based design tools often are relatively inexpensive and increasingly customizable, yet have maintained or even improved the level of user-friendliness. Furthermore, a number of studies have noted that data collected using paper-and-pencil versus internet-based applications are often comparable if not indistinguishable (e.g., Cole, Bedeian, & Feild, 2006 ; Gosling et al. , 2004 ), though notable exceptions can occur ( Meade, Michels, & Lautenschlager, 2007 ).

One issue related to the use of internet-based survey methods that is likely to be of increasing relevance in the years to come is collection of survey data using a smartphone. As of this writing (this area changes rapidly), smartphone options are in a developing phase where some reasonably good options exist, but have yet to match the flexibility and standardized appearance that comes with most desktop or laptop web-based options just described. For example, it is possible to implement repeated surveys for a particular mobile operating system (OS; e.g., Apple’s iOS, Google’s Android OS), but unless a member of the research team is proficient in programming, there will be a non-negligible up-front cost for a software engineer ( Uy, Foo, & Aguinis, 2010 ). Furthermore, as market share for smartphones is currently divided across multiple mobile OSs, a comprehensive approach will require software development for each OS that the sample might use.

There are a few other options, but some of them are not complete solutions. For example, survey administration tools such as Qualtrics now allow for testing of smartphone compatibility when creating web-based surveys. So, one could conceivably create a survey using this tool and have people respond to it on their smartphone with little or no loss of fidelity. Unfortunately, these tools (again, at this moment in time) do not offer elegant or flexible signaling capabilities. For example, intensive repeated measures designs will often need to send a reasonably large number of participants (e.g., N = 50–100) multiple random signals every day for multiple weeks. Accomplishing this task without a built-in signaling function (e.g., one that generates this pattern of randomized signals and alerts each person’s smartphone at the appropriate time) is no small feat.
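
To illustrate the signaling problem, the sketch below generates a randomized signaling schedule of the kind such designs require. It is a minimal illustration in Python, not part of any survey platform; the function name, study window, and all parameter values (e.g., a 9:00–17:00 signaling window with a 30-minute minimum gap) are assumptions chosen for the example.

```python
import random
from datetime import datetime, timedelta

def make_signal_schedule(n_participants=60, n_days=14, signals_per_day=5,
                         start_hour=9, end_hour=17, min_gap_minutes=30, seed=42):
    """Draw random signal times within a daily window, at least min_gap_minutes apart.
    All names and defaults here are illustrative assumptions, not a real platform's API."""
    rng = random.Random(seed)
    study_start = datetime(2024, 1, 8)           # arbitrary start date for the example
    window = (end_hour - start_hour) * 60        # minutes available per day
    schedule = {}
    for pid in range(1, n_participants + 1):
        times = []
        for day in range(n_days):
            while True:                           # redraw until minimum gaps are respected
                minutes = sorted(rng.sample(range(window), signals_per_day))
                if all(b - a >= min_gap_minutes for a, b in zip(minutes, minutes[1:])):
                    break
            day_base = study_start + timedelta(days=day, hours=start_hour)
            times.extend(day_base + timedelta(minutes=m) for m in minutes)
        schedule[pid] = times
    return schedule

sched = make_signal_schedule()
print([t.strftime("%a %H:%M") for t in sched[1][:5]])   # participant 1, Day 1 signals
```

In practice, the resulting schedule would still have to be delivered to each participant’s device at the appointed times, which is precisely the capability that current web-based tools lack.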

There are, however, several efforts underway to provide free or low-cost survey development applications for mobile devices. For example, PACO is a (currently) free Google app that is in the beta-testing stage and allows great flexibility in the design and implementation of repeated surveys on both Android OS and iOS smartphones. Another example that is currently being developed for both Android and iOS platforms is Expimetrics ( Tay, 2015 ), which promises flexible design and signaling functions at low cost for researchers collecting ESM data. Such applications offer the promise of highly accessible survey administration and signaling and have the added benefit of transmitting data quickly to servers accessible to the research team. Ideally, such advances in accessibility of survey administration will allow increased response rates throughout the duration of the longitudinal study.

Issues specific to intensive designs

All of the issues just discussed with respect to the mode of data collection are particularly relevant for short-term intensive longitudinal designs such as ESM. As the number of measurement occasions increases, so too do the necessities of increasing accessibility and reducing participant burden wherever possible. Of particular relevance is the emphasis ESM places on obtaining in situ assessments to increase the ecological validity of the study ( Beal, 2015 ). To maximize this benefit of the method, it is important to reduce the interruption introduced by the survey administration. If measurement frequency is relatively sparse (e.g., once a day), it is likely that simple paper-and-pencil or web-based modes of collection will be sufficient without creating too much interference ( Green et al. , 2006 ). In contrast, as measurements become increasingly intensive (e.g., four or five times/day or more), reliance on more accessible survey modes will become important. Thus, a format that allows for desktop, laptop, or smartphone administration should be of greatest utility in such intensive designs.

Statistical Techniques Question 1: With respect to assessing changes over time in a latent growth modeling framework, how can a researcher address different conceptual questions by coding the slope variable differently?

As with many questions in this article, an in-depth answer to this particular question is not possible in the available space. Hence, only a general treatment of different coding schemes of the slope or change variable is provided. Excellent detailed treatments of this topic may be found in Bollen and Curran (2006 , particularly chapters 3 & 4), and in Singer and Willett (2003 , particularly chapter 6). As noted by Ployhart and Vandenberg (2010) , specifying the form of change should be an a priori conceptual endeavor, not a post hoc, data-driven effort. This stance was also stated earlier by Singer and Willett (2003) when distinguishing between empirical (data-driven) versus rational (theory-driven) strategies: “Under rational strategies, on the other hand, you use theory to hypothesize a substantively meaningful functional form for the individual change trajectory. Although rational strategies generally yield clearer interpretations, their dependence on good theory makes them somewhat more difficult to develop and apply” ( Singer & Willett, 2003 , p. 190). The last statement in the quote simply reinforces the main theme throughout this article; that is, researchers need to undertake the difficult task of bringing in time (change being one form) within their conceptual frameworks in order to more adequately examine the causal structure among the focal variables within those frameworks.

In general, there are three sets of functional forms in terms of which the slope or change variable may be coded or specified: (a) linear; (b) discontinuous; and (c) nonlinear. The word “sets” emphasizes that within each form there are different types that must be considered. The most commonly seen form in our literature is linear change (e.g., Bentein et al. , 2005 ; Vandenberg & Lance, 2000 ). Linear change means there is an expectation that the variable of interest should increase or decrease in a straight-line function during the intervals of the study. The simplest form of linear change occurs when there are equal measurement intervals across time and the units of observation were obtained at the same time in those intervals. Assuming, for example, that there were four occasions of measurement, the coding of the slope variable would be 0 (Time 1), 1 (Time 2), 2 (Time 3), and 3 (Time 4). Such coding fixes the intercept (starting value of the line) at the Time 1 interval, and thus, the conceptual interpretation of the linear change is made relative to this starting point. Reinforcing the notion that there is a set of considerations, one may have a conceptual reason for wanting to fix the intercept to the last measurement occasion. For example, there may be an extensive training program anchored with a “final exam” on the last occasion, and one wants to study the developmental process resulting in the final score. In this case, the coding scheme may be −3, −2, −1, and 0 going from Time 1 to Time 4, respectively ( Bollen & Curran, 2006 , p. 116; Singer & Willett, 2003 , p. 182). One may also have a conceptual reason to use the middle of the time intervals to anchor the intercept and look at the change above and below this point. Thus, the coding scheme in the current example may be −1.5, −0.5, 0.5, and 1.5 for Time 1 to Time 4, respectively ( Bollen & Curran, 2006 ; Singer & Willett, 2003 ). There are other considerations in the “linear set,” such as the specification of linear change in cohort designs or other cases where there are individually varying times of observation (i.e., not everyone started at the same time, at the same age, at the same intervals, etc.). The latter cases may require missing data procedures or time-varying covariates that account for differences in when observations were collected. For example, to examine how retirement influences life satisfaction, Pinquart and Schindler (2007) modeled life satisfaction data from a representative sample of German retirees who retired between 1985 and 2003. Due to the retirement timing differences among the participants (not everyone retired at the same time or at the same age), different numbers of life satisfaction observations were collected for different retirees. Therefore, the missing observations on a yearly basis were modeled as latent variables to ensure that the analyses were able to cover the entire studied time span.
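
To make the arithmetic of these coding schemes concrete, the brief sketch below fits an ordinary regression line to a single hypothetical four-occasion trajectory under each of the three codings just described. This is only a numerical illustration of the coding logic, not a latent growth model (which would be estimated across many individuals in SEM or multilevel software such as Mplus); the trajectory values are invented for the example.

```python
import numpy as np

y = np.array([3.0, 3.6, 4.1, 4.9])   # one hypothetical four-occasion trajectory

codings = {
    "anchor at Time 1":       np.array([0.0, 1.0, 2.0, 3.0]),
    "anchor at Time 4":       np.array([-3.0, -2.0, -1.0, 0.0]),
    "anchor at the midpoint": np.array([-1.5, -0.5, 0.5, 1.5]),
}

for label, t in codings.items():
    slope, intercept = np.polyfit(t, y, deg=1)   # simple straight-line fit
    # The slope is the same under every coding; only the intercept
    # (the occasion defined as "initial status") changes.
    print(f"{label:>22}: slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The slope is identical under all three codings; only the intercept, and hence the occasion to which “initial status” refers, changes.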

Discontinuous change is the second set of functional forms with which one could theoretically describe the change in one’s substantive focal variables. Discontinuities are precipitous events that may cause the focal variable to rapidly accelerate (change in slope), to dramatically increase or decrease in value (change in elevation), or to change in both slope and elevation (see Ployhart & Vandenberg, 2010 , Figure 1 on p. 100; Singer & Willett, 2003 , pp. 190–208, see Table 6.2 in particular). For example, according to stage theory ( Wang et al. , 2011 ), retirement may be such a precipitous event because it can create an immediate “honeymoon effect” on retirees, dramatically increasing their energy level and satisfaction with life as they pursue new activities and roles.

This set of discontinuous functional forms has also been referred to as piecewise growth ( Bollen & Curran, 2006 ; Muthén & Muthén, 1998–2012 ), but in general, it represents situations where all units of observation are collected at the same time during the time intervals and the discontinuity happens to all units at the same time. It is actually a variant of the linear set, and therefore, could have been presented above as well. To illustrate, assume we are tracking individual performance metrics that had been rising steadily across time, and suddenly the employer announces an upcoming across-the-board bonus based on those metrics. A sudden rise (as in a change in slope) in those metrics could be expected based purely on reinforcement theory. Assume, for example, we had six intervals of measurement, and the bonus announcement was made just after the Time 3 data collection. We could specify two slope or change variables, coding the first one as 0, 1, 2, 2, 2, and 2, and the second one as 0, 0, 0, 1, 2, and 3. This specification examines the linear change captured by each slope variable independently. Conceptually, the first slope variable brings the trajectory of change up to the transition point (i.e., the last measurement before the announcement), while the second one captures the change after the transition ( Bollen & Curran, 2006 ). Regardless of whether the variables are latent or observed, if this is modeled using software such as Mplus ( Muthén & Muthén, 1998–2012 ), the difference between the means of the slope variables may be statistically tested to evaluate whether the post-announcement slope is indeed greater than the pre-announcement slope. One may also predict that the announcement would cause an immediate, sudden elevation in the performance metric as well. This can be examined by including a dummy variable that is zero at all time points prior to the announcement and one at all time points after the announcement ( Singer & Willett, 2003 , pp. 194–195). If the coefficient for this dummy variable is statistically significant and positive, then it indicates that there was a sudden increase (upward elevation) in value post-transition.
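
The coding itself can be made concrete with a small single-case regression sketch. The performance values below are invented, and the ordinary least squares fit merely shows how the two slope codes and the elevation dummy partition the trajectory; in an actual application these codes would be the fixed loadings of latent slope factors (and the dummy a time-varying predictor) estimated across many individuals in software such as Mplus.

```python
import numpy as np
import statsmodels.api as sm

# Six occasions; the bonus announcement falls just after the Time 3 measurement.
# The performance values are invented so that the slope and elevation shift upward.
y = np.array([10.0, 11.0, 12.1, 14.5, 16.4, 18.6])

slope_pre  = np.array([0, 1, 2, 2, 2, 2])   # change up to the transition point
slope_post = np.array([0, 0, 0, 1, 2, 3])   # change after the transition point
jump       = np.array([0, 0, 0, 1, 1, 1])   # dummy capturing a shift in elevation

X = sm.add_constant(np.column_stack([slope_pre, slope_post, jump]))
fit = sm.OLS(y, X).fit()
# Parameters: intercept, pre-transition slope, post-transition slope, elevation shift.
print(np.round(fit.params, 2))
```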

Another form of discontinuous change is one in which the discontinuous event occurs at varying times for the units of observation (indeed, it may not occur at all for some) and the intervals for collecting data may not be evenly spaced. For example, assume again that individual performance metrics are monitored across time for individuals in high-demand occupations, with the first measurement collected on the date of hire. Assume as well that these individuals are required to report when an external recruiter approaches them; that is, they are not prohibited from speaking with a recruiter but simply need to report when it occurred. Due to some cognitive dissonance process, individuals may start to discount the current employer and reduce their inputs. Thus, a change in slope, elevation, or both may be expected in performance. With respect to testing a potential change in elevation, one uses the same dummy-coded variable as described above ( Singer & Willett, 2003 ). Testing whether the slopes of the performance metrics differ pre- versus post-recruiter contact, however, requires the use of a time-varying covariate. How this operates specifically is beyond the scope of this article. Excellent treatments of the topic, however, are provided by Bollen and Curran (2006 , pp. 192–218) and Singer and Willett (2003 , pp. 190–208). In general, a time-varying covariate captures the intervals of measurement. In the current example, this may be the number of days (weeks, months, etc.) from the date of hire (when baseline performance was obtained) to the next interval of measurement and all subsequent intervals. Person 1, for example, may have the values 1, 22, 67, 95, 115, and 133, and was contacted after Time 3 on Day 72 from the date of hire. Person 2 may have the values 1, 31, 56, 101, 141, and 160, and was contacted after Time 2 on Day 40 from the date of hire. Referring the reader to the specifics starting on page 195 of Singer and Willett (2003) , one would then create a new variable in which all of the values before the recruiting contact are set to zero, and the values after the contact are set to the number of days elapsed between the contact and each subsequent interval of measurement. Thus, for Person 1, this new variable would have the values 0, 0, 0, 23, 43, and 61, and for Person 2, the values would be 0, 0, 16, 61, 101, and 120. The slope of this new variable represents the increment (up or down) to what the slope would have been had the individuals not been contacted by a recruiter. If it is statistically nonsignificant, then there is no change in slope pre- versus post-recruiter contact. If it is statistically significant, then the slope after contact differed from that before the contact. Finally, while much of the above is based upon a multilevel approach to operationalizing change, Muthén and Muthén (1998–2012 ) offer an SEM approach to time-varying covariates through their Mplus software package.
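
The construction of this time-varying covariate is simple enough to express directly; the short sketch below reproduces the values given above for the two hypothetical employees.

```python
def contact_covariate(measurement_days, contact_day):
    """Zero before the recruiter contact; days elapsed since the contact afterwards."""
    return [max(0, day - contact_day) for day in measurement_days]

person1 = [1, 22, 67, 95, 115, 133]   # measurement days since hire
person2 = [1, 31, 56, 101, 141, 160]

print(contact_covariate(person1, 72))   # [0, 0, 0, 23, 43, 61]
print(contact_covariate(person2, 40))   # [0, 0, 16, 61, 101, 120]
```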

The final set of functional forms in which the slope or change variable may be coded or specified is nonlinear. As with the other forms, there is a set of nonlinear forms. The simplest in the set is when theory states that the change in the focal variable may be quadratic (curving upward or downward). As such, in addition to the linear slope/change variable, a second change variable is specified in which the values of its slope are fixed to the squared values of the first, or linear, change variable. Assume five equally spaced intervals of measurement coded as 0, 1, 2, 3, and 4 on the linear change variable. The values of the second, quadratic change variable would then be 0, 1, 4, 9, and 16. Theory could state that there is cubic change as well. In that case, a third, cubic change variable is introduced with the values of 0, 1, 8, 27, and 64. One problem with the use of quadratic (or even linear) change variables or other polynomial forms as described above is that the trajectories are unbounded functions ( Bollen & Curran, 2006 ); that is, there is an assumption that they tend toward infinity. It is unlikely that most, if any, of the theoretical processes in the social sciences are truly unbounded. If a nonlinear form is expected, operationalizing change using an exponential trajectory is probably the most realistic choice. This is because exponential trajectories are bounded functions in the sense that they approach an asymptote (either growing and/or decaying to the asymptote). There are three forms of exponential trajectories: (a) simple, where there is explosive growth from an asymptote; (b) negative, where there is growth to an asymptote; and (c) logistic, where there are asymptotes at both ends ( Singer & Willett, 2003 ). Obviously, the values of the slope or change variable would be fixed to the exponents most closely representing the form of the curve (see Bollen & Curran, 2006 , p. 108; and Singer & Willett, 2003 , Table 6.7, p. 234).
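
The fixed codes for these additional change variables follow directly from the linear codes, as the short sketch below illustrates; the exponential rate parameter used for the bounded alternative is an arbitrary value chosen only for this illustration.

```python
import numpy as np

t = np.arange(5)            # five equally spaced occasions: 0, 1, 2, 3, 4
quadratic = t ** 2          # 0, 1, 4, 9, 16  (loadings for a quadratic change factor)
cubic = t ** 3              # 0, 1, 8, 27, 64 (loadings for a cubic change factor)

# A bounded alternative: a negative exponential basis that grows toward an asymptote.
# The rate parameter r is an arbitrary value chosen only for this illustration.
r = 0.7
exponential = 1 - np.exp(-r * t)

print(t, quadratic, cubic, np.round(exponential, 3), sep="\n")
```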

There are other nonlinear considerations that belong to this set as well. For example, Bollen and Curran (2006 , p. 109) address the issue of cycles (recurring ups and downs that nonetheless follow a general upward or downward trend). Once more, the values of the change variable would be coded to reflect those cycles. Similarly, Singer and Willett (2003 , p. 208) address recoding when one wants to remove the nonlinearity in the change function through transformations in order to make it more linear. They provide an excellent heuristic on page 211 to guide one’s thinking on this issue.

Statistical Techniques Question 2: In longitudinal research, are there additional issues of measurement error that we need to pay attention to, which are over and above those that are applicable to cross-sectional research?

Longitudinal research should pay special attention to the measurement invariance issue. Chan (1998) and Schmitt (1982) introduced Golembiewski and colleagues’ (1976) notion of alpha, beta, and gamma change to explain why measurement invariance is a concern in longitudinal research. When the measurement of a particular concept retains the same structure (i.e., same number of observed items and latent factors, same value and pattern of factor loadings), change in the absolute levels of the latent factor is called alpha change. Only for this type of change can we draw the conclusion that there is a specific form of growth in a given variable. When the measurement of a concept has to be adjusted over time (i.e., different values or patterns of factor loadings), beta change happens. Although the conceptual meaning of the factor remains the same over measurements, the subjective metric of the concept has changed. When the meaning of a concept changes over time (e.g., a different number of factors or different correlations between factors), gamma change happens. It is not possible to compare differences in absolute levels of a latent factor when beta or gamma change happens, because there is no longer a stable measurement model for the construct. The notions of beta and gamma change are particularly important to consider when conducting longitudinal research on aging-related phenomena, especially when long time intervals are used in data collection. In such situations, the risk of encountering beta and gamma change is higher and can seriously jeopardize the internal and external validity of the research.

Longitudinal analysis is often conducted to examine how changes happen in the same variable over time. In other words, it operates on the “alpha change” assumption. Thus, it is often important to explicitly test measurement invariance before proceeding to model the growth parameters. Without establishing measurement invariance, it is unknown whether we are testing meaningful changes or comparing apples and oranges. A number of references have discussed the procedures for testing measurement invariance within the latent variable analysis framework (e.g., Chan, 1998 ; McArdle, 2007 ; Ployhart & Vandenberg, 2010 ). The basic idea is to specify and include the measurement models in the longitudinal model, with either continuous or categorical indicators (see answers to Statistical Techniques #4 below on categorical indicators). Under the latent factor invariance assumption, factor loadings across measurement points should be constrained to be equal. Errors from different measurement occasions might correlate, especially when the measurement contexts are very similar over time ( Tisak & Tisak, 2000 ). Thus, the error variances for the same item over time can also be allowed to correlate to account for common influences at the item level (i.e., autocorrelation between items). With the measurement structure specified, the absolute changes in the latent variables can then be modeled via the mean structure. It should be noted that a more stringent definition of measurement invariance also requires equal variance in latent factors. However, in longitudinal data this requirement becomes extremely difficult to satisfy, and factor variances can be sample specific. Thus, this requirement is often relaxed when testing measurement invariance in longitudinal analysis. Moreover, this requirement may even be invalid when the nature of the true change over time involves changes in the latent variance ( Chan, 1998 ).
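
Formal invariance testing proceeds by fitting nested models, with and without equality constraints on the loadings, in an SEM program such as Mplus or LISREL and comparing their fit. As a rough descriptive illustration only (not a substitute for that nested-model comparison), the sketch below simulates four items measured at two waves with identical true loadings and checks whether exploratory estimates of the loadings look similar across waves; all values are simulated for the example.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 500
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])   # held equal across waves (no beta change)

for wave in (1, 2):
    factor = rng.normal(size=n)
    items = np.outer(factor, true_loadings) + rng.normal(scale=0.5, size=(n, 4))
    fa = FactorAnalysis(n_components=1).fit(items)
    # Loadings are identified only up to sign, so compare absolute values.
    print(f"Wave {wave} estimated loadings:", np.round(np.abs(fa.components_.ravel()), 2))
```

Substantially different loading patterns across waves in such a check would be a warning sign of beta change.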

It is important to note that the mean structure approach applies not only to longitudinal models with three or more measurement points, but also to simple repeated measures designs (e.g., the pre–post design). Traditional paired-sample t tests and within-subject repeated measures ANOVAs do not take measurement equivalence into account; they simply use the summed scores at the two measurement points to conduct a hypothesis test. The mean structure approach provides a more powerful way to test the changes/differences in a latent variable by taking measurement errors into consideration ( McArdle, 2009 ).

However, sometimes it is not possible to achieve measurement equivalence by using the same scales over time. For example, in research on the development of cognitive intelligence in individuals from birth to late adulthood, different tests of cognitive intelligence are administered at different ages (e.g., Bayley, 1956 ). In applied settings, different domain-knowledge or skill tests may be administered to evaluate employee competence at different stages of their career. Another possible reason for changing measures is poor psychometric properties of scales used in earlier data collection. Previously, researchers have used transformed scores (e.g., scores standardized within each measurement point) before modeling growth curves over time. In response to critiques of these scaling methods, new procedures have been developed to model longitudinal data using changed measurement (e.g., rescoring methods, over-time prediction, and structural equation modeling with convergent factor patterns). Recently, McArdle and colleagues (2009) proposed a joint model approach that estimates an item response theory (IRT) model and a latent curve model simultaneously. They provided a demonstration of how to effectively handle changing measurement in longitudinal studies using this newly proposed approach.

I am not sure these issues of measurement error are “over and above” cross-sectional issues so much as that cross-sectional data provide no mechanisms for dealing with these issues, so they are simply ignored at the analysis stage. Unfortunately, this creates problems at the interpretation stage. In particular, issues of random walk variables ( Kuljanin, Braun, & DeShon, 2011 ) are a potential problem for longitudinal data analysis and the interpretation of either cross-sectional or longitudinal designs. Random walk variables are the dynamic variables I mentioned earlier when describing the computational modeling approach. These variables have some current value and are moved incrementally from that value over time. The random walk expression comes from the image of a highly inebriated individual who is in some position, but who staggers and sways from that position to neighboring positions because the alcohol has disrupted the nervous system’s stabilizers. This inebriated individual might have an intended direction (called “the trend” if the individual can make any real progress), but there may be a lot of noise in that path. In the aging and retirement literature, one’s retirement savings can be viewed as a random walk variable. Although the general trend of retirement savings should be positive (i.e., the amount of retirement savings should grow over time), at any given point, the exact amount added to the savings (or withdrawn from it) depends on a number of situational factors (e.g., stock market performance) and cannot be consistently predicted. Random walk (i.e., dynamic) variables exhibit nonindependence among observations over time. Indeed, one way to know whether one is measuring a dynamic variable is to check for a simplex pattern among the intercorrelations of the variable with itself over time. In a simplex pattern, observations of the variable are more highly correlated when they are measured closer in time (e.g., Time 1 observations correlate more highly with Time 2 than with Time 3). Of course, this pattern can also occur if one of the variable’s proximal causes (rather than the variable itself) is a dynamic variable.
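
The simplex pattern is easy to see in simulated data. The sketch below generates pure random walks for a set of hypothetical respondents and prints the wave-by-wave correlation matrix; the sample size and number of waves are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_people, n_waves = 1000, 4

# Each person's score is a random walk: the running sum of random shocks.
scores = rng.normal(size=(n_people, n_waves)).cumsum(axis=1)

# Wave-by-wave correlations: strongest between adjacent waves, weaker as the lag grows.
print(np.round(np.corrcoef(scores, rowvar=False), 2))
```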

As noted, dynamic or random walk variables can create problems for poorly designed longitudinal research because one may not realize that the level of the criterion ( Y ), say measured at Time 3, was largely near its level at Time 2, when the presumed cause ( X ) was measured. Moreover, at Time 1 the criterion ( Y ) might have been busy moving the level of the “causal” variable ( X ) to the place it is observed at Time 2. That is, the criterion variable ( Y ) at Time 1 is actually causing the presumed causal variable ( X ) at Time 2. For example, performance might affect self-efficacy beliefs such that self-efficacy beliefs end up aligning with performance levels. If one measures self-efficacy after it has largely been aligned, and then later measures the largely stable performance, a positive correlation between the two variables might be taken to reflect self-efficacy’s influence on performance because of the timing of measurement (i.e., measuring self-efficacy before performance). This is why the practice of multiple-wave measurement is so important in passive observational panel studies.

However, the multiple waves of measurement might still create problems for random walk variables, particularly if there are trends and reverse causality. Consider the self-efficacy to performance example again. If performance is trending over time and self-efficacy is following along behind, a within-person positive correlation between self-efficacy and subsequent performance is likely to be observed (even if there is no causal effect or a weak negative one), because self-efficacy will be relatively high when performance is relatively high and low when performance is low. In this case, controlling for trend or past performance will generally solve the problem ( Sitzmann & Yeo, 2013 ), unless the random walk has no trend. Meanwhile, there are other issues that random walk variables may raise for both cross-sectional and longitudinal research, which Kuljanin et al. (2011) do a very good job of articulating.

A related issue for longitudinal research is nonindependence of observations as a function of nesting within clusters. This issue has received a great deal of attention in the multilevel literature (e.g., Bliese & Ployhart, 2002 ; Singer & Willett, 2003 ), so I will not belabor the point. However, there is one more nonindependence issue that has not received much attention. Specifically, the issue can be seen when a variable is a lagged predictor of itself ( Vancouver, Gullekson, & Bliese, 2007 ). With just three repeated measures or observations, the lagged correlation of the variable with itself will average −.33 across the three time points, even if the observations are randomly generated. This is because there is a one-third chance that the repeated observations change monotonically over the three time points, which results in a correlation of 1, and a two-thirds chance that they do not, which results in a correlation of −1; these average to −.33. Thus, on average it will appear that the variable is negatively causing itself. Fortunately, this problem is quickly mitigated by more waves of observations and more cases (i.e., the bias is largely removed with 60 pairs of observations).
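
This property can be verified with a short simulation: for many simulated cases, three unrelated observations are drawn and the correlation between the variable and its one-occasion lag (two pairs per case) is computed and averaged. The number of simulated cases is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
corrs = []
for _ in range(50_000):
    y = rng.normal(size=3)                          # three random, unrelated observations
    lagged, current = y[:2], y[1:]                  # (y1, y2) paired with (y2, y3)
    corrs.append(np.corrcoef(lagged, current)[0, 1])

# With only two pairs, the correlation is +1 when the three points are monotonic
# (probability 1/3) and -1 otherwise, so the average is approximately -1/3.
print(round(float(np.mean(corrs)), 3))
```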

Statistical Techniques Question 3: When analyzing longitudinal data, how should we handle missing values?

As reviewed by Newman (2014 ; see in-depth discussions by Enders, 2001 , 2010 ; Little & Rubin, 1987 ; Newman, 2003 , 2009 ; Schafer & Graham, 2002 ), there are three levels of missing data (item level missingness, variable/construct-level missingness, and person-level missingness), two problems caused by missing data (parameter estimation bias and low statistical power), three mechanisms of missing data (missing completely at random/MCAR, missing at random/MAR, and missing not at random/MNAR), and a handful of common missing data techniques (listwise deletion, pairwise deletion, single imputation techniques, maximum likelihood, and multiple imputation). State-of-the-art advice is to use maximum likelihood (ML: EM algorithm, Full Information ML) or multiple imputation (MI) techniques, which are particularly superior to other missing data techniques under the MAR missingness mechanism, and perform as well as—or better than—other missing data techniques under MCAR and MNAR missingness mechanisms (MAR missingness is a form of systematic missingness in which the probability that data are missing on one variable [ Y ] is related to the observed data on another variable [ X ]).

Most of the controversy surrounding missing data techniques involves two misconceptions: (a) the misconception that listwise and pairwise deletion are somehow more natural techniques that involve fewer or less tenuous assumptions than ML and MI techniques do, with the false belief that a data analyst can draw safer inferences by avoiding the newer techniques, and (b) the misconception that multiple imputation simply entails “fabricating data that were not observed.” First, because all missing data techniques are based upon particular assumptions, none is perfect. Also, when it comes to selecting a missing data technique to analyze incomplete data, one of the above techniques (e.g., listwise, pairwise, ML, MI) must be chosen. One cannot safely avoid the decision altogether—that is, abstinence is not an option. One must select the least among evils.

Because listwise and pairwise deletion make the exceedingly unrealistic assumption that missing data are missing completely at random/MCAR (cf. Rogelberg et al. , 2003 ), they will almost always produce worse bias than ML and MI techniques, on average ( Newman & Cottrell, 2015 ). Listwise deletion can further lead to extreme reductions in statistical power. Next, single imputation techniques (e.g., mean substitution, stochastic regression imputation)—in which the missing data are filled in only once, and the resulting data matrix is analyzed as if the data had been complete—are seriously flawed because they overestimate sample size and underestimate standard errors and p -values.

Unfortunately, researchers often get confused into thinking that multiple imputation suffers from the same problems as single imputation; it does not. In multiple imputation, missing data are filled in several different times, and the multiple resulting imputed datasets are then aggregated in a way that accounts for the uncertainty in each imputation ( Rubin, 1987 ). Multiple imputation is not an exercise in “making up data”; it is an exercise in tracing the uncertainty of one’s parameter estimates, by looking at the degree of variability across several imprecise guesses (given the available information). The operative word in multiple imputation is multiple , not imputation.
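
As a minimal sketch of this impute-analyze-pool logic, the Python code below uses scikit-learn’s IterativeImputer to create several stochastic imputations of a toy dataset with MAR missingness and then pools a regression slope across imputations using Rubin’s rules. The simulated variables, the missingness mechanism, and the number of imputations are all assumptions made for the illustration; dedicated missing data routines in Mplus, SAS, or R would typically be used in practice.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(3)
n = 300
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)
data = np.column_stack([x, y])
data[rng.random(n) < 0.4 * (x < 0), 1] = np.nan    # y more likely missing when x is low (MAR)

m = 20                                             # number of imputations
estimates, variances = [], []
for i in range(m):
    imputer = IterativeImputer(sample_posterior=True, random_state=i)
    completed = imputer.fit_transform(data)
    fit = sm.OLS(completed[:, 1], sm.add_constant(completed[:, 0])).fit()
    estimates.append(fit.params[1])
    variances.append(fit.bse[1] ** 2)

# Rubin's rules: average the estimates; total variance = within + (1 + 1/m) * between.
pooled = np.mean(estimates)
total_se = np.sqrt(np.mean(variances) + (1 + 1 / m) * np.var(estimates, ddof=1))
print(f"pooled slope = {pooled:.3f}, pooled SE = {total_se:.3f}")
```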

Longitudinal modeling tends to involve a lot of construct- or variable-level missing data (i.e., omitting answers from an entire scale, an entire construct, or an entire wave of observation—e.g., attrition). Such conditions create many partial nonrespondents, or participants for whom some variables have been observed and some other variables have not been observed. Thus a great deal of missing data in longitudinal designs tends to be MAR (e.g., because missing data at Time 2 is related to observed data at Time 1). Because variable-level missingness under the MAR mechanism is the ideal condition for which ML and MI techniques were designed ( Schafer & Graham, 2002 ), both ML and MI techniques (in comparison to listwise deletion, pairwise deletion, and single imputation techniques) will typically produce much less biased estimates and more accurate hypothesis tests when used on longitudinal designs ( Newman, 2003 ). Indeed, ML missing data techniques are now the default techniques in LISREL, Mplus, HLM, and SAS Proc Mixed. It is thus no longer excusable to perform discrete-time longitudinal analyses ( Figure 2 ) without using either ML or MI missing data techniques ( Enders, 2010 ; Graham, 2009 ; Schafer & Graham, 2002 ).

Lastly, because these newer missing data techniques incorporate all of the available data, it is now increasingly important for longitudinal researchers not to give up on early nonrespondents. Attrition need not be a permanent condition. If a would-be respondent chooses not to reply to a survey request at Time 1, the researcher should still attempt to collect data from that person at Time 2 and Time 3. More data = more useful information that can reduce bias and increase statistical power. Applied to longitudinal research on aging and retirement, this means that even when a participant fails to provide responses at some measurement points, continuing to make an effort to collect data from that participant in subsequent waves may still be worthwhile. It will certainly help combat the issue of attrition and allow more usable data to emerge from the longitudinal data collection.

Statistical Techniques Question 4: Most existing longitudinal research focuses on studying quantitative change over time. What if the variable of interest is categorical or if the changes over time are qualitative in nature?

I think there are two questions here: how to model longitudinal data on categorical variables, and how to model discontinuous change patterns in variables over time. In terms of longitudinal categorical data, there are two types of data that researchers typically encounter. One type of data comes from measuring a sample of participants on a categorical variable at a few time points (i.e., panel data). The research question that drives the data analyses is to understand the change of status from one time point to the next. For example, researchers might be interested in whether a population of older workers would stay employed or switch between employed and unemployed statuses (e.g., Wang & Chan, 2011 ). To answer this question, the employment status (employed or unemployed) of a sample of older workers might be measured five or six times over several years. When transition between qualitative statuses is of theoretical interest, this type of panel data can be modeled via Markov chain models. The simplest form of Markov chain model is a simple Markov model with a single chain, which assumes that (a) the observed status at time t depends on the observed status at time t –1, (b) the observed categories are free from measurement error, and (c) the whole population can be described by a single chain. The first assumption is held by most, if not all, Markov chain models. The other two assumptions can be relaxed by using latent Markov chain modeling (see Langeheine & Van de Pol, 2002 , for a detailed explanation).

The basic idea of latent Markov chains is that observed categories reflect the “true” status on latent categorical variables to a certain extent (i.e., the latent categorical variable is the cause of the observed categorical variable). In addition, because the observations may contain measurement error, a number of different observed patterns over time could reflect the same underlying latent transition pattern in qualitative status. This way, a large number of observed patterns (e.g., a maximum of 256 patterns of a categorical variable with four categories measured four times) can be reduced to a small number of theoretically coherent patterns (e.g., a maximum of 16 patterns of a latent categorical variable with two latent statuses over four time points). It is also important to note that subpopulations in a larger population can follow qualitatively different transition patterns. This heterogeneity in latent Markov chains can be modeled by mixture latent Markov modeling, a technique integrating latent Markov modeling and latent class analysis (see Wang & Chan, 2011 , for technical details). Given that mixture latent Markov modeling is part of the general latent variable analysis framework ( Muthén, 2001 ), mixture latent Markov models can include different types of covariates and outcomes (latent or observed, categorical or continuous) of the subpopulation membership as well as the transition parameters of each subpopulation.
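
For the simplest case described above, a single-chain Markov model for manifest categories, estimation amounts to tabulating observed wave-to-wave transitions and row-normalizing them, as the sketch below illustrates with an invented employment-status panel. Latent and mixture latent Markov models require specialized software (e.g., Mplus) rather than this kind of direct tabulation.

```python
import numpy as np

# Invented panel: employment status (0 = unemployed, 1 = employed) for five
# older workers measured at five waves.
panel = np.array([
    [1, 1, 1, 0, 0],
    [1, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 0, 1],
])

counts = np.zeros((2, 2))
for row in panel:
    for a, b in zip(row[:-1], row[1:]):            # consecutive wave-to-wave transitions
        counts[a, b] += 1

# Row-normalizing the counts gives the estimated transition probability matrix.
transition = counts / counts.sum(axis=1, keepdims=True)
print(np.round(transition, 2))
```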

Another type of longitudinal categorical data comes from measuring one or a few study units on many occasions separated by the same time interval (e.g., every hour, day, month, or year). Studies examining this type of data mostly aim to understand the temporal trend or periodic tendency in a phenomenon. For example, one can examine the cyclical trend of daily stressful events (whether they occurred or not) over several months among a few employees. The research goal could be to reveal multiple cyclical patterns within the repeated occurrences of stressful events, such as daily, weekly, and/or monthly cycles. Another example is the study of the performance of a particular player or sports team (i.e., win, lose, or tie) over hundreds of games. The research question could be to identify time-varying factors that account for the cyclical patterns of game performance. The statistical techniques typically used to analyze this type of data belong to the family of categorical time series analyses. A detailed technical review is beyond the current scope, but interested readers can refer to Fokianos and Kedem (2003) for an extended overview.

In terms of modeling discontinuous change patterns of variables, Singer and Willett (2003) and Bollen and Curran (2006) provided guidance on modeling procedures using either the multilevel modeling or structural equation modeling framework. Here I briefly discuss two additional modeling techniques that can achieve similar research goals: spline regression and catastrophe models.

Spline regression is used to model a continuous variable that changes its trajectory at a particular time point (see Marsh & Cormier, 2001 for technical details). For example, newcomers’ satisfaction with coworkers might increase steadily immediately after they enter the organization. Then, due to a critical organizational event (e.g., the downsizing of the company, a newly introduced policy to weed out poor performers in the newcomer cohort), newcomers’ coworker satisfaction may start to drop. A spline model can be used to capture the dramatic change in the trend of newcomer attitudes in response to the event (see Figure 4 for an illustration of this example). The time points at which the variable changes its trajectory are called spline knots. At the spline knots, two regression lines connect. The location of the spline knots may be known ahead of time. However, sometimes the location and the number of spline knots are unknown before data collection. Different spline models and estimation techniques have been developed to handle these different possibilities regarding spline knots ( Marsh & Cormier, 2001 ). In general, spline models can be considered dummy-variable-based models with continuity constraints. Some forms of spline models are equivalent to piecewise linear regression models and are quite easy to implement ( Pindyck & Rubinfeld, 1998 ).

Figure 4. Hypothetical illustration of spline regression: The discontinuous change in newcomers’ satisfaction with coworkers over time.
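
A linear spline with one knot reduces to an ordinary regression on time plus a hinge term, max(0, time − knot), which enforces the continuity constraint noted above. The sketch below fits such a model to invented monthly satisfaction data with an assumed knot at month 6.

```python
import numpy as np
import statsmodels.api as sm

months = np.arange(13)          # months since organizational entry
knot = 6                        # assumed timing of the critical event

# Invented satisfaction values: rising before the event, declining afterwards.
rng = np.random.default_rng(4)
satisfaction = np.where(months <= knot,
                        3.0 + 0.20 * months,
                        3.0 + 0.20 * knot - 0.15 * (months - knot))
satisfaction = satisfaction + rng.normal(scale=0.05, size=months.size)

# Linear spline: satisfaction = b0 + b1*months + b2*max(0, months - knot).
hinge = np.maximum(0, months - knot)
X = sm.add_constant(np.column_stack([months, hinge]))
fit = sm.OLS(satisfaction, X).fit()
print(np.round(fit.params, 3))   # b1 is the pre-knot slope; b1 + b2 is the post-knot slope
```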

Catastrophe models can also be used to describe “sudden” (i.e., catastrophic) discontinuous change in a dynamic system. For example, some systems in organizations develop from one certain state to uncertainty, and then shift to another certain state (e.g., perception of performance; Hanges, Braverman, & Rentsch, 1991 ). This nonlinear dynamic change pattern can be described by a cusp model, one of the most popular catastrophe models in the social sciences. Researchers have applied catastrophe models to understand various types of behaviors at work and in organizations (see Guastello, 2013 for a summary). Estimation procedures are also readily available for fitting catastrophe models to empirical data (see technical introductions in Guastello, 2013 ).

Statistical Techniques Question 5: Could you speculate on the “next big thing” in conceptual or methodological advances in longitudinal research? Specifically, describe a novel idea or specific data analytic model that is rarely used in longitudinal studies in our literature, but could serve as a useful conceptual or methodological tool for future science in work, aging and retirement.

Generally, but mostly on the conceptual level, I think we will see an increased use of computational models to assess theory, design, and analysis. Indeed, I think this will be as big as multilevel analysis in future years, though the rate at which it will happen I cannot predict. The primary factors slowing the rate of adoption are a lack of knowledge of how to do it and ignorance of the cost of not doing it (cf. Vancouver, Tamanini et al. , 2010 ). Factors that will speed its adoption are easy-to-use modeling software and training opportunities. My coauthor and I recently published a tutorial on computational modeling ( Vancouver & Weinhardt, 2012 ), and we provide more details on how to use a specific, free, easy-to-use modeling platform on our web site ( https://sites.google.com/site/motivationmodeling/home ).

On the methodology level, I think research simulations (i.e., virtual worlds) will increase in importance. They offer a great deal of control and the ability to measure many variables continuously or frequently. On the analysis level, I anticipate an increased use of Bayesian and hierarchical Bayesian analysis, particularly to assess computational model fits ( Kruschke, 2010 ; Rouder & Lu, 2005 ; Wagenmakers, 2007 ).

I predict that significant advances in various areas will be made in the near future through the appropriate application of mixture latent modeling approaches. These approaches combine different latent variable techniques such as latent growth modeling, latent class modeling, latent profile analysis, and latent transition analysis into a unified analytical model ( Wang & Hanges, 2011 ). They can also integrate continuous variables and discrete variables, as either predictor or outcome variables, in a single analytical model to describe and explain simultaneous quantitative and qualitative changes over time. In a recent study, my coauthor and I applied an example of a mixture latent model to understand the retirement process ( Wang & Chan, 2011 ). Despite, or rather because of, the power and flexibility of these advanced mixture techniques to fit diverse models to longitudinal data, I will repeat the caution I made over a decade ago—that the application of these complex models to assess changes over time should be guided by adequate theories and relevant previous empirical findings ( Chan, 1998 ).

My hope or wish for the next big thing is the use of longitudinal methods to integrate the micro and macro domains of our literature on work-related phenomena. This will entail combining aspects of growth modeling with multi-level processes. Although I do not have a particular conceptual framework in mind to illustrate this, my reasoning is based on the simple notion that it is the people who make the place. Therefore, it seems logical that we could, for example, study change in some aspect of firm performance across time as a function of change in some aspect of individual behavior and/or attitudes. Another example could be that we can study change in household well-being throughout the retirement process as a function of change in the two partners’ individual well-being over time. The analytical tools exist for undertaking such analyses. What are lacking at this point are the conceptual frameworks.

I hope the next big thing for longitudinal research will be dynamic computational models ( Ilgen & Hulin, 2000 ; Miller & Page, 2007 ; Weinhardt & Vancouver, 2012 ), which encode theory in a manner that is appropriately longitudinal/dynamic. If most theories are indeed theories of change, then this advancement promises to revolutionize what passes for theory in the organizational sciences (i.e., a computational model is a formal theory, with much more specific, risky, and therefore more meaningful predictions about phenomena—in comparison to the informal verbal theories that currently dominate and are somewhat vague with respect to time). My preferred approach is iterative: (a) authors first collect longitudinal data, then (b) inductively build a parsimonious computational model that can reproduce the data, then (c) collect more longitudinal data and consider its goodness of fit with the model, then (d) suggest possible model modifications, and then repeat steps (c) and (d) iteratively until some convergence is reached (e.g., Stasser, 2000 , 1988 describes one such effort in the context of group discussion and decision making theory). Exactly how to implement all the above steps is not currently well known, but developments in this area can potentially change what we think good theory is.

I am uncertain whether my “next big thing” truly reflects the wave of the future, or if it instead simply reflects my own hopes for where longitudinal research should head in our field. I will play it safe and treat it as the latter. Consistent with several other responses to this question, I hope that researchers will soon begin to incorporate far more complex dynamics of processes into both their theorizing and their methods of analysis. Although process dynamics can (and do) occur at all levels of analysis, I am particularly excited by the prospect of linking them across at least adjacent levels. For example, basic researchers interested in the dynamic aspects of affect recently have begun theorizing and modeling emotional experiences using various forms of differential structural equation or state-space models (e.g. Chow et al. , 2005 ; Kuppens, Oravecz, & Tuerlinckx, 2010 ), and, as the resulting parameters that describe within-person dynamics can be aggregated to higher levels of analysis (e.g., Beal, 2014 ; Wang, Hamaker, & Bergeman, 2012 ), they are inherently multilevel.

Another example of models that capture this complexity and are increasingly used in both immediate and longer-term longitudinal research is the multivariate latent change score model ( Ferrer & McArdle, 2010 ; McArdle, 2009 ; Liu et al. , 2016 ). These models extend LGMs to include a broader array of sources of change (e.g., autoregressive and cross-lagged factors) and consequently capture more of the complexity of changes that can occur in one or more variables measured over time. All of these models share a common interest in modeling the underlying dynamic patterns of a variable (e.g., linear, curvilinear, or exponential growth, cyclical components, feedback processes), while also taking into consideration the “shocks” to the underlying system (e.g., affective events, organizational changes, etc.), allowing them to better assess the complexity of dynamic processes with greater accuracy and flexibility ( Wang et al. , 2016 ).

I believe that applying a dynamical systems framework will greatly advance our research. Applying the dynamic systems framework (e.g., DeShon, 2012 ; Vancouver, Weinhardt, & Schmidt, 2010 ; Wang et al. , 2016 ) forces us to more explicitly conceptualize how changes unfold over time in a particular system. Dynamic systems models can also answer the why question better by specifying how elements of a system work together over time to bring about the observed change at the system level. Studies on dynamic systems models also tend to provide richer data and more detailed analyses on the processes (i.e., the black boxes not measured in traditional research) in a system. A number of research design and analysis methods relevant for dynamical systems frameworks are available, such as computational modeling, ESM, event history analyses, and time series analyses ( Wang et al. , 2016 ).

M. Wang’s work on this article was supported in part by the Netherlands Institute for Advanced Study in the Humanities and Social Sciences.

Ainslie G. , & Haslam N . ( 1992 ). Hyperbolic discounting . In G. Loewenstein J. Elster (Eds.), Choice over time (pp. 57 – 92 ). New York, NY : Russell Sage Foundation .

Ancona D. G. Goodman P. S. Lawrence B. S. , & Tushman M. L . ( 2001 ). Time: A new research lens . Academy of Management Review , 26 , 645 – 663 . doi: 10.5465/AMR.2001.5393903

Ashford S. J . ( 1986 ). The role of feedback seeking in individual adaptation: A resource perspective . Academy of Management Journal , 29 , 465 – 487 . doi: 10.2307/256219

Bayley N . ( 1956 ). Individual patterns of development . Child Development , 27 , 45 – 74 . doi: 10.2307/1126330

Beal D. J . ( 2014 ). Time and emotions at work . In Shipp A. J. Fried Y. (Eds.), Time and work (Vol. 1 , pp. 40 – 62 ). New York, NY : Psychology Press .

Beal D. J . ( 2015 ). ESM 2.0: State of the art and future potential of experience sampling methods in organizational research . Annual Review of Organizational Psychology and Organizational Behavior , 2 , 383 – 407 .

Beal D. J. , & Ghandour L . ( 2011 ). Stability, change, and the stability of change in daily workplace affect . Journal of Organizational Behavior , 32 , 526 – 546 . doi: 10.1002/job.713

Beal D. J. , & Weiss H. M . ( 2013 ). The episodic structure of life at work . In Bakker A. B. Daniels K. (Eds.), A day in the life of a happy worker (pp. 8 – 24 ). London, UK : Psychology Press .

Beal D. J. , & Weiss H. M . ( 2003 ). Methods of ecological momentary assessment in organizational research . Organizational Research Methods , 6 , 440 – 464 . doi: 10.1177/1094428103257361

Beal D. J. Weiss H. M. Barros E. , & MacDermid S. M . ( 2005 ). An episodic process model of affective influences on performance . Journal of Applied Psychology , 90 , 1054 . doi: 10.1037/0021-9010.90.6.1054

Bentein K. Vandenberghe C. Vandenberg R. , & Stinglhamber F . ( 2005 ). The role of change in the relationship between commitment and turnover: a latent growth modeling approach . Journal of Applied Psychology , 90 , 468 – 482 . doi: 10.1037/0021-9010.90.3.468

Bliese P. D. , & Ployhart R. E . ( 2002 ). Growth modeling using random coefficient models: Model building, testing, and illustrations . Organizational Research Methods , 5 , 362 – 387 . doi: 10.1177/109442802237116

Bolger N. Davis A. , & Rafaeli E . ( 2003 ). Diary methods: Capturing life as it is lived . Annual Review of Psychology , 54 , 579 – 616 . doi: 10.1146/annurev.psych.54.101601.145030

Bolger N. , & Laurenceau J.-P . ( 2013 ). Intensive longitudinal methods: An introduction to diary and experience sampling research . New York, NY : Guilford .

Bollen K. A. , & Curran P. J . ( 2006 ). Latent curve models: A structural equation approach . Hoboken, NJ : Wiley .

Carsten J. M. , & Spector P. E . ( 1987 ). Unemployment, job satisfaction, and employee turnover: A meta-analytic test of the Muchinsky model . Journal of Applied Psychology , 72 , 374 . doi: 10.1037/0021-9010.72.3.374

Castiglioni L. Pforr K. , & Krieger U . ( 2008 ). The effect of incentives on response rates and panel attrition: Results of a controlled experiment . Survey Research Methods , 2 , 151 – 158 . doi: 10.18148/srm/2008.v2i3.599

Chan D . ( 1998 ). The conceptualization and analysis of change over time: An integrative approach incorporating longitudinal mean and covariance structures analysis (LMACS) and multiple indicator latent growth modeling (MLGM) . Organizational Research Methods , 1 , 421 – 483 . doi: 10.1177/109442819814004

Chan D . ( 2002 ). Longitudinal modeling . In S. Rogelberg (Ed.), Handbook of research methods in industrial and organizational psychology (pp. 412 – 430 ). Malden, MA : Blackwell Publishers, Inc .

Chan D . ( 2010 ). Advances in analytical strategies . In S. Zedeck (Ed.), APA handbook of industrial and organizational psychology (Vol. 1 ), Washington, DC : APA .

Chan D . ( 2014 ). Time and methodological choices . In A. J. Shipp Y. Fried (Eds.), Time and work (Vol. 2): How time impacts groups, organizations, and methodological choices . New York, NY : Psychology Press .

Chan D. , & Schmitt N . ( 2000 ). Interindividual differences in intraindividual changes in proactivity during organizational entry: A latent growth modeling approach to understanding newcomer adaptation . Journal of Applied Psychology , 85 , 190 – 210 .

Chow S. M. Ram N. Boker S. M. Fujita F. , & Clore G . ( 2005 ). Emotion as a thermostat: representing emotion regulation using a damped oscillator model . Emotion , 5 , 208 – 225 . doi: 10.1037/1528-3542.5.2.208

Cole M. S. Bedeian A. G. , & Feild H. S . ( 2006 ). The measurement equivalence of web-based and paper-and-pencil measures of transformational leadership: A multinational test . Organizational Research Methods , 9 , 339 – 368 . doi: 10.1177/1094428106287434

Cole D. A. , & Maxwell S. E . ( 2003 ). Testing mediational models with longitudinal data: Questions and tips in the use of structural equation modeling . Journal of Abnormal Psychology , 112 , 558 – 577 . doi: 10.1037/0021-843X.112.4.558

Csikszentmihalyi M. , & Larson R . ( 1987 ). Validity and reliability of the experience sampling method . Journal of Nervous and Mental Disease , 175 , 526 – 536 .

DeShon R. P . ( 2012 ). Multivariate dynamics in organizational science . In S. W. J. Kozlowski (Ed.), The Oxford Handbook of Organizational Psychology (pp. 117 – 142 ). New York, NY : Oxford University Press .

Diener E. Inglehart R. , & Tay L . ( 2013 ). Theory and validity of life satisfaction scales . Social Indicators Research , 112 , 497 – 527 . doi: 10.1007/s11205-012-0076-y

Enders C. K . ( 2001 ). A primer on maximum likelihood algorithms available for use with missing data . Structural Equation Modeling , 8 , 128 – 141 .

Enders C. K . ( 2010 ). Applied missing data analysis . New York City, NY : The Guilford Press .

Gersick C. J . ( 1988 ). Time and transition in work teams: Toward a new model of group development . Academy of Management Journal , 31 , 9 – 41 . doi: 10.2307/256496

Graham J. W . ( 2009 ). Missing data analysis: Making it work in the real world . Annual Review of Psychology , 60 , 549 – 576 . doi: 10.1146/annurev.psych.58.110405.085530

Ferrer E. , & McArdle J. J . ( 2010 ). Longitudinal modeling of developmental changes in psychological research . Current Directions in Psychological Science , 19 , 149 – 154 . doi: 10.1177/0963721410370300

Fisher G. G. Chaffee D. S. , & Sonnega A . ( 2016 ). Retirement timing: A review and recommendations for future research . Work, Aging and Retirement , 2 , 230 – 261 . doi: 10.1093/workar/waw001

Fokianos K. , & Kedem B . ( 2003 ). Regression theory for categorical time series . Statistical Science , 357 – 376 . doi: 10.1214/ss/1076102425

Fraley R. C . ( 2002 ). Attachment stability from infancy to adulthood: Meta-analysis and dynamic modeling of developmental mechanisms . Personality and Social Psychology Review , 6 , 123 – 151 . doi: 10.1207/S15327957PSPR0602_03

Fredrickson B. L . ( 2000 ). Extracting meaning from past affective experiences: The importance of peaks, ends, and specific emotions . Cognition and Emotion , 14 , 577 – 606 .

Fumagalli L. Laurie H. , & Lynn P . ( 2013 ). Experiments with methods to reduce attrition in longitudinal surveys . Journal of the Royal Statistical Society: Series A (Statistics in Society) , 176 , 499 – 519 . doi: 10.1111/j.1467-985X.2012.01051.x

Golembiewski R. T. Billingsley K. , & Yeager S . ( 1976 ). Measuring change and persistence in human affairs: Types of change generated by OD designs . Journal of Applied Behavioral Science , 12 , 133 – 157 . doi: 10.1177/002188637601200201

Gosling S. D. Vazire S. Srivastava S. , & John O. P . ( 2004 ). Should we trust web-based studies? A comparative analysis of six preconceptions about internet questionnaires . American Psychologist , 59 , 93 – 104 . doi: 10.1037/0003-066X.59.2.93

Green A. S. Rafaeli E. Bolger N. Shrout P. E. , & Reis H. T . ( 2006 ). Paper or plastic? Data equivalence in paper and electronic diaries . Psychological Methods , 11 , 87 – 105 . doi: 10.1037/1082-989X.11.1.87

Groves R. M. Couper M. P. Presser S. Singer E. Tourangeau R. Acosta G. P. , & Nelson L . ( 2006 ). Experiments in producing nonresponse bias . Public Opinion Quarterly , 70 , 720 – 736 . doi: 10.1093/poq/nfl036

Guastello S. J . ( 2013 ). Chaos, catastrophe, and human affairs: Applications of nonlinear dynamics to work, organizations, and social evolution . New York, NY : Psychology Press

Hanges P. J. Braverman E. P. , & Rentsch J. R . ( 1991 ). Changes in raters’ perceptions of subordinates: A catastrophe model . Journal of Applied Psychology , 76 , 878 – 888 . doi: 10.1037/0021-9010.76.6.878

Heybroek L. Haynes M. , & Baxter J . ( 2015 ). Life satisfaction and retirement in Australia: A longitudinal approach . Work, Aging and Retirement , 1 , 166 – 180 . doi: 10.1093/workar/wav006

Hulin C. L. Henry R. A. , & Noon S. L . ( 1990 ). Adding a dimension: Time as a factor in the generalizability of predictive relationships . Psychological Bulletin , 107 , 328 – 340 .

Humphreys L. G . ( 1968 ). The fleeting nature of the prediction of college academic success . Journal of Educational Psychology , 59 , 375 – 380 .

Ilgen D. R. , & Hulin C. L . (Eds.). ( 2000 ). Computational modeling of behavior in organizations: The third scientific discipline . Washington, DC : American Psychological Association .

James L. R. Mulaik S. A. , & Brett J. M . ( 1982 ). Causal analysis: Assumptions, models, and data . Beverly Hills, CA : Sage Publications .

Kahneman D . ( 1999 ). Objective happiness . In D. Kahneman E. Diener N. Schwarz (Eds.), Well-being: The foundations of hedonic psychology (pp. 3 – 25 ). New York, NY : Russell Sage Foundation .

Keil C. T. , & Cortina J. M . ( 2001 ). Degradation of validity over time: A test and extension of Ackerman’s model . Psychological Bulletin , 127 , 673 – 697 .

Kessler R. C. , & Greenberg D. F . ( 1981 ). Linear panel analysis: Models of quantitative change . New York, NY : Academic Press .

Kruschke J. K . ( 2010 ). What to believe: Bayesian methods for data analysis . Trends in Cognitive Sciences , 14 , 293 – 300 . doi: 10.1016/j.tics.2010.05.001

Kuljanin G. Braun M. T. , & DeShon R. P . ( 2011 ). A cautionary note on modeling growth trends in longitudinal data . Psychological Methods , 16 , 249 – 264 . doi: 10.1037/a0023348

Kuppens P. Oravecz Z. , & Tuerlinckx F . ( 2010 ). Feelings change: accounting for individual differences in the temporal dynamics of affect . Journal of Personality and Social Psychology , 99 , 1042 – 1060 . doi: 10.1037/a0020962

Lance C. E. , & Vandenberg R. J . (Eds.). ( 2009 ) Statistical and methodological myths and urban legends: Doctrine, verity and fable in the organizational and social sciences . New York, NY : Taylor & Francis .

Langeheine R. , & Van de Pol F . ( 2002 ). Latent Markov chains . In J. A. Hagenaars A. L. McCutcheon (Eds.), Applied latent class analysis (pp. 304 – 341 ). New York City, NY : Cambridge University Press .

Laurie H . ( 2008 ). Minimizing panel attrition . In S. Menard (Ed.), Handbook of longitudinal research: Design, measurement, and analysis . Burlington, MA : Academic Press .

Laurie H. , & Lynn P . ( 2008 ). The use of respondent incentives on longitudinal surveys (Working Paper No. 2008–42 ) . Retrieved from Institute of Social and Economic Research website: https://www.iser.essex.ac.uk/files/iser_working_papers/2008–42.pdf

Laurie H. Smith R. , & Scott L . ( 1999 ). Strategies for reducing nonresponse in a longitudinal panel survey . Journal of Official Statistics , 15 , 269 – 282 .

Little R. J. A. , & Rubin D. B . ( 1987 ). Statistical analysis with missing data . New York, NY : Wiley .

Liu Y. Mo S. Song Y. , & Wang M . ( 2016 ). Longitudinal analysis in occupational health psychology: A review and tutorial of three longitudinal modeling techniques . Applied Psychology: An International Review , 65 , 379 – 411 . doi: 10.1111/apps.12055

Madero-Cabib I Gauthier J. A. , & Le Goff J. M . ( 2016 ). The influence of interlocked employment-family trajectories on retirement timing . Work, Aging and Retirement , 2 , 38 – 53 . doi: 10.1093/workar/wav023

Marsh L. C. , & Cormier D. R . ( 2001 ). Spline regression models . Thousand Oaks, CA : Sage Publications .

Martin G. L. , & Loes C. N . ( 2010 ). What incentives can teach us about missing data in longitudinal assessment . New Directions for Institutional Research , S2 , 17 – 28 . doi: 10.1002/ir.369

Meade A. W. Michels L. C. , & Lautenschlager G. J . ( 2007 ). Are Internet and paper-and-pencil personality tests truly comparable? An experimental design measurement invariance study . Organizational Research Methods , 10 , 322 – 345 . doi: 10.1177/1094428106289393

McArdle J. J . ( 2007 ). Dynamic structural equation modeling in longitudinal experimental studies . In K. V. Montfort H. Oud , & A. Satorra (Eds.), Longitudinal models in the behavioural and related sciences (pp. 159 – 188 ). Mahwah, NJ : Lawrence Erlbaum .

McArdle J. J . ( 2009 ). Latent variable modeling of differences and changes with longitudinal data . Annual Review of Psychology , 60 , 577 – 605 . doi: 10.1146/annurev.psych.60.110707.163612

McArdle J. J. Grimm K. J. Hamagami F. Bowles R. P. , & Meredith W . ( 2009 ). Modeling life-span growth curves of cognition using longitudinal data with multiple samples and changing scales of measurement . Psychological Methods , 14 , 126 – 149 .

McGrath J. E. , & Rotchford N. L . ( 1983 ). Time and behavior in organizations . Research in Organizational Behavior , 5 , 57 – 101 .

Miller J. H. , & Page S. E . ( 2007 ). Complex adaptive systems: An introduction to computational models of social life . Princeton, NJ, USA : Princeton University Press .

Mitchell T. R. , & James L. R . ( 2001 ). Building better theory: Time and the specification of when things happen . Academy of Management Review , 26 , 530 – 547 . doi: 10.5465/AMR.2001.5393889

Morrison E. W . ( 2002 ). Information seeking within organizations . Human Communication Research , 28 , 229 – 242 . doi: 10.1111/j.1468-2958.2002.tb00805.x

Muthén B . ( 2001 ). Second-generation structural equation modeling with a combination of categorical and continuous latent variables: New opportunities for latent class–latent growth modeling . In L. M. Collins A. G. Sayer (Eds.), New methods for the analysis of change. Decade of behavior (pp. 291 – 322 ). Washington, DC : American Psychological Association .

Muthén L. K. , & Muthén B. O . (1998– 2012 ). Mplus user’s guide . 7th ed. Los Angeles, CA : Muthén & Muthén .

Newman D. A . ( 2003 ). Longitudinal modeling with randomly and systematically missing data: A simulation of ad hoc, maximum likelihood, and multiple imputation techniques . Organizational Research Methods , 6 , 328 – 362 . doi: 10.1177/1094428103254673

Newman D. A . ( 2009 ). Missing data techniques and low response rates: The role of systematic nonresponse parameters . In C. E. Lance R. J. Vandenberg (Eds.), Statistical and methodological myths and urban legends: Doctrine, verity, and fable in the organizational and social sciences (pp. 7 – 36 ). New York, NY : Routledge .

Newman D. A. , & Cottrell J. M . ( 2015 ). Missing data bias: Exactly how bad is pairwise deletion? In C. E. Lance R. J. Vandenberg (Eds.), More statistical and methodological myths and urban legends , pp. 133 – 161 . New York, NY : Routledge .

Newman D. A . ( 2014 ). Missing data: Five practical guidelines . Organizational Research Methods , 17 , 372 – 411 . doi: 10.1177/1094428114548590

Pindyck R. S. , & Rubinfeld D. L . ( 1998 ). Econometric Models and Economic Forecasts . Auckland, New Zealand : McGraw-Hill .

Pinquart M. , & Schindler I . ( 2007 ). Changes of life satisfaction in the transition to retirement: A latent-class approach . Psychology and Aging , 22 , 442 – 455 . doi: 10.1037/0882-7974.22.3.442

Ployhart R. E. , & Hakel M. D . ( 1998 ). The substantive nature of performance variability: Predicting interindividual differences in intraindividual performance . Personnel Psychology , 51 , 859 – 901 . doi: 10.1111/j.1744-6570.1998.tb00744.x

Ployhart R. E. , & Vandenberg R. J . ( 2010 ). Longitudinal Research: The theory, design, and analysis of change . Journal of Management , 36 , 94 – 120 . doi: 10.1177/0149206309352110

Podsakoff P. M. MacKenzie S. B. Lee J. Y. , & Podsakoff N. P . ( 2003 ). Common method biases in behavioral research: a critical review of the literature and recommended remedies . Journal of Applied Psychology , 88 , 879 – 903 . doi: 10.1037/0021-9010.88.5.879

Redelmeier D. A. , & Kahneman D . ( 1996 ). Patients’ memories of painful medical treatments: real-time and retrospective evaluations of two minimally invasive procedures . Pain , 66 , 3 – 8 .

Robinson M. D. , & Clore G. L . ( 2002 ). Belief and feeling: evidence for an accessibility model of emotional self-report . Psychological Bulletin , 128 , 934 – 960 .

Rogelberg S. G. Conway J. M. Sederburg M. E. Spitzmuller C. Aziz S. , & Knight W. E . ( 2003 ). Profiling active and passive nonrespondents to an organizational survey . Journal of Applied Psychology , 88 , 1104 – 1114 . doi: 10.1037/0021-9010.88.6.1104

Rogosa D. R . ( 1995 ). Myths and methods: “Myths about longitudinal research” plus supplemental questions . In J. M. Gottman (Ed.), The analysis of change (pp. 3 – 66 ). Mahwah, NJ : Lawrence Erlbaum .

Rouder J. N. , & Lu J . ( 2005 ). An introduction to Bayesian hierarchical models with an application in the theory of signal detection . Psychonomic Bulletin & Review , 12 , 573 – 604 . doi: 10.3758/BF03196750

Rubin D. B . ( 1987 ). Multiple imputation for nonresponse in surveys . New York, NY : John Wiley .

Schafer J. L. , & Graham J. W . ( 2002 ). Missing data: Our view of the state of the art . Psychological Methods , 7 , 147 – 177 .

Schaie K. W . ( 1965 ). A general model for the study of developmental problems . Psychological Bulletin , 64 , 92 – 107 . doi: 10.1037/h0022371

Schmitt N . ( 1982 ). The use of analysis of covariance structures to assess beta and gamma change . Multivariate Behavioral Research , 17 , 343 – 358 . doi: 10.1207/s15327906mbr1703_3

Shadish W. R. Cook T. D. , & Campbell D. T . ( 2002 ). Experimental and quasi-experimental designs for generalized causal inference . Boston, MA : Houghton Mifflin .

Shingles R . ( 1985 ). Causal inference in cross-lagged panel analysis . In H. M. Blalock (Ed.), Causal models in panel and experimental design (pp. 219 – 250 ). New York, NY : Aldine .

Singer E. , & Kulka R. A . ( 2002 ). Paying respondents for survey participation . In M. ver Ploeg R. A. Moffit , & C. F. Citro (Eds.), Studies of welfare populations: Data collection and research issues (pp. 105 – 128 ). Washington, DC : National Research Council .

Singer J. D. , & Willett J. B . ( 2003 ). Applied longitudinal data analysis: Modeling change and event occurrence . New York, NY : Oxford University Press .

Sitzmann T. , & Yeo G . ( 2013 ). A meta-analytic investigation of the within-person self-efficacy domain: Is self-efficacy a product of past performance or a driver of future performance? Personnel Psychology , 66 , 531 – 568 . doi: 10.1111/peps.12035

Solomon R. L. , & Corbit J. D . ( 1974 ). An opponent-process theory of motivation: I. Temporal dynamics of affect . Psychological Review , 81 , 119 – 145 . doi: 10.1037/h0036128

Stasser G . ( 1988 ). Computer simulation as a research tool: The DISCUSS model of group decision making . Journal of Experimental Social Psychology , 24 , 393 – 422 . doi: 10.1016/0022-1031(88)90028-5

Stasser G . ( 2000 ). Information distribution, participation, and group decision: Explorations with the DISCUSS and SPEAK models . In D. R. Ilgen R. Daniel , & C. L. Hulin (Eds.), Computational modeling of behavior in organizations: The third scientific discipline (pp. 135 – 161 ). Washington, DC : American Psychological Association .

Stone-Romero E. F. , & Rosopa P. J . ( 2010 ). Research design options for testing mediation models and their implications for facets of validity . Journal of Managerial Psychology , 25 , 697 – 712 . doi: 10.1108/02683941011075256

Tay L . ( 2015 ). Expimetrics [Computer software] . Retrieved from http://www.expimetrics.com

Tay L. Chan D. , & Diener E . ( 2014 ). The metrics of societal happiness . Social Indicators Research , 117 , 577 – 600 . doi: 10.1007/s11205-013-0356-1

Taris T . ( 2000 ). Longitudinal data analysis . London, UK : Sage Publications .

Tesluk P. E. , & Jacobs R. R . ( 1998 ). Toward an integrated model of work experience . Personnel Psychology , 51 , 321 – 355 . doi: 10.1111/j.1744-6570.1998.tb00728.x

Tisak J. , & Tisak M. S . ( 2000 ). Permanency and ephemerality of psychological measures with application to organizational commitment . Psychological Methods , 5 , 175 – 198 .

Uy M. A. Foo M. D. , & Aguinis H . ( 2010 ). Using experience sampling methodology to advance entrepreneurship theory and research . Organizational Research Methods , 13 , 31 – 54 . doi: 10.1177/1094428109334977

Vancouver J. B. Gullekson N. , & Bliese P . ( 2007 ). Lagged Regression as a Method for Causal Analysis: Monte Carlo Analyses of Possible Artifacts . Poster submitted to the annual meeting of the Society for Industrial and Organizational Psychology, New York .

Vancouver J. B. Tamanini K. B. , & Yoder R. J . ( 2010 ). Using dynamic computational models to reconnect theory and research: Socialization by the proactive newcomer as example . Journal of Management , 36 , 764 – 793 . doi: 10.1177/0149206308321550

Vancouver J. B. , & Weinhardt J. M . ( 2012 ). Modeling the mind and the milieu: Computational modeling for micro-level organizational researchers . Organizational Research Methods , 15 , 602 – 623 . doi: 10.1177/1094428112449655

Vancouver J. B. Weinhardt J. M. , & Schmidt A. M . ( 2010 ). A formal, computational theory of multiple-goal pursuit: integrating goal-choice and goal-striving processes . Journal of Applied Psychology , 95 , 985 – 1008 . doi: 10.1037/a0020628

Vandenberg R. J. , & Lance C. E . ( 2000 ). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research . Organizational Research Methods , 3 , 4 – 70 . doi: 10.1177/109442810031002

Wagenmakers E. J . ( 2007 ). A practical solution to the pervasive problems of p values . Psychonomic Bulletin & Review , 14 , 779 – 804 . doi: 10.3758/BF03194105

Wang M . ( 2007 ). Profiling retirees in the retirement transition and adjustment process: Examining the longitudinal change patterns of retirees’ psychological well-being . Journal of Applied Psychology , 92 , 455 – 474 . doi: 10.1037/0021-9010.92.2.455

Wang M. , & Bodner T. E . ( 2007 ). Growth mixture modeling: Identifying and predicting unobserved subpopulations with longitudinal data . Organizational Research Methods , 10 , 635 – 656 . doi: 10.1177/1094428106289397

Wang M. , & Chan D . ( 2011 ). Mixture latent Markov modeling: Identifying and predicting unobserved heterogeneity in longitudinal qualitative status change . Organizational Research Methods , 14 , 411 – 431 . doi: 10.1177/1094428109357107

Wang M. , & Hanges P . ( 2011 ). Latent class procedures: Applications to organizational research . Organizational Research Methods , 14 , 24 – 31 . doi: 10.1177/1094428110383988

Wang M. Henkens K. , & van Solinge H . ( 2011 ). Retirement adjustment: A review of theoretical and empirical advancements . American Psychologist , 66 , 204 – 213 . doi: 10.1037/a0022414

Wang M. Zhou L. , & Zhang Z . ( 2016 ). Dynamic modeling . Annual Review of Organizational Psychology and Organizational Behavior , 3 , 241 – 266 .

Wang L. P. Hamaker E. , & Bergeman C. S . ( 2012 ). Investigating inter-individual differences in short-term intra-individual variability . Psychological Methods , 17 , 567 – 581 . doi: 10.1037/a0029317

Warren D. A . ( 2015 ). Pathways to retirement in Australia: Evidence from the HILDA survey . Work, Aging and Retirement , 1 , 144 – 165 . doi: 10.1093/workar/wau013

Weikamp J. G. , & Göritz A. S . ( 2015 ). How stable is occupational future time perspective over time? A six-wave study across 4 years . Work, Aging and Retirement , 1 , 369 – 381 . doi: 10.1093/workar/wav002

Weinhardt J. M. , & Vancouver J. B . ( 2012 ). Computational models and organizational psychology: Opportunities abound . Organizational Psychology Review , 2 , 267 – 292 . doi: 10.1177/2041386612450455

Weiss H. M. , & Cropanzano R . ( 1996 ). Affective Events Theory: A theoretical discussion of the structure, causes and consequences of affective experiences at work . Research in Organizational Behavior , 18 , 1 – 74 .

Zacks J. M. Speer N. K. Swallow K. M. Braver T. S. , & Reynolds J. R . ( 2007 ). Event perception: a mind-brain perspective . Psychological Bulletin , 133 , 273 – 293 . doi: 10.1037/0033-2909.133.2.273

What Is a Longitudinal Study?

Tracking Variables Over Time

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

The Typical Longitudinal Study

A longitudinal study follows what happens to selected variables over an extended time. Psychologists use the longitudinal study design to explore possible relationships among variables in the same group of individuals over an extended period.

Once researchers have determined the study's scope, participants, and procedures, most longitudinal studies begin with baseline data collection. In the days, months, years, or even decades that follow, they continually gather more information so they can observe how variables change over time relative to the baseline.

For example, imagine that researchers are interested in the mental health benefits of exercise in middle age and how exercise affects cognitive health as people age. The researchers hypothesize that people who are more physically fit in their 40s and 50s will be less likely to experience cognitive declines in their 70s and 80s.

Longitudinal vs. Cross-Sectional Studies

Longitudinal studies, a type of correlational research, are usually observational, in contrast with cross-sectional research. Longitudinal research involves collecting data over an extended time, whereas cross-sectional research involves collecting data at a single point.

To test this hypothesis, the researchers recruit participants who are in their mid-40s to early 50s. They collect data related to current physical fitness, exercise habits, and performance on cognitive function tests. The researchers continue to track activity levels and test results for a certain number of years, look for trends in and relationships among the studied variables, and test the data against their hypothesis to form a conclusion.

Examples of Early Longitudinal Study Design

Examples of longitudinal studies extend back to the 17th century, when King Louis XIV periodically gathered information from his Canadian subjects, including their ages, marital statuses, occupations, and assets such as livestock and land. He used the data to spot trends over the years and understand his colonies' health and economic viability.

In the 18th century, Count Philibert Gueneau de Montbeillard conducted the first recorded longitudinal study when he measured his son every six months and published the information in "Histoire Naturelle."

The Genetic Studies of Genius (also known as the Terman Study of the Gifted), which began in 1921, is one of the first studies to follow participants from childhood into adulthood. Psychologist Lewis Terman's goal was to examine the similarities among gifted children and disprove the common assumption at the time that gifted children were "socially inept."

Types of Longitudinal Studies

Longitudinal studies fall into three main categories.

  • Panel study : Sampling of a cross-section of individuals
  • Cohort study : Sampling of a group based on a specific event, such as birth, geographic location, or experience
  • Retrospective study : Review of historical information such as medical records

Benefits of Longitudinal Research

A longitudinal study can provide valuable insight that other studies can't. They're particularly useful when studying developmental and lifespan issues because they allow glimpses into changes and possible reasons for them.

For example, some longitudinal studies have explored differences and similarities among identical twins, some reared together and some apart. In these types of studies, researchers tracked participants from childhood into adulthood to see how environment influences personality, achievement, and other areas.

Because the participants share the same genetics, researchers chalked up any differences to environmental factors. Researchers can then look at what the participants have in common and where they differ to see which characteristics are more strongly influenced by either genetics or experience. Note that adoption agencies no longer separate twins, so such studies are unlikely today. Longitudinal studies on twins have shifted to those within the same household.

As with other types of psychology research, researchers must take into account some common challenges when considering, designing, and performing a longitudinal study.

Longitudinal studies require time and are often quite expensive. Because of this, these studies often have only a small group of subjects, which makes it difficult to apply the results to a larger population.

Selective Attrition

Participants sometimes drop out of a study for any number of reasons, like moving away from the area, illness, or simply losing motivation. This tendency, known as selective attrition, shrinks the sample size and decreases the amount of data collected.

If the final group no longer reflects the original representative sample, attrition can threaten the validity of the experiment. Validity refers to whether or not a test or experiment accurately measures what it claims to measure. If the final group of participants doesn't represent the larger group accurately, generalizing the study's conclusions is difficult.

The World’s Longest-Running Longitudinal Study

Lewis Terman aimed to investigate how highly intelligent children develop into adulthood with his "Genetic Studies of Genius." Results from this study were still being compiled into the 2000s. However, Terman was a proponent of eugenics and has been accused of letting his own sexism, racism, and economic prejudice influence his study and of drawing major conclusions from weak evidence. Despite these criticisms, Terman's study remains influential in longitudinal research. For example, a recent study found new information on the original Terman sample, which indicated that men who skipped a grade as children went on to have higher incomes than those who didn't.

A Word From Verywell

Longitudinal studies can provide a wealth of valuable information that would be difficult to gather any other way. Despite the typical expense and time involved, longitudinal studies from the past continue to influence and inspire researchers and students today.

A longitudinal study follows up with the same sample (i.e., group of people) over time, whereas a cross-sectional study examines one sample at a single point in time, like a snapshot.

A longitudinal study can occur over any length of time, from a few weeks to a few decades or even longer.

The number of participants needed depends on what researchers are investigating. A researcher can measure data on just one participant or on thousands over time. The larger the sample size, the more likely the study is to yield results that can be extrapolated to a larger population.

Piccinin AM, Knight JE. History of longitudinal studies of psychological aging . Encyclopedia of Geropsychology. 2017:1103-1109. doi:10.1007/978-981-287-082-7_103

Terman L. Study of the gifted . In: The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation. 2018. doi:10.4135/9781506326139.n691

Sahu M, Prasuna JG. Twin studies: A unique epidemiological tool .  Indian J Community Med . 2016;41(3):177-182. doi:10.4103/0970-0218.183593

Almqvist C, Lichtenstein P. Pediatric twin studies . In:  Twin Research for Everyone . Elsevier; 2022:431-438.

Warne RT. An evaluation (and vindication?) of Lewis Terman: What the father of gifted education can teach the 21st century . Gifted Child Q. 2018;63(1):3-21. doi:10.1177/0016986218799433

Warne RT, Liu JK. Income differences among grade skippers and non-grade skippers across genders in the Terman sample, 1936–1976 . Learning and Instruction. 2017;47:1-12. doi:10.1016/j.learninstruc.2016.10.004

Wang X, Cheng Z. Cross-sectional studies: Strengths, weaknesses, and recommendations .  Chest . 2020;158(1S):S65-S71. doi:10.1016/j.chest.2020.03.012

Caruana EJ, Roman M, Hernández-Sánchez J, Solli P. Longitudinal studies .  J Thorac Dis . 2015;7(11):E537-E540. doi:10.3978/j.issn.2072-1439.2015.10.63

Vittana.org

13 Advantages and Disadvantages of Longitudinal Studies

Longitudinal studies are a method of observational research. In this type of study, data is gathered from the same subjects repeatedly over a defined period. Because of this structure, it is possible for a longitudinal study to last for several years or even several decades. This form of research is common in the areas of sociology, psychology, and medicine.

The primary advantage of using this form of research is that it can help find patterns that may occur over long periods, but would not be observed over short periods. Changes can be tracked so that cause and effect relationships can be discovered.

The primary disadvantage of using longitudinal studies for research is that long-term research increases the chances of unpredictable outcomes. If the same people cannot be found for a study update, then the research ceases.

Here are some additional key advantages and disadvantages of longitudinal studies to think about.

What Are the Advantages of Longitudinal Studies?

1. It allows for high levels of validity. For a long-term study to be successful, there must be rules and regulations in place at the beginning that dictate the path that researchers must follow. The end goal of the research must be defined at the beginning of the process as well, with outlined steps in place that verify the authenticity of the data being collected. This means high levels of data validity are often available through longitudinal studies.

2. The data collected is unique. Most research studies will collect short-term data to determine the cause-and-effect of what is being researched. Longitudinal studies follow the same principles, but extend the timeframe for data collection on a dramatic scale. Long-term relationships cannot be discovered in short-term research, but short-term relationships can be tracked in long-term research.

3. Most will use the observational method. Because longitudinal studies will use the observational method for data collection more often than not, it is easier to collect consistent data at a personal level. This consistency allows for differences to be excluded on a personal level, making it easier to exclude variations that could affect data outcomes in other research methods.

4. It makes it possible to identify developmental trends. Whether in medicine, psychology, or sociology, the long-term design of a longitudinal study makes it possible to find trends and relationships within the data collected. It isn’t just the span of a human life that can be tracked with this type of research. Multiple generations can have real-time data collected and analyzed to find trends. Observational changes can also be made from past data so it can be applied to future outcomes.

5. Data collection accuracy is almost always high. Because data is collected in real-time using observational data, the collection process is almost always accurate. Humans are fallible beings, so mistakes are always possible, but the structure of this research format limits those mistakes. That data can also be used to implement necessary changes that a course of action may need to take so the best possible outcome can be identified.

6. Longitudinal studies can be designed for flexibility. Although a longitudinal study may be created to study one specific data point, the collected data may show unanticipated patterns or relationships that may be meaningful. Because this is a long-term study, there is a flexibility available to researchers that is not available in other research formats. Additional data points can be collected to study the unanticipated findings, allowing for shifts in focus to occur whenever something interesting is found.

What Are the Disadvantages of Longitudinal Studies?

1. There is a factor of unpredictability always present. Because longitudinal studies involve the same subjects over a long period, what happens to them outside of the data collection moments can influence future data being collected. Some people may choose to stop participating in the research. Others may no longer find themselves in the correct demographics for the research. If these factors are not included in the initial design of the research, then it could invalidate the findings that are produced.

2. It takes time. Researchers involved with longitudinal studies may never see the full outcome of their work. It may take several years before the data begins producing observable patterns or relationships that can be tracked. That means the ability to maintain open lines of communication with all researchers is vitally important to the eventual success of the study.

3. The data gathered by longitudinal studies is not always accurate or reliable. It only takes one piece of unreliable or inaccurate data to possibly invalidate the findings that the longitudinal studies produce. Because humans have their own personal bias toward certain subjects, the researcher processing the data may unconsciously alter the data to produce intended results.

4. It relies on the skills of the researchers to be complete. Because data collection occurs in real-time and relies heavily on the skills of the researchers who are tasked with this job, the quality of the data is heavily reliant on those skills. Two different researchers with varying skill levels can produce very different data points from the same subject material. Personal views of the data being collected can also impact the results on both ends, from the subject or the collector.

5. Large sample sizes are required to make the research meaningful. To develop relationships or patterns, a large amount of data must be collected and then mined to create results. That means a large sample size is required so the amount of data being collected can meet expectations. When the subjects being studied are people, it can be difficult to find enough people who are willing to honestly participate in the longitudinal studies.

6. There is a direct cost that is higher than other forms of research. Longitudinal research requires a larger sample size, which means there is a larger cost involved in contacting subjects to collect data. It is also a long-term form of research, which means the costs of the study will be extended for years, or decades, when other forms of research may be completed in a fraction of that time.

7. One person can change a long-term outcome. Because there is such a reliance on individual interpretations within longitudinal studies, it is possible for one person to inadvertently alter or invalidate the data being collected. It is entirely possible for decades of research to be invalidated because one subject or researcher was misleading.

The advantages and disadvantages of longitudinal studies show us that there is a tremendous value available in the ability to find long-term patterns and relationships. If the unpredictable factors of this research format can be planned for in advance and steps taken to remove bias, the data collected offers the potential to dramatically change the fields of medicine, psychology, or sociology.

ReviseSociology

A level sociology revision – education, families, research methods, crime and deviance and more!

Longitudinal Studies

Longitudinal studies are studies in which data is collected at specific intervals over a long period in order to measure changes over time. This post provides one example of a longitudinal study and explores some of the strengths and limitations of this research method.

With a longitudinal study you might start with an original sample of respondents in one particular year (say the year 2000) and then go back to them every year, every five years, or every ten years, aiming to collect data from the same people. One of the biggest problems with Longitudinal Studies is the attrition rate, or the subject dropout rate over time.

The Millennium Cohort Study

One recent example of a longitudinal study is the Millennium Cohort Study, which stretched from 2000 to 2011, with an initial sample of 19,000 children.

The study tracked children until the age of 11 and has provided insight into how differences in early socialisation affect child development in terms of health and educational outcomes.

The study also allowed researchers to make comparisons in rates of development between children of different sexes and from different economic backgrounds.

Led by the Centre for Longitudinal Studies at the Institute of Education, it was funded by the Economic and Social Research Council and government departments. The results below come from between 2006 and 2007, when the children were aged five.

Selected Findings

  • The survey found that children whose parents read to them every day at the age of three were more likely to flourish in their first year in primary school, getting more than two months ahead not just in language and literacy but also in maths
  • Children who were read to on a daily basis were 2.4 months ahead of those whose parents never read to them in maths, and 2.8 months ahead in communication, language and literacy.
  • Girls were consistently outperforming boys at the age of five: they were nine months ahead in creative development (activities like drama, singing and dancing) and 4.2 months ahead in literacy.
  • Children from lower-income families with parents who were less highly educated were less advanced in their development at age five. Living in social housing put them 3.2 months behind in maths and 3.5 months behind in literacy.

The strengths of longitudinal studies

  • They allow researchers to trace developments over time, rather than just taking a one-off ‘snapshot’ of one moment.
  • By making comparisons over time, they can identify causes. The Millennium Cohort Study, for example, suggests a clear correlation between poverty and low educational achievement early in life.

The limitations of longitudinal studies

  • Sample attrition – people dropping out of the study, and the people who remain in the study may not end up being representative of the starting sample.
  • People may start to act differently because they know they are part of the study
  • Because they take a long time, they are costly and time consuming.
  • Continuity over many years may be a problem – if a lead researcher retires, for example, her replacement might not have the same rapport with respondents.

FutureofWorking.com

11 Advantages and Disadvantages of Longitudinal Studies

Longitudinal studies are a type of research or survey that primarily uses the method of observation, which means that they do not involve interfering with the subjects in any way. These studies are also unique in that they follow a timeline that depends entirely on the respondents, which means that data collection could take years depending on the exact timetable put in place. Most of the time, they are used by psychologists who are looking to measure or identify the impact therapy can have over time, which involves long time frames and vast amounts of data.

Now, like any other research method, longitudinal studies come with certain disadvantages as well as obvious advantages. Here are important things to take note of when planning to use this methodology:

List of Advantages of Longitudinal Studies

1. They are effective in determining variable patterns over time. Because these studies involve collecting data over long periods of time, they can determine patterns efficiently. By using them, it is possible for researchers to learn more about cause-and-effect relationships and make connections in a clearer manner. Aside from this, remember that more data collected over longer periods allows for more precise results. These studies are considered highly valid for determining long-term changes and are unique in their ability to provide useful data about these individual changes.

2. They can ensure clear focus and validity. With a clear focus, longitudinal studies let us observe how a certain set of circumstances or end state comes to be. And while it is natural for people to forget past events, this problem is addressed by recording data in real time, which ensures a high level of validity.

3. They are very effective in doing research on developmental trends. As mentioned above, these studies are often used in psychology to conduct research on developmental trends across life spans. They are used in sociology to study life events throughout lifetimes or generations. This is so because, unlike cross-sectional studies where different individuals with similar characteristics are being compared, longitudinal studies would track the same people, which means that the differences observed in a group will be less likely to be the result of a change or difference in culture across generations.

4. They are more powerful than cross-sectional studies. Because they use the observation method without manipulating the state of the world, longitudinal studies have been argued to have less power to detect causal relationships than experiments. However, because they use repeated observations at the individual level, they have more power than cross-sectional studies when it comes to excluding time-invariant, unobserved individual differences and to observing the temporal order of events.

5. They are highly flexible. Longitudinal studies are often observed to allow flexibility to occur. This means that the focus they use can be shifted while researchers are collecting data.

6. They can provide high accuracy when observing changes. Because they are well suited to research on developmental trends, these studies can make observation of changes more accurate, which makes them the usual option in various fields. In medicine, for example, longitudinal studies are used to discover predictors or indicators of certain diseases, while in advertising they are used to determine the changes that a campaign has made in the behavior of consumers who belong to its target audience and have seen the advertisement.

List of Disadvantages of Longitudinal Studies

1. They require huge amounts of time. Time is definitely a huge disadvantage of any longitudinal study, as it typically takes a substantial amount of time to collect all the required data. It also takes equally long periods to gather results before patterns even start to emerge.

2. They risk gathering data that is not 100% reliable. While data is collected at multiple points in this method, the observation periods are fixed in advance and cannot account for what happens to subjects between those points. Aside from this, respondents may unknowingly change their qualitative responses over time to better suit what they see as the objective of the observer. More generally, the process involved in longitudinal studies can change how respondents and subjects answer the questions being asked.

3. They risk panel attrition. One of the biggest disadvantages of conducting longitudinal studies is panel attrition. If researchers rely on the same group of subjects for research that takes place at certain points in time over several years, there is the possibility that some of the subjects will no longer be able to participate for various reasons, such as changes in contact details, refusal, incapacity, and even death, which cuts down the usable data from which conclusions can be drawn.

4. They require a large sample size. Another disadvantage that makes longitudinal studies less than ideal for some research is that they typically require large sample sizes. You must have a large number of cooperating subjects for your research, or it will not be viable or valid.

5. They can be more expensive compared with cross-sectional studies. Cross-sectional studies are known to be more affordable than longitudinal studies and are much quicker to reach an observational conclusion, as they use fewer touch points. Because cross-sectional studies use a carefully chosen sample rather than repeated subsets, they can also do a better job of representing entire populations, and they are often more useful when a change in policy is being considered.

A lot of researchers encourage and welcome the use of longitudinal data sets, where they can apply for and access data via the relevant pathways set out by the groups that hold such information. However, longitudinal studies also have some limitations. Based on the advantages and disadvantages listed in this article, do you think this method is more helpful to society than not?

Research Article

Open-source data pipeline for street-view images: A case study on community mobility during COVID-19 pandemic

Matthew Martell, Nick Terry, Ribhu Sengupta, Chris Salazar, Nicole A. Errett, Scott B. Miles, Joseph Wartman, Youngjun Choe

  • Published: May 10, 2024
  • https://doi.org/10.1371/journal.pone.0303180

Street View Images (SVI) are a common source of valuable data for researchers. Researchers have used SVI data to estimate pedestrian volumes, conduct demographic surveillance, and better understand built and natural environments in cityscapes. However, the most common source of publicly available SVI data is Google Street View. Google Street View images are collected infrequently, making temporal analysis challenging, especially in low population density areas. Our main contribution is the development of an open-source data pipeline for processing 360-degree video recorded from a car-mounted camera. The video data is used to generate SVIs, which then can be used as an input for longitudinal analysis. We demonstrate the use of the pipeline by collecting an SVI dataset over a 38-month longitudinal survey of Seattle, WA, USA during the COVID-19 pandemic. The output of our pipeline is validated through statistical analyses of pedestrian traffic in the images. We confirm known results in the literature and provide new insights into outdoor pedestrian traffic patterns. This study demonstrates the feasibility and value of collecting and using SVI for research purposes beyond what is possible with currently available SVI data. Our methods and dataset represent a first-of-its-kind longitudinal collection and application of SVI data for research purposes. Limitations and future improvements to the data pipeline and case study are also discussed.

Citation: Martell M, Terry N, Sengupta R, Salazar C, Errett NA, Miles SB, et al. (2024) Open-source data pipeline for street-view images: A case study on community mobility during COVID-19 pandemic. PLoS ONE 19(5): e0303180. https://doi.org/10.1371/journal.pone.0303180

Editor: Ahmed Mancy Mosa, Al Mansour University College-Baghdad-Iraq, IRAQ

Received: January 26, 2024; Accepted: April 20, 2024; Published: May 10, 2024

Copyright: © 2024 Martell et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All images collected throughout this longitudinal study are available on mapillary.com under the username ‘uwrapid’. Full instructions and code on how to reproduce the data pipeline described in this paper are available at https://github.com/marte292/rapid-data-pipeline . The processed output necessary to reproduce the regression analyses in this paper is within the supporting files.

Funding: The U.S. National Science Foundation (Grant Number 2031119) provided financial support for this research. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF. Data was collected using instrumentation provided by NSF as part of the RAPID Facility, a component of the Natural Hazards Engineering Research Infrastructure, under Award No. CMMI: 2130997. There was no additional external funding received for this study.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Street-level imagery is becoming an increasingly popular form of data for research [ 1 ]. Between 2009 and 2020, more than 200 publications utilized street-level imagery from corporate sources in urban research [ 1 ]. Of these sources, Google Street View’s Street View Images (SVI) were the most popular among academics [ 1 – 3 ]. Uses for SVI data include estimating demographics [ 4 ], evaluating the built environment [ 5 ], surveying plant species [ 6 ], and measuring pedestrian volume [ 7 ], among many other applications [ 8 – 10 ].

While SVI data can provide many useful insights for researchers, it is not without its flaws. For corporate-collected images, such as Google Street View or Tencent Street View, the availability of images depends on where the companies decide to collect data, while the accessibility of these images hinges on the companies’ data provision policies. For example, there is no Google Street View service in most parts of Africa. An alternative to corporate-collected images is crowdsourced SVI databases such as Mapillary [ 11 ]. These crowdsourced images may sometimes have better coverage or temporal resolution than Google Street View, at the cost of varying image quality, field of view, and positional accuracy [ 3 , 12 ]. Perhaps the largest challenge with SVI data is its temporal instability. Updates to these image datasets at specific locations are infrequent, especially in rural areas [ 1 , 13 , 14 ]. Additionally, images frequently are not collected at a consistent time of day or season, even within the same city. These issues make existing SVI data unreliable for temporal studies.

Typically, temporal studies involving image data use images (or video) from fixed locations. This data is used to do things such as evaluate disaster recovery [ 15 ], monitor ecological change [ 16 ], or measure urban flooding [ 17 ]. Data from fixed cameras is also used to count people [ 18 ]. The challenge with these methods is that they are limited to fixed locations. In order to collect spatially distributed image data with these methods, a large team is frequently required to traverse areas on foot. This challenge, along with existing SVI data’s temporal issues, demonstrates the potential value of collecting longitudinal SVI data.

Our main contribution is demonstrating the feasibility of collecting longitudinal SVI data. We demonstrate this through the creation of a complete data pipeline for conducting pedestrian counts using car-based street-level imagery. The pipeline accepts raw video collected by the camera as an input and outputs a record of each pedestrian detection and their locations (latitude and longitude). This approach allows for analysis of mobility patterns with high spatial resolution and a short lag time. It alleviates the quality and field of view inconsistencies that come with crowdsourcing SVI data [ 3 , 12 ], generates data that is not corporately owned, eliminates the temporal instability challenge of both kinds of data [ 1 , 13 , 14 ], while still maintaining the advantages of SVI data over fixed-location methods [ 15 ].

Specifically, we use this pipeline to generate and analyze video from 37 video-collection runs in the city of Seattle, Washington, USA from May 2020 through July 2023. The video data was converted into over 4 million high-resolution images, with each data-collection run representing about 1.5 TB of image data. We used the images to create a record containing the location of each detected pedestrian, cross-referenced to the relevant GEOID [ 19 ]. To detect pedestrians in the still images, our pipeline leverages the state-of-the-art convolutional neural network, Pedestron [ 20 ]. We used the cascade_hrnet architecture benchmarked on the CrowdHuman data set [ 21 ]. Our methods and dataset represent a first of its kind longitudinal collection and application of SVI data for research purposes.

As a secondary contribution, we provide a case study based on the video data collected throughout the COVID-19 pandemic. We examine the effect of vaccine availability and local demographics on pedestrian detections, while accounting for weekly and yearly seasonality. Community mobility became a key metric during the height of the COVID-19 pandemic as government officials worked to halt the spread of the virus [ 22 , 23 ]. Two of the largest and most widely used data sets for community mobility during this time were the Google Community Mobility Reports [ 24 ] and Apple Mobility Trends Reports [ 25 ]. Researchers used this data to study the incidence of COVID-19 in the US [ 26 ] and the effectiveness of government lockdown policies [ 27 , 28 ], among other topics. Issues with these two data sets include mandatory opt-in, use of specific map applications, a lack of independent verification, and no long-term data availability guarantees [ 26 , 28 – 30 ]. Our findings demonstrate the utility of our data processing pipeline as an alternative for tracking community mobility over time and show the potential for its use in a variety of research domains.

Data collection

We collected our data as a part of the Seattle street-level imagery campaign, an ongoing series of video surveys for the purposes of documenting mobility throughout the COVID-19 pandemic. During each survey, a vehicle equipped with a 360° video camera is driven along a pre-defined route through Seattle while collecting video data and GPS metadata. The route incorporates broad neighborhood/area canvassing designed to collect data useful to multidisciplinary researchers as well as capital transects. Full details on the route design are available in Errett et al. [ 31 ]. The capital transects specifically target capitals (social, cultural, built, economic, and public health) which are theorized to be closely tied to community resilience [ 32 ]. Specific canvassing areas and capitals within Seattle were chosen to ensure a representative sample of the overall population of Seattle [ 31 ]. While the drivers try to make the surveys as consistent as possible, occasionally exogenous factors caused deviations from standard protocols. For example, during three of the surveys (05-29-2020, 06-18-2020, and 06-26-2020), protests over the murder of George Floyd caused parts of the survey route to be unnavigable.

After consulting with the University of Washington Human Subjects Division, it was determined that this study was not considered human subjects research and would not require IRB approval. The data we captured consists of people in public places, where they do not have an expectation of personal privacy. As an added precaution, all data for this study was published through Mapillary [ 11 ], which automatically obscures faces.

Data processing pipeline

After video collection, the raw data is segmented into image data. The images are subsampled from video frames so that they are collected about every 4 meters. The images are then uploaded into the DesignSafe-CI Data Depot [ 33 ]. From DesignSafe, the images are transferred to the TACC Frontera high-performance computing cluster [ 34 ]. We completed all file transfers between the two services using Globus [ 35 ]. Without access to these services, or similar ones, the storage and computing requirements for this project would be intractable.
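
As a rough illustration of the ~4-meter subsampling step, the sketch below selects frame indices from per-frame GPS metadata using great-circle distances. The function name and the (lat, lon) input format are hypothetical stand-ins for illustration, not the pipeline's actual interface.

```python
# Minimal sketch: pick video frames roughly every 4 meters of travel,
# using per-frame GPS positions. Field names are illustrative assumptions.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points, in meters."""
    r = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def select_frames(frame_gps, spacing_m=4.0):
    """Return indices of frames spaced about `spacing_m` meters apart.

    `frame_gps` is a list of (lat, lon) tuples, one per video frame.
    """
    keep = [0]
    last_lat, last_lon = frame_gps[0]
    travelled = 0.0
    for i, (lat, lon) in enumerate(frame_gps[1:], start=1):
        travelled += haversine_m(last_lat, last_lon, lat, lon)
        last_lat, last_lon = lat, lon
        if travelled >= spacing_m:
            keep.append(i)
            travelled = 0.0
    return keep
```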

On Frontera, orthorectification is performed on the images, then pedestrian detection is performed on the orthorectified images. The orthorectification transforms each single image in the equirectangular projection into two images in the rectilinear (gnomonic) projection [ 36 ]. Pedestrians are detected in each of the new images using a convolutional neural network (CNN) based on a pre-trained model from the Pedestron repository [ 20 ]. Our data represents a highly challenging detection task, as there is great variation in lighting, backgrounds, human poses, levels of occlusion, and crowd density from image to image and run to run. The Cascade Mask R-CNN architecture in the Pedestron repository performed well on the CrowdHuman data set, which represents a similar challenge to our data [ 21 ]. All testing and use of the CNN was performed using GPUs on the Frontera cluster. An example image after undergoing orthorectification and pedestrian detection is shown in Fig 1 .
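
The following is a minimal NumPy sketch of what an equirectangular-to-rectilinear (gnomonic) reprojection can look like, assuming a panorama array `pano` and a nearest-neighbor sampler; the field of view, interpolation, and orientation conventions used by the actual pipeline may differ.

```python
# Hedged sketch of a gnomonic (rectilinear) view extracted from an
# equirectangular panorama `pano` of shape (He, We, 3); `yaw_deg` picks
# the view direction. Illustration only, not the authors' exact code.
import numpy as np

def equirect_to_rectilinear(pano, fov_deg=90.0, yaw_deg=90.0, out_h=1024, out_w=1024):
    he, we = pano.shape[:2]
    f = (out_w / 2.0) / np.tan(np.radians(fov_deg) / 2.0)  # pinhole focal length in pixels

    # Ray direction for every output pixel in the virtual camera frame (z forward).
    u, v = np.meshgrid(np.arange(out_w) - out_w / 2.0,
                       np.arange(out_h) - out_h / 2.0)
    norm = np.sqrt(u ** 2 + v ** 2 + f ** 2)
    dx, dy, dz = u / norm, v / norm, f / norm

    # Rotate the camera about the vertical axis by `yaw_deg`, then convert to lon/lat.
    yaw = np.radians(yaw_deg)
    dx, dz = dx * np.cos(yaw) + dz * np.sin(yaw), -dx * np.sin(yaw) + dz * np.cos(yaw)
    lon = np.arctan2(dx, dz)                 # [-pi, pi]
    lat = np.arcsin(np.clip(dy, -1.0, 1.0))  # [-pi/2, pi/2]

    # Map lon/lat back to equirectangular pixel coordinates and sample (nearest neighbor).
    cols = ((lon / (2 * np.pi) + 0.5) * (we - 1)).round().astype(int) % we
    rows = ((lat / np.pi + 0.5) * (he - 1)).round().astype(int).clip(0, he - 1)
    return pano[rows, cols]
```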


Fig 1. The left image is an original 360° image from a data collection run. The image on the right is the right-hand side of the original image after orthorectification and pedestrian detection (both sides of the image are processed separately). There are two pedestrians that were detected by the algorithm (in red bounding boxes).

https://doi.org/10.1371/journal.pone.0303180.g001

Using one GPU node on Frontera, with four NVIDIA Quadro RTX 5000 GPUs, the entire process takes about 3 seconds per original 360° image. Given the 4 million images we collected, this amounts to about 3,300 hours of computing time. While this is not a small number, when running in parallel the whole process can be completed in a matter of days. In comparison, a human taking 10 s per orthorectified image to count all the pedestrians would take over 22,000 hours to complete the same task. File compression/decompression for file transfer also takes a substantial amount of time. Since we used DesignSafe as our main data storage platform, we had to transfer files to/from the Frontera supercomputer to perform our pedestrian detection. To avoid overloading the file transfer system, we compressed the images from each run into a tar file prior to transferring the files to Frontera. This file compression/decompression can take several hours per run, but can be performed in parallel with the detection algorithm since they are on different systems. After compression, file transfer using Globus [ 35 ] takes minutes.

In post-processing, the pipeline filters out low-confidence detections (defined as any detection with less than 80% confidence) and associates the remaining high-confidence detections to U.S. Census Bureau GEOIDs [ 19 ]. We arrived at this confidence level after tuning for the precision and recall of the CNN classifier. Specifically, the pipeline filters based on the output of the second-to-last layer of the CNN, known as a softmax layer. For a k-class classification problem, the softmax layer outputs a k-dimensional probability vector, where the i th entry of the vector gives the probability that the original input to the CNN belongs to class i.
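
A minimal sketch of this confidence filter is shown below. It assumes each detection record stores its pedestrian-class softmax probability under an illustrative "score" key; only the 0.80 threshold comes from the paper.

```python
# Sketch of the post-processing confidence filter; record layout is illustrative.
import numpy as np

def softmax(logits):
    """Convert a k-dimensional logit vector into class probabilities."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()               # numerical stability
    e = np.exp(z)
    return e / e.sum()

def filter_detections(detections, threshold=0.80):
    """Keep only detections whose pedestrian-class probability is >= threshold."""
    return [d for d in detections if d["score"] >= threshold]

# Example: a two-class (background vs. pedestrian) logit vector.
probs = softmax([0.3, 2.1])       # probs[1] is the pedestrian probability
kept = filter_detections([{"score": probs[1], "lat": 47.61, "lon": -122.33}])
```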

The final stage of post-processing is GEOID matching, where latitude and longitude metadata are cross-referenced to disjoint geographic regions (e.g. U.S. census tracts or block groups) and their respective GEOID codes. The cross-referencing code assumes the availability of shapefiles describing the geometry of the geographic regions. Aggregating the pedestrian detections according to U.S. Census Bureau GEOIDs [ 19 ] is necessary for analyses using sociodemographic data collected by the census. Additionally, the pedestrian detections can easily be cross-referenced with custom geometry defined using popular geographic information system software, such as the capitals data used in route construction and our analysis.
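
A hedged sketch of the GEOID-matching step using GeoPandas is shown below, which also sketches the tabular export described in the next paragraph. The shapefile path, column names, and CSV file names are assumptions for illustration; the repository's actual cross-referencing code may be organized differently.

```python
# Sketch: spatially join detection points to census-tract polygons, then export a CSV.
import geopandas as gpd
import pandas as pd

detections = pd.DataFrame(
    {"lat": [47.6097, 47.6205], "lon": [-122.3331, -122.3493], "survey_date": ["2020-05-29"] * 2}
)
points = gpd.GeoDataFrame(
    detections,
    geometry=gpd.points_from_xy(detections["lon"], detections["lat"]),
    crs="EPSG:4326",
)

# Assumed shapefile of disjoint regions with a GEOID column (e.g., census tracts).
tracts = gpd.read_file("tracts.shp").to_crs("EPSG:4326")

# Spatial join: each detection inherits the GEOID of the tract containing it.
matched = gpd.sjoin(points, tracts[["GEOID", "geometry"]], how="left", predicate="within")
matched.drop(columns=["geometry", "index_right"]).to_csv("detections_with_geoid.csv", index=False)
```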

Following the GEOID matching step, the pedestrian detections data is written to a tabular format file (e.g. comma separated values). This file is an “analysis-ready” data product, in the sense that it is readable by most popular statistical analysis software (R, SPSS, Stata, etc.) and can be easily merged with other datasets using the GEOID column(s). A visual depiction of the entire pipeline is seen in Fig 2 . Full code and a manual for following our process is available at https://github.com/marte292/rapid-data-pipeline .


Fig 2. The parts of the flowchart in gray occur on NHERI DesignSafe-CI, while the right-hand part in blue is done on the Frontera cluster.

https://doi.org/10.1371/journal.pone.0303180.g002

Case study: Community mobility in Seattle during the COVID-19 pandemic

Data processing.

All analysis is performed using the Python programming language version 3.11 [ 37 ]. The initial data product as outlined in the previous section is a list of detections, alongside the date of collection, geolocation, and GEOID. We also utilized a similar list of the images themselves with the same features. The last dataset we utilized is the median household income data and racial demographic data from the 2019 American Community Survey (ACS) 5-year estimates. We aggregated the detections and image data for each data collection survey at the census tract level, then matched each census tract’s total number of detections and images to its respective demographic and income data.

We utilized the data from 36 of the 37 surveys, omitting data from 10-29-2020. A heavy rain event caused the survey to be stopped early due to poor video quality. For each survey, we divided the number of detections in each census tract by the number of images collected in the tract to create a normalized ‘detections per image’ metric. This is a necessary step as the number of images in each tract may change survey to survey due to circumstances outside our control, such as construction or community events altering the route.
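
As a sketch of this normalization, the snippet below aggregates detections and images per survey date and census tract and computes the detections-per-image ratio; the placeholder rows and column names are illustrative stand-ins for the pipeline's actual output files.

```python
# Sketch of the 'detections per image' normalization at the census-tract level.
import pandas as pd

# Placeholder rows standing in for the per-detection and per-image records
# produced by the pipeline (real runs have millions of rows).
detections = pd.DataFrame({
    "survey_date": ["2020-05-29", "2020-05-29", "2020-06-18"],
    "GEOID": ["53033005600", "53033005600", "53033008100"],
})
images = pd.DataFrame({
    "survey_date": ["2020-05-29"] * 4 + ["2020-06-18"] * 2,
    "GEOID": ["53033005600"] * 4 + ["53033008100"] * 2,
})

det_counts = detections.groupby(["survey_date", "GEOID"]).size().rename("n_detections")
img_counts = images.groupby(["survey_date", "GEOID"]).size().rename("n_images")

rates = pd.concat([det_counts, img_counts], axis=1).fillna(0)
rates = rates[rates["n_images"] > 0]                      # only tracts with images
rates["detections_per_image"] = rates["n_detections"] / rates["n_images"]
```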

The last step in data processing was to transform some of our data into categorical variables. The date of each survey was coded both as a weekend or weekday and by season. The date was also coded as being either before or after the date that vaccines became publicly available. Income data was coded into one of five levels that were used during route design. These brackets were $48,274 and below, $48,275 to $80,819, $80,820 to $110,536, $110,537 to $153,500, and $153,501 and above. Lastly, the proportion of the census tract’s population that identifies as non-white was coded as an indicator variable, with ‘1’ corresponding to areas that are 55.5% white or more. We determined this threshold using Jenks natural breaks optimization. This left us with a dataset of 3,171 observations to be used for analysis. Each observation represented a census tract with a detections per image value, as well as values for each of the categorical variables defined above.
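
A sketch of this categorical coding is shown below. The income cut points and the 55.5% threshold are taken from the text; the vaccine-availability date used here and the column names (median_income and pct_white, assumed to be merged in from the ACS data) are illustrative assumptions.

```python
# Sketch of the categorical coding step on a tract-level data frame.
import pandas as pd

# Placeholder tract-level records; in the real analysis these come from the
# detections-per-image aggregation joined with 2019 ACS 5-year estimates.
df = pd.DataFrame({
    "survey_date": pd.to_datetime(["2020-05-29", "2021-06-12"]),
    "detections_per_image": [0.42, 0.55],
    "median_income": [95_000, 62_000],
    "pct_white": [61.0, 48.0],
})

df["weekend"] = (df["survey_date"].dt.dayofweek >= 5).astype(int)
season_by_month = {12: "winter", 1: "winter", 2: "winter", 3: "spring", 4: "spring",
                   5: "spring", 6: "summer", 7: "summer", 8: "summer",
                   9: "fall", 10: "fall", 11: "fall"}
df["season"] = df["survey_date"].dt.month.map(season_by_month)

# Assumed cutover date for public vaccine availability (illustrative only).
df["vaccine_available"] = (df["survey_date"] >= "2021-04-15").astype(int)

# Income brackets from the route design, applied to tract median household income.
bins = [0, 48_274, 80_819, 110_536, 153_500, float("inf")]
df["income_bracket"] = pd.cut(df["median_income"], bins=bins, labels=[1, 2, 3, 4, 5])

# Indicator for census tracts whose population is 55.5% White or more.
df["white_over_55_5"] = (df["pct_white"] >= 55.5).astype(int)
```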

Initial regression analysis.

For the initial regression analysis, we fit a linear regression model with detections per image as the response variable and the categorical variables defined above (season, vaccine availability, weekend, income bracket, and the indicator for a more than 55.5% White population) as predictors.
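
A hedged sketch of fitting such a model with the statsmodels package cited by the authors [ 38 ] is shown below. It assumes a tract-level data frame `df` of the 3,171 observations described above, with the illustrative column names from the coding sketch; the exact estimation settings used for Table 1 are not specified in the text.

```python
# Sketch of the Table 1-style regression using statsmodels' formula interface.
import statsmodels.formula.api as smf

formula = (
    "detections_per_image ~ C(season, Treatment(reference='fall')) "
    "+ vaccine_available + weekend + C(income_bracket) + white_over_55_5"
)
model = smf.ols(formula, data=df).fit()
print(model.summary())   # coefficient table analogous to Table 1
```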

In addition to the above analysis, we subset the data by only looking at detections that occurred in an image with at least one other detection. Then we calculated detections per image again, and fit the above model again with the new response variable. This same process was followed for detections with at least two, three, and four other detections in the same image. The goal of these analyses was to see if there were different trends for larger groups of people when compared with the entire data set.
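
The subsetting step can be sketched as follows, assuming each detection row carries an illustrative image_id column identifying its source image.

```python
# Sketch: keep only detections sharing an image with at least k other detections.
import pandas as pd

def detections_in_groups(det: pd.DataFrame, k: int) -> pd.DataFrame:
    """Detections appearing in images that contain at least k other detections."""
    per_image = det.groupby("image_id")["image_id"].transform("size")
    return det[per_image >= k + 1]

# Placeholder detection records; `image_id` identifies the source image.
det = pd.DataFrame({
    "image_id": ["a", "a", "a", "a", "a", "b"],
    "GEOID": ["53033005600"] * 6,
    "survey_date": ["2021-06-12"] * 6,
})
subset = detections_in_groups(det, k=4)   # keeps the five detections in image "a"
```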

Data pipeline

Our main contribution, the open-source data pipeline, is publicly available on https://github.com/marte292/rapid-data-pipeline . The repository contains a process manual with step-by-step instructions on how to implement the data pipeline in Python [ 37 ]. The required Python libraries and system requirements are provided. Additionally, we provide enough code for future researchers to implement the pipeline on their own systems, with their own file structure. The pipeline is capable of processing terabytes of image data and outputting an analysis-ready data product in a matter of days (using high-performance computing, such as a single GPU node on Frontera, an academic supercomputer) with minimal human input.

Using data from the Seattle street-level imagery campaign, we calculated the number of detections per image across all data collection surveys. Fig 3 shows the detections per image for each survey, as well as the detections per image for the subset of detections sharing an image with at least 4 others. Fig 3 also displays the timestamp of COVID-19 vaccines becoming publicly available in Washington state.


Fig 3. As the survey dates are irregular, all dates are included in the figure. Please note that the axis for total detections per image does not start at 0. This was done purposefully to facilitate comparison between the trends of the two graphs.

https://doi.org/10.1371/journal.pone.0303180.g003

Fig 3 depicts the trends over time for detections per image and detections sharing an image with at least 4 others. While both graphs exhibit similar trends overall, notably after vaccine rollout the graph of detections sharing an image with at least 4 others exceeds the graph of detections per image in all cases. The spike in detections seen in June 2020 is due to the large scale protests of police brutality that took place in Seattle in the aftermath of George Floyd’s murder.

The full results of the linear regression model for total detections per image are displayed in Table 1 . They show that the season being summer is the only significant seasonal effect. Additionally, the income bracket is a significant predictor, with wealthier areas seeing less pedestrian traffic. Finally, a census tract having a population greater than 55.5% white is a significant positive predictor. All other variables are not significant, including vaccine availability.


Table 1. The first three non-intercept terms represent indicator variables for the different seasons, with fall being the baseline. The Vaccine Available term represents a binary variable for whether the COVID-19 initial vaccination series was publicly available or not. Weekend is a binary variable for whether the data was collected on Saturday or Sunday. The four Income Bracket terms are indicator variables for the median income level of the census tract where the data was collected. The income brackets are defined in our methods. Lastly, the More than 55.5% White term is an indicator variable for if the census tract in question had a populace that is more than 55.5% White. Full documentation for the Python package used to make this output is available from the developers [ 38 ].

https://doi.org/10.1371/journal.pone.0303180.t001

For the regression models using a subset of data, the results are similar to the initial model. All models have the same significant predictors as the initial model. The model using the detections sharing an image with at least one other also had the weekend as a borderline significant, negative predictor. The models using detections sharing an image with at least 3 and 4 others had vaccine availability as a significant, positive predictor. The full results of the linear regression model for detections per image with at least 4 others are displayed in Table 2 , with all other regression models available in the supporting information.


Table 2. Coefficients are defined the same as in Table 1 .

https://doi.org/10.1371/journal.pone.0303180.t002

Comparison to Google Community Mobility data

Given the ability to measure community mobility through pedestrian counts, our pipeline has potential value for social science and public health research [ 22 , 23 ]. At an individual level, higher physical activity is known to predict better physical [ 39 , 40 ] and mental health [ 41 – 43 ], and is associated with higher self-reported satisfaction and quality of life [ 44 , 45 ]. In an aggregate sense, mobility is theorized to be an intermediate variable through which socioeconomic deprivation affects vulnerability to infectious disease [ 46 , 47 ], resilience to disasters [ 48 ], and exposure to environmental hazards [ 49 ]. In light of this body of literature, we argue that the use of pedestrian counts to assess mobility could be a differentiating factor in researching social and health inequity. Two extremely common sources of mobility data during the COVID-19 pandemic have been Google Community Mobility Reports [ 24 ] and Apple Mobility Trends Reports [ 25 ]. While there have been improvements in recent years [ 50 ], there are known representation and self-selection biases with existing mobility data captured by smartphones and other internet-based data collection methods [ 51 – 55 ].

Given the large number of publications using smartphone data as the foundation for their work, a natural question is how our data compares to smartphone mobility data. Comparison between our data set and the still publicly available Google Community Mobility Reports data can reveal some of the similarities and differences between the two data sets [ 24 ]. Google Community Mobility data is reported at the county level in the United States. Since Seattle is in King County, Washington, the King County data is what we use to draw the comparison.

Google Community Mobility data does not provide raw mobility numbers, but rather is reported as a percentage change from the five-week period of Jan 5–Feb 6, 2020. This data is collected from smartphones running the Android operating system with location history turned on, which is off by default. The data is baselined by day of the week, so data from a given Monday is compared to the median of the five Mondays in the baseline window to calculate a percent change. Additionally, it is unclear how exactly Google quantifies mobility. Google mentions that it combines the number of visitors to a location with the amount of time spent in that location, but no specifics beyond that are provided.
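
For reference, that percent-change convention can be reproduced roughly as follows; this is a reconstruction of the published description above (with the baseline dates as stated in the text), not Google's actual computation.

```python
# Sketch of a weekday-baselined percent change from a five-week baseline window.
import pandas as pd

def percent_change_from_baseline(series: pd.Series) -> pd.Series:
    """`series` is a daily mobility measure indexed by a DatetimeIndex."""
    baseline_window = series.loc["2020-01-05":"2020-02-06"]
    # Median value for each day of the week over the baseline window.
    baseline_by_weekday = baseline_window.groupby(baseline_window.index.dayofweek).median()
    baseline = series.index.dayofweek.map(baseline_by_weekday)
    return pd.Series((series.values - baseline.values) / baseline.values * 100.0,
                     index=series.index)
```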

Google mobility data is broken down into different categories. The category that most closely aligns with one of the categories used in our analysis is parks. Google’s data classifies parks as official national parks rather than the general outdoors, but it does not indicate how it accounts for city or state parks. Our own data for park locations is based on the City of Seattle’s official classifications.

Fig 4 shows a comparison of our detections per image data against Google Community Mobility data. Note that not all surveys are included because Google Community Mobility data stopped being provided on October 15, 2022. Overall, the trends between the two data sets are remarkably similar, lending further credibility to our data collection procedure. The more notable differences in the graph are from the months of November 2020 through August 2021, where the Google mobility data shows a larger drop followed by an increase in community mobility than was visible through our own data.


Fig 4. The Pearson correlation between the two data sets is 0.387. The Google Community mobility data is aggregated at King County, WA, while our data covers a survey route within Seattle, which belongs to King County. As the dates of surveys were irregular (e.g., due to weather conditions), all dates are included in the figure.

https://doi.org/10.1371/journal.pone.0303180.g004

One plausible explanation for this is the upwards sampling bias that occurs when using smartphone data [ 56 , 57 ]. Our data set captures anyone on the street, including individuals experiencing homelessness, who are less likely to have smartphones. This population was on the streets throughout the entirety of the pandemic, so they were consistently captured by our data collection efforts. This consistent baseline pedestrian count could lead to a lesser response to vaccine rollout and winter weather in our own data in comparison with Google’s. Additionally, there is a known income gap in both vaccination rates and smartphone ownership [ 58 , 59 ]. This gap could drive the increase in the Google Mobility data during vaccine rollout.

Implications, limitations, and extensions

Our results show that it is possible for researchers to collect and analyze longitudinal SVI data. The presented methods can be used to collect and process SVI data from 8 hours’ worth of video in a matter of days. This time will only further decrease with faster data processing infrastructure and methods. These methods will allow novel longitudinal SVI data to be collected for research in a variety of application areas.

The results of the case study also bear further discussion. We demonstrated the expected effects of seasonal and weekly factors, such as weather and day of week, on pedestrian traffic. Additionally, we showed that pedestrian traffic is inversely proportional to income, a known result during the COVID-19 pandemic, as lower income households are constrained in their capacity to work from home or take time off of work [ 30 , 60 ]. Our results also showed that areas with a higher proportion of White residents had higher average pedestrian counts. This could be due to known trends, such as areas with larger non-white populations being more likely to stay home in response to government restrictions [ 61 ] and participate in other risk-reducing practices such as wearing a mask [ 62 ], or due to local trends, as racial mobility trends tend to vary between cities [ 63 ]. These findings are consistent across all of our models, both when looking at the entire data set and in the subsets examining pedestrians sharing an image. These results validate our method with respect to established literature, and provide a quantitative confirmation of results that had previously been found using cell phone data.

One new finding from our case study is that while overall pedestrian counts did not respond to vaccine availability, the subset of pedestrians who were in larger groups (4+ people in an image) did. Likely, the reason we did not see a response to the vaccine in the aggregate data is because our data only captures people who are outdoors. There is data that shows that outdoor pedestrian activity varied across cities, frequently increasing at recreation locations like trails, during the early days of the pandemic [ 64 , 65 ]. Given these increases at some locations, a return to ‘normal’ pedestrian traffic may not mean an increase, but rather a change in traffic patterns. Our data captures this by showing that there was a significant increase in larger groups of people after the vaccine became available. This implies that people were more willing to be near each other outdoors after they had been vaccinated.

While the data pipeline presented here does represent a method for generating a novel data product, there are implementation challenges worth further discussion. For data collection, in addition to the time required to drive the route limiting the places of interest the route could reach, there were also many tradeoffs that had to be made when designing the route itself [ 31 ]. Despite having our survey route carefully designed to assess a representative sample of the Seattle population, some bias in route design is unavoidable. Since the route design included data from the American Community Survey aggregated at the census tract level, there is an implicit assumption of spatial homogeneity of the population within each census tract. Such bias is a manifestation of the well-known modifiable areal unit problem [ 66 ]. Since the majority of the route was primarily based on locations of interest throughout the city, this concern is somewhat mitigated.

In terms of processing, the pre-trained model we used required a substantial amount of high-performance computing time, and at times the data product generated was so large as to be unwieldy. Given the challenge our data set represents, using a model designed to be generalizable is necessary to attain good detection results. As many state-of-the-art models perform substantially worse out of sample, we had to be careful to choose a model that was designed to perform well in this situation, at the cost of slower computing times [ 67 ]. Another unforeseen challenge was regular updates to the video camera’s software to process and segment the video data into images. Consistent image formatting was vital for the data processing pipeline to function, so regular quality checks are necessary to make sure the images are processed properly.

The data product created, pedestrian detections, has some limitations as well. First, our method only captures pedestrians who are outdoors and near enough to the street to be captured via camera. This means that our data set does not include people who are indoors at these locations of interest, or who are too far from the street to be seen by camera. While the changes over time in pedestrian traffic we observed are still meaningful, it is important to recognize they do not capture everything. Similarly, our data cannot be interpreted as the actual number of pedestrians on the street. There is overlap in the image data, even when subsampled at 4-meter intervals and cropped during orthorectification. The orthorectified images only represent about 25% of the originals. However, this natural cropping is not enough to avoid the image overlap, and further cropping would risk information loss. Pedestrians that appear in the foreground of one image may end up in the background of another. There are also several known instances of cyclists keeping relative pace with the street-view vehicle for several blocks, resulting in numerous detections. These issues are easy to circumvent in analysis by comparing the relative number of detections, although at the cost of interpretability.

Even with the above limitations, the data pipeline presented in this paper can be directly applied or adapted to be used in a number of contexts. Potential applications of longitudinal SVI data in assessing the built environment [ 14 ], broad urban research [ 1 , 3 , 68 ], and health research [ 8 ] have been well-documented, as the temporal instability of existing SVI data is discussed as a limitation in all of these fields. Beyond this, it is possible to estimate population demographics [ 4 ], and other neighborhood-level statistics [ 13 , 69 ] using SVI data. As our ability to quickly and accurately parse scenes using computer vision improves [ 70 ], potential application areas will only increase in number.

Another field where longitudinal SVI data could contribute substantially is disaster research. There is a substantial body of research dedicated to empirical methods for modeling various aspects of disaster recovery [ 71 ]. Our methods could be applied in this field to quantify recovery using pedestrian detections as a metric for community mobility, or another metric assessing the built environment as appropriate. Similar work has been done using repeat photography after Hurricane Katrina [ 15 ], but our methods represent a substantial increase in generated data, allowing for a wider range of analyses. Spatial video data collection for disaster reconnaissance has also been done [ 72 ], but involves manual assessment of the captured video. Our methods demonstrate that a fully-automated approach is possible, which would allow for more frequent data collection at a lower cost.

This article describes the creation of the first open-source SVI data pipeline for longitudinal analysis. Regression analysis based on the resulting longitudinal SVI data showed that pedestrian traffic patterns changed in response to the availability of the COVID-19 vaccine, thereby demonstrating the data pipeline’s usefulness in research and practice. In particular, we showed that there were statistically significant increases in groups of people in proximity to each other after the vaccine became publicly available. Our data also captured expected trends in pedestrian traffic based on annual seasonality and socioeconomic factors. Our results demonstrate the feasibility and value in collecting SVI data as part of a longitudinal study. Longitudinal SVI data is capable of providing valuable insights in a variety of fields of study. Future work includes applications of our methods in broader public health research, disaster research, and other fields of study that can benefit from longitudinal SVI data. Potential methodological directions include study-specific route design process improvements and newer pedestrian detection approaches, as further progress is made in this area.

Supporting information

S1 Dataset. Full dataset used for obtaining regression results presented in this paper.

https://doi.org/10.1371/journal.pone.0303180.s001

https://doi.org/10.1371/journal.pone.0303180.s002

Acknowledgments

The authors gratefully acknowledge DesignSafe and the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing the cyberinfrastructure that enabled the research results reported within this paper.

  • 11. Mapillary. Mapillary; 2013. [Cited 2020 July 10]. Available from: https://www.mapillary.com .
  • 19. Bureau USC. Understanding Geographic Identifiers (GEOIDs); 2021. [Cited 2021 October 20]. Available from: https://www.census.gov/programs-surveys/geography/guidance/geo-identifiers.html .
  • 20. Hasan I, Liao S, Li J, Akram SU, Shao L. Pedestrian Detection: Domain Generalization, CNNs, Transformers and Beyond. arXiv preprint arXiv:220103176. 2022;.
  • 21. Shao S, Zhao Z, Li B, Xiao T, Yu G, Zhang X, et al. CrowdHuman: A Benchmark for Detecting Human in a Crowd. arXiv preprint arXiv:180500123. 2018;.
  • 24. Google COVID-19 Community Mobility Reports; 2020. [Cited 2020 April 15] Available from: https://www.google.com/covid19/mobility/ .
  • 25. Apple COVID-19 Mobility Trends Reports; 2020. [Cited 2020 April 15] Available from: https://covid19.apple.com/mobility .
  • 31. Errett NA, Wartman J, Miles SB, Silver B, Martell M, Choe Y. Street View Data Collection Design for Disaster Reconnaissance. arXiv preprint arXiv:230806284. 2023;.
  • 34. Stanzione D, West J, Evans RT, Minyard T, Ghattas O, Panda DK. In: Frontera: The Evolution of Leadership Computing at the National Science Foundation. New York, NY, USA: Association for Computing Machinery; 2020. p. 106–111.
  • 35. Chard K, Foster I, Tuecke S. Globus: Research Data Management as Service and Platform. PEARC17. New York, NY, USA: Association for Computing Machinery; 2017.
  • 36. Yang W, Qian Y, Kamarainen JK, Cricri F, Fan L. Object Detection in Equirectangular Panorama. 24th International Conference on Pattern Recognition. IEEE; 2018. p. 2190–2195
  • 37. Van Rossum G, Drake FL. Python 3 Reference Manual. Scotts Valley, CA: CreateSpace; 2009.
  • 38. Seabold S, Perktold J. statsmodels: Econometric and statistical modeling with python. 9th Python in Science Conference. 2010.
  • 46. Ossimetha A, Ossimetha A, Kosar CM, Rahman M. Socioeconomic disparities in community mobility reduction and COVID-19 growth. In: Mayo Clinic Proceedings. vol. 96. Elsevier; 2021. p. 78–85.
  • 54. Milusheva S, Bjorkegren D, Viotti L. Assessing Bias in Smartphone Mobility Estimates in Low Income Countries. In: ACM SIGCAS Conference on Computing and Sustainable Societies; 2021. p. 364–378.
  • 58. Barry V, Dasgupta S, Weller DL, Kriss JL, Cadwell BL, Rose C, et al.. Patterns in COVID-19 Vaccination Coverage, by Social Vulnerability and Urbanicity—United States, December 14, 2020–May 1, 2021; 2021.
  • 59. Center PR. Mobile Fact Sheet; 2021. [Cited 2022 January 8]. Available from: https://www.pewresearch.org/internet/fact-sheet/mobile/ .
  • 67. Hasan I, Liao S, Li J, Akram SU, Shao L. Generalizable pedestrian detection: The elephant in the room. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2021. p. 11328–11337.

  • Open access
  • Published: 06 May 2024

A longitudinal analysis of soil salinity changes using remotely sensed imageries

  • Soraya Bandak 1 ,
  • Seyed Alireza Movahedi-Naeini 1 ,
  • Saeed Mehri 2 &
  • Aynaz Lotfata 3  

Scientific Reports volume 14, Article number: 10383 (2024)


  • Agroecology
  • Electronic properties and materials
  • Environmental impact
  • Scientific data

Soil salinization threatens agricultural productivity, leading to desertification and land degradation. Given the challenges of conducting labor-intensive and expensive field studies and laboratory analyses on a large scale, recent efforts have focused on leveraging remote sensing techniques to study soil salinity. This study assesses the importance of soil salinity indices derived from remotely sensed imagery. Indices derived from Landsat 8 (L8) and Sentinel 2 (S2) imagery are used in Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Decision Tree (DT), and Support Vector Regression (SVR) models to predict the electrical conductivity (EC) of 280 soil samples across 24,000 hectares in Northeast Iran. The results indicated that the DT is the best-performing method (RMSE = 12.25, MAE = 2.15, R 2  = 0.85 using L8 data and RMSE = 10.9, MAE = 2.12, and R 2  = 0.86 using S2 data). Also, the results showed that Multi-resolution Valley Bottom Flatness (MrVBF), moisture index, Topographic Wetness Index (TWI), and Topographic Position Indicator (TPI) are the most important salinity indices. Subsequently, a time series analysis indicated a reduction in salinity and sodium levels in regions with installed drainage networks, underscoring the effectiveness of the drainage system. These findings can assist decision-making about land use and conservation efforts, particularly in regions with high soil salinity.


Introduction

Soil salinization profoundly affects soil productivity, nutrient availability, and plant physiology and biochemistry, especially in arid and semi-arid regions. Intensive irrigation in these areas brings saline groundwater to the surface, leading to overflow zones where evaporite minerals precipitate 1 , 2 , 3 . The salinity adversely affects crop water and fertilizer uptake and soil fertility enhancement, impacting more than 20% of the world's irrigated land 4 .

Soil Electrical Conductivity (EC) is a key indicator of soil salinity measurements, strongly correlating with salinity levels 5 . Mapping EC can enhance understanding of soil genesis processes in arid and semi-arid soils and aid agricultural management. However, due to its complex spatial variation, soil EC estimation is more challenging than other soil properties (e.g., soil organic carbon), necessitating the development of more reliable methods.

Remotely sensed images, including Landsat 8 (L8) and Sentinel 2 (S2), are extensively used for salinity analysis, offering spectral diversity and cost-effectiveness in soil salinity mapping 6 , 7 , 8 , 9 , 10 . Within this context, Erkin et al. 11 used L8 for temporal soil salinity analysis in Kashgar. They found that due to the continuous increase in inland reclamation and insufficient drainage, salinized arable land has steadily increased, and the average salinity of the cropland reached higher than 5.1 g per kilogram of soil. Taghizadeh-Mehrjardi et al. 12 identified that the S2 satellite image is a suitable and cost-effective data source for soil salinity assessment due to its short revisit interval, multiple spectral bands, and high spatial resolution. Wu et al. 16 used Landsat 5 TM and ALOS L-band radar to create a soil salinity map in the Mussaib region of Central Mesopotamia, finding that the Random Forest (RF) algorithm outperformed Support Vector Regression (SVR). Wang et al. 13 compared L8 and S2 images in the Ebinur Lake wetland using the cubist model, finding that the S2 image is superior for salinity estimation. Wang et al. 14 have shown that the Cubist model with L8 image is superior for salinity estimation compared to using S2 with that model in Ebinur Lake Wetland National Nature Reserve, China. Wang et al. 15 demonstrate the effectiveness of S2 images in distinguishing between saline and non-saline areas using the RF algorithm. Their study also highlights the capability to monitor changes in soil salinity levels between dry and wet seasons using remotely sensed images by generating region-specific maps.

Ma et al. 16 digitally mapped salinity distribution in the Werigan-Kuqa oasis, analyzing the evolution characteristics and driving factors using a machine learning approach and field data. The eXtreme Gradient Boosting (XGBoost) model significantly enhances prediction accuracy and salinity mapping, illustrating spatial and temporal changes over 25 years. Ge et al. 17 introduced a hybrid machine learning framework utilizing S2 image and environmental determinants, achieving a notable improvement in soil salinity mapping accuracy by 5–8%. These studies show that the L8 satellite image is more effective for monitoring soil salinity than the S2 image 13 , 18 , 19 , 20 . Also, researchers have employed RF 21 , XGBoost 22 , Decision Tree (DT) 23 , and SVR 24 , 25 as predictive tools in their investigations, and these models consistently demonstrated robust and reliable performance in the context of soil salinity prediction.

This study's main aim is to analyze soil salinity variations in a drainage area using L8 and S2 imagery data alongside machine learning methods. Additionally, it compares salinity levels before and after installing a regional drainage network.

Materials and methods

Figure  1 provides a summary of the workflow in this study. The initial step involves preprocessing the imagery data for image segmentation, delineating areas for collecting soil samples in the field. The second step entails collecting field soil sample data to measure EC as an indicator of soil salinity. Following this, preprocessed S2 and L8 imagery data is utilized to extract indices, which are then used for feature selection using machine learning algorithms.

Figure 1. Workflow of the study.

The measured EC of soil samples is used as a dependent variable for predicting soil salinity, while the independent variables are derived from S2 and L8 imagery data. Indices are extracted at the level of object segments rather than at the pixel level; an object-based feature extraction strategy simplifies analysis, reduces noise, and enhances accuracy 26 , 27 , 28 . Also, this study uses hyperparameter tuning in the regression models to optimize model performance and identify the most suitable parameters. Subsequently, sample migration is performed based on the selected most informative features to conduct a time-series analysis using the Continuous Change Detection and Classification (CCDC) algorithm 29 . Finally, the paper creates a time series of EC maps to analyze temporal variations of soil salinity. These maps give a better understanding of the EC patterns and trends over time.

The study was conducted in the Gonbad region of Golestan Province, situated in the northern part of Iran, within the geographical coordinates ranging from 55° 10′ to 55° 22′ East longitude and 37° 15′ to 37° 25′ North latitude (Fig.  2 a). This region experiences a temperate climate and features predominantly flat terrain. Over the past two decades, it has received an average annual rainfall of 455 mm, with a mean annual temperature of 17 °C 30 . Notably, the yearly minimum temperature recorded in this period was 15.6 °C, while the maximum temperature reached 37.5 °C. Furthermore, based on the soil taxonomy system established by the United States, the soil type in this area is classified as a typical Haploxerept 31 .

Figure 2. Location of Golestan Province and the ground sampling site ( a ), overlaid on the National Geographic Style Map in Esri ArcGIS (version 10.8) 32 ; the Boostan and Golestan drainage areas and ground sampling locations ( b ), overlaid on the World Hillshade base map in Esri ArcGIS (version 10.8) 32 .

The study area, located on the right flank of the Gorganroud River, spans approximately 24,000 hectares of farmlands in the Golestan and Boostan drainage networks (Fig.  2 b), which are irrigated from the Golestan and Boostan dams, respectively. Soil salinity is a prevalent issue in the Gonbad Kavus area, attributed to high groundwater levels and subsequent water evaporation, leaving behind salts in the soil. Therefore, this study aimed to assess the potential of optical Earth observation imagery for predicting soil surface salinity in the eastern region of Gonbad City.

Image segmentation

We utilized S2 imagery for land segmentation, which was employed for field sampling and extraction of indices within each segment. The S2 multispectral imagery comprises 13 bands and has a spatial resolution of 10 m in the visible bands and 20 m in the remaining bands. The S2 dataset, processed by ESA 33 , is also accessed through the Google Earth Engine (GEE) 34 . Utilizing S2 imagery data, we grouped nearby pixels based on similar characteristics such as color and texture 39 . We employed the K-Nearest Neighbors (KNN) algorithm to group nearby pixels for image segmentation. We tried different grouping levels to find the best one, considering factors such as image clarity. We used various maps and data layers, such as contour maps and soil maps, to support this process. Then, we looked for soil samples in each segmented region (see Supplementary A for detailed information on the segmentation process).

Sampled soil data as the dependent variable

On July 1, 2020, during the summer season, a field survey and soil sample collection were carried out to ensure compatibility with imagery from the S2 and L8 satellites. Global Positioning System (GPS) receivers were employed to pinpoint the sampling locations. Each sampling point was established as a circle with a 5-m radius, from which eight individual soil samples were extracted from the 0–10 cm depth and subsequently combined into a single composite sample. All soil samples were transported to the laboratory for further analysis to determine their moisture content and conductivity. First, 200 g of fresh soil samples were weighed and placed in a drying apparatus. Subsequently, 20.00 g from each naturally air-dried soil sample were precisely measured to prepare a saturated paste extract. The EC was then calculated using the saturation extract method.

S2 and L8 imagery data were used to extract indices as independent variables

The independent variables used in this study are sourced from L8 and S2 imagery data. The L8 is a multispectral satellite with a spatial resolution of 30 m. It is provided by the United States Geological Survey (USGS) 35 and accessed through GEE 35 . To select the most suitable images, we filtered image collections available between April and July 2020 to exclude images having more than 20 percent cloud cover. The filtering process allows the creation of a composite image with no cloud cover.
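
A minimal Earth Engine (Python API) sketch of this filtering and compositing step is shown below; the collection ID and the use of a median composite are assumptions, since the text does not specify either.

```python
# Sketch: cloud-filtered L8 composite for April-July 2020 in Google Earth Engine.
# Requires prior Earth Engine authentication (ee.Authenticate()).
import ee

ee.Initialize()

composite = (
    ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")     # assumed L8 surface-reflectance collection
    .filterDate("2020-04-01", "2020-08-01")          # April through July 2020
    .filter(ee.Filter.lt("CLOUD_COVER", 20))         # exclude scenes with >20% cloud cover
    .median()                                        # assumed compositing rule
)
```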

Twenty-six environmental variables (including auxiliary variables) are used for soil salinity estimation. These variables are used in feature selection to identify the most important features in soil salinity estimation (see Supplementary A for detailed information on feature selection). Table 1 provides a detailed overview of these variables (indices), including their formulas.

Machine learning regression analysis

Regression analysis is a commonly used statistical tool to study relationships between factors, making it straightforward to analyze multifactor data 54 . This study employs the SVR 44 , 55 , 56 , 57 , 58 , 59 , RF 22 , 58 , 60 , 61 , 62 , DT 63 , 64 , and XGBoost 16 , 65 , 66 , 67 , 68 methods to investigate the association between remotely sensed imagery data and field soil salinity. Scikit-learn for Python (version 1.3.0) is used to implement these algorithms 69 .
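
A minimal sketch of fitting the four regression models with scikit-learn and the xgboost package is shown below. X and y are placeholders for the per-segment indices and the measured EC values, and the hyperparameters shown are library defaults rather than the tuned values reported in the supplementary material.

```python
# Sketch: fit DT, RF, SVR, and XGBoost regressors on placeholder data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(280, 10))   # placeholder: 280 samples x 10 extracted indices
y = rng.normal(size=280)         # placeholder: measured EC values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "DT": DecisionTreeRegressor(random_state=0),
    "RF": RandomForestRegressor(n_estimators=500, random_state=0),
    "SVR": SVR(kernel="rbf"),
    "XGBoost": XGBRegressor(n_estimators=500, random_state=0),
}
predictions = {name: model.fit(X_train, y_train).predict(X_test)
               for name, model in models.items()}
```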

Regression and hyperparameter tuning for feature selection

Hyperparameter tuning is essential for maximizing the performance of machine learning algorithms 70 , 71 . Algorithms such as DT, RF, XGBoost, and SVR involve various types of hyperparameters, and the fine-tuning of these parameters directly influences the algorithm's effectiveness 72 . Several methods can be utilized to optimize parameters and enhance the performance of models, such as the local search method 73 . This paper uses the BayesSearchCV and GridSearchCV methods to fine-tune the hyperparameters of the RF, XGBoost, DT, and SVR models using the scikit-optimize library (version 0.8.1) in Python 74 . BayesSearchCV is grounded in Bayes’ rule of conditional probability, using prior knowledge to calculate posterior probabilities. Detailed information on the hyperparameters is provided in supplementary B.
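
A hedged sketch of Bayesian hyperparameter search with scikit-optimize's BayesSearchCV, here applied to a random forest on the placeholder training data from the sketch above; the search space shown is illustrative, and the actual ranges are given in Supplementary B.

```python
# Sketch: Bayesian hyperparameter search for a random forest regressor.
from skopt import BayesSearchCV
from skopt.space import Integer
from sklearn.ensemble import RandomForestRegressor

search = BayesSearchCV(
    estimator=RandomForestRegressor(random_state=0),
    search_spaces={
        "n_estimators": Integer(100, 1000),
        "max_depth": Integer(2, 30),
        "min_samples_leaf": Integer(1, 10),
    },
    n_iter=30,
    cv=5,
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
search.fit(X_train, y_train)
print(search.best_params_)
```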

A DT, one of the simplest yet most successful machine learning methods, uses a divide-and-conquer approach to classify and regress large databases 75 , 76 . This makes DT one of the common machine-learning methods 75 . A DT is a non-parametric supervised learning method with a tree-like structure 77 . It acts as a function taking input from a vector of attribute values and returning a “decision.” The decision is reached through a series of tests. Each internal node in the tree corresponds to an examination of the value of one of the input attributes, and the branches from that node are labeled with the attribute’s possible values. Each leaf node in the tree specifies a value to be returned by the function 76 . Both input and output values are discrete (classification) or continuous (regression). A regression tree has a linear function of some subset of numerical attributes at each leaf rather than a single value. The learning algorithm must decide when to stop splitting and begin applying linear regression over the attributes 54 , 76 . DTs are prone to overfitting, meaning they follow the peculiarities of the training dataset too closely and may not perform well on a new dataset, i.e., the test dataset. In such cases, the general predictive accuracy of overfitting DTs will be low, i.e., generalization accuracy 78 . One approach to improve the generalization accuracy is to construct multiple individual trees using a subset of the observations 78 , 79 , which is the main idea of the RF algorithm 80 . Detailed information on the DT hyperparameters are provided in Supplementary Table S1 .

The RF is an ensemble supervised learning algorithm using a collection of DTs for prediction. Individual trees are constructed by bootstrapping the dataset and averaging the results of all the trees to make the final prediction. Bootstrap aggregating or bagging helps reduce overfitting 78 . The RF can be used to classify categorical target variables and the regression of continuous variables 81 . For regression purposes, at each branching of the regression tree, the mean of the samples on the leaf nodes and the Root Mean Square Error (RMSE) formed between each sample are calculated. Following the minimum RMSE of leaf nodes as a branching condition, the regression tree stops when no more features are available, or the overall RMSE is optimal 77 . The key to creating an accurate model is ensuring that the base learner, typically a regression tree, is as uncorrelated as possible to produce a robust generalization ability 80 . Detailed RF hyperparameter information is provided in Supplementary Fig. S2 and Supplementary Table S2 .

The Support Vector (SV) algorithm, a set of related supervised learning methods, was first introduced for pattern recognition 82 and then generalized to solve regression problems 83 . The SVR is a tool for overall and short-term forecasting or when real-time analysis is required 84 . Also, it works in an infinite-dimensional space, giving it an edge over similar networks 84 . The SVR investigates the relationship between one or more predictor variables and a real-valued (continuous) dependent variable 85 . SVR finds a best-fitting hyperplane to data points in a continuous space, while in linear regression, a line is fitted to the data points. In SVR, the best-fitting hyperplane passes through as many sample points as possible within a certain distance, called a margin, which is defined to be the smallest distance between the decision boundary and any of the samples 86 . SVR is very sensitive to the input data type, as it can produce incorrect results if the data spans a wide range. Therefore, data normalization is an essential step in using SVR 84 . Detailed information on the SVR hyperparameters are provided in Supplementary Fig. S3 .

The XGBoost algorithm was proposed by Chen and Guestrin 87 . A scalable implementation of XGBoost is robust and highly efficient 88 , 89 . It uses Classification and Regression Trees (CART), and the prediction is jointly decided by multiple related DTs 90 . In this structure, the input sample of the next DT is related to the training and prediction results of the previous DT. Like most machine learning algorithms, in XGBoost the objective is to minimize the sum of the loss function to control the accuracy and complexity of the model 90 . Detailed information on the XGBoost hyperparameters is provided in Supplementary Table S3 .

Sample migration with the CCDC algorithm

The fundamental concept behind CCDC involves fitting a simple harmonic model to a cloud-free time series and detecting changes when the difference between observed and predicted pixel values surpasses a predefined threshold for consecutive periods. Notably, Chen et al. 87 enhanced the CCDC algorithm by introducing a multi-harmonic model capable of fitting intricate phenological profiles in cultivated land. The paper used the CCDC method to select unchanged samples and generate additional field samples for each period during which field data were not collected in those years. These unchanged field samples serve as training and test data for all years. The process commenced with cloud masking of L8 images and the computation of two essential spectral indices: the NDWI and the NDSI (Table 1 ). Subsequently, these indices were leveraged with the CCDC model, as outlined in Eq. ( 1 ), to identify and isolate the unchanged field samples 91 .

The CCDC model fits each spectral index with a harmonic time series model of the form

ρ(i, x) = a_{0,i} + a_{1,i} cos(2πx / t) + b_{1,i} sin(2πx / t) + c_{1,i} x,  (1)

where i , x , and t represent the spectral index, the Julian date, and the number of days in a year (i.e., 365.25 days); a_{0,i} stands for the overall value of spectral index i of an L8 image; a_{1,i} and b_{1,i} specify the intra-year change; and c_{1,i} pertains to the interannual change. Model values and residuals for the new samples are estimated by comparing observed and modeled sample values, as outlined in Eq. ( 1 ). In this context, a threshold value of 20% was employed. If the residual exceeded this threshold, it was assumed that an interannual variation had occurred. Samples failing to meet this threshold were excluded from other time interval classifications. The remaining samples were considered to exhibit stable spectral responses and were assumed to remain unchanged throughout the study period.
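
A minimal NumPy sketch of fitting Eq. (1) to one pixel's spectral-index time series by ordinary least squares is shown below; the CCDC implementation used in the study includes additional logic (cloud masking, change detection, iterative refitting) that is not reproduced here.

```python
# Sketch: least-squares fit and prediction of the Eq. (1) harmonic model.
import numpy as np

def fit_harmonic(x, rho, days_per_year=365.25):
    """Least-squares estimates of (a0, a1, b1, c1) for one spectral index."""
    x = np.asarray(x, dtype=float)
    design = np.column_stack([
        np.ones_like(x),                           # a0: overall value
        np.cos(2 * np.pi * x / days_per_year),     # a1: intra-year (seasonal) cosine term
        np.sin(2 * np.pi * x / days_per_year),     # b1: intra-year (seasonal) sine term
        x,                                         # c1: interannual (trend) term
    ])
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(rho, dtype=float), rcond=None)
    return coeffs

def predict_harmonic(x, coeffs, days_per_year=365.25):
    """Model values used to compute residuals against the observed index."""
    a0, a1, b1, c1 = coeffs
    x = np.asarray(x, dtype=float)
    return (a0 + a1 * np.cos(2 * np.pi * x / days_per_year)
            + b1 * np.sin(2 * np.pi * x / days_per_year) + c1 * x)
```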

Before classification, these unchanged samples were randomly divided into a training group (70%) and a test group (30%). This division allowed the accuracy of the generated salinity maps to be assessed with both the training and test groups of each classification algorithm. The CCDC process is as follows: (1) identification of the essential features for salinity determination based on field data; (2) time-series analysis of these features to identify field points whose values do not change over time (280 samples); (3) time-series analysis of these features to generate artificial points whose values do not change over time (576 artificial samples); (4) producing maps from the points of steps (2) and (3) with the different regression methods, comparing their accuracy, and choosing the most accurate map.

Accuracy assessment

Three criteria, RMSE, R², and the mean absolute error (MAE), were used to assess the accuracy of the prediction models (see Supplementary D ).
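For reference, the three criteria can be computed as in the following sketch (assuming scikit-learn; the observed and predicted EC values here are synthetic):

```python
# Hypothetical computation of RMSE, MAE, and R^2 with scikit-learn.
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_true = np.array([4.2, 8.1, 12.5, 16.0, 20.3])   # synthetic observed EC
y_pred = np.array([4.8, 7.5, 13.1, 15.2, 21.0])   # synthetic predicted EC

rmse = mean_squared_error(y_true, y_pred) ** 0.5
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
print(f"RMSE={rmse:.2f}  MAE={mae:.2f}  R2={r2:.2f}")
```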

Soil salinity estimation using L8 imagery data

Feature extraction and segmentation of L8 data

According to Table 2, the "Random Forest-Backward Feature Elimination" (RFE-RF) method 92 was used to screen the 26 environmental variables (including several auxiliary variables) and retain the most informative predictors. The relative importance of the independent variables for estimating soil salinity using L8 imagery is shown in Supplementary Fig. S4 . It can be concluded that the parameters obtained from remote sensing are more important than the other soil-forming factors, i.e., geomorphology, topography, and EC, in the spatial estimation of soil salinity in the surface horizon.
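A hedged sketch of such a backward-elimination step, assuming scikit-learn's RFE wrapper around a random forest and synthetic covariates (the actual RFE-RF configuration is not given in this extract):

```python
# Hypothetical backward feature elimination with a random forest (loosely mirroring RFE-RF).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE

rng = np.random.default_rng(2)
X = rng.normal(size=(250, 26))                              # 26 synthetic environmental covariates
y = 2 * X[:, 0] - X[:, 3] + rng.normal(scale=0.1, size=250)

selector = RFE(
    estimator=RandomForestRegressor(n_estimators=200, random_state=2),
    n_features_to_select=8,      # keep the most informative covariates
    step=1,                      # drop one covariate per elimination round
)
selector.fit(X, y)
print("selected covariate indices:", np.flatnonzero(selector.support_))
```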

The MNDWI (0.35%) and NDSI (0.33%) were determined to be the most important predictors of soil salinity in the RF method (Supplementary Fig. S4 a). The NDWI is the third most important auxiliary variable, with a relative importance of 0.23%. Moreover, XGBoost showed better results in salinity estimation with L8 data. Therefore, the critical factors for L8 data, as shown in Supplementary Fig. S4 b, are B3, MNDWI, NDSI, NDWI, NDVI, EVI, MrBVF, TWI, TPI, and the S1 and S4 indices.

Furthermore, based on the results presented in Supplementary Fig. S4 c, the DT algorithm ranks the relative importance of the auxiliary variables, in descending order, as follows: MNDWI, NDSI, NDWI, TWI, SAVI, MrBVF, TPI, NDVI, EVI, and S3 (Table 1 ).

Accuracy assessments of salinity estimation using L8 data

The scatter diagrams of soil EC are shown in Fig. 3 . The highest R² value is obtained by the DT model, which therefore predicts salinity more accurately than the other models (RMSE = 12.25, MAE = 2.15, and R² = 0.85). RF performs almost as well as DT, while the XGBoost algorithm has the lowest coefficient of determination for soil salinity estimation (RMSE = 18.62, MAE = 2.87, and R² = 0.58).

Figure 3. Scatter plots of the SVR ( a ), RF ( b ), DT ( c ), and XGBoost ( d ) methods estimating soil salinity from the L8 image. Graphs were drawn with Matplotlib for Python (version 3.7.2) 93 .

Spatial estimation of soil salinity with L8 data

Suitable bands were determined to train the regression algorithms and create salinity level maps, and pixel data corresponding to sample points within each class were employed as training data. The regression was performed based on five salinity classes, and Fig.  4 displays the results obtained from various supervised regression algorithms.

Figure 4. Soil salinity maps created from L8 data in 2020 using the ( a ) XGBoost, ( b ) DT, ( c ) SVR, and ( d ) RF methods.
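A minimal sketch of how a fitted regressor is turned into a per-pixel salinity map, assuming a NumPy band stack and scikit-learn; the array shapes, sample locations, and model are placeholders rather than the study's actual data:

```python
# Hypothetical per-pixel prediction: reshape an (H, W, bands) stack to pixels, predict, reshape back.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
H, W, B = 60, 80, 6
stack = rng.normal(size=(H, W, B))                    # stand-in for the selected L8 bands/indices
pixels = stack.reshape(-1, B)

# Train on synthetic "sample pixels" so the sketch runs end to end.
idx = rng.choice(H * W, size=300, replace=False)
y_samples = pixels[idx] @ rng.normal(size=B)          # synthetic EC at sample locations

model = RandomForestRegressor(n_estimators=100, random_state=3).fit(pixels[idx], y_samples)
salinity_map = model.predict(pixels).reshape(H, W)    # per-pixel EC estimate
print(salinity_map.shape)
```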

Soil salinity estimation using S2 imagery data

Feature extraction and segmentation of S2 data

As indicated in Table 2, the results revealed that out of a total of 26 environmental variables (including several auxiliary variables), the RFE-RF method 92 identified the following eight variables as having the highest feature importance: Clay, Carbonate, NDVI, S3, S2, S1, Greens, and Brightness. These variables were derived from band assignments of the L8. Additionally, six parameters, Diffuse, MrVBF, Midslope, Standard, and SAGA, extracted from the DEM 94 , and two climatic parameters, MAP and MAT, were selected for analysis.

Supplementary Figure S5 shows that the importance of the environmental auxiliary variables differed among the algorithms applied to S2 data, with remote sensing variables again being the most important predictors of soil salinity. In Supplementary Fig. S5 , the dominant controllers are the humidity index, salinity indices, the modified soil index, MrVBF, and the climate variable MAT, together with NDVI, S3, and TWI (Table 1 ). The DT-based soil salinity model drew most strongly on these variables for prediction in the study area. This indicates that humidity, elevation, and vegetation are the most influential soil-forming factors here: topography and vegetation are among the key regional controls on soil characteristics, including soil salinity, and therefore shape its spatial distribution.

Accuracy assessments of salinity estimation using S2 data

As seen in Fig. 5 , the DT model had the highest R² value and was the best model for estimating EC; the best-fitted model, with the highest coefficient of determination and the lowest error, was selected. Among the machine learning estimators, SVR had the lowest accuracy and the DT method the highest.

Figure 5. Scatter plots of the SVR ( a ), RF ( b ), DT ( c ), and XGBoost ( d ) methods estimating soil salinity from the S2 data. Graphs were drawn with Matplotlib for Python (version 3.7.2) 93 .

Spatial estimation of soil salinity with S2 data

Figure  6 illustrates the classified salinity map of the study area, created using S2 data. It highlights the effectiveness of drainage in mitigating soil salinity. The RF algorithm exhibited higher accuracy than others due to its utilization of more trees and optimal bootstrap sampling techniques for auxiliary variables and observation points. The DT algorithm also performed well, while the SVR demonstrated improved accuracy over the XGBoost model.

Figure 6. Soil salinity maps created with S2 data in 2020 using the ( a ) RF, ( b ) XGBoost, ( c ) SVR, and ( d ) DT methods.

The map indicates a range of soil salinity values in the study area, with the highest recorded at 24 dS/m and the lowest at 4 dS/m. These areas primarily consist of agricultural fields with low slopes and are equipped with pipe drainage systems.

Temporal analysis of soil salinity using L8 imagery data

We conducted temporal analysis on soil salinity exclusively using L8 imagery data. Supplementary Figures S6 , S7 , and S8 show the soil salinity map generated using RF and L8 data between 2013 and 2020. They revealed a decreasing trend in salinity levels in the study area. The Golestan drainage area started in 2013 and ended in 2019, and the Bostan drainage area began in 2019 and was completed in 2023. This can be attributed to the extensive use of pipe drainage in agricultural regions and potential climate changes. Also, the average soil salinity time series ten years after installing the drainage network is shown in Supplementary Figs. S9 and S10 . In 2012, the predominant and least prevalent land classifications were those subject to high salinity restrictions (54%) and those free of salinity restrictions (2%), respectively. However, by 2023, the most prevalent land classification had shifted to areas without salinity limitations (40%), while those with very high salinity restrictions constituted 7% of the region's landscape.

Over the period from 2012 to 2023, the extent of land subject to high salinity restrictions experienced a significant decrease in drainage areas, nearing parity with the extent of land free of salinity restrictions. This transformation suggests a notable reduction in salinity levels within the drainage area and an overall increase in salinity across other non-drained regions.

Upon analyzing the histogram for 2012, it is apparent that salinity in agricultural regions primarily falls within the range of 8–18, with a peak around 13 (Supplementary Fig. S11 ). In the 2023 histogram shown in Supplementary Fig. S12 , the minimum and maximum values have decreased and are now confined to a range of 4–15, indicating an overall reduction in salinity in agricultural areas. Notably, the leftward shift of the histogram peak suggests that regions with higher initial salinity values within this range experienced more pronounced reductions, indicating a significant improvement in their salinity conditions.

A 7-year analysis of soil salinity spanning from 2013 to 2020 was conducted using Landsat 8 (L8) imagery, focusing on the differentiation of pixels between bare soil and plant cover. This analysis aimed to identify optimal indices and suitable timing for studying changes. A comparison of the soil salinity maps for 2013 and 2020, illustrated in Supplementary Fig. S13 , indicates a decreasing trend in salinity, particularly noticeable in agricultural areas with tile drainage systems. However, rangeland areas display a notable trend potentially influenced by climate change.

A slight increase is observed within the range of high salinity levels when comparing data from 2013 to 2020, consistent with global trends. Consequently, salinity has risen in non-agricultural regions but declined in agricultural areas. A distribution chart comparing values from 2013 has been generated for further investigation. The histograms and this distribution chart confirm that changes diminish within the 8–18 salinity range, displaying a negative regression slope. This indicates that areas with initially higher salinity levels experienced more significant reductions. However, in cases exceeding a salinity value of 18, a relative increase in salinity is observed. On average, this increase amounts to approximately five units of EC for values greater than 18 (Supplementary Fig. S14 ).

Figure  7 compares soil salinity and provides the slope (β) and significance level (P) values related to the salinity trend in drained areas. Based on these values, it can be concluded that soil salinity has decreased with drainage. Drained regions exhibit a significant decreasing trend at a 5% level, with the most substantial decrease observed in 2013. Based on Kendall's statistics, the analysis of soil salinity changes in the study area over the 11 years (2012–2023) indicates a consistent downward trend in salinity changes following drainage implementation. However, the situation is different for non-drained areas.

Figure 7. Trend diagrams of the Kendall and Pettitt mutation tests for phases 1, 2, 3, and 4 of the drained areas.
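As an illustration of this kind of trend analysis, the sketch below applies SciPy's Kendall's tau and a Theil-Sen slope to a synthetic yearly mean-EC series; it is not the exact Kendall/Pettitt procedure behind Fig. 7, and the Pettitt change-point test is omitted:

```python
# Hypothetical monotonic-trend check for a yearly mean-salinity series.
import numpy as np
from scipy.stats import kendalltau, theilslopes

years = np.arange(2012, 2024)
mean_ec = np.array([13.1, 12.8, 12.2, 11.9, 11.0, 10.4,
                    9.8, 9.1, 8.7, 8.2, 7.9, 7.5])      # synthetic dS/m values

tau, p_value = kendalltau(years, mean_ec)               # Kendall's rank correlation and p-value
slope, intercept, lo, hi = theilslopes(mean_ec, years)  # robust (Theil-Sen) slope estimate
print(f"tau={tau:.2f}, p={p_value:.4f}, slope={slope:.2f} dS/m per year")
```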

Estimation of soil salinity using S2 and L8 data

The study compares four machine learning methods, RF, DT, XGBoost, and SVR, for estimating soil salinity levels from remotely sensed L8 and S2 data. We found that DT and RF estimate salinity at a depth of 10 cm more accurately than XGBoost and SVR, indicating that RF and DT are better suited to modeling highly nonlinear relationships in these data. Our findings concur with those reported in 95 , 96 . Merembayev et al. 97 indicated that DT and RF have almost the same performance for soil salinity estimation, which aligns with our findings. Ding et al. 98 used a Landsat Enhanced Thematic Mapper Plus (ETM+) image and stated that DT is suitable for extracting saline soil information. Qi-sheng et al. 99 confirmed the effectiveness of DT for soil salinity extraction from ETM data, in line with our findings. Fu et al. 100 reported that a DT-based approach using Landsat TM images had superior performance in land salinization classification. In addition, Breiman 80 stated that aggregating multiple models minimizes the instability of a single-tree model, which leads to an improvement in consistency 101 . While our results demonstrate the superiority of DT over RF for indices derived from L8 imagery, Haq et al. 102 have argued that RF outperforms DT using L8 imagery data. It is worth noting that their soil samples were collected at a depth of 15 cm, whereas ours were collected at a depth of 10 cm. Nevertheless, our findings indicate a close alignment between the predicted and actual EC values when using the RF and DT models. This contrasts with several studies that found RF superior to DT for estimating soil salinity 103 , 104 , 105 .

Feature importance for estimation of soil salinity

Our study shows the advantage of ranking the relative importance of predictor variables (indices) using machine learning methods. Salcedo et al. 106 have stated that applying different spectral indices helps characterize soil salinity. Accordingly, our results suggest that MNDWI is an optimal index for assessing soil salinity using the DT and RF methods with both L8 and S2 imagery data. This is aligned with Xu 46 , Qi-sheng et al. 99 , and Fu et al. 100 , stating that MNDWI is suitable for salinization mapping. Also, Ding et al. 98 reported the usefulness of MNDWI combined with NDVI in soil salinity classification. Additionally, our results indicated that the NDSI is the optimal index for salinity estimation using S2 data. Aligning with our findings, Shrestha et al. 107 have found that the only significant predictor of observed soil salinity was NDSI.

According to our study, topography and vegetation indices are important indicators of soil salinity. This is aligned with the findings of Wang et al. 108 . They indicated that environmental factors contributed substantially to soil salinity estimation, including the digital elevation model (DEM) and Green Atmospherically Resistant Vegetation Index (GARI).

Additionally, our findings show that the salinity index reaches its maximum value, particularly near the soil surface in specific sections. Factors such as capillary currents, soil texture, groundwater, and moisture levels are pivotal in transporting solutes to surface layers, especially during dry seasons. Consistent with our findings, Taghizadeh-Mehrjardi et al. 55 identified climate parameters, particularly prolonged droughts, as the primary factors influencing soil salinity in lowland areas. They also highlighted saline parent materials, soil texture, and the lack of surface irrigation and drainage as contributing factors.

Moreover, our findings showed that in areas with saline soil, an increase in soil moisture leads to a decrease in reflectance within the visible and near-infrared regions, resulting in enhanced soil salinity estimation, as also reported by Cao et al. 109 .

Temporal and spatial changes in soil salinity

Analyzing soil salinity changes from 2012 to 2023 in the study area indicates that most of these lands did not transition from saline to non-saline over 11 years in the undrained areas. The extent of moderate and high salinity classes gradually decreased, while the area of lands with very high and severe salinity increased. Despite the similar influence of climate and weather changes on salinity increase, agricultural areas with drainage systems have experienced reduced salinity. Additionally, areas with improved soil amendments and established drainage systems exhibit more favorable salinity classifications than non-drained areas 110 . Moreover, regions with drainage systems in place report higher crop yields than non-drained areas; this is corroborated by the soil salinity chart in this range, which demonstrates a declining trend, affirming the effective functioning of drainage systems. Additionally, the high EC of drainage water, signifying a very high salinity class, indicates that drainage systems have efficiently removed salinity from the soil, and these changes have stabilized over time 110 . This is aligned with Gopalakrishnan et al. 111 , as they reported that poor drainage intensifies soil salinity, and salinization has severe implications for food production and security. Therefore, this indicates the importance of installing a proper drainage system aligned with Singh's findings 112 .

Since the soil salinity of the grazing lands around the outlet drainage evaporation pond has gradually increased over the 11 years following the implementation of the pipe drainage project, it is recommended to consider a comprehensive Integrated Water Resources Management (IWRM) plan. This plan would reuse agricultural drainage water in serial biological drainage, with each cycle devoted to cultivating salt-tolerant crops. The highly saline drainage water produced at the end of this process can be repurposed for non-agricultural activities such as aquaculture or salt production. Eventually, only a small volume of water should be directed into the outlet drain, or ideally no water should enter the outlet and it should instead be discharged directly into the sea. Mardanifar et al. 113 investigated the declining salinity trend in the Golestan and Boustan dam areas. Their findings regarding the trend of electrical conductivity (EC) changes in the sampled wells within the project area generally indicate a reduction during the operation of the drainage system. Factors such as the terrain slope, the positioning of drainage systems, and the known distribution of soil salinity are considered significant in this region.

Overall, proper drainage is an essential aspect of managing soil salinity. It helps to remove excess salts from the soil and prevent their accumulation. Drainage systems such as tile drains or subsurface drainage can be installed to improve water movement and reduce salt build-up in the root zone. Effective drainage helps to lower the water table, preventing the capillary rise of saline groundwater. It also promotes leaching, which flushes salts from the soil profile through water movement, maintaining a more favorable salt balance in the soil and reducing the risk of salinity problems. By implementing proper drainage measures, farmers can effectively manage soil salinity and create a better plant-growing environment, which can increase yields, improve water-use efficiency, and enhance soil fertility. In addition, proper drainage can help minimize the environmental impact of soil salinity: it reduces the risk of saltwater intrusion into freshwater sources, protecting water quality and preserving aquatic ecosystems. In short, drainage plays a crucial role in managing soil salinity by preventing salt accumulation, improving water movement, and promoting optimal plant growth and agricultural productivity.

Limitations

The assessment of soil salinity change trends through the utilization of remote sensing data has a few limitations. The principal constraints in this study are as follows: (1) Remote sensing data is typically collected based on a specific scale, with each data pixel potentially representing several meters. This spatial scale may result in losing fine details related to soil salinity changes at smaller scales. (2) Acquiring remote sensing data over time and for specific areas can be challenging. Some regions may have limited access to satellite imagery, which can restrict the comprehensive evaluation of soil salinity changes. (3) Cloud cover in satellite imagery can introduce interferences in the remote sensing process and degrade image quality. Accurate soil salinity monitoring requires high-quality, cloud-free images.

The study utilized remote sensing imagery data and soil salinity samples in machine learning algorithms to explore factors affecting salinity levels in salt-affected crop fields. The DT machine learning model effectively predicted total soil salinity in heavily vegetated croplands. The study highlighted that undrained areas exhibit greater sensitivity to salinity, likely influenced by climate patterns. Furthermore, a significant disparity in average salinity levels between undrained and drained land was observed, with lower EC in agricultural land due to salt leaching. To mitigate salinity, the study recommends the installation of drainage pipes at a depth of 2 m to reduce and stabilize soil electrical conductivity. The results suggest incorporating environmental variables in time series modeling to predict soil salinity, considering that climate and weather variations contribute to increased salinity.

Data availability

According to the first author's university regulations, the datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

AbdelRahman, M. A. An overview of land degradation, desertification and sustainable land management using GIS and remote sensing applications. Rendiconti Lincei. Scienze Fisiche e Naturali 34 (3), 767–808 (2023).


Metwaly, M. M., AbdelRahman, M. A. & Abdellatif, B. Heavy metals and micronutrients assessment in soil and groundwater using geospatial analyses under agricultural exploitation in dry areas. Acta Geophys. 71 (4), 1937–1965 (2023).

AbdelRahman, M. A. et al. Determining the extent of soil degradation processes using trend analyses at a regional multispectral scale. Land 12 (4), 855 (2023).


Hafez, E. M. et al. Minimizing hazard impacts of soil salinity and water stress on wheat plants by soil application of vermicompost and biochar. Physiol. Plant. 172 (2), 587–602 (2021).


Vermeulen, D. & Van Niekerk, A. Machine learning performance for predicting soil salinity using different combinations of geomorphometric covariates. Geoderma 299 , 1–12 (2017).

Gao, L. et al. Road extraction from high-resolution remote sensing imagery using refined deep residual convolutional neural network. Remote Sens. 11 (5), 552 (2019).

Yu, H. et al. Mapping soil salinity/sodicity by using Landsat OLI imagery and PLSR algorithm over semiarid West Jilin Province, China. Sensors 18 (4), 1048 (2018).


Wu, Z. et al. Remote sensing monitoring and driving force analysis of salinized soil in grassland mining area. Sustainability 14 (2), 741 (2022).

AbdelRahman, M. A. et al. Detecting and mapping salt-affected soil with arid integrated indices in feature space using multi-temporal Landsat imagery. Remote Sens. 14 (11), 2599 (2022).

Aboelsoud, H. M. et al. Quantitative estimation of saline-soil amelioration using remote-sensing indices in arid land for better management. Land 11 (7), 1041 (2022).

Erkin, N. et al. Method for predicting soil salinity concentrations in croplands based on machine learning and remote sensing techniques. J. Appl. Remote Sens. 13 (3), 034520–034520 (2019).

Taghizadeh-Mehrjardi, R., Nabiollahi, K. & Kerry, R. Digital mapping of soil organic carbon at multiple depths using different data mining techniques in Baneh region, Iran. Geoderma 266 , 98–110 (2016).


Wang, J. et al. Machine learning-based detection of soil salinity in an arid desert region, Northwest China: A comparison between Landsat-8 OLI and Sentinel-2 MSI. Sci. Total Environ. 707 , 136092 (2020).


Wang, F. et al. Multi-algorithm comparison for predicting soil salinity. Geoderma 365 , 114211 (2020).

Wang, J. et al. Capability of Sentinel-2 MSI data for monitoring and mapping of soil salinity in dry and wet seasons in the Ebinur Lake region, Xinjiang, China. Geoderma 353 , 172–187 (2019).

Ma, S. et al. Investigation of the spatial and temporal variation of soil salinity using Google Earth Engine: A case study at Werigan-Kuqa Oasis, West China. Sci. Rep. 13 (1), 2754 (2023).


Ge, X. et al. Updated soil salinity with fine spatial resolution and high accuracy: The synergy of Sentinel-2 MSI, environmental covariates and hybrid machine learning approaches. Catena 212 , 106054 (2022).


Gorji, T. et al. Soil salinity analysis of Urmia Lake Basin using Landsat-8 OLI and Sentinel-2A based spectral indices and electrical conductivity measurements. Ecol. Indic. 112 , 106173 (2020).

Bannari, A. et al. Sentinel-msi and landsat-oli data quality characterization for high temporal frequency monitoring of soil salinity dynamic in an arid landscape. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 13 , 2434–2450 (2020).

Davis, E., Wang, C. & Dow, K. Comparing Sentinel-2 MSI and Landsat 8 OLI in soil salinity detection: A case study of agricultural lands in coastal North Carolina. Int. J. Remote Sens. 40 (16), 6134–6153 (2019).

Nabiollahi, K. et al. Assessing agricultural salt-affected land using digital soil mapping and hybridized random forests. Geoderma 385 , 114858 (2021).

Xiao, C. et al. Prediction of soil salinity parameters using machine learning models in an arid region of northwest China. Comput. Electron. Agric. 204 , 107512 (2023).

He, B. et al. Spatiotemporal variation and future predictions of soil salinization in the Werigan-Kuqa River Delta Oasis of China. Sustainability 15 (18), 13996 (2023).

Haq, Y. U. et al. Spatial mapping of soil salinity using machine learning and remote sensing in Kot Addu, Pakistan. Sustainability 15 (17), 12943 (2023).

Rajath, E. et al. Soil Salinity Mapping Using Multisensor Data Employing Machine-Learning Technique: A Case Study from Indo-gangetic Plain. In Remote Sensing of Soils 439–453 (Elsevier, 2024).


Mehri, S., Hooshangi, N. & Ghaffari Razin, M. R. Providing a knowledge-based method for distinguishing crops and estimating a cultivation area (Case study: The Moghan Plain). Geograph. Eng. Territory. 7 (1), 191–208 (2023).


Awad, M. An unsupervised artificial neural network method for satellite image segmentation. Int. Arab J. Inf. Technol. 7 (2), 199–205 (2010).


Blaschke, T., Burnett, C. & Pekkarinen, A. Image segmentation methods for object-based analysis and classification. In Remote Sensing Image Analysis: Including the Spatial Domain 211–236 (Springer, 2004).

Zhu, Z. & Woodcock, C. E. Continuous change detection and classification of land cover using all available Landsat data. Remote Sens. Environ. 144 , 152–171 (2014).

Ghorbani, K., Zakerinia, M. & Hezarjaribi, A. The effect of climate change on water requirement of soybean in Gorgan. J. Agric. Meteorol. 1 (2), 60–72 (2014).

Roozitalab, M. H. et al. Major soils, properties, and classification. In The Soils of Iran (eds Roozitalab, M. H. et al. ) 93–147 (Springer International Publishing, 2018).

Esri. Working with basemap layers . (2018). https://desktop.arcgis.com/en/arcmap/latest/map/working-with-layers/working-with-basemap-layers.htm .

Copernicus Sentinel-2 (processed by ESA). MSI Level-2H/F Harmonized/Fused Reflectance Product. Collection 1. (European Space Agency, 2021).

Catalog, E. E. D. Harmonized Sentinel-2 MSI: MultiSpectral Instrument, Level-2A . (2020).

Catalog, E.E.D. USGS Landsat 8 Level 2, Collection 2, Tier 1 . (2022). https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2#terms-of-use .

Olaya, V. & Conrad, O. Geomorphometry in SAGA. Dev. Soil Sci. 33 , 293–308 (2009).

Wilson, J. P., Gallant, J. C. Terrain Analysis: Principles and Applications (Wiley, 2000).

Fick, S. E., Hijmans, R. WorldClim 2: New 1-km spatial resolution climate surfaces for global land areas. Int. J. Climatol . 1–14 (2017).

Khan, N. M. et al. Assessment of hydrosaline land degradation by using a simple approach of remote sensing indicators. Agric. Water Manag. 77 (1–3), 96–109 (2005).

Dehni, A. & Lounis, M. Remote sensing techniques for salt affected soil mapping: Application to the Oran region of Algeria. Proc. Eng. 33 , 188–198 (2012).

Huete, A. R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 25 (3), 295–309 (1988).

Rouse Jr, J. W. et al. Monitoring the vernal advancement and retrogradation (green wave effect) of natural vegetation. (1974).

Priya, M. et al. Monitoring vegetation dynamics using multi-temporal Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI) images of Tamil Nadu. J. Appl. Nat. Sci. 15 (3), 1170–1177 (2023).

Wu, W. et al. Mapping soil salinity changes using remote sensing in Central Iraq. Geoderma Reg. 2 , 21–31 (2014).

Gao, B.-C. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ. 58 (3), 257–266 (1996).

Xu, H. Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery. Int. J. Remote Sens. 27 (14), 3025–3033 (2006).

Gallant, J. C., Dowling, T. I. A multiresolution index of valley bottom flatness for mapping depositional areas. Water Resour. Res. 39 (12) (2003).

Böhner, J., Selige, T. Spatial prediction of soil attributes using terrain analysis and climate regionalization. In SAGA-Analyses and Modelling Applications (Goltze, 2006).

Allbed, A. & Kumar, L. Soil salinity mapping and monitoring in arid and semi-arid regions using remote sensing technology: A review. Adv. Remote Sens. 02 (04), 13 (2013).

Abbas, A. et al. Characterizing soil salinity in irrigated agriculture using a remote sensing approach. Phys. Chem. Earth Parts A/B/C 55 , 43–52 (2013).

Scudiero, E., Skaggs, T. H. & Corwin, D. L. Regional-scale soil salinity assessment using Landsat ETM+ canopy reflectance. Remote Sens. Environ. 169 , 335–343 (2015).

Jiang, Z. et al. Development of a two-band enhanced vegetation index without a blue band. Remote Sens. Environ. 112 (10), 3833–3845 (2008).

Sriwongsitanon, N. et al. Comparing the Normalized Difference Infrared Index (NDII) with root zone storage in a lumped conceptual model. Hydrol. Earth Syst. Sci. 20 (8), 3361–3377 (2016).

Chatterjee, S., Hadi, A. S. Regression Analysis by Example (Wiley, 2013).

Taghizadeh-Mehrjardi, R. et al. Improving the spatial prediction of soil salinity in arid regions using wavelet transformation and support vector regression models. Geoderma 383 , 114793 (2021).

Tang, Y., Wang, Z. & Zhang, T. Soil salinity estimation in Shule River Basin using support vector regression model. Land Degrad. Dev. 34 (13), 4094–4108 (2023).

Wu, W. et al. Soil salinity prediction and mapping by machine learning regression in Central Mesopotamia, Iraq. Land Degrad. Dev. 29 , 4005–4014 (2018).

Aksoy, S. et al. Assessing the performance of machine learning algorithms for soil salinity mapping in Google Earth Engine platform using Sentinel-2A and Landsat-8 OLI data. Adv. Space Res. 69 (2), 1072–1086 (2022).

Li, J. et al. Comparing machine learning algorithms for soil salinity mapping using topographic factors and sentinel-1/2 data: A case study in the yellow river delta of China. Remote Sens. 15 (9), 2332 (2023).

Yin, H. et al. Synergistic estimation of soil salinity based on Sentinel-1 image texture and Sentinel-2 salinity spectral indices. J. Appl. Rem. Sens. 17 (1), 018502 (2023).

Andrade Foronda, D. & Colinet, G. Prediction of soil salinity/sodicity and salt-affected soil classes from soluble salt ions using machine learning algorithms. Soil Syst. 7 (2), 47 (2023).

Chakraborty, S., Elahi, F. Comparison of Soil salinity prediction by Machine Learning algorithms in coastal areas of Bangladesh. Authorea Preprints (2023).

Shahare, Y. et al. A comprehensive analysis of machine learning-based assessment and prediction of soil enzyme activity. Agriculture 13 (7), 1323 (2023).

Liu, Q. et al. Estimation of soil moisture using multi-source remote sensing and machine learning algorithms in farming land of Northern China. Remote Sens. 15 (17), 4214 (2023).

Jiang, Z. et al. Simulating soil salinity dynamics, cotton yield and evapotranspiration under drip irrigation by ensemble machine learning. Front. Plant Sci. 14 (2023).

Abedi, F. et al. Salt dome related soil salinity in southern Iran: Prediction and mapping with averaging machine learning models. Land Degrad. Dev. 32 (3), 1540–1554 (2021).

Zhou, Y. et al. Global soil salinity prediction by open soil vis-NIR spectral library. Remote Sens. 14 (21), 5627 (2022).

Ma, G. et al. Digital mapping of soil salinization based on Sentinel-1 and Sentinel-2 data combined with machine learning algorithms. Reg. Sustain. 2 (2), 177–188 (2021).

Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12 , 2825–2830 (2011).

Probst, P., Wright, M. N. & Boulesteix, A.-L. Hyperparameters and tuning strategies for random forest. WIREs Data Min. Knowl. Discov. 9 (3), e1301 (2019).

Bardenet, R. et al. Collaborative hyperparameter tuning. In International Conference on Machine Learning (PMLR, 2013).

Wistuba, M., Schilling, N. & Schmidt-Thieme, L. Two-stage transfer surrogate model for automatic hyperparameter optimization. In European Conference on Machine Learning and Knowledge Discovery in Databases Vol. 9851 199–214 (Springer, 2016).

Hutter, F. Automated Configuration of Algorithms for Solving Hard Computational Problems (University of British Columbia, 2009).

Head, T. et al. Scikit-optimize/scikit-optimize: v0. 8.1. Zenodo (2020).

Myles, A. J. et al. An introduction to decision tree modeling. J. Chemometr. 18 (6), 275–285 (2004).

Russell, S. J. Artificial Intelligence a Modern Approach (Pearson Education, Inc, 2010).

Zhang, W. et al. Assessment of pile drivability using random forest regression and multivariate adaptive regression splines. Georisk. Assess. Manag. Risk Eng. Syst. Geohazards. 15 (1), 27–40 (2021).

Schonlau, M. & Zou, R. Y. The random forest algorithm for statistical learning. Stata J. 20 (1), 3–29 (2020).

Ho, T. K. Random decision forests. In Proceedings of 3rd International Conference on Document Analysis and Recognition (IEEE, 1995).

Breiman, L. Random forests. Mach. Learn. 45 , 5–32 (2001).

Jaiswal, J. K., Samikannu, R. Application of Random forest algorithm on feature subset selection and classification and regression. In 2017 World Congress on Computing and Communication Technologies (WCCCT) . (2017).

Boser, B. E., Guyon, I. M., Vapnik, V. N. A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory (1992).

Vapnik, V. The Nature of Statistical Learning Theory . (Springer Science & Business Media, 1999).

Kurani, A. et al. A comprehensive comparative study of Artificial Neural Network (ANN) and Support Vector Machines (SVM) on stock forecasting. Ann. Data Sci. 10 (1), 183–208 (2023).

Zhang, F., O'Donnell, L. J. Chapter 7—Support vector regression. In Machine Learning (eds. Mechelli, A., Vieira, S.) 123–140 (Academic Press, 2020).

Bishop, C. M., Nasrabadi, N. M. Pattern Recognition and Machine Learning , vol. 4. (Springer, 2006).

Chen, T., Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016).

Friedman, J., Hastie, T. & Tibshirani, R. Additive logistic regression: A statistical view of boosting (with discussion and a rejoinder by the authors). Ann. Stat. 28 (2), 337–407 (2000).

Friedman, J. H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 29 , 1189–1232 (2001).


Ma, M. et al. XGBoost-based method for flash flood risk assessment. J. Hydrol. 598 , 126382 (2021).

Peng, Y. et al. Automated glacier extraction using a Transformer based deep learning approach from multi-sensor remote sensing imagery. ISPRS J. Photogramm. Remote Sens. 202 , 303–313 (2023).

Zhou, J. Y. et al. Prediction of hepatic inflammation in chronic hepatitis B patients with a random forest-backward feature elimination algorithm. World J. Gastroenterol. 27 (21), 2910–2920 (2021).


Hunter, J. D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 9 (03), 90–95 (2007).

Khosravani, P. et al. Digital mapping to extrapolate the selected soil fertility attributes in calcareous soils of a semiarid region in Iran. J. Soils Sediment. 23 , 4032–4054 (2023).

Mzid, N. et al. Salinity properties retrieval from Sentinel-2 satellite data and machine learning algorithms. Agronomy 13 (3), 716 (2023).

Wu, W. et al. Soil salinity prediction and mapping by machine learning regression in Central Mesopotamia, Iraq. Land Degrad. Dev. 29 (11), 4005–4014 (2018).

Merembayev, T. et al. Soil salinity classification using machine learning algorithms and radar data in the case from the South of Kazakhstan. J. Ecol. Eng. 23 (10), 61–67 (2022).

Ding, J.-L., Wu, M.-C. & Tiyip, T. Study on Soil salinization information in arid region using remote sensing technique. Agric. Sci. China 10 (3), 404–411 (2011).

Qi-sheng, H., Chun-xiang, C. & Tiyip, T. Study on the extraction of saline soil information in arid area based on multiple source data. Remote Sens. Technol. Appl. 25 (2), 209–215 (2010).

Fu, H. et al. Land salinization classification method using Landsat TM images in Western Jilin Province of China. In SPIE Optical Engineering + Applications , vol. 9220 (SPIE, 2014).

Breiman, L. Bagging predictors. Mach. Learn. 24 , 123–140 (1996).

Haq, Y. U. et al. Identification of soil type in Pakistan using remote sensing and machine learning. PeerJ Comput. Sci. 8 , e1109 (2022).

Merembayev, T., Amirgaliyev, Y., Saurov, S. & Wójcik, W. Soil Salinity Classification Using Machine Learning Algorithms and Radar Data in the Case from the South of Kazakhstan. J. Ecol. Eng. 23 (10), 61–67 (2022).

Yahiaoui, I., Bradaï, A., Douaoui, A. & Abdennour, M. A. Performance of random forest and buffer analysis of Sentinel-2 data for modelling soil salinity in the Lower-Cheliff plain (Algeria). Int. J. Remot. Sens. 42 (1), 148–171 (2021).

Schulz, K., Hänsch, R. & Sörgel, U. Machine learning methods for remote sensing applications: an overview. In Proc. SPIE 10790, Earth Resources and Environmental Remote Sensing/GIS Applications IX, 1079002 . https://doi.org/10.1117/12.2503653

Salcedo, F. P. et al. Use of remote sensing to evaluate the effects of environmental factors on soil salinity in a semi-arid area. Sci. Total Environ. 815 , 152524 (2022).

Shrestha, R. P., Qasim, S. & Bachri, S. Investigating remote sensing properties for soil salinty mapping: A case study in Korat province of Thailand. Environ. Chall. 5 , 100290 (2021).

Wang, N. et al. Integrating remote sensing and landscape characteristics to estimate soil salinity using machine learning methods: A case study from Southern Xinjiang, China. Remote Sens. 12 (24), 4118 (2020).

Cao, X. et al. Multidimensional soil salinity data mining and evaluation from different satellites. Sci. Total Environ. 846 , 157416 (2022).

Ostad-Ali-Askari, K. & Shayan, M. Subsurface drain spacing in the unsteady conditions by HYDRUS-3D and artificial neural networks. Arab. J. Geosci. 14 , 1–14 (2021).

Gopalakrishnan, T. & Kumar, L. Linking long-term changes in soil salinity to paddy land abandonment in Jaffna Peninsula, Sri Lanka. Agriculture 11 (3), 211 (2021).

Singh, A. Soil salinization management for sustainable development: A review. J. Environ. Manag. 277 , 111383 (2021).

Mardanifar, M. et al. Evaluating the drainage process of agricultural lands in Golestan province based on agricultural drainage reuse. Nat. Ecosyst. Iran 13 (3), 1–13 (2022).


The authors receive no funding.

Author information

Authors and affiliations.

Department of Soil Sciences, Gorgan University of Agricultural Sciences and Natural Resources, Gorgan, Iran

Soraya Bandak & Seyed Alireza Movahedi-Naeini

Department of Geospatial Information Systems, Faculty of Geodesy and Geomatics Engineering, K. N. Toosi University of Technology, Tehran, Iran

Saeed Mehri

Department of Pathology, Microbiology, and Immunology, School Of Veterinary Medicine, University of California, Davis, USA

Aynaz Lotfata


Contributions

S.B.: Writing the original draft, sampling and investigation, methodology, analysis, and software. S.A.M.-N.: Writing-review & editing, project administration. S.M.: Visualizing, writing-review & editing. A.L.: Conceptualization, methodology, writing-review & editing, supervision, project administration.

Corresponding authors

Correspondence to Soraya Bandak or Saeed Mehri .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Bandak, S., Movahedi-Naeini, S.A., Mehri, S. et al. A longitudinal analysis of soil salinity changes using remotely sensed imageries. Sci Rep 14 , 10383 (2024). https://doi.org/10.1038/s41598-024-60033-6


Received : 23 November 2023

Accepted : 18 April 2024

Published : 06 May 2024

DOI : https://doi.org/10.1038/s41598-024-60033-6


  • Soil salinization
  • Remote sensing
  • Predictive modeling
  • Decision tree



  • Open access
  • Published: 10 May 2024

Obesity and lipid indices as predictors of depressive symptoms in middle-aged and elderly Chinese: insights from a nationwide cohort study

  • Xiaoyun Zhang 1 ,
  • Ying Wang 1 ,
  • Xue Yang 1 ,
  • Yuqing Li 1 ,
  • Jiaofeng Gui 1 ,
  • Yujin Mei 1 ,
  • Haiyang Liu 2 ,
  • Lei-lei Guo 3 ,
  • Jinlong Li 4 ,
  • Yunxiao Lei 5 ,
  • Xiaoping Li 6 ,
  • Liu Yang 7 ,
  • Ting Yuan 5 ,
  • Congzhi Wang 7 ,
  • Dongmei Zhang 8 ,
  • Jing Li 9 ,
  • Mingming Liu 9 ,
  • Ying Hua 10 &
  • Lin Zhang 7  

BMC Psychiatry volume 24, Article number: 351 (2024)


Depressive symptoms are one of the most common psychiatric disorders, with a high lifetime prevalence rate among middle-aged and elderly Chinese. Obesity may be one of the risk factors for depressive symptoms, but there is currently no consensus on this view. Therefore, we investigate the relationship and predictive ability of 13 obesity- and lipid-related indices with depressive symptoms among middle-aged and elderly Chinese.

The data were obtained from the China Health and Retirement Longitudinal Study (CHARLS). Our analysis includes individuals who did not have depressive symptoms at the baseline of the 2011 CHARLS wave and who were successfully followed up in 2013 and 2015. Finally, 3790 participants were included in the short-term analysis (from 2011 to 2013), and 3660 participants were included in the long-term analysis (from 2011 to 2015); their average ages were 58.47 years and 57.88 years, respectively. The anthropometric indicators used in this analysis included non-invasive indicators [e.g. waist circumference (WC), body mass index (BMI), and a body shape index (ABSI)] and invasive indicators [e.g. lipid accumulation product (LAP), the triglyceride glucose index (TyG index), and its related indices (e.g. TyG-BMI and TyG-WC)]. Receiver operating characteristic (ROC) analysis was used to examine the predictive ability of the various indicators for depressive symptoms, and the association of depressive symptoms with the various indicators was estimated using binary logistic regression.

The overall incidence of depressive symptoms was 20.79% in the short-term and 27.43% in the long-term. In males, WC [AUC = 0.452], LAP [AUC = 0.450], and TyG-WC [AUC = 0.451] were weak predictors of depressive symptoms during the short-term ( P  < 0.05). In females, BMI [AUC = 0.468], LAP [AUC = 0.468], and TyG index [AUC = 0.466] were weak predictors of depressive symptoms during the long-term ( P  < 0.05). However, ABSI cannot predict depressive symptoms in males and females during both periods ( P  > 0.05).

The research indicates that in the middle-aged and elderly Chinese, most obesity- and lipid-related indices have statistical significance in predicting depressive symptoms, but the accuracy of these indicators in prediction is relatively low and may not be practical predictors.


Introduction

Depressive symptoms, among the most common psychiatric disorders in middle-aged and elderly people worldwide, have prevalence rates of 22.1% in the USA, 34.8% in Japan, 34.6% in France, and 42.0% in China [ 1 ]. The Chinese population is aging, and it is expected that by 2050, the number of Chinese citizens aged 65 and above will reach 400 million [ 2 ]. The increased risk of depressive symptoms caused by aging is a serious problem for China [ 3 ]. A meta-analysis of 32 cross-sectional studies showed that the pooled prevalence of depressive symptoms among elderly people in China was 22.7%, with a higher prevalence rate among females (24.2%) than males (19.4%) and a higher prevalence rate in rural areas (29.2%) than in urban areas (20.5%) [ 4 ]. Depressive symptoms are reported to be one of the top ten causes of disability and a risk factor for a series of chronic diseases such as cardiovascular disease, diabetes, and obesity [ 5 ]. According to a population-based cohort study [ 6 ], participants with two or more depressive symptoms had 31% higher odds of having general obesity and 26% higher odds of having central obesity. Furthermore, depressive symptoms have been shown to be associated with a higher risk of ischemic heart disease and its subtypes [ 7 ]. They harm personal physical function and quality of life, which in turn increases the pressure on medical resources and socio-economic conditions [ 8 ].

Indeed, obesity is a common disease that may occur simultaneously with depressive symptoms [ 9 ]. According to statistics, the prevalence of overweight and obesity among Chinese adults may reach 65.3%, and the population may reach 78.995 million by 2030 [ 10 ]. As an important public health issue, research shows that obesity will increase the death probability of many diseases and lead to a series of chronic diseases (including cancer, type 2 diabetes, and dyslipidemia), which greatly affects public health and increases social and economic burden [ 11 , 12 , 13 , 14 ]. Body mass index (BMI) and waist circumference (WC) are the most commonly used indicators for measuring obesity. They have been used in many studies [ 15 , 16 , 17 ] to explore the association between obesity and some diseases (such as diabetes, metabolic syndrome, and depressive symptoms). However, BMI is only a surrogate measure of body fatness and does not distinguish body composition (muscle and fat accumulation) [ 18 ]. While waist circumference (WC) effectively reflects body size, fat percentage, and distribution, its strong correlation with BMI complicates the differentiation of their respective contributions as separate epidemiological risk factors [ 19 , 20 ]. Therefore, many new obesity- and lipid-related indicators, including waist-height ratio (WHtR), visceral adiposity index (VAI), a body shape index (ABSI), body roundness index (BRI), lipid accumulation product (LAP), conicity index (CI), Chinese visceral adiposity index (CVAI), and triglyceride glucose (TyG) index have been proposed to use in epidemiological research [ 21 , 22 , 23 ].

Most previous studies [ 17 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 ] explored the relationship between depressive symptoms and obesity, and some of them have found positive associations [ 17 , 24 , 25 , 31 ], but others have suggested negative associations [ 26 , 27 , 28 , 29 , 32 ], or no associations [ 30 ]. The reasons for this inconsistency may be differences in population characteristics (including age, race, and cultural differences) [ 33 , 34 ], confounding factors [ 35 ], and different indices and standards for measuring obesity [ 36 , 37 ]. For example, a cross-sectional study conducted based on the Mexican population aged 20 or above found that obesity measured by BMI was positively associated with depressive symptoms in Mexican women [ 31 ]. In contrast, a study report on 2604 Chinese people aged 55 and above found a negative correlation between obesity and depressive symptoms measured by BMI, supporting the “fat and jolly” hypothesis [ 32 ]. The hypothesis proposes that obesity is negatively correlated with depressive symptoms and leads to a reduction in depressive symptoms [ 27 ]. So far, these studies are not representative in predicting depressive symptoms among middle-aged and elderly people in China, as most of them only describe one indicator and do not compare it with other indicators.

It is essential to emphasize the value of surrogate obesity-related indicators as efficient, cost-effective tools for the rapid screening and preliminary identification of individuals at high risk for depressive symptoms within large populations [ 38 , 39 , 40 ]. Previous studies [ 16 , 41 , 42 ] have compared the predictive power of simple surrogate obesity-related indices (including BMI, WHtR, VAI, BRI, ABSI, LAP, and TyG index) for metabolic syndrome, and have found that LAP and TyG index have stronger predictive power than other indicators. However, few studies have comprehensively examined the association between obesity- and lipid-related indices with depressive symptoms in the Chinese population, as well as the predictive strength for depressive symptoms. Thus, the association between obesity (measured by different indices) and depressive symptoms in middle-aged and elderly Chinese has to be further researched.

The purpose of this study is to investigate the relationship between 13 obesity- and lipid-related indices and depressive symptoms based on 2-year and 4-year longitudinal data from a nationally representative sample of community-dwelling Chinese participants aged 45 years or elderly. In addition, we also compared the screening and predictive abilities of different indicators in short-term (after 2 years follow-up) and long-term follow-up periods (after 4 years follow-up), and analyzed them separately based on sex.

Materials and methods

Study design and participants.

The China Health and Retirement Longitudinal Study (CHARLS) is a nationally representative cohort study that began in 2011 (Wave 1), targeting middle-aged and elderly people aged 45 and above in China and their spouses [ 43 ]. Participants are followed every two years through a face-to-face computer-assisted personal interview (CAPI), with data collection carried out in 2013 (Wave 2) and 2015 (Wave 3). Data from CHARLS Waves 1, 2, and 3 were used in our study. We excluded individuals who met any of the following criteria at baseline: (1) missing data on depressive symptoms (7124 individuals excluded) or a Chinese-version Center for Epidemiologic Studies Depression Scale (CES-D) score ≥ 10 (7276 individuals excluded); (2) missing data on any of the 13 indicators (3392 individuals excluded); (3) missing data on age/sex/education levels/marital status/current residence/current smoking/alcohol consumption/taking activities/having regular exercise/chronic disease (1 individual excluded). In addition, we excluded participants without follow-up data (807 people lost to follow-up in 2013 and 937 people lost to follow-up in 2015). Finally, 3790 individuals who completed the baseline survey and the short-term (2011–2013) follow-up survey, and 3660 individuals who completed the baseline survey and the long-term (2011–2015) follow-up survey, were enrolled in our research.

Depressive symptoms assessment

Depressive symptoms in the study were assessed using the Chinese version of the Center for Epidemiologic Studies Depression Scale (CES-D) [ 44 ]. The Chinese version of the CES-D consists of 10 items intended to reflect the severity of depressive symptoms over the previous week. Each item is scored on a four-point scale, and total scores range from 0 to 30: 0 represents rarely or never (< 1 day), 1 sometimes or sporadically (1–2 days), 2 a moderate amount of the time (3–4 days), and 3 frequently or always (5–7 days). Higher total scores indicate a greater risk of depressive symptoms. A CES-D score ≥ 10 has been reported as a suitable cutoff point for indicating depressive symptoms in previous studies [ 45 , 46 ]; at this cutoff, the scale provides the best discrimination between individuals with and without a risk of depressive symptoms, with acceptable sensitivity, specificity, and accuracy [ 46 ]. The Chinese version of the CES-D has been confirmed to have good reliability and validity and has been used frequently in predicting depressive symptoms [ 47 ].
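A minimal sketch of this scoring rule (ten items coded 0–3, totals 0–30, cutoff ≥ 10); any reverse-coding of positively worded items, which the text does not describe, is omitted here and would need to follow the CHARLS scoring manual:

```python
# Hypothetical CES-D-10 scoring: sum ten item scores (each 0-3) and flag totals >= 10.
def cesd10_total(item_scores):
    """Return (total score, at-risk flag) for one respondent's 10 item scores."""
    assert len(item_scores) == 10 and all(0 <= s <= 3 for s in item_scores)
    total = sum(item_scores)
    return total, total >= 10

print(cesd10_total([1, 2, 0, 3, 1, 0, 2, 1, 3, 0]))  # -> (13, True)
```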

Anthropometric measurements

The anthropometric measurements used in this analysis included non-invasive anthropometric indicators (including WC, BMI, WHtR, ABSI, BRI, and CI) and invasive anthropometric indicators (including VAI, LAP, CVAI, TyG index, TyG-BMI, TyG-WC, and TyG-WHtR) [ 48 , 49 , 50 , 51 ]. These indicators are widely used as markers for obesity and insulin resistance in epidemiological studies to predict the risk of diseases (such as metabolic syndrome, depression, and diabetes) [ 28 , 52 , 53 , 54 ]. However, most of these studies [ 28 , 52 , 53 , 54 ] use a single indicator to study the relationship between obesity and depression, without attempting to compare the predictive power of these indicators for depression. Therefore, based on previous literature [ 16 , 21 , 24 ], we selected 13 obesity- and lipid-related indicators to investigate their correlation with depressive symptoms. WC was measured between the iliac crest and the lower ribs on both sides, at the end of an expiratory breath. BMI was calculated as weight (kg)/height² (m²) [ 55 ]. The other anthropometric measurements were calculated using the following formulas. It should be noted that the invasive anthropometric indicators require blood sampling to evaluate TG and HDL-C levels.

\({\text{WHtR}}=\mathrm{WC }\left({\text{cm}}\right) /\mathrm{ Height} \left({\text{cm}}\right)\)  [ 56 ]

Males:  \({\text{VAI}}=\frac{WC\left(cm\right)}{39.68+\left(1.88\times BMI\right)}\times \frac{TG\left(mmol/l\right)}{1.03}\times \frac{1.31}{HDL-C\left(mmol/l\right)}\)  [ 51 ]

Females:  \({\text{VAI}}=\frac{WC\left(cm\right)}{36.58+\left(1.89\times BMI\right)}\times \frac{TG\left(mmol/l\right)}{0.81}\times \frac{1.52}{HDL-C\left(mmol/l\right)}\)

\({\text{ABSI}}=\frac{WC(m)}{{{Height(m)}^{1/2}\times BMI}^{2/3}}\)  [ 56 ]

\({\text{BRI}}=364.2-365.5\sqrt{1-\frac{\left(WC(m)/(2\pi)\right)^{2}}{\left(0.5\times Height(m)\right)^{2}}}\)  [ 57 ]

Males:  \({\text{LAP}}=\left[{\text{WC}}\left({\text{cm}}\right)-65\right]\times {\text{TG}}\left({\text{mmol/l}}\right)\)  [ 21 ]

Females:  \({\text{LAP}}=\left[{\text{WC}}\left({\text{cm}}\right)-58\right]\times {\text{TG}}\left({\text{mmol/l}}\right)\)

\({\text{CI}}=\frac{WC\left(m\right)}{0.109\sqrt{\frac{weight\left(kg\right)}{height(m)}}}\)  [ 23 ]

Males:  \({\text{CVAI}}=-267.93+0.68\times {\text{age}}+0.03\times \mathrm{BMI }\left({\text{kg}}/{{\text{m}}}^{2}\right) +4.00\times \mathrm{WC }\left({\text{cm}}\right)+22.00\times {{\text{log}}}_{10}{\text{TG}} \left({\text{mmo}}1/1\right)-16.32\times {\text{HDL}}-{\text{C}} \left({\text{mmo}}1/1\right)\)  [ 51 ]

Females:  \({\text{CVAI}}=-187.32+1.71\times {\text{age}}+4.32\times \mathrm{BMI}\left({\text{kg}}/{{\text{m}}}^{2}\right)+1.12\times \mathrm{WC}\left({\text{cm}}\right)+39.76\times {{\text{log}}}_{10}\mathrm{TG}\left({\text{mmol}}/{\text{l}}\right)-11.66\times {\text{HDL-C}}\left({\text{mmol}}/{\text{l}}\right)\)

\(\mathrm{TyG index}={\text{Ln}}\left[\left({\text{TG}}\left({\text{mg}}/{\text{dl}}\right)\times \mathrm{glucose }\left({\text{mg}}/{\text{dl}}\right)/2\right)\right]\) [ 21 ]

\({\text{TyG}}-{\text{BMI}}={\text{TyG}}\times {\text{BMI}}\) [ 50 ]

\({\text{TyG}}-{\text{WC}}={\text{TyG}}\times {\text{WC}}\) [ 50 ]

\({\text{TyG}}-{\text{WHtR}}={\text{TyG}}\times {\text{WHtR}}\) [ 50 ]
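
The formulas above translate directly into code. The following Python sketch mirrors the equations as written, including their unit conventions (WC and height in cm unless a formula specifies metres; TG and HDL-C in mmol/l; TG and glucose in mg/dl for the TyG index). The function names are ours, and the snippet is an illustration rather than the study's computation pipeline.

import math

def whtr(wc_cm, height_cm):
    return wc_cm / height_cm

def vai(wc_cm, bmi, tg_mmol, hdl_mmol, sex):
    if sex == "male":
        return (wc_cm / (39.68 + 1.88 * bmi)) * (tg_mmol / 1.03) * (1.31 / hdl_mmol)
    return (wc_cm / (36.58 + 1.89 * bmi)) * (tg_mmol / 0.81) * (1.52 / hdl_mmol)

def absi(wc_m, height_m, bmi):
    return wc_m / (bmi ** (2 / 3) * height_m ** 0.5)

def bri(wc_m, height_m):
    return 364.2 - 365.5 * math.sqrt(1 - (wc_m / (2 * math.pi)) ** 2 / (0.5 * height_m) ** 2)

def lap(wc_cm, tg_mmol, sex):
    return (wc_cm - 65) * tg_mmol if sex == "male" else (wc_cm - 58) * tg_mmol

def conicity_index(wc_m, weight_kg, height_m):
    return wc_m / (0.109 * math.sqrt(weight_kg / height_m))

def cvai(age, bmi, wc_cm, tg_mmol, hdl_mmol, sex):
    if sex == "male":
        return (-267.93 + 0.68 * age + 0.03 * bmi + 4.00 * wc_cm
                + 22.00 * math.log10(tg_mmol) - 16.32 * hdl_mmol)
    return (-187.32 + 1.71 * age + 4.32 * bmi + 1.12 * wc_cm
            + 39.76 * math.log10(tg_mmol) - 11.66 * hdl_mmol)

def tyg(tg_mgdl, glucose_mgdl):
    return math.log(tg_mgdl * glucose_mgdl / 2)

# Combined indices used in the study
def tyg_bmi(tyg_value, bmi):
    return tyg_value * bmi

def tyg_wc(tyg_value, wc_cm):
    return tyg_value * wc_cm

def tyg_whtr(tyg_value, whtr_value):
    return tyg_value * whtr_value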

Socio-demographic characteristics included age, sex (1 = male, 2 = female), education level, marital status, current residence, current smoking, alcohol consumption, taking part in social activities, regular exercise, and chronic disease. (1) Age was sorted into four categories: 45–54, 55–64, 65–74, and 75 years and above. (2) Education level was classified into four groups: illiterate, less than elementary school, high school, and vocational school or above. (3) Marital status was classified into two categories: single and married. (4) Current residence included urban and rural. (5) Current smoking was categorized into three groups: never smoker, former smoker, and current smoker. (6) Alcohol consumption was divided into three groups: never drinking, less than once a month, and more than once a month. (7) Taking part in social activities was sorted into two groups: ever (at least once a month) and never. (8) Exercise was categorized as no exercise, less-than-regular exercise, and regular exercise. (9) The number of chronic diseases was classified as 0, 1–2, or 3–14. Chronic diseases in our study included hypertension, dyslipidemia, diabetes or hyperglycemia, malignant tumor, chronic lung disease, liver disease, heart disease, stroke, kidney disease, stomach or other digestive system disease, mental and emotional diseases, memory-related diseases, arthritis or rheumatism, and asthma. The presence of each disease was scored as 1, so the total score across all diseases ranges from 0 to 14. In terms of the number of chronic diseases, participants with three or more chronic diseases have a higher risk of depressive symptoms than those without any chronic disease [58]. These categories have been used extensively in our previous research [59, 60, 61, 62, 63].
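
As a small illustration of two of these groupings, the sketch below bins age and the chronic-disease count with pandas; the column names are hypothetical and the cut points mirror the categories described above.

import pandas as pd

def code_covariates(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Age groups: 45-54, 55-64, 65-74, 75 and above
    out["age_group"] = pd.cut(out["age"], bins=[44, 54, 64, 74, 200],
                              labels=["45-54", "55-64", "65-74", "75+"])
    # Chronic disease counts: 0, 1-2, 3-14
    out["chronic_group"] = pd.cut(out["chronic_count"], bins=[-1, 0, 2, 14],
                                  labels=["0", "1-2", "3-14"])
    return out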

Statistical analysis

Continuous variables were expressed as means and standard deviations, and categorical variables as frequencies and percentages. Independent-sample t-tests were used to compare mean values by sex and by the presence or absence of depressive symptoms. Socio-demographic characteristics were categorized by sex and presented as frequencies and percentages, and differences between males and females were tested with the Chi-square test. Binary logistic regression was used to evaluate the associations between obesity- and lipid-related indices and depressive symptoms, with the 13 indices as independent variables and depressive symptoms as the dependent variable. After adjusting for age, sex, education level, marital status, current residence, current smoking, alcohol consumption, taking part in social activities, regular exercise, and chronic disease, we report odds ratios (ORs) and 95% confidence intervals (CIs). Receiver operating characteristic (ROC) curves were used to evaluate the performance of the obesity- and lipid-related indices as predictors of depressive symptoms, quantified by the area under the curve (AUC) and its 95% CI. AUC values were interpreted as follows: greater than 0.9 indicates high accuracy, 0.71–0.9 moderate accuracy, and 0.5–0.7 low accuracy [64]. Our data satisfied the three assumptions required for these statistical tests: normality, homogeneity of variance, and independence. All statistical analyses were performed with IBM SPSS version 25.0 (IBM Corp., Armonk, NY), and P < 0.05 was considered statistically significant.
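
The sketch below reproduces the core of this workflow in Python rather than SPSS: a covariate-adjusted logistic regression for a single index, followed by an ROC/AUC computation for that index. The column names and formula terms are hypothetical, and this is a minimal illustration, not the authors' analysis code.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.metrics import roc_auc_score, roc_curve

def adjusted_or(df: pd.DataFrame, index_name: str) -> pd.Series:
    """Odds ratio and 95% CI for one index, adjusted for the listed covariates."""
    formula = (f"depressive_symptoms ~ {index_name} + age + C(sex) + C(education)"
               " + C(marital_status) + C(residence) + C(smoking) + C(drinking)"
               " + C(activities) + C(exercise) + C(chronic_group)")
    fit = smf.logit(formula, data=df).fit(disp=False)
    or_ci = np.exp(pd.concat([fit.params, fit.conf_int()], axis=1))
    or_ci.columns = ["OR", "2.5%", "97.5%"]
    return or_ci.loc[index_name]

def roc_summary(df: pd.DataFrame, index_name: str):
    """AUC and ROC curve coordinates for a single index used as a predictor."""
    y, score = df["depressive_symptoms"], df[index_name]
    fpr, tpr, thresholds = roc_curve(y, score)
    return roc_auc_score(y, score), fpr, tpr, thresholds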

Results

Table 1 shows the basic characteristics of the study participants. A total of 3790 participants were included in the short-term follow-up (2 years, from 2011 to 2013) and 3660 in the long-term follow-up (4 years, from 2011 to 2015). Because participants with missing data did not differ from the full sample in socio-demographic characteristics, missing data were handled by direct deletion. At baseline, 53.54% of the participants were male in the short-term sample and 53.63% in the long-term sample. Mean BMI, WHtR, VAI, ABSI, BRI, LAP, CI, CVAI, TyG index, TyG-BMI, TyG-WC, and TyG-WHtR were higher in females than in males during both the short- and long-term follow-up (P < 0.05). During both follow-up periods, we also observed significant differences in age, education level, marital status, current smoking, and alcohol consumption between males and females, but no significant differences in the distribution of current residence, taking part in social activities, or regular exercise.

Table 2 shows the baseline characteristics of the participants with and without depressive symptoms, by sex, for 2011 → 2013. After the 2-year follow-up, approximately 20.79% of the participants had depressive symptoms (16.76% of males and 25.44% of females). During the short-term follow-up, males with and without depressive symptoms differed significantly in current residence, current smoking, WC, WHtR, VAI, BRI, LAP, CI, CVAI, TyG-BMI, TyG-WC, and TyG-WHtR (P < 0.05), while females with and without depressive symptoms differed significantly in current residence and chronic diseases (P < 0.05).

Table 3 shows the baseline characteristics of the participants with and without depressive symptoms, by sex, for 2011 → 2015. After the 4-year follow-up, approximately 27.43% of the participants had depressive symptoms (21.50% of males and 34.30% of females). During the long-term follow-up, marital status differed significantly between males with and without depressive symptoms (P < 0.05), while females with and without depressive symptoms differed significantly in current residence, taking part in social activities, chronic diseases, WC, BMI, BRI, CVAI, TyG index, TyG-BMI, TyG-WC, and TyG-WHtR (P < 0.05).

Table 4 shows the associations of the obesity- and lipid-related indices with depressive symptoms. The indices were treated as continuous variables and depressive symptoms as a binary outcome, so each OR describes how the odds of depressive symptoms change for every one-unit increase in an index. In males, after controlling for age, education level, marital status, current residence, current smoking, alcohol consumption, taking part in social activities, regular exercise, and chronic diseases, WC (OR = 0.987, 95% CI: 0.974–1.000), LAP (OR = 0.996, 95% CI: 0.992–1.000), CVAI (OR = 0.997, 95% CI: 0.995–1.000), and TyG-WC (OR = 0.999, 95% CI: 0.998–1.000) were significantly associated with depressive symptoms during the short-term follow-up (P < 0.05). For example, for every one-unit increase in WC and TyG-WC, the odds of depressive symptoms decreased by approximately 1.3% and 0.1%, respectively. In females, WC (OR = 0.983, 95% CI: 0.973–0.993), BMI (OR = 0.953, 95% CI: 0.926–0.979), WHtR (OR = 0.130, 95% CI: 0.026–0.647), BRI (OR = 0.908, 95% CI: 0.843–0.978), LAP (OR = 0.997, 95% CI: 0.994–1.000), CVAI (OR = 0.996, 95% CI: 0.993–0.998), TyG index (OR = 0.834, 95% CI: 0.708–0.983), TyG-BMI (OR = 0.995, 95% CI: 0.992–0.998), TyG-WC (OR = 0.998, 95% CI: 0.998–0.999), and TyG-WHtR (OR = 0.814, 95% CI: 0.707–0.936) were significantly associated with depressive symptoms during the long-term follow-up (P < 0.05). For every one-unit increase in BMI and in the TyG index, the odds of depressive symptoms decreased by approximately 4.7% and 16.6%, respectively. There were no significant associations between ABSI and depressive symptoms in either males or females during either follow-up period (P > 0.05).
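
The percentage change in odds implied by an odds ratio follows directly from its definition (this conversion is ours, added for clarity, and is not stated in the original text): \(\text{percent change in odds}=\left(\mathrm{OR}-1\right)\times 100\%\). For the male WC estimate above, \(\left(0.987-1\right)\times 100\%=-1.3\%\) per 1-cm increase in WC.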

Table 5 shows the optimal cut-off values, areas under the curve, sensitivity, and specificity of the obesity- and lipid-related indices for detecting the subsequent onset of depressive symptoms, by sex. The ROC curves of each index for predicting the risk of depressive symptoms in males and females are shown in Figs. 1 and 2 for the short-term follow-up and in Figs. 3 and 4 for the long-term follow-up, respectively. In males, WHtR (AUC = 0.462, SE = 0.017, 95% CI = 0.429–0.495, optimal cut-off = 0.432) and BRI (AUC = 0.462, SE = 0.017, 95% CI = 0.429–0.495, optimal cut-off = 2.176) had the largest predictive values among the 13 indicators during the short-term follow-up (P < 0.05). In females, BMI (AUC = 0.468, SE = 0.015, 95% CI = 0.439–0.496, optimal cut-off = 19.378) and LAP (AUC = 0.468, SE = 0.015, 95% CI = 0.439–0.497, optimal cut-off = 2.163) had the largest predictive values among the 13 indicators during the long-term follow-up (P < 0.05). However, ABSI showed no significant predictive ability for depressive symptoms in either males or females during either follow-up period (P > 0.05).
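
One common way to derive such optimal cut-offs from an ROC curve is the Youden index (sensitivity + specificity − 1); the text does not state which criterion was used, so the Python sketch below is only an illustrative assumption.

import numpy as np
from sklearn.metrics import roc_curve

def youden_cutoff(y_true, scores):
    """Return the threshold maximising Youden's J, with its sensitivity and specificity."""
    fpr, tpr, thresholds = roc_curve(y_true, scores)
    j = tpr - fpr                      # Youden's J at each candidate threshold
    best = int(np.argmax(j))
    return thresholds[best], tpr[best], 1 - fpr[best]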

Figure 1. The ROC curves of each indicator in the prediction of depressive symptoms risk in males at 2011→2013

Figure 2. The ROC curves of each indicator in the prediction of depressive symptoms risk in females at 2011→2013

Figure 3. The ROC curves of each indicator in the prediction of depressive symptoms risk in males at 2011→2015

Figure 4. The ROC curves of each indicator in the prediction of depressive symptoms risk in females at 2011→2015

Discussion

In our nationwide cohort study, we used ROC analysis to determine the predictive power of obesity- and lipid-related indicators for depressive symptoms. Our findings revealed that the AUC values of most indicators were below 0.5, indicating that the discriminative power of these indices is weak and not significantly better than random chance [64]. Although many previous studies [31, 65, 66] have reported a relationship between obesity and depressive symptoms, almost no research has investigated the predictive ability of such indicators for depressive symptoms. Our study is, to our knowledge, the first to investigate the predictive ability of 13 indicators for depressive symptoms in a cohort design, and it found that all of the indicators had limited predictive ability.

We also found that the incidence of depressive symptoms in females was 25.44% at the short-term follow-up and 34.30% at the long-term follow-up, consistently higher than in males during both periods. This is consistent with previously published studies [17, 67, 68]. Hormonal fluctuations (such as heightened sensitivity to hormonal changes and menopausal hormonal shifts) can lead to endocrine disturbances, making women prone to emotional fluctuations, depression, and reluctance to interact with others [69]. In addition, owing to psychosocial events, victimization, gender-specific socialization, internalized coping strategies, and disadvantaged social status, females may be more prone to depression than males [70, 71]. From the perspective of social differences, women may experience more stressful life events throughout their lives and are more sensitive to these events [70]. When faced with difficulties, women and men also differ markedly in coping styles: women are more inclined to focus on the emotions caused by problems and to engage in repetitive thinking, and this ruminative coping style may lead to a higher incidence of depression.

Moreover, significant negative associations were found between depressive symptoms and most obesity- and lipid-related indicators, but they differed by sex (male, female) and by length of follow-up (2 years, 4 years). In males, significant associations between depressive symptoms and WC, LAP, CVAI, and TyG-WC were observed in the short term but not in the long term. No association between BMI and depressive symptoms was observed in males during either follow-up period. This could be because BMI is only a surrogate measure of body fatness and does not distinguish body composition (muscle versus fat accumulation), especially in males, who often have more muscle mass and less fat mass than females [18]; analyses relying on BMI alone may therefore be particularly limited in males. In females, significant associations between depressive symptoms and WC and TyG-WC were observed in the short term, and associations with WC, BMI, WHtR, BRI, LAP, CVAI, TyG index, TyG-BMI, TyG-WC, and TyG-WHtR were observed in the long term. Compared with the short-term follow-up, more indicators showed a significant negative correlation with depressive symptoms in females during the long-term follow-up, which may be explained by cumulative effects.

However, we did not find a significant association between ABSI and depressive symptoms. In contrast to our results, Lotfi et al. found that ABSI was positively related to the odds of depressive symptoms, measured with the Hospital Anxiety and Depression Scale, among Iranian females but not males [72]. Several points may explain the differences between our findings and theirs. Firstly, the previous research was conducted among Iranian adults, whereas our survey was conducted among the middle-aged and elderly population in China, with differences in demographic characteristics such as ethnicity and age. Secondly, Lotfi et al. used the Hospital Anxiety and Depression Scale, whereas we used the Chinese version of the CES-D, and the two instruments differ in how they identify depression. Thirdly, the previous research was cross-sectional, whereas ours is a cohort study with a larger sample size that also analyzed the predictive ability of ABSI, giving the current study greater ability to detect these relationships. In the ROC analysis, the AUC for ABSI did not reach statistical significance in either males or females during either follow-up period (P > 0.05). Hence, ABSI was not a valuable predictive indicator of depressive symptoms for either males or females.

Our results support the "jolly fat" hypothesis in middle-aged and elderly Chinese, consistent with many previous studies [27, 29, 73, 74, 75]. Crisp et al. first reported the "jolly fat" hypothesis in a middle-aged sample of the general population, suggesting that severely obese men had significantly lower levels of depression [75]. In addition, Yim et al. examined the association between obesity and depressive symptoms in 2210 Korean middle-aged women and likewise supported the "jolly fat" hypothesis, finding that women with general obesity were less likely to have depressive symptoms [74]. However, some cross-sectional studies suggest a positive correlation between obesity and depression [67, 68]. Part of the difference may be due to cultural factors, as people in different regions hold different attitudes toward obesity. Weight bias is very common in American society; according to one survey, the prevalence of weight bias in the United States increased by 66% over roughly a decade [76]. Weight stigmatization may be one of the risk factors for depression in obese individuals, and weight-based ridicule has been found to mediate the relationship between obesity and depression [77]. A review summarizes evidence that internalization of weight bias is associated with negative mental health outcomes such as depression, anxiety, feelings of inferiority, and poorer quality of life [78]. In Chinese cultural tradition, by contrast, the connection between happiness and obesity is captured by the well-known idiom "happy mind and fat body" [79]. Compared with Western culture, Chinese people have traditionally not regarded obesity as a sign of unhealthy behavior, since only wealthier people could afford more food and gain weight. In addition, middle-aged weight gain is considered a good omen, so people are willing to gain weight in their later years [28].

Strengths and limitations of the study

The main strengths of our study are as follows. Firstly, we analyzed data from a nationwide, population-based longitudinal study that enrolled 3790 and 3660 middle-aged and elderly Chinese individuals in the short-term and long-term follow-up, respectively; the large sample size enhances the generalizability of the results. Secondly, the study evaluated the impact of the obesity- and lipid-related indicators on depressive symptoms over two different follow-up intervals, which helps clarify the short-term and long-term effects of the 13 indicators on the incidence of depressive symptoms.

Several limitations should also be noted. Firstly, depressive symptoms were measured with the CES-D self-report scale, which has acceptable psychometric properties and is suitable for a wide range of elderly participants; however, because people tend to underreport mental illness in surveys, reporting bias may be present. Secondly, as population aging deepens, the incidence of depressive symptoms among middle-aged and elderly people is rising, which is a serious problem facing China; this study therefore included Chinese adults aged 45 and above, and our results should be interpreted with caution when applied to other age groups. Lastly, the AUC values of most indicators were below 0.5, indicating low diagnostic accuracy and an inability to predict depression effectively in clinical practice. Future research should test whether combining two or more indicators can improve diagnostic accuracy.

Conclusions

Among the obesity- and lipid-related indices, ABSI was not associated with depressive symptoms and did not serve as a valuable predictor for either males or females across the follow-up intervals. Most obesity- and lipid-related indicators were statistically significantly associated with depressive symptoms, but their predictive accuracy was relatively low, so they may not be practical predictive factors. These findings may nevertheless be of value for the early identification and prevention of depressive symptoms in middle-aged and elderly Chinese. Given the urgency of early screening of individuals at high risk of depressive symptoms, future research could explore combinations of multiple indicators to test whether they improve the prediction of depressive symptoms and can therefore be applied in clinical practice.

Availability of data and materials

Data can be accessed via http://opendata.pku.edu.cn/dataverse/CHARLS .

Abbreviations

CHARLS: China Health and Retirement Longitudinal Study
WC: Waist circumference
BMI: Body mass index
WHtR: Waist-height ratio
VAI: Visceral adiposity index
ABSI: A body shape index
BRI: Body roundness index
LAP: Lipid accumulation product
CI: Conicity index
CVAI: Chinese visceral adiposity index
TyG index: Triglyceride glucose index
TyG-BMI: Triglyceride-glucose related to BMI
TyG-WC: Triglyceride-glucose related to WC
TyG-WHtR: Triglyceride-glucose related to WHtR
CES-D: The Chinese version of the Center for Epidemiologic Studies Depression scale
ROC: Receiver operating characteristic curve
AUC: Area under curve
SPSS: Statistical Product Service Solutions
OR: Odds ratios
CI: Confidence intervals
SE: Standard error

Richardson RA, Keyes KM, Medina JT, Calvo E. Sociodemographic inequalities in depression among older adults: cross-sectional evidence from 18 countries. Lancet Psychiatry. 2020;7(8):673–81.

Fang EF, Scheibye-Knudsen M, Jahn HJ, Li J, Ling L, Guo H, Zhu X, Preedy V, Lu H, Bohr VA, et al. A research agenda for aging in China in the 21st century. Ageing Res Rev. 2015;24(Pt B):197–205.

Qiu Q-W, Qian S, Li J-Y, Jia R-X, Wang Y-Q, Xu Y. Risk factors for depressive symptoms among older Chinese adults: a meta-analysis. J Affect Disord. 2020;277:341–6.

Zhang L, Xu Y, Nie H, Zhang Y, Wu Y. The prevalence of depressive symptoms among the older in China: a meta-analysis. Int J Geriatr Psychiatry. 2012;27(9):900–6.

Penninx BWJH, Milaneschi Y, Lamers F, Vogelzangs N. Understanding the somatic consequences of depression: biological mechanisms and the role of depression symptom profile. BMC Med. 2013;11(1):129.

Mulugeta A, Zhou A, Power C, Hyppönen E. Obesity and depressive symptoms in mid-life: a population-based cohort study. BMC Psychiatry. 2018;18(1):297.

Liu S, Luo J, Zhang T, Zhang D, Zhang H. The combined role of obesity and depressive symptoms in the association with ischaemic heart disease and its subtypes. Sci Rep. 2022;12(1):14419.

Zormpas C, Kahl KG, Hohmann S, Oswald H, Stiel C, Veltmann C, Bauersachs J, Duncker D. Depressive symptoms and quality of life in patients with heart failure and an implantable cardioverter-defibrillator. Front Psychiatry. 2022;13:827967.

Milaneschi Y, Simmons WK, van Rossum EFC, Penninx BW. Depression and obesity: evidence of shared biological mechanisms. Mol Psychiatry. 2019;24(1):18–33.

Wang Y, Zhao L, Gao L, Pan A, Xue H. Health policy and public health implications of obesity in China. Lancet Diabetes Endocrinol. 2021;9(7):446–61.

Tápias FS, Otani VHO, Vasques DAC, Otani TZS, Uchida RR. Costs associated with depression and obesity among cardiovascular patients: medical expenditure panel survey analysis. BMC Health Serv Res. 2021;21(1):433.

Powell-Wiley TM, Poirier P, Burke LE, Després JP, Gordon-Larsen P, Lavie CJ, Lear SA, Ndumele CE, Neeland IJ, Sanders P, et al. Obesity and cardiovascular disease: a scientific statement from the American heart association. Circulation. 2021;143(21):e984–1010.

Pischon T, Nothlings U, Boeing H. Obesity and cancer. Proc Nutr Soc. 2008;67(2):128–45.

Williams EP, Mesidor M, Winters K, Dubbert PM, Wyatt SB. Overweight and obesity: prevalence, consequences, and causes of a growing public health problem. Curr Obes Rep. 2015;4(3):363–70.

Issaka A, Cameron AJ, Paradies Y, Kiwallo JB, Bosu WK, Houehanou YCN, Wesseh CS, Houinato DS, Nazoum DJP, Stevenson C. Associations between obesity indices and both type 2 diabetes and impaired fasting glucose among West African adults: results from WHO STEPS surveys. Nutr Metab Cardiovasc Dis. 2021;31(9):2652–60.

Gu Z, Zhu P, Wang Q, He H, Xu J, Zhang L, Li D, Wang J, Hu X, Ji G, et al. Obesity and lipid-related parameters for predicting metabolic syndrome in Chinese elderly population. Lipids Health Dis. 2018;17(1):289.

Frank P, Jokela M, Batty GD, Lassale C, Steptoe A, Kivimäki M. Overweight, obesity, and individual symptoms of depression: a multicohort study with replication in UK Biobank. Brain Behav Immun. 2022;105:192–200.

Gomez-Ambrosi J, Silva C, Galofre JC, Escalada J, Santos S, Millan D, Vila N, Ibanez P, Gil MJ, Valenti V, et al. Body mass index classification misses subjects with increased cardiometabolic risk factors related to elevated adiposity. Int J Obes (Lond). 2012;36(2):286–94.

Fang H, Berg E, Cheng X, Shen W. How to best assess abdominal obesity. Curr Opin Clin Nutr Metab Care. 2018;21(5):360–5.

Krakauer NY, Krakauer JC. A new body shape index predicts mortality hazard independently of body mass index. Plos One. 2012;7(7):e39504.

Ahn N, Baumeister SE, Amann U, Rathmann W, Peters A, Huth C, Thorand B, Meisinger C. Visceral adiposity index (VAI), lipid accumulation product (LAP), and product of triglycerides and glucose (TyG) to discriminate prediabetes and diabetes. Sci Rep. 2019;9(1):9693.

Calderón-García JF, Roncero-Martín R, Rico-Martín S, De Nicolás-Jiménez JM, López-Espuela F, Santano-Mogena E, Alfageme-García P, Sánchez Muñoz-Torrero JF. Effectiveness of Body Roundness Index (BRI) and a Body Shape Index (ABSI) in predicting hypertension: a systematic review and meta-analysis of observational studies. Int J Environ Res Public Health. 2021;18(21):11607.

Shenoy U, Jagadamba. Influence of central obesity assessed by conicity index on lung age in young adults. J Clin Diagn Res. 2017;11(4):09–12.

Fabricatore AN, Wadden TA, Higginbotham AJ, Faulconbridge LF, Nguyen AM, Heymsfield SB, Faith MS. Intentional weight loss and changes in symptoms of depression: a systematic review and meta-analysis. Int J Obes (Lond). 2011;35(11):1363–76.

Zhong W, Cruickshanks KJ, Schubert CR, Nieto FJ, Huang GH, Klein BE, Klein R. Obesity and depression symptoms in the Beaver Dam offspring study population. Depress Anxiety. 2010;27(9):846–51.

Zhang L, Liu K, Li H, Li D, Chen Z, Zhang LL, Guo LL. Relationship between body mass index and depressive symptoms: the “fat and jolly” hypothesis for the middle-aged and elderly in China. BMC Public Health. 2016;16(1):1201.

Liao W, Luo Z, Hou Y, Cui N, Liu X, Huo W, Wang F, Wang C. Age and gender specific association between obesity and depressive symptoms: a large-scale cross-sectional study. BMC Public Health. 2020;20(1):1565.

Zhang L, Li JL, Zhang LL, Guo LL, Li H, Yan W, Li D. Relationship between adiposity parameters and cognition: the “fat and jolly” hypothesis in middle-aged and elderly people in China. Medicine (Baltimore). 2019;98(10):e14747.

Qian J, Li N, Ren X. Obesity and depressive symptoms among Chinese people aged 45 and over. Sci Rep. 2017;7:45637.

Zhou Y, Yang G, Peng W, Zhang H, Peng Z, Ding N, Guo T, Cai Y, Deng Q, Chai X. Relationship between depression symptoms and different types of measures of obesity (BMI, SAD) in US Women. Behav Neurol. 2020;2020:9624106.

Zavala GA, Kolovos S, Chiarotto A, Bosmans JE, Campos-Ponce M, Rosado JL, Garcia OP. Association between obesity and depressive symptoms in Mexican population. Soc Psychiatry Psychiatr Epidemiol. 2018;53(6):639–46.

Ho RC, Niti M, Kua EH, Ng TP. Body mass index, waist circumference, waist-hip ratio and depressive symptoms in Chinese elderly: a population-based study. Int J Geriatr Psychiatry. 2008;23(4):401–8.

Geoffroy MC, Li L, Power C. Depressive symptoms and body mass index: co-morbidity and direction of association in a British birth cohort followed over 50 years. Psychol Med. 2014;44(12):2641–52.

Alberga AS, Pickering BJ, Alix Hayden K, Ball GD, Edwards A, Jelinski S, Nutter S, Oddie S, Sharma AM, Russell-Mayhew S. Weight bias reduction in health professionals: a systematic review. Clin Obes. 2016;6(3):175–88.

Xu Q, Anderson D, Lurie-Beck J. The relationship between abdominal obesity and depression in the general population: a systematic review and meta-analysis. Obes Res Clin Pract. 2011;5(4):e267–360.

Qiao T, Luo T, Pei H, Yimingniyazi B, Aili D, Aimudula A, Zhao H, Zhang H, Dai J, Wang D. Association between abdominal obesity indices and risk of cardiovascular events in Chinese populations with type 2 diabetes: a prospective cohort study. Cardiovasc Diabetol. 2022;21(1):225.

Nimptsch K, Konigorski S, Pischon T. Diagnosis of obesity and use of obesity biomarkers in science and clinical medicine. Metabolism. 2019;92:61–70.

Massimino M, Monea G, Marinaro G, Rubino M, Mancuso E, Mannino GC, Andreozzi F. The Triglycerides and Glucose (TyG) index is associated with 1-hour glucose levels during an OGTT. Int J Environ Res Public Health. 2022;20(1):787.

Locateli JC, Lopes WA, Simoes CF, de Oliveira GH, Oltramari K, Bim RH, de Souza Mendes VH, Remor JM, Lopera CA, Nardo Junior N. Triglyceride/glucose index is a reliable alternative marker for insulin resistance in South American overweight and obese children and adolescents. J Pediatr Endocrinol Metab. 2019;32(10):1163–70.

McGraw MB, Kohler LN, Shaibi GQ, Mandarino LJ, Coletta DK. A performance review of novel adiposity indices for assessing insulin resistance in a pediatric Latino population. Front Pediatr. 2022;10:1020901.

Bilgin Göçer D, Baş M, Çakır Biçer N, Hajhamidiasl L. Predicting metabolic syndrome by visceral adiposity index, body roundness index, dysfunctional adiposity index, lipid accumulation product index, and body shape index in adults. Nutr Hosp. 2022;39(4):794–802.

Raimi TH, Dele-Ojo BF, Dada SA, Fadare JO, Ajayi DD, Ajayi EA, Ajayi OA. Triglyceride-Glucose index and related parameters predicted metabolic syndrome in Nigerians. Metab Syndr Relat Disord. 2021;19(2):76–82.

Zhao Y, Hu Y, Smith JP, Strauss J, Yang G. Cohort profile: the China Health and Retirement Longitudinal Study (CHARLS). Int J Epidemiol. 2014;43(1):61–8.

Radloff LS. The CES-D scale. Appl Psychol Meas. 1977;1(3):385–401.

Fu H, Si L, Guo R. What is the optimal cut-off point of the 10-item center for epidemiologic studies depression scale for screening depression among Chinese individuals aged 45 and over? An exploration using latent profile analysis. Front Psychiatry. 2022;13:820777.

Boey KW. Cross-validation of a short form of the CES-D in Chinese elderly. Int J Geriatr Psychiatry. 1999;14(8):608–17.

Chen H, Mui AC. Factorial validity of the center for epidemiologic studies depression scale short form in older population in China. Int Psychogeriatr. 2014;26(1):49–57.

Arellano-Ruiz P, García-Hermoso A, García-Prieto JC, Sánchez-López M, Vizcaíno VM, Solera-Martínez M. Predictive ability of waist circumference and waist-to-height ratio for cardiometabolic risk screening among Spanish children. Nutrients. 2020;12(2):415.

Romero-Saldaña M, Fuentes-Jiménez FJ, Vaquero-Abellán M, Álvarez-Fernández C, Molina-Recio G, López-Miranda J. New non-invasive method for early detection of metabolic syndrome in the working population. Eur J Cardiovasc Nurs. 2016;15(7):549–58.

Zheng S, Shi S, Ren X, Han T, Li Y, Chen Y, Liu W, Hou PC, Hu Y. Triglyceride glucose-waist circumference, a novel and effective predictor of diabetes in first-degree relatives of type 2 diabetes patients: cross-sectional and prospective cohort study. J Transl Med. 2016;14(1):260.

Wan H, Wang Y, Xiang Q, Fang S, Chen Y, Chen C, Zhang W, Zhang H, Xia F, Wang N, et al. Associations between abdominal obesity indices and diabetic complications: Chinese visceral adiposity index and neck circumference. Cardiovasc Diabetol. 2020;19(1):118.

Ge Q, Li M, Xu Z, Qi Z, Zheng H, Cao Y, Huang H, Duan X, Zhuang X. Comparison of different obesity indices associated with type 2 diabetes mellitus among different sex and age groups in Nantong, China: a cross-section study. BMC Geriatr. 2022;22(1):20.

Duan Y, Zhang W, Li Z, Niu Y, Chen Y, Liu X, Dong Z, Zheng Y, Chen X, Feng Z, et al. Predictive ability of obesity- and lipid-related indicators for metabolic syndrome in relatively healthy Chinese adults. Front Endocrinol (Lausanne). 2022;13:1016581.

Ramírez-Vélez R, Pérez-Sousa M, González-Ruíz K, Cano-Gutierrez CA, Schmidt-RioValle J, Correa-Rodríguez M, Izquierdo M, Romero-García JA, Campos-Rodríguez AY, Triana-Reina HR, et al. Obesity- and lipid-related parameters in the identification of older adults with a high risk of prediabetes according to the American diabetes association: an analysis of the 2015 health, well-being, and aging study. Nutrients. 2019;11(11):2654.

Zhou BF. Effect of body mass index on all-cause mortality and incidence of cardiovascular diseases–report for meta-analysis of prospective studies open optimal cut-off points of body mass index in Chinese adults. Biomed Environ Sci. 2002;15(3):245–52.

Chen R, Ji L, Chen Y, Meng L. Weight-to-height ratio and body roundness index are superior indicators to assess cardio-metabolic risks in Chinese children and adolescents: compared with body mass index and a body shape index. Transl Pediatr. 2022;11(3):318–29.

Stefanescu A, Revilla L, Lopez T, Sanchez SE, Williams MA, Gelaye B. Using A Body Shape Index (ABSI) and Body Roundness Index (BRI) to predict risk of metabolic syndrome in Peruvian adults. J Int Med Res. 2020;48(1):300060519848854.

Jiang C-H, Zhu F, Qin T-T. Relationships between chronic diseases and depression among middle-aged and elderly people in China: a prospective study from CHARLS. Curr Med Sci. 2020;40(5):858–70.

Liu H, Yang X, Guo LL, Li JL, Xu G, Lei Y, Li X, Sun L, Yang L, Yuan T, et al. Frailty and incident depressive symptoms during short- and long-term follow-up period in the middle-aged and elderly: findings from the Chinese nationwide cohort study. Front Psychiatry. 2022;13:848849.

Zhang L, Li JL, Zhang LL, Guo LL, Li H, Li D. Association and interaction analysis of body mass index and triglycerides level with blood pressure in elderly individuals in China. Biomed Res Int. 2018;2018:8934534.

Zhang L, Li JL, Zhang LL, Guo LL, Li H, Li D. No association between C-reactive protein and depressive symptoms among the middle-aged and elderly in China: evidence from the china health and retirement longitudinal study. Medicine (Baltimore). 2018;97(38):e12352.

Zhang L, Li JL, Guo LL, Li H, Li D, Xu G. The interaction between serum uric acid and triglycerides level on blood pressure in middle-aged and elderly individuals in China: result from a large national cohort study. BMC Cardiovasc Disord. 2020;20(1):174.

Zhang L, Yang L, Wang C, Yuan T, Zhang D, Wei H, Li J, Lei Y, Sun L, Li X, et al. Mediator or moderator? The role of obesity in the association between age at menarche and blood pressure in middle-aged and elderly Chinese: a population-based cross-sectional study. BMJ Open. 2022;12(5):e051486.

Eusebi P. Diagnostic accuracy measures. Cerebrovasc Dis. 2013;36(4):267–72.

Simon GE, Ludman EJ, Linde JA, Operskalski BH, Ichikawa L, Rohde P, Finch EA, Jeffery RW. Association between obesity and depression in middle-aged women. Gen Hosp Psychiatry. 2008;30(1):32–9.

Zhao G, Ford ES, Li C, Tsai J, Dhingra S, Balluz LS. Waist circumference, abdominal obesity, and depression among overweight and obese U.S. Adults: national health and nutrition examination survey. BMC Psychiatry. 2011;11:130.

Hadi S, Momenan M, Cheraghpour K, Hafizi N, Pourjavidi N, Malekahmadi M, Foroughi M, Alipour M. Abdominal volume index: a predictive measure in relationship between depression/anxiety and obesity. Afr Health Sci. 2020;20(1):257–65.

Alshehri T, Boone S, de Mutsert R, Penninx B, Rosendaal F, le Cessie S, Milaneschi Y, Mook-Kanamori D. The association between overall and abdominal adiposity and depressive mood: a cross-sectional analysis in 6459 participants. Psychoneuroendocrinology. 2019;110:104429.

Sassarini DJ. Depression in midlife women. Maturitas. 2016;94:149–54.

Noble RE. Depression in women. Metabolism. 2005;54(5 Suppl 1):49–52.

Lu J, Xu X, Huang Y, Li T, Ma C, Xu G, Yin H, Xu X, Ma Y, Wang L, et al. Prevalence of depressive disorders and treatment in China: a cross-sectional epidemiological study. Lancet Psychiatry. 2021;8(11):981–90.

Lotfi K, Hassanzadeh Keshteli A, Saneei P, Afshar H, Esmaillzadeh A, Adibi P. A body shape index and body roundness index in relation to anxiety, depression, and psychological distress in adults. Front Nutr. 2022;9:843155.

Luo H, Li J, Zhang Q, Cao P, Ren X, Fang A, Liao H, Liu L. Obesity and the onset of depressive symptoms among middle-aged and older adults in China: evidence from the CHARLS. BMC Public Health. 2018;18(1):909.

Yim G, Ahn Y, Cho J, Chang Y, Ryu S, Lim JY, Park HY. The “jolly fat” effect in middle-aged Korean women. J Womens Health (Larchmt). 2017;26(11):1236–43.

Crisp AH, McGuiness B. Jolly fat: relation between obesity and psychoneurosis in general population. BMJ. 1976;1(6000):7–9.

Andreyeva T, Puhl RM, Brownell KD. Changes in perceived weight discrimination among Americans, 1995–1996 through 2004–2006. Obesity. 2012;16(5):1129–34.

Puhl RM, Heuer CA. The stigma of obesity: a review and update. Obesity. 2012;17(5):941–64.

Pearl RL, Puhl RM. Weight bias internalization and health: a systematic review. Obes Rev. 2018;19(8):1141–63.

Li ZB, Ho SY, Chan WM, Ho KS, Li MP, Leung GM, Lam TH. Obesity and depressive symptoms in Chinese elderly. Int J Geriatr Psychiatry. 2004;19(1):68–74.

Acknowledgements

We thank the members of the research team as well as all participants for their contributions.

CHARLS was supported by the NSFC (70910107022, 71130002), the National Institute on Aging (R03-TW008358-01; R01-AG037031-03S1), and the World Bank (7159234), and this work was supported by the Support Program for Outstanding Young Talents from the Universities and Colleges of Anhui Province for Lin Zhang (gxyqZD2021118).

Author information

Authors and Affiliations

Department of Graduate School, Wannan Medical College, 22 Wenchang West Road, Higher Education Park, Wuhu City, An Hui Province, People’s Republic of China

Xiaoyun Zhang, Ying Wang, Xue Yang, Yuqing Li, Jiaofeng Gui & Yujin Mei

Student Health Center, Wannan Medical College, 22 Wenchang West Road, Higher Education Park, Wuhu City, An Hui Province, People’s Republic of China

Haiyang Liu

Department of Surgical Nursing, School of Nursing, Jinzhou Medical University, No.40, Section 3, Songpo Road, Linghe District, Jinzhou City, Liaoning Province, People’s Republic of China

Lei-lei Guo

Department of Occupational and Environmental Health, Key Laboratory of Occupational Health and Safety for Coal Industry in Hebei Province, School of Public Health, North China University of Science and Technology, Tangshan, Hebei Province, People’s Republic of China

Obstetrics and Gynecology Nursing, School of Nursing, Wannan Medical College, 22 Wenchang West Road, Higher Education Park, Wuhu City, An Hui Province, People’s Republic of China

Yunxiao Lei & Ting Yuan

Department of Emergency and Critical Care Nursing, School of Nursing, Wannan Medical College, 22 Wenchang West Road, Higher Education Park, Wuhu City, An Hui Province, People’s Republic of China

Xiaoping Li & Lu Sun

Department of Internal Medicine Nursing, School of Nursing, Wannan Medical College, 22 Wenchang West Road, Higher Education Park, Wuhu City, An Hui Province, People’s Republic of China

Liu Yang, Congzhi Wang & Lin Zhang

Department of Pediatric Nursing, School of Nursing, Wannan Medical College, 22 Wenchang West Road, Higher Education Park, Wuhu City, An Hui Province, People’s Republic of China

Dongmei Zhang

Department of Surgical Nursing, School of Nursing, Wannan Medical College, 22 Wenchang West Road, Higher Education Park, Wuhu City, An Hui Province, People’s Republic of China

Jing Li & Mingming Liu

Rehabilitation Nursing, School of Nursing, Wannan Medical College, 22 Wenchang West Road, Higher Education Park, Wuhu City, An Hui Province, People’s Republic of China

Contributions

Conceived and designed the research: LZ. Wrote the paper: X–y Z. Analyzed the data: X–y Z and LZ. Revised the paper: X–y Z, YW, XY, Y-q L, J-f G, Y-j M, LZ, H-y L, L-l G, J-l L, Y-x L, X-p L, LS, LY, TY, C-z W, D-m Z, JL, M-m L, and YH. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Lin Zhang .

Ethics declarations

Ethics approval and consent to participate

All data are openly published as microdata at http://opendata.pku.edu.cn/dataverse/CHARLS with no direct contact with participants. Approval for this study was given by the Medical Ethics Committee of Wannan Medical College (approval number 2021–3).

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Zhang, X., Wang, Y., Yang, X. et al. Obesity and lipid indices as predictors of depressive symptoms in middle-aged and elderly Chinese: insights from a nationwide cohort study. BMC Psychiatry 24 , 351 (2024). https://doi.org/10.1186/s12888-024-05806-z

Received : 30 January 2023

Accepted : 02 May 2024

Published : 10 May 2024

DOI : https://doi.org/10.1186/s12888-024-05806-z

Keywords

  • Depressive symptoms
  • Lipid-related index
  • Anthropometric indicators
  • Middle-aged and elderly
  • Cohort study

Advanced neuroimaging techniques to explore the effects of motor and cognitive rehabilitation in multiple sclerosis

Published: 01 May 2024

  • Maria A. Rocca (ORCID: orcid.org/0000-0003-2358-4320) 1, 2, 5
  • Francesco Romanò 1
  • Nicolò Tedone 1
  • Massimo Filippi 1, 2, 3, 4, 5

Introduction

Progress in magnetic resonance imaging (MRI) technology and analyses is improving our comprehension of multiple sclerosis (MS) pathophysiology. These advancements, which enable the evaluation of atrophy, microstructural tissue abnormalities, and functional plasticity, are broadening our insights into the effectiveness and working mechanisms of motor and cognitive rehabilitative treatments.

Areas covered

This narrative review of selected studies discusses findings derived from the application of advanced MRI techniques to evaluate the structural and functional neuroplasticity modifications underlying the effects of motor and cognitive rehabilitative treatments in people with MS (PwMS). Current applications as outcome measures in longitudinal trials and observational studies, their interpretation, and possible pitfalls and limitations in their use are covered. Finally, we examine how the use of these techniques could evolve in the future to improve the monitoring of motor and cognitive rehabilitative treatments.

Expert commentary

Despite substantial variability in study design and participant characteristics across rehabilitative studies for PwMS, improvements in motor and cognitive functions accompanied by structural and functional brain modifications induced by rehabilitation can be observed. However, substantial refinement of rehabilitation strategies is still needed. Future studies in this field should strive to implement standardized methodologies for MRI acquisition and processing, possibly integrating multimodal measures. This will help identify relevant markers of treatment response in PwMS, thereby improving the use of rehabilitative interventions at the individual level. The combination of motor and cognitive strategies, longer treatment periods, and adequate follow-up assessments will contribute to enhancing the quality of evidence supporting their routine use.

Data availability

Not applicable.


Mistri D et al (2023) Cognitive phenotypes in multiple sclerosis: mapping the spectrum of impairment. J Neurol 2023:1

Frndak SE et al (2015) Disclosure of disease status among employed multiple sclerosis patients: association with negative work events and accommodations. Mult Scler J 21(2):225–234

Strober L et al (2014) Unemployment in multiple sclerosis (MS): utility of the MS functional composite and cognitive testing. Mult Scler J 20(1):112–115

Yael G, Nancy C, John D (2019) Money management in multiple sclerosis: the role of cognitive, motor, and affective factors. Front Neurol 10:1

Amato M et al (2018) Cognitive assessment in multiple sclerosis—an Italian consensus. Neurol Sci 39(8):1317–1324

Kalb R et al (2018) Recommendations for cognitive screening and management in multiple sclerosis care. Mult Scler J 24(13):1665–1680

Amato MP et al (2013) Treatment of cognitive impairment in multiple sclerosis: position paper. J Neurol 260(6):1452–1468

Fink F et al (2010) Efficacy of an executive function intervention programme in MS: a placebo-controlled and pseudo-randomized trial. Mult Scler J 16(9):1148–1151

De Giglio L et al (2016) Corpus callosum microstructural changes associated with Kawashima Nintendo Brain Training in patients with multiple sclerosis. J Neurol Sci 370:211–213

Ehling R et al (2019) Second language learning induces grey matter volume increase in people with multiple sclerosis. PLoS ONE 14(12):1

Ernst A et al (2016) Functional and structural cerebral changes in key brain regions after a facilitation programme for episodic future thought in relapsing-remitting multiple sclerosis patients. Brain Cogn 105:34–45

Ernst A et al (2018) Benefits from an autobiographical memory facilitation programme in relapsing-remitting multiple sclerosis patients: a clinical and neuroimaging study. Neuropsychol Rehabil 28(7):1110–1130

Frieske J et al (2022) Can cognitive training reignite compensatory mechanisms in advanced multiple sclerosis patients? An explorative morphological network approach. Neuroscience 495:86–96

Bonavita S et al (2015) Computer-aided cognitive rehabilitation improves cognitive performances and induces brain functional connectivity changes in relapsing remitting multiple sclerosis patients: an exploratory study. J Neurol 262(1):91–100

Filippi M et al (2012) Multiple sclerosis: effects of cognitive rehabilitation on structural and functional MR imaging measures—an explorative study. Radiology 262(3):932–940

Fuchs TA et al (2019) Response heterogeneity to home-based restorative cognitive rehabilitation in multiple sclerosis: an exploratory study. Mult Scler Relat Disord 34:103–111

Fuchs TA et al (2020) Functional connectivity and structural disruption in the default-mode network predicts cognitive rehabilitation outcomes in multiple sclerosis. J Neuroimaging 30(4):523–530

Penner I-K, Kappos L, Opwis K (2005) Induced changes in brain activation using a computerized attention training in patients with multiple sclerosis (MS)

Cerasa A et al (2013) Computer-assisted cognitive rehabilitation of attention deficits for multiple sclerosis: a randomized trial with fMRI correlates. Neurorehabil Neural Repair 27(4):284–295

Sastre-Garriga J et al (2011) A functional magnetic resonance proof of concept pilot trial of cognitive rehabilitation in multiple sclerosis. Mult Scler J 17(4):457–467

Pareto D et al (2018) Classic block design “pseudo”-resting-state fMRI changes after a neurorehabilitation program in patients with multiple sclerosis. J Neuroimaging 28(3):313–319

Parisi L et al (2014) Cognitive rehabilitation correlates with the functional connectivity of the anterior cingulate cortex in patients with multiple sclerosis. Brain Imaging Behav 8(3):387–393

Parisi L et al (2014) Changes of brain resting state functional connectivity predict the persistence of cognitive rehabilitation effects in patients with multiple sclerosis. Mult Scler J 20(6):686–694

De Giglio L et al (2016) Multiple sclerosis: changes in thalamic resting-state functional connectivity induced by a homebased cognitive rehabilitation program. Radiology 280(1):202–211

Campbell J et al (2016) A randomised controlled trial of efficacy of cognitive rehabilitation in multiple sclerosis: a cognitive, behavioural, and MRI study. Neural Plast 2016:1

Bonzano L et al (2020) Brain activity pattern changes after adaptive working memory training in multiple sclerosis. Brain Imaging Behav 14(1):142–154

Hubacher M et al (2015) Cognitive rehabilitation of working memory in juvenile multiple sclerosis-effects on cognitive functioning, functional MRI and network related connectivity. Restor Neurol Neurosci 33(5):713–725

Hubacher M et al (2015) Case-based fMRI analysis after cognitive rehabilitation in MS: a novel approach. Front Neurol 6:1.

Chiaravalloti ND et al (2012) Increased cerebral activation after behavioral treatment for memory deficits in MS. J Neurol 259(7):1337–1346

Huiskamp M et al (2016) A pilot study of changes in functional brain activity during a working memory task after mSMT treatment: the MEMREHAB trial. Mult Scler Relat Disord 7:76–82

Leavitt VM et al (2014) Increased functional connectivity within memory networks following memory rehabilitation in multiple sclerosis. Brain Imaging Behav 8(3):394–402

Dobryakova E et al (2014) A pilot study examining functional brain activity 6 months after memory retraining in MS: the MEMREHAB trial. Brain Imaging Behav 8(3):403–406

Ernst A et al (2012) Induced brain plasticity after a facilitation programme for autobiographical memory in multiple sclerosis: a preliminary study. Mult Scler Int 2012:820240

Prouskas SE et al (2022) A randomized trial predicting response to cognitive rehabilitation in multiple sclerosis: Is there a window of opportunity? Mult Scler J 28(13):2124–2136

Ball K et al (2002) Effects of cognitive training interventions with older adults—a randomized controlled trial. JAMA J Am Med Assoc 288(18):2271–2281

Rebok GW et al (2014) Ten-year effects of the advanced cognitive training for independent and vital elderly cognitive training trial on cognition and everyday functioning in older adults. J Am Geriatr Soc 62(1):16–24

Liberatore G et al (2014) Predictors of effectiveness of multidisciplinary rehabilitation treatment on motor dysfunction in multiple sclerosis. Mult Scler J 20(7):862–870

Rademacher A et al (2021) Do baseline cognitive status, participant specific characteristics and EDSS impact changes of cognitive performance following aerobic exercise intervention in multiple sclerosis? Multiple Sclerosis Related Disord 51:102905

Ziccardi S et al (2023) Cognitive phenotypes predict response to restorative cognitive rehabilitation in multiple sclerosis. Mult Scler 2023:13524585231208331

Taylor LA et al (2023) Understanding who benefits most from cognitive rehabilitation for multiple sclerosis: a secondary data analysis. Mult Scler 29(11–12):1482–1492

Buscarinu MC et al (2022) Late-onset MS: disease course and safety-efficacy of DMTS. Front Neurol 13:829331

Kanzler CM et al (2022) Personalized prediction of rehabilitation outcomes in multiple sclerosis: a proof-of-concept using clinical data, digital health metrics, and machine learning. Med Biol Eng Comput 60(1):249–261

Feinstein A et al (2023) Cognitive rehabilitation and aerobic exercise for cognitive impairment in people with progressive multiple sclerosis (CogEx): a randomised, blinded, sham-controlled trial. Lancet Neurol 22(10):912–924

Argento O et al (2023) Motor, cognitive, and combined rehabilitation approaches on MS patients’ cognitive impairment. Neurol Sci 44(3):1109–1118

Barbarulo AM et al (2018) Integrated cognitive and neuromotor rehabilitation in multiple sclerosis: a pragmatic study. Front Behav Neurosci 12:196


Author information

Authors and affiliations.

Neuroimaging Research Unit, Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy

Maria A. Rocca, Francesco Romanò, Nicolò Tedone & Massimo Filippi

Neurology Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy

Maria A. Rocca & Massimo Filippi

Neurorehabilitation Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy

Massimo Filippi

Neurophysiology Service, IRCCS San Raffaele Scientific Institute, Milan, Italy

Vita-Salute San Raffaele University, Milan, Italy


Corresponding author

Correspondence to Maria A. Rocca.

Ethics declarations

Conflicts of interest.

All authors declare that they have no conflicts of interest.

Ethical standards

Rights and permissions.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Rocca, M.A., Romanò, F., Tedone, N. et al. Advanced neuroimaging techniques to explore the effects of motor and cognitive rehabilitation in multiple sclerosis. J Neurol (2024). https://doi.org/10.1007/s00415-024-12395-0


Received: 31 January 2024

Revised: 17 April 2024

Accepted: 17 April 2024

Published: 01 May 2024

DOI: https://doi.org/10.1007/s00415-024-12395-0


  • Multiple sclerosis
  • Rehabilitation
  • Review article
  • Open access
  • Published: 10 May 2024

Metformin mitigates dementia risk among individuals with type 2 diabetes

  • Nicholas Aderinto 1 ,
  • Gbolahan Olatunji 2 ,
  • Emmanuel Kokori 2 ,
  • Praise Fawehinmi 3 ,
  • Abdulrahmon Moradeyo 1 ,
  • Stephen Igwe 2 ,
  • Rebecca Ojabo 4 ,
  • Badrudeen Olalekan Alabi 2 ,
  • Emmanuel Chuka Okafor 4 ,
  • Damilola Ologbe 5 ,
  • Ayobami Olafimihan 6 &
  • David B. Olawade 7  

Clinical Diabetes and Endocrinology volume 10, Article number: 10 (2024)


This mini-narrative review explores the relationship between diabetes and dementia, focusing on the potential mitigating role of metformin in reducing cognitive decline among individuals with type 2 diabetes. Diabetes-related dementia is characterised by the interplay of factors such as glycemic control, diabetic complications, and lifestyle influences. This review emphasises the significance of comprehensive diabetes management in addressing the heightened risk of dementia in this population. Methodologically, the review synthesises evidence from 23 studies retrieved through searches on PubMed, Embase, Google Scholar, and Scopus. Current evidence suggests a predominantly positive association between metformin use and a reduced risk of dementia in individuals with diabetes. However, the review also highlights the complexity of these outcomes, with some studies reporting divergent results. These discrepancies underscore the importance of exploring dose–response relationships, long-term effects, and demographic diversity to unravel the complexities of metformin's impact on cognitive health. Limitations in the existing body of research, including methodological disparities and confounding variables, necessitate refined approaches in future studies. Large-scale prospective longitudinal studies and randomised controlled trials focusing specifically on cognitive effects are recommended. Propensity score matching and exploration of molecular mechanisms can enhance the validity of findings in clinical practice. From a clinical perspective, metformin may serve as a potential adjunctive therapy for individuals with diabetes at risk of cognitive decline.

Introduction

Diabetes-related dementia is a significant concern due to the increased risk of dementia in individuals with type 2 diabetes [ 1 ]. The relationship between diabetes and dementia is complex and multifaceted [ 1 ]. Studies have shown that both low and high HbA1C levels are associated with an increased risk of dementia in individuals with diabetes, indicating a non-linear relationship [ 1 , 2 ]. Additionally, uncontrolled diabetes has been linked to an elevated risk of Alzheimer's disease, highlighting the importance of glycemic control in mitigating dementia risk [ 3 ]. Furthermore, severe diabetic retinal disease has been identified as a potential risk factor for dementia in individuals with type 2 diabetes, emphasising the need for comprehensive management of diabetic complications to reduce the likelihood of developing dementia [ 4 ].

The impact of lifestyle factors on diabetes-related dementia has also been investigated, with studies suggesting that a combination of healthy lifestyle factors is associated with a reduced risk of dementia in patients with type 2 diabetes [ 5 ]. However, the aetiology of diabetes-related dementia remains unclear, and it has been proposed that dementia in diabetic patients should be regarded as an independent disease, distinct from Alzheimer's disease and vascular dementia, due to its unique pathophysiological characteristics related to diabetes [ 6 , 7 , 8 ].

The investigation into metformin as a potential mitigating agent for dementia risk among individuals with diabetes is grounded in the expanding body of evidence highlighting its plausible neuroprotective role [ 9 ]. Metformin's potential as a neuroprotective agent has been linked to its ability to lower mortality and age-related diseases independently of its impact on diabetes control [ 10 , 11 , 12 , 13 , 14 ]. Empirical evidence suggests that metformin might mitigate dementia risk by reducing oxidative stress, inflammation, and apoptosis and countering the deleterious effects of advanced glycosylation end products produced during hyperglycemia [ 10 , 11 ]. These collective findings show metformin's potential not only in diabetes management but also in addressing neurological disorders. This study aims to review the current evidence for metformin as a mitigating agent for dementia risk among individuals with diabetes.

Methodology

To conduct this narrative review, we searched PubMed, Embase, Google Scholar, and Scopus (see Table 1). We formulated a database search strategy based on keywords such as "diabetes," "diabetes mellitus," "diabetes mellitus, Type 2," "metformin," "biguanides," "metformin benefits," "anti-diabetic medications," "memory," "cognition," "cognitive-impairment," "amnestic mild cognitive impairment," "Alzheimer's disease," "Parkinson's disease," and "dementia." We also used other terms selected from the existing literature and/or obtained from related bibliographies, combined using Boolean operators as follows: ((dementia) OR (cognitive-impairment) OR (cognitive function) OR (neurodegenerative diseases)) AND ((metformin) OR (anti-diabetic drugs)). Furthermore, we manually searched relevant articles cited within the retrieved studies to avoid omitting important research articles.
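For transparency, the Boolean combination above can also be expressed programmatically. The sketch below is purely illustrative and is not the authors' workflow: it assembles the stated query and submits it to NCBI's public E-utilities esearch endpoint; the `retmax` limit and the restriction to PubMed are assumptions.

```python
import requests

# Illustrative reconstruction of the review's Boolean search string.
outcome_terms = ["dementia", "cognitive-impairment", "cognitive function",
                 "neurodegenerative diseases"]
exposure_terms = ["metformin", "anti-diabetic drugs"]

def or_group(terms):
    """Wrap each term in parentheses and join them with OR."""
    return "(" + " OR ".join(f"({t})" for t in terms) + ")"

query = f"{or_group(outcome_terms)} AND {or_group(exposure_terms)}"
print(query)

# Submit the query to PubMed via the public E-utilities esearch endpoint.
response = requests.get(
    "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
    params={"db": "pubmed", "term": query, "retmode": "json", "retmax": 200},
    timeout=30,
)
response.raise_for_status()
id_list = response.json()["esearchresult"]["idlist"]
print(f"Retrieved {len(id_list)} PubMed IDs (first page of the result set)")
```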

We only considered articles that (a) presented results in English, (b) had full text available, and (c) specifically assessed dementia risk in patients with diabetes who were on metformin therapy. Conversely, we excluded (a) studies with missing data, (b) articles that did not focus on metformin use in type 2 diabetes mellitus, (c) studies performed on patients with significant neurological or psychiatric disease or cancer, and (d) studies performed in vitro or in animal models. We limited the study scope to randomised controlled trials, retrospective cohort studies, prospective observational studies, comparator studies, and case–control studies but excluded books, letters, editorials, conferences, and commentaries.

During the data extraction process, we evaluated the study characteristics such as the publication type, year, study design, study focus, sample size, and the number of positive and negative outcomes. It is important to note that we focused on the probable benefit of metformin in mitigating dementia risk among individuals with diabetes despite the controversial nature of the topic.
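The extraction fields listed above can be made explicit with a small record structure; this is a hypothetical sketch of an extraction sheet row, not the form the authors actually used, and the example values are placeholders rather than extracted data.

```python
from dataclasses import dataclass

@dataclass
class ExtractedStudy:
    """One row of a hypothetical data-extraction sheet for this review."""
    first_author: str
    publication_type: str       # e.g. "retrospective cohort", "RCT"
    year: int
    study_design: str
    study_focus: str
    sample_size: int
    positive_outcomes: int      # findings favouring a reduced dementia risk
    negative_outcomes: int      # findings of no benefit or of increased risk

# Placeholder entry showing how a study would be recorded (not real data).
row = ExtractedStudy(
    first_author="Example A",
    publication_type="retrospective cohort",
    year=2020,
    study_design="cohort",
    study_focus="metformin use and incident dementia",
    sample_size=10_000,
    positive_outcomes=1,
    negative_outcomes=0,
)
```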

Current evidence in existing literature

Our review identified 23 studies, with sample sizes ranging from 305 to 446,105 participants (see Table 2). A majority of these studies, 17 out of 23 [10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27], reported positive outcomes regarding the relationship between metformin use and dementia risk in individuals with diabetes. Metformin is the preferred first-line drug for the treatment of type 2 diabetes mellitus [9]. It can be safely administered with other antidiabetic drugs and has been demonstrated to reduce insulin resistance and improve glycaemic control [9]. However, a review of clinical trials paints a mixed picture of the connection between metformin use and the incidence of dementia among patients with diabetes.

The findings of observational studies examining the possible link between metformin and dementia risk have been inconclusive. Eleven (57.9%) of the 19 analysed publications had positive results, suggesting that metformin may help lower the risk of dementia [10, 11, 13, 14, 18, 19, 20, 21, 22, 23, 24, 27]. Five articles (26.3%) reported an elevated risk [25, 26, 28, 29, 30], whereas three (15.8%) reported a decreased risk only under specific conditions [15, 16, 17]. A retrospective cohort study by Chin-Hsiao Tseng indicated a lower risk when metformin was used with other medications, such as acarbose and pioglitazone [18]. At the end of a 6-month follow-up study, a significant difference in cognitive performance compared with baseline was observed in frail women treated with extended-release metformin (p = 0.007) [27]. Huang et al. highlighted the protective benefits of metformin when used at a low dose, but reported that higher doses and higher treatment intensities showed no protective role against dementia [16]. However, cohort studies by Yi-Chun Kuan et al. showed mixed results, raising questions by linking long-term metformin use to a higher risk of dementia from all causes, including vascular dementia and Alzheimer's disease [28, 32]. Scherrer et al. showed that the effects of metformin vary in different subpopulations, indicating a lower risk in some individuals (> 50 years) [21].
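The percentages quoted for the 19 observational publications follow directly from the reported counts; a minimal check, assuming the counts of 11, 5, and 3 as stated above:

```python
# Reproduce the quoted proportions among the 19 observational publications.
counts = {"decreased risk": 11, "elevated risk": 5, "conditional decrease": 3}
total = sum(counts.values())  # 19
for label, n in counts.items():
    print(f"{label}: {n}/{total} = {100 * n / total:.1f}%")
# -> decreased risk: 57.9%, elevated risk: 26.3%, conditional decrease: 15.8%
```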

Furthermore, the results from I-Shiang Tzeng raise questions about whether metformin and DPP-4 inhibitor combination therapy alleviates the risk of dementia [26]. These varied results highlight the complex nature of the connection between dementia and metformin use and underscore the need for additional studies, especially those examining dose–response interactions, long-term effects, and demographic diversity, to offer a more thorough understanding. Among the notable findings is a study conducted by Chin-Hsiao Tseng in 2019, which indicated a reduction in the risk of dementia associated with metformin, particularly in the female population [18]. Furthermore, the use of a combination of three drugs (metformin, acarbose, pioglitazone) was associated with the lowest risk of dementia, as highlighted in the same study [18]. Additionally, a study by Yonghwan Kim et al. demonstrated a dose–response relationship, revealing that metformin use in an elderly population with diabetes mellitus contributed to a reduction in dementia risk [19]. Similarly, a retrospective cohort study by Ariela R. Orkaby et al. in 2017 suggested that metformin was associated with a lower risk of subsequent dementia compared with sulfonylurea use in veterans aged 75 years and older [13]. Notably, a lower risk was also observed in a subset of younger veterans who maintained HbA1c values ≥ 7% and exhibited good renal function [13]. In the 2015 study by Kwang-pil Ko et al., a comprehensive evaluation of metformin's efficacy in modulating physical and mental profiles was undertaken, revealing favourable outcomes [22]. Specifically, within the age group of 65 to 74 years, metformin demonstrated a statistically significant association with a reduced risk of dementia across various racial categories. However, a distinctive pattern emerged among patients aged 75 years and older, as metformin exhibited no statistically significant association with dementia within this older demographic [23].

Theoretically, antidiabetic drugs designed to ameliorate insulin resistance within the brain hold promise in preventing Alzheimer's disease or dementia [18, 31]. In a study involving 17,200 new users of metformin, a lower risk of dementia was reported in a subset of younger veterans exhibiting HbA1c values ≥ 7%, those with good renal function, and individuals of white ethnicity [13]. In another study of patients with T2DM, compared with no medication, sulfonylureas alone reduced the HR for dementia to 0.85 (0.71–1.01) and metformin alone to 0.76 (0.58–0.98), while with combined oral therapy the HR was 0.65 (0.56–0.74) [20]. Because the adjustments included cerebrovascular diseases, non-stroke-related dementias were found to be decreased in diabetes with sulfonylurea and metformin therapy. T2DM itself increases the risk of dementia more than two-fold.
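As a point of orientation (this arithmetic is ours, not reported by the cited study), a hazard ratio below 1 corresponds to a relative hazard reduction of 1 − HR; applied to the point estimates above, and setting aside the confidence intervals (which for sulfonylureas alone cross 1):

```latex
\mathrm{RRR} = 1 - \mathrm{HR}:\qquad
1 - 0.85 = 0.15 \ \text{(sulfonylureas alone)},\qquad
1 - 0.76 = 0.24 \ \text{(metformin alone)},\qquad
1 - 0.65 = 0.35 \ \text{(combined oral therapy)}.
```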

Elevated blood glucose levels pose a potential threat to cerebral function, contributing to an elevated risk of dementia in individuals with diabetes [ 19 , 31 ]. The link between diabetes and dementia is likely multifactorial, involving mechanisms such as inflammation, oxidative stress, atherosclerosis, amyloid-β deposition, brain insulin resistance accompanied by hyperinsulinemia, advanced glycation end-products (AGEs), and dysregulation of lipid metabolism [ 20 , 33 ]. Metformin, recognised as the primary first-line therapy for type 2 diabetes mellitus, operates by curbing hepatic gluconeogenesis and augmenting muscular glucose uptake by activating 5'-adenosine monophosphate-activated protein kinase (AMPK) [ 21 ]. Beyond its glucose-lowering effects, metformin has demonstrated additional benefits in individuals with type 2 diabetes, including reducing the risk of atherosclerotic events, protection against certain cancers, and an anti-ageing effect [ 20 ].

The potential neuroprotective effects of metformin are suggested to stem from its capacity to inhibit inflammatory responses and enhance cognitive function [ 16 ]. Apolipoprotein E (APOE), a crucial protein in lipid transport and brain injury repair, is implicated in Alzheimer's disease risk [ 21 ]. Specific APOE gene polymorphisms, particularly the ε4 allele, elevate the risk of AD, while the ε2 allele is associated with reduced risk [ 10 ]. The APOE ε4 allele is also linked to an increased risk of cerebral amyloid angiopathy and age-related cognitive decline. A recent study hinted at an association between metformin use and a faster decline in delayed memory among carriers of the APOE ε4 allele, prompting the need for further research to elucidate the potential influence of APOE ε4 genotype on the therapeutic effects of metformin [ 29 ].

Limitations and future directions

Existing studies on metformin’s involvement in reducing dementia risk in patients with diabetes have significant limitations that should be considered. First, many studies have methodological variances, such as differences in study design, sample size, and outcome measures. This variation makes it difficult to obtain standardised results and to draw direct comparisons between investigations. Furthermore, the heterogeneity within the examined groups, which includes age and diabetes duration, complicates interpretation and restricts the generalizability of the findings. Most observational studies failed to address bias, or did not address it clearly, which weakens the evidence. Another significant issue is the possibility of confounding variables influencing the outcomes. Factors such as genetic predisposition, lifestyle decisions, and concurrent pharmaceutical use may all impact cognitive performance independently of metformin, making it difficult to attribute observed effects to the medication alone. Furthermore, contradictory findings are exacerbated by differences in how dementia and cognitive decline are defined across studies.

Future studies should target certain areas to address these constraints and to increase understanding. Large-scale, well-designed, prospective longitudinal studies with long follow-up periods can provide stronger data and aid in determining causation. In addition, randomised controlled trials (RCTs) focusing only on the cognitive effects of metformin would provide more control over confounding factors. Subgroup analyses within the diabetic population, considering variables such as age, sex, and diabetes management details, would help better understand the influence of metformin on various patient groups. Applying propensity score matching, or at the very least matching for age, sex, and health status, would improve data validity by lowering baseline variability; where possible, studies should also investigate the relationship between metformin use, vitamin B-12 levels, and dementia. To inform clinical practice, it is critical to investigate dose–response relationships and optimal dosages for potential cognitive benefits.
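To illustrate the propensity score matching suggested above, here is a minimal sketch assuming a tabular dataset with baseline covariates (e.g. age, sex, health status) and a binary metformin-use indicator; the function and variable names are ours, and a real analysis would add caliper checks, balance diagnostics, and matching without replacement.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def propensity_match(covariates, treated):
    """1:1 nearest-neighbour matching on the estimated propensity score.

    covariates : (n, k) array of baseline covariates (age, sex, health status, ...)
    treated    : boolean array of length n, True where metformin was used
    Returns a list of (treated_index, matched_control_index) pairs.
    """
    covariates = np.asarray(covariates, dtype=float)
    treated = np.asarray(treated, dtype=bool)

    # Step 1: estimate the propensity score P(metformin use | covariates).
    model = LogisticRegression(max_iter=1000).fit(covariates, treated)
    score = model.predict_proba(covariates)[:, 1]

    treated_idx = np.where(treated)[0]
    control_idx = np.where(~treated)[0]

    # Step 2: for each treated patient, find the control with the closest score
    # (matching with replacement, for simplicity).
    nn = NearestNeighbors(n_neighbors=1).fit(score[control_idx].reshape(-1, 1))
    _, position = nn.kneighbors(score[treated_idx].reshape(-1, 1))
    return list(zip(treated_idx, control_idx[position.ravel()]))
```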

Furthermore, a thorough examination of the molecular mechanisms underlying the influence of metformin on cognitive performance is required. This knowledge can guide focused therapies and identify the individuals most likely to benefit from metformin therapy. Future research should prioritise uniform study designs, investigate specific demographic subgroups, and explore molecular causes to improve the reliability and usefulness of the findings in clinical practice.

Implications for clinical practice

Clinically, the favourable results observed in multiple studies imply that metformin may be a feasible alternative for people with diabetes, particularly for those at risk of cognitive decline (see Fig. 1).

Figure 1. Metformin in dementia risk in type 2 diabetes

Healthcare practitioners should inform patients about the potential cognitive benefits in addition to glycemic control. However, care is advised owing to inconsistent findings and potential issues, such as variation in metformin outcomes, an increased risk of vitamin B-12 insufficiency, and the elevated risk identified with certain drug combinations, emphasising the importance of tailored treatment programs and regular cognitive monitoring. A multidisciplinary approach that combines endocrinologists, neurologists, and senior experts is required to address the complicated connection between diabetes control and cognitive health. Senior experts such as diabetologists are key in tailoring diabetes treatment plans to achieve optimal glycemic control [34]. In addition, it is also essential to involve psychologists and occupational therapists. These professionals play pivotal roles in the identification, comprehensive assessment, and rehabilitation processes associated with dementia [35]. They collaborate closely to develop tailored interventions that address cognitive deficits and consider the individual's emotional and functional aspects [36]. This collaborative effort ensures a more personalised approach to patient care.

At the public health level, awareness programs should be launched to educate diabetic patients about the potential cognitive consequences of metformin and the significance of making informed decisions. Comprehensive studies investigating dose–response connections, long-term consequences, and population-specific effects should receive research funding. Public health guidelines must be revised to reflect increasing evidence, giving healthcare practitioners clear advice on using metformin in diabetes management taking both glycaemic control and cognitive outcomes into account. Policymakers should consider these findings when developing diabetes management policies and public health initiatives to ensure that possible cognitive effects are integrated into broader healthcare programs.

Limitations and strengths of review

The review provides clear implications for clinical practice, suggesting that metformin may be a feasible adjunctive therapy for individuals with diabetes at risk of cognitive decline. The multidisciplinary approach recommended for navigating the complex relationship between diabetes control and cognitive health enhances the practicality of the review's recommendations. Also, the review identifies varied outcomes across studies, emphasising the complexity of the relationship between metformin use and dementia risk. This acknowledgement of diverse findings encourages a more cautious interpretation and highlights the need for further research. However, the included studies exhibit methodological disparities, including differences in study design, sample size, and outcome measures. This variation makes it challenging to obtain standardised results and directly compare findings between investigations.

Conclusion

The body of evidence exploring metformin's role in mitigating dementia risk among individuals with diabetes presents a complex yet promising landscape. The interplay between diabetes and dementia underscores the importance of glycemic control and comprehensive management of diabetic complications in reducing the likelihood of cognitive decline. This mini-narrative review reveals a spectrum of outcomes regarding the potential connection between metformin use and dementia risk in patients with diabetes. While a majority of studies suggest a positive association between metformin use and a reduced risk of dementia, the complex nature of these findings prompts a cautious interpretation. Dose–response interactions, long-term effects, and demographic diversity emerge as critical factors requiring further investigation to understand metformin's impact on cognitive health. Noteworthy variations in outcomes across studies highlight the need for standardised methodologies and robust study designs in future research endeavours.

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.

Abbreviations

AGEs: Advanced Glycosylation End Products

AD: Alzheimer's Disease

AMPK: 5'-Adenosine Monophosphate-Activated Protein Kinase

APOE: Apolipoprotein E

B-12: Vitamin B-12

ε2: Epsilon 2 (APOE gene polymorphism)

ε4: Epsilon 4 (APOE gene polymorphism)

HbA1c: Hemoglobin A1c

HR: Hazard Ratio

RCTs: Randomized Controlled Trials

T2DM: Type 2 Diabetes Mellitus

Crane PK, Walker R, Hubbard RA, Li G, Nathan DM, Zheng H, et al. Glucose levels and risk of dementia. N Engl J Med. 2013;369:540–8. https://doi.org/10.1056/NEJMoa1215740 .


Geijselaers SLC, Sep SJS, Stehouwer CDA, Biessels GJ. Glucose regulation, cognition, and brain MRI in type 2 diabetes: a systematic review. Lancet Diabetes Endocrinol. 2015;3:75–89. https://doi.org/10.1016/S2213-8587(14)70148-2 .


Xu WL, von Strauss E, Qiu CX, Winblad B, Fratiglioni L. Uncontrolled diabetes increases the risk of Alzheimer’s disease: a population-based cohort study. Diabetologia. 2009;52:1031–9. https://doi.org/10.1007/s00125-009-1323-x .

Exalto LG, Biessels GJ, Karter AJ, Huang ES, Quesenberry CP, Whitmer RA. Severe Diabetic retinal disease and dementia risk in type 2 diabetes. J Alzheimer’s Dis. 2014;42:S109–17. https://doi.org/10.3233/JAD-132570 .


Wang B, Wang N, Sun Y, Tan X, Zhang J, Lu Y. Association of combined healthy lifestyle factors with incident dementia in patients with type 2 diabetes. Neurology. 2022;99:E2336–45. https://doi.org/10.1212/WNL.0000000000201231 .

Tsugawa A, Ogawa Y, Takenoshita N, Kaneko Y, Hatanaka H, Jaime E, et al. Decreased muscle strength and quality in diabetes-related dementia. Dement Geriatr Cogn Dis Extra. 2017;7:454–62. https://doi.org/10.1159/000485177 .


Hanyu H, Hirose D, Fukasawa R, Hatanaka H, Namioka N, Sakurai H. Guidelines for the clinical diagnosis of diabetes mellitus-related dementia. J Am Geriatr Soc. 2015;63:1721–3. https://doi.org/10.1111/jgs.13581 .


Mayeda ER, Karter AJ, Huang ES, Moffet HH, Haan MN, Whitmer RA. Racial/ethnic differences in dementia risk among older type 2 diabetic patients: the diabetes and aging study. Diabetes Care. 2014;37:1009–15. https://doi.org/10.2337/dc13-0215 .

Campbell JM, Stephenson MD, de Courten B, Chapman I, Bellman SM, Aromataris E. Metformin use associated with reduced risk of dementia in patients with diabetes: a systematic review and meta-analysis. J Alzheimer’s Dis. 2018;65:1225–36. https://doi.org/10.3233/JAD-180263 .

Wium-Andersen IK, Osler M, Jørgensen MB, Rungby J, Wium-Andersen MK. Antidiabetic medication and risk of dementia in patients with type 2 diabetes: a nested case–control study. Eur J Endocrinol. 2019;181:499–507. https://doi.org/10.1530/EJE-19-0259 .

Hsu CC, Wahlqvist ML, Lee MS, Tsai HN. Incidence of dementia is increased in type 2 diabetes and reduced by the use of sulfonylureas and metformin. J Alzheimers Dis. 2011;24(3):485–93. https://doi.org/10.3233/JAD-2011-101524 .


Campbell JM, Bellman SM, Stephenson MD, Lisy K. Metformin reduces all-cause mortality and diseases of ageing independent of its effect on diabetes control: a systematic review and meta-analysis. Ageing Res Rev. 2017;40:31–44. https://doi.org/10.1016/j.arr.2017.08.003 .

Orkaby AR, Cho K, Cormack J, Gagnon DR, Driver JA. Metformin vs sulfonylurea use and risk of dementia in US veterans aged ≥65 years with diabetes. Neurology. 2017;89:1877–85. https://doi.org/10.1212/WNL.0000000000004586 .

Samaras K, Makkar S, Crawford JD, Kochan NA, Wen W, Draper B, et al. Metformin use is associated with slowed cognitive decline and reduced incident dementia in older adults with type 2 diabetes: the sydney memory and ageing study. Diabetes Care. 2020;43:2691–701. https://doi.org/10.2337/DC20-0892 .

Ng TP, Feng L, Yap KB, Lee TS, Tan CH, Winblad B. Long-term metformin usage and cognitive function among older adults with diabetes. J Alzheimer’s Dis. 2014;41:61–8. https://doi.org/10.3233/JAD-131901 .

Huang K-H, Tsai Y-F, Lee CB, Gau S-Y, Tsai T-H, Chung N-J, et al. The correlation between metformin use and incident dementia in patients with new-onset diabetes mellitus: a population-based study. J Pers Med. 2023;13:738. https://doi.org/10.3390/jpm13050738 .

Luchsinger JA, Perez T, Chang H, Mehta P, Steffener J, Pradabhan G, et al. Metformin in amnestic mild cognitive impairment: results of a pilot randomized placebo controlled clinical trial. J Alzheimer’s Dis. 2016;51:501–14. https://doi.org/10.3233/JAD-150493 .

Tseng C-H. Dementia risk in type 2 diabetes patients: acarbose use and its joint effects with metformin and pioglitazone. Aging Dis. 2020;11:658.

Kim Y, Kim H-S, Lee J, Kim Y-S, You H-S, Bae Y-J, et al. Metformin use in elderly population with diabetes reduced the risk of dementia in a dose-dependent manner, based on the Korean NHIS-HEALS cohort. Diabetes Res Clin Pract. 2020;170:108496. https://doi.org/10.1016/j.diabres.2020.108496 .

Chin-Hsiao T. Metformin and the Risk of Dementia in Type 2 Diabetes Patients. Aging Dis. 2019;10(1):37–48. https://doi.org/10.14336/AD.2017.1202 .

Scherrer JF, Salas J, Floyd JS, Farr SA, Morley JE, Dublin S. Metformin and sulfonylurea use and risk of incident dementia. Mayo Clin Proc. 2019;94:1444–56. https://doi.org/10.1016/j.mayocp.2019.01.004 .

Ko K-P, Ma SH, Yang J-J, Hwang Y, Ahn C, Cho Y-M, et al. Metformin intervention in obese non-diabetic patients with breast cancer: phase II randomized, double-blind, placebo-controlled trial. Breast Cancer Res Treat. 2015;153:361–70. https://doi.org/10.1007/s10549-015-3519-8 .

Scherrer JF, Morley JE, Salas J, Floyd JS, Farr SA, Dublin S. Association between metformin initiation and incident dementia among African American and white veterans health administration patients. Ann Fam Med. 2019;17:352–62. https://doi.org/10.1370/afm.2415 .

Newby D, Linden AB, Fernandes M, Molero Y, Winchester L, Sproviero W, Ghose U, Li QS, Launer LJ, Duijn CMV, Nevado-Holgado AJ. Comparative effect of metformin versus sulfonylureas with dementia and Parkinson's disease risk in US patients over 50 with type 2 diabetes mellitus. BMJ Open Diabetes Res Care. 2022;10(5):e003036. https://doi.org/10.1136/bmjdrc-2022-003036 .

de Jager J, Kooy A, Lehert P, Wulffele MG, van der Kolk J, Bets D, et al. Long term treatment with metformin in patients with type 2 diabetes and risk of vitamin B-12 deficiency: randomised placebo controlled trial. BMJ. 2010;340:c2181–c2181. https://doi.org/10.1136/bmj.c2181 .

Tzeng I-S, Hsieh T-H. Collocation of metformin and dipeptidyl peptidase-4 inhibitor is associated with increased risk of diabetes-related vascular dementia: a single hospital study in Northern Taiwan. Expert Opin Investig Drugs. 2023;32:171–6. https://doi.org/10.1080/13543784.2023.2178417 .

Mone P, Martinelli G, Lucariello A, Leo AL, Marro A, De Gennaro S, Marzocco S, Moriello D, Frullone S, Cobellis L, Santulli G. Extended-release metformin improves cognitive impairment in frail older women with hypertension and diabetes: preliminary results from the LEOPARDESS Study. Cardiovasc Diabetol. 2023;22(1):94. https://doi.org/10.1186/s12933-023-01817-4 .

Kuan Y-C, Huang K-W, Lin C-L, Hu C-J, Kao C-H. Effects of metformin exposure on neurodegenerative diseases in elderly patients with type 2 diabetes mellitus. Prog Neuro-Psychopharmacol Biol Psychiatry. 2017;79:77–83. https://doi.org/10.1016/j.pnpbp.2017.06.002 .

Xue Y, Xie X. The association between metformin use and risk of developing severe dementia among ad patients with type 2 diabetes. Biomedicines. 2023;11:2935. https://doi.org/10.3390/biomedicines11112935 .

Luchsinger JA, Ma Y, Christophi CA, Florez H, Golden SH, Hazuda H, et al. Metformin, lifestyle intervention, and cognition in the diabetes prevention program outcomes study. Diabetes Care. 2017;40:958–65. https://doi.org/10.2337/dc16-2376 .

Zhou C, Peng B, Qin Z, Zhu W, Guo C. Metformin attenuates LPS-induced neuronal injury and cognitive impairments by blocking NF-κB pathway. BMC Neurosci. 2021;22(1):73. https://doi.org/10.1186/s12868-021-00678-5 .

Rabieipoor S, Zare M, Ettcheto M, Camins A, Javan M. Metformin restores cognitive dysfunction and histopathological deficits in an animal model of sporadic Alzheimer’s disease. Heliyon. 2023;9(7):e17873. https://doi.org/10.1016/j.heliyon.2023.e17873 .

Ji S, Wang L, Li L. Effect of metformin on short-term high-fat diet-induced weight gain and anxiety-like behavior and the gut microbiota. Front Endocrinol. 2019;10:704. https://doi.org/10.3389/fendo.2019.00704 .

Tong PC, Chan SC, Chan WB, Ho KK, Leung GT, Lo SH, Mak GY, Tse TS. Consensus statements from the diabetologists & endocrinologists alliance for the management of people with hypertension and type 2 diabetes mellitus. J Clin Med. 2023;12(10):3403. https://doi.org/10.3390/jcm12103403 .

Sridhar GR. On psychology and psychiatry in diabetes. Indian J Endocrinol Metab. 2020;24(5):387–95. https://doi.org/10.4103/ijem.IJEM_188_20 .

Aderinto N, Olatunji G, Abdulbasit M, Ashinze P, Faturoti O, Ajagbe A, Ukoaka B, Aboderin G. The impact of diabetes in cognitive impairment: a review of current evidence and prospects for future investigations. Medicine. 2023;102(43):e35557. https://doi.org/10.1097/MD.0000000000035557 .


Acknowledgements

No funding was received for this study.

Author information

Authors and affiliations.

Department of Medicine and Surgery, Ladoke Akintola University of Technology, Ogbomoso, Nigeria

Nicholas Aderinto & Abdulrahmon Moradeyo

Department of Medicine and Surgery, University of Ilorin, Ilorin, Nigeria

Gbolahan Olatunji, Emmanuel Kokori, Stephen Igwe & Badrudeen Olalekan Alabi

Southern Illinois University Edwardsville, Edwardsville, IL, USA

Praise Fawehinmi

Afe Babalola University, Ado-Ekiti, Ado, Nigeria

Rebecca Ojabo & Emmanuel Chuka Okafor

William Harvey Hospital, Ashford, UK

Damilola Ologbe

John H. Stroger Jr Hospital of Cook County, Chicago, IL, USA

Ayobami Olafimihan

Department of Allied and Public Health, School of Health, Sport and Bioscience, University of East London, London, UK

David B. Olawade


Contributions

NA conceptualised the study; all authors were involved in the literature review; GO and EK extracted the data from the reviewed studies; all authors wrote the first and final drafts. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Nicholas Aderinto.

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Aderinto, N., Olatunji, G., Kokori, E. et al. Metformin mitigates dementia risk among individuals with type 2 diabetes. Clin Diabetes Endocrinol 10, 10 (2024). https://doi.org/10.1186/s40842-024-00168-7


Received: 26 December 2023

Accepted: 18 January 2024

Published: 10 May 2024

DOI: https://doi.org/10.1186/s40842-024-00168-7


  • Diabetes Mellitus
  • Cognitive impairment



Original research article

Changes in left atrial function following two regimens of combined exercise training in patients with ischemic cardiomyopathy: a pilot study


  • 1 Department of Human Science and Promotion of Quality of Life, San Raffaele Open University, Rome, Italy
  • 2 Cardiology Rehabilitation Unit, IRCCS San Raffaele, Rome, Italy
  • 3 Division of Cardiology and Sports Medicine, Department of Clinical Sciences and Translational Medicine, University of Rome Tor Vergata, Rome, Italy
  • 4 Department of Wellbeing, Nutrition and Sport, Pegaso Open University, Naples, Italy

Purpose: Left atrial dysfunction has shown to play a prognostic role in patients with ischemic cardiomyopathy (ICM) and is becoming a therapeutic target for pharmacological and non-pharmacological interventions. The effects of exercise training on the atrial function in patients with ICM have been poorly investigated. In the present study, we assessed the effects of a 12-week combined training (CT) program on the left atrial function in patients with ICM.

Methods: We enrolled a total of 45 clinically stable patients and randomly assigned them to one of the following three groups: 15 to supervised CT with low-frequency sessions (twice per week) (CTLF); 15 to supervised CT with high-frequency sessions (thrice per week) (CTHF); and 15 to a control group following contemporary preventive exercise guidelines at home. At baseline and 12 weeks, all patients underwent a symptom-limited exercise test and echocardiography. The training included aerobic continuous exercise and resistance exercise. Analysis of variance (ANOVA) was used to compare within- and between-group changes.

Results: At 12 weeks, the CTLF and CTHF groups showed a similar increase in the duration of the ergometric test compared with the control (ANOVA p  < 0.001). The peak atrial longitudinal strain significantly increased in the CTHF group, while it was unchanged in the CTLF and control groups (ANOVA p  = 0.003). The peak atrial contraction strain presented a significant improvement in the CTHF group compared with the CTLF and control groups. The left ventricular global longitudinal strain significantly increased in both the CTHF and the CTLF groups compared with the control group (ANOVA p  = 0.017). The systolic blood pressure decreased in the CTHF and CTLF groups, while it was unchanged in the control group. There were no side effects causing the discontinuation of the training.

Conclusions: We demonstrated that a CT program effectively improved atrial function in patients with ICM in a dose–effect manner. This result can help with programming exercise training in this population.

Introduction

Ischemic cardiomyopathy (ICM) is characterized by a remodeling process that is triggered by an ischemic insult to the left ventricle (LV) that impairs LV relaxation, raises LV stiffness, and ultimately causes the increase of LV filling pressure ( 1 ). In many patients with ICM, the remodeling also extends to the left atrium (LA) because this chamber takes part in the LV filling process during the diastolic phase ( 2 , 3 ). The LA remodeling process is an electro-mechanical process that culminates in LA enlargement and can be associated with the onset of atrial fibrillation. According to the current model, structural changes in the LA are preceded by functional abnormalities of this chamber ( 4 ). The onset of LA functional abnormalities is considered the first sign of an increased LV filling pressure because it can be detected even before the other indices of diastolic dysfunction appear ( 5 ). A two-dimensional speckle-tracking analysis performed during an echocardiography examination can assess LA function by identifying the reservoir, conduit, and contraction phases ( 6 ). Several studies demonstrated that the reservoir strain, which is often expressed as the peak atrial longitudinal strain (PALS), is impaired in a wide spectrum of cardiovascular conditions, with the highest degree of dysfunction being observed in patients with heart failure (HF) with both reduced and preserved ejection fractions (EF) ( 7 ). The impairment of PALS has been associated with a reduced exercise tolerance ( 8 ), a greater risk of atrial fibrillation insurgency ( 9 ), and a poor outcome ( 10 , 11 ). Similarly to PALS, the peak atrial contraction strain (PACS), reflecting the booster phase of the LA, has also recently been shown to predict survival in patients with cardiovascular diseases ( 12 ). In light of the prognostic role of its component, LA dysfunction has recently become a therapeutic target. In this regard, some preliminary studies have provided evidences on the possibility of counteracting the LA remodeling process through the improvement of LA functional parameters in different clinical contexts. This goal has been achieved independently through either the administration of sacubitril–valsartan in patients with advanced HF with reduced ejection fraction (HFrEF) ( 13 ) or weight loss in diabetic obese subjects ( 14 ). Exercise training, mainly in the setting of supervised cardiac rehabilitation programs, exerts several favorable effects in patients with ICM, such as it reduces symptoms, enhances exercise tolerance, and ultimately improves the prognosis of these patients ( 15 , 16 ). Therefore, practicing regular exercise training is strongly recommended by the current guidelines for ICM ( 17 ). Among different exercise modalities, combined training (CT) is particularly attractive for patients with ICM compared with aerobic continuous exercise alone because it carries additional advantages on several parameters, including metabolism, muscle strength and bulk, and blood pressure (BP) variability ( 18 – 20 ). In patients with ICM, exercise training causes “central” cardiac effects that can help stabilize clinical conditions and improve prognosis ( 21 ). Although exercise training is an accepted strategy for attenuating LV remodeling in ICM ( 22 , 23 ), less attention has been given to the LA. Improvement of the LA reservoir and contraction strains has recently been observed in patients with HF and mildly reduced EF (HFmrEF) after 12 weeks of CT ( 24 ). 
However, it remains to be established whether CT has beneficial effects on LA function in patients with asymptomatic ICM who have not yet developed HF. The purpose of this study was to compare the changes in left atrial function produced by two regimens of CT vs. control in patients with ICM. The primary endpoint was the comparative changes in PALS in patients undergoing supervised CT programs vs. control. The secondary endpoint was to compare the changes in PALS produced by the two different regimens of CT.

The study enrolled 45 patients of both genders with a previous diagnosis of ICM who had never experienced signs and/or symptoms of heart failure. The patients were recruited if they were older than 45 years. The diagnosis of ICM was established if the subjects had one or more of the following clinical conditions: previous ST-elevation myocardial infarction (STEMI); non-ST-elevation myocardial infarction (NSTEMI)/unstable angina; percutaneous coronary intervention (PCI); and coronary artery bypass grafting (CABG). The subjects were enrolled if they were in stable clinical condition, were not hospitalized in the previous 6 months, and their pharmacological therapy had been unchanged in the previous 3 months. We implemented the following exclusion criteria: myocardial ischemia or threatening arrhythmias during the resting assessment or ergometric test; previous HF diagnosis; permanent atrial fibrillation; baseline blood pressure levels at rest exceeding 160/100 mmHg; severe heart valve diseases; hypertrophic cardiomyopathy diagnosis; anemia with hemoglobin levels below 10.5 g/dl; concomitant diagnosis of chronic respiratory disease with a documented FEV1 below 50%; and previous peripheral artery disease diagnosis with exercise-limiting claudication. Moreover, we excluded patients who had participated in training programs in the previous 6 months or who reported engaging in regular exercise spontaneously. The study complied with the Declaration of Helsinki and was approved by the local ethics committee of San Raffaele IRCCS (protocol number 18/2022). All patients provided written informed consent before participating in the study.

Study design

The study design is summarized in Figure 1. It was conceived as a pilot study with three parallel arms. The patients were randomly assigned on a 1:1:1 basis to one of the following groups: (1) the combined training high-frequency (CTHF) group; (2) the combined training low-frequency (CTLF) group; and (3) the control group. Each group was composed of 15 patients. The randomization code was developed by a computer random-number generator to select random permuted blocks. The patients belonging to the CTHF and CTLF groups performed supervised exercise training sessions in the rehabilitation facility of San Raffaele IRCCS in Rome, while patients belonging to the control group were discharged and given instructions to perform exercises at home according to the guideline recommendations (16). At baseline, the patients underwent a preliminary visit, during which their medical histories and anthropometric parameters were collected. The body mass index (BMI) was calculated as body weight in kilograms divided by the square of height in meters (kg/m²). The patients who met the inclusion/exclusion criteria and provided consent for the study were scheduled for a second visit within a week of the first. During the second visit, they underwent echocardiography and an ergometric test. The examinations were repeated at 12 weeks.
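The 1:1:1 allocation by random permuted blocks can be sketched as follows; the block size of 3 and the seed are assumptions for illustration, since the paper does not report the generator's parameters.

```python
import random

ARMS = ["CTHF", "CTLF", "Control"]

def permuted_block_allocation(n_patients, block_size=3, seed=2022):
    """Generate a 1:1:1 allocation sequence using random permuted blocks.

    Each block contains block_size assignments balanced across the three arms
    (block_size must be a multiple of 3) and is shuffled independently.
    """
    assert block_size % len(ARMS) == 0, "block size must be a multiple of 3"
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_patients:
        block = ARMS * (block_size // len(ARMS))
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_patients]

# With 45 patients and a block size of 3, each arm receives exactly 15 patients.
schedule = permuted_block_allocation(45)
print({arm: schedule.count(arm) for arm in ARMS})  # {'CTHF': 15, 'CTLF': 15, 'Control': 15}
```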


Figure 1. Study flowchart.

Transthoracic echocardiography

Echocardiography was performed with the patients in the supine position. An experienced sonographer, blinded to patient allocation, performed the examination at baseline and at 12 weeks in all study subjects. A Vivid E95® cardiovascular ultrasound system (GE HealthCare, Chicago, IL, USA) with a 4.0-MHz transducer was used, and single-lead electrocardiographic monitoring with chest electrodes was recorded during the acquisition. All echocardiographic images were stored digitally and analyzed offline. During the review process, an experienced technician, also blinded to all participants' details, performed the deformation measurements using proprietary software (version 10.8, EchoPAC; GE Vingmed Ultrasound, Norway).

The left ventricular end-diastolic volume (LVEDV) and left ventricular end-systolic volume (LVESV) were measured from the apical two- and four-chamber windows, and the LVEF was calculated using the modified Simpson method. The LA volume was measured from standard apical four-chamber and three-chamber views at end systole, before the opening of the mitral valve, using the biplane Simpson method of disks. The LA volume index (LAVI) was calculated by dividing the LA volume by the patient's body surface area. The E/A ratio was defined as the ratio of the peak left ventricular filling velocity in early diastole (E wave), corresponding to atrial relaxation, to the peak flow velocity in late diastole (A wave), corresponding to atrial contraction. The LV E/E′ ratio was calculated as the ratio between the E wave velocity and the average of the septal and lateral LV E′ wave velocities. Color tissue Doppler tracings were obtained in the four-chamber view, with the range gate placed at the lateral mitral annular segments.

The LV global longitudinal strain (GLS) was measured from the two-, three-, and four-chamber views. The LV endocardial boundary was detected automatically by the software and, when deemed appropriate, was edited to conform to the visualized LV boundaries. The maximum negative strain value during systole represented the maximum contractility of each segment, and the average of the segmental values was calculated to determine the LVGLS. The LA strain was measured from the two- and four-chamber views. LA deformation tracking was carried out using the R wave as the starting point (R–R gating). The endocardial and epicardial contours of the LA were traced using an automatic contour-tracking algorithm, with manual adjustments when necessary. Based on the drawn contours, the algorithm placed a set of control points on the middle curve of the myocardial wall in the reference phase. Longitudinal strain curves were generated for each segment by the software, and the mean curve of all segments was calculated. The LA reservoir, conduit, and contractile strains were obtained from the subdivision of the longitudinal strain measurements (25). PALS was measured as the positive peak during LV systole, at the end of the atrial diastolic phase, and PACS was measured as the positive peak during early LV diastole, right before the start of the atrial systolic phase (26).
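The derived indices described above are simple arithmetic quantities. The snippet below is an illustrative sketch rather than the authors' EchoPAC workflow; every function name and example value is hypothetical and serves only to make the definitions explicit.

```python
# Illustrative definitions (assumptions, not the authors' analysis software) of
# the derived echocardiographic indices; all example values are hypothetical.
def lvef(edv_ml, esv_ml):
    """Ejection fraction (%) from end-diastolic and end-systolic volumes."""
    return 100.0 * (edv_ml - esv_ml) / edv_ml

def lavi(la_volume_ml, bsa_m2):
    """LA volume index (ml/m^2) = LA volume / body surface area."""
    return la_volume_ml / bsa_m2

def e_over_a(e_cm_s, a_cm_s):
    """E/A ratio of peak early (E) to late (A) mitral inflow velocities."""
    return e_cm_s / a_cm_s

def e_over_e_prime(e_cm_s, e_prime_septal_cm_s, e_prime_lateral_cm_s):
    """E/E' ratio using the average of septal and lateral annular E' velocities."""
    return e_cm_s / ((e_prime_septal_cm_s + e_prime_lateral_cm_s) / 2.0)

def gls(segment_peak_strains):
    """Global longitudinal strain (%) as the mean of the segmental peak values."""
    return sum(segment_peak_strains) / len(segment_peak_strains)

print(round(lvef(120, 55), 1))              # 54.2 %
print(round(lavi(52, 1.9), 1))              # 27.4 ml/m^2
print(round(e_over_a(70, 60), 2))           # 1.17
print(round(e_over_e_prime(70, 7, 9), 1))   # 8.8
print(round(gls([-18, -17, -19, -16]), 1))  # -17.5 %
```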

Ergometric test

The ergometric test was performed on a cycle ergometer (Mortara Instrument, Casalecchio di Reno, Italy). The heart rate and BP were measured at rest, before starting the exercise, at each stage of the exercise, and every 2 min during the recovery phase. The test was terminated if the patient experienced muscle exhaustion or chest pain, or expressed the desire to stop. At each stage of the exercise protocol, the patients were asked to rate their perceived sensation of fatigue according to the modified Borg scale (27).

Exercise training protocol

The exercise training sessions were performed in the cardiac rehabilitation facility of San Raffaele IRCCS in Rome. The patients were asked to maintain their usual dietary and lifestyle habits for the entire duration of the training program. All exercise sessions were held in the morning, between 8 and 11 AM, and were supervised by two physiotherapists who advised patients on the setup of the treadmills, cycles, and dynamometers throughout the training sessions and ensured that the exercises were performed correctly. The sessions were also supervised by a physician specialized in sports medicine with experience in cardiac rehabilitation, and by a nurse. During the first exercise session, we monitored the patients' heart rhythms for safety using telemetry.

The duration of each exercise session was 60 min. Before starting each session, the patients performed a 10-min warm-up, and at the end of each session they performed a 10-min cool-down. During each session, the aerobic exercises were performed before the resistance exercises. For both the CTHF and CTLF groups, the sessions were planned as follows: 40 min of aerobic training on a treadmill or a bike followed by 20 min of resistance training (Technogym Wellness System, Technogym, Cesena, Italy). The rating of perceived exertion (RPE) method was used to set the intensity of the aerobic component over the whole training program, with a target intensity of 13–14 (somewhat hard). To maintain the patients' level of effort during the entire training period, they were left free to change the treadmill/cycle settings in subsequent sessions; the RPE method was chosen precisely to allow the exercise prescription to be updated as fitness levels changed.

The resistance component of the training sessions consisted of the following exercises: leg press, leg extension, shoulder press, chest press, low row, and vertical traction. The muscle groups involved were the quadriceps, back muscles, deltoids, and biceps. The intensity of each resistance exercise was established through the assessment of the corresponding one-repetition maximum (1-RM). For each exercise, the patients performed a warm-up set of 8–10 repetitions at 60% of their 1-RM, and this intensity remained fixed for the entire training program. They then performed one repetition at maximal effort; this maximal attempt was repeated three times, with 2–3 min of rest between attempts, and the highest load lifted was taken as the 1-RM (28). The patients repeated the 1-RM test at baseline and every 3 weeks to update the exercise loads as fitness levels changed. For each exercise, the patients performed two sets of eight repetitions, with 2 min of rest between sets. Particular attention was paid to avoiding the contraction of muscle groups other than those specifically involved in the exercise (that is, accessory muscle recruitment). Adherence to the training program was calculated as follows: adherence (%) = attended sessions/planned sessions × 100.

The control group was advised to perform physical activity at home according to the contemporary guidelines (17). The patients in this group did not receive any supervision or wearable devices for exercising at home; they received only a training manual and educational materials. The training manual summarized the guideline recommendations on exercise modalities, training intensity, and frequency. After the enrollment visit at baseline, there was no further contact between the study investigators and the patients of the control group throughout the study. These participants were contacted at 12 weeks and summoned for the final evaluation at our center, during which they were asked about the number of exercise sessions performed at home.
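To make the load prescription and the adherence formula quoted above concrete, here is a minimal, hypothetical helper (not part of the study protocol); the 60% fraction is the one quoted in the text, and the example numbers are invented.

```python
# Hypothetical helpers (not from the study) illustrating a load prescribed as a
# fraction of the current 1-RM and the adherence formula quoted above.
def working_load(one_rm_kg, fraction=0.60):
    """Load (kg) prescribed as a fraction of the current 1-RM (here 60%)."""
    return round(one_rm_kg * fraction, 1)

def adherence_pct(attended_sessions, planned_sessions):
    """Adherence (%) = attended sessions / planned sessions x 100."""
    return 100.0 * attended_sessions / planned_sessions

# Invented example: leg-press 1-RM of 120 kg (best of three maximal attempts),
# 35 of 39 planned sessions attended.
print(working_load(120))                # 72.0 kg per set
print(round(adherence_pct(35, 39), 1))  # 89.7 %
```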

Statistical analysis

This research was conceived as a pilot study; thus, a formal sample size calculation was not required. The Shapiro–Wilk test was used to check the assumption of normality. The pre-exercise and post-exercise variables were assessed using a repeated-measures one-way analysis of variance (ANOVA) with Bonferroni correction for post-hoc testing. The Pearson correlation coefficient was used to measure the strength of the linear association between two variables. The level of significance was set at p < 0.05. The data were analyzed using SPSS software (version 20.0, IBM Corp., Armonk, NY, USA). The intra-observer and inter-observer variabilities of the LA strain measures were evaluated in a group of 10 patients: measurements of the primary and secondary endpoints were repeated by the same observer after an interval of ≥1 week and by a second, independent, blinded observer. The intraclass correlation coefficient was used to measure reproducibility, with good agreement defined as >0.80.
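The analysis was run in SPSS; as a rough stand-in, the Python sketch below reproduces the same general workflow (normality check, repeated-measures ANOVA, and Pearson correlation between 12-week changes) on a small hypothetical long-format table. The Bonferroni post-hoc tests and the intraclass correlation analysis are omitted for brevity, and nothing here is the authors' actual code.

```python
# Sketch of the statistical workflow on hypothetical data (not the study dataset).
import pandas as pd
from scipy.stats import shapiro, pearsonr
from statsmodels.stats.anova import AnovaRM

# Long format: one row per patient per time point (columns are assumptions)
df = pd.DataFrame({
    "patient": [1, 1, 2, 2, 3, 3, 4, 4],
    "time":    ["baseline", "week12"] * 4,
    "pals":    [22.1, 26.4, 19.8, 24.9, 21.5, 25.0, 20.3, 23.7],
    "sbp":     [138, 126, 142, 131, 135, 128, 140, 133],
})

# 1) Normality check on the outcome
print(shapiro(df["pals"]))

# 2) Repeated-measures one-way ANOVA (baseline vs. 12 weeks)
print(AnovaRM(df, depvar="pals", subject="patient", within=["time"]).fit())

# 3) Pearson correlation between the changes in systolic BP and in PALS
wide = df.pivot(index="patient", columns="time")
d_sbp = wide["sbp"]["week12"] - wide["sbp"]["baseline"]
d_pals = wide["pals"]["week12"] - wide["pals"]["baseline"]
print(pearsonr(d_sbp, d_pals))
```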

The baseline features of the population are presented in Table 1. At baseline, the three groups were comparable in terms of age, anthropometric and clinical parameters, and pharmacological therapy. Overall, 30 of the 45 patients (66.6%) had a previous STEMI; of these, 21 (70.0%) had a previous anterior STEMI. The EF ranged from 40% to 60%, and 11 patients had an EF below 50%. All patients had arterial hypertension and were taking an average of 2.3 ± 1.2 drugs for BP control. Furthermore, 17 of the 45 patients (37.7%) were obese (i.e., with a BMI over 30 kg/m²) and 19 (42.2%) were overweight (i.e., with a BMI between 25 and 30 kg/m²). No side effects occurred during the entire study period. Four patients dropped out of the study: three (one from the CTHF group, one from the CTLF group, and one from the control group) refused to continue, and one patient from the control group required changes in pharmacological therapy because of uncontrolled high BP values. Altogether, 41 patients completed the study and were included in the analysis. The average number of sessions performed was 35.1 ± 5.7 in the CTHF group and 34.6 ± 4.2 in the CTLF group, with adherence rates of 88.8% and 87.5%, respectively. The patients in the control group reported an average of 11.3 ± 4.7 sessions, and 7 of the 15 patients in this group (46.6%) declared that they had discontinued their physical activity within the first 2 weeks of the study.


Table 1. Baseline features of the recruited patients according to the three group allocations.

Within-group analysis

At 12 weeks, the duration of the ergometric test significantly increased in both the CTLF and CTHF groups (+26%, p = 0.007, and +30%, p = 0.001, respectively), while it was unchanged in the control group (Figure 3). PALS significantly increased in both the CTLF and CTHF groups (+24.6%, p = 0.024, and +48.5%, p = 0.012, respectively), while it was unchanged in the control group (Figure 2). PACS significantly increased in the CTHF group (+44.0%) but remained unchanged in the CTLF and control groups. The E/E′ ratio, LAVI, LVEDV, LVESV, and diastolic BP did not change in any of the three groups (Table 2). The left ventricular global longitudinal strain (LVGLS) significantly improved in the CTHF group (−22.3%, p = 0.037) but remained unchanged in the CTLF and control groups (Figure 3). The systolic BP significantly decreased in the CTHF and CTLF groups (−9.1%, p = 0.012, and −4.4%, p = 0.036, respectively), while it remained unchanged in the control group.


Figure 2. Between-group comparisons at 12 weeks of the changes in PALS (A) and PACS (B) occurring in the CTLF, CTHF, and control groups. Results of the one-way ANOVA and the Bonferroni post-hoc tests. Dark gray bars = baseline; light gray bars = 12 weeks.


Figure 3. Between-group comparisons of the changes in time during the ergometric test (A) and LVGLS (B) occurring in the CTLF, CTHF, and control groups at 12 weeks. Results of the one-way ANOVA and Bonferroni post-hoc tests. Dark gray bars = baseline; light gray bars = 12 weeks.


Table 2. Changes in the hemodynamic and echocardiographic parameters in the three study groups.


Figure 4. Correlation between the changes in systolic blood pressure and in PALS in patients undergoing supervised exercise (CTLF + CTHF).

Inter-group analysis

The increase in PALS observed in the CTHF group was significantly higher than that in the CTLF [+3.2% (95% CI = 0.8–5.5), p = 0.009] and control [+4.7% (95% CI = 1.8–7.5), p = 0.002] groups (Figure 2). The increase in PALS in the CTLF group was higher than that in the control group [+2.4% (95% CI = 0.4–4.4), p = 0.019]. The increase in PACS in the CTHF group was significantly higher than that in the CTLF [+3.7% (95% CI = 2.1–5.2), p = 0.001] and control [+5.0% (95% CI = 3.5–6.5), p < 0.001] groups, whereas there was no significant difference in PACS between the CTLF and control groups [+1.3% (95% CI = −0.3 to 2.7), p = 0.056]. The changes in the LVGLS in the CTHF group were greater than those in the CTLF [+4.9% (95% CI = 2.3–9.8), p = 0.045] and control [+6.0% (95% CI = 1.1–10.8), p = 0.017] groups (Figure 3), with no significant difference in the LVGLS between the CTLF and control groups [+1.0% (95% CI = −0.2 to 2.3), p = 0.126]. The increase in the duration of the ergometric test was significantly greater in the CTHF and CTLF groups than in the control group [+68.1 s (95% CI = 51.6–86.5), p < 0.001, and +54.6 s (95% CI = 37.5–71.8), p < 0.001, respectively], while there was no significant difference between the two CT groups [+14.4 s (95% CI = 2.5–28.3), p = 0.053]. There was a significant inverse correlation between the changes in systolic BP and the changes in PALS in patients undergoing supervised CT (r = −0.39, p = 0.034) (Figure 4).
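The study reports between-group mean differences with 95% confidence intervals derived from the ANOVA with Bonferroni correction. As a simplified, assumed stand-in, the sketch below computes a pooled-variance mean difference and its 95% CI, together with a two-sample t-test, on invented per-patient 12-week PALS changes.

```python
# Illustrative between-group comparison on invented data (not the study's analysis).
import numpy as np
from scipy import stats

# Hypothetical per-patient 12-week changes in PALS (%) for the two supervised arms
delta_cthf = np.array([5.1, 4.2, 6.0, 3.8, 5.5, 4.9, 4.4, 5.8, 6.3, 4.0, 5.2, 4.7, 5.9, 4.3])
delta_ctlf = np.array([1.8, 2.4, 1.2, 2.9, 2.1, 1.5, 2.6, 1.9, 2.2, 2.8, 1.4, 2.0, 2.5, 1.7])

n1, n2 = len(delta_cthf), len(delta_ctlf)
diff = delta_cthf.mean() - delta_ctlf.mean()

# Pooled-variance standard error and 95% confidence interval for the difference
sp2 = ((n1 - 1) * delta_cthf.var(ddof=1) + (n2 - 1) * delta_ctlf.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
t_crit = stats.t.ppf(0.975, n1 + n2 - 2)
print(f"difference = {diff:.1f}%, 95% CI = {diff - t_crit * se:.1f} to {diff + t_crit * se:.1f}")

# Matching pooled-variance two-sample t-test for the p-value
print(stats.ttest_ind(delta_cthf, delta_ctlf))
```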

The present study investigated the effects of two regimens of combined exercise training on LA function in patients with ICM. The results led to two main findings: first, both regimens of CT elicited favorable effects on the LA reservoir strain compared with the control group; second, the regimen with the higher training frequency led to a greater increase in PALS and PACS than the regimen with the lower frequency of training sessions.

Regarding the first point, LA adaptations to exercise training have been assessed mainly in healthy subjects and athletes, while very few studies on this topic have previously been conducted in patients with cardiovascular disease. In sedentary healthy individuals, increases in LA strain parameters have been observed after short exercise training interventions, without any detectable change in LA size (29). Conversely, no changes or small reductions in reservoir strain have been described in athletes (30, 31). LA dysfunction has been observed in patients with a wide range of cardiovascular diseases and is interpreted as the first step of the LA remodeling process (4). This process is related to an increase in LA pressure, which in turn is due to the increase in LV stiffness and LV filling pressure following an ischemic insult, and can evolve toward LA enlargement. Reverting the LA remodeling process by improving LA structure and function is becoming an important goal in ICM management, because LA size and function have both emerged as prognostic factors in the general population and in patients with ICM (11, 32, 33). Exercise training has well-documented anti-remodeling effects on the post-ischemic LV (34), and these are among the mechanisms through which exercise favorably impacts ICM. The potential adaptation of the LA to exercise training in these patients may therefore also have important clinical value.

Preliminary data suggest the possibility of counteracting LA remodeling through exercise training. In a recent study, a regimen of CT consisting of three sessions/week and lasting 12 weeks was effective in increasing the reservoir strain and the contractile strain in patients with HFmrEF and underlying ICM (24). The present study confirms the results of that research and extends them to patients in a less advanced stage of ICM: indeed, we enrolled asymptomatic subjects with normal LA volume, a normal E/E′ ratio, and, on average, a normal EF, who had never experienced symptoms or signs of HF. The changes in the LA functional parameters were not coupled with changes in LA size. The lack of effect on LA size could depend on the shortness of the CT program (only 12 weeks), and we cannot rule out that different results could be obtained with longer exercise interventions. Previous research on the effects of exercise training on LA size has shown mixed results (35, 36), and an effective exercise protocol for modifying LA size remains uncertain and debated (37).

This study did not allow us to explore the mechanisms underlying the beneficial effects produced by CT on LA function. However, we found that systolic BP values decreased and the LVGLS improved in the CTLF and CTHF groups at 12 weeks. We can hypothesize that the improvements in the LVGLS were secondary to the exercise-induced reductions in systolic BP, as demonstrated in other studies involving anti-hypertensive drugs (38). The improvement in the LVGLS may, in turn, have had positive repercussions on LA function, because changes in the LVGLS are at least partly responsible for changes in the LA reservoir strain (39). Therefore, our results seem to suggest that the increase in PALS reflected the improved hemodynamic conditions of the downstream ventricle. In this regard, interestingly, we found a significant correlation between the changes in systolic BP and those in PALS. New studies are needed to fully understand the clinical implications of the beneficial effect produced by exercise training on LA function. In particular, it should be assessed whether these effects constitute an additional mechanism by which exercise training contributes to the stabilization of these patients or whether they are a consequence of the effects produced by exercise on the left ventricle. The effects of other exercise modalities on the LA parameters should also be tested in future research.

In this study, the values of PALS were lower than those reported in the literature for healthy age-matched subjects and were instead comparable with those that we and other authors have observed in HFmrEF (24, 40). Conversely, the E/E′ ratio was in the normal range and remained unchanged after the CT program. In our opinion, this finding supports the hypothesis that an impaired LA reservoir strain is the first detectable sign of LV diastolic dysfunction. This result is in line with the current literature: the LA reservoir strain, alone or in combination with the E/E′ ratio, has already been successfully tested as a non-invasive index to predict elevated LV filling pressures in patients with stable ICM and preserved LVEF (5, 41).

Regarding the comparison between the two supervised CT regimens, our results suggest a dose–effect relationship between the volume of exercise administered and the effect size produced on LA function. It should be underlined that the same volume of exercise reached by the CTHF group could also have been obtained by the patients in the CTLF group by increasing their session workload; there are clear demonstrations that, when the weekly training volume is equalized, different exercise training regimens produce similar effects on skeletal muscle and cardiovascular parameters (42, 43). However, we refrained from increasing the workload of the CTLF group for fear of causing muscle injuries and losing patient compliance. Interestingly, the effects of CT on LA function appeared to be uncoupled from those on exercise tolerance: the CTHF and CTLF groups obtained similar increases in exercise time at 12 weeks. This result seems to indicate that the volume of CT to be prescribed to patients with ICM could vary according to the goal to be reached, with higher weekly volumes appearing preferable when the objective is to induce LA adaptations. In light of these observations, we believe that the results of this study point toward a more personalized and individually tailored prescription of exercise training for patients with ICM. Further studies with longer training programs are needed to clarify the best exercise intervention, in terms of modality, intensity, and duration, for eliciting LA adaptations.

Limitations

The most important limitations of the present study are the small sample size and the short duration of the exercise protocol, which was limited to 12 weeks. New studies with larger sample sizes and longer exercise training interventions are needed to better understand the efficacy of CT in improving LA function in patients with ICM. The results of this study were obtained by means of CT, and they cannot be generalized to other exercise modalities. Because both groups of patients who underwent supervised training performed the same protocol at each session, their weekly volumes were not equated; therefore, we cannot rule out that the study would have produced different results under equated weekly training volumes. Finally, some caution in interpreting the data is needed because strain echocardiographic imaging has several technical limitations (44, 45), and impairment in myocardial strain parameters may not always be an expression of intrinsic myocardial dysfunction (46). Therefore, further confirmation of our results should be obtained with alternative diagnostic techniques, such as cardiac magnetic resonance imaging. Our data suggest that CT elicited LA adaptations in patients with ICM; however, this study does not clarify the mechanisms through which CT exerts its effects on the LA, and this point needs to be addressed in further studies.

This study showed that a 12-week CT program improved LA function in patients with ICM in a dose–effect manner. Further larger studies are needed to understand whether these exercise-induced benefits can be sustained over time and can translate into clinical benefits for patients with ICM.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Ethics statement

The studies involving humans were approved by the local ethics committee of San Raffaele IRCCS, Rome. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

Author contributions

GC: Conceptualization, Writing – original draft. MV: Conceptualization, Supervision, Writing – original draft. FI: Conceptualization, Supervision, Validation, Writing – review & editing. GM: Investigation, Methodology, Writing – original draft, Writing – review & editing. ViM: Writing – review & editing, Data curation, Investigation. VD: Data curation, Writing – original draft. SV: Data curation, Investigation, Writing – original draft. DD: Investigation, Software, Writing – original draft. MC: Data curation, Investigation, Writing – original draft. VaM: Writing – review & editing, Data curation. MP: Software, Writing – original draft.

The authors declare that financial support was received for the research, authorship, and/or publication of this article.

This work was supported by funding from the Italian Ministry of Health (Ricerca Corrente).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The author(s) declared that they were an editorial board member of Frontiers at the time of submission. This had no impact on the peer-review process or the final decision.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

1. Sutton MG, Sharpe N. Left ventricular remodeling after myocardial infarction: pathophysiology and therapy. Circulation. (2000) 101(25):2981–8. doi: 10.1161/01.CIR.101.25.2981


2. Sharifov OF, Denney TS Jr, Girard AA, Gupta H, Lloyd SG. Coronary artery disease is associated with impaired atrial function regardless of left ventricular filling pressure. Int J Cardiol . (2023) 387:131102. doi: 10.1016/j.ijcard.2023.05.052

3. Pascaud A, Assunção A Jr, Garcia G, Vacher E, Willoteaux S, Prunier F, et al. Left atrial remodeling following ST-segment-elevation myocardial infarction correlates with infarct size and age older than 70 years. J Am Heart Assoc . (2023) 12(6):e026048. doi: 10.1161/JAHA.122.026048

4. Thomas L, Abhayaratna WP. Left atrial reverse remodeling: mechanisms, evaluation and clinical significance. JACC Cardiovasc Imaging . (2017) 10(1):65–77. doi: 10.1016/j.jcmg.2016.11.003

5. Lin J, Ma H, Gao L, Wang Y, Wang J, Zhu Z, et al. Left atrial reservoir strain combined with E/E’ as a better single measure to predict elevated LV filling pressures in patients with coronary artery disease. Cardiovasc Ultrasound . (2020) 18(1):11. doi: 10.1186/s12947-020-00192-4

6. Saraiva RM, Demirkol S, Buakhamsri A, Greenberg N, Popović ZB, Thomas JD, et al. Left atrial strain measured by two-dimensional speckle tracking represents a new tool to evaluate left atrial function. J Am Soc Echocardiogr . (2010) 23(2):172–80. doi: 10.1016/j.echo.2009.11.003

7. Melenovsky V, Hwang SJ, Redfield MM, Zakeri R, Lin G, Borlaug BA. Left atrial remodeling and function in advanced heart failure with preserved or reduced ejection fraction. Circ Heart Fail . (2015) 8(2):295–303. doi: 10.1161/CIRCHEARTFAILURE.114.001667

8. Maffeis C, Rossi A, Cannata L, Zocco C, Belyavskiy E, Radhakrishnan AK, et al. Left atrial strain predicts exercise capacity in heart failure independently of left ventricular ejection fraction. ESC Heart Fail . (2022) 9(2):842–52. doi: 10.1002/ehf2.13788

9. Cameli M, Mandoli GE, Loiacono F, Sparla S, Iardino E, Mondillo S. Left atrial strain: a useful index in atrial fibrillation. Int J Cardiol . (2016) 220:208–13. doi: 10.1016/j.ijcard.2016.06.197

10. Bouwmeester S, van der Stam JA, van Loon SLM, van Riel NAW, Boer AK, Dekker LR, et al. Left atrial reservoir strain as a predictor of cardiac outcome in patients with heart failure: the HaFaC cohort study. BMC Cardiovasc Disord . (2022) 22(1):104. doi: 10.1186/s12872-022-02545-5

11. Jia F, Chen A, Zhang D, Fang L, Chen W. Prognostic value of left atrial strain in heart failure: a systematic review and meta-analysis. Front Cardiovasc Med . (2022) 9:935103. doi: 10.3389/fcvm.2022.935103

12. Winkler NE, Anwer S, Rumpf PM, Tsiourantani G, Donati TG, Michel JM, et al. Left atrial pump strain predicts long-term survival after transcatheter aortic valve implantation. Int J Cardiol . (2024) 395:131403. doi: 10.1016/j.ijcard.2023.131403

13. Brás PG, Gonçalves AV, Branco LM, Moreira RI, Pereira-da-Silva T, Galrinho A, et al. Sacubitril/valsartan improves left atrial and ventricular strain and strain rate in patients with heart failure with reduced ejection fraction. Life . (2023) 13(4):995. doi: 10.3390/life13040995


14. Alfuhied A, Gulsin GS, Athithan L, Brady EM, Parke K, Henson J, et al. The impact of lifestyle intervention on left atrial function in type 2 diabetes: results from the DIASTOLIC study. Int J Cardiovasc Imaging . (2022) 38(9):2013–23. doi: 10.1007/s10554-022-02578-z

15. Kamiya K, Sato Y, Takahashi T, Tsuchihashi-Makaya M, Kotooka N, Ikegame T, et al. Multidisciplinary cardiac rehabilitation and long-term prognosis in patients with heart failure. Circ Heart Fail . (2020) 13(10):e006798. doi: 10.1161/CIRCHEARTFAILURE.119.006798

16. Lolley R, Forman DE. Cardiac rehabilitation and survival for ischemic heart disease. Curr Cardiol Rep . (2021) 23(12):184. doi: 10.1007/s11886-021-01616-x

17. Knuuti J, Wijns W, Saraste A, Capodanno D, Barbato E, Funck-Brentano C, et al. 2019 ESC guidelines for the diagnosis and management of chronic coronary syndromes. Eur Heart J . (2020) 41:407–77. doi: 10.1093/eurheartj/ehz425

18. Medeiros Nda S, de Abreu FG, Colato AS, de Lemos LS, Ramis TR, Dorneles GP, et al. Effects of concurrent training on oxidative stress and insulin resistance in obese individuals. Oxid Med Cell Longev . (2015) 2015:697181. doi: 10.1155/2015/697181

19. Volterrani M, Caminiti G, Perrone MA, Cerrito A, Franchini A, Manzi V, et al. Effects of concurrent, within-session, aerobic and resistance exercise training on functional capacity and muscle performance in elderly male patients with chronic heart failure. J Clin Med . (2023) 12:750. doi: 10.3390/jcm12030750

20. Caminiti G, Iellamo F, Mancuso A, Cerrito A, Montano M, Manzi V, et al. Effects of 12 weeks of aerobic versus combined aerobic plus resistance exercise training on short-term blood pressure variability in patients with hypertension. J Appl Physiol (1985) . (2021) 130(4):1085–92. doi: 10.1152/japplphysiol.00910.2020

21. Garza MA, Wason EA, Zhang JQ. Cardiac remodeling and physical training post myocardial infarction. World J Cardiol . (2015) 7(2):52–64. doi: 10.4330/wjc.v7.i2.52

22. Guizoni DM, Oliveira-Junior SA, Noor SLR, Pagan LU, Martinez PF, Lima ARR, et al. Effects of late exercise on cardiac remodeling and myocardial calcium handling proteins in rats with moderate and large size myocardial infarction. Int J Cardiol . (2016) 221:406–12. doi: 10.1016/j.ijcard.2016.07.072

23. Garza MA, Wason EA, Cruger JR, Chung E, Zhang JQ. Strength training attenuates post-infarct cardiac dysfunction and remodeling. J Physiol Sci . (2019) 69(3):523–30. doi: 10.1007/s12576-019-00672-x

24. Caminiti G, Perrone MA, D'Antoni V, Marazzi G, Gismondi A, Vadalà S, et al. The improvement of left atrial function after twelve weeks of supervised concurrent training in patients with heart failure with mid-range ejection fraction: a pilot study. J Cardiovasc Dev Dis . (2023) 10(7):276. doi: 10.3390/jcdd10070276

25. Gan GCH, Ferkh A, Boyd A, Thomas L. Left atrial function: evaluation by strain analysis. Cardiovasc Diagn Ther . (2018) 8:29–46. doi: 10.21037/cdt.2017.06.08

26. Badano LP, Kolias TJ, Muraru D, Abraham TP, Aurigemma G, Edvardsen T, et al. Standardization of left atrial, right ventricular, and right atrial deformation imaging using two-dimensional speckle tracking echocardiography: a consensus document of the EACVI/ASE/Industry Task Force to standardize deformation imaging. Eur Heart J Cardiovasc Imaging . (2018) 19:591–600. doi: 10.1093/ehjci/jey042

27. Borg GA. Psychophysical bases of perceived exertion. Med Sci Sports Exerc. (1982) 14:377–81. PMID: 7154893


28. Seo DI, Kim E, Fahs CA, Rossow L, Young K, Ferguson SL, et al. Reliability of the one-repetition maximum test based on muscle group and gender. J Sports Sci Med. (2012) 11(2):221–5. PMID: 24149193

29. Wright S, Esfandiari S, Elmayergi N, Sasson Z, Goodman JM. Left atrial functional changes following short-term exercise training. Eur J Appl Physiol . (2014) 114(12):2667–75. doi: 10.1007/s00421-014-2989-4

30. D'Ascenzi F, Cameli M, Lisi M, Zacà V, Natali B, Malandrino A, et al. Left atrial remodelling in competitive adolescent soccer players. Int J Sports Med . (2012) 33(10):795–801. doi: 10.1055/s-0032-1304660

31. Zilinski JL, Contursi ME, Isaacs SK, Deluca JR, Lewis GD, Weiner RB, et al. Myocardial adaptations to recreational marathon training among middle-aged men. Circ Cardiovasc Imaging . (2015) 8(2):e002487. doi: 10.1161/CIRCIMAGING.114.002487

32. Bombelli M, Facchetti R, Cuspidi C, Villa P, Dozio D, Brambilla G, et al. Prognostic significance of left atrial enlargement in a general population: results of the PAMELA study. Hypertension . (2014) 64(6):1205–11. doi: 10.1161/HYPERTENSIONAHA.114.03975

33. Modin D, Biering-Sørensen SR, Møgelvang R, Alhakak AS, Jensen JS, Biering-Sørensen T. Prognostic value of left atrial strain in predicting cardiovascular morbidity and mortality in the general population. Eur Heart J Cardiovasc Imaging . (2019) 20(7):804–81. doi: 10.1093/ehjci/jey181

34. Haykowsky M, Scott J, Esch B, Schopflocher D, Myers J, Paterson I, Warburton D, Jones L, Clark AM. A meta-analysis of the effects of exercise training on left ventricular remodeling following myocardial infarction: start early and go longer for greatest exercise benefits on remodeling. Trials . (2011) 12:92. doi: 10.1186/1745-6215-12-92

35. Sandri M, Kozarez I, Adams V, Mangner N, Hollriegel R, Erbs S, et al. Age-related effects of exercise training on diastolic function in heart failure with reduced ejection fraction: the Leipzig Exercise Intervention in Chronic Heart Failure and Aging (LEICA) Diastolic Dysfunction Study. Eur Heart J. (2012) 33:1758–68. doi: 10.1093/eurheartj/ehr469

36. Lan NSR, Lam K, Naylor LH, Green DJ, Minaee NS, Dias P, et al. The impact of distinct exercise training modalities on echocardiographic measurements in patients with heart failure with reduced ejection fraction. J Am Soc Echocardiogr . (2020) 33:148–56. doi: 10.1016/j.echo.2019.09.012

37. Elliott AD, Ariyaratnam J, Howden EJ, La Gerche A, Sanders P. Influence of exercise training on the left atrium: implications for atrial fibrillation, heart failure, and stroke. Am J Physiol Heart Circ Physiol . (2023) 325(4):H822–36. doi: 10.1152/ajpheart.00322.2023

38. Gan GCH, Bhat A, Chen HHL, Fernandez F, Byth K, Eshoo S, et al. Determinants of LA reservoir strain: independent effects of LA volume and LV global longitudinal strain. Echocardiography . (2020) 37(12):2018–28. doi: 10.1111/echo.14922

39. Jorge JA, Foppa M, Santos ABS, Cichelero FT, Martinez D, Lucca MB, et al. Effects of antihypertensive treatment on left and right ventricular global longitudinal strain and diastolic parameters in patients with hypertension and obstructive sleep apnea: randomized clinical trial of chlorthalidone plus amiloride vs. amlodipine. J Clin Med . (2023) 12(11):3785. doi: 10.3390/jcm12113785

40. Nielsen AB, Skaarup KG, Hauser R, Johansen ND, Lassen MCH, Jensen GB, et al. Normal values and reference ranges for left atrial strain by speckle-tracking echocardiography: the Copenhagen City Heart Study. Eur Heart J Cardiovasc Imaging . (2021) 23(1):42–51. doi: 10.1093/ehjci/jeab201

41. Nagueh SF, Khan SU. Left atrial strain for assessment of left ventricular diastolic function: focus on populations with normal LVEF. JACC Cardiovasc Imaging . (2023) 16(5):691–707. doi: 10.1016/j.jcmg.2022.10.011

42. Hamarsland H, Moen H, Skaar OJ, Jorang PW, Rødahl HS, Rønnestad BR. Equal-volume strength training with different training frequencies induces similar muscle hypertrophy and strength improvement in trained participants. Front Physiol . (2022) 12:789403. doi: 10.3389/fphys.2021.789403

43. Iellamo F, Manzi V, Caminiti G, Vitale C, Castagna C, Massaro M, et al. Matched dose interval and continuous exercise training induce similar cardiorespiratory and metabolic adaptations in patients with heart failure. Int J Cardiol . (2013) 167(6):2561. doi: 10.1016/j.ijcard.2012.06.057

44. Nicolosi GL. The strain and strain rate imaging paradox in echocardiography: overabundant literature in the last two decades but still uncertain clinical utility in an individual case. Arch Med Sci Atheroscler Dis . (2020) 5:e297–305. doi: 10.5114/amsad.2020.103032

45. Rösner A, Barbosa D, Aarsæther E, Kjønås D, Schirmer H, D'hooge J. The influence of frame rate on two-dimensional speckle-tracking strain measurements: a study on silico-simulated models and images recorded in patients. Eur Heart J Cardiovasc Imaging . (2015) 16(10):1137–47. doi: 10.1093/ehjci/jev058

46. Mirea O, Pagourelias ED, Duchenne J, Bogaert J, Thomas JD, Badano LP, et al. Intervendor differences in the accuracy of detecting regional functional abnormalities: a report from the EACVI-ASE strain standardization task force. JACC Cardiovasc Imaging . (2018) 11(1):25–34. doi: 10.1016/j.jcmg.2017.02.014

Keywords: combined training, atrial dysfunction, ischemic cardiomyopathy, atrial remodeling, cardiac rehabilitation

Citation: Caminiti G, Volterrani M, Iellamo F, Marazzi G, Manzi V, D’Antoni V, Vadalà S, Di Biasio D, Catena M, Morsella V and Perrone MA (2024) Changes in left atrial function following two regimens of combined exercise training in patients with ischemic cardiomyopathy: a pilot study. Front. Cardiovasc. Med. 11:1377958. doi: 10.3389/fcvm.2024.1377958

Received: 28 January 2024; Accepted: 5 April 2024; Published: 7 May 2024.


© 2024 Caminiti, Volterrani, Iellamo, Marazzi, Manzi, D'Antoni, Vadalà, Di Biasio, Catena, Morsella and Perrone. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Giuseppe Caminiti [email protected]
