Longitudinal Study Design

Julia Simkus

Editor at Simply Psychology

BA (Hons) Psychology, Princeton University

Julia Simkus is a graduate of Princeton University with a Bachelor of Arts in Psychology. She is pursuing a Master's Degree in Counseling for Mental Health and Wellness, beginning in September 2023. Julia's research has been published in peer-reviewed journals.

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

A longitudinal study is a type of observational and correlational study that involves monitoring a population over an extended period of time. It allows researchers to track changes and developments in the subjects over time.

What is a Longitudinal Study?

In longitudinal studies, researchers do not manipulate any variables or interfere with the environment. Instead, they simply conduct observations on the same group of subjects over a period of time.

These research studies can last anywhere from a week to many years or even decades. Unlike cross-sectional studies, which capture a single moment in time, longitudinal studies extend beyond that moment, enabling researchers to investigate cause-and-effect relationships between variables.

They are beneficial for recognizing any changes, developments, or patterns in the characteristics of a target population. Longitudinal studies are often used in clinical and developmental psychology to study shifts in behaviors, thoughts, emotions, and trends throughout a lifetime.

For example, a longitudinal study could be used to examine the progress and well-being of children at critical age periods from birth to adulthood.

The Harvard Study of Adult Development is one of the longest longitudinal studies to date. Researchers in this study have followed the same group of men for over 80 years, observing psychosocial variables and biological processes associated with healthy aging and well-being in late life (see the Harvard Second Generation Study).

When designing longitudinal studies, researchers must consider issues like sample selection and generalizability, attrition and selectivity bias, effects of repeated exposure to measures, selection of appropriate statistical models, and coverage of the necessary timespan to capture the phenomena of interest.

Panel Study

  • A panel study is a type of longitudinal study design in which the same set of participants is measured repeatedly over time.
  • Data are gathered on the same variables of interest at each time point using consistent methods. This allows researchers to study continuity and change within individuals over time on the key measured constructs (see the data-layout sketch below).
  • Prominent examples include national panel surveys on topics like health, aging, employment, and economics. Panel studies are a type of prospective study.
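
To make the panel structure concrete, here is a minimal sketch in Python with pandas (the participant IDs, waves, and scores are hypothetical) of long-format panel data, where each row is one participant measured at one wave:

```python
import pandas as pd

# Hypothetical panel data: the same three participants measured at three waves
# on the same variable (a life-satisfaction score).
panel = pd.DataFrame({
    "participant_id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "wave": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "life_satisfaction": [6.0, 6.5, 7.0, 4.5, 4.0, 4.5, 8.0, 7.5, 7.0],
})

# Reshape to wide format: one row per participant, one column per wave.
wide = panel.pivot(index="participant_id", columns="wave", values="life_satisfaction")
print(wide)
```

Because the same people appear at every wave, within-person change can be computed directly, which is what distinguishes a panel from a series of independent cross-sections.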

Cohort Study

  • A cohort study is a type of longitudinal study that samples a group of people sharing a common experience or demographic trait within a defined period, such as year of birth.
  • Researchers observe a population based on the shared experience of a specific event, such as birth, geographic location, or historical experience. These studies are typically used among medical researchers.
  • Cohorts are identified and selected at a starting point (e.g. birth, starting school, entering a job field) and followed forward in time. 
  • As the cohort ages, data are collected on its subgroups to determine their differing trajectories, for example, investigating how health outcomes diverge for groups born in the 1950s, 1960s, and 1970s.
  • Cohort studies do not require the same individuals to be assessed over time; they just require representation from the cohort.

Retrospective Study

  • In a retrospective study, researchers either collect data on events that have already occurred or draw on existing data from databases, medical records, or interviews to gain insights about a population.
  • Retrospective designs are appropriate when prospectively following participants from a past starting point is infeasible or unethical, for example, when studying the early origins of diseases that emerge later in life.
  • Retrospective studies efficiently provide a “snapshot summary” of the past in relation to present status. However, quality concerns with retrospective data make careful interpretation necessary when inferring causality. Memory biases and selective retention influence the quality of retrospective data.

Strengths

Allows researchers to look at changes over time

Because longitudinal studies observe variables over extended periods of time, researchers can use their data to study developmental shifts and understand how certain things change as we age.

High validity

Since the objectives and procedures for long-term studies are established before data collection begins, these studies have high levels of validity.

Eliminates recall bias

Recall bias occurs when participants do not remember past events accurately or omit details from previous experiences. Because prospective longitudinal studies collect data in real time rather than asking participants to reconstruct the past, they largely avoid this problem.

Flexibility

The variables in longitudinal studies can change throughout the study. Even if the study was created to study a specific pattern or characteristic, the data collection could show new data points or relationships that are unique and worth investigating further.

Limitations

Costly and time-consuming

Longitudinal studies can take months or years to complete, rendering them expensive and time-consuming. Because of this, researchers tend to have difficulty recruiting participants, leading to smaller sample sizes.

Large sample size needed

Longitudinal studies tend to be challenging to conduct because large samples are needed for any relationships or patterns to be meaningful. Researchers are unable to generate results if there is not enough data.

Participants tend to drop out

Not only is it a struggle to recruit participants, but subjects also tend to leave or drop out of the study due to various reasons such as illness, relocation, or a lack of motivation to complete the full study.

This tendency is known as selective attrition and can threaten the validity of an experiment. For this reason, researchers using this approach typically recruit many participants, expecting a substantial number to drop out before the end.

Report bias is possible

Longitudinal studies will sometimes rely on surveys and questionnaires, which could result in inaccurate reporting as there is no way to verify the information presented.

Examples of Longitudinal Studies

  • The physical growth and health of post-institutionalized Romanian adoptees (Le Mare & Audet, 2006). Data were collected for each child at three time points: at 11 months after adoption, at 4.5 years of age, and at 10.5 years of age. The first two sets of results showed that the adoptees were behind the non-institutionalized group; however, by 10.5 years old there was no difference between the two groups. The Romanian orphans had caught up with the children raised in typical Canadian families.
  • The role of positive psychology constructs in predicting mental health and academic achievement in children and adolescents (Marques, Pais-Ribeiro, & Lopez, 2011)
  • The correlation between dieting behavior and the development of bulimia nervosa (Stice et al., 1998)
  • The stress of educational bottlenecks negatively impacting students’ wellbeing (Cruwys, Greenaway, & Haslam, 2015)
  • The effects of job insecurity on psychological health and withdrawal (Dekker & Schaufeli, 1995)
  • The relationship between loneliness, health, and mortality in adults aged 50 years and over (Luo et al., 2012)
  • The influence of parental attachment and parental control on early onset of alcohol consumption in adolescence (Van der Vorst et al., 2006)
  • The relationship between religion and health outcomes in medical rehabilitation patients (Fitchett et al., 1999)

Goals of Longitudinal Data and Longitudinal Research

The objectives of longitudinal data collection and research, as outlined by Baltes and Nesselroade (1979), are listed below (a small data sketch of the first two objectives follows the list):
  • Identify intraindividual change: Examine changes at the individual level over time, including long-term trends or short-term fluctuations. Requires multiple measurements and individual-level analysis.
  • Identify interindividual differences in intraindividual change: Evaluate whether changes vary across individuals and relate those differences to other variables. Requires repeated measures for multiple individuals plus relevant covariates.
  • Analyze interrelationships in change: Study how two or more processes unfold and influence each other over time. Requires longitudinal data on multiple variables and appropriate statistical models.
  • Analyze causes of intraindividual change: Identify the factors or mechanisms that explain changes within individuals over time. For example, a researcher might want to understand what drives a person’s mood fluctuations over days or weeks, or what leads to systematic gains or losses in cognitive abilities across the lifespan.
  • Analyze causes of interindividual differences in intraindividual change: Identify mechanisms that explain both within-person changes and differences in those changes across people. Requires repeated data on outcomes and covariates for multiple individuals plus dynamic statistical models.
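
As a rough illustration of the first two objectives, the sketch below (Python with pandas; the people, waves, and scores are hypothetical) computes each person's change over time and then summarizes how much those changes differ across people. The remaining objectives require covariates and formal dynamic models rather than simple summaries.

```python
import pandas as pd

# Hypothetical repeated measures: a cognitive score for four people at three waves.
df = pd.DataFrame({
    "person": ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3,
    "wave": [1, 2, 3] * 4,
    "score": [50, 52, 55, 48, 47, 45, 60, 63, 66, 55, 55, 56],
})

# Objective 1 (intraindividual change): each person's change from wave 1 to wave 3.
change = (df.sort_values("wave")
            .groupby("person")["score"]
            .agg(lambda s: s.iloc[-1] - s.iloc[0])
            .rename("change_w1_to_w3"))
print(change)

# Objective 2 (interindividual differences in intraindividual change):
# how much those per-person changes vary across people.
print("Mean change:", change.mean(), "| SD of change:", change.std())
```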

How to Perform a Longitudinal Study

When beginning to develop your longitudinal study, you must first decide if you want to collect your own data or use data that has already been gathered.

Using already collected data will save you time, but it will be more restricted and limited than collecting it yourself. When collecting your own data, you can choose to conduct either a retrospective or a prospective study.

In a retrospective study, you are collecting data on events that have already occurred. You can examine historical information, such as medical records, in order to understand the past. In a prospective study, on the other hand, you are collecting data in real-time. Prospective studies are more common for psychology research.

Once you determine the type of longitudinal study you will conduct, you then must determine how, when, where, and on whom the data will be collected.

A standardized study design is vital for efficiently measuring a population. Once a study design is created, researchers must maintain the same study procedures over time to uphold the validity of the observation.

A schedule should be maintained, complete results should be recorded with each observation, and observer variability should be minimized.

Researchers must observe each subject under the same conditions so that they can be compared. In this type of study design, each subject serves as their own control.

Methodological Considerations

Important methodological considerations include testing measurement invariance of constructs across time, appropriately handling missing data, and using accelerated longitudinal designs that sample different age cohorts over overlapping time periods.

Testing measurement invariance

Testing measurement invariance involves evaluating whether the same construct is being measured in a consistent, comparable way across multiple time points in longitudinal research.

This includes assessing configural, metric, and scalar invariance through confirmatory factor analytic approaches. Ensuring invariance gives more confidence when drawing inferences about change over time.
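
As a rough sketch of how a configural model might be specified, the example below assumes the Python package semopy (which uses lavaan-style model syntax); the item names and data file are hypothetical. Metric and scalar invariance would then be tested by refitting with loadings, and subsequently intercepts, constrained to be equal across time and comparing model fit.

```python
import pandas as pd
import semopy

# Hypothetical wide-format data: three anxiety items measured at two time points,
# with columns x1_t1, x2_t1, x3_t1, x1_t2, x2_t2, x3_t2.
data = pd.read_csv("anxiety_two_waves.csv")

# Configural invariance: the same factor structure at both occasions,
# with all loadings freely estimated.
configural = """
anx_t1 =~ x1_t1 + x2_t1 + x3_t1
anx_t2 =~ x1_t2 + x2_t2 + x3_t2
anx_t1 ~~ anx_t2
"""
model = semopy.Model(configural)
model.fit(data)
print(semopy.calc_stats(model))  # fit indices such as chi-square, CFI, RMSEA
```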

Missing data

Missing data can occur during initial sampling if certain groups are underrepresented or fail to respond.

Attrition over time is the main source – participants dropping out for various reasons. The consequences of missing data are reduced statistical power and potential bias if dropout is nonrandom.

Handling missing data appropriately in longitudinal studies is critical to reducing bias and maintaining power.

It is important to minimize attrition by tracking participants, keeping contact info up to date, engaging them, and providing incentives over time.

Techniques like maximum likelihood estimation and multiple imputation are better alternatives to older methods like listwise deletion. Assumptions about missing data mechanisms (e.g., missing at random) shape the analytic approaches taken.
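
As a minimal sketch of the multiple-imputation idea, the example below uses scikit-learn's IterativeImputer (chained equations) on hypothetical wide-format scores; in practice the imputation model should include all analysis variables plus auxiliary predictors of missingness.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Hypothetical wide-format longitudinal scores; NaN marks waves missed through dropout.
df = pd.DataFrame({
    "score_w1": [50, 48, 60, 55, 52],
    "score_w2": [52, 47, 63, np.nan, 54],
    "score_w3": [55, np.nan, 66, np.nan, 57],
})

# sample_posterior=True draws imputations from a predictive distribution, so repeating
# the procedure with different seeds yields several imputed data sets, which can each
# be analyzed and the results pooled, in the spirit of multiple imputation.
imputed_sets = [
    pd.DataFrame(IterativeImputer(sample_posterior=True, random_state=seed).fit_transform(df),
                 columns=df.columns)
    for seed in range(5)
]
print(imputed_sets[0])
```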

Accelerated longitudinal designs

Accelerated longitudinal designs purposefully create missing data across age groups.

Accelerated longitudinal designs strategically sample different age cohorts over overlapping periods. For example, assessing separate cohorts of 6th, 7th, and 8th graders at yearly intervals allows a three-year study to span development from 6th through 10th grade, rather than following a single cohort across that entire age range.

This increases the speed and cost-efficiency of longitudinal data collection and enables the examination of age/cohort effects. Appropriate multilevel statistical models are required to analyze the resulting complex data structure.
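
The coverage logic of an accelerated design can be sketched as follows (hypothetical grades and waves): three cohorts assessed at three yearly waves jointly span a wider grade range than any single cohort could in the same calendar time.

```python
import pandas as pd

# Hypothetical accelerated design: cohorts starting in grade 6, 7, or 8,
# each assessed at three yearly waves (wave 0, 1, 2).
design = pd.DataFrame(
    [(start, wave, start + wave) for start in (6, 7, 8) for wave in (0, 1, 2)],
    columns=["starting_grade", "wave", "grade_at_assessment"],
)
print(design.pivot(index="starting_grade", columns="wave", values="grade_at_assessment"))
# Together the cohorts cover grades 6 through 10 in a three-year study; grades where
# cohorts overlap (e.g., grade 8) are the points at which cohort effects can be checked.
```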

In addition to those considerations, optimizing the time lags between measurements, maximizing participant retention, and thoughtfully selecting analysis models that align with the research questions and hypotheses are also vital in ensuring robust longitudinal research.

So, careful methodology is key throughout the design and analysis process when working with repeated-measures data.

Cohort effects

A cohort refers to a group born in the same year or time period. Cohort effects occur when different cohorts show differing trajectories over time.

Cohort effects can bias results if not accounted for, especially in accelerated longitudinal designs which assume cohort equivalence.

Detecting cohort effects is important but can be challenging as they are confounded with age and time of measurement effects.

Cohort effects can also interfere with estimating other effects like retest effects. This happens because comparing groups to estimate retest effects relies on cohort equivalence.

Overall, researchers need to test for and control cohort effects, which could otherwise lead to invalid conclusions. Careful study design and analysis are required.

Retest effects

Retest effects refer to gains in performance that occur when the same or similar test is administered on multiple occasions.

For example, familiarity with test items and procedures may allow participants to improve their scores over repeated testing above and beyond any true change.

Specific examples include:

  • Memory tests – Learning which items tend to be tested can artificially boost performance over time
  • Cognitive tests – Becoming familiar with the testing format and particular test demands can inflate scores
  • Survey measures – Remembering previous responses can bias future responses over multiple administrations
  • Interviews – Comfort with the interviewer and process can lead to increased openness or recall

To estimate retest effects, performance of retested groups is compared to groups taking the test for the first time. Any divergence suggests inflated scores due to retesting rather than true change.
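
A minimal sketch of that comparison is shown below (hypothetical scores; a Welch t-test via SciPy stands in for whatever model the actual design calls for):

```python
import numpy as np
from scipy import stats

# Hypothetical scores at the second measurement occasion: one group is being tested
# for the second time, the other is a comparable group tested for the first time.
retested = np.array([104, 101, 99, 106, 103, 108, 100, 105])
first_time = np.array([98, 100, 97, 101, 99, 102, 96, 100])

t, p = stats.ttest_ind(retested, first_time, equal_var=False)
print(f"Mean difference: {retested.mean() - first_time.mean():.1f} points, "
      f"t = {t:.2f}, p = {p:.3f}")
# A reliable advantage for the retested group points to a retest (practice) effect
# rather than true change.
```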

If unchecked in analysis, retest gains can be confused with genuine intraindividual change or interindividual differences.

This undermines the validity of longitudinal findings. Thus, testing and controlling for retest effects are important considerations in longitudinal research.

Data Analysis

Longitudinal data involves repeated assessments of variables over time, allowing researchers to study stability and change. A variety of statistical models can be used to analyze longitudinal data, including latent growth curve models, multilevel models, latent state-trait models, and more.

Latent growth curve models allow researchers to model intraindividual change over time. For example, one could estimate parameters related to individuals’ baseline levels on some measure, linear or nonlinear trajectory of change over time, and variability around those growth parameters. These models require multiple waves of longitudinal data to estimate.

Multilevel models are useful for hierarchically structured longitudinal data, with lower-level observations (e.g., repeated measures) nested within higher-level units (e.g., individuals). They can model variability both within and between individuals over time.
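
For instance, a simple linear growth model can be fit as a multilevel model with random intercepts and slopes using statsmodels; this is a minimal sketch assuming hypothetical long-format data with columns person_id, time, and wellbeing.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per person per measurement occasion.
df = pd.read_csv("wellbeing_long.csv")  # columns: person_id, time, wellbeing

# Fixed effects give the average trajectory; the random intercept and random slope
# for time capture between-person differences in starting level and rate of change.
model = smf.mixedlm("wellbeing ~ time", data=df,
                    groups=df["person_id"], re_formula="~time")
result = model.fit()
print(result.summary())
```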

Latent state-trait models decompose the covariance between longitudinal measurements into time-invariant trait factors, time-specific state residuals, and error variance. This allows separating stable between-person differences from within-person fluctuations.

There are many other techniques like latent transition analysis, event history analysis, and time series models that have specialized uses for particular research questions with longitudinal data. The choice of model depends on the hypotheses, timescale of measurements, age range covered, and other factors.
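
As one concrete example, a basic event history (survival) analysis can be run with the lifelines package; the follow-up times and event indicators below are hypothetical.

```python
import pandas as pd
from lifelines import KaplanMeierFitter

# Hypothetical event-history data: years of follow-up until the event of interest
# (e.g., onset of a disorder); event_observed = 0 means the person was censored,
# i.e., still event-free when last observed.
df = pd.DataFrame({
    "years_followed": [2.0, 5.5, 7.0, 1.5, 8.0, 3.0, 6.5, 4.0],
    "event_observed": [1, 0, 1, 1, 0, 1, 0, 1],
})

kmf = KaplanMeierFitter()
kmf.fit(durations=df["years_followed"], event_observed=df["event_observed"])
print(kmf.survival_function_)  # estimated probability of remaining event-free over time
```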

In general, these various statistical models allow investigation of important questions about developmental processes, change and stability over time, causal sequencing, and both between- and within-person sources of variability. However, researchers must carefully consider the assumptions behind the models they choose.

Longitudinal vs. Cross-Sectional Studies

Longitudinal studies and cross-sectional studies are two different observational study designs where researchers analyze a target population without manipulating or altering the natural environment in which the participants exist.

Yet, there are apparent differences between these two forms of study. One key difference is that longitudinal studies follow the same sample of people over an extended period of time, while cross-sectional studies look at the characteristics of different populations at a given moment in time.

Longitudinal studies tend to require more time and resources, but they can be used to detect cause-and-effect relationships and establish patterns among subjects.

On the other hand, cross-sectional studies tend to be cheaper and quicker but can only provide a snapshot of a point in time and thus cannot identify cause-and-effect relationships.

Both studies are valuable for psychologists to observe a given group of subjects. Still, cross-sectional studies are more beneficial for establishing associations between variables, while longitudinal studies are necessary for examining a sequence of events.

1. Are longitudinal studies qualitative or quantitative?

Longitudinal studies are typically quantitative. They collect numerical data from the same subjects to track changes and identify trends or patterns.

However, they can also include qualitative elements, such as interviews or observations, to provide a more in-depth understanding of the studied phenomena.

2. What’s the difference between a longitudinal and case-control study?

Case-control studies compare groups retrospectively and cannot be used to calculate relative risk. Longitudinal studies, though, can compare groups either retrospectively or prospectively.

In case-control studies, researchers study one group of people who have developed a particular condition and compare them to a sample without the disease.

Unlike case studies, which examine a single subject or case in depth, both case-control and longitudinal studies are conducted on larger groups of subjects.

3. Does a longitudinal study have a control group?

Yes, a longitudinal study can have a control group. In such a design, one group (the experimental group) would receive treatment or intervention, while the other group (the control group) would not.

Both groups would then be observed over time to see if there are differences in outcomes, which could suggest an effect of the treatment or intervention.

However, not all longitudinal studies have a control group, especially observational studies that are not testing a specific intervention.

Baltes, P. B., & Nesselroade, J. R. (1979). History and rationale of longitudinal research. In J. R. Nesselroade & P. B. Baltes (Eds.), Longitudinal research in the study of behavior and development (pp. 1–39). Academic Press.

Cook, N. R., & Ware, J. H. (1983). Design and analysis methods for longitudinal research. Annual review of public health , 4, 1–23.

Fitchett, G., Rybarczyk, B., Demarco, G., & Nicholas, J.J. (1999). The role of religion in medical rehabilitation outcomes: A longitudinal study. Rehabilitation Psychology, 44, 333-353.

Harvard Second Generation Study. (n.d.). Harvard Second Generation Grant and Glueck Study. Harvard Study of Adult Development. Retrieved from https://www.adultdevelopmentstudy.org.

Le Mare, L., & Audet, K. (2006). A longitudinal study of the physical growth and health of postinstitutionalized Romanian adoptees. Pediatrics & child health, 11 (2), 85-91.

Luo, Y., Hawkley, L. C., Waite, L. J., & Cacioppo, J. T. (2012). Loneliness, health, and mortality in old age: a national longitudinal study. Social science & medicine (1982), 74 (6), 907–914.

Marques, S. C., Pais-Ribeiro, J. L., & Lopez, S. J. (2011). The role of positive psychology constructs in predicting mental health and academic achievement in children and adolescents: A two-year longitudinal study. Journal of Happiness Studies: An Interdisciplinary Forum on Subjective Well-Being, 12( 6), 1049–1062.

Dekker, S. W. A., & Schaufeli, W. B. (1995). The effects of job insecurity on psychological health and withdrawal: A longitudinal study. Australian Psychologist, 30(1), 57–63.

Stice, E., Mazotti, L., Krebs, M., & Martin, S. (1998). Predictors of adolescent dieting behaviors: A longitudinal study. Psychology of Addictive Behaviors, 12 (3), 195–205.

Cruwys, T., Greenaway, K. H., & Haslam, S. A. (2015). The stress of passing through an educational bottleneck: A longitudinal study of psychology honours students. Australian Psychologist, 50(5), 372–381.

Thomas, L. (2020). What is a longitudinal study? Scribbr. Retrieved from https://www.scribbr.com/methodology/longitudinal-study/

Van der Vorst, H., Engels, R. C. M. E., Meeus, W., & Deković, M. (2006). Parental attachment, parental control, and early development of alcohol use: A longitudinal study. Psychology of Addictive Behaviors, 20 (2), 107–116.

Further Information

  • Schaie, K. W. (2005). What can we learn from longitudinal studies of adult development?. Research in human development, 2 (3), 133-158.
  • Caruana, E. J., Roman, M., Hernández-Sánchez, J., & Solli, P. (2015). Longitudinal studies. Journal of thoracic disease, 7 (11), E537.

What’s a Longitudinal Study? Types, Uses & Examples

Research can take anything from a few minutes to years or even decades to complete. When a systematic investigation goes on for an extended period, it’s most likely that the researcher is carrying out a longitudinal study of the sample population. So how does this work? 

In the simplest terms, a longitudinal study involves repeatedly observing how the different variables in your research population interact, tracking their exposure to various potential causal factors, and documenting the effects of that exposure. It is an effective way to investigate causal relationships within your sample population.

In this article, we’ll show you several ways to adopt longitudinal studies for your systematic investigation and how to avoid common pitfalls. 

What is a Longitudinal Study? 

A longitudinal study is a correlational research method that helps discover the relationship between variables in a specific target population. It is similar to a cross-sectional study, except that the researcher observes the variables over a longer period, sometimes lasting many years.

For example, let’s say you are researching social interactions among wild cats. You recruit a set of newborn lion cubs and study how they relate to each other as they grow. Periodically, you collect the same types of data from the group to track their development.

The advantage of this extended observation is that the researcher can witness the sequence of events leading to the changes in the traits of both the target population and the different groups. It can identify the causal factors for these changes and their long-term impact. 

Characteristics of Longitudinal Studies

1. Non-interference: In longitudinal studies, the researcher doesn’t interfere with the participants’ day-to-day activities in any way. When it’s time to collect their responses, the researcher administers a survey with qualitative and quantitative questions.

2. Observational: As we mentioned earlier, longitudinal studies involve observing the research participants throughout the study and recording any changes in traits that you notice. 

3. Timeline: A longitudinal study can span weeks, months, years, or even decades. This contrasts sharply with cross-sectional studies, which capture only a single point in time.

Cross-Sectional vs. Longitudinal Studies 

  • Definition 

A cross-sectional study is a type of observational study in which the researcher collects data from variables at a specific moment to establish a relationship among them. On the other hand, longitudinal research observes variables for an extended period and records all the changes in their relationship. 

Longitudinal studies take a longer time to complete. In some cases, the researchers can spend years documenting the changes among the variables plus their relationships. For cross-sectional studies, this isn’t the case. Instead, the researcher collects information in a relatively short time frame and makes relevant inferences from this data. 

While cross-sectional studies give you a snapshot of the situation in the research environment, longitudinal studies are better suited for contexts where you need to analyze a problem long-term. 

  • Sample Data

Longitudinal studies repeatedly observe the same sample population, while cross-sectional studies are conducted with different research samples. 

Because longitudinal studies span over a more extended time, they typically cost more money than cross-sectional observations. 

Types of Longitudinal Studies 

The three main types of longitudinal studies are: 

  • Panel Study
  • Retrospective Study
  • Cohort Study 

These methods help researchers to study variables and account for qualitative and quantitative data from the research sample. 

1. Panel Study 

In a panel study, the researcher uses data collection methods like surveys to gather information from the same fixed set of respondents at regular but widely spaced intervals, often spanning several years. It’s primarily designed for quantitative research, although you can also use this method for qualitative data analysis.

When To Use Panel Study

If you want to have first-hand, factual information about the changes in a sample population, then you should opt for a panel study. For example, medical researchers rely on panel studies to identify the causes of age-related changes and their consequences. 

Advantages of Panel Study  

  • It helps you identify the causal factors of changes in a research sample. 
  • It also allows you to observe the impact of these changes on the variables of interest and on their relationships at different points in time.
  • Panel studies can be used to obtain historical data from the sample population. 

Disadvantages of Panel Studies

  • Conducting a panel study is pretty expensive in terms of time and resources. 
  • It might be challenging to gather the same quality of data from respondents at every interval. 

2. Retrospective Study

In a retrospective study, the researcher depends on existing information from previous systematic investigations to discover patterns leading to the study outcomes. In other words, a retrospective study looks backward: it examines exposures to suspected risk or protective factors in relation to an outcome that is established at the start of the study.

When To Use Retrospective Study 

Retrospective studies are best for research contexts where you want to quickly estimate an exposure’s effect on an outcome. They also help you to discover preliminary measures of association in your data.

Medical researchers adopt retrospective study methods when they need to research rare conditions. 

Advantages of Retrospective Study

  • Retrospective studies happen at a relatively smaller scale and do not require much time to complete. 
  • It helps you to study rare outcomes when prospective surveys are not feasible.

Disadvantages of Retrospective Study

  • It is easily affected by recall bias or misclassification bias.
  • It often depends on convenience sampling, which is prone to selection bias. 

3. Cohort Study  

A cohort study entails collecting information from a group of people who share specific traits or have experienced a particular occurrence simultaneously. For example, a researcher might conduct a cohort study on a group of Black school children in the U.K. 

During a cohort study, the researcher identifies group members who share exposure to a specific characteristic or risk factor. Then, she records the outcome of this exposure and its impact on the exposed group.

When To Use Cohort Study

You should conduct a cohort study if you’re looking to establish a causal relationship within your data sets. For example, in medical research, cohort studies investigate the causes of disease and establish links between risk factors and effects. 

Advantages of Cohort Studies

  • It allows you to study multiple outcomes that can be associated with one risk factor. 
  • Cohort studies are designed to help you measure all variables of interest. 

Disadvantages of Cohort Studies

  • Cohort studies are expensive to conduct.
  • Throughout the process, the researcher has less control over variables. 

When Would You Use a Longitudinal Study? 

If you’re looking to discover the relationship between variables and the causal factors responsible for changes, you should adopt a longitudinal approach to your systematic investigation. Longitudinal studies help you to analyze change over a meaningful time. 

How to Perform a Longitudinal Study?

There are only two approaches you can take when performing a longitudinal study. You can either source your own data or use previously gathered data.

1. Sourcing for your own data

Collecting your own data gives you greater control over its quality, because you know exactly how it was gathered. How you collect your data also depends heavily on the type of study you’re conducting.

If you’re conducting a retrospective study, you’d have to collect data on events that have already happened. An example is going through records to find patterns in cancer patients.

For a prospective study, you collect the data in real-time. This means finding a sample population, following them, and documenting your findings over the course of your study.

Irrespective of the type of study you’re conducting, you need a versatile data collection tool, such as Formplus, to help you accurately record your data.

2. Using previously gathered data

Governmental and research institutes often carry out longitudinal studies and make the data available to the public, so you can pick up their previously collected data and use it for your own study. An example is the UK Data Service website.

Using previously gathered data isn’t just easier; it also allows you to cover a long period of time without having to wait for it to elapse.

The downside to this method is that it’s very restrictive because you can only use the data set available to you. You also have to thoroughly examine the source of the data given to you. 

Advantages of a Longitudinal Study 

  • Longitudinal studies help you discover variable patterns over time, leading to more precise causal relationships and research outcomes. 
  • When researching developmental trends, longitudinal studies allow you to discover changes across lifespans and arrive at valid research outcomes. 
  • They are highly flexible, which means the researcher can adjust the study’s focus while it is ongoing. 
  • Unlike other research methods, longitudinal studies collect unique, long-term data and highlight relationships that cannot be discovered in a short-term investigation. 
  • You can collect additional data to study unexpected findings at any point in your systematic investigation. 

Disadvantages and Limitations of a Longitudinal Study 

  • It’s difficult to predict the results of longitudinal studies because of the extended time frame. Also, it may take several years before the data begins to produce observable patterns or relationships that can be monitored. 
  • It costs lots of money to sustain a research effort for years. You’ll keep incurring costs every year compared to other forms of research that can be completed in a smaller fraction of the time.
  • Longitudinal studies require a large sample size which might be challenging to achieve. Without this, the entire investigation will have little or no impact. 
  • Longitudinal studies often experience panel attrition. This happens when some members of the research sample are unable to complete the study due to several reasons like changes in contact details, refusal, incapacity, and even death. 

Longitudinal Studies Examples

How does a longitudinal study work in the real world? To answer this, let’s consider a few typical scenarios. 

Example 1

A researcher wants to know the effects of a low-carb diet on weight loss. So, he gathers a group of obese men and kicks off the systematic investigation using his preferred longitudinal study method. He records information such as how much they weigh and the number of carbs in their diet at different points in time. All these data help him to arrive at valid research outcomes.

Example 2

A researcher wants to know if there’s any relationship between children who drink milk before school and high classroom performance. First, he uses a sampling technique to gather a large research population.

Then, he conducts a baseline survey to establish the premise of the research for later comparison. Next, the researcher gives a log to each participant to keep track of the predetermined research variables.

Example 3  

You decide to study how a particular diet affects athletes’ performance over time. First, you gather your sample population, establish a baseline for the research, and observe and record the required data.

Longitudinal Studies Frequently Asked Questions (FAQs) 

  • Are Longitudinal Studies Quantitative or Qualitative?

Longitudinal studies can be either qualitative or quantitative. Quantitative longitudinal surveys track numerical measures over time, while observational or interview-based designs collect qualitative data; many studies combine both, depending on your research context.

  • What Is Most Likely the Biggest Problem with Longitudinal Research?

The biggest challenge with longitudinal research is panel attrition. Because of the length of the research process, some participants may be unable to complete the study for one reason or another. When this happens, it can distort your data and research outcomes.

  • What is Longitudinal Data Collection?

Longitudinal data collection is the process of gathering information from the same sample population over a long period. Longitudinal data collection uses interviews, surveys, and observation to collect the required information from research sources. 

  • What is the Difference Between Longitudinal Data and a Time Series Analysis?

Because longitudinal studies collect data over a long period, they are often mistaken for time series analysis. So what’s the real difference between these two concepts? 

In a time series analysis, the researcher focuses on a single unit (such as one individual or one market) observed at many successive time points. Longitudinal (panel) data, by contrast, track multiple individuals, each observed at several time points.
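
The difference in data shape can be illustrated with a small, hypothetical example in pandas: a time series is one unit with many time points, while panel (longitudinal) data hold many units with a few time points each.

```python
import pandas as pd

# Time series: one unit (a single store's monthly sales) observed at many time points.
sales = pd.Series(
    [120, 135, 128, 140, 150, 147],
    index=pd.period_range("2023-01", periods=6, freq="M"),
    name="monthly_sales",
)

# Longitudinal (panel) data: many units, each observed at a few time points.
panel = pd.DataFrame({
    "respondent": ["A", "A", "B", "B", "C", "C"],
    "wave": [1, 2, 1, 2, 1, 2],
    "spending": [200, 210, 150, 160, 300, 290],
})

print(sales)
print(panel)
```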


What Is a Longitudinal Study?

Tracking Variables Over Time

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Amanda Tust is a fact-checker, researcher, and writer with a Master of Science in Journalism from Northwestern University's Medill School of Journalism.


A longitudinal study follows what happens to selected variables over an extended time. Psychologists use the longitudinal study design to explore possible relationships among variables in the same group of individuals over an extended period.

Once researchers have determined the study's scope, participants, and procedures, most longitudinal studies begin with baseline data collection. In the days, months, years, or even decades that follow, they continually gather more information so they can observe how variables change over time relative to the baseline.

For example, imagine that researchers are interested in the mental health benefits of exercise in middle age and how exercise affects cognitive health as people age. The researchers hypothesize that people who are more physically fit in their 40s and 50s will be less likely to experience cognitive declines in their 70s and 80s.

Longitudinal vs. Cross-Sectional Studies

Longitudinal studies, a type of correlational research, are usually observational, in contrast with cross-sectional research. Longitudinal research involves collecting data over an extended time, whereas cross-sectional research involves collecting data at a single point.

To test this hypothesis, the researchers recruit participants who are in their mid-40s to early 50s. They collect data related to current physical fitness, exercise habits, and performance on cognitive function tests. The researchers continue to track activity levels and test results for a certain number of years, look for trends in and relationships among the studied variables, and test the data against their hypothesis to form a conclusion.

Examples of Early Longitudinal Study Design

Examples of longitudinal studies extend back to the 17th century, when King Louis XIV periodically gathered information from his Canadian subjects, including their ages, marital statuses, occupations, and assets such as livestock and land. He used the data to spot trends over the years and understand his colonies' health and economic viability.

In the 18th century, Count Philibert Gueneau de Montbeillard conducted the first recorded longitudinal study when he measured his son every six months and published the information in "Histoire Naturelle."

The Genetic Studies of Genius (also known as the Terman Study of the Gifted), which began in 1921, is one of the first studies to follow participants from childhood into adulthood. Psychologist Lewis Terman's goal was to examine the similarities among gifted children and disprove the common assumption at the time that gifted children were "socially inept."

Types of Longitudinal Studies

Longitudinal studies fall into three main categories.

  • Panel study: Sampling of a cross-section of individuals
  • Cohort study: Sampling of a group based on a specific event, such as birth, geographic location, or experience
  • Retrospective study: Review of historical information such as medical records

Benefits of Longitudinal Research

Longitudinal studies can provide valuable insight that other studies can't. They're particularly useful when studying developmental and lifespan issues because they allow glimpses into changes and possible reasons for them.

For example, some longitudinal studies have explored differences and similarities among identical twins, some reared together and some apart. In these types of studies, researchers tracked participants from childhood into adulthood to see how environment influences personality, achievement, and other areas.

Because the participants share the same genetics, researchers chalked up any differences to environmental factors. Researchers can then look at what the participants have in common and where they differ to see which characteristics are more strongly influenced by either genetics or experience. Note that adoption agencies no longer separate twins, so such studies are unlikely today. Longitudinal studies on twins have shifted to those within the same household.

As with other types of psychology research, researchers must take into account some common challenges when considering, designing, and performing a longitudinal study.

Longitudinal studies require time and are often quite expensive. Because of this, these studies often have only a small group of subjects, which makes it difficult to apply the results to a larger population.

Selective Attrition

Participants sometimes drop out of a study for any number of reasons, like moving away from the area, illness, or simply losing motivation. This tendency, known as selective attrition, shrinks the sample size and decreases the amount of data collected.

If the final group no longer reflects the original representative sample, attrition can threaten the validity of the experiment. Validity refers to whether or not a test or experiment accurately measures what it claims to measure. If the final group of participants doesn't represent the larger group accurately, generalizing the study's conclusions is difficult.

The World’s Longest-Running Longitudinal Study

Lewis Terman aimed to investigate how highly intelligent children develop into adulthood with his "Genetic Studies of Genius." Results from this study were still being compiled into the 2000s. However, Terman was a proponent of eugenics and has been accused of letting his own sexism, racism, and economic prejudice influence his study and of drawing major conclusions from weak evidence. Despite these criticisms, Terman's study remains influential in longitudinal research. For example, a recent study found new information on the original Terman sample, which indicated that men who skipped a grade as children went on to have higher incomes than those who didn't.

A Word From Verywell

Longitudinal studies can provide a wealth of valuable information that would be difficult to gather any other way. Despite the typical expense and time involved, longitudinal studies from the past continue to influence and inspire researchers and students today.

A longitudinal study follows up with the same sample (i.e., group of people) over time, whereas a cross-sectional study examines one sample at a single point in time, like a snapshot.

A longitudinal study can occur over any length of time, from a few weeks to a few decades or even longer.

That depends on what researchers are investigating. A researcher can measure data on just one participant or thousands over time. The larger the sample size, of course, the more likely the study is to yield results that can be extrapolated.

Piccinin AM, Knight JE. History of longitudinal studies of psychological aging . Encyclopedia of Geropsychology. 2017:1103-1109. doi:10.1007/978-981-287-082-7_103

Terman L. Study of the gifted . In: The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation. 2018. doi:10.4135/9781506326139.n691

Sahu M, Prasuna JG. Twin studies: A unique epidemiological tool .  Indian J Community Med . 2016;41(3):177-182. doi:10.4103/0970-0218.183593

Almqvist C, Lichtenstein P. Pediatric twin studies . In:  Twin Research for Everyone . Elsevier; 2022:431-438.

Warne RT. An evaluation (and vindication?) of Lewis Terman: What the father of gifted education can teach the 21st century . Gifted Child Q. 2018;63(1):3-21. doi:10.1177/0016986218799433

Warne RT, Liu JK. Income differences among grade skippers and non-grade skippers across genders in the Terman sample, 1936–1976 . Learning and Instruction. 2017;47:1-12. doi:10.1016/j.learninstruc.2016.10.004

Wang X, Cheng Z. Cross-sectional studies: Strengths, weaknesses, and recommendations .  Chest . 2020;158(1S):S65-S71. doi:10.1016/j.chest.2020.03.012

Caruana EJ, Roman M, Hernández-Sánchez J, Solli P. Longitudinal studies .  J Thorac Dis . 2015;7(11):E537-E540. doi:10.3978/j.issn.2072-1439.2015.10.63


Longitudinal Studies: Methods, Benefits and Challenges

Introduction

Longitudinal research refers to any study that collects the same sample of data from the same group of people at different points in time. While time-consuming and potentially costly in terms of resources and effort, a longitudinal study has enormous utility in understanding complex phenomena that might change as time passes.

In this article, we will explore the nature and importance of longitudinal studies to allow you to decide whether your research inquiry warrants a longitudinal inquiry or if a cross-sectional study is more appropriate.

To understand a longitudinal study, let's start with a simple survey as an example. Determining the popularity of a particular product or service at a specific point in time can simply be a matter of collecting and analyzing survey responses from a certain number of people within a population. The qualitative and quantitative data collected from these surveys can tell you what people think at the moment those surveys were conducted. This is what is known as a cross-sectional study.

Now imagine the product that you're trying to assess is seasonal like a brand of ice cream or hot chocolate. What's popular in summer may not be popular in winter, and trends come and go as competing products enter the market. In this context, the one survey that was conducted is merely a snapshot of a moving phenomenon at a single point in time.

In a longitudinal study design, that same survey will be distributed to the same group of people at different time intervals (e.g., twice a year or once a month) to allow researchers to see if there are any changes. Perhaps there is an ice cream that is as popular in the winter as it is in the summer, which may be worth identifying to expand profitability. A longitudinal study would thus be useful to explore this question.

Longitudinal research isn't conducted simply for the sake of being able to say research was conducted over an extended period of time. A longitudinal analysis collects data at different points in time to observe changes in the characteristics of the object of inquiry. Ultimately, collecting data for a longitudinal study can help identify cause-and-effect relationships that cannot otherwise be perceived in discrete or cross-sectional studies.

Longitudinal studies are found in many research fields where time is an important factor. Let's look at examples in three different research areas.

Education

Classroom research is often longitudinal because of the acknowledgment that successful learning takes place over time and not merely in a single class session. Such studies take place over several classes, perhaps over a semester or an entire academic year. A researcher might observe the same group of students as they progress academically or, conversely, identify any significant decline in learning outcomes to determine how changes in teaching and learning over time might affect student development.

Health sciences

Medical research often relies on longitudinal studies to determine the effectiveness and risk factors involved with drugs, treatments, or other medical remedies. Consider a dietary supplement that is purported to help people lose weight. Perhaps, in the beginning, people who take this supplement actually do lose weight. But what happens later on? Do they keep the weight off, gain it back or, even worse, gain even more weight in the long term? A longitudinal study can help researchers determine if that supplement produces sustainable results or is merely a quick fix that has negative side effects later on.

Market research

Product life cycles and market trends can take extended periods of time to manifest. In the meantime, competing products might enter the market and consequently affect customer loyalty and product image. If a cross-sectional study captures a snapshot of opinions in the marketplace, then think of a longitudinal study as several snapshots spread out over time to allow researchers to observe changes in market behavior and their underlying causes as time passes.

Cross-sectional studies are discrete studies that capture data within a particular context at a particular point in time. These kinds of studies are more appropriate for research inquiries that don't examine some form of development or evolution, such as concepts or phenomena that are generally static or unchanging over extended periods of time.

To determine which type of study would be more appropriate for your research inquiry, it's important to identify the object of inquiry that is being studied. Ask yourself the following questions when planning your study:

  • Do you need an extended period of time to sufficiently capture the phenomenon?
  • Is the sample of data collected likely to change over time?
  • Is it feasible to commit time and resources to an extended study?

If you said yes to all of these questions, a longitudinal study would be suited to addressing your research questions. Otherwise, cross-sectional studies may be more appropriate for your research.


A longitudinal study can provide many benefits potentially relevant to the research question you are looking to address. Here are three different advantages you might consider.

Abundance of data

In many cases, research rigor is served by collecting abundant data. Research approaches like thematic analysis and content analysis benefit from a large set of data that helps you identify the most frequently occurring phenomena within a research context. Large data sets collected through longitudinal studies can be useful for separating abundance from anecdotes.

Identification of patterns

Analyzing patterns often implies exploring how things interact sequentially or over time, which is best captured with longitudinal data. Think about, for example, how sports competitions and political elections take place over a year or even multiple years. Construction of ships and buildings can be a long and protracted process. Doctoral students can spend four or more years before earning their degree. A simple cross-sectional study in such contexts may not gather sufficient data captured over a period of time long enough to observe sequences of related events.

Observation of relationships

Certain relationships between different phenomena can only be observed longitudinally. The famous marshmallow test that asserted connections between behaviors in childhood and later life outcomes spawned decades of longitudinal study. Even if your research is much simpler, your research question might involve the observation of distant but related phenomena that only a longitudinal study can capture.

There are two types of longitudinal studies to choose from, primarily depending on what you are looking to examine. Keep in mind that longitudinal study design, no matter what type of study you might pursue, is a matter of sustaining a research inquiry over time to capture the necessary data. It's important that your decision-making process is both transparent and intentional for the sake of research rigor.

Cohort studies

A cohort study examines a group of people that share a common trait. This trait could be a similar age group, a common level of education, or a shared experience.

An example of a cohort study is one that looks to identify factors related to successful aging found in lifestyles among people of middle age. Such a study could observe a group of people, all of whom are similar in age, to identify a common range of lifestyles and activities that are applicable for others of the same age group.

Panel studies

The difference between a cohort study and a panel study is that panel studies collect data from within a general population, rather than a specific set of particular individuals with a common characteristic. The goal of a panel study is to examine a representative sample of a larger population rather than a specific subset of people.

A longitudinal survey that adopts a panel study model, for example, would randomly sample a population and send out questionnaires to the same sample of people over time. Such a survey could look at changes in everyday habits regarding spending or work-life balance and how they might be influenced by environmental or economic shifts from one period of time to the next.

Planning a prospective or future research study that is longitudinal requires careful attention to detail prior to conducting the study. By itself, a longitudinal study can be considered a repeated sequence of the same discrete study across different periods of time.

However, ensuring that multiple iterations of the same study are conducted repeatedly and rigorously is the challenge in longitudinal studies. With that in mind, let's look at some of the different research methods that might be employed in longitudinal research.

Observational research

Action research and ethnographies rely on longitudinal observations to provide sufficient depth to the cultural practices and interactions that are under study. In anthropological and sociological research, some phenomena are so complex or dynamic that they can only be observed longitudinally.

Organizational research, for example, employs longitudinal research to identify how people in the workplace or other similar settings interact with each other. This kind of research is useful for understanding how rapport is established and whether productivity increases as a result.

Surveys

A longitudinal survey can address research questions that deal with opinions and perspectives that may change over time. Unlike a cross-sectional survey from a particular point in time, longitudinal surveys are administered repeatedly to the same group of people to collect data on changes or developments.

A personal wellness study, for example, might examine how healthy habits (or the lack thereof) affect health by asking respondents questions about their daily routine. By comparing their routines over time with information such as blood pressure, weight, and waist size, survey data on lifestyle routines can allow researchers to identify what habits can cause the greatest improvement in individual health.

Experiments

Various experimental studies, especially in medical research, can be longitudinal in nature. A longitudinal experiment usually collects data from a control group and an experimental group to observe the effects of a certain treatment on the same participants over a period of time.

This type of research is commonly employed to examine the effects of medical treatments on outcomes such as cardiovascular disease or diabetes. The requirements for governmental approval are incredibly stringent and call for rigorous data collection that establishes causality.

Needless to say, longitudinal studies tend to be time-consuming. The most obvious drawback of longitudinal studies is that they take up a significant portion of researchers' time and effort.

However, there are other disadvantages of longitudinal studies, particularly the likelihood of participant attrition. In other words, the longer the study, the more likely it is that participants will drop out. This is especially true when working with vulnerable or marginalized populations, such as migrant workers or homeless people, who may not always be easy to contact for data collection.

Over the course of time, the research context may change with the appearance of new technologies, trends, or other developments that the researcher did not anticipate. While confounding influences are possible in any study, they are likely to be more abundant in studies on a longitudinal scale. As a result, it is important for the researcher to try to account for these influences when analyzing the data. It could even be worthwhile to examine how the appearance of such a new development affected a relevant outcome of interest in your area.



What is a longitudinal study?


Longitudinal studies are common in epidemiology, economics, and medicine. They are also used in other social and applied sciences, for example to study customer trends. Researchers periodically observe and collect data on the variables of interest without manipulating the study environment.

A company may conduct a tracking study, surveying a target audience to measure changes in attitudes and behaviors over time. The measures stay the same, and the time interval between waves remains consistent. This type of longitudinal study can measure brand awareness, customer satisfaction, and consumer opinions, and analyze the impact of an advertising campaign.


Types of longitudinal studies

There are two types of longitudinal studies: cohort and panel studies.

Panel study

A panel study is a type of longitudinal study that involves collecting data on a fixed set of variables at regular, though often widely spaced, intervals. Researchers follow the same group or groups of people over time. Panel studies are designed for quantitative analysis but can also support qualitative analysis.

A panel study may research the causes of age-related changes and their effects. Researchers may measure the health markers of a group over time, such as their blood pressure, blood cholesterol, and mental acuity. Then, they can compare the scores to understand how age positively or negatively correlates with these measures.

Cohort study

A cohort longitudinal study involves gathering information from a group of people with something in common, such as a specific trait or the experience of the same event. The researchers observe the behaviors and other details of the group over time. Unlike panel studies, cohort studies allow you to select a different group of participants from the cohort at each round of data collection.

An example of a cohort study could be a drug manufacturer studying the effects of a new drug on a group of users over a period of time. Similarly, a drinks company may want to research consumers with common characteristics, such as regular purchasers of sugar-free sodas, to understand trends within its target market.

Benefits of longitudinal research

If you want to study the relationship between variables and causal factors responsible for certain outcomes, you should adopt a longitudinal approach to your investigation.

The benefits of longitudinal research over other research methods include the following:

Insights over time

It gives insights into how and why certain things change over time.

Better information

Researchers can better establish sequences of events and identify trends.

No recall bias

The participants won't have recall bias if you use a prospective longitudinal study. Recall bias is an error that occurs in a study if respondents don't wholly or accurately recall the details of their actions, attitudes, or behaviors.

Because variables can change during the study, researchers can discover new relationships or data points worth further investigation.

Small groups

Longitudinal studies don't need a large group of participants.

Potential pitfalls

The challenges and potential pitfalls of longitudinal studies include the following:

A longitudinal survey takes a long time, involves multiple rounds of data collection, and requires complex processes, making it more expensive than other research methods.

Unpredictability

Because they take a long time, longitudinal studies are unpredictable. Unexpected events can cause changes in the variables, making earlier data potentially less valuable.

Slow insights

Researchers can take a long time to uncover insights from the study as it involves multiple observations.

Participants can drop out of the study, limiting the data set and making it harder to draw valid conclusions from the results.

Overly specific data

If you study a smaller group to reduce research costs, the results will be less generalizable to larger populations than those from a study with a larger sample.

Despite these potential pitfalls, you can still derive significant value from a well-designed longitudinal study by uncovering long-term patterns and relationships.

Longitudinal study designs

Longitudinal studies can take three forms: repeated cross-sectional, prospective, and retrospective.

Repeated cross-sectional studies

Repeated cross-sectional studies are a type of longitudinal study where participants change across sampling periods. For example, as part of a brand awareness survey, you ask different people from the same customer population about their brand preferences.

Prospective studies

A prospective study is a longitudinal study that involves real-time data collection, following the same participants over a period of time. Prospective longitudinal studies can be cohort studies, where participants share similar characteristics or experiences, or panel studies, where the population sample is chosen randomly.

Retrospective studies

Retrospective studies are longitudinal studies that involve collecting data on events that some participants have already experienced. Researchers examine historical information to identify patterns that led to an outcome they established at the start of the study. Retrospective studies are the most time- and cost-efficient of the three.

How to perform a longitudinal study

When developing a longitudinal study plan, you must decide whether to collect your own data or use data from other sources. Each choice has its benefits and drawbacks.

Using data from other sources

You can freely access data from many previous longitudinal studies, especially studies conducted by governments and research institutes. For example, anyone can access data from the 1970 British Cohort Study on the UK Data Service website.

Using data from other sources saves the time and money you would have spent gathering data. However, the data is more restrictive than the data you collect yourself. You are limited to the variables the original researcher was investigating, and they may have aggregated the data, obscuring some details.

If you can't find data or longitudinal research that applies to your study, the only option is to collect it yourself.

Collecting your own data

Collecting data enhances its relevance, integrity, reliability, and verifiability. Your data collection methods depend on the type of longitudinal study you want to perform. For example, a retrospective longitudinal study collects historical data, while a prospective longitudinal study collects real-time data.

The only way to ensure relevant and reliable data is to use an effective and versatile data collection tool. It can improve the speed and accuracy of the information you collect.

What is a longitudinal study in research?

A longitudinal study is a research design that involves studying the same variables over time by gathering data continuously or repeatedly at consistent intervals.

What is an example of a longitudinal study?

An excellent example of a longitudinal study is market research to identify market trends. The organization's researchers collect data on customers' likes and dislikes to assess market trends and conditions. An organization can also conduct longitudinal studies after launching a new product to understand customers' perceptions and how it is doing in the market.

Why is it called a longitudinal study?

It’s a longitudinal study because you collect data over an extended period. Longitudinal data tracks the same type of information on the same variables at multiple points in time. You collect the data over repeated observations.

What is a longitudinal study vs. a cross-sectional study?

A longitudinal study follows the same people over an extended period, while a cross-sectional study looks at the characteristics of different people or groups at a given time. Longitudinal studies provide insights over an extended period and can establish patterns among variables.

Cross-sectional studies provide insights about a point in time, so they cannot identify cause-and-effect relationships.



Longitudinal Research: A Panel Discussion on Conceptual Issues, Research Design, and Statistical Techniques

All authors contributed equally to this article and the order of authorship is arranged arbitrarily. Correspondence concerning this article should be addressed to Mo Wang, Warrington College of Business, Department of Management, University of Florida, Gainesville, FL 32611.

Decision Editor: Donald Truxillo, PhD


Mo Wang, Daniel J. Beal, David Chan, Daniel A. Newman, Jeffrey B. Vancouver, Robert J. Vandenberg, Longitudinal Research: A Panel Discussion on Conceptual Issues, Research Design, and Statistical Techniques, Work, Aging and Retirement, Volume 3, Issue 1, 1 January 2017, Pages 1–24, https://doi.org/10.1093/workar/waw033


The goal of this article is to clarify the conceptual, methodological, and practical issues that frequently emerge when conducting longitudinal research, as well as in the journal review process. Using a panel discussion format, the current authors address 13 questions associated with 3 aspects of longitudinal research: conceptual issues, research design, and statistical techniques. These questions are intentionally framed at a general level so that the authors could address them from their diverse perspectives. The authors’ perspectives and recommendations provide a useful guide for conducting and reviewing longitudinal studies in work, aging, and retirement research.

An important meta-trend in work, aging, and retirement research is the heightened appreciation of the temporal nature of the phenomena under investigation and the important role that longitudinal study designs play in understanding them (e.g., Heybroek, Haynes, & Baxter, 2015 ; Madero-Cabib, Gauthier, & Le Goff, 2016 ; Wang, 2007 ; Warren, 2015 ; Weikamp & Göritz, 2015 ). This echoes the trend in more general research on work and organizational phenomena, where the discussion of time and longitudinal designs has evolved from explicating conceptual and methodological issues involved in the assessment of changes over time (e.g., McGrath & Rotchford, 1983 ) to the development and application of data analytic techniques (e.g., Chan, 1998 ; Chan & Schmitt, 2000 ; DeShon, 2012 ; Liu, Mo, Song, & Wang, 2016 ; Wang & Bodner, 2007 ; Wang & Chan, 2011 ; Wang, Zhou, & Zhang, 2016 ), theory rendering (e.g., Ancona et al. , 2001 ; Mitchell & James, 2001 ; Vancouver, Tamanini, & Yoder, 2010 ; Wang et al. , 2016 ), and methodological decisions in conducting longitudinal research (e.g., Beal, 2015 ; Bolger, Davis, & Rafaeli, 2003 ; Ployhart & Vandenberg, 2010 ). Given the importance of and the repeated call for longitudinal studies to investigate work, aging, and retirement-related phenomena (e.g., Fisher, Chaffee, & Sonnega, 2016 ; Wang, Henkens, & van Solinge, 2011 ), there is a need for more nontechnical discussions of the relevant conceptual and methodological issues. Such discussions would help researchers to make more informed decisions about longitudinal research and to conduct studies that would both strengthen the validity of inferences and avoid misleading interpretations.

In this article, using a panel discussion format, the authors address 13 questions associated with three aspects of longitudinal research: conceptual issues, research design, and statistical techniques. These questions, as summarized in Table 1 , are intentionally framed at a general level (i.e., not solely in aging-related research), so that the authors could address them from diverse perspectives. The goal of this article is to clarify the conceptual, methodological, and practical issues that frequently emerge in the process of conducting longitudinal research, as well as in the related journal review process. Thus, the authors’ perspectives and recommendations provide a useful guide for conducting and reviewing longitudinal studies—not only those dealing with aging and retirement, but also in the broader fields of work and organizational research.

Table 1. Questions Regarding Longitudinal Research Addressed in This Article

Conceptual Issue Question 1: Conceptually, what is the essence of longitudinal research?

This is a fundamental question to ask given the confusion in the literature. It is common to see authors attribute their high confidence in their causal inferences to the longitudinal design they use. It is also common to see authors attribute greater confidence in their measurement because of using a longitudinal design. Less common, but with increasing frequency, authors claim to be examining the role of time in their theoretical models via the use of longitudinal designs. These different assumptions by authors illustrate the need for clarifying when specific attributions about longitudinal research are appropriate. Hence, a discussion of the essence of longitudinal research and what it provides is in order.

Oddly, definitions of longitudinal research are rare. One exception is a definition by Taris (2000) , who explained that longitudinal “data are collected for the same set of research units (which might differ from the sampling units/respondents) for (but not necessarily at) two or more occasions, in principle allowing for intra-individual comparison across time” (pp. 1–2). Perhaps more directly relevant for the current discussion of longitudinal research related to work and aging phenomena, Ployhart and Vandenberg (2010) defined “ longitudinal research as research emphasizing the study of change and containing at minimum three repeated observations (although more than three is better) on at least one of the substantive constructs of interest” (p. 97; italics in original). Compared to Taris (2000) , Ployhart and Vandenberg’s (2010) definition explicitly emphasizes change and encourages the collection of many waves of repeated measures. However, Ployhart and Vandenberg’s definition may be overly restrictive. For example, it precludes designs often classified as longitudinal such as the prospective design. In a prospective design, some criterion (i.e., presumed effect) is measured at Times 1 and 2, so that one can examine change in the criterion as a function of events (i.e., presumed causes) happening (or not) between the waves of data collection. For example, a researcher can use this design to assess the psychological and behavioral effects of retirement that occur before and after retirement. That is, psychological and behavioral variables are measured before and after retirement. Though not as internally valid as an experiment (which is not possible because we cannot randomly assign participants into retirement and non-retirement conditions), this prospective design is a substantial improvement over the typical design where the criteria are only measured at one time. This is because it allows one to more directly examine change in a criterion as a function of differences between events or person variables. Otherwise, one must draw inferences based on retrospective accounts of the change in criterion along with the retrospective accounts of the events; further, one may worry that the covariance between the criterion and person variables is due to changes in the criterion that are also changing the person. Of course, this design does not eliminate the possibility that changes in criterion may cause differences in events (e.g., changes observed in psychological and behavioral variables lead people to decide to retire).

In addition to longitudinal designs potentially having only two waves of data collection for a variable, there are certain kinds of criterion variables that need only one explicit measure at Time 2 in a 2-wave study. Retirement (or similarly, turnover) is an example. I say “explicit” because retirement is implicitly measured at Time 1. That is, if the units are in the working sample at Time 1, they have not retired. Thus, retirement at Time 2 represents change in working status. On the other hand, if retirement intentions is the criterion variable, repeated measures of this variable are important for assessing change. Repeated measures also enable the simultaneous assessment of change in retirement intentions and its alleged precursors; it could be that a variable like job satisfaction (a presumed cause of retirement intentions) is actually lowered after the retirement intentions are formed, perhaps in a rationalization process. That is, individuals first intend to retire and then evaluate over time their attitudes toward their present job. This kind of reverse causality process would not be detected in a design measuring job satisfaction at Time 1 and retirement intentions at Time 2.

Given the above, I opt for a much more straightforward definition of longitudinal research. Specifically, longitudinal research is simply research where data are collected over a meaningful span of time. A difference between this definition and the one by Taris (2000) is that this definition does not include the clause about examining intra-individual comparisons. Such designs can examine intra-individual comparisons, but again, this seems overly restrictive. That said, I do add a restriction to this definition, which is that the time span should be “meaningful.” This term is needed because time will always pass—that is, it takes time to complete questionnaires, do tasks, or observe behavior, even in cross-sectional designs. Yet, this passage of time likely provides no validity benefit. On the other hand, the measurement interval could last only a few seconds and still be meaningful. To be meaningful it has to support the inferences being made (i.e., improve the research’s validity). Thus, the essence of longitudinal research is to improve the validity of one’s inferences that cannot otherwise be achieved using cross-sectional research ( Shadish, Cook, & Campbell, 2002 ). The inferences that longitudinal research can potentially improve include those related to measurement (i.e., construct validity), causality (i.e., internal validity), generalizability (i.e., external validity), and quality of effect size estimates and hypothesis tests (i.e., statistical conclusion validity). However, the ability of longitudinal research to improve these inferences will depend heavily on many other factors, some of which might make the inferences less valid when using a longitudinal design. Increased inferential validity, particularly of any specific kind (e.g., internal validity), is not an inherent quality of the longitudinal design; it is a goal of the design. And it is important to know how some forms of the longitudinal design fall short of that goal for some inferences.

For example, consider a case where a measure of a presumed cause precedes a measure of a presumed effect, but over a time period across which one of the constructs in question does not likely change. Indeed, it is often questionable as to whether a gap of several months between the observations of many variables examined in research would change meaningfully over the interim, much less that the change in one preceded the change in the other (e.g., intention to retire is an example of this, as people can maintain a stable intention to retire for years). Thus, the design typically provides no real improvement in terms of internal validity. On the other hand, it does likely improve construct and statistical conclusion validity because it likely reduces common method bias effects found between the two variables ( Podsakoff et al., 2003 ).

Further, consider the case of the predictive validity design, where a selection instrument is measured from a sample of job applicants and performance is assessed some time later. In this case, common method bias is not generally the issue; external validity is. The longitudinal design improves external validity because the Time 1 measure is taken during the application process, which is the context in which the selection instrument will be used, and the Time 2 measure is taken after a meaningful time interval (i.e., after enough time has passed for performance to have stabilized for the new job holders). Again, however, internal validity is not much improved, which is fine given that prediction, not cause, is the primary concern in the selection context.

Another clear construct validity improvement gained by using longitudinal research is when one is interested in measuring change. A precise version of change measurement is assessing rate of change. When assessing the rate, time is a key variable in the analysis. To assess a rate one needs only two repeated measures of the variable of interest, though these measures should be taken from several units (e.g., individuals, groups, organizations) if measurement and sampling errors are present and perhaps under various conditions if systematic measurement error is possible (e.g., testing effect). Moreover, Ployhart and Vandenberg (2010) advocate at least three repeated measures because most change rates are not constant; thus, more than two observations will be needed to assess whether and how the rate changes (i.e., the shape of the growth curves). Indeed, three is hardly enough given noise in measurement and the commonality of complex processes (i.e., consider the opponent process example below).

Longitudinal research designs can, with certain precautions, improve one’s confidence in inferences about causality. When this is the purpose, time does not need to be measured or included as a variable in the analysis, though the interval between measurements should be reported because rate of change and cause are related. For example, intervals can be too short, such that given the rate of an effect, the cause might not have had sufficient time to register on the effect. Alternatively, if intervals are too long, an effect might have triggered a compensating process that overshoots the original level, inverting the sign of the cause’s effect. An example of this latter process is opponent process ( Solomon & Corbit, 1974 ). Figure 1 depicts this process, which refers to the response to an emotional stimulus. Specifically, the emotional response elicits an opponent process that, at its peak, returns the emotion back toward the baseline and beyond. If the emotional response is collected when peak opponent response occurs, it will look like the stimulus is having the opposite effect than it actually is having.

Figure 1. The opponent process effect demonstrated by Solomon and Corbit (1974).
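To make the timing issue concrete, here is a minimal simulation sketch (in Python) of an opponent-process-like response; the functional forms and parameter values are illustrative assumptions, not estimates from Solomon and Corbit (1974). It shows how the measured effect of a stimulus can appear positive or negative depending solely on when the outcome is sampled.

```python
import numpy as np

# Illustrative opponent-process dynamics: a fast primary emotional response to a
# stimulus plus a slower opposing process that overshoots the baseline.
# All functional forms and parameters are assumptions for demonstration only.
t = np.linspace(0, 60, 601)                               # time since stimulus onset
primary = 1.0 * np.exp(-t / 10.0)                         # fast response that decays
opponent = -1.3 * (np.exp(-t / 30.0) - np.exp(-t / 8.0))  # slower opposing process
net_response = primary + opponent                         # observed emotional response

# Sampling shortly after the stimulus shows a positive effect; sampling near the
# peak of the opponent process makes the same stimulus look like it has the
# opposite effect.
early = net_response[np.argmin(np.abs(t - 3))]
late = net_response[np.argmin(np.abs(t - 35))]
print(f"apparent effect when measured at t = 3:  {early:+.2f}")
print(f"apparent effect when measured at t = 35: {late:+.2f}")
```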

Most of the longitudinal research designs that improve internal validity are quasi-experimental (Shadish et al., 2002). For example, interrupted time series designs use repeated observations to assess trends before and after some manipulation or “natural experiment” to model possible maturation or maturation-by-selection effects (Shadish et al., 2002; Stone-Romero, 2010). Likewise, regression discontinuity designs (RDD) use a pre-test to assign participants to the conditions prior to the manipulation and thus can use the pre-test value to model selection effects (Shadish et al., 2002; Stone-Romero, 2010). Interestingly, the RDD design does not assess change explicitly and thus is not susceptible to maturation threats, but it uses the timing of measurement in a meaningful way.

Panel (i.e., cohort) designs are also typically considered longitudinal. These designs measure all the variables of interest during each wave of data collection. I believe it was these kinds of designs that Ployhart and Vandenberg (2010) had in mind when they created their definition of longitudinal research. In particular, these designs can be used to assess rates of change and can improve causal inferences if done well. In particular, to improve causal inferences with panel designs, researchers nearly always need at least three repeated measures of the hypothesized causes and effects. Consider the case of job satisfaction and intent to retire. If a researcher measures job satisfaction and intent to retire at Times 1 and 2 and finds that the Time 2 measures of job satisfaction and intent to retire are negatively related when the Time 1 states of the variables are controlled, the researcher still cannot tell which changed first (or if some third variable causes both to change in the interim). Unfortunately, three observations of each variable is only a slight improvement because it might be a difficult thing to get enough variance in changing attitudes and changing intentions with just three waves to find anything significant. Indeed, the researcher might have better luck looking at actual retirement, which as mentioned, only needs one observation. Still, two observations of job satisfaction are needed prior to the retirement to determine if changes in job satisfaction influence the probability of retirement.
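As a rough illustration of the two-wave panel analysis described above, the sketch below estimates cross-lagged regressions on simulated data; the variable names, data-generating values, and model choice are hypothetical, and the point is only that two waves leave the direction of change ambiguous.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# Hypothetical two-wave panel: job satisfaction (js) and retirement intentions (ri)
# measured at Time 1 and Time 2 for the same individuals (simulated values).
js1 = rng.normal(size=n)
ri1 = -0.3 * js1 + rng.normal(size=n)
js2 = 0.6 * js1 + rng.normal(scale=0.8, size=n)
ri2 = 0.6 * ri1 - 0.2 * js1 + rng.normal(scale=0.8, size=n)
df = pd.DataFrame({"js1": js1, "ri1": ri1, "js2": js2, "ri2": ri2})

# Cross-lagged regressions: each Time 2 variable on both Time 1 variables.
# A js1 -> ri2 path (controlling ri1) is still compatible with reverse causality
# or a third variable; two waves cannot settle which variable changed first.
print(smf.ols("ri2 ~ ri1 + js1", data=df).fit().params)
print(smf.ols("js2 ~ js1 + ri1", data=df).fit().params)
```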

Finally, on this point I would add that meaningful variance in time will often mean case-intensive designs (i.e., lots of observations of lots of variables over time per case; Bolger & Laurenceau, 2013 ; Wang et al. , 2016 ) because we will be more and more interested in assessing feedback and other compensatory processes, reciprocal relationships, and how dynamic variables change. In these cases, within-unit covariance will be much more interesting than between-unit covariance.

It is important to point out that true experimental designs are also a type of longitudinal research design by nature. This is because in experimental design, an independent variable is manipulated before the measure of the dependent variable occurs. This time precedence (or lag) is critical for using experimental designs to achieve stronger causal inferences. Specifically, given that random assignment is used to generate experimental and control groups, researchers can assume that prior to the manipulation, the mean levels of the dependent variables are the same across experimental and control groups, as well as the mean levels of the independent variables. Thus, by measuring the dependent variable after manipulation, an experimental design reveals the change in the dependent variable as a function of change in the independent variable as a result of manipulation. As such, the time lag between the manipulation and the measure of the dependent variable is indeed meaningful in the sense of achieving causal inference.

Conceptual Issue Question 2: What is the status of “time” in longitudinal research? Is “time” a general notion of the temporal dynamics in phenomena, or is “time” a substantive variable similar to other focal variables in the longitudinal study?

In longitudinal research, we are concerned with conceptualizing and assessing the changes over time that may occur in one or more substantive variables. A substantive variable refers to a measure of an intended construct of interest in the study. For example, in a study of newcomer adaptation (e.g., Chan & Schmitt, 2000 ), the substantive variables, whose changes over time we are interested in tracking, could be frequency of information seeking, job performance, and social integration. We could examine the functional form of the substantive variable’s change trajectory (e.g., linear or quadratic). We could also examine the extent to which individual differences in a growth parameter of the trajectory (e.g., the individual slopes of a linear trajectory) could be predicted from the initial (i.e., at Time 1 of the repeated measurement) values on the substantive variable, the values on a time-invariant predictor (e.g., personality trait), or the values on another time-varying variable (e.g., individual slopes of the linear trajectory of a second substantive variable in the study). The substantive variables are measures used to represent the study constructs. As measures of constructs, they have specific substantive content. We can assess the construct validity of the measure by obtaining relevant validity evidence. The evidence could be the extent to which the measure’s content represents the conceptual content of the construct (i.e., content validity) or the extent to which the measure is correlated with another established criterion measure representing a criterion construct that, theoretically, is expected to be associated with the measure (i.e., criterion-related validity).

“Time,” on the other hand, has a different ontological status from the substantive variables in the longitudinal study. There are at least three ways to describe how time is not a substantive variable similar to other focal variables in the longitudinal study. First, when a substantive construct is tracked in a longitudinal study for changes over time, time is not a substantive measure of a study construct. In the above example of newcomer adaptation study by Chan and Schmitt, it is not meaningful to speak of assessing the construct validity of time, at least not in the same way we can speak of assessing the construct validity of job performance or social integration measures. Second, in a longitudinal study, a time point in the observation period represents one temporal instance of measurement. The time point per se, therefore, is simply the temporal marker of the state of the substantive variable at the point of measurement. The time point is not the state or value of the substantive variable that we are interested in for tracking changes over time. Changes over time occur when the state of a substantive variable changes over different points of measurement. Finally, in a longitudinal study of changes over time, “time” is distinct from the substantive process that underlies the change over time. Consider a hypothetical study that repeatedly measured the levels of job performance and social integration of a group of newcomers for six time points, at 1-month intervals between adjacent time points over a 6-month period. Let us assume that the study found that the observed change over time in their job performance levels was best described by a monotonically increasing trajectory at a decreasing rate of change. The observed functional form of the performance trajectory could serve as empirical evidence for the theory that a learning process underlies the performance level changes over time. Let us further assume that, for the same group of newcomers, the observed change over time in their social integration levels was best described by a positive linear trajectory. This observed functional form of the social integration trajectory could serve as empirical evidence for a theory of social adjustment process that underlies the integration level changes over time. In this example, there are two distinct substantive processes of change (learning and social adjustment) that may underlie the changes in levels on the two respective study constructs (performance and social integration). There are six time points at which each substantive variable was measured over the same time period. Time, in this longitudinal study, was simply the medium through which the two substantive processes occur. Time was not an explanation. Time did not cause the occurrence of the different substantive processes and there was nothing in the conceptual content of the time construct that could, nor was expected to, explain the functional form or nature of the two different substantive processes. The substantive processes occur or unfold through time but they did not cause time to exist.

The way that growth modeling techniques analyze longitudinal data is consistent with the above conceptualization of time. For example, in latent growth modeling, time per se is not represented as a substantive variable in the analysis. Instead, a specific time point is coded as a temporal marker of the substantive variable (e.g., as basis coefficients in a latent growth model to indicate the time points in the sequence of repeated measurement at which the substantive variable was measured). The time-varying nature of the substantive variable is represented either at the individual level as the individual slopes or at the group level as the variance of the slope factor. It is the slopes and variance of slopes of the substantive variable that are being analyzed, and not time per se. The nature of the trajectory of change in the substantive variable is descriptively represented by the specific functional form of the trajectory that is observed within the time period of study. We may also include in the latent growth model other substantive variables, such as time-invariant predictors or time-varying correlates, to assess the strength of their associations with variance of the individual slopes of trajectory. These associations serve as validation and explanation of the substantive process of change in the focal variable that is occurring over time.
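Latent growth models are typically estimated in SEM software, but the closely related random-coefficient (multilevel) growth model can be sketched in a few lines; the data, variable names, and parameter values below are simulated placeholders, not results from Chan and Schmitt (2000).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_people, n_waves = 200, 6

# Simulated repeated measures: social integration for 200 newcomers across six
# monthly waves, generated with person-specific intercepts and linear slopes.
person = np.repeat(np.arange(n_people), n_waves)
time = np.tile(np.arange(n_waves), n_people)        # 0..5: the temporal marker only
intercepts = rng.normal(3.0, 0.5, n_people)[person]
slopes = rng.normal(0.20, 0.10, n_people)[person]
integration = intercepts + slopes * time + rng.normal(0, 0.3, n_people * n_waves)
df = pd.DataFrame({"person": person, "time": time, "integration": integration})

# Random-coefficient growth model: the fixed effect of time describes the average
# trajectory; random intercepts and slopes capture individual differences in change,
# which can then be related to substantive predictors.
model = smf.mixedlm("integration ~ time", df, groups=df["person"],
                    re_formula="~time").fit()
print(model.summary())
```

Here the variance of the random slopes plays the role of the slope-factor variance described above.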

Many theories of change require the articulation of a change construct (e.g., learning, social adjustment—inferred from a slope parameter in a growth model). When specifying a change construct, the “time” variable is only used as a marker to track a substantive growth or change process. For example, when we say, “Extraversion × time interaction effect” on newcomer social integration, we really mean that Extraversion relates to the change construct of social adjustment (i.e., where social adjustment is operationalized as the slope parameter from a growth model of individuals’ social integration over time). Likewise, when we say, “Conscientiousness × time² quadratic interaction effect” on newcomer task performance, we really mean that Conscientiousness relates to the change construct of learning (where learning is operationalized as the nonlinear slope of task performance over time).

This view of time brings up a host of issues with scaling and calibration of the time variable to adequately assess the underlying substantive change construct. For example, should work experience be measured in number of years in the job versus number of assignments completed ( Tesluk & Jacobs, 1998 )? Should the change construct be thought of as a developmental age effect, historical period effect, or birth cohort effect ( Schaie, 1965 )? Should the study of time in teams reflect developmental time rather than clock time, and thus be calibrated to each team’s lifespan ( Gersick, 1988 )? As such, although time is not a substantive variable itself in longitudinal research, it is important to make sure that the measurement of time matches the theory that specifies the change construct that is under study (e.g., aging, learning, adaptation, social adjustment).

I agree that time is typically not a substantive variable, but that it can serve as a proxy for substantive variables if the process is well-known. The example about learning by Chan is a case in point. Of course, well-known temporal processes are rare and I have often seen substantive power mistakenly given to time: For example, it is the process of oxidation, not the passage of time that is responsible for rust. However, there are instances where time plays a substantive role. For example, temporal discounting ( Ainslie & Haslam, 1992 ) is a theory of behavior that is dependent on time. Likewise, Vancouver, Weinhardt, and Schmidt’s (2010) theory of multiple goal pursuit involves time as a key substantive variable. To be sure, in that latter case the perception of time is a key mediator between time and its hypothetical effects on behavior, but time has an explicit role in the theory and thus should be considered a substantive variable in tests of the theory.

I was referring to objective time when explaining that time is not a substantive variable in longitudinal research and that it is instead the temporal medium through which a substantive process unfolds or a substantive variable changes its state. When we discuss theories of substantive phenomena or processes involving temporal constructs, such as temporal discounting, time urgency, or polychronicity related to multitasking or multiple goal pursuits, we are in fact referring to subjective time, which is the individual’s psychological experience of time. Subjective time constructs are clearly substantive variables. The distinction between objective time and subjective time is important because it provides conceptual clarity to the nature of the temporal phenomena and guides methodological choices in the study of time (for details, see Chan, 2014 ).

Conceptual Issue Question 3: What are the procedures, if any, for developing a theory of changes over time in longitudinal research? Given that longitudinal research purportedly addresses the limitations of cross-sectional research, can findings from cross-sectional studies be useful for the development of a theory of change?

To address this question, what follows is largely an application of some of the ideas presented by Mitchell and James (2001) and by Ployhart and Vandenberg (2010) in their respective publications. Thus, credit for the following should be given to those authors, and consultation of their articles as to specifics is highly encouraged.

Before we specifically address this question, it is important to understand our motive for asking it. Namely, as most succinctly stated by Mitchell and James (2001) , and repeated by, among others, Bentein and colleagues (2005) , Chan (2002 , 2010 ), and Ployhart and Vandenberg (2010) , there is an abundance of published research in the major applied psychology and organizational science journals in which the authors are not operationalizing through their research designs the causal relations among their focal independent, dependent, moderator, and mediator variables even though the introduction and discussion sections imply such causality. Mitchell and James (2001) used the published pieces in the most recent issues (at that time) of the Academy of Management Journal and Administrative Science Quarterly to anchor this point. At the crux of the problem is using designs in which time is not a consideration. As they stated so succinctly:

“At the simplest level, in examining whether an X causes a Y, we need to know when X occurs and when Y occurs. Without theoretical or empirical guides about when to measure X and Y, we run the risk of inappropriate measurement, analysis, and, ultimately, inferences about the strength, order, and direction of causal relationships (italics added, Mitchell & James, 2001 , p. 530).”

When is key because it is at the heart of causality in its simplest form, as in the “cause must precede the effect” ( James, Mulaik, & Brett, 1982 ; Condition 3 of 10 for inferring causality, p. 36). Our casual glance at the published literature over the decade since Mitchell and James (2001) indicates that not much has changed in this respect. Thus, our motive for asking the current question is quite simple—“perhaps it’s ‘time’ to put these issues in front of us once more (pun intended), particularly given the increasing criticisms as to the meaningfulness of published findings from studies with weak methods and statistics” (e.g., statistical myths and urban legends, Lance & Vandenberg, 2009 ).

The first part of the question asks, “what are the procedures, if any, for developing a theory of change over time in longitudinal research?” Before addressing procedures per se, it is necessary first to understand some of the issues when incorporating change into research. Doing so provides a context for the procedures. Ployhart and Vandenberg (2010) noted four theoretical issues that should be addressed when incorporating change in the variables of interest across time. These were:

“To the extent possible, specify a theory of change by noting the specific form and duration of change and predictors of change.

Clearly articulate or graph the hypothesized form of change relative to the observed form of change.

Clarify the level of change of interest: group average change, intraunit change, or interunit differences in intraunit change.

Realize that cross-sectional theory and research may be insufficient for developing theory about change. You need to focus on explaining why the change occurs” (p. 103).

The interested reader is encouraged to consult Ployhart and Vandenberg (2010) as to the specifics underlying the four issues, but they were heavily informed by Mitchell and James (2001) . Please note that, as one means of operationalizing time, Mitchell and James (2001) focused on time very broadly in the context of strengthening causal inferences about change across time in the focal variables. Thus, Ployhart and Vandenberg’s (2010) argument, with its sole emphasis on change, is nested within the Mitchell and James (2001) perspective. I raise this point because it is in this vein that the four theoretical issues presented above have as their foundation the five theoretical issues addressed by Mitchell and James (2001) . Specifically, first, we need to know the time lag between X and Y . How long after X occurs does Y occur? Second, X and Y have durations. Not all variables occur instantaneously. Third, X and Y may change over time. We need to know the rate of change. Fourth, in some cases we have dynamic relationships in which X and Y both change. The rate of change for both variables should be known, as well as how the X – Y relationship changes. Fifth, in some cases we have reciprocal causation: X causes Y and Y causes X . This situation requires an understanding of two sets of lags, durations, and possibly rates. The major point of both sets of authors is that these theoretical issues need to be addressed first in that they should be the key determinants in designing the overall study; that is, deciding upon the procedures to use.

Although Mitchell and James (2001 , see p. 543) focused on informing procedures through theory in the broader context of time (e.g., draw upon studies and research that may not be in our specific area of interest; going to the workplace and actually observing the causal sequence, etc.), our specific question focuses on change across time. In this respect, Ployhart and Vandenberg (2010 , Table 1 in p. 103) identified five methodological and five analytical procedural issues that should be informed by the nature of the change. These are:

“Methodological issues

1. Determine the optimal number of measurement occasions and their intervals to appropriately model the hypothesized form of change.

2. Whenever possible, choose samples most likely to exhibit the hypothesized form of change, and try to avoid convenience samples.

3. Determine the optimal number of observations, which in turn means addressing the attrition issue before conducting the study. Prepare for the worst (e.g., up to a 50% drop from the first to the last measurement occasion). In addition, whenever possible, try to model the hypothesized “cause” of missing data (ideally theorized and measured a priori) and consider planned missingness approaches to data collection.

4. Introduce time lags between intervals to address issues of causality, but ensure the lags are neither too long nor too short.

5. Evaluate the measurement properties of the variable for invariance (e.g., configural, metric) before testing whether change has occurred.

Analytical issues

1. Be aware of potential violations in statistical assumptions inherent in longitudinal designs (e.g., correlated residuals, nonindependence).

2. Describe how time is coded (e.g., polynomials, orthogonal polynomials) and why.

3. Report why you use a particular analytical method and its strengths and weaknesses for the particular study.

4. Report all relevant effect sizes and fit indices to sufficiently evaluate the form of change.

5. It is easy to ‘overfit’ the data; strive to develop a parsimonious representation of change.”
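As one small illustration of analytical issue 2 in the quoted list above (how time is coded), the sketch below contrasts raw and orthogonal polynomial codes for six equally spaced measurement occasions; it is a generic demonstration, not a procedure taken from Ployhart and Vandenberg (2010).

```python
import numpy as np

# Six equally spaced measurement occasions coded 0..5.
time = np.arange(6, dtype=float)

# Raw polynomial codes: time and time squared are highly correlated, which can
# make estimates of linear and quadratic growth terms unstable.
raw = np.column_stack([time, time ** 2])
print("correlation of raw linear and quadratic codes:",
      round(np.corrcoef(raw.T)[0, 1], 2))

# Orthogonal polynomial codes: linear and quadratic trend contrasts that are
# uncorrelated. A QR decomposition of the design matrix is one way to build them.
design = np.column_stack([np.ones_like(time), time, time ** 2])
q, _ = np.linalg.qr(design)
linear, quadratic = q[:, 1], q[:, 2]
print("dot product of orthogonal codes:", round(float(linear @ quadratic), 10))
```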

In summary, the major point from the above is to encourage researchers to develop a thorough conceptual understanding of time as it relates to defining the causal relationships between the focal variables of interest. We acknowledge that researchers are generally good at conceptualizing why their x -variables cause some impact on their y -variables. What is called for here goes beyond just understanding why, but forcing ourselves to be very specific about the timing between the variables. Doing so will result in stronger studies and ones in which our inferences from the findings can confidently include statements about causality—a level of confidence that is sorely lacking in most published studies today. As succinctly stated by Mitchell and James (2001) , “With impoverished theory about issues such as when events occur, when they change, or how quickly they change, the empirical researcher is in a quandary. Decisions about when to measure and how frequently to measure critical variables are left to intuition, chance, convenience, or tradition. None of these are particularly reliable guides (p. 533).”

The latter quote serves as a segue to address the second part of our question, “Given that longitudinal research purportedly addresses the limitations of cross-sectional research, can findings from cross-sectional studies be useful for the development of a theory of change?” Obviously, the answer here is “it depends.” In particular, it depends on the design contexts around which the cross-sectional study was developed. For example, if the study was developed strictly following many of the principles for designing quasi-experiments in field settings spelled out by Shadish, Cook, and Campbell (2002) , then it would be very useful for developing a theory of change on the phenomenon of interest. Findings from such studies could inform decisions as to how much change needs to occur across time in the independent variable to see measurable change in the dependent variable. Similarly, it would help inform decisions as to what the baseline on the independent variable needs to be, and what amount of change from this baseline is required to impact the dependent variable. Another useful set of cross-sectional studies would be those developed for the purpose of verifying within field settings the findings from a series of well-designed laboratory experiments. Again, knowing issues such as thresholds, minimal/maximal values, and intervals or timing of the x -variable onset would be very useful for informing a theory of change. A design context that would be of little use for developing a theory of change is the case where a single cross-sectional study was completed to evaluate the conceptual premises of interest. The theory underlying the study may be useful, but the findings themselves would be of little use.

Few theories are not theories of change. Most, however, are not sufficiently specified. That is, they leave much to the imagination. Moreover, they often leave to the imagination the implications of the theory on behavior. My personal bias is that theories of change should generally be computationally rendered to reduce vagueness, provide a test of internal coherence, and support the development of predictions. One immediately obvious conclusion one will draw when attempting to create a formal computational theoretical model is that we have little empirical data on rates of change.

The procedures for developing a computational model are the following ( Vancouver & Weinhardt, 2012 ; also see Wang et al. , 2016 ). First, take variables from (a) existing theory (verbal or static mathematical theory), (b) qualitative studies, (c) deductive reasoning, or (d) some combination of these. Second, determine which variables are dynamic. Dynamic variables have “memory” in that they retain their value over time, changing only as a function of processes that move the value in one direction or another at some rate or some changing rate. Third, describe processes that would affect these dynamic variables (if using existing theory, this likely involves other variables in the theory) or the rates and direction of change to the dynamic variables if the processes that affect the rates are beyond the theory. Fourth, represent formally (e.g., mathematically) the effect of the variables on each other. Fifth, simulate the model to see if it (a) works (e.g., no out-of-bounds values generated), (b) produces phenomena the theory is presumed to explain, (c) produces patterns of data over time (trajectories; relationships) that match (or could be matched to) data, and (d) determine if variance in exogenous variables (i.e., ones not presumably affected by other variables in the model) affect trajectories/relationships (called sensitivity analysis). For example, if we build a computational model to understand retirement timing, it will be critical to simulate the model to make sure that it generates predictions in a realistic way (e.g., the simulation should not generate too many cases where retirement happens after the person is a 90-year old). It will also be important to see whether the predictions generated from the model match the actual empirical data (e.g., the average retirement age based on simulation should match the average retirement age in the target population) and whether the predictions are robust when the model’s input factors take on a wide range of values.
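A minimal sketch of the final step, using the retirement-timing example: simulate the formal model many times and check that the generated predictions behave sensibly before comparing them with empirical data. The decision rule and parameter values below are hypothetical placeholders rather than a model proposed by the authors.

```python
import numpy as np

rng = np.random.default_rng(42)
n_agents, max_age = 10_000, 90

def simulate_retirement_age(rng):
    """Hypothetical dynamic model: each year the probability of retiring rises
    with age and with accumulated wealth, a dynamic variable that retains and
    updates its value over time."""
    wealth = 0.0
    for age in range(55, max_age + 1):
        wealth += rng.normal(1.0, 0.3)                  # yearly accumulation
        p_retire = 1.0 / (1.0 + np.exp(-(0.5 * (age - 65) + 0.02 * wealth)))
        if rng.random() < p_retire:
            return age
    return max_age                                      # censored at max_age

ages = np.array([simulate_retirement_age(rng) for _ in range(n_agents)])

# Checks on the simulated predictions: values stay within bounds, implausibly
# late retirements are rare, and the mean can be compared against observed data.
print("mean simulated retirement age:", round(float(ages.mean()), 1))
print("share retiring after age 80:  ", round(float((ages > 80).mean()), 3))
```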

As mentioned above, many theories of change require the articulation of a change construct (e.g., learning, aging, social adjustment—inferred from a slope parameter in a growth model). A change construct must be specified in terms of its: (a) theoretical content (e.g., what is changing, when we say “learning” or “aging”?), (b) form of change (linear vs. quadratic vs. cyclical), and (c) rate of change (does the change process meaningfully occur over minutes vs. weeks?). One salient problem is how to develop theory about the form of change (linear vs. nonlinear/quadratic) and the rate of change (how fast?) For instance, a quadratic/nonlinear time effect can be due to a substantive process of diminishing returns to time (e.g., a learning curve), or to ceiling (or floor) effects (i.e., hitting the high end of a measurement instrument, past which it becomes impossible to see continued growth in the latent construct). Indeed, only a small fraction of the processes we study would turn out to be linear if we used more extended time frames in the longitudinal design. That is, most apparently linear processes result from the researcher zooming in on a nonlinear process in a way that truncates the time frame. This issue is directly linked to the presumed rate of change of a phenomenon (e.g., a process that looks nonlinear in a 3-month study might look linear in a 3-week study). So when we are called upon to theoretically justify why we hypothesize a linear effect instead of a nonlinear effect, we must derive a theory of what the passage of time means. This would involve three steps: (a) naming the substantive process for which time is a marker (e.g., see answers to Question #2 above), (b) theorizing the rate of this process (e.g., over weeks vs. months), which will be more fruitful if it hinges on related past empirical longitudinal research, than if it hinges on armchair speculation about time (i.e., the appropriate theory development sequence here is: “past data → theory → new data,” and not simply, “theory → new data”; the empirical origins of theory are an essential step), and (c) disavowing nonlinear forces (e.g., diminishing returns to time, periodicity), within the chosen time frame of the study.
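The "zooming in" point can be shown with a toy example; the saturating learning-curve form below is an assumption for illustration, not a result from any cited study. Over a short early window, a straight line fits the process almost perfectly, while the diminishing returns to time only become visible over a longer window.

```python
import numpy as np

# Assumed learning curve: performance rises toward an asymptote over 26 weeks.
weeks = np.arange(0, 26, dtype=float)
performance = 1.0 - np.exp(-weeks / 8.0)

def linear_r2(t, y):
    """R-squared of a straight-line fit, used here as a rough index of how
    linear the observed trajectory appears."""
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (slope * t + intercept)
    return 1.0 - resid.var() / y.var()

# A 4-week window looks essentially linear; the full 26-week window reveals
# the decelerating (nonlinear) form of change.
print("linear fit R^2, weeks 0-3: ", round(linear_r2(weeks[:4], performance[:4]), 3))
print("linear fit R^2, weeks 0-25:", round(linear_r2(weeks, performance), 3))
```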

Research Design Question 1: What are some of the major considerations that one should take into account before deciding to employ a longitudinal study design?

As with all research, the design needs to allow the researcher to address the research question. For example, if one is seeking to assess a change rate, one needs to ask if it is safe to assume that the form of change is linear. If not, one will need more than two waves or will need to use continuous sampling. One might also use a computational model to assess whether violations of the linearity assumption are important. The researcher needs to also have an understanding of the likely time frame across which the processes being examined occur. Alternatively, if the time frame is unclear, the researcher should sample continuously or use short intervals. If knowing the form of the change is desired, then one will need enough waves of data collection in which to comprehensively capture the changes.

If one is interested in assessing causal processes, more issues need to be considered. For example, what are the processes of interest? What are the factors affecting the processes or the rates of the processes? What is the form of the effect of these factors? And perhaps most important, what alternative process could be responsible for effects observed?

For example, consider proactive socialization (Morrison, 2002). The processes of interest are those involved in determining proactive information seeking. One observation is that the rate of proactive information seeking drops with the tenure of an employee (Chan & Schmitt, 2000). Moreover, the form of the drop is asymptotic to a floor (Vancouver, Tamanini et al., 2010). The uncertainty reduction model predicts that proactive information seeking will drop over time because knowledge increases (i.e., uncertainty decreases). An alternative explanation is that ego costs grow over time: one feels one will look more foolish asking for information the longer one's tenure (Ashford, 1986). To distinguish these explanations for a drop in information seeking over time, one might examine whether the transparency of the reason for seeking information moderates the negative change trend in information seeking. For the uncertainty reduction model, transparency should not matter, but for the ego-based model, transparency and legitimacy of the reason should matter. Of course, it might be that both processes are at work. As such, the researcher may need a computational model or two to help think through the effects of the various processes and whether the forms of the relationships depend on the processes hypothesized (e.g., Vancouver, Tamanini et al., 2010).

Research Design Question 2: Are there any design advantages of cross-sectional research that might make it preferable to longitudinal research? That is, what would be lost and what might be gained if a moratorium were placed on cross-sectional research?

Cross-sectional research is easier to conduct than longitudinal research, but it often estimates the wrong parameters. Interestingly, researchers typically overemphasize the first fact (the ease of cross-sectional research) and underemphasize the second (that cross-sectional studies estimate the wrong thing). Cross-sectional research has the advantages of allowing broader sampling of participants, because the studies are faster, cheaper, and involve less participant burden; and broader sampling of constructs, because participant anonymity in cross-sectional designs permits more honest and complete measurement of sensitive concepts, like counterproductive work behavior.

Also, when the theoretical process at hand has a very short time frame (e.g., minutes or seconds), then cross-sectional designs can be entirely appropriate (e.g., for factor analysis/measurement modeling, because it might only take a moment for a latent construct to be reflected in a survey response). Also, first-stage descriptive models of group differences (e.g., sex differences in pay; cross-cultural differences in attitudes; and other “black box” models that do not specify a psychological process) can be suggestive even with cross-sectional designs. Cross-sectional research can also be condoned in the case of a 2-study design wherein cross-sectional data are supplemented with lagged/longitudinal data.

But in the end, almost all psychological theories are theories of change (at least implicitly) [contrary to Ployhart and Vandenberg (2010), I tend to believe that "cross-sectional theory" does not actually exist—theories are inherently longitudinal, whereas models and evidence can be cross-sectional]. Thus, longitudinal and time-lagged designs are indispensable, because they allow researchers to begin answering four types of questions: (a) causal priority, (b) future prediction, (c) change, and (d) temporal external validity. To define and compare cross-sectional against longitudinal and time-lagged designs, I refer to Figure 2. Figure 2 displays three categories of discrete-time designs: cross-sectional (X and Y measured at the same time; Figure 2a), lagged (Y measured after X by a delay of duration t; Figure 2b), and longitudinal (Y measured at three or more points in time; Figure 2c) designs. First note that, across all time designs, a1 denotes the cross-sectional parameter (i.e., the correlation between X1 and Y1). In other words, if X is job satisfaction and Y is retirement intentions, a1 denotes the cross-sectional correlation between these two variables at t1. To understand the value (and limitations) of cross-sectional research, we will look at the role of the cross-sectional parameter (a1) in each of the Figure 2 models.

Figure 2. Time-based designs for two constructs, X and Y: (a) cross-sectional design; (b) lagged designs; (c) longitudinal designs.

For assessing causal priority, the lagged models and panel model are most relevant. The time-lagged b1 parameter (i.e., the correlation between X1 and Y2; e.g., predictive validity) aids in future prediction, but tells us little about causal priority. In contrast, the panel regression b1′ parameter from the cross-lagged panel regression (in Figure 2b) and the cross-lagged panel model (in Figure 2c) tells us more about causal priority from X to Y (Kessler & Greenberg, 1981; Shingles, 1985), and is a function of the b1 parameter and the cross-sectional a1 parameter [b1′ = (b1 − a1 × r(Y1, Y2)) / (1 − a1²)]. For testing theories that X begets Y (i.e., X → Y), the lagged parameter b1′ can be extremely useful, whereas the cross-sectional parameter a1 is the wrong parameter (indeed, a1 is often negatively related to b1′). That is, a1 does not estimate X → Y, but it is usually negatively related to that estimate (via the above formula for b1′). Using the example of job satisfaction and retirement intentions, if we would like to know about the causal priority from job satisfaction to retirement intentions, we should at least measure both job satisfaction and retirement intentions at t1 and then measure retirement intentions at t2. Deriving the estimate for b1′ involves regressing retirement intentions at t2 on job satisfaction at t1, while controlling for the effect of retirement intentions at t1.
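A brief simulated illustration of this distinction is sketched below. It assumes hypothetical standardized measures of job satisfaction (X1) and retirement intentions (Y1, Y2), and shows that the panel regression parameter b1′ obtained by regressing Y2 on X1 while controlling for Y1 matches the formula given above; the data-generating values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Simulated example: X1 = job satisfaction at t1, Y1/Y2 = retirement intentions at t1/t2.
x1 = rng.normal(size=n)
y1 = -0.4 * x1 + rng.normal(size=n)                  # cross-sectional association at t1
y2 = 0.6 * y1 - 0.2 * x1 + rng.normal(size=n)        # Y2 depends on prior Y and prior X

z = lambda v: (v - v.mean()) / v.std()                # standardize so coefficients are correlations
x1, y1, y2 = z(x1), z(y1), z(y2)

a1 = np.corrcoef(x1, y1)[0, 1]                        # cross-sectional parameter
b1 = np.corrcoef(x1, y2)[0, 1]                        # lagged parameter (predictive validity)
r_y1y2 = np.corrcoef(y1, y2)[0, 1]

# Panel regression parameter b1': regress Y2 on X1 while controlling for Y1.
X = np.column_stack([np.ones(n), x1, y1])
b1_prime_reg = np.linalg.lstsq(X, y2, rcond=None)[0][1]

# The same quantity from the formula b1' = (b1 - a1 * r(Y1, Y2)) / (1 - a1**2).
b1_prime_formula = (b1 - a1 * r_y1y2) / (1 - a1**2)

print(round(b1_prime_reg, 3), round(b1_prime_formula, 3))   # the two should agree closely
```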

For future prediction, the autoregressive model and growth model in Figure 2c are most relevant. One illustrative empirical phenomenon is validity degradation, which means the X–Y correlation tends to shrink as the time interval between X and Y increases (Keil & Cortina, 2001). Validity degradation and patterns of stability have been explained via simplex autoregressive models (Hulin, Henry, & Noon, 1990; Humphreys, 1968; Fraley, 2002), which express the X–Y correlation as r(X1, Y1+k) = a1 × g^k, where k is the number of time intervals separating X and Y. Notice the cross-sectional parameter a1 in this formula serves as a multiplicative constant in the time-lagged X–Y correlation, but is typically quite different from the time-lagged X–Y correlation itself. Using the example of extraversion and retirement intentions, validity degradation means that the effect of extraversion at t1 on the measure of retirement intentions is likely to decrease over time, depending on how stable retirement intentions are. Therefore, relying on a1 to gauge how well extraversion can predict future retirement intentions is likely to overestimate the predictive effect of extraversion.
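The simplex formula can be illustrated with a few lines of arithmetic. The values of a1 and g below are purely illustrative, not estimates from any study.

```python
# Validity degradation under a simplex autoregressive model:
# r(X1, Y1+k) = a1 * g**k, where g is the stability of Y between adjacent waves.
a1 = 0.50          # cross-sectional correlation at t1 (illustrative value)
g = 0.80           # wave-to-wave stability of Y (illustrative value)

for k in range(0, 5):
    print(f"lag {k}: predicted X-Y correlation = {a1 * g**k:.3f}")
# The cross-sectional a1 is only a multiplicative constant here; using it as an
# estimate of the lagged correlation overstates prediction at every k > 0.
```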

Another pertinent model is the latent growth model (Chan, 1998; Ployhart & Hakel, 1998), which explains longitudinal data using a time intercept and slope. In the linear growth model in Figure 2, the cross-sectional a1 parameter is equal to the relationship between X1 and the Y intercept, when t1 = 0. I also note that from the perspective of the growth model, the validity degradation phenomenon (e.g., Hulin et al., 1990) simply means that X1 has a negative relationship with the Y slope. Thus, again, the cross-sectional a1 parameter merely indicates the initial state of the X and Y relationship in a longitudinal system, and will only offer a reasonable estimate of future prediction of Y under the rare conditions when g ≈ 1.0 in the autoregressive model (i.e., Y is extremely stable), or when i ≈ 0 in the growth model (i.e., X does not predict the Y-slope; Figure 2c).

For studying change, I refer to the growth model (where both X and the Y-intercept explain change in Y [or the Y-slope]) and the coupled growth model (where the X-intercept, Y-intercept, change in X, and change in Y all interrelate) in Figure 2c. Again, in these models the cross-sectional a1 parameter is the relationship between the X and Y intercepts, when the slopes are specified with time centered at t1 = 0 (where t1 refers arbitrarily to any time point when the cross-sectional data were collected). In the same way that intercepts tell us very little about slopes (ceiling and floor effects notwithstanding), the cross-sectional a1 parameter tells us almost nothing about change parameters. Again, using the example of the job satisfaction and retirement intentions relationship, to understand change in retirement intentions over time, it is important to gauge the effects of the initial status of job satisfaction (i.e., the job satisfaction intercept) and change in job satisfaction (i.e., the job satisfaction slope) on change in retirement intentions (i.e., the slope of retirement intentions).

Finally, temporal external validity refers to the extent to which an effect observed at one point in time generalizes across other occasions. This includes longitudinal measurement equivalence (e.g., whether the measurement metric of the concept or the meaning of the concept may change over time; Schmitt, 1982), stability of bivariate relationships over time (e.g., job satisfaction relates more weakly to turnover when the economy is bad; Carsten & Spector, 1987), the stationarity of cross-lagged parameters across measurement occasions (b1′ = b2′; see the cross-lagged panel model in Figure 2c; e.g., Cole & Maxwell, 2003), and the ability to identify change as an effect of participant age/tenure/development—not an effect of birth/hire cohort or historical period (Schaie, 1965). Obviously, cross-sectional data have nothing to say about temporal external validity.

Should there be a moratorium on cross-sectional research? Because any single wave of a longitudinal design is itself cross-sectional data, a moratorium is not technically possible. However, there should be (a) an explicit acknowledgement of the different theoretical parameters in Figure 2, and (b) a general moratorium on treating the cross-sectional a1 parameter as though it implies causal priority (cf. the panel regression parameter b1′), future prediction (cf. the panel regression, autoregressive, and growth models), change (cf. the growth models), or temporal external validity. This recommendation is tantamount to a moratorium on cross-sectional research papers, because almost all theories imply the lagged and/or longitudinal parameters in Figure 2. As noted earlier, cross-sectional data are easier to get, but they estimate the wrong parameter.

I agree with Newman that most theories are about change or should be (i.e., we are interested in understanding processes and, of course, processes occur over time). I am also in agreement that cross-sectional designs are of almost no value for assessing theories of change. Therefore, I am interested in getting to a place where most research is longitudinal, and where top journals rarely publish papers with only a cross-sectional design. However, as Newman points out, some research questions can still be addressed using cross-sectional designs. Therefore, I would not support a moratorium on cross-sectional research papers.

Research Design Question 3: In a longitudinal study, how do we decide on the length of the interval between two adjacent time points?

This question needs to be addressed together with the question of how many time points of measurement to administer in a longitudinal study. It is well established that intra-individual changes cannot be adequately assessed with only two time points because (a) a two-point measurement by necessity produces a linear trajectory and therefore is unable to empirically detect the functional form of the true change trajectory and (b) time-related (random or correlated) measurement error and true change over time are confounded in the observed change in a two-point measurement situation (for details, see Chan, 1998; Rogosa, 1995; Singer & Willett, 2003). Hence, the minimum number of time points for assessing intra-individual change is three, but more than three is better to obtain a more reliable and valid assessment of the change trajectory (Chan, 1998). However, this does not mean that a larger number of time points is always better or more accurate than a smaller number of time points. Provided that the total time period of the study captures the change process of interest, the number of time points should be determined by the appropriate location of those time points within the period. This then brings us to the current practical question on the choice regarding the appropriate length of the interval between adjacent time points.
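The point that two waves cannot reveal the functional form of change can be shown with a toy example. The quadratic trajectory below is hypothetical; with only two waves the straight-line fit is always exact, whereas a third wave exposes the curvature.

```python
import numpy as np

# A quadratic true trajectory sampled at two versus three time points (illustrative values).
true = lambda t: 1 + 2.0 * t - 0.4 * t**2

for waves in ([0, 4], [0, 2, 4]):
    t = np.array(waves, dtype=float)
    y = true(t)
    slope, intercept = np.polyfit(t, y, 1)
    max_resid = np.abs(y - (slope * t + intercept)).max()
    print(len(waves), "waves: max residual from a straight line =", round(max_resid, 2))
# Two waves always lie exactly on a line (residual 0), so the form of change cannot
# be detected; with three or more waves, departures from linearity become visible.
```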

The correct length of the time interval between adjacent time points in a longitudinal study is critical because it directly affects the observed functional form of the change trajectory and, in turn, the inference we make about the true pattern of change over time (Chan, 1998). What, then, should be the correct length of the time interval between adjacent time points in a longitudinal study? Put simply, the correct or optimal length of the time interval will depend on the specific substantive change phenomenon of interest. This means it is dependent on the nature of the substantive construct, its underlying process of change over time, and the context in which the change process is occurring, including the presence of variables that influence the nature and rate of the change. In theory, the time interval for data collection is optimal when the time points are spaced in such a way that the true pattern of change over time can be observed during the period of study. When the observed time interval is too short or too long compared to the optimal time interval, true patterns of change may be masked or spurious patterns of change may be observed.

The problem is we almost never know what this optimal time interval is, even if we have a relatively sound theory of the change phenomenon. This is because our theories of research phenomena are often static in nature. Even when our theories are dynamic and focus on change processes, they are almost always silent on the specific length of the temporal dimension through which the substantive processes occur over time ( Chan, 2014 ).

In practice, researchers determine their choice of the length of the time interval in conjunction with the choice of number of time points and the choice of the length of the total time period of study. Based on my experiences as an author, reviewer, and editor, I suspect that these three choices are influenced by the specific resource constraints and opportunities faced by the researchers when designing and conducting the longitudinal study. Deviation from optimal time intervals probably occurs more frequently than we would like, since decisions on time intervals between measures in a study are often pragmatic and atheoretical. When we interpret findings from longitudinal studies, we should consider the possibility that the study may have produced patterns of results that led to wrong inferences because the study did not reflect the true changes over time.

Given that our theories of phenomena are not at the stage where we could specify the optimal time intervals, the best we could do now is to explicate the nature of the change processes and the effects of the influencing factors to serve as guides for decisions on time intervals, number of time points, and total time period of study. For example, in research on sense-making processes in newcomer adaptation, the total period of study often ranged from 6 months to 1 year, with 6 to 12 time points, equally spaced at time intervals of 1 or 2 months between adjacent time points. A much longer time interval and total time period, ranging from several months to several years, would be more appropriate for a change process that should take a longer time to manifest itself, such as development of cognitive processes or skill acquisition requiring extensive practice or accumulation of experiences over time. On the other extreme, a much shorter time interval and total time period, ranging from several hours to several days, will be appropriate for a change process that should take a short time to manifest itself such as activation or inhibition of mood states primed by experimentally manipulated events.

Research Design Question 4: As events occur in our daily life, our mental representations of these events may change as time passes. How can we determine the point(s) in time at which the representation of an event is appropriate? How can these issues be addressed through design and measurement in a study?

In some cases, longitudinal researchers will wish to know the nature and dynamics of one’s immediate experiences. In these cases, the items included at each point in time will simply ask participants to report on states, events, or behaviors that are relatively immediate in nature. For example, one might be interested in an employee’s immediate affective experiences, task performance, or helping behavior. This approach is particularly common for intensive, short-term longitudinal designs such as experience sampling methods (ESM; Beal & Weiss, 2003 ). Indeed, the primary objective of ESM is to capture a representative sample of points within one’s day to help understand the dynamic nature of immediate experience ( Beal, 2015 ; Csikszentmihalyi & Larson, 1987 ). Longitudinal designs that have longer measurement intervals may also capture immediate experiences, but more often will ask participants to provide some form of summary of these experiences, typically across the entire interval between each measurement occasion. For example, a panel design with a 6-month interval may ask participants to report on affective states, but include a time frame such as “since the last survey” or “over the past 6 months”, requiring participants to mentally aggregate their own experiences.

As one might imagine, there also are various designs and approaches that range between the end points of immediate experience and experiences aggregated over the entire interval. For example, an ESM study might examine one’s experiences since the last survey. These intervals obviously are close together in time, and therefore are conceptually similar to one’s immediate state; nevertheless, they do require both increased levels of recall and some degree of mental aggregation. Similarly, studies with a longer time interval (e.g., 6-months) might nevertheless ask about one’s relatively recent experiences (e.g., affect over the past week), requiring less in terms of recall and mental aggregation, but only partially covering the events of the entire intervening interval. As a consequence, these two approaches and the many variations in between form a continuum of abstraction containing a number of differences that are worth considering.

Differences in Stability

Perhaps the most obvious difference across this continuum of abstraction is that different degrees of aggregation are captured. As a result, items will reflect more or less stable estimates of the phenomenon of interest. Consider the hypothetical temporal break-down of helping behavior depicted in Figure 3 . No matter how unstable the most disaggregated level of helping behavior may appear, aggregations of these behaviors will always produce greater stability. So, asking about helping behavior over the last hour will produce greater observed variability (i.e., over the entire scale) than averages of helping behavior over the last day, week, month, or one’s overall general level. Although it is well-known that individuals do not follow a strict averaging process when asked directly about a higher level of aggregation (e.g., helping this week; see below), it is very unlikely that such deviations from a straight average will result in less stability at higher levels of aggregation.

Figure 3. Hypothetical variability of helping behavior at different levels of aggregation.

The reason why this increase in stability is likely to occur regardless of the actual process of mental aggregation is that presumably, as you move from shorter to longer time frames, you are estimating either increasingly stable aspects of an individual’s dispositional level of the construct, or increasingly stable features of the context (e.g., a consistent workplace environment). As you move from longer to shorter time frames you are increasingly estimating immediate instances of the construct or context that are influenced not only by more stable predictors, but also dynamic trends, cycles, and intervening events ( Beal & Ghandour, 2011 ). Notably, this stabilizing effect exists independently of the differences in memory and mental aggregation that are described below.
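A small simulation can illustrate this stabilizing effect of aggregation. The hourly helping-behavior ratings below are randomly generated for illustration; the only point is that means taken over longer windows vary less than the raw observations.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical hourly helping-behavior ratings for one employee over ~12 weeks
# (values and structure are illustrative only).
hourly = rng.normal(loc=3.0, scale=1.0, size=8 * 5 * 12)   # 8 hours/day, 5 days/week, 12 weeks

def sd_of_aggregates(x, window):
    """Standard deviation of means taken over non-overlapping windows of `window` hours."""
    trimmed = x[: len(x) // window * window]
    return trimmed.reshape(-1, window).mean(axis=1).std()

for label, window in [("hourly", 1), ("daily", 8), ("weekly", 40), ("monthly", 160)]:
    print(label, round(sd_of_aggregates(hourly, window), 3))
# Observed variability shrinks as the aggregation window lengthens, mirroring the
# greater stability of reports made over longer time frames.
```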

Differences in Memory

Fundamental in determining how people will respond to these different forms of questions is the nature of memory. Robinson and Clore (2002) provided an in-depth discussion of how we rely on different forms of memory when answering questions over different time frames. Although these authors focus on reports of emotion experiences, their conclusions are likely applicable to a much wider variety of self-reports. At one end of the continuum, reports of immediate experiences are direct, requiring only one’s interpretation of what is occurring and minimizing mental processes of recall.

Moving slightly down the continuum, we encounter items that ask about very recent episodes (e.g., “since the last survey” or “in the past 2 hours” in ESM studies). Here, Robinson and Clore (2002) note that we rely on what cognitive psychologists refer to as episodic memory. Although recall is involved, specific details of the episode in question are easily recalled with a high degree of accuracy. As items move further down the continuum toward summaries of experiences over longer periods of time (e.g., “since the last survey” in a longitudinal panel design), the details of particular relevant episodes are harder to recall and so responses are tinged to an increasing degree by semantic memory. This form of memory is based on individual characteristics (e.g., neurotic individuals might offer more negative reports) as well as well-learned situation-based knowledge (e.g., “my coworkers are generally nice people, so I’m sure that I’ve been satisfied with my interactions over this period of time”). Consequently, as the time frame over which people report increases, the nature of the information provided changes. Specifically, it is increasingly informed by semantic memory (i.e., trait and situation-based knowledge) and decreasingly informed by episodic memory (i.e., particular details of one’s experiences). Thus, researchers should be aware of the memory-related implications when they choose the time frame for their measures.

Differences in the Process of Summarizing

Aside from the role of memory in determining the content of these reports, individuals also summarize their experiences in a complex manner. For example, psychologists have demonstrated that even over a single episode, people tend not to base subjective summaries of the episode on its typical or average features. Instead, we focus on particular notable moments during the experience, such as its peak or its end state, and pay little attention to some aspects of the experience, such as its duration ( Fredrickson, 2000 ; Redelmeier & Kahneman, 1996 ). The result is that a mental summary of a given episode is unlikely to reflect actual averages of the experiences and events that make up the episode. Furthermore, when considering reports that span multiple episodes (e.g., over the last month or the interval between two measurements in a longitudinal panel study), summaries become even more complex. For example, recent evidence suggests that people naturally organize ongoing streams of experience into more coherent episodes largely on the basis of goal relevance ( Beal, Weiss, Barros, & MacDermid, 2005 ; Beal & Weiss, 2013 ; Zacks, Speer, Swallow, Braver, & Reynolds, 2007 ). Thus, how we interpret and parse what is going on around us connects strongly to our goals at the time. Presumably, this process helps us to impart meaning to our experiences and predict what might happen next, but it also influences the type of information we take with us from the episode, thereby affecting how we might report on this period of time.

Practical Differences

What then, can researchers take away from this information to help in deciding what sorts of items to include in longitudinal studies? One theme that emerges from the above discussion is that summaries over longer periods of time will tend to reflect more about the individual and the meanings he or she may have imparted to the experiences, events, and behaviors that have occurred during this time period, whereas shorter-term summaries or reports of more immediate occurrences are less likely to have been processed through this sort of interpretive filter. Of course, this is not to say that the more immediate end of this continuum is completely objective, as immediate perceptions are still host to many potential biases (e.g., attributional biases typically occur immediately); rather, immediate reports are more likely to reflect one’s immediate interpretation of events rather than an interpretation that has been mulled over and considered in light of an individual’s short- and long-term goals, dispositions, and broader worldview.

The particular choice of item type (i.e., immediate vs. aggregated experiences) that will be of interest to a researcher designing a longitudinal study should of course be determined by the nature of the research question. For example, if a researcher is interested in what Weiss and Cropanzano (1996) referred to as judgment-driven behaviors (e.g., a calculated decision to leave the organization), then capturing the manner in which individuals make sense of relevant work events is likely more appropriate, and so items that ask one to aggregate experiences over time may provide a better conceptual match than items asking about immediate states. In contrast, affect-driven behaviors or other immediate reactions to an event will likely be better served by reports that ask participants for minimal mental aggregations of their experiences (e.g., immediate or over small spans of time).

The issue of mental representations of events at particular points in time should always be discussed and evaluated within the research context of the conceptual questions on the underlying substantive constructs and change processes that may account for patterns of responses over time. Many of these conceptual questions are likely to relate to construct-oriented issues such as the location of the substantive construct on the state-trait continuum and the timeframe through which short-term or long-term effects on the temporal changes in the substantive construct are likely to be manifested (e.g., effects of stressors on changes in health). On the issue of aggregation of observations across time, I see it as part of a more basic question on whether an individual’s subjective experience on a substantive construct (e.g., emotional well-being) should be assessed using momentary measures (e.g., assessing the individual’s current emotional state, measured daily over the past 1 week) or retrospective global reports (e.g., asking the individual to report an overall assessment of his or her emotional state over the past 1 week). Each of the two measurement perspectives (i.e., momentary and global retrospective) has both strengths and limitations. For example, momentary measures are less prone to recall biases compared to global retrospective measures ( Kahneman, 1999 ). Global retrospective measures, on the other hand, are widely used in diverse studies for the assessment of many subjective experience constructs with a large database of evidence concerning the measure’s reliability and validity ( Diener, Inglehart, & Tay, 2013 ). In a recent article ( Tay, Chan, & Diener, 2014 ), my colleagues and I reviewed the conceptual, methodological, and practical issues in the debate between the momentary and global retrospective perspectives as applied to the research on subjective well-being. We concluded that both perspectives could offer useful insights and suggested a multiple-method approach that is sensitive to the nature of the substantive construct and specific context of use, but also called for more research on the use of momentary measures to obtain more evidence for their psychometric properties and practical value.

Research Design Question 5: What are the biggest practical hurdles to conducting longitudinal research? What are the ways to overcome them?

As noted earlier, practical hurdles are perhaps one of the main reasons why researchers choose cross-sectional rather than longitudinal designs. Although we have already discussed a number of these issues that must be faced when conducting longitudinal research, the following discussion emphasizes two hurdles that are ubiquitous, often difficult to overcome, and particularly relevant to longitudinal designs.

Encouraging Continued Participation

Encouraging participation is a practical issue that likely faces all studies, irrespective of design; however, longitudinal studies raise special considerations given that participants must complete measurements on multiple occasions. Although there is a small literature that has examined this issue specifically (e.g., Fumagalli, Laurie, & Lynn, 2013 ; Groves et al. , 2006 ; Laurie, Smith, & Scott, 1999 ), it appears that the relevant factors are fairly similar to those noted for cross-sectional surveys. In particular, providing monetary incentives prior to completing the survey is a recommended strategy (though nonmonetary gifts can also be effective), with increased amounts resulting in increased participation rates, particularly as the burden of the survey increases ( Laurie & Lynn, 2008 ).

The impact of participant burden relates directly to the special considerations of longitudinal designs, as they are generally more burdensome. In addition, with longitudinal designs, the nature of the incentives used can vary over time, and can be tailored toward reducing attrition rates across the entire span of the survey ( Fumagalli et al. , 2013 ). For example, if the total monetary incentive is distributed across survey waves such that later waves have greater incentive amounts, and if this information is provided to participants at the outset of the study, then attrition rates may be reduced more effectively ( Martin & Loes, 2010 ); however, some research suggests that a larger initial payment is particularly effective at reducing attrition throughout the study ( Singer & Kulka, 2002 ).

In addition, the fact that longitudinal designs reflect an implicit relationship between the participant and the researchers over time suggests that incentive strategies that are considered less effective in cross-sectional designs (e.g., incentive contingent on completion) may be more effective in longitudinal designs, as the repeated assessments reflect a continuing reciprocal relationship. Indeed, there is some evidence that contingent incentives are effective in longitudinal designs ( Castiglioni, Pforr, & Krieger, 2008 ). Taken together, one potential strategy for incentivizing participants in longitudinal surveys would be to divide payment such that there is an initial relatively large incentive delivered prior to completing the first wave, followed by smaller, but increasing amounts that are contingent upon completion of each successive panel. Although this strategy is consistent with theory and evidence just discussed, it has yet to be tested explicitly.

Continued Contact

One thing that does appear certain, particularly in longitudinal designs, is that incentives are only part of the picture. An additional factor that many researchers have emphasized is the need to maintain contact with participants throughout the duration of a longitudinal survey (Laurie, 2008). Strategies here include obtaining multiple forms of contact information at the outset of the study and continually updating this information. From this information, researchers should make efforts to keep in touch with participants between measurement occasions (for panel studies) or on an ongoing basis (for ESM or other intensive designs). Laurie (2008) referred to these efforts as Keeping In Touch Exercises (KITEs) and suggested that they serve to increase belongingness and perhaps a sense of commitment to the survey effort, and have the additional benefit of obtaining updated contact and other relevant information (e.g., change of job).

Mode of Data Collection

General considerations.

In panel designs, relative to intensive designs discussed below, only a limited number of surveys are sought, and the interval between assessments is relatively large. Consequently, there is likely to be greater flexibility as to the particular methods chosen for presenting and recording responses. Although the benefits, costs, and deficiencies associated with traditional paper-and-pencil surveys are well-known, the use of internet-based surveys has evolved rapidly and so the implications of using this method have also changed. For example, early survey design technologies for internet administration were often complex and potentially costly. Simply adding items was sometimes a difficult task, and custom-formatted response options (e.g., sliding scales with specific end points, ranges, and tick marks) were often unattainable. Currently available web-based design tools often are relatively inexpensive and increasingly customizable, yet have maintained or even improved the level of user-friendliness. Furthermore, a number of studies have noted that data collected using paper-and-pencil versus internet-based applications are often comparable if not indistinguishable (e.g., Cole, Bedeian, & Feild, 2006 ; Gosling et al. , 2004 ), though notable exceptions can occur ( Meade, Michels, & Lautenschlager, 2007 ).

One issue related to the use of internet-based survey methods that is likely to be of increasing relevance in the years to come is collection of survey data using a smartphone. As of this writing (this area changes rapidly), smartphone options are in a developing phase where some reasonably good options exist, but have yet to match the flexibility and standardized appearance that comes with most desktop or laptop web-based options just described. For example, it is possible to implement repeated surveys for a particular mobile operating system (OS; e.g., Apple’s iOS, Google’s Android OS), but unless a member of the research team is proficient in programming, there will be a non-negligible up-front cost for a software engineer ( Uy, Foo, & Aguinis, 2010 ). Furthermore, as market share for smartphones is currently divided across multiple mobile OSs, a comprehensive approach will require software development for each OS that the sample might use.

There are a few other options, but some of them are not quite complete solutions. For example, survey administration tools such as Qualtrics now allow for testing of smartphone compatibility when creating web-based surveys. So, one could conceivably create a survey using this tool and have people respond to it on their smartphones with little or no loss of fidelity. Unfortunately, these tools (again, at this moment in time) do not offer elegant or flexible signaling capabilities. For example, intensive repeated measures designs will often need to send a reasonably large number of participants (e.g., N = 50–100) multiple random signals every day for multiple weeks. Accomplishing this task without the use of a built-in signaling function (e.g., one that generates this pattern of randomized signals and alerts each person's smartphone at the appropriate time) is no small feat.
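As a rough illustration of what such a signaling function has to produce, the sketch below generates a randomized signal schedule (several signals per day within working hours, with a minimum gap between signals) for a panel of participants. The function and all parameters are hypothetical and are not tied to any particular survey platform.

```python
import datetime
import random

def make_signal_schedule(n_participants=75, n_days=14, signals_per_day=5,
                         start_hour=9, end_hour=17, min_gap_minutes=30, seed=123):
    """Generate random signal times per participant per day within a working-hour
    window, enforcing a minimum gap between signals. Purely illustrative."""
    rng = random.Random(seed)
    window = (end_hour - start_hour) * 60           # length of the daily window in minutes
    schedule = {}
    for p in range(n_participants):
        days = []
        for d in range(n_days):
            while True:                              # resample until the minimum gap is satisfied
                minutes = sorted(rng.sample(range(window), signals_per_day))
                if all(b - a >= min_gap_minutes for a, b in zip(minutes, minutes[1:])):
                    break
            days.append([datetime.time(start_hour + m // 60, m % 60) for m in minutes])
        schedule[p] = days
    return schedule

schedule = make_signal_schedule()
print(schedule[0][0])   # signal times for participant 0 on day 1
```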

There are, however, several efforts underway to provide free or low-cost survey development applications for mobile devices. For example, PACO is a (currently) free Google app that is in the beta-testing stage and allows great flexibility in the design and implementation of repeated surveys on both Android OS and iOS smartphones. Another example, currently being developed for both Android and iOS platforms, is Expimetrics (Tay, 2015), which promises flexible design and signaling functions at low cost for researchers collecting ESM data. Such applications offer the promise of highly accessible survey administration and signaling, and have the added benefit of transmitting data quickly to servers accessible to the research team. Ideally, such advances in the accessibility of survey administration will allow increased response rates throughout the duration of the longitudinal study.

Issues specific to intensive designs

All of the issues just discussed with respect to the mode of data collection are particularly relevant for short-term intensive longitudinal designs such as ESM. As the number of measurement occasions increases, so too do the necessities of increasing accessibility and reducing participant burden wherever possible. Of particular relevance is the emphasis ESM places on obtaining in situ assessments to increase the ecological validity of the study ( Beal, 2015 ). To maximize this benefit of the method, it is important to reduce the interruption introduced by the survey administration. If measurement frequency is relatively sparse (e.g., once a day), it is likely that simple paper-and-pencil or web-based modes of collection will be sufficient without creating too much interference ( Green et al. , 2006 ). In contrast, as measurements become increasingly intensive (e.g., four or five times/day or more), reliance on more accessible survey modes will become important. Thus, a format that allows for desktop, laptop, or smartphone administration should be of greatest utility in such intensive designs.

Statistical Techniques Question 1: With respect to assessing changes over time in a latent growth modeling framework, how can a researcher address different conceptual questions by coding the slope variable differently?

As with many questions in this article, an in-depth answer to this particular question is not possible in the available space. Hence, only a general treatment of different coding schemes of the slope or change variable is provided. Excellent detailed treatments of this topic may be found in Bollen and Curran (2006, particularly chapters 3 and 4) and in Singer and Willett (2003, particularly chapter 6). As noted by Ployhart and Vandenberg (2010), specifying the form of change should be an a priori conceptual endeavor, not a post hoc data-driven effort. This stance was also stated earlier by Singer and Willett (2003) when distinguishing between empirical (data-driven) versus rational (theory-driven) strategies: "Under rational strategies, on the other hand, you use theory to hypothesize a substantively meaningful functional form for the individual change trajectory. Although rational strategies generally yield clearer interpretations, their dependence on good theory makes them somewhat more difficult to develop and apply" (Singer & Willett, 2003, p. 190). The last statement in the quote simply reinforces the main theme throughout this article; that is, researchers need to undertake the difficult task of bringing time (change being one form) into their conceptual frameworks in order to more adequately examine the causal structure among the focal variables within those frameworks.

In general, there are three sets of functional forms in which the slope or change variable may be coded or specified: (a) linear, (b) discontinuous, and (c) nonlinear. "Sets" emphasizes that within each form there are different types that must be considered. The most commonly seen form in our literature is linear change (e.g., Bentein et al., 2005; Vandenberg & Lance, 2000). Linear change means there is an expectation that the variable of interest should increase or decrease in a straight-line function during the intervals of the study. The simplest form of linear change occurs when there are equal measurement intervals across time and the units of observation were obtained at the same time in those intervals. Assuming, for example, that there were four occasions of measurement, the coding of the slope variable would be 0 (Time 1), 1 (Time 2), 2 (Time 3), and 3 (Time 4). Such coding fixes the intercept (the starting value of the line) at the Time 1 interval, and thus, the conceptual interpretation of the linear change is made relative to this starting point. Reinforcing the notion that there is a set of considerations, one may have a conceptual reason for wanting to fix the intercept to the last measurement occasion. For example, there may be an extensive training program anchored with a "final exam" on the last occasion, and one wants to study the developmental process resulting in the final score. In this case, the coding scheme may be −3, −2, −1, and 0 going from Time 1 to Time 4, respectively (Bollen & Curran, 2006, p. 116; Singer & Willett, 2003, p. 182). One may also have a conceptual reason to use the middle of the time intervals to anchor the intercept and look at the change above and below this point. Thus, the coding scheme in the current example may be −1.5, −0.5, 0.5, and 1.5 for Time 1 to Time 4, respectively (Bollen & Curran, 2006; Singer & Willett, 2003). There are other considerations in the "linear set," such as the specification of linear change in cohort designs or other cases where there are individually varying times of observation (i.e., not everyone started at the same time, at the same age, at the same intervals, etc.). The latter may require the use of missing data procedures or time-varying covariates that account for the differences in when observations were collected. For example, to examine how retirement influences life satisfaction, Pinquart and Schindler (2007) modeled life satisfaction data from a representative sample of German retirees who retired between 1985 and 2003. Due to the retirement timing differences among the participants (not everyone retired at the same time or at the same age), different numbers of life satisfaction observations were collected for different retirees. Therefore, the missing observations on a yearly basis were modeled as latent variables to ensure that the analyses were able to cover the entire studied time span.
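The sketch below illustrates, for a single hypothetical case, how these three codings of the time variable change the meaning of the intercept while leaving the estimated linear slope untouched. The scores are invented, and a simple least-squares line stands in for the full latent growth model.

```python
import numpy as np

# One person's scores across four equally spaced waves (illustrative values).
y = np.array([2.0, 2.6, 3.1, 3.9])

codings = {
    "intercept at Time 1":       np.array([0.0, 1.0, 2.0, 3.0]),
    "intercept at Time 4":       np.array([-3.0, -2.0, -1.0, 0.0]),
    "intercept at the midpoint": np.array([-1.5, -0.5, 0.5, 1.5]),
}

for label, time in codings.items():
    slope, intercept = np.polyfit(time, y, 1)
    print(f"{label}: slope = {slope:.2f}, intercept = {intercept:.2f}")
# The slope (rate of linear change) is identical across codings; only the meaning
# of the intercept shifts (expected score at Time 1, Time 4, or the midpoint).
```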

Discontinuous change is the second set of functional form with which one could theoretically describe the change in one’s substantive focal variables. Discontinuities are precipitous events that may cause the focal variable to rapidly accelerate (change in slope) or to dramatically increase/decrease in value (change in elevation) or both change in slope and elevation (see Ployhart & Vandenberg, 2010 , Figure 1 in p. 100; Singer & Willett, 2003 , pp. 190–208, see Table 6.2 in particular). For example, according to the stage theory ( Wang et al. , 2011 ), retirement may be such a precipitous event, because it can create an immediate “honeymoon effect” on retirees, dramatically increasing their energy-level and satisfaction with life as they pursue new activities and roles.

This set of discontinuous functional form has also been referred to as piecewise growth ( Bollen & Curran, 2006 ; Muthén & Muthén, 1998–2012 ), but in general, represents situations where all units of observation are collected at the same time during the time intervals and the discontinuity happens to all units at the same time. It is actually a variant of the linear set, and therefore, could have been presented above as well. To illustrate, assume we are tracking individual performance metrics that had been rising steadily across time, and suddenly the employer announces an upcoming across-the-board bonus based on those metrics. A sudden rise (as in a change in slope) in those metrics could be expected based purely on reinforcement theory. Assume, for example, we had six intervals of measurement, and the bonus announcement was made just after the Time 3 data collection. We could specify two slope or change variables and code the first one as 0, 1, 2, 2, 2, and 2, and code the second slope variable as 0, 0, 0, 1, 2, and 3. The latter specification would then independently examine the linear change in each slope variable. Conceptually, the first slope variable brings the trajectory of change up to the transition point (i.e., the last measurement before the announcement) while the second one captures the change after the transition ( Bollen & Curran, 2006 ). Regardless of whether the variables are latent or observed only, if this is modeled using software such as Mplus ( Muthén & Muthén, 1998–2012 ), the difference between the means of the slope variables may be statistically tested to evaluate whether the post-announcement slope is indeed greater than the pre-announcement slope. One may also predict that the announcement would cause an immediate sudden elevation in the performance metric as well. This can be examined by including a dummy variable which is zero at all time points prior to the announcement and one at all time points after the announcement ( Singer & Willett, 2003 , pp. 194–195). If the coefficient for this dummy variable is statistically significant and positive, then it indicates that there was a sudden increase (upward elevation) in value post-transition.
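A minimal simulated version of this piecewise specification is sketched below. It uses the two slope codes and the elevation dummy described above in an ordinary least-squares regression for a single trajectory, rather than a latent growth model in Mplus, so it only shows how the coding recovers the pre- and post-transition slopes and the jump in elevation; all values are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

# Time codes for six waves with a transition after Time 3 (the bonus announcement).
slope_pre  = np.array([0, 1, 2, 2, 2, 2], dtype=float)   # change up to the transition
slope_post = np.array([0, 0, 0, 1, 2, 3], dtype=float)   # change after the transition
jump       = np.array([0, 0, 0, 1, 1, 1], dtype=float)   # dummy for a shift in elevation

# Simulate one performance trajectory: slope 0.5 before, 1.2 after, elevation jump of 0.8.
y = 10 + 0.5 * slope_pre + 1.2 * slope_post + 0.8 * jump + rng.normal(0, 0.05, 6)

X = np.column_stack([np.ones(6), slope_pre, slope_post, jump])
coef = np.linalg.lstsq(X, y, rcond=None)[0]
print("pre-slope %.2f, post-slope %.2f, elevation shift %.2f" % (coef[1], coef[2], coef[3]))
```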

Another form of discontinuous change is one in which the discontinuous event occurs at varying times for the units of observation (indeed, it may not occur at all for some) and the intervals for collecting data may not be evenly spaced. For example, assume again that individual performance metrics are monitored across time for individuals in high-demand occupations, with the first one collected on the date of hire. Assume as well that these individuals are required to report when an external recruiter approaches them; that is, they are not prohibited from speaking with a recruiter but need only report when it occurred. Due to some cognitive dissonance process, individuals may start to discount the current employer and reduce their inputs. Thus, a change in slope, elevation, or both may be expected in performance. With respect to testing a potential change in elevation, one uses the same dummy-coded variable as described above (Singer & Willett, 2003). Testing whether the slopes of the performance metrics differ pre- versus post-recruiter contact, however, requires the use of a time-varying covariate. How this operates specifically is beyond the scope here. Excellent treatments of the topic, however, are provided by Bollen and Curran (2006, pp. 192–218) and Singer and Willett (2003, pp. 190–208). In general, a time-varying covariate captures the intervals of measurement. In the current example, this may be the number of days (weeks, months, etc.) from the date of hire (when baseline performance was obtained) to the next interval of measurement and all subsequent intervals. Person 1, for example, may have the values 1, 22, 67, 95, 115, and 133, and was contacted after Time 3 on Day 72 from the date of hire. Person 2 may have the values 1, 31, 56, 101, 141, and 160, and was contacted after Time 2 on Day 40 from the date of hire. Referring the reader to the specifics starting on page 195 of Singer and Willett (2003), one would then create a new variable in which all of the values before the recruiting contact are set to zero, and the values after the contact are set to the number of days between the contact date and each subsequent measurement occasion. Thus, for Person 1, this new variable would have the values 0, 0, 0, 23, 43, and 61, and for Person 2, the values would be 0, 0, 16, 61, 101, and 120. The slope of this new variable represents the increment (up or down) to what the slope would have been had the individuals not been contacted by a recruiter. If it is statistically nonsignificant, then there is no change in slope pre- versus post-recruiter contact. If it is statistically significant, then the slope after contact differed from that before the contact. Finally, while much of the above is based upon a multilevel approach to operationalizing change, Muthén and Muthén (1998–2012) offer an SEM approach to time-varying covariates through their Mplus software package.
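The construction of this time-varying covariate can be written out directly. The sketch below reproduces the two worked examples from the preceding paragraph; the function name is hypothetical.

```python
import numpy as np

def post_contact_covariate(measurement_days, contact_day):
    """Days elapsed since recruiter contact at each measurement occasion;
    zero for occasions before contact (or throughout, if no contact occurred)."""
    days = np.asarray(measurement_days, dtype=float)
    if contact_day is None:
        return np.zeros_like(days)
    return np.where(days > contact_day, days - contact_day, 0.0)

print(post_contact_covariate([1, 22, 67, 95, 115, 133], 72))   # -> [0, 0, 0, 23, 43, 61]
print(post_contact_covariate([1, 31, 56, 101, 141, 160], 40))  # -> [0, 0, 16, 61, 101, 120]
```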

The final functional form in which the slope or change variable may be coded or specified is nonlinear. As with the other forms, there is a set of nonlinear forms. The simplest in the set is when theory states that the change in the focal variable may be quadratic (curving upward or downward). As such, in addition to the linear slope/change variable, a second change variable is specified in which the values of its slope are fixed to the squared values of the first, or linear, change variable. Assuming five equally spaced intervals of measurement coded as 0, 1, 2, 3, and 4 on the linear change variable, the values of the second, quadratic change variable would be 0, 1, 4, 9, and 16. Theory could state that there is cubic change as well. In that case, a third, cubic change variable is introduced with the values of 0, 1, 8, 27, and 64. One problem with the use of quadratic (or even linear) change variables or other polynomial forms as described above is that the trajectories are unbounded functions (Bollen & Curran, 2006); that is, there is an assumption that they tend toward infinity. It is unlikely that most, if any, of the theoretical processes in the social sciences are truly unbounded. If a nonlinear form is expected, operationalizing change using an exponential trajectory is probably the most realistic choice. This is because exponential trajectories are bounded functions in the sense that they approach an asymptote (either growing and/or decaying to the asymptote). There are three forms of exponential trajectories: (a) simple, where there is explosive growth from the asymptote; (b) negative, where there is growth to an asymptote; and (c) logistic, where there is an asymptote at both ends (Singer & Willett, 2003). Obviously, the values of the slope or change variable would be fixed to the exponents most closely representing the form of the curve (see Bollen & Curran, 2006, p. 108; and Singer & Willett, 2003, Table 6.7, p. 234).
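The sketch below shows how the quadratic and cubic codes follow from the linear codes, and contrasts an unbounded polynomial with a bounded negative-exponential trajectory that grows toward an asymptote. All numerical values are illustrative.

```python
import numpy as np

linear = np.array([0, 1, 2, 3, 4], dtype=float)
quadratic = linear ** 2          # 0, 1, 4, 9, 16
cubic = linear ** 3              # 0, 1, 8, 27, 64
print(quadratic, cubic)

# Polynomial trajectories are unbounded, whereas a negative exponential grows
# toward an asymptote: y(t) = asymptote - (asymptote - y0) * exp(-rate * t).
asymptote, y0, rate = 10.0, 2.0, 0.6
t = np.linspace(0, 10, 11)
y = asymptote - (asymptote - y0) * np.exp(-rate * t)
print(np.round(y, 2))            # approaches 10 and stays bounded
```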

There are other nonlinear considerations as well that belong to this set. For example, Bollen and Curran (2006, p. 109) address the issue of cycles (recurring ups and downs that nonetheless follow a general upward or downward trend). Once more, the values of the change variable would be coded to reflect those cycles. Similarly, Singer and Willett (2003, p. 208) address recoding when one wants to remove, through transformations, the nonlinearity in the change function to make it more linear. They provide an excellent heuristic on page 211 to guide one's thinking on this issue.

Statistical Techniques Question 2: In longitudinal research, are there additional issues of measurement error that we need to pay attention to, which are over and above those that are applicable to cross-sectional research?

Longitudinal research should pay special attention to the measurement invariance issue. Chan (1998) and Schmitt (1982) introduced Golembiewski and colleagues' (1976) notion of alpha, beta, and gamma change to explain why measurement invariance is a concern in longitudinal research. When the measurement of a particular concept retains the same structure (i.e., the same number of observed items and latent factors, and the same value and pattern of factor loadings), change in the absolute levels of the latent factor is called alpha change. Only for this type of change can we draw the conclusion that there is a specific form of growth in a given variable. When the measurement of a concept has to be adjusted over time (i.e., different values or patterns of factor loadings), beta change happens. Although the conceptual meaning of the factor remains the same over measurements, the subjective metric of the concept has changed. When the meaning of a concept changes over time (e.g., a different number of factors or different correlations between factors), gamma change happens. It is not possible to compare differences in absolute levels of a latent factor when beta and gamma changes happen, because there is no longer a stable measurement model for the construct. The notions of beta and gamma change are particularly important to consider when conducting longitudinal research on aging-related phenomena, especially when long time intervals are used in data collection. In such situations, the risk of encountering beta and gamma change is higher and can seriously jeopardize the internal and external validity of the research.
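A small simulation can show why unmodeled beta change is a problem. In the sketch below, the latent factor has exactly the same mean at both occasions, but the factor loadings shift at Time 2; summed scale scores then suggest growth that is purely an artifact of the changed measurement metric. The loadings, intercepts, and latent mean are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100000

# Same latent mean and variance at both occasions: no alpha (true) change.
latent_t1 = rng.normal(1.0, 1.0, n)
latent_t2 = rng.normal(1.0, 1.0, n)

loadings_t1 = np.array([0.7, 0.7, 0.7])
loadings_t2 = np.array([1.0, 1.0, 1.0])   # beta change: the items' metric shifts at Time 2
intercept = 3.0

items_t1 = intercept + latent_t1[:, None] * loadings_t1 + rng.normal(0, 0.5, (n, 3))
items_t2 = intercept + latent_t2[:, None] * loadings_t2 + rng.normal(0, 0.5, (n, 3))

print("summed score, Time 1:", round(items_t1.sum(axis=1).mean(), 2))   # about 11.1
print("summed score, Time 2:", round(items_t2.sum(axis=1).mean(), 2))   # about 12.0
# The apparent "growth" in the summed scores reflects only the shift in factor
# loadings, not any change in the latent variable, which is why invariance is
# tested before growth parameters are interpreted.
```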

Longitudinal analysis is often conducted to examine how changes happen in the same variable over time. In other words, it operates on the “alpha change” assumption. Thus, it is often important to explicitly test measurement invariance before proceeding to model the growth parameters. Without establishing measurement invariance, it is unknown whether we are testing meaningful changes or comparing apples and oranges. A number of references have discussed the procedures for testing measurement invariance in latent variable analysis framework (e.g., Chan, 1998 ; McArdle, 2007 ; Ployhart & Vandenberg, 2010 ). The basic idea is to specify and include the measurement models in the longitudinal model, with either continuous or categorical indicators (see answers to Statistical Techniques #4 below on categorical indicators). With the latent factor invariance assumption, factor loadings across measurement points should be constrained to be equal. Errors from different measurement occasions might correlate, especially when the measurement contexts are very similar over time ( Tisak & Tisak, 2000 ). Thus, the error variances for the same item over time can also be correlated to account for common influences at the item-level (i.e., autocorrelation between items). With the specification of the measurement structure, the absolute changes in the latent variables can then be modeled by the mean structure. It should be noted that a more stringent definition of measurement invariance also requires equal variance in latent factors. However, in longitudinal data this requirement becomes extremely difficult to satisfy, and factor variances can be sample specific. Thus, this requirement is often eased when testing measurement invariance in longitudinal analysis. Moreover, this requirement may even be invalid when the nature of the true change over time involves changes in the latent variance ( Chan, 1998 ).

It is important to note that the mean structure approach not only applies to longitudinal models with three or more measurement points, but also to simple repeated measures designs (e.g., pre–post designs). Traditional paired-sample t tests and within-subject repeated measures ANOVAs, which simply use the summed scores at the two measurement points to conduct a hypothesis test, do not take measurement equivalence into account. The mean structure approach provides a more powerful way to test changes/differences in a latent variable by taking measurement errors into consideration (McArdle, 2009).

However, sometimes it is not possible to achieve measurement equivalence by using the same scales over time. For example, in research on the development of cognitive intelligence in individuals from birth to late adulthood, different tests of cognitive intelligence are administered at different ages (e.g., Bayley, 1956). In applied settings, different domain-knowledge or skill tests may be administered to evaluate employee competence at different stages of their careers. Another possible reason for changing measures is poor psychometric properties of scales used in earlier data collection. Previously, researchers have used transformed scores (e.g., scores standardized within each measurement point) before modeling growth curves over time. In response to critiques of these scaling methods, new procedures have been developed to model longitudinal data using changed measurement (e.g., rescoring methods, over-time prediction, and structural equation modeling with convergent factor patterns). Recently, McArdle and colleagues (2009) proposed a joint model approach that estimates an item response theory (IRT) model and a latent curve model simultaneously. They provided a demonstration of how to effectively handle changing measurement in longitudinal studies using this newly proposed approach.

I am not sure these issues of measurement error are “over and above” cross-sectional issues so much as that cross-sectional data provide no mechanisms for dealing with these issues, so they are simply ignored at the analysis stage. Unfortunately, this creates problems at the interpretation stage. In particular, random walk variables (Kuljanin, Braun, & DeShon, 2011) are a potential problem for longitudinal data analysis and for the interpretation of either cross-sectional or longitudinal designs. Random walk variables are the dynamic variables I mentioned earlier when describing the computational modeling approach. These variables have some value and are moved from that value. The random walk expression comes from the image of a highly inebriated individual who is in some position but staggers and sways to neighboring positions because alcohol has disrupted the nervous system’s stabilizers. This inebriated individual might have an intended direction (called “the trend” if the individual can make any real progress), but there may be a lot of noise in that path. In the aging and retirement literature, one’s retirement savings can be viewed as a random walk variable. Although the general trend of retirement savings should be positive (i.e., the amount of retirement savings should grow over time), at any given point the exact amount added to (or withdrawn from) the savings depends on a number of situational factors (e.g., stock market performance) and cannot be consistently predicted. Random walk (i.e., dynamic) variables exhibit nonindependence among observations over time. Indeed, one way to know whether one is measuring a dynamic variable is to look for a simplex pattern among the correlations of the variable with itself over time. In a simplex pattern, observations of the variable are more highly correlated when they are measured closer in time (e.g., Time 1 observations correlate more highly with Time 2 than with Time 3). Of course, this pattern can also occur if one of the variable’s proximal causes (rather than the variable itself) is a dynamic variable.
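
A small simulation (a hypothetical sketch, not an analysis from any of the cited studies) illustrates why random walk variables produce the simplex pattern just described:

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_waves = 1000, 6

# Each person's score follows a random walk: this wave's value is last wave's value plus a shock.
shocks = rng.normal(0, 1, size=(n_people, n_waves))
scores = shocks.cumsum(axis=1)

# Wave-to-wave correlations form a simplex pattern: adjacent waves correlate
# more strongly than waves that are farther apart in time.
corr = np.corrcoef(scores, rowvar=False)
print(np.round(corr, 2))
```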

As noted, dynamic or random walk variables can create problems for poorly designed longitudinal research because one may not realize that the level of the criterion ( Y ), say measured at Time 3, was largely near its level at Time 2, when the presumed cause ( X ) was measured. Moreover, at Time 1 the criterion ( Y ) might have been busy moving the level of the “causal” variable ( X ) to the place it is observed at Time 2. That is, the criterion variable ( Y ) at Time 1 is actually causing the presumed causal variable ( X ) at Time 2. For example, performances might affect self-efficacy beliefs such that self-efficacy beliefs end up aligning with performance levels. If one measures self-efficacy after it has largely been aligned, and then later measures the largely stable performance, a positive correlation between the two variables might be thought of as reflecting self-efficacy’s influence on performance because of the timing of measurement (i.e., measuring self-efficacy before performance). This is why the multiple wave measurement practice is so important in passive observational panel studies.

However, multiple waves of measurement might still create problems for random walk variables, particularly if there are trends and reverse causality. Consider the self-efficacy to performance example again. If performance is trending over time and self-efficacy is following along behind it, a within-person positive correlation between self-efficacy and subsequent performance is likely to be observed (even if the true causal effect is null or weakly negative), because self-efficacy will be relatively high when performance is relatively high and low when performance is low. In this case, controlling for the trend or for past performance will generally solve the problem (Sitzmann & Yeo, 2013), unless the random walk has no trend. There are other issues that random walk variables may raise for both cross-sectional and longitudinal research, which Kuljanin et al. (2011) do a very good job of articulating.

A related issue for longitudinal research is nonindependence of observations as a function of nesting within clusters. This issue has received a great deal of attention in the multilevel literature (e.g., Bliese & Ployhart, 2002; Singer & Willett, 2003), so I will not belabor the point. However, there is one more nonindependence issue that has not received much attention. Specifically, the issue arises when a variable is used as a lagged predictor of itself (Vancouver, Gullekson, & Bliese, 2007). With just three repeated observations, the lagged correlation of the variable with itself will average −.33 across the three time points, even if the observations are randomly generated. This is because there is a one-third chance that the repeated observations change monotonically over the three time points, which results in a correlation of 1, and a two-thirds chance that they do not change monotonically, which results in a correlation of −1; these average to −.33. Thus, on average it will appear that the variable is negatively causing itself. Fortunately, this problem is quickly mitigated by more waves of observations and more cases (i.e., the bias is largely removed with 60 pairs of observations).
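
This arithmetic is easy to verify with a quick simulation (an illustrative sketch of the point, not the Monte Carlo analyses reported in Vancouver, Gullekson, & Bliese, 2007):

```python
import numpy as np

rng = np.random.default_rng(1)

corrs = []
for _ in range(10000):
    y = rng.normal(size=3)                       # three purely random observations for one case
    lagged = np.corrcoef(y[:-1], y[1:])[0, 1]    # with only two pairs, this is always +1 or -1
    corrs.append(lagged)

# Monotonic sequences (probability 1/3) give +1; non-monotonic ones (probability 2/3) give -1.
print(round(np.mean(corrs), 3))                  # close to -1/3
```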

Statistical Techniques Question 3: When analyzing longitudinal data, how should we handle missing values?

As reviewed by Newman (2014 ; see in-depth discussions by Enders, 2001 , 2010 ; Little & Rubin, 1987 ; Newman, 2003 , 2009 ; Schafer & Graham, 2002 ), there are three levels of missing data (item level missingness, variable/construct-level missingness, and person-level missingness), two problems caused by missing data (parameter estimation bias and low statistical power), three mechanisms of missing data (missing completely at random/MCAR, missing at random/MAR, and missing not at random/MNAR), and a handful of common missing data techniques (listwise deletion, pairwise deletion, single imputation techniques, maximum likelihood, and multiple imputation). State-of-the-art advice is to use maximum likelihood (ML: EM algorithm, Full Information ML) or multiple imputation (MI) techniques, which are particularly superior to other missing data techniques under the MAR missingness mechanism, and perform as well as—or better than—other missing data techniques under MCAR and MNAR missingness mechanisms (MAR missingness is a form of systematic missingness in which the probability that data are missing on one variable [ Y ] is related to the observed data on another variable [ X ]).

Most of the controversy surrounding missing data techniques involves two misconceptions: (a) the misconception that listwise and pairwise deletion are somehow more natural techniques that involve fewer or less tenuous assumptions than ML and MI techniques do, with the false belief that a data analyst can draw safer inferences by avoiding the newer techniques, and (b) the misconception that multiple imputation simply entails “fabricating data that were not observed.” First, because all missing data techniques are based upon particular assumptions, none is perfect. Also, when it comes to selecting a missing data technique to analyze incomplete data, one of the above techniques (e.g., listwise, pairwise, ML, MI) must be chosen. One cannot safely avoid the decision altogether—that is, abstinence is not an option. One must select the least among evils.

Because listwise and pairwise deletion make the exceedingly unrealistic assumption that missing data are missing completely at random/MCAR (cf. Rogelberg et al., 2003), they will almost always produce more bias than ML and MI techniques, on average (Newman & Cottrell, 2015). Listwise deletion can further lead to extreme reductions in statistical power. Next, single imputation techniques (e.g., mean substitution, stochastic regression imputation)—in which the missing data are filled in only once and the resulting data matrix is analyzed as if the data had been complete—are seriously flawed because they overestimate the sample size and underestimate standard errors and p-values.
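
The following small simulation (hypothetical values chosen only for illustration) shows the kind of bias listwise deletion can produce when missingness is MAR, that is, when dropout at Time 2 depends on scores observed at Time 1:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# X is measured at Time 1; Y at Time 2 depends partly on X.
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

# MAR mechanism: the probability that Y is missing depends on the observed X
# (e.g., people who scored low at Time 1 are more likely to drop out at Time 2).
p_missing = 1 / (1 + np.exp(2 * x))
observed = rng.uniform(size=n) >= p_missing

# Listwise deletion keeps only complete cases, which are no longer representative.
print("Mean of Y, full sample:     ", round(y.mean(), 3))
print("Mean of Y, complete cases:  ", round(y[observed].mean(), 3))   # biased upward
print("Corr(X, Y), complete cases: ", round(np.corrcoef(x[observed], y[observed])[0, 1], 3))
# The complete-case correlation is also attenuated by the range restriction on X.
```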

Unfortunately, researchers often get confused into thinking that multiple imputation suffers from the same problems as single imputation; it does not. In multiple imputation, missing data are filled in several different times, and the multiple resulting imputed datasets are then aggregated in a way that accounts for the uncertainty in each imputation ( Rubin, 1987 ). Multiple imputation is not an exercise in “making up data”; it is an exercise in tracing the uncertainty of one’s parameter estimates, by looking at the degree of variability across several imprecise guesses (given the available information). The operative word in multiple imputation is multiple , not imputation.
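
To make the “multiple, not imputation” point concrete, here is a minimal sketch of Rubin’s (1987) pooling rules, the step in which the variability across imputations is folded into the standard error (the function name and example numbers are mine, for illustration only):

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's (1987) rules."""
    estimates = np.asarray(estimates, dtype=float)   # point estimate from each imputed dataset
    variances = np.asarray(variances, dtype=float)   # squared standard error from each dataset
    m = len(estimates)

    q_bar = estimates.mean()                  # pooled point estimate
    within = variances.mean()                 # average within-imputation variance
    between = estimates.var(ddof=1)           # between-imputation variance (imputation uncertainty)
    total = within + (1 + 1 / m) * between    # total variance of the pooled estimate
    return q_bar, np.sqrt(total)

# Example: the same regression slope estimated in m = 5 imputed datasets.
estimate, se = pool_rubin([0.42, 0.45, 0.40, 0.44, 0.43],
                          [0.010, 0.011, 0.009, 0.010, 0.012])
print(round(estimate, 3), round(se, 3))
```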

Longitudinal modeling tends to involve a lot of construct- or variable-level missing data (i.e., omitting answers from an entire scale, an entire construct, or an entire wave of observation—e.g., attrition). Such conditions create many partial nonrespondents, or participants for whom some variables have been observed and some other variables have not been observed. Thus a great deal of missing data in longitudinal designs tends to be MAR (e.g., because missing data at Time 2 is related to observed data at Time 1). Because variable-level missingness under the MAR mechanism is the ideal condition for which ML and MI techniques were designed ( Schafer & Graham, 2002 ), both ML and MI techniques (in comparison to listwise deletion, pairwise deletion, and single imputation techniques) will typically produce much less biased estimates and more accurate hypothesis tests when used on longitudinal designs ( Newman, 2003 ). Indeed, ML missing data techniques are now the default techniques in LISREL, Mplus, HLM, and SAS Proc Mixed. It is thus no longer excusable to perform discrete-time longitudinal analyses ( Figure 2 ) without using either ML or MI missing data techniques ( Enders, 2010 ; Graham, 2009 ; Schafer & Graham, 2002 ).

Lastly, because these newer missing data techniques incorporate all of the available data, it is now increasingly important for longitudinal researchers to not give up on early nonrespondents. Attrition need not be a permanent condition. If a would-be respondent chooses not to reply to a survey request at Time 1, the researcher should still attempt to collect data from that person at Time 2 and Time 3. More data = more useful information that can reduce bias and increase statistical power. Applying this advice to longitudinal research on aging and retirement, it means that even when a participant fails to provide responses at some measurement points, continuing to make an effort to collect more data from the participant in subsequent waves may still be worthwhile. It will certainly help combat the issue of attrition and allow more usable data to emerge from the longitudinal data collection.

Statistical Techniques Question 4: Most existing longitudinal research focuses on studying quantitative change over time. What if the variable of interest is categorical, or if the changes over time are qualitative in nature?

I think there are two questions here: how to model longitudinal data on categorical variables, and how to model discontinuous change patterns in variables over time. In terms of longitudinal categorical data, there are two types of data that researchers typically encounter. One type comes from measuring a sample of participants on a categorical variable at a few time points (i.e., panel data). The research question driving the data analysis is to understand the change of status from one time point to the next. For example, researchers might be interested in whether a population of older workers would stay employed or switch between employed and unemployed statuses (e.g., Wang & Chan, 2011). To answer this question, the employment status (employed or unemployed) of a sample of older workers might be measured five or six times over several years. When the transition between qualitative statuses is of theoretical interest, this type of panel data can be modeled via Markov chain models. The simplest form is a simple Markov model with a single chain, which assumes that (a) the observed status at time t depends on the observed status at time t − 1, (b) the observed categories are free from measurement error, and (c) the whole population can be described by a single chain. The first assumption is held by most if not all Markov chain models. The other two assumptions can be relaxed by using latent Markov chain modeling (see Langeheine & Van de Pol, 2002, for a detailed explanation).
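
As a simple illustration of the manifest single-chain case (hypothetical data; the latent and mixture extensions discussed next require dedicated software), the transition matrix can be estimated directly from the observed wave-to-wave moves:

```python
import numpy as np

# Hypothetical panel data: employment status (0 = unemployed, 1 = employed),
# one row per person, one column per wave.
status = np.array([
    [1, 1, 1, 0, 0],
    [1, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 1, 1, 1],
])

# Simple (manifest, single-chain) Markov model: count moves from status i at t-1
# to status j at t, then convert the counts to transition probabilities.
counts = np.zeros((2, 2))
for prev, curr in zip(status[:, :-1].ravel(), status[:, 1:].ravel()):
    counts[prev, curr] += 1
transition = counts / counts.sum(axis=1, keepdims=True)
print(np.round(transition, 2))   # row i, column j: P(status j at t | status i at t-1)
```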

The basic idea of latent Markov chains is that the observed categories reflect, to a certain extent, the “true” status on a latent categorical variable (i.e., the latent categorical variable is the cause of the observed categorical variable). In addition, because the observations may contain measurement error, a number of different observed patterns over time could reflect the same underlying latent transition pattern in qualitative status. This way, a large number of observed patterns (e.g., a maximum of 256 patterns for a categorical variable with four categories measured four times) can be reduced to a small number of theoretically coherent patterns (e.g., a maximum of 16 patterns for a latent categorical variable with two latent statuses over four time points). It is also important to note that subpopulations within a larger population can follow qualitatively different transition patterns. This heterogeneity in latent Markov chains can be modeled by mixture latent Markov modeling, a technique integrating latent Markov modeling and latent class analysis (see Wang & Chan, 2011, for technical details). Given that mixture latent Markov modeling is part of the general latent variable analysis framework (Muthén, 2001), mixture latent Markov models can include different types of covariates and outcomes (latent or observed, categorical or continuous) of the subpopulation membership as well as of the transition parameters of each subpopulation.

Another type of longitudinal categorical data comes from measuring one or a few study units on many occasions separated by the same time interval (e.g., every hour, day, month, or year). Studies examining this type of data mostly aim to understand the temporal trend or periodic tendency in a phenomenon. For example, one can examine the cyclical trend in daily stressful events (occurred or not) over several months among a few employees. The research goal could be to reveal multiple cyclical patterns in the occurrence of stressful events, such as daily, weekly, and/or monthly cycles. Another example is the study of the performance of a particular player or sports team (i.e., win, loss, or tie) over hundreds of games. The research question could be to find time-varying factors that account for the cyclical patterns in game performance. The statistical techniques typically used to analyze this type of data belong to the family of categorical time series analyses. A detailed technical review is beyond the current scope, but interested readers can refer to Fokianos and Kedem (2003) for an extended overview.
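
One very simple way to probe a suspected cycle in a binary event series is to regress the event on sine and cosine terms at the candidate period; the sketch below simulates a weekly cycle in daily stressor occurrence and recovers it (this is only an illustrative device, not the full categorical time series machinery reviewed by Fokianos and Kedem, 2003):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
days = np.arange(365)

# Simulate a daily binary outcome (stressful event occurred or not) with a weekly cycle.
true_logit = -0.5 + 1.0 * np.sin(2 * np.pi * days / 7)
event = rng.uniform(size=days.size) < 1 / (1 + np.exp(-true_logit))

# Harmonic terms with a 7-day period capture the weekly cyclical pattern.
X = np.column_stack([np.sin(2 * np.pi * days / 7), np.cos(2 * np.pi * days / 7)])
model = LogisticRegression().fit(X, event)
print(np.round(model.coef_, 2), np.round(model.intercept_, 2))   # clearly positive sine coefficient recovers the cycle
```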

In terms of modeling discontinuous change patterns of variables, Singer and Willett (2003) and Bollen and Curran (2006) provided guidance on modeling procedures using either the multilevel modeling or structural equation modeling framework. Here I briefly discuss two additional modeling techniques that can achieve similar research goals: spline regression and catastrophe models.

Spline regression is used to model a continuous variable that changes its trajectory at a particular time point (see Marsh & Cormier, 2001, for technical details). For example, newcomers’ satisfaction with coworkers might increase steadily immediately after they enter the organization. Then, due to a critical organizational event (e.g., the downsizing of the company, or a newly introduced policy to weed out poor performers in the newcomer cohort), newcomers’ coworker satisfaction may start to drop. A spline model can be used to capture this dramatic change in the trend of newcomer attitudes in response to the event (see Figure 4 for an illustration of this example). The time points at which the variable changes its trajectory are called spline knots; at the spline knots, two regression lines connect. The location of the spline knots may be known ahead of time, but sometimes the location and number of spline knots are unknown before data collection. Different spline models and estimation techniques have been developed to handle these different scenarios (Marsh & Cormier, 2001). In general, spline models can be considered dummy-variable-based models with continuity constraints. Some forms of spline models are equivalent to piecewise linear regression models and are quite easy to implement (Pindyck & Rubinfeld, 1998).
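
The piecewise linear case with a single, known knot is particularly easy to fit; the sketch below (hypothetical satisfaction data and an assumed knot at week 20, purely for illustration) shows the dummy-variable formulation with a continuity constraint:

```python
import numpy as np

rng = np.random.default_rng(4)
weeks = np.arange(40, dtype=float)
knot = 20.0   # assumed timing of the critical organizational event

# Hypothetical newcomer coworker satisfaction: rising before the event, declining after it.
y = 3 + 0.05 * weeks - 0.12 * np.maximum(weeks - knot, 0) + rng.normal(0, 0.1, weeks.size)

# Spline (piecewise linear) regression with one knot: the hinge term max(0, t - knot)
# lets the slope change at the knot while forcing the two segments to meet there.
X = np.column_stack([np.ones_like(weeks), weeks, np.maximum(weeks - knot, 0)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(coef, 3))   # intercept, pre-knot slope, and the change in slope after the knot
```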

Hypothetical illustration of spline regression: The discontinuous change in newcomers’ satisfaction with coworkers over time.

Catastrophe models can also be used to describe “sudden” (i.e., catastrophic) discontinuous change in a dynamic system. For example, some systems in organizations develop from one stable state, through a region of uncertainty, to another stable state (e.g., perceptions of performance; Hanges, Braverman, & Rentsch, 1991). This nonlinear dynamic change pattern can be described by a cusp model, one of the most popular catastrophe models in the social sciences. Researchers have applied catastrophe models to understand various types of behavior at work and in organizations (see Guastello, 2013, for a summary). Estimation procedures are also readily available for fitting catastrophe models to empirical data (see the technical introductions in Guastello, 2013).
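
For reference, in one common parameterization (standard textbook notation, not tied to any particular study cited here) the cusp model describes the system’s equilibria as the surface

```latex
\frac{\partial V}{\partial y} = y^{3} - b\,y - a = 0,
\qquad V(y; a, b) = \tfrac{1}{4}y^{4} - \tfrac{1}{2}b\,y^{2} - a\,y,
```

where y is the behavioral outcome, a is the asymmetry (normal) control factor, and b is the bifurcation (splitting) control factor; when b is sufficiently large, small changes in a can produce sudden jumps between the two stable branches of the surface.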

Statistical Techniques Question 5: Could you speculate on the “next big thing” in conceptual or methodological advances in longitudinal research? Specifically, describe a novel idea or specific data analytic model that is rarely used in longitudinal studies in our literature, but could serve as a useful conceptual or methodological tool for future science in work, aging and retirement.

Generally, but mostly on the conceptual level, I think we will see an increased use of computational models to assess theory, design, and analysis. Indeed, I think this will be as big as multilevel analysis in future years, though I cannot predict the rate at which it will happen. The primary factors slowing the rate of adoption are limited knowledge of how to build such models and ignorance of the cost of not doing so (cf. Vancouver, Tamanini et al., 2010). Factors that will speed its adoption are easy-to-use modeling software and training opportunities. My coauthor and I recently published a tutorial on computational modeling (Vancouver & Weinhardt, 2012), and we provide more details on how to use a specific, free, easy-to-use modeling platform on our web site (https://sites.google.com/site/motivationmodeling/home).
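
To give a flavor of what such a model looks like (a deliberately minimal sketch with made-up parameters, not one of the models from the cited tutorial), the snippet below encodes a basic discrepancy-reduction idea: the change in a state at each step is proportional to the gap between a goal and the current state.

```python
# Minimal dynamic computational model: proportional discrepancy reduction toward a goal.
goal = 100.0
state = 20.0
rate = 0.2            # assumed: proportion of the remaining gap closed per time step

trajectory = [state]
for step in range(25):
    discrepancy = goal - state
    state += rate * discrepancy      # negative feedback: change is driven by the remaining gap
    trajectory.append(state)

# The model predicts rapid early gains that level off as the goal is approached.
print([round(v, 1) for v in trajectory[:6]])
```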

On the methodology level I think research simulations (i.e., virtual worlds) will increase in importance. They offer a great deal of control and the ability to measure many variables continuously or frequently. On the analysis level I anticipate an increased use of Bayesian and Hierarchical Bayesian analysis, particularly to assess computational model fits ( Kruschke, 2010 ; Rouder, & Lu, 2005 ; Wagenmakers, 2007 ).

I predict that significant advances will be made in various areas in the near future through the appropriate application of mixture latent modeling approaches. These approaches combine different latent variable techniques, such as latent growth modeling, latent class modeling, latent profile analysis, and latent transition analysis, into a unified analytical model (Wang & Hanges, 2011). They can also integrate continuous and discrete variables, as either predictor or outcome variables, in a single analytical model to describe and explain simultaneous quantitative and qualitative changes over time. In a recent study, my coauthor and I applied a mixture latent Markov model to understand the retirement process (Wang & Chan, 2011). Despite, or rather because of, the power and flexibility of these advanced mixture techniques to fit diverse models to longitudinal data, I will repeat the caution I made over a decade ago—that the application of these complex models to assess changes over time should be guided by adequate theories and relevant previous empirical findings (Chan, 1998).

My hope or wish for the next big thing is the use of longitudinal methods to integrate the micro and macro domains of our literature on work-related phenomena. This will entail combining aspects of growth modeling with multi-level processes. Although I do not have a particular conceptual framework in mind to illustrate this, my reasoning is based on the simple notion that it is the people who make the place. Therefore, it seems logical that we could, for example, study change in some aspect of firm performance across time as a function of change in some aspect of individual behavior and/or attitudes. Another example could be that we can study change in household well-being throughout the retirement process as a function of change in the two partners’ individual well-being over time. The analytical tools exist for undertaking such analyses. What are lacking at this point are the conceptual frameworks.

I hope the next big thing for longitudinal research will be dynamic computational models ( Ilgen & Hulin, 2000 ; Miller & Page, 2007 ; Weinhardt & Vancouver, 2012 ), which encode theory in a manner that is appropriately longitudinal/dynamic. If most theories are indeed theories of change, then this advancement promises to revolutionize what passes for theory in the organizational sciences (i.e., a computational model is a formal theory, with much more specific, risky, and therefore more meaningful predictions about phenomena—in comparison to the informal verbal theories that currently dominate and are somewhat vague with respect to time). My preferred approach is iterative: (a) authors first collect longitudinal data, then (b) inductively build a parsimonious computational model that can reproduce the data, then (c) collect more longitudinal data and consider its goodness of fit with the model, then (d) suggest possible model modifications, and then repeat steps (c) and (d) iteratively until some convergence is reached (e.g., Stasser, 2000 , 1988 describes one such effort in the context of group discussion and decision making theory). Exactly how to implement all the above steps is not currently well known, but developments in this area can potentially change what we think good theory is.

I am uncertain whether my “next big thing” truly reflects the wave of the future, or if it instead simply reflects my own hopes for where longitudinal research should head in our field. I will play it safe and treat it as the latter. Consistent with several other responses to this question, I hope that researchers will soon begin to incorporate far more complex dynamics of processes into both their theorizing and their methods of analysis. Although process dynamics can (and do) occur at all levels of analysis, I am particularly excited by the prospect of linking them across at least adjacent levels. For example, basic researchers interested in the dynamic aspects of affect recently have begun theorizing and modeling emotional experiences using various forms of differential structural equation or state-space models (e.g. Chow et al. , 2005 ; Kuppens, Oravecz, & Tuerlinckx, 2010 ), and, as the resulting parameters that describe within-person dynamics can be aggregated to higher levels of analysis (e.g., Beal, 2014 ; Wang, Hamaker, & Bergeman, 2012 ), they are inherently multilevel.

Another example of models that capture this complexity, and that are increasingly used in both immediate and longer-term longitudinal research, is the family of multivariate latent change score models (Ferrer & McArdle, 2010; McArdle, 2009; Liu et al., 2016). These models extend latent growth models (LGMs) to include a broader array of sources of change (e.g., autoregressive and cross-lagged factors) and consequently capture more of the complexity of changes that can occur in one or more variables measured over time. All of these models share a common interest in modeling the underlying dynamic patterns of a variable (e.g., linear, curvilinear, or exponential growth, cyclical components, feedback processes), while also taking into account the “shocks” to the underlying system (e.g., affective events, organizational changes), allowing them to assess the complexity of dynamic processes with greater accuracy and flexibility (Wang et al., 2016).
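
In the univariate (“dual change”) form, the model can be written compactly as (standard notation from this literature; the bivariate version adds coupling terms from the other variable’s previous score):

```latex
y_{t} = y_{t-1} + \Delta y_{t},
\qquad
\Delta y_{t} = \alpha\, g + \beta\, y_{t-1} + \zeta_{t},
```

where g is a latent constant-change (slope) factor, \alpha its loading, \beta a proportional (self-feedback) parameter, and \zeta_{t} a disturbance; adding a term such as \gamma\, x_{t-1} to \Delta y_{t} yields the cross-lagged coupling referred to above.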

I believe that applying a dynamical systems framework will greatly advance our research. Applying the dynamic systems framework (e.g., DeShon, 2012 ; Vancouver, Weinhardt, & Schmidt, 2010 ; Wang et al. , 2016 ) forces us to more explicitly conceptualize how changes unfold over time in a particular system. Dynamic systems models can also answer the why question better by specifying how elements of a system work together over time to bring about the observed change at the system level. Studies on dynamic systems models also tend to provide richer data and more detailed analyses on the processes (i.e., the black boxes not measured in traditional research) in a system. A number of research design and analysis methods relevant for dynamical systems frameworks are available, such as computational modeling, ESM, event history analyses, and time series analyses ( Wang et al. , 2016 ).

M. Wang’s work on this article was supported in part by the Netherlands Institute for Advanced Study in the Humanities and Social Sciences.

Ainslie G. , & Haslam N . ( 1992 ). Hyperbolic discounting . In G. Loewenstein J. Elster (Eds.), Choice over time (pp. 57 – 92 ). New York, NY : Russell Sage Foundation .


Ancona D. G. Goodman P. S. Lawrence B. S. , & Tushman M. L . ( 2001 ). Time: A new research lens . Academy of Management Review , 26 , 645 – 663 . doi: 10.5465/AMR.2001.5393903

Ashford S. J . ( 1986 ). The role of feedback seeking in individual adaptation: A resource perspective . Academy of Management Journal , 29 , 465 – 487 . doi: 10.2307/256219

Bayley N . ( 1956 ). Individual patterns of development . Child Development , 27 , 45 – 74 . doi: 10.2307/1126330

Beal D. J . ( 2014 ). Time and emotions at work . In Shipp A. J. Fried Y. (Eds.), Time and work (Vol. 1 , pp. 40 – 62 ). New York, NY : Psychology Press .

Beal D. J . ( 2015 ). ESM 2.0: State of the art and future potential of experience sampling methods in organizational research . Annual Review of Organizational Psychology and Organizational Behavior , 2 , 383 – 407 .

Beal D. J. , & Ghandour L . ( 2011 ). Stability, change, and the stability of change in daily workplace affect . Journal of Organizational Behavior , 32 , 526 – 546 . doi: 10.1002/job.713

Beal D. J. , & Weiss H. M . ( 2013 ). The episodic structure of life at work . In Bakker A. B. Daniels K. (Eds.), A day in the life of a happy worker (pp. 8 – 24 ). London, UK : Psychology Press .

Beal D. J. , & Weiss H. M . ( 2003 ). Methods of ecological momentary assessment in organizational research . Organizational Research Methods , 6 , 440 – 464 . doi: 10.1177/1094428103257361

Beal D. J. Weiss H. M. Barros E. , & MacDermid S. M . ( 2005 ). An episodic process model of affective influences on performance . Journal of Applied Psychology , 90 , 1054 . doi: 10.1037/0021-9010.90.6.1054

Bentein K. Vandenberghe C. Vandenberg R. , & Stinglhamber F . ( 2005 ). The role of change in the relationship between commitment and turnover: a latent growth modeling approach . Journal of Applied Psychology , 90 , 468 – 482 . doi: 10.1037/0021-9010.90.3.468

Bliese P. D. , & Ployhart R. E . ( 2002 ). Growth modeling using random coefficient models: Model building, testing, and illustrations . Organizational Research Methods , 5 , 362 – 387 . doi: 10.1177/109442802237116

Bolger N. Davis A. , & Rafaeli E . ( 2003 ). Diary methods: Capturing life as it is lived . Annual Review of Psychology , 54 , 579 – 616 . doi: 10.1146/annurev.psych.54.101601.145030

Bolger N. , & Laurenceau J.-P . ( 2013 ). Intensive longitudinal methods: An introduction to diary and experience sampling research . New York, NY : Guilford .

Bollen K. A. , & Curran P. J . ( 2006 ). Latent curve models: A structural equation approach . Hoboken, NJ : Wiley .

Carsten J. M. , & Spector P. E . ( 1987 ). Unemployment, job satisfaction, and employee turnover: A meta-analytic test of the Muchinsky model . Journal of Applied Psychology , 72 , 374 . doi: 10.1037/0021-9010.72.3.374

Castiglioni L. Pforr K. , & Krieger U . ( 2008 ). The effect of incentives on response rates and panel attrition: Results of a controlled experiment . Survey Research Methods , 2 , 151 – 158 . doi: 10.18148/srm/2008.v2i3.599

Chan D . ( 1998 ). The conceptualization and analysis of change over time: An integrative approach incorporating longitudinal mean and covariance structures analysis (LMACS) and multiple indicator latent growth modeling (MLGM) . Organizational Research Methods , 1 , 421 – 483 . doi: 10.1177/109442819814004

Chan D . ( 2002 ). Longitudinal modeling . In Rogelberg S . Handbook of research methods in industrial and organizational psychology (pp. 412 – 430 ). Malden, MA : Blackwell Publishers, Inc .

Chan D . ( 2010 ). Advances in analytical strategies . In S. Zedeck (Ed.), APA handbook of industrial and organizational psychology (Vol. 1 ), Washington, DC : APA .

Chan D . ( 2014 ). Time and methodological choices . In In A. J. Shipp Y. Fried (Eds.), Time and work (Vol. 2): How time impacts groups, organizations, and methodological choices . New York, NY : Psychology Press .

Chan D. , & Schmitt N . ( 2000 ). Interindividual differences in intraindividual changes in proactivity during organizational entry: A latent growth modeling approach to understanding newcomer adaptation . Journal of Applied Psychology , 85 , 190 – 210 .

Chow S. M. Ram N. Boker S. M. Fujita F. , & Clore G . ( 2005 ). Emotion as a thermostat: representing emotion regulation using a damped oscillator model . Emotion , 5 , 208 – 225 . doi: 10.1037/1528-3542.5.2.208

Cole M. S. Bedeian A. G. , & Feild H. S . ( 2006 ). The measurement equivalence of web-based and paper-and-pencil measures of transformational leadership a multinational test . Organizational Research Methods , 9 , 339 – 368 . doi: 10.1177/1094428106287434

Cole D. A. , & Maxwell S. E . ( 2003 ). Testing mediational models with longitudinal data: Questions and tips in the use of structural equation modeling . Journal of Abnormal Psychology , 112 , 558 – 577 . doi: 10.1037/0021-843X.112.4.558

Csikszentmihalyi M. , & Larson R . ( 1987 ). Validity and reliability of the experience sampling method . Journal of Nervous and Mental Disease , 175 , 526 – 536 .

DeShon R. P . ( 2012 ). Multivariate dynamics in organizational science . In S. W. J. Kozlowski (Ed.), The Oxford Handbook of Organizational Psychology (pp. 117 – 142 ). New York, NY : Oxford University Press .

Diener E. Inglehart R. , & Tay L . ( 2013 ). Theory and validity of life satisfaction scales . Social Indicators Research , 112 , 497 – 527 . doi: 10.1007/s11205-012-0076-y

Enders C. K . ( 2001 ). A primer on maximum likelihood algorithms available for use with missing data . Structural Equation Modeling , 8 , 128 – 141 .

Enders C. K . ( 2010 ). Applied missing data analysis . New York City, NY : The Guilford Press .

Gersick C. J . ( 1988 ). Time and transition in work teams: Toward a new model of group development . Academy of Management Journal , 31 , 9 – 41 . doi: 10.2307/256496

Graham J. W . ( 2009 ). Missing data analysis: Making it work in the real world . Annual Review of Psychology , 60 , 549 – 576 . doi: 10.1146/annurev.psych.58.110405.085530

Ferrer E. , & McArdle J. J . ( 2010 ). Longitudinal modeling of developmental changes in psychological research . Current Directions in Psychological Science , 19 , 149 – 154 . doi: 10.1177/0963721410370300

Fisher G. G. Chaffee D. S. , & Sonnega A . ( 2016 ). Retirement timing: A review and recommendations for future research . Work, Aging and Retirement , 2 , 230 – 261 . doi: 10.1093/workar/waw001

Fokianos K. , & Kedem B . ( 2003 ). Regression theory for categorical time series . Statistical Science , 357 – 376 . doi: 10.1214/ss/1076102425

Fraley R. C . ( 2002 ). Attachment stability from infancy to adulthood: Meta-analysis and dynamic modeling of developmental mechanisms . Personality and Social Psychology Review , 6 , 123 – 151 . doi: 10.1207/S15327957PSPR0602_03

Fredrickson B. L . ( 2000 ). Extracting meaning from past affective experiences: The importance of peaks, ends, and specific emotions . Cognition and Emotion , 14 , 577 – 606 .

Fumagalli L. Laurie H. , & Lynn P . ( 2013 ). Experiments with methods to reduce attrition in longitudinal surveys . Journal of the Royal Statistical Society: Series A (Statistics in Society) , 176 , 499 – 519 . doi: 10.1111/j.1467-985X.2012.01051.x

Golembiewski R. T. Billingsley K. , & Yeager S . ( 1976 ). Measuring change and persistence in human affairs: Types of change generated by OD designs . Journal of Applied Behavioral Science , 12 , 133 – 157 . doi: 10.1177/002188637601200201

Gosling S. D. Vazire S. Srivastava S. , & John O. P . ( 2004 ). Should we trust web-based studies? A comparative analysis of six preconceptions about internet questionnaires . American Psychologist , 59 , 93 – 104 . doi: 10.1037/0003-066X.59.2.93

Green A. S. Rafaeli E. Bolger N. Shrout P. E. , & Reis H. T . ( 2006 ). Paper or plastic? Data equivalence in paper and electronic diaries . Psychological Methods , 11 , 87 – 105 . doi: 10.1037/1082-989X.11.1.87

Groves R. M. Couper M. P. Presser S. Singer E. Tourangeau R. Acosta G. P. , & Nelson L . ( 2006 ). Experiments in producing nonresponse bias . Public Opinion Quarterly , 70 , 720 – 736 . doi: 10.1093/poq/nfl036

Guastello S. J . ( 2013 ). Chaos, catastrophe, and human affairs: Applications of nonlinear dynamics to work, organizations, and social evolution . New York, NY : Psychology Press

Hanges P. J. Braverman E. P. , & Rentsch J. R . ( 1991 ). Changes in raters’ perceptions of subordinates: A catastrophe model . Journal of Applied Psychology , 76 , 878 – 888 . doi: 10.1037/0021-9010.76.6.878

Heybroek L. Haynes M. , & Baxter J . ( 2015 ). Life satisfaction and retirement in Australia: A longitudinal approach . Work, Aging and Retirement , 1 , 166 – 180 . doi: 10.1093/workar/wav006

Hulin C. L. Henry R. A. , & Noon S. L . ( 1990 ). Adding a dimension: Time as a factor in the generalizability of predictive relationships . Psychological Bulletin , 107 , 328 – 340 .

Humphreys L. G . ( 1968 ). The fleeting nature of the prediction of college academic success . Journal of Educational Psychology , 59 , 375 – 380 .

Ilgen D. R. , & Hulin C. L . (Eds.). ( 2000 ). Computational modeling of behavior in organizations: The third scientific discipline . Washington, DC : American Psychological Association .

James L. R. Mulaik S. A. , & Brett J. M . ( 1982 ). Causal analysis: Assumptions, models, and data . Beverly Hills, CA : Sage Publications .

Kahneman D . ( 1999 ). Objective happiness . In D. Kahneman E. Diener N. Schwarz (Eds.), Well-being: The foundations of hedonic psychology (pp. 3 – 25 ). New York, NY : Russell Sage Foundation .

Keil C. T. , & Cortina J. M . ( 2001 ). Degradation of validity over time: A test and extension of Ackerman’s model . Psychological Bulletin , 127 , 673 – 697 .

Kessler R. C. , & Greenberg D. F . ( 1981 ). Linear panel analysis: Models of quantitative change . New York, NY : Academic Press .

Kruschke J. K . ( 2010 ). What to believe: Bayesian methods for data analysis . Trends in Cognitive Science , 14 : 293 – 300 . doi: 10.1016/j.tics.2010.05.001

Kuljanin G. Braun M. T. , & DeShon R. P . ( 2011 ). A cautionary note on modeling growth trends in longitudinal data . Psychological Methods , 16 , 249 – 264 . doi: 10.1037/a0023348

Kuppens P. Oravecz Z. , & Tuerlinckx F . ( 2010 ). Feelings change: accounting for individual differences in the temporal dynamics of affect . Journal of Personality and Social Psychology , 99 , 1042 – 1060 . doi: 10.1037/a0020962

Lance C. E. , & Vandenberg R. J . (Eds.). ( 2009 ) Statistical and methodological myths and urban legends: Doctrine, verity and fable in the organizational and social sciences . New York, NY : Taylor & Francis .

Langeheine R. , & Van de Pol F . ( 2002 ). Latent Markov chains . In J. A. Hagenaars A. L. McCutcheon (Eds.), Applied latent class analysis (pp. 304 – 341 ). New York City, NY : Cambridge University Press .

Laurie H . ( 2008 ). Minimizing panel attrition . In S. Menard (Ed.), Handbook of longitudinal research: Design, measurement, and analysis . Burlington, MA : Academic Press .

Laurie H. , & Lynn P . ( 2008 ). The use of respondent incentives on longitudinal surveys (Working Paper No. 2008–42 ) . Retrieved from Institute of Social and Economic Research website: https://www.iser.essex.ac.uk/files/iser_working_papers/2008–42.pdf

Laurie H. Smith R. , & Scott L . ( 1999 ). Strategies for reducing nonresponse in a longitudinal panel survey . Journal of Official Statistics , 15 , 269 – 282 .

Little R. J. A. , & Rubin D. B . ( 1987 ). Statistical analysis with missing data . New York, NY : Wiley .

Liu Y. Mo S. Song Y. , & Wang M . ( 2016 ). Longitudinal analysis in occupational health psychology: A review and tutorial of three longitudinal modeling techniques . Applied Psychology: An International Review , 65 , 379 – 411 . doi: 10.1111/apps.12055

Madero-Cabib I Gauthier J. A. , & Le Goff J. M . ( 2016 ). The influence of interlocked employment-family trajectories on retirement timing . Work, Aging and Retirement , 2 , 38 – 53 . doi: 10.1093/workar/wav023

Marsh L. C. , & Cormier D. R . ( 2001 ). Spline regression models . Thousand Oaks, CA : Sage Publications .

Martin G. L. , & Loes C. N . ( 2010 ). What incentives can teach us about missing data in longitudinal assessment . New Directions for Institutional Research , S2 , 17 – 28 . doi: 10.1002/ir.369

Meade A. W. Michels L. C. , & Lautenschlager G. J . ( 2007 ). Are Internet and paper-and-pencil personality tests truly comparable? An experimental design measurement invariance study . Organizational Research Methods , 10 , 322 – 345 . doi: 10.1177/1094428106289393

McArdle JJ . ( 2007 ). Dynamic structural equation modeling in longitudinal experimental studies . In K.V. Montfort H. Oud and A. Satorra et al. (Eds.), Longitudinal Models in the Behavioural and Related Sciences (pp. 159 – 188 ). Mahwah, NJ : Lawrence Erlbaum .

McArdle J. J . ( 2009 ). Latent variable modeling of differences and changes with longitudinal data . Annual Review of Psychology , 60 , 577 – 605 . doi: 10.1146/annurev.psych.60.110707.163612

McArdle J. J. Grimm K. J. Hamagami F. Bowles R. P. , & Meredith W . ( 2009 ). Modeling life-span growth curves of cognition using longitudinal data with multiple samples and changing scales of measurement . Psychological methods , 14 , 126 – 149 .

McGrath J. E. , & Rotchford N. L . ( 1983 ). Time and behavior in organizations . Research in Organizational Behavior , 5 , 57 – 101 .

Miller J. H. , & Page S. E . ( 2007 ). Complex adaptive systems: An introduction to computational models of social life . Princeton, NJ, USA : Princeton University Press .

Mitchell T. R. , & James L. R . ( 2001 ). Building better theory: Time and the specification of when things happen . Academy of Management Review , 26 , 530 – 547 . doi: 10.5465/AMR.2001.5393889

Morrison E. W . ( 2002 ). Information seeking within organizations . Human Communication Research , 28 , 229 – 242 . doi: 10.1111/j.1468-2958.2002.tb00805.x

Muthén B . ( 2001 ). Second-generation structural equation modeling with a combination of categorical and continuous latent variables: New opportunities for latent class–latent growth modeling . In L. M. Collins A. G. Sayer (Eds.), New methods for the analysis of change. Decade of behavior (pp. 291 – 322 ). Washington, DC : American Psychological Association .

Muthén L. K. , & Muthén B. O . (1998– 2012 ). Mplus user’s guide . 7th ed. Los Angeles, CA : Muthén & Muthén .

Newman D. A . ( 2003 ). Longitudinal modeling with randomly and systematically missing data: A simulation of ad hoc, maximum likelihood, and multiple imputation techniques . Organizational Research Methods , 6 , 328 – 362 . doi: 10.1177/1094428103254673

Newman D. A . ( 2009 ). Missing data techniques and low response rates: The role of systematic nonresponse parameters . In C. E. Lance R. J. Vandenberg (Eds.), Statistical and methodological myths and urban legends: Doctrine, verity, and fable in the organizational and social sciences (pp. 7 – 36 ). New York, NY : Routledge .

Newman D. A. , & Cottrell J. M . ( 2015 ). Missing data bias: Exactly how bad is pairwise deletion? In C. E. Lance R. J. Vandenberg (Eds.), More statistical and methodological myths and urban legends , pp. 133 – 161 . New York, NY : Routledge .

Newman D. A . ( 2014 ). Missing data five practical guidelines . Organizational Research Methods , 17 , 372 – 411 . doi: 10.1177/1094428114548590

Pindyck R. S. , & Rubinfeld D. L . ( 1998 ). Econometric Models and Economic Forecasts . Auckland, New Zealand : McGraw-Hill .

Pinquart M. , & Schindler I . ( 2007 ). Changes of life satisfaction in the transition to retirement: A latent-class approach . Psychology and Aging , 22 , 442 – 455 . doi: 10.1037/0882-7974.22.3.442

Ployhart R. E. , & Hakel M. D . ( 1998 ). The substantive nature of performance variability: Predicting interindividual differences in intraindividual performance . Personnel Psychology , 51 , 859 – 901 . doi: 10.1111/j.1744-6570.1998.tb00744.x

Ployhart R. E. , & Vandenberg R. J . ( 2010 ). Longitudinal Research: The theory, design, and analysis of change . Journal of Management , 36 , 94 – 120 . doi: 10.1177/0149206309352110

Podsakoff P. M. MacKenzie S. B. Lee J. Y. , & Podsakoff N. P . ( 2003 ). Common method biases in behavioral research: a critical review of the literature and recommended remedies . Journal of Applied Psychology , 88 , 879 – 903 . doi: 10.1037/0021-9010.88.5.879

Redelmeier D. A. , & Kahneman D . ( 1996 ). Patients’ memories of painful medical treatments: real-time and retrospective evaluations of two minimally invasive procedures . Pain , 66 , 3 – 8 .

Robinson M. D. , & Clore G. L . ( 2002 ). Belief and feeling: evidence for an accessibility model of emotional self-report . Psychological Bulletin , 128 , 934 – 960 .

Rogelberg S. G. Conway J. M. Sederburg M. E. Spitzmuller C. Aziz S. , & Knight W. E . ( 2003 ). Profiling active and passive nonrespondents to an organizational survey . Journal of Applied Psychology , 88 , 1104 – 1114 . doi: 10.1037/0021-9010.88.6.1104

Rogosa D. R . ( 1995 ). Myths and methods: “Myths about longitudinal research” plus supplemental questions . In J. M. Gottman (Ed.), The analysis of change (pp. 3 – 66 ). Mahwah, NJ : Lawrence Erlbaum .

Rouder J. N. , & Lu J . ( 2005 ). An introduction to Bayesian hierarchical models with an application in the theory of signal detection . Psychonomic Bulletin & Review , 12 , 573 – 604 . doi: 10.3758/BF03196750

Rubin D. B . ( 1987 ). Multiple imputation for nonresponse in surveys . New York, NY : John Wiley .

Schafer J. L. , & Graham J. W . ( 2002 ). Missing data: Our view of the state of the art . Psychological Methods , 7 , 147 – 177 .

Schaie K. W . ( 1965 ). A general model for the study of developmental problems . Psychological bulletin , 64 , 92 – 107 . doi: 10.1037/h0022371

Schmitt N . ( 1982 ). The use of analysis of covariance structures to assess beta and gamma change . Multivariate Behavioral Research , 17 , 343 – 358 . doi: 10.1207/s15327906mbr1703_3

Shadish W. R. Cook T. D. , & Campbell D. T . ( 2002 ). Experimental and quasi-experimental designs for generalized causal inference . Boston, MA : Houghton Mifflin .

Shingles R . ( 1985 ). Causal inference in cross-lagged panel analysis . In H. M. Blalock (Ed.), Causal models in panel and experimental design (pp. 219 – 250 ). New York, NY : Aldine .

Singer E. , & Kulka R. A . ( 2002 ). Paying respondents for survey participation . In M. ver Ploeg R. A. Moffit , & C. F. Citro (Eds.), Studies of welfare populations: Data collection and research issues (pp. 105 – 128 ). Washington, DC : National Research Council .

Singer J. D. , & Willett J. B . ( 2003 ). Applied longitudinal data analysis: Modeling change and event occurrence . New York, NY : Oxford university press .

Sitzmann T. , & Yeo G . ( 2013 ). A meta-analytic investigation of the within-person self-efficacy domain: Is self-efficacy a product of past performance or a driver of future performance? Personnel Psychology , 66 , 531 – 568 . doi: 10.1111/peps.12035

Solomon R. L. , & Corbit J. D . ( 1974 ). An opponent-process theory of motivation: I. Temporal dynamics of affect . Psychological Review , 81 , 119 – 145 . doi: 10.1037/h0036128

Stasser G . ( 1988 ). Computer simulation as a research tool: The DISCUSS model of group decision making . Journal of Experimental Social Psychology , 24 , 393 – 422 . doi: 10.1016/ 0022-1031(88)90028-5

Stasser G . ( 2000 ). Information distribution, participation, and group decision: Explorations with the DISCUSS and SPEAK models . In D. R. Ilgen R. Daniel , & C. L. Hulin (Eds.), Computational modeling of behavior in organizations: The third scientific discipline (pp. 135 – 161 ). Washington, DC : American Psychological Association .

Stone-Romero E. F. , & Rosopa P. J . ( 2010 ). Research design options for testing mediation models and their implications for facets of validity . Journal of Managerial Psychology , 25 , 697 – 712 . doi: 10.1108/02683941011075256

Tay L . ( 2015 ). Expimetrics [Computer software] . Retrieved from http://www.expimetrics.com

Tay L. Chan D. , & Diener E . ( 2014 ). The metrics of societal happiness . Social Indicators Research , 117 , 577 – 600 . doi: 10.1007/s11205-013-0356-1

Taris T . ( 2000 ). Longitudinal data analysis . London, UK : Sage Publications .

Tesluk P. E. , & Jacobs R. R . ( 1998 ). Toward an integrated model of work experience . Personnel Psychology , 51 , 321 – 355 . doi: 10.1111/j.1744-6570.1998.tb00728.x

Tisak J. , & Tisak M. S . ( 2000 ). Permanency and ephemerality of psychological measures with application to organizational commitment . Psychological Methods , 5 , 175 – 198 .

Uy M. A. Foo M. D. , & Aguinis H . ( 2010 ). Using experience sampling methodology to advance entrepreneurship theory and research . Organizational Research Methods , 13 , 31 – 54 . doi: 10.1177/1094428109334977

Vancouver J. B. Gullekson N. , & Bliese P . ( 2007 ). Lagged Regression as a Method for Causal Analysis: Monte Carlo Analyses of Possible Artifacts . Poster submitted to the annual meeting of the Society for Industrial and Organizational Psychology, New York .

Vancouver J. B. Tamanini K. B. , & Yoder R. J . ( 2010 ). Using dynamic computational models to reconnect theory and research: Socialization by the proactive newcomer example . Journal of Management , 36 , 764 – 793 . doi: 10.1177/0149206308321550

Vancouver J. B. , & Weinhardt J. M . ( 2012 ). Modeling the mind and the milieu: Computational modeling for micro-level organizational researchers . Organizational Research Methods , 15 , 602 – 623 . doi: 10.1177/1094428112449655

Vancouver J. B. Weinhardt J. M. , & Schmidt A. M . ( 2010 ). A formal, computational theory of multiple-goal pursuit: integrating goal-choice and goal-striving processes . Journal of Applied Psychology , 95 , 985 – 1008 . doi: 10.1037/a0020628

Vandenberg R. J. , & Lance C. E . ( 2000 ). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research . Organizational research methods , 3 , 4 – 70 . doi: 10.1177/109442810031002

Wagenmakers E. J . ( 2007 ). A practical solution to the pervasive problems of p values . Psychonomic Bulletin & Review , 14 , 779 – 804 . doi: 10.3758/BF03194105

Wang M . ( 2007 ). Profiling retirees in the retirement transition and adjustment process: Examining the longitudinal change patterns of retirees’ psychological well-being . Journal of Applied Psychology , 92 , 455 – 474 . doi: 10.1037/0021-9010.92.2.455

Wang M. , & Bodner T. E . ( 2007 ). Growth mixture modeling: Identifying and predicting unobserved subpopulations with longitudinal data . Organizational Research Methods , 10 , 635 – 656 . doi: 10.1177/1094428106289397

Wang M. , & Chan D . ( 2011 ). Mixture latent Markov modeling: Identifying and predicting unobserved heterogeneity in longitudinal qualitative status change . Organizational Research Methods , 14 , 411 – 431 . doi: 10.1177/1094428109357107

Wang M. , & Hanges P . ( 2011 ). Latent class procedures: Applications to organizational research . Organizational Research Methods , 14 , 24 – 31 . doi: 10.1177/1094428110383988

Wang M. Henkens K. , & van Solinge H . ( 2011 ). Retirement adjustment: A review of theoretical and empirical advancements . American Psychologist , 66 , 204 – 213 . doi: 10.1037/a0022414

Wang M. Zhou L. , & Zhang Z . ( 2016 ). Dynamic modeling . Annual Review of Organizational Psychology and Organizational Behavior , 3 , 241 – 266 .

Wang L. P. Hamaker E. , & Bergeman C. S . ( 2012 ). Investigating inter-individual differences in short-term intra-individual variability . Psychological Methods , 17 , 567 – 581 . doi: 10.1037/a0029317

Warren D. A . ( 2015 ). Pathways to retirement in Australia: Evidence from the HILDA survey . Work, Aging and Retirement , 1 , 144 – 165 . doi: 10.1093/workar/wau013

Weikamp J. G. , & Göritz A. S . ( 2015 ). How stable is occupational future time perspective over time? A six-wave study across 4 years . Work, Aging and Retirement , 1 , 369 – 381 . doi: 10.1093/workar/wav002

Weinhardt J. M. , & Vancouver J. B . ( 2012 ). Computational models and organizational psychology: Opportunities abound . Organizational Psychology Review , 2 , 267 – 292 . doi: 10.1177/2041386612450455

Weiss H. M. , & Cropanzano R . ( 1996 ). Affective Events Theory: A theoretical discussion of the structure, causes and consequences of affective experiences at work . Research in Organizational Behavior , 18 , 1 – 74 .

Zacks J. M. Speer N. K. Swallow K. M. Braver T. S. , & Reynolds J. R . ( 2007 ). Event perception: a mind-brain perspective . Psychological Bulletin , 133 , 273 – 293 . doi: 10.1037/0033-2909.133.2.273


What is a Longitudinal Study? Definition and Explanation


In this article, we’ll cover all you need to know about longitudinal research. 

Let’s take a closer look at the defining characteristics of longitudinal studies, review the pros and cons of this type of research, and share some useful longitudinal study examples. 

Content Index

  • What is a longitudinal study?
  • Types of longitudinal studies
  • Advantages and disadvantages of conducting longitudinal surveys
  • Longitudinal studies vs. cross-sectional studies
  • Types of surveys that use a longitudinal study
  • Longitudinal study examples

A longitudinal study is research conducted over an extended period of time. It is mostly used in medical research and in other areas like psychology or sociology.

When using this method, a longitudinal survey can pay off with actionable insights when you have the time to engage in a long-term research project.

Longitudinal studies often use surveys to collect data that is either qualitative or quantitative. Additionally, in a longitudinal study, a survey creator does not interfere with survey participants. Instead, the survey creator distributes questionnaires over time to observe changes in participants, behaviors, or attitudes. 

Many medical studies are longitudinal; researchers note and collect data from the same subjects over what can be many years.


Longitudinal studies are versatile, repeatable, and able to account for quantitative and qualitative data . Consider the three major types of longitudinal studies for future research:

Types of longitudinal studies

Panel study: A panel survey involves a sample of people drawn from a larger population and is conducted at specified intervals over an extended period.

One of the panel study’s essential features is that researchers collect data from the same sample at different points in time. Most panel studies are designed for quantitative analysis, though they may also be used to collect qualitative data.


Cohort Study: A cohort study samples a cohort (a group of people who typically experience the same event at a given point in time). Medical researchers tend to conduct cohort studies. Some might consider clinical trials similar to cohort studies. 

In cohort studies, researchers merely observe participants without intervention, unlike clinical trials in which participants undergo tests.

Retrospective study: A retrospective study uses already existing data, collected during previously conducted research with similar methodology and variables. 

While doing a retrospective study, the researcher uses an administrative database, pre-existing medical records, or one-to-one interviews.

As we’ve demonstrated, a longitudinal study is useful in science, medicine, and many other fields. There are many reasons why a researcher might want to conduct a longitudinal study. One of the essential reasons is that longitudinal studies give unique insights that many other types of research fail to provide.

Advantages of longitudinal studies

  • Greater validation: For a long-term study to be successful, objectives and rules must be established from the beginning. As it is a long-term study, its authenticity is verified in advance, which makes the results have a high level of validity.
  • Unique data: Most research studies collect short-term data to determine the cause and effect of what is being investigated. Longitudinal surveys follow the same principles but the data collection period is different. Long-term relationships cannot be discovered in a short-term investigation, but short-term relationships can be monitored in a long-term investigation.
  • Allow identifying trends: Whether in medicine, psychology, or sociology, the long-term design of a longitudinal study enables trends and relationships to be found within the data collected in real time. The previous data can be applied to know future results and have great discoveries.
  • Longitudinal surveys are flexible: Although a longitudinal study can be created to study a specific data point, the data collected can show unforeseen patterns or relationships that can be significant. Because this is a long-term study, the researchers have a flexibility that is not possible with other research formats.

Additional data points can be collected to study unexpected findings, allowing the survey to be adjusted based on the patterns that emerge.

Disadvantages of longitudinal studies

  • Research time: The main disadvantage of longitudinal surveys is that long-term research is more likely to give unpredictable results. For example, if the same participants cannot be reached for follow-up, the research cannot be carried out. It may also take several years before the data begins to produce observable patterns or relationships that can be monitored.
  • An unpredictability factor is always present: It must be taken into account that the initial sample can be lost over time. Because longitudinal studies involve the same subjects over a long period of time, what happens to them outside of data collection times can influence the data that is collected in the future. Some people may decide to stop participating in the research. Others may no longer fit the demographics required for the research. If these factors are not included in the initial research design, they could affect the findings that are generated.
  • Large samples are needed for the investigation to be meaningful: To develop relationships or patterns, a large amount of data must be collected and analyzed to generate results.
  • Higher costs: Without a doubt, the longitudinal survey is more complex and expensive. Being a long-term form of research, the costs of the study will span years or decades, compared to other forms of research that can be completed in a fraction of the time.


Longitudinal studies vs. Cross-sectional studies

Longitudinal studies are often confused with cross-sectional studies. Unlike longitudinal studies, where the research variables can change during a study, a cross-sectional study observes a single instance with all variables remaining the same throughout the study. A longitudinal study may follow up on a cross-sectional study to investigate the relationship between the variables more thoroughly.

The design of the study is highly dependent on the nature of the research questions. Whenever a researcher decides to collect data by surveying their participants, what matters most are the questions that are asked in the survey.


Knowing what information a study should gather is the first step in determining how to conduct the rest of the study. 

With a longitudinal study, you can measure and compare various business and branding aspects by deploying surveys. Some of the classic examples of surveys that researchers can use for longitudinal studies are:

Market trends and brand awareness: Use a market research survey and marketing survey to identify market trends and develop brand awareness. Through these surveys, businesses or organizations can learn what customers want and what they will discard. This study can be carried out repeatedly over time to assess market trends, as they are volatile and tend to change constantly.

Product feedback: If a business or brand launches a new product and wants to know how it is faring with consumers, product feedback surveys are a great option. Collect feedback from customers about the product over an extended time. Once you’ve collected the data, it’s time to put that feedback into practice and improve your offerings.

Customer satisfaction: Customer satisfaction surveys help an organization get to know the level of satisfaction or dissatisfaction among its customers. A longitudinal survey can gain feedback from new and regular customers for as long as you’d like to collect it, so it’s useful whether you’re starting a business or hoping to make some improvements to an established brand.

Employee engagement: When you check in regularly over time with a longitudinal survey, you’ll get a big-picture perspective of your company culture. Find out whether employees feel comfortable collaborating with colleagues and gauge their level of motivation at work.

Now that you know the basics of how researchers use longitudinal studies across several disciplines, let's review the following examples:

Example 1: Identical twins

Consider a study conducted to understand the similarities or differences between identical twins who are brought up together versus identical twins who were not. The study observes several variables, but the constant is that all the participants have identical twins.

In this case, researchers would want to observe these participants from childhood to adulthood, to understand how growing up in different environments influences traits, habits, and personality.


Over many years, researchers can see both sets of twins as they experience life without intervention. Because the participants share the same genes, it is assumed that any differences are due to environmental factors, but only a careful study can confirm those assumptions.

Example 2: Violence and video games

A group of researchers is studying whether there is a link between violence and video game usage. They collect a large sample of participants for the study. To reduce the amount of interference with their natural habits, these individuals come from a population that already plays video games. The age group is focused on teenagers (13-19 years old).

The researchers record how prone to violence the participants are at the onset, which creates a baseline for later comparisons. The researchers then give each participant a log to track how often and how long they play video games. This study can go on for months or years. During this time, the researchers can compare video game-playing behaviors with violent tendencies, thereby investigating whether there is a link between violence and video games.
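The analysis implied by this example can be sketched in a few lines of Python. This is a hypothetical illustration, not part of any actual study: invented participant IDs, baseline and follow-up violence-proneness scores, and logged weekly play hours are related to look for an association.

```python
# Hypothetical sketch of the example above: relate logged play time to the
# change between a baseline and a follow-up measure of violent tendencies.
# All values are invented for illustration.
import pandas as pd

log = pd.DataFrame({
    "participant":       [1, 2, 3, 4, 5],
    "baseline_score":    [12, 18, 9, 15, 11],   # violence-proneness at onset
    "followup_score":    [13, 17, 10, 19, 11],  # same measure a year later
    "weekly_play_hours": [4, 2, 6, 14, 3],      # from the participants' logs
})

# Within-person change relative to the baseline.
log["change"] = log["followup_score"] - log["baseline_score"]

# Correlation, not causation: this remains an observational design.
print(log[["weekly_play_hours", "change"]].corr())
```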

Conducting a longitudinal study with surveys is straightforward and applicable to almost any discipline. With our survey software you can easily start your own survey today. 


What is a Longitudinal Study? Definition, Types & Examples

Kate Williams

16 February 2024

Table Of Contents

  • What is a Longitudinal Study?
  • Types of Longitudinal Studies
  • Pros and Cons of Longitudinal Research Design
  • Examples of Longitudinal Surveys

Sonia was conflicted. A few months ago, a survey from a grocery delivery app had asked her if she preferred normal eggs or the free-range ones.

She was financially stressed and couldn’t afford to pay more for free-range eggs, so she picked the normal ones.

But last night, she had watched a popular documentary on Netflix about how hens were treated in cages and now felt much more strongly about wanting to buy free-range eggs.

There was no way for Sonia to communicate this new preference to her grocery delivery app.

But that’s the thing about consumer trends. They are constantly shifting, and one survey taken years ago is not going to give you an accurate picture of the shifts in trends.

That’s why your business needs to understand what a longitudinal study is.

At times, a one-off survey simply isn’t enough to give you the data you need. If you need to observe certain trends, behaviors, or preferences over time, you can use a longitudinal study.

The simplest way to understand a longitudinal study is to think of it as a survey taken over time. The passing of time could influence the responses of the same person to the same question. Sonia's preference for eggs, for example, changed after she watched the documentary. That's the kind of shift that longitudinal research design measures.

As for a formal definition, a longitudinal study is a research method that involves repeated observations of the same variables in the same subjects (e.g., a set of people) over a period of time. The observations might be undertaken in the form of an online survey. Being able to observe behavior or trends over time is tremendously useful in a variety of fields.

Longitudinal studies are used in fields like:

  • Clinical psychology, to measure a patient's thoughts over time
  • Market research, to observe consumer trends
  • Political polling and sociology, to observe life events and societal shifts over time
  • Medicine, to discover predictors of certain diseases

We are dealing with nuanced changes over time here, and surveys excel at capturing these shifts in attitudes, behaviors, and experiences. Unlike one-time snapshots, surveys repeated over time enable you to track trends and understand how variables evolve. Plus, it is cost-effective and flexible in terms of reach!

For instance, SurveySparrow’s Recurring Surveys let you schedule and automate the entire process.

With this feature, you can share periodic surveys at any frequency that you set. You can also give a gentle nudge to silent respondents with a friendly reminder via email. The best part? The platform's conversational surveys reap a higher response rate.


Types of Longitudinal Studies

When talking about what a longitudinal study is, we cannot go without also discussing the types of longitudinal research design. There are different types of studies depending on your needs. Once you understand all three types of longitudinal studies, you'll be able to pick out the one that's best suited to your needs.

Panel Study

When we want to find out trends in a larger population, we often survey a sample of that population. A panel study simply observes that sample over time. By doing so, panel studies can identify cultural shifts and new trends in the larger population.

Panel studies are designed for quantitative analysis. Through the data from online surveys, you can identify common patterns in the responses from your sample (which remains the same over time). A comprehensive dashboard will help you make informed decisions.

But what’s the need to visualize?

In panel studies, the same set of people must be studied over time. If you pick a different sample, variations in individual preferences could skew your results.

Observing the same set of people can make sure that what you’re observing is a change over time. Visualizing the change over time will give you a clear idea of the trends and patterns, resulting in informed and effective decision-making.
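As a minimal sketch (with invented respondent IDs and ratings, not data from any real panel), the snippet below shows what this looks like in practice: the same respondents answer the same question in several waves, and both within-person change and the panel-wide trend can be read off directly.

```python
import pandas as pd

# Hypothetical long-format panel data: one row per respondent per wave.
responses = pd.DataFrame({
    "respondent_id": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "wave":          [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "satisfaction":  [6, 7, 5, 6, 8, 6, 7, 9, 6],  # e.g. a 1-10 rating
})

# Wide format: each row is one respondent tracked across waves.
wide = responses.pivot(index="respondent_id", columns="wave", values="satisfaction")
wide["change"] = wide[3] - wide[1]  # within-person change, first to last wave
print(wide)

# Panel-wide trend: mean score per wave.
print(responses.groupby("wave")["satisfaction"].mean())
```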

Cohort Study

A longitudinal cohort study is one in which we study people who share a single characteristic over a period of time. Cohort studies are regularly conducted by medical researchers to ascertain the effects of a new drug or the symptoms of a disease.

In cohort studies, the behaviors of the selected group of people are observed over time to find patterns and trends. Often, these studies can go on for years. They can also be particularly useful for ascertaining consumer trends if you’re trying to research consumers with a specific common characteristic. An example of such a study would be observing the choice of cereal for kids who go to Sunshine Elementary School over time.

If you’re confused between panel studies and cohort studies, don’t worry. The one key difference between cohort studies and panel studies is that the same set of people has to be observed in the latter. In cohort studies, you can pick a different sample of the same demographic to study over time.

Retrospective Study

A retrospective longitudinal study is when you take pre-existing data from previous online surveys and other research. The objective here is to put your results in a larger timeline and observe the variation in results over time. What makes retrospective studies longitudinal is simply the fact that they’re aimed at revealing trends over time.

When working out what a longitudinal study is, it'll be well worth your while to look into retrospective studies. For your company, retrospective longitudinal studies can reveal crucial insights without you having to spend a single dime. Since these studies depend on existing data, they not only cost little themselves but also improve the returns from your earlier research efforts.

How can retrospective longitudinal studies be useful to you? Let’s assume, for example, that you conduct an employee engagement survey every year. If your organization has done these surveys for the past 10 years, you now have more than enough material to conduct a retrospective study. You can then find out how employee engagement at your company has varied over time.
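A retrospective analysis of ten years of annual engagement results could be as simple as the sketch below; the data frame, column names, and scores are invented for illustration.

```python
import pandas as pd

# Hypothetical pre-existing data pooled from ten earlier survey rounds.
history = pd.DataFrame({
    "year": [2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024],
    "engagement_score": [3.4, 3.5, 3.5, 3.7, 3.6, 3.2, 3.3, 3.6, 3.8, 3.9],
})

# Year-over-year change and the average change per year across the decade.
history["yoy_change"] = history["engagement_score"].diff()
print(history)
print(f"Average change per year: {history['yoy_change'].mean():+.2f} points")
```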

Like every research method, longitudinal studies have their advantages and disadvantages. While trying to understand what a longitudinal study is, it is important to grasp the particular ways in which they're useful, and the situations in which they're not. Let's go over some of the major pros and cons of longitudinal surveys.

Advantages of Longitudinal Studies

  • Rigorous Insights : A one-off online survey, no matter how well designed, is only so rigorous. Even though the results are often useful, sometimes you need more rigor in your surveys. A longitudinal survey, by observing respondents over time, can offer more rigorous results.
  • Long-term Data : When thinking about what a longitudinal study is, it is crucial to understand that it is best used for a specific type of data collection. When you need to understand trends over the longer term, longitudinal studies are best suited to that task.
  • Discover Trends : Most companies, in one way or another, rely on trends they estimate will be relevant in the future. Longitudinal studies can be great at finding out those trends and capitalizing on them before the competition.
  • Open To Surprises : When designing an online survey, it is very tough to allow for surprises. Mostly, you get what you ask for. With longitudinal surveys, you’re allowing for the possibility that you might spot patterns you didn’t imagine could exist. Longitudinal studies are more flexible in that regard and allow us to discover the unexpected.

Disadvantages of Longitudinal Studies

  • Higher Costs : Because longitudinal research needs to be conducted over time, and in some cases with the same set of people, it ends up being costlier than one-off surveys. From conducting the observations to analyzing the data, the expenses can add up. Using a cost-effective online survey tool like SurveySparrow can be one way to reduce costs.
  • More Demanding : One of the biggest challenges you can face while conducting a survey is to get enough respondents. Even for normal online surveys, it can be tough to get people to take your survey. Longitudinal surveys are far more demanding, so it is unlikely that anyone will participate without strong incentives.
  • Unpredictability : While unpredictability can sometimes be a good thing, at times it can also lead the whole exercise astray. The success of a longitudinal study depends not just on the resources you invest in it, but also on the respondents who have to participate in a long-term commitment. Things can go wrong when respondents are suddenly unavailable. That’s why there’s always an element of unpredictability with longitudinal surveys.
  • Time-Consuming : Unlike simple online surveys, you don’t get the results instantly with longitudinal surveys. They require a certain vision, and you have to be patient enough to see it through to get your desired results.

Longitudinal surveys have been used by researchers and businesses for a long time now, so there is no dearth of examples. Let's walk through a few of them so you can better understand what a longitudinal survey is.

Australia’s ‘45 and Up’ Survey

There is no better example for understanding what longitudinal research is than the 45 and Up Study being conducted in Australia. It aims to understand healthy aging and has 250,000 participants aged 45 or older. The goal is to get a better picture of Australians' health as they age.

Such a study needed to be a longitudinal survey since you can only understand the effects of aging en masse by considering the results over time. The results from this study are being used in areas like cardiovascular research and preventable hospitalizations.

Smoking and Lung Cancer

To understand the effects of smoking, you need to be able to assess its consequences over time. The British Doctors Study, which ran from 1951 to 2001, yielded results that strongly indicated the link between smoking and lung cancer. If not for longitudinal research methods, we might never have known.

Even though the research was first published in 1956, the study went on for almost half a century after that. When thinking about what is a longitudinal study, we must also consider that these studies give results while they’re ongoing. Conclusively proving the link between smoking and cancer required a robust, longitudinal survey.

Growing Up In Ireland

Started in 2006, Growing Up In Ireland is a longitudinal study conducted by the Irish government to understand what children's lives look like at different ages. One cohort that the study started following at 9 years of age is now 23. The long-term study can yield interesting results by following a set of children throughout their childhood.

The thing to remember about longitudinal studies is that they can have broad objectives. You can go in without really knowing what you're trying to find and what that might lead to. You can then use the surprises along the way to generate actionable insights.

Wrapping Up

If you started out wondering what a longitudinal study is, we hope that we've addressed that question and more in this article. If you want to create a longitudinal survey, don't forget to plan out your survey first. A retrospective study, like the ones we just talked about, can also be a great solution to your problems.

Here at SurveySparrow, we love surveys of all kinds. For certain types of questions, you need to conduct longitudinal surveys, and we’re here to support you through the process. With our online templates and intuitive UI, conducting a longitudinal survey will be much easier.

What we love about recurring surveys is the surprising results they can yield. That is really what drives us at SurveySparrow: you might find something in the results you didn't expect, and it might change the course of your company for the better.


Open access | Published: 11 May 2024

Arthritis is associated with high nutritional risk among older Canadian adults from the Canadian Longitudinal Study on Aging

Roxanne Bennett, Thea A. Demmers, Hugues Plourde, Kim Arrey, Beth Armour, Guylaine Ferland & Lisa Kakinami

Scientific Reports, volume 14, Article number: 10807 (2024)


  • Epidemiology
  • Rheumatology
  • Risk factors

This study assessed the association between arthritis, functional impairment, and nutritional risk (NR). Cross-sectional data were from the Canadian Longitudinal Study on Aging, a nationally representative sample of 45–85-year-old community-dwelling Canadians (n = 41,153). The abbreviated Seniors in the Community: Risk Evaluating for Eating and Nutrition II (SCREEN II-AB) Questionnaire determined NR scores (continuous), and high NR (score < 38); the Older American Resources and Services scale measured functional impairment. NR scores and status (low/high) were modelled using multiple linear and logistic regressions, respectively. Analyses adjusted for demographic characteristics, functional impairment, and health (body mass index, self-rated general and mental health). Additional analyses stratified the models by functional impairment. People with arthritis had poorer NR scores (B: − 0.35, CI − 0.48, − 0.22; p  < 0.05) and increased risks of high NR (OR 1.11, 95% CI 1.06, 1.17). Among those with functional impairment, the likelihood of high NR was 31% higher in people with arthritis compared to those without arthritis (95% CI 1.12, 1.53). Among those with no functional impairment, the likelihood of high NR was 10% higher in people with arthritis compared to those without (95% CI 1.04, 1.16). These relationships differed based on the type of arthritis. Arthritis is associated with high NR in community-dwelling older adults, both with and without functional impairment. Findings highlight the need for further research on these relationships to inform interventions and improve clinical practices.


Introduction

In 2018, 46.9% of Canadians over the age of 65 years suffered from arthritis 1 . Recent evidence suggests that older adults affected by disabling conditions (such as arthritis) are more likely to suffer from malnutrition 2 , 3 . The relationship between nutritional status and arthritis is complex and may vary based on the type and severity of arthritis, as well as the affected joints 4 , 5 , 6 . For example, joint pain may impact dexterity or the ability to stand to prepare food, while the fatigue present in certain forms of arthritis may decrease the energy to cook or to eat 5 , 6 , 7 . Because malnutrition has been linked to increased morbidity and mortality in this age group, the early detection of vulnerable individuals may allow for timely interventions and improved outcomes 2 , 3 , 8 , 9 , 10 .

To help prevent progression to full-fledged malnutrition, screening tools have been designed to detect the presence of risk factors associated with poor nutritional status, also known as “nutritional risk” 8 , 11 , 12 . While the relationship between arthritis and functional impairment is established 13 , the contribution of different types of arthritis, such as osteoarthritis (OA), and other external factors to the development of functional impairment 14 , and further to the development of nutritional risk, is less understood. This is in part due to delays in diagnosis of arthritis 15 , 16 , which could be precipitated sooner by functional impairment in some cases, depending on the joint and personal habits, but not in others. It is also partly due to bidirectionality between nutritional risk and subsequent development of functional impairment. Both points contribute to difficulty in establishing the temporality needed to clarify the role of functional impairment as a mediator rather than a moderator within the relationship between arthritis and nutritional risk. Past research suggests that nutritional risk may be linked to functional impairment 2 , 8 , 17 . Specifically, limitations with certain activities of daily living (ADLs) and instrumental activities of daily living (IADLs), such as meal preparation, may be particularly impactful in a person’s susceptibility to nutritional risk 4 , 17 , 18 , 19 , 20 . Although an estimated 10%-59% of patients with arthritis experience difficulties during meal preparation, there is a limited body of research on the relationship between meal preparation and arthritis 4 , 19 , 21 , 22 , 23 , 24 . Previous research has linked impairment with certain ADLs and IADLs to both the number of painful joints and the severity of pain in individuals with arthritis 4 , 24 . These results are complemented by a 2016 study, which found an association between chronic musculoskeletal pain and nutritional risk in seniors 25 . While pain caused by arthritis is hypothesized to play an important role in arthritis-related functional impairment, its association with nutritional risk remains understudied 5 , 26 , 27 .

Thus, there is a paucity of data on the relationship between arthritis and nutritional risk, and on the role of functional impairment within that relationship. This study aims to bridge this gap by (1) describing the association between arthritis and nutritional risk, (2) describing the association between arthritis and nutritional risk while considering meal preparation impairment, and (3) assessing the relationship between functional impairment and these associations. To address these objectives, a large representative sample of older Canadian adults was used.

Data source

The Canadian Longitudinal Study on Aging (CLSA) is a nationally representative cohort study. Study design and measures have been published but are briefly described here 28 . Baseline data (2010–2015) were from over 50,000 Canadians between the ages of 45 and 85 years from all provinces (excluding territories). Individuals who were institutionalized, incarcerated, lived on a reserve, or had cognitive impairment at baseline were ineligible to participate. The study was conducted according to the guidelines laid down in the Declaration of Helsinki and all procedures involving research study participants were approved by the CIHR Advisory Committee on Ethical, Legal and Social Issues, and the research boards at the study sites. Ethics approval for this secondary data analysis was obtained from Concordia University (#30007632). Written informed consent was obtained from all participants. This analysis includes data from both the “tracking” group (self-reported data during a telephone interview, n = 21,241) and the “comprehensive” group (additionally includes onsite measurements, tests, questionnaires, and in-person home interviews, n = 30,097) 28 .

Participants were excluded from this secondary data analysis if they had missing data for total household income (n = 3321), number of people living in the household (n = 23), education (n = 2421), self-rated general health (n = 36), self-rated mental health (n = 33), body mass index (BMI) (n = 204), and any question within the nutritional risk or functional impairment questionnaires (n = 4147). In total, 41,153 respondents were included in this study.

Respondents were asked if they had ever been told by a doctor that they had any of the following: OA of the knee(s), the hip(s), the hand(s), rheumatoid arthritis (RA), or any other form of arthritis. Respondents who did not know about a positive arthritis diagnosis (n = 569) were considered not to have arthritis. Thus, in both groups combined (“tracking” and “comprehensive”), 14,468 respondents were considered to have arthritis and 26,685 were considered not to have arthritis. Among those with arthritis, people were additionally categorized into three non-mutually exclusive groups with OA (n = 10,485), RA (n = 1510) and other forms of arthritis (n = 4901).

In addition to self-reported arthritis, the “comprehensive” group also included validated disease ascertainment algorithms that determined the likelihood of OA of the hands, hips, and knees based on self-reported symptoms and clinical observations 29 . Participants were categorized based on whether they responded affirmatively or negatively to experiencing symptoms such as joint enlargement and pain. As algorithms were only available for OA in the “comprehensive” group (n = 25,099), the relationships between OA-related pain and nutrition were assessed in sensitivity analyses.

Nutritional risk

Nutritional risk was measured using the Seniors in the Community: Risk Evaluation for Eating and Nutrition II—Abbreviated (SCREEN II-AB, also known as SCREEN-8). The SCREEN II-AB is a validated 8-item tool designed for epidemiological and clinical use in community-dwelling seniors 11 , 30 . The measure attributes scores to: recent weight changes (range: 0–8), frequency of meal skipping (0–8), general appetite (0–8), difficulties with swallowing (0–8), daily vegetable and fruit consumption (0–4), daily fluid intake (0–4), the social context of mealtime (0–4), and the frequency of cooking meals at home (0–4) 30 , 31 , for a maximum score of 48. Lower SCREEN II-AB scores reflected higher nutritional risk 32 . High nutritional risk (H-NR) was defined as a score below 38, as this cut-off had optimal sensitivity and specificity in identifying nutritional risk when compared to a dietitian's clinical assessment (including medical and nutritional history, dietary intake and anthropometry) 11 .
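A minimal sketch of this scoring in Python, using illustrative item names (the validated wording and scoring key belong to the instrument itself), makes the cut-off concrete:

```python
# Illustrative item names and maxima for the eight SCREEN II-AB items; the
# eight maxima sum to 48, and totals below 38 flag high nutritional risk (H-NR).
ITEM_MAXIMA = {
    "weight_change": 8, "meal_skipping": 8, "appetite": 8, "swallowing": 8,
    "fruit_veg": 4, "fluid_intake": 4, "eats_with_others": 4, "cooks_at_home": 4,
}
HIGH_RISK_CUTOFF = 38

def screen8_total(item_scores: dict) -> int:
    """Sum the eight item scores after a simple range check."""
    total = 0
    for item, maximum in ITEM_MAXIMA.items():
        score = item_scores[item]
        if not 0 <= score <= maximum:
            raise ValueError(f"{item} score {score} is outside 0-{maximum}")
        total += score
    return total

# Hypothetical respondent: a total of 35 (< 38) is classified as high risk.
example = {"weight_change": 6, "meal_skipping": 6, "appetite": 5, "swallowing": 8,
           "fruit_veg": 2, "fluid_intake": 3, "eats_with_others": 2, "cooks_at_home": 3}
total = screen8_total(example)
print(total, "H-NR" if total < HIGH_RISK_CUTOFF else "not high risk")
```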

Participants' functional impairment was measured using the Older Americans' Resources and Services (OARS) Multidimensional Functional Assessment Questionnaire 20 . This validated questionnaire includes items about seven ADLs (dressing, eating, appearance upkeep, walking, bathing, getting in and out of bed, using the bathroom) and seven IADLs (meal preparation, using the telephone, travelling, shopping, housework, taking medication, financial management) 20 . Response options indicated whether the respondent required no help, required some help, or was unable to perform the activity without assistance 20 . Because few CLSA participants reported any functional impairment, the total OARS score was used to categorize the respondents' overall functional capacity as dichotomous (no help required for any activity, as a referent, versus some help required or unable to perform at least one activity without assistance), similarly to a previous study using these data 33 . To isolate its effect as both an independent predictor and a covariate of interest, meal preparation impairment was considered separately from the other ADLs and IADLs in the OARS scale. Thus, functional impairment was subdivided into (1) impairment specific to meal preparation, and (2) impairment in any activity (excluding meal preparation).
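A small sketch of this dichotomization (hypothetical activity names and response labels, not the CLSA variable coding) might look like the following:

```python
def classify_oars(responses: dict) -> dict:
    """responses maps activity name -> 'no help' / 'some help' / 'unable'."""
    impaired = {a for a, r in responses.items() if r in ("some help", "unable")}
    return {
        # Impairment specific to meal preparation is kept separate...
        "meal_prep_impairment": "meal preparation" in impaired,
        # ...from impairment in any of the other ADLs/IADLs.
        "other_functional_impairment": bool(impaired - {"meal preparation"}),
    }

# Example respondent who needs help with housework only.
print(classify_oars({"meal preparation": "no help", "housework": "some help",
                     "dressing": "no help", "shopping": "no help"}))
# -> {'meal_prep_impairment': False, 'other_functional_impairment': True}
```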

Covariates of interest were identified based on the findings of past research on arthritis, chronic disability, and food insecurity 34 , 35 , 36 . These covariates included demographic characteristics such as age, sex, race (white vs non-white), total household income, education, and the number of individuals residing in the household. Total annual household income was categorized as less than $20,000, $20,000–$49,999, $50,000–$100,000, and greater than $100,000 (CAD). Education was considered as: less than secondary school, secondary school, trade school, and university or higher. Health covariates included BMI, self-rated mental health, and self-rated general health. Given the documented protective effects of a higher BMI in adults over 65 years, BMI was categorized according to the Global Leadership Initiative on Malnutrition (GLIM) criteria for malnutrition, with underweight (< 20 kg/m² for people aged < 70 and < 22 kg/m² for those aged ≥ 70), normal to overweight as the referent (20–29.9 kg/m² for people aged < 70 and 22–29.9 kg/m² for those aged ≥ 70), or obese (≥ 30.0 kg/m²) 37 . BMI was calculated from self-reported height and weight in the "tracking" group and measured height and weight in the "comprehensive" group. Self-rated general health scores and self-rated mental health scores were each coded into two groups with the first including those who self-rated as "excellent" and "very good" and the latter including those who self-rated as "good", "fair", or "poor".
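The age-dependent BMI grouping can be paraphrased as a small helper function. This is a sketch of the categories described above, not the CLSA or GLIM reference code:

```python
def bmi_category(bmi: float, age: float) -> str:
    """GLIM-style grouping with an age-dependent underweight cut-off."""
    underweight_cutoff = 20.0 if age < 70 else 22.0
    if bmi < underweight_cutoff:
        return "underweight"
    if bmi >= 30.0:
        return "obese"
    return "normal to overweight (referent)"

print(bmi_category(21.0, 72))  # underweight: below the 22 kg/m^2 cut-off at age 70+
print(bmi_category(21.0, 60))  # normal to overweight (referent)
```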

Statistical analysis

All analyses were conducted with SAS version 9.4 (SAS Institute Inc., Cary, NC, USA) and incorporated survey weights in accordance with the CLSA recommendations. Descriptive statistics compared the sample’s demographic and health characteristics to those who were excluded. Hierarchical multiple linear regression was used to model nutritional risk scores (where lower scores indicated greater risk) from the SCREEN-II-AB. Additionally, H-NR using the SCREEN II-AB cut-off score of 38 was assessed with logistic regression models 11 , 31 . Regressions were entered in three steps, with covariates as described previously. The first step (Model 1) controlled for demographic characteristics (age, sex, race, total household income, number of people in the household, and education) and measures of health (BMI, self-rated general physical and mental health). A second model (Model 2) additionally adjusted for meal preparation impairment. The third model (Model 3) further controlled for general functional impairment (excluding meal preparation).
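The stepwise strategy can be sketched as follows. This is a simplified illustration in Python/statsmodels rather than the SAS 9.4 used by the authors: survey weights are omitted, the covariate list is abbreviated, and the data are synthetic stand-ins for the CLSA variables.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "arthritis": rng.integers(0, 2, n),
    "age": rng.integers(45, 86, n),
    "female": rng.integers(0, 2, n),
    "meal_prep_impairment": rng.integers(0, 2, n),
    "functional_impairment": rng.integers(0, 2, n),
})
# Synthetic outcome loosely tied to arthritis and impairment, for illustration only.
logit_p = -1 + 0.2 * df["arthritis"] + 0.5 * df["functional_impairment"]
df["high_nr"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

# Model 1: demographics only (abbreviated); Model 2 adds meal preparation
# impairment; Model 3 adds general functional impairment.
m1 = smf.logit("high_nr ~ arthritis + age + female", data=df).fit(disp=0)
m2 = smf.logit("high_nr ~ arthritis + age + female + meal_prep_impairment",
               data=df).fit(disp=0)
m3 = smf.logit("high_nr ~ arthritis + age + female + meal_prep_impairment"
               " + functional_impairment", data=df).fit(disp=0)

for name, m in [("Model 1", m1), ("Model 2", m2), ("Model 3", m3)]:
    print(name, "OR for arthritis:", round(float(np.exp(m.params["arthritis"])), 2))
```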

Betas for arthritis (any vs none, as well as based on the three non-mutually exclusive groups described previously), meal preparation impairment, and functional impairment from linear regression models assessing nutritional risk scores (continuous), and odds ratios from logistic regression models assessing the probability of high nutritional risk (< 38 vs ≥ 38), are all presented. These data are cross-sectional; as temporal precedence cannot be established, tests for mediation may be unwarranted and only indirect effects of partial mediation can be calculated 38 . In accordance with the methodological guidelines in the literature, the difference in coefficients between Model 3 and Model 1 and its standard error was calculated. The estimate was compared to the t-distribution (df = 41,118), with the null hypothesis being that the effect was mediated through functional impairment 39 , 40 , 41 . Lastly, based on evidence of an interaction between arthritis and functional impairment, the regression models were also stratified by functional impairment for further comparison. All analytical procedures were repeated for the sensitivity analysis investigating OA-related pain as previously described.
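Continuing the synthetic sketch above (and reusing its df, smf, np, m1, and m3), the quantities described in this paragraph reduce to a coefficient difference and a pair of stratified refits; the standard-error calculation for the difference, which follows the cited methods papers, is omitted here.

```python
# Raw difference in the arthritis coefficient between Model 1 and Model 3
# (the quantity whose standard error the authors compare to a t-distribution).
coef_diff = m1.params["arthritis"] - m3.params["arthritis"]
print("Coefficient difference (Model 1 vs Model 3):", round(float(coef_diff), 3))

# Stratified comparison: refit the model separately within each impairment group.
for impaired, group in df.groupby("functional_impairment"):
    m = smf.logit("high_nr ~ arthritis + age + female + meal_prep_impairment",
                  data=group).fit(disp=0)
    label = "with functional impairment" if impaired else "no functional impairment"
    print(label, "OR for arthritis:", round(float(np.exp(m.params["arthritis"])), 2))
```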

Ethical approval

The study was conducted according to the guidelines laid down in the Declaration of Helsinki and all procedures involving research study participants were approved by the CIHR Advisory Committee on Ethical, Legal and Social Issues, and the research boards at the study sites. Ethics approval for this secondary data analysis was obtained from Concordia University (#30007632).

Informed consent

Written informed consent was obtained from all participants.

Compared to the participants who were excluded from the sample, those included in the analysis were more likely to be women (49.8% vs 44.3%, p  < 0.0001), younger (59.2 years vs 63.2; p  < 0.0001), less likely to have arthritis (32% vs 39%, p  < 0.0001), and had higher (better) nutritional scores (39.2 vs 37.6; n = 46,410; p  < 0.0001; data not shown).

Demographic characteristics of participants with and without arthritis differed for all measured characteristics (Table 1 ); the sample with arthritis was older (62.4 vs 57.8 years, p  < 0.0001) and had a lower proportion of males (41.6% vs 53.7%, p  < 0.0001). Both groups differed significantly on educational attainment and income. Respondents with arthritis had greater nutritional risk than those without arthritis, indicated by their lower nutritional risk scores (38.5 vs 39.6, p  < 0.0001) and greater proportion at high nutritional risk (37.9% vs 31.2%, p  < 0.0001). Moreover, participants with arthritis had greater levels of general functional impairment (13.1% vs 4.8%, p  < 0.0001) and meal preparation impairment (0.7% vs 0.3%, p  < 0.0001). People with arthritis had higher proportions of obesity compared to those without arthritis ( p  < 0.0001).

From the multiple linear and logistic regression models, living with arthritis was associated with a worse nutritional risk score (Table 2 ; Model 1; B = − 0.43, CI − 0.57, − 0.30; p  < 0.05) and with H-NR (Model 1; OR 1.14, CI 1.08, 1.20; p  < 0.05) after controlling for demographic and health characteristics. Both associations from the first model remained significant after further adjustment. Respondents with arthritis had nutritional risk scores that were worse (B = − 0.35, CI − 0.48, − 0.22; p  < 0.05) and were 11% more likely to have H-NR (CI 1.06, 1.17; p  < 0.05) after controlling for both meal preparation impairment and functional impairment (Model 3). Nutritional risk scores were worse among those with RA (B = − 0.99, CI − 1.36, − 0.62; p  < 0.05). Functional impairment, regardless of arthritis, was associated with a 1.88-point decrease in nutritional risk score (Model 3; CI − 2.16, − 1.62; p  < 0.05) and a 61% higher likelihood of H-NR (Model 3; CI 1.48, 1.75; p  < 0.05).

Stratification by functional impairment indicated that the association between arthritis and nutritional risk score differed based on whether the individual had functional impairment or not (Table 3 ). For instance, while arthritis was associated with a nutritional risk score that was 0.30 units lower among people with no functional impairment (Model 1; CI − 0.44, − 0.16; p  < 0.05), the relationship was more severe among those with functional impairment (Model 1; B = − 0.90, CI − 1.41, − 0.38; p  < 0.05). Compared to individuals without arthritis, people with arthritis had increased odds of H-NR whether they experienced functional impairment (Model 2; OR 1.31, CI 1.12, 1.53; p  < 0.05) or not (Model 2; OR 1.10, CI 1.04, 1.16; p  < 0.05). Among those with arthritis but no functional impairment, those with RA had the highest increased odds of H-NR (Model 2; OR 1.29, CI 1.13, 1.47; p  < 0.05); these risks were further exacerbated among those who experienced functional impairment (Model 2; OR 1.41, CI 1.06, 1.87; p  < 0.05). The indirect effect of functional impairment on nutritional risk was estimated at − 0.08 (SE: 0.10; p  > 0.05).

Sensitivity analysis: OA-related pain and nutritional risk

The “comprehensive” group was significantly younger than the “tracking” group (58.9 vs 59.7 years, p  < 0.0001), had a higher proportion of males (50.3% vs 49.5%, p  = 0.03), a lower proportion of people with arthritis (30.0% vs 34.9%, p  < 0.0001), and lower proportion of people with functional impairment (6.7% vs 8.3%, p  < 0.0001). The groups did not differ in meal-related impairment nor in the likelihood of H-NR. Results from the pain sensitivity analysis conducted among those from the “comprehensive” group were consistent in both direction and magnitude with that of the main analyses (data not shown).

Arthritis has previously been linked to nutritional problems, such as poor diet quality, malnutrition, and food insecurity 19 , 23 . This increased risk of nutritional problems may stem from the complex interplay between physiological/physical, psychological, and social factors associated with arthritis 5 . For example, systemic inflammation resulting from inflammatory arthritis (which includes diseases such as RA) has been shown to trigger muscle loss and has been associated with greater pain and fatigue, thus potentially triggering physical and psychological barriers to adopting a healthy diet 5 , 42 . Osteoarthritis, traditionally perceived as a “wear and tear” condition, can cause significant pain with certain tasks and a loss of dexterity, with increasing evidence also pointing to the presence of a chronic low-grade inflammation 43 , 44 , 45 . Behavioral and psychological contributors to nutritional problems in people with arthritis include a higher incidence of depression, often associated with pain and fatigue, which may impact food intake and diet quality, while external factors include reduced accessibility to food and reliance on social support to cook or grocery shop 2 , 5 , 46 , 47 , 48 .

However, the generalizability of the past research has often been limited by small sample sizes and emphasis on RA 8 , 19 , 22 , 23 , 49 , 50 . In this large, representative sample of older Canadian adults, having any arthritis was associated with poorer nutritional risk scores and an increased likelihood of being at high nutritional risk and the association was strongest among those with RA. This association was maintained after adjustment for meal preparation impairment and general functional impairment. Functional impairment partially mediated the relationship between arthritis and nutritional risk. While having arthritis was associated with nutritional risk, the nutritional risk scores were poorer among those who also had functional impairment. Stratification for functional impairment revealed that respondents with arthritis were more likely to present lower SCREEN II-AB scores and were more likely to be at high nutritional risk, even in the absence of any functional impairment.

This is the first study to investigate the association between arthritis and nutritional risk specifically; however, past research may provide important context for its findings. In 2013, a study using data from the Canadian Community Health Survey (CCHS), the precursor to the CLSA, found that disability was independently linked with increased nutritional risk in Canadian seniors 8 . The CCHS results were echoed by research conducted outside of Canada which reinforces the positive association between nutritional risk and disability in older adults 51 , 52 , 53 . However, the absence of referents without arthritis in these previous studies has limited the interpretation of results for those with arthritis but no functional impairment 19 , 21 , 22 . We addressed this gap with a large representative sample encompassing those with and without arthritis, as well as those with and without functional impairment.

Variability in arthritis-related symptoms may also influence the type and severity of limitations with specific ADLs and IADLs 2 , 4 , 24 . Previous studies of older adults have shown that difficulties with meal preparation and shopping for food were the ADLs and IADLs most highly correlated with nutritional risk 17 , 54 . In our study, the pain sensitivity analysis utilizing a thorough assessment based on joint-related pain was consistent with the main findings. Nevertheless, affected joints may differentially contribute to functional impairment and the subsequent meal preparation adaptations should be further explored 2 .

Our results contribute to the literature that can guide the development of interventions for people living with arthritis; it is therefore of practical importance to consider the concepts of moderators and mediators and their role in the mechanisms producing an outcome. A moderator may be a mediator and vice versa; what distinguishes the two is the inference of causality, as strong evidence, temporality, and a lack of measurement error are needed to label a predictor as a mediator 38 , 39 . The literature suggests that functional impairment may be both a moderator and a mediator of the relationship between arthritis and nutritional risk 39 . Indeed, in this study, we found evidence of both. It is theorized that having arthritis restricts functional abilities necessary for maintaining an adequate diet 4 , 19 , 22 . As the cross-sectional nature of this study prevents any inference of causality, we were only able to calculate whether functional impairment could be a partial mediator 39 , 40 . In particular, mounting evidence suggests that the relationship between nutritional risk and disability is bidirectional, as nutritional risk (i.e. skipping meals, weight change, appetite, swallowing and eating habits) may, in turn, aggravate certain symptoms (i.e. fatigue, physical activity) that contribute to functional decline 2 , 8 , 9 . As temporality is a necessary condition for mediation analyses to be conducted properly 38 , future longitudinal research investigating the potential mediation by functional impairment of the relationship between arthritis and nutritional risk is needed 13 .

This study has certain limitations that may reduce the applicability of its results. Firstly, all data were from the baseline wave of the CLSA; follow-up is ongoing but was not available at the time of this project. Similarly, the analytic sample differed significantly from the participants who were excluded from the analysis in demographic and health characteristics, and results cannot be generalized to the entire CLSA sample. Because the excluded participants were more likely to have arthritis and had lower (poorer) nutritional scores, our results may underestimate these relationships. Additionally, the use of self-reported data might be affected by both recall and social desirability bias. Although this study adjusted for age in all regression models, those with arthritis were approximately four years older than those without arthritis. As older adults' increased susceptibility to poor nutritional status stems from numerous complex physiological changes such as altered metabolism, cognitive decline, and changing socioeconomic factors 2 , 3 , 8 , 9 , 10 , a better understanding of the interactions between aging and functional impairment on diet is needed.

While this study had a large, representative sample, only a small number of respondents experienced meal-related ADL impairment, which may have rendered any effects on nutritional risk statistically undetectable. Similarly, of those who reported having difficulties with other ADLs, the majority reported only one difficulty. This limited our ability to assess a potential linear relationship with the number of ADL difficulties as a continuous score. As the participants included in this analysis were younger, less likely to have arthritis, and had better nutritional risk scores than those who were excluded, the results reported here are likely underestimates.

Although the CLSA data collected information on type and severity of arthritis, these details were assessed differently between the tracking (self-reported) and comprehensive (based on disease ascertainment algorithms) groups. Our sensitivity analysis of OA-related pain and nutrition was consistent with our main findings. Nevertheless, whether the severity of arthritis and the affected joints impact nutritional risk should be further explored in a future study. We assessed whether the arthritis types (RA, OA, or other) were differentially associated with nutritional risk and found that those with RA were likely the most severely impacted. However, these were non-mutually exclusive groups, as participants could have more than one type of arthritis. Considering the numerous types of arthritis and the variability in their symptoms, further research on differentiation between conditions is needed.

The SCREEN II-AB is a practical, accessible screening tool for the detection of a broad range of nutritional issues 8 , 11 , 49 . Previous research using the SCREEN II in older adults found that it was more inclusive than other screening tools, such as the SNAQ 65+ , and included determinants of early malnutrition that included both over- and undernutrition 55 . In a study using the 2008/2009 CCHS, nutritional risk was associated with disability, medication use, living alone and low social support in adults aged 65 years 8 . In a New Zealand study of 655 adults, being at high nutritional risk was linked to lower physical health scores, higher depression scores, more difficulty accessing shops, and higher likelihood of living on only a limited pension 56 . Despite its high specificity and sensitivity, the SCREEN II-AB provides neither a diagnosis nor a measure of diet quality. As there is no gold standard for measuring nutritional risk 8 , 49 , further work on identifying which nutrition-related behaviors are most problematic for respondents with and without arthritis is needed.

The results of this study highlight an association of arthritis with poorer nutritional risk scores and higher nutritional risk in a nationally representative sample of Canadian adults between the ages of 45 and 85 years. There is increasing evidence indicating that the relationship between nutrition and arthritis is multifactorial, likely caused by complex interactions between physical, psychological, social, financial, and environmental factors 35 , 47 , 50 , 57 , 58 . The overlap of these characteristics warrants an intersectional perspective, especially as there is evidence separately linking each of these factors to functional status, disease activity, and dietary behaviors 35 , 50 , 57 , 59 . More research is necessary to understand the relationship between arthritis and nutrition in specific groups to inform adapted preventative interventions and improve clinical practices.

Data availability

The data that support the findings of this study are available from the Canadian Longitudinal Study on Aging (CLSA) but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the CLSA, or from the authors upon reasonable request and with permission of the CLSA.

Statistics Canada. Table 13-10-0096-06 Arthritis, by Age Group (2020).

Shatenstein, B. Impact of health conditions on food intakes among older adults. J. Nutr. Elder. 27 (3–4), 333–361 (2008).

Article   PubMed   Google Scholar  

Lee, J. S. & Frongillo, E. A. Jr. Factors associated with food insecurity among U.S. elderly persons: Importance of functional impairments. J. Gerontol. Ser. B 56 (2), S94–S99 (2001).

Article   CAS   Google Scholar  

Katz, P. P., Morris, A. & Yelin, E. H. Prevalence and predictors of disability in valued life activities among individuals with rheumatoid arthritis. Ann. Rheum. Dis. 65 (6), 763–769 (2006).

Article   CAS   PubMed   Google Scholar  

Bennett, R. et al. Identifying barriers of arthritis-related disability on food behaviors to guide nutrition interventions. J. Nutr. Educ. Behav. 51 (9), 1058–1066 (2019).

Kelsheimer, H. L. & Hawkins, S. T. Older adult women find food preparation easier with specialized kitchen tools. J. Am. Diet. Assoc. 100 (8), 950–952 (2000).

Ahlstrand, I., Björk, M., Thyberg, I. & Falkmer, T. Pain and difficulties performing valued life activities in women and men with rheumatoid arthritis. Clin. Rheumatol. 34 (8), 1353–1362 (2015).

Ramage-Morin, P. L. & Garriguet, D. Nutritional Risk Among Older Canadians (Statistics Canada, 2013).

Google Scholar  

Topinková, E. Aging, disability and frailty. Ann. Nutr. Metab. 52 (Suppl. 1), 6–11 (2008).

Bouillanne, O. et al. Geriatric nutritional risk index: A new index for evaluating at-risk elderly medical patients. Am. J. Clin. Nutr. 82 (4), 777–783 (2005).

Keller, H. H., Goy, R. & Kane, S. L. Validity and reliability of SCREEN II (seniors in the community: risk evaluation for eating and nutrition, version II). Eur. J. Clin. Nutr. 59 (10), 1149–1157 (2005).

Akhtar, U., Keller, H. H., Tate, R. B. & Lengyel, C. O. Construct validation of three nutrition questions using health and diet ratings in older Canadian males living in the community. Can. J. Diet. Pract. Res. 76 (4), 194–199 (2015).

Theis, K. A. et al. Which one? What kind? How many? Types, causes, and prevalence of disability among U.S. adults. Disabil. Health J. 12 (3), 411–421 (2019).

McDonough, C. M. & Jette, A. M. The contribution of osteoarthritis to functional limitations and disability. Clin. Geriatr. Med. 26 (3), 387–399 (2010).

Article   PubMed   PubMed Central   Google Scholar  

Martel-Pelletier, J. et al. A new decision tree for diagnosis of osteoarthritis in primary care: International consensus of experts. Aging Clin. Exp. Res. 31 (1), 19–30 (2019).

Villeneuve, E. et al. A systematic literature review of strategies promoting early referral and reducing delays in the diagnosis and management of inflammatory arthritis. Ann. Rheum. Dis. 72 (1), 13–22 (2013).

Sharkey, J. R. The interrelationship of nutritional risk factors, indicators of nutritional risk, and severity of disability among home-delivered meal participants. The Gerontologist 42 (3), 373–380 (2002).

Fillenbaum, G. G. Multidimensional Functional Assessment of Older Adults: The Duke Older Americans Resources and Services procedures (Psychology Press, 2013).

Book   Google Scholar  

Grimstvedt, M. E., Woolf, K., Milliron, B. J. & Manore, M. M. Lower Healthy eating index-2005 dietary quality scores in older women with rheumatoid arthritis v. healthy controls. Public Health Nutr. 13 (8), 1170–1177 (2010).

Fillenbaum, G. G. & Smyer, M. A. The development, validity, and reliability of the OARS multidimensional functional assessment questionnaire. J. Gerontol. 36 (4), 428–434 (1981).

Bärebring, L., Winkvist, A., Gjertsson, I. & Lindqvist, H. M. Poor dietary quality is associated with increased inflammation in Swedish patients with rheumatoid arthritis. Nutrients 10 (10), 1535 (2018).

Berube, L. T., Kiely, M., Yazici, Y. & Woolf, K. Diet quality of individuals with rheumatoid arthritis using the healthy eating index (HEI)-2010. Nutr. Health 23 (1), 17–24 (2017).

Elkan, A. C., Engvall, I. L., Tengstrand, B., Cederholm, T. & Hafström, I. Malnutrition in women with rheumatoid arthritis is not revealed by clinical anthropometrical measurements or nutritional evaluation tools. Eur. J. Clin. Nutr. 62 (10), 1239 (2008).

Katz, P. P. The impact of rheumatoid arthritis on life activities. Arthritis Rheumatol. 8 (4), 272–278 (1995).

Bárbara Pereira Costa, A. et al. Nutritional risk is associated with chronic musculoskeletal pain in community-dwelling older persons: The PAINEL study. J. Nutr. Gerontol. Geriatr. 35 (1), 43–51 (2016).

Häkkinen, A. et al. Muscle strength, pain, and disease activity explain individual subdimensions of the health assessment questionnaire disability index, especially in women with rheumatoid arthritis. Ann. Rheum. Dis. 65 (1), 30–34 (2006).

Pells, J. J. et al. Arthritis self-efficacy and self-efficacy for resisting eating: Relationships to pain, disability, and eating behavior in overweight and obese individuals with osteoarthritic knee pain. Pain 136 (3), 340–347 (2008).

Raina, P. S. et al. The Canadian longitudinal study on aging (CLSA). Can. J. Aging Rev. Can. Vieil. 28 (3), 221–229 (2009).


Oremus, M. et al. Validating chronic disease ascertainment algorithms for use in the Canadian longitudinal study on aging. Can. J. Aging Rev. Can. Vieil. 32 (3), 232–239 (2013).

Keller, H. H. Getting Started with SCREENing: Your Guide to Implementing SCREEN (2012).

Canadian Longitudinal Study on Aging. Derived Variables–Nutritional Risk (NUR) 7. (2017).

Older Adult Nutrition Screening [Internet]. [cited 2023 May 24]. Available from: https://olderadultnutritionscreening.com/

Zhang, L., Shooshtari, S., John, P. S. & Menec, V. H. Multimorbidity and depressive symptoms in older adults and the role of social support: Evidence using Canadian Longitudinal Study on Aging (CLSA) data. PLOS ONE 17 (11), e0276279 (2022).


O’Donnell, S., Lagacé, C., McRae, L. & Bancej, C. Life with arthritis in Canada: A personal and public health challenge. Chronic Dis. Inj. Can. 31 (3), 135–136 (2011).

Taylor, P. C., Moore, A., Vasilescu, R., Alvir, J. & Tarallo, M. A structured literature review of the burden of illness and unmet needs in patients with rheumatoid arthritis: A current perspective. Rheumatol. Int. 36 (5), 685–695 (2016).

Eurenius, E., Stenström, C. H., PARA Study Group. Physical activity, physical fitness, and general health perception among individuals with rheumatoid arthritis. Arthritis Care Res. 53 (1), 48–55 (2005).

Cederholm, T. et al. GLIM criteria for the diagnosis of malnutrition: A consensus report from the global clinical nutrition community. J. Cachexia Sarcopenia Muscle 10 (1), 207–217 (2019).

Fairchild, A. J. & McDaniel, H. L. Best (but oft-forgotten) practices: Mediation analysis. Am. J. Clin. Nutr. 105 (6), 1259–1271 (2017).

Baron, R. M. & Kenny, D. A. The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. J. Pers. Soc. Psychol. 51 (6), 1173–1182 (1986).

O’Rourke, H. P. & MacKinnon, D. P. Reasons for testing mediation in the absence of an intervention effect: A research imperative in prevention and intervention research. J. Stud. Alcohol Drugs 79 (2), 171–181 (2018).

Agler, R. & De Boeck, P. On the interpretation and use of mediation: Multiple perspectives on mediation analysis. Front. Psychol. 8 , 293306. https://doi.org/10.3389/fpsyg.2017.01984 (2017).

Druce, K. L. & Basu, N. Predictors of fatigue in rheumatoid arthritis. Rheumatol. Oxf. Engl. 58 (Suppl 5), v29-34 (2019).

Yu, H., Huang, T., Lu, W. W., Tong, L. & Chen, D. Osteoarthritis pain. Int. J. Mol. Sci. 23 (9), 4642 (2022).

Bellamy, N., Sothern, R. B., Campbell, J. & Buchanan, W. W. Rhythmic variations in pain, stiffness, and manual dexterity in hand osteoarthritis. Ann. Rheum. Dis. 61 (12), 1075–1080 (2002).

dos Santos Duarte Lana, J. F. & Rodrigues, B. Osteoarthritis as a chronic inflammatory disease: A review of the inflammatory markers. In Osteoarthritis Biomarkers and Treatments (IntechOpen, 2019). Available from: https://www.intechopen.com/chapters/64798

Matcham, F., Ali, S., Hotopf, M. & Chalder, T. Psychological correlates of fatigue in rheumatoid arthritis: A systematic review. Clin. Psychol. Rev. 39 , 16–29 (2015).

Cai, Q., Pesa, J., Wang, R. & Fu, A. Z. Depression and food insecurity among patients with rheumatoid arthritis in NHANES. BMC Rheumatol. 6 (1), 6 (2022).

Shaw, Y. et al. Development of resilience among rheumatoid arthritis patients: A qualitative study. Arthritis Care Res. 72 (9), 1257–1265 (2020).

Keller, H. H. & McKenzie, J. D. Nutritional risk in vulnerable community-living seniors. Can. J. Diet. Pract. Res. 64 (4), 195–201 (2003).

Tarasuk, V., Mitchell, A., McLaren, L. & McIntyre, L. Chronic physical and mental health conditions among adults may increase vulnerability to household food insecurity. J. Nutr. 143 (11), 1785–1793 (2013).

Strobl, R., Müller, M., Emeny, R., Peters, A. & Grill, E. Distribution and determinants of functioning and disability in aged adults—results from the German KORA-Age study. BMC Public Health 13 (1), 137 (2013).

Yap, K. B., Niti, M. & Ng, T. P. Nutrition screening among community-dwelling older adults in Singapore. Singap. Med. J. 48 (10), 911 (2007).


Söderhamn, U., Dale, B., Sundsli, K. & Söderhamn, O. Nutritional screening of older home-dwelling Norwegians: A comparison between two instruments. Clin. Interv. Aging 7 , 383 (2012).

Sharkey, J. R. Nutrition risk screening. J. Nutr. Elder. 24 (1), 19–34 (2004).

Borkent, J. W. et al. What do screening tools measure? Lessons learned from SCREEN II and SNAQ65. Clin. Nutr. ESPEN 38 , 172–177 (2020).

Wham, C. A. et al. Health and social factors associated with nutrition risk: Results from life and living in advanced age: A cohort study in New Zealand (LiLACS NZ). J. Nutr. Health Aging 19 (6), 637–645 (2015).

Kojima, M. et al. Psychosocial factors, disease status, and quality of life in patients with rheumatoid arthritis. J. Psychosom. Res. 67 (5), 425–431 (2009).

Englbrecht, M., Kruckow, M., Araujo, E., Rech, J. & Schett, G. The interaction of physical function and emotional well-being in rheumatoid arthritis—What is the impact on disease activity and coping?. Semin. Arthritis Rheum. 42 (5), 482–491 (2013).

Fuller-Thomson, E. & Shaked, Y. Factors associated with depression and suicidal ideation among individuals with arthritis or rheumatism: Findings from a representative community survey. Arthritis Rheum. 61 (7), 944–950 (2009).


Acknowledgements

The AB SCREEN™ II assessment tool is owned by Dr. Heather Keller. Use of the AB SCREEN™ II assessment tool was made under license from the University of Guelph. Beth Armour (1952-2020), a dietitian and educator, passed away prior to the finalization of the manuscript. Her mentorship and friendship will always be remembered. This research was made possible using the data/biospecimens collected by the Canadian Longitudinal Study on Aging (CLSA). Funding for the Canadian Longitudinal Study on Aging (CLSA) is provided by the Government of Canada through the Canadian Institutes of Health Research (CIHR) under grant reference: LSA 94473 and the Canada Foundation for Innovation, as well as the following provinces, Newfoundland, Nova Scotia, Quebec, Ontario, Manitoba, Alberta, and British Columbia. This research has been conducted using the CLSA Baseline Comprehensive Dataset version 4.0 under Application ID #180106. The CLSA is led by Drs. Parminder Raina, Christina Wolfson, and Susan Kirkland.

This work was supported by a grant from the Drummond Foundation (2017RFA-#10). L.K. holds a Junior 1 salary award from the Fonds de la Recherche du Québec—Santé (FRQS). T.D. holds a doctoral training award for professionals from the FRQS. The opinions expressed in this manuscript are the authors’ own and do not reflect the views of the Canadian Longitudinal Study on Aging.

Author information

Authors and Affiliations

School of Health, Concordia University, Montreal, QC, Canada

Roxanne Bennett, Thea A. Demmers & Lisa Kakinami

École de Santé Publique, Centre de Recherche en Santé Publique, Université de Montréal, Montreal, QC, Canada

Thea A. Demmers

School of Human Nutrition, McGill University, Montreal, QC, Canada

Hugues Plourde

HUMA+, Westmount, QC, Canada

PEN- Practice-Based Evidence in Nutrition®, Dietitians of Canada, Toronto, Canada

Beth Armour

Département de Nutrition, Faculté de Médicine, Université de Montréal, Montreal, QC, Canada

Guylaine Ferland

Department of Mathematics and Statistics, Concordia University, 1455 de Maisonneuve West, Montreal, QC, H3G 1M8, Canada

Lisa Kakinami


Contributions

T.A.D. led the project administration, R.B. wrote the initial manuscript draft, and L.K. conducted the analyses. T.A.D. and L.K. were leads on conceptualization, funding acquisition, methodology, supervision, and reviewing/editing the manuscript. R.B., H.P., K.A., B.A., and G.F. assisted with conceptualization, formal analysis, and methodology. H.P., K.A., and G.F. additionally assisted with funding acquisition and reviewing/editing the manuscript.

Corresponding author

Correspondence to Lisa Kakinami .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Bennett, R., Demmers, T.A., Plourde, H. et al. Arthritis is associated with high nutritional risk among older Canadian adults from the Canadian Longitudinal Study on Aging. Sci Rep 14 , 10807 (2024). https://doi.org/10.1038/s41598-024-58370-7


Received : 27 September 2023

Accepted : 28 March 2024

Published : 11 May 2024

DOI : https://doi.org/10.1038/s41598-024-58370-7



  • Open access
  • Published: 03 May 2024

Insulin sensitivity estimates and their longitudinal association with coronary artery disease in type 1 diabetes. Does it matter?

  • Stefan Mutter 1,2,3,
  • Erika B. Parente 1,2,3,
  • Andrzej S. Januszewski 4,5,
  • Johan R. Simonsen 1,2,3,
  • Valma Harjutsalo 1,2,3,
  • Per-Henrik Groop 1,2,3,6,7,
  • Alicia J. Jenkins 5,7,
  • Lena M. Thorn 1,3,8 on behalf of

the FinnDiane Study Group

Cardiovascular Diabetology volume 23, Article number: 152 (2024)


Insulin resistance and chronic kidney disease are both associated with increased coronary artery disease risk. Many formulae estimating the glucose disposal rate in type 1 diabetes infer insulin sensitivity from clinical data. We compared three such formulae with respect to their association with coronary artery disease in people with type 1 diabetes, and their performance relative to traditional risk factors and kidney disease severity.

The baseline glucose disposal rate was estimated by three (Williams, Duca, and Januszewski) formulae in FinnDiane Study participants and related to subsequent incidence of coronary artery disease, by baseline kidney status.

In 3517 adults with type 1 diabetes, during median (IQR) 19.3 (14.6, 21.4) years, 539 (15.3%) experienced a coronary artery disease event, with higher rates with worsening baseline kidney status. Correlations between the three formulae estimating the glucose disposal rate were weak, but the lowest quartile of each formula was associated with higher incidence of coronary artery disease. Importantly, only the glucose disposal rate estimation by Williams showed a linear association with coronary artery disease risk in all analyses. Of the three formulae, Williams was the strongest predictor of coronary artery disease. Only age and diabetes duration were stronger predictors. The strength of associations between estimated glucose disposal rate and CAD incidence varied by formula and kidney status.

Conclusions

In type 1 diabetes, estimated glucose disposal rates are associated with subsequent coronary artery disease, modulated by kidney disease severity. Future research is merited regarding the clinical usefulness of estimating the glucose disposal rate as a coronary artery disease risk factor and potential therapeutic target.

Insulin resistance, commonly observed in type 2 diabetes, can also occur in people with type 1 diabetes [1]. This is sometimes referred to as 'double diabetes', is characterised by the presence of components of the metabolic syndrome, such as adiposity, hyperglycaemia, hypertension, and dyslipidaemia [2], and is associated with a higher risk of micro- and macrovascular complications and death [3, 4, 5, 6]. A definition of the metabolic syndrome has been agreed upon by an international consensus group, but notably this definition is not specific to people with type 1 diabetes [7]. The gold standard method of quantifying insulin sensitivity, the euglycaemic hyperinsulinaemic clamp [8], is not feasible in large studies or in clinical practice. Many formulae using clinically available variables have therefore been developed to estimate insulin sensitivity, such as the estimated glucose disposal rate (eGDR) [9, 10, 11, 12, 13]. These equations have been derived from clamp studies in several dozen people with type 1 diabetes, usually adults with few or no chronic complications, potentially limiting generalisability. Furthermore, validation of eGDR calculated from these formulae against GDR measured in independent clamp studies showed rather weak correlations, with r < 0.3 [10, 11, 14]. Nevertheless, eGDR calculated from some formulae has been shown to be associated with subsequent diabetic kidney disease [4, 15, 16], diabetic retinopathy [4], cardiovascular disease [15, 17], and mortality [18] in people with type 1 diabetes. We are not aware of any publications that have evaluated more than one eGDR formula in relation to subsequent chronic diabetes complications, or that have evaluated eGDR performance according to diabetic kidney disease status.

We therefore aimed to compare three eGDR formulae [10, 11, 12] that are based on the three available clamp studies performed in adult Caucasians with type 1 diabetes, and to assess their relationship with the subsequent incidence of coronary artery disease (CAD) over a long follow-up in a large cohort of adults with type 1 diabetes, including an evaluation of eGDR formula performance according to baseline kidney disease severity.

Study participants

All participants are from the FinnDiane Study, an ongoing, multicentre, observational Finnish study founded in 1997 and including 77 study centres (Additional file 1: Table S1), which aims to discover clinical, genetic, and environmental risk factors for micro- and macrovascular complications of type 1 diabetes [2]. We included 3517 participants with type 1 diabetes, defined as age of diabetes onset < 40 years and insulin treatment initiated within one year of diagnosis, who had their baseline visit before the end of 2017 (median 2000, IQR 1998, 2002). We excluded individuals with pre-existing (pre-baseline) cardiovascular events, including myocardial infarction, coronary revascularisation, stroke, amputations, or peripheral artery revascularisation, as well as those on kidney replacement therapy or with an estimated glomerular filtration rate (eGFR) < 15 ml/min/1.73 m². We further excluded individuals with missing data required for eGDR calculation by the three formulae of interest.

Baseline clinical characteristics were assessed as previously described [2]. Anthropometrics: weight in light clothing to the closest 0.1 kg, height to the closest 1 cm, waist circumference midway between the lowest ribs and the iliac crest, and hip circumference at the widest part of the gluteal region, both to the closest 0.5 cm. Blood pressure (BP): the mean of two measures of systolic and diastolic BP taken seated after ten minutes of rest. Medical history: history of diabetes and its complications, current medications, insulin pump use, and lifestyle by validated questionnaires. Clinical chemistry: venous blood for creatinine, HbA1c, lipids, and lipoproteins. Kidney status: eGFR calculated by the Chronic Kidney Disease Epidemiology Collaboration equation [19]; albuminuria status based on the albumin excretion rate (AER) in two out of three urine collections, classified as normal AER (< 20 µg/min or < 30 mg/24 h), moderately increased albuminuria (20–200 µg/min or 30–300 mg/24 h), or severely increased albuminuria (> 200 µg/min or > 300 mg/24 h). Diabetic kidney disease severity was graded per Kidney Disease: Improving Global Outcomes (KDIGO) risk categories: low risk (normal AER and eGFR ≥ 60 ml/min/1.73 m², N = 2513); moderately increased risk (normal AER and eGFR 45–59 ml/min/1.73 m², or moderately increased albuminuria and eGFR ≥ 60 ml/min/1.73 m², N = 516); and high (N = 251) and very high risk (N = 237) (normal AER and eGFR < 45 ml/min/1.73 m², moderately increased albuminuria and eGFR < 60 ml/min/1.73 m², or severely increased albuminuria irrespective of eGFR) [20]. Due to relatively low numbers, the high and very high KDIGO risk groups were combined.
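As an illustration of how this kidney status grouping can be operationalised, the following Python sketch collapses AER and eGFR into the three KDIGO-based groups used in these analyses (high and very high combined). The thresholds come from the definitions above; the function name and the choice of µg/min units are our own.

```python
def kdigo_risk(aer_ug_min: float, egfr: float) -> str:
    """Collapse AER (µg/min) and eGFR (ml/min/1.73 m²) into the KDIGO-based
    risk groups used in this cohort (high and very high combined)."""
    if aer_ug_min < 20:                 # normal albumin excretion rate
        if egfr >= 60:
            return "low"
        if egfr >= 45:
            return "moderate"
        return "high/very high"
    if aer_ug_min <= 200:               # moderately increased albuminuria
        return "moderate" if egfr >= 60 else "high/very high"
    return "high/very high"             # severely increased albuminuria, any eGFR

# Example: kdigo_risk(aer_ug_min=15, egfr=52) -> "moderate"
```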

eGDR and the metabolic syndrome

eGDR was estimated by three formulae for adult Caucasians:

Modified Williams formula: the original formula [12], modified for the use of HbA1c instead of HbA1 [2]: eGDR = 24.4 − (12.97 × WHR) − (3.39 × AHT) − (0.60 × HbA1c [%]), where WHR = waist-to-hip ratio and AHT = antihypertensive treatment and/or BP ≥ 140/90 mmHg (yes = 1; no = 0).

Duca formula: the best-fit formula without adiponectin by Duca et al. [10]: eGDR = exp[4.1075 − (0.01299 × waist circumference [cm]) − (1.05819 × daily insulin dose per body weight [IU/kg]) − (0.31327 × triglycerides [mmol/L]) − (0.00802 × diastolic BP [mmHg])].

Januszewski formula [11]: eGDR = 6.6743 + (6.1818 × sex [woman = 0; man = 1]) + (0.0708 × age [years]) + (7.4104 × HDL cholesterol [mmol/L]) − (0.1692 × pulse pressure [mmHg]) − (0.0894 × serum creatinine [µmol/L]).
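The three equations above translate directly into code. The sketch below is a minimal, illustrative Python implementation of the formulae as written; the function and argument names are our own, and units follow the definitions above (waist in cm, triglycerides in mmol/L, creatinine in µmol/L, HbA1c in %).

```python
from math import exp

def egdr_williams(whr: float, aht: int, hba1c_pct: float) -> float:
    """Modified Williams formula (mg/kg/min); aht = 1 if antihypertensive
    treatment and/or BP >= 140/90 mmHg, else 0."""
    return 24.4 - 12.97 * whr - 3.39 * aht - 0.60 * hba1c_pct

def egdr_duca(waist_cm: float, insulin_iu_per_kg: float,
              tg_mmol: float, dbp_mmhg: float) -> float:
    """Duca et al. best-fit formula without adiponectin."""
    return exp(4.1075
               - 0.01299 * waist_cm
               - 1.05819 * insulin_iu_per_kg
               - 0.31327 * tg_mmol
               - 0.00802 * dbp_mmhg)

def egdr_januszewski(is_man: int, age_years: float, hdl_mmol: float,
                     pulse_pressure_mmhg: float, creatinine_umol: float) -> float:
    """Januszewski formula; sex coded woman = 0, man = 1."""
    return (6.6743 + 6.1818 * is_man + 0.0708 * age_years
            + 7.4104 * hdl_mmol
            - 0.1692 * pulse_pressure_mmhg
            - 0.0894 * creatinine_umol)

# Example for one hypothetical participant:
print(egdr_williams(whr=0.85, aht=1, hba1c_pct=8.0))   # about 5.2 mg/kg/min
```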

In this study, insulin resistance was defined as the lowest quartile of eGDR by each formula [21]. In addition, eGDR cut-offs suggested in the literature of four, six, and eight mg/kg/min [18], with values below a cut-off regarded as insulin resistant, were also evaluated in relation to incident CAD. The metabolic syndrome, as a categorical variable, was defined according to the Joint Statement criteria [7].

Coronary artery disease (CAD)

CAD events (N = 539) were identified for all participants until the end of 2020 from the Finnish Care Register for Health Care, Finnish Institute for Health and Welfare, and from the death registry, Statistics Finland. CAD was defined as a first coronary artery disease event of acute myocardial infarction (ICD (International Classification of Diseases)-8/9: 410 and 412; ICD-10: I21–I23), coronary revascularisation (NOMESCO (Nordic Medico-Statistical Committee) Classification of Surgical Procedures codes FNA (Connection to coronary artery from internal mammary artery), FNB (Connection to coronary artery from gastroepiploic artery), FNC (Aorto-coronary venous bypass), FND (Aorto-coronary bypass using prosthetic graft), FNE (Coronary bypass using free arterial graft), FNF (Coronary thromboendarterectomy), FNG (Expansion and recanalisation of coronary artery), TFN40 (Catheterisation of heart with balloon widening of coronary vessels), FN1AT (Endovascular dilatation of coronary arteries), FN1BT (Extensive endovascular dilatation of coronary arteries), FN1YT (Percutaneous insertion of coronary artery stent), and FN2 (Other procedures on coronary arteries)), or CAD as the immediate or underlying cause of death (ICD-9: 410–414; ICD-10: I20–I25). Individuals were followed for at least half a year, until their first CAD event, death, or the end of 2020, for a median of 19.3 (IQR 14.6, 21.4) years, in total 59,501 person-years of follow-up.
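This outcome definition amounts to matching register records against fixed code sets. A hedged Python sketch is shown below; the diagnosis and procedure code sets are copied from the definition above, while the record format (one dictionary per register row with "source" and "code" keys) is purely a hypothetical illustration, not the study's actual data model.

```python
# Code sets from the outcome definition above.
MI_CODES = {"I21", "I22", "I23", "410", "412"}                         # ICD-10 / ICD-8/9 myocardial infarction
CAD_DEATH_CODES = {f"I{n}" for n in range(20, 26)} | {str(n) for n in range(410, 415)}
REVASC_CODES = {"FNA", "FNB", "FNC", "FND", "FNE", "FNF", "FNG",
                "TFN40", "FN1AT", "FN1BT", "FN1YT", "FN2"}             # NOMESCO procedure codes

def is_cad_event(record: dict) -> bool:
    """Flag one register row as a CAD event according to the definition above."""
    code = record["code"].upper()
    if record["source"] == "hospital":        # Care Register for Health Care: diagnoses and procedures
        return any(code.startswith(c) for c in MI_CODES | REVASC_CODES)
    if record["source"] == "death":           # Statistics Finland: immediate or underlying cause of death
        return any(code.startswith(c) for c in CAD_DEATH_CODES)
    return False

# Example: is_cad_event({"source": "hospital", "code": "FN1YT"}) -> True
```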

P-values for normally distributed continuous variables were calculated using a t-test, for non-normally distributed continuous variables a Mann–Whitney test, and for categorical variables a χ² test. P-values comparing more than two groups were calculated using a Kruskal–Wallis test. Correlations between eGDR scores were estimated by Spearman correlation coefficients. Kaplan–Meier survival curves were constructed by quartiles of all eGDR scores and for the presence of the metabolic syndrome. Time-to-event analyses with CAD as the outcome were performed using Cox proportional hazards regression models [22]. The proportional hazards assumption was tested based on Schoenfeld residuals; if it was violated, we restricted the follow-up to 15 years, as the hazards were proportional up to that time. Because the different formulae use different variables to calculate eGDR, we did not adjust the analyses for further covariates, to ensure a fair comparison. Because the cohort was selected to exclude individuals with missing data, no handling of missing information was needed. When eGDR scores were modelled as continuous variables, we tested for linearity using the Wald test and, if indicated, modelled the relationship with cubic splines. Martingale residuals were assessed, and variables were log-transformed when necessary. The performance of the three formulae and all their components in predicting CAD was compared using the Harrell C-index. The analyses were repeated on three subsets of individuals with varying KDIGO status: low, moderate, and those with high or very high status combined into one set. All analyses were performed in R (R Core Team version 4.2.2, Vienna, Austria).
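The analyses were run in R, but the main survival-analysis steps can be sketched in Python as an analogue, for orientation only. The sketch below uses the lifelines, pandas, and scipy libraries on a synthetic stand-in data frame; variable names, sample size, and distributions are invented, and no adjustment or spline modelling is shown.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter, KaplanMeierFitter
from scipy.stats import spearmanr

# Synthetic stand-in for the analysis frame: follow-up time in years,
# CAD event indicator, and one continuous eGDR score.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "time": np.clip(rng.exponential(15, 500), 0.5, 21.0),
    "cad": rng.integers(0, 2, 500),
    "egdr": rng.normal(7, 2, 500),
})

# Spearman correlation between two eGDR scores (second score is synthetic here).
rho, p = spearmanr(df["egdr"], rng.normal(7, 2, 500))

# Unadjusted Cox proportional hazards model with eGDR as the only covariate
# (unadjusted models keep the three formulae directly comparable).
cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="cad")
print(cph.hazard_ratios_)        # HR per one-unit increase in eGDR
print(cph.concordance_index_)    # Harrell C-index used to compare predictors
cph.check_assumptions(df)        # Schoenfeld-residual-based proportionality check

# Kaplan-Meier curves by eGDR quartile (lowest quartile = "insulin resistant").
km = KaplanMeierFitter()
quartiles = pd.qcut(df["egdr"], 4, labels=["Q1", "Q2", "Q3", "Q4"])
for label, grp in df.groupby(quartiles, observed=True):
    km.fit(grp["time"], grp["cad"], label=str(label))
```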

Baseline characteristics

Demographics are provided in Table 1, including a subdivision by subsequent CAD status. Of the 3517 FinnDiane Study participants, 539 experienced a CAD event during the 19-year follow-up. At baseline, those who subsequently developed CAD, compared with those who did not, were older, had a longer diabetes duration, higher BP, higher BMI and waist-height ratio, higher triglyceride, total cholesterol, and LDL cholesterol concentrations, lower HDL cholesterol concentrations, and were more likely to be in the high or very high KDIGO category. Additionally, they were more likely to have the metabolic syndrome, lower eGDR scores, to be on antihypertensive and lipid-lowering drugs, and to have a history of smoking. Insulin pumps were used by 3.5% vs. 6.3% of those who did vs. did not develop CAD, respectively (p = 0.02).

Comparison of insulin resistance by eGDR formulae’s lowest quartile and metabolic syndrome

There was little overlap in the number of insulin-resistant individuals when defined as being in the lowest quartile of each of the eGDR formulae (Additional file 1: Fig. S1). The frequency of the metabolic syndrome among participants considered insulin-resistant, i.e., in the lowest quartile of eGDR by the Williams, Duca, and Januszewski formulae, was 64%, 71%, and 64%, respectively. In comparison, for those in the highest quartile of eGDR, the corresponding frequencies were 17%, 16%, and 17%.

As a continuous score calculated with the three assessed formulae, the eGDRs were significantly (p < 0.001), albeit weakly, correlated: The Spearman correlation coefficient was 0.42 for Williams vs. Duca scores; 0.10 for Williams vs. Januszewski; and 0.10 for Duca vs. Januszewski.

eGDR, the metabolic syndrome, and KDIGO risk categories

Baseline characteristics by KDIGO categories are provided in Additional file 1 : Table S2. Worsening kidney disease was associated with higher rates of CAD, male sex, longer diabetes duration, younger age of diabetes onset, higher HbA 1c concentrations, and worse traditional risk factors, such as adiposity, BP, and lipids. For all three formulae, the eGDR decreased with worsening KDIGO risk category, and, in addition, the percentage of individuals with metabolic syndrome increased with higher KDIGO category.

Associations between baseline eGDR quartiles and subsequent CAD

Kaplan–Meier curves for incidence of CAD (Additional file 1 : Fig. S2) by eGDR quartiles showed increasing separation of curves over follow-up time, with different patterns of spread between formulae, reaching statistical significance for all eGDR formulae: eGDR by Williams and Januszewski, both p < 0.001, and eGDR by Duca, p = 0.015. Similarly, metabolic syndrome status curves separated significantly (p < 0.001) over time, with higher CAD rates in those with vs. without the metabolic syndrome at baseline (Additional file 1 : Fig. S3).

Risk for CAD based on baseline eGDR score and by kidney disease severity

At all the proposed cut-offs for eGDR calculated by the Williams and Januszewski formulae, we found an association with CAD incidence (Table 2). The strength of the associations varied depending on the cut-off but was greater for scores based on the Williams formula at all cut-offs. For the Duca formula, only the lowest quartile of its eGDR was associated with the incidence of CAD. When restricting the follow-up time to a maximum of 15 years, the hazard ratio (HR) for CAD incidence decreased linearly with an increasing eGDR score for all three formulae (Fig. 1A–C), indicating that any increase in eGDR (improvement in insulin sensitivity) was cardioprotective. In Fig. 1D, the HRs are defined per score percentile and therefore allow a direct comparison of the strength of the association for all three formulae. Based on the C-index, the Williams-derived eGDR discriminated individuals with regard to CAD incidence better than, or at least as well as, the other formulae. Only in the lowest score percentiles were the HRs based on Januszewski higher, but their 95% confidence intervals (CI) overlapped with those of the Williams formula; e.g., at the 0.28 percentile the Januszewski-based HR was 9.29 [6.46, 13.36], whereas the Williams-based HR was 7.69 [5.78, 10.24].

figure 1

Cohort-wide hazard ratios for incident coronary artery disease by estimated glucose disposal rate formula: Williams (A), Duca (B), and Januszewski (C). (D) compares all three formulae and shows the hazard ratio per score percentile

Additional file 1 : Figures S4, S5, and S6 show the HR for the 15 years incidence of CAD, separately for each KDIGO category. In all three categories, when using the eGDR score by Williams, the HR for CAD decreased linearly with increasing insulin sensitivity.

Comparison of baseline eGDR scores with other risk factors for subsequent CAD

As shown in Fig. 2 and Additional file 1: Tables S3–S6, age and diabetes duration showed the highest C-index for the association with incident CAD in the whole cohort and in each KDIGO risk category. In the full cohort, the eGDR score by Williams had a higher C-index than the Januszewski score (0.69, 95% CI [0.67, 0.72] vs. 0.62 [0.60, 0.65]) and the Duca score (0.53 [0.51, 0.56]). The same was observed in the low KDIGO category (Additional file 1: Table S4); however, in the moderate (Additional file 1: Table S5) and the pooled high and very high KDIGO categories (Additional file 1: Table S6), there were no significant differences between the eGDR scores from the three formulae, as the 95% CIs overlapped. In the high and very high KDIGO categories, the C-index for Januszewski was nominally higher than that for Williams, but the CIs overlapped (0.58 [0.54, 0.63] vs. 0.57 [0.52, 0.61]).

figure 2

C-indexes for cardiovascular risk factors including estimated glucose disposal rate scores. For all individuals and separately for individuals in Kidney Disease Improving Global Outcomes risk categories low, moderate and high combined with very high

In the present study, we compared the performance of three different eGDR formulae with respect to their ability to predict subsequent incident CAD at different stages of kidney disease severity in people with type 1 diabetes. We found that correlations between the three assessed formulae were weak, in keeping with findings from a cross-sectional clamp study series [11]. Importantly, the eGDR formulae are derived from different individuals, may have had variations in clamp methodology, and include different factors with different weightings. It is also recognised that eGDR likely reflects other processes that modulate insulin sensitivity. However, despite the weak correlations between the eGDR formulae, the lowest eGDR quartile of every formula, representing the individuals with insulin resistance, was associated with a higher risk of CAD. This was also true for the metabolic syndrome, in line with previous studies [23]. The overlap in individuals falling into the lowest eGDR quartile of two or three formulae was low, indicating that the different formulae identify different individuals as having lower insulin sensitivity, which also makes comparison of the formulae's risk prediction performance relevant.

Whilst both measured and estimated insulin sensitivity may vary with kidney disease severity [2, 24], clamp studies usually exclude people with moderate or severe kidney disease, and eGDR formulae specific for type 1 diabetic kidney disease have not been developed, nor have existing formulae been systematically assessed at different stages of kidney disease, although kidney disease is a major predictor of CAD. It is therefore an important and novel finding that the performance of the three eGDR formulae for subsequent CAD events varied by formula and KDIGO risk category.

Nonetheless, the Williams eGDR score was the strongest (based on C-index) predictor of CAD overall. Only the non-modifiable risk factors age and diabetes duration were stronger predictors than eGDR by Williams. Some differences in the Williams, Duca, and Januszewski clamp cohorts could potentially explain the different abilities of their respective eGDR formulae to predict CAD (Additional file 1 : Table S7). The individuals in the Duca cohort were somewhat older compared to the Williams cohort (mean age 45.6 vs. 35.9 years), but the Williams cohort had a higher HbA1c (9.5%) compared both to the Duca cohort (7.6%) as well as the Januszewski cohort (7.7%). As both age and glycaemic control are independent and important risk factors for CAD, the cohorts are at different risks of developing CAD to begin with, which might also influence the models’ different abilities to predict CAD. It is, however, worth noting that although the characteristics differ between the FinnDiane participants and the Williams cohort, the eGDR by Williams was a strong predictor of CAD.

Particularly with the increasing global incidence and prevalence of type 1 diabetes [25], and of obesity in people with type 1 diabetes, eGDR formulae may be useful as a surrogate endpoint in clinical research and in clinical practice. In the full cohort, the Williams eGDR score ranked higher than its components, although systolic BP and pulse pressure predicted CAD equally well. In the moderate KDIGO category, the components of the Williams formula performed equally well, and waist-height ratio performed nominally better than the Williams score. In the low KDIGO category, pulse pressure, a component of the Januszewski formula, outperformed that score, and in all other settings pulse pressure performed at least equally well. In addition, the Januszewski formula includes serum creatinine, which possibly explains its performance in the high and very high KDIGO cohort. In the overall cohort, the continuous Williams eGDR score was a better predictor of CAD than the dichotomous metabolic syndrome variable. Of note, the waist-height ratio, a marker of central fatness linked to insulin sensitivity, performed similarly to the presence of the metabolic syndrome for the prediction of CAD in the entire cohort and in all KDIGO categories. This finding highlights the relevance of central obesity in the metabolic syndrome and is aligned with a previous publication showing that the waist-height ratio is associated with visceral fat mass independent of sex and kidney disease status in adults with type 1 diabetes [26].

Importantly, only the eGDR by Williams showed a linear association with CAD risk in all (sub)analyses. Therefore, the Williams-based eGDR would lend itself well to evaluating the effects of interventions that may improve insulin sensitivity and might be a better choice than a categorical variable such as the metabolic syndrome. Potential interventions include weight loss, exercise and muscle gain, insulin-sensitising drugs (e.g., metformin, sodium-glucose cotransporter-2 (SGLT2) inhibitors, and incretin-based drugs), and the use of insulin pumps. There are past and ongoing trials of adjunct therapy in people with type 1 diabetes, and large real-world databases, which could be used to test the utility of eGDR scores in clinical trials and in clinical practice.

Study strengths include the large, observational 'real-world' FinnDiane cohort with detailed characterisation of participants, long and ongoing follow-up, moderately high rates of CAD, and a wide range of kidney status. A limitation is that not all existing formulae for estimating insulin sensitivity have been evaluated [27]. Importantly, however, the eGDR formulae are derived from a small number of clamp studies. We included one formula each from the clamp studies by Williams, Duca, and Januszewski, and excluded clamp studies in youth and in non-Caucasian populations. Furthermore, this small number of clamp studies included only a few people with kidney disease and was performed in different populations, limiting the generalisability of eGDR formulae to other ages, ethnicities, and body habitus. The original formula by Duca (and another formula by Januszewski) included serum adiponectin, which is strongly associated with insulin sensitivity. Due to limited data on adiponectin, we selected the best-fit formula without adiponectin for this study, which might have contributed to the weaker results observed for the Duca eGDR formula. There is uncertainty as to the effects of different lifestyles, drugs, and insulin delivery modalities on eGDR. Furthermore, CAD may be silent in people with diabetes, which, if anything, would dilute our findings.

Scoring adults with type 1 diabetes with the three formulae to estimate insulin sensitivity matters, as the lowest quartile of each score was associated with CAD. While some individual components of the eGDR formulae performed better than the eGDR score in predicting incident CAD, an eGDR score provides insight into insulin sensitivity beyond CAD risk estimation and offers a broader risk score that could succinctly evaluate treatment or lifestyle interventions. Notably, the strength of the association varies by formula and kidney disease status. As a continuous measure to assess CAD risk, the Williams eGDR score appears particularly promising because its linear association is independent of kidney disease subclass.

Availability of data and materials

The datasets are not publicly available due to the consent provided by the participant at the time of data collection. The data access, which is subject to local regulations, can be obtained upon reasonable request by contacting: Maaria Puupponen (email: [email protected]), Research Program Coordinator, Clinical and Molecular Metabolism (CAMM), University of Helsinki. Upon approval, analysis needs to be performed on a user-specific local server (with protected access) and requires the applicant to sign non-disclosure and secrecy agreements.

Abbreviations

  • AER: Albumin excretion rate
  • AHT: Antihypertensive treatment
  • BMI: Body mass index
  • BP: Blood pressure
  • CAD: Coronary artery disease
  • CI: Confidence interval
  • C-index: Concordance index
  • eGFR: Estimated glomerular filtration rate
  • eGDR: Estimated glucose disposal rate
  • FN1AT: Endovascular dilatation of coronary arteries
  • FN1BT: Extensive endovascular dilatation of coronary arteries
  • FN1YT: Percutaneous insertion of coronary artery stent
  • FN2: Other procedures on coronary arteries
  • FNA: Connection to coronary artery from internal mammary artery
  • FNB: Connection to coronary artery from gastroepiploic artery
  • FNC: Aorto-coronary venous bypass
  • FND: Aorto-coronary bypass using prosthetic graft
  • FNE: Coronary bypass using free arterial graft
  • FNF: Coronary thromboendarterectomy
  • FNG: Expansion and recanalisation of coronary artery
  • HbA1c: Glycated haemoglobin
  • HDL: High density lipoprotein
  • HR: Hazard ratio
  • ICD: International classification of diseases
  • IQR: Interquartile range
  • LDL: Low density lipoprotein
  • NOMESCO: Nordic Medico-Statistical Committee
  • KDIGO: Kidney Disease: Improving Global Outcomes
  • SGLT2: Sodium-glucose cotransporter-2
  • TFN40: Catheterisation of heart with balloon widening of coronary vessels
  • WHR: Waist-to-hip ratio

Donga E, Dekkers OM, Corssmit EP, Romijn JA. Insulin resistance in patients with type 1 diabetes assessed by glucose clamp studies: systematic review and meta-analysis. Eur J Endocrinol. 2015;173:101–9.


Thorn LM, Forsblom C, Fagerudd J, Thomas MC, Pettersson-Fernholm K, Saraheimo M, Waden J, Ronnback M, Rosengard-Barlund M, Bjorkesten CG, Taskinen MR, Groop PH, FinnDiane Study G. Metabolic syndrome in type 1 diabetes: association with diabetic nephropathy and glycemic control (the FinnDiane study). Diabetes Care. 2005;28:2019–24.


Ekstrand AV, Groop PH, Gronhagen-Riska C. Insulin resistance precedes microalbuminuria in patients with insulin-dependent diabetes mellitus. Nephrol Dial Transplant. 1998;13:3079–83.

Linn W, Persson M, Rathsman B, Ludvigsson J, Lind M, Andersson Franko M, Nystrom T. Estimated glucose disposal rate is associated with retinopathy and kidney disease in young people with type 1 diabetes: a nationwide observational study. Cardiovasc Diabetol. 2023;22:61.


Martin FI, Hopper JL. The relationship of acute insulin sensitivity to the progression of vascular disease in long-term type 1 (insulin-dependent) diabetes mellitus. Diabetologia. 1987;30:149–53.

Pambianco G, Costacou T, Orchard TJ. The prediction of major outcomes of type 1 diabetes: a 12-year prospective evaluation of three separate definitions of the metabolic syndrome and their components and estimated glucose disposal rate: the Pittsburgh Epidemiology of Diabetes Complications Study experience. Diabetes Care. 2007;30:1248–54.

Alberti KG, Eckel RH, Grundy SM, Zimmet PZ, Cleeman JI, Donato KA, Fruchart JC, James WP, Loria CM, Smith SC Jr; International Diabetes Federation Task Force on Epidemiology and Prevention; National Heart, Lung, and Blood Institute; American Heart Association; World Heart Federation; International Atherosclerosis Society; International Association for the Study of Obesity. Harmonizing the metabolic syndrome: a joint interim statement of the International Diabetes Federation Task Force on Epidemiology and Prevention; National Heart, Lung, and Blood Institute; American Heart Association; World Heart Federation; International Atherosclerosis Society; and International Association for the Study of Obesity. Circulation. 2009;120:1640–5.

DeFronzo RA, Tobin JD, Andres R. Glucose clamp technique: a method for quantifying insulin secretion and resistance. Am J Physiol. 1979;237:E214-223.


Dabelea D, D’Agostino RB Jr, Mason CC, West N, Hamman RF, Mayer-Davis EJ, Maahs D, Klingensmith G, Knowler WC, Nadeau K. Development, validation and use of an insulin sensitivity score in youths with diabetes: the SEARCH for Diabetes in Youth study. Diabetologia. 2011;54:78–86.

Duca LM, Maahs DM, Schauer IE, Bergman BC, Nadeau KJ, Bjornstad P, Rewers M, Snell-Bergeon JK. Development and validation of a method to estimate insulin sensitivity in patients with and without type 1 diabetes. J Clin Endocrinol Metab. 2016;101:686–95.

Januszewski AS, Sachithanandan N, Ward G, Karschimkus CS, O’Neal DN, Jenkins AJ. Estimated insulin sensitivity in Type 1 diabetes adults using clinical and research biomarkers. Diabetes Res Clin Pract. 2020;167: 108359.

Williams KV, Erbey JR, Becker D, Arslanian S, Orchard TJ. Can clinical factors estimate insulin resistance in type 1 diabetes? Diabetes. 2000;49:626–32.

Zheng X, Huang B, Luo S, Yang D, Bao W, Li J, Yao B, Weng J, Yan J. A new model to estimate insulin resistance via clinical parameters in adults with type 1 diabetes. Diabetes Metab Res Rev. 2017;33: e2880.


Uruska A, Zozulinska-Ziolkiewicz D, Niedzwiecki P, Pietrzak M, Wierusz-Wysocka B. TG/HDL-C ratio and visceral adiposity index may be useful in assessment of insulin resistance in adults with type 1 diabetes in clinical practice. J Clin Lipidol. 2018;12:734–40.

Kilpatrick ES, Rigby AS, Atkin SL. Insulin resistance, the metabolic syndrome, and complication risk in type 1 diabetes: “double diabetes” in the Diabetes Control and Complications Trial. Diabetes Care. 2007;30:707–12.

Orchard TJ, Chang YF, Ferrell RE, Petro N, Ellis DE. Nephropathy in type 1 diabetes: a manifestation of insulin resistance and multiple genetic susceptibilities? Further evidence from the Pittsburgh Epidemiology of Diabetes Complication Study. Kidney Int. 2002;62:963–70.

Orchard TJ, Olson JC, Erbey JR, Williams K, Forrest KY, Smithline Kinder L, Ellis D, Becker DJ. Insulin resistance-related factors, but not glycemia, predict coronary artery disease in type 1 diabetes: 10-year follow-up data from the Pittsburgh Epidemiology of Diabetes Complications Study. Diabetes Care. 2003;26:1374–9.

Nystrom T, Holzmann MJ, Eliasson B, Svensson AM, Sartipy U. Estimated glucose disposal rate predicts mortality in adults with type 1 diabetes. Diabetes Obes Metab. 2018;20:556–63.

Levey AS, Stevens LA, Schmid CH, Zhang YL, Castro AF 3rd, Feldman HI, Kusek JW, Eggers P, Van Lente F, Greene T, Coresh J, Ckd EPI. A new equation to estimate glomerular filtration rate. Ann Intern Med. 2009;150:604–12.


Levin A, Stevens PE. Summary of KDIGO 2012 CKD Guideline: behind the scenes, need for guidance, and a framework for moving forward. Kidney Int. 2014;85:49–61.

Alberti KG, Zimmet PZ. Definition, diagnosis and classification of diabetes mellitus and its complications. Part 1: diagnosis and classification of diabetes mellitus provisional report of a WHO consultation. Diabet Med. 1998;15:539–53.

Therneau TM, Grambsch PM. Modeling survival data: extending the cox model. New York: Springer; 2000.


Thorn LM, Forsblom C, Waden J, Saraheimo M, Tolonen N, Hietala K, Groop PH, Finnish Diabetic Nephropathy Study G. Metabolic syndrome as a risk factor for cardiovascular disease, mortality, and progression of diabetic nephropathy in type 1 diabetes. Diabetes Care. 2009;32:950–2.

Yip J, Mattock MB, Morocutti A, Sethi M, Trevisan R, Viberti G. Insulin resistance in insulin-dependent diabetic patients with microalbuminuria. Lancet. 1993;342:883–7.

Gregory GA, Robinson TIG, Linklater SE, Wang F, Colagiuri S, de Beaufort C, Donaghue KC, International Diabetes Federation Diabetes Atlas Type 1 Diabetes in Adults Special Interest G, Magliano DJ, Maniam J, Orchard TJ, Rai P, Ogle GD. Global incidence, prevalence, and mortality of type 1 diabetes in 2021 with projection to 2040: a modelling study. Lancet Diabetes Endocrinol. 2022;10:741–60.

Parente EB, Mutter S, Harjutsalo V, Ahola AJ, Forsblom C, Groop PH. Waist-height ratio and waist are the best estimators of visceral fat in type 1 diabetes. Sci Rep. 2020;10:18575.

Januszewski AS, Jenkins AJ. Assessing insulin sensitivity in people with type 1 diabetes without euglycemic-hyperinsulinemic clamps. In: Patel VB, Preedy VR, editors. Biomarkers in diabetes biomarkers in disease methods discoveries and applications. Cham: Springer; 2022.



Acknowledgements

The authors are indebted to the late Carol Forsblom (1964–2022), the international coordinator of the FinnDiane Study Group, for his considerable contribution throughout the years and for this specific study. He took part in designing the study, prepared the dataset, and performed some of the preliminary analyses. We also acknowledge A. Sandelin, and K. Uljala (Folkhälsan Research Center, Helsinki, Finland) for their technical assistance, as well as all the physicians and nurses at each center participating in the collection of the study population (Additional file 1 : Table S1).

This research was funded by grants from Folkhälsan Research Foundation, Wilhelm and Else Stockmann Foundation, Liv och Hälsa Society, Sigrid Juselius Foundation, Finska Läkaresällskapet (Medical Society of Finland), Diabetes Research Foundation, and State funding for university-level health research by Helsinki University Hospital (TYH2023403). Open Access funding was provided by the Helsinki University Library.

Author information

Alicia J. Jenkins and Lena M. Thorn contributed equally to this work.

Authors and Affiliations

Folkhälsan Institute of Genetics, Folkhälsan Research Center, Biomedicum Helsinki, Haartmaninkatu 8, 00290, Helsinki, Finland

Stefan Mutter, Erika B. Parente, Johan R. Simonsen, Valma Harjutsalo, Per-Henrik Groop & Lena M. Thorn

Department of Nephrology, University of Helsinki and Helsinki University Hospital, Haartmaninkatu 4, 00290, Helsinki, Finland

Stefan Mutter, Erika B. Parente, Johan R. Simonsen, Valma Harjutsalo & Per-Henrik Groop

Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Haartmaninkatu 8, 00290, Helsinki, Finland

Sydney Pharmacy School, University of Sydney, A15, Science Rd, Camperdown, NSW, 2050, Australia

Andrzej S. Januszewski

NHMRC Clinical Trials Centre, University of Sydney, K25, Parramatta Rd, Camperdown, NSW, 2050, Australia

Andrzej S. Januszewski & Alicia J. Jenkins

Department of Diabetes, Central Clinical School, Monash University, The Alfred Centre, 99 Commercial Rd, Melbourne, VIC, 3004, Australia

Per-Henrik Groop

Baker Heart and Diabetes Institute, 75 Commercial Rd, Melbourne, VIC, 3004, Australia

Per-Henrik Groop & Alicia J. Jenkins

Department of General Practice and Primary Health Care, University of Helsinki and Helsinki University Hospital, Biomedicum 2, Tukholmankatu 8, 00290, Helsinki, Finland

Lena M. Thorn


Contributions

S.M., E.B.P., A.S.J., P.H.G., A.J.J., and L.M.T. were responsible for the study design. E.B.P., R.S., V.H., P.H.G., and L.M.T. were responsible for the acquisition of the clinical data. S.M. was responsible for the statistical analyses. S.M., A.S.J., A.J.J., and L.M.T. were responsible for the preparation of the first draft of the manuscript. All authors interpreted the results and contributed to the critical revision of the manuscript. All authors reviewed the manuscript and approved the final version. P.H.G. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Corresponding author

Correspondence to Per-Henrik Groop .

Ethics declarations

Ethics approval and consent to participate.

The study protocol was approved by the Ethics Committee at Helsinki and Uusimaa Hospital District. Each participant gave written informed consent, and the study was conducted in accordance with the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

S.M. reports receiving lecture honoraria from Encore Medical Education. E.B.P. reports receiving lecture honorariums from Sanofi and Astra Zeneca. P.H.G. reports receiving lecture honorariums from Astellas, AstraZeneca, Bayer, Boehringer Ingelheim, Eli Lilly, Elo Water, Medscape, MSD, Mundipharma, Novo Nordisk, PeerVoice, Sanofi, Sciarc, and being an advisory board member of AbbVie, Astellas, AstraZeneca, Bayer, Boehringer Ingelheim, Eli Lilly, Medscape, MSD, Mundipharma, Nestlé, Novo Nordisk, and Sanofi. All other authors declare no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:.

Table S1. Physicians and nurses at each of the FinnDiane centres participating in patient recruitment and characterisation. Table S2. Baseline clinical characteristics according to the Kidney Disease Improving Global Outcomes (KDIGO) risk categories. Table S3. C-indexes and 95% confidence intervals (CI) with regards to coronary artery disease (CAD) for three estimated glucose disposal rate (eGDR) formulae, the metabolic syndrome and their components for the full cohort. Table S4. C-indexes and 95% confidence intervals (CI) with regards to coronary artery disease (CAD) for three estimated glucose disposal rate (eGDR) formulae, the metabolic syndrome and their components for individuals in Kidney Disease Improving Global Outcomes (KDIGO) category low. Table S5. C-indexes and 95% confidence intervals (CI) with regards to coronary artery disease (CAD) for three estimated glucose disposal rate (eGDR) formulae, the metabolic syndrome and their components for individuals in Kidney Disease Improving Global Outcomes (KDIGO) category moderate. Table S6. C-indexes and 95% confidence intervals (CI) with regards to coronary artery disease (CAD) for three estimated glucose disposal rate (eGDR) formulae, the metabolic syndrome and their components for individuals in Kidney Disease Improving Global Outcomes (KDIGO) categories high and very high. Table S7. Comparison of the FinnDiane study participants to those in the clamp studies. Figure S1. A Venn diagram for insulin resistance defined as those individuals ranked in the lowest quartile of each estimated glucose disposal rate (eGDR) score. Figure S2. Kaplan–Meier curves for subsequent coronary artery disease (CAD) based on baseline estimated glucose disposal rate (eGDR) quartiles for (A) eGDR by Williams, (B) eGDR by Duca, and (C) eGDR by Januszewski. Figure S3. Kaplan–Meier curves for subsequent coronary artery disease (CAD) based on metabolic syndrome (yes vs. no). Figure S4. Hazard ratio plot for coronary artery disease according to different eGDR formulae in individuals at KDIGO category low. Figure S5. Hazard ratio plot for coronary artery disease according to different eGDR formulae in individuals at KDIGO category moderate. Figure S6. Hazard ratio plot for coronary artery disease according to different eGDR formulae in individuals at KDIGO category high & very high.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Mutter, S., Parente, E.B., Januszewski, A.S. et al. Insulin sensitivity estimates and their longitudinal association with coronary artery disease in type 1 diabetes. Does it matter?. Cardiovasc Diabetol 23 , 152 (2024). https://doi.org/10.1186/s12933-024-02234-x


Received : 06 February 2024

Accepted : 11 April 2024

Published : 03 May 2024

DOI : https://doi.org/10.1186/s12933-024-02234-x


  • Insulin resistance
  • Type 1 diabetes mellitus
  • Cardiovascular diseases
  • Kidney disease


Longitudinal ultrasound-based AI model predicts axillary lymph node response to neoadjuvant chemotherapy in breast cancer: a multicenter study

  • Open access
  • Published: 10 May 2024


  • Ying Fu 1   na1 ,
  • Yu-Tao Lei 2   na1 ,
  • Yu-Hong Huang 3 ,
  • Fang Mei 4 ,
  • Song Wang 5 ,
  • Kun Yan 5 ,
  • Yi-Hua Wang 6 ,
  • Yi-Han Ma 1 &
  • Li-Gang Cui   ORCID: orcid.org/0000-0002-0717-7445 1  


To develop a deep learning radiomics model from longitudinal breast ultrasound images and the sonographer's axillary ultrasound diagnosis for predicting axillary lymph node (ALN) response to neoadjuvant chemotherapy (NAC) in breast cancer.

Breast cancer patients undergoing NAC followed by surgery were recruited from three centers between November 2016 and December 2022. We collected ultrasound images for extracting tumor-derived radiomics and deep learning features, and selected quantitative features through several selection methods. Two machine learning models based on random forests were developed using pre-NAC and post-NAC features, respectively. A support vector machine integrated these data into a fusion model, which was evaluated via the area under the curve (AUC), decision curve analysis, and calibration curves. We compared the fusion model's performance against the sonographers' diagnoses from pre-NAC and post-NAC axillary ultrasonography, referencing histological outcomes from sentinel lymph node biopsy or axillary lymph node dissection.

In the validation cohort, the fusion model outperformed both the pre-NAC (AUC: 0.899 vs. 0.786, p < 0.001) and post-NAC models (AUC: 0.899 vs. 0.853, p = 0.014), as well as the sonographers' diagnosis of ALN status on pre-NAC and post-NAC axillary ultrasonography (AUC: 0.899 vs. 0.719, p < 0.001). Decision curve analysis revealed patient benefits from the fusion model across threshold probabilities from 0.02 to 0.98. The model also enhanced the sonographers' diagnostic ability, increasing their accuracy from 71.9% to 79.2%.

The deep learning radiomics model accurately predicted the ALN response to NAC in breast cancer. Furthermore, the model can assist sonographers in improving their assessment of ALN status before surgery.

Clinical relevance statement

Our AI model based on pre- and post-neoadjuvant chemotherapy ultrasound can accurately predict axillary lymph node metastasis and assist sonographer’s axillary diagnosis.

Axillary lymph node metastasis status affects the choice of surgical treatment and currently relies on subjective ultrasound assessment.

Our AI model outperformed the sonographers' visual diagnosis on axillary ultrasound.

Our deep learning radiomics model can improve sonographers' diagnoses and might assist in surgical decision-making.

Introduction

Neoadjuvant chemotherapy (NAC) is increasingly used for breast cancer with clinically positive axillary lymph nodes (ALN) [1, 2], necessitating accurate ALN response assessment for an optimal post-NAC axillary surgical strategy [3]. While axillary lymph node dissection (ALND) remains the standard for clinical node-positive (cN+) breast cancer, NAC effectively eliminates ALN metastasis in 40–75% of cases [4]. Accurately predicting ALN response to NAC can markedly reduce unnecessary axillary surgeries and their associated risks, such as lymphedema and infection; some patients undergo axillary surgery despite having no residual ALN metastasis [5].

Mammography, magnetic resonance imaging (MRI), and ultrasonography (US) are widely used to stage and monitor breast cancer during NAC treatment [6]. Radiomics is effective in cancer diagnosis, treatment evaluation, ALN metastasis detection, phenotype characterization, and prognosis prediction [7, 8, 9, 10, 11, 12]. Deep learning offers automated, enhanced imaging feature analysis compared with traditional radiomics, and transfer learning has been explored for feature extraction in small medical datasets. Recent evidence suggests that deep learning radiomics (DLR) from preoperative US can predict early-stage breast cancer's ALN status with high sensitivity and negative predictive value [13]. A study also found that a longitudinal MRI-based DLR model could accurately predict the pathological complete response of breast cancer to NAC, indicating that longitudinal medical images can capture more quantitative information during NAC [14]. Based on these findings, we hypothesized that a DLR model using pre-NAC and post-NAC US images can more effectively predict ALN response.

Few studies have trained and validated a multimodal DLR model that uses both pre-NAC and post-NAC ultrasound images to predict ALN response in breast cancer. Prior research has not compared artificial intelligence (AI) models with sonographers’ visual diagnosis on pre-NAC and post-NAC axillary ultrasound images. Our study focuses on comparing the DLR model’s predictive performance against sonographers, validating the model with independent external datasets, and assessing the AI model’s potential to improve sonographers’ diagnostic ability in axillary diagnosis on ultrasound images.

Materials and methods

The study received ethical approval from the Ethics Committees of the Peking University Third Hospital, Guangdong Provincial People’s Hospital, and Peking University Cancer Hospital. Due to the retrospective nature of the study, patient informed consent was waived. From November 2016 to December 2022, 669 patients who underwent NAC followed by surgery were enrolled from three hospitals. The inclusion criteria were: (i) cN+ breast cancer treated with standard NAC; (ii) complete pre-NAC and post-NAC ultrasound scans; (iii) ALN staging via sentinel lymph node biopsy (SLNB) or ALND; and (iv) complete baseline data. The exclusion criteria were: (i) prior breast cancer treatment (n = 43), (ii) other malignancies or distant metastasis (n = 28), (iii) bilateral breast cancer (n = 18), (iv) inadequate or poor-quality US images (n = 39), and (v) missing clinicopathological data (n = 44). Patients from hospitals I and II comprised the training cohort (n = 216), whereas patients from hospital III comprised the independent validation cohort (n = 281). Figure 1 shows the study workflow.

figure 1

The design of the workflow for this study. The construction of the deep learning radiomics model involves the following steps: acquisition of original images, manual segmentation, feature extraction, feature selection, and model construction. Sonographers made the first decision on axillary lymph node status using the pre-NAC and post-NAC axillary ultrasound. After a one-month interval, a second decision was provided for the same images with the assistance of an artificial intelligence model. The pathological results of the axillary lymph nodes were regarded as the gold standard. NAC, neoadjuvant chemotherapy; US, ultrasound; AI, artificial intelligence; ROI, region of interest; ALN, axillary lymph node; LASSO, least absolute shrinkage and selection operator

NAC regimen and histological assessment

All patients underwent 6 or 8 cycles of NAC, using either taxane alone or in combination with anthracycline, with human epidermal growth factor receptor-2 (HER2)-positive patients also receiving anti-HER2 therapy. Surgery followed 2–3 weeks after NAC. ALN response to NAC was assessed histologically via SLNB or ALND, defining ALN metastasis as invasive tumor presence in any lymph node. Immunohistochemistry (IHC) determined HER2, hormone receptor (HR), and Ki-67 status: estrogen receptor (ER) and progesterone receptor (PR) were considered positive if more than 1% of cells stained, while Ki-67 expression was defined as high or low using a 20% cutoff [15]. HER2 status was based on IHC scores (0 or 1+ as negative, 3+ as positive) or fluorescence in situ hybridization for IHC 2+ cases [16]. Baseline data, including age, menstrual status, and clinical T and N stages, were recorded. Breast cancer was classified into HR+/HER2-, HER2+, and TNBC subtypes based on molecular receptor expression.
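To make the receptor-status rules concrete, the minimal sketch below encodes the cutoffs stated above (ER/PR positive if more than 1% of cells stain, Ki-67 dichotomized at 20%, HER2 positivity taking precedence when assigning subtypes). The function names, inputs, and the inclusive Ki-67 boundary are illustrative assumptions rather than the authors' code.

```python
# Hypothetical helpers encoding the classification rules described in the text.
def molecular_subtype(er_pct: float, pr_pct: float, her2_positive: bool) -> str:
    """Assign HR+/HER2-, HER2+, or TNBC from receptor status (illustrative only)."""
    hr_positive = er_pct > 1.0 or pr_pct > 1.0   # ER/PR positive if >1% of cells stain
    if her2_positive:
        return "HER2+"
    return "HR+/HER2-" if hr_positive else "TNBC"

def ki67_category(ki67_pct: float) -> str:
    """Dichotomize Ki-67 at the 20% cutoff used in the study (boundary assumed inclusive)."""
    return "high" if ki67_pct >= 20.0 else "low"

print(molecular_subtype(er_pct=30.0, pr_pct=0.0, her2_positive=False))  # -> "HR+/HER2-"
```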

Ultrasound examination

All patients underwent pre- and post-NAC ultrasound examinations conducted two weeks before and after NAC treatment. Ultrasound images were obtained using Esaote (MyLab Twice), Siemens (S3000), or Philips (EPIQ5) ultrasound scanners equipped with 7- to 15-MHz linear transducers (see Supporting Material-II, Ultrasound Examinations). Two sonographers performed ultrasound examinations at hospital I, one at hospital II, and two at hospital III. Each sonographer had more than eight years of experience in breast ultrasound imaging. Before NAC, patients underwent breast ultrasound scans and core-needle biopsy, and the ultrasound images containing the largest diameter of the mass were selected for further analysis. A total of 2585 ultrasound images from 497 patients were collected and analyzed, encompassing both pre-NAC and post-NAC images.

Tumor segmentation and image preprocessing

Pre-NAC and post-NAC US images were imported into 3D Slicer software (version 4.10.1) for manual tumor delineation. Two experienced sonographers (6 and 8 years in breast cancer ultrasound), blinded to histological results, segmented the tumor regions of interest (ROI), encompassing the entire tumor but excluding blood vessels, adipose tissue, and normal breast tissue. Each ultrasound image had its tumor ROI delineated. For the radiomics process, US images were resampled to a uniform voxel size of 1 × 1 mm. For the deep learning process, US images covering the entire tumor area were resized to 448 × 448 pixels and grayscale-normalized to the 0–1000 range for uniform feature extraction.
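A minimal sketch of the two preprocessing paths described above follows, assuming SimpleITK and OpenCV are available. The 1 × 1 mm resampling and the 0–1000 rescaling mirror the text, while the function names, the min–max scaling choice, and the interpolation settings are assumptions.

```python
import cv2                     # OpenCV, used here for resizing
import numpy as np
import SimpleITK as sitk       # spacing-aware resampling for the radiomics path

def resample_to_1mm(image: sitk.Image) -> sitk.Image:
    """Radiomics path: resample a 2D ultrasound image to 1 x 1 mm pixel spacing."""
    new_spacing = (1.0, 1.0)
    new_size = [int(round(sz * sp / ns))
                for sz, sp, ns in zip(image.GetSize(), image.GetSpacing(), new_spacing)]
    return sitk.Resample(image, new_size, sitk.Transform(), sitk.sitkLinear,
                         image.GetOrigin(), new_spacing, image.GetDirection())

def prepare_for_cnn(gray: np.ndarray) -> np.ndarray:
    """Deep learning path: resize to 448 x 448 and rescale intensities to 0-1000."""
    resized = cv2.resize(gray.astype(np.float32), (448, 448), interpolation=cv2.INTER_LINEAR)
    lo, hi = float(resized.min()), float(resized.max())
    return (resized - lo) / max(hi - lo, 1e-8) * 1000.0
```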

Feature extraction and selection

Feature extraction and selection were conducted on pre-NAC and post-NAC ultrasound images using pyradiomics software (version 3.3.0), extracting 2446 radiomics features (1223 each from the pre-NAC and post-NAC images; see Supporting Material-I, Feature Extraction). These included shape-based, first-order statistical, texture-based, and wavelet-derived features. ComBat harmonization minimized bias from different scanners across hospitals. For deep learning, all resized images were input into the deep convolutional neural network VGG16, pretrained on the large-scale ImageNet database ( https://www.image-net.org/ ), and transfer features were extracted from the fully connected layers (see Supporting Material-III, Basic Principles of Deep Learning and Neural Network). This yielded 1223 pre-NAC and 1223 post-NAC radiomic features, plus 2048 deep learning features each for pre-NAC and post-NAC.
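The sketch below, assuming pyradiomics and torchvision are installed, illustrates the two feature streams described above: handcrafted radiomics features from an image–mask pair and transfer-learning features read from VGG16's fully connected layers. The file names and placeholder tensor are hypothetical, and the study reports 2048 deep features per time point whereas this sketch returns the 4096-dimensional penultimate layer, so it is a generic reconstruction rather than the authors' pipeline.

```python
import torch
from radiomics import featureextractor   # pyradiomics
from torchvision import models

# Handcrafted radiomics features from a 2D ultrasound image/mask pair (paths are hypothetical).
extractor = featureextractor.RadiomicsFeatureExtractor(force2D=True)
radiomic_features = extractor.execute("tumor_us.nrrd", "tumor_roi.nrrd")

# Transfer-learning features from the fully connected layers of VGG16.
vgg = models.vgg16(weights=None).eval()    # ImageNet-pretrained weights would be loaded in practice
fc_head = torch.nn.Sequential(*list(vgg.classifier.children())[:-1])   # drop the 1000-way output layer

image = torch.rand(1, 3, 448, 448)         # placeholder for a resized, normalized US image
with torch.no_grad():
    feats = vgg.avgpool(vgg.features(image))           # conv feature maps pooled to 7 x 7
    deep_features = fc_head(torch.flatten(feats, 1))   # (1, 4096) transfer features
```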

Feature values were standardized using z -score normalization. In the training cohort, feature selection involved the Mann–Whitney U test to identify features significantly associated with ALN response to NAC ( p  < 0.05). The Least Absolute Shrinkage and Selection Operator (LASSO) was used to eliminate features with zero coefficients. To reduce feature correlation, Spearman analysis removed one feature from highly correlated pairs (correlation coefficient > 0.8), based on their diagnostic performance.
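The following sketch, assuming scikit-learn and SciPy, shows the selection steps in the order described: z-score scaling, a Mann–Whitney U filter at p < 0.05, LASSO to drop zero-coefficient features, and Spearman-based redundancy pruning. The function name is hypothetical, and "diagnostic performance" is approximated here by the univariate p-value because the paper does not specify the exact criterion.

```python
import numpy as np
from scipy.stats import mannwhitneyu, spearmanr
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

def select_features(X: np.ndarray, y: np.ndarray, corr_thresh: float = 0.8) -> list:
    """Return indices of selected features; X is (samples, features), y is 0/1 ALN response."""
    X = StandardScaler().fit_transform(X)                       # z-score normalization

    # 1) Univariate filter: keep features differing between ALN+ and ALN- (p < 0.05).
    pvals = np.array([mannwhitneyu(X[y == 1, j], X[y == 0, j]).pvalue
                      for j in range(X.shape[1])])
    keep = np.where(pvals < 0.05)[0]

    # 2) LASSO: discard features whose coefficients shrink to zero.
    lasso = LassoCV(cv=5, random_state=0).fit(X[:, keep], y)
    keep = keep[lasso.coef_ != 0]

    # 3) Redundancy pruning: among pairs with |Spearman rho| > 0.8, keep the better feature
    #    (here, the one with the smaller univariate p-value).
    selected = []
    for j in sorted(keep, key=lambda idx: pvals[idx]):
        if all(abs(spearmanr(X[:, j], X[:, k])[0]) <= corr_thresh for k in selected):
            selected.append(j)
    return selected
```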

Model construction and integration

To optimize the DLR model for ALN response prediction after NAC, we tuned its hyperparameters to enhance performance and applied early stopping to prevent overfitting and preserve generalizability. We used 30% of the training cohort to assess the VGG16 model’s performance, measured by the area under the curve (AUC), and stopped training if performance did not improve over ten consecutive calculation cycles. To further refine the model, significant conventional ultrasound features, such as tumor size, echo type, and blood flow signal, were integrated into the fully connected layer, increasing the neuron count. Two predictive models (pre-NAC and post-NAC) were built using a random forest algorithm, generating two DLR signatures. A support vector machine (SVM) model then combined the pre-NAC and post-NAC radiomics and deep learning features. Integrating these temporally distinct features enables a more comprehensive analysis and enhances the machine learning model’s predictive power. The SVM model was designed to accurately predict ALN metastasis in breast cancer patients following NAC.
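As a rough illustration of the fusion step, the sketch below trains one random forest per time point and feeds their predicted probabilities (standing in for the two DLR signatures) into an SVM, using synthetic placeholder arrays sized like the reported cohorts and selected feature counts. The hyperparameters, the use of in-sample probabilities as signatures, and the synthetic data are assumptions, and the VGG16 early stopping described above is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Synthetic placeholders shaped like the study: 216 training / 281 validation patients,
# six selected pre-NAC and eight selected post-NAC features.
rng = np.random.default_rng(0)
X_pre_tr, X_post_tr = rng.normal(size=(216, 6)), rng.normal(size=(216, 8))
X_pre_va, X_post_va = rng.normal(size=(281, 6)), rng.normal(size=(281, 8))
y_tr = rng.integers(0, 2, 216)

# One random forest per time point produces a DLR signature (ALN+ probability).
rf_pre = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_pre_tr, y_tr)
rf_post = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_post_tr, y_tr)

sig_tr = np.column_stack([rf_pre.predict_proba(X_pre_tr)[:, 1],
                          rf_post.predict_proba(X_post_tr)[:, 1]])
fusion = SVC(kernel="rbf", probability=True, random_state=0).fit(sig_tr, y_tr)

# At inference, the same two signatures feed the fusion model.
sig_va = np.column_stack([rf_pre.predict_proba(X_pre_va)[:, 1],
                          rf_post.predict_proba(X_post_va)[:, 1]])
p_fusion = fusion.predict_proba(sig_va)[:, 1]   # fused ALN+ probability per patient
```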

Comparison with sonographer and AI-assisted diagnosis

We evaluated model performance by comparing each machine learning model’s AUC with sonographer’s diagnosis on axillary ultrasound and explored if the fusion model enhanced sonographer’s diagnostic ability. Two sonographers, with 6 and 8 years of experience, independently assessed ALN status on pre-NAC and post-NAC ultrasound images, blinded to pathological results. Based on previous studies, the presence of any of the following criteria indicates metastatic ALN on US: (i) loss of the fatty hilum, (ii) round shape, or (iii) eccentric cortical thickening (> 3 mm) [ 17 , 18 ]. After a month, the same sonographers re-assessed the US images with AI model assistance, initially obtaining the AI prediction before making their final diagnosis. We compared the sonographer’s initial diagnosis with the AI-assisted diagnosis to determine whether the AI model would serve as a useful tool for enhancing the sonographer’s diagnostic ability.
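As a literal encoding of the three sonographic criteria listed above, the tiny helper below flags a node as metastatic on US when any single criterion is met; the function name, argument names, and the strict >3 mm comparison are illustrative assumptions.

```python
def aln_suspicious_on_us(fatty_hilum_lost: bool, round_shape: bool,
                         cortical_thickening_mm: float) -> bool:
    """Any one criterion marks the ALN as metastatic on ultrasound (illustrative)."""
    return fatty_hilum_lost or round_shape or cortical_thickening_mm > 3.0

print(aln_suspicious_on_us(False, False, 4.2))   # -> True (eccentric cortical thickening > 3 mm)
```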

Statistical analysis

Statistical analysis was conducted using SPSS software (version 25.0). Group differences were assessed using the Student's t-test or Mann-Whitney U-test for continuous variables and the chi-square test or Fisher's exact test for categorical variables. The performance of each model was evaluated using the AUC, and the DeLong test was used to compare the performance of the different models. Decision curve analysis (DCA) was used to evaluate the clinical utility of the models [19]. Model performance was also assessed in terms of accuracy (ACC), specificity (SPE), sensitivity (SEN), positive predictive value (PPV), and negative predictive value (NPV); see Supporting Material-IV, Statistical Metrics. Statistical significance was set at p < 0.05.
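DeLong's test is not reproduced here, but the sketch below, assuming only NumPy, computes the reported threshold metrics and the net benefit used in decision curve analysis (net benefit = TP/n − FP/n × p_t/(1 − p_t) at threshold probability p_t); the variable and function names are illustrative.

```python
import numpy as np

def classification_metrics(y_true, y_pred) -> dict:
    """ACC, SEN, SPE, PPV, and NPV from binary labels and binary predictions."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return {"ACC": (tp + tn) / y_true.size,
            "SEN": tp / (tp + fn), "SPE": tn / (tn + fp),
            "PPV": tp / (tp + fp), "NPV": tn / (tn + fn)}

def net_benefit(y_true, y_prob, p_t: float) -> float:
    """Decision-curve net benefit at threshold probability p_t."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_prob) >= p_t
    n = y_true.size
    tp = np.sum(y_pred & (y_true == 1))
    fp = np.sum(y_pred & (y_true == 0))
    return tp / n - fp / n * p_t / (1 - p_t)
```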

Baseline characteristics of patients

In this study, 497 patients were included, with an average age of 51.47 years. Of these patients, 210 were ALN+ and 287 were ALN- after NAC. The ALN+ rates were 51.39% in the training cohort and 35.23% in the validation cohort. Significant differences in molecular subtype, primary tumor response, and clinical N stage were observed between the ALN+ and ALN- groups (all p < 0.05), while the other baseline characteristics showed no significant differences in either the training or validation cohort. Table 1 details the baseline characteristics of the patients.

Feature selection and model construction

In the training cohort, 1362 radiomic features (463 pre-NAC, 899 post-NAC) and 2908 deep learning features (1357 pre-NAC, 1551 post-NAC) from ultrasound images were significantly associated with ALN metastasis after NAC (Mann–Whitney U test, p < 0.05). After LASSO selection, seven pre-NAC and nine post-NAC features remained. The detailed LASSO mean-square error curve and coefficient paths are shown in Fig. S1. From highly correlated pairs (Spearman correlation coefficient > 0.8), the feature with the higher diagnostic performance was retained, resulting in six pre-NAC and eight post-NAC features for model construction (see Table 2). Two random forest models (pre-NAC and post-NAC) were developed, with their output signatures integrated into an SVM model.

Figure  2A, B show the ROC curves of the three machine learning models, with the fusion model achieving the highest AUCs of 0.949 in the training cohort and 0.899 in the validation cohort. It outperformed both the pre-NAC (AUC = 0.786, p  < 0.05) and post-NAC (AUC = 0.853, p  < 0.05) models in the validation cohort. The decision curve analysis demonstrated that the combined model had satisfactory net clinical benefits in both the training and validation cohorts (Fig.  2C, D ). The calibration plots also demonstrated excellent agreement between the actual and predicted ALN status in both cohorts of the fusion model (Fig.  2E, F ).

figure 2

Comparison of ROC curves, decision curve analysis of the three models, and the calibration curves of the fusion model. ROC curves show the performance of the fusion model, pre-NAC model, and post-NAC model for predicting ALN metastasis in the training ( A ) and validation cohorts ( B ). Decision curve analysis (DCA) for the three models is shown in the training ( C ) and validation cohorts ( D ); the y-axis indicates the net benefit and the x-axis indicates the threshold probability. Calibration curves of the fusion model in the training ( E ) and validation ( F ) cohorts are presented. AUC, area under the curve; FPR, false positive rate; TPR, true positive rate

Comparison of sonographer and radiomics model

Our study compared the sonographers’ first diagnosis on axillary ultrasound with three machine-learning models based on pre-NAC and post-NAC breast ultrasound features. The models’ performance metrics, including AUC, ACC, SEN, SPE, PPV, and NPV, are detailed in Table  3 . The fusion model outperformed the sonographer in the training cohort with an accuracy of 88.89%, sensitivity of 84.68%, and specificity of 93.33%, and in the validation cohort with an accuracy of 85.77%, sensitivity of 83.84%, and NPV of 86.81%. Despite its wide clinical application, axillary ultrasound showed the lowest AUCs (0.753 in training cohort, 0.719 in validation cohort). The three AI models (AUCs: 0.899, 0.786, and 0.853, respectively) surpassed the sonographer’s first diagnosis (AUC: 0.719) in the validation cohort. The sonographer identified ALN+ patients with sensitivities of 63.06% (training cohort) and 51.52% (validation cohort), while AI models achieved higher sensitivities (82.88–92.79% in training cohort, 73.73–83.84% in validation cohort). For identifying ALN- patients, the sonographer’s specificity was comparable to the AI models, with the pre-NAC model showing the lowest specificity (70.48% in training cohort, 64.84% in validation cohort).

AI assist in sonographer’s diagnosis on ALN status

With the assistance of the fusion AI model, the sonographer performed a second reading of the US images. As seen in Table 3, the sonographer's diagnostic ability improved when assisted by the AI model, most prominently in sensitivity, which increased from 63.06% to 70.27% in the training cohort and from 51.52% to 62.63% in the validation cohort. Moreover, the AUCs of the sonographer's second diagnosis were considerably greater than those of the initial diagnosis (p < 0.05 in both the training and validation cohorts), indicating that the fusion AI model effectively improved the sonographer's diagnostic ability. Figure 3 illustrates the ROI delineation and heatmaps on the US images of two representative patients (ALN+ and ALN−).

figure 3

This figure illustrates pre- and post-neoadjuvant chemotherapy (NAC) ultrasound images from two patients: one showing a complete axillary lymph node (ALN) response and the other with residual ALN metastasis after NAC. The second and third columns correspond to the radiomics heatmaps (Firstorder_Maximum and GLCM_IDM) generated from the radiomics pipeline, while the fourth column depicts the Grad-CAM heatmap from the deep learning pipeline. These heatmaps visually represent the areas of interest identified by each model in assessing ALN status. In the Firstorder_Maximum heatmap, larger dark blue values indicate more disordered intensity values in that region. In the GLCM_IDM heatmap, larger dark red values indicate more disordered texture in that region. In the Grad-CAM heatmap, larger red values indicate a greater feature contribution in that region, meaning the deep learning model pays more attention to the red regions of the breast cancer ultrasound image. NAC, neoadjuvant chemotherapy; GLCM_IDM, gray-level co-occurrence matrix inverse difference moment; Grad-CAM, gradient-weighted class activation mapping; US, ultrasound; HR, hormone receptor; HER2, human epidermal growth factor receptor 2; AI, artificial intelligence
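For readers unfamiliar with how such Grad-CAM heatmaps are produced, the sketch below, assuming PyTorch and torchvision, computes a class activation map from the final convolutional feature maps of a VGG16 backbone. The random input tensor, input size, and use of the top logit are illustrative; the authors' own heatmap code is not shown in the article, so this is only a generic reconstruction of the technique named in the caption.

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.vgg16(weights=None).eval()   # a fine-tuned, pretrained model in practice

x = torch.rand(1, 3, 448, 448)              # placeholder ultrasound image tensor
feats = model.features(x)                   # last convolutional feature maps, (1, 512, 14, 14)
feats.retain_grad()                         # keep gradients for this non-leaf tensor

logits = model.classifier(torch.flatten(model.avgpool(feats), 1))
logits.max().backward()                     # backpropagate the top-class score

weights = feats.grad.mean(dim=(2, 3), keepdim=True)                   # pooled gradients per channel
cam = F.relu((weights * feats.detach()).sum(dim=1, keepdim=True))     # weighted sum of feature maps
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)              # heatmap in [0, 1], 448 x 448
```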

Discussion

ALN status is crucial for guiding surgical treatment in clinical practice, as ALN metastasis typically indicates a worse prognosis and a higher recurrence risk [6]. SLNB or ALND is routinely performed to assess axillary lymph node status. In our study, 42.25% of the patients had no ALN metastasis after NAC but underwent invasive axillary surgery, incurring substantial costs and unnecessary complications. Previous studies have confirmed that MRI-based radiomics features from primary tumors can predict ALN status with AUCs of 0.790–0.862, but these focused only on imaging-derived radiomics [15, 20, 21, 22]. Another study showed the feasibility of predicting ALN status using a mammography-based radiomics model with an AUC of 0.809 (95% CI, 0.794–0.833) [23]. Our study developed a multimodality AI model using pre- and post-NAC US images, allowing a more comprehensive use of US images to predict ALN status. The DeLong test confirmed the fusion model's reliability in noninvasively identifying ALN status after NAC, sparing unnecessary surgery and complications.

Axillary ultrasound is commonly used to evaluate ALN status during NAC in patients with breast cancer. In our study, the fusion model demonstrated superior diagnostic performance, with an AUC of 0.899 in the validation cohort, significantly outperforming the sonographer's diagnosis on axillary US, which had an AUC of 0.719. Alvarez et al. reported that the sensitivity and specificity of axillary ultrasound for breast cancer ranged from 48.8% to 87.1% and from 55.6% to 97.3%, respectively, consistent with our findings [24]. However, axillary ultrasound diagnosis is influenced by operator experience and by the difficulty of detecting very small metastases in the ALN region. Thus, despite its widespread clinical use, a sonographer's ultrasound diagnosis should not be the only imaging approach for assessing ALN status after NAC. In addition, we found that the sonographer's diagnosis on axillary US showed high specificity but low sensitivity for ALN diagnosis after NAC. Our results are consistent with previous studies reporting MRI sensitivities of 61.4–70%, indicating that axillary US performed similarly to MRI [25, 26]. Moreover, sonographers rely on subjective judgments of ALN morphology, whereas respiratory and cardiac motion artifacts may affect diagnosis on MRI; conventional US images are also more robust to such artifacts.

In our study, the sonographers' first diagnosis relied solely on their own visual assessment, whereas the second diagnosis referenced the prediction of the fusion model. Some breast cancer heterogeneity may relate to ALN metastasis but cannot be visually observed by sonographers. The results showed that diagnostic ability was significantly enhanced in the second reading, indicating that the AI model can capture and integrate potential breast cancer heterogeneity overlooked by sonographers when assessing ALN status. When the AI model's risk score deviated markedly from the sonographer's first diagnosis, the sonographer paid more attention to lymph nodes that were initially indeterminate and had not been classified as metastatic in the first reading, re-evaluating them and upgrading the ALN diagnosis in the second reading with AI assistance.

The fusion model's superior net benefit at threshold probabilities above 15% suggests its utility in identifying patients who could benefit from ALND, thereby minimizing unnecessary surgical interventions. However, the ideal threshold for recommending ALND should balance the risks of unwarranted surgery against those of undertreatment, warranting further validation in future studies tailored to patient conditions and clinical practices. In addition, the model's high negative predictive value (NPV) of 90.8% in the validation cohort suggests its effectiveness in accurately identifying patients who may not need ALND, potentially averting related surgical complications. Nonetheless, decisions to omit ALND should consider the AI model's predictions in conjunction with other factors, including the patient's individual condition, molecular subtype, and lymph node size.

Our study had some limitations. First, primary tumor segmentation was performed manually, which is time-consuming; in the future, we plan to explore the performance of an automatic segmentation model. Second, selection bias was unavoidable due to the retrospective nature of the study, and larger sample sizes and further multicenter studies are required to test the predictive efficiency and assistive ability of the AI model. Third, we collected US images acquired under various protocols, potentially affecting the imaging analysis; a harmonization process was therefore employed to minimize heterogeneity. Finally, the relatively limited number of participating sonographers may not accurately represent an average sonographer's ability, and future studies should involve more sonographers in diagnosing ALN status to evaluate the model's assistive efficacy more comprehensively.

We developed a fusion AI model that integrates pre- and post-NAC US images, providing superior prediction of ALN metastasis after NAC in breast cancer compared with the single-modality model or sonographer diagnosis. This AI model can serve as an effective tool to assist sonographers in improving their diagnostic abilities.

Abbreviations

  • AI: Artificial intelligence
  • ALN: Axillary lymph node
  • ALND: Axillary lymph node dissection
  • AUC: Area under the curve
  • cN+: Clinical node-positive
  • DLR: Deep learning radiomics
  • HER2: Human epidermal growth factor receptor-2
  • HR: Hormone receptor
  • IHC: Immunohistochemical
  • LASSO: Least absolute shrinkage and selection operator
  • MRI: Magnetic resonance imaging
  • NAC: Neoadjuvant chemotherapy
  • NPV: Negative predictive value
  • PPV: Positive predictive value
  • ROC: Receiver operating characteristic
  • ROI: Regions of interest
  • SEN: Sensitivity
  • SLNB: Sentinel lymph node biopsy
  • SPE: Specificity
  • SVM: Support vector machine
  • US: Ultrasonography

References

Sung H, Ferlay J, Siegel RL et al (2021) Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 71:209–249. https://doi.org/10.3322/caac.21660

Trapani D, Ginsburg O, Fadelu T et al (2022) Global challenges and policy solutions in breast cancer control. Cancer Treat Rev 104:102339. https://doi.org/10.1016/j.ctrv.2022.102339

Tamirisa N, Thomas SM, Fayanju OM et al (2018) Axillary nodal evaluation in elderly breast cancer patients: potential effects on treatment decisions and survival. Ann Surg Oncol 25:2890–2898. https://doi.org/10.1245/s10434-018-6595-2

Pilewskie M, Morrow M (2017) Axillary nodal management following neoadjuvant chemotherapy: a review. JAMA Oncol 3:549–555. https://doi.org/10.1001/jamaoncol.2016.4163

Krag DN, Anderson SJ, Julian TB et al (2010) Sentinel-lymph-node resection compared with conventional axillary-lymph-node dissection in clinically node-negative patients with breast cancer: overall survival findings from the NSABP B-32 randomised phase 3 trial. Lancet Oncol 11:927–933. https://doi.org/10.1016/S1470-2045(10)70207-2

Chang JM, Leung JWT, Moy L, Ha SM, Moon WK (2020) Axillary nodal evaluation in breast cancer: state of the art. Radiology 295:500–515. https://doi.org/10.1148/radiol.2020192534

Minarikova L, Bogner W, Pinker K et al (2017) Investigating the prediction value of multiparametric magnetic resonance imaging at 3 T in response to neoadjuvant chemotherapy in breast cancer. Eur Radiol 27:1901–1911. https://doi.org/10.1007/s00330-016-4565-2

Pereira NP, Curi C, Osório CABT et al (2019) Diffusion-weighted magnetic resonance imaging of patients with breast cancer following neoadjuvant chemotherapy provides early prediction of pathological response—a prospective study. Sci Rep 9:16372. https://doi.org/10.1038/s41598-019-52785-3

Eun NL, Kim JA, Son EJ et al (2020) Texture analysis with 3.0-T MRI for association of response to neoadjuvant chemotherapy in breast cancer. Radiology 294:31–41. https://doi.org/10.1148/radiol.2019182718

Nadrljanski MM, Milosevic ZC (2020) Tumor texture parameters of invasive ductal breast carcinoma in neoadjuvant chemotherapy: early identification of non-responders on breast MRI. Clin Imaging 65:119–123. https://doi.org/10.1016/j.clinimag.2020.04.016

Dogan BE, Yuan Q, Bassett R et al (2019) Comparing the performances of magnetic resonance imaging size vs pharmacokinetic parameters to predict response to neoadjuvant chemotherapy and survival in patients with breast cancer. Curr Probl Diagn Radiol 48:235–240. https://doi.org/10.1067/j.cpradiol.2018.03.003

Lambin P, Rios-Velazquez E, Leijenaar R et al (2012) Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer 48:441–446. https://doi.org/10.1016/j.ejca.2011.11.036

Jiang M, Li CL, Luo XM et al (2022) Radiomics model based on shear-wave elastography in the assessment of axillary lymph node status in early-stage breast cancer. Eur Radiol 32:2313–2325. https://doi.org/10.1007/s00330-021-08330-w

Wu J, Gong G, Cui Y, Li R (2016) Intratumor partitioning and texture analysis of dynamic contrast-enhanced (DCE)-MRI identifies relevant tumor subregions to predict pathological response of breast cancer to neoadjuvant chemotherapy. J Magn Reson Imaging 44:1107–1115. https://doi.org/10.1002/jmri.25279

Hammond ME, Hicks DG (2015) American Society of Clinical Oncology/College of American Pathologists human epidermal growth factor receptor 2 testing clinical practice guideline upcoming modifications: proof that clinical practice guidelines are living documents. Arch Pathol Lab Med 139:970–971. https://doi.org/10.5858/arpa.2015-0074-ED

Wolff AC, Hammond ME, Hicks DG et al (2013) Recommendations for human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists clinical practice guideline update. J Clin Oncol 31:3997–4013. https://doi.org/10.1200/JCO.2013.50.9984

Youk JH, Son EJ, Kim JA, Gweon HM (2017) Pre-operative evaluation of axillary lymph node status in patients with suspected breast cancer using shear wave elastography. Ultrasound Med Biol 43:1581–1586. https://doi.org/10.1016/j.ultrasmedbio.2017.03.016

Zheng Q, Yan H, He Y et al (2024) An ultrasound-based nomogram for predicting axillary node pathologic complete response after neoadjuvant chemotherapy in breast cancer: Modeling and external validation. Cancer 130:1513–1523. https://doi.org/10.1002/cncr.35248

DeLong ER, DeLong DM, Clarke-Pearson DL (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44:837–845

Mao N, Yin P, Li Q et al (2020) Radiomics nomogram of contrast-enhanced spectral mammography for prediction of axillary lymph node metastasis in breast cancer: a multicenter study. Eur Radiol 30:6732–6739. https://doi.org/10.1007/s00330-020-07016-z

Yu Y, Tan Y, Xie C et al (2020) Development and validation of a preoperative magnetic resonance imaging radiomics-based signature to predict axillary lymph node metastasis and disease-free survival in patients with early-stage breast cancer. JAMA Netw Open 3:e2028086. https://doi.org/10.1001/jamanetworkopen.2020.28086

Kim EJ, Kim SH, Kang BJ, Choi BG, Song BJ, Choi JJ (2014) Diagnostic value of breast MRI for predicting metastatic axillary lymph nodes in breast cancer patients: diffusion-weighted MRI and conventional MRI. Magn Reson Imaging 32:1230–1236. https://doi.org/10.1016/j.mri.2014.07.001

Yang J, Wang T, Yang L et al (2019) Preoperative prediction of axillary lymph node metastasis in breast cancer using mammography-based radiomics method. Sci Rep 9:4429. https://doi.org/10.1038/s41598-019-40831-z

Alvarez S, Añorbe E, Alcorta P, López F, Alonso I, Cortés J (2006) Role of sonography in the diagnosis of axillary lymph node metastases in breast cancer: a systematic review. AJR Am J Roentgenol 186:1342–1348. https://doi.org/10.2214/AJR.05.0936

Song D, Yang F, Zhang Y et al (2022) Dynamic contrast-enhanced MRI radiomics nomogram for predicting axillary lymph node metastasis in breast cancer. Cancer Imaging 22:17. https://doi.org/10.1186/s40644-022-00450-w

Mao N, Dai Y, Lin F et al (2020) Radiomics nomogram of DCE-MRI for the prediction of axillary lymph node metastasis in breast cancer. Front Oncol 10:541849. https://doi.org/10.3389/fonc.2020.541849

Funding

This research was supported by the Key Clinical Projects of Peking University Third Hospital (No. BYSYZD2023020).

Author information

Ying Fu and Yu-Tao Lei contributed equally to this work.

Authors and Affiliations

Department of Ultrasound, Peking University Third Hospital, No. 49 North Garden Road, Haidian District, Beijing, 100191, China

Ying Fu, Yi-Han Ma & Li-Gang Cui

Department of General Surgery, Peking University Third Hospital, No. 49 North Garden Road, Haidian District, Beijing, 100191, China

Department of Breast Cancer, Cancer Center, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, 510080, Guangdong, China

Yu-Hong Huang

Department of Pathology, Peking University Third Hospital, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, 100191, China

Department of Ultrasound, Peking University Cancer Hospital & Institute, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), No. 52 Fucheng Road, Haidian District, Beijing, 100142, China

Song Wang & Kun Yan

Department of Ultrasound, North China University of Science and Technology Affiliated Hospital, 73 South Jianshe Road, Lubei District, Tangshan, 066300, China

Yi-Hua Wang

Corresponding author

Correspondence to Li-Gang Cui .

Ethics declarations

The scientific guarantor of this publication is LC.

Conflict of interest

The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.

Statistics and biometry

One of the authors has significant statistical expertise.

Informed consent

Written informed consent was waived by the Institutional Review Board.

Ethical approval

Institutional Review Board approval was obtained.

Study subjects or cohorts overlap

No study subject or cohort overlap has been reported.

Methodology

Retrospective

Diagnostic or prognostic study

Multicenter study

Additional information

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary materials

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Fu, Y., Lei, YT., Huang, YH. et al. Longitudinal ultrasound-based AI model predicts axillary lymph node response to neoadjuvant chemotherapy in breast cancer: a multicenter study. Eur Radiol (2024). https://doi.org/10.1007/s00330-024-10786-5

Received: 28 August 2023 | Revised: 04 February 2024 | Accepted: 10 March 2024 | Published: 10 May 2024 | DOI: https://doi.org/10.1007/s00330-024-10786-5

  • Breast cancer
  • Axillary lymph node metastasis

COMMENTS

  1. Longitudinal Study

    Revised on June 22, 2023. In a longitudinal study, researchers repeatedly examine the same individuals to detect any changes that might occur over a period of time. Longitudinal studies are a type of correlational research in which researchers observe and collect data on a number of variables without trying to influence those variables.

  3. Longitudinal study

    A longitudinal study (or longitudinal survey, or panel study) is a research design that involves repeated observations of the same variables (e.g., people) over long periods of time (i.e., uses longitudinal data).It is often a type of observational study, although it can also be structured as longitudinal randomized experiment.. Longitudinal studies are often used in social-personality and ...

  4. What's a Longitudinal Study? Types, Uses & Examples

    A cross-sectional study is a type of observational study in which the researcher collects data from variables at a specific moment to establish a relationship among them. On the other hand, longitudinal research observes variables for an extended period and records all the changes in their relationship.

  5. Longitudinal studies

    The Framingham study is widely recognised as the quintessential longitudinal study in the history of medical research. An original cohort of 5,209 subjects from Framingham, Massachusetts between the ages of 30 and 62 years of age was recruited and followed up for 20 years.

  6. What Is a Longitudinal Study?

    Longitudinal studies, a type of correlational research, are usually observational, in contrast with cross-sectional research. Longitudinal research involves collecting data over an extended time, whereas cross-sectional research involves collecting data at a single point. To test this hypothesis, the researchers recruit participants who are in ...

  7. Longitudinal Study: Overview, Examples & Benefits

    A cohort study is a specific type of longitudinal study focusing on a group of people sharing a common characteristic or experience within a defined period. Imagine tracking a group of individuals over time. Researchers collect data regularly, analyzing how specific factors evolve or influence outcomes. This method offers a dynamic view of ...

  8. An Overview of Longitudinal Research Designs in Social Sciences

    LRD is classified into four types based on two criteria: the number of waves, and whether data are collected from similar or different cases in the subsequent waves. These are total population design (TPD), repeated cross-sectional design (RCD), revolving or rotating panel design (RPD) and longitudinal panel design (LPD) ( Menard, 2002 ).

  9. Cross-Sectional and Longitudinal Studies

    Both cross-sectional studies and longitudinal studies reveal different population processes, and the decision to pursue one type of study ultimately depends on the research question. Whereas cross-sectional study may be confounded by selection process, temporal autocorrelation, and cohort-specific effects, longitudinal studies may lead to ...

  10. Longitudinal Research Designs

    In a longitudinal research design, the same attribute is observed repeatedly for at least one unit i (e.g., a person). In practice, one can roughly distinguish between two different types of longitudinal research designs: Either multiple units i = 1, …, N are observed at multiple time points t = 1, …, T, with N being large and T being small or a single unit is observed at many time points ...

  11. Longitudinal Study Designs

    Another type of longitudinal study design is called a "crossover design" (Grizzle 1965; Fleiss 1986; Brown and Prescott 2006). In this type of design, individuals are typically randomized to receive a sequence of treatments. Thus, instead of being assigned to a single treatment group, each unit is usually assigned to receive all of the ...

  12. Longitudinal study: design, measures, classic example

    As the name implies, longitudinal studies follow subjects over time (Fig. 42.1).There are three main types of studies that fall under the umbrella of the longitudinal study: cohort studies, panel studies, and retrospective studies. 1 The cohort study is one of the most common types of longitudinal studies. It involves following a cohort (a group of individuals with a shared characteristic(s ...

  13. Longitudinal study: design, measures, and classic example

    Overall, these types of study will measure exposure in a population and can detect changes over time. As in the Terman study, the gifted individuals had career success well above the average college graduate and had a higher degree of personal satisfaction. 11. Section 1: Key points • Longitudinal study measures the result of an exposure over ...

  14. What is a Longitudinal Study?

    Types of longitudinal studies. There are two types of longitudinal studies to choose from, primarily depending on what you are looking to examine. Keep in mind that longitudinal study design, no matter what type of study you might pursue, is a matter of sustaining a research inquiry over time to capture the necessary data.

  15. Longitudinal study: Design, measures, and classic example

    The study participants have been shown to be representative of the middle-aged Caucasian female United Kingdom (UK) population. 26 This study is a type of cohort study, itself a type of longitudinal study, as the participants are being followed up over a period of time and all share the same characteristic of belonging to one general practice.

  16. Longitudinal Study: Definition, Pros, and Cons

    A longitudinal study is a type of correlational research that involves regular observation of the same variables within the same subjects over a long or short period. These studies can last from a few weeks to several decades. Longitudinal studies are common in epidemiology, economics, and medicine. People also use them in other medical and ...

  17. Longitudinal Research: A Panel Discussion on Conceptual Issues

    An important meta-trend in work, aging, and retirement research is the heightened appreciation of the temporal nature of the phenomena under investigation and the important role that longitudinal study designs play in understanding them (e.g., Heybroek, Haynes, & Baxter, 2015; Madero-Cabib, Gauthier, & Le Goff, 2016; Wang, 2007; Warren, 2015; Weikamp & Göritz, 2015).

  18. What is a Longitudinal Study?

    A longitudinal study is research conducted over an extended period of time. It is mostly used in medical research and other areas like psychology or sociology. When using this method, a longitudinal survey can pay off with actionable insights when you have the time to engage in a long-term research project.

  19. Longitudinal Study

    Learn what a longitudinal study is and check out the different types of longitudinal studies. Also, learn the advantages and disadvantages of longitudinal research designs. Updated: 11/21/2023

  20. PDF 7 Longitudinal Research Designs

    longitudinal research has been to study the development and natural history of events in the life course. This type of design is often regarded as superior to a cross-sectional design because it enables processes and causes of change within individuals and among individuals to be identified.

  21. What is a Longitudinal Study? Definition, Types & Examples

    That's the kind of thing that longitudinal research design measures. As for a formal definition, a longitudinal study is a research method that involves repeated observations of the same variable (e.g. a set of people) over some time. The observations over a period of time might be undertaken in the form of an online survey.

  22. Effective Data Management in Longitudinal Research

    Longitudinal research involves tracking the same variables over long periods, which can lead to a wealth of complex data. Managing and analyzing this data is a daunting task, but with the right ...

  23. Arthritis is associated with high nutritional risk among older ...

    The Canadian Longitudinal Study on Aging (CLSA) is a nationally representative cohort study. Study design and measures have been published but are briefly described here 28. Baseline data (2010 ...

  24. Insulin sensitivity estimates and their longitudinal association with

    Study participants. All participants are from the FinnDiane Study, an ongoing, multicentre, observational, Finnish study founded in 1997, including 77 study centres (Additional file 1: Table S1), which aims to discover clinical, genetic, and environmental risk factors for micro- and macro-vascular complications of type 1 diabetes. We included 3517 participants with type 1 diabetes, defined ...

  25. PDF Longitudinal Study Designs

    Fig. 1 The multilevel nature of a typical longitudinal study design. sciences, or medical sciences to mean the same thing. Likewise, one may use the terms subject, individual, animal, experimental unit, or patient synonymously in a statistical or analytical sense.

  26. Learning to Read in an Intermediate Depth Orthography: The Longitudinal

    Future research may explore the longitudinal relationships between these emergent literacy skills and later reading outcomes covering a wider range of grade levels. Additionally, extending the longitudinal analysis to include measures of reading comprehension would provide a more comprehensive understanding of the role (direct and/or mediated ...

  27. Highly diverse dynamics of Pseudomonas aeruginosa colonization from

    Unrelated strains were identified in 41 patients (93%). Twenty-six patients (59%) presented a recurrence during the study period. No specific clones were associated with transient, recurrent or persistent colonization. Our longitudinal study revealed that 9 of the 26 patients with recurrence (35%) harbored strains of different genotypes.

  28. Effects of longitudinal bending stiffness and midsole foam on running

    Methods: Participants ran at 14 km/h (n = 8 males) or 12 km/h (n = 6 females). The shoe order was randomly assigned for the four shoe conditions, where the participants wore each shoe twice in a mirrored order. Results: There was a significant effect of the presence of the plate (-1.0%) as well as a significant effect of foam type (-1.0%).

  29. Longitudinal ultrasound-based AI model predicts axillary ...

    A study also found that a longitudinal MRI-based DLR model could predict the pathological complete response of breast cancer to ... To further refine the model, significant conventional ultrasound features such as tumor size, echo type, and blood flow signal, were integrated into the fully connected layer, increasing neuron count. Two ...