Int J Med Educ

Factor Analysis: a means for theory and instrument development in support of construct validity

Mohsen Tavakol

1 School of Medicine, Medical Education Centre, the University of Nottingham, UK

Angela Wetzel

2 School of Education, Virginia Commonwealth University, USA

Introduction

Factor analysis (FA) allows us to simplify a set of complex variables or items using statistical procedures to explore the underlying dimensions that explain the relationships between the multiple variables/items. For example, to explore inter-item relationships for a 20-item instrument, a basic analysis would produce a 20 × 20 correlation matrix with 400 entries (190 unique inter-item correlations); it is not an easy task to keep such a matrix in our heads. FA simplifies the matrix of correlations so a researcher can more easily understand the relationship between items in a scale and the underlying factors that the items may have in common. FA is a commonly applied and widely promoted procedure for developing and refining clinical assessment instruments to produce evidence for the construct validity of the measure.

In the literature, the strong association between construct validity and FA is well documented, as the method provides evidence based on test content and evidence based on internal structure, key components of construct validity. 1 From FA, evidence based on internal structure and evidence based on test content can be examined to tell us what the instrument really measures - the intended abstract concept (i.e., a factor/dimension/construct) or something else. Establishing construct validity for the interpretations from a measure is critical to high quality assessment and subsequent research using outcomes data from the measure. Therefore, FA should be a researcher’s best friend during the development and validation of a new measure or when adapting a measure to a new population. FA is also a useful companion when critiquing existing measures for application in research or assessment practice. However, despite the popularity of FA, when applied in medical education instrument development, factor analytic procedures do not always match best practice. 2 This editorial article is designed to help medical educators use FA appropriately.

The Applications of FA

The applications of FA depend on the purpose of the research. Generally speaking, there are two principal types of FA: Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA).

Exploratory Factor Analysis

Exploratory Factor Analysis (EFA) is widely used in medical education research in the early phases of instrument development, specifically for measures of latent variables that cannot be assessed directly. Typically, in EFA, the researcher, through a review of the literature and engagement with content experts, selects as many instrument items as necessary to fully represent the latent construct (e.g., professionalism). Then, using EFA, the researcher explores the resulting factor loadings, along with other criteria (e.g., previous theory, minimum average partial, 3 parallel analysis, 4 conceptual meaningfulness, etc.), to refine the measure. Suppose an instrument consisting of 30 questions yields two factors, Factor 1 and Factor 2. A good way to define a factor as a theoretical construct is to look at its factor loadings. 5 The factor loading is the correlation between the item and the factor; a factor loading of more than 0.30 usually indicates a moderate correlation between the item and the factor. Most statistical software, such as SAS, SPSS and R, provides factor loadings. Upon review of the items loading on each factor, the researcher identifies two distinct constructs: items loading on Factor 1 all relate to professionalism, while items loading on Factor 2 relate, instead, to leadership. Here, EFA helps the researcher build evidence based on internal structure by retaining only those items with appropriately high loadings on Factor 1 for professionalism, the construct of interest.
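To make this concrete, here is a minimal EFA sketch in Python. It assumes the third-party factor_analyzer package and a hypothetical DataFrame df with one row per respondent and one column per item; the same analysis is available in R (e.g., psych::fa) and SPSS.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

# df: hypothetical respondents-by-items DataFrame (e.g., 30 Likert items).
fa = FactorAnalyzer(n_factors=2, rotation="oblimin", method="principal")
fa.fit(df)

# Rows are items, columns are factors; values are factor loadings.
loadings = pd.DataFrame(fa.loadings_, index=df.columns,
                        columns=["Factor1", "Factor2"])

# Items loading above roughly 0.30 on a factor are candidates to retain.
print(loadings.round(2))
```

An oblique rotation (oblimin) is chosen here because, as discussed below, factors in social science research are usually assumed to be inter-related.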

It is important to note that, often, Principal Component Analysis (PCA) is applied and described, in error, as exploratory factor analysis. 2 , 6 PCA is appropriate if the study primarily aims to reduce the number of original items in the intended instrument to a smaller set. 7 However, if the instrument is being designed to measure a latent construct, EFA, using Maximum Likelihood (ML) or Principal Axis Factoring (PAF), is the appropriate method. 7 These exploratory procedures statistically analyze the interrelationships between the instrument items and domains to uncover the unknown underlying factorial structure (dimensions) of the construct of interest. PCA, by design, seeks to explain total variance (common, specific and error variance) in the correlation matrix. The sum of the squared loadings across factors for a particular item indicates the proportion of that item's variance explained by the factors; this is called the communality. The higher the communality value, the more the extracted factors explain the variance of the item. Further, the sum of the squared loadings on a factor, divided by the number of items, specifies the proportion of variance explained by that factor. For example, assume four items of an instrument load on Factor 1 with factor loadings of 0.86, 0.75, 0.66 and 0.58, respectively. If you square the factor loading of an item, you get the percentage of that item's variance explained by Factor 1: in this example, 74%, 56%, 44% and 34% for item1, item2, item3 and item4, respectively. If you sum the squared factor loadings of Factor 1, you get its eigenvalue, which is about 2.1; dividing the eigenvalue by four (2.1/4 = 0.52) gives the proportion of variance accounted for by Factor 1, which is 52%. Since PCA does not separate common variance from specific and error variance, it often inflates factor loadings and limits the potential for the factor structure to be generalized and applied with other samples in subsequent study. On the other hand, the Maximum Likelihood and Principal Axis Factoring extraction methods separate common and unique variance (specific and error variance), which overcomes the issue attached to PCA. Thus, the proportion of variance explained by an extracted factor more precisely reflects the extent to which the latent construct is measured by the instrument items. This focus on shared variance among items explained by the underlying factor, particularly during instrument development, helps the researcher understand the extent to which a measure captures the intended construct. It is useful to mention that in PAF, the initial communalities are not set at 1s; they are instead estimated from the squared multiple correlation coefficients. Indeed, if you run a multiple regression to predict, say, item1 (dependent variable) from the other items (independent variables) and then look at the R-squared (R2), you will see that R2 equals the initial communality of item1 in PAF.
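The arithmetic of this worked example can be checked in a few lines of Python:

```python
import numpy as np

# Loadings of the four example items on Factor 1.
loadings = np.array([0.86, 0.75, 0.66, 0.58])

# Squared loadings: the share of each item's variance explained by
# Factor 1 (about 74%, 56%, 44% and 34%).
squared = loadings ** 2

# The eigenvalue of Factor 1 is the sum of its squared loadings (~2.07).
eigenvalue = squared.sum()

# Dividing by the number of items gives the proportion of total
# variance accounted for by Factor 1 (~0.52, i.e., 52%).
proportion = eigenvalue / loadings.size

print(squared.round(2), round(eigenvalue, 2), round(proportion, 2))
```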

Confirmatory Factor Analysis

When prior EFA studies are available for your intended instrument, Confirmatory Factor Analysis builds on those findings, allowing you to confirm or disconfirm the underlying factor structures, or dimensions, extracted in prior research. CFA is a theory- or model-driven approach that tests how well the data “fit” the proposed model or theory. CFA thus departs from EFA in that researchers must first specify a factor model before analysing the data. More fundamentally, CFA is a means of statistically testing the internal structure of instruments, and it relies on maximum likelihood estimation (MLE) and a different set of standards for assessing the suitability of the construct of interest. 7 , 8

Factor analysts usually use a path diagram to show the theoretical and hypothesized relationships between items and factors, creating a hypothetical model to test using the ML method. In a path diagram, circles or ovals represent factors and rectangles represent the instrument items. Lines (→ or ↔) represent relationships between variables: no line, no relationship. A single-headed arrow shows a causal relationship (the variable the arrowhead points to is the dependent variable), and a double-headed arrow shows a covariance between variables or factors.
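The translation from path diagram to fitted model can be sketched in Python with the third-party semopy package (an assumption; lavaan in R is the common alternative). The item names item1 to item6 and the file responses.csv are hypothetical; each `=~` line corresponds to the single-headed arrows from a factor to its items, and the covariance between the two factors corresponds to a double-headed arrow.

```python
import pandas as pd
import semopy

# Hypothetical two-factor CFA: "=~" reads "is measured by".
desc = """
Professionalism =~ item1 + item2 + item3
Leadership      =~ item4 + item5 + item6
"""

data = pd.read_csv("responses.csv")    # hypothetical respondent-by-item data
model = semopy.Model(desc)
model.fit(data)                        # maximum likelihood estimation

print(model.inspect())                 # loadings, error variances, covariances
print(semopy.calc_stats(model))        # goodness-of-fit indices (CFI, RMSEA, ...)
```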

If CFA indicates that the primary factors, or first-order factors, produced by the prior PAF are correlated, then second-order factors need to be modelled and estimated to gain a greater understanding of the data. It should be noted that if the prior EFA applied an orthogonal rotation to the factor solution, the factors produced would be uncorrelated, and hence the analysis of second-order factors would not be possible. Generally, in social science research, most constructs assume inter-related factors and therefore call for an oblique rotation. The justification for analyzing second-order factors is that, when correlations between the primary factors exist, CFA can statistically model a broad picture of factors not captured by the first-order factors. 9 The analysis of first-order factors is like surveying mountains with binoculars with a zoom lens, while the analysis of second-order factors uses a wide-angle lens. 10 Goodness-of-fit tests need to be conducted when evaluating the hypothetical model tested by CFA. The question is: do the new data fit the hypothetical model? However, the statistical models behind goodness-of-fit tests are complex and extend beyond the scope of this editorial paper; thus, we strongly encourage readers to consult with factor analysts for resources and advice.

Conclusions

Factor analysis methods can be incredibly useful tools for researchers attempting to establish high quality measures of constructs that are not directly observed and captured by observation. Specifically, the factor solution derived from an Exploratory Factor Analysis provides a snapshot of the statistical relationships between the key behaviors, attitudes, and dispositions of the construct of interest. This snapshot provides critical evidence for the validity of the measure based on the fit of the test content to the theoretical framework that underlies the construct. Further, the relationships between factors, which can be explored with EFA and confirmed with CFA, help researchers interpret the theoretical connections between underlying dimensions of a construct, even extending to relationships across constructs in a broader theoretical model. However, studies that do not apply recommended extraction, rotation, and interpretation procedures in FA risk drawing faulty conclusions about the validity of a measure. As measures are picked up by other researchers and applied in experimental designs, or by practitioners as assessments in practice, the application of measures with subpar evidence for validity produces a ripple effect across the field. It is incumbent on researchers to ensure best practices are applied, or to engage with methodologists for support and consultation where there are gaps in knowledge of methods. Further, it remains important to critically evaluate measures selected for research and practice, focusing on those that demonstrate alignment with best practice for FA and instrument development. 7 , 11

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Factor Analysis as a Tool for Survey Analysis

Noora Shrestha

American Journal of Applied Mathematics and Statistics, Vol. 9, No. 1, 2021, pp. 4-11. http://pubs.sciepub.com/ajams/9/1/2

Factor analysis is particularly suitable for reducing a large number of related variables to a more manageable number of factors, prior to using them in other analyses such as multiple regression or multivariate analysis of variance. It can be beneficial in the development of a questionnaire. Sometimes adding more statements to a questionnaire fails to give a clear understanding of the variables; with the help of factor analysis, irrelevant questions can be removed from the final questionnaire. This study applied factor analysis to identify the factors underlying the variables of a questionnaire designed to measure tourist satisfaction. The Kaiser-Meyer-Olkin measure of sampling adequacy and Bartlett’s test of Sphericity are used to assess the factorability of the data. The determinant score is calculated to examine multicollinearity among the variables. To determine the number of factors to be extracted, Kaiser’s criterion and the Scree test are examined. The varimax orthogonal factor rotation method is applied to minimize the number of variables that have high loadings on each factor. Internal consistency is confirmed by calculating Cronbach’s alpha and composite reliability to test the instrument’s accuracy. Convergent validity is established when the average variance extracted is greater than or equal to 0.5. The results reveal that factor analysis not only allows the detection of irrelevant items but also allows the extraction of valuable factors from the data set of a questionnaire survey. The application of factor analysis to questionnaire evaluation provides very valuable inputs to decision makers, allowing them to focus on a few important factors rather than a large number of parameters.

1. Introduction

Factor analysis is a multivariate statistical technique applied to a single set of variables when the investigator is interested in determining which variables in the set form logical subsets that are relatively independent of one another [1]. In other words, factor analysis is particularly useful for identifying the factors underlying a set of variables by grouping related variables in the same factor [2]. In this paper, the main focus is on the application of factor analysis to reduce a huge number of inter-correlated measures to a few representative constructs or factors that can be used for subsequent analysis [3]. The goal of the present work is to examine the application of factor analysis to questionnaire items measuring tourist satisfaction. Therefore, in order to identify the factors, it is necessary to understand the concept of factor analysis and the steps in applying it to a questionnaire survey.

Factor analysis is based on the assumption that all variables correlate to some degree. The variables should be measured at least at the ordinal level. The sample size for factor analysis should be large; a commonly accepted guideline is a ten-to-one ratio of cases to items [3, 4]. There are two main approaches to factor analysis: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Exploratory factor analysis is used for checking dimensionality and is often applied in the early stages of research to gather information about the interrelationships among a set of variables [5]. On the other hand, confirmatory factor analysis is a more complex and sophisticated set of techniques used later in the research process to test specific hypotheses or theories concerning the structure underlying a set of variables [6, 7].

Several studies have examined and discussed the application of factor analysis to reduce large sets of data and to identify the factors extracted from the analysis [8, 9, 10, 11]. In the tourism business, the satisfaction of tourists can be measured by a large number of parameters. Factor analysis may cluster these variables into different factors, where each factor measures some dimension of tourist satisfaction. Factors are so designed that the variables contained in them are linked with each other in some way. The significant factors are extracted to explain the maximum variability of the group under study. The application of factor analysis provides very valuable inputs to decision makers and policy makers, allowing them to focus on a few factors rather than a large number of parameters. People in the tourism business are interested in knowing what makes their customers or tourists choose a particular destination. There are boundless concerns on which the opinions of tourists can be taken. Several issues, like local food, weather conditions, culture, nature, recreation activities, photography, travel video making, transportation, medical treatment, water supply, safety, communication, trekking, mountaineering, environment, natural resources, and the costs of accommodation and transportation, may be explored by taking responses from tourist surveys and from the literature review [12]. Using factor analysis, the large number of variables may be grouped into different components: component one, component two, etc. Instead of concentrating on many issues, the researcher or policy maker can make a strategy to optimize these components for the growth of the tourism business.

The contribution of this paper is twofold, related to the advantages of factor analysis. First, factor analysis can be applied to the development of a questionnaire: through the analysis, irrelevant questions can be removed from the final questionnaire, and it helps in categorizing the questions into different parameters. Second, factor analysis can be used to simplify data, for example by decreasing the number of variables in regression models. This study also encourages researchers to consider the step-by-step process of identifying factors using factor analysis. Sometimes adding more statements or items to a questionnaire fails to give a clear understanding of the variables. Using factor analysis, a few factors are extracted from a large number of related variables, reducing them to a more manageable number prior to using them in other analyses such as multiple regression or multivariate analysis of variance [7, 13]. Hence, instead of examining all the parameters, a few extracted factors can be studied, which in turn explain the variation in the group characteristics. Therefore, the present study discusses the factor analysis of a questionnaire to measure tourist satisfaction. In the present work, data collected from a tourist satisfaction survey is used as an example for the factor analysis.

A structured questionnaire was designed to collect primary data. The data were collected from international tourists who travelled to various places in Nepal in 2019. Tourists older than 25 years of age who had been in Nepal for over a week and had experienced the travelling were included in this study. A pilot study was carried out among 15 tourists, who were not included in the sample, to identify possible errors in the questionnaire and so improve its reliability (Cronbach’s alpha > 0.7). The questionnaire consists of questions and statements related to the independent and dependent variables, which were developed on the basis of the literature review. Each statement was rated on a five-point (1 to 5) Likert scale, with a high score of 5 indicating strong agreement with the statement. The statements were written to reflect hospitality, destination attractions, and relaxation. The data were gathered from the first week of November 2019 to the last week of December 2019. Due to the outbreak of the 2019 novel coronavirus, the data collection process was affected, and hence a convenience sampling method was used to select respondents. In total, 220 questionnaires were distributed among the tourists, and 200 respondents provided their reactions to the statements, a response rate of 91%. All the statistical analysis was performed using IBM SPSS version 23.

The reliability of a questionnaire is examined with Cronbach’s alpha. It provides a simple way to measure whether or not a score is reliable. It is used under the assumption that there are multiple items measuring the same underlying construct; in the tourist satisfaction survey, for instance, there are several questions all asking different things, but, when combined, they can be said to measure overall satisfaction. Cronbach’s alpha is a measure of internal consistency. It is also considered to be a measure of scale reliability and can be expressed as

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_{i}^{2}}{\sigma_{t}^{2}}\right)$$

where $k$ is the number of items, $\sigma_{i}^{2}$ is the variance of item $i$, and $\sigma_{t}^{2}$ is the variance of the total score.
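For reference, a minimal Python implementation of this formula, where rows are respondents and columns are scale items:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the sum score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Usage with a hypothetical DataFrame of Likert responses:
# alpha = cronbach_alpha(df.to_numpy())
```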

The average variance extracted and the composite reliability coefficients relate to the quality of a measure. AVE is a measure of the amount of variance captured by a construct in relation to the amount of variance due to measurement error [15]. To be specific, AVE is a measure used to assess convergent validity.

Convergent validity measures the extent to which multiple indicators of the same construct are in agreement. The factor loadings of the items, the composite reliability, and the average variance extracted have to be calculated to determine convergent validity [16]. The values of AVE and CR range from 0 to 1, where a higher value indicates a higher reliability level. An AVE of at least 0.5 confirms convergent validity. The average variance extracted is the sum of squared loadings divided by the number of items and is given by

$$\mathrm{AVE} = \frac{\sum_{i=1}^{n} \lambda_{i}^{2}}{n}$$

Composite reliability is a measure of internal consistency in scale items [17]. According to Fornell and Larcker (1981), composite reliability is an indicator of the shared variance among the observed variables used as indicators of a latent construct. CR for each construct is obtained as the squared sum of the completely standardized factor loadings, divided by that squared sum plus the total variance of the error terms of the indicators. CR can be calculated as:

$$\mathrm{CR} = \frac{\left(\sum_{i=1}^{n} \lambda_{i}\right)^{2}}{\left(\sum_{i=1}^{n} \lambda_{i}\right)^{2} + \sum_{i=1}^{n} \mathrm{Var}(e_{i})}$$

Here, $n$ is the number of items, $\lambda_{i}$ the factor loading of item $i$, and $\mathrm{Var}(e_{i})$ the variance of the error term of item $i$. Composite reliability values between 0.6 and 0.7 are acceptable, while in more advanced phases of research the value should be higher than 0.7. According to Fornell and Larcker (1981), if AVE is less than 0.5 but composite reliability is higher than 0.6, the convergent validity of the construct is still adequate.
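A small sketch computing AVE and CR from standardized loadings. The loadings below are those reported for the 'Relaxation' component in Table 4, and each item's error variance is taken as one minus its squared loading, which holds for completely standardized solutions:

```python
import numpy as np

def ave(loadings) -> float:
    """Average variance extracted: mean of the squared standardized loadings."""
    lam = np.asarray(loadings)
    return float((lam ** 2).sum() / lam.size)

def composite_reliability(loadings) -> float:
    """Fornell-Larcker CR, with Var(e_i) = 1 - loading_i**2."""
    lam = np.asarray(loadings)
    squared_sum = lam.sum() ** 2
    error_variance = (1.0 - lam ** 2).sum()
    return float(squared_sum / (squared_sum + error_variance))

relaxation = [0.52, 0.86, 0.84]                      # loadings from Table 4
print(round(ave(relaxation), 2))                     # ~0.57 (Table 5: 0.56)
print(round(composite_reliability(relaxation), 2))   # ~0.79, as in Table 5
```

The small AVE discrepancy comes from the loadings in Table 4 being rounded to two decimals.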

This study employs exploratory factor analysis to examine the data set, identify complicated interrelationships among items, and group items that are part of integrated concepts. Due to the explorative nature of factor analysis, it does not differentiate between independent and dependent variables; it clusters similar variables into the same factor to identify underlying variables, using only the data correlation matrix. In this study, factor analysis with principal components extraction was used to examine whether the statements represent identifiable factors related to tourist satisfaction. Principal component analysis (PCA) refers to the statistical process of computing the principal components of the data in order to emphasize variation and bring out strong patterns in the dataset [6, 9].

Factor Model with ‘m’ Common Factors

Let $X = (X_1, X_2, \ldots, X_p)'$ be a random vector with mean vector $\mu$ and covariance matrix $\Sigma$. The factor analysis model assumes that

$$X = \mu + \Lambda F + \varepsilon,$$

where $\Lambda = \{\lambda_{jk}\}_{p \times m}$ denotes the matrix of factor loadings, with $\lambda_{jk}$ the loading of the $j$th variable on the $k$th common factor; $F = (F_1, F_2, \ldots, F_m)'$ denotes the vector of latent factor scores, with $F_k$ the score on the $k$th common factor; and $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p)'$ denotes the vector of latent error terms, with $\varepsilon_j$ the $j$th specific factor.
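One way to internalize the model is to simulate from it. The following sketch, with hypothetical dimensions and loadings, generates data satisfying $X = \mu + \Lambda F + \varepsilon$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 200, 11, 3           # respondents, observed variables, common factors

Lam = rng.uniform(0.4, 0.9, size=(p, m))   # hypothetical loading matrix Lambda
mu = np.zeros(p)                           # mean vector
F = rng.standard_normal((n, m))            # latent factor scores, F ~ (0, I_m)
eps = 0.5 * rng.standard_normal((n, p))    # specific factors (error terms)

X = mu + F @ Lam.T + eps                   # each row is one draw of X
```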

There are three major steps in factor analysis: a) assessment of the suitability of the data, b) factor extraction, and c) factor rotation and interpretation. They are described below.

2.2.1.1. Assessment of the Suitability of the Data

To determine the suitability of the data set for factor analysis, the sample size and the strength of the relationships among the items have to be considered [1, 18]. Generally, a larger sample is recommended for factor analysis, i.e., ten cases for each item. Nevertheless, a smaller sample size can be sufficient if the solution has several high-loading marker variables (> 0.80) [18]. To determine the strength of the relationships among the items, there must be evidence of correlation coefficients > 0.3 in the correlation matrix. The existence of multicollinearity in the data is a type of disturbance that alters the results of the analysis. It is a state of high inter-correlations among the independent variables. Multicollinearity can make significant variables in a research study appear statistically insignificant, so the statistical inferences made from the data may not be trustworthy [19, 20]. Hence, the presence of multicollinearity among the variables is examined with the determinant score.

Determinant Score

The value of the determinant is an important test for multicollinearity or singularity. The determinant score of the correlation matrix should be > 0.00001, which indicates an absence of multicollinearity. If the determinant value is < 0.00001, it is important to try to identify pairs of variables with correlation coefficients r > 0.8 and consider eliminating them from the analysis. A lower score might indicate that groups of three or more questions/statements have high inter-correlations, so the threshold for item elimination should be reduced until this condition is satisfied. If the correlation matrix is singular, the determinant |R| = 0 [20, 21].
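The determinant check is a one-liner with numpy, assuming a hypothetical DataFrame df of the survey items:

```python
import numpy as np

R = df.corr().to_numpy()      # inter-item correlation matrix
det_R = np.linalg.det(R)

# |R| > 0.00001 suggests no problematic multicollinearity;
# |R| near 0 points to highly correlated (or singular) sets of items.
print(f"|R| = {det_R:.5f}")
```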

There are two statistical measures to assess the factorability of the data: Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett’s test of Sphericity.

Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy

The KMO test is a measure designed to assess the suitability of data for factor analysis. In other words, it tests the adequacy of the sample size. The test measures sampling adequacy for each variable in the model and for the complete model. The KMO measure of sampling adequacy is given by the formula:

$$\mathrm{KMO} = \frac{\sum_{i \neq j} R_{ij}^{2}}{\sum_{i \neq j} R_{ij}^{2} + \sum_{i \neq j} U_{ij}^{2}}$$

where $R_{ij}$ is the correlation matrix and $U_{ij}$ is the partial covariance matrix. The KMO value varies from 0 to 1. KMO values between 0.8 and 1.0 indicate the sampling is adequate. KMO values between 0.7 and 0.79 are middling, and values between 0.6 and 0.69 are mediocre. KMO values less than 0.6 indicate the sampling is not adequate and remedial action should be taken. If the value is less than 0.5, the results of the factor analysis will certainly be unsuitable for the analysis of the data. If the sample size is < 300, the average communality of the retained items has to be tested. An average value > 0.6 is acceptable for sample sizes < 100, and an average value between 0.5 and 0.6 is acceptable for sample sizes between 100 and 200 [1, 22, 23, 24].

Bartlett’s Test of Sphericity

Bartlett’s Test of Sphericity tests the null hypothesis $H_0$: the variables are orthogonal, i.e., the original correlation matrix is an identity matrix, indicating that the variables are unrelated and therefore unsuitable for structure detection. The alternative hypothesis $H_1$ is that the variables are not orthogonal, i.e., they are correlated enough that the correlation matrix diverges significantly from the identity matrix. A significance value < 0.05 indicates that a factor analysis may be worthwhile for the data set.

In order to measure the overall relation between the variables, the determinant of the correlation matrix $|R|$ is calculated. Under $H_0$, $|R| = 1$; if the variables are highly correlated, then $|R| \approx 0$. Bartlett’s test of Sphericity is given by:

$$\chi^{2} = -\left[(n-1) - \frac{2p+5}{6}\right] \ln|R|$$

where $p$ is the number of variables, $n$ the total sample size, and $R$ the correlation matrix [22, 24].
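Both factorability checks are implemented in Python's third-party factor_analyzer package; a sketch, again with a hypothetical respondents-by-items DataFrame df:

```python
from factor_analyzer.factor_analyzer import (calculate_bartlett_sphericity,
                                             calculate_kmo)

chi_square, p_value = calculate_bartlett_sphericity(df)
kmo_per_item, kmo_total = calculate_kmo(df)

print(f"Bartlett chi-square = {chi_square:.2f}, p = {p_value:.4g}")
print(f"Overall KMO = {kmo_total:.3f}")    # the study reports KMO = 0.813
```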

Factor extraction involves determining the smallest number of factors that can best represent the interrelationships among the set of variables. There are many approaches to extracting the number of underlying factors. For obtaining factor solutions, principal component analysis or common factor analysis can be used. This study used principal component analysis (PCA) because the purpose of the study is to analyze the data in order to obtain the minimum number of factors required to represent the available data set.

To Determine the Number of Factors to be Extracted

In this study, two techniques are used to assist in the decision concerning the number of factors to retain: Kaiser’s criterion and the Scree test. Kaiser’s criterion (the eigenvalue criterion) and the Scree test can be used to determine the number of initial unrotated factors to be extracted. The eigenvalue represents the amount of variance in the variables that is explained by a specific extracted factor.

Kaiser’s (Eigenvalue) Criterion

The eigenvalue of a factor represents the amount of the total variance explained by that factor. In factor analysis, the notable factors with eigenvalues greater than one are retained. The logic underlying this rule is reasonable: an eigenvalue greater than one is considered significant, as it indicates that more common variance than unique variance is explained by that factor [7, 22, 23, 25]. Measured and composite variables are separate classes of variables. Factors are latent constructs created as aggregates of measured variables and so should consist of more than a single measured variable. But eigenvalues, like all sample statistics, have some sampling error. Hence, it is very important for the researcher to exercise some judgment when using this strategy to determine the number of factors to extract or retain [26].

Cattell (1966) proposed a graphical test for determining the number of factors. A scree plot graphs eigenvalue magnitudes on the vertical axis, with eigenvalue numbers constituting the horizontal axis. The eigenvalues are plotted as dots within the graph, and a line connects successive values. Factor extraction should be stopped at the point where there is an 'elbow', or leveling, of the plot. This test is used to identify the optimum number of factors that can be extracted before the amount of unique variance begins to dominate the common variance structure [4, 27, 28].
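A scree plot can be drawn from the eigenvalues of the correlation matrix; a sketch using factor_analyzer and matplotlib with the same hypothetical df:

```python
import matplotlib.pyplot as plt
from factor_analyzer import FactorAnalyzer

fa = FactorAnalyzer(rotation=None)      # unrotated, just to get eigenvalues
fa.fit(df)
eigenvalues, _ = fa.get_eigenvalues()

plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, "o-")
plt.axhline(1.0, linestyle="--")        # Kaiser criterion: retain eigenvalues > 1
plt.xlabel("Component number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()
```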

Factors obtained in the initial extraction phase are often difficult to interpret because of significant cross-loadings, in which many factors are correlated with many variables. There are two main approaches to factor rotation: orthogonal (uncorrelated) or oblique (correlated) factor solutions. In this study, orthogonal factor rotation is used because it produces solutions that are easier to interpret and report. Varimax, quartimax, and equimax are the methods of orthogonal rotation. The varimax method developed by Kaiser (1958) is used here to minimize the number of variables that have high loadings on each factor. Varimax focuses on maximizing the differences between the squared pattern structure coefficients on a factor (i.e., it takes a column perspective). The spread in loadings is maximized: loadings that are high after extraction become higher after rotation, and loadings that are low become lower. If the rotated component matrix shows many significant cross-loading values, it is suggested to rerun the factor analysis, deleting all cross-loaded variables, so that each item loads on only one component [26, 28, 29].
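A varimax-rotated three-factor solution with small loadings suppressed, mirroring the settings used in this study, might look like the following (factor_analyzer assumed; df hypothetical):

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

fa = FactorAnalyzer(n_factors=3, rotation="varimax", method="principal")
fa.fit(df)

loadings = pd.DataFrame(fa.loadings_, index=df.columns,
                        columns=["Comp1", "Comp2", "Comp3"])

# Hide loadings below 0.4, as in Table 4, to ease interpretation.
print(loadings.where(loadings.abs() >= 0.4).round(2))
```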

Orthogonal Factor Model Assumptions

The orthogonal factor analysis model assumes the form $X = \mu + \Lambda F + \varepsilon$ and adds the assumptions that $F \sim (0, I_m)$, i.e., the latent factors have mean zero, unit variance, and are uncorrelated; that $\varepsilon \sim (0, \Psi)$, where $\Psi = \mathrm{diag}(\psi_1, \psi_2, \ldots, \psi_p)$ with $\psi_j$ denoting the $j$th specific variance; and that $\varepsilon_j$ and $F_k$ are independent of one another for all pairs $j, k$.

Variance Explained by Common Factors

The portion of the variance of the $j$th variable that is explained by the $m$ common factors is called the communality of the $j$th variable. The variance decomposes as

$$\sigma_{jj} = h_j^2 + \psi_j,$$

where $\sigma_{jj}$ is the variance of $X_j$ (the $j$th diagonal element of $\Sigma$), the communality is the sum of squared loadings for $X_j$,

$$h_j^2 = (\Lambda \Lambda')_{jj} = \lambda_{j1}^2 + \lambda_{j2}^2 + \cdots + \lambda_{jm}^2,$$

and $\psi_j$ is the specific variance (or uniqueness) of $X_j$ [14, 24, 26].
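Communalities follow directly from the loading matrix as row-wise sums of squared loadings; a sketch under the same assumptions as above:

```python
import numpy as np
from factor_analyzer import FactorAnalyzer

fa = FactorAnalyzer(n_factors=3, rotation="varimax", method="principal")
fa.fit(df)    # df: hypothetical respondents-by-items DataFrame

# h_j^2 for each variable; factor_analyzer also exposes this
# directly as fa.get_communalities().
communalities = (fa.loadings_ ** 2).sum(axis=1)
print(np.round(communalities, 2))
```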

3. Results and Discussions

In this section, the results obtained with the statistical software SPSS are presented. The participants consisted of 200 tourists who had been travelling in Nepal during 2019. The largest group of tourists (26.5%) belonged to the age group 40 to 44 years. Participants ranged in age from 25 to 55 years (mean age = 39.8 years, standard deviation = 7.94); of the total sample, 54% (n = 108) were male and 46% (n = 92) were female. In addition, the respondents were from various parts of the world. The region-wise distribution of tourists was Asian–SAARC (n = 59, 29.5%), Asian–other (n = 58, 29%), European (n = 40, 20%), American (n = 18, 9%), Oceanian (n = 17, 8.5%), and other (n = 8, 4%). There were various purposes for visiting Nepal: 29.5% (n = 59) of the tourists visited Nepal for holiday and pleasure. Similarly, 25% (n = 50) came for adventure, including trekking and mountaineering; 15% (n = 30) for volunteering and academic purposes; 22% (n = 44) for entertainment, video, and photography; and 8.5% (n = 17) for other purposes. The average length of stay of the respondent tourists was 12 days. According to Nepal tourism statistics 2019, the average length of stay of international tourists in Nepal dropped to 12.4 days in 2018 from 12.6 days in 2017 [30].

This study has followed three major steps for factor analysis: a) assessment of the suitability of the data, b) factor extraction, and c) factor rotation and interpretation.

Table 1. Correlation Matrix and Determinant Score


Step 1: Assessment of the Suitability of the Data

The most significant factor in international tourists’ satisfaction is hospitality, covering homestays with local families; arts, crafts, and historic places; local souvenirs; and local food. Similarly, destination attraction plays a vital role in tourist satisfaction, through cultural activities, trekking, sightseeing, and safety during the travel period. Most tourists visit different places for relaxation and to experience a different lifestyle. These factors may be associated with the satisfaction of tourists. To analyze tourist satisfaction, the Kaiser-Meyer-Olkin measure is used to assess the suitability of the data for factor analysis. Similarly, Bartlett’s test of Sphericity, the correlation matrix, and the determinant score are computed to check the appropriateness of the data set for factor analysis [31].

In Table 1, the correlation matrix displays sufficient correlations to justify the application of factor analysis. The correlation matrix shows several inter-correlations > 0.3 between the variables, so it can be concluded that the hypothesized factor model appears to be suitable. The value of the determinant is an important test for multicollinearity. The determinant score of the correlation matrix is 0.038 > 0.00001, which indicates an absence of multicollinearity.

Table 2 shows that the value of the KMO statistic is 0.813 > 0.6, which indicates that the sampling is adequate and factor analysis is appropriate for the data. Bartlett’s test of Sphericity is used to test the adequacy of the correlation matrix. Bartlett’s test is highly significant at p < 0.001, which shows that the correlation matrix has significant correlations among at least some of the variables. Here, the test value is 637.65 with an associated significance level of less than 0.0001. Hence, the hypothesis that the correlation matrix is an identity matrix is rejected; that is, the variables are not orthogonal. A significance value < 0.05 indicates that a factor analysis may be worthwhile for the data set.

Table 2. Kaiser-Meyer-Olkin and Bartlett’s Test of Sphericity


Step 2: Factor Extraction

Kaiser’s criterion and the Scree test are used to determine the number of initial unrotated factors to be extracted. The eigenvalues associated with each factor represent the variance explained by that particular linear component. Factor loadings with values less than 0.4 were suppressed in the output, so only loadings of 0.4 or more are presented [23].

Table 3. Eigenvalues (EV) and Total Variance Explained


Table 3 shows the eigenvalues and total variance explained. The extraction method of factor analysis used in this study is principal component analysis. Before extraction, eleven linear components were identified within the data set. After extraction and rotation, there are three distinct linear components within the data set with eigenvalues > 1. The three extracted factors account for a combined 60.2% of the total variance. It is suggested that the proportion of the total variance explained by the retained factors should be at least 50%. The result shows that 60.2% of the common variance shared by the eleven variables can be accounted for by three factors. This is consistent with the KMO value of 0.813, which can be considered good and also indicates that factor analysis is useful for these variables. This initial solution suggests that the final solution will extract no more than three factors. The first component explains 22% of the total variance with eigenvalue 4.01; the second component explains 20.9% with eigenvalue 1.54; and the third component explains 17.34% with eigenvalue 1.08.

In Figure 1, for the Scree test, a graph is plotted with eigenvalues on the y-axis against the eleven component numbers, in their order of extraction, on the x-axis. The initial factors extracted are large factors with higher eigenvalues, followed by smaller factors. The scree plot is used to determine the number of factors to retain. Here, the scree plot shows that there are three factors with eigenvalues greater than one, which account for most of the total variability in the data. The other factors account for a very small proportion of the variability and are considered less important.

Step 3: Factor Rotation and Interpretation

The present study has executed the extraction method based on principal component analysis and the orthogonal rotation method based on varimax with Kaiser normalization.

Table 4 exhibits the factor loadings, diagonal anti-image correlations, and communalities after extraction. The diagonal anti-image correlation gives information about the sampling adequacy of each item. The communalities reflect the common variance in the data structure after extraction of factors. Factor loading values communicate the relationship of each variable to the underlying factors. Variables with large loading values (> 0.40) are representative of the factor.


Figure 1. Scree Plot

Table 4. Summary of factors related to travel satisfaction


Component 1 is labeled 'Hospitality' and contains four items, homestay, local food, local souvenirs, and arts, crafts, and historic places, which have correlations of 0.77, 0.70, 0.71, and 0.74 with component 1, respectively. The hospitality component explains 22% of the total variance with eigenvalue 4.01. Of these items, arts, crafts, and historic places tends toward strongly agree according to its mean score of 4.23. The other three items (homestay, local souvenirs, and local food) tend toward agree according to their mean scores on the scale.

The second component, entitled 'Destination Attraction', explains 20.9% of the variance with eigenvalue 1.54. This component contains four items: sightseeing, trekking, cultural activities, and safety, which have correlations of 0.71, 0.72, 0.79, and 0.58 with component 2, respectively. The item cultural activities (mean = 4.25) tends toward strongly agree, while trekking, sightseeing, and safety tend toward agree according to their mean scores on the scale.

Component 3 is marked 'Relaxation'. It contains three items, namely stress relief, different lifestyle, and new experience, which have correlations of 0.52, 0.86, and 0.84 with component 3, respectively. The third component explains 17.34% of the variance with eigenvalue 1.08. The three items of the third component tend toward agree according to their mean scores on the scale.

In Table 4, the diagonal elements of the anti-image correlation matrix give information about the sampling adequacy of each item and must be > 0.5. The amount of variance in each variable that can be explained by the retained factors is represented by the communalities after extraction. The communalities indicate the common variance in the data set. The communality value corresponding to the first statement (Item_1) of the first component is 0.63, meaning 63% of the variance associated with this statement is common. Similarly, 63%, 57%, 56%, 64%, 59%, 55%, 67%, 50%, 50%, 77%, and 72% of the variance associated with the first to eleventh statements, respectively, is common.

4. Reliability and Validity Test Results

The internal consistency is confirmed by calculating Cronbach’s alpha to test the instrument’s accuracy and reliability. The adequate threshold value for Cronbach’s alpha is > 0.7. In Table 5, the components hospitality, destination attraction, and relaxation have Cronbach’s alpha values of 0.75, 0.74, and 0.71, respectively, which confirms the reliability of the survey instrument. The Cronbach’s alpha coefficient for the total scale is 0.82 > 0.7. This shows that the variables correlate with their component groupings and thus are internally consistent.

Convergent validity is established when the average variance extracted is ≥ 0.5. The AVE values corresponding to the components hospitality, destination attraction, and relaxation are 0.53, 0.50, and 0.56, respectively. According to Fornell and Larcker (1981), AVE ≥ 0.5 confirms convergent validity, and all the AVE values in Table 5 are greater than or equal to 0.5. The composite reliability values for components 1, 2, and 3 are 0.82, 0.79, and 0.79, respectively, which evidences the internal consistency of the scale items.

Table 5. Reliability, Average Variance Extracted (AVE) and Composite Reliability (CR)



5. Conclusion

The goal of this study was to examine the factor analysis of a questionnaire to identify the main factors that measure tourist satisfaction. The suitability of the data set for factor analysis was assessed against the threshold values of the determinant score, the Kaiser-Meyer-Olkin measure, and Bartlett’s test of Sphericity. Based on the results of this study, it can be concluded that factor analysis is a promising approach for extracting significant factors that explain the maximum variability of the group under study.

Hospitality, destination attraction, and relaxation are the major factors extracted, using principal component analysis and the varimax orthogonal factor rotation method, to measure the satisfaction of tourists. The application of factor analysis provides very valuable inputs to decision makers and policy makers, allowing them to focus on a few manageable factors rather than a large number of parameters. The findings of the study cannot be generalized to a larger population, so a more advanced study could be done with a larger sample size and probability sampling methods. Nevertheless, before making stronger decisions on the tourist satisfaction factors to promote the tourism of a country, further detailed research is required.

Published with license by Science and Education Publishing, Copyright © 2021 Noora Shrestha


One Size Doesn’t Fit All: Using Factor Analysis to Gather Validity Evidence When Using Surveys in Your Research

Christopher Runyon

*Address correspondence to: Eva Knekta ( E-mail Address: [email protected] ).

Department of Science and Mathematics Education, Umeå University, 901 87 Umeå, Sweden

Department of Biological Sciences, Florida International University, Miami, FL 33199


Department of Educational Psychology, University of Texas at Austin, Austin, TX 78712

National Board of Medical Examiners, Philadelphia, PA 19104

Across all sciences, the quality of measurements is important. Survey measurements are only appropriate for use when researchers have validity evidence within their particular context. Yet, this step is frequently skipped or is not reported in educational research. This article briefly reviews the aspects of validity that researchers should consider when using surveys. It then focuses on factor analysis, a statistical method that can be used to collect an important type of validity evidence. Factor analysis helps researchers explore or confirm the relationships between survey items and identify the total number of dimensions represented on the survey. The essential steps to conduct and interpret a factor analysis are described. This use of factor analysis is illustrated throughout by a validation of Diekman and colleagues’ goal endorsement instrument for use with first-year undergraduate science, technology, engineering, and mathematics students. We provide example data, annotated code, and output for analyses in R, an open-source programming language and software environment for statistical computing. For education researchers using surveys, understanding the theoretical and statistical underpinnings of survey validity is fundamental for implementing rigorous education research.

THE USE OF SURVEYS IN BIOLOGY EDUCATION RESEARCH

Surveys and achievement tests are common tools used in biology education research to measure students’ attitudes, feelings, and knowledge. In the early days of biology education research, researchers designed their own surveys (also referred to as “measurement instruments” 1 ) to obtain information about students. Generally, each question on these instruments asked about something different and did not involve extensive use of measures of validity to ensure that researchers were, in fact, measuring what they intended to measure ( Armbruster et al. , 2009 ; Rissing and Cogan, 2009 ; Eddy and Hogan, 2014 ). In recent years, researchers have begun adopting existing measurement instruments. This shift may be due to researchers’ increased recognition of the amount of work that is necessary to create and validate survey instruments (cf. Andrews et al. , 2017 ; Wachsmuth et al. , 2017 ; Wiggins et al. , 2017 ). While this shift is a methodological advancement, as a community of researchers we still have room to grow. As biology education researchers who use surveys, we need to understand both the theoretical and statistical underpinnings of validity to appropriately employ instruments within our contexts. As a community, biology education researchers need to move beyond simply adopting a “validated” instrument to establishing the validity of the scores produced by the instrument for a researcher’s intended interpretation and use. This will allow education researchers to produce more rigorous and replicable science. In this primer, we walk the reader through important validity aspects to consider and report when using surveys in their specific context.

Measuring Variables That Are Not Directly Observable

Some variables measured in education studies are directly observable. For example, the percent of international students in a class or the amount of time students spend on a specific task can both be directly observed by the researcher. Other variables that researchers may want to measure are not directly observable, such as students’ attitudes, feelings, and knowledge. The measurement of unobservable variables is what we focus on in this primer. To study these unobservable variables, researchers collect several related observable variables (responses to survey items) and use them to make inferences about the unobservable variable, termed “latent variable” or “construct” 2 in the measurement literature. For example, when assessing students’ knowledge of evolution, it is intuitive that a single item (i.e., a test question) would not be sufficient to make judgments about the entirety of students’ evolution knowledge. Instead, students’ scores from several items measuring different aspects of evolution are combined into a sum score. The measurement of attitudes and feelings (e.g., students’ goals, students’ interest in biology) is no different. For example, say a researcher wanted to understand the degree to which students embrace goals focused on improving themselves, agentic goals , as will be seen in our illustrating example in this primer. Instead of asking students one question about how important it is for them to improve themselves, an instrument was created to include a number of items that focus on slightly different aspects of improving the self. The observed responses to these survey items can then be combined to represent the construct agentic goal endorsement . To combine a number of items to represent one construct, the researcher must provide evidence that these items truly represent the same construct. In this paper, we provide an overview of the evidence necessary to have confidence in using a survey instrument for one’s specific purpose and go into depth for one type of statistical evidence for validity: factor analysis.

The aims of this article are 1) to briefly review the theoretical background for instrument validation and 2) to provide a step-by-step description of how to use factor analysis to gather evidence about the number and nature of constructs in an instrument. We begin with a brief theoretical background about validity and constructs to situate factor analysis in the larger context of instrument validation. Next, we discuss coefficient alpha, a statistic currently used, and often misused, in educational research as evidence for validity. The remainder of the article explores the statistical method of factor analysis. We describe what factor analysis is, when it is appropriate to use it, what we can learn from it, and the essential steps in conducting it. An evaluation of the number and nature of constructs in the Diekman et al. (2010) goal-endorsement instrument when used with first-year undergraduate science, technology, engineering, and mathematics (STEM) students is provided to illustrate the steps involved in conducting a factor analysis and how to report it in a paper (see Boxes 1 – 7 ). The illustrating example comes from a unique data collection and analysis made by the authors of this article. Data, annotated code, and output from the analyses run in R (an open-source programming language and software environment for statistical computing; R Core Team, 2016 ) for this example are included in the Supplemental Material.

WHAT IS VALIDITY?

The quality of measurements is important to all sciences. Although different terms are used in different disciplines, the underlining principles and problems are the same across disciplines. For example, in physics, the terms “accuracy” and “precision” are commonly used to describe how confident researchers should be in their measurements. In the discourse about survey quality, validity and reliability are the key concepts for measurement quality. Roughly, validity refers to whether an instrument actually measures what it is designed to measure and reliability is the consistency of the instrument’s measurements.

In this section, we will briefly outline what validity is and the many types of validity evidence. Reliability, and its relation to validity, will be discussed in The Misuse of Coefficient Alpha . Before getting into the details, we need to emphasize a critical concept about validity that is often overlooked: validity is not a characteristic of an instrument, but rather a characteristic of the use of an instrument in a particular context. Anytime an instrument is used in a new context, at least some evidence of its validity must be established for that specific context.

Validity Is Not a Property of the Instrument

Validity refers to the degree to which evidence and theory support the interpretations of the test score for the proposed use. ( AERA, APA, and NCME, 2014 , p. 11)

Thus, validity is not a property of the measurement instrument but rather refers to its proposed interpretation and use. Validity must be considered each time an instrument is used ( Kane, 2016 ). An instrument may be validated for a certain population and purpose, but that does not mean it will work across all populations and for all purposes. For example, a validation of Diekman’s goal-endorsement instrument (Diekman et al., 2010) as a reasonable measure of university students’ goal endorsement does not automatically validate the use of the instrument for measuring 6-year-olds’ goal endorsement. Similarly, a test validated for one purpose, such as being a reasonable measure of sixth-grade mathematical achievement, does not automatically validate it for use with other purposes, such as placement and advancement decisions ( Kane, 2016 ). The validation of a survey may also be time sensitive, as cultures continually change. Using a survey from the 1980s about the use of technology would be employing a dated view of what is meant by “technology” today.

Types of Validity Evidence

Validation is a continuous and iterative process of collecting many different types of evidence to support that researchers are measuring what they aim to measure. The latest Standards for Educational and Psychological Testing describes many types of validity evidence to consider when validating an instrument for a particular purpose ( AERA, APA, and NCME, 2014 , chap. 1). These types of evidence and illustrative examples are summarized in Table 1 . For example, one important aspect to consider is whether the individual items that make up the survey are interpreted by the respondents in the way intended by the researcher. Researchers must also consider whether the individual items constitute a good representation of the construct and whether the items collectively represent all the important aspects of that construct. Looking at our illustrative example ( Box 1 and Table 2 ), we could ask whether items 15–23 (i.e., helping others, serving humanity, serving community, working with people, connection with others, attending to others, caring for others, intimacy, and spiritual rewards) in the goal-endorsement instrument constitute a good representation of the construct “helping others and one’s community.” Yet another type of validity evidence involves demonstrating that the scores obtained for a construct on an instrument of interest correlate to other measures of the same or closely related constructs.

a Many of the example considerations are in reference to the elements in the Diekman et al. (2010) instrument; we provide these only as motivating examples and encourage readers to apply the example within their own work.

b If and how to include consequences of testing as a measure of validity is highly debated in educational and psychological measurement (see Mehrens, 1997 ; Lissitz and Samuelsen, 2007 ; Borsboom et al. , 2004 ; Cizek, 2016 ; Kane, 2016 ). We chose to present the view of validity as described in the latest Standards for Educational and Psychological Testing ( AERA, APA, and NCME, 2014 ).

BOX 1. How to describe the purpose (abbreviated), instrument, and sample for publication illustrated with the goal-­endorsement example

Defining the construct and intended use of the instrument.

The aim of this study was to analyze the internal structure of the goal-endorsement instrument described by Diekman et al. (2010) for use with incoming first-year university STEM students. The objective is to use the knowledge gained through the survey to design STEM curricula that might leverage the goals students perceive as most important to increase student interest in their STEM classes.

The theoretical framework leading to the development of this survey has a long and well-established history. Bakan (1966) originally proposed that two orientations could be used to characterize the human experience: agentic (orientation to the self) and communal (orientation to others). Agentic goals can thus be seen as goals focusing on improving the self or one’s own circumstances. Communal goals are goals focusing on helping others and one’s community and being part of a community. Gender socialization theory contributed to our understanding of who holds these goals most strongly: women are socialized to desire and assume more communal roles, while men assume more agentic roles ( Eagly et al. , 2000 ; Prentice and Carranza, 2002 ; Su et al. , 2009 ).

This framework and survey were first used in the context of STEM education by Diekman et al. (2010) . They found these two goal orientations to be predictive of women’s attrition from STEM, particularly when they perceive STEM careers to be at odds with the communal goals important to them. Current research in this area has expanded beyond the focus on gender differences and has recognized that all humans value communal goals to some degree and that there is also variation in importance placed on communal goals by racial and ethnic groups ( Smith et al. , 2014 ), social class (Stephens et al. , 2012), and college generation status ( Allen et al. , 2015 ). The majority of this work has been done with the general population of undergraduates. Our proposed use of the survey is to explore the variation in goals among different groups in a STEM-exclusive sample.

The instrument

The goal-endorsement survey described by Diekman et al. (2010) aims to measure how others-focused (communal) versus self-focused (agentic) students are. The instrument asks students to rate “how important each of the following kinds of goals [is] to you personally” on a scale of 1 (not at all important) to 7 (very important). The original measurement instrument has 23 items that have been reported as two factors: agentic (14 items) and communal (nine items) goals (see Table 2 for a listing of the items). The survey has been used many times in different contexts and has been shown to be predictive in ways hypothesized by theory. Diekman et al. (2010) briefly report on an EFA supporting the proposed two-factor structure of the instrument with a sample of undergraduates from introductory psychology courses.

Data collection and participants

The questionnaire was distributed in Fall 2015 and 2016 to entering first-year undergraduate students in STEM fields (biology, biochemistry, physics, chemistry, math, and computer science) at a large southern U.S. R1 university. Students took the questionnaire in the weeks before their first Fall semester. In total, 796 students (70% women) completed the questionnaire. Fifteen percent of the students were first-generation students, and 24% came from underrepresented minorities.

Sample size

In our study, the total sample size was 796 students. Considering the number of factors (two) and the relatively large number of items per factor (nine and 14), the sample size was deemed more than sufficient to perform factor analysis ( Gagne and Hancock, 2006 ; Wolf et al. , 2013 ).

a Items 1–14 originally represented the agentic scale, and items 15–23 represented the communal scale. Standardized pattern coefficients from the initial EFA for the three-, four-, and five-factor solutions are reported in columns 3–14. For clarity, pattern coefficients <0.2 are not shown.

Using an existing survey usually requires collecting less validity evidence than creating and using a new survey. Specifically, if previous studies collected validity evidence for the use of the survey for a similar purpose and with a similar population as the intended research, researchers can then reference that validity evidence and present less of their own. It is important to note that, even if a survey has a long history of established use, this alone does not provide adequate validity evidence for it being an appropriate measurement instrument. It is worth researchers’ time to go through the published uses of the survey and identify all the different types of validity evidence that have been collected. They can then identify the additional evidence they want to collect to feel confident applying the instrument for their intended interpretation and use. For a more detailed description of different types of validity evidence and a pedagogical description of the process of instrument validation, see Reeves and Marbach-Ad (2016) and Andrews et al. (2017) .

In this article, we will focus on the third type of validity evidence listed in Table 1 , evidence based on internal structure. Investigating the internal structure of an instrument is crucial in order to be confident that you can combine several related items to represent a specific construct. We will describe an empirical tool to gather information about the internal relationships between items in a measurement instrument: factor analysis. On its own, factor analysis is not sufficient to establish the validity of the use of an instrument in a researcher’s context and for their purpose. However, when factor analysis is combined with other validity evidence, it can increase a researcher’s confidence that they are invoking the theoretical frameworks used in the development of the instrument: that is, the researcher is correctly interpreting the results as representing the construct the instrument purports to measure.

INSTRUMENT SCOPE: ONE OR SEVERAL CONSTRUCTS?

As described in Measuring Variables That Are Not Directly Observable , a construct cannot be directly measured. Instead, different aspects of a construct are represented by different individual items. The foundational assumption in instrument development is that the construct is what drives respondents to answer similarly on all these items. Thus, it is reasonable to distill the responses on all these items into one single score and make inferences about the construct. Measurement instruments can be used to measure a single construct, several distinct constructs, or even make finer distinctions within a construct. The number of intended constructs or aspects of a construct to be measured is referred to as an instrument’s dimensionality.

Unidimensional Scales

An instrument that aims to measure one underlying construct is a unidimensional scale. To interpret a set of items as if they measure the same construct, one must have both theoretical and empirical evidence that the items function as intended; that they do, indeed, represent a single construct. If a researcher takes a single value (such as the mean) to represent a set of responses to a group of items that are unrelated to one another theoretically (e.g., I like biology, I enjoy doing dissection, I know how to write a biology lab report), the resulting value would be difficult to interpret at best, if not meaningless. While all of these items are related to biology, they do not represent a specific, common construct. Obviously, taking the mean response from these three items as a measure of interest in biology would be highly problematic. For example, one could be interested in biology but dislike dissection, and one’s laboratory writing skills are likely influenced by aspects other than interest in biology. Even when a set of items theoretically seem to measure the same construct, the researcher must empirically show that students demonstrate a coherent response pattern over the set of items to validate their use to measure the construct. If students do not demonstrate a coherent response, this indicates that the items are not functioning as intended and they may not all measure the same construct. Thus, the single value used to represent the construct from that group of items would contain very little information about the intended construct.

Multidimensional Scales

An instrument that is constructed to measure several related constructs or several different aspects of a construct is called a multidimensional scale. For example, the Diekman et al. (2010) goal-endorsement instrument (see items in Box 1 and Table 2 ) we use in this article is a multidimensional scale: it theoretically aims to measure two different aspects of student goal endorsement. To be able to separate the results into two subscales, one must test that the items measure distinctly different constructs. It is important to note that whether a set of items represents different constructs can differ depending on the intended populations, which is why collecting evidence on the researcher’s own population is so critical. Wigfield and Eccles (1992) illustrate this concept in a study of children of different ages. Children in early or middle elementary school did not seem to distinguish between their perceptions of interest, importance, and usefulness of mathematics, while older children did appear to differentiate between these constructs. Thus, while it is meaningful to discuss interest, importance, and usefulness as distinct constructs for older children, it is not meaningful to do so with younger children.

In summary, before using a survey, one has to have gathered all the appropriate validity evidence for the proposed interpretations and use. When measuring a construct, one important step in this validation procedure is to explicitly describe and empirically analyze the assumed dimensionality of the survey.

THE MISUSE OF COEFFICIENT ALPHA: UNDERSTANDING THE DIFFERENCE BETWEEN RELIABILITY AND VALIDITY

In many biology educational research papers, researchers only provide coefficient alpha (also called Cronbach’s alpha) as evidence of validity. For example, in Eddy et al. (2015) , the researchers describe the alpha of two scales on a survey but provide no other evidence of validity or dimensionality. This usage is widely agreed to be a misuse of coefficient alpha ( Green and Yang, 2009 ). To understand why this is the case, we have to understand how validity and reliability differ and what specifically coefficient alpha measures.

Reliability is about consistency when a testing procedure is repeated ( AERA, APA, and NCME, 2014 ). For example, assuming that students do not change their goal endorsement, do repeated measurements of students’ goal endorsement using Diekman’s goal-endorsement instrument give consistent results? Theoretically, reliability can be defined as the ratio between the true variance in the construct among the participating respondents (the latent, unobserved variance the researcher aims to interpret) and the observed variance as measured by the measurement instrument ( Crocker and Algina, 2008 ). The observed variance for an item is a combination of the true variance and measurement error. Measurement error is the extent to which responses are affected by factors other than the construct of interest ( Fowler, 2014 ). For example, ideally, students’ responses to Diekman’s goal-endorsement instrument would only be affected by their actual goal endorsement. But students’ responses may also be affected by things unrelated to the construct of goal endorsement. For instance, responses on communal goals items may be influenced by social desirability, students’ desire to answer in a way that they think others would want them to. Students’ responses on items may also depend on external circumstances while they were completing the survey, such as the time of day or the noise level in their environment when they were taking the survey. While it is impossible to avoid measurement error completely, minimizing measurement error increases the ratio between the true and the observed variance, which increases the likelihood that the instrument will yield similar results over repeated use.

Unfortunately, a construct cannot, by definition, be directly measured; the true score variance is unknown. Thus, reliability itself cannot be directly measured and must be estimated. One way to estimate reliability is to distribute the instrument to the same group of students multiple times and analyze how similar the responses of the same students are over time. Often it is not desirable or practically feasible to distribute the same instrument multiple times. Coefficient alpha provides a means to estimate reliability for an instrument based on a single distribution. 3 Simply put, coefficient alpha is the correlation of an instrument to itself ( Tavakol and Dennick, 2011 ). Calculation of coefficient alpha is based on the assumption that all items in a scale measure the same construct. If the average correlation among items on a scale is high, then the scale is said to be reliable.
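To make this concrete, here is a minimal sketch of estimating coefficient alpha from a single administration in R, using the alpha() function from the psych package (the package used elsewhere in our analyses). The data frame goals and its simulated responses are purely illustrative and are not part of the goal-endorsement data.

library(psych)

set.seed(1)
# Simulate 200 respondents answering five 7-point items that all reflect one
# underlying construct plus random measurement error.
latent <- rnorm(200)
goals <- data.frame(sapply(1:5, function(i)
  round(pmin(pmax(4 + latent + rnorm(200), 1), 7))))

# alpha() estimates reliability from this single administration and also
# reports how alpha would change if each item were dropped.
results <- alpha(goals)
results$total$raw_alpha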

The use and misuse of coefficient alpha as an estimate of reliability has been extensively discussed by researchers (e.g., Green and Yang, 2009 ; Sijtsma, 2009 ; Raykov and Marcoulides, 2017 ; McNeish, 2018 ). It is outside the scope of this article to fully explain and take a stand among these arguments. Although coefficient alpha may be a good estimator of reliability under certain circumstances, it has limitations. We will further elaborate on two limitations that are most pertinent within the context of instrument validation.

Limitation 1: Coefficient Alpha Is about Reliability, Not Validity

A high coefficient alpha does not prove that researchers are measuring what they intended to measure, only that they measured the same thing consistently. In other words, coefficient alpha is an estimation of reliability. Reliability and validity complement each other: for valid interpretations to be made using an instrument, the reliability of that instrument must be high. However, if the test is invalid, then reliability does not matter. Thus, high reliability is necessary, but not sufficient, to make valid interpretations from scores resulting from instrument administration. Consider this analogy using observable phenomena: a calibrated scale might produce consistent values for the weight of a student and thus the measure is reliable, but using this score to make interpretations about the student’s height would be completely invalid. Similarly, a survey’s coefficient alpha could be high, but the survey instrument could still not be measuring what the researcher intended it to measure.

Limitation 2: Coefficient Alpha Does Not Provide Evidence of Dimensionality of the Scale

Coefficient alpha does not provide evidence for whether the instrument measures one or several underlying constructs ( Schmitt, 1996 ; Sijtsma, 2009 ; Yang and Green, 2011 ). Schmitt (1996) provides two illustrative examples of why a high coefficient alpha should not be taken as a proof of a unidimensional instrument. He shows that a six-item instrument, in which all items have equal correlations to one another (unidimensional instrument), could yield the same coefficient alpha as a six-item instrument with item correlations clearly showing a two-dimensional pattern (i.e., an instrument with item correlations of 0.5 across all items has the same coefficient alpha as an instrument with item correlations of 0.8 between some items and item correlations of 0.3 between other items). Thus, as Yang and Green (2011) conclude, “A scale can be unidimensional and have a low or a high coefficient alpha; a scale can be multidimensional and have a low or a high coefficient alpha” (p. 380).
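Schmitt’s point is easy to verify numerically. The short R sketch below, constructed for illustration rather than taken from Schmitt (1996), computes the standardized alpha implied by the two correlation patterns described above.

# Standardized alpha depends only on the number of items (k) and the average
# inter-item correlation (rbar): alpha = k * rbar / (1 + (k - 1) * rbar).
std_alpha <- function(R) {
  k <- ncol(R)
  rbar <- mean(R[lower.tri(R)])
  k * rbar / (1 + (k - 1) * rbar)
}

# Unidimensional pattern: all six items intercorrelate at 0.5.
R1 <- matrix(0.5, 6, 6)
diag(R1) <- 1

# Two-dimensional pattern: 0.8 within each three-item cluster, 0.3 between.
R2 <- matrix(0.3, 6, 6)
R2[1:3, 1:3] <- 0.8
R2[4:6, 4:6] <- 0.8
diag(R2) <- 1

std_alpha(R1)  # 0.857
std_alpha(R2)  # 0.857 -- the same alpha despite a clearly two-factor pattern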

In conclusion, reporting only coefficient alpha is not sufficient evidence 1) to make valid interpretations of the scores from an instrument or 2) to prove that a set of items measure only one underlying construct (unidimensionality). We encourage readers interested in learning more about reliability to read chapters 7–9 in Bandalos (2018) . In the following section, we describe another statistical tool, factor analysis, which actually tests the dimensionality among a set of items.

FACTOR ANALYSIS: EVIDENCE OF DIMENSIONALITY AMONG A SET OF ITEMS

Factor analysis is a statistical technique that analyzes the relationships between a set of survey items to determine whether the participant’s responses on different subsets of items relate more closely to one another than to other subsets, that is, it is an analysis of the dimensionality among the items ( Raykov and Marcoulides, 2008 ; Leandre et al. , 2012; Tabachnick and Fidell, 2013 ; Kline, 2016 ; Bandalos, 2018 ). This technique was explicitly developed to better elucidate the dimensionality underpinning sets of achievement test items ( Mulaik, 1987 ). Speaking in terms of constructs, factor analysis can be used to analyze whether it is likely that a certain set of items together measure a predefined construct (collecting validity evidence relating to internal structure; Table 1 ). Factor analysis can broadly be divided into exploratory factor analysis (EFA) and confirmatory factor analysis (CFA).

Exploratory Factor Analysis

EFA can be used to explore patterns underlying a data set. As such, EFA can elucidate how different items and constructs relate to one another and help develop new theories. EFA is suitable during early stages of instrument development. By using EFA, the researcher can identify items that do not empirically belong to the intended construct and that should be removed from the survey. Further, EFA can be used to explore the dimensionality of the instrument. Sometimes EFA is conflated with principal component analysis (PCA; Leandre et al. , 2012). PCA and EFA differ from each other in several fundamental ways. EFA is a statistical technique that should be used to identify plausible underlying constructs for a set of items. In EFA, the variance the items share is assumed to represent the construct and the nonshared variance is assumed to represent measurement errors. PCA is a data reduction technique that does not assume an underlying construct. PCA reduces a number of observed variables to a smaller number of components that account for the most variance in the observed variables. In PCA, all variance is considered, that is, it assumes no measurement errors. Within educational research, PCA may be useful when measuring multiple observable variables, for example, when creating an index from a checklist of different behaviors. For readers interested in reading more about the distinction between EFA and PCA and why EFA is the most suitable for exploring constructs, see Leandre et al. (2012) or Raykov and Marcoulides (2008) .

Confirmatory Factor Analysis

CFA is used to confirm a previously stated theoretical model. Essentially, when using CFA, the researcher is testing whether the data collected supports a hypothesized model. CFA is suitable when the theoretical constructs are well understood and clearly articulated and the validity evidence on the internal structure of the scale (the relationship between the items) has already been obtained in similar contexts. The researcher can then specify the relationship between the item and the construct and use CFA to confirm the hypothesized number of constructs, the relationship between the constructs, and the relationship between the constructs and the items. CFA may be appropriate when a researcher is using a preexisting survey that has an established structure with a similar population of respondents.

A Brief Technical Description of Factor Analysis

Mathematically, factor analysis involves the analysis of the variances and covariances among the items. The shared variance among items is assumed to represent the construct. In factor analysis, the constructs (the shared variances) are commonly referred to as factors. Nonshared variance is considered error variance. During an EFA, the covariances among all items are analyzed together, and items sharing a substantial amount of variance are collapsed into a factor. During a CFA the shared variance among items that are prespecified to measure the same underlying construct is extracted. Figure 1 illustrates EFA and CFA on an instrument consisting of eight observable variables (items) aiming to measure two constructs (factors): F1 and F2. In EFA, no a priori assumption of which items represent which factors is necessary: the EFA determines these relationships. In CFA, the shared variance of items 1–4 are specified by the researcher to represent F1, and the shared variance of items 5–8 are specified to represent F2. Even further, part of what CFA tests is that items 1–4 do not represent F2, and items 5–8 do not represent F1. For both EFA and CFA, nonshared variance is considered error variance.

FIGURE 1. Conceptual illustration of EFA and CFA. Observed variables (items 1–8) are represented by squares, and constructs (factors F1 and F2) are represented by ovals. Factor loadings/pattern coefficients representing the effect of the factor on the item (i.e., the unique correlation between the factor and the item) are represented by arrows. σj, variance for factor j; Ei, unique error variance for item i. The factor loading for one item on each factor is set to 1 to give the factors an interpretable scale.

Figures illustrating the relationships between items and factors (such as Figure 1 ) are interpreted as follows. The double-headed arrow between the factors represents the correlation between the two factors (factor correlations). Each one-directional arrow between the factors and the items represents the unique correlation between the factor and the item (called “pattern coefficient” in EFA and “factor loading” in CFA). The pattern coefficients and factor loadings are similar to regression coefficients in a multiple regression. For example, consider the self-promotion item on Diekman’s goal-endorsement instrument. The factor loading/pattern coefficient for this item tells the researcher how much of the average respondent’s answer on this item is due to his or her general interest in agentic goals versus something unique about that item (error variance). For readers interested in more mathematical details about factor analysis, we recommend Kline (2016) , Tabachnick and Fidell (2013) , or Yong and Pearce (2013) .
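For readers who find code easier to parse than path diagrams, the model in Figure 1 can be written in lavaan syntax. The sketch below is illustrative only: the item names item1 through item8 are placeholders, and simulateData() generates toy data so the example runs end to end.

library(lavaan)

# Toy data consistent with a two-factor population model.
set.seed(1)
pop_model <- '
  F1 =~ 0.7*item1 + 0.7*item2 + 0.7*item3 + 0.7*item4
  F2 =~ 0.7*item5 + 0.7*item6 + 0.7*item7 + 0.7*item8
  F1 ~~ 0.4*F2
'
toy_data <- simulateData(pop_model, sample.nobs = 400)

# The CFA in Figure 1: each item loads on exactly one factor; cfa() lets the
# factors correlate and fixes the first loading on each factor to 1 by
# default, giving the factors an interpretable scale, as in the figure.
cfa_model <- '
  F1 =~ item1 + item2 + item3 + item4
  F2 =~ item5 + item6 + item7 + item8
'
fit <- cfa(cfa_model, data = toy_data)
summary(fit, standardized = TRUE)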

Should EFA and CFA Be Applied on the Same Sample?

If a researcher decides that EFA is the best approach for analyzing the data, the results from the EFA should ideally be confirmed with a CFA before using the measurement instrument for research. This confirmation should never be conducted on the same sample as the initial EFA. Doing so does not provide generalizable information, as the CFA will be (essentially) repeating many of the relationships that were established through the EFA. Additionally, there could be something nuanced about the way the particular sample responds to items that might not be found in a second sample. For these reasons (among others), it is best practice to perform an EFA and CFA on independent samples. If a researcher has a large enough sample size, this can be done by randomly dividing the initial sample into two independent groups. It is also not uncommon for a researcher using an existing survey to decide that a CFA is suitable to start with but then discover that the data do not fit to the theoretical model specified. In this case, it is completely justified and recommended to conduct a second round of analyses starting with an EFA on half of the initial sample followed by a CFA on the other half of the sample ( Bandalos and Finney, 2010 ).
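In practice, the split itself takes only a few lines of R. In the sketch below, survey_data is a hypothetical data frame holding one row of item responses per respondent.

set.seed(2016)
n <- nrow(survey_data)                 # survey_data: hypothetical responses
efa_index <- sample(n, floor(n / 2))
efa_half <- survey_data[efa_index, ]   # used to explore the structure (EFA)
cfa_half <- survey_data[-efa_index, ]  # held out to confirm the model (CFA)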

FACTOR ANALYSIS STEP BY STEP

In this section, we 1) describe the important considerations when preparing to perform a factor analysis, 2) introduce the essential analytical decisions made during an analysis, and 3) discuss how to interpret the outputs from factor analyses. We illustrate each step with real data using factor analysis to analyze the dimensionality of a goal-endorsement instrument ( Diekman et al. , 2010 ). Further, annotated code and output for running and analyzing EFA and CFA in R are provided as Supplemental Material (R syntax and Section 2) along with sample data.

Before delving into the technical details, we would like to be clear that conducting a factor analysis involves many decisions. There are no golden rules to follow to make these decisions. Instead, the researcher must make holistic judgments based on his or her specific context and available theoretical and empirical information. Factor analysis requires collecting evidence to build an argument to support a suggested instrument structure. The more time a researcher spends with the data investigating the effect of different possible decisions, the more confident they will be in finalizing the survey structure. As always, it is critical that a researcher’s decisions are guided by theory and previously collected empirical evidence, not by a priori assumptions the researcher wishes to support.

Defining the Construct and Intended Use of the Instrument

An essential prerequisite when selecting (or developing) and analyzing an instrument is to explicitly define the intended purpose and use of the instrument. Further, the theoretical construct or constructs that one aims to measure should be clearly defined, and the current general understanding of the construct should be described. The next step is to confirm a good alignment between the construct of interest and the instrument selected to measure it, that is, that the items on the instrument actually represent what one aims to measure (evidence based on content; Table 1 ). For a researcher to be able to use CFA for validation, an instrument must include at least four items in total. A multidimensional scale should have at least three but preferably five or more items for each theorized subscale. In very special cases, two items can be acceptable for a subscale ( Yong and Pearce, 2013 ; Kline, 2016 ). 4 For an abbreviated example of how to write up this type of validity for a manuscript using a survey instrument, see Box 1 .

Sample Size

The appropriate sample size needed for factor analysis is a multifaceted question. Larger sample sizes are generally better, as they will enhance the accuracy of all estimates and increase statistical power ( Gagne and Hancock, 2006 ). Early guidelines on sample sizes for factor analysis were general in their nature, such as a minimum sample size of 100 or 200 (e.g., see Boomsma, 1982 ; Gorsuch, 1983 ; Comrey and Lee, 1992 ). Although it is very tempting to adopt such general guidelines, caution must be taken, as they might lead to underestimating or overestimating the sample size needed ( Worthington and Whittaker, 2006 ; Tabachnick and Fidell, 2013 ; Wolf et al. , 2013 ).

The sample size needed depends on many elements, including number of factors, number of items per factor, size of factor loadings or pattern coefficients, correlations between factors, missing values in the data, reliability of the measurements, and the expected effect size of the parameters of interest ( Gagne and Hancock, 2006 ; Worthington and Whittaker, 2006 ; Wolf et al. , 2013 ). Wolf et al. (2013) showed that a sufficient sample size for a one-factor CFA with eight items and factor loadings of 0.8 could be as low as 30 respondents. For a two-factor CFA with three or four items per scale and factor loadings of 0.5, a sample size of ∼450 respondents is needed. For EFA, Leandre et al. (2012) recommend that under “moderately good” conditions (communalities 5 of 0.40–0.70 and at least three items for each factor), a sample of at least 200 should be sufficient, while under poor conditions (communalities lower than 0.40 and some factors with only two items for each factor), a sample size of at least 400 is needed. Thus, when deciding on an appropriate sample size, one should consider the unique properties of the actual survey. The articles written by Wolf et al. (2013) and Gagne and Hancock (2006) provide a good starting point for such considerations. See Box 1 for an example of how to discuss sample size decisions in a manuscript.

In some cases, it may be implausible to have the large sample sizes necessary to obtain stable estimates from an EFA or a CFA. Often researchers must work with data that have already been collected or are using a study design that simply does not include a large number of respondents. In these circumstances, it is strongly recommended that one use a measurement instrument that has already been validated for use in a similar population for a similar purpose. In addition to considering and analyzing other relevant types of validity evidence (see Table 1 ), the researchers should report on validity evidence based on internal structure from other studies and describe the context of those studies relative to their own context. The researchers should also acknowledge in the methods and limitation sections that they could not run dimensionality checks on their sample. Further, researchers can also analyze a correlation matrix 6 of the responses to the survey items from their own data collection to get a sense of how the items may relate to one another in their context. This correlation matrix may be reported to help provide preliminary validity evidence based on internal structure.
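As a sketch, such a matrix can be obtained with the base R cor() function; survey_data is again a hypothetical data frame of item responses, and pairwise deletion retains respondents with some missing answers.

# Inspect whether items intended to measure the same construct correlate more
# strongly with one another than with items from other intended constructs.
round(cor(survey_data, use = "pairwise.complete.obs"), 2)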

Properties of the Data

As with any statistical analysis, before performing a factor analysis the researcher must investigate whether the data meet the assumptions for the proposed analysis. Section 1 of the Supplemental Material provides a summary of what a researcher should check for in the data for the purposes of meeting the assumptions of a factor analysis and an illustration applied to the example data. These include analyses of missing values, outliers, factorability, normality, linearity, and multicollinearity. Box 3 provides an example of how to report these analyses in a manuscript.

Analytic Considerations for CFA

Once the data are screened to determine their properties, several analytical decisions must be made. Because there are some differences in analytical decisions and outputs for EFA and CFA, we will discuss EFA and CFA in separate sections. We will start with CFA, as most researchers adopting an existing instrument will use this method first and may not ever need to perform an EFA. See Box 2 for how to report analytical considerations for a CFA in a manuscript.

BOX 2. What to report in the methods of a publication for a CFA using the goal-endorsement example

We chose to start with a CFA to confirm a two-factor solution, because 1) the theoretical framework underlying the instrument is well understood and articulated and 2) Diekman et al. (2010) performed an EFA on a similar population to ours that supported the two-factor solution. If the assumed factor model was confirmed, then we could confidently combine the items into two sum scores and interpret the data as representing both an agentic and a communal factor. CFA was run using the R package lavaan ( Rosseel, 2012 ).

Selecting an estimator

Given the ordinal and nonnormal nature of the data, robust maximum-likelihood estimation (MLR) was used to extract the variances from the data. Full-information maximum likelihood was used in the estimation procedure to handle missing data.

Specifying a two-factor CFA

To confirm the factor structure proposed by Diekman et al. (2010) , we specified a two-factor CFA, with items 1–14 representing the agentic scale and items 15–23 representing the communal factor ( Table 2 ). Correlation between the two factors was allowed. For identification purposes, the factor loading for one item on each factor was set to 1. The number of variances and covariances in the data was 276 (23(23 + 1)/2), which was larger than the number of parameter estimates (one factor correlation, 23 error terms, 21 factor loadings, and variances for each factor). Thus, the model was overidentified.

Selecting model fit indices and setting cutoff values

Multiple fit indices (chi-square value from robust MLR [MLR χ2]; comparative fit index [CFI]; the root-mean-square error of approximation [RMSEA]; and the standardized root-mean-square residual [SRMR]) were consulted to evaluate model fit. The fit indices were chosen to represent an absolute, a parsimony-adjusted, and an incremental fit index. Consistent with the recommendations by Hu and Bentler (1999) , the following criteria were used to evaluate the adequacy of the models: CFI > 0.95, SRMR < 0.08, and RMSEA < 0.06. Coefficient alpha was computed based on the model results and used to assess reliability. Values > 0.70 were considered acceptable.

Selecting an Estimator.

When performing a CFA, a researcher must choose a statistical method for extracting the variance from the data. There are several different methods available, including unweighted least squares, generalized least squares, maximum likelihood, robust maximum likelihood, principal axis factoring, alpha factoring, and image factoring. Each of these methods has its strengths and weaknesses. Kline (2016) and Tabachnick and Fidell (2013) provide a useful discussion of several of these methods and when best to apply each one. In general, because data from surveys are often on an ordinal level (e.g., data from Likert scales) and sometimes slightly nonnormally distributed, estimators robust against nonnormality, such as maximum-likelihood estimation with robust standard errors (MLR) or weighted least-squares estimation (WLS), are often suitable for performing CFA. Whether MLR or WLS is most suitable depends partly on the number of response options for the survey items. MLR works best when data can be considered continuous. In most cases, scales with seven response options work well for this purpose, whereas scales with five response options are questionably continuous. MLR is still often used in estimation for five response options, but with four or fewer response options, WLS is better ( Finney and DiStefano, 2006 ). The decision regarding the number of response options to include in a survey should not be driven by these considerations. Rather, the number of response options and properties of the data should drive the selection of the CFA estimator. Although more response options for an item allow researchers to model it as continuous, respondents may not be able to meaningfully differentiate between the different response options. Fewer response options usually offer less ambiguity but result in less variation in the response. For example, if students are provided with 10 options to indicate their level of agreement with a given item, it is possible that not all of the response options may be used. In such a case, fewer response options may better capture the latent distribution of possible responses to an item.

Specifying the Model.

The purpose of a CFA is to test whether the data collected with an instrument support the hypothesized model. Using theory and previous validations of the instrument, the researcher specifies how the different items and factors relate to one another (see Figure 1 for an example model). For a CFA, the number of parameters that the researcher aims to estimate (e.g., error terms, variances, correlations and factor loadings) must be less than or equal to the number of possible variances and covariances among the items ( Kline, 2016 ). For a CFA, a simple equation tells you the number of possible variances and covariances: p ( p + 1)/2, where p = number of items. If the number of parameters to estimate is more than the number of possible variances and covariances among the items, the CFA is called “underidentified” and will not provide interpretable results. When the number of parameters to be estimated equals the number of covariances and variances among the items, the model is deemed “just identified” and will result in perfect fit of the data to the model, regardless of the true relationship between the items. To test whether the data fit the theoretical model, the number of parameters that are being estimated needs to be less than the number of variances and covariances observed in the data. In this case, the model is “overidentified.” For the example CFA in Figure 1 , the number of possible variances and covariances is 8(8 + 1)/2 = 36, and the number of parameters to estimate is 17 (one factor correlation, eight error terms, six factor loadings, and variances for each of the two factors 7 ), thus the model is overidentified.
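The counting rule is simple enough to check in a few lines of R; this sketch reproduces the arithmetic for the Figure 1 model.

p <- 8            # number of observed items
p * (p + 1) / 2   # available variances and covariances: 36
1 + 8 + 6 + 2     # parameters to estimate: one factor correlation, eight
                  # error variances, six free loadings, two factor
                  # variances = 17
# 36 > 17, so the model is overidentified and its fit can be tested.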

Choosing Appropriate Model Fit Indices.

The true splendor of CFA is that so-called model fit indices have been developed to help researchers understand whether the data support the hypothesized theoretical model. 8 The closest statistic to an omnibus test of model fit is the model chi-square test. The null hypothesis for the chi-square test is that there is no difference between the hypothesized model and the observed relationships within the data. Several researchers argue that this is an unrealistic hypothesis ( Hu and Bentler, 1999 ; Tabachnick and Fidell, 2013 ). A close approximation of the data to the model is more realistic than a perfect model fit. Further, the model chi-square test is very sensitive to sample size (the chi-square statistic tends to increase with an increase in sample size, all other considerations constant; Kline, 2016). Thus, while large sample sizes provide good statistical power, the null hypothesis that the factor model and the data do not differ from each other may be rejected even though the difference is actually quite small. Given these concerns, it is important to consider the result of the chi-square test in conjunction with multiple other model fit indices.

Many model fit indices have been developed that quantify the degree of fit between the model and the data. That is, the values provided by these indices are not intended to make binary (fit vs. no fit) judgments about model fit. These model fit indices can be divided into absolute, parsimony-adjusted, and incremental fit indices ( Bandalos and Finney, 2010 ). Because each type of index has its strengths and weaknesses (e.g., sensitivity to sample size, model complexity, or misspecified factor correlations), using at least two different types of fit indices is recommended ( Hu and Bentler, 1999 ; Tabachnick and Fidell, 2013 ). The researcher should decide a priori which model fit indices to use and the cutoff values that will be considered a good enough indicator of model fit to the data. Hu and Bentler (1999) recommend using one of the relative fit indices such as the comparative fit index (CFI) with a cutoff of >0.95 in combination with standardized root-mean-square residual (SRMR; absolute fit indices, good model < 0.08) or root-mean-square error of approximation (RMSEA; parsimony-adjusted fit indices, good model < 0.06) as indicators for good fit. Some researchers, including Hu and Bentler (1999) , caution against using these cutoff values as golden rules because doing so might lead to incorrect rejection of acceptable models ( Marsh et al. , 2004 ; Perry et al. , 2015 ).
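Taken together, the analytical decisions discussed in this section amount to only a few lines of lavaan code. The sketch below mirrors the choices reported in Box 2; the column names a1-a14 and c1-c9 are placeholders for the 23 goal items, and goal_data is a hypothetical data frame (the actual analysis scripts are in the Supplemental Material).

library(lavaan)

model <- '
  agentic  =~ a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8 + a9 + a10 +
              a11 + a12 + a13 + a14
  communal =~ c1 + c2 + c3 + c4 + c5 + c6 + c7 + c8 + c9
'
fit <- cfa(model,
           data = goal_data,    # hypothetical data frame of item responses
           estimator = "MLR",   # robust maximum likelihood
           missing = "fiml")    # full-information maximum likelihood

# Report the fit indices chosen a priori (robust versions where available).
fitMeasures(fit, c("chisq.scaled", "df", "cfi.robust",
                   "rmsea.robust", "srmr"))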

Interpreting the Outputs from CFA

After making all the suggested analytical decisions, a researcher is now ready to apply a CFA to the data. Model fit indices that the researcher a priori decided to use are the first element of the output that should be interpreted from a CFA. If these indices suggest that the data do not fit the specified model, then the researcher does not have empirical support for using the hypothesized survey structure. This is exactly what happened when we initially ran a CFA on Diekman’s goal-endorsement instrument example (see Box 3 ). In this case, focus should shift to understanding the source of the model misfit. For example, one should ask whether there are any items that do not seem to correlate with their specified latent factor, whether any correlations seem to be missing, or whether some items on a factor group together more strongly than other items on that same factor. These questions can be answered by analyzing factor loadings, correlation residuals, and modification indices. In the following sections, we describe these in more detail. See Boxes 3 , 6 , and 7 for examples of how to discuss and present output from a CFA in a paper.

BOX 3. How to interpret and report CFA output for publication using the goal-endorsement example, initial CFA

Descriptive statistics.

No items were missing more than 1.3% of their values, and this missingness was random (Little’s MCAR test: chi-square = 677.719, df = 625, p = 0.075 implemented with the BaylorEdPsych package; Beaujean, 2012 ). Mean values for the items ranged from 4.1 to 6.3. Most items had a skewness and kurtosis below |1.0|, and all items had a skewness below |2.0| and kurtosis below |4.0|. Mardia’s multivariate normality test (implemented with the psych package; Revelle, 2017 ) showed significant multivariate skewness and kurtosis values. Intra-subscale correlations ranged from 0.02 to 0.73, and the lowest tolerance value was 0.36.

Interpreting output from the initial two-factor CFA

Results from the initial two-factor CFA indicated that, in our population, the data did not support the model specified. The chi-square test of model fit was significant (χ2 = 1549, df = 229, p < 0.001), but this test is known to be sensitive to minor model misspecification with large sample sizes ( n = 796). However, additional model fit indices also indicated that the data did not support the model specified. SRMR was 0.079, suggesting good fit, but CFI was 0.818, and RMSEA was 0.084. Thus, the hypothesized model was not empirically supported by the data.

To better understand this model misspecification, we explored the factor loadings, correlational residuals, original interitem correlation matrix, and modification indices. Several factor loadings were well below 0.7, indicating that the factors did not explain these items well. Analysis of correlational residuals did not point out any special item-pair correlation as especially problematic; rather, several correlational residuals were greater than |0.10|. Consequently, the poor model fit did not seem to be primarily caused by a few ill-fitting items. A reinvestigation of the interitem correlation matrix made when analyzing the factorability of the data (see the Supplemental Material, Section 1) suggested the presence of more than two factors. This was most pronounced for the agentic scale, for which some items had a relatively high correlation to one another and lower correlations to other items in that scale. Inspection of the modification indices suggested adding correlations between, for example, the items achievement and mastery. Together, these patterns indicate that the data might be better represented by more than two factors.

Factor Loadings.

As mentioned in Brief Technical Description of Factor Analysis , factor loadings represent how much of the respondent’s response to an item is due to the factor. When a construct is measured using a set of items, the assumption is that each item measures a slightly different aspect of the construct and that the common variance among them is the best possible representation of the construct. High, but not too high, factor loadings for these items are preferred. If several items have high standardized factor loadings 9 (e.g., above 0.9), this suggests that they share a lot of variance, which indicates that these items may be too similar and thus do not contribute unique information ( Clark and Watson, 1995 ). On the other hand, if an item has a low factor loading on its focal factor, it means that the item shares little or no variance with the other items that theoretically belong to the same focal factor and thus its contribution to the factor is low. Including items with low factor loadings when combining the scores from several items into a single score (sum, average, or common variance) will introduce bias into the results. 10 There is, however, no clear rule for when an item has a factor loading that is too low to be included. Bandalos and Finney (2010) argue that, because the items are specifically chosen to indicate a factor, one would hope that the variability explained in the item by the factor would be high (at least 50%). Squaring the standardized factor loadings provides the amount of variability explained in the item by the factor (R²), indicating that it is desirable to have standardized factor loadings of at least 0.7 (R² = 0.7² ≈ 50%). However, the acceptable strength of the factor loading depends on the theoretically assumed relationship between the item and the factor. Some items might be more theoretically distant from the factor and therefore have lower factor loadings, but still comprise an essential part of the factor. This reinforces the idea that there are no hard and fast rules in factor analysis. Even if an item does not reach the suggested level for factor loading, if a researcher can argue from a theoretical basis for its inclusion, then it could be included.
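With lavaan, the standardized loadings and their associated R² values can be pulled from a fitted model object (here called fit, as in the earlier sketches):

# One row per loading: lhs = factor, rhs = item, est.std = standardized
# loading; squaring gives the variance in the item explained by the factor.
std <- standardizedSolution(fit)
loadings <- std[std$op == "=~", c("lhs", "rhs", "est.std")]
loadings$R2 <- loadings$est.std^2
loadings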

Correlation Residuals.

As mentioned before, CFA is used to confirm a previously stated theoretical model. In CFA, the collected data are used to evaluate the accuracy of the proposed model by comparing the discrepancy between what the theoretical model implies (e.g., a two-factor model in the Diekman et al. [2010] example) and what is observed in the actual data. Correlation residuals represent the differences between the observed correlations in the data and the correlations implied by the CFA ( Bandalos and Finney, 2010 ). Local areas of misfit can be identified by inspecting correlational residuals. Correlation residuals greater than |0.10| are indicative of a specific item-pair relationship that is poorly reproduced by the model ( Kline, 2016 ). This guideline may be too low when working with small sample sizes and too large when working with large sample sizes and, as with all other fit indices, should only be used as one element among many to understand model fit.
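In lavaan, the correlation residuals for a fitted model object fit can be inspected directly; the |0.10| guideline is noted in a comment rather than applied as a hard rule.

# Differences between the observed correlations and those implied by the
# model; entries greater than |0.10| flag poorly reproduced item pairs.
resid_cor <- residuals(fit, type = "cor")$cov
round(resid_cor, 2)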

Modification Indices.

Most statistical software used for CFA provides modification indices that can easily be viewed by the user. Modification indices propose a series of possible additions to the model and estimate the amount the model’s chi-square value would decrease if the suggested parameter were added (recall that a lower chi-square value indicates better model fit). For example, if an item strongly correlates with two factors but is constrained to only correlate with one, the modification index associated with adding a relationship to the second factor would indicate how much the chi-square model fit is expected to improve with the addition of this factor loading. In short, modification indices can be used to better understand which items or relationships might be driving the poor model fit.
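In lavaan, the modification indices for a fitted model object fit can be listed in descending order; each mi value estimates the drop in the model chi-square if that parameter were freed.

# The ten largest modification indices. A large mi for, say, item1 ~~ item2
# suggests an unmodeled relationship between those two items.
head(modindices(fit, sort. = TRUE), 10)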

If (and only if) theoretically justified, a suggested relationship can be added or problematic items can be removed during a CFA. However, caution should be taken before adding or removing any parameters ( Bandalos and Finney, 2010 ). As Bandalos and Finney (2010) state, “Researchers must keep in mind that the purpose of conducting a CFA study is to gain a better understanding of the underlying structure of the variables, not to force models to fit” (p. 112). If post hoc changes to the model are made, the analysis becomes more exploratory in nature, and thus tenuous. The modified model should ideally be confirmed with a new data set to avoid championing a model that has an artificially good model fit.

Best practice if the model does not fit (as noted in Should EFA and CFA Be Applied on the Same Sample? ) is to split the data and conduct a second round of analyses starting with an EFA using half of the sample and then conducting a CFA with the other half ( Bandalos and Finney, 2010 ). To see an example of how to write up this secondary CFA analysis, see Boxes 6 and 7 of the goal-endorsement example.

When the Model Fit Is Good.

When model fit indices indicate that the hypothesized model is a plausible explanation of the relationships between the items in the data, factor loadings and the correlation between the latent variables in the model (so-called factor correlations) can be interpreted and a better understanding of the construct can be gained. It is also now appropriate to calculate and report the coefficient alpha, omega, or any other index of reliability for each of the subscales. The researcher can more confidently use the results from the instrument to make conclusions about the intended constructs based on combined scale scores (given that other relevant validity evidence presented in Table 1 also supports the intended interpretations). If a researcher has used CFA to examine the dimensionality of the items and finds that the scale functions as intended, this information should be noted in the methods section of the research manuscript when describing the measurement instruments used in the study. At the very least, the researcher should report the estimator and fit indices that were used and accompanying values for the fit indices. If the scale has been adapted in some way, or if it is being empirically examined for the first time, all of the factor loadings and factor correlations should also be reported so future researchers can compare their values with these original estimates. These could be reported as a standalone instrument validation paper or in the methods section of a study using that instrument.

Analytical Considerations for EFA

If a researcher’s data do not fit the model proposed in the CFA, then using the items as indicators of the hypothesized construct is not sensible. If the researcher wants to continue to use the existing items, it is prudent to investigate this misfit to better understand the relationships between the items. This calls for the use of an EFA, where the relationships between variables and factors are not predetermined (i.e., a model is not specified a priori) but are instead allowed to emerge from the data. As mentioned before, EFA could also be the first choice for a researcher if the instrument is in an early stage of development. We outline the steps for conducting an EFA in the following sections. See Box 4 for a description of how to describe analytical considerations for an EFA in the methods section.

BOX 4. What to report in the methods of a publication for an EFA using the goal-endorsement example

Because the results from the initial CFA indicated that the data did not support a two-factor solution, we proceeded with an EFA to explore the factor structure of the data. The original sample was randomly divided into two equal-sized parts, and EFA was performed on half of the sample ( n = 398) to determine the dimensionality of the goal-endorsement scale and detect possible problematic items. This was followed by a CFA ( n = 398) to confirm the result gained from the EFA. EFA and CFA were run using the R package lavaan ( Rosseel, 2012 ).

Selecting an estimator for the EFA

Considering the ordinal and nonnormal nature of the data, a principal axis factor estimator was used to extract the variances from the data. Only cases with complete responses to all items were used in the EFA.

Factor rotation

Because theory and the preceding CFA indicated that the different subscales are correlated, quartimin rotation (an oblique rotation) was chosen for the EFA.

Determining the number of factors

Visual inspection of the scree plot and parallel analysis (PA) based on eigenvalues from the principal components and factor analysis, in combination with theoretical considerations, were used to decide on the appropriate number of factors to retain. PA was implemented with the psych package ( Revelle, 2017 ).

Just as with CFA, the first step in an EFA is selecting a statistical method to use to extract the variances from the data. The considerations for the selection of this estimator are similar to those for CFA (see Selecting an Estimator ). One of the most commonly used methods for extracting variance when conducting an EFA on ordinal data with slight nonnormality is principal axis factoring (Leandre et al. , 2012). If the items in one’s instrument have fewer than five response options, WLS can be considered.

Factor Rotation.

Factor rotation is a technical step to make the final output from the model easier to interpret (see Bandalos, 2018 , pp. 327–334, for more details). The main decision for the researcher to make here is whether the rotation should be orthogonal or oblique ( Raykov and Marcoulides, 2008 ; Leandre et al. , 2012; Bandalos, 2018 ). Orthogonal means that the factors are uncorrelated to one another in the model. Oblique allows the factors to correlate to one another. In educational studies, factors are likely to correlate to one another; thus oblique rotation should be chosen unless a strong hypothesis for uncorrelated factors exists (Leandre et al. , 2012). Orthogonal and oblique are actually families of rotations, so once the larger choice of family is made, a specific rotation method must be chosen. The specific rotation method within the oblique category that is chosen does not generally have a strong effect on the results ( Bandalos and Finney, 2010 ). However, the researcher should always provide information about which rotation method was used ( Bandalos and Finney, 2010 ).

Determining the Number of Factors.

After selecting the methods for estimation and rotation, researchers must determine how many factors to extract for EFA. This step is recognized as the greatest challenge of an EFA, and the issue has generated a large amount of debate (e.g., Cattell, 1966; Crawford et al., 2010; Leandre et al., 2012). Commonly used methods are to retain all factors with an eigenvalue >1 or to use a scree plot. Eigenvalues are roughly a measure of the amount of information contained in a factor, so factors with higher eigenvalues are the most useful for understanding the data. A scree plot is a plot of eigenvalues versus number of factors. Scree plots allow researchers to visually estimate the number of factors that are informative by considering the shape of the plot (see the annotated output in the Supplemental Material, Section 2, for an example of a scree plot). These two methods are considered heuristics, and many researchers recommend also using parallel analysis (PA) or the minimum average partial correlation test to determine the appropriate number of factors (Ledesma and Valero-Mora, 2007; Leandre et al., 2012; Tabachnick and Fidell, 2013). In addition, several statistics that mathematically analyze the shape of the scree plot have been developed in an effort to provide a nonvisual method of determining the number of factors (Ruscio and Roche, 2012; Raiche et al., 2013).
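In R, both the scree plot and PA are available through psych. The following sketch continues the earlier examples; the choice of fm and the maximum of eight factors for the MAP test are our own assumptions.

    library(psych)
    fa.parallel(efa_half, fm = "pa", fa = "both")  # PA on eigenvalues from both principal
                                                   # components and factor analysis; also
                                                   # draws the scree plot
    vss(efa_half, n = 8, fm = "pa")                # reports Velicer's minimum average
                                                   # partial (MAP) criterion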

We recommend using a number of these indices, as well as theoretical considerations, to determine the number of factors to retain. The various methods discussed each suggest plausible solutions, all of which can be explored to evaluate which fits best. When these indices are in agreement, this provides stronger evidence of a clear factor structure in the data. To make each factor interpretable, it is of utmost importance that the number and nature of the factors retained make theoretical sense (see Box 5 for a discussion of how many factors to retain). Further, the intended use of the survey should also be considered. For example, say a researcher is interested in studying two distinct populations of students. If the empirical and theoretical evidence supports both a two-factor and a three-factor solution, but the three-factor solution provides a clearer distinction between the two populations of interest, then the researcher might choose the three-factor solution (see Box 7).

BOX 5. How to interpret and report EFA output for publication using the goal-endorsement example

Initial EFAs

Parallel analysis based on eigenvalues from the principal components indicated three components, and parallel analysis based on eigenvalues from the factor analysis indicated five factors. The scree plot indicated an initial leveling out at four factors and a second leveling out at six factors.

We started by running a three-factor model and then increased the number of factors by one until we had run all models ranging from three to six factors (a sketch of this loop in R appears after this box). The pattern matrices were then examined in detail, with a special focus on whether the factors made theoretical sense (see Table 2 for pattern matrices for the three-, four-, and five-factor models). The three-factor solution consisted of one factor with high factor loadings for the items representing communal goals (explaining 17% of the variance in the data). The items originally representing agentic goals were split into two factors: one included items that theoretically could be described as prestige (explaining 12% of the variance in the data), and the other included items related to autonomy and competency (explaining 11% of the variance in the data). The total variance explained by the three-factor model was 41%. In the four-factor solution, the autonomy and competency items were split into two different factors. In the five-factor solution, three items from the original communal goals scale (working with people, connection to others, and intimacy) contributed most to the additional factor. In total, 48% of the variance was explained by the five-factor model. In the six-factor solution, the sixth factor included only one item with a pattern loading greater than 0.40, and thus a six-factor solution was deemed inappropriate.

In conclusion, the communal scale might represent one underlying construct, as suggested by previous research, or it might be split into two subscales represented by items related to 1) serving others and 2) connection. Our data did not support a single agentic factor. Instead, these items seemed to fit on two or three subscales: prestige, autonomy, and possibly competency. Because all the suggested solutions (three-, four-, and five-factor) included a number of poorly fitting items, we decided to remove items and run a second set of EFAs before proceeding to the CFA.

Second round of EFAs

On the basis of the results from the initial EFAs, we first continued with a three-factor solution, removing items with low pattern coefficients (<0.40; to begin with, 10: success, 14: competition, and 22: intimacy; Table 2). When these variables were removed in a stepwise manner, additional items showed low pattern coefficients (<0.40) and/or low communalities in the new EFA solutions. The new items showing low pattern coefficients were items belonging to their own factors in the five-factor EFA (i.e., items representing competency and connection). Not until all items from these two scales were removed was a stable three-factor solution achieved with pattern coefficients >0.40. Thus, to achieve a three-factor solution including only items with pattern coefficients >0.40, we had to drop 30% of the items and, consequently, extensively narrow the content validity of the scale.

To further explore a five-factor solution, we decided, on the basis of the empirical results and the theoretical meaning of the items, to stepwise remove items 4 (mastery), 14 (competition), and 22 (intimacy). We used an inclusive pattern coefficient cutoff (<0.40) for this initial round of validation because we wanted to keep as many items as possible from the original scale. If some items continue to show pattern coefficients below 0.5 over repeated data collections, researchers should reconsider whether these items should be kept in the scale. The new 20-item, five-factor solution resulted in theoretically the same factors as the first five-factor EFA, but now all pattern coefficients but one were above 0.50 on the primary factor and below 0.20 on the other factors ( Table 3 ). In total, 52% of the variance in the data was explained.

(Note to Table 3: for clarity, pattern coefficients <0.2 are not shown.)

In conclusion, the initial CFA, as well as the EFAs, indicated that the previously suggested two-dimensional scale was not supported in our sample. The EFAs mainly indicated a three- or a five-factor solution. To achieve a good three-factor solution, we had to exclude 30% of the original items. The final three factors were labeled "prestige," "autonomy," and "service." Both the empirical data and theoretical considerations suggested two additional factors: a competency factor and a connection factor. We continued with this five-factor solution: it allowed us to retain more of the original items, and it made theoretical sense, because the five factors were simply a further parsing of the original agentic and communal scales.
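The model-comparison loop described in Box 5 can be sketched in R as follows, again assuming efa_half from the earlier sketches; the 0.40 display cutoff mirrors the box.

    library(psych)
    for (k in 3:6) {
      fit_k <- fa(efa_half, nfactors = k, fm = "pa", rotate = "oblimin")
      print(fit_k$loadings, cutoff = 0.40)  # pattern matrix for the k-factor model
    }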

BOX 6. How to interpret and report CFA output for publication using the goal-endorsement example, second CFA

Based on the results from the EFAs, a second CFA was specified using the five-factor model with 20 items (excluding 4: mastery, 14: competition, and 22: intimacy). The specified five-factor CFA demonstrated appropriate model fit (χ² = 266, df = 160, p < 0.001, CFI = 0.959, RMSEA = 0.046, and SRMR = 0.050). Factor loadings were close to or above 0.70 for all but three items ( Figure 2 ), meaning that, for most items, around 50% of the variance in the item (R² ≈ 0.5) was explained by the theorized factor. In other words, the factors explained most of the items well. Factor correlations were highest between the service and connection factors (0.76) and between the autonomy and competency factors (0.67). The lowest factor correlation was between the prestige and service factors (0.21). Coefficient alpha values for the subscales were 0.81, 0.77, 0.66, 0.87, and 0.77 for prestige, autonomy, competency, service, and connection, respectively.

FIGURE 2. Results from the final five-factor CFA model. Survey items (for item descriptions, see Table 3 ) are represented by squares, and factors are represented by ovals. The numbers below the double-headed arrows represent correlations between the factors; the numbers by the one-directional arrows between the factors and the items represent standardized factor loadings. Small arrows indicate error terms. *, p < 0.01; p < 0.001 for all other estimates.
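For readers who want to reproduce this kind of model, the following is a minimal lavaan sketch of a five-factor CFA with fit indices like those reported in Box 6. The factor and item names are hypothetical placeholders, not the article's actual items.

    library(lavaan)
    model <- '
      prestige   =~ p1 + p2 + p3 + p4 + p5
      autonomy   =~ a1 + a2 + a3 + a4
      competency =~ c1 + c2 + c3
      service    =~ s1 + s2 + s3 + s4 + s5 + s6
      connection =~ n1 + n2
    '
    cfa_fit <- cfa(model, data = cfa_half)  # first loading per factor fixed to 1 by default
    fitMeasures(cfa_fit, c("chisq", "df", "pvalue", "cfi", "rmsea", "srmr"))
    standardizedSolution(cfa_fit)           # standardized loadings and factor correlations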

BOX 7. Writing conclusions from factor analysis for publication using the goal-endorsement example

Conclusions.

The results from the factor analysis did not confirm the proposed two-factor goal-endorsement scale for use with college STEM majors. Instead, our results indicated five subscales: prestige, autonomy, competency, service, and connection ( Table 4 ). The five-factor solution aligned with Diekman et al.'s (2010) original two-factor scale, because communal items did not mix with agentic items. Our sample did, however, allow us to further refine the solution for the original two scales. Finer parsing of the agentic and communal scales may help identify important differences between students and allow researchers to better understand factors contributing to retention in STEM majors. In addition, with items related to autonomy and competency moved to their own scales, the refined prestige scale, focusing on factors such as power, recognition, and status, may be a more direct contrast to the service scale. Additional evidence in support of this refinement includes the observation that the five-factor solution better distinguishes the service scale from the prestige scale (factor correlation = 0.21) than the two-factor solution does (factor correlation between the agentic and communal factors = 0.35). Further, retention may be significantly correlated with prestige but not with autonomy. Alternatively, differences between genders may exist for the service scale but not the connection scale.

On the basis of the results of this factor analysis, we recommend using the five-factor solution for interpreting the results of the current data set, while interpreting the connection and competency scales with some caution, for reasons summarized in the next section.

Limitations and future studies

The proposed five-factor solution needs additional work. In particular, both the competency and connection scales need further development. Only two items represented connection, which is not adequate to represent the full scope of this construct, especially to make it clearly distinct from the construct of service. The competency scale included only three items, its coefficient alpha was 0.66, and the factor loading for the item on demonstrating skills or competency was low (<0.40).

Another limitation of this study is that the sample consisted of 70% women, an overrepresentation of women relative to a typical undergraduate STEM population. Further studies should confirm whether the suggested dimensionality holds in a more representative sample. Future studies should also test whether the instrument has the same structure for STEM students from different backgrounds (i.e., measurement invariance should be investigated). The work presented here establishes only the dimensionality of the survey. We recommend the collection of other types of validity evidence, such as evidence based on content or on relationships to other variables, to further strengthen confidence that the scores from this survey represent STEM students' goal orientation.

Interpreting Output from EFA

The aim of EFA is to gain a better understanding of underlying patterns in the data, investigate dimensionality, and identify potentially problematic items. In addition to the results from parallel analysis or other methods used to estimate the number of factors, other informative measures include pattern coefficients and communalities. These outputs from an EFA will be discussed in this section. See Box 5 for an example of how to write up the output from an EFA.

Pattern Coefficients and Communalities.

Pattern coefficients and communalities are parameters describing the relationship between the items and the factors. They help researchers understand the meaning of the factors and identify items that do not empirically appear to belong to their theorized factor.

Pattern coefficients closely correspond to factor loadings in CFA, and they are commonly the focal output from an EFA (Leandre et al., 2012). Pattern coefficients represent the impact each factor has on an item after controlling for the impact of all the other factors on that item. A high pattern coefficient suggests that the item is well explained by a particular factor. However, as with CFA, there is no clear rule for when a pattern coefficient is too low for the item to be considered part of a particular factor; guidelines for minimum pattern coefficient values range from 0.40 to 0.70. Items with pattern coefficients equal to or higher than the chosen cutoff value can be considered "good" items and kept in the survey (Matsunaga, 2010).

It is also important to consider the magnitude of any cross-loadings. Cross-loading describes the situation in which an item appears to be influenced by more than one factor in the model, indicated by high pattern coefficients on multiple factors. Using such an item is problematic when creating a summed/mean score for a factor, as responses to the item are driven not only by its hypothesized factor but also by additional measured factors. Cross-loadings higher than 0.20 or 0.30 are usually considered problematic (Matsunaga, 2010), especially if the item does not have a particularly strong loading on a focal factor.

Communality represents the proportion of the variance in responses to an item that is accounted for by all factors in the proposed model. Communalities are similar to R² in CFA (see Factor Loadings). However, in CFA the variance in an item is explained by only one factor, while in EFA the variance in one item can be explained by several factors. Low communality for an item means that the variance in the item is not well explained by any part of the model, and thus that item could be a candidate for elimination.
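These quantities are directly available from a fitted psych::fa object. The sketch below continues the earlier examples; the cutoffs follow the guidelines above.

    print(efa_fit$loadings, cutoff = 0.40)  # pattern coefficients, hiding values < 0.40
    round(efa_fit$communality, 2)           # proportion of each item's variance explained
    # Flag potential cross-loadings: items with |coefficients| > 0.30 on several factors
    cross <- rowSums(abs(unclass(efa_fit$loadings)) > 0.30) > 1
    names(cross)[cross]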

We emphasize that, even if pattern coefficients or communalities indicate that an item might be a candidate for elimination, it is important to consider the alignment between the item and the hypothesized construct before actually eliminating it. The items in a scale are presumably chosen for some theoretical reason, and eliminating items can decrease content validity (Bandalos and Finney, 2010). If any item is removed, the EFA should be rerun to ensure that the original factor structure persists. This can be done on the same data set, as EFA is exploratory in nature.
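A sketch of such a rerun, with hypothetical item names standing in for whichever items were flagged:

    reduced <- efa_half[, !(names(efa_half) %in% c("item10", "item14", "item22"))]
    efa_refit <- fa(reduced, nfactors = 5, fm = "pa", rotate = "oblimin")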

Interpreting the Final Solution.

Once the factors and the items make empirical and theoretical sense, the factor solution can be interpreted, and suitable names for the factors should be chosen (see Box 5 for a discussion of the output from an EFA). Important sources of information for this include the amount of variance explained by the whole solution and by each factor, the factor correlations, pattern coefficients, communality values, and the underlying theory. Because the names of the factors will be used to communicate the results, it is crucial that the names reflect the meaning of the underlying items. Because the item responses are manifestations of the constructs, different sets of items representing a construct will lead to slightly different nuanced interpretations of that construct. Once a plausible solution has been identified by an EFA, stronger support for the solution can be obtained by testing the hypothesized model with a CFA on a new sample.

CONCLUDING REMARKS

In this article, we have discussed the need to understand the validity evidence available for an existing survey before its use in discipline-based educational research. We emphasized that validity is not a property of the measurement instrument itself but is instead a property of the instrument's use. Thus, each time researchers decide to use an instrument, they have to consider to what degree evidence and theory support the intended interpretations and use of the instrument. A researcher should always review the different kinds of validity evidence described by AERA, APA, and NCME (2014; Table 1 ) before using an instrument and should identify the evidence needed to feel confident when employing the instrument for an intended use. When using several related items to measure an underlying construct, one important validity aspect to consider is whether a set of items can confidently be combined to represent that construct. In this paper, we have shown how factor analysis (both exploratory and confirmatory) can be used to investigate that.

We recognize that the information presented herein may seem daunting and a potential barrier to carrying out important, substantive, educational research. We appreciate this sentiment and have experienced those fears ourselves, but we feel that properly understanding procedures for vetting instruments before their use is essential for robust and replicable research. To reiterate, at issue here is the confidence and trust one can have in one’s own research, both after its initial completion and in future studies that will rely on the replicability of results. Again, we can use an analogy for the measurement of unobservable phenomena: one would not expect an uncalibrated and calibrated scale to produce the same values for the weight of a rock. This does not mean that the uncalibrated scale will necessarily produce invalid measurements, only that one’s confidence in its ability to do so should be tempered by the knowledge that it has not yet been calibrated. Research conducted using uncalibrated or biased instruments, regardless of discipline, is at risk of inferring conclusions that are incorrect. The researcher may make the appropriate inferences given the values provided by the instrument, but if the instrument itself is invalid for the proposed use, then the inferences drawn are also invalid. Our aim in presenting these methods is to strengthen the research conducted in biology education and continue to improve the quality of biology education in higher education.

1 In this article, we will use the terms “surveys,” “measurement instrument,” and “instrument” interchangeably. We will, however, put the most emphasis on the term “measurement instrument,” because it conveys the importance of considering the quality of the measurement resulting from the instrument’s use.

2 “Latent variables” and “constructs” both refer to phenomena that are not directly observable. Examples could include a student’s goals, the strength of his or her interest in biology, or his or her tolerance of failure. The term “latent variable” is commonly used when discussing these phenomena from a measurement point of view, while “construct” is a more general term used when discussing these phenomena from a theoretical perspective. In this article, we will use the term “construct” only when referring to phenomena that are not directly observable.

3 In addition to coefficient alpha, there are a number of other reliability estimates available. We refer interested readers to Bandalos (2018), Sijtsma (2009), and Crocker and Algina (2008).

4 This is partly due to identification issues (see Specifying the Model).

5 In EFA, communalities describe how much of the variance in an item is explained by the factors. For more information about communalities, see Interpreting Output from EFA.

6 For a description of a correlation matrix, see the Supplemental Material, Sections 1 and 2.

7 It is necessary to set the metric to interpret factor loadings and variances in a CFA model. This is commonly done either by 1) choosing one of the factor loadings and fixing it to 1 (this is done for each factor in the model) or by 2) fixing the variance of the latent factors to 1. We have chosen the former approach for this example.
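In lavaan, these two identification choices correspond to the sketch below (model and data names as in the earlier CFA sketch):

    fit_marker <- cfa(model, data = cfa_half)                 # option 1: first loading per factor fixed to 1 (lavaan default)
    fit_stdlv  <- cfa(model, data = cfa_half, std.lv = TRUE)  # option 2: factor variances fixed to 1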

8 For some software and estimation methods, model fit indices are also provided for EFA. In a similar way as for CFA, these model fit indices can be used to evaluate the fit of the data to the model.

9 When using CFA, the default setting in most software is to provide factor loadings in the original metric of the items, such that the results are covariances between the items and the factor. Because these values are unstandardized, it is sometimes hard to interpret these relationships. For this reason, it is common to standardize factor loadings and other model relationships (e.g., correlations between latent factors), which puts them in the more familiar correlation format that is bounded by −1 and +1.

10 When distilling the responses of several items into a single score, one is implicitly assuming that all of the items measure the underlying construct equally well (usually without measurement error) and are of equal theoretical importance. Fully discussing the nuances of how to create a single score from a set of items is beyond the scope of this paper, but we would be remiss if we did not at least mention it and encourage the reader to seek more information, such as DiStefano et al. (2009).

ACKNOWLEDGMENTS

We are indebted to Ashely Rowland, Melissa McCartney, Matthew Kararo, Julie Charbonnier, and Marie Janelle Tacloban for their comments on earlier versions of this article. The research reported in this paper was supported by awards from the National Science Foundation (NSF DUE 1534195 and 1711082). This research was conducted under approved IRB 2015-06-0055, University of Texas at Austin.

REFERENCES

  • Allen, J. M., Muragishi, G. A., Smith, J. L., Thoman, D. B., & Brown, E. R. (2015). To grab and to hold: Cultivating communal goals to overcome cultural and structural barriers in first-generation college students' science interest. Translational Issues in Psychological Science, 1(4), 331.
  • American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (AERA, APA, and NCME). (2014). Standards for educational and psychological testing. Washington, DC.
  • Andrews, S. E., Runyon, C., & Aikens, M. L. (2017). The math–biology values instrument: Development of a tool to measure life science majors' task values of using math in the context of biology. CBE—Life Sciences Education, 16(3), ar45.
  • Armbruster, P., Patel, M., Johnson, E., & Weiss, M. (2009). Active learning and student-centered pedagogy improve student attitudes and performance in introductory biology. CBE—Life Sciences Education, 8(3), 203–213.
  • Bakan, D. (1966). The duality of human existence: An essay on psychology and religion. Oxford, UK: Rand McNally.
  • Bandalos, D. L. (2018). Measurement theory and applications for the social sciences. New York: Guilford.
  • Bandalos, D. L., & Finney, S. J. (2010). Factor analysis: Exploratory and confirmatory. In Hancock, G. R., & Mueller, R. O. (Eds.), The reviewer's guide to quantitative methods in the social sciences (pp. 93–114). New York: Routledge.
  • Beaujean, A. A. (2012). BaylorEdPsych: R package for Baylor University educational psychology quantitative courses. Retrieved from https://CRAN.R-project.org/package=BaylorEdPsych
  • Boomsma, A. (1982). Robustness of LISREL against small sample sizes in factor analysis models. In Joreskog, K. G., & Wold, H. (Eds.), Systems under indirect observation: Causality, structure, prediction (Part 1, pp. 149–173). Amsterdam, Netherlands: North Holland.
  • Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111(4), 1061–1071.
  • Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1(2), 245–276.
  • Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7(3), 309–319.
  • Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ: Erlbaum.
  • Crawford, A. V., Green, S. B., Levy, R., Lo, W.-J., Scott, L., Svetina, D., & Thompson, M. S. (2010). Evaluation of parallel analysis methods for determining the number of factors. Educational and Psychological Measurement, 70(6), 885–901.
  • Crocker, L., & Algina, J. (2008). Introduction to classical and modern test theory. Mason, OH: Cengage Learning.
  • Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281–302.
  • Diekman, A. B., Brown, E. R., Johnston, A. M., & Clark, E. K. (2010). Seeking congruity between goals and roles: A new look at why women opt out of science, technology, engineering, and mathematics careers. Psychological Science, 21(8), 1051–1057.
  • DiStefano, C., Zhu, M., & Mindrila, D. (2009). Understanding and using factor scores: Considerations for the applied researcher. Practical Assessment, Research & Evaluation, 14(20), 1–11.
  • Eagly, A. H., Wood, W., & Diekman, A. (2000). Social role theory of sex differences and similarities: A current appraisal. In Eckes, T., & Trautner, H. M. (Eds.), The developmental social psychology of gender (pp. 123–174). Mahwah, NJ: Erlbaum.
  • Eddy, S. L., Brownell, S. E., Thummaphan, P., Lan, M. C., & Wenderoth, M. P. (2015). Caution, student experience may vary: Social identities impact a student's experience in peer discussions. CBE—Life Sciences Education, 14(4), ar45.
  • Eddy, S. L., & Hogan, K. A. (2014). Getting under the hood: How and for whom does increasing course structure work? CBE—Life Sciences Education, 13(3), 453–468.
  • Finney, S. J., & DiStefano, C. (2006). Nonnormal and categorical data in structural equation modeling. In Hancock, G. R., & Mueller, R. O. (Eds.), A second course in structural equation modeling (pp. 269–314). Greenwich, CT: Information Age.
  • Fowler, F. J. (2014). Survey research methods. Los Angeles: Sage.
  • Gagne, P., & Hancock, G. R. (2006). Measurement model quality, sample size, and solution propriety in confirmatory factor models. Multivariate Behavioral Research, 41(1), 65–83.
  • Gorsuch, R. L. (1983). Factor analysis (2nd ed.). Hillsdale, NJ: Erlbaum.
  • Green, S. B., & Yang, Y. (2009). Commentary on coefficient alpha: A cautionary tale. Psychometrika, 74(1), 121–135.
  • Hu, L., & Bentler, P. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55.
  • Kline, R. B. (2016). Principles and practice of structural equation modeling (4th ed.). New York: Guilford.
  • Leandre, R., Fabrigar, L. R., & Wegener, D. T. (2012). Exploratory factor analysis. Oxford, UK: Oxford University Press.
  • Ledesma, R. D., & Valero-Mora, P. (2007). Determining the number of factors to retain in EFA: An easy-to-use computer program for carrying out parallel analysis. Practical Assessment, Research & Evaluation, 12(2).
  • Lissitz, R. W., & Samuelsen, K. (2007). A suggested change in terminology and emphasis regarding validity and education. Educational Researcher, 36(8), 437–448.
  • Matsunaga, M. (2010). How to factor-analyze your data right: Do's, don'ts, and how-to's. International Journal of Psychological Research, 3. Retrieved February 24, 2019, from www.redalyc.org/html/2990/299023509007/
  • McNeish, D. (2018). Thanks coefficient alpha, we'll take it from here. Psychological Methods, 23(3), 412–433. https://dx.doi.org/10.1037/met0000144
  • Mehrens, W. A. (1997). The consequences of consequential validity. Educational Measurement: Issues and Practice, 16(2), 16–18.
  • Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749.
  • Perry, J. L., Nicholls, A. R., Clough, P. J., & Crust, L. (2015). Assessing model fit: Caveats and recommendations for confirmatory factor analysis and exploratory structural equation modeling. Measurement in Physical Education and Exercise Science, 19(1), 12–21.
  • Prentice, D. A., & Carranza, E. (2002). What women and men should be, shouldn't be, are allowed to be, and don't have to be: The contents of prescriptive gender stereotypes. Psychology of Women Quarterly, 26(4), 269–281.
  • Raykov, T., & Marcoulides, G. A. (2008). An introduction to applied multivariate analysis. New York: Routledge.
  • R Core Team. (2016). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved February 24, 2019, from www.R-project.org
  • Reeves, T. D., & Marbach-Ad, G. (2016). Contemporary test validity in theory and practice: A primer for discipline-based education researchers. CBE—Life Sciences Education, 15(1), rm1.
  • Revelle, W. (2017). psych: Procedures for personality and psychological research (Version 1.7.8). Evanston, IL: Northwestern University. Retrieved February 24, 2019, from https://CRAN.R-project.org/package=psych
  • Rissing, S. W., & Cogan, J. G. (2009). Can an inquiry approach improve college student learning in a teaching laboratory? CBE—Life Sciences Education, 8(1), 55–61.
  • Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36.
  • Ruscio, J., & Roche, B. (2012). Determining the number of factors to retain in an exploratory factor analysis using comparison data of known factorial structure. Psychological Assessment, 24(2), 282.
  • Schmitt, N. (1996). Uses and abuses of coefficient alpha. Psychological Assessment, 8(4), 350–353.
  • Sijtsma, K. (2009). On the use, the misuse, and the very limited usefulness of Cronbach's alpha. Psychometrika, 74(1), 107.
  • Slaney, K. (2017). Construct validity: Developments and debates. In Validating psychological constructs: Historical, philosophical, and practical dimensions (pp. 83–109). London: Palgrave Macmillan.
  • Smith, J. L., Cech, E., Metz, A., Huntoon, M., & Moyer, C. (2014). Giving back or giving up: Native American student experiences in science and engineering. Cultural Diversity and Ethnic Minority Psychology, 20(3), 413.
  • Stephens, N. M., Fryberg, S. A., Markus, H. R., Johnson, C. S., & Covarrubias, R. (2012). Unseen disadvantage: How American universities' focus on independence undermines the academic performance of first-generation college students. Journal of Personality and Social Psychology, 102(6), 1178–1197. http://doi.org/10.1037/a0027143
  • Su, R., Rounds, J., & Armstrong, P. I. (2009). Men and things, women and people: A meta-analysis of sex differences in interests. Psychological Bulletin, 135(6), 859.
  • Tabachnick, B. G., & Fidell, L. S. (2013). Using multivariate statistics (6th ed.). Boston: Pearson.
  • Tavakol, M., & Dennick, R. (2011). Making sense of Cronbach's alpha. International Journal of Medical Education, 2, 53–55. http://doi.org/10.5116/ijme.4dfb.8dfd
  • Wachsmuth, L. P., Runyon, C. R., Drake, J. M., & Dolan, E. L. (2017). Do biology students really hate math? Empirical insights into undergraduate life science majors' emotions about mathematics. CBE—Life Sciences Education, 16(3), ar49.
  • Wiggins, B. L., Eddy, S. L., Wener-Fligner, L., Freisem, K., Grunspan, D. Z., Theobald, E. J., & Crowe, A. J. (2017). ASPECT: A survey to assess student perspective of engagement in an active-learning classroom. CBE—Life Sciences Education, 16(2), ar32.
  • Wolf, E. J., Harrington, K. M., Clark, S. L., & Miller, M. W. (2013). Sample size requirements for structural equation models: An evaluation of power, bias, and solution propriety. Educational and Psychological Measurement, 73(6), 913–934.
  • Worthington, R. L., & Whittaker, T. A. (2006). Scale development research: A content analysis and recommendations for best practices. The Counseling Psychologist, 34(6), 806–838.
  • Yang, Y., & Green, S. B. (2011). Coefficient alpha: A reliability coefficient for the 21st century? Journal of Psychoeducational Assessment, 29(4), 377–392.
  • Yong, A. G., & Pearce, S. (2013). A beginner's guide to factor analysis: Focusing on exploratory factor analysis. Tutorials in Quantitative Methods for Psychology, 9(2), 79–94.
  • Development of the Adolescent Opioid Safety and Learning (AOSL) scale using exploratory factor analysis Research in Social and Administrative Pharmacy, Vol. 18, No. 5
  • Portuguese Version of the Spiritual Well-Being Questionnaire: Validation Study in People under Assisted Reproductive Techniques 26 April 2022 | Religions, Vol. 13, No. 5
  • Caregiver social support and child toilet training in rural Odisha, India: What types of support facilitate training and how? 19 October 2021 | Applied Psychology: Health and Well-Being, Vol. 14, No. 2
  • The Competence Scale in Managing Behavioral and Psychological Symptoms of Dementia (CS-MBPSD) for family caregivers: Instrument development and cross-sectional validation study International Journal of Nursing Studies, Vol. 129
  • Why Did Students Report Lower Test Anxiety during the COVID-19 Pandemic? Journal of Microbiology & Biology Education, Vol. 23, No. 1
  • A new perspective of work stress on teaching performance by competencies 13 April 2022 | International Journal of Leadership in Education, Vol. 17
  • Development and validation of Online Classroom Learning Environment Inventory (OCLEI): The case of Indonesia during the COVID-19 pandemic 4 March 2021 | Learning Environments Research, Vol. 25, No. 1
  • Validity and reliability of a social skills scale among Chilean health sciences students: A cross-sectional study 14 March 2022 | European Journal of Translational Myology, Vol. 32, No. 1
  • Assessment of the Psychometric Properties of the Holland Sleep Disorders Questionnaire in the Iranian Population Sleep Disorders, Vol. 2022
  • Comprehensive assessment of reliability and validity for the clinical cases in simulated community pharmacy 11 March 2022 | Pharmacy Education
  • Development and Evaluation of a Survey to Measure Student Engagement at the Activity Level in General Chemistry 9 February 2022 | Journal of Chemical Education, Vol. 99, No. 3
  • Developing a Multilevel Scale to Assess Retention of Workers with Disabilities 9 June 2021 | Journal of Occupational Rehabilitation, Vol. 32, No. 1
  • One size does NOT fit all: Understanding differences in perceived organizational support during the COVID‐19 pandemic 3 March 2022 | Business and Society Review, Vol. 127, No. S1
  • Dimensionality and Reliability of the Intentions to Seeking Counseling Inventory with International Students 9 July 2021 | Journal of International Students, Vol. 12, No. 1
  • Nursing Warmth Scale (NWS): Development and empirical validation 7 June 2022 | Avances en Enfermería, Vol. 40, No. 2
  • Psychometric properties and factor structure of the Finnish version of the Health Care Providers’ Pain and Impairment Relationship Scale Musculoskeletal Science and Practice, Vol. 57
  • Evolution of the earthworm (Eisenia fetida) microbial community in vitro and in vivo under tetracycline stress Ecotoxicology and Environmental Safety, Vol. 231
  • The moderation effect of work engagement on entrepreneurial attitude and organizational commitment: evidence from Thailand’s entry-level employees during the COVID-19 pandemic 29 July 2021 | Asia-Pacific Journal of Business Administration, Vol. 14, No. 1
  • Global Business and Organizational Excellence, Vol. 41, No. 5
  • Science & Education, Vol. 31, No. 3
  • Medical Science Educator, Vol. 32, No. 3
  • Development and validation of ESL/EFL reading strategies inventory Ampersand, Vol. 9
  • Factors influencing undergraduate nursing students’ evaluation of teaching effectiveness in a nursing program at a higher education institution in Namibia International Journal of Africa Nursing Sciences, Vol. 17
  • Ocean & Coastal Management, Vol. 225
  • Journal of Microbiology & Biology Education, Vol. 23, No. 1
  • Evolution: Education and Outreach, Vol. 15, No. 1
  • PLOS ONE, Vol. 17, No. 6
  • Psychometric Properties of Remote Teaching Efficacy Scale in Employed Filipino Teachers during COVID-19 Crisis Journal of Digital Educational Technology, Vol. 2, No. 1
  • The Technology Acceptance of Video Consultations for Type 2 Diabetes Care in General Practice: Cross-sectional Survey of Danish General Practitioners 30 August 2022 | Journal of Medical Internet Research, Vol. 24, No. 8
  • The Intersection of Persuasive System Design and Personalization in Mobile Health: Statistical Evaluation 14 September 2022 | JMIR mHealth and uHealth, Vol. 10, No. 9
  • An assessment of the policy and regulatory outcome by the telecom services users: The emerging economy study 9 May 2022 | Journal of Governance and Regulation, Vol. 11, No. 2, special issue
  • International Journal of Environmental Research and Public Health, Vol. 19, No. 12
  • Identifying Structure in Program-Level Competencies and Skills
  • Impact of a Mentorship Training Course on the Prevalence of Burnout in Nurse Leaders Working in a Regional Healthcare System SSRN Electronic Journal, Vol. 38
  • Increasing Resilience of Utility Tunnel PPP Projects Through Risk Management: A Case on in Shiyan City 2 September 2022
  • The Ph.D. Panic: Examining the Relationships Among Teaching Anxiety, Teaching Self-Efficacy, And Coping in Biology Graduate Teaching Assistants (GTAs) Journal of Research in Science, Mathematics and Technology Education, Vol. 5, No. SI
  • Modifying the ASPECT Survey to Support the Validity of Student Perception Data from Different Active Learning Environments Journal of Microbiology & Biology Education, Vol. 22, No. 3
  • S. Salehi ,
  • S. A. Berk ,
  • R. Brunelli ,
  • S. Cotner ,
  • C. Creech ,
  • A. G. Drake ,
  • S. Fagbodun ,
  • S. Hebert ,
  • J. Hewlett ,
  • A. C. James ,
  • M. Shuster ,
  • J. R. St. Juliana ,
  • D. B. Stovall ,
  • R. Whittington ,
  • M. Zhong , and
  • C. J. Ballen
  • Rebecca Price, Monitoring Editor
  • Lauren Hensley ,
  • Amy Kulesza ,
  • Joshua Peri ,
  • Anna C. Brady ,
  • Christopher A. Wolters ,
  • David Sovic , and
  • Caroline Breitenberger
  • Joel K. Abraham, Monitoring Editor
  • Development of emergency nursing care competency scale for school nurses 14 April 2021 | BMC Nursing, Vol. 20, No. 1
  • Measuring sexual violence stigma in humanitarian contexts: assessment of scale psychometric properties and validity with female sexual violence survivors from Somalia and Syria 24 December 2021 | Conflict and Health, Vol. 15, No. 1
  • Quantifying fear of failure in STEM: modifying and evaluating the Performance Failure Appraisal Inventory (PFAI) for use with STEM undergraduates 6 July 2021 | International Journal of STEM Education, Vol. 8, No. 1
  • Measuring COVID-19 related anxiety and obsession: Validation of the Coronavirus Anxiety Scale and the Obsession with COVID-19 Scale in a probability Chinese sample Journal of Affective Disorders, Vol. 295
  • Proposal of a temporality perspective for a successful organizational change project 10 August 2021 | International Journal of Workplace Health Management, Vol. 14, No. 5
  • Development and validation of the athletes’ rights survey 15 November 2021 | BMJ Open Sport & Exercise Medicine, Vol. 7, No. 4
  • On Black Male Leadership: A Study of Leadership Efficacy, Servant Leadership, and Engagement Mediated by Microaggressions 31 August 2021 | Advances in Developing Human Resources, Vol. 23, No. 4
  • Addressing the Unique Qualities of Upper-Level Biology Course-based Undergraduate Research Experiences through the Integration of Skill-Building 3 May 2021 | Integrative and Comparative Biology, Vol. 61, No. 3
  • Reassessment of climate zones for high-level pavement analysis using machine learning algorithms and NASA MERRA-2 data Advanced Engineering Informatics, Vol. 50
  • Mediation Analysis in Discipline-Based Education Research Using Structural Equation Modeling: Beyond “What Works” to Understand How It Works, and for Whom Journal of Microbiology & Biology Education, Vol. 22, No. 2
  • Psychometric Evaluation of the Nurses Professional Values Scale-3: Indonesian Version 20 August 2021 | International Journal of Environmental Research and Public Health, Vol. 18, No. 16
  • Chemistry self-efficacy in lower-division chemistry courses: changes after a semester of instruction and gaps still remain between student groups 1 January 2021 | Chemistry Education Research and Practice, Vol. 22, No. 3
  • The incoherence of sustainability literacy assessed with the Sulitest 25 February 2021 | Nature Sustainability, Vol. 4, No. 6
  • Enhancing the Positive Impact Rating: A New Business School Rating in Support of a Sustainable Future 8 June 2021 | Sustainability, Vol. 13, No. 12
  • Validation of the Arabic Version of the Copenhagen Psychosocial Questionnaire II (A-COPSOQ II) among Workers in Oil and Gas Industrial Sector 1 June 2021 | Journal of Biomedical Research & Environmental Sciences
  • Somatic symptoms have negligible impact on Patient Health Questionnaire‐9 depression scale scores in neurological patients 26 March 2021 | European Journal of Neurology, Vol. 28, No. 6
  • Multi-institutional Study of Self-Efficacy within Flipped Chemistry Courses 30 March 2021 | Journal of Chemical Education, Vol. 98, No. 5
  • Hospitality workers’ COVID-19 risk perception and depression: A contingent model based on transactional theory of stress model International Journal of Hospitality Management, Vol. 95
  • Commonalities and specificities of positive youth development in the U.S. and Taiwan Journal of Applied Developmental Psychology, Vol. 73
  • Course-Based Undergraduate Research Experiences Spanning Two Semesters of Biology Impact Student Self-Efficacy but not Future Goals 27 September 2023 | Journal of College Science Teaching, Vol. 50, No. 4
  • Preliminary validity and reliability evidence of the Brief Antisocial Behavior Scale (B-ABS) in young adults from four countries 22 February 2021 | PLOS ONE, Vol. 16, No. 2
  • Construct Validity and Test–Retest Reliability of the Automated Vehicle User Perception Survey 25 January 2021 | Frontiers in Psychology, Vol. 12
  • Cross-Cultural Adaptation and Validation of the Malay Satisfaction Questionnaire for Osteoporosis Prevention in Malaysia 1 June 2021 | Patient Preference and Adherence, Vol. Volume 15
  • Validity and Reliability of the Turkish Version of the COVID Stress Scale Journal of Korean Academy of Nursing, Vol. 51, No. 5
  • Symptom clusters and quality of life among patients with chronic heart failure: A cross‐sectional study 28 August 2020 | Japan Journal of Nursing Science, Vol. 18, No. 1
  • Measuring university students’ interest in biology: evaluation of an instrument targeting Hidi and Renninger’s individual interest 19 May 2020 | International Journal of STEM Education, Vol. 7, No. 1
  • Belonging in general chemistry predicts first-year undergraduates’ performance and attrition 1 January 2020 | Chemistry Education Research and Practice, Vol. 21, No. 4
  • Eva Knekta ,
  • Kyriaki Chatzikyriakidou ,, and
  • Melissa McCartney
  • David Feldon, Monitoring Editor
  • Brie Tripp and
  • Erin E. Shortlidge
  • Developing and testing a measure of COVID-19 organizational support of healthcare workers – results from Peru, Ecuador, and Bolivia Psychiatry Research, Vol. 291
  • Developing and validating five-construct model of customer satisfaction in beauty and cosmetic E-commerce Heliyon, Vol. 6, No. 9
  • Stepfanie M. Aguillon ,
  • Gregor-Fausto Siegmund ,
  • Renee H. Petipas ,
  • Abby Grace Drake ,
  • Sehoya Cotner , and
  • Cissy J. Ballen
  • Sarah L. Eddy, Monitoring Editor
  • Amanda R. Butz and
  • Janet L. Branchaw
  • All Happy Emotions Are Alike but Every Unhappy Emotion Is Unhappy in Its Own Way: A Network Perspective to Academic Emotions 30 April 2020 | Frontiers in Psychology, Vol. 11
  • Developing a Scale to Measure Students’ Attitudes toward Science 5 January 2020 | International Journal of Assessment Tools in Education, Vol. 6, No. 4
  • Ashley A. Rowland ,
  • Sarah Eddy , and
  • Cynthia Brame, Monitoring Editor
  • Beyond linear regression: A reference for analyzing common data types in discipline based education research 3 July 2019 | Physical Review Physics Education Research, Vol. 15, No. 2
  • Identification of University Students’ Psychological Capital Components from Islamic Perspective 1 March 2019 | Applied Issues in Quarterly Journal of Islamic Education, Vol. 4, No. 1
  • Motivation, Self-efficacy, and Student Engagement in Intermediate Mechanical Engineering Courses

factor analysis journal research

Submitted: 26 April 2018 Revised: 20 September 2018 Accepted: 27 November 2018

© 2019 E. Knekta et al. CBE—Life Sciences Education © 2019 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).


It should be noted that if the prior EFA applied an orthogonal rotation to the factor solution, the factors produced would be uncorrelated; hence, an analysis of second-order factors would not be possible. Generally, in social science research, most constructs are assumed to comprise inter-related factors, and researchers should therefore apply an oblique rotation.
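To make the distinction concrete, the sketch below contrasts an orthogonal (varimax) and an oblique (oblimin) rotation in R, assuming the psych package is installed. The simulated responses, item names, and loading values are hypothetical, chosen only to illustrate how an oblique rotation recovers the inter-factor correlation that an orthogonal rotation forces to zero.

```r
# A minimal sketch (not from the original article): orthogonal vs. oblique
# rotation of a two-factor EFA solution, assuming the 'psych' package.
library(psych)

set.seed(2024)
n  <- 300
f1 <- rnorm(n)                          # latent factor 1
f2 <- 0.5 * f1 + sqrt(0.75) * rnorm(n)  # latent factor 2, correlated ~0.5 with f1

# Ten hypothetical items: items 1-5 load on f1, items 6-10 on f2
items <- data.frame(
  sapply(1:5, function(i) 0.7 * f1 + rnorm(n, sd = 0.5)),
  sapply(1:5, function(i) 0.7 * f2 + rnorm(n, sd = 0.5))
)
names(items) <- paste0("item", 1:10)

# Orthogonal rotation: factors are constrained to be uncorrelated
fit_varimax <- fa(items, nfactors = 2, rotate = "varimax")

# Oblique rotation: factors are allowed to correlate
fit_oblimin <- fa(items, nfactors = 2, rotate = "oblimin")

print(fit_oblimin$loadings, cutoff = 0.30)  # the 0.30 rule of thumb noted earlier
fit_oblimin$Phi  # inter-factor correlation matrix (about 0.5 here)
fit_varimax$Phi  # NULL: an orthogonal solution carries no factor correlations
```

Under the oblique solution the loading pattern is essentially unchanged, but the reported inter-factor correlation (Phi) is what makes a subsequent second-order factor analysis possible.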
