Critical Thinking Testing and Assessment

The purpose of assessment in instruction is improvement. The purpose of assessing instruction for critical thinking is to improve the teaching of discipline-based thinking (historical, biological, sociological, mathematical, etc.). It is to improve students' abilities to think their way through content using disciplined skill in reasoning. The more particular we can be about what we want students to learn about critical thinking, the better we can devise instruction with that particular end in view.


The Foundation for Critical Thinking offers assessment instruments that share the same general goal: to enable educators to gather evidence relevant to determining the extent to which instruction is teaching students to think critically (in the process of learning content). To this end, the Fellows of the Foundation recommend:

  • that academic institutions and units establish an oversight committee for critical thinking, and

  • that this oversight committee utilize a combination of assessment instruments (the more the better) to generate incentives for faculty by providing them with as much evidence as feasible of the actual state of instruction for critical thinking.

The following instruments are available to generate evidence relevant to critical thinking teaching and learning:

Course Evaluation Form: Provides evidence of whether, and to what extent, students perceive faculty as fostering critical thinking in instruction (course by course). Machine-scoreable.

Online Critical Thinking Basic Concepts Test: Provides evidence of whether, and to what extent, students understand the fundamental concepts embedded in critical thinking (and hence tests student readiness to think critically). Machine-scoreable.

Critical Thinking Reading and Writing Test: Provides evidence of whether, and to what extent, students can read closely and write substantively (and hence tests students' abilities to read and write critically). Short-answer.

International Critical Thinking Essay Test: Provides evidence of whether, and to what extent, students are able to analyze and assess excerpts from textbooks or professional writing. Short-answer.

Commission Study Protocol for Interviewing Faculty Regarding Critical Thinking: Provides evidence of whether, and to what extent, critical thinking is being taught at a college or university. Can be adapted for high school. Based on the California Commission Study. Short-answer.

Protocol for Interviewing Faculty Regarding Critical Thinking: Provides evidence of whether, and to what extent, critical thinking is being taught at a college or university. Can be adapted for high school. Short-answer.

Protocol for Interviewing Students Regarding Critical Thinking: Provides evidence of whether, and to what extent, students are learning to think critically at a college or university. Can be adapted for high school. Short-answer.

Criteria for Critical Thinking Assignments: Can be used by faculty in designing classroom assignments, or by administrators in assessing the extent to which faculty are fostering critical thinking.

Rubrics for Assessing Student Reasoning Abilities: A useful tool in assessing the extent to which students are reasoning well through course content.

All of the above assessment instruments can be used as part of pre- and post-assessment strategies to gauge development over various time periods.

Consequential Validity

All of the above assessment instruments, when used appropriately and graded accurately, should lead to a high degree of consequential validity. In other words, the use of the instruments should cause teachers to teach in such a way as to foster critical thinking in their various subjects: if students are to perform well on the instruments, teachers will need to design instruction that develops the abilities the instruments measure. Students cannot become skilled in critical thinking without learning (first) the concepts and principles that underlie critical thinking and (second) applying them in a variety of forms of thinking: historical thinking, sociological thinking, biological thinking, etc. Students cannot become skilled in analyzing and assessing reasoning without practicing it. However, when they have routine practice in paraphrasing, summarizing, analyzing, and assessing, they will develop skills of mind requisite to the art of thinking well within any subject or discipline, not to mention thinking well within the various domains of human life.



Supplement to Critical Thinking

How can one assess, for purposes of instruction or research, the degree to which a person possesses the dispositions, skills and knowledge of a critical thinker?

In psychometrics, assessment instruments are judged according to their validity and reliability.

Roughly speaking, an instrument is valid if it measures accurately what it purports to measure, given standard conditions. More precisely, the degree of validity is “the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests” (American Educational Research Association 2014: 11). In other words, a test is not valid or invalid in itself. Rather, validity is a property of an interpretation of a given score on a given test for a specified use. Determining the degree of validity of such an interpretation requires collection and integration of the relevant evidence, which may be based on test content, test takers’ response processes, a test’s internal structure, relationship of test scores to other variables, and consequences of the interpretation (American Educational Research Association 2014: 13–21). Criterion-related evidence consists of correlations between scores on the test and performance on another test of the same construct; its weight depends on how well supported is the assumption that the other test can be used as a criterion. Content-related evidence is evidence that the test covers the full range of abilities that it claims to test. Construct-related evidence is evidence that a correct answer reflects good performance of the kind being measured and an incorrect answer reflects poor performance.

An instrument is reliable if it consistently produces the same result, whether across different forms of the same test (parallel-forms reliability), across different items (internal consistency), across different administrations to the same person (test-retest reliability), or across ratings of the same answer by different people (inter-rater reliability). Internal consistency should be expected only if the instrument purports to measure a single undifferentiated construct, and thus should not be expected of a test that measures a suite of critical thinking dispositions or critical thinking abilities, assuming that some people are better in some of the respects measured than in others (for example, very willing to inquire but rather closed-minded). Otherwise, reliability is a necessary but not a sufficient condition of validity; a standard example of a reliable instrument that is not valid is a bathroom scale that consistently under-reports a person’s weight.
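To make these reliability notions concrete, the sketch below computes two of them on synthetic data: test-retest reliability as the Pearson correlation between two administrations of the same test, and internal consistency as Cronbach's alpha over an item-score matrix. The data and function names are invented for illustration; none of this is drawn from the instruments discussed in this supplement.

```python
import numpy as np

def test_retest_reliability(scores_t1, scores_t2):
    """Pearson correlation between two administrations to the same people."""
    return np.corrcoef(scores_t1, scores_t2)[0, 1]

def cronbach_alpha(item_scores):
    """Internal consistency; item_scores is an (n_people, n_items) matrix."""
    x = np.asarray(item_scores, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1).sum()
    total_variance = x.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Synthetic example: 50 test takers, 4 items, two administrations.
rng = np.random.default_rng(0)
ability = rng.normal(size=50)
t1 = ability[:, None] + rng.normal(scale=0.5, size=(50, 4))
t2 = ability[:, None] + rng.normal(scale=0.5, size=(50, 4))

print("Cronbach's alpha:", round(cronbach_alpha(t1), 2))
print("Test-retest r:", round(test_retest_reliability(t1.sum(axis=1), t2.sum(axis=1)), 2))
```

As the text notes, a high alpha is only to be expected when the instrument measures a single undifferentiated construct; for a multi-dimensional dispositions inventory, modest internal consistency across sub-scales is not by itself evidence against validity.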

Assessing dispositions is difficult if one uses a multiple-choice format with known adverse consequences of a low score. It is pretty easy to tell what answer to the question “How open-minded are you?” will get the highest score and to give that answer, even if one knows that the answer is incorrect. If an item probes less directly for a critical thinking disposition, for example by asking how often the test taker pays close attention to views with which the test taker disagrees, the answer may differ from reality because of self-deception or simple lack of awareness of one’s personal thinking style, and its interpretation is problematic, even if factor analysis enables one to identify a distinct factor measured by a group of questions that includes this one (Ennis 1996). Nevertheless, Facione, Sánchez, and Facione (1994) used this approach to develop the California Critical Thinking Dispositions Inventory (CCTDI). They began with 225 statements expressive of a disposition towards or away from critical thinking (using the long list of dispositions in Facione 1990a), validated the statements with talk-aloud and conversational strategies in focus groups to determine whether people in the target population understood the items in the way intended, administered a pilot version of the test with 150 items, and eliminated items that failed to discriminate among test takers or were inversely correlated with overall results or added little refinement to overall scores (Facione 2000). They used item analysis and factor analysis to group the measured dispositions into seven broad constructs: open-mindedness, analyticity, cognitive maturity, truth-seeking, systematicity, inquisitiveness, and self-confidence (Facione, Sánchez, and Facione 1994). The resulting test consists of 75 agree-disagree statements and takes 20 minutes to administer. A repeated disturbing finding is that North American students taking the test tend to score low on the truth-seeking sub-scale (on which a low score results from agreeing to such statements as the following: “To get people to agree with me I would give any reason that worked”. “Everyone always argues from their own self-interest, including me”. “If there are four reasons in favor and one against, I’ll go with the four”.) Development of the CCTDI made it possible to test whether good critical thinking abilities and good critical thinking dispositions go together, in which case it might be enough to teach one without the other. Facione (2000) reports that administration of the CCTDI and the California Critical Thinking Skills Test (CCTST) to almost 8,000 post-secondary students in the United States revealed a statistically significant but weak correlation between total scores on the two tests, and also between paired sub-scores from the two tests. The implication is that both abilities and dispositions need to be taught, that one cannot expect improvement in one to bring with it improvement in the other.
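The item-screening step Facione (2000) describes (dropping items that fail to discriminate or that correlate inversely with overall results) is an instance of standard item analysis. Below is a minimal sketch of the idea, with an invented cut-off rather than the developers' actual criteria:

```python
import numpy as np

def corrected_item_total_correlations(item_scores):
    """Correlate each item with the total of the remaining items."""
    x = np.asarray(item_scores, dtype=float)
    total = x.sum(axis=1)
    corrs = []
    for j in range(x.shape[1]):
        rest = total - x[:, j]  # total score excluding item j
        corrs.append(np.corrcoef(x[:, j], rest)[0, 1])
    return np.array(corrs)

def weak_items(item_scores, min_r=0.2):
    """Indices of items that discriminate poorly or run against the scale."""
    r = corrected_item_total_correlations(item_scores)
    return np.flatnonzero(r < min_r)
```

Items flagged this way would be candidates for elimination or revision in the next pilot round.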

A more direct way of assessing critical thinking dispositions would be to see what people do when put in a situation where the dispositions would reveal themselves. Ennis (1996) reports promising initial work with guided open-ended opportunities to give evidence of dispositions, but no standardized test seems to have emerged from this work. There are however standardized aspect-specific tests of critical thinking dispositions. The Critical Problem Solving Scale (Berman et al. 2001: 518) takes as a measure of the disposition to suspend judgment the number of distinct good aspects attributed to an option judged to be the worst among those generated by the test taker. Stanovich, West and Toplak (2011: 800–810) list tests developed by cognitive psychologists of the following dispositions: resistance to miserly information processing, resistance to myside thinking, absence of irrelevant context effects in decision-making, actively open-minded thinking, valuing reason and truth, tendency to seek information, objective reasoning style, tendency to seek consistency, sense of self-efficacy, prudent discounting of the future, self-control skills, and emotional regulation.

It is easier to measure critical thinking skills or abilities than to measure dispositions. The following currently available standardized tests purport to measure them: the Watson-Glaser Critical Thinking Appraisal (Watson & Glaser 1980a, 1980b, 1994), the Cornell Critical Thinking Tests Level X and Level Z (Ennis & Millman 1971; Ennis, Millman, & Tomko 1985, 2005), the Ennis-Weir Critical Thinking Essay Test (Ennis & Weir 1985), the California Critical Thinking Skills Test (Facione 1990b, 1992), the Halpern Critical Thinking Assessment (Halpern 2016), the Critical Thinking Assessment Test (Center for Assessment & Improvement of Learning 2017), the Collegiate Learning Assessment (Council for Aid to Education 2017), the HEIghten Critical Thinking Assessment (https://territorium.com/heighten/), and a suite of critical thinking assessments for different groups and purposes offered by Insight Assessment (https://www.insightassessment.com/products). The Critical Thinking Assessment Test (CAT) is unique among them in being designed for use by college faculty to help them improve their development of students' critical thinking skills (Haynes et al. 2015; Haynes & Stein 2021). Also, for some years the United Kingdom body OCR (Oxford Cambridge and RSA Examinations) awarded AS and A Level certificates in critical thinking on the basis of an examination (OCR 2011). Many of these standardized tests have received scholarly evaluations at the hands of, among others, Ennis (1958), McPeck (1981), Norris and Ennis (1989), Fisher and Scriven (1997), Possin (2008, 2013a, 2013b, 2013c, 2014, 2020) and Hatcher and Possin (2021). Their evaluations provide a useful set of criteria that such tests ideally should meet, as does the description by Ennis (1984) of problems in testing for competence in critical thinking: the soundness of multiple-choice items, the clarity and soundness of instructions to test takers, the information and mental processing used in selecting an answer to a multiple-choice item, the role of background beliefs and ideological commitments in selecting an answer to a multiple-choice item, the tenability of a test's underlying conception of critical thinking and its component abilities, the set of abilities that the test manual claims are covered by the test, the extent to which the test actually covers these abilities, the appropriateness of the weighting given to various abilities in the scoring system, the accuracy and intellectual honesty of the test manual, the interest of the test to the target population of test takers, the scope for guessing, the scope for choosing a keyed answer by being test-wise, precautions against cheating in the administration of the test, clarity and soundness of materials for training essay graders, inter-rater reliability in grading essays, and clarity and soundness of advance guidance to test takers on what is required in an essay. Rear (2019) has challenged the use of standardized tests of critical thinking as a way to measure educational outcomes, on the grounds that they (1) fail to take into account disputes about conceptions of critical thinking, (2) are not completely valid or reliable, and (3) fail to evaluate skills used in real academic tasks. He proposes instead assessments based on discipline-specific content.

There are also aspect-specific standardized tests of critical thinking abilities. Stanovich, West and Toplak (2011: 800–810) list tests of probabilistic reasoning, insights into qualitative decision theory, knowledge of scientific reasoning, knowledge of rules of logical consistency and validity, and economic thinking. They also list instruments that probe for irrational thinking, such as superstitious thinking, belief in the superiority of intuition, over-reliance on folk wisdom and folk psychology, belief in “special” expertise, financial misconceptions, overestimation of one’s introspective powers, dysfunctional beliefs, and a notion of self that encourages egocentric processing. They regard these tests along with the previously mentioned tests of critical thinking dispositions as the building blocks for a comprehensive test of rationality, whose development (they write) may be logistically difficult and would require millions of dollars.

A superb example of assessment of an aspect of critical thinking ability is the Test on Appraising Observations (Norris & King 1983, 1985, 1990a, 1990b), which was designed for classroom administration to senior high school students. The test focuses entirely on the ability to appraise observation statements and in particular on the ability to determine in a specified context which of two statements there is more reason to believe. According to the test manual (Norris & King 1985, 1990b), a person’s score on the multiple-choice version of the test, which is the number of items that are answered correctly, can justifiably be given either a criterion-referenced or a norm-referenced interpretation.

On a criterion-referenced interpretation, those who do well on the test have a firm grasp of the principles for appraising observation statements, and those who do poorly have a weak grasp of them. This interpretation can be justified by the content of the test and the way it was developed, which incorporated a method of controlling for background beliefs articulated and defended by Norris (1985). Norris and King synthesized from judicial practice, psychological research and common-sense psychology 31 principles for appraising observation statements, in the form of empirical generalizations about tendencies, such as the principle that observation statements tend to be more believable than inferences based on them (Norris & King 1984). They constructed items in which exactly one of the 31 principles determined which of two statements was more believable. Using a carefully constructed protocol, they interviewed about 100 students who responded to these items in order to determine the thinking that led them to choose the answers they did (Norris & King 1984). In several iterations of the test, they adjusted items so that selection of the correct answer generally reflected good thinking and selection of an incorrect answer reflected poor thinking. Thus they have good evidence that good performance on the test is due to good thinking about observation statements and that poor performance is due to poor thinking about observation statements. Collectively, the 50 items on the final version of the test require application of 29 of the 31 principles for appraising observation statements, with 13 principles tested by one item, 12 by two items, three by three items, and one by four items. Thus there is comprehensive coverage of the principles for appraising observation statements. Fisher and Scriven (1997: 135–136) judge the items to be well worked and sound, with one exception. The test is clearly written at a grade 6 reading level, meaning that poor performance cannot be attributed to difficulties in reading comprehension by the intended adolescent test takers. The stories that frame the items are realistic, and are engaging enough to stimulate test takers’ interest. Thus the most plausible explanation of a given score on the test is that it reflects roughly the degree to which the test taker can apply principles for appraising observations in real situations. In other words, there is good justification of the proposed interpretation that those who do well on the test have a firm grasp of the principles for appraising observation statements and those who do poorly have a weak grasp of them.

To get norms for performance on the test, Norris and King arranged for seven groups of high school students in different types of communities and with different levels of academic ability to take the test. The test manual includes percentiles, means, and standard deviations for each of these seven groups. These norms allow teachers to compare the performance of their class on the test to that of a similar group of students.
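Mechanically, a norm-referenced interpretation locates a new score as a percentile within a reference group's distribution. The sketch below illustrates this with invented scores; it is not Norris and King's norming data.

```python
import numpy as np

def percentile_rank(reference_scores, score):
    """Percent of the reference group scoring at or below `score`."""
    ref = np.asarray(reference_scores)
    return 100.0 * np.mean(ref <= score)

# Hypothetical reference group (one community type / ability level).
group = np.array([18, 22, 25, 27, 29, 30, 31, 33, 35, 38, 40, 42])
print(percentile_rank(group, 33))  # about 66.7
```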

Copyright © 2022 by David Hitchcock


Yes, We Can Define, Teach, and Assess Critical Thinking Skills


Jeff Heyck-Williams (He, His, Him) Director of the Two Rivers Learning Institute in Washington, DC


Today’s learners face an uncertain present and a rapidly changing future that demand far different skills and knowledge than were needed in the 20th century. We also know so much more about enabling deep, powerful learning than we ever did before. Our collective future depends on how well young people prepare for the challenges and opportunities of 21st-century life.


While the idea of teaching critical thinking has been bandied about in education circles since at least the time of John Dewey, it has taken on greater prominence in education debates with the advent of the term "21st-century skills" and discussions of deeper learning. There is increasing agreement among education reformers that critical thinking is an essential ingredient for long-term success for all of our students.

However, there are still those in the education establishment and in the media who argue that critical thinking isn’t really a thing, or that these skills aren’t well defined and, even if they could be defined, they can’t be taught or assessed.

I have to disagree with those naysayers: critical thinking is a thing. We can define it; we can teach it; and we can assess it. In fact, as part of a multi-year Assessment for Learning Project, Two Rivers Public Charter School in Washington, D.C., has done just that.

Before I dive into what we have done, I want to acknowledge that some of the criticism has merit.

First, there are those who argue that critical thinking can only exist when students have a vast fund of knowledge, meaning that a student cannot think critically without something substantive to think about. I agree. Students do need a robust foundation of core content knowledge to think critically. Schools still have a responsibility for building students' content knowledge.

However, I would argue that students don’t need to wait to think critically until after they have mastered some arbitrary amount of knowledge. They can start building critical thinking skills when they walk in the door. All students come to school with experience and knowledge which they can immediately think critically about. In fact, some of the thinking that they learn to do helps augment and solidify the discipline-specific academic knowledge that they are learning.

The second criticism is that critical thinking skills are always highly contextual. In this argument, the critics make the point that the types of thinking that students do in history are categorically different from the types of thinking students do in science or math. Thus, the idea of teaching broadly defined, content-neutral critical thinking skills is impossible. I agree that there are domain-specific thinking skills that students should learn in each discipline. However, I also believe that there are several generalizable skills that elementary school students can learn that have broad applicability to their academic and social lives. That is what we have done at Two Rivers.

Defining Critical Thinking Skills

We began this work by first defining what we mean by critical thinking. After reviewing the literature and looking at practice at other schools, we identified five constructs that encompass a set of broadly applicable skills: schema development and activation; effective reasoning; creativity and innovation; problem solving; and decision making.


We then created rubrics to provide a concrete vision of what each of these constructs looks like in practice. Working with the Stanford Center for Assessment, Learning and Equity (SCALE), we refined these rubrics to capture clear and discrete skills.

For example, we defined effective reasoning as the skill of creating an evidence-based claim: students need to construct a claim, identify relevant support, link their support to their claim, and identify possible questions or counterclaims. Rubrics provide an explicit vision of the skill of effective reasoning for students and teachers. By breaking the rubrics down for different grade bands, we have been able not only to describe what reasoning is but also to delineate how the skills develop in students from preschool through 8th grade.


Before moving on, I want to freely acknowledge that in narrowly defining reasoning as the construction of evidence-based claims we have disregarded some elements of reasoning that students can and should learn. For example, the difference between constructing claims through deductive versus inductive means is not highlighted in our definition. However, by privileging a definition that has broad applicability across disciplines, we are able to gain traction in developing the roots of critical thinking: in this case, the ability to formulate well-supported claims or arguments.

Teaching Critical Thinking Skills

The definitions of critical thinking constructs were only useful to us insofar as they translated into practical skills that teachers could teach and students could learn and use. Consequently, we found that to teach a set of cognitive skills, we needed thinking routines that defined the regular application of these critical thinking and problem-solving skills across domains. Building on Harvard's Project Zero Visible Thinking work, we have named routines aligned with each of our constructs.

For example, with the construct of effective reasoning, we aligned the Claim-Support-Question thinking routine to our rubric. Teachers then were able to teach students that whenever they were making an argument, the norm in the class was to use the routine in constructing their claim and support. The flexibility of the routine has allowed us to apply it from preschool through 8th grade and across disciplines from science to economics and from math to literacy.


Kathryn Mancino, a 5th grade teacher at Two Rivers, has deliberately taught three of our thinking routines to students using anchor charts. Her charts name the components of each routine and have a place for students to record when they've used it and what they have figured out about the routine. By using this structure with a chart that can be added to throughout the year, students see the routines as broadly applicable across disciplines and are able to refine their application over time.

Assessing Critical Thinking Skills

By defining specific constructs of critical thinking and building thinking routines that support their implementation in classrooms, we have operated under the assumption that students are developing skills that they will be able to transfer to other settings. However, we recognized both the importance and the challenge of gathering reliable data to confirm this.

With this in mind, we have developed a series of short performance tasks around novel discipline-neutral contexts in which students can apply the constructs of thinking. Through these tasks, we have been able to provide an opportunity for students to demonstrate their ability to transfer the types of thinking beyond the original classroom setting. Once again, we have worked with SCALE to define tasks where students easily access the content but where the cognitive lift requires them to demonstrate their thinking abilities.

These assessments demonstrate that it is possible to capture meaningful data on students' critical thinking abilities. They are not intended to be high-stakes accountability measures. Instead, they are designed to give students, teachers, and school leaders discrete formative data on hard-to-measure skills.

While it is clearly difficult, and we have not solved all of the challenges to scaling assessments of critical thinking, we can define, teach, and assess these skills. In fact, knowing how important they are for the economy of the future and our democracy, it is essential that we do.

Jeff Heyck-Williams (He, His, Him)

Director of the Two Rivers Learning Institute

Jeff Heyck-Williams is the director of the Two Rivers Learning Institute and a founder of Two Rivers Public Charter School. He has led work around creating school-wide cultures of mathematics, developing assessments of critical thinking and problem-solving, and supporting project-based learning.


Critical Thinking Skills Toolbox

CTS Tools for Faculty and Student Assessment


A number of critical thinking skills inventories and measures have been developed:

  • Watson-Glaser Critical Thinking Appraisal (WGCTA)
  • Cornell Critical Thinking Test (CCTT)
  • California Critical Thinking Disposition Inventory (CCTDI)
  • California Critical Thinking Skills Test (CCTST)
  • Health Science Reasoning Test (HSRT)
  • Professional Judgment Rating Form (PJRF)
  • Teaching for Thinking Student Course Evaluation Form
  • Holistic Critical Thinking Scoring Rubric
  • Peer Evaluation of Group Presentation Form

With the exception of the Watson-Glaser Critical Thinking Appraisal and the Cornell Critical Thinking Test, the critical thinking skills instruments listed above were developed by Facione and Facione. However, it is important to point out that all of these measures are of questionable utility for dental educators because their content is general rather than specific to dental education. (See Critical Thinking and Assessment.)

Table 7. Purposes of Critical Thinking Skills Instruments

Reliability and Validity

Reliability means that individual scores from an instrument should be the same or nearly the same from one administration of the instrument to another; the instrument can then be assumed to be free of bias and measurement error (68). Alpha coefficients are often used to report an estimate of internal consistency. Coefficients of .70 or higher indicate that the instrument has high reliability when the stakes are moderate; coefficients of .80 and higher are appropriate when the stakes are high.

Validity means that individual scores from a particular instrument are meaningful, make sense, and allow researchers to draw conclusions from the sample to the population being studied (69). Researchers often refer to "content" or "face" validity: the extent to which questions on an instrument are representative of the possible questions that a researcher could ask about that particular content or skills.

Watson-Glaser Critical Thinking Appraisal-FS (WGCTA-FS)

The WGCTA-FS is a 40-item inventory created to replace Forms A and B of the original test, which participants reported were too long (70). This inventory assesses test takers' skills in:

     (a) Inference: whether an individual can judge the degree of truth or falsity of inferences drawn from given data
     (b) Recognition of assumptions: whether an individual recognizes unstated assumptions in given statements
     (c) Deduction: whether an individual decides if certain conclusions follow from the information provided
     (d) Interpretation: whether an individual considers the evidence provided and determines whether generalizations from the data are warranted
     (e) Evaluation of arguments: whether an individual distinguishes strong and relevant arguments from weak and irrelevant arguments

Researchers investigated the reliability and validity of the WGCTA-FS for subjects in academic fields. Participants included 586 university students. Internal consistencies for the total WGCTA-FS among students majoring in psychology, educational psychology, and special education, including undergraduates and graduates, ranged from .74 to .92. The correlations between course grades and total WGCTA-FS scores for all groups ranged from .24 to .62 and were significant at the p < .05 or p < .01 level. In addition, internal consistency and test-retest reliability for the WGCTA-FS have been measured at .81. The WGCTA-FS was found to be a reliable and valid instrument for measuring critical thinking (71).

Cornell Critical Thinking Test (CCTT)

There are two forms of the CCTT, X and Z. Form X is for students in grades 4-14. Form Z is for advanced and gifted high school students, undergraduate and graduate students, and adults. Reliability estimates for Form Z range from .49 to .87 across the 42 groups who have been tested. Measures of validity were computed in standard conditions, roughly defined as conditions that do not adversely affect test performance. Correlations between Level Z and other measures of critical thinking are about .50 (72). The CCTT is reportedly as predictive of graduate school grades as the Graduate Record Exam (GRE), a measure of aptitude, and the Miller Analogies Test, correlating with grades between .2 and .4 (73).

California Critical Thinking Disposition Inventory (CCTDI)

Facione and Facione have reported significant relationships between the CCTDI and the CCTST. When faculty focus on critical thinking in planning curriculum development, modest cross-sectional and longitudinal gains have been demonstrated in students' CTS (74). The CCTDI consists of seven subscales and an overall score. The recommended cut-off score for each scale is 40, the suggested target score is 50, and the maximum score is 60. Scores below 40 on a specific scale indicate weakness in that CT disposition, and scores above 50 on a scale indicate strength in that dispositional aspect. An overall score below 280 shows serious deficiency in disposition toward CT, while an overall score of 350 or above (while rare) shows across-the-board strength. The seven subscales are analyticity, self-confidence, inquisitiveness, maturity, open-mindedness, systematicity, and truth-seeking (75).

In a study of instructional strategies and their influence on the development of critical thinking among undergraduate nursing students, Tiwari, Lai, and Yuen found that, compared with lecture students, PBL students showed significantly greater improvement in the overall CCTDI (p = .0048), Truth-seeking (p = .0008), Analyticity (p = .0368), and Critical Thinking Self-confidence (p = .0342) subscales from the first to the second time points; in the overall CCTDI (p = .0083), Truth-seeking (p = .0090), and Analyticity (p = .0354) subscales from the second to the third time points; and in the Truth-seeking (p = .0173) and Systematicity (p = .0440) subscale scores from the first to the fourth time points (76).

California Critical Thinking Skills Test (CCTST)

Studies have shown the California Critical Thinking Skills Test captures gain scores in students' critical thinking over one quarter or one semester. Multiple health science programs have demonstrated significant gains in students' critical thinking using site-specific curricula. Studies conducted to control for re-test bias showed no testing effect from pre- to post-test means using two independent groups of CT students. Since behavioral science measures can be affected by social-desirability bias (the participant's desire to answer in ways that would please the researcher), researchers are urged to have participants take the Marlowe-Crowne Social Desirability Scale at the same time when measuring pre- and post-test changes in critical thinking skills. The CCTST is a 34-item instrument. The test has been correlated with the CCTDI in a sample of 1,557 nursing education students; the correlation, r = .201, is significant at p < .001. Significant relationships between the CCTST and other measures, including the GRE total, GRE-Analytic, GRE-Verbal, GRE-Quantitative, the WGCTA, and the SAT Math and Verbal, have also been reported. The two forms of the CCTST, A and B, are considered statistically equivalent. Depending on the testing context, KR-20 alphas range from .70 to .75. The newest version is CCTST Form 2000; depending on the testing context, its KR-20 alphas range from .78 to .84 (77).
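The KR-20 coefficient cited here is the special case of coefficient alpha for dichotomously scored (right/wrong) items. Below is a minimal sketch on invented 0/1 response data, not actual CCTST responses:

```python
import numpy as np

def kr20(responses):
    """Kuder-Richardson formula 20 for an (n_people, n_items) 0/1 matrix."""
    x = np.asarray(responses, dtype=float)
    k = x.shape[1]
    p = x.mean(axis=0)  # proportion answering each item correctly
    q = 1.0 - p
    total_variance = x.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - (p * q).sum() / total_variance)

# Invented responses: 10 test takers, 5 items.
responses = np.array([
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
])
print(round(kr20(responses), 2))
```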

The Health Science Reasoning Test (HSRT)

Items within this inventory cover the domain of CT cognitive skills identified by a Delphi group of experts whose work resulted in the development of the CCTDI and CCTST. This test measures health science undergraduate and graduate students' CTS. Although test items are set in health sciences and clinical practice contexts, test takers are not required to have discipline-specific health sciences knowledge. For this reason, the test may have limited utility in dental education (78).

Preliminary estimates of internal consistency show that overall KR-20 coefficients range from .77 to .83 (79). The instrument has moderate reliability on the analysis and inference subscales, although the factor loadings appear adequate. The low KR-20 coefficients may be a result of small sample size, variance in item response, or both (see Table 8).

Table 8. Estimates of Internal Consistency and Factor Loading by Subscale for HSRT

Professional Judgment Rating Form (PJRF)

The scale consists of two sets of descriptors. The first set relates primarily to the attitudinal (habits of mind) dimension of CT. The second set relates primarily to CTS.

A single rater should know the student well enough to respond to at least 17 of the 20 descriptors with confidence. If not, the validity of the ratings may be questionable. If a single rater is used and ratings over time show some consistency, comparisons between ratings may be used to assess changes. If more than one rater is used, then inter-rater reliability must be established among the raters to yield meaningful results. While the PJRF can be used to assess the effectiveness of training programs for individuals or groups, participants' actual skills are best measured by an objective tool such as the California Critical Thinking Skills Test.
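Where more than one rater is used, inter-rater reliability for categorical ratings such as the PJRF descriptors is commonly summarized with Cohen's kappa, which corrects raw agreement for chance. The following is a generic sketch with invented ratings, not part of the published PJRF materials:

```python
import numpy as np

def cohens_kappa(rater_a, rater_b, n_categories):
    """Chance-corrected agreement between two raters' category labels."""
    a = np.asarray(rater_a)
    b = np.asarray(rater_b)
    observed = np.mean(a == b)
    # Expected chance agreement from each rater's marginal frequencies.
    pa = np.bincount(a, minlength=n_categories) / len(a)
    pb = np.bincount(b, minlength=n_categories) / len(b)
    expected = np.sum(pa * pb)
    return (observed - expected) / (1.0 - expected)

# Two raters scoring 10 students on a 0-3 descriptor scale.
rater_1 = [3, 2, 2, 1, 0, 3, 2, 1, 1, 2]
rater_2 = [3, 2, 1, 1, 0, 3, 2, 2, 1, 2]
print(round(cohens_kappa(rater_1, rater_2, n_categories=4), 2))
```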

Teaching for Thinking Student Course Evaluation Form

Course evaluations typically ask for responses of "agree" or "disagree" to items focusing on teacher behavior. Typically the questions do not solicit information about student learning. Because contemporary thinking about curriculum is interested in student learning, this form was developed to address the differences in pedagogy, subject matter, learning outcomes, student demographics, and course level characteristic of education today. The form also grew out of a recognition of the limitations of a "one size fits all" approach to teaching evaluations. It offers information about how a particular course enhances student knowledge, sensitivities, and dispositions, and it gives students an opportunity to provide feedback that can be used to improve instruction.

Holistic Critical Thinking Scoring Rubric

This assessment tool uses a four-point classification schema that lists particular opposing reasoning skills for select criteria. One advantage of a rubric is that it offers clearly delineated components and scales for evaluating outcomes. This rubric explains how students' CTS will be evaluated, and it provides a consistent framework for the professor as evaluator. Users can add or delete any of the statements to reflect their institution's effort to measure CT. Like most rubrics, this form is likely to have high face validity, since the items tend to be relevant to or descriptive of the target concept. The rubric can be used to rate student work or to assess learning outcomes. Experienced evaluators should engage in a process leading to consensus regarding what kinds of things should be classified and in what ways (80). If used improperly or by inexperienced evaluators, it may produce unreliable results.

Peer Evaluation of Group Presentation Form

This form offers a common set of criteria to be used by peers and the instructor to evaluate student-led group presentations regarding concepts, analysis of arguments or positions, and conclusions (81). Users have an opportunity to rate the degree to which each component was demonstrated. Open-ended questions give users an opportunity to cite examples of how concepts, the analysis of arguments or positions, and conclusions were demonstrated.

Table 9. Proposed Universal Criteria for Evaluating Students' Critical Thinking Skills

Aside from the use of the above-mentioned assessment tools, Dexter et al. recommended that all schools develop universal criteria for evaluating students' development of critical thinking skills (82).

Their rationale for the proposed criteria is that if faculty give feedback using these criteria, graduates will internalize these skills and use them to monitor their own thinking and practice (see Table 9).


Royal Society of Chemistry

Development and validation of an instrument to measure undergraduate chemistry students' critical thinking skills


First published on 12th July 2019

The importance of developing and assessing student critical thinking at university can be seen through its inclusion as a graduate attribute for universities and from research highlighting the value employers, educators and students place on demonstrating critical thinking skills. Critical thinking skills are seldom explicitly assessed at universities. Commercial critical thinking assessments, which are often generic in context, are available. However, literature suggests that assessments that use a context relevant to the students more accurately reflect their critical thinking skills. This paper describes the development and evaluation of a chemistry critical thinking test (the Danczak–Overton–Thompson Chemistry Critical Thinking Test or DOT test), set in a chemistry context, and designed to be administered to undergraduate chemistry students at any level of study. Development and evaluation occurred over three versions of the DOT test through a variety of quantitative and qualitative reliability and validity testing phases. The studies suggest that the final version of the DOT test has good internal reliability, strong test–retest reliability, moderate convergent validity relative to a commercially available test and is independent of previous academic achievement and university of study. Criterion validity testing revealed that third year students performed statistically significantly better on the DOT test relative to first year students, and postgraduates and academics performed statistically significantly better than third year students. The statistical and qualitative analysis indicates that the DOT test is a suitable instrument for the chemistry education community to use to measure the development of undergraduate chemistry students’ critical thinking skills.

Introduction

A survey of 167 recent science graduates compared the development of a variety of skills at university to the skills used in the workplace ( Sarkar et al. , 2016 ). It found that 30% of graduates in full-time positions identified critical thinking as one of the top five skills they would like to have developed further within their undergraduate studies. Students, governments and employers all recognise that not only is developing students’ critical thinking an intrinsic good, but that it better prepares them to meet and exceed employer expectations when making decisions, solving problems and reflecting on their own performance ( Lindsay, 2015 ). Hence, it has become somewhat of an expectation from governments, employers and students that it is the responsibility of higher education providers to develop students’ critical thinking skills. Yet, despite the clear need to develop these skills, measuring student attainment of critical thinking is challenging.

The definition of critical thinking

Cognitive psychologists and education researchers use the term critical thinking to describe a set of cognitive skills, strategies or behaviours that increase the likelihood of a desired outcome ( Halpern, 1996b ; Tiruneh et al. , 2014 ). Psychologists typically investigate critical thinking experimentally and have developed a series of reasoning schemas with which to study and define critical thinking: conditional reasoning, statistical reasoning, methodological reasoning and verbal reasoning ( Nisbett et al. , 1987 ; Lehman and Nisbett, 1990 ). Halpern (1993) expanded on these schemas to define critical thinking as the thinking required to solve problems, formulate inferences, calculate likelihoods and make decisions.

In education research there is often an emphasis on critical thinking as a skill set ( Bailin, 2002 ) or putting critical thought into tangible action ( Barnett, 1997 ). Dressel and Mayhew (1954) suggested it is educationally useful to define critical thinking as the sum of specific behaviours which could be observed from student acts. They identify these critical thinking abilities as identifying central issues, recognising underlying assumptions, evaluating evidence or authority, and drawing warranted conclusions. Bailin (2002) raises the point that from a pedagogical perspective many of the skills or dispositions commonly used to define critical thinking are difficult to observe and, therefore, difficult to assess. Consequently, Bailin suggests that the concept of critical thinking should explicitly focus on adherence to criteria and standards to reflect ‘good’ critical thinking ( Bailin, 2002, p. 368 ).

It appears that there are several definitions of critical thinking, each of comparable value ( Moore, 2013 ). There is agreement across much of the field that meta-cognitive skills, such as self-evaluation, are essential to a well-rounded process of critical thinking ( Glaser, 1984 ; Kuhn, 1999 ; Pithers and Soden, 2000 ). Key themes, such as critical thinking as judgement, as scepticism, as originality, as sensitive reading, or as rationality, can be identified across the literature. In the context of developing an individual's critical thinking it is important that these themes take the form of observable behaviours.

Developing students’ critical thinking

In the latter half of the 20th century, informal logic gained academic credence as it challenged the idea that logic was purely a matter of deduction and inference, arguing that there were, in fact, theories of argumentation and logical fallacies ( Johnson et al. , 1996 ). These theories began to be taught at universities as standalone courses, free from any context, in efforts to teach the structure of arguments and the recognition of fallacies using abstract theories and symbolism. Cognitive psychology research lent evidence to the argument that critical thinking could be developed within a specific discipline and that those reasoning skills were, at least to some degree, transferable to situations encountered in daily life ( Lehman et al. , 1988 ; Lehman and Nisbett, 1990 ). These perspectives form the basis of the subject-generalist position, whose proponents believe critical thinking can be developed independent of subject-specific knowledge.

McMillan (1987) carried out a review of 27 empirical studies conducted at higher education institutions where critical thinking was taught, either in standalone courses or integrated into discipline-specific courses such as science. The review found that standalone and integrated courses were equally successful in developing critical thinking, provided critical thinking developmental goals were made explicit to the students. The review also suggested that the development of critical thinking was most effective when its principles were taught across a variety of discipline areas so as to make knowledge retrieval easier.

Ennis (1989) suggested that there are a range of approaches through which critical thinking can be taught: general, where critical thinking is taught separate from content or ‘discipline’; infusion, where the subject matter is covered in great depth and teaching of critical thinking is explicit; immersion, where the subject matter is covered in great depth but critical thinking goals are implicit; and mixed, a combination of the general approach with either the infusion or immersion approach. Ennis (1990) arrived at a pragmatic view to concede that the best critical thinking occurs within one's area of expertise, or domain specificity, but that critical thinking can still be effectively developed with or without discipline specific knowledge ( McMillan, 1987 ; Ennis, 1990 ).

Many scholars still remain entrenched in the debate regarding the role discipline-specific knowledge has in the development of critical thinking. For example, Moore (2011) rejected the use of critical thinking as a catch-all term to describe a range of cognitive skills, believing that to teach critical thinking as a set of generalisable skills is insufficient to provide students with an adequate foundation for the breadth of problems they will encounter throughout their studies. Conversely, Davies (2013) accepts that critical thinking skills share fundamentals at the basis of all disciplines and that there can be a need to accommodate the discipline-specific needs ‘higher up’ in tertiary education via the infusion approach. However, Davies considers the specifist approach to developing critical thinking ‘dangerous and wrong-headed’ ( Davies, 2013, p. 543 ), citing government reports and primary literature which demonstrate tertiary students’ inability to identify elements of arguments, and championing the need for standalone critical thinking courses.

Pedagogical approaches to developing critical thinking in chemistry in higher education range from writing exercises ( Oliver-Hoyo, 2003 ; Martineau and Boisvert, 2011 ; Stephenson and Sadler-Mcknight, 2016 ), inquiry-based projects ( Gupta et al. , 2015 ), flipped lectures ( Flynn, 2011 ) and open-ended practicals ( Klein and Carney, 2014 ) to gamification ( Henderson, 2010 ), and work integrated learning (WIL) ( Edwards et al. , 2015 ). Researchers have demonstrated the benefits of developing critical thinking skills across all first, second and third year programs of an undergraduate degree ( Phillips and Bond, 2004 ; Iwaoka et al. , 2010 ). Phillips and Bond (2004) indicated that such interventions help develop a culture of inquiry, and better prepare students for employment.

Some studies demonstrate the outcomes of teaching interventions via validated commercially available critical thinking tests, available from a variety of vendors for a fee ( Abrami et al. , 2008 ; Tiruneh et al. , 2014 ; Abrami et al. , 2015 ; Carter et al. , 2015 ). There are arguments against the generalisability of these commercially available tests. Many academics believe assessments need to closely align with the intervention(s) ( Ennis, 1993 ) and a more accurate representation of student ability is obtained when a critical thinking assessment is related to a students’ discipline, as they attach greater significance to the assessment ( Halpern, 1998 ).

Review of commercial critical thinking assessment tools

Several reviews of empirical studies suggest that the WGCTA is the most prominent test in use ( Behar-Horenstein and Niu, 2011 ; Carter et al. , 2015 ; Huber and Kuncel, 2016 ). However, the CCTST was developed much later than the WGCTA and recent trends suggest the CCTST has gained popularity amongst researchers since its inception. Typically, the tests are administered to address questions regarding the development of critical thinking over time or the effect of a teaching intervention. The results of this testing are inconsistent; some studies report significant changes while others report no significant changes in critical thinking ( Behar-Horenstein and Niu, 2011 ). For example, Carter et al. (2015) found studies which used the CCTST or the WGCTA did not all support the hypothesis of improved critical thinking with time, with some studies reporting increases, and some studies reporting decreases or no change over time. These reviews highlight the importance of experimental design when evaluating critical thinking. McMillan (1987) reviewed 27 studies and found that only seven of them demonstrated significant changes in critical thinking. He concluded that tests which were designed by the researcher are a better measure of critical thinking, as they specifically address the desired critical thinking learning outcomes, as opposed to commercially available tools which attempt to measure critical thinking as a broad and generalised construct.

The need for a contextualised chemistry critical thinking test

Several examples of chemistry-specific critical thinking tests and teaching tools were found in the literature. However, while all of these tests and teaching activities were set within a chemistry context, they required discipline-specific knowledge and/or were not suitable for very large cohorts of students. For example, Jacob (2004) presented students with six questions, each consisting of a statement requiring an understanding of declarative chemical knowledge. Students were expected to select whether the conclusion was valid, possible or invalid and provide a short statement to explain their reasoning. Similarly, Kogut (1993) developed exercises where students were required to note observations and underlying assumptions of chemical phenomena, then develop hypotheses and experimental designs with which to test them. However, understanding the observations and underlying assumptions was dependent on declarative chemical knowledge such as trends in the periodic table or the ideal gas law.

Garratt et al. (1999) wrote an entire book dedicated to developing critical thinking in chemistry, titled ‘A Question of Chemistry’. In writing this book the authors took the view that thinking critically in chemistry draws on the generic skills of critical thinking and what they call ‘an ethos of a particular scientific method’ ( Garratt et al. , 2000, p. 153 ). The formats of the questions ranged from basic multiple choice, to rearranging statements to generate a cohesive argument, to open-ended responses. The statements preceding the questions are very discipline-specific, and the authors acknowledge they are inaccessible to a lay person. Overall, the chemistry context is used because ‘it adds to the students’ motivation if they can see the exercises are firmly rooted in, and therefore relevant to, their chosen discipline’ ( Garratt et al. , 2000, p. 166 ).

Thus, an opportunity has been identified to develop a chemistry critical thinking test which could be used to assist chemistry educators and chemistry education researchers in evaluating the effectiveness of teaching interventions designed to develop the critical thinking skills of chemistry undergraduate students. This study aimed to determine whether a valid and reliable critical thinking test could be developed and contextualised within the discipline of chemistry, yet independent of any discipline-specific knowledge, so as to accurately reflect the critical thinking ability of chemistry students from any level of study, at any university.

This study describes the development and reliability and validity testing of an instrument with which to measure undergraduate chemistry students’ critical thinking skills: The Danczak–Overton–Thompson Chemistry Critical Thinking Test (DOT test).

As critical thinking tests are considered to evaluate a psychometric construct ( Nunnally and Bernstein, 1994 ) there must be supporting evidence of their reliability and validity ( Kline, 2005 ; DeVellis, 2012 ). Changes made to each iteration of the DOT test and the qualitative and quantitative analysis performed at each stage of the study are described below.

Qualitative data analysis

The qualitative data for DOT V1 and DOT V2 were treated as separate studies. The data for these studies were collected and analysed separately as described in the following. The data collected from focus groups throughout this research were recorded with permission of the participants and transcribed verbatim into Microsoft Word, at which point participants were de-identified. The transcripts were then imported into NVivo version 11 and an initial analysis was performed to identify emergent themes. The data then underwent a second analysis to ensure any underlying themes were identified. A third review of the data used a redundancy approach to combine similar themes. The final themes were then used for subsequent coding of the transcripts ( Bryman, 2008 ).

Developing an operational definition of critical thinking

Students, teaching staff and employers did not define critical thinking in the holistic fashion of philosophers, cognitive psychologists or education researchers. In fact, very much in line with the constructivist paradigm ( Ferguson, 2007 ), participants appear to have drawn on elements of critical thinking relative to the environments in which they had previously been required to use it. For example, students focused on analysis and problem solving, possibly due to the assessment-driven environment of university, whereas employers cited innovation and global contexts, likely reflective of a commercial environment.

The definitions of critical thinking in the literature ( Lehman et al. , 1988 ; Facione, 1990 ; Halpern, 1996b ) cover a wide range of skills and behaviours. These definitions often imply that to think critically necessitates that all of these skills or behaviours be demonstrated. However, it seems almost impossible that all of these attributes could be observed at a given point in time, let alone assessed ( Dressel and Mayhew, 1954 ; Bailin, 2002 ). Whilst the students in Danczak et al. (2017) used ‘problem solving’ and ‘analysis’ to define critical thinking, this does not necessarily mean that their description accurately reflects the critical thinking skills they have actually acquired, but rather their perception of what critical thinking skills they have developed. Therefore, to base a chemistry critical thinking test solely on analysis and problem solving skills would omit the assessment of other important aspects of critical thinking.

To this end, the operational definition of critical thinking acknowledges the analysis and problem solving focus that students predominantly used to describe critical thinking, whilst expanding into other important aspects of critical thinking such as inference and judgement. Consequently, guidance was sought from existing critical thinking assessments, as described below.

Development of the Danczak–Overton–Thompson chemistry critical thinking test (DOT test)

The WGCTA is an 85 item test which has undergone extensive scrutiny in the literature since its inception in the 1920s ( Behar-Horenstein and Niu, 2011 ; Huber and Kuncel, 2016 ). The WGCTA covers the core principles of critical thinking divided into five sections: inference, assumption identification, deduction, interpreting information and evaluation of arguments. The questions test each aspect of critical thinking independent of context. Each section consists of brief instructions and three or four short parent statements. Each parent statement acts as a prompt for three to seven subsequent questions. The instructions, parent statements and the questions themselves are concise with respect to language and reading requirements. The WGCTA's focus on testing assumptions, deductions, inferences, analysing arguments and interpreting information is an inherent limitation in its ability to assess all critical thinking skills and behaviours. However, these elements are commonly described by many definitions of critical thinking ( Facione, 1990 ; Halpern, 1996b ).

The questions from the WGCTA practice test were analysed and used to structure the DOT test. The pilot version of the DOT test (DOT P) was initially developed with 85 questions, set within a chemistry or science context, and using similar structure and instructions to the WGCTA with five sub-scales: making assumptions, analysing arguments, developing hypotheses, testing hypotheses, and drawing conclusions.

Below is an example of paired statements and questions written for the DOT P. This question is revisited throughout this paper to illustrate the evolution of the test throughout the study. The DOT P used essentially the same instructions as provided on the WGCTA practice test. In later versions of the DOT Test the instructions were changed as will be discussed later.

The parent statement of the exemplar question from the WGCTA required the participant to recognise that propositions A and B are different, and explicitly stated that there is a relationship between propositions A and C. This format was used to generate the following parent statement:

A chemist tested a metal centred complex by placing it in a magnetic field. The complex was attracted to the magnetic field. From this result the chemist decided the complex had unpaired electrons and was therefore paramagnetic rather than diamagnetic.

In writing an assumption question for the DOT test, the paramagnetic and diamagnetic behaviour of metal complexes replaced propositions A and B. The relationship between propositions A and C was replaced with paramagnetic behaviour being related to unpaired electrons. The question then asked whether it was a valid or invalid assumption that proposition B is not related to proposition C.

Diamagnetic metal centred complexes do not have any unpaired electrons.

The correct answer was ‘valid assumption’, as this question required the participant to identify that propositions B and C were not related. The explanation for the correct answer was as follows:

The paragraph suggests that if the complex has unpaired electrons it is paramagnetic. This means diamagnetic complexes likely cannot have unpaired electrons.

All 85 questions on the WGCTA practice test were analysed in the manner exemplified above to develop the DOT P. In designing the test there were two requirements that had to be met. Firstly, the test needed to be able to be completed comfortably within 30 minutes to allow it to be administered in short time frames, such as at the end of laboratory sessions, and to increase the likelihood of voluntary completion by students. Secondly, the test needed to be able to accurately assess the critical thinking of chemistry students from any level of study, from first year general chemistry students to final year students. To this end, chemistry terminology was carefully chosen to ensure that prior knowledge of chemistry was not necessary to comprehend the questions. Chemical phenomena were explained and contextualised completely within the parent statement and the questions.

DOT pilot study: content validity

The test took in excess of 40 minutes to complete. Therefore, questions which were identified as unclear, which did not elicit the intended responses, or which caused misconceptions of the scientific content were removed. The resulting DOT V1 contained seven questions relating to ‘Making Assumptions’, seven questions relating to ‘Analysing Arguments’, six questions relating to ‘Developing Hypotheses’, five questions relating to ‘Testing Hypotheses’ and five questions relating to ‘Drawing Conclusions’. The terms used to select a multiple choice option were written in a manner more accessible to science students, for example using terms such as ‘Valid Assumption’ or ‘Invalid Assumption’ instead of ‘Assumption Made’ or ‘Assumption Not Made’. Finally, the number of options in the ‘Developing Hypotheses’ section was reduced from five to three: ‘likely to be an accurate inference’, ‘insufficient information to determine accuracy’ and ‘unlikely to be an accurate inference’.

Data treatment of responses to the DOT test and WGCTA-S

All responses to the DOT test and WGCTA-S were imported into IBM SPSS Statistics (V22). Frequency tables were generated to identify erroneous or missing data. Data were considered erroneous when participants had selected ‘C’ for questions which only contained options ‘A’ and ‘B’, or when undergraduate students identified their education/occupation as that of an academic. Erroneous data were deleted and treated as missing data points. In each study a variable was created to determine the sum of unanswered questions (missing data) for each participant. Pallant (2016, pp. 58–59) suggests a judgement call is required when considering missing data and whether to treat certain cases as genuine attempts to complete the test. In the context of this study, a genuine attempt was based on the number of questions a participant left unanswered: participants who attempted at least 27 questions were considered to have genuinely attempted the test.
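To make the screening procedure concrete, the following Python sketch reimplements the same logic outside of SPSS. The data layout and column names are assumptions for illustration only; the study itself performed this screening in SPSS.

    # Sketch of the response-screening logic described above (illustrative;
    # not the authors' SPSS workflow). Assumes one row per participant and
    # one column per question, with unanswered items recorded as NaN.
    import numpy as np
    import pandas as pd

    TWO_OPTION_ITEMS = ["q1", "q2"]  # hypothetical names of A/B-only questions

    def screen_responses(df: pd.DataFrame, min_attempted: int = 27) -> pd.DataFrame:
        df = df.copy()
        question_cols = [c for c in df.columns if c.startswith("q")]
        # Erroneous data: a 'C' response to a question offering only 'A' or 'B'
        # is recoded as missing rather than scored.
        for q in TWO_OPTION_ITEMS:
            df.loc[df[q] == "C", q] = np.nan
        # A genuine attempt answers at least `min_attempted` questions.
        attempted = df[question_cols].notna().sum(axis=1)
        return df[attempted >= min_attempted]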

Responses to all DOT test (and WGCTA-S) questions were coded as correct or incorrect. Descriptive statistics showed that the DOT V1 scores exhibited a normal (Gaussian) distribution, whereas the DOT V3 scores exhibited a non-normal distribution. In light of these distributions it was decided to treat all data obtained as non-parametric.
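The decision to treat scores as non-parametric can be illustrated with a simple distributional check; the sketch below applies a Shapiro–Wilk test and a skewness estimate to a hypothetical score vector (the study based its decision on descriptive statistics in SPSS).

    # Illustrative normality check on hypothetical test scores; a small p value
    # from the Shapiro-Wilk test indicates departure from normality.
    import numpy as np
    from scipy import stats

    scores = np.array([21, 18, 25, 22, 19, 24, 23, 20, 26, 17, 22, 28])
    w, p = stats.shapiro(scores)
    print(f"Shapiro-Wilk: W = {w:.3f}, p = {p:.3f}")
    print(f"skewness = {stats.skew(scores):.2f}")  # sign indicates direction of asymmetry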

Internal reliability of each iteration of the DOT test was determined by calculating Cronbach's α ( Cronbach, 1951 ). Within this study, comparisons between two continuous variables were made using Spearman's rank order test, the non-parametric equivalent of Pearson's r , as recommended by Pallant (2016) . Continuous variables included DOT test scores and previous academic achievement as measured by tertiary entrance scores (ATAR). When comparing DOT test scores between education groups, which were treated as categorical variables, a Mann–Whitney U test, the non-parametric equivalent of a t -test, was used. When comparing a continuous variable measured on the same participants at different times, a Wilcoxon signed rank test, the non-parametric equivalent of a paired t -test, was used.
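For readers wishing to reproduce these statistics outside of SPSS, the sketch below computes Cronbach's α directly from its standard formula and notes the scipy equivalents of the non-parametric tests named above; the score matrix is randomly generated for illustration.

    # Cronbach's alpha from its standard formula, plus the scipy equivalents
    # of the non-parametric tests used in the study (illustrative data only).
    import numpy as np
    from scipy import stats

    def cronbach_alpha(items: np.ndarray) -> float:
        """items: participants x items matrix of 0/1 (incorrect/correct) scores."""
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1).sum()
        total_variance = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_variances / total_variance)

    rng = np.random.default_rng(1)
    responses = rng.integers(0, 2, size=(100, 30))  # 100 participants, 30 items
    print(f"alpha = {cronbach_alpha(responses):.2f}")

    # Spearman's rank order test (two continuous variables):
    #   rho, p = stats.spearmanr(dot_scores, atar_scores)
    # Mann-Whitney U test (two independent groups):
    #   u, p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
    # Wilcoxon signed rank test (same participants at two time points):
    #   w, p = stats.wilcoxon(first_attempt, second_attempt)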

DOT V1: internal reliability and content validity

Internal reliability method

Content validity method

DOT V2: test–retest reliability, convergent validity and content validity

The most extensive rewriting of the parent statements occurred in the ‘Analysing Arguments’ section. The feedback provided from the focus groups indicated that parent statements did not include sufficient information to adequately respond to the questions.

Additional qualifying statements were added to several questions in order to reduce ambiguity. In the parent statement of the exemplar question the first sentence was added to eliminate the need to understand that differences exist between diamagnetic and paramagnetic metal complexes, with respect to how they interact with magnetic fields:

Paramagnetic and diamagnetic metal complexes behave differently when exposed to a magnetic field. A chemist tested a metal complex by placing it in a magnetic field. From the result of the test the chemist decided the metal complex had unpaired electrons and was therefore paramagnetic.

Finally, great effort was made in the organisation of the DOT V2 to guide the test taker through a critical thinking process, in line with Halpern's approach to analysing an argument ( Halpern, 1996a ). Halpern teaches that an argument comprises several conclusions, and that the credibility of these conclusions must be evaluated. Furthermore, the validity of any assumptions, inferences and deductions used to construct the conclusions within an argument needs to be analysed. To this end, the test taker was provided with scaffolding from making assumptions through to analysing arguments.

Test–retest reliability and convergent validity method

On the first day, demographic data were collected: sex, dominant language, previous academic achievement using tertiary entrance scores (ATAR), level of chemistry being studied and highest level of chemistry study completed at Monash University. Students completed the DOT V2 using an optical reader multiple choice answer sheet. This was followed by completion of the WGCTA-S in line with the procedures outlined in the Watson-Glaser critical thinking appraisal short form manual ( Watson and Glaser, 2006 ). The WGCTA-S was chosen for analysis of convergent validity as it was similar in length to the DOT V2 and was intended to measure the same aspects of critical thinking. The fact that participants completed the DOT V2 and then the WGCTA-S may have affected the participants’ performance on the WGCTA-S. This limitation will be addressed in the results.

After a brief break, the participants were divided into groups of five to eight students and interviewed about their overall impression of the WGCTA-S and their approach to various questions. Interviewers prevented the participants from discussing the DOT V2 so as not to influence each other's responses upon retesting.

On the second day participants repeated the DOT V2. DOT V2 attempts were completed on consecutive days to minimise participant attrition. Upon completion of the DOT V2 and after a short break, participants were divided into two groups of nine and interviewed about their impressions of the DOT V2, how they approached various questions and comparisons between the DOT V2 and WGCTA-S.

Responses to the tests and demographic data were imported into IBM SPSS (V22). Data were treated in accordance with the procedure outlined earlier. With the exception of tertiary entrance score, there were no missing or erroneous demographic data. Spearman's rank order correlations were performed comparing ATAR scores to scores on the WGCTA-S and the DOT V2. Test–retest reliability was determined using a Wilcoxon signed rank test ( Pallant, 2016, pp. 234–236 and 249–253 ). When the scores of tests taken at different times show no significant difference, as determined by a p value greater than 0.05, the test can be considered to have acceptable test–retest reliability ( Pallant, 2016, p. 235 ). Acceptable test–retest reliability does not imply that test attempts are equivalent. Rather, good test–retest reliability suggests that the precision of the test in measuring the construct of interest is acceptable. Median scores of the participants’ first attempt of the DOT V2 were compared with those of the participants’ second attempt. To determine the convergent validity of the DOT V2, the relationship between scores on the DOT V2 and performance on the WGCTA-S was investigated using Spearman's rank order correlation.
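A minimal sketch of this test–retest and convergent validity analysis follows; the paired score arrays are invented for illustration, so the outputs will not reproduce the reported values.

    # Test-retest: no significant difference between attempts (p > 0.05) is
    # read as acceptable reliability. Convergent validity: monotonic
    # association between DOT V2 and WGCTA-S scores. Illustrative data only.
    import numpy as np
    from scipy import stats

    first_attempt = np.array([18, 22, 25, 19, 21, 24, 20, 23, 17, 26])
    second_attempt = np.array([19, 21, 26, 19, 22, 23, 21, 24, 18, 25])
    w_stat, p_value = stats.wilcoxon(first_attempt, second_attempt)
    print(f"Wilcoxon signed rank: W = {w_stat}, p = {p_value:.2f}")

    wgcta_s = np.array([14, 15, 19, 13, 16, 18, 15, 17, 12, 19])
    rho, p = stats.spearmanr(first_attempt, wgcta_s)
    print(f"Spearman's rho = {rho:.2f} (p = {p:.3f})")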

After approximately 15 minutes of participants freely discussing the relevant test, the interviewers asked the participants to look at a given section on a test, for example the ‘Testing Hypotheses’ section of the DOT V2, and identify any questions they found problematic. In the absence of students identifying any problematic questions, the interviewers used a list of questions from each test to prompt discussion. The participants were then asked as a group:

• ‘ What do you think the question is asking you? ’

• ‘ What do you think is the important information in this question? ’

• ‘ Why did you give the answer(s) you did to this question? ’

The interview recordings were transcribed and analysed in line with the procedures and theoretical frameworks described previously to result in four distinct themes which were used to code the transcripts.

DOT V3: internal reliability, criterion validity, content validity and discriminant validity

Many scientific terms were either simplified or removed in the DOT V3. In the case of the exemplar question, the focus was moved to an alloy of thallium and lead rather than a ‘metal complex’. Generalising the question to focus on an alloy allowed it to retain scientific accuracy while reducing the tendency for participants to draw on knowledge outside the information presented:

Metals which are paramagnetic or diamagnetic behave differently when exposed to an induced magnetic field. A chemist tested a metallic alloy sample containing thallium and lead by placing it in an induced magnetic field. From the test results the chemist decided the metallic alloy sample repelled the induced magnetic field and therefore was diamagnetic.

This statement was then followed by the prompt asking the participant to decide if the assumption presented was valid or invalid:

Paramagnetic metals do not repel induced magnetic fields.

Several terms were rewritten because, as identified by the student focus groups, their use in science implied assumptions that were not intended. For example, question 14 asked whether a ‘low yield’ would occur in a given synthetic route. The term ‘low yield’ was changed to ‘an insignificant amount’ to remove any assumptions regarding the term ‘yield’.

The study of the DOT V3 required participants to be drawn from several distinct groups in order to assess criterion and discriminant validity. For the purpose of criterion validity, the DOT V3 was administered to first year and third year undergraduate chemistry students, honours and PhD students and post-doctoral researchers at Monash University, and chemistry education academics from an online community of practice. Furthermore, third year undergraduate chemistry students from another Australian higher education institution (Curtin University) also completed the DOT V3, to determine discriminant validity with respect to performance outside of Monash University.

Participants

Third year participants were drawn from an advanced inorganic chemistry course at Monash University and a capstone chemical engineering course at Curtin University. At Monash University, 54 students (37%) responded to the DOT test. The 23 students who completed the DOT V3 at Curtin University represented the entire cohort.

Post-doctoral researchers, honours and PhD students from Monash University were invited to attempt the DOT V3. 40 participants drawn from these cohorts attended a session where they completed the DOT V3 in a paper format, marking responses directly onto the test. All cohorts who completed the test in paper format required approximately 20 to 30 minutes to complete the test.

An online discussion group of approximately 300 chemistry academics with an interest in education, predominantly from Australia, the UK and Europe, was invited to complete an online version of the DOT V3. Online completion was untimed and 46 participants completed the DOT V3.

Treatment of data

Descriptive statistics of the 270 DOT V3 results revealed that a larger proportion of scores was above the mean, thus the data were considered non-parametric for the purposes of reliability and validity statistical analysis. Internal consistency was then determined by calculating Cronbach's α ( Cronbach, 1951 ). The five sub-scales of the DOT V3 (Making Assumptions, Developing Hypotheses, Testing Hypotheses, Drawing Conclusions and Analysing Arguments) underwent a principal component analysis to determine the number of factors underlying the DOT V3.
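One way to carry out such an analysis is sketched below using scikit-learn; the sub-scale scores are randomly generated, and the use of eigenvalues to judge the number of factors is one common convention rather than the authors' stated procedure.

    # Illustrative principal component analysis over the five DOT V3
    # sub-scale scores (hypothetical data; the study used SPSS).
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    # participants x sub-scales: Making Assumptions, Developing Hypotheses,
    # Testing Hypotheses, Drawing Conclusions, Analysing Arguments
    subscale_scores = rng.integers(0, 8, size=(270, 5)).astype(float)

    pca = PCA().fit(subscale_scores)
    print(pca.explained_variance_)        # component eigenvalues
    print(pca.explained_variance_ratio_)  # proportion of variance explained
    # Components with eigenvalues > 1 (Kaiser criterion) are one common
    # basis for deciding how many factors underlie the test.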

Criterion validity method

Discriminant validity method

Results and discussion

The academic participants in the focus groups navigated the questions on the DOT V1 to arrive at the intended responses. However, there was rarely consensus within the group and a minority, usually one or two participants, disagreed with the group. The difficulties the academics had in responding to the DOT V1 were made clear from four themes which emerged from the analysis: ‘Instruction Clarity’, ‘Wording of the Question(s)’, ‘Information within the Statement’ and ‘Prior Knowledge’ ( Table 2 ). The last theme was generally found to be associated with the other themes.

The theme of ‘Instruction Clarity’ was used to describe when participants either had difficulty interpreting the instructions or intentionally ignored them. Several participants self-reported a tendency to only scan the instructions without properly reading them, or did not read the statement preceding a question in its entirety. When this behaviour occurred, academics were quick to draw on outside knowledge. This theme identified the need for clear instructions and for relevant examples of what was meant by terms such as ‘is of significant importance’ or ‘is not of significant importance’.

The theme of ‘Wording of the Questions’ referred to evaluating the meaning and use of particular words within the questions or the parent statements. The wording of several questions led to confusion, causing the participants to draw on outside knowledge. Unclear terminology deterred non-science participants (education developers) from attempting the questions, a problem further compounded by the use of terms such as ‘can only ever be’. For example, the use of the term ‘rather than’ confused participants when they knew a question had more than two alternatives.

The theme ‘Information within the Statement’ referred to the participants’ perceptions of the quality and depth of information provided in the parent statements. Participants suggested some test questions appeared to be non-sequiturs with respect to the corresponding parent statements. Participants felt they did not have enough information to make a decision, and the lack of clarity in the instructions further compounded the problem.

The theme ‘Prior Knowledge’ identified instances when participants had drawn on information not provided in the DOT V1 to answer the questions. Several issues regarding prior knowledge emerged from the discourse. Participants identified that some assumptions were made about the use of chemical notation. Finally, some participants highlighted that having prior knowledge, specifically in science and/or chemistry, was to their detriment when attempting the questions.

DOT V2: test–retest reliability, convergent and content validity

A total of 15 participants provided their tertiary entrance score (ATAR), as a measure of previous academic achievement. There is some discussion in the literature which suggests university entrance scores obtained in high school do not reflect intelligence and cognitive ability ( Richardson, Abraham and Bond, 2012 ). However, a comparison of previous academic achievement, reported via ATAR scores, revealed a small positive correlation with scores obtained on the DOT V2 ( ρ = 0.23) and a moderately positive correlation with scores obtained on the WGCTA-S ( ρ = 0.47).

Test–retest reliability

“ The second time it felt like I was just remembering what I put down the day before. ”

The WGCTA-S manual ( Watson and Glaser, 2006, pp. 30–31 ) listed three studies with test–retest intervals of three months, two weeks and four days. These studies reported test–retest correlations ranging from r = 0.73 to 0.89, with larger correlations associated with shorter intervals between tests. As the p value of the Wilcoxon signed rank test in the present study was sufficiently large (0.91), it was unlikely that the DOT V2 would have exhibited poor test–retest reliability had it been administered over a longer time interval.

Convergent validity

Content validity

“ I found the questions (on the DOT V2) a bit more interesting and engaging in general where as this one (WGCTA-S) seemed a bit more clinical. ”

However, two participants expressed a preference for the WGCTA-S, citing the detailed examples in the instructions of each section, and their frustration when attempting the DOT V2 at having to recognise whether they were drawing on chemistry knowledge from outside the question.

The qualitative analysis of the student focus group data provided useful insight regarding the content validity of the DOT V2. When discussing their responses, the participants often arrived at a group consensus on the correct answers for both the DOT V2 and the WGCTA-S. Rarely did the participants initially arrive at a unanimous decision. In several instances on both tests, there were as many participants in favour of the incorrect response as there were participants in favour of the correct response. Four themes emerged from the analysis of the transcripts which are presented in Table 3 .

The theme ‘Strategies used to attempt test questions’ describes both the participants’ overall practice and increasing familiarity with the style of questions, and the specific cognitive techniques used in attempting to answer them. As participants became more familiar with the style of questions, their dependence on the examples provided diminished.

Some participants had difficulty understanding what was meant by ‘Assumption Made’ and ‘Assumption Not Made’ in the ‘Recognition of Assumption’ section in the WGCTA-S and drew heavily on the worked examples provided in the introduction to the section. At the conclusion of this study, these participants had completed three critical thinking tests and were becoming familiar with how the questions were asked and what was considered a correct response. However, test–retesting with the DOT V2 indicated that there was no change in performance.

There was concern that providing detailed instructions on the DOT test may in fact develop the participants’ critical thinking skills in the process of attempting to measure them. For example, a study of 152 undergraduate economics students divided into six approximately equal groups ( Heijltjes et al. , 2015 ) found that participants who were exposed to written critical thinking instructions performed on average 50% better on a critical thinking skills test than those who did not receive them. It seems plausible that a similar result would occur with the DOT test, and evaluating the impact of instructions and examples using control and test groups would be beneficial in future studies of the DOT test.

The second aspect of this theme was the application of problem solving skills and the generation of hypothetical scenarios to which deductive logic could be applied. The following quote describes a participant explicitly categorising the information provided in the parent statements and systematically analysing the relationships to answer the questions.

“ I find that with (section) three, deduction, that it was really easy to think in terms of sets, it was easier to think in terms set rather than words, doodling Venn diagrams trying to solve these ones. ”

The Delphi report considers behaviours described by this theme to be part of the interpretation critical thinking skill which describes the ability ‘to detect … relationships’ or ‘to paraphrase or make explicit … conventional or intended meaning’ of a variety of stimuli ( Facione, 1990, p. 8 ). Others consider this behaviour to be more reflective of problem solving skills, describing the behaviour as ‘understanding of the information given’ in order to build a mental representation of the problem ( OECD, 2014, p. 31 ). These patterns of problem solving were most evident in the discussions in response to the WGCTA-S questions.

With respect to the DOT V2, participants had difficulty understanding the intended meaning of the questions without drawing on prior knowledge of chemistry and science. For example, there were unexpected discussions of the implication of the term ‘lower yield’ in a chemistry context and its relationship to a reaction failing. Participants pointed out underlying assumptions associated with the term ‘yield’, highlighting that it was not necessarily reflective of obtaining a desired product.

The theme ‘Difficulties associated with prior knowledge’ described when participants drew on knowledge from outside the test in efforts to respond to the questions. In both the WGCTA-S and the DOT V2, the instructions clearly stated to only use the information provided within the parent statements and the questions. These difficulties were most prevalent when participants described their experiences with the DOT V2. For example, the participants were asked to determine the validity of a statement regarding the relationship between the formal charge of anions and how readily anions accept hydrogen ions. In arriving at their answer, one student drew on their outside knowledge of large molecules such as proteins to suggest:

“ What if you had some ridiculous molecule that has like a 3 minus charge but the negative zones are all the way inside the molecule then it would actually accept the H plus? ”

While this student's hypothesis led them to the intended response (that the assumption was invalid), the intended approach to this question was to recognise that the parent statement made no reference to how strongly cations and anions are attracted to each other.

It was concerning that some participants felt they had to ‘un-train’ themselves of their chemical knowledge in order to properly engage with the DOT V2. Some participants highlighted that they found the WGCTA-S easier as they did not have to reflect on whether they were using their prior knowledge. However, many participants were asking themselves ‘why am I thinking what I’m thinking?’, which is indicative of the higher-order metacognitive skills described by several critical thinking theoreticians ( Facione, 1990, p. 10 ; Kuhn, 2000 ; Tsai, 2001 ). Students appear to be questioning whether their responses to the DOT V2 are based on their own pre-existing knowledge or the information presented within the test, as highlighted in the following statement.

“ You had to think more oh am I using my own knowledge or what's just in the question? I was like so what is assumed to be background knowledge. What's background knowledge? ”

The theme ‘Terms used to articulate cognitive processes’ described the participants applying the language from the instructions of the WGCTA-S and the DOT V2 to articulate their thought processes. In particular, participants were very aware of their prior knowledge, referring to this as ‘bias’.

In response to the questions in the ‘Developing Hypotheses’ section, which related to the probability of failure of an esterification reaction, one student identified that they attempted to view the questions from the perspective of individuals with limited scientific knowledge, in order to minimise the influence of their prior chemical knowledge on their responses. There was much discussion of what was meant by the term ‘failure’ in the context of a chemical reaction, and whether failure referred to unsuccessful collisions at a molecular level or the absence of a product at the macroscopic level.

The students engaged in dialogues which helped refine the language they used in articulating their thoughts or helped them recognise thinking errors. This describes the final emergent theme, ‘Evidence of peer learning’. For example, when discussing their thought processes regarding a question in the ‘Deduction’ section of the WGCTA-S, one student shared their strategy of constructing mental Venn diagrams and correctly identified how elements of the question related. This prompted other students to recognise the connection they had initially failed to make and to reconsider their responses.

DOT V3: internal reliability, criterion and discriminant validity

Internal reliability

Criterion validity

Interestingly, there appeared to be no statistically significant difference in DOT V3 scores when comparing postgraduates and academics. If the assumption that critical thinking skill is positively correlated with time spent in tertiary education environments is valid, it is likely that the DOT V3 was not sensitive enough to detect any difference in critical thinking skill between postgraduates and academics.

Discriminant validity

Using Spearman's rank-order correlation coefficient, there was a weak, positive correlation between DOT V3 score and ATAR score ( ρ = 0.20, n = 194, p = 0.01), suggesting performance on the DOT V3 was only weakly dependent on previous academic achievement. This was in line with observations collected during testing of the DOT V2, where the correlation between previous academic achievement and performance on the test was also small ( ρ = 0.23), although that sample was small ( n = 15). As the sample size used in the study of this relationship in the DOT V3 was much larger ( n = 194), these findings suggest performance on the DOT V3 is only slightly correlated with previous academic achievement.

In order to determine the validity of the DOT V3 outside of Monash University, the median scores of third year students from Monash University and Curtin University were compared using a Mann–Whitney U test. The test revealed no significant difference between the scores obtained by Monash University students (Md = 20, n = 44) and Curtin University students (Md = 22, n = 24), U = 670.500, z = 1.835, p = 0.07, r = 0.22. Therefore, the score obtained on the DOT V3 was considered independent of the university the participant attended. It is possible that an insufficient number of tests were completed, due to the opportunistic sampling from both universities, and obtaining equivalent sample sizes across several higher education institutions would confirm whether the DOT V3 performs consistently across institutions.
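The form of this comparison can be reproduced as follows; the score arrays are invented, so the outputs will not match the reported statistics, and the z value is obtained from the usual normal approximation without a tie correction.

    # Mann-Whitney U with normal-approximation z and effect size r = z / sqrt(N).
    # Illustrative scores only; the reported statistics came from SPSS.
    import numpy as np
    from scipy import stats

    monash = np.array([20, 18, 22, 19, 21, 17, 23, 20, 16, 24])
    curtin = np.array([22, 21, 24, 20, 23, 19, 25, 22, 21, 26])

    u_stat, p_value = stats.mannwhitneyu(monash, curtin, alternative="two-sided")
    n1, n2 = len(monash), len(curtin)
    z = (u_stat - n1 * n2 / 2) / np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    r = abs(z) / np.sqrt(n1 + n2)
    print(f"U = {u_stat}, z = {z:.2f}, p = {p_value:.3f}, r = {r:.2f}")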

The DOT V3 offers a tool with which to measure a student's critical thinking skills and the effect of any teaching interventions specifically targeting the development of critical thinking. The test is suitable for studying the development of critical thinking using a cross section of students, and may be useful in longitudinal studies of a single cohort. In summary, research conducted within this study provides a body of evidence regarding reliability and validity of the DOT test, and it offers the chemistry education community a valuable research and educational tool with respect to the development of undergraduate chemistry students’ critical thinking skills. The DOT V3 is included in Appendix 1 (ESI † ) (Administrator guidelines and intended responses to the DOT V3 can be obtained upon email correspondence with the first author).

Implications for practice

Whilst there is the potential to measure the development of critical thinking over a semester using the DOT V3, there is evidence to suggest that a psychological construct such as critical thinking does not develop enough for measurable differences to occur in the space of a single semester ( Pascarella, 1999 ). While the DOT V3 could be administered to the same cohort of students annually to form the basis of a longitudinal study, there are many hurdles to overcome in such a study, including participant retention and participants’ developing familiarity with the test. Much like the CCTST and the WGCTA pre- and post-testing ( Jacobs, 1999 ; Carter et al. , 2015 ), at least two versions of the DOT V3 may be required for pre- and post-testing and for longitudinal studies. However, having a larger pool of questions does not prevent the participants from becoming familiar with the style of critical thinking questions, and development of an additional test would require further reliability and validity testing ( Nunnally and Bernstein, 1994 ). Nevertheless, cross-sectional studies are useful in identifying changes in critical thinking skills, and the DOT V3 has demonstrated it is sensitive enough to discern between the critical thinking skills of first year and third year undergraduate chemistry students.

Conflicts of interest

Acknowledgements

The authors would like to acknowledge participants from Monash University, Curtin University and academics from the community of practice who took the time to complete the various versions of the DOT test and/or participate in the focus groups. This research was made possible through Australian Postgraduate Award funding and with the guidance of the Monash University Human Ethics Research Committee.

  • Abrami P. C., Bernard R. M., Borokhovski E., Wade A., Surkes M. A., Tamim R. and Zhang D., (2008), Instructional interventions affecting critical thinking skills and dispositions: a stage 1 meta-analysis, Rev. Educ. Res. , 78 (4), 1102–1134.
  • Abrami P. C., Bernard R. M., Borokhovski E., Waddington D. I., Wade C. A. and Persson T., (2015), Strategies for teaching students to think critically: a meta-analysis, Rev. Educ. Res. , 85 (2), 275–314.
  • AssessmentDay Ltd, (2015), Watson Glaser critical thinking appraisal , retrieved from https://www.assessmentday.co.uk/watson-glaser-critical-thinking.htm, accessed on 03/07/2015.
  • Bailin S., (2002), Critical thinking and science education, Sci. Educ. , 11 , 361–375.
  • Barnett R., (1997), Higher education: a critical business , Buckingham: Open University Press.
  • Behar-Horenstein L. S. and Niu L., (2011), Teaching critical thinking skills in higher education: a review of the literature, J. Coll. Teach. Learn. , 8 (2), 25–41.
  • Bryman A., (2008), Social research methods , 3rd edn, Oxford: Oxford University Press.
  • Butler H. A., (2012), Halpern critical thinking assessment predicts real-world outcomes of critical thinking, Appl. Cognit. Psychol. , 26 (5), 721–729.
  • Carter A. G., Creedy D. K. and Sidebotham M., (2015), Evaluation of tools used to measure critical thinking development in nursing and midwifery undergraduate students: a systematic review, Nurse Educ. Today , 35 (7), 864–874.
  • Cronbach L. J., (1951), Coefficient alpha and the internal structure of tests, Psychometrika , 16 , 297–334.
  • Danczak S. M., Thompson C. D. and Overton T. L., (2017), What does the term critical thinking mean to you? A qualitative analysis of chemistry undergraduate, teaching staff and employers' views of critical thinking, Chem. Educ. Res. Pract. , 18 (3), 420–434.
  • Davies M., (2013), Critical thinking and the disciplines reconsidered, High. Educ. Res. Dev. , 32 (4), 529–544.
  • Desai M. S., Berger B. D. and Higgs R., (2016), Critical thinking skills for business school graduates as demanded by employers: a strategic perspective and recommendations, Acad. Educ. Leadership J. , 20 (1), 10–31.
  • DeVellis, R. F., (2012), Scale development: Theory and applications , 3rd edn, Thousand Oaks, CA: Sage.
  • Dressel P. L. and Mayhew L. B., (1954), General education: explorations in evaluation , Washington, DC: American Council on Education.
  • Edwards D., Perkins K., Pearce J. and Hong J., (2015), Work integrated learning in STEM in Australian universities , retrieved from http://www.chiefscientist.gov.au/wp-content/uploads/ACER_WIL-in-STEM-in-Australian-Universities_June-2015.pdf, accessed on 05/12/2016.
  • Ennis R. H., (1989), Critical thinking and subject specificity: clarification and needed research, Educ. Res. , 18 (3), 4–10.
  • Ennis R. H., (1990), The extent to which critical thinking is subject-specific: further clarification, Educ. Res. , 19 (4), 13–16.
  • Ennis R. H., (1993), Critical thinking assessment, Theory Into Practice , 32 (3), 179–186.
  • Ennis R. H. and Weir E., (1985), The ennis-weir critical thinking essay test: test, manual, criteria, scoring sheet , retrieved from http://faculty.education.illinois.edu/rhennis/tewctet/Ennis-Weir_Merged.pdf, accessed on 09/10/2017.
  • Facione P. A., (1990), Critical thinking: a statement of expert consensus for purposes of educational assessment and instruction. Executive summary. “The Delphi report” , Millbrae, CA: T. C. A. Press.
  • Ferguson R. L., (2007), Constructivism and social constructivism, in Bodner G. M. and Orgill M. (ed.), Theoretical frameworks for research in chemistry and science education , Upper Saddle River, NJ: Pearson Education (US).
  • Flynn A. B., (2011), Developing problem-solving skills through retrosynthetic analysis and clickers in organic chemistry, J. Chem. Educ. , 88 , 1496–1500.
  • Garratt J., Overton T. and Threlfall T., (1999), A question of chemistry , Essex, England: Pearson Education Limited.
  • Garratt J., Overton T., Tomlinson J. and Clow D., (2000), Critical thinking exercises for chemists, Active Learn. High. Educ. , 1 (2), 152–167.
  • Glaser R., (1984), Education and thinking: the role of knowledge, Am. Psychol. , 39 (2), 93–104.
  • Gupta T., Burke K. A., Mehta A. and Greenbowe T. J., (2015), Impact of guided-inquiry-based instruction with a writing and reflection emphasis on chemistry students’ critical thinking abilities, J. Chem. Educ. , 92 (1), 32–38.
  • Halpern D. F., (1993), Assessing the effectiveness of critical thinking instruction, J. General Educ. , 50 (4), 238–254.
  • Halpern D. F., (1996a), Analyzing arguments, in Halpern D. F. (ed.), Thought and knowledge: an introduction to critical thinking , 3rd edn, Mahwah, NJ: L. Erlbaum Associates, pp. 167–211.
  • Halpern D. F., (1996b), Thought and knowledge: An introduction to critical thinking , 3rd edn, Mahwah, NJ: L. Erlbaum Associates.
  • Halpern D. F., (1998), Teaching critical thinking for transfer across domains. Dispositions, skills, structure training, and metacognitive monitoring, Am. Psychol. , 53 , 449–455.
  • Halpern D. F., (2016), Manual: Halpern critical thinking assessment , retrieved from https://drive.google.com/file/d/0BzUoP_pmwy1gdEpCR05PeW9qUzA/view, accessed on 09/10/2017.
  • Halpern D. F., Benbow C. P., Geary D. C., Gur R. C., Hyde J. S. and Gernsbacher M. A., (2007), The science of sex differences in science and mathematics, Psychol. Sci. Public Interest , 8 (1), 1–51.
  • Hassan K. E. and Madhum G., (2007), Validating the Watson Glaser critical thinking appraisal, High. Educ. , 54 (3), 361–383.
  • Heijltjes A., van Gog T., Leppink J. and Pass F., (2015), Unraveling the effects of critical thinking instructions, practice, and self-explanation on students' reasoning performance, Instr. Sci. , 43 , 487–506.
  • Henderson D. E., (2010), A chemical instrumentation game for teaching critical thinking and information literacy in instrumental analysis courses, J. Chem. Educ. , 87 , 412–415.
  • Huber C. R. and Kuncel N. R., (2016), Does college teach critical thinking? A meta-analysis, Rev. Educ. Res. , 86 (2), 431–468.
  • Inhelder B. and Piaget J., (1958), The growth of logical thinking from childhood to adolescence: an essay on the construction of formal operational structures , London: Routledge & Kegan Paul.
  • Insight Assessment, (2013), California critical thinking skills test (CCTST), Request information , retrieved from http://www.insightassessment.com/Products/Products-Summary/Critical-Thinking-Skills-Tests/California-Critical-Thinking-Skills-Test-CCTST, accessed on 07/09/2017.
  • Iwaoka W. T., Li Y. and Rhee W. Y., (2010), Measuring gains in critical thinking in food science and human nutrition courses: the Cornell critical thinking test, problem-based learning activities, and student journal entries, J. Food Sci. Educ. , 9 (3), 68–75.
  • Jackson D., (2010), An international profile of industry-relevant competencies and skill gaps in modern graduates, Int. J. Manage. Educ. , 8 (3), 29–58.
  • Jacob C., (2004), Critical thinking in the chemistry classroom and beyond, J. Chem. Educ. , 81 (8), 1216–1223.
  • Jacobs S. S., (1999), The equivalence of forms a and b of the California critical thinking skills test, Meas. Eval. Counsel. Dev. , 31 (4), 211–222.
  • Johnson R. H., Blair J. A. and Hoaglund J., (1996), The rise of informal logic: essays on argumentation, critical thinking, reasoning, and politics , Newport, VA: Vale Press.
  • Klein G. C. and Carney J. M., (2014), Comprehensive approach to the development of communication and critical thinking: bookend courses for third- and fourth-year chemistry majors, J. Chem. Educ. , 91 , 1649–1654.
  • Kline T., (2005), Psychological testing: A practical approach to design and evaluation , Thousand Oaks, CA: Sage Publications.
  • Kogut L. S., (1993), Critical thinking in general chemistry, J. Chem. Educ. , 73 (3), 218–221.
  • Krejcie R. V. and Morgan D. W., (1970), Determining sample size for research activities, Educ. Psychol. Meas. , 30 (3), 607–610.
  • Kuhn D., (1999), A developmental model of critical thinking, Educ. Res. , 28 (2), 16–26.
  • Kuhn D., (2000), Metacognitive development, Curr. Dir. Psychol. Sci. , 9 (5), 178–181.
  • Lehman D. R. and Nisbett R. E., (1990), A longitudinal study of the effects of undergraduate training on reasoning, Dev. Psychol. , 26 , 952–960.
  • Lehman D. R., Lempert R. O. and Nisbett R. E., (1988), The effects of graduate training on reasoning: formal discipline and thinking about everyday-life events, Am. Psychol. , 43 , 431–442.
  • Lindsay E., (2015), Graduate outlook 2014 employers' perspectives on graduate recruitment in Australia , Melbourne: Graduate Careers Australia, retrieved from http://www.graduatecareers.com.au/wp-content/uploads/2015/06/Graduate_Outlook_2014.pdf, accessed on 21/08/2015.
  • Lowden K., Hall S., Elliot D. and Lewin J., (2011), Employers’ perceptions of the employability skills of new graduates: research commissioned by the edge foundation , retrieved from http://www.educationandemployers.org/wp-content/uploads/2014/06/employability_skills_as_pdf_-_final_online_version.pdf, accessed on 06/12/2016.
  • Martineau E. and Boisvert L., (2011), Using wikipedia to develop students' critical analysis skills in the undergraduate chemistry curriculum, J. Chem. Educ. , 88 , 769–771.
  • McMillan J., (1987), Enhancing college students' critical thinking: a review of studies, J. Assoc. Inst. Res. , 26 (1), 3–29.
  • McPeck J. E., (1981), Critical thinking and education , Oxford: Martin Robertson.
  • McPeck J. E., (1990), Teaching critical thinking: dialogue and dialectic , New York: Routledge.
  • Monash University, (2015), Undergraduate – area of study. Chemistry , retrieved from http://www.monash.edu.au/pubs/2015handbooks/aos/chemistry/, accessed on 15/04/2015.
  • Moore T. J., (2011), Critical thinking and disciplinary thinking: a continuing debate, High. Educ. Res. Dev. , 30 (3), 261–274.
  • Moore T. J., (2013), Critical thinking: seven definitions in search of a concept, Stud. High. Educ. , 38 (4), 506–522.
  • Nisbett R. E., Fong G. T., Lehman D. R. and Cheng P. W., (1987), Teaching reasoning, Science , 238 , 625–631.
  • Nunnally J. C. and Bernstein I. H., (1994), Psychometric theory , New York: McGraw-Hill.
  • OECD, (2014), PISA 2012 results: creative problem solving: students’ skills in tackling real-life problems (Volume V) , OECD Publishing, retrieved from http://dx.doi.org/10.1787/9789264208070-en , accessed on 05/01/2018.
  • Oliver-Hoyo M. T., (2003), Designing a written assignment to promote the use of critical thinking skills in an introductory chemistry course, J. Chem. Educ. , 80 , 899–903.
  • Ontario University, (2017), Appendix 1: OCAV's undergraduate and graduate degree level expectations , retrieved from http://oucqa.ca/framework/appendix-1/, accessed on 09/10/2017.
  • Pallant J. F., (2016), SPSS survival manual , 6th edn, Sydney: Allen & Unwin.
  • Pascarella E., (1999), The development of critical thinking: Does college make a difference? J. Coll. Stud. Dev. , 40 (5), 562–569.
  • Pearson, (2015), Watson-Glaser critical thinking appraisal – short form (WGCTA-S) , retrieved from https://www.pearsonclinical.com.au/products/view/208, accessed on 03/07/2015.
  • Phillips V. and Bond C., (2004), Undergraduates' experiences of critical thinking, High. Educ. Res. Dev. , 23 (3), 277–294.
  • Pithers R. T. and Soden R., (2000), Critical thinking in education: a review, Educ. Res. , 42 (3), 237–249.
  • Preiss D. D., Castillo J., Flotts P. and San Martin E., (2013), Assessment of argumentative writing and critical thinking in higher education: educational correlates and gender differences, Learn. Individ. Differ. , 28 , 193–203.
  • Prinsley R. and Baranyai K., (2015), STEM skills in the workforce: What do employers want? retrieved from http://www.chiefscientist.gov.au/wp-content/uploads/OPS09_02Mar2015_Web.pdf, accessed on 06/10/2015.
  • Richardson M., Abraham C. and Bond R., (2012), Psychological correlates of university students' academic performance: a systematic review and meta-analysis, Psychol. Bull. , 138 (2), 353–387.
  • Sarkar M., Overton T., Thompson C. and Rayner G., (2016), Graduate employability: views of recent science graduates and employers, Int. J. Innov. Sci. Math. Educ. , 24 (3), 31–48.
  • Stephenson N. S. and Sadler-Mcknight N. P., (2016), Developing critical thinking skills using the science writing heuristic in the chemistry laboratory, Chem. Educ. Res. Pract. , 17 (1), 72–79.
  • The Critical Thinking Co, (2017), Cornell critical thinking tests , retrieved from https://www.criticalthinking.com/cornell-critical-thinking-tests.html, accessed on 9/10/2017.
  • Thorndike E. L. and Woodworth R. S., (1901a), The influence of improvement in one mental function upon the efficiency of other functions, (i), Psychol. Rev. , 8 (3), 247–261.
  • Thorndike E. L. and Woodworth R. S., (1901b), The influence of improvement in one mental function upon the efficiency of other functions. ii. The estimation of magnitudes, Psychol. Rev. , 8 (4), 384–395.
  • Thorndike E. L. and Woodworth R. S., (1901c), The influence of improvement in one mental function upon the efficiency of other functions: functions involving attention, observation and discrimination, Psychol. Rev. , 8 (6), 553–564.
  • Tiruneh D. T., Verburgh A. and Elen J., (2014), Effectiveness of critical thinking instruction in higher education: a systematic review of intervention studies, High. Educ. Stud. , 4 (1), 1–17.
  • Tiruneh D. T., De Cock M., Weldeslassie A. G., Elen J. and Janssen R., (2016), Measuring critical thinking in physics: development and validation of a critical thinking test in electricity and magnetism, Int. J. Sci. Math. Educ. , 1–20.
  • Tsai C. C., (2001), A review and discussion of epistemological commitments, metacognition, and critical thinking with suggestions on their enhancement in internet-assisted chemistry classrooms, J. Chem. Educ. , 78 (7), 970–974.
  • University of Adelaide, (2015), University of Adelaide graduate attributes , retrieved from http://www.adelaide.edu.au/learning/strategy/gradattributes/, accessed on 15/04/2015.
  • University of Edinburgh, (2017), The University of Edinburgh's graduate attributes , retrieved from http://www.ed.ac.uk/employability/graduate-attributes/framework, accessed on 09/10/2017.
  • University of Melbourne, (2015), Handbook – chemistry , retrieved from https://handbook.unimelb.edu.au/view/2015/!R01-AA-MAJ%2B1007, accessed on 15/04/2015.
  • Watson G. and Glaser E. M., (2006), Watson-Glaser critical thinking appraisal short form manual , San Antonio, TX: Pearson.



Danczak, S. M., Thompson, C. D., & Overton, T. L. (2020). Development and validation of an instrument to measure undergraduate chemistry students' critical thinking skills. Chemistry Education Research and Practice (ISSN 1756-1108). Published 16 January 2020. DOI: 10.1039/c8rp00130h

Excerpt from the article: There are two extreme views regarding the teaching of critical thinking and the role subject-specific knowledge plays in its development: the subject-specifist view and the subject-generalist view. The subject-specifist view, championed by McPeak (1981), holds that thinking never occurs without context, and thus courses designed to teach informal logic in an abstract environment provide no benefit to a student's capacity to think critically (McPeak, 1990). This perspective is supported by the work of prominent psychologists in the early twentieth century (Thorndike and Woodworth, 1901a, 1901b, 1901c; Inhelder and Piaget, 1958).

Acknowledgements: The authors thank participants from Monash University, Curtin University and academics from the community of practice who completed the various versions of the DOT test and/or participated in the focus groups. This research was made possible through Australian Postgraduate Award funding, with the guidance of the Monash University Human Ethics Research Committee. © 2020 The Royal Society of Chemistry.


Critical thinking skills assessment instrument in physics subjects: how to develop a four tier diagnostic test?

A A Waluyo1, Hartono1 and Sulhadi1

A A Waluyo et al 2021 J. Phys.: Conf. Ser. 1918 022011. DOI: 10.1088/1742-6596/1918/2/022011. Published under licence by IOP Publishing Ltd in Journal of Physics: Conference Series, Volume 1918 (Physics and its Application).


Author affiliations

1 Physics Education, Faculty of Mathematics and Natural Science, Universitas Negeri Semarang, Indonesia


Critical thinking is a thinking skill that involves a process of reasoning and reflective thinking which cannot be directly observed, so it requires a dedicated assessment instrument. This development study aimed to produce an instrument for assessing critical thinking skills that can both measure students' critical thinking skills and illustrate the interconnectedness between aspects of those skills. The sample was selected using a random sampling technique: 109 students in the trial stage and 136 students in the implementation stage. Data analysis comprised several tests: content validity, construct validity, reliability, the level of students' critical thinking skills, and analysis of variance. The validity testing showed that the instrument contains four factors matching the theorized structure, and the content validity score showed that the instrument is usable. The reliability test yielded a Cronbach's alpha coefficient of 0.820. The level of critical thinking skills among high school students in Banyumas Regency was found to be low. Analysis of the relationships among aspects of critical thinking skills showed that the aspects are interconnected, with analysis as the foundation and conclusion and evaluation as the culmination, working together to produce solutions.
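For reference, Cronbach's alpha is the standard internal-consistency statistic; its usual definition (a textbook formula, not taken from this paper) is:

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^{2}_{Y_i}}{\sigma^{2}_{X}}\right)
```

where k is the number of items, σ²(Yᵢ) is the variance of item i, and σ²(X) is the variance of total scores; the reported value of 0.820 indicates good internal consistency by common rules of thumb.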


Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence . Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

  • Open access
  • Published: 14 September 2022

Adaptation and validation of a critical thinking scale to measure the 3D critical thinking ability of EFL readers

  • Moloud Mohammadi (ORCID: orcid.org/0000-0001-7848-1869)1,
  • Gholam-Reza Abbasian (ORCID: orcid.org/0000-0003-1507-1736)2 &
  • Masood Siyyari1

Language Testing in Asia, volume 12, Article number: 24 (2022)


Thinking has always been an integral part of human life, and it can be said that whenever humanity has been thinking, it has been practicing a kind of critique of the issues around it. This is the concept of critical thinking, which enhances the ability of individuals to identify problems and find solutions. Most previous research has focused on only one aspect of critical thinking, namely critical thinking skills, while two other dimensions, criticality and critical pedagogy, should also have been included. In order to ensure the validity of the instrument designed by Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review), the instrument was first adapted and then SEM modeling was applied. Examination of the results of factor analysis and SEM modeling showed that the model satisfied the fit indices (χ2/df, CFI, TLI, RMSEA) and that all factor loadings are greater than 0.4, indicating that the items are properly defined. This research proposes an SEM model of critical thinking skills composed of six factors measured by twenty-two indices. The results of the PLS-SEM CFA showed that it is a valid structural model for measuring the critical thinking of EFL readers at three levels.

Introduction

Recent research on reading has shown that, although reading is generally established as the first skill of language learners, it is a complex cognitive activity that individuals must perform well in order to learn and obtain sufficient information from the target community (Shang, 2010). According to Krathwohl (2002), the cognitive domain is divided into two parts: knowledge (including factual, conceptual, procedural, and metacognitive knowledge) and the cognitive process (including remembering, understanding, applying, analyzing, evaluating, and creating). In defining this skill, Chamot (2004) holds that reading is the process of activating acquired language knowledge and skills to access information and transfer it. Swallow (2016) views reading as a three-dimensional construct comprising content, perception, and understanding through thinking, metacognition, and meaning construction (Gear, 2006).

According to Rashel and Kinya (2021), the focus of education in this competitive period is on higher-level thinking skills (including critical thinking) rather than lower-level thinking skills, and research into measuring critical thinking skills is growing. In the eyes of Ennis (2011), critical thinking is defined as clear and rational thinking that involves engaging in reflective and independent thought. Moon (2008) and Willingham (2007) emphasized that the development of critical thinking in individuals is the goal of higher education and can be recognized as the primary goal of learning. Paul and Elder (2002), in describing a critical thinker, stated that in the eyes of such a person all four skills of reading, writing, speaking, and listening are modes of skilled higher-order thinking. Such a person, while reading a text, sees it as a representation of the author's thinking and therefore tries to align with the author's point of view. In this regard, Din (2020) emphasizes that since critical thinkers have the ability to understand beyond the content of a text, they tend to react to the content being studied. Moreover, the tendency towards implementing critical thinking programs in English language teaching contexts has increased as well (Heidari, 2020; Liu & Stapleton, 2014).

Besides these theoretical investigations, there are several studies with a practical orientation. Some research has examined the role of critical thinking in learning a language (e.g., Akyuz & Samsa, 2009; Willingham, 2007); others focused on the thinking strategies used by language learners to improve reading skills (Shang, 2010) or on the relationship between critical thinking and language learning strategies (Nikopour et al., 2011). A few studies confirmed the relationship between critical thinking ability and reading comprehension (e.g., Eftekhary & Besharati Kalayeh, 2014). In this area, a limited number of studies have relied on the participation of the academic community (Hawkins, 2012; Hosseini et al., 2012), and the present study is also innovative in this respect. It can be inferred that in most of these studies, critical thinking is limited to the use of a definite set of basic skills (comparing and contrasting, concluding, inferencing, etc.). According to Facione (1990) and Rashel and Kinya (2021), most research on this topic has focused on general critical thinking skills (but not expertise), although these skills have been of interest for years. But is it enough simply to use these skills to understand content? Can critical thinking be summarized in terms of several sub-skills? Where and how are the role and influence of society reflected in critical thinking or critical reading? Does learning these sub-skills alone indicate the internalization of critical thinking and reading in individuals? These key questions have been left largely unaddressed, mainly due to a lack of specific and valid instruments; this gap is the rationale behind the present study.

The novel point of the present study is that, despite the prevailing one-dimensional attitude towards critical thinking (Facione, 1992; Kember et al., 2000), it tries to highlight the concept of three-dimensional critical thinking in an academic context, and to this end it developed a tool for measuring its subscales (and not just individual skills). Such a tool can address the real needs of the next generation with evidence of real-life, multifaceted critical thinking issues. The purpose of this study was to evaluate the validity of the questionnaire developed for assessing three-dimensional critical thinking skills in EFL readers. Moreover, the application of the partial least squares method (PLS-SEM) in the development and validation of the proposed model also makes this research innovative. The objectives of this study were (1) to assess the validity of the items introduced in the questionnaire, (2) to investigate the relationships between and among the identified components, and (3) to determine the validity and reliability of the questionnaire designed to assess three-dimensional critical thinking skills in EFL readers. The contribution of this article to the literature is to illustrate the importance of critical thinking in both personal and sociocultural aspects, to evaluate and validate the tool that was developed to measure the components of three-dimensional critical thinking (proposed by the same researchers), to provide the model fit indices for factor analysis, and to adapt the instrument to the conditions of English language readers. Therefore, an attempt was made to briefly introduce the components of the proposed model, then to discuss the validation method of the developed instrument for measuring these components, and finally to report the validation results of the introduced instrument. The pedagogical implications of this study include the following: using the presented concepts in research centers to identify and introduce methods for teaching and developing each of the sub-skills of critical thinking in different societies; identifying differences in instructional approaches for each of the sub-skills; applying both concepts (i.e., three-dimensional critical thinking and reading) in other areas and assessing the generalizability of findings; and reviewing the previous literature in light of all three dimensions introduced and evaluated here, in order to identify their strengths and weaknesses in this regard.

Literature review

Today, when critical thinking is more prominent in language teaching than ever (Li, 2016; Van Laar et al., 2017), there is a wealth of research on the need for and importance of fostering such thinking in language classrooms (Zhao et al., 2016), showing that developing such thinking facilitates language acquisition (Wang & Henderson, 2014; Wu et al., 2013) and equips learners with a self-critical stance that develops an analytical and reflective view of themselves and their environment (Moghadam et al., 2021). Brookfield (2019), Dekker (2020), Haji Maibodi and Fahim (2012), and Zou and Lee (2021) acknowledged that teachers who emphasize the teaching and application of critical thinking increase learners' awareness and understanding of socio-cultural concepts. In this regard, Crenshaw et al. (2011) stated that encouraging language learners to participate actively in thinking activities is essential, and McGregor (2007) and Rezaei et al. (2011) emphasized that engaging teachers and language learners in thinking about and reflecting on the views and assumptions presented in a text is among the essential steps in developing critical thinking in individuals. Rezaei et al. (2011) noted that learners' participation in critical thinking processes during teaching occurs through asking questions and providing answers, discussing topics, asking for explanation or elaboration of opinions, and so on. They also emphasized the need to provide teachers with accurate and comprehensive knowledge of critical thinking before they lead such classes. In addition, Tehrani and Razali (2018) and Li (2016) have suggested that critical thinking training should begin at an early age and within the natural process of learning the target language. However, despite the importance of and emphasis on its development, little progress has been made in its application and integration in education (Li, 2011; Pica, 2000), the reasons for which, according to Lin et al. (2016), can be found in its challengingly broad nature and the ambiguous details of its components.

The traditional definitions of critical thinking offered by philosophers do not necessarily help individuals become critical citizens or critical beings. However, the core characteristics of critical thinking introduced in these definitions remain fundamental to what is meant by critical thinking (Bali, 2015; Davies, 2015; Davies & Barnett, 2015; Renatovna & Renatovna, 2021; Widyastuti, 2018; Wilson, 2016). Although critical thinking is a pivotal learning skill, the acquisition of related skills in the traditional view was limited to practising certain types of skills such as inferencing, reasoning, and analyzing (Davies, 2015). Davies emphasizes that one of the weaknesses of the traditional sense of critical thinking, crystallized in the critical thinking movement, is the failure to form the important component of action. It is worth mentioning that paying too little attention to topics related to critical thinking in higher education may result in the lack of proper and well-defined practical (and even theoretical) instruction, and, as Paulsen (2015) mentioned, little advancement can be made if critical thinking remains vague.

A model of critical thinking in higher education is suggested by Davies (2015), in which the basic focus is on the critical rationality and critical character of individuals. He proposes six distinct dimensions of critical thinking: critical argumentation, critical judgment, critical dispositions, critical actions, critical social relations, and critical creativity (or critical being). Each of these dimensions plays a significant role in the comprehensive model of critical thinking (Davies, 2015; Davies & Barnett, 2015).

There are many well-developed models of critical thinking that might be called "philosophical" models. These models can be placed on a continuum from taxonomies of pedagogical objectives (e.g., Airasian et al., 2001; Bloom, 1956) to the APA Delphi Report and Paul-Elder models (e.g., Paul & Elder, 2002; Sadler, 2010) and to the model of critical thinking by Ennis (1991), in which the main emphasis is on cognitive decision-making. However, Davies (2015) showed that these models are used mostly for educating for critical thinking, where the main goal is providing learners with activities through which they can improve their basic judgment and decision-making ability, whereas critical thinking is a multidimensional concept containing both personal and social aspects. In endorsing the use of the term multidimensional in relation to critical thinking, several existing challenges can be mentioned. Lun et al. (2010) and Manalo and Sheppard (2016) stated that a certain level of language proficiency is required to engage in such thinking. Similarly, Peng (2014) stated that for students, language deficiency is one of the main sources of cognitive barriers that prevent them from practicing critical thinking. Explaining other challenges, Liang and Fung (2020) and Merrifield (2018) stated that culture affects the application and practice of such thinking. For example, factors such as a significant decline in the quality and quantity of social interactions and intense attention to an individual's social status within a group (Suryantari, 2018), certain social norms that are explicit in Eastern settings (Bag & Gürsoy, 2021), socio-cultural factors (Imperio et al., 2020; Shpeizer, 2018), fear of being ridiculed while expressing an opinion (Tumasang, 2021), epistemic belief in the certainty of knowledge (Kahsay, 2019), the prevalence of teacher-centered language classes (Fahim & Ahmadian, 2012; Hemmati & Azizmalayeri, 2022; Khorasani & Farimani, 2010), and weakness in critical thinking experience due to the lack of inductive education in the Iranian context (Birjandi et al., 2018) reduce the natural ability to develop such a skill, as well as the power of induction, especially in adults (Dornyei, 2010). Therefore, the subject of language learning, whether in a foreign or a second language context, complicates the issue of cultivating critical thinking in such a way that its development cannot be limited to learning a few individual skills. In this regard, Davies and Barnett (2015) attempted to bring together a set of perspectives and thus identified three broad perspectives on critical thinking in the literature. These perspectives are often opposed to each other, while also overlapping and significantly merging with one another (Frykholm, 2020; Ledman, 2019; Shpeizer, 2018; Wilson, 2016; Wilson & Howitt, 2018). Shpeizer (2018) also emphasized that this mutual influence and the lack of transparency in the boundaries of each of the three areas have made the concept of critical thinking confusing and perhaps daunting for English teachers.

In addition, understanding the nature and dimensions of critical thinking in order to evaluate it is of crucial importance. Assessing an individual's critical thinking requires a measuring instrument that can precisely determine the true conditions. From a review of the literature, one can find several instruments for measuring students' critical thinking skills and abilities, each with its specific perspectives, definitions of criteria, and priorities. Among these instruments are the California Critical Thinking Skills Test (CCTST) by Facione (1992); the Critical Thinking Questionnaire by Kember et al. (2000); the Ricketts (2003) questionnaire; the Critical Reading Scale by Zhou et al. (2015); and the Critical Thinking Inventory by Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review). The questionnaire designed by Mohammadi et al. (under review), unlike previous tools, addresses all three dimensions of critical thinking (i.e., individual skills, criticality, and critical pedagogy).

Mohammadi et al. (under review), drawing insights from Davies (2015) and Davies and Barnett (2015), propose that critical thinking is composed of personal critical thinking skills together with those skills gained at the criticality level and the critical pedagogy level. The levels, movements, and skills of each of the levels introduced in their study are presented in the figure below.

As shown in Fig. 1, as one moves from the center outward (to the surface), the stages of critical thinking development appear: the process begins with the development of individual critical thinking skills, then the criticality movement, and then the critical pedagogy movement. The figure includes the XY plane (drawn on the x and y axes), indicating the measurement subscales; the YZ plane (drawn on the y and z axes), representing individual and socio-cultural dimensions; and the XZ plane (drawn on the x and z axes), representing the different movements.

Figure 1. The model of critical thinking movements, skills and abilities, and assessment criteria, extracted from Mohammadi et al. (under review).

The model shows that in order to improve critical thinking in a person, it is necessary to consider both individual and socio-cultural aspects. In this figure, the XZ plane represents the various dimensions of critical thinking, the YZ plane represents cognitive-developmental skills, and the XY plane shows the sub-skills of each layer (i.e., the assessment criteria in this study). The aspects and skills of three-dimensional critical thinking previously introduced by Mohammadi et al. (under review) are briefly explained in Table 1.

Critical thinking and criticality are the concepts most interwoven with language skills acquisition in general, and with the processing and development of reading skills in particular. Developing the skills related to each of these two movements, of course, requires critical pedagogy. According to Haji Maibodi (2014), reading comprehension refers to the ability to construct meaning through thinking before, during, and after reading the text, as well as integrating the information presented in the text with one's prior knowledge. She also stated that different types of texts, with different levels of difficulty and various topics, are available to encourage people to read and thus gain new knowledge and strengthen their reading skills. As people go through this process, they realize that in order to understand the texts they read as fully as possible, they have to use thinking skills (Haji Maibodi, 2014); this thinking takes different forms, and implementing each form requires skills that are called critical thinking skills. Haji Maibodi (2014) also emphasized that practical teaching of reading comprehension requires developing the ability to understand, analyze, and recognize the various components of a text.

Reading is viewed as the most crucial academic skill for foreign language learners, one that can facilitate their professional progression, social success, and personal development. Reading is defined by Berardo (2006) as a dynamic and complex phenomenon and is considered a source of language input, since it is a receptive skill requiring interaction among the author of the text, his or her message, and the reader in order for the text to be comprehended. Therefore, in order to read, comprehend, and respond to written content, the reader is expected to have certain skills and abilities, including reading to grasp the message of each line and paragraph, reading to find the relationships between paragraphs, understanding the basic message of the author, and finding the most appropriate response to the idea of the writer (Berardo, 2006). According to Berardo (2006), these stages of reading require readers to apply a higher order of thinking, called "critical reading" by Bloom (1956). According to Hall and Piazza (2008), critical reading is one of the skills that helps learners succeed in academic courses, while it remains vague to many teachers, who usually fail to develop this skill in their students. They argue that if students lack the skill to analyze and challenge written content in the classroom, they will face many problems in understanding and questioning their living environment and society.

Wallace (2003) and Sweet (1993) describe the critical reader as an active reader who is able to ask questions; to recognize, analyze, and confirm evidence; to detect the truth; to understand tone, bias, and persuasion; and to judge these throughout the reading process. Khonamri and Karimabadi (2015) state that in order to read effectively, readers should be able to read with a critical eye, i.e., to read and evaluate a text for its intentions and the reasons behind it, which is the ability to think critically.

Critical reading, as a key player in the development of core language skills, involves activities such as reasoning, questioning, evaluation, comparison, and inference (e.g., Adalı, 2010; Adalı, 2011; Söylemez, 2015). Regarding critical reading, Nemat Tabrizi and Akhavan Saber (2016) emphasized that this skill plays an important role in the formation of democratic societies, since it leads people to decide what they accept as reality only after reviewing, analyzing, and comparing the presented content with their knowledge and the values of their internal and external worlds.

Instrument validation

Measurement validation, in the eyes of Zumbo (2005), is a continuous process in which evidence is collected to support the appropriateness, significance, and usefulness of the inferences derived from scores obtained from a sample. He also emphasizes that the method and process of validation are important in the construction and evaluation of instruments in the social, behavioral, health, and human sciences, since without this process any conclusions or inferences from the obtained scores are meaningless.

Many have argued that the main purpose of the contemporary view is to extend the conceptual framework and power of the traditional vision of validity (Johnson & Plake, 1998; Kane, 2001; Messick, 1989; Messick, 1995), according to which validity is no longer a characteristic of measuring instruments but a characteristic of the inferences made from scores, which can be examined along a continuum (the valid/invalid dichotomy is no longer considered). In this view, construct validity is the single most important feature in validation, and there are only different sources of evidence for proving the validity of inferences. Zumbo (2005) stated that calculating validity using statistical methods such as correlation is not acceptable; it is necessary to provide a detailed theory and support for it, including analysis of covariance matrices between experimental data and a covariance structure model. From the study of previous research, two categories of models are introduced as key for validation: confirmatory factor analysis (CFA), which has a lengthy and rich history in research (for example, Byrne, 1998; Byrne, 2001; Kaplan, 2000), and Multiple Indicators Multiple Causes (MIMIC) models, which have been generalized to linear structural equation models by integrating structural equation modeling and item response theory (Ullman, 2001). The multidimensional and hierarchical representation of the skills needed for critical thinking at each level is primarily based on theoretical reasoning (Davies, 2015; Davies & Barnett, 2015; Frykholm, 2020; Ledman, 2019; Shpeizer, 2018), as mentioned in the previous paragraphs.

Accordingly, this study was an attempt to adapt and establish the validity of the questionnaire proposed by Mohammadi et al. (under review) for measuring the criteria introduced in Fig. 1, XY plane (see Appendix A for the validated version). A review of previous studies showed that earlier research has examined only individual skills and various subskills in this area. None of the studies has provided a comprehensive scale consisting of both individual and socio-cultural factors, or the validation of a common scale for measuring this set of factors. The present study therefore assessed the three-level scale of critical thinking and validates the proposed model. In this study, a measurement and structural model is proposed according to the previous literature and the method of factor analysis. This research is innovative in its use of the partial least squares method (PLS-SEM) and various software packages to validate the proposed model. The PLS method relies on a series of consecutive ordinary least squares (OLS) regressions; thus, it eliminates the need for a normal distribution of observations. OLS regression makes the partial least squares method compatible with small samples, which suits the conditions of this research (Farahani, 2010). On the other hand, given that PLS assumes that all blocks are linear combinations of their indicators, common problems such as nonlinear solutions and factor indeterminacy that occur in covariance-based structural equation modeling (CB-SEM) techniques do not arise (Pirouz, 2006). The researchers aimed to answer the following question:

RQ. To what extent is the newly developed rating description a valid measure of critical thinkers' reading ability?

Methodology

In this study, an attempt was made to validate the three-dimensional critical thinking instrument developed by Mohammadi et al. (under review) for assessing critical thinking in English as a Foreign Language (EFL) readers (Tables 2, 3, 4, 5, and 6).

Participants

In order to answer the research question, 89 Iranian EFL undergraduate students (age range 18 to 35) were selected for the development and validation of a reading-skill-oriented critical thinking measurement instrument. The participants were members of intact classes (with the aim of involving individuals with diverse abilities), and the homogeneity of the classes was assessed via the Preliminary English Test (PET score above 147). Because the participants cooperated with the researchers during different phases of the study, the implementation steps were introduced to them, ethical approval was obtained, participants were assured that personal opinions would not be disclosed to third parties, and the final results were communicated to them.

Instruments

Critical thinking inventory: The CTI, by Mohammadi et al. (under review), contains 59 five-point Likert items measuring the factors of argumentation (15 items), judgment (5 items), disposition (9 items), criticality (12 items), social cognition (9 items), and creativity (9 items). The minimum score on the questionnaire is 59 and the maximum is 295; although the instrument is designed to be completed in 50 minutes, participants were asked to respond within 60 minutes. The CR and AVE were reported in that work as 0.97 and 0.687, respectively (see Table 7).
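As a quick arithmetic check of the stated score range (restating the numbers above rather than adding anything new), with 59 items each scored from 1 to 5:

```latex
\text{minimum} = 59 \times 1 = 59, \qquad \text{maximum} = 59 \times 5 = 295
```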

Preliminary English Test (PET): This test was used to select groups of participants with similar language proficiency. It is an official intermediate English language test (designed by Cambridge ESOL examinations) with a maximum achievable score of 170. The test includes sections on reading (five parts, thirty-five items, scoring range 0–35), writing (three parts, seven items, scoring range 0–15), listening (four parts, twenty-five items, scoring range 0–25), and speaking (four parts of face-to-face interview questions, scoring range 0–25). Two raters were asked to score the test to ensure interrater consistency. An intra-class correlation coefficient (ICC) test was run to determine whether the raters' judgments of the scores agreed. A high degree of reliability was found between the scores (F(88, 88) = 146.08, p < .001), with an average-measure ICC of .982.
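A minimal sketch of this kind of interrater-agreement check, using the pingouin package; the data and column names here are hypothetical, not the study's actual scores:

```python
# Interrater agreement via intraclass correlation (ICC).
# Hypothetical two-rater scores in long format.
import pandas as pd
import pingouin as pg

scores = pd.DataFrame({
    "student": [1, 1, 2, 2, 3, 3, 4, 4],
    "rater":   ["A", "B"] * 4,
    "score":   [140, 142, 155, 153, 148, 150, 161, 158],
})

icc = pg.intraclass_corr(data=scores, targets="student",
                         raters="rater", ratings="score")
# The average-measure row (Type "ICC2k") is comparable to the
# .982 average-measure ICC reported in the study.
print(icc[["Type", "ICC", "F", "pval"]])
```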

Initially, written informed consent was obtained from the participants. Then, the PET was used to ensure the homogeneity of the participants, and those with similar performance were selected for the study. Next, participants were asked to respond to questions to assess the CTI's validity. After the data were collected, the relationships between the elements, skills, and concepts introduced in the questionnaire (see Table 1) were assessed. For this purpose, validity testing of the model was conducted through the CFA method of evaluating and comparing alternative models: CFA of the measurement model (first-order model) and CFA of the structural model (second-order model). In order to increase statistical power, the researchers used predictor variables (i.e., AWC, QAR, classic instruction), considered fewer operating levels for continuous variables, used continuous variables instead of polarizing or grouping them, defined focused hypothesis tests, crossed the extracted factors, and so on, as described in Mohammadi et al. (under review). Scale validation in this study used the PLS-SEM analysis technique, owing to the non-normal distribution of the data gathered (Afthanorhan, 2013), and model validation included the following tests:

Analysis of convergent validity

Test of discriminant validity

Test of construct validity.

Data analysis

After the data from the designed inventory were collected in SPSS, the data related to the validity of the questionnaire were transferred to SmartPLS software to validate the proposed model through model validation techniques (FIT, GFI, RMR, etc.), SEM, and CR and AVE estimation. The datasets generated and analyzed during the current study are not publicly available due to ethical issues and privacy restrictions.
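The study itself used SmartPLS, a GUI application. As an open-source illustration of the same general CFA workflow (not the authors' actual pipeline), a lavaan-style measurement model can be fitted in Python with the semopy package; the factor and item names below are hypothetical stand-ins for the CTI factors:

```python
# Illustrative CFA sketch with semopy -- an open-source stand-in for the
# SmartPLS workflow described above, NOT the authors' actual pipeline.
import pandas as pd
import semopy

# lavaan-style measurement model: three of the six CTI factors shown,
# with hypothetical item names.
MODEL_DESC = """
Argumentation =~ arg1 + arg2 + arg3
Judgment      =~ jud1 + jud2 + jud3
Criticality   =~ cri1 + cri2 + cri3
"""

df = pd.read_csv("cti_responses.csv")  # hypothetical file: one column per item

model = semopy.Model(MODEL_DESC)
model.fit(df)

print(model.inspect())           # loadings, standard errors, p-values
print(semopy.calc_stats(model))  # fit indices: chi2, CFI, TLI, RMSEA, GFI, AGFI, ...
```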

In order to find the answer to the research question, a CFA-based approach was used as an MTMM technique to estimate the validity of the designed instrument (Bentler, 2000 ; Byrne, 2006 ; Koch et al., 2018 ). For this purpose, different types of validity of the developed inventory were evaluated.

Internal validity

Face validity: Face validity depends on the judgment of the test constructor and was approved according to the advisor's opinion.

Content validity: Content validity examines various aspects of the construct; it was confirmed by the advisor.

Criterion-related validity (both concurrent and predictive validity): To appraise predictive validity, the instrument should be evaluated over a long period, for example once at the beginning of the undergraduate course and again at the end of the fourth year, comparing its predicted results with the actual results. To measure concurrent validity, it is necessary to examine the tool with completely different content and with a completely different group of learners (at the same time).

Construct validity: This category focuses on the structure of the questionnaire. In order to measure the next three criteria, SmartPLS software was used.

Convergent validity: Estimation of CR and AVE

Discriminant (divergent) validity: Confirmatory factor analysis (t value)

Construct validity: Model validation (SRMR)

In examining the introduced validity criteria, (a) the suitability of the factor loadings, (b) the structural equation model, and (c) the goodness of fit were investigated, as follows:

At the beginning, in order to investigate the effect of the items and factor loadings in measuring the desired construct, the model parameters and their significance were calculated (Fig. 2).

Figure 2. Measurement model test.

All factor loadings are greater than 0.4 and significant; therefore, the studied items have a significant effect on the measurement of the construct (Table 2).

The model parameter table shows that the p values and t values are, respectively, less than .001 and greater than 1.96, representing good values. In the following table, the measures of the overall hypothesized fitted model (i.e., goodness-of-fit indicators) are calculated (Table 3).

According to the results, both the GFI and AGFI values are greater than 0.80; the RMR values are close to .00; the χ2/df ratios are less than 5; and the RMSEA estimates are less than 0.08, indicating reasonable approximation errors in the population. All indicators are therefore within the desired range, so the results of the model can be trusted and used. It should be noted that variables with fewer than three items cannot be fitted, and accurate calculation of their indicators is not possible. In the following, the results of the detailed analysis of the model and the determination of validity indicators are presented.
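The acceptance rules applied in this paragraph can be written down directly. A small sketch encoding the thresholds stated above; note that "RMR close to .00" is operationalized here as RMR < .05, which is an assumption, since the text gives no exact cut-off:

```python
# Acceptance thresholds as stated in the text (RMR cut-off assumed).
def fit_is_acceptable(gfi: float, agfi: float, rmr: float,
                      chi2_over_df: float, rmsea: float) -> bool:
    return (gfi > 0.80 and agfi > 0.80 and rmr < 0.05
            and chi2_over_df < 5 and rmsea < 0.08)

# Hypothetical example values:
print(fit_is_acceptable(gfi=0.91, agfi=0.88, rmr=0.03,
                        chi2_over_df=2.4, rmsea=0.06))  # True
```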

Next, the data analysis algorithm in SmartPLS software is displayed. In this algorithm, after model formation and confirmatory factor analysis, the structural model is examined in three areas:

Measurement model test: To evaluate the validity and reliability of each construct, the AVE (average variance extracted) and CR (composite reliability) are calculated, respectively (Table 4).
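For the standardized loadings λᵢ of the n items of a construct, these two quantities are conventionally computed as follows (standard formulas, stated here for reference rather than taken from the paper):

```latex
\mathrm{CR} = \frac{\left(\sum_{i=1}^{n}\lambda_i\right)^{2}}
                   {\left(\sum_{i=1}^{n}\lambda_i\right)^{2} + \sum_{i=1}^{n}\left(1 - \lambda_i^{2}\right)},
\qquad
\mathrm{AVE} = \frac{\sum_{i=1}^{n}\lambda_i^{2}}{n}
```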

According to the results, the validity criterion exceeds 0.4 and the reliability criterion for each construct is close to 0.7, so it can be said that in terms of convergent validity all constructs are in the desired range (Fig. 3).

Structural equation modeling: The results of the confirmatory factor analysis of the model are presented in Fig. 3.

Figure 3. Structural equation modeling results.

All items have a significant effect (p < 0.001) on the construct, showing that the items related to each construct measure the intended construct well (Table 5).

The estimation of the model parameters shows that the p values are below .001 and the t values are greater than 1.96, meaning that each path is significant at the .05 level and that its estimated path parameter has a significant effect on the construct (Ullman, 2001). This shows that the items related to each variable measure the intended construct well.

Goodness of fit: For the confirmatory factor analysis (CFA), used as an MTMM technique to assess the divergent validity of the model, goodness-of-fit indices were estimated as follows (Table 6):

According to the obtained indicators, the AGFI is greater than 0.80, the χ2/df ratio is less than 3, the RMSEA value is less than .08, and the CFI is greater than .95, indicating a satisfactory fit. All in all, it can be concluded that the indicators are in the desired range and the results of the model are reliable. Finally, the results of the confirmatory factor analysis confirm the relationships and structure of the model, establishing the validity and reliability of the construct (Table 7):

Investigation of the significance of the covariance relations also shows that all covariance relationships between constructs have a p value less than the 0.05 error level, so the relationships are significant. The advantage of composite reliability over Cronbach's alpha is that the reliability of constructs is not computed in isolation; rather, it is obtained by evaluating the correlations of the constructs with each other, and indicators with higher factor loadings carry more weight. Therefore, both criteria are used to better measure the reliability of this type of model. Moreover, the common measure for establishing convergent validity at the construct level is the average variance extracted (AVE). This criterion is defined as the share of variance captured by a construct. An acceptable value for CR is over .70, and an excellent value for AVE is over .50.
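A small sketch implementing these two statistics from a construct's standardized loadings, using the conventional formulas given earlier; the loading values are hypothetical:

```python
# Composite reliability (CR) and average variance extracted (AVE)
# from standardized factor loadings. Loading values are hypothetical.
def composite_reliability(loadings):
    s = sum(loadings)
    return s**2 / (s**2 + sum(1 - l**2 for l in loadings))

def average_variance_extracted(loadings):
    return sum(l**2 for l in loadings) / len(loadings)

lams = [0.72, 0.65, 0.81, 0.58]
print(composite_reliability(lams))        # ~0.787: above the .70 threshold
print(average_variance_extracted(lams))   # ~0.483: just under the .50 ideal
```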

Considering that the second generation of structural equation modeling is based on the variance approach, and in order to verify the covariance values and provide a complete report, the covariance relationships in this model were also examined and the results reported (Table 8).

As the results show, all covariance relationships between constructs have a p value less than the 0.05 error level and a t value greater than 1.96, meaning that the relationships between the latent variables are significant.

Campbell and Fiske (1959) and Langer et al. (2010) stated that CFA is an analysis of construct validity. Putting the results observed in steps 2 and 3 together, it can be concluded that the absolute fit indices, parsimony fit indices, and incremental fit indices all have desirable values in the model; this theoretical model is consistent with its empirical model, and therefore the divergent validity of the construct is confirmed. The results of calculating the reliability of the inventory were presented in the "Instruments" section. Therefore, combining the results of the covariance analysis with the three-level analyses, it can be seen that the questionnaire is valid and reliable.

Since there is little agreement on the nature and dimensions of the term critical thinking (Facione et al., 2000; Frykholm, 2020), the researchers of this study decided to provide a comprehensive picture of its various dimensions and to develop a valid tool for its measurement. Frykholm (2020) believes that no educator has proposed a comprehensive definition and model of critical thinking, and it can be said that most previous studies have focused on only a few limited critical thinking skills. However, the results of the interviews in the first phase of this study (Mohammadi et al., under review) clearly showed that the socio-cultural dimensions are as important as the individual skills dimension, if not more so. By relating the proposed model of the present study to the models of Davies (2015) and Ledman (2019), it can be inferred that the comprehensive model is well suited to the set of skills, judgments, and activities (especially for the investigating and questioning tasks of receptive skills) as well as to the expression of desires or attitudes (expressing ideas, creativity, analysis, and other productive skills). In review, the main objectives of this study were to investigate the validity of the items and components of the model, as well as the validity of the tool designed by Mohammadi et al. (under review) to assess three-dimensional critical thinking in EFL readers, on the basis of which the following results were identified.

Examining the values obtained from the data, it was observed that the data distribution was not normal. Therefore, in order to assess the validity of the tool, confirmatory factor analysis (CFA) with PLS-SEM was carried out in SmartPLS software, because this method is suitable for non-normal data (Hair et al., 2014) and makes it possible to examine complex models with multiple exogenous and endogenous constructs as well as index variables (Hair Jr. et al., 2010; Hair et al., 2014; Hair et al., 2019). The study of the structural equation model and covariance relationships, and also of the model evaluation indices, clearly showed that the components were selected correctly, the relationships between the components of the model were defined properly, and the questionnaire items were well designed; in this way, the study reached its objectives.

The six-factor, twenty-two-item scale proposed by Mohammadi et al. (under review) was validated using a hybrid technique, mainly because of the non-normally distributed data. The results indicated that the PLS-SEM CFA provided the best fit to the proposed model in terms of factor loadings. The findings of the first phase of this study indicated the validity of the factors introduced in the three-level model of critical thinking. From the results obtained in this phase, it can be seen that focusing on all the skills and abilities introduced (i.e., argumentation, judgment, disposition, action, social cognition, and creativity) is important in developing critical thinking in English readers.

Discussing the elements of the first movement: comparing the criteria introduced in this study with those mentioned in Kaewpet (2018), it can be said that the same measures were mentioned by EFL learners. Focusing on the factors of judgment, the elements of buck-passing and vigilance were extracted, which were also mentioned by Mann et al. (1997). They additionally referred to hypervigilance and defensive avoidance, which were not mentioned by the EFL learners. The last skill of the first movement was disposition, which was assessed based on innovativeness, maturity, and engagement, as introduced by Ricketts (2003).

The second movement in developing critical thinking concerned criticality, which learners described in terms of habitual action, understanding, reflection, and critical reflection. These factors were also used by Kember et al. (2000). The findings of this section, contrary to the view of Shpeizer (2018), in which the two concepts introduced in the first and second movements were considered the same without notable distinctions, clearly showed that the second movement involves the development of critical actions (and the introduced sub-actions) in individuals, while the first movement does not focus on the development of action skills. The findings of this study also confirm the views of Wilson and Howitt (2018), who acknowledged that critical thinking in this movement is self-centered and manifests itself in the form of introspection, self-adjustment, and metacognition. The set of abilities acquired at this stage makes a person a successful learner, specialist, and scholar, while the first movement focuses on the application of rational-argumentative thinking in the form of training methods, with the aim of improving exactness, proficiency, and creativity in individuals. Similarly, Ledman (2019) considers this dimension as disciplinary criticality, through which thinking tools and habits of mind promote epistemological structures.

The third movement in this study, namely the critical pedagogy movement, was composed of the two layers of social cognition and creativity. The first layer was assessed based on factors such as social competence, literacy, cultural competence, and extraversion. The findings of this section are very similar to the criteria of Pishghadam et al. (2011), in which social competence, social solidarity, literacy, cultural competence, and extraversion were introduced as basic criteria for measuring social cognition. However, these findings contrast with the criteria introduced by Pishvaei and Kasaian (2013), which include tenets of monolingualism, monoculturalism, native-speakerism, the native teacher, native-like pronunciation, and the authenticity of native-designed materials. Reasons for such a difference may include the nature of the classes, the objectives of the courses, and the interlocutors/participants. These findings are consistent with the works of Davies (2015) and Davies and Barnett (2015), who predicted that critical thinking is not limited to individual critical thinking skills; other dimensions, such as socio-cultural dimensions and critical pedagogy, should also be considered. The last layer was creativity, which was assessed based on the factors of fluency, flexibility, originality, and elaboration, as also mentioned by O'Neil et al. (1992) and Abedi (2002).

Discussing this movement, the introduced elements of this dimension confirm the orientations taken by Davies (2015), Davies and Barnett (2015), Rahimi and Asadi Sajed (2014), and Shpeizer (2018), according to which critical pedagogy has an impact on critical thinking. According to Shpeizer (2018), the fundamental difference between the school of critical thinking discourse and that of critical pedagogy lies in the contrast between the sociocultural, political, and moral tendencies of the latter and the apparently neutral tendencies of the school of critical thinking. According to Shpeizer (2018) and Freire (1993), in critical pedagogy it is not possible to separate epistemology from politics, and where a critical approach exists, people's awareness of power relations and the structural inequalities of their societies will be aroused. Shpeizer (2018) adds that advocates of critical thinking believe this approach is incompatible, inconsistent, and hazardous, since it initially creates uncertain assumptions about a society and thus diverts us from the path of truth-seeking and enlightenment required of a critical thinker. Perhaps the main reason for the slow and one-dimensional movement of critical thinking over the years can be found in this point. According to Shpeizer (2018) and Rahimi and Asadi Sajed (2014), the proponents of critical pedagogy argue that since social, political, and educational structures in different societies have hitherto run in an inequitable and oppressive manner, disregarding such conditions (which undoubtedly shape the lives and thoughts of individuals) makes objective critical development, and consequently the progress of community members and communities, impossible. They emphasized that to develop critical pedagogy, it is not possible to teach rational and critical thinking skills and dispositions without regard to other dimensions, such as cultural, political, and religious awareness. The findings are also in line with Ledman (2019), who states that moral education (the name chosen for the third dimension) emphasizes the need to develop the capacity for moral thinking and judgment independent of official orders and requirements. Finally, by matching the findings of this study with those of Davies (2015) and Mohammadi et al. (under review), it can be concluded that critical thinking can be defined in three complementary layers: critical thinking skills, criticality, and critical pedagogy. The more one strives to become more capable of thinking critically, the more one moves from gaining initial personal skills (critical thinker) to socio-cultural skills (critical being).

Regarding the methodology of the study, as explained, because the distribution of data obtained from the questionnaire was not normal, the PLS-SEM method was used as a confirmatory factor analysis (CFA) technique. The validation of the model used in this study is based on the theoretical and experimental concepts developed in the previous study (Mohammadi et al., under review). The model validity test was performed within the SEM framework of CFA, in which the first-order model (the measurement model) and the second-order model (the structural model) were investigated. Examination of the absolute values of skewness and kurtosis, as well as the data distribution, showed that the distribution was not normal; therefore, PLS-SEM confirmatory factor analysis was performed to determine the structural validity of the scale (Mâtă et al., 2020). In addition, this modeling approach is suitable for complex models with multiple endogenous and exogenous constructs and index variables (Hair et al., 2014). Also, because the sample size in this study exceeds the minimum recommended value (i.e., 50), this was considered the most appropriate method for model analysis (Mâtă et al., 2020).
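A sketch of the per-item normality screening described here; the |value| < 2 rule of thumb is an assumption, since the paper does not report its exact cut-offs, and the data file is hypothetical:

```python
# Per-item normality screening via skewness and kurtosis,
# as described above. Cut-offs and file name are illustrative.
import pandas as pd
from scipy.stats import skew, kurtosis

df = pd.read_csv("cti_responses.csv")  # hypothetical item-response data

screen = pd.DataFrame({
    "skew": df.apply(skew),
    "excess_kurtosis": df.apply(kurtosis),  # Fisher definition: 0 for a normal
})
screen["roughly_normal"] = screen.abs().lt(2).all(axis=1)
print(screen)  # items failing the screen motivate PLS-SEM over CB-SEM
```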

The results of this study carry several implications. First, the study investigated a framework for assessing EFL readers' critical thinking along the three main streams of individual skills, critical pedagogy, and criticality, and showed that each stream contains criteria that can be used to assess learners' abilities. Second, students were interviewed in different phases of the study and offered a variety of views not only on their attitudes toward critical thinking but also on their perceptions of the instructional approaches and the strengths and weaknesses of each, which can inform the design and implementation of critical thinking training sessions. Third, the review of previous literature on three-dimensional critical thinking provided a comprehensive overview of its strengths and weaknesses, as well as of its supporters and opponents, and the findings of this study validate the studies endorsing a three-dimensional approach to critical thinking under any heading. Fourth, the concepts presented here can help research and academic institutions identify the most suitable training methods for each critical thinking sub-skill in different societies. Given that this study was conducted only in the field of English language teaching and in a university context, applying it in other educational settings and with people of different academic backgrounds, and identifying differences in how the various instructional approaches serve each sub-skill, would be very informative. Both concepts (i.e., three-dimensional critical thinking and reading) can also be applied in other areas to assess the generalizability of the findings. An interesting finding was that students engaged in group discussions sometimes reverted to their first language, which could be a consequence of limited language proficiency. In such circumstances, Lun et al. (2010) have suggested that, to promote critical thinking, the emphasis on language processing should be reduced, or, following Ko (2013), teachers should first describe the task in order to prepare students and initiate the analysis, and then ask them to complete it. Finally, the validity of the instrument proposed in the previous study (Mohammadi et al., Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Web-based Collaborative vs. Question-Answer-Relationship Instructional Approach, under review) was evaluated through structural equation modeling, a method with a still limited history in language studies; this study showed that the method can be used for path analysis/regression, repeated measures analysis/latent change modeling, and confirmatory factor analysis.

This study was designed and conducted to confirm the subscales introduced by Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Web-based Collaborative vs. Question-Answer-Relationship Instructional Approach, under review) for determining critical thinking ability at three different layers (i.e., individual critical thinking skills, criticality, and critical pedagogy) by assessing the validity of the proposed questionnaire. The model examined here confirmed the relationships among the factors identified in previous studies, and the proposed model, with six scales and twenty-two subscales, showed a good fit, indicating that argumentation, judgment, disposition, action, social cognition, and creativity are appropriate components for measuring three-level critical thinking in language learners.
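
To make the shape of such a model concrete, the sketch below specifies a second-order measurement model in which six first-order factors load on a single higher-order critical thinking factor. It is only an illustration of the structure, under stated assumptions: the item names are hypothetical stand-ins (the full twenty-two subscales are not shown), and semopy implements covariance-based SEM rather than the PLS-SEM estimation actually used in the study.

```python
# Illustrative second-order CFA in lavaan-style syntax via semopy
# (covariance-based SEM; the study itself used PLS-SEM).
# Factor and item names are hypothetical stand-ins for the CTI subscales.
import pandas as pd
import semopy

MODEL_DESC = """
# First-order factors (the six scales)
argumentation    =~ arg1 + arg2 + arg3
judgment         =~ jud1 + jud2
disposition      =~ dis1 + dis2 + dis3
action           =~ act1 + act2
social_cognition =~ soc1 + soc2
creativity       =~ cre1 + cre2

# Second-order factor: three-dimensional critical thinking
critical_thinking =~ argumentation + judgment + disposition + action + social_cognition + creativity
"""

data = pd.read_csv("cti_responses.csv")  # hypothetical item-level data

model = semopy.Model(MODEL_DESC)
model.fit(data)

# Fit indices (CFI, TLI, RMSEA, GFI, AGFI, ...) for the fitted model
print(semopy.calc_stats(model).T)
```

The lavaan-style specification is shown only because it states the hypothesized six-factor, second-order structure compactly; in the study itself the corresponding model was estimated with PLS-SEM because of the non-normal data.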

The results of assessing the validity of the CTI through CFA showed that the absolute fit indices, parsimony fit indices, and incremental fit indices all reached desirable values, and that the proposed model is consistent with its empirical counterpart; that is, the divergent validity of the construct is confirmed. Therefore, combining the results of the covariance analysis, the three-level analyses, and the reliability calculations, the questionnaire can be considered valid and reliable. This indicates that a critically-thinking EFL reader is an individual with the ability to construct argumentation (i.e., to find relevance, provide reasoning, recognize language use, comprehend the text's organization, and distinguish the author's voice), to make judgments (i.e., buck-passing and vigilance), to show dispositions (i.e., to innovate, be mature, and engage in activities), to act (i.e., to form habitual actions, to understand, to be reflective, and to reflect critically on issues), to have social cognition (i.e., social competence, literacy, cultural competence, and extroversion), and to be creative (i.e., to elaborate, be flexible, show fluency, and propose original ideas).
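
As a sketch of how such a fit-index check can be automated, the snippet below compares reported indices against widely cited cutoffs (e.g., CFI/TLI ≥ 0.90, RMSEA ≤ 0.08, SRMR ≤ 0.08; cf. Hair et al., 2010). The index values shown are placeholders, not the study's results, and the cutoffs are conventional rules of thumb rather than values taken from this article.

```python
# Sketch: evaluate model fit indices against conventional cutoffs.
# Values below are placeholders; thresholds follow common rules of thumb
# (e.g., Hair et al., 2010), not figures reported in this article.

# (index, value, threshold, comparison): "ge" means value must be >= threshold
CHECKS = [
    ("GFI",   0.92, 0.90, "ge"),   # absolute fit
    ("CFI",   0.95, 0.90, "ge"),   # incremental fit
    ("TLI",   0.94, 0.90, "ge"),   # incremental fit
    ("RMSEA", 0.05, 0.08, "le"),   # absolute fit (badness-of-fit index)
    ("SRMR",  0.06, 0.08, "le"),   # absolute fit (badness-of-fit index)
]

def acceptable(value: float, threshold: float, comparison: str) -> bool:
    return value >= threshold if comparison == "ge" else value <= threshold

for name, value, threshold, comparison in CHECKS:
    verdict = "OK" if acceptable(value, threshold, comparison) else "POOR"
    op = ">=" if comparison == "ge" else "<="
    print(f"{name:6} = {value:.2f}  (want {op} {threshold:.2f})  {verdict}")
```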

Future research can explore the extent and manner of internalization of the introduced skills and the effectiveness of different internalization methods. In addition, it should be noted that this study examined the views of language learners; the introduced criteria also need to be examined from the point of view of teachers and administrators in order to answer questions such as the following: Are teachers' perceptions different from students'? If so, what are the differences? What are effective strategies for teaching these criteria? Such research could also determine whether students, teachers, and planners share the same understanding of the concepts and of the strategies used in the classroom, and whether their understanding of the introduced criteria is the same in the first language as in the second. Moreover, given the distribution of the data gathered in this study, factor analysis with a partial least squares (PLS) approach was used; subsequent researchers can use other analysis programs, such as LISREL or AMOS, for structural analysis with larger populations.

Finally, it must be mentioned that generalizing the results of this study to other fields and research communities is not possible, owing to the limited number of participants and the study's specific field; it is recommended that the necessary research first be conducted to apply this scale in different educational fields and societies so as to strengthen its generalizability.

Availability of data and materials

The datasets generated and analyzed during the current study are not publicly available due to ethical issues and privacy restrictions, but are available from the corresponding author on reasonable request.

Abbreviations

3D: Three dimensional

AGFI: Adjusted Goodness of Fit Index

APA: American Philosophical Association

Argumentation

AVE: Average variance extracted

CB-SEM: Covariance-based structural equation modeling

CFA: Confirmatory factor analysis

CFI: Comparative Fit Index

CR: Composite reliability

CTI: Critical thinking inventory

Cultural competence

EFL: English as a Foreign Language

GFI: Goodness of Fit Index

MIMIC: Multiple indicators multiple causes

OLS: Ordinary least squares

PET: Preliminary English Test

PLS-SEM CFA: Partial Least Squares-Structural Equation Modeling Confirmatory Factor Analysis

Reflective thinking

RMR: Root mean squared residual

RMSEA: Root mean square error of approximation

SEM: Structural equation modeling

Social cognition

Social competence

SPSS: Statistical Package for the Social Sciences

SRMR: Standardized root mean squared residual

TLI: Tucker-Lewis Index

Abedi, J. (2002). Standardized achievement tests and English language learners: psychometrics issues. Educational Assessment , 8 , 231–257. https://doi.org/10.1207/S15326977EA0803_02 .

Adalı, O. (2010). Interactive and critical reading techniques . Toroslu Library.

Adalı, O. (2011). Understand and tell . Pan Publishing.

Afthanorhan, W. M. A. (2013). A comparison of partial least square structural equation modeling (PLS-SEM) and covariance based structural equation modeling (CB-SEM) for confirmatory factor analysis. International Journal of Engineering Science and Innovative Technology , 2 , 198–205.

Airasian, P. W., Anderson, L. W., Krathwohl, D. R., & Bloom, B. S. (2001). A taxonomy for learning, teaching, and assessing: a revision of Bloom’s taxonomy of educational objectives . Longman.

Akyuz, H. I., & Samsa, S. (2009). The effects of blended learning environment on the critical thinking skills of students. Procedia Social and Behavioral Sciences , 1 , 1744–1748.

Bag, H. K., & Gürsoy, E. (2021). The effect of critical thinking embedded English course design to the improvement of critical thinking skills of secondary school learners. Thinking Skills and Creativity , 41 . https://doi.org/10.1016/j.tsc.2021.100910 .

Bali, M. (2015). Critical thinking through a multicultural lens: cultural challenges of teaching critical thinking. In M. Davies, & R. Barnett (Eds.), The Palgrave handbook of critical thinking in higher education , (pp. 317–334). Palgrave Macmillan.

Bentler, P. M. (2000). Rites, wrongs, and gold in model testing. Structural Equation Modeling , 7 , 82–91. https://doi.org/10.1207/S15328007SEM0701_04 .

Berardo, S. (2006). The use of authentic materials in the teaching of reading. The Reading Matrix , 6 (2), 60–69.

Birjandi, P., Bagheri, M. B., & Maftoon, P. (2018). The place of critical thinking in Iranian Educational system. Foreign Language Research Journal , 7 (2), 299–324. https://doi.org/10.22059/JFLR.2017.236677.353 .

Bloom, B. S. (1956). Taxonomy of educational objectives, handbook 1: Cognitive domain . Longmans Green.

Brookfield, S. (2019). Using discussion to foster critical thinking. In D. Jahn, A. Kergel, & B. Heidkamp-Kergel (Eds.), Kritische Hpschullehre. Diversitat und bildung im digitalen zeitalter , (pp. 135–151). Springer. https://doi.org/10.1007/978-3-658-25740-8_7 .

Byrne, B. M. (1998). Structural equation modeling with LISREL, PRELIS and SIMPLIS: basic concepts, applications and programming . Lawrence Erlbaum.

Byrne, B. M. (2001). Structural equation modeling with AMOS: basic concepts, applications, and programming . Lawrence Erlbaum Associates Publishers.

Byrne, B. M. (2006). Structural equation modeling with EQS: basic concepts, applications, and programming (2nd ed.). Lawrence Erlbaum Associates Publishers.

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin , 56 , 81–105.

Chamot, A. U. (2004). Issues in language learning strategy research and teaching. Electronic Journal of Foreign Language Teaching , 1 , 14–26.

Crenshaw, P., Hale, E., & Harper, S. L. (2011). Producing intellectual labor in the classroom: the utilization of a critical thinking model to help students take command of their thinking. Journal of College Teaching & Learning , 8 (7), 13–26. https://doi.org/10.19030/tlc.v8i7.4848 .

Davies, M. (2015). A model of critical thinking in higher education. In M. B. Paulsen (Ed.), Higher Education: Handbook of Theory and Research , (vol. 30, pp. 41–92). Springer International Publishing.

Davies, M., & Barnett, R. (2015). Introduction. In M. Davies, & R. Barnett (Eds.), The Palgrave handbook of critical thinking in higher education , (pp. 1–26). Palgrave Macmillan.

Dekker, T. J. (2020). Teaching critical thinking through engagement with multiplicity. Thinking Skills and Creativity , 37 . https://doi.org/10.1016/j.tsc.2020.100701 .

Din, M. (2020). Evaluating university students’ critical thinking ability as reflected in their critical reading skill: a study at bachelor level in Pakistan. Thinking Skills and Creativity , 35 , 1–11. https://doi.org/10.1016/j.tsc.2020.100627 .

Dornyei, Z. (2010). Researching motivation: from integrativeness to the ideal L2 self. Introducing applied linguistics. Concepts and skills , 3 (5), 74–83.

Eftekhary, A. A., & Besharati Kalayeh, K. (2014). The relationship between critical thinking and extensive reading on Iranian intermediate EFL learners. Journal of Novel Applied Sciences , 3 (6), 623–628.

Ennis, R. (1991). Critical thinking: a streamlined conception. Teaching Philosophy , 14 (1), 5–24.

Ennis, R. H. (2011). Critical thinking: reflection and perspective, Part I. Inquiry: Critical Thinking across the Disciplines , 26 (1), 4–18. https://doi.org/10.5840/inquiryctnews20112613 .

Facione, P. A. (1990). Executive summary of critical thinking: a statement of expert consensus for purposes of educational assessment and instruction . The California Academic Press.

Facione, P. A. (1992). Critical thinking: what it is and why it counts . Insight Assessment and the California Academic Press Retrieved May 2019 from http://www.student.uwa.edu.au .

Facione, P. A., Facione, N. C., & Giancarlo, C. A. (2000). The disposition toward critical thinking: its measurement, and relationship to critical thinking skill. Informal Logic , 20 (1), 61–84. https://doi.org/10.22329/il.v20i1.2254 .

Fahim, M., & Ahmadian, M. (2012). Critical thinking and Iranian EFL context. Journal of Language Teaching and Research , 3 (4), 793–800. https://doi.org/10.4304/jltr.3.4.793-800 .

Farahani, H. A. (2010). A comparison of partial least squares (PLS) and ordinary least squares (OLS) regressions in predicting of couples’ mental health based on their communicational patterns. Procedia Social and Behavioral Sciences , 5 , 1459–1463. https://doi.org/10.1016/j.sbspro.2010.07.308 .

Freire, P. (1993). Pedagogy of the oppressed . Continuum.

Frykholm, J. (2020). Critical thinking and the humanities: a case study of conceptualizations and teaching practices at the Section for Cinema Studies at Stockholm University. Arts and Humanities in Higher Education , 20 (3), 253–273. https://doi.org/10.1177/1474022220948798 .

Gear, A. (2006). Reading power: teaching students to think while they read . Pembroke.

Hair, J., Hult, G. T. M., Ringle, C., & Sarstedt, M. (2014). A primer on partial least squares structural equation modeling (PLS-SEM) . SAGE Publications.

Hair, J. F., Risher, J. J., Sarstedt, M., & Ringle, C. M. (2019). When to use and how to report the results of PLS-SEM. European Business Review , 31 (1), 2–24. https://doi.org/10.1108/EBR-11-2018-0203 .

Hair Jr., J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Pearson Prentice-Hall.

Haji Maibodi, A. (2014). The effect of critical thinking skills on reading English novels. Research in English Language Pedagogy , 2 (2), 97–108.

Haji Maibodi, A., & Fahim, M. (2012). The impact of critical thinking in EFL/ESL literacy. The Iranian EFL Journal , 8 (3), 24–44.

Hall, L., & Piazza, S. (2008). Critically reading texts: what students do and how teachers can help. Reading Teacher , 62 (1), 32–41. https://doi.org/10.1598/RT.62.1.14 .

Hawkins, K. T. (2012). Thinking and reading among college undergraduates: an examination of the relationship between critical thinking skills and voluntary reading (Unpublished Doctoral Dissertation) University of Tennessee, USA.

Heidari, K. (2020). Critical thinking and EFL learners’ performance on textually-explicit, textually-implicit, and script-based reading items. Thinking Skills and Creativity , 37 . https://doi.org/10.1016/j.tsc.2020.100703 .

Hemmati, M. R., & Azizmalayeri, F. (2022). Iranian EFL teachers’ perceptions of obstacles to implementing student-centered learning: a mixed-methods study. International Journal of Foreign Language Teaching and Research , 10 (40), 133–152. https://doi.org/10.30495/JFL.2022.686698 .

Hosseini, E., Bakhshipour Khodaei, F., Sarfallah, S., & Dolatabad, H. R. (2012). Exploring the relationship between critical thinking, reading comprehension and reading strategies of English university students. World Applied Sciences Journal , 17 (10), 1356–1364.

Imperio, A., Staarman, J. K., & Basso, D. (2020). Relevance of the socio-cultural perspective in the discussion about critical thinking. Journal of Theories and Research in Education , 15 (1), 1–19. https://doi.org/10.6092/issn.1970-2221/9882 .

Johnson, J. L., & Plake, B. S. (1998). A historical comparison of validity standards and validity practices. Educational and Psychological Measurement , 58 , 736–753. https://doi.org/10.1177/0013164498058005002 .

Kaewpet, C. (2018). Quality of argumentation models. Theory and Practice in Language Studies , 8 (9), 1105–1113. https://doi.org/10.17507/TPLS.0809.01 .

Kahsay, M. T. (2019). EFL students’ epistemological beliefs and use of cognitive and metacognitive strategies in Bahir Dar University. International Journal of Foreign Language Teaching & Research , 7 (26), 69–83.

Kane, M. T. (2001). Current concerns in validity theory. Journal of Educational Measurement , 38 , 319–342. https://doi.org/10.1111/j.1745-3984.2001.tb01130.x .

Kaplan, D. (2000). Structural equation modeling: foundations and extensions . Sage Publications.

Kember, D., Leung, D. Y. P., Jones, A., Loke, A. Y., McKay, J., Sinclair, K., & Yeung, E. (2000). Development of a questionnaire to measure the level of reflective thinking. Assessment and Evaluation in Higher Education , 25 (4), 382–395. https://doi.org/10.1080/713611442 .

Khonamri, F., & Karimabadi, M. (2015). Collaborative strategic reading and critical reading ability of intermediate Iranian learners. Theory and Practice in Language Studies , 5 (7), 1375–1382. https://doi.org/10.17507/tpls.0507.09 .

Khorasani, M. M., & Farimani, M. A. (2010). The analysis of critical thinking in Fariman’s teachers and factors influencing it. Journal of Social Science of Ferdowsi University , 6 (1), 197–230.

Ko, M. Y. (2013). Critical literacy practices in the EFL context and the English language proficiency: further exploration. English Language Teaching , 6 (11), 17–28.

Koch, T., Eid, M., & Lochner, K. (2018). Multitrait-multimethod-analysis: the psychometric foundation of CFA-MTMM models. In P. Irwing, T. Booth, & D. J. Hughes (Eds.), The Wiley Handbook of Psychometric Testing: A Multidisciplinary Reference on Survey, Scale and Test Development , (vol. VII, pp. 781–846). Wiley-Blackwell Publishing Ltd.

Krathwohl, D. R. (2002). A revision of Bloom’s taxonomy: an overview. Theory into Practice , 41 , 212–218.

Langer, D. A., Wood, J. J., Bergman, R. L., & Piacentini, J. C. (2010). A multitrait–multimethod analysis of the construct validity of child anxiety disorders in a clinical sample. Child Psychiatry Hum Dev , 41 , 549–561. https://doi.org/10.1007/s10578-010-0187-0 .

Ledman, K. (2019). Discourses of criticality in Nordic countries’ school subject civics. Journal of Humanities and Social Science Education, 3 , 149–167.

Li, L. (2011). Obstacles and opportunities for developing thinking through interaction in language classrooms. Thinking Skills and Creativity , 6 (3), 146–158. https://doi.org/10.1016/j.tsc.2011.05.001 .

Li, L. (2016). Thinking skills and creativity in second language education: where are we now? Thinking Skills and Creativity , 22 , 267–272. https://doi.org/10.1016/j.tsc.2016.11.005 .

Liang, W., & Fung, D. (2020). Fostering critical thinking in English-as-a-second-language classrooms: challenges and opportunities. Thinking Skills and Creativity , 2020 . https://doi.org/10.1016/j.tsc.2020.100769 .

Lin, M., Preston, A., Kharrufa, A., & Kong, Z. (2016). Making L2 learners’ reasoning skills visible: the potential of computer supported collaborative learning environment. Thinking Skills and Creativity , 22 , 303–322. https://doi.org/10.1016/j.tsc.2016.06.004 .

Liu, F., & Stapleton, P. (2014). Counterargumentation and the cultivation of critical thinking in argumentative writing: investigating washback from a high-stakes test. System , 45 , 117–128. https://doi.org/10.1016/j.system.2014.05.005 .

Lun, V. M.-C., Fischer, R., & Ward, C. (2010). Exploring cultural differences in critical thinking: is it about my thinking style or the language I speak? Learning and Individual Differences , 20 (6), 604–616. https://doi.org/10.1016/j.lindif.2010.07.001 .

Manalo, E., & Sheppard, C. (2016). How might language affect critical thinking performance? Thinking Skills and Creativity , 21 , 41–49. https://doi.org/10.1016/j.tsc.2016.05.005 .

Mann, L., Burnett, P., Radford, M., & Ford, S. (1997). The Melbourne decision making questionnaire: an instrument for measuring patterns for coping with decisional conflict. Journal of Behavioral Decision Making, 10 (1), 1–19 https://doi.org/10.1002/(SICI)1099-0771(199703)10:1<1::AID-BDM242>3.0.CO;2-X .

Mâtă, L., Clipa, O., & Tzafilkou, K. (2020). The development and validation of a scale to measure university teachers’ attitude towards ethical use of information technology for a sustainable education. Sustainability , 12 (15), 1–20. https://doi.org/10.3390/su12156268 .

McGregor, D. (2007). Developing thinking; developing learning. A guide to thinking skills in education . Open University Press.

Merrifield, W. (2018). Culture and critical thinking: exploring culturally informed reasoning processes in a Lebanese university using think-aloud protocols (Unpublished Doctoral Dissertation) . George Fox University, College of Education.

Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement , (pp. 13–103). American Council on Education and Macmillan Publishing Co., Inc.

Messick, S. (1995). Validity of psychological assessment: validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist , 50 , 741–749. https://doi.org/10.1037/0003-066X.50.9.741 .

Moghadam, Z. B., Narafshan, M. H., & Tajadini, M. (2021). Development of a critical self in the language reading classroom: an examination of learners’ L2 self. Thinking Skills and Creativity , 42 , 1–29. https://doi.org/10.1016/j.tsc.2021.100944 .

Moon, J. (2008). Critical thinking: an exploration of theory and practice . Routledge.

Nemat Tabrizi, A. R., & Akhavan Saber, M. (2016). The effect of critical reading strategies on EFL learners’ recall and retention of collocations. International Journal of Education & Literacy Studies , 4 (4), 30–37. https://doi.org/10.7575/aiac.ijels.v.4n.4p.30 .

Nikopour, J., Amini, F. M., & Nasiri, M. (2011). On the relationship between critical thinking and language learning strategies among Iranian EFL learners. Journal of Technology & Education , 5 (3), 195–199. https://doi.org/10.22061/TEJ.2011.283 .

O’Neil, H. F., Abedi, J., & Spielberg, C. D. (1992). The measurement and teaching of creativity. In H. F. O’Neil, & M. Drillings (Eds.), Motivation: Theory and Research , (pp. 245–264). Lawrence Erlbaum.

Paul, R., & Elder, L. (2002). Critical thinking: tools for taking charge of your professional and personal life . Financial Times Prentice Hall.

Paulsen, M. B. (2015). Higher education: handbook of theory and research . Springer.

Peng, J. (2014). Willingness to communicate in the Chinese EFL university classrooms: an ecological perspective . Multilingual Matters.

Pica, T. (2000). Tradition and transition in English language teaching methodology. System , 29 , 1–18.

Pirouz, D. M. (2006). An overview of partial least squares (unpublished doctoral dissertation) . University of California.

Pishghadam, R., Noghani, M., & Zabihi, R. (2011). The construct validation of a questionnaire of social and cultural capital. English Language Teaching , 4 (4), 195–203. https://doi.org/10.5539/elt.v4n4p195 .

Pishvaei, V., & Kasaian, S. A. (2013). Design, construction, and validation of a CP attitude questionnaire in Iran. European Online Journal of Natural and Social Sciences , 2 (2), 59–74.

Rahimi, A., & Asadi Sajed, M. (2014). The interplay between critical pedagogy and critical thinking: theoretical ties and practicalities. Procedia - Social and Behavioral Sciences , 136 , 41–45. https://doi.org/10.1016/j.sbspro.2014.05.284 .

Rashel, U. M., & Kinya, S. (2021). Development and validation of a test to measure the secondary students’ critical thinking skills: a focus on environmental education in Bangladesh. International Journal of Educational Research Review , 6 (3), 264–274.

Renatovna, A. G., & Renatovna, A. S. (2021). Pedagogical and psychological conditions of preparing students for social relations on the basis of the development of critical thinking. Psychology and Educational Journal , 58 (2), 4889–4902. https://doi.org/10.17762/pae.v58i2.2886 .

Rezaei, S., Derakhshan, A., & Bagherkazemi, M. (2011). Critical thinking in language education. Journal of Language Teaching and Research , 2 (4), 769–777. https://doi.org/10.4304/jltr.2.4.769-777 .

Ricketts, J. C. (2003). The efficacy of leadership development, critical thinking dispositions, and students academic performance on the critical thinking skills of selected youth leaders (unpublished doctoral dissertation) . University of Florida.

Sadler, D. R. (2010). Beyond feedback: developing student capability in complex appraisal. Assessment & Evaluation in Higher Education , 35 (5), 535–550. https://doi.org/10.1080/02602930903541015 .

Shang, H. F. (2010). Reading strategy use, self-efficacy and EFL reading comprehension. Asian EFL Journal , 12 (2), 18–42.

Shpeizer, R. (2018). Teaching critical thinking as a vehicle for personal and social transformation. Research in Education , 100 (1), 32–49. https://doi.org/10.1177/0034523718762176 .

Söylemez, Y. (2015). Development of critical essential language skills for middle school students (unpublished doctoral dissertation) . Ataturk University Educational Sciences Institute.

Suryantari, H. (2018). Children and adults in second language learning. Tell Teaching of English and Literature Journal , 6 (1), 30–38. https://doi.org/10.30651/tell.v6i1.2081 .

Swallow, C. (2016). Reading is thinking. BU Journal of Graduate Studies in Education , 8 (2), 27–31.

Sweet, A. P. (1993). Transforming ideas for teaching and learning to read . Office of Educational Research and Improvement CS 011 460.

Tehrani, H. T., & Razali, A. B. (2018). Developing thinking skills in teaching English as a second/foreign language at primary school. International Journal of Academic Research in Progressive Education and Development , 7 (4), 13–29. https://doi.org/10.6007/IJARPED/v7-i4/4755 .

Tumasang, S. S. (2021). How fear affects EFL acquisition: the case of “terminale” students in Cameroon. Journal of English Language Teaching and Applied Linguistics , 63–70. https://doi.org/10.32996/jeltal .

Ullman, J. R. (2001). Structural equation modeling in using multivariate statistics. In B. G. Tabachnick, & L. Fidell (Eds.), Understanding multivariate statistics , (4th ed., pp. 653–771). Allyn & Bacon.

Van Laar, E., Van Deursen, A. J., Van Dijk, J. A., & De Haan, J. (2017). The relation between 21st-century skills and digital skills: a systematic literature review. Computers in Human Behavior , 72 , 577–588. https://doi.org/10.1016/j.chb.2017.03.010 .

Wallace, C. (2003). Critical reading in language education . Palgrave Macmillan.

Wang, Y., & Henderson, F. (2014). Teaching content through Moodle to facilitate students’ critical thinking in academic reading. The Asian EFL Journal , 16 (3), 7–40.

Widyastuti, S. (2018). Fostering critical thinking skills through argumentative writing. Cakrawala Pendidikan , 37 (2), 182–189. https://doi.org/10.21831/cp.v37i2.20157 .

Willingham, D. (2007). Critical thinking why is it so hard to teach? Arts Education Policy Review , 109 (4), 21–32. https://doi.org/10.3200/AEPR.109.4.21-32 .

Wilson, A. N., & Howitt, S. M. (2018). Developing critical being in an undergraduate science course. Studies in Higher Education , 43 (7), 1160–1171. https://doi.org/10.1080/03075079.2016.1232381 .

Wilson, K. (2016). Critical reading, critical thinking: delicate scaffolding in English for Academic Purposes (EAP). Thinking Skills and Creativity , 22 , 256–265. https://doi.org/10.1016/j.tsc.2016.10.002 .

Wu, W. C. V., Marek, M., & Chen, N. S. (2013). Assessing cultural awareness and linguistic competency of EFL learners in a CMC-based active learning context. System , 41 (3), 515–528. https://doi.org/10.1016/j.system.2013.05.004 .

Zhao, C., Pandian, A., & Singh, M. K. M. (2016). Instructional strategies for developing critical thinking in EFL classrooms. English Language Teaching , 9 (10), 14–21. https://doi.org/10.5539/elt.v9n10p14 .

Zhou, J., Jiang, Y., & Yao, Y. (2015). The investigation on critical thinking ability in EFL reading class. English Language Teaching , 8 (1), 84–93. https://doi.org/10.5539/elt.v8n1p83 .

Zou, M., & Lee, I. (2021). Learning to teach critical thinking: testimonies of three EFL teachers in China. Asia Pacific Journal of Education. https://doi.org/10.1080/02188791.2021.1982674 .

Zumbo, B. D. (2005). Structural equation modeling and test validation. In B. Everitt, & D. C. Howell (Eds.), Encyclopedia of Statistics in Behavioral Science , (pp. 1951–1958). Wiley.

Acknowledgements

Not applicable.

Funding

The research received no specific grant from any funding agency in the public, commercial, or non-profit sectors.

Author information

Authors and Affiliations

Department of English, Science and Research Branch, Islamic Azad University, Tehran, Iran

Moloud Mohammadi & Masood Siyyari

Imam Ali University, Tehran, Iran

Gholam-Reza Abbasian

Contributions

M.M., G-R. A., and M. S. contributed to the design and implementation of the research, to the analysis of the results, and to the writing of the manuscript. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Gholam-Reza Abbasian .

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

Mohammadi, M., Abbasian, GR. & Siyyari, M. Adaptation and validation of a critical thinking scale to measure the 3D critical thinking ability of EFL readers. Lang Test Asia 12 , 24 (2022). https://doi.org/10.1186/s40468-022-00173-6

Received : 11 November 2021

Accepted : 05 July 2022

Published : 14 September 2022

DOI : https://doi.org/10.1186/s40468-022-00173-6

Keywords

  • Critical thinking and reading
  • Criticality
  • Critical pedagogy
  • PLS-SEM factor analysis
  • Scale validation
