U.S. flag

An official website of the United States government

The .gov means it’s official. Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

The site is secure. The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

  • Publications
  • Account settings

Preview improvements coming to the PMC website in October 2024. Learn More or Try it out now .

  • Advanced Search
  • Journal List
  • J Korean Med Sci
  • v.37(16); 2022 Apr 25

Logo of jkms

A Practical Guide to Writing Quantitative and Qualitative Research Questions and Hypotheses in Scholarly Articles

Edward barroga.

1 Department of General Education, Graduate School of Nursing Science, St. Luke’s International University, Tokyo, Japan.

Glafera Janet Matanguihan

2 Department of Biological Sciences, Messiah University, Mechanicsburg, PA, USA.

The development of research questions and the subsequent hypotheses are prerequisites to defining the main research purpose and specific objectives of a study. Consequently, these objectives determine the study design and research outcome. The development of research questions is a process based on knowledge of current trends, cutting-edge studies, and technological advances in the research field. Excellent research questions are focused and require a comprehensive literature search and in-depth understanding of the problem being investigated. Initially, research questions may be written as descriptive questions which could be developed into inferential questions. These questions must be specific and concise to provide a clear foundation for developing hypotheses. Hypotheses are more formal predictions about the research outcomes. These specify the possible results that may or may not be expected regarding the relationship between groups. Thus, research questions and hypotheses clarify the main purpose and specific objectives of the study, which in turn dictate the design of the study, its direction, and outcome. Studies developed from good research questions and hypotheses will have trustworthy outcomes with wide-ranging social and health implications.

INTRODUCTION

Scientific research is usually initiated by posing evidenced-based research questions which are then explicitly restated as hypotheses. 1 , 2 The hypotheses provide directions to guide the study, solutions, explanations, and expected results. 3 , 4 Both research questions and hypotheses are essentially formulated based on conventional theories and real-world processes, which allow the inception of novel studies and the ethical testing of ideas. 5 , 6

It is crucial to have knowledge of both quantitative and qualitative research 2 as both types of research involve writing research questions and hypotheses. 7 However, these crucial elements of research are sometimes overlooked; if not overlooked, then framed without the forethought and meticulous attention it needs. Planning and careful consideration are needed when developing quantitative or qualitative research, particularly when conceptualizing research questions and hypotheses. 4

There is a continuing need to support researchers in the creation of innovative research questions and hypotheses, as well as for journal articles that carefully review these elements. 1 When research questions and hypotheses are not carefully thought of, unethical studies and poor outcomes usually ensue. Carefully formulated research questions and hypotheses define well-founded objectives, which in turn determine the appropriate design, course, and outcome of the study. This article then aims to discuss in detail the various aspects of crafting research questions and hypotheses, with the goal of guiding researchers as they develop their own. Examples from the authors and peer-reviewed scientific articles in the healthcare field are provided to illustrate key points.

DEFINITIONS AND RELATIONSHIP OF RESEARCH QUESTIONS AND HYPOTHESES

A research question is what a study aims to answer after data analysis and interpretation. The answer is written in length in the discussion section of the paper. Thus, the research question gives a preview of the different parts and variables of the study meant to address the problem posed in the research question. 1 An excellent research question clarifies the research writing while facilitating understanding of the research topic, objective, scope, and limitations of the study. 5

On the other hand, a research hypothesis is an educated statement of an expected outcome. This statement is based on background research and current knowledge. 8 , 9 The research hypothesis makes a specific prediction about a new phenomenon 10 or a formal statement on the expected relationship between an independent variable and a dependent variable. 3 , 11 It provides a tentative answer to the research question to be tested or explored. 4

Hypotheses employ reasoning to predict a theory-based outcome. 10 These can also be developed from theories by focusing on components of theories that have not yet been observed. 10 The validity of hypotheses is often based on the testability of the prediction made in a reproducible experiment. 8

Conversely, hypotheses can also be rephrased as research questions. Several hypotheses based on existing theories and knowledge may be needed to answer a research question. Developing ethical research questions and hypotheses creates a research design that has logical relationships among variables. These relationships serve as a solid foundation for the conduct of the study. 4 , 11 Haphazardly constructed research questions can result in poorly formulated hypotheses and improper study designs, leading to unreliable results. Thus, the formulations of relevant research questions and verifiable hypotheses are crucial when beginning research. 12

CHARACTERISTICS OF GOOD RESEARCH QUESTIONS AND HYPOTHESES

Excellent research questions are specific and focused. These integrate collective data and observations to confirm or refute the subsequent hypotheses. Well-constructed hypotheses are based on previous reports and verify the research context. These are realistic, in-depth, sufficiently complex, and reproducible. More importantly, these hypotheses can be addressed and tested. 13

There are several characteristics of well-developed hypotheses. Good hypotheses are 1) empirically testable 7 , 10 , 11 , 13 ; 2) backed by preliminary evidence 9 ; 3) testable by ethical research 7 , 9 ; 4) based on original ideas 9 ; 5) have evidenced-based logical reasoning 10 ; and 6) can be predicted. 11 Good hypotheses can infer ethical and positive implications, indicating the presence of a relationship or effect relevant to the research theme. 7 , 11 These are initially developed from a general theory and branch into specific hypotheses by deductive reasoning. In the absence of a theory to base the hypotheses, inductive reasoning based on specific observations or findings form more general hypotheses. 10

TYPES OF RESEARCH QUESTIONS AND HYPOTHESES

Research questions and hypotheses are developed according to the type of research, which can be broadly classified into quantitative and qualitative research. We provide a summary of the types of research questions and hypotheses under quantitative and qualitative research categories in Table 1 .

Research questions in quantitative research

In quantitative research, research questions inquire about the relationships among variables being investigated and are usually framed at the start of the study. These are precise and typically linked to the subject population, dependent and independent variables, and research design. 1 Research questions may also attempt to describe the behavior of a population in relation to one or more variables, or describe the characteristics of variables to be measured ( descriptive research questions ). 1 , 5 , 14 These questions may also aim to discover differences between groups within the context of an outcome variable ( comparative research questions ), 1 , 5 , 14 or elucidate trends and interactions among variables ( relationship research questions ). 1 , 5 We provide examples of descriptive, comparative, and relationship research questions in quantitative research in Table 2 .

Hypotheses in quantitative research

In quantitative research, hypotheses predict the expected relationships among variables. 15 Relationships among variables that can be predicted include 1) between a single dependent variable and a single independent variable ( simple hypothesis ) or 2) between two or more independent and dependent variables ( complex hypothesis ). 4 , 11 Hypotheses may also specify the expected direction to be followed and imply an intellectual commitment to a particular outcome ( directional hypothesis ) 4 . On the other hand, hypotheses may not predict the exact direction and are used in the absence of a theory, or when findings contradict previous studies ( non-directional hypothesis ). 4 In addition, hypotheses can 1) define interdependency between variables ( associative hypothesis ), 4 2) propose an effect on the dependent variable from manipulation of the independent variable ( causal hypothesis ), 4 3) state a negative relationship between two variables ( null hypothesis ), 4 , 11 , 15 4) replace the working hypothesis if rejected ( alternative hypothesis ), 15 explain the relationship of phenomena to possibly generate a theory ( working hypothesis ), 11 5) involve quantifiable variables that can be tested statistically ( statistical hypothesis ), 11 6) or express a relationship whose interlinks can be verified logically ( logical hypothesis ). 11 We provide examples of simple, complex, directional, non-directional, associative, causal, null, alternative, working, statistical, and logical hypotheses in quantitative research, as well as the definition of quantitative hypothesis-testing research in Table 3 .

Research questions in qualitative research

Unlike research questions in quantitative research, research questions in qualitative research are usually continuously reviewed and reformulated. The central question and associated subquestions are stated more than the hypotheses. 15 The central question broadly explores a complex set of factors surrounding the central phenomenon, aiming to present the varied perspectives of participants. 15

There are varied goals for which qualitative research questions are developed. These questions can function in several ways, such as to 1) identify and describe existing conditions ( contextual research question s); 2) describe a phenomenon ( descriptive research questions ); 3) assess the effectiveness of existing methods, protocols, theories, or procedures ( evaluation research questions ); 4) examine a phenomenon or analyze the reasons or relationships between subjects or phenomena ( explanatory research questions ); or 5) focus on unknown aspects of a particular topic ( exploratory research questions ). 5 In addition, some qualitative research questions provide new ideas for the development of theories and actions ( generative research questions ) or advance specific ideologies of a position ( ideological research questions ). 1 Other qualitative research questions may build on a body of existing literature and become working guidelines ( ethnographic research questions ). Research questions may also be broadly stated without specific reference to the existing literature or a typology of questions ( phenomenological research questions ), may be directed towards generating a theory of some process ( grounded theory questions ), or may address a description of the case and the emerging themes ( qualitative case study questions ). 15 We provide examples of contextual, descriptive, evaluation, explanatory, exploratory, generative, ideological, ethnographic, phenomenological, grounded theory, and qualitative case study research questions in qualitative research in Table 4 , and the definition of qualitative hypothesis-generating research in Table 5 .

Qualitative studies usually pose at least one central research question and several subquestions starting with How or What . These research questions use exploratory verbs such as explore or describe . These also focus on one central phenomenon of interest, and may mention the participants and research site. 15

Hypotheses in qualitative research

Hypotheses in qualitative research are stated in the form of a clear statement concerning the problem to be investigated. Unlike in quantitative research where hypotheses are usually developed to be tested, qualitative research can lead to both hypothesis-testing and hypothesis-generating outcomes. 2 When studies require both quantitative and qualitative research questions, this suggests an integrative process between both research methods wherein a single mixed-methods research question can be developed. 1

FRAMEWORKS FOR DEVELOPING RESEARCH QUESTIONS AND HYPOTHESES

Research questions followed by hypotheses should be developed before the start of the study. 1 , 12 , 14 It is crucial to develop feasible research questions on a topic that is interesting to both the researcher and the scientific community. This can be achieved by a meticulous review of previous and current studies to establish a novel topic. Specific areas are subsequently focused on to generate ethical research questions. The relevance of the research questions is evaluated in terms of clarity of the resulting data, specificity of the methodology, objectivity of the outcome, depth of the research, and impact of the study. 1 , 5 These aspects constitute the FINER criteria (i.e., Feasible, Interesting, Novel, Ethical, and Relevant). 1 Clarity and effectiveness are achieved if research questions meet the FINER criteria. In addition to the FINER criteria, Ratan et al. described focus, complexity, novelty, feasibility, and measurability for evaluating the effectiveness of research questions. 14

The PICOT and PEO frameworks are also used when developing research questions. 1 The following elements are addressed in these frameworks, PICOT: P-population/patients/problem, I-intervention or indicator being studied, C-comparison group, O-outcome of interest, and T-timeframe of the study; PEO: P-population being studied, E-exposure to preexisting conditions, and O-outcome of interest. 1 Research questions are also considered good if these meet the “FINERMAPS” framework: Feasible, Interesting, Novel, Ethical, Relevant, Manageable, Appropriate, Potential value/publishable, and Systematic. 14

As we indicated earlier, research questions and hypotheses that are not carefully formulated result in unethical studies or poor outcomes. To illustrate this, we provide some examples of ambiguous research question and hypotheses that result in unclear and weak research objectives in quantitative research ( Table 6 ) 16 and qualitative research ( Table 7 ) 17 , and how to transform these ambiguous research question(s) and hypothesis(es) into clear and good statements.

a These statements were composed for comparison and illustrative purposes only.

b These statements are direct quotes from Higashihara and Horiuchi. 16

a This statement is a direct quote from Shimoda et al. 17

The other statements were composed for comparison and illustrative purposes only.

CONSTRUCTING RESEARCH QUESTIONS AND HYPOTHESES

To construct effective research questions and hypotheses, it is very important to 1) clarify the background and 2) identify the research problem at the outset of the research, within a specific timeframe. 9 Then, 3) review or conduct preliminary research to collect all available knowledge about the possible research questions by studying theories and previous studies. 18 Afterwards, 4) construct research questions to investigate the research problem. Identify variables to be accessed from the research questions 4 and make operational definitions of constructs from the research problem and questions. Thereafter, 5) construct specific deductive or inductive predictions in the form of hypotheses. 4 Finally, 6) state the study aims . This general flow for constructing effective research questions and hypotheses prior to conducting research is shown in Fig. 1 .

An external file that holds a picture, illustration, etc.
Object name is jkms-37-e121-g001.jpg

Research questions are used more frequently in qualitative research than objectives or hypotheses. 3 These questions seek to discover, understand, explore or describe experiences by asking “What” or “How.” The questions are open-ended to elicit a description rather than to relate variables or compare groups. The questions are continually reviewed, reformulated, and changed during the qualitative study. 3 Research questions are also used more frequently in survey projects than hypotheses in experiments in quantitative research to compare variables and their relationships.

Hypotheses are constructed based on the variables identified and as an if-then statement, following the template, ‘If a specific action is taken, then a certain outcome is expected.’ At this stage, some ideas regarding expectations from the research to be conducted must be drawn. 18 Then, the variables to be manipulated (independent) and influenced (dependent) are defined. 4 Thereafter, the hypothesis is stated and refined, and reproducible data tailored to the hypothesis are identified, collected, and analyzed. 4 The hypotheses must be testable and specific, 18 and should describe the variables and their relationships, the specific group being studied, and the predicted research outcome. 18 Hypotheses construction involves a testable proposition to be deduced from theory, and independent and dependent variables to be separated and measured separately. 3 Therefore, good hypotheses must be based on good research questions constructed at the start of a study or trial. 12

In summary, research questions are constructed after establishing the background of the study. Hypotheses are then developed based on the research questions. Thus, it is crucial to have excellent research questions to generate superior hypotheses. In turn, these would determine the research objectives and the design of the study, and ultimately, the outcome of the research. 12 Algorithms for building research questions and hypotheses are shown in Fig. 2 for quantitative research and in Fig. 3 for qualitative research.

An external file that holds a picture, illustration, etc.
Object name is jkms-37-e121-g002.jpg

EXAMPLES OF RESEARCH QUESTIONS FROM PUBLISHED ARTICLES

  • EXAMPLE 1. Descriptive research question (quantitative research)
  • - Presents research variables to be assessed (distinct phenotypes and subphenotypes)
  • “BACKGROUND: Since COVID-19 was identified, its clinical and biological heterogeneity has been recognized. Identifying COVID-19 phenotypes might help guide basic, clinical, and translational research efforts.
  • RESEARCH QUESTION: Does the clinical spectrum of patients with COVID-19 contain distinct phenotypes and subphenotypes? ” 19
  • EXAMPLE 2. Relationship research question (quantitative research)
  • - Shows interactions between dependent variable (static postural control) and independent variable (peripheral visual field loss)
  • “Background: Integration of visual, vestibular, and proprioceptive sensations contributes to postural control. People with peripheral visual field loss have serious postural instability. However, the directional specificity of postural stability and sensory reweighting caused by gradual peripheral visual field loss remain unclear.
  • Research question: What are the effects of peripheral visual field loss on static postural control ?” 20
  • EXAMPLE 3. Comparative research question (quantitative research)
  • - Clarifies the difference among groups with an outcome variable (patients enrolled in COMPERA with moderate PH or severe PH in COPD) and another group without the outcome variable (patients with idiopathic pulmonary arterial hypertension (IPAH))
  • “BACKGROUND: Pulmonary hypertension (PH) in COPD is a poorly investigated clinical condition.
  • RESEARCH QUESTION: Which factors determine the outcome of PH in COPD?
  • STUDY DESIGN AND METHODS: We analyzed the characteristics and outcome of patients enrolled in the Comparative, Prospective Registry of Newly Initiated Therapies for Pulmonary Hypertension (COMPERA) with moderate or severe PH in COPD as defined during the 6th PH World Symposium who received medical therapy for PH and compared them with patients with idiopathic pulmonary arterial hypertension (IPAH) .” 21
  • EXAMPLE 4. Exploratory research question (qualitative research)
  • - Explores areas that have not been fully investigated (perspectives of families and children who receive care in clinic-based child obesity treatment) to have a deeper understanding of the research problem
  • “Problem: Interventions for children with obesity lead to only modest improvements in BMI and long-term outcomes, and data are limited on the perspectives of families of children with obesity in clinic-based treatment. This scoping review seeks to answer the question: What is known about the perspectives of families and children who receive care in clinic-based child obesity treatment? This review aims to explore the scope of perspectives reported by families of children with obesity who have received individualized outpatient clinic-based obesity treatment.” 22
  • EXAMPLE 5. Relationship research question (quantitative research)
  • - Defines interactions between dependent variable (use of ankle strategies) and independent variable (changes in muscle tone)
  • “Background: To maintain an upright standing posture against external disturbances, the human body mainly employs two types of postural control strategies: “ankle strategy” and “hip strategy.” While it has been reported that the magnitude of the disturbance alters the use of postural control strategies, it has not been elucidated how the level of muscle tone, one of the crucial parameters of bodily function, determines the use of each strategy. We have previously confirmed using forward dynamics simulations of human musculoskeletal models that an increased muscle tone promotes the use of ankle strategies. The objective of the present study was to experimentally evaluate a hypothesis: an increased muscle tone promotes the use of ankle strategies. Research question: Do changes in the muscle tone affect the use of ankle strategies ?” 23

EXAMPLES OF HYPOTHESES IN PUBLISHED ARTICLES

  • EXAMPLE 1. Working hypothesis (quantitative research)
  • - A hypothesis that is initially accepted for further research to produce a feasible theory
  • “As fever may have benefit in shortening the duration of viral illness, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response when taken during the early stages of COVID-19 illness .” 24
  • “In conclusion, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response . The difference in perceived safety of these agents in COVID-19 illness could be related to the more potent efficacy to reduce fever with ibuprofen compared to acetaminophen. Compelling data on the benefit of fever warrant further research and review to determine when to treat or withhold ibuprofen for early stage fever for COVID-19 and other related viral illnesses .” 24
  • EXAMPLE 2. Exploratory hypothesis (qualitative research)
  • - Explores particular areas deeper to clarify subjective experience and develop a formal hypothesis potentially testable in a future quantitative approach
  • “We hypothesized that when thinking about a past experience of help-seeking, a self distancing prompt would cause increased help-seeking intentions and more favorable help-seeking outcome expectations .” 25
  • “Conclusion
  • Although a priori hypotheses were not supported, further research is warranted as results indicate the potential for using self-distancing approaches to increasing help-seeking among some people with depressive symptomatology.” 25
  • EXAMPLE 3. Hypothesis-generating research to establish a framework for hypothesis testing (qualitative research)
  • “We hypothesize that compassionate care is beneficial for patients (better outcomes), healthcare systems and payers (lower costs), and healthcare providers (lower burnout). ” 26
  • Compassionomics is the branch of knowledge and scientific study of the effects of compassionate healthcare. Our main hypotheses are that compassionate healthcare is beneficial for (1) patients, by improving clinical outcomes, (2) healthcare systems and payers, by supporting financial sustainability, and (3) HCPs, by lowering burnout and promoting resilience and well-being. The purpose of this paper is to establish a scientific framework for testing the hypotheses above . If these hypotheses are confirmed through rigorous research, compassionomics will belong in the science of evidence-based medicine, with major implications for all healthcare domains.” 26
  • EXAMPLE 4. Statistical hypothesis (quantitative research)
  • - An assumption is made about the relationship among several population characteristics ( gender differences in sociodemographic and clinical characteristics of adults with ADHD ). Validity is tested by statistical experiment or analysis ( chi-square test, Students t-test, and logistic regression analysis)
  • “Our research investigated gender differences in sociodemographic and clinical characteristics of adults with ADHD in a Japanese clinical sample. Due to unique Japanese cultural ideals and expectations of women's behavior that are in opposition to ADHD symptoms, we hypothesized that women with ADHD experience more difficulties and present more dysfunctions than men . We tested the following hypotheses: first, women with ADHD have more comorbidities than men with ADHD; second, women with ADHD experience more social hardships than men, such as having less full-time employment and being more likely to be divorced.” 27
  • “Statistical Analysis
  • ( text omitted ) Between-gender comparisons were made using the chi-squared test for categorical variables and Students t-test for continuous variables…( text omitted ). A logistic regression analysis was performed for employment status, marital status, and comorbidity to evaluate the independent effects of gender on these dependent variables.” 27

EXAMPLES OF HYPOTHESIS AS WRITTEN IN PUBLISHED ARTICLES IN RELATION TO OTHER PARTS

  • EXAMPLE 1. Background, hypotheses, and aims are provided
  • “Pregnant women need skilled care during pregnancy and childbirth, but that skilled care is often delayed in some countries …( text omitted ). The focused antenatal care (FANC) model of WHO recommends that nurses provide information or counseling to all pregnant women …( text omitted ). Job aids are visual support materials that provide the right kind of information using graphics and words in a simple and yet effective manner. When nurses are not highly trained or have many work details to attend to, these job aids can serve as a content reminder for the nurses and can be used for educating their patients (Jennings, Yebadokpo, Affo, & Agbogbe, 2010) ( text omitted ). Importantly, additional evidence is needed to confirm how job aids can further improve the quality of ANC counseling by health workers in maternal care …( text omitted )” 28
  • “ This has led us to hypothesize that the quality of ANC counseling would be better if supported by job aids. Consequently, a better quality of ANC counseling is expected to produce higher levels of awareness concerning the danger signs of pregnancy and a more favorable impression of the caring behavior of nurses .” 28
  • “This study aimed to examine the differences in the responses of pregnant women to a job aid-supported intervention during ANC visit in terms of 1) their understanding of the danger signs of pregnancy and 2) their impression of the caring behaviors of nurses to pregnant women in rural Tanzania.” 28
  • EXAMPLE 2. Background, hypotheses, and aims are provided
  • “We conducted a two-arm randomized controlled trial (RCT) to evaluate and compare changes in salivary cortisol and oxytocin levels of first-time pregnant women between experimental and control groups. The women in the experimental group touched and held an infant for 30 min (experimental intervention protocol), whereas those in the control group watched a DVD movie of an infant (control intervention protocol). The primary outcome was salivary cortisol level and the secondary outcome was salivary oxytocin level.” 29
  • “ We hypothesize that at 30 min after touching and holding an infant, the salivary cortisol level will significantly decrease and the salivary oxytocin level will increase in the experimental group compared with the control group .” 29
  • EXAMPLE 3. Background, aim, and hypothesis are provided
  • “In countries where the maternal mortality ratio remains high, antenatal education to increase Birth Preparedness and Complication Readiness (BPCR) is considered one of the top priorities [1]. BPCR includes birth plans during the antenatal period, such as the birthplace, birth attendant, transportation, health facility for complications, expenses, and birth materials, as well as family coordination to achieve such birth plans. In Tanzania, although increasing, only about half of all pregnant women attend an antenatal clinic more than four times [4]. Moreover, the information provided during antenatal care (ANC) is insufficient. In the resource-poor settings, antenatal group education is a potential approach because of the limited time for individual counseling at antenatal clinics.” 30
  • “This study aimed to evaluate an antenatal group education program among pregnant women and their families with respect to birth-preparedness and maternal and infant outcomes in rural villages of Tanzania.” 30
  • “ The study hypothesis was if Tanzanian pregnant women and their families received a family-oriented antenatal group education, they would (1) have a higher level of BPCR, (2) attend antenatal clinic four or more times, (3) give birth in a health facility, (4) have less complications of women at birth, and (5) have less complications and deaths of infants than those who did not receive the education .” 30

Research questions and hypotheses are crucial components to any type of research, whether quantitative or qualitative. These questions should be developed at the very beginning of the study. Excellent research questions lead to superior hypotheses, which, like a compass, set the direction of research, and can often determine the successful conduct of the study. Many research studies have floundered because the development of research questions and subsequent hypotheses was not given the thought and meticulous attention needed. The development of research questions and hypotheses is an iterative process based on extensive knowledge of the literature and insightful grasp of the knowledge gap. Focused, concise, and specific research questions provide a strong foundation for constructing hypotheses which serve as formal predictions about the research outcomes. Research questions and hypotheses are crucial elements of research that should not be overlooked. They should be carefully thought of and constructed when planning research. This avoids unethical studies and poor outcomes by defining well-founded objectives that determine the design, course, and outcome of the study.

Disclosure: The authors have no potential conflicts of interest to disclose.

Author Contributions:

  • Conceptualization: Barroga E, Matanguihan GJ.
  • Methodology: Barroga E, Matanguihan GJ.
  • Writing - original draft: Barroga E, Matanguihan GJ.
  • Writing - review & editing: Barroga E, Matanguihan GJ.
  • Privacy Policy

Buy Me a Coffee

Research Method

Home » Quantitative Research – Methods, Types and Analysis

Quantitative Research – Methods, Types and Analysis

Table of Contents

What is Quantitative Research

Quantitative Research

Quantitative research is a type of research that collects and analyzes numerical data to test hypotheses and answer research questions . This research typically involves a large sample size and uses statistical analysis to make inferences about a population based on the data collected. It often involves the use of surveys, experiments, or other structured data collection methods to gather quantitative data.

Quantitative Research Methods

Quantitative Research Methods

Quantitative Research Methods are as follows:

Descriptive Research Design

Descriptive research design is used to describe the characteristics of a population or phenomenon being studied. This research method is used to answer the questions of what, where, when, and how. Descriptive research designs use a variety of methods such as observation, case studies, and surveys to collect data. The data is then analyzed using statistical tools to identify patterns and relationships.

Correlational Research Design

Correlational research design is used to investigate the relationship between two or more variables. Researchers use correlational research to determine whether a relationship exists between variables and to what extent they are related. This research method involves collecting data from a sample and analyzing it using statistical tools such as correlation coefficients.

Quasi-experimental Research Design

Quasi-experimental research design is used to investigate cause-and-effect relationships between variables. This research method is similar to experimental research design, but it lacks full control over the independent variable. Researchers use quasi-experimental research designs when it is not feasible or ethical to manipulate the independent variable.

Experimental Research Design

Experimental research design is used to investigate cause-and-effect relationships between variables. This research method involves manipulating the independent variable and observing the effects on the dependent variable. Researchers use experimental research designs to test hypotheses and establish cause-and-effect relationships.

Survey Research

Survey research involves collecting data from a sample of individuals using a standardized questionnaire. This research method is used to gather information on attitudes, beliefs, and behaviors of individuals. Researchers use survey research to collect data quickly and efficiently from a large sample size. Survey research can be conducted through various methods such as online, phone, mail, or in-person interviews.

Quantitative Research Analysis Methods

Here are some commonly used quantitative research analysis methods:

Statistical Analysis

Statistical analysis is the most common quantitative research analysis method. It involves using statistical tools and techniques to analyze the numerical data collected during the research process. Statistical analysis can be used to identify patterns, trends, and relationships between variables, and to test hypotheses and theories.

Regression Analysis

Regression analysis is a statistical technique used to analyze the relationship between one dependent variable and one or more independent variables. Researchers use regression analysis to identify and quantify the impact of independent variables on the dependent variable.

Factor Analysis

Factor analysis is a statistical technique used to identify underlying factors that explain the correlations among a set of variables. Researchers use factor analysis to reduce a large number of variables to a smaller set of factors that capture the most important information.

Structural Equation Modeling

Structural equation modeling is a statistical technique used to test complex relationships between variables. It involves specifying a model that includes both observed and unobserved variables, and then using statistical methods to test the fit of the model to the data.

Time Series Analysis

Time series analysis is a statistical technique used to analyze data that is collected over time. It involves identifying patterns and trends in the data, as well as any seasonal or cyclical variations.

Multilevel Modeling

Multilevel modeling is a statistical technique used to analyze data that is nested within multiple levels. For example, researchers might use multilevel modeling to analyze data that is collected from individuals who are nested within groups, such as students nested within schools.

Applications of Quantitative Research

Quantitative research has many applications across a wide range of fields. Here are some common examples:

  • Market Research : Quantitative research is used extensively in market research to understand consumer behavior, preferences, and trends. Researchers use surveys, experiments, and other quantitative methods to collect data that can inform marketing strategies, product development, and pricing decisions.
  • Health Research: Quantitative research is used in health research to study the effectiveness of medical treatments, identify risk factors for diseases, and track health outcomes over time. Researchers use statistical methods to analyze data from clinical trials, surveys, and other sources to inform medical practice and policy.
  • Social Science Research: Quantitative research is used in social science research to study human behavior, attitudes, and social structures. Researchers use surveys, experiments, and other quantitative methods to collect data that can inform social policies, educational programs, and community interventions.
  • Education Research: Quantitative research is used in education research to study the effectiveness of teaching methods, assess student learning outcomes, and identify factors that influence student success. Researchers use experimental and quasi-experimental designs, as well as surveys and other quantitative methods, to collect and analyze data.
  • Environmental Research: Quantitative research is used in environmental research to study the impact of human activities on the environment, assess the effectiveness of conservation strategies, and identify ways to reduce environmental risks. Researchers use statistical methods to analyze data from field studies, experiments, and other sources.

Characteristics of Quantitative Research

Here are some key characteristics of quantitative research:

  • Numerical data : Quantitative research involves collecting numerical data through standardized methods such as surveys, experiments, and observational studies. This data is analyzed using statistical methods to identify patterns and relationships.
  • Large sample size: Quantitative research often involves collecting data from a large sample of individuals or groups in order to increase the reliability and generalizability of the findings.
  • Objective approach: Quantitative research aims to be objective and impartial in its approach, focusing on the collection and analysis of data rather than personal beliefs, opinions, or experiences.
  • Control over variables: Quantitative research often involves manipulating variables to test hypotheses and establish cause-and-effect relationships. Researchers aim to control for extraneous variables that may impact the results.
  • Replicable : Quantitative research aims to be replicable, meaning that other researchers should be able to conduct similar studies and obtain similar results using the same methods.
  • Statistical analysis: Quantitative research involves using statistical tools and techniques to analyze the numerical data collected during the research process. Statistical analysis allows researchers to identify patterns, trends, and relationships between variables, and to test hypotheses and theories.
  • Generalizability: Quantitative research aims to produce findings that can be generalized to larger populations beyond the specific sample studied. This is achieved through the use of random sampling methods and statistical inference.

Examples of Quantitative Research

Here are some examples of quantitative research in different fields:

  • Market Research: A company conducts a survey of 1000 consumers to determine their brand awareness and preferences. The data is analyzed using statistical methods to identify trends and patterns that can inform marketing strategies.
  • Health Research : A researcher conducts a randomized controlled trial to test the effectiveness of a new drug for treating a particular medical condition. The study involves collecting data from a large sample of patients and analyzing the results using statistical methods.
  • Social Science Research : A sociologist conducts a survey of 500 people to study attitudes toward immigration in a particular country. The data is analyzed using statistical methods to identify factors that influence these attitudes.
  • Education Research: A researcher conducts an experiment to compare the effectiveness of two different teaching methods for improving student learning outcomes. The study involves randomly assigning students to different groups and collecting data on their performance on standardized tests.
  • Environmental Research : A team of researchers conduct a study to investigate the impact of climate change on the distribution and abundance of a particular species of plant or animal. The study involves collecting data on environmental factors and population sizes over time and analyzing the results using statistical methods.
  • Psychology : A researcher conducts a survey of 500 college students to investigate the relationship between social media use and mental health. The data is analyzed using statistical methods to identify correlations and potential causal relationships.
  • Political Science: A team of researchers conducts a study to investigate voter behavior during an election. They use survey methods to collect data on voting patterns, demographics, and political attitudes, and analyze the results using statistical methods.

How to Conduct Quantitative Research

Here is a general overview of how to conduct quantitative research:

  • Develop a research question: The first step in conducting quantitative research is to develop a clear and specific research question. This question should be based on a gap in existing knowledge, and should be answerable using quantitative methods.
  • Develop a research design: Once you have a research question, you will need to develop a research design. This involves deciding on the appropriate methods to collect data, such as surveys, experiments, or observational studies. You will also need to determine the appropriate sample size, data collection instruments, and data analysis techniques.
  • Collect data: The next step is to collect data. This may involve administering surveys or questionnaires, conducting experiments, or gathering data from existing sources. It is important to use standardized methods to ensure that the data is reliable and valid.
  • Analyze data : Once the data has been collected, it is time to analyze it. This involves using statistical methods to identify patterns, trends, and relationships between variables. Common statistical techniques include correlation analysis, regression analysis, and hypothesis testing.
  • Interpret results: After analyzing the data, you will need to interpret the results. This involves identifying the key findings, determining their significance, and drawing conclusions based on the data.
  • Communicate findings: Finally, you will need to communicate your findings. This may involve writing a research report, presenting at a conference, or publishing in a peer-reviewed journal. It is important to clearly communicate the research question, methods, results, and conclusions to ensure that others can understand and replicate your research.

When to use Quantitative Research

Here are some situations when quantitative research can be appropriate:

  • To test a hypothesis: Quantitative research is often used to test a hypothesis or a theory. It involves collecting numerical data and using statistical analysis to determine if the data supports or refutes the hypothesis.
  • To generalize findings: If you want to generalize the findings of your study to a larger population, quantitative research can be useful. This is because it allows you to collect numerical data from a representative sample of the population and use statistical analysis to make inferences about the population as a whole.
  • To measure relationships between variables: If you want to measure the relationship between two or more variables, such as the relationship between age and income, or between education level and job satisfaction, quantitative research can be useful. It allows you to collect numerical data on both variables and use statistical analysis to determine the strength and direction of the relationship.
  • To identify patterns or trends: Quantitative research can be useful for identifying patterns or trends in data. For example, you can use quantitative research to identify trends in consumer behavior or to identify patterns in stock market data.
  • To quantify attitudes or opinions : If you want to measure attitudes or opinions on a particular topic, quantitative research can be useful. It allows you to collect numerical data using surveys or questionnaires and analyze the data using statistical methods to determine the prevalence of certain attitudes or opinions.

Purpose of Quantitative Research

The purpose of quantitative research is to systematically investigate and measure the relationships between variables or phenomena using numerical data and statistical analysis. The main objectives of quantitative research include:

  • Description : To provide a detailed and accurate description of a particular phenomenon or population.
  • Explanation : To explain the reasons for the occurrence of a particular phenomenon, such as identifying the factors that influence a behavior or attitude.
  • Prediction : To predict future trends or behaviors based on past patterns and relationships between variables.
  • Control : To identify the best strategies for controlling or influencing a particular outcome or behavior.

Quantitative research is used in many different fields, including social sciences, business, engineering, and health sciences. It can be used to investigate a wide range of phenomena, from human behavior and attitudes to physical and biological processes. The purpose of quantitative research is to provide reliable and valid data that can be used to inform decision-making and improve understanding of the world around us.

Advantages of Quantitative Research

There are several advantages of quantitative research, including:

  • Objectivity : Quantitative research is based on objective data and statistical analysis, which reduces the potential for bias or subjectivity in the research process.
  • Reproducibility : Because quantitative research involves standardized methods and measurements, it is more likely to be reproducible and reliable.
  • Generalizability : Quantitative research allows for generalizations to be made about a population based on a representative sample, which can inform decision-making and policy development.
  • Precision : Quantitative research allows for precise measurement and analysis of data, which can provide a more accurate understanding of phenomena and relationships between variables.
  • Efficiency : Quantitative research can be conducted relatively quickly and efficiently, especially when compared to qualitative research, which may involve lengthy data collection and analysis.
  • Large sample sizes : Quantitative research can accommodate large sample sizes, which can increase the representativeness and generalizability of the results.

Limitations of Quantitative Research

There are several limitations of quantitative research, including:

  • Limited understanding of context: Quantitative research typically focuses on numerical data and statistical analysis, which may not provide a comprehensive understanding of the context or underlying factors that influence a phenomenon.
  • Simplification of complex phenomena: Quantitative research often involves simplifying complex phenomena into measurable variables, which may not capture the full complexity of the phenomenon being studied.
  • Potential for researcher bias: Although quantitative research aims to be objective, there is still the potential for researcher bias in areas such as sampling, data collection, and data analysis.
  • Limited ability to explore new ideas: Quantitative research is often based on pre-determined research questions and hypotheses, which may limit the ability to explore new ideas or unexpected findings.
  • Limited ability to capture subjective experiences : Quantitative research is typically focused on objective data and may not capture the subjective experiences of individuals or groups being studied.
  • Ethical concerns : Quantitative research may raise ethical concerns, such as invasion of privacy or the potential for harm to participants.

About the author

' src=

Muhammad Hassan

Researcher, Academic Writer, Web developer

You may also like

Questionnaire

Questionnaire – Definition, Types, and Examples

Case Study Research

Case Study – Methods, Examples and Guide

Observational Research

Observational Research – Methods and Guide

Qualitative Research Methods

Qualitative Research Methods

Explanatory Research

Explanatory Research – Types, Methods, Guide

Survey Research

Survey Research – Types, Methods, Examples

  • USC Libraries
  • Research Guides

Organizing Your Social Sciences Research Paper

  • Quantitative Methods
  • Purpose of Guide
  • Design Flaws to Avoid
  • Independent and Dependent Variables
  • Glossary of Research Terms
  • Reading Research Effectively
  • Narrowing a Topic Idea
  • Broadening a Topic Idea
  • Extending the Timeliness of a Topic Idea
  • Academic Writing Style
  • Choosing a Title
  • Making an Outline
  • Paragraph Development
  • Research Process Video Series
  • Executive Summary
  • The C.A.R.S. Model
  • Background Information
  • The Research Problem/Question
  • Theoretical Framework
  • Citation Tracking
  • Content Alert Services
  • Evaluating Sources
  • Primary Sources
  • Secondary Sources
  • Tiertiary Sources
  • Scholarly vs. Popular Publications
  • Qualitative Methods
  • Insiderness
  • Using Non-Textual Elements
  • Limitations of the Study
  • Common Grammar Mistakes
  • Writing Concisely
  • Avoiding Plagiarism
  • Footnotes or Endnotes?
  • Further Readings
  • Generative AI and Writing
  • USC Libraries Tutorials and Other Guides
  • Bibliography

Quantitative methods emphasize objective measurements and the statistical, mathematical, or numerical analysis of data collected through polls, questionnaires, and surveys, or by manipulating pre-existing statistical data using computational techniques . Quantitative research focuses on gathering numerical data and generalizing it across groups of people or to explain a particular phenomenon.

Babbie, Earl R. The Practice of Social Research . 12th ed. Belmont, CA: Wadsworth Cengage, 2010; Muijs, Daniel. Doing Quantitative Research in Education with SPSS . 2nd edition. London: SAGE Publications, 2010.

Need Help Locating Statistics?

Resources for locating data and statistics can be found here:

Statistics & Data Research Guide

Characteristics of Quantitative Research

Your goal in conducting quantitative research study is to determine the relationship between one thing [an independent variable] and another [a dependent or outcome variable] within a population. Quantitative research designs are either descriptive [subjects usually measured once] or experimental [subjects measured before and after a treatment]. A descriptive study establishes only associations between variables; an experimental study establishes causality.

Quantitative research deals in numbers, logic, and an objective stance. Quantitative research focuses on numeric and unchanging data and detailed, convergent reasoning rather than divergent reasoning [i.e., the generation of a variety of ideas about a research problem in a spontaneous, free-flowing manner].

Its main characteristics are :

  • The data is usually gathered using structured research instruments.
  • The results are based on larger sample sizes that are representative of the population.
  • The research study can usually be replicated or repeated, given its high reliability.
  • Researcher has a clearly defined research question to which objective answers are sought.
  • All aspects of the study are carefully designed before data is collected.
  • Data are in the form of numbers and statistics, often arranged in tables, charts, figures, or other non-textual forms.
  • Project can be used to generalize concepts more widely, predict future results, or investigate causal relationships.
  • Researcher uses tools, such as questionnaires or computer software, to collect numerical data.

The overarching aim of a quantitative research study is to classify features, count them, and construct statistical models in an attempt to explain what is observed.

  Things to keep in mind when reporting the results of a study using quantitative methods :

  • Explain the data collected and their statistical treatment as well as all relevant results in relation to the research problem you are investigating. Interpretation of results is not appropriate in this section.
  • Report unanticipated events that occurred during your data collection. Explain how the actual analysis differs from the planned analysis. Explain your handling of missing data and why any missing data does not undermine the validity of your analysis.
  • Explain the techniques you used to "clean" your data set.
  • Choose a minimally sufficient statistical procedure ; provide a rationale for its use and a reference for it. Specify any computer programs used.
  • Describe the assumptions for each procedure and the steps you took to ensure that they were not violated.
  • When using inferential statistics , provide the descriptive statistics, confidence intervals, and sample sizes for each variable as well as the value of the test statistic, its direction, the degrees of freedom, and the significance level [report the actual p value].
  • Avoid inferring causality , particularly in nonrandomized designs or without further experimentation.
  • Use tables to provide exact values ; use figures to convey global effects. Keep figures small in size; include graphic representations of confidence intervals whenever possible.
  • Always tell the reader what to look for in tables and figures .

NOTE:   When using pre-existing statistical data gathered and made available by anyone other than yourself [e.g., government agency], you still must report on the methods that were used to gather the data and describe any missing data that exists and, if there is any, provide a clear explanation why the missing data does not undermine the validity of your final analysis.

Babbie, Earl R. The Practice of Social Research . 12th ed. Belmont, CA: Wadsworth Cengage, 2010; Brians, Craig Leonard et al. Empirical Political Analysis: Quantitative and Qualitative Research Methods . 8th ed. Boston, MA: Longman, 2011; McNabb, David E. Research Methods in Public Administration and Nonprofit Management: Quantitative and Qualitative Approaches . 2nd ed. Armonk, NY: M.E. Sharpe, 2008; Quantitative Research Methods. Writing@CSU. Colorado State University; Singh, Kultar. Quantitative Social Research Methods . Los Angeles, CA: Sage, 2007.

Basic Research Design for Quantitative Studies

Before designing a quantitative research study, you must decide whether it will be descriptive or experimental because this will dictate how you gather, analyze, and interpret the results. A descriptive study is governed by the following rules: subjects are generally measured once; the intention is to only establish associations between variables; and, the study may include a sample population of hundreds or thousands of subjects to ensure that a valid estimate of a generalized relationship between variables has been obtained. An experimental design includes subjects measured before and after a particular treatment, the sample population may be very small and purposefully chosen, and it is intended to establish causality between variables. Introduction The introduction to a quantitative study is usually written in the present tense and from the third person point of view. It covers the following information:

  • Identifies the research problem -- as with any academic study, you must state clearly and concisely the research problem being investigated.
  • Reviews the literature -- review scholarship on the topic, synthesizing key themes and, if necessary, noting studies that have used similar methods of inquiry and analysis. Note where key gaps exist and how your study helps to fill these gaps or clarifies existing knowledge.
  • Describes the theoretical framework -- provide an outline of the theory or hypothesis underpinning your study. If necessary, define unfamiliar or complex terms, concepts, or ideas and provide the appropriate background information to place the research problem in proper context [e.g., historical, cultural, economic, etc.].

Methodology The methods section of a quantitative study should describe how each objective of your study will be achieved. Be sure to provide enough detail to enable the reader can make an informed assessment of the methods being used to obtain results associated with the research problem. The methods section should be presented in the past tense.

  • Study population and sampling -- where did the data come from; how robust is it; note where gaps exist or what was excluded. Note the procedures used for their selection;
  • Data collection – describe the tools and methods used to collect information and identify the variables being measured; describe the methods used to obtain the data; and, note if the data was pre-existing [i.e., government data] or you gathered it yourself. If you gathered it yourself, describe what type of instrument you used and why. Note that no data set is perfect--describe any limitations in methods of gathering data.
  • Data analysis -- describe the procedures for processing and analyzing the data. If appropriate, describe the specific instruments of analysis used to study each research objective, including mathematical techniques and the type of computer software used to manipulate the data.

Results The finding of your study should be written objectively and in a succinct and precise format. In quantitative studies, it is common to use graphs, tables, charts, and other non-textual elements to help the reader understand the data. Make sure that non-textual elements do not stand in isolation from the text but are being used to supplement the overall description of the results and to help clarify key points being made. Further information about how to effectively present data using charts and graphs can be found here .

  • Statistical analysis -- how did you analyze the data? What were the key findings from the data? The findings should be present in a logical, sequential order. Describe but do not interpret these trends or negative results; save that for the discussion section. The results should be presented in the past tense.

Discussion Discussions should be analytic, logical, and comprehensive. The discussion should meld together your findings in relation to those identified in the literature review, and placed within the context of the theoretical framework underpinning the study. The discussion should be presented in the present tense.

  • Interpretation of results -- reiterate the research problem being investigated and compare and contrast the findings with the research questions underlying the study. Did they affirm predicted outcomes or did the data refute it?
  • Description of trends, comparison of groups, or relationships among variables -- describe any trends that emerged from your analysis and explain all unanticipated and statistical insignificant findings.
  • Discussion of implications – what is the meaning of your results? Highlight key findings based on the overall results and note findings that you believe are important. How have the results helped fill gaps in understanding the research problem?
  • Limitations -- describe any limitations or unavoidable bias in your study and, if necessary, note why these limitations did not inhibit effective interpretation of the results.

Conclusion End your study by to summarizing the topic and provide a final comment and assessment of the study.

  • Summary of findings – synthesize the answers to your research questions. Do not report any statistical data here; just provide a narrative summary of the key findings and describe what was learned that you did not know before conducting the study.
  • Recommendations – if appropriate to the aim of the assignment, tie key findings with policy recommendations or actions to be taken in practice.
  • Future research – note the need for future research linked to your study’s limitations or to any remaining gaps in the literature that were not addressed in your study.

Black, Thomas R. Doing Quantitative Research in the Social Sciences: An Integrated Approach to Research Design, Measurement and Statistics . London: Sage, 1999; Gay,L. R. and Peter Airasain. Educational Research: Competencies for Analysis and Applications . 7th edition. Upper Saddle River, NJ: Merril Prentice Hall, 2003; Hector, Anestine. An Overview of Quantitative Research in Composition and TESOL . Department of English, Indiana University of Pennsylvania; Hopkins, Will G. “Quantitative Research Design.” Sportscience 4, 1 (2000); "A Strategy for Writing Up Research Results. The Structure, Format, Content, and Style of a Journal-Style Scientific Paper." Department of Biology. Bates College; Nenty, H. Johnson. "Writing a Quantitative Research Thesis." International Journal of Educational Science 1 (2009): 19-32; Ouyang, Ronghua (John). Basic Inquiry of Quantitative Research . Kennesaw State University.

Strengths of Using Quantitative Methods

Quantitative researchers try to recognize and isolate specific variables contained within the study framework, seek correlation, relationships and causality, and attempt to control the environment in which the data is collected to avoid the risk of variables, other than the one being studied, accounting for the relationships identified.

Among the specific strengths of using quantitative methods to study social science research problems:

  • Allows for a broader study, involving a greater number of subjects, and enhancing the generalization of the results;
  • Allows for greater objectivity and accuracy of results. Generally, quantitative methods are designed to provide summaries of data that support generalizations about the phenomenon under study. In order to accomplish this, quantitative research usually involves few variables and many cases, and employs prescribed procedures to ensure validity and reliability;
  • Applying well established standards means that the research can be replicated, and then analyzed and compared with similar studies;
  • You can summarize vast sources of information and make comparisons across categories and over time; and,
  • Personal bias can be avoided by keeping a 'distance' from participating subjects and using accepted computational techniques .

Babbie, Earl R. The Practice of Social Research . 12th ed. Belmont, CA: Wadsworth Cengage, 2010; Brians, Craig Leonard et al. Empirical Political Analysis: Quantitative and Qualitative Research Methods . 8th ed. Boston, MA: Longman, 2011; McNabb, David E. Research Methods in Public Administration and Nonprofit Management: Quantitative and Qualitative Approaches . 2nd ed. Armonk, NY: M.E. Sharpe, 2008; Singh, Kultar. Quantitative Social Research Methods . Los Angeles, CA: Sage, 2007.

Limitations of Using Quantitative Methods

Quantitative methods presume to have an objective approach to studying research problems, where data is controlled and measured, to address the accumulation of facts, and to determine the causes of behavior. As a consequence, the results of quantitative research may be statistically significant but are often humanly insignificant.

Some specific limitations associated with using quantitative methods to study research problems in the social sciences include:

  • Quantitative data is more efficient and able to test hypotheses, but may miss contextual detail;
  • Uses a static and rigid approach and so employs an inflexible process of discovery;
  • The development of standard questions by researchers can lead to "structural bias" and false representation, where the data actually reflects the view of the researcher instead of the participating subject;
  • Results provide less detail on behavior, attitudes, and motivation;
  • Researcher may collect a much narrower and sometimes superficial dataset;
  • Results are limited as they provide numerical descriptions rather than detailed narrative and generally provide less elaborate accounts of human perception;
  • The research is often carried out in an unnatural, artificial environment so that a level of control can be applied to the exercise. This level of control might not normally be in place in the real world thus yielding "laboratory results" as opposed to "real world results"; and,
  • Preset answers will not necessarily reflect how people really feel about a subject and, in some cases, might just be the closest match to the preconceived hypothesis.

Research Tip

Finding Examples of How to Apply Different Types of Research Methods

SAGE publications is a major publisher of studies about how to design and conduct research in the social and behavioral sciences. Their SAGE Research Methods Online and Cases database includes contents from books, articles, encyclopedias, handbooks, and videos covering social science research design and methods including the complete Little Green Book Series of Quantitative Applications in the Social Sciences and the Little Blue Book Series of Qualitative Research techniques. The database also includes case studies outlining the research methods used in real research projects. This is an excellent source for finding definitions of key terms and descriptions of research design and practice, techniques of data gathering, analysis, and reporting, and information about theories of research [e.g., grounded theory]. The database covers both qualitative and quantitative research methods as well as mixed methods approaches to conducting research.

SAGE Research Methods Online and Cases

  • << Previous: Qualitative Methods
  • Next: Insiderness >>
  • Last Updated: Mar 26, 2024 10:40 AM
  • URL: https://libguides.usc.edu/writingguide

Book cover

Principles of Social Research Methodology pp 257–260 Cite as

Techniques for Reporting Quantitative Data

  • Md. Mahsin 4  
  • First Online: 27 October 2022

1933 Accesses

A quantitative research report is a way of describing the completed study to other people. The findings are communicated through an oral presentation, a book, or a published paper. The report disseminates the results to research scientists or the policy decision-maker’s stakeholders. It is usually written in plain words so that a layperson can understand it, or it may be so highly technical so that the target audience can understand it easily. It organizes in many different ways depending on the intended audience and the author’s style. A rough sequence of steps for writing a quantitative research report describes in this section:

Specify a summary or abstract of the report to give a quick picture of the research article, thesis, review paper, conference proceeding, or in-depth analysis of a particular subject.

Define the research problem and discuss the methodology approach.

Present the results and findings and finally summarize the significance of the conclusions.

  • Social research
  • Data reporting
  • Quantitative data

This is a preview of subscription content, log in via an institution .

Buying options

  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
  • Available as EPUB and PDF
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
  • Durable hardcover edition

Tax calculation will be finalised at checkout

Purchases are for personal use only

Neuman, W. L., & Robson, K. (2012). Basics of social research: Qualitative and quantitative approaches.

Google Scholar  

Download references

Author information

Authors and affiliations.

Department of Mathematics and Statistics, University of Calgary, 2500 University Drive NW, Calgary, Canada

You can also search for this author in PubMed   Google Scholar

Corresponding author

Correspondence to Md. Mahsin .

Editor information

Editors and affiliations.

Centre for Family and Child Studies, Research Institute of Humanities and Social Sciences, University of Sharjah, Sharjah, United Arab Emirates

M. Rezaul Islam

Department of Development Studies, University of Dhaka, Dhaka, Bangladesh

Niaz Ahmed Khan

Department of Social Work, School of Humanities, University of Johannesburg, Johannesburg, South Africa

Rajendra Baikady

Rights and permissions

Reprints and permissions

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter.

Mahsin, M. (2022). Techniques for Reporting Quantitative Data. In: Islam, M.R., Khan, N.A., Baikady, R. (eds) Principles of Social Research Methodology. Springer, Singapore. https://doi.org/10.1007/978-981-19-5441-2_17

Download citation

DOI : https://doi.org/10.1007/978-981-19-5441-2_17

Published : 27 October 2022

Publisher Name : Springer, Singapore

Print ISBN : 978-981-19-5219-7

Online ISBN : 978-981-19-5441-2

eBook Packages : Social Sciences

Share this chapter

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Publish with us

Policies and ethics

  • Find a journal
  • Track your research

Banner Image

Quantitative and Qualitative Research

  • I NEED TO . . .

What is Quantitative Research?

  • What is Qualitative Research?
  • Quantitative vs Qualitative
  • Step 1: Accessing CINAHL
  • Step 2: Create a Keyword Search
  • Step 3: Create a Subject Heading Search
  • Step 4: Repeat Steps 1-3 for Second Concept
  • Step 5: Repeat Steps 1-3 for Quantitative Terms
  • Step 6: Combining All Searches
  • Step 7: Adding Limiters
  • Step 8: Save Your Search!
  • What Kind of Article is This?
  • More Research Help This link opens in a new window

Quantitative methodology is the dominant research framework in the social sciences. It refers to a set of strategies, techniques and assumptions used to study psychological, social and economic processes through the exploration of numeric patterns . Quantitative research gathers a range of numeric data. Some of the numeric data is intrinsically quantitative (e.g. personal income), while in other cases the numeric structure is  imposed (e.g. ‘On a scale from 1 to 10, how depressed did you feel last week?’). The collection of quantitative information allows researchers to conduct simple to extremely sophisticated statistical analyses that aggregate the data (e.g. averages, percentages), show relationships among the data (e.g. ‘Students with lower grade point averages tend to score lower on a depression scale’) or compare across aggregated data (e.g. the USA has a higher gross domestic product than Spain). Quantitative research includes methodologies such as questionnaires, structured observations or experiments and stands in contrast to qualitative research. Qualitative research involves the collection and analysis of narratives and/or open-ended observations through methodologies such as interviews, focus groups or ethnographies.

Coghlan, D., Brydon-Miller, M. (2014).  The SAGE encyclopedia of action research  (Vols. 1-2). London, : SAGE Publications Ltd doi: 10.4135/9781446294406

What is the purpose of quantitative research?

The purpose of quantitative research is to generate knowledge and create understanding about the social world. Quantitative research is used by social scientists, including communication researchers, to observe phenomena or occurrences affecting individuals. Social scientists are concerned with the study of people. Quantitative research is a way to learn about a particular group of people, known as a sample population. Using scientific inquiry, quantitative research relies on data that are observed or measured to examine questions about the sample population.

Allen, M. (2017).  The SAGE encyclopedia of communication research methods  (Vols. 1-4). Thousand Oaks, CA: SAGE Publications, Inc doi: 10.4135/9781483381411

How do I know if the study is a quantitative design?  What type of quantitative study is it?

Quantitative Research Designs: Descriptive non-experimental, Quasi-experimental or Experimental?

Studies do not always explicitly state what kind of research design is being used.  You will need to know how to decipher which design type is used.  The following video will help you determine the quantitative design type.

  • << Previous: I NEED TO . . .
  • Next: What is Qualitative Research? >>
  • Last Updated: Dec 8, 2023 10:05 PM
  • URL: https://libguides.uta.edu/quantitative_and_qualitative_research

University of Texas Arlington Libraries 702 Planetarium Place · Arlington, TX 76019 · 817-272-3000

  • Internet Privacy
  • Accessibility
  • Problems with a guide? Contact Us.
  • Reviews / Why join our community?
  • For companies
  • Frequently asked questions

Quantitative Research

What is quantitative research.

Quantitative research is the methodology which researchers use to test theories about people’s attitudes and behaviors based on numerical and statistical evidence. Researchers sample a large number of users (e.g., through surveys) to indirectly obtain measurable, bias-free data about users in relevant situations.

“Quantification clarifies issues which qualitative analysis leaves fuzzy. It is more readily contestable and likely to be contested. It sharpens scholarly discussion, sparks off rival hypotheses, and contributes to the dynamics of the research process.” — Angus Maddison, Notable scholar of quantitative macro-economic history
  • Transcript loading…

See how quantitative research helps reveal cold, hard facts about users which you can interpret and use to improve your designs.

Use Quantitative Research to Find Mathematical Facts about Users

Quantitative research is a subset of user experience (UX) research . Unlike its softer, more individual-oriented “counterpart”, qualitative research , quantitative research means you collect statistical/numerical data to draw generalized conclusions about users’ attitudes and behaviors . Compare and contrast quantitative with qualitative research, below:

Quantitative research is often best done from early on in projects since it helps teams to optimally direct product development and avoid costly design mistakes later. As you typically get user data from a distance—i.e., without close physical contact with users—also applying qualitative research will help you investigate why users think and feel the ways they do. Indeed, in an iterative design process quantitative research helps you test the assumptions you and your design team develop from your qualitative research. Regardless of the method you use, with proper care you can gather objective and unbiased data – information which you can complement with qualitative approaches to build a fuller understanding of your target users. From there, you can work towards firmer conclusions and drive your design process towards a more realistic picture of how target users will ultimately receive your product.

quantitative research report study

Quantitative analysis helps you test your assumptions and establish clearer views of your users in their various contexts.

Quantitative Research Methods You Can Use to Guide Optimal Designs

There are many quantitative research methods, and they help uncover different types of information on users. Some methods, such as A/B testing, are typically done on finished products, while others such as surveys could be done throughout a project’s design process. Here are some of the most helpful methods:

A/B testing – You test two or more versions of your design on users to find the most effective. Each variation differs by just one feature and may or may not affect how users respond. A/B testing is especially valuable for testing assumptions you’ve drawn from qualitative research. The only potential concerns here are scale—in that you’ll typically need to conduct it on thousands of users—and arguably more complexity in terms of considering the statistical significance involved.

Analytics – With tools such as Google Analytics, you measure metrics (e.g., page views, click-through rates) to build a picture (e.g., “How many users take how long to complete a task?”).

Desirability Studies – You measure an aspect of your product (e.g., aesthetic appeal) by typically showing it to participants and asking them to select from a menu of descriptive words. Their responses can reveal powerful insights (e.g., 78% associate the product/brand with “fashionable”).

Surveys and Questionnaires – When you ask for many users’ opinions, you will gain massive amounts of information. Keep in mind that you’ll have data about what users say they do, as opposed to insights into what they do . You can get more reliable results if you incentivize your participants well and use the right format.

Tree Testing – You remove the user interface so users must navigate the site and complete tasks using links alone. This helps you see if an issue is related to the user interface or information architecture.

Another powerful benefit of conducting quantitative research is that you can keep your stakeholders’ support with hard facts and statistics about your design’s performance—which can show what works well and what needs improvement—and prove a good return on investment. You can also produce reports to check statistics against different versions of your product and your competitors’ products.

Most quantitative research methods are relatively cheap. Since no single research method can help you answer all your questions, it’s vital to judge which method suits your project at the time/stage. Remember, it’s best to spend appropriately on a combination of quantitative and qualitative research from early on in development. Design improvements can be costly, and so you can estimate the value of implementing changes when you get the statistics to suggest that these changes will improve usability. Overall, you want to gather measurements objectively, where your personality, presence and theories won’t create bias.

Learn More about Quantitative Research

Take our User Research course to see how to get the most from quantitative research.

See how quantitative research methods fit into your design research landscape .

This insightful piece shows the value of pairing quantitative with qualitative research .

Find helpful tips on combining quantitative research methods in mixed methods research .

Questions related to Quantitative Research

Qualitative and quantitative research differ primarily in the data they produce. Quantitative research yields numerical data to test hypotheses and quantify patterns. It's precise and generalizable. Qualitative research, on the other hand, generates non-numerical data and explores meanings, interpretations, and deeper insights. Watch our video featuring Professor Alan Dix on different types of research methods.

This video elucidates the nuances and applications of both research types in the design field.

In quantitative research, determining a good sample size is crucial for the reliability of the results. William Hudson, CEO of Syntagm, emphasizes the importance of statistical significance with an example in our video. 

He illustrates that even with varying results between design choices, we need to discern whether the differences are statistically significant or products of chance. This ensures the validity of the results, allowing for more accurate interpretations. Statistical tools like chi-square tests can aid in analyzing the results effectively. To delve deeper into these concepts, take William Hudson’s Data-Driven Design: Quantitative UX Research Course . 

Quantitative research is crucial as it provides precise, numerical data that allows for high levels of statistical inference. Our video from William Hudson, CEO of Syntagm, highlights the importance of analytics in examining existing solutions. 

Quantitative methods, like analytics and A/B testing, are pivotal for identifying areas for improvement, understanding user behaviors, and optimizing user experiences based on solid, empirical evidence. This empirical nature ensures that the insights derived are reliable, allowing for practical improvements and innovations. Perhaps most importantly, numerical data is useful to secure stakeholder buy-in and defend design decisions and proposals. Explore this approach in our Data-Driven Design: Quantitative Research for UX Research course and learn from William Hudson’s detailed explanations of when and why to use analytics in the research process.

After establishing initial requirements, statistical data is crucial for informed decisions through quantitative research. William Hudson, CEO of Syntagm, sheds light on the role of quantitative research throughout a typical project lifecycle in this video:

 During the analysis and design phases, quantitative research helps validate user requirements and understand user behaviors. Surveys and analytics are standard tools, offering insights into user preferences and design efficacy. Quantitative research can also be used in early design testing, allowing for optimal design modifications based on user interactions and feedback, and it’s fundamental for A/B and multivariate testing once live solutions are available.

To write a compelling quantitative research question:

Create clear, concise, and unambiguous questions that address one aspect at a time.

Use common, short terms and provide explanations for unusual words.

Avoid leading, compound, and overlapping queries and ensure that questions are not vague or broad.

According to our video by William Hudson, CEO of Syntagm, quality and respondent understanding are vital in forming good questions. 

He emphasizes the importance of addressing specific aspects and avoiding intimidating and confusing elements, such as extensive question grids or ranking questions, to ensure participant engagement and accurate responses. For more insights, see the article Writing Good Questions for Surveys .

Survey research is typically quantitative, collecting numerical data and statistical analysis to make generalizable conclusions. However, it can also have qualitative elements, mainly when it includes open-ended questions, allowing for expressive responses. Our video featuring the CEO of Syntagm, William Hudson, provides in-depth insights into when and how to effectively utilize surveys in the product or service lifecycle, focusing on user satisfaction and potential improvements.

He emphasizes the importance of surveys in triangulating data to back up qualitative research findings, ensuring we have a complete understanding of the user's requirements and preferences.

Descriptive research focuses on describing the subject being studied and getting answers to questions like what, where, when, and who of the research question. However, it doesn’t include the answers to the underlying reasons, or the “why” behind the answers obtained from the research. We can use both f qualitative and quantitative methods to conduct descriptive research. Descriptive research does not describe the methods, but rather the data gathered through the research (regardless of the methods used).

When we use quantitative research and gather numerical data, we can use statistical analysis to understand relationships between different variables. Here’s William Hudson, CEO of Syntagm with more on correlation and how we can apply tests such as Pearson’s r and Spearman Rank Coefficient to our data.

This helps interpret phenomena such as user experience by analyzing session lengths and conversion values, revealing whether variables like time spent on a page affect checkout values, for example.

Random Sampling: Each individual in the population has an equitable opportunity to be chosen, which minimizes biases and simplifies analysis.

Systematic Sampling: Selecting every k-th item from a list after a random start. It's simpler and faster than random sampling when dealing with large populations.

Stratified Sampling: Segregate the population into subgroups or strata according to comparable characteristics. Then, samples are taken randomly from each stratum.

Cluster Sampling: Divide the population into clusters and choose a random sample.

Multistage Sampling: Various sampling techniques are used at different stages to collect detailed information from diverse populations.

Convenience Sampling: The researcher selects the sample based on availability and willingness to participate, which may only represent part of the population.

Quota Sampling: Segment the population into subgroups, and samples are non-randomly selected to fulfill a predetermined quota from each subset.

These are just a few techniques, and choosing the right one depends on your research question, discipline, resource availability, and the level of accuracy required. In quantitative research, there isn't a one-size-fits-all sampling technique; choosing a method that aligns with your research goals and population is critical. However, a well-planned strategy is essential to avoid wasting resources and time, as highlighted in our video featuring William Hudson, CEO of Syntagm.

He emphasizes the importance of recruiting participants meticulously, ensuring their engagement and the quality of their responses. Accurate and thoughtful participant responses are crucial for obtaining reliable results. William also sheds light on dealing with failing participants and scrutinizing response quality to refine the outcomes.

The 4 types of quantitative research are Descriptive, Correlational, Causal-Comparative/Quasi-Experimental, and Experimental Research. Descriptive research aims to depict ‘what exists’ clearly and precisely. Correlational research examines relationships between variables. Causal-comparative research investigates the cause-effect relationship between variables. Experimental research explores causal relationships by manipulating independent variables. To gain deeper insights into quantitative research methods in UX, consider enrolling in our Data-Driven Design: Quantitative Research for UX course.

The strength of quantitative research is its ability to provide precise numerical data for analyzing target variables.This allows for generalized conclusions and predictions about future occurrences, proving invaluable in various fields, including user experience. William Hudson, CEO of Syntagm, discusses the role of surveys, analytics, and testing in providing objective insights in our video on quantitative research methods, highlighting the significance of structured methodologies in eliciting reliable results.

To master quantitative research methods, enroll in our comprehensive course, Data-Driven Design: Quantitative Research for UX . 

This course empowers you to leverage quantitative data to make informed design decisions, providing a deep dive into methods like surveys and analytics. Whether you’re a novice or a seasoned professional, this course at Interaction Design Foundation offers valuable insights and practical knowledge, ensuring you acquire the skills necessary to excel in user experience research. Explore our diverse topics to elevate your understanding of quantitative research methods.

Literature on Quantitative Research

Here’s the entire UX literature on Quantitative Research by the Interaction Design Foundation, collated in one place:

Learn more about Quantitative Research

Take a deep dive into Quantitative Research with our course User Research – Methods and Best Practices .

How do you plan to design a product or service that your users will love , if you don't know what they want in the first place? As a user experience designer, you shouldn't leave it to chance to design something outstanding; you should make the effort to understand your users and build on that knowledge from the outset. User research is the way to do this, and it can therefore be thought of as the largest part of user experience design .

In fact, user research is often the first step of a UX design process—after all, you cannot begin to design a product or service without first understanding what your users want! As you gain the skills required, and learn about the best practices in user research, you’ll get first-hand knowledge of your users and be able to design the optimal product—one that’s truly relevant for your users and, subsequently, outperforms your competitors’ .

This course will give you insights into the most essential qualitative research methods around and will teach you how to put them into practice in your design work. You’ll also have the opportunity to embark on three practical projects where you can apply what you’ve learned to carry out user research in the real world . You’ll learn details about how to plan user research projects and fit them into your own work processes in a way that maximizes the impact your research can have on your designs. On top of that, you’ll gain practice with different methods that will help you analyze the results of your research and communicate your findings to your clients and stakeholders—workshops, user journeys and personas, just to name a few!

By the end of the course, you’ll have not only a Course Certificate but also three case studies to add to your portfolio. And remember, a portfolio with engaging case studies is invaluable if you are looking to break into a career in UX design or user research!

We believe you should learn from the best, so we’ve gathered a team of experts to help teach this course alongside our own course instructors. That means you’ll meet a new instructor in each of the lessons on research methods who is an expert in their field—we hope you enjoy what they have in store for you!

All open-source articles on Quantitative Research

Best practices for qualitative user research.

quantitative research report study

  • 3 years ago

Card Sorting

quantitative research report study

Understand the User’s Perspective through Research for Mobile UX

quantitative research report study

7 Simple Ways to Get Better Results From Ethnographic Research

quantitative research report study

Question Everything

quantitative research report study

Tree Testing

quantitative research report study

Adding Quality to Your Design Research with an SSQS Checklist

quantitative research report study

  • 8 years ago

How to Fit Quantitative Research into the Project Lifecycle

quantitative research report study

Why and When to Use Surveys

quantitative research report study

Correlation in User Experience

quantitative research report study

First-Click Testing

quantitative research report study

What to Test

quantitative research report study

Rating Scales in UX Research: The Ultimate Guide

quantitative research report study

Open Access—Link to us!

We believe in Open Access and the  democratization of knowledge . Unfortunately, world-class educational materials such as this page are normally hidden behind paywalls or in expensive textbooks.

If you want this to change , cite this page , link to us, or join us to help us democratize design knowledge !

Privacy Settings

Our digital services use necessary tracking technologies, including third-party cookies, for security, functionality, and to uphold user rights. Optional cookies offer enhanced features, and analytics.

Experience the full potential of our site that remembers your preferences and supports secure sign-in.

Governs the storage of data necessary for maintaining website security, user authentication, and fraud prevention mechanisms.

Enhanced Functionality

Saves your settings and preferences, like your location, for a more personalized experience.

Referral Program

We use cookies to enable our referral program, giving you and your friends discounts.

Error Reporting

We share user ID with Bugsnag and NewRelic to help us track errors and fix issues.

Optimize your experience by allowing us to monitor site usage. You’ll enjoy a smoother, more personalized journey without compromising your privacy.

Analytics Storage

Collects anonymous data on how you navigate and interact, helping us make informed improvements.

Differentiates real visitors from automated bots, ensuring accurate usage data and improving your website experience.

Lets us tailor your digital ads to match your interests, making them more relevant and useful to you.

Advertising Storage

Stores information for better-targeted advertising, enhancing your online ad experience.

Personalization Storage

Permits storing data to personalize content and ads across Google services based on user behavior, enhancing overall user experience.

Advertising Personalization

Allows for content and ad personalization across Google services based on user behavior. This consent enhances user experiences.

Enables personalizing ads based on user data and interactions, allowing for more relevant advertising experiences across Google services.

Receive more relevant advertisements by sharing your interests and behavior with our trusted advertising partners.

Enables better ad targeting and measurement on Meta platforms, making ads you see more relevant.

Allows for improved ad effectiveness and measurement through Meta’s Conversions API, ensuring privacy-compliant data sharing.

LinkedIn Insights

Tracks conversions, retargeting, and web analytics for LinkedIn ad campaigns, enhancing ad relevance and performance.

LinkedIn CAPI

Enhances LinkedIn advertising through server-side event tracking, offering more accurate measurement and personalization.

Google Ads Tag

Tracks ad performance and user engagement, helping deliver ads that are most useful to you.

Share the knowledge!

Share this content on:

or copy link

Cite according to academic standards

Simply copy and paste the text below into your bibliographic reference list, onto your blog, or anywhere else. You can also just hyperlink to this page.

New to UX Design? We’re Giving You a Free ebook!

The Basics of User Experience Design

Download our free ebook The Basics of User Experience Design to learn about core concepts of UX design.

In 9 chapters, we’ll cover: conducting user interviews, design thinking, interaction design, mobile UX design, usability, UX research, and many more!

Illustration

  • Basics of Research Process
  • Methodology
  • Quantitative Research Study: Definition, Approaches, Methods & Examples
  • Speech Topics
  • Basics of Essay Writing
  • Essay Topics
  • Other Essays
  • Main Academic Essays
  • Research Paper Topics
  • Basics of Research Paper Writing
  • Miscellaneous
  • Chicago/ Turabian
  • Data & Statistics
  • Admission Writing Tips
  • Admission Advice
  • Other Guides
  • Student Life
  • Studying Tips
  • Understanding Plagiarism
  • Academic Writing Tips
  • Basics of Dissertation & Thesis Writing

Illustration

  • Essay Guides
  • Research Paper Guides
  • Formatting Guides
  • Admission Guides
  • Dissertation & Thesis Guides

Quantitative Research Study: Definition, Approaches, Methods & Examples

Quantitative Research

Table of contents

Illustration

Use our free Readability checker

Quantitative research is a type of scientific study that involves the collection and analysis of numerical data. It uses mathematical and statistical techniques to identify patterns in large datasets. Analysis of numbers allows researchers to make predictions about future trends or outcomes. Quantitative methods include surveys, experiments, field studies, structured interviews, standardized assessments and questionnaires.

In this article, we will focus on what a quantitative study is and its main methods. Prepare to go through:

  • Key characteristics
  • Main types and approaches
  • Steps of conducting a quantitative study .

Our paper writers also provided the best quantitative research methods and examples to showcase the benefits of this approach.

What Is Quantitative Research: Definition

Before jumping into a detailed discussion on how to launch quantitative research, let’s outline a definition of this type of study. 

Quantitative research involves analyzing numerical data to uncover patterns and statistical information, which can be used to test hypotheses and respond to research questions . Quantitative methods often include statistical analysis, surveys, and experiments to generate measurable data and make accurate predictions. 

Quantitative research studies are usually applied to fields such as social science, economics, marketing, biology, etc. It is also commonly used for descriptive , correlational , or experimental studies .

Next, we will delve deeper into specific methods to define which one can best fit your academic work. However, before you start analysis and data collection, you need to be clear with the study purpose and research questions you will try to answer in your work.

>> Read more: Difference Between Qualitative and Quantitative Research

Characteristics of Quantitative Research

First, let’s define quantitative research characteristics to ensure that you choose the right type of data for your study. A deep understanding of key traits is the guarantee that you won’t make a mistake when conducting your own study.

  • Quantitative data is measurable. There are variables that can be easily counted and applied to statistical formulas. In other words, this is numeric data.
  • You can apply structured tools for your quantitative research. Those tools are surveys, polls, and questionnaires – structured forms you can use to collect the information.
  • The sample size should be sufficient. To get accurate results, you need to collect data from a significant portion of the target market.Obtaining only 10 survey responses, for example, would not yield any meaningful insights.
  • Your data can be represented in tables, graphs, or charts. As quantitative methods of data collection are focused on numbers, you should utilize visual aids to structure those numbers clearly for analysis. Such representation can provide valuable insights into patterns, trends, and relationships between independent and dependent variables.

If you need a thesis helper online , check our expert services. Our team of professional writers and editors can provide valuable guidance and support in all stages of research and writing.

Quantitative Research Examples

Quite often, it is challenging to apply all the knowledge about this type of research to your specific field of study. However, we want to share examples of quantitative research that illustrate that it can be used for any purpose.

Quantitative research example 1

One common example for students is an evaluation survey after they finish a course at the university. Students usually answer some questions on a likable scale. For instance, they evaluate the quality of lectures on a scale from 1 to 10, where 10 is the highest grade. These numbers help universities to see general satisfaction from this course, define an average number of people who like the course, and run a correlation between student satisfaction by course and their grades.

Example of quantitative research 2

Another common example is customer surveys you get after purchasing something online. After the purchase, you will get an email from a retailer or brand with questions about your satisfaction. For example, you will grade on a scale from 1 to 5 how easy you could find the right size or get customer support. As a result, they get numeric data to evaluate how well their online shop works and what can be improved, make some predictions about future purchases, and use the insights for marketing purposes.

Types of Quantitative Research

There are a few different types of quantitative research that can be used for various studies. Let’s overview each type of quantitative research to understand in what circumstances and for what goals you can use each of them.

We would focus on 4 main types of quantitative approaches in data collection and analysis:

  • Descriptive study Descriptive research is used to measure variables for understanding the situation. It does not involve any manipulation of variables. In other words, descriptive studies focus on defining key statistical measures, without testing for specific data insights.
  • Correlation Correlational study is used in the quantitative research process when you need to measure a relationship between two variables and understand how one variable (for example, the age of respondents) is related to positive answers in the customer survey. Ensure you understand the difference between correlation and causation while making this kind of research.
  • Causal-comparative research Causal-comparative research is a type of non-experimental study that aims to investigate the causes behind differences in behavior among multiple groups. It is one of the commonly used types of quantitative studies for investigating  causal relationships between variables.
  • Experimental research. Experimental research involves manipulating independent variables to observe how they affect dependent variables under controlled conditions. Put simply, the researcher will need to alter the situation to measure various outcomes that may occur.

Primary Quantitative Research Methods

When discussing types of quantitative research methods, we also need to define primary and secondary methods. Primary methods are used to collect data directly from the source, such as through surveys , experiments , or systematic observations . 

Below we will explain each of these methods in detail.

1. Survey Research

Surveys are a widely used quantitative research methodology across various disciplines and fields of study. They are organized both online or offline to gather data from different audiences. In recent years, online surveys have become increasingly popular, replacing traditional methods such as phone or in-person questioning

Surveys can be applied to achieve various study aims, such as understanding attitudes, behaviors, opinions, preferences, or demographic characteristics.

We would like to define 2 main quantitative survey methods:

  • Cross-sectional surveys Cross sectional survey that analyzes data across a sample population at a specific point in time. It means you may send this survey to various different groups of people, but you will need a one-time point you are researching for your study. It is a common method in such fields like economy, epidemiology, or medicine.
  • Longitudinal surveys In longitudinal research , you will measure the same group of people, but the data should be collected repeatedly over time. This method requires repeated measurements at regular intervals, such as days, months, or even years, to track changes in dependent variables over time.

Survey quantitative research method example

One of the common examples is the survey for measuring how citizens are satisfied with local politicians. For this purpose, the sociology group used to develop a questionnaire sample and define target audiences – people living in specific areas or some age groups. Based on their answers, different methods can be applied to answer defined questions, make predictions for the next election or just measure the general attitude of the selected group to political ideas.

2. Experiment

Experimentation is one of the quantitative approaches to research that assumes testing various theories to prove or disprove them, or to identify their limitations. This is another powerful quantitative research method that is often used in psychology, biology, physics, and sociology, among others. 

Experimentation is a systematic quantitative research approach to testing hypotheses and understanding the causal relationships between variables. Researchers manipulate independent variables while holding all other variables constant to observe changes in their dependent variable. By comparing the outcomes of the experimental group to those of the control group , researchers can determine if intervention was effective.

There are two main types of experiments:

  • Laboratory experiments A laboratory experiment is conducted in a controlled environment, such as a laboratory or research center, where researchers have complete control over the variables they manipulate. Such experiments are carefully designed to ensure that all resulting data is carefully analyzed in a lab report .
  • Field experiments Field experiment is a quantitative methodology conducted in real-world settings, such as schools, businesses, or communities, where researchers have less control over variables.

Example of experiment method

You may know about such famous experiments in psychology, such as the marshmallow experiment, when children need to wait some time to eat the marshmallow. The psychologists test how the child's behavior and motivation are related to endurance. For this experiment, scientists measured an independent variable which is the number of sweeteners and dependent variables as time and children's attitude to the task.

>> Read more: How to Design an Experiment 

3. Systematic Observation

One of the most reliable types of quantitative research methodologies is systematic observation. It requires researchers to observe specific situations, behavior, or case and collect numeric data based on predefined forms. Those forms are based on the theoretical framework for a specific quantitative study. Usually, this method involves one or more observers and can be applied to different events or behavioral observations. 

Systematic observation relies on accurate coding and the proper recording of data onto the structured forms used in the study. This quantitative research method is commonly employed in fields such as sociology, medicine, education, and psychology, and requires precise numeric data to be collected. Although observations can be documented through video or audio recordings, researchers using systematic observation focus specifically on measuring specific variables of interest. 

Observation: quantitative research design example

Great example is the observation of children's behavior in the classroom. For study proposals, observers can keep an eye on a classroom during different activities. Then they add countable information into the form - how many times the teacher asked for a specific action or raised a question? How many times do children speak during the class?

Quantitative Methods for Data Collection

When we discuss various analytical techniques, we have already mentioned some types of quantitative data collection methods . However, let’s go deeper and identify key methods to gather numeric data. We will speak about sampling methods , surveys , and polls . Also, our experts prepared examples of quantitative methods for data collection to help students and researchers work accurately. You also should consider using specific tools for working with this type of data and what is the most important to understand your study purpose. 

1. Sampling Method

Let’s imagine you are conducting research about teens and their usage of social media. There is no way you can send a survey, and they analyze data from all teens in the world. However, you will need to choose a reliable number of teens and then create quantitative research designs for this group. 

There are two main sampling methods we are going to discuss – probability and non-probability sampling . 

Probability Sampling

This quantitative method of data collection can be applied to cases when you need to analyze a specific group of people. For instance, you need to learn what the chance is that a city will vote for a chosen politician. It means you need to understand the age and gender percentages in a city and choose people for the survey based on this information. If your city has 34% of women age 55+, using a quantitative approach, you need to have the same percentage in your sampling. 

There are 4 types of probability sampling:

  • Simple random sampling
  • Systematic sampling
  • Stratified sampling
  • Cluster sampling .

Non-Probability Sampling

These quantitative data collection methods consider that the choice of samples depends on a researcher's experience and knowledge. In other words, not everyone can be selected for this data collection procedure - not everyone has an equal probability of being a part of your study. 

Quantitative researchers use five models for non-probability data gathering:

  • Convenience sampling: the only reason to choose study participants is their proximity to researchers.
  • Quota sampling: scientists use their knowledge and experience to form a quota.
  • Consecutive sampling: similar to the conventional method but can be applied to the same situation during some period of time.
  • Snowball sampling: researchers ask their target audience whom they can recommend for the same study.
  • Judgmental samplings: it is usually chosen based on the researcher's skills.

The next and quite popular method to collect data is the quantitative survey method. The design of your question will depend on the theory you are using for your study. For instance, If you are looking at how customer awareness about product features influence their engagement with a brand, you will apply relationship communication theory. In other words, you can’t put any questions you want into the survey.

You can conduct surveys for your quantitative research using the following ways:

  • Social media
  • Survey on your website
  • Offline surveys, and other methods.

As quantitative studies are focused on numeric data, you need to use a likable scale for answers and not open questions.

Polls are a commonly used quantitative research method, particularly in election and exit polls. Conducting quantitative research for the election means you can ask simple questions with multiple choices. For instance, you may offer a few demographic questions (e.g., age, employment),as well as questions about voting behavior (e.g., candidate they voted for).

Applying quantitative design for polls, researchers need to ensure that the answers can be analyzed with statistical formulas. That is why the questions for the polls are often quite simple and comprise up to 5 questions. However, in some cases, polls may include even less queries. 

A quantitative research study can use benchmarks, brushfire, and tracking polls.

Data Analysis Methods

After you collect the results of your polls, samplings, or surveys, you need to analyze quantitative data. As we are discussing primary data, researchers have raw information that can be analyzed and later interpreted in different ways. 

Based on quantitative research approaches, we identify 2 key methods for the analysis of numeric data:

  • Descriptive statistics Descriptive statistics allows us to get average data on questions or measure variability. It helps to overview the data with statistical evaluation. Applying descriptive statistics, you can count the average mean or standard deviation.
  • Inferential statistics Inferential statistics helps to design predictions and understand the relations between variables. You can run a T-test to measure the relation between two variables. Likewise, you may arrange a Pearson correlation test and measure how one variable depends on another one.

Using these statistical instruments, scientists can go deeper into result discussion and test hypotheses.

Secondary Quantitative Research Methods

Secondary methods of quantitative research are based on the analysis of existing data – the information already gathered by someone or presented in other papers. In this case, we do not need to collect data. Instead, scientists conduct their own quantitative research applying statistical analysis methods and formulas to gain new insights from existing data. 

There are 5 most commonly used types of secondary quantitative research methods:

  • Data from open online sources This is probably one of the most frequently used resources for quantitative study is the internet. A lot of companies and government institutions share the data on their own work, like the number of mobile users or a number of people using state health insurance.
  • Official data from government and non-government organizations Some data can exist in official reports but are not published online. In this case, you can ask for data that can be shared without breaking privacy protection laws. You may need to make an official request for the information you want to use for your study.
  • Public libraries You may think that no one uses public libraries. However, this is where you can find old studies conducted by someone else. The library also has a dataset for the papers that can be used for your own study.
  • Educational institutions A lot of educational institutions are also conducting research. While you can find the analytics published in open sources, a data set can be shared with you after the request.
  • Commercial sources These sources typically include information from private research firms or companies that collect and analyze data on specific industries, markets, or consumer behavior. Researchers can access this data through websites, reports, or journals, or by requesting access directly from the companies themselves.

How to Conduct Quantitative Research?

If you are working in the academic field or going to get a master's or Ph.D. degree, you definitely will need to conduct various types of studies to write a dissertation . Let’s look at the common ways to conduct quantitative research. Make sure you keep these important considerations in mind:

  • Determine the type of research you need to conduct. Will you be testing a hypothesis? If so, you will likely need to analyze numerical data.
  • Identify the appropriate sample size for your study. Do you need a large sample size to obtain reliability of outcomes, or will a smaller sample size suffice?
  • Be clear about your research goals. It's important to define your research objectives and ensure that your study design aligns with these goals.
  • Simplify your research questions. If your questions are clear and concise, it will be easier to determine the appropriate type of analysis needed to answer them.

Adhering to these recommendations ensures that research is targeted and generates valuable findings.

Are you searching where to buy dissertation online ? Our professional writing service offers high-quality custom dissertations that are tailored to your specific needs and requirements.

Advantages of Quantitative Research

Before choosing this analytical type for your work, you need to be aware of the advantages of quantitative research methods. 

Here are the pros of using quantitative research methodologies for the research:

  • Time efficiency Gathering and analyzing numerical data usually takes less time than collecting and analyzing non-numerical data.
  • Reliable data Working with numbers allows for precise statistical analysis, resulting in more reliable results.
  • Objectivity The absence of personal comments or interpretation in quantitative data collection reduces the possibility of bias in the results.
  • Scientific approach The quantitative method is considered one of the most scientific research methods, which helps to establish credibility and believability of the results.
  • Verifiability The results can be easily checked and verified by repeating the formula or analysis, ensuring the accuracy of the data.

Disadvantages of Quantitative Research

It may look like working with quantitative research can bring only pros to your study. However, there are a few cons you need to be aware of before starting your data collection. How can methodology in quantitative research become a disadvantage for your study?

  • Risk of bias We mentioned that there is no way you will put your emotions into statistical formulas. But researcher experience and personal feelings can be used to form samplings. Even the daytime for data collection can influence the final results.
  • Narrow focus It is possible that you can be so focused on numbers that you miss the bigger picture. Anytime you are running the numeric study, you need to look at your questions broadly. You may also need some qualitative methods to answer your research questions.
  • Complexity For people who are not very good at math and statistics, it can be problematic to identify what type of numbers they need and what test should be conducted to get results.

Bottom Line on Quantitative Research

In the few paragraphs, we tried to guide you through key principles of numeric research and answer the question of what is a quantitative study and how to conduct it correctly. We identified critical approaches in collecting data for this type of analysis and outlined limitations you need to have in mind running this study. 

You also can find the best quantitative methods examples that will definitely help you with your own study. Try your best to launch a valuable and reliable study using all the knowledge on how to work with numbers!

Illustration

Our research paper service can help you meet your deadlines and deliver high-quality work. Whether you need a quantitative study or any other type of research, we can assist you in completing your project effort-free.

FAQ About Quantitative Research Studies & Methods

1. what is the purpose of quantitative research.

The primary purpose of quantitative research is to test the hypothesis you may have in your study. This is one of the most frequently used types of data analysis, but before start working with numbers, you need to be clear with your goal. For example, for answering research questions, you may need only qualitative data.

2. What is a quantitative research method?

Quantitative research methods are types of data collection and analysis that focus on numeric information. In other words, this is the research when you work with numbers instead of words. You may need to apply some statistical formulas to those numbers to get results, while in a qualitative study, you will deal with content analysis mostly.

3. When is quantitative research used?

You may need to conduct quantitative research in case you are going to test the hypothesis by running statistical formulas. In most cases, you understand what type of research you need to conduct when you are clear with the study's aims and purpose. After you define hypotheses or questions, you may focus on the methodology that will help you get results.

Joe_Eckel_1_ab59a03630.jpg

Joe Eckel is an expert on Dissertations writing. He makes sure that each student gets precious insights on composing A-grade academic writing.

You may also like

Qualitative Research

Quantitative research methods are opposite to approaches applied in a  qualitative study , where you are dealing with descriptions instead of numbers. In the latter case, analysis is focused on non-numerical data, like texts from interviews or focus groups, videos, or audio.

Note that a single study may require the use of multiple methods to gather different types of data. As such, researchers may need to employ a variety of methods to gain a comprehensive understanding of their research topics .

Logo for VIVA Open Publishing

Want to create or adapt books like this? Learn more about how Pressbooks supports open publishing practices.

20 16. Reporting quantitative results

Chapter outline.

  • Reporting quantitative results (8 minute read time)

Content warning: Brief discussion of violence against women.

16.1 Reporting quantitative results

Learning objectives.

Learners will be able to…

  • Execute a quantitative research report using key elements for accuracy and openness

So you’ve completed your quantitative analyses and are ready to report your results. We’re going to spend some time talking about what matters in quantitative research reports, but the very first thing to understand is this: openness with your data and analyses is key. You should never hide what you did to get to a particular conclusion and, if someone wanted to and could ethically access your data, they should be able to replicate more or less exactly what you did. While your quantitative report won’t have every single step you took to get to your conclusion, it should have plenty of detail so someone can get the picture.

Below, I’m going to take you through the key elements of a quantitative research report. This overview is pretty general and conceptual, and it will be helpful for you to look at existing scholarly articles that deal with quantitative research (like ones in your literature review) to see the structure applied. Also keep in mind that your instructor may want the sections broken out slightly differently; nonetheless, the content I outline below should be in your research report.

Introduction and literature review

These are what you’re working on building with your research proposal this semester. They should be included as part of your research report so that readers have enough information to evaluate your research for themselves. What’s here should be very similar to the introduction and literature review from your research proposal, where you described the literature relevant to the study you wanted to do. With your results in hand, though, you may find that you have to add information to the literature you wrote previously to help orient the reader of the report to important topics needed to understand the results of your study.

In this section, you should explicitly lay out your study design – for instance, if it was experimental, be specific about the type of experimental design. Discuss the type of sampling that you used, if that’s applicable to your project. You should also go into a general description of your data, including the time period, any exclusions you made from the original data set and the source – i.e., did you collect it yourself or was it secondary data?  Next, talk about the specific statistical methods you used, like t- tests, Chi-square tests, or regression analyses. For descriptive statistics, you can be relatively general – you don’t need to say “I looked at means and medians,” for instance. You need to provide enough information here that someone could replicate what you did.

In this section, you should also discuss how you operationalized your variables. What did you mean when you asked about educational attainment – did you ask for a grade number, or did you ask them to pick a range that you turned into a category? This is key information for readers to understand your research. Remember when you were looking for ways to operationalize your variables? Be the kind of author who provides enough information on operationalization so people can actually understand what they did.

You’re going to run lots of different analyses to settle on what finally makes sense to get a result – positive or negative – for your study. For this section, you’re going to provide tables with descriptions of your sample, including, but not limited to, sample size, frequencies of sample characteristics like race and gender, levels of measurement, appropriate measures of central tendency, standard deviations and variances. Here you will also want to focus on the analyses you used to actually draw whatever conclusion you settled on, both descriptive and inferential (i.e., bivariate or multivariate).

The actual statistics you report depend entirely on the kind of statistical analysis you do. For instance, if you’re reporting on a logistic regression, it’s going to look a little different than reporting on an ANOVA. In the previous chapter, we provided links to open textbooks that detail how to conduct quantitative data analysis. You should look at these resources and consult with your research professor to help you determine what is expected in a report about the particular statistical method you used.

The important thing to remember here – as we mentioned above – is that you need to be totally transparent about your results, even and especially if they don’t support your hypothesis. There is value in a disproved hypothesis, too – you now know something about how the state of the world is not .

In this section, you’re going to connect your statistical results back to your hypothesis and discuss whether your results support your hypothesis or not. You are also going to talk about what the results mean for the larger field of study of which your research is a part, the implications of your findings if you’re evaluating some kind of intervention, and how your research relates to what is already out there in this field. When your research doesn’t pan out the way you expect, if you’re able to make some educated guesses as to why this might be (supported by literature if possible, but practice wisdom works too), share those as well.

Let’s take a minute to talk about what happens when your findings disprove your hypothesis or actually indicate something negative about the group you are studying. The discussion section is where you can contextualize “negative” findings. For example, say you conducted a study that indicated that a certain group is more likely to commit violent crime. Here, you have an opportunity to talk about why this might be the case outside of their membership in that group, and how membership in that group does not automatically mean someone will commit a violent crime. You can present mitigating factors, like a history of personal and community trauma. It’s extremely important to provide this relevant context so that your results are more difficult to use against a group you are studying in a way that doesn’t reflect your actual findings.

Limitations

In this section, you’re going to critique your own study. What are the advantages, disadvantages, and trade-offs of what you did to define and analyze your variables? Some questions you might consider include:  What limits the study’s applicability to the population at large? Were there trade-offs you had to make between rigor and available data? Did the statistical analyses you used mean that you could only get certain types of results? What would have made the study more widely applicable or more useful for a certain group? You should be thinking about this throughout the analysis process so you can properly contextualize your results.

In this section, you may also consider discussing any threats to internal validity that you identified and whether you think you can generalize your research. Finally, if you used any measurement tools that haven’t been validated yet, discuss how this could have affected your results.

Significance and conclusions

Finally, you want to use the conclusions section to bring it full circle for your reader – why did this research matter? Talk about how it contributed to knowledge around the topic and how might it be used to further practice. Identify and discuss ethical implications of your findings for social workers and social work research. Finally, make sure to talk about the next steps for you, other researchers, or policy-makers based on your research findings.

Key Takeaways

  • Your quantitative research report should provide the reader with transparent, replicable methods and put your research into the context of existing literature, real-world practice and social work ethics.
  • Think about the research project you are building now. What could a negative finding be, and how might you provide your reader with context to ensure that you are not harming your study population?

The process of determining how to measure a construct that cannot be directly observed

Ability to say that one variable "causes" something to happen to another variable. Very important to assess when thinking about studies that examine causation such as experimental or quasi-experimental designs.

Graduate research methods in social work Copyright © 2020 by Matthew DeCarlo, Cory Cummings, Kate Agnelli is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Share This Book

Academia.edu no longer supports Internet Explorer.

To browse Academia.edu and the wider internet faster and more securely, please take a few seconds to  upgrade your browser .

Enter the email address you signed up with and we'll email you a reset link.

  • We're Hiring!
  • Help Center

paper cover thumbnail

Quantitative Research Study Report

Profile image of Evelyn Mayes

2019, Quantitative Research Study Report

Quantitative research, aligned with the postpositivist world view, has heralded the scientific method, thus, pertaining to promote accuracy, credibility, and validity. However, quantitative methodologies often align with qualitative research, lending it the stamp of scientific proof. McCusker and Gunaydin (2015) examined the advantages of the amalgam of both. A meta-analysis, they claimed, does not belong to the realm of qualitative research alone, but is akin also to quantitative study in that it "enables the acquisition of multiple quantitative findings, followed by merging data and information to create a more representative viewpoint" (p. 3). Hence, in studying quantitative research, elements of qualitative research, such as surveys, emerge, representing a reciprocal relationship between the two.

Related Papers

DR FREDRICK ONASANYA

Quantitative research is a more consistent, coherent and data-resulted means of arriving in measuring of what people think from a statistical point of view. Quantitative research can gather prominent amount of data can easily be arranged and controlled into reports for analysis. In quantitative research, numerical data are gathered and mathematical based methods are used for analysis Quantitative research is basically about gathering numerical data to explicate a development especially that needs prompt answers using quantitative methods. It is used to measure mental outlook, beliefs, demeanors, and other defined variables that can be used to generalized results from prominent sample populations Quantitative research pertains to taxonomical of practical thorough check of social developments by the way of statistical, mathematical or computational proficiencies. Quantitative research data are gathered by surveys, audits, purchase points and so on. Quantitative research is quantifiable data to develop the truth or facts and reveal practices in research

quantitative research report study

Methodological Innovations Online

Christine Welch

Kezang sherab

Journal of Evidence Based Medicine and Healthcare

Anshika Srivastava

amelya herda losari

Munyaradzi Moyo

Robert Eppel

Research plays a very important role in making sense of the world around us and developing knowledge basis and systems. As such, understanding research methods and paradigms is very important to scholars and researchers if they are to come up with credible and comprehensive research. This paper discusses the differences between qualitative and quantitative research methods and also looks at how the two methods can be similar and how they can be used together. Qualitative research is a scientific and systematic method used to gather data that it not quantifiable (Yin 2018). This type of research, as Marshall (2016)) explains, "refers to the meanings, concepts definitions, characteristics, metaphors, symbols, and description of things". Therefore, as a research method, qualitative research is also used to unearth new trends in thought processes and actions, how people feel about specific circumstances and to get to the heart of issues and how they affect people (Wolcott 2016). Marshall (2016) emphasizes that qualitative research is primarily exploratory research and is used to obtain information such as intentions and motives that helps explain an occurrence. Thus, qualitative research methods help to understand new occurrences or trends and then helps to explain why such things are happening or occurring. Quantitative Research is used to quantify the problem by way of generating numerical data or data that can be transformed into useable statistics (Lichtman 2017). According to Wolcott (2016) quantitative research is used to quantify attitudes, opinions, behaviors, and other defined variables and generalize results from a larger sample population. Thus, quantitative Research uses measurable data to formulate facts and uncover patterns in research to make sense or deductions on how things have unfolded. Therefore, where qualitative data seeks to understand a phenomenon, quantitative methods seek to quantify them and identify variables that can be measured. Qualitative research uses data collecting methods that often require the direct participation of the researcher to gather data and information crucial to the study. Morgan (2017) notes that

Dr. Mustapha Kulungu

Tutors India

Meta-analysis is the statistical analysis and integration of multiple studies with similar concepts. The basic principle of meta-analysis is the presence of a common truth behind all conceptually similar studies which has been measured with a certain degree of error within each study. Therefore, the objective of meta-analysis will be to arrive at a pooled estimate closest to the common truth using the principles of statistics. Quantitative research deals with numbers and stats, the integration of such researches with similar concepts in order to improve statistical power of estimation is called meta-analysis. Meta-analysis will result in a weighted average that is a close estimate to the true value accounting in various factors. The difference lies in how the weights are allocated and the manner in which uncertainty is compared around the point estimate generated. Meta-analysis, in short, provides the point estimate of the unknown common truth.

Margaret Yuen

RELATED PAPERS

Stefan Vogt

Journal of Materials Research

Journal of the Royal Society of Medicine

Alastair Forbes

Angewandte Chemie

Steven Bernasek

Servanne Woodward

Res Publica

Prangya Das

Shing Mueller

Collis Mukungurutse

Journal of Physics and Chemistry of Solids

Global Journal of Health Science

Sani Ibrahim

Journal of Comparative Physiology B

Elkin Rueda

Quaternary Science Reviews

Danielle Schreve

EDGAR FERNANDO REGIN GUTIERREZ

The Journal of the Acoustical Society of America

Petri Leskinen

Pediatric Research

RICHARD OCONNOR

Cuadernos de Economía

julian Eduardo Melgarejo Torres

Asian Journal of Probability and Statistics

Ibrahim Adamu

The Journal of Molecular Diagnostics

Quang Bùi Q

PharmaTutor

Zulfkar Latief Qadrie

Revista médica de Chile

Max Andresen

Jordan Journal of Agricultural Sciences

Tom Fullerton

Angela Ibáñez

代办国外毕业证、代办澳洲毕业证 办理假毕业证在国内能用吗

See More Documents Like This

RELATED TOPICS

  •   We're Hiring!
  •   Help Center
  • Find new research papers in:
  • Health Sciences
  • Earth Sciences
  • Cognitive Science
  • Mathematics
  • Computer Science
  • Academia ©2024

Study Tracks Shifts in Student Mental Health During College

Dartmouth study followed 200 students all four years, including through the pandemic.

Andrew Campbell seated by a window in a blue t-shirt and glasses

Phone App Uses AI to Detect Depression From Facial Cues

A four-year study by Dartmouth researchers captures the most in-depth data yet on how college students’ self-esteem and mental health fluctuates during their four years in academia, identifying key populations and stressors that the researchers say administrators could target to improve student well-being. 

The study also provides among the first real-time accounts of how the coronavirus pandemic affected students’ behavior and mental health. The stress and uncertainty of COVID-19 resulted in long-lasting behavioral changes that persisted as a “new normal” even as the pandemic diminished, including students feeling more stressed, less socially engaged, and sleeping more.

The researchers tracked more than 200 Dartmouth undergraduates in the classes of 2021 and 2022 for all four years of college. Students volunteered to let a specially developed app called StudentLife tap into the sensors that are built into smartphones. The app cataloged their daily physical and social activity, how long they slept, their location and travel, the time they spent on their phone, and how often they listened to music or watched videos. Students also filled out weekly behavioral surveys, and selected students gave post-study interviews. 

The study—which is the longest mobile-sensing study ever conducted—is published in the Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies .

The researchers will present it at the Association of Computing Machinery’s UbiComp/ISWC 2024 conference in Melbourne, Australia, in October. 

These sorts of tools will have a tremendous impact on projecting forward and developing much more data-driven ways to intervene and respond exactly when students need it most.

The team made their anonymized data set publicly available —including self-reports, surveys, and phone-sensing and brain-imaging data—to help advance research into the mental health of students during their college years. 

Andrew Campbell , the paper’s senior author and Dartmouth’s Albert Bradley 1915 Third Century Professor of Computer Science, says that the study’s extensive data reinforces the importance of college and university administrators across the country being more attuned to how and when students’ mental well-being changes during the school year.

“For the first time, we’ve produced granular data about the ebb and flow of student mental health. It’s incredibly dynamic—there’s nothing that’s steady state through the term, let alone through the year,” he says. “These sorts of tools will have a tremendous impact on projecting forward and developing much more data-driven ways to intervene and respond exactly when students need it most.”

First-year and female students are especially at risk for high anxiety and low self-esteem, the study finds. Among first-year students, self-esteem dropped to its lowest point in the first weeks of their transition from high school to college but rose steadily every semester until it was about 10% higher by graduation.

“We can see that students came out of high school with a certain level of self-esteem that dropped off to the lowest point of the four years. Some said they started to experience ‘imposter syndrome’ from being around other high-performing students,” Campbell says. “As the years progress, though, we can draw a straight line from low to high as their self-esteem improves. I think we would see a similar trend class over class. To me, that’s a very positive thing.”

Female students—who made up 60% of study participants—experienced on average 5% greater stress levels and 10% lower self-esteem than male students. More significantly, the data show that female students tended to be less active, with male students walking 37% more often.

Sophomores were 40% more socially active compared to their first year, the researchers report. But these students also reported feeling 13% more stressed during their second year than during their first year as their workload increased, they felt pressure to socialize, or as first-year social groups dispersed.

One student in a sorority recalled that having pre-arranged activities “kind of adds stress as I feel like I should be having fun because everyone tells me that it is fun.” Another student noted that after the first year, “students have more access to the whole campus and that is when you start feeling excluded from things.” 

In a novel finding, the researchers identify an “anticipatory stress spike” of 17% experienced in the last two weeks of summer break. While still lower than mid-academic year stress, the spike was consistent across different summers.

In post-study interviews, some students pointed to returning to campus early for team sports as a source of stress. Others specified reconnecting with family and high school friends during their first summer home, saying they felt “a sense of leaving behind the comfort and familiarity of these long-standing friendships” as the break ended, the researchers report. 

“This is a foundational study,” says Subigya Nepal , first author of the study and a PhD candidate in Campbell’s research group. “It has more real-time granular data than anything we or anyone else has provided before. We don’t know yet how it will translate to campuses nationwide, but it can be a template for getting the conversation going.”

The depth and accuracy of the study data suggest that mobile-sensing software could eventually give universities the ability to create proactive mental-health policies specific to certain student populations and times of year, Campbell says.

For example, a paper Campbell’s research group published in 2022 based on StudentLife data showed that first-generation students experienced lower self-esteem and higher levels of depression than other students throughout their four years of college.

“We will be able to look at campus in much more nuanced ways than waiting for the results of an annual mental health study and then developing policy,” Campbell says. “We know that Dartmouth is a small and very tight-knit campus community. But if we applied these same methods to a college with similar attributes, I believe we would find very similar trends.”

Weathering the pandemic

When students returned home at the start of the coronavirus pandemic, the researchers found that self-esteem actually increased during the pandemic by 5% overall and by another 6% afterward when life returned closer to what it was before. One student suggested in their interview that getting older came with more confidence. Others indicated that being home led to them spending more time with friends talking on the phone, on social media, or streaming movies together. 

The data show that phone usage—measured by the duration a phone was unlocked—indeed increased by nearly 33 minutes, or 19%, during the pandemic, while time spent in physical activity dropped by 52 minutes, or 27%. By 2022, phone usage fell from its pandemic peak to just above pre-pandemic levels, while engagement in physical activity had recovered to exceed the pre-pandemic period by three minutes. 

Despite reporting higher self-esteem, students’ feelings of stress increased by more than 10% during the pandemic. By the end of the study in June 2022, stress had fallen by less than 2% of its pandemic peak, indicating that the experience had a lasting impact on student well-being, the researchers report. 

In early 2021, as students returned to campus, their reunion with friends and community was tempered by an overwhelming concern about the still-rampant coronavirus. “There was the first outbreak in winter 2021 and that was terrifying,” one student recalls. Another student adds: “You could be put into isolation for a long time even if you did not have COVID. Everyone was afraid to contact-trace anyone else in case they got mad at each other.”

Female students were especially concerned about the coronavirus, on average 13% more than male students. “Even though the girls might have been hanging out with each other more, they are more aware of the impact,” one female student reported. “I actually had COVID and exposed some friends of mine. All the girls that I told tested as they were worried. They were continually checking up to make sure that they did not have it and take it home to their family.”

Students still learning remotely had social levels 16% higher than students on campus, who engaged in activity an average of 10% less often than when they were learning from home. However, on-campus students used their phones 47% more often. When interviewed after the study, these students reported spending extended periods of time video-calling or streaming movies with friends and family.

Social activity and engagement had not yet returned to pre-pandemic levels by the end of the study in June 2022, recovering by a little less than 3% after a nearly 10% drop during the pandemic. Similarly, the pandemic correlates with students sticking closer to home, with their distance traveled nearly cut in half during the pandemic and holding at that level since then.

Campbell and several of his fellow researchers are now developing a smartphone app known as MoodCapture that uses artificial intelligence paired with facial-image processing software to reliably detect the onset of depression before the user even knows something is wrong.

Morgan Kelly can be reached at [email protected] .

  • Mental Health and Wellness
  • Innovation and Impact
  • Arts and Sciences
  • Class of 2021
  • Class of 2022
  • Department of Computer Science
  • Guarini School of Graduate and Advanced Studies
  • Mental Health

A Q&A With Film Critic and Theorist Vinzenz Hediger

Portrait of Montgomery Fellow Vinzenz Hediger

After a Year of Turmoil, Harvard’s Applications Drop

  • Study Guides
  • Homework Questions

Discussion Thread Quantitative and Qualitative Research

  • Health Science

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • View all journals
  • My Account Login
  • Explore content
  • About the journal
  • Publish with us
  • Sign up for alerts
  • Open access
  • Published: 26 March 2024

Predicting and improving complex beer flavor through machine learning

  • Michiel Schreurs   ORCID: orcid.org/0000-0002-9449-5619 1 , 2 , 3   na1 ,
  • Supinya Piampongsant 1 , 2 , 3   na1 ,
  • Miguel Roncoroni   ORCID: orcid.org/0000-0001-7461-1427 1 , 2 , 3   na1 ,
  • Lloyd Cool   ORCID: orcid.org/0000-0001-9936-3124 1 , 2 , 3 , 4 ,
  • Beatriz Herrera-Malaver   ORCID: orcid.org/0000-0002-5096-9974 1 , 2 , 3 ,
  • Christophe Vanderaa   ORCID: orcid.org/0000-0001-7443-5427 4 ,
  • Florian A. Theßeling 1 , 2 , 3 ,
  • Łukasz Kreft   ORCID: orcid.org/0000-0001-7620-4657 5 ,
  • Alexander Botzki   ORCID: orcid.org/0000-0001-6691-4233 5 ,
  • Philippe Malcorps 6 ,
  • Luk Daenen 6 ,
  • Tom Wenseleers   ORCID: orcid.org/0000-0002-1434-861X 4 &
  • Kevin J. Verstrepen   ORCID: orcid.org/0000-0002-3077-6219 1 , 2 , 3  

Nature Communications volume  15 , Article number:  2368 ( 2024 ) Cite this article

39k Accesses

749 Altmetric

Metrics details

  • Chemical engineering
  • Gas chromatography
  • Machine learning
  • Metabolomics
  • Taste receptors

The perception and appreciation of food flavor depends on many interacting chemical compounds and external factors, and therefore proves challenging to understand and predict. Here, we combine extensive chemical and sensory analyses of 250 different beers to train machine learning models that allow predicting flavor and consumer appreciation. For each beer, we measure over 200 chemical properties, perform quantitative descriptive sensory analysis with a trained tasting panel and map data from over 180,000 consumer reviews to train 10 different machine learning models. The best-performing algorithm, Gradient Boosting, yields models that significantly outperform predictions based on conventional statistics and accurately predict complex food features and consumer appreciation from chemical profiles. Model dissection allows identifying specific and unexpected compounds as drivers of beer flavor and appreciation. Adding these compounds results in variants of commercial alcoholic and non-alcoholic beers with improved consumer appreciation. Together, our study reveals how big data and machine learning uncover complex links between food chemistry, flavor and consumer perception, and lays the foundation to develop novel, tailored foods with superior flavors.

Similar content being viewed by others

quantitative research report study

BitterSweet: Building machine learning models for predicting the bitter and sweet taste of small molecules

Rudraksh Tuwani, Somin Wadhwa & Ganesh Bagler

quantitative research report study

Sensory lexicon and aroma volatiles analysis of brewing malt

Xiaoxia Su, Miao Yu, … Tianyi Du

quantitative research report study

Predicting odor from molecular structure: a multi-label classification approach

Kushagra Saini & Venkatnarayan Ramanathan

Introduction

Predicting and understanding food perception and appreciation is one of the major challenges in food science. Accurate modeling of food flavor and appreciation could yield important opportunities for both producers and consumers, including quality control, product fingerprinting, counterfeit detection, spoilage detection, and the development of new products and product combinations (food pairing) 1 , 2 , 3 , 4 , 5 , 6 . Accurate models for flavor and consumer appreciation would contribute greatly to our scientific understanding of how humans perceive and appreciate flavor. Moreover, accurate predictive models would also facilitate and standardize existing food assessment methods and could supplement or replace assessments by trained and consumer tasting panels, which are variable, expensive and time-consuming 7 , 8 , 9 . Lastly, apart from providing objective, quantitative, accurate and contextual information that can help producers, models can also guide consumers in understanding their personal preferences 10 .

Despite the myriad of applications, predicting food flavor and appreciation from its chemical properties remains a largely elusive goal in sensory science, especially for complex food and beverages 11 , 12 . A key obstacle is the immense number of flavor-active chemicals underlying food flavor. Flavor compounds can vary widely in chemical structure and concentration, making them technically challenging and labor-intensive to quantify, even in the face of innovations in metabolomics, such as non-targeted metabolic fingerprinting 13 , 14 . Moreover, sensory analysis is perhaps even more complicated. Flavor perception is highly complex, resulting from hundreds of different molecules interacting at the physiochemical and sensorial level. Sensory perception is often non-linear, characterized by complex and concentration-dependent synergistic and antagonistic effects 15 , 16 , 17 , 18 , 19 , 20 , 21 that are further convoluted by the genetics, environment, culture and psychology of consumers 22 , 23 , 24 . Perceived flavor is therefore difficult to measure, with problems of sensitivity, accuracy, and reproducibility that can only be resolved by gathering sufficiently large datasets 25 . Trained tasting panels are considered the prime source of quality sensory data, but require meticulous training, are low throughput and high cost. Public databases containing consumer reviews of food products could provide a valuable alternative, especially for studying appreciation scores, which do not require formal training 25 . Public databases offer the advantage of amassing large amounts of data, increasing the statistical power to identify potential drivers of appreciation. However, public datasets suffer from biases, including a bias in the volunteers that contribute to the database, as well as confounding factors such as price, cult status and psychological conformity towards previous ratings of the product.

Classical multivariate statistics and machine learning methods have been used to predict flavor of specific compounds by, for example, linking structural properties of a compound to its potential biological activities or linking concentrations of specific compounds to sensory profiles 1 , 26 . Importantly, most previous studies focused on predicting organoleptic properties of single compounds (often based on their chemical structure) 27 , 28 , 29 , 30 , 31 , 32 , 33 , thus ignoring the fact that these compounds are present in a complex matrix in food or beverages and excluding complex interactions between compounds. Moreover, the classical statistics commonly used in sensory science 34 , 35 , 36 , 37 , 38 , 39 require a large sample size and sufficient variance amongst predictors to create accurate models. They are not fit for studying an extensive set of hundreds of interacting flavor compounds, since they are sensitive to outliers, have a high tendency to overfit and are less suited for non-linear and discontinuous relationships 40 .

In this study, we combine extensive chemical analyses and sensory data of a set of different commercial beers with machine learning approaches to develop models that predict taste, smell, mouthfeel and appreciation from compound concentrations. Beer is particularly suited to model the relationship between chemistry, flavor and appreciation. First, beer is a complex product, consisting of thousands of flavor compounds that partake in complex sensory interactions 41 , 42 , 43 . This chemical diversity arises from the raw materials (malt, yeast, hops, water and spices) and biochemical conversions during the brewing process (kilning, mashing, boiling, fermentation, maturation and aging) 44 , 45 . Second, the advent of the internet saw beer consumers embrace online review platforms, such as RateBeer (ZX Ventures, Anheuser-Busch InBev SA/NV) and BeerAdvocate (Next Glass, inc.). In this way, the beer community provides massive data sets of beer flavor and appreciation scores, creating extraordinarily large sensory databases to complement the analyses of our professional sensory panel. Specifically, we characterize over 200 chemical properties of 250 commercial beers, spread across 22 beer styles, and link these to the descriptive sensory profiling data of a 16-person in-house trained tasting panel and data acquired from over 180,000 public consumer reviews. These unique and extensive datasets enable us to train a suite of machine learning models to predict flavor and appreciation from a beer’s chemical profile. Dissection of the best-performing models allows us to pinpoint specific compounds as potential drivers of beer flavor and appreciation. Follow-up experiments confirm the importance of these compounds and ultimately allow us to significantly improve the flavor and appreciation of selected commercial beers. Together, our study represents a significant step towards understanding complex flavors and reinforces the value of machine learning to develop and refine complex foods. In this way, it represents a stepping stone for further computer-aided food engineering applications 46 .

To generate a comprehensive dataset on beer flavor, we selected 250 commercial Belgian beers across 22 different beer styles (Supplementary Fig.  S1 ). Beers with ≤ 4.2% alcohol by volume (ABV) were classified as non-alcoholic and low-alcoholic. Blonds and Tripels constitute a significant portion of the dataset (12.4% and 11.2%, respectively) reflecting their presence on the Belgian beer market and the heterogeneity of beers within these styles. By contrast, lager beers are less diverse and dominated by a handful of brands. Rare styles such as Brut or Faro make up only a small fraction of the dataset (2% and 1%, respectively) because fewer of these beers are produced and because they are dominated by distinct characteristics in terms of flavor and chemical composition.

Extensive analysis identifies relationships between chemical compounds in beer

For each beer, we measured 226 different chemical properties, including common brewing parameters such as alcohol content, iso-alpha acids, pH, sugar concentration 47 , and over 200 flavor compounds (Methods, Supplementary Table  S1 ). A large portion (37.2%) are terpenoids arising from hopping, responsible for herbal and fruity flavors 16 , 48 . A second major category are yeast metabolites, such as esters and alcohols, that result in fruity and solvent notes 48 , 49 , 50 . Other measured compounds are primarily derived from malt, or other microbes such as non- Saccharomyces yeasts and bacteria (‘wild flora’). Compounds that arise from spices or staling are labeled under ‘Others’. Five attributes (caloric value, total acids and total ester, hop aroma and sulfur compounds) are calculated from multiple individually measured compounds.

As a first step in identifying relationships between chemical properties, we determined correlations between the concentrations of the compounds (Fig.  1 , upper panel, Supplementary Data  1 and 2 , and Supplementary Fig.  S2 . For the sake of clarity, only a subset of the measured compounds is shown in Fig.  1 ). Compounds of the same origin typically show a positive correlation, while absence of correlation hints at parameters varying independently. For example, the hop aroma compounds citronellol, and alpha-terpineol show moderate correlations with each other (Spearman’s rho=0.39 and 0.57), but not with the bittering hop component iso-alpha acids (Spearman’s rho=0.16 and −0.07). This illustrates how brewers can independently modify hop aroma and bitterness by selecting hop varieties and dosage time. If hops are added early in the boiling phase, chemical conversions increase bitterness while aromas evaporate, conversely, late addition of hops preserves aroma but limits bitterness 51 . Similarly, hop-derived iso-alpha acids show a strong anti-correlation with lactic acid and acetic acid, likely reflecting growth inhibition of lactic acid and acetic acid bacteria, or the consequent use of fewer hops in sour beer styles, such as West Flanders ales and Fruit beers, that rely on these bacteria for their distinct flavors 52 . Finally, yeast-derived esters (ethyl acetate, ethyl decanoate, ethyl hexanoate, ethyl octanoate) and alcohols (ethanol, isoamyl alcohol, isobutanol, and glycerol), correlate with Spearman coefficients above 0.5, suggesting that these secondary metabolites are correlated with the yeast genetic background and/or fermentation parameters and may be difficult to influence individually, although the choice of yeast strain may offer some control 53 .

figure 1

Spearman rank correlations are shown. Descriptors are grouped according to their origin (malt (blue), hops (green), yeast (red), wild flora (yellow), Others (black)), and sensory aspect (aroma, taste, palate, and overall appreciation). Please note that for the chemical compounds, for the sake of clarity, only a subset of the total number of measured compounds is shown, with an emphasis on the key compounds for each source. For more details, see the main text and Methods section. Chemical data can be found in Supplementary Data  1 , correlations between all chemical compounds are depicted in Supplementary Fig.  S2 and correlation values can be found in Supplementary Data  2 . See Supplementary Data  4 for sensory panel assessments and Supplementary Data  5 for correlation values between all sensory descriptors.

Interestingly, different beer styles show distinct patterns for some flavor compounds (Supplementary Fig.  S3 ). These observations agree with expectations for key beer styles, and serve as a control for our measurements. For instance, Stouts generally show high values for color (darker), while hoppy beers contain elevated levels of iso-alpha acids, compounds associated with bitter hop taste. Acetic and lactic acid are not prevalent in most beers, with notable exceptions such as Kriek, Lambic, Faro, West Flanders ales and Flanders Old Brown, which use acid-producing bacteria ( Lactobacillus and Pediococcus ) or unconventional yeast ( Brettanomyces ) 54 , 55 . Glycerol, ethanol and esters show similar distributions across all beer styles, reflecting their common origin as products of yeast metabolism during fermentation 45 , 53 . Finally, low/no-alcohol beers contain low concentrations of glycerol and esters. This is in line with the production process for most of the low/no-alcohol beers in our dataset, which are produced through limiting fermentation or by stripping away alcohol via evaporation or dialysis, with both methods having the unintended side-effect of reducing the amount of flavor compounds in the final beer 56 , 57 .

Besides expected associations, our data also reveals less trivial associations between beer styles and specific parameters. For example, geraniol and citronellol, two monoterpenoids responsible for citrus, floral and rose flavors and characteristic of Citra hops, are found in relatively high amounts in Christmas, Saison, and Brett/co-fermented beers, where they may originate from terpenoid-rich spices such as coriander seeds instead of hops 58 .

Tasting panel assessments reveal sensorial relationships in beer

To assess the sensory profile of each beer, a trained tasting panel evaluated each of the 250 beers for 50 sensory attributes, including different hop, malt and yeast flavors, off-flavors and spices. Panelists used a tasting sheet (Supplementary Data  3 ) to score the different attributes. Panel consistency was evaluated by repeating 12 samples across different sessions and performing ANOVA. In 95% of cases no significant difference was found across sessions ( p  > 0.05), indicating good panel consistency (Supplementary Table  S2 ).

Aroma and taste perception reported by the trained panel are often linked (Fig.  1 , bottom left panel and Supplementary Data  4 and 5 ), with high correlations between hops aroma and taste (Spearman’s rho=0.83). Bitter taste was found to correlate with hop aroma and taste in general (Spearman’s rho=0.80 and 0.69), and particularly with “grassy” noble hops (Spearman’s rho=0.75). Barnyard flavor, most often associated with sour beers, is identified together with stale hops (Spearman’s rho=0.97) that are used in these beers. Lactic and acetic acid, which often co-occur, are correlated (Spearman’s rho=0.66). Interestingly, sweetness and bitterness are anti-correlated (Spearman’s rho = −0.48), confirming the hypothesis that they mask each other 59 , 60 . Beer body is highly correlated with alcohol (Spearman’s rho = 0.79), and overall appreciation is found to correlate with multiple aspects that describe beer mouthfeel (alcohol, carbonation; Spearman’s rho= 0.32, 0.39), as well as with hop and ester aroma intensity (Spearman’s rho=0.39 and 0.35).

Similar to the chemical analyses, sensorial analyses confirmed typical features of specific beer styles (Supplementary Fig.  S4 ). For example, sour beers (Faro, Flanders Old Brown, Fruit beer, Kriek, Lambic, West Flanders ale) were rated acidic, with flavors of both acetic and lactic acid. Hoppy beers were found to be bitter and showed hop-associated aromas like citrus and tropical fruit. Malt taste is most detected among scotch, stout/porters, and strong ales, while low/no-alcohol beers, which often have a reputation for being ‘worty’ (reminiscent of unfermented, sweet malt extract) appear in the middle. Unsurprisingly, hop aromas are most strongly detected among hoppy beers. Like its chemical counterpart (Supplementary Fig.  S3 ), acidity shows a right-skewed distribution, with the most acidic beers being Krieks, Lambics, and West Flanders ales.

Tasting panel assessments of specific flavors correlate with chemical composition

We find that the concentrations of several chemical compounds strongly correlate with specific aroma or taste, as evaluated by the tasting panel (Fig.  2 , Supplementary Fig.  S5 , Supplementary Data  6 ). In some cases, these correlations confirm expectations and serve as a useful control for data quality. For example, iso-alpha acids, the bittering compounds in hops, strongly correlate with bitterness (Spearman’s rho=0.68), while ethanol and glycerol correlate with tasters’ perceptions of alcohol and body, the mouthfeel sensation of fullness (Spearman’s rho=0.82/0.62 and 0.72/0.57 respectively) and darker color from roasted malts is a good indication of malt perception (Spearman’s rho=0.54).

figure 2

Heatmap colors indicate Spearman’s Rho. Axes are organized according to sensory categories (aroma, taste, mouthfeel, overall), chemical categories and chemical sources in beer (malt (blue), hops (green), yeast (red), wild flora (yellow), Others (black)). See Supplementary Data  6 for all correlation values.

Interestingly, for some relationships between chemical compounds and perceived flavor, correlations are weaker than expected. For example, the rose-smelling phenethyl acetate only weakly correlates with floral aroma. This hints at more complex relationships and interactions between compounds and suggests a need for a more complex model than simple correlations. Lastly, we uncovered unexpected correlations. For instance, the esters ethyl decanoate and ethyl octanoate appear to correlate slightly with hop perception and bitterness, possibly due to their fruity flavor. Iron is anti-correlated with hop aromas and bitterness, most likely because it is also anti-correlated with iso-alpha acids. This could be a sign of metal chelation of hop acids 61 , given that our analyses measure unbound hop acids and total iron content, or could result from the higher iron content in dark and Fruit beers, which typically have less hoppy and bitter flavors 62 .

Public consumer reviews complement expert panel data

To complement and expand the sensory data of our trained tasting panel, we collected 180,000 reviews of our 250 beers from the online consumer review platform RateBeer. This provided numerical scores for beer appearance, aroma, taste, palate, overall quality as well as the average overall score.

Public datasets are known to suffer from biases, such as price, cult status and psychological conformity towards previous ratings of a product. For example, prices correlate with appreciation scores for these online consumer reviews (rho=0.49, Supplementary Fig.  S6 ), but not for our trained tasting panel (rho=0.19). This suggests that prices affect consumer appreciation, which has been reported in wine 63 , while blind tastings are unaffected. Moreover, we observe that some beer styles, like lagers and non-alcoholic beers, generally receive lower scores, reflecting that online reviewers are mostly beer aficionados with a preference for specialty beers over lager beers. In general, we find a modest correlation between our trained panel’s overall appreciation score and the online consumer appreciation scores (Fig.  3 , rho=0.29). Apart from the aforementioned biases in the online datasets, serving temperature, sample freshness and surroundings, which are all tightly controlled during the tasting panel sessions, can vary tremendously across online consumers and can further contribute to (among others, appreciation) differences between the two categories of tasters. Importantly, in contrast to the overall appreciation scores, for many sensory aspects the results from the professional panel correlated well with results obtained from RateBeer reviews. Correlations were highest for features that are relatively easy to recognize even for untrained tasters, like bitterness, sweetness, alcohol and malt aroma (Fig.  3 and below).

figure 3

RateBeer text mining results can be found in Supplementary Data  7 . Rho values shown are Spearman correlation values, with asterisks indicating significant correlations ( p  < 0.05, two-sided). All p values were smaller than 0.001, except for Esters aroma (0.0553), Esters taste (0.3275), Esters aroma—banana (0.0019), Coriander (0.0508) and Diacetyl (0.0134).

Besides collecting consumer appreciation from these online reviews, we developed automated text analysis tools to gather additional data from review texts (Supplementary Data  7 ). Processing review texts on the RateBeer database yielded comparable results to the scores given by the trained panel for many common sensory aspects, including acidity, bitterness, sweetness, alcohol, malt, and hop tastes (Fig.  3 ). This is in line with what would be expected, since these attributes require less training for accurate assessment and are less influenced by environmental factors such as temperature, serving glass and odors in the environment. Consumer reviews also correlate well with our trained panel for 4-vinyl guaiacol, a compound associated with a very characteristic aroma. By contrast, correlations for more specific aromas like ester, coriander or diacetyl are underrepresented in the online reviews, underscoring the importance of using a trained tasting panel and standardized tasting sheets with explicit factors to be scored for evaluating specific aspects of a beer. Taken together, our results suggest that public reviews are trustworthy for some, but not all, flavor features and can complement or substitute taste panel data for these sensory aspects.

Models can predict beer sensory profiles from chemical data

The rich datasets of chemical analyses, tasting panel assessments and public reviews gathered in the first part of this study provided us with a unique opportunity to develop predictive models that link chemical data to sensorial features. Given the complexity of beer flavor, basic statistical tools such as correlations or linear regression may not always be the most suitable for making accurate predictions. Instead, we applied different machine learning models that can model both simple linear and complex interactive relationships. Specifically, we constructed a set of regression models to predict (a) trained panel scores for beer flavor and quality and (b) public reviews’ appreciation scores from beer chemical profiles. We trained and tested 10 different models (Methods), 3 linear regression-based models (simple linear regression with first-order interactions (LR), lasso regression with first-order interactions (Lasso), partial least squares regressor (PLSR)), 5 decision tree models (AdaBoost regressor (ABR), extra trees (ET), gradient boosting regressor (GBR), random forest (RF) and XGBoost regressor (XGBR)), 1 support vector regression (SVR), and 1 artificial neural network (ANN) model.

To compare the performance of our machine learning models, the dataset was randomly split into a training and test set, stratified by beer style. After a model was trained on data in the training set, its performance was evaluated on its ability to predict the test dataset obtained from multi-output models (based on the coefficient of determination, see Methods). Additionally, individual-attribute models were ranked per descriptor and the average rank was calculated, as proposed by Korneva et al. 64 . Importantly, both ways of evaluating the models’ performance agreed in general. Performance of the different models varied (Table  1 ). It should be noted that all models perform better at predicting RateBeer results than results from our trained tasting panel. One reason could be that sensory data is inherently variable, and this variability is averaged out with the large number of public reviews from RateBeer. Additionally, all tree-based models perform better at predicting taste than aroma. Linear models (LR) performed particularly poorly, with negative R 2 values, due to severe overfitting (training set R 2  = 1). Overfitting is a common issue in linear models with many parameters and limited samples, especially with interaction terms further amplifying the number of parameters. L1 regularization (Lasso) successfully overcomes this overfitting, out-competing multiple tree-based models on the RateBeer dataset. Similarly, the dimensionality reduction of PLSR avoids overfitting and improves performance, to some extent. Still, tree-based models (ABR, ET, GBR, RF and XGBR) show the best performance, out-competing the linear models (LR, Lasso, PLSR) commonly used in sensory science 65 .

GBR models showed the best overall performance in predicting sensory responses from chemical information, with R 2 values up to 0.75 depending on the predicted sensory feature (Supplementary Table  S4 ). The GBR models predict consumer appreciation (RateBeer) better than our trained panel’s appreciation (R 2 value of 0.67 compared to R 2 value of 0.09) (Supplementary Table  S3 and Supplementary Table  S4 ). ANN models showed intermediate performance, likely because neural networks typically perform best with larger datasets 66 . The SVR shows intermediate performance, mostly due to the weak predictions of specific attributes that lower the overall performance (Supplementary Table  S4 ).

Model dissection identifies specific, unexpected compounds as drivers of consumer appreciation

Next, we leveraged our models to infer important contributors to sensory perception and consumer appreciation. Consumer preference is a crucial sensory aspects, because a product that shows low consumer appreciation scores often does not succeed commercially 25 . Additionally, the requirement for a large number of representative evaluators makes consumer trials one of the more costly and time-consuming aspects of product development. Hence, a model for predicting chemical drivers of overall appreciation would be a welcome addition to the available toolbox for food development and optimization.

Since GBR models on our RateBeer dataset showed the best overall performance, we focused on these models. Specifically, we used two approaches to identify important contributors. First, rankings of the most important predictors for each sensorial trait in the GBR models were obtained based on impurity-based feature importance (mean decrease in impurity). High-ranked parameters were hypothesized to be either the true causal chemical properties underlying the trait, to correlate with the actual causal properties, or to take part in sensory interactions affecting the trait 67 (Fig.  4A ). In a second approach, we used SHAP 68 to determine which parameters contributed most to the model for making predictions of consumer appreciation (Fig.  4B ). SHAP calculates parameter contributions to model predictions on a per-sample basis, which can be aggregated into an importance score.

figure 4

A The impurity-based feature importance (mean deviance in impurity, MDI) calculated from the Gradient Boosting Regression (GBR) model predicting RateBeer appreciation scores. The top 15 highest ranked chemical properties are shown. B SHAP summary plot for the top 15 parameters contributing to our GBR model. Each point on the graph represents a sample from our dataset. The color represents the concentration of that parameter, with bluer colors representing low values and redder colors representing higher values. Greater absolute values on the horizontal axis indicate a higher impact of the parameter on the prediction of the model. C Spearman correlations between the 15 most important chemical properties and consumer overall appreciation. Numbers indicate the Spearman Rho correlation coefficient, and the rank of this correlation compared to all other correlations. The top 15 important compounds were determined using SHAP (panel B).

Both approaches identified ethyl acetate as the most predictive parameter for beer appreciation (Fig.  4 ). Ethyl acetate is the most abundant ester in beer with a typical ‘fruity’, ‘solvent’ and ‘alcoholic’ flavor, but is often considered less important than other esters like isoamyl acetate. The second most important parameter identified by SHAP is ethanol, the most abundant beer compound after water. Apart from directly contributing to beer flavor and mouthfeel, ethanol drastically influences the physical properties of beer, dictating how easily volatile compounds escape the beer matrix to contribute to beer aroma 69 . Importantly, it should also be noted that the importance of ethanol for appreciation is likely inflated by the very low appreciation scores of non-alcoholic beers (Supplementary Fig.  S4 ). Despite not often being considered a driver of beer appreciation, protein level also ranks highly in both approaches, possibly due to its effect on mouthfeel and body 70 . Lactic acid, which contributes to the tart taste of sour beers, is the fourth most important parameter identified by SHAP, possibly due to the generally high appreciation of sour beers in our dataset.

Interestingly, some of the most important predictive parameters for our model are not well-established as beer flavors or are even commonly regarded as being negative for beer quality. For example, our models identify methanethiol and ethyl phenyl acetate, an ester commonly linked to beer staling 71 , as a key factor contributing to beer appreciation. Although there is no doubt that high concentrations of these compounds are considered unpleasant, the positive effects of modest concentrations are not yet known 72 , 73 .

To compare our approach to conventional statistics, we evaluated how well the 15 most important SHAP-derived parameters correlate with consumer appreciation (Fig.  4C ). Interestingly, only 6 of the properties derived by SHAP rank amongst the top 15 most correlated parameters. For some chemical compounds, the correlations are so low that they would have likely been considered unimportant. For example, lactic acid, the fourth most important parameter, shows a bimodal distribution for appreciation, with sour beers forming a separate cluster, that is missed entirely by the Spearman correlation. Additionally, the correlation plots reveal outliers, emphasizing the need for robust analysis tools. Together, this highlights the need for alternative models, like the Gradient Boosting model, that better grasp the complexity of (beer) flavor.

Finally, to observe the relationships between these chemical properties and their predicted targets, partial dependence plots were constructed for the six most important predictors of consumer appreciation 74 , 75 , 76 (Supplementary Fig.  S7 ). One-way partial dependence plots show how a change in concentration affects the predicted appreciation. These plots reveal an important limitation of our models: appreciation predictions remain constant at ever-increasing concentrations. This implies that once a threshold concentration is reached, further increasing the concentration does not affect appreciation. This is false, as it is well-documented that certain compounds become unpleasant at high concentrations, including ethyl acetate (‘nail polish’) 77 and methanethiol (‘sulfury’ and ‘rotten cabbage’) 78 . The inability of our models to grasp that flavor compounds have optimal levels, above which they become negative, is a consequence of working with commercial beer brands where (off-)flavors are rarely too high to negatively impact the product. The two-way partial dependence plots show how changing the concentration of two compounds influences predicted appreciation, visualizing their interactions (Supplementary Fig.  S7 ). In our case, the top 5 parameters are dominated by additive or synergistic interactions, with high concentrations for both compounds resulting in the highest predicted appreciation.

To assess the robustness of our best-performing models and model predictions, we performed 100 iterations of the GBR, RF and ET models. In general, all iterations of the models yielded similar performance (Supplementary Fig.  S8 ). Moreover, the main predictors (including the top predictors ethanol and ethyl acetate) remained virtually the same, especially for GBR and RF. For the iterations of the ET model, we did observe more variation in the top predictors, which is likely a consequence of the model’s inherent random architecture in combination with co-correlations between certain predictors. However, even in this case, several of the top predictors (ethanol and ethyl acetate) remain unchanged, although their rank in importance changes (Supplementary Fig.  S8 ).

Next, we investigated if a combination of RateBeer and trained panel data into one consolidated dataset would lead to stronger models, under the hypothesis that such a model would suffer less from bias in the datasets. A GBR model was trained to predict appreciation on the combined dataset. This model underperformed compared to the RateBeer model, both in the native case and when including a dataset identifier (R 2  = 0.67, 0.26 and 0.42 respectively). For the latter, the dataset identifier is the most important feature (Supplementary Fig.  S9 ), while most of the feature importance remains unchanged, with ethyl acetate and ethanol ranking highest, like in the original model trained only on RateBeer data. It seems that the large variation in the panel dataset introduces noise, weakening the models’ performances and reliability. In addition, it seems reasonable to assume that both datasets are fundamentally different, with the panel dataset obtained by blind tastings by a trained professional panel.

Lastly, we evaluated whether beer style identifiers would further enhance the model’s performance. A GBR model was trained with parameters that explicitly encoded the styles of the samples. This did not improve model performance (R2 = 0.66 with style information vs R2 = 0.67). The most important chemical features are consistent with the model trained without style information (eg. ethanol and ethyl acetate), and with the exception of the most preferred (strong ale) and least preferred (low/no-alcohol) styles, none of the styles were among the most important features (Supplementary Fig.  S9 , Supplementary Table  S5 and S6 ). This is likely due to a combination of style-specific chemical signatures, such as iso-alpha acids and lactic acid, that implicitly convey style information to the original models, as well as the low number of samples belonging to some styles, making it difficult for the model to learn style-specific patterns. Moreover, beer styles are not rigorously defined, with some styles overlapping in features and some beers being misattributed to a specific style, all of which leads to more noise in models that use style parameters.

Model validation

To test if our predictive models give insight into beer appreciation, we set up experiments aimed at improving existing commercial beers. We specifically selected overall appreciation as the trait to be examined because of its complexity and commercial relevance. Beer flavor comprises a complex bouquet rather than single aromas and tastes 53 . Hence, adding a single compound to the extent that a difference is noticeable may lead to an unbalanced, artificial flavor. Therefore, we evaluated the effect of combinations of compounds. Because Blond beers represent the most extensive style in our dataset, we selected a beer from this style as the starting material for these experiments (Beer 64 in Supplementary Data  1 ).

In the first set of experiments, we adjusted the concentrations of compounds that made up the most important predictors of overall appreciation (ethyl acetate, ethanol, lactic acid, ethyl phenyl acetate) together with correlated compounds (ethyl hexanoate, isoamyl acetate, glycerol), bringing them up to 95 th percentile ethanol-normalized concentrations (Methods) within the Blond group (‘Spiked’ concentration in Fig.  5A ). Compared to controls, the spiked beers were found to have significantly improved overall appreciation among trained panelists, with panelist noting increased intensity of ester flavors, sweetness, alcohol, and body fullness (Fig.  5B ). To disentangle the contribution of ethanol to these results, a second experiment was performed without the addition of ethanol. This resulted in a similar outcome, including increased perception of alcohol and overall appreciation.

figure 5

Adding the top chemical compounds, identified as best predictors of appreciation by our model, into poorly appreciated beers results in increased appreciation from our trained panel. Results of sensory tests between base beers and those spiked with compounds identified as the best predictors by the model. A Blond and Non/Low-alcohol (0.0% ABV) base beers were brought up to 95th-percentile ethanol-normalized concentrations within each style. B For each sensory attribute, tasters indicated the more intense sample and selected the sample they preferred. The numbers above the bars correspond to the p values that indicate significant changes in perceived flavor (two-sided binomial test: alpha 0.05, n  = 20 or 13).

In a last experiment, we tested whether using the model’s predictions can boost the appreciation of a non-alcoholic beer (beer 223 in Supplementary Data  1 ). Again, the addition of a mixture of predicted compounds (omitting ethanol, in this case) resulted in a significant increase in appreciation, body, ester flavor and sweetness.

Predicting flavor and consumer appreciation from chemical composition is one of the ultimate goals of sensory science. A reliable, systematic and unbiased way to link chemical profiles to flavor and food appreciation would be a significant asset to the food and beverage industry. Such tools would substantially aid in quality control and recipe development, offer an efficient and cost-effective alternative to pilot studies and consumer trials and would ultimately allow food manufacturers to produce superior, tailor-made products that better meet the demands of specific consumer groups more efficiently.

A limited set of studies have previously tried, to varying degrees of success, to predict beer flavor and beer popularity based on (a limited set of) chemical compounds and flavors 79 , 80 . Current sensitive, high-throughput technologies allow measuring an unprecedented number of chemical compounds and properties in a large set of samples, yielding a dataset that can train models that help close the gaps between chemistry and flavor, even for a complex natural product like beer. To our knowledge, no previous research gathered data at this scale (250 samples, 226 chemical parameters, 50 sensory attributes and 5 consumer scores) to disentangle and validate the chemical aspects driving beer preference using various machine-learning techniques. We find that modern machine learning models outperform conventional statistical tools, such as correlations and linear models, and can successfully predict flavor appreciation from chemical composition. This could be attributed to the natural incorporation of interactions and non-linear or discontinuous effects in machine learning models, which are not easily grasped by the linear model architecture. While linear models and partial least squares regression represent the most widespread statistical approaches in sensory science, in part because they allow interpretation 65 , 81 , 82 , modern machine learning methods allow for building better predictive models while preserving the possibility to dissect and exploit the underlying patterns. Of the 10 different models we trained, tree-based models, such as our best performing GBR, showed the best overall performance in predicting sensory responses from chemical information, outcompeting artificial neural networks. This agrees with previous reports for models trained on tabular data 83 . Our results are in line with the findings of Colantonio et al. who also identified the gradient boosting architecture as performing best at predicting appreciation and flavor (of tomatoes and blueberries, in their specific study) 26 . Importantly, besides our larger experimental scale, we were able to directly confirm our models’ predictions in vivo.

Our study confirms that flavor compound concentration does not always correlate with perception, suggesting complex interactions that are often missed by more conventional statistics and simple models. Specifically, we find that tree-based algorithms may perform best in developing models that link complex food chemistry with aroma. Furthermore, we show that massive datasets of untrained consumer reviews provide a valuable source of data, that can complement or even replace trained tasting panels, especially for appreciation and basic flavors, such as sweetness and bitterness. This holds despite biases that are known to occur in such datasets, such as price or conformity bias. Moreover, GBR models predict taste better than aroma. This is likely because taste (e.g. bitterness) often directly relates to the corresponding chemical measurements (e.g., iso-alpha acids), whereas such a link is less clear for aromas, which often result from the interplay between multiple volatile compounds. We also find that our models are best at predicting acidity and alcohol, likely because there is a direct relation between the measured chemical compounds (acids and ethanol) and the corresponding perceived sensorial attribute (acidity and alcohol), and because even untrained consumers are generally able to recognize these flavors and aromas.

The predictions of our final models, trained on review data, hold even for blind tastings with small groups of trained tasters, as demonstrated by our ability to validate specific compounds as drivers of beer flavor and appreciation. Since adding a single compound to the extent of a noticeable difference may result in an unbalanced flavor profile, we specifically tested our identified key drivers as a combination of compounds. While this approach does not allow us to validate if a particular single compound would affect flavor and/or appreciation, our experiments do show that this combination of compounds increases consumer appreciation.

It is important to stress that, while it represents an important step forward, our approach still has several major limitations. A key weakness of the GBR model architecture is that amongst co-correlating variables, the largest main effect is consistently preferred for model building. As a result, co-correlating variables often have artificially low importance scores, both for impurity and SHAP-based methods, like we observed in the comparison to the more randomized Extra Trees models. This implies that chemicals identified as key drivers of a specific sensory feature by GBR might not be the true causative compounds, but rather co-correlate with the actual causative chemical. For example, the high importance of ethyl acetate could be (partially) attributed to the total ester content, ethanol or ethyl hexanoate (rho=0.77, rho=0.72 and rho=0.68), while ethyl phenylacetate could hide the importance of prenyl isobutyrate and ethyl benzoate (rho=0.77 and rho=0.76). Expanding our GBR model to include beer style as a parameter did not yield additional power or insight. This is likely due to style-specific chemical signatures, such as iso-alpha acids and lactic acid, that implicitly convey style information to the original model, as well as the smaller sample size per style, limiting the power to uncover style-specific patterns. This can be partly attributed to the curse of dimensionality, where the high number of parameters results in the models mainly incorporating single parameter effects, rather than complex interactions such as style-dependent effects 67 . A larger number of samples may overcome some of these limitations and offer more insight into style-specific effects. On the other hand, beer style is not a rigid scientific classification, and beers within one style often differ a lot, which further complicates the analysis of style as a model factor.

Our study is limited to beers from Belgian breweries. Although these beers cover a large portion of the beer styles available globally, some beer styles and consumer patterns may be missing, while other features might be overrepresented. For example, many Belgian ales exhibit yeast-driven flavor profiles, which is reflected in the chemical drivers of appreciation discovered by this study. In future work, expanding the scope to include diverse markets and beer styles could lead to the identification of even more drivers of appreciation and better models for special niche products that were not present in our beer set.

In addition to inherent limitations of GBR models, there are also some limitations associated with studying food aroma. Even if our chemical analyses measured most of the known aroma compounds, the total number of flavor compounds in complex foods like beer is still larger than the subset we were able to measure in this study. For example, hop-derived thiols, that influence flavor at very low concentrations, are notoriously difficult to measure in a high-throughput experiment. Moreover, consumer perception remains subjective and prone to biases that are difficult to avoid. It is also important to stress that the models are still immature and that more extensive datasets will be crucial for developing more complete models in the future. Besides more samples and parameters, our dataset does not include any demographic information about the tasters. Including such data could lead to better models that grasp external factors like age and culture. Another limitation is that our set of beers consists of high-quality end-products and lacks beers that are unfit for sale, which limits the current model in accurately predicting products that are appreciated very badly. Finally, while models could be readily applied in quality control, their use in sensory science and product development is restrained by their inability to discern causal relationships. Given that the models cannot distinguish compounds that genuinely drive consumer perception from those that merely correlate, validation experiments are essential to identify true causative compounds.

Despite the inherent limitations, dissection of our models enabled us to pinpoint specific molecules as potential drivers of beer aroma and consumer appreciation, including compounds that were unexpected and would not have been identified using standard approaches. Important drivers of beer appreciation uncovered by our models include protein levels, ethyl acetate, ethyl phenyl acetate and lactic acid. Currently, many brewers already use lactic acid to acidify their brewing water and ensure optimal pH for enzymatic activity during the mashing process. Our results suggest that adding lactic acid can also improve beer appreciation, although its individual effect remains to be tested. Interestingly, ethanol appears to be unnecessary to improve beer appreciation, both for blond beer and alcohol-free beer. Given the growing consumer interest in alcohol-free beer, with a predicted annual market growth of >7% 84 , it is relevant for brewers to know what compounds can further increase consumer appreciation of these beers. Hence, our model may readily provide avenues to further improve the flavor and consumer appreciation of both alcoholic and non-alcoholic beers, which is generally considered one of the key challenges for future beer production.

Whereas we see a direct implementation of our results for the development of superior alcohol-free beverages and other food products, our study can also serve as a stepping stone for the development of novel alcohol-containing beverages. We want to echo the growing body of scientific evidence for the negative effects of alcohol consumption, both on the individual level by the mutagenic, teratogenic and carcinogenic effects of ethanol 85 , 86 , as well as the burden on society caused by alcohol abuse and addiction. We encourage the use of our results for the production of healthier, tastier products, including novel and improved beverages with lower alcohol contents. Furthermore, we strongly discourage the use of these technologies to improve the appreciation or addictive properties of harmful substances.

The present work demonstrates that despite some important remaining hurdles, combining the latest developments in chemical analyses, sensory analysis and modern machine learning methods offers exciting avenues for food chemistry and engineering. Soon, these tools may provide solutions in quality control and recipe development, as well as new approaches to sensory science and flavor research.

Beer selection

250 commercial Belgian beers were selected to cover the broad diversity of beer styles and corresponding diversity in chemical composition and aroma. See Supplementary Fig.  S1 .

Chemical dataset

Sample preparation.

Beers within their expiration date were purchased from commercial retailers. Samples were prepared in biological duplicates at room temperature, unless explicitly stated otherwise. Bottle pressure was measured with a manual pressure device (Steinfurth Mess-Systeme GmbH) and used to calculate CO 2 concentration. The beer was poured through two filter papers (Macherey-Nagel, 500713032 MN 713 ¼) to remove carbon dioxide and prevent spontaneous foaming. Samples were then prepared for measurements by targeted Headspace-Gas Chromatography-Flame Ionization Detector/Flame Photometric Detector (HS-GC-FID/FPD), Headspace-Solid Phase Microextraction-Gas Chromatography-Mass Spectrometry (HS-SPME-GC-MS), colorimetric analysis, enzymatic analysis, Near-Infrared (NIR) analysis, as described in the sections below. The mean values of biological duplicates are reported for each compound.

HS-GC-FID/FPD

HS-GC-FID/FPD (Shimadzu GC 2010 Plus) was used to measure higher alcohols, acetaldehyde, esters, 4-vinyl guaicol, and sulfur compounds. Each measurement comprised 5 ml of sample pipetted into a 20 ml glass vial containing 1.75 g NaCl (VWR, 27810.295). 100 µl of 2-heptanol (Sigma-Aldrich, H3003) (internal standard) solution in ethanol (Fisher Chemical, E/0650DF/C17) was added for a final concentration of 2.44 mg/L. Samples were flushed with nitrogen for 10 s, sealed with a silicone septum, stored at −80 °C and analyzed in batches of 20.

The GC was equipped with a DB-WAXetr column (length, 30 m; internal diameter, 0.32 mm; layer thickness, 0.50 µm; Agilent Technologies, Santa Clara, CA, USA) to the FID and an HP-5 column (length, 30 m; internal diameter, 0.25 mm; layer thickness, 0.25 µm; Agilent Technologies, Santa Clara, CA, USA) to the FPD. N 2 was used as the carrier gas. Samples were incubated for 20 min at 70 °C in the headspace autosampler (Flow rate, 35 cm/s; Injection volume, 1000 µL; Injection mode, split; Combi PAL autosampler, CTC analytics, Switzerland). The injector, FID and FPD temperatures were kept at 250 °C. The GC oven temperature was first held at 50 °C for 5 min and then allowed to rise to 80 °C at a rate of 5 °C/min, followed by a second ramp of 4 °C/min until 200 °C kept for 3 min and a final ramp of (4 °C/min) until 230 °C for 1 min. Results were analyzed with the GCSolution software version 2.4 (Shimadzu, Kyoto, Japan). The GC was calibrated with a 5% EtOH solution (VWR International) containing the volatiles under study (Supplementary Table  S7 ).

HS-SPME-GC-MS

HS-SPME-GC-MS (Shimadzu GCMS-QP-2010 Ultra) was used to measure additional volatile compounds, mainly comprising terpenoids and esters. Samples were analyzed by HS-SPME using a triphase DVB/Carboxen/PDMS 50/30 μm SPME fiber (Supelco Co., Bellefonte, PA, USA) followed by gas chromatography (Thermo Fisher Scientific Trace 1300 series, USA) coupled to a mass spectrometer (Thermo Fisher Scientific ISQ series MS) equipped with a TriPlus RSH autosampler. 5 ml of degassed beer sample was placed in 20 ml vials containing 1.75 g NaCl (VWR, 27810.295). 5 µl internal standard mix was added, containing 2-heptanol (1 g/L) (Sigma-Aldrich, H3003), 4-fluorobenzaldehyde (1 g/L) (Sigma-Aldrich, 128376), 2,3-hexanedione (1 g/L) (Sigma-Aldrich, 144169) and guaiacol (1 g/L) (Sigma-Aldrich, W253200) in ethanol (Fisher Chemical, E/0650DF/C17). Each sample was incubated at 60 °C in the autosampler oven with constant agitation. After 5 min equilibration, the SPME fiber was exposed to the sample headspace for 30 min. The compounds trapped on the fiber were thermally desorbed in the injection port of the chromatograph by heating the fiber for 15 min at 270 °C.

The GC-MS was equipped with a low polarity RXi-5Sil MS column (length, 20 m; internal diameter, 0.18 mm; layer thickness, 0.18 µm; Restek, Bellefonte, PA, USA). Injection was performed in splitless mode at 320 °C, a split flow of 9 ml/min, a purge flow of 5 ml/min and an open valve time of 3 min. To obtain a pulsed injection, a programmed gas flow was used whereby the helium gas flow was set at 2.7 mL/min for 0.1 min, followed by a decrease in flow of 20 ml/min to the normal 0.9 mL/min. The temperature was first held at 30 °C for 3 min and then allowed to rise to 80 °C at a rate of 7 °C/min, followed by a second ramp of 2 °C/min till 125 °C and a final ramp of 8 °C/min with a final temperature of 270 °C.

Mass acquisition range was 33 to 550 amu at a scan rate of 5 scans/s. Electron impact ionization energy was 70 eV. The interface and ion source were kept at 275 °C and 250 °C, respectively. A mix of linear n-alkanes (from C7 to C40, Supelco Co.) was injected into the GC-MS under identical conditions to serve as external retention index markers. Identification and quantification of the compounds were performed using an in-house developed R script as described in Goelen et al. and Reher et al. 87 , 88 (for package information, see Supplementary Table  S8 ). Briefly, chromatograms were analyzed using AMDIS (v2.71) 89 to separate overlapping peaks and obtain pure compound spectra. The NIST MS Search software (v2.0 g) in combination with the NIST2017, FFNSC3 and Adams4 libraries were used to manually identify the empirical spectra, taking into account the expected retention time. After background subtraction and correcting for retention time shifts between samples run on different days based on alkane ladders, compound elution profiles were extracted and integrated using a file with 284 target compounds of interest, which were either recovered in our identified AMDIS list of spectra or were known to occur in beer. Compound elution profiles were estimated for every peak in every chromatogram over a time-restricted window using weighted non-negative least square analysis after which peak areas were integrated 87 , 88 . Batch effect correction was performed by normalizing against the most stable internal standard compound, 4-fluorobenzaldehyde. Out of all 284 target compounds that were analyzed, 167 were visually judged to have reliable elution profiles and were used for final analysis.

Discrete photometric and enzymatic analysis

Discrete photometric and enzymatic analysis (Thermo Scientific TM Gallery TM Plus Beermaster Discrete Analyzer) was used to measure acetic acid, ammonia, beta-glucan, iso-alpha acids, color, sugars, glycerol, iron, pH, protein, and sulfite. 2 ml of sample volume was used for the analyses. Information regarding the reagents and standard solutions used for analyses and calibrations is included in Supplementary Table  S7 and Supplementary Table  S9 .

NIR analyses

NIR analysis (Anton Paar Alcolyzer Beer ME System) was used to measure ethanol. Measurements comprised 50 ml of sample, and a 10% EtOH solution was used for calibration.

Correlation calculations

Pairwise Spearman Rank correlations were calculated between all chemical properties.

Sensory dataset

Trained panel.

Our trained tasting panel consisted of volunteers who gave prior verbal informed consent. All compounds used for the validation experiment were of food-grade quality. The tasting sessions were approved by the Social and Societal Ethics Committee of the KU Leuven (G-2022-5677-R2(MAR)). All online reviewers agreed to the Terms and Conditions of the RateBeer website.

Sensory analysis was performed according to the American Society of Brewing Chemists (ASBC) Sensory Analysis Methods 90 . 30 volunteers were screened through a series of triangle tests. The sixteen most sensitive and consistent tasters were retained as taste panel members. The resulting panel was diverse in age [22–42, mean: 29], sex [56% male] and nationality [7 different countries]. The panel developed a consensus vocabulary to describe beer aroma, taste and mouthfeel. Panelists were trained to identify and score 50 different attributes, using a 7-point scale to rate attributes’ intensity. The scoring sheet is included as Supplementary Data  3 . Sensory assessments took place between 10–12 a.m. The beers were served in black-colored glasses. Per session, between 5 and 12 beers of the same style were tasted at 12 °C to 16 °C. Two reference beers were added to each set and indicated as ‘Reference 1 & 2’, allowing panel members to calibrate their ratings. Not all panelists were present at every tasting. Scores were scaled by standard deviation and mean-centered per taster. Values are represented as z-scores and clustered by Euclidean distance. Pairwise Spearman correlations were calculated between taste and aroma sensory attributes. Panel consistency was evaluated by repeating samples on different sessions and performing ANOVA to identify differences, using the ‘stats’ package (v4.2.2) in R (for package information, see Supplementary Table  S8 ).

Online reviews from a public database

The ‘scrapy’ package in Python (v3.6) (for package information, see Supplementary Table  S8 ). was used to collect 232,288 online reviews (mean=922, min=6, max=5343) from RateBeer, an online beer review database. Each review entry comprised 5 numerical scores (appearance, aroma, taste, palate and overall quality) and an optional review text. The total number of reviews per reviewer was collected separately. Numerical scores were scaled and centered per rater, and mean scores were calculated per beer.

For the review texts, the language was estimated using the packages ‘langdetect’ and ‘langid’ in Python. Reviews that were classified as English by both packages were kept. Reviewers with fewer than 100 entries overall were discarded. 181,025 reviews from >6000 reviewers from >40 countries remained. Text processing was done using the ‘nltk’ package in Python. Texts were corrected for slang and misspellings; proper nouns and rare words that are relevant to the beer context were specified and kept as-is (‘Chimay’,’Lambic’, etc.). A dictionary of semantically similar sensorial terms, for example ‘floral’ and ‘flower’, was created and collapsed together into one term. Words were stemmed and lemmatized to avoid identifying words such as ‘acid’ and ‘acidity’ as separate terms. Numbers and punctuation were removed.

Sentences from up to 50 randomly chosen reviews per beer were manually categorized according to the aspect of beer they describe (appearance, aroma, taste, palate, overall quality—not to be confused with the 5 numerical scores described above) or flagged as irrelevant if they contained no useful information. If a beer contained fewer than 50 reviews, all reviews were manually classified. This labeled data set was used to train a model that classified the rest of the sentences for all beers 91 . Sentences describing taste and aroma were extracted, and term frequency–inverse document frequency (TFIDF) was implemented to calculate enrichment scores for sensorial words per beer.

The sex of the tasting subject was not considered when building our sensory database. Instead, results from different panelists were averaged, both for our trained panel (56% male, 44% female) and the RateBeer reviews (70% male, 30% female for RateBeer as a whole).

Beer price collection and processing

Beer prices were collected from the following stores: Colruyt, Delhaize, Total Wine, BeerHawk, The Belgian Beer Shop, The Belgian Shop, and Beer of Belgium. Where applicable, prices were converted to Euros and normalized per liter. Spearman correlations were calculated between these prices and mean overall appreciation scores from RateBeer and the taste panel, respectively.

Pairwise Spearman Rank correlations were calculated between all sensory properties.

Machine learning models

Predictive modeling of sensory profiles from chemical data.

Regression models were constructed to predict (a) trained panel scores for beer flavors and quality from beer chemical profiles and (b) public reviews’ appreciation scores from beer chemical profiles. Z-scores were used to represent sensory attributes in both data sets. Chemical properties with log-normal distributions (Shapiro-Wilk test, p  <  0.05 ) were log-transformed. Missing chemical measurements (0.1% of all data) were replaced with mean values per attribute. Observations from 250 beers were randomly separated into a training set (70%, 175 beers) and a test set (30%, 75 beers), stratified per beer style. Chemical measurements (p = 231) were normalized based on the training set average and standard deviation. In total, three linear regression-based models: linear regression with first-order interaction terms (LR), lasso regression with first-order interaction terms (Lasso) and partial least squares regression (PLSR); five decision tree models, Adaboost regressor (ABR), Extra Trees (ET), Gradient Boosting regressor (GBR), Random Forest (RF) and XGBoost regressor (XGBR); one support vector machine model (SVR) and one artificial neural network model (ANN) were trained. The models were implemented using the ‘scikit-learn’ package (v1.2.2) and ‘xgboost’ package (v1.7.3) in Python (v3.9.16). Models were trained, and hyperparameters optimized, using five-fold cross-validated grid search with the coefficient of determination (R 2 ) as the evaluation metric. The ANN (scikit-learn’s MLPRegressor) was optimized using Bayesian Tree-Structured Parzen Estimator optimization with the ‘Optuna’ Python package (v3.2.0). Individual models were trained per attribute, and a multi-output model was trained on all attributes simultaneously.

Model dissection

GBR was found to outperform other methods, resulting in models with the highest average R 2 values in both trained panel and public review data sets. Impurity-based rankings of the most important predictors for each predicted sensorial trait were obtained using the ‘scikit-learn’ package. To observe the relationships between these chemical properties and their predicted targets, partial dependence plots (PDP) were constructed for the six most important predictors of consumer appreciation 74 , 75 .

The ‘SHAP’ package in Python (v0.41.0) was implemented to provide an alternative ranking of predictor importance and to visualize the predictors’ effects as a function of their concentration 68 .

Validation of causal chemical properties

To validate the effects of the most important model features on predicted sensory attributes, beers were spiked with the chemical compounds identified by the models and descriptive sensory analyses were carried out according to the American Society of Brewing Chemists (ASBC) protocol 90 .

Compound spiking was done 30 min before tasting. Compounds were spiked into fresh beer bottles, that were immediately resealed and inverted three times. Fresh bottles of beer were opened for the same duration, resealed, and inverted thrice, to serve as controls. Pairs of spiked samples and controls were served simultaneously, chilled and in dark glasses as outlined in the Trained panel section above. Tasters were instructed to select the glass with the higher flavor intensity for each attribute (directional difference test 92 ) and to select the glass they prefer.

The final concentration after spiking was equal to the within-style average, after normalizing by ethanol concentration. This was done to ensure balanced flavor profiles in the final spiked beer. The same methods were applied to improve a non-alcoholic beer. Compounds were the following: ethyl acetate (Merck KGaA, W241415), ethyl hexanoate (Merck KGaA, W243906), isoamyl acetate (Merck KGaA, W205508), phenethyl acetate (Merck KGaA, W285706), ethanol (96%, Colruyt), glycerol (Merck KGaA, W252506), lactic acid (Merck KGaA, 261106).

Significant differences in preference or perceived intensity were determined by performing the two-sided binomial test on each attribute.

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Data availability

The data that support the findings of this work are available in the Supplementary Data files and have been deposited to Zenodo under accession code 10653704 93 . The RateBeer scores data are under restricted access, they are not publicly available as they are property of RateBeer (ZX Ventures, USA). Access can be obtained from the authors upon reasonable request and with permission of RateBeer (ZX Ventures, USA).  Source data are provided with this paper.

Code availability

The code for training the machine learning models, analyzing the models, and generating the figures has been deposited to Zenodo under accession code 10653704 93 .

Tieman, D. et al. A chemical genetic roadmap to improved tomato flavor. Science 355 , 391–394 (2017).

Article   ADS   CAS   PubMed   Google Scholar  

Plutowska, B. & Wardencki, W. Application of gas chromatography–olfactometry (GC–O) in analysis and quality assessment of alcoholic beverages – A review. Food Chem. 107 , 449–463 (2008).

Article   CAS   Google Scholar  

Legin, A., Rudnitskaya, A., Seleznev, B. & Vlasov, Y. Electronic tongue for quality assessment of ethanol, vodka and eau-de-vie. Anal. Chim. Acta 534 , 129–135 (2005).

Loutfi, A., Coradeschi, S., Mani, G. K., Shankar, P. & Rayappan, J. B. B. Electronic noses for food quality: A review. J. Food Eng. 144 , 103–111 (2015).

Ahn, Y.-Y., Ahnert, S. E., Bagrow, J. P. & Barabási, A.-L. Flavor network and the principles of food pairing. Sci. Rep. 1 , 196 (2011).

Article   CAS   PubMed   PubMed Central   Google Scholar  

Bartoshuk, L. M. & Klee, H. J. Better fruits and vegetables through sensory analysis. Curr. Biol. 23 , R374–R378 (2013).

Article   CAS   PubMed   Google Scholar  

Piggott, J. R. Design questions in sensory and consumer science. Food Qual. Prefer. 3293 , 217–220 (1995).

Article   Google Scholar  

Kermit, M. & Lengard, V. Assessing the performance of a sensory panel-panellist monitoring and tracking. J. Chemom. 19 , 154–161 (2005).

Cook, D. J., Hollowood, T. A., Linforth, R. S. T. & Taylor, A. J. Correlating instrumental measurements of texture and flavour release with human perception. Int. J. Food Sci. Technol. 40 , 631–641 (2005).

Chinchanachokchai, S., Thontirawong, P. & Chinchanachokchai, P. A tale of two recommender systems: The moderating role of consumer expertise on artificial intelligence based product recommendations. J. Retail. Consum. Serv. 61 , 1–12 (2021).

Ross, C. F. Sensory science at the human-machine interface. Trends Food Sci. Technol. 20 , 63–72 (2009).

Chambers, E. IV & Koppel, K. Associations of volatile compounds with sensory aroma and flavor: The complex nature of flavor. Molecules 18 , 4887–4905 (2013).

Pinu, F. R. Metabolomics—The new frontier in food safety and quality research. Food Res. Int. 72 , 80–81 (2015).

Danezis, G. P., Tsagkaris, A. S., Brusic, V. & Georgiou, C. A. Food authentication: state of the art and prospects. Curr. Opin. Food Sci. 10 , 22–31 (2016).

Shepherd, G. M. Smell images and the flavour system in the human brain. Nature 444 , 316–321 (2006).

Meilgaard, M. C. Prediction of flavor differences between beers from their chemical composition. J. Agric. Food Chem. 30 , 1009–1017 (1982).

Xu, L. et al. Widespread receptor-driven modulation in peripheral olfactory coding. Science 368 , eaaz5390 (2020).

Kupferschmidt, K. Following the flavor. Science 340 , 808–809 (2013).

Billesbølle, C. B. et al. Structural basis of odorant recognition by a human odorant receptor. Nature 615 , 742–749 (2023).

Article   ADS   PubMed   PubMed Central   Google Scholar  

Smith, B. Perspective: Complexities of flavour. Nature 486 , S6–S6 (2012).

Pfister, P. et al. Odorant receptor inhibition is fundamental to odor encoding. Curr. Biol. 30 , 2574–2587 (2020).

Moskowitz, H. W., Kumaraiah, V., Sharma, K. N., Jacobs, H. L. & Sharma, S. D. Cross-cultural differences in simple taste preferences. Science 190 , 1217–1218 (1975).

Eriksson, N. et al. A genetic variant near olfactory receptor genes influences cilantro preference. Flavour 1 , 22 (2012).

Ferdenzi, C. et al. Variability of affective responses to odors: Culture, gender, and olfactory knowledge. Chem. Senses 38 , 175–186 (2013).

Article   PubMed   Google Scholar  

Lawless, H. T. & Heymann, H. Sensory evaluation of food: Principles and practices. (Springer, New York, NY). https://doi.org/10.1007/978-1-4419-6488-5 (2010).

Colantonio, V. et al. Metabolomic selection for enhanced fruit flavor. Proc. Natl. Acad. Sci. 119 , e2115865119 (2022).

Fritz, F., Preissner, R. & Banerjee, P. VirtualTaste: a web server for the prediction of organoleptic properties of chemical compounds. Nucleic Acids Res 49 , W679–W684 (2021).

Tuwani, R., Wadhwa, S. & Bagler, G. BitterSweet: Building machine learning models for predicting the bitter and sweet taste of small molecules. Sci. Rep. 9 , 1–13 (2019).

Dagan-Wiener, A. et al. Bitter or not? BitterPredict, a tool for predicting taste from chemical structure. Sci. Rep. 7 , 1–13 (2017).

Pallante, L. et al. Toward a general and interpretable umami taste predictor using a multi-objective machine learning approach. Sci. Rep. 12 , 1–11 (2022).

Malavolta, M. et al. A survey on computational taste predictors. Eur. Food Res. Technol. 248 , 2215–2235 (2022).

Lee, B. K. et al. A principal odor map unifies diverse tasks in olfactory perception. Science 381 , 999–1006 (2023).

Mayhew, E. J. et al. Transport features predict if a molecule is odorous. Proc. Natl. Acad. Sci. 119 , e2116576119 (2022).

Niu, Y. et al. Sensory evaluation of the synergism among ester odorants in light aroma-type liquor by odor threshold, aroma intensity and flash GC electronic nose. Food Res. Int. 113 , 102–114 (2018).

Yu, P., Low, M. Y. & Zhou, W. Design of experiments and regression modelling in food flavour and sensory analysis: A review. Trends Food Sci. Technol. 71 , 202–215 (2018).

Oladokun, O. et al. The impact of hop bitter acid and polyphenol profiles on the perceived bitterness of beer. Food Chem. 205 , 212–220 (2016).

Linforth, R., Cabannes, M., Hewson, L., Yang, N. & Taylor, A. Effect of fat content on flavor delivery during consumption: An in vivo model. J. Agric. Food Chem. 58 , 6905–6911 (2010).

Guo, S., Na Jom, K. & Ge, Y. Influence of roasting condition on flavor profile of sunflower seeds: A flavoromics approach. Sci. Rep. 9 , 11295 (2019).

Ren, Q. et al. The changes of microbial community and flavor compound in the fermentation process of Chinese rice wine using Fagopyrum tataricum grain as feedstock. Sci. Rep. 9 , 3365 (2019).

Hastie, T., Friedman, J. & Tibshirani, R. The Elements of Statistical Learning. (Springer, New York, NY). https://doi.org/10.1007/978-0-387-21606-5 (2001).

Dietz, C., Cook, D., Huismann, M., Wilson, C. & Ford, R. The multisensory perception of hop essential oil: a review. J. Inst. Brew. 126 , 320–342 (2020).

CAS   Google Scholar  

Roncoroni, Miguel & Verstrepen, Kevin Joan. Belgian Beer: Tested and Tasted. (Lannoo, 2018).

Meilgaard, M. Flavor chemistry of beer: Part II: Flavor and threshold of 239 aroma volatiles. in (1975).

Bokulich, N. A. & Bamforth, C. W. The microbiology of malting and brewing. Microbiol. Mol. Biol. Rev. MMBR 77 , 157–172 (2013).

Dzialo, M. C., Park, R., Steensels, J., Lievens, B. & Verstrepen, K. J. Physiology, ecology and industrial applications of aroma formation in yeast. FEMS Microbiol. Rev. 41 , S95–S128 (2017).

Article   PubMed   PubMed Central   Google Scholar  

Datta, A. et al. Computer-aided food engineering. Nat. Food 3 , 894–904 (2022).

American Society of Brewing Chemists. Beer Methods. (American Society of Brewing Chemists, St. Paul, MN, U.S.A.).

Olaniran, A. O., Hiralal, L., Mokoena, M. P. & Pillay, B. Flavour-active volatile compounds in beer: production, regulation and control. J. Inst. Brew. 123 , 13–23 (2017).

Verstrepen, K. J. et al. Flavor-active esters: Adding fruitiness to beer. J. Biosci. Bioeng. 96 , 110–118 (2003).

Meilgaard, M. C. Flavour chemistry of beer. part I: flavour interaction between principal volatiles. Master Brew. Assoc. Am. Tech. Q 12 , 107–117 (1975).

Briggs, D. E., Boulton, C. A., Brookes, P. A. & Stevens, R. Brewing 227–254. (Woodhead Publishing). https://doi.org/10.1533/9781855739062.227 (2004).

Bossaert, S., Crauwels, S., De Rouck, G. & Lievens, B. The power of sour - A review: Old traditions, new opportunities. BrewingScience 72 , 78–88 (2019).

Google Scholar  

Verstrepen, K. J. et al. Flavor active esters: Adding fruitiness to beer. J. Biosci. Bioeng. 96 , 110–118 (2003).

Snauwaert, I. et al. Microbial diversity and metabolite composition of Belgian red-brown acidic ales. Int. J. Food Microbiol. 221 , 1–11 (2016).

Spitaels, F. et al. The microbial diversity of traditional spontaneously fermented lambic beer. PLoS ONE 9 , e95384 (2014).

Blanco, C. A., Andrés-Iglesias, C. & Montero, O. Low-alcohol Beers: Flavor Compounds, Defects, and Improvement Strategies. Crit. Rev. Food Sci. Nutr. 56 , 1379–1388 (2016).

Jackowski, M. & Trusek, A. Non-Alcohol. beer Prod. – Overv. 20 , 32–38 (2018).

Takoi, K. et al. The contribution of geraniol metabolism to the citrus flavour of beer: Synergy of geraniol and β-citronellol under coexistence with excess linalool. J. Inst. Brew. 116 , 251–260 (2010).

Kroeze, J. H. & Bartoshuk, L. M. Bitterness suppression as revealed by split-tongue taste stimulation in humans. Physiol. Behav. 35 , 779–783 (1985).

Mennella, J. A. et al. A spoonful of sugar helps the medicine go down”: Bitter masking bysucrose among children and adults. Chem. Senses 40 , 17–25 (2015).

Wietstock, P., Kunz, T., Perreira, F. & Methner, F.-J. Metal chelation behavior of hop acids in buffered model systems. BrewingScience 69 , 56–63 (2016).

Sancho, D., Blanco, C. A., Caballero, I. & Pascual, A. Free iron in pale, dark and alcohol-free commercial lager beers. J. Sci. Food Agric. 91 , 1142–1147 (2011).

Rodrigues, H. & Parr, W. V. Contribution of cross-cultural studies to understanding wine appreciation: A review. Food Res. Int. 115 , 251–258 (2019).

Korneva, E. & Blockeel, H. Towards better evaluation of multi-target regression models. in ECML PKDD 2020 Workshops (eds. Koprinska, I. et al.) 353–362 (Springer International Publishing, Cham, 2020). https://doi.org/10.1007/978-3-030-65965-3_23 .

Gastón Ares. Mathematical and Statistical Methods in Food Science and Technology. (Wiley, 2013).

Grinsztajn, L., Oyallon, E. & Varoquaux, G. Why do tree-based models still outperform deep learning on tabular data? Preprint at http://arxiv.org/abs/2207.08815 (2022).

Gries, S. T. Statistics for Linguistics with R: A Practical Introduction. in Statistics for Linguistics with R (De Gruyter Mouton, 2021). https://doi.org/10.1515/9783110718256 .

Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2 , 56–67 (2020).

Ickes, C. M. & Cadwallader, K. R. Effects of ethanol on flavor perception in alcoholic beverages. Chemosens. Percept. 10 , 119–134 (2017).

Kato, M. et al. Influence of high molecular weight polypeptides on the mouthfeel of commercial beer. J. Inst. Brew. 127 , 27–40 (2021).

Wauters, R. et al. Novel Saccharomyces cerevisiae variants slow down the accumulation of staling aldehydes and improve beer shelf-life. Food Chem. 398 , 1–11 (2023).

Li, H., Jia, S. & Zhang, W. Rapid determination of low-level sulfur compounds in beer by headspace gas chromatography with a pulsed flame photometric detector. J. Am. Soc. Brew. Chem. 66 , 188–191 (2008).

Dercksen, A., Laurens, J., Torline, P., Axcell, B. C. & Rohwer, E. Quantitative analysis of volatile sulfur compounds in beer using a membrane extraction interface. J. Am. Soc. Brew. Chem. 54 , 228–233 (1996).

Molnar, C. Interpretable Machine Learning: A Guide for Making Black-Box Models Interpretable. (2020).

Zhao, Q. & Hastie, T. Causal interpretations of black-box models. J. Bus. Econ. Stat. Publ. Am. Stat. Assoc. 39 , 272–281 (2019).

Article   MathSciNet   Google Scholar  

Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning. (Springer, 2019).

Labrado, D. et al. Identification by NMR of key compounds present in beer distillates and residual phases after dealcoholization by vacuum distillation. J. Sci. Food Agric. 100 , 3971–3978 (2020).

Lusk, L. T., Kay, S. B., Porubcan, A. & Ryder, D. S. Key olfactory cues for beer oxidation. J. Am. Soc. Brew. Chem. 70 , 257–261 (2012).

Gonzalez Viejo, C., Torrico, D. D., Dunshea, F. R. & Fuentes, S. Development of artificial neural network models to assess beer acceptability based on sensory properties using a robotic pourer: A comparative model approach to achieve an artificial intelligence system. Beverages 5 , 33 (2019).

Gonzalez Viejo, C., Fuentes, S., Torrico, D. D., Godbole, A. & Dunshea, F. R. Chemical characterization of aromas in beer and their effect on consumers liking. Food Chem. 293 , 479–485 (2019).

Gilbert, J. L. et al. Identifying breeding priorities for blueberry flavor using biochemical, sensory, and genotype by environment analyses. PLOS ONE 10 , 1–21 (2015).

Goulet, C. et al. Role of an esterase in flavor volatile variation within the tomato clade. Proc. Natl. Acad. Sci. 109 , 19009–19014 (2012).

Article   ADS   CAS   PubMed   PubMed Central   Google Scholar  

Borisov, V. et al. Deep Neural Networks and Tabular Data: A Survey. IEEE Trans. Neural Netw. Learn. Syst. 1–21 https://doi.org/10.1109/TNNLS.2022.3229161 (2022).

Statista. Statista Consumer Market Outlook: Beer - Worldwide.

Seitz, H. K. & Stickel, F. Molecular mechanisms of alcoholmediated carcinogenesis. Nat. Rev. Cancer 7 , 599–612 (2007).

Voordeckers, K. et al. Ethanol exposure increases mutation rate through error-prone polymerases. Nat. Commun. 11 , 3664 (2020).

Goelen, T. et al. Bacterial phylogeny predicts volatile organic compound composition and olfactory response of an aphid parasitoid. Oikos 129 , 1415–1428 (2020).

Article   ADS   Google Scholar  

Reher, T. et al. Evaluation of hop (Humulus lupulus) as a repellent for the management of Drosophila suzukii. Crop Prot. 124 , 104839 (2019).

Stein, S. E. An integrated method for spectrum extraction and compound identification from gas chromatography/mass spectrometry data. J. Am. Soc. Mass Spectrom. 10 , 770–781 (1999).

American Society of Brewing Chemists. Sensory Analysis Methods. (American Society of Brewing Chemists, St. Paul, MN, U.S.A., 1992).

McAuley, J., Leskovec, J. & Jurafsky, D. Learning Attitudes and Attributes from Multi-Aspect Reviews. Preprint at https://doi.org/10.48550/arXiv.1210.3926 (2012).

Meilgaard, M. C., Carr, B. T. & Carr, B. T. Sensory Evaluation Techniques. (CRC Press, Boca Raton). https://doi.org/10.1201/b16452 (2014).

Schreurs, M. et al. Data from: Predicting and improving complex beer flavor through machine learning. Zenodo https://doi.org/10.5281/zenodo.10653704 (2024).

Download references

Acknowledgements

We thank all lab members for their discussions and thank all tasting panel members for their contributions. Special thanks go out to Dr. Karin Voordeckers for her tremendous help in proofreading and improving the manuscript. M.S. was supported by a Baillet-Latour fellowship, L.C. acknowledges financial support from KU Leuven (C16/17/006), F.A.T. was supported by a PhD fellowship from FWO (1S08821N). Research in the lab of K.J.V. is supported by KU Leuven, FWO, VIB, VLAIO and the Brewing Science Serves Health Fund. Research in the lab of T.W. is supported by FWO (G.0A51.15) and KU Leuven (C16/17/006).

Author information

These authors contributed equally: Michiel Schreurs, Supinya Piampongsant, Miguel Roncoroni.

Authors and Affiliations

VIB—KU Leuven Center for Microbiology, Gaston Geenslaan 1, B-3001, Leuven, Belgium

Michiel Schreurs, Supinya Piampongsant, Miguel Roncoroni, Lloyd Cool, Beatriz Herrera-Malaver, Florian A. Theßeling & Kevin J. Verstrepen

CMPG Laboratory of Genetics and Genomics, KU Leuven, Gaston Geenslaan 1, B-3001, Leuven, Belgium

Leuven Institute for Beer Research (LIBR), Gaston Geenslaan 1, B-3001, Leuven, Belgium

Laboratory of Socioecology and Social Evolution, KU Leuven, Naamsestraat 59, B-3000, Leuven, Belgium

Lloyd Cool, Christophe Vanderaa & Tom Wenseleers

VIB Bioinformatics Core, VIB, Rijvisschestraat 120, B-9052, Ghent, Belgium

Łukasz Kreft & Alexander Botzki

AB InBev SA/NV, Brouwerijplein 1, B-3000, Leuven, Belgium

Philippe Malcorps & Luk Daenen

You can also search for this author in PubMed   Google Scholar

Contributions

S.P., M.S. and K.J.V. conceived the experiments. S.P., M.S. and K.J.V. designed the experiments. S.P., M.S., M.R., B.H. and F.A.T. performed the experiments. S.P., M.S., L.C., C.V., L.K., A.B., P.M., L.D., T.W. and K.J.V. contributed analysis ideas. S.P., M.S., L.C., C.V., T.W. and K.J.V. analyzed the data. All authors contributed to writing the manuscript.

Corresponding author

Correspondence to Kevin J. Verstrepen .

Ethics declarations

Competing interests.

K.J.V. is affiliated with bar.on. The other authors declare no competing interests.

Peer review

Peer review information.

Nature Communications thanks Florian Bauer, Andrew John Macintosh and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary information, peer review file, description of additional supplementary files, supplementary data 1, supplementary data 2, supplementary data 3, supplementary data 4, supplementary data 5, supplementary data 6, supplementary data 7, reporting summary, source data, source data, rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article.

Schreurs, M., Piampongsant, S., Roncoroni, M. et al. Predicting and improving complex beer flavor through machine learning. Nat Commun 15 , 2368 (2024). https://doi.org/10.1038/s41467-024-46346-0

Download citation

Received : 30 October 2023

Accepted : 21 February 2024

Published : 26 March 2024

DOI : https://doi.org/10.1038/s41467-024-46346-0

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

By submitting a comment you agree to abide by our Terms and Community Guidelines . If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Quick links

  • Explore articles by subject
  • Guide to authors
  • Editorial policies

Sign up for the Nature Briefing: Translational Research newsletter — top stories in biotechnology, drug discovery and pharma.

quantitative research report study

About Stanford GSB

  • The Leadership
  • Dean’s Updates
  • School News & History
  • Commencement
  • Business, Government & Society
  • Centers & Institutes
  • Center for Entrepreneurial Studies
  • Center for Social Innovation
  • Stanford Seed

About the Experience

  • Learning at Stanford GSB
  • Experiential Learning
  • Guest Speakers
  • Entrepreneurship
  • Social Innovation
  • Communication
  • Life at Stanford GSB
  • Collaborative Environment
  • Activities & Organizations
  • Student Services
  • Housing Options
  • International Students

Full-Time Degree Programs

  • Why Stanford MBA
  • Academic Experience
  • Financial Aid
  • Why Stanford MSx
  • Research Fellows Program
  • See All Programs

Non-Degree & Certificate Programs

  • Executive Education
  • Stanford Executive Program
  • Programs for Organizations
  • The Difference
  • Online Programs
  • Stanford LEAD
  • Stanford Innovation and Entrepreneurship Certificate
  • Seed Transformation Program
  • Aspire Program
  • Seed Spark Program
  • Faculty Profiles
  • Academic Areas
  • Awards & Honors
  • Conferences

Faculty Research

  • Publications
  • Working Papers
  • Case Studies

Research Hub

  • Research Labs & Initiatives
  • Business Library
  • Data, Analytics & Research Computing
  • Behavioral Lab

Research Labs

  • Cities, Housing & Society Lab
  • Golub Capital Social Impact Lab

Research Initiatives

  • Corporate Governance Research Initiative
  • Corporations and Society Initiative
  • Policy and Innovation Initiative
  • Rapid Decarbonization Initiative
  • Stanford Latino Entrepreneurship Initiative
  • Value Chain Innovation Initiative
  • Venture Capital Initiative
  • Career & Success
  • Climate & Sustainability
  • Corporate Governance
  • Culture & Society
  • Finance & Investing
  • Government & Politics
  • Leadership & Management
  • Markets & Trade
  • Operations & Logistics
  • Opportunity & Access
  • Organizational Behavior
  • Political Economy
  • Social Impact
  • Technology & AI
  • Opinion & Analysis
  • Email Newsletter

Welcome, Alumni

  • Communities
  • Digital Communities & Tools
  • Regional Chapters
  • Women’s Programs
  • Identity Chapters
  • Find Your Reunion
  • Career Resources
  • Job Search Resources
  • Career & Life Transitions
  • Programs & Services
  • Career Video Library
  • Alumni Education
  • Research Resources
  • Volunteering
  • Alumni News
  • Class Notes
  • Alumni Voices
  • Contact Alumni Relations
  • Upcoming Events

Admission Events & Information Sessions

  • MBA Program
  • MSx Program
  • PhD Program
  • Alumni Events
  • All Other Events
  • Operations, Information & Technology
  • Classical Liberalism
  • The Eddie Lunch
  • Accounting Summer Camp
  • Videos, Code & Data
  • California Econometrics Conference
  • California Quantitative Marketing PhD Conference
  • California School Conference
  • China India Insights Conference
  • Homo economicus, Evolving
  • Political Economics (2023–24)
  • Scaling Geologic Storage of CO2 (2023–24)
  • A Resilient Pacific: Building Connections, Envisioning Solutions
  • Adaptation and Innovation
  • Changing Climate
  • Civil Society
  • Climate Impact Summit
  • Climate Science
  • Corporate Carbon Disclosures
  • Earth’s Seafloor
  • Environmental Justice
  • Operations and Information Technology
  • Organizations
  • Sustainability Reporting and Control
  • Taking the Pulse of the Planet
  • Urban Infrastructure
  • Watershed Restoration
  • Junior Faculty Workshop on Financial Regulation and Banking
  • Ken Singleton Celebration
  • Quantitative Marketing PhD Alumni Conference
  • Presentations
  • Theory and Inference in Accounting Research
  • Stanford Closer Look Series
  • Quick Guides
  • Core Concepts
  • Journal Articles
  • Glossary of Terms
  • Faculty & Staff
  • Researchers & Students
  • Research Approach
  • Charitable Giving
  • Financial Health
  • Government Services
  • Workers & Careers
  • Short Course
  • Adaptive & Iterative Experimentation
  • Incentive Design
  • Social Sciences & Behavioral Nudges
  • Bandit Experiment Application
  • Conferences & Events
  • Get Involved
  • Reading Materials
  • Teaching & Curriculum
  • Energy Entrepreneurship
  • Faculty & Affiliates
  • SOLE Report
  • Responsible Supply Chains
  • Current Study Usage
  • Pre-Registration Information
  • Participate in a Study

2023 State of Latino Entrepreneurship

This is the ninth annual State of Latino Entrepreneurship report collecting data from Latino/a-owned businesses to provide critical insights into the fastest-growing segment of the US business population. In this edition, we reached out nationally to over 10,000 Latino/a and non-Hispanic White-owned employer businesses to offer a comparative perspective. Each of the business owners surveyed, encompassing 5,102 Latino/a-owned and 5,024 White-owned businesses, generates at least $10,000 in annual revenues and employs at least one person aside from the owner.

This report sheds light on the intricacies of Latino/a-owned businesses by examining three prominent subgroups: women, immigrants and tech-centric firms. In doing so, our research acknowledges and addresses the multifaceted experiences of Latino/a entrepreneurs. This perspective helps unveil systemic challenges and provides a more accurate and comprehensive understanding of the Latino/a entrepreneurial ecosystem.

This approach provides organizations that support businesses (e.g., chambers of commerce, economic development associations, etc.), think tanks, governmental policymakers, and corporations, with insights into the specific strengths and needs of diverse Latino/a-owned business communities across the country.

  • See the Current DEI Report
  • Supporting Data
  • Research & Insights
  • Share Your Thoughts
  • Search Fund Primer
  • Affiliated Faculty
  • Faculty Advisors
  • Louis W. Foster Resource Center
  • Defining Social Innovation
  • Impact Compass
  • Global Health Innovation Insights
  • Faculty Affiliates
  • Student Awards & Certificates
  • Changemakers
  • Dean Jonathan Levin
  • Dean Garth Saloner
  • Dean Robert Joss
  • Dean Michael Spence
  • Dean Robert Jaedicke
  • Dean Rene McPherson
  • Dean Arjay Miller
  • Dean Ernest Arbuckle
  • Dean Jacob Hugh Jackson
  • Dean Willard Hotchkiss
  • Faculty in Memoriam
  • Stanford GSB Firsts
  • Certificate & Award Recipients
  • Dean’s Remarks
  • Keynote Address
  • Teaching Approach
  • Analysis and Measurement of Impact
  • The Corporate Entrepreneur: Startup in a Grown-Up Enterprise
  • Data-Driven Impact
  • Designing Experiments for Impact
  • Digital Business Transformation
  • The Founder’s Right Hand
  • Marketing for Measurable Change
  • Product Management
  • Public Policy Lab: Financial Challenges Facing US Cities
  • Public Policy Lab: Homelessness in California
  • Lab Features
  • Curricular Integration
  • View From The Top
  • Formation of New Ventures
  • Managing Growing Enterprises
  • Startup Garage
  • Explore Beyond the Classroom
  • Stanford Venture Studio
  • Summer Program
  • Workshops & Events
  • The Five Lenses of Entrepreneurship
  • Leadership Labs
  • Executive Challenge
  • Arbuckle Leadership Fellows Program
  • Selection Process
  • Training Schedule
  • Time Commitment
  • Learning Expectations
  • Post-Training Opportunities
  • Who Should Apply
  • Introductory T-Groups
  • Leadership for Society Program
  • Certificate
  • 2023 Awardees
  • 2022 Awardees
  • 2021 Awardees
  • 2020 Awardees
  • 2019 Awardees
  • 2018 Awardees
  • Social Management Immersion Fund
  • Stanford Impact Founder Fellowships and Prizes
  • Stanford Impact Leader Prizes
  • Social Entrepreneurship
  • Stanford GSB Impact Fund
  • Economic Development
  • Energy & Environment
  • Stanford GSB Residences
  • Environmental Leadership
  • Stanford GSB Artwork
  • A Closer Look
  • California & the Bay Area
  • Voices of Stanford GSB
  • Business & Beneficial Technology
  • Business & Sustainability
  • Business & Free Markets
  • Business, Government, and Society Forum
  • Second Year
  • Global Experiences
  • JD/MBA Joint Degree
  • MA Education/MBA Joint Degree
  • MD/MBA Dual Degree
  • MPP/MBA Joint Degree
  • MS Computer Science/MBA Joint Degree
  • MS Electrical Engineering/MBA Joint Degree
  • MS Environment and Resources (E-IPER)/MBA Joint Degree
  • Academic Calendar
  • Clubs & Activities
  • LGBTQ+ Students
  • Military Veterans
  • Minorities & People of Color
  • Partners & Families
  • Students with Disabilities
  • Student Support
  • Residential Life
  • Student Voices
  • MBA Alumni Voices
  • A Week in the Life
  • Career Support
  • Employment Outcomes
  • Cost of Attendance
  • Knight-Hennessy Scholars Program
  • Yellow Ribbon Program
  • BOLD Fellows Fund
  • Application Process
  • Loan Forgiveness
  • Contact the Financial Aid Office
  • Evaluation Criteria
  • GMAT & GRE
  • English Language Proficiency
  • Personal Information, Activities & Awards
  • Professional Experience
  • Letters of Recommendation
  • Optional Short Answer Questions
  • Application Fee
  • Reapplication
  • Deferred Enrollment
  • Joint & Dual Degrees
  • Entering Class Profile
  • Event Schedule
  • Ambassadors
  • New & Noteworthy
  • Ask a Question
  • See Why Stanford MSx
  • Is MSx Right for You?
  • MSx Stories
  • Leadership Development
  • Career Advancement
  • Career Change
  • How You Will Learn
  • Admission Events
  • Personal Information
  • Information for Recommenders
  • GMAT, GRE & EA
  • English Proficiency Tests
  • After You’re Admitted
  • Daycare, Schools & Camps
  • U.S. Citizens and Permanent Residents
  • Requirements
  • Requirements: Behavioral
  • Requirements: Quantitative
  • Requirements: Macro
  • Requirements: Micro
  • Annual Evaluations
  • Field Examination
  • Research Activities
  • Research Papers
  • Dissertation
  • Oral Examination
  • Current Students
  • Education & CV
  • International Applicants
  • Statement of Purpose
  • Reapplicants
  • Application Fee Waiver
  • Deadline & Decisions
  • Job Market Candidates
  • Academic Placements
  • Stay in Touch
  • Faculty Mentors
  • Current Fellows
  • Standard Track
  • Fellowship & Benefits
  • Group Enrollment
  • Program Formats
  • Developing a Program
  • Diversity & Inclusion
  • Strategic Transformation
  • Program Experience
  • Contact Client Services
  • Campus Experience
  • Live Online Experience
  • Silicon Valley & Bay Area
  • Digital Credentials
  • Faculty Spotlights
  • Participant Spotlights
  • Eligibility
  • International Participants
  • Stanford Ignite
  • Frequently Asked Questions
  • Founding Donors
  • Location Information
  • Participant Profile
  • Network Membership
  • Program Impact
  • Collaborators
  • Entrepreneur Profiles
  • Company Spotlights
  • Seed Transformation Network
  • Responsibilities
  • Current Coaches
  • How to Apply
  • Meet the Consultants
  • Meet the Interns
  • Intern Profiles
  • Collaborate
  • Research Library
  • News & Insights
  • Program Contacts
  • Databases & Datasets
  • Research Guides
  • Consultations
  • Research Workshops
  • Career Research
  • Research Data Services
  • Course Reserves
  • Course Research Guides
  • Material Loan Periods
  • Fines & Other Charges
  • Document Delivery
  • Interlibrary Loan
  • Equipment Checkout
  • Print & Scan
  • MBA & MSx Students
  • PhD Students
  • Other Stanford Students
  • Faculty Assistants
  • Research Assistants
  • Stanford GSB Alumni
  • Telling Our Story
  • Staff Directory
  • Site Registration
  • Alumni Directory
  • Alumni Email
  • Privacy Settings & My Profile
  • Success Stories
  • The Story of Circles
  • Support Women’s Circles
  • Stanford Women on Boards Initiative
  • Alumnae Spotlights
  • Insights & Research
  • Industry & Professional
  • Entrepreneurial Commitment Group
  • Recent Alumni
  • Half-Century Club
  • Fall Reunions
  • Spring Reunions
  • MBA 25th Reunion
  • Half-Century Club Reunion
  • Faculty Lectures
  • Ernest C. Arbuckle Award
  • Alison Elliott Exceptional Achievement Award
  • ENCORE Award
  • Excellence in Leadership Award
  • John W. Gardner Volunteer Leadership Award
  • Robert K. Jaedicke Faculty Award
  • Jack McDonald Military Service Appreciation Award
  • Jerry I. Porras Latino Leadership Award
  • Tapestry Award
  • Student & Alumni Events
  • Executive Recruiters
  • Interviewing
  • Land the Perfect Job with LinkedIn
  • Negotiating
  • Elevator Pitch
  • Email Best Practices
  • Resumes & Cover Letters
  • Self-Assessment
  • Whitney Birdwell Ball
  • Margaret Brooks
  • Bryn Panee Burkhart
  • Margaret Chan
  • Ricki Frankel
  • Peter Gandolfo
  • Cindy W. Greig
  • Natalie Guillen
  • Carly Janson
  • Sloan Klein
  • Sherri Appel Lassila
  • Stuart Meyer
  • Tanisha Parrish
  • Virginia Roberson
  • Philippe Taieb
  • Michael Takagawa
  • Terra Winston
  • Johanna Wise
  • Debbie Wolter
  • Rebecca Zucker
  • Complimentary Coaching
  • Changing Careers
  • Work-Life Integration
  • Career Breaks
  • Flexible Work
  • Encore Careers
  • D&B Hoovers
  • Data Axle (ReferenceUSA)
  • EBSCO Business Source
  • Firsthand (Vault)
  • Global Newsstream
  • Market Share Reporter
  • ProQuest One Business
  • Student Clubs
  • Entrepreneurial Students
  • Stanford GSB Trust
  • Alumni Community
  • How to Volunteer
  • Springboard Sessions
  • Consulting Projects
  • 2020 – 2029
  • 2010 – 2019
  • 2000 – 2009
  • 1990 – 1999
  • 1980 – 1989
  • 1970 – 1979
  • 1960 – 1969
  • 1950 – 1959
  • 1940 – 1949
  • Service Areas
  • ACT History
  • ACT Awards Celebration
  • ACT Governance Structure
  • Building Leadership for ACT
  • Individual Leadership Positions
  • Leadership Role Overview
  • Purpose of the ACT Management Board
  • Contact ACT
  • Business & Nonprofit Communities
  • Reunion Volunteers
  • Ways to Give
  • Fiscal Year Report
  • Business School Fund Leadership Council
  • Planned Giving Options
  • Planned Giving Benefits
  • Planned Gifts and Reunions
  • Legacy Partners
  • Strategic Initiatives
  • Giving News & Stories
  • Giving Deadlines
  • Development Staff
  • Submit Class Notes
  • Class Secretaries
  • Board of Directors
  • Health Care
  • Sustainability
  • Class Takeaways
  • All Else Equal: Making Better Decisions
  • If/Then: Business, Leadership, Society
  • Grit & Growth
  • Think Fast, Talk Smart
  • Spring 2022
  • Spring 2021
  • Autumn 2020
  • Summer 2020
  • Winter 2020
  • In the Media
  • For Journalists
  • DCI Fellows
  • Other Auditors
  • Academic Calendar & Deadlines
  • Course Materials
  • Entrepreneurial Resources
  • Campus Drive Grove
  • Campus Drive Lawn
  • CEMEX Auditorium
  • King Community Court
  • Seawell Family Boardroom
  • Stanford GSB Bowl
  • Stanford Investors Common
  • Town Square
  • Vidalakis Courtyard
  • Vidalakis Dining Hall
  • Catering Services
  • Policies & Guidelines
  • Reservations
  • Contact Faculty Recruiting
  • Lecturer Positions
  • Postdoctoral Positions
  • Accommodations
  • CMC-Managed Interviews
  • Recruiter-Managed Interviews
  • Virtual Interviews
  • Campus & Virtual
  • Search for Candidates
  • Think Globally
  • Recruiting Calendar
  • Recruiting Policies
  • Full-Time Employment
  • Summer Employment
  • Entrepreneurial Summer Program
  • Global Management Immersion Experience
  • Social-Purpose Summer Internships
  • Process Overview
  • Project Types
  • Client Eligibility Criteria
  • Client Screening
  • ACT Leadership
  • Social Innovation & Nonprofit Management Resources
  • Develop Your Organization’s Talent
  • Centers & Initiatives
  • Student Fellowships
  • Share full article

Advertisement

Supported by

More Studies by Columbia Cancer Researchers Are Retracted

The studies, pulled because of copied data, illustrate the sluggishness of scientific publishers to address serious errors, experts said.

quantitative research report study

By Benjamin Mueller

Scientists in a prominent cancer lab at Columbia University have now had four studies retracted and a stern note added to a fifth accusing it of “severe abuse of the scientific publishing system,” the latest fallout from research misconduct allegations recently leveled against several leading cancer scientists.

A scientific sleuth in Britain last year uncovered discrepancies in data published by the Columbia lab, including the reuse of photos and other images across different papers. The New York Times reported last month that a medical journal in 2022 had quietly taken down a stomach cancer study by the researchers after an internal inquiry by the journal found ethics violations.

Despite that study’s removal, the researchers — Dr. Sam Yoon, chief of a cancer surgery division at Columbia University’s medical center, and Changhwan Yoon, a more junior biologist there — continued publishing studies with suspicious data. Since 2008, the two scientists have collaborated with other researchers on 26 articles that the sleuth, Sholto David, publicly flagged for misrepresenting experiments’ results.

One of those articles was retracted last month after The Times asked publishers about the allegations. In recent weeks, medical journals have retracted three additional studies, which described new strategies for treating cancers of the stomach, head and neck. Other labs had cited the articles in roughly 90 papers.

A major scientific publisher also appended a blunt note to the article that it had originally taken down without explanation in 2022. “This reuse (and in part, misrepresentation) of data without appropriate attribution represents a severe abuse of the scientific publishing system,” it said .

Still, those measures addressed only a small fraction of the lab’s suspect papers. Experts said the episode illustrated not only the extent of unreliable research by top labs, but also the tendency of scientific publishers to respond slowly, if at all, to significant problems once they are detected. As a result, other labs keep relying on questionable work as they pour federal research money into studies, allowing errors to accumulate in the scientific record.

“For every one paper that is retracted, there are probably 10 that should be,” said Dr. Ivan Oransky, co-founder of Retraction Watch, which keeps a database of 47,000-plus retracted studies. “Journals are not particularly interested in correcting the record.”

Columbia’s medical center declined to comment on allegations facing Dr. Yoon’s lab. It said the two scientists remained at Columbia and the hospital “is fully committed to upholding the highest standards of ethics and to rigorously maintaining the integrity of our research.”

The lab’s web page was recently taken offline. Columbia declined to say why. Neither Dr. Yoon nor Changhwan Yoon could be reached for comment. (They are not related.)

Memorial Sloan Kettering Cancer Center, where the scientists worked when much of the research was done, is investigating their work.

The Columbia scientists’ retractions come amid growing attention to the suspicious data that undergirds some medical research. Since late February, medical journals have retracted seven papers by scientists at Harvard’s Dana-Farber Cancer Institute . That followed investigations into data problems publicized by Dr. David , an independent molecular biologist who looks for irregularities in published images of cells, tumors and mice, sometimes with help from A.I. software.

The spate of misconduct allegations has drawn attention to the pressures on academic scientists — even those, like Dr. Yoon, who also work as doctors — to produce heaps of research.

Strong images of experiments’ results are often needed for those studies. Publishing them helps scientists win prestigious academic appointments and attract federal research grants that can pay dividends for themselves and their universities.

Dr. Yoon, a robotic surgery specialist noted for his treatment of stomach cancers, has helped bring in nearly $5 million in federal research money over his career.

The latest retractions from his lab included articles from 2020 and 2021 that Dr. David said contained glaring irregularities . Their results appeared to include identical images of tumor-stricken mice, despite those mice supposedly having been subjected to different experiments involving separate treatments and types of cancer cells.

The medical journal Cell Death & Disease retracted two of the latest studies, and Oncogene retracted the third. The journals found that the studies had also reused other images, like identical pictures of constellations of cancer cells.

The studies Dr. David flagged as containing image problems were largely overseen by the more senior Dr. Yoon. Changhwan Yoon, an associate research scientist who has worked alongside Dr. Yoon for a decade, was often a first author, which generally designates the scientist who ran the bulk of the experiments.

Kun Huang, a scientist in China who oversaw one of the recently retracted studies, a 2020 paper that did not include the more senior Dr. Yoon, attributed that study’s problematic sections to Changhwan Yoon. Dr. Huang, who made those comments this month on PubPeer, a website where scientists post about studies, did not respond to an email seeking comment.

But the more senior Dr. Yoon has long been made aware of problems in research he published alongside Changhwan Yoon: The two scientists were notified of the removal in January 2022 of their stomach cancer study that was found to have violated ethics guidelines.

Research misconduct is often pinned on the more junior researchers who conduct experiments. Other scientists, though, assign greater responsibility to the senior researchers who run labs and oversee studies, even as they juggle jobs as doctors or administrators.

“The research world’s coming to realize that with great power comes great responsibility and, in fact, you are responsible not just for what one of your direct reports in the lab has done, but for the environment you create,” Dr. Oransky said.

In their latest public retraction notices, medical journals said that they had lost faith in the results and conclusions. Imaging experts said some irregularities identified by Dr. David bore signs of deliberate manipulation, like flipped or rotated images, while others could have been sloppy copy-and-paste errors.

The little-noticed removal by a journal of the stomach cancer study in January 2022 highlighted some scientific publishers’ policy of not disclosing the reasons for withdrawing papers as long as they have not yet formally appeared in print. That study had appeared only online.

Roland Herzog, the editor of the journal Molecular Therapy, said that editors had drafted an explanation that they intended to publish at the time of the article’s removal. But Elsevier, the journal’s parent publisher, advised them that such a note was unnecessary, he said.

Only after the Times article last month did Elsevier agree to explain the article’s removal publicly with the stern note. In an editorial this week , the Molecular Therapy editors said that in the future, they would explain the removal of any articles that had been published only online.

But Elsevier said in a statement that it did not consider online articles “to be the final published articles of record.” As a result, company policy continues to advise that such articles be removed without an explanation when they are found to contain problems. The company said it allowed editors to provide additional information where needed.

Elsevier, which publishes nearly 3,000 journals and generates billions of dollars in annual revenue , has long been criticized for its opaque removals of online articles.

Articles by the Columbia scientists with data discrepancies that remain unaddressed were largely distributed by three major publishers: Elsevier, Springer Nature and the American Association for Cancer Research. Dr. David alerted many journals to the data discrepancies in October.

Each publisher said it was investigating the concerns. Springer Nature said investigations take time because they can involve consulting experts, waiting for author responses and analyzing raw data.

Dr. David has also raised concerns about studies published independently by scientists who collaborated with the Columbia researchers on some of their recently retracted papers. For example, Sandra Ryeom, an associate professor of surgical sciences at Columbia, published an article in 2003 while at Harvard that Dr. David said contained a duplicated image . As of 2021, she was married to the more senior Dr. Yoon, according to a mortgage document from that year.

A medical journal appended a formal notice to the article last week saying “appropriate editorial action will be taken” once data concerns had been resolved. Dr. Ryeom said in a statement that she was working with the paper’s senior author on “correcting the error.”

Columbia has sought to reinforce the importance of sound research practices. Hours after the Times article appeared last month, Dr. Michael Shelanski, the medical school’s senior vice dean for research, sent an email to faculty members titled “Research Fraud Accusations — How to Protect Yourself.” It warned that such allegations, whatever their merits, could take a toll on the university.

“In the months that it can take to investigate an allegation,” Dr. Shelanski wrote, “funding can be suspended, and donors can feel that their trust has been betrayed.”

Benjamin Mueller reports on health and medicine. He was previously a U.K. correspondent in London and a police reporter in New York. More about Benjamin Mueller

We've detected unusual activity from your computer network

To continue, please click the box below to let us know you're not a robot.

Why did this happen?

Please make sure your browser supports JavaScript and cookies and that you are not blocking them from loading. For more information you can review our Terms of Service and Cookie Policy .

For inquiries related to this message please contact our support team and provide the reference ID below.

COMMENTS

  1. A Practical Guide to Writing Quantitative and Qualitative Research Questions and Hypotheses in Scholarly Articles

    INTRODUCTION. Scientific research is usually initiated by posing evidenced-based research questions which are then explicitly restated as hypotheses.1,2 The hypotheses provide directions to guide the study, solutions, explanations, and expected results.3,4 Both research questions and hypotheses are essentially formulated based on conventional theories and real-world processes, which allow the ...

  2. What Is Quantitative Research?

    Quantitative research methods. You can use quantitative research methods for descriptive, correlational or experimental research. In descriptive research, you simply seek an overall summary of your study variables.; In correlational research, you investigate relationships between your study variables.; In experimental research, you systematically examine whether there is a cause-and-effect ...

  3. Quantitative Research

    Here are some key characteristics of quantitative research: Numerical data: Quantitative research involves collecting numerical data through standardized methods such as surveys, experiments, and observational studies. This data is analyzed using statistical methods to identify patterns and relationships.

  4. Quantitative Methods

    As a consequence, the results of quantitative research may be statistically significant but are often humanly insignificant. Some specific limitations associated with using quantitative methods to study research problems in the social sciences include: Quantitative data is more efficient and able to test hypotheses, but may miss contextual detail;

  5. What Is Quantitative Research?

    Quantitative research methods. You can use quantitative research methods for descriptive, correlational or experimental research. In descriptive research, you simply seek an overall summary of your study variables.; In correlational research, you investigate relationships between your study variables.; In experimental research, you systematically examine whether there is a cause-and-effect ...

  6. Guidelines for Reporting Quantitative Methods and Results in Primary

    These guidelines, commissioned and vetted by the board of directors of Language Learning, outline the basic expectations for reporting of quantitative primary research with a specific focus on Method and Results sections. The guidelines are based on issues raised in: Norris, J. M., Ross, S., & Schoonen, R. (Eds.). (2015).

  7. Writing Quantitative Research Studies

    Summarizing quantitative data and its effective presentation and discussion can be challenging for students and researchers. This chapter provides a framework for adequately reporting findings from quantitative analysis in a research study for those contemplating to write a research paper. The rationale underpinning the reporting methods to ...

  8. Quantitative research

    Quantitative research is a research strategy that focuses on quantifying the collection and analysis of data. It is formed from a deductive approach where emphasis is placed on the testing of theory, shaped by empiricist and positivist philosophies.. Associated with the natural, applied, formal, and social sciences this research strategy promotes the objective empirical investigation of ...

  9. Quantitative Research

    Quantitative research methods are concerned with the planning, design, and implementation of strategies to collect and analyze data. Descartes, the seventeenth-century philosopher, suggested that how the results are achieved is often more important than the results themselves, as the journey taken along the research path is a journey of discovery. . High-quality quantitative research is ...

  10. Techniques for Reporting Quantitative Data

    A quantitative research report is a way of describing the completed study to other people. The findings are communicated through an oral presentation, a book, or a published paper. The report disseminates the results to research scientists or the policy decision-maker's stakeholders.

  11. Quantitative and Qualitative Research

    Social scientists are concerned with the study of people. Quantitative research is a way to learn about a particular group of people, known as a sample population. Using scientific inquiry, quantitative research relies on data that are observed or measured to examine questions about the sample population. Allen, M. (2017). The SAGE encyclopedia ...

  12. PDF Introduction to quantitative research

    Mixed-methods research is a flexible approach, where the research design is determined by what we want to find out rather than by any predetermined epistemological position. In mixed-methods research, qualitative or quantitative components can predominate, or both can have equal status. 1.4. Units and variables.

  13. What is Quantitative Research? Definition, Methods, Types, and Examples

    Quantitative research is used to validate or test a hypothesis through the collection and analysis of data. (Image by Freepik) If you're wondering what is quantitative research and whether this methodology works for your research study, you're not alone. If you want a simple quantitative research definition, then it's enough to say that this is a method undertaken by researchers based on ...

  14. Qualitative vs. Quantitative Research

    When collecting and analyzing data, quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings. Both are important for gaining different kinds of knowledge. Quantitative research. Quantitative research is expressed in numbers and graphs. It is used to test or confirm theories and assumptions.

  15. What is Quantitative Research? Definition, Examples, Key ...

    Quantitative research is a type of research that focuses on collecting and analyzing numerical data to answer research questions. There are two main methods used to conduct quantitative research: 1. Primary Method. There are several methods of primary quantitative research, each with its own strengths and limitations.

  16. What is Quantitative Research?

    What is Quantitative Research? Quantitative research is the methodology which researchers use to test theories about people's attitudes and behaviors based on numerical and statistical evidence. Researchers sample a large number of users (e.g., through surveys) to indirectly obtain measurable, bias-free data about users in relevant situations.

  17. What Is Quantitative Research Study: Methods & Examples

    Analysis of numbers allows researchers to make predictions about future trends or outcomes. Quantitative methods include surveys, experiments, field studies, structured interviews, standardized assessments and questionnaires. In this article, we will focus on what a quantitative study is and its main methods.

  18. 16. Reporting quantitative results

    Execute a quantitative research report using key elements for accuracy and openness. So you've completed your quantitative analyses and are ready to report your results. We're going to spend some time talking about what matters in quantitative research reports, but the very first thing to understand is this: openness with your data and ...

  19. A Quantitative Study of the Impact of Social Media Reviews on Brand

    A Quantitative Study of the Impact of Social Media Reviews on Brand Perception A Thesis Presented to ... For instance, a report by 2015 Pew research informs that there was a 7% rise in the usage of social media from 2005 to 2015. The report informs that 65% adults use social media (Perrin, 2015). As social media evolves into a more

  20. Qualitative vs Quantitative Research: What's the Difference?

    Qualitative research aims to produce rich and detailed descriptions of the phenomenon being studied, and to uncover new insights and meanings. Quantitative data is information about quantities, and therefore numbers, and qualitative data is descriptive, and regards phenomenon which can be observed but not measured, such as language.

  21. Critiquing Quantitative Research Reports: Key Points for the Beginner

    The first step in the critique process is for the reader to browse the abstract and article for an overview. During this initial review a great deal of information can be obtained. The abstract should provide a clear, concise overview of the study. During this review it should be noted if the title, problem statement, and research question (or ...

  22. How to appraise quantitative research

    Title, keywords and the authors. The title of a paper should be clear and give a good idea of the subject area. The title should not normally exceed 15 words 2 and should attract the attention of the reader. 3 The next step is to review the key words. These should provide information on both the ideas or concepts discussed in the paper and the ...

  23. (DOC) Quantitative Research Study Report

    Evelyn Mayes. 2019, Quantitative Research Study Report. Quantitative research, aligned with the postpositivist world view, has heralded the scientific method, thus, pertaining to promote accuracy, credibility, and validity. However, quantitative methodologies often align with qualitative research, lending it the stamp of scientific proof.

  24. Study Tracks Shifts in Student Mental Health During College

    The team made their anonymized data set publicly available—including self-reports, surveys, and phone-sensing and brain-imaging data—to help advance research into the mental health of students during their college years.. Andrew Campbell, the paper's senior author and Dartmouth's Albert Bradley 1915 Third Century Professor of Computer Science, says that the study's extensive data ...

  25. Discussion Thread Quantitative and Qualitative Research

    Health-science document from Liberty University, 1 page, Discussion Thread: Quantitative and Qualitative Research Quantitative research entails gathering numerical data to understand, forecast, and/or regulate events of interest. (Hall, Roussel, 2017). Research and experiments are effective techniques for provi

  26. Quantitative Evaluation of Large Language Models to Streamline

    Background The complex medical terminology of radiology reports may cause confusion or anxiety for patients, especially given increased access to electronic health records. Large language models (LLMs) can potentially simplify radiology report readability. Purpose To compare the performance of four publicly available LLMs (ChatGPT-3.5 and ChatGPT-4, Bard [now known as Gemini], and Bing) in ...

  27. Predicting and improving complex beer flavor through machine ...

    For each beer, we measure over 200 chemical properties, perform quantitative descriptive sensory analysis with a trained tasting panel and map data from over 180,000 consumer reviews to train 10 ...

  28. 2023 State of Latino Entrepreneurship

    This report sheds light on the intricacies of Latino/a-owned businesses by examining three prominent subgroups: women, immigrants and tech-centric firms. In doing so, our research acknowledges and addresses the multifaceted experiences of Latino/a entrepreneurs.

  29. More Studies by Columbia Cancer Researchers Are Retracted

    "For every one paper that is retracted, there are probably 10 that should be," said Dr. Ivan Oransky, co-founder of Retraction Watch, which keeps a database of 47,000-plus retracted studies.

  30. ESG Investing Studies Are Flawed, Reports Say

    Digging into the latest research, scrutinizing complex mathematical formulas and parsing tens of thousands of data points, he discovered what he says are flaws that skewed the results.