Research Methods--Quantitative, Qualitative, and More: Overview

  • Quantitative Research
  • Qualitative Research
  • Data Science Methods (Machine Learning, AI, Big Data)
  • Text Mining and Computational Text Analysis
  • Evidence Synthesis/Systematic Reviews
  • Get Data, Get Help!

About Research Methods

This guide provides an overview of research methods, how to choose and use them, and supports and resources at UC Berkeley. 

As Patten and Newhart note in the book Understanding Research Methods, "Research methods are the building blocks of the scientific enterprise. They are the 'how' for building systematic knowledge. The accumulation of knowledge through research is by its nature a collective endeavor. Each well-designed study provides evidence that may support, amend, refute, or deepen the understanding of existing knowledge...Decisions are important throughout the practice of research and are designed to help researchers collect evidence that includes the full spectrum of the phenomenon under study, to maintain logical rules, and to mitigate or account for possible sources of bias. In many ways, learning research methods is learning how to see and make these decisions."

The choice of methods varies by discipline, by the kind of phenomenon being studied and the data being used to study it, by the technology available, and more.  This guide is an introduction, but if you don't see what you need here, always contact your subject librarian, and/or take a look to see if there's a library research guide that will answer your question. 

Suggestions for changes and additions to this guide are welcome! 

START HERE: SAGE Research Methods

Without question, the most comprehensive resource available from the library is SAGE Research Methods. The library's online guide walks you through this one-stop shopping collection, and some helpful links are below:

  • SAGE Research Methods
  • Little Green Books  (Quantitative Methods)
  • Little Blue Books  (Qualitative Methods)
  • Dictionaries and Encyclopedias  
  • Case studies of real research projects
  • Sample datasets for hands-on practice
  • Streaming video--see methods come to life
  • Methodspace--a community for researchers
  • SAGE Research Methods Course Mapping

Library Data Services at UC Berkeley

Library Data Services Program and Digital Scholarship Services

The LDSP offers a variety of services and tools! From this link, check out pages for each of the following topics: discovering data, managing data, collecting data, GIS data, text data mining, publishing data, digital scholarship, open science, and the Research Data Management Program.

Be sure also to check out the visual guide to where to seek assistance on campus with any research question you may have!

Library GIS Services

Other Data Services at Berkeley

  • D-Lab: Supports Berkeley faculty, staff, and graduate students with research in data-intensive social science, including a wide range of training and workshop offerings.
  • Dryad: A simple self-service tool for researchers to use in publishing their datasets. It provides tools for the effective publication of and access to research data.
  • Geospatial Innovation Facility (GIF): Provides leadership and training across a broad array of integrated mapping technologies on campus.
  • Research Data Management: A UC Berkeley guide and consulting service for research data management issues.

General Research Methods Resources

Here are some general resources for assistance:

  • Assistance from ICPSR (must create an account to access): Getting Help with Data, and Resources for Students
  • Wiley Stats Ref for background information on statistics topics
  • Survey Documentation and Analysis (SDA), a program for easy web-based analysis of survey data

Consultants

  • D-Lab/Data Science Discovery Consultants: Request help with your research project from peer consultants.
  • Research Data Management (RDM) consulting: Meet with RDM consultants before designing the data security, storage, and sharing aspects of your qualitative project.
  • Statistics Department Consulting Services: A service in which advanced graduate students, under faculty supervision, are available to consult during specified hours in the Fall and Spring semesters.

Related Resources

  • IRB / CPHS: Qualitative research projects with human subjects often require that you go through an ethics review.
  • OURS (Office of Undergraduate Research and Scholarships): Supports undergraduates who want to embark on research projects and assistantships. In particular, check out their "Getting Started in Research" workshops.
  • Sponsored Projects: Works with researchers applying for major external grants.

100 Questions (and Answers) About Research Methods


  • Neil J. Salkind
  • Description

"How do I know when my literature review is finished?"

"What is the difference between a sample and a population?"

"What is power and why is it important?"

In an increasingly data-driven world, it is more important than ever for students as well as professionals to better understand the process of research. This invaluable guide answers the essential questions that students ask about research methods in a concise and accessible way.

Sample Materials & Chapters

Question #16: How Do I Know When My Literature Review Is Finished?

Question #32: How Can I Create a Good Research Hypothesis?

Question #40: What Is the Difference Between a Sample and a Population, and Why

Question #92: What Is Power, and Why Is It Important?


100 Questions (and Answers) About Qualitative Research


  • Lisa M. Given - Swinburne University, Australia, Charles Sturt University, Australia, RMIT University, Melbourne, Australia

“This is a great companion book for a course on qualitative methods and it is also a great resource as a ‘ready-reference,’ which should be a required companion for all graduate students who will be taking qualitative research methods.”

“It provides an overview of the subject on the nuances of qualitative research.”

“Very precise in helping students determine if their study is appropriate for this type of research design.”

“The book appears to provide the right combination of breadth and depth. There are a lot of topics covered, but the book seems to provide a succinct, snapshot-like answer for each question.”

“A book like this can provide a useful supplement to major texts and be used as a reference.”

100 Questions (and Answers) About Research Methods

Contents

  • Part 1. Understanding the Research Process and Getting Started
  • Part 2. Reviewing and Writing About Your Research Question
  • Part 3. Introductory Ideas About Ethics
  • Part 4. Research Methods: Knowing the Language, Knowing the Ideas
  • Part 5. Sampling Ideas and Issues
  • Part 6. Describing Data Using Descriptive Techniques
  • Part 7. All About Testing and Measuring
  • Part 8. Understanding Different Research Methods
  • Part 9. All About Inference and Significance.
  • (source: Nielsen Book Data)


Research Methodology – Types, Examples and Writing Guide

Research Methodology

Definition:

Research Methodology refers to the systematic and scientific approach used to conduct research, investigate problems, and gather data and information for a specific purpose. It involves the techniques and procedures used to identify, collect, analyze, and interpret data to answer research questions or solve research problems. It also encompasses the philosophical and theoretical frameworks that guide the research process.

Structure of Research Methodology

Research methodology formats can vary depending on the specific requirements of the research project, but the following is a basic example of a structure for a research methodology section:

I. Introduction

  • Provide an overview of the research problem and the need for a research methodology section
  • Outline the main research questions and objectives

II. Research Design

  • Explain the research design chosen and why it is appropriate for the research question(s) and objectives
  • Discuss any alternative research designs considered and why they were not chosen
  • Describe the research setting and participants (if applicable)

III. Data Collection Methods

  • Describe the methods used to collect data (e.g., surveys, interviews, observations)
  • Explain how the data collection methods were chosen and why they are appropriate for the research question(s) and objectives
  • Detail any procedures or instruments used for data collection

IV. Data Analysis Methods

  • Describe the methods used to analyze the data (e.g., statistical analysis, content analysis)
  • Explain how the data analysis methods were chosen and why they are appropriate for the research question(s) and objectives
  • Detail any procedures or software used for data analysis

V. Ethical Considerations

  • Discuss any ethical issues that may arise from the research and how they were addressed
  • Explain how informed consent was obtained (if applicable)
  • Detail any measures taken to ensure confidentiality and anonymity

VI. Limitations

  • Identify any potential limitations of the research methodology and how they may impact the results and conclusions

VII. Conclusion

  • Summarize the key aspects of the research methodology section
  • Explain how the research methodology addresses the research question(s) and objectives

Research Methodology Types

Types of Research Methodology are as follows:

Quantitative Research Methodology

This is a research methodology that involves the collection and analysis of numerical data using statistical methods. This type of research is often used to study cause-and-effect relationships and to make predictions.

Qualitative Research Methodology

This is a research methodology that involves the collection and analysis of non-numerical data such as words, images, and observations. This type of research is often used to explore complex phenomena, to gain an in-depth understanding of a particular topic, and to generate hypotheses.

Mixed-Methods Research Methodology

This is a research methodology that combines elements of both quantitative and qualitative research. This approach can be particularly useful for studies that aim to explore complex phenomena and to provide a more comprehensive understanding of a particular topic.

Case Study Research Methodology

This is a research methodology that involves in-depth examination of a single case or a small number of cases. Case studies are often used in psychology, sociology, and anthropology to gain a detailed understanding of a particular individual or group.

Action Research Methodology

This is a research methodology that involves a collaborative process between researchers and practitioners to identify and solve real-world problems. Action research is often used in education, healthcare, and social work.

Experimental Research Methodology

This is a research methodology that involves the manipulation of one or more independent variables to observe their effects on a dependent variable. Experimental research is often used to study cause-and-effect relationships and to make predictions.

Survey Research Methodology

This is a research methodology that involves the collection of data from a sample of individuals using questionnaires or interviews. Survey research is often used to study attitudes, opinions, and behaviors.

Grounded Theory Research Methodology

This is a research methodology that involves the development of theories based on the data collected during the research process. Grounded theory is often used in sociology and anthropology to generate theories about social phenomena.

Research Methodology Example

An Example of Research Methodology could be the following:

Research Methodology for Investigating the Effectiveness of Cognitive Behavioral Therapy in Reducing Symptoms of Depression in Adults

Introduction:

The aim of this research is to investigate the effectiveness of cognitive-behavioral therapy (CBT) in reducing symptoms of depression in adults. To achieve this objective, a randomized controlled trial (RCT) will be conducted using a mixed-methods approach.

Research Design:

The study will follow a pre-test and post-test design with two groups: an experimental group receiving CBT and a control group receiving no intervention. The study will also include a qualitative component, in which semi-structured interviews will be conducted with a subset of participants to explore their experiences of receiving CBT.

Participants:

Participants will be recruited from community mental health clinics in the local area. The sample will consist of 100 adults aged 18-65 years old who meet the diagnostic criteria for major depressive disorder. Participants will be randomly assigned to either the experimental group or the control group.
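Purely as an illustration (this code is not part of the protocol above, and the participant IDs are hypothetical), simple random assignment of 100 IDs to the two arms could be sketched in Python like this:

```python
import random

# Hypothetical participant identifiers; a real trial would use recruited IDs
participant_ids = list(range(1, 101))  # 100 adults, matching the sample size above

random.seed(42)                 # fixed seed so the allocation can be reproduced and audited
random.shuffle(participant_ids)

# Split the shuffled list into two equal arms of 50
experimental_group = participant_ids[:50]   # will receive CBT
control_group = participant_ids[50:]        # no intervention

print(len(experimental_group), len(control_group))  # 50 50
```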

Intervention:

The experimental group will receive 12 weekly sessions of CBT, each lasting 60 minutes. The intervention will be delivered by licensed mental health professionals who have been trained in CBT. The control group will receive no intervention during the study period.

Data Collection:

Quantitative data will be collected through the use of standardized measures such as the Beck Depression Inventory-II (BDI-II) and the Generalized Anxiety Disorder-7 (GAD-7). Data will be collected at baseline, immediately after the intervention, and at a 3-month follow-up. Qualitative data will be collected through semi-structured interviews with a subset of participants from the experimental group. The interviews will be conducted at the end of the intervention period, and will explore participants’ experiences of receiving CBT.

Data Analysis:

Quantitative data will be analyzed using descriptive statistics, t-tests, and mixed-model analyses of variance (ANOVA) to assess the effectiveness of the intervention. Qualitative data will be analyzed using thematic analysis to identify common themes and patterns in participants’ experiences of receiving CBT.
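As an illustrative sketch only (the scores below are made-up placeholders, not study data), the simplest two-group comparison of post-intervention BDI-II scores might look like this in Python with SciPy:

```python
import numpy as np
from scipy import stats

# Placeholder post-intervention BDI-II scores for each arm (not real data)
cbt_group = np.array([12, 9, 15, 11, 8, 14, 10, 13])
control_group = np.array([21, 18, 24, 19, 22, 17, 20, 23])

# Descriptive statistics per arm
print("CBT mean:", cbt_group.mean(), "SD:", cbt_group.std(ddof=1))
print("Control mean:", control_group.mean(), "SD:", control_group.std(ddof=1))

# Independent-samples t-test (Welch's variant, not assuming equal variances)
t_stat, p_value = stats.ttest_ind(cbt_group, control_group, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

A mixed-model ANOVA across baseline, post-intervention, and follow-up time points would typically be run with a dedicated statistics package; the t-test above only illustrates the basic between-groups comparison.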

Ethical Considerations:

This study will comply with ethical guidelines for research involving human subjects. Participants will provide informed consent before participating in the study, and their privacy and confidentiality will be protected throughout the study. Any adverse events or reactions will be reported and managed appropriately.

Data Management:

All data collected will be kept confidential and stored securely using password-protected databases. Identifying information will be removed from qualitative data transcripts to ensure participants’ anonymity.

Limitations:

One potential limitation of this study is that it only focuses on one type of psychotherapy, CBT, and may not generalize to other types of therapy or interventions. Another limitation is that the study will only include participants from community mental health clinics, which may not be representative of the general population.

Conclusion:

This research aims to investigate the effectiveness of CBT in reducing symptoms of depression in adults. By using a randomized controlled trial and a mixed-methods approach, the study will provide valuable insights into the mechanisms underlying the relationship between CBT and depression. The results of this study will have important implications for the development of effective treatments for depression in clinical settings.

How to Write Research Methodology

Writing a research methodology involves explaining the methods and techniques you used to conduct research, collect data, and analyze results. It’s an essential section of any research paper or thesis, as it helps readers understand the validity and reliability of your findings. Here are the steps to write a research methodology:

  • Start by explaining your research question: Begin the methodology section by restating your research question and explaining why it’s important. This helps readers understand the purpose of your research and the rationale behind your methods.
  • Describe your research design: Explain the overall approach you used to conduct research. This could be a qualitative or quantitative research design, experimental or non-experimental, case study or survey, etc. Discuss the advantages and limitations of the chosen design.
  • Discuss your sample: Describe the participants or subjects you included in your study. Include details such as their demographics, sampling method, sample size, and any exclusion criteria used.
  • Describe your data collection methods: Explain how you collected data from your participants. This could include surveys, interviews, observations, questionnaires, or experiments. Include details on how you obtained informed consent, how you administered the tools, and how you minimized the risk of bias.
  • Explain your data analysis techniques: Describe the methods you used to analyze the data you collected. This could include statistical analysis, content analysis, thematic analysis, or discourse analysis. Explain how you dealt with missing data, outliers, and any other issues that arose during the analysis.
  • Discuss the validity and reliability of your research: Explain how you ensured the validity and reliability of your study. This could include measures such as triangulation, member checking, peer review, or inter-coder reliability.
  • Acknowledge any limitations of your research: Discuss any limitations of your study, including any potential threats to validity or generalizability. This helps readers understand the scope of your findings and how they might apply to other contexts.
  • Provide a summary: End the methodology section by summarizing the methods and techniques you used to conduct your research. This provides a clear overview of your research methodology and helps readers understand the process you followed to arrive at your findings.

When to Write Research Methodology

Research methodology is typically written after the research proposal has been approved and before the actual research is conducted. It should be written prior to data collection and analysis, as it provides a clear roadmap for the research project.

The research methodology is an important section of any research paper or thesis, as it describes the methods and procedures that will be used to conduct the research. It should include details about the research design, data collection methods, data analysis techniques, and any ethical considerations.

The methodology should be written in a clear and concise manner, and it should be based on established research practices and standards. It is important to provide enough detail so that the reader can understand how the research was conducted and evaluate the validity of the results.

Applications of Research Methodology

Here are some of the applications of research methodology:

  • To identify the research problem: Research methodology is used to identify the research problem, which is the first step in conducting any research.
  • To design the research: Research methodology helps in designing the research by selecting the appropriate research method, research design, and sampling technique.
  • To collect data: Research methodology provides a systematic approach to collect data from primary and secondary sources.
  • To analyze data: Research methodology helps in analyzing the collected data using various statistical and non-statistical techniques.
  • To test hypotheses: Research methodology provides a framework for testing hypotheses and drawing conclusions based on the analysis of data.
  • To generalize findings: Research methodology helps in generalizing the findings of the research to the target population.
  • To develop theories: Research methodology is used to develop new theories and modify existing theories based on the findings of the research.
  • To evaluate programs and policies: Research methodology is used to evaluate the effectiveness of programs and policies by collecting data and analyzing it.
  • To improve decision-making: Research methodology helps in making informed decisions by providing reliable and valid data.

Purpose of Research Methodology

Research methodology serves several important purposes, including:

  • To guide the research process: Research methodology provides a systematic framework for conducting research. It helps researchers to plan their research, define their research questions, and select appropriate methods and techniques for collecting and analyzing data.
  • To ensure research quality: Research methodology helps researchers to ensure that their research is rigorous, reliable, and valid. It provides guidelines for minimizing bias and error in data collection and analysis, and for ensuring that research findings are accurate and trustworthy.
  • To replicate research: Research methodology provides a clear and detailed account of the research process, making it possible for other researchers to replicate the study and verify its findings.
  • To advance knowledge: Research methodology enables researchers to generate new knowledge and to contribute to the body of knowledge in their field. It provides a means for testing hypotheses, exploring new ideas, and discovering new insights.
  • To inform decision-making: Research methodology provides evidence-based information that can inform policy and decision-making in a variety of fields, including medicine, public health, education, and business.

Advantages of Research Methodology

Research methodology has several advantages that make it a valuable tool for conducting research in various fields. Here are some of the key advantages of research methodology:

  • Systematic and structured approach: Research methodology provides a systematic and structured approach to conducting research, which ensures that the research is conducted in a rigorous and comprehensive manner.
  • Objectivity: Research methodology aims to ensure objectivity in the research process, which means that the research findings are based on evidence and not influenced by personal bias or subjective opinions.
  • Replicability: Research methodology ensures that research can be replicated by other researchers, which is essential for validating research findings and ensuring their accuracy.
  • Reliability: Research methodology aims to ensure that the research findings are reliable, which means that they are consistent and can be depended upon.
  • Validity: Research methodology ensures that the research findings are valid, which means that they accurately reflect the research question or hypothesis being tested.
  • Efficiency: Research methodology provides a structured and efficient way of conducting research, which helps to save time and resources.
  • Flexibility: Research methodology allows researchers to choose the most appropriate research methods and techniques based on the research question, data availability, and other relevant factors.
  • Scope for innovation: Research methodology provides scope for innovation and creativity in designing research studies and developing new research techniques.

Research Methodology vs. Research Methods

In brief, research methodology is the overall strategy, rationale, and framework that guides a study, while research methods are the specific tools and procedures (such as surveys, interviews, experiments, or statistical tests) used within that framework to collect and analyze data.

About the author: Muhammad Hassan, Researcher, Academic Writer, Web developer

Examples of good research questions


Developing a good research question is often challenging, but a well-defined question makes it much easier to carry out appropriate data analysis and draw meaningful conclusions from your investigation.

So, to get you on the right track, let’s start by defining a research question, what types of research questions are common, and the steps to drafting an excellent research question.


  • What is a research question?

The definition of a research question might seem fairly obvious.

 At its simplest, a research question is a question you research to find the answer.

Researchers typically start with a problem or an issue and seek to understand why it has occurred, how it can be solved, or other aspects of its nature.

As you'll see, researchers typically start with a broad question that becomes narrower and more specific as the research stages are completed.

In some cases, a study may tackle more than one research question.

  • Research question types

Research questions are typically divided into three broad categories: qualitative, quantitative, and mixed-method.

These categories reflect the research type necessary to answer the research question.

Qualitative research

When you conduct qualitative research, you're broadly exploring a subject to analyze its inherent qualities.

There are many types of qualitative research questions, which include:

Descriptive: describing and illuminating little-known or overlooked aspects of a subject

Emancipatory: uncovering data that can serve to emancipate a particular group of people, such as disadvantaged or marginalized communities

Evaluative:  assessing how well a particular research approach or method works

Explanatory: answering “how” or “why” a given phenomenon occurs 

Exploratory:  identifying reasons behind certain behaviors and exploring motivations (also known as generative research because it can generate solutions to problems)

Ideological: researching ideologies or beliefs, such as political affiliation

Interpretive: understanding group perceptions, decision-making, and behavior in a natural setting

Predictive: forecasting a likely outcome or scenario by examining past events 

While it's helpful to understand the differences between these qualitative research question types, writing a good question doesn't start with determining the precise type of research question you'll be asking.

It starts with determining what answers you're seeking.

Quantitative research

Unlike broad, flexible qualitative research questions, quantitative research questions are precise. They also directly link the research question and the proposed methodology.

So, in a quantitative research question, you'll usually find:

The study method 

An independent variable (or variables)

A dependent variable

The study population 

Quantitative research questions can also fall into multiple categories, including:

Comparative research questions compare two or more groups according to specific criteria and analyze their similarities and differences.

Descriptive questions measure a population's response to one or more variables.

Relationship (or relationship-based) questions examine how two or more variables interact.

Mixed-methods research

As its name suggests, mixed-methods research questions involve qualitative and quantitative components.

These questions are ideal when the answers require an evaluation of a specific aspect of a phenomenon that you can quantify and a broader understanding of aspects that can't.

  • How to write a research question

Writing a good research question can be challenging, even if you're passionate about the subject matter.

A good research question addresses a problem that has not yet been answered and that can be investigated empirically.

The approach might involve quantitative or qualitative methodology, or a mixture of both. To write a well-developed research question, follow the four steps below:

1. Select a general topic

Start with a broad topic. You may already have one in mind or get one assigned to you. If you don't, think about one you're curious about. 

You can also use common brainstorming techniques , draw on discussions you've had with family and friends, take topics from the news, or use other similar sources of inspiration.

Also, consider a subject that has yet to be studied or addressed. If you're looking to tackle a topic that has already been thoroughly studied, you'll want to examine it from a new angle.

Still, the closer your question, approach, and outcomes are to existing literature, the less value your work will offer. It will also be less publishing-worthy (if that’s your goal).

2. Conduct preliminary research

Next, you'll want to conduct some initial research about your topic. You'll read coverage about your topic in academic journals, the news, and other credible sources at this stage.

You'll familiarize yourself with the terminology commonly used to describe your topic and the current take from subject matter experts and the general public. 

This preliminary review helps you in a few ways. First, you'll find many researchers will discuss challenges they found conducting their research in their "Limitations," "Results," and "Discussion" sections of research papers.

Assessing these sections also helps you avoid choosing the wrong methodological approach to answering your question. Initial research also enables you to avoid focusing on a topic that has already been covered. 

You can generate valuable research questions by tracking topics that have yet to be covered.

3. Consider your audience

Next, you'll want to give some thought to your audience. For example, what kinds of research material are they looking for, and what might they find valuable?

Reflect on why you’re conducting the research. 

What is your team looking to learn if your research is for a work assignment?

How does what they’re asking for from you connect to business goals?

Understanding what your audience is seeking can help you shape the direction of your research so that the final draft connects with your audience.

If you're writing for an academic journal, what types of research do they publish? What kinds of research approaches have they published? And what criteria do they expect submitted manuscripts to meet?

4. Generate potential questions

Take the insights you've gained from your preliminary research and your audience assessment to narrow your topic into a research question. 

Your question should be one that you can answer using the appropriate research methods. Unfortunately, some researchers start with questions they lack the resources to answer, and end up producing studies with constrained outcomes that offer little value to the broader community.

Make sure your question is one you can realistically answer.

  • Examples of poor research questions

"How do electronics distract teen drivers?"

From a researcher's perspective, this question is problematic because it is overly broad. For instance, what counts as “electronics” in this context? Some electronics, like eye-monitoring systems in semi-autonomous vehicles, are designed to keep drivers focused on the road.

Also, how does the question define “teens”? Some states allow you to get a learner's permit as young as 14, while others require you to be 18 to drive. Therefore, conducting a study without further defining the participants' ages is not scientifically sound.

Here's another example of an ineffective research question:

"Why is the sky blue?"

This question has been researched thoroughly and answered. 

A simple online search will turn up hundreds, if not thousands, of pages of resources devoted to this very topic. 

If you spend time conducting original research on a long-answered question, your research won’t be interesting, relevant, or valuable to your audience.

Alternatively, here's an example of a good research question:

"How does using a vehicle’s infotainment touch screen by drivers aged 16 to 18 in the U.S. affect driving habits?"

This question is far more specific than the first bad example. It notes the population of the study, as well as the independent and dependent variables.

And if you're still interested in the sky's color, a better example of a research question might be:

"What color is the sky on Proxima Centauri b, based on existing observations?"

A qualitative research study based on this question could extrapolate what visitors on Proxima Centauri b (a planet in the closest solar system to ours) might see as they look at the sky.

You could approach this by combining our understanding of how light scattering off air molecules produces a blue sky with the likely composition of Proxima Centauri b's atmosphere, based on data NASA and others have gathered.

  • Why the right research question is critical

As you can see from the examples, starting with a poorly framed research question can make your study difficult or impossible to complete. 

Or it can lead you to duplicate research findings.

Ultimately, developing the right research question sets you up for success. It helps you define a realistic scope for your study, informs the best approach to answer the central question, and conveys its value to your audience. 

That's why you must take the time to get your research question right before you embark on any other part of your project.


Research Question 101 📖

Everything you need to know to write a high-quality research question

By: Derek Jansen (MBA) | Reviewed By: Dr. Eunice Rautenbach | October 2023

If you’ve landed on this page, you’re probably asking yourself, “What is a research question?”. Well, you’ve come to the right place. In this post, we’ll explain what a research question is, how it’s different from a research aim, and how to craft a high-quality research question that sets you up for success.

  • What is a research question?
  • Research questions vs research aims
  • The 4 types of research questions
  • How to write a research question
  • Frequently asked questions
  • Examples of research questions

What is a research question?

As the name suggests, the research question is the core question (or set of questions) that your study will (attempt to) answer.

In many ways, a research question is akin to a target in archery. Without a clear target, you won’t know where to concentrate your efforts and focus. Essentially, your research question acts as the guiding light throughout your project and informs every choice you make along the way.

Let’s look at some examples:

What impact does social media usage have on the mental health of teenagers in New York?
How does the introduction of a minimum wage affect employment levels in small businesses in outer London?
How does the portrayal of women in 19th-century American literature reflect the societal attitudes of the time?
What are the long-term effects of intermittent fasting on heart health in adults?

As you can see in these examples, research questions are clear, specific questions that can be feasibly answered within a study. These are important attributes and we’ll discuss each of them in more detail a little later. If you’d like to see more examples of research questions, you can find our RQ mega-list here.


Research Questions vs Research Aims

At this point, you might be asking yourself, “How is a research question different from a research aim?”. Within any given study, the research aim and research question (or questions) are tightly intertwined, but they are separate things. Let’s unpack that a little.

A research aim is typically broader in nature and outlines what you hope to achieve with your research. It doesn’t ask a specific question but rather gives a summary of what you intend to explore.

The research question, on the other hand, is much more focused. It’s the specific query you’re setting out to answer. It narrows down the research aim into a detailed, researchable question that will guide your study’s methods and analysis.

Let’s look at an example:

Research Aim: To explore the effects of climate change on marine life in Southern Africa.
Research Question: How does ocean acidification caused by climate change affect the reproduction rates of coral reefs?

As you can see, the research aim gives you a general focus, while the research question details exactly what you want to find out.


Types of research questions

Now that we’ve defined what a research question is, let’s look at the different types of research questions that you might come across. Broadly speaking, there are (at least) four different types of research questions – descriptive, comparative, relational, and explanatory.

Descriptive questions ask what is happening. In other words, they seek to describe a phenomenon or situation. An example of a descriptive research question could be something like “What types of exercise do high-performing UK executives engage in?”. This would likely be a bit too basic to form an interesting study, but as you can see, the research question is just focused on the what – in other words, it just describes the situation.

Comparative research questions , on the other hand, look to understand the way in which two or more things differ , or how they’re similar. An example of a comparative research question might be something like “How do exercise preferences vary between middle-aged men across three American cities?”. As you can see, this question seeks to compare the differences (or similarities) in behaviour between different groups.

Next up, we’ve got explanatory research questions, which ask why or how something is happening. While the other types of questions we looked at focused on the what, explanatory research questions are interested in the why and how. As an example, an explanatory research question might ask something like “Why have bee populations declined in Germany over the last 5 years?”. As you can see, this question is aimed squarely at the why, rather than the what.

Last but not least, we have relational research questions. As the name suggests, these types of research questions seek to explore the relationships between variables. Here, an example could be something like “What is the relationship between X and Y” or “Does A have an impact on B”. As you can see, these types of research questions are interested in understanding how constructs or variables are connected, and perhaps, whether one thing causes another.

Of course, depending on how fine-grained you want to get, you can argue that there are many more types of research questions, but these four categories give you a broad idea of the different flavours that exist out there. It’s also worth pointing out that a research question doesn’t need to fit perfectly into one category – in many cases, a research question might overlap into more than just one category and that’s okay.

The key takeaway here is that research questions can take many different forms, and it’s useful to understand the nature of your research question so that you can align your research methodology accordingly.


How To Write A Research Question

As we alluded to earlier, a well-crafted research question needs to possess very specific attributes, including focus, clarity and feasibility. But that’s not all – a rock-solid research question also needs to be rooted and aligned. Let’s look at each of these.

Focused

A strong research question typically has a single focus. So, don’t try to cram multiple questions into one research question; rather split them up into separate questions (or even subquestions), each with their own specific focus. As a rule of thumb, narrow beats broad when it comes to research questions.

Clear and specific

A good research question is clear and specific, not vague and broad. State clearly exactly what you want to find out so that any reader can quickly understand what you’re looking to achieve with your study. Along the same vein, try to avoid using bulky language and jargon – aim for clarity.

Feasible

Unfortunately, even a super tantalising and thought-provoking research question has little value if you cannot feasibly answer it. So, think about the methodological implications of your research question while you’re crafting it. Most importantly, make sure that you know exactly what data you’ll need (primary or secondary) and how you’ll analyse that data.

Rooted in a research gap

A good research question (and a research topic, more broadly) should be rooted in a clear research gap and research problem. Without a well-defined research gap, you risk wasting your effort pursuing a question that’s already been adequately answered (and agreed upon) by the research community. A well-argued research gap lays at the heart of a valuable study, so make sure you have your gap clearly articulated and that your research question directly links to it.

Aligned with the research aim

As we mentioned earlier, your research aim and research question are (or at least, should be) tightly linked. So, make sure that your research question (or set of questions) aligns with your research aim. If not, you’ll need to revise one of the two to achieve this.

FAQ: Research Questions

How many research questions should I have?

What should I avoid when writing a research question?

Can a research question be a statement?

Typically, a research question is phrased as a question, not a statement. A question clearly indicates what you’re setting out to discover.

Can a research question be too broad or too narrow?

Yes. A question that’s too broad makes your research unfocused, while a question that’s too narrow limits the scope of your study.

Here’s an example of a research question that’s too broad:

“Why is mental health important?”

Conversely, here’s an example of a research question that’s likely too narrow:

“What is the impact of sleep deprivation on the exam scores of 19-year-old males in London studying maths at The Open University?”

Can I change my research question during the research process?

How do I know if my research question is good?

A good research question is focused, specific, practical, rooted in a research gap, and aligned with the research aim. If your question meets these criteria, it’s likely a strong question.

Is a research question similar to a hypothesis?

Not quite. A hypothesis is a testable statement that predicts an outcome, while a research question is a query that you’re trying to answer through your study. Naturally, there can be linkages between a study’s research questions and hypothesis, but they serve different functions.

How are research questions and research objectives related?

The research question is a focused and specific query that your study aims to answer. It’s the central issue you’re investigating. The research objective, on the other hand, outlines the steps you’ll take to answer your research question. Research objectives are often more action-oriented and can be broken down into smaller tasks that guide your research process. In a sense, they’re something of a roadmap that helps you answer your research question.



Questionnaire Design | Methods, Question Types & Examples

Published on July 15, 2021 by Pritha Bhandari. Revised on June 22, 2023.

A questionnaire is a list of questions or items used to gather data from respondents about their attitudes, experiences, or opinions. Questionnaires can be used to collect quantitative and/or qualitative information.

Questionnaires are commonly used in market research as well as in the social and health sciences. For example, a company may ask for feedback about a recent customer service experience, or psychology researchers may investigate health risk perceptions using questionnaires.

Table of contents

  • Questionnaires vs. surveys
  • Questionnaire methods
  • Open-ended vs. closed-ended questions
  • Question wording
  • Question order
  • Step-by-step guide to design
  • Frequently asked questions about questionnaire design

Questionnaires vs. surveys

A survey is a research method where you collect and analyze data from a group of people. A questionnaire is a specific tool or instrument for collecting the data.

Designing a questionnaire means creating valid and reliable questions that address your research objectives, placing them in a useful order, and selecting an appropriate method for administration.

But designing a questionnaire is only one component of survey research. Survey research also involves defining the population you’re interested in, choosing an appropriate sampling method, administering questionnaires, data cleansing and analysis, and interpretation.

Sampling is important in survey research because you’ll often aim to generalize your results to the population. Gather data from a sample that represents the range of views in the population for externally valid results. There will always be some differences between the population and the sample, but minimizing these will help you avoid several types of research bias, including sampling bias, ascertainment bias, and undercoverage bias.
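As a minimal sketch under invented assumptions (the sampling frame of member IDs is a placeholder), drawing a simple random sample in Python might look like this; real projects would also weigh stratification and nonresponse:

```python
import random

# Hypothetical sampling frame of 10,000 member IDs (placeholder data)
sampling_frame = list(range(1, 10_001))

random.seed(7)
# Simple random sample of 500 people to invite: every member of the frame
# has an equal chance of selection, which helps limit sampling bias
invited = random.sample(sampling_frame, k=500)

print(len(invited), invited[:5])
```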


Questionnaire methods

Questionnaires can be self-administered or researcher-administered. Self-administered questionnaires are more common because they are easy to implement and inexpensive, but researcher-administered questionnaires allow deeper insights.

Self-administered questionnaires

Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through mail. All questions are standardized so that all respondents receive the same questions with identical wording.

Self-administered questionnaires can be:

  • cost-effective
  • easy to administer for small and large groups
  • anonymous and suitable for sensitive topics

But they may also be:

  • unsuitable for people with limited literacy or verbal skills
  • susceptible to a nonresponse bias (most people invited may not complete the questionnaire)
  • biased towards people who volunteer because impersonal survey requests often go ignored.

Researcher-administered questionnaires

Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents.

Researcher-administered questionnaires can:

  • help you ensure the respondents are representative of your target audience
  • allow clarifications of ambiguous or unclear questions and answers
  • have high response rates because it’s harder to refuse an interview when personal attention is given to respondents

But researcher-administered questionnaires can be limiting in terms of resources. They are:

  • costly and time-consuming to perform
  • more difficult to analyze if you have qualitative responses
  • likely to contain experimenter bias or demand characteristics
  • likely to encourage social desirability bias in responses because of a lack of anonymity

Open-ended vs. closed-ended questions

Your questionnaire can include open-ended or closed-ended questions or a combination of both.

Using closed-ended questions limits your responses, while open-ended questions enable a broad range of answers. You’ll need to balance these considerations with your available time and resources.

Closed-ended questions

Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. Closed-ended questions are best for collecting data on categorical or quantitative variables.

Categorical variables can be nominal or ordinal. Quantitative variables can be interval or ratio. Understanding the type of variable and level of measurement means you can perform appropriate statistical analyses for generalizable results.

Examples of closed-ended questions for different variables

Nominal variables include categories that can’t be ranked, such as race or ethnicity. This includes binary or dichotomous categories.

It’s best to include categories that cover all possible answers and are mutually exclusive. There should be no overlap between response items. For example, a question about race might include these response options:

  • White
  • Black or African American
  • American Indian or Alaska Native
  • Asian
  • Native Hawaiian or Other Pacific Islander

In binary or dichotomous questions, you’ll give respondents only two options to choose from.

Ordinal variables include categories that can be ranked. Consider how wide or narrow a range you’ll include in your response items, and their relevance to your respondents.

Likert scale questions collect ordinal data using rating scales with 5 or 7 points.

When you have four or more Likert-type questions, you can treat the composite data as quantitative data on an interval scale. Intelligence tests, psychological scales, and personality inventories use multiple Likert-type questions to collect interval data.

With interval or ratio scales, you can apply strong statistical hypothesis tests to address your research aims.
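To illustrate (the items and responses below are invented placeholders), composite scores from four Likert-type items can be computed per respondent and then treated as approximately interval data:

```python
import numpy as np

# Placeholder responses from three respondents to four 5-point Likert items
# (1 = strongly disagree ... 5 = strongly agree)
responses = np.array([
    [4, 5, 3, 4],
    [2, 1, 2, 3],
    [5, 4, 4, 5],
])

# Composite score per respondent: the mean across the four items
composite = responses.mean(axis=1)

print("Composite scores:", composite)
print("Group mean:", composite.mean(), "SD:", composite.std(ddof=1))
```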

Pros and cons of closed-ended questions

Well-designed closed-ended questions are easy to understand and can be answered quickly. However, you might still miss important answers that are relevant to respondents. An incomplete set of response items may force some respondents to pick the closest alternative to their true answer. These types of questions may also miss out on valuable detail.

To solve these problems, you can make questions partially closed-ended, and include an open-ended option where respondents can fill in their own answer.

Open-ended questions

Open-ended, or long-form, questions allow respondents to give answers in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered. For example, respondents may want to answer “multiracial” for the question on race rather than selecting from a restricted list.

  • How do you feel about open science?
  • How would you describe your personality?
  • In your opinion, what is the biggest obstacle for productivity in remote work?

Open-ended questions have a few downsides.

They require more time and effort from respondents, which may deter them from completing the questionnaire.

For researchers, understanding and summarizing responses to these questions can take a lot of time and resources. You’ll need to develop a systematic coding scheme to categorize answers, and you may also need to involve other researchers in data analysis for high reliability .

Question wording

Question wording can influence your respondents’ answers, especially if the language is unclear, ambiguous, or biased. Good questions need to be understood by all respondents in the same way (reliable) and measure exactly what you’re interested in (valid).

Use clear language

You should design questions with your target audience in mind. Consider their familiarity with your questionnaire topics and language and tailor your questions to them.

For readability and clarity, avoid jargon or overly complex language. Don’t use double negatives because they can be harder to understand.

Use balanced framing

Respondents often answer in different ways depending on the question framing. Positive frames are interpreted as more neutral than negative frames and may encourage more socially desirable answers.

Use a mix of both positive and negative frames to avoid research bias , and ensure that your question wording is balanced wherever possible.

Unbalanced questions focus on only one side of an argument. Respondents may be less likely to oppose the question if it is framed in a particular direction. It’s best practice to provide a counter argument within the question as well.

Avoid leading questions

Leading questions guide respondents towards answering in specific ways, even if that’s not how they truly feel, by explicitly or implicitly providing them with extra information.

It’s best to keep your questions short and specific to your topic of interest.

  • The average daily work commute in the US takes 54.2 minutes and costs $29 per day. Since 2020, working from home has saved many employees time and money. Do you favor flexible work-from-home policies even after it’s safe to return to offices?
  • Experts agree that a well-balanced diet provides sufficient vitamins and minerals, and multivitamins and supplements are not necessary or effective. Do you agree or disagree that multivitamins are helpful for balanced nutrition?

Keep your questions focused

Ask about only one idea at a time and avoid double-barreled questions. Double-barreled questions ask about more than one item at a time, which can confuse respondents.

This question could be difficult to answer for respondents who feel strongly about the right to clean drinking water but not high-speed internet. They might only answer about the topic they feel passionate about or provide a neutral answer instead – but neither of these options capture their true answers.

Instead, you should ask two separate questions to gauge respondents’ opinions.

Do you agree or disagree that the government should be responsible for providing high-speed internet to everyone?

  • Strongly agree
  • Agree
  • Undecided
  • Disagree
  • Strongly disagree


Question order

You can organize the questions logically, with a clear progression from simple to complex. Alternatively, you can randomize the question order between respondents.

Logical flow

Using a logical flow to your question order means starting with simple questions, such as behavioral or opinion questions, and ending with more complex, sensitive, or controversial questions.

The question order that you use can significantly affect responses by priming respondents in specific directions. Question order effects, or context effects, occur when earlier questions influence the responses to later questions, reducing the validity of your questionnaire.

While demographic questions are usually unaffected by order effects, questions about opinions and attitudes are more susceptible to them. For example, asking the following questions in this order may prime respondents’ answers to the later ones:

  • How knowledgeable are you about Joe Biden’s executive orders in his first 100 days?
  • Are you satisfied or dissatisfied with the way Joe Biden is managing the economy?
  • Do you approve or disapprove of the way Joe Biden is handling his job as president?

It’s important to minimize order effects because they can be a source of systematic error or bias in your study.

Randomization

Randomization involves presenting individual respondents with the same questionnaire but with different question orders.

When you use randomization, order effects will be minimized in your dataset. But a randomized order may also make it harder for respondents to process your questionnaire. Some questions may need more cognitive effort, while others are easier to answer, so a random order could require more time or mental capacity for respondents to switch between questions.
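If your survey platform does not randomize question order for you, the shuffling itself is straightforward to script. The following is a minimal sketch in R; the item labels and respondent IDs are invented for illustration and do not refer to any particular survey tool.

```r
# Minimal sketch: one randomized question order per respondent.
# Item labels and respondent IDs are invented for illustration.
set.seed(2024)

items <- paste0("Q", 1:10)
respondents <- paste0("R", 1:3)

# Every respondent sees all items, each in an independently shuffled order
orders <- lapply(respondents, function(id) sample(items))
names(orders) <- respondents

orders$R1  # the sequence shown to the first respondent
```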

Step 1: Define your goals and objectives

The first step of designing a questionnaire is determining your aims.

  • What topics or experiences are you studying?
  • What specifically do you want to find out?
  • Is a self-report questionnaire an appropriate tool for investigating this topic?

Once you’ve specified your research aims, you can operationalize your variables of interest into questionnaire items. Operationalizing concepts means turning them from abstract ideas into concrete measurements. Every question needs to address a defined need and have a clear purpose.

Step 2: Use questions that are suitable for your sample

Create appropriate questions by taking the perspective of your respondents. Consider their language proficiency and available time and energy when designing your questionnaire.

  • Are the respondents familiar with the language and terms used in your questions?
  • Would any of the questions insult, confuse, or embarrass them?
  • Do the response items for any closed-ended questions capture all possible answers?
  • Are the response items mutually exclusive?
  • Do the respondents have time to respond to open-ended questions?

Consider all possible options for responses to closed-ended questions. From a respondent’s perspective, a lack of response options reflecting their point of view or true answer may make them feel alienated or excluded. In turn, they may become disengaged or inattentive to the rest of the questionnaire.

Step 3: Decide on your questionnaire length and question order

Once you have your questions, make sure that the length and order of your questions are appropriate for your sample.

If respondents are not being incentivized or compensated, keep your questionnaire short and easy to answer. Otherwise, your sample may be biased with only highly motivated respondents completing the questionnaire.

Decide on your question order based on your aims and resources. Use a logical flow if your respondents have limited time or if you cannot randomize questions. Randomizing questions helps you avoid bias, but it can take more complex statistical analysis to interpret your data.

Step 4: Pretest your questionnaire

When you have a complete list of questions, you’ll need to pretest it to make sure what you’re asking is always clear and unambiguous. Pretesting helps you catch any errors or points of confusion before performing your study.

Ask friends, classmates, or members of your target audience to complete your questionnaire using the same method you’ll use for your research. Find out if any questions were particularly difficult to answer or if the directions were unclear or inconsistent, and make changes as necessary.

If you have the resources, running a pilot study will help you test the validity and reliability of your questionnaire. A pilot study is a practice run of the full study, and it includes sampling, data collection , and analysis. You can find out whether your procedures are unfeasible or susceptible to bias and make changes in time, but you can’t test a hypothesis with this type of study because it’s usually statistically underpowered .

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Student’s  t -distribution
  • Normal distribution
  • Null and Alternative Hypotheses
  • Chi square tests
  • Confidence interval
  • Quartiles & Quantiles
  • Cluster sampling
  • Stratified sampling
  • Data cleansing
  • Reproducibility vs Replicability
  • Peer review
  • Prospective cohort study

Research bias

  • Implicit bias
  • Cognitive bias
  • Placebo effect
  • Hawthorne effect
  • Hindsight bias
  • Affect heuristic
  • Social desirability bias

A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analyzing data from people using questionnaires.

Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.

Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.

A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviors. It is made up of 4 or more questions that measure a single attitude or trait when response scores are combined.

To use a Likert scale in a survey , you present participants with Likert-type questions or statements and a continuum of response options, usually with 5 or 7 points, to capture their degree of agreement.
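As a concrete illustration, the short R sketch below combines five hypothetical Likert-type items (scored 1–5) into a single scale score, reverse-coding one reverse-worded item first. The data and item names are invented for the example.

```r
# Minimal sketch: scoring a 5-item Likert scale (1 = strongly disagree,
# 5 = strongly agree). The responses are invented; item_3 is reverse-worded.
responses <- data.frame(
  item_1 = c(4, 2, 5),
  item_2 = c(5, 1, 4),
  item_3 = c(2, 5, 1),   # reverse-worded item
  item_4 = c(4, 2, 5),
  item_5 = c(5, 1, 4)
)

responses$item_3 <- 6 - responses$item_3              # reverse-code on a 5-point scale
responses$scale_score <- rowMeans(responses[, 1:5])   # combined score per respondent
responses$scale_score
```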

You can organize the questions logically, with a clear progression from simple to complex, or randomize the order between respondents. A logical flow helps respondents process the questionnaire more easily and quickly, but it may lead to bias. Randomization can minimize bias from order effects.

Questionnaires can be self-administered or researcher-administered.

Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.

Cite this Scribbr article


Bhandari, P. (2023, June 22). Questionnaire Design | Methods, Question Types & Examples. Scribbr. Retrieved April 10, 2024, from https://www.scribbr.com/methodology/questionnaire/


Research Methods Past Questions 2022

Below are over 100 probable research methods past questions (with answers), compiled from professors and PhD holders in research methods, for students preparing for research methods examinations. Take your time to work through them.

PART A: RESEARCH METHODS

1. Which of the following should not be a criterion for a good research project? a) Demonstrates the abilities of the researcher b) Is dependent on the completion of other projects c) Demonstrates the integration of different fields of knowledge d) Develops the skills of the researcher Answer: b. Is dependent on the completion of other projects

2. Which form of reasoning is the process of drawing a specific conclusion from a set of premises? a. Objective reasoning b. Positivistic reasoning c. Inductive reasoning d. Deductive reasoning Answer: d: Deductive reasoning

3. Research that seeks to examine the findings of a study by using the same design but a different sample is which of the following? a. An exploratory study b. A replication study c. An empirical study d. Hypothesis testing Answer: b: A replication study

4. A researcher designs an experiment to test how variables interact to influence job-seeking behaviours. The main purpose of the study was: a. Description b. Prediction c. Exploration d. Explanation Answer: d: Explanation

5. Cyber bullying at work is a growing threat to employee job satisfaction. Researchers want to find out why people do this and how they feel about it. The primary purpose of the study is: a. Description b. Prediction c. Exploration d. Explanation Answer: c: Exploration

6. A theory: a. Is an accumulated body of knowledge b. Includes inconsequential ideas c. Is independent of research methodology d. Should be viewed uncritically Answer: a: Is an accumulated body of knowledge

7. Which research method is a bottom-up approach to research? a. Deductive method b. Explanatory method c. Inductive method d. Exploratory method Answer: c: Inductive method

8. How much confidence should you place in a single research study? a. You should trust research findings after different researchers have replicated the findings b. You should completely trust a single research study c. Neither a nor b d. Both a and b Answer: a: You should trust research findings after different researchers have replicated the findings

9. A qualitative research problem statement: a. Specifies the research methods to be utilized b. Specifies a research hypothesis c. Expresses a relationship between variables d. Conveys a sense of emerging design Answer: d: Conveys a sense of emerging design

10. Which of the following is a good research question? a. To produce a report on student job searching behaviours b. To identify the relationship between self-efficacy and student job searching behaviours c. Students with higher levels of self-efficacy will demonstrate more active job searching behaviours d. Do students with high levels of self-efficacy demonstrate more active job searching behaviours? Answer: d: Do students with high levels of self-efficacy demonstrate more active job searching behaviours?

11. A review of the literature prior to formulating research questions allows the researcher to a. Provide an up-to-date understanding of the subject, its significance, and structure b. Guide the development of research questions c. Present the kinds of research methodologies used in previous studies d. All of the above Answer: d: All of the above

12. Sometimes a comprehensive review of the literature prior to data collection is not recommended by: a. Ethnomethodology b. Grounded theory c. Symbolic interactionism d. Feminist theory Answer: b: Grounded theory

13. The feasibility of a research study should be considered in light of: a. Cost and time required to conduct the study b. Access to gatekeepers and respondents c. Potential ethical concerns d. All of the above Answer: d: All of the above

14. Research that uses qualitative methods for one phase and quantitative methods for the next phase is known as: a. Action research b. Mixed-method research c. Quantitative research d. Pragmatic research Answer: b: Mixed-method research

15. Research hypotheses are: a. Formulated prior to a review of the literature b. Statements of predicted relationships between variables c. B but not A d. Both A and B Answer: c: B but not A

16. Which research approach is based on the epistemological viewpoint of pragmatism? a. Quantitative research b. Qualitative research c. Mixed-methods research d. All of the above Answer: c: Mixed-methods research

17. Adopting ethical principles in research means: a. Avoiding harm to participants b. The researcher is anonymous c. Deception is only used when necessary d. Selected informants give their consent Answer: a: Avoiding harm to participants

18. A radical perspective on ethics suggests that: a. Researchers can do anything they want b. The use of checklists of ethical actions is essential c. The powers of Institutional Review Boards should be strengthened d. Ethics should be based on self-reflexivity Answer: d: Ethics should be based on self-reflexivity

19. Ethical problems can arise when researching the Internet because: a. Everyone has access to digital media b. Respondents may fake their identities c. Researchers may fake their identities d. Internet research has to be covert Answer: b: Respondents may fake their identities

20. The Kappa statistic: a. Is a measure of inter-judge validity b. Compares the level of agreement between two judges against what might have been predicted by chance c. Ranges from 0 to +1 d. Is acceptable above a score of 0.5 Answer: b: Compares the level of agreement between two judges against what might have been predicted by chance

PART B: RESEARCH METHODOLOGY

1. Which research paradigm is most concerned about generalizing its findings? a. Quantitative research b. Qualitative research c. Mixed-methods research d. All of the above Answer: a: Quantitative research

2. A variable that is presumed to cause a change in another variable is called: a. An intervening variable b. A dependent variable c. An independent variable d. A numerical variable Answer: c: An independent variable

3. A study of teaching professionals posits that their performance-related pay increases their motivation which in turn leads to an increase in their job satisfaction. What kind of variable is ‘motivation’ in this study? a. Extraneous b. Confounding c. Intervening d. Manipulated Answer: c: Intervening

4. Which correlation is the strongest? a. –1.00 b. +.80 c. –.60 d. +.05 Answer: a: –1.00

5. When interpreting a correlation coefficient expressing the relationship between two variables, it is important not to: a. Assume causality b. Measure the values for X and Y independently c. Choose X and Y values that are normally distributed d. Check the direction of the relationship Answer: a: Assume causality

6. Which of the following can be described as a nominal variable? a. Annual income b. Age c. Annual sales d. Geographical location of a firm Answer: d: Geographical location of a firm

7. A positive correlation occurs when: a. Two variables remain constant b. Two variables move in the same direction c. One variable goes up and the other goes down d. Two variables move in opposite directions Answer: b: Two variables move in the same direction

8. The key defining characteristic of experimental research is that: a. The independent variable is manipulated b. Hypotheses are proved c. A positive correlation exists d. Samples are large Answer: a: The independent variable is manipulated

9. Qualitative research is used in all the following circumstances, EXCEPT: a. It is based on a collection of non-numerical data such as words and pictures b. It often uses small samples c. It uses the inductive method d. It is typically used when a great deal is already known about the topic of interest Answer: d: It is typically used when a great deal is already known about the topic of interest

10. In an experiment, the group that does not receive the intervention is called: a. The experimental group b. The participant group c. The control group d. The treatment group Answer: c: The control group

11. Which generally cannot be guaranteed in conducting qualitative studies in the field? a. Keeping participants from physical and emotional harm b. Gaining informed consent c. Assuring anonymity rather than just confidentiality d. Maintaining consent forms Answer: c: Assuring anonymity rather than just confidentiality

12. Which of the following is not ethical practice in research with humans? a. Maintaining participants’ anonymity b. Gaining informed consent c. Informing participants that they are free to withdraw at any time d. Requiring participants to continue until the study has been completed Answer: d: Requiring participants to continue until the study has been completed

13. What do we call data that are used for a new study but which were collected by an earlier researcher for a different set of research questions? a. Secondary data b. Field notes c. Qualitative data d. Primary data Answer: a: Secondary data

14. When each member of a population has an equal chance of being selected, this is called: a. A snowball sample b. A stratified sample c. A random probability sample d. A non-random sample Answer: c: A random probability sample

15. Which of the following techniques yields a simple random sample of hospitals? a. Randomly selecting a district and then sampling all hospitals within the district b. Numbering all the elements of a hospital sampling frame and then using a random number generator to pick hospitals from the table c. Listing hospitals by sector and choosing a proportion from within each sector at random d. Choosing volunteer hospitals to participate Answer: b: Numbering all the elements of a hospital sampling frame and then using a random number generator to pick hospitals from the table

16. Which of the following statements are true? a. The larger the sample size, the larger the confidence interval b. The smaller the sample size, the greater the sampling error c. The more categories being measured, the smaller the sample size d. A confidence level of 95 percent is always sufficient Answer: b: The smaller the sample size, the greater the sampling error

17. Which of the following will produce the least sampling error? a. A large sample based on convenience sampling b. A small sample based on random sampling c. A large snowball samples d. A large sample based on random sampling Answer: d: A large sample based on random sampling

18. When people are readily available, volunteer, or are easily recruited to the sample, this is called: a. Snowball sampling b. Convenience sampling c. Stratified sampling d. Random sampling Answer: b: Convenience sampling

19. In qualitative research, sampling that involves selecting diverse cases is referred to as: a. Typical-case sampling b. Critical-case sampling c. Intensity sampling d. Maximum variation sampling Answer: d: Maximum variation sampling

20. A test accurately indicates an employee’s scores on a future criterion (e.g., conscientiousness). What kind of validity is this? a. Predictive b. Face c. Content d. Concurrent Answer: a: Predictive

PART C: DATA COLLECTION METHODS

1. When designing a questionnaire it is important to do each of the following EXCEPT a. Pilot the questionnaire b. Avoid jargon c. Avoid double questions d. Use leading questions Answer: d: Use leading questions

2. One advantage of using a questionnaire is that: a. Probe questions can be asked b. Respondents can be put at ease c. Interview bias can be avoided d. Response rates are always high Answer: c: Interview bias can be avoided

3. Which of the following is true of observations? a. It takes less time than interviews b. It is often not possible to determine exactly why people behave as they do c. Covert observation raises fewer ethical concerns than overt d. All of the above Answer: b: It is often not possible to determine exactly why people behave as they do

4. A researcher secretly becomes an active member of a group in order to observe their behaviour. This researcher is acting as: a. An overt participant observer b. A covert non-participant observer c. A covert participant observer d. None of the above Answer: c: A covert participant observer

5. All of the following are advantages of structured observation, EXCEPT: a. Results can be replicated at a different time b. The coding schedule might impose a framework on what is being observed c. Data can be collected that participants may not realize is important d. Data do not have to rely on the recall of participants Answer: b: The coding schedule might impose a framework on what is being observed

6. When conducting an interview, asking questions such as: “What else? or ‘Could you expand on that?’ are all forms of: a. Structured responses b. Category questions c. Protocols d. Probes Answer: d: Probes

7. Secondary data can include which of the following? a. Government statistics b. Personal diaries c. Organizational records d. All of the above Answer: d: All of the above

8. An ordinal scale is: a. The simplest form of measurement b. A scale with an absolute zero point c. A rank-order scale of measurement d. A scale with equal intervals between ranks Answer: c: A rank-order scale of measurement

9. Which term measures the extent to which scores from a test can be used to infer or predict performance in some activity? a. Face validity b. Content reliability c. Criterion-related validity d. Construct validity Answer: c: Criterion-related validity

10. The ‘reliability’ of a measure refers to the researcher asking: a. Does it give consistent results? b. Does it measure what it is supposed to measure? c. Can the results be generalized? d. Does it have face reliability? Answer: a: Does it give consistent results?

11. Interviewing is the favored approach EXCEPT when: a. There is a need for highly personalized data b. It is important to ask supplementary questions c. High numbers of respondents are needed d. Respondents have difficulty with written language Answer: c: High numbers of respondents are needed

12. Validity in interviews is strengthened by the following EXCEPT: a. Building rapport with interviewees b. Multiple questions cover the same theme c. Constructing interview schedules that contain themes drawn from the literature d. Prompting respondents to expand on initial responses Answer: b: Multiple questions cover the same theme

13. Interview questions should: a. Lead the respondent b. Probe sensitive issues c. Be delivered in a neutral tone d. Test the respondents’ powers of memory Answer: c: Be delivered in a neutral tone

14. Active listening skills means: a. Asking as many questions as possible b. Avoiding silences c. Keeping to time d. Attentive listening Answer: d: Attentive listening

15. All the following are strengths of focus groups EXCEPT: a. They allow access to a wide range of participants b. Discussion allows for the validation of ideas and views c. They can generate a collective perspective d. They help maintain confidentiality Answer: d: They help maintain confidentiality

16. Which of the following is not always true about focus groups? a. The ideal size is normally between 6 and 12 participants b. Moderators should introduce themselves to the group c. Participants should come from diverse backgrounds d. The moderator poses preplanned questions Answer: c: Participants should come from diverse backgrounds

17. A disadvantage of using secondary data is that: a. The data may have been collected with reference to research questions that are not those of the researcher b. The researcher may bring more detachment in viewing the data than original researchers could muster c. Data have often been collected by teams of experienced researchers d. Secondary data sets are often available and accessible Answer: a: The data may have been collected with reference to research questions that are not those of the researcher

18. All of the following are sources of secondary data EXCEPT: a. Official statistics b. A television documentary c. The researcher’s research diary d. A company’s annual report Answer: c: The researcher’s research diary

19. Which of the following is not true about visual methods? a. They are not reliant on respondent recall b. They have low resource requirements c. They do not rely on words to capture what is happening d. They can capture what is happening in real time Answer: b: They have low resource requirements

20. Avoiding naïve empiricism in the interpretation of visual data means: a. Understanding the context in which they were produced b. Ensuring that visual images such as photographs are accurately taken c. Only using visual images with other data gathering sources d. Planning the capture of visual data carefully Answer: a: Understanding the context in which they were produced

PART D: ANALYSIS AND REPORT WRITING

1. Which of the following is incorrect when naming a variable in SPSS? a. Must begin with a letter and not a number b. Must end in a full stop c. Cannot exceed 64 characters d. Cannot include symbols such as ?, & and % Answer: b: Must end in a full stop

2. Which of the following is not an SPSS Type variable? a. Word b. Numeric c. String d. Date Answer: a: Word

3. A graph that uses vertical bars to represent data is called: a. A bar chart b. A pie chart c. A line graph d. A vertical graph Answer: a: A bar chart

4. The purpose of descriptive statistics is to: a. Summarize the characteristics of a data set b. Draw conclusions from the data c. None of the above d. All of the above Answer: a: Summarize the characteristics of a data set

5. The measure of the extent to which responses vary from the mean is called: a. The mode b. The normal distribution c. The standard deviation d. The variance Answer: c: The standard deviation

6. To compare the performance of a group at time T1 and then at T2, we would use: a. A chi-squared test b. One-way analysis of variance c. Analysis of variance d. A paired t-test Answer: d: A paired t-test

7. A Type 1 error occurs in a situation where: a. The null hypothesis is accepted when it is in fact true b. The null hypothesis is rejected when it is in fact false c. The null hypothesis is rejected when it is in fact true d. The null hypothesis is accepted when it is in fact false Answer: c: The null hypothesis is rejected when it is in fact true

8. The significance level a. Is set after a statistical test is conducted b. Is always set at 0.05 c. Results in a p-value d. Measures the probability of rejecting a true null hypothesis Answer: d: Measures the probability of rejecting a true null hypothesis

9. To predict the value of the dependent variable for a new case based on the knowledge of one or more independent variables, we would use a. Regression analysis b. Correlation analysis c. Kolmogorov-Smirnov test d. One-way analysis of variance Answer: a: Regression analysis

10. In conducting secondary data analysis, researchers should ask themselves all of the following EXCEPT: a. Who produced the document? b. Is the material genuine? c. How can respondents be re-interviewed? d. Why was the document produced? Answer: c: How can respondents be re-interviewed?

11. Which of the following is not true of reflexivity? a. It recognizes that the researcher is not a neutral observer b. It has mainly been applied to the analysis of qualitative data c. It is part of a post-positivist tradition d. A danger of adopting a reflexive stance is the researcher can become the focus of the study Answer: c: It is part of a post-positivist tradition

12. Validity in qualitative research can be strengthened by all of the following EXCEPT: a. Member checking for accuracy and interpretation b. Transcribing interviews to improve accuracy of data c. Exploring rival explanations d. Analysing negative cases Answer: b: Transcribing interviews to improve accuracy of data

13. Qualitative data analysis programs are useful for each of the following EXCEPT: a. Manipulation of large amounts of data b. Exploring of the data against new dimensions c. Querying of data d. Generating codes Answer: d: Generating codes

14. Which part of a research report contains details of how the research was planned and conducted? a. Results b. Design c. Introduction d. Background Answer: b: Design

15. Which of the following is a form of research typically conducted by managers and other professionals to address issues in their organizations and/or professional practice? a. Action research b. Basic research c. Professional research d. Predictive research Answer: a: Action research

16. Plagiarism can be avoided by: a. Copying the work of others accurately b. Paraphrasing the author’s text in your own words c. Cut and pasting from the Internet d. Quoting directly without revealing the source Answer: b: Paraphrasing the author’s text in your own words

17. In preparing for a presentation, you should do all of the following EXCEPT: a. Practice the presentation b. Ignore your nerves c. Get to know more about your audience d. Take an advanced look, if possible, at the facilities Answer: b: Ignore your nerves

18. You can create interest in your presentation by: a. Using bullet points b. Reading from notes c. Maximizing the use of animation effects d. Using metaphors Answer: d: Using metaphors

19. In preparing for a viva or similar oral examination, it is best if you have: a. Avoided citing the examiner in your thesis b. Made exaggerated claims on the basis of your data c. Published and referenced your own article(s) d. Tried to memorize your work Answer: c: Published and referenced your own article(s)

20. Grounded theory coding: a. Makes use of a priori concepts from the literature b. Uses open coding, selective coding, then axial coding c. Adopts a deductive stance d. Stops when theoretical saturation has been reached Answer: d: Stops when theoretical saturation has been reached


Detecting inattentive respondents by machine learning: A generic technique that substitutes for the directed questions scale and compensates for its shortcomings

  • Original Manuscript
  • Open access
  • Published: 08 April 2024

Koken Ozaki (ORCID: orcid.org/0009-0001-0382-1512)

Web surveys are often used to collect data for psychological research. However, the inclusion of many inattentive respondents can be a problem. Various methods for detecting inattentive respondents have been proposed, most of which require the inclusion of additional items in the survey for detection or the calculation of variables for detection after data collection. This study proposes a method for detecting inattentive respondents in web surveys using machine learning. The method requires only the collection of response time and the inclusion of a Likert scale, eliminating the need to include special detection items in the survey. Based on data from 16 web surveys, a method was developed using predictor variables not included in existing methods. While previous machine learning methods for detecting inattentive respondents can only be applied to the same surveys as the data on which the models were developed, the proposed model is generic and can be applied to any questionnaire as long as response time is available, and a Likert scale is included. In addition, the proposed method showed partially higher accuracy than existing methods.


Introduction

Web surveys have become a popular data collection tool not only in psychological research but in the social sciences in general. They can be used to produce large amounts of data from respondents of various demographics at a modest cost. Recently, however, the inclusion of inattentive respondents in web surveys has been a serious problem, and more attention is being paid to their detection.

According to Bowling et al. ( 2016 ), IER (Insufficient Effort Responding) “occurs when research participants provide inaccurate data because they have failed to carefully read or comply with questionnaire instructions and item content.” IER is sometimes referred to as careless responding (CR) (Meade & Craig, 2012 ). Curran ( 2016 ) refers to these types of responses as “C/IE” responses. Following this convention, in this study, inattentive respondents will be referred to as C/IERs.

Influence of inattentive responses

When conducting web surveys, it is recommended that C/IERs be detected and excluded from the data before analysis (Maniaci & Rogge, 2014 ; Meade & Craig, 2012 ), as their inclusion can negatively affect the results. A number of studies on this topic have appeared in the literature. For example, Credé ( 2010 ) showed in a simulation study that even 5% random responses negatively affect correlations. Hamby and Taylor ( 2016 ) showed that the presence of C/IERs negatively affects the reliability and validity of a scale. Maniaci and Rogge ( 2014 ) showed that the presence of C/IERs decreases statistical power. Woods ( 2006 ) showed in simulations that 10% inattentive responses to reverse-worded items, even when the scale is unidimensional in nature, will reject the one-factor model and increase the likelihood of the two-factor model, where each factor corresponds to straight word items and reverse-worded items, respectively. Huang et al. ( 2015 ) showed that when the mean of a scale deviates from its midpoint, the presence of C/IERs can increase the correlation between observed variables and increase type 1 error. DeSimone et al. ( 2018 ) showed through simulations that random responses decreased inter-item correlations, internal consistency, and the first eigenvalue of the scale. Thus, the presence of inattentive respondents is a major problem affecting the quality of data in web surveys (Bowling & Huang, 2018 ; Weiner & Dalessio, 2006 ).

Broadly speaking, there are two ways to deal with C/IERs: prevention and detection. Prevention refers to measures to reduce the number of such respondents. Ward and Meade ( 2018 ) showed that surveys devised to increase cognitive dissonance and hypocrisy had fewer C/IERs than controls. Ward and Pond ( 2015 ) successfully reduced C/IER by displaying a virtual human to the respondents. However, despite significant efforts at prevention, C/IERs will likely be present in any survey data, making methods of detection highly important.

Two methods for detecting C/IERs

Two main methods for detecting C/IERs have thus far been considered in the literature: a priori methods, in which items for detection are incorporated into the survey, and post hoc methods, in which indicators for detection are calculated from the collected data. Typical indicators for a priori and post hoc methods are summarized in Table  1 .

Proportion of C/IERs

A priori and post hoc methods provide an indication of the proportion of survey respondents who are C/IERs. Johnson ( 2005 ) found that 3.5% of respondents repeatedly selected the same response options without reading the item content. Meade and Craig ( 2012 ) found that the proportion of C/IERs among all respondents was 11% by a latent profile analysis using response time and the Mahalanobis distance, while Maniaci and Rogge ( 2014 ) found that the proportion of C/IERs was 2.5%. Arias et al. ( 2020 ) applied a factor mixture model and found that between 4.4% and 10% were C/IERs. Bruhlmann et al. ( 2020 ) applied a latent profile analysis and found that 45.9% of the crowdsourced sample were C/IERs. Jones et al. ( 2022 ) estimated an average C/IER rate of 11.7% through a meta-analysis of 48 alcohol-related studies using crowdsourcing. Thus, the proportion of C/IERs varies considerably from study to study.

How to detect C/IERs using various indicators

In order to detect C/IERs using the indicators listed above, appropriate evaluative criteria need to be established. For example, in the case of the DQS (directed questions scale), the researcher needs to decide how many DQS items to use and how many failed items should be considered indicative of a C/IER. For response times, Bowling et al. ( 2016 ) suggested a cutoff of 2 s per item for identifying C/IERs. If the survey system is not equipped with a mechanism for directly measuring response time per item, per-item time can be approximated by dividing the response time per survey form or page by the number of items. However, in such cases, the value of 2 s per item may not be appropriate if the items involve a variety of response formats rather than only a Likert scale. For LS, Huang et al. ( 2012 ) recommended 6–14 items, but in the absence of reverse-worded items, attentive respondents may legitimately answer several consecutive questions in the same category.
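As an illustration of the response-time calculation just described, the R sketch below approximates per-item response time from page-level timings and applies the 2-seconds-per-item cutoff. The column names and values are hypothetical and are not taken from any of the surveys discussed here.

```r
# Minimal sketch: approximate per-item response time from page-level timing
# and flag respondents under the 2-seconds-per-item cutoff suggested by
# Bowling et al. (2016). Column names and values are hypothetical.
timings <- data.frame(
  respondent    = c("A", "B", "C"),
  page_seconds  = c(95, 21, 160),   # time spent on one page of Likert items
  items_on_page = c(20, 20, 20)
)

timings$seconds_per_item <- timings$page_seconds / timings$items_on_page
timings$flag_fast <- timings$seconds_per_item < 2     # TRUE = possible C/IER
timings
```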

Thus, although cutoff criteria have been proposed for each indicator, a consensus has not yet been reached on the “best” cutoff values. In addition, individuals who are identified as a C/IER by one method will not necessarily be judged a C/IER by another method (Curran, 2016 ). Recommendations that the user understand the characteristics of each method and apply them in some combination seem sound, but there is still no consensus on how to integrate each indicator to determine whether a respondent is inattentive. Ward and Meade ( 2023 ) state clearly that there are no clear guidelines on how to detect inattentive respondents using the various indicators.

While there are no generally accepted guidelines, Maniaci and Rogge ( 2014 ) argue for the effectiveness of DQS and ARS, noting that DQS showed effectiveness in terms of statistical power when either one or three DQS items were used. Ward and Meade ( 2023 ) set three screening levels – minimal, moderate, and extensive – and suggested indicators to be used for each stage. They recommend the use of DQS, response time, LS, and Mahalanobis distance. Notably, while researchers have proposed various methods for detecting C/IERs, most cite the usefulness of DQS.

Negative effects of using DQS, IMC, and bogus items

The impact on survey respondent perceptions resulting from the use of various detection measures and their impact on response results have also been studied. Kung et al. ( 2018 ) found that using DQS or IMC did not affect the measurement properties of the scales. On the other hand, Breitsohl and Steidelmüller ( 2018 ), using bogus items, DQS, and IMC, found that such attempts to detect C/IERs were perceived by some as insulting and undermined the respondents’ trust in the researcher, and, in some cases, led to more attentive responses. They further showed that the presence of DQS or IMC items negatively affects the goodness of fit for the factor analysis model. Similar to the findings of Breitsohl and Steidelmüller ( 2018 ), Oppenheimer et al. ( 2009 ) also suggested that the use of IMC may be taken by respondents as an insult and render the researcher untrustworthy in their eyes. In addition, Curran and Hauser ( 2019 ) found that even very diligent respondents agreed with some of the impossibilities included in the bogus items. Thus, such bogus items may lead to misclassifying careful respondents as C/IERs. In summary, although the effectiveness of DQS and other inattentiveness detection methods has been recognized by many, some studies have raised the issue of the negative impact of these methods on respondent perceptions and response results. To address this problem, we sought to develop a machine learning method to detect C/IERs without the use of DQS, IMC, or bogus items.

Previous studies on machine learning methods

Supervised machine learning methods for C/IER detection have been proposed in several recent studies (Ozaki & Suzuki, 2019 ; Gogami et al., 2021 ; Schroeders et al., 2022 ). Supervised machine learning is generally used when there is an outcome to predict or detect. In the case of C/IER detection, the C/IERs in the data are identified by a measure such as DQS, and a machine learning model is developed to detect such respondents using a set of predictors typically collected in surveys or calculated from these variables. Once the machine learning model is developed, detection with a certain degree of accuracy is possible so long as there are suitable predictors, making it no longer necessary to use DQS or other a priori methods to identify C/IERs. Figure  1 illustrates this idea.

Figure 1. Framework and advantages of detecting inattentive respondents with machine learning models. Note: The survey data, including DQS, are divided into training and test data. Using the training data, a machine learning model is developed to predict the DQS outcome from the predictors, and the model is then fitted to the test data. If the fit is good, it is possible to detect C/IERs without using DQS for that survey.

The contributions of Ozaki and Suzuki ( 2019 ), Gogami et al. ( 2021 ), and Schroeders et al. ( 2022 ) to the use of supervised learning to detect inattentive respondents are described below. The significance of their studies is threefold: (1) As noted, there are possible problems with the use of indicators such as DQS for C/IER detection due to their negative impact on respondent perceptions and the resulting estimates (Breitsohl & Steidelmüller, 2018 ). This problem can be eliminated if an effective machine learning alternative can be developed; (2) Detection items like DQS are essentially add-on survey items, and thus eliminating them is desirable. A proper machine learning model has the potential to achieve this; (3) Various indicators have been proposed for detecting C/IERs, but it is not yet clear how to integrate and use them. Machine learning is capable of producing a single indicator – the inattentive response probability – using each of the indices. Calculating the inattentive response probability by machine learning can be thought of as a way of integrating the indices. Importantly, it is an easy indicator to use, since respondents can be excluded according to their value.

Ozaki and Suzuki ( 2019 ) and Gogami et al. ( 2021 )

In devising their supervised machine learning approach, Ozaki and Suzuki ( 2019 ) used a Japanese web research company to conduct a survey to examine the impact of three generations living together on the number of children in a family. They included two DQS items and three item pairs to check for inconsistent responses as outcomes in the survey. The respondents were considered C/IERs if they responded improperly to any one of the five items. Gogami et al. ( 2021 ) conducted a crowdsourced survey that included Likert scale items. They also included a three-item DQS and ARS. Respondents were considered a C/IER if they violated any one of the three DQS items or if they had values above the cutoff point on the ARS. Both studies attempted to detect C/IERs identified by DQS and other items using response time, LS, Mahalanobis distance, etc.

The sample used by Ozaki and Suzuki ( 2019 ) consisted of 2000 PC respondents (610 C/IERs and 1390 attentive respondents). The data for half the respondents were used as training data (500 C/IERs and 500 attentive respondents). Accuracy was measured by fitting the model developed with the training data to the test data for the remaining 1000 respondents. Gogami et al. ( 2021 ) used a sample size of 4940 smartphone respondents (247 C/IERs and 4693 attentive respondents). They randomly selected 247 respondents from the 4693 attentive respondents five times. Model accuracy was evaluated by applying leave-one-out cross-validation to each of the five balanced data sets (C/IER:attentive = 247:247) and averaging the five accuracy results.

Ozaki and Suzuki ( 2019 ) applied various methods, including random forests (Breiman, 2001 ) and gradient boosting (Chen & Guestrin, 2016 ; Friedman, 2001 , 2002 ), and used LS for Likert scale items, response time for the entire questionnaire, Mahalanobis distance calculated from the Likert scale items, the p  value of the Mahalanobis distance, and the gender and age of respondents as predictors. The results showed an accuracy of 81%, a precision of 32%, a recall of 66%, a balanced accuracy of 74%, and a specificity of 83% when the inattentive respondent probability (IRP) obtained by gradient boosting was .5 or higher. Although the precision was low, the proportion of C/IERs present in the test data was reduced by 56% when the respondents who were detected as C/IERs by this method were removed. Gogami et al. ( 2021 ) measured and used as predictors the number of times text was deleted and the respondent’s scrolling speed, etc. from smartphone screen operation data. In addition, response time was measured separately for Likert scales and free descriptions and used as a predictor. The number of letters in the open-ended responses, the number of intermediate responses on the scale, and LS also served as predictors. The detection results using gradient boosting showed an accuracy, precision, and recall of approximately 86% (it was not possible to calculate the balanced accuracy and the specificity from the information in their paper), suggesting the effectiveness of using smartphone screen operation data for response data.
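To make the post hoc predictors concrete, the R sketch below computes a longstring index and the Mahalanobis distance (with its chi-squared p value) for a simulated block of 12 Likert items. It illustrates the kind of predictors described above rather than the authors' actual code, and it assumes LS refers to the usual longstring index (the longest run of identical consecutive responses).

```r
# Minimal sketch of post hoc predictors: longstring (assumed here to be the
# longest run of identical consecutive responses), Mahalanobis distance over
# a block of Likert items, and its chi-squared p value. Data are simulated.
set.seed(1)
likert <- matrix(sample(1:5, 200 * 12, replace = TRUE), ncol = 12)

longstring <- apply(likert, 1, function(x) max(rle(x)$lengths))

md   <- mahalanobis(likert, center = colMeans(likert), cov = cov(likert))
md_p <- pchisq(md, df = ncol(likert), lower.tail = FALSE)

predictors <- data.frame(longstring, md, md_p)
head(predictors)
```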

It should be noted that, since the training and test data in both studies were from the same survey, it remains unclear as to whether the models that were developed could be used for other surveys. Unless the method is general enough to be applied to other surveys, it cannot be considered practical.

Schroeders et al. ( 2022 )

Schroeders et al. ( 2022 ) used gradient boosting as a machine learning method and conducted a simulation study and a study using real data. In the real data study, the sample was divided into an attentive respondent group and a C/IER group. The attentive group was given the usual instructions, such as taking time to carefully consider the contents of the items, while the C/IER group was asked to respond quickly without carefully reading the contents of the items. The sample size was 605 (244 C/IERs and 361 attentive respondents). From this sample, 226 C/IERs and 199 attentive respondents were randomly selected as training data. The remaining 180 were used as test data. The training and test data were randomly selected 1000 times, and the average accuracy was reported.

In the real data study, detection accuracy was compared between the machine learning (gradient boosting) method and traditional methods such as the Mahalanobis distance and LS. The predictors for machine learning were the same traditional measures. In addition, the response time was used in the real data study.

While the results of the simulation study involving the machine learning model were generally good, the real data study did not achieve the same degree of accuracy. In the case of gradient boosting, recall was 60%, meaning that the percentage of correctly detected C/IERs was only 60%. Precision was low, at 19%, which means that only 19% of the respondents who were judged C/IERs were actually C/IERs. Recall values for the conventional methods were lower than for machine learning, and precision was less than or equal to that for machine learning. They also reported an accuracy of 70%, a balanced accuracy of 66%, and a specificity of 71%.
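For reference, these evaluation metrics follow directly from a confusion matrix. The R sketch below uses invented counts, chosen only to roughly reproduce the percentages reported above, with C/IER treated as the positive class.

```r
# Evaluation metrics from a confusion matrix. Counts are invented and chosen
# only to roughly match the reported percentages; C/IER is the positive class.
tp <- 60;  fn <- 40     # C/IERs correctly detected / missed
fp <- 260; tn <- 640    # attentive respondents flagged / correctly passed

accuracy          <- (tp + tn) / (tp + tn + fp + fn)
precision         <- tp / (tp + fp)
recall            <- tp / (tp + fn)            # also called sensitivity
specificity       <- tn / (tn + fp)
balanced_accuracy <- (recall + specificity) / 2

round(c(accuracy = accuracy, precision = precision, recall = recall,
        specificity = specificity, balanced_accuracy = balanced_accuracy), 2)
```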

Schroeders et al. ( 2022 ), while pointing to the possibility that some respondents in the attentive group responded inattentively as a reason why the results of the real data analysis were not favorable, stated that the inattentive response process in the real world is much more heterogeneous than in the simulation and that larger training data sets are needed. They also argue that “generalizations to other data sets, samples, and situations are not possible, because every examination is highly specific in terms of items and persons” (Schroeders et al., 2022 , p. 49). Our study aimed to challenge this argument and develop a generic method that can be applied to any survey so long as it includes Likert scale items, and response times are available.

Purpose of this study

A common issue among the three machine learning studies described above is that the methods developed are not generic, meaning that they can only be applied to the questionnaires used to develop them.

The purpose of this study is to offer a method for detecting C/IERs using machine learning that can be applied to any questionnaire. Although the proposed method is not technically generic since it is premised on the condition that Likert scales be included in any survey to which it is applied, this condition is quite modest, as most psychological research includes Likert scales. Another condition is that response times are available, which is the case with most web-based survey systems. Thus, the method can be said to be applicable to most psychological research.

Advantages, characteristics, and novelty of method developed in this study

Using data from 16 web surveys, a method to detect C/IERs by machine learning for PC and smartphone responses, respectively, was developed. The proposed method has the following six advantages, features, and novelties over existing a priori, post hoc, and machine learning methods:

1. The methods developed by Ozaki and Suzuki ( 2019 ), Gogami et al. ( 2021 ), and Schroeders et al. ( 2022 ) each use only one web survey data set; thus, they can only be applied to surveys with the same content as the survey used to develop the model. The method developed in this study is a general-purpose method that can be applied to any survey that includes a Likert scale.

2. Since the deletion of respondents can be done based on the probability of inattentive responses obtained by machine learning, it is unnecessary to comprehensively consider multiple indicators. It is also unnecessary to set a criterion that matches the content of each new survey (although it is necessary to decide what value should be used as the cutoff point for the IRP).

3. Since there is no need to include a DQS item or any other detection mechanism in the questionnaire in advance, there is no need to be concerned about offending respondents. In addition, eliminating the need to incorporate a DQS item or other detection indicator reduces the number of items, thereby reducing both the burden on respondents and survey costs.

4. The proposed method improves detection accuracy by using new predictors not used in previous studies.

5. By using much more training data than in the three previous studies, a generic model is developed.

6. The layout of the response screen for PCs differs from that of smartphones, as does the way respondents answer the questions. However, previous studies comparing PC and smartphone responses indicate that, in general, there is no significant difference in the results for the two types of responses (Tourangeau et al., 2017 ; Andreadis, 2015 ). On the other hand, some studies have shown that response time is longer for smartphone responses (Andreadis, 2015 ; de Bruijne & Wijnant, 2013 ; Keusch & Yan, 2017 ). Since the sample size used in this study is very large, separate models are developed for PC and smartphone responses to achieve more accurate results.

Sixteen web surveys were used in this study to develop the machine learning models for C/IER detection. A separate model was developed for PC responses and smartphone responses. Since three of the 16 surveys did not produce sufficient PC response data, only 13 surveys were used to develop and test the PC model. This study confirmed advantages 1 through 6 listed above. This study was not preregistered.

Although deep learning (Urban & Gates, 2021 ) has attracted considerable attention in the field of machine learning, Grinsztajn et al. ( 2022 ) showed that tree-based methods are more effective than deep learning when the sample size is less than 10,000. Since the sample sizes in this study are 5610 for smartphone responses and 4704 for PC responses, the detection model was developed using random forests and gradient boosting.
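The sketch below illustrates fitting the two tree-based learners with the randomForest and xgboost packages named in the analysis section. The training data are simulated and the predictor names are hypothetical, so this is an outline of the general approach rather than the authors' implementation.

```r
# Illustrative sketch: fitting random forests and gradient boosting to a
# balanced, simulated training set. Predictor names are hypothetical; this
# outlines the approach, not the authors' implementation.
library(randomForest)
library(xgboost)

set.seed(123)
n <- 1000
train <- data.frame(
  response_time = rexp(n, rate = 1 / 300),   # seconds per questionnaire
  longstring    = sample(1:12, n, replace = TRUE),
  md            = rchisq(n, df = 12),        # Mahalanobis distance
  cier          = rbinom(n, 1, 0.5)          # 1 = failed a DQS item
)

rf <- randomForest(factor(cier) ~ ., data = train, ntree = 500)

xgb <- xgboost(data = as.matrix(train[, 1:3]), label = train$cier,
               nrounds = 100, objective = "binary:logistic", verbose = 0)

# Inattentive response probabilities (IRP) for each respondent
irp_rf  <- predict(rf, type = "prob")[, "1"]
irp_xgb <- predict(xgb, as.matrix(train[, 1:3]))
```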

Summary of 16 web surveys

A summary of the 16 web surveys used in the study is presented in Table  2 , which provides information on the survey content, number of items, location of the two DQS items, and the starting position of the Likert scale used to apply the machine learning model. All surveys were conducted between 2020 and 2021 and were conducted primarily by researchers other than the first author of this paper for academic research in psychology or other fields. Therefore, the data were collected in practical situations where the machine learning models to be developed were applied. Of the 16 surveys, nine were for psychological research. Of the remaining seven, three were for business management, three were for marketing, and one was a behavioral study (i.e., behavior after returning home as it relates to COVID-19). Some of the surveys were pre-screened and administered to specific target groups, while others were not targeted. Therefore, diverse data were collected in terms of survey content and survey subjects. Furthermore, as described below, the C/IER rate also varied.

Each of the surveys contained Likert scale items, including the two DQS items. The content of the DQS items was similar to “Please select category 2 for this question.” All surveys were approved by the research ethics review committee of the first author’s institution and were conducted for psychology and other research as well as for this study. The first author’s only involvement in the questionnaire design was placement of the DQS items. In order to avoid the noticeable presence of DQS items, placement at the beginning or end of a block of Likert scale items or at the beginning or end of a page was avoided. In the DQS items, respondents were directed to respond in a category other than the middle category because it has been found that Asians, including Japanese respondents, tend to respond in the middle category (Harzing, 2006 ; Masuda et al., 2017 ). The positions of the two DQS items are shown in Table  2 . The position refers to the column number for the data. The number of items is the total number of columns. All surveys were conducted by the same web research company in Japan. The survey targets were monitors who were registered with the survey company. Although nationalities were not tabulated, it is assumed that most of the registered monitors are Japanese since the language of the questionnaire was Japanese. In addition, as described below, the machine learning model developed in this study utilizes data from 12 consecutive items with a Likert scale. Table  2 shows the position of the first Likert scale item.

Note that survey 10 is nearly identical to survey 12, with the difference being that survey 12 had the respondents promise to respond seriously at the beginning of the questionnaire. Therefore, it is conceivable that survey 10 and survey 12 could be analyzed without treating them as separate surveys. However, since the two surveys contained multiple Likert scales, different Likert scales could be used to create the predictors for machine learning. Since the inclusion of the two surveys allowed examining the effect of the position of the Likert scale on detection accuracy, it was decided to include both surveys for analysis. Table  3  provides information on the number of respondents, the number of C/IERs, and the rate of C/IERs for PC and smartphone responses, respectively.

The analysis was performed using R version 4.1.2 (R Core Team, 2022). The randomForest package (Liaw & Wiener, 2002) was used for the random forest analysis, and the xgboost package (Chen et al., 2022) was used for the boosting analysis. The raw data cannot be shared publicly for research ethics reasons.

Training data and test data

Because the machine learning models were intended to make predictions, the sample data were split into training data for model development and test data for evaluating detection by the learned model. In this study, as shown in Fig. 2, out of 13 (16) surveys, 12 (15) surveys were used as training data; the single remaining survey was used as test data to evaluate the detection accuracy of the developed model. The procedure was repeated 13 (16) times. This is an application of leave-one-out cross-validation, a procedure used to examine the generalization performance of machine learning models.
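For illustration, this leave-one-survey-out procedure can be sketched in R. The object survey_data below (a list of per-survey data frames containing the predictors and a 0/1 outcome column DQS) is a hypothetical name, not part of the published analysis code.

# Minimal sketch of leave-one-survey-out validation (hypothetical object names).
# survey_data: list of data frames, one per survey, each holding the predictors
# and a 0/1 outcome column named "DQS".
library(randomForest)

fold_accuracy <- sapply(seq_along(survey_data), function(k) {
  train <- do.call(rbind, survey_data[-k])  # all surveys except survey k
  test  <- survey_data[[k]]                 # the held-out survey
  fit   <- randomForest(factor(DQS) ~ ., data = train)
  pred  <- predict(fit, newdata = test)
  mean(as.character(pred) == as.character(test$DQS))  # accuracy on survey k
})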

Figure 2

Training and test data for the PC responses. Note. This figure shows the development of a machine learning model to detect C/IERs using 13 surveys. In each of the 13 analyses, 12 surveys are used to develop the model. The model is then applied to the one remaining survey to check its predictive accuracy. The same approach is followed for the 16 sets of smartphone response data. See Appendix B for other methods for developing models using multiple surveys

For detection accuracy, it is desirable for the training data to contain outcomes of 0 and 1 in equal proportions. For this purpose, rather than using all the data from each survey, the smallest per-survey number of C/IERs (for PC and smartphone responses separately) was used as the sample size for cases where the training outcome is 1, and the same sample size was used for cases where the outcome is 0. For example, for smartphone responses, the minimum number of C/IERs (Table 3) is 187, so the sample size is 2 × 187 = 374 for each survey. As noted earlier, 15 surveys were used to develop the model for smartphone responses; thus, the total sample size for the smartphone training data is 187 × 2 × 15 = 5610. For the PC responses, the lowest numbers of C/IERs were, in order, 37, 121, 162, and 196. Ultimately, 196 was considered the minimum acceptable number, and the three surveys with fewer than 196 C/IERs were omitted from the analysis. Thus, the total sample size for developing the model for PC responses was 196 × 2 × 12 = 4704. Using 162 instead would have yielded a smaller training sample, and 37 or 121 would have yielded smaller samples still. In the tests conducted to evaluate model performance, all the data in the designated test survey were used.
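A minimal sketch of this per-survey balancing step in R follows; d and n_min are hypothetical names for one survey's data frame and the minimum C/IER count.

# Sketch: draw n_min C/IERs (DQS == 1) and n_min attentive respondents (DQS == 0)
# from one survey so that the training outcome is balanced 50:50.
balance_survey <- function(d, n_min) {
  rbind(d[sample(which(d$DQS == 1), n_min), ],
        d[sample(which(d$DQS == 0), n_min), ])
}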

The sample sizes for the training data in the three previous studies were 1000 for Ozaki and Suzuki ( 2019 ), 493 for Gogami et al. ( 2021 ), and 425 for Schroeders et al. ( 2022 ). Thus, the present study sought to develop a model with high detection accuracy using samples 4.7 to 13.2 times larger than the samples used in the previous studies.

The (1, 0) machine learning outcome in this study indicates whether the respondent responded incorrectly to at least one of the two DQS items included in the survey. If the respondent responded incorrectly to at least one DQS item, the outcome was assigned a value of 1, otherwise 0. Various other indices besides DQS have been used in combination in prior inattentive respondent studies. Nevertheless, in this study, only DQS items were used as a measure of C/IER. There are two reasons for this choice:

Ward and Meade ( 2023 ) mentioned unambiguity in scoring as an advantage of DQS. Since the goal of supervised learning is to develop a method for predicting the outcome, the outcome should be unambiguous. As noted earlier, Schroeders et al. ( 2022 ) recognized the possibility that even respondents who were instructed to be attentive may have given inattentive responses as a reason for the lack of good detection accuracy in their real data study. By using DQS, such a possibility can be ruled out.

Maniaci and Rogge ( 2014 ) showed that with either three DQS items (where noncompliance with two or more DQS items indicates an inattentive respondent) or one DQS item, the statistical power of the sex difference analysis on the openness factor score was comparable to that of ARS and seven DQS items (where noncompliance with three or more DQS items indicates an inattentive respondent). In addition, Ward and Meade ( 2023 ) recommended the use of DQS as a minimal practice to cope with C/IERs. Thus, DQS is often recommended.

As constructed, this study, at the very least, answers the question of whether machine learning can be used as an alternative to DQS. Similar studies might be conducted using other indicators, such as ARS, to flag C/IERs. The results could then be used to answer the question of whether machine learning can be used as a substitute for the other indicators. Such studies are left to future work.

For any of the predictors in the study, it is not possible to compare the magnitude of the values across different questionnaires. For example, since the total response time naturally depends on the length of the questionnaire, it cannot be directly compared across questionnaires. As a result, it is necessary to transform the variables. This section gives the details of the predictors, describes the transformation method, and outlines the reasons for the transformation.

The indicators fall into three main categories: predictors using Likert scales, paradata obtained in the administration of the survey, and predictors using open-ended responses.

Predictors using Likert scales

Listed below are the study’s predictors using Likert scales. All 16 surveys included Likert scales, although the number of items on the Likert scale varied across surveys. In this study, 12 consecutive Likert items were used to compute the predictors. Costa and McCrae ( 2008 ) reported that, of the 983 respondents in their survey, none answered questions in the same category more than 6, 9, 10, 14, and 9 times in a row for each of the five response categories. Based on this and the length of the scale of the questionnaire administered, 12 items were used in the present study. The location of these items within the questionnaire is shown in Table  2 . The number of consecutive items can be varied depending on the data to which the machine learning model is applied.

The 12 items were chosen to satisfy one (or both) of the following conditions: they contain reverse-worded items, or they contain items measuring different constructs. A case in which neither of the conditions is met would entail selecting 12 items that measure the same construct and that do not contain reverse-worded items. It is thus possible that predictors such as LS may not work well, since even attentive respondents may choose the same category for 12 consecutive items. Table  2 shows the relationship among the 12 items, indicating that one or both of the above conditions is satisfied in all 16 surveys.

LS is defined as the maximum number of consecutive responses to questions in the same category on a 12-item Likert scale. As shown in Table 2, the number of response categories for the scales used to calculate LS differs across surveys; however, we did not standardize the LS to account for the number of response categories. The reason for this is that C/IERs are expected to have a longer LS even if the number of response categories differs. Since all 12 items were used in this study, no standardization by number of items was done; however, when the number of items differs, the ratio of LS to the number of items could be used as the predictor “LS/number of items.”
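A one-line R sketch of LS for a single respondent (x is a hypothetical vector of that respondent's 12 responses):

# LS: longest run of identical responses across the 12 Likert items.
compute_LS <- function(x) max(rle(x)$lengths)

compute_LS(c(3, 3, 3, 2, 2, 5, 5, 5, 5, 1, 4, 4))  # returns 4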

Response limited to no more than two categories

Although LS refers to consecutive responses using the same category, it may be possible for a respondent to switch to another category in the middle of responding to questions in the same category in a series or to always choose the adjacent category in a series. Four predictors were used to capture such response behavior (R2C, R3C, AC, and MAC). To the authors’ knowledge, these indicators have not been used before. The first of the four variables, abbreviated as R2C, is “response in no more than two categories,” a binary variable indicating whether the number of categories chosen in response to the Likert scale items used to develop the predictors (in this case, 12 items) is less than or equal to 2. Based on the assumption that C/IERs use a small number of response categories without being affected by the number of categories, no transformation was performed for R2C.

Response limited to no more than three categories

Similar to R2C, R3C is a binary variable indicating whether the respondent used three or fewer categories in their responses. R3C is not transformed for the same reasons as R2C.

Number of consecutive responses to items in adjacent categories

Variable AC is defined as the total number of consecutive responses in adjacent categories on a Likert scale, staggered by one category. For example, if the respondent responded 1, 2, 3, 4, 2, 1, the value of index AC would be 4 (1-2, 2-3, 3-4, 2-1). In this case, C/IERs cannot be detected by LS, R2C, or R3C. The number of consecutive responses to questions in adjacent categories is abbreviated as AC. AC is standardized as “number of categories × AC,” since the smaller the number of categories, the larger the AC value may be. If the number of items differs, “number of categories × AC/number of items” is used.

Maximum number of consecutive responses to items in adjacent categories

Whereas AC represents the total number of consecutive responses, MAC indicates the maximum number of consecutive responses to questions in adjacent categories. For example, if the responses are 1, 2, 3, 4, 2, 1, the value of this index is 3 (1-2, 2-3, 3-4). The transformation method is the same as that for AC.
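AC and MAC can be computed from a respondent's response vector with a short R sketch; the function names below are ours, not taken from the published code.

# AC: total number of adjacent-category transitions (absolute difference of 1).
# MAC: longest run of such transitions.
compute_AC  <- function(x) sum(abs(diff(x)) == 1)
compute_MAC <- function(x) {
  runs <- rle(abs(diff(x)) == 1)
  if (any(runs$values)) max(runs$lengths[runs$values]) else 0
}

x <- c(1, 2, 3, 4, 2, 1)
compute_AC(x)   # 4 (pairs 1-2, 2-3, 3-4, 2-1)
compute_MAC(x)  # 3 (the run 1-2, 2-3, 3-4)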

Although Dunn et al. ( 2018 ) showed the effectiveness of IRV, we chose not to use it in this study. Because IRV is a standard deviation, it is difficult to adjust for differences in the number of Likert scale categories across surveys. Instead, R2C, R3C, AC, and MAC were used, because, taken together, their ability to capture a lack of variability in responses is equal to or better than that of IRV, and because it is relatively easy to adjust them for differences in the number of categories. Dunn et al. ( 2018 ) stated that IRV can detect cases of low-variability responding that cannot be detected by LS, citing examples such as 2, 2, 2, 2, 3, 3, 2, 2, 2 and 4, 5, 4, 5, 4, 5, 4, 5, 5, 5, where adjacent categories are selected one after another. Such cases can be detected by R2C, R3C, AC, or MAC. In addition, responses such as 1, 2, 3, 4, 5, 4, 3, 2 are difficult to detect with IRV but can be detected with AC and MAC.

Mahalanobis distance

The Mahalanobis distance (maha) is an indicator used in various studies. In this study, the 12 items were used for the calculation, providing an indicator of how distinctive each respondent's responses to these 12 items are. The Mahalanobis distance for respondent i is calculated as \( \text{maha}_i = \left(x_i - \overline{x}\right)^{\prime} \Sigma^{-1} \left(x_i - \overline{x}\right) \), where \(x_i\) is the vector of responses to the Likert scale items for respondent i, \(\overline{x}\) is the mean vector, and \(\Sigma\) is the covariance matrix of the Likert scale items. Note that the mean vector and covariance matrix needed for the calculation would be distorted if C/IERs were included in the data; therefore, they were obtained after excluding respondents with LS = 12. Because the Mahalanobis distance takes the variance of each variable into account, it already accommodates the fact that item variances differ across surveys owing to differing numbers of response categories. However, because differences in the number of Likert scale items do affect the Mahalanobis distance, it was transformed by dividing maha by the square of the number of items.

P  value for Mahalanobis distance

A statistical test in which the null hypothesis is “data for the respondent do not deviate from the mean vector” was performed on the maha, using the chi-square distribution with degrees of freedom equal to the number of items. The p  value in the significance test was then used as another predictor. The p  value for the maha is abbreviated as “maha_p.”
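A sketch of maha and maha_p in base R, assuming a hypothetical n × 12 response matrix likert and a vector LS of long-string values:

# Mean vector and covariance are estimated after excluding LS = 12 respondents.
ref    <- likert[LS < 12, ]
maha   <- mahalanobis(likert, colMeans(ref), cov(ref))
maha_p <- pchisq(maha, df = ncol(likert), lower.tail = FALSE)  # chi-square test
maha   <- maha / ncol(likert)^2  # adjust for the number of items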

Mahalanobis distance using two variables with highest correlation

The maha using only the two variables with the highest correlation among the pairs of 12 Likert scale items was determined. As with the maha above, the mean vector and covariance matrix were obtained by excluding respondents with LS = 12. The maha using the two variables with the highest correlation is abbreviated as “maha2.”

P  value for Mahalanobis distance using two variables with highest correlation

This is the p  value for the maha using the two variables with the highest correlation. The number of degrees of freedom is 2. This indicator is abbreviated as “maha2_p.”

Sum of absolute values of deviations from mean vector

This index, abbreviated as “absdevi,” is the sum of the absolute values of the differences between the mean vector for the 12 items and each respondent’s responses. The aim of this indicator is similar to that of the maha; however, absdevi does not take into account the covariance between items. Because the value of absdevi increases with the number of Likert scale items and with the number of response categories, it is transformed as “absdevi/(number of items × number of categories).”
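A sketch of absdevi with the same hypothetical likert matrix and a scalar n_categories (the number of response categories):

# Sum of absolute deviations from the item means, rescaled by the number
# of items and the number of response categories.
absdevi <- rowSums(abs(sweep(as.matrix(likert), 2, colMeans(likert)))) /
  (ncol(likert) * n_categories)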

Predictors using paradata obtained from administration of surveys

Auxiliary data collected during the process of administering a survey are called paradata. In this study, the total response time of each respondent and the median of the total response time per survey were used as paradata-based indicators.

Total response time

While pointing out that fast responses are one of the characteristics of inattentive respondents, Ward and Meade ( 2023 ) also noted that, because respondents sometimes drop out and come back in the middle of their responses, the response time per page is more accurate for detecting fast responses than the time taken to complete the entire questionnaire. However, the web survey system used in this study lacked a mechanism to measure the response time per page. As a result, the total response time was used. In box-and-whisker plots of the total response time, values greater than the upper whisker were considered to be due to the respondent interrupting and later returning to his/her responses. In such cases, the response time for that respondent was replaced by the value of the upper whisker. Similar manipulations were performed by Maniaci and Rogge ( 2014 ) and Schroeders et al. ( 2022 ).

Since the total response time is affected by the number of items, the total response time divided by the number of items was used to account for differences among surveys. The total response time is abbreviated as “time.”
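A sketch of this two-step transformation, assuming hypothetical objects restime (each respondent's total response time in seconds) and nitems (the number of items in that survey):

# Cap total response time at the upper whisker of its box plot, then
# divide by the number of items to make surveys comparable.
upper_whisker <- boxplot.stats(restime)$stats[5]
time <- pmin(restime, upper_whisker) / nitems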

Median of total response time

The median of the above total response time was calculated for each survey and used as a predictor. This is a questionnaire-level variable since it is the same among respondents who took the same survey, although it is a different value for each survey. The median total response time is abbreviated as “time_m.” Since the purpose of this indicator is to express differences among surveys, no transformation by number of items was applied to time_m.

Number of survey items

It is assumed that the greater the number of survey items, the more likely it is that inattentive responses will occur. Therefore, the number of survey items was used as a predictor. This, too, is a variable at the questionnaire level. The number of survey items is abbreviated as “nitems.”

The machine learning model was developed using the above 13 variables as predictors. To the author’s knowledge, nine of the 13 variables, R2C, R3C, AC, MAC, maha2, maha2_p, absdevi, time_m, and nitems, are predictors that have not been used in prior studies.

Parameter tuning

As noted, the model was developed using training data consisting of 4704 (5610) values obtained from 12 (15) surveys. Model parameters were tuned to increase the detection accuracy when the model was applied to the validation data. After constructing the model, the model was fitted to the test data to examine its detection accuracy when applied to data that were not used for model development. The validation data were obtained by bootstrapping from the training data for random forests and by tenfold cross-validation on the training data for gradient boosting.

Random forests generate multiple decision trees and average their predictions. To generate the trees, B bootstrap samples are drawn from the training data, and one tree is grown from each sample. When growing each tree, the predictors considered as candidates for each split are also selected by random sampling. The model is evaluated on the observations not drawn in each bootstrap sample (the out-of-bag, or OOB, data) to obtain the validation error. The number of predictors sampled at each split is a parameter determined by searching for the value that minimizes this validation error. Like random forests, boosting combines a large number of decision trees, but differs in that the trees are grown sequentially, each tree correcting the errors of the previous ones. The maximum tree depth is a parameter determined by tenfold cross-validation.
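One way to carry out this tuning in R is sketched below, assuming a hypothetical training data frame train with a character vector predictors of predictor names and a 0/1 outcome column DQS; tuneRF searches the number of predictors per split against the OOB error, and a small loop over depths uses xgb.cv for tenfold cross-validation.

library(randomForest)
library(xgboost)

# Random forests: search the number of predictors per split (mtry) by OOB error.
tuned <- tuneRF(x = train[, predictors], y = factor(train$DQS),
                ntreeTry = 500, stepFactor = 1.5, doBest = FALSE)

# Gradient boosting: choose the maximum tree depth by tenfold cross-validation.
dtrain  <- xgb.DMatrix(as.matrix(train[, predictors]), label = train$DQS)
cv_loss <- sapply(2:6, function(d) {
  cv <- xgb.cv(params = list(objective = "binary:logistic", max_depth = d),
               data = dtrain, nrounds = 200, nfold = 10,
               early_stopping_rounds = 10, verbose = 0)
  min(cv$evaluation_log$test_logloss_mean)
})
best_depth <- (2:6)[which.min(cv_loss)]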

Threshold for inattentive response probability

In this study, the machine learning output is the probability that a respondent will be judged as a C/IER. Given that the output is a probability, it would seem reasonable to differentiate C/IERs and attentive respondents based on a value of .5. Results for criteria other than .5 can be downloaded from https://osf.io/2t64w . A higher threshold means that only respondents who show a high tendency to inattentive responses are judged as C/IERs, while a lower threshold means that even respondents who show only a slight tendency to inattentive responses are judged as C/IERs.
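Applying a threshold to the predicted probabilities is a one-line operation; a sketch with a fitted randomForest object fit and test data test (hypothetical names):

# Probability of being a C/IER, then a 0/1 flag at the chosen threshold.
prob <- predict(fit, newdata = test, type = "prob")[, "1"]
flag <- as.integer(prob >= 0.5)  # raise or lower 0.5 for stricter or looser screening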

Ward and Meade ( 2023 ) proposed three levels of C/IER screening: minimal, moderate, and extensive. Since their proposal is to use a different detection index for each level, it does not directly correspond to the inattentive respondent probability produced by machine learning in this study, where DQS is used to define the outcome. However, multiple detection levels can be set by changing the threshold value used to identify C/IERs.

Table 4 shows the detection results for the PC responses when the model is applied to the test data. Table 5 shows the detection results for the smartphone responses. The results are given separately for the random forests and boosting cases. The row for each survey (labeled 1 through 13 in the case of PC responses and 1 through 16 in the case of smartphone responses) shows the C/IER detection results (accuracy, recall, precision, and balanced accuracy) when the test data are from the indicated survey and the training data are from all the other surveys. For example, in Table 4, the row for survey 4 indicates the degree to which respondents in survey 4 who did not comply with at least one of the two DQS items in the survey can be detected by the machine learning model developed with survey data from all the surveys except survey 4 (i.e., surveys 1–3 and 5–13). Two averages, Mean 1 and Mean 2, are shown in the bottom row of Tables 4 and 5. Mean 1 is the average of the values shown in the tables, while Mean 2 is the average recalculated using the pooled data from all 13 (16) surveys. For example, the accuracy of Mean 2 is the rate at which the observed outcomes (0 or 1) across all test data match the outcomes predicted by the machine learning model. As can be seen in the tables, there is little difference between Mean 1 and Mean 2.

The accuracy and precision values shown in Tables 4 and 5 are not strictly comparable across surveys, nor are they strictly comparable with the values reported in previous studies, because the C/IER ratio differs between surveys and from the previous studies. Therefore, in addition to accuracy, we report balanced accuracy, a measure suited to unbalanced binary classification. Furthermore, the C/IER rates in the test data were 10% in Schroeders et al. ( 2022 ), 11% in Ozaki and Suzuki ( 2019 ), and 50% in Gogami et al. ( 2021 ). To allow comparison with previous studies on accuracy and precision, Tables 4 and 5 also show the mean values when the C/IER rate is artificially set to 10.5% and 50%, respectively. For example, a C/IER rate of 10.5% was achieved by artificially reducing the number of respondents whose outcome was C/IER while leaving the number of attentive respondents unchanged. Tables 4 and 5 show that accuracy, recall, precision, and balanced accuracy are slightly higher for random forests than for boosting, so the results for random forests are interpreted hereafter.
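Balanced accuracy referred to above is simply the mean of recall (sensitivity) and specificity; a minimal R sketch, reusing the hypothetical flag from the earlier sketch:

# Balanced accuracy = mean of recall (sensitivity) and specificity.
tab <- table(truth = test$DQS, pred = flag)
recall            <- tab["1", "1"] / sum(tab["1", ])
specificity       <- tab["0", "0"] / sum(tab["0", ])
balanced_accuracy <- (recall + specificity) / 2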

The row labeled “Old” in Tables 4 and 5 shows the results of a machine learning model using only time, LS, maha, and maha_p as predictors, which have been used in previous studies. Therefore, the difference between Old and Mean 1 indicates the effectiveness of the new predictors introduced in this study.

Comparison of results with previous studies

Table  6 summarizes the results of a comparison with previous studies. When comparing with Schroeders et al. ( 2022 ) and Ozaki and Suzuki ( 2019 ), the 10.5% case in Table  4 (PC response) is referenced; when comparing with Gogami et al. ( 2021 ), the 50% case in Table  5 (smartphone response) is referenced.

Comparing the results of Schroeders et al. ( 2022 ) to those of the present study, the present method is 3 points better in accuracy, 12 points better in recall, 14 points better in precision, 6 points better in balanced accuracy, and 2 points better in specificity. Thus, compared to the results of Schroeders et al. ( 2022 ), the detection accuracy is improved in all aspects. In particular, precision and recall are improved, which means that the probability of detecting an actual C/IER as a C/IER and the probability that a predicted C/IER is actually a C/IER are higher with the developed method.

Comparing the results of Ozaki and Suzuki ( 2019 ) to those of the present study, the results of this study are inferior in accuracy by 8 points, superior in recall by 6 points, almost the same results in precision, slightly inferior in balanced accuracy, and 10 points inferior in specificity. Thus, compared to Ozaki and Suzuki ( 2019 ), the probability of detecting an actual C/IER as a C/IER is increased, but the probability of detecting an actual attentive respondent as an attentive respondent is decreased.

The accuracy, precision, and recall reported by Gogami et al. ( 2021 ) were each approximately 86%, meaning that the results of the present study are inferior in all three aspects. It is important to note, however, that the three previous studies, including Gogami et al. ( 2021 ), used test data from surveys with the same content as the training data. On the other hand, the results of the present study were produced using training data and test data from questionnaires whose content was quite different. The fact that the proposed method was able to achieve higher accuracy than two of the three previous studies in some accuracy indices can thus be considered a notable advance in establishing the generalizability of the method using machine learning. The reason for the higher accuracy than existing methods is that multiple survey data were treated in an integrated manner, as shown in Fig.  2 , which resulted in the sample size of the training data being much larger than in previous studies, as shown in Table  6 .

It is also worth mentioning that since Gogami et al. ( 2021 ) used smartphone response data, the results of the present study are the best ever obtained for PC response data for recall. It is worth noting, too, that the superiority of Gogami et al. ( 2021 ) in detection accuracy might be attributable to its use of smartphone screen operation data (the number of times text was deleted and the respondent’s scrolling speed, etc.). Since this information was not collected in the 16 surveys, incorporating it into future studies is an issue to be considered.

Effectiveness of the new predictor

The difference between Old and Mean 1 in Tables 4 and 5 shows the effect of adding the new predictors to the model containing time, LS, maha, and maha_p. In the case of boosting, there is little difference and thus almost no effect of adding the new predictors. In fact, precision (PRC) in Table 5 is about 4% lower when the new predictors are included, suggesting that the generalization performance of the model may be reduced. In the case of random forests, by contrast, adding the new predictors improves each indicator by about 2% to 5%, indicating that their inclusion is effective, although the effect is not large.

Predictor importance

Before showing the predictor importance, the correlation matrix between predictors calculated using all the data is shown in Table 7. What is striking about this correlation matrix is that there is almost no correlation between response time and the other predictors. Although not shown in the paper, a similar trend was observed when the correlation matrix was calculated for each survey: in four of the 13 surveys, the correlation between time and LS was more negative than –.10 for PC respondents. This is similar to the correlation between time and LS of –.05 in Maniaci and Rogge ( 2014 ) and –.12 in Meade and Craig ( 2012 ). Thus, although one would expect shorter response times to lead to larger LS values, this was not the case, which underscores the importance of using other predictors in combination with response time. The correlation between LS and maha is –.41, which is markedly different from –.15 in Maniaci and Rogge ( 2014 ) and .10 in Meade and Craig ( 2012 ). This can be interpreted as a result of the smaller maha of respondents who answered consecutively in the intermediate category (i.e., longer LS), since, as mentioned earlier, Japanese respondents have a strong tendency to respond in the intermediate category and the mean of each variable therefore tends to lie near the value assigned to the intermediate category.

The random forest model for detecting the DQS-based outcome was retrained using the response data from all PC responses ( n = 196 × 2 × 13 = 5096) and all smartphone responses ( n = 187 × 2 × 16 = 5984). Figure 3 shows the mean decrease in accuracy as an indicator of predictor importance in the random forest model. The left side of Fig. 3 shows the PC responses; the right side shows the smartphone responses. The mean decrease in accuracy indicates how much classification accuracy is lost when the values of a predictor are randomly permuted: the larger the value, the more important the predictor.
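Permutation importance of this kind can be obtained directly from randomForest; a sketch, assuming a combined training data frame train_all (a hypothetical name):

# Fit with importance = TRUE, then extract the mean decrease in accuracy (type = 1).
rf_all <- randomForest(factor(DQS) ~ ., data = train_all, importance = TRUE)
imp    <- importance(rf_all, type = 1)
imp[order(imp[, 1], decreasing = TRUE), , drop = FALSE]  # most important first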

Figure 3

Predictor importance. Note: Mean decrease in accuracy represents the amount of classification accuracy lost when each predictor's values are shuffled among respondents; the larger the value, the more important the predictor. time is the number of seconds for the response, LS is the maximum number of consecutive responses, maha_p is the p value of the Mahalanobis distance using 12 items, R2C is the response limited to no more than two categories, AC is the number of consecutive responses to items in adjacent categories, maha is the Mahalanobis distance using 12 items, R3C is the response limited to no more than three categories, absdevi is the sum of absolute values of deviations from the mean vector, maha2_p is the p value of the Mahalanobis distance using the two variables with the highest correlation, nitems is the number of survey items, maha2 is the Mahalanobis distance using the two variables with the highest correlation, MAC is the maximum number of consecutive responses to items in adjacent categories, time_m is the median total response time for each survey

As indicated in the figure, response time is the most important factor for both the PC and smartphone responses, followed, in order, by LS, maha_p, and maha. The importance of response time has been noted by Leiner ( 2019 ), Ward and Meade ( 2023 ), and others. Classification accuracy decreased when either maha_p or maha was deleted while the other was kept. Therefore, both predictors contribute even when they are entered into the predictor set at the same time.

Among the predictors not previously used in the context of C/IER detection, we found that including R2C, AC, maha2, maha2_p, absdevi, time_m, and nitems helped improve accuracy. These were found to be almost equally effective, although less effective than maha_p and maha. R3C and MAC had the lowest effect among the predictors used, probably because R2C substitutes for the role of R3C and AC substitutes for MAC. However, including R3C and MAC still helped to improve accuracy, and, like R2C, AC, maha2, maha2_p, and absdevi, they are indicators that can be easily computed from Likert-scale data. In addition, as discussed above, AC and MAC can detect cases that cannot be detected by IRV and are therefore recommended for future use in studies of C/IER detection. For maha2 and maha2_p, as with maha and maha_p, including both at the same time improved the estimation accuracy.

The questionnaire-level predictors, time_m and nitems, are used for the first time in this study and were found to be as effective as R2C and the other new predictors. time_m and nitems capture the approximate response time for each survey and the length of the questionnaire, respectively. The results indicate that they have an effect that cannot be fully substituted by response time at the individual level. time_m and nitems should therefore be included when developing machine learning models using multiple survey data, as in this study.

Overall, the results of this study support claims 1 through 6 regarding the advantages of the proposed method. However, the new predictors showed no effect with boosting and only a small effect with random forests. This indicates that the contribution of the new predictors to the development of a generic method was small, and that integrating the multiple survey data, as shown in Fig. 2, was more effective. The transformation of the predictor values by the proposed method was also considered effective.

This study developed a generic method for detecting inattentive or careless survey respondents (C/IERs) using machine learning. This section summarizes the limitations of this study, future research directions, and recommendations for researchers using the methods proposed in this study.

Limitations

This study had two limitations. First, because it uses DQS as the outcome, the predicted inattentive response probability represents the tendency of respondents not to comply with DQS, and so the method is not exhaustive in detecting all types of inattentive respondents. DQS was used as the outcome because, as discussed in the subsection “Outcome,” it is unambiguous in scoring (Ward & Meade, 2023 ), and previous studies have argued for its validity (Maniaci & Rogge, 2014 ; Ward & Meade, 2023 ).

Second, this method is only applicable when the questionnaire contains a Likert scale and response time data are available for each respondent. Also, this method works well when the Likert scale contains reverse-worded items that measure the same construct or items that measure different constructs. This is due to the use of predictors such as LS.

Future research directions

A number of research directions should be noted here. The first concerns improving accuracy. The results in Tables 4, 5, and 6 are partially superior to those of Ozaki and Suzuki ( 2019 ) and Schroeders et al. ( 2022 ), but fall short of those of Gogami et al. ( 2021 ), which used smartphone operation information. Since smartphone operation information is not available for some surveys, it is significant that this study showed that high prediction accuracy can be obtained even in its absence. In addition, Buchanan and Scofield ( 2018 ) showed the effectiveness of another type of operational data, the number of clicks during the response, for detecting C/IERs. Including smartphone operation information and the number of clicks as predictors is expected to improve accuracy.

Second, although only 16 surveys were available for this study, if a larger number of surveys were available, models could be developed for each research category, which may also contribute to improved accuracy. Recently, Yeung and Fernandes ( 2022 ) developed a machine learning method to detect invalid open-ended text responses; the same is possible with GPT-4. Although the present study makes little use of information obtained from text responses as predictors, using evaluation values for text responses as predictors is expected to increase the accuracy of C/IER detection.

The third point is to develop a machine learning model with outcomes other than DQS. By developing a machine learning model with other outcomes such as ARS, it may be possible to understand each respondent’s response behavior in more detail. The results of this study will help overcome the first limitation mentioned above.

The fourth point is to compare this study with a series of studies that modeled the response behavior of C/IERs and attentive respondents by a latent response mixture model (Ulitzsch, Pohl, et al., 2022a ; Ulitzsch, Yildirim-Erbasli, et al., 2022b ; Ulitzsch, Pohl, et al., 2023a ; Ulitzsch, Shin, et al., 2023b ). If response time per page or Likert scale data of different polarity is available, these previous studies can also be used as generically as this study, regardless of the content of the questionnaires. In particular, the model of Ulitzsch, Yildirim-Erbasli, et al. ( 2022b ) allows us to examine inattentive response tendencies by respondent and by item. This can be rephrased as being able to detect fluctuations in respondents’ attention. Since the method developed in this study may also be able to detect fluctuations in respondents’ attention by changing the Likert scale items applied, a comparison from this perspective is also possible.

Recommendations for researchers using the proposed methods

To actually use this method, it is first necessary to construct a machine learning model. To do so, it is necessary to collect data from multiple web surveys. It is also necessary that all surveys include DQS and Likert scales and that response time be measured.

When extracting predictors from the collected raw data, we can use the R code calculating_predictors.R as described in the Appendix . In developing a machine learning model using the training data containing the computed predictors and applying that model to the test data (survey data that were not used as training data), analysis.R can be used. analysis.R allows us to estimate the inattentive response probability for each respondent on the test data. If the prediction accuracy for the test data is high, a machine learning model can be developed using the data from all surveys. Note, however, that random sampling is desirable so that the ratio of C/IERs to attentive respondents is 50:50. The model can then be applied to any survey data to detect C/IERs. See the Appendix for the details of the R code.

It also seems necessary to create country-specific models when using this method. The model for this study was developed using data from Japanese respondents. Since the tendency to respond to survey items differs from country to country, it is not certain whether the model developed in this study will be applicable to data from respondents in other countries.

Availability of data and materials

The research ethics committee of the first author's institution does not, in principle, permit publication of the data, so the data cannot be made public. However, sample data are available at https://osf.io/dx2mf .

Results for varying the threshold from 0.1 to 0.9 in machine learning predictions are available at https://osf.io/2t64w .

Code availability (software application or custom code)

The R codes for calculating predictors and for machine learning predictions are available at https://osf.io/dx2mf . The Appendix provides examples of analyses using the sample data and analysis code.

Andreadis, I. (2015). Web surveys optimized for smartphones: Are there differences between computer and smartphone users? Methods, Data, Analysis, 9 , 213–228. https://doi.org/10.12758/mda.2015.012


Arias, V. B., Garrido, L. E., Jenaro, C., Martínez-Molina, A., & Arias, B. (2020). A little garbage in, lots of garbage out: Assessing the impact of careless responding in personality survey data. Behavior Research Methods, 52 (6), 2489–2505. https://doi.org/10.3758/s13428-020-01401-8


Bowling, N. A., & Huang, J. L. (2018). Your attention please! Toward a better understanding of research participant carelessness. Applied Psychology, 67 (2), 227–230. https://doi.org/10.1111/apps.12143

Bowling, N. A., Huang, J. L., Bragg, C. B., Khazon, S., Liu, M., & Blackmore, C. E. (2016). Who cares and who is careless? Insufficient effort responding as a reflection of respondent personality. Journal of Personality and Social Psychology, 111 (2), 218–229. https://doi.org/10.1037/pspp0000085

Breiman, L. (2001). Random forests. Machine Learning, 45 (1), 5–32. https://doi.org/10.1023/A:1010933404324

Breitsohl, H., & Steidelmüller, C. (2018). The impact of insufficient effort responding detection methods on substantive responses: Results from an experiment testing parameter invariance. Applied Psychology, 67 (2), 284–308. https://doi.org/10.1111/apps.12121

Bruhlmann, F., Petralito, S., Aeschbach, L., & Opwis, K. (2020). The quality of data collected online: An investigation of careless responding in a crowdsourced sample. Methods in Psychology, 2 , 100022. https://doi.org/10.1016/j.metip.2020.1

Buchanan, E. M., & Scofield, I. E. (2018). Methods to detect low quality data and its implication for psychological research. Behavior Research Methods, 50 (6), 2586–2596. https://doi.org/10.3758/s13428-018-1035-6

Chen, T., He, T., Benesty, M., Khotilovich, V., Tang, Y., Cho, H., Chen, K., Mitchell, R., Cano, I., Zhou, T., Li, M., Xie, J., Lin, M., Geng, Y., Li, Y., & Yuan, J. (2022). xgboost: Extreme Gradient Boosting [Computer software]. https://CRAN.R-project.org/package=xgboost .

Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794). ACM. https://doi.org/10.1145/2939672.2939785


Costa, P. T., Jr., & McCrae, R. R. (2008). The revised NEO personality inventory (NEO-PI-R). In D. H. Saklofske (Ed.), The SAGE handbook of personality theory and assessment: Personality measurement and testing (2nd ed., pp. 179–198). Sage.

Credé, M. (2010). Random responding as a threat to the validity of effect size estimates in correlational research. Educational and Psychological Measurement, 70 (4), 596–612. https://doi.org/10.1177/00131644103666

Curran, P. G. (2016). Methods for the detection of carelessly invalid responses in survey data. Journal of Experimental Social Psychology, 66 , 4–19. https://doi.org/10.1016/j.jesp.2015.07.006

Curran, P. G., & Hauser, K. A. (2019). I’m paid biweekly, just not by leprechauns: Evaluating valid-but-incorrect response rates to attention check items. Journal of Research in Personality, 82 , 103849. https://doi.org/10.1016/j.jrp.2019.103849

de Bruijne, M., & Wijnant, A. (2013). Comparing survey results obtained via mobile devices and computers: An experiment with a mobile web survey on a heterogeneous group of mobile devices versus a computer-assisted web survey. Social Science Computer Review, 31 (4), 482–504. https://doi.org/10.1177/0894439313483976

DeSimone, J. A., DeSimone, A. J., Harms, P. D., & Wood, D. (2018). The differential impacts of two forms of insufficient effort responding. Applied Psychology, 67 (2), 309–338. https://doi.org/10.1111/apps.12117

Dunn, A. M., Heggestad, E. D., Shanock, L. R., & Theilgard, N. (2018). Intra-individual response variability as an indicator of insufficient effort responding: Comparison to other indicators and relationships with individual differences. Journal of Business and Psychology, 33 (1), 105–121. https://doi.org/10.1007/s10869-016-9479-0

Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29 (5), 1189–1232. https://doi.org/10.1214/aos/1013203451

Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics and Data Analysis, 38 (4), 367–378. https://doi.org/10.1016/S0167-9473(01)00065-2

Gogami, M., Matsuda, Y., Arakawa, Y., & Yasumoto, K. (2021). Detection of careless responses in online surveys using answering behavior on smartphone. IEEE Access, 9 , 53205–53218. https://doi.org/10.1109/ACCESS.2021.3069049

Grinsztajn, L., Oyallon, E., & Varoquaux, G. (2022). Why do tree-based models still outperform deep learning on tabular data? arXiv preprint. https://doi.org/10.48550/arXiv.2207.08815

Hamby, T., & Taylor, W. (2016). Survey satisficing inflates reliability and validity measures: An experimental comparison of college and Amazon Mechanical Turk samples. Educational and Psychological Measurement, 76 (6), 912–932. https://doi.org/10.1177/0013164415627349


Harzing, A.-W. (2006). Response Styles in Cross-national Survey Research: A 26-country Study. International Journal of Cross Cultural Management, 6 (2), 243–266. https://doi.org/10.1177/1470595806066332

Huang, J. L., Curran, P. G., Keeney, J., Poposki, E. M., & DeShon, R. P. (2012). Detecting and deterring insufficient effort responding to surveys. Journal of Business and Psychology, 27 (1), 99–114. https://doi.org/10.1007/s10869-011-9231-8

Huang, J. L., Liu, M., & Bowling, N. (2015). Insufficient effort responding: Examining an insidious confound in survey data. Journal of Applied Psychology, 100 (3), 828–845. https://doi.org/10.1037/a0038510

Johnson, J. A. (2005). Ascertaining the validity of individual protocols from web-based personality inventories. Journal of Research in Personality, 39 (1), 103–129. https://doi.org/10.1016/j.jrp.2004.09.009

Jones, A., Earnest, J., Adam, M., Clarke, R., Yates, J., & Pennington, C. R. (2022). Careless responding in crowdsourced alcohol research: A systematic review and meta-analysis of practices and prevalence. Experimental and Clinical Psychopharmacology, 30 (4), 381–399. https://doi.org/10.1037/pha0000546

Keusch, F., & Yan, T. (2017). Web Versus Mobile Web: An Experimental Study of Device Effects and Self-Selection Effects. Social Science Computer Review, 35 (6), 751–769. https://doi.org/10.1177/0894439316675566

Kung, F. Y. H., Kwok, N., & Brown, D. J. (2018). Are attention check questions a threat to scale validity? Applied Psychology, 67 (2), 264–283. https://doi.org/10.1111/apps.12108

Leiner, D. J. (2019). Too fast, too straight, too weird: Non-reactive indicators for meaningless data in Internet surveys. Survey Research Methods, 13 (3), 229–248. https://doi.org/10.18148/srm/2019.v13i3.7403

Liaw, A., & Wiener, M. (2002). Classification and Regression by randomForest [Computer software]. R News, 2 (3), 18–22. https://CRAN.R-project.org/doc/Rnews/ .


Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. In: Proceedings of the Advances in Neural Information Processing Systems , (pp. 4765–4774). https://doi.org/10.48550/arXiv.1705.07874

Maniaci, M. R., & Rogge, R. D. (2014). Caring about carelessness: Participant inattention and its effects on research. Journal of Research in Personality, 48 , 61–83. https://doi.org/10.1016/j.jrp.2013.09.008

Marjanovic, Z., Holden, R., Struthers, W., Cribbie, R., & Greenglass, E. (2015). The inter-item standard deviation (ISD): An index that discriminates between conscientious and random responders. Personality and Individual Differences, 84 , 79–83. https://doi.org/10.1016/j.paid.2014.08.021

Masuda, S., Sakagami, T., Kawabata, H., Kijima, N., & Hoshino, T. (2017). Respondents with low motivation tend to choose middle category: survey questions on happiness in Japan. Behaviormetrika, 44 , 593–605. https://doi.org/10.1007/s41237-017-0026-8

Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological Methods, 17 (3), 437–455. https://doi.org/10.1037/a0028085

Oppenheimer, D. M., Meyvis, T., & Davidenko, N. (2009). Instructional manipulation checks: Detecting satisficing to increase statistical power. Journal of Experimental Social Psychology, 45 (4), 867–872. https://doi.org/10.1016/j.jesp.2009.03.009

Ozaki, K., & Suzuki, T. (2019). Kikaigakusyu ni yoru futekisetsukaitosya no yosoku [Using machine learning to predict inappropriate respondents]. Kodo Keiryogaku (Japanese Journal of Behaviormetrics), 46 (2), 39–52. https://doi.org/10.2333/jbhmk.46.39

R Core Team. (2022). R: A language and environment for statistical computing . R Foundation for Statistical Computing https://www.R-project.org/

Rubin, D. B. (1976). Inference and missing data. Biometrika, 63 (3), 581–592. https://doi.org/10.1093/biomet/63.3.581

Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7 (2), 147–177. https://doi.org/10.1037/1082-989X.7.2.147

Schroeders, U., Schmidt, C., & Gnambs, T. (2022). Detecting careless responding in survey data using stochastic gradient boosting. Educational and Psychological Measurement, 82 (1), 29–56. https://doi.org/10.1177/00131644211004708

Tourangeau, R., Sun, H., Yan, T., Maitland, A., Rivero, G., & Williams, D. (2017). Web surveys by smartphones and tablets: effects on data quality. Public Opinion Quarterly, 81 (4), 896–929.

Ulitzsch, E., Pohl, S., Khorramdel, L., Kroehne, U., & von Davier, M. (2022a). A response-time-based latent response mixture model for identifying and modeling careless and insufficient effort responding in survey data. Psychometrika, 87 (2), 593–619. https://doi.org/10.1007/s11336-021-09817-7

Ulitzsch, E., Yildirim-Erbasli, S. N., Gorgun, G., & Bulut, O. (2022b). An explanatory mixture IRT model for careless and insufficient effort responding in self-report measures. British Journal of Mathematical and Statistical Psychology, 75 (3), 668–698. https://doi.org/10.1111/bmsp.12272

Ulitzsch, E., Pohl, S., Khorramdel, L., Kroehne, U., & von Davier, M. (2023a). Using response times for joint modeling of careless responding and attentive response styles. Journal of Educational and Behavioral Statistics . https://doi.org/10.3102/10769986231173607

Ulitzsch, E., Shin, H. J., & Lüdtke, O. (2023b). Accounting for careless and insufficient effort responding in large-scale survey data—development, evaluation, and application of a screen-time-based weighting procedure. Behavior Research Methods . https://doi.org/10.3758/s13428-022-02053-6

Urban, C. J., & Gates, K. M. (2021). Deep learning: A primer for psychologists. Psychological Methods, 26 (6), 743–773. https://doi.org/10.1037/met0000374

Ward, M. K., & Meade, A. W. (2018). Applying social psychology to prevent careless responding during online surveys. Applied Psychology, 67 (2), 231–263. https://doi.org/10.1111/apps.12118

Ward, M. K., & Meade, A. W. (2023). Dealing with careless responding in survey data: Prevention, identification, and recommended best practices. Annual Review of Psychology, 74 , 577–596. https://doi.org/10.1146/annurev-psych-040422-045007

Ward, M. K., & Pond, S. B., III. (2015). Using virtual presence and survey instructions to minimize careless responding on internet-based surveys. Computers in Human Behavior, 48 , 554–568. https://doi.org/10.1016/j.chb.2015.01.070

Weiner, S. P., & Dalessio, A. T. (2006). Oversurveying: Causes, consequences, and cures. In A. I. Kraut (Ed.), Getting action from organizational surveys: New concepts, technologies, and applications (pp. 294–311). Wiley.

Woods, C. M. (2006). Careless responding to reverse-worded items: Implications for confirmatory factor analysis. Journal of Psychopathology and Behavioral Assessment, 28 (3), 189–194. https://doi.org/10.1007/s10862-005-9004-7

Yeung, R. C., & Fernandes, M. A. (2022). Machine learning to detect invalid text responses: Validation and comparison to existing detection methods. Behavior Research Methods, 54 , 3055–3070. https://doi.org/10.3758/s13428-022-01801-y


Acknowledgements

This research was funded by a joint research grant with I-Bridge Corporation.

This research was supported by a joint research grant with I-Bridge Corporation and JSPS KAKENHI Grant Number JP23K02859.

Author information

Authors and Affiliations

Graduate School of Business Sciences, University of Tsukuba, 3-29-1, Otsuka, Bunkyo-ku, Tokyo, 112-0012, Japan

Koken Ozaki


Corresponding author

Correspondence to Koken Ozaki .

Ethics declarations

Financial interests.

The first author of this paper has a concurrent employment agreement with the funding agency, I-Bridge Corporation, starting in 2022. In addition, a patent application related to this research has been filed (Japan Patent JP2023-28997A).

Conflicts of interest/competing interests (include appropriate disclosures)

The first author of this paper has a concurrent employment agreement with the funding agency, I-Bridge Corporation, starting in 2022. In addition, a patent application related to this research has been filed (Japan Patent JP2023-28997A). JP2023-28997A is a patent granted only in Japan and cannot be used for commercial purposes in Japan, but can be used for academic purposes both in Japan and outside Japan. Commercial use outside Japan is also possible.

Ethics approval

The questionnaire and methodology for this study was approved by the Human Research Ethics committee of the University of Tsukuba Institute of Business Sciences (ethics approval number: Survey ID 1: Business 2021-2, ID 2: Business 2021-3, ID 3: Business 30-10, ID 4: Business 2020-13, ID 5: Business 2020-5, ID 6: Business 2020-6, ID 7: Business 2020-4, ID 8: Business 2020-8, ID 9: Business 2020-2, ID 10: Business 2020-3, ID 11: Business 2020-12, ID 12: Business 2020-3, ID 13: Business 2020-11, ID 14: Business 2020-10, ID 15: Business 2020-1, ID 16: Business 2020-9).

Consent to participate

At the time of the survey, we confirmed that respondents had given their consent to participate in the study by taking the survey.

Consent for publication

All respondents were informed prior to the survey that the results of the survey may be published as a paper, and that by responding to the survey, they were consenting to the publication of the paper.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Practices Statement

Pre-Registered Surveys:

No pre-registered surveys were included in this study.

Example data and codes

Purpose of appendix and example data and R codes

In this Appendix, sample data and sample R codes are presented. Three sample data sets, all artificially generated, are available; each has a sample size of 500. They are called data1, data2, and data3. data1 and data2 are combined as the training data, and data3 is used as the test data. The sample data and sample R codes are available at https://osf.io/dx2mf .

All three surveys include ID in the first column, response time (restime) in the second column, and whether the respondent correctly responded to the DQS (= 0) or not (= 1) in the last column. All surveys also include Likert scales; the 12 items used to calculate LS and other predictors are Q78 to Q89 in data1, Q86 to Q97 in data2, and Q73 to Q84 in data3. The Likert scale used is not limited to 12 items and can differ between surveys. Because the data are artificial, there are no specific item contents for each survey instrument.

R code calculating_predictors.R is used to read the data and extract the variables necessary to develop the inattentive respondent detection model from the data. The order of the extracted variables is ID, time, time_m, LS, AC, MAC, R2C, R3C, absdevi, maha, maha_p, maha2, maha2_p, nitems, DQS. Line 8, datalikert<-c("Q78", "Q89"), is for data1, so if you want to run it for data2, you need to read data2 on line 4 and then datalikert<-c("Q86", "Q97"). The same is true for data3. If you apply this R code to your own data, please change this part of the code. In this example code, the extracted data set is data1_analysis for data1, data2_analysis for data2, and data3_analysis for data3.

R code analysis.R combines data1_analysis and data2_analysis, develops a machine learning model for inattentive respondent detection using the combined data, and applies the model to data3_analysis. If the number of data sets is not three, change this R code accordingly. The outputs are accuracy, recall, precision, and balanced accuracy when applied to data3_analysis.

The minimum number of inattentive and non-inattentive respondents in data1_analysis and data2_analysis (which is 211 in the sample data) is obtained, and then inattentive and non-inattentive respondents from data1_analysis and data2_analysis are randomly selected to achieve the minimum for each. In this way, the numbers of inattentive and non-inattentive respondents are the same. This is expected to improve the prediction accuracy.

Accuracy is obtained at line 84, recall at line 86, precision at line 89, balanced accuracy at line 92, and specificity at line 96.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Ozaki, K. Detecting inattentive respondents by machine learning: A generic technique that substitutes for the directed questions scale and compensates for its shortcomings. Behav Res (2024). https://doi.org/10.3758/s13428-024-02407-2


Accepted : 18 March 2024

Published : 08 April 2024

DOI : https://doi.org/10.3758/s13428-024-02407-2


  • Inattentive respondents
  • Machine learning
  • The directed questions scale
  • Web surveys


Published on 11.4.2024 in Vol 26 (2024)

Patients’ Experiences With Digitalization in the Health Care System: Qualitative Interview Study

Authors of this article:


Original Paper

  • Christian Gybel Jensen 1 * , MA   ; 
  • Frederik Gybel Jensen 1 * , MA   ; 
  • Mia Ingerslev Loft 1, 2 * , MSc, PhD  

1 Department of Neurology, Rigshospitalet, Copenhagen, Denmark

2 Institute for People and Technology, Roskilde University, Roskilde, Denmark

*all authors contributed equally

Corresponding Author:

Mia Ingerslev Loft, MSc, PhD

Department of Neurology

Rigshospitalet

Inge Lehmanns Vej 8

Phone: +45 35457076

Email: [email protected]

Background: The digitalization of public and health sectors worldwide is fundamentally changing health systems. With the implementation of digital health services in health institutions, a focus on digital health literacy and the use of digital health services have become more evident. In Denmark, public institutions use digital tools for different purposes, aiming to create a universal public digital sector for everyone. However, this digitalization risks reducing equity in health and further marginalizing citizens who are disadvantaged. Therefore, more knowledge is needed regarding patients’ digital practices and experiences with digital health services.

Objective: This study aims to examine digital practices and experiences with public digital health services and digital tools from the perspective of patients in the neurology field and address the following research questions: (1) How do patients use digital services and digital tools? (2) How do they experience them?

Methods: We used a qualitative design with a hermeneutic approach. We conducted 31 semistructured interviews with patients who were hospitalized or formerly hospitalized at the department of neurology in a hospital in Denmark. The interviews were audio recorded and subsequently transcribed. The text from each transcribed interview was analyzed using manifest content analysis.

Results: The analysis provided insights into 4 different categories regarding digital practices and experiences of using digital tools and services in health care systems: social resources as a digital lifeline, possessing the necessary capabilities, big feelings as facilitators or barriers, and life without digital tools. Our findings show that digital tools were experienced differently, and specific conditions were important for the possibility of engaging in digital practices, including having access to social resources; possessing physical, cognitive, and communicative capabilities; and feeling motivated, secure, and comfortable. These prerequisites were necessary for participants to have positive experiences using digital tools in the health care system. Those who did not have these prerequisites experienced challenges and, in some cases, felt left out.

Conclusions: Experiences with digital practices and digital health services are complex and multifaceted. Engagement in digital practices for the examined population requires access to continuous assistance from their social network. If patients do not meet requirements, digital health services can be experienced as exclusionary and a source of concern. Physical, cognitive, and communicative difficulties might make it impossible to use digital tools or create more challenges. To ensure that digitalization does not create inequities in health, it is necessary for developers and institutions to be aware of the differences in digital health literacy, focus on simplifying communication with patients and next of kin, and find flexible solutions for citizens who are disadvantaged.

Introduction

In 2022, the fourth most googled question in Denmark was, “Why does MitID not work?” [ 1 ]. MitID (My ID) is a digital access tool that Danes use to enter several different private and public digital services, from bank accounts to mail from their municipality or the state. MitID is a part of many Danish citizens’ everyday lives because the public sector in Denmark is digitalized in many areas. In recent decades, digitalization has changed how governments and people interact and has demonstrated the potential to change the core functions of public sectors and delivery of public policies and services [ 2 ]. When public sectors worldwide become increasingly digitalized, this transformation extends to the public health sectors as well, and some studies argue that we are moving toward a “digital public health era” that is already impacting the health systems and will fundamentally change the future of health systems [ 3 ]. While health systems are becoming more digitalized, it is important that both patients and digitalized systems adapt to changes in accordance with each other. Digital practices of people can be understood as what people do with and through digital technologies and how people relate to technology [ 4 ]. Therefore, it is relevant to investigate digital practices and how patients perceive and experience their own use of digital tools and services, especially in relation to existing digital health services. In our study, we highlight a broad perspective on experiences with digital practices and particularly add insight into the challenges with digital practices faced by patients who have acute or chronic illness, with some of them also experiencing physical, communicative, or cognitive difficulties.

An international Organization for Economic Cooperation and Development report indicates that countries are digitalized to different extents and in different ways; however, this does not mean that countries do not share common challenges and insights into the implementation of digital services [ 2 ].

In its global Digital Government Index, Denmark is presented as one of the leading countries when it comes to public digitalization [ 2 ]. Recent statistics indicate that approximately 97% of Danish families have access to the internet at home [ 5 ]. The Danish health sector already offers many different digital services, including web-based delivery of medicine, e-consultations, patient-related outcome questionnaires, and access to one’s own health record and test results through “Sundhed” [ 6 ] (the national health portal) and “Sundhedsjournalen” (the electronic patient record), or through the apps “Medicinkortet” (the shared medication record), “Minlæge” (My Doctor, which includes, eg, communication with the general practitioner), and “MinSP” (My Health Platform, which includes, eg, communication with health care staff in hospitals) [ 6 - 8 ].

The Danish Digital Health Strategy from 2018 aims to create a coherent and user-friendly digital public sector for everyone [ 9 ], but statistics indicate that certain groups in society are not as digitalized as others. In particular, the older population uses digital services the least, with 5% of people aged 65 to 75 years and 18% of those aged 75 to 89 years having never used the internet in 2020 [ 5 ]. In parts of the literature, it has been problematized how the digitalization of the welfare state is related to the marginalization of older citizens who are socially disadvantaged [ 10 ]. However, statistics also indicate that the probability of using digital tools increases significantly as a person’s experience of using digital tools increases, regardless of their age or education level [ 5 ].

Understanding the digital practices of patients is important because they can use digital tools to engage with the health system and follow their own health course. Researching experiences with digital practices can be a way to better understand potential possibilities and barriers when patients use digital health services. With patients becoming more involved in their own health course and treatment, the importance of patients’ health literacy is being increasingly recognized [ 11 ]. The World Health Organization defines health literacy as the “achievement of a level of knowledge, personal skills and confidence to take action to improve personal and community health by changing personal lifestyles and living conditions” [ 12 ]. Furthermore, health literacy can be described as “a person’s knowledge and competencies to meet complex demands of health in modern society,” and it is viewed as a critical step toward patient empowerment [ 11 , 12 ]. In a digitalized health care system, this also includes the knowledge, capabilities, and resources that individuals require to use and benefit from eHealth services, that is, “digital health literacy (eHealth literacy)” [ 13 ]. An eHealth literacy framework created by Norgaard et al [ 13 ] identified different aspects, for example, the ability to process information and to engage actively with digital services, as important facets of digital health literacy. This argument is supported by studies demonstrating that patients with cognitive and communicative challenges experience barriers to the use of digital tools and require different approaches in the design of digital solutions in the health sector [ 14 , 15 ]. Access to digital services and digital literacy are becoming increasingly important determinants of health, as people who have digital literacy and access to digital services are better able to improve their health and become involved in their own health course [ 16 ].

The need for a better understanding of eHealth literacy and patients’ capabilities to meet public digital services’ demands as well as engage in their own health calls for a deeper investigation into digital practices and the use of digital tools and services from the perspective of patients with varying digital capabilities. Important focus areas to better understand digital practices and related challenges have already been highlighted in various studies. They indicate that social support, assessment of value in digital services, and systemic assessment of digital capabilities are important in the use and implementation of digital tools, and they call for better insight into complex experiences with digital services [ 13 , 17 , 18 ]. Therefore, we aimed to examine digital practices and experiences with public digital health services and digital tools from the perspective of patients, addressing the following research questions: how do patients use digital services and digital tools, and how do they experience them?

We aimed to investigate digital practices and experiences with digital health services and digital tools; therefore, we used a qualitative design and adopted a hermeneutic approach as the point of departure, which means including preexisting knowledge of digital practices but also providing room for new comprehension [ 19 ]. Our interpretive approach is underpinned by the philosophical hermeneutic approach by Gadamer et al [ 19 ], in which they described the interpretation process as a “hermeneutic circle,” where the researcher enters the interpretation process with an open mind and historical awareness of a phenomenon (preknowledge). We conducted semistructured interviews using an interview guide. This study followed the COREQ (Consolidated Criteria for Reporting Qualitative Research) checklist [ 20 ].

Setting and Participants

To gain a broad understanding of experiences with public digital health services, a purposive sampling strategy was used. All 31 participants were hospitalized or formerly hospitalized patients in a large neurological department in the capital of Denmark ( Table 1 ). We assessed whether including patients from the neurological field would give us a broad insight into the experiences of digital practices from different perspectives. The department consisted of, among others, 8 inpatient units covering, for example, acute neurology and stroke units, from which the patients were recruited. Patients admitted to a neurological department can have both acute and transient neurological diseases, such as infections in the brain, stroke, or blood clot in the brain from which they can recover completely or have persistent physical and mental difficulties, or experience chronic neurological and progressive disorders such as Parkinson disease and dementia. Some patients hospitalized in neurological care will have communicative and cognitive difficulties because of their neurological disorders. Nursing staff from the respective units helped the researchers (CGJ, FGJ, and MIL) identify patients who differed in terms of gender, age, and severity of neurological illness. Some patients (6/31, 19%) had language difficulties; however, a speech therapist assessed them as suitable participants. We excluded patients with severe cognitive difficulties and those who were not able to speak the Danish language. Including patients from the field of neurology provided an opportunity to study the experience of digital health practice from various perspectives. Hence, the sampling strategy enabled the identification and selection of information-rich participants relevant to this study [ 21 ], which is the aim of qualitative research. The participants were invited to participate by either the first (CGJ) or last author (MIL), and all invited participants (31/31, 100%) chose to participate.

All 31 participants were aged between 40 and 99 years, with an average age of 71.75 years ( Table 1 ). Of the 31 participants, 10 (32%) had physical disabilities or cognitive or communicative difficulties due to sequelae of neurological illness or other physical conditions.

Data Collection

The 31 patient interviews were conducted over a 2-month period between September and November 2022. Of the 31 patients, 20 (65%) were interviewed face-to-face at the hospital in their patient room upon admission and 11 (35%) were interviewed on the phone after being discharged. The interviews had a mean length of 20.48 minutes.

We developed a semistructured interview guide ( Table 2 ). The interview questions were developed based on the research aim, findings from our preliminary covering of literature in the field presented in the Introduction section, and identified gaps that we needed to elaborate on to be able to answer our research question [ 22 ]. The semistructured interview guide was designed to support the development of a trusting relationship and ensure the relevance of the interviews’ content [ 22 ]. The questions served as a prompt for the participants and were further supported by questions such as “please tell me more” and “please elaborate” throughout the interview, both to heighten the level of detail and to verify our understanding of the issues at play. If the participant had cognitive or communicative difficulties, communication was supported using a method called Supported Communication for Adults with Aphasia [ 23 ] during the interview.

The interviews were performed by all authors (CGJ, FGJ, and MIL individually), who were skilled in conducting interviews and qualitative research. The interviewers are not part of daily clinical practice but are employed in the department of neurology from where the patients were recruited. All interviews were audio recorded and subsequently transcribed verbatim by all 3 authors individually.

PRO: patient-related outcome.

Data Analysis

The text from each transcribed interview was analyzed using manifest content analysis, as described by Graneheim and Lundman [ 24 ]. Content analysis is a method of analyzing written, verbal, and visual communication in a systematic way [ 25 ]. Qualitative content analysis is a structured but nonlinear process that requires researchers to move back and forth between the original text and parts of the text during the analysis. Manifest analysis is the descriptive level at which the surface structure of the text central to the phenomenon and the research question is described. The analysis was conducted as a collaborative effort between the first (CGJ) and last authors (MIL); hence, in this inductive circular process, to achieve consistency in the interpretation of the text, there was continued discussion and reflection between the researchers. The transcriptions were initially read several times to gain a sense of the whole context, and we analyzed each interview. The text was initially divided into domains that reflected the lowest degree of interpretation, as a rough structure was created in which the text had a specific area in common. The structure roughly reflected the interview guide’s themes, as guided by Graneheim and Lundman [ 24 ]. Thereafter, the text was divided into meaning units, condensed into text-near descriptions, and then abstracted and labeled further with codes. The codes were categorized based on similarities and differences. During this process, we discussed the findings to reach a consensus on the content, resulting in the final 4 categories presented in this paper.

Ethical Considerations

The interviewees received oral and written information about the study and its voluntary nature before the interviews. Written informed consent was obtained from all participants. Participants were able to opt out of the study at any time. Data were anonymized and stored electronically on locked and secured servers. The Ethics Committee of the Capital Region of Denmark was contacted before the start of the study. The study was approved by the ethics committee and registered with the Danish Data Protection Agency (number P2021-839). Furthermore, the ethical principles of the Declaration of Helsinki were followed for this study.

The analysis provided insights into 4 different categories regarding digital practices and experiences of using digital tools and services in health care systems: social resources as a digital lifeline, possessing the necessary capabilities, big feelings as facilitators or barriers, and life without digital tools.

Social Resources as a Digital Lifeline

Throughout the analysis, it became evident that access to both material and social resources was of great importance when using digital tools. Most participants already possessed and had easy access to a computer, smartphone, or tablet. The few participants who did not own the necessary digital tools told us that they did not have the skills needed to use these tools. For these participants, the lack of material resources was tied particularly to a lack of knowledge and know-how, as they expressed that they would not know where to start after buying a computer—how to set it up, connect it to the internet, and use its many systems.

However, possessing the necessary material resources did not mean that the participants possessed the knowledge and skill to use digital tools. Furthermore, access to material resources was also a question of having access to assistance when needed. Some participants who had access to a computer, smartphone, and tablet and knew how to use these tools still had to obtain help when setting up hardware, updating software, or getting a new device. These participants were confident in their own ability to use digital devices but also relied on family, friends, and neighbors in their everyday use of these tools. Certain participants were explicitly aware of their own use of social resources when expressing their thoughts on digital services in health care systems:

I think it is a blessing and a curse. I think it is both. I would say that if I did not have someone around me in my family who was almost born into the digital world, then I think I would be in trouble. But I feel sorry for those who do not have that opportunity, and I know quite a few who do not. They get upset, and it’s really frustrating. [Woman, age 82 years]

The participants’ use of social resources indicates that learning skills and using digital tools are not solely individual tasks but rather continuously involve engagement with other people, particularly whenever a new unforeseen problem arises or when the participants want a deeper understanding of the tools they are using:

If tomorrow I have to get a new ipad...and it was like that when I got this one, then I had to get XXX to come and help me move stuff and he was sweet to help with all the practical stuff. I think I would have cursed a couple of times (if he hadn’t been there), but he is always helpful, but at the same time he is also pedagogic so I hope that next time he showed me something I will be able to do it. [Man, age 71 years]

For some participants, obtaining assistance from a more experienced family member was experienced as an opportunity to learn, whereas for other participants, their use of public digital services was even tied directly to assistance from a spouse or family member:

My wife, she has access to mine, so if something comes up, she can just go in and read, and we can talk about it afterwards what (it is). [Man, age 85 years]

The participants used social resources to navigate digital systems and understand and interpret communication from the health care system through digital devices. Another example of this was the participants who needed assistance to find, answer, and understand questionnaires from the health care department. Furthermore, social resources were viewed as a support system that made participants feel more comfortable and safer when operating digital tools. The social resources were particularly important when overcoming unforeseen and new challenges and when learning new skills related to the use of digital tools. Participants with physical, cognitive, and communicative challenges also explained how social resources were of great importance in their ability to use digital tools.

Possessing the Necessary Capabilities

The findings indicated that possessing the desire and knowing how to use digital tools are not always enough to engage with digital services successfully. Different health issues can carry consequences for motor skills and mobility. Some of these consequences were visibly affecting how our participants interacted with digital devices, and these challenges were somewhat easy to discover. However, our participants revealed hidden challenges that posed difficulties. In some specific cases, cognitive and communicative inabilities can make it difficult to use digital tools, and this might not always be clear until the individual tries to use a device’s more complex functions. An example of this is that some participants found it easy to turn on a computer and use it to write but difficult to go through security measures on digital services or interpret and understand digital language. Remembering passwords and logging on to systems created challenges, particularly for those experiencing health issues that directly affect memory and cognitive abilities, who expressed concerns about what they were able to do through digital tools:

I think it is very challenging because I would like to use it how I used to before my stroke; (I) wish that everything (digital skills) was transferred, but it just isn’t. [Man, age 80 years]

Despite these challenges, the participants demonstrated great interest in using digital tools, particularly regarding health care services and their own well-being. However, sometimes, the challenges that they experienced could not be conquered merely by motivation and good intentions. Another aspect of these challenges was the amount of extra time and energy that the participants had to spend on digital services. A patient diagnosed with Parkinson disease described how her symptoms created challenges that changed her digital practices:

Well it could for example be something like following a line in the device. And right now it is very limited what I can do with this (iPhone). Now I am almost only using it as a phone, and that is a little sad because I also like to text and stuff, but I also find that difficult (...) I think it is difficult to get an overview. [Woman, age 62 years]

Some participants said that after they were discharged from the hospital, they did not use the computer anymore because it was too difficult and too exhausting, which contributed to them giving up. Using digital tools already demanded a certain amount of concentration and awareness, and some diseases and health conditions affected these abilities further.

Big Feelings as Facilitators or Barriers

The findings revealed a wide range of digital practices in which digital tools were used as a communication device, as an entertainment device, and as a practical and informative tool for ordering medicine, booking consultations, asking health-related questions, or receiving email from public institutions. Despite these different digital practices, repeating patterns and arguments appeared when the participants were asked why they learned to use digital tools or wanted to improve their skills. A repeating argument was that they wanted to “follow the times,” or as a participant who was still not satisfied with her digital skills stated:

We should not go against the future. [Woman, age 89 years]

The participants expressed a positive view of the technological developments and possibilities that digital devices offered, and they wanted to improve their knowledge and skills related to digital practice. For some participants, this was challenging, and they expressed frustration over how technological developments “moved too fast,” but some participants interpreted these challenges as a way to “keep their mind sharp.”

Another recurring pattern was that the participants expressed great interest in using digital services related to the health care system and other public institutions. The importance of being able to navigate digital services was explicitly clear when talking about finding test answers, written electronic messages, and questionnaires from the hospital or other public institutions. Keeping up with developments, communicating with public institutions, and taking an interest in their own health and well-being were described as good reasons to learn to use digital tools.

However, other aspects also affected these learning facilitators. Some participants felt alienated while using digital tools and described the practice as something related to feelings of anxiety, fear, and stupidity as well as something that demanded “a certain amount of courage.” Some participants felt frustrated with the digital challenges they experienced, especially when the challenges were difficult to overcome because of their physical conditions:

I get sad because of it (digital challenges) and I get very frustrated and it takes a lot of time because I have difficulty seeing when I look away from the computer and have to turn back again to find out where I was and continue there (...) It pains me that I have to use so much time on it. [Man, age 71 years]

Fear of making mistakes, particularly when communicating with public institutions, for example, the health care system, was a common pattern. Another pattern was the fear of misinterpreting the sender and the need to ensure that the written electronic messages were actually from the described sender. Some participants felt that they were forced to learn about digital tools because they cared a lot about the services. Furthermore, fears of digital services replacing human interaction were a recurring concern among the participants. Despite these initial and recurring feelings, some participants learned how to navigate the digital services that they deemed relevant. Another recurring pattern in this learning process was repetition, the practice of digital skills, and consistent assistance from other people. One participant expressed the need to use the services often to remember the necessary skills:

Now I can figure it out because now I’ve had it shown 10 times. But then three months still pass... and then I think...how was it now? Then I get sweat on my forehead (feel nervous) and think; I’m not an idiot. [Woman, age 82 years]

For some participants, learning how to use digital tools demanded time and patience, as challenges had to be overcome more than once because they reappeared until the use of digital tools was more automatized into their everyday lives. Using digital tools and health services was viewed as easier and less stressful when part of everyday routines.

Life Without Digital Tools: Not a Free Choice

Even though some participants used digital tools daily, other participants expressed that it was “too late for them.” These participants did not view it as a free choice but as something they had to accept that they could not do. They wished that they could have learned it earlier in life but did not view it as a possibility in the future. Furthermore, they saw potential in digital services, including digital health care services, but they did not know exactly what services they were missing out on. Despite this lack of knowledge, they still felt sad about the position they were in. One participant expressed what she thought regarding the use of digital tools in public institutions:

Well, I feel alright about it, but it is very, very difficult for those of us who do not have it. Sometimes you can feel left out—outside of society. And when you do not have one of those (computers)...A reference is always made to w and w (www.) and then you can read on. But you cannot do that. [Woman, age 94 years]

The feeling of being left out of society was consistent among the participants who did not use digital tools. To them, digital systems seemed to provide unfair treatment based on something outside of their own power. Participants who were heavily affected by their medical conditions and could not use digital services also felt left out because they saw the advantages of using digital tools. Furthermore, a participant described the feelings connected to the use of digital tools in public institutions:

It is more annoying that it does not seem to work out in my favour. [Woman, age 62 years]

These statements indicated that it is possible for individuals to want to use digital tools and simultaneously find them too challenging. These participants were aware that there are consequences of not using digital tools, and this saddened them, as they felt they were not receiving the same treatment as other people in society and the health care system.

Principal Findings

The insights from our findings demonstrated that our participants had different digital practices and different experiences with digital tools and services; however, the analysis also highlighted patterns related to how digital services and tools were used. Specific conditions were important for the possibility of digital practice, including having access to social resources; possessing the necessary capabilities; and feeling motivated, secure, and comfortable. These prerequisites were necessary to have positive experiences using digital tools in the health care system, although some participants who lived up to these prerequisites were still skeptical toward digital solutions. Others who did not live up to these prerequisites experienced challenges, and even though they were aware of opportunities, this awareness made them feel left out. A few participants even viewed the digital tools as a threat to their participation in society. This supports the notion of Norgaard et al [ 13 ] that the attention paid to digital capability demands from eHealth systems is very important. Furthermore, our findings supported the argument of Hjelholt and Papazu [ 17 ] that it is important to better understand experiences related to digital services. In our study, we accommodate this request and bring forth a broad perspective on experiences with digital practices; we particularly add insight into the challenges with digital practices for patients who have acute or chronic illness, with some of them also experiencing physical, communicative, and cognitive difficulties. To our knowledge, there is limited existing literature focusing on digital practices that does not have a limited scope, for example, a focus on perspectives on eHealth literacy in the use of apps [ 26 ] or intervention studies with a focus on experiences with digital solutions, for example, telemedicine during the COVID-19 pandemic [ 27 ]. As mentioned by Hjelholt et al [ 10 ], certain citizens are dependent on their own social networks in the process of using and learning digital tools. Rasi et al [ 28 ] and Airola et al [ 29 ] argued that digital health literacy is situated and should include the capabilities of the individual’s social network. Our findings support these arguments that access to social resources is an important condition; however, the findings also highlight that these resources can be particularly crucial in the use of digital health services, for example, when interpreting and understanding digital and written electronic messages related to one’s own health course or when dealing with physical, cognitive, and communicative disadvantages. Therefore, we argue that awareness of these disadvantages is important if we want to understand patients’ digital capabilities, and the inclusion of the next of kin can be instrumental in unveiling challenges that are unknown and not easily visible or in reaching patients with digital challenges through digital means.

Studies by Kayser et al [ 30 ] and Karnoe et al [ 31 ] indicated that patients’ abilities to interpret and understand digital health–related services and their benefits are important for the successful implementation of eHealth services, an argument that our findings support. Health literacy in both digital and physical contexts is important if we want to understand how to better design and implement services. Our participants’ statements support the argument that communication through digital means cannot be viewed as similar to face-to-face communication and that an emphasis on digital health literacy demonstrates how health systems are demanding different capabilities from patients [ 13 ]. We argue that it is important to communicate the purposes of digital services so that both the patient and their next of kin know why they participate and how they can benefit. Therefore, it is important to make it as clear as possible that digital health services can benefit the patient and that these services are developed to support information, communication, and dialogue between patients and health professionals. However, our findings suggest that even after interpreting and understanding the purposes of digital health services, some patients may still experience challenges when using digital tools.

Therefore, it is important to understand how and why patients learn digital skills, particularly because both experience with digital devices and estimation of the value of digital tools have been highlighted as key factors for digital practices [ 5 , 18 ]. Our findings indicate that a combination of these factors is important, as recognizing the value of digital tools was not enough to facilitate the necessary learning process for some of our participants. Instead, our participants described the use of digital tools as complex and continuous processes in which automation of skills, assistance from others, and time to relearn forgotten knowledge were necessary and important facilitators for learning and understanding digital tools as well as becoming more comfortable and confident in the use of digital health services. This was particularly important, as it was more encouraging for our participants to learn digital tools when they felt secure, instead of feeling afraid and anxious, a point that Bailey et al [ 18 ] also highlighted. The value of digital solutions and the will to learn were greater when challenges were viewed as something to overcome and learn from instead of something that created a feeling of being stupid. This calls for attention on how to simplify and explain digital tools and services so that users do not feel alienated. Our findings also support the argument that digital health literacy should take into account emotional well-being related to digital practice [ 32 ].

The various perspectives that our participants provided regarding the use of digital tools in the health care system indicate that patients are affected by the use of digital health services and their own capabilities to use digital tools. Murray et al [ 33 ] argued that the use of digital tools in health sectors has the potential to improve health and health delivery by improving efficacy, efficiency, accessibility, safety, and personalization, and our participants also highlighted these positive aspects. However, different studies found that some patients, particularly older adults considered socially vulnerable, have lower digital health literacy [ 10 , 34 , 35 ], which is an important determinant of health and may widen disparities and inequity in health care [ 16 ]. Studies on older adult populations’ adaptation to information and communication technology show that engaging with this technology can be limited by the usability of technology, feelings of anxiety and concern, self-perception of technology use, and the need for assistance and inclusive design [ 36 ]. Our participants’ experiences with digital practices support the importance of these focus areas, especially when primarily older patients are admitted to hospitals. Furthermore, our findings indicate that some older patients who used to view themselves as being engaged in their own health care felt more distanced from the health care system because of digital services, and some who did not have the capabilities to use digital tools felt that they were treated differently compared to the rest of society. They did not necessarily view themselves as vulnerable but felt vulnerable in the specific experience of trying to use digital services because they wished that they were more capable. Moreover, this was the case for patients with physical and cognitive difficulties, as they were not necessarily aware of the challenges before experiencing them. Drawing on the phenomenological and feministic approach by Ahmed [ 37 ], these challenges that make patients feel vulnerable are not necessarily visible to others but can instead be viewed as invisible institutional “walls” that do not present themselves before the patient runs into them. Some participants had to experience how their physical, cognitive, or communicative difficulties affected their digital practice to realize that they were not as digitally capable as they once were or as others in society. Furthermore, viewed from this perspective, our findings could be used to argue that digital capabilities should be viewed as a privilege tied to users’ physical bodies and that digital services in the health care system are indirectly making patients without this privilege vulnerable. This calls for more attention to the inequities that digital tools and services create in health care systems and awareness that those who do not use digital tools are not necessarily indifferent about the consequences. Particularly, in a context such as the Danish one, in which the digital strategy is to create an intertwined and user-friendly public digital sector for everyone, it needs to be understood that patients have different digital capabilities and needs. Although some have not yet had a challenging experience that made them feel vulnerable, others are very aware that they receive different treatment and feel that they are on their own or that the rest of the society does not care about them. 
Inequities in digital health care, such as these, can and should be mitigated or prevented, and our investigation into the experiences with digital practices can help to show that we are creating standards and infrastructures that deliberately exclude the perspectives of those who are most in need of the services offered by the digital health care system [ 8 ]. Therefore, our findings support the notions that flexibility is important in the implementation of universal public digital services [ 17 ]; that it is important to adjust systems in accordance with patients’ eHealth literacy and not only improve the capabilities of individuals [ 38 ]; and that the development and improvement of digital health literacy are not solely an individual responsibility but are also tied to ways in which institutions organize, design, and implement digital tools and services [ 39 ].

Limitations

This qualitative study provided novel insights into experiences with public digital health services from the perspective of patients in the Danish context, enabling a deeper understanding of how digital health services and digital tools are experienced and used. This helps build a solid foundation for future work on digital health literacy and digital health interventions. However, this study has some limitations. First, the study was conducted in a country where digitalization is progressing quickly and where people are therefore accustomed to this pace; readers should keep this context in mind. Second, the study included patients with different neurological conditions; some of their digital challenges were caused or worsened by these conditions and are, therefore, not applicable to all patients in the health system. However, the findings also provided insights into the patients’ digital practices before the onset of their conditions, as well as into challenges shared by patients that were not connected to neurological conditions. Third, the study was broad, and although a large number of informants was included, from a qualitative research perspective, we recommend additional research in this field to develop interventions that target digital health literacy and the use of digital health services.

Conclusions

Experiences with digital tools and digital health services are complex and multifaceted. The advantages in communication, finding information, or navigating through one’s own health course work as facilitators for engaging with digital tools and digital health services. However, this is not enough on its own. Furthermore, feeling secure and motivated and having time to relearn and practice skills are important facilitators. Engagement in digital practices for the examined population requires access to continuous assistance from their social network. If patients do not meet requirements, digital health services can be experienced as exclusionary and a source of concern. Physical, cognitive, and communicative difficulties might make it impossible to use digital tools or create more challenges that require assistance. Digitalization of the health care system means that patients do not have the choice to opt out of using digital services without having consequences, resulting in them receiving a different treatment than others. To ensure digitalization does not create inequities in health, it is necessary for developers and the health institutions that create, design, and implement digital services to be aware of differences in digital health literacy and to focus on simplifying communication with patients and next of kin through and about digital services. It is important to focus on helping individuals meet the necessary conditions and finding flexible solutions for those who do not have the same privileges as others if the public digital sector is to work for everyone.

Acknowledgments

The authors would like to thank all the people who gave their time to be interviewed for the study, the clinical nurse specialists who facilitated interviewing patients, and the other nurses on shift who assisted in recruiting participants.

Conflicts of Interest

None declared.

  • Year in search 2022. Google Trends. URL: https://trends.google.com/trends/yis/2022/DK/ [accessed 2024-04-02]
  • Digital government index: 2019. Organisation for Economic Cooperation and Development. URL: https://www.oecd-ilibrary.org/content/paper/4de9f5bb-en [accessed 2024-04-02]
  • Azzopardi-Muscat N, Sørensen K. Towards an equitable digital public health era: promoting equity through a health literacy perspective. Eur J Public Health. Oct 01, 2019;29(Supplement_3):13-17. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Digital practices. Umeå University. URL: https://www.umu.se/en/humlab/research/digital-practice/ [accessed 2024-04-02]
  • It-anvendelse i befolkningen 2020. Danmarks Statistik. URL: https://www.dst.dk/da/Statistik/nyheder-analyser-publ/Publikationer/VisPub?cid=29450 [accessed 2024-04-02]
  • Sundhed.dk homepage. Sundhed.dk. URL: https://www.sundhed.dk/borger/ [accessed 2024-04-02]
  • Nøhr C, Bertelsen P, Vingtoft S, Andersen SK. Digitalisering af Det Danske Sundhedsvæsen. Odense, Denmark. Syddansk Universitetsforlag; 2019.
  • Eriksen J, Ebbesen M, Eriksen KT, Hjermitslev C, Knudsen C, Bertelsen P, et al. Equity in digital healthcare - the case of Denmark. Front Public Health. Sep 6, 2023;11:1225222. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Digital health strategy. Sundhedsdatastyrelsen. URL: https://sundhedsdatastyrelsen.dk/da/english/digital_health_solutions/digital_health_strategy [accessed 2024-04-02]
  • Hjelholt M, Schou J, Bojsen LB, Yndigegn SL. Digital marginalisering af udsatte ældre: arbejdsrapport 2. IT-Universitetet i København. 2018. URL: https://egv.dk/images/Projekter/Projekter_2018/EGV_arbejdsrapport_2.pdf [accessed 2024-04-02]
  • Sørensen K, Van den Broucke S, Fullam J, Doyle G, Pelikan J, Slonska Z, et al. Health literacy and public health: a systematic review and integration of definitions and models. BMC Public Health. Jan 25, 2012;12(1):80. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Improving health literacy. World Health Organization. URL: https://www.who.int/activities/improving-health-literacy [accessed 2024-04-02]
  • Norgaard O, Furstrand D, Klokker L, Karnoe KA, Batterham R, Kayser L, et al. The e-health literacy framework: a conceptual framework for characterizing e-health users and their interaction with e-health systems. Knowl Manag E Learn. 2015;7(4). [ CrossRef ]
  • Kramer JM, Schwartz A. Reducing barriers to patient-reported outcome measures for people with cognitive impairments. Arch Phys Med Rehabil. Aug 2017;98(8):1705-1715. [ CrossRef ] [ Medline ]
  • Menger F, Morris J, Salis C. Aphasia in an internet age: wider perspectives on digital inclusion. Aphasiology. 2016;30(2-3):112-132. [ CrossRef ]
  • Richardson S, Lawrence K, Schoenthaler AM, Mann D. A framework for digital health equity. NPJ Digit Med. Aug 18, 2022;5(1):119. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Hjelholt M, Papazu I. "De har fået NemID, men det er ikke nemt for mig” - Digital rum(me)lighed i den danske velfærdsstat. Social Kritik. 2021;2021-2(163). [ FREE Full text ]
  • Bailey C, Sheehan C. Technology, older persons’ perspectives and the anthropological ethnographic lens. Alter. 2009;3(2):96-109. [ CrossRef ]
  • Gadamer HG, Weinsheimer HG, Marshall DG. Truth and Method. New York, NY. Crossroad Publishing Company; 1991.
  • Tong A, Sainsbury P, Craig J. Consolidated criteria for reporting qualitative research (COREQ): a 32-item checklist for interviews and focus groups. Int J Qual Health Care. Dec 16, 2007;19(6):349-357. [ CrossRef ] [ Medline ]
  • Polit DF, Beck CT. Nursing Research: Generating and Assessing Evidence for Nursing Practice. Philadelphia, PA. Lippincott Williams & Wilkins, Inc; 2012.
  • Kvale S, Brinkmann S. InterViews: Learning the Craft of Qualitative Research Interviewing. Thousand Oaks, CA. SAGE Publications; 2009.
  • Kagan A. Supported conversation for adults with aphasia: methods and resources for training conversation partners. Aphasiology. Sep 1998;12(9):816-830. [ CrossRef ]
  • Graneheim UH, Lundman B. Qualitative content analysis in nursing research: concepts, procedures and measures to achieve trustworthiness. Nurse Educ Today. Feb 2004;24(2):105-112. [ CrossRef ] [ Medline ]
  • Krippendorff K. Content Analysis: An Introduction to Its Methodology. Thousand Oaks, CA. SAGE Publications; 1980.
  • Klösch M, Sari-Kundt F, Reibnitz C, Osterbrink J. Patients' attitudes toward their health literacy and the use of digital apps in health and disease management. Br J Nurs. Nov 25, 2021;30(21):1242-1249. [ CrossRef ] [ Medline ]
  • Datta P, Eiland L, Samson K, Donovan A, Anzalone AJ, McAdam-Marx C. Telemedicine and health access inequalities during the COVID-19 pandemic. J Glob Health. Dec 03, 2022;12:05051. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Rasi P, Lindberg J, Airola E. Older service users’ experiences of learning to use eHealth applications in sparsely populated healthcare settings in Northern Sweden and Finland. Educ Gerontol. Nov 24, 2020;47(1):25-35. [ CrossRef ]
  • Airola E, Rasi P, Outila M. Older people as users and non-users of a video conferencing service for promoting social connectedness and well-being – a case study from Finnish Lapland. Educ Gerontol. Mar 29, 2020;46(5):258-269. [ CrossRef ]
  • Kayser L, Kushniruk A, Osborne RH, Norgaard O, Turner P. Enhancing the effectiveness of consumer-focused health information technology systems through eHealth literacy: a framework for understanding users' needs. JMIR Hum Factors. May 20, 2015;2(1):e9. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Karnoe A, Furstrand D, Christensen KB, Norgaard O, Kayser L. Assessing competencies needed to engage with digital health services: development of the ehealth literacy assessment toolkit. J Med Internet Res. May 10, 2018;20(5):e178. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Nielsen AS, Hanna L, Larsen BF, Appel CW, Osborne RH, Kayser L. Readiness, acceptance and use of digital patient reported outcome in an outpatient clinic. Health Informatics J. Jun 03, 2022;28(2):14604582221106000. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Murray E, Hekler EB, Andersson G, Collins LM, Doherty A, Hollis C, et al. Evaluating digital health interventions: key questions and approaches. Am J Prev Med. Nov 2016;51(5):843-851. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Chesser A, Burke A, Reyes J, Rohrberg T. Navigating the digital divide: a systematic review of eHealth literacy in underserved populations in the United States. Inform Health Soc Care. Feb 24, 2016;41(1):1-19. [ CrossRef ] [ Medline ]
  • Chesser AK, Keene Woods N, Smothers K, Rogers N. Health literacy and older adults: a systematic review. Gerontol Geriatr Med. Mar 15, 2016;2:2333721416630492. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Mitra S, Singh A, Rajendran Deepam S, Asthana MK. Information and communication technology adoption among the older people: a qualitative approach. Health Soc Care Community. Nov 21, 2022;30(6):e6428-e6437. [ CrossRef ] [ Medline ]
  • Ahmed S. How not to do things with words. Wagadu. 2016. URL: https://sites.cortland.edu/wagadu/wp-content/uploads/sites/3/2017/02/v16-how-not-to-do-ahmed.pdf [accessed 2024-04-02]
  • Monkman H, Kushniruk AW. eHealth literacy issues, constructs, models, and methods for health information technology design and evaluation. Knowl Manag E Learn. 2015;7(4). [ CrossRef ]
  • Brørs G, Norman CD, Norekvål TM. Accelerated importance of eHealth literacy in the COVID-19 outbreak and beyond. Eur J Cardiovasc Nurs. Aug 15, 2020;19(6):458-461. [ FREE Full text ] [ CrossRef ] [ Medline ]

Edited by A Mavragani; submitted 14.03.23; peer-reviewed by G Myreteg, J Eriksen, M Siermann; comments to author 18.09.23; revised version received 09.10.23; accepted 27.02.24; published 11.04.24.

©Christian Gybel Jensen, Frederik Gybel Jensen, Mia Ingerslev Loft. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 11.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

Where do you fit in the political typology?

Are you a Faith and Flag Conservative, Progressive Left, or somewhere in between?

Take our quiz to find out which one of our nine political typology groups is your best match, compared with a nationally representative survey of more than 10,000 U.S. adults by Pew Research Center. You may find some of these questions are difficult to answer. That’s OK. In those cases, pick the answer that comes closest to your view, even if it isn’t exactly right.

About Pew Research Center: Pew Research Center is a nonpartisan fact tank that informs the public about the issues, attitudes and trends shaping the world. It conducts public opinion polling, demographic research, media content analysis and other empirical social science research. Pew Research Center does not take policy positions. It is a subsidiary of The Pew Charitable Trusts.

Office of Science Policy

Event Years: 2023

Maximizing the Impact of Public Engagement in Policy Development: Informing the Use of Personal Data in Biomedical Research

Event Date: 11/13/2023

Time: 3-3:30 PM ET

NIH Videocast

The NIH Office of Science Policy (OSP) will host a webinar to discuss its approach to maximizing the impact of public engagement in the policy making process. The webinar will address the recent recommendations of the Novel and Exceptional Technology and Research Advisory Committee (NExTRAC) and OSP’s commitment to engage with individuals and communities when considering policy matters.

https://videocast.nih.gov/watch=53839

Draft Agenda

(Some background documents have been provided in Spanish, as a community conversation was held in Spanish)

Webinar Summary and Community Conversation Summary Feedback
Webinar Summary and Community Conversation Summary Feedback (Spanish)

For general questions:

Email:  [email protected]

ASHG Session: What You Need to Know About the NIH Genomic Data Sharing Policy

Event Date: 11/01/2023

Time: 10-11:30 AM ET

Room 204ABC at the Walter E. Washington Convention Center, Washington, DC

THIS EVENT IS RESERVED FOR PARTICIPANTS ATTENDING THE AMERICAN SOCIETY FOR HUMAN GENETICS (ASHG) ANNUAL CONFERENCE. YOU MUST BE REGISTERED TO ATTEND THE ASHG ANNUAL MEETING TO REGISTER FOR THIS SESSION

This session will give an overview of the NIH Genomic Data Sharing (GDS) Policy, which sets forth expectations that ensure the broad and responsible sharing of large-scale human and non-human genomic data. The GDS Policy applies to all NIH-funded research that generates large-scale human or non-human genomic data, as well as to the use of these data for subsequent research. The ancillary meeting will also provide an in-depth understanding of the GDS Policy's privacy protections when working with human genomic data and will walk attendees through the recent implementation changes made to harmonize with the NIH Data Management and Sharing (DMS) Policy ( NOT-OD-22-198 , NOT-OD-21-013 ). This is a great opportunity for ASHG attendees to meet experts from NIH and discuss GDS Policy questions.

Speakers: Cheryl Jacobs, Ph.D., Team Lead, NIH Genomic Data Sharing Policy, Office of Science Policy and Julia Slutsman, Ph.D., Director, NIH Genomic Data Sharing Policy Implementation, Office of Extramural Research.  Website: https://sharing.nih.gov/

Registration

There is no cost to attend this event.  However, pre-registration is strongly recommended.

For program-related questions:

Contact: Danyelle Winchester Email:  [email protected]  

Novel and Exceptional Technology and Research Advisory Committee Meeting

Event Date: 08/29/2023

Time: 2:00 PM to 4:30 PM ET

The Novel and Exceptional Technology and Research Advisory Committee (NExTRAC) will hold a virtual meeting to discuss the draft report of the Working Group on Data Science and Emerging Technology.  In addition, the meeting will include a discussion of next steps for the Committee ( Federal Register notice ).

Registration is not required to view the webcast.

The materials below contain content that is not fully supported by assistive technology. For accessibility assistance with these files, please contact:  [email protected].

https://videocast.nih.gov/watch=52218

NExTRAC August 29, 2023 Agenda

NIH NExTRAC August 29, 2023 Meeting Minutes

Meeting Materials

Draft Report of the NExTRAC WG on Data Science and Emerging Technology*

*Note: The NExTRAC working group’s report is a draft, pre-decisional document. The NExTRAC publicly discussed and endorsed the draft findings and recommendations with minor modifications to the report at the August 29, 2023 meeting. The final NExTRAC report was transmitted to the Acting NIH Director and can be found here .

Public Comments

Members of the public may request to make an oral public comment or may submit written public comments. To sign up to make an oral public comment, please submit your name, affiliation, and short description of the oral comment to  [email protected] by August 25, 2023. Once all time slots are filled, only written comments will be accepted.

Any interested person may file written comments by e-mailing [email protected] by August 25, 2023. The statement should include the name, address, telephone number and, when applicable, the business or professional affiliation of the interested person. Other than name and contact information, please do not include any personally identifiable information or any information that you do not wish to make public. Proprietary, classified, confidential, or sensitive information should not be included in your comments. Please note that any comments NIH receives may be posted unredacted to the Office of Science Policy website.

Additional Files

Presentation – Draft Report of the Working Group on Data Science and Emerging Technology

Presentation – ENGAGE Charge to the NExTRAC

NIH Workshop on Catalyzing the Development of Novel Alternative Methods

Event Date: 08/21/2023

Time: 9:00 AM to 5:00 PM ET

The National Institutes of Health (NIH) will hold a virtual workshop on August 21, 2023, on approaches, challenges, and opportunities relating to the development of Novel Alternative Methods (NAMs). The workshop will also feature a discussion of incentives and barriers to the successful implementation of NAM technologies.

https://videocast.nih.gov/watch=49776

Meeting Agenda

Registration Information

Pre-registration for viewing the workshop is requested.  There is no registration fee associated with this workshop.  For details on how to register, please see: https://web.cvent.com/event/ca67c4ad-b795-4b11-85f4-97bba96113b0/summary

Supplementary Information

Advisory Committee to the NIH Director Working Group on Catalyzing the Development and Use of Novel Alternative Methods to Advance Biomedical Research                                                                                                                                           

Request for Information (RFI): Catalyzing the Development and Use of Novel Alternative Methods to Advance Biomedical Research (Comments must be received by September 5, 2023)

IMAGES

  1. How to Develop a Strong Research Question

    research method questions

  2. How to Develop a Strong Research Question

    research method questions

  3. Research Questions

    research method questions

  4. 15 Types of Research Methods (2024)

    research method questions

  5. Research Questions

    research method questions

  6. What Is a Research Question? Tips on How to Find Interesting Topics

    research method questions

