
Top 20 MCQs on literature review with answers

MCQs on literature review: The primary purpose of a literature review is to give readers a detailed background of the previous studies on the research topic.

In this blog post, we have published 20 MCQs on Literature Review (Literature Review in Research) with answers.

20 Multiple Choice Questions on Literature Review

1. Literature is a 

Written Record

Published Record

Unpublished Record

All of these

2. Which method of literature review uses a non-statistical approach to present data while also having the features of the systematic method?

Narrative Method

Systematic Method

Meta-Analysis Method of Literature Review

Meta-Synthesis Method of Literature Review

3. Comparisons of non-statistical variables are performed under which method of literature review?

4. Literature review is not similar to

Annotated Bibliography 

5. APA Style, MLA Style, Chicago Manual, Bluebook, and OSCOLA are commonly known as

Citation Manuals

Directories

Abbreviation Manuals

6. Literature collected is reviewed and preferably arranged 

Alphabetically

Chronologically

None of these

7. Literature collected for review includes

Primary and Secondary Sources

Secondary and Tertiary Sources

Primary and Tertiary Sources

8. Literature includes

Previous Studies

Scholarly publications

Research Findings

9. No time frame is set for collecting literature in which of the following methods of compiling reviews?

Traditional Method

10. Which method of literature review is more reliable for drawing conclusions from individual studies for new conceptualizations and interpretations?

11. The main purpose of finalization of research topics and sub-topics is

Collection of Literature

Collection of Questions

Collection of Statistics

Collection of Responses

12. Literature review is basically to bridge the gap between

Newly established facts

Previously established facts

Facts established time to time

Previous to current established facts

13. The last step in writing the literature review is 

Developing a Final Essay

Developing a Coherent Essay

Developing a Collaborated Essay

Developing a Coordinated Essay

14. The primary purpose of literature review is to facilitate detailed background of 

Present Studies

Previous studies

Future Studies

15. Narrative Literature Review method is also known as

Advanced Method

Scientific Method

Traditional Method

16. Which method of literature review starts with formulating research questions?

17. Which method of literature review involves the application of a clinical approach based on a specific subject?

18. Which method of literature review involves timeline-based collection of literature for review?

19. Which method of literature review involves the application of a statistical approach?

20. Which literature review method produces conclusions in numeric/statistical form?


UCL Library Services: Systematic Reviews

Formulating a research question


Systematic reviews address clear and answerable research questions, rather than a general topic or problem of interest. Clarifying the review question leads to specifying what type of studies can best address that question and to setting out criteria for including such studies in the review; these are often called inclusion criteria or eligibility criteria. The criteria could relate to the review topic, the research methods of the studies, specific populations, settings, date limits, geographical areas, types of interventions, or something else.

Six examples of types of question are listed below; each shows a different question that a review might address based on the topic of influenza vaccination. Structuring questions in this way aids thinking about the different types of research that could address each type of question. Mnemonics can help in thinking about the criteria that research must fulfil to address the question.

Examples of review questions

  • Needs - What do people want? Example: What are the information needs of healthcare workers regarding vaccination for seasonal influenza?
  • Impact or effectiveness - What is the balance of benefit and harm of a given intervention? Example: What is the effectiveness of strategies to increase vaccination coverage among healthcare workers? What is the cost-effectiveness of interventions that increase immunisation coverage?
  • Process or explanation - Why does it work (or not work)? How does it work (or not work)?  Example: What factors are associated with uptake of vaccinations by healthcare workers?  What factors are associated with inequities in vaccination among healthcare workers?
  • Correlation - What relationships are seen between phenomena? Example: How does influenza vaccination of healthcare workers vary with morbidity and mortality among patients? (Note: correlation does not in itself indicate causation).
  • Views / perspectives - What are people's experiences? Example: What are the views and experiences of healthcare workers regarding vaccination for seasonal influenza?
  • Service implementation - What is happening? Example: What is known about the implementation and context of interventions to promote vaccination for seasonal influenza among healthcare workers?

Example in practice: Seasonal influenza vaccination of health care workers: evidence synthesis / Loreno et al., 2017

Example of eligibility criteria

Research question: What are the views and experiences of UK healthcare workers regarding vaccination for seasonal influenza?

  • Population: healthcare workers of any type, including those without direct contact with patients.
  • Context: seasonal influenza vaccination for healthcare workers.
  • Study design: qualitative data, including interviews, focus groups, and ethnographic data.
  • Date of publication: all.
  • Country: all UK regions.
  • Excluded: studies focused on influenza vaccination for the general population or on pandemic influenza vaccination.
  • Excluded: studies using survey data with only closed questions, or studies that only report quantitative data.
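
Where screening is later done programmatically, for example when triaging a large database export, criteria like these can be captured as a simple data structure so that every record is judged against the same checklist. A minimal Python sketch of the idea (the field names and keyword lists are invented for illustration, and keyword matching is only a crude first pass before human screening):

    # Hypothetical encoding of eligibility criteria as data, so screening
    # decisions are applied consistently across records.
    CRITERIA = {
        "population": ["healthcare worker", "nurse", "physician"],
        "context": ["seasonal influenza", "flu vaccination"],
        "excluded_terms": ["pandemic influenza", "general population"],
    }

    def passes_screen(title_and_abstract: str) -> bool:
        text = title_and_abstract.lower()
        # Exclusion criteria are checked first.
        if any(term in text for term in CRITERIA["excluded_terms"]):
            return False
        has_population = any(t in text for t in CRITERIA["population"])
        has_context = any(t in text for t in CRITERIA["context"])
        return has_population and has_context

    print(passes_screen("Nurses' views on seasonal influenza vaccination"))  # True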

Consider the research boundaries

It is important to consider the reasons that the research question is being asked. Any research question has ideological and theoretical assumptions around the meanings and processes it is focused on. A systematic review should either specify definitions and boundaries around these elements at the outset, or be clear about which elements are undefined. 

For example, if we are interested in the topic of homework, there are likely to be preconceived ideas about what is meant by 'homework'. If we want to know the impact of homework on educational attainment, we need to set boundaries on the age range of children and on how educational attainment is measured. There may also be a particular setting or context: the type of school, country, gender, timeframe of the literature, or study designs of the research.

Research question: What is the impact of homework on children's educational attainment?

  • Scope: Homework – tasks set by school teachers for students to complete out of school time, in any format or setting.
  • Population: children aged 5-11 years.
  • Outcomes: measures of literacy or numeracy from tests administered by researchers, schools, or other authorities.
  • Study design: studies with a comparison control group.
  • Context: OECD countries, all settings within mainstream education.
  • Date limit: 2007 onwards.
  • Excluded: any context not in mainstream primary schools.
  • Excluded: non-English language studies.

Mnemonics for structuring questions

Some mnemonics can help to formulate research questions, set the boundaries of the question, and inform a search strategy.

Intervention effects

PICO: Population – Intervention – Comparison – Outcome

Variations: add 'T' for time, 'C' for context, or 'S' for study type.
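
As an illustration of how such a structured question can drive a search (an invented example, not from the UCL guide), each PICO concept becomes a group of OR-ed synonyms, and the groups are AND-ed together:

    ("health personnel" OR "healthcare worker*")        <- Population
    AND (vaccin* OR immuni*)                            <- Intervention
    AND ("usual care" OR "no intervention")             <- Comparison (often omitted from searches)
    AND (uptake OR coverage OR "vaccination rate")      <- Outcome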

Policy and management issues

ECLIPSE: Expectation – Client group – Location – Impact – Professionals involved – Service

Expectation encourages reflection on what the information is needed for (i.e., improvement, innovation, or information). Impact looks at what you would like to achieve, e.g., improved team communication.

  • How CLIP became ECLIPSE: a mnemonic to assist in searching for health policy/management information / Wildridge & Bell, 2002

Analysis tool for management and organisational strategy

PESTLE: Political – Economic – Social – Technological – Legal – Environmental

An analysis tool that organisations can use to identify external factors which may influence their strategic development, marketing strategies, new technologies, or organisational change.

  • PESTLE analysis / CIPD, 2010

Service evaluations with qualitative study designs

SPICE: Setting (context) – Perspective – Intervention – Comparison – Evaluation

Perspective relates to users or potential users. Evaluation is how you plan to measure the success of the intervention.

  • Clear and present questions: formulating questions for evidence based practice / Booth, 2006

Read more about some of the frameworks for constructing review questions:

  • Formulating the Evidence Based Practice Question: A Review of the Frameworks / Davis, 2011

Literature Review with MAXQDA

Make the most out of your literature review.

Literature reviews are an important step in the data analysis journey of many research projects, but often it is a time-consuming and arduous affair. Whether you are reviewing literature for writing a meta-analysis or for the background section of your thesis, work with MAXQDA. Our product comes with many exciting features which make your literature review faster and easier than ever before. Whether you are a first-time researcher or an old pro, MAXQDA is your professional software solution with advanced tools for you and your team.

Literature Review with MAXQDA - User interface

How to conduct a literature review with MAXQDA

Conducting a literature review with MAXQDA is straightforward because you can easily import bibliographic information and full texts. In addition, MAXQDA provides excellent tools to facilitate each phase of your literature review, such as notes, paraphrases, auto-coding, summaries, and tools to integrate your findings.

Step one: Plan your literature review

Similar to other research projects, one should carefully plan a literature review. Before getting started with searching and analyzing literature, carefully think about the purpose of your literature review and the questions you want to answer. This will help you to develop a search strategy which is needed to stay on top of things. A search strategy involves deciding on literature databases, search terms, and practical and methodological criteria for the selection of high-quality scientific literature.

MAXQDA supports you during this stage with memos and the newly developed Questions-Themes-Theories (QTT) tool. Both are ideal places to store your research questions and search parameters. Moreover, the Questions-Themes-Theories tool is perfectly suited to support your literature review project because it provides a bridge between your MAXQDA project and your research report. It offers the perfect environment to bring together findings, record conclusions, and develop theories.


Step two: Search, Select, Save your material

Follow your search strategy: use the databases and search terms you have identified to find the literature you need. Then scan the search results for relevance by reading the title, abstract, or keywords. Try to determine whether the paper falls within the narrower area of the research question and whether it fulfills the objectives of the review, and check whether it meets your pre-specified eligibility criteria. As this step typically requires precise reading rather than a quick scan, you might want to perform it in MAXQDA. If a piece of literature fulfills your criteria, you can save its bibliographic information using a reference management system; this is a common approach among researchers, as these programs automatically extract a paper's metadata from the publishing website. You can easily import this bibliographic data into MAXQDA via a specialized import tool. MAXQDA is compatible with all reference management programs that can export their literature databases in RIS format, a standard format for bibliographic information. This includes all mainstream reference management programs, such as Citavi, DocEar, EndNote, JabRef, Mendeley, and Zotero.
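
For orientation, a minimal RIS record looks like the following; TY (type), AU (author), TI (title), JO (journal), PY (year), DO (DOI), and ER (end of record) are standard RIS tags, while the values here are invented:

    TY  - JOUR
    AU  - Doe, Jane
    TI  - An example article title
    JO  - Journal of Examples
    PY  - 2020
    DO  - 10.1000/example.doi
    ER  -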

Search, select, save your literature

Step three: Import & Organize your material in MAXQDA

Importing bibliographic data into MAXQDA is easy and works seamlessly for all reference management programs that use standard RIS files. MAXQDA offers an import option dedicated to bibliographic data, which you can find in the MAXQDA Import tab. To import the selected literature, just click the corresponding button, select the data you want to import, and confirm. Upon import, each literature entry becomes its own text document. If full texts are imported, MAXQDA automatically connects the full text to the literature entry with an internal link. The individual information in the literature entries is automatically coded for later analysis so that, for example, all titles or abstracts can be compiled and searched.

To help you keep your literature review organized, MAXQDA automatically creates a document group called "References" which contains the individual literature entries. Like full texts or interview documents, the bibliographic entries can be searched, coded, linked, and edited, and you can add memos for further qualitative and quantitative content analysis (Kuckartz & Rädiker, 2019). Especially when running multiple searches using different databases or search terms, you should carefully document your approach. Besides being a great place to store the respective search parameters, memos are perfectly suited to capture your ideas while reviewing your literature and can be attached to text segments, documents, document groups, and much more.

Import and organize your literature

Analyze your literature with MAXQDA

Once imported into MAXQDA, you can explore your material using a variety of tools and functions. With MAXQDA as your literature review and analysis software, you have numerous possibilities for analyzing your literature and writing your literature review – too many to mention them all here. Thus, we can present only a subset of tools. Check out our literature on performing literature reviews with MAXQDA to discover more possibilities.

Use the power of AI for your analysis

AI Assist: Introducing AI to literature reviews

AI Assist – MAXQDA’s AI-based add-on module – can simplify your literature reviews in many ways. Chat with your data and ask the AI questions about your documents. Let AI Assist automatically summarize entire papers and text segments. Automatically create summaries of all coded segments of a code or generate suggestions for subcodes, and if you don’t know a word’s or concept’s meaning, use AI Assist to get a definition without leaving MAXQDA. Visit our research guide for even more ideas on how AI can support your literature review:

AI for Literature Review

Code & Retrieve important segments

Coding qualitative data lies at the heart of many qualitative data analysis approaches and can be useful for literature reviews as well. Coding refers to the process of labeling segments of your material. For example, you may want to code definitions of certain terms, pro and con arguments, how a specific method is used, and so on. In a later step, MAXQDA allows you to compile all text segments coded with one (or more) codes of interest from one or more papers, so that you can for example compare definitions across papers.

But there is more. MAXQDA offers multiple ways of coding, such as in-vivo coding, highlighters, emoticodes, Creative Coding, or the Smart Coding Tool. The compiled segments can be enriched with variables and the segment’s context accessed with just one click. MAXQDA’s Text Search & Autocode tool is especially well-suited for a literature review, as it allows one to explore large amounts of text without reading or coding them first. Automatically search for keywords (or dictionaries of keywords), such as important concepts for your literature review, and automatically code them with just a few clicks.
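
The general idea behind dictionary-based auto-coding is easy to sketch outside MAXQDA. The following Python fragment is an illustration of the technique only, not MAXQDA's implementation; the code names and keyword patterns are invented:

    import re

    # Hypothetical dictionary: code name -> keyword patterns.
    CODE_DICTIONARY = {
        "definition": [r"\bis defined as\b", r"\brefers to\b"],
        "method": [r"\bsurvey\b", r"\binterview\b"],
    }

    def autocode(text):
        """Return, for each code, the sentences in which a keyword occurs."""
        sentences = re.split(r"(?<=[.!?])\s+", text)
        hits = {code: [] for code in CODE_DICTIONARY}
        for sentence in sentences:
            for code, patterns in CODE_DICTIONARY.items():
                if any(re.search(p, sentence, re.IGNORECASE) for p in patterns):
                    hits[code].append(sentence)
        return hits

    sample = "A literature review is defined as a critical summary. We used a survey."
    print(autocode(sample))  # each code matches one sentence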

Code name suggestions and quick resize

Paraphrase literature into your own words

Another approach is to paraphrase the existing literature. A paraphrase is a restatement of a text or passage in your own words, while retaining the meaning and the main ideas of the original. Paraphrasing can be especially helpful in the context of literature reviews, because paraphrases force you to systematically summarize the most important statements (and only the most important statements) which can help to stay on top of things.

With MAXQDA as your literature review software, you not only have a tool for paraphrasing literature but also tools to analyze the paraphrases you have written. For example, the Categorize Paraphrases tool allows you to code your paraphrases, and the Paraphrases Matrix allows you to compare paraphrases side-by-side between individual documents or groups of documents.

Summaries & Overview tables: A look at the Bigger Picture

When conducting a literature review, you can easily get lost. But with MAXQDA as your literature review software, you will never lose track of the bigger picture. Among other tools, MAXQDA's overview and summary tables are especially useful for aggregating your literature review results. MAXQDA offers overview tables for almost everything: codes, memos, coded segments, links, and so on. With MAXQDA's literature review tools you can create compressed summaries of sources that can be effectively compared and represented, and with just one click you can export your overview and summary tables and integrate them into your literature review report.

Summarize content with MAXQDA for your literature review

Visualize your qualitative data

The proverb "a picture is worth a thousand words" also applies to literature reviews. That is why MAXQDA offers a variety of Visual Tools that allow you to get a quick overview of the data and help you to identify patterns. Of course, you can export your visualizations in various formats to enrich your final report. One particularly useful visual tool for literature reviews is the Word Cloud, which visualizes the most frequent words and allows you to explore key terms and the central themes of one or more papers. Thanks to the interactive connection between your visualizations and your MAXQDA data, you will never lose sight of the big picture. Another particularly useful tool is MAXQDA's word/code frequency tool, with which you can analyze and visualize the frequencies of words or codes in one or more documents. As with Word Clouds, nonsensical words can be added to the stop list and excluded from the analysis.
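
The mechanics behind such a frequency count with a stop list can be sketched in a few lines of Python (a generic illustration, not MAXQDA's code; the stop list here is deliberately tiny):

    from collections import Counter
    import re

    STOP_WORDS = {"the", "a", "of", "and", "to", "in", "is"}  # extend as needed

    def word_frequencies(text, top_n=10):
        # Tokenize on lowercase letters (and apostrophes), drop stop words.
        words = re.findall(r"[a-z']+", text.lower())
        counts = Counter(w for w in words if w not in STOP_WORDS)
        return counts.most_common(top_n)

    print(word_frequencies("The review of the literature informs the design of the study."))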

QTT: Synthesize your results and write up the review

MAXQDA offers an innovative workspace to gather important visualizations, notes, segments, and other analytics results: the perfect tool to organize your thoughts and data. Create a separate worksheet for each of your topics and research questions, fill it with associated analysis elements from MAXQDA, and add your conclusions, theories, and insights as you go. For example, you can add Word Clouds, important coded segments, and your literature summaries, and write down your insights. Subsequently, you can view all analysis elements and insights together to write your final conclusion. The Questions-Themes-Theories tool is perfectly suited to help you finalize your literature review reports: with just one click you can export your worksheet and use it as a starting point for your literature review report.

Collect relevant insights and develop new theories with MAXQDA

Literature about Literature Reviews and Analysis

We offer a variety of free learning materials to help you get started with your literature review. Check out our Getting Started Guide to get a quick overview of MAXQDA and step-by-step instructions on setting up your software and creating your first project with your brand new QDA software. In addition, the free Literature Reviews Guide explains how to conduct a literature review with MAXQDA in more detail.

Getting Started with MAXQDA

Literature Reviews with MAXQDA

A literature review is a critical analysis and summary of existing research and literature on a particular topic or research question. It involves systematically searching and evaluating a range of sources, such as books, academic journals, conference proceedings, and other published or unpublished works, to identify and analyze the relevant findings, methodologies, theories, and arguments related to the research question or topic.

A literature review’s purpose is to provide a comprehensive and critical overview of the current state of knowledge and understanding of a topic, to identify gaps and inconsistencies in existing research, and to highlight areas where further research is needed. Literature reviews are commonly used in academic research, as they provide a framework for developing new research and help to situate the research within the broader context of existing knowledge.

A literature review is a critical evaluation of existing research on a particular topic and is part of almost every research project. The literature review’s purpose is to identify gaps in current knowledge, synthesize existing research findings, and provide a foundation for further research. Over the years, numerous types of literature reviews have emerged. To empower you in coming to an informed decision, we briefly present the most common literature review methods.

  • Narrative Review: A narrative review summarizes and synthesizes the existing literature on a particular topic in a narrative or story-like format. This type of review is often used to provide an overview of the current state of knowledge on a topic, for example in scientific papers or final theses.
  • Systematic Review: A systematic review is a comprehensive and structured approach to reviewing the literature on a particular topic with the aim of answering a defined research question. It involves a systematic search of the literature using pre-specified eligibility criteria and a structured evaluation of the quality of the research.
  • Meta-Analysis: A meta-analysis is a type of systematic review that uses statistical techniques to combine and analyze the results from multiple studies on the same topic. The goal of a meta-analysis is to provide a more robust and reliable estimate of the effect size than can be obtained from any single study (see the sketch after this list).
  • Scoping Review: A scoping review is a type of systematic review that aims to map the existing literature on a particular topic in order to identify the scope and nature of the research that has been done. It is often used to identify gaps in the literature and inform future research.
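
To make the statistical idea behind a meta-analysis concrete: the simplest (fixed-effect) model pools study effect sizes with inverse-variance weights. A minimal Python sketch with made-up numbers:

    import math

    # Hypothetical study results: (effect size, variance of the effect size).
    studies = [(0.30, 0.02), (0.10, 0.05), (0.25, 0.01)]

    # Fixed-effect model: weight each study by the inverse of its variance.
    weights = [1.0 / v for _, v in studies]
    pooled = sum(w * e for (e, _), w in zip(studies, weights)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))  # standard error of the pooled estimate

    print(f"pooled effect = {pooled:.3f}, "
          f"95% CI = ({pooled - 1.96 * se:.3f}, {pooled + 1.96 * se:.3f})")

Real meta-analyses additionally assess heterogeneity between studies and often use a random-effects model instead of the fixed-effect model shown here.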

There is no “best” way to do a literature review, as the process can vary depending on the research question, field of study, and personal preferences. However, here are some general guidelines that can help to ensure that your literature review is comprehensive and effective:

  • Carefully plan your literature review: Before you start searching and analyzing literature, you should define a research question and develop a search strategy (for example, identifying relevant databases and search terms). A clearly defined research question and search strategy will help you to focus your search and ensure that you are gathering relevant information. MAXQDA's Questions-Themes-Theories tool is the perfect place to store your analysis plan.
  • Evaluate your sources: Screen your search results for relevance to your research question, for example by reading abstracts. Once you have identified relevant sources, read them critically and evaluate their quality and relevance to your research question. Consider factors such as the methodology used, the reliability of the data, and the overall strength of the argument presented.
  • Synthesize your findings: After evaluating your sources, synthesize your findings by identifying common themes, arguments, and gaps in the existing research. This will help you to develop a comprehensive understanding of the current state of knowledge on your topic.
  • Write up your review: Finally, write up your literature review, ensuring that it is well-structured and clearly communicates your findings. Include a critical analysis of the sources you have reviewed, and use evidence from the literature to support your arguments and conclusions.

Overall, the key to a successful literature review is to be systematic, critical, and comprehensive in your search and evaluation of sources.

As in all aspects of scientific work, preparation is the key to success. Carefully think about the purpose of your literature review, the questions you want to answer, and your search strategy. The writing process itself will differ depending on your literature review method. For example, when writing a narrative review, use the identified literature to support your arguments, approach, and conclusions. By contrast, a systematic review typically contains the same parts as other scientific papers: abstract, introduction (purpose and scope), methods (search strategy, inclusion/exclusion criteria, …), results (identified sources, their main arguments and findings, …), discussion (critical analysis of the sources reviewed), and conclusion (gaps or inconsistencies in the existing research, future research, implications, etc.).


Med Educ Online, 27(1), 2022

Does Developing Multiple-Choice Questions Improve Medical Students' Learning? A Systematic Review

Youness Touissi

a Faculty of Medicine and Pharmacy of Rabat, Mohammed V University, Souissi, Rabat, Morocco

Ghita Hjiej

b Faculty of Medicine and Pharmacy of Oujda, Mohammed Premier University, Oujda, Morocco

Abderrazak Hajjioui

c Laboratory of Neurosciences, Faculty of Medicine and Pharmacy of Fes, Sidi Mohammed Ben Abdallah University, Fez, Morocco

Azeddine Ibrahimi

d Laboratory of Biotechnology, Mohammed V University, Souissi, Rabat, Morocco

Maryam Fourtassi

e Faculty of Medicine of Tangier, Abdelmalek Essaadi University, Tetouan, Morocco

Practicing multiple-choice questions (MCQs) is a popular learning method among medical students. While MCQs are commonly used in exams, creating them might provide another opportunity for students to boost their learning. Yet, the effectiveness of student-generated multiple-choice questions in medical education has been questioned. This study aims to verify the effects of student-generated MCQs on medical learning, in terms of students' perceptions, performance, and behavior, and to define the circumstances that would make this activity more useful to students. Articles were identified by searching four databases (MEDLINE, SCOPUS, Web of Science, and ERIC) and by scanning references. Titles and abstracts were selected based on pre-established eligibility criteria, and the methodological quality of the included articles was assessed using the MERSQI scoring system. Eight hundred and eighty-four papers were identified. Eleven papers were retained after abstract and title screening, and 6 articles were recovered from cross-referencing, for a total of 17 articles. The mean MERSQI score was 10.42. Most studies showed a positive impact of developing MCQs on medical students' learning in terms of both perception and performance. Few articles in the literature examined the influence of student-generated MCQs on medical students' learning. Amid some concerns about time and needed effort, writing multiple-choice questions as a learning method appears to be a useful process for improving medical students' learning.

Introduction

Active learning, where students are motivated to construct their own understanding and make connections between the pieces of information they grasp, is proven to be more effective than passively absorbing mere facts [ 1 ]. However, medical students are still largely exposed to passive learning methods, such as lectures, with no active involvement in the learning process. In order to assimilate the vast amount of information they are supposed to learn, students adopt a variety of strategies, which are mostly oriented by the assessment methods used in examinations [ 2 ].

Multiple-choice questions (MCQs) represent the most common assessment tool in medical education worldwide [ 3 ]. Therefore, it is expected that students would favor practicing MCQs, either from old exams or commercial question banks, over other learning methods to get ready for their assessments [ 4 ]. Although this approach might seem practical for students, as it strengthens their knowledge and gives them prior exam experience, it might incite surface learning instead of building more elaborate learning skills, such as application and analysis [ 5 ].

Involving students in creating MCQs appears to be a potential learning strategy that combines students' pragmatic approach with actual active learning. Developing good questions, in general, implies a deep understanding and a firm knowledge of the material being evaluated [ 6 ]. Writing a good MCQ requires, in addition to a meticulously drafted stem, the ability to suggest erroneous but plausible distractors [ 7 , 8 ]. It has been suggested that creating distractors may reveal misconceptions and mistakes and underline when students have a defective understanding of the course material [ 6 , 9 ]. In other words, creating a well-constructed MCQ requires more cognitive abilities than answering one [ 10 ]. Several studies have shown that the process of producing questions is an efficient way to motivate students and enhance their performance, and have linked MCQ generation to improved test performance [ 11–15 ]. Therefore, generating MCQs might develop desirable problem-solving skills and involve students in an activity that is immediately and clearly relevant to their final examinations.

In contrast, other studies indicated there was no considerable impact of this time-consuming MCQ-development activity on students' learning [ 10 ], or that question generation might benefit only some categories of students [ 16 ].

Because of the conflicting conclusions about this approach in different studies, we conducted a systematic review to define and document evidence of the effect of the MCQ-writing activity on students' learning, and to understand how and under what circumstances it could benefit medical students, as, to our knowledge, there is no prior systematic review addressing the effect of student-generated multiple-choice questions on medical students' learning.

Study design

This systematic review was conducted following the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [ 17 ]. Ethical approval was not required because this is a systematic review of previously published research and does not include any individual participant information.

Inclusion and exclusion criteria

Table 1 summarizes the publications' inclusion and exclusion criteria. The target population was undergraduate and graduate medical students. The intervention was generating MCQs of all types. The learning outcomes of the intervention had to be reported, using validated or non-validated instruments. We excluded studies involving students from other health-related domains, those in which the intervention was writing questions other than MCQs, and completely descriptive studies without an evaluation of the learning outcome. The absence of a comparison to other educational interventions was not regarded as an exclusion criterion, because much educational research in the literature is case-based.

Inclusion & exclusion criteria

Search strategy

On May 16th, 2020, two reviewers separately conducted a systematic search of four databases, 'Medline' (via PubMed), 'Scopus', 'Web of Science', and 'ERIC', using keywords such as (medical students, multiple-choice questions, learning, creating) and their possible synonyms and abbreviations, all combined with Boolean operators (AND, OR, NOT) in the search syntax appropriate to each database (Appendix 1). All references generated by the search were then imported into a bibliographic tool (Zotero®) [ 18 ] used for the management of references. The reviewers also manually checked the reference lists of selected publications for more relevant papers. Sections such as 'Similar Articles' below articles (e.g., in PubMed) were also checked for possible additional articles. No restrictions regarding publication date, language, or country of origin were applied.
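
Such database queries can also be issued programmatically. As an illustration (not part of the authors' method), a compressed version of the PubMed query from Appendix 1 could be run through NCBI's public E-utilities API:

    import requests  # third-party HTTP library

    # NCBI E-utilities esearch endpoint (public API).
    URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    query = ('("Medical student" OR "Medical students") '
             'AND (Create OR Design OR Generate) '
             'AND ("multiple-choice question" OR "multiple-choice questions" '
             'OR MCQ OR MCQs) AND Learning')

    resp = requests.get(URL, params={
        "db": "pubmed",     # search MEDLINE/PubMed
        "term": query,      # Boolean query string
        "retmode": "json",  # ask for a JSON response
        "retmax": 100,      # number of PMIDs to return
    })
    result = resp.json()["esearchresult"]
    print("hits:", result["count"])
    print("first PMIDs:", result["idlist"][:5])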

Study selection

The selection process was conducted by two reviewers independently. It started with the screening of all papers generated by the database search, followed by the removal of all duplicates. All papers whose titles had a potential relation to the research subject were kept for abstract screening, while those with obviously irrelevant titles were eliminated. The reviewers then conducted the abstract screening; all selected studies were retrieved for a final full-text screening. Any disagreement among the reviewers concerning a paper's inclusion was settled through consensus or arbitrated by a third reviewer if necessary.

Data collection

Two reviewers worked separately to create a provisional data extraction sheet, using a small sample of 4 articles. They then met to finalize the coding sheet by adding, editing, and deleting sections, leading to a final template implemented in Microsoft Excel® to ensure the consistency of the collected data. Each reviewer then extracted data independently using the created framework. Finally, the two reviewers compared their work to ensure the accuracy of the collected data. The items listed in the sheet were article authorship and year of publication, country, study design, participants, subject, intervention and co-interventions, MCQ type and quality, assessment instruments, and findings.

Assessment of study methodological quality

There are few scales for assessing the methodological rigor and trustworthiness of quantitative research in medical education, among them the Best Evidence Medical Education (BEME) global scale [ 19 ], the Newcastle–Ottawa Scale [ 20 ], and the Medical Education Research Study Quality Instrument (MERSQI) [ 21 ]. We chose the latter to assess quantitative studies because it provides a detailed list of items with specified definitions and solid validity evidence, and its scores are correlated with the citation rate in the three years following publication and with the journal impact factor [ 22 , 23 ]. MERSQI evaluates study quality based on 10 items: study design, number of institutions studied, response rate, data type, internal structure, content validity, relationship to other variables, appropriateness of data analysis, complexity of analysis, and the learning outcome. The 10 items are organized into six domains, each with a maximum score of 3 and a minimum score of 1; items not reported are not scored, resulting in a maximum MERSQI score of 18 [ 21 ].

Each article was assessed independently by two reviewers; any disagreement between the reviewers about MERSQI scoring was resolved by consensus and arbitrated by a third reviewer if necessary. If a study reported more than one outcome, the one with the highest score was taken into account.

Study design and population characteristics

Eight hundred eighty-four papers were identified in the initial database search, of which 18 were retained after title and abstract screening (see Figure 1). Seven of them did not fit the inclusion criteria, for reasons such as the absence of a learning outcome or a target population other than medical students. Finally, 11 articles were retained, to which another 6 articles retrieved by cross-referencing were added. Of the 17 articles included, the two reviewers agreed on 16; only one paper required discussion before being included.

Figure 1. Flow-chart of the study selection.

The 17 included papers reported 18 studies, as one paper included two distinct studies. Thirteen of the eighteen studies were single-group studies, the most common study design (see Table 2). Eleven of these single-group studies were cross-sectional, while two were pre-post-test studies. The second most frequent study design was the cohort, adopted in three studies. The remaining two were randomized controlled trials (RCTs). The studies were conducted between 1996 and 2019, with 13 studies (79%) from 2012 to 2019.

Demographics, interventions, and outcome of the included studies

MCQs: multiple-choice questions; N: number; NR: not reported; RCT: randomized controlled trial

Regarding research methodology, 10 were quantitative studies, four were qualitative, and four used mixed methods with a quantitative part and a qualitative one (students' feedback).

Altogether, 2122 students participated in the 17 included papers. All participants were undergraduate medical students enrolled in the first five years of medical school. The preclinical stage was the most represented, with 13 out of the 17 papers including students enrolled in the first two years of medical studies.

Most studies used more than one data source; surveys were present as a main or parallel instrument for collecting data in eight studies. Other data sources were qualitative feedback (n = 8), qualitative feedback converted to quantitative data (n = 1), pre-post-tests (n = 4), and post-tests (n = 5).

Quality assessment

Overall, the MERSQI scores of the 14 quantitative studies were relatively high, with a mean score of 10.75, ranging from 7 to 14 (see details of the MERSQI score for each study in Table 3). Studies lost points on MERSQI for using a single-group design, limiting participants to a single institution, lacking validity evidence for their instruments (only two studies used validated instruments), and measuring the learning outcome only in terms of students' satisfaction and perceptions.

Methodological quality of included studies according to MERSQI

Details of MERSQI scoring:

a. Study design: Single group cross-sectional/post-test only (1); single group pre- and post-test (1.5); nonrandomized 2 groups (2); randomized controlled experiment (3).

b. Sampling: Institutions studied: Single institution (0.5); 2 institutions (1); More than 2 institutions (1.5).

c. Sampling: Response rate: Not applicable (0); Response rate < 50% or not reported (0.5); Response rate 50–74% (1); Response rate > 75% (1.5).

d. Type of data: evaluation by study participants (1); Objective measurement (3).

e. Validity evidence for evaluation instrument scores: Content: Not reported/ Not applicable (0); Reported (1).

f. Validity evidence for evaluation instrument scores: Internal structure: Not reported/ Not applicable (0); Reported (1).

g. Validity evidence for evaluation instrument scores: Relationships to other variables: Not reported/ Not applicable (0); Reported (1).

h. Appropriateness of analysis: Inappropriate (0); appropriate (1)

i. Complexity of analysis: Descriptive analysis only (1); Beyond descriptive analysis (2).

j. Outcome: Satisfaction, attitudes, perceptions (1); Knowledge, skills (1.5); Behaviors (2); Patient/health care outcome (3)
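
Read as a rubric, a study's total is simply the sum of the best applicable option for each item. A small Python sketch with one hypothetical study's ratings makes the arithmetic explicit:

    # Hypothetical MERSQI ratings for a single study, following items a-j above.
    ratings = {
        "study_design": 1.5,         # single group pre- and post-test
        "institutions": 0.5,         # single institution
        "response_rate": 1.0,        # 50-74%
        "data_type": 3.0,            # objective measurement
        "validity_content": 1.0,     # reported
        "validity_internal": 0.0,    # not reported
        "validity_relations": 0.0,   # not reported
        "analysis_appropriate": 1.0, # appropriate
        "analysis_complexity": 2.0,  # beyond descriptive analysis
        "outcome": 1.5,              # knowledge, skills
    }

    total = sum(ratings.values())
    print(f"MERSQI total: {total} / 18")  # here: 11.5 / 18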

The evaluation of the educational effect of MCQ writing was carried out using objective measures in 9 of the 18 included studies, based on pre-post-tests or post-tests only. Subjective assessments such as surveys and qualitative feedback were present as secondary data sources in 7 of these 9 studies, and were the main measures in the remaining nine studies. Hence, 16 studies assessed the learning outcome in terms of students' satisfaction and perceptions towards the activity, representing the first level of the Kirkpatrick model, a four-level model for analyzing and evaluating the results of training and educational programs [ 24 ]. Of these 16 studies, in 3 students expressed dissatisfaction with the process and found it disadvantageous compared to other learning methods, whereas 4 found mixed results, with students acknowledging the value of the process while doubting its efficiency. On the other hand, nine studies reported favorable results: the exercise was considered of immense value and helped students consolidate their understanding and knowledge, although in three of these studies students expressed reservations about the time the exercise demanded.

Regarding the nine studies that used objective measures to assess students' skills and knowledge, which represent the second level of the Kirkpatrick model, six reported a significant improvement in the grades of students doing this activity, whereas two showed no noticeable difference in grades, and one showed a slight drop in grades.

One study suggested that students performed better when writing MCQs on certain modules compared to others. Two studies found the activity beneficial to all students’ categories while another two suggested the process was more beneficial for low performers.

Four studies also found that combining writing with peer review was more beneficial than writing MCQs alone. On the other hand, two studies revealed that peer-reviewing groups did not promote learning, and one study found mixed results.

Concerning the quality of the generated multiple-choice questions, most studies reported that the MCQs were of good or even high quality when compared to faculty-written MCQs, except for two studies where students created MCQs of poor quality. However, only a few studies (n = 2) reported whether students wrote MCQs that tested higher-order skills such as application and analysis or simply tested recalling facts and concepts.

The majority of interventions required students to write single-best-answer MCQs (n = 6), three of which used vignette MCQs. Assertion-reason MCQs were present in two studies. In one study, students were required to write only the stem of the MCQ, while in another, students were asked to write only the distractors and the answer; the remaining studies did not report the MCQ type.

Data and methodology

This paper methodically reviewed 17 articles investigating the impact of writing multiple-choice questions by medical students on their learning. Only a few studies pointedly examined the effect of this activity on the learning process; in many papers it represented just a small section of the article, which was the part used for this review. This is because many papers focused on other concepts, such as assessing the quality of student-generated MCQs or the efficiency of online question platforms, reflecting the scarce research on the impact of a promising learning strategy (creating MCQs) in medical education.

The mean MERSQI score of the quantitative studies was 10.75, slightly above the level suggestive of a solid methodology, set at 10.7 or higher [ 21 ]. This indicates an acceptable methodology in most of the included studies. Yet, only two studies [ 30 , 31 ] used a valid instrument in terms of internal structure, content, and relation to other variables, making the lack of instrument validity, together with the use of single-institution and single-group designs, the main identified methodological issues.

Furthermore, the studies assessing the outcome in terms of knowledge and skills scored higher than the ones appraising the learning outcome in terms of perception and satisfaction. Hence, we recommend that future research provide more details on the validity parameters of the assessment instruments and focus on higher learning outcome levels, precisely skills and knowledge, as they are more closely linked to the nature of the studied activity.

Relation with existing literature

Apart from medical education, the impact of students’ generated questions has been a relevant research question in a variety of educational environments. Fu-Yun & Chun-Ping demonstrated through hundreds of papers that student-generated questions promoted learning and led to personal growth [ 32 ]. For example, in Ecology, students who were asked to construct multiple-choice questions significantly improved their grades [ 33 ]. Also, in an undergraduate taxation module, students who were asked to create multiple-choice questions significantly improved their academic achievement [ 34 ].

A previous review explored the impact of student-generated questions on learning and concluded that the process of constructing questions raised students' abilities of recall and promoted understanding of essential subjects as well as problem-solving skills [ 35 ]. Yet, this review took a general view of the activity of generating questions, considering all question formats. Thus, its conclusions will not necessarily concord with our review, because medical students constitute a particular student profile [ 36 ], and multiple-choice questions have their own particularities. As far as we know, this is the first systematic review to appraise the pedagogical interest of creating MCQs in medical education.

Students’ satisfaction and perceptions

Students' viewpoints and attitudes toward the MCQ-generation process were evaluated in multiple studies, and the results were generally encouraging, despite a few exceptions where students expressed negative impressions of the process and favored other learning methods over it [ 4 , 10 ]. The most prominent criticisms essentially concerned the time consumption limiting the efficiency of the process. This was mainly related to the complexity of the task given to students, who were required to write MCQs in addition to other demanding assignments.

Since the most preferred learning method for students is learning by doing, they presumably benefit more when instruction is conveyed in shorter segments and introduced in an engaging format [ 37 ]. Thus, some researchers tried more flexible strategies, such as providing the MCQ distractors and asking students for the stem, or providing the stem and requesting distractors, as these were considered the most challenging parts of the process [ 38 ].

Some authors used online platforms to create and share questions, making MCQ generation smoother. Another approach to motivating students was including some generated MCQs in examinations, to boost students' confidence and enhance their reflective learning [ 39 ]. These measures, intended to facilitate the task, were perceived positively by students.

Students’ performance

Regarding students' performance, the MCQ-generation exercise broadly improved students' grades. However, not all studies reported positive results. Some noted no significant effect of writing MCQs on students' exam scores [ 10 , 31 ]. This was explained by the small number of participants and the lack of instructor supervision. Moreover, students were tested on broader material than the one they were instructed to write MCQs on, meaning that students might have benefited more from the process if they had created a larger number of MCQs covering a wider range of material, or if the process had been aligned with the whole curriculum content. Besides, some studies reported that low performers benefited more from the process of writing MCQs, concordant with the findings of other studies indicating that activities promoting active learning advantage lower-performing students more than higher-performing ones [ 40 , 41 ]. Another suggested explanation was that low achievers tried to memorize student-generated MCQs when these were part of their examinations, favoring surface learning instead of the deep learning anticipated from this activity. This created a dilemma between enticing students to participate in this activity and the disadvantage of memorizing MCQs. Therefore, including modified student-generated MCQs after instructors' input, rather than the original student-generated versions, in the examination material might be a reasonable option, along with awarding extra points when students are more involved in the process of writing MCQs.

Determinant factors

Students' performance tends to be related to their ability to generate high-quality questions. As suggested in preceding reviews [ 35 , 42 ], assisting students in constructing questions may enhance the quality of these student-generated questions, encourage learning, and improve students' achievement. Also, guiding students to write MCQs makes it possible to test higher-order skills such as application and analysis, besides recall and comprehension. Accordingly, in several studies, students were provided with instructions on how to write high-quality multiple-choice questions, resulting in high-quality student-generated MCQs [ 10 , 43–45 ]. Even so, such guidelines must take care not to make the students' job more challenging, so as to keep the process pleasant.

Several papers discussed various factors that influence the learning outcome of the activity, such as working in groups and peer-checking MCQs, which were found to be associated with higher performance [ 30 , 38 , 43 , 44 , 46–49 ]. These factors were also viewed favorably by students because of their potential to broaden and deepen one's knowledge, as well as to uncover misunderstandings or problems, according to many studies that highlighted a variety of beneficial outcomes of peer learning approaches in the education community [ 42 , 50 , 51 ]. However, in other studies, students preferred to work alone and demanded that the time devoted to peer-reviewing MCQs be reduced [ 38 , 45 ]. This was mostly due to students' lack of trust in the quality of MCQs created by peers; thus, evaluation of students' MCQs by instructors was also a component of an effective intervention.

Strengths and limitations

The main limitation of the present review is the scarcity of studies in the literature. We used narrow inclusion criteria, leading to the omission of articles published in non-indexed journals and of papers from other healthcare fields that might have been instructive. However, the choice of limiting the review scope to medical students only was motivated by the specificity of medical education curricula and teaching methods compared to other health professions in most settings. Another limitation is the weak methodology of a non-negligible portion of the studies included in this review, which makes drawing and generalizing conclusions a delicate exercise. On the other hand, this is the first review to summarize data on the learning benefits of creating MCQs in medical education and to shed light on this interesting learning tool.

Writing multiple-choice questions as a learning method might be a valuable process for enhancing medical students' learning, despite doubts raised about its real efficiency and its pitfalls in terms of time and effort.

There is presently a dearth of research examining the influence of student-generated MCQs on learning. Future research on the subject should use strong study designs, valid instruments, and simple and flexible interventions; measure learning based on performance and behavior; and explore the effect of the process on different categories of students (e.g., by performance, gender, or level), in order to identify the most appropriate circumstances for the activity and get the best out of it.

Appendix 1: Search strategy

  • MEDLINE (PubMed) query: ((((Medical student) OR (Medical students)) AND (((Create) OR (Design)) OR (Generate))) AND ((((multiple-choice question) OR (Multiple-choice questions)) OR (MCQ)) OR (MCQs))) AND (Learning)
  • Results: 300
  • Scopus query: ALL (medical PRE/0 students) AND ALL (multiple PRE/0 choice PRE/0 questions) AND ALL (learning) AND ALL (create OR generate OR design)
  • Results: 468
  • Web of Science query: (ALL = ‘Multiple Choice Questions’ OR ALL = ‘Multiple Choice Question’ OR ALL = MCQ OR ALL = MCQs) AND (ALL = ‘Medical Students’ OR ALL = ‘Medical Student’) AND (ALL = Learning OR ALL = Learn) AND (ALL = Create OR ALL = Generate OR ALL = Design)
  • Results: 109
  • ERIC query: ‘Medical student’ AND ‘Multiple choice questions’ AND Learning AND (Create OR Generate OR Design)

Total = 884

After deleting duplicate references: 697
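
Deduplication of this kind is mechanical once references are in structured form; reference managers such as Zotero do it internally. A minimal Python sketch of the idea, with invented sample records:

    # Deduplicate references by DOI when present, else by normalized title.
    refs = [
        {"title": "Learning by Writing MCQs", "doi": "10.1000/x1"},
        {"title": "Learning by writing MCQs.", "doi": "10.1000/x1"},
        {"title": "Another Study", "doi": None},
    ]

    def key(ref):
        if ref["doi"]:
            return ("doi", ref["doi"].lower())
        # Fallback: case-insensitive title stripped to alphanumerics.
        return ("title", "".join(c for c in ref["title"].lower() if c.isalnum()))

    unique = list({key(r): r for r in refs}.values())
    print(len(refs), "->", len(unique))  # 3 -> 2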

Funding Statement

The author(s) reported there is no funding associated with the work featured in this article.

Disclosure statement

No potential conflict of interest was reported by the author(s).


Systematic Reviews: Before you begin...

  • Before you begin...
  • Introducing systematic reviews
  • Step 1: Preparation
  • Step 2: Scoping
  • Step 3: Planning your search strategy
  • Step 4: Recording and managing results
  • Step 5: Selecting papers and quality assessment

Welcome to the Library’s Systematic Review Subject Guide

New to systematic reviews?

This Subject Guide page highlights things you should consider before and while undertaking your review. 

The systematic review process requires a lot of preparation, detailed searching, and analysing. It may take longer than you think to complete!


Any questions? Contact your Subject Librarian.

Before you begin your review...

Please be assured that your Subject Librarian will support you as best they can. 

Subject Librarians are able to show QUB students and staff undertaking any type of literature search (e.g. literature review, scoping review, systematic review) how to:

  • Structure searches using AND/OR
  • Select appropriate databases
  • Search selected databases
  • Save and re-run searches
  • Export database results
  • Store and deduplicate results using EndNote
  • Identify grey literature (if required)

At peak periods of demand, Subject Librarians might not be able to deliver all of the above. Please contact your Subject Librarian for guidance on this.

QUB students and staff must provide Subject Librarians with a clear search topic or question, along with a selection of appropriate keywords and synonyms. Students should discuss these with their supervisor before contacting Subject Librarians.
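As an illustration of the AND/OR structure referred to above, here is a small sketch that assembles a Boolean search string from groups of synonyms; the topic and keywords are invented for the example.

```python
# Hypothetical example: building a Boolean search string from synonym groups.
# Terms within a group are OR-combined; the groups are then AND-combined.
concepts = [
    ['"medical student*"', '"undergraduate medical education"'],
    ['"multiple choice question*"', 'MCQ*'],
    ['learning', '"academic performance"'],
]

query = " AND ".join(
    "(" + " OR ".join(group) + ")" for group in concepts
)
print(query)
# ("medical student*" OR "undergraduate medical education") AND
# ("multiple choice question*" OR MCQ*) AND (learning OR "academic performance")
```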

Subject Librarians are unable to do the following for QUB students and staff:

  • Check review protocols
  • Peer review, or approve, search strategies        
  • Create search strategies from scratch
  • Search databases or grey literature sources
  • Deduplicate results
  • Screen results
  • Demonstrate systematic review tools (e.g. Covidence, Rayyan)
  • Create PRISMA flowcharts or similar documentation

Subject Librarians do not need to be listed as review co-authors. However, if reviewers wish to acknowledge the input of a specific Subject Librarian, they should contact the relevant Subject Librarian to ensure appropriate wording.


Systematic Review and Meta-Analysis MCQs: Multiple Choice Questions on Systematic Review and Meta-Analysis Theory, Guidelines, and Software

Systematic Review and Meta-Analysis MCQs: This is a set of multiple-choice questions (MCQs) covering the theory, guidelines, and software of systematic review and meta-analysis. Use it to test your knowledge and to prepare for examinations and other settings where it is needed. Each question offers a set of choices, among which one answer is correct.

A set of 20 Systematic Review and Meta-Analysis MCQs

  • Which of the following is not always required in a systematic review?
  • Protocol development
  • Search strategy
  • Involvement of more than one author
  • Meta-analysis

Correct answer: Meta-analysis

  • A systematic review of evidence from qualitative studies is also known as a meta-analysis. (True/False)

Correct answer: False

  • Which of the steps are included in the systematic review?
  • Formulate a question and develop a protocol
  • Conduct search
  • Select studies, assess study quality, and extract data
  • All of the above

Correct answer: All of the above

  • Where do we register the protocol of a systematic review that will be conducted at the national level?
  • Health Research Council of the country
  • ClinicalTrials.gov
  • PROSPERO

Correct answer: PROSPERO

  • What does “S” stand for in PICOS?
  • Systematic review
  • Study

Correct answer: Study

  • Which is not an effect size based on continuous variables?
  • Absolute mean difference
  • Standardized mean difference
  • Response ratio
  • Odds ratio

Correct answer: Odds ratio

  • A forest plot displays effect estimates and confidence intervals for both individual studies and meta-analyses. (True/False)

Correct answer: True
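To illustrate what such a plot encodes, below is a minimal matplotlib sketch of a forest plot; the study names, effect estimates, and confidence intervals are invented for the example.

```python
import matplotlib.pyplot as plt

# Invented effect estimates (e.g., odds ratios) with 95% confidence intervals.
studies = ["Study A", "Study B", "Study C", "Pooled"]
estimates = [0.80, 1.10, 0.95, 0.94]
lower = [0.60, 0.85, 0.70, 0.82]
upper = [1.07, 1.42, 1.29, 1.08]

y = range(len(studies))
xerr = [[e - lo for e, lo in zip(estimates, lower)],
        [hi - e for hi, e in zip(upper, estimates)]]
plt.errorbar(estimates, y, xerr=xerr, fmt="s", color="black", capsize=3)
plt.axvline(1.0, linestyle="--", color="grey")  # line of no effect
plt.yticks(y, studies)
plt.xlabel("Effect estimate (95% CI)")
plt.gca().invert_yaxis()  # first study at the top, pooled estimate at the bottom
plt.tight_layout()
plt.show()
```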

  • Which is/are the advantage(s) of meta-analyses?
  • To improve precision
  • To answer questions not posed by the individual studies
  • To settle controversies arising from apparently conflicting studies or to generate new hypotheses

  • In the inverse-variance method, larger studies are given more weight than smaller studies. (True/False)

Correct answer: True

  • Which of the following methods uses the inverse-variance method?
  • Fixed-effect method for meta-analysis
  • Random-effects method for meta-analysis

Correct answer: Both
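As a concrete illustration of the inverse-variance weighting referred to in the two questions above, here is a minimal numpy sketch of a fixed-effect pooled estimate; the effect sizes and variances are invented.

```python
import numpy as np

# Invented per-study effect estimates and their variances.
effects = np.array([0.42, 0.31, 0.55, 0.12])
variances = np.array([0.04, 0.02, 0.09, 0.01])

# Inverse-variance weights: larger (more precise) studies get more weight.
weights = 1.0 / variances
pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(f"Fixed-effect estimate: {pooled:.3f} (SE {pooled_se:.3f})")
```

The same weighting idea underlies common random-effects methods as well, which is why "Both" is the correct answer above; random-effects models simply add a between-study variance component to each study's variance before weighting.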

  • Meta-regressions are similar in essence to simple regressions, in which an outcome variable is predicted according to the values of one or more explanatory variables.

Correct answer: True

  • Which of the following reviews does not necessarily require a methodological quality assessment of included studies?
  • Narrative review
  • Scoping review
  • Both b and c

Correct answer: Both b and c

  • Which of the following is not related to selection bias?
  • Random sequence generation
  • Allocation concealment
  • Attrition

Correct answer: Attrition

  • Which is true about GRADE?
  • A framework for developing and presenting summaries of evidence and providing a systematic approach for making clinical practice recommendations
  • A tool for grading the quality of evidence and for making recommendations
  • A reproducible and transparent framework for grading certainty in evidence

  • Which of the following checklists is used to report a systematic review?

Correct answer: PRISMA

  • Which of the following is the most rigorous and methodologically complex kind of review article?

  • Which of the following studies provides the most robust evidence?
  • Randomized clinical trials
  • Cohort study
  • Cross-sectional analytical study
  • Systematic review and meta-analysis

Correct answer: Systematic review and meta-analysis

  • Steps in a meta-analysis include all of the following except:
  • Abstraction
  • Randomization

Correct answer: Randomization

  • A fixed-effects model is most appropriate in a meta-analysis when study findings are
  • Homogeneous
  • Heterogeneous
  • Either homogeneous or heterogeneous
  • Neither homogeneous nor heterogeneous

Correct answer: Homogeneous
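To make the homogeneity criterion concrete, here is a minimal numpy sketch computing Cochran's Q and the I² statistic, which are commonly used to quantify heterogeneity before choosing between fixed- and random-effects models; the data are invented.

```python
import numpy as np

effects = np.array([0.42, 0.31, 0.55, 0.12])    # invented study effects
variances = np.array([0.04, 0.02, 0.09, 0.01])  # invented study variances
weights = 1.0 / variances

pooled = np.sum(weights * effects) / np.sum(weights)
Q = np.sum(weights * (effects - pooled) ** 2)   # Cochran's Q statistic
df = len(effects) - 1
I2 = max(0.0, (Q - df) / Q) * 100               # % of variation beyond chance

print(f"Q = {Q:.2f} on {df} df, I^2 = {I2:.1f}%")
# Low I^2 (roughly homogeneous findings) supports a fixed-effects model.
```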

  • What is the full form of PRISMA?
  • Providing Results for Systematic Review and Meta-Analyses
  • Preferred Reporting Items for Systematic Reviews and Meta-Analyses
  • Providing Reporting Items for Systematic Reviews and Meta-Analyses
  • Provisional Reporting Items for Systematic Reviews and Meta-Analyses

Correct answer: Preferred Reporting Items for Systematic Reviews and Meta-Analyses


A systematic literature review of empirical research on ChatGPT in education

  • Open access
  • Published: 26 May 2024
  • Volume 3, article number 60 (2024)


  • Yazid Albadarin   ORCID: orcid.org/0009-0005-8068-8902 1 ,
  • Mohammed Saqr 1 ,
  • Nicolas Pope 1 &
  • Markku Tukiainen 1  

Over the last four decades, studies have investigated the incorporation of Artificial Intelligence (AI) into education. A recent prominent AI-powered technology that has impacted the education sector is ChatGPT. This article provides a systematic review of 14 empirical studies incorporating ChatGPT into various educational settings, published in 2022 and before the 10th of April 2023—the date of conducting the search process. It carefully followed the essential steps outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2020) guidelines, as well as Okoli’s (Okoli in Commun Assoc Inf Syst, 2015) steps for conducting a rigorous and transparent systematic review. In this review, we aimed to explore how students and teachers have utilized ChatGPT in various educational settings, as well as the primary findings of those studies. By employing Creswell’s (Creswell in Educational research: planning, conducting, and evaluating quantitative and qualitative research [Ebook], Pearson Education, London, 2015) coding techniques for data extraction and interpretation, we sought to gain insight into their initial attempts at ChatGPT incorporation into education. This approach also enabled us to extract insights and considerations that can facilitate its effective and responsible use in future educational contexts. The results of this review show that learners have utilized ChatGPT as a virtual intelligent assistant, where it offered instant feedback, on-demand answers, and explanations of complex topics. Additionally, learners have used it to enhance their writing and language skills by generating ideas, composing essays, summarizing, translating, paraphrasing texts, or checking grammar. Moreover, learners turned to it as an aiding tool to facilitate their directed and personalized learning by assisting in understanding concepts and homework, providing structured learning plans, and clarifying assignments and tasks. However, the results of specific studies (n = 3, 21.4%) show that overuse of ChatGPT may negatively impact innovative capacities and collaborative learning competencies among learners. Educators, on the other hand, have utilized ChatGPT to create lesson plans, generate quizzes, and provide additional resources, which helped them enhance their productivity and efficiency and promote different teaching methodologies. Despite these benefits, the majority of the reviewed studies recommend the importance of conducting structured training, support, and clear guidelines for both learners and educators to mitigate the drawbacks. This includes developing critical evaluation skills to assess the accuracy and relevance of information provided by ChatGPT, as well as strategies for integrating human interaction and collaboration into learning activities that involve AI tools. Furthermore, they also recommend ongoing research and proactive dialogue with policymakers, stakeholders, and educational practitioners to refine and enhance the use of AI in learning environments. This review could serve as an insightful resource for practitioners who seek to integrate ChatGPT into education and stimulate further research in the field.


1 Introduction

Educational technology, a rapidly evolving field, plays a crucial role in reshaping the landscape of teaching and learning [ 82 ]. One of the most transformative technological innovations of our era that has influenced the field of education is Artificial Intelligence (AI) [ 50 ]. Over the last four decades, AI in education (AIEd) has gained remarkable attention for its potential to make significant advancements in learning, instructional methods, and administrative tasks within educational settings [ 11 ]. In particular, a large language model (LLM), a type of AI algorithm that applies artificial neural networks (ANNs) and uses massively large data sets to understand, summarize, generate, and predict new content that is almost difficult to differentiate from human creations [ 79 ], has opened up novel possibilities for enhancing various aspects of education, from content creation to personalized instruction [ 35 ]. Chatbots that leverage the capabilities of LLMs to understand and generate human-like responses have also presented the capacity to enhance student learning and educational outcomes by engaging students, offering timely support, and fostering interactive learning experiences [ 46 ].

The ongoing and remarkable technological advancements in chatbots have made their use more convenient, increasingly natural and effortless, and have expanded their potential for deployment across various domains [ 70 ]. One prominent example of chatbot applications is the Chat Generative Pre-Trained Transformer, known as ChatGPT, which was introduced by OpenAI, a leading AI research lab, on November 30th, 2022. ChatGPT employs a variety of deep learning techniques to generate human-like text, with a particular focus on recurrent neural networks (RNNs). Long short-term memory (LSTM) allows it to grasp the context of the text being processed and retain information from previous inputs. Also, the transformer architecture, a neural network architecture based on the self-attention mechanism, allows it to analyze specific parts of the input, thereby enabling it to produce more natural-sounding and coherent output. Additionally, the unsupervised generative pre-training and the fine-tuning methods allow ChatGPT to generate more relevant and accurate text for specific tasks [ 31 , 62 ]. Furthermore, reinforcement learning from human feedback (RLHF), a machine learning approach that combines reinforcement learning techniques with human-provided feedback, has helped improve ChatGPT’s model by accelerating the learning process and making it significantly more efficient.
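Since the paragraph above leans on the transformer's self-attention mechanism, a minimal numpy sketch of scaled dot-product attention may help; the shapes and random values are purely illustrative and not tied to ChatGPT's actual implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core transformer operation: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # pairwise token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                              # weighted mix of value vectors

# Illustrative example: 3 tokens with 4-dimensional representations.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```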

This cutting-edge natural language processing (NLP) tool is widely recognized as one of today's most advanced LLM-based chatbots [ 70 ], allowing users to ask questions and receive detailed, coherent, systematic, personalized, convincing, and informative human-like responses [ 55 ], even within complex and ambiguous contexts [ 63 , 77 ]. ChatGPT is considered the fastest-growing technology in history: in just three months following its public launch, it amassed an estimated 120 million monthly active users [ 16 ] with an estimated 13 million daily queries [ 49 ], surpassing all other applications [ 64 ]. This remarkable growth can be attributed to the unique features and user-friendly interface that ChatGPT offers. Its intuitive design allows users to interact seamlessly with the technology, making it accessible to a diverse range of individuals, regardless of their technical expertise [ 78 ]. Additionally, its exceptional performance, which results from a combination of advanced algorithms, continuous enhancements, and extensive training on a diverse dataset that includes various text sources such as books, articles, websites, and online forums [ 63 ], has contributed to a more engaging and satisfying user experience [ 62 ]. These factors collectively explain its remarkable global growth and set it apart from predecessors like Bard, Bing Chat, ERNIE, and others.

In this context, several studies have explored the technological advancements of chatbots. One noteworthy recent research effort, conducted by Schöbel et al. [ 70 ], stands out for its comprehensive analysis of more than 5,000 studies on communication agents. This study offered a comprehensive overview of the historical progression and future prospects of communication agents, including ChatGPT. Moreover, other studies have focused on making comparisons, particularly between ChatGPT and alternative chatbots like Bard, Bing Chat, ERNIE, LaMDA, BlenderBot, and various others. For example, O’Leary [ 53 ] compared two chatbots, LaMDA and BlenderBot, with ChatGPT and revealed that ChatGPT outperformed both. This superiority arises from ChatGPT’s capacity to handle a wider range of questions and generate slightly varied perspectives within specific contexts. Similarly, ChatGPT exhibited an impressive ability to formulate interpretable responses that were easily understood when compared with Google's feature snippet [ 34 ]. Additionally, ChatGPT was compared to other LLMs-based chatbots, including Bard and BERT, as well as ERNIE. The findings indicated that ChatGPT exhibited strong performance in the given tasks, often outperforming the other models [ 59 ].

Furthermore, in the education context, a comprehensive study systematically compared a range of the most promising chatbots, including Bard, Bing Chat, ChatGPT, and Ernie, across a multidisciplinary test that required higher-order thinking. The study revealed that ChatGPT achieved the highest score, surpassing Bing Chat and Bard [ 64 ]. Similarly, a comparative analysis was conducted to compare ChatGPT with Bard in answering a set of 30 mathematical questions and logic problems, grouped into two question sets. Set (A) is unavailable online, while Set (B) is available online. The results revealed ChatGPT's superiority in Set (A) over Bard. Nevertheless, Bard's advantage emerged in Set (B) due to its capacity to access the internet directly and retrieve answers, a capability that ChatGPT does not possess [ 57 ]. Across these varied assessments, however, ChatGPT consistently demonstrated exceptional performance compared with the alternatives in the ever-evolving field of chatbot technology.

The widespread adoption of chatbots, especially ChatGPT, by millions of students and educators, has sparked extensive discussions regarding its incorporation into the education sector [ 64 ]. Accordingly, many scholars have contributed to the discourse, expressing both optimism and pessimism regarding the incorporation of ChatGPT into education. For example, ChatGPT has been highlighted for its capabilities in enriching the learning and teaching experience through its ability to support different learning approaches, including adaptive learning, personalized learning, and self-directed learning [ 58 , 60 , 91 ], deliver summative and formative feedback to students, provide real-time responses to questions, increase the accessibility of information [ 22 , 40 , 43 ], foster students' performance, engagement and motivation [ 14 , 44 , 58 ], and enhance teaching practices [ 17 , 18 , 64 , 74 ].

On the other hand, concerns have been also raised regarding its potential negative effects on learning and teaching. These include the dissemination of false information and references [ 12 , 23 , 61 , 85 ], biased reinforcement [ 47 , 50 ], compromised academic integrity [ 18 , 40 , 66 , 74 ], and the potential decline in students' skills [ 43 , 61 , 64 , 74 ]. As a result, ChatGPT has been banned in multiple countries, including Russia, China, Venezuela, Belarus, and Iran, as well as in various educational institutions in India, Italy, Western Australia, France, and the United States [ 52 , 90 ].

Clearly, the advent of chatbots, especially ChatGPT, has provoked significant controversy due to their potential impact on learning and teaching. This indicates the necessity for further exploration to gain a deeper understanding of this technology and carefully evaluate its potential benefits, limitations, challenges, and threats to education [ 79 ]. Therefore, conducting a systematic literature review will provide valuable insights into the potential prospects and obstacles linked to its incorporation into education. This systematic literature review will primarily focus on ChatGPT, driven by the key factors outlined above.

However, the existing literature lacks a systematic literature review of empirical studies. Thus, this systematic literature review aims to address this gap by synthesizing the existing empirical studies conducted on chatbots, particularly ChatGPT, in the field of education, highlighting how ChatGPT has been utilized in educational settings, and identifying any existing gaps. This review may be particularly useful for researchers in the field and educators who are contemplating the integration of ChatGPT or any chatbot into education. The following research questions will guide this study:

What are students' and teachers' initial attempts at utilizing ChatGPT in education?

What are the main findings derived from empirical studies that have incorporated ChatGPT into learning and teaching?

2 Methodology

To conduct this study, the authors followed the essential steps of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2020) and Okoli’s [ 54 ] steps for conducting a systematic review. These included identifying the study’s purpose, drafting a protocol, applying a practical screening process, searching the literature, extracting relevant data, evaluating the quality of the included studies, synthesizing the studies, and ultimately writing the review. The subsequent section provides an extensive explanation of how these steps were carried out in this study.

2.1 Identify the purpose

Given the widespread adoption of ChatGPT by students and teachers for various educational purposes, often without a thorough understanding of responsible and effective use or a clear recognition of its potential impact on learning and teaching, the authors recognized the need for further exploration of ChatGPT's impact on education in this early stage. Therefore, they have chosen to conduct a systematic literature review of existing empirical studies that incorporate ChatGPT into educational settings. Despite the limited number of empirical studies due to the novelty of the topic, their goal is to gain a deeper understanding of this technology and proactively evaluate its potential benefits, limitations, challenges, and threats to education. This effort could help to understand initial reactions and attempts at incorporating ChatGPT into education and bring out insights and considerations that can inform the future development of education.

2.2 Draft the protocol

The next step is formulating the protocol. This protocol serves to outline the study process in a rigorous and transparent manner, mitigating researcher bias in study selection and data extraction [ 88 ]. The protocol will include the following steps: generating the research question, predefining a literature search strategy, identifying search locations, establishing selection criteria, assessing the studies, developing a data extraction strategy, and creating a timeline.

2.3 Apply practical screen

The screening step aims to accurately filter the articles resulting from the searching step and select the empirical studies that have incorporated ChatGPT into educational contexts, which will guide us in answering the research questions and achieving the objectives of this study. To ensure the rigorous execution of this step, our inclusion and exclusion criteria were determined based on the authors' experience and informed by previous successful systematic reviews [ 21 ]. Table 1 summarizes the inclusion and exclusion criteria for study selection.

2.4 Literature search

We conducted a thorough literature search to identify articles that explored, examined, and addressed the use of ChatGPT in educational contexts. We utilized two research databases: Dimensions.ai, which provides access to a large number of research publications, and lens.org, which offers access to over 300 million articles, patents, and other research outputs from diverse sources. Additionally, we included three databases, Scopus, Web of Knowledge, and ERIC, which contain relevant research on the topic that addresses our research questions. To browse and identify relevant articles, we used the following search formula: ("ChatGPT" AND "Education"), which included the Boolean operator "AND" to get more specific results. The subject area in the Scopus and ERIC databases was narrowed to the "ChatGPT" and "Education" keywords, and in the WoS database it was limited to the "Education" category. The search was conducted between the 3rd and 10th of April 2023, which resulted in 276 articles from all selected databases (111 articles from Dimensions.ai, 65 from Scopus, 28 from Web of Science, 14 from ERIC, and 58 from Lens.org). These articles were imported into the Rayyan web-based system for analysis. The duplicates were identified automatically by the system. Subsequently, the first author manually reviewed the duplicated articles, confirmed that they had the same content, and then removed them, leaving us with 135 unique articles. Afterward, the titles, abstracts, and keywords of the first 40 manuscripts were scanned and reviewed by the first author and were discussed with the second and third authors to resolve any disagreements. Subsequently, the first author proceeded with the filtering process for all articles and carefully applied the inclusion and exclusion criteria as presented in Table 1. Articles that met any one of the exclusion criteria were eliminated, resulting in 26 articles. Afterward, the authors met to carefully scan and discuss them. The authors agreed to eliminate any empirical studies solely focused on checking ChatGPT capabilities, as these studies do not guide us in addressing the research questions and achieving the study's objectives. This resulted in 14 articles eligible for analysis.
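As a rough illustration of the screening arithmetic reported above, the sketch below simply tallies the counts from the text; the dictionary structure is hypothetical and stands in for the actual export files.

```python
# Hypothetical tally mirroring the screening numbers reported above.
per_database = {
    "Dimensions.ai": 111,
    "Scopus": 65,
    "Web of Science": 28,
    "ERIC": 14,
    "Lens.org": 58,
}
identified = sum(per_database.values())  # 276 records identified
unique = 135                             # after duplicate removal in Rayyan
screened_in = 26                         # after inclusion/exclusion criteria
eligible = 14                            # after removing capability-only studies

print(f"Identified: {identified}, unique: {unique}, "
      f"after criteria: {screened_in}, eligible: {eligible}")
```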

2.5 Quality appraisal

The examination and evaluation of the quality of the extracted articles is a vital step [ 9 ]. Therefore, the extracted articles were carefully evaluated for quality using Fink’s [ 24 ] standards, which emphasize the necessity for detailed descriptions of methodology, results, conclusions, strengths, and limitations. The process began with a thorough assessment of each study's design, data collection, and analysis methods to ensure their appropriateness and comprehensive execution. The clarity, consistency, and logical progression from data to results and conclusions were also critically examined. Potential biases and recognized limitations within the studies were also scrutinized. Ultimately, two articles were excluded for failing to meet Fink’s criteria, particularly in providing sufficient detail on methodology, results, conclusions, strengths, or limitations. The review process is illustrated in Fig.  1 .

Fig. 1 The study selection process

2.6 Data extraction

The next step is data extraction, the process of capturing the key information and categories from the included studies. To improve efficiency, reduce variation among authors, and minimize errors in data analysis, the coding categories were constructed using Creswell's [ 15 ] coding techniques for data extraction and interpretation. The coding process involves three sequential steps. The initial stage encompasses open coding, where the researcher examines the data, generates codes to describe and categorize it, and gains a deeper understanding without preconceived ideas. Following open coding is axial coding, where the interrelationships between codes from open coding are analyzed to establish more comprehensive categories or themes. The process concludes with selective coding, refining and integrating categories or themes to identify core concepts emerging from the data. The first coder performed the coding process, then engaged in discussions with the second and third authors to finalize the coding categories for the first five articles. The first coder then proceeded to code all studies and engaged again in discussions with the other authors to ensure the finalization of the coding process. After a comprehensive analysis and capturing of the key information from the included studies, the data extraction and interpretation process yielded several themes. These themes have been categorized and are presented in Table 2. It is important to note that open coding results were removed from Table 2 for aesthetic reasons, as they included many generic aspects, such as words, short phrases, or sentences mentioned in the studies.
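To visualize the three coding stages, here is a schematic sketch of how codes might flow from open to axial to selective coding; the example codes are invented and do not reproduce the authors' actual coding table.

```python
# Invented illustration of the open -> axial -> selective coding workflow.
open_codes = ["instant feedback", "on-demand answers", "grammar checking",
              "essay drafting", "lesson planning", "quiz generation"]

# Axial coding: relate open codes to broader categories.
axial = {
    "virtual intelligent assistant": ["instant feedback", "on-demand answers"],
    "writing support": ["grammar checking", "essay drafting"],
    "teaching support": ["lesson planning", "quiz generation"],
}

# Selective coding: integrate categories into core themes.
selective = {
    "learning": ["virtual intelligent assistant", "writing support"],
    "teaching": ["teaching support"],
}
print(selective)
```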

2.7 Synthesize studies

In this stage, we will gather, discuss, and analyze the key findings that emerged from the selected studies. The synthesis stage is considered a transition from an author-centric to a concept-centric focus, enabling us to map all the provided information to achieve the most effective evaluation of the data [ 87 ]. Initially, the authors extracted data that included general information about the selected studies, including the author(s)' names, study titles, years of publication, educational levels, research methodologies, sample sizes, participants, main aims or objectives, raw data sources, and analysis methods. Following that, all key information and significant results from the selected studies were compiled using Creswell's [ 15 ] coding techniques for data extraction and interpretation to identify core concepts and themes emerging from the data, focusing on those that directly contributed to our research questions and objectives, such as the initial utilization of ChatGPT in learning and teaching, learners' and educators' familiarity with ChatGPT, and the main findings of each study. Finally, the data related to each selected study were extracted into an Excel spreadsheet for data processing. The Excel spreadsheet was reviewed by the authors, including a series of discussions to ensure the finalization of this process and prepare it for further analysis. Afterward, the final results were analyzed and presented in various types of charts and graphs. Table 4 presents the extracted data from the selected studies, with each study labeled with a capital 'S' followed by a number.

3 Results

This section consists of two main parts. The first part provides a descriptive analysis of the data compiled from the reviewed studies. The second part presents the answers to the research questions and the main findings of these studies.

3.1 Part 1: descriptive analysis

This section will provide a descriptive analysis of the reviewed studies, including educational levels and fields, participant distribution, country contributions, research methodologies, study sample sizes, study populations, publication years, the list of journals, familiarity with ChatGPT, sources of data, and the main aims and objectives of the studies. Table 4 presents a comprehensive overview of the extracted data from the selected studies.

3.1.1 The number of the reviewed studies and publication years

The total number of the reviewed studies was 14. All studies were empirical studies and published in different journals focusing on Education and Technology. One study was published in 2022 [S1], while the remaining were published in 2023 [S2]-[S14]. Table 3 illustrates the year of publication, the names of the journals, and the number of reviewed studies published in each journal for the studies reviewed.

3.1.2 Educational levels and fields

The majority of the reviewed studies, 11 studies, were conducted in higher education institutions [S1]-[S10] and [S13]. Two studies did not specify the educational level of the population [S12] and [S14], while one study focused on elementary education [S11]. However, the reviewed studies covered various fields of education. Three studies focused on Arts and Humanities Education [S8], [S11], and [S14], specifically English Education. Two studies focused on Engineering Education, with one in Computer Engineering [S2] and the other in Construction Education [S3]. Two studies focused on Mathematics Education [S5] and [S12]. One study focused on Social Science Education [S13]. One study focused on Early Education [S4]. One study focused on Journalism Education [S9]. Finally, three studies did not specify the field of education [S1], [S6], and [S7]. Figure  2 represents the educational levels in the reviewed studies, while Fig.  3 represents the context of the reviewed studies.

Fig. 2 Educational levels in the reviewed studies

Fig. 3 Context of the reviewed studies

3.1.3 Participants distribution and countries contribution

The reviewed studies have been conducted across different geographic regions, providing a diverse representation of the studies. The majority of the studies, 10 in total, [S1]-[S3], [S5]-[S9], [S11], and [S14], primarily focused on participants from single countries such as Pakistan, the United Arab Emirates, China, Indonesia, Poland, Saudi Arabia, South Korea, Spain, Tajikistan, and the United States. In contrast, four studies, [S4], [S10], [S12], and [S13], involved participants from multiple countries, including China and the United States [S4]; China, the United Kingdom, and the United States [S10]; the United Arab Emirates, Oman, Saudi Arabia, and Jordan [S12]; and Turkey, Sweden, Canada, and Australia [S13]. Figures 4 and 5 illustrate the distribution of participants, whether from single or multiple countries, and the contribution of each country in the reviewed studies, respectively.

Fig. 4 The reviewed studies conducted in single or multiple countries

Fig. 5 The contribution of each country in the studies

3.1.4 Study population and sample size

Four study populations were included: university students, university teachers, university teachers and students, and elementary school teachers. Six studies involved university students [S2], [S3], [S5], and [S6]-[S8]. Three studies focused on university teachers [S1], [S4], and [S6], while one study specifically targeted elementary school teachers [S11]. Additionally, four studies included both university teachers and students [S10] and [S12]-[S14], and among them, study [S13] specifically included postgraduate students. In terms of the sample size of the reviewed studies, nine studies included a small sample size of less than 50 participants [S1], [S3], [S6], [S8], and [S10]-[S13]. Three studies had 50–100 participants [S2], [S9], and [S14]. Only one study had more than 100 participants [S7]. It is worth mentioning that study [S4] adopted a mixed methods approach, including 10 participants for qualitative analysis and 110 participants for quantitative analysis.

3.1.5 Participants’ familiarity with using ChatGPT

The reviewed studies recruited a diverse range of participants with varying levels of familiarity with ChatGPT. Five studies [S2], [S4], [S6], [S8], and [S12] involved participants already familiar with ChatGPT, while eight studies [S1], [S3], [S5], [S7], [S9], [S10], [S13] and [S14] included individuals with differing levels of familiarity. Notably, one study [S11] had participants who were entirely unfamiliar with ChatGPT. It is important to note that four studies [S3], [S5], [S9], and [S11] provided training or guidance to their participants before conducting their studies, while ten studies [S1], [S2], [S4], [S6]-[S8], [S10], and [S12]-[S14] did not provide training due to the participants' existing familiarity with ChatGPT.

3.1.6 Research methodology approaches and source(S) of data

The reviewed studies adopted various research methodology approaches. Seven studies adopted qualitative research methodology [S1], [S4], [S6], [S8], [S10], [S11], and [S12], while three studies adopted quantitative research methodology [S3], [S7], and [S14], and four studies employed mixed-methods, which involved a combination of both the strengths of qualitative and quantitative methods [S2], [S5], [S9], and [S13].

In terms of the source(s) of data, the reviewed studies obtained their data from various sources, such as interviews, questionnaires, and pre- and post-tests. Six studies relied on interviews as their primary source of data collection [S1], [S4], [S6], [S10], [S11], and [S12], four studies relied on questionnaires [S2], [S7], [S13], and [S14], two studies combined the use of pre- and post-tests and questionnaires for data collection [S3] and [S9], while two studies combined the use of questionnaires and interviews to obtain the data [S5] and [S8]. It is important to note that six of the reviewed studies were quasi-experimental [S3], [S5], [S8], [S9], [S12], and [S14], while the remaining ones were experimental studies [S1], [S2], [S4], [S6], [S7], [S10], [S11], and [S13]. Figures 6 and 7 illustrate the research methodologies and the source(s) of data used in the reviewed studies, respectively.

Fig. 6 Research methodologies in the reviewed studies

Fig. 7 Source of data in the reviewed studies

3.1.7 The aim and objectives of the studies

The reviewed studies encompassed a diverse set of aims, with several of them incorporating multiple primary objectives. Six studies [S3], [S6], [S7], [S8], [S11], and [S12] examined the integration of ChatGPT in educational contexts, and four studies [S4], [S5], [S13], and [S14] investigated the various implications of its use in education, while three studies [S2], [S9], and [S10] aimed to explore both its integration and implications in education. Additionally, seven studies explicitly explored attitudes and perceptions of students [S2] and [S3], educators [S1] and [S6], or both [S10], [S12], and [S13] regarding the utilization of ChatGPT in educational settings.

3.2 Part 2: research questions and main findings of the reviewed studies

This part will present the answers to the research questions and the main findings of the reviewed studies, classified into two main categories (learning and teaching) according to the AI in education classification by [ 36 ]. Figure 8 summarizes the main findings of the reviewed studies in a visually informative diagram. Table 4 provides a detailed list of the key information extracted from the selected studies that led to generating these themes.

Fig. 8 The main findings in the reviewed studies

4 Students' initial attempts at utilizing ChatGPT in learning and main findings from students' perspective

4.1 Virtual intelligent assistant

Nine studies demonstrated that ChatGPT has been utilized by students as an intelligent assistant to enhance and support their learning. Students employed it for various purposes, such as answering on-demand questions [S2]-[S5], [S8], [S10], and [S12], providing valuable information and learning resources [S2]-[S5], [S6], and [S8], as well as receiving immediate feedback [S2], [S4], [S9], [S10], and [S12]. In this regard, students generally were confident in the accuracy of ChatGPT's responses, considering them relevant, reliable, and detailed [S3], [S4], [S5], and [S8]. However, some students indicated the need for improvement, as they found that answers are not always accurate [S2], and that misleading information may have been provided or that it may not always align with their expectations [S6] and [S10]. It was also observed by the students that the accuracy of ChatGPT is dependent on several factors, including the quality and specificity of the user's input, the complexity of the question or topic, and the scope and relevance of its training data [S12]. Many students felt that ChatGPT's answers were not always accurate and most of them believed that it requires good background knowledge to work with.

4.2 Writing and language proficiency assistant

Six of the reviewed studies highlighted that ChatGPT has been utilized by students as a valuable assistant tool to improve their academic writing skills and language proficiency. Among these studies, three mainly focused on English education, demonstrating that students showed sufficient mastery in using ChatGPT for generating ideas, summarizing, paraphrasing texts, and completing writing essays [S8], [S11], and [S14]. Furthermore, ChatGPT helped them in writing by making students active investigators rather than passive knowledge recipients and facilitated the development of their writing skills [S11] and [S14]. Similarly, ChatGPT allowed students to generate unique ideas and perspectives, leading to deeper analysis and reflection on their journalism writing [S9]. In terms of language proficiency, ChatGPT allowed participants to translate content into their home languages, making it more accessible and relevant to their context [S4]. It also enabled them to request changes in linguistic tones or flavors [S8]. Moreover, participants used it to check grammar or as a dictionary [S11].

4.3 Valuable resource for learning approaches

Five studies demonstrated that students used ChatGPT as a valuable complementary resource for self-directed learning. It provided learning resources and guidance on diverse educational topics and created a supportive home learning environment [S2] and [S4]. Moreover, it offered step-by-step guidance to grasp concepts at their own pace and enhance their understanding [S5], streamlined task and project completion carried out independently [S7], provided comprehensive and easy-to-understand explanations on various subjects [S10], and assisted in studying geometry operations, thereby empowering them to explore geometry operations at their own pace [S12]. Three studies showed that students used ChatGPT as a valuable learning resource for personalized learning. It delivered age-appropriate conversations and tailored teaching based on a child's interests [S4], acted as a personalized learning assistant, adapted to their needs and pace, which assisted them in understanding mathematical concepts [S12], and enabled personalized learning experiences in social sciences by adapting to students' needs and learning styles [S13]. On the other hand, it is important to note that, according to one study [S5], students suggested that using ChatGPT may negatively affect collaborative learning competencies between students.

4.4 Enhancing students' competencies

Six of the reviewed studies have shown that ChatGPT is a valuable tool for improving a wide range of skills among students. Two studies have provided evidence that ChatGPT led to improvements in students' critical thinking, reasoning skills, and hazard recognition competencies through engaging them in interactive conversations or activities and providing responses related to their disciplines in journalism [S5] and construction education [S9]. Furthermore, two studies focused on mathematical education have shown the positive impact of ChatGPT on students' problem-solving abilities in unraveling problem-solving questions [S12] and enhancing the students' understanding of the problem-solving process [S5]. Lastly, one study indicated that ChatGPT effectively contributed to the enhancement of conversational social skills [S4].

4.5 Supporting students' academic success

Seven of the reviewed studies highlighted that students found ChatGPT to be beneficial for learning as it enhanced learning efficiency and improved the learning experience. It has been observed to improve students' efficiency in computer engineering studies by providing well-structured responses and good explanations [S2]. Additionally, students found it extremely useful for hazard reporting [S3], and it also enhanced their efficiency in solving mathematics problems and capabilities [S5] and [S12]. Furthermore, by finding information, generating ideas, translating texts, and providing alternative questions, ChatGPT aided students in deepening their understanding of various subjects [S6]. It contributed to an increase in students' overall productivity [S7] and improved efficiency in composing written tasks [S8]. Regarding learning experiences, ChatGPT was instrumental in assisting students in identifying hazards that they might have otherwise overlooked [S3]. It also improved students' learning experiences in solving mathematics problems and developing abilities [S5] and [S12]. Moreover, it increased students' successful completion of important tasks in their studies [S7], particularly those involving average difficulty writing tasks [S8]. Additionally, ChatGPT increased the chances of educational success by providing students with baseline knowledge on various topics [S10].

5 Teachers' initial attempts at utilizing ChatGPT in teaching and main findings from teachers' perspective

5.1 Valuable resource for teaching

The reviewed studies showed that teachers have employed ChatGPT to recommend, modify, and generate diverse, creative, organized, and engaging educational contents, teaching materials, and testing resources more rapidly [S4], [S6], [S10] and [S11]. Additionally, teachers experienced increased productivity as ChatGPT facilitated quick and accurate responses to questions, fact-checking, and information searches [S1]. It also proved valuable in constructing new knowledge [S6] and providing timely answers to students' questions in classrooms [S11]. Moreover, ChatGPT enhanced teachers' efficiency by generating new ideas for activities and preplanning activities for their students [S4] and [S6], including interactive language game partners [S11].

5.2 Improving productivity and efficiency

The reviewed studies showed that participants' productivity and work efficiency were significantly enhanced by using ChatGPT, as it enabled them to allocate more time to other tasks and reduce their overall workloads [S6], [S10], [S11], [S13], and [S14]. However, three studies [S1], [S4], and [S11] indicated a negative perception and attitude among teachers toward using ChatGPT. This negativity stemmed from a lack of necessary skills to use it effectively [S1], a limited familiarity with it [S4], and occasional inaccuracies in the content provided by it [S10].

5.3 Catalyzing new teaching methodologies

Five of the reviewed studies highlighted that educators found the necessity of redefining their teaching profession with the assistance of ChatGPT [S11], developing new effective learning strategies [S4], and adapting teaching strategies and methodologies to ensure the development of essential skills for future engineers [S5]. They also emphasized the importance of adopting new educational philosophies and approaches that can evolve with the introduction of ChatGPT into the classroom [S12]. Furthermore, updating curricula to focus on improving human-specific features, such as emotional intelligence, creativity, and philosophical perspectives [S13], was found to be essential.

5.4 Effective utilization of ChatGPT in teaching

According to the reviewed studies, effective utilization of ChatGPT in education requires providing teachers with well-structured training, support, and adequate background on how to use ChatGPT responsibly [S1], [S3], [S11], and [S12]. Establishing clear rules and regulations regarding its usage is essential to ensure it positively impacts the teaching and learning processes, including students' skills [S1], [S4], [S5], [S8], [S9], and [S11]-[S14]. Moreover, conducting further research and engaging in discussions with policymakers and stakeholders is indeed crucial for the successful integration of ChatGPT in education and to maximize the benefits for both educators and students [S1], [S6]-[S10], and [S12]-[S14].

6 Discussion

The purpose of this review is to conduct a systematic review of empirical studies that have explored the utilization of ChatGPT, one of today's most advanced LLM-based chatbots, in education. The findings of the reviewed studies showed several ways in which ChatGPT has been utilized in different learning and teaching practices and provided insights and considerations that can facilitate its effective and responsible use in future educational contexts. The results of the reviewed studies came from diverse fields of education, which helped us avoid a review biased toward a specific field. Similarly, the reviewed studies were conducted across different geographic regions; this variety in geographic representation enriched the findings of this review.

In response to RQ1, "What are students' and teachers' initial attempts at utilizing ChatGPT in education?", the findings from this review provide comprehensive insights. Chatbots, including ChatGPT, play a crucial role in supporting student learning, enhancing their learning experiences, and facilitating diverse learning approaches [ 42 , 43 ]. This review found that ChatGPT has been instrumental in enhancing students' learning experiences by serving as a virtual intelligent assistant, providing immediate feedback and on-demand answers, and engaging in educational conversations. Additionally, students have benefited from ChatGPT's ability to generate ideas, compose essays, and perform tasks like summarizing, translating, paraphrasing texts, or checking grammar, thereby enhancing their writing and language competencies. Furthermore, students have turned to ChatGPT for assistance in understanding concepts and homework, structured learning plans, and clarification of assignments and tasks, which fosters a supportive home learning environment and allows them to take responsibility for their own learning and to cultivate the skills and approaches such an environment requires [ 26 , 27 , 28 ]. This finding aligns with the work of Saqr et al. [ 68 , 69 ], who highlighted that when students actively engage in their own learning process, it yields additional advantages, such as heightened motivation, enhanced achievement, and the cultivation of enthusiasm, turning them into advocates for their own learning.

Moreover, students have utilized ChatGPT for tailored teaching and step-by-step guidance on diverse educational topics, streamlining task and project completion, and generating and recommending educational content. This personalization enhances the learning environment, leading to increased academic success. This finding aligns with other recent studies [ 26 , 27 , 28 , 60 , 66 ], which revealed that ChatGPT has the potential to offer personalized learning experiences and support an effective learning process by providing students with customized feedback and explanations tailored to their needs and abilities, ultimately fostering students' performance, engagement, and motivation and leading to increased academic success [ 14 , 44 , 58 ]. This ultimate outcome is in line with the findings of Saqr et al. [ 68 , 69 ], which emphasized that learning strategies are important catalysts of students' learning, as students who utilize effective learning strategies are more likely to have better academic achievement.

Teachers, too, have capitalized on ChatGPT's capabilities to enhance productivity and efficiency, using it for creating lesson plans, generating quizzes, providing additional resources, generating and preplanning new ideas for activities, and aiding in answering students’ questions. This adoption of technology introduces new opportunities to support teaching and learning practices, enhancing teacher productivity. This finding aligns with those of Day [ 17 ], De Castro [ 18 ], and Su and Yang [ 74 ] as well as with those of Valtonen et al. [ 82 ], who revealed that emerging technological advancements have opened up novel opportunities and means to support teaching and learning practices, and enhance teachers’ productivity.

In response to RQ2 , "What are the main findings derived from empirical studies that have incorporated ChatGPT into learning and teaching?", the findings from this review provide profound insights and raise significant concerns. Starting with the insights, chatbots, including ChatGPT, have demonstrated the potential to reshape and revolutionize education, creating new, novel opportunities for enhancing the learning process and outcomes [ 83 ], facilitating different learning approaches, and offering a range of pedagogical benefits [ 19 , 43 , 72 ]. In this context, this review found that ChatGPT could open avenues for educators to adopt or develop new effective learning and teaching strategies that can evolve with the introduction of ChatGPT into the classroom. Nonetheless, there is an evident lack of research understanding regarding the potential impact of generative machine learning models within diverse educational settings [ 83 ]. This necessitates teachers to attain a high level of proficiency in incorporating chatbots, such as ChatGPT, into their classrooms to create inventive, well-structured, and captivating learning strategies. In the same vein, the review also found that teachers without the requisite skills to utilize ChatGPT realized that it did not contribute positively to their work and could potentially have adverse effects [ 37 ]. This concern could lead to inequity of access to the benefits of chatbots, including ChatGPT, as individuals who lack the necessary expertise may not be able to harness their full potential, resulting in disparities in educational outcomes and opportunities. Therefore, immediate action is needed to address these potential issues. A potential solution is offering training, support, and competency development for teachers to ensure that all of them can leverage chatbots, including ChatGPT, effectively and equitably in their educational practices [ 5 , 28 , 80 ], which could enhance accessibility and inclusivity, and potentially result in innovative outcomes [ 82 , 83 ].

Additionally, chatbots, including ChatGPT, have the potential to significantly impact students' thinking abilities, including retention, reasoning, analysis skills [ 19 , 45 ], and foster innovation and creativity capabilities [ 83 ]. This review found that ChatGPT could contribute to improving a wide range of skills among students. However, it found that frequent use of ChatGPT may result in a decrease in innovative capacities, collaborative skills and cognitive capacities, and students' motivation to attend classes, as well as could lead to reduced higher-order thinking skills among students [ 22 , 29 ]. Therefore, immediate action is needed to carefully examine the long-term impact of chatbots such as ChatGPT, on learning outcomes as well as to explore its incorporation into educational settings as a supportive tool without compromising students' cognitive development and critical thinking abilities. In the same vein, the review also found that it is challenging to draw a consistent conclusion regarding the potential of ChatGPT to aid self-directed learning approach. This finding aligns with the recent study of Baskara [ 8 ]. Therefore, further research is needed to explore the potential of ChatGPT for self-directed learning. One potential solution involves utilizing learning analytics as a novel approach to examine various aspects of students' learning and support them in their individual endeavors [ 32 ]. This approach can bridge this gap by facilitating an in-depth analysis of how learners engage with ChatGPT, identifying trends in self-directed learning behavior, and assessing its influence on their outcomes.

Turning to the significant concerns, on the other hand, a fundamental challenge with LLM-based chatbots, including ChatGPT, is the accuracy and quality of the provided information and responses, as they provide false information as truth—a phenomenon often referred to as "hallucination" [ 3 , 49 ]. In this context, this review found that the provided information was not entirely satisfactory. Consequently, the utilization of chatbots presents potential concerns, such as generating and providing inaccurate or misleading information, especially for students who utilize it to support their learning. This finding aligns with other findings [ 6 , 30 , 35 , 40 ] which revealed that incorporating chatbots such as ChatGPT, into education presents challenges related to its accuracy and reliability due to its training on a large corpus of data, which may contain inaccuracies and the way users formulate or ask ChatGPT. Therefore, immediate action is needed to address these potential issues. One possible solution is to equip students with the necessary skills and competencies, which include a background understanding of how to use it effectively and the ability to assess and evaluate the information it generates, as the accuracy and the quality of the provided information depend on the input, its complexity, the topic, and the relevance of its training data [ 28 , 49 , 86 ]. However, it's also essential to examine how learners can be educated about how these models operate, the data used in their training, and how to recognize their limitations, challenges, and issues [ 79 ].

Furthermore, chatbots present a substantial challenge concerning academic integrity [ 20 , 56 ] and copyright violations [ 83 ], which are significant concerns in education. The review found that the potential misuse of ChatGPT might foster cheating, facilitate plagiarism, and threaten academic integrity. This issue is also affirmed by the research conducted by Basic et al. [ 7 ], who presented evidence that students who utilized ChatGPT in their writing assignments had more plagiarism cases than those who did not. These findings align with the conclusions drawn by Cotton et al. [ 13 ], Hisan and Amri [ 33 ], and Sullivan et al. [ 75 ], who revealed that the integration of chatbots such as ChatGPT into education poses a significant challenge to the preservation of academic integrity. Moreover, chatbots, including ChatGPT, have increased the difficulty in identifying plagiarism [ 47 , 67 , 76 ]. The findings from previous studies [ 1 , 84 ] indicate that AI-generated text often went undetected by plagiarism software, such as Turnitin. However, Turnitin and other similar plagiarism detection tools, such as ZeroGPT, GPTZero, and Copyleaks, have since evolved, incorporating enhanced techniques to detect AI-generated text, despite the possibility of false positives, as noted in studies that have found these tools not yet fully ready to accurately and reliably identify AI-generated text [ 10 , 51 ], and novel detection methods may need to be created and implemented for AI-generated text detection [ 4 ]. This potential issue could lead to another concern, which is the difficulty of accurately evaluating student performance when students utilize the assistance of chatbots such as ChatGPT in their assignments. Consequently, most LLM-driven chatbots present a substantial challenge to traditional assessments [ 64 ]. The findings from previous studies indicate the importance of rethinking, improving, and redesigning innovative assessment methods in the era of chatbots [ 14 , 20 , 64 , 75 ]. These methods should prioritize evaluating students' ability to apply knowledge to complex cases and demonstrate comprehension, rather than solely focusing on the final product for assessment. Therefore, immediate action is needed to address these potential issues. One possible solution would be the development of clear guidelines, regulatory policies, and pedagogical guidance. These measures would help regulate the proper and ethical utilization of chatbots, such as ChatGPT, and must be established before their introduction to students [ 35 , 38 , 39 , 41 , 89 ].

In summary, our review has delved into the utilization of ChatGPT, a prominent example of chatbots, in education, addressing the question of how it has been utilized by learners and educators. However, significant gaps remain that necessitate further research to shed light on this area.

7 Conclusions

This systematic review has shed light on the varied initial attempts at incorporating ChatGPT into education by both learners and educators, while also offering insights and considerations that can facilitate its effective and responsible use in future educational contexts. From the analysis of the 14 selected studies, the review revealed the dual-edged impact of ChatGPT in educational settings. On the positive side, ChatGPT significantly aided the learning process in various ways. Learners have used it as a virtual intelligent assistant, benefiting from its ability to provide immediate feedback, on-demand answers, and easy access to educational resources. Additionally, learners have used it to enhance their writing and language skills, engaging in practices such as generating ideas, composing essays, and performing tasks like summarizing, translating, paraphrasing texts, or checking grammar. Importantly, learners have also utilized it to support and facilitate their directed and personalized learning on a broad range of educational topics, assisting in understanding concepts and homework, providing structured learning plans, and clarifying assignments and tasks. Educators, on the other hand, found ChatGPT beneficial for enhancing productivity and efficiency. They used it for creating lesson plans, generating quizzes, providing additional resources, and answering learners' questions, which saved time and allowed for more dynamic and engaging teaching strategies and methodologies.

However, the review also pointed out negative impacts. The results revealed that overuse of ChatGPT could decrease innovative capacities and collaborative learning among learners. Specifically, relying too much on ChatGPT for quick answers can inhibit learners' critical thinking and problem-solving skills. Learners might not engage deeply with the material or consider multiple solutions to a problem. This tendency was particularly evident in group projects, where learners preferred consulting ChatGPT individually for solutions over brainstorming and collaborating with peers, which negatively affected their teamwork abilities. On a broader level, integrating ChatGPT into education has also raised several concerns, including the potential for providing inaccurate or misleading information, issues of inequity in access, challenges related to academic integrity, and the possibility of misusing the technology.

Accordingly, this review emphasizes the urgency of developing clear rules, policies, and regulations to ensure ChatGPT's effective and responsible use in educational settings, alongside other chatbots, by both learners and educators. This requires providing well-structured training to educate them on responsible usage and understanding its limitations, along with offering sufficient background information. Moreover, it highlights the importance of rethinking, improving, and redesigning innovative teaching and assessment methods in the era of ChatGPT. Furthermore, conducting further research and engaging in discussions with policymakers and stakeholders are essential steps to maximize the benefits for both educators and learners and ensure academic integrity.

It is important to acknowledge that this review has certain limitations. Firstly, the limited number of included studies can be attributed to several reasons, including the novelty of the technology, as new technologies often face initial skepticism and cautious adoption; the lack of clear guidelines or best practices for leveraging this technology for educational purposes; and institutional or governmental policies affecting its utilization in educational contexts. These factors, in turn, have limited the number of studies available for review. Secondly, the reviewed studies used the original version of ChatGPT, based on GPT-3 or GPT-3.5, which implies that new studies utilizing the updated version, GPT-4, may lead to different findings. Therefore, conducting follow-up systematic reviews is essential once more empirical studies on ChatGPT are published. Additionally, long-term studies are necessary to thoroughly examine and assess the impact of ChatGPT on various educational practices.

Despite these limitations, this systematic review has highlighted the transformative potential of ChatGPT in education, revealing its diverse utilization by learners and educators alike, summarizing the benefits of incorporating it into education, and outlining the critical concerns and challenges that must be addressed to facilitate its effective and responsible use in future educational contexts. This review can serve as an insightful resource for practitioners who seek to integrate ChatGPT into education and stimulate further research in the field.

Data availability

The data supporting our findings are available upon request.

Abbreviations

  • AI: Artificial intelligence
  • AIEd: AI in education
  • LLM: Large language model
  • ANN: Artificial neural networks
  • ChatGPT: Chat Generative Pre-Trained Transformer
  • RNN: Recurrent neural networks
  • LSTM: Long short-term memory
  • RLHF: Reinforcement learning from human feedback
  • NLP: Natural language processing
  • PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses

AlAfnan MA, Dishari S, Jovic M, Lomidze K. ChatGPT as an educational tool: opportunities, challenges, and recommendations for communication, business writing, and composition courses. J Artif Intell Technol. 2023. https://doi.org/10.37965/jait.2023.0184 .


Ali JKM, Shamsan MAA, Hezam TA, Mohammed AAQ. Impact of ChatGPT on learning motivation. J Engl Stud Arabia Felix. 2023;2(1):41–9. https://doi.org/10.56540/jesaf.v2i1.51 .

Alkaissi H, McFarlane SI. Artificial hallucinations in ChatGPT: implications in scientific writing. Cureus. 2023. https://doi.org/10.7759/cureus.35179 .

Anderson N, Belavý DL, Perle SM, Hendricks S, Hespanhol L, Verhagen E, Memon AR. AI did not write this manuscript, or did it? Can we trick the AI text detector into generated texts? The potential future of ChatGPT and AI in sports & exercise medicine manuscript generation. BMJ Open Sport Exerc Med. 2023;9(1): e001568. https://doi.org/10.1136/bmjsem-2023-001568 .

Ausat AMA, Massang B, Efendi M, Nofirman N, Riady Y. Can chat GPT replace the role of the teacher in the classroom: a fundamental analysis. J Educ. 2023;5(4):16100–6.


Baidoo-Anu D, Ansah L. Education in the Era of generative artificial intelligence (AI): understanding the potential benefits of ChatGPT in promoting teaching and learning. Soc Sci Res Netw. 2023. https://doi.org/10.2139/ssrn.4337484 .

Basic Z, Banovac A, Kruzic I, Jerkovic I. Better by you, better than me, ChatGPT3 as writing assistance in students' essays. 2023. arXiv preprint arXiv:2302.04536.

Baskara FR. The promises and pitfalls of using chat GPT for self-determined learning in higher education: an argumentative review. Prosiding Seminar Nasional Fakultas Tarbiyah dan Ilmu Keguruan IAIM Sinjai. 2023;2:95–101. https://doi.org/10.47435/sentikjar.v2i0.1825 .

Behera RK, Bala PK, Dhir A. The emerging role of cognitive computing in healthcare: a systematic literature review. Int J Med Inform. 2019;129:154–66. https://doi.org/10.1016/j.ijmedinf.2019.04.024 .

Chaka C. Detecting AI content in responses generated by ChatGPT, YouChat, and Chatsonic: the case of five AI content detection tools. J Appl Learn Teach. 2023. https://doi.org/10.37074/jalt.2023.6.2.12 .

Chiu TKF, Xia Q, Zhou X, Chai CS, Cheng M. Systematic literature review on opportunities, challenges, and future research recommendations of artificial intelligence in education. Comput Educ Artif Intell. 2023;4:100118. https://doi.org/10.1016/j.caeai.2022.100118 .

Choi EPH, Lee JJ, Ho M, Kwok JYY, Lok KYW. Chatting or cheating? The impacts of ChatGPT and other artificial intelligence language models on nurse education. Nurse Educ Today. 2023;125:105796. https://doi.org/10.1016/j.nedt.2023.105796 .

Cotton D, Cotton PA, Shipway JR. Chatting and cheating: ensuring academic integrity in the era of ChatGPT. Innov Educ Teach Int. 2023. https://doi.org/10.1080/14703297.2023.2190148 .

Crawford J, Cowling M, Allen K. Leadership is needed for ethical ChatGPT: Character, assessment, and learning using artificial intelligence (AI). J Univ Teach Learn Pract. 2023. https://doi.org/10.53761/1.20.3.02 .

Creswell JW. Educational research: planning, conducting, and evaluating quantitative and qualitative research [Ebook]. 4th ed. London: Pearson Education; 2015.

Curry D. ChatGPT Revenue and Usage Statistics (2023)—Business of Apps. 2023. https://www.businessofapps.com/data/chatgpt-statistics/

Day T. A preliminary investigation of fake peer-reviewed citations and references generated by ChatGPT. Prof Geogr. 2023. https://doi.org/10.1080/00330124.2023.2190373 .

De Castro CA. A Discussion about the Impact of ChatGPT in education: benefits and concerns. J Bus Theor Pract. 2023;11(2):p28. https://doi.org/10.22158/jbtp.v11n2p28 .

Deng X, Yu Z. A meta-analysis and systematic review of the effect of Chatbot technology use in sustainable education. Sustainability. 2023;15(4):2940. https://doi.org/10.3390/su15042940 .

Eke DO. ChatGPT and the rise of generative AI: threat to academic integrity? J Responsib Technol. 2023;13:100060. https://doi.org/10.1016/j.jrt.2023.100060 .

Elmoazen R, Saqr M, Tedre M, Hirsto L. A systematic literature review of empirical research on epistemic network analysis in education. IEEE Access. 2022;10:17330–48. https://doi.org/10.1109/access.2022.3149812 .

Farrokhnia M, Banihashem SK, Noroozi O, Wals AEJ. A SWOT analysis of ChatGPT: implications for educational practice and research. Innov Educ Teach Int. 2023. https://doi.org/10.1080/14703297.2023.2195846 .

Fergus S, Botha M, Ostovar M. Evaluating academic answers generated using ChatGPT. J Chem Educ. 2023;100(4):1672–5. https://doi.org/10.1021/acs.jchemed.3c00087 .

Fink A. Conducting research literature reviews: from the Internet to paper. SAGE Publications, Incorporated; 2010.

Firaina R, Sulisworo D. Exploring the usage of ChatGPT in higher education: frequency and impact on productivity. Buletin Edukasi Indonesia (BEI). 2023;2(01):39–46. https://doi.org/10.56741/bei.v2i01.310 .

Firat M. How chat GPT can transform autodidactic experiences and open education. Department of Distance Education, Open Education Faculty, Anadolu University; 2023. https://orcid.org/0000-0001-8707-5918

Firat M. What ChatGPT means for universities: perceptions of scholars and students. J Appl Learn Teach. 2023. https://doi.org/10.37074/jalt.2023.6.1.22 .

Fuchs K. Exploring the opportunities and challenges of NLP models in higher education: is Chat GPT a blessing or a curse? Front Educ. 2023. https://doi.org/10.3389/feduc.2023.1166682 .

García-Peñalvo FJ. La percepción de la inteligencia artificial en contextos educativos tras el lanzamiento de ChatGPT: disrupción o pánico. Educ Knowl Soc. 2023;24: e31279. https://doi.org/10.14201/eks.31279 .

Gilson A, Safranek CW, Huang T, Socrates V, Chi L, Taylor A, Chartash D. How does ChatGPT perform on the United States medical Licensing examination? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ. 2023;9: e45312. https://doi.org/10.2196/45312 .

Hashana AJ, Brundha P, Ayoobkhan MUA, Fazila S. Deep Learning in ChatGPT—A Survey. In: 2023 7th International Conference on Trends in Electronics and Informatics (ICOEI). IEEE; 2023. p. 1001–5. https://doi.org/10.1109/icoei56765.2023.10125852

Hirsto L, Saqr M, López-Pernas S, Valtonen T. A systematic narrative review of learning analytics research in K-12 and schools. Proceedings. 2022. https://ceur-ws.org/Vol-3383/FLAIEC22_paper_9536.pdf

Hisan UK, Amri MM. ChatGPT and medical education: a double-edged sword. J Pedag Educ Sci. 2023;2(01):71–89. https://doi.org/10.13140/RG.2.2.31280.23043/1 .

Hopkins AM, Logan JM, Kichenadasse G, Sorich MJ. Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift. JNCI Cancer Spectr. 2023. https://doi.org/10.1093/jncics/pkad010 .

Househ M, AlSaad R, Alhuwail D, Ahmed A, Healy MG, Latifi S, Sheikh J. Large Language models in medical education: opportunities, challenges, and future directions. JMIR Med Educ. 2023;9: e48291. https://doi.org/10.2196/48291 .

Ilkka T. The impact of artificial intelligence on learning, teaching, and education. Minist de Educ. 2018. https://doi.org/10.2760/12297 .

Iqbal N, Ahmed H, Azhar KA. Exploring teachers’ attitudes towards using CHATGPT. Globa J Manag Adm Sci. 2022;3(4):97–111. https://doi.org/10.46568/gjmas.v3i4.163 .

Irfan M, Murray L, Ali S. Integration of Artificial intelligence in academia: a case study of critical teaching and learning in Higher education. Globa Soc Sci Rev. 2023;8(1):352–64. https://doi.org/10.31703/gssr.2023(viii-i).32 .

Jeon JH, Lee S. Large language models in education: a focus on the complementary relationship between human teachers and ChatGPT. Educ Inf Technol. 2023. https://doi.org/10.1007/s10639-023-11834-1 .

Khan RA, Jawaid M, Khan AR, Sajjad M. ChatGPT—Reshaping medical education and clinical management. Pak J Med Sci. 2023. https://doi.org/10.12669/pjms.39.2.7653 .

King MR. A conversation on artificial intelligence, Chatbots, and plagiarism in higher education. Cell Mol Bioeng. 2023;16(1):1–2. https://doi.org/10.1007/s12195-022-00754-8 .

Kooli C. Chatbots in education and research: a critical examination of ethical implications and solutions. Sustainability. 2023;15(7):5614. https://doi.org/10.3390/su15075614 .

Kuhail MA, Alturki N, Alramlawi S, Alhejori K. Interacting with educational chatbots: a systematic review. Educ Inf Technol. 2022;28(1):973–1018. https://doi.org/10.1007/s10639-022-11177-3 .

Lee H. The rise of ChatGPT: exploring its potential in medical education. Anat Sci Educ. 2023. https://doi.org/10.1002/ase.2270 .

Li L, Subbareddy R, Raghavendra CG. AI intelligence Chatbot to improve students learning in the higher education platform. J Interconnect Netw. 2022. https://doi.org/10.1142/s0219265921430325 .

Limna P. A Review of Artificial Intelligence (AI) in Education during the Digital Era. 2022. https://ssrn.com/abstract=4160798

Lo CK. What is the impact of ChatGPT on education? A rapid review of the literature. Educ Sci. 2023;13(4):410. https://doi.org/10.3390/educsci13040410 .

Luo W, He H, Liu J, Berson IR, Berson MJ, Zhou Y, Li H. Aladdin’s genie or pandora’s box For early childhood education? Experts chat on the roles, challenges, and developments of ChatGPT. Early Educ Dev. 2023. https://doi.org/10.1080/10409289.2023.2214181 .

Meyer JG, Urbanowicz RJ, Martin P, O’Connor K, Li R, Peng P, Moore JH. ChatGPT and large language models in academia: opportunities and challenges. Biodata Min. 2023. https://doi.org/10.1186/s13040-023-00339-9 .

Mhlanga D. Open AI in education, the responsible and ethical use of ChatGPT towards lifelong learning. Soc Sci Res Netw. 2023. https://doi.org/10.2139/ssrn.4354422 .

Neumann M, Rauschenberger M, Schön EM. "We need to talk about ChatGPT": the future of AI and higher education. 2023. https://doi.org/10.1109/seeng59157.2023.00010

Nolan B. Here are the schools and colleges that have banned the use of ChatGPT over plagiarism and misinformation fears. Business Insider . 2023. https://www.businessinsider.com

O’Leary DE. An analysis of three chatbots: BlenderBot, ChatGPT and LaMDA. Int J Intell Syst Account, Financ Manag. 2023;30(1):41–54. https://doi.org/10.1002/isaf.1531 .

Okoli C. A guide to conducting a standalone systematic literature review. Commun Assoc Inf Syst. 2015. https://doi.org/10.17705/1cais.03743 .

OpenAI. (2023). https://openai.com/blog/chatgpt

Perkins M. Academic integrity considerations of AI large language models in the post-pandemic era: ChatGPT and beyond. J Univ Teach Learn Pract. 2023. https://doi.org/10.53761/1.20.02.07 .

Plevris V, Papazafeiropoulos G, Rios AJ. Chatbots put to the test in math and logic problems: A preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard. arXiv (Cornell University) . 2023. https://doi.org/10.48550/arxiv.2305.18618

Rahman MM, Watanobe Y. ChatGPT for education and research: opportunities, threats, and strategies. Appl Sci. 2023;13(9):5783. https://doi.org/10.3390/app13095783

Ram B, Verma P. Artificial intelligence AI-based Chatbot study of ChatGPT, google AI bard and baidu AI. World J Adv Eng Technol Sci. 2023;8(1):258–61. https://doi.org/10.30574/wjaets.2023.8.1.0045 .

Rasul T, Nair S, Kalendra D, Robin M, de Oliveira Santini F, Ladeira WJ, Heathcote L. The role of ChatGPT in higher education: benefits, challenges, and future research directions. J Appl Learn Teach. 2023. https://doi.org/10.37074/jalt.2023.6.1.29 .

Ratnam M, Sharm B, Tomer A. ChatGPT: educational artificial intelligence. Int J Adv Trends Comput Sci Eng. 2023;12(2):84–91. https://doi.org/10.30534/ijatcse/2023/091222023 .

Ray PP. ChatGPT: a comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet Things Cyber-Phys Syst. 2023;3:121–54. https://doi.org/10.1016/j.iotcps.2023.04.003 .

Roumeliotis KI, Tselikas ND. ChatGPT and Open-AI models: a preliminary review. Future Internet. 2023;15(6):192. https://doi.org/10.3390/fi15060192 .

Rudolph J, Tan S, Tan S. War of the chatbots: Bard, Bing Chat, ChatGPT, Ernie and beyond. The new AI gold rush and its impact on higher education. J Appl Learn Teach. 2023. https://doi.org/10.37074/jalt.2023.6.1.23 .

Ruiz LMS, Moll-López S, Nuñez-Pérez A, Moraño J, Vega-Fleitas E. ChatGPT challenges blended learning methodologies in engineering education: a case study in mathematics. Appl Sci. 2023;13(10):6039. https://doi.org/10.3390/app13106039 .

Sallam M, Salim NA, Barakat M, Al-Tammemi AB. ChatGPT applications in medical, dental, pharmacy, and public health education: a descriptive study highlighting the advantages and limitations. Narra J. 2023;3(1): e103. https://doi.org/10.52225/narra.v3i1.103 .

Salvagno M, Taccone FS, Gerli AG. Can artificial intelligence help for scientific writing? Crit Care. 2023. https://doi.org/10.1186/s13054-023-04380-2 .

Saqr M, López-Pernas S, Helske S, Hrastinski S. The longitudinal association between engagement and achievement varies by time, students’ profiles, and achievement state: a full program study. Comput Educ. 2023;199:104787. https://doi.org/10.1016/j.compedu.2023.104787 .

Saqr M, Matcha W, Uzir N, Jovanović J, Gašević D, López-Pernas S. Transferring effective learning strategies across learning contexts matters: a study in problem-based learning. Australas J Educ Technol. 2023;39(3):9.

Schöbel S, Schmitt A, Benner D, Saqr M, Janson A, Leimeister JM. Charting the evolution and future of conversational agents: a research agenda along five waves and new frontiers. Inf Syst Front. 2023. https://doi.org/10.1007/s10796-023-10375-9 .

Shoufan A. Exploring students’ perceptions of CHATGPT: thematic analysis and follow-up survey. IEEE Access. 2023. https://doi.org/10.1109/access.2023.3268224 .

Sonderegger S, Seufert S. Chatbot-mediated learning: conceptual framework for the design of chatbot use cases in education. St. Gallen: Institute for Educational Management and Technologies, University of St. Gallen; 2022. https://doi.org/10.5220/0010999200003182


Strzelecki A. To use or not to use ChatGPT in higher education? A study of students’ acceptance and use of technology. Interact Learn Environ. 2023. https://doi.org/10.1080/10494820.2023.2209881 .

Su J, Yang W. Unlocking the power of ChatGPT: a framework for applying generative AI in education. ECNU Rev Educ. 2023. https://doi.org/10.1177/20965311231168423 .

Sullivan M, Kelly A, McLaughlan P. ChatGPT in higher education: Considerations for academic integrity and student learning. J ApplLearn Teach. 2023;6(1):1–10. https://doi.org/10.37074/jalt.2023.6.1.17 .

Szabo A. ChatGPT is a breakthrough in science and education but fails a test in sports and exercise psychology. Balt J Sport Health Sci. 2023;1(128):25–40. https://doi.org/10.33607/bjshs.v127i4.1233 .

Taecharungroj V. “What can ChatGPT do?” analyzing early reactions to the innovative AI chatbot on Twitter. Big Data Cognit Comput. 2023;7(1):35. https://doi.org/10.3390/bdcc7010035 .

Tam S, Said RB. User preferences for ChatGPT-powered conversational interfaces versus traditional methods. Biomed Eng Soc. 2023. https://doi.org/10.58496/mjcsc/2023/004 .

Tedre M, Kahila J, Vartiainen H. Exploration on how co-designing with AI facilitates critical evaluation of ethics of AI in craft education. In: Langran E, Christensen P, Sanson J, editors. Proceedings of Society for Information Technology and Teacher Education International Conference; 2023. p. 2289–96.

Tlili A, Shehata B, Adarkwah MA, Bozkurt A, Hickey DT, Huang R, Agyemang B. What if the devil is my guardian angel: ChatGPT as a case study of using chatbots in education. Smart Learn Environ. 2023. https://doi.org/10.1186/s40561-023-00237-x .

Uddin SMJ, Albert A, Ovid A, Alsharef A. Leveraging CHATGPT to aid construction hazard recognition and support safety education and training. Sustainability. 2023;15(9):7121. https://doi.org/10.3390/su15097121 .

Valtonen T, López-Pernas S, Saqr M, Vartiainen H, Sointu E, Tedre M. The nature and building blocks of educational technology research. Comput Hum Behav. 2022;128:107123. https://doi.org/10.1016/j.chb.2021.107123 .

Vartiainen H, Tedre M. Using artificial intelligence in craft education: crafting with text-to-image generative models. Digit Creat. 2023;34(1):1–21. https://doi.org/10.1080/14626268.2023.2174557 .

Ventayen RJM. OpenAI ChatGPT generated results: similarity index of artificial intelligence-based contents. Soc Sci Res Netw. 2023. https://doi.org/10.2139/ssrn.4332664 .

Wagner MW, Ertl-Wagner BB. Accuracy of information and references using ChatGPT-3 for retrieval of clinical radiological information. Can Assoc Radiol J. 2023. https://doi.org/10.1177/08465371231171125 .

Wardat Y, Tashtoush MA, AlAli R, Jarrah AM. ChatGPT: a revolutionary tool for teaching and learning mathematics. Eurasia J Math, Sci Technol Educ. 2023;19(7):em2286. https://doi.org/10.29333/ejmste/13272 .

Webster J, Watson RT. Analyzing the past to prepare for the future: writing a literature review. Manag Inf Syst Quart. 2002;26(2):3.

Xiao Y, Watson ME. Guidance on conducting a systematic literature review. J Plan Educ Res. 2017;39(1):93–112. https://doi.org/10.1177/0739456x17723971 .

Yan D. Impact of ChatGPT on learners in a L2 writing practicum: an exploratory investigation. Educ Inf Technol. 2023. https://doi.org/10.1007/s10639-023-11742-4 .

Yu H. Reflection on whether Chat GPT should be banned by academia from the perspective of education and teaching. Front Psychol. 2023;14:1181712. https://doi.org/10.3389/fpsyg.2023.1181712 .

Zhu C, Sun M, Luo J, Li T, Wang M. How to harness the potential of ChatGPT in education? Knowl Manag ELearn. 2023;15(2):133–52. https://doi.org/10.34105/j.kmel.2023.15.008 .

Funding

The paper is co-funded by the Academy of Finland (Suomen Akatemia) Research Council for Natural Sciences and Engineering for the project Towards precision education: Idiographic learning analytics (TOPEILA), Decision Number 350560.

Author information

Authors and Affiliations

School of Computing, University of Eastern Finland, 80100, Joensuu, Finland

Yazid Albadarin, Mohammed Saqr, Nicolas Pope & Markku Tukiainen


Contributions

YA contributed to the literature search, data analysis, discussion, and conclusion. Additionally, YA contributed to the manuscript’s writing, editing, and finalization. MS contributed to the study’s design, conceptualization, acquisition of funding, project administration, allocation of resources, supervision, validation, literature search, and analysis of results. Furthermore, MS contributed to the manuscript's writing, revising, and approving it in its finalized state. NP contributed to the results, and discussions, and provided supervision. NP also contributed to the writing process, revisions, and the final approval of the manuscript in its finalized state. MT contributed to the study's conceptualization, resource management, supervision, writing, revising the manuscript, and approving it.

Corresponding author

Correspondence to Yazid Albadarin .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

See Table 4.

The data presented in Table 4 were synthesized by identifying relevant studies through a search of databases (ERIC, Scopus, Web of Knowledge, Dimensions.ai, and lens.org) using the specific keywords "ChatGPT" and "education". Following this, inclusion/exclusion criteria were applied, and data extraction was performed using Creswell's [ 15 ] coding techniques to capture key information and identify common themes across the included studies.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Albadarin, Y., Saqr, M., Pope, N. et al. A systematic literature review of empirical research on ChatGPT in education. Discov Educ 3, 60 (2024). https://doi.org/10.1007/s44217-024-00138-2


Received: 22 October 2023

Accepted: 10 May 2024

Published: 26 May 2024

DOI: https://doi.org/10.1007/s44217-024-00138-2


  • Large language models
  • Educational technology
  • Systematic review

  • Systematic Review
  • Open access
  • Published: 23 May 2024

Systematic literature review of real-world evidence for treatments in HR+/HER2- second-line LABC/mBC after first-line treatment with CDK4/6i

  • Veronique Lambert (ORCID: orcid.org/0000-0002-6984-0038),
  • Sarah Kane (ORCID: orcid.org/0009-0006-9341-4836),
  • Belal Howidi (ORCID: orcid.org/0000-0002-1166-7631),
  • Bao-Ngoc Nguyen (ORCID: orcid.org/0000-0001-6026-2270),
  • David Chandiwana (ORCID: orcid.org/0009-0002-3499-2565),
  • Yan Wu (ORCID: orcid.org/0009-0008-3348-9232),
  • Michelle Edwards (ORCID: orcid.org/0009-0001-4292-3140) &
  • Imtiaz A. Samjoo (ORCID: orcid.org/0000-0003-1415-8055)

BMC Cancer volume 24, Article number: 631 (2024)

Background

Cyclin-dependent kinase 4 and 6 inhibitors (CDK4/6i) combined with endocrine therapy (ET) are currently recommended by the National Comprehensive Cancer Network (NCCN) guidelines and the European Society for Medical Oncology (ESMO) guidelines as the first-line (1 L) treatment for patients with hormone receptor-positive, human epidermal growth factor receptor 2-negative, locally advanced/metastatic breast cancer (HR+/HER2- LABC/mBC). Although there are many treatment options, there is no clear standard of care for patients following 1 L CDK4/6i. Understanding the real-world effectiveness of subsequent therapies may help to identify an unmet need in this patient population. This systematic literature review qualitatively synthesized effectiveness and safety outcomes for treatments received in the real-world setting after 1 L CDK4/6i therapy in patients with HR+/HER2- LABC/mBC.

Methods

MEDLINE®, Embase, and Cochrane were searched using the Ovid® platform for real-world evidence studies published between 2015 and 2022. Grey literature was searched to identify relevant conference abstracts published from 2019 to 2022. The review was conducted in accordance with PRISMA guidelines (PROSPERO registration: CRD42023383914). Data were qualitatively synthesized and weighted average median real-world progression-free survival (rwPFS) was calculated for NCCN/ESMO-recommended post-1 L CDK4/6i treatment regimens.

Results

Twenty records (9 full-text articles and 11 conference abstracts) encompassing 18 unique studies met the eligibility criteria and reported outcomes for second-line (2 L) treatments after 1 L CDK4/6i; no studies reported disaggregated outcomes in the third-line setting or beyond. Sixteen studies included NCCN/ESMO guideline-recommended treatments, with the majority evaluating endocrine-based therapy: five studies on single-agent ET, six studies on mammalian target of rapamycin inhibitors (mTORi) ± ET, and three studies with a mix of ET and/or mTORi. Chemotherapy outcomes were reported in 11 studies. The most assessed outcome was median rwPFS; the weighted average median rwPFS was calculated as 3.9 months (3.3–6.0 months) for single-agent ET, 3.6 months (2.5–4.9 months) for mTORi ± ET, 3.7 months (3.0–4.0 months) for a mix of ET and/or mTORi, and 6.1 months (3.7–9.7 months) for chemotherapy. Very few studies reported other effectiveness outcomes, and only two studies reported safety outcomes. Most studies had heterogeneity in patient- and disease-related characteristics.

Conclusions

The real-world effectiveness of current 2 L treatments post-1 L CDK4/6i are suboptimal, highlighting an unmet need for this patient population.


Introduction

Breast cancer (BC) is the most diagnosed form of cancer in women with an estimated 2.3 million new cases diagnosed worldwide each year [ 1 ]. BC is the second leading cause of cancer death, accounting for 685,000 deaths worldwide per year [ 2 ]. By 2040, the global burden associated with BC is expected to surpass three million new cases and one million deaths annually (due to population growth and aging) [ 3 ]. Numerous factors contribute to global disparities in BC-related mortality rates, including delayed diagnosis, resulting in a high number of BC cases that have progressed to locally advanced BC (LABC) or metastatic BC (mBC) [ 4 , 5 , 6 ]. In the United States (US), the five-year survival rate for patients who progress to mBC is three times lower (31%) than the overall five-year survival rate for all stages (91%) [ 6 , 7 ].

Hormone receptor (HR) positive (i.e., estrogen receptor and/or progesterone receptor positive) coupled with negative human epidermal growth factor 2 (HER2) expression is the most common subtype of BC, accounting for ∼ 60–70% of all BC cases [ 8 , 9 ]. Historically, endocrine therapy (ET) through estrogen receptor modulation and/or estrogen deprivation has been the standard of care for first-line (1 L) treatment of HR-positive/HER2-negative (HR+/HER2-) mBC [ 10 ]. However, with the approval of the cyclin-dependent kinase 4/6 inhibitor (CDK4/6i) palbociclib in combination with the aromatase inhibitor (AI) letrozole in 2015 by the US Food and Drug Administration (FDA), 1 L treatment practice patterns have evolved such that CDK4/6i (either in combination with AIs or with fulvestrant) are currently considered the standard of care [ 11 , 12 , 13 , 14 , 15 , 16 , 17 ]. Other CDK4/6i (ribociclib and abemaciclib) in combination with ET are approved for the treatment of HR+/HER2- LABC/mBC; 1 L use of ribociclib in combination with an AI was granted FDA approval in March 2017 for postmenopausal women (with expanded approval in July 2018 for pre/perimenopausal women and for use in 1 L with fulvestrant for patients with disease progression on ET as well as for postmenopausal women), and abemaciclib in combination with fulvestrant was granted FDA approval in September 2017 for patients with disease progression following ET and as monotherapy in cases where disease progression occurs following ET and prior chemotherapy in mBC (with expanded approval in February 2018 for use in 1 L in combination with an AI for postmenopausal women) [ 18 , 19 , 20 , 21 ].

Clinical trials investigating the addition of CDK4/6i to ET have demonstrated significant improvement in progression-free survival (PFS) and significant (ribociclib) or numerical (palbociclib and abemaciclib) improvement in overall survival (OS) compared to ET alone in patients with HR+/HER2- advanced or mBC, making this combination treatment the recommended option in the 1 L setting [ 22 , 23 , 24 , 25 , 26 , 27 ]. However, disease progression occurs in a significant portion of patients after 1 L CDK4/6i treatment [ 28 ] and the optimal treatment sequence after progression on CDK4/6i remains unclear [ 29 ]. At the time of this review (literature search conducted December 14, 2022), guidelines by the National Comprehensive Cancer Network (NCCN) and the European Society for Medical Oncology (ESMO) recommend various options for the treatment of HR+/HER2- advanced BC in the second-line (2 L) setting, including fulvestrant monotherapy, mammalian target of rapamycin inhibitors (mTORi; e.g., everolimus) ± ET, alpelisib + fulvestrant (if phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha mutation positive [PIK3CA-m+]), poly-ADP ribose polymerase inhibitors (PARPi) including olaparib or talazoparib (if breast cancer gene/partner and localizer of BRCA2 positive [BRCA/PALB2m+]), and chemotherapy (in cases when a visceral crisis is present) [ 15 , 16 ]. CDK4/6i can also be used in 2 L [ 16 , 30 ]; however, limited data are available to support CDK4/6i rechallenge after its use in the 1 L setting [ 15 ]. Depending on treatments used in the 1 L and 2 L settings, treatment in the third-line setting is individualized based on the patient’s response to prior treatments, tumor load, duration of response, and patient preference [ 9 , 15 ]. Understanding subsequent treatments after 1 L CDK4/6i, and their associated effectiveness, is an important focus in BC research.

Treatment options for HR+/HER2- LABC/mBC continue to evolve, with ongoing research in both clinical trials and in the real-world setting. Real-world evidence (RWE) offers important insights into novel therapeutic regimens and the effectiveness of treatments for HR+/HER2- LABC/mBC. The effectiveness of the current treatment options following 1 L CDK4/6i therapy in the real-world setting highlights the unmet need in this patient population and may help to drive further research and drug development. In this study, we conducted a systematic literature review (SLR) to qualitatively summarize the effectiveness and safety of treatment regimens in the real-world setting after 1 L treatment with CDK4/6i in patients with HR+/HER2- LABC/mBC.

Methods

Literature search

An SLR was performed in accordance with the Cochrane Handbook for Systematic Reviews of Interventions [ 31 ] and reported in alignment with the Preferred Reporting Items for Systematic Literature Reviews and Meta-Analyses (PRISMA) statement [ 32 ] to identify all RWE studies assessing the effectiveness and safety of treatments used in patients with HR+/HER2- LABC/mBC who received 1 L CDK4/6i therapy followed by subsequent treatment in 2 L and beyond (2 L+). The Ovid® platform was used by an experienced medical information specialist to search MEDLINE® (including Epub Ahead of Print and In-Process, In-Data-Review & Other Non-Indexed Citations), Ovid MEDLINE® Daily, Embase, the Cochrane Central Register of Controlled Trials, and the Cochrane Database of Systematic Reviews. The MEDLINE® search strategy was peer-reviewed independently by a senior medical information specialist before execution using the Peer Review of Electronic Search Strategies (PRESS) checklist [ 33 ]. Searches were conducted on December 14, 2022. The review protocol was developed a priori and registered with the International Prospective Register of Systematic Reviews (PROSPERO; CRD42023383914), which outlined the population, intervention, comparator, outcome, and study design (PICOS) criteria and methodology used to conduct the review (Table 1).

Search strategies utilized a combination of controlled vocabulary (e.g., "HER2 Breast Cancer" or "HR Breast Cancer") and keywords (e.g., "Retrospective studies"). Vocabulary and syntax were adjusted across databases. Published and validated filters were used to select for study design and were supplemented with additional Medical Subject Headings (MeSH) terms and keywords to select for RWE and nonrandomized studies [ 34 ]. No language restrictions were included in the search strategy. Animal-only studies and opinion pieces were removed from the results. The search was limited to studies published between January 2015 and December 2022 to reflect the time at which FDA approval was granted for the first CDK4/6i agent (palbociclib) in combination with an AI for the treatment of LABC/mBC [ 35 ]. Further search details are presented in Supplementary Material 1.

Grey literature sources were also searched to identify relevant abstracts and posters published from January 2019 to December 2022 for prespecified relevant conferences including ESMO, San Antonio Breast Cancer Symposium (SABCS), American Society of Clinical Oncology (ASCO), the International Society for Pharmacoeconomics and Outcomes Research (ISPOR US), and the American Association for Cancer Research (AACR). A search of ClinicalTrials.gov was conducted to validate the findings from the database and grey literature searches.

Study selection, data extraction & weighted average calculation

Studies were screened for inclusion using DistillerSR Version 2.35 and 2.41 (DistillerSR Inc. 2021, Ottawa, Canada) by two independent reviewers based on the prespecified PICOS criteria (Table 1). A third reviewer was consulted to resolve any discrepancies during the screening process. Studies were included if they reported RWE on patients aged ≥ 18 years with HR+/HER2- LABC/mBC who received 1 L CDK4/6i treatment and subsequent treatment in 2 L+. Studies were excluded if they reported the results of clinical trials (i.e., non-RWE), were published in any language other than English, and/or were published prior to 2015 (or prior to 2019 for conference abstracts and posters). For studies that met the eligibility criteria, data relating to study design and methodology, details of interventions, patient eligibility criteria and baseline characteristics, and outcome measures such as efficacy, safety, tolerability, and patient-reported outcomes (PROs) were extracted (as available) using a Microsoft Excel®-based data extraction form (Microsoft Corporation, WA, USA). Data extraction was performed by a single reviewer and confirmed by a second reviewer. Multiple publications identified for the same RWE study, patient population, and setting that reported data for the same intervention were linked and extracted as a single publication. Weighted average median real-world progression-free survival (rwPFS) values were calculated by weighting each study's median rwPFS by its sample size; these weighted values were then used to compute the overall median rwPFS estimate.
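To make the weighting concrete, here is a minimal sketch of the calculation described above; the function name and structure are illustrative rather than the authors' actual tooling, and the usage example reuses the single-agent ET figures reported later in the Results.

```python
def weighted_average_median(medians, sample_sizes):
    """Sample-size-weighted average of per-study median rwPFS values (in months)."""
    total_n = sum(sample_sizes)
    return sum(m * n for m, n in zip(medians, sample_sizes)) / total_n

# Two single-agent ET studies: 3.3 months (n = 70) and 6.0 months (n = 22).
print(round(weighted_average_median([3.3, 6.0], [70, 22]), 1))  # 3.9
```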

Quality assessment

The Newcastle-Ottawa scale (NOS) for nonrandomized (cohort) studies was used to assess the risk of bias for published, full-text studies [ 36 ]. The NOS allocates a maximum of nine points for the least risk of bias across three domains: (1) formation of study groups (four points), (2) comparability between study groups (two points), and (3) outcome ascertainment (three points). NOS scores can be categorized into three groups: very high risk of bias (0 to 3 points), high risk of bias (4 to 6 points), and low risk of bias (7 to 9 points) [ 37 ]. Risk of bias assessment was performed by one reviewer and validated by a second independent reviewer to verify accuracy. Due to limited methodological data by which to assess study quality, risk of bias assessment was not performed on conference abstracts or posters. An amendment to the PROSPERO record (CRD42023383914) for this study was submitted in relation to the quality assessment method (specifying usage of the NOS).
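Since the banding above is a simple score-to-category mapping, it can be expressed directly; the following is an illustrative sketch (not the authors' tooling):

```python
def nos_risk_category(score: int) -> str:
    """Map a Newcastle-Ottawa scale score (0-9) to its risk-of-bias band."""
    if not 0 <= score <= 9:
        raise ValueError("NOS scores range from 0 to 9")
    if score <= 3:
        return "very high risk of bias"
    if score <= 6:
        return "high risk of bias"
    return "low risk of bias"

# The nine full-text articles in this review scored 4 to 6 (median 5),
# so all of them fall in the "high risk of bias" band.
print(nos_risk_category(5))  # high risk of bias
```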

Results

The database search identified 3,377 records; after removal of duplicates, 2,759 were screened at the title and abstract stage, of which 2,553 were excluded. Of the 206 reports retrieved and assessed for eligibility, a further 187 were excluded after full-text review; most of these studies were excluded for having patients with mixed lines of CDK4/6i treatment (i.e., they did not receive CDK4/6i exclusively in 1 L) (Fig. 1 and Table S1). The grey literature search identified 753 records which were assessed for eligibility, of which 752 were excluded, mainly because the population did not meet the eligibility criteria (Fig. 1). In total, the literature searches identified 20 records (9 published full-text articles and 11 conference abstracts/posters) representing 18 unique RWE studies that met the inclusion criteria. The NOS quality scores for the included full-text articles are provided in Table S2. The scores ranged from four to six points (out of a total of nine) and the median score was five, indicating that all the studies suffered from a high risk of bias [ 37 ].
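(As a quick arithmetic check of the study flow, the record counts reconcile: the database search contributed 206 − 187 = 19 records and the grey literature search contributed 753 − 752 = 1 record, consistent with the 20 records reported in total.)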

Most studies were retrospective analyses of chart reviews or medical registries, and all studies were published between 2017 and 2022 (Table S3 ). Nearly half of the RWE studies (8 out of 18 studies) were conducted in the US [ 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 ], while the remaining studies included sites in Canada, China, Germany, Italy, Japan, and the United Kingdom [ 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 ]. Sample sizes ranged from as few as 4 to as many as 839 patients across included studies, with patient age ranging from 26 to 86 years old.

Although treatment characteristics in the 1 L setting were not the focus of the present review, these details are captured in Table S3 . Briefly, several RWE studies reported 1 L CDK4/6i use in combination with ET (8 out of 18 studies) or as monotherapy (2 out of 18 studies) (Table S3 ). Treatments used in combination with 1 L CDK4/6i included letrozole, fulvestrant, exemestane, and anastrozole. Where reported (4 out of 18 studies), palbociclib was the most common 1 L CDK4/6i treatment. Many studies (8 out of 18 studies) did not report which specific CDK4/6i treatment(s) were used in 1 L or if its administration was in combination or monotherapy.

Characteristics of treatments after 1 L CDK4/6i therapy

Across all studies included in this review, effectiveness and safety data were only available for treatments administered in the 2 L setting after 1 L CDK4/6i treatment. No studies were identified that reported outcomes for patients treated in the third-line setting or beyond after 1 L CDK4/6i treatment. All 18 studies reported effectiveness outcomes in 2 L, with only two of these studies also describing 2 L safety outcomes. The distribution of outcomes reported in these studies is provided in Table S4. Studies varied in their reporting of outcomes for 2 L treatments; some reported outcomes for a group of 2 L treatments while others described independent outcomes for specific 2 L treatments (i.e., everolimus, fulvestrant, or chemotherapy agents such as eribulin mesylate) [ 42 , 45 , 50 , 54 , 55 ]. Due to the heterogeneity in treatment classes reported in these studies, the data were categorized (as described below) to align with the guidelines provided by NCCN and ESMO [ 15 , 16 ]. The treatment class categorizations for the purpose of this review are: single-agent ET (patients who exclusively received a single-agent ET after 1 L CDK4/6i treatment), mTORi ± ET (patients who exclusively received an mTORi with or without ET after 1 L CDK4/6i treatment), mix of ET and/or mTORi (patients who may have received only ET, only mTORi, and/or both treatments, but for whom the studies lacked sufficient information to assign the “single-agent ET” or “mTORi ± ET” categories), and chemotherapy (patients who exclusively received chemotherapy after 1 L CDK4/6i treatment). Despite ESMO and NCCN guidelines indicating that limited evidence exists to support rechallenge with CDK4/6i after 1 L CDK4/6i treatment [ 15 , 16 ], two studies reported outcomes for this treatment approach. Data for such patients were categorized as “CDK4/6i ± ET” as it was unclear how many patients receiving CDK4/6i rechallenge received concurrent ET. All other patient groups that lacked sufficient information or did not report outcome/safety data independently (i.e., grouped patients with mixed treatments) were classified as “other”.

The majority of studies reported effectiveness outcomes for endocrine-based therapy after 1 L CDK4/6i treatment; five studies for single-agent ET, six studies for mTORi ± ET, and three studies for a mix of ET and/or mTORi (Fig.  2 ). Eleven studies reported effectiveness outcomes for chemotherapy after 1 L CDK4/6i treatment, and only two studies reported effectiveness outcomes for CDK4/6i rechallenge ± ET. Eight studies that described effectiveness outcomes were grouped into the “other” category. Safety data was only reported in two studies: one study evaluating the chemotherapy agent eribulin mesylate and one evaluating the mTORi everolimus.

Effectiveness outcomes

Real-world progression-free survival

Median rwPFS was described in 13 studies (Table 2 and Table S5). Across the 13 studies, the median rwPFS ranged from 2.5 months [ 49 ] to 17.3 months [ 39 ]. Of these, 10 studies reported median rwPFS for a 2 L treatment recommended by ESMO and NCCN guidelines, ranging from 2.5 months [ 49 ] to 9.7 months [ 45 ].

Weighted average median rwPFS was calculated for 2 L treatments recommended by both ESMO and NCCN guidelines (Fig. 3). The weighted average median rwPFS for single-agent ET was 3.9 months (n = 92 total patients) and was derived using data from two studies reporting median rwPFS values of 3.3 months (n = 70) [ 38 ] and 6.0 months (n = 22) [ 40 ]. For one study (n = 7) that reported outcomes for single-agent ET, median rwPFS was not reached during the follow-up period; as such, this study was excluded from the weighted average median rwPFS calculation [ 49 ].
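Written out, the single-agent ET figure follows directly from the sample-size weighting described in the Methods:

$$\frac{(3.3 \times 70) + (6.0 \times 22)}{70 + 22} = \frac{363}{92} \approx 3.9 \ \text{months}$$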

The weighted average median rwPFS for mTORi ± ET was 3.6 months (n = 128 total patients) and was derived from data from three studies with median rwPFS ranging from 2.5 months (n = 4) [ 49 ] to 4.9 months (n = 25) [ 54 ] (Fig. 3). For patients who received a mix of ET and/or mTORi but could not be classified into the single-agent ET or mTORi ± ET treatment classes, the weighted average median rwPFS was calculated to be 3.7 months (n = 17 total patients), based on data from two studies reporting median rwPFS values of 3.0 months (n = 5) [ 46 ] and 4.0 months (n = 12) [ 49 ]. Notably, one study of patients receiving ET and/or everolimus reported a median rwPFS duration of 3.0 months; however, this study was excluded from the weighted average calculation for the ET and/or mTORi class as its sample size was not reported [ 53 ].

The weighted average median rwPFS for chemotherapy was 6.1 months (n = 499 total patients), calculated using data from seven studies reporting median rwPFS values ranging from 3.7 months (n = 249) [ 38 ] to 9.7 months (n = 121) [ 45 ] (Fig. 3). One study with a median rwPFS duration of 5.6 months was not included in the weighted average calculation as it did not report the sample size [ 53 ]. A second study was excluded since its reported median rwPFS was not reached during the study period (n = 7) [ 41 ].

Although 2 L CDK4/6i ± ET rechallenge lacks sufficient supporting evidence to be recommended by ESMO and NCCN guidelines, the limited data currently available for this approach have shown promising results. Briefly, two studies reported median rwPFS for CDK4/6i ± ET, with values of 8.3 months (n = 302) [ 38 ] and 17.3 months (n = 165) [ 39 ] (Table 2). The remaining median rwPFS studies reported data for patients classified as “Other” (Table S5). The “Other” category comprised median rwPFS outcomes from seven studies covering a myriad of treatments (e.g., ET, mTORi + ET, chemotherapy, CDK4/6i + ET, alpelisib + fulvestrant, chidamide + ET) for which disaggregated median rwPFS values were not reported.

Overall survival

Median OS for 2 L treatment was reported in only three studies (Table 2) [ 38 , 42 , 43 ]. Across the three studies, the 2 L median OS ranged from 5.2 months (n = 3) [ 43 ] to 35.7 months (n = 302) [ 38 ]. Due to the lack of OS data in most of the studies, weighted averages could not be calculated. No median OS data were reported for the single-agent ET treatment class, whereas two studies reported median OS for the mTORi ± ET treatment class, ranging from 5.2 months (n = 3) [ 43 ] to 21.8 months (n = 54) [ 42 ]. One study reported a 2 L median OS of 24.8 months for a single patient treated with chemotherapy [ 43 ]. The median OS in the CDK4/6i ± ET rechallenge group was 35.7 months (n = 302) [ 38 ].

Patient mortality was reported in three studies [ 43 , 44 , 45 ]. No studies reported mortality for the single-agent ET treatment class, and only one study reported this outcome for the mTORi ± ET treatment class, where 100% of patients (n = 3) died as a result of rapid disease progression [ 43 ]. For the chemotherapy class, one study reported mortality for one patient receiving 2 L capecitabine [ 43 ]. An additional study reported eight deaths (21.7%) following 1 L CDK4/6i treatment; however, this study did not disclose the 2 L treatments administered to these patients [ 44 ].

Other clinical endpoints

The studies included limited information on additional clinical endpoints; two studies reported on time-to-discontinuation (TTD), two reported on duration of response (DOR), and one each reported on time-to-next-treatment (TTNT), time-to-progression (TTP), objective response rate (ORR), clinical benefit rate (CBR), and stable disease (Table 2 and Table S5).

Safety, tolerability, and patient-reported outcomes

Safety and tolerability data were reported in two studies [ 40 , 45 ]. One study investigating 2 L administration of the chemotherapy agent eribulin mesylate reported 27 patients (22.3%) with neutropenia, 3 patients (2.5%) with febrile neutropenia, 10 patients (8.3%) with peripheral neuropathy, and 14 patients (11.6%) with diarrhea [ 45 ]. Of these, neutropenia of grade 3–4 severity occurred in 9 patients (33.3%) [ 45 ]. A total of 55 patients (45.5%) discontinued eribulin mesylate treatment; 1 patient (0.83%) discontinued treatment due to adverse events [ 45 ]. Another study reported that 5 out of the 22 patients receiving the mTORi everolimus combined with ET in 2 L (22.7%) discontinued treatment due to toxicity [ 40 ]. PROs were not reported in any of the studies included in the SLR.

Discussion

The objective of this study was to summarize the existing RWE on the effectiveness and safety of therapies for patients with HR+/HER2- LABC/mBC after 1 L CDK4/6i treatment. We identified 18 unique studies reporting specifically on 2 L treatment regimens after 1 L CDK4/6i treatment. The weighted average median rwPFS for NCCN- and ESMO-guideline-recommended 2 L treatments ranged from 3.6 to 3.9 months for ET-based treatments and was 6.1 months for chemotherapy-based regimens. Treatment selection following 1 L CDK4/6i therapy remains challenging, primarily due to the suboptimal effectiveness or significant toxicities (e.g., chemotherapy) associated with currently available options [ 56 ]. These results highlight that currently available 2 L treatments for patients with HR+/HER2- LABC/mBC who have received 1 L CDK4/6i are suboptimal, as evidenced by the brief median rwPFS duration associated with ET-based treatments and the notable side effects and toxicity linked to chemotherapy. This conclusion aligns with a recent review highlighting the limited effectiveness of treatment options for patients with HR+/HER2- LABC/mBC post-CDK4/6i treatment [ 56 , 57 ]. Registrational trials, which have also shed light on the short median PFS of 2–3 months achieved by ET (i.e., fulvestrant) after 1 L CDK4/6i therapy, emphasize the need to develop improved treatment strategies aimed at prolonging the duration of effective ET-based treatment [ 56 ].

The results of this review reveal a paucity of additional real-world effectiveness and safety evidence after 1 L CDK4/6i treatment in HR+/HER2- LABC/mBC. OS and DOR were only reported in two studies while other clinical endpoints (i.e., TTD, TTNT, TTP, ORR, CBR, and stable disease) were only reported in one study each. Similarly, safety and tolerability data were only reported in two studies each, and PROs were not reported in any study. This hindered our ability to provide a comprehensive assessment of real-world treatment effectiveness and safety following 1 L CDK4/6i treatment. The limited evidence may be due to the relatively short period of time that has elapsed since CDK4/6i first received US FDA approval for 1 L treatment of HR+/HER2- LABC/mBC (2015) [ 35 ]. As such, almost half of our evidence was informed by conference abstracts. Similarly, no real-world studies were identified in our review that reported outcomes for treatments in the third- or later-lines of therapy after 1 L CDK4/6i treatment. The lack of data in this patient population highlights a significant gap which limits our understanding of the effectiveness and safety for patients receiving later lines of therapy. As more patients receive CDK4/6i therapy in the 1 L setting, the number of patients requiring subsequent lines of therapy will continue to grow. Addressing this data gap over time will be critical to improve outcomes for patients with HR+/HER2- LABC/mBC following 1 L CDK4/6i therapy.

There are several strengths of this study, including adherence to the guidelines outlined in the Cochrane Handbook to ensure a standardized and reliable approach to the SLR [ 58 ] and reporting of the SLR following PRISMA guidelines to ensure transparency and reproducibility [ 59 ]. Furthermore, the inclusion of only RWE studies allowed us to assess the effectiveness of current standard of care treatments outside of a controlled environment and enabled us to identify an unmet need in this patient population.

This study had some notable limitations, including the scarcity of reported safety and additional effectiveness outcomes. In addition, the dearth of studies reporting PROs is a limitation, as PROs provide valuable insight into the patient experience and are an important aspect of assessing the impact of 2 L treatments on patients' quality of life. The studies included in this review also lacked consistent reporting of clinical characteristics (e.g., menopausal status, sites of metastasis, prior surgery), making it challenging to draw comprehensive conclusions or comparisons based on these factors across the studies. Taken together, there exists an important gap in our understanding of the long-term management of patients with HR+/HER2- LABC/mBC. Additionally, the effectiveness results reported in our evidence base were informed by small sample sizes; many of the included studies reported median rwPFS based on fewer than 30 patients [ 39 , 40 , 41 , 46 , 49 , 51 , 60 ], with two studies not reporting the sample size at all [ 47 , 53 ]. This may impact the generalizability and robustness of the results. Relatedly, the SLR database search was conducted in December 2022; as such, novel agents (e.g., elacestrant and capivasertib + fulvestrant) that have since received FDA approval for the treatment of HR+/HER2- LABC/mBC may affect current 2 L rwPFS outcomes [ 61 , 62 ]. Finally, relative to the number of peer-reviewed full-text articles, this SLR identified eight abstracts and one poster presentation, comprising half (50%) of the included unique studies. As conference abstracts are inherently limited in how much content they can describe due to word limits, this likely had implications for the present synthesis, whereby we identified a dearth of real-world effectiveness outcomes in patients with HR+/HER2- LABC/mBC treated with 1 L CDK4/6i therapy.

Future research in this area should aim to address the limitations of the current literature and provide a more comprehensive understanding of the optimal sequencing of effective and safe treatments for patients following 1L CDK4/6i therapy. Specifically, future studies should strive to report robust data related to effectiveness, safety, and PROs for patients receiving 2L treatment after 1L CDK4/6i therapy. Future studies should also aim to understand the mechanisms underlying CDK4/6i resistance. Addressing these gaps in knowledge may improve the long-term real-world management of patients with HR+/HER2- LABC/mBC. A future update of this synthesis may capture a wider breadth of full-text, peer-reviewed articles and thereby provide a more robust understanding of the safety, effectiveness, and real-world treatment patterns for patients with HR+/HER2- LABC/mBC. This SLR underscores the necessity of ongoing investigation and the development of innovative therapeutic approaches to address these gaps and improve patient outcomes.

This SLR qualitatively summarized the existing real-world effectiveness data for patients with HR+/HER2- LABC/mBC after 1L CDK4/6i treatment. Results of this study highlight the limited available data and the suboptimal effectiveness of treatments employed in the 2L setting and underscore the unmet need in this patient population. Additional studies reporting effectiveness and safety outcomes, as well as PROs, for this patient population are necessary and should be the focus of future research.

Figure 1. PRISMA flow diagram. *Two included conference abstracts reported the same information as already included full-text reports; hence, both conference abstracts were not counted as unique. Abbreviations: 1L = first-line; AACR = American Association for Cancer Research; ASCO = American Society of Clinical Oncology; CDK4/6i = cyclin-dependent kinase 4/6 inhibitor; ESMO = European Society for Medical Oncology; ISPOR = Professional Society for Health Economics and Outcomes Research; n = number of studies; NMA = network meta-analysis; pts = participants; SABCS = San Antonio Breast Cancer Symposium; SLR = systematic literature review.

Figure 2. Number of studies reporting effectiveness outcomes exclusively for each treatment class. *Studies that lack sufficient information on effectiveness outcomes to classify based on the treatment classes outlined in the legend. Abbreviations: CDK4/6i = cyclin-dependent kinase 4/6 inhibitor; ET = endocrine therapy; mTORi = mammalian target of rapamycin inhibitor.

Figure 3. Weighted average median rwPFS for 2L treatments (recommended in ESMO/NCCN guidelines) after 1L CDK4/6i treatment. Circular dots represent the weighted average median across studies; horizontal bars represent the range of values reported in these studies. Abbreviations: CDK4/6i = cyclin-dependent kinase 4/6 inhibitor; ESMO = European Society for Medical Oncology; ET = endocrine therapy; mTORi = mammalian target of rapamycin inhibitor; n = number of patients; NCCN = National Comprehensive Cancer Network; rwPFS = real-world progression-free survival.
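The weighting behind Figure 3 can be reproduced with a minimal sketch, assuming (consistent with the per-class patient counts, n, in the caption) that each study's median rwPFS is weighted by its sample size. The study values below are illustrative placeholders, not data from the included studies.

```python
# Minimal sketch: sample-size-weighted average of study-level median rwPFS,
# as depicted in Figure 3. All values are illustrative placeholders only.

def weighted_average_median(medians, sample_sizes):
    """Average study-level medians, weighting each study by its n."""
    total_n = sum(sample_sizes)
    return sum(m * n for m, n in zip(medians, sample_sizes)) / total_n

# Hypothetical studies reporting median rwPFS (months) for one 2L class
median_rwpfs = [4.2, 5.0, 3.6]
n_patients = [25, 60, 18]

avg = weighted_average_median(median_rwpfs, n_patients)
print(f"Weighted average median rwPFS: {avg:.1f} months")
# The horizontal bars in Figure 3 correspond to the min-max range
print(f"Range: {min(median_rwpfs)}-{max(median_rwpfs)} months")
```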

Data availability

All data generated or analyzed during this study are included in this published article and its supplementary information files. This study is registered with PROSPERO (CRD42023383914).

Abbreviations

2L: Second-line
2L+: Second-line treatment setting and beyond
AACR: American Association for Cancer Research
AI: Aromatase inhibitor
ASCO: American Society of Clinical Oncology
BRCA/PALB2+: Breast cancer gene/partner and localizer of BRCA2 positive
CBR: Clinical benefit rate
CDK4/6i: Cyclin-dependent kinase 4/6 inhibitor
CR: Complete response
DOR: Duration of response
ESMO: European Society for Medical Oncology
FDA: Food and Drug Administration
HER2: Human epidermal growth factor receptor 2
HER2-: Human epidermal growth factor receptor 2 negative
HR: Hormone receptor
HR+: Hormone receptor positive
ISPOR: Professional Society for Health Economics and Outcomes Research
LABC: Locally advanced breast cancer
mBC: Metastatic breast cancer
MEDLINE: Medical Literature Analysis and Retrieval System Online
MeSH: Medical subject headings
mTORi: Mammalian target of rapamycin inhibitor
NCCN: National Comprehensive Cancer Network
NOS: Newcastle-Ottawa Scale
ORR: Objective response rate
PARPi: Poly-ADP ribose polymerase inhibitor
PFS: Progression-free survival
PICOS: Population, Intervention, Comparator, Outcome, Study Design
PR: Partial response
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
PRO: Patient-reported outcome
SABCS: San Antonio Breast Cancer Symposium
TTD: Time-to-discontinuation
TTNT: Time-to-next-treatment
TTP: Time-to-progression
US: United States

Łukasiewicz S, Czeczelewski M, Forma A, Baj J, Sitarz R, Stanisławek A. Breast cancer epidemiology, risk factors, classification, prognostic markers, and current treatment strategies: an updated review. Cancers. 2021;13(17):4287.

World Health Organization (WHO). Breast cancer fact sheet [updated July 12, 2023]. https://www.who.int/news-room/fact-sheets/detail/breast-cancer.

Arnold M, Morgan E, Rumgay H, Mafra A, Singh D, Laversanne M, et al. Current and future burden of breast cancer: global statistics for 2020 and 2040. Breast. 2022;66:15–23.


Wilkinson L, Gathani T. Understanding breast cancer as a global health concern. Br J Radiol. 2022;95(1130):20211033.


Giaquinto AN, Sung H, Miller KD, Kramer JL, Newman LA, Minihan A, et al. Breast cancer statistics, 2022. CA Cancer J Clin. 2022;72(6):524–41.

National Cancer Institute (NIH). Cancer stat facts: female breast cancer [updated 2020]. https://seer.cancer.gov/statfacts/html/breast.html.

American Cancer Society. Key statistics for breast cancer. https://www.cancer.org/cancer/types/breast-cancer/about/how-common-is-breast-cancer.html.

Zagami P, Carey LA. Triple negative breast cancer: pitfalls and progress. npj Breast Cancer. 2022;8(1):95.


Matutino A, Joy AA, Brezden-Masley C, Chia S, Verma S. Hormone receptor-positive, HER2-negative metastatic breast cancer: redrawing the lines. Curr Oncol. 2018;25(Suppl 1):S131–41.

Lloyd MR, Wander SA, Hamilton E, Razavi P, Bardia A. Next-generation selective estrogen receptor degraders and other novel endocrine therapies for management of metastatic hormone receptor-positive breast cancer: current and emerging role. Ther Adv Med Oncol. 2022;14:17588359221113694.

Cardoso F, Senkus E, Costa A, Papadopoulos E, Aapro M, André F, et al. 4th ESO-ESMO International Consensus guidelines for advanced breast Cancer (ABC 4)†. Ann Oncol. 2018;29(8):1634–57.


US Food and Drug Administration. Palbociclib (Ibrance) [updated March 31, 2017]. https://www.fda.gov/drugs/resources-information-approved-drugs/palbociclib-ibrance.

US Food and Drug Administration. FDA expands ribociclib indication in HR-positive, HER2-negative advanced or metastatic breast cancer [updated July 18, 2018]. https://www.fda.gov/drugs/resources-information-approved-drugs/fda-expands-ribociclib-indication-hr-positive-her2-negative-advanced-or-metastatic-breast-cancer.

US Food and Drug Administration. FDA approves abemaciclib for HR-positive, HER2-negative breast cancer [updated September 28, 2017]. https://www.fda.gov/drugs/resources-information-approved-drugs/fda-approves-abemaciclib-hr-positive-her2-negative-breast-cancer.

NCCN Clinical Practice Guidelines in Oncology (NCCN Guidelines®). Breast cancer (2022). https://www.nccn.org/professionals/physician_gls/pdf/breast.pdf.

Gennari A, André F, Barrios CH, Cortés J, de Azambuja E, DeMichele A, et al. ESMO Clinical Practice Guideline for the diagnosis, staging and treatment of patients with metastatic breast cancer. Ann Oncol. 2021;32(12):1475–95.

Beaver JA, Amiri-Kordestani L, Charlab R, Chen W, Palmby T, Tilley A, et al. FDA approval: Palbociclib for the Treatment of Postmenopausal Patients with estrogen Receptor-Positive, HER2-Negative metastatic breast Cancer. Clin Cancer Res. 2015;21(21):4760–6.

US Food and Drug Administration. Ribociclib (Kisqali). https://www.fda.gov/drugs/resources-information-approved-drugs/ribociclib-kisqali.

US Food and Drug Administration. FDA approves new treatment for certain advanced or metastatic breast cancers. https://www.fda.gov/news-events/press-announcements/fda-approves-new-treatment-certain-advanced-or-metastatic-breast-cancers.

US Food and Drug Administration. FDA expands ribociclib indication in HR-positive, HER2-negative advanced or metastatic breast cancer (2018). https://www.fda.gov/drugs/resources-information-approved-drugs/fda-expands-ribociclib-indication-hr-positive-her2-negative-advanced-or-metastatic-breast-cancer.

US Food and Drug Administration. FDA approves abemaciclib as initial therapy for HR-positive, HER2-negative metastatic breast cancer. https://www.fda.gov/drugs/resources-information-approved-drugs/fda-approves-abemaciclib-initial-therapy-hr-positive-her2-negative-metastatic-breast-cancer.

Turner NC, Slamon DJ, Ro J, Bondarenko I, Im S-A, Masuda N, et al. Overall survival with Palbociclib and fulvestrant in advanced breast Cancer. N Engl J Med. 2018;379(20):1926–36.

Slamon DJ, Neven P, Chia S, Fasching PA, De Laurentiis M, Im SA, et al. Phase III randomized study of Ribociclib and Fulvestrant in hormone Receptor-Positive, human epidermal growth factor receptor 2-Negative advanced breast Cancer: MONALEESA-3. J Clin Oncol. 2018;36(24):2465–72.

Goetz MP, Toi M, Campone M, Sohn J, Paluch-Shimon S, Huober J, et al. MONARCH 3: Abemaciclib as initial therapy for advanced breast Cancer. J Clin Oncol. 2017;35(32):3638–46.

Gopalan PK, Villegas AG, Cao C, Pinder-Schenck M, Chiappori A, Hou W, et al. CDK4/6 inhibition stabilizes disease in patients with p16-null non-small cell lung cancer and is synergistic with mTOR inhibition. Oncotarget. 2018;9(100):37352–66.

Watt AC, Goel S. Cellular mechanisms underlying response and resistance to CDK4/6 inhibitors in the treatment of hormone receptor-positive breast cancer. Breast Cancer Res. 2022;24(1):17.

Goetz M. MONARCH 3: final overall survival results of abemaciclib plus a nonsteroidal aromatase inhibitor as first-line therapy for HR+, HER2- advanced breast cancer. SABCS; 2023.

Munzone E, Pagan E, Bagnardi V, Montagna E, Cancello G, Dellapasqua S, et al. Systematic review and meta-analysis of post-progression outcomes in ER+/HER2– metastatic breast cancer after CDK4/6 inhibitors within randomized clinical trials. ESMO Open. 2021;6(6):100332.

Gennari A, André F, Barrios CH, Cortés J, de Azambuja E, DeMichele A, et al. ESMO Clinical Practice Guideline for the diagnosis, staging and treatment of patients with metastatic breast cancer. Ann Oncol. 2021;32(12):1475–95.

European Society for Medical Oncology (ESMO). ESMO Metastatic Breast Cancer Living Guideline: ER-positive HER2-negative Breast Cancer [updated May 2023]. https://www.esmo.org/living-guidelines/esmo-metastatic-breast-cancer-living-guideline/er-positive-her2-negative-breast-cancer.

Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA, editors. Cochrane Handbook for Systematic Reviews of Interventions version 6.2 (updated February 2021). Cochrane; 2021. Available from: www.training.cochrane.org/handbook.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. PLoS Med. 2021;18(3):e1003583.

McGowan J, Sampson M, Salzwedel DM, Cogo E, Foerster V, Lefebvre C. PRESS peer review of electronic search strategies: 2015 Guideline Statement. J Clin Epidemiol. 2016;75:40–6.

Fraser C, Murray A, Burr J. Identifying observational studies of surgical interventions in MEDLINE and EMBASE. BMC Med Res Methodol. 2006;6(1):41.

US Food and Drug Administration. Palbociclib (Ibrance). Silver Spring, MD: US Food and Drug Administration; 2017.

Wells GA, Shea B, O’Connell D, Peterson J, Welch V, Losos M, Tugwell P. The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses. https://www.ohri.ca/programs/clinical_epidemiology/oxford.asp.

Lo CK-L, Mertz D, Loeb M. Newcastle-Ottawa Scale: comparing reviewers’ to authors’ assessments. BMC Med Res Methodol. 2014;14(1):45.

Martin JM, Handorf EA, Montero AJ, Goldstein LJ. Systemic therapies following progression on first-line CDK4/6-inhibitor treatment: analysis of real-world data. Oncologist. 2022;27(6):441–6.

Kalinsky KM, Kruse M, Smyth EN, Guimaraes CM, Gautam S, Nisbett AR, et al. Abstract P1-18-37: treatment patterns and outcomes associated with sequential and non-sequential use of CDK4 and 6i for HR+, HER2- MBC in the real world. Cancer Res. 2022;82(4 Suppl):P1-18-37.

Choong GM, Liddell S, Ferre RAL, O’Sullivan CC, Ruddy KJ, Haddad TC, et al. Clinical management of metastatic hormone receptor-positive, HER2-negative breast cancer (MBC) after CDK 4/6 inhibitors: a retrospective single-institution study. Breast Cancer Res Treat. 2022;196(1):229–37.

Xi J, Oza A, Thomas S, Ademuyiwa F, Weilbaecher K, Suresh R, et al. Retrospective Analysis of Treatment Patterns and effectiveness of Palbociclib and subsequent regimens in metastatic breast Cancer. J Natl Compr Canc Netw. 2019;17(2):141–7.

Rozenblit M, Mun S, Soulos P, Adelson K, Pusztai L, Mougalian S. Patterns of treatment with everolimus exemestane in hormone receptor-positive HER2-negative metastatic breast cancer in the era of targeted therapy. Breast Cancer Res. 2021;23(1):14.

Bashour SI, Doostan I, Keyomarsi K, Valero V, Ueno NT, Brown PH, et al. Rapid breast Cancer Disease Progression following cyclin dependent kinase 4 and 6 inhibitor discontinuation. J Cancer. 2017;8(11):2004–9.

Giridhar KV, Choong GM, Leon-Ferre R, O’Sullivan CC, Ruddy K, Haddad T, et al. Abstract P6-18-09: clinical management of metastatic breast cancer (MBC) after CDK 4/6 inhibitors: a retrospective single-institution study. Cancer Res. 2019;79:P6–18.


Mougalian SS, Feinberg BA, Wang E, Alexis K, Chatterjee D, Knoth RL, et al. Observational study of clinical outcomes of eribulin mesylate in metastatic breast cancer after cyclin-dependent kinase 4/6 inhibitor therapy. Future Oncol. 2019;15(34):3935–44.

Moscetti LML, Riggi L, Sperduti I, Piacentini FOC, Toss A, Barbieri E, Cortesi L, Canino FMA, Zoppoli G, Frassoldati A, Schirone A, Dominici MECF. Sequence of treatments after CDK4/6 therapy in advanced breast cancer (ABC), a GOIRC multicenter retro/prospective study: preliminary results in the retrospective series of 116 patients. Tumori. 2022;108(4S):80.

Menichetti AZE, Giorgi CA, Bottosso M, Leporati R, Giarratano T, Barbieri C, Ligorio F, Mioranza E, Miglietta F, Lobefaro R, Faggioni G, Falci C, Vernaci G, Di Liso E, Girardi F, Griguolo G, Vernieri C, Guarneri V, Dieci MV. CDK 4/6 inhibitors for metastatic breast cancer: a multicenter real-world study. Tumori. 2022;108(4S):70.

Marschner NW, Harbeck N, Thill M, Stickeler E, Zaiss M, Nusch A, et al. 232P Second-line therapies of patients with early progression under CDK4/6-inhibitor in first-line: data from the registry platform OPAL. Ann Oncol. 2022;33:S643–S644.

Gousis C, Lowe KMH, Kapiris M, Angelis V. Beyond first line CDK4/6 inhibitors (CDK4/6i) and aromatase inhibitors (AI) in patients with oestrogen receptor positive metastatic breast cancer (ERD MBC): the Guy’s Cancer Centre experience. Clin Oncol. 2022:e178.

Endo Y, Yoshimura A, Sawaki M, Hattori M, Kotani H, Kataoka A, et al. Time to chemotherapy for patients with estrogen receptor-positive breast Cancer and cyclin-dependent kinase 4 and 6 inhibitor use. J Breast Cancer. 2022;25(4):296–306.

Li Y, Li W, Gong C, Zheng Y, Ouyang Q, Xie N, et al. A multicenter analysis of treatment patterns and clinical outcomes of subsequent therapies after progression on palbociclib in HR+/HER2- metastatic breast cancer. Ther Adv Med Oncol. 2021;13:17588359211022890.

Amaro CP, Batra A, Lupichuk S. First-line treatment with a cyclin-dependent kinase 4/6 inhibitor plus an aromatase inhibitor for metastatic breast Cancer in Alberta. Curr Oncol. 2021;28(3):2270–80.

Crocetti SPM, Tassone L, Marcantognini G, Bastianelli L, Della Mora A, Merloni F, Cantini L, Scortichini L, Agostinelli V, Ballatore Z, Savini A, Maccaroni E, Berardi R. What is the best therapeutic sequence for ER-positive/HER2-negative metastatic breast cancer in the era of CDK4/6 inhibitors? A single center experience. Tumori. 2020;106(2S).

Nichetti F, Marra A, Giorgi CA, Randon G, Scagnoli S, De Angelis C, et al. 337P Efficacy of everolimus plus exemestane in CDK 4/6 inhibitors-pretreated or naïve HR-positive/HER2-negative breast cancer patients: a secondary analysis of the EVERMET study. Ann Oncol. 2020;31:S382.

Luhn P, O’Hear C, Ton T, Sanglier T, Hsieh A, Oliveri D, et al. Abstract P4-13-08: time to treatment discontinuation of second-line fulvestrant monotherapy for HR+/HER2– metastatic breast cancer in the real-world setting. Cancer Res. 2019;79(4 Suppl):P4–13.

Mittal A, Molto Valiente C, Tamimi F, Schlam I, Sammons S, Tolaney SM et al. Filling the gap after CDK4/6 inhibitors: Novel Endocrine and Biologic Treatment options for metastatic hormone receptor positive breast Cancer. Cancers (Basel). 2023;15(7).

Ashai N, Swain SM. Post-CDK 4/6 inhibitor therapy: current agents and novel targets. Cancers (Basel). 2023;15(6).

Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA, editors. Cochrane Handbook for Systematic Reviews of Interventions version 6.3 (updated February 2022). Cochrane; 2022. Available from: www.training.cochrane.org/handbook.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.

Serdar CC, Cihan M, Yücel D, Serdar MA. Sample size, power and effect size revisited: simplified and practical approaches in pre-clinical, clinical and laboratory studies. Biochem Med (Zagreb). 2021;31(1):010502.

US Food and Drug Administration. FDA approves elacestrant for ER-positive, HER2-negative, ESR1-mutated advanced or metastatic breast cancer [updated January 27, 2023]. https://www.fda.gov/drugs/resources-information-approved-drugs/fda-approves-elacestrant-er-positive-her2-negative-esr1-mutated-advanced-or-metastatic-breast-cancer.

US Food and Drug Administration. FDA approves capivasertib with fulvestrant for breast cancer [updated November 16, 2023]. https://www.fda.gov/drugs/resources-information-approved-drugs/fda-approves-capivasertib-fulvestrant-breast-cancer.


Acknowledgements

The authors would like to acknowledge Joanna Bielecki who developed, conducted, and documented the database searches.

This study was funded by Pfizer Inc. (New York, NY, USA) and Arvinas (New Haven, CT, USA).

Author information

Sarah Kane, Belal Howidi, Bao-Ngoc Nguyen and Imtiaz A. Samjoo contributed equally to this work.

Authors and Affiliations

Pfizer, 10017, New York, NY, USA

Veronique Lambert & Yan Wu

EVERSANA, Burlington, ON, Canada

Sarah Kane, Belal Howidi, Bao-Ngoc Nguyen & Imtiaz A. Samjoo

Arvinas, 06511, New Haven, CT, USA

David Chandiwana & Michelle Edwards


Contributions

VL, IAS, SK, BH, BN, DC, YW, and ME participated in the conception and design of the study. IAS, SK, BH, and BN contributed to the literature review, data collection, analysis, and interpretation of the data. VL, IAS, SK, BH, BN, DC, YW, and ME contributed to the interpretation of the data and critically reviewed the work for important intellectual content. VL, IAS, SK, BH, BN, DC, YW, and ME were responsible for drafting or reviewing the manuscript and for providing final approval. VL, IAS, SK, BH, BN, DC, YW, and ME meet the International Committee of Medical Journal Editors (ICMJE) criteria for authorship for this article, take responsibility for the integrity of the work, and have given their approval for this version to be published.

Corresponding author

Correspondence to Imtiaz A. Samjoo .

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors of this manuscript declare that the research presented was funded by Pfizer Inc. and Arvinas. While the support from Pfizer Inc. and Arvinas was instrumental in facilitating this research, the authors affirm that their interpretation of the data and the content of this manuscript were conducted independently and without bias to maintain the transparency and integrity of the research. IAS, SK, BH, and BN are employees of EVERSANA, Canada, which was a paid consultant to Pfizer in connection with the development of this manuscript.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Lambert, V., Kane, S., Howidi, B. et al. Systematic literature review of real-world evidence for treatments in HR+/HER2- second-line LABC/mBC after first-line treatment with CDK4/6i. BMC Cancer 24, 631 (2024). https://doi.org/10.1186/s12885-024-12269-8

Download citation

Received: 26 January 2024

Accepted: 16 April 2024

Published: 23 May 2024

DOI: https://doi.org/10.1186/s12885-024-12269-8


Keywords: Breast cancer; Real-world evidence; Systematic literature review; First-line CDK4/6i




Computer Science > Computation and Language

Title: Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges

Abstract: Text generation has become more accessible than ever, and the increasing interest in these systems, especially those using large language models, has spurred an increasing number of related publications. We provide a systematic literature review comprising 244 selected papers between 2017 and 2024. This review categorizes works in text generation into five main tasks: open-ended text generation, summarization, translation, paraphrasing, and question answering. For each task, we review their relevant characteristics, sub-tasks, and specific challenges (e.g., missing datasets for multi-document summarization, coherence in story generation, and complex reasoning for question answering). Additionally, we assess current approaches for evaluating text generation systems and ascertain problems with current metrics. Our investigation shows nine prominent challenges common to all tasks and sub-tasks in recent text generation publications: bias, reasoning, hallucinations, misuse, privacy, interpretability, transparency, datasets, and computing. We provide a detailed analysis of these challenges, their potential solutions, and which gaps still require further engagement from the community. This systematic literature review targets two main audiences: early career researchers in natural language processing looking for an overview of the field and promising research directions, as well as experienced researchers seeking a detailed view of tasks, evaluation methodologies, open challenges, and recent mitigation strategies.


Validation Summary: Robot Screener’s Performance in Screening Records for Systematic Literature Review and Health Technology Assessment

Jade Thurnham

Systematic literature reviews (SLRs) are the backbone of Health Economics and Outcomes Research (HEOR), playing a vital role in Health Technology Assessments (HTAs). However, the vast quantity of research publications can turn the initial screening of abstracts into a time-consuming roadblock. Nested Knowledge’s (NK) Robot Screener is an AI tool poised to revolutionize this process, saving researchers time and ensuring high-quality evidence informs critical HTA decisions.

Two recent studies, one internal validation authored by the Nested Knowledge team and one independent external validation, have shed light on Robot Screener’s effectiveness in the diverse types of review needed to support both publishable SLRs and HTA. These studies employed robust methodologies to compare human and AI screening performance directly across dozens of diverse reviews, providing compelling evidence for its role in streamlining HEOR-focused SLRs.

Methodology Matters: Assessing Recall and Precision

Both studies assessed Robot Screener’s performance by comparing its inclusion/exclusion decisions with those of human reviewers. The internal study focused on NK-published SLRs, analyzing over 8,900 abstracts across clinical, economic, and mental health research. The external study, specifically targeting HEOR topics for HTAs, included previously completed reviews on infectious diseases, neurodegenerative disorders, oncology, and more.

A key metric evaluated in both studies was Recall, which reflects the tool’s ability to identify relevant studies. Precision, on the other hand, measures the proportion of studies flagged by the tool that are actually relevant. Together, these metrics capture Robot Screener’s effectiveness on two fronts: flagging studies that are likely to be includable, thus preventing ‘missed’ studies (Recall), and excluding studies that are likely outside a review’s purview, saving the adjudicator the effort of triaging out non-relevant records (Precision).
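To make the two metrics concrete, here is a minimal sketch computing Recall and Precision from a screener’s include/exclude decisions against the adjudicated final inclusions. The decision lists are illustrative, not data from either study.

```python
# Recall = TP / (TP + FN): share of truly includable records the screener flagged.
# Precision = TP / (TP + FP): share of flagged records that were truly includable.

def recall_precision(flagged, truth):
    """Compute recall and precision from parallel include/exclude decisions.

    flagged: list of bool, screener's include decisions
    truth:   list of bool, adjudicated final inclusion status
    """
    tp = sum(f and t for f, t in zip(flagged, truth))
    fn = sum((not f) and t for f, t in zip(flagged, truth))
    fp = sum(f and (not t) for f, t in zip(flagged, truth))
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision

# Illustrative decisions for six records (not data from either study)
robot = [True, True, True, False, True, True]
final = [True, True, False, False, False, True]
r, p = recall_precision(robot, final)
print(f"Recall = {r:.2f}, Precision = {p:.2f}")
```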

Internal Study: High Recall Prioritizes Comprehensiveness

The internal study conducted by Nested Knowledge was performed on a set of 19 reviews that employed Robot Screener as a second reviewer in a Dual Screening process (see Figure). Robot Screener achieved a recall rate of 97.1%, significantly outperforming human reviewers (94.4%). This indicates that Robot Screener effectively captured a higher proportion of relevant studies. However, its precision rate of 47.3% fell short of human reviewers (86.4%). In effect, while Robot Screener missed significantly fewer includable studies, it sent forward significantly more studies that needed to be triaged out during adjudication.


This difference reflects a strategic design choice. Robot Screener prioritizes high recall to ensure no crucial studies are missed in the initial screening phase. The adjudication stage then focuses on the flagged articles, maintaining the rigor of the SLR process and ensuring that irrelevant articles are screened out even when recommended for inclusion by either the human or Robot Screener. This prioritization of comprehensiveness is particularly valuable in HEOR, where missing relevant studies could lead to incomplete HTAs with potentially significant implications for interpretation.
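A short sketch of this design choice, under the assumption (consistent with the dual-screening description above) that a record is excluded outright only when both reviewers exclude it; the routing function is hypothetical, not Nested Knowledge’s actual API:

```python
# Minimal sketch of the dual-screening rule described above: a record advances
# to adjudication when either reviewer includes it, so a relevant study is
# "missed" only if BOTH the human and the robot exclude it.

def screen_record(human_include: bool, robot_include: bool) -> str:
    """Route a record through dual screening with human adjudication."""
    if human_include or robot_include:
        return "adjudicate"  # human adjudicator makes the final call
    return "exclude"

# A truly relevant record is lost only when both screeners miss it, which is
# why the paired recall is at least as high as either screener's alone.
for human, robot in [(True, True), (True, False), (False, True), (False, False)]:
    print(f"human={human}, robot={robot} -> {screen_record(human, robot)}")
```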

External Study: Comparable Recall with Focus on Up-to-Date Evidence

The external study, conducted by Cichewicz et al. and designed to simulate updating existing HEOR-based SLRs, also produced promising results. Recall and precision were reported as mean ± SD on a 0-to-1 scale, where higher values indicate better performance. Robot Screener’s Recall (0.79 ± 0.18) was comparable to human reviewers (0.80 ± 0.20). While overall Recall for both humans and Robot Screener was lower than in the internal study, this likely reflects different proportions of final-included studies, and the consistency of Robot Screener in matching or exceeding human Recall in HEOR-focused SLRs shows that the tool generalizes across disciplines. Furthermore, rather than employing Robot Screener throughout an entire initial review, the Cichewicz et al. study focused on updates, demonstrating Robot Screener’s ability to efficiently identify potentially relevant studies even when refreshing older reviews with new research.

As with the internal study, Robot Screener’s precision (0.46 ± 0.13) was lower than that of human reviewers (0.77 ± 0.19) in Cichewicz et al. However, the study’s conclusions emphasized that the time saved through efficient screening outweighs the additional human effort needed to adjudicate these “false positives.” Moreover, Robot Screener’s low false negative rate (2%) underscores its effectiveness in minimizing the risk of excluding truly relevant studies.

Large Language Models for Screening and “Recall-first” philosophy

Comparing Robot Screener to human screeners is helpful in determining workflows that can improve accuracy and timelines in SLR. However, Robot Screener, as a machine learning algorithm, depends on project-specific training on users’ decisions. A recent study by Tran et al. (2024) explored a related question: the effectiveness of GPT-3.5 Turbo models for title and abstract screening in five systematic reviews without any project-specific training. Across configurations, GPT-3.5 achieved high sensitivity (recall) rates, ranging from 81.1% to 99.8% depending on the optimization rule, though specificity ranged from as low as 2.2% to as high as 80.4%. This indicates that Large Language Models (LLMs) may show promise for screening without the need for training data, but that overall accuracy varies widely depending on whether the model is optimized for maximizing recall or for maximizing ‘correct’ exclusions.

We look forward to the further development of both machine learning and LLM approaches to screening. This publication also drives home that it is likely as important to optimize the thresholds a model uses as it is to train or tune the model correctly; in Tran et al., the thresholding itself had a far greater impact on recall and specificity than anything else. To paraphrase Cichewicz et al., a more conservative approach (prioritizing and optimizing recall, so that no includable study is missed) mitigates the risk of non-comprehensiveness and, when used in a dual-screening approach, does not compromise final review screening accuracy despite presenting lower precision (and specificity). Thus, approaches that include human adjudication and maximize recall can provide AI-assisted screening while maintaining review quality.
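To illustrate why thresholding matters so much, the sketch below sweeps an inclusion threshold over synthetic relevance scores: lowering the threshold pushes recall toward 1.0 at the cost of specificity, which is exactly the recall-first trade-off described above. All scores and labels are synthetic, not outputs from any of the cited models.

```python
# Illustrates how the decision threshold, not the model alone, drives the
# recall/specificity trade-off reported for the GPT-3.5 configurations.

def recall_specificity(scores, labels, threshold):
    """Classify score >= threshold as 'include' and score the results."""
    preds = [s >= threshold for s in scores]
    tp = sum(p and l for p, l in zip(preds, labels))
    fn = sum((not p) and l for p, l in zip(preds, labels))
    tn = sum((not p) and (not l) for p, l in zip(preds, labels))
    fp = sum(p and (not l) for p, l in zip(preds, labels))
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return recall, specificity

# Synthetic relevance scores and true includability labels for 8 records
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [True, True, False, True, False, False, True, False]

for t in (0.15, 0.50, 0.75):
    r, s = recall_specificity(scores, labels, t)
    print(f"threshold={t:.2f}: recall={r:.2f}, specificity={s:.2f}")
```

Running the sweep shows recall falling (1.00, 0.75, 0.50) and specificity rising (0.25, 0.75, 1.00) as the threshold increases; a recall-first configuration simply picks a low threshold and relies on human adjudication to absorb the extra false positives.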

Confidence in Your HEOR Research: The Power of Robot Screener

These studies highlight the significant benefits Robot Screener offers researchers conducting HEOR-focused SLRs within Nested Knowledge. Here’s a quick recap of the key takeaways:

  • High Recall Rates: Robot Screener excels at capturing relevant studies, with a recall rate exceeding 97% (significantly higher than human screeners) in the internal study. This ensures a comprehensive foundation for your research.
  • Focus on Up-to-Date Evidence: The external study demonstrates Robot Screener’s effectiveness in keeping your reviews current, crucial for informing HTAs with the latest research.
  • Prioritizing Comprehensiveness: While precision may be lower, the studies emphasize the strategic choice to prioritize capturing all potentially relevant studies (high recall) to safeguard the comprehensiveness of your research.
  • Time Savings and Efficiency: Robot Screener’s efficient screening translates to faster completion of SLRs, allowing you to focus your expertise on critical analysis.
  • LLM Screening: Recent research shows GPT-3.5 can achieve high recall in screening without training data. LLMs and machine learning approaches are both powerful assets for streamlining research workflows within SLRs for publication, HTA, gap analysis, and other purposes, without sacrificing overall review quality, so long as adjudicated methods with high recall are employed.

The Future of SLR Research: A Collaborative Approach

These validation studies pave the way for adoption of Robot Screener in adjudicated systems with human and AI collaborative screening, bringing time savings while offering complementary strengths: humans outperform in Precision, and AI can provide higher Recall. They also pave the way for future research exploring the use of Robot Screener and other AI screening tools in different configurations, with different oversight or workflows, and in review types beyond publishable SLRs and SLRs for HTA.

By embracing Robot Screener, researchers can leverage the power of AI to streamline their workflows, ensure high-quality evidence informs SLRs and evidence synthesis generally, and ultimately contribute to better evidence synthesis and health outcomes.  

