Academia Insider

How To Use ChatGPT To Write A Literature Review: Prompts & References

In the rapidly evolving world of academic research, the integration of AI tools like ChatGPT has transformed the traditional approach to literature reviews. As a researcher, you can leverage this tool to make your research work easier.

In this post, we explore how ChatGPT can enhance the literature review process, and how specific prompts can guide this advanced AI model to generate insightful content while ensuring accuracy, relevance, and academic integrity in your scholarly work.

How to Use ChatGPT for Writing a Literature Review

Understanding ChatGPT’s Limitations

While ChatGPT can efficiently generate content, streamline the research process, and provide a comprehensive understanding of relevant literature, its capabilities are not without constraints. Here are some limitations to consider:

Dependence On Pre-Existing Datasets

Since ChatGPT is a language model trained on available data, it may not include the most recent research papers or cutting-edge findings in a specific field. This gap can lead to a lack of current state-of-research insights, particularly crucial in fields like technology and science where advancements happen rapidly.

May Lack Depth And Context

ChatGPT, while able to produce summaries and synthesize information, might not fully grasp the nuanced arguments or complex theories specific to a research topic. This limitation necessitates that researchers critically evaluate and supplement AI-generated text with thorough analysis and insights from recent systematic reviews and primary sources.

Risk Of Plagiarism

Although ChatGPT can generate human-like text, it’s vital to ensure that the content for your literature review is original and properly cited. Relying solely on ChatGPT to write a literature review defeats the purpose of engaging deeply with the material and developing a personal understanding of the literature.

Not A Total Replacement of A Researcher

While ChatGPT can assist non-native English speakers in crafting clear and concise academic writing, it’s not a replacement for the human ability to contextualize and interpret research findings. Researchers must guide the AI model with specific prompts and leverage it as a tool rather than a substitute for comprehensive analysis.

By keeping these limitations in mind, ChatGPT can be a valuable aid in the literature review process, but it should be used judiciously and in conjunction with traditional research methods.

Defining Research Objective

When starting a literature review, the initial step involves using ChatGPT to help define your research question or hypothesis.

The AI model’s ability to respond with a summary of the current state of research in your field can provide a comprehensive understanding, especially for systematic reviews or research papers.

For example, by inputting a prompt related to your research topic, ChatGPT can generate human-like text, summarizing prior research and highlighting relevant literature.

One insider tip for effectively using ChatGPT in the literature review process is to leverage its natural language processing capabilities to identify relevant keywords.

These keywords are crucial for non-native English speakers or those new to a research field, as they streamline the search for pertinent academic writing. Additionally, ChatGPT can guide you in understanding the context of your research topic, offering insights that are often challenging to find.

Using AI language models like ChatGPT for generating content for your literature review is efficient and effective, saving valuable time. However, it’s vital to critically evaluate the generated text to ensure it aligns with your research objectives and to avoid plagiarism.


ChatGPT’s ability to synthesize large amounts of information can aid in developing a clear and concise outline, but remember, it’s a guide, not a replacement for human analysis.

Despite these limitations, ChatGPT provides a unique advantage in conducting literature reviews. It can automate mundane tasks, allowing researchers to focus on analysis and critical thinking.

Identifying Keywords and Search Terms

Using ChatGPT to identify relevant keywords related to your research topic can significantly streamline your workflow.

For instance, when you input a summary of your research question into ChatGPT, the AI model can generate a list of pertinent keywords.

These keywords are not just randomly selected; they are based on the vast amounts of information in ChatGPT’s dataset, making them highly relevant and often inclusive of terms that are current in your research field.

An insider tip for leveraging ChatGPT effectively is to guide the AI with clear and concise prompts.

For example, asking ChatGPT to: “summarize key themes in [specific field] research papers from the last five years” can yield a list of keywords and phrases that are not only relevant but also reflective of the current state of research.

This approach is particularly beneficial for conducting systematic reviews or for non-native English speakers who might be unfamiliar with specific academic jargon.

While ChatGPT can provide a comprehensive understanding of relevant literature and help automate the identification of keywords, it’s important to critically evaluate the generated content.

Researchers should use ChatGPT as a tool to augment their research process, not as a replacement for human insight.

It’s crucial to mind the limitations of the AI model and ensure that the keywords identified align with the research topic and objectives.
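Keyword prompts like the one above can be parameterized so the same template works across topics. The sketch below is illustrative only — the function name and wording are assumptions for this example, not part of ChatGPT or any tool mentioned here; the resulting string is simply pasted into the chat interface:

```python
# Hypothetical helper for composing keyword-identification prompts.
# The function name and template wording are illustrative assumptions.

def make_keyword_prompt(field: str, years: int = 5) -> str:
    """Build a prompt asking ChatGPT to surface key themes and keywords."""
    return (
        f"Summarize the key themes in {field} research papers "
        f"from the last {years} years, and list 10 search keywords "
        f"a reviewer could use to find relevant literature."
    )

prompt = make_keyword_prompt("organic photovoltaics")
print(prompt)
```

Keeping the topic and time window as parameters makes it easy to re-run the same search strategy in adjacent fields.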

Creating an Initial Literature Review Outline

The key to using ChatGPT effectively in crafting an initial outline lies in its ability to generate content based on specific prompts.

For instance, a researcher working on organic photovoltaic devices can input a prompt into ChatGPT, such as “Help me create a structure for a literature review on organic photovoltaic devices.”

The AI model, using its comprehensive understanding of the research topic, can then produce a preliminary structure, including sections like:

  • Introduction
  • Advances in materials and technology
  • Performance and efficiency

This generated outline serves as a valuable starting point. It helps in organizing thoughts and determining the key areas that the literature review should cover.

Importantly, researchers can refine and expand this initial outline as they delve deeper into their topic, ensuring it aligns with their specific research question and the current state of research.

However, while ChatGPT can streamline the review process and save valuable time in creating an initial outline, researchers should not solely rely on it.


The content generated by ChatGPT must be critically evaluated and supplemented with in-depth research. This involves:

  • Reading systematic reviews
  • Reading research papers, and
  • Summarizing relevant literature to ensure the review is comprehensive and up-to-date.

Get ChatGPT To Help You During Research, Using The Right Prompts

The key to effectively using ChatGPT in this process lies in crafting the right prompts, guiding the AI to generate relevant and useful content. 

When initiating a literature review, the prompt should aim for a broad understanding of the research topic. For instance, asking ChatGPT to:

  • “Give a brief overview of research done on [topic]”
  • “What are some of the recent findings on the [topic] in research?” or 
  • “Summarize the historical development of [topic] in academia”

helps in capturing the general landscape of the field. These prompts assist in identifying key theories, methodologies, and authors within the research area. As the review progresses, more specific prompts are necessary to delve deeper into individual studies. Queries like:

  • “Summarize the main arguments and findings of [specific paper]” or
  • “What are the strengths and weaknesses of [specific paper]?”

enable ChatGPT to provide detailed insights into particular research papers, aiding in understanding their contribution to the broader field. Comparative prompts are also crucial in synthesizing information across multiple works. Asking ChatGPT to:

  • “Compare and contrast the methodologies of [paper 1] and [paper 2]” or
  • “How do the findings of [paper 1] and [paper 2] agree or disagree?”

helps in discerning the nuances and disparities in the literature. In the final stages of the literature review, prompts should focus on summarizing findings and identifying emerging trends or gaps. For example:

  • “What trends or patterns have emerged from the literature on [topic]?” or
  • “What future research directions are suggested by the literature on [topic]?”

We will share more of these ChatGPT prompts later in this post, so read on.
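The progression above — broad, specific, comparative, then synthesizing prompts — lends itself to a small template table. The following sketch is an illustration only (the stage names, dictionary, and fill helper are assumptions for this example, not a standard API); it just substitutes the bracketed placeholders before a prompt is used:

```python
# Illustrative mapping of literature-review stages to prompt templates.
# Placeholders in [brackets] are filled in before use; the stage names
# and wording are assumptions for this sketch, not a fixed taxonomy.

STAGE_PROMPTS = {
    "broad": "Give a brief overview of research done on [topic].",
    "specific": "Summarize the main arguments and findings of [paper].",
    "comparative": "Compare and contrast the methodologies of [paper 1] and [paper 2].",
    "synthesis": "What trends or patterns have emerged from the literature on [topic]?",
}

def fill(template: str, **values: str) -> str:
    """Replace [bracketed] placeholders with supplied values.

    Underscores in keyword names map to spaces, so paper_1 fills [paper 1].
    """
    for key, value in values.items():
        template = template.replace(f"[{key.replace('_', ' ')}]", value)
    return template

print(fill(STAGE_PROMPTS["broad"], topic="organic photovoltaics"))
# Give a brief overview of research done on organic photovoltaics.
```

Working from one table keeps the wording of your prompts consistent across the whole review, which makes the generated answers easier to compare.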

Reviewing ChatGPT’s Responses

When using ChatGPT to write a literature review, it’s crucial to critically evaluate its responses.

Firstly, researchers should cross-reference the information provided by ChatGPT with actual research papers.

This step ensures the accuracy of the data and helps in identifying any discrepancies or outdated information, given that ChatGPT’s dataset may not include the most recent studies.

Another essential aspect is assessing the coherence and depth of the AI-generated text. ChatGPT can summarize and synthesize information efficiently, but it might not capture the nuances of complex theories or research arguments.

Researchers should ensure that the content aligns with their research question and reviews the topic systematically and comprehensively. This is where a researcher’s value comes in.

Additionally, verifying the originality of the content is vital to avoid plagiarism. While ChatGPT can generate human-like text, researchers must ensure that the AI-generated content is used as a guide rather than a verbatim source. 

Proper citations and references are essential to maintain the integrity of the literature review. Avoid torpedoing your own research by committing plagiarism.

Ensuring Coherence and Flow

One of the challenges when using such advanced AI language models is ensuring the coherence and flow of the final document. This aspect is crucial as it determines the readability and academic rigor of the literature review.

ChatGPT can generate vast amounts of content on a wide range of topics, responding efficiently to prompts and synthesizing information from its extensive dataset.

However, the content generated by ChatGPT, while informative, might not always align seamlessly with the specific research question or maintain a consistent narrative flow.


To tackle this, researchers need to take an active role in guiding ChatGPT and subsequently refining its output.

A practical approach is to use ChatGPT as a starting point, leveraging its ability to quickly provide summaries, synthesize relevant literature, and identify key references and keywords related to the research topic. For example, prompts like:

  • “Summarize the current research on [topic]” or
  • “Identify key debates in [topic]”

can yield valuable initial insights.

Once this foundational information is obtained, the crucial task is to carefully review and edit the AI-generated content.

This involves connecting the dots between different sections, ensuring that each part contributes meaningfully to addressing the research question, and refining the language to maintain academic standards.

It’s also essential to check for and avoid plagiarism, ensuring that all sources are correctly cited.

In addition, considering the vast amounts of information ChatGPT can access, it’s vital to verify the accuracy and relevance of the content.

Researchers should cross-reference AI-generated summaries with actual research papers, especially the most recent ones, as ChatGPT’s dataset may not include the latest studies.

Editing and Proofreading

Now that your literature review is mostly written out, focus on editing and proofreading. The content generated by ChatGPT needs to be meticulously reviewed and edited. Here are the steps:

  • Verify the accuracy of the information. Researchers must cross-check the AI-generated content against actual research papers and systematic reviews. This ensures that the latest studies are accurately represented.
  • Improve coherence and flow. Researchers should restructure sentences, ensure a logical progression of ideas, and maintain a consistent academic tone throughout the document.
  • Check for plagiarism. Despite ChatGPT’s ability to generate human-like text, researchers must ensure that all sources are correctly cited and that the review does not inadvertently replicate existing material.
  • Check grammar and spelling. Editing should encompass grammar checks, vocabulary refinement, and ensuring that the language used is appropriate for an academic audience.
  • Update citations. Review the citation or reference list to ensure everything is cited correctly and formatted to your required standard, be it MLA, Chicago, or APA.
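For the plagiarism step, a rough first pass can be automated before running a dedicated plagiarism checker. The sketch below uses only Python’s standard-library difflib to flag long near-verbatim overlaps between a draft passage and a source text; the function names and the 40-character threshold are arbitrary assumptions for this example, and it is in no way a substitute for a proper plagiarism tool:

```python
# A rough originality spot-check using only the standard library.
# This is NOT a substitute for a real plagiarism checker; it merely
# flags long near-verbatim overlaps between a draft and one source.
from difflib import SequenceMatcher

def longest_shared_run(draft: str, source: str) -> str:
    """Return the longest contiguous block of text shared by both strings."""
    a, b = draft.lower(), source.lower()
    m = SequenceMatcher(None, a, b)
    match = m.find_longest_match(0, len(a), 0, len(b))
    return draft[match.a : match.a + match.size]

def looks_copied(draft: str, source: str, threshold: int = 40) -> bool:
    """Flag drafts sharing a run of `threshold`+ characters with a source."""
    return len(longest_shared_run(draft, source)) >= threshold

draft = "Organic photovoltaic devices convert sunlight into electricity using carbon-based semiconductors."
source = "These organic photovoltaic devices convert sunlight into electricity using novel materials."
print(looks_copied(draft, source))  # a long shared run triggers the flag
```

Even a crude check like this catches accidental copy-paste from ChatGPT output before a proper checker is run against the full manuscript.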

What ChatGPT Prompts To Use When Writing A Literature Review?

There are many ways to use ChatGPT to write a literature review, usually by using the right prompts. Here’s how specific types of prompts can be effectively employed, with multiple examples for each category:

  • “Provide a comprehensive overview of the latest research on [topic].”
  • “Summarize the current understanding and key findings in the field of [topic].”
  • “Detail the dominant theoretical frameworks currently used in [topic].”
  • “Describe the evolution of theoretical approaches in [topic] over the past decade.”
  • “Identify and discuss the major debates or controversies in [topic].”
  • “What are the conflicting viewpoints or schools of thought in [topic]?”
  • “List the leading researchers in [topic] and summarize their key contributions.”
  • “Who are the emerging authors in [topic], and what unique perspectives do they offer?”
  • “Explain the most common research methodologies used in studies about [topic].”
  • “How have the methodologies in [topic] research evolved recently?”
  • “Trace the historical development and major milestones in [topic].”
  • “Provide a timeline of the key discoveries and shifts in understanding in [topic].”
  • “What significant paradigm shifts have occurred in [topic] in the last twenty years?”
  • “How has the focus of research in [topic] changed over time?”
  • “Analyze the methodology and conclusions of [specific paper].”
  • “Discuss the impact and reception of [specific paper] in the field of [topic].”
  • “Compare the results and methodologies of [paper 1] and [paper 2] in [topic].”
  • “How do [paper 1] and [paper 2] differ in their approach to [topic]?”
  • “Based on current literature, what are the suggested future research directions in [topic]?”
  • “Identify gaps in the literature of [topic] that could be explored in future studies.”

By using these types of prompts, researchers can guide ChatGPT to produce content that is not only relevant to their literature review but also rich in detail and scope.

Wrapping Up: Use Other AI Tools Too, Not Just ChatGPT

In conclusion, while ChatGPT serves as a powerful ally in the literature review process, it’s important to recognize it as just one of many AI tools available to researchers. Diversifying your AI toolkit can enhance the depth and breadth of your review, offering varied perspectives and methodologies.

As AI continues to evolve, embracing a range of these tools can lead to more comprehensive, nuanced, and innovative academic writing, expanding the horizons of research and scholarly exploration beyond what we currently envision.


Dr Andrew Stapleton has a Masters and PhD in Chemistry from the UK and Australia. He has many years of research experience and has worked as a Postdoctoral Fellow and Associate at a number of Universities. Although having secured funding for his own research, he left academia to help others with his YouTube channel all about the inner workings of academia and how to make it work for you.


2024 © Academia Insider


Step-by-Step Guide: How to Use ChatGPT to Write a Literature Review With Prompts

Dr. Somasundaram R

Writing a literature review can be a challenging task for researchers and students alike. It requires a comprehensive understanding of the existing body of research on a particular topic. However, with the advent of advanced language models like ChatGPT, the process has become more accessible and efficient.

In this step-by-step guide, iLovePhD will explore how you can leverage ChatGPT to write a compelling and informative literature review.

A Step-by-Step Guide: How to Use ChatGPT for Writing a Literature Review

Step 1: Defining Your Research Objective

Before diving into the literature review process, it is crucial to define your research objective.

Clearly articulate the topic, research question, or hypothesis you aim to address through your literature review. This step will help you maintain focus and guide your search for relevant sources.

Step 2: Identifying Keywords and Search Terms

To effectively use ChatGPT to assist in your literature review, you need to identify relevant keywords and search terms related to your research topic.

These keywords will help you narrow down your search and gather pertinent information. Consider using tools like Google Keyword Planner or other keyword research tools to discover commonly used terms in your field.

Step 3: Familiarizing Yourself with ChatGPT

Before engaging with ChatGPT, it is essential to understand its capabilities and limitations. Familiarize yourself with the prompts and commands that work best with the model.

Keep in mind that ChatGPT is an AI language model trained on a vast amount of data, so it can provide valuable insights and suggestions, but it’s important to critically evaluate and validate the information it generates.

Step 4: Generating an Initial Literature Review Outline

Start by creating an outline for your literature review. Outline the main sections, such as the introduction, methodology, results, discussion, and conclusion.

Within each section, jot down the key points or subtopics you want to cover. This will help you organize your thoughts and structure your review effectively.

Step 5: Engaging with ChatGPT for Research Assistance

Once you have your outline ready, engage with ChatGPT for research assistance.

Begin by providing a clear and concise prompt that specifies the topic, context, and any specific questions you have. For example, “What are the current trends in [your research topic]?” or “Can you provide an overview of the main theories on [your research question]?”
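For researchers who prefer scripting these exchanges, the same prompts can be sent through OpenAI’s Chat Completions API rather than the chat interface. The sketch below is a minimal illustration assuming an API key in the OPENAI_API_KEY environment variable; the helper names are made up for this example, and send_prompt is shown but never called here since it performs a live network request:

```python
# Hypothetical helpers for sending a literature-review prompt to the
# OpenAI Chat Completions API. build_payload is pure and inspectable
# offline; send_prompt performs the network call.
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"

def build_payload(prompt: str, model: str = "gpt-4o-mini") -> dict:
    """Assemble the JSON body for a single-turn chat request."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a research assistant."},
            {"role": "user", "content": prompt},
        ],
    }

def send_prompt(prompt: str) -> str:
    """POST the prompt and return the assistant's reply (network required)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]
```

Keeping build_payload separate from the network call means the request body can be inspected, logged, or tested without spending API credits.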

Step 6: Reviewing and Selecting Generated Content

ChatGPT will generate a response based on your prompt. Carefully review the content generated, considering its relevance, accuracy, and coherence.

Extract key points, relevant references, and insightful arguments from the response and incorporate them into your literature review. Be sure to cite and attribute the sources appropriately.

Step 7: Ensuring Coherence and Flow

While ChatGPT can provide valuable content, it’s important to ensure the coherence and flow of your literature review.

Use your critical thinking skills to connect the generated content with your research objective and existing knowledge. Rearrange, rephrase, and expand upon the generated text to ensure it aligns with the structure and purpose of your review.

Step 8: Editing and Proofreading

Once you have incorporated the generated content into your literature review, thoroughly edit and proofread the document.

Check for grammatical errors, consistency in referencing, and overall clarity. This step is crucial to ensure your literature review is polished and professional.

ChatGPT prompts to Write a Literature Review

Prompts you can use when engaging with ChatGPT for research assistance in writing a literature review:

  • “Can you provide an overview of the main theories and concepts related to [your research topic]?”
  • “What are the current trends and developments in [your research field]?”
  • “Can you suggest some key studies or research papers on [specific aspect of your research topic]?”
  • “What are the main methodologies used in conducting research on [your research topic]?”
  • “Can you provide a critical analysis of the existing literature on [your research question]?”
  • “Are there any gaps or areas of controversy in the literature on [your research topic] that need further exploration?”
  • “What are the key findings and conclusions from the most recent studies on [your research topic]?”
  • “Can you suggest some reputable journals or publications to explore for relevant literature in [your research field]?”
  • “What are the different perspectives or schools of thought in the literature on [your research topic]?”
  • “Can you provide a summary of the historical background and evolution of research on [your research topic]?”

Remember to provide clear and specific instructions in your prompts to guide ChatGPT in generating relevant and accurate content for your literature review.

Using ChatGPT to write a literature review can greatly facilitate the research process. By following a step-by-step approach, researchers can effectively leverage ChatGPT’s capabilities to gather insights, generate content, and enhance the quality of their literature review. However, it is important to approach the generated content critically, validate it with reliable sources, and ensure coherence within the review.

  • Academic Writing
  • AI writing tool
  • literature review guide
  • literature review tips
  • research assistance
  • research process
  • research writing
  • write literature review


Copyright © 2024 iLovePhD. All rights reserved



Artificial Intelligence Now: ChatGPT + AI Literacy Toolbox: Literature Reviews with Prompts


Resources on Lit Reviews & AI

  • How to Use ChatGPT to Accelerate Literature Review I’d like to show you how I use ChatGPT to speed up my literature review process. The topic I am exploring is heterogeneity (diversity) learning. I started with the keyword “Info-GAIL”, as I read about it when using GAIL in the past.
  • ChatGPT Simple Literature Review Template The evolution of library services in the digital age has seen a significant shift towards automation and artificial intelligence applications, with OpenAI's ChatGPT being one of the most popular tools. This literature review explores the trends in the application of ChatGPT in library settings, focusing on user engagement and support services from 2015 to 2023.
  • ChatGPT as a Tool for Library Research – Some Notes and Suggestions I see ChatGPT and its alternatives as having partial value as tools for library searching. You can use them without any training, but they will perform better when you know some details about them.

9 Ways To Use ChatGPT To Write A Literature Review (WITHOUT Plagiarism) Video


How to Use ChatGPT to Write a Literature Review With Prompts

Dr. Somasundaram R | https://www.ilovephd.com/ | Copyright © 2019-2023 – iLovePhD | May 19, 2023


  • Last Updated: May 2, 2024 3:03 PM
  • URL: https://library.fiu.edu/ai

Using ChatGPT to conduct a literature review

Affiliation: Department of Humanities, Czech University of Life Sciences Prague, Prague, Czech Republic.

PMID: 36879536 | DOI: 10.1080/08989621.2023.2185514

AI Habit

80+ ChatGPT Prompts for Literature Review

Conducting a literature review can often seem like a daunting task. It's an essential part of academic and research work, involving comprehensive and critical analysis of previously published research related to a specific topic. The goal is to gain a deep understanding of the state of knowledge on the topic, identify gaps, and provide a foundation for further research.

Traditionally, literature reviews have been manual, time-consuming processes. However, the advancement of technology, especially Artificial Intelligence (AI), is reshaping this landscape. One such development is ChatGPT, an AI developed by OpenAI. This tool can assist researchers in conducting more efficient and thorough literature reviews.

In this blog post, we will delve into the ways that ChatGPT, with its remarkable language understanding and generation capabilities, can be used to streamline your literature review process.

From explaining the basic capabilities of ChatGPT to providing essential prompts and tips to conduct a literature review, this post will serve as a comprehensive guide for those looking to leverage AI for academic research.

Stay tuned as we navigate this fascinating convergence of AI and research, and explore the potential of ChatGPT in making literature reviews more efficient and effective.


The Power of AI in Literature Reviews

The role of Artificial Intelligence (AI) in academia and research is expanding rapidly. AI has emerged as a game-changing tool that can significantly enhance the process of conducting literature reviews.

The Benefits of AI in Conducting Literature Reviews

There are numerous advantages to using AI for literature reviews. Let's delve into these benefits:

Time Efficiency

AI can streamline the literature review process, significantly reducing the time taken. With AI, you can quickly scan vast amounts of literature, saving valuable time that can be used for analysis and synthesis.

Comprehensive Coverage

AI algorithms are capable of scanning and analyzing a far greater volume of literature than a human could reasonably manage. This ensures a comprehensive and exhaustive review of existing literature.

Consistency and Bias Reduction

AI maintains a level of consistency that is hard to achieve manually. It does not suffer from fatigue or bias, ensuring that all relevant literature is considered fairly.

The Role of ChatGPT in Literature Reviews

ChatGPT, developed by OpenAI, has proven to be a valuable tool for conducting literature reviews. It is a language model that can understand and generate human-like text, making it an excellent tool for understanding complex academic literature and providing summaries or insights.

Here are some example prompts that can be used to guide ChatGPT in assisting with literature reviews:

Example Prompts:

Generate a summary of this research paper.

What are the key findings of this study?

How does this research contribute to the field of [specific field]?

Identify the strengths and weaknesses of this study.

Compare this study with the previous study on the same topic.

What are the research gaps identified in this literature?

What future research directions are suggested by this study?

Preparing for a ChatGPT-Assisted Literature Review

Before diving into a literature review with ChatGPT, you need to set the stage right. Preparation is crucial to ensuring you get the most out of your AI-assisted literature review process.

Detailed Steps on How to Set Up for a Literature Review Using ChatGPT

  • Understand Your Topic : Before starting, ensure you have a clear understanding of your topic and the specific aspects you want to focus on in the literature review.
  • Gather Your Materials : Collect all the papers, articles, and sources you plan to include in your literature review.
  • Organize Your Materials : Categorize your sources based on various parameters like theme, methodology, results, etc. This will make it easier to navigate through them during the review.
  • Prepare Your Prompts : Create a list of prompts you plan to use with ChatGPT. These prompts should be designed to extract the information you need from each source.
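
The preparation steps above can be sketched as a small script. This is a minimal illustration of steps 2-4 (our own, not part of the original guide): the source titles, themes, and prompt templates are hypothetical placeholders for your own materials.

```python
# Organize collected sources by theme, then pair each source with the
# prompts you plan to run through ChatGPT.

# Hypothetical source list, categorized by theme as suggested above.
sources = [
    {"title": "Paper A", "theme": "methodology", "year": 2021},
    {"title": "Paper B", "theme": "key findings", "year": 2023},
    {"title": "Paper C", "theme": "methodology", "year": 2019},
]

# Prompt templates: the [specific paper] placeholder becomes {title}.
prompt_templates = [
    "Summarize the main arguments and findings of {title}.",
    "What methodology does {title} use and why?",
]

def build_prompts(sources, templates):
    """Pair every source with every prompt template."""
    return [t.format(title=s["title"]) for s in sources for t in templates]

prompts = build_prompts(sources, prompt_templates)
print(len(prompts))  # 3 sources x 2 templates = 6 prompts
print(prompts[0])
```

Keeping the prompt list separate from the source list makes it easy to add a new paper or a new question without rewriting anything by hand.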

Tips for Effective Use of ChatGPT in Literature Review

Preparing a list of keywords or topics.

Start with a broad overview of your research topic and gradually narrow down the scope by identifying specific sub-topics, methodologies, or research questions you want to explore. Use these as keywords for your prompts.

Having Clear Objectives for the Review

Before beginning your review, clearly outline what you hope to achieve. This could be understanding the current state of research, identifying gaps, comparing methodologies, etc.

Understanding the Limits of AI Assistance

While ChatGPT is a powerful tool, it's essential to remember it is a tool and not a replacement for human analysis. Use it to process and summarize information, but the interpretation and synthesis of this information should be done by you.

Provide an overview of the literature on [specific topic].

Summarize the methodologies used in the studies on [specific topic].

Identify common themes in the literature on [specific topic].

What are the disagreements in the literature about [specific topic]?

Analyze the impact of [specific study] on the broader field of [specific topic].

What are the emerging trends in the literature on [specific topic]?

What are the commonly cited sources in the literature on [specific topic]?

Essential ChatGPT Prompts for Literature Review

ChatGPT is a powerful tool for conducting literature reviews, and the key to using it effectively is crafting the right prompts. These prompts guide the AI in scanning, analyzing, and summarizing the literature based on your specific needs.


Initial Exploration of a Topic

When you're just starting a literature review, you'll need prompts that help you gain a broad understanding of your research topic. These prompts can assist in grasping the general landscape of the literature.

Give a brief overview of research done on [topic].

What are the prominent theories or models in [topic]?

What are the key debates or controversies in [topic]?

Who are the key authors or researchers in [topic]?

What methodologies are commonly used in studies on [topic]?

Summarize the historical development of [topic] in academia.

What have been the significant shifts or changes in [topic] over time?

Deep Diving into Specific Studies

Once you have a broad understanding, you'll want to delve into specific studies or papers. These prompts help you extract detailed information from individual sources.

Summarize the main arguments and findings of [specific paper].

What methodology does [specific paper] use and why?

What are the strengths and weaknesses of [specific paper]?

How does [specific paper] contribute to the broader field of [topic]?

Compare the findings of [specific paper] with [another specific paper].

What future research does [specific paper] suggest or inspire?

How has [specific paper] been received or critiqued by other scholars in the field?

Comparing and Contrasting Different Works

ChatGPT can also help you identify similarities and differences between various sources. These prompts are useful for synthesizing information across multiple works.

Compare and contrast the methodologies of [paper 1] and [paper 2].

How do the findings of [paper 1] and [paper 2] agree or disagree?

How does [author 1]'s perspective on [topic] differ from [author 2]'s?

Compare the theoretical frameworks used in [paper 1] and [paper 2].

How does the approach to [topic] change over time as seen in [earlier paper] vs. [later paper]?

What themes are common across these studies on [topic]?

What disparities or contradictions exist in the literature on [topic]?

Summarizing Findings

Finally, as you conclude your literature review, you'll want to summarize your findings. These prompts help you consolidate your insights and present them clearly.

Summarize the key findings from my literature review on [topic].

What trends or patterns have emerged from the literature on [topic]?

What gaps or unanswered questions exist in the literature on [topic]?

How have perspectives on [topic] evolved over time, based on the literature?

What consensus, if any, exists in the literature on [topic]?

How does the literature inform our understanding of [topic]?

What future research directions are suggested by the literature on [topic]?

How to Create Effective ChatGPT Prompts

Crafting effective prompts for ChatGPT is part art, part science. It involves understanding the workings of the AI model, your research objectives, and the specific information you seek from your literature review.

Guidelines for Creating Effective ChatGPT Prompts

Here are some general guidelines to follow when crafting prompts for ChatGPT:

  • Be Specific : The more specific your prompt, the more focused and relevant the AI's response will be.
  • Use Clear Language : Make your prompts as clear and concise as possible. Avoid ambiguous terms and complex phrasing.
  • Guide the AI : To get the best results, guide the AI towards the kind of response you're looking for. For example, if you want a summary, explicitly ask for a summary.
  • Experiment and Iterate : Different prompts can yield different results. Don't be afraid to tweak your prompts and try different approaches to find what works best.

Tips for Refining and Iterating on Prompts

Prompts may not always work perfectly on the first try. Here are some tips for refining and iterating on your prompts:

  • Analyze the Responses : Look at the AI's responses to understand how it is interpreting your prompts. If the responses are not what you expected, tweak your prompt to guide the AI more accurately.
  • Try Different Angles : If a prompt isn't working, try approaching the question from a different angle or rephrasing the prompt.
  • Split Complex Prompts : If a prompt is too complex, try splitting it into multiple simpler prompts.

Examples of Good and Less Effective Prompts

Here are some examples of good and less effective prompts to help guide your prompt crafting:

Good Prompts:

Provide a summary of the main arguments in [specific paper].

What are the primary findings of [specific paper]?

Analyze the methodology used in [specific paper].

What are the implications of [specific paper] for future research?

Compare the findings of [specific paper] with those of [another specific paper].

What critiques have been made about [specific paper]?

What are the key themes across the literature on [specific topic]?

Less Effective Prompts:

  • Tell me about [specific paper]. (This is too vague and doesn't guide the AI towards the specific information you're seeking.)
  • What's interesting about [specific paper]? (This prompt is subjective and may not yield useful information for a literature review.)
  • What's everyone saying about [topic]? (This prompt is overly broad and might result in a wide range of responses that aren't focused enough for a literature review.)

Navigating Challenges in Using ChatGPT for Literature Reviews

While ChatGPT can be a powerful tool for conducting literature reviews, it's important to understand its limitations and potential challenges. Being aware of these can help you navigate the process more effectively.

Understanding the Limitations of ChatGPT

Despite its sophisticated capabilities, ChatGPT is not without its limitations. Here are some to keep in mind:

  • Dependency on Input Quality : The quality of ChatGPT's responses depends on the quality and clarity of the prompts you provide.
  • Lack of Critical Analysis : While ChatGPT can summarize and synthesize information, it can't provide the same level of critical analysis and judgment that a human researcher can.
  • Potential for Misinterpretation : ChatGPT may occasionally misinterpret prompts or generate responses that miss the mark.

Overcoming Challenges in Using ChatGPT for Literature Reviews

Despite these limitations, there are ways to overcome the challenges in using ChatGPT for literature reviews:

  • Carefully Craft Your Prompts : As noted earlier, the quality of your prompts greatly influences the AI's output. Spend time crafting and refining your prompts.
  • Double-Check the Information : Always cross-check the AI's responses against the original sources to ensure accuracy.
  • Combine AI and Human Analysis : Use ChatGPT as a tool to assist with your literature review, not as a replacement for your own analysis.
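
The "double-check the information" tip can be made concrete with a toy script. This sketch is our own illustration, not from the source; the citation keys and the verified list are hypothetical. The point is simply that any reference ChatGPT cites should be matched against sources you actually collected, since language models can fabricate citations.

```python
# Flag any reference ChatGPT cites that is not in your own verified
# source list - a cheap first filter before manual cross-checking.

verified_sources = {
    "smith2020digital",   # hypothetical citation keys from your own library
    "lee2021stress",
}

def flag_unverified(cited_keys):
    """Return citations that don't match any source you actually collected."""
    return [key for key in cited_keys if key not in verified_sources]

cited = ["smith2020digital", "jones2019hoarding"]
print(flag_unverified(cited))  # ['jones2019hoarding'] - verify before trusting
```

A flagged key is not necessarily wrong, but it has not been checked by you, so it should be traced back to a real publication before it enters your review.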

Here are some example prompts that can help you navigate potential challenges:

If the response seems off: Could you provide another summary of [specific paper]?

For double-checking information: What are the key findings of [specific paper]?

To ensure comprehensive coverage: Are there any other significant points in [specific paper] that haven't been mentioned?

If the response is too broad: Could you provide a more focused summary of the methodology in [specific paper]?

If the response is too narrow: What are the broader implications of [specific paper] for the field?

To clarify a complex point: Can you explain the main argument in [specific paper] in simpler terms?

To guide the AI's interpretation: Based on [specific paper], what are the potential areas for future research in this field?

More Great ChatGPT Prompts for Literature Reviews

An engaging literature review requires delving into a range of perspectives, methodologies, and findings within your chosen area of study. Crafting prompts that encourage varied and in-depth analysis can help you gain a comprehensive understanding of your topic. Here are more unique prompt ideas to enrich your literature review process using ChatGPT:

What are the key variables studied in relation to [topic]?

What theoretical frameworks have been applied to the study of [topic]?

Explain the significance of [specific study] in the field of [topic].

What are the ethical considerations discussed in the literature on [topic]?

How does [specific paper] challenge the prevailing understanding of [topic]?

What statistical methods are commonly used in studies on [topic]?

Discuss the influence of [specific author] on the research of [topic].

How have the research methods evolved over time in the study of [topic]?

What are the socio-cultural considerations in the literature on [topic]?

How does [specific paper] address the limitations of previous studies on [topic]?

Describe the study design of [specific paper] in the context of [topic].

What contributions does [specific paper] make to the existing literature on [topic]?

How have policy changes affected the research on [topic] over time?

What is the role of technology in the research of [topic]?

Identify the key longitudinal studies in the field of [topic].

What are the demographic considerations in studies on [topic]?

How does [specific paper] incorporate interdisciplinary approaches to [topic]?

What biases or assumptions are present in the literature on [topic]?

How does [specific paper] use empirical data to support its arguments on [topic]?

Discuss the replication studies in the field of [topic].

What are the implications of the literature on [topic] for practitioners?

How does the literature on [topic] align or contrast with public opinion?

What are the critical reviews or meta-analyses in the field of [topic]?

How does [specific paper] discuss the limitations of its own study on [topic]?

Identify the seminal works in the field of [topic] and explain their significance.

In the rapidly evolving world of AI, tools like ChatGPT have the potential to revolutionize the way we approach tasks such as literature reviews. This powerful AI model, when provided with carefully crafted prompts, can sift through and synthesize vast amounts of information, assisting you in navigating the complex landscape of academic literature.

Remember, the power of ChatGPT lies in its ability to respond to diverse prompts. While this blog post has offered several useful prompts and tips for conducting a literature review, don't hesitate to experiment and innovate with your own prompts. Your creativity and critical thinking, combined with ChatGPT's capabilities, can turn the daunting task of literature review into a more engaging and manageable process.

However, it's important to bear in mind the limitations of AI and to always complement AI-generated insights with your own critical analysis.

As you embark on your journey of exploration and discovery in your chosen field of study, may ChatGPT prove a valuable ally in helping you uncover the rich tapestry of ideas, debates, and breakthroughs that form the essence of academic literature.


About R.J. Adams

R.J. is an artificial intelligence aficionado, committed to demystifying the intricacies of this cutting-edge technology for enthusiasts and novices alike. Leveraging his comprehensive background in internet marketing, online tools, and communications, he continually ventures into the landscape of newly released AI tools, conducting thorough tests and analyses, as well as finding the best prompts to use on the latest AI tools. He can also solve Rubik's Cube in under 2 minutes.


© 2024 AIHabit.net


Literature Reviews


UM's Report on GenerativeAI

The initial report from the U-M Generative Artificial Intelligence Advisory (GAIA) Committee.


U-M GPT

  • User-friendly interface that allows faculty, staff, and students to engage in chat-based queries and benefit from the expertise of generative AI technology
  • U-M GPT enhances teaching, learning, and research in collaboration, providing a valuable and equitable resource for the entire university community
  • Designed to be accessible, including for use with screen readers
  • Provides access to GPT-3.5 and GPT-4 at no charge


https://chat.openai.com

  • The knowledge cutoff for ChatGPT 3.5 is September 2021 and it has no access to the Internet. Academic users may consider alternatives such as Semantic Scholar, Elicit, Consensus, or other AI-powered tools for doing real-time literature reviews.
  • Input/Output length - ChatGPT-3.5 allows a maximum limit of 4096 tokens. According to ChatGPT: "On average, a token in English is roughly equivalent to 4 bytes or characters. English words are typically around 5 characters long. This means that, very roughly, you could fit around 800 to 1000 English words within 4096 tokens."
  • Notes: (1) According to ChatGPT, the generated responses are non-deterministic by default. So if you run the searches again and get slightly or very different results than the ones listed on this guide, it's likely due to this factor. (2) While ChatGPT or Bing Chat searches offer conversational interaction, they might not be a substitute for the rigor and reliability of traditional academic database searches.
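
The token limit above can be budgeted for before you paste a long prompt. The sketch below is a rough approximation using the ~4-characters-per-token rule of thumb quoted above, not OpenAI's actual tokenizer (for exact counts you would use a tokenizer library such as OpenAI's tiktoken); the `reserve` value is an assumption, left as headroom for the model's reply.

```python
# Rough, illustrative check of whether a prompt fits a 4096-token window.

def rough_token_estimate(text: str) -> int:
    """Approximate token count: ~1 token per 4 characters of English text."""
    return max(1, len(text) // 4)

def fits_context(prompt: str, limit: int = 4096, reserve: int = 1024) -> bool:
    """Leave `reserve` tokens of headroom for the model's response."""
    return rough_token_estimate(prompt) <= limit - reserve

prompt = "Summarize the key findings of this paper. " * 50
print(rough_token_estimate(prompt), fits_context(prompt))
```

If a source is too long to fit, summarize it section by section rather than pasting the whole paper at once.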

KEYWORD GENERATION

Write a literature review abstract based on the following keywords: Keywords: Artificial Intelligence, Employment Trends, Skill Development, Future Workforce

Infant death
Neonatal mortality (referring specifically to deaths within the first 28 days of life)
Perinatal mortality (referring to deaths occurring during the period from 22 completed weeks of gestation to 7 completed days after birth)
Early childhood mortality (referring to deaths occurring within the first five years of life)
Child mortality (referring to deaths occurring before the age of 18)

[Press the Regenerate button to get more.]

  • Example - RELATED WORDS: What are the related words of infant mortality?

Neonatal mortality, Post-neonatal mortality, Stillbirths, Low birth weight, Malnutrition, Infectious diseases, Vaccination, Maternal health,  Access to healthcare, Poverty, Social inequality, Sanitation, Hygiene, Water quality, Childbirth complications, Congenital abnormalities, Birth defects, Maternal age, Under-five mortality, Child mortality, Perinatal mortality, Preterm birth, Low birth weight, Sudden infant death syndrome (SIDS), Maternal mortality, Postnatal care, Malnutrition, Immunization, Access to healthcare, Clean water and sanitation, Congenital anomalies, Infant health disparities, Infant mortality rate (IMR), Infant survival.  

(("Infant Death"[Mesh] OR "Sudden Infant Death Syndrome"[Mesh] OR "Infant Mortality"[Mesh] OR "Perinatal Mortality"[Mesh] OR "Neonatal Mortality"[Mesh] OR "Fetal Mortality"[Mesh] OR "Infant Death" OR "Sudden Infant Death Syndrome" OR "Infant Mortality" OR "Perinatal Mortality" OR "Neonatal Mortality" OR "Fetal Mortality") AND ("Risk Factors"[Mesh] OR "Etiology"[Mesh] OR "Prevention and Control"[Mesh] OR "Interventions"[Mesh] OR "Public Health"[Mesh] OR "Health Disparities"[Mesh] OR "Healthcare Disparities"[Mesh] OR "Poverty"[Mesh] OR "Malnutrition"[Mesh] OR "Infection"[Mesh] OR "Congenital Abnormalities"[Mesh] OR "Prematurity"[Mesh] OR "Low Birth Weight"[Mesh] OR "Maternal Education"[Mesh] OR "Maternal Health"[Mesh] OR "Risk Factors" OR "Etiology" OR "Prevention and Control" OR "Interventions" OR "Public Health" OR "Health Disparities" OR "Healthcare Disparities" OR "Poverty" OR "Malnutrition" OR "Infection" OR "Congenital Abnormalities" OR "Prematurity" OR "Low Birth Weight" OR "Maternal Education" OR "Maternal Health"))

Comments and Suggestions:

#1 - This search string may not work on all database platforms. For example, Medline (OVID) may not be able to run it; Medline & CINAHL (EBSCO) can run it without any problems, while PubMed will run it but will indicate which MeSH terms are not indexed. See PubMed's search History and Details under Advanced.

#2 - This is just one possible search string, and you may need to adjust it based on your specific research questions and the databases you are using.

#3 - It might be easier if you break the task into 3 parts: ask the chatbot to give you the synonyms > ask it to give you the thesauri terms > ask it to combine them.
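
Suggestion #3 above (build the search string in parts, then combine) can also be sketched programmatically once the chatbot has given you the term lists. The helper functions and the short term lists below are illustrative only, not a validated search strategy.

```python
# Build a boolean search string in parts: OR together the variants of each
# concept, then AND the concept blocks, mirroring the example string above.

def or_block(terms, mesh=False):
    """Join terms with OR, optionally tagging each as a [Mesh] heading."""
    suffix = "[Mesh]" if mesh else ""
    return "(" + " OR ".join(f'"{t}"{suffix}' for t in terms) + ")"

def combine(*blocks):
    """AND together the concept blocks."""
    return "(" + " AND ".join(blocks) + ")"

outcome = or_block(["Infant Death", "Infant Mortality"], mesh=True)
factors = or_block(["Risk Factors", "Poverty"], mesh=True)
query = combine(outcome, factors)
print(query)
```

As noted in comment #1, the resulting string still needs to be checked against each database's own syntax before use.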

RESEARCH QUESTION

Increased risk of sudden infant death syndrome (SIDS)
Increased risk of premature birth
Increased risk of low birth weight
Increased risk of respiratory problems in newborns
Increased risk of respiratory problems in infants exposed to secondhand smoke
Increased risk of developing asthma and other respiratory illnesses later in life for infants exposed to secondhand smoke

Please generate a literature review that addresses the research question: "What are the effects of social media on the mental health of adolescents?"

[Author's note: If you ask for specific studies, ChatGPT may generate non-existent references or false knowledge. To find out why Large Language Models hallucinate, check out this Wiki article: Hallucination (artificial intelligence), and this blog post: A Gentle Introduction to Hallucinations in Large Language Models by Adrian Tam.]

DATABASE RECOMMENDATIONS  

Health Science - Which databases to use to find articles on the EFFECTS OF MATERNAL SMOKING ON INFANT DEATH?

PubMed: This is a database of biomedical literature from the National Library of Medicine. It includes articles from medical journals and other sources on a wide range of health-related topics.

Embase: This is a database of biomedical literature that covers more international and pharmaceutical content than PubMed. It includes articles from medical journals and other sources on a wide range of health-related topics.

Cochrane Library: This is a collection of databases that provides systematic reviews and meta-analyses of healthcare interventions, including interventions related to maternal smoking and infant death.

Scopus: This is a multidisciplinary database that covers research from a wide range of fields, including health sciences. It includes articles from scientific journals, books, and conference proceedings.

Web of Science: This is a database that covers research from a wide range of fields, including health sciences. It includes articles from scientific journals, books, and conference proceedings.

CINAHL: This is a database of nursing and allied health literature, including articles on maternal smoking and infant death.

[Note: The response above was copied and pasted directly from ChatGPT. No editing was added.]

Education - I am doing a meta-analysis on K-12 STEM education. What databases should I use?

ERIC: This is a database of education research and information, including articles from journals, reports, and conference papers. It is a commonly used database for educational research.

Education Source: This is a comprehensive database of education research and information, including articles from journals, books, and conference papers.

Web of Science: This is a multidisciplinary database that covers research from a wide range of fields, including education. It includes articles from scientific journals, books, and conference proceedings.

Scopus: This is another multidisciplinary database that covers research from a wide range of fields, including education. It includes articles from scientific journals, books, and conference proceedings.

PsycINFO: This is a database of psychology research, including articles from journals, books, and dissertations. It can be useful for research on STEM education that includes psychological factors.

Cochrane Library: This is a collection of databases that provides systematic reviews and meta-analyses of healthcare interventions, including interventions related to education.

Example: ACT-AS-AN-EXPERT PROMPT

Here's an alternative way to consolidate all the questions into a single consultation by using the popular Act-As-An-Expert prompt, aka system message prompt:

I am going to conduct a systematic review on games and mathematics related to K-12 education. I want you to act as a systematic review expert. I will ask you some questions related to my research topic and you will advise me on how to do it.

GREY LITERATURE  

Please show me all the resources that offer grey literature for conducting research in the field of education

  • ERIC (Education Resources Information Center): https://eric.ed.gov/
  • OpenGrey: http://www.opengrey.eu/
  • British Education Index: https://www.ebscohost.com/academic/british-education-index
  • WorldCat: https://www.worldcat.org/
  • ProQuest Dissertations & Theses Global: https://www.proquest.com/products-services/pqdtglobal.html
  • EThOS (Electronic Theses Online Service): https://ethos.bl.uk/
  • Institutional Repositories of universities and academic institutions (search for specific institution repositories)
  • American Educational Research Association (AERA): https://www.aera.net/
  • National Education Association (NEA): https://www.nea.org/
  • Association for Supervision and Curriculum Development (ASCD): http://www.ascd.org/
  • RAND Corporation: https://www.rand.org/
  • National Center for Education Research (NCER): https://ies.ed.gov/ncer/
  • National Center for Education Statistics (NCES): https://nces.ed.gov/
  • European Educational Research Association (EERA): https://www.eera-ecer.de/
  • UNESCO International Institute for Educational Planning (IIEP): http://www.iiep.unesco.org/
  • Organization for Economic Cooperation and Development (OECD): https://www.oecd.org/education/
  • Education Development Center (EDC): https://www.edc.org/
  • National Institute for Educational Policy Research (NIER) - Japan: https://www.nier.go.jp/
  • Australian Council for Educational Research (ACER): https://www.acer.org/
  • Canadian Education Association (CEA): https://www.cea-ace.ca/

My topic is about STEM education. Which grey literature databases should I use?

  • STEM Education Research Centre (SERC) Grey Literature Database
  • National STEM Education Research Repository (NSERR)
  • National Science Digital Library (NSDL)
  • STEM Education Resource Center (SERC) GrayLit Network
  • STEM Education Grey Literature Library


  • Last Updated: Feb 29, 2024 10:31 AM
  • URL: https://guides.lib.umich.edu/litreview

MIM Learnovate

How to Use ChatGPT to Write a Literature Review With Prompts


Are you struggling with writing a literature review for your research paper?

Do you find it challenging to organize your thoughts and present a cohesive and engaging review?

Look no further, as this step-by-step guide will walk you through how to use ChatGPT to write a literature review effortlessly and effectively.

With the assistance of ChatGPT, you can enhance your writing process and create a well-structured literature review that will impress your readers and save your time in research.


Step 1: Familiarize Yourself with ChatGPT

To get started, visit the official website of ChatGPT at chat.openai.com

Take a moment to explore the features and functionalities offered by this powerful writing assistant.

You’ll find a user-friendly interface that allows you to interact with the model and generate high-quality content effortlessly.

Step 2: Understanding Prompts

Prompts act as a guiding framework for ChatGPT , providing it with the necessary context to generate accurate and relevant content.

When writing a literature review, you can leverage prompts to direct the AI’s attention towards specific areas or topics you want to cover.

This ensures that the generated text aligns with your intended purpose.

Step 3: Crafting an Effective Prompt

To write a compelling literature review, it’s crucial to formulate an effective prompt.

A well-crafted prompt should be clear, concise, and include specific instructions to guide the AI model.

Craft a prompt according to the requirements of your study. Provide detailed information about what you want included in your literature review: the variables used in the study, the relationships between them, and the research hypothesis.

Example: Prompt # 1

Let’s consider an example of a prompt for a literature review on “The Impact of Artificial Intelligence in Healthcare”:

Example: Prompt # 2

Write a literature review of a research paper. Topic: Effects of Stress on digital hoarding behavior of an Employee. The mediating role of procrastination and moderating role of work overload. Variables: Stress, digital hoarding, procrastination, and work overload. Independent Variable: Stress. Dependent Variable: digital hoarding. Mediator: procrastination. Moderator: work overload. Provide a separate definition of each variable. After that, explain the relationships between variables. Research Hypothesis: There are 3 hypotheses. H1: Stress has a positive influence on digital hoarding. H2: Procrastination mediates the relationship of stress and digital hoarding. H3: Work overload positively moderates the relationship of stress and procrastination in an employee. Time_Period: 2015-2023. Include the authors and references in the literature review. Also include citations. The references should be provided at the end.

By providing a detailed prompt like the one above, you ensure that ChatGPT understands the specific requirements of your literature review, enabling it to generate relevant and insightful content.
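
If you run many reviews with the same structure, a detailed prompt like Prompt #2 can be assembled from structured study details. The sketch below is our own illustration (the function name and field layout are assumptions, not part of the post):

```python
# Assemble a literature-review prompt from structured study details so the
# same template can be reused across different studies.

def build_lit_review_prompt(topic, variables, hypotheses, time_period):
    lines = [
        f"Write a literature review of a research paper. Topic: {topic}.",
        "Variables: " + "; ".join(f"{role}: {name}" for role, name in variables.items()) + ".",
        "Provide a separate definition of each variable, then explain the relationships between variables.",
        "Research Hypotheses:",
    ]
    lines += [f"H{i}: {h}" for i, h in enumerate(hypotheses, start=1)]
    lines.append(f"Time period: {time_period}. Include authors, citations, and a reference list at the end.")
    return "\n".join(lines)

prompt = build_lit_review_prompt(
    topic="Effects of Stress on digital hoarding behavior of an Employee",
    variables={
        "Independent Variable": "Stress",
        "Dependent Variable": "digital hoarding",
        "Mediator": "procrastination",
        "Moderator": "work overload",
    },
    hypotheses=[
        "Stress has a positive influence on digital hoarding.",
        "Procrastination mediates the relationship of stress and digital hoarding.",
        "Work overload positively moderates the relationship of stress and procrastination.",
    ],
    time_period="2015-2023",
)
print(prompt)
```

Keeping the study details in one place also makes it easy to regenerate the prompt after you refine a hypothesis or change the time window.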


Step 4: Interacting with ChatGPT

Once you have a well-crafted prompt, it’s time to interact with ChatGPT and generate your literature review.

On the website, you’ll find a text input field where you can enter your prompt. Simply copy and paste the prompt from the previous step into the input field and send it.


ChatGPT will then process the prompt and generate a response in real-time. The generated content will be based on the data it has been trained on and the context provided by the prompt.
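If you would rather script this step than paste prompts into the website, the same prompt can be submitted through a chat-style API. The sketch below only builds the JSON request body without sending it; the model name, temperature, and message layout are assumptions, so check your provider’s current API documentation before use.

```python
import json

# Build (but do not send) a chat-completion request body for the prompt.
# Model name and system message are illustrative assumptions.

def build_request(prompt, model="gpt-4o-mini", temperature=0.3):
    """Return a JSON string for a chat-style completion request."""
    payload = {
        "model": model,
        "temperature": temperature,  # a low temperature keeps the review focused
        "messages": [
            {"role": "system",
             "content": "You are an academic writing assistant. Cite sources and list references."},
            {"role": "user", "content": prompt},
        ],
    }
    return json.dumps(payload, indent=2)

body = build_request("Write a literature review on digital hoarding among employees.")
print(body)
```

Sending this body to the provider’s chat endpoint with your API key would return the generated review text, which you can then refine exactly as described below.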

Output provided by ChatGPT:

Literature Review

Title: Effects of Stress on Digital Hoarding Behavior of an Employee: The Mediating Role of Procrastination and Moderating Role of Work Overload

Introduction: The rapid advancement of technology and the increasing reliance on digital platforms have transformed the way individuals manage and organize their information. However, this digital age has also given rise to new behavioral patterns, such as digital hoarding, which refers to the excessive accumulation and difficulty in discarding digital files and data. Digital hoarding can have significant consequences for individuals, particularly in a work setting, where it can lead to decreased productivity, increased stress, and impaired decision-making. This literature review aims to explore the effects of stress on digital hoarding behavior among employees, while considering the mediating role of procrastination and the moderating role of work overload.

Definition of Variables:

  • Stress: Stress is a psychological and physiological response to external pressures or demands, characterized by feelings of tension, anxiety, and unease. In the context of this research, stress refers to the emotional and cognitive strain experienced by employees in their work environment, which may arise from factors such as excessive workload, time pressure, or interpersonal conflicts.
  • Digital Hoarding: Digital hoarding is a behavioral tendency characterized by the excessive accumulation and reluctance to delete or organize digital files, documents, emails, or other forms of electronic data. It represents a disorganized and cluttered digital environment, resulting in difficulties in locating and managing information effectively.
  • Procrastination: Procrastination refers to the act of delaying or postponing tasks or activities that need to be accomplished. In the context of this research, procrastination is viewed as a mediator variable, playing a role in the relationship between stress and digital hoarding. It represents a cognitive and behavioral tendency to engage in non-essential activities or tasks that provide temporary relief from stress, leading to the neglect of important tasks, including digital file management.
  • Work Overload: Work overload refers to a state in which an individual is faced with an excessive amount of work or tasks that exceed their available resources, including time and energy. As a moderating variable in this study, work overload is expected to influence the relationship between stress and procrastination. High levels of work overload may intensify the impact of stress on procrastination tendencies, exacerbating digital hoarding behavior.

Relationships between Variables: The proposed research hypotheses aim to investigate the relationships among the variables in the context of stress, digital hoarding, procrastination, and work overload.

Hypothesis 1 (H1): Stress has a positive influence on digital hoarding. This hypothesis suggests that higher levels of stress experienced by employees will be associated with increased digital hoarding behavior. Previous studies (Smith et al., 2017; Johnson & Brown, 2019) have indicated a positive relationship between stress and hoarding tendencies, suggesting that individuals may resort to hoarding behaviors as a coping mechanism to manage stress.

Hypothesis 2 (H2): Procrastination mediates the relationship between stress and digital hoarding. This hypothesis posits that procrastination acts as a mediator, partially explaining the relationship between stress and digital hoarding behavior. Research by Roberts and Zomer (2016) has highlighted the role of procrastination in information overload and disorganization, which aligns with the concept of digital hoarding.

Hypothesis 3 (H3): Work overload moderates the relationship between stress and procrastination. This hypothesis suggests that work overload enhances the positive relationship between stress and procrastination. The presence of high workloads can intensify the impact of stress on procrastination tendencies, leading to greater digital hoarding behavior. This is supported by studies conducted by Miller and Brown (2018) and Thompson et al. (2021), which have demonstrated the influence of work overload on procrastination behavior.

Conclusion: This literature review has provided an overview of the effects of stress on digital hoarding behavior among employees, considering the mediating role of procrastination and the moderating role of work overload. The identified research hypotheses (H1, H2, H3) will serve as a foundation for further empirical investigation to deepen our understanding of the complex relationships between these variables. By examining the impact of stress, procrastination, and work overload on digital hoarding behavior, organizations can develop strategies to promote healthier digital practices and enhance employee productivity.

References: Johnson, A. M., & Brown, T. A. (2019). Examining the relationship between stress, anxiety, and hoarding: A systematic review and meta-analysis. Journal of Obsessive-Compulsive and Related Disorders, 23, 100468.

Miller, R. B., & Brown, T. A. (2018). The relationship between work overload and mental health symptoms in firefighters: The moderating role of perceived supervisor support. Journal of Workplace Behavioral Health, 33(3), 213-229.

Roberts, L. D., & Zomer, A. C. (2016). The procrastination–information systems quality link: The mediating role of self-efficacy. Computers in Human Behavior, 65, 167-176.

Smith, C. G., et al. (2017). The relationship between hoarding symptomatology, compulsive buying, and obsessive-compulsive disorder. International Journal of Cognitive Therapy, 10(4), 332-348.

Thompson, M. A., et al. (2021). Work overload and turnover intentions: The mediating role of procrastination and the moderating role of neuroticism. Journal of Vocational Behavior, 128, 103603.

Step 5: Refining and Polishing

After ChatGPT generates the initial response, it’s time to refine and polish the content. Review the output carefully, ensuring that it aligns with the objectives of your literature review.

You can ask ChatGPT to provide a more detailed explanation of whichever variable you like.

It’s good to give ChatGPT a detailed prompt: the more detail you provide, the more closely the literature review will match your needs.

Note: ChatGPT generates original phrasing rather than copying text verbatim, but you should still check the output with a plagiarism detector and verify every citation, as generated references can be inaccurate or fabricated.

As an AI-powered writing assistant, ChatGPT provides a starting point for your literature review.

It can help you brainstorm ideas, explore different perspectives, and save time on the initial drafting process. However, it’s essential to remember that human judgment and expertise are still vital in crafting a comprehensive and well-rounded literature review.


Step 6: Iterative Process

Writing a literature review is often an iterative process. You may need to go through multiple rounds of generating and refining the content until you achieve the desired outcome. Take advantage of ChatGPT’s capabilities to experiment with different prompts, explore diverse angles, and refine your review further.

By leveraging the power of ChatGPT and following this step-by-step guide, you can streamline your literature review writing process and enhance the quality of your work.

ChatGPT is a valuable tool for writing literature reviews. By following the steps outlined in this guide, you can effectively utilize ChatGPT to generate high-quality content for your literature review.

Remember, though, that while ChatGPT is a powerful AI model, it’s essential to review and refine the generated content to ensure it aligns with your specific requirements.


Misbah Rashid, an expert in Technology Management, holds an MBA and an MS in Information Systems and Technology Management. She has experience teaching marketing and technology in business at the university level.


Step-by-Step Guide: How to Use ChatGPT to Write a Literature Review With Prompts

Writing a literature review can be a challenging task for researchers and students alike. It requires a comprehensive understanding of the existing body of research on a particular topic. However, with the advent of advanced language models like ChatGPT, the process has become more accessible and efficient.

In this step-by-step guide, ilovephd explores how you can leverage ChatGPT to write a literature review that is both compelling and informative.


A Step-by-Step Guide: How to Use ChatGPT for Writing a Literature Review

Step 1: Defining Your Research Objective

Before diving into the literature review process, it is crucial to define your research objective.

Clearly articulate the topic, research question, or hypothesis you aim to address through your literature review. This step will help you maintain focus and guide your search for relevant sources.

Step 2: Identifying Keywords and Search Terms

To effectively use ChatGPT to assist in your literature review, you need to identify relevant keywords and search terms related to your research topic.

These keywords will help you narrow down your search and gather pertinent information. Consider using tools like Google Keyword Planner or other keyword research tools to discover commonly used terms in your field.

Step 3: Familiarizing Yourself with ChatGPT

Before engaging with ChatGPT, it is essential to understand its capabilities and limitations. Familiarize yourself with the prompts and commands that work best with the model.

Keep in mind that ChatGPT is an AI language model trained on a vast amount of data, so it can provide valuable insights and suggestions, but it’s important to critically evaluate and validate the information it generates.

Step 4: Generating an Initial Literature Review Outline

Start by creating an outline for your literature review. Outline the main sections, such as the introduction, methodology, results, discussion, and conclusion.

Within each section, jot down the key points or subtopics you want to cover. This will help you organize your thoughts and structure your review effectively.

Step 5: Engaging with ChatGPT for Research Assistance

Once you have your outline ready, engage with ChatGPT for research assistance.

Begin by providing a clear and concise prompt that specifies the topic, context, and any specific questions you have. For example, “What are the current trends in [your research topic]?” or “Can you provide an overview of the main theories on [your research question]?”

Step 6: Reviewing and Selecting Generated Content

ChatGPT will generate a response based on your prompt. Carefully review the content generated, considering its relevance, accuracy, and coherence.

Extract key points, relevant references, and insightful arguments from the response and incorporate them into your literature review. Be sure to cite and attribute the sources appropriately.

Step 7: Ensuring Coherence and Flow

While ChatGPT can provide valuable content, it’s important to ensure the coherence and flow of your literature review.

Use your critical thinking skills to connect the generated content with your research objective and existing knowledge. Rearrange, rephrase, and expand upon the generated text to ensure it aligns with the structure and purpose of your review.

Step 8: Editing and Proofreading

Once you have incorporated the generated content into your literature review, thoroughly edit and proofread the document.

Check for grammatical errors, consistency in referencing, and overall clarity. This step is crucial to ensure your literature review is polished and professional.

ChatGPT prompts to Write a Literature Review

Here are some prompts you can use when engaging with ChatGPT for research assistance in writing a literature review:

“Can you provide an overview of the main theories and concepts related to [your research topic]?”

“What are the current trends and developments in [your research field]?”

“Can you suggest some key studies or research papers on [specific aspect of your research topic]?”

“What are the main methodologies used in conducting research on [your research topic]?”

“Can you provide a critical analysis of the existing literature on [your research question]?”

“Are there any gaps or areas of controversy in the literature on [your research topic] that need further exploration?”

“What are the key findings and conclusions from the most recent studies on [your research topic]?”

“Can you suggest some reputable journals or publications to explore for relevant literature in [your research field]?”

“What are the different perspectives or schools of thought in the literature on [your research topic]?”

“Can you provide a summary of the historical background and evolution of research on [your research topic]?”

Remember to provide clear and specific instructions in your prompts to guide ChatGPT in generating relevant and accurate content for your literature review.
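The bracketed placeholders in the prompts above lend themselves to simple template substitution, so one topic string can drive every prompt in the list. A minimal sketch (the template subset and function name are illustrative):

```python
# Fill the bracketed placeholders in the prompt templates with a concrete
# topic, producing ready-to-paste prompts in one pass.

TEMPLATES = [
    "Can you provide an overview of the main theories and concepts related to {topic}?",
    "What are the current trends and developments in {topic}?",
    "Are there any gaps or areas of controversy in the literature on {topic} that need further exploration?",
]

def fill_prompts(topic, templates=TEMPLATES):
    """Substitute the topic into each template string."""
    return [t.format(topic=topic) for t in templates]

for p in fill_prompts("digital hoarding in the workplace"):
    print(p)
```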

Using ChatGPT to write a literature review can greatly facilitate the research process. By following a step-by-step approach, researchers can effectively leverage ChatGPT’s capabilities to gather insights, generate content, and enhance the quality of their literature review. However, it is important to approach the generated content critically, validate it with reliable sources, and ensure coherence within the review.


ChatGPT in higher education - a synthesis of the literature and a future research agenda

  • Open access
  • Published: 02 May 2024


  • Pritpal Singh Bhullar 1 ,
  • Mahesh Joshi 2 &
  • Ritesh Chugh   ORCID: orcid.org/0000-0003-0061-7206 3  


ChatGPT has emerged as a significant subject of research and exploration, casting a critical spotlight on teaching and learning practices in the higher education domain. This study examines the most influential articles, leading journals, and productive countries concerning citations and publications related to ChatGPT in higher education, while also shedding light on emerging thematic and geographic clusters within research on ChatGPT’s role and challenges in teaching and learning at higher education institutions. Forty-seven research papers from the Scopus database were shortlisted for bibliometric analysis. The findings indicate that the use of ChatGPT in higher education, particularly issues of academic integrity and research, has been studied extensively by scholars in the United States, who have produced the largest volume of publications, alongside the highest number of citations. This study uncovers four distinct thematic clusters (academic integrity, learning environment, student engagement, and scholarly research) and highlights the predominant areas of focus in research related to ChatGPT in higher education, including student examinations, academic integrity, student learning, and field-specific research, through a country-based bibliographic analysis. Plagiarism is a significant concern in the use of ChatGPT, which may reduce students’ ability to produce imaginative, inventive, and original material. This study offers valuable insights into the current state of ChatGPT in higher education literature, providing essential guidance for scholars, researchers, and policymakers.


1 Introduction

ChatGPT, or Chat Generative Pre-trained Transformer, is a popular generative Artificial Intelligence (AI) chatbot developed by OpenAI, employing natural language processing to deliver interactive human-like conversational experiences (Jeon et al., 2023 ; Angelis et al., 2023 ). ChatGPT utilises a pre-trained language learning model, derived from an extensive big-data corpus, to predict outcomes based on a given prompt (Crawford et al., 2023 ; Geerling et al., 2023 ; Li et al., 2023 ). Since its inception, ChatGPT has attracted widespread attention and popularity and has the potential to disrupt the education sector (Rana, 2023 ). According to a research survey of adults conducted by the Pew Research Centre, approximately 60% of adults in the United States and 78% of adults in Asia possess knowledge of ChatGPT; furthermore, men are more familiar with ChatGPT than women (Vogels, 2023 ). The study also found that among ethnic groups globally, individuals of Asian descent have the highest level of familiarity with AI-based large language models (LLMs).

People have found value in using ChatGPT for a wide range of purposes, including generating creative content, answering questions, providing explanations, offering suggestions, and even having casual conversations (Crawford et al., 2023 ; Thorp, 2023 ; Wu et al., 2023 ). Furthermore, ChatGPT is an effective digital assistant for facilitating a thorough understanding of diverse and intricate subjects using simple and accessible language. Given these features, ChatGPT has the potential to bring about a paradigm shift in traditional methods of delivering instruction and revolutionise the future of education (Tlili et al., 2023 ). ChatGPT stands out as a promising tool for open education, enhancing the independence and autonomy of autodidactic learners through personalised support, guidance, and feedback, potentially fostering increased motivation and engagement (Firat, 2023 ). Its capabilities encompass facilitating complex learning, asynchronous communication, feedback provision, and cognitive offloading (Memarian & Doleck, 2023 ).

However, the rapid expansion of ChatGPT has also aroused apprehensions in the academic world, particularly after reports surfaced that the New York Department of Education had unexpectedly imposed a ban on access to the tool due to concerns about academic integrity violations (Sun et al., 2023 ; Neumann et al., 2023 ; Crawford et al., 2023 ). Students who use ChatGPT to produce superior written assignments may have an unfair advantage over peers who lack access (Farrokhnia et al., 2023 ; Cotton et al., 2023 ). Ethical concerns about the deployment of LLMs include the potential for bias, effects on employment, misuse and unethical deployment, and loss of integrity. However, there has been little research on the potential dangers that a sophisticated chatbot such as ChatGPT poses in the realm of higher education, particularly through the lens of a systematic literature review and bibliometric techniques.

In this light, this paper explores the literature on the application of ChatGPT in higher education institutions and the obstacles encountered in various disciplines from the perspectives of both faculty and students. The paper aims to analyse the current state of the field by addressing the following overarching research questions using bibliographic coupling, co-occurrence analysis, citation analysis, and co-authorship analysis:

What are the most influential articles in terms of citations in research related to ChatGPT in education?

What are the top journals and countries in terms of publication productivity related to the implications of ChatGPT in higher education institutions?

What are the emerging thematic clusters in research on the role and challenges of ChatGPT in teaching and learning in higher education institutions?

What are the geographic clusters in research on the role and challenges of ChatGPT in teaching and learning in higher education institutions?

2 Methodology

In conducting this study, publications on the impact of ChatGPT on various aspects of higher education institutions were systematically identified through an extensive search using Elsevier’s Scopus database, a comprehensive repository hosting over 20,000 globally ranked, peer-reviewed journals (Mishra et al., 2017 ; Palomo et al., 2017 ; Vijaya & Mathur, 2023 ). Scopus is a widely used database for bibliometric analyses and is considered one of the “largest curated databases covering scientific journals” (pg. 5116) in different subject areas (Singh et al., 2021 ). Widely acclaimed for its comprehensive coverage, Scopus has been extensively employed in bibliometric analyses across diverse disciplines, as evidenced by studies in capital structure theories, business research, entrepreneurial orientation and blockchain security (Bajaj et al., 2020 ; Donthu et al., 2020 ; Gupta et al., 2021 ; Patrício & Ferreira, 2020 ). Notably, despite the “extremely high” correlation between the Web of Science and Scopus databases, Scopus’s status as a superior and versatile data source for literature extraction is reinforced by its broader coverage of subject areas and categories compared to the narrower journal scope of Web of Science, facilitating scholars in locating literature most pertinent to the review area (Archambault et al., 2009 ; Paul et al., 2021 ). To ensure a systematic literature review, we adhered to the preferred reporting items for systematic reviews and meta-analysis (PRISMA) guidelines (Page et al., 2021 ) for the search, identification, selection, reading, and data extraction from the articles retrieved through the Scopus database (Fig.  1 ). Reliance on a single database is acceptable within the PRISMA framework (Moher et al., 2009 ).

Employing Boolean-assisted search queries, we aimed to capture a comprehensive range of topics related to ChatGPT’s impact on higher education institutions. Specific search queries were carefully selected to ensure a broad yet relevant search scope and included the following:

“ChatGPT and Teaching learning in universities” OR “Effect of ChatGPT in higher education institution” OR “ChatGPT and student assessment in higher education” OR “ChatGPT and academic integrity” OR “ChatGPT and teaching pedagogy in higher education institution” OR “ChatGPT and cheating student course assignment” OR “ChatGPT and teaching in higher education” OR “Implications of ChatGPT in higher education institutions” OR “ChatGPT and evaluation criteria in higher education institution” OR “ChatGPT in universities” OR “ChatGPT and student learnings. ”

The study includes papers published and included in the Scopus database on or before May 26, 2023 on the theme of ChatGPT and higher education. This timeframe was chosen to encompass the most recent and relevant literature available up to the point of data retrieval. Papers identified through the search queries underwent inclusion or exclusion based on predetermined criteria. Specifically, only papers published in journals were considered for this study, as these undergo a peer-review process and are subject to stringent selection criteria set by the journals, ensuring their quality and reliability. Papers in conference proceedings were excluded from the start of the search. Only papers written in English were included to maintain consistency and clarity, whereas others were excluded. Of the 48 research papers that were initially identified, 47 were ultimately selected for the bibliometric analysis, which was conducted using VOSviewer, a bibliometric analysis tool.

figure 1

PRISMA Flowchart

From the identified pool of 47 articles, the analysis uncovered a nuanced distribution of research methodologies. Specifically, 11 studies were grounded in quantitative research methodologies, underscoring a quantitative focus within the literature. In contrast, a substantial majority of 31 articles embraced a qualitative framework, showcasing a diverse spectrum that included pure qualitative research, editorials, letters to the editor, and opinion pieces. Furthermore, the review brought to light four literature reviews, signifying a synthesis of existing knowledge, and identified one study that strategically employed a mixed-methods approach, blending both qualitative and quantitative research techniques.

To address the research questions, the selected publications underwent analysis using various bibliometric techniques. For the first and second research questions, citation analysis was employed. For the third and fourth research questions, bibliographic analysis was performed in VOSviewer software to generate clusters.

3 Findings and discussion

3.1 Publication trend

Information from the Scopus database indicates that academics began focusing on investigating various aspects of ChatGPT’s potential in higher education in 2022, as they published their findings in 2023. All academic articles in reputable publications in the Scopus database were published in 2023.

3.2 Citation analysis

Table  1 presents the top ten articles according to the number of citations. The number of articles increased significantly in 2023, consistent with the emerging nature and growing relevance of the topic. Exploring the ramifications of ChatGPT in higher education is a recent focal point for scholars, with numerous aspects warranting deeper investigation. The limited citation count, as anticipated, underscores that publications from 2023 are in the early stages of gaining visibility and recognition within the academic community.

The article by Thorp ( 2023 ), entitled “ChatGPT is fun, but not an author”, has received the highest number of citations (79). Thorp stresses the risks associated with implementing ChatGPT in the classroom. Although ChatGPT is an innovative AI tool, significant barriers remain to its implementation in the field of education. According to Thorp, using ChatGPT in academic writing is still inefficient. Thorp also expresses concerns about the rising prevalence of ChatGPT in the fabrication of scientific publications. The second most-cited work, “How Does ChatGPT Perform on the United States Medical Licensing Examination?” by Gilson and colleagues, has received 27 citations. Gilson et al. ( 2023 ) evaluated the accuracy, speed and clarity of ChatGPT’s responses to questions on the United States Medical Licensing Examination’s Step 1 and Step 2 tests. The text responses generated by ChatGPT were evaluated using three qualitative metrics: the logical justification of the chosen answer, the inclusion of information relevant to the question, and the inclusion of information extraneous to the question. The model attained a level of proficiency comparable to that of a third-year medical student. The study demonstrates the potential utility of ChatGPT as an interactive educational resource in the field of medicine to facilitate the acquisition of knowledge and skills. Third is Kasneci et al.’s article “ChatGPT for good? On opportunities and challenges of large language models for education”, with 13 citations. This paper examines the benefits and drawbacks of using language models in the classroom from the perspectives of both teachers and students. The authors find that these comprehensive language models can serve as a supplement rather than a replacement for classroom instruction. Each of the remaining top-ten articles mentioned the impact of ChatGPT on academic integrity in education and had received fewer than ten citations at the time of analysis.

Table 2 presents the top 10 journals in terms of the number of citations of publications related to the topic of ChatGPT in higher education. The journal Science , which published “ChatGPT is fun, but not an author,” was deemed most influential because it received the highest number of citations (79). JMIR Medical Education has published two articles that have been cited by 30 other research articles on the same topic. Journal of University Teaching and Learning Practice has published the most articles: three. Innovations in Education and Teaching International has published two articles on this topic, which together have been cited by six articles.

As shown in Table 3, the majority of research articles pertaining to ChatGPT and higher education have originated from countries in Asia: six of the top 10 countries for publishing articles on this topic are located on the Asian continent. However, the most influential studies in terms of citations have been produced by the United States, Germany, Australia, and the United Kingdom, with individual counts of 36, 17, 7, and 7 citations, respectively (67 in total). These four countries account for roughly 90% of the citations received by the top 10 most productive countries in the field of research on higher education perspectives on ChatGPT.

3.3 Bibliographic coupling

3.3.1 Thematic clusters

Four thematic clusters (TCs) were identified from the included research articles, as shown in Table 4. VOSviewer was used to perform clustering based on bibliographic coupling, a method that identifies relations between documents by examining publications that cite the same sources (Boyack & Klavans, 2010). VOSviewer clusters articles with a common knowledge base, assigning each publication to exactly one cluster. To implement this clustering technique, we assessed the co-occurrence of bibliographic references among articles within our dataset: two articles were considered to co-occur when they shared references, indicating a thematic connection (Boyack & Klavans, 2010), and the frequency of shared references quantified the strength of that relationship. Combining VOSviewer clustering with this co-occurrence analysis, we identified and categorised thematic clusters within our dataset. This method typically yields a distribution with a limited number of larger clusters and a more substantial number of smaller ones.
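The core of bibliographic coupling is a simple computation: the coupling strength of two articles is the number of cited references they share. The sketch below is illustrative only (article IDs and references are invented, and VOSviewer's own implementation adds normalisation and a clustering algorithm on top of this weighted network); it shows the pairwise count that underlies the method:

```python
from itertools import combinations

def coupling_strength(articles):
    """Count shared references for every pair of articles.

    `articles` maps an article ID to the set of references it cites.
    Two articles are bibliographically coupled when the count is > 0.
    """
    strengths = {}
    for (a, refs_a), (b, refs_b) in combinations(articles.items(), 2):
        shared = len(refs_a & refs_b)  # size of the reference overlap
        if shared:
            strengths[(a, b)] = shared
    return strengths

# Toy dataset: three articles and the sources each one cites.
corpus = {
    "A1": {"r1", "r2", "r3"},
    "A2": {"r2", "r3", "r4"},
    "A3": {"r5"},
}
print(coupling_strength(corpus))  # → {('A1', 'A2'): 2}
```

VOSviewer then partitions the resulting weighted network so that each article is assigned to exactly one cluster, as described above.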

The clusters were derived through an analysis of subordinate articles extracted from the Scopus database. VOSviewer systematically organised similar articles into distinct clusters based on shared patterns of bibliographic references (Van Eck & Waltman, 2010). To ensure methodological transparency and robustness, we established clear criteria and parameters for clustering. Specifically, keywords with a minimum frequency (n = 5) were included in the analysis, and co-occurrence was calculated using a pairwise comparison method. This systematic approach ensured a meaningful representation of thematic relationships within the dataset, guided by insights from previous literature (Jarneving, 2007). Using cluster analysis techniques, the articles were organised into cohesive groups characterised by their degree of thematic homogeneity, guided by the nature of the research findings; this ensured a robust representation of the underlying thematic structure (Jarneving, 2007).

Furthermore, to mitigate the risk of subjective bias in thematic categorisation, a counter-coding approach was employed: a second researcher independently categorised the thematic clusters identified by VOSviewer, and inter-rater agreement was assessed using Cohen’s kappa coefficient to ensure the reliability and validity of the thematic classification process. The resulting coefficient (0.69) indicated substantial agreement, well beyond what would be expected by chance alone (Gisev et al., 2013). Finally, the nomenclature assigned to each cluster was based on the predominant research theme emerging from the analysis, providing a concise and informative label for each group.
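Cohen's kappa compares the observed agreement between two coders with the agreement expected by chance, via κ = (p_o − p_e)/(1 − p_e). The sketch below uses hypothetical cluster assignments for ten articles (illustrative data, not the study's actual codings, which yielded κ = 0.69):

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater1)
    # Observed agreement: fraction of items both raters labelled identically.
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Expected chance agreement, from each rater's marginal label frequencies.
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum((c1[lab] / n) * (c2[lab] / n) for lab in set(c1) | set(c2))
    return (p_o - p_e) / (1 - p_e)

# Hypothetical cluster codings of 10 articles by two independent researchers.
coder1 = ["TC1", "TC1", "TC2", "TC2", "TC3", "TC3", "TC4", "TC4", "TC1", "TC2"]
coder2 = ["TC1", "TC1", "TC2", "TC3", "TC3", "TC3", "TC4", "TC4", "TC2", "TC2"]
print(round(cohens_kappa(coder1, coder2), 2))  # → 0.73
```

Values between 0.61 and 0.80 are conventionally read as "substantial" agreement, which is the interpretation applied to the study's coefficient of 0.69.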

TC1: ChatGPT and Academic Integrity: Cotton et al. (2023) describe ChatGPT as a double-edged sword that potentially threatens academic integrity. AI essay-writing systems are programmed to churn out essays based on specific guidelines or prompts, and it can be difficult to distinguish between human- and machine-generated writing. Thus, students could potentially use these systems to cheat by submitting essays that are not their original work (Dehouche, 2021). Kasneci et al. (2023) argue that effective pedagogical practices must be developed in order to implement large language models in classrooms. These practices require not only a deep understanding of the technology but also an appreciation of its constraints and the vulnerability of complex systems in general. In addition, educational institutions need a clearly articulated plan for the successful integration and optimal use of large language models in educational contexts and teaching curricula, and students need to be taught to verify information through a teaching strategy that emphasises critical thinking. Possible bias in the generated output, the need for continuous human supervision, and the likelihood of unforeseen effects are just a few of the challenges that accompany the use of AI systems. Continuous monitoring and transparency are necessary to ensure academic integrity while using ChatGPT. Lim et al. (2023) report that ChatGPT poses academic integrity challenges for the faculty of higher education institutions, who must verify whether academic work (assignments, research reports, etc.) submitted by students derives from fresh analysis or has been plagiarised and recycled (copied and pasted) from ChatGPT output. ChatGPT may threaten student learning and classroom engagement if students have access to information and course assignments without assessing their integrity.
Perkins (2023) also expresses concerns regarding academic integrity in the use of ChatGPT: students are utilising ChatGPT to complete their course assignments without attribution rather than producing original work. Higher education institutions must establish clear boundaries regarding academic integrity and plagiarism in light of the growing utilisation of AI tools in academic and research settings. The challenges posed by AI essay-writing systems like ChatGPT therefore necessitate a multifaceted approach to safeguarding academic integrity. Educational institutions should invest in comprehensive educational programs that not only teach students the ethical use of technology but also incorporate rigorous assessments of critical thinking skills. Additionally, integrating AI literacy into the curriculum, with a focus on understanding the limitations and potential biases of large language models, can empower students to discern between human- and machine-generated content.

TC2: ChatGPT and Learning Environment: According to Crawford et al. ( 2023 ), increased stress levels and peer pressure among university students have created a favourable environment for the use of AI tools. ChatGPT provides enhanced educational opportunities for college-level students. It can help students identify areas they may have overlooked, offer guidance on additional reading materials, and enhance existing peer and teacher connections. In addition, ChatGPT can propose alternative methods of evaluating students beyond conventional assignments. Crawford et al. ( 2023 ) recommend providing practical assignments incorporating ChatGPT as a supplementary tool to reduce plagiarism. Su ( 2023 ) documents that ChatGPT can provide students with a personalised learning experience based on their specific needs. In addition, the ChatGPT platform can be used to create a virtual coaching system that offers prompt feedback to educators during their classroom evaluations. This approach fosters critical thinking and supports early childhood educators in refining their teaching methodologies to optimise interactive learning outcomes for students. Tang ( 2023b ) proposes that bolstering research integrity can be achieved by imposing restrictions on the utilisation of NLP-generated content in research papers. Additionally, the author advocates for transparency from researchers, emphasising the importance of explicitly stating the proportion of NLP-generated content incorporated in their papers. This recommendation prompts a critical examination of the role of AI-generated content in scholarly work, emphasising the importance of nurturing independent research and writing skills for both students and researchers.

TC3: ChatGPT and Student Engagement: Lee ( 2023 ) examines the ability of ChatGPT to provide an interactive learning experience and boost student engagement beyond textbook pedagogy. Iskender ( 2023 ) explains that ChatGPT provides a mechanism for students to generate and investigate diverse concepts expeditiously, thereby helping them engage in imaginative and evaluative thinking on specific subject matter. This approach has the potential to optimise time management for students and allow them to concentrate on more advanced cognitive activities. AI tools such as ChatGPT can potentially enhance the personalisation of learning materials by providing visual aids and summaries that can aid the learning process and significantly improve students’ competencies. Hence, leveraging ChatGPT in education can revolutionise learning by facilitating interactive experiences, nurturing imaginative thinking, and optimising time management for students.

TC4: ChatGPT and Scholarly Research: Ivanov and Soliman ( 2023 ) and Yan ( 2023 ) focus on the practical applications and implications of LLMs like ChatGPT in educational settings and scholarly research within the context of language learning, writing, and tourism. Yan’s investigation into ChatGPT’s application in second-language writing examines its effectiveness in addressing specific writing tasks at the undergraduate level. The findings underscore the nuanced balance between the strengths of ChatGPT and the inherent limitations in handling demanding academic writing tasks. Nevertheless, ChatGPT is also labelled as an ‘all-in-one’ solution for scholarly research and writing (Yan, 2023 ). In parallel, Ivanov and Soliman ( 2023 ) highlight that ChatGPT can assist scholars in the field of tourism research by composing preliminary literature reviews, substantiating their chosen methodologies, and creating visual aids such as tables and charts. Furthermore, the researchers outline that ChatGPT could provide valuable methodological ideas and insights by helping researchers generate questions and corresponding scales for inclusion in questionnaires. Hence, ChatGPT has the potential to become a valuable ally as a facilitator in academic writing processes and has the potential to transform the research workflow.

3.3.2 Geographic clusters

The results of the country-based bibliographic analysis are summarised in Table  5 . The present study utilised the prevailing research theme in the existing literature as a framework for categorising the countries into four distinct clusters on the basis of the number of documents published from different countries.

Cluster 1: Implications of ChatGPT for Student Examinations and Education : Cluster 1 is composed of five countries: Germany, Ireland, South Korea, Taiwan, and the United States. Researchers in these countries have emphasised the potential role of ChatGPT in higher education within the context of AI language models. Eleven research articles related to this theme were published by researchers based in the United States, the most in this cluster; the top three articles in Table 1 are from the United States. The study entitled “Opportunities and Challenges of Large Language Models for Education” was authored by German researchers (Kasneci et al., 2023) and has been widely cited in the academic community (13 citations). The remaining studies were conducted by researchers from South Korea and Taiwan and focused on the impact of ChatGPT on the education sector and its associated opportunities and challenges. This cluster demonstrates that students could benefit greatly from using ChatGPT for various academic tasks, such as reviewing and revising their work, verifying the accuracy of homework answers, and improving the quality of their essays. It has also helped postgraduates whose first language is not English improve their writing, as ChatGPT can be instructed to rewrite a paragraph in a scholarly tone. The outcomes have demonstrated significant efficacy, alleviating the cognitive load associated with translation for these students and enabling them to concentrate on the substance of their writing rather than the intricacies of composing in an unfamiliar language. To harness these benefits, future research could focus on developing targeted training programs for students and educators that emphasise the effective use of ChatGPT to enhance not only academic tasks but also language proficiency for non-native English speakers, addressing both cognitive load and language intricacies.

Cluster 2: ChatGPT and Academic Integrity : Cluster 2 comprises research studies conducted by authors from Japan, Bangladesh, Hong Kong, Nigeria, Pakistan, UAE, the UK, Vietnam and the Netherlands. The most influential study in this cluster, “Unlocking the power of ChatGPT: A framework for applying Generative AI in education”, was authored by researchers from Hong Kong (Su & Yang, 2023 ). They document that ChatGPT can be used to respond to student inquiries, reducing the time and effort required of educators and allowing them to focus their resources on other activities, such as scholarly investigations. Farrokhnia et al. ( 2023 ) and Yeadon et al. ( 2023 ) state that ChatGPT can write scientific abstracts with fabricated data and essays that can evade detection by reviewers. According to Liebrenz et al. ( 2023 ), ChatGPT tends to produce erroneous and incoherent responses, thereby raising the potential for disseminating inaccurate information in scholarly literature. The higher-order cognitive abilities of ChatGPT are relatively low, especially in areas related to creativity, critical thinking, reasoning, and problem-solving. ChatGPT could reduce students’ motivation to explore topics independently, draw their own conclusions, and solve problems independently (Kasneci et al., 2023 ). Ibrahim et al. ( 2023 ) find that ChatGPT can engage students in their academic pursuits. ChatGPT can enhance the writing abilities of non-native English speakers to allow them to concentrate on higher-order cognitive processes. This technological development allows faculty members to allocate more attention to conceptualisation and writing rather than focusing on the mechanics of grammar and spelling. However, there is a debate among intellectuals regarding the implications of AI for content creation, with some asserting that it detracts from innovative content development. 
It is increasingly acknowledged that ChatGPT threatens academic honesty by facilitating essay plagiarism; in the absence of appropriate citations, such textual content may also violate copyright regulations. Cotton et al. (2023) express concerns about the potential impact of ChatGPT on academic integrity and plagiarism. Their work corroborates Dehouche’s (2021) assertion that students may use ChatGPT to engage in academic dishonesty by submitting essays that are not their original work. According to Cotton et al. (2023), ChatGPT users have a competitive advantage over non-users and can achieve higher grades on their coursework assignments by utilising the AI-based language tool. They classify ChatGPT as a versatile instrument with the potential to threaten academic integrity, noting that AI essay-writing systems are specifically programmed to generate content based on specific parameters or prompts, thereby challenging the discernment between human-authored and machine-generated content. Distinguishing between academic work produced by students and content generated by ChatGPT when evaluating assignments is a significant challenge for faculty. It is recommended that academic staff continually monitor student assignments for academic misconduct, coupled with transparent communication about the potential risks associated with AI-generated content.

Cluster 3: ChatGPT and Students’ Learning : Cluster 3 comprises Malaysia, China and Australia. This cluster mainly includes studies of the role of AI-based models in student learning. Researchers from Australia (Crawford et al., 2023; Lim et al., 2023; Lawrie, 2023; Li et al., 2023; Seth et al., 2023; Cingillioglu, 2023; Skavronskaya, 2023; Johinke, 2023) have contributed the most studies (8) to this cluster, emphasising the role of AI in student learning across various disciplines. One of the most influential papers, “Generative AI and the future of education: Ragnarök or reformation? A paradoxical perspective from management educators”, was authored by researchers from both Australia and Malaysia (Lim et al., 2023) and reflects on the role of AI in classroom learning and teaching. Rather than banning AI tools, the authors advocate for the productive use of these tools in classrooms to facilitate more engaging student learning. Another Australian study, “Leadership is needed for ethical ChatGPT: Character, assessment, and learning using artificial intelligence (AI)” (Crawford et al., 2023), highlights AI as an alternative path of learning for students. ChatGPT can promptly evaluate students’ assignments and help them identify areas of weakness, and educators can offer innovative assessments to their students instead of adhering solely to conventional ones. ChatGPT can augment pedagogical approaches, evaluation structures, and the overall educational environment by reinforcing the trilateral association among instructors, learners, and technology. Its implementation can provide students with a personalised and interactive learning and research experience facilitated by virtual tutors and customised recommendations. In light of the research in this cluster, the integration of ChatGPT into education should inspire a paradigm shift towards a more dynamic and personalised learning environment.
Institutions can explore strategic partnerships with AI researchers to develop context-specific applications of ChatGPT that cater to diverse educational needs, promoting a symbiotic relationship between human instructors, students, and technology for an enriched learning experience.

Cluster 4: ChatGPT and Field-specific Research : This cluster includes research by authors from India, Oman, Bulgaria and New Zealand that has emphasised the potential role of ChatGPT in the medical and tourism industries. Authors from India explored the role of ChatGPT in the medical field (Seetharaman, 2023; Subramani et al., 2023). Seetharaman (2023) reports that ChatGPT offers supplementary language assistance to students who are not proficient in English, enabling them to enhance their language proficiency and communicate effectively in English, the principal language of instruction in medical establishments. The ChatGPT platform can also serve as a tool for medical students to replicate patient interactions in a simulated environment, such as accurately obtaining medical histories and documenting symptoms. According to Subramani et al. (2023), ChatGPT is a highly efficient and user-friendly AI technology that can aid healthcare professionals in various aspects of their work, such as diagnosis, critical decision-making, and devising appropriate treatment plans. ChatGPT has demonstrated impressive performance on medical exams, indicating its potential as a valuable resource for enhancing medical education and assessment (Subramani et al., 2023), and can support interdisciplinarity in tourism research (Nautiyal et al., 2023). Ivanov and Soliman (2023) note the potential of ChatGPT to serve as a digital instructor that provides students with enhanced and effective learning experiences and outcomes. Digital instructors can impart knowledge in diverse languages and can thus be used to educate individuals of varying nationalities and backgrounds in the field of tourism. Furthermore, LLM-based chatbots, including ChatGPT, can assess written assignments and provide direction on linguistic proficiency, syntax, and composition, ultimately enhancing students’ scholarly writing proficiency.
In exploring the intersection of ChatGPT with medical education, institutions can pioneer innovative approaches by using the platform to create immersive, simulated patient interactions that go beyond language assistance, allowing medical students to practice nuanced skills such as medical history gathering and symptom documentation. Simultaneously, leveraging ChatGPT as a versatile digital instructor offers a unique opportunity to provide cross-cultural and multilingual education, contributing to a more inclusive and globally competent workforce within the tourism industry.

3.4 Challenges of ChatGPT in higher education

In addition to some previously mentioned challenges, such as the potential for plagiarism, the investigation identified other key challenges in implementing ChatGPT within higher education’s teaching and learning environment. Wu and Yu (2023) found that the benefits of AI-based ChatGPT are greater in higher education than in primary and secondary education. Their study also reported that the novelty effects of AI chatbots may enhance learning outcomes in brief interventions, but their efficacy diminishes in longer interventions.

First, the implementation of ChatGPT within the educational context engenders learning impediments. In the absence of adequate monitoring and regulation, the technology could foster unlearning and erode human intelligence, although teachers may become more adaptive and create authentic assessments to enhance student learning (Alafnan et al., 2023; Lawrie, 2023). Second, the technology could be used in a manner that violates students’ privacy. If the model is not adequately secured, it could surreptitiously gather confidential data from students without their explicit awareness or authorisation (Kasneci et al., 2023). Third, the technology could facilitate discrimination against particular students. If the model is not trained on a dataset that accurately represents the entire student population, it has the potential to create disparities in educational access (Cingillioglu, 2023; Lin et al., 2023). Fourth, according to Ivanov and Soliman (2023), ChatGPT lacks access to real-time data; therefore, its responses may be irrelevant, inaccurate, or outdated, and the information provided in response to a specific query may be insufficient. Gao et al. (2022) highlight the need for further investigation of the precision and scholarly authenticity of ChatGPT. Fifth, it may be difficult for ChatGPT to comprehend the context and subtleties of complex academic subjects and to answer complex questions (Adetayo, 2023; Eysenbach, 2023; Neumann et al., 2023). The system can misinterpret inquiries, offer inadequate or inaccurate responses, or struggle to comprehend the fundamental purpose behind questions (Clark, 2023). In particular, ChatGPT may not have the requisite expertise in highly specialised or advanced subjects such as advanced mathematics or specific sciences and hence may not deliver precise and accurate answers (Neumann et al., 2023; Fergus et al., 2023).
Karaali (2023) claims that the primary emphasis in the field of AI is currently directed towards enhancing the advanced cognitive abilities and mental processes associated with quantitative literacy and quantitative reasoning. However, fundamental skills such as writing, critical thinking, and numeracy continue to serve as essential foundational components for students, and even as AI makes significant progress, students appear to be experiencing a decline in performance in these fundamental skills. Consequently, NLP-based adaptive learner support and education require further investigation (Bauer et al., 2023).

In addressing the challenges of ChatGPT in education, educators need to adapt and develop authentic assessments that mitigate the risk of human unlearning, ensuring that technology enhances, rather than hinders, student learning experiences. Simultaneously, recognising the limitations of ChatGPT in comprehending the nuances of highly specialised subjects underscores the importance of balancing advancements in AI’s cognitive abilities with continued emphasis on fundamental skills like critical thinking, writing, and numeracy, urging a reevaluation of priorities in AI-driven educational research towards comprehensive learner support.

4 Conclusion, implications and agenda for future research

This study identified the most influential articles and top journals and countries in terms of citations and publication productivity related to ChatGPT in higher education, as well as highlighted emerging thematic clusters and geographic clusters in research on the role and challenges of ChatGPT in teaching and learning in higher education institutions. Articles on the topic of ChatGPT in higher education published up to May 2023 were identified by searching the Scopus database. Given the emergent nature of ChatGPT starting in late 2022, all the included articles were published in 2023. Thus, this specific research domain remains relatively unexplored. The findings of this analysis reveal that the United States is the most productive country in terms of research on the role of ChatGPT in higher education, especially relating to academic integrity and research. US researchers also emerged as the most influential in terms of number of citations in the literature. Our findings corroborate those of previous research (Crompton & Burke, 2023 ). However, 60% of the articles in our shortlisted literature emanated from Asian countries.

Four thematic clusters (academic integrity, student engagement, learning environment and research) were identified. Furthermore, the country-based bibliographic analysis revealed that research has focused on student examinations, academic integrity, student learning and field-specific research in medical and tourism education (Nautiyal et al., 2023; Subramani et al., 2023). Plagiarism is recognised as a major challenge that hinders students’ creativity, innovativeness and originality when they use ChatGPT in their academic pursuits. To mitigate the potential drawbacks of using ChatGPT in educational and research settings, proactive measures should be taken to educate students and researchers alike on the nature of plagiarism, its negative impacts and academic integrity (Shoufan, 2023; Teixeira, 2023). Educators may ask students to provide a written acknowledgement of the authenticity of their assignments and their non-reliance on ChatGPT. Such an acknowledgement would discourage students from utilising ChatGPT in their academic and research endeavours and establish accountability for their academic pursuits. In addition, educators should develop authentic assessments that are ChatGPT-proof.

ChatGPT lacks emotional intelligence and empathy, both of which are crucial in effectively addressing the emotional and psychological dimensions of the learning process (Farrokhnia et al., 2023 ; Neumann et al., 2023 ). Higher education institutions may encounter challenges in using ChatGPT to deliver suitable assistance, comprehension, or direction to students needing emotional or mental health support. The significance of human interaction in learning cannot be overstated. Achieving a balance between using AI and the advantages of human guidance and mentorship is a persistent challenge that requires attention (Neumann et al., 2023 ; Rahman et al., 2023 ). Strzelecki ( 2023 ) observed in his research that behavioural intention and personal innovativeness are the two major determinants behind the adoption of ChatGPT among students.

4.1 Implications

The findings of the present study have numerous important implications. This study provides insight into the current state of ChatGPT in higher education and thus can serve as valuable guidance for academics, practitioners, and policymakers. The study’s findings contribute to the literature by providing new insights into the role of ChatGPT and strategies for mitigating its negative aspects and emphasising its positive attributes.

First, the implementation of AI in education can improve academic performance and student motivation, particularly by facilitating personalised learning. Educational institutions should monitor and regulate students’ use of such technologies proactively. Higher education institutions also ought to prioritise the training of their educators in effectively utilising AI technologies, including ChatGPT. Concurrently, it is imperative for these institutions to equip students with comprehensive academic integrity training, shedding light on the appropriate and inappropriate applications of AI tools like ChatGPT. This includes creating awareness about the potential consequences of utilising these technologies for dishonest practices. Furthermore, educational establishments need to urgently revisit and refine their academic integrity policies to address the evolving landscape shaped by the integration of artificial intelligence tools in various academic facets. This proactive approach will foster a learning environment that embraces technological advancements and upholds the principles of honesty and responsible use. Institutional regulations on accountability and transparency should guide the frameworks that govern the use of AI in the campus environment (Pechenkina, 2023 ; Sun & Hoelscher, 2023 ; Dencik & Sanchez-Monedero, 2022 ).

Second, faculty members must proactively replace traditional coursework with modern alternatives that foster elevated levels of critical thinking among students, as suggested by Zhai ( 2022 ). Educators and learners can augment the academic material produced by ChatGPT with their own insights and information obtained from credible scholarly resources (Emenike & Emenike, 2023 ).

Third, ChatGPT should be considered not a threat to the education sector but a supplementary tool for human instruction that can enhance teaching and learning. It is imperative to acknowledge that the vital role of human educators cannot be replaced (Karaali, 2023). Moreover, ChatGPT can potentially enhance the accessibility and inclusivity of higher education: alternative formats, linguistic support, and individualised explanations can help students who are studying English as a second language, are not native English speakers, or have other unique learning needs. Furthermore, Alnaqbi and Fouda (2023) highlight the implications of AI for evaluating the teaching styles of faculty in higher education by collecting student feedback through social media and ChatGPT.

Fourth, the faculty in higher education institutions could address ethical concerns by providing students with explicit and comprehensive guidelines about the prescribed structure of academic assignments (Cotton et al., 2023 ; Gardner & Giordano, 2023 ). This practice can facilitate the production of more cohesive assignments. In addition, teachers can use rubrics to assess assignments and blend automated and manual assessment methodologies to evaluate students’ comprehension of the subject matter (Cotton et al., 2023 ; Shoufan, 2023 ).

In summary, using ChatGPT is recommended for enhancing creativity, refining writing proficiency, and improving research abilities. Nonetheless, it is crucial to emphasise that ChatGPT should not be employed as a substitute for critical thinking and producing original work. While it serves as a valuable tool for augmentation, upholding the integrity of independent thought and authentic content creation in academic endeavours is essential.

4.2 Limitations

The present study acknowledges several limitations. Firstly, the reliance on Scopus as the primary data source for bibliometric analysis may have limitations in capturing the full landscape of relevant literature. Future research may consider incorporating additional databases like Web of Science to ensure a comprehensive assessment. Secondly, due to the English language restriction in the review, potentially relevant studies may have been omitted. Future research could enhance inclusivity by extending its scope to encompass papers written in languages other than English. Thirdly, the current study exclusively focused on journal articles. Expanding the scope to include diverse sources, such as conference proceedings or book chapters, could offer a more comprehensive overview.

Additionally, as this is a rapidly evolving field, literature published after our inclusion dates could not be captured, and future studies should consider adjusting their inclusion criteria to accommodate the dynamic nature of the subject matter. Lastly, the specificity of the bibliometric data search, centred around terms like ChatGPT, AI, higher education, and academic integrity, may have excluded certain relevant articles. Future studies should consider employing more generalised search parameters that encompass synonyms of these terms.

4.3 Future scope

The findings of the study suggest new avenues for future research. The effectiveness of evaluation criteria for assessments incorporating ChatGPT-generated text needs to be investigated. Specifically, the appropriate level of ChatGPT-produced text that students may use in academic tasks or assessments has not been established. Research on the ethical implications of using AI tools such as ChatGPT in higher education is also needed. Issues pertaining to data confidentiality, bias, and transparency in algorithms used for decision-making remain to be addressed. Feasible approaches for mitigating the excessive reliance of scholars and learners on ChatGPT or similar AI models are needed. Researchers could also explore the implementation of verification processes that go beyond traditional plagiarism detection methods, accounting for the unique challenges posed by AI systems. Future research in this domain could focus on establishing guidelines and best practices for the integration of AI tools like ChatGPT in academic settings, ensuring a balance between technological innovation and the preservation of academic rigour. Finally, the literature on ChatGPT in higher education has largely focused on the medical and tourism sectors. Future researchers must explore applications of ChatGPT in other disciplines.

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author upon reasonable request.

Adetayo, A. J. (2023). ChatGPT and librarians for reference consultations. Internet Reference Services Quarterly , 27 (3), 131–147.

AlAfnan, M. A., Dishari, S., Jovic, M., & Lomidze, K. (2023). ChatGPT as an educational tool: Opportunities, challenges, and recommendations for communication, business writing, and composition courses. Journal of Artificial Intelligence and Technology , 3 (2), 60–68. https://doi.org/10.37965/jait.2023.0184 .

Alnaqbi, N. M., & Fouda, W. (2023). Exploring the role of ChatGPT and social media in enhancing student evaluation of teaching style in higher education using Neutrosophic sets. International Journal of Neutrosophic Science , 20 (4), 181–190.

Angelis, L. D., Baglivo, F., Arzilli, G., Privitera, G. P., Ferragina, P., Tozzi, A. E., & Rizzo, C. (2023). ChatGPT and the rise of large language models: The new AI-driven infodemic threat in public health. Frontiers in Public Health , 11 , 1–8.

Archambault, E., Campbell, D., Gingras, Y., & Larivière, V. (2009). Comparing bibliometric statistics obtained from the web of Science and Scopus. Journal of the American Society for Information Science and Technology , 60 (7), 1320–1326.

Bajaj, Y., Kashiramka, S., & Singh, S. (2020). Application of capital structure theories: A systematic review. Journal of Advances in Management Research , 18 (2), 173–199. https://doi.org/10.1108/JAMR-01-2020-001 .

Bauer, E., Greisel, M., Kuznetsov, I., Berndt, M., Kollar, I., Dresel, M., Fischer, M. R., & Fischer, F. (2023). Using natural language processing to support peer-feedback in the age of artificial intelligence: A cross-disciplinary framework and a research agenda. British Journal of Educational Technology , 54 (5), 1222–1245.

Boyack, K. W., & Klavans, R. (2010). Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately? Journal of the American Society for Information Science and Technology , 61 (12), 2389–2404.

Cingillioglu, I. (2023). Detecting AI-generated essays: The ChatGPT challenge. International Journal of Information and Learning Technology , 40 (3), 259–268. https://doi.org/10.1108/IJILT-03-2023-0043 .

Clark, T. M. (2023). Investigating the Use of an Artificial Intelligence Chatbot with General Chemistry exam questions. Journal of Chemical Education , 100 (5), 1905–1916.

Cotton, D. R. E., Cotton, P. A., & Shipway, J. R. (2023). Chatting and cheating: Ensuring academic integrity in the era of ChatGPT. Innovations in Education and Teaching International , 1–13. https://doi.org/10.1080/14703297.2023.2190148 .

Crawford, J., Cowling, M., & Allen, K. (2023). Leadership is needed for ethical ChatGPT: Character, assessment, and learning using artificial intelligence (AI). Journal of University Teaching & Learning Practice , 20 (3), 1–21. https://doi.org/10.53761/1.20.3.02 .

Crompton, H., & Burke, D. (2023). Artificial intelligence and higher education: The state of the field. International Journal of Educational Technology in Higher Education , 20 , 1–22. https://doi.org/10.1186/s41239-023-00392-8 .

Dehouche, N. (2021). Plagiarism in the age of massive generative pre-trained transformers (GPT-3). Ethics in Science and Environmental Politics , 2 , 17–23. https://doi.org/10.3354/esep00195 .

Dencik, L., & Sanchez-Monedero, J. (2022). Data justice. Internet Policy Review , 11 (1), 1–16. https://doi.org/10.14763/2022.1.1615 .

Donthu, N., Kumar, S., & Pattnaik, D. (2020). Forty-five years of Journal of Business Research : A bibliometric analysis. Journal of Business Research , 109 , 1–14. https://doi.org/10.1016/j.jbusres.2019.10.039 .

Emenike, M., & Emenike, B. (2023). Was this title generated by ChatGPT? Considerations for artificial intelligence text-generation software programs for chemists and chemistry educators. Journal of Chemical Education , 100 (4), 1413–1418. https://doi.org/10.1021/acs.jchemed.3c00063 .

Eysenbach, G. (2023). The role of ChatGPT, generative language models, and artificial intelligence in medical education: A conversation with ChatGPT and a call for papers. JMIR Medical Education, 9, e46885. https://doi.org/10.2196/46885.

Farrokhnia, M., Banihashem, K. S., Noroozi, O., & Wals, A. (2023). A SWOT analysis of ChatGPT: Implications for educational practice and research. Innovations in Education and Teaching International , 1–15. https://doi.org/10.1080/14703297.2023.2195846 .

Fergus, S., Botha, M., & Ostovar, M. (2023). Evaluating academic answers generated using ChatGPT. Journal of Chemical Education , 100 (4), 1672–1675. https://doi.org/10.1021/acs.jchemed.3c00087 .

Firat, M. (2023). How Chat GPT can transform autodidactic experiences and open education. OSF Preprints, 1–6. https://doi.org/10.31219/osf.io/9ge8m.

Gardner, D. E., & Giordano, A. E. (2023). The challenges and value of undergraduate oral exams in the physical chemistry classrooms: A useful tool in the assessment toolbox. Journal of Chemical Education , 100 (5), 1705–1709.

Geerling, W., Mateer, G. D., Wooten, J., & Damodaran, N. (2023). ChatGPT has aced the test of understanding in college economics: Now what? The American Economist , 68 (2), 233–245.

Gilson, A., Safranek, C. W., Huang, T., Socrates, V., Chi, L., Taylor, R. A., & Chartash, D. (2023). How does ChatGPT perform on the United States medical licensing examination? The implications of large language models for medical education and knowledge assessment. JMIR Medical Education , 9 , 1–9. https://doi.org/10.2196/45312 .

Gisev, N., Bell, J. S., & Chen, T. F. (2013). Interrater agreement and interrater reliability: Key concepts, approaches, and applications. Research in Social and Administrative Pharmacy , 9 (3), 330–338.

Gupta, R., Pandey, R., & Sebastian, V. J. (2021). International entrepreneurial orientation (IEO): A bibliometric overview of scholarly research. Journal of Business Research , 125 , 74–88. https://doi.org/10.1016/j.jbusres.2020.12.00 .

Ibrahim, H., Asim, R., & Zaffar, F. (2023). Rethinking homework in the age of artificial intelligence. IEEE Intelligent Systems, 38(2), 24–27.

Iskender, A. (2023). Holy or Unholy? Interview with Open AI’s ChatGPT. European Journal of Tourism Research , 34 , 3414. https://doi.org/10.54055/ejtr.v34i.3169 .

Ivanov, S., & Soliman, M. (2023). Game of algorithms: ChatGPT implications for the future of tourism education and research. Journal of Tourism Futures , 9 (2), 214–221.

Jarneving, B. (2007). Bibliographic coupling and its application to research-front and other core documents. Journal of Informetrics , 1 (4), 287–307.

Jeon, J., Lee, S., & Cho, S. (2023). A systematic review of research on speech-recognition chatbots for language learning: Implications for future directions in the era of large language models. Interactive Learning Environment , 1–19. https://doi.org/10.1080/10494820.2023.2204343 .

Johinke, R., Cummings, R., & Di Lauro, F. (2023). Reclaiming the technology of higher education for teaching digital writing in a post-pandemic world. Journal of University Teaching and Learning Practice , 20 (2), 1–18. https://doi.org/10.53761/1.20.02.01 .

Karaali, G. (2023). Artificial Intelligence, Basic skills, and quantitative literacy. Numeracy , 16 (1), 1–22. https://doi.org/10.5038/1936-4660.16.1.1438 .

Kasneci, E., Seßler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F., & Kasneci, G. (2023). ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103, 102274. https://doi.org/10.1016/j.lindif.2023.102274.

Lawrie, G. (2023). Establishing a delicate balance in the relationship between artificial intelligence and authenticate assessment in student learning. Chemistry Education Research and Practice , 24 (2), 392–393.

Lee, H. (2023). The rise of ChatGPT: Exploring its potential in medical education. Anatomical Science Education , 1–6. https://doi.org/10.1002/ase.2270 .

Li, Y., Sha, L., Yan, L., Lin, J., Raković, M., Galbraith, K., Lyons, K., Gašević, D., & Chen, G. (2023). Can large language models write reflectively? Computers and Education: Artificial Intelligence, 4, 100140.

Liebrenz, M., Schleifer, R., Buadze, A., Bhugra, D., & Smith, A. (2023). Generating scholarly content with ChatGPT: Ethical challenges for medical publishing. Lancet Digital Health , 5 (3), e105–106. https://doi.org/10.1016/S2589-7500(23)00019-5 .

Lim, W. M., Gunasekara, A., Pallant, J. L., Pallant, J. I., & Pechenkina, E. (2023). Generative AI and the future of education: Ragnarök or reformation? A paradoxical perspective from management educators. The International Journal of Management Education , 21 (2), 1–13.

Lin, C. C., Huang, A. Y. Q., & Yang, S. J. H. (2023). A review of AI-driven conversational chatbots implementation methodologies and challenges (1999–2022). Sustainability, 15(5), 4012. https://doi.org/10.3390/su15054012.

Memarian, B., & Doleck, T. (2023). ChatGPT in education: Methods, potentials and limitations. Computers in Human Behavior: Artificial Humans , 100022 , 1–11.

Mishra, D., Gunasekaran, A., Papadopoulos, T., & Hazen, B. (2017). Green supply chain performance measures: A review and bibliometric analysis. Sustainable Production and Consumption , 10 , 85–99. https://doi.org/10.1016/j.spc.2017.01.003

Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & The PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. Annals of Internal Medicine , 151 (4), 264–269.

Nautiyal, R., Albrecht, J. N., & Nautiyal, A. (2023). ChatGPT and tourism academia. Annals of Tourism Research , 99 , 1–3.

Neumann, M., Rauschenberger, M., & Schön, E.-M. (2023). In 2023 IEEE/ACM 5th International Workshop on Software Engineering Education for the Next Generation (SEENG), Melbourne, Australia (pp. 29–32). https://doi.org/10.1109/SEENG59157.2023.00010.

Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., … & Moher, D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ , 372 (71), 1–9. https://doi.org/10.1136/bmj.n71

Palomo, J., Figueroa-Domecq, C., & Laguna, P. (2017). Women, peace and security state-of-art: A bibliometric analysis in social sciences based on SCOPUS database. Scientometrics , 113 (1), 123–148. https://doi.org/10.1007/s11192-017-2484-x .

Patrício, L. D., & Ferreira, J. J. (2020). Blockchain security research: Theorizing through bibliographic-coupling analysis. Journal of Advances in Management Research , 18 (1), 1–35. https://doi.org/10.1108/JAMR-04-2020-0051 .

Paul, J., Lim, W. M., O’Cass, A., Hao, A. W., & Bresciani, S. (2021). Scientific procedures and rationales for systematic literature reviews (SPAR-4-SLR). International Journal of Consumer Studies . https://doi.org/10.1111/ijcs.12695 .

Pechenkina, K. (2023). Artificial intelligence for good? Challenges and possibilities of AI in higher education from a data justice perspective. In L. Czerniewicz, & C. Cronin (Eds.), Higher education for good: Teaching and learning futures (#HE4Good) . Open Book.

Perkins, M. (2023). Academic integrity considerations of AI large Language models in the post-pandemic era: ChatGPT and beyond. Journal of University Teaching & Learning Practice , 20 (2), 1–26. https://doi.org/10.53761/1.20.02.07 .

Rahman, M. M., & Watanobe, Y. (2023). ChatGPT for education and research: Opportunities, threats, and strategies. Applied Sciences, 13(9), 1–21. https://doi.org/10.3390/app13095783.

Rana, S. (2023). AI and GPT for Management scholars and practitioners: Guidelines and implications. FIIB Business Review , 12 (1), 7–9.

Seetharaman, R. (2023). Revolutionising medical education: Can ChatGPT boost subjective learning and expression? Journal of Medical System , 47 (61), 1–4. https://doi.org/10.1007/s10916-023-01957-w .

Seth, I., Bulloch, G., & Lee, C. H. A. (2023). Redefining academic integrity, authorship, and innovation: The impact of ChatGPT on surgical research. Annals of Surgical Oncology , 30 , 5284–5285. https://doi.org/10.1245/s10434-023-13642-w .

Shoufan, A. (2023). Exploring students’ perceptions of ChatGPT: Thematic analysis and follow-up survey. IEEE Access, 11, 38805–38818. https://doi.org/10.1109/ACCESS.2023.3268224.

Singh, V. K., Singh, P., Karmakar, M., Leta, J., & Mayr, P. (2021). The journal coverage of web of Science, Scopus and dimensions: A comparative analysis. Scientometrics , 126 , 5113–5142.

Skavronskaya, L., Hadinejad, A., & Cotterell, D. (2023). Reversing the threat of artificial intelligence to opportunity: A discussion of ChatGPT in tourism education. Journal of Teaching in Travel & Tourism , 23 , 1–6. https://doi.org/10.1080/15313220.2023.2196658 .

Strzelecki, A. (2023). To use or not to use ChatGPT in higher education? A study of students’ acceptance and use of technology. Interactive Learning Environments , 1–15. https://doi.org/10.1080/10494820.2023.2209881 .

Su, J., & Yang, W. (2023). Unlocking the power of ChatGPT: A framework for applying generative AI in education. ECNU Review of Education , 6 (3), 355–366.

Subramani, M., Jallel, I., & Mohan, S. K. (2023). Evaluating the performance of ChatGPT in medical physiology university examination of phase I MBBS. Advances in Physical Education , 47 , 270–271.

Sun, G. H., & Hoelscher, S. H. (2023). The ChatGPT Storm and what Faculty can do. Nurse Educator , 48 , 119–124.

Tang, G. (2023a). Academic journals cannot simply require authors to declare that they used ChatGPT. Irish Journal of Medical Science , 192 (6), 3195–3196.

Tang, G. (2023b). Letter to editor: Academic journals should clarify the proportion of NLP-generated content in papers. Accountability in Research , 1–3. https://doi.org/10.1080/08989621.2023.2180359 .

Teixeira da Silva, J. A. (2023). How are authors’ contributions verified in the ICMJE model? Plant Cell Reports, 42, 1529–1530. https://doi.org/10.1007/s00299-023-03022-9.

Thorp, H. H. (2023). ChatGPT is fun, but not an author. Science , 379 (6630), 313–313.

Tlili, A., Shehata, B., Adarkwah, M., Bozkurt, A., Hickey, D. T., Huang, R., & Agyemang, B. (2023). What if the devil is my guardian angel: ChatGPT as a case study of using chatbots in education. Smart Learning Environments , 10 (15), 1–24.

Van Eck, N., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics , 84 (2), 523–538.

Vijaya, V., & Mathur, H. P. (2023). A decade of donation-based crowdfunding: A bibliometric analysis using the SCOPUS Database. Purushartha , 15 (2), 32–51.

Vogels, E. A. (2023). A majority of Americans have heard of ChatGPT, but few have tried it themselves, https://www.pewresearch.org/short-reads/2023/05/24/a-majority-of-americans-have-heard-of-chatgpt-but-few-have-tried-it-themselves/ .

Wu, R., & Yu, Z. (2023). Do AI Chatbots improve students learning outcome? Evidence from a meta analysis. British Journal of Educational Technology . https://doi.org/10.1111/bjet.13334 .

Wu, T. Y., He, S. Z., Liu, J. P., Sun, S. Q., Liu, K., Han, Q. L., & Tang, Y. (2023). A brief overview of ChatGPT: The history, status quo and potential future development. IEEE/CAA Journal of Automatica Sinica , 10 (5), 1122–1136. https://doi.org/10.1109/JAS.2023.123618 .

Yan, D. (2023). Impact of ChatGPT on learners in a L2 writing practicum: An exploratory investigation. Education and Information Technologies , 1–25. https://doi.org/10.1007/s10639-023-11742-4 .

Yeadon, W., Inyang, O.-O., Mizouri, A., Peach, A., & Testrow, C. P. (2023). The death of the short-term physics essay in the coming AI revolution. Physics Education, 58, 1–13.

Zhai, X. (2022). ChatGPT user experience: Implications for education. SSRN , 4312418 , 1–18.

Open Access funding enabled and organized by CAUL and its Member Institutions

Author information

Authors and affiliations

University Business School, Maharaja Ranjit Singh Punjab Technical University, Punjab, India

Pritpal Singh Bhullar

Department of Financial Planning and Tax, School of Accounting, Information Systems and Supply Chain, RMIT University, Melbourne, Australia

Mahesh Joshi

School of Engineering and Technology, CML‑NET & CREATE Research Centres, Central Queensland University, Queensland, Australia

Ritesh Chugh

Corresponding author

Correspondence to Ritesh Chugh .

Ethics declarations

Conflict of interest.

No Conflict of Interest.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Bhullar, P.S., Joshi, M. & Chugh, R. ChatGPT in higher education - a synthesis of the literature and a future research agenda. Educ Inf Technol (2024). https://doi.org/10.1007/s10639-024-12723-x

Received : 04 October 2023

Accepted : 16 April 2024

Published : 02 May 2024

DOI : https://doi.org/10.1007/s10639-024-12723-x


  • Artificial Intelligence
  • Generative AI
  • Higher education
  • Academic integrity
  • Systematic review
  • Bibliometric analysis

Computer Science > Information Retrieval

Title: Can ChatGPT write a good Boolean query for systematic review literature search?

Abstract: Systematic reviews are comprehensive reviews of the literature for a highly focused research question. These reviews are often treated as the highest form of evidence in evidence-based medicine, and are the key strategy to answer research questions in the medical field. To create a high-quality systematic review, complex Boolean queries are often constructed to retrieve studies for the review topic. However, it often takes a long time for systematic review researchers to construct a high quality systematic review Boolean query, and often the resulting queries are far from effective. Poor queries may lead to biased or invalid reviews, because they missed to retrieve key evidence, or to extensive increase in review costs, because they retrieved too many irrelevant studies. Recent advances in Transformer-based generative models have shown great potential to effectively follow instructions from users and generate answers based on the instructions being made. In this paper, we investigate the effectiveness of the latest of such models, ChatGPT, in generating effective Boolean queries for systematic review literature search. Through a number of extensive experiments on standard test collections for the task, we find that ChatGPT is capable of generating queries that lead to high search precision, although trading-off this for recall. Overall, our study demonstrates the potential of ChatGPT in generating effective Boolean queries for systematic review literature search. The ability of ChatGPT to follow complex instructions and generate queries with high precision makes it a valuable tool for researchers conducting systematic reviews, particularly for rapid reviews where time is a constraint and often trading-off higher precision for lower recall is acceptable.
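As a rough illustration of the workflow the abstract describes, one might prompt a chat model for a Boolean query and then sanity-check the returned string before running it against a database. A minimal Python sketch — the prompt wording and the balance check are illustrative assumptions, not the paper's actual experimental setup:

```python
# Sketch: asking a chat model to draft a PubMed-style Boolean query for a
# systematic review topic. The messages below use the common chat-API format
# (system + user roles); send them with your API client of choice.

def build_query_prompt(topic: str) -> list:
    """Build chat messages requesting a Boolean query for a review topic."""
    return [
        {"role": "system",
         "content": ("You are an information specialist who constructs "
                     "Boolean queries for systematic review literature search.")},
        {"role": "user",
         "content": (f"Write a single PubMed Boolean query (AND/OR/NOT with "
                     f"parentheses) for this review topic: {topic}")},
    ]

def is_balanced(query: str) -> bool:
    """Cheap sanity check on a returned query: parentheses must balance."""
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

print(is_balanced('("heart failure") AND ("telemonitoring" OR "remote monitoring")'))  # → True
print(is_balanced('("heart failure" AND telemonitoring'))  # → False
```

Checks like this matter because, as the abstract notes, generated queries can look plausible while being malformed or ineffective; a balance check catches only the crudest failures, so retrieved results still need human review.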


East Carolina University Libraries


Literature Review: Using ChatGPT


What is AI, ChatGPT, and large-language modeling?

Artificial intelligence (AI) , in the context of this guide, is the use of datasets to train computer models to emulate human thought processes (IBM, n.d.). It is the emulation itself that may convince information consumers of the supposed "intelligence" of a machine; the more convincing the emulation, the more users are prone to trust the generated information.

ChatGPT is a generative large language model which has rapidly gained popularity since its public release in late 2022. The name stands for Chat Generative Pre-trained Transformer. GPT models are programmed to mimic the look and feel of engaging in a realistic conversation with another human being; however, bias and inaccuracy remain a considerable risk (Kocoń et al., 2023).

Large language models (LLMs) are AI models that utilize networked datasets, emulating the neural pathways of the human brain, which assists in the appearance of actual cognition (Birhane et al., 2023).

Birhane, A., Kasirzadeh, A., Leslie, D., & Wachter, S. (2023). Science in the age of large language models. Nature Reviews Physics . https://www.nature.com/articles/s42254-023-00581-4

IBM. (n.d.). What is artificial intelligence (AI)? Retrieved 27 April 2023, from https://www.ibm.com/topics/artificial-intelligence

Read more here: Generative AI in the Classroom & Research

What can I use ChatGPT for?

First and most importantly: check with your instructor that using generative AI is allowed for your class! This information should be added in your course syllabus with language indicating whether and to what extent AI can be used on assignments. If you are not sure, ask your instructor first .

From my biased perspective (that of a librarian), here is a brief list of ways you could use ChatGPT (if allowed for your class). This list is not exhaustive of all ideas.

  • Keyword generation. Having trouble thinking of other ways to describe a concept? Ask ChatGPT to help!
  • Narrowing/broadening your topic. Not getting any results? Getting way too many results? Use ChatGPT to help you brainstorm ways you can adjust your research question so that you can find the most effective sources.
  • Setting the right tone. Want to create a patient education document in plain language but not quite sure how to translate the nursing jargon into easy-to-understand statements? ChatGPT can help you think of ways to ensure your tone matches your preferred audience.
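For courses where scripting is allowed, the keyword-generation idea above can even be automated. A minimal Python sketch — the prompt wording and the sample reply are illustrative assumptions, not part of this guide:

```python
# Sketch: asking a chat model for alternative search terms, then splitting
# its reply into a clean keyword list for use in a database search.

def build_keyword_prompt(concept: str) -> str:
    """Ask a chat model for synonyms/related search terms for a concept."""
    return (f"List 8 alternative search terms for the concept "
            f"'{concept}', separated by semicolons, no extra text.")

def parse_keywords(reply: str) -> list:
    """Split a semicolon-separated reply into non-empty, trimmed keywords."""
    return [term.strip() for term in reply.split(";") if term.strip()]

# A plausible model reply, hand-written here for illustration:
reply = "heart attack; myocardial infarction; MI; acute coronary syndrome"
print(parse_keywords(reply))
# → ['heart attack', 'myocardial infarction', 'MI', 'acute coronary syndrome']
```

As with any generated keywords, verify each term against the database's own subject headings before relying on it.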

What shouldn't I use ChatGPT for?

Because of bias and limitations from human contributions, it is imperative to approach generative AI with caution.

If your instructor has approved the use of ChatGPT or other generative AI in the class, below is a brief list of ways you should NOT use it. Again, this list is not exhaustive.

  • Writing your assignments. This may be considered a form of plagiarism as the generated language is not your own original writing. If you use ChatGPT in this way without credit, you may be found in violation of the University's Academic Integrity policies. If you aren't sure, check with your instructor.
  • Searching for sources. Do not use ChatGPT in lieu of conducting a literature search through the library resources. ChatGPT has a bad habit of "hallucinating" results; in other words, it will generate answers that sound right but that are not actual sources that exist. ChatGPT is not a search engine or a database.
  • Creating accurate citations. ChatGPT is about as good as any other citation generator out there, which is to say that it is hit or miss. ChatGPT may leave off required elements of a citation, invent elements (for instance, generating incorrect DOIs or URLs), and fail to follow citation style formatting requirements. If you use ChatGPT for citation generation, be sure to double- and triple-check every citation thoroughly before submitting your assignment.


Should I cite ChatGPT?

Citing ChatGPT and other generative AI will allow your work to remain transparent while also demonstrating ethical considerations. If generative AI is used in research, best practice would dictate to mention the use of this tool in the narrative.

Suggestions for citations may be found below:

  • American Psychological Association (APA)
  • Chicago Manual of Style
  • Modern Language Association (MLA)
  • Last Updated: Jan 12, 2024 1:21 PM
  • URL: https://libguides.ecu.edu/lit_reviews

  • Open access
  • Published: 01 May 2024

A critical assessment of using ChatGPT for extracting structured data from clinical notes

  • Jingwei Huang   ORCID: orcid.org/0000-0003-2155-6107 1 ,
  • Donghan M. Yang 1 ,
  • Ruichen Rong 1 ,
  • Kuroush Nezafati   ORCID: orcid.org/0000-0002-6785-7362 1 ,
  • Colin Treager 1 ,
  • Zhikai Chi   ORCID: orcid.org/0000-0002-3601-3351 2 ,
  • Shidan Wang   ORCID: orcid.org/0000-0002-0001-3261 1 ,
  • Xian Cheng 1 ,
  • Yujia Guo 1 ,
  • Laura J. Klesse 3 ,
  • Guanghua Xiao 1 ,
  • Eric D. Peterson 4 ,
  • Xiaowei Zhan 1 &
  • Yang Xie   ORCID: orcid.org/0000-0001-9456-1762 1  

npj Digital Medicine volume  7 , Article number:  106 ( 2024 ) Cite this article


  • Non-small-cell lung cancer

Existing natural language processing (NLP) methods to convert free-text clinical notes into structured data often require problem-specific annotations and model training. This study aims to evaluate ChatGPT’s capacity to extract information from free-text medical notes efficiently and comprehensively. We developed a large language model (LLM)-based workflow, utilizing systems engineering methodology and a spiral “prompt engineering” process, leveraging OpenAI’s API for batch querying ChatGPT. We evaluated the effectiveness of this method using a dataset of more than 1000 lung cancer pathology reports and a dataset of 191 pediatric osteosarcoma pathology reports, comparing the ChatGPT-3.5 (gpt-3.5-turbo-16k) outputs with expert-curated structured data. ChatGPT-3.5 demonstrated the ability to extract pathological classifications with an overall accuracy of 89% in the lung cancer dataset, outperforming two traditional NLP methods. The performance is influenced by the design of the instructive prompt. Our case analysis shows that most misclassifications were due to the lack of highly specialized pathology terminology and erroneous interpretation of TNM staging rules. Reproducibility analysis shows the relatively stable performance of ChatGPT-3.5 over time. In the pediatric osteosarcoma dataset, ChatGPT-3.5 accurately classified both grades and margin status, with accuracies of 98.6% and 100%, respectively. Our study shows the feasibility of using ChatGPT to process large volumes of clinical notes for structured information extraction without requiring extensive task-specific human annotation and model training. The results underscore the potential role of LLMs in transforming unstructured healthcare data into structured formats, thereby supporting research and aiding clinical decision-making.
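The batch-extraction workflow the abstract describes can be sketched roughly as follows; the field names, prompt wording, and sample reply here are illustrative assumptions rather than the study's actual prompts or data:

```python
import json

# Sketch: prompt a chat model to return structured fields as JSON for each
# pathology report, then parse and normalize the reply. A real pipeline would
# loop over reports and send each prompt through the chat API.

FIELDS = ["histologic_type", "t_stage", "n_stage", "m_stage"]

def build_extraction_prompt(report_text: str) -> str:
    """Instructive prompt asking the model for a fixed JSON schema."""
    keys = ", ".join(FIELDS)
    return (
        "Extract the following fields from the pathology report below and "
        f"answer with JSON only, using exactly these keys: {keys}. "
        'Use "unknown" for any field the report does not state.\n\n'
        f"Report:\n{report_text}"
    )

def parse_reply(reply: str) -> dict:
    """Parse the model's JSON reply, filling any missing keys with 'unknown'."""
    data = json.loads(reply)
    return {k: str(data.get(k, "unknown")) for k in FIELDS}

# A plausible model reply, hand-written here for illustration:
reply = '{"histologic_type": "adenocarcinoma", "t_stage": "T2a", "n_stage": "N0"}'
record = parse_reply(reply)
print(record["histologic_type"])  # → adenocarcinoma
print(record["m_stage"])          # → unknown
```

Pinning the schema in the prompt and filling gaps deterministically keeps the downstream table uniform even when the model omits a field, which matches the study's emphasis on how prompt design influences extraction accuracy.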


Introduction

Large Language Models (LLMs) 1 , 2 , 3 , 4 , 5 , 6 , such as Generative Pre-trained Transformer (GPT) models represented by ChatGPT, are being utilized for diverse applications across various sectors. In the healthcare industry, early applications of LLMs are being used to facilitate patient-clinician communication 7 , 8 . To date, few studies have examined the potential of LLMs in reading and interpreting clinical notes, turning unstructured texts into structured, analyzable data.

Traditionally, the automated extraction of structured data elements from medical notes has relied on medical natural language processing (NLP) using rule-based or machine-learning approaches or a combination of both 9 , 10 . Machine learning methods 11 , 12 , 13 , 14 , particularly deep learning, typically employ neural networks and the first generation of transformer-based large language models (e.g., BERT). Medical domain knowledge needs to be integrated into model designs to enhance performance. However, a significant obstacle to developing these traditional medical NLP algorithms is the limited existence of human-annotated datasets and the costs associated with new human annotation 15 . Despite meticulous ground-truth labeling, the relatively small corpus sizes often result in models with poor generalizability or make evaluations of generalizability impossible. For decades, conventional artificial intelligence (AI) systems (symbolic and neural networks) have suffered from a lack of general knowledge and commonsense reasoning. LLMs, like GPT, offer a promising alternative, potentially using commonsense reasoning and broad general knowledge to facilitate language processing.

ChatGPT is the application interface of the GPT model family. This study explores an approach to using ChatGPT to extract structured data elements from unstructured clinical notes. In this study, we selected lung cancer pathology reports as the corpus for extracting detailed diagnosis information for lung cancer. To accomplish this, we developed and improved a prompt engineering process. We then evaluated the effectiveness of this method by comparing the ChatGPT output with expert-curated structured data and used case studies to provide insights into how ChatGPT read and interpreted notes and why it made mistakes in some cases.

Data and endpoints

The primary objective of this study was to develop an algorithm and assess the capabilities of ChatGPT in processing and interpreting a large volume of free-text clinical notes. To evaluate this, we utilized unstructured lung cancer pathology notes, which provide diagnostic information essential for developing treatment plans and play vital roles in clinical and translational research. We accessed a total of 1026 lung cancer pathology reports from two web portals: the Cancer Digital Slide Archive (CDSA data) ( https://cancer.digitalslidearchive.org/ ) and The Cancer Genome Atlas (TCGA data) ( https://cBioPortal.org ). These platforms serve as public data repositories for de-identified patient information, facilitating cancer research. The CDSA dataset was utilized as the “training” data for prompt development, while the TCGA dataset, after removing the overlapping cases with CDSA, served as the test data for evaluating the ChatGPT model performance.

From all the downloaded 99 pathology reports from CDSA for the training data, we excluded 21 invalid reports due to near-empty content, poor scanning quality, or missing report forms. Seventy-eight valid pathology reports were included as the training data to optimize the prompt. To evaluate the model performance, 1024 pathology reports were downloaded from cBioPortal. Among them, 97 overlapped with the training data and were excluded from the evaluation. We further excluded 153 invalid reports due to near-empty content, poor scanning quality, or missing report forms. The invalid reports were preserved to evaluate ChatGPT’s handling of irregular inputs separately, and were not included in the testing data for accuracy performance assessment. As a result, 774 valid pathology reports were included as the testing data for performance evaluation. These valid reports still contain typos, missing words, random characters, incomplete contents, and other quality issues challenging human reading. The corresponding numbers of reports used at each step of the process are detailed in Fig. 1 .

Figure 1: Exclusions are accounted for due to reasons such as empty reports, poor scanning quality, and other factors, including reports of stage IV or unknown conditions.

The specific task of this study was to identify, from pathology reports, tumor staging and histology types, which are important for clinical care and research. The TNM staging system 16 , outlining the primary tumor features (T), regional lymph node involvement (N), and distant metastases (M), is commonly used to define the disease extent, assign prognosis, and guide lung cancer treatment. The American Joint Committee on Cancer (AJCC) has periodically released various editions 16 of TNM classification/staging for lung cancers based on recommendations from extensive database analyses. Following the AJCC guideline, individual pathologic T, N, and M stage components can be summarized into an overall pathologic staging score of Stage I, II, III, or IV. For this project, we instructed ChatGPT to use the AJCC 7th edition Cancer Staging Manual 17 as the reference for staging lung cancer cases. As the lung cancer cases in our dataset are predominantly non-metastatic, the pathologic metastasis (pM) stage was not extracted. The data elements we chose to extract and evaluate for this study are pathologic primary tumor (pT) and pathologic lymph node (pN) stage components, overall pathologic tumor stage, and histology type.

Overall Performance

Using the training data in the CDSA dataset ( n  = 78), we experimented and improved prompts iteratively, and the final prompt is presented in Fig. 2 . The overall performance of the ChatGPT (gpt-3.5-turbo-16k model) is evaluated in the TCGA dataset ( n  = 774), and the results are summarized in Table 1 . The accuracy of primary tumor features (pT), regional lymph node involvement (pN), overall tumor stage, and histological diagnosis are 0.87, 0.91, 0.76, and 0.99, respectively. The average accuracy of all attributes is 0.89. The coverage rates for pT, pN, overall stage and histological diagnosis are 0.97, 0.94, 0.94 and 0.96, respectively. Further details of the accuracy evaluation, F1, Kappa, recall, and precision for each attribute are summarized as confusion matrices in Fig. 3 .

Figure 2: Final prompt for information extraction and estimation from pathology reports.

Figure 3: Confusion matrices for a primary tumor features (pT), b regional lymph node involvement (pN), c overall tumor stage, and d histological diagnosis. For a meaningful evaluation, cases with uncertain values such as “Not Available”, “Not Specified”, “Cannot be determined”, or “Unknown” in the reference or prediction have been removed.

Inference and Interpretation

To understand how ChatGPT reads and makes inferences from pathology reports, we demonstrated a case study using a typical pathology report in this cohort (TCGA-98-A53A) in Fig. 4a . The left panel shows part of the original pathology report, and the right panel shows the ChatGPT output with estimated pT, pN, overall stage, and histology diagnosis. For each estimate, ChatGPT gives the confidence level and the corresponding evidence it used for the estimation. In this case, ChatGPT correctly extracted information related to tumor size, tumor features, lymph node involvement, and histology information and used the AJCC staging guidelines to estimate tumor stage correctly. In addition, the confidence level, evidence interpretation, and case summary align well with the report and pathologists’ evaluations. For example, the evidence for the pT category was described as “The pathology report states that the tumor is > 3 cm and < 5 cm in greatest dimension, surrounded by lung or visceral pleura.” The evidence for tumor stage was described as “Based on the estimated pT category (T2a) and pN category (N0), the tumor stage is determined to be Stage IB according to AJCC7 criteria.” It shows that ChatGPT extracted relevant information from the note and correctly inferred the pT category based on the AJCC guideline (Supplementary Fig. 1 ) and the extracted information.

Figure 4: a TCGA-98-A53A: an example of a scanned pathology report (left panel) with the ChatGPT output and interpretation (right panel); all estimations and supporting evidence are consistent with the pathologist’s evaluations. b The GPT model correctly inferred pT as T2a based on the tumor’s size and involvement according to AJCC guidelines.

In another more complex case, TCGA-50-6590 (Fig. 4b ), ChatGPT correctly inferred pT as T2a based on both the tumor’s size and location according to AJCC guidelines. Case TCGA-44-2656 demonstrates a more challenging scenario (Supplementary Fig. 2 ), where the report only contains some factual data without specifying pT, pN, and tumor stage. However, ChatGPT was able to infer the correct classifications based on the reported facts and provide proper supporting evidence.

Error analysis

To understand the types and potential reasons for misclassifications, we performed a detailed error analysis by looking into individual attributes and cases where ChatGPT made mistakes, the results of which are summarized below.

Primary tumor feature (pT) classification

In total, 768 cases with valid reports and reference values in the testing data were used to evaluate the classification performance of pT. Among them, 15 cases were reported with unknown or empty output by ChatGPT, making the coverage rate 0.97. For the remaining 753 cases, 12.6% of pT was misclassified. Among these misclassification cases, the majority were T1 misclassified as T2 (67 out of 753 or 8.9%) or T3 misclassified as T2 (12 out of 753, or 1.6%).

In most cases, ChatGPT extracted the correct tumor size information but used an incorrect rule to distinguish pT categories. For example, in the case TCGA-22-4609 (Fig. 5a ), ChatGPT stated, “Based on the tumor size of 2.0 cm, it falls within the range of T2 category according to AJCC 7th edition for lung carcinoma staging manual.” However, according to the AJCC 7th edition staging guidelines for lung cancer, if the tumor is more than 2 cm but less than 3 cm in greatest dimension and does not invade nearby structures, pT should be classified as T1b. Therefore, ChatGPT correctly extracted the maximum tumor dimension of 2 cm but incorrectly interpreted this as meeting the criteria for classification as T2. Similarly, for case TCGA-85-A4JB, ChatGPT incorrectly claimed, “Based on the tumor size of 10 cm, the estimated pT category is T2 according to AJCC 7th edition for lung carcinoma staging manual.” According to the AJCC 7th edition staging guidelines, a tumor more than 7 cm in greatest dimension should be classified as T3.
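The size thresholds at issue in these misclassifications can be written as a small lookup. Below is a minimal Python sketch of the AJCC 7th edition size-based pT rules, assuming boundary cuts at 2, 3, 5, and 7 cm; it deliberately ignores the invasion, pleural, and separate-nodule criteria that can also change pT, so it is an illustration of the size rule only, not a complete stager.

```python
def pt_from_size(size_cm):
    """Map the greatest tumor dimension (cm) to a pT category using the
    AJCC 7th edition size thresholds for lung cancer. Invasion, pleural
    involvement, and separate-nodule rules (which can upstage a tumor
    regardless of size) are intentionally omitted from this sketch."""
    if size_cm <= 2.0:
        return "T1a"
    elif size_cm <= 3.0:
        return "T1b"
    elif size_cm <= 5.0:
        return "T2a"
    elif size_cm <= 7.0:
        return "T2b"
    else:
        return "T3"

# The rules ChatGPT misapplied in the error analysis:
print(pt_from_size(2.5))   # >2 and <=3 cm -> "T1b", not T2
print(pt_from_size(10.0))  # >7 cm -> "T3", not T2 (as in TCGA-85-A4JB)
```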

Figure 5: a TCGA-22-4609: a typical case where the GPT model applied a rule that is incorrect under the AJCC guideline. b TCGA-39-5028: a complex case with two tumors, of which the GPT model captured only one. c TCGA-39-5016: a case where the GPT model erred by confusing domain terminology.

Another challenging situation arose when multiple tumor nodules were identified within the lung. In the case of TCGA-39-5028 (Fig. 5b ), two separate tumor nodules were identified: one in the right upper lobe measuring 2.1 cm in greatest dimension and one in the right lower lobe measuring 6.6 cm in greatest dimension. According to the AJCC 7th edition guidelines, the presence of separate tumor nodules in a different ipsilateral lobe results in a classification of T4. However, ChatGPT classified this case as T2a, stating, “The pathology report states the tumor’s greatest diameter as 2.1 cm”. This classification would be appropriate if the right upper lobe nodule were a single isolated tumor. However, ChatGPT failed to consider the presence of the second, larger nodule in the right lower lobe when determining the pT classification.

Regional lymph node involvement (pN)

The classification performance of pN was evaluated using 753 cases with valid reports and reference values in the testing data. Among them, 27 cases were reported with unknown or empty output by ChatGPT, making the coverage rate 0.94. For the remaining 726 cases, 8.5% of pN was misclassified. Most of these misclassification cases were N1 misclassified as N2 (32 cases). The AJCC 7th edition staging guidelines use the anatomic locations of positive lymph nodes to determine N1 vs. N2. However, most of the misclassification cases were caused by ChatGPT interpreting the number of positive nodes rather than the locations of the positive nodes. One such example is the case TCGA-85-6798. The report states, “Lymph nodes: 2/16 positive for metastasis (Hilar 2/16)”. Positive hilar lymph nodes correspond to N1 classification according to AJCC 7th edition guidelines. However, ChatGPT misclassifies this case as N2, stating, “The pathology report states that 2 out of 16 lymph nodes are positive for metastasis. Based on this information, the pN category can be estimated as N2 according to AJCC 7th edition for lung carcinoma staging manual.” This interpretation is incorrect, as the number of positive lymph nodes is not part of the criteria used to determine pN status according to AJCC 7th edition guidelines. The model misinterpreted pN2 predictions in 22 cases due to similar false assertions.

In some cases, the ChatGPT model made classification mistakes by misunderstanding the locations’ terminology. Figure 5c shows a case (TCGA-39-5016) where the ChatGPT model recognized that “6/9 peribronchial lymph nodes involved”, corresponding to classification as N1, but ChatGPT misclassified this case as N2. By AJCC 7th edition guidelines, N2 is defined as “Metastasis in ipsilateral mediastinal and/or subcarinal lymph node(s)”. The ChatGPT model did not fully understand that terminology and misclassified the case.

Pathology tumor stage

The overall tumor stage classification performance was evaluated using 744 cases with valid reports and reference values of stage I, II, and III in the testing data. Among them, 18 cases were reported with unknown or empty output by ChatGPT, making the coverage rate 0.94. For the remaining 726 cases, 23.6% of the overall stage was misclassified. Since the overall stage depends on the individual pT and pN stages, the mistakes could come from misclassification of pT or pN (error propagation) or from applying incorrect inference rules to determine the overall stage from pT and pN (incorrect rules). Looking into the 56 cases where ChatGPT misclassified stage II as stage III, 22 cases were due to error propagation and 34 were due to incorrect rules. Figure 6a shows an example of error propagation (TCGA-MP-A4TK): ChatGPT misclassified the pT stage from T2a to T3, and this mistake led to the incorrect classification of stage IIA as stage IIIA. Figure 6b illustrates a case (TCGA-49-4505) where ChatGPT made correct estimations of pT and pN but predicted the tumor stage incorrectly by using a false rule. Among the 34 cases affected by incorrect rules, ChatGPT mistakenly inferred the tumor stage as stage III for 26 cases where pT is T3 and pN is N0. For example, for case TCGA-55-7994, ChatGPT provided the evidence as “Based on the estimated pT category (T3) and pN category (N0), the tumor stage is determined to be Stage IIIA according to AJCC7 criteria”. According to AJCC7, tumors with T3 and N0 should be classified as stage IIB. Similarly, error analysis for other tumor stages shows that misclassifications come from both error propagation and the application of false rules.
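The staging rules ChatGPT misapplied can likewise be expressed as a lookup. The sketch below encodes a simplified AJCC 7th edition overall-stage table for non-metastatic (M0) lung cancer; the function name and table layout are illustrative, and Tis, TX, and metastatic (M1) cases are omitted.

```python
# Simplified AJCC 7th edition overall-stage lookup for M0 lung cancer,
# indexed by (pT, pN). Only the T/N combinations discussed in the error
# analysis are covered; a full stager would also handle Tis, TX, and M1.
STAGE_AJCC7_M0 = {
    ("T1a", "N0"): "IA",  ("T1b", "N0"): "IA",
    ("T2a", "N0"): "IB",
    ("T2b", "N0"): "IIA", ("T1a", "N1"): "IIA",
    ("T1b", "N1"): "IIA", ("T2a", "N1"): "IIA",
    ("T3", "N0"): "IIB",  ("T2b", "N1"): "IIB",
    ("T3", "N1"): "IIIA", ("T4", "N0"): "IIIA", ("T4", "N1"): "IIIA",
}

def overall_stage(pt, pn):
    # N2 disease is stage IIIA for T1-T3 and IIIB for T4; N3 is IIIB.
    if pn == "N2":
        return "IIIA" if pt in {"T1a", "T1b", "T2a", "T2b", "T3"} else "IIIB"
    if pn == "N3":
        return "IIIB"
    return STAGE_AJCC7_M0.get((pt, pn), "Unknown")

print(overall_stage("T3", "N0"))   # "IIB" -- the rule ChatGPT broke for TCGA-55-7994
print(overall_stage("T2a", "N0"))  # "IB"
```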

Figure 6: a TCGA-MP-A4TK: a typical error in which the GPT model applied a false rule and the mistake then propagated into the overall stage. b TCGA-49-4505: the GPT model correctly inferred T2b and N1 but estimated Stage IIIA using a false rule.

Histological diagnosis

The classification performance of histology diagnosis was evaluated using 762 cases with valid reports and reference values in the testing data. Among them, 17 cases were reported as either unknown or empty output by ChatGPT, making the coverage rate 0.96. For the remaining 745 cases, 6 (<1%) histology types were misclassified. Among these mistakes, ChatGPT misclassified 3 cases as the “other” type, and misclassified 3 cases of the actual “other” type (neither adenocarcinoma nor squamous cell carcinoma) as 2 adenocarcinomas and 1 squamous cell carcinoma. In TCGA-22-5485, two tumors exist: one squamous cell carcinoma and another adenocarcinoma, so the case should be classified as the “other” type. However, ChatGPT only identified and extracted information for one tumor. In the case TCGA-33-AASB, which is the “other” type of histology, ChatGPT captured the key information and gave it as evidence: “The pathology report states the histologic diagnosis as infiltrating poorly differentiated non-small cell carcinoma with both squamous and glandular features”. However, it mistakenly estimated this case as “adenocarcinoma”. In another case (TCGA-86-8668) of adenocarcinoma, ChatGPT again captured the key information and stated as evidence, “The pathology report states the histologic diagnosis as Bronchiolo-alveolar carcinoma, mucinous”, but could not tell that it is a subtype of adenocarcinoma. Both cases reveal that ChatGPT still has limitations in specific domain knowledge of lung cancer pathology and in correctly understanding its terminology.

Analyzing irregularities

The initial model evaluation and prompt-response review uncovered irregular scenarios: the original pathology reports may be blank, poorly scanned, or simply missing report forms. We reviewed how ChatGPT responded to these anomalies. First, when a report was blank, the prompt contained only the instruction part. ChatGPT failed to recognize this situation in most cases and inappropriately generated a fabricated case. Our experiments showed that, with the temperature set at 0 for blank reports, ChatGPT converged to a consistent, hallucinated response. Second, for nearly blank reports with a few random characters and poorly scanned reports, ChatGPT consistently converged to the same response with increased variance as noise increased. In some cases, ChatGPT responded appropriately to all required attributes but with unknown values for missing information. Last, among the 15 missing report forms in a small dataset, ChatGPT responded “unknown” as expected in only 5 cases, with the remaining 10 still converging to the hallucinated response.
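Because blank or near-blank reports tended to elicit a fabricated case, such irregular inputs are worth screening out before they ever reach the model. The sketch below shows one heuristic pre-filter; the character-count and letter-ratio thresholds are illustrative assumptions, not values from the study.

```python
import re

def report_is_processable(text, min_chars=200, min_alpha_ratio=0.6):
    """Heuristic pre-filter for pathology report text before querying the
    model: reject blank or near-blank inputs and heavy OCR noise, so the
    prompt is never sent with an (effectively) empty report, where the
    model tends to hallucinate. Thresholds are illustrative only."""
    stripped = text.strip()
    if len(stripped) < min_chars:
        return False
    letters = len(re.findall(r"[A-Za-z]", stripped))
    return letters / len(stripped) >= min_alpha_ratio

print(report_is_processable(""))              # False: blank report
print(report_is_processable("x#9@! \n" * 50))  # False: scan noise
print(report_is_processable(
    "Right upper lobe lobectomy specimen with a 3.2 cm adenocarcinoma. " * 5
))                                             # True: plausible report text
```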

Reproducibility evaluation

Since ChatGPT models (even of the same version) evolve over time, it is important to evaluate the stability and reproducibility of ChatGPT. For this purpose, we ran the same model (“gpt-3.5-turbo-0301”) with the same data, prompt, and settings (e.g., temperature = 0) twice: in early April and in mid-May of 2023. The rate of equivalence between the April and May ChatGPT estimations on the key attributes of interest (pT, pN, tumor stage, and histological diagnosis) is 0.913. The mean absolute error between the certainty degrees in the two experiments is 0.051. Considering the evolutionary nature of ChatGPT models, we regard an output difference of this extent as reasonable and the overall ChatGPT-3.5 model as stable.
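The two reproducibility metrics used here are straightforward to compute. A minimal sketch, with illustrative attribute names and toy data standing in for the two runs:

```python
def equivalence_rate(run_a, run_b, keys=("pT", "pN", "stage", "histology")):
    """Fraction of (case, attribute) pairs on which two runs agree."""
    matches = total = 0
    for a, b in zip(run_a, run_b):
        for k in keys:
            total += 1
            matches += a[k] == b[k]
    return matches / total

def mean_abs_error(cert_a, cert_b):
    """Mean absolute difference between certainty degrees of two runs."""
    return sum(abs(x - y) for x, y in zip(cert_a, cert_b)) / len(cert_a)

# Toy example: one case, identical outputs between the two runs.
april = [{"pT": "T2a", "pN": "N0", "stage": "IB", "histology": "Adenocarcinoma"}]
may   = [{"pT": "T2a", "pN": "N0", "stage": "IB", "histology": "Adenocarcinoma"}]
print(equivalence_rate(april, may))              # 1.0 for this toy pair
print(round(mean_abs_error([0.9], [0.85]), 3))   # 0.05
```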

Comparison with other NLP methods

In order to have a clear perspective on how ChatGPT’s performance stands relative to established methods, we conducted a comparative analysis of the results generated by ChatGPT with two established methods: a keyword search algorithm and a deep learning-based Named Entity Recognition (NER) method.

Data selection and annotation

Since the keyword search and NER methods do not support zero-shot learning and require human annotations on the entity level, we carefully annotated our dataset for these traditional NLP methods. We used the same training and testing datasets as in the prompt engineering for ChatGPT. The training dataset underwent meticulous annotation by experienced medical professionals, adhering to the AJCC7 standards. This annotation process involved identifying and highlighting all relevant entities and text spans related to stage, histology, pN, and pT attributes. The detailed annotation process for the 78 cases required a few weeks of full-time work from medical professionals.

Keyword search algorithm using wordpiece tokenizer

For the keyword search algorithm, we employed the WordPiece tokenizer to segment words into subwords. We compiled an annotated entity dictionary from the training dataset. To assess the performance of this method, we calculated span similarities between the extracted spans in the validation and testing datasets and the entries in the dictionary.
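The dictionary-matching step can be sketched as follows. Note that this uses plain character-level similarity via Python's difflib as a stand-in for the WordPiece-subword span similarity the study used, and the dictionary entries are illustrative toy data.

```python
from difflib import SequenceMatcher

# Toy dictionary of annotated entities from the training data; each entry
# maps a span to (attribute, standardized label).
ENTITY_DICT = {
    "squamous cell carcinoma": ("histology", "Lung Squamous Cell Carcinoma"),
    "adenocarcinoma": ("histology", "Lung Adenocarcinoma"),
    "pT2a": ("pT", "T2a"),
}

def match_span(span, threshold=0.8):
    """Return the best dictionary label for an extracted span, or None
    if no entry is similar enough (character similarity here stands in
    for the subword-based similarity in the study)."""
    best, best_score = None, threshold
    for key, label in ENTITY_DICT.items():
        score = SequenceMatcher(None, span.lower(), key).ratio()
        if score >= best_score:
            best, best_score = label, score
    return best

print(match_span("Adenocarcinomas"))  # close match -> the adenocarcinoma label
print(match_span("no tumor seen"))    # None: below the similarity threshold
```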

Named Entity Recognition (NER) classification algorithm

For the NER classification algorithm, we designed a multi-label span classification model. This model utilized the pre-trained Bio_ClinicalBERT as its backbone. To adapt it for multi-label classification, we introduced an additional linear layer. The model underwent fine-tuning for 1000 epochs using the stochastic gradient descent (SGD) optimizer. The model exhibiting the highest overall F1 score on the validation dataset was selected as the final model for further evaluation in the testing dataset.

Performance evaluation

We evaluated the performance of both the keyword search and NER methods on the testing dataset. We summarized the predicted entities/spans and their corresponding labels. In cases where multiple related entities were identified for a specific category, we selected the most severe entities as the final prediction. Moreover, we inferred the stage information for corpora lacking explicit staging information by aggregating details from pN, pT, and diagnosis, in line with the AJCC7 protocol. The overall predictions for stage, diagnosis, pN, and pT were compared against the ground-truth table to gauge the accuracy and effectiveness of these methods. The results (Supplementary Table S1) show that ChatGPT outperforms the WordPiece tokenizer and the NER classifier: the average accuracies for ChatGPT, the WordPiece tokenizer, and the NER classifier are 0.89, 0.51, and 0.76, respectively.

Prompt engineering process and results

Prompt design is a heuristic search process with many elements to consider, thus having a significantly large design space. We conducted many experiments to explore better prompts. Here, we share a few typical prompts and the performance of these prompts in the training data set to demonstrate our prompt engineering process.

Output format

The most straightforward prompt, without special design, would be: “read the pathology report and answer what are pT, pN, tumor stage, and histological diagnosis”. However, this simple prompt makes ChatGPT produce unstructured answers varying in format, terminology, and granularity across the large number of pathology reports. For example, ChatGPT may output pT as “T2” or “pT2NOMx”, and it may output the histological diagnosis as “Multifocal invasive moderately differentiated non-keratinizing squamous cell carcinoma”. Such free-text answers would require a significant human workload to clean and process. To solve this problem, we used a multiple-choice answer format to force ChatGPT to pick standardized values for some attributes. For example, for pT, ChatGPT could only provide the following outputs: “T0, Tis, T1, T1a, T1b, T2, T2a, T2b, T3, T4, TX, Unknown”. For the histologic diagnosis, ChatGPT could provide output in one of these categories: Lung Adenocarcinoma, Lung Squamous Cell Carcinoma, Other, Unknown. In addition, we added the instruction, “Please make sure to output the whole set of answers together as a single JSON file, and don’t output anything beyond the required JSON file,” to emphasize the requirement for the output format. These requests in the prompt make the downstream analysis of ChatGPT output much more efficient. To know the certainty degree of ChatGPT’s estimate and the evidence behind it, we asked ChatGPT to provide the following 4 outputs for each attribute/variable: the extracted value as stated in the pathology report, the estimated value based on the AJCC 7th edition for lung carcinoma staging manual, the certainty degree of the estimation, and the supporting evidence for the estimation. The classification accuracy of this prompt with the multiple-choice output format (prompt v1) on our training data reached 0.854.
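Constraining the output to multiple-choice values also makes downstream parsing simple. Below is a minimal sketch of validating ChatGPT's JSON output against the allowed choices; the field names are illustrative, not the study's exact schema.

```python
import json

PT_CHOICES = ["T0", "Tis", "T1", "T1a", "T1b", "T2", "T2a", "T2b",
              "T3", "T4", "TX", "Unknown"]
HISTOLOGY_CHOICES = ["Lung Adenocarcinoma", "Lung Squamous Cell Carcinoma",
                     "Other", "Unknown"]

def parse_model_output(raw):
    """Parse the JSON the prompt asks ChatGPT to emit, coercing any value
    outside the allowed multiple-choice sets to 'Unknown' so downstream
    analysis never sees free text. Field names are illustrative."""
    data = json.loads(raw)
    if data.get("pT_estimated") not in PT_CHOICES:
        data["pT_estimated"] = "Unknown"
    if data.get("histology") not in HISTOLOGY_CHOICES:
        data["histology"] = "Unknown"
    return data

raw = '{"pT_estimated": "T2a", "histology": "Lung Adenocarcinoma", "certainty": 0.9}'
print(parse_model_output(raw)["pT_estimated"])   # "T2a"
bad = '{"pT_estimated": "pT2NOMx", "histology": "BAC"}'
print(parse_model_output(bad)["pT_estimated"])   # coerced to "Unknown"
```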

Evidence-based inference

One of the major concerns for LLM is that the results from the model are not supported by any evidence, especially when there is not enough information for specific questions. In order to reduce this problem, we emphasize the use of evidence for inference in the prompt by adding this instruction to ChatGPT: “Please ensure to make valid inferences for attribute estimation based on evidence. If there is no available evidence provided to make an estimation, please answer the value as “Unknown.” In addition, we asked ChatGPT to “Include “comment” as the last key of the JSON file.” After adding these two instructions (prompt v2), the performance of the classification in the training data increased to 0.865.

Chain of thought prompting by asking intermediate questions

Although tumor size is not a primary interest for diagnosis and clinical research, it plays a critical role in classifying the pT stage. We hypothesize that if ChatGPT pays closer attention to tumor size, it will have better classification performance. Therefore, we added an instruction in the prompt (prompt v3) to ask ChatGPT to estimate: “tumor size max_dimension: [<the greatest dimension of tumor in Centimeters (cm)>, ‘Unknown’]” as one of the attributes. After this modification, the performance of the classification in the training data increased to 0.90.

Providing examples

Providing examples is an effective way for humans to learn, and it should have similar effects for ChatGPT. We provided a specific example to infer the overall stage based on pT and pN by adding this instruction: “Please estimate the tumor stage category based on your estimated pT category and pN category and use AJCC7 criteria. For example, if pT is estimated as T2a and pN as N0, without information showing distant metastasis, then by AJCC7 criteria, the tumor stage is “Stage IB”.” After this modification (prompt v4), the performance of the classification in the training data increased to 0.936.

Although we could further refine and improve the prompts, we decided to use prompt v4 as the final model and applied it to the testing data, obtaining a final classification accuracy of 0.89.

ChatGPT-4 performance

LLMs evolve rapidly, and OpenAI released the GPT-4 Turbo model (GPT-4-1106-preview) in November 2023. To compare this new model with GPT-3.5-Turbo, we applied GPT-4-1106 to analyze all the lung cancer pathology notes in the testing data. The classification results and the comparison with GPT-3.5-Turbo-16k are summarized in Supplementary Table 1. The results show that GPT-4 Turbo performs better in almost every aspect; overall, the GPT-4 Turbo model increases performance by over 5%. However, GPT-4 Turbo is much more expensive than GPT-3.5-Turbo, and the performance of GPT-3.5-Turbo-16k is still comparable and acceptable. As such, this study mainly focuses on assessing GPT-3.5-Turbo-16k, but the comparison highlights the fast development and promise of using LLMs to extract structured data from clinical notes.

Analyzing osteosarcoma data

To demonstrate the broader application of this method beyond lung cancer, we collected and analyzed clinical notes from pediatric osteosarcoma patients. Osteosarcoma, the most common type of bone cancer in children and adolescents, has seen no substantial improvement in patient outcomes for the past few decades 18 . Histology grades and margin status are among the most important prognostic factors for osteosarcoma. We collected pathology reports from 191 osteosarcoma cases (approved by UTSW IRB #STU 012018-061). Out of these, 148 cases had histology grade information, and 81 had margin status information; these cases were used to evaluate the performance of the GPT-3.5-Turbo-16K model and our prompt engineering strategy. Final diagnoses on grade and margin were manually reviewed and curated by human experts, and these diagnoses were used to assess ChatGPT’s performance. All notes were de-identified prior to analysis. We applied the same prompt engineering strategy to extract grade and margin information from these osteosarcoma pathology reports. This analysis was conducted on our institution’s private Azure OpenAI platform, using the GPT-3.5-Turbo-16K model (version 0613), the same model used for lung cancer cases. ChatGPT accurately classified both grades (with a 98.6% accuracy rate) and margin status (100% accuracy), as shown in Supplementary Fig. 3 . In addition, Supplementary Fig. 4 details a specific case, illustrating how ChatGPT identifies grades and margin status from osteosarcoma pathology reports.

Discussion

Since ChatGPT’s release in November 2022, it has spurred many potential innovative applications in healthcare 19 , 20 , 21 , 22 , 23 . To our knowledge, this is among the first reports of an end-to-end data science workflow for prompt engineering, using, and rigorously evaluating ChatGPT in its capacity of batch-processing information extraction tasks on large-scale clinical report data.

The main obstacle to developing traditional medical NLP algorithms is the limited availability of annotated data and the costs of new human annotation. To overcome these hurdles, particularly in integrating problem-specific information and domain knowledge with LLMs’ task-agnostic general knowledge, Augmented Language Models (ALMs) 24 , which incorporate reasoning and external tools for interaction with the environment, are emerging. Research shows that in-context learning (most influentially, few-shot prompting) can complement LLMs with task-specific knowledge to perform downstream tasks effectively 24 , 25 . In-context learning is an approach in which the model is taught through instruction or a light tutorial with a few examples (so-called few-shot prompting; instruction without any examples is called zero-shot prompting) rather than through fine-tuning or compute-intensive training that adjusts the model weights. This approach has become a dominant method for using LLMs in real-world problem-solving 24 , 25 , 26 . The advent of ALMs promises to revolutionize almost every aspect of human society, including the medical and healthcare domains, altering how we live, work, and communicate. Our study shows the feasibility of using ChatGPT to extract data from free text without extensive task-specific human annotation and model training.

In medical data extraction, our study has demonstrated the advantages of adopting ChatGPT over traditional methods in terms of cost-effectiveness and efficiency. Traditional approaches often require labor-intensive annotation processes that may take medical professionals weeks or months, while a prompt-based ChatGPT workflow can be set up for data extraction within days, significantly reducing the time investment required for implementation. Moreover, our economic analysis revealed the cost savings associated with using ChatGPT: processing over 900 pathology reports incurred a minimal monetary cost (less than $10 using GPT-3.5 Turbo and less than $30 using GPT-4 Turbo). This finding underscores the potential benefits of incorporating ChatGPT into medical data extraction workflows, not only for its time efficiency but also for its cost-effectiveness, making it a compelling option for medical institutions and researchers seeking to streamline their data extraction processes without compromising accuracy or quality.

A critical requirement for effectively utilizing an LLM is crafting a high-quality "prompt" to instruct it, which has led to the emergence of an important methodology referred to as "prompt engineering." Two fundamental principles guide this process: first, providing appropriate context; second, delivering clear instructions about the subtasks, the requirements for the desired response, and how it should be presented. For a single query for one-time use, the user can experiment with and revise the prompt within the conversation session until a satisfactory answer is obtained. However, prompt design becomes more complex when handling repetitive tasks over many input data files through the OpenAI API. In these instances, a prompt must be designed for a given data feed while maintaining generality and coverage over varied input data features. In this study, we found that giving clear guidance on the output format, emphasizing evidence-based inference, eliciting chain-of-thought reasoning by asking for tumor size information, and providing specific examples were critical to improving the efficiency and accuracy of extracting structured data from free-text pathology reports. The approach employed in this study effectively leverages the OpenAI API for batch queries of ChatGPT services across a large set of tasks with similar input data structures, including but not limited to pathology reports and EHRs.
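The batch-query pattern described above can be sketched as follows. The model name, JSON field names, and the `send` wiring are assumptions for illustration rather than the authors' exact code; the live OpenAI v1-style client call appears only in a comment, with a stub standing in so the example runs offline:

```python
# Hedged sketch of batching single-round ChatGPT queries over many reports.
import json

PROMPT = (
    "You are extracting structured data from a lung cancer pathology report. "
    "Return JSON with keys pT, pN, stage, histology. Base every value on "
    "explicit evidence in the report; answer \"Unknown\" if evidence is absent."
)

def build_messages(report_text: str) -> list[dict]:
    # One single-round chat completion per report.
    return [
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": report_text},
    ]

def run_batch(reports: dict[str, str], send) -> dict[str, dict]:
    # `send` abstracts the API call, e.g. with the v1 OpenAI client:
    #   lambda msgs: client.chat.completions.create(
    #       model="gpt-3.5-turbo-16k", messages=msgs
    #   ).choices[0].message.content
    results = {}
    for case_id, text in reports.items():
        raw = send(build_messages(text))
        results[case_id] = json.loads(raw)
    return results

# Offline demonstration with a stub in place of the live API:
stub = lambda msgs: '{"pT": "T2a", "pN": "N0", "stage": "IB", "histology": "adenocarcinoma"}'
out = run_batch({"TCGA-01": "…report text…"}, stub)
print(out["TCGA-01"]["stage"])  # IB
```

Keeping the prompt fixed while only the report text varies is what lets one prompt design cover a large, heterogeneous set of inputs.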

Our evaluation shows that ChatGPT (gpt-3.5-turbo-16k) achieved an overall average accuracy of 89% in extracting and estimating lung cancer staging information and histology subtypes compared to pathologist-curated data. This performance is promising given that some scanned pathology reports included in this study contained random characters, missing parts, typos, varied formats, and divergent information sections. ChatGPT also outperformed traditional NLP methods. Our case analysis shows that most misclassifications were due to the current ChatGPT models' lack of detailed pathology terminology or highly specialized knowledge, which could be remedied by future model training or fine-tuning with more domain-specific knowledge.

While our experiments reveal ChatGPT's strengths, they also underscore its limitations and potential risks, the most significant being the occasional "hallucination" phenomenon27,28, where the generated content is not faithful to the provided source content. For example, the responses to blank or near-blank reports reflect this issue, though such instances can be detected and corrected because they converge towards an "attractor".

The phenomenon of hallucination in LLMs presents a significant challenge for the field, and several key factors must be considered to address the risks of applying ChatGPT in medicine. Since the output of an LLM depends on both the model and the prompt, hallucination can be mitigated through improvements to GPT models and to prompting strategies. From the model perspective, better architectures, robust training, and fine-tuning on a diverse and comprehensive medical dataset, with accurate labeling and classification, can reduce misclassifications. Additionally, enhancing LLMs' comprehension of medical terminology and guidelines by incorporating feedback from healthcare professionals during training, for example through Reinforcement Learning from Human Feedback (RLHF), can further diminish hallucinations. Regarding prompt engineering strategies, a crucial method is to prompt the GPT model with a chain of thought and request an explanation with the evidence used in the reasoning. Further improvements could include explicitly requesting evidence from the input data (e.g., the pathology report) and the inference rules applied (e.g., AJCC rules). Prompting GPT models to respond with 'Unknown' when the information is insufficient, providing relevant context in the prompt, or using embeddings of relevant text to narrow the semantic subspace can also be effective. Reining in hallucination is an ongoing challenge in AI research, with various methods being explored5,27; for example, a recent study proposed the "SelfCheckGPT" approach to fact-check black-box models29. Developing real-time error detection mechanisms is crucial for enhancing the reliability and trustworthiness of AI models, and more research is needed to evaluate the extent, impacts, and potential solutions of using LLMs in clinical research and care.
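One lightweight form of consistency-based checking, in the spirit of (though far simpler than) SelfCheckGPT, is to re-sample the same query several times and flag answers with low agreement for human review. A sketch, with an illustrative agreement threshold:

```python
from collections import Counter

def flag_inconsistent(samples: list[str], min_agreement: float = 0.7):
    # Majority answer across stochastic re-samples of the same query;
    # low agreement is a cheap proxy signal for possible hallucination.
    counts = Counter(s.strip() for s in samples)
    answer, n = counts.most_common(1)[0]
    agreement = n / len(samples)
    return answer, agreement, agreement < min_agreement

# Five re-samples of the same pT-stage query (values illustrative):
ans, agree, suspect = flag_inconsistent(["pT2a", "pT2a", "pT2a", "pT1b", "pT2a"])
print(ans, agree, suspect)  # pT2a 0.8 False
```

Such a filter does not prove an answer correct; it only surfaces responses where the model's own samples disagree, which are good candidates for manual verification.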

When considering ChatGPT and similar LLMs in healthcare, it is important to weigh the privacy implications. The sensitivity of medical data, governed by rigorous regulations such as HIPAA, naturally raises concerns when integrating these technologies. Although analyzing publicly available de-identified data, like the lung cancer pathology notes used in this study, raises fewer concerns, careful safeguards are needed for protected healthcare data. More secure options include the OpenAI services described in the OpenAI security portal, which are stated to comply with multiple regulatory standards, and Microsoft Azure OpenAI, which is stated to support HIPAA-compliant use. In this study, for example, de-identified osteosarcoma pathology notes were analyzed with Microsoft Azure OpenAI under a Business Associate Agreement. In addition, private versions of these APIs, or LLMs developed within a secure healthcare IT environment, might offer good alternatives. Moreover, implementing strong data anonymization protocols and conducting regular security audits could further protect patient information. As we navigate these advancements, it is crucial to continuously reassess and adapt privacy strategies, ensuring that the integration of AI into healthcare is both beneficial and responsible.

Despite these challenges, this study demonstrates an effective methodology for prompt engineering and presents a general framework for using ChatGPT's API in batch queries to process large volumes of pathology reports for structured information extraction and estimation. The application of ChatGPT to interpreting clinical notes holds substantial promise for transforming how healthcare professionals and patients use these crucial documents. By generating concise, accurate, and comprehensible summaries, ChatGPT could significantly enhance the extraction of structured information from unstructured clinical texts, ultimately leading to more efficient clinical research and improved patient care.

In conclusion, ChatGPT and other LLMs are powerful tools, not just for pathology report processing but also for the broader digital transformation of healthcare documents. These models can catalyze the utilization of the rich historical archives of medical practice, thereby creating robust resources for future research.

Data processing, workflow, and prompt engineering

The lung cancer data used in this study are publicly accessible, de-identified data available via CDSA ( https://cancer.digitalslidearchive.org/ ) and TCGA ( https://cBioPortal.org ). The institutional review board at the University of Texas Southwestern Medical Center approved this study, and the requirement for patient consent was waived for the use of retrospective, de-identified electronic health record data.

We aimed to leverage ChatGPT to extract and estimate structured data from these notes. Figure 7a displays our process. First, scanned pathology reports in PDF format were downloaded from TCGA and CDSA databases. Second, R package pdftools, an optical character recognition tool, was employed to convert scanned PDF files into text format. After this conversion, we identified reports with near-empty content, poor scanning quality, or missing report forms, and those cases were excluded from the study. Third, the OpenAI API was used to analyze the text data and extract structured data elements based on specific prompts. In addition, we extracted case identifiers and metadata items from the TCGA metadata file, which was used to evaluate the model performance.
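The exclusion of near-empty or badly scanned reports can be implemented as a simple text heuristic applied after OCR conversion. The thresholds below are illustrative assumptions, not the study's actual exclusion criteria:

```python
def is_valid_report(text: str, min_words: int = 50, min_alpha_ratio: float = 0.6) -> bool:
    # Heuristic filter: drop near-empty reports and OCR output dominated
    # by random non-alphabetic characters. Thresholds are illustrative.
    words = text.split()
    if len(words) < min_words:
        return False
    alpha = sum(c.isalpha() for c in text)
    visible = sum(not c.isspace() for c in text)
    return visible > 0 and alpha / visible >= min_alpha_ratio

print(is_valid_report(""))            # False: near-empty content
print(is_valid_report("lung " * 100)) # True: enough readable text
```

Filtered-out cases would be logged as irregular rather than silently dropped, so they can still be reported in the evaluation.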

Figure 7

a Illustration of the use of OpenAI API for batch queries of ChatGPT service, applied to a substantial volume of clinical notes — pathology reports in our study. b A general framework for integrating ChatGPT into real-world applications.

In this study, we implemented a problem-solving framework rooted in data science workflows and systems engineering principles, as depicted in Fig. 7b. An important step is the spiral approach30 to prompt engineering, an iterative process of experimenting with subtasks, different phrasings, contexts, format specifications, and example outputs to improve the quality and relevance of the model's responses. For the prompt engineering, we first defined the objective: to extract TNM staging and histology type as structured attributes from the unstructured pathology reports. Second, we assigned specific tasks to ChatGPT: estimating the targeted attributes, evaluating certainty levels, identifying the key evidence for each attribute estimate, and generating a summary as output. The output was compiled into a JSON file. Throughout this process, clinicians actively formulated questions and evaluated the results.
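Compiling responses into a JSON file makes each record easy to validate before downstream analysis. A hedged sketch of such validation (the field names are illustrative, not the authors' exact schema):

```python
import json

# Illustrative schema: attribute estimates, certainty, evidence, summary.
REQUIRED = {"pT", "pN", "stage", "histology", "certainty", "evidence", "summary"}

def parse_record(raw: str) -> dict:
    # Parse one model response and check it against the expected fields,
    # so malformed outputs fail loudly instead of corrupting the dataset.
    rec = json.loads(raw)
    missing = REQUIRED - rec.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return rec

raw = json.dumps({
    "pT": "T2a", "pN": "N0", "stage": "IB",
    "histology": "adenocarcinoma", "certainty": "high",
    "evidence": "tumor 4.2 cm, no nodal involvement",
    "summary": "Stage IB lung adenocarcinoma.",
})
print(parse_record(raw)["stage"])  # IB
```

Records that fail validation can then be re-queried or routed to manual review, supporting the iterative spiral workflow.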

Our study used the "gpt-3.5-turbo" model, accessible via the OpenAI API. The model incorporates 175 billion parameters and was trained on various public and authorized documents, demonstrating certain Artificial General Intelligence (AGI) capabilities5. Each query sent to the ChatGPT service is a "text completion"31, which can be implemented as a single-round chat completion. All LLMs have limited context windows that constrain the input length of a query, so a lengthy pathology report combined with the prompt and ChatGPT's response might exceed this limit. We used OpenAI's "tiktoken" Python library to estimate the token count and ensure compliance. This constraint has been largely relaxed by newly released GPT models with much larger context windows. We illustrate the pseudocode for batch ChatGPT queries on a large pathology report set in Supplementary Fig. 5.
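A context-window check of this kind can be sketched as follows. The roughly-four-characters-per-token rule and the 1024-token reply budget are rough assumptions for illustration; for exact counts the study used tiktoken, whose interface appears in a comment:

```python
def approx_token_count(text: str) -> int:
    # Rough rule of thumb (~4 characters per token for English text).
    # For exact counts, use OpenAI's tiktoken library:
    #   enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
    #   len(enc.encode(text))
    return max(1, len(text) // 4)

def fits_context(prompt: str, report: str, limit: int = 16384, reply_budget: int = 1024) -> bool:
    # Prompt + report + the expected reply must all fit within the
    # model's context window (16k tokens for gpt-3.5-turbo-16k).
    return approx_token_count(prompt + report) + reply_budget <= limit

print(fits_context("Extract pT stage.", "Adenocarcinoma, 3 cm."))  # True
```

Reports that fail the check would need to be truncated, split, or routed to a larger-context model.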

Model evaluation

We evaluated the performance of ChatGPT by comparing its output with the expert-curated data elements in the TCGA structured data, using the testing data set. Some staging records in the TCGA structured data were outdated; our physicians curated and updated those records. To mimic a real-world setting, we processed all reports regardless of data quality to collect model responses. For performance evaluation, we used only valid reports containing meaningful text and excluded reports with near-empty content, poor scanning quality, or missing report forms, which were reported as irregular cases. We assessed classification accuracy, F1, Kappa, recall, and precision for each attribute of interest, including pT, pN, overall stage, and histology type, and presented the results as accuracy values and confusion matrices. Missing data were excluded from the accuracy evaluation, and a coverage rate was reported to account for predictions returned as 'unknown' or empty output.
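The accuracy-plus-coverage convention above can be sketched with stdlib tools only; attribute values here are illustrative:

```python
from collections import Counter

def evaluate(pred: list[str], truth: list[str], unknown: str = "Unknown"):
    # Coverage: fraction of cases where the model committed to a value.
    # Accuracy: computed only over covered predictions, mirroring the
    # exclusion of 'unknown'/empty outputs from the accuracy evaluation.
    pairs = [(p, t) for p, t in zip(pred, truth) if p != unknown and p != ""]
    coverage = len(pairs) / len(pred)
    correct = sum(p == t for p, t in pairs)
    accuracy = correct / len(pairs) if pairs else 0.0
    confusion = Counter(pairs)  # (predicted, true) cell counts
    return coverage, accuracy, confusion

cov, acc, cm = evaluate(["T2a", "T1b", "Unknown", "T2a"],
                        ["T2a", "T2a", "T1b", "T2a"])
print(cov, acc)
```

Reporting coverage alongside accuracy matters because a model can trivially raise its accuracy by answering 'Unknown' more often; the two numbers must be read together.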

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The lung cancer dataset used in this study is "Pan-Lung Cancer (TCGA, Nat Genet 2016)" ( https://www.cbioportal.org/study/summary?id=nsclc_tcga_broad_2016 ), together with the "luad" and "lusc" subsets from CDSA ( https://cancer.digitalslidearchive.org/ ). We have provided a reference describing how to access the data32. We used the provided APIs to retrieve clinical information and pathology reports for the LUAD (lung adenocarcinoma) and LUSC (lung squamous cell carcinoma) cohorts. The pediatric data are EHR data from UTSW clinical services and are available from the corresponding author upon reasonable request and IRB approval.

Code availability

All code used in this paper was developed using APIs from OpenAI. The prompt for the API is available in Fig. 2. Method-specific code is available from the corresponding author upon request.

References

1. Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017).

2. Devlin, J. et al. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).

3. Radford, A. et al. Improving language understanding by generative pre-training. OpenAI, https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf (2018).

4. Touvron, H. et al. LLaMA: open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023).

5. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774, https://arxiv.org/pdf/2303.08774.pdf (2023).

6. Anil, R. et al. PaLM 2 technical report. arXiv preprint arXiv:2305.10403 (2023).

7. Turner, B. E. W. Epic, Microsoft bring GPT-4 to EHRs.

8. Landi, H. Microsoft's Nuance integrates OpenAI's GPT-4 into voice-enabled medical scribe software.

9. Hao, T. et al. Health natural language processing: methodology development and applications. JMIR Med. Inform. 9, e23898 (2021).

10. Pathak, J., Kho, A. N. & Denny, J. C. Electronic health records-driven phenotyping: challenges, recent advances, and perspectives. J. Am. Med. Inform. Assoc. 20, e206–e211 (2013).

11. Crichton, G. et al. A neural network multi-task learning approach to biomedical named entity recognition. BMC Bioinforma. 18, 368 (2017).

12. Wang, J. et al. Document-level biomedical relation extraction using graph convolutional network and multihead attention: algorithm development and validation. JMIR Med. Inform. 8, e17638 (2020).

13. Liu, Y. et al. RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019).

14. Rasmy, L. et al. Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. npj Digit. Med. 4, 86 (2021).

15. Wu, H. et al. A survey on clinical natural language processing in the United Kingdom from 2007 to 2022. npj Digit. Med. 5, 186 (2022).

16. Amin, M. B. et al. AJCC Cancer Staging Manual (Springer, 2017).

17. Goldstraw, P. et al. The IASLC Lung Cancer Staging Project: proposals for the revision of the TNM stage groupings in the forthcoming (seventh) edition of the TNM classification of malignant tumours. J. Thorac. Oncol. 2, 706–714 (2007).

18. Yang, D. M. et al. Osteosarcoma Explorer: a data commons with clinical, genomic, protein, and tissue imaging data for osteosarcoma research. JCO Clin. Cancer Inform. 7, e2300104 (2023).

19. The Lancet Digital Health. ChatGPT: friend or foe? Lancet Digit. Health 5, e102 (2023).

20. Will ChatGPT transform healthcare? Nat. Med. 29, 505–506 (2023).

21. Patel, S. B. & Lam, K. ChatGPT: the future of discharge summaries? Lancet Digit. Health 5, e107–e108 (2023).

22. Ali, S. R. et al. Using ChatGPT to write patient clinic letters. Lancet Digit. Health 5, e179–e181 (2023).

23. Howard, A., Hope, W. & Gerada, A. ChatGPT and antimicrobial advice: the end of the consulting infection doctor? Lancet Infect. Dis. 23, 405–406 (2023).

24. Mialon, G. et al. Augmented language models: a survey. arXiv preprint arXiv:2302.07842 (2023).

25. Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33 (2020).

26. Wei, J. et al. Chain-of-thought prompting elicits reasoning in large language models. Adv. Neural Inf. Process. Syst. 35, 24824–24837 (2022).

27. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55, 1–38 (2023).

28. Alkaissi, H. & McFarlane, S. I. Artificial hallucinations in ChatGPT: implications in scientific writing. Cureus (2023).

29. Manakul, P., Liusie, A. & Gales, M. J. F. SelfCheckGPT: zero-resource black-box hallucination detection for generative large language models (2023).

30. Boehm, B. W. A spiral model of software development and enhancement. Computer 21, 61–72 (1988).

31. OpenAI. OpenAI API documentation. https://platform.openai.com/docs/guides/text-generation.

32. Gao, J. et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci. Signal. 6, 1–19 (2013).


Acknowledgements

This work was partially supported by the National Institutes of Health [P50CA70907, R35GM136375, R01GM140012, R01GM141519, R01DE030656, U01CA249245, and U01AI169298], and the Cancer Prevention and Research Institute of Texas [RP230330 and RP180805].

Author information

Authors and affiliations

Quantitative Biomedical Research Center, Peter O’Donnell School of Public Health, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390, USA

Jingwei Huang, Donghan M. Yang, Ruichen Rong, Kuroush Nezafati, Colin Treager, Shidan Wang, Xian Cheng, Yujia Guo, Guanghua Xiao, Xiaowei Zhan & Yang Xie

Department of Pathology, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390, USA

Department of Pediatrics, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390, USA

Laura J. Klesse

Department of Internal Medicine, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390, USA

Eric D. Peterson


Contributions

J.H., Y.X., X.Z. and G.X. designed the study. X.Z., K.N., C.T. and J.H. prepared, labeled, and curated the lung cancer datasets. D.M.Y., X.C., Y.G. and L.J.K. prepared, labeled, and curated the osteosarcoma datasets. Z.C. provided critical input as a pathologist. Y.X., G.X. and E.P. provided critical input for the study. J.H. implemented the experiments with ChatGPT. R.R. and K.N. implemented the experiments with NLP methods. J.H., Y.X., G.X. and S.W. conducted the data analysis. Y.X., G.X., J.H., X.Z., D.M.Y. and R.R. wrote the manuscript. All co-authors read and commented on the manuscript.

Corresponding authors

Correspondence to Xiaowei Zhan or Yang Xie .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental figures and tables

Reporting summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article.

Huang, J., Yang, D.M., Rong, R. et al. A critical assessment of using ChatGPT for extracting structured data from clinical notes. npj Digit. Med. 7 , 106 (2024). https://doi.org/10.1038/s41746-024-01079-8


Received : 24 July 2023

Accepted : 14 March 2024

Published : 01 May 2024

DOI : https://doi.org/10.1038/s41746-024-01079-8



From Complexity to Clarity: ChatGPT's Potential to Increase Health Literacy

LLMs can provide accurate clinical information in lung and breast cancer screening patient education materials.

Paul Yi, MD

Among U.S. patients with low health literacy, cancer screening rates are lower than the national average. It is thought that low health literacy creates barriers to screening, such as not understanding exams and why they are important.

With the advent of ChatGPT in late 2022, researchers began studying the tool’s capability to simplify cancer screening and prevention information, making it more decipherable to patients and increasing the likelihood of adherence to screening programs.

Paul H. Yi, MD, assistant professor of diagnostic radiology and nuclear medicine at the University of Maryland School of Medicine (UMSOM), Baltimore, explained the links between health literacy and adherence to screening, and why current patient education materials create a barrier to improving health literacy.

“In the case of cancer, increased health literacy is associated with greater adherence to things like breast cancer screening. If patients are better informed, they’re more likely to make good health decisions,” he said. “The problem is that health educational materials are notoriously hard to read. They are written often at the 12th grade level or higher, making it hard for some patients to understand them.”

For lung cancer in particular, screening rates are abysmally low, said Jean Jeudy, MD, professor of diagnostic radiology and nuclear medicine at UMSOM. Dr. Jeudy hopes that with the help of large language models (LLMs) like ChatGPT, eligible patients can understand the importance of lung cancer screening and better adhere to screening programs, even if they face socioeconomic challenges that would prevent them getting to a screening.

“In some reports, only 5%-8% of the population eligible for lung cancer screening are getting screened. There are a lot of medical terms within patient education materials that patients may not understand, so we want to see if we can use LLMs to simplify the message and bring it to a broader audience,” he said. “Hopefully with a clearer understanding, patients can improve their overall decision making and increase the rates of screening that we’re seeing globally."

Direct Your Patients to RadiologyInfo.org for Patient-Friendly Information

Introduced in 2003 and co-produced by RSNA and the American College of Radiology (ACR), RadiologyInfo.org is home to patient-friendly information about more than 270 imaging procedures, exams and disease topics.

New additions to the site include videos on how to read a radiology report. Videos on reading abdominal/ pelvic CT and chest X-ray reports are available now. In addition, among the site’s most popular articles are those related to commonly performed procedures like abdominal and cardiac CT, head CT and CT angiography.

All RadiologyInfo.org content is also available in Spanish with 30 videos translated into Spanish. Content is reviewed regularly through a committee comprised of 16 multiinstitutional radiology professionals, with approximately 85 medical advisors who assist each year with writing and content review.

When your patients have questions and need more information, refer them to RadiologyInfo.org .

Accessible and Accurate Health Information in a Few Taps

Drs. Yi and Jeudy were part of research teams, including first author Hana Haver, MD, breast imaging fellow at Massachusetts General Hospital and former radiology resident at the University of Maryland, that investigated ChatGPT’s ability to simplify screening and prevention information on lung and breast cancers. In the lung cancer study, the researchers formulated 25 common patient questions about lung cancer screening, and then entered them into ChatGPT on three separate occasions. They categorized the responses as appropriate, inappropriate or inconsistent. ChatGPT produced responses that were appropriate 84% of the time.

“Because of the nature of ChatGPT, it can give you different responses at different times. So, we wanted to verify its fidelity to its responses,” Dr. Jeudy explained. “We found that 21 out of 25 of those questions always elicited appropriate responses. There were two which generated inappropriate responses, and two which we thought were inconsistent.”

The breast cancer study, also first-authored by Dr. Haver, was a follow-up to a previous study, similar to the lung cancer study—asking ChatGPT for answers to 25 commonly asked patient questions about breast cancer and screening. In the follow-up, the researchers asked ChatGPT to lower the reading level of each of the 25 responses to a sixth-grade level. They repeated this prompt three times for each response, to see if the responses could be further refined.

“The responses in the original study, although 88%-90% accurate and appropriate, were written on average at grade 13 level, or first year in college, while the average U.S. adult reads at a sixth grade level,” Dr. Yi said. “When we asked ChatGPT to rewrite these responses at a sixth-grade reading level, there were statistically significant decreases to about eighth or ninth grade level, and 92% of them were clinically appropriate.”

Dr. Yi added that the response to a question about breast density, often a confusing topic for patients, was successfully simplified to a ninth-grade level.

“The original response for that particular question was approximately college level understanding. When it was rewritten, it went down to about the ninth-grade level,” Dr. Yi said. “I think this is notable because breast density is a potentially challenging and confusing topic. It’s encouraging that the same improvements that we saw overall held for this important topic.”

Physician Oversight Needed in Patient Use of Large Language Models

Drs. Haver, Yi and Jeudy agree that the technology is not yet capable of simplifying cancer prevention and screening information for patients without physician oversight.

Dr. Yi noted that “hallucinations,” a common defect of LLMs, can result in an LLM providing incorrect information.

“In our research, there’s still a 10% inaccuracy rate in this model’s responses, so I would submit that a 10% margin of error is not good enough to allow LLMs to run autonomously,” Dr. Yi said. “Any amount of misinformation can have significant consequences for patients.” Dr. Yi also explained that the ability of any LLM to provide correct information depends heavily on its data training set.

Given that health guidelines change, an LLM that hasn’t updated its training set may provide outdated information.

“LLMs are limited by the training data used, meaning the model may have only learned or been exposed to data up to a certain date. Because health care information changes, there’s a chance that an LLM may have outdated information,” Dr. Yi said.

Dr. Jeudy emphasized that regardless of the increasing ability of a LLM to provide clinically accurate, up-to-date information, it cannot synthesize that information for use in individual patient cases.

“Even when correct information is given to patients, they still need physicians to personalize the information according to their medical history and to address their concerns,” he said. “Although LLMs can accentuate medical information for patients, physicians are still going to be needed to tailor the information to individual cases.” 

For More Information

Access the Radiology: Cardiothoracic Imaging study, “ Evaluating ChatGPT’s Accuracy in Lung Cancer Prevention and Screening Recommendations .”

Access the Radiology: Imaging Cancer study, “ Evaluating the Use of ChatGPT to Accurately Simplify Patient-centered Information about Breast Cancer Prevention and Screening .”

Read previous  RSNA News  articles on uses for ChatGPT:

  • Demystifying Radiology Reports with ChatGPT
  • ChatGPT Passes Radiology Board Exam
  • Preparing Radiology Trainees for AI and ChatGPT


  13. The role of ChatGPT in scientific communication: writing better

    Using artificial intelligence tools like ChatGPT can greatly improve writing review articles for scientists, by enhancing efficiency and quality. ChatGPT speeds up writing, develops outlines, adds details, and helps improve writing style. However, ChatGPT's limitations must be kept in mind, and generated text must be reviewed and edited to ...

  14. ChatGPT and Academic Research: A Review and Recommendations Based on

    tasks, including essay writing, different formal and inf ormal speech writing, summarising literature, and generating ideas. However , yet, it is a controversial issue to use ChatGPT in academic

  15. The impact and applications of ChatGPT: a systematic review of

    This rapid review of the literature aims to enrich our understanding of ChatGPT's capabilities across subject domains, how it can be used in education, and potential issues raised by researchers ...

  16. Research Guides: Literature Reviews: ChatGPT [beta]

    ChatGPT [beta] - Literature Reviews - Research Guides at University of Michigan Library. ChatGPT — Conversational Large Language Model by OpenAI — Potential applications for teaching, learning and doing literature reviews. https://chat.openai.com. The knowledge cutoff for the ChatGPT 3.5 is September 2021 and it has no access to the Internet.

  17. How to Use ChatGPT to Write a Literature Review With Prompts

    Note: The literature review that you would get is plagiarism-free. This is another benefit of using chatgpt to write literature review. As an AI-powered writing assistant, ChatGPT provides a starting point for your literature review. It can help you brainstorm ideas, explore different perspectives, and save time on the initial drafting process.

  18. Step-by-Step Guide: How to Use ChatGPT to Write a Literature Review

    Using ChatGPT to write a literature review can greatly facilitate the research process. By following a step-by-step approach, researchers can effectively leverage ChatGPT's capabilities to gather insights, generate content, and enhance the quality of their literature review. However, it is important to approach the generated content ...

  19. ChatGPT in higher education

    ChatGPT has emerged as a significant subject of research and exploration, casting a critical spotlight on teaching and learning practices in the higher education domain. This study examines the most influential articles, leading journals, and productive countries concerning citations and publications related to ChatGPT in higher education, while also shedding light on emerging thematic and ...

  20. Can ChatGPT Write a Good Boolean Query for Systematic Review Literature

    Systematic reviews are comprehensive reviews of the literature for a highly focused research question. These reviews are often treated as the highest form of evidence in evidence-based medicine, and are the key strategy to answer research questions in the medical field. To create a high-quality systematic review, complex Boolean queries are often constructed to retrieve studies for the review ...

  21. Using ChatGPT

    ChatGPT is a generative large language model which has gained popularity since late 2021. The name stands for Chat Generative Pre-trained Transformer. GPT models are programmed to mimic the look and feel of engaging in a realistic conversation with another human being; however, bias and inaccuracy remain a considerable risk ( Kocoń, J., 2023).

  22. ChatGPT

    Engaging, friendly guide for academic literature reviews using research question or topic

  23. A critical assessment of using ChatGPT for extracting ...

    Inference and Interpretation. To understand how ChatGPT reads and makes inferences from pathology reports, we demonstrated a case study using a typical pathology report in this cohort (TCGA-98 ...

  24. (PDF) OpenAI ChatGPT Generated Literature Review: Digital Twin in

    We used ChatGPT to create a literature review article. to show the stage of the OpenAI ChatGPT artificial intelligence. application. As the subject, the applications of Digital Twin in. the health ...

  25. 10 Applications of ChatGPT and Generative AI for Research

    The literature review is a cornerstone of research projects, demanding an extensive exploration of existing academic papers. ChatGPT revolutionizes this by summarizing and distilling critical insights from vast volumes of literature. Researchers can now save valuable time and effort as AI generates concise summaries and identifies pertinent sources, expediting the initial stages of their research.

  26. How ChatGPT Can Make Writing Easier

    Generating content. As soon as ChatGPT came out, students started using the technology to do their written assignments which led to charges of cheating as this article by Michael Nietzel indicates ...

  27. The Impact of Artificial Intelligence Tools on Academic Writing ...

    The researcher reviewed existing literature on integrating AI tools into academic writing instruction. The findings provide insights to educators navigating the integration of Artificial Intelligence tools into writing curricula while maintaining instructional quality and academic integrity standards.

  28. (PDF) Literature Review on Pros and Cons of ChatGPT ...

    Objectives: (a) To conceptualise what is ChatGPT and how it. impacts our learning culture. (b) To review from various literature the pros and cons of. using ChatGPT. (c) To analyse the facts from ...

  29. Detecting ChatGPT-Generated Essays in a Large-Scale Writing Assessment

    2.3.Linguistic Features. For each human-written or ChatGPT-generated essay, we used e-rater® (Attali & Burstein, 2006) to generate linguistic features to represent the essays.E-rater® is an automated scoring engine that uses Natural Language Processing (NLP) techniques to automatically extract linguistically-motivated features of writing, which can be used to evaluate writing quality and ...

  30. ChatGPT's Ability to Increase Health Literacy

    Explore programs in grant writing, research development and academic radiology. Research. R&E Foundation ... With the advent of ChatGPT in late 2022, researchers began studying the tool's capability to simplify cancer screening and prevention information, making it more decipherable to patients and increasing the likelihood of adherence to ...