
Computer Science > Artificial Intelligence

Title: Character is Destiny: Can Large Language Models Simulate Persona-Driven Decisions in Role-Playing?

Abstract: Can Large Language Models substitute humans in making important decisions? Recent research has unveiled the potential of LLMs to role-play assigned personas, mimicking their knowledge and linguistic habits. However, imitative decision-making requires a more nuanced understanding of personas. In this paper, we benchmark the ability of LLMs in persona-driven decision-making. Specifically, we investigate whether LLMs can predict characters' decisions provided with the preceding stories in high-quality novels. Leveraging character analyses written by literary experts, we construct a dataset LIFECHOICE comprising 1,401 character decision points from 395 books. Then, we conduct comprehensive experiments on LIFECHOICE, with various LLMs and methods for LLM role-playing. The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet there is substantial room for improvement. Hence, we further propose the CHARMAP method, which achieves a 6.01% increase in accuracy via persona-based memory retrieval. We will make our datasets and code publicly available.
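The abstract credits CHARMAP's accuracy gain to persona-based memory retrieval. The paper's actual method is not detailed here, but the retrieval idea can be sketched: score passages of the preceding story by their relevance to the character and the pending decision, and keep the top few as the persona "memory" given to the model. The scoring heuristic, chunk texts, and function name below are illustrative assumptions, not the paper's implementation.

```python
def retrieve_persona_memory(chunks, character, query, k=2):
    """Rank story chunks by relevance to a character and a decision query.

    Toy heuristic: mentions of the character count double, plus word
    overlap with the query. A stand-in for real persona-based retrieval.
    """
    query_words = set(query.lower().split())

    def score(chunk):
        words = chunk.lower().split()
        mentions = words.count(character.lower())
        overlap = len(query_words & set(words))
        return mentions * 2 + overlap

    return sorted(chunks, key=score, reverse=True)[:k]

chunks = [
    "Pip visits Miss Havisham and meets Estella.",
    "Joe works quietly at the forge all day.",
    "Pip decides whether to accept the mysterious fortune.",
]
memory = retrieve_persona_memory(chunks, "Pip", "accept the fortune", k=2)
```

The retrieved `memory` would then be prepended to the role-playing prompt before asking the model to predict the character's decision.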



AI Index Report

Welcome to the seventh edition of the AI Index report. The 2024 Index is our most comprehensive to date and arrives at an important moment when AI’s influence on society has never been more pronounced. This year, we have broadened our scope to more extensively cover essential trends such as technical advancements in AI, public perceptions of the technology, and the geopolitical dynamics surrounding its development. Featuring more original data than ever before, this edition introduces new estimates on AI training costs, detailed analyses of the responsible AI landscape, and an entirely new chapter dedicated to AI’s impact on science and medicine.

Read the 2024 AI Index Report

The AI Index report tracks, collates, distills, and visualizes data related to artificial intelligence (AI). Our mission is to provide unbiased, rigorously vetted, broadly sourced data in order for policymakers, researchers, executives, journalists, and the general public to develop a more thorough and nuanced understanding of the complex field of AI.

The AI Index is recognized globally as one of the most credible and authoritative sources for data and insights on artificial intelligence. Previous editions have been cited in major newspapers, including The New York Times, Bloomberg, and The Guardian; have amassed hundreds of academic citations; and have been referenced by high-level policymakers in the United States, the United Kingdom, and the European Union, among other places. This year’s edition surpasses all previous ones in size, scale, and scope, reflecting the growing significance that AI is coming to hold in all of our lives.

Steering Committee Co-Directors

Jack Clark

Ray Perrault

Steering Committee Members

Erik Brynjolfsson

John Etchemendy

Katrina Ligett

Terah Lyons

James Manyika

Juan Carlos Niebles

Vanessa Parli

Yoav Shoham

Russell Wald

Staff Members

Loredana Fattorini

Nestor Maslej

Letter from the Co-Directors

A decade ago, the best AI systems in the world were unable to classify objects in images at a human level. AI struggled with language comprehension and could not solve math problems. Today, AI systems routinely exceed human performance on standard benchmarks.

Progress accelerated in 2023. New state-of-the-art systems like GPT-4, Gemini, and Claude 3 are impressively multimodal: They can generate fluent text in dozens of languages, process audio, and even explain memes. As AI has improved, it has increasingly forced its way into our lives. Companies are racing to build AI-based products, and AI is increasingly being used by the general public. But current AI technology still has significant problems. It cannot reliably deal with facts, perform complex reasoning, or explain its conclusions.

AI faces two interrelated futures. First, technology continues to improve and is increasingly used, having major consequences for productivity and employment. It can be put to both good and bad uses. In the second future, the adoption of AI is constrained by the limitations of the technology. Regardless of which future unfolds, governments are increasingly concerned. They are stepping in to encourage the upside, such as funding university R&D and incentivizing private investment. Governments are also aiming to manage the potential downsides, such as impacts on employment, privacy concerns, misinformation, and intellectual property rights.

As AI rapidly evolves, the AI Index aims to help the AI community, policymakers, business leaders, journalists, and the general public navigate this complex landscape. It provides ongoing, objective snapshots tracking several key areas: technical progress in AI capabilities, the community and investments driving AI development and deployment, public opinion on current and potential future impacts, and policy measures taken to stimulate AI innovation while managing its risks and challenges. By comprehensively monitoring the AI ecosystem, the Index serves as an important resource for understanding this transformative technological force.

On the technical front, this year’s AI Index reports that the number of new large language models released worldwide in 2023 doubled over the previous year. Two-thirds were open-source, but the highest-performing models came from industry players with closed systems. Gemini Ultra became the first LLM to reach human-level performance on the Massive Multitask Language Understanding (MMLU) benchmark; performance on the benchmark has improved by 15 percentage points since last year. Additionally, GPT-4 achieved an impressive 0.97 mean win rate score on the comprehensive Holistic Evaluation of Language Models (HELM) benchmark, which includes MMLU among other evaluations.

Although global private investment in AI decreased for the second consecutive year, investment in generative AI skyrocketed. More Fortune 500 earnings calls mentioned AI than ever before, and new studies show that AI tangibly boosts worker productivity. On the policymaking front, global mentions of AI in legislative proceedings have never been higher. U.S. regulators passed more AI-related regulations in 2023 than ever before. Still, many expressed concerns about AI’s ability to generate deepfakes and impact elections. The public became more aware of AI, and studies suggest that they responded with nervousness.

Ray Perrault, Co-Director, AI Index

Our Supporting Partners


Analytics & Research Partners


Stay up to date on the AI Index by subscribing to the  Stanford HAI newsletter.


The Top 17 ‘Must-Read’ AI Papers in 2022


We caught up with experts in the RE•WORK community to find out the top 17 AI papers of 2022 so far, which you can add to your summer must-reads. The papers cover a wide range of topics, including AI in social media and how AI can benefit humanity, and are free to access.

Interested in learning more? Check out all the upcoming RE•WORK events to find out about the latest trends and industry updates in AI here .

Max Li, Staff Data Scientist – Tech Lead at Wish

Max is a Staff Data Scientist at Wish where he focuses on experimentation (A/B testing) and machine learning.  His passion is to empower data-driven decision-making through the rigorous use of data. View Max’s presentation, ‘Assign Experiment Variants at Scale in A/B Tests’, from our Deep Learning Summit in February 2022 here .

1. Bootstrapped Meta-Learning (2022) – Sebastian Flennerhag et al.

The first paper selected by Max proposes an algorithm that allows the meta-learner to teach itself, overcoming the meta-optimisation challenge. The algorithm focuses on meta-learning with gradients, which guarantees performance improvements. The paper also looks at the possibilities that bootstrapping opens up. Read the full paper here.

2. Multi-Objective Bayesian Optimization over High-Dimensional Search Spaces (2022) – Samuel Daulton et al.

Another paper selected by Max proposes MORBO, a scalable method for multi-objective Bayesian optimisation (BO) over high-dimensional search spaces. Where current BO algorithms struggle, MORBO significantly improves sample efficiency. Read the full paper here.

3. Tabular Data: Deep Learning is Not All You Need (2021) – Ravid Shwartz-Ziv, Amitai Armon

To solve real-life data science problems, selecting the right model to use is crucial. This final paper selected by Max explores whether deep models should be recommended as an option for tabular data. Read the full paper here .


Jigyasa Grover, Senior Machine Learning Engineer at Twitter

Jigyasa Grover is a Senior Machine Learning Engineer at Twitter working in the performance ads ranking domain. Recently, she was honoured with the 'Outstanding in AI: Young Role Model Award' by Women in AI across North America. She is one of the few ML Google Developer Experts globally. Jigyasa previously presented at our Deep Learning Summit and MLOps event in San Francisco earlier this year.

4. Privacy for Free: How does Dataset Condensation Help Privacy? (2022) – Tian Dong et al.

Jigyasa’s first recommendation concentrates on privacy-preserving machine learning, specifically mitigating the leakage of sensitive data in machine learning. The paper provides one of the first propositions of using dataset condensation techniques to preserve data efficiency during model training and furnish membership privacy. It was published by Sony AI and won the Outstanding Paper Award at ICML 2022. Read the full paper here.

5. Affective Signals in a Social Media Recommender System (2022) – Jane Dwivedi-Yu et al.

The second paper recommended by Jigyasa talks about operationalising Affective Computing, also known as Emotional AI, for an improved personalised feed on social media. The paper discusses the design of an affective taxonomy customised to user needs on social media. It further lays out the curation of suitable training data by combining engagement data and data from a human-labelling task to enable the identification of the affective response a user might exhibit for a particular post. Read the full paper here .

6. ItemSage: Learning Product Embeddings for Shopping Recommendations at Pinterest (2022) – Paul Baltescu et al.

Jigyasa’s last recommendation is a paper by Pinterest that illustrates the aggregation of both textual and visual information to build a unified set of product embeddings to enhance recommendation results on e-commerce websites. By applying multi-task learning, the proposed embeddings can optimise for multiple engagement types and ensures that the shopping recommendation stack is efficient with respect to all objectives. Read the full article here .

Asmita Poddar, Software Development Engineer at Amazon Alexa

Asmita is a Software Development Engineer at Amazon Alexa, where she works on developing and productionising natural language processing and speech models. Asmita also has prior experience in applying machine learning in diverse domains. Asmita will be presenting at our London AI Summit , in September, where she will discuss AI for Spoken Communication.

7. Competition-Level Code Generation with AlphaCode (2022) – Yujia Li et al.

Code-generation systems can help programmers become more productive. Asmita selected this paper, which addresses the problems of incorporating AI innovations into such systems. AlphaCode is a system that creates solutions for problems that require deeper reasoning. Read the full paper here.

8. A Commonsense Knowledge Enhanced Network with Retrospective Loss for Emotion Recognition in Spoken Dialog (2022) – Yunhe Xie et al.

Existing emotion recognition in spoken dialog (ERSD) datasets limit models’ reasoning. The final paper selected by Asmita proposes a Commonsense Knowledge Enhanced Network with a retrospective loss to perform dialog modelling, external knowledge integration and historical state retrospect. The model has been shown to outperform other models. Read the full paper here.


Discover the speakers we have lined up and the topics we will cover at the London AI Summit.

Sergei Bobrovskyi, Expert in Anomaly Detection for Root Cause Analysis at Airbus

Dr. Sergei Bobrovskyi is a Data Scientist within the Analytics Accelerator team of the Airbus Digital Transformation Office. His work focuses on applications of AI for anomaly detection in time series, spanning various use-cases across Airbus. Sergei will be presenting at our Berlin AI Summit in October about Anomaly Detection, Root Cause Analysis and Explainability.

9. LaMDA: Language Models for Dialog Applications (2022) – Romal Thoppilan et al.

The paper chosen by Sergei describes the LaMDA system, which caused a furor this summer when a former Google engineer claimed it had shown signs of being sentient. LaMDA is a family of large language models for dialog applications based on the Transformer architecture. An interesting feature of these models is their fine-tuning with human-annotated data and their ability to consult external sources. In any case, this is a very interesting model family, which we might encounter in many of the applications we use daily. Read the full paper here.

10. A Path Towards Autonomous Machine Intelligence Version 0.9.2, 2022-06-27 (2022) – Yann LeCun

The second paper chosen by Sergei provides a vision of how to progress towards general AI. The study combines a number of concepts, including a configurable predictive world model, behaviour driven by intrinsic motivation, and hierarchical joint embedding architectures. Read the full paper here.

11. Coordination Among Neural Modules Through a Shared Global Workspace (2022) – Anirudh Goyal et al.

This paper chosen by Sergei combines the Transformer architecture underlying most of the recent successes of deep learning with ideas from the Global Workspace Theory from cognitive sciences. This is an interesting read to broaden the understanding of why certain model architectures perform well and in which direction we might go in the future to further improve performance on challenging tasks. Read the full paper here .

12. Magnetic control of tokamak plasmas through deep reinforcement learning (2022) – Jonas Degrave et al.

Sergei chose the next paper, which asks the question ‘how can AI research benefit humanity?’. Using AI to enable the safe, reliable and scalable deployment of fusion energy could contribute to solving the pressing problem of climate change. Sergei has said that this is an extremely interesting application of AI technology to engineering. Read the full paper here.

13. TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data (2022) – Shreshth Tuli, Giuliano Casale and Nicholas R. Jennings

The final paper chosen by Sergei is a specialised paper applying the transformer architecture to unsupervised anomaly detection in multivariate time series. Many architectures that succeed in other fields are eventually applied to time series as well. The paper shows improved performance on several well-known datasets. Read the full paper here.


Abdullahi Adamu, Senior Software Engineer at Sony

Abdullahi has worked in various industries including working at a market research start-up where he developed models that could extract insights from human conversations about products or services. He moved to Publicis, where he became Data Engineer and Data Scientist in 2018. Abdullahi will be part of our panel discussion at the London AI Summit in September, where he will discuss Harnessing the Power of Deep Learning.

14. Self-Supervision for Learning from the Bottom Up (2022) – Alexei Efros

This paper chosen by Abdullahi makes compelling arguments for why self-supervision is the next step in the evolution of AI/ML, and for why self-supervised learning matters on the journey towards more robust models that generalise better in the wild. Read the full paper here.

15. Neural Architecture Search Survey: A Hardware Perspective (2022) – Krishna Teja Chitty-Venkata and Arun K. Somani

Another paper chosen by Abdullahi recognises that, as we move towards edge computing and federated learning, neural architecture search that takes hardware constraints into account will become more critical in ensuring we have leaner neural network models that balance latency and generalisation performance. This survey gives a bird’s-eye view of the various neural architecture search algorithms that account for hardware constraints when designing artificial neural networks with the best trade-off of performance and accuracy. Read the full paper here.

16. What Should Not Be Contrastive In Contrastive Learning (2021) – Tete Xiao et al.

The paper chosen by Abdullahi highlights the underlying assumptions behind data augmentation methods and how these can be counterproductive in the context of contrastive learning; for example, colour augmentation when a downstream task is meant to differentiate the colours of objects. The reported results are promising in the wild. Overall, it presents an elegant solution to using data augmentation for contrastive learning. Read the full paper here.

17. Why do tree-based models still outperform deep learning on tabular data? (2022) – Leo Grinsztajn, Edouard Oyallon and Gael Varoquaux

The final paper selected by Abdullahi works on answering the question of why deep learning models still find it hard to compete with tree-based models on tabular data. It shows that MLP-like architectures are more sensitive to uninformative features in data than their tree-based counterparts. Read the full paper here.

Sign up to the RE•WORK monthly newsletter for the latest AI news, trends and events.

Join us at our upcoming events this year:

·       London AI Summit – 14-15 September 2022

·       Berlin AI Summit – 4-5 October 2022

·       AI in Healthcare Summit Boston – 13-14 October 2022

·       Sydney Deep Learning and Enterprise AI Summits – 17-18 October 2022

·       MLOps Summit – 9-10 November 2022

·       Toronto AI Summit – 9-10 November 2022

·       Nordics AI Summit - 7-8 December 2022

Suggestions or feedback?

MIT News | Massachusetts Institute of Technology


To build a better AI helper, start by modeling the irrational behavior of humans


To build AI systems that can collaborate effectively with humans, it helps to have a good model of human behavior to start with. But humans tend to behave suboptimally when making decisions.

This irrationality, which is especially difficult to model, often boils down to computational constraints. A human can’t spend decades thinking about the ideal solution to a single problem.

Researchers at MIT and the University of Washington developed a way to model the behavior of an agent, whether human or machine, that accounts for the unknown computational constraints that may hamper the agent’s problem-solving abilities.

Their model can automatically infer an agent’s computational constraints by seeing just a few traces of their previous actions. The result, an agent’s so-called “inference budget,” can be used to predict that agent’s future behavior.

In a new paper, the researchers demonstrate how their method can be used to infer someone’s navigation goals from prior routes and to predict players’ subsequent moves in chess matches. Their technique matches or outperforms another popular method for modeling this type of decision-making.

Ultimately, this work could help scientists teach AI systems how humans behave, which could enable these systems to respond better to their human collaborators. Being able to understand a human’s behavior, and then to infer their goals from that behavior, could make an AI assistant much more useful, says Athul Paul Jacob, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on this technique.

“If we know that a human is about to make a mistake, having seen how they have behaved before, the AI agent could step in and offer a better way to do it. Or the agent could adapt to the weaknesses that its human collaborators have. Being able to model human behavior is an important step toward building an AI agent that can actually help that human,” he says.

Jacob wrote the paper with Abhishek Gupta, assistant professor at the University of Washington, and senior author Jacob Andreas, associate professor in EECS and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the International Conference on Learning Representations.

Modeling behavior

Researchers have been building computational models of human behavior for decades. Many prior approaches try to account for suboptimal decision-making by adding noise to the model. Instead of the agent always choosing the correct option, the model might have that agent make the correct choice 95 percent of the time.
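That classic noise model can be sketched in a few lines; the 95 percent figure comes from the article, while the uniform choice among wrong actions and the function name are illustrative assumptions:

```python
import random

def noisy_choice(best_action, actions, p_correct=0.95, rng=None):
    """Classic noise model: pick the optimal action with probability
    p_correct, otherwise pick uniformly among the remaining actions."""
    rng = rng or random.Random(0)
    if rng.random() < p_correct:
        return best_action
    others = [a for a in actions if a != best_action]
    return rng.choice(others)

# Over many trials the agent errs roughly 5 percent of the time, but
# always in the same undifferentiated way, regardless of the problem.
rng = random.Random(42)
trials = [noisy_choice("a", ["a", "b", "c"], rng=rng) for _ in range(10000)]
error_rate = sum(t != "a" for t in trials) / len(trials)
```

Because the error probability is a single fixed constant, this model cannot express that people err more on hard problems than on easy ones, which is the shortcoming the next paragraph describes.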

However, these methods can fail to capture the fact that humans do not always behave suboptimally in the same way.

Others at MIT have also studied more effective ways to plan and infer goals in the face of suboptimal decision-making.

To build their model, Jacob and his collaborators drew inspiration from prior studies of chess players. They noticed that players took less time to think before acting when making simple moves and that stronger players tended to spend more time planning than weaker ones in challenging matches.

“At the end of the day, we saw that the depth of the planning, or how long someone thinks about the problem, is a really good proxy of how humans behave,” Jacob says.

They built a framework that could infer an agent’s depth of planning from prior actions and use that information to model the agent’s decision-making process.

The first step in their method involves running an algorithm for a set amount of time to solve the problem being studied. For instance, if they are studying a chess match, they might let the chess-playing algorithm run for a certain number of steps. At the end, the researchers can see the decisions the algorithm made at each step.

Their model compares these decisions to the behaviors of an agent solving the same problem. It will align the agent’s decisions with the algorithm’s decisions and identify the step where the agent stopped planning.

From this, the model can determine the agent’s inference budget, or how long that agent will plan for this problem. It can use the inference budget to predict how that agent would react when solving a similar problem.
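A minimal sketch of that alignment step, under the simplifying assumption that the inference budget is the deepest planning step whose decision matches the agent's observed choice (the paper's actual estimator is more sophisticated):

```python
def infer_budget(planner_decisions, agent_decision):
    """planner_decisions[i] is the decision a planning algorithm would
    make after i+1 steps of planning. Return the deepest step at which
    the planner's decision matches the agent's observed decision -- a
    crude proxy for the agent's inference budget. Returns -1 if the
    agent's choice never appears in the planner's trajectory."""
    budget = -1
    for step, decision in enumerate(planner_decisions):
        if decision == agent_decision:
            budget = step
    return budget

# A chess planner that refines its move with more steps of search:
planner = ["e4", "d4", "Nf3", "Nf3", "c4"]
budget = infer_budget(planner, "Nf3")  # the agent's move matches steps 2 and 3
```

Averaging such estimates over many observed decisions gives a per-agent budget that can then be used to predict behavior on similar problems.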

An interpretable solution

This method can be very efficient because the researchers can access the full set of decisions made by the problem-solving algorithm without doing any extra work. This framework could also be applied to any problem that can be solved with a particular class of algorithms.

“For me, the most striking thing was the fact that this inference budget is very interpretable. It is saying tougher problems require more planning or being a strong player means planning for longer. When we first set out to do this, we didn’t think that our algorithm would be able to pick up on those behaviors naturally,” Jacob says.

The researchers tested their approach in three different modeling tasks: inferring navigation goals from previous routes, guessing someone’s communicative intent from their verbal cues, and predicting subsequent moves in human-human chess matches.

Their method either matched or outperformed a popular alternative in each experiment. Moreover, the researchers saw that their model of human behavior matched up well with measures of player skill (in chess matches) and task difficulty.

Moving forward, the researchers want to use this approach to model the planning process in other domains, such as reinforcement learning (a trial-and-error method commonly used in robotics). In the long run, they intend to keep building on this work toward the larger goal of developing more effective AI collaborators.

This work was supported, in part, by the MIT Schwarzman College of Computing Artificial Intelligence for Augmentation and Productivity program and the National Science Foundation.



Academia Insider

The best AI tools for research papers and academic research (Literature review, grants, PDFs and more)

As our collective understanding and application of artificial intelligence (AI) continues to evolve, so too does the realm of academic research. Some people are scared by it while others are openly embracing the change. 

Make no mistake, AI is here to stay!

Instead of tirelessly scrolling through hundreds of PDFs, a powerful AI tool comes to your rescue, summarizing key information in your research papers. Instead of manually combing through citations and conducting literature reviews, an AI research assistant proficiently handles these tasks.

These aren’t futuristic dreams, but today’s reality. Welcome to the transformative world of AI-powered research tools!

This blog post will dive deeper into these tools, providing a detailed review of how AI is revolutionizing academic research. We’ll look at the tools that can make your literature review process less tedious, your search for relevant papers more precise, and your overall research process more efficient and fruitful.

I know that I wish these were around during my time in academia. It can be quite confronting trying to work out which ones you should and shouldn’t use. A new one seems to come out every day!

Here is everything you need to know about AI for academic research and the ones I have personally trialed on my YouTube channel.

My Top AI Tools for Researchers and Academics – Tested and Reviewed!

There are many different tools now available on the market but there are only a handful that are specifically designed with researchers and academics as their primary user.

These are my recommendations that’ll cover almost everything that you’ll want to do:

Want to find out all of the tools that you could use?

Here they are, below:

AI literature search and mapping – best AI tools for a literature review – elicit and more

Harnessing AI tools for literature reviews and mapping brings a new level of efficiency and precision to academic research. No longer do you have to spend hours looking in obscure research databases to find what you need!

AI-powered tools like Semantic Scholar and elicit.org use sophisticated search engines to quickly identify relevant papers.

They can mine key information from countless PDFs, drastically reducing research time. You can even search with semantic questions, rather than having to deal with keywords.

With AI as your research assistant, you can navigate the vast sea of scientific research with ease, uncovering citations and focusing on academic writing. It’s a revolutionary way to take on literature reviews.

  • Elicit – https://elicit.org
  • Litmaps – https://www.litmaps.com
  • ResearchRabbit – https://www.researchrabbit.ai/
  • Connected Papers – https://www.connectedpapers.com/
  • Supersymmetry.ai – https://www.supersymmetry.ai
  • Semantic Scholar – https://www.semanticscholar.org
  • Laser AI – https://laser.ai/
  • Inciteful – https://inciteful.xyz/
  • Scite – https://scite.ai/
  • System – https://www.system.com

If you like AI tools you may want to check out this article:

  • How to get ChatGPT to write an essay [The prompts you need]

AI-powered research tools and AI for academic research

AI research tools, like Consensus, offer immense benefits in scientific research. Here are the general AI-powered tools for academic research.

These AI-powered tools can efficiently summarize PDFs, extract key information, perform AI-powered searches, and much more. Some are even working towards letting you add your own database of files to ask questions of.

Tools like scite even analyze citations in depth, while AI models like ChatGPT elicit new perspectives.

The result? The research process, previously a grueling endeavor, becomes significantly streamlined, offering you time for deeper exploration and understanding. Say goodbye to traditional struggles, and hello to your new AI research assistant!

  • Consensus –  https://consensus.app/
  • Iris AI –  https://iris.ai/
  • Research Buddy –  https://researchbuddy.app/
  • Mirror Think – https://mirrorthink.ai

AI for reading peer-reviewed papers easily

Using AI tools like Explain Paper and Humata can significantly enhance your engagement with peer-reviewed papers. I always used to skip over the details of papers because I had reached saturation point with the information coming in.

These AI-powered research tools provide succinct summaries, saving you from sifting through extensive PDFs – no more boring nights trying to figure out which papers are the most important ones for you to read!

They not only facilitate efficient literature reviews by presenting key information, but also find overlooked insights.

With AI, deciphering complex citations and accelerating research has never been easier.

  • Aetherbrain – https://aetherbrain.ai
  • Explain Paper – https://www.explainpaper.com
  • Chat PDF – https://www.chatpdf.com
  • Humata – https://www.humata.ai/
  • Lateral AI –  https://www.lateral.io/
  • Paper Brain –  https://www.paperbrain.study/
  • Scholarcy – https://www.scholarcy.com/
  • SciSpace Copilot –  https://typeset.io/
  • Unriddle – https://www.unriddle.ai/
  • Sharly.ai – https://www.sharly.ai/
  • Open Read –  https://www.openread.academy

AI for scientific writing and research papers

In the ever-evolving realm of academic research, AI tools are increasingly taking center stage.

Enter Paper Wizard, Jenny.AI, and Wisio – these groundbreaking platforms are set to revolutionize the way we approach scientific writing.

Together, these AI tools are pioneering a new era of efficient, streamlined scientific writing.

  • Jenny.AI – https://jenni.ai/ (20% off with code ANDY20)
  • Yomu – https://www.yomu.ai
  • Wisio – https://www.wisio.app

AI academic editing tools

In the realm of scientific writing and editing, artificial intelligence (AI) tools are making a world of difference, offering precision and efficiency like never before. Consider tools such as PaperPal, Writefull, and Trinka.

Together, these tools usher in a new era of scientific writing, where AI is your dedicated partner in the quest for impeccable composition.

  • PaperPal –  https://paperpal.com/
  • Writefull –  https://www.writefull.com/
  • Trinka –  https://www.trinka.ai/

AI tools for grant writing

In the challenging realm of science grant writing, two innovative AI tools are making waves: Granted AI and Grantable.

These platforms are game-changers, leveraging the power of artificial intelligence to streamline and enhance the grant application process.

Granted AI, an intelligent tool, uses AI algorithms to simplify the process of finding, applying, and managing grants. Meanwhile, Grantable offers a platform that automates and organizes grant application processes, making it easier than ever to secure funding.

Together, these tools are transforming the way we approach grant writing, using the power of AI to turn a complex, often arduous task into a more manageable, efficient, and successful endeavor.

  • Granted AI – https://grantedai.com/
  • Grantable – https://grantable.co/

Best free AI research tools

There are many different tools emerging online to help researchers streamline their research processes. There's no need for convenience to come at a massive cost and break the bank.

The best free ones at the time of writing are:

  • Elicit – https://elicit.org
  • Connected Papers – https://www.connectedpapers.com/
  • Litmaps – https://www.litmaps.com (10% off a Pro subscription with the code “STAPLETON”)
  • Consensus – https://consensus.app/

Wrapping up

The integration of artificial intelligence in the world of academic research is nothing short of revolutionary.

With the array of AI tools we've explored today – spanning literature search and mapping, literature review, reading peer-reviewed papers, scientific writing, academic editing, and grant writing – the landscape of research is significantly transformed.

The advantages that AI-powered research tools bring to the table – efficiency, precision, time saving, and a more streamlined process – cannot be overstated.

These AI research tools aren’t just about convenience; they are transforming the way we conduct and comprehend research.

They liberate researchers from the clutches of tedium and overwhelm, allowing for more space for deep exploration, innovative thinking, and in-depth comprehension.

Whether you’re an experienced academic researcher or a student just starting out, these tools provide indispensable aid in your research journey.

And with a suite of free AI tools also available, there is no reason not to explore and embrace this AI revolution in academic research.

We are on the precipice of a new era of academic research, one where AI and human ingenuity work in tandem for richer, more profound scientific exploration. The future of research is here, and it is smart, efficient, and AI-powered.

Before we get too excited however, let us remember that AI tools are meant to be our assistants, not our masters. As we engage with these advanced technologies, let’s not lose sight of the human intellect, intuition, and imagination that form the heart of all meaningful research. Happy researching!

Thank you to Ivan Aguilar – Ph.D. Student at SFU (Simon Fraser University), for starting this list for me!


Dr Andrew Stapleton has a Masters and PhD in Chemistry from the UK and Australia. He has many years of research experience and has worked as a Postdoctoral Fellow and Associate at a number of Universities. Although having secured funding for his own research, he left academia to help others with his YouTube channel all about the inner workings of academia and how to make it work for you.

Thank you for visiting Academia Insider.

We are here to help you navigate Academia as painlessly as possible. We are supported by our readers and by visiting you are helping us earn a small amount through ads and affiliate revenue - Thank you!


2024 © Academia Insider


Modified crayfish optimization algorithm for solving multiple engineering application problems

  • Open access
  • Published: 24 April 2024
  • Volume 57, article number 127 (2024)


  • Heming Jia
  • Xuelian Zhou
  • Jinrui Zhang
  • Laith Abualigah
  • Ali Riza Yildiz
  • Abdelazim G. Hussien

The Crayfish Optimization Algorithm (COA) is innovative and easy to implement, but its search efficiency decreases in the later stages and it easily falls into local optima. To solve these problems, this paper proposes a Modified Crayfish Optimization Algorithm (MCOA). Based on the survival habits of crayfish, MCOA introduces an environmental renewal mechanism that uses water quality factors to guide crayfish toward a better environment. In addition, a learning strategy based on ghost antagonism is integrated into MCOA to enhance its ability to evade local optima. To evaluate the performance of MCOA, tests were performed on the IEEE CEC2020 benchmark functions, and experiments were conducted on four constrained engineering problems and on feature selection problems. For the constrained engineering problems, MCOA improves on COA by 11.16%, 1.46%, 0.08%, and 0.24%, respectively. For the feature selection problems, the average fitness value and accuracy improve by 55.23% and 10.85%, respectively. MCOA shows better optimization performance in solving complex spatial and practical application problems. The combination of the environmental renewal mechanism and the ghost-antagonism learning strategy significantly improves the performance of MCOA. This finding has important implications for the development of the field of optimization.

Graphical Abstract



1 Introduction

For a considerable period, engineering application problems have been widely discussed. At present, improving the modern scientific level of engineering construction has become a goal of continuous human effort; such problems include constrained engineering design problems (Zhang et al. 2022a; Mortazavi 2019), which are affected by a series of external factors, and feature selection problems (Kira and Rendell 1992), among others. Constrained engineering design problems refer to the problem of achieving optimization objectives and reducing computational costs under many external constraints. They arise widely in mechanical engineering (Abualigah et al. 2022), electrical engineering (Razmjooy et al. 2021), civil engineering (Kaveh 2017), chemical engineering (Talatahari et al. 2021), and other engineering fields, with examples such as workshop scheduling (Meloni et al. 2004), wind power generation (Lu et al. 2021), UAV path planning (Belge et al. 2022), parameter extraction of photovoltaic models (Zhang et al. 2022b; Zhao et al. 2022), optimization of seismic base isolation systems (Kandemir and Mortazavi 2022), optimal design of RC support foundation systems for industrial buildings (Kamal et al. 2023), synchronous optimization of fuel type and external-wall insulation performance in intelligent residential buildings (Moloodpoor and Mortazavi 2022), and economic optimization of double-tube heaters (Moloodpoor et al. 2021).

Feature selection is the process of choosing specific subsets of features from a larger set based on defined criteria. In this approach, each original feature within the subset is individually evaluated using an assessment function. The aim is to select pertinent features that carry distinctive characteristics. This selection process reduces the dimensionality of the feature space, enhancing the model's generalization ability and accuracy. The ultimate goal is to create the best possible combination of features for the model. By employing feature selection, the influence of irrelevant factors is minimized. This reduction in irrelevant features not only streamlines the computational complexity but also reduces the time costs associated with processing the data. Through this method, redundant and irrelevant features are systematically removed from the model. This refinement improves the model’s accuracy and results in a higher degree of fit, ensuring that the model aligns more closely with the underlying data patterns.

In practical applications of feature selection, models are primarily refined using two main methods: the filter (Cherrington et al. 2019) and wrapper (Jović et al. 2015) techniques. The filter method employs a scoring mechanism to assess and rank the model's features; it selects the subset of features with the highest scores, treating it as the optimal feature combination. The wrapper method, on the other hand, integrates the selection process directly into the learning algorithm, embedding the feature-subset evaluation within the learning process and assessing the correlation between the chosen features and the model. In recent years, applications inspired by heuristic algorithms can be seen everywhere in our lives and are closely tied to the rapid development of today's society. These algorithms play an indispensable role in solving a myriad of complex engineering problems and feature selection challenges. They have proven particularly effective in addressing spatial, dynamic, and random problems, showcasing significant practical impact and tangible outcomes.
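The filter approach described above can be made concrete with a small standalone sketch (illustrative only, not code from any cited work): score each feature by a simple criterion such as its absolute Pearson correlation with the target, then keep the top-k indices.

```python
# Illustrative filter-style feature selection: score each feature by its
# absolute Pearson correlation with the target and keep the top-k features.
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sqrt(sum((x - mx) ** 2 for x in xs))
    vy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy) if vx and vy else 0.0

def filter_select(X, y, k):
    """X: list of samples (each a list of feature values); y: target values."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]  # indices of the k highest-scoring features

# Feature 0 tracks y perfectly, feature 1 is constant, feature 2 is noisy.
X = [[1, 5, 9], [2, 5, 7], [3, 5, 8], [4, 5, 1]]
y = [10, 20, 30, 40]
print(filter_select(X, y, 2))  # → [0, 2]
```

A wrapper method would instead train the downstream model on each candidate subset and keep the subset with the best validation score, which is more expensive but accounts for feature interactions.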

With the rapid development of society, science, and technology, and through continuous exploitation and exploration in the field of science, more and more complex, difficult-to-describe, multi-dimensional engineering problems appear in our research. Navigating these complexities demands profound contemplation and exploration. While traditional heuristic algorithms have proven effective on simpler, foundational problems, they fall short when addressing the novel and intricate multi-dimensional challenges posed by our current scientific landscape and societal needs. Thus, researchers have embarked on a journey of continuous contemplation and experimentation. By cross-combining and validating existing heuristic algorithms, they have devised a groundbreaking solution: Metaheuristic Algorithms (MAs) (Yang 2011). This innovative approach aims to tackle the complexities of our evolving problems, keeping pace with rapid social and technological development. MAs are heuristic-function-based algorithms. They work by evaluating the current state of the problem and possible solutions to guide the choices made in the search space. MAs improve the efficiency and accuracy of problem solving by combining multiple heuristic functions and updating the search direction at each step based on their weights. The diversity of MAs makes them universal problem solvers, adapting to the unique challenges presented by different problem domains. They essentially represent a powerful paradigm shift in computational problem solving, providing a robust approach to the complexity of modern engineering and scientific challenges. Compared with traditional algorithms, MAs have made great progress in finding optimal solutions, jumping out of local optima, and overcoming late-stage convergence difficulties through the synergy of different algorithms.
These enhancements mark significant progress, demonstrating the adaptability of the scientific method and emphasizing the importance of continuous research and cooperation. They also have the potential to radically address complex engineering challenges, enabling researchers to navigate complex problem landscapes with greater accuracy and efficiency.

Research shows that MAs are broadly classified into four research directions: swarm-based, natural-evolution-based, human-based, and physics-based. These categories encompass a wide range of innovative problem-solving approaches, each drawing inspiration from a different aspect of nature, human behavior, or physical principles, and researchers explore these different pathways to solve complex challenges and optimize solutions efficiently. First, swarm-based optimization algorithms use the collective survival wisdom of populations to solve problems. For example, the Particle Swarm Optimization algorithm (PSO) (Wang et al. 2018a) is based on the flocking behavior of birds; PSO searches quickly but handles only real-valued problems, copes poorly with discrete optimization, and easily falls into local optima. The Artificial Bee Colony algorithm (ABC) (Jacob and Darney 2021) realizes the sharing and communication of information among individuals as bees collect honey according to their division of labor. In the Salp Swarm Algorithm (SSA) (Mirjalili et al. 2017), individual salps link end to end, moving and preying in a chain, with followers trailing the leader according to a strict "hierarchical" system. In the Ant Colony Optimization algorithm (ACO) (Dorigo et al. 2006), foraging ants rely on the accumulation of pheromone along paths and spontaneously find the optimal path in an organized manner.

Secondly, natural evolutionary algorithms, inspired by the law of survival of the fittest, find the best solution by preserving the traits of strong, easily surviving individuals. For example, Genetic Programming (GP) (Espejo et al. 2009) uses a tree structure to model the natural laws governing biological genetics and evolution. The Evolution Strategy algorithm (ES) (Beyer and Schwefel 2002) models a species' ability to evolve to adapt to its environment, producing offspring similar to but different from the parent through mutation and recombination. Differential Evolution (DE) (Storn and Price 1997) eliminates poor individuals and retains good ones during evolution, so that the good individuals keep approaching the optimal solution. It has strong global search ability in early iterations, but when the population is small, individuals are hard to update and the search easily falls into local optima. Biogeography-Based Optimization (BBO) (Simon 2008), influenced by biogeography, filters out the global optimum through iterated migration and mutation of species information.
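The mutate-crossover-select cycle that DE uses can be sketched in a few lines. This is a minimal DE/rand/1/bin implementation minimizing the sphere function; the parameter choices (F = 0.5, CR = 0.9, population 20) are illustrative defaults, not values from any cited work.

```python
# A minimal Differential Evolution sketch (DE/rand/1/bin) minimizing the
# sphere function. F scales the difference vector, CR is the crossover rate.
import random

def sphere(x):
    return sum(v * v for v in x)

def de(f, dim=2, n_pop=20, F=0.5, CR=0.9, iters=200, lb=-5.0, ub=5.0, seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_pop)]
    fit = [f(ind) for ind in pop]
    for _ in range(iters):
        for i in range(n_pop):
            r1, r2, r3 = rng.sample([j for j in range(n_pop) if j != i], 3)
            # Mutation: perturb r1 by the scaled difference of r2 and r3.
            mutant = [pop[r1][d] + F * (pop[r2][d] - pop[r3][d]) for d in range(dim)]
            # Binomial crossover: mix mutant and current individual.
            jr = rng.randrange(dim)
            trial = [mutant[d] if (rng.random() < CR or d == jr) else pop[i][d]
                     for d in range(dim)]
            trial = [min(max(v, lb), ub) for v in trial]  # clamp to bounds
            # Selection: keep the trial only if it is no worse (greedy).
            ft = f(trial)
            if ft <= fit[i]:
                pop[i], fit[i] = trial, ft
    best = min(range(n_pop), key=lambda i: fit[i])
    return pop[best], fit[best]

best_x, best_f = de(sphere)
print(best_f)
```

The greedy one-to-one selection is what "eliminates the poor individuals and retains the good ones": a trial vector replaces its parent only if it is at least as fit.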

Then, human-based optimization algorithms exploit the diverse and complex social relationships and activities of humans in specific environments to solve problems. For example, Teaching-Learning-Based Optimization (TLBO) (Rao and Rao 2016) obtains the optimal solution by simulating the teaching relationship between students and teachers; it simplifies the information-sharing mechanism within each round so that all evolved individuals can converge to the global optimum faster, but the algorithm often loses its advantage on optimization problems far from the origin. The Coronavirus Mask Protection Algorithm (CMPA) (Yuan et al. 2023), inspired mainly by humans' self-protection against coronavirus, establishes a mathematical model of self-protective behavior and solves the optimization problem. The Cultural Evolution Algorithm (CEA) (Kuo and Lin 2013) exploits a cultural model within a systems-thinking framework to achieve cultural transformation and obtain the optimal solution. The Volleyball Premier League algorithm (VPL) (Moghdani and Salimifard 2018) simulates the training, competition, and interaction of teams in a volleyball league to solve global optimization problems.

Finally, physics-based optimization algorithms use the basic principles of physics to simulate the physical behavior of particles in space. For example, the Snow Ablation Algorithm (SAO) (Deng and Liu 2023), inspired by the physical behavior of snow in nature, models the transformation among snow, water, and steam by simulating sublimation and ablation. The RIME algorithm (RIME) (Su et al. 2023) balances exploration and exploitation with a mathematical model based on the growth of soft rime and hard rime in nature. Central Force Optimization (CFO) (Formato 2007) addresses the complex computation of the initial detector with a uniform-design mathematical model that reduces computation time. The Sine Cosine Algorithm (SCA) (Mirjalili 2016) builds mathematical models and seeks optimal solutions based on the oscillation and periodicity of the sine and cosine functions; relative to a candidate solution set of a given size, it has strong search ability and can jump out of local optima, but on some test functions the results fluctuate around the optimal solution, premature convergence occurs, and convergence still needs improvement.

Alongside the original algorithms, many improved MAs have been proposed to further boost optimization performance in practical applications. For example, Yujun Zhang et al. combined the Arithmetic Optimization Algorithm (AOA) with the Aquila Optimizer (AO) to propose a new metaheuristic, AOAAO (Zhang et al. 2022c). The CSCAHHO algorithm (Zhang et al. 2022d) is obtained by chaotically mixing the Sine Cosine Algorithm (SCA) and the Harris Hawks Optimization (HHO) algorithm. The LMRAOA algorithm was proposed to solve numerical and engineering problems (Zhang et al. 2022e). Yunpeng Ma et al. proposed an improved teaching-learning-based optimization algorithm to reduce NOx emission concentration in circulating fluidized bed boilers (Ma et al. 2021). The modified SOS algorithm (MSOS) (Kumar et al. 2019), based on the natural Symbiotic Organisms Search (SOS) algorithm, improves search efficiency by introducing adaptive return factors and modified parasitic vectors. The Modified Beluga Whale Optimization with multi-strategies (MBWO) (Jia et al. 2023a) solves engineering problems by modeling beluga populations gathering to feed and finding new habitats during long-distance migration. Betul Sultan Yıldız et al. proposed a novel hybrid optimizer named AO-NM, which aims to optimize engineering design and manufacturing problems (Yıldız et al. 2023).

The Crayfish Optimization Algorithm (COA) (Jia et al. 2023b) is a novel metaheuristic algorithm rooted in the concept of population survival wisdom, introduced by Heming Jia et al. in 2023. Drawing inspiration from crayfish behavior, including heat avoidance, competition for caves, and foraging, COA employs a dual-stage strategy. During the exploration stage, it replicates crayfish searching for caves in space for shelter, while the exploitation stage mimics their competition for caves and search for food. Crayfish, naturally averse to dry heat, thrive in freshwater habitats. To simulate their behavior and address challenges related to high temperatures and food scarcity, COA incorporates temperature variations into its simulation. By replicating crayfish habits, the algorithm dynamically adapts to environmental factors, ensuring robust problem-solving capabilities. Based on temperature fluctuations, crayfish autonomously select activities such as seeking shelter, competing for caves, and foraging. When the temperature exceeds 30 °C, crayfish instinctively seek refuge in cool, damp caves to escape the heat. If another crayfish is already present in the cave, a competition ensues for occupancy. Conversely, when the temperature drops below 30 °C, crayfish enter the foraging stage. During this phase, they make decisions about food consumption based on the size of the available food items. COA achieves algorithmic transformation between exploration and exploitation stages by leveraging temperature variations, aiming to balance the exploration and exploitation capabilities of the algorithm. However, COA solely emulates the impact of temperature on crayfish behavior, overlooking other significant crayfish habits, leading to inherent limitations. In the latter stages of global search, crayfish might cluster around local optimum positions, restricting movement. This hampers the crayfish's search behavior, slowing down convergence speed and increasing the risk of falling into local optima, thereby making it challenging to find the optimal solution.

In response to the aforementioned challenges, this paper proposes a Modified Crayfish Optimization Algorithm (MCOA). MCOA introduces an environmental update mechanism inspired by crayfish's preference for living in fresh flowing water. MCOA incorporates crayfish's innate perception abilities to assess the quality of the surrounding aquatic environment, determining whether the current habitat is suitable for survival. The simulation of crayfish crawling upstream to find a more suitable aquatic environment is achieved by utilizing adaptive flow factors and leveraging the crayfish's second and third walking legs to sense the direction of water flow. This method partially replicates the survival and reproduction behavior of crayfish, ensuring the continual movement of the population. It heightens the randomness within the group, widens the search scope, enhances the algorithm's exploration efficiency, and effectively strengthens its global optimization capabilities. Additionally, a ghost opposition-based learning strategy (Jia et al. 2023c) is employed to reinitialize the population randomly when the algorithm becomes trapped in local optima. This enhancement significantly improves the algorithm's capability to escape local optima, promoting better exploration of the solution space. With these two strategies carefully integrated, the search efficiency and predation speed of the crayfish algorithm improve substantially, and the algorithm's convergence rate and global optimization ability are significantly enhanced, leading to more effective and efficient problem-solving capabilities.

In the experimental section, we conducted a comprehensive comparison between MCOA and nine other metaheuristic algorithms. We utilized the IEEE CEC2020 benchmark function to evaluate the performance of the algorithm. The evaluation involved statistical methods such as the Wilcoxon rank sum test and Friedman test to rank the averages, validating the efficiency of the MCOA algorithm and the effectiveness of the proposed improvements. Furthermore, MCOA was applied to address four constrained engineering design problems as well as the high-dimensional feature selection problem using the wrapper method. These practical applications demonstrated the practicality and effectiveness of MCOA in solving real-world engineering problems.

The main contributions of this paper are as follows:

In the environmental renewal mechanism, the water quality factor and roulette wheel selection method are introduced to simulate the process of crayfish searching for a more suitable water environment for survival.

The introduction of the ghost opposition-based learning strategy enhances the randomness of crayfish update locations, effectively preventing the algorithm from getting trapped in local optima, and improving the overall global optimization performance of the algorithm.

The fixed value of food intake is adaptively adjusted based on the number of evaluations, enhancing the algorithm's capacity to escape local optima. This adaptive change ensures a more dynamic exploration of the solution space, improving the algorithm's overall optimization effectiveness.

The MCOA’s performance is compared with nine metaheuristics, including COA, using the IEEE CEC2020 benchmark function. The comparison employs the Wilcoxon rank sum test and Friedman test to rank the averages, providing evidence for the efficiency of MCOA and the effectiveness of the proposed improvements.

The application of MCOA to address four constrained engineering design problems and the high-dimensional feature selection problem using the wrapper method demonstrates the practicality and effectiveness of MCOA in real-world applications.
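The first contribution above pairs the water quality factor with roulette-wheel selection. As a standalone illustration of fitness-proportionate roulette-wheel selection (not the paper's code; in MCOA the weights would come from water quality factors), the mechanism can be sketched as:

```python
# Roulette-wheel (fitness-proportionate) selection: an option's chance of
# being picked is proportional to its share of the total weight.
import random

def roulette_select(weights, rng=random):
    total = sum(weights)
    pick = rng.uniform(0, total)          # spin the wheel
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if pick <= cumulative:
            return i
    return len(weights) - 1               # guard against float round-off

rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(10000):
    counts[roulette_select([1.0, 2.0, 7.0], rng)] += 1
print(counts)  # roughly proportional to 10% / 20% / 70%
```

Selecting by cumulative weight in this way keeps diversity: poor options are still chosen occasionally, which helps avoid premature convergence.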

The main structure of this paper is as follows. Section 1 serves as a brief introduction to the entire document, providing an overview of the topics and themes that will be covered. Section 2 provides a comprehensive summary of the Crayfish Optimization Algorithm (COA). Section 3 proposes the Modified Crayfish Optimization Algorithm (MCOA); by adding the environment updating mechanism and the ghost opposition-based learning strategy, MCOA enhances global search ability and convergence speed. Section 4 shows the experimental results and analysis of MCOA on the IEEE CEC2020 benchmark functions. Section 5 applies MCOA to four kinds of constrained engineering design problems. Section 6 applies MCOA to the high-dimensional feature selection problem with the wrapper method, demonstrating the effectiveness of MCOA in practical applications. Finally, Section 7 concludes the paper.

2 Crayfish optimization algorithm (COA)

The crayfish is a crustacean living in fresh water, also called the red crayfish or freshwater crayfish. Because of its varied diet, fast growth rate, rapid migration, and strong adaptability, it has formed absolute advantages in its ecological environment. Changes in temperature often cause changes in crayfish behavior. When the temperature is too high, crayfish choose to enter caves to avoid heat damage; when the temperature is suitable, they climb out of their caves to forage. According to these living habits, COA models three stages, summer escape, competition for caves, and foraging, corresponding to the three living habits of crayfish.

Crayfish are ectotherms, and temperature differences over the range 20 °C to 35 °C drive their behavioral changes. The temperature is calculated as follows:

$$temp = rand \times 15 + 20$$

where temp represents the temperature of the crayfish's environment and rand is a random number between 0 and 1.

2.1 Initializing the population

In the d-dimensional optimization problem of COA, each crayfish is a 1 × d matrix representing a solution of the problem. In a set of variables (X_1, X_2, X_3, ..., X_d), the position (X) of each crayfish lies between the upper bound (ub) and lower bound (lb) of the search space. In each evaluation of the algorithm, an optimal solution is calculated; the solutions from each evaluation are compared, and the best one is stored as the optimal solution of the whole problem. The position of each crayfish in the initial population is calculated using the following formula:

$$X_{i,j} = lb_j + (ub_j - lb_j) \times rand$$

where X_{i,j} denotes the position of the i-th crayfish in the j-th dimension, ub_j denotes the upper bound of the j-th dimension, lb_j denotes the lower bound of the j-th dimension, and rand is a random number between 0 and 1.

2.2 Summer escape stage (exploration stage)

In this paper, a temperature of 30 °C is taken as the dividing line for judging whether the current living environment is a high-temperature one. When the temperature is greater than 30 °C, it is summer, and to avoid the harm caused by a high-temperature environment, the crayfish looks for a cool, moist cave and enters it to escape the heat. The cave is calculated as follows:

$$X_{shade} = \frac{X_G + X_L}{2}$$

where X_G represents the optimal position obtained so far at this number of evaluations, and X_L represents the optimal position of the current population.

The behavior of crayfish competing for a cave is a random event. To simulate it, a random number rand is defined: rand < 0.5 means that no other crayfish is currently competing for the cave, and the crayfish enters the cave directly to escape the heat. In this case, the position update formula is as follows:

$$X_{new} = X_i + C_2 \times rand \times (X_{shade} - X_i) \tag{4}$$

Here, X_new is the position after the update, and C_2 is a decreasing curve, calculated as follows:

$$C_2 = 2 - \frac{FEs}{MaxFEs} \tag{5}$$

Here, FEs represents the number of evaluations and MaxFEs represents the maximum number of evaluations.
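The cave construction and the uncontested summer-escape update can be sketched as follows (a minimal sketch of the equations above; function names are illustrative):

```python
import numpy as np

def cave(X_G, X_L):
    # cave location: midpoint of the global best X_G and the population best X_L
    return (X_G + X_L) / 2.0

def summer_escape(X_i, X_shade, FEs, MaxFEs, rng=None):
    # C2 decreases from 2 toward 1 as the evaluation budget is spent,
    # shrinking the exploratory step toward the cave over time
    rng = np.random.default_rng() if rng is None else rng
    C2 = 2.0 - FEs / MaxFEs
    return X_i + C2 * rng.random(np.shape(X_i)) * (X_shade - X_i)
```

When a crayfish already sits in the cave (X_i = X_shade), the update leaves it in place, as expected.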

2.3 Competition stage (exploitation stage)

When the temperature is greater than 30 °C and rand ≥ 0.5, another crayfish is competing for the same cave when the crayfish seeks shelter from the heat. In this case, the two crayfish fight over the cave, and crayfish X_i adjusts its position according to the position of the other crayfish X_z. The adjusted position is calculated as follows:

$$X_{new} = X_i - X_z + X_{shade} \tag{6}$$

Here, z denotes a random individual of the crayfish population, computed as follows:

$$z = round(rand \times (N - 1)) + 1 \tag{7}$$

where, N is the population size.
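A sketch of the competition update (0-based indexing replaces the paper's 1-based z; names are illustrative):

```python
import numpy as np

def compete(X, i, X_shade, rng=None):
    # X_new = X_i - X_z + X_shade, with z a randomly chosen rival
    rng = np.random.default_rng() if rng is None else rng
    N = X.shape[0]
    z = int(np.round(rng.random() * (N - 1)))  # 0-based rival index
    return X[i] - X[z] + X_shade
```

Note that if all crayfish occupy the same point, the rival terms cancel and the update collapses to the cave position X_shade.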

2.4 Foraging stage (exploitation stage)

The foraging behavior of crayfish is affected by temperature; a temperature of at most 30 °C is the condition for crayfish to climb out of the cave to find food. When the temperature is less than or equal to 30 °C, the crayfish leaves the cave, judges the location of food according to the best position obtained so far, and moves toward it to forage. The position of the food is calculated as follows:

$$X_{food} = X_G \tag{8}$$

The amount of food crayfish eat depends on the temperature. Between 20 °C and 30 °C, crayfish forage strongly, and the maximum intake occurs at 25 °C, so the food intake of crayfish resembles a normal distribution. Food intake is calculated as follows:

$$p = C_1 \times \frac{1}{\sqrt{2\pi}\,\sigma} \times \exp\!\left(-\frac{(temp-\mu)^2}{2\sigma^2}\right) \tag{9}$$

Here, µ is the most suitable temperature for crayfish feeding, and σ and C 1 are the parameters used to control the variation of crayfish intake at different temperatures.

The food a crayfish obtains depends not only on its intake but also on the size of the food. If the food is too large, the crayfish cannot eat it directly; it must tear the food apart with its claws before eating. The food size is calculated as follows:

$$Q = C_3 \times rand \times \frac{fitness_i}{fitness_{food}} \tag{10}$$

Here, C 3 is the food factor, which represents the largest food, and its value is 3, fitness i represents the fitness value of the i-th crayfish, and fitness food represents the fitness value of the location of the food.

Crayfish use the maximum food value Q to judge the size of the food obtained and thus decide how to eat. When Q > (C_3 + 1)/2, the food is too large for the crayfish to eat directly; it tears the food with its claws and eats alternately with its second and third walking legs. The formula for shredding food is as follows:

$$X_{food} = \exp\!\left(-\frac{1}{Q}\right) \times X_{food} \tag{11}$$

After the food has been shredded into an easily eaten size, the second and third walking legs pick it up and put it into the mouth alternately. To simulate this alternating feeding, sine and cosine functions are used as the mathematical model. The formula for alternating feeding is as follows:

$$X_{new} = X_i + X_{food} \times p \times \big(\cos(2\pi \times rand) - \sin(2\pi \times rand)\big) \tag{12}$$

When Q ≤ (C_3 + 1)/2, the food size is suitable for direct consumption, and the crayfish moves toward the food location and eats it directly. The formula for direct feeding is as follows:

$$X_{new} = (X_i - X_{food}) \times p + p \times rand \times X_i \tag{13}$$
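The whole foraging stage can be sketched as one function. Here µ = 25 and C3 = 3 follow the text, while σ = 3 and C1 = 0.2 are illustrative assumptions for the control parameters:

```python
import numpy as np

def forage(X_i, X_food, fitness_i, fitness_food, temp,
           mu=25.0, sigma=3.0, C1=0.2, C3=3.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # intake p peaks at the ideal temperature mu (normal-like curve)
    p = C1 / (np.sqrt(2 * np.pi) * sigma) * np.exp(-(temp - mu) ** 2 / (2 * sigma ** 2))
    # food size Q relative to the crayfish's own fitness
    Q = C3 * rng.random() * fitness_i / fitness_food
    if Q > (C3 + 1) / 2:
        # food too large: shred it, then eat alternately (sine/cosine model)
        X_food = np.exp(-1.0 / Q) * X_food
        r = rng.random()
        return X_i + X_food * p * (np.cos(2 * np.pi * r) - np.sin(2 * np.pi * r))
    # food small enough: move toward it and eat directly
    return (X_i - X_food) * p + p * rng.random() * X_i
```

The intake factor p scales every step, so foraging moves are largest near 25 °C and fade toward the edges of the active temperature range.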

2.5 Pseudo-code for COA

figure b

Crayfish optimization algorithm pseudo-code

3 Modified crayfish optimization algorithm (MCOA)

Based on the crayfish optimization algorithm, we propose MCOA, an improved COA for solving high-dimensional feature selection problems. The quality of the aquatic environment has a great impact on the survival of crayfish: crayfish mostly feed on plants and prefer fresh water. Dissolved oxygen is indispensable for aquatic life; the higher its content in the water, the more vigorously crayfish feed, the faster they grow, and the less disease they suffer. Places with faster-flowing water have better oxygen exchange and more aquatic plants and are therefore better suited for survival, so crayfish exhibit strong hydrotaxis. When crayfish perceive that the current environment is too dry and hot or lacks food, they judge the direction of the water flow through the perception (r) of their second and third walking legs and crawl against the current to find an aquatic environment with sufficient oxygen and food. A good aquatic environment has sufficient oxygen and abundant aquatic plants, which to a certain extent ensures the survival and reproduction of crayfish.

In addition, we introduce ghost opposition-based learning to help MCOA escape local optima. The ghost opposition-based learning strategy combines the candidate individual, the current individual, and the optimal individual to randomly generate a new candidate position that replaces a poor previous candidate, and then takes the best point or the candidate solution as the center for a more extensive exploration of other positions. Traditional opposition-based learning (Mahdavi et al. 2018) is based on a central point and generates opposite solutions in a fixed pattern: most points gather near the center, their positions never exceed the distance between the current point and the center, and most solutions stay close to the optimal individual. However, if the optimal individual is not near the current exploration point, the algorithm falls into a local optimum and struggles to find the optimal solution. In contrast, ghost opposition-based learning produces an opposite solution that can change dynamically by adjusting the parameter k, thereby expanding the explored region of the space; this effectively handles the case where the optimal solution lies outside the search range centered on the current point and makes it easier for the algorithm to jump out of local optima.

According to the living habits of crayfish, this paper proposes a modified crayfish optimization algorithm (MCOA) that improves COA with an environment update mechanism and a ghost opposition-based learning strategy; the implementation steps, pseudo-code, and flow chart of MCOA are presented below.

3.1 Environment update mechanism

In the environment update mechanism, a water quality factor V is introduced to represent the quality of the aquatic environment at the current location. To keep the design and computational complexity of the system simple, the water quality factor V of MCOA is represented by hierarchical discretization, with values ranging from 0 to 5. Crayfish perceive the quality of the current aquatic environment through the perception (r) of the second and third walking legs, judge whether the current environment remains survivable, and independently choose whether to update their current location. The location update is calculated as follows.

Here, each crayfish differs somewhat in its perception r of the water environment; X_2 is a random position between the candidate optimal position and the current position, calculated by Eq. (15); X_1 is a random position in the population; and B is an adaptive water flow factor, calculated by Eq. (16).

Among them, the sensing ability r of the crayfish's second and third walking legs is a random number in [0, 1]; c is a constant representing the water flow velocity factor, with a value of 2. When V ≤ 3, the crayfish perceives the quality of the current living environment as good and suitable for continued survival. When V > 3, the crayfish perceives the current living environment as poor and must crawl against the perceived direction of the water flow to find an aquatic environment with sufficient oxygen and abundant food (Fig. 1).

figure 1

Classification of MAs

In the environment update mechanism, to describe the upstream behavior of crayfish in more detail, the perception area of a crayfish is abstracted as a circle in MCOA, with the crayfish at its center. In each evaluation, a random angle θ is first chosen by roulette wheel selection to determine the crayfish's moving direction within the circular area, and the moving path is then determined by the current direction. Over the whole circle, the random angle can be chosen from 0 to 360 degrees, from which the value of θ is mapped into the range [−1, 1]. The different random angles mean each crayfish moves in a random direction, which broadens the search range, enhances positional randomness and the ability to escape local optima, and avoids premature convergence (Fig. 2).

figure 2

Schematic diagram of the environment update mechanism
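One way to realize the roulette-wheel angle selection is sketched below. The paper does not give the sector weights, so equal weights are an assumption (which makes the draw effectively uniform), and mapping the angle into [−1, 1] via a cosine is likewise our illustration:

```python
import numpy as np

def random_direction(n_sectors=360, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    weights = np.ones(n_sectors)               # equal sector weights (assumption)
    probs = weights / weights.sum()
    sector = rng.choice(n_sectors, p=probs)    # roulette-wheel pick of a sector
    angle = np.deg2rad(sector + rng.random())  # angle inside the chosen sector
    return np.cos(angle)                       # theta scaled into [-1, 1]
```

With non-uniform weights, the same routine would bias movement toward preferred directions, which is where a roulette wheel differs from a plain uniform draw.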

3.2 Ghost opposition-based learning strategy

The ghost opposition-based learning strategy is illustrated in a two-dimensional space, as shown in Fig. 3. On the X-axis, [lb, ub] represents the search range of the solution, and the ghost is generated as shown in Fig. 3. Suppose a new candidate solution is located at X_new with height h1_i; the best solution's projected position on the X-axis is X_G with height h2_i; and the candidate solution's projection on the X-axis is X_i with height h3_i. From these, the position of the ghost is obtained: by vector calculation, the ghost's projection on the X-axis is x_i and its height is h_i. The ghost position is calculated using the following formula.

figure 3

Schematic diagram of ghost opposition-based learning strategy

In Fig. 3, the Y-axis represents a convex lens. Suppose there is a ghost at position P_i, where x_i is its projection on the X-axis and h_i is its height. P*_i is the real image obtained by convex lens imaging; its projection on the X-axis is x*_i and its height is h*_i. Therefore, the opposite individual x*_i of individual x_i can be obtained: x*_i is the point corresponding to the ghost individual x_i, obtained with O as the base point. According to the lens imaging principle, we can obtain Eq. (18), calculated as follows.

The ghost opposition-based learning strategy formula evolves from Eq. (18) and is calculated as follows.

3.3 Implementation of MCOA algorithm

3.3.1 Initialization phase

Initialize the population size N , the population dimension d , and the number of evaluations FEs . The initialized population is shown in Eq. ( 2 ).

3.3.2 Environment update mechanism

Crayfish judge the quality of the current aquatic environment according to the water quality factor V and infer whether the current environment remains survivable. V > 3 indicates that the crayfish perceives the current aquatic environment as poor and unsuitable for survival; according to the sensory information of the second and third walking legs and the adaptive water flow factor, the crayfish judges the direction of the current flow, moves upstream to find a better aquatic environment, and updates its current position. The position update formula is shown in Eq. (14). When V ≤ 3, the crayfish perceives the current living environment as good and suitable for survival and does not need to update its position.

3.3.3 Exploration phase

When the temperature is greater than 30 °C and V > 3, the crayfish perceives the current aquatic environment as poor, and its cave is dry and without moisture, so it cannot provide refuge from the heat. The crayfish must first update its position by crawling against the direction of the water flow and find a cool, moist cave in a better aquatic environment to escape the heat.

3.3.4 Exploitation stage

When the temperature is less than or equal to 30 °C and V > 3, the crayfish perceives the current aquatic environment as poor, without enough food to sustain it. It must escape the food-scarce environment by crawling against the current and find a better aquatic environment to sustain survival and reproduction.

3.3.5 Ghost opposition-based learning strategy

By combining the candidate individual, the current individual, and the optimal individual, a candidate solution is randomly generated and compared with the current solution; the better individual is retained, the opposite individual is obtained, and the location of the ghost is determined. The combination of multiple positions effectively prevents the algorithm from falling into a local optimum; the specific formula is shown in Eq. (19).

3.3.6 Update the location

The updated position is determined by comparing fitness values: if the updated individual has better fitness, it replaces the original individual; otherwise, the original individual is retained as the current solution.

The pseudocode for MCOA is as follows (Algorithm 2).

figure c

Modified Crayfish optimization algorithm pseudo-code

The flow chart of the MCOA algorithm is as follows.

3.4 Computational complexity analysis

Complexity analysis is an essential step in evaluating algorithm performance. For this experiment, we adopt the IEEE CEC2020 Special Session and Competition protocol as the complexity evaluation standard for single-objective optimization algorithms. The complexity of MCOA depends mainly on several important parameters: the population size (N = 30), the problem dimension (d = 10), the maximum number of evaluations (MaxFEs = 100,000), and the objective function (C). First, the running time of the test program (T0) is recorded; the test program is shown in Algorithm 3. Second, under the same dimension, the 10 test functions of the IEEE CEC2020 function set are each evaluated 100,000 times, and the running time (T1) is recorded. Finally, the running time of MCOA performing 100,000 evaluations on the 10 test functions is recorded 5 times under the same dimension, and the average is taken as the algorithm running time (T2). The formula for the time complexity of MCOA is given in Eq. (21).

figure d

IEEE CEC2020 complexity analysis test program
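Under the standard CEC protocol, the timing probe and the complexity score of Eq. (21) can be sketched as follows. The loop body follows the usual CEC test program, except that x is re-seeded each iteration to stay in a numerically safe range for Python's math.log; treat the details as assumptions:

```python
import math
import time

def t0_test_program(loops=1_000_000):
    # CEC-style timing probe: a fixed arithmetic loop whose runtime defines T0
    start = time.perf_counter()
    for _ in range(loops):
        x = 0.55  # re-seed so log/sqrt always receive valid positive input
        x = x + x; x = x / 2; x = x * x
        x = math.sqrt(x); x = math.log(x); x = math.exp(x); x = x / (x + 2)
    return time.perf_counter() - start

def complexity(T0, T1, T2_mean):
    # algorithm overhead relative to raw evaluation cost, normalized by T0
    return (T2_mean - T1) / T0
```

T1 and T2 are measured the same way around the raw function evaluations and the full algorithm runs, respectively.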

The experimental data for the algorithm complexity analysis are shown in Table 1, where MCOA is compared with seven other metaheuristic algorithms. Table 1 shows that the complexity of MCOA is much lower than that of comparison algorithms such as ROA, STOA, and AOA. However, the complexity of MCOA is slightly higher than that of COA, because the environment update mechanism and the ghost opposition-based learning strategy take a certain amount of time to update positions. Although the improved strategies increase the computation time to some extent, the optimization performance of MCOA is significantly improved, as demonstrated by the experiments in Sect. 4 of this paper, which proves the effectiveness of the improvements.

4 Experimental results and discussion

The experiments are carried out on a 2.50 GHz 11th Gen Intel(R) Core(TM) i7-11700 CPU with 16 GB of memory, running 64-bit Windows 11, using Matlab R2021a. To verify the performance of MCOA, it is compared with nine metaheuristic algorithms in this section. In the experiments, we use the IEEE CEC2020 test functions to evaluate the optimization performance of the MCOA algorithm (Fig. 4).

figure 4

Flow chart of the MCOA algorithm

4.1 Experiments with IEEE CEC2020 test functions

In this subsection, the Crayfish Optimization Algorithm (COA), Remora Optimization Algorithm (ROA) (Jia et al. 2021), Sooty Tern Optimization Algorithm (STOA) (Dhiman and Kaur 2019), Arithmetic Optimization Algorithm (AOA) (Abualigah et al. 2021), Harris Hawk Optimization algorithm (HHO) (Heidari et al. 2019), Prairie Dog Optimization algorithm (PDO) (Ezugwu et al. 2022), Genetic Algorithm (GA) (Mirjalili and Mirjalili 2019), Modified Sand Cat Swarm Optimization algorithm (MSCSO) (Wu et al. 2022), and the competition algorithm LSHADE (Piotrowski 2018) are compared with MCOA to verify its optimization effect. The parameter settings of each algorithm are shown in Table 2.

To test the performance of MCOA, this paper selects the 10 benchmark functions of IEEE CEC2020 for simulation experiments, where F1 is a unimodal function, F2–F3 are multimodal functions, F4 is a peakless function, F5–F7 are hybrid functions, and F8–F10 are composite functions. The parameters of this experiment are set uniformly as follows: the maximum number of evaluations MaxFEs is 100,000, the population size N is 30, and the dimension d is 10. MCOA and the other nine algorithms are each run independently 30 times, and the average fitness value, the standard deviation of the fitness value, and the Friedman ranking of each algorithm are computed. The specific settings of the IEEE CEC2020 benchmark functions are shown in Table 3.

4.1.1 Results statistics and convergence curve analysis of IEEE CEC2020 benchmark functions

In order to more clearly and intuitively compare the ability of MCOA and various algorithms to find individual optimal solutions, the average fitness value, standard deviation of fitness value and Friedman ranking obtained by running MCOA and other comparison algorithms independently for 30 times are presented in the form of tables and images. The data and images are shown in Table  4 and Fig.  5 respectively.

figure 5

Convergence curve of MCOA algorithm in IEEE CEC2020

In Table 4, mean denotes the average fitness value, std the standard deviation of the fitness value, and rank the Friedman ranking; Friedman average rank is the algorithm's average ranking across all functions, and Friedman rank is its final ranking. Compared with the other algorithms, MCOA achieves the best overall results in average fitness value, standard deviation, and Friedman ranking. On the unimodal function F1, although MCOA is slightly worse than LSHADE, it outperforms the remaining algorithms in mean fitness value, standard deviation, and Friedman ranking. On the multimodal functions F2 and F3, although the average fitness value of MCOA is slightly worse, it still ranks second, and on F3 its standard deviation indicates better stability than the comparison algorithms. On the peakless function F4, all algorithms except GA and LSHADE stably find the optimal individual solution. On the hybrid functions F5, F6, and F7, although LSHADE achieves a better mean fitness value than MCOA, MCOA's standard deviation is better than those of the other algorithms. Among the composite functions F8, F9, and F10, MCOA's standard deviation on F8 is slightly worse than LSHADE's, but MCOA achieves the best average fitness value and standard deviation on the other composite functions and ranks first on all of them. Finally, in terms of Friedman average rank, MCOA shows strong comprehensive performance and still ranks first. The data in Table 4 show that MCOA ranks first overall, has a good optimization effect, and outperforms the other nine comparison algorithms.

Figure 5 shows the convergence behavior on the IEEE CEC2020 benchmark functions. On the unimodal function F1, although LSHADE converges better, MCOA, compared with similar metaheuristics, converges slowly in the early stage but escapes local optima and converges quickly in the middle stage. On the multimodal functions F2 and F3, as on F1, MCOA converges faster in the middle and late stages and effectively exits local optima; although its convergence is slower than LSHADE's, it still finds the optimal value. On the peakless function F4, all algorithms except LSHADE, STOA, and PDO find the optimal value quickly because the function is easy to optimize. On the hybrid functions F5, F6, and F7, although MCOA converges slightly more slowly than COA in the early stage, in the later stage it still finds better values than the other eight algorithms except LSHADE. On the composite functions F8, F9, and F10, MCOA finds the optimal value faster than the other nine algorithms.

In summary, although LSHADE finds better optima on a few functions, MCOA still finds the optimal value on most functions in the later stage; compared with the other eight algorithms of the same type, MCOA shows a clear improvement in optimization ability and in avoiding local optima, and has a better practical effect.

4.1.2 Analysis of Wilcoxon rank sum test results

In comparison experiments, the differing results of multiple algorithms on the same problem are used to judge whether each algorithm solves the current problem efficiently and with a clearly distinguishable effect, for example through the convergence speed of the convergence curve, the fitness value of the optimal solution, and the ability to jump out of local optima. The average fitness value, the standard deviation of the fitness value, and the convergence curve alone are not sufficient evidence that an algorithm's performance differs significantly. Therefore, in addition to a comprehensive analysis of the data and images produced by each algorithm, the Wilcoxon rank-sum test is used to further verify the difference between MCOA and the other nine comparison algorithms. In this experiment, the significance level is set to 5%: a computed p-value below 5% indicates a significant difference between the two algorithms, and a value above 5% indicates no significant difference. Table 5 shows the Wilcoxon rank-sum test results of MCOA against the other nine comparison algorithms, where the symbols "+", "−", and "=" denote that MCOA performs better than, worse than, or equal to the comparison algorithm, respectively.
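For reference, the two-sided rank-sum test can be computed with the normal approximation as sketched below (a textbook implementation, not the authors' code; in practice a statistics package would be used):

```python
import math

def wilcoxon_rank_sum(a, b):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation,
    with mid-ranks for ties."""
    data = sorted((v, g) for g, xs in enumerate((a, b)) for v in xs)
    vals = [v for v, _ in data]
    ranks = {}
    i = 0
    while i < len(vals):
        j = i
        while j < len(vals) and vals[j] == vals[i]:
            j += 1
        r = (i + j + 1) / 2  # average of ranks i+1 .. j (mid-rank for ties)
        for k in range(i, j):
            ranks[k] = r
        i = j
    W = sum(ranks[k] for k, (_, g) in enumerate(data) if g == 0)
    n1, n2 = len(a), len(b)
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (W - mu) / sigma
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
```

Two samples of 30 fitness values (one per algorithm) fed to this test yield the p-values reported in Table 5.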

In the calculation of the peakless function F4, a p-value of 1 appears in the comparisons between MCOA and algorithms such as COA, ROA, and STOA, indicating that on this function multiple algorithms find the optimal value and there is no meaningful difference, which can be ignored. However, on most of the remaining functions, the significance level of MCOA compared with the other nine algorithms is below 5%, a significant difference.

Overall, MCOA also achieves good results in the Wilcoxon rank-sum test on the IEEE CEC2020 benchmark functions: most p-values against the other algorithms are below 5%, which proves that MCOA differs significantly from the other nine algorithms and has better optimization performance. The comparison with the original algorithm further shows that MCOA achieves a good improvement.

4.2 Comparison experiment of single strategy

MCOA adopts two strategies, the environment update mechanism and the ghost opposition-based learning strategy, to improve COA. To prove the effectiveness of each strategy for algorithm performance, a single-strategy comparison experiment is added in this section: EUCOA, which adds only the environment update mechanism, and GOBLCOA, which adds only the ghost opposition-based learning strategy, are compared with the basic COA. The experiments are run independently 30 times on the IEEE CEC2020 benchmark functions, and the resulting statistics are shown in Table 6, with the worse results bolded for clarity. As the table shows, among the best fitness values, average fitness values, and standard deviations of the 10 test functions, GOBLCOA and EUCOA account for little of the bolded data, while most of the data of the original COA are bolded. This effectively demonstrates that both the environment update mechanism and the ghost opposition-based learning strategy contribute to COA, and the comprehensive performance of COA is significantly improved.

4.3 Parameter sensitivity analysis of water flow velocity factor c

To better demonstrate the influence of the water flow velocity factor on MCOA, we compare different values of c. Table 7 shows the statistics of 30 independent runs on CEC2020 for each value, with the best results in bold. As the table shows, c = 2 is clearly better than the other values, with only slightly worse results on individual test functions: on F1, c = 5 has the best std; on F5, c = 6 has the best std; and on F10, c = 5 has the best std. On the other test functions, both the mean fitness value and the std are optimal at c = 2. This analysis shows that a water flow velocity factor of c = 2 yields a good optimization effect.

4.4 Experimental summary

In this section, we first test MCOA's optimization performance on the IEEE CEC2020 benchmark functions: the improved MCOA is compared with the original COA and eight other algorithms in the same environment, and the results are analyzed. Second, the Wilcoxon rank-sum test is used to verify whether MCOA differs significantly from the nine comparison algorithms. Finally, EUCOA (with only the environment update mechanism), GOBLCOA (with only the ghost opposition-based learning strategy), COA, and MCOA are tested to isolate the contribution of each improvement. These experimental results show that MCOA has a good ability to find optimal solutions and to escape local optima.

5 Constrained engineering design problems

With the development of the big-data era, solution processes become more complicated and results more precise, and the feasibility and practicality of algorithms draw increasing attention, so an algorithm must also perform well on constrained engineering design problems. To verify MCOA's optimization effect in practical applications, four constrained engineering design problems are selected to evaluate its performance on real problems. Each constrained engineering design problem has a minimization objective function (Papaioannou and Koulocheris 2018) used to compute the fitness value for the given problem. In addition, each problem contains a varying number of constraints that are taken into account when evaluating the objective function; if the constraints are not met, a penalty function (Yeniay 2005) is used to adjust the fitness value. However, constraint handling is not the focus of our research; our focus is on optimizing parameters within the convex region formed by the constraints (Liu and Lu 2014). To ensure fairness, the parameters of all experiments in this section are set as follows: the maximum number of evaluations MaxFEs is 10,000 and the population size N is 30. Each algorithm was run 500 times on each problem, and the best results were recorded.
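The penalty-function idea mentioned above can be sketched as follows (a common static-penalty scheme; the paper does not specify its exact penalty form, so the coefficient and the g(x) ≤ 0 convention are assumptions):

```python
def penalized_fitness(objective, constraints, x, penalty=1e6):
    # constraints are callables g with feasibility meaning g(x) <= 0;
    # any violation inflates the minimization objective proportionally
    violation = sum(max(0.0, g(x)) for g in constraints)
    return objective(x) + penalty * violation
```

For example, with objective f(x) = x² and the constraint x ≥ 1 written as g(x) = 1 − x ≤ 0, infeasible points receive a large penalized fitness and are naturally avoided by the optimizer.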

5.1 Multi-disc clutch braking problem

In the field of vehicle engineering, the multi-disc clutch braking problem is a common constrained engineering design problem. The aim is to minimize the mass of the multi-disc clutch, subject to eight constraints, by optimizing five variables, thereby improving the clutch's performance. The five variables are the inner diameter r_i, the outer diameter r_o, the brake disc thickness t, the driving force F, and the surface friction coefficient Z. The structure of the multi-disc clutch is shown in Fig. 6.

figure 6

Schematic diagram of the multi-disc clutch braking problem

The mathematical formulation of the multi-disc clutch braking problem is as follows.

Objective function:

Subject to:

Variable range:

Other parameters:

After calculation and experimentation, the results of the multi-disc clutch braking problem are shown in Table 8. MCOA obtains an inner diameter r_i = 70, an outer diameter r_o = 90, a brake disc thickness t = 1, a driving force F = 600, and a surface friction coefficient Z = 2, yielding a minimum weight of 0.2352424, an improvement of 11.16% over the original algorithm. The results of the other five algorithms on this problem show optimization effects far below that of MCOA.

5.2 Design problem of welding beam

The welded beam design problem is very common in the field of structural engineering. It involves four decision variables (weld width h, connecting beam length l, beam height t, and connecting beam thickness b) and seven constraints, which makes it challenging to solve. The aim of the optimization algorithm is to achieve the best structural performance and minimum weight of the welded beam by optimizing details such as the shape, size, and layout of the weld under these constraints. The structure of the welded beam is shown in Fig. 7.

figure 7

Schematic diagram of the welded beam design problem

The mathematical formulation of the welded beam design problem is as follows.

Boundaries:

The experimental results of the welded beam design problem are shown in Table 9. The MCOA algorithm obtains a weld width h = 0.203034, connecting beam length l = 3.310032, beam height t = 9.084002, and connecting beam thickness b = 0.20578751. The resulting minimum weight is 1.707524, a 1.46% improvement over the original COA. In the welded beam design problem, the achieved weight determines how well an algorithm transfers to practice; since MCOA attains a smaller weight than all the other algorithms, its practical applicability is correspondingly better.
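The paper's cost function is not reproduced above; in the standard statement of this benchmark it is f(h, l, t, b) = 1.10471·h²·l + 0.04811·t·b·(14 + l), covering weld material and bar stock. A minimal sketch under this assumed formulation reproduces the Table 9 optimum:

```python
def welded_beam_cost(h, l, t, b):
    # Fabrication cost of the welded beam (standard formulation):
    # weld material term + bar-stock term.
    return 1.10471 * h**2 * l + 0.04811 * t * b * (14.0 + l)

# The MCOA optimum reported in Table 9:
cost = welded_beam_cost(h=0.203034, l=3.310032, t=9.084002, b=0.20578751)
assert abs(cost - 1.707524) < 1e-5
```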

5.3 Design problem of reducer

A reducer is a mechanical device used to reduce rotational speed and increase torque. Gears and bearings are indispensable parts of reducer design and have a great impact on the transmission efficiency, running stability, and service life of the reducer, while the weight of the reducer also determines its usability. We therefore adjust parameters such as the number of teeth, shape, and radius of the gears to maximize the reducer's effectiveness, reduce friction between parts, and extend its service life. The problem has seven constrained variables: the gear width x_1, the gear module x_2, the number of gear teeth x_3, the length of the first shaft between bearings x_4, the length of the second shaft between bearings x_5, the diameter of the first shaft x_6, and the diameter of the second shaft x_7. The specific structure of the reducer is shown in Fig. 8.

Fig. 8 Schematic diagram of the reducer design problem

The mathematical model of the reducer design problem is as follows.

The experimental results of the reducer design problem are shown in Table 10. The MCOA algorithm obtains a gear width x_1 = 3.47635, gear module x_2 = 0.7, number of gear teeth x_3 = 17, length of the first shaft between bearings x_4 = 7.3, length of the second shaft between bearings x_5 = 7.8, diameter of the first shaft x_6 = 3.348620, and diameter of the second shaft x_7 = 5.2768. The resulting minimum weight is 2988.27135, a 0.08% improvement over the original COA. In this experiment, MCOA attains the smallest minimum weight among all compared algorithms, which shows that MCOA has the best optimization effect on this class of problems.

5.4 Design problem of three-bar truss

Three-bar truss structures are widely used in bridges, buildings, mechanical equipment, and other fields, yet the size, shape, and connection mode of the bars still require careful design. Owing to the symmetry of the structure, only the cross-sectional areas A_1 = x_1 and A_2 = x_2 need to be considered when solving this problem. In addition, the design is constrained by the total supported load, material cost, and other conditions such as the cross-sectional area. The structural diagram of the three-bar truss is shown in Fig. 9.

Fig. 9 Schematic diagram of the three-bar truss design problem

The mathematical formulation of the three-bar truss design problem is as follows.

The experimental results of the three-bar truss design problem are shown in Table 11, from which it can be seen that the MCOA algorithm obtains x_1 = 0.7887564 and x_2 = 0.4079948. The resulting minimum weight is 263.85438633, a 0.24% improvement over the original COA and the smallest value among all compared algorithms. It follows that MCOA optimizes the three-bar truss design problem effectively.

The experimental results on the four constrained engineering design problems show that MCOA performs well on this class of problems. Next, we introduce the wrapper-based high-dimensional feature selection problem and use its classification performance to further judge whether MCOA has good optimization performance and the ability to handle diverse problems.
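Constrained problems such as the four above are typically handled in metaheuristics by folding constraint violations into the fitness. A minimal static-penalty sketch (the penalty weight 1e6 is an illustrative choice, not a value taken from the paper):

```python
def penalized_fitness(f, constraints, x, penalty=1e6):
    """Static-penalty fitness: objective plus a large penalty for each
    violated inequality constraint g_i(x) <= 0."""
    violation = sum(max(0.0, g(x)) ** 2 for g in constraints)
    return f(x) + penalty * violation

# Toy example: minimize x^2 subject to x >= 1, i.e. g(x) = 1 - x <= 0.
f = lambda x: x * x
g = lambda x: 1.0 - x
assert penalized_fitness(f, [g], 2.0) == 4.0   # feasible: no penalty added
assert penalized_fitness(f, [g], 0.0) > 1e5    # infeasible: heavily penalized
```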

6 High-dimensional feature selection problem

The objective of feature selection is to eliminate redundant and irrelevant features, thereby obtaining a more accurate model. However, in high-dimensional feature spaces, feature selection encounters challenges such as high computational cost and susceptibility to over-fitting. To tackle these issues, this paper applies metaheuristic algorithms to high-dimensional feature selection, aiming to enhance its efficiency and effectiveness on complex, high-dimensional datasets.

High-dimensional feature selection, as discussed in reference (Ghaemi and Feizi-Derakhshi 2016 ), focuses on processing high-dimensional data to extract relevant features while eliminating redundant and irrelevant ones. This process enhances the model's generalization ability and reduces computational costs. The problem of high-dimensional feature selection is often referred to as sparse modeling, encompassing two primary methods: filter and wrapper. Filter methods, also called classifier-independent methods, can be categorized into univariate and multivariate methods. Univariate methods consider individual features independently, leveraging the correlation and dependence within the data to quickly screen and identify the optimal feature subset. On the other hand, multivariate methods assess relationships between multiple features simultaneously, aiming to comprehensively select the most informative feature combinations. Wrapper methods offer more diverse solutions. This approach treats feature selection as an optimization problem, utilizing specific performance measures of classifiers and objective functions. Wrapper methods continuously explore and evaluate various feature combinations to find the best set of features that maximizes the model’s performance. Unlike filter methods, wrapper methods provide a more customized and problem-specific approach to feature selection.

Filter methods are comparatively simple and one-sided: they approach feature selection by considering individual features and their relationships within the dataset, and they may lack the flexibility needed for complex, problem-specific scenarios. Wrapper methods, in contrast, offer tailored, problem-specific solutions. They exhibit strong adaptability, wide applicability, and high relevance to the problem at hand, and they can be integrated with any learning algorithm. By treating feature selection as an optimization problem and continuously evaluating different feature combinations, wrapper methods can optimize performance to a greater extent than filter methods. In summary, wrapper methods enable the algorithm to achieve its full potential by selecting the most relevant and informative features for the given task.
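The wrapper idea can be sketched as a search over boolean feature masks scored by a classifier-based fitness. Here the search is plain random sampling rather than MCOA, and the fitness is a toy stand-in, purely for illustration:

```python
import random

def wrapper_select(fitness, n_features, n_iter=200, seed=0):
    """Minimal wrapper-style search: propose random feature masks and
    keep the one with the lowest fitness (e.g. classification error)."""
    rng = random.Random(seed)
    best_mask, best_fit = None, float("inf")
    for _ in range(n_iter):
        mask = [rng.random() < 0.5 for _ in range(n_features)]
        if not any(mask):          # require at least one selected feature
            continue
        f = fitness(mask)
        if f < best_fit:
            best_mask, best_fit = mask, f
    return best_mask, best_fit

# Toy fitness: feature 2 is the only informative one; extra features
# incur a small cost, mimicking a "fewer features is better" criterion.
toy = lambda m: (0.0 if m[2] else 1.0) + 0.01 * sum(m)
mask, fit = wrapper_select(toy, n_features=5)
assert mask[2] and fit < 1.0
```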

6.1 Fitness function

In this subsection, the wrapper method for high-dimensional feature selection is elucidated, using the classification error rate (CEE) (Wang et al. 2005) as the fitness (objective) function to assess the optimization effectiveness of the feature selection algorithm. Specifically, CEE quantifies the classification error rate of the k-nearest-neighbors (KNN) algorithm (Datasets | Feature Selection @ ASU 2019), with the Euclidean distance (ED) (The UCI Machine Learning Repository) serving as the metric between the sample being classified and its neighbors. By using CEE as the fitness function, the wrapper method evaluates different feature subsets based on their performance with the KNN classifier. This enables the algorithm to identify the features that yield the lowest classification error rate, thereby optimizing the model's performance. By focusing on classification accuracy in a specific algorithmic context, the wrapper method ensures that the selected features are highly tailored to the problem and the chosen learning algorithm, enhancing overall performance on high-dimensional data.

where X denotes the feature data, Y denotes the labels, both taken from the given data model, and D is the total number of features.
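A minimal sketch of the CEE fitness described above, assuming a 1-NN classifier with Euclidean distance restricted to the features selected by a boolean mask (the toy dataset is illustrative, not from the paper):

```python
import math
from collections import Counter

def knn_error_rate(train_X, train_y, test_X, test_y, mask, k=1):
    """CEE-style fitness: error rate of a k-NN classifier (Euclidean
    distance) using only the features selected by the boolean mask."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2
                             for i, (ai, bi) in enumerate(zip(a, b)) if mask[i]))
    errors = 0
    for x, y in zip(test_X, test_y):
        nearest = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
        vote = Counter(label for _, label in nearest).most_common(1)[0][0]
        errors += (vote != y)
    return errors / len(test_y)

# Toy data: feature 0 separates the classes, feature 1 is misleading.
train_X = [(0.0, 1.0), (0.0, 0.9), (1.0, 0.0), (1.0, 0.1)]
train_y = [0, 0, 1, 1]
test_X, test_y = [(0.1, 0.0), (0.9, 1.0)], [0, 1]
assert knn_error_rate(train_X, train_y, test_X, test_y, (True, False)) == 0.0
assert knn_error_rate(train_X, train_y, test_X, test_y, (False, True)) == 1.0
```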

In the experimental setup, each dataset is partitioned into a training set and a test set at an 80%/20% ratio. The training set is used to select the most characteristic features and fine-tune the parameters of the KNN model; the test set is then employed to evaluate the data model and the algorithm's performance. To address concerns about fitting ability and overfitting, stratified cross-validation with K = 10 was employed. The training portion is divided into ten equal-sized subsets; the KNN classifier is trained on 9 of the 10 folds (K − 1 folds), while the remaining fold is used for validation. This process is repeated 10 times so that each subset serves both as a validation set and as part of the training data. This iterative approach is a crucial component of our evaluation methodology: by rotating the validation fold across repeated training runs, we enhance the reliability and accuracy of the evaluation, enabling a comprehensive analysis of the algorithm's effectiveness across datasets.
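The stratified 10-fold split can be sketched as follows: the indices of each class are dealt round-robin into k folds, so every fold preserves the class proportions:

```python
from collections import defaultdict

def stratified_kfold(labels, k=10):
    """Yield (train_idx, val_idx) pairs; each class is spread evenly
    across the k folds, as in stratified cross-validation."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():      # deal each class round-robin
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    for f in range(k):
        val = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, val

# 40 samples, two balanced classes: every fold holds 2 of each class.
labels = [0] * 20 + [1] * 20
splits = list(stratified_kfold(labels, k=10))
assert len(splits) == 10
for train, val in splits:
    assert len(val) == 4 and len(train) == 36
    assert sorted(labels[i] for i in val) == [0, 0, 1, 1]
```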

6.2 High-dimensional datasets

In this subsection, the optimization performance of MCOA is assessed using 12 high-dimensional datasets sourced from the Arizona State University (Too et al. 2021 ) and University of California Irvine (UCI) Machine Learning databases (Chandrashekar and Sahin 2014 ). By conducting experiments on these high-dimensional datasets, the results obtained are not only convincing but also pose significant challenges. These datasets authentically capture the intricacies of real-life spatial problems, making the experiments more meaningful and applicable to complex and varied spatial scenarios. For a detailed overview of the 12 high-dimensional datasets, please refer to Table  12 .

6.3 Experimental results and analysis

In order to assess the effectiveness and efficiency of MCOA in feature selection, we conducted comparative tests using MCOA as well as several other algorithms including COA, SSA, PSO, ABC, WSA (Baykasoğlu et al. 2020 ), FPA (Yang 2012 ), and ABO (Qi et al. 2017 ) on 12 datasets. In this section of the experiment, the fitness value of each algorithm was calculated, and the convergence curve, feature selection accuracy (FS Accuracy), and selected feature size for each algorithm were analyzed. Figures 10 , 11 and 12 display the feature selection (FS) convergence curve, FS Accuracy, and selected feature size for the eight algorithms across the 12 datasets. From these figures, it is evident that the optimization ability and prediction accuracy of the MCOA algorithm surpass those of the other seven comparison algorithms. Taking the dataset CLL-SUB-111 as an example in Figs.  11 and 12 , MCOA selected 20 features, while the other seven algorithms selected more than 2000 features. Moreover, the prediction accuracy achieved by MCOA was higher than that of the other seven algorithms. Across all 12 datasets, the comparison figures indicate that the MCOA algorithm consistently outperforms the others. Specifically, the MCOA algorithm tends to select smaller feature subsets, leading to higher prediction accuracy and stronger optimization capabilities. This pattern highlights the superior performance of MCOA in feature selection, demonstrating its effectiveness in optimizing feature subsets for improved prediction accuracy.

Fig. 10 Convergence curve of FS

Fig. 11 Comparison plot of verification accuracy of eight algorithms

Fig. 12 Comparison plots of feature sizes of the eight algorithms

To address the randomness and instability inherent in experiments, a single experiment may not fully demonstrate the effectiveness of algorithm performance. Therefore, we conducted 30 independent experiments using 12 datasets and 8 algorithms. For each algorithm and dataset combination, we calculated the average fitness value, standard deviation of the fitness value, and Friedman rank. Subsequently, the Wilcoxon rank sum test was employed to determine significant differences between the performance of different algorithms across various datasets. Throughout the experiment, a fixed population size of 10 and a maximum of 100 iterations were used. The 12 datasets were utilized to evaluate the 8 algorithms 300 times (tenfold cross-validation × 30 runs). It is essential to note that all algorithms were assessed using the same fitness function derived from the dataset, ensuring a consistent evaluation criterion across the experiments. By conducting multiple independent experiments and statistical analyses, the study aimed to provide a comprehensive and robust assessment of algorithm performance. This approach helps in drawing reliable conclusions regarding the comparative effectiveness of the algorithms under consideration across different datasets, accounting for the inherent variability and randomness in the experimental process.
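The Friedman ranking used above can be sketched as follows (ties are ignored for simplicity; the full test averages tied ranks):

```python
def average_ranks(fitness):
    """fitness[d][a] = mean fitness of algorithm a on dataset d (lower is
    better). Returns each algorithm's rank averaged over datasets."""
    n_alg = len(fitness[0])
    totals = [0.0] * n_alg
    for row in fitness:
        order = sorted(range(n_alg), key=lambda a: row[a])
        for rank, a in enumerate(order, start=1):
            totals[a] += rank
    return [t / len(fitness) for t in totals]

# Two datasets, three algorithms: algorithm 0 wins on both.
ranks = average_ranks([[0.1, 0.3, 0.2],
                       [0.2, 0.5, 0.4]])
assert ranks == [1.0, 3.0, 2.0]
```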

Table 13 presents the average fitness over 30 independent experiments for the eight algorithms; MCOA improves on the original COA by 55.23%. According to the table, on the Ionosphere dataset MCOA attains the best average fitness, albeit with slightly lower stability than ABC. Similarly, on the WarpAR10P dataset MCOA achieves the best average fitness, with stability slightly lower than COA. Friedman ranking of the fitness results from the 30 independent experiments shows that, although MCOA is slightly less stable on some datasets, it ranks first overall. Among the other seven algorithms, PSO ranks second, ABO third, and COA fourth, while ABC, SSA, FPA, and WSA rank fifth to eighth, respectively. These results demonstrate that MCOA exhibits robust optimization performance and high stability on high-dimensional feature selection problems, and its clear advantage over COA confirms the effectiveness of the proposed improvements.

Table 14 presents the accuracy results of the eight algorithms over 30 independent experiments; MCOA improves on the original COA by 10.85%. According to the table, the average accuracy of MCOA is the highest across all datasets. Notably, on the Colon dataset MCOA achieves a perfect average accuracy of 100%. However, on the Ionosphere dataset MCOA is slightly less stable than ABC, and on the WarpAR10P dataset slightly less stable than COA. Friedman ranking of the average accuracy over the 30 independent experiments shows that MCOA ranks first overall. Among the other seven algorithms, PSO ranks second, ABC third, and COA fourth, while ABO, FPA, SSA, and WSA rank fifth to eighth, respectively. These results highlight that MCOA consistently achieves high accuracy and stability on high-dimensional feature selection problems, underscoring its effectiveness and reliability in real-world applications.

Table 15 shows that the MCOA algorithm achieves significant results in the Wilcoxon rank sum test on the high-dimensional feature selection fitness values. The p-values of the comparisons with the other algorithms are below 0.05, indicating that MCOA differs significantly from the other seven algorithms. This serves as evidence that MCOA outperforms them, showcasing its superior optimization performance. The comparison with the original COA likewise shows that MCOA has a substantial positive impact, demonstrating its effectiveness and improvement over the existing method. These findings underscore the algorithm's potential to provide substantial enhancements in high-dimensional feature selection.

7 Conclusions and future work

The Crayfish Optimization Algorithm (COA) is grounded in swarm intelligence, drawing inspiration from crayfish behavior to find optimal solutions within a specific range. However, COA’s limitations stem from neglecting crucial survival traits of crayfish, such as crawling against water to discover better aquatic environments. This oversight weakens COA’s search ability, making it susceptible to local optima and hindering its capacity to find optimal solutions. To address these issues, this paper introduces a Modified Crayfish Optimization Algorithm (MCOA). MCOA incorporates an environmental updating mechanism, enabling crayfish to randomly select directions toward better aquatic environments for location updates, enhancing search ability. The addition of the ghost opposition-based learning strategy expands MCOA’s search range and promotes escape from local optima. Experimental validations using IEEE CEC2020 benchmark functions confirm MCOA’s outstanding optimization performance.

Moreover, MCOA's practical applicability is demonstrated through applications to four constrained engineering problems and high-dimensional feature selection challenges. These experiments underscore MCOA's efficacy in real-world scenarios, but MCOA can currently solve only single-objective optimization problems. In future studies, efforts will be made to further optimize MCOA and enhance its functionality. We will develop a multi-objective version of the algorithm, increasing its search ability and convergence through non-dominated sorting, multi-objective selection, crossover, mutation, and related operators, so as to solve more complex practical problems and extend MCOA to wireless sensor network coverage, machine learning, image segmentation, and other practical applications.

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Abualigah L, Diabat A, Mirjalili S, Abd Elaziz M, Gandomi AH (2021) The arithmetic optimization algorithm. Comput Methods Appl Mech Eng 376:113609. https://doi.org/10.1016/j.cma.2020.113609


Abualigah L, Elaziz MA, Khasawneh AM, Alshinwan M, Ibrahim RA, Al-Qaness MA, Gandomi AH (2022) Meta-heuristic optimization algorithms for solving real-world mechanical engineering design problems: a comprehensive survey, applications, comparative analysis, and results. Neural Comput Appl. https://doi.org/10.1007/s00521-021-06747-4


Ahmed AM, Rashid TA, Saeed SAM (2020) Cat swarm optimization algorithm: a survey and performance evaluation. Comput Intell Neurosci. https://doi.org/10.1155/2020/4854895

Baykasoğlu A, Ozsoydan FB, Senol ME (2020) Weighted superposition attraction algorithm for binary optimization problems. Oper Res Int Journal 20:2555–2581. https://doi.org/10.1007/s12351-018-0427-9

Baykasoglu A, Ozsoydan FB (2015) Adaptive firefly algorithm with chaos for mechanical design optimization problems. Appl Soft Comput 36:152–164. https://doi.org/10.1016/j.asoc.2015.06.056

Belge E, Altan A, Hacıoğlu R (2022) Metaheuristic optimization-based path planning and tracking of quadcopter for payload hold-release mission. Electronics 11(8):1208. https://doi.org/10.3390/electronics11081208

Beyer HG, Schwefel HP (2002) Evolution strategies–a comprehensive introduction. Nat Comput 1:3–52. https://doi.org/10.1023/A:1015059928466

Chandrashekar G, Sahin F (2014) A survey on feature selection methods. Comput Electr Eng 40(1):16–28. https://doi.org/10.1016/j.compeleceng.2013.11.024

Cherrington, M., Thabtah, F., Lu, J., & Xu, Q. (2019, April). Feature selection: filter methods performance challenges. In 2019 International Conference on Computer and Information Sciences (ICCIS) (pp. 1–4). IEEE. https://doi.org/10.1109/ICCISci.2019.8716478

Datasets | Feature Selection @ ASU. Accessed 3 Oct 2019. https://jundongl.github.io/scikit-feature/OLD/home_old.html

Deng L, Liu S (2023) Snow ablation optimizer: a novel metaheuristic technique for numerical optimization and engineering design. Expert Syst Appl 225:120069. https://doi.org/10.1016/j.eswa.2023.120069

Dhiman G, Kaur A (2019) STOA: a bio-inspired based optimization algorithm for industrial engineering problems. Eng Appl Artif Intell 82:148–174. https://doi.org/10.1016/j.engappai.2019.03.021

Dorigo M, Birattari M, Stutzle T (2006) Ant colony optimization. IEEE Comput Intell Mag 1(4):28–39. https://doi.org/10.1109/MCI.2006.329691

Eskandar H, Sadollah A, Bahreininejad A, Hamdi M (2012) Water cycle algorithm–a novel metaheuristic optimization method for solving constrained engineering optimization problems. Comput Struct 110:151–166. https://doi.org/10.1016/j.compstruc.2012.07.010

Espejo, P. G., Ventura, S., & Herrera, F. (2009) A survey on the application of genetic programming to classification. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews). 40(2): 121–144. https://doi.org/10.1109/TSMCC.2009.2033566

Ezugwu AE, Agushaka JO, Abualigah L, Mirjalili S, Gandomi AH (2022) Prairie dog optimization algorithm. Neural Comput Appl 34(22):20017–20065. https://doi.org/10.1007/s00521-022-07530-9

Faramarzi A, Heidarinejad M, Mirjalili S, Gandomi AH (2020) Marine predators algorithm: a nature-inspired metaheuristic. Expert Syst Appl 152:113377. https://doi.org/10.1016/j.eswa.2020.113377

Formato RA (2007) Central force optimization. Prog Electromagn Res 77(1):425–491. https://doi.org/10.2528/PIER07082403

Ghaemi M, Feizi-Derakhshi MR (2016) Feature selection using forest optimization algorithm. Pattern Recogn 60:121–129. https://doi.org/10.1016/j.patcog.2016.05.012

Guedria NB (2016) Improved accelerated PSO algorithm for mechanical engineering optimization problems. Appl Soft Comput 40:455–467. https://doi.org/10.1016/j.asoc.2015.10.048

He Q, Wang L (2007) An effective co-evolutionary particle swarm optimization for constrained engineering design problems. Eng Appl Artif Intel 20:89–99. https://doi.org/10.1016/j.engappai.2006.03.003

Heidari AA, Mirjalili S, Faris H, Aljarah I, Mafarja M, Chen H (2019) Harris hawks optimization: algorithm and applications. Futur Gener Comput Syst 97:849–872. https://doi.org/10.1016/j.future.2019.02.028

Jacob DIJ, Darney DPE (2021) Artificial bee colony optimization algorithm for enhancing routing in wireless networks. J Artif Intell Capsule Networks 3(1):62–71. https://doi.org/10.36548/jaicn.2021.1.006

Jia H, Peng X, Lang C (2021) Remora optimization algorithm. Expert Syst Appl 185:115665. https://doi.org/10.1016/j.eswa.2021.115665

Jia H, Wen Q, Wu D, Wang Z, Wang Y, Wen C, Abualigah L (2023a) Modified beluga whale optimization with multi-strategies for solving engineering problems. J Comput Design Eng 10(6):2065–2093. https://doi.org/10.1093/jcde/qwad089

Jia H, Rao H, Wen C, Mirjalili S (2023b) Crayfish optimization algorithm. Artif Intell Rev. https://doi.org/10.1007/s10462-023-10567-4

Jia H, Lu C, Wu D, Wen C, Rao H, Abualigah L (2023c) An improved reptile search algorithm with ghost opposition-based learning for global optimization problems. J Comput Design Eng. https://doi.org/10.1093/jcde/qwad048

Jović, A., Brkić, K., & Bogunović, N. (2015). A review of feature selection methods with applications. In 2015 38th international convention on information and communication technology, electronics and microelectronics (MIPRO) (pp. 1200–1205). IEEE. https://doi.org/10.1109/MIPRO.2015.7160458

Kamal M, Mortazavi A, Cakici Z (2023) Optimal design of RC bracket and footing systems of precast industrial buildings using fuzzy differential evolution incorporated virtual mutant. Arabian J Sci Eng. https://doi.org/10.3934/mbe.2022263

Kandemir EC, Mortazavi A (2022) Optimization of seismic base isolation system using a fuzzy reinforced swarm intelligence. Adv Eng Softw 174:103323. https://doi.org/10.1016/j.advengsoft.2022.103323

Kaveh A (2017) Applications of metaheuristic optimization algorithms in civil engineering. Springer International Publishing, Basel, Switzerland. https://doi.org/10.1007/978-3-319-48012-1


Kaveh A, Khayatazad M (2012) A new meta-heuristic method: ray optimization. Comput Struct 112:283–294. https://doi.org/10.1016/j.compstruc.2012.09.003

Kaveh A, Mahdavi V (2014) Colliding bodies optimization: a novel meta-heuristic method. Comput Struct 139:18–27. https://doi.org/10.1016/j.compstruc.2014.04.005

Kira, K., & Rendell, L. A. (1992). The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the tenth national conference on Artificial intelligence (pp. 129–134). https://doi.org/10.5555/1867135.1867155

Kiran MS (2015) TSA: tree-seed algorithm for continuous optimization. Expert Syst Appl 42(19):6686–6698. https://doi.org/10.1016/j.eswa.2015.04.055

Kumar S, Tejani GG, Mirjalili S (2019) Modified symbiotic organisms search for structural optimization. Eng with Comput 35(4):1269–1296. https://doi.org/10.1007/s00366-018-0662-y

Kuo HC, Lin CH (2013) Cultural evolution algorithm for global optimizations and its applications. J Appl Res Technol 11(4):510–522

Liu X, Lu P (2014) Solving nonconvex optimal control problems by convex optimization. J Guid Control Dyn 37(3):750–765. https://doi.org/10.2514/1.62110

Lu P, Ye L, Zhao Y, Dai B, Pei M, Tang Y (2021) Review of meta-heuristic algorithms for wind power prediction: methodologies, applications and challenges. Appl Energy 301:117446. https://doi.org/10.1016/j.apenergy.2021.117446

Ma Y, Zhang X, Song J, Chen L (2021) A modified teaching–learning-based optimization algorithm for solving optimization problem. Knowl-Based Syst 212:106599. https://doi.org/10.1016/j.knosys.2023.110554

Mahdavi M, Fesanghary M, Damangir E (2007) An improved harmony search algorithm for solving optimization problems. Appl Math Comput 188(2):1567–1579. https://doi.org/10.1016/j.amc.2006.11.033

Mahdavi S, Rahnamayan S, Deb K (2018) Opposition based learning: a literature review. Swarm Evol Comput 39:1–23. https://doi.org/10.1016/j.swevo.2017.09.010

Meloni C, Pacciarelli D, Pranzo M (2004) A rollout metaheuristic for job shop scheduling problems. Ann Oper Res 131:215–235. https://doi.org/10.1023/B:ANOR.0000039520.24932.4b

Mirjalili S (2015) Moth-flame optimization algorithm: a novel nature-inspired heuristic paradigm. Knowl-Based Syst 89:228–249. https://doi.org/10.1016/j.knosys.2015.07.006

Mirjalili S (2016) SCA: a sine cosine algorithm for solving optimization problems. Knowl-Based Syst 96:120–133. https://doi.org/10.1016/j.knosys.2015.12.022

Mirjalili S, Mirjalili SM, Hatamlou A (2016) Multi-verse optimizer: a nature-inspired algorithm for global optimization. Neural Comput Appl 27:495–513. https://doi.org/10.1007/s00521-015-1870-7

Mirjalili S, Gandomi AH, Mirjalili SZ, Saremi S, Faris H, Mirjalili SM (2017) Salp swarm algorithm: a bio-inspired optimizer for engineering design problems. Adv Eng Softw 114:163–191. https://doi.org/10.1016/j.advengsoft.2017.07.002

Mirjalili, S., & Mirjalili, S. (2019). Genetic algorithm. Evolutionary Algorithms and Neural Networks: Theory and Applications, 43–55. https://doi.org/10.1007/978-3-319-93025-1_4

Moghdani R, Salimifard K (2018) Volleyball premier league algorithm. Appl Soft Comput 64:161–185. https://doi.org/10.1016/j.asoc.2017.11.043

Moloodpoor M, Mortazavi A (2022) Simultaneous optimization of fuel type and exterior walls insulation attributes for residential buildings using a swarm intelligence. Int J Environ Sci Technol 19(4):2809–2822. https://doi.org/10.1007/s13762-021-03323-0

Moloodpoor M, Mortazavi A, Özbalta N (2021) Thermo-economic optimization of double-pipe heat exchanger using a compound swarm intelligence. Heat Transfer Res. https://doi.org/10.1615/HeatTransRes.2021037293

Mortazavi A (2019) Comparative assessment of five metaheuristic methods on distinct problems. Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi 10(3):879–898. https://doi.org/10.24012/dumf.585790

Papaioannou G, Koulocheris D (2018) An approach for minimizing the number of objective functions in the optimization of vehicle suspension systems. J Sound Vib 435:149–169. https://doi.org/10.1016/j.jsv.2018.08.009

Piotrowski AP (2018) L-SHADE optimization algorithms with population-wide inertia. Inf Sci 468:117–141. https://doi.org/10.1016/j.ins.2018.08.030

Qi X, Zhu Y, Zhang H (2017) A new meta-heuristic butterfly-inspired algorithm. Journal of Computational Science 23:226–239. https://doi.org/10.1016/j.jocs.2017.06.003

Rao H, Jia H, Wu D, Wen C, Li S, Liu Q, Abualigah L (2022) A modified group teaching optimization algorithm for solving constrained engineering optimization problems. Mathematics 10(20):3765. https://doi.org/10.3390/math10203765

Rao, R. V., & Rao, R. V. (2016). Teaching-learning-based optimization algorithm (pp. 9–39). Springer International Publishing. https://doi.org/10.1016/j.cad.2010.12.015

Rashedi E, Nezamabadi-Pour HS (2009) GSA: a gravitational search algorithm. Inform Sci 179:2232–2248. https://doi.org/10.1016/j.ins.2009.03.004

Razmjooy, N., Ashourian, M., & Foroozandeh, Z. (Eds). (2021). Metaheuristics and optimization in computer and electrical engineering. https://doi.org/10.1007/978-3-030-56689-0

Sadollah A, Bahreininejad A, Eskandar H, Hamdi M (2013) Mine blast algorithm: a new population based algorithm for solving constrained engineering optimization problems. Appl Soft Comput 13:2592–2612. https://doi.org/10.1016/j.asoc.2012.11.026

Saremi S, Mirjalili S, Lewis A (2017) Grasshopper optimisation algorithm: theory and application. Adv Eng Softw 105:30–47. https://doi.org/10.1016/j.advengsoft.2017.01.004

Sayed GI, Darwish A, Hassanien AE (2018) A new chaotic multi-verse optimization algorithm for solving engineering optimization problems. J Exp Theor Artif Intell 30(2):293–317. https://doi.org/10.1080/0952813X.2018.1430858

Shadravan S, Naji HR, Bardsiri VK (2019) The sailfish optimizer: a novel nature-inspired metaheuristic algorithm for solving constrained engineering optimization problems. Eng Appl Artif Intell 80:20–34. https://doi.org/10.1016/j.engappai.2019.01.001

Simon D (2008) Biogeography-based optimization. IEEE Trans Evol Comput 12(6):702–713. https://doi.org/10.1109/TEVC.2008.919004

Storn R, Price K (1997) Differential evolution–a simple and efficient heuristic for global optimization over continuous spaces. J Global Optim 11:341–359. https://doi.org/10.1023/A:1008202821328

Su H, Zhao D, Heidari AA, Liu L, Zhang X, Mafarja M, Chen H (2023) RIME: a physics-based optimization. Neurocomputing 532:183–214. https://doi.org/10.1016/j.neucom.2023.02.010

Talatahari S, Azizi M, Gandomi AH (2021) Material generation algorithm: a novel metaheuristic algorithm for optimization of engineering problems. Processes 9(5):859. https://doi.org/10.3390/pr9050859

Kelly M, Longjohn R, Nottingham K. The UCI machine learning repository. https://archive.ics.uci.edu

Too J, Mafarja M, Mirjalili S (2021) Spatial bound whale optimization algorithm: an efficient high-dimensional feature selection approach. Neural Comput Appl 33:16229–16250. https://doi.org/10.1007/s00521-021-06224-y

Wang L, Zhang Y, Feng J (2005) On the Euclidean distance of images. IEEE Trans Pattern Anal Mach Intell 27(8):1334–1339. https://doi.org/10.1109/TPAMI.2005.165

Wang D, Tan D, Liu L (2018a) Particle swarm optimization algorithm: an overview. Soft Comput 22:387–408. https://doi.org/10.1007/s00500-016-2474-6

Wang H, Hu Z, Sun Y, Su Q, Xia X (2018b) Modified backtracking search optimization algorithm inspired by simulated annealing for constrained engineering optimization problems. Comput Intell Neurosci. https://doi.org/10.1155/2018/9167414

Wang S, Hussien AG, Jia H, Abualigah L, Zheng R (2022) Enhanced remora optimization algorithm for solving constrained engineering optimization problems. Mathematics 10(10):1696. https://doi.org/10.3390/math10101696

Wu D, Rao H, Wen C, Jia H, Liu Q, Abualigah L (2022) Modified sand cat swarm optimization algorithm for solving constrained engineering optimization problems. Mathematics 10(22):4350. https://doi.org/10.3390/math10224350

Yang, X. S. (2011). Metaheuristic optimization: algorithm analysis and open problems. In International symposium on experimental algorithms (pp. 21–32). Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-20662-7_2

Yang, X. S. (2012, September). Flower pollination algorithm for global optimization. In International conference on unconventional computing and natural computation (pp. 240–249). Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-32894-7_27

Yang XS, Deb S (2014) Cuckoo search: recent advances and applications. Neural Comput Appl 24:169–174. https://doi.org/10.1007/s00521-013-1367-1

Yeniay Ö (2005) Penalty function methods for constrained optimization with genetic algorithms. Math Comput Appl 10(1):45–56. https://doi.org/10.3390/mca10010045

Yıldız BS, Kumar S, Panagant N, Mehta P, Sait SM, Yildiz AR, Mirjalili S (2023) A novel hybrid arithmetic optimization algorithm for solving constrained optimization problems. Knowledge-Based Syst 271:110554. https://doi.org/10.1016/j.knosys.2023.110554

Yuan Y, Shen Q, Wang S, Ren J, Yang D, Yang Q, Mu X (2023) Coronavirus mask protection algorithm: a new bio-inspired optimization algorithm and its applications. J Bionic Eng. https://doi.org/10.1007/s42235-023-00359-5

Zhang Y, Jin Z (2020) Group teaching optimization algorithm: a novel metaheuristic method for solving global optimization problems. Expert Syst Appl 148:113246. https://doi.org/10.1016/j.eswa.2020.113246

Zhang YJ, Wang YF, Tao LW, Yan YX, Zhao J, Gao ZM (2022a) Self-adaptive classification learning hybrid JAYA and Rao-1 algorithm for large-scale numerical and engineering problems. Eng Appl Artif Intell 114:105069. https://doi.org/10.1016/j.engappai.2022.105069

Zhang Y, Wang Y, Li S, Yao F, Tao L, Yan Y, Gao Z (2022b) An enhanced adaptive comprehensive learning hybrid algorithm of Rao-1 and JAYA algorithm for parameter extraction of photovoltaic models. Math Biosci Eng 19(6):5610–5637. https://doi.org/10.3934/mbe.2022263

Zhang YJ, Yan YX, Zhao J, Gao ZM (2022c) AOAAO: The hybrid algorithm of arithmetic optimization algorithm with aquila optimizer. IEEE Access 10:10907–10933. https://doi.org/10.1109/ACCESS.2022.3144431

Zhang YJ, Yan YX, Zhao J, Gao ZM (2022d) CSCAHHO: chaotic hybridization algorithm of the Sine Cosine with Harris Hawk optimization algorithms for solving global optimization problems. PLoS ONE 17(5):e0263387. https://doi.org/10.1371/journal.pone.0263387

Zhang YJ, Wang YF, Yan YX, Zhao J, Gao ZM (2022e) LMRAOA: An improved arithmetic optimization algorithm with multi-leader and high-speed jum** based on opposition-based learning solving engineering and numerical problems. Alex Eng J 61(12):12367–12403. https://doi.org/10.1016/j.aej.2022.06.017

Zhao J, Zhang Y, Li S, Wang Y, Yan Y, Gao Z (2022) A chaotic self-adaptive JAYA algorithm for parameter extraction of photovoltaic models. Math Biosci Eng 19:5638–5670. https://doi.org/10.3934/mbe.2022264

Zhao S, Zhang T, Ma S, Wang M (2023) Sea-horse optimizer: a novel nature-inspired meta-heuristic for global optimization problems. Appl Intell 53(10):11833–11860. https://doi.org/10.1007/s10489-022-03994-3

Download references

Acknowledgements

The authors gratefully acknowledge the support of the Fujian Key Lab of Agriculture IOT Application, the IOT Application Engineering Research Center of Fujian Province Colleges and Universities, Guiding Science and Technology Projects in Sanming City (2023-G-5), the Industry-University Cooperation Project of Fujian Province (2021H6039), the Fujian Province Industrial Guidance (Key) Project (2022H0053), and the Sanming Major Science and Technology Project of Industry-University-Research Collaborative Innovation (2022-G-4), and thank the anonymous reviewers and the editor for their careful reviews and constructive suggestions, which helped improve the quality of this paper.

Author information

Authors and Affiliations

School of Information Engineering, Sanming University, Sanming, 365004, China

Heming Jia, Xuelian Zhou & Jinrui Zhang

Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, 19328, Jordan

Laith Abualigah

MEU Research Unit, Middle East University, Amman, 11831, Jordan

Department of Mechanical Engineering, Bursa Uludağ University, 16059, Görükle, Bursa, Turkey

Ali Riza Yildiz

Department of Computer and Information Science, Linköping University, 58183, Linköping, Sweden

Abdelazim G. Hussien


Contributions

Heming Jia: Methodology, Formal analysis, Investigation, Resources, Funding acquisition, Project administration; Xuelian Zhou: Investigation, Conceptualization, Software, Data Curation, Writing—Original Draft; Jinrui Zhang: Validation, Conceptualization; Laith Abualigah: Supervision, Writing—Review & Editing; Ali Riza Yildiz: Visualization, Writing—Review & Editing; Abdelazim G. Hussien: Writing—Review & Editing.

Corresponding author

Correspondence to Heming Jia.

Ethics declarations

Competing interests

We declare that we have no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Jia, H., Zhou, X., Zhang, J. et al. Modified crayfish optimization algorithm for solving multiple engineering application problems. Artif Intell Rev 57, 127 (2024). https://doi.org/10.1007/s10462-024-10738-x


Accepted: 24 February 2024

Published: 24 April 2024

DOI: https://doi.org/10.1007/s10462-024-10738-x


Keywords

  • Crayfish Optimization Algorithm
  • Environmental updating mechanism
  • Ghost opposition-based learning strategy
  • Global optimization problem
  • Constrained engineering design problems
  • High dimensional feature selection

Published: 24 February 2023

Artificial intelligence in academic writing: a paradigm-shifting technological advance

  • Roei Golan (ORCID: orcid.org/0000-0002-7214-3073)
  • Rohit Reddy
  • Akhil Muthigi
  • Ranjith Ramasamy

Nature Reviews Urology volume 20, pages 327–328 (2023)


  • Preclinical research
  • Translational research

Artificial intelligence (AI) has rapidly become one of the most important and transformative technologies of our time, with applications in virtually every field and industry. Among these applications, academic writing is one of the areas that has experienced perhaps the most rapid development and uptake of AI-based tools and methodologies. We argue that the use of AI-based tools for scientific writing should be widely adopted.




Acknowledgements

The manuscript was edited for grammar and structure using the advanced language model ChatGPT. The authors thank S. Verma for addressing inquiries related to artificial intelligence.

Author information

These authors contributed equally: Roei Golan, Rohit Reddy.

Authors and Affiliations

Department of Clinical Sciences, Florida State University College of Medicine, Tallahassee, FL, USA

Desai Sethi Urology Institute, University of Miami Miller School of Medicine, Miami, FL, USA

Rohit Reddy, Akhil Muthigi & Ranjith Ramasamy


Corresponding author

Correspondence to Ranjith Ramasamy.

Ethics declarations

Competing interests

R.R. is funded by the National Institutes of Health Grant R01 DK130991 and the Clinician Scientist Development Grant from the American Cancer Society. The other authors declare no competing interests.

Additional information

Related links

ChatGPT: https://chat.openai.com/

Cohere: https://cohere.ai/

CoSchedule Headline Analyzer: https://coschedule.com/headline-analyzer

DALL-E 2: https://openai.com/dall-e-2/

Elicit: https://elicit.org/

Penelope.ai: https://www.penelope.ai/

Quillbot: https://quillbot.com/

Semantic Scholar: https://www.semanticscholar.org/

Wordtune by AI21 Labs: https://www.wordtune.com/

Writefull: https://www.writefull.com/


About this article

Cite this article

Golan, R., Reddy, R., Muthigi, A. et al. Artificial intelligence in academic writing: a paradigm-shifting technological advance. Nat Rev Urol 20, 327–328 (2023). https://doi.org/10.1038/s41585-023-00746-x


Published: 24 February 2023

Issue Date: June 2023

DOI: https://doi.org/10.1038/s41585-023-00746-x




Innovation (Camb), v.2(4); 2021 Nov 28

Artificial intelligence: A powerful paradigm for scientific research

1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

35 University of Chinese Academy of Sciences, Beijing 100049, China

5 Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China

10 Zhongshan Hospital Institute of Clinical Science, Fudan University, Shanghai 200032, China

Changping Huang

18 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China

11 Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China

37 Songshan Lake Materials Laboratory, Dongguan, Guangdong 523808, China

26 Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China

Xingchen Liu

28 Institute of Coal Chemistry, Chinese Academy of Sciences, Taiyuan 030001, China

2 Institute of Software, Chinese Academy of Sciences, Beijing 100190, China

Fengliang Dong

3 National Center for Nanoscience and Technology, Beijing 100190, China

Cheng-Wei Qiu

4 Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583, Singapore

6 Department of Gynaecology, Obstetrics and Gynaecology Hospital, Fudan University, Shanghai 200011, China

36 Shanghai Key Laboratory of Female Reproductive Endocrine-Related Diseases, Shanghai 200011, China

7 School of Food Science and Technology, Dalian Polytechnic University, Dalian 116034, China

41 Second Affiliated Hospital School of Medicine, and School of Public Health, Zhejiang University, Hangzhou 310058, China

8 Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing 100191, China

9 Zhejiang Provincial People’s Hospital, Hangzhou 310014, China

Chenguang Fu

12 School of Materials Science and Engineering, Zhejiang University, Hangzhou 310027, China

Zhigang Yin

13 Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou 350002, China

Ronald Roepman

14 Medical Center, Radboud University, 6500 Nijmegen, the Netherlands

Sabine Dietmann

15 Institute for Informatics, Washington University School of Medicine, St. Louis, MO 63110, USA

Marko Virta

16 Department of Microbiology, University of Helsinki, 00014 Helsinki, Finland

Fredrick Kengara

17 School of Pure and Applied Sciences, Bomet University College, Bomet 20400, Kenya

19 Agriculture College of Shihezi University, Xinjiang 832000, China

Taolan Zhao

20 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China

21 The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China

38 Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen 518055, China

Jialiang Yang

22 Geneis (Beijing) Co., Ltd, Beijing 100102, China

23 Department of Communication Studies, Hong Kong Baptist University, Hong Kong, China

24 South China Botanical Garden, Chinese Academy of Sciences, Guangzhou 510650, China

39 Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Guangzhou 510650, China

Zhaofeng Liu

27 Shanghai Astronomical Observatory, Chinese Academy of Sciences, Shanghai 200030, China

29 Suzhou Institute of Nano-Tech and Nano-Bionics, Chinese Academy of Sciences, Suzhou 215123, China

Xiaohong Liu

30 Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, China

James P. Lewis

James M. Tiedje

34 Center for Microbial Ecology, Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48824, USA

40 Zhejiang Lab, Hangzhou 311121, China

25 Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai 200031, China

31 Department of Computer Science, Aberystwyth University, Aberystwyth, Ceredigion SY23 3FL, UK

Zhipeng Cai

32 Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA

33 Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, China

Jiabao Zhang

Artificial intelligence (AI), coupled with promising machine learning (ML) techniques well known from computer science, is broadly affecting many aspects of various fields, including science and technology, industry, and even our day-to-day life. ML techniques have been developed to analyze high-throughput data to obtain useful insights, categorize, predict, and make evidence-based decisions in novel ways, which will promote the growth of novel applications and sustain the continued rise of AI. This paper undertakes a comprehensive survey of the development and application of AI in different aspects of fundamental sciences, including information science, mathematics, medical science, materials science, geoscience, life science, physics, and chemistry. The challenges that each discipline of science meets, and the potential of AI techniques to handle these challenges, are discussed in detail. Moreover, we shed light on new research trends entailing the integration of AI into each scientific discipline. The aim of this paper is to provide a broad research guideline on fundamental sciences with potential infusion of AI, to help motivate researchers to deeply understand the state-of-the-art applications of AI-based fundamental sciences, and thereby to help promote the continuous development of these fundamental sciences.

Graphical abstract


Public summary

  • “Can machines think?” The goal of artificial intelligence (AI) is to enable machines to mimic human thoughts and behaviors, including learning, reasoning, predicting, and so on.
  • “Can AI do fundamental research?” AI coupled with machine learning techniques is impacting a wide range of fundamental sciences, including mathematics, medical science, physics, etc.
  • “How does AI accelerate fundamental research?” New research and applications are emerging rapidly with the support of AI infrastructure, including data storage, computing power, AI algorithms, and frameworks.

Introduction

“Can machines think?” Alan Turing posed this question in his famous paper “Computing Machinery and Intelligence.” 1 He believed that to answer this question we need to define what thinking is, but thinking is difficult to define clearly because it is a subjective behavior. Turing therefore introduced an indirect method to verify whether a machine can think: the Turing test, which examines a machine's ability to show intelligence indistinguishable from that of human beings. A machine that succeeds in the test is qualified to be labeled as artificial intelligence (AI).

AI refers to the simulation of human intelligence by a system or a machine. The goal of AI is to develop a machine that can think like humans and mimic human behaviors, including perceiving, reasoning, learning, planning, predicting, and so on. Intelligence is one of the main characteristics that distinguishes human beings from animals. With successive industrial revolutions, an increasing number of machine types have replaced human labor in all walks of life, and the imminent replacement of human resources by machine intelligence is the next big challenge to be overcome. Numerous scientists are focusing on the field of AI, which makes research in the field rich and diverse. AI research fields include search algorithms, knowledge graphs, natural language processing, expert systems, evolutionary algorithms, machine learning (ML), deep learning (DL), and so on.

The general framework of AI is illustrated in Figure 1. The development of AI progresses through perceptual intelligence, cognitive intelligence, and decision-making intelligence. Perceptual intelligence means that a machine has the basic abilities of vision, hearing, touch, etc., which are familiar to humans. Cognitive intelligence is a higher-level ability of induction, reasoning, and knowledge acquisition; it is inspired by cognitive science, brain science, and brain-like intelligence to endow machines with reasoning and cognitive abilities similar to those of human beings. Once a machine has the abilities of perception and cognition, it is often expected to make optimal decisions as humans do, to improve people's lives, industrial manufacturing, etc. Decision intelligence draws on applied data science, social science, decision theory, and managerial science to extend data science toward making optimal decisions. Achieving perceptual, cognitive, and decision-making intelligence requires the AI infrastructure layer, supported by data, storage and computing power, ML algorithms, and AI frameworks. Models are then trained to learn the internal regularities of data, supporting and realizing AI applications. The application layer of AI is becoming ever more extensive and is deeply integrated with fundamental sciences, industrial manufacturing, human life, social governance, and cyberspace, with a profound impact on our work and lifestyle.


The general framework of AI

History of AI

The beginning of modern AI research can be traced back to John McCarthy, who coined the term “artificial intelligence” (AI) at a conference at Dartmouth College in 1956. This symbolized the birth of the AI scientific field. Progress in the following years was astonishing. Many scientists and researchers focused on automated reasoning and applied AI to prove mathematical theorems and solve algebraic problems. One famous example is Logic Theorist, a computer program written by Allen Newell, Herbert A. Simon, and Cliff Shaw, which proved 38 of the first 52 theorems in “Principia Mathematica” and provided more elegant proofs for some. 2 These successes made many AI pioneers wildly optimistic and underpinned the belief that fully intelligent machines would be built in the near future. However, they soon realized that there was still a long way to go before the end goal of human-equivalent intelligence in machines could come true. Many nontrivial problems could not be handled by logic-based programs. Another challenge was the lack of computational resources for increasingly complicated problems. As a result, organizations and funders stopped supporting these under-delivering AI projects.

AI came back to popularity in the 1980s, as several research institutions and universities developed AI systems that summarize a series of basic rules from expert knowledge to help non-experts make specific decisions. These systems are known as “expert systems.” Examples are XCON, designed by Carnegie Mellon University, and MYCIN, designed by Stanford University. Expert systems derived logic rules from expert knowledge to solve problems in the real world for the first time. The core of AI research during this period was the knowledge that made machines “smarter.” However, expert systems gradually revealed several disadvantages, such as lack of flexibility, poor versatility, and expensive maintenance. At the same time, the Fifth Generation Computer Project, heavily funded by the Japanese government, failed to meet most of its original goals. Once again, funding for AI research ceased, and AI reached the second low point of its life.

In 2006, Geoffrey Hinton and coworkers 3,4 made a breakthrough in AI by proposing an approach for building deeper neural networks, as well as a way to avoid gradient vanishing during training. This reignited AI research, and DL algorithms have become one of the most active fields of AI research. DL is a subset of ML based on multiple layers of neural networks with representation learning, 5 while ML is a part of AI in which a computer or a program can learn and acquire intelligence without human intervention. Thus, “learn” is the keyword of this era of AI research. Big data technologies and the improvement of computing power have made deriving features and information from massive data samples more efficient. An increasing number of new neural network structures and training methods have been proposed to improve the representative learning ability of DL and to further expand it into general applications. Current DL algorithms match and exceed human capabilities on specific datasets in the areas of computer vision (CV) and natural language processing (NLP). AI technologies have achieved remarkable successes in all walks of life, and continue to show their value as backbones in scientific research and real-world applications.
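The idea of stacked layers learning intermediate representations can be sketched with a minimal two-layer network trained by plain gradient descent. Everything here is illustrative rather than taken from the survey: the XOR task, layer sizes, learning rate, and iteration count are toy choices.

```python
import numpy as np

# A toy deep(er) network: the hidden layer learns a representation that
# makes XOR, which no single linear layer can solve, linearly separable.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4))   # hidden-layer weights (sizes are arbitrary)
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))   # output-layer weights
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
lr = 0.5
for _ in range(2000):
    # Forward pass: h is the learned intermediate representation.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))
    # Backward pass: plain gradient descent on mean squared error.
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    d_h = d_out @ W2.T * h * (1 - h)
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0, keepdims=True)
```

With more layers, exactly this backward pass is where the gradient-vanishing problem mentioned above arises, since each sigmoid derivative term shrinks the signal.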

Within AI, ML is having a substantial, broad effect across many aspects of technology and science: from computer science to geoscience to materials science, from life science to medical science to chemistry to mathematics and physics, from management science to economics to psychology, and other data-intensive empirical sciences, as ML methods have been developed to analyze high-throughput data to obtain useful insights, categorize, predict, and make evidence-based decisions in novel ways. Training a system by presenting it with examples of desired input-output behavior can be far easier than programming it manually by anticipating the desired response for all potential inputs. The following sections survey eight fundamental sciences, including information science (informatics), mathematics, medical science, materials science, geoscience, life science, physics, and chemistry, which develop or exploit AI techniques to promote the development of sciences and accelerate their applications to benefit human beings, society, and the world.
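The contrast between hand-coding a response for every input and learning from examples of desired input-output behavior can be shown with a toy least-squares fit. The target rule y = 2x + 1 and the example pairs are hypothetical, chosen only for illustration.

```python
# Instead of programming the rule y = 2x + 1 by hand, we recover it
# from input-output examples (a minimal instance of supervised learning).
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

# Closed-form least-squares fit of y = w*x + b to the examples.
n = len(examples)
sx = sum(x for x, _ in examples)
sy = sum(y for _, y in examples)
sxx = sum(x * x for x, _ in examples)
sxy = sum(x * y for x, y in examples)
w = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - w * sx) / n

def predict(x):
    """Apply the learned rule to an input never seen in the examples."""
    return w * x + b
```

The system was never told the rule, yet `predict(10.0)` generalizes to an input outside the examples, which is the essence of the example-driven training described above.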

AI in information science

AI aims to provide the abilities of perception, cognition, and decision-making for machines. At present, new research and applications in information science are emerging at an unprecedented rate, which is inseparable from the support of the AI infrastructure. As shown in Figure 2, the AI infrastructure layer includes data, storage and computing power, ML algorithms, and the AI framework. The perception layer enables machines to have the basic abilities of vision, hearing, etc. For instance, CV enables machines to “see” and identify objects, while speech recognition and synthesis helps machines to “hear” and recognize speech elements. The cognitive layer provides higher-level abilities of induction, reasoning, and acquiring knowledge with the help of NLP, 6 knowledge graphs, 7 and continual learning. 8 In the decision-making layer, AI is capable of making optimal decisions through automatic planning, expert systems, and decision-support systems. Numerous applications of AI have had a profound impact on fundamental sciences, industrial manufacturing, human life, social governance, and cyberspace. The following subsections provide an overview of the AI framework, automatic machine learning (AutoML) technology, and several state-of-the-art AI/ML applications in the information field.


The knowledge graph of the AI framework

The AI framework provides basic tools for AI algorithm implementation

In the past 10 years, applications based on AI algorithms have played a significant role in various fields and subjects, on the basis of which the prosperity of the DL framework and platform has been founded. AI frameworks and platforms reduce the requirement of accessing AI technology by integrating the overall process of algorithm development, which enables researchers from different areas to use it across other fields, allowing them to focus on designing the structure of neural networks, thus providing better solutions to problems in their fields. At the beginning of the 21st century, only a few tools, such as MATLAB, OpenNN, and Torch, were capable of describing and developing neural networks. However, these tools were not originally designed for AI models, and thus faced problems, such as complicated user API and lacking GPU support. During this period, using these frameworks demanded professional computer science knowledge and tedious work on model construction. As a solution, early frameworks of DL, such as Caffe, Chainer, and Theano, emerged, allowing users to conveniently construct complex deep neural networks (DNNs), such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and LSTM conveniently, and this significantly reduced the cost of applying AI models. Tech giants then joined the march in researching AI frameworks. 9 Google developed the famous open-source framework, TensorFlow, while Facebook's AI research team released another popular platform, PyTorch, which is based on Torch; Microsoft Research published CNTK, and Amazon announced MXNet. Among them, TensorFlow, also the most representative framework, referred to Theano's declarative programming style, offering a larger space for graph-based optimization, while PyTorch inherited the imperative programming style of Torch, which is intuitive, user friendly, more flexible, and easier to be traced. 
As modern AI frameworks and platforms are widely applied, practitioners can now assemble models swiftly and conveniently by adopting various building-block sets and languages specifically suited to given fields. Polished over time, these platforms have gradually developed clearly defined user APIs, support for multi-GPU and distributed training, and a variety of model zoos and toolkits for specific tasks. 10 Looking forward, a few trends may become the mainstream of next-generation framework development. (1) Capability of super-scale model training. With the emergence of models derived from the Transformer, such as BERT and GPT-3, the ability to train large models has become an ideal feature of DL frameworks, requiring them to train effectively at the scale of hundreds or even thousands of devices. (2) Unified API standard. The APIs of many frameworks are generally similar but differ slightly at certain points, which creates difficulties and unnecessary learning effort when users attempt to move from one framework to another. The APIs of some frameworks, such as JAX, have already become compatible with the NumPy standard, which is familiar to most practitioners, so a unified API standard for AI frameworks may gradually come into being in the future. (3) Universal operator optimization. At present, the kernels of DL operators are implemented either manually or based on third-party libraries. Most third-party libraries are developed to suit certain hardware platforms, causing large, unnecessary overhead when models are trained or deployed on different hardware platforms. Moreover, the development speed of new DL algorithms is usually much faster than the update rate of these libraries, which often leaves new algorithms beyond the range of the libraries' support. 11

To improve the implementation speed of AI algorithms, much research focuses on hardware acceleration. The DianNao family is one of the earliest research efforts on AI hardware accelerators. 12 It includes DianNao, DaDianNao, ShiDianNao, and PuDianNao, which can be used to accelerate the inference of neural networks and other ML algorithms. Among these, a 64-chip DaDianNao system can achieve a speedup of 450.65× over a GPU and reduce energy consumption by 150.31×. Prof. Chen and his team at the Institute of Computing Technology also designed an instruction set architecture for a broad range of neural network accelerators, called Cambricon, which developed into a series of DL accelerators. After Cambricon, many AI-related companies, such as Apple, Google, and HUAWEI, developed their own DL accelerators, and AI accelerators became an important research field of AI.

AI for AI—AutoML

AutoML aims to use evolutionary computing, reinforcement learning (RL), and other AI algorithms to automatically generate specified AI algorithms. Research on the automatic generation of neural networks existed before the emergence of DL, e.g., neural evolution. 13 The main purpose of neural evolution is to allow neural networks to evolve according to the principle of survival of the fittest in the biological world. Through selection, crossover, mutation, and other evolutionary operators, the quality of the individuals in a population is continuously improved until, finally, the individual with the greatest fitness represents the best neural network. The biological inspiration for this field lies in the evolutionary process of human brain neurons: the brain owes its highly developed learning and memory functions to its complex system of neurons, and that system is the product of a long evolutionary process rather than of gradient descent and back propagation. In the era of DL, the application of AI algorithms to automatically generate DNNs has attracted more attention and has gradually developed into an important direction of AutoML research: neural architecture search. Implementations of neural architecture search are usually divided into RL-based and evolutionary-algorithm-based methods. In the RL-based method, 14 an RNN is used as a controller to generate a neural network structure layer by layer; the network is then trained, and its accuracy on the validation set is used as the reward signal to compute the policy gradient. During the iteration, the controller assigns higher probability to neural networks with higher accuracy, ensuring that the policy function outputs the optimal network structure. The evolutionary approach to neural architecture search is similar to the neural evolution method: it maintains a population and iterates continuously according to the principle of survival of the fittest to obtain a high-quality neural network. 15 Through the application of neural architecture search, the design of neural networks has become more efficient and automated, and the accuracy of the resulting networks has gradually outperformed that of networks designed by AI experts. For example, Google's SOTA network EfficientNet was realized from a baseline network found through neural architecture search. 16
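
The evolutionary loop described above can be sketched in miniature. This is a toy illustration under strong simplifying assumptions: an "architecture" is just a list of layer widths, and the `fitness` function is a hypothetical stand-in for validation accuracy, which in practice would require training each candidate network.

```python
import random

random.seed(0)

def fitness(arch):
    # Hypothetical proxy score: prefer ~3 layers of width ~64.
    # A real search would train `arch` and measure validation accuracy.
    return -abs(len(arch) - 3) - sum(abs(w - 64) for w in arch) / 100.0

def mutate(arch):
    # Mutation operator: append a layer or change one layer's width.
    arch = list(arch)
    if random.random() < 0.5 and len(arch) < 6:
        arch.append(random.choice([16, 32, 64, 128]))
    else:
        i = random.randrange(len(arch))
        arch[i] = random.choice([16, 32, 64, 128])
    return arch

population = [[32], [128, 16], [64, 64, 64, 64]]
for generation in range(30):
    population.sort(key=fitness, reverse=True)
    survivors = population[:2]                     # selection (elitism)
    children = [mutate(random.choice(survivors)) for _ in range(4)]
    population = survivors + children              # next generation

best = max(population, key=fitness)
print(best)  # the fittest architecture found; tends to improve over generations
```

Because the top individuals always survive, the best fitness in the population never degrades, mirroring the "survival of the fittest" principle in the text.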

AI enabling networking design adaptive to complex network conditions

The application of DL in the networking field has received strong interest. Network design often relies on initial network conditions and/or theoretical assumptions to characterize real network environments. However, traditional network modeling and design, governed by mathematical models, can hardly cope with complex scenarios involving imperfect and highly dynamic network environments. Integrating DL into network research allows for a better representation of complex network environments. Furthermore, DL can be combined with the Markov decision process to form the deep reinforcement learning (DRL) model, which finds an optimal policy based on the reward function and the states of the system. Taken together, these techniques can be used to make better decisions to guide proper network design, thereby improving the network quality of service and quality of experience. Across the different layers of the network protocol stack, DL/DRL can be adopted for network feature extraction, decision-making, etc. In the physical layer, DL can be used for interference alignment; it can also be used to classify modulation modes, design efficient network coding 17 and error correction codes, etc. In the data link layer, DL can be used for resource (such as channel) allocation, medium access control, traffic prediction, 18 link quality evaluation, and so on. In the network (routing) layer, DL-based routing establishment and routing optimization 19 can help to obtain an optimal routing path. In higher layers (such as the application layer), DL can enhance data compression and task allocation. Beyond the protocol stack, one critical area for DL is network security: DL can be used to classify packets as benign or malicious, and it can be integrated with other ML schemes, such as unsupervised clustering, to achieve a better anomaly detection effect.
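
The RL half of the DRL idea above can be illustrated with tabular Q-learning on a toy channel-allocation problem. Everything here is a hypothetical stand-in (the state, dynamics, and reward are invented for illustration); real DRL would replace the Q-table with a deep network and the toy environment with measured network state.

```python
import random

random.seed(1)
n_channels = 2
Q = [[0.0] * n_channels for _ in range(n_channels)]  # Q[state][action]
alpha, gamma, eps = 0.5, 0.9, 0.1

state = 0  # toy state: index of the currently busy (interfered) channel
for step in range(500):
    if random.random() < eps:
        action = random.randrange(n_channels)                     # explore
    else:
        action = max(range(n_channels), key=lambda a: Q[state][a])  # exploit
    reward = 1.0 if action != state else -1.0   # penalize picking the busy channel
    next_state = random.randrange(n_channels)   # busy channel moves randomly
    # Standard Q-learning update toward the bootstrapped target.
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state

# The learned greedy policy avoids the busy channel in each state.
policy = [max(range(n_channels), key=lambda a: Q[s][a]) for s in range(n_channels)]
print(policy)
```

The reward function encodes the design goal (avoid interference), and the learned policy is read off the Q-table, which is the mechanism that scales up to resource allocation and routing once function approximation replaces the table.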

AI enabling more powerful and intelligent nanophotonics

Nanophotonic components have recently revolutionized the field of optics via metamaterials/metasurfaces by enabling the arbitrary manipulation of light-matter interactions with subwavelength meta-atoms or meta-molecules. 20 , 21 , 22 The conventional design of such components generally involves forward modeling, i.e., solving Maxwell's equations based on empirical and intuitive nanostructures to find the corresponding optical properties, as well as the inverse design of nanophotonic devices given an on-demand optical response. The trans-dimensional feature of macro-optical components consisting of complex nano-antennas makes the design process very time consuming, computationally expensive, and even numerically prohibitive as device size and complexity increase. DL, as an efficient and automatic platform, enables novel approaches to designing nanophotonic devices with high performance and versatile functions. Here, we briefly present the recent progress of DL-based nanophotonics and its wide-ranging applications. DL was first exploited for forward modeling using a DNN. 23 The transmission or reflection coefficients can be well predicted after training on huge datasets. To improve the prediction accuracy of DNNs on small datasets, transfer learning was introduced to migrate knowledge between different physical scenarios, which greatly reduced the relative error. Furthermore, a CNN and an RNN were developed to predict optical properties from images of arbitrary structures. 24 The CNN-RNN combination successfully predicted absorption spectra from the given input structural images. For the inverse design of nanophotonic devices, there are three different paradigms of DL methods, i.e., supervised, unsupervised, and RL. 25 Supervised learning has been utilized to design structural parameters for pre-defined geometries, such as tandem DNNs and bidirectional DNNs.
Unsupervised learning methods learn by themselves without a specific target, and are thus better suited than supervised learning to discovering new and arbitrary patterns 26 in completely new data. A generative adversarial network (GAN)-based approach, combining conditional GANs and Wasserstein GANs, was proposed to design freeform all-dielectric multifunctional metasurfaces. RL, especially double-deep Q-learning, has powered the inverse design of high-performance nanophotonic devices. 27 DL has endowed nanophotonic devices with better performance and more emerging applications. 28 , 29 For instance, an intelligent microwave cloak driven by DL exhibits a millisecond, self-adaptive response to an ever-changing incident wave and background. 28 Another example is a DL-augmented infrared nanoplasmonic metasurface developed for monitoring the dynamics between four major classes of bio-molecules, which could impact the fields of biology, bioanalytics, and pharmacology, from fundamental research to disease diagnostics to drug development. 29 The potential of DL in the wide arena of nanophotonics is still unfolding. Even end-users without an optics and photonics background could exploit DL as a black-box toolkit to design powerful optical devices. Nevertheless, how to interpret and mediate the intermediate DL process, and how to determine the most dominant factors in the search for optimal solutions, deserve in-depth investigation. We optimistically envisage that advances in DL algorithms and computation/optimization infrastructures will enable more efficient and reliable training approaches, more complex nanostructures with unprecedented shapes and sizes, and more intelligent and reconfigurable optic/optoelectronic systems.
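
The forward-modeling idea above (learning a fast surrogate that maps structure parameters to an optical response) can be sketched with a tiny random-feature model. The data here are synthetic stand-ins: the "response" is an invented smooth function, not a Maxwell solution, and the model is a minimal surrogate, not the DNNs cited in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: 200 structures, each described by 3 geometric
# parameters; the "true" response stands in for a simulated transmission
# coefficient that a real pipeline would get from solving Maxwell's equations.
X = rng.uniform(-1, 1, size=(200, 3))
y = np.sin(2 * X[:, 0]) + 0.5 * X[:, 1] * X[:, 2]

# Surrogate: one fixed random nonlinear layer plus a least-squares readout,
# i.e., the cheapest possible "DNN" for forward modeling.
W = rng.normal(size=(3, 100))
b = rng.uniform(0, 2 * np.pi, size=100)
H = np.tanh(X @ W + b)
coef, *_ = np.linalg.lstsq(H, y, rcond=None)

pred = H @ coef
mse = float(np.mean((pred - y) ** 2))
print(mse)  # small training error: the surrogate reproduces the response
```

Once such a surrogate is accurate, inverse design can search over structure parameters against the cheap surrogate instead of the expensive solver, which is the efficiency gain the section describes.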

AI in other fields of information science

We believe that AI has great potential in the following directions:

  • AI-based risk control and management in utilities can prevent costly or hazardous equipment failures by using sensors that detect and send information regarding the machine's health to the manufacturer, predicting possible issues so as to ensure timely maintenance or automated shutdown.
  • AI could be used to produce simulations of real-world objects, called digital twins. When applied to the field of engineering, digital twins allow engineers and technicians to analyze the performance of equipment virtually, thus avoiding the safety and budget issues associated with traditional testing methods.
  • Combined with AI, intelligent robots are playing an important role in industry and human life. Different from traditional robots working according to procedures specified by humans, intelligent robots have the ability of perception, recognition, and even automatic planning and decision-making, based on changes in environmental conditions.
  • AI of things (AIoT), or AI-empowered IoT applications, 30 have become a promising development trend. AI can empower the connected IoT devices, embedded in various physical infrastructures, to perceive, recognize, learn, and act. For instance, smart cities constantly collect data regarding quality-of-life factors, such as the status of power supply, public transportation, air pollution, and water use, to manage and optimize systems in cities. Because these data, especially personal data, are collected from informed or uninformed participants, data security and privacy 31 require protection.

AI in mathematics

Mathematics always plays a crucial and indispensable role in AI. Decades ago, quite a few classical AI-related approaches, such as k-nearest neighbor, 32 support vector machine, 33 and AdaBoost, 34 were proposed and developed after their rigorous mathematical formulations had been established. In recent years, with the rapid development of DL, 35 AI has been gaining more and more attention in the mathematical community. Equipped with the Markov process, minimax optimization, and Bayesian statistics, RL, 36 GANs, 37 and Bayesian learning 38 became the most favorable tools in many AI applications. Nevertheless, there still exist plenty of open problems in mathematics for ML, including the interpretability of neural networks, the optimization problems of parameter estimation, and the generalization ability of learning models. In the rest of this section, we discuss these three questions in turn.

The interpretability of neural networks

From a mathematical perspective, ML usually constructs nonlinear models, with neural networks as a typical case, to approximate certain functions. The well-known Universal Approximation Theorem states that, under very mild conditions, any continuous function can be uniformly approximated on compact domains by neural networks, 39 which plays a vital role in the interpretability of neural networks. However, in real applications, ML models seem to admit accurate approximations of many extremely complicated functions, sometimes even black boxes, which are far beyond the scope of continuous functions. To understand the effectiveness of ML models, many researchers have investigated the function spaces that can be well approximated by them, along with the corresponding quantitative measures. This issue is closely related to classical approximation theory, but the approximation scheme is distinct. For example, Bach 40 finds that the random feature model is naturally associated with the corresponding reproducing kernel Hilbert space. Similarly, the Barron space is identified as the natural function space associated with two-layer neural networks, and the approximation error is measured using the Barron norm. 41 The corresponding quantities for residual networks (ResNets) are defined for the flow-induced spaces. For multi-layer networks, the natural function spaces for the purposes of approximation theory are the tree-like function spaces introduced in Wojtowytsch. 42 Several works reveal the relationship between neural networks and numerical algorithms for solving partial differential equations. For example, He and Xu 43 discovered that CNNs for image classification have a strong connection with multi-grid (MG) methods: the pooling operation and feature extraction in CNNs correspond directly to the restriction operation and iterative smoothers in MG, respectively. Hence, the various convolution and pooling operations used in CNNs can be better understood.
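
The approximation viewpoint above can be checked numerically: a two-layer network with a fixed random ReLU hidden layer (in the spirit of the random feature model) approximates a continuous function on a compact interval, with the outer weights obtained by least squares. This is a minimal numerical sketch, not a proof of the Universal Approximation Theorem.

```python
import numpy as np

rng = np.random.default_rng(42)

x = np.linspace(-np.pi, np.pi, 400)[:, None]
target = np.sin(x).ravel()            # a continuous function on a compact set

def relu_features(x, w, b):
    # Hidden layer of a two-layer ReLU network with fixed random weights.
    return np.maximum(x * w + b, 0.0)

w = rng.normal(size=50)
b = rng.uniform(-np.pi, np.pi, size=50)
Phi = relu_features(x, w, b)                      # shape (400, 50)
a, *_ = np.linalg.lstsq(Phi, target, rcond=None)  # outer-layer weights

err = float(np.max(np.abs(Phi @ a - target)))     # uniform (sup-norm) error
print(err)  # shrinks as the number of hidden units grows
```

With only 50 random units the piecewise-linear fit already tracks the sine; increasing the hidden width drives the uniform error down, consistent with the approximation-rate results the paragraph cites.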

The optimization problems of parameter estimation

In general, the optimization problem of estimating the parameters of certain DNNs is highly nonconvex and often nonsmooth in practice. Can global minimizers be expected? What is the landscape of local minimizers? How does one handle the nonsmoothness? All these questions are nontrivial from an optimization perspective. Indeed, numerous works and experiments demonstrate that the optimization of parameter estimation in DL is itself a much nicer problem than once thought; see, e.g., Goodfellow et al. 44 As a consequence, the study of the solution landscape ( Figure 3 ), also known as the loss surface of neural networks, is no longer considered inaccessible and can even, in turn, provide guidance for global optimization. Interested readers can refer to the survey paper by Sun et al. 45 for recent progress in this area.
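
A standard way to probe such a solution landscape, popularized by Goodfellow et al., is to evaluate the loss along the straight line between two parameter vectors (e.g., an initialization and a trained minimizer). The sketch below applies this probe to a tiny least-squares model so the loss is cheap and the landscape is convex; for a real DNN one would substitute the network's loss for `loss`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny least-squares "model": convex, so the 1D profile is easy to interpret.
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5)

def loss(theta):
    return float(np.mean((X @ theta - y) ** 2))

theta_init = rng.normal(size=5)                      # a random starting point
theta_star, *_ = np.linalg.lstsq(X, y, rcond=None)   # a global minimizer here

# Loss along the segment theta(alpha) = (1 - alpha) * init + alpha * star.
alphas = np.linspace(0.0, 1.0, 11)
profile = [loss((1 - a) * theta_init + a * theta_star) for a in alphas]
print(profile[0], profile[-1])  # decreases monotonically in this convex case
```

For nonconvex DNN losses the same 1D profile can reveal barriers, plateaus, or the surprisingly smooth paths reported in the literature, which is what makes it a useful landscape diagnostic.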

[Figure 3]

Recent studies indicate that nonsmooth activation functions, e.g., rectified linear units, are better than smooth ones at finding sparse solutions. However, the chain rule does not apply when the activation functions are nonsmooth, which makes the widely used stochastic gradient (SG)-based approaches infeasible in theory. Taking approximate gradients at nonsmooth iterates as a remedy, SG-type methods remain in extensive use, but numerical evidence has also exposed their limitations. The penalty-based approaches proposed by Cui et al. 46 and Liu et al. 47 provide a new direction for solving nonsmooth optimization problems efficiently.
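
The practical remedy described above, taking an element of the subdifferential where the derivative does not exist, can be shown on a one-parameter toy problem. This is an illustrative sketch: the model, data, and step size are invented, and the convention of using 0 as the "derivative" of ReLU at the origin mirrors common framework practice.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def relu_subgrad(x):
    # Any value in [0, 1] is a valid subgradient at x == 0;
    # we take 0, the conventional choice in DL frameworks.
    return (x > 0).astype(float)

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = relu(1.5 * x)                 # data generated with a "true" weight of 1.5

w, lr = 0.5, 0.1
for _ in range(200):
    pred = relu(w * x)
    # Chain rule with the subgradient surrogate in place of a derivative:
    g = np.mean(2 * (pred - y) * relu_subgrad(w * x) * x)
    w -= lr * g                    # stochastic-subgradient-style step

print(w)  # approaches the true weight 1.5
```

The iteration behaves like ordinary SGD almost everywhere because the nonsmooth points form a measure-zero set, which is precisely why SG-type methods remain usable despite lacking a clean theoretical justification at the kinks.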

The generalization ability of learning models

A small training error does not always lead to a small test error. This gap is attributed to the generalization ability of learning models. A key finding in statistical learning theory states that the generalization error is bounded by a quantity that grows with the model capacity but shrinks as the number of training examples increases. 48 A common conjecture relating generalization to the solution landscape is that flat and wide minima generalize better than sharp ones. Thus, regularization techniques, including the dropout approach, 49 have emerged to force the algorithms to bypass sharp minima. However, the mechanism behind this has not been fully explored. Recently, some researchers have focused on ResNet-type architectures, with dropout inserted after the last convolutional layer of each building block. They thus managed to explain the stochastic dropout training process and the ensuing dropout regularization effect from the perspective of optimal control. 50
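
The dropout mechanism mentioned above is simple to state in code. The sketch below implements the common "inverted dropout" variant: units are zeroed at random during training and the survivors are rescaled so the expected activation matches the test-time forward pass (the drop probability and layer shape are arbitrary choices for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p_drop, train=True):
    if not train:
        return h                           # test time: identity, no rescaling
    mask = rng.random(h.shape) >= p_drop   # keep each unit with prob 1 - p_drop
    return h * mask / (1.0 - p_drop)       # rescale to preserve the expectation

h = np.ones((10000, 1))
out = dropout(h, p_drop=0.5)
print(out.mean())  # close to 1.0: expectation matches the undropped activations
```

The random masking injects noise into training, which is the mechanism conjectured to steer optimization away from sharp minima; the optimal-control analysis cited above makes this regularization effect precise for ResNet-type architectures.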

AI in medical science

AI technology is playing an increasingly significant role in daily operations, including in medical fields. With the growing healthcare needs of patients, hospitals are evolving from informatization and networking to the Internet Hospital and eventually to the Smart Hospital. At the same time, AI tools and hardware performance are improving rapidly with each passing day. Eventually, common AI techniques, such as CV, NLP, and data mining, will be embedded in the medical equipment market ( Figure 4 ).

[Figure 4]

AI doctor based on electronic medical records

For medical history data, notable examples are Doctor Watson, developed on IBM's Watson platform, and Modernizing Medicine, which targets oncology and has been adopted by CVS & Walgreens in the US as well as by various medical organizations in China. Doctor Watson takes advantage of the NLP capability of the IBM Watson platform, which has already collected vast amounts of medical history data as well as prior knowledge from the literature for reference. After a patient's case is input, Doctor Watson searches the medical history reserve and forms an elementary treatment proposal, which is then further ranked against the prior knowledge reserves. With the multiple models stored, Doctor Watson gives a final proposal together with its confidence. However, such AI doctors still face problems: 51 because they rely on prior experience from US hospitals, their proposals may not suit other regions with different medical insurance policies. Besides, knowledge updating on the Watson platform relies heavily on updates to its knowledge reserve, which still requires manual work.

AI for public health: Outbreak detection and health QR code for COVID-19

AI can be used for public health purposes in many ways. One classical usage is detecting disease outbreaks from search engine query data or social media data, as Google did to predict influenza epidemics 52 and as the Chinese Academy of Sciences did to model the COVID-19 outbreak through multi-source information fusion. 53 After the COVID-19 outbreak, a digital health Quick Response (QR) code system was developed in China, first to detect potential contacts of confirmed COVID-19 cases and, second, to indicate a person's health status using mobile big data. 54 Different colors indicate different health statuses: green means healthy and cleared for daily life, orange means at risk and requiring quarantine, and red indicates a confirmed COVID-19 case. The system is easy for the general public to use and has been adopted by many other countries. The health QR code has made great contributions to the worldwide prevention and control of the COVID-19 pandemic.
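
The color-assignment logic described above reduces to a simple decision rule. The sketch below is purely illustrative: the function name, inputs, and their granularity are hypothetical, since the real system derives these flags from mobile big data rather than two booleans.

```python
def health_qr_color(confirmed_case: bool, at_risk: bool) -> str:
    """Map a (hypothetical) health status to the QR code color scheme."""
    if confirmed_case:
        return "red"      # confirmed COVID-19 case
    if at_risk:
        return "orange"   # at risk: quarantine required
    return "green"        # healthy: cleared for daily life

print(health_qr_color(confirmed_case=False, at_risk=False))  # green
```

The hard part of the real system is, of course, the upstream inference (contact detection from mobility data), not this final mapping.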

Biomarker discovery with AI

High-dimensional data, including multi-omics data, patient characteristics, medical laboratory test data, etc., are often used to generate predictive or prognostic models through DL or statistical modeling methods. For instance, a COVID-19 severity evaluation model was built through ML using proteomic and metabolomic profiling data of sera 55 ; using integrated genetic, clinical, and demographic data, Taliaz et al. built an ML model to predict patient response to antidepressant medications 56 ; and prognostic models for multiple cancer types (such as liver cancer, lung cancer, breast cancer, gastric cancer, colorectal cancer, pancreatic cancer, prostate cancer, ovarian cancer, lymphoma, leukemia, sarcoma, melanoma, bladder cancer, renal cancer, thyroid cancer, and head and neck cancer) were constructed through DL or statistical methods, such as the least absolute shrinkage and selection operator (LASSO), combined with the Cox proportional hazards regression model using genomic data. 57
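
The LASSO step mentioned above can be sketched on synthetic "omics-like" data: many candidate features, only a few truly informative. The solver below is plain ISTA (iterative soft-thresholding) in NumPy; the data are invented stand-ins, and a real prognostic study would feed the selected features into a Cox proportional hazards model rather than stop here.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 200, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]     # only 3 of 50 features carry signal
y = X @ beta_true + 0.1 * rng.normal(size=n)

# ISTA for min_beta (1/2n)||X beta - y||^2 + lam * ||beta||_1
lam = 0.3
L = np.linalg.norm(X, 2) ** 2 / n    # Lipschitz constant of the smooth gradient
beta = np.zeros(p)
for _ in range(500):
    grad = X.T @ (X @ beta - y) / n
    z = beta - grad / L
    beta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold

selected = np.flatnonzero(np.abs(beta) > 1e-3)
print(selected)  # concentrates on the informative features
```

The L1 penalty drives the coefficients of uninformative features exactly to zero, which is why LASSO doubles as a biomarker-selection step before downstream survival modeling.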

Image-based medical AI

Medical imaging is one of the most mature areas of medical AI, as numerous models exist for classification, detection, and segmentation tasks in CV. In the clinical area, CV algorithms can be used for computer-aided diagnosis and treatment with ECG, CT, eye fundus imaging, etc. Whereas human doctors may tire and become prone to mistakes after viewing hundreds of images for diagnosis, AI systems can outperform a human medical image viewer because they excel at repetitive work without fatigue. The first medical AI product approved by the FDA was IDx-DR, which uses an AI model to detect diabetic retinopathy. The smartphone app SkinVision can accurately detect melanomas. 58 It uses "fractal analysis" to identify moles and their surrounding skin, based on size, diameter, and many other parameters, and to detect abnormal growth trends. AI-ECG from LEPU Medical can automatically detect heart disease from ECG images. Lianying Medical leverages its hardware equipment to provide real-time, high-definition, image-guided, all-round radiotherapy technology, which achieves precise treatment.

Wearable devices for surveillance and early warning

For wearable devices, AliveCor has developed an algorithm to automatically predict the presence of atrial fibrillation, an early warning sign of stroke and heart failure. The 23andMe company can also test saliva samples at a small cost, providing customers with information based on their genes, including who their ancestors were or potential diseases they may be prone to later in life. It provides accurate health management solutions based on individual and family genetic data. In the next 20–30 years, we believe there are several directions for further research: (1) causal inference for real-time in-hospital risk prediction. Clinical doctors usually require reasonable explanations for certain medical decisions, but current AI models are usually black boxes. Causal inference will help doctors explain certain AI decisions and even discover novel ground truths. (2) Devices, including wearable instruments, for multi-dimensional health monitoring. Multi-modality models are now a trend in AI research. With various devices collecting multi-modality data and a central processor fusing all these data, a model can monitor the user's overall real-time health condition and give precautions more precisely. (3) Automatic discovery of clinical markers for diseases that are difficult to diagnose. Diseases such as ALS are still difficult for clinical doctors to diagnose because they lack an effective general marker. AI may be able to discover phenomena common to these patients and find an effective marker for early diagnosis.

AI-aided drug discovery

Today we have entered the precision medicine era, and new targeted drugs are the cornerstones of precision therapy. However, over the past decades, it has taken an average of over one billion dollars and 10 years to bring a new drug to market. How to accelerate the drug discovery process and avoid late-stage failure are key concerns for all the big and fiercely competitive pharmaceutical companies. The emerging role of AI, including ML, DL, expert systems, and artificial neural networks (ANNs), has brought new insights and high efficiency into new drug discovery processes. AI has been adopted in many aspects of drug discovery, including de novo molecule design, structure-based modeling of proteins and ligands, quantitative structure-activity relationship research, and druggability judgments. DL-based AI approaches demonstrate superior merits in addressing some challenging problems in drug discovery. Prediction of chemical synthesis routes and chemical process optimization are also valuable in accelerating new drug discovery, as well as in lowering production costs.

There has been notable progress in AI-aided new drug discovery in recent years, in both new chemical entity discovery and the related business area. Based on DNNs, DeepMind built the AlphaFold platform to predict 3D protein structures, outperforming other algorithms. As an illustration of this achievement, AlphaFold successfully and accurately predicted 25 protein structures from scratch out of a 43-protein panel without using previously built protein models, and accordingly won the CASP13 protein-folding competition in December 2018. 59 Based on GANs and other ML methods, Insilico constructed a modular drug design platform, the GENTRL system. In September 2019, they reported the discovery of the first de novo active DDR1 kinase inhibitor developed by the GENTRL system; it took the team only 46 days from target selection to obtaining an active drug candidate with in vivo data. 60 Exscientia and Sumitomo Dainippon Pharma developed a new drug candidate, DSP-1181, for the treatment of obsessive-compulsive disorder on the Centaur Chemist AI platform. In January 2020, DSP-1181 started its phase I clinical trials, meaning that the comprehensive exploration from program initiation to phase I study took less than 12 months. In contrast, comparable drug discovery usually needs 4–5 years with traditional methods.

How AI transforms medical practice: A case study of cervical cancer

As one of the most common malignant tumors in women, cervical cancer is a disease that has a clear cause and can be prevented, and even treated, if detected early. Conventionally, the screening strategy for cervical cancer mainly adopts the "three-step" model of "cervical cytology-colposcopy-histopathology." 61 However, limited by the testing methods available, the efficiency of cervical cancer screening is not high. In addition, owing to a lack of knowledge among doctors in some primary hospitals, patients cannot always be provided with the best diagnosis and treatment decisions. In recent years, with the advent of the era of computer science and big data, AI has gradually begun to extend and blend into various fields. In particular, AI has been widely used in a variety of cancers as a new tool for data mining. For cervical cancer, a clinical database with millions of medical records and pathological data has been built, and an AI medical tool set has been developed. 62 This AI analysis algorithm gives doctors the ability to rapidly and iteratively train AI models. In addition, a prognostic prediction model established by ML and a web-based prognostic result calculator have been developed, which can accurately predict the risk of postoperative recurrence and death in cervical cancer patients, and thereby better guide decision-making in postoperative adjuvant treatment. 63

AI in materials science

As the cornerstone of modern industry, materials have played a crucial role in the design of revolutionary forms of matter, with targeted properties for broad applications in energy, information, biomedicine, construction, transportation, national security, spaceflight, and so forth. Traditional strategies rely on empirical trial-and-error experimental approaches as well as theoretical simulation methods, e.g., density functional theory, thermodynamics, or molecular dynamics, to discover novel materials. 64 These methods often face the challenges of long research cycles, high costs, and low success rates, and thus cannot meet the ever-growing demands of current materials science. Accelerating the discovery and deployment of advanced materials will therefore be essential in the coming era.

With the rapid development of data processing and powerful algorithms, AI-based methods, such as ML and DL, are emerging with great potential in the search for and design of new materials prior to actually manufacturing them. 65 , 66 By integrating material property data, such as the constituent elements, lattice symmetry, atomic radius, valence, binding energy, electronegativity, magnetism, polarization, energy band, structure-property relations, and functionalities, a machine can be trained to "think" about how to improve material design and even predict the properties of new materials in a cost-effective manner ( Figure 5 ).

[Figure 5]

AI is expected to power the development of materials science

AI in discovery and design of new materials

Recently, AI techniques have made significant advances in the rational design and accelerated discovery of various materials, such as piezoelectric materials with large electrostrains, 67 organic-inorganic perovskites for photovoltaics, 68 molecular emitters for efficient light-emitting diodes, 69 inorganic solid materials for thermoelectrics, 70 and organic electronic materials for renewable-energy applications. 66 , 71 The power of data-driven computing and algorithmic optimization can promote comprehensive applications of simulation and ML (e.g., high-throughput virtual screening, inverse molecular design, Bayesian optimization, and supervised learning) in material discovery and property prediction in various fields. 72 For instance, using a DL Bayesian framework, attribute-driven inverse materials design has been demonstrated for efficient and accurate prediction of functional molecular materials with desired semiconducting properties or redox stability for applications in organic thin-film transistors, organic solar cells, or lithium-ion batteries. 73 It is valuable to adopt automation tools for quick experimental testing of potential materials and to utilize high-performance computing to calculate their bulk, interface, and defect-related properties. 74 The effective convergence of automation, computing, and ML can greatly speed up the discovery of materials. In the future, with the aid of AI techniques, it may become possible to design superconductors, metallic glasses, solder alloys, high-entropy alloys, high-temperature superalloys, thermoelectric materials, two-dimensional materials, magnetocaloric materials, polymeric bio-inspired materials, sensitive composite materials, topological (electronic and phonon) materials, and so on.
In the past decade, topological materials have ignited the research enthusiasm of condensed matter physicists, materials scientists, and chemists, as they exhibit exotic physical properties with potential applications in electronics, thermoelectrics, optics, catalysis, and energy-related fields. According to the most recent predictions, more than a quarter of all inorganic materials in nature are topologically nontrivial. The establishment of topological electronic materials databases 75 , 76 , 77 and topological phononic materials databases 78 using high-throughput methods will help to accelerate the screening and experimental discovery of new topological materials for functional applications. It is recognized that large-scale, high-quality datasets are required to train AI models, and great efforts have also been expended in building high-quality materials science databases. As one of the top-ranking databases of its kind, the “atomly.net” materials data infrastructure 79 has calculated the properties of more than 180,000 inorganic compounds, including their equilibrium structures, electron energy bands, dielectric properties, simulated diffraction patterns, elasticity tensors, etc. As such, the atomly.net database has set a solid foundation for extending AI into the area of materials science research. The X-ray diffraction (XRD)-matcher model of atomly.net uses ML to match and classify experimental XRD patterns against simulated ones. Very recently, using the atomly.net dataset, an accurate AI model was built to rapidly predict the formation energy of almost any given compound with fairly good predictive ability. 80
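To make the idea of data-driven property prediction concrete, here is a minimal sketch of predicting formation energy from simple composition descriptors with a k-nearest-neighbor regressor. The two descriptors, all numeric values, and the query compound are invented for illustration; a real model would be trained on a large database such as atomly.net and use far richer features.

```python
# Toy sketch: predict formation energy from composition descriptors with
# a k-nearest-neighbor regressor. All values below are made up.

def knn_predict(X_train, y_train, x, k=2):
    """Average the targets of the k nearest training points (Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(X_train, y_train)
    )
    nearest = [y for _, y in dists[:k]]
    return sum(nearest) / len(nearest)

# Hypothetical descriptors: (mean electronegativity, mean atomic radius in Å)
X_train = [(3.4, 0.7), (1.9, 1.4), (2.6, 1.0), (1.6, 1.6)]
y_train = [-3.2, -0.8, -2.1, -0.4]   # toy formation energies, eV/atom

pred = knn_predict(X_train, y_train, (2.5, 1.1), k=2)
print(round(pred, 2))   # -1.45
```

A production model would replace the k-nearest-neighbor rule with a trained neural network or gradient-boosted trees, but the interface (descriptors in, energy out) is the same.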

AI-powered Materials Genome Initiative

The Materials Genome Initiative (MGI) is a large-scale plan for the rational realization of new materials and related functions; it aims to discover, manufacture, and deploy advanced materials efficiently, cost-effectively, and intelligently. The initiative creates policy, resources, and infrastructure for accelerating materials development at a high level. This is a new paradigm for the discovery and design of next-generation materials: it proceeds from fundamental building blocks toward general materials development, and accelerates progress through tightly integrated, high-throughput efforts in theory, computation, and experiment. The MGI sets an ambitious long-term goal for materials development and materials science. Its spirit is to design novel materials by using data pools and powerful computation once the requirements or aspirations for functional usage appear. Theory, computation, and algorithms are the primary and substantial factors in the establishment and implementation of the MGI. Advances in theory, computation, and experiment in materials science and engineering provide the foundation not only to accelerate the speed at which new materials are realized but also to shorten the time needed to push new products into the market. AI techniques hold great promise for the developing MGI. The application of new technologies, such as ML and DL, directly accelerates materials research and the establishment of the MGI. Model construction and application to science and engineering, as well as the data infrastructure, are of central importance. When AI-powered MGI approaches are coupled with increasingly autonomous manufacturing methods, the potential impact on society and the economy is profound.
We are now beginning to see that the AI-aided MGI, among other things, integrates experiments, computation, and theory; facilitates access to materials data; equips the next generation of the materials workforce; and enables a paradigm shift in materials development. Furthermore, the AI-powered MGI could also design operational procedures and control equipment to execute experiments, realizing autonomous experimentation in future materials research.

Advanced functional materials for generation upgrade of AI

The realization and application of AI techniques depend on computational capability and computer hardware, which in turn rest on the physical performance of computers or supercomputers. With current technology, the electric currents or carriers that drive chips and devices consist of electrons with ordinary characteristics, such as heavy mass and low mobility. As a result, chips and devices emit considerable heat, consuming excessive energy and lowering the efficiency of information transmission. Benefiting from the rapid development of modern physics, a series of advanced materials with exotic functional effects have been discovered or designed, including superconductors, quantum anomalous Hall insulators, and topological fermions. In particular, superconducting states or topologically nontrivial electrons will promote next-generation AI techniques once the (near) room temperature applications of these states are realized and implanted in integrated circuits. 81 In that case, central processing units, signal circuits, and power channels will be driven by electronic carriers with massless, energy-diffusionless, ultra-high-mobility, or chirality-protected characteristics. Ordinary electrons will be removed from the physical circuits of future-generation chips and devices, leaving superconducting and topological chiral electrons running in future AI chips and supercomputers. The efficiency of information transmission and logic computing will be improved on a vast scale and at very low cost.

AI for materials and materials for AI

The coming decade will continue to witness the development of advanced ML algorithms, newly emerging data-driven AI methodologies, and integrated technologies for facilitating structure design and property prediction, as well as for accelerating the discovery, design, development, and deployment of advanced materials into existing and emerging industrial sectors. At this moment, we face challenges in achieving accelerated materials research through the integration of experiment, computation, and theory. The MGI, proposed for high-level materials research, helps to promote this process, especially when it is assisted by AI techniques. Still, there is a long way to go before these advanced functional materials are used in future generations of chips and devices. More materials and functional effects need to be discovered or improved by developing AI techniques. Meanwhile, it is worth noting that materials are the core components of the devices and chips used to construct computers and machines for advanced AI systems. The rapid development of new materials, especially the emergence of flexible, sensitive, and smart materials, is of great importance for a broad range of attractive technologies, such as flexible circuits, stretchable tactile sensors, multifunctional actuators, transistor-based artificial synapses, integrated networks of semiconductor/quantum devices, intelligent robotics, human-machine interactions, simulated muscles, biomimetic prostheses, etc. These promising materials, devices, and integrated technologies will greatly promote the advancement of AI systems toward wide application in human life. Once physical circuits are upgraded with advanced functional or smart materials, AI techniques will strongly promote developments and applications across all disciplines.

AI in geoscience

AI technologies involved in a large range of geoscience fields

Momentous challenges threatening current society require solutions to problems that belong to geoscience, such as evaluating the effects of climate change, assessing air quality, forecasting the effects of disasters on infrastructure, calculating the future consumption and availability of food, water, and soil resources, and identifying factors that indicate potential volcanic eruptions, tsunamis, floods, and earthquakes. 82 , 83 This has become possible with the emergence of advanced technology products (e.g., deep-sea drilling vessels and remote sensing satellites), enhancements in computational infrastructure that allow for processing large-scale, wide-ranging simulations of multiple geoscience models, and internet-based data analysis that facilitates the collection, processing, and storage of data in distributed and crowd-sourced environments. 84 The growing availability of massive geoscience data provides unlimited possibilities for AI, which has permeated all aspects of our daily life (e.g., entertainment, transportation, and commerce), to significantly contribute to geoscience problems of great societal relevance. As geoscience enters the era of massive data, AI, which has been extensively successful in different fields, offers immense opportunities for settling a series of problems in Earth systems. 85 , 86 Accompanied by diversified data, AI-enabled technologies, such as smart sensors, image visualization, and intelligent inversion, are being actively examined across a wide range of geoscience fields, such as marine geoscience, rock physics, geology, ecology, seismology, environmental science, hydrology, remote sensing, ArcGIS, and planetary science. 87

Multiple challenges in the development of geoscience

Some traits of geoscience restrict the applicability of fundamental algorithms for knowledge discovery: (1) the inherent challenges of geoscience processes, (2) the limitations of geoscience data collection, and (3) uncertainty in samples and ground truth. 88 , 89 , 90 Amorphous boundaries generally exist in geoscience objects in space and time, which are not as well defined as objects in other fields. Geoscience phenomena are also significantly multivariate, obey nonlinear relationships, and exhibit spatiotemporal structure and non-stationary characteristics. Beyond the inherent challenges of geoscience observations, massive data at multiple dimensions of time and space, with different levels of incompleteness, noise, and uncertainty, disturb processes in geoscience. For supervised learning approaches, there are further difficulties owing to the lack of gold-standard ground truth and the “small size” of samples (e.g., a small amount of historical data with sufficient observations) in geoscience applications.

Usage of AI technologies as efficient approaches to promote the geoscience processes

Geoscientists continually strive to develop better techniques for simulating the present status of the Earth system (e.g., how much greenhouse gas is released into the atmosphere) and the connections between and within its subsystems (e.g., how elevated temperatures influence the ocean ecosystem). Viewed from the perspective of geoscience, newly emerging AI-aided approaches are well matched to the following classes of geoscience problems: (1) characterizing objects and events 91 ; (2) estimating geoscience variables from observations 92 ; (3) forecasting geoscience variables according to long-term observations 85 ; (4) exploring geoscience data relationships 93 ; and (5) causal discovery and causal attribution. 94 While traditional methods for characterizing geoscience objects and events are primarily rooted in hand-coded features, ML algorithms can detect patterns in the data automatically, improving performance with pattern-mining techniques. However, because spatiotemporal targets have vague boundaries and related uncertainties, it may be necessary to advance pattern-mining methods that can explain the temporal and spatial characteristics of geoscience data when characterizing different events and objects. To address the non-stationarity of geoscience data, AI-aided algorithms have been expanded to integrate the holistic results of professional predictors and engender robust estimates of climate variables (e.g., humidity and temperature). Furthermore, forecasting long-term trends of the current situation in the Earth system using AI-enabled technologies can simulate future scenarios and inform early resource planning and adaptation policies. Mining geoscience data relationships can help us seize vital signs of the Earth system and advance our understanding of geoscience developments.
Of great interest is the advancement of AI decision-making methodologies that handle uncertain prediction probabilities, including the poorly resolved tails of model ensembles that signify the most extreme, transient, and rare events; such methods would improve accuracy and effectiveness in a variety of cases.

AI technologies for optimizing the resource management in geoscience

Currently, AI can perform better than humans in some well-defined tasks. For example, AI techniques have been used in urban water resource planning, mainly due to their remarkable capacity for modeling, flexibility, reasoning, and forecasting the water demand and capacity. The design and application of an Adaptive Intelligent Dynamic Water Resource Planning system, an AI application for sustainable water resource management in urban regions, has greatly promoted the optimization of water resource allocation and will ultimately minimize operating costs and improve the sustainability of environmental management 95 ( Figure 6 ). Also, meteorology requires collecting tremendous amounts of data on many different variables, such as humidity, altitude, and temperature; however, dealing with such a huge dataset is a big challenge. 96 AI-based techniques are being utilized to analyze shallow-water reef images and recognize coral color, to track the effects of climate change, and to collect humidity, temperature, and CO 2 data, to grasp the health of our ecological environment. 97 Beyond meteorology, AI can also play a critical role in decreasing greenhouse gas emissions originating from the electric-power sector. Comprising the production, transportation, allocation, and consumption of electricity, the electric-power sector offers many opportunities for AI applications, including speeding up the development of new clean energy, enhancing system optimization and management, improving electricity-demand forecasts and distribution, and advancing system monitoring. 98 New materials may even be found, with the aid of AI, for batteries to store energy or for materials that absorb CO 2 from the atmosphere. 99 Although traditional fossil fuel operations have been widely used for thousands of years, AI techniques are being used to help explore the development of more sustainable potential energy sources (e.g., fusion technology). 100
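As a toy illustration of the demand-driven planning such a system automates, the fragment below forecasts next-day urban water demand with simple exponential smoothing and adds a safety margin to the pumping plan; all figures, the smoothing constant, and the margin are invented, and a real planning system would use far richer models.

```python
# Illustrative fragment: forecast next-day water demand by exponential
# smoothing, then size the pumping plan with a safety margin.

def exp_smooth_forecast(history, alpha=0.5):
    """One-step-ahead forecast: blend each observation into a running level."""
    level = history[0]
    for d in history[1:]:
        level = alpha * d + (1 - alpha) * level
    return level

demand = [100, 104, 103, 108, 110]       # megalitres/day (toy values)
forecast = exp_smooth_forecast(demand)
pumping_plan = forecast * 1.1            # hypothetical 10% safety margin
print(round(forecast, 1))   # 107.6
```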

Figure 6. Applications of AI in hydraulic resource management

In addition to the adjustment of energy structures due to climate change (a core part of geoscience systems), a second, less obvious step could also be taken to reduce greenhouse gas emissions: using AI to target inefficiencies. A statistical report by the Lawrence Livermore National Laboratory pointed out that around 68% of the energy produced in the US is wasted rather than used for purposeful activities, such as electricity generation or transportation, and instead contributes to environmental burdens. 101 AI is primed to reduce these inefficiencies in current nuclear power plants and fossil fuel operations, as well as to improve the efficiency of renewable grid resources. 102 For example, AI can be instrumental in the operation and optimization of solar and wind farms, making these utility-scale renewable-energy systems far more efficient at producing electricity. 103 AI can also assist in reducing energy losses in electricity transportation and allocation. 104 A distribution system operator in Europe used AI to analyze load, voltage, and network distribution data, to help “operators assess available capacity on the system and plan for future needs.” 105 AI allowed the distribution system operator to employ existing and new resources to make the distribution of energy assets more readily available and flexible. The International Energy Agency has proposed that energy efficiency is core to the reform of energy systems and will play a key role in reducing the growth of global energy demand to one-third of the current level by 2040.

AI as a building block to promote development in geoscience

The Earth system is of significant scientific interest and affects all aspects of life. 106 The challenges, problems, and promising directions provided by AI are certainly not exhaustive but serve to illustrate that there is great potential for future AI research in this important field. The prosperity, development, and popularization of AI approaches in the geosciences are commonly driven by a posed scientific question, and the best way to succeed is for AI researchers to work closely with geoscientists at all stages of research. That is because geoscientists can better understand which scientific questions are important and novel, which sample collection processes can reasonably exhibit the inherent strengths, which datasets and parameters can be used to answer those questions, and which pre-processing operations should be conducted, such as removing seasonal cycles or smoothing. Similarly, AI researchers are better suited to decide which data analysis approaches are appropriate and available for the data, what the advantages and disadvantages of these approaches are, and what the approaches actually learn. Interpretability is also an important goal in geoscience because, if we can understand the basic reasoning behind the models, patterns, or relationships extracted from the data, they can be used as building blocks in scientific knowledge discovery. Hence, frequent communication between the researchers avoids long detours and ensures that analysis results are indeed beneficial to both geoscientists and AI researchers.

AI in the life sciences

The developments of AI and the life sciences are intertwined. The ultimate goal of AI is to achieve human-like intelligence, as the human brain is capable of multi-tasking, learning with minimal supervision, and generalizing learned skills, all accomplished with high efficiency and low energy cost. 107

Mutual inspiration between AI and neuroscience

In the past decades, neuroscience concepts have been introduced into ML algorithms and have played critical roles in triggering several important advances in AI. For example, the origins of DL methods lie directly in neuroscience, 5 which further stimulated the emergence of the field of RL. 108 Current state-of-the-art CNNs incorporate several hallmarks of neural computation, including nonlinear transduction, divisive normalization, and maximum-based pooling of inputs, 109 which were directly inspired by the unique processing of visual input in the mammalian visual cortex. 110 By introducing the brain's attentional mechanisms, a novel network has been shown to achieve higher accuracy and computational efficiency on difficult multi-object recognition tasks than conventional CNNs. 111 Other neuroscience findings, including the mechanisms underlying working memory, episodic memory, and neural plasticity, have inspired the development of AI algorithms that address several challenges in deep networks. 108 These algorithms can be directly implemented in the design and refinement of brain-machine interfaces and neuroprostheses.
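Two of the neuroscience-inspired operations named above are simple enough to sketch directly. The toy code below applies divisive normalization and max pooling to a hand-made 1-D response vector; the normalization constant and pool size are illustrative choices, not values from the cited work.

```python
# Toy 1-D versions of two operations CNNs borrowed from neural computation.

def divisive_normalize(x, sigma=1.0):
    """Each response divided by the pooled activity of the whole vector."""
    pool = sum(v * v for v in x)
    return [v / (sigma + pool) ** 0.5 for v in x]

def max_pool(x, size=2):
    """Keep only the strongest response in each non-overlapping window."""
    return [max(x[i:i + size]) for i in range(0, len(x), size)]

responses = [1.0, 3.0, 2.0, 0.0]
print(max_pool(responses, size=2))   # [3.0, 2.0]
```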

On the other hand, insights from AI research have the potential to offer new perspectives on the basics of intelligence in the brains of humans and other species. Unlike traditional neuroscientists, AI researchers can formalize the concepts of neural mechanisms in a quantitative language to extract their necessity and sufficiency for intelligent behavior. An important illustration of such exchange is the development of the temporal-difference (TD) methods in RL models and the resemblance of TD-form learning in the brain. 112 Therefore, the China Brain Project covers both basic research on cognition and translational research for brain disease and brain-inspired intelligence technology. 113
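The temporal-difference learning mentioned above can be sketched in a few lines. Here TD(0) learns state values on an invented two-state chain with a terminal reward, illustrating how the TD error (the quantity whose neural analog resembles dopamine signaling) drives the update.

```python
# Minimal tabular TD(0) on a toy two-state chain; all numbers are invented.

def td0(episodes, alpha=0.1, gamma=1.0, n_states=3):
    """Update V(s) toward r + gamma * V(s') after each transition."""
    V = [0.0] * n_states              # state 2 is terminal, its value stays 0
    for ep in episodes:               # ep: list of (state, reward, next_state)
        for s, r, s_next in ep:
            td_error = r + gamma * V[s_next] - V[s]
            V[s] += alpha * td_error  # the TD error is the learning signal
    return V

# The agent always walks 0 -> 1 -> terminal(2), rewarded 1 at the end.
episode = [(0, 0.0, 1), (1, 1.0, 2)]
V = td0([episode] * 200)
print([round(v, 2) for v in V])   # [1.0, 1.0, 0.0]
```

Note how the value of the reward propagates backward: state 1 learns first from the reward itself, then state 0 learns from state 1's estimate.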

AI for omics big data analysis

Currently, AI can perform better than humans in some well-defined tasks, such as omics data analysis and smart agriculture. In the big data era, 114 there are many types of data (variety), the volume of data is large (volume), and data are generated quickly (velocity). The high variety, large volume, and fast velocity of such data make them highly valuable, but also difficult to analyze. Unlike traditional statistics-based methods, AI can easily handle big data and reveal hidden associations.

In genetics studies, there are many successful applications of AI. 115 One key question is to determine whether a single amino acid polymorphism is deleterious. 116 Sequence conservation-based SIFT 117 and network-based SySAP 118 have been developed, but these methods have met bottlenecks and cannot be improved further. Sundaram et al. developed PrimateAI, which can predict the clinical outcome of mutations using a DNN. 119 Another problem is how to call copy-number variations, which play important roles in various cancers. 120 , 121 Glessner et al. proposed a DL-based tool, DeepCNV, whose area under the receiver operating characteristic (ROC) curve was 0.909, much higher than that of other ML methods. 122 In epigenetic studies, m6A modification is one of the most important mechanisms. 123 Zhang et al. developed an ensemble DL predictor (EDLm6APred) for mRNA m6A site prediction. 124 The area under the ROC curve of EDLm6APred was 86.6%, higher than that of existing m6A methylation site prediction models. There are many other DL-based omics tools, such as DeepCpG 125 for methylation, DeepPep 126 for proteomics, AtacWorks 127 for assay for transposase-accessible chromatin with high-throughput sequencing, and DeepTCR 128 for T cell receptor sequencing.
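The area-under-the-ROC-curve figures quoted above (e.g., DeepCNV's 0.909) measure how well a model's scores rank true cases above controls. Below is a minimal rank-based (Mann-Whitney) computation of that quantity, with made-up labels and pathogenicity scores.

```python
# Rank-based AUC: the probability that a randomly chosen positive outscores
# a randomly chosen negative (ties count half). Labels/scores are invented.

def roc_auc(labels, scores):
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]                 # 1 = deleterious variant
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]     # toy predicted pathogenicity
print(round(roc_auc(labels, scores), 3))    # 0.889
```

An AUC of 1.0 would mean perfect ranking; 0.5 is chance level.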

Another emerging application is DL for single-cell sequencing data. Unlike bulk data, in which the sample size is usually much smaller than the number of features, the number of cells in single-cell data can be large compared with the number of genes, which makes DL algorithms applicable to most single-cell data. Since single-cell data are sparse and contain many unmeasured missing values, DeepImpute can accurately impute these missing values in the big gene × cell matrix. 129 During the quality control of single-cell data, it is important to remove doublets; the tool Solo embeds cells using an autoencoder and then builds a feedforward neural network to identify doublets. 130 The “potential energy underlying single-cell gradients” approach used generative modeling to learn the underlying differentiation landscape from time-series single-cell RNA sequencing data. 131
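As a drastically simplified stand-in for what imputation tools do, the sketch below fills each cell's dropout entries with values from its most similar neighboring cell, judged on the genes both cells measured. Real tools such as DeepImpute learn this mapping with deep networks; the tiny gene × cell values here (None marks a dropout) are invented.

```python
# Toy dropout imputation: borrow missing expression values from the most
# similar cell. A real imputer (e.g., DeepImpute) learns this with a DNN.

def impute(cells):
    def similarity(a, b):
        """Negative squared distance over the genes both cells measured."""
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        return -sum((x - y) ** 2 for x, y in shared) if shared else float("-inf")

    filled = []
    for i, cell in enumerate(cells):
        others = [c for j, c in enumerate(cells) if j != i]
        best = max(others, key=lambda c: similarity(cell, c))
        filled.append([v if v is not None else best[g]
                       for g, v in enumerate(cell)])
    return filled

cells = [[5.0, 0.5, None], [4.8, 0.6, 2.0], [0.1, 3.0, 0.0]]
print(impute(cells)[0])   # the dropout borrows 2.0 from the similar cell
```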

In protein structure prediction, the DL-based AlphaFold2 can accurately predict the 3D structures of 98.5% of human proteins and will predict the structures of 130 million proteins of other organisms in the next few months. 132 It is even considered the second-largest breakthrough in the life sciences after the human genome project 133 and will facilitate drug development, among other things.

AI makes modern agriculture smart

Agriculture is entering its fourth revolution, termed agriculture 4.0 or smart agriculture, benefiting from the arrival of the big data era as well as the rapid progress of many advanced technologies, in particular ML and modern information and communication technologies. 134 , 135 Applications of DL, information, and sensing technologies in agriculture cover all stages of agricultural production, including breeding, cultivation, and harvesting.

Traditional breeding usually exploits genetic variation by searching natural variation or using artificial mutagenesis. However, it is hard for either method to expose the whole mutation spectrum. Using DL models trained on existing variants, predictions can be made for multiple unidentified gene loci. 136 For example, an ML method, the multi-criteria rice reproductive gene predictor, was developed and applied to predict coding and lincRNA genes associated with reproductive processes in rice. 137 Moreover, models trained on species with well-studied genomic data (such as Arabidopsis and rice) can also be applied to species with limited genome information (such as wild strawberry and soybean). 138 In most cases, the links between genotypes and phenotypes are more complicated than expected. One gene can respond to multiple phenotypes, and one trait is generally the product of synergism among multiple genes and developmental processes. For this reason, multi-trait DL models have been developed and have enabled genome editing in plant breeding. 139 , 140

It is well known that dynamic and accurate monitoring of crops during the whole growth period is vitally important to precision agriculture. In this new stage of agriculture, both remote sensing and DL play indispensable roles. Specifically, remote sensing (including proximal sensing) can produce agricultural big data from ground, airborne, and space-borne platforms, and has a unique potential to offer an economical approach for non-destructive, timely, objective, synoptic, long-term, and multi-scale information for crop monitoring and management, thereby greatly assisting precision decisions regarding irrigation, nutrients, disease, pests, and yield. 141 , 142 DL makes it possible to simply, efficiently, and accurately discover knowledge from massive and complicated data, especially remote sensing big data characterized by multiple spatial-temporal-spectral dimensions, owing to its strong capability for feature representation and its superiority in capturing the essential relations between observation data and agronomic parameters or crop traits. 135 , 143 The integration of DL and big data for agriculture has shown disruptive potential as large as that of the green revolution. As shown in Figure 7 , in a possible application scenario of smart agriculture, multi-source satellite remote sensing data with various geometric and radiometric information, as well as abundant spectral information from the UV, visible, and shortwave infrared to microwave regions, can be collected. In addition, advanced aircraft systems, such as unmanned aerial vehicles with multi/hyper-spectral cameras on board, and smartphone-based portable devices can be used to obtain multi/hyper-spectral data in specific fields. All types of data can be integrated by DL-based fusion techniques for different purposes and then shared with all users via cloud computing.
On the cloud computing platform, different agricultural remote sensing models, developed by combining data-driven ML methods and physical models, will be deployed and applied to acquire a range of biophysical and biochemical parameters of crops, which will be further analyzed by a decision-making and prediction system to assess current water/nutrient stress and growth status and to predict future development. As a result, an automatic or interactive user service platform can become accessible for making the correct decisions and taking appropriate actions through an integrated irrigation and fertilization system.
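A tiny fragment of the kind of per-pixel decision rule such a platform might apply: compute the normalized difference vegetation index (NDVI) from red and near-infrared reflectance and flag water-stressed pixels for irrigation. The band values and the 0.4 threshold are illustrative only; operational systems fuse many bands and model outputs.

```python
# Toy per-pixel irrigation flagging from two spectral bands.

def ndvi(nir, red):
    """Normalized difference vegetation index; healthy vegetation is high."""
    return (nir - red) / (nir + red)

def flag_for_irrigation(pixels, threshold=0.4):
    """pixels: list of (nir, red) reflectances; return indices to irrigate."""
    return [i for i, (nir, red) in enumerate(pixels)
            if ndvi(nir, red) < threshold]

field = [(0.6, 0.1), (0.5, 0.3), (0.45, 0.4)]   # healthy, moderate, stressed
print(flag_for_irrigation(field))   # [1, 2]
```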

Figure 7. Integration of AI and remote sensing in smart agriculture

Furthermore, DL presents unique advantages in specific agricultural applications, such as dense scenes, which increase the difficulty of manual planting and harvesting. It is reported that CNN and autoencoder models trained with image data are being used increasingly for phenotyping and yield estimation, 144 such as counting fruits in orchards, grain recognition and classification, and disease diagnosis. 145 , 146 , 147 Consequently, this may greatly liberate the labor force.

The application of DL in agriculture is just beginning, and many problems and challenges remain for the future development of DL technology. We believe that, with the continuous acquisition of massive data and the optimization of algorithms, DL will have bright prospects in agricultural production.

AI in physics

The scale of modern physics ranges from the size of a neutron to the size of the Universe ( Figure 8 ). By scale, physics can be divided into four categories: particle physics at the scale of neutrons, nuclear physics at the scale of atoms, condensed matter physics at the scale of molecules, and cosmic physics at the scale of the Universe. AI, and in particular ML, plays an important role in physics at all of these scales, since AI algorithms are becoming the main trend in data analysis, such as the reconstruction and analysis of images.

Figure 8. The scales of modern physics

Speeding up simulations and identifications of particles with AI

There are many applications, or explorations of applications, of AI in particle physics. We cannot cover all of them here, but use lattice quantum chromodynamics (LQCD) and the experiments at the Beijing Spectrometer (BES) and the Large Hadron Collider (LHC) to illustrate the power of ML in both theoretical and experimental particle physics.

LQCD studies the nonperturbative properties of QCD by using Monte Carlo simulations on supercomputers, helping us understand the strong interaction that binds quarks together to form nucleons. The Markov chain Monte Carlo simulations commonly used in LQCD suffer from topological freezing and critical slowing down as the simulations approach the real situation of the actual world. New algorithms aided by DL are being proposed and tested to overcome these difficulties. 148 , 149 Physical observables are extracted from LQCD data, whose signal-to-noise ratio deteriorates exponentially. For non-Abelian gauge theories, such as QCD, complicated contour deformations can be optimized using ML to reduce the variance of LQCD data; proof-of-principle applications in two dimensions have been studied. 150 ML can also be used to reduce the time cost of generating LQCD data. 151

On the experimental side, particle identification (PID) plays an important role. Recently, several PID algorithms were developed for BES-III, among them an ANN-based approach. 152 Extreme gradient boosting has also been used for multi-dimensional distribution reweighting, muon identification, and cluster reconstruction, improving muon identification. U-Net, a convolutional network for pixel-level semantic segmentation widely used in CV, has been applied at BES-III to solve the problem of multi-turn curling track finding for the main drift chamber; the average efficiency and purity for the first turn's hits is about 91% at a threshold of 0.85. Current (and future) particle physics experiments are producing huge amounts of data, and machine learning can be used to discriminate between signal and overwhelming background events. Examples of data analyses at the LHC using supervised ML can be found in a 2018 collaboration. 153 To take potential advantage of quantum computers, quantum ML methods are also being investigated; see, for example, Wu et al., 154 and references therein, for proof-of-concept studies.
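As a minimal sketch of supervised signal/background separation of the sort used at the LHC, the code below trains a logistic regression by stochastic gradient descent on two hand-made event features; real analyses use boosted trees or deep networks on many more features, and all numbers here are invented.

```python
# Toy signal/background classifier: logistic regression trained by SGD on
# two invented event features (arbitrary units).
import math

def train(X, y, lr=0.5, steps=2000):
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(steps):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1 / (1 + math.exp(-z))       # predicted P(signal)
            g = p - yi                       # gradient of the log loss
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, xi):
    z = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return 1 / (1 + math.exp(-z))

X = [(1.2, 0.9), (1.1, 1.0), (0.2, 0.1), (0.3, 0.2)]   # toy events
y = [1, 1, 0, 0]                                        # 1 = signal
w, b = train(X, y)
print(predict(w, b, (1.0, 0.9)) > 0.5)   # True: classified as signal
```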

AI makes nuclear physics powerful

Cosmic-ray muon tomography (muography) 155 is an imaging technology that uses natural cosmic-ray muon radiation rather than artificial radiation, thereby reducing hazards. As an advantage, this technology can detect high-Z materials without destruction, since muons are sensitive to high-Z materials. The Classification Model Algorithm (CMA) is based on supervised classification and gray system theory; it generates a binary classifier and decision function that take a muon track as input and output whether material exists at the given location. AI thus helps users improve the efficiency of muon scanning.

Also relevant to nuclear detection, the Cs 2 LiYCl 6 :Ce (CLYC) scintillator responds to both electrons and neutrons, producing a pulse signal, and can therefore be used to detect both particle types, 156 provided the two can be distinguished by analyzing the pulse shapes, that is, n-γ identification. The traditional approach has been pulse shape discrimination (PSD), which separates the pulses of the two particles by analyzing the distribution of pulse features such as amplitude, width, rise time, and fall time; the two particles can be separated when these features form two distinct Gaussian distributions. Traditional PSD can only analyze single-pulse waveforms, not the multipulse waveforms produced when two particles interact with the CLYC in close succession. This limitation can be overcome with an ANN that classifies pulses into six categories (n, γ, n + n, n + γ, γ + n, γ + γ). Several additional pulse parameters could also be exploited by AI to improve the reconstruction algorithm's efficiency and reduce its error.
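
The charge-comparison flavor of traditional PSD can be sketched in a few lines: integrate the tail of the pulse, divide by the total, and threshold the ratio. The pulse shapes, decay constants, and component fractions below are invented for illustration and are not CLYC calibration values:

```python
import math

def make_pulse(kind, n=64):
    """Synthetic scintillation pulse: fast rise, two-component exponential
    decay. Neutron-like pulses carry a larger slow component (all numbers
    invented, not CLYC calibration data)."""
    slow_frac = 0.35 if kind == "n" else 0.10
    return [(1 - slow_frac) * math.exp(-t / 3.0) + slow_frac * math.exp(-t / 30.0)
            for t in range(n)]

def tail_total_ratio(pulse, tail_start=10):
    """Charge-comparison PSD: fraction of the pulse integral in the tail."""
    return sum(pulse[tail_start:]) / sum(pulse)

r_neutron = tail_total_ratio(make_pulse("n"))
r_gamma = tail_total_ratio(make_pulse("g"))
# Thresholding this ratio separates the two Gaussian populations; it fails
# for pile-up (multipulse) waveforms, which is where the ANN classifier helps.
```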

AI-aided condensed matter physics

AI opens up a new avenue for physical science, especially when a trove of data is available. Recent works demonstrate that ML provides useful insights for improving density functional theory (DFT), in which the single-electron picture of the Kohn-Sham scheme struggles to capture the exchange and correlation effects of many-body systems. Yu et al. proposed a Bayesian optimization algorithm to fit the Hubbard U parameter; compared with the linear response method, the new approach finds the optimal Hubbard U efficiently through a self-consistent process 157 and boosts the accuracy to near hybrid-functional level. Snyder et al. developed an ML density functional for a 1D non-interacting, non-spin-polarized fermion system that yields significantly improved kinetic energies. This method enables a direct approximation of the kinetic energy of a quantum system, can be utilized in orbital-free DFT modeling, and can even bypass solving the Kohn-Sham equation while maintaining quantum-chemical-level precision when a strong correlation term is included. Recently, FermiNet showed that many-body quantum mechanical equations can be solved via AI. AI models also show advantages in capturing interatomic force fields. In 2010, the Gaussian approximation potential (GAP) 158 was introduced as a powerful interatomic force field describing the interactions between atoms. GAP uses kernel regression and invariant many-body representations, and performs quite well; for instance, it can simulate the crystallization of amorphous materials under high pressure fairly accurately. By employing the smooth overlap of atomic positions (SOAP) kernel, 159 the accuracy of the potential can be further enhanced; the resulting SOAP-GAP can be viewed as a field-leading method for AI-driven molecular dynamics simulation.
Several other well-developed AI interatomic potentials exist: crystal graph CNNs provide a widely applicable way of vectorizing crystalline materials; SchNet embeds continuous-filter convolutional layers into its DNNs, easing molecular dynamics because the resulting potentials are continuous in space; and DimeNet constructs a directional message passing neural network that incorporates not only the bond lengths between atoms but also bond angles, dihedral angles, and interactions between unconnected atoms, achieving good accuracy.
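
The kernel regression underlying GAP-style potentials can be sketched in miniature. The toy below fits a Gaussian-kernel ridge regression to energies of a 1D Lennard-Jones dimer (the bare distance stands in for a descriptor; real GAP models use SOAP descriptors and far more data), then predicts the energy at an unseen distance:

```python
import math

def lj(r):
    """Reference energy: 1D Lennard-Jones dimer with epsilon = sigma = 1."""
    return 4.0 * (r ** -12 - r ** -6)

def kernel(a, b, ell=0.1):
    """Gaussian kernel on the interatomic distance (toy descriptor)."""
    return math.exp(-((a - b) ** 2) / (2.0 * ell * ell))

def solve(A, y):
    """Gaussian elimination with partial pivoting (fine for tiny systems)."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

train_r = [0.95 + 0.05 * i for i in range(14)]       # distances 0.95 .. 1.60
train_e = [lj(r) for r in train_r]
lam = 1e-6                                           # ridge regularization
K = [[kernel(a, b) + (lam if i == j else 0.0)
      for j, b in enumerate(train_r)] for i, a in enumerate(train_r)]
alpha = solve(K, train_e)                            # (K + lam*I) alpha = e

def predict(r):
    """Predicted energy: weighted sum of kernels to the training points."""
    return sum(a * kernel(r, b) for a, b in zip(alpha, train_r))
```

predict(1.12) lands close to the true energy lj(1.12) even though 1.12 is not a training point; GAP performs the analogous interpolation in a high-dimensional descriptor space.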

AI helps explore the Universe

AI is one of the newest technologies, while astronomy is one of the oldest sciences. When the two meet, new opportunities for scientific breakthroughs are often triggered. Observations and data analysis play a central role in astronomy. The amount of data collected by modern telescopes has reached unprecedented levels; even the most basic task of constructing a catalog has become challenging with traditional source-finding tools. 160 Astronomers have developed automated and intelligent source-finding tools based on DL, which not only offer significant advantages in operational speed but also facilitate a comprehensive understanding of the Universe by identifying particular forms of objects that cannot be detected by traditional software and visual inspection. 160 , 161

More than a decade ago, a citizen science project called "Galaxy Zoo" was launched, posting one million galaxy images collected by the Sloan Digital Sky Survey (SDSS) online and recruiting volunteers to label them. 162 Larger optical telescopes, in operation or under construction, produce data volumes several orders of magnitude larger than SDSS; even with volunteers involved, there is no way to analyze the vast amount of data received. The advantages of ML are not limited to source-finding and galaxy classification; in fact, it has a much wider range of applications. For example, CNNs play an important role in detecting and decoding gravitational wave signals in real time, reconstructing all parameters within 2 ms, while traditional algorithms take several days to accomplish the same task. 163 Such DL systems have also been used to automatically generate alerts for transients and to track asteroids and other fast-moving near-Earth objects, improving detection efficiency by several orders of magnitude. In addition, astrophysicists are exploring the use of neural networks to measure galaxy clusters and study the evolution of the Universe.
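
To appreciate the speed-up, it helps to see what the traditional approach does: matched filtering slides a waveform template across the data stream and scores the correlation at every offset, which is why searching a large template bank is expensive. A minimal sketch with an invented toy chirp and noise level:

```python
import math
import random

def chirp(n=64):
    """Toy 'chirp' whose frequency rises with time, loosely mimicking an
    inspiral waveform (parameters invented for illustration)."""
    return [math.sin(2 * math.pi * (0.02 + 0.001 * i) * i) for i in range(n)]

def matched_filter(data, template):
    """Slide the template over the data; return the offset of maximum
    correlation. Cost is O(len(data) * len(template)) per template."""
    best_off, best_corr = 0, float("-inf")
    for off in range(len(data) - len(template) + 1):
        corr = sum(data[off + i] * template[i] for i in range(len(template)))
        if corr > best_corr:
            best_off, best_corr = off, corr
    return best_off

rng = random.Random(0)
template = chirp()
true_offset = 100
strain = [0.1 * rng.gauss(0.0, 1.0) for _ in range(300)]   # detector noise
for i, v in enumerate(template):                           # bury the signal
    strain[true_offset + i] += v

found = matched_filter(strain, template)
```

The filter recovers the hidden arrival time; a trained CNN replaces this exhaustive sliding search with a single forward pass, which is where the real-time gain comes from.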

In addition to the amazing speed, neural networks seem to have a deeper understanding of the data than expected and can recognize more complex patterns, indicating that the “machine” is evolving rather than just learning the characteristics of the input data.

AI in chemistry

Chemistry plays an important "central" role among the sciences 164 because it investigates the structure and properties of matter and identifies the chemical reactions that convert substances into other substances. Accordingly, chemistry is a data-rich branch of science containing complex information resulting from centuries of experiments and, more recently, decades of computational analysis. This vast treasure trove of data is most apparent within the Chemical Abstracts Service, which has collected more than 183 million unique organic and inorganic substances, including alloys, coordination compounds, minerals, mixtures, polymers, and salts, and is expanding by thousands of new substances daily. 165 The unlimited complexity in the variety of material compounds explains why chemistry research is still a labor-intensive task. This level of complexity and the vast amounts of data within chemistry provide a prime opportunity for significant breakthroughs through the application of AI. First, the types of molecules that can be constructed from atoms are almost unlimited, leading to an unlimited chemical space 166 ; the interconnections of these molecules with all possible combinations of factors, such as temperature, substrates, and solvents, are overwhelmingly numerous, giving rise to an unlimited reaction space. 167 Exploring the unlimited chemical and reaction spaces, and navigating to the optima with the desired properties, is thus practically impossible through human effort alone. Secondly, the huge assortment of molecules and their interplay with external environments brings a new level of complexity in chemistry that cannot simply be predicted from physical laws.
While many concepts, rules, and theories have been generalized from centuries of experience studying trivial (i.e., single-component) systems, nontrivial complexities become more likely as we discover that "more is different," in the words of Philip Warren Anderson, American physicist and Nobel Laureate. 168 Nontrivial complexity emerges as scale changes and symmetry breaks in larger, increasingly complex systems, and the governing rules shift from quantitative to qualitative. Lacking a systematic, analytical theory of the structures, properties, and transformations of macroscopic substances, chemistry research has thus been guided largely by heuristics and fragmentary rules accumulated over previous centuries, with progress often proceeding through trial and error. ML recognizes patterns in large amounts of data, thereby offering an unprecedented way of dealing with this complexity and reshaping chemistry research by revolutionizing the way data are used. Every sub-field of chemistry has by now utilized some form of AI, including tools for chemistry research and data generation, such as analytical chemistry and computational chemistry, as well as applications in organic chemistry, catalysis, and medical chemistry, which we discuss herein.

AI breaks the limitations of manual feature selection methods

In analytical chemistry, information extraction has traditionally relied heavily on feature selection techniques based on prior human experience. Unfortunately, this approach is inefficient, incomplete, and often biased. Automated data analysis based on AI breaks the limitations of manual variable selection by learning from large amounts of data. Feature selection through DL algorithms enables information extraction from datasets in NMR, chromatography, spectroscopy, and other analytical tools, 169 thereby improving model prediction accuracy. These ML approaches greatly accelerate the analysis of materials, leading to the rapid discovery of new molecules or materials. Raman scattering, for instance, has been widely employed since its discovery in the 1920s as a powerful vibrational spectroscopy technique, providing vibrational fingerprints intrinsic to analytes and thus enabling the identification of molecules. 170 Recently, ML methods have been trained to recognize features in Raman (or SERS) spectra and identify analytes by applying DL networks, including ANNs, CNNs, and fully convolutional networks for feature engineering. 171 For example, Leong et al. designed a machine-learning-driven "SERS taster" to simultaneously harness useful vibrational information from multiple receptors for enhanced multiplex profiling of five wine flavor molecules at ppm levels. Principal-component analysis is employed to discriminate alcohols with varying degrees of substitution, and support vector machine discriminant analysis quantitatively classifies all flavors with 100% accuracy. 172 Overall, AI techniques provide the first glimmer of hope for a universal method of spectral data analysis that is fast, accurate, objective, and definitive, with attractive advantages in a wide range of applications.
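
The PCA-plus-classifier pipeline used in such spectral "tasters" can be sketched on synthetic data. The toy below generates two classes of invented spectra, extracts the leading principal component by power iteration, and classifies by nearest centroid (a simpler stand-in for the SVM discriminant analysis used in the actual study):

```python
import math
import random

rng = random.Random(0)

def spectrum(peak_pos, n=40, noise=0.05):
    """Synthetic 'spectrum': one Gaussian band plus noise (invented data)."""
    return [math.exp(-((i - peak_pos) ** 2) / 8.0) + rng.gauss(0.0, noise)
            for i in range(n)]

# Two analyte classes whose bands sit at different positions
X = [spectrum(12) for _ in range(20)] + [spectrum(26) for _ in range(20)]
y = [0] * 20 + [1] * 20
n, d = len(X), len(X[0])

# Mean-center the spectra
mean = [sum(row[j] for row in X) / n for j in range(d)]
Xc = [[row[j] - mean[j] for j in range(d)] for row in X]

# Leading principal component via power iteration on Xc^T Xc
v = [rng.gauss(0.0, 1.0) for _ in range(d)]
for _ in range(50):
    s = [sum(Xc[i][j] * v[j] for j in range(d)) for i in range(n)]   # Xc v
    w = [sum(Xc[i][j] * s[i] for i in range(n)) for j in range(d)]   # Xc^T (Xc v)
    norm = math.sqrt(sum(c * c for c in w))
    v = [c / norm for c in w]

# Project onto PC1 and classify by nearest class centroid
proj = [sum(Xc[i][j] * v[j] for j in range(d)) for i in range(n)]
c0 = sum(p for p, t in zip(proj, y) if t == 0) / 20
c1 = sum(p for p, t in zip(proj, y) if t == 1) / 20
pred = [0 if abs(p - c0) < abs(p - c1) else 1 for p in proj]
accuracy = sum(int(p == t) for p, t in zip(pred, y)) / n
```

Because the two bands dominate the variance, the first principal component already separates the classes almost perfectly, mirroring how PCA compresses high-dimensional spectra before classification.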

AI improves the accuracy and efficiency for various levels of computational theory

Complementary to analytical tools, computational chemistry has proven a powerful approach for using simulations to understand chemical properties; however, it faces an accuracy-versus-efficiency dilemma that greatly limits its application to real-world chemistry problems. To overcome this dilemma, ML and other AI methods are being applied to improve the accuracy and efficiency of the various levels of theory used to describe effects arising at different time and length scales in the multi-scale modeling of chemical reactions. 173 Many open challenges in computational chemistry can be addressed by ML approaches, for example, solving Schrödinger's equation, 174 developing atomistic 175 or coarse-grained 176 potentials, constructing reaction coordinates, 177 developing reaction kinetics models, 178 and identifying key descriptors for computable properties. 179 Beyond analytical and computational chemistry, several other disciplines of chemistry have applied AI technology to chemical problems. We discuss organic chemistry, catalysis, and medical chemistry as examples where ML has made a significant impact; many examples exist in the literature for other subfields, and AI will continue to deliver breakthroughs across a wide range of chemical applications.

AI enables robotics capable of automating the synthesis of molecules

Organic chemistry studies the structure, properties, and reactions of carbon-based molecules. The complexity of chemical and reaction space means that, for a given property, an unlimited number of potential molecules could be synthesized. Further complications arise when deciding how to synthesize a particular molecule, since the process relies heavily on heuristics and laborious testing. Researchers have begun addressing these challenges with AI. Given enough data, any property of interest of a molecule can be predicted by mapping the molecular structure to the corresponding property using supervised learning, without resorting to physical laws. Beyond known molecules, new molecules can be designed by sampling the chemical space 180 using methods such as autoencoders and CNNs, with the molecules encoded as sequences or graphs. Retrosynthesis, the planning of synthetic routes, once considered an art, has become much simpler with the help of ML algorithms. The Chematica system, 181 for instance, is now capable of autonomously planning synthetic routes that are subsequently proven to work in the laboratory. Once target molecules and the synthetic route are determined, suitable reaction conditions can be predicted or optimized using ML techniques. 182
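
A flavor of how molecules "coded as sequences" can be compared: the toy below fingerprints SMILES strings with character trigrams and scores Tanimoto similarity. This is a crude, invented stand-in for real molecular fingerprints or learned autoencoder/graph representations:

```python
def ngram_fingerprint(smiles, n=3):
    """Character n-gram set over a SMILES string: a toy stand-in for
    real molecular fingerprints or learned encodings."""
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two fingerprint sets."""
    return len(a & b) / len(a | b)

ethanol = ngram_fingerprint("CCO")
propanol = ngram_fingerprint("CCCO")
benzene = ngram_fingerprint("c1ccccc1")

sim_alcohols = tanimoto(ethanol, propanol)   # structurally similar pair
sim_mixed = tanimoto(ethanol, benzene)       # dissimilar pair
```

Even this primitive encoding ranks the two alcohols as more alike than an alcohol and an aromatic ring; supervised models build property predictors on top of far richer versions of such representations.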

The integration of these AI-based approaches with robotics has enabled fully AI-guided robotics capable of automating the synthesis of small organic molecules without human intervention (Figure 9). 183 , 184

Figure 9. A closed-loop workflow enabling automatic and intelligent design, synthesis, and assay of molecules in organic chemistry by AI

AI helps to search through vast catalyst design spaces

Catalytic chemistry originates from catalyst technologies in the chemical industry for the efficient and sustainable production of chemicals and fuels. Thus far, it remains a challenging endeavor to make novel heterogeneous catalysts with good performance (i.e., stable, active, and selective), because a catalyst's performance depends on many properties: composition, support, surface termination, particle size, particle morphology, atomic coordination environment, porous structure, and reactor conditions during the reaction. The inherent complexity of catalysis makes discovering and developing catalysts with desired properties heavily dependent on intuition and experiment, which is costly and time consuming. AI technologies such as ML, combined with experimental and in silico high-throughput screening of combinatorial catalyst libraries, can aid catalyst discovery by helping to search through vast design spaces. With well-defined structures and standardized data, including reaction results and in situ characterization results, the complex associations between catalytic structure and catalytic performance can be revealed by AI. 185 , 186 Accurate descriptors of the effects of molecules, molecular aggregation states, and molecular transport on catalysts could also be predicted. With this approach, researchers can build virtual laboratories to develop new catalysts and catalytic processes.
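
The search-through-design-spaces idea can be sketched as a tiny active-learning loop: fit a cheap surrogate to the catalysts measured so far, then spend the next "experiment" on the candidate the surrogate ranks highest. Everything below (the volcano-shaped activity function, the single composition descriptor, the k-NN surrogate) is invented for illustration:

```python
import random

rng = random.Random(42)

def activity(x):
    """Stand-in 'experiment': volcano-shaped activity versus one
    composition descriptor x, peaking at x = 0.6 (purely invented)."""
    return 1.0 - (x - 0.6) ** 2

candidates = [i / 20 for i in range(21)]             # coarse composition grid
labeled = {x: activity(x) for x in rng.sample(candidates, 6)}

def surrogate(x, data, k=3):
    """k-nearest-neighbor surrogate for the activity."""
    nearest = sorted(data, key=lambda z: abs(z - x))[:k]
    return sum(data[z] for z in nearest) / k

for _ in range(12):                                  # active-learning loop
    pool = [x for x in candidates if x not in labeled]
    pick = max(pool, key=lambda x: surrogate(x, labeled))
    labeled[pick] = activity(pick)                   # run the 'experiment'

best_found = max(labeled, key=labeled.get)
```

The greedy loop concentrates measurements around the activity peak instead of exhaustively measuring the whole grid, which is the economy that makes ML-guided screening attractive when each "experiment" is a real synthesis or a DFT calculation.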

AI enables screening of chemicals in toxicology with minimum ethical concerns

A more complicated sub-field of chemistry is medical chemistry, a challenging field due to the complex interactions between exotic substances and the inherent chemistry of living systems. Toxicology, for instance, is a broad field that seeks to predict and eliminate substances (e.g., pharmaceuticals, natural products, food products, and environmental substances) that may cause harm to a living organism. Because living organisms are inherently complex, nearly any known substance can cause toxicity at a high enough exposure. Moreover, toxicity depends on an array of other factors, including organism size, species, age, sex, genetics, diet, combination with other chemicals, overall health, and environmental context. Given the scale and complexity of toxicity problems, AI is likely the only realistic approach to meet regulatory requirements for the screening, prioritization, and risk assessment of chemicals (including mixtures), revolutionizing the landscape of toxicology. 187 In summary, AI is turning chemistry from a labor-intensive branch of science into a highly intelligent, standardized, and automated field, achieving far more than human labor alone could. Underlying knowledge, with new concepts, rules, and theories, is expected to advance with the application of AI algorithms, and a large portion of the new chemistry knowledge leading to significant breakthroughs is expected to come from AI-based chemistry research in the decades ahead.

Conclusions

This paper carries out a comprehensive survey of the development and application of AI across a broad range of fundamental sciences, including information science, mathematics, medical science, materials science, geoscience, life science, physics, and chemistry. Although AI is pervasively used across a wide range of applications, ML security risks remain, with both data and models serving as attack targets during the training and execution phases. Firstly, since the performance of an ML system depends heavily on its training data, these input data are crucial to the system's security. For instance, adversarial example attacks 188 supply malicious inputs with small, human-imperceptible perturbations that lead the ML system to make false judgments (predictions or categorizations); data poisoning, the intentional manipulation of raw, training, or testing data, can reduce model accuracy or serve other error-specific attack purposes. Secondly, attacks on ML models include backdoor attacks on DL, CNNs, and federated learning, which manipulate a model's parameters directly, as well as model stealing, model inversion, and membership inference attacks, which can steal model parameters or leak sensitive training data. While a number of defense techniques against these security threats have been proposed, new attack models targeting ML systems are constantly emerging. It is therefore necessary to address ML security and develop robust ML systems that remain effective under malicious attack.
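
The adversarial example attack mentioned above is easy to demonstrate on the simplest possible model. For logistic regression the gradient of the loss with respect to the input is proportional to the weight vector, so a fast-gradient-sign-style perturbation simply steps against sign(w); the weights, input, and (deliberately large) step size below are invented for illustration:

```python
import math

# A fixed "trained" logistic-regression classifier (weights invented)
w = [1.5, -2.0, 0.8]
b = 0.1

def predict_prob(x):
    """P(class 1 | x) for the logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def fgsm_attack(x, eps):
    """Fast-gradient-sign-style perturbation against a class-1 input.
    For logistic regression the input gradient of the loss points along
    -w for the true class 1, so maximizing loss steps along -sign(w)."""
    return [xi - eps * (1.0 if wi > 0 else -1.0) for xi, wi in zip(x, w)]

x = [1.0, -0.5, 0.5]                 # confidently classified as class 1
p_clean = predict_prob(x)            # ~0.95
x_adv = fgsm_attack(x, eps=0.8)      # eps exaggerated for this 3-D toy
p_adv = predict_prob(x_adv)          # drops below 0.5: the label flips
```

In high-dimensional inputs such as images, the same per-coordinate step can be tiny (imperceptible to humans) yet still accumulate enough change in the logit to flip the prediction, which is what makes these attacks so concerning.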

Due to the data-driven character of ML methods, the features of the training and testing data must be drawn from the same distribution, which is difficult to guarantee in practice: the data encountered in deployment may differ from the training dataset, and the data feature distribution may drift over time, degrading model performance. Moreover, if a model is retrained on new data alone, it suffers catastrophic forgetting: it remembers the new features but forgets those previously learned. To solve this problem, a growing number of researchers are studying how to endow models with the capacity for lifelong learning, that is, shifting the computing paradigm from "offline learning + online reasoning" to "online continuous learning," so that a model keeps learning throughout its lifetime, just like a human being.

Acknowledgments

This work was partially supported by the National Key R&D Program of China (2018YFA0404603, 2019YFA0704900, 2020YFC1807000, and 2020YFB1313700), the Youth Innovation Promotion Association CAS (2011225, 2012006, 2013002, 2015316, 2016275, 2017017, 2017086, 2017120, 2017204, 2017300, 2017399, 2018356, 2020111, 2020179, Y201664, Y201822, and Y201911), NSFC (nos. 11971466, 12075253, 52173241, and 61902376), the Foundation of State Key Laboratory of Particle Detection and Electronics (SKLPDE-ZZ-201902), the Program of Science & Technology Service Network of CAS (KFJ-STS-QYZX-050), the Fundamental Science Center of the National Nature Science Foundation of China (nos. 52088101 and 11971466), the Scientific Instrument Developing Project of CAS (ZDKYYQ20210003), the Strategic Priority Research Program (B) of CAS (XDB33000000), the National Science Foundation of Fujian Province for Distinguished Young Scholars (2019J06023), the Key Research Program of Frontier Sciences, CAS (nos. ZDBS-LY-7022 and ZDBS-LY-DQC012), the CAS Project for Young Scientists in Basic Research (no. YSBR-005). The study is dedicated to the 10th anniversary of the Youth Innovation Promotion Association of the Chinese Academy of Sciences.

Author contributions

Y.X., Q.W., Z.A., Fei W., C.L., Z.C., J.M.T., and J.Z. conceived and designed the research. Z.A., Q.W., Fei W., Libo.Z., Y.W., F.D., and C.W.-Q. wrote the "AI in information science" section. Xin.L. wrote the "AI in mathematics" section. J.Q., K.H., W.S., J.W., H.X., Y.H., and X.C. wrote the "AI in medical science" section. E.L., C.F., Z.Y., and M.L. wrote the "AI in materials science" section. Fang W., R.R., S.D., M.V., and F.K. wrote the "AI in geoscience" section. C.H., Z.Z., L.Z., T.Z., J.D., J.Y., L.L., M.L., and T.H. wrote the "AI in life sciences" section. Z.L., S.Q., and T.A. wrote the "AI in physics" section. X.L., B.Z., X.H., S.C., X.L., W.Z., and J.P.L. wrote the "AI in chemistry" section. Y.X., Q.W., and Z.A. wrote the "Abstract," "introduction," "history of AI," and "conclusions" sections.

Declaration of interests

The authors declare no competing interests.

Published Online: October 28, 2021

IMAGES

  1. How To Write A Research Paper On Artificial Intelligence?

    artificial intelligence based research paper

  2. (PDF) A Review of Artificial Intelligence Methods for Data Science and

    artificial intelligence based research paper

  3. Research Paper On Artificial Intelligence In Medicine Pdf

    artificial intelligence based research paper

  4. 😀 Research paper artificial intelligence. Research paper on artificial

    artificial intelligence based research paper

  5. (PDF) A Study on Artificial Intelligence Technologies and its

    artificial intelligence based research paper

  6. (PDF) Research Paper on Artificial Intelligence

    artificial intelligence based research paper

VIDEO

  1. Solution of Artificial Intelligence Question Paper || AI || 843 Class 12 || CBSE Board 2023-24

  2. Artificial Intelligence (AI) Sample Paper class 10 2023 -24

  3. Artificial Intelligence Based Smart Attendance System

  4. ARTIFICIAL INTELLIGENCE 💯 QUESTIONS PAPER 📜💫💫🤗@StudyCampus2023 #boardexamination #board

  5. Artificial Intelligence: Knowledge Representation And Reasoning

  6. Artificial intelligence April may 2023 question paper

COMMENTS

  1. Scientific discovery in the age of artificial intelligence

    Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect ...

  2. Artificial intelligence: A powerful paradigm for scientific research

    Artificial intelligence (AI) is a rapidly evolving field that has transformed various domains of scientific research. This article provides an overview of the history, applications, challenges, and opportunities of AI in science. It also discusses how AI can enhance scientific creativity, collaboration, and communication. Learn more about the potential and impact of AI in science by reading ...

  3. AIJ

    The journal of Artificial Intelligence (AIJ) welcomes papers on broad aspects of AI that constitute advances in the overall field including, but not limited to, cognition and AI, automated reasoning and inference, case-based reasoning, commonsense reasoning, computer vision, constraint …. View full aims & scope.

  4. AI-Based Modeling: Techniques, Applications and Research ...

    Artificial intelligence (AI) is a leading technology of the current age of the Fourth Industrial Revolution (Industry 4.0 or 4IR), with the capability of incorporating human behavior and intelligence into machines or systems. Thus, AI-based modeling is the key to build automated, intelligent, and smart systems according to today's needs. To solve real-world issues, various types of AI such ...

  5. A Survey of Artificial Intelligence Challenges: Analyzing the ...

    From a historical point of view, the evolution of AI-based systems starts with artificial narrow intelligence (ANI), then continues with artificial general intelligence (AGI), and finally meets artificial super intelligence (ASI), which will surpass human capabilities in all dimensions [13,14].All of the mentioned terms will be explained in the rest of this paper.

  6. Artificial Intelligence authors/titles recent submissions

    A Multimodal Automated Interpretability Agent. Tamar Rott Shaham, Sarah Schwettmann, Franklin Wang, Achyuta Rajaram, Evan Hernandez, Jacob Andreas, Antonio Torralba. Comments: 25 pages, 13 figures. Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

  7. Forecasting the future of artificial intelligence with machine learning

    The corpus of scientific literature grows at an ever-increasing speed. Specifically, in the field of artificial intelligence (AI) and machine learning (ML), the number of papers every month is ...

  8. The collective use and evaluation of generative AI tools in digital

    The advent of generative artificial intelligence (GenAI) technologies has revolutionized research, with significant implications for Digital Humanities (DH), a field inherently intertwined with technological progress. This article investigates how digital humanities scholars adopt, practice, as well as critically evaluate, GenAI technologies such as ChatGPT in the research process. Drawing on ...

  9. AI

    AI. AI is an international, peer-reviewed, open access journal on artificial intelligence (AI), including broad aspects of cognition and reasoning, perception and planning, machine learning, intelligent robotics, and applications of AI, published quarterly online by MDPI. Open Access — free for readers, with article processing charges (APC ...

  10. [2404.12138v1] Character is Destiny: Can Large Language Models Simulate

    Can Large Language Models substitute humans in making important decisions? Recent research has unveiled the potential of LLMs to role-play assigned personas, mimicking their knowledge and linguistic habits. However, imitative decision-making requires a more nuanced understanding of personas. In this paper, we benchmark the ability of LLMs in persona-driven decision-making. Specifically, we ...

  11. AI Index Report

    Mission. The AI Index report tracks, collates, distills, and visualizes data related to artificial intelligence (AI). Our mission is to provide unbiased, rigorously vetted, broadly sourced data in order for policymakers, researchers, executives, journalists, and the general public to develop a more thorough and nuanced understanding of the complex field of AI.

  12. Could AI help you to write your next paper?

    OpenAI, a research laboratory in San Francisco, California, created the most well-known LLM, GPT-3, in 2020, by training a network to predict the next piece of text based on what came before. On ...

  13. PDF The Impact of Artificial Intelligence on Innovation

    ABSTRACT. Artificial intelligence may greatly increase the efficiency of the existing economy. But it may have an even larger impact by serving as a new general-purpose "method of invention" that can reshape the nature of the innovation process and the organization of R&D.

  14. Artificial intelligence in information systems research: A systematic

    identify the opportunities for future AI research in IS. The structure of the paper is as follows. First, an introduction to related work on AI in the IS field is presented. ... that "theories of intelligence and the goal of Artificial Intelligence (A.I.) ... The set of outcomes directly based on the research results obtained from the data ...

  15. Students' Intention toward Artificial Intelligence in the Context of

    The analysis of students' attitudes and perceptions represents a basis for enhancing different types of activities, including teaching, learning, assessment, etc. Emphasis might be placed on the implementation of modern procedures and technologies, which play an important role in the process of digital transformation. Among them is artificial intelligence—a technology that has already been ...

  16. Cognitive psychology-based artificial intelligence review

    Introduction. At present, in the development of artificial intelligence (AI), the scientific community is mostly based on brain cognition research (Nadji-Tehrani and Eslami, 2020), which is to reproduce the real physiological activities of our human brain through computer software.This replication of the biology of the human brain cannot well simulate the subjective psychological changes ...

  17. Artificial intelligence in healthcare: transforming the practice of

    Artificial intelligence (AI) is a powerful and disruptive area of computer science, with the potential to fundamentally transform the practice of medicine and the delivery of healthcare. In this review article, we outline recent breakthroughs in the application of AI in healthcare, describe a roadmap to building effective, reliable and safe AI ...

  18. PDF CHAPTER 1: Index Report 2024 Research and Development
