How to Synthesize Written Information from Multiple Sources

Shona McCombes

Content Manager

B.A., English Literature, University of Glasgow

Shona McCombes is the content manager at Scribbr, Netherlands.


Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD, is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


When you write a literature review or essay, you have to go beyond just summarizing the articles you’ve read – you need to synthesize the literature to show how it all fits together (and how your own research fits in).

Synthesizing simply means combining. Instead of summarizing the main points of each source in turn, you put together the ideas and findings of multiple sources in order to make an overall point.

At the most basic level, this involves looking for similarities and differences between your sources. Your synthesis should show the reader where the sources overlap and where they diverge.

Unsynthesized Example

Franz (2008) studied undergraduate online students. He looked at 17 females and 18 males and found that none of them liked APA. According to Franz, the evidence suggested that all students are reluctant to learn citation style. Perez (2010) also studied undergraduate students. She looked at 42 females and 50 males and found that males were significantly more inclined to use citation software (p < .05). Findings suggest that females might graduate sooner. Goldstein (2012) looked at British undergraduates. Among a sample of 50, all female, all were confident in their ability to cite and were eager to write their dissertations.

Synthesized Example

Studies of undergraduate students reveal conflicting conclusions regarding relationships between advanced scholarly study and citation efficacy. Although Franz (2008) found that no participants enjoyed learning citation style, Goldstein (2012) determined in a larger study that all participants felt comfortable citing sources, suggesting that variables among participant and control group populations must be examined more closely. Although Perez (2010) expanded on Franz’s original study with a larger, more diverse sample…

Step 1: Organize your sources

After collecting the relevant literature, you’ve got a lot of information to work through, and no clear idea of how it all fits together.

Before you can start writing, you need to organize your notes in a way that allows you to see the relationships between sources.

One way to begin synthesizing the literature is to put your notes into a table. Depending on your topic and the type of literature you’re dealing with, there are a couple of different ways you can organize this.

Summary table

A summary table collates the key points of each source under consistent headings. This is a good approach if your sources tend to have a similar structure – for instance, if they’re all empirical papers.

Each row in the table lists one source, and each column identifies a specific part of the source. You can decide which headings to include based on what’s most relevant to the literature you’re dealing with.

For example, you might include columns for things like aims, methods, variables, population, sample size, and conclusion.

For each study, you briefly summarize each of these aspects. You can also include columns for your own evaluation and analysis.

[Image: summary table for synthesizing the literature]

The summary table gives you a quick overview of the key points of each source. This allows you to group sources by relevant similarities, as well as to notice important differences or contradictions in their findings.
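To make the idea concrete, here is a minimal sketch of a summary table kept as a simple data structure, using the three studies from the unsynthesized example above and the column headings suggested in this section. The use of Python (or of code at all, rather than a spreadsheet or document table) is only illustrative.

```python
# Summary table: one entry per source, one field per aspect of the study.
# The details restate the example studies discussed earlier in this article.
summary_table = [
    {
        "source": "Franz (2008)",
        "population": "undergraduate online students",
        "sample": "35 (17 female, 18 male)",
        "findings": "no participants liked APA; students reluctant to learn citation style",
    },
    {
        "source": "Perez (2010)",
        "population": "undergraduate students",
        "sample": "92 (42 female, 50 male)",
        "findings": "males significantly more inclined to use citation software (p < .05)",
    },
    {
        "source": "Goldstein (2012)",
        "population": "British undergraduates",
        "sample": "50 (all female)",
        "findings": "all participants confident in their ability to cite",
    },
]

# Scan the table row by row to spot overlaps and contradictions between sources.
for row in summary_table:
    print(f"{row['source']} | {row['population']} | n = {row['sample']} | {row['findings']}")
```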

Synthesis matrix

A synthesis matrix is useful when your sources are more varied in their purpose and structure – for example, when you’re dealing with books and essays making various different arguments about a topic.

Each column in the table lists one source. Each row is labeled with a specific concept, topic or theme that recurs across all or most of the sources.

Then, for each source, you summarize the main points or arguments related to the theme.

[Image: synthesis matrix]

The purpose of the table is to identify the common points that connect the sources, as well as points where they diverge or disagree.
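As a rough sketch of the same idea in miniature, the matrix below again uses the three example studies from earlier in this article; the two themes are only illustrative, and in practice your themes would come from the topics that recur in your own notes.

```python
# Synthesis matrix: columns are sources, rows are recurring themes.
# Each cell notes what (if anything) a source says about that theme.
sources = ["Franz (2008)", "Perez (2010)", "Goldstein (2012)"]

matrix = {
    "attitudes toward citation": [
        "no participants liked APA",
        "males more inclined to use citation software",
        "all participants confident in their ability to cite",
    ],
    "sample composition": [
        "17 female / 18 male online students",
        "42 female / 50 male undergraduates",
        "50 female British undergraduates",
    ],
}

# Read across each row to see where the sources converge or disagree on a theme.
for theme, cells in matrix.items():
    print(theme.upper())
    for source, cell in zip(sources, cells):
        print(f"  {source}: {cell}")
```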

Step 2: Outline your structure

Now you should have a clear overview of the main connections and differences between the sources you’ve read. Next, you need to decide how you’ll group them together and the order in which you’ll discuss them.

For shorter papers, your outline can just identify the focus of each paragraph; for longer papers, you might want to divide it into sections with headings.

There are a few different approaches you can take to help you structure your synthesis.

If your sources cover a broad time period, and you found patterns in how researchers approached the topic over time, you can organize your discussion chronologically.

That doesn’t mean you just summarize each paper in chronological order; instead, you should group articles into time periods and identify what they have in common, as well as signalling important turning points or developments in the literature.

If the literature covers various different topics, you can organize it thematically.

That means that each paragraph or section focuses on a specific theme and explains how that theme is approached in the literature.

[Image: synthesizing the literature using themes. Source used with permission: The Chicago School]

If you’re drawing on literature from various different fields, or your sources use a wide variety of research methods, you can organize them methodologically.

That means grouping together studies based on the type of research they did and discussing the findings that emerged from each method.

If your topic involves a debate between different schools of thought, you can organize it theoretically.

That means comparing the different theories that have been developed and grouping together papers based on the position or perspective they take on the topic, as well as evaluating which arguments are most convincing.

Step 3: Write paragraphs with topic sentences

What sets a synthesis apart from a summary is that it combines various sources. The easiest way to think about this is that each paragraph should discuss a few different sources, and you should be able to condense the overall point of the paragraph into one sentence.

This is called a topic sentence, and it usually appears at the start of the paragraph. The topic sentence signals what the whole paragraph is about; every sentence in the paragraph should be clearly related to it.

A topic sentence can be a simple summary of the paragraph’s content:

“Early research on [x] focused heavily on [y].”

For an effective synthesis, you can use topic sentences to link back to the previous paragraph, highlighting a point of debate or critique:

“Several scholars have pointed out the flaws in this approach.”

“While recent research has attempted to address the problem, many of these studies have methodological flaws that limit their validity.”

By using topic sentences, you can ensure that your paragraphs are coherent and clearly show the connections between the articles you are discussing.

As you write your paragraphs, avoid quoting directly from sources: use your own words to explain the commonalities and differences that you found in the literature.

Don’t try to cover every single point from every single source – the key to synthesizing is to extract the most important and relevant information and combine it to give your reader an overall picture of the state of knowledge on your topic.

Step 4: Revise, edit and proofread

Like any other piece of academic writing, synthesizing literature doesn’t happen all in one go – it involves redrafting, revising, editing and proofreading your work.

Checklist for Synthesis

  •   Do I introduce the paragraph with a clear, focused topic sentence?
  •   Do I discuss more than one source in the paragraph?
  •   Do I mention only the most relevant findings, rather than describing every part of the studies?
  •   Do I discuss the similarities or differences between the sources, rather than summarizing each source in turn?
  •   Do I put the findings or arguments of the sources in my own words?
  •   Is the paragraph organized around a single idea?
  •   Is the paragraph directly relevant to my research question or topic?
  •   Is there a logical transition from this paragraph to the next one?

Further Information

How to Synthesise: a Step-by-Step Approach

Help…I’ve Been Asked to Synthesize!

Learn how to Synthesise (combine information from sources)

How to write a Psychology Essay




Learning about Synthesis and Analysis

What Do Synthesis and Analysis Mean?

Synthesis: the combination of ideas to

  •   show commonalities or patterns

Analysis: a detailed examination

  •   of elements, ideas, or the structure of something
  •   can be a basis for discussion or interpretation

Synthesis and Analysis: combine and examine ideas to

  •   show how commonalities, patterns, and elements fit together
  •   form a unified point for a theory, discussion, or interpretation
  •   develop an informed evaluation of the idea by presenting several different viewpoints and/or ideas

Key Resource: Synthesis Matrix


A synthesis matrix is an excellent tool for organizing sources by theme and for seeing the similarities and differences, as well as any important patterns in methodology and recommendations for future research. Using a synthesis matrix can assist you not only in synthesizing and analyzing, but also in finding a researchable problem and gaps in methodology and/or research.


Use the Synthesis Matrix Template attached below to organize your research by theme and look for patterns in your sources. Use the companion handout, "Types of Articles," to help you identify the different article types for the sources you are using in your matrix. If you have any questions about how to use the synthesis matrix, sign up for the synthesis and analysis group session to practice using it with Dr. Sara Northern!


Analysis vs. Synthesis

What's the Difference?

Analysis and synthesis are two fundamental processes in problem-solving and decision-making. Analysis involves breaking down a complex problem or situation into its constituent parts, examining each part individually, and understanding their relationships and interactions. It focuses on understanding the components and their characteristics, identifying patterns and trends, and drawing conclusions based on evidence and data. On the other hand, synthesis involves combining different elements or ideas to create a new whole or solution. It involves integrating information from various sources, identifying commonalities and differences, and generating new insights or solutions. While analysis is more focused on understanding and deconstructing a problem, synthesis is about creating something new by combining different elements. Both processes are essential for effective problem-solving and decision-making, as they complement each other and provide a holistic approach to understanding and solving complex problems.


Further Detail

Introduction

Analysis and synthesis are two fundamental processes in various fields of study, including science, philosophy, and problem-solving. While they are distinct approaches, they are often interconnected and complementary. Analysis involves breaking down complex ideas or systems into smaller components to understand their individual parts and relationships. On the other hand, synthesis involves combining separate elements or ideas to create a new whole or understanding. In this article, we will explore the attributes of analysis and synthesis, highlighting their differences and similarities.

Attributes of Analysis

1. Focus on details: Analysis involves a meticulous examination of individual components, details, or aspects of a subject. It aims to understand the specific characteristics, functions, and relationships of these elements. By breaking down complex ideas into smaller parts, analysis provides a deeper understanding of the subject matter.

2. Objective approach: Analysis is often driven by objectivity and relies on empirical evidence, data, or logical reasoning. It aims to uncover patterns, trends, or underlying principles through systematic observation and investigation. By employing a structured and logical approach, analysis helps in drawing accurate conclusions and making informed decisions.

3. Critical thinking: Analysis requires critical thinking skills to evaluate and interpret information. It involves questioning assumptions, identifying biases, and considering multiple perspectives. Through critical thinking, analysis helps in identifying strengths, weaknesses, opportunities, and threats, enabling a comprehensive understanding of the subject matter.

4. Reductionist approach: Analysis often adopts a reductionist approach, breaking down complex systems into simpler components. This reductionist perspective allows for a detailed examination of each part, facilitating a more in-depth understanding of the subject matter. However, it may sometimes overlook the holistic view or emergent properties of the system.

5. Diagnostic tool: Analysis is commonly used as a diagnostic tool to identify problems, errors, or inefficiencies within a system. By examining individual components and their interactions, analysis helps in pinpointing the root causes of issues, enabling effective problem-solving and optimization.

Attributes of Synthesis

1. Integration of ideas: Synthesis involves combining separate ideas, concepts, or elements to create a new whole or understanding. It aims to generate novel insights, solutions, or perspectives by integrating diverse information or viewpoints. Through synthesis, complex systems or ideas can be approached holistically, considering the interconnections and interdependencies between various components.

2. Creative thinking: Synthesis requires creative thinking skills to generate new ideas, concepts, or solutions. It involves making connections, recognizing patterns, and thinking beyond traditional boundaries. By embracing divergent thinking, synthesis enables innovation and the development of unique perspectives.

3. Systems thinking: Synthesis often adopts a systems thinking approach, considering the interactions and interdependencies between various components. It recognizes that the whole is more than the sum of its parts and aims to understand emergent properties or behaviors that arise from the integration of these parts. Systems thinking allows for a comprehensive understanding of complex phenomena.

4. Constructive approach: Synthesis is a constructive process that builds upon existing knowledge or ideas. It involves organizing, reorganizing, or restructuring information to create a new framework or understanding. By integrating diverse perspectives or concepts, synthesis helps in generating comprehensive and innovative solutions.

5. Design tool: Synthesis is often used as a design tool to create new products, systems, or theories. By combining different elements or ideas, synthesis enables the development of innovative and functional solutions. It allows for the exploration of multiple possibilities and the creation of something new and valuable.

Interplay between Analysis and Synthesis

While analysis and synthesis are distinct processes, they are not mutually exclusive. In fact, they often complement each other and are interconnected in various ways. Analysis provides the foundation for synthesis by breaking down complex ideas or systems into manageable components. It helps in understanding the individual parts and their relationships, which is essential for effective synthesis.

On the other hand, synthesis builds upon the insights gained from analysis by integrating separate elements or ideas to create a new whole. It allows for a holistic understanding of complex phenomena, considering the interconnections and emergent properties that analysis alone may overlook. Synthesis also helps in identifying gaps or limitations in existing knowledge, which can then be further analyzed to gain a deeper understanding.

Furthermore, analysis and synthesis often involve an iterative process. Initial analysis may lead to the identification of patterns or relationships that can inform the synthesis process. Synthesis, in turn, may generate new insights or questions that require further analysis. This iterative cycle allows for continuous refinement and improvement of understanding.

Conclusion

Analysis and synthesis are two essential processes that play a crucial role in various fields of study. While analysis focuses on breaking down complex ideas into smaller components to understand their individual parts and relationships, synthesis involves integrating separate elements or ideas to create a new whole or understanding. Both approaches have their unique attributes and strengths, and they often complement each other in a cyclical and iterative process. By employing analysis and synthesis effectively, we can gain a comprehensive understanding of complex phenomena, generate innovative solutions, and make informed decisions.


Module 8: Analysis and Synthesis

Putting It Together: Analysis and Synthesis


The ability to analyze effectively is fundamental to success in college and the workplace, regardless of your major or your career plans. Now that you have an understanding of what analysis is, the keys to effective analysis, and the types of analytic assignments you may face, work on improving your analytic skills by keeping the following important concepts in mind:

  • Recognize that analysis comes in many forms. Any assignment that asks how parts relate to the whole, how something works, what something means, or why something is important is asking for analysis.
  • Suspend judgment before undertaking analysis.
  • Craft analytical theses that address how, why, and so what.
  • Support analytical interpretations with clear, explicitly cited evidence.
  • Remember that all analytical tasks require you to break down or investigate something.

Analysis is the first step towards synthesis, which requires not only thinking critically and investigating a topic or source, but combining thoughts and ideas to create new ones. As you synthesize, you will draw inferences and make connections to broader themes and concepts. It’s this step that will really help add substance, complexity, and interest to your essays.


Analysis (Stanford Encyclopedia of Philosophy)

Analysis has always been at the heart of philosophical method, but it has been understood and practised in many different ways. Perhaps, in its broadest sense, it might be defined as a process of isolating or working back to what is more fundamental by means of which something, initially taken as given, can be explained or reconstructed. The explanation or reconstruction is often then exhibited in a corresponding process of synthesis. This allows great variation in specific method, however. The aim may be to get back to basics, but there may be all sorts of ways of doing this, each of which might be called ‘analysis’. The dominance of ‘analytic’ philosophy in the English-speaking world, and increasingly now in the rest of the world, might suggest that a consensus has formed concerning the role and importance of analysis. This assumes, though, that there is agreement on what ‘analysis’ means, and this is far from clear. On the other hand, Wittgenstein's later critique of analysis in the early (logical atomist) period of analytic philosophy, and Quine's attack on the analytic-synthetic distinction, for example, have led some to claim that we are now in a ‘post-analytic’ age. Such criticisms, however, are only directed at particular conceptions of analysis. If we look at the history of philosophy, and even if we just look at the history of analytic philosophy, we find a rich and extensive repertoire of conceptions of analysis which philosophers have continually drawn upon and reconfigured in different ways. Analytic philosophy is alive and well precisely because of the range of conceptions of analysis that it involves. It may have fragmented into various interlocking subtraditions, but those subtraditions are held together by both their shared history and their methodological interconnections. It is the aim of this article to indicate something of the range of conceptions of analysis in the history of philosophy and their interconnections, and to provide a bibliographical resource for those wishing to explore analytic methodologies and the philosophical issues that they raise.


1. General Introduction

This section provides a preliminary description of analysis—or the range of different conceptions of analysis—and a guide to this article as a whole.

1.1 Characterizations of Analysis

If asked what ‘analysis’ means, most people today immediately think of breaking something down into its components; and this is how analysis tends to be officially characterized. In the Concise Oxford Dictionary, for example, ‘analysis’ is defined as the “resolution into simpler elements by analysing (opp. synthesis)”, the only other uses mentioned being the mathematical and the psychological [ Quotation ]. And in the Oxford Dictionary of Philosophy, ‘analysis’ is defined as “the process of breaking a concept down into more simple parts, so that its logical structure is displayed” [ Quotation ]. The restriction to concepts and the reference to displaying ‘logical structure’ are important qualifications, but the core conception remains that of breaking something down.

This conception may be called the decompositional conception of analysis (see Section 4 ). But it is not the only conception, and indeed is arguably neither the dominant conception in the pre-modern period nor the conception that is characteristic of at least one major strand in ‘analytic’ philosophy. In ancient Greek thought, ‘analysis’ referred primarily to the process of working back to first principles by means of which something could then be demonstrated. This conception may be called the regressive conception of analysis (see Section 2 ). In the work of Frege and Russell, on the other hand, before the process of decomposition could take place, the statements to be analyzed had first to be translated into their ‘correct’ logical form (see Section 6 ). This suggests that analysis also involves a transformative or interpretive dimension. This too, however, has its roots in earlier thought (see especially the supplementary sections on Ancient Greek Geometry and Medieval Philosophy ).

These three conceptions should not be seen as competing. In actual practices of analysis, which are invariably richer than the accounts that are offered of them, all three conceptions are typically reflected, though to differing degrees and in differing forms. To analyze something, we may first have to interpret it in some way, translating an initial statement, say, into the privileged language of logic, mathematics or science, before articulating the relevant elements and structures, and all in the service of identifying fundamental principles by means of which to explain it. The complexities that this schematic description suggests can only be appreciated by considering particular types of analysis.

Understanding conceptions of analysis is not simply a matter of attending to the use of the word ‘analysis’ and its cognates—or obvious equivalents in languages other than English, such as ‘ analusis ’ in Greek or ‘ Analyse ’ in German. Socratic definition is arguably a form of conceptual analysis, yet the term ‘ analusis ’ does not occur anywhere in Plato's dialogues (see Section 2 below). Nor, indeed, do we find it in Euclid's Elements , which is the classic text for understanding ancient Greek geometry: Euclid presupposed what came to be known as the method of analysis in presenting his proofs ‘synthetically’. In Latin, ‘ resolutio ’ was used to render the Greek word ‘ analusis ’, and although ‘resolution’ has a different range of meanings, it is often used synonymously with ‘analysis’ (see the supplementary section on Renaissance Philosophy ). In Aristotelian syllogistic theory, and especially from the time of Descartes, forms of analysis have also involved ‘reduction’; and in early analytic philosophy it was ‘reduction’ that was seen as the goal of philosophical analysis (see especially the supplementary section on The Cambridge School of Analysis ).

Further details of characterizations of analysis that have been offered in the history of philosophy, including all the classic passages and remarks (to which occurrences of ‘[ Quotation ]’ throughout this entry refer), can be found in the supplementary document on Definitions and Descriptions of Analysis. A list of key reference works, monographs and collections can be found in the Annotated Bibliography, §1.

1.2 Guide to this Entry

This entry comprises three sets of documents:

  • The present document
  • Six supplementary documents (one of which is not yet available)
  • An annotated bibliography on analysis, divided into six documents

The present document provides an overview, with introductions to the various conceptions of analysis in the history of philosophy. It also contains links to the supplementary documents, the documents in the bibliography, and other internet resources. The supplementary documents expand on certain topics under each of the six main sections. The annotated bibliography contains a list of key readings on each topic, and is also divided according to the sections of this entry.

2. Ancient Conceptions of Analysis and the Emergence of the Regressive Conception

The word ‘analysis’ derives from the ancient Greek term ‘ analusis ’. The prefix ‘ ana ’ means ‘up’, and ‘ lusis ’ means ‘loosing’, ‘release’ or ‘separation’, so that ‘ analusis ’ means ‘loosening up’ or ‘dissolution’. The term was readily extended to the solving or dissolving of a problem, and it was in this sense that it was employed in ancient Greek geometry and philosophy. The method of analysis that was developed in ancient Greek geometry had an influence on both Plato and Aristotle. Also important, however, was the influence of Socrates's concern with definition, in which the roots of modern conceptual analysis can be found. What we have in ancient Greek thought, then, is a complex web of methodologies, of which the most important are Socratic definition, which Plato elaborated into his method of division, his related method of hypothesis, which drew on geometrical analysis, and the method(s) that Aristotle developed in his Analytics . Far from a consensus having established itself over the last two millennia, the relationships between these methodologies are the subject of increasing debate today. At the heart of all of them, too, lie the philosophical problems raised by Meno's paradox, which anticipates what we now know as the paradox of analysis, concerning how an analysis can be both correct and informative (see the supplementary section on Moore ), and Plato's attempt to solve it through the theory of recollection, which has spawned a vast literature on its own.

‘Analysis’ was first used in a methodological sense in ancient Greek geometry, and the model that Euclidean geometry provided has been an inspiration ever since. Although Euclid's Elements dates from around 300 BC, and hence after both Plato and Aristotle, it is clear that it draws on the work of many previous geometers, most notably, Theaetetus and Eudoxus, who worked closely with Plato and Aristotle. Plato is even credited by Diogenes Laertius ( LEP , I, 299) with inventing the method of analysis, but whatever the truth of this may be, the influence of geometry starts to show in his middle dialogues, and he certainly encouraged work on geometry in his Academy.

The classic source for our understanding of ancient Greek geometrical analysis is a passage in Pappus's Mathematical Collection , which was composed around 300 AD, and hence drew on a further six centuries of work in geometry from the time of Euclid's Elements :

Now analysis is the way from what is sought—as if it were admitted—through its concomitants ( akolouthôn ) in order[,] to something admitted in synthesis. For in analysis we suppose that which is sought to be already done, and we inquire from what it results, and again what is the antecedent of the latter, until we on our backward way light upon something already known and being first in order. And we call such a method analysis, as being a solution backwards ( anapalin lysin ). In synthesis, on the other hand, we suppose that which was reached last in analysis to be already done, and arranging in their natural order as consequents ( epomena ) the former antecedents and linking them one with another, we in the end arrive at the construction of the thing sought. And this we call synthesis. [ Full Quotation ]

Analysis is clearly being understood here in the regressive sense—as involving the working back from ‘what is sought’, taken as assumed, to something more fundamental by means of which it can then be established, through its converse, synthesis. For example, to demonstrate Pythagoras's theorem—that the square on the hypotenuse of a right-angled triangle is equal to the sum of the squares on the other two sides—we may assume as ‘given’ a right-angled triangle with the three squares drawn on its sides. In investigating the properties of this complex figure we may draw further (auxiliary) lines between particular points and find that there are a number of congruent triangles, from which we can begin to work out the relationship between the relevant areas. Pythagoras's theorem thus depends on theorems about congruent triangles, and once these—and other—theorems have been identified (and themselves proved), Pythagoras's theorem can be proved. (The theorem is demonstrated in Proposition 47 of Book I of Euclid's Elements .)

The basic idea here provides the core of the conception of analysis that one can find reflected, in its different ways, in the work of Plato and Aristotle (see the supplementary sections on Plato and Aristotle ). Although detailed examination of actual practices of analysis reveals more than just regression to first causes, principles or theorems, but decomposition and transformation as well (see especially the supplementary section on Ancient Greek Geometry ), the regressive conception dominated views of analysis until well into the early modern period.

Ancient Greek geometry was not the only source of later conceptions of analysis, however. Plato may not have used the term ‘analysis’ himself, but concern with definition was central to his dialogues, and definitions have often been seen as what ‘conceptual analysis’ should yield. The definition of ‘knowledge’ as ‘justified true belief’ (or ‘true belief with an account’, in more Platonic terms) is perhaps the classic example. Plato's concern may have been with real rather than nominal definitions, with ‘essences’ rather than mental or linguistic contents (see the supplementary section on Plato ), but conceptual analysis, too, has frequently been given a ‘realist’ construal. Certainly, the roots of conceptual analysis can be traced back to Plato's search for definitions, as we shall see in Section 4 below.

Further discussion can be found in the supplementary document on Ancient Conceptions of Analysis. Further reading can be found in the Annotated Bibliography, §2.

3. Medieval and Renaissance Conceptions of Analysis

Conceptions of analysis in the medieval and renaissance periods were largely influenced by ancient Greek conceptions. But knowledge of these conceptions was often second-hand, filtered through a variety of commentaries and texts that were not always reliable. Medieval and renaissance methodologies tended to be uneasy mixtures of Platonic, Aristotelian, Stoic, Galenic and neo-Platonic elements, many of them claiming to have some root in the geometrical conception of analysis and synthesis. However, in the late medieval period, clearer and more original forms of analysis started to take shape. In the literature on so-called ‘syncategoremata’ and ‘exponibilia’, for example, we can trace the development of a conception of interpretive analysis. Sentences involving more than one quantifier such as ‘Some donkey every man sees’, for example, were recognized as ambiguous, requiring ‘exposition’ to clarify.
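In modern quantifier notation (an anachronism the medieval authors did not have, used here only to make the ambiguity explicit), the two readings of such a sentence might be rendered as follows:

$$\exists d\,\bigl(\mathrm{Donkey}(d) \wedge \forall m\,(\mathrm{Man}(m) \rightarrow \mathrm{Sees}(m, d))\bigr) \quad \text{(one particular donkey is seen by every man)}$$

$$\forall m\,\bigl(\mathrm{Man}(m) \rightarrow \exists d\,(\mathrm{Donkey}(d) \wedge \mathrm{Sees}(m, d))\bigr) \quad \text{(every man sees some donkey or other)}$$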

In John Buridan's masterpiece of the mid-fourteenth century, the Summulae de Dialectica , we can find all three of the conceptions outlined in Section 1.1 above. He distinguishes explicitly between divisions, definitions and demonstrations, corresponding to decompositional, interpretive and regressive analysis, respectively. Here, in particular, we have anticipations of modern analytic philosophy as much as reworkings of ancient philosophy. Unfortunately, however, these clearer forms of analysis became overshadowed during the Renaissance, despite—or perhaps because of—the growing interest in the original Greek sources. As far as understanding analytic methodologies was concerned, the humanist repudiation of scholastic logic muddied the waters.

Further discussion can be found in the supplementary document on Medieval and Renaissance Conceptions of Analysis. Further reading can be found in the Annotated Bibliography, §3.

4. Early Modern Conceptions of Analysis and the Development of the Decompositional Conception

The scientific revolution in the seventeenth century brought with it new forms of analysis. The newest of these emerged through the development of more sophisticated mathematical techniques, but even these still had their roots in earlier conceptions of analysis. By the end of the early modern period, decompositional analysis had become dominant (as outlined in what follows), but this, too, took different forms, and the relationships between the various conceptions of analysis were often far from clear.

In common with the Renaissance, the early modern period was marked by a great concern with methodology. This might seem unsurprising in such a revolutionary period, when new techniques for understanding the world were being developed and that understanding itself was being transformed. But what characterizes many of the treatises and remarks on methodology that appeared in the seventeenth century is their appeal, frequently self-conscious, to ancient methods (despite, or perhaps—for diplomatic reasons—because of, the critique of the content of traditional thought), although new wine was generally poured into the old bottles. The model of geometrical analysis was a particular inspiration here, albeit filtered through the Aristotelian tradition, which had assimilated the regressive process of going from theorems to axioms with that of moving from effects to causes (see the supplementary section on Aristotle ). Analysis came to be seen as a method of discovery, working back from what is ordinarily known to the underlying reasons (demonstrating ‘the fact’), and synthesis as a method of proof, working forwards again from what is discovered to what needed explanation (demonstrating ‘the reason why’). Analysis and synthesis were thus taken as complementary, although there remained disagreement over their respective merits.

There is a manuscript by Galileo, dating from around 1589, an appropriated commentary on Aristotle's Posterior Analytics , which shows his concern with methodology, and regressive analysis, in particular (see Wallace 1992a and 1992b). Hobbes wrote a chapter on method in the first part of De Corpore , published in 1655, which offers his own interpretation of the method of analysis and synthesis, where decompositional forms of analysis are articulated alongside regressive forms [ Quotations ]. But perhaps the most influential account of methodology, from the middle of the seventeenth century until well into the nineteenth century, was the fourth part of the Port-Royal Logic , the first edition of which appeared in 1662 and the final revised edition in 1683. Chapter 2 (which was the first chapter in the first edition) opens as follows:

The art of arranging a series of thoughts properly, either for discovering the truth when we do not know it, or for proving to others what we already know, can generally be called method. Hence there are two kinds of method, one for discovering the truth, which is known as analysis , or the method of resolution , and which can also be called the method of discovery . The other is for making the truth understood by others once it is found. This is known as synthesis , or the method of composition , and can also be called the method of instruction . [ Fuller Quotations ]

That a number of different methods might be assimilated here is not noted, although the text does go on to distinguish four main types of ‘issues concerning things’: seeking causes by their effects, seeking effects by their causes, finding the whole from the parts, and looking for another part from the whole and a given part ( ibid ., 234). While the first two involve regressive analysis and synthesis, the third and fourth involve decompositional analysis and synthesis.

As the authors of the Logic make clear, this particular part of their text derives from Descartes's Rules for the Direction of the Mind , written around 1627, but only published posthumously in 1684. The specification of the four types was most likely offered in elaborating Descartes's Rule Thirteen, which states: “If we perfectly understand a problem we must abstract it from every superfluous conception, reduce it to its simplest terms and, by means of an enumeration, divide it up into the smallest possible parts.” ( PW , I, 51. Cf. the editorial comments in PW , I, 54, 77.) The decompositional conception of analysis is explicit here, and if we follow this up into the later Discourse on Method , published in 1637, the focus has clearly shifted from the regressive to the decompositional conception of analysis. All the rules offered in the earlier work have now been reduced to just four. This is how Descartes reports the rules he says he adopted in his scientific and philosophical work:

The first was never to accept anything as true if I did not have evident knowledge of its truth: that is, carefully to avoid precipitate conclusions and preconceptions, and to include nothing more in my judgements than what presented itself to my mind so clearly and so distinctly that I had no occasion to doubt it. The second, to divide each of the difficulties I examined into as many parts as possible and as may be required in order to resolve them better. The third, to direct my thoughts in an orderly manner, by beginning with the simplest and most easily known objects in order to ascend little by little, step by step, to knowledge of the most complex, and by supposing some order even among objects that have no natural order of precedence. And the last, throughout to make enumerations so complete, and reviews so comprehensive, that I could be sure of leaving nothing out. ( PW , I, 120.)

The first two are rules of analysis and the second two rules of synthesis. But although the analysis/synthesis structure remains, what is involved here is decomposition/composition rather than regression/progression. Nevertheless, Descartes insisted that it was geometry that influenced him here: “Those long chains composed of very simple and easy reasonings, which geometers customarily use to arrive at their most difficult demonstrations, had given me occasion to suppose that all the things which can fall under human knowledge are interconnected in the same way.” ( Ibid . [ Further Quotations ])

Descartes's geometry did indeed involve the breaking down of complex problems into simpler ones. More significant, however, was his use of algebra in developing ‘analytic’ geometry as it came to be called, which allowed geometrical problems to be transformed into arithmetical ones and more easily solved. In representing the ‘unknown’ to be found by ‘ x ’, we can see the central role played in analysis by the idea of taking something as ‘given’ and working back from that, which made it seem appropriate to regard algebra as an ‘art of analysis’, alluding to the regressive conception of the ancients. Illustrated in analytic geometry in its developed form, then, we can see all three of the conceptions of analysis outlined in Section 1.1 above, despite Descartes's own emphasis on the decompositional conception. For further discussion of this, see the supplementary section on Descartes and Analytic Geometry .
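A minimal illustration of this ‘translation’, not drawn from Descartes himself: the geometrical question of where the line through the origin at 45° meets the unit circle becomes a purely algebraic one,

$$x^2 + y^2 = 1, \quad y = x \;\;\Longrightarrow\;\; 2x^2 = 1 \;\;\Longrightarrow\;\; x = \pm\tfrac{1}{\sqrt{2}},$$

so the intersection points are $\bigl(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\bigr)$ and $\bigl(-\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\bigr)$: the ‘unknown’ is taken as given, manipulated algebraically, and then recovered.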

Descartes's emphasis on decompositional analysis was not without precedents, however. Not only was it already involved in ancient Greek geometry, but it was also implicit in Plato's method of collection and division. We might explain the shift from regressive to decompositional (conceptual) analysis, as well as the connection between the two, in the following way. Consider a simple example: ‘collecting’ all animals and ‘dividing’ them into rational and non-rational, in order to define human beings as rational animals.

On this model, in seeking to define anything, we work back up the appropriate classificatory hierarchy to find the higher (i.e., more basic or more general) ‘Forms’, by means of which we can lay down the definition. Although Plato did not himself use the term ‘analysis’—the word for ‘division’ was ‘ dihairesis ’—the finding of the appropriate ‘Forms’ is essentially analysis. As an elaboration of the Socratic search for definitions, we clearly have in this the origins of conceptual analysis. There is little disagreement that ‘Human beings are rational animals’ is the kind of definition we are seeking, defining one concept, the concept human being , in terms of other concepts, the concepts rational and animal . But the construals that have been offered of this have been more problematic. Understanding a classificatory hierarchy extensionally , that is, in terms of the classes of things denoted, the classes higher up are clearly the larger, ‘containing’ the classes lower down as subclasses (e.g., the class of animals includes the class of human beings as one of its subclasses). Intensionally , however, the relationship of ‘containment’ has been seen as holding in the opposite direction. If someone understands the concept human being , at least in the strong sense of knowing its definition, then they must understand the concepts animal and rational ; and it has often then seemed natural to talk of the concept human being as ‘containing’ the concepts rational and animal . Working back up the hierarchy in ‘analysis’ (in the regressive sense) could then come to be identified with ‘unpacking’ or ‘decomposing’ a concept into its ‘constituent’ concepts (‘analysis’ in the decompositional sense). Of course, talking of ‘decomposing’ a concept into its ‘constituents’ is, strictly speaking, only a metaphor (as Quine was famously to remark in §1 of ‘Two Dogmas of Empiricism’), but in the early modern period, this began to be taken more literally.

For further discussion, see the supplementary document on Early Modern Conceptions of Analysis, which contains sections on Descartes and Analytic Geometry, British Empiricism, Leibniz, and Kant. For further reading, see the Annotated Bibliography, §4.

5. Modern Conceptions of Analysis, outside Analytic Philosophy

As suggested in the supplementary document on Kant, the decompositional conception of analysis found its classic statement in the work of Kant at the end of the eighteenth century. But Kant was only expressing a conception widespread at the time. The conception can be found in a very blatant form, for example, in the writings of Moses Mendelssohn, for whom, unlike Kant, it was applicable even in the case of geometry [ Quotation ]. Typified in Kant's and Mendelssohn's view of concepts, it was also reflected in scientific practice. Indeed, its popularity was fostered by the chemical revolution inaugurated by Lavoisier in the late eighteenth century, the comparison between philosophical analysis and chemical analysis being frequently drawn. As Lichtenberg put it, “Whichever way you look at it, philosophy is always analytical chemistry” [ Quotation ].

This decompositional conception of analysis set the methodological agenda for philosophical approaches and debates in the (late) modern period (nineteenth and twentieth centuries). Responses and developments, very broadly, can be divided into two. On the one hand, an essentially decompositional conception of analysis was accepted, but a critical attitude was adopted towards it. If analysis simply involved breaking something down, then it appeared destructive and life-diminishing, and the critique of analysis that this view engendered was a common theme in idealism and romanticism in all its main varieties—from German, British and French to North American. One finds it reflected, for example, in remarks about the negating and soul-destroying power of analytical thinking by Schiller [ Quotation ], Hegel [ Quotation ] and de Chardin [ Quotation ], in Bradley's doctrine that analysis is falsification [ Quotation ], and in the emphasis placed by Bergson on ‘intuition’ [ Quotation ].

On the other hand, analysis was seen more positively, but the Kantian conception underwent a certain degree of modification and development. In the nineteenth century, this was exemplified, in particular, by Bolzano and the neo-Kantians. Bolzano's most important innovation was the method of variation, which involves considering what happens to the truth-value of a sentence when a constituent term is substituted by another. This formed the basis for his reconstruction of the analytic/synthetic distinction, Kant's account of which he found defective. The neo-Kantians emphasized the role of structure in conceptualized experience and had a greater appreciation of forms of analysis in mathematics and science. In many ways, their work attempts to do justice to philosophical and scientific practice while recognizing the central idealist claim that analysis is a kind of abstraction that inevitably involves falsification or distortion. On the neo-Kantian view, the complexity of experience is a complexity of form and content rather than of separable constituents, requiring analysis into ‘moments’ or ‘aspects’ rather than ‘elements’ or ‘parts’. In the 1910s, the idea was articulated with great subtlety by Ernst Cassirer [ Quotation ], and became familiar in Gestalt psychology.

In the twentieth century, both analytic philosophy and phenomenology can be seen as developing far more sophisticated conceptions of analysis, which draw on but go beyond mere decompositional analysis. The following Section offers an account of analysis in analytic philosophy, illustrating the range and richness of the conceptions and practices that arose. But it is important to see these in the wider context of twentieth-century methodological practices and debates, for it is not just in ‘analytic’ philosophy—despite its name—that analytic methods are accorded a central role. Phenomenology, in particular, contains its own distinctive set of analytic methods, with similarities and differences to those of analytic philosophy. Phenomenological analysis has frequently been compared to conceptual clarification in the ordinary language tradition, for example, and the method of ‘phenomenological reduction’ that Husserl invented in 1905 offers a striking parallel to the reductive project opened up by Russell's theory of descriptions, which also made its appearance in 1905.

Just like Frege and Russell, Husserl's initial concern was with the foundations of mathematics, and in this shared concern we can see the continued influence of the regressive conception of analysis. According to Husserl, the aim of ‘eidetic reduction’, as he called it, was to isolate the ‘essences’ that underlie our various forms of thinking, and to apprehend them by ‘essential intuition’ (‘ Wesenserschauung ’). The terminology may be different, but this resembles Russell's early project to identify the ‘indefinables’ of philosophical logic, as he described it, and to apprehend them by ‘acquaintance’ (cf. POM , xx). Furthermore, in Husserl's later discussion of ‘explication’ (cf. EJ , §§ 22-4 [ Quotations ]), we find appreciation of the ‘transformative’ dimension of analysis, which can be fruitfully compared with Carnap's account of explication (see the supplementary section on Carnap and Logical Positivism ). Carnap himself describes Husserl's idea here as one of “the synthesis of identification between a confused, nonarticulated sense and a subsequently intended distinct, articulated sense” (1950, 3 [ Quotation ]).

Phenomenology is not the only source of analytic methodologies outside those of the analytic tradition. Mention might be made here, too, of R. G. Collingwood, working within the tradition of British idealism, which was still a powerful force prior to the Second World War. In his Essay on Philosophical Method (1933), for example, he criticizes Moorean philosophy, and develops his own response to what is essentially the paradox of analysis (concerning how an analysis can be both correct and informative), which he recognizes as having its root in Meno's paradox. In his Essay on Metaphysics (1940), he puts forward his own conception of metaphysical analysis, in direct response to what he perceived as the mistaken repudiation of metaphysics by the logical positivists. Metaphysical analysis is characterized here as the detection of ‘absolute presuppositions’, which are taken as underlying and shaping the various conceptual practices that can be identified in the history of philosophy and science. Even among those explicitly critical of central strands in analytic philosophy, then, analysis in one form or another can still be seen as alive and well.

Further reading can be found in the Annotated Bibliography, §5.

6. Conceptions of Analysis in Analytic Philosophy and the Introduction of the Logical (Transformative) Conception

If anything characterizes ‘analytic’ philosophy, then it is presumably the emphasis placed on analysis. But as the foregoing sections have shown, there is a wide range of conceptions of analysis, so such a characterization says nothing that would distinguish analytic philosophy from much of what has either preceded or developed alongside it. Given that the decompositional conception is usually offered as the main conception today, it might be thought that it is this that characterizes analytic philosophy. But this conception was prevalent in the early modern period, shared by both the British Empiricists and Leibniz, for example. Given that Kant denied the importance of decompositional analysis, however, it might be suggested that what characterizes analytic philosophy is the value it places on such analysis. This might be true of Moore's early work, and of one strand within analytic philosophy; but it is not generally true. What characterizes analytic philosophy as it was founded by Frege and Russell is the role played by logical analysis , which depended on the development of modern logic. Although other and subsequent forms of analysis, such as linguistic analysis, were less wedded to systems of formal logic, the central insight motivating logical analysis remained.

Pappus's account of method in ancient Greek geometry suggests that the regressive conception of analysis was dominant at the time—however much other conceptions may also have been implicitly involved (see the supplementary section on Ancient Greek Geometry ). In the early modern period, the decompositional conception became widespread (see Section 4 ). What characterizes analytic philosophy—or at least that central strand that originates in the work of Frege and Russell—is the recognition of what was called earlier the transformative or interpretive dimension of analysis (see Section 1.1 ). Any analysis presupposes a particular framework of interpretation, and work is done in interpreting what we are seeking to analyze as part of the process of regression and decomposition. This may involve transforming it in some way, in order for the resources of a given theory or conceptual framework to be brought to bear. Euclidean geometry provides a good illustration of this. But it is even more obvious in the case of analytic geometry, where the geometrical problem is first ‘translated’ into the language of algebra and arithmetic in order to solve it more easily (see the supplementary section on Descartes and Analytic Geometry ). What Descartes and Fermat did for analytic geometry, Frege and Russell did for analytic philosophy. Analytic philosophy is ‘analytic’ much more in the sense that analytic geometry is ‘analytic’ than in the crude decompositional sense that Kant understood it.

The interpretive dimension of modern philosophical analysis can also be seen as anticipated in medieval scholasticism (see the supplementary section on Medieval Philosophy ), and it is remarkable just how much of modern concerns with propositions, meaning, reference, and so on, can be found in the medieval literature. Interpretive analysis is also illustrated in the nineteenth century by Bentham's conception of paraphrasis , which he characterized as “that sort of exposition which may be afforded by transmuting into a proposition, having for its subject some real entity, a proposition which has not for its subject any other than a fictitious entity” [ Full Quotation ]. He applied the idea in ‘analyzing away’ talk of ‘obligations’, and the anticipation that we can see here of Russell's theory of descriptions has been noted by, among others, Wisdom (1931) and Quine in ‘Five Milestones of Empiricism’ [ Quotation ].

What was crucial in the emergence of twentieth-century analytic philosophy, however, was the development of quantificational theory, which provided a far more powerful interpretive system than anything that had hitherto been available. In the case of Frege and Russell, the system into which statements were ‘translated’ was predicate logic, and the divergence that was thereby opened up between grammatical and logical form meant that the process of translation itself became an issue of philosophical concern. This induced greater self-consciousness about our use of language and its potential to mislead us, and inevitably raised semantic, epistemological and metaphysical questions about the relationships between language, logic, thought and reality which have been at the core of analytic philosophy ever since.

Both Frege and Russell (after the latter's initial flirtation with idealism) were concerned to show, against Kant, that arithmetic is a system of analytic and not synthetic truths. In the Grundlagen, Frege had offered a revised conception of analyticity, which arguably endorses and generalizes Kant's logical as opposed to phenomenological criterion, i.e., (ANL) rather than (ANO) (see the supplementary section on Kant):

(AN) A truth is analytic if its proof depends only on general logical laws and definitions.

The question of whether arithmetical truths are analytic then comes down to the question of whether they can be derived purely logically. (Here we already have ‘transformation’, at the theoretical level—involving a reinterpretation of the concept of analyticity.) To demonstrate this, Frege realized that he needed to develop logical theory in order to formalize mathematical statements, which typically involve multiple generality (e.g., ‘Every natural number has a successor’, i.e. ‘For every natural number x there is another natural number y that is the successor of x ’). This development, by extending the use of function-argument analysis in mathematics to logic and providing a notation for quantification, was essentially the achievement of his first book, the Begriffsschrift (1879), where he not only created the first system of predicate logic but also, using it, succeeded in giving a logical analysis of mathematical induction (see Frege FR , 47-78).
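To give a flavour of what formalizing multiple generality involves, the successor statement can be rendered in modern quantifier notation (not Frege's own two-dimensional Begriffsschrift symbolism, and with schematic predicate letters chosen here only for illustration) roughly as:

    \forall x\, \bigl( Nx \rightarrow \exists y\, ( Ny \wedge Syx ) \bigr)

where Nx abbreviates ‘x is a natural number’ and Syx abbreviates ‘y is the successor of x’. The nesting of the two quantifiers is exactly what the surface grammar of ‘Every natural number has a successor’ leaves implicit.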

In his second book, Die Grundlagen der Arithmetik (1884), Frege went on to provide a logical analysis of number statements. His central idea was that a number statement contains an assertion about a concept. A statement such as ‘Jupiter has four moons’ is to be understood not as predicating of Jupiter the property of having four moons, but as predicating of the concept moon of Jupiter the second-level property has four instances , which can be logically defined. The significance of this construal can be brought out by considering negative existential statements (which are equivalent to number statements involving the number 0). Take the following negative existential statement:

(0a) Unicorns do not exist.

If we attempt to analyze this decompositionally , taking its grammatical form to mirror its logical form, then we find ourselves asking what these unicorns are that have the property of non-existence. We may then be forced to posit the subsistence —as opposed to existence —of unicorns, just as Meinong and the early Russell did, in order for there to be something that is the subject of our statement. On the Fregean account, however, to deny that something exists is to say that the relevant concept has no instances: there is no need to posit any mysterious object . The Fregean analysis of (0a) consists in rephrasing it into (0b), which can then be readily formalized in the new logic as (0c):

(0b) The concept unicorn is not instantiated.
(0c) ~(∃x) Fx.

Similarly, to say that God exists is to say that the concept God is (uniquely) instantiated, i.e., to deny that the concept has 0 instances (or 2 or more instances). On this view, existence is no longer seen as a (first-level) predicate, but instead, existential statements are analyzed in terms of the (second-level) predicate is instantiated , represented by means of the existential quantifier. As Frege notes, this offers a neat diagnosis of what is wrong with the ontological argument, at least in its traditional form ( GL , §53). All the problems that arise if we try to apply decompositional analysis (at least straight off) simply drop away, although an account is still needed, of course, of concepts and quantifiers.
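As a further illustration (again in modern notation rather than Frege's own symbolism, with G a schematic predicate letter standing for the concept in question), the claim that a concept is uniquely instantiated can be written as:

    \exists x\, \bigl( Gx \wedge \forall y\, ( Gy \rightarrow y = x ) \bigr)

Existence and uniqueness are expressed here entirely by the quantifiers and the identity sign; no first-level predicate ‘exists’ is attached to any object, which is the point of the Fregean diagnosis of the ontological argument.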

The possibilities that this strategy of ‘translating’ into a logical language opens up are enormous: we are no longer forced to treat the surface grammatical form of a statement as a guide to its ‘real’ form, and are provided with a means of representing that form. This is the value of logical analysis: it allows us to ‘analyze away’ problematic linguistic expressions and explain what is ‘really’ going on. This strategy was employed, most famously, in Russell's theory of descriptions, which was a major motivation behind the ideas of Wittgenstein's Tractatus (see the supplementary sections on Russell and Wittgenstein). Although subsequent philosophers were to question the assumption that there could ever be a definitive logical analysis of a given statement, the idea that ordinary language may be systematically misleading has remained.

To illustrate this, consider the following examples from Ryle's classic 1932 paper, ‘Systematically Misleading Expressions’:

(Ua) Unpunctuality is reprehensible.
(Ta) Jones hates the thought of going to hospital.

In each case, we might be tempted to make unnecessary reifications, taking ‘unpunctuality’ and ‘the thought of going to hospital’ as referring to objects. It is because of this that Ryle describes such expressions as ‘systematically misleading’. (Ua) and (Ta) must therefore be rephrased:

(Ub) Whoever is unpunctual deserves that other people should reprove him for being unpunctual.
(Tb) Jones feels distressed when he thinks of what he will undergo if he goes to hospital.

In these formulations, there is no overt talk at all of ‘unpunctuality’ or ‘thoughts’, and hence nothing to tempt us to posit the existence of any corresponding entities. The problems that otherwise arise have thus been ‘analyzed away’.

At the time that Ryle wrote ‘Systematically Misleading Expressions’, he, too, assumed that every statement had an underlying logical form that was to be exhibited in its ‘correct’ formulation [ Quotations ]. But when he gave up this assumption (for reasons indicated in the supplementary section on The Cambridge School of Analysis ), he did not give up the motivating idea of logical analysis—to show what is wrong with misleading expressions. In The Concept of Mind (1949), for example, he sought to explain what he called the ‘category-mistake’ involved in talk of the mind as a kind of ‘Ghost in the Machine’. His aim, he wrote, was to “rectify the logical geography of the knowledge which we already possess” (1949, 9), an idea that was to lead to the articulation of connective rather than reductive conceptions of analysis, the emphasis being placed on elucidating the relationships between concepts without assuming that there is a privileged set of intrinsically basic concepts (see the supplementary section on Oxford Linguistic Philosophy ).

What these various forms of logical analysis suggest, then, is that what characterizes analysis in analytic philosophy is something far richer than the mere ‘decomposition’ of a concept into its ‘constituents’. But this is not to say that the decompositional conception of analysis plays no role at all. It can be found in the early work of Moore, for example (see the supplementary section on Moore ). It might also be seen as reflected in the approach to the analysis of concepts that seeks to specify the necessary and sufficient conditions for their correct employment. Conceptual analysis in this sense goes back to the Socrates of Plato's early dialogues (see the supplementary section on Plato ). But it arguably reached its heyday in the 1950s and 1960s. As mentioned in Section 2 above, the definition of ‘knowledge’ as ‘justified true belief’ is perhaps the most famous example; and this definition was criticised in Gettier's classic paper of 1963. (For details of this, see the entry in this Encyclopedia on The Analysis of Knowledge .) The specification of necessary and sufficient conditions may no longer be seen as the primary aim of conceptual analysis, especially in the case of philosophical concepts such as ‘knowledge’, which are fiercely contested; but consideration of such conditions remains a useful tool in the analytic philosopher's toolbag.

For a more detailed account of these and related conceptions of analysis, see the supplementary document on

Conceptions of Analysis in Analytic Philosophy .
Annotated Bibliography, §6 .

The history of philosophy reveals a rich source of conceptions of analysis. Their origin may lie in ancient Greek geometry, and to this extent the history of analytic methodologies might be seen as a series of footnotes to Euclid. But analysis developed in different though related ways in the two traditions stemming from Plato and Aristotle, the former based on the search for definitions and the latter on the idea of regression to first causes. The two poles represented in these traditions defined methodological space until well into the early modern period, and in some sense are still reflected today. The creation of analytic geometry in the seventeenth century introduced a more reductive form of analysis, and an analogous and even more powerful form was introduced around the turn of the twentieth century in the logical work of Frege and Russell. Although conceptual analysis, construed decompositionally from the time of Leibniz and Kant, and mediated by the work of Moore, is often viewed as characteristic of analytic philosophy, logical analysis, taken as involving translation into a logical system, is what inaugurated the analytic tradition. Analysis has also frequently been seen as reductive, but connective forms of analysis are no less important. Connective analysis, historically inflected, would seem to be particularly appropriate, for example, in understanding analysis itself.

What follows here is a selection of thirty classic and recent works published over the last half-century that together cover the range of different conceptions of analysis in the history of philosophy. A fuller bibliography, which includes all references cited, is provided as a set of supplementary documents, divided to correspond to the sections of this entry:

Annotated Bibliography on Analysis
  • Baker, Gordon, 2004, Wittgenstein's Method , Oxford: Blackwell, especially essays 1, 3, 4, 10, 12
  • Baldwin, Thomas, 1990, G.E. Moore , London: Routledge, ch. 7
  • Beaney, Michael, 2004, ‘Carnap's Conception of Explication: From Frege to Husserl?’, in S. Awodey and C. Klein, (eds.), Carnap Brought Home: The View from Jena , Chicago: Open Court, pp. 117-50
  • –––, 2005, ‘Collingwood's Conception of Presuppositional Analysis’, Collingwood and British Idealism Studies 11, no. 2, 41-114
  • –––, (ed.), 2007, The Analytic Turn: Analysis in Early Analytic Philosophy and Phenomenology , London: Routledge [includes papers on Frege, Russell, Wittgenstein, C.I. Lewis, Bolzano, Husserl]
  • Byrne, Patrick H., 1997, Analysis and Science in Aristotle , Albany: State University of New York Press
  • Cohen, L. Jonathan, 1986, The Dialogue of Reason: An Analysis of Analytical Philosophy , Oxford: Oxford University Press, chs. 1-2
  • Dummett, Michael, 1991, Frege: Philosophy of Mathematics , London: Duckworth, chs. 3-4, 9-16
  • Engfer, Hans-Jürgen, 1982, Philosophie als Analysis , Stuttgart-Bad Cannstatt: Frommann-Holzboog [Descartes, Leibniz, Wolff, Kant]
  • Garrett, Aaron V., 2003, Meaning in Spinoza's Method , Cambridge: Cambridge University Press, ch. 4
  • Gaukroger, Stephen, 1989, Cartesian Logic , Oxford: Oxford University Press, ch. 3
  • Gentzler, Jyl, (ed.), 1998, Method in Ancient Philosophy , Oxford: Oxford University Press [includes papers on Socrates, Plato, Aristotle, mathematics and medicine]
  • Gilbert, Neal W., 1960, Renaissance Concepts of Method , New York: Columbia University Press
  • Hacker, P.M.S., 1996, Wittgenstein's Place in Twentieth-Century Analytic Philosophy , Oxford: Blackwell
  • Hintikka, Jaakko and Remes, Unto, 1974, The Method of Analysis , Dordrecht: D. Reidel [ancient Greek geometrical analysis]
  • Hylton, Peter, 2005, Propositions, Functions, Analysis: Selected Essays on Russell's Philosophy , Oxford: Oxford University Press
  • –––, 2007, Quine , London: Routledge, ch. 9
  • Jackson, Frank, 1998, From Metaphysics to Ethics: A Defence of Conceptual Analysis , Oxford: Oxford University Press, chs. 2-3
  • Kretzmann, Norman, 1982, ‘Syncategoremata, exponibilia, sophistimata’, in N. Kretzmann et al. , (eds.), The Cambridge History of Later Medieval Philosophy , Cambridge: Cambridge University Press, 211-45
  • Menn, Stephen, 2002, ‘Plato and the Method of Analysis’, Phronesis 47, 193-223
  • Otte, Michael and Panza, Marco, (eds.), 1997, Analysis and Synthesis in Mathematics , Dordrecht: Kluwer
  • Rorty, Richard, (ed.), 1967, The Linguistic Turn , Chicago: University of Chicago Press [includes papers on analytic methodology]
  • Rosen, Stanley, 1980, The Limits of Analysis , New York: Basic Books, repr. Indiana: St. Augustine's Press, 2000 [critique of analytic philosophy from a ‘continental’ perspective]
  • Sayre, Kenneth M., 1969, Plato's Analytic Method , Chicago: University of Chicago Press
  • –––, 2006, Metaphysics and Method in Plato's Statesman , Cambridge: Cambridge University Press, Part I
  • Soames, Scott, 2003, Philosophical Analysis in the Twentieth Century , Volume 1: The Dawn of Analysis , Volume 2: The Age of Meaning , New Jersey: Princeton University Press [includes chapters on Moore, Russell, Wittgenstein, logical positivism, Quine, ordinary language philosophy, Davidson, Kripke]
  • Strawson, P.F., 1992, Analysis and Metaphysics: An Introduction to Philosophy , Oxford: Oxford University Press, chs. 1-2
  • Sweeney, Eileen C., 1994, ‘Three Notions of Resolutio and the Structure of Reasoning in Aquinas’, The Thomist 58, 197-243
  • Timmermans, Benoît, 1995, La résolution des problèmes de Descartes à Kant , Paris: Presses Universitaires de France
  • Urmson, J.O., 1956, Philosophical Analysis: Its Development between the Two World Wars , Oxford: Oxford University Press

Acknowledgments

In first composing this entry (in 2002-3) and then revising the main entry and bibliography (in 2007), I have drawn on a number of my published writings (especially Beaney 1996, 2000, 2002, 2007b, 2007c; see Annotated Bibliography §6.1 , §6.2 ). I am grateful to the respective publishers for permission to use this material. Research on conceptions of analysis in the history of philosophy was initially undertaken while a Research Fellow at the Institut für Philosophie of the University of Erlangen-Nürnberg during 1999-2000, and further work was carried out while a Research Fellow at the Institut für Philosophie of the University of Jena during 2006-7, in both cases funded by the Alexander von Humboldt-Stiftung. In the former case, the account was written up while at the Open University (UK), and in the latter case, I had additional research leave from the University of York. I acknowledge the generous support given to me by all five institutions. I am also grateful to the editors of this Encyclopedia, and to Gideon Rosen and Edward N. Zalta, in particular, for comments and suggestions on the content and organisation of this entry in both its initial and revised form. I would like to thank John Ongley, too, for reviewing the first version of this entry, which has helped me to improve it (see Annotated Bibliography §1.3 ). In updating the bibliography (in 2007), I am indebted to various people who have notified me of relevant works, and especially, Gyula Klima (regarding §2.1), Anna-Sophie Heinemann (regarding §§ 4.2 and 4.4), and Jan Wolenski (regarding §5.3). I invite anyone who has further suggestions of items to be included or comments on the article itself to email me at the address given below.


A Guide to Evidence Synthesis: What is Evidence Synthesis?


What are Evidence Syntheses?


According to the Royal Society, 'evidence synthesis' refers to the process of bringing together information from a range of sources and disciplines to inform debates and decisions on specific issues. They generally include a methodical and comprehensive literature synthesis focused on a well-formulated research question.  Their aim is to identify and synthesize all  of the scholarly research on a particular topic, including both published and unpublished studies. Evidence syntheses are conducted in an unbiased, reproducible way to provide evidence for practice and policy-making, as well as to identify gaps in the research. Evidence syntheses may also include a meta-analysis, a more quantitative process of synthesizing and visualizing data retrieved from various studies. 

Evidence syntheses are much more time-intensive than traditional literature reviews and require a multi-person research team. See this PredicTER tool to get a sense of a systematic review timeline (one type of evidence synthesis). Before embarking on an evidence synthesis, it's important to clearly identify your reasons for conducting one. For a list of types of evidence synthesis projects, see the next tab.

How Does a Traditional Literature Review Differ From an Evidence Synthesis?

How does a systematic review differ from a traditional literature review?

One commonly used form of evidence synthesis is a systematic review.  This table compares a traditional literature review with a systematic review.


Reporting Standards

There are some reporting standards for evidence syntheses. These can serve as guidelines for protocol and manuscript preparation and journals may require that these standards are followed for the review type that is being employed (e.g. systematic review, scoping review, etc). ​

  • PRISMA checklist: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) is an evidence-based minimum set of items for reporting in systematic reviews and meta-analyses.
  • PRISMA-P: An updated version of the original PRISMA standards, for protocol development.
  • PRISMA-ScR: Reporting guidelines for scoping reviews and evidence maps.
  • PRISMA-IPD: Extension of the original PRISMA standards for systematic reviews and meta-analyses of individual participant data.
  • EQUATOR Network: The EQUATOR (Enhancing the QUAlity and Transparency Of health Research) Network is an international initiative that seeks to improve the reliability and value of published health research literature by promoting transparent and accurate reporting and wider use of robust reporting guidelines. It provides a list of various standards for reporting in systematic reviews.


PRISMA Flow Diagram

The  PRISMA  flow diagram depicts the flow of information through the different phases of an evidence synthesis. It maps the search (number of records identified), screening (number of records included and excluded), and selection (reasons for exclusion).  Many evidence syntheses include a PRISMA flow diagram in the published manuscript.

See below for resources to help you generate your own PRISMA flow diagram.

  • PRISMA Flow Diagram Tool
  • PRISMA Flow Diagram Word Template
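As a rough, purely illustrative sketch of the bookkeeping such a diagram captures (all counts and category names below are invented), the flow of records can be tallied in a few lines of Python:

    # Hypothetical record counts for each PRISMA phase.
    records_identified = {"database searches": 1250, "other sources": 40}
    duplicates_removed = 310

    records_screened = sum(records_identified.values()) - duplicates_removed
    excluded_on_title_abstract = 820
    full_texts_assessed = records_screened - excluded_on_title_abstract

    # Reasons for excluding full-text articles, with counts.
    full_text_exclusions = {"wrong population": 60, "wrong study design": 45, "no relevant outcome": 25}
    studies_included = full_texts_assessed - sum(full_text_exclusions.values())

    print("Records identified:", sum(records_identified.values()))
    print("Records screened after de-duplication:", records_screened)
    print("Full texts assessed:", full_texts_assessed)
    print("Studies included:", studies_included)

Dedicated tools (such as those listed above) lay these numbers out in the standard diagram; the point here is only that each box is a simple count derived from the previous phase.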


Synthesising the data


Synthesis is a stage in the systematic review process where extracted data, that is the findings of individual studies, are combined and evaluated.   

The general purpose of extracting and synthesising data is to show the outcomes and effects of various studies, and to identify issues with methodology and quality. This means that your synthesis might reveal several elements, including:  

  • overall level of evidence  
  • the degree of consistency in the findings  
  • what the positive effects of a drug or treatment are, and what these effects are based on
  • how many studies found a relationship or association between two components, e.g. the impact of disability-assistance animals on the psychological health of workplaces

There are two commonly accepted methods of synthesis in systematic reviews:  

  • Qualitative data synthesis
  • Quantitative data synthesis (i.e. meta-analysis)

The way the data is extracted from your studies, then synthesised and presented, depends on the type of data being handled.  

In a qualitative systematic review, data can be presented in a number of different ways. A typical procedure in the health sciences is  thematic analysis .

Thematic synthesis has three stages:

  • the coding of text ‘line-by-line’
  • the development of ‘descriptive themes’
  • and the generation of ‘analytical themes’

If you have qualitative information, some of the more common tools used to summarise data include:  

  • textual descriptions, i.e. written words  
  • thematic or content analysis
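To make the three thematic-synthesis stages described above a little more concrete, the following sketch walks through them in Python with invented excerpts, keyword rules and theme names. It is only an illustration: real thematic synthesis depends on researcher interpretation, usually supported by qualitative analysis software, rather than keyword matching.

    # Stage 1: code the extracted findings 'line-by-line' (toy keyword rules).
    findings = [
        "Patients said they felt unprepared for follow-up appointments.",
        "Participants valued clear written information from nurses.",
        "Several patients described anxiety while waiting for test results.",
    ]
    keyword_codes = {
        "unprepared": "lack of preparation",
        "information": "information needs",
        "anxiety": "emotional distress",
        "waiting": "uncertainty",
    }
    coded_findings = []
    for line in findings:
        codes = [code for keyword, code in keyword_codes.items() if keyword in line.lower()]
        coded_findings.append((line, codes))

    # Stage 2: group related codes into descriptive themes.
    descriptive_themes = {
        "support and information": {"lack of preparation", "information needs"},
        "psychological burden": {"emotional distress", "uncertainty"},
    }

    # Stage 3: generate analytical themes that go beyond the individual studies.
    analytical_themes = {
        "empowerment depends on timely, tailored communication": [
            "support and information",
            "psychological burden",
        ],
    }

    for line, codes in coded_findings:
        print(codes, "<-", line)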

Example qualitative systematic review

A good example of how to conduct a thematic analysis in a systematic review is the following journal article on cancer patients. In it, the authors go through the process of:

  • identifying and coding information about the selected studies’ methodologies and findings on patient care
  • organising these codes into subheadings and descriptive categories
  • developing these categories into analytical themes

What Facilitates “Patient Empowerment” in Cancer Patients During Follow-Up: A Qualitative Systematic Review of the Literature

Quantitative data synthesis

In a quantitative systematic review, data is presented statistically. Typically, this is referred to as a  meta-analysis .

The usual method is to combine and evaluate data from multiple studies. This is normally done in order to draw conclusions about outcomes, effects, shortcomings of studies and/or applicability of findings.

Remember, the data you synthesise should relate to your research question and protocol (plan). In the case of quantitative analysis, the data extracted and synthesised will relate to whatever method was used to generate the research question (e.g. PICO method), and whatever quality appraisals were undertaken in the analysis stage.

If you have quantitative information, some of the more common tools used to summarise data include:  

  • grouping of similar data, i.e. presenting the results in tables  
  • charts, e.g. pie-charts  
  • graphical displays, i.e. forest plots

Example of a quantitative systematic review

A quantitative systematic review combines data from multiple quantitative studies and is usually referred to as a meta-analysis.

Effectiveness of Acupuncturing at the Sphenopalatine Ganglion Acupoint Alone for Treatment of Allergic Rhinitis: A Systematic Review and Meta-Analysis

About meta-analyses


A systematic review may sometimes include a meta-analysis, although a meta-analysis is not a requirement of a systematic review. A meta-analysis, by contrast, always includes a systematic review.

A meta-analysis is a statistical analysis that combines data from previous studies to calculate an overall result.

One way of accurately representing all the data is in the form of a  forest plot . A forest plot is a way of combining the results of multiple studies in order to show point estimates arising from different studies of the same condition or treatment.

It comprises a graphical representation and often also a table. The graphical display shows the mean value for each study, often with a confidence interval (the horizontal bars). Each mean is plotted relative to the vertical line of no difference.
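Before looking at a published example, here is a minimal sketch of the arithmetic and the plot, using invented effect estimates and standard errors, a simple fixed-effect (inverse-variance) pooling model, and matplotlib for a bare-bones forest plot. Real meta-analyses use dedicated statistical software and usually consider random-effects models as well.

    import matplotlib.pyplot as plt

    # Invented per-study effect estimates and standard errors.
    studies = ["Study A", "Study B", "Study C"]
    effects = [0.30, 0.10, 0.45]
    std_errors = [0.15, 0.20, 0.10]

    # Fixed-effect (inverse-variance) pooled estimate.
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5

    # Forest plot: one row per study plus the pooled estimate, each with a
    # 95% confidence interval, against the vertical line of no difference at 0.
    fig, ax = plt.subplots()
    rows = list(range(len(studies), 0, -1))
    for y, effect, se in zip(rows, effects, std_errors):
        ax.errorbar(effect, y, xerr=1.96 * se, fmt="s", color="black", capsize=3)
    ax.errorbar(pooled, 0, xerr=1.96 * pooled_se, fmt="D", color="blue", capsize=3)
    ax.axvline(0, linestyle="--", color="grey")
    ax.set_yticks(rows + [0])
    ax.set_yticklabels(studies + ["Pooled (fixed effect)"])
    ax.set_xlabel("Effect size")
    plt.show()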

The following is an example of the graphical representation of a forest plot.

[Figure: example forest plot, “The effect of zinc acetate lozenges on the duration of the common cold” by Harri Hemilä, licensed under CC BY 3.0]

Watch the following short video where a social health example is used to explain how to construct a forest plot graphic.

Forest Plots: Understanding a Meta-Analysis in 5 Minutes or Less (5:38 min), by The NCCMT (YouTube)




Systematic Reviews & Evidence Synthesis Methods


About This Guide

This research guide provides an overview of the evidence synthesis process, guidance documents for conducting evidence synthesis projects, and links to resources to help you conduct a comprehensive and systematic search of the scholarly literature. Navigate the guide using the tabs on the left.

"Evidence synthesis" refers to rigorous, well-documented methods of identifying, selecting, and combining results from multiple studies. These projects are conducted by teams and follow specific methodologies to minimize bias and maximize reproducibility. A systematic review is a type of evidence synthesis. We use the term evidence synthesis to better reflect the breadth of methodologies that we support, including systematic reviews, scoping reviews , evidence gap maps, umbrella reviews, meta-analyses and others.

Note: Librarians at UC Irvine Libraries have supported systematic reviews and related methodologies in STEM fields for several years. As our service has evolved, we have added capacity to support these reviews in the Social Sciences as well.

Systematic Review OR Literature Review Conducted Systematically?

There are many types of literature reviews. Before beginning a systematic review, consider whether it is the best type of review for your question, goals, and resources. The table below compares systematic reviews, scoping reviews, and systematized reviews (narrative literature reviews employing some, but not all elements of a systematic review) to help you decide which is best for you. See the Types of Evidence Synthesis page for a more in-depth overview of types of reviews.


Literature Reviews: Synthesis


Synthesise Information

So, how can you create paragraphs within your literature review that demonstrate your knowledge of the scholarship that has been done in your field of study?

You will need to present a synthesis of the texts you read.  


What is synthesis? 

Synthesis is an important element of academic writing, demonstrating comprehension, analysis, evaluation and original creation.  

With synthesis you extract content from different sources to create an original text. While paraphrase and summary maintain the structure of the given source(s), with synthesis you create a new structure.  

The sources will provide different perspectives and evidence on a topic. They will be put together when agreeing, contrasted when disagreeing. The sources must be referenced.  

Perfect your synthesis by showing the flow of your reasoning, expressing critical evaluation of the sources and drawing conclusions.  

When you synthesise think of "using strategic thinking to resolve a problem requiring the integration of diverse pieces of information around a structuring theme" (Mateos and Sole 2009, p448). 

Synthesis is a complex activity, which requires a high degree of comprehension and active engagement with the subject. As you progress in higher education, expectations of your ability to synthesise increase accordingly.

How to synthesise in a literature review: 

  • Identify themes/issues you'd like to discuss in the literature review. Think of an outline.
  • Read the literature and identify these themes/issues.
  • Critically analyse the texts, asking: how does the text I'm reading relate to the other texts I've read on the same topic? Is it in agreement? Does it differ in its perspective? Is it stronger or weaker? How does it differ (in scope, methods, year of publication, etc.)? Draw your conclusions on the state of the literature on the topic.
  • Start writing your literature review, structuring it according to the outline you planned.
  • Put together sources stating the same point; contrast sources presenting counter-arguments or different points.
  • Present your critical analysis.
  • Always provide the references.

The best synthesis requires a "recursive process" whereby you read the source texts, identify relevant parts, take notes, produce drafts, re-read the source texts, revise your text, re-write... (Mateos and Sole, 2009). 

What is good synthesis?  

The quality of your synthesis can be assessed considering the following (Mateos and Sole, 2009, p439):  

  • Integration and connection of the information from the source texts around a structuring theme.
  • Selection of ideas necessary for producing the synthesis.
  • Appropriateness of the interpretation.
  • Elaboration of the content.

Example of Synthesis

Original texts (fictitious): 

  

Synthesis: 

Animal experimentation is a subject of heated debate. Some argue that painful experiments should be banned. Indeed it has been demonstrated that such experiments make animals suffer physically and psychologically (Chowdhury 2012; Panatta and Hudson 2016). On the other hand, it has been argued that animal experimentation can save human lives and reduce harm to humans (Smith 2008). This argument is only valid for toxicological testing, not for tests that, for example, merely improve the efficacy of a cosmetic (Turner 2015). It can be suggested that animal experimentation should be regulated to only allow toxicological risk assessment, and that the suffering of the animals should be minimised.

Bibliography

Mateos, M. and Sole, I. (2009). Synthesising Information from various texts: A Study of Procedures and Products at Different Educational Levels. European Journal of Psychology of Education,  24 (4), 435-451. Available from https://doi.org/10.1007/BF03178760 [Accessed 29 June 2021].


Analysis and Synthesis


Patricia A. Dwyer


Data analysis is a challenging stage of the integrative review process as it requires the reviewer to synthesize data from diverse methodological sources. Although established approaches to data analysis and synthesis of integrative review findings continue to evolve, adherence to systematic methods during this stage is essential to mitigating potential bias. The use of rigorous and transparent data analysis methods facilitates an evidence synthesis that can be confidently incorporated into practice. This chapter discusses strategies for data analysis including creating a data matrix and presents inductive analysis approaches to support the integration and interpretation of data from a body of literature. This chapter also discusses the presentation of results and includes examples of narrative and thematic syntheses from recently published integrative reviews.
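As a rough illustration of what such a data matrix can look like (the column headings and study entries below are invented for the example, not taken from the chapter), a small matrix can be assembled with pandas:

    import pandas as pd

    # One row per included study, one column per extracted element (invented example data).
    review_matrix = pd.DataFrame(
        [
            {"study": "Author A (2016)", "design": "qualitative interviews", "sample_size": 24,
             "key_finding": "families value early, plain-language communication"},
            {"study": "Author B (2018)", "design": "cross-sectional survey", "sample_size": 312,
             "key_finding": "participation is linked to perceived staff support"},
        ]
    )

    # Sorting or grouping the matrix helps when comparing findings across study designs.
    print(review_matrix.sort_values("design").to_string(index=False))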



About this chapter

Dwyer, P.A. (2020). Analysis and Synthesis. In: Toronto, C., Remington, R. (eds) A Step-by-Step Guide to Conducting an Integrative Review. Springer, Cham. https://doi.org/10.1007/978-3-030-37504-1_5



Methods for the synthesis of qualitative research: a critical review

Elaine Barnett-Page

1 Evidence for Policy and Practice Information and Co-ordinating (EPPI-) Centre, Social Science Research Unit, 18 Woburn Square, London WC1H 0NS, UK

James Thomas


In recent years, a growing number of methods for synthesising qualitative research have emerged, particularly in relation to health-related research. There is a need for both researchers and commissioners to be able to distinguish between these methods and to select which method is the most appropriate to their situation.

A number of methodological and conceptual links between these methods were identified and explored, while contrasting epistemological positions explained differences in approaches to issues such as quality assessment and extent of iteration. Methods broadly fall into 'realist' or 'idealist' epistemologies, which partly accounts for these differences.

Methods for qualitative synthesis vary across a range of dimensions. Commissioners of qualitative syntheses might wish to consider the kind of product they want and select their method – or type of method – accordingly.

The range of different methods for synthesising qualitative research has been growing over recent years [ 1 , 2 ], alongside an increasing interest in qualitative synthesis to inform health-related policy and practice [ 3 ]. While the terms 'meta-analysis' (a statistical method to combine the results of primary studies), or sometimes 'narrative synthesis', are frequently used to describe how quantitative research is synthesised, far more terms are used to describe the synthesis of qualitative research. This profusion of terms can mask some of the basic similarities in approach that the different methods share, and also lead to some confusion regarding which method is most appropriate in a given situation. This paper does not argue that the various nomenclatures are unnecessary, but rather seeks to draw together and review the full range of methods of synthesis available to assist future reviewers in selecting a method that is fit for their purpose. It also represents an attempt to guide the reader through some of the varied terminology to spring up around qualitative synthesis. Other helpful reviews of synthesis methods have been undertaken in recent years with slightly different foci to this paper. Two recent studies have focused on describing and critiquing methods for the integration of qualitative research with quantitative [ 4 , 5 ] rather than exclusively examining the detail and rationale of methods for the synthesis of qualitative research. Two other significant pieces of work give practical advice for conducting the synthesis of qualitative research, but do not discuss the full range of methods available [ 6 , 7 ]. We begin our Discussion by outlining each method of synthesis in turn, before comparing and contrasting characteristics of these different methods across a range of dimensions. Readers who are more familiar with the synthesis methods described here may prefer to turn straight to the 'dimensions of difference' analysis in the second part of the Discussion.

Overview of synthesis methods

Meta-ethnography.

In their seminal work of 1988, Noblit and Hare proposed meta-ethnography as an alternative to meta-analysis [ 8 ]. They cited Strike and Posner's [ 9 ] definition of synthesis as an activity in which separate parts are brought together to form a 'whole'; this construction of the whole is essentially characterised by some degree of innovation, so that the result is greater than the sum of its parts. They also borrowed from Turner's theory of social explanation [ 10 ], a key tenet of which was building 'comparative understanding' [[ 8 ], p22] rather than aggregating data.

To Noblit and Hare, synthesis provided an answer to the question of 'how to "put together" written interpretive accounts' [[ 8 ], p7], where mere integration would not be appropriate. Noblit and Hare's early work synthesised research from the field of education.

Three different methods of synthesis are used in meta-ethnography. One involves the 'translation' of concepts from individual studies into one another, thereby evolving overarching concepts or metaphors. Noblit and Hare called this process reciprocal translational analysis (RTA). Refutational synthesis involves exploring and explaining contradictions between individual studies. Lines-of-argument (LOA) synthesis involves building up a picture of the whole (i.e. culture, organisation etc) from studies of its parts. The authors conceptualised this latter approach as a type of grounded theorising.

Britten et al [ 11 ] and Campbell et al [ 12 ] have both conducted evaluations of meta-ethnography and claim to have succeeded, by using this method, in producing theories with greater explanatory power than could be achieved in a narrative literature review. While both these evaluations used small numbers of studies, more recently Pound et al [ 13 ] conducted both an RTA and an LOA synthesis using a much larger number of studies (37) on resisting medicines. These studies demonstrate that meta-ethnography has evolved since Noblit and Hare first introduced it. Campbell et al claim to have applied the method successfully to non-ethnographical studies. Based on their reading of Schutz [ 14 ], Britten et al have developed both second and third order constructs in their synthesis (Noblit and Hare briefly allude to the possibility of a 'second level of synthesis' [[ 8 ], p28] but do not demonstrate or further develop the idea).

In a more recent development, Sandelowski & Barroso [ 15 ] write of adapting RTA by using it to ' integrate findings interpretively, as opposed to comparing them interpretively' (p204). The former would involve looking to see whether the same concept, theory etc exists in different studies; the latter would involve the construction of a bigger picture or theory (i.e. LOA synthesis). They also talk about comparing or integrating imported concepts (e.g. from other disciplines) as well as those evolved 'in vivo'.

Grounded theory

Kearney [ 16 ], Eaves [ 17 ] and Finfgeld [ 18 ] have all adapted grounded theory to formulate a method of synthesis. Key methods and assumptions of grounded theory, as originally formulated and subsequently refined by Glaser and Strauss [ 19 ] and Strauss and Corbin [ 20 , 21 ], include: simultaneous phases of data collection and analysis; an inductive approach to analysis, allowing the theory to emerge from the data; the use of the constant comparison method; the use of theoretical sampling to reach theoretical saturation; and the generation of new theory. Eaves cited grounded theorists Charmaz [ 22 ] and Chesler [ 23 ], as well as Strauss and Corbin [ 20 ], as informing her approach to synthesis.

Glaser and Strauss [ 19 ] foresaw a time when a substantive body of grounded research should be pushed towards a higher, more abstract level. As a piece of methodological work, Eaves undertook her own synthesis of the synthesis methods used by these authors to produce her own clear and explicit guide to synthesis in grounded formal theory. Kearney stated that 'grounded formal theory', as she termed this method of synthesis, 'is suited to study of phenomena involving processes of contextualized understanding and action' [[ 24 ], p180] and, as such, is particularly applicable to nurses' research interests.

As Kearney suggested, the examples examined here were largely dominated by research in nursing. Eaves synthesised studies on care-giving in rural African-American families for elderly stroke survivors; Finfgeld on courage among individuals with long-term health problems; Kearney on women's experiences of domestic violence.

Kearney explicitly chose 'grounded formal theory' because it matches 'like' with 'like': that is, it applies the same methods that have been used to generate the original grounded theories included in the synthesis – produced by constant comparison and theoretical sampling – to generate a higher-level grounded theory. The wish to match 'like' with 'like' is also implicit in Eaves' paper. This distinguishes grounded formal theory from more recent applications of meta-ethnography, which have sought to include qualitative research using diverse methodological approaches [ 12 ].

Thematic Synthesis

Thomas and Harden [ 25 ] have developed an approach to synthesis which they term 'thematic synthesis'. This combines and adapts approaches from both meta-ethnography and grounded theory. The method was developed out of a need to conduct reviews that addressed questions relating to intervention need, appropriateness and acceptability – as well as those relating to effectiveness – without compromising on key principles developed in systematic reviews. They applied thematic synthesis in a review of the barriers to, and facilitators of, healthy eating amongst children.

Free codes of findings are organised into 'descriptive' themes, which are then further interpreted to yield 'analytical' themes. This approach shares characteristics with later adaptations of meta-ethnography, in that the analytical themes are comparable to 'third order interpretations' and that the development of descriptive and analytical themes using coding invoke reciprocal 'translation'. It also shares much with grounded theory, in that the approach is inductive and themes are developed using a 'constant comparison' method. A novel aspect of their approach is the use of computer software to code the results of included studies line-by-line, thus borrowing another technique from methods usually used to analyse primary research.

Textual Narrative Synthesis

Textual narrative synthesis is an approach which arranges studies into more homogenous groups. Lucas et al [ 26 ] comment that it has proved useful in synthesising evidence of different types (qualitative, quantitative, economic etc). Typically, study characteristics, context, quality and findings are reported on according to a standard format and similarities and differences are compared across studies. Structured summaries may also be developed, elaborating on and putting into context the extracted data [ 27 ].

Lucas et al [ 26 ] compared thematic synthesis with textual narrative synthesis. They found that 'thematic synthesis holds most potential for hypothesis generation' whereas textual narrative synthesis is more likely to make transparent heterogeneity between studies (as does meta-ethnography, with refutational synthesis) and issues of quality appraisal. This is possibly because textual narrative synthesis makes clearer the context and characteristics of each study, while the thematic approach organises data according to themes. However, Lucas et al found that textual narrative synthesis is 'less good at identifying commonality' (p2); the authors do not make explicit why this should be, although it may be that organising according to themes, as the thematic approach does, is comparatively more successful in revealing commonality.
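As a sketch of the kind of 'standard format' such summaries follow (the field names here are assumptions chosen for illustration, not a prescribed template), each included study could be captured in a simple record:

    from dataclasses import dataclass

    @dataclass
    class StudySummary:
        """Structured summary of one included study, reported under consistent headings."""
        citation: str
        study_type: str          # e.g. qualitative, quantitative, economic
        context: str
        quality_appraisal: str
        key_findings: str

    example = StudySummary(
        citation="Author C (2014)",
        study_type="qualitative",
        context="primary care, UK",
        quality_appraisal="moderate (assessed with a standard checklist)",
        key_findings="patients value continuity of care",
    )
    print(example)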

Meta-study

Paterson et al [ 28 ] have evolved a multi-faceted approach to synthesis, which they call 'meta-study'. The sociologist Zhao [ 29 ], drawing on Ritzer's work [ 30 ], outlined three components of analysis, which they proposed should be undertaken prior to synthesis. These are meta-data-analysis (the analysis of findings), meta-method (the analysis of methods) and meta-theory (the analysis of theory). Collectively, these three elements of analysis, culminating in synthesis, make up the practice of 'meta-study'. Paterson et al pointed out that the different components of analysis may be conducted concurrently.

Paterson et al argued that primary research is a construction; secondary research is therefore a construction of a construction. There is need for an approach that recognises this, and that also recognises research to be a product of its social, historical and ideological context. Such an approach would be useful in accounting for differences in research findings. For Paterson et al, there is no such thing as 'absolute truth'.

Meta-study was developed to study the experiences of adults living with a chronic illness. Meta-data-analysis was conceived of by Paterson et al in similar terms to Noblit and Hare's meta-ethnography (see above), in that it is essentially interpretive and seeks to reveal similarities and discrepancies among accounts of a particular phenomenon. Meta-method involves the examination of the methodologies of the individual studies under review. Part of the process of meta-method is to consider different aspects of methodology such as sampling, data collection, research design etc, similar to procedures others have called 'critical appraisal' (CASP [ 31 ]). However, Paterson et al take their critique to a deeper level by establishing the underlying assumptions of the methodologies used and the relationship between research outcomes and methods used. Meta-theory involves scrutiny of the philosophical and theoretical assumptions of the included research papers; this includes looking at the wider context in which new theory is generated. Paterson et al described meta-synthesis as a process which creates a new interpretation which accounts for the results of all three elements of analysis. The process of synthesis is iterative and reflexive and the authors were unwilling to oversimplify the process by 'codifying' procedures for bringing all three components of analysis together.

Meta-narrative

Greenhalgh et al [ 32 ]'s meta-narrative approach to synthesis arose out of the need to synthesise evidence to inform complex policy-making questions and was assisted by the formation of a multi-disciplinary team. Their approach to review was informed by Thomas Kuhn's The Structure of Scientific Revolutions [ 33 ], in which he proposed that knowledge is produced within particular paradigms which have their own assumptions about theory, about what is a legitimate object of study, about what are legitimate research questions and about what constitutes a finding. Paradigms also tend to develop through time according to a particular set of stages, central to which is the stage of 'normal science', in which the particular standards of the paradigm are largely unchallenged and seen to be self-evident. As Greenhalgh et al pointed out, Kuhn saw paradigms as largely incommensurable: 'that is, an empirical discovery made using one set of concepts, theories, methods and instruments cannot be satisfactorily explained through a different paradigmatic lens' [[ 32 ], p419].

Greenhalgh et al synthesised research from a wide range of disciplines; their research question related to the diffusion of innovations in health service delivery and organisation. They thus identified a need to synthesise findings from research which contains many different theories arising from many different disciplines and study designs.

Based on Kuhn's work, Greenhalgh et al proposed that, across different paradigms, there were multiple – and potentially mutually contradictory – ways of understanding the concept at the heart of their review, namely the diffusion of innovation. Bearing this in mind, the reviewers deliberately chose to select key papers from a number of different research 'paradigms' or 'traditions', both within and beyond healthcare, guided by their multidisciplinary research team. They took as their unit of analysis the 'unfolding "storyline" of a research tradition over time' [[ 32 ], p417] and sought to understand diffusion of innovation as it was conceptualised in each of these traditions. Key features of each tradition were mapped: historical roots, scope, theoretical basis; research questions asked and methods/instruments used; main empirical findings; historical development of the body of knowledge (how have earlier findings led to later findings); and strengths and limitations of the tradition. The results of this exercise led to maps of 13 'meta-narratives' in total, from which seven key dimensions, or themes, were identified and distilled for the synthesis phase of the review.
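
Purely as an illustration of what such a mapping might look like in practice (this is our sketch, not a reproduction of Greenhalgh et al's own materials), the key features of a single research tradition could be recorded in a simple structure like the one below; the example entry is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Tradition:
    """One research 'tradition', recorded against the kinds of features mapped above."""
    name: str
    historical_roots: str
    scope: str
    theoretical_basis: str
    research_questions: List[str] = field(default_factory=list)
    methods_instruments: List[str] = field(default_factory=list)
    main_findings: List[str] = field(default_factory=list)
    development_of_knowledge: str = ""
    strengths_limitations: str = ""

# Hypothetical entry, purely for illustration.
rural_sociology = Tradition(
    name="Diffusion research in rural sociology",
    historical_roots="Post-war studies of farmers adopting new agricultural practices",
    scope="Adoption of innovations by individuals within communities",
    theoretical_basis="Diffusion of innovations theory",
    research_questions=["What predicts the rate of adoption of an innovation?"],
    methods_instruments=["Cross-sectional surveys"],
    main_findings=["Adoption tends to follow an S-shaped curve"],
)

print(f"{rural_sociology.name}: {len(rural_sociology.main_findings)} finding(s) recorded")
```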

Critical Interpretive Synthesis

Dixon-Woods et al [ 34 ] developed their own approach to synthesising multi-disciplinary and multi-method evidence, termed 'critical interpretive synthesis', while researching access to healthcare by vulnerable groups. Critical interpretive synthesis is an adaptation of meta-ethnography, as well as borrowing techniques from grounded theory. The authors stated that they needed to adapt traditional meta-ethnographic methods for synthesis, since these had never been applied to quantitative as well as qualitative data, nor had they been applied to a substantial body of data (in this case, 119 papers).

Dixon-Woods et al presented critical interpretive synthesis as an approach to the whole process of review, rather than to just the synthesis component. It involves an iterative approach to refining the research question and searching and selecting from the literature (using theoretical sampling) and defining and applying codes and categories. It also has a particular approach to appraising quality, using relevance – i.e. likely contribution to theory development – rather than methodological characteristics as a means of determining the 'quality' of individual papers [ 35 ]. The authors also stress, as a defining characteristic, critical interpretive synthesis's critical approach to the literature in terms of deconstructing research traditions or theoretical assumptions as a means of contextualising findings.

Dixon-Woods et al rejected reciprocal translational analysis (RTA) as this produced 'only a summary in terms that have already been used in the literature' [[ 34 ], p5], which was seen as less helpful when dealing with a large and diverse body of literature. Instead, Dixon-Woods et al adopted a lines-of-argument (LOA) synthesis, in which – rejecting the difference between first, second and third order constructs – they instead developed 'synthetic constructs' which were then linked with constructs arising directly from the literature.

The influence of grounded theory can be seen in particular in critical interpretive synthesis's inductive approach to formulating the review question and to developing categories and concepts, rejecting a 'stage' approach to systematic reviewing, and in selecting papers using theoretical sampling. Dixon-Woods et al also claim that critical interpretive synthesis is distinct in its 'explicit orientation towards theory generation' [[ 34 ], p9].

Ecological Triangulation

Jim Banning is the author of 'ecological triangulation' or 'ecological sentence synthesis', applying this method to the evidence for what works for youth with disabilities. He borrows from Webb et al [ 36 ] and Denzin [ 37 ] the concept of triangulation, in which phenomena are studied from a variety of vantage points. His rationale is that building an 'evidence base' of effectiveness requires the synthesis of cumulative, multi-faceted evidence in order to find out 'what intervention works for what kind of outcomes for what kind of persons under what kind of conditions' [[ 38 ], p1].

Ecological triangulation unpicks the mutually interdependent relationships between behaviour, persons and environments. The method requires that, for data extraction and synthesis, 'ecological sentences' are formulated following the pattern: 'With this intervention, these outcomes occur with these population foci and within these grades (ages), with these genders ... and these ethnicities in these settings' [[ 39 ], p1].
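
As a rough sketch of how such a sentence might be assembled from coded study attributes (this is our illustration, not Banning's own tooling, and every field value below is invented):

```python
from typing import List

def ecological_sentence(intervention: str, outcomes: List[str], population: str,
                        ages: str, genders: str, ethnicities: str, settings: str) -> str:
    """Fill the 'ecological sentence' pattern quoted above from coded study attributes."""
    return (f"With {intervention}, {', '.join(outcomes)} occur with {population} "
            f"and within {ages}, with {genders} and {ethnicities} in {settings}.")

# Hypothetical coded study, for illustration only.
print(ecological_sentence(
    intervention="a peer-mentoring intervention",
    outcomes=["improved school attendance"],
    population="students with learning disabilities",
    ages="grades 9-12 (ages 14-18)",
    genders="male and female participants",
    ethnicities="a mixed-ethnicity sample",
    settings="mainstream school settings",
))
```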

Framework Synthesis

Brunton et al [ 40 ] and Oliver et al [ 41 ] have applied a 'framework synthesis' approach in their reviews. Framework synthesis is based on framework analysis, which was outlined by Pope, Ziebland and Mays [ 42 ], and draws upon the work of Ritchie and Spencer [ 43 ] and Miles and Huberman [ 44 ]. Its rationale is that qualitative research produces large amounts of textual data in the form of transcripts, observational fieldnotes etc. The sheer wealth of information poses a challenge for rigorous analysis. Framework synthesis offers a highly structured approach to organising and analysing data (e.g. indexing using numerical codes, rearranging data into charts etc).

Brunton et al applied the approach to a review of children's, young people's and parents' views of walking and cycling; Oliver et al to an analysis of public involvement in health services research. Framework synthesis is distinct from the other methods outlined here in that it utilises an a priori 'framework' – informed by background material and team discussions – to extract and synthesise findings. As such, it is largely a deductive approach although, in addition to topics identified by the framework, new topics may be developed and incorporated as they emerge from the data. The synthetic product can be expressed in the form of a chart for each key dimension identified, which may be used to map the nature and range of the concept under study and find associations between themes and exceptions to these [ 40 ].
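
A minimal sketch of the charting idea is given below, assuming a made-up a priori framework and made-up extracted findings; it illustrates the general logic only and is not the procedure published by Brunton et al or Oliver et al.

```python
from collections import defaultdict

# Hypothetical a priori framework: numerical index codes mapped to themes.
framework = {1: "safety concerns", 2: "convenience", 3: "enjoyment"}

# Hypothetical extracted findings: (study, code, finding). A finding that does not fit
# the framework is given a new code, mirroring how emergent topics can be incorporated.
extracted = [
    ("Study A", 1, "Parents worried about traffic on the school route"),
    ("Study B", 2, "Cycling seen as quicker than waiting for the bus"),
    ("Study A", 4, "Children valued travelling with friends"),  # emergent topic
]
framework.setdefault(4, "social aspects")  # add the emergent theme to the framework

# Rearrange the indexed data into one 'chart' per theme.
charts = defaultdict(list)
for study, code, finding in extracted:
    charts[framework[code]].append((study, finding))

for theme, rows in charts.items():
    print(theme, "->", rows)
```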

'Fledgling' approaches

There are three other approaches to synthesis which have not yet been widely used. One is an approach using content analysis [ 45 , 46 ] in which text is condensed into fewer content-related categories. Another is 'meta-interpretation' [ 47 ], featuring the following: an ideographic rather than pre-determined approach to the development of exclusion criteria; a focus on meaning in context; interpretations as raw data for synthesis (although this feature doesn't distinguish it from other synthesis methods); an iterative approach to the theoretical sampling of studies for synthesis; and a transparent audit trail demonstrating the trustworthiness of the synthesis.

The third, proposed by Sandelowski and Barroso, is 'qualitative metasummary' [ 15 ]. It is mentioned here as a new and original approach to handling a collection of qualitative studies but is qualitatively different from the other methods described here since it is aggregative; that is, findings are accumulated and summarised rather than 'transformed'. Metasummary is a way of producing a 'map' of the contents of qualitative studies and – according to Sandelowski and Barroso – 'reflect[s] a quantitative logic' [[ 15 ], p151]. The frequency of each finding is determined and, the higher the frequency of a particular finding, the greater its validity. The authors even discuss the calculation of 'effect sizes' for qualitative findings. Qualitative metasummaries can be undertaken as an end in themselves or may serve as a basis for a further synthesis.
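
If, as we read it, the frequency 'effect size' of a finding is simply the proportion of reports in which that finding appears, it could be tallied along the following lines. This is a sketch under that assumption, with invented reports and findings, not Sandelowski and Barroso's published procedure.

```python
from collections import Counter

# Hypothetical abstracted findings per report (one set of findings per study report).
reports = {
    "Report 1": {"stigma shapes disclosure", "family support aids coping"},
    "Report 2": {"stigma shapes disclosure"},
    "Report 3": {"family support aids coping", "information needs unmet"},
}

counts = Counter(finding for findings in reports.values() for finding in findings)
total_reports = len(reports)

# Frequency effect size: proportion of reports in which each finding appears.
for finding, n in counts.most_common():
    print(f"{finding}: {n}/{total_reports} = {n / total_reports:.2f}")
```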

Dimensions of difference

Having outlined the range of methods identified, we now turn to an examination of how they compare with one another. It is clear that they have come from many different contexts and have different approaches to understanding knowledge, but what do these differences mean in practice? Our framework for this analysis is shown in Additional file 1: Dimensions of difference [ 48 ]. We have examined the epistemology of each of the methods and found that, to some extent, this explains the need for different methods and their various approaches to synthesis.

Epistemology

The first dimension that we will consider is that of the researchers' epistemological assumptions. Spencer et al [ 49 ] outline a range of epistemological positions, which might be organised into a spectrum as follows:

Subjective idealism: there is no shared reality independent of multiple alternative human constructions

Objective idealism: there is a world of collectively shared understandings

Critical realism: knowledge of reality is mediated by our perceptions and beliefs

Scientific realism: it is possible for knowledge to approximate closely an external reality

Naïve realism: reality exists independently of human constructions and can be known directly [ 49 , 45 , 46 ].

Thus, at one end of the spectrum we have a highly constructivist view of knowledge and, at the other, an unproblematized 'direct window onto the world' view.

Nearly all of the positions along this spectrum are represented in the range of methodological approaches to synthesis covered in this paper. The originators of meta-narrative synthesis, critical interpretive synthesis and meta-study all articulate what might be termed a 'subjective idealist' approach to knowledge. Paterson et al [ 28 ] state that meta-study shies away from creating 'grand theories' within the health or social sciences and assume that no single objective reality will be found. Primary studies, they argue, are themselves constructions; meta-synthesis, then, 'deals with constructions of constructions' (p7). Greenhalgh et al [ 32 ] also view knowledge as a product of its disciplinary paradigm and use this to explain conflicting findings: again, the authors neither seek, nor expect to find, one final, non-contestable answer to their research question. Critical interpretive synthesis is similar in seeking to place literature within its context, to question its assumptions and to produce a theoretical model of a phenomenon which – because highly interpretive – may not be reproducible by different research teams at alternative points in time [[ 34 ], p11].

Methods used to synthesise grounded theory studies in order to produce a higher level of grounded theory [ 24 ] appear to be informed by 'objective idealism', as does meta-ethnography. Kearney argues for the near-universal applicability of a 'ready-to-wear' theory across contexts and populations. This approach is clearly distinct from one which recognises multiple realities. The emphasis is on examining commonalities amongst, rather than discrepancies between, accounts. This emphasis is similarly apparent in most meta-ethnographies, which are conducted either according to Noblit and Hare's 'reciprocal translational analysis' technique or to their 'lines-of-argument' technique and which seek to provide a 'whole' which has a greater explanatory power. Although Noblit and Hare also propose 'refutational synthesis', in which contradictory findings might be explored, there are few examples of this having been undertaken in practice, and the aim of the method appears to be to explain and explore differences due to context, rather than multiple realities.

Despite an assumption of a reality which is perhaps less contestable than those of meta-narrative synthesis, critical interpretive synthesis and meta-study, both grounded formal theory and meta-ethnography place a great deal of emphasis on the interpretive nature of their methods. This still supposes a degree of constructivism. Although less explicit about how their methods are informed, it seems that both thematic synthesis and framework synthesis – while also involving some interpretation of data – share an even less problematized view of reality and a greater assumption that their synthetic products are reproducible and correspond to a shared reality. This is also implicit in the fact that such products are designed directly to inform policy and practice, a characteristic shared by ecological triangulation. Notably, ecological triangulation, according to Banning, can be either realist or idealist. Banning argues that the interpretation of triangulation can either be one in which multiple viewpoints converge on a point to produce confirming evidence (i.e. one definitive answer to the research question) or an idealist one, in which the complexity of multiple viewpoints is represented. Thus, although ecological triangulation views reality as complex, the approach assumes that it can be approximately knowable (at least when the realist view of ecological triangulation is adopted) and that interventions can and should be modelled according to the products of its syntheses.

While pigeonholing different methods into specific epistemological positions is a problematic process, we do suggest that the contrasting epistemologies of different researchers are one way of explaining why we have – and need – different methods for synthesis.

Iteration

Variation in the extent of iteration during the review process is another key dimension. All synthesis methods include some iteration, but the degree varies. Meta-ethnography, grounded theory and thematic synthesis all include iteration at the synthesis stage; both framework synthesis and critical interpretive synthesis involve iterative literature searching – in the case of critical interpretive synthesis, it is not clear whether iteration occurs during the rest of the review process. Meta-narrative also involves iteration at every stage. Banning does not mention iteration in outlining ecological triangulation, and neither do Lucas et al nor Thomas and Harden for textual narrative synthesis.

It seems that the more idealist the approach, the greater the extent of iteration. This might be because a large degree of iteration does not sit well with a more 'positivist' ideal of procedural objectivity; in particular, the notion that the robustness of the synthetic product depends in part on the reviewers stating up front in a protocol their searching strategies, inclusion/exclusion criteria etc, and being seen not to alter these at a later stage.

Quality assessment

Another dimension along which we can look at different synthesis methods is that of quality assessment. When the approaches to the assessment of the quality of studies retrieved for review are examined, there is again a wide methodological variation. It might be expected that the further towards the 'realism' end of the epistemological spectrum a method of synthesis falls, the greater the emphasis on quality assessment. In fact, this is only partially the case.

Framework synthesis, textual narrative synthesis and thematic synthesis – methods which might be classified as sharing a 'critical realist' approach – all have highly specified approaches to quality assessment. The review in which framework synthesis was developed applied ten quality criteria: two on the quality and reporting of sampling methods, four on the quality of the description of the sample in the study, two on the reliability and validity of the tools used to collect data and one on whether studies used appropriate methods for helping people to express their views. Studies which did not meet a certain number of quality criteria were excluded from contributing to findings. Similarly, in the example review for thematic synthesis, 12 criteria were applied: five related to reporting aims, context, rationale, methods and findings; four to reliability and validity; and three to the appropriateness of methods for ensuring that findings were rooted in participants' own perspectives. Studies which were deemed to have significant flaws were excluded, and sensitivity analyses were used to assess the possible impact of study quality on the review's findings. Thomas and Harden similarly applied quality criteria and developed criteria additional to those they found in the literature on quality assessment, relating to the extent to which people's views and perspectives had been privileged by researchers. It is worth noting not only that these methods apply quality criteria but that they are explicit about what those criteria are: assessing quality is a key component of the review process for these methods. Likewise, Banning – the originator of ecological triangulation – sees quality assessment as important and adapts the Design and Implementation Assessment Device (DIAD) Version 0.3 (a quality assessment tool for quantitative research) for use when appraising qualitative studies [ 50 ]. Again, Banning writes of excluding studies deemed to be of poor quality.
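
The exclusion rule described above amounts to a simple threshold over a checklist of criteria. The sketch below only illustrates that logic; the studies, criteria and threshold are invented, not taken from any of the reviews discussed.

```python
# Hypothetical quality checklist results for three studies (True = criterion met).
studies = {
    "Study A": {"sampling reported": True, "sample described": True,
                "tools reliable and valid": True, "participants' views privileged": True},
    "Study B": {"sampling reported": True, "sample described": False,
                "tools reliable and valid": False, "participants' views privileged": True},
    "Study C": {"sampling reported": False, "sample described": False,
                "tools reliable and valid": False, "participants' views privileged": False},
}

THRESHOLD = 3  # minimum number of criteria a study must meet (illustrative only)

for study, criteria in studies.items():
    met = sum(criteria.values())
    verdict = "include" if met >= THRESHOLD else "exclude"
    print(f"{study}: {met}/{len(criteria)} criteria met -> {verdict}")
```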

Greenhalgh et al's meta-narrative review [ 32 ] modified a range of existing quality assessment tools to evaluate studies according to validity and robustness of methods; sample size and power; and validity of conclusions. The authors imply, but are not explicit, that this process formed the basis for the exclusion of some studies. Although not quite so clear about quality assessment methods as framework and thematic synthesis, it might be argued that meta-narrative synthesis shows a greater commitment to the concept that research can and should be assessed for quality than either meta-ethnography or grounded formal theory. The originators of meta-ethnography, Noblit and Hare [ 8 ], originally discussed quality in terms of quality of metaphor, while more recent use of this method has used amended versions of CASP (the Critical Appraisal Skills Programme tool, [ 31 ]), yet has only referred to studies being excluded on the basis of lack of relevance or because they weren't 'qualitative' studies [ 8 ]. In grounded theory, quality assessment is only discussed in terms of a 'personal note' being made on the context, quality and usefulness of each study. However, contrary to expectation, meta-narrative synthesis lies at the extreme end of the idealism/realism spectrum – as a subjective idealist approach – while meta-ethnography and grounded theory are classified as objective idealist approaches.

Finally, meta-study and critical interpretive synthesis – two more subjective idealist approaches – look to the content and utility of findings rather than methodology in order to establish quality. While earlier forms of meta-study included only studies which demonstrated 'epistemological soundness', in its most recent form [ 51 ] this method has sought to include all relevant studies, excluding only those deemed not to be 'qualitative' research. Critical interpretive synthesis also conforms to what we might expect of its approach to quality assessment: quality of research is judged as the extent to which it informs theory. The threshold of inclusion is informed by expertise and instinct rather than being articulated a priori.

In terms of quality assessment, it might be important to consider the academic context in which these various methods of synthesis developed. The reason why thematic synthesis, framework synthesis and ecological triangulation have such highly specified approaches to quality assessment may be that each of these was developed for a particular task, i.e. to conduct a multi-method review in which randomised controlled trials (RCTs) were included. The concept of quality assessment in relation to RCTs is much less contested and there is general agreement on criteria against which quality should be judged.

Problematizing the literature

Critical interpretive synthesis, the meta-narrative approach and the meta-theory element of meta-study all share some common ground in that their review and synthesis processes include examining all aspects of the context in which knowledge is produced. In conducting a review on access to healthcare by vulnerable groups, critical interpretive synthesis sought to question 'the ways in which the literature had constructed the problematics of access, the nature of the assumptions on which it drew, and what has influenced its choice of proposed solutions' [[ 34 ], p6]. Although not claiming to have been directly influenced by Greenhalgh et al's meta-narrative approach, Dixon-Woods et al do cite it as sharing similar characteristics in the sense that it critiques the literature it reviews.

Meta-study uses meta-theory to describe and deconstruct the theories that shape a body of research and to assess its quality. One aspect of this process is to examine the historical evolution of each theory and to put it in its socio-political context, which invites direct comparison with meta-narrative synthesis. Greenhalgh et al put a similar emphasis on placing research findings within their social and historical context, often as a means of seeking to explain heterogeneity of findings. In addition, meta-narrative shares with critical interpretive synthesis an iterative approach to searching and selecting from the literature.

Framework synthesis, thematic synthesis, textual narrative synthesis, meta-ethnography and grounded theory do not share the same approach to problematizing the literature as critical interpretive synthesis, meta-study and meta-narrative. In part, this may be explained by the extent to which studies included in the synthesis represented a broad range of approaches or methodologies. This, in turn, may reflect the broadness of the review question and the extent to which the concepts contained within the question are pre-defined within the literature. In the case of both the critical interpretive synthesis and meta-narrative reviews, terminology was elastic and/or the question was formed iteratively. Similarly, both reviews placed great emphasis on employing multi-disciplinary research teams. Approaches which do not critique the literature in the same way tend to have more narrowly-focused questions. They also tend to include a more limited range of studies: grounded theory synthesis includes grounded theory studies, and meta-ethnography (in its original form, as applied by Noblit and Hare) ethnographies. The thematic synthesis incorporated studies based on only a narrow range of qualitative methodologies (interviews and focus groups) which were informed by a similarly narrow range of epistemological assumptions. It may be that the authors of such syntheses saw no need to include such a critique in their review process.

Similarities and differences between primary studies

Most methods of synthesis are applicable to heterogeneous data (i.e. studies which use contrasting methodologies), apart from early meta-ethnography and synthesis informed by grounded theory. All methods of synthesis state that, at some level, studies are compared, although they vary in how explicit they are about how this is done. Meta-ethnography is one of the most explicit: it describes the act of 'translation', whereby terms and concepts which have resonance with one another are subsumed into 'higher order constructs'. Grounded theory, as represented by Eaves [ 17 ], is undertaken according to a long list of steps and sub-steps, and includes the production of generalizations about concepts/categories, which come from classifying these categories. In meta-narrative synthesis, comparable studies are grouped together at the appraisal phase of the review.

Perhaps more interesting are the ways in which differences between studies are explored. Those methods with a greater emphasis on critical appraisal may tend (although this is not always made explicit) to use differences in method to explain differences in findings. Meta-ethnography proposes 'refutational synthesis' to explain differences, although there are few examples of this in the literature. Some synthesis methods – for example, thematic synthesis – look at other characteristics of the studies under review, such as whether the types of participants and their contexts vary, and whether this can explain differences in perspective.

All of these methods, then, look within the studies to explain differences. Other methods look beyond the study itself to the context in which it was produced. Critical interpretive synthesis and meta-study look at differences in theory or in socio-economic context. Critical interpretive synthesis, like meta-narrative, also explores epistemological orientation. Meta-narrative is unique in concerning itself with disciplinary paradigm (i.e. the story of the discipline as it progresses). It is also distinctive in that it treats conflicting findings as 'higher order data' [[ 32 ], p420], so that the main emphasis of the synthesis appears to be on examining and explaining contradictions in the literature.

Going 'beyond' the primary studies

Synthesis is sometimes defined as a process resulting in a product, a 'whole', which is more than the sum of its parts. However, the methods reviewed here vary in the extent to which they attempt to 'go beyond' the primary studies and transform the data. Some methods – textual narrative synthesis, ecological triangulation and framework synthesis – focus on describing and summarising their primary data (often in a highly structured and detailed way) and translating the studies into one another. Others – meta-ethnography, grounded theory, thematic synthesis, meta-study, meta-narrative and critical interpretive synthesis – seek to push beyond the original data to a fresh interpretation of the phenomena under review. A key feature of thematic synthesis is its clear differentiation between these two stages.

Different methods have different mechanisms for going beyond the primary studies, although some are more explicit than others about what these entail. Meta-ethnography proposes a 'lines-of-argument' (LOA) synthesis in which an interpretation is constructed to both link and explain a set of parts. Critical interpretive synthesis based its synthesis methods on those of meta-ethnography, developing an LOA using what the authors term 'synthetic constructs' (akin to 'third order constructs' in meta-ethnography) to create a 'synthesising argument'. Dixon-Woods et al claim that this is an advance on Britten et al's methods, in that they reject the difference between first, second and third order constructs.

Meta-narrative, as outlined above, focuses on conflicting findings and constructs theories to explain these in terms of differing paradigms. Meta-study derives questions from each of its three components, to which it subjects the dataset, and inductively generates a number of theoretical claims in relation to it. According to Eaves' model of grounded theory [ 17 ], mini-theories are integrated to produce an explanatory framework. In ecological triangulation, the 'axial' codes – or second-level codes evolved from the initial deductive open codes – are used to produce Banning's 'ecological sentence' [ 39 ].

The synthetic product

In overviewing and comparing different qualitative synthesis methods, the ultimate question relates to the utility of the synthetic product: what is it for? It is clear that some methods of synthesis – namely, thematic synthesis, textual narrative synthesis, framework synthesis and ecological triangulation – view themselves as producing an output that is directly applicable to policy makers and designers of interventions. The example of framework synthesis examined here (on children's, young people's and parents' views of walking and cycling) involved policy makers and practitioners in directing the focus of the synthesis and used the themes derived from the synthesis to infer what kind of interventions might be most effective in encouraging walking and cycling. Likewise, the products of the thematic synthesis took the form of practical recommendations for interventions (e.g. 'do not promote fruit and vegetables in the same way in the same intervention'). The extent to which policy makers and practitioners are involved in informing either synthesis or recommendation is less clear from the documents published on ecological triangulation, but the aim certainly is to directly inform practice.

The outputs of synthesis methods which have a more constructivist orientation – meta-study, meta-narrative, meta-ethnography, grounded theory, critical interpretive synthesis – tend to look rather different. They are generally more complex and conceptual, sometimes operating on the symbolic or metaphorical level, and requiring a further process of interpretation by policy makers and practitioners in order for them to inform practice. This is not to say, however, that they are not useful for practice, more that they are doing different work. However, it may be that, in the absence of further interpretation, they are more useful for informing other researchers and theoreticians.

Looking across dimensions

After examining the dimensions of difference of our included methods, what picture ultimately emerges? It seems clear that, while similar in some respects, there are genuine differences in approach to the synthesis of what is essentially textual data. To some extent, these differences can be explained by the epistemological assumptions that underpin each method. Our methods split into two broad camps: the idealist and the realist (see Table 1 for a summary). Idealist approaches generally tend to have a more iterative approach to searching (and to the review process), have less well-specified a priori quality assessment procedures and are more inclined to problematize the literature. Realist approaches are characterised by a more linear approach to searching and review, have clearer and more well-developed approaches to quality assessment, and do not problematize the literature.

Table 1: Summary table (not reproduced here)

N.B.: In terms of the above dimensions, it is generally a question of degree rather than of absolute distinctions.

Mapping the relationships between methods

What is interesting is the relationship between these methods of synthesis, the conceptual links between them, and the extent to which the originators cite – or, in some cases, don't cite – one another. Some methods directly build on others – framework synthesis builds on framework analysis, for example, while grounded formal theory and constant comparative analysis build on grounded theory. Others further develop existing methods – meta-study, critical interpretive synthesis and meta-narrative all adapt aspects of meta-ethnography, while also importing concepts from other theorists (critical interpretive synthesis also adapts grounded theory techniques).

Some methods share a clear conceptual link without directly citing one another: for example, the analytical themes developed during thematic synthesis are comparable to the third order interpretations of meta-ethnography. The meta-theory aspect of meta-study is echoed in both meta-narrative synthesis and critical interpretive synthesis (see 'Problematizing the literature', above); however, the originators of critical interpretive synthesis only refer to the originators of meta-study in relation to their use of sampling techniques.

While methods for qualitative synthesis have many similarities, there are clear differences in approach between them, many of which can be explained by taking account of a given method's epistemology.

However, within the two broad idealist/realist categories, any differences between methods in terms of outputs appear to be small.

Since many systematic reviews are designed to inform policy and practice, it is important to select a method – or type of method – that will produce the kind of conclusions needed. However, it is acknowledged that this is not always simple or even possible to achieve in practice.

The approaches that result in more easily translatable messages for policy-makers and practitioners may appear more attractive than the others; but we do need to take account of lessons from the more idealist end of the spectrum: that some perspectives are not universal.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

Both authors made substantial contributions, with EBP taking a lead on writing and JT on the analytical framework. Both authors read and approved the final manuscript.

Pre-publication history

The pre-publication history for this paper can be accessed here:

http://www.biomedcentral.com/1471-2288/9/59/prepub

Supplementary Material

Dimensions of difference. Ranging from subjective idealism, through objective idealism, critical realism and scientific realism, to naïve realism.

Acknowledgements

The authors would like to acknowledge the helpful contributions of the following in commenting on earlier drafts of this paper: David Gough, Sandy Oliver, Angela Harden, Mary Dixon-Woods, Trisha Greenhalgh and Barbara L. Paterson. We would also like to thank the peer reviewers: Helen J Smith, Rosaline Barbour and Mark Rodgers for their helpful reviews. The methodological development was supported by the Department of Health (England) and the ESRC through the Methods for Research Synthesis Node of the National Centre for Research Methods (NCRM). An earlier draft of this paper currently appears as a working paper on the National Centre for Research Methods' website http://www.ncrm.ac.uk/ .

  • Dixon-Woods M, Agarwal S, Jones D, Young B, Sutton A. Synthesising qualitative and quantitative evidence: a review of possible methods. J Health Serv Res Pol. 2005;10(1):45–53. doi: 10.1258/1355819052801804.
  • Barbour RS, Barbour M. Evaluating and synthesizing qualitative research: the need to develop a distinctive approach. J Eval Clin Pract. 2003;9(2):179–186. doi: 10.1046/j.1365-2753.2003.00371.x.
  • Mays N, Pope C, Popay J. Systematically reviewing qualitative and quantitative evidence to inform management and policy-making in the health field. J Health Serv Res Pol. 2005;10(Suppl 1):6–20. doi: 10.1258/1355819054308576.
  • Dixon-Woods M, Bonas S, Booth A, Jones DR, Miller T, Shaw RL, Smith J, Sutton A, Young B. How can systematic reviews incorporate qualitative research? A critical perspective. Qual Res. 2006;6:27–44. doi: 10.1177/1468794106058867.
  • Pope C, Mays N, Popay J. Synthesizing Qualitative and Quantitative Health Evidence: a Guide to Methods. Maidenhead: Open University Press; 2007.
  • Thorne S, Jenson L, Kearney MH, Noblit G, Sandelowski M. Qualitative metasynthesis: reflections on methodological orientation and ideological agenda. Qual Health Res. 2004;14:1342–1365. doi: 10.1177/1049732304269888.
  • Centre for Reviews and Dissemination. Systematic Reviews: CRD's Guidance for Undertaking Reviews in Health Care. York: CRD; 2008.
  • Noblit GW, Hare RD. Meta-Ethnography: Synthesizing Qualitative Studies. London: Sage; 1988.
  • Strike K, Posner G. Types of synthesis and their criteria. In: Ward S, Reed L, editors. Knowledge Structure and Use. Philadelphia: Temple University Press; 1983.
  • Turner S. Sociological Explanation as Translation. New York: Cambridge University Press; 1980.
  • Britten N, Campbell R, Pope C, Donovan J, Morgan M, Pill R. Using meta-ethnography to synthesise qualitative research: a worked example. J Health Serv Res. 2002;7:209–15. doi: 10.1258/135581902320432732.
  • Campbell R, Pound P, Pope C, Britten N, Pill R, Morgan M, Donovan J. Evaluating meta-ethnography: a synthesis of qualitative research on lay experiences of diabetes and diabetes care. Soc Sci Med. 2003;65:671–84. doi: 10.1016/S0277-9536(02)00064-3.
  • Pound P, Britten N, Morgan M, Yardley L, Pope C, Daker-White G, Campbell R. Resisting medicines: a synthesis of qualitative studies of medicine taking. Soc Sci Med. 2005;61:133–155. doi: 10.1016/j.socscimed.2004.11.063.
  • Schutz A. Collected Papers. Vol. 1. The Hague: Martinus Nijhoff; 1962.
  • Sandelowski M, Barroso J. Handbook for Synthesizing Qualitative Research. New York: Springer Publishing Company; 2007.
  • Kearney MH. Enduring love: a grounded formal theory of women's experience of domestic violence. Res Nurs Health. 2001;24:270–82. doi: 10.1002/nur.1029.
  • Eaves YD. A synthesis technique for grounded theory data analysis. J Adv Nurs. 2001;35:654–63. doi: 10.1046/j.1365-2648.2001.01897.x.
  • Finfgeld D. Courage as a process of pushing beyond the struggle. Qual Health Res. 1999;9:803–814. doi: 10.1177/104973299129122298.
  • Glaser BG, Strauss AL. The Discovery of Grounded Theory: Strategies for Qualitative Research. New York: Aldine De Gruyter; 1967.
  • Strauss AL, Corbin J. Basics of Qualitative Research: Grounded Theory Procedures and Techniques. Newbury Park, CA: Sage; 1990.
  • Strauss AL, Corbin J. Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. Thousand Oaks, CA: Sage; 1998.
  • Charmaz K. The grounded theory method: an explication and interpretation. In: Emerson RM, editor. Contemporary Field Research: A Collection of Readings. Prospect Heights, IL: Waveland Press; 1983. pp. 109–126.
  • Chesler MA. Professionals' Views of the Dangers of Self-Help Groups: Explicating a Grounded Theoretical Approach. [Michigan]: Department of Sociology, University of Michigan, Ann Arbor Centre for Research on Social Organisation, Working Paper Series; 1987.
  • Kearney MH. Ready-to-wear: discovering grounded formal theory. Res Nurs Health. 1998;21:179–186. doi: 10.1002/(SICI)1098-240X(199804)21:2<179::AID-NUR8>3.0.CO;2-G.
  • Thomas J, Harden A. Methods for the thematic synthesis of qualitative research in systematic reviews. BMC Med Res Meth. 2008;8:45. doi: 10.1186/1471-2288-8-45.
  • Lucas PJ, Arai L, Baird, Law C, Roberts HM. Worked examples of alternative methods for the synthesis of qualitative and quantitative research in systematic reviews. BMC Med Res Meth. 2007;7(4).
  • Harden A, Garcia J, Oliver S, Rees R, Shepherd J, Brunton G, Oakley A. Applying systematic review methods to studies of people's views: an example from public health research. J Epidemiol Community Health. 2004;58:794–800. doi: 10.1136/jech.2003.014829.
  • Paterson BL, Thorne SE, Canam C, Jillings C. Meta-Study of Qualitative Health Research: A Practical Guide to Meta-Analysis and Meta-Synthesis. Thousand Oaks, CA: Sage Publications; 2001.
  • Zhao S. Metatheory, metamethod, meta-data-analysis: what, why and how? Sociol Perspect. 1991;34:377–390.
  • Ritzer G. Metatheorizing in Sociology. Lexington, MA: Lexington Books; 1991.
  • CASP (Critical Appraisal Skills Programme). http://www.phru.nhs.uk/Pages/PHD/CASP.htm (date unknown).
  • Greenhalgh T, Robert G, Macfarlane F, Bate P, Kyriakidou O, Peacock R. Storylines of research in diffusion of innovation: a meta-narrative approach to systematic review. Soc Sci Med. 2005;61:417–30. doi: 10.1016/j.socscimed.2004.12.001.
  • Kuhn TS. The Structure of Scientific Revolutions. Chicago: University of Chicago Press; 1962.
  • Dixon-Woods M, Cavers D, Agarwal S, Annandale E, Arthur A, Harvey J, Hsu R, Katbamna S, Olsen R, Smith L, Riley R, Sutton AJ. Conducting a critical interpretive synthesis of the literature on access to healthcare by vulnerable groups. BMC Med Res Meth. 2006;6(35).
  • Gough D. Weight of evidence: a framework for the appraisal of the quality and relevance of evidence. In: Furlong J, Oancea A, editors. Applied and Practice-based Research. Special edition of Research Papers in Education. 2007;22(2):213–228.
  • Webb EJ, Campbell DT, Schwartz RD, Sechrest L. Unobtrusive Measures. Chicago: Rand McNally; 1966.
  • Denzin NK. The Research Act: a Theoretical Introduction to Sociological Methods. New York: McGraw-Hill; 1978.
  • Banning J. Ecological Triangulation. http://mycahs.colostate.edu/James.H.Banning/PDFs/Ecological%20Triangualtion.pdf
  • Banning J. Ecological Sentence Synthesis. http://mycahs.colostate.edu/James.H.Banning/PDFs/Ecological%20Sentence%20Synthesis.pdf
  • Brunton G, Oliver S, Oliver K, Lorenc T. A Synthesis of Research Addressing Children's, Young People's and Parents' Views of Walking and Cycling for Transport. London: EPPI-Centre, Social Science Research Unit, Institute of Education, University of London; 2006.
  • Oliver S, Rees R, Clarke-Jones L, Milne R, Oakley A, Gabbay J, Stein K, Buchanan P, Gyte G. A multidimensional conceptual framework for analysing public involvement in health services research. Health Expect. 2008;11:72–84. doi: 10.1111/j.1369-7625.2007.00476.x.
  • Pope C, Ziebland S, Mays N. Qualitative research in health care: analysing qualitative data. BMJ. 2000;320:114–116. doi: 10.1136/bmj.320.7227.114.
  • Ritchie J, Spencer L. Qualitative data analysis for applied policy research. In: Bryman A, Burgess R, editors. Analysing Qualitative Data. London: Routledge; 1993. pp. 173–194.
  • Miles M, Huberman A. Qualitative Data Analysis. London: Sage; 1984.
  • Evans D, Fitzgerald M. Reasons for physically restraining patients and residents: a systematic review and content analysis. Int J Nurs Stud. 2002;39:739–743. doi: 10.1016/S0020-7489(02)00015-9.
  • Suikkala A, Leino-Kilpi H. Nursing student-patient relationships: a review of the literature from 1984–1998. J Adv Nurs. 2000;33:42–50. doi: 10.1046/j.1365-2648.2001.01636.x.
  • Weed M. 'Meta-interpretation': a method for the interpretive synthesis of qualitative research. Forum: Qual Soc Res. 2005;6:Art 37.
  • Gough D, Thomas J. Dimensions of difference in systematic reviews. http://www.ncrm.ac.uk/RMF2008/festival/programme/sys1
  • Spencer L, Ritchie J, Lewis J, Dillon L. Quality in Qualitative Evaluation: a Framework for Assessing Research Evidence. London: Government Chief Social Researcher's Office; 2003.
  • Banning J. Design and Implementation Assessment Device (DIAD) Version 0.3: A response from a qualitative perspective. http://mycahs.colostate.edu/James.H.Banning/PDFs/Design%20and%20Implementation%20Assessment%20Device.pdf
  • Paterson BL. Coming out as ill: understanding self-disclosure in chronic illness from a meta-synthesis of qualitative research. In: Webb C, Roe B, editors. Reviewing Research Evidence for Nursing Practice. Oxford: Blackwell Publishing Ltd; 2007. pp. 73–83.


Gifted and Talented Chemistry - Synthesis and Analysis


Synthesis and analysis are two key aspects of chemistry, particularly when exploring the role of chemistry in an industrial context and relating the products formed to applications in everyday life. In order to successfully synthesise any material it is necessary to have an understanding of the starting materials, the product(s) and the mechanisms and conditions needed to move from one to the other. This pulls together many different aspects of chemistry learning. This programme is designed to develop students' understanding of these topics from basic concepts to higher-level thinking. It also aims to show that understanding how aspects of chemistry link together gives a fuller understanding of chemical processes as a whole. Working through the activities will also develop thinking and research skills.

Synthesis and analysis student pack

Synthesis and analysis teacher pack

Additional information

The RSC would like to thank Tim Jolliff for the use of his materials in these resources. Tim’s book ‘Chemistry for the Gifted and Talented’ is also available on Learn Chemistry.

  • 11-14 years
  • 14-16 years
  • 16-18 years
  • Practical experiments
  • Teacher notes
  • Organic chemistry
  • Analytical chemistry
  • Reactions and synthesis
  • Able and talented
  • Chromatography

Specification

  • The industrial advantages of ethanoic anhydride over ethanoyl chloride in the manufacture of the drug aspirin.
  • The synthesis of an organic compound can involve several steps.
  • a) the techniques and procedures used for the preparation and purification of organic solids involving use of a range of techniques including: i) organic preparation: use of Quickfit apparatus; distillation and heating under reflux
  • a) the techniques and procedures used for the preparation and purification of organic solids involving use of a range of techniques including: ii) purification of an organic solid: filtration under reduced pressure; recrystallisation; measurement of melt…
  • b) for an organic molecule containing several functional groups: identification of individual functional groups; prediction of properties and reactions
  • c) multi-stage synthetic routes for preparing organic compounds.
  • 16. The preparation of aspirin
  • 7B. Determination of acetyl salicylic acid in a commercial tablet, using pure aspirin as a control.
  • The reaction mechanism for SN1 and SN2 reactions
  • (a) synthesis of organic compounds by a sequence of reactions
  • (b) principles underlying the techniques of manipulation, separation and purification used in organic chemistry
  • (g) use of chromatographic data from TLC/paper chromatography, GC and HPLC to find the composition of mixtures
  • 1.9.6 describe paper chromatography as the separation of mixtures of soluble substances by running a solvent (mobile phase) through the mixture on the paper (stationary phase), which causes the substances to move at different rates over the paper;
  • prepare aspirin using salicylic acid and ethanoic anhydride; and
  • use chromatography to compare the purity of laboratory-made aspirin with commercial tablets.
  • 2.4.8 recall the mechanism of electrophilic addition between chlorine, bromine, hydrogen chloride and hydrogen bromide with alkenes using curly arrows;
  • 4.10.5 prepare methyl-3-nitrobenzoate from methyl benzoate to illustrate nitration of the benzene ring.
  • 2. Develop and use models to describe the nature of matter; demonstrate how they provide a simple way to account for the conservation of mass, changes of state, physical change, chemical change, mixtures, and their separation.
  • Mechanisms of ionic addition (addition of HCl, Br₂, Cl₂ only to ethene).
  • Chromatography as a separation technique in which a mobile phase carrying a mixture is caused to move in contact with a selectively absorbent stationary phase.


Video Transcripts: Analyzing & Synthesizing Sources: Synthesis: Definition and Examples


Last updated 11/8/2016

Video Length: 2:50

Visual: The screen shows the Walden University Writing Center logo along with a pencil and notebook. “Walden University Writing Center.” “Your writing, grammar, and APA experts” appears in center of screen. The background changes to the title of the video with open books in the background.

Audio: Guitar music plays.

Visual: Slide changes to the title “Moving Towards Synthesis” and the following:

Interpreting, commenting on, explaining, discussion of, or making connections between MULTIPLE ideas and sources for the reader.

Often answers questions such as:

  • What do these things mean when put together?
  • How do you as the author interpret what you’ve presented?

Audio: Synthesis is a lot like, I like to say it's like analysis on steroids. It's a lot like analysis, where analysis is you're commenting or interpreting one piece of evidence or one idea, one paraphrase or one quote. Synthesis is where you take multiple pieces of evidence or multiple sources and their ideas and you talk about the connections between those ideas or those sources. And you talk about where they intersect or where they have commonalities or where they differ. And that's what synthesis is. But really, in synthesis, when we have synthesis, it really means we're working with multiple pieces of evidence and analyzing them.

Visual: Slide changes to the title “Examples of Synthesis” and the following example:

Ang (2016) found that small businesses that followed the theory of financial management reduced business costs by 12%, while Sonfield (2015) found that this theory reduced costs by 17%. These studies together confirmed that adopting the theory of financial management reduces costs for U.S. small businesses.

Audio: So here's an example for you. In this example we have Ang (2016), that's source number 1, right? Then Sonfield (2015), that's source number 2. They are both using this theory and found that it reduced costs by both 12% and 17%. So this is my evidence, right?

I have one sentence, but two pieces of evidence, because we're working with two different sources, Ang and Sonfield, one and two. In my next sentence, my last sentence here, we have my piece of synthesis. Because I'm taking these two sources and saying that they both found something very similar. They confirmed that adopting the theory for financial management reduces costs for small businesses. So I'm showing the commonality between these two sources. So it's a very, sort of, not simple, but, you know, clean approach to synthesis. It's a very direct approach to kind of showing the similarities between these two sources. So that's an example of synthesis, okay.

Visual: The following example is added to the slide:

Sharpe (2016) observed an increase in students’ ability to focus after they had recess. Similarly, Barnes (2015) found that hands-on activities also helped students focus. Both of these techniques have worked well in my classroom, helping me to keep my students engaged in learning.

Audio: Another example here. So Sharpe found that one thing helps students. Barnes found another thing helps students focus. Two different sources, two different ideas. In the bold sentence of synthesis, I'm taking these two ideas together and talking about how they have both worked well in my classroom.

The synthesis that we have here kind of take two different approaches. The first example is more about how these studies confirm something. The second example is about how these two ideas can be useful in my own practice, I'm applying it to my own practice, or the author is applying it to their own practice in the classroom. But they both are examples of synthesis and taking different pieces of evidence showing how they work together or relate, okay.

I kind of like to think of synthesis as taking two pieces of a puzzle. So each piece of evidence is a piece of the puzzle. And you're putting together those pieces for the reader and saying, look, this is the overall picture, right? This is what we can see, when these two pieces--or three pieces--of the puzzle are put together. So it's kind of like putting together a puzzle.

Visual: “Walden University Writing Center. Questions? E-mail [email protected] ” appears in center of screen.


Purdue Online Writing Lab (Purdue OWL)

Synthesizing Sources


When you look for areas where your sources agree or disagree and try to draw broader conclusions about your topic based on what your sources say, you are engaging in synthesis. Writing a research paper usually requires synthesizing the available sources in order to provide new insight or a different perspective into your particular topic (as opposed to simply restating what each individual source says about your research topic).

Note that synthesizing is not the same as summarizing.  

  • A summary restates the information in one or more sources without providing new insight or reaching new conclusions.
  • A synthesis draws on multiple sources to reach a broader conclusion.

There are two types of syntheses: explanatory syntheses and argumentative syntheses. Explanatory syntheses seek to bring sources together to explain a perspective and the reasoning behind it. Argumentative syntheses seek to bring sources together to make an argument. Both types of synthesis involve looking for relationships between sources and drawing conclusions.

In order to successfully synthesize your sources, you might begin by grouping your sources by topic and looking for connections. For example, if you were researching the pros and cons of encouraging healthy eating in children, you would want to separate your sources to find which ones agree with each other and which ones disagree.

After you have a good idea of what your sources are saying, you want to construct your body paragraphs in a way that acknowledges different sources and highlights where you can draw new conclusions.

As you continue synthesizing, here are a few points to remember:

  • Don’t force a relationship between sources if there isn’t one. Not all of your sources have to complement one another.
  • Do your best to highlight the relationships between sources in very clear ways.
  • Don’t ignore any outliers in your research. It’s important to take note of every perspective (even those that disagree with your broader conclusions).

Example Syntheses

Below are two examples of synthesis: one where synthesis is NOT utilized well, and one where it is.

Parents are always trying to find ways to encourage healthy eating in their children. Elena Pearl Ben-Joseph, a doctor and writer for KidsHealth , encourages parents to be role models for their children by not dieting or vocalizing concerns about their body image. The first popular diet began in 1863. William Banting named it the “Banting” diet after himself, and it consisted of eating fruits, vegetables, meat, and dry wine. Despite the fact that dieting has been around for over a hundred and fifty years, parents should not diet because it hinders children’s understanding of healthy eating.

In this sample paragraph, the paragraph begins with one idea and then drastically shifts to another. Rather than comparing the sources, the author simply describes their content. This leads the paragraph to veer in a different direction at the end, and it prevents the paragraph from expressing any strong arguments or conclusions.

An example of a stronger synthesis can be found below.

Parents are always trying to find ways to encourage healthy eating in their children. Different scientists and educators have different strategies for promoting a well-rounded diet while still encouraging body positivity in children. David R. Just and Joseph Price suggest in their article “Using Incentives to Encourage Healthy Eating in Children” that children are more likely to eat fruits and vegetables if they are given a reward (855-856). Similarly, Elena Pearl Ben-Joseph, a doctor and writer for Kids Health , encourages parents to be role models for their children. She states that “parents who are always dieting or complaining about their bodies may foster these same negative feelings in their kids. Try to keep a positive approach about food” (Ben-Joseph). Martha J. Nepper and Weiwen Chai support Ben-Joseph’s suggestions in their article “Parents’ Barriers and Strategies to Promote Healthy Eating among School-age Children.” Nepper and Chai note, “Parents felt that patience, consistency, educating themselves on proper nutrition, and having more healthy foods available in the home were important strategies when developing healthy eating habits for their children.” By following some of these ideas, parents can help their children develop healthy eating habits while still maintaining body positivity.

In this example, the author puts different sources in conversation with one another. Rather than simply describing the content of the sources in order, the author uses transitions (like "similarly") and makes the relationship between the sources evident.

Difference Between Analysis and Synthesis

Analysis vs. Synthesis

Analysis is like the process of deduction wherein you cut down a bigger concept into smaller ones. As such, analysis breaks down complex ideas into smaller fragmented concepts so as to come up with an improved understanding. Synthesis, on the other hand, resolves a conflict set between an antithesis and a thesis by settling what truths they have in common. In the end, the synthesis aims to make a new proposal or proposition.

Derived from the Greek word ‘analusis,’ which literally means ‘a breaking up,’ analysis has been used in the realm of logic and mathematics since before the time of the great philosopher Aristotle. When learners are asked to analyze a certain concept or subject matter, they are encouraged to connect different ideas and examine how each idea is composed. They study how each idea relates to the bigger picture. They are also tasked with looking for evidence that will help them reach a concrete conclusion, which includes uncovering any biases and assumptions.

Synthesizing is different: when learners are asked to synthesize, they put together the separate parts that have already been analyzed with other ideas or concepts to form something new or original. They look into varied source materials to gain insights and, from there, form their own concepts.

Similar definitions of synthesis (from other sources) state that it is the combining of two (or more) concepts to form something new. This may be why synthesis in chemistry means running a series of chemical reactions in order to form a complex molecule out of simpler chemical precursors. In botany, plants perform their basic function of photosynthesis, in which they use the energy of sunlight to build organic molecules from carbon dioxide. In addition, science professors use the term routinely to denote that something is being made: amino acid synthesis, for example, is the process of making amino acids (the building blocks of proteins) out of their basic constituents. But in the humanities, synthesis (in the case of philosophy) is the end product of the dialectic between a thesis and an antithesis, and it is considered a higher process than analysis.

When one uses analysis in chemistry, it can mean any of the following: determining the proportions of the components of a mixture (quantitative analysis), identifying the components of a specific chemical (qualitative analysis), or, lastly, splitting chemical processes apart and observing the reactions that occur between the individual elements of matter.

1. Synthesis is a higher process that creates something new. It is usually done at the end of an entire study or scientific inquiry.
2. Analysis is like the process of deduction wherein a bigger concept is broken down into simpler ideas to gain a better understanding of the entire thing.


Open access | Published: 24 May 2024

Environment modulates protein heterogeneity through transcriptional and translational stop codon readthrough

Maria Luisa Romero Romero, Jonas Poehls, Anastasiia Kirilenko, Doris Richter, Tobias Jumel, Anna Shevchenko & Agnes Toth-Petroczy

Nature Communications, volume 15, Article number: 4446 (2024)


  • Molecular evolution
  • Transcription
  • Translation

Stop codon readthrough events give rise to longer proteins, which may alter the protein’s function, thereby generating short-lasting phenotypic variability from a single gene. In order to systematically assess the frequency and origin of stop codon readthrough events, we designed a library of reporters. We introduced premature stop codons into mScarlet, which enabled high-throughput quantification of protein synthesis termination errors in E. coli using fluorescent microscopy. We found that under stress conditions, stop codon readthrough may occur at rates as high as 80%, depending on the nucleotide context, suggesting that evolution frequently samples stop codon readthrough events. The analysis of selected reporters by mass spectrometry and RNA-seq showed that not only translation but also transcription errors contribute to stop codon readthrough. The RNA polymerase was more likely to misincorporate a nucleotide at premature stop codons. Proteome-wide detection of stop codon readthrough by mass spectrometry revealed that temperature regulated the expression of cryptic sequences generated by stop codon readthrough in E. coli . Overall, our findings suggest that the environment affects the accuracy of protein production, which increases protein heterogeneity when the organisms need to adapt to new conditions.


Introduction

Protein synthesis termination can achieve high fidelity, yet it is not perfect. Within all life forms on Earth, the end of the translation process is signaled by stop codons and catalyzed by release factors 1 , 2 , 3 . This process has evolved to rewrite genomic information into canonical-size proteins accurately 4 . However, stop codon readthrough (SCR) 5 may occur either by a transcription error, when the RNA polymerase misincorporates a nucleotide and eliminates the stop codon, or by a translation error in which the ribosome misincorporates a tRNA at the stop codon (also called nonsense suppression 6 , 7 ). Thus, SCR covers transcription and translation events that deviate from the genetic code without identifying a specific mechanism. These errors result in protein variants with extended C termini 8 , 9 , 10 , generating a heterogeneous protein-length population from a single gene.

Proteome diversification arising from SCR can potentially generate phenotypic variability, shaping the cell’s fate 11 . For instance, errors in protein synthesis termination can be adaptive and functional 8 , 10 , 12 , 13 , 14 , 15 , 16 , 17 or non-adaptive 18 , occasionally leading to a fitness decrease 19 . Further, slippery sequences upstream of the stop codon that cause the ribosome to slip, and thereby lead to stop codon readthrough, are often functional 14 , 20 , 21 . Transcription and translation errors may generate short-lasting phenotypic variability on a physiological time scale, faster than genomic mutations. Thus, SCR events may facilitate rapid adaptation to sudden environmental changes. These tremendous evolutionary implications reveal the need to study the rules that dictate SCR error rates under diverse living conditions.

Several studies have highlighted the relevance of SCR under diverse environmental conditions. Both carbon starvation 22 and excess glucose promote the readthrough of TGA in E. coli by lowering the pH 23 . However, these studies provided information for a given genetic context, detecting a maximum of 14% TGA readthrough for cells grown in LB supplemented with lactose 23 . Further, low growth temperatures dramatically increased ribosomal frameshift rates in Bacillus subtilis 24 . Nevertheless, the impact of temperature on SCR remains unknown.

Here, we systematically examined SCR events for all stop codons in a variety of genetic contexts under different temperatures and nutrient conditions. We specifically explored: i) how frequently errors occur in protein synthesis termination; ii) whether non-optimal environmental conditions modulate the chances for evolution to encounter these events; and iii) whether they originate from translation or transcription errors. We designed a library of reporters that allowed for high-throughput quantification of protein synthesis termination errors in E. coli using fluorescent microscopy. We targeted 43 arbitrary positions along the mScarlet sequence, and, at each position, we mutated the wild-type codon to each of the three stop codons. Thus, only upon errors that skip protein synthesis termination would a full-length mScarlet be synthesized.

We confirmed that protein termination accuracy depends on the identity and genetic context of the stop codon. We then proposed a set of simple rules to predict hotspots in the protein sequence that are error-prone for protein synthesis termination. We further showed that environmental stress conditions such as low temperature and nutrient depletion increase the SCR rate of all stop codons. Accordingly, the opal stop codon TGA, present in 29% of the E. coli proteome, fails to terminate the protein synthesis in certain positions at a rate of up to 80% under stress conditions at a given nucleotide context. RNA-seq and mass spectrometry experiments of selected reporters revealed that protein synthesis mis-termination is not only due to ribosomal readthrough, but RNA polymerase errors also contribute to SCR. We found that the RNA polymerase is more likely to misincorporate a nucleotide at premature stop codons. Finally, mass spectrometry analysis of the K12 MG1655 E. coli proteome provided evidence of cryptic sequences revealed by SCR, validating our method for predicting SCR in a more natural context. Overall, our findings suggest a cross-talk between the environment and the flux of biological information that increases protein heterogeneity when organisms need to adapt to new conditions.

Visualizing and quantifying stop codon readthrough events in E. coli

To monitor error rates in protein synthesis termination, we designed a fluorescence reporter by which stop codon readthrough can be visualized and quantified in E. coli, inspired by the pioneering work of R.F. Rosenberger and G. Foskett 25 and the more recent study of Meyerovich et al. 24 . The strategy relies on introducing a premature stop codon into an mScarlet allele. Thus, only upon SCR will full-length and, therefore, functional mScarlet be synthesized (Fig.  1A ).

figure 1

A To study stop codon readthrough (SCR), we designed a fluorescence reporter introducing a premature stop codon into an mScarlet allele. B We detected full-length mScarlet and, therefore, SCR events in cells transformed with the reporter. C Titrating the expression of the reporter increases the detection of SCR events. D We designed a library of reporters for SCR mutating to TAA (purple), TAG (green), and TGA (blue), 41 randomly selected codons along the mScarlet sequence. We studied the library of reporters in a high-throughput fashion. E Fluorescence distributions displayed by the E. coli cells, transformed with each library’s reporters and grown at 37 °C in rich media. The propensity of SCR, calculated as the percent of the median fluorescence compared with the positive control (PC, wild-type mScarlet), is shown for the distributions with a median fluorescence higher than the negative control. There are hotspots in the protein sequence prone to SCR. The identity of the stop codon affected the likelihood of stop codon readthrough with the trend TGA > TAG > TAA. Each fluorescence distribution was derived from one biological replicate. Source data are provided as a Source data file. Figure 1A and D were created with BioRender.com released under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.

We first introduced a TGA stop codon in the randomly selected position 95 of an mScarlet flanked by two tags, a Strep-tag at the N-terminus and a His-tag at the C-terminus. On the one hand, these tags provided a way to purify expressed fragments. On the other hand, the His-tag served as an orthologous method to detect SCR events by Western blot. To tightly regulate the expression of the reporter, we used a low copy number vector and an inducible promoter.

We transformed a wild-type K12 MG1655 E. coli strain with the Gly95-TGA reporter. Then, we grew the transformed cells until stationary phase under optimal conditions, testing the expression of the Gly95-TGA reporter with titrated concentrations of the inducer. We used cells transformed with the empty vector as negative control (NC) and cells expressing wild-type mScarlet as positive control (PC; Fig.  1B, C ). While no signal was detected in the negative control, an elevated fluorescent signal with increasing inducer concentration was detected in the cells expressing the Gly95-TGA reporter and in the positive control. These results suggest that functional mScarlet was expressed in the cells carrying the Gly95-TGA reporter, indicating SCR. Assuming that the fluorescence properties of the SCR variants were not altered, we were able to quantify the error rate by calculating the relative fluorescence signal as the percentage of the median compared with the PC median. We found that SCR was not a rare event; at a 400 μg/L AHT concentration, protein synthesis termination failed for 2.1% of the expressed Gly95-TGA reporters (Fig.  1C ).

Since we compared cells with orders of magnitude difference in fluorescence intensity, we confirmed that the fluorescence measurements were within the dynamic range of the instrument, and the relationship between mScarlet concentration and fluorescent intensity was linear (Supplementary Fig.  1 ). Therefore, relative fluorescence is a valid approximation for the relative protein abundance, allowing us to quantify the stop codon readthrough error rate.
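As a rough illustration of how such a rate can be computed, the sketch below derives a relative-median-fluorescence estimate of the SCR rate from per-cell fluorescence values. The function name and the numbers are hypothetical; the calculation simply encodes the assumption stated above, namely that fluorescence scales linearly with the amount of full-length mScarlet.

```python
import numpy as np

def scr_rate_percent(reporter_fluorescence, positive_control_fluorescence):
    """Estimate the stop codon readthrough (SCR) rate as the median fluorescence
    of the reporter population, expressed as a percentage of the positive-control
    (wild-type mScarlet) median."""
    return 100.0 * np.median(reporter_fluorescence) / np.median(positive_control_fluorescence)

# Hypothetical per-cell fluorescence values (arbitrary units), for illustration only.
gly95_tga = np.array([12.0, 15.3, 9.8, 20.1, 11.5])
wild_type = np.array([640.0, 710.2, 580.9, 695.4, 660.3])

print(f"Estimated SCR rate: {scr_rate_percent(gly95_tga, wild_type):.1f}%")
```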

Stop codon readthrough events are frequent in E. coli

As demonstrated in previous studies 24 , 26 , 27 , 28 , 29 , we showed proof-of-principle that SCR occurred at the opal stop codon, TGA, and could be visualized and quantified with a fluorescent reporter. Next, we extended the strategy to study SCR event frequency and whether and how the identity of the stop codon and the sequence context regulated the error rate of protein synthesis termination. To address these questions in a high-throughput fashion, we designed a library of reporters for stop codon readthrough. We randomly targeted 43 codons along the mScarlet sequence and mutated them to each of the three stop codons. We confirmed by Sanger sequencing that the final library consisted of 117 reporters: 38 with TAA, 39 with TAG, and 40 with TGA stop codons. We individually transformed E. coli with the reporters and grew them, inducing the reporter expression under normal conditions (37 °C in LB medium) in a 384-well plate. Next, we automatically imaged the cells in the stationary phase to determine fluorescence (Fig.  1D , see methods for detailed description).
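A reporter library of this kind can be enumerated in silico with a few lines of code. The sketch below mutates selected codon positions of a coding sequence to each of the three stop codons; the toy sequence and the helper function are illustrative assumptions, not the cloning pipeline used in the study.

```python
from itertools import product

STOP_CODONS = ("TAA", "TAG", "TGA")

def build_reporter_library(cds, codon_positions):
    """Map (codon_position, stop_codon) -> mutated coding sequence, replacing the
    codon at each selected 1-based position with each of the three stop codons."""
    library = {}
    for pos, stop in product(codon_positions, STOP_CODONS):
        start = (pos - 1) * 3
        library[(pos, stop)] = cds[:start] + stop + cds[start + 3:]
    return library

# Toy 7-codon ORF, for illustration only; the real library targeted positions
# along the mScarlet coding sequence.
toy_cds = "ATGGGTTGGGCTCCGAAACTG"
for key, seq in build_reporter_library(toy_cds, codon_positions=[3, 5]).items():
    print(key, seq)
```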

Several reporters (16 out of 117) displayed high fluorescence output, meaning full-length mScarlet was expressed (Fig.  1E ). This is a rather unexpected result considering that the premature stop codons were introduced randomly along the mScarlet sequence, without considering the genome context that may promote SCR. For instance, the fluorescence distribution of the cells carrying a reporter with a premature stop codon at position 105 displayed a relative median fluorescence of 0.7 to 6%, depending on the stop codon identity (Fig.  1E ).

The stop codon identity seemed pivotal in regulating the SCR propensity, with TGA being the most error-prone stop codon. Fourteen out of the 40 reporters carrying a premature TGA displayed median fluorescence above the threshold defined as the median plus two standard deviations of the fluorescence signal of the NC. Further, the reporters with a stop codon at position 105 suggested that its identity affected the likelihood of SCR events. The highest relative fluorescence values were measured for TGA and followed the trend TGA > TAG > TAA, corroborating previous observations in bacteria 30 , yeast 31 , and mammals 16 , 32 . Surprisingly, the least efficient stop codon, TGA, is present in 29% of the E. coli proteome. However, TAG, which is less prone to be read through, is only present in 8% of the E. coli proteome 33 . We surmise that the highest error rate detected at TGA could be due to the inefficient RF2 (prfB allele), which recognizes TGA, present in the wild-type K-12 MG1655 E. coli 34 , 35 .

In summary, our results suggest that SCR events are more frequent in E. coli than previously thought.

Non-optimal growth temperatures and nutrient scarcity promote stop codon readthrough

SCR events can generate short-lasting phenotypic variability faster than genomic mutations. Thus, SCR events may facilitate rapid evolution due to sudden environmental changes. However, up to now, little attention has been paid to the effect of environmental conditions on SCR. It is known that the readthrough of stop codons in E. coli depends on the growth media 22 , 23 . However, how temperature affects SCR remains unknown. To examine this, we first screened our previously described library of reporters expressed in wild-type E. coli grown at different temperatures (Fig.  2A and Supplementary Fig.  2 ).

figure 2

A In cells grown in rich media, the stop codon readthrough (SCR) levels varied considerably as a function of temperature. At 18 °C, more reporters displayed SCR events and at a higher rate. The plotted variable is the median percentage of each reporter’s fluorescence distribution, normalized by the wild-type (PC). B When cells were grown in rich media, TAA was the most accurate codon terminating the protein synthesis, and TGA was the least accurate at all tested temperatures. The plotted variable is the median percentage of each reporter’s fluorescence distribution relative to the wild-type (PC). The top subpanel describes the reporters with TAA ( N  = [38, 38, 38, 35], mean = [0.05, 0.01, 0.03, 0.01]), the middle with TAG ( N  = [39, 39, 39, 39], mean = [0.49, 0.80, 0.03, 0.01]), and the bottom with TGA ( N  = [40, 40, 40, 40], mean = [7.23, 0.21, 0.51, 0.38]), respectively, at 18 °C, 25 °C, 37 °C and 42 °C. C Fluorescence distributions of E. coli cultures transformed with the reporters that carried each of the stop codons at positions 105 and 155, grown at 18 °C, 25 °C, 30 °C, 37 °C, and 42 °C in LB. Three biological replicates are shown per reporter. The vertical lines represent the NC and PC median fluorescence (wild-type mScarlet). D Fluorescence and bright field images of an E. coli culture transformed with the reporter that introduced a TGA at position 105 of the mScarlet. While some cells were prone to SCR (marked with arrows), others did not exhibit it. E Nutrient scarcity promoted SCR. In cells grown in minimal media, SCR levels increased while lowering the growth temperature. The plotted variable is the median percentage of each reporter’s fluorescence distribution, normalized by the wild-type (PC). F When cells were grown in a minimal medium, TAA was the most accurate codon terminating the protein synthesis, and TGA was the least accurate at all tested temperatures. The plotted variable is the median percentage of each reporter’s fluorescence distribution relative to the wild-type (PC). The top subpanel describes the reporters with TAA ( N  = [38, 38, 38, 38], mean = [0.04, 0.04, 0.04, 0.04]), the middle with TAG ( N  = [39, 39, 39, 39], mean = [0.72, 0.17, 0.02, 0.04]), and the bottom with TGA ( N  = [40, 40, 40, 40], mean = [5.58, 1.11, 0.89, 0.52]), respectively, at 18 °C, 25 °C, 37 °C and 42 °C. The boxes span the interquartile range (25th to 75th percentile) with the median marked by a horizontal line. Whiskers extend 1.5 times the interquartile range from the quartiles. Source data are provided as a Source data file.

A closer inspection of the temperature-effect scan revealed hotspots in the sequence prone to SCR. We define hotspots as positions where the premature stop codon is read through in more than 50% of the tested conditions, for example, positions 105 and 135 (Fig.  2A ). These hotspots seem to depend on the sequence context of the premature stop codon since, regardless of the stop codon identity, they are likely to prevent correct protein synthesis termination at different temperatures.
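The hotspot criterion is easy to state in code. The sketch below flags positions where readthrough was detected in more than half of the tested conditions; the detection calls are invented for illustration.

```python
import numpy as np

def find_hotspots(scr_detected, min_fraction=0.5):
    """Return codon positions where SCR was detected in more than `min_fraction`
    of the tested conditions. `scr_detected` maps position -> list of booleans,
    one per condition."""
    return [pos for pos, flags in scr_detected.items() if np.mean(flags) > min_fraction]

# Hypothetical detection calls across six conditions, for illustration only.
calls = {
    105: [True, True, True, True, False, True],
    135: [True, True, False, True, True, True],
    52:  [False, False, True, False, False, False],
}
print(find_hotspots(calls))  # -> [105, 135]
```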

At all temperatures, TAA is the least likely to be read through while TGA is the most likely, in agreement with previous observations 32 , 36 , 37 . At each temperature, more TGA reporters display stop codon readthrough than TAG and TAA reporters (Fig.  2A, B , and Supplementary Fig.  2 ). Additionally, the stop codon termination error rates follow the same trend at the hotspot positions: TGA > TAG > TAA (Fig.  2C ).

The SCR rates vary considerably as a function of temperature. Interestingly, at lower temperatures, more reporters display stop codon readthrough events (Fig.  2A , Supplementary Figs.  2 and 3AB). At the hotspot positions, protein synthesis termination seems more accurate at the optimal growth temperature, 37 °C (Supplementary Fig.  2 ). However, at low temperature (18 °C), when TGA is at position 135, SCR is surprisingly frequent, at a rate of ~80%. The experiment at 25 °C seems to be an outlier in the observed temperature trend, as it presents fewer reporters with SCR than the one at 37 °C (Fig.  2A, B , and Supplementary Fig.  2 ). However, most reporters that exhibited SCR events at 37 °C did so at a low rate (below 1%, Supplementary Fig.  2 ).

To better address the temperature effect and study the reproducibility among biological replicates, we focused on the reporters with TAA, TAG, and TGA at positions 105 and 155. We analyzed three biological replicates for each reporter at 18 °C, 25 °C, 30 °C, 37 °C, and 42 °C (Fig.  2C ). These experiments, along with their statistical analysis (see Methods section for further details), revealed two findings: i) significant cell-to-cell heterogeneity within a clonal population (Fig.  2C, D and Supplementary Table  1 ), and ii) higher levels of SCR with decreasing temperature, particularly evident at 18 °C, when a stop codon was inserted at positions 105 and 155 (Supplementary Fig.  4 and Supplementary Table  2 ). Previous studies have suggested that such heterogeneity may facilitate adaptation to changing environments 26 . The observed heterogeneity poses challenges for quantifying SCR. Throughout this work, we utilized relative median fluorescence as a summary statistic to quantify SCR error rates. Nevertheless, these numeric values do not fully represent the distributions’ complexity, as they are neither normal nor symmetric.

To test whether the observed temperature effect is an artifact due to the unfolding of mScarlet at high temperatures, we assayed the stability of mScarlet, showing that mScarlet has an apparent melting temperature (T m ) above 80 °C (Supplementary Fig.  3C ).

To examine how nutrient depletion modulates protein synthesis termination accuracy, we screened our library of reporters in wild-type E. coli grown in a minimal medium, M9, at different temperatures (Fig.  2E and Supplementary Fig.  2 ). The most surprising result was the frequent occurrence of SCR events observed in low-nutrient conditions. Indeed, the number of hotspots prone to SCR increased (positions 16, 90, 105, 135, 140, 150, 155, 165, and 195; Fig.  2E ). Besides the higher error rate, the stop codon identity (Fig.  2F , Supplementary Figs.  2 , 3A, B ) appeared to play a similar role when growing E. coli in minimal media as in rich media, i.e., the SCR level followed the trend TGA > TAG > TAA.

To assess the generality of the temperature effect, we statistically analyzed all reporters under all experimental conditions (see Methods section). We assessed the median of the fluorescence relative to the wild-type, excluding those reporters that showed no SCR at the studied temperatures. The analyses revealed evidence of a non-linear temperature-driven effect on SCR: 18 °C > 25 °C ~ 37 °C > 42 °C (Supplementary Fig.  5 , Supplementary Table  3 ).

To further explore the key nutrients essential for keeping an accurate protein synthesis termination, we first supplemented the minimal media with a higher carbon source concentration (1.6% glycerol, Supplementary Fig.  6A ) and, secondly, with a higher casamino acid concentration (0.4% casamino acid, Supplementary Fig.  6B ). In both cases, the protein synthesis termination accuracy increased. Accordingly, when cells were grown in non-supplemented minimal media, 18 reporters with TGA displayed a median fluorescence above threshold (defined as the median plus two standard deviations of the NC’s fluorescence signal). However, when supplementing with higher glycerol and casamino acid concentrations, only 6 and 8 reporters, respectively, presented a median fluorescence above the threshold. This suggests that both nutrients, the carbon source and the casamino acids, are essential to prevent SCR events.
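A sketch of this bookkeeping is shown below: the detection threshold is the negative-control median plus two standard deviations, and reporters whose median fluorescence exceeds it are counted. All numbers are hypothetical.

```python
import numpy as np

def detection_threshold(nc_fluorescence):
    """Threshold for calling SCR: median plus two standard deviations of the
    negative-control (empty vector) fluorescence signal."""
    nc = np.asarray(nc_fluorescence, dtype=float)
    return np.median(nc) + 2 * np.std(nc)

def count_reporters_above_threshold(reporter_medians, nc_fluorescence):
    """Number of reporters whose median fluorescence exceeds the NC-derived threshold."""
    threshold = detection_threshold(nc_fluorescence)
    return sum(median > threshold for median in reporter_medians)

# Hypothetical values (arbitrary units), for illustration only.
nc_cells = [1.0, 1.2, 0.9, 1.1, 1.05]
tga_reporter_medians = [0.9, 3.5, 1.1, 7.8, 1.0, 2.4]
print(count_reporters_above_threshold(tga_reporter_medians, nc_cells))
```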

Global inspection of these results suggests that environmental stress conditions, such as non-optimal growth temperature or nutrient scarcity, promote stop codon readthrough and, thus, protein heterogeneity. A similar temperature effect has been previously reported for ribosomal frameshift errors 24 , suggesting an increase in different types of protein synthesis errors at low temperatures. A known general response to environmental stress conditions in E. coli comprises the downregulation of genes related to transcription and ribosomal biogenesis 38 . This general stress response may end up in an increase in protein synthesis errors, which can modulate protein heterogeneity. Since sudden environmental changes can modulate protein heterogeneity in a rapid and short-lasting manner, we hypothesize that E. coli might make use of SCR to quickly adapt to sudden and short-lasting stress conditions.

C-terminal His-tag detection as an orthologous method to quantify stop codon readthrough events

We have linked the fluorescence signal of mScarlet with SCR events. However, the readthrough of a stop codon could disrupt the mScarlet structure or decrease its stability, resulting in full-length yet nonfunctional (i.e., non-fluorescent or dark) mScarlet, or in mScarlet with potentially enhanced fluorescence. We therefore used an orthologous method to detect SCR events and evaluated the limitations of our fluorescent reporter library. The reporters include a His-tag at the C-terminus (Fig.  1A, D ), and thus the His-tag is synthesized only upon SCR. We performed western blotting using anti-His-tag antibodies as an orthologous method to detect and quantify SCR events.

We expressed the reporters in wild-type E. coli at 18 °C in LB media and calculated the relative expression of the His-tag compared to the positive control using Western blot (Supplementary Figs.  7 and 8 ). We subsequently compared the results with the corresponding relative fluorescence values obtained by microscopy. Most reporters consistently showed similar SCR rates in the fluorescence and western blot assays. Only a few reporters presented high His-tag expression but did not exhibit fluorescence (“dark reporters”) (Supplementary Table  7 ).

We further checked that the highest SCR error rates detected (Trp-94-TGA, Ala-105-TGA, and Pro-135-TGA) were not artifacts resulting from an enhanced mScarlet fluorescence variant. Since tryptophan is usually misincorporated by SCR at TGA sites (see results section “Stop codon readthrough events related to amino acid misincorporations” and references 39 , 40 ), we studied the fluorescence of the variants where tryptophan was introduced in positions 94 (same sequence as the wild-type mScarlet), 105, and 135 (Supplementary Fig.  9 ). While the Ala-105-Trp mutant had a comparable fluorescence to the wild-type mScarlet, the Pro-135-Trp mutant displayed one order of magnitude reduced fluorescence (Supplementary Fig.  9 ). We confirmed a 79% SCR rate for the reporter that introduced a TGA at position 135 by His-tag detection (Supplementary Table  7 ). We hypothesize that the higher fluorescence observed previously for this reporter relates to a different amino acid misincorporation.

Overall, the C-terminal His-tag analysis helped us identify and tackle a limitation in our reporters’ design. Most reporters estimate similar SCR rates with the fluorescence measurement as with the His-tag detection, proving that our fluorescent reporter library is a valid method for studying SCR events. Notably, we have considered the Western blot His-tag detection assay as a qualitative analysis. Consequently, for downstream analyses, we relied on fluorescence measurements to quantify SCR events.

High G and low T content downstream of the stop codon increase the likelihood of stop codon readthrough

Our results show hotspots in the sequence prone to SCR events. Since we targeted 43 positions in the mScarlet sequence, we can systematically compare the genetic context effect on SCR. We analyzed the nucleotide occurrences surrounding the premature stop codon and the amino acid position to determine whether they correlate with protein synthesis termination errors.

We derived a score to estimate the likelihood of an SCR event at a given position. Then, we binned the scored values and correlated them with the occurrence of each nucleotide in a 5-nt window upstream and downstream of the premature stop codon. The results show a significant positive correlation for G and negative for T content with SCR likelihood upstream of the premature stop codon (Fig.  3A ). However, a non-significant correlation was found in a 5-nt window downstream of the premature stop codon (Fig.  3B ) (see Methods section for further details). This is consistent with previous work reporting that the two amino acids at the C-terminus of the nascent peptide, and therefore the six nucleotides upstream of the stop codon, modulate termination error frequency in E. coli 41 , 42 , 43 .
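A simplified version of this context analysis is sketched below: it computes the G and T content of the 5-nt upstream window for each reporter and correlates each with an SCR score using scipy's Pearson correlation. The contexts and scores are invented for illustration and are not the study's data.

```python
from scipy.stats import pearsonr

def base_content(window, base):
    """Fraction of a given base in a nucleotide window."""
    return window.upper().count(base) / len(window)

# Hypothetical (upstream 5-nt context, SCR score) pairs, for illustration only.
reporters = [
    ("GGCGA", 6.2), ("ATTAT", 0.0), ("GCGGT", 4.1),
    ("TTATC", 0.1), ("GGGAC", 5.5), ("CATTA", 0.3),
]

scores = [score for _, score in reporters]
for base in ("G", "T"):
    content = [base_content(ctx, base) for ctx, _ in reporters]
    r, p = pearsonr(content, scores)
    print(f"{base} content vs SCR score: r={r:.2f}, p={p:.3f}")
```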

figure 3

A The effect of the identity of the nucleotides in a 5-amino acid window upstream of the stop codon on stop codon readthrough (SCR) probability. Statistical analysis was performed using Pearson correlation and two-sided testing. B High G content in a 5-amino acid window downstream of the stop codon does not increase the probability of SCR, while T decreases it. Statistical analysis was performed using Pearson correlation and two-sided testing. C Reporters were binned into three categories based on the SCR rates, and nucleotide frequencies were calculated 1-nt upstream and downstream of the stop codons. The identity of the base immediately before the stop codon did not impact SCR probability. The identity of the base immediately after the stop codon impacted SCR probability. T increases, and G decreases the protein synthesis termination efficiency. D Stop codon readthrough events occurred primarily due to amino acid misincorporations identified by mass spectrometry. TGA, in blue, was almost always replaced by tryptophan. TAA, in purple, was mainly replaced by alanine, glutamine, and tryptophan, while TAG, in green, was replaced by glutamine and tyrosine. Misincorporation of several other amino acids at the stop codon was clearly a minor process (relative abundance <10%). These MS experiments were performed with the reporters after purification with His-tag affinity resin. Thus, we could not quantify the likelihood of SCR events. However, the MS analyses did provide information on the relative abundance of the amino acids’ misincorporation, presented in Supplementary Table  4 . E The downstream mutation to T and the upstream mutation to ATTAT reduced the likelihood of SCR events for all three tested positions. To quantify the SCR likelihood, we measured the His-tag expression with western blot in E. coli cells transformed with these mutants, grown at 18 °C. We calculated the percentage of His-tag expression compared with the internal positive control (PC, wild-type mScarlet). Source data are provided as a Source data file.

Numerous reports have shown that the nucleotide immediately following the stop codon influences SCR in eukaryotes 32 , 44 and prokaryotes 45 , 46 . Hence, we also studied the effect of the identity of the base following the stop codon on SCR. For this purpose, we used the previously described score for SCR likelihood. Sequence logos of the base following the stop codon position suggest that the presence of T after the stop codon significantly increases the protein synthesis termination efficiency (Fig.  3C ). Conversely, G in the fourth position significantly increases the likelihood of SCR (Fig.  3C ). This result agrees partially with pioneering work showing how T and C favor termination efficiency 46 . However, it differs from previous observations in mammalian cells where the nucleotide at the +4 position increased SCR frequency in the order C > U > A > G 32 . The identity of the base preceding the stop codon does not seem to influence the likelihood of SCR (Fig.  3C , see Methods section).

To experimentally confirm the previous findings, we designed a set of mutants and measured their SCR likelihood. Since these mutants changed the mScarlet primary structure by up to four amino acids, which would likely affect its functionality, we measured SCR events by detecting His-tag expression with western blot. We focused on the reporters with the highest SCR scores: those with a premature stop codon at positions 105, 135, and 155. For these reporters, we mutated the downstream region towards T because, according to our analysis, T reduces the likelihood of SCR events when placed after the stop codon. We mutated the upstream region towards ATTAT because T reduces the likelihood of SCR events when placed in a 5-nt window before the stop codon; this sequence is the richest in T content among reporters that did not exhibit SCR events in any of the studied conditions (Supplementary Table  6 ). We predicted that these mutants would increase protein synthesis termination accuracy. We transformed E. coli cells with these mutants and grew them at 18 °C. We observed that both the downstream mutations towards T and the upstream mutation towards ATTAT reduced the likelihood of SCR events for all three tested positions. Further, mutating the upstream and downstream regions simultaneously had an additive effect (Fig.  3E ). Thus, we confirmed our prediction experimentally.

Finally, to study whether the distance to the C-terminus modulates the propensity of SCR events, we analyzed the correlation between the SCR score and the mScarlet amino acid position. We found no significant correlation (Supplementary Fig.  10 , R = 0.15, p  = 0.362).

Overall, these analyses indicate that the adjacent region to the premature stop codon (5-nt upstream and 1-nt downstream) influences the SCR rate. While G increases the SCR propensity, T has the opposite effect. The higher stability of the secondary structures of the mRNA with high GC content may hamper the fidelity of translation. To test this hypothesis, we predicted the minimum free energies (MFE) of the possible secondary structures in a 100-nt window downstream and upstream of the inserted stop codon. We analyzed the correlation between the SCR score and the minimum free energy of the predicted structures and found no significant correlation (Supplementary Fig.  11A, B, and C , see Methods for more details).
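A sketch of this MFE calculation is given below, assuming the ViennaRNA Python bindings are installed (RNA.fold returns a predicted structure and its minimum free energy in kcal/mol). The sequences, stop codon positions, and SCR scores are toy values, not the study's reporters.

```python
import random
import RNA  # ViennaRNA Python bindings; assumed to be installed
from scipy.stats import pearsonr

def window_mfe(mrna, stop_start, flank=100):
    """Minimum free energy of the predicted secondary structure in a window
    spanning `flank` nt upstream and downstream of the inserted stop codon."""
    window = mrna[max(0, stop_start - flank):stop_start + 3 + flank]
    _structure, mfe = RNA.fold(window)
    return mfe

# Toy data for illustration only: random "reporter" mRNAs with a premature stop
# codon assumed at position 150, paired with made-up SCR scores.
random.seed(0)
make_seq = lambda n: "".join(random.choice("ACGU") for _ in range(n))
reporters = [(make_seq(400), 150, score) for score in (6.2, 0.4, 1.8, 0.1)]

mfes = [window_mfe(seq, pos) for seq, pos, _ in reporters]
scores = [score for _, _, score in reporters]
r, p = pearsonr(mfes, scores)
print(f"MFE vs SCR score: r={r:.2f}, p={p:.3f}")
```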

As expected, decreasing the temperature stabilized the predicted RNA secondary structures, which may explain the higher SCR errors at lower temperatures (Supplementary Fig.  11A, B, and C ). Yet, the higher stability of the secondary structures cannot explain the presence of hotspots for SCR errors in the mScarlet sequence. Since, in the context of ribosomal frameshifting, it has been argued that local thermodynamic stability has a greater effect than the structure’s overall stability 47 , we focused next on the local thermodynamic stability of the secondary structures surrounding the premature stop codon and its correlation with the SCR scores (Supplementary Fig.  11 D and E , see methods for more details). We found no correlation (Supplementary Fig.  11 D and E ).

Overall, the predicted mRNA secondary structures could not explain the presence of hotspots for SCR events.

Stop codon readthrough events related to amino acid misincorporations

Next, we investigated the events occurring at the position of the stop codon in the cases where SCR was detected. We chose nine reporters for which SCR events had been detected in the previous experiments and whose premature stop codon was located within a tryptic peptide detectable by mass spectrometry. The selected reporters were expressed in E. coli grown in nutritionally rich medium (LB) at 18 °C; the expressed products were purified via the C-terminal His-tag by Ni-NTA chromatography and analyzed by GeLC-MS/MS. A wild-type mScarlet sample was processed similarly and used as a control.

A protein synthesis termination error may arise from the misincorporation of an amino acid at the stop codon position, from skipping the stop codon 39 , or from a frameshift. Our data showed that amino acid misincorporation occurred for all nine selected reporters, producing a mixture of several products, and the misincorporated amino acids were not a random selection (Fig.  3D , Supplementary Fig.  12 , and Supplementary Table  4 ). The TGA stop codon was reproducibly replaced by tryptophan and cysteine, with tryptophan favored according to quantitative estimates. Misincorporation of several other amino acids at the TGA position was clearly a minor process (Supplementary Table  4 ). The TAG stop codon, on the other hand, followed another pattern: it was repeatedly replaced by tyrosine, glutamine, and lysine, with all three amino acids having comparable misincorporation rates. The TAA stop codon exhibited, in the only sample studied, replacement by tyrosine, glutamine, lysine, and alanine. No misincorporation events were detected in the control sample.

Overall, our results agree with observations in S. cerevisiae , where TAG and TAA were found to be replaced by glutamine, lysine, and tyrosine, and TGA by tryptophan, cysteine, and arginine 40 . In mammals, tryptophan, cysteine, arginine, and serine can be incorporated at the TGA stop codon position 39 . Most detected amino acid misincorporations may be explained by a single nucleotide mismatch between the tRNA anticodon and the stop codon. A similar strategy is reported for stop-to-sense reassignment in some organisms, where, e.g., tRNA Glu cognates TAG and TAA 48 , and tRNA Trp binds to TGA 49 .

We detected only a single mass spectrum matched to a peptide with a stop codon deletion (the sample with TGA at position 190), suggesting that skipping the stop codon is a non-significant contributor to SCR events. No cases of extended deletion (stop codon together with 1-2 neighboring amino acids) were detected. We also did not observe alternative-frame peptides, as expected, since we purified the proteins using a C-terminal tag in the frame. Thus, our approach does not allow us to rule out the existence of frameshifts.

Altogether, our results indicate that SCR events are due to amino acid misincorporations, and the pattern of misincorporated amino acids depends on the stop codon identity.

Low RNA polymerase accuracy at premature stop codons

We explored whether the amino acid misincorporations identified by LC-MS/MS were due to transcription or translation errors. We studied the mRNA sequence of 12 reporters, including the nine reporters previously studied by mass spectrometry and the mScarlet wild-type, expressed in wild-type E. coli grown at 18 °C in LB media (Data S2). We confirmed the sample sequences by DNA sequencing (Data S3).

The most surprising aspect of the RNA-seq results is the higher probability of mismatches at the premature stop codon sites (Supplementary Fig.  13A ). Some of these mismatches do not result in an amino acid exchange in the protein sequence due to the degeneracy of the genetic code, i.e., several codons encode the same amino acid or stop signal. Nevertheless, the mismatch rates at the premature stop codon sites are unusually high (Supplementary Fig.  13B ). The observed increase in mismatches may be the result of the selective degradation of mRNA containing premature stop codons 50 , 51 , e.g., by Rho-mediated transcription termination of premature stop codon-carrying mRNAs. However, the prevalence of synonymous mismatches leading to another stop codon (Fig.  4E ) suggests a higher RNA polymerase error rate at these stop codons. We hypothesized that the reason for the RNA polymerase errors could be the nucleotide context around the premature stop codon. However, the probability of a mismatch at a given position was significantly higher for nucleotides encoding a premature stop codon than for those encoding an amino acid or the canonical stop codon (Fig.  4A ). The same trend could be observed for non-synonymous mismatches only (Fig.  4B ). To further study the effect of the sequence context, we analyzed the probability of a mismatch based on the adjacent nucleotides (Fig.  4C, D ). The analysis revealed that the identity of the adjacent nucleotides did not affect the accuracy of the RNA polymerase. Interestingly, according to this analysis, the RNA polymerase is more prone to mismatch at a stop codon when it is premature or within the ribosome’s open reading frame, pointing to the coupling of translation and transcription as a putative reason for the higher inaccuracy of the RNA polymerase.
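At its core, such a mismatch analysis reduces to counting, for each reference position, the fraction of aligned reads whose base differs from the reference, and then comparing that fraction between premature stop codon sites and the rest of the coding sequence. The sketch below does this for toy, gapless alignments; it is not the RNA-seq pipeline used in the study.

```python
from collections import defaultdict

def mismatch_rates(reference, aligned_reads):
    """Per-position mismatch fraction, given reads already aligned to `reference`
    as (start_position, read_sequence) pairs with no indels."""
    mismatches = defaultdict(int)
    coverage = defaultdict(int)
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            coverage[pos] += 1
            if base != reference[pos]:
                mismatches[pos] += 1
    return {pos: mismatches[pos] / coverage[pos] for pos in coverage}

# Toy example, for illustration only.
ref = "ATGGCTTGAGCTAAA"                       # premature TGA at positions 6-8
reads = [(0, "ATGGCTTGG"), (3, "GCTTGAGCT"), (6, "TGGGCTAAA")]
rates = mismatch_rates(ref, reads)
print({pos: round(rate, 2) for pos, rate in rates.items() if rate > 0})
```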

figure 4

A The likelihood of the RNA polymerase mismatching was higher at premature stop codons. Density plots of %mRNA mismatches (nucleotide substitutions) observed in RNA-seq experiments at the four nucleotides (T, G, C, A) grouped by amino-acid encoding (left), premature stop codon (middle), or canonical stop codon (right) encoding nucleotides. B When focusing solely on non-synonymous mismatches (those leading to amino acid changes), RNA polymerase was more likely to incorporate a mismatched nucleotide at a stop codon when in the frame. The same analysis is presented as in panel A, but includes only non-synonymous mismatches. C The identity of the adjacent nucleotides did not affect the probability of RNA mismatch at a given position. The probability of RNA mismatch in a given nucleotide was higher when it encoded a premature stop codon than for canonical stop codon or out-of-frame stop codon sites. The plot shows the percentage of RNA mismatches of a given nucleotide (represented in bold on the x-axis), depending on the adjacent nucleotides. D The identity of the adjacent nucleotides did not affect the probability of non-synonymous RNA mismatch at a given position. The same analyses are presented as in panel C, but only for RNA mismatches that lead to amino acid changes (non-synonymous mismatches). E RNA polymerase errors at premature stop codons resulted in a broad range of misincorporated amino acids. A bar plot showing the amino acid encoded by the mRNA sequence at the premature stop codon versus the number of reads. The percentage of encoded amino acids is shown next to each bar. The boxes span the interquartile range (25th to 75th percentile) with the median marked by a horizontal line. Whiskers extend 1.5 times the interquartile range from the quartiles. Source data are provided as a Source data file.

We subsequently studied whether the nucleotide misincorporations caused by RNA polymerase errors match the protein sequences identified by MS. We aimed to clarify the source of the SCR: is it transcriptional or translational? We analyzed the amino acids encoded by the mRNA sequences at the premature stop codon in each reporter (Fig.  4E ). As expected, the wild-type stop codon was found in most of the reporters’ mRNA sequences. Due to the degeneracy of the genetic code, i.e., several codons code for the same amino acid or stop signal, RNA polymerase errors often result in a different stop codon that differs in only one nucleotide (e.g., TAA to TAG) (Fig.  4B, D ). However, occasionally, RNA polymerase errors result in an amino acid insertion at the premature stop codon site. The amino acid detected by mass spectrometry was often found already encoded in the mRNA sequences due to nucleotide misincorporations, although at much smaller proportions and among many other amino acids. Overall, the RNA-seq and mass spectrometry results suggest that mainly translation errors contribute to SCR events. This is in agreement with previous studies that revealed higher translational than transcriptional error rates 14 , 24 , 52 . Transcription errors contribute less to SCR events; yet, they diversify the resulting protein sequence independently of the stop codon identity.

Proteome-wide detection of stop codon readthrough events revealed the conditional expression of non-coding sequences in E. coli

To test whether and to what extent SCR occurs in a natural context, we analyzed the proteome of wild-type E. coli by mass spectrometry, searching for evidence of SCR events. We matched the acquired peptide fragmentation spectra to a customized database comprising canonical E. coli protein sequences and predicted products of stop codon readthrough events. Since growth temperature was proven to be essential for SCR, as described above, the experiments were conducted for cells grown at 37 °C and at 18 °C.
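One way to build such a customized database is to extend each canonical protein by translating past its stop codon until the next in-frame stop codon in the downstream genomic sequence. The sketch below does this with Biopython (assumed available), inserting an 'X' placeholder at the read-through position; the sequences are toy examples, not E. coli genes.

```python
from Bio.Seq import Seq  # Biopython, assumed to be installed

STOP_CODONS = {"TAA", "TAG", "TGA"}

def readthrough_product(cds, downstream):
    """Predicted protein if the canonical stop codon of `cds` is read through:
    the canonical protein, an 'X' for the residue inserted at the stop position,
    and the in-frame translation of `downstream` up to the next stop codon."""
    core = str(Seq(cds[:-3]).translate())
    extension = ["X"]
    for i in range(0, len(downstream) - 2, 3):
        codon = downstream[i:i + 3]
        if codon in STOP_CODONS:
            break
        extension.append(str(Seq(codon).translate()))
    return core + "".join(extension)

# Toy example, for illustration only.
cds = "ATGGCTAAATGA"            # Met-Ala-Lys-stop
downstream = "GGTTTTCATTAAACG"  # Gly-Phe-His-stop-...
print(readthrough_product(cds, downstream))  # -> MAKXGFH
```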

In total, we identified 16 peptides from non-coding regions, mapping to 15 different proteins (Supplementary Table  5 ). These peptides do not exist in canonical protein sequences and can be generated only if the corresponding stop codon is read-through. Of the 16 peptides, 12 covered the stop codon position, enabling the identification of the misincorporated amino acid.

While we found that TAA can be miscoded by many amino acids, TGA was preferentially miscoded by tryptophan (5 out of 10 cases, Fig.  5A , Supplementary Table  5 ). For the ribosomal protein rpsG, in addition to tryptophan, cysteine was identified as misincorporated at the TGA site (Supplementary Table  5 and Supplementary Fig.  14 ). Since tryptophan is one of the rarest amino acids in the E. coli proteome (comprising 2% of the proteome), it seems overrepresented among the SCR peptides. Notably, tryptophan misincorporation at the TGA position may be common in other organisms 40 , 48 .

figure 5

A We detected SCR in 0.23% of TAA E. coli proteins and 0.77% of TGA E. coli proteins. While we found that TAA could be miscoded by numerous amino acids with similar frequencies, TGA was most often miscoded by Trp. B The occurrence of an additional stop codon in a 15, 30, and 60-nt window downstream of the stop codon correlated with the protein synthesis accuracy of the stop codon in the E. coli genome of the K12 MG1655 strain. TGA, as the least accurate stop codon, had the highest frequency of an additional stop codon in the 3’ regions of genes. C We detected more cases of SCR in E. coli samples grown at 18 °C than at 37 °C. Five genes were exclusively detected at 18 °C, and only two genes were exclusively detected at 37 °C. D The identity of the nucleotide downstream of the stop codon modulated the SCR probability. While G decreased the protein synthesis termination accuracy, T increased it. Source data are provided as a Source data file.

Of the 15 genes where SCR was detected, 9 contained TGA and 6 TAA, representing 0.77% of TGA-containing and 0.23% of TAA-containing E. coli genes, respectively. We did not detect SCR events for genes with a TAG stop codon, probably due to its low representation in E. coli (8%, Fig. 5A). This is in line with the reporter library observation that TGA is the most error-prone of the three codons, and with a previous ribosome profiling study in E. coli that observed enrichment of TGA and depletion of TAA among the detected SCR events 35.

Subsequently, we looked for additional stop codons in a 15-nt window downstream of the canonical E. coli stop codons. The probability of finding an additional stop codon correlates with the protein synthesis termination accuracy of the canonical stop codon: an additional stop codon is most likely after TGA, the most error-prone stop codon, followed by TAG, and finally TAA (Fig. 5B). This trend is maintained when extending the search to 30-nt and 60-nt windows (Fig. 5B), suggesting that the selection pressure to fix an additional stop codon depends on the accuracy of the canonical one. However, this apparent selection pressure could be affected by the fact that the wild-type K-12 MG1655 E. coli strain used here carries a defective RF2, which is less efficient in recognizing the TGA stop codon. We therefore repeated the analyses using the BL21 E. coli strain, which expresses a more efficient RF2 variant, and obtained the same results (Fig. 5B and Supplementary Fig. 15): regardless of RF2 efficiency, an additional stop codon is most likely after TGA, followed by TAG and, lastly, TAA.
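
As an illustration of this windowed search, the following minimal Python sketch (assuming a Biopython GenBank record of the genome; the input file name and the simple frame handling are illustrative assumptions, not the authors' script) counts, for each canonical stop codon identity, how many genes carry an additional in-frame stop codon within 15 nt downstream:

```python
from Bio import SeqIO

STOPS = {"TAA", "TAG", "TGA"}
WINDOW = 15  # the text repeats the analysis with 30- and 60-nt windows

def in_frame_stop(seq):
    """True if `seq` (read 5'->3' in the gene's orientation) contains an in-frame stop codon."""
    codons = (str(seq[i:i + 3]).upper() for i in range(0, len(seq) - 2, 3))
    return any(c in STOPS for c in codons)

genome = SeqIO.read("e_coli_K12_MG1655.gb", "genbank")   # hypothetical input file
counts = {s: [0, 0] for s in STOPS}                       # stop -> [with extra stop, total]

for feat in genome.features:
    if feat.type != "CDS":
        continue
    cds = feat.extract(genome.seq)
    stop = str(cds[-3:]).upper()
    if stop not in STOPS:
        continue
    if feat.location.strand == -1:
        # downstream region lies to the left of the gene; read it in gene orientation
        start = int(feat.location.start)
        downstream = genome.seq[max(0, start - WINDOW):start].reverse_complement()
    else:
        end = int(feat.location.end)
        downstream = genome.seq[end:end + WINDOW]
    counts[stop][1] += 1
    counts[stop][0] += in_frame_stop(downstream)

for stop, (extra, total) in counts.items():
    print(f"{stop}: {extra}/{total} genes with an additional in-frame stop within {WINDOW} nt")
```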

We detected more cases of SCR in E. coli samples grown at 18 °C than at 37 °C, in agreement with the fluorescence reporter experiments (a peptide was considered present in a sample if it was identified in at least three out of six replicates). Five genes were detected exclusively at 18 °C, whereas only two genes were detected exclusively at 37 °C (Fig. 5C, Supplementary Tables 5 and 14). Overall, this reinforces the conclusion drawn from the library experiments that lower temperature increases SCR events (Fig. 5C).

Lastly, we analyzed how the genomic context modulates the likelihood of proteome-wide SCR. The previous experiments with the fluorescence reporters indicated that the identity of the base following the stop codon affects the SCR likelihood (Fig. 3B, C): G seemed to increase the likelihood of SCR events, whereas T appeared to decrease it. We then explored whether the 15 identified genes with SCR follow the same trend. We calculated the enrichment of each nucleotide at the position immediately after the stop codon in the 15 identified genes compared with the rest of the genome. We confirmed that the identity of the nucleotide downstream of the stop codon modulated the likelihood of SCR: G decreased the protein synthesis termination accuracy, while T had the opposite effect (Fig. 5D).
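
The per-nucleotide enrichment described here reduces to comparing the frequency of each base at the +1 position in the SCR gene set with its genome-wide frequency. A minimal sketch (assumed data structures, not the authors' code) follows:

```python
import math
from collections import Counter

def plus_one_enrichment(plus_one_by_gene, scr_genes, pseudocount=0.5):
    """plus_one_by_gene: assumed precomputed mapping {gene: 'A'/'C'/'G'/'T'} of the
    nucleotide immediately downstream of each gene's stop codon.
    Returns log2 fold enrichment of each base in `scr_genes` vs. all genes."""
    scr = Counter(plus_one_by_gene[g] for g in scr_genes if g in plus_one_by_gene)
    background = Counter(plus_one_by_gene.values())
    result = {}
    for nt in "ACGT":
        f_scr = (scr[nt] + pseudocount) / (sum(scr.values()) + 4 * pseudocount)
        f_bg = (background[nt] + pseudocount) / (sum(background.values()) + 4 * pseudocount)
        result[nt] = math.log2(f_scr / f_bg)
    return result

# Toy example:
# plus_one_enrichment({"geneA": "G", "geneB": "T", "geneC": "G"}, {"geneA", "geneC"})
```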

Overall, the conclusions drawn from the reporters and from the proteome-wide study agreed. Furthermore, the proteome-wide mass spectrometry analysis revealed the expression of cryptic sequences, i.e., peptides from non-coding regions that can be generated only by SCR events and that are expressed only under certain environmental conditions.

This study demonstrates that evolution frequently samples stop codon readthrough (SCR) events in E. coli. Furthermore, we report that internal factors, such as stop codon identity and genetic context, and external factors, such as growth temperature or nutrients, modulate SCR rates (Fig. 2, Supplementary Figs. 1–4). We have studied the impact of SCR events on the proteome (Fig. 5, Supplementary Table 5). Transcriptional errors at stop codons enable a vast exploration of the mutational space, since the RNA polymerase often miscodes premature stop codons to different codons independently of the stop codon identity (Fig. 4E). Translational errors enable a comparatively minor exploration of the mutational space that depends on the stop codon identity (Fig. 3D, Supplementary Table 4), yet they occur at higher rates and are the main contributors to SCR events.

Despite the significant number of functional SCR events reported 8, 10, 12, 13, 14, 15, 16, most errors in protein synthesis termination are considered non-adaptive 18. However, our findings show an error rate of up to 6% when bacteria are grown under normal conditions (Fig. 1E) (previous error rates recorded by fluorescence reporters for non-programmed SCR under normal growth conditions were 2% in E. coli 23, 26 and 0.4% in B. subtilis 24). Even lower SCR levels have often been linked to functionality 12. For example, a 1% ribosomal readthrough level at a short conserved stop codon context has been reported to generate peroxisomal isoforms of metabolic enzymes in animals and fungi 15. Further, our results indicate that the selection pressure to prevent SCR is not strong enough to reduce the usage of the most error-prone stop codon or to increase the release factors’ efficiency. Interestingly, wild-type E. coli K12 strains carry an RF2 variant with reduced ability to terminate translation 35. Instead, our data suggest a selection pressure to fix an additional stop codon downstream of the most error-prone stop codon (Fig. 5B).

Further, our data show that non-optimal growth temperatures and nutrient scarcity dramatically increase SCR events. In some reporters, SCR occurred at rates of up to 80% (Supplementary Fig. 2) (the previous highest reported error rate for a non-programmed stop codon was 14% TGA readthrough for cells grown in LB supplemented with lactose 23). The effect of nutrient scarcity on SCR may be due to the lower carbon source concentration, which has previously been linked to modulation of RF2 activity 23. We hypothesize that the higher stability of mRNA secondary structures at low temperatures could explain the effect of temperature on SCR. Strong mRNA secondary structures hamper the unfolding of the mRNA, potentially affecting the accuracy of protein synthesis 42. The same argument may explain why the identity of the nucleotides in the adjacent regions affects protein synthesis accuracy, since stem-loops containing GC base pairs have been shown to decrease ribosomal accuracy 42. Although we detected no significant correlation between predicted mRNA secondary structures and SCR likelihood (Supplementary Fig. 11), this may reflect the limitations of the predictions 53. At least in the context of ribosomal frameshifting, the effect of mRNA structure on ribosome movement appears to depend not only on thermodynamic stability but also on the exact distance between the structure and the ribosome 47, 54, the size of the structure 55, and the number of possible conformations 56, 57. These more complex relationships may not be captured by prediction methods.

On the other hand, it has been reported that the amino acid identity at the C-terminus of the nascent peptide modulates termination accuracy 41, 42, 43. Proline and glycine in the −1 and −2 positions upstream of the stop codon increase the termination error frequency 42. Interestingly, the codons encoding proline and glycine are enriched in C and G. Thus, the observed correlation between the GC content upstream of the stop codon and the fidelity of termination may be a consequence of the amino acid effect on termination.

Based on these results, we propose avoiding TGA and optimizing the region adjacent to the stop codon when designing vectors for protein expression in E. coli, to minimize SCR events. The most common protein purification strategies rely on E. coli, often grown at low temperatures, which may promote the production of undesired protein forms 58.

A proteome-wide ribosome profiling study of stop codon readthrough in Drosophila melanogaster proposed that readthrough adds plasticity to the proteome during evolution 14. Similarly, we hypothesize that bacteria could use SCR errors as a mechanism that allows rapid diversification of their proteome to adapt to sudden environmental changes. Further, ribosome profiling of E. coli identified >50 genes with possible translation past the stop codon 35. We also detected cryptic sequences in the E. coli proteome that are expressed by SCR only under cold shock. Further validation through quantitative mass spectrometry will be invaluable in confirming the identity of these 16 peptides, and exploring their role in generating protein diversity is an interesting open research question. Importantly, the proteome-wide analysis validates the proposed rules to predict endogenous SCR in vivo. Further studies are required to elucidate the phenotypes and potential functional roles of these cryptic sequences.

Interestingly, we observed that genes within multi-gene operons are enriched in TGA, the most error-prone stop codon, and depleted in TAA, the most accurate one (Supplementary Fig. 16A). Further, genes within multi-gene operons are enriched in G and depleted in T at the position right after the TGA stop codon (Supplementary Fig. 16B). Since T increases and G decreases the protein synthesis termination accuracy, termination errors are predicted to be enhanced among genes within multi-gene operons. Indeed, 50% of the genes in which the proteome-wide mass spectrometry analysis detected SCR events lie within multi-gene operons (which represent 43% of the E. coli proteome).

We investigated the source of error behind these high SCR levels. We found that, although ribosomal errors are the main contributors to SCR, RNA polymerase errors, while contributing on a minor scale, introduce a greater diversification of the proteome. Intriguingly, our study shows that the RNA polymerase misincorporates nucleotides non-randomly, i.e., it mainly misincorporates at premature and in-frame stop codons. Future work will be required to identify the molecular mechanisms causing this bias in RNA polymerase error rates.

Our work highlights that both transcription and translation errors contribute to protein diversity. We show that SCR is more frequent than previously thought 12 , 15 , 23 , 24 , 26 , thereby providing an evolutionary mechanism enabling cells to respond rapidly to the environment by increasing protein heterogeneity.

Gene Libraries

The reporter gene library was ordered from Twist Bioscience. The gene library was cloned into a pASK vector (purchased from Addgene, #65020; this plasmid is a modified version of the vector pASK-IBA3 plus, with the following changes: the substitution of the ampicillin-resistance gene with a chloramphenicol-resistance gene and the replacement of the pBR322 replicon with a p15A replicon). We used a tetracycline-controlled promoter as it allows tight regulation of protein expression upon anhydrotetracycline titration 59. We chose p15A as the replication origin since it maintains the vector at a low copy number in the cell 60.

Below is the DNA and protein sequence of the wild-type mScarlet. The positions mutated to TAA, TAG, and TGA are highlighted in bold. The N-terminal strep-tag and the C-terminal His-tag are marked in blue and red, respectively. Downstream of the His-tag, we introduced two stop codons.

DNA sequence


Protein sequence


E. coli strain and media

Plasmids encoding the reporters for SCR events were electrotransformed into the wild-type K-12 MG1655 E. coli strain. Transformants were grown on LB-agar plates and inoculated into 384-well plates containing LB medium supplemented with 15 µg/mL of chloramphenicol. To store the transformants at −80 °C (glycerol stock), they were grown at 37 °C to saturation, and glycerol was added to a final concentration of 20%.

To investigate the effect of temperature on SCR, E. coli cultures were grown in 384-well plates without shaking under saturated humidity conditions at 18, 25, 37, and 42 °C to saturation. To address the effect of nutrient depletion on SCR, LB and M9 media were tested.

To titrate the expression of the Gly95-TGA reporter, 0, 25, 50, 100, 200, 400, and 800 µg/L of anhydrotetracycline was added to the media. For the library study, to induce the expression, 400 µg/L of anhydrotetracycline was added to the media. Cells were grown under light protection to avoid the photodegradation of the anhydrotetracycline.

M9 standard media

M9 media was supplemented with 0.4% glycerol, 0.2% casamino acids, 1 mM thiamine hydrochloride, 2 mM MgSO4, and 0.1 mM CaCl2.

M9 supplemented with higher carbon source concentration

M9 media was supplemented with 1.6% glycerol, 0.2% casamino acids, 1 mM thiamine hydrochloride, 2 mM MgSO4, and 0.1 mM CaCl2.

M9 supplemented with higher casamino acid concentration

M9 media was supplemented with 0.4% glycerol, 0.4% casamino acids, 1 mM thiamine hydrochloride, 2 mM MgSO4, and 0.1 mM CaCl2.

Microscopy Screenings and Sample Preparation

A liquid handling robot (Beckman Coulter Biomek FXp with Thermo Cytomat 6002) was used to perform plate-to-plate transfers of cells. The cells were inoculated from a glycerol stock into 384-well plates (Eppendorf microplate 384/V #0030621301) containing 100 µL of LB medium supplemented with 15 µg/mL of chloramphenicol. Preculture plates were grown at 37 °C until the cells reached saturation (~24 h). Then, 2 µL of the saturated cultures were used to inoculate 384-well plates (Eppendorf microplate 384/V #0030621301) containing 100 µL of the appropriate medium (LB or M9) supplemented with 15 µg/mL of chloramphenicol and 400 µg/L of anhydrotetracycline. The cells were grown under light protection at constant, controlled temperatures (18, 25, 37, or 42 °C) until reaching saturation (24-48 h). Then, cells from the saturated cultures were transferred into Greiner 384-well glass-bottom optical imaging plates (#781092), previously coated with poly-L-lysine, containing 50 µL of PBS. We applied different dilution steps to account for variations in growth rate (Supplementary Table 8 and Supplementary Data 3): for LB medium, 1:2500 at 37 °C and 25 °C, 1:1000 at 42 °C, and 1:500 at 18 °C; for M9 medium, 1:1250 at 37 °C and 25 °C, 1:500 at 42 °C, and 1:100 at 18 °C. To coat the Greiner 384-well glass-bottom optical imaging plates, 50 µL of 0.01% (w/v) poly-L-lysine (SIGMA, #P4832) was added and incubated for at least 1 h. The plates were then washed with water and dried overnight.

All confocal imaging was performed on an automated spinning disc confocal microscope (CellVoyager CV7000, Yokogawa) using a 60×/1.2 NA objective. For excitation, a 561 nm laser was used, and fluorescence was detected through a 600/37 bandpass emission filter. We recorded 2560×2560 16-bit images at binning 2 for brightfield and for the mScarlet fluorescent protein in epifluorescence mode on sCMOS cameras. A laser-based hardware autofocus was used to acquire 5 to 9 images per well.

Fluorescence data analysis

Image analysis was performed with Fiji 61 , and downstream data analysis and visualization were performed using R v4.1.2.

The cells were identified and segmented, and their fluorescent signal (mean and standard deviation, mode, minimum, and maximum), as well as additional cell properties (area, x- and y-coordinates), were determined in Fiji using custom macros 61 .

Statistical analysis

Friedman and sign test.

We employed the Friedman and sign tests to evaluate the statistical differences among the biological replicates depicted in Fig. 2C. These tests are non-parametric and suitable for non-normally distributed data. The Friedman test is appropriate for comparing more than two samples, while the sign test is suitable for comparing two samples. To satisfy the requirement of these tests for equal-sized samples, we determined the maximum number of cells available across replicates. Replicates with fewer than 200 cells were excluded from the analysis (Supplementary Table 1).
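
A minimal SciPy-based sketch of this scheme (assumed variable names; the subsampling step is one way of satisfying the equal-size requirement described above, not necessarily the authors' exact procedure):

```python
import numpy as np
from scipy import stats

def compare_replicates(replicates, min_cells=200, seed=0):
    """replicates: list of per-cell fluorescence arrays, one per biological replicate.
    Uses the Friedman test for >2 replicates and a sign test for exactly two."""
    rng = np.random.default_rng(seed)
    kept = [np.asarray(r, dtype=float) for r in replicates if len(r) >= min_cells]
    n = min(len(r) for r in kept)                        # equal sample sizes required
    kept = [rng.permutation(r)[:n] for r in kept]
    if len(kept) > 2:
        _, p = stats.friedmanchisquare(*kept)
    else:
        # Sign test: binomial test on how many paired differences are positive.
        diffs = kept[0] - kept[1]
        diffs = diffs[diffs != 0]
        p = stats.binomtest((diffs > 0).sum(), n=len(diffs), p=0.5).pvalue
    return p
```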

Wilcoxon test

To determine whether one distribution significantly exceeded another, we performed the Wilcoxon test. This non-parametric method is suitable for analyzing non-normally distributed data, serving as an alternative to the t-test. Similar to the Friedman and sign tests, the Wilcoxon test requires equal-sized samples. We evaluated the maximum number of cells available across replicates and excluded those with fewer than 200 cells. We then pooled the replicates into one dataset, ensuring equal representation from each replicate to prevent bias in the results.
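
Sketched below with SciPy (assumed data layout, not the authors' code): the replicates are pooled with equal representation, and the one-sided comparison is carried out with the rank-sum (Mann-Whitney U) form of the Wilcoxon test for independent samples.

```python
import numpy as np
from scipy import stats

def pool_equally(replicates, n_per_replicate, seed=0):
    """Pool replicates with equal representation to avoid biasing the pooled dataset."""
    rng = np.random.default_rng(seed)
    return np.concatenate([rng.permutation(np.asarray(r, dtype=float))[:n_per_replicate]
                           for r in replicates])

def exceeds(condition_a, condition_b, n_per_replicate=200):
    """One-sided test of whether the pooled distribution A exceeds pooled distribution B."""
    a = pool_equally(condition_a, n_per_replicate)
    b = pool_equally(condition_b, n_per_replicate)
    _, p = stats.mannwhitneyu(a, b, alternative="greater")
    return p
```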

Western blot

Single-colony E. coli LB-cultures were grown at 37 °C until reaching an absorbance of 0.6 at 600 nm. Next, they were grown at 18 °C for 2 h. Protein expression was induced by adding anhydrotetracycline to a final concentration of 400 µg/L, and cells were grown overnight at 18 °C. Aliquots of ~1 mg of total protein (A600nm = 1 ≈ 0.3 mg/mL) were centrifuged. Cell pellets were resuspended in 400 µL of ice-cold disruption buffer (PBS containing 10% glycerol, 1 mM MgSO4, Benzonase 0.05 U/m, Roche complete cocktail EDTA free 1 tablet/10 mL), and 300 mg of glass beads (0.1 mm, Scientific Industries SI-BG01) were added. Cells were disrupted in a FastPrep-24™ (MP Biomedicals) at low temperature and centrifuged. Supernatants were processed by SDS–polyacrylamide gel electrophoresis (4–20% Tris-Glycine-Gel, Anamed #TG 42015).

All gels contained the lysate of cells carrying the mScarlet (PC), an empty vector (NC), and a protein marker. The mScarlet and marker bands were visualized with a Typhoon 9500 at 532 nm directly from the gel. The remaining proteins were transferred to nitrocellulose membranes (Whatman BA85) by semi-dry blotting (transfer buffer: 20 mM Tris-Base, 160 mM Glycine, 0.1% SDS, 20% methanol). From the membranes, total protein quantities were assayed using Fast Green FCF (Sigma, #F7252) staining solution 62 and imaged using a LICOR Odyssey at 700 nm. For immunodetection, the membranes were blocked with blocking buffer (5% milk, 0.1% Tween20 in PBS), followed by incubation with 1:5000 Qiagen mouse anti-penta His-tag (#34660) and 1:10,000 Licor IRDYE 800 goat anti-mouse (#926-32350). Signals were detected with the LICOR Odyssey and analyzed with Fiji 61.

The mScarlet sample (PC) shows in-gel fluorescence in several bands with a molecular weight of around 26 kDa, probably due to degradation. The area selected to analyze the His-tag expression was where the PC presented fluorescence (Supplementary Fig.  7 ).

Analysis of the genetic context effect

Stop codon readthrough likelihood score (scr score).

Those reporters displaying a median fluorescence below a threshold (defined as the median plus two standard deviations of the fluorescence signal of the NC) received a score of 0. The reporter with the highest fluorescence signal was assigned the highest score, with a value of 1.0. All other reporters were ranked between 0 and 1.0 according to their median fluorescence values. We repeated this ranking strategy for all stop codons at all tested conditions. We then calculated the average score across conditions and the three stop codons (24 values per position) to obtain a single score per position and, therefore, per genome context.
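
One plausible reading of this scoring scheme (values below threshold map to 0, the brightest reporter to 1, and the rest are scaled linearly by their median fluorescence) is sketched below in Python; the data structures are illustrative assumptions, not the authors' script:

```python
import numpy as np

def scr_score_one_condition(median_fluo, nc_median, nc_sd):
    """median_fluo: dict {position: median fluorescence} for one stop codon and one condition."""
    threshold = nc_median + 2 * nc_sd                 # negative-control based cutoff
    top = max(median_fluo.values())                   # brightest reporter defines score 1.0
    return {pos: (f / top if f > threshold else 0.0) for pos, f in median_fluo.items()}

def average_scores(per_condition_scores):
    """per_condition_scores: list of score dicts (24 per position in the text: all
    conditions x three stop codons). Returns one averaged scr score per position."""
    positions = per_condition_scores[0].keys()
    return {p: float(np.mean([s[p] for s in per_condition_scores])) for p in positions}
```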

Correlation between nucleotide content and stop codon readthrough likelihood score

The stop codon readthrough likelihood scores were binned into 8 equal-width bins. The numbers of samples per bin were: N = 12 (bin = 0), N = 12 (0 < bin < 0.125), N = 9 (0.125 < bin < 0.250), N = 2 (0.250 < bin < 0.375), N = 1 (0.375 < bin < 0.5), N = 0 (0.5 < bin < 0.625), N = 1 (0.625 < bin < 0.750).

Correlation between the nucleotide identity adjacent to the stop codon and stop codon readthrough likelihood score

The stop codon readthrough likelihood scores were binned into three categories: i) accurate protein synthesis termination (score = 0, N = 12), ii) medium tendency to SCR (0 < score < mean of all scores, N = 16), and iii) high tendency to SCR (score > mean of all scores, N = 9).

Protein purification

Wild-type mScarlet for the stability assay, for the calibration curve, and the samples to study by mass spectrometry were expressed in E. coli and purified with His-tag affinity chromatography. Briefly, genes encoding these proteins with C-terminal 8x His-tag were cloned into a pASK vector under a tetracycline-controlled promoter. Vectors were transformed into K-12 MG1655 E. coli cells. From this point, two procedures were used to grow the cells and induce protein expression: i) To purify the samples for mass spectrometry analyses, cells were grown on LB-medium supplemented with 15 µg/mL of chloramphenicol at 37 °C until reaching an absorbance of 0.2 at 600 nm, then at 18 °C until reaching absorbance of 0.6 at 600 nm. Protein expression was induced by anhydrotetracycline (final concentration of 400 µg/L), and cells were grown overnight at 18 °C (~12 h). ii) To purify the wild-type mScarlet for the stability assay and calibration curve, cells were grown on LB-medium supplemented with 15 µg/mL of chloramphenicol at 37 °C until reaching an absorbance of 0.6 at 600 nm. Then, protein expression was induced by the addition of anhydrotetracycline (final concentration of 400 µg/L), and cells were grown for 4 h at 37 °C.

Cells were harvested by centrifugation and resuspended in lysis buffer comprising 20 mM sodium phosphate, 500 mM NaCl, and 20 mM imidazole, pH 7.4, supplemented with Roche complete cocktail EDTA free (1 tablet/10 mL) and benzonase 0.05 U/m. Cells were lysed using an LM20 microfluidizer (Microfluidics), and the lysates were clarified by centrifugation for 1 h at 4 °C at 12,000 rpm and loaded on a His GraviTrap™ column (GE Healthcare). After washing with 10 mL of washing buffer (40 mM sodium phosphate, 500 mM NaCl, and 20 mM imidazole, pH 7.4), proteins were eluted with 20 mM sodium phosphate, 500 mM NaCl, and 500 mM imidazole at pH 7.4 63.

The purity of all the samples was assessed by SDS–polyacrylamide gel electrophoresis (Life Technologies GmbH), and protein concentration was determined by absorbance at 280 nm (using the extinction coefficient of the wild-type mScarlet for all the samples, 39,880 M^-1 cm^-1). For the mass spectrometry analyses, we separated 50 μg of each sample by SDS-PAGE. For the thermostability assay and the calibration curve, mScarlet was purified and dialyzed against PBS.

Calibration curve for fluorescence measurements

Triplicates of purified mScarlet were fluorescently imaged at 0, 3, 4, 7, 10, 15, 23, 35, 52, 78, 117, 175, 262, 393, 590, 885, 1327, 1991, and 2986 nM in PBS. Saturation of the fluorescence signal above 70000 AU defines the upper limit of the dynamic range (Supplementary Fig.  1 ). A linear relationship between mScarlet concentration and fluorescence arbitrary units was observed between 40 and 800 nM.
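
For illustration, converting reporter fluorescence into an apparent mScarlet concentration only requires a straight-line fit over the reported 40-800 nM linear range; a minimal sketch (the measurement arrays are placeholders, not the published data) is:

```python
import numpy as np

def fit_calibration(conc_nM, fluo_au, lower=40.0, upper=800.0):
    """Fit fluorescence = slope * concentration + intercept within the linear range."""
    conc_nM = np.asarray(conc_nM, dtype=float)
    fluo_au = np.asarray(fluo_au, dtype=float)
    in_range = (conc_nM >= lower) & (conc_nM <= upper)
    slope, intercept = np.polyfit(conc_nM[in_range], fluo_au[in_range], deg=1)
    return slope, intercept

# To convert a reporter's fluorescence back to an apparent concentration:
# conc = (fluorescence - intercept) / slope, valid only inside the linear range.
```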

mScarlet thermostability assay

Aliquots of purified mScarlet at 60 nM in PBS were incubated at a range of temperatures (from 30.5 °C to 98.2 °C) for 30 min. Samples were then fluorescently imaged. The procedure was performed in triplicate.

Unfolded mScarlet is not functional and, therefore, does not exhibit fluorescence; mScarlet that remains folded does. The thermostability curve (Supplementary Fig. 3C) indicated that mScarlet remained functional, i.e., fluorescent, up to 70 °C.

mRNA secondary structure prediction

We used RNAfold (ViennaRNA Package 2.0) 53 to calculate the optimal RNA secondary structure, i.e., the structure with the minimum free energy, at the experimentally tested temperatures. We predicted the minimum free energies of the possible secondary structures in a 100-nt window downstream and upstream of the inserted stop codon. Based on Chen et al. 64, we defined a 6-nt spacer downstream of the stop codon, which is typically occupied by the ribosome. No spacer was considered upstream of the stop codon (Supplementary Figs. 11A and 9B, C).

We then focused on the local thermodynamic stability of the secondary structures surrounding the premature stop codon. We studied the number of base pairs, the number of G-C pairs, and the longest stretch of consecutive base pairs in a set of windows (5, 10, 20, and 50 nt) upstream (Supplementary Fig. 11D) and downstream (Supplementary Fig. 11E) of the premature stop codon within the most stable RNA structure predicted above (100-nt window upstream and 100-nt window downstream of the premature stop codon plus the 6-nt spacer).
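
A hedged sketch of the windowed minimum-free-energy calculation using the ViennaRNA Python bindings (an alternative interface to the RNAfold program cited above; the exact window assembly is our reading of the description, not the authors' script):

```python
import RNA  # ViennaRNA Python bindings

def mfe_around_stop(mrna, stop_start, temperature=18.0, window=100, spacer=6):
    """mrna: transcript sequence (str, DNA alphabet); stop_start: 0-based index of the
    inserted stop codon. Folds a region spanning `window` nt upstream of the stop,
    the stop codon itself, the 6-nt ribosome-occupied spacer, and `window` nt downstream."""
    region = mrna[max(0, stop_start - window):stop_start + 3 + spacer + window]
    md = RNA.md()
    md.temperature = temperature            # growth temperature used in the experiment
    fc = RNA.fold_compound(region.replace("T", "U"), md)
    structure, mfe = fc.mfe()               # dot-bracket string and free energy (kcal/mol)
    return structure, mfe
```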

Mass spectrometry analysis of reporters

Gel regions corresponding to the molecular weight of the reporters were excised and analyzed by LC-MS/MS. Briefly, samples were in-gel digested with trypsin (sequencing grade, Promega, Mannheim), and the resulting peptides were extracted with two changes of 5% formic acid (FA) and acetonitrile and dried down in a vacuum centrifuge. Peptide pellets were dissolved in 100 μL of 5% FA, and a 5 μL aliquot of the peptide mixture was taken for MS analysis.

LC-MS/MS analysis was performed on a nanoUPLC Vanquish system interfaced online to an Orbitrap HF hybrid mass spectrometer (both Thermo Fisher Scientific, Bremen). The nano-LC system was equipped with an Acclaim PepMap™ 100 trapping column (75 µm × 2 cm) and a 50 cm μPAC analytical column (Thermo Fisher Scientific, Bremen). Peptides were separated using a 75 min linear gradient (solvent A, 0.1% aqueous FA; solvent B, 0.1% FA in acetonitrile). Samples were first analyzed using data-dependent acquisition (DDA) and then by targeted acquisition with an inclusion list guided by the results of the RNA-seq analysis. DDA analysis was performed using the Top20 method; the precursor m/z range was 350–1600; the mass resolution (FWHM) was 120,000 and 15,000 for MS and MS/MS spectra, respectively; the dynamic exclusion time was set to 15 s. The lock mass function was set to recalibrate MS1 scans using the background ion (Si(CH3)2O)6 at m/z 445.1200. Targeted analysis was performed in profile mode; a full mass spectrum at a mass resolution of 240,000 (AGC target 3×10^6, 150 ms maximum injection time, m/z 350–1700) was followed by PRM scans at a mass resolution of 120,000 (AGC target 1×10^5, 200 ms maximum injection time, isolation window 3 Th) triggered by a scheduled inclusion list. To avoid carryover, 3-5 blank runs were performed after each sample analysis; the last blank was recorded and also searched against a customized database.

Spectra were matched by MASCOT software (v. 2.2.04, Matrix Science, UK) against a customized database comprising E. coli protein sequences extracted from the UniProt database (version October 2022) and a set of modified mScarlet sequences. The mScarlet sequences included three-frame translated nucleotide sequences with stop codon insertions and sequences with deletions of 1-2 amino acids surrounding the position of the stop codon, which were denoted as “X” (equivalent to any amino acid). The database search was performed with 5 ppm and 0.025 Da mass tolerance for precursor and fragment ions, respectively; enzyme specificity, trypsin; one miscleavage allowed; variable modifications, methionine oxidation, N/Q deamidation, cysteine sulfonic acid, cysteine propionamide, and peptide N-terminal acetylation. The results were then evaluated by Scaffold software (v.4.11.1, Proteome Software, Portland) and also manually inspected. Identification of modified peptides was accepted if it passed the 95% peptide probability threshold and if the matched fragmentation spectra (minimal number of PSMs: 2) comprised fragment ions unequivocally confirming the misincorporated amino acid (Supplementary Fig. 12). The ratio of peptide forms comprising misincorporated amino acids was estimated based on extracted ion chromatograms (XICs) for each form, generated in the Xcalibur software (Thermo Fisher Scientific), and normalized to the sum intensity of all forms of the peptide. Peptides with misincorporated Lys were excluded from the calculations.

Single-colony E. coli LB-cultures were grown at 37 °C until an absorbance of 0.4 at 600 nm. Then, they were grown at 18 °C until an absorbance of 0.6 (~2 h), protein expression was induced by the addition of anhydrotetracycline to a final concentration of 400 µg/L, and cells were grown overnight at 18 °C. Cells were diluted to an absorbance of 1.0 at 600 nm, and 100 μL of the cell suspension was mixed with 200 μL of RNAprotect bacteria reagent (Qiagen #76506). Cells were harvested by centrifugation, and pellets were frozen. The total RNA of the samples was then extracted according to the manufacturer’s specifications (RNeasy Protect Bacteria Mini Kit, Qiagen).

To detect rare transcriptional error events, we optimized i) the reverse transcription reaction, ii) the cDNA amplification reaction, and iii) the PacBio HiFi sequencing step. This optimization enabled us to achieve >99.99% accuracy. Briefly:

i) 300 ng of total RNA per sample was mixed with 50 ng of random hexamers and dNTPs and hybridized at 65 °C for 5 minutes. cDNA was reverse transcribed using the Thermo Fisher Superscript IV transcriptase according to the manufacturer’s instructions (reported median error frequency of 5.01 × 10^-5 65).

ii) Then, mScarlet was specifically amplified from the resulting cDNA with forward primer 5’ AGTTATTTTACCACTCCCTATCAGT 3’ and reverse primer 5’ AGTAGCGGTAAACGGCAGAC 3’, resulting in an amplified PCR fragment of 948 bp. The NEB Q5® High-Fidelity DNA polymerase was used according to the manufacturer’s instructions; this DNA polymerase is one of the highest-fidelity polymerases available, incorporating, according to the manufacturer, 1 error in 28,000,000 base pairs. PCR conditions were: an initial denaturation step for 30 sec at 98 °C; 30 cycles of 10-sec denaturation at 98 °C, annealing at 67 °C for 30 sec, and extension at 72 °C for 30 sec; and a final extension step for 2 min at 72 °C. Primers and dNTPs were removed with a 1× volume AMPure bead purification. The amplified mScarlet fragments were quantified with the Thermo Fisher Qubit high sensitivity DNA quantification system, and 1 ng of the amplified mScarlet fragments was analyzed on the Agilent Fragment Analyzer system using the NGS high sensitivity kit.

iii) PacBio SMRTbell® libraries were generated following the PacBio® barcoded overhang adapters protocol for multiplexing amplicons (Express template kit 2.0). For each multiplexed library, eight samples were pooled equimolarly. Briefly, 50 ng of each amplified mScarlet fragment was damage repaired, followed by end repair and A-tailing according to the instructions. PacBio barcoded overhang adapters (BAK8A and BAK8B) were ligated to the PCR fragments, which were pooled equimolarly prior to two final AMPure bead purification steps (1× volume). The final quality control of the resulting library was performed on the Agilent Fragment Analyzer with the large fragment kit. In total, we generated two different 8-plex PacBio HiFi libraries. The v4 PacBio sequencing primer, in combination with the SEQUEL II binding kit 2.1, was used to sequence both 8-plex PacBio SMRTbell® libraries. 80 pM and 120 pM of each library were loaded by diffusion loading, with a pre-extension time of 0.3 hours and a run time of 10 hours on the SEQUEL II, using the SEQUEL II sequencing 2.0 chemistry. Circular consensus reads were called with the PacBio SMRT Link ccs calling tool and demultiplexed with lima, the PacBio demultiplexer and primer removal tool ( https://lima.how/ ). During the PacBio HiFi library sequencing, we obtained reads up to 100 kb. Since our cDNA fragments are short (~1 kb), most of them were read multiple times during the circular sequencing steps (quantified as the number of passes). Specifically, only ~15% of all reads had fewer than 10 full passes, and the maximum number of passes was 60. The standard PacBio SMRT Link pipeline for circular consensus sequencing considers, by default, only reads with an error rate of less than 1%. Additionally, based on Wenger et al., 10 passes correspond to 99.9% base accuracy and 20 passes to 99.99% base accuracy, in agreement with the empirical error rate we observed (Supplementary Fig. 15, mean = 0.005% mismatch per base).

The long PacBio RNA-seq reads were mapped to the reference with BWA v0.7.17-r1198. BAM files representing the mapped reads were further processed with Samtools mpileup v1.15.1 (htslib 1.15.1) to generate a textual description of the mapped reads, including the positions of the mutations, insertions, and deletions found in the reads (all BAM files are provided as Data S2). The pileups were further processed with R to extract the frequencies of mutated trimers along the mScarlet gene. For extracting information on synonymous and non-synonymous mutations, PySam v0.16.0 with Python 3.7.12 was used with the BAM/SAM output from BWA. All plots were generated with base plotting in R v4.1.2.
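
A simplified pysam sketch of per-codon mismatch counting along the reporter (pysam is the library named above; the file path, reference handling, and per-read codon loop are illustrative assumptions rather than the authors' pipeline):

```python
import pysam
from collections import Counter

def codon_mismatch_frequencies(bam_path, ref_name, ref_seq):
    """For every codon of the reference amplicon, report the fraction of mapped reads
    that carry a mismatch at any of its three positions."""
    mismatches = Counter()   # codon index -> number of mismatching reads
    coverage = Counter()     # codon index -> number of reads covering all 3 positions
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        for read in bam.fetch(ref_name):
            if read.is_unmapped or read.query_sequence is None:
                continue
            # map reference position -> query position for aligned (non-indel) bases
            pairs = {r: q for q, r in read.get_aligned_pairs(matches_only=True)}
            for codon_idx in range(len(ref_seq) // 3):
                pos = [codon_idx * 3 + k for k in range(3)]
                if not all(p in pairs for p in pos):
                    continue
                coverage[codon_idx] += 1
                observed = "".join(read.query_sequence[pairs[p]] for p in pos).upper()
                if observed != ref_seq[pos[0]:pos[0] + 3].upper():
                    mismatches[codon_idx] += 1
    return {i: mismatches[i] / coverage[i] for i in coverage}

# freqs = codon_mismatch_frequencies("reporter.bam", "mScarlet", mscarlet_reference_sequence)
```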

Analysis of E. coli proteome by mass spectrometry

Cell culturing and sample preparation.

E. coli (strain K-12 MG1655) was grown overnight at 37 °C in LB medium, then split into two parts, one incubated at 37 °C and the other at 18 °C, until reaching an OD600 of ca. 0.6. The experiment was performed in three biological replicates, and samples were prepared and measured in a block-randomized fashion 66 with two technical repeats. Cells were pelleted by centrifugation at 4500 g for 10 minutes, washed with PBS, and resuspended in 1 mL of lysis buffer consisting of 8 M urea, 0.1 M ammonium bicarbonate, 0.1 M NaCl, and 1x Roche cOmplete™ Protease Inhibitor Cocktail (Roche Diagnostics Deutschland GmbH, Germany). Then, 2 micro spatula spoons of 0.5 mm stainless steel beads (Next Advance Inc., USA) were added, cells were lysed in a TissueLyser II (QIAGEN GmbH, Germany) for 2× 5 min at 30 Hz at 4 °C, and the debris was removed by centrifugation for 10 min at 13,000 g. Protein concentration in the supernatant was measured by Pierce BCA Protein Assay (Thermo Scientific, USA), and aliquots of 100 µg of protein were taken for LC-MS/MS analysis. After reduction and alkylation, proteins were precipitated with isopropanol 67 and digested overnight at 37 °C at a 1:50 enzyme:protein ratio with Trypsin/Lys-C Mix (Promega GmbH, Germany). The resulting peptides were desalted on a MicroSpin column (The Nest Group, Inc., USA) and dried down in a vacuum concentrator. Prior to mass spectrometric analysis, the samples were reconstituted in 0.2% aqueous formic acid; peptide concentration was determined by measuring absorption at 280 nm and 260 nm using a Nanodrop ND-1000 spectrophotometer (Thermo Fisher Scientific Inc, USA) and the Warburg-Christian method 68 and adjusted to a final concentration of 0.12 µg/µL; 5 µL were then taken for analyses.

LC-MS/MS analysis

LC-MS/MS analysis was carried out on the mass spectrometric equipment described in the ‘Mass spectrometry analysis of reporters’ section. Peptides were separated using a 120 min two-slope gradient (solvent A, 0.1% aqueous FA; solvent B, 0.1% FA in acetonitrile): 80 min from 0 to 17.5% ACN and 40 min from 17.5% to 35% ACN at a flow rate of 0.5 µL/min. The gradient was followed by a 7 min wash with 95% ACN. To avoid carryover, 2 blank runs were performed after each sample analysis. Further settings: spray voltage, 2.5 kV; capillary temperature, 280 °C; S-lens RF value, 50. Spectra were acquired by data-independent acquisition (DIA) in a staggered fashion 69, 70; a full-scan mass spectrum with a mass range of 395-971 m/z and a resolution of 60,000 (AGC 3e6, 40 ms maximum injection time, fixed first mass 100) was followed by 32 MS2 scans in centroid mode at a 30,000 mass resolution with an isolation window of 18 m/z covering a 400-966 m/z mass range (55 ms maximum injection time, AGC 1e6, normalized collision energy 24).

Database search and validation of candidate peptides

Acquired spectra were matched against a customized database using the DIA-NN software suite v1.8 71. The customized database, including sequences resulting from potential stop codon readthrough (SCR) events, was created based on the E. coli K12 MG1655 reference genome and the corresponding genome annotation in the NCBI database (accession GCF_000005845.2, version from 18/05/21). For each gene, the ‘downstream sequence’ was determined by finding the next in-frame ORF after the canonical stop codon in the genome (using the reverse complement for genes on the negative strand). The minimum length of the downstream sequence was 60 nucleotides, and the maximum was either 300 nucleotides or the distance to the next in-frame ORF. Each gene was then translated in silico 20 times, from the canonical start to the end of the downstream sequence or until the second in-frame stop codon, each time translating the canonical (the first) stop codon as a different amino acid, and the resulting sequences were added to the fasta file. Thus, for each gene, the sequence list contained 20 sequences, each encompassing the canonical and genomic downstream sequence, with one of the 20 amino acids in place of the canonical stop. This file, as well as a file containing common contaminant sequences 72, was used as input for DIA-NN to create the spectral library. Database preparation and data analysis were performed with Python 3 and Jupyter Notebook, using pandas 73, numpy 74, and Biopython 75.
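
The database construction described above can be sketched with Biopython roughly as follows (the downstream-sequence extraction is assumed to have been done already; identifiers and file handling are illustrative, not the authors' notebook):

```python
from Bio.Seq import Seq

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def scr_entries(cds_nt, downstream_nt, gene_id):
    """cds_nt: coding sequence including the canonical stop codon;
    downstream_nt: genomic sequence 3' of the stop, already capped at the next
    in-frame ORF or 300 nt as described in the text. Yields 20 protein entries,
    one per amino acid substituted at the canonical stop position."""
    body = str(Seq(cds_nt[:-3]).translate())                       # canonical protein, no stop
    extension = ""
    for i in range(0, len(downstream_nt) - 2, 3):
        aa = str(Seq(downstream_nt[i:i + 3]).translate())
        if aa == "*":                                              # second in-frame stop ends it
            break
        extension += aa
    for aa in AMINO_ACIDS:
        yield f"{gene_id}_SCR_{aa}", body + aa + extension

# with open("scr_database.fasta", "w") as out:                     # hypothetical output file
#     for header, seq in scr_entries(cds, downstream, "b0001"):
#         out.write(f">{header}\n{seq}\n")
```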

Staggered DIA raw files were converted to .mzML format and demultiplexed using ProteoWizard MSConvert v3.0.2 76 and analyzed with DIA-NN v1.8. The predicted spectral library was generated from sequences in a customized database under the following settings: enzyme, Trypsin/P, one missed cleavage allowed; C(carbamidomethyl) as fixed modification; M(oxidation) and N-terminal M-excision as variable modifications, allowing up to one variable modification per precursor. Precursor and protein group matrices were filtered at 1% FDR. Other settings were: --double-search --smart-profiling --no-ifs-removal --no-quant-files --report-lib-info --il-eq.

The precursor-by-sample-matrix output generated by DIA-NN was processed further to identify SCR events. First, all precursors mapped to a contaminant protein or any sequence in the Swissprot database (version from 22/12/21) were discarded. The remaining precursors mapped to the downstream sequence of an E. coli gene either fully (complete sequence is after the stop codon) or partially (overlapping the stop codon) and were treated as candidate SCR precursors. Singly charged precursors containing lysine or arginine were removed, as well as precursors not extending for at least 2 amino acids over the stop codon. Further, only candidate SCR precursors reported in 3 out of 6 replicates for at least one temperature value were retained. Fragmentation spectra of candidate SCR precursors were then manually validated in Skyline ‘daily’, version 22.2.1.351 77 . For this, the spectral library generated by DIA-NN based on observed fragment intensities (with ‘full-profiling’ library generation setting) and the peak boundaries from the DIA-NN main output table were imported into Skyline, along with sequences of SCR precursors remaining after filtering. Candidates were evaluated by comparison of the empirical library spectra generated by DIA-NN against Prosit-predicted spectra 78 and by assessing chromatographic coelution of fragments. Peptide identification was accepted if i) peptide sequence matched at least three fragments (y3 and higher for tryptic peptides, b3 and higher or a combination of b/y-ions for others); ii) the fragments co-eluted; iii) two fragments predicted by Prosit to be the most intense (excluding y1, y2, b1, and b2) had to be matched. In addition, indexed retention times predicted by Prosit were compared to measured (indexed) retention times, and outliers eluting much earlier or later than predicted were removed from the analyses (Supplementary Fig.  18 ). Peptides meeting all the filters described were further considered for analyses of differential abundance, sequence context, and stop codon usage. To identify statistically significant differences in abundance between the 37 °C and 18 °C conditions, a Student’s two-sided t -test with Benjamini-Hochberg adjustment 79 was carried out on the log2 peptide intensities of the two groups. In cases where multiple precursors represented the same peptide sequence, the respective intensities were summed.
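
A minimal sketch of this differential-abundance test (assumed dictionaries of log2 intensities per peptide; the Benjamini-Hochberg step is written out explicitly rather than taken from a library):

```python
import numpy as np
from scipy import stats

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjustment of a list of p-values."""
    p = np.asarray(pvalues, dtype=float)
    order = np.argsort(p)
    ranked = p[order] * len(p) / (np.arange(len(p)) + 1)
    adjusted = np.minimum.accumulate(ranked[::-1])[::-1]   # enforce monotonicity
    out = np.empty_like(adjusted)
    out[order] = np.clip(adjusted, 0, 1)
    return out

def differential_abundance(intensities_37, intensities_18):
    """Each argument: dict {peptide: array of log2 intensities across replicates}.
    Returns BH-adjusted two-sided t-test p-values per peptide."""
    peptides = sorted(set(intensities_37) & set(intensities_18))
    pvals = [stats.ttest_ind(intensities_37[p], intensities_18[p]).pvalue for p in peptides]
    return dict(zip(peptides, bh_adjust(pvals)))
```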

Statistical analysis and data visualization

Statistical analysis was performed with R v4.1.2. For all box plot representations, the thick black line indicates the median, the box indicates the 25th and 75th percentiles, and the whiskers indicate 1.5 times the interquartile range. Figures were created in Adobe Illustrator; Fig. 1A, D were created with BioRender (biorender.com) and released under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Data availability

The mass spectrometry data have been deposited in the MassIVE repository under accession number MSV000091065 ( https://doi.org/10.25345/c5tm72991 ) and to the ProteomeXchange repository under accession number PXD039448. RNA-seq data have been deposited in NCBI’s Gene Expression Omnibus and are accessible through GEO Series accession number GSE226936 and are provided as Supplementary Data  1 (BAM files) and their DNA-seq chromatograms as Supplementary Data  2 (bl1 files). The raw fluorescent data obtained from microscopy images and the Western blot gels are provided as Source data files.  Source data are provided with this paper.

Youngman, E. M., McDonald, M. E. & Green, R. Peptide release on the ribosome: mechanism and implications for translational control. Annu. Rev. Microbiol. 62 , 353–373 (2008).

Burroughs, A. M. & Aravind, L. The Origin and Evolution of Release Factors: Implications for Translation Termination, Ribosome Rescue, and Quality Control Pathways. Int. J. Mol. Sci. 20 , 1981 (2019).

Swart, E. C., Serra, V., Petroni, G. & Nowacki, M. Genetic Codes with No Dedicated Stop Codon: Context-Dependent Translation Termination. Cell 166 , 691–702 (2016).

Mukai, T., Lajoie, M. J., Englert, M. & Söll, D. Rewriting the Genetic Code. Annu. Rev. Microbiol. 71 , 557–577 (2017).

Baranov, P. V., Atkins, J. F. & Yordanova, M. M. Augmented genetic decoding: global, local and temporal alterations of decoding processes and codon meaning. Nat. Rev. Genet. 16 , 517–529 (2015).

Eggertsson, G. & Söll, D. Transfer ribonucleic acid-mediated suppression of termination codons in Escherichia coli. Microbiol. Rev. 52 , 354–374 (1988).

Albers, S. et al. Repurposing tRNAs for nonsense suppression. Nat. Commun. 12 , 3850 (2021).

Steneberg, P. & Samakovlis, C. A novel stop codon readthrough mechanism produces functional Headcase protein in Drosophila trachea. EMBO Rep. 2 , 593–597 (2001).

Jungreis, I. et al. Evidence of abundant stop codon readthrough in Drosophila and other metazoa. Genome Res 21 , 2096–2113 (2011).

Schueren, F. et al. Peroxisomal lactate dehydrogenase is generated by translational readthrough in mammals. Elife 3 , e03640 (2014).

True, H. L. & Lindquist, S. L. A yeast prion provides a mechanism for genetic variation and phenotypic diversity. Nature 407 , 477–483 (2000).

Romero Romero, M. L., Landerer, C., Poehls, J. & Toth-Petroczy, A. Phenotypic mutations contribute to protein diversity and shape protein evolution. Protein Sci. 31 , e4397 (2022).

Namy, O., Duchateau-Nguyen, G. & Rousset, J.-P. Translational readthrough of the PDE2 stop codon modulates cAMP levels in Saccharomyces cerevisiae. Mol. Microbiol. 43 , 641–652 (2002).

Dunn, J. G., Foo, C. K., Belletier, N. G., Gavis, E. R. & Weissman, J. S. Ribosome profiling reveals pervasive and regulated stop codon readthrough in Drosophila melanogaster. Elife 2013 , 1–32 (2013).

Stiebler, A. C. et al. Ribosomal readthrough at a short UGA stop codon context triggers dual localization of metabolic enzymes in Fungi and animals. PLoS Genet 10 , e1004685 (2014).

Schueren, F. & Thoms, S. Functional Translational Readthrough: A Systems Biology Perspective. PLoS Genet 12 , e1006196 (2016).

Freitag, J., Ast, J. & Bölker, M. Cryptic peroxisomal targeting via alternative splicing and stop codon read-through in fungi. Nature 485 , 522–525 (2012).

Li, C. & Zhang, J. Stop-codon read-through arises largely from molecular errors and is generally nonadaptive. PLoS Genet 15 , e1008141 (2019).

Wohlgemuth, I. et al. Translation error clusters induced by aminoglycoside antibiotics. Nat. Commun. 12 , 1830 (2021).

Yanagida, H. et al. The Evolutionary Potential of Phenotypic Mutations. PLoS Genet 11 , e1005445 (2015).

Craigen, W. J. & Caskey, C. T. Expression of peptide chain release factor 2 requires high-efficiency frameshift. Nature 322 , 273–275 (1986).

Ballesteros, M., Fredriksson, Å., Henriksson, J. & Nyström, T. Bacterial senescence: Protein oxidation in non-proliferating cells is dictated by the accuracy of the ribosomes. EMBO J. 20 , 5280–5289 (2001).

Zhang, H. et al. Metabolic stress promotes stop-codon readthrough and phenotypic heterogeneity. Proc. Natl Acad. Sci. USA 117, 22167–22172 (2020).

Meyerovich, M., Mamou, G. & Ben-Yehuda, S. Visualizing high error levels during gene expression in living bacterial cells. Proc. Natl Acad. Sci. USA 107, 11543–11548 (2010).

Rosenberger, R. F. & Foskett, G. An estimate of the frequency of in vivo transcriptional errors at a nonsense codon in Escherichia coli. Mol. Gen. Genet. 183 , 561–563 (1981).

Fan, Y. et al. Heterogeneity of Stop Codon Readthrough in Single Bacterial Cells and Implications for Population Fitness. Mol. Cell 67 , 826–836.e5 (2017).

Lentini, L. et al. Toward a rationale for the PTC124 (Ataluren) promoted readthrough of premature stop codons: a computational approach and GFP-reporter cell-based assay. Mol. Pharm. 11 , 653–664 (2014).

Halvey, P. J., Liebler, D. C. & Slebos, R. J. C. A reporter system for translational readthrough of stop codons in human cells. FEBS Open Bio 2 , 56–59 (2012).

Buck, N. E., Wood, L., Hu, R. & Peters, H. L. Stop codon read-through of a methylmalonic aciduria mutation. Mol. Genet. Metab. 97 , 244–249 (2009).

Belin, D. & Puigbò, P. Why Is the UAG (Amber) Stop Codon Almost Absent in Highly Expressed Bacterial Genes? Life 12 , 431 (2022).

Bonetti, B., Fu, L., Moon, J. & Bedwell, D. M. The efficiency of translation termination is determined by a synergistic interplay between upstream and downstream sequences in Saccharomyces cerevisiae. J. Mol. Biol. 251, 334–345 (1995).

Wangen, J. R. & Green, R. Stop codon context influences genome-wide stimulation of termination codon readthrough by aminoglycosides. Elife 9 , e52611 (2020).

Blattner, F. R. et al. The complete genome sequence of Escherichia coli K-12. Science 277 , 1453–1462 (1997).

Uno, M., Ito, K. & Nakamura, Y. Functional specificity of amino acid at position 246 in the tRNA mimicry domain of bacterial release factor 2. Biochimie 78 , 935–943 (1996).

Baggett, N. E., Zhang, Y. & Gross, C. A. Global analysis of translation termination in E. coli. PLoS Genet 13 , e1006676 (2017).

Harrell, L., Melcher, U. & Atkins, J. F. Predominance of six different hexanucleotide readthrough signals 3’ of read-through stop codons. Nucleic Acids Res 30 , 2011–2017 (2002).

Martin, R., Phillips-Jones, M. K., Watson, F. J. & Hill, L. S. Codon context effects on nonsense suppression in human cells. Biochem. Soc. Trans. 21 , 846–851 (1993).

López-Maury, L., Marguerat, S. & Bähler, J. Tuning gene expression to changing environments: from rapid responses to evolutionary adaptation. Nat. Rev. Genet. 9 , 583–593 (2008).

Chittum, H. S. et al. Rabbit beta-globin is extended beyond its UGA stop codon by multiple suppressions and translational reading gaps. Biochemistry 37 , 10866–10870 (1998).

Blanchet, S., Cornu, D., Argentini, M. & Namy, O. New insights into the incorporation of natural suppressor tRNAs at stop codons in Saccharomyces cerevisiae. Nucleic Acids Res 42 , 10061–10072 (2014).

Mottagui-Tabar, S., Björnsson, A. & Isaksson, L. A. The second to last amino acid in the nascent peptide as a codon context determinant. EMBO J. 13 , 249–257 (1994).

Björnsson, A., Mottagui-Tabar, S. & Isaksson, L. A. Structure of the C-terminal end of the nascent peptide influences translation termination. EMBO J. 15 , 1696–1704 (1996).

Mottagui-Tabar, S. & Isaksson, L. A. Only the last amino acids in the nascent peptide influence translation termination in Escherichia coli genes. FEBS Lett. 414 , 165–170 (1997).

Brown, A., Shao, S., Murray, J., Hegde, R. S. & Ramakrishnan, V. Structural basis for stop codon recognition in eukaryotes. Nature 524 , 493–496 (2015).

Poole, E. S., Brown, C. M. & Tate, W. P. The identity of the base following the stop codon determines the efficiency of in vivo translational termination in Escherichia coli. EMBO J. 14 , 151–158 (1995).

Bossi, L. Context effects: translation of UAG codon by suppressor tRNA is affected by the sequence following UAG in the message. J. Mol. Biol. 164 , 73–87 (1983).

Mouzakis, K. D., Lang, A. L., Vander Meulen, K. A., Easterday, P. D. & Butcher, S. E. HIV-1 frameshift efficiency is primarily determined by the stability of base pairs positioned at the mRNA entrance channel of the ribosome. Nucleic Acids Res 41 , 1901–1913 (2013).

Keeling, P. J. & Leander, B. S. Characterisation of a non-canonical genetic code in the oxymonad Streblomastix strix. J. Mol. Biol. 326 , 1337–1349 (2003).

Kachale, A. et al. Short tRNA anticodon stem and mutant eRF1 allow stop codon reassignment. Nature 613 , 751–758 (2023).

Nilsson, G., Belasco, J. G., Cohen, S. N. & von Gabain, A. Effect of premature termination of translation on mRNA stability depends on the site of ribosome release. Proc. Natl Acad. Sci. USA 84, 4890–4894 (1987).

Belasco, J. G. All things must pass: contrasts and commonalities in eukaryotic and bacterial mRNA decay. Nat. Rev. Mol. Cell Biol. 11 , 467–478 (2010).

Li, W. & Lynch, M. Universally high transcript error rates in bacteria. Elife 9 , e54898 (2020).

Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6 , 26 (2011).

Lin, Z., Gilbert, R. J. C. & Brierley, I. Spacer-length dependence of programmed −1 or −2 ribosomal frameshifting on a U6A heptamer supports a role for messenger RNA (mRNA) tension in frameshifting. Nucleic Acids Res 40, 8674–8689 (2012).

Bao, C. et al. Specific length and structure rather than high thermodynamic stability enable regulatory mRNA stem-loops to pause translation. Nat. Commun. 13 , 988 (2022).

Halma, M. T. J., Ritchie, D. B. & Woodside, M. T. Conformational Shannon Entropy of mRNA Structures from Force Spectroscopy Measurements Predicts the Efficiency of −1 Programmed Ribosomal Frameshift Stimulation. Phys. Rev. Lett. 126 , 038102 (2021).

Ritchie, D. B. & Foster, D. A. N. Programmed −1 frameshifting efficiency correlates with RNA pseudoknot conformational plasticity, not resistance to mechanical unfolding. Biophys Comput. Biol. 109, 16167–16172 (2012).

MacBeath, G. & Kast, P. UGA read-through artifacts–when popular gene expression systems need a pATCH. Biotechniques 24 , 789–794 (1998).

Skerra, A. Use of the tetracycline promoter for the tightly regulated production of a murine antibody fragment in Escherichia coli. Gene 151 , 131–135 (1994).

Chang, A. C. & Cohen, S. N. Construction and characterization of amplifiable multicopy DNA cloning vehicles derived from the P15A cryptic miniplasmid. J. Bacteriol. 134 , 1141–1156 (1978).

Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9 , 676–682 (2012).

Luo, S., Wehr, N. B. & Levine, R. L. Quantitation of protein on gels and blots by infrared fluorescence of Coomassie blue and Fast Green. Anal. Biochem. 350 , 233–238 (2006).

Romero, M. L., Garcia Seisdedos, H. & Ibarra-Molero, B. Active site center redesign increases protein stability preserving catalysis in thioredoxin. Protein Sci . 31 , e4417 (2022).

Chen, C. et al. Dynamics of translation by single ribosomes through mRNA secondary structures. Nat. Struct. Mol. Biol. 20 , 582–588 (2013).

Houlihan, G. et al. Discovery and evolution of RNA and XNA reverse transcriptase function and fidelity. Nat. Chem. 12 , 683–690 (2020).

Burger, B., Vaudel, M. & Barsnes, H. Importance of Block Randomization When Designing Proteomics Experiments. J. Proteome Res . 20 , 122–128 (2021).

Knittelfelder, O. et al. Shotgun Lipidomics Combined with Laser Capture Microdissection: A Tool To Analyze Histological Zones in Cryosections of Tissues. Anal. Chem. 90 , 9868–9878 (2018).

Warburg, O. & Christian, W. Spectrophotometric method for the determination of protein and nucleic acids. Biochem. Z . 310 , 384–421 (1941).

Amodei, D. et al. Improving Precursor Selectivity in Data-Independent Acquisition Using Overlapping Windows. J. Am. Soc. Mass Spectrom. 30 , 669–684 (2019).

Gillet, L. C. et al. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol. Cell. Proteom. 11 , O111.016717 (2012).

Demichev, V., Messner, C. B., Vernardis, S. I., Lilley, K. S. & Ralser, M. DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput. Nat. Methods 17 , 41–44 (2020).

Frankenfield, A. M., Ni, J., Ahmed, M. & Hao, L. Protein Contaminants Matter: Building Universal Protein Contaminant Libraries for DDA and DIA Proteomics. J. Proteome Res. 21 , 2104–2113 (2022).

McKinney, W. Data structures for statistical computing in Python. Proceedings of the 9th Python in Science Conference 56–61 (2010).

van der Walt, S., Colbert, S. C. & Varoquaux, G. The NumPy Array: A Structure for Efficient Numerical Computation. Comput. Sci. Eng. 13 , 22–30 (2011).

Cock, P. J. A. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25 , 1422–1423 (2009).

Adusumilli, R. & Mallick, P. Data Conversion with ProteoWizard msConvert. Methods Mol. Biol. 1550 , 339–368 (2017).

Egertson, J. D., MacLean, B., Johnson, R., Xuan, Y. & MacCoss, M. J. Multiplexed peptide analysis using data-independent acquisition and Skyline. Nat. Protoc. 10 , 887–903 (2015).

Gessulat, S. et al. Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning. Nat. Methods 16 , 509–518 (2019).

Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. 57 , 289–300 (1995).

Acknowledgements

We thank Andrej Shevchenko for his support and advice with mass spectrometry. We thank Lena Hersemann, Noreen Walker, and Andre Gohr from the Scientific Computing Facility of the MPI-CBG for helping with RNA-seq data and image analyses; Marc Bickle and Martin Stöter from the Technology Development Studio for guidance and assisting with the automatic microscope; Barbara Borgonovo, Eric Geertsma and Aliona Bogdanova from the Protein Biochemistry Facility of the MPI-CBG for helping with the western blot experiment; Julia Jarrells from the Cell Technology Facility of the MPI-CBG for the RNA extraction preparation and Sylke Winkler and Nicola Gscheidel from the Sequencing and Genotyping Facility of the MPI-CBG for the RNA-seq experiments. We thank Michele Marass and Miri Trainic for their help with scientific writing and editing.

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and affiliations.

Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany

Maria Luisa Romero Romero, Jonas Poehls, Anastasiia Kirilenko, Doris Richter, Tobias Jumel, Anna Shevchenko & Agnes Toth-Petroczy

Center for Systems Biology Dresden, 01307, Dresden, Germany

Maria Luisa Romero Romero, Jonas Poehls, Anastasiia Kirilenko, Doris Richter & Agnes Toth-Petroczy

Cluster of Excellence Physics of Life, TU Dresden, 01062, Dresden, Germany

Agnes Toth-Petroczy

Contributions

M.L.R.R. and A.T.P. designed the project. M.L.R.R., A.K., D.R. performed experiments. J.P., T.J. and A.S. performed mass spectrometry experiments and analyses. M.L.R.R. analysed the data and prepared all the figures. M.L.R.R., A.T.P., J.P., A.S. interpreted the data. M.L.R.R. and A.T.P. wrote the manuscript with the help of all co-authors. A.T.P. acquired funding.

Corresponding authors

Correspondence to Maria Luisa Romero Romero or Agnes Toth-Petroczy .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information, Peer Review File, Description of Additional Supplementary Files, Supplementary Data 1, Supplementary Data 2, Supplementary Data 3, Reporting Summary, Source Data.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Romero Romero, M.L., Poehls, J., Kirilenko, A. et al. Environment modulates protein heterogeneity through transcriptional and translational stop codon readthrough. Nat Commun 15 , 4446 (2024). https://doi.org/10.1038/s41467-024-48387-x


Received : 22 February 2023

Accepted : 25 April 2024

Published : 24 May 2024

DOI : https://doi.org/10.1038/s41467-024-48387-x



The Carthamus tinctorius L. genome sequence provides insights into synthesis of unsaturated fatty acids

  • Yuanyuan Dong 1,
  • Xiaojie Wang 2,
  • Naveed Ahmad 1,
  • Yepeng Sun 1,
  • Yuanxin Wang 1,
  • Xiuming Liu 1,
  • Yang Jing 1,
  • Linna Du 1,
  • Xiaowei Li 1,
  • Nan Wang 1,
  • Weican Liu 1,
  • Fawei Wang 1,
  • Xiaokun Li 2 &
  • Haiyan Li 3

BMC Genomics volume 25, Article number: 510 (2024)


Domesticated safflower (Carthamus tinctorius L.) is a widely cultivated edible oil crop. However, despite its economic importance, the genetic basis underlying key traits such as oil content, resistance to biotic and abiotic stresses, and flowering time remains poorly understood. Here, we present a genome assembly for the C. tinctorius variety Jihong01, obtained by integrating Oxford Nanopore Technologies (ONT) and BGI-SEQ500 sequencing data. The assembled genome was 1,061.1 Mb and contained 32,379 protein-coding genes, 97.71% of which were functionally annotated. Safflower underwent a recent whole genome duplication (WGD) event in its evolutionary history and diverged from sunflower approximately 37.3 million years ago. Through comparative analysis across five seed development stages, we identified pivotal roles for fatty acid desaturase 2 (FAD2) and fatty acid desaturase 6 (FAD6) in linoleic acid (LA) biosynthesis, and differential gene expression analysis together with measurements of seed fatty acid composition further reinforced the importance of these genes in LA accumulation. These findings offer important insights for breeding programs aimed at improving quality traits and provide a reference resource for further research on the natural properties of safflower.


Introduction

Safflower (Carthamus tinctorius L.) is a diploid (2n = 24) dicot plant in the family Asteraceae (Compositae) [1]. It is an annual, predominantly self-pollinated herbaceous crop that is adapted to hot and dry environments owing to its deep root system and xerophytic spines, and it is therefore widely cultivated in arid and semiarid regions [2]. Safflower is thought to have been domesticated in the Fertile Crescent region over 4,000 years ago, and it has a long history of cultivation in Asia, the Mediterranean region, Europe, and the Americas [3, 4, 5].

Safflower is grown mainly as an oil crop, and it has also been cultivated for use as birdseed and as a source of oil for the paint industry [6, 7]. In some areas, such as Western Europe, safflower is cultivated as a source of Safflor Yellow (SY), a natural dyestuff produced in the floret [8]. Safflower is valuable as an edible oil crop because its seeds contain a large amount of oil (approximately 25%). Its oil has a higher polyunsaturated/saturated ratio than most other edible oils, being rich in octadecadienoic acid and containing more than 70% LA [9]. As an essential polyunsaturated fatty acid (PUFA), LA is a vital dietary component for both humans and animals. Safflower is one of the oldest sources of oil for humans worldwide, and the main economic traits of cultivated varieties relate to the proportion of LA in their oil [10, 11]. Fatty acid desaturases such as FAD2 play a crucial role in regulating fatty acid composition, including LA content; these enzymes catalyze the desaturation reactions necessary for the synthesis of unsaturated fatty acids from saturated or monounsaturated precursors [12]. Although safflower is not a mainstream oilseed crop today, it has been cultivated widely across various geographic regions, and its species diversity could serve as an important resource for genetic breeding.

The genetic diversity and natural variation of safflower have been studied using several molecular and analytical methods in recent decades [3, 13]. More recent studies have provided molecular resources for safflower, including its complete chloroplast genome [14], a full-length transcriptome [15], and the locations of 2,008,196 single nucleotide polymorphisms identified from recombinant inbred safflower lines [16]. The results of these and other studies indicate that the genetic architecture and evolution of safflower domestication are complex [17], which has posed challenges for safflower breeding. Past breeding programs have used hybridization to develop new cultivars [18] and have characterized safflower germplasm using various molecular markers, including expressed sequence tags, inter simple sequence repeats, single nucleotide polymorphisms, and simple sequence repeats [15, 19, 20, 21, 22]. Genome evolution involves intricate mechanisms such as gene duplication, divergence, and selection, which shape the genetic landscape of organisms over time; gene duplication in particular is a significant driver of genome evolution, often leading to the expansion of gene families [23, 24]. A high-quality safflower reference genome and genome evolution research can therefore reveal genetic structure and phylogenetic details, as well as the biosynthetic pathways of bioactive compounds. The first sequenced safflower genome, that of cultivar Anhui-1, was generated with PacBio Sequel (Pacific Biosciences) in combination with the Illumina HiSeq 2500 platform, and that study primarily targeted the biosynthetic pathways of hydroxysafflor yellow A and unsaturated fatty acids [25].

To deepen our understanding of the genetic landscape of safflower, we assembled the genome of the Jihong01 safflower cultivar. This landrace is extensively cultivated and sourced from western China, and it is used as a main source for breeding novel safflower varieties with improved medicinal properties. Here we provide a genome overview of safflower that includes details of genome evolution, gene family expansion, and putative genes for unsaturated fatty acid biosynthesis and their composition. This reference genome will serve as a platform for investigating the genomic background of safflower and for identifying important genes to exploit in genetic breeding programs.

Genome sequencing and assembly

Genomic DNA was extracted from leaves of the “Jihong01” safflower variety and sequenced on the BGI-SEQ500 and Oxford Nanopore Technologies (ONT) platforms, yielding 101 Gb of short reads and 130 Gb of long reads in total. Based on 17-mer frequency statistics, the size of the safflower genome was estimated to be 1,061.1 Mb (Figure S1). Primary contigs were assembled with NECAT ( https://github.com/xiaochuanle/NECAT ), giving an N50 of 8.6 Mb. The initial assembly was error-corrected with Pilon [26], and redundant sequences were removed with HaploMerger2 [27]. After several rounds of polishing, the total length of the final assembly was 1,061.1 Mb, close to the estimated genome size (Table S1). To generate a chromosome-level assembly, a high-throughput chromosome conformation capture (Hi-C) library was constructed, producing 51 Gb of valid reads (47×). In total, 98.82% of the final assembly was anchored to 12 pseudochromosomes with lengths ranging from 68.06 Mb to 106.66 Mb, and the GC content of the final assembly was 38.37%. Assembly completeness was evaluated with BUSCO (Benchmarking Universal Single-Copy Orthologs) [28]: against the embryophyta_odb9 dataset, 90.3% of BUSCO genes were complete, 2.4% fragmented, and 7.3% missing. We also used LTR Assembly Index (LAI) [29] values to evaluate the quality of the non-coding regions of the assembly (Fig. 1). The LAI score of 21.94 indicates a high-quality safflower genome assembly.

Figure 1. Overview of the Carthamus tinctorius genome. (a) Chromosomal pseudomolecules. (b) GC content (1 Mb windows). (c) Gene density (1 Mb windows). (d) TE density (1 Mb windows). (e) LAI score (3 Mb windows with a 300 Kb sliding step). Inner grey ribbons indicate links between synteny blocks, while colored ribbons highlight the residues of the whole genome triplication.

Genome annotation

We identified 63.4% of the assembly as repetitive sequences. This proportion is close to that of artichoke (58.4%) [30] but lower than those of sunflower (74.7%) [31] and lettuce (74.2%) [32], which may explain why the lettuce and sunflower genomes are two to three times larger than those of artichoke and safflower [33]. The most abundant transposable elements (TEs) were long terminal repeat (LTR) retrotransposons, accounting for 54.2% of the assembly. As in most plants, Gypsy (26.9%) and Copia (25.2%) were the two dominant LTR superfamilies. The average LTR insertion time, estimated from the sequence divergence of all LTRs, was 1.5 Mya, later than in artichoke (Figure S2). DNA transposons covered 7.1% of the assembly. A total of 32,379 protein-coding genes were predicted using a combination of homology-based prediction and transcript evidence, and the distributions of gene set parameters showed trends consistent with other plants (Figure S3). The gene set covered 93.9% of the embryophyte BUSCO groups as complete. In total, 97.71% of predicted proteins were functionally annotated against public protein databases (InterPro, UniProt and KEGG). Besides protein-coding genes, we also annotated 131 miRNAs, 998 tRNAs, 3,017 rRNAs and 1,408 snRNAs.

Gene family and phylogeny analysis

A phylogenetic tree was reconstructed based on the coding sequences (CDS) of 212 single-copy gene families. Monocotyledons, Rosidae and Asteridae were separated into their respective branches, and each species clustered at its reported evolutionary position. Notably, safflower and sunflower clustered into one branch (Fig. 2a). The divergence time of safflower and sunflower was estimated at approximately 37.3 Mya, after the whole genome duplication event at the base of the Asteraceae family [34]. For comparative analysis, we combined our predicted proteins with high-quality proteins of 10 oil plant species (oil palm: Elaeis guineensis, soybean: Glycine max, sunflower: Helianthus annuus, Jatropha: Jatropha curcas, walnut: Juglans regia, flax: Linum usitatissimum, olive tree: Olea europaea, castor: Ricinus communis, sesame: Sesamum indicum and maize: Zea mays). All proteins were clustered into 27,600 gene families with the OrthoMCL [35] pipeline, of which 495 gene families were safflower-specific. Compared with olive tree and sesame, safflower shared more gene families with sunflower (Fig. 2b), indicating a closer relationship between safflower and sunflower.

Figure 2. Comparative analysis of Carthamus tinctorius with other oil crops. (a) Phylogeny, divergence times and gene family expansion/contraction of 11 species. Green numbers indicate families under expansion and red numbers families under contraction; the stacked column on the right shows the ortholog gene categories in the 11 species. (b) Venn diagram of shared gene families among safflower, sesame, olive tree and sunflower.

Furthermore, gene family size changes were evaluated with the CAFE software [36]. In safflower, 516 gene families showed expansion, while 2,126 gene families showed contraction. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was applied to the expanded gene families, in which 225 of 2,751 genes were markedly enriched in fatty acid biosynthesis and metabolism pathways, including linoleic acid metabolism (map00591), alpha-linolenic acid metabolism (map00592) and biosynthesis of unsaturated fatty acids (map01040) (Figure S4). The expansion of gene families involved in fatty acid metabolism, especially unsaturated fatty acid biosynthesis, may underlie the high oil production of safflower.
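The text does not state which statistical test underlies the KEGG enrichment; a one-sided hypergeometric test is a common choice for this kind of gene-set analysis. The following Python sketch illustrates such a test under that assumption; only the 32,379 gene total and the 225/2,751 overlap come from the text, and the pathway annotation size is an invented illustrative number.

```python
from scipy.stats import hypergeom

genome_genes = 32379    # annotated protein-coding genes (from the text)
pathway_genes = 400     # genes annotated to the fatty-acid pathways (illustrative assumption)
expanded_genes = 2751   # genes in expanded families (from the text)
overlap = 225           # expanded-family genes falling in those pathways (from the text)

# one-sided enrichment test: P(X >= overlap) under the hypergeometric null
p_value = hypergeom.sf(overlap - 1, genome_genes, pathway_genes, expanded_genes)
print(f"enrichment p-value = {p_value:.2e}")
```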

WGD in safflower genome

As a member of the Compositae family, safflower underwent a recent whole genome duplication (WGD) event in its evolutionary history (38–50 Mya) (Figure S5). We used the wgd pipeline [37] to calculate the Ks distributions of paralogs in safflower, sunflower, lettuce, artichoke and coffee tree (Fig. 3a). After the γ duplication event in eudicots (the peak of coffee tree), safflower experienced another WGD event, which is also found in artichoke and lettuce (Ks ∼ 0.75–1). This round of duplication was in fact a triplication event, as illustrated by the triplication residues (Fig. 1) and the triplicated synteny blocks between coffee tree and safflower. Within the Compositae, two rounds of WGD events occurred in Heliantheae species, consistent with the several 1:2 synteny blocks observed between safflower and sunflower (Fig. 3b).

Figure 3. WGD events in the Carthamus tinctorius evolutionary history. (a) Histogram of paralog Ks distributions for five species. (b) Macrosynteny of gene regions among coffee, safflower and sunflower. Grey lines indicate synteny blocks between each pair of species; red lines highlight the 1:3 synteny relationship between coffee and safflower and the 1:2 relationship between safflower and sunflower.

Unsaturated fatty acid biosynthesis pathway

We annotated a total of 1,586 genes involved in lipid metabolism in the safflower genome, of which 96 genes were functionally assigned to the biosynthesis of unsaturated fatty acids pathway (Table S2). In plants, fatty acid desaturases (FADs) catalyse the desaturation of fatty acids. Stearoyl-ACP desaturase (SAD) is a soluble FAD in plastids that converts stearic acid (C18:0) to oleic acid (C18:1), and FAD2 and FAD6 catalyse the further desaturation of oleic acid to linoleic acid (C18:2). FAD2 is localised in the endoplasmic reticulum (ER), while FAD6 is localised in the plastid inner envelope. A previous study isolated 11 members of the FAD2 family in safflower [6]. In our assembly, 29 copies of FAD2 genes were annotated, as well as 4 SAD genes and 1 FAD6 gene (Table S3). The FAD2 genes were amplified by tandem duplication and form two gene clusters, located on chromosome 9 and chromosome 11, respectively (Fig. 4a). Phylogenetic analysis of the FAD2 gene family in safflower and other oil crops indicated that the family is significantly expanded in safflower and sunflower; multiple copies of FAD2 genes were also found in the flax genome [38] (Fig. 4b).

Figure 4. FAD2 gene clusters. (a) Two FAD2 gene clusters on chromosome 9 and chromosome 11; the line indicates the chromosome segment and brown arrows indicate FAD2 genes. (b) Phylogenetic analysis of the FAD2 gene family among 11 species [31, 38]; each circle indicates a FAD2 gene and colors represent species.

We also sequenced the transcriptome of seed tissue at five developmental stages after flowering (days after flowering, DAF): DAF6, DAF12, DAF18, DAF24 and DAF30, with three biological replicates per stage. The expression patterns of the 96 genes likely involved in the biosynthesis of unsaturated fatty acids pathway were analysed using the Mfuzz package [39] (Table S2). The upregulation of FAD2 genes observed from DAF18 to DAF24 (Figure S6, clusters 2 and 7) implies that linoleic acid biosynthesis is most strongly induced during this window of seed development. Moreover, some genes were strongly expressed at DAF6 but suppressed as development progressed (Figure S6, cluster 9); these include genes related to acyl-CoA oxidase and very-long-chain enoyl-CoA reductase.
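The expression-pattern clustering itself was performed with the Mfuzz R package; as a rough illustration of the same idea in Python, the sketch below z-scores each gene's temporal profile and groups the profiles with k-means (a hard-clustering simplification of Mfuzz's soft clustering). The expression matrix here is randomly generated purely as a placeholder.

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder matrix: 96 genes x 5 stages (DAF6, DAF12, DAF18, DAF24, DAF30).
# In practice this would be the mean expression of each pathway gene per stage.
rng = np.random.default_rng(0)
expr = rng.lognormal(mean=2.0, sigma=1.0, size=(96, 5))

# Standardise each gene's profile so clustering reflects the shape of the
# temporal trend rather than the absolute expression level.
z = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)

labels = KMeans(n_clusters=9, n_init=10, random_state=0).fit_predict(z)
for cluster in range(9):
    print(f"cluster {cluster}: {np.sum(labels == cluster)} genes")
```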

Changes in the fatty acid composition and levels during seed developmental stages

Fatty acids are essential for plant growth and development. FAs are synthesized in plastids and, to a large extent, transported to the endoplasmic reticulum for modification and lipid assembly; many genes participate in lipid metabolism within the plastid and endoplasmic reticulum, particularly in fatty acid elongation and desaturation (Fig. 5a). Fatty acid composition and content are the most important indicators of lipid quality. We therefore examined the fatty acid composition of seed storage lipids in developing seeds, measuring the fatty acid contents of seeds at five developmental stages by GC-MS. Compositional analyses of seed oil revealed that palmitic acid (C16:0), stearic acid (C18:0), oleic acid (C18:1) and linoleic acid (C18:2) accounted for the predominant proportion of the lipid content in safflower (Fig. 5b). The contents of oleic acid (C18:1) and linoleic acid (C18:2) were highest at the maturity stage: C18:1n7, C18:1n9 and C18:2n6 reached 1,776, 1,666 and 1,510 µg/g DW, respectively, representing 29.9-, 5.4- and 2.6-fold increases compared with the early stage of grain formation. C18:3n3 and C18:3n6 were not among the most abundant PUFAs, and their contents first decreased and then increased during seed development. In addition, most fatty acids increased in the seeds, with the exception of C14:1, which decreased. The composition and content of fatty acids in safflower seeds during seed development indicate that the synthesis and accumulation of the polyunsaturated fatty acid C18:2 is the main factor determining the oil quality of safflower seeds.

To better understand the relationship between genes and fatty acid species, Pearson correlation tests were performed between fatty acid contents and gene expression patterns across the seed development stages. A total of 28 genes were significantly correlated with C18:1, C18:2 and C18:3 molecular species (Pearson correlation coefficient > 0.7 and p-value < 0.05). Among them, the expression patterns of 15 FAD2 genes (Cti_chr11_01896, Cti_chr9_01626, Cti_chr11_01899, Cti_chr11_01897, Cti_chr9_01627, Cti_chr11_01898, Cti_chr9_01634, Cti_chr9_01616, Cti_chr9_01625, Cti_chr9_01617, Cti_chr3_02112, Cti_chr3_02111, Cti_chr11_01894, Cti_chr10_00208, Cti_chr4_00382) and 4 FAD6 genes (Cti_chr11_01893, Cti_chr7_00474, Cti_chr11_01895, Cti_chr3_02287) were positively correlated with the C18:1n7, C18:1n9 and C18:2n6 accumulation patterns during seed development (Fig. 6). Notably, the expression patterns of two FAD6 genes, Cti_chr11_01893 and Cti_chr11_01895, showed significant correlation with C18:2 contents during seed development. These results indicate that FAD2 and FAD6 genes are likely responsible for the high proportion of C18:2 in developing safflower seeds.

Figure 5. Fatty acid biosynthesis and contents of seed lipids. (a) Fatty acid and oil biosynthesis in safflower. KAS II, β-ketoacyl-ACP synthetase; SAD, stromal stearoyl-ACP desaturase; FATA, acyl-ACP thioesterase; G3P, glycerol-3-phosphate; GPAT, glycerol-3-phosphate acyltransferase; LPA, lysophosphatidic acid; LPAAT, lysophosphatidic acid acyltransferase; PA, phosphatidic acid; FAD2/3/6/7, fatty acid desaturases; DAG, diglycerides; TAG, triglycerides. (b) Fatty acid contents of seed storage lipids at different developmental stages.

Figure 6. Correlation coefficients between gene expression levels and fatty acid contents. * p < 0.05
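The correlation screen described above can be sketched in a few lines of Python with scipy. The expression and fatty-acid values below are invented placeholders for a single gene–metabolite pair; only the r > 0.7 and p < 0.05 thresholds come from the text.

```python
import numpy as np
from scipy.stats import pearsonr

# Placeholder profiles across the five stages (DAF6, DAF12, DAF18, DAF24, DAF30),
# averaged over replicates: expression of one FAD2 copy and C18:2n6 content (µg/g DW).
gene_expr = np.array([3.1, 5.4, 18.2, 42.7, 40.9])
c18_2n6 = np.array([120.0, 260.0, 610.0, 1380.0, 1510.0])

r, p = pearsonr(gene_expr, c18_2n6)
if r > 0.7 and p < 0.05:
    print(f"significant positive correlation: r = {r:.2f}, p = {p:.4f}")
```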

Discussion

In the present study, we report the complete genome sequence of safflower, an economically important crop. We present insights into the genetic organization of safflower that facilitate the identification of key functional genes implicated in fatty acid synthesis, and these genomic resources are readily accessible to researchers for future functional and molecular breeding studies. Previous studies on the Compositae have reported genome sequences for H. annuus [31], lettuce (Lactuca sativa) [32], and globe artichoke (Cynara cardunculus var. scolymus) [30], providing resources for comprehensive analyses of genome evolution, functional gene exploration, metabolic pathway construction, and molecular breeding programs.

Our results reveal a high-quality safflower genome with a size of 1,061.1 Mb and 12 pseudochromosomes. The ancestral species of safflower exhibit a range of karyotypes, with 10, 11, 12, 22, and 32 pairs of chromosomes, and many of them are self-incompatible [40]. The current karyotype of cultivated safflower could have originated from a wild C. tinctorius ancestor with a 2n = 24 karyotype. This high-quality genome will be useful for analysing the sequences of related species and provides genetic evidence for the nutritional compounds encoded in the safflower genome. Our analysis also revealed that 63.4% of the assembled genome comprises repetitive sequences, a percentage notably close to that of artichoke (58.4%) yet lower than those of sunflower (74.7%) and lettuce (74.2%); the relatively larger genome sizes of sunflower and lettuce may be related to their larger amounts of repetitive sequence. It is widely believed that transposable elements play a dominant role in genome size growth, and much of the variation in plant genome size can be attributed to the continuous accumulation of these elements. For instance, the sunflower genome is 3.6 Gb and the lettuce genome is 2.38 Gb, nearly two to three times the size of the artichoke (1.08 Gb) and safflower (1.06 Gb) genomes. Although the influence of the WGD event should also be taken into consideration for the large sunflower genome, our analysis suggests that repetitive sequences have contributed significantly to the genome sizes of both sunflower and lettuce. These disparities in repetitive sequence content provide crucial insights into the genomic architecture of these species, and the higher proportion of repetitive elements in the sunflower and lettuce genomes may contribute to their larger genome sizes compared with artichoke and safflower.

The genome of the first sequenced safflower cultivar, ‘Anhui-1’, was generated using PacBio Sequel (Pacific Biosciences) combined with the Illumina HiSeq 2500 platform, with a focus on the biosynthetic pathways of LA and hydroxysafflor yellow A (HSYA) [25]. In this work, we sequenced and de novo assembled the genome of the safflower cultivar ‘Jihong01’ using the BGI-SEQ500 and Oxford Nanopore Technologies (ONT) platforms, obtaining another high-quality assembly, and we paid particular attention to unsaturated fatty acid (UFA) contents, rather than saturated fatty acids, during seed development. Seed oil fatty acid composition continues to be an important trait for safflower breeding [41]. Recent studies on the molecular mechanisms of lipid metabolism have identified a number of genes that form the genetic basis of this trait, and the factors determining seed oil composition have proven complex. In cultivated safflower, lipid metabolic pathways underlie the natural variation in seed oil content, and candidate genes involved in lipid metabolism have been identified in the genome and reported previously. Unsaturated fatty acid synthesis in the ER is important during safflower seed development, and our analyses indicate that the genes shaping fatty acid composition in safflower seeds have undergone expansion during evolution. These analyses may provide essential clues about the biochemical relevance of lipid composition in seeds.

In terms of oil fatty acid composition, the proportion of LA is lower in sesame (C18:2, 32.95–52.94%) [42] than in safflower (C18:2, 63.9–76.1%) [41, 43, 44]. However, the number of genes involved in fatty acid elongation, biosynthesis, and degradation is similar among the genomes of safflower, sunflower, sesame [42], grape [45], capsicum [46], and Arabidopsis [47]. A few key enzymes in the desaturation pathway regulate unsaturated fatty acid biosynthesis in the ER. Genetic evolution, including genome duplication and gene family expansion, is crucial for generating new gene functions and/or intensifying pathways [48, 49], and divergence from an ancestral genome can result in an evolutionary bias towards the production of specific natural products; the emergence of duplicates can lead to gene expansion, contraction, or loss [50]. The FAD2 gene family, which encodes enzymes that catalyse linoleic acid biosynthesis, has expanded via tandem duplication to form two gene clusters located on chromosome 9 and chromosome 11, respectively, indicating that tandem duplication likely contributed to gene family expansion in safflower. In particular, the FAD2 and FAD6 homologs involved in unsaturated fatty acid biosynthesis showed their highest transcript levels at the DAF18 and DAF24 stages of seed development, and previous studies found that their expression patterns were related to LA content, with high transcript levels during seed development [41, 51]. These findings provide substantial novel insights into the reasons for the high proportion of LA in safflower oil.

The assembled safflower genome sequence consists mostly of repeats, coding and non-coding RNAs, and other related sequences. This information allowed us to reconstruct the evolutionary history of safflower, which includes a large-scale whole-genome duplication event. Candidate genes, including FAD2 and FAD6, encode key functional enzymes related to LA composition; the FAD2 family is expanded in safflower, and correlation analysis of gene expression with fatty acid contents indicates that specific FAD2 and FAD6 genes could be responsible for the synthesis of the large amounts of LA.

Conclusions

The safflower genome assembly represents a cornerstone for future research aimed at exploiting the economic properties of safflower, while also taking into account agricultural constraints and human nutritional needs, and for advancing molecular breeding programs aimed at producing new safflower cultivars. The candidate FAD2 and FAD6 genes revealed by our integrated approach provide a genetic resource for studying unsaturated fatty acid biosynthesis and a genetic landscape for safflower germplasm utilization.

Experimental procedures

DNA extraction and sequencing

The plant material of safflower (cultivar Jihong01, deposited in the Engineering Research Center of Bioreactor and Pharmaceutical Development, Ministry of Education, JLAU) used in this study was identified by Yuanyuan Dong. Jihong01 seedlings were collected and planted in an experimental field of Jilin Agricultural University, Changchun City, China, and the material is stored in the Engineering Research Center of Bioreactor and Pharmaceutical Development, Ministry of Education. High-quality genomic DNA was extracted from Jihong01 leaves using the MolPure® Plant DNA Kit (Yeasen, China). Large DNA fragments were then size-selected using the automated BluePippin system. The DNA was end-repaired and dA-tailed, ligated to adaptors using the ONT 1D ligation sequencing kit, and the prepared library was loaded onto flow cells and sequenced on the Nanopore PromethION platform.

Genome assembly

K-mer frequencies were calculated with Jellyfish v2.26 [52] and the genome size was estimated using GenomeScope [53]. Initial contigs were assembled with NECAT under default parameters using Nanopore reads longer than 5 kb, and the initial assembly was error-corrected with Pilon using the short reads. Because the initial assembly was slightly larger than the estimated genome size, we used HaploMerger2 [27] to remove redundant contigs, after which the assembly was error-corrected again. Reads from the Hi-C library were strictly filtered with the HiC-Pro pipeline [54] to remove invalid read pairs, and we used Juicer [55] and the 3D de novo assembly (3D-DNA) pipeline [56] to anchor contigs to pseudochromosomes.
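GenomeScope fits a full mixture model to the k-mer histogram; the classic back-of-the-envelope estimate that underlies it, genome size ≈ total k-mers ÷ homozygous peak depth, can be sketched as follows. The histogram file name and the error-depth cutoff are assumptions made for illustration.

```python
import numpy as np

# Two-column k-mer histogram as produced by `jellyfish histo`:
# column 1 = k-mer depth, column 2 = number of distinct k-mers at that depth.
hist = np.loadtxt("kmer_17.histo")          # hypothetical file name
depth, count = hist[:, 0], hist[:, 1]

# Ignore very low-depth k-mers, which are dominated by sequencing errors,
# then locate the homozygous coverage peak.
valid = depth >= 5                          # illustrative error cutoff
peak_depth = depth[valid][np.argmax(count[valid])]

# Genome size ≈ total number of k-mers divided by the peak depth.
genome_size = (depth[valid] * count[valid]).sum() / peak_depth
print(f"estimated genome size ≈ {genome_size / 1e6:.1f} Mb")
```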

Repeat annotation and LTR insertion time

Repetitive sequences were annotated using a combination of de novo and homology-based strategies. We used the RepeatModeler [57], LTR_FINDER [58] and TRF [59] software for de novo repeat identification based on repetitive sequence features, and RepeatMasker and RepeatProteinMask were then used to annotate transposable elements based on RepBase. LTR insertion times were estimated from the divergence of LTR pairs. Intact LTRs were identified with LTR_FINDER in the four Compositae genomes; we then used MUSCLE [60] to align LTR pairs and distmat to calculate K values under the Kimura two-parameter model. From the K value of each LTR pair, the insertion time was calculated with the formula T = K/(2r), where r is the rate of nucleotide substitution, set here to 7 × 10⁻⁹ per site per generation [61, 62].
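The T = K/(2r) conversion is simple enough to check by hand; the sketch below uses the substitution rate from the text and an illustrative divergence value chosen to reproduce the ~1.5 Mya average insertion time reported in the Results.

```python
R = 7e-9  # nucleotide substitutions per site per generation (value used in the text)

def insertion_time(k, r=R):
    """LTR insertion time T = K / (2r), in generations (≈ years for an annual plant)."""
    return k / (2 * r)

# Illustrative divergence between the two LTRs of a single element.
k_example = 0.021
print(f"T ≈ {insertion_time(k_example) / 1e6:.1f} million years")  # ≈ 1.5 Mya
```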

Gene prediction and function annotation

Protein-coding genes were predicted based on both homologous proteins and transcripts. Proteins of seven plant species ( Arachis hypogaea (GCF_003086295.2), Brassica napus (GCF_000686985.2), Glycine max (GCF_000004515.5), Helianthus annuus (GCF_002127325.1), Lactuca sativa (GCF_002870075.1), Ricinus communis (GCF_000151685.1) and Sesamum indicum (GCF_000512975.1)) were downloaded from the NCBI database. We first aligned these proteins to the assembly using BLAT [63], and the alignments were then passed to GeneWise [64] to obtain homology-based annotations. RNA-seq reads were mapped to the assembly using HISAT2 [65] and transcripts were assembled using StringTie [66]. All evidence was integrated into the final protein-coding gene set with GLEAN [67]. Protein-coding genes were scanned for conserved protein domains in the ProDom, ProSiteProfiles, SMART, PANTHER, Pfam, PIRSF and ProSitePatterns databases using InterProScan [68], and amino acid sequences were aligned to the Swiss-Prot, TrEMBL and Kyoto Encyclopedia of Genes and Genomes (KEGG) protein databases using BLASTP (e-value < 1e-5) for functional annotation.

Comparative analysis

All protein-coding gene sequences of the 10 oil plant species ( Elaeis guineensis (GCF_000442705.1), Glycine max, Helianthus annuus, Jatropha curcas (GCF_000696525.1), Juglans regia (GCF_001411555.1), Linum usitatissimum (Linum usitatissimum v1.0), Olea europaea (GCF_002742605.1), Ricinus communis, Sesamum indicum and Zea mays (GCF_000005005.2)) were downloaded from the NCBI or Phytozome databases. The longest transcript of each gene without frameshifts or internal stop codons was selected and translated into an amino acid sequence for subsequent analyses. We first performed an all-against-all protein alignment with BLASTP at an e-value cutoff of 1e-5, and ortholog genes were then clustered into groups using OrthoMCL with a Markov inflation index of 1.5 and a maximum e-value of 1e-5. One-to-one single-copy ortholog groups were joined head-to-tail into a super-gene for each species and aligned using MUSCLE [60]. The phylogenetic tree was constructed using RAxML [69] with Z. mays and E. guineensis as outgroups, under the GTR model with optimization of substitution rates and the GAMMA model of rate heterogeneity. Divergence times were estimated using MCMCTREE in the PAML package [70] based on the HKY85 substitution model and a correlated-rates molecular clock. The size changes of each gene family were calculated with CAFE [36] under the random birth and death model.
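The “super-gene” construction, concatenating the aligned one-to-one orthologs head-to-tail per species, can be sketched with Biopython as below. The directory layout and the species-prefixed record IDs are assumptions made for illustration, and the species list is truncated for brevity.

```python
from pathlib import Path
from Bio import SeqIO

# Assumed layout: one aligned FASTA per single-copy ortholog family, each record
# ID prefixed with the species name, e.g. ">Carthamus_tinctorius|Cti_chr9_01626".
species = ["Carthamus_tinctorius", "Helianthus_annuus", "Sesamum_indicum", "Zea_mays"]
supermatrix = {sp: [] for sp in species}

for aln_file in sorted(Path("single_copy_alignments").glob("*.fasta")):
    records = {rec.id.split("|")[0]: str(rec.seq) for rec in SeqIO.parse(str(aln_file), "fasta")}
    aln_len = len(next(iter(records.values())))
    for sp in species:
        # Pad with gaps if a species is missing (should not occur for strict 1:1 orthologs).
        supermatrix[sp].append(records.get(sp, "-" * aln_len))

# Write the concatenated super-gene alignment, ready for tree building.
with open("supermatrix.fasta", "w") as out:
    for sp in species:
        out.write(f">{sp}\n{''.join(supermatrix[sp])}\n")
```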

WGD events and synteny

We used the wgd software [37] to identify WGD events in the evolutionary histories of safflower, sunflower, lettuce, artichoke (GCF_002870075.1) and coffee tree (AUK_PRJEB4211_v1). Synteny blocks between safflower and sunflower and between safflower and coffee tree were identified and displayed using the jcvi package [71].

Identification of unsaturated fatty acid biosynthesis genes

We used BLASTP to identify safflower FAD2 and FAD6 genes based on the amino acid sequences of Arabidopsis thaliana FAD2 (NP_187819.1) and FAD6 (NP_194824.1), with an e-value cutoff of 1e-10.
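This homology search amounts to a single blastp run against the predicted safflower proteome. A minimal sketch follows; the FASTA and database file names are hypothetical, and the proteome is assumed to have been formatted beforehand with makeblastdb -dbtype prot.

```python
import subprocess

# Run blastp: Arabidopsis FAD2/FAD6 proteins (NP_187819.1, NP_194824.1) as queries
# against the safflower protein set, keeping hits below the 1e-10 e-value cutoff.
subprocess.run(
    [
        "blastp",
        "-query", "athaliana_fad2_fad6.fasta",   # hypothetical query file
        "-db", "safflower_proteins",             # hypothetical BLAST database name
        "-evalue", "1e-10",
        "-outfmt", "6",                          # tabular output
        "-out", "fad_candidates.tsv",
    ],
    check=True,
)

# Column 2 of the tabular output holds the subject (safflower gene) IDs.
with open("fad_candidates.tsv") as fh:
    candidates = {line.split("\t")[1] for line in fh if line.strip()}
print(f"{len(candidates)} candidate desaturase genes")
```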

Fatty acid content analysis

Safflower seeds (Jihong01) were sourced from the Engineering Research Center of Bioreactor and Pharmaceutical Development, Ministry of Education, JLAU. Dry seed samples were collected at five stages of seed development (DAF6 (6 days after flowering), DAF12, DAF18, DAF24 and DAF30) for analysis of fatty acid content and composition. Fatty acid methyl esters (FAMEs) were prepared from each seed sample, and the fatty acids in these FAME samples were quantified by gas chromatography–mass spectrometry (GC-MS) on an Agilent Technologies 6890N/5975B system, following the methods described by Ecker et al. [72]. The fatty acids present, including saturated fatty acids (SFAs), monounsaturated fatty acids (MUFAs), and polyunsaturated fatty acids (PUFAs), were quantified against a standard fatty acid methyl ester mix.
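Quantification against a FAME standard mix boils down to converting peak areas to concentrations via a per-compound response factor and then normalising to seed dry weight. The sketch below shows a single-point-calibration version of that arithmetic; all numbers are invented placeholders, and the actual workflow followed Ecker et al. [72].

```python
# Known standard-mix concentrations (µg/mL) and their measured peak areas.
standard = {
    "C18:1n9": (100.0, 2.4e6),
    "C18:2n6": (100.0, 2.1e6),
}
# Peak areas measured in a seed extract, plus sample bookkeeping (all illustrative).
sample_areas = {"C18:1n9": 4.0e5, "C18:2n6": 3.2e5}
extract_volume_ml = 1.0
dry_weight_g = 0.010

for fa, (conc_std, area_std) in standard.items():
    response_factor = conc_std / area_std                     # µg/mL per unit area
    conc_sample = sample_areas[fa] * response_factor          # µg/mL in the extract
    content = conc_sample * extract_volume_ml / dry_weight_g  # µg per g dry weight
    print(f"{fa}: {content:.0f} µg/g DW")
```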

RNA sequencing

Total RNA was extracted from developing seeds at the five stages (DAF6 (6 days after flowering), DAF12, DAF18, DAF24 and DAF30) using a TRIzol Plus RNA Purification Kit following the manufacturer’s instructions, and RNA integrity and quantity were confirmed on an Agilent 2100 Bioanalyzer. The mRNA was captured with Oligo(dT) probes on magnetic beads, fragmented at high temperature, and reverse-transcribed into first-strand cDNA, which then served as a template for second-strand synthesis to produce double-stranded cDNA. Adaptors with dTTP tails were ligated to both ends of the dsDNA fragments, and the ligation products were amplified by PCR and circularized to generate a single-stranded circular (ssCir) library. The ssCir library was amplified by rolling circle amplification (RCA) to produce DNA nanoballs (DNBs), which were loaded onto a flow cell and sequenced on the DNBSEQ platform. Each sample was sequenced in triplicate.

Data availability

The raw sequence reads were deposited in China National GeneBank DataBase (CNGB db) under Project No. CNP0004859 and CNP0004861.

Chapman MA, Burke JM. DNA sequence diversity and the origin of cultivated safflower (Carthamus tinctorius L.; Asteraceae). BMC Plant Biol. 2007;7:60.


Dajue L, Mündel H-H. Safflower, Carthamus Tinctorius L. Volume 7. Bioversity International; 1996.

Panahi B, Ghorbanzadeh Neghab M. Genetic characterization of Iranian safflower (Carthamus tinctorius) using inter simple sequence repeats (ISSR) markers. Physiol Mol Biology Plants: Int J Funct Plant Biology. 2013;19(2):239–43.


McPherson MA, Good AG, Topinka AK, Yang RC, McKenzie RH, Cathcart RJ, Christianson JA, Strobeck C, Hall LM. Pollen-mediated gene flow from transgenic safflower (Carthamus tinctorius L.) intended for plant molecular farming to conventional safflower. Environ Biosaf Res. 2009;8(1):19–32.

Ahmad N, Li T, Liu Y, Hoang NQV, Ma X, Zhang X, Liu J, Yao N, Liu X, Li H. Molecular and biochemical rhythms in dihydroflavonol 4-reductase-mediated regulation of leucoanthocyanidin biosynthesis in Carthamus tinctorius L. Industrial Crops Prod. 2020;156:112838.

Cao S, Zhou XR, Wood CC, Green AG, Singh SP, Liu L, Liu Q. A large and functionally diverse family of Fad2 genes in safflower (Carthamus tinctorius L). BMC Plant Biol. 2013;13:5.


Matthaus B, Ozcan MM, Al Juhaimi FY. Fatty acid composition and tocopherol profiles of safflower (Carthamus tinctorius L.) seed oils. Nat Prod Res. 2015;29(2):193–6.


Hou Y, Wang Y, Liu X, Ahmad N, Wang N, Jin L, Yao N, Liu X. A cinnamate 4-HYDROXYLASE1 from Safflower promotes flavonoids Accumulation and stimulates antioxidant Defense System in Arabidopsis. Int J Mol Sci. 2023;24(6):5393.

Rapson S, Wu M, Okada S, Das A, Shrestha P, Zhou XR, Wood C, Green A, Singh S, Liu Q. A case study on the genetic origin of the high oleic acid trait through FAD2-1 DNA sequence variation in safflower (Carthamus tinctorius L). Front Plant Sci. 2015;6:691.

Li R, Beaudoin F, Ammah AA, Bissonnette N, Benchaar C, Zhao X, Lei C, Ibeagha-Awemu EM. Deep sequencing shows microRNA involvement in bovine mammary gland adaptation to diets supplemented with linseed oil or safflower oil. BMC Genomics. 2015;16:884.

Kazuma K, Takahashi T, Sato K, Takeuchi H, Matsumoto T, Okuno T. Quinochalcones and flavonoids from fresh florets in different cultivars of Carthamus tinctorius L. Biosci Biotechnol Biochem. 2000;64(8):1588–99.

Nguyen VC, Nakamura Y, Kanehara K. Membrane lipid polyunsaturation mediated by FATTY ACID DESATURASE 2 (FAD2) is involved in endoplasmic reticulum stress tolerance in Arabidopsis thaliana. Plant Journal: Cell Mol Biology. 2019;99(3):478–93.

Sehgal D, Rajpal VR, Raina SN, Sasanuma T, Sasakuma T. Assaying polymorphism at DNA level for genetic diversity diagnostics of the safflower (Carthamus tinctorius L.) world germplasm resources. Genetica. 2009;135(3):457–70.

Lu C, Shen Q, Yang J, Wang B, Song C. The complete chloroplast genome sequence of Safflower (Carthamus tinctorius L). Mitochondrial DNA Part DNA Mapp Sequencing Anal. 2016;27(5):3351–3.

Chen J, Tang X, Ren C, Wei B, Wu Y, Wu Q, Pei J. Full-length transcriptome sequences and the identification of putative genes for flavonoid biosynthesis in safflower. BMC Genomics. 2018;19(1):548.

Bowers JE, Pearl SA, Burke JM. Genetic Mapping of Millions of SNPs in Safflower (Carthamus tinctorius L.) via Whole-Genome Resequencing. G3 2016, 6(7):2203–2211.

Pearl SA, Bowers JE, Reyes-Chin-Wo S, Michelmore RW, Burke JM. Genetic analysis of safflower domestication. BMC Plant Biol. 2014;14:43.

Mayerhofer M, Mayerhofer R, Topinka D, Christianson J, Good AG. Introgression potential between safflower (Carthamus tinctorius) and wild relatives of the genus Carthamus. BMC Plant Biol. 2011;11:47.

Pearl SA, Burke JM. Genetic diversity in Carthamus tinctorius (Asteraceae; safflower), an underutilized oilseed crop. Am J Bot. 2014;101(10):1640–50.


Lee GA, Sung JS, Lee SY, Chung JW, Yi JY, Kim YG, Lee MC. Genetic assessment of safflower (Carthamus tinctorius L.) collection with microsatellite markers acquired via pyrosequencing method. Mol Ecol Resour. 2014;14(1):69–78.

Chapman MA, Hvala J, Strever J, Matvienko M, Kozik A, Michelmore RW, Tang S, Knapp SJ, Burke JM. Development, polymorphism, and cross-taxon utility of EST–SSR markers from safflower (Carthamus tinctorius L). Theor Appl Genet. 2009;120(1):85–91.

Yang Y-X, Wu W, Zheng Y-L, Chen L, Liu R-J, Huang C-Y. Genetic diversity and relationships among safflower (Carthamus tinctorius L.) analyzed by inter-simple sequence repeats (ISSRs). Genet Resour Crop Evol. 2007;54(5):1043–51.

Kondrashov FA, Rogozin IB, Wolf YI, Koonin EV. Selection in the evolution of gene duplications. Genome Biol. 2002;3(2):Research0008.

Panchy N, Lehti-Shiu M, Shiu SH. Evolution of gene duplication in plants. Plant Physiol. 2016;171(4):2294–316.

Wu Z, Liu H, Zhan W, Yu Z, Qin E, Liu S, Yang T, Xiang N, Kudrna D, Chen Y, et al. The chromosome-scale reference genome of safflower (Carthamus tinctorius) provides insights into linoleic acid and flavonoid biosynthesis. Plant Biotechnol J. 2021;19(9):1725–42.

Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE. 2014;9(11):e112963.

Huang S, Kang M, Xu A. HaploMerger2: rebuilding both haploid sub-assemblies from high-heterozygosity diploid genome assembly. Bioinformatics. 2017;33(16):2577–9.

Simao FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31(19):3210–2.

Ou S, Chen J, Jiang N. Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res. 2018;46(21):e126.


Scaglione D, Reyes-Chin-Wo S, Acquadro A, Froenicke L, Portis E, Beitel C, Tirone M, Mauro R, Lo Monaco A, Mauromicale G, et al. The genome sequence of the outbreeding globe artichoke constructed de novo incorporating a phase-aware low-pass sequencing strategy of F1 progeny. Sci Rep. 2016;6:19427.

Badouin H, Gouzy J, Grassa CJ, Murat F, Staton SE, Cottret L, Lelandais-Briere C, Owens GL, Carrere S, Mayjonade B, et al. The sunflower genome provides insights into oil metabolism, flowering and Asterid evolution. Nature. 2017;546(7656):148–52.

Reyes-Chin-Wo S, Wang Z, Yang X, Kozik A, Arikit S, Song C, Xia L, Froenicke L, Lavelle DO, Truco MJ, et al. Genome assembly with in vitro proximity ligation data and whole-genome triplication in lettuce. Nat Commun. 2017;8:14953.

Barreda VD, Palazzesi L, Tellería MC, Olivero EB, Raine JI, Forest F. Early evolution of the angiosperm clade Asteraceae in the Cretaceous of Antarctica. Proceedings of the National Academy of Sciences 2015, 112(35):10989–10994.

Barker MS, Kane NC, Matvienko M, Kozik A, Michelmore RW, Knapp SJ, Rieseberg LH. Multiple paleopolyploidizations during the evolution of the Compositae reveal parallel patterns of duplicate gene retention after millions of years. Mol Biol Evol. 2008;25(11):2445–55.

Li L, Stoeckert CJ Jr., Roos DS. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003;13(9):2178–89.

De Bie T, Cristianini N, Demuth JP, Hahn MW. CAFE: a computational tool for the study of gene family evolution. Bioinformatics. 2006;22(10):1269–71.

Zwaenepoel A, Van de Peer Y. Wgd-simple command line tools for the analysis of ancient whole-genome duplications. Bioinformatics. 2019;35(12):2153–5.

Wang Z, Hobson N, Galindo L, Zhu S, Shi D, McDill J, Yang L, Hawkins S, Neutelings G, Datla R, et al. The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads. Plant Journal: Cell Mol Biology. 2012;72(3):461–73.


Kumar L. Mfuzz: a software package for soft clustering of microarray data. Bioinformation. 2007;2(1):5–7.

Garnatje T, Garcia S, Vilatersana R, Vallès J. Genome size variation in the genus Carthamus (Asteraceae, Cardueae): systematic implications and additive changes during allopolyploidization. Ann Bot. 2006;97(3):461–7.

Sabzalian MR, Saeidi G, Mirlohi A. Oil content and fatty acid composition in seeds of three safflower species. J Am Oil Chem Soc. 2008;85(8):717–21.

Wei X, Liu K, Zhang Y, Feng Q, Wang L, Zhao Y, Li D, Zhao Q, Zhu X, Zhu X, et al. Genetic discovery for oil production and quality in sesame. Nat Commun. 2015;6:8609.

Bozan B, Temelli F. Chemical composition and oxidative stability of flax, safflower and poppy seed and seed oils. Bioresour Technol. 2008;99(14):6354–9.

Ma DW, Wierzbicki AA, Field CJ, Clandinin MT. Preparation of conjugated linoleic acid from safflower oil. J Am Oil Chem Soc. 1999;76(6):729–30.

Jaillon O, Aury JM, Noel B, Policriti A, Clepet C, Casagrande A, Choisne N, Aubourg S, Vitulo N, Jubin C, et al. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007;449(7161):463–7.

Kim S, Park M, Yeom S-I, Kim Y-M, Lee JM, Lee H-A, Seo E, Choi J, Cheong K, Kim K-T. Genome sequence of the hot pepper provides insights into the evolution of pungency in Capsicum species. Nat Genet. 2014;46(3):270–8.

Arabidopsis Genome I. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408(6814):796–815.

Roth C, Rastogi S, Arvestad L, Dittmar K, Light S, Ekman D, Liberles DA. Evolution after gene duplication: models, mechanisms, sequences, systems, and organisms. J Experimental Zool Part B Mol Dev Evol. 2007;308(1):58–73.

Kim S, Park M, Yeom SI, Kim YM, Lee JM, Lee HA, Seo E, Choi J, Cheong K, Kim KT, et al. Genome sequence of the hot pepper provides insights into the evolution of pungency in Capsicum species. Nat Genet. 2014;46(3):270–8.

Ibarra-Laclette E, Lyons E, Hernandez-Guzman G, Perez-Torres CA, Carretero-Paulet L, Chang TH, Lan T, Welch AJ, Juarez MJ, Simpson J, et al. Architecture and evolution of a minute plant genome. Nature. 2013;498(7452):94–8.

Li Y, Beisson F, Pollard M, Ohlrogge J. Oil content of Arabidopsis seeds: the influence of seed anatomy, light and plant-to-plant variation. Phytochemistry. 2006;67(9):904–15.

Marcais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27(6):764–70.

Vurture GW, Sedlazeck FJ, Nattestad M, Underwood CJ, Fang H, Gurtowski J, Schatz MC. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics. 2017;33(14):2202–4.

Servant N, Varoquaux N, Lajoie BR, Viara E, Chen CJ, Vert JP, Heard E, Dekker J, Barillot E. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 2015;16:259.

Durand NC, Shamim MS, Machol I, Rao SS, Huntley MH, Lander ES, Aiden EL. Juicer provides a one-click system for analyzing Loop-Resolution Hi-C experiments. Cell Syst. 2016;3(1):95–8.

Dudchenko O, Batra SS, Omer AD, Nyquist SK, Hoeger M, Durand NC, Shamim MS, Machol I, Lander ES, Aiden AP, et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science. 2017;356:92–5.

RepeatModeler Open-1.0 [http://www.repeatmasker.org].

Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res 2007, 35(Web Server issue):W265–268.

Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27(2):573–80.

Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–7.

Hu TT, Pattyn P, Bakker EG, Cao J, Cheng JF, Clark RM, Fahlgren N, Fawcett JA, Grimwood J, Gundlach H, et al. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change. Nat Genet. 2011;43(5):476–81.

Ossowski S, Schneeberger K, Lucas-Lledo JI, Warthmann N, Clark RM, Shaw RG, Weigel D, Lynch M. The rate and molecular spectrum of spontaneous mutations in Arabidopsis thaliana. Science. 2010;327(5961):92–4.

Kent WJ. BLAT–the BLAST-like alignment tool. Genome Res. 2002;12(4):656–64.


Birney E, Clamp M, Durbin R. GeneWise and Genomewise. Genome Res. 2004;14(5):988–95.

Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015;12(4):357–60.

Pertea M, Pertea GM, Antonescu CM, Chang TC, Mendell JT, Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 2015;33(3):290–5.

Elsik CG, Mackey AJ, Reese JT, Milshina NV, Roos DS, Weinstock GM. Creating a honey bee consensus gene set. Genome Biol. 2007;8(1):R13.

Jones P, Binns D, Chang HY, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 2014;30(9):1236–40.

Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30(9):1312–3.

Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24(8):1586–91.

Tang H, Bowers JE, Wang X, Ming R, Alam M, Paterson AH. Synteny and Collinearity in Plant genomes. Science. 2008;320:486–8.

Ecker J, Scherer M, Schmitz G, Liebisch G. A rapid GC-MS method for quantification of positional and geometric isomers of fatty acid methyl esters. J Chromatogr B Analyt Technol Biomed Life Sci. 2012;897:98–104.


Acknowledgements

This work was supported by China National GeneBank (CNGB).

This research was funded by The Science and Technology Development Project of Jilin province (20210402044GH, 20220101354JC), Science and Technology Research Project of the Education Department of Jilin Province (JJKH20220325KJ).

Author information

Yuanyuan Dong and Xiaojie Wang contributed equally to this work.

Authors and Affiliations

Engineering Research Center of Bioreactor and Pharmaceutical Development, College of Life Sciences, Ministry of Education, Jilin Agricultural University, Changchun, 130118, China

Yuanyuan Dong, Naveed Ahmad, Yepeng Sun, Yuanxin Wang, Xiuming Liu, Na Yao, Yang Jing, Linna Du, Xiaowei Li, Nan Wang, Weican Liu & Fawei Wang

School of Pharmaceutical Science, Key Laboratory of Biotechnology and Pharmaceutical Engineering of Zhejiang Province, Wenzhou Medical University, Wenzhou, 325035, China

Xiaojie Wang & Xiaokun Li

Sanya Nanfan Research Institute of Hainan University, Sanya, 572025, China

Haiyan Li

Contributions

Y.D., X.W., X.L., and H.L. performed some data analyses. J.Y., N.Y. and L.D. managed samples and tissues. Y.W. prepared materials and uploaded data. Y.D., X.W., X.L., and J.Y. performed some data analyses and prepared graphics. N.Y., X.L., N.W., X.L., and W.L. prepared the libraries. H.L. and X.L. assisted in data analysis and in the overall design of the project. F.W., H.L., and X.L. developed the figure of the study and assisted with manuscript preparation. Y.D., X.W., N.A., Y.S., X.L., J.Y., F.W., and H.L. wrote and revised the manuscript. All authors reviewed and approved the final manuscript.

Corresponding author

Correspondence to Haiyan Li .

Ethics declarations

Ethics approval and consent to participate.

No specific approval was required for voucher specimens for this study. Voucher specimens were prepared and deposited at Jilin Agricultural University, and Yuanyuan Dong and Xiaojie Wang undertook the identification of the plant material. The authors have complied with all relevant institutional and national guidelines and legislation for experimental research and field studies on plants, including the collection of plant materials for this study.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Dong, Y., Wang, X., Ahmad, N. et al. The Carthamus tinctorius L. genome sequence provides insights into synthesis of unsaturated fatty acids. BMC Genomics 25 , 510 (2024). https://doi.org/10.1186/s12864-024-10405-z


Received : 19 September 2023

Accepted : 10 May 2024

Published : 23 May 2024

DOI : https://doi.org/10.1186/s12864-024-10405-z


Keywords

  • Whole genome duplication
  • Evolutionary history
  • Fatty acid biosynthesis



COMMENTS

  1. What Synthesis Methodology Should I Use? A Review and Analysis of Approaches to Research Synthesis

    The first is a well-developed research question that gives direction to the synthesis (e.g., meta-analysis, systematic review, meta-study, concept analysis, rapid review, realist synthesis). The second begins as a broad general question that evolves and becomes more refined over the course of the synthesis (e.g., meta-ethnography, scoping ...

  2. How To Write Synthesis In Research: Example Steps

    Step 1 Organize your sources. Step 2 Outline your structure. Step 3 Write paragraphs with topic sentences. Step 4 Revise, edit and proofread. When you write a literature review or essay, you have to go beyond just summarizing the articles you've read - you need to synthesize the literature to show how it all fits together (and how your own ...

  3. LibGuides: Writing Resources: Synthesis and Analysis

    Synthesis: the combination of ideas to form a theory, system, larger idea, point, or outcome, or to show commonalities or patterns. Analysis: a detailed examination of elements, ideas, or the structure of something; it can be a basis for discussion or interpretation. Synthesis and analysis together combine and examine ideas to ...

  4. Definitions and Descriptions of Analysis

    It makes use of synthesis and analysis, always starting from hypotheses and first principles that it obtains from the science above it and employing all the procedures of dialectic—definition and division for establishing first principles and articulating species and genera, and demonstrations and analyses in dealing with the consequences ...

  5. Analysis vs. Synthesis

    On the other hand, synthesis involves combining different elements or ideas to create a new whole or solution. It involves integrating information from various sources, identifying commonalities and differences, and generating new insights or solutions. While analysis is more focused on understanding and deconstructing a problem, synthesis is ...

  6. Putting It Together: Analysis and Synthesis

    Analysis is the first step towards synthesis, which requires not only thinking critically and investigating a topic or source, but combining thoughts and ideas to create new ones. As you synthesize, you will draw inferences and make connections to broader themes and concepts. It's this step that will really help add substance, complexity, and ...

  7. Analysis

    While the first two involve regressive analysis and synthesis, the third and fourth involve decompositional analysis and synthesis. As the authors of the Logic make clear, this particular part of their text derives from Descartes's Rules for the Direction of the Mind, written around 1627, but only published posthumously in 1684. The ...

  8. A Guide to Evidence Synthesis: What is Evidence Synthesis?

    Evidence syntheses may also include a meta-analysis, a more quantitative process of synthesizing and visualizing data retrieved from various studies. ... Before embarking on an evidence synthesis, it's important to clearly identify your reasons for conducting one.

  9. PDF DATA SYNTHESIS AND ANALYSIS

    This preliminary synthesis is the first step in systematically analysing the results—but it is only a preliminary analysis (not the endpoint). One possible way to approach this step is to describe each of the included studies, summarising the same features for each study in the same order.

  10. Meta-analysis and the science of research synthesis

    Meta-analysis is the quantitative, scientific synthesis of research results. Since the term and modern approaches to research synthesis were first introduced in the 1970s, meta-analysis has had a ... (A small worked example of the basic pooling calculation appears after this list.)

  11. Synthesising the data

    Qualitative data synthesis. In a qualitative systematic review, data can be presented in a number of different ways. A typical procedure in the health sciences is thematic analysis. Thematic synthesis has three stages: the coding of text 'line-by-line', the development of 'descriptive themes', and the generation of 'analytical themes'. (A minimal sketch of this structure appears after this list.)

  12. Systematic Reviews & Evidence Synthesis Methods

    "Evidence synthesis" refers to rigorous, well-documented methods of identifying, selecting, and combining results from multiple studies. These projects are conducted by teams and follow specific methodologies to minimize bias and maximize reproducibility. A systematic review is a type of evidence synthesis.

  13. Synthesis

    Synthesis is an important element of academic writing, demonstrating comprehension, analysis, evaluation and original creation. With synthesis you extract content from different sources to create an original text. While paraphrase and summary maintain the structure of the given source(s), with synthesis you create a new structure.

  14. Synthesis

    Local synthesis occurs at the paragraph level when writers connect individual pieces of evidence from multiple sources to support a paragraph's main idea and advance a paper's thesis statement. A common example in academic writing is a scholarly paragraph that includes a main idea, evidence from multiple sources, and analysis of those ...

  15. Analysis and Synthesis

    Abstract. Data analysis is a challenging stage of the integrative review process as it requires the reviewer to synthesize data from diverse methodological sources. Although established approaches to data analysis and synthesis of integrative review findings continue to evolve, adherence to systematic methods during this stage is essential to ...

  16. Methods for the synthesis of qualitative research: a critical review

    Background. The range of different methods for synthesising qualitative research has been growing over recent years [1,2], alongside an increasing interest in qualitative synthesis to inform health-related policy and practice. While the terms 'meta-analysis' (a statistical method to combine the results of primary studies), or sometimes 'narrative synthesis', are frequently used to describe ...

  17. Synthesis and Analysis

    Synthesis and analysis are two key aspects of chemistry, particularly when exploring the role of chemistry in an industrial context and relating the products formed to applications in everyday life. In order to successfully synthesise any material it is necessary to have an understanding of the starting materials, the product(s) and ...

  18. Video Transcripts: Analyzing & Synthesizing Sources: Synthesis

    It's a lot like analysis: with analysis, you're commenting on or interpreting one piece of evidence or one idea, one paraphrase or one quote. Synthesis is where you take multiple pieces of evidence or multiple sources and their ideas and you talk about the connections between those ideas or those sources. And you talk about where they intersect ...

  19. Synthesizing Sources

    Argumentative syntheses seek to bring sources together to make an argument. Both types of synthesis involve looking for relationships between sources and drawing conclusions. In order to successfully synthesize your sources, you might begin by grouping your sources by topic and looking for connections. For example, if you were researching the ...

  20. Synthesis

    In a summary, you share the key points from an individual source and then move on and summarize another source. In synthesis, you need to combine the information from those multiple sources and add your own analysis of the literature. This means that each of your paragraphs will include multiple sources and citations, as well as your own ideas ...

  21. Synthesizing Sources

    In a synthesis matrix, each column represents one source, and each row represents a common theme or idea among the sources. In the relevant rows, fill in a short summary of how the source treats each theme or topic. This helps you to clearly see the commonalities or points of divergence among your sources. You can then synthesize these sources ... (A short sketch of one way to lay out such a matrix appears after this list.)

  22. Difference Between Analysis and Synthesis

    1. Synthesis is a higher-level process that creates something new. It is usually done at the end of an entire study or scientific inquiry. 2. Analysis is like the process of deduction wherein a bigger concept is broken down into simpler ideas to gain a better understanding of the entire thing.

  23. Critical Thinking

    Critical thinking is considered a higher-order thinking skill, drawing on processes such as analysis, synthesis, deduction, inference, reasoning, and evaluation. In order to demonstrate critical thinking, you would need to develop skills in: Interpreting: understanding the significance or meaning of information. Analyzing: breaking information down into its parts.
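To make the idea of quantitative synthesis in the meta-analysis excerpt above concrete, here is a minimal Python sketch of the standard fixed-effect (inverse-variance) pooling step. The effect sizes and variances are made-up illustrative numbers, not results from any study mentioned in this article.

# Fixed-effect (inverse-variance) pooling: each study's effect size is
# weighted by the inverse of its variance, so more precise studies count more.
def pooled_effect(effects, variances):
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    pooled_variance = 1.0 / sum(weights)
    return pooled, pooled_variance

# Hypothetical standardized mean differences and variances from three studies
effects = [0.30, 0.55, 0.12]
variances = [0.02, 0.05, 0.01]
print(pooled_effect(effects, variances))  # the most precise study dominates the pooled estimate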
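The three stages of thematic synthesis described in the excerpt above (line-by-line codes, descriptive themes, analytical themes) can be kept organised in a simple nested structure while you work. This is only an illustrative Python sketch; the theme and code labels are hypothetical placeholders.

# Analytical themes group descriptive themes, which in turn group the
# line-by-line codes extracted from the primary studies (all labels invented).
thematic_synthesis = {
    "Barriers to engaging with feedback": {      # analytical theme
        "Timing of feedback": [                  # descriptive theme
            "feedback arrived after the next assignment",
            "no chance to apply comments",
        ],
        "Clarity of comments": [
            "did not understand the marker's comments",
        ],
    },
}

for analytical_theme, descriptive_themes in thematic_synthesis.items():
    print(analytical_theme)
    for descriptive_theme, codes in descriptive_themes.items():
        print("  -", descriptive_theme, f"({len(codes)} codes)")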
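Finally, as a companion to the synthesis matrix described in the excerpt above (one column per source, one row per shared theme), here is a small illustrative sketch using pandas. The source names, themes, and cell notes are hypothetical placeholders, not a prescribed format.

# Each column is one source, each row a common theme; cells hold a short
# note on how that source treats the theme (all entries invented).
import pandas as pd

matrix = pd.DataFrame(
    {
        "Source A (2018)": ["supports the main claim", "small interview study"],
        "Source B (2020)": ["contradicts the main claim", "large survey"],
        "Source C (2023)": ["partially supports the main claim", "mixed methods"],
    },
    index=["Position on the research question", "Methodological approach"],
)
print(matrix.to_string())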
