
Content Analysis | A Step-by-Step Guide with Examples

Published on 5 May 2022 by Amy Luo. Revised on 5 December 2022.

Content analysis is a research method used to identify patterns in recorded communication. To conduct content analysis, you systematically collect data from a set of texts, which can be written, oral, or visual:

  • Books, newspapers, and magazines
  • Speeches and interviews
  • Web content and social media posts
  • Photographs and films

Content analysis can be both quantitative (focused on counting and measuring) and qualitative (focused on interpreting and understanding). In both types, you categorise or ‘code’ words, themes, and concepts within the texts and then analyse the results.
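To make the idea of 'coding' concrete, here is a minimal Python sketch of the quantitative side: counting how often terms belonging to a coded category appear in a small set of texts. The texts and the coding scheme are invented for illustration.

```python
from collections import Counter

# Hypothetical mini-corpus (invented for illustration)
texts = [
    "The economy is failing and jobs are scarce",
    "New jobs and growth signal a strong economy",
]

# Map coded terms to the category they belong to
code_terms = {"economy": "economic", "jobs": "economic", "growth": "economic"}

counts = Counter()
for text in texts:
    for word in text.lower().split():
        if word in code_terms:
            counts[code_terms[word]] += 1

print(counts["economic"])  # 5 occurrences of economy-coded terms
```

The qualitative side would then ask what these occurrences mean in context, rather than stopping at the counts.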

Table of contents

  • What is content analysis used for?
  • Advantages of content analysis
  • Disadvantages of content analysis
  • How to conduct content analysis

Researchers use content analysis to find out about the purposes, messages, and effects of communication content. They can also make inferences about the producers and audience of the texts they analyse.

Content analysis can be used to quantify the occurrence of certain words, phrases, subjects, or concepts in a set of historical or contemporary texts.

In addition, content analysis can be used to make qualitative inferences by analysing the meaning and semantic relationship of words and concepts.

Because content analysis can be applied to a broad range of texts, it is used in a variety of fields, including marketing, media studies, anthropology, cognitive science, psychology, and many social science disciplines. It has various possible goals:

  • Finding correlations and patterns in how concepts are communicated
  • Understanding the intentions of an individual, group, or institution
  • Identifying propaganda and bias in communication
  • Revealing differences in communication in different contexts
  • Analysing the consequences of communication content, such as the flow of information or audience responses


Advantages of content analysis

  • Unobtrusive data collection

You can analyse communication and social interaction without the direct involvement of participants, so your presence as a researcher doesn’t influence the results.

  • Transparent and replicable

When done well, content analysis follows a systematic procedure that can easily be replicated by other researchers, yielding results with high reliability.

  • Highly flexible

You can conduct content analysis at any time, in any location, and at low cost. All you need is access to the appropriate sources.

Disadvantages of content analysis

  • Reductive

Focusing on words or phrases in isolation can sometimes be overly reductive, disregarding context, nuance, and ambiguous meanings.

  • Subjective

Content analysis almost always involves some level of subjective interpretation, which can affect the reliability and validity of the results and conclusions.

  • Time intensive

Manually coding large volumes of text is extremely time-consuming, and it can be difficult to automate effectively.

If you want to use content analysis in your research, you need to start with a clear, direct research question.

Next, you follow these five steps.

Step 1: Select the content you will analyse

Based on your research question, choose the texts that you will analyse. You need to decide:

  • The medium (e.g., newspapers, speeches, or websites) and genre (e.g., opinion pieces, political campaign speeches, or marketing copy)
  • The criteria for inclusion (e.g., newspaper articles that mention a particular event, speeches by a certain politician, or websites selling a specific type of product)
  • The parameters in terms of date range, location, etc.

If there are only a small number of texts that meet your criteria, you might analyse all of them. If there is a large volume of texts, you can select a sample.

Step 2: Define the units and categories of analysis

Next, you need to determine the level at which you will analyse your chosen texts. This means defining:

  • The unit(s) of meaning that will be coded. For example, are you going to record the frequency of individual words and phrases, the characteristics of people who produced or appear in the texts, the presence and positioning of images, or the treatment of themes and concepts?
  • The set of categories that you will use for coding. Categories can be objective characteristics (e.g., aged 30–40, lawyer, parent) or more conceptual (e.g., trustworthy, corrupt, conservative, family-oriented).
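One way to keep Step 2 organised is to record the units and categories explicitly before coding begins. The Python sketch below reuses the objective and conceptual category examples from the text; the observation record at the end is a hypothetical format, not a prescribed one.

```python
# A hypothetical coding scheme for Step 2: the unit of meaning plus the
# categories it will be sorted into (objective vs conceptual, as in the text).
coding_scheme = {
    "unit": "individual words and phrases",
    "categories": {
        "objective": ["aged 30-40", "lawyer", "parent"],
        "conceptual": ["trustworthy", "corrupt", "conservative", "family-oriented"],
    },
}

# Each coded observation can then record the text, the unit found, and its category
observation = {
    "text_id": "speech_01",
    "unit": "hard-working family man",
    "category": "family-oriented",
}

print(observation["category"] in coding_scheme["categories"]["conceptual"])  # True
```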

Step 3: Develop a set of rules for coding

Coding involves organising the units of meaning into the previously defined categories. Especially with more conceptual categories, it’s important to clearly define the rules for what will and won’t be included to ensure that all texts are coded consistently.

Coding rules are especially important if multiple researchers are involved, but even if you’re coding all of the text by yourself, recording the rules makes your method more transparent and reliable.

Step 4: Code the text according to the rules

You go through each text and record all relevant data in the appropriate categories. This can be done manually or aided with computer programs, such as QSR NVivo, ATLAS.ti, and Diction, which can help speed up the process of counting and categorising words and phrases.
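For simple word- and phrase-level coding, even a short script can stand in for dedicated software. The following Python sketch applies a set of invented regular-expression coding rules to a text and tallies the hits per category; real coding rules would be developed and documented in Step 3.

```python
import re

# Invented coding rules mapping categories to word patterns (illustrative only)
coding_rules = {
    "environment": r"\b(climate|pollution|emission)\w*\b",
    "economy": r"\b(job|wage|inflation|growth)\w*\b",
}

def code_text(text, rules):
    """Return a category -> match-count tally for one text."""
    return {cat: len(re.findall(pattern, text.lower()))
            for cat, pattern in rules.items()}

result = code_text(
    "New jobs and wage growth amid rising emissions and pollution concerns",
    coding_rules,
)
print(result)  # {'environment': 2, 'economy': 3}
```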

Step 5: Analyse the results and draw conclusions

Once coding is complete, the collected data is examined to find patterns and draw conclusions in response to your research question. You might use statistical analysis to find correlations or trends, discuss your interpretations of what the results mean, and make inferences about the creators, context, and audience of the texts.
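As a minimal illustration of looking for a trend in coded data, the Python sketch below checks whether fabricated yearly counts of a theme decline across a period. A real analysis would use proper statistical tests rather than this crude check.

```python
# Fabricated yearly counts of a coded theme (e.g., occurrences per sampled year)
coded = {1950: 12, 1955: 10, 1960: 7, 1965: 5, 1970: 2}

years = sorted(coded)
values = [coded[y] for y in years]

# A crude trend check: does the count fall monotonically over the period?
declining = all(a >= b for a, b in zip(values, values[1:]))
print(f"Theme declines over {years[0]}-{years[-1]}: {declining}")  # True
```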


Luo, A. (2022, December 05). Content Analysis | A Step-by-Step Guide with Examples. Scribbr. Retrieved 29 April 2024, from https://www.scribbr.co.uk/research-methods/content-analysis-explained/



Chapter 17. Content Analysis

Introduction

Content analysis is a term that is used to mean both a method of data collection and a method of data analysis. Archival and historical works can be the source of content analysis, but so too can the contemporary media coverage of a story, blogs, comment posts, films, cartoons, advertisements, brand packaging, and photographs posted on Instagram or Facebook. Really, almost anything can be the “content” to be analyzed. This is a qualitative research method because the focus is on the meanings and interpretations of that content rather than strictly numerical counts or variables-based causal modeling. [1] Qualitative content analysis (sometimes referred to as QCA) is particularly useful when attempting to define and understand prevalent stories or communication about a topic of interest—in other words, when we are less interested in what particular people (our defined sample) are doing or believing and more interested in what general narratives exist about a particular topic or issue. This chapter will explore different approaches to content analysis and provide helpful tips on how to collect data, how to turn that data into codes for analysis, and how to go about presenting what is found through analysis. It is also a nice segue between our data collection methods (e.g., interviewing, observation) chapters and chapters 18 and 19, whose focus is on coding, the primary means of data analysis for most qualitative data. In many ways, the methods of content analysis are quite similar to the method of coding.

Content Analysis in Historical Research

Although the body of material (“content”) to be collected and analyzed can be nearly anything, most qualitative content analysis is applied to forms of human communication (e.g., media posts, news stories, campaign speeches, advertising jingles). The point of the analysis is to understand this communication, to systematically and rigorously explore its meanings, assumptions, themes, and patterns. Historical and archival sources may be the subject of content analysis, but there are other ways to analyze (“code”) this data when not overly concerned with the communicative aspect (see chapters 18 and 19). This is why we tend to consider content analysis its own method of data collection as well as a method of data analysis. Still, many of the techniques you learn in this chapter will be helpful to any “coding” scheme you develop for other kinds of qualitative data. Just remember that content analysis is a particular form with distinct aims and goals and traditions.

An Overview of the Content Analysis Process

The First Step: Selecting Content

Figure 17.1 is a display of possible content for content analysis. The first step in content analysis is making smart decisions about what content you will want to analyze and clearly connecting this content to your research question or general focus of research. Why are you interested in the messages conveyed in this particular content? What will the identification of patterns here help you understand? Content analysis can be fun to do, but in order to make it research, you need to fit it into a research plan.

Figure 17.1. A Non-exhaustive List of "Content" for Content Analysis

To take one example, let us imagine you are interested in gender presentations in society and how presentations of gender have changed over time. There are various forms of content out there that might help you document changes. You could, for example, begin by creating a list of magazines that are coded as being for “women” (e.g., Women’s Daily Journal) and magazines that are coded as being for “men” (e.g., Men’s Health). You could then select a date range that is relevant to your research question (e.g., 1950s–1970s) and collect magazines from that era. You might create a “sample” by deciding to look at three issues for each year in the date range and a systematic plan for what to look at in those issues (e.g., advertisements? cartoons? titles of articles? whole articles?). You are not just going to look at some magazines willy-nilly. That would not be systematic enough to allow anyone to replicate or check your findings later on. Once you have a clear plan of what content is of interest to you and what you will be looking at, you can begin, creating a record of everything you are including as your content. This might mean a list of each advertisement you look at or each title of stories in those magazines along with its publication date. You may decide to have multiple “content” in your research plan. For each content, you want a clear plan for collecting, sampling, and documenting.

The Second Step: Collecting and Storing

Once you have a plan, you are ready to collect your data. This may entail downloading from the internet, creating a Word document or PDF of each article or picture, and storing these in a folder designated by the source and date (e.g., “Men’s Health advertisements, 1950s”). Sølvberg (2021), for example, collected posted job advertisements for three kinds of elite jobs (economic, cultural, professional) in Sweden. But collecting might also mean going out and taking photographs yourself, as in the case of graffiti, street signs, or even what people are wearing. Chaise LaDousa, an anthropologist and linguist, took photos of “house signs,” which are signs, often creative and sometimes offensive, hung by college students living in communal off-campus houses. These signs were a focal point of college culture, sending messages about the values of the students living in them. Some of the names will give you an idea: “Boot ’n Rally,” “The Plantation,” “Crib of the Rib.” The students might find these signs funny and benign, but LaDousa (2011) argued convincingly that they also reproduced racial and gender inequalities. The data here already existed—they were big signs on houses—but the researcher had to collect the data by taking photographs.

In some cases, your content will be in physical form but not amenable to photographing, as in the case of films or unwieldy physical artifacts you find in the archives (e.g., undigitized meeting minutes or scrapbooks). In this case, you need to create some kind of detailed log (fieldnotes even) of the content that you can reference. In the case of films, this might mean watching the film and writing down details for key scenes that become your data. [2] For scrapbooks, it might mean taking notes on what you are seeing, quoting key passages, describing colors or presentation style. As you might imagine, this can take a lot of time. Be sure you budget this time into your research plan.

Researcher Note

A note on data scraping: Data scraping, sometimes known as screen scraping or frame grabbing, is a way of extracting data generated by another program, as when a scraping tool grabs information from a website. This may help you collect data that is on the internet, but you need to be ethical in how you employ the scraper. A student once helped me scrape thousands of stories from the Time magazine archives at once (although it took several hours for the scraping process to complete). These stories were freely available, so the scraping process simply sped up the laborious process of copying each article of interest and saving it to my research folder. Scraping tools can sometimes be used to circumvent paywalls. Be careful here!
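Where scraping is appropriate, the core extraction step is just parsing. Below is a standard-library Python sketch that pulls story titles out of already-downloaded HTML; the assumption that titles sit in `<h2>` tags is invented for the example, and actually fetching pages (respecting terms of service and paywalls, as cautioned above) is deliberately left out.

```python
from html.parser import HTMLParser

# Minimal scraper sketch: collect text found inside <h2> tags, on the
# (invented) assumption that story titles live there.
class TitleScraper(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title and data.strip():
            self.titles.append(data.strip())

html = "<h2>Story One</h2><p>body</p><h2>Story Two</h2>"
scraper = TitleScraper()
scraper.feed(html)
print(scraper.titles)  # ['Story One', 'Story Two']
```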

The Third Step: Analysis

There is often an assumption among novice researchers that once you have collected your data, you are ready to write about what you have found. Actually, you haven’t yet found anything, and if you try to write up your results, you will probably be staring sadly at a blank page. Between the collection and the writing comes the difficult task of systematically and repeatedly reviewing the data in search of patterns and themes that will help you interpret the data, particularly its communicative aspect (e.g., What is it that is being communicated here, with these “house signs” or in the pages of Men’s Health ?).

The first time you go through the data, keep an open mind on what you are seeing (or hearing), and take notes about your observations that link up to your research question. In the beginning, it can be difficult to know what is relevant and what is extraneous. Sometimes, your research question changes based on what emerges from the data. Use the first round of review to consider this possibility, but then commit yourself to following a particular focus or path. If you are looking at how gender gets made or re-created, don’t follow the white rabbit down a hole about environmental injustice unless you decide that this really should be the focus of your study or that issues of environmental injustice are linked to gender presentation. In the second round of review, be very clear about emerging themes and patterns. Create codes (more on these in chapters 18 and 19) that will help you simplify what you are noticing. For example, “men as outdoorsy” might be a common trope you see in advertisements. Whenever you see this, mark the passage or picture. In your third (or fourth or fifth) round of review, begin to link up the tropes you’ve identified, looking for particular patterns and assumptions. You’ve drilled down to the details, and now you are building back up to figure out what they all mean. Start thinking about theory—either theories you have read about and are using as a frame of your study (e.g., gender as performance theory) or theories you are building yourself, as in the Grounded Theory tradition. Once you have a good idea of what is being communicated and how, go back to the data at least one more time to look for disconfirming evidence. Maybe you thought “men as outdoorsy” was of importance, but when you look hard, you note that women are presented as outdoorsy just as often. You just hadn’t paid attention. 
It is very important, as any kind of researcher but particularly as a qualitative researcher, to test yourself and your emerging interpretations in this way.

The Fourth and Final Step: The Write-Up

Only after you have fully completed analysis, with its many rounds of review, will you be able to write about what you found. The interpretation exists not in the data but in your analysis of the data. Before writing your results, you will want to very clearly describe how you chose the data and all the possible limitations of this data (e.g., the historical-trace problem or power problem; see chapter 16). Acknowledge any limitations of your sample. Describe the audience for the content, and discuss the implications of this. Once you have done all of this, you can put forth your interpretation of the communication of the content, linking to theory where doing so would help your readers understand your findings and what they mean more generally for our understanding of how the social world works. [3]

Analyzing Content: Helpful Hints and Pointers

Although every data set is unique and each researcher will have a different and unique research question to address with that data set, there are some common practices and conventions. When reviewing your data, what do you look at exactly? How will you know if you have seen a pattern? How do you note or mark your data?

Let’s start with the last question first. If your data is stored digitally, there are various ways you can highlight or mark up passages. You can, of course, do this with literal highlighters, pens, and pencils if you have print copies. But there are also qualitative software programs to help you store the data, retrieve the data, and mark the data. This can simplify the process, although it cannot do the work of analysis for you.

Qualitative software can be very expensive, so the first thing to do is to find out if your institution (or program) has a universal license its students can use. If they do not, most programs have special student licenses that are less expensive. The two most used programs at this moment are probably ATLAS.ti and NVivo. Both can cost more than $500 [4] but provide everything you could possibly need for storing data, content analysis, and coding. They also have a lot of customer support, and you can find many official and unofficial tutorials on how to use the programs’ features on the web. Dedoose, created by academic researchers at UCLA, is a decent program that lacks many of the bells and whistles of the two big programs. Instead of paying all at once, you pay monthly, as you use the program. The monthly fee is relatively affordable (less than $15), so this might be a good option for a small project. HyperRESEARCH is another basic program created by academic researchers, and it is free for small projects (those that have limited cases and material to import). You can pay a monthly fee if your project expands past the free limits. I have personally used all four of these programs, and they each have their pluses and minuses.

Regardless of which program you choose, you should know that none of them will actually do the hard work of analysis for you. They are incredibly useful for helping you store and organize your data, and they provide abundant tools for marking, comparing, and coding your data so you can make sense of it. But making sense of it will always be your job alone.

So let’s say you have some software, and you have uploaded all of your content into the program: video clips, photographs, transcripts of news stories, articles from magazines, even digital copies of college scrapbooks. Now what do you do? What are you looking for? How do you see a pattern? The answers to these questions will depend partially on the particular research question you have, or at least the motivation behind your research. Let’s go back to the idea of looking at gender presentations in magazines from the 1950s to the 1970s. Here are some things you can look at and code in the content: (1) actions and behaviors, (2) events or conditions, (3) activities, (4) strategies and tactics, (5) states or general conditions, (6) meanings or symbols, (7) relationships/interactions, (8) consequences, and (9) settings. Table 17.1 lists these with examples from our gender presentation study.

Table 17.1. Examples of What to Note During Content Analysis

One thing to note about the examples in table 17.1: sometimes we note (mark, record, code) a single example, while other times, as in “settings,” we are recording a recurrent pattern. To help you spot patterns, it is useful to mark every setting, including a notation on gender. Using software can help you do this efficiently. You can then call up “setting by gender” and note this emerging pattern. There’s an element of counting here, which we normally think of as quantitative data analysis, but we are using the count to identify a pattern that will be used to help us interpret the communication. Content analyses often include counting as part of the interpretive (qualitative) process.
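The "setting by gender" tally described above can be sketched in a few lines of Python; the coded observations here are fabricated for illustration.

```python
from collections import Counter

# Fabricated (setting, gender) observations coded from a magazine sample
observations = [
    ("kitchen", "woman"), ("office", "man"), ("kitchen", "woman"),
    ("outdoors", "man"), ("kitchen", "man"), ("outdoors", "man"),
]

# Count each (setting, gender) pair to surface the emerging pattern
by_setting_gender = Counter(observations)
print(by_setting_gender[("kitchen", "woman")])  # 2
print(by_setting_gender[("outdoors", "man")])   # 2
```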

In your own study, you may not need or want to look at all of the elements listed in table 17.1. Even in our imagined example, some are more useful than others. For example, “strategies and tactics” is a bit of a stretch here. In studies that are looking specifically at, say, policy implementation or social movements, this category will prove much more salient.

Another way to think about “what to look at” is to consider aspects of your content in terms of units of analysis. You can drill down to the specific words used (e.g., the adjectives commonly used to describe “men” and “women” in your magazine sample) or move up to the more abstract level of concepts used (e.g., the idea that men are more rational than women). Counting for the purpose of identifying patterns is particularly useful here. How many times is that idea of women’s irrationality communicated? How is it communicated (in comic strips, fictional stories, editorials, etc.)? Does the incidence of the concept change over time? Perhaps the “irrational woman” was everywhere in the 1950s, but by the 1970s, it is no longer showing up in stories and comics. By tracing its usage and prevalence over time, you might come up with a theory or story about gender presentation during the period. Table 17.2 provides more examples of using different units of analysis for this work along with suggestions for effective use.
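Tracing a concept's prevalence over time, as described above, amounts to comparing rates across periods. The Python sketch below uses fabricated counts for the "irrational woman" trope; comparing rates rather than raw counts guards against uneven sample sizes across decades.

```python
# Fabricated incidence of the "irrational woman" trope per decade,
# alongside the number of sampled items per decade
incidence = {"1950s": 34, "1960s": 18, "1970s": 3}
sample_sizes = {"1950s": 120, "1960s": 120, "1970s": 120}

# Convert counts to rates so decades with different sample sizes are comparable
rates = {decade: incidence[decade] / sample_sizes[decade] for decade in incidence}
for decade, rate in rates.items():
    print(f"{decade}: {rate:.1%} of sampled items")
```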

Table 17.2. Examples of Unit of Analysis in Content Analysis

Every qualitative content analysis is unique in its particular focus and particular data used, so there is no single correct way to approach analysis. You should have a better idea, however, of what kinds of things to look for and how to look for them. The next two chapters will take you further into the coding process, the primary analytical tool for qualitative research in general.

Further Readings

Cidell, Julie. 2010. “Content Clouds as Exploratory Qualitative Data Analysis.” Area 42(4):514–523. A demonstration of using visual “content clouds” as a form of exploratory qualitative data analysis using transcripts of public meetings and content of newspaper articles.

Hsieh, Hsiu-Fang, and Sarah E. Shannon. 2005. “Three Approaches to Qualitative Content Analysis.” Qualitative Health Research 15(9):1277–1288. Distinguishes three distinct approaches to QCA: conventional, directed, and summative. Uses hypothetical examples from end-of-life care research.

Jackson, Romeo, Alex C. Lange, and Antonio Duran. 2021. “A Whitened Rainbow: The In/Visibility of Race and Racism in LGBTQ Higher Education Scholarship.” Journal Committed to Social Change on Race and Ethnicity (JCSCORE) 7(2):174–206.* Using a “critical summative content analysis” approach, examines research published on LGBTQ people between 2009 and 2019.

Krippendorff, Klaus. 2018. Content Analysis: An Introduction to Its Methodology . 4th ed. Thousand Oaks, CA: SAGE. A very comprehensive textbook on both quantitative and qualitative forms of content analysis.

Mayring, Philipp. 2022. Qualitative Content Analysis: A Step-by-Step Guide . Thousand Oaks, CA: SAGE. Formulates an eight-step approach to QCA.

Messinger, Adam M. 2012. “Teaching Content Analysis through ‘Harry Potter.’” Teaching Sociology 40(4):360–367. This is a fun example of a relatively brief foray into content analysis using the music found in Harry Potter films.

Neuendorf, Kimberly A. 2002. The Content Analysis Guidebook. Thousand Oaks, CA: SAGE. Although a helpful guide to content analysis in general, be warned that this textbook definitely favors quantitative over qualitative approaches to content analysis.

Schreier, Margrit. 2012. Qualitative Content Analysis in Practice. Thousand Oaks, CA: SAGE. Arguably the most accessible guidebook for QCA, written by a professor based in Germany.

Weber, Matthew A., Shannon Caplan, Paul Ringold, and Karen Blocksom. 2017. “Rivers and Streams in the Media: A Content Analysis of Ecosystem Services.” Ecology and Society 22(3).* Examines the content of a blog hosted by National Geographic and articles published in The New York Times and the Wall Street Journal for stories on rivers and streams (e.g., water quality, flooding).

[1] There are ways of handling content analysis quantitatively, however. Some practitioners therefore specify qualitative content analysis (QCA). In this chapter, all content analysis is QCA unless otherwise noted.
[2] Note that some qualitative software allows you to upload whole films or film clips for coding. You will still have to get access to the film, of course.
[3] See chapter 20 for more on the final presentation of research.
[4] Actually, ATLAS.ti is an annual license, while NVivo is a perpetual license, but both are going to cost you at least $500 to use. Student rates may be lower. And don’t forget to ask your institution or program if they already have a software license you can use.

A method of both data collection and data analysis in which a given content (textual, visual, graphic) is examined systematically and rigorously to identify meanings, themes, patterns, and assumptions. Qualitative content analysis (QCA) is concerned with gathering and interpreting an existing body of material.

Introduction to Qualitative Research Methods Copyright © 2023 by Allison Hurst is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License , except where otherwise noted.


Using Content Analysis

This guide provides an introduction to content analysis, a research methodology that examines words or phrases within a wide range of texts.

  • Introduction to Content Analysis : Read about the history and uses of content analysis.
  • Conceptual Analysis : Read an overview of conceptual analysis and its associated methodology.
  • Relational Analysis : Read an overview of relational analysis and its associated methodology.
  • Commentary : Read about issues of reliability and validity with regard to content analysis as well as the advantages and disadvantages of using content analysis as a research methodology.
  • Examples : View examples of real and hypothetical studies that use content analysis.
  • Annotated Bibliography : Complete list of resources used in this guide and beyond.

An Introduction to Content Analysis

Content analysis is a research tool used to determine the presence of certain words or concepts within texts or sets of texts. Researchers quantify and analyze the presence, meanings and relationships of such words and concepts, then make inferences about the messages within the texts, the writer(s), the audience, and even the culture and time of which these are a part. Texts can be defined broadly as books, book chapters, essays, interviews, discussions, newspaper headlines and articles, historical documents, speeches, conversations, advertising, theater, informal conversation, or really any occurrence of communicative language. Texts in a single study may also represent a variety of different types of occurrences, such as Palmquist's 1990 study of two composition classes, in which he analyzed student and teacher interviews, writing journals, classroom discussions and lectures, and out-of-class interaction sheets. To conduct a content analysis on any such text, the text is coded, or broken down, into manageable categories on a variety of levels--word, word sense, phrase, sentence, or theme--and then examined using one of content analysis' basic methods: conceptual analysis or relational analysis.

A Brief History of Content Analysis

Historically, content analysis was a time-consuming process. Analysis was done manually, or slow mainframe computers were used to analyze punch cards containing data punched in by human coders. Single studies could employ thousands of these cards. Human error and time constraints made this method impractical for large texts. However, despite its impracticality, content analysis was already an often-utilized research method by the 1940s. Although initially limited to studies that examined texts for the frequency of the occurrence of identified terms (word counts), by the mid-1950s researchers were already starting to consider the need for more sophisticated methods of analysis, focusing on concepts rather than simply words, and on semantic relationships rather than just presence (de Sola Pool 1959). While both traditions still continue today, content analysis is now also utilized to explore mental models and their linguistic, affective, cognitive, social, cultural, and historical significance.

Uses of Content Analysis

Perhaps because it can be applied to examine any piece of writing or occurrence of recorded communication, content analysis is currently used in a dizzying array of fields, ranging from marketing and media studies to literature and rhetoric, ethnography and cultural studies, gender and age issues, sociology and political science, psychology and cognitive science, and many other fields of inquiry. Additionally, content analysis reflects a close relationship with socio- and psycholinguistics, and is playing an integral role in the development of artificial intelligence. The following list (adapted from Berelson, 1952) offers more possibilities for the uses of content analysis:

  • Reveal international differences in communication content
  • Detect the existence of propaganda
  • Identify the intentions, focus or communication trends of an individual, group or institution
  • Describe attitudinal and behavioral responses to communications
  • Determine psychological or emotional state of persons or groups

Types of Content Analysis

In this guide, we discuss two general categories of content analysis: conceptual analysis and relational analysis. Conceptual analysis can be thought of as establishing the existence and frequency of concepts, most often represented by words or phrases, in a text. For instance, say you have a hunch that your favorite poet often writes about hunger. With conceptual analysis you can determine how many times words such as hunger, hungry, famished, or starving appear in a volume of poems. In contrast, relational analysis goes one step further by examining the relationships among concepts in a text. Returning to the hunger example, with relational analysis, you could identify what other words or phrases hunger or famished appear next to and then determine what different meanings emerge as a result of these groupings.
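The hunger example can be sketched in Python to show the difference between the two approaches: conceptual analysis stops at counting the concept's terms, while relational analysis also records what co-occurs with them. The poem lines below are invented for illustration.

```python
import re
from collections import Counter

# Invented poem lines standing in for a volume of poetry
lines = [
    "the hungry night swallows the lonely road",
    "famished hearts wander the empty street",
    "a starving moon over the lonely field",
]
hunger_terms = re.compile(r"\b(hunger|hungry|famished|starving)\b")

# Conceptual analysis: frequency of the concept's terms
concept_count = sum(len(hunger_terms.findall(line)) for line in lines)

# Relational analysis: words co-occurring on the same line as a hunger term
neighbors = Counter()
for line in lines:
    if hunger_terms.search(line):
        neighbors.update(w for w in line.split() if not hunger_terms.match(w))

print(concept_count)        # 3 hunger terms across the lines
print(neighbors["lonely"])  # 2 -- hunger co-occurs with loneliness
```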

Conceptual Analysis

Traditionally, content analysis has most often been thought of in terms of conceptual analysis. In conceptual analysis, a concept is chosen for examination, and the analysis involves quantifying and tallying its presence. Also known as thematic analysis [although this term is somewhat problematic, given its varied definitions in current literature--see Palmquist, Carley, & Dale (1997) vis-a-vis Smith (1992)], the focus here is on looking at the occurrence of selected terms within a text or texts, although the terms may be implicit as well as explicit. While explicit terms obviously are easy to identify, coding for implicit terms and deciding their level of implication is complicated by the need to base judgments on a somewhat subjective system. To attempt to limit the subjectivity, then (as well as to limit problems of reliability and validity ), coding such implicit terms usually involves the use of either a specialized dictionary or contextual translation rules. And sometimes, both tools are used--a trend reflected in recent versions of the Harvard and Lasswell dictionaries.

Methods of Conceptual Analysis

Conceptual analysis begins with identifying research questions and choosing a sample or samples. Once chosen, the text must be coded into manageable content categories. The process of coding is basically one of selective reduction. By reducing the text to categories consisting of a word, set of words, or phrases, the researcher can focus on, and code for, specific words or patterns that are indicative of the research question.

An example of a conceptual analysis would be to examine several Clinton speeches on health care, made during the 1992 presidential campaign, and code them for the existence of certain words. In looking at these speeches, the research question might involve examining the number of positive words used to describe Clinton's proposed plan, and the number of negative words used to describe the current status of health care in America. The researcher would be interested only in quantifying these words, not in examining how they are related, which is a function of relational analysis. In conceptual analysis, the researcher simply wants to examine presence with respect to his/her research question, i.e. is there a stronger presence of positive or negative words used with respect to proposed or current health care plans, respectively.

Once the research question has been established, the researcher must make his/her coding choices with respect to the eight category coding steps indicated by Carley (1992).

Steps for Conducting Conceptual Analysis

The following discussion of steps that can be followed to code a text or set of texts during conceptual analysis uses campaign speeches made by Bill Clinton during the 1992 presidential campaign as an example.

  • Decide the level of analysis.

First, the researcher must decide upon the level of analysis. With the health care speeches, to continue the example, the researcher must decide whether to code for a single word, such as "inexpensive," or for sets of words or phrases, such as "coverage for everyone."

  • Decide how many concepts to code for.

The researcher must now decide how many different concepts to code for. This involves developing a pre-defined or interactive set of concepts and categories. The researcher must decide whether or not to code for every single positive or negative word that appears, or only certain ones that the researcher determines are most relevant to health care. Then, with this pre-defined set, the researcher has to determine how much flexibility he/she allows him/herself when coding. The question of whether the researcher codes only from this pre-defined set, or allows him/herself to add relevant categories not included in the set as he/she finds them in the text, must be answered. Determining a certain number and set of concepts allows a researcher to examine a text for very specific things, keeping him/her on task. But introducing a level of coding flexibility allows new, important material to be incorporated into the coding process that could have significant bearing on one's results.

  • Decide whether to code for existence or frequency of a concept.

After a certain number and set of concepts are chosen for coding, the researcher must answer a key question: is he/she going to code for existence or frequency? This is important, because it changes the coding process. When coding for existence, "inexpensive" would only be counted once, no matter how many times it appeared. This would be a very basic coding process and would give the researcher a very limited perspective of the text. However, the number of times "inexpensive" appears in a text might be more indicative of importance. Knowing that "inexpensive" appeared 50 times, for example, compared to 15 appearances of "coverage for everyone," might lead a researcher to interpret that Clinton is trying to sell his health care plan based more on economic benefits, not comprehensive coverage. Knowing that "inexpensive" appeared, but not that it appeared 50 times, would not allow the researcher to make this interpretation, regardless of whether it is valid or not.
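The existence-versus-frequency choice can be shown in a few lines; the speech snippet and the `code_concept` helper are hypothetical:

```python
import re

def code_concept(text, term, mode="frequency"):
    """Code a single concept for existence (0/1) or frequency (raw count)."""
    count = len(re.findall(r"\b" + re.escape(term) + r"\b", text.lower()))
    if mode == "existence":
        return 1 if count else 0
    return count

speech = "An inexpensive plan. Inexpensive coverage means inexpensive care."
print(code_concept(speech, "inexpensive", mode="existence"))  # 1
print(code_concept(speech, "inexpensive", mode="frequency"))  # 3
```

Only the frequency count (3 versus 1) would support an interpretation about relative emphasis.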

  • Decide on how you will distinguish among concepts.

The researcher must next decide on the level of generalization, i.e. whether concepts are to be coded exactly as they appear, or if they can be recorded as the same even when they appear in different forms. For example, "expensive" might also appear as "expensiveness." The researcher needs to determine if the two words mean radically different things to him/her, or if they are similar enough that they can be coded as being the same thing, i.e. "expensive words." In line with this is the need to determine the level of implication one is going to allow. This entails more than subtle differences in tense or spelling, as with "expensive" and "expensiveness." Determining the level of implication would allow the researcher to code not only for the word "expensive," but also for words that imply "expensive." This could perhaps include technical words, jargon, or political euphemism, such as "economically challenging," that the researcher decides does not merit a separate category, but is better represented under the category "expensive," due to its implicit meaning of "expensive."

  • Develop rules for coding your texts.

After taking the generalization of concepts into consideration, a researcher will want to create translation rules that will allow him/her to streamline and organize the coding process so that he/she is coding for exactly what he/she wants to code for. Developing a set of rules helps the researcher ensure that he/she is coding things consistently throughout the text, in the same way every time. If a researcher coded "economically challenging" as a separate category from "expensive" in one paragraph, then coded it under the umbrella of "expensive" when it occurred in the next paragraph, his/her data would be invalid. The interpretations drawn from that data will subsequently be invalid as well. Translation rules protect against this and give the coding process a crucial level of consistency and coherence.
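A minimal sketch of translation rules as a lookup table, using the hypothetical categories from the example (real coding schemes are far richer):

```python
# Hypothetical translation rules: each surface form maps to exactly one
# category, so "economically challenging" is always coded as "expensive".
TRANSLATION_RULES = {
    "expensive": "expensive",
    "expensiveness": "expensive",
    "economically challenging": "expensive",
    "inexpensive": "inexpensive",
    "affordable": "inexpensive",
}

def apply_rules(phrases):
    """Map raw phrases to concept categories; None marks uncoded phrases."""
    return [TRANSLATION_RULES.get(p.lower()) for p in phrases]

print(apply_rules(["Economically challenging", "affordable", "the"]))
# ['expensive', 'inexpensive', None]
```

Because the mapping is fixed in one place, the same phrase cannot be coded one way in one paragraph and another way in the next.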

  • Decide what to do with "irrelevant" information.

The next choice a researcher must make involves irrelevant information. The researcher must decide whether irrelevant information should be ignored (as Weber, 1990, suggests), or used to reexamine and/or alter the coding scheme. In the case of this example, words like "and" and "the," as they appear by themselves, would be ignored. They add nothing to the quantification of words like "inexpensive" and "expensive" and can be disregarded without impacting the outcome of the coding.

  • Code the texts.

Once these choices about irrelevant information are made, the next step is to code the text. This is done either by hand, i.e. reading through the text and manually writing down concept occurrences, or through the use of various computer programs. Coding with a computer is one of contemporary conceptual analysis' greatest assets. By inputting one's categories, content analysis programs can easily automate the coding process and examine huge amounts of data, and a wider range of texts, quickly and efficiently. But automation is very dependent on the researcher's preparation and category construction. When coding is done manually, a researcher can recognize errors far more easily. A computer is only a tool and can only code based on the information it is given. This problem is most apparent when coding for implicit information, where category preparation is essential for accurate coding.
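Automated coding along these lines can be sketched as a category tally; the `CATEGORIES` dictionary is an assumed, toy construction, which is exactly the dependence on the researcher's preparation that the paragraph warns about:

```python
import re
from collections import Counter

# Assumed category dictionary: the program can only find what this
# construction tells it to look for.
CATEGORIES = {
    "positive": {"inexpensive", "affordable", "universal"},
    "negative": {"costly", "broken", "failing"},
}

def code_text(text):
    """Tally category hits in one text using the pre-built dictionary."""
    words = re.findall(r"[a-z]+", text.lower())
    tally = Counter()
    for word in words:
        for category, terms in CATEGORIES.items():
            if word in terms:
                tally[category] += 1
    return tally

result = code_text("The current system is costly and broken; my plan is inexpensive.")
print(result)  # 2 negative hits, 1 positive hit
```

A misspelled or missing dictionary entry silently drops occurrences, which is why manual spot-checking remains valuable.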

  • Analyze your results.

Once the coding is done, the researcher examines the data and attempts to draw whatever conclusions and generalizations are possible. Of course, before these can be drawn, the researcher must decide what to do with the information in the text that is not coded. One's options include either deleting or skipping over unwanted material, or viewing all information as relevant and important and using it to reexamine, reassess and perhaps even alter one's coding scheme. Furthermore, given that the conceptual analyst is dealing only with quantitative data, the levels of interpretation and generalizability are very limited. The researcher can only extrapolate as far as the data will allow. But it is possible to see trends, for example, that are indicative of much larger ideas. Using the example from step three, if the concept "inexpensive" appears 50 times, compared to 15 appearances of "coverage for everyone," then the researcher can pretty safely extrapolate that there does appear to be a greater emphasis on the economics of the health care plan, as opposed to its universal coverage for all Americans. It must be kept in mind that conceptual analysis, while extremely useful and effective for providing this type of information when done right, is limited by its focus and the quantitative nature of its examination. To more fully explore the relationships that exist between these concepts, one must turn to relational analysis.

Relational Analysis

Relational analysis, like conceptual analysis, begins with the act of identifying concepts present in a given text or set of texts. However, relational analysis seeks to go beyond presence by exploring the relationships between the concepts identified. Relational analysis has also been termed semantic analysis (Palmquist, Carley, & Dale, 1997). In other words, the focus of relational analysis is to look for semantic, or meaningful, relationships. Individual concepts, in and of themselves, are viewed as having no inherent meaning. Rather, meaning is a product of the relationships among concepts in a text. Carley (1992) asserts that concepts are "ideational kernels;" these kernels can be thought of as symbols which acquire meaning through their connections to other symbols.

Theoretical Influences on Relational Analysis

The kind of analysis that researchers employ will vary significantly according to their theoretical approach. Key theoretical approaches that inform content analysis include linguistics and cognitive science.

Linguistic approaches to content analysis focus analysis of texts on the level of a linguistic unit, typically single clause units. One example of this type of research is Gottschalk (1975), who developed an automated procedure which analyzes each clause in a text and assigns it a numerical score based on several emotional/psychological scales. Another technique is to code a text grammatically into clauses and parts of speech to establish a matrix representation (Carley, 1990).

Approaches that derive from cognitive science include the creation of decision maps and mental models. Decision maps attempt to represent the relationship(s) between ideas, beliefs, attitudes, and information available to an author when making a decision within a text. These relationships can be represented as logical, inferential, causal, sequential, and mathematical relationships. Typically, two of these links are compared in a single study, and are analyzed as networks. For example, Heise (1987) used logical and sequential links to examine symbolic interaction. This methodology is thought of as a more generalized cognitive mapping technique, rather than the more specific mental models approach.

Mental models are groups or networks of interrelated concepts that are thought to reflect conscious or subconscious perceptions of reality. According to cognitive scientists, internal mental structures are created as people draw inferences and gather information about the world. Mental models are a more specific approach to mapping because, beyond extraction and comparison, they can be numerically and graphically analyzed. Such models rely heavily on the use of computers to help analyze and construct mapping representations. Typically, studies based on this approach follow five general steps:

  • Identifying concepts
  • Defining relationship types
  • Coding the text on the basis of steps 1 and 2
  • Coding the statements
  • Graphically displaying and numerically analyzing the resulting maps

To create the model, a researcher converts a text into a map of concepts and relations; the map is then analyzed on the level of concepts and statements, where a statement consists of two concepts and their relationship. Carley (1990) asserts that this makes possible the comparison of a wide variety of maps, representing multiple sources, implicit and explicit information, as well as socially shared cognitions.
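A map and its statements can be sketched as sets of concept-relation-concept triples, so comparing maps reduces to set operations; the maps below are hypothetical:

```python
# A statement is two concepts plus the relationship between them; a map
# is then a set of statements. Comparing maps reduces to set operations.
map_a = {("scientists", "do", "research"), ("research", "leads to", "discoveries")}
map_b = {("scientists", "do", "research"), ("I", "do", "research")}

shared = map_a & map_b       # statements common to both maps
unique_to_a = map_a - map_b  # statements only in map A
print(shared)
```

Representing maps this way is what makes Carley's cross-source comparisons mechanical rather than impressionistic.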

Relational Analysis: Overview of Methods

As with other sorts of inquiry, initial choices with regard to what is being studied and/or coded for often determine the possibilities of that particular study. For relational analysis, it is important to first decide which concept type(s) will be explored in the analysis. Studies have been conducted with as few as one and as many as 500 concept categories. Obviously, too many categories may obscure your results and too few can lead to unreliable and potentially invalid conclusions. Therefore, it is important to allow the context and necessities of your research to guide your coding procedures.

The steps to relational analysis that we consider in this guide suggest some of the possible avenues available to a researcher doing content analysis. We provide an example to make the process easier to grasp. However, the choices made within the context of the example are only a few of many possibilities. The diversity of techniques available suggests that there is quite a bit of enthusiasm for this mode of research. Once a procedure is rigorously tested, it can be applied and compared across populations over time. The process of relational analysis has achieved a high degree of computer automation but is still, like most forms of research, time-consuming. Perhaps the strongest claim that can be made is that it maintains a high degree of statistical rigor without losing the richness of detail apparent in even more qualitative methods.

Three Subcategories of Relational Analysis

Affect extraction: This approach provides an emotional evaluation of concepts explicit in a text. It is problematic because emotion may vary across time and populations. Nevertheless, when extended it can be a potent means of exploring the emotional/psychological state of the speaker and/or writer. Gottschalk (1995) provides an example of this type of analysis. By assigning concepts identified a numeric value on corresponding emotional/psychological scales that can then be statistically examined, Gottschalk claims that the emotional/psychological state of the speaker or writer can be ascertained via their verbal behavior.

Proximity analysis: This approach, on the other hand, is concerned with the co-occurrence of explicit concepts in the text. In this procedure, the text is defined as a string of words. A given length of words, called a window, is determined. The window is then scanned across the text to check for the co-occurrence of concepts. The result is the creation of a concept matrix: a group of interrelated, co-occurring concepts that might suggest a certain overall meaning. The technique is problematic because the window records only explicit concepts and treats meaning as proximal co-occurrence. Other techniques such as clustering, grouping, and scaling are also useful in proximity analysis.
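Window scanning can be sketched in a few lines; the concept set, window length, and sentence are assumptions for illustration:

```python
import re
from collections import defaultdict
from itertools import combinations

CONCEPTS = {"perhaps", "maybe", "plan", "forgotten"}  # assumed concept set

def cooccurrence(text, window=5):
    """Slide a fixed-length word window across the text and count
    co-occurrences of concept pairs within each window position."""
    words = re.findall(r"[a-z]+", text.lower())
    matrix = defaultdict(int)
    for i in range(len(words) - window + 1):
        present = set(words[i:i + window]) & CONCEPTS
        for pair in combinations(sorted(present), 2):
            matrix[pair] += 1
    return dict(matrix)

print(cooccurrence("Perhaps I may have forgotten the plan, maybe not."))
```

Note that the counts depend entirely on the chosen window length, which is one reason the technique treats meaning as mere proximal co-occurrence.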

Cognitive mapping: This approach is one that allows for further analysis of the results from the two previous approaches. It attempts to take the above processes one step further by representing these relationships visually for comparison. Whereas affective and proximal analysis function primarily within the preserved order of the text, cognitive mapping attempts to create a model of the overall meaning of the text. This can be represented as a graphic map that represents the relationships between concepts.

In this manner, cognitive mapping lends itself to the comparison of semantic connections across texts. This is known as map analysis, which allows for comparisons to explore "how meanings and definitions shift across people and time" (Palmquist, Carley, & Dale, 1997). Maps can depict a variety of different mental models (such as that of the text, the writer/speaker, or the social group/period), according to the focus of the researcher. This variety is indicative of the theoretical assumptions that support mapping: mental models are representations of interrelated concepts that reflect conscious or subconscious perceptions of reality; language is the key to understanding these models; and these models can be represented as networks (Carley, 1990). Given these assumptions, it's not surprising to see how closely this technique reflects the cognitive concerns of socio- and psycholinguistics, and lends itself to the development of artificial intelligence models.

Steps for Conducting Relational Analysis

The following discussion presents the steps (or, perhaps more accurately, strategies) that can be followed to code a text or set of texts during relational analysis. These explanations are accompanied by examples of relational analysis possibilities for statements made by Bill Clinton during the 1998 hearings.

  • Identify the Question.

The question is important because it indicates where you are headed and why. Without a focused question, the concept types and options open to interpretation are limitless, and the analysis therefore difficult to complete. Possibilities for the Hairy Hearings of 1998 might be:

What did Bill Clinton say in the speech? OR What concrete information did he present to the public?
  • Choose a sample or samples for analysis.

Once the question has been identified, the researcher must select sections of text/speech from the hearings in which Bill Clinton may have not told the entire truth or is obviously holding back information. For relational content analysis, the primary consideration is how much information to preserve for analysis. One must be careful not to limit the results by doing so, but the researcher must also take special care not to take on so much that the coding process becomes too heavy and extensive to supply worthwhile results.

  • Determine the type of analysis.

Once the sample has been chosen for analysis, it is necessary to determine what type or types of relationships you would like to examine. There are different subcategories of relational analysis that can be used to examine the relationships in texts.

In this example, we will use proximity analysis because it is concerned with the co-occurrence of explicit concepts in the text. In this instance, we are not particularly interested in affect extraction because we are trying to get at the hard facts of what exactly was said, rather than determining the emotional considerations of the speaker and receivers surrounding the speech, which may be unrecoverable.

Once the subcategory of analysis is chosen, the selected text must be reviewed to determine the level of analysis. The researcher must decide whether to code for a single word, such as "perhaps," or for sets of words or phrases like "I may have forgotten."

  • Reduce the text to categories and code for words or patterns.

At the simplest level, a researcher can code merely for existence. This is not to say that simplicity of procedure leads to simplistic results. Many studies have successfully employed this strategy. For example, Palmquist (1990) did not attempt to establish the relationships among concept terms in the classrooms he studied; his study did, however, look at the change in the presence of concepts over the course of the semester, comparing a map analysis from the beginning of the semester to one constructed at the end. On the other hand, the requirement of one's specific research question may necessitate deeper levels of coding to preserve greater detail for analysis.

In relation to our extended example, the researcher might code for how often Bill Clinton used words that were ambiguous, held double meanings, or left an opening for change or "re-evaluation." The researcher might also choose to code for what words he used that have such an ambiguous nature in relation to the importance of the information directly related to those words.

  • Explore the relationships between concepts (Strength, Sign & Direction).

Once words are coded, the text can be analyzed for the relationships among the concepts set forth. There are three concepts which play a central role in exploring the relations among concepts in content analysis.

  • Strength of Relationship: Refers to the degree to which two or more concepts are related. These relationships are easiest to analyze, compare, and graph when all relationships between concepts are considered to be equal. However, assigning strength to relationships retains a greater degree of the detail found in the original text. Identifying the strength of a relationship is key when determining whether or not words like unless, perhaps, or maybe are related to a particular section of text, phrase, or idea.
  • Sign of a Relationship: Refers to whether the concepts are positively or negatively related. To illustrate, the concept "bear" is negatively related to the concept "stock market" in the same sense as the concept "bull" is positively related. Thus "it's a bear market" could be coded to show a negative relationship between "bear" and "market". Another approach to coding for sign entails the creation of separate categories for binary oppositions. The above example emphasizes "bull" as the negation of "bear," but the two could instead be coded as separate categories, one positive and one negative. There has been little research to determine the benefits and liabilities of these differing strategies. A use of sign coding for relationships in regard to the hearings may be to find out whether the words under observation or in question were used adversely or in favor of the concepts (this is tricky, but important to establishing meaning).
  • Direction of the Relationship: Refers to the type of relationship categories exhibit. Coding for this sort of information can be useful in establishing, for example, the impact of new information in a decision making process. Various types of directional relationships include, "X implies Y," "X occurs before Y" and "if X then Y," or quite simply the decision whether concept X is the "prime mover" of Y or vice versa. In the case of the 1998 hearings, the researcher might note that, "maybe implies doubt," "perhaps occurs before statements of clarification," and "if possibly exists, then there is room for Clinton to change his stance." In some cases, concepts can be said to be bi-directional, or having equal influence. This is equivalent to ignoring directionality. Both approaches are useful, but differ in focus. Coding all categories as bi-directional is most useful for exploratory studies where pre-coding may influence results, and is also most easily automated, or computer coded.
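The three properties above can be sketched as fields of a coded relationship record; the concepts, values, and coding scheme below are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Relation:
    """One coded relationship between two concepts (hypothetical scheme):
    strength in [0, 1], sign of +1 or -1, direction naming the link type."""
    source: str
    target: str
    strength: float
    sign: int
    direction: str

relations = [
    Relation("maybe", "doubt", strength=0.8, sign=1, direction="implies"),
    Relation("bear", "market", strength=1.0, sign=-1, direction="describes"),
]
print(relations[0].direction)  # implies
```

Treating every relation as equal-strength and bi-directional amounts to fixing `strength` at 1.0 and ignoring `direction`, which simplifies analysis at the cost of detail.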
  • Code the relationships.

One of the main differences between conceptual analysis and relational analysis is that the statements or relationships between concepts are coded. At this point, to continue our extended example, it is important to take special care with assigning value to the relationships in an effort to determine whether the ambiguous words in Bill Clinton's speech are just fillers, or hold information about the statements he is making.

  • Perform Statistical Analyses.

This step involves conducting statistical analyses of the data you've coded during your relational analysis. This may involve exploring for differences or looking for relationships among the variables you've identified in your study.

  • Map out the Representations.

In addition to statistical analysis, relational analysis often leads to viewing the representations of the concepts and their associations in a text (or across texts) in a graphical -- or map -- form. Relational analysis is also informed by a variety of different theoretical approaches: linguistic content analysis, decision mapping, and mental models.

The authors of this guide have created the following commentaries on content analysis.

Issues of Reliability & Validity

The issues of reliability and validity are concurrent with those addressed in other research methods. The reliability of a content analysis study refers to its stability, or the tendency for coders to consistently re-code the same data in the same way over a period of time; reproducibility, or the tendency for a group of coders to classify category membership in the same way; and accuracy, or the extent to which the classification of a text corresponds statistically to a standard or norm. Gottschalk (1995) points out that the issue of reliability may be further complicated by the inescapably human nature of researchers. For this reason, he suggests that coding errors can only be minimized, not eliminated (he shoots for 80% as an acceptable margin for reliability).
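Stability and reproducibility are often checked with a simple percent-agreement measure; this is a sketch (real studies typically also use chance-corrected statistics), and the coded data are hypothetical:

```python
def percent_agreement(coder_a, coder_b):
    """Share of coding decisions on which two coders (or two passes
    by one coder, for stability) agree."""
    assert len(coder_a) == len(coder_b)
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

# Two coders' category assignments for the same ten text units
a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "pos"]
b = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos", "pos", "pos"]
print(percent_agreement(a, b))  # 0.8
```

An agreement of 0.8 would just meet the 80% margin Gottschalk describes.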

On the other hand, the validity of a content analysis study refers to the correspondence of the categories to the conclusions, and the generalizability of results to a theory.

The validity of categories in implicit concept analysis, in particular, is achieved by utilizing multiple classifiers to arrive at an agreed upon definition of the category. For example, a content analysis study might measure the occurrence of the concept category "communist" in presidential inaugural speeches. Using multiple classifiers, the concept category can be broadened to include synonyms such as "red," "Soviet threat," "pinkos," "godless infidels" and "Marxist sympathizers." "Communist" is held to be the explicit variable, while "red," etc. are the implicit variables.

The overarching problem of concept analysis research is the challengeable nature of conclusions reached by its inferential procedures. The question lies in what level of implication is allowable, i.e. do the conclusions follow from the data, or are they explainable by some other phenomenon? For occurrence-specific studies, for example, can the second occurrence of a word carry the same weight as the ninety-ninth? Reasonable conclusions can be drawn from substantive amounts of quantitative data, but the question of proof may still remain unanswered.

This problem is again best illustrated when one uses computer programs to conduct word counts. The problem of distinguishing between synonyms and homonyms can completely throw off one's results, invalidating any conclusions one infers from the results. The word "mine," for example, variously denotes a personal pronoun, an explosive device, and a deep hole in the ground from which ore is extracted. One may obtain an accurate count of that word's occurrence and frequency, but not have an accurate accounting of the meaning inherent in each particular usage. For example, one may find 50 occurrences of the word "mine." But, if one is only looking specifically for "mine" as an explosive device, and 17 of the occurrences are actually personal pronouns, the resulting 50 is an inaccurate result. Any conclusions drawn from that number would be invalid.
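The "mine" problem can be made concrete with sense-tagged tokens; the tags and counts below are hypothetical:

```python
# Fifty occurrences of "mine": a naive count lumps all senses together,
# while (hypothetical) sense tags let us count only the intended one.
tokens = [("mine", "pronoun")] * 17 + [("mine", "explosive")] * 33

naive_count = sum(1 for word, _ in tokens if word == "mine")
explosive_count = sum(1 for word, sense in tokens
                      if word == "mine" and sense == "explosive")
print(naive_count, explosive_count)  # 50 33
```

The naive count of 50 overstates the explosive-device sense by 17, which is exactly the kind of error that invalidates downstream conclusions.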

The generalizability of one's conclusions, then, is very dependent on how one determines concept categories, as well as on how reliable those categories are. It is imperative that one defines categories that accurately measure the idea and/or items one is seeking to measure. Akin to this is the construction of rules. Developing rules that allow one, and others, to categorize and code the same data in the same way over a period of time, referred to as stability, is essential to the success of a conceptual analysis. Reproducibility, not only of specific categories, but of general methods applied to establishing all sets of categories, makes a study, and its subsequent conclusions and results, more sound. A study which does this, i.e. in which the classification of a text corresponds to a standard or norm, is said to have accuracy.

Advantages of Content Analysis

Content analysis offers several advantages to researchers who consider using it. In particular, content analysis:

  • looks directly at communication via texts or transcripts, and hence gets at the central aspect of social interaction
  • can allow for both quantitative and qualitative operations
  • can provide valuable historical/cultural insights over time through analysis of texts
  • allows a closeness to text which can alternate between specific categories and relationships and also statistically analyzes the coded form of the text
  • can be used to interpret texts for purposes such as the development of expert systems (since knowledge and rules can both be coded in terms of explicit statements about the relationships among concepts)
  • is an unobtrusive means of analyzing interactions
  • provides insight into complex models of human thought and language use

Disadvantages of Content Analysis

Content analysis suffers from several disadvantages, both theoretical and procedural. In particular, content analysis:

  • can be extremely time consuming
  • is subject to increased error, particularly when relational analysis is used to attain a higher level of interpretation
  • is often devoid of theoretical base, or attempts too liberally to draw meaningful inferences about the relationships and impacts implied in a study
  • is inherently reductive, particularly when dealing with complex texts
  • tends too often to simply consist of word counts
  • often disregards the context that produced the text, as well as the state of things after the text is produced
  • can be difficult to automate or computerize

The Palmquist, Carley and Dale study, a summary of "Applications of Computer-Aided Text Analysis: Analyzing Literary and Non-Literary Texts" (1997), is an example of two studies that were conducted using both conceptual and relational analysis. The Problematic Text for Content Analysis shows the differences in results obtained by a conceptual and a relational approach to a study.

Related Information: Example of a Problematic Text for Content Analysis

In this example, both students observed a scientist and were asked to write about the experience.

Student A: I found that scientists engage in research in order to make discoveries and generate new ideas. Such research by scientists is hard work and often involves collaboration with other scientists which leads to discoveries which make the scientists famous. Such collaboration may be informal, such as when they share new ideas over lunch, or formal, such as when they are co-authors of a paper.
Student B: It was hard work to research famous scientists engaged in collaboration and I made many informal discoveries. My research showed that scientists engaged in collaboration with other scientists are co-authors of at least one paper containing their new ideas. Some scientists make formal discoveries and have new ideas.

Content analysis coding for explicit concepts may not reveal any significant differences. For example, the words "I", "scientist", "research", "hard work", "collaboration", "discoveries", and "new ideas" are explicit in both texts, occur the same number of times, and receive the same emphasis. Relational analysis, or cognitive mapping, however, reveals that although the two texts share the same concepts, they relate those concepts quite differently. Analyzing the statements shows that Student A reports on what "I" found out about "scientists" and elaborates the notion of "scientists" doing "research", while Student B focuses on what "I's" research was and sees scientists as "making discoveries" without emphasis on research.
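The distinction can be made concrete in code. The sketch below is an illustrative toy, not the software used in the study: it codes both student texts for the same explicit concepts, then extracts sentence-level co-occurrence pairs as a crude stand-in for relational analysis. The concept list and the sentence-level pairing rule are assumptions chosen for illustration.

```python
import re
from itertools import combinations

STUDENT_A = ("I found that scientists engage in research in order to make "
             "discoveries and generate new ideas. Such research by scientists "
             "is hard work and often involves collaboration with other "
             "scientists which leads to discoveries which make the scientists "
             "famous. Such collaboration may be informal, such as when they "
             "share new ideas over lunch, or formal, such as when they are "
             "co-authors of a paper.")

STUDENT_B = ("It was hard work to research famous scientists engaged in "
             "collaboration and I made many informal discoveries. My research "
             "showed that scientists engaged in collaboration with other "
             "scientists are co-authors of at least one paper containing "
             "their new ideas. Some scientists make formal discoveries and "
             "have new ideas.")

# An assumed coding scheme for this example.
CONCEPTS = ["scientist", "research", "hard work", "collaboration",
            "discoveries", "new ideas", "informal", "formal", "co-authors"]

def conceptual_counts(text):
    """Conceptual analysis: count explicit occurrences of each concept."""
    text = text.lower()
    return {c: len(re.findall(r"\b" + re.escape(c) + r"s?\b", text))
            for c in CONCEPTS}

def relational_pairs(text):
    """A crude relational proxy: concept pairs co-occurring in a sentence."""
    pairs = set()
    for sentence in re.split(r"(?<=\.)\s+", text.lower()):
        present = [c for c in CONCEPTS
                   if re.search(r"\b" + re.escape(c) + r"s?\b", sentence)]
        pairs.update(combinations(sorted(present), 2))
    return pairs

# The explicit concept counts are identical...
print(conceptual_counts(STUDENT_A) == conceptual_counts(STUDENT_B))  # → True
# ...but the sentence-level relationships between concepts are not.
print(relational_pairs(STUDENT_A) == relational_pairs(STUDENT_B))    # → False
```

Running this shows why a pure word count cannot distinguish the two texts: only the relational step exposes the different ways each student connects the shared concepts.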

Related Information: The Palmquist, Carley and Dale Study

Consider these two questions: How has the depiction of robots changed over more than a century's worth of writing? And do students and writing instructors share the same terms for describing the writing process? Although these questions seem unrelated, they share a commonality: in the Palmquist, Carley, and Dale study, both were answered using computer-aided text analysis, demonstrating how very different kinds of texts can be analyzed with the same techniques.

Literary texts

One half of the study explored the depiction of robots in 27 science fiction texts written between 1818 and 1988. After the texts were divided into three historically defined groups, readers looked for changes in the depiction of robots over time. To do this, researchers had to create concept lists and relationship types, create maps using computer software (see Fig. 1), modify those maps, and then analyze them. The final product of the analysis revealed that over time authors were less likely to depict robots as metallic humanoids.

Non-literary texts

The second half of the study used student journals and interviews, teacher interviews, textbooks, and classroom observations as the non-literary texts from which concepts and words were taken. The purpose of the study was to determine whether, over time, teachers and students would begin to share a similar vocabulary about the writing process. Again, researchers used computer software to assist in the process. This time, computers helped researchers generate a concept list based on frequently occurring words and phrases from all the texts. Maps were also created and analyzed in this study (see Fig. 2).
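The frequency-based concept-list step can be sketched in a few lines. This is a hypothetical simplification, not the software the researchers actually used; the stopword list, the minimum word length, and the sample journal sentences are all assumptions for illustration.

```python
import re
from collections import Counter

# A minimal stopword list; a real study would use a much fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are", "was",
             "were", "it", "that", "this", "for", "on", "with", "as", "be",
             "before", "my", "i"}

def concept_list(texts, top_n=10):
    """Build a candidate concept list from frequently occurring words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]

# Hypothetical student-journal excerpts.
journals = [
    "Revision means rethinking the whole draft.",
    "The draft needs another revision before the final draft is due.",
]
print(concept_list(journals, top_n=2))  # → ['draft', 'revision']
```

In practice, a human analyst would still review and prune the generated list, since high frequency alone does not make a word a meaningful concept.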

Annotated Bibliography

Resources On How To Conduct Content Analysis

Beard, J., & Yaprak, A. (1989). Language implications for advertising in international markets: A model for message content and message execution. A paper presented at the 8th International Conference on Language Communication for World Business and the Professions. Ann Arbor, MI.

This report discusses the development and testing of a content analysis model for assessing advertising themes and messages aimed primarily at U.S. markets, a model that seeks to overcome barriers in the cultural environment of international markets. Texts were categorized under three headings: rational, emotional, and moral. The goal was to teach students to appreciate differences in language and culture.

Berelson, B. (1971). Content analysis in communication research . New York: Hafner Publishing Company.

While this book provides an extensive outline of the uses of content analysis, it is far more concerned with conveying a critical approach to current literature on the subject. In this respect, it assumes a bit of prior knowledge, but is still accessible through the use of concrete examples.

Budd, R. W., Thorp, R.K., & Donohew, L. (1967). Content analysis of communications . New York: Macmillan Company.

Although published in 1967, the decision of the authors to focus on recent trends in content analysis keeps their insights relevant even to modern audiences. The book focuses on specific uses and methods of content analysis with an emphasis on its potential for researching human behavior. It is also geared toward the beginning researcher and breaks down the process of designing a content analysis study into 6 steps that are outlined in successive chapters. A useful annotated bibliography is included.

Carley, K. (1992). Coding choices for textual analysis: A comparison of content analysis and map analysis. Unpublished Working Paper.

Compares the coding choices necessary for conceptual analysis and relational analysis, especially focusing on cognitive maps. Discusses the concept coding rules needed for sufficient reliability and validity in a Content Analysis study. In addition, several pitfalls common to texts are discussed.

Carley, K. (1990). Content analysis. In R.E. Asher (Ed.), The Encyclopedia of Language and Linguistics. Edinburgh: Pergamon Press.

A quick, yet detailed, overview of the different methodological kinds of Content Analysis. Carley breaks her paper into five sections: Conceptual Analysis, Procedural Analysis, Relational Analysis, Emotional Analysis, and Discussion. Also included is an excellent and comprehensive Content Analysis reference list.

Carley, K. (1989). Computer analysis of qualitative data . Pittsburgh, PA: Carnegie Mellon University.

Presents graphic, illustrated representations of computer based approaches to content analysis.

Carley, K. (1992). MECA . Pittsburgh, PA: Carnegie Mellon University.

A resource guide explaining the fifteen routines that compose the Map Extraction Comparison and Analysis (MECA) software program. Lists the source file, input and output files, and the purpose of each routine.

Carney, T. F. (1972). Content analysis: A technique for systematic inference from communications . Winnipeg, Canada: University of Manitoba Press.

This book introduces and explains in detail the concept and practice of content analysis. Carney defines it; traces its history; discusses how content analysis works and its strengths and weaknesses; and explains through examples and illustrations how one goes about doing a content analysis.

de Sola Pool, I. (1959). Trends in content analysis . Urbana, Ill: University of Illinois Press.

The 1959 collection of papers begins by differentiating quantitative and qualitative approaches to content analysis, and then details facets of its uses in a wide variety of disciplines: from linguistics and folklore to biography and history. Includes a discussion on the selection of relevant methods and representational models.

Duncan, D. F. (1989). Content analysis in health education research: An introduction to purposes and methods. Health Education, 20 (7).

This article proposes using content analysis as a research technique in health education. A review of literature relating to applications of this technique and a procedure for content analysis are presented.

Gottschalk, L. A. (1995). Content analysis of verbal behavior: New findings and clinical applications. Hillside, NJ: Lawrence Erlbaum Associates, Inc.

This book primarily focuses on the Gottschalk-Gleser method of content analysis, and its application as a method of measuring psychological dimensions of children and adults via the content and form analysis of their verbal behavior, using the grammatical clause as the basic unit of communication for carrying semantic messages generated by speakers or writers.

Krippendorf, K. (1980). Content analysis: An introduction to its methodology . Beverly Hills, CA: Sage Publications.

This is one of the most widely quoted resources in many of the current studies of Content Analysis. Recommended as another good, basic resource, as Krippendorf presents the major issues of Content Analysis in much the same way as Weber (1975).

Moeller, L. G. (1963). An introduction to content analysis--including annotated bibliography . Iowa City: University of Iowa Press.

A good reference for basic content analysis. Discusses the options of sampling, categories, direction, measurement, and the problems of reliability and validity in setting up a content analysis. Perhaps better as a historical text due to its age.

Smith, C. P. (Ed.). (1992). Motivation and personality: Handbook of thematic content analysis. New York: Cambridge University Press.

Billed by its authors as "the first book to be devoted primarily to content analysis systems for assessment of the characteristics of individuals, groups, or historical periods from their verbal materials." The text includes manuals for using various systems, theory, and research regarding the background of systems, as well as practice materials, making the book both a reference and a handbook.

Solomon, M. (1993). Content analysis: a potent tool in the searcher's arsenal. Database, 16 (2), 62-67.

Online databases can be used to analyze data, as well as to simply retrieve it. Online-media-source content analysis represents a potent but little-used tool for the business searcher. Content analysis benchmarks useful to advertisers include prominence, offspin, sponsor affiliation, verbatims, word play, positioning and notational visibility.

Weber, R. P. (1990). Basic content analysis, second edition . Newbury Park, CA: Sage Publications.

Good introduction to Content Analysis. The first chapter presents a quick overview of Content Analysis. The second chapter discusses content classification and interpretation, including sections on reliability, validity, and the creation of coding schemes and categories. Chapter three discusses techniques of Content Analysis, using a number of tables and graphs to illustrate the techniques. Chapter four examines issues in Content Analysis, such as measurement, indication, representation and interpretation.

Examples of Content Analysis

Adams, W., & Shriebman, F. (1978). Television network news: Issues in content research . Washington, DC: George Washington University Press.

A fairly comprehensive application of content analysis to the field of television news reporting. The book's tripartite division discusses current trends and problems in news criticism from a content analysis perspective, presents four different content analysis studies of news media, and makes recommendations for future research in the area. Worth a look by anyone interested in mass communication research.

Auter, P. J., & Moore, R. L. (1993). Buying from a friend: a content analysis of two teleshopping programs. Journalism Quarterly, 70 (2), 425-437.

A preliminary study was conducted to content-analyze random samples of two teleshopping programs, using a measure of content interactivity and a locus of control message index.

Barker, S. P. (???) Fame: A content analysis study of the American film biography. Ohio State University. Thesis.

Barker examined thirty Oscar-nominated films dating from 1929 to 1979, using the O. J. Harvey Belief System and Kohlberg's Moral Stages to determine whether cinema heroes were positive role models for fame and success or morally ambiguous celebrities. Content analysis was successful in determining several trends relative to the frequency and portrayal of women in film, the generally high ethical character of the protagonists, and the dogmatic, close-minded nature of film antagonists.

Bernstein, J. M. & Lacy, S. (1992). Contextual coverage of government by local television news. Journalism Quarterly, 69 (2), 329-341.

This content analysis of 14 local television news operations in five markets looks at how local TV news shows contribute to the marketplace of ideas. Performance was measured as the allocation of stories to types of coverage that provide the context about events and issues confronting the public.

Blaikie, A. (1993). Images of age: a reflexive process. Applied Ergonomics, 24 (1), 51-58.

Content analysis of magazines provides a sharp instrument for reflecting the change in stereotypes of aging over past decades.

Craig, R. S. (1992). The effect of day part on gender portrayals in television commercials: a content analysis. Sex Roles: A Journal of Research, 26 (5-6), 197-213.

Gender portrayals in 2,209 network television commercials were content analyzed. To compare differences between three day parts, the sample was chosen from three time periods: daytime, evening prime time, and weekend afternoon sportscasts. The results indicate large and consistent differences in the way men and women are portrayed in these three day parts, with almost all comparisons reaching significance at the .05 level. Although ads in all day parts tended to portray men in stereotypical roles of authority and dominance, those on weekends tended to emphasize escape from home and family. The findings of earlier studies which did not consider day part differences may now have to be reevaluated.

Dillon, D. R. et al. (1992). Article content and authorship trends in The Reading Teacher, 1948-1991. The Reading Teacher, 45 (5), 362-368.

The authors explore changes in the focus of the journal over time.

Eberhardt, EA. (1991). The rhetorical analysis of three journal articles: The study of form, content, and ideology. Ft. Collins, CO: Colorado State University.

Eberhardt uses content analysis in this thesis paper to analyze three journal articles that reported on President Ronald Reagan's address in which he responded to the Tower Commission report concerning the Iran-Contra Affair. The reports concentrated on three rhetorical elements: idea generation or content; linguistic style or choice of language; and the potential societal effect of both, which Eberhardt analyzes, along with the particular ideological orientation espoused by each magazine.

Ellis, B. G. & Dick, S. J. (1996). Who was "Shadow"? The computer knows: applying grammar-program statistics in content analyses to solve mysteries about authorship. Journalism & Mass Communication Quarterly, 73 (4), 947-963.

This study's objective was to employ the statistics-documentation portion of a word-processing program's grammar-check feature as a final, definitive, and objective tool for content analyses - used in tandem with qualitative analyses - to determine authorship. Investigators concluded there was significant evidence from both modalities to support their theory that Henry Watterson, long-time editor of the Louisville Courier-Journal, probably was the South's famed Civil War correspondent "Shadow" and to rule out another prime suspect, John H. Linebaugh of the Memphis Daily Appeal. Until now, this Civil War mystery has never been conclusively solved, puzzling historians specializing in Confederate journalism.

Gottschalk, L. A., Stein, M. K. & Shapiro, D.H. (1997). The application of computerized content analysis in a psychiatric outpatient clinic. Journal of Clinical Psychology, 53 (5) , 427-442.

Twenty-five new psychiatric outpatients were clinically evaluated and were administered a brief psychological screening battery which included measurements of symptoms, personality, and cognitive function. Included in this assessment procedure were the Gottschalk-Gleser Content Analysis Scales on which scores were derived from five minute speech samples by means of an artificial intelligence-based computer program. The use of this computerized content analysis procedure for initial, rapid diagnostic neuropsychiatric appraisal is supported by this research.

Graham, J. L., Kamins, M. A., & Oetomo, D. S. (1993). Content analysis of German and Japanese advertising in print media from Indonesia, Spain, and the United States. Journal of Advertising , 22 (2), 5-16.

The authors analyze informational and emotional content in print advertisements in order to consider how home-country culture influences firms' marketing strategies and tactics in foreign markets. Research results provided evidence contrary to the original hypothesis that home-country culture would influence ads in each of the target countries.

Herzog, A. (1973). The B.S. Factor: The theory and technique of faking it in America . New York: Simon and Schuster.

Herzog takes a look at the rhetoric of American culture using content analysis to point out discrepancies between intention and reality in American society. The study reveals, albeit in a comedic tone, how double talk and "not quite lies" are pervasive in our culture.

Horton, N. S. (1986). Young adult literature and censorship: A content analysis of seventy-eight young adult books . Denton, TX: North Texas State University.

The purpose of Horton's content analysis was to analyze a representative sample of seventy-eight current young adult books to determine the extent to which they contain items that are objectionable to would-be censors. Seventy-eight books were identified which fit the criteria of popularity and literary quality. Each book was analyzed for, and tallied for occurrence of, six categories: profanity, sex, violence, parent conflict, drugs, and condoned bad behavior.

Isaacs, J. S. (1984). A verbal content analysis of the early memories of psychiatric patients . Berkeley: California School of Professional Psychology.

Isaacs did a content analysis investigation on the relationship between words and phrases used in early memories and clinical diagnosis. His hypothesis was that in conveying their early memories schizophrenic patients tend to use an identifiable set of words and phrases more frequently than do nonpatients and that schizophrenic patients use these words and phrases more frequently than do patients with major affective disorders.

Jean Lee, S. K. & Hwee Hoon, T. (1993). Rhetorical vision of men and women managers in Singapore. Human Relations, 46 (4), 527-542.

A comparison of media portrayal of male and female managers' rhetorical vision in Singapore is made. Content analysis of newspaper articles used to make this comparison also reveals the inherent conflicts that women managers have to face. Purposive and multi-stage sampling of articles are utilized.

Kaur-Kasior, S. (1987). The treatment of culture in greeting cards: A content analysis . Bowling Green, OH: Bowling Green State University.

Using six historical periods dating from 1870 to 1987, this content analysis study attempted to determine what structural/cultural aspects of American society were reflected in greeting cards. The study determined that the size of cards increased over time, included more pages, and had animals and flowers as their most dominant symbols. In addition, white was the most common color used. Due to habituation and specialization, says the author, greeting cards have become institutionalized in American culture.

Koza, J. E. (1992). The missing males and other gender-related issues in music education: A critical analysis of evidence from the Music Supervisor's Journal, 1914-1924. Paper presented at the annual meeting of the American Educational Research Association. San Francisco.

The goal of this study was to identify all educational issues that would today be explicitly gender related and to analyze the explanations past music educators gave for the existence of gender-related problems. A content analysis of every gender-related reference was undertaken, finding that the current preoccupation with males in music education has a long history and that little has changed since the early part of this century.

Laccinole, M. D. (1982). Aging and married couples: A language content analysis of a conversational and expository speech task . Eugene, OR: University of Oregon.

Using content analysis, this paper investigated the relationship of age to the use of grammatical categories, and described the differences in the usage of these categories in a conversational and expository speech task by fifty married couples. The subjects Laccinole used in his analysis were Caucasian, English-speaking, middle-class, ranged in age from 20 to 83, were in good health, and had no history of communication disorders.
Laffal, J. (1995). A concept analysis of Jonathan Swift's 'A Tale of a Tub' and 'Gulliver's Travels.' Computers and Humanities, 29 (5), 339-362.

In this study, comparisons of concept profiles of "Tub," "Gulliver," and Swift's own contemporary texts, as well as a composite text of 18th century writers, reveal that "Gulliver" is conceptually different from "Tub." The study also discovers that the concepts and words of these texts suggest two strands in Swift's thinking.

Lewis, S. M. (1991). Regulation from a deregulatory FCC: Avoiding discursive dissonance. Masters Thesis, Fort Collins, CO: Colorado State University.

This thesis uses content analysis to examine inconsistent statements made by the Federal Communications Commission (FCC) in its policy documents during the 1980s. Lewis analyzes positions set forth by the FCC in its policy statements and catalogues different strategies that can be used by speakers to be or to appear consistent, as well as strategies to avoid inconsistent speech or discursive dissonance.

Norton, T. L. (1987). The changing image of childhood: A content analysis of Caldecott Award books. Los Angeles: University of South Carolina.

Content analysis was conducted on 48 Caldecott Medal recipient books dating from 1938 to 1985 to determine whether they reflect the idea that the social perception of childhood has altered since the early 1960s. The results revealed an increasing "loss of childhood innocence," as well as a general sentimentality for childhood pervasive in the texts. Norton suggests further study of children's literature to confirm the validity of such findings.

O'Dell, J. W. & Weideman, D. (1993). Computer content analysis of the Schreber case. Journal of Clinical Psychology, 49 (1), 120-125.

An example of the application of content analysis as a means of recreating a mental model of the psychology of an individual.

Pratt, C. A. & Pratt, C. B. (1995). Comparative content analysis of food and nutrition advertisements in Ebony, Essence, and Ladies' Home Journal. Journal of Nutrition Education, 27 (1), 11-18.

This study used content analysis to measure the frequencies and forms of food, beverage, and nutrition advertisements and their associated health-promotional messages in three U.S. consumer magazines during two 3-year periods: 1980-1982 and 1990-1992. The study showed statistically significant differences among the three magazines in both the frequencies and the types of major promotional messages in the advertisements. Differences between the advertisements in Ebony and Essence, the readerships of which were primarily African-American, and those found in Ladies' Home Journal were noted, as were changes between the two time periods. An interesting tie-in to ethnographic research studies?
Riffe, D., Lacy, S., & Drager, M. W. (1996). Sample size in content analysis of weekly news magazines. Journalism & Mass Communication Quarterly, 73 (3), 635-645.

This study explores a variety of approaches to deciding sample size in analyzing magazine content. Having tested random samples of size six, eight, ten, twelve, fourteen, and sixteen issues, the authors show that a monthly stratified sample of twelve issues is the most efficient method for inferring to a year's issues.

Roberts, S. K. (1987). A content analysis of how male and female protagonists in Newbery Medal and Honor books overcome conflict: Incorporating a locus of control framework. Fayetteville, AR: University of Arkansas.

The purpose of this content analysis was to analyze Newbery Medal and Honor books in order to determine how male and female protagonists were assigned behavioral traits in overcoming conflict as it relates to an internal or external locus of control schema. Roberts used all, instead of just a sample, of the fictional Newbery Medal and Honor books which met his study's criteria. A total of 120 male and female protagonists were categorized, from Newbery books dating from 1922 to 1986.

Schneider, J. (1993). Square One TV content analysis: Final report . New York: Children's Television Workshop.

This report summarizes the mathematical and pedagogical content of the 230 programs in the Square One TV library after five seasons of production, relating that content to the goals of the series which were to make mathematics more accessible, meaningful, and interesting to the children viewers.

Smith, T. E., Sells, S. P., and Clevenger, T. Ethnographic content analysis of couple and therapist perceptions in a reflecting team setting. The Journal of Marital and Family Therapy, 20 (3), 267-286.

An ethnographic content analysis was used to examine couple and therapist perspectives about the use and value of reflecting team practice. Postsession ethnographic interviews from both couples and therapists were examined for the frequency of themes in seven categories that emerged from a previous ethnographic study of reflecting teams. Ethnographic content analysis is briefly contrasted with conventional modes of quantitative content analysis to illustrate its usefulness and rationale for discovering emergent patterns, themes, emphases, and process using both inductive and deductive methods of inquiry.

Stahl, N. A. (1987). Developing college vocabulary: A content analysis of instructional materials. Reading, Research and Instruction , 26 (3).

This study investigates the extent to which the content of 55 college vocabulary texts is consistent with current research and theory on vocabulary instruction. It recommends less reliance on memorization and more emphasis on deep understanding and independent vocabulary development.

Swetz, F. (1992). Fifteenth and sixteenth century arithmetic texts: What can we learn from them? Science and Education, 1 (4).

Surveys the format and content of 15th and 16th century arithmetic textbooks, discussing the types of problems that were most popular in these early texts and briefly analyses problem contents. Notes the residual educational influence of this era's arithmetical and instructional practices.
Walsh, K., et al. (1996). Management in the public sector: a content analysis of journals. Public Administration, 74 (2), 315-325.

The popularity and implementation of managerial ideas from 1980 to 1992 are examined through the content of five journals covering local government, health, education, and social services. Contents were analyzed according to commercialism, user involvement, performance evaluation, staffing, strategy, and involvement with other organizations. Overall, local government articles showed the greatest concern with commercialism, while health and social care articles were most concerned with user involvement.

For Further Reading

Abernethy, A. M., & Franke, G. R. (1996). The information content of advertising: a meta-analysis. Journal of Advertising, 25 (2) (Summer), 1-18.

Carley, K., & Palmquist, M. (1992). Extracting, representing and analyzing mental models. Social Forces , 70 (3), 601-636.

Fan, D. (1988). Predictions of public opinion from the mass media: Computer content analysis and mathematical modeling . New York, NY: Greenwood Press.

Franzosi, R. (1990). Computer-assisted coding of textual data: An application to semantic grammars. Sociological Methods and Research, 19 (2), 225-257.

McTavish, D.G., & Pirro, E. (1990) Contextual content analysis. Quality and Quantity , 24 , 245-265.

Palmquist, M. E. (1990). The lexicon of the classroom: language and learning in writing class rooms . Doctoral dissertation, Carnegie Mellon University, Pittsburgh, PA.

Palmquist, M. E., Carley, K.M., and Dale, T.A. (1997). Two applications of automated text analysis: Analyzing literary and non-literary texts. In C. Roberts (Ed.), Text Analysis for the Social Sciences: Methods for Drawing Statistical Inferences from Texts and Transcripts. Hillsdale, NJ: Lawrence Erlbaum Associates.

Roberts, C.W. (1989). Other than counting words: A linguistic approach to content analysis. Social Forces, 68 , 147-177.

Issues in Content Analysis

Jolliffe, L. (1993). Yes! More content analysis! Newspaper Research Journal , 14 (3-4), 93-97.

The author responds to an editorial essay by Barbara Luebke which criticizes excessive use of content analysis in newspaper content studies. The author points out the positive applications of content analysis when it is theory-based and utilized as a means of suggesting how or why the content exists, or what its effects on public attitudes or behaviors may be.

Kang, N., Kara, A., Laskey, H. A., & Seaton, F. B. (1993). A SAS MACRO for calculating intercoder agreement in content analysis. Journal of Advertising, 22 (2), 17-28.

A key issue in content analysis is the level of agreement across the judgments which classify the objects or stimuli of interest. A review of articles published in the Journal of Advertising indicates that many authors are not fully utilizing recommended measures of intercoder agreement and thus may not be adequately establishing the reliability of their research. This paper presents a SAS MACRO which facilitates the computation of frequently recommended indices of intercoder agreement in content analysis.
Lacy, S. & Riffe, D. (1996). Sampling error and selecting intercoder reliability samples for nominal content categories. Journalism & Mass Communication Quarterly, 73 (4), 693-704.

This study views intercoder reliability as a sampling problem. It develops a formula for generating the sample sizes needed to have valid reliability estimates. It also suggests steps for reporting reliability. The resulting sample sizes will permit a known degree of confidence that the agreement in a sample of items is representative of the pattern that would occur if all content items were coded by all coders.
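The SAS MACRO itself is not reproduced here, but one frequently recommended index of intercoder agreement, Cohen's kappa, is simple enough to sketch directly. This is a generic two-coder, nominal-category version; the example codings are hypothetical.

```python
def cohens_kappa(coder1, coder2):
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    assert len(coder1) == len(coder2) and coder1
    n = len(coder1)
    # Observed proportion of items on which the coders agree.
    observed = sum(a == b for a, b in zip(coder1, coder2)) / n
    # Agreement expected by chance, from each coder's marginal proportions.
    categories = set(coder1) | set(coder2)
    expected = sum((coder1.count(c) / n) * (coder2.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)

# Two coders classify eight ads as "rational", "emotional", or "moral".
c1 = ["rational", "rational", "emotional", "moral",
      "moral", "emotional", "rational", "moral"]
c2 = ["rational", "emotional", "emotional", "moral",
      "moral", "emotional", "rational", "emotional"]
print(round(cohens_kappa(c1, c2), 3))  # → 0.636
```

Note that raw percent agreement here is 75%, but kappa is lower because some of that agreement would occur by chance alone, which is exactly the point the reliability literature above makes.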

Riffe, D., Aust, C. F., & Lacy, S. R. (1993). The effectiveness of random, consecutive day and constructed week sampling in newspaper content analysis. Journalism Quarterly, 70 (1), 133-139.

This study compares 20 sets each of samples for four different sizes using simple random, constructed week and consecutive day samples of newspaper content. Comparisons of sample efficiency, based on the percentage of sample means in each set of 20 falling within one or two standard errors of the population mean, show the superiority of constructed week sampling.
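Constructed-week sampling, as compared in this study, stratifies by day of the week: one randomly chosen Monday, one Tuesday, and so on, so that weekday-driven cycles in newspaper content do not bias the sample. A minimal sketch, where the year and seed are arbitrary choices for the example:

```python
import random
from datetime import date, timedelta

def constructed_week(year, rng=None):
    """Draw a constructed-week sample: one random date per weekday."""
    rng = rng or random.Random()
    n_days = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    days = [date(year, 1, 1) + timedelta(d) for d in range(n_days)]
    # Group the year's dates by weekday (0 = Monday ... 6 = Sunday).
    by_weekday = {}
    for d in days:
        by_weekday.setdefault(d.weekday(), []).append(d)
    # One issue per weekday, so every day of the week appears exactly once.
    return sorted(rng.choice(by_weekday[w]) for w in range(7))

sample = constructed_week(1993, random.Random(42))
print([d.isoformat() for d in sample])
```

Contrast this with a simple random sample of seven dates, which can easily draw three Sundays and no Mondays and thus over-represent weekend content.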

Thomas, S. (1994). Artifactual study in the analysis of culture: A defense of content analysis in a postmodern age. Communication Research, 21 (6), 683-697.

Although both modern and postmodern scholars have criticized the method of content analysis with allegations of reductionism and other epistemological limitations, it is argued here that these criticisms are ill-founded. In building an argument for the validity of content analysis, the general value of artifact or text study is first considered.

Zollars, C. (1994). The perils of periodical indexes: Some problems in constructing samples for content analysis and culture indicators research. Communication Research, 21 (6), 698-714.

The author examines problems in using periodical indexes to construct research samples for content analysis and culture indicators research. Historical and idiosyncratic changes in index subject category headings and subheadings make article headings potentially misleading indicators. Index subject categories are not necessarily invalid as a result; nevertheless, the author discusses the need to test for category longevity, coherence, and consistency over time, and suggests the use of oversampling, cross-references, and other techniques as a means of correcting and/or compensating for hidden inaccuracies in classification, and as a means of constructing purposive samples for analytic comparisons.

Busch, Carol, Paul S. De Maret, Teresa Flynn, Rachel Kellum, Sheri Le, Brad Meyers, Matt Saunders, Robert White, and Mike Palmquist. (2005). Content Analysis. Writing@CSU . Colorado State University. https://writing.colostate.edu/guides/guide.cfm?guideid=61

Accounting Historians Notebook

Home > Library > Archival Digital Accounting Collection > Accounting Historians Notebook > Vol. 6 (1983) > No. 2

Article Title

How to use content analysis in historical research

Marilyn Neimark

This paper illustrates the use of content analysis in historical research. The purpose of the study is to illustrate the ways in which an individual organization participates in the processes of social change.

Recommended Citation

Neimark, Marilyn (1983) "How to use content analysis in historical research," Accounting Historians Notebook : Vol. 6 : No. 2 , Article 1. Available at: https://egrove.olemiss.edu/aah_notebook/vol6/iss2/1


ISSN: 1075-1416


In This Article: Content Analysis

Introduction

  • The Centrality of Content Analysis to Programmatic Communication Research
  • Measurement
  • Sampling Traditional and Digital Media
  • Sampling from Databases
  • Reliability Sampling Studies
  • Reliability and Validity
  • Automated Textual Analysis (Computer Assisted Content Analysis)


Content Analysis, by Brendan Watson and Stephen Lacy. Last reviewed: 27 April 2017. Last modified: 27 April 2017. DOI: 10.1093/obo/9780199756841-0175

Content analysis is a quantitative method that uses human coders to apply a set of valid measurement rules to reduce manifest features of content to numeric data in order to make replicable, generalizable inferences about that content. Because the method is applied to human artifacts, it has generic advantages that apply whether doing quantitative content analysis or qualitative textual or rhetorical analysis. For example, analyzing communication content is an unobtrusive research activity that is unaffected by self-report biases. However, it is critical to differentiate content analysis as a distinct, quantitative, social-scientific method using human coders from other methods of analyzing content: this is done in order to call attention to the method's unique strengths and weaknesses. A weakness of content analysis is that assigning content to numeric categories loses some of the richness of human communication. A strength of content analysis is that it reduces complex communication phenomena to numeric data, allowing researchers to study broader phenomena than would be possible via methods that rely on close reading. Furthermore, probabilistic sampling allows researchers to draw inferences about a given communication phenomenon without observing all cases and processes. Reliability testing also helps ensure that results have greater precision and are replicable. Although content analysis developed out of the US scholarly community building on code breaking during the Second World War, it is now used around the world. However, most of the available texts in non-English languages are translations from texts originally written in English. The following sections provide references that give scholars, both novices and those who are experienced in using content analysis, a strong foundation in the method, especially as it applies to studying media content. The references focus on content analysis applied to theory, units of measurement, sampling, and reliability. They also suggest core texts and journals that are good outlets for content analysis scholarship. Compared to other methods based on measuring implicit attitudes (e.g., survey research), content analysis has been the subject of much less methodological research aimed at improving the method itself. So the following discussion also calls attention to those areas where more empirical research may help advance the method, providing young and experienced scholars alike an opportunity to make their own contributions to the method and improve measurement.

Berelson 1952 is the first quantitative content analysis text, and since then a handful of additional texts have been written for communication scholars. However, it was not until 2004 that a second edition appeared for any of the texts. Almost two decades after Berelson 1952 , Holsti 1969 appeared as an alternative. Currently, there are three texts in print, and two of them are in their third edition— Krippendorff 2013 ; Neuendorf 2017 ; and Riffe, et al. 2014 . Although these texts are stylistically varied, they tend to be consistent (with a few differences) in the recommendations for best practices and the standards they advocate. All of these texts provide an overview of the techniques and processes of content analysis, covering topics such as research design, protocol development, coding schemes, data analysis, as well as issues of validity and reliability. The three texts currently in print have more detail and discuss methodological issues to a greater degree than earlier texts. Therefore, texts with more recent publication dates will provide more up-to-date standards on the conducting and reporting of content analysis. Krippendorff and Bock 2009 , a collection of articles, is the only currently available content analysis reader. Most general communication research texts contain chapters about content analysis as an important data-generation technique. Although these may be worthwhile introductions and summaries of content analysis, scholars conducting a content analysis should read at least one of the more recent texts before conducting a quantitative content analysis.

Berelson, Bernard. 1952. Content analysis in communication research . New York: Free Press.

The first content analysis text. Much of 21st-century methodology is based on the theoretical foundations in this book. At the time of writing, the method was empirically underexplored to the point that one chapter title, “Technical Problems,” covered the areas of validity, reliability, sampling, and analysis.

Holsti, Ole R. 1969. Content analysis for the social sciences and humanities . Reading, MA: Addison-Wesley.

During the late 1950s and 1960s, content analysis began to be used in fields other than communication. This text aimed to serve scholars in a range of relevant social science and humanities fields by using a variety of examples. The chapters’ titles became the structure for future texts.

Krippendorff, Klaus. 2013. Content analysis: An introduction to its methodology . 3d ed. Los Angeles: SAGE.

This text contains the most detailed explication of Krippendorff’s alpha, a commonly used reliability coefficient. Alpha was first introduced in the initial edition. In addition, this text is the most mathematical of the texts.

Krippendorff, Klaus, and Mary A. Bock, eds. 2009. The content analysis reader . Thousand Oaks, CA: SAGE.

This is a collection of fifty-two published articles that cover the history of the process, discuss methodology, and provide important examples of content analysis studies that cover a number of social science fields, media (textual and visual), and approaches.

Neuendorf, Kimberly A. 2017. The content analysis guidebook . 2d ed. Thousand Oaks, CA: SAGE.

As with the other two texts currently in print, this one fully covers both the theory and methodology of content analysis and comes with a website and description of additional resources for students and content analysts.

Riffe, Daniel, Stephen Lacy, and Frederick G. Fico. 2014. Analyzing media messages: Using quantitative content analysis in research . 3d ed. New York: Routledge.

This text covers the application of content analysis to a range of media using examples from mediated communication studies. It provides the steps necessary to conduct a content analysis of textual and visual media.



Content Analysis

First Online: 02 January 2023


Scott Tunison

Part of the book series: Springer Texts in Education (SPTE)


Content analysis emerged from studies of archived texts, such as newspapers, transcripts of speeches, and magazines (Vogt et al., 2012). Ellingson (2011) noted that content analysis resides in the postpositivist typology, which allows researchers to "conduct an inductive analysis of textual data, form a typology grounded in the data … use the derived typology to sort data into categories, and then count the frequencies of each theme or category across data" (p. 596).


Anderson, G., & Arsenault, N. (1998). Fundamentals of educational research (2nd ed.). Routledge.


Cohen, L., Manion, L., & Morrison, K. (2018). Research methods in education (8th ed.). Routledge.

Creswell, J. (2011). Controversies in mixed methods research. In N. Denzin & Y. Lincoln (Eds.), The SAGE Handbook of Qualitative Research (4th ed., pp. 269–283). Sage.

Ellingson, L. (2011). Analysis and representation across the continuum. In N. Denzin & Y. Lincoln (Eds.), The SAGE Handbook of Qualitative Research (4th ed., pp. 595–610). Sage.

Ezzy, D. (2002). Qualitative analysis: Practice and innovation . Routledge.

Mayring, P. (2004). Qualitative content analysis. In U. Flick, E. von Kardoff, & I. Steinke (Eds.), A companion to qualitative research . Sage.

Newby, P. (2010). Research methods for education . Pearson Education.

Robson, C. (2002). Real world research (2nd ed.). Blackwell.

Vogt, W. P., Gardner, D., & Haeffele, L. (2012). When to use what research design . Guilford Press.

Online Resources

Adu, P. (2016). Qualitative analysis: Coding and categorizing data . https://youtu.be/v_mg7OBpb2Y

Duke University—Mod-U (2016). How to know you are coding correctly: Qualitative research methods . https://youtu.be/iL7Ww5kpnIM

Gramenz, G. (2014). How to code a document and create themes . https://youtu.be/sHv3RzKWNcQ

Shaw, A. (2019). NVivo 12 and thematic/content analysis . https://youtu.be/5s9-rg1ygWs

Author information

Scott Tunison, University of Saskatchewan, Saskatoon, Canada

Editor information

Janet Mola Okoko and Keith D. Walker, Department of Educational Administration, University of Saskatchewan, Saskatoon, SK, Canada

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Tunison, S. (2023). Content Analysis. In: Okoko, J.M., Tunison, S., Walker, K.D. (eds) Varieties of Qualitative Research Methods. Springer Texts in Education. Springer, Cham. https://doi.org/10.1007/978-3-031-04394-9_14


Published : 02 January 2023

Publisher Name : Springer, Cham

Print ISBN : 978-3-031-04396-3

Online ISBN : 978-3-031-04394-9


Content Analysis

Content analysis is a research tool used to determine the presence of certain words, themes, or concepts within some given qualitative data (i.e. text). Using content analysis, researchers can quantify and analyze the presence, meanings, and relationships of such words, themes, or concepts. As an example, researchers can evaluate the language used within a news article to search for bias or partiality. Researchers can then make inferences about the messages within the texts, the writer(s), the audience, and even the culture and time surrounding the text.

Description

Sources of data could be interviews, open-ended questions, field research notes, conversations, or literally any occurrence of communicative language (such as books, essays, discussions, newspaper headlines, speeches, media, historical documents). A single study may analyze various forms of text in its analysis. To analyze the text using content analysis, the text must be coded, or broken down, into manageable categories for analysis (i.e. "codes"). Once the text is coded, the codes can then be further grouped into "code categories" to summarize the data even further.

Three different definitions of content analysis are provided below.

Definition 1: “Any technique for making inferences by systematically and objectively identifying special characteristics of messages.” (from Holsti, 1968)

Definition 2: "An interpretive and naturalistic approach. It is both observational and narrative in nature and relies less on the experimental elements normally associated with scientific research (reliability, validity, and generalizability)." (from Ethnography, Observational Research, and Narrative Inquiry, 1994-2012)

Definition 3: “A research technique for the objective, systematic and quantitative description of the manifest content of communication.” (from Berelson, 1952)

Uses of Content Analysis

Identify the intentions, focus or communication trends of an individual, group or institution

Describe attitudinal and behavioral responses to communications

Determine the psychological or emotional state of persons or groups

Reveal international differences in communication content

Reveal patterns in communication content

Pre-test and improve an intervention or survey prior to launch

Analyze focus group interviews and open-ended questions to complement quantitative data

Types of Content Analysis

There are two general types of content analysis: conceptual analysis and relational analysis. Conceptual analysis determines the existence and frequency of concepts in a text. Relational analysis develops the conceptual analysis further by examining the relationships among concepts in a text. Each type of analysis may lead to different results, conclusions, interpretations and meanings.

Conceptual Analysis

Typically people think of conceptual analysis when they think of content analysis. In conceptual analysis, a concept is chosen for examination and the analysis involves quantifying and counting its presence. The main goal is to examine the occurrence of selected terms in the data. Terms may be explicit or implicit. Explicit terms are easy to identify. Coding of implicit terms is more complicated: the researcher must decide the level of implication, and judgments involve a degree of subjectivity (an issue for reliability and validity). Therefore, coding of implicit terms involves using a dictionary, contextual translation rules, or both.

To begin a conceptual content analysis, first identify the research question and choose a sample or samples for analysis. Next, the text must be coded into manageable content categories. This is basically a process of selective reduction. By reducing the text to categories, the researcher can focus on and code for specific words or patterns that inform the research question.

General steps for conducting a conceptual content analysis:

1. Decide the level of analysis: word, word sense, phrase, sentence, themes

2. Decide how many concepts to code for: develop a pre-defined or interactive set of categories or concepts. Decide either: A. to allow flexibility to add categories through the coding process, or B. to stick with the pre-defined set of categories.

Option A allows for the introduction and analysis of new and important material that could have significant implications to one’s research question.

Option B allows the researcher to stay focused and examine the data for specific concepts.

3. Decide whether to code for existence or frequency of a concept. The decision changes the coding process.

When coding for the existence of a concept, the researcher would count a concept only once if it appeared at least once in the data and no matter how many times it appeared.

When coding for the frequency of a concept, the researcher would count the number of times a concept appears in a text.

4. Decide on how you will distinguish among concepts:

Should words be coded exactly as they appear, or coded as the same when they appear in different forms? For example, "dangerous" vs. "dangerousness". The point here is to create coding rules so that these word segments are transparently categorized in a logical fashion. The rules could make all of these word segments fall into the same category, or perhaps the rules can be formulated so that the researcher can distinguish these word segments into separate codes.

What level of implication is to be allowed? Words that imply the concept or words that explicitly state the concept? For example, "dangerous" vs. "the person is scary" vs. "that person could cause harm to me". These word segments may not merit separate categories, due to the implicit meaning of "dangerous".

5. Develop rules for coding your texts. After the decisions in steps 1-4 are complete, a researcher can begin developing rules for the translation of text into codes. This will keep the coding process organized and consistent. The researcher can code for exactly what he or she wants to code. Validity of the coding process is ensured when the researcher is consistent and coherent in their codes, meaning that they follow their translation rules. In content analysis, abiding by the translation rules is equivalent to validity.

6. Decide what to do with irrelevant information: should this be ignored (e.g. common English words like “the” and “and”), or used to reexamine the coding scheme in the case that it would add to the outcome of coding?

7. Code the text: This can be done by hand or by using software. By using software, researchers can input categories and have coding done automatically, quickly and efficiently, by the software program. When coding is done by hand, a researcher can recognize errors far more easily (e.g. typos, misspelling). If using computer coding, text could be cleaned of errors to include all available data. This decision of hand vs. computer coding is most relevant for implicit information where category preparation is essential for accurate coding.

8. Analyze your results: Draw conclusions and generalizations where possible. Determine what to do with irrelevant, unwanted, or unused text: reexamine, ignore, or reassess the coding scheme. Interpret results carefully as conceptual content analysis can only quantify the information. Typically, general trends and patterns can be identified.
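As a rough illustration of the steps above, the following Python sketch codes a text for two hypothetical concepts, using translation rules that collapse word forms such as "dangerous"/"dangerousness" into one code (step 4) and supporting both existence and frequency coding (step 3). The concept names, patterns, and sample text are illustrative assumptions, not a standard coding scheme.

```python
import re
from collections import Counter

# Hypothetical translation rules (step 5): each concept maps to a pattern
# that collapses related word forms into a single code.
CODING_RULES = {
    "danger": re.compile(r"\bdanger(?:ous(?:ness)?)?\b", re.IGNORECASE),
    "harm": re.compile(r"\bharm(?:ful|ed|s)?\b", re.IGNORECASE),
}

def code_text(text, count_frequency=True):
    """Code a text for each concept: frequency (default) or mere existence."""
    counts = Counter()
    for concept, pattern in CODING_RULES.items():
        matches = pattern.findall(text)
        counts[concept] = len(matches) if count_frequency else int(bool(matches))
    return counts

sample = "The dangerous dog caused harm; its dangerousness harmed many."
print(code_text(sample))                         # frequency coding
print(code_text(sample, count_frequency=False))  # existence coding
```

In a real study the rules would be developed from the protocol and tested for reliability before coding the full sample.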

Relational Analysis

Relational analysis begins like conceptual analysis, where a concept is chosen for examination. However, the analysis involves exploring the relationships between concepts. Individual concepts are viewed as having no inherent meaning and rather the meaning is a product of the relationships among concepts.

To begin a relational content analysis, first identify a research question and choose a sample or samples for analysis. The research question must be focused so the concept types are not open to interpretation and can be summarized. Next, select text for analysis. Select text for analysis carefully by balancing having enough information for a thorough analysis so results are not limited with having information that is too extensive so that the coding process becomes too arduous and heavy to supply meaningful and worthwhile results.

There are three subcategories of relational analysis to choose from prior to going on to the general steps.

Affect extraction: an emotional evaluation of concepts explicit in a text. A challenge to this method is that emotions can vary across time, populations, and space. However, it could be effective at capturing the emotional and psychological state of the speaker or writer of the text.

Proximity analysis: an evaluation of the co-occurrence of explicit concepts in the text. Text is defined as a string of words called a “window” that is scanned for the co-occurrence of concepts. The result is the creation of a “concept matrix”, or a group of interrelated co-occurring concepts that would suggest an overall meaning.

Cognitive mapping: a visualization technique for either affect extraction or proximity analysis. Cognitive mapping attempts to create a model of the overall meaning of the text such as a graphic map that represents the relationships between concepts.
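Proximity analysis as described above can be sketched with a sliding "window" over the token stream, tallying co-occurring concepts into a concept matrix. The concept lexicon and window size below are hypothetical choices for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical lexicon mapping surface terms to concepts
CONCEPTS = {
    "tax": "economy", "budget": "economy",
    "vote": "politics", "senate": "politics",
}

def cooccurrence_matrix(tokens, window=5):
    """Count pairs of concepts co-occurring within each window of tokens."""
    matrix = Counter()
    for start in range(len(tokens) - window + 1):
        seen = {CONCEPTS[t] for t in tokens[start:start + window] if t in CONCEPTS}
        for pair in combinations(sorted(seen), 2):
            matrix[pair] += 1
    return matrix

tokens = "the senate vote on the tax budget stalled".split()
print(cooccurrence_matrix(tokens, window=5))
```

Pairs with high counts in the matrix would suggest concepts whose co-occurrence contributes to an overall meaning.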

General steps for conducting a relational content analysis:

1. Determine the type of analysis: Once the sample has been selected, the researcher needs to determine what types of relationships to examine and the level of analysis: word, word sense, phrase, sentence, or theme.

2. Reduce the text to categories and code for words or patterns: A researcher can code for the existence of meanings or words.

3. Explore the relationship between concepts: Once the words are coded, the text can be analyzed for the following:

Strength of relationship: degree to which two or more concepts are related.

Sign of relationship: are concepts positively or negatively related to each other?

Direction of relationship: the types of relationship that categories exhibit. For example, “X implies Y” or “X occurs before Y” or “if X then Y” or if X is the primary motivator of Y.

4. Code the relationships: A difference between conceptual and relational analysis is that the statements or relationships between concepts are coded.

5. Perform statistical analyses: Explore differences or look for relationships among the variables identified during coding.

6. Map out representations: Use techniques such as decision mapping and mental models.
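Steps 4 and 5 above can be sketched with a minimal data structure. The concepts, relation labels, and signs below are invented for illustration; "strength of relationship" is represented here simply as how often the same concept pair was coded.

```python
from collections import Counter

# Each coded statement records two concepts, the direction of the
# relationship, and its sign, as described in step 4.
coded_relationships = [
    ("overcrowding", "implies", "waiting_time", "+"),
    ("overcrowding", "implies", "waiting_time", "+"),
    ("triage", "occurs_before", "treatment", "+"),
    ("staffing", "implies", "waiting_time", "-"),
]

# Strength of a relationship: how often the same concept pair was coded.
strength = Counter((a, b) for a, _, b, _ in coded_relationships)
print(strength.most_common(1))
# [(('overcrowding', 'waiting_time'), 2)]
```

From such tallies one could go on to the statistical analyses and mappings of steps 5 and 6.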

Reliability and Validity

Reliability: Because researchers are human, coding errors can never be eliminated, only minimized. Generally, 80% agreement is considered an acceptable level of reliability. Three criteria comprise the reliability of a content analysis:

Stability: the tendency for coders to consistently re-code the same data in the same way over a period of time.

Reproducibility: the tendency for a group of coders to classify category membership in the same way.

Accuracy: the extent to which the classification of text corresponds statistically to a standard or norm.
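A minimal sketch of checking coding against the 80% rule of thumb mentioned above. The category labels are invented; simple percent agreement between two coders is shown, though chance-corrected statistics such as Cohen's kappa are often preferred in practice.

```python
def percent_agreement(coder_a, coder_b):
    """Simple intercoder reliability: the share of units that both
    coders assigned to the same category."""
    if len(coder_a) != len(coder_b):
        raise ValueError("coders must rate the same units")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

coder_a = ["fear", "relief", "fear", "pain", "relief"]
coder_b = ["fear", "relief", "pain", "pain", "relief"]
print(percent_agreement(coder_a, coder_b))  # 0.8 -- right at the 80% threshold
```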

Validity: Three criteria comprise the validity of a content analysis:

Closeness of categories: this can be achieved by utilizing multiple classifiers to arrive at an agreed upon definition of each specific category. Using multiple classifiers, a concept category that may be an explicit variable can be broadened to include synonyms or implicit variables.

Conclusions: What level of implication is allowable? Do conclusions correctly follow the data? Are results explainable by other phenomena? This becomes especially problematic when using computer software for analysis and distinguishing between synonyms. For example, the word “mine” variously denotes a possessive pronoun, an explosive device, and a deep hole in the ground from which ore is extracted. Software can obtain an accurate count of that word’s occurrence and frequency, but cannot produce an accurate accounting of the meaning inherent in each particular usage. This problem could throw off one’s results and make any conclusion invalid.

Generalizability of the results to a theory: dependent on the clear definitions of concept categories, how they are determined and how reliable they are at measuring the idea one is seeking to measure. Generalizability parallels reliability as much of it depends on the three criteria for reliability.

Advantages of Content Analysis

Directly examines communication using text

Allows for both qualitative and quantitative analysis

Provides valuable historical and cultural insights over time

Allows a closeness to data

Coded form of the text can be statistically analyzed

Unobtrusive means of analyzing interactions

Provides insight into complex models of human thought and language use

When done well, is considered a relatively “exact” research method

Content analysis is a readily understood and inexpensive research method

A more powerful tool when combined with other research methods such as interviews, observation, and use of archival records. It is very useful for analyzing historical material, especially for documenting trends over time.

Disadvantages of Content Analysis

Can be extremely time consuming

Is subject to increased error, particularly when relational analysis is used to attain a higher level of interpretation

Is often devoid of a theoretical base, or attempts too liberally to draw meaningful inferences about the relationships and impacts implied in a study

Is inherently reductive, particularly when dealing with complex texts

Tends too often to simply consist of word counts

Often disregards the context that produced the text, as well as the state of things after the text is produced

Can be difficult to automate or computerize

Textbooks & Chapters  

Berelson, Bernard. Content Analysis in Communication Research. New York: Free Press, 1952.

Busha, Charles H. and Stephen P. Harter. Research Methods in Librarianship: Techniques and Interpretation. New York: Academic Press, 1980.

de Sola Pool, Ithiel. Trends in Content Analysis. Urbana: University of Illinois Press, 1959.

Krippendorff, Klaus. Content Analysis: An Introduction to its Methodology. Beverly Hills: Sage Publications, 1980.

Fielding, NG & Lee, RM. Using Computers in Qualitative Research. SAGE Publications, 1991. (Refer to Chapter by Seidel, J. ‘Method and Madness in the Application of Computer Technology to Qualitative Data Analysis’.)

Methodological Articles  

Hsieh HF & Shannon SE. (2005). Three Approaches to Qualitative Content Analysis. Qualitative Health Research. 15(9): 1277-1288.

Elo S, Kääriäinen M, Kanste O, Pölkki T, Utriainen K, & Kyngäs H. (2014). Qualitative Content Analysis: A focus on trustworthiness. Sage Open. 4:1-10.

Application Articles  

Abroms LC, Padmanabhan N, Thaweethai L, & Phillips T. (2011). iPhone Apps for Smoking Cessation: A content analysis. American Journal of Preventive Medicine. 40(3):279-285.

Ullström S, Sachs MA, Hansson J, Øvretveit J, & Brommels M. (2014). Suffering in Silence: a qualitative study of second victims of adverse events. BMJ Quality & Safety. 23:325-331.

Owen P. (2012). Portrayals of Schizophrenia by Entertainment Media: A Content Analysis of Contemporary Movies. Psychiatric Services. 63:655-659.

Choosing whether to conduct a content analysis by hand or by using computer software can be difficult. Refer to ‘Method and Madness in the Application of Computer Technology to Qualitative Data Analysis’ listed above in “Textbooks and Chapters” for a discussion of the issue.

QSR NVivo:  http://www.qsrinternational.com/products.aspx

Atlas.ti:  http://www.atlasti.com/webinars.html

R- RQDA package:  http://rqda.r-forge.r-project.org/

Rolly Constable, Marla Cowell, Sarita Zornek Crawford, David Golden, Jake Hartvigsen, Kathryn Morgan, Anne Mudgett, Kris Parrish, Laura Thomas, Erika Yolanda Thompson, Rosie Turner, and Mike Palmquist. (1994-2012). Ethnography, Observational Research, and Narrative Inquiry. Writing@CSU. Colorado State University. Available at: https://writing.colostate.edu/guides/guide.cfm?guideid=63 .

As an introduction to Content Analysis by Michael Palmquist, this is the main resource on Content Analysis on the Web. It is comprehensive, yet succinct. It includes examples and an annotated bibliography. The information contained in the narrative above draws heavily from and summarizes Michael Palmquist’s excellent resource on Content Analysis but was streamlined for the purpose of doctoral students and junior researchers in epidemiology.

At Columbia University Mailman School of Public Health, more detailed training is available through the Department of Sociomedical Sciences- P8785 Qualitative Research Methods.

Afr J Emerg Med. 2017 Sep; 7(3)

A hands-on guide to doing content analysis

Christen Erlingsson

a Department of Health and Caring Sciences, Linnaeus University, Kalmar 391 82, Sweden

Petra Brysiewicz

b School of Nursing & Public Health, University of KwaZulu-Natal, Durban 4041, South Africa

Abstract

There is a growing recognition for the important role played by qualitative research and its usefulness in many fields, including the emergency care context in Africa. Novice qualitative researchers are often daunted by the prospect of qualitative data analysis and thus may experience much difficulty in the data analysis process. Our objective with this manuscript is to provide a practical hands-on example of qualitative content analysis to aid novice qualitative researchers in their task.

African relevance

  • Qualitative research is useful to deepen the understanding of the human experience.
  • Novice qualitative researchers may benefit from this hands-on guide to content analysis.
  • Practical tips and data analysis templates are provided to assist in the analysis process.

Introduction

There is a growing recognition for the important role played by qualitative research and its usefulness in many fields, including emergency care research. An increasing number of health researchers are currently opting to use various qualitative research approaches in exploring and describing complex phenomena, providing textual accounts of individuals’ “life worlds”, and giving voice to vulnerable populations our patients so often represent. Many articles and books are available that describe qualitative research methods and provide overviews of content analysis procedures [1] , [2] , [3] , [4] , [5] , [6] , [7] , [8] , [9] , [10] . Some articles include step-by-step directions intended to clarify content analysis methodology. What we have found in our teaching experience is that these directions are indeed very useful. However, qualitative researchers, especially novice researchers, often struggle to understand what is happening on and between steps, i.e., how the steps are taken.

As research supervisors of postgraduate health professionals, we often meet students who present brilliant ideas for qualitative studies that have potential to fill current gaps in the literature. Typically, the suggested studies aim to explore human experience. Research questions exploring human experience are conveniently studied by analysing textual data, e.g., data collected in individual interviews, focus groups, documents, or documented participant observation. When reflecting on the proposed study aim together with the student, we often suggest content analysis methodology as the best fit for the study and the student, especially the novice researcher. The interview data are collected and the content analysis adventure begins. Students soon realise that data based on human experiences are complex, multifaceted and often carry meaning on multiple levels.

For many novice researchers, analysing qualitative data is found to be unexpectedly challenging and time-consuming. As they soon discover, there is no step-wise analysis process that can be applied to the data like a pattern cutter at a textile factory. They may become extremely annoyed and frustrated during the hands-on enterprise of qualitative content analysis.

The novice researcher may lament, “I’ve read all the methodology but don’t really know how to start and exactly what to do with my data!” They grapple with qualitative research terms and concepts, for example, the differences between meaning units, codes, categories, and themes, and the increasing levels of abstraction from raw data to categories or themes. The content analysis adventure may now seem to be a chaotic undertaking. But, life is messy, complex and utterly fascinating. Experiencing chaos during analysis is normal. Good advice for the qualitative researcher is to be open to the complexity in the data and utilise one’s flow of creativity.

Inspired primarily by descriptions of “conventional content analysis” in Hsieh and Shannon [3] , “inductive content analysis” in Elo and Kyngäs [5] and “qualitative content analysis of an interview text” in Graneheim and Lundman [1] , we have written this paper to help the novice qualitative researcher navigate the uncertainty in-between the steps of qualitative content analysis. We will provide advice and practical tips, as well as data analysis templates, to attempt to ease frustration and hopefully, inspire readers to discover how this exciting methodology contributes to developing a deeper understanding of human experience and our professional contexts.

Overview of qualitative content analysis

Synopsis of content analysis.

A common starting point for qualitative content analysis is often transcribed interview texts. The objective in qualitative content analysis is to systematically transform a large amount of text into a highly organised and concise summary of key results. Analysis of the raw data from verbatim transcribed interviews to form categories or themes is a process of further abstraction of data at each step of the analysis; from the manifest and literal content to latent meanings ( Fig. 1 and Table 1 ).

Fig. 1. Example of analysis leading to higher levels of abstraction; from manifest to latent content.

Table 1. Glossary of terms as used in this hands-on guide to doing content analysis.

The initial step is to read and re-read the interviews to get a sense of the whole, i.e., to gain a general understanding of what your participants are talking about. At this point you may already start to get ideas of what the main points or ideas are that your participants are expressing. Then one needs to start dividing up the text into smaller parts, namely, into meaning units. One then condenses these meaning units further. While doing this, you need to ensure that the core meaning is still retained. The next step is to label condensed meaning units by formulating codes and then grouping these codes into categories. Depending on the study’s aim and quality of the collected data, one may choose categories as the highest level of abstraction for reporting results or you can go further and create themes [1] , [2] , [3] , [5] , [8] .
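The abstraction steps just described (meaning unit, condensed meaning unit, code, category) can be mirrored in a minimal record structure. The interview fragments below are invented for illustration and are not taken from the exemplar interview in this article.

```python
from collections import defaultdict

# A minimal, invented record structure mirroring the analysis steps:
# meaning unit -> condensed meaning unit -> code -> category.
analysis = [
    {
        "meaning_unit": "I kept asking the nurses what was happening but no one had time to answer",
        "condensed": "Asked staff, no one had time to answer",
        "code": "Unanswered questions",
        "category": "Feeling uninformed",
    },
    {
        "meaning_unit": "Nobody told me how long I would have to wait",
        "condensed": "Not told about waiting time",
        "code": "No information on waiting",
        "category": "Feeling uninformed",
    },
]

# Grouping codes by category makes the higher level of abstraction visible.
categories = defaultdict(list)
for row in analysis:
    categories[row["category"]].append(row["code"])
print(dict(categories))
# {'Feeling uninformed': ['Unanswered questions', 'No information on waiting']}
```

Keeping each step in one record makes it easy to return to the raw meaning unit when a code or category is later reconsidered.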

Content analysis as a reflective process

You must mould the clay of the data, tapping into your intuition while maintaining a reflective understanding of how your own previous knowledge is influencing your analysis, i.e., your pre-understanding. In qualitative methodology, it is imperative to vigilantly maintain an awareness of one’s pre-understanding so that this does not unduly influence analysis and/or results. This is the difficult balancing task of keeping a firm grip on one’s assumptions, opinions, and personal beliefs, and not letting them unconsciously steer your analysis process while simultaneously, and knowingly, utilising one’s pre-understanding to facilitate a deeper understanding of the data.

Content analysis, as in all qualitative analysis, is a reflective process. There is no “step 1, 2, 3, done!” linear progression in the analysis. This means that identifying and condensing meaning units, coding, and categorising are not one-time events. It is a continuous process of coding and categorising then returning to the raw data to reflect on your initial analysis. Are you still satisfied with the length of meaning units? Do the condensed meaning units and codes still “fit” with each other? Do the codes still fit into this particular category? Typically, a fair amount of adjusting is needed after the first analysis endeavour. For example: a meaning unit might need to be split into two meaning units in order to capture an additional core meaning; a code modified to more closely match the core meaning of the condensed meaning unit; or a category name tweaked to most accurately describe the included codes. In other words, analysis is a flexible reflective process of working and re-working your data that reveals connections and relationships. Once condensed meaning units are coded it is easier to get a bigger picture and see patterns in your codes and organise codes in categories.

Content analysis exercise

The synopsis above is representative of analysis descriptions in many content analysis articles. Although correct, such method descriptions still do not provide much support for the novice researcher during the actual analysis process. Aspiring to provide guidance and direction to support the novice, a practical example of doing the actual work of content analysis is provided in the following sections. This practical example is based on a transcribed interview excerpt that was part of a study that aimed to explore patients’ experiences of being admitted into the emergency centre ( Fig. 2 ).

Fig. 2. Excerpt from interview text exploring “Patient’s experience of being admitted into the emergency centre”.

This content analysis exercise provides instructions, tips, and advice to support the content analysis novice in a) familiarising oneself with the data and the hermeneutic spiral, b) dividing up the text into meaning units and subsequently condensing these meaning units, c) formulating codes, and d) developing categories and themes.

Familiarising oneself with the data and the hermeneutic spiral

An important initial phase in the data analysis process is to read and re-read the transcribed interview while keeping your aim in focus. Write down your initial impressions. Embrace your intuition. What is the text talking about? What stands out? How did you react while reading the text? What message did the text leave you with? In this analysis phase, you are gaining a sense of the text as a whole.

You may ask why this is important. During analysis, you will be breaking down the whole text into smaller parts. Returning to your notes with your initial impressions will help you see if your “parts” analysis is matching up with your first impressions of the “whole” text. Are your initial impressions visible in your analysis of the parts? Perhaps you need to go back and check for different perspectives. This is what is referred to as the hermeneutic spiral or hermeneutic circle. It is the process of comparing the parts to the whole to determine whether impressions of the whole verify the analysis of the parts in all phases of analysis. Each part should reflect the whole and the whole should be reflected in each part. This concept will become clearer as you start working with your data.

Dividing up the text into meaning units and condensing meaning units

You have now read the interview a number of times. Keeping your research aim and question clearly in focus, divide up the text into meaning units. Located meaning units are then condensed further while keeping the central meaning intact ( Table 2 ). The condensation should be a shortened version of the same text that still conveys the essential message of the meaning unit. Sometimes the meaning unit is already so compact that no further condensation is required. Some content analysis sources warn researchers against short meaning units, claiming that this can lead to fragmentation [1] . However, our personal experience as research supervisors has shown us that a greater problem for the novice is basing analysis on meaning units that are too large and include many meanings which are then lost in the condensation process.

Table 2. Suggestion for how the exemplar interview text can be divided into meaning units and condensed meaning units (condensations are in parentheses).

Formulating codes

The next step is to develop codes that are descriptive labels for the condensed meaning units ( Table 3 ). Codes concisely describe the condensed meaning unit and are tools to help researchers reflect on the data in new ways. Codes make it easier to identify connections between meaning units. At this stage of analysis you are still keeping very close to your data with very limited interpretation of content. You may adjust, re-do, re-think, and re-code until you get to the point where you are satisfied that your choices are reasonable. Just as in the initial phase of getting to know your data as a whole, it is also good to write notes during coding on your impressions and reactions to the text.

Table 3. Suggestions for coding of condensed meaning units.

Developing categories and themes

The next step is to sort codes into categories that answer the questions who , what , when or where? One does this by comparing codes and appraising them to determine which codes seem to belong together, thereby forming a category. In other words, a category consists of codes that appear to deal with the same issue, i.e., manifest content visible in the data with limited interpretation on the part of the researcher. Category names are most often short and factual sounding.

In data that is rich with latent meaning, analysis can be carried on to create themes. In our practical example, we have continued the process of abstracting data to a higher level, from category to theme level, and developed three themes as well as an overarching theme (Table 4). Themes express underlying meaning, i.e., latent content, and are formed by grouping two or more categories together. Themes answer questions such as why, how, in what way, or by what means? Therefore, theme names include verbs, adverbs and adjectives and are very descriptive or even poetic.

Table 4. Suggestion for organisation of coded meaning units into categories and themes.

Some reflections and helpful tips

Understand your pre-understandings.

While conducting qualitative research, it is paramount that the researcher remains vigilant against bias during analysis. In other words, did you remain aware of your pre-understandings, i.e., your own personal assumptions, professional background, and previous experiences and knowledge? For example, did you zero in on particular aspects of the interview on account of your profession (as an emergency doctor, emergency nurse, pre-hospital professional, etc.)? Did you assume the patient’s gender? Did your assumptions affect your analysis? How about aspects of culpability: did you assume that this patient was at fault or that this patient was a victim in the crash? Did this affect how you analysed the text?

Staying aware of one’s pre-understandings is exactly as difficult as it sounds. But, it is possible and it is requisite. Focus on putting yourself and your pre-understandings in a holding pattern while you approach your data with an openness and expectation of finding new perspectives. That is the key: expect the new and be prepared to be surprised. If something in your data feels unusual, is different from what you know, atypical, or even odd – don’t by-pass it as “wrong”. Your reactions and intuitive responses are letting you know that here is something to pay extra attention to, besides the more comfortable condensing and coding of more easily recognisable meaning units.

Use your intuition

Intuition is a great asset in qualitative analysis and not to be dismissed as “unscientific”. Intuition results from tacit knowledge. Just as tacit knowledge is a hallmark of great clinicians [11] , [12] ; it is also an invaluable tool in analysis work [13] . Literally, take note of your gut reactions and intuitive guidance and remember to write these down! These notes often form a framework of possible avenues for further analysis and are especially helpful as you lift the analysis to higher levels of abstraction; from meaning units to condensed meaning units, to codes, to categories and then to the highest level of abstraction in content analysis, themes.

Aspects of coding and categorising hard to place data

All too often, the novice gets overwhelmed by interview material that deals with the general subject matter of the interview, but doesn’t seem to answer the research question. Don’t be too quick to consider such text as off topic or dross [6] . There is often data that, although not seeming to match the study aim precisely, is still important for illuminating the problem area. This can be seen in our practical example about exploring patients’ experiences of being admitted into the emergency centre. Initially the participant is describing the accident itself. While not directly answering the research question, the description is important for understanding the context of the experience of being admitted into the emergency centre. It is very common that participants will “begin at the beginning” and prologue their narratives in order to create a context that sets the scene. This type of contextual data is vital for gaining a deepened understanding of participants’ experiences.

In our practical example, the participant begins by describing the crash and the rescue, i.e., experiences leading up to and prior to admission to the emergency centre. That is why we have chosen in our analysis to code the condensed meaning unit “Ambulance staff looked worried about all the blood” as “In the ambulance” and place it in the category “Reliving the rescue”. We did not choose to include this meaning unit in the categories specifically about admission to the emergency centre itself. Do you agree with our coding choice? Would you have chosen differently?

Another common problem for the novice is deciding how to code condensed meaning units when the unit can be labelled in several different ways. At this point researchers usually groan and wish they had thought to ask one of those classic follow-up questions like “Can you tell me a little bit more about that?” We have examples of two such coding conundrums in the exemplar, as can be seen in Table 3 (codes we conferred on) and Table 4 (codes we reached consensus on). Do you agree with our choices or would you have chosen different codes? Our best advice is to go back to your impressions of the whole and lean into your intuition when choosing codes that are most reasonable and best fit your data.

A typical problem area during categorisation, especially for the novice researcher, is overlap between content in more than one initial category, i.e., codes included in one category also seem to be a fit for another category. Overlap between initial categories is very likely an indication that the jump from code to category was too big, a problem not uncommon when the data is voluminous and/or very complex. In such cases, it can be helpful to first sort codes into narrower categories, so-called subcategories. Subcategories can then be reviewed for possibilities of further aggregation into categories. In the case of a problematic coding, it is advantageous to return to the meaning unit and check if the meaning unit itself fits the category or if you need to reconsider your preliminary coding.

It is not uncommon to be faced with thorny problems such as these during coding and categorisation. Here we would like to reiterate how valuable it is to have fellow researchers with whom you can discuss and reflect, in order to reach consensus on the best way forward in your data analysis. It is really advantageous to compare your meaning units, condensations, coding, and categorisations with those done by another researcher on the same text. Have you identified the same meaning units? Do you agree on coding? See similar patterns in the data? Concur on categories? Sometimes referred to as “researcher triangulation”, this is actually a key element in qualitative analysis and an important component when striving to ensure trustworthiness in your study [14]. Qualitative research is about seeking out variations and not controlling variables, as in quantitative research. Collaborating with others during analysis lets you tap into multiple perspectives and often makes it easier to see variations in the data, thereby enhancing the quality of your results as well as contributing to the rigor of your study. It is important to note that it is not necessary to force consensus in the findings; one can embrace these variations in interpretation and use them to capture the richness in the data.

Yet there are times when neither openness, pre-understanding, intuition, nor researcher triangulation does the job; for example, when analysing an interview and one is simply confused on how to code certain meaning units. At such times, there are a variety of options. A good starting place is to re-read all the interviews through the lens of this specific issue and actively search for other similar types of meaning units you might have missed. Another way to handle this is to conduct further interviews with specific queries that hopefully shed light on the issue. A third option is to have a follow-up interview with the same person and ask them to explain.

Additional tips

It is important to remember that in a typical project there are several interviews to analyse. Codes found in a single interview serve as a starting point as you then work through the remaining interviews coding all material. Form your categories and themes when all project interviews have been coded.

When submitting an article with your study results, it is a good idea to create a table or figure providing a few key examples of how you progressed from the raw data of meaning units, to condensed meaning units, coding, categorisation, and, if included, themes. Providing such a table or figure supports the rigor of your study [1] and is an element greatly appreciated by reviewers and research consumers.

During the analysis process, it can be advantageous to write down your research aim and questions on a sheet of paper that you keep nearby as you work. Frequently referring to your aim can help you keep focused and on track during analysis. Many find it helpful to colour code their transcriptions and write notes in the margins.

Having access to qualitative analysis software can be greatly helpful in organising and retrieving analysed data. Just remember, a computer does not analyse the data. As Jennings [15] has stated, “… it is ‘peopleware,’ not software, that analyses.” A major drawback is that qualitative analysis software can be prohibitively expensive. One way forward is to use table templates such as we have used in this article. (Three analysis templates, Templates A, B, and C, are provided as supplementary online material ). Additionally, the “find” function in word processing programmes such as Microsoft Word (Redmond, WA USA) facilitates locating key words, e.g., in transcribed interviews, meaning units, and codes.
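The “find” function mentioned above can be approximated in a few lines, e.g., to locate every transcript line containing a key word. This is a hypothetical Python sketch with an invented transcript, not a feature of any particular software package.

```python
def find_keyword(lines, keyword):
    """Return (line number, line) pairs for every line containing the
    keyword, case-insensitively -- a tiny stand-in for a word
    processor's "find" function."""
    keyword = keyword.lower()
    return [(i, line) for i, line in enumerate(lines, start=1)
            if keyword in line.lower()]

transcript = [
    "I was so scared in the ambulance.",
    "The staff looked worried.",
    "Then I was scared again in the waiting room.",
]
for line_no, line in find_keyword(transcript, "scared"):
    print(line_no, line)
```

Pointing such a search at transcribed interviews, meaning units, or codes makes it quick to gather every occurrence of a term for re-reading in context.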

Lessons learnt/key points

From our experience with content analysis we have learnt a number of important lessons that may be useful for the novice researcher. They are:

  • A method description is a guideline supporting analysis and trustworthiness. Don’t get caught up too rigidly following steps. Reflexivity and flexibility are just as important. Remember that a method description is a tool helping you in the process of making sense of your data by reducing a large amount of text to distil key results.
  • It is important to maintain a vigilant awareness of one’s own pre-understandings in order to avoid bias during analysis and in results.
  • Use and trust your own intuition during the analysis process.
  • If possible, discuss and reflect together with other researchers who have analysed the same data. Be open and receptive to new perspectives.
  • Understand that it is going to take time. Even if you are quite experienced, each set of data is different and all require time to analyse. Don’t expect to have all the data analysis done over a weekend. It may take weeks. You need time to think, reflect and then review your analysis.
  • Keep reminding yourself how excited you have felt about this area of research and how interesting it is. Embrace it with enthusiasm!
  • Let it be chaotic – have faith that some sense will start to surface. Don’t be afraid and think you will never get to the end – you will… eventually!

Peer review under responsibility of African Federation for Emergency Medicine.

Appendix A Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.afjem.2017.08.001 .


  • What is content analysis?

Last updated

20 March 2023

Reviewed by

Miroslav Damyanov

When you're conducting qualitative research, you'll find yourself analyzing various texts. Perhaps you'll be evaluating transcripts from audio interviews you've conducted. Or you may find yourself assessing the results of a survey filled with open-ended questions.


Content analysis is a research method used to identify the presence of various concepts, words, and themes in different texts. Two types of content analysis exist: conceptual analysis and relational analysis . In the former, researchers determine whether and how frequently certain concepts appear in a text. In relational analysis, researchers explore how different concepts are related to one another in a text. 

Both types of content analysis require the researcher to code the text. Coding the text means breaking it down into different categories that allow it to be analyzed more easily.

  • What are some common uses of content analysis?

You can use content analysis to analyze many forms of text, including:

Interview and discussion transcripts

Newspaper articles and headlines

Literary works

Historical documents

Government reports

Academic papers

Music lyrics

Researchers commonly use content analysis to draw insights and conclusions from literary works. Historians and biographers may apply this approach to letters, papers, and other historical documents to gain insight into the historical figures and periods they are writing about. Market researchers can also use it to evaluate brand performance and perception.

Some researchers have used content analysis to explore differences in decision-making and other cognitive processes. While researchers traditionally used this approach to explore human cognition, content analysis is also at the heart of machine learning approaches currently being used and developed by software and AI companies.

  • Conducting a conceptual analysis

Conceptual analysis is more commonly associated with content analysis than relational analysis. 

In conceptual analysis, you're looking for the appearance and frequency of different concepts. Why? This information can help further your qualitative or quantitative analysis of a text. It's an inexpensive and easily understood research method that can help you draw inferences and conclusions about your research subject. And while it is a relatively straightforward analytical tool, it does consist of a multi-step process that you must closely follow to ensure the reliability and validity of your study.

When you're ready to conduct a conceptual analysis, refer to your research question and the text. Ask yourself what information in the text is likely to be relevant to your question; you'll need to know this to determine how you'll code the text. Then follow these steps:

1. Determine whether you're looking for explicit terms or implicit terms.

Explicit terms are those that directly appear in the text, while implicit ones are those that the text implies or alludes to or that you can infer. 

Coding for explicit terms is straightforward. For example, if you're looking to code a text for an author's explicit use of color, you'd simply code for every instance a color appears in the text. However, if you're coding for implicit terms, you'll need to first determine and define how you'll identify the presence of each term. Doing so involves a certain amount of subjectivity and may affect the reliability and validity of your study.

2. Next, identify the level at which you'll conduct your analysis.

You can search for words, phrases, or sentences encapsulating your terms. You can also search for concepts and themes, but you'll need to define how you expect to identify them in the text. You must also define rules for how you'll code different terms to reduce ambiguity. For example, if, in an interview transcript, a person repeats a word one or more times in a row as a verbal tic, should you code it more than once? And what will you do with irrelevant data that appears in a term if you're coding for sentences? 

Defining these rules upfront can help make your content analysis more efficient and your final analysis more reliable and valid.

3. You'll need to determine whether you're coding for a concept or theme's existence or frequency.

If you're coding for its existence, you’ll only count it once, at its first appearance, no matter how many times it subsequently appears. If you're searching for frequency, you'll count the number of its appearances in the text.

4. You'll also want to determine the number of terms you want to code for and how you may wish to categorize them.

For example, say you're conducting a content analysis of customer service call transcripts and looking for evidence of customer dissatisfaction with a product or service. You might create categories that refer to different elements with which customers might be dissatisfied, such as price, features, packaging, technical support, and so on. Then you might look for sentences that refer to those product elements according to each category in a negative light.

5. Next, you'll need to develop translation rules for your codes.

Those rules should be clear and consistent, allowing you to keep track of your data in an organized fashion.

6. After you've determined the terms for which you're searching, your categories, and translation rules, you're ready to code.

You can do so by hand or via software. Software is quite helpful when you have multiple texts. But it also becomes more vital for you to have developed clear codes, categories, and translation rules, especially if you're looking for implicit terms and concepts. Otherwise, your software-driven analysis may miss key instances of the terms you seek.

7. When you have your text coded, it's time to analyze it.

Look for trends and patterns in your results and use them to draw relevant conclusions about your research subject.
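As an illustration of steps four through seven, here is a minimal sketch of frequency coding by category, echoing the customer-dissatisfaction example above. The category names and keyword lists are invented assumptions; a real coding scheme would come from your translation rules:

```python
from collections import Counter
import re

# Illustrative coding scheme (invented): categories of customer
# dissatisfaction, each defined by explicit keywords.
CATEGORIES = {
    "price": ["expensive", "cost", "overpriced"],
    "support": ["helpline", "agent", "support"],
}

def code_transcript(text: str) -> Counter:
    """Frequency coding: count sentences mentioning each category's keywords."""
    counts = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        for category, keywords in CATEGORIES.items():
            if any(kw in sentence for kw in keywords):
                counts[category] += 1
    return counts

# Invented call-transcript fragment:
calls = "The product is too expensive. Support never answers. It is overpriced!"
print(code_transcript(calls))
```

Coding for existence rather than frequency would instead record only whether each category's count is nonzero.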

  • Conducting a relational analysis

In a relational analysis, you're examining the relationship between different terms that appear in your text(s). Doing so requires you to code your texts in much the same fashion as in a conceptual analysis. However, depending on the type of relational analysis you're trying to conduct, you may need to follow slightly different rules.

Three types of relational analyses are commonly used: affect extraction , proximity analysis , and cognitive mapping .

Affect extraction

This type of relational analysis involves evaluating the different emotional concepts found in a specific text. While the insights from affect extraction can be invaluable, conducting it may prove difficult depending on the text. For example, if the text captures people's emotional states at different times and from different populations, you may find it difficult to compare them and draw appropriate inferences.

Proximity analysis

A relatively simpler analytical approach than affect extraction, proximity analysis assesses the co-occurrence of explicit concepts in a text. You can create what's known as a concept matrix, which is a group of interrelated co-occurring concepts. Concept matrices help you evaluate the overall meaning of a text or identify a secondary message or theme.
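A minimal sketch of proximity analysis might count how often pre-defined concepts co-occur within the same sentence. The concept list and sample text below are invented for illustration; real studies often use a sliding token window instead of sentence boundaries:

```python
from collections import Counter
from itertools import combinations
import re

def concept_matrix(text: str, concepts: set[str]) -> Counter:
    """Count pairwise co-occurrences of concepts within the same sentence."""
    pairs = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        present = sorted(set(sentence.split()) & concepts)
        for pair in combinations(present, 2):
            pairs[pair] += 1
    return pairs

# Invented example text and concepts:
text = "Price drives churn. Quality and price both drive churn."
print(concept_matrix(text, {"price", "quality", "churn"}))
```

Pairs with high counts are candidates for an interrelated concept cluster; the resulting matrix can also feed a cognitive map, described next.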

Cognitive mapping

You can use cognitive mapping as a way to visualize the results of either affect extraction or proximity analysis. This technique uses affect extraction or proximity analysis results to create a graphic map illustrating the relationship between co-occurring emotions or concepts.

To conduct a relational analysis, you must start by determining the type of analysis that best fits the study: affect extraction or proximity analysis. 

Complete steps one through six as outlined above. At the seventh step, analyze the text according to the relational analysis type you've chosen. During this step, feel free to use cognitive mapping to help draw inferences and conclusions about the relationships between co-occurring emotions or concepts, and use other tools, such as mental modeling and decision mapping, as necessary to analyze the results.

  • The advantages of content analysis

Content analysis provides researchers with a robust and inexpensive method to qualitatively and quantitatively analyze a text. By coding the data, you can perform statistical analyses of the data to affirm and reinforce conclusions you may draw. And content analysis can provide helpful insights into language use, behavioral patterns, and historical or cultural conventions that can be valuable beyond the scope of the initial study.

When content analysis is applied to interview data, the approach provides a way to closely analyze data without needing to interact with interview subjects, which can be helpful in certain contexts. For example, suppose you want to analyze the perceptions of a group of geographically diverse individuals. In this case, you can conduct a content analysis of existing interview transcripts rather than incurring the time and expense of conducting new interviews.

What is meant by content analysis?

Content analysis is a research method that helps a researcher explore the occurrence of and relationships between various words, phrases, themes, or concepts in a text or set of texts. The method allows researchers in different disciplines to conduct qualitative and quantitative analyses on a variety of texts.

Where is content analysis used?

Content analysis is used in multiple disciplines, as you can use it to evaluate a variety of texts. You can find applications in anthropology, communications, history, linguistics, literary studies, marketing, political science, psychology, and sociology, among other disciplines.

What are the two types of content analysis?

Content analysis may be either conceptual or relational. In a conceptual analysis, researchers examine a text for the presence and frequency of specific words, phrases, themes, and concepts. In a relational analysis, researchers draw inferences and conclusions about the nature of the relationships of co-occurring words, phrases, themes, and concepts in a text.

What's the difference between content analysis and thematic analysis?

Content analysis typically uses a descriptive approach to the data and may use either qualitative or quantitative analytical methods. By contrast, a thematic analysis only uses qualitative methods to explore frequently occurring themes in a text.



Content Analysis – Methods, Types and Examples

Content Analysis

Definition:

Content analysis is a research method used to analyze and interpret the characteristics of various forms of communication, such as text, images, or audio. It involves systematically analyzing the content of these materials, identifying patterns, themes, and other relevant features, and drawing inferences or conclusions based on the findings.

Content analysis can be used to study a wide range of topics, including media coverage of social issues, political speeches, advertising messages, and online discussions, among others. It is often used in qualitative research and can be combined with other methods to provide a more comprehensive understanding of a particular phenomenon.

Types of Content Analysis

There are generally two types of content analysis:

Quantitative Content Analysis

This type of content analysis involves the systematic and objective counting and categorization of the content of a particular form of communication, such as text or video. The data obtained is then subjected to statistical analysis to identify patterns, trends, and relationships between different variables. Quantitative content analysis is often used to study media content, advertising, and political speeches.

Qualitative Content Analysis

This type of content analysis is concerned with the interpretation and understanding of the meaning and context of the content. It involves the systematic analysis of the content to identify themes, patterns, and other relevant features, and to interpret the underlying meanings and implications of these features. Qualitative content analysis is often used to study interviews, focus groups, and other forms of qualitative data, where the researcher is interested in understanding the subjective experiences and perceptions of the participants.

Methods of Content Analysis

There are several methods of content analysis, including:

Conceptual Analysis

This method involves analyzing the meanings of key concepts used in the content being analyzed. The researcher identifies key concepts and analyzes how they are used, defining them and categorizing them into broader themes.

Content Analysis by Frequency

This method involves counting and categorizing the frequency of specific words, phrases, or themes that appear in the content being analyzed. The researcher identifies relevant keywords or phrases and systematically counts their frequency.

Comparative Analysis

This method involves comparing the content of two or more sources to identify similarities, differences, and patterns. The researcher selects relevant sources, identifies key themes or concepts, and compares how they are represented in each source.

Discourse Analysis

This method involves analyzing the structure and language of the content being analyzed to identify how the content constructs and represents social reality. The researcher analyzes the language used and the underlying assumptions, beliefs, and values reflected in the content.

Narrative Analysis

This method involves analyzing the content as a narrative, identifying the plot, characters, and themes, and analyzing how they relate to the broader social context. The researcher identifies the underlying messages conveyed by the narrative and their implications for the broader social context.

Content Analysis Conducting Guide

Here is a basic guide to conducting a content analysis:

  • Define your research question or objective: Before starting your content analysis, you need to define your research question or objective clearly. This will help you to identify the content you need to analyze and the type of analysis you need to conduct.
  • Select your sample: Select a representative sample of the content you want to analyze. This may involve selecting a random sample, a purposive sample, or a convenience sample, depending on the research question and the availability of the content.
  • Develop a coding scheme: Develop a coding scheme or a set of categories to use for coding the content. The coding scheme should be based on your research question or objective and should be reliable, valid, and comprehensive.
  • Train coders: Train coders to use the coding scheme and ensure that they have a clear understanding of the coding categories and procedures. You may also need to establish inter-coder reliability to ensure that different coders are coding the content consistently.
  • Code the content: Code the content using the coding scheme. This may involve manually coding the content, using software, or a combination of both.
  • Analyze the data: Once the content is coded, analyze the data using appropriate statistical or qualitative methods, depending on the research question and the type of data.
  • Interpret the results: Interpret the results of the analysis in the context of your research question or objective. Draw conclusions based on the findings and relate them to the broader literature on the topic.
  • Report your findings: Report your findings in a clear and concise manner, including the research question, methodology, results, and conclusions. Provide details about the coding scheme, inter-coder reliability, and any limitations of the study.
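Inter-coder reliability, mentioned in the "train coders" step, is commonly measured with Cohen's kappa, which corrects raw agreement for agreement expected by chance. A minimal sketch for two coders follows; the category labels are invented for illustration:

```python
from collections import Counter

def cohens_kappa(coder_a: list[str], coder_b: list[str]) -> float:
    """Cohen's kappa for two coders' category assignments on the same units."""
    assert len(coder_a) == len(coder_b) and coder_a
    n = len(coder_a)
    # Observed proportion of units on which the coders agree:
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement, from each coder's marginal category frequencies:
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Invented coding of four text units by two coders:
a = ["price", "support", "price", "other"]
b = ["price", "support", "other", "other"]
print(round(cohens_kappa(a, b), 2))
```

Conventionally, values above roughly 0.6 to 0.8 are taken as acceptable agreement, though the appropriate threshold depends on the field and the coding task.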

Applications of Content Analysis

Content analysis has numerous applications across different fields, including:

  • Media Research: Content analysis is commonly used in media research to examine the representation of different groups, such as race, gender, and sexual orientation, in media content. It can also be used to study media framing, media bias, and media effects.
  • Political Communication : Content analysis can be used to study political communication, including political speeches, debates, and news coverage of political events. It can also be used to study political advertising and the impact of political communication on public opinion and voting behavior.
  • Marketing Research: Content analysis can be used to study advertising messages, consumer reviews, and social media posts related to products or services. It can provide insights into consumer preferences, attitudes, and behaviors.
  • Health Communication: Content analysis can be used to study health communication, including the representation of health issues in the media, the effectiveness of health campaigns, and the impact of health messages on behavior.
  • Education Research : Content analysis can be used to study educational materials, including textbooks, curricula, and instructional materials. It can provide insights into the representation of different topics, perspectives, and values.
  • Social Science Research: Content analysis can be used in a wide range of social science research, including studies of social media, online communities, and other forms of digital communication. It can also be used to study interviews, focus groups, and other qualitative data sources.

Examples of Content Analysis

Here are some examples of content analysis:

  • Media Representation of Race and Gender: A content analysis could be conducted to examine the representation of different races and genders in popular media, such as movies, TV shows, and news coverage.
  • Political Campaign Ads : A content analysis could be conducted to study political campaign ads and the themes and messages used by candidates.
  • Social Media Posts: A content analysis could be conducted to study social media posts related to a particular topic, such as the COVID-19 pandemic, to examine the attitudes and beliefs of social media users.
  • Instructional Materials: A content analysis could be conducted to study the representation of different topics and perspectives in educational materials, such as textbooks and curricula.
  • Product Reviews: A content analysis could be conducted to study product reviews on e-commerce websites, such as Amazon, to identify common themes and issues mentioned by consumers.
  • News Coverage of Health Issues: A content analysis could be conducted to study news coverage of health issues, such as vaccine hesitancy, to identify common themes and perspectives.
  • Online Communities: A content analysis could be conducted to study online communities, such as discussion forums or social media groups, to understand the language, attitudes, and beliefs of the community members.

Purpose of Content Analysis

The purpose of content analysis is to systematically analyze and interpret the content of various forms of communication, such as written, oral, or visual, to identify patterns, themes, and meanings. Content analysis is used to study communication in a wide range of fields, including media studies, political science, psychology, education, sociology, and marketing research. The primary goals of content analysis include:

  • Describing and summarizing communication: Content analysis can be used to describe and summarize the content of communication, such as the themes, topics, and messages conveyed in media content, political speeches, or social media posts.
  • Identifying patterns and trends: Content analysis can be used to identify patterns and trends in communication, such as changes over time, differences between groups, or common themes or motifs.
  • Exploring meanings and interpretations: Content analysis can be used to explore the meanings and interpretations of communication, such as the underlying values, beliefs, and assumptions that shape the content.
  • Testing hypotheses and theories : Content analysis can be used to test hypotheses and theories about communication, such as the effects of media on attitudes and behaviors or the framing of political issues in the media.

When to use Content Analysis

Content analysis is a useful method when you want to analyze and interpret the content of various forms of communication, such as written, oral, or visual. Here are some specific situations where content analysis might be appropriate:

  • When you want to study media content: Content analysis is commonly used in media studies to analyze the content of TV shows, movies, news coverage, and other forms of media.
  • When you want to study political communication : Content analysis can be used to study political speeches, debates, news coverage, and advertising.
  • When you want to study consumer attitudes and behaviors: Content analysis can be used to analyze product reviews, social media posts, and other forms of consumer feedback.
  • When you want to study educational materials : Content analysis can be used to analyze textbooks, instructional materials, and curricula.
  • When you want to study online communities: Content analysis can be used to analyze discussion forums, social media groups, and other forms of online communication.
  • When you want to test hypotheses and theories : Content analysis can be used to test hypotheses and theories about communication, such as the framing of political issues in the media or the effects of media on attitudes and behaviors.

Characteristics of Content Analysis

Content analysis has several key characteristics that make it a useful research method. These include:

  • Objectivity : Content analysis aims to be an objective method of research, meaning that the researcher does not introduce their own biases or interpretations into the analysis. This is achieved by using standardized and systematic coding procedures.
  • Systematic: Content analysis involves the use of a systematic approach to analyze and interpret the content of communication. This involves defining the research question, selecting the sample of content to analyze, developing a coding scheme, and analyzing the data.
  • Quantitative : Content analysis often involves counting and measuring the occurrence of specific themes or topics in the content, making it a quantitative research method. This allows for statistical analysis and generalization of findings.
  • Contextual : Content analysis considers the context in which the communication takes place, such as the time period, the audience, and the purpose of the communication.
  • Iterative : Content analysis is an iterative process, meaning that the researcher may refine the coding scheme and analysis as they analyze the data, to ensure that the findings are valid and reliable.
  • Reliability and validity : Content analysis aims to be a reliable and valid method of research, meaning that the findings are consistent and accurate. This is achieved through inter-coder reliability tests and other measures to ensure the quality of the data and analysis.

Advantages of Content Analysis

There are several advantages to using content analysis as a research method, including:

  • Objective and systematic : Content analysis aims to be an objective and systematic method of research, which reduces the likelihood of bias and subjectivity in the analysis.
  • Large sample size: Content analysis allows for the analysis of a large sample of data, which increases the statistical power of the analysis and the generalizability of the findings.
  • Non-intrusive: Content analysis does not require the researcher to interact with the participants or disrupt their natural behavior, making it a non-intrusive research method.
  • Accessible data: Content analysis can be used to analyze a wide range of data types, including written, oral, and visual communication, making it accessible to researchers across different fields.
  • Versatile : Content analysis can be used to study communication in a wide range of contexts and fields, including media studies, political science, psychology, education, sociology, and marketing research.
  • Cost-effective: Content analysis is a cost-effective research method, as it does not require expensive equipment or participant incentives.

Limitations of Content Analysis

While content analysis has many advantages, there are also some limitations to consider, including:

  • Limited contextual information: Content analysis is focused on the content of communication, which means that contextual information may be limited. This can make it difficult to fully understand the meaning behind the communication.
  • Limited ability to capture nonverbal communication : Content analysis is limited to analyzing the content of communication that can be captured in written or recorded form. It may miss out on nonverbal communication, such as body language or tone of voice.
  • Subjectivity in coding: While content analysis aims to be objective, there may be subjectivity in the coding process. Different coders may interpret the content differently, which can lead to inconsistent results.
  • Limited ability to establish causality: Content analysis is a correlational research method, meaning that it cannot establish causality between variables. It can only identify associations between variables.
  • Limited generalizability: Content analysis is limited to the data that is analyzed, which means that the findings may not be generalizable to other contexts or populations.
  • Time-consuming: Content analysis can be a time-consuming research method, especially when analyzing a large sample of data. This can be a disadvantage for researchers who need to complete their research in a short amount of time.

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer


  • Contract Law
  • Browse content in Criminal Law
  • Criminal Procedure
  • Criminal Evidence Law
  • Sentencing and Punishment
  • Employment and Labour Law
  • Environment and Energy Law
  • Browse content in Financial Law
  • Banking Law
  • Insolvency Law
  • History of Law
  • Human Rights and Immigration
  • Intellectual Property Law
  • Browse content in International Law
  • Private International Law and Conflict of Laws
  • Public International Law
  • IT and Communications Law
  • Jurisprudence and Philosophy of Law
  • Law and Politics
  • Law and Society
  • Browse content in Legal System and Practice
  • Courts and Procedure
  • Legal Skills and Practice
  • Primary Sources of Law
  • Regulation of Legal Profession
  • Medical and Healthcare Law
  • Browse content in Policing
  • Criminal Investigation and Detection
  • Police and Security Services
  • Police Procedure and Law
  • Police Regional Planning
  • Browse content in Property Law
  • Personal Property Law
  • Study and Revision
  • Terrorism and National Security Law
  • Browse content in Trusts Law
  • Wills and Probate or Succession
  • Browse content in Medicine and Health
  • Browse content in Allied Health Professions
  • Arts Therapies
  • Clinical Science
  • Dietetics and Nutrition
  • Occupational Therapy
  • Operating Department Practice
  • Physiotherapy
  • Radiography
  • Speech and Language Therapy
  • Browse content in Anaesthetics
  • General Anaesthesia
  • Neuroanaesthesia
  • Browse content in Clinical Medicine
  • Acute Medicine
  • Cardiovascular Medicine
  • Clinical Genetics
  • Clinical Pharmacology and Therapeutics
  • Dermatology
  • Endocrinology and Diabetes
  • Gastroenterology
  • Genito-urinary Medicine
  • Geriatric Medicine
  • Infectious Diseases
  • Medical Toxicology
  • Medical Oncology
  • Pain Medicine
  • Palliative Medicine
  • Rehabilitation Medicine
  • Respiratory Medicine and Pulmonology
  • Rheumatology
  • Sleep Medicine
  • Sports and Exercise Medicine
  • Clinical Neuroscience
  • Community Medical Services
  • Critical Care
  • Emergency Medicine
  • Forensic Medicine
  • Haematology
  • History of Medicine
  • Browse content in Medical Dentistry
  • Oral and Maxillofacial Surgery
  • Paediatric Dentistry
  • Restorative Dentistry and Orthodontics
  • Surgical Dentistry
  • Browse content in Medical Skills
  • Clinical Skills
  • Communication Skills
  • Nursing Skills
  • Surgical Skills
  • Medical Ethics
  • Medical Statistics and Methodology
  • Browse content in Neurology
  • Clinical Neurophysiology
  • Neuropathology
  • Nursing Studies
  • Browse content in Obstetrics and Gynaecology
  • Gynaecology
  • Occupational Medicine
  • Ophthalmology
  • Otolaryngology (ENT)
  • Browse content in Paediatrics
  • Neonatology
  • Browse content in Pathology
  • Chemical Pathology
  • Clinical Cytogenetics and Molecular Genetics
  • Histopathology
  • Medical Microbiology and Virology
  • Patient Education and Information
  • Browse content in Pharmacology
  • Psychopharmacology
  • Browse content in Popular Health
  • Caring for Others
  • Complementary and Alternative Medicine
  • Self-help and Personal Development
  • Browse content in Preclinical Medicine
  • Cell Biology
  • Molecular Biology and Genetics
  • Reproduction, Growth and Development
  • Primary Care
  • Professional Development in Medicine
  • Browse content in Psychiatry
  • Addiction Medicine
  • Child and Adolescent Psychiatry
  • Forensic Psychiatry
  • Learning Disabilities
  • Old Age Psychiatry
  • Psychotherapy
  • Browse content in Public Health and Epidemiology
  • Epidemiology
  • Public Health
  • Browse content in Radiology
  • Clinical Radiology
  • Interventional Radiology
  • Nuclear Medicine
  • Radiation Oncology
  • Reproductive Medicine
  • Browse content in Surgery
  • Cardiothoracic Surgery
  • Gastro-intestinal and Colorectal Surgery
  • General Surgery
  • Neurosurgery
  • Paediatric Surgery
  • Peri-operative Care
  • Plastic and Reconstructive Surgery
  • Surgical Oncology
  • Transplant Surgery
  • Trauma and Orthopaedic Surgery
  • Vascular Surgery
  • Browse content in Science and Mathematics
  • Browse content in Biological Sciences
  • Aquatic Biology
  • Biochemistry
  • Bioinformatics and Computational Biology
  • Developmental Biology
  • Ecology and Conservation
  • Evolutionary Biology
  • Genetics and Genomics
  • Microbiology
  • Molecular and Cell Biology
  • Natural History
  • Plant Sciences and Forestry
  • Research Methods in Life Sciences
  • Structural Biology
  • Systems Biology
  • Zoology and Animal Sciences
  • Browse content in Chemistry
  • Analytical Chemistry
  • Computational Chemistry
  • Crystallography
  • Environmental Chemistry
  • Industrial Chemistry
  • Inorganic Chemistry
  • Materials Chemistry
  • Medicinal Chemistry
  • Mineralogy and Gems
  • Organic Chemistry
  • Physical Chemistry
  • Polymer Chemistry
  • Study and Communication Skills in Chemistry
  • Theoretical Chemistry
  • Browse content in Computer Science
  • Artificial Intelligence
  • Computer Architecture and Logic Design
  • Game Studies
  • Human-Computer Interaction
  • Mathematical Theory of Computation
  • Programming Languages
  • Software Engineering
  • Systems Analysis and Design
  • Virtual Reality
  • Browse content in Computing
  • Business Applications
  • Computer Security
  • Computer Games
  • Computer Networking and Communications
  • Digital Lifestyle
  • Graphical and Digital Media Applications
  • Operating Systems
  • Browse content in Earth Sciences and Geography
  • Atmospheric Sciences
  • Environmental Geography
  • Geology and the Lithosphere
  • Maps and Map-making
  • Meteorology and Climatology
  • Oceanography and Hydrology
  • Palaeontology
  • Physical Geography and Topography
  • Regional Geography
  • Soil Science
  • Urban Geography
  • Browse content in Engineering and Technology
  • Agriculture and Farming
  • Biological Engineering
  • Civil Engineering, Surveying, and Building
  • Electronics and Communications Engineering
  • Energy Technology
  • Engineering (General)
  • Environmental Science, Engineering, and Technology
  • History of Engineering and Technology
  • Mechanical Engineering and Materials
  • Technology of Industrial Chemistry
  • Transport Technology and Trades
  • Browse content in Environmental Science
  • Applied Ecology (Environmental Science)
  • Conservation of the Environment (Environmental Science)
  • Environmental Sustainability
  • Environmentalist Thought and Ideology (Environmental Science)
  • Management of Land and Natural Resources (Environmental Science)
  • Natural Disasters (Environmental Science)
  • Nuclear Issues (Environmental Science)
  • Pollution and Threats to the Environment (Environmental Science)
  • Social Impact of Environmental Issues (Environmental Science)
  • History of Science and Technology
  • Browse content in Materials Science
  • Ceramics and Glasses
  • Composite Materials
  • Metals, Alloying, and Corrosion
  • Nanotechnology
  • Browse content in Mathematics
  • Applied Mathematics
  • Biomathematics and Statistics
  • History of Mathematics
  • Mathematical Education
  • Mathematical Finance
  • Mathematical Analysis
  • Numerical and Computational Mathematics
  • Probability and Statistics
  • Pure Mathematics
  • Browse content in Neuroscience
  • Cognition and Behavioural Neuroscience
  • Development of the Nervous System
  • Disorders of the Nervous System
  • History of Neuroscience
  • Invertebrate Neurobiology
  • Molecular and Cellular Systems
  • Neuroendocrinology and Autonomic Nervous System
  • Neuroscientific Techniques
  • Sensory and Motor Systems
  • Browse content in Physics
  • Astronomy and Astrophysics
  • Atomic, Molecular, and Optical Physics
  • Biological and Medical Physics
  • Classical Mechanics
  • Computational Physics
  • Condensed Matter Physics
  • Electromagnetism, Optics, and Acoustics
  • History of Physics
  • Mathematical and Statistical Physics
  • Measurement Science
  • Nuclear Physics
  • Particles and Fields
  • Plasma Physics
  • Quantum Physics
  • Relativity and Gravitation
  • Semiconductor and Mesoscopic Physics
  • Browse content in Psychology
  • Affective Sciences
  • Clinical Psychology
  • Cognitive Psychology
  • Cognitive Neuroscience
  • Criminal and Forensic Psychology
  • Developmental Psychology
  • Educational Psychology
  • Evolutionary Psychology
  • Health Psychology
  • History and Systems in Psychology
  • Music Psychology
  • Neuropsychology
  • Organizational Psychology
  • Psychological Assessment and Testing
  • Psychology of Human-Technology Interaction
  • Psychology Professional Development and Training
  • Research Methods in Psychology
  • Social Psychology
  • Browse content in Social Sciences
  • Browse content in Anthropology
  • Anthropology of Religion
  • Human Evolution
  • Medical Anthropology
  • Physical Anthropology
  • Regional Anthropology
  • Social and Cultural Anthropology
  • Theory and Practice of Anthropology
  • Browse content in Business and Management
  • Business Strategy
  • Business Ethics
  • Business History
  • Business and Government
  • Business and Technology
  • Business and the Environment
  • Comparative Management
  • Corporate Governance
  • Corporate Social Responsibility
  • Entrepreneurship
  • Health Management
  • Human Resource Management
  • Industrial and Employment Relations
  • Industry Studies
  • Information and Communication Technologies
  • International Business
  • Knowledge Management
  • Management and Management Techniques
  • Operations Management
  • Organizational Theory and Behaviour
  • Pensions and Pension Management
  • Public and Nonprofit Management
  • Strategic Management
  • Supply Chain Management
  • Browse content in Criminology and Criminal Justice
  • Criminal Justice
  • Criminology
  • Forms of Crime
  • International and Comparative Criminology
  • Youth Violence and Juvenile Justice
  • Development Studies
  • Browse content in Economics
  • Agricultural, Environmental, and Natural Resource Economics
  • Asian Economics
  • Behavioural Finance
  • Behavioural Economics and Neuroeconomics
  • Econometrics and Mathematical Economics
  • Economic Systems
  • Economic History
  • Economic Methodology
  • Economic Development and Growth
  • Financial Markets
  • Financial Institutions and Services
  • General Economics and Teaching
  • Health, Education, and Welfare
  • History of Economic Thought
  • International Economics
  • Labour and Demographic Economics
  • Law and Economics
  • Macroeconomics and Monetary Economics
  • Microeconomics
  • Public Economics
  • Urban, Rural, and Regional Economics
  • Welfare Economics
  • Browse content in Education
  • Adult Education and Continuous Learning
  • Care and Counselling of Students
  • Early Childhood and Elementary Education
  • Educational Equipment and Technology
  • Educational Strategies and Policy
  • Higher and Further Education
  • Organization and Management of Education
  • Philosophy and Theory of Education
  • Schools Studies
  • Secondary Education
  • Teaching of a Specific Subject
  • Teaching of Specific Groups and Special Educational Needs
  • Teaching Skills and Techniques
  • Browse content in Environment
  • Applied Ecology (Social Science)
  • Climate Change
  • Conservation of the Environment (Social Science)
  • Environmentalist Thought and Ideology (Social Science)
  • Natural Disasters (Environment)
  • Social Impact of Environmental Issues (Social Science)
  • Browse content in Human Geography
  • Cultural Geography
  • Economic Geography
  • Political Geography
  • Browse content in Interdisciplinary Studies
  • Communication Studies
  • Museums, Libraries, and Information Sciences
  • Browse content in Politics
  • African Politics
  • Asian Politics
  • Chinese Politics
  • Comparative Politics
  • Conflict Politics
  • Elections and Electoral Studies
  • Environmental Politics
  • European Union
  • Foreign Policy
  • Gender and Politics
  • Human Rights and Politics
  • Indian Politics
  • International Relations
  • International Organization (Politics)
  • International Political Economy
  • Irish Politics
  • Latin American Politics
  • Middle Eastern Politics
  • Political Methodology
  • Political Communication
  • Political Philosophy
  • Political Sociology
  • Political Behaviour
  • Political Economy
  • Political Institutions
  • Political Theory
  • Politics and Law
  • Public Administration
  • Public Policy
  • Quantitative Political Methodology
  • Regional Political Studies
  • Russian Politics
  • Security Studies
  • State and Local Government
  • UK Politics
  • US Politics
  • Browse content in Regional and Area Studies
  • African Studies
  • Asian Studies
  • East Asian Studies
  • Japanese Studies
  • Latin American Studies
  • Middle Eastern Studies
  • Native American Studies
  • Scottish Studies
  • Browse content in Research and Information
  • Research Methods
  • Browse content in Social Work
  • Addictions and Substance Misuse
  • Adoption and Fostering
  • Care of the Elderly
  • Child and Adolescent Social Work
  • Couple and Family Social Work
  • Developmental and Physical Disabilities Social Work
  • Direct Practice and Clinical Social Work
  • Emergency Services
  • Human Behaviour and the Social Environment
  • International and Global Issues in Social Work
  • Mental and Behavioural Health
  • Social Justice and Human Rights
  • Social Policy and Advocacy
  • Social Work and Crime and Justice
  • Social Work Macro Practice
  • Social Work Practice Settings
  • Social Work Research and Evidence-based Practice
  • Welfare and Benefit Systems
  • Browse content in Sociology
  • Childhood Studies
  • Community Development
  • Comparative and Historical Sociology
  • Economic Sociology
  • Gender and Sexuality
  • Gerontology and Ageing
  • Health, Illness, and Medicine
  • Marriage and the Family
  • Migration Studies
  • Occupations, Professions, and Work
  • Organizations
  • Population and Demography
  • Race and Ethnicity
  • Social Theory
  • Social Movements and Social Change
  • Social Research and Statistics
  • Social Stratification, Inequality, and Mobility
  • Sociology of Religion
  • Sociology of Education
  • Sport and Leisure
  • Urban and Rural Studies
  • Browse content in Warfare and Defence
  • Defence Strategy, Planning, and Research
  • Land Forces and Warfare
  • Military Administration
  • Military Life and Institutions
  • Naval Forces and Warfare
  • Other Warfare and Defence Issues
  • Peace Studies and Conflict Resolution
  • Weapons and Equipment

Content Analysis



This book offers an overview of the variation within content analysis, along with detailed descriptions of three approaches found in the contemporary literature: basic content analysis, interpretive content analysis, and qualitative content analysis. It provides an inclusive and carefully differentiated examination of contemporary content analysis research purposes and methods. Chapter 1 examines the conceptual base and history of content analysis. The next three chapters examine each approach in depth, using brief, illustrative exemplar studies; each of these methodology chapters follows a consistent outline to help readers compare and contrast the three approaches. Chapter 5 examines rigor in content analysis and highlights steps to ensure the internal coherence of studies. The book concludes with the exploration of two full-length studies. Chapter 6 examines the use of content analysis for advocacy and for building public awareness to promote human rights and social justice. Chapter 7 reviews a full-length study of older adults in prison to detail how content analysis is carried out and how different approaches may be usefully combined.


History and Definitions of Content Analysis


António Pedro Costa João Amado

Text originally published in Content Analysis Supported by Software

Introduction

Content Analysis is a technique for analysing data collected from a variety of sources, preferably expressed in text or images. These documents can be of varied nature: archival material, literary texts, reports, news, evaluative comments on a given situation, diaries and autobiographies, articles selected through a literature review, interview transcripts, texts requested on a specific subject, field notes, etc. The same can be said of the images: photographs, films, book illustrations, etc.

1.1. History of Content Analysis Technique

We have already mentioned that Content Analysis is a natural, spontaneous process that we all use when we underline ideas in a text and try to organize them. As a scientific method, however, subject to controlled and systematic procedures, its history goes back to the First World War, when it served as an instrument for studying the political propaganda disseminated in the mass media; the main reference of that period is Harold Lasswell's Propaganda Technique in the World War (1927).

In World War II it was used to analyse newspapers, with the purpose of detecting signs of Nazi propaganda in the North American media. Also noteworthy is the joint work of Lasswell and Leites, Language of Politics (1949).

Since then, with occasional hesitations of an epistemological and methodological nature (often seeking to reinforce its quantitative character), Content Analysis has been applied in many fields of the human sciences, such as linguistics (discourse analysis), anthropology (thematic analyses of the discourse of the mentally ill), and history (systematic analysis of documents), a tendency consolidated and developed after the Allerton House conference of 1955 (Krippendorff, 1990; Vala, 1986). A congress dedicated to the theme became necessary because the technique had begun to falter in the face of criticism and attacks from various quarters. The most compelling criticism concerned one of its 'constitutive defects', namely 'the encoder's intervention in establishing the meaning of the text' (Ghiglione & Matalon, 1992, p. 180). Nowadays, however, it is rare to find research that makes no use of it, whether exclusively or combined with other data collection and analysis techniques (e.g. questionnaires), as a means of constructing other instruments (again, the questionnaire), or as a central methodology.

Currently, software support allows this technique to be applied in faster, more rigorous, and highly complex processes that can be performed safely; there are already over two dozen software packages, such as NVivo (www.qsrinternational.com), Atlas.ti (www.atlasti.com), and MaxQDA (www.maxqda.com). More recently, programs that run in the cloud have begun to emerge, such as webQDA (www.webqda.net). One advantage of this innovation is that it enables collaborative work in small or large groups, and the analysis of large datasets, in a way that was not possible before (Costa, 2016). Next, we will describe the evolution of these instruments and discuss their advantages in more detail.

1.2. The concept of Content Analysis

In general terms, the technique consists in 'arranging' the 'manifest content' of the most diverse types of communication into an organized, systematic, and as far as possible quantified set of categories of meaning, so that it can be interpreted in light of the diverse factors that led to its production.
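The arrangement of manifest content into categories of meaning can be sketched computationally. The snippet below is a minimal, hypothetical illustration of rule-based coding: the codebook, its keywords, and the text segments are all invented for the example, whereas in practice the analyst builds and refines the codebook interpretively.

```python
import re

# Hypothetical codebook: category -> indicative keywords
# (invented for illustration; real codebooks are constructed by the analyst)
codebook = {
    "economy": ["inflation", "prices", "wages"],
    "health":  ["hospital", "vaccine", "disease"],
}

def code_segment(segment, codebook):
    """Assign every category whose keywords appear in the segment."""
    words = set(re.findall(r"[a-z]+", segment.lower()))
    return sorted(c for c, kws in codebook.items() if words & set(kws))

segments = [
    "Rising prices have outpaced wages this year.",
    "The hospital rolled out a new vaccine programme.",
    "Inflation is straining hospital budgets.",
]

coded = [code_segment(s, codebook) for s in segments]
print(coded)
```

Note that a segment may receive more than one category (the third segment above touches both the economy and health codes), which is consistent with manifest content often carrying several themes at once.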

The concept of Content Analysis has evolved over time. In a first phase, under the influence of Berelson (1952, apud Krippendorff, 1990), one of the classics of this technique, the major concern was to describe and quantify the manifest content of the documents under analysis; in this positivist perspective, the technique focused on denotations, that is, on the surface meaning of discourse.

In addition to its descriptive function and its focus on denotations, Content Analysis assumes an "inferential function, in search of a meaning that is far beyond what is immediately apprehensible, and which awaits the opportunity to be uncovered" (Amado, Costa, & Crusoe, 2017, p. 303). It is therefore also interested in the connotations of discourse, which often have more to do with what lies between the lines, with ellipsis, with what is implied, and with the tone itself than with what is explicit (Esteves, 2006; Morgado, 2012).

The inferential process, however, needs to obey rules and to be subject to some control, so that the analysts' imagination does not lead them into "naive or wild inferences" (Vala, 1986, p. 103). This concern raises the need for mechanisms that confer reliability and allow the validation of the entire analysis process; the reflection of this new step is the definition of Content Analysis offered by Krippendorff (1990), one of the most recognized authors in this field: "a research technique that allows valid and replicable inferences to be made from data to their context" (p. 28).

Replicability thus emerges as fundamental to confidence in the process, from a technical point of view, of identifying categories. As Lima (2013, p. 8) says, "it is important that the classification procedures be consensual so that different people can carry out this classification in a similar way. It is equally essential that the Content Analysis process be transparent, public and verifiable." Rigor is thus attained by applying appropriate procedures accompanied by a clear and fitting description of them, in which the definition of each category and subcategory is not lacking, and by making visible, in tables or matrices, some of the intermediate and final moments and results; rigor, therefore, is not to be confused with statistical analysis.
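The requirement that "different people can carry out this classification in a similar way" is commonly checked with inter-coder agreement statistics. As a minimal sketch (the coders, categories, and codings below are hypothetical), raw percent agreement can be complemented by Cohen's kappa, which corrects for the agreement expected by chance:

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Share of units both coders placed in the same category."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(coder_a)
    p_o = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Expected chance agreement from each coder's marginal category frequencies
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)
    return (p_o - p_e) / (1 - p_e)

# Two coders independently assign the same 10 text segments to categories
coder_a = ["risk", "hope", "risk", "blame", "hope",
           "risk", "blame", "hope", "risk", "hope"]
coder_b = ["risk", "hope", "blame", "blame", "hope",
           "risk", "blame", "risk", "risk", "hope"]

print(percent_agreement(coder_a, coder_b))
print(round(cohens_kappa(coder_a, coder_b), 3))
```

Here the coders agree on 8 of 10 segments (0.8), but kappa is lower (about 0.70) because some of that agreement would occur by chance; disagreements of this kind typically prompt revision of category definitions before the final analysis.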

On the other hand, the production of inferences is based on the establishment of relationships, based on logical and pertinent deductions, between four differentiated poles:

1. The data. These, in turn, can be analysed from certain perspectives, depending (among other considerations, such as the prior research questions and the objectives of the research) on:

a. what is said (in this case a thematic analysis, the most common in Content Analysis, which can focus on distinguishing themes, delimiting categories and subcategories within those themes, and calculating their relative frequency in the documental corpus as a whole);

b. by whom it is said (for example, the affinities between the message and the status or psychological state of the subject);

c. to whom it is said (analysis of relations, establishing the affinities between the message and its recipients);

d. for what purpose (analysis of the objectives of a particular message);

e. with what results (evaluative analysis, for example, of recipients' responses to the communication).

2. The frames of reference of those who produced the communication (intentions, social representations, presuppositions, 'states of mind', values and symbols, as well as biographical aspects and personality traits of the author of the communication, etc.);

3. The conditions of production, or the context in which the data emerged (the local context and the social, cultural, historical, and political circumstances in which the document was produced and which are reflected in it);

4. The frames of reference of the analysts, who must be theoretically and methodologically prepared to make their interpretations. That is, analysts must know and mobilize frames of reference absorbed, in large part, from one or more theories of the human and social sciences; they must know how to use intuition and creativity in identifying and delimiting topics, categories, and subcategories; and they must be equipped with a know-how-to-do and a know-how-to-be that allow them to make adequate decisions in the face of the data and to escape uncontrolled subjectivity and lapses of ethics.
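Once segments have been coded, the thematic analysis mentioned in point 1a above reduces to tallying categories across the corpus. A minimal sketch, with an invented coded corpus (the document names and category codes are assumptions for illustration):

```python
from collections import Counter

# Hypothetical coded corpus: each document maps to the category codes
# assigned to its segments during thematic analysis
corpus = {
    "interview_01": ["teaching", "assessment", "teaching", "workload"],
    "interview_02": ["workload", "workload", "teaching"],
    "field_notes":  ["teaching", "assessment"],
}

# Absolute frequency of each category over the documental corpus as a whole
totals = Counter(code for codes in corpus.values() for code in codes)
n = sum(totals.values())

# Relative frequencies, most common category first
for category, count in totals.most_common():
    print(f"{category}: {count} ({count / n:.0%})")
```

Such a frequency table is only the descriptive starting point; as the four poles above make clear, the inferential work of relating those frequencies to producers, contexts, and the analyst's theoretical frames remains interpretive.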

The definition offered by Robert and Bouillaguet (1997) seems to us one of the most comprehensive, encompassing both the descriptive, objective perspective and the subjective, inferential one: "Content Analysis stricto sensu is defined as a technique that enables the methodical, objective, and sometimes quantitative examination of the content of certain texts, in order to classify and interpret their constituent elements, which are not fully accessible to immediate reading" (p. 4).

In a previous text (Amado, Costa, & Crusoe, 2017) we summarized these considerations in the following terms: "We can therefore say that the most important aspect of Content Analysis is that, in addition to a rigorous and objective representation of a message (discourse, interview, text, article, etc.) through its codification and classification into categories and subcategories, it allows progress (fruitful, systematic, verifiable and to some extent replicable) towards capturing its meaning (at the cost of interpretive inferences derived or inspired by the theoretical frameworks of the researcher), including the less obvious areas constituted by the said 'context' or 'conditions' of production. We believe that it is this aspect that allows us to creatively apply Content Analysis to a wide range of documents (communications), especially those that translate subjective views of the world, so that the researcher can 'adopt' the role of the actor and see the world from his or her place, as proposed by research of an interactionist and phenomenological nature" (p. 306).



  • Open access
  • Published: 25 April 2024

A framework for the analysis of historical newsreels

Mila Oiva, Ksenia Mukhina, Vejune Zemaityte, Andres Karjus, Mikhail Tamm, Tillmann Ohm, Mark Mets, Daniel Chávez Heras, Mar Canet Sola, Helena Hanna Juht & Maximilian Schich

Humanities and Social Sciences Communications volume  11 , Article number:  530 ( 2024 ) Cite this article


Subjects: Cultural and media studies

Audiovisual news is a critical cultural phenomenon that has been influencing audience worldviews for more than a hundred years. To understand historical trends in multimodal audiovisual news, we need to explore them longitudinally using large sets of data. Despite promising developments in film history, computational video analysis, and other relevant fields, current research streams have limitations related to the scope of data used, the systematicity of analysis, and the modalities and elements to be studied in audiovisual material and its metadata. At the same time, each disciplinary approach contributes significant input towards reducing these limitations. We therefore advocate combining the strengths of several disciplines. Here we propose a multidisciplinary framework for systematically studying large collections of historical audiovisual news to gain a coherent picture of their temporal dynamics, cultural diversity, and potential societal effects across several quantitative and qualitative dimensions of analysis. Using newsreels as an example of such complex historically formed data, we combine the context crucial to qualitative approaches with the systematicity and capacity to cover large amounts of data characteristic of quantitative methods. The framework template for historical newsreels is exemplified by a case study of the “News of the Day” newsreel series produced in the Soviet Union during 1944–1992. The proposed framework enables a more nuanced analysis of longitudinal collections of audiovisual news, expanding our understanding of the dynamics of global knowledge cultures.


Introduction

Audiovisual news has affected the global knowledge landscape for over a century. As a media format, audiovisual material is impactful, and news as a genre is particularly effective for forming knowledge about the world. Although what counts as news is debatable (Tworek, 2019 ), labelling a story as ‘news’ suggests that the offered content has contemporary relevance and that it provides truthful information on the surrounding world—even if we know that this is not always the case (Winston, 2018 ; Lazer et al., 2018 ). Understanding audiovisual news content is crucial because news, taking part in creating the ‘media reality’ (Morgan, 2008 ), steers our gaze to the world, affects our opinions, and shapes our identities (Imesch et al., 2016 ; Hoffmann, 2018 ; Werenskjold, 2018 ). Even if some audiences may disagree with the content (Sampaio, 2022 ), news sets the agenda for societal discussions and contributes to what we consider worth knowing.

To fully understand the functions of audiovisual news content, production, and dissemination, it is helpful to explore them through large and consistent sets of data covering a long time span. Examining a consistent set of data that, for example, covers all the issues of a newsreel series gives an understanding of the variety of individual findings and contextualises them. Detecting long-term continuities and short-term trends helps us better understand past information culture, alongside what is perhaps specific to our time or area. However, so far, studying audiovisual news systematically via large quantities of data or across a long time span, in ways that take their complexity into account, has been hampered by the availability of data and the integration of methods across disciplines. Here we work towards a unified approach to studying audiovisual news that enables the comparison of data from different sources to reveal the cultural and temporal variations of the global news scene.

Newsreels were the first widely spread form of audiovisual news. Starting in France in 1909, these approximately ten-minute-long news films, shown in weekly changing series in cinemas, informed audiences about the latest political events, innovations, sports competitions, and fashion trends. Each newsreel issue contained around five to twelve short news stories, often showing the ‘more serious’ ones first and ending with entertaining topics. Until the mid-1950s, newsreels were the main source of audiovisual news for audiences globally, conveying both political propaganda and commercial interests. Their production continued in some countries under state support until the 1990s (Chambers et al., 2018 ; Pozharliev and Gallo González, 2018 ; Fielding, 2009 ).

Like other audiovisual products, newsreels are multifaceted. They are multimodal combinations of moving images, sounds, music, spoken and written language, gestures, iconographies, and signs, all deeply rooted in the surrounding societies. The meanings created are interrelated across modalities, with an individual news story gaining additional meaning depending on its embedding and temporal position in a newsreel issue. In fact, one may argue, the meaning of a newsreel issue can be understood only when looking at the contents of the other issues of the newsreel series. Therefore, in order to understand the messages and role of individual news stories in a society, it is necessary to study newsreels as a whole (Hickethier, 2016 ) and comparatively, systematically analysing larger collections through the interconnections of small-scale units. This, we argue, has to transcend the debates of any single community of practice, such as media studies or communication, which is why this paper brings together expertise from a broad range of research streams—including film history, computational video analysis, film studies, and the so-called New Cinema History—to study film and video in a comprehensive way.

Lately, many newsreel series have been digitised, and several national film archives as well as transnational collections, such as Europeana, the Internet Archive, and Wikimedia Commons, increasingly provide access to newsreels in digital form. This has opened new possibilities for studying long-term patterns of audiovisual news. However, despite promising developments in various disciplines, current approaches to newsreels, as discussed below, do not allow us to fully grasp these complex cultural products. Many established research fields are relevant to the longitudinal study of historical newsreels and audiovisual news in general. However, if the current approaches from each research stream are used separately, they produce considerable gaps in the nuanced understanding of newsreels in a long temporal continuum. The available approaches are either qualitative and do not allow a systematic analysis of large-scale data, or quantitative and reveal only one aspect of the multimodal newsreel data. Table 1 summarises the related research streams and their gaps, spanning the scope of data used, the comparability of analysis, the modalities taken into account, and the elements creating meaning in audiovisual material. At the same time, each research stream offers a contribution that helps fill the gaps in other fields. In the following paragraphs, we elaborate further on the gaps in each of these research streams together with the beneficial contributions that they may bring.

Qualitative film history

Qualitative film historical studies on newsreels have demonstrated the variety of production conditions, core messages, and distribution channels in a number of countries and at different times (Chambers et al., 2018 ; Garrett Cooper et al., 2018 ; Imesch et al., 2016 ). Their strength is that they take into account the interplay of multiple modalities in the film material and produce nuanced analyses of the messages conveyed to audiences. At the same time, however, they focus on temporally restricted segments of data, use analysis methods that are hard to apply to large quantities of data, and do not usually employ categories that would allow systematic comparisons between studies (Chambers, 2018 ; Pozharliev and Gallo González, 2018 ; Bergström and Jönsson, 2018 ; Vande Winkel and Biltereyst, 2018 ; Pozdorovkin, 2012 ; Veldi et al., 2019 ). The main limitation of qualitative enquiry is its incomplete ability to offer an understanding of what is prevalent and what is marginal in the data at large, which makes it difficult to see the bigger picture and contextualise findings. As van Noord ( 2022 ) notes, exploring recurring motifs or patterns in cultural data is crucial for a deeper understanding. Although an experienced qualitative scholar is usually able to point out some of the repeating patterns based on their accumulated knowledge of the field, computational methods can back that up, measure the prevalence of a pattern in the collection, and also detect other, possibly unnoticed patterns.

Computational video analysis

Computational video analysis focuses on the systematic study of large collections of data, while typically addressing a single modality rather than aggregates of contextual and temporal factors. Examples include increasingly accurate and effective methods for recognising shot and scene boundaries (Hanjalic, 2002 ; Rasheed and Shah, 2003 ), persons (Wang and Zhang, 2022 ), objects (Brasó et al., 2022 ), human poses (Broadwell and Tangherlini, 2021 ), the number of individuals in a crowd (Zhang and Chan, 2022 ), events (Wan et al., 2021 ), sounds (Park et al., 2021 ), and human and animal behaviour (Gulshad et al., 2023 ; Bain et al., 2021 ; Sommer et al., 2020 ), as well as for performing image segmentation (Hu et al., 2022 ). Different solutions for condensing audiovisual content have also been developed, either for creating video representations to enable efficient browsing (Zhao et al., 2021 ) or numerical fingerprints allowing comparisons of video content for retrieval and recommendation systems (Kordopatis-Zilos et al., 2022 ; Nazir et al., 2020 ). Deep learning applications in computer vision have been used for various item recognition tasks in images and videos (Bhargav et al., 2019 ; Liu et al., 2020 ; Kong and Fu, 2022 ; Brissman et al., 2022 ; Kandukuri et al., 2022 ). While mainstream computational video content analysis has focused on images, other modalities, like sound, have also been gaining increased attention (Valverde et al., 2021 ; Yang et al., 2020 ; Senocak et al., 2018 ; Hasegawa and Kato, 2019 ; Hu et al., 2022 ; Ye and Kovashka, 2022 ; Sanguineti et al., 2022 ; Pérez et al., 2020 ), eventually feeding into multimodal analysis (Mourchid et al., 2019 ; Ren et al., 2018 ). However, considering the different modalities of audiovisual data together, particularly within the historical focus of this paper, remains outside the mainstream of video analysis. In addition, there is a lack of discussion on how certain units of analysis, such as recognised objects or condensed forms of video content, can be credibly used to detect the ways audiovisual content creates and conveys meaning to audiences.
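To give a concrete (if deliberately minimal) sense of what shot-boundary detection involves, the sketch below flags a cut when the grey-level histograms of consecutive frames diverge sharply. This is not the method of any work cited above; the function name, bin count, and threshold are illustrative assumptions, and the "frames" are synthetic arrays rather than decoded video.

```python
import numpy as np

def detect_shot_boundaries(frames, bins=16, threshold=0.5):
    """Flag indices where consecutive frame histograms differ sharply.

    frames: iterable of 2-D uint8 arrays (grayscale frames).
    Returns a list of frame indices at which a cut is suspected.
    """
    hists = []
    for frame in frames:
        h, _ = np.histogram(frame, bins=bins, range=(0, 256))
        hists.append(h / h.sum())  # normalise to a probability distribution
    cuts = []
    for i in range(1, len(hists)):
        # L1 distance between consecutive histograms lies in [0, 2]
        if np.abs(hists[i] - hists[i - 1]).sum() > threshold:
            cuts.append(i)
    return cuts

# Synthetic example: 10 dark frames followed by 10 bright frames
rng = np.random.default_rng(0)
dark = [rng.integers(0, 60, (32, 32), dtype=np.uint8) for _ in range(10)]
bright = [rng.integers(180, 255, (32, 32), dtype=np.uint8) for _ in range(10)]
print(detect_shot_boundaries(dark + bright))  # → [10]
```

Real systems replace the histogram difference with learned features and adaptive thresholds, but the structure — per-frame description, pairwise comparison, thresholding — is the same.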

Computational film studies

Situated between the qualitative and quantitative study of audiovisual contents, computational film studies often combine the two approaches. This stream of literature started by using shot detection to analyse film dynamics and editing styles (Salt, 1974 ; Tsivian, 2009 ; Pustu-Iren et al., 2020 ). In addition to addressing dynamics as an important modality of audiovisual content, computational film scholars have also been combining different modalities, such as images and sound (Grósz et al., 2022 ), spoken texts (Carrive et al., 2021 ; van Noord et al., 2021 ), or shown locations (Olesen et al., 2016 ). Computational studies of newsreels more specifically have addressed the contents of news either on the level of textual descriptions of news story topics (Althaus et al., 2018 ; Althaus and Britzman, 2018 ) or at a more granular level combining different modalities by analysing the voice-over text and automatically recognising well-known individuals in the film material (Carrive et al., 2021 ).

An ongoing debate in computational film studies concerns how film creates meaning, what the most important meaning-making units are, and how they could best be extracted (Chávez Heras, 2024 ; Burghardt et al., 2020 ; Burges et al., 2021 ). A profound challenge is that many modalities of film, such as images, can be interpreted in divergent ways depending on the viewer and their context (van Noord, 2022 ; Arnold and Tilton, 2019 ; Pozdorovkin, 2012 ). Different modalities may also create juxtaposing messages (Pozharliev and Gallo González, 2018 ). David Bordwell ( 1991 ) has argued that films contain ‘cues’ on which the further comprehension and interpretation of their meaning is based. Although spectators may have differing opinions on the profound message of a film, an important hypothesis is that they nevertheless usually agree on what the meaning-making cues are (such as shown activities or spoken sentences). This means that the variety of “credible” interpretations of the message of a film is limited (Bordwell, 1991 ). A central premise of computational film studies is thus that it can be possible to detect these cues and, through them, reach for an aggregate meaning of films.

Lately, in pursuit of understanding the meanings carried by film, a number of scholars have been using recognition and annotation of pre-set categories or stylistic features, discussing whether human interpretation should be applied already at the moment of recognising the items or at a later stage of the analysis (Carrive et al., 2021 ; Bhargav et al., 2019 ; Heftberger, 2018 ; Burges et al., 2021 ; Williams and Bell, 2021 ; Hielscher, 2020 ; Cooper et al., 2021 ; Bakels et al., 2020 ; authors discussing this issue: Burghardt et al., 2020 ; Arnold et al., 2021 ; Masson et al., 2020 ). Some scholars further problematise object recognition by stating that, in addition to recognising an object, we should know how it is portrayed in order to understand its meaning (Hielscher, 2020 ), and call for more thorough consideration of which measures can be used to analyse film contents (Olesen and Kisjes, 2018 ). This discussion connects with the wider question of whether there are cues in film that create meaning, how to find them, how to decide what to measure, and how to make sure that what is being measured responds to salient research questions. Although computational studies of historical newsreels use elaborate methods (Carrive et al., 2021 ; Althaus et al., 2018 ; Althaus and Britzman, 2018 ), a more explicit discussion of the connection between the research questions and the variables would be an important methodological amendment.

New cinema history

New Cinema History (Maltby et al., 2011 ) stresses the importance of societal and temporal context in recent studies on film production (Dang, 2020 ), circulation (Clariana-Rodagut and Hagener, 2023 ; van Oort et al., 2020 ; Verhoeven et al., 2019 ; Navitski and Poppe, 2017 ), and reception (Treveri Gennari and Sedgwick, 2022 ). The premise of this discipline is that alongside the content, the surrounding context and its change over time are crucial factors in creating the meaning of film (as also pointed out by van Noord et al., 2022 ). Focusing on the contextual factors, this research stream has dealt less with content, yet because the meaning of cultural artefacts relies on both, these aspects need to be combined to reach a more nuanced understanding of newsreels or their aggregated meaning.

Digital hermeneutics

Examining historical material adds its own particularities to a study. Current digital historical research has used the concept of ‘digital hermeneutics’ to call for epistemological criticism of data (i.e., source criticism) and of methods (Fickers et al., 2022 ; Oberbichler et al., 2022 ; Salmi, 2020 ; Paju et al., 2020 ). It is crucial to understand how the data was formed and by whom, and what kinds of activities and worldviews it reflects. Firstly, temporal change in the meaning of formally similar units has to be taken into account. For example, showing a horse in a newsreel in 1910 and in 1990 most likely creates very different interpretations. Secondly, digitised data are no longer in their original format (Fickers, 2021 ) and may contain traces left by the production, storage, archiving, digitising, and acquisition processes. For instance, textual descriptions of newsreel content are often added during the digitisation of the material and thus might reflect the perceptions or diligence of the digitisers rather than the activities of the original newsreel authors (Elo, 2020 ; see also Althaus and Britzman, 2018 ). As our case study shows in Section III, heavily censored data can also offer relevant results when interpreted with an understanding that it provides the view of the authorities. Gaps in the data can produce meaningful insights. Therefore, it is important to account for which activities, and to whom, the analysed traces belong. Furthermore, off-the-shelf computational analysis methods are often trained on contemporary materials and may not work equally well with historical materials without adaptation (Grósz et al., 2022 ; Bhargav et al., 2019 ; Wevers, 2021 ; Wevers and Smits, 2020 ). Finally, the quality of cultural heritage materials can vary greatly, which poses additional challenges when studying long-term developments of audiovisual news.

Towards a unified approach

To summarise, while computational video analysis customarily assumes meaning to be contained in the artefact (i.e. the video), qualitative research and New Cinema History argue that meaning only arises when the artefact comes into contact with its audience and can be perceived as having different meanings. Simultaneously, an analysis that ignores inter-subjective contingency is blind to context; an interpretive framework that ignores inter-objective dependency is blind to structure. Both the content and the context should be taken into account, and, we argue, substantial advances in audiovisual (news) studies can be made by coupling these two positions.

The analysis framework for audiovisual newsreel corpora outlined in this paper was co-designed within a research process that started with experimental explorations of newsreel data, while negotiating and integrating methods from a spectrum of disciplines brought together in the CUDAN ERA Chair project for Cultural Data Analytics at Tallinn University. Oscillating between joint reflections in collaborative group work, including two three-day hackathons, and more concentrated work on individual aspects eventually led to the proposed generalisation of multidisciplinary collaboration in a systematic research process to make sense of historical newsreels at corpus scale. Following C.P. Snow’s call regarding the necessity to bridge the so-called “two worlds” of scholarly enquiry (Snow, 2001 [1959]), our starting point was that multidisciplinary integration brings forth more than the sum of its components. The specific stages of the proposed framework, explained in more detail below, were discovered by combining the established research processes of cultural data analytics and digital history, while experimenting with different ways of integrating quantitative and qualitative approaches, including expertise usually found in computation and the natural sciences.

The objective of the framework is to exemplify how qualitative and quantitative approaches can be successfully brought together into a joint research pipeline. Towards this purpose, we combine the strengths of qualitative film history, computational video analysis, computational film studies, and New Cinema History listed in Table 1 , while closing their mutual and common gaps. In sum, we present a framework for systematically studying large collections of historical newsreels covering several decades in the context of their temporal and cultural dynamics, diversity, and functions. We propose bringing together a comprehensive set of aspects for a nuanced understanding of newsreels as an interplay of different modalities and contextual factors. The framework includes both qualitative and quantitative research feeding into a systematic approach and ability to cover large quantities of data. The framework, which we discuss in Section II, constitutes a schematic template for research projects combining quantitative and qualitative approaches (see Fig. 1 ). In Section III, we exemplify the framework using a dataset of “News of the Day” newsreel series produced in the Soviet Union in 1944–1992. Finally, Section IV contains the discussion and concludes the article.

Figure 1. The newsreel framework combines qualitative and quantitative approaches into a research pipeline. It contains (a) pairing meaning-making units with variables, (b) digital data (source) and method criticism, and (c) combining quantitative analysis with qualitative conclusions.

Newsreel Framework

Our framework essentially centres around a workflow pipeline configuration (Oberbichler et al., 2022 ) that includes qualitative and quantitative enquiry (Fig. 1 ). There are three important stages in the pipeline: detecting and pairing the meaning-making units and variables; digital data (source) and method criticism; and merging and explaining analysis visualisations of different dimensions of the data (Fig. 1a–c ). The study of newsreels begins with identifying meaningful research questions and data, in relation to preceding research. Perhaps more explicitly than in established qualitative approaches, we propose to identify relevant meaning-making units arising from preceding research and qualitative enquiry, and to pair them with available variables at the first stage (Fig. 1a ). In the second stage, we account for the different temporal layers embedded in digitised heritage data to gain a better understanding of how the variables connect with the meaning-making units and the final conclusions of the study (Fig. 1b ). After this, appropriate analysis methods are selected, keeping in mind the available variables and research questions, followed by computational analysis. In the third stage, the selected variables are studied quantitatively, feeding into an examination of the resulting dimensions of analysis to jointly produce the final qualitative conclusions (Fig. 1c ). This stage brings the dimensions of analysis together, critically evaluates what the findings jointly report, contextualises them, and responds to the research questions. Adding these three stages to the research pipeline ensures that newsreels are analysed systematically by considering the multidimensional nature of the meaning of cultural data (Schich, 2017 ; Cassirer, 1927 ), focusing on variables relevant to the research questions, and accounting for multimodality in the final results. The framework is modular, which means that it allows selecting methods that suit the particular research question, or using multiple methods comparatively, while dealing with particular meaning-making units and variables. Qualitative and quantitative enquiry are firmly intermingled and mutually dependent in this research process, as exemplified in our case study below. Importantly, different parts of the research project are continuously adjusted in relation to each other (Schich, 2017 ; Gadamer, 2013 [1960]).

While meaning-making units are the elements related to human understanding of what the phenomenon under study is composed of, variables are the metadata entries or other features of the data that can be directly analysed computationally (cf. the distinction between elements and features in GIS; Zeiler, 1999 ). Detecting and pairing the meaning-making units with the available or traceable variables (Fig. 1a , see also Fig. 2 ) improves the critical evaluation of meaning and the comparability identified as gaps in preceding research (see Table 1 ). Furthermore, it establishes an explicit connection between the analysed variables and the phenomenon under study, enabling critical evaluation. The preceding literature uses the term ‘cue’ to refer both to what we here call meaning-making units and to variables (e.g. Bordwell, 1991 ; Ren et al., 2018 ), which complicates differentiating between the two. The meaning-making units come from the initial idea of the study, the research question, and the preceding literature, while variables are concretely present in the data. While they arise from different roots, the successful pairing of the two concepts is crucial for a fruitful study.
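The pairing step can be pictured as a simple lookup from units to the variables that could operationalise them, filtered by what a given collection actually provides. All unit and variable names below are hypothetical illustrations, not the categories used in the case study:

```python
# Hypothetical mapping from meaning-making units (from the literature
# and research questions) to candidate variables (features of the data).
meaning_making_units = {
    "content topics":    ["text_description", "keywords"],
    "persons shown":     ["text_description", "video_frames"],
    "issue dynamics":    ["shot_count", "mean_shot_length"],
    "production agency": ["credited_authors"],
}

# Variables actually present in this (invented) collection's metadata
available = {"text_description", "keywords", "shot_count",
             "mean_shot_length", "production_year"}

# Which units can be studied with this collection, and through what?
pairing = {unit: [v for v in candidates if v in available]
           for unit, candidates in meaning_making_units.items()}
analysable = {unit: vs for unit, vs in pairing.items() if vs}
print(analysable)
```

Here "production agency" drops out because no available variable supports it; in the framework's terms, that gap would prompt either data enrichment or a revised research question.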

Figure 2. (a) Meaning-making units selected from Supplementary Table 1 for further analysis. (b) Existing and enriched variables of the News of the Day data; arrows signify data enrichment based on the original data. (c) Resulting dimensions of analysis that interconnect the meaning-making units and variables.

The meaning-making units are elements that make up the phenomenon under scrutiny. Examples of meaning-making units, as relevant for newsreel research and broadly agreed in literature, include images, voice-over narration, acoustic motifs, the persons, activities, or locations shown, and content topics (Supplementary Table 1 ). Contextual factors are also important, including the socio-political circumstances, other concurrently available mass-communication media, and agency-related issues, like funding and the role of audiences. Relevant meaning-making units can be identified via an extensive literature review of qualitative studies on the topic to see what elements are often suggested and by critically evaluating the gaps. Of course, they may also emerge from analysis itself, in which case the research is firmly going beyond the state of the art.

In addition to the existing feature variables, others can be added by enriching the data manually or algorithmically, or by adding further data sources. As Table 2 shows, the most frequent metadata entries in the largest openly available collections of digitised newsreels contain information on production year, newsreel series title, duration, and content annotations either as text or keywords. The metadata entries, together with the available newsreel videos, form the basis for extracting variables. They can be further enriched with information concerning the newsreel authors, distribution, audience reactions, and so on. To obtain well-selected units for computational analysis, it is crucial to critically evaluate and pair the meaning-making units necessary for responding to the research questions with variables that are available or traceable via enrichment. Notably, some variables might reveal meaning-making units indirectly (e.g. the number of people working on newsreels can be indicative of funding and the societal importance of newsreels).
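Enrichment of this kind often amounts to deriving new variables from fields already present in the metadata. The sketch below is illustrative only; the field names and the pipe-separated description format are invented, not the Net-Film schema:

```python
# A single (invented) metadata record for one newsreel issue
raw = {"title": "News of the Day No. 12", "year": 1955,
       "description": "Harvest in Ukraine. | New metro line. | Football final."}

def enrich(record):
    """Derive new variables (story count, decade) from existing fields."""
    stories = [s.strip() for s in record["description"].split("|") if s.strip()]
    return {**record,
            "story_count": len(stories),          # derived variable
            "decade": record["year"] // 10 * 10}  # coarser temporal bin

print(enrich(raw)["story_count"])  # → 3
```

The derived `story_count` is an example of a variable that pairs with a meaning-making unit (issue composition) only indirectly, via an assumption about how descriptions delimit stories — exactly the kind of link the framework asks researchers to make explicit.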

The second stage we propose for the analysis part of the pipeline is to incorporate digital data (source) criticism by taking into account the historical multidimensionality of heritage data, as well as the temporal change affecting the meaning-making units and variables (Fig. 1b ). This stage includes qualitative historical reflection, complementing the two other stages of the framework (Fig. 1 a, c). At this stage, the researchers first scrutinise how historical traces, stemming from production, storage, archiving, or digitising, are present in the data, affecting which variables should be selected for further analysis. A variable can connect with different meaning-making units depending on when and by whom it was created. For example, if a textual description of the contents was created as a newspaper advertisement or censorship card at the time of producing the newsreels (Werenskjold, 2018 ; Althaus and Britzman, 2018 ), the variable connects to distribution and competition within the cinema market, or to the political context. If it was created during a later digitisation process, it should be associated with later generations’ interpretations of what is noticeable in the contents. Secondly, the researchers return to this stage after completing the computational analysis of the variables to weigh the effect of temporal change on the analysis results. As an example, they might reflect on whether an increasing number of detected cars is due to an explicit choice of the filmmakers, an overall increase in the number of cars in society, the fact that the algorithm used detects newer car models better than older ones, or some other reason. Some results may also be absent due to conscious selections in data handling. For example, as our case study in Section III demonstrates, the qualitatively observed absence of footage portraying Stalin before his death in 1953 is most likely a result of the de-selection of this material from the data (Fig. 3a ). With these twofold reflections concerning content and method dependency, this stage addresses the lack of historical contextualisation identified in the preceding literature (see Table 1 ) by proposing to take the temporal aspects of the data into consideration both when selecting the variables and when performing the final analysis.

Figure 3. (a) All News of the Day issues (scatter plot): x-axis, publication years; y-axis, issue number; the total number of news stories per year is based on textual outlines. (b) Number of shots per issue over time. (c) Mean shot length per issue. (d) Shares of news story topics per year, classified from the textual descriptions of newsreels using an instructable zero-shot classifier; each news story is classified with a single class. (e) A UMAP projection of story embeddings, coloured by the content predictions in (e) and (f). (f) Annual news story topic distribution averaged over years.

In the third stage, the selected variables are computationally examined and visualised as different dimensions of analysis (Fig. 1c ). Evidently, the research methods used should be selected so that, applied to the available variables, they respond to the research questions (on method selection and comparison see e.g. Opoku et al., 2016 ; Gentles et al., 2016 ). This stage addresses the lack of multimodality identified in the preceding literature (see Table 1 ) and allows newsreel contents to be combined with their contexts in a more streamlined manner. These dimensions, focusing, for example, on newsreel production conditions or on the visual and content dynamics of newsreels, are further combined thematically or temporally into preliminary findings. Ideally, the dimensions of analysis represent different parts of the newsreel production, content, and distribution process to reach a more comprehensive understanding of them. The findings are then merged with wider contextual information from the preceding literature.

The approach proposed here arises from discussions within the field of Cultural Data Analytics (Arnold and Tilton, 2023; van Noord et al., 2022; van Noord, 2022; Manovich, 2020; Arnold and Tilton, 2019; Schich, 2017; CUDAN, 2020–2024). The starting points of this multidisciplinary approach are that cultural phenomena are inherently multi-scale and vary through time and space, that the interactions of particularity and universality are important, and that the meaning of cultural phenomena lies in the multidimensional relations of entities. When reaching for a bigger picture through longitudinal exploration, the main challenge lies in maintaining the multitude of the phenomenon under study while simultaneously tracking the dynamics of selected variables. Under these circumstances, recognising plurality and multidimensionality is crucial for understanding cultural phenomena, and we should be careful when reducing this multitude to means or homogeneous groups (van Noord et al., 2022; van Noord, 2022; Manovich, 2020).

The design of our newsreel framework supports maintaining the multitude of cultural data while tracking its dynamics in a manner that allows comparisons across time and datasets. The following section exemplifies the application of the proposed framework to the analysis of the “News of the Day” newsreel series, published weekly in the Soviet Union from 1944 to 1992.

Materials and Methods

The data used in the case study is a collection of 1747 issues of the Russian-language Soviet newsreel journal News of the Day, digitised by the Net-Film company and covering the years 1944–1992. We scraped the video files of the newsreels, together with metadata containing the production year, issue number, authors, and brief content descriptions in Russian and English, with the permission of the data provider, Net-Film. The data is incomplete in many ways: the collection lacks some newsreel issues, the image and audio quality of the videos is low, and the metadata is imperfect. When working with digitised historical data and analysing the results it yields, incompleteness is a common feature that needs to be taken into account (Carrive et al., 2021). At the same time, as our case study shows below, systematic holes in the data can reveal crucial source-critical aspects of it, informing the whole research. It is part of a historian's skillset to be able to work with incomplete data and to decide how far conclusions can be drawn from it (Howell and Prevenier, 2001).

The methodology of our case study followed the phenomenon categorisation proposed above: defining the central meaning-making units, and organising and enriching the data to obtain the corresponding variables. We selected the methods for analysing the resulting variables based on the team members' domain expertise and their evaluations of which methods would best respond to the research question of how the world was depicted in the News of the Day, and by what kinds of groups of individuals the newsreels were produced. As the more detailed description of the methods below shows, all steps of the research process intermingled qualitative, quantitative, computational, and human-made processes.

Meaning-making units

The table containing the meaning-making units of newsreels (Supplementary Table 1) was prepared through extensive reading of the preceding qualitative literature on newsreels. Identifying meaning-making units in qualitative research was purposeful because qualitative analysis takes a more holistic view of the phenomenon under scrutiny than quantitative approaches do. We collected all the meaning-making units mentioned in the studies, even in passing. Because scholars use varying terminology, we homogenised and aggregated the labels of the units. In addition to giving a general view, the table also helps to pinpoint groups of studies with different emphases, for example those focusing on more abstract motifs, or those emphasising contextual and agency-related meaning-making units instead of contents.

The matrix of the most frequent variables in the largest openly accessible collections of digitised newsreels (Table 2) lists the most commonly used metadata fields and their presence in some of the best-known digitised newsreel collections. To map the variety and prevalence of the metadata fields, the matrix lists the entries under a common description rather than the specific entry titles each individual collection uses. Different digitising and archiving projects may use different types of metadata in variable formats, which may necessitate harmonising data in projects using several collections (see also Beals and Bell, 2020). In addition to the listed metadata entries, many collections also contain other data. For this mapping, we did not study how completely the metadata entries have been filled in or how consistent the data is. We have marked with "x" those entries that already exist, and with "i" those that can be extracted from the data. When selecting the variables for a study, qualitative evaluation of the historical dimensions of the data is essential.

Data enrichment

We enriched the data by extracting further information from both the newsreel videos and the metadata. For the videos, we ran shot boundary detection (SBD), extracted the middle frame of each shot, and produced a ResNet50 (He et al., 2015) embedding for those frames. From the textual descriptions of newsreel contents in the metadata, we identified the places mentioned using Named Entity Recognition (NER), and geocoded the recognised locations by adding lat/long coordinates. We also automatically detected the assumed gender of newsreel directors and other crew members based on their surnames, which are grammatically gendered in Russian (Fig. 2b). All automated steps involved qualitative and manual validation and correction of the processed results, with human expertise in the loop.
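
The surname-based gender inference can be sketched as a suffix heuristic. The suffix lists below are illustrative examples of the grammatical gendering of Russian surnames, not the exact rules used in the study, and, as noted above, all automated outputs were manually validated:

```python
# Heuristic gender inference from grammatically gendered Russian surnames.
# A sketch of the idea only: the suffix lists are illustrative, not the
# exact rules used in the study.

FEMININE_SUFFIXES = ("ова", "ева", "ина", "ына", "ская", "цкая")
MASCULINE_SUFFIXES = ("ов", "ев", "ин", "ын", "ский", "цкий")

def infer_gender(surname: str) -> str:
    """Return 'f', 'm', or 'unknown' based on the surname ending."""
    s = surname.strip().lower()
    # Check feminine endings first: they are longer variants of the
    # masculine ones (e.g. 'ова' vs 'ов'), so order matters.
    if s.endswith(FEMININE_SUFFIXES):
        return "f"
    if s.endswith(MASCULINE_SUFFIXES):
        return "m"
    return "unknown"  # e.g. indeclinable surnames need manual review
```

Ambiguous cases ("unknown") are exactly the ones that the human-in-the-loop validation step would resolve.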

News story categories

Each News of the Day issue is split into individual stories (12,707 across the 1747 reels), which have synopsis-like descriptions in the metadata. We corrected small numbering and consistency issues in a minority of them by hand. We then applied two types of automatic content categorisation to the stories: topic modelling and content classification. Topic modelling (often using Latent Dirichlet Allocation, a form of "soft" clustering) is a common approach in digital humanities and other fields dealing with large text collections. For topics, we used a pretrained-model-driven approach (Angelov, 2020; Grootendorst, 2022) in which texts are first embedded using a word or sentence embedding (we use fastText; Bojanowski et al., 2017) and then clustered, with cluster keywords derived via grouped term frequency-inverse document frequency (TF-IDF) scaling. The upside of topic modelling as an explorative approach is that the topics need not be known in advance. The downside is that the clusters may be hard to interpret or even meaningless, and the number of clusters must still be defined in advance. We therefore also experimented with another classification approach.

While in the recent past classifying content or topics would have required purpose-trained supervised classifiers, the advent of instructable large language models (LLMs, such as ChatGPT) makes it possible to predict topic or class prevalence in a "zero-shot" manner. Instead of training or tuning a classifier on annotated examples, generative LLMs can simply be prompted (instructed) to output relevant text, including topic tags for an input example accompanied by the prompt. The simplest prompt would be along the lines of "Tag this sentence as being of topic X or Y. Example: [text]", but we find that more verbose prompts with topic definitions yield more accurate results. We defined eight topics of interest based on previous qualitative literature and Soviet history: USSR politics; sports; military (defence, wars); scientific and industrial progress (including innovation, construction projects, space, and aviation); USSR economy and industry; USSR agriculture (excluding other economy topics); natural disasters; and social issues and lifestyle (including education, family, health, leisure, culture, and religion), plus a "misc" topic meant to cover everything else (for the prompts, see the Supplementary material). We tested the zero-shot classification accuracy of two of OpenAI's generative pre-trained transformer (GPT) models, gpt-3.5-turbo-0301 and gpt-4-0301 (OpenAI, 2023), which achieved 88% and 84% accuracy respectively on a hand-annotated 100-story test set. We therefore applied the 3.5 model to the rest of the story synopses, as illustrated in the Results section.
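
A minimal sketch of the prompt construction and response parsing is shown below. The wording is illustrative, not the exact prompts from the Supplementary material, and the actual classification sends the built prompt to the OpenAI chat API:

```python
# A sketch of zero-shot topic tagging with an instructable LLM.
# The prompt wording here is illustrative; in the study, the built prompt
# (with fuller topic definitions) was sent to gpt-3.5-turbo.

TOPICS = [
    "USSR politics", "sports", "military",
    "scientific and industrial progress", "USSR economy and industry",
    "USSR agriculture", "natural disasters",
    "social issues and lifestyle", "misc",
]

def build_prompt(synopsis: str) -> str:
    """Instruct the model to choose exactly one topic for a synopsis."""
    topic_list = "; ".join(TOPICS)
    return (
        "Classify the following newsreel story synopsis into exactly one "
        f"of these topics: {topic_list}. "
        "Answer with the topic name only.\n"
        f"Synopsis: {synopsis}"
    )

def parse_label(response: str) -> str:
    """Map a raw model response onto a known topic, defaulting to 'misc'."""
    text = response.strip().lower()
    for topic in TOPICS:
        if topic.lower() in text:
            return topic
    return "misc"
```

Falling back to "misc" on unparseable responses mirrors the role of the catch-all topic in the defined scheme.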

Visual characteristics

We extracted 117 shots on average (range 20–247) per newsreel video, with on average 126 frames (5 s) per shot (range 4–4508 frames, or 0.2–180 s). Representing each shot with one frame, the corpus consists of 205,678 frames in total. We used a pretrained ResNet convolutional neural network (CNN) to embed the extracted video frames in a high-dimensional feature space. The original training set of ResNet50 is ImageNet (Deng et al., 2009), a standard collection of contemporary images, whereas here we apply it to a collection of low-resolution, mostly grayscale images. To identify clusters of visually similar frames and detect common themes across reels, we projected the embedding space into 2D using common dimension-reduction methods such as t-SNE (van der Maaten and Hinton, 2008) and UMAP (McInnes et al., 2020). The Collection Space Navigator (Ohm et al., 2023), an interactive open-source tool for exploring image collections, was instrumental in exploring the large-scale visual data and gaining new insights into it. We also visualised all the newsreels by sequencing one frame per shot next to one another, effectively creating a storyboard covering all the examined newsreels. In this part we used standard methods with known biases (see, for example, Studer et al., 2019).
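
As a numpy-only sketch of how such embeddings support similarity search across reels, the following retrieves the frames closest to a query frame by cosine similarity; any (n_frames, dim) float array stands in here for the ResNet50 features:

```python
# Finding visually similar shots from frame embeddings via cosine
# similarity. Real embeddings come from a pretrained ResNet50; any
# (n_frames, dim) float array works for the sketch.
import numpy as np

def most_similar(embeddings: np.ndarray, query_idx: int, k: int = 3):
    """Indices of the k frames most similar to frame `query_idx`."""
    # L2-normalise rows so the dot product equals cosine similarity.
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = X @ X[query_idx]       # cosine similarity to the query frame
    sims[query_idx] = -np.inf     # exclude the query itself
    return np.argsort(sims)[::-1][:k].tolist()
```

The 2D projections (t-SNE, UMAP) serve the same purpose at collection scale: nearby points in the projection correspond to high cosine similarity in the embedding space.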

From the results of the Named Entity Recognition (NER), we extracted mentions of cities. We used Wiktionary and the authors' knowledge of Russian grammar to compile additional name-derivative words related to the cities. Using this list, we counted mentions of cities in the story descriptions. We qualitatively distinguished five types of city mentions: a) the city itself and its dwellers; b) organisations located in the city and named after it; c) names of a region named after the capital (for example, 'Leningrad oblast') and organisations located there; d) toponyms named after the city but not located there or in its vicinity, including entities, treaties, and historical events (for example, 'Warsaw Pact'); e) not a mention (coincidences and homonyms). We added geo-coordinates taken from Wikipedia to the list of cities to visualise them on a map.
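
The counting step can be sketched as a regular-expression search over the story descriptions; the derivative forms below are illustrative stand-ins for the lists compiled from Wiktionary:

```python
# Counting city mentions, including name-derivative forms, in story
# descriptions. The derivative lists here are illustrative examples only.
import re
from collections import Counter

CITY_FORMS = {
    "Leningrad": ["Leningrad", "Leningraders", "Leningrad oblast"],
    "Minsk": ["Minsk", "Minskers"],
}

def count_city_mentions(texts):
    counts = Counter()
    for text in texts:
        for city, forms in CITY_FORMS.items():
            # Word-boundary match so a city name does not hit inside
            # unrelated words.
            pattern = r"\b(?:" + "|".join(map(re.escape, forms)) + r")\b"
            counts[city] += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts
```

Distinguishing the five qualitative mention types (dwellers, organisations, regions, treaties, homonyms) would then be a manual or rule-based pass over the matches.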

Crew composition

We used the newsreel crew metadata to construct a directed graph of co-working relations (Verhoeven et al., 2020), where directors and other crew members act as nodes and edges indicate collaboration on a newsreel issue. Edges are drawn from the director to all other crew members and signify hiring and supervisory relationships. We utilised Levenshtein distance (Levenshtein, 1965; see also Navarro, 2001) to detect potentially misspelt duplicate names and manually checked whether nodes needed to be merged. The crew dataset contains information about 1251 people who worked on 1730 newsreel productions during 1954–1991 across different positions: director (1740 roles by 104 persons), cinematographer (15,145 roles by 1132 persons), and other crew (editors, sound designers, etc.; 158 roles by 45 persons). Notably, a small portion of the staff worked across different roles. The dataset yields a network with 1251 unique person nodes and 15,425 person-to-person links. The first nine years of the data collection period were omitted from the network analysis due to inconsistent data.
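
The name-deduplication step can be sketched with a pure-Python Levenshtein distance that flags near-duplicate crew names for manual review:

```python
# A sketch of the name-deduplication step: classic dynamic-programming
# edit distance, used to flag near-duplicate crew names for manual review.
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def possible_duplicates(names, max_dist=1):
    """Pairs of distinct names within `max_dist` edits of each other."""
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if 0 < levenshtein(a, b) <= max_dist]
```

Flagged pairs are candidates only; as described above, the decision to merge nodes was made manually.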

Results

Cinematic and topic trends

The cinematic and topic trends of the News of the Day data show that newsreel production and release, as measured by the number of newsreel issues, appears stable over fifty years (Fig. 3a), with consistent content shares dedicated to the different topics (Fig. 3e). The first and last few years (1945–1953 and 1990–1992) look somewhat different, but they have much less data than the rest of the period (Fig. 3a). The newsreel issue numbers recorded, released, and preserved in the sparse available data before 1954 suggest that newsreels were produced more or less weekly during that period, but only a tiny fragment has been stored and/or digitised (Fig. 3a). The absence of data before 1954 most likely relates to the 'de-Stalinization' of film materials after Stalin's death in 1953, which included the confiscation of materials with excessive references to the former leader (Heftberger, 2018). During 1954–1986, weekly production was stable, and the newsreels were archived, kept, and later digitised systematically (apart from 1965, for which data is missing). From 1987, the annual number of produced newsreels decreased by half. This 1987 drop in production volume coincides with perestroika, characterised by economic turbulence and a rethinking of the Soviet media ecosystem (Rodgers, 2014).

Topic-wise, the shares of political, economic, agricultural, and social news, classified using the zero-shot prediction approach, remained relatively stable until the mid-1980s, when social, and later political, themes began to take up more room in the preserved newsreels (Fig. 3f). The trend shows an annual rhythm (Fig. 3d): social news topics usually increased around issue numbers 8–9, coinciding with International Women's Day, and around issue numbers 48–52, coinciding with the New Year, both officially recognised celebrations in the Soviet Union. The topic of agriculture was also more prominent around issues 30–40, published in August and September, the most important months of the harvest.

A closer look makes it possible to identify subtle changes across the observed period. Although the annual number of issues remained relatively stable during 1954–1986, the number of news stories per issue, determined from the textual outlines in the metadata, decreased gradually during this time (Fig. 3a). The number of shots per newsreel also decreased over time (Fig. 3b), while the mean shot length started to increase towards the end of the period (Fig. 3c). These results run contrary to the findings of scholars studying Hollywood feature films, which indicate shortening shot lengths towards the end of the 20th century (Cutting et al., 2011). The reasons for the 'stagnating' Soviet newsreel dynamics should be further explored, with obvious candidates including the availability of film material of extended length and the labour cost of post-production work such as cutting and composition. While we provide preliminary exploratory results here, quantitative data like these naturally also allow for the testing of specific hypotheses.

Our examination of the central frames of each shot reveals visual patterns that recur throughout the studied period (Fig. 4). Laying out all the frames of every issue as a storyboard shows subtle variation in the length and darkness of the (digitised) film material, as well as the introduction of colour film in the mid-1980s (Fig. 4a right). Placing the frames in the order of year, issue, and scene number allows for comparing the recurring patterns and changes of the newsreel series. For example, the closeup of the storyboard shows that the opening title frames were customarily followed by frames showing a city scene, indicating the place of the news story. This prelude was followed by scenes depicting activities, such as leaders meeting each other (Fig. 4a left). Using the ResNet50 CNN embedding to extract visual features from the central frame of each shot allows us to examine visual similarities across reels. A UMAP projection of the embedded frames reveals aspects of these similarities, at least at a coarse-grained level (Fig. 4b). Consequently, the UMAP allows for visual examination, grouping, and annotation of the most prominent image types in the collection, such as "Nature", "Monumental gatherings", "People in meetings", "Closeups of people at work", "Industrial production", "Title frames and other texts", and "City views".

figure 4

a A storyboard of all newsreel issues: x-axis shot number, y-axis publication years and issue numbers in ascending order. The layout of all issues (a right) shows the temporal variation of issue lengths, and the closeup of the storyboard (a left) visualises the first scenes of issues 6–14 from 1970. b A UMAP projection of the ResNet50 embedding of the central frame of each shot, with the seven most prominent image clusters named by the authors as (1) "Nature"; (2) "Monumental gatherings"; (3) "People in meetings"; (4) "Closeups of people at work"; (5) "Industrial production"; (6) "Title frames and other texts"; and (7) "City views". We used the Collection Space Navigator (Ohm et al., 2023), a flexible open-source user interface, to examine the frames and to produce the figure.

City mentions

Our examination of the cities mentioned in the textual descriptions of the newsreel metadata is summarised in Fig. 5. Spatially, it demonstrates a heavy emphasis on Europe, both within the Soviet Union and globally, while the Asian part of the Soviet Union east of the Ural Mountains was covered far less, matching its lower population (Fig. 5a, b). Outside the Soviet Union, the Warsaw Pact socialist countries are the most frequently covered (36% of all mentions despite holding 3% of the world population in 1970), as are 'neutral' capitalist countries such as Austria and Finland (9% of all mentions despite less than 0.4% of the world population) (Fig. 5a, b, d). These findings match the general consensus among historians of Soviet history (Koivunen, 2016; Gilburd, 2013; Turoma and Waldstein, 2013). Timewise, the number of mentions per year trends downwards (Fig. 5c–e), which matches the general decrease in the number of stories per year (Fig. 3a) and is mostly due to newsreel issues typically having fewer and longer stories in the 1970s and 1980s than in the earlier period. It is, however, noteworthy that the number of mentions of foreign cities shrinks even faster (Fig. 5c, e), emphasising the decline in the fraction of stories dedicated to international events after around 1960. The temporal patterns for some cities show a variety of interesting qualitative behaviour (Fig. 5e). The constant popularity of Leningrad/St. Petersburg seems natural in view of its importance as the second-largest city in the USSR and the "cradle of the revolution"; the upward trend in the mentions of Minsk correlates with the rapid growth of its population in the period under consideration; and the bump in the popularity of Krasnoyarsk in the 1960s coincides with the building of the Krasnoyarsk Hydroelectric Dam, which was the topic of multiple newsreel stories. The decline in the mentions of Odesa requires further historical analysis. The data for individual cities is rather sparse and noisy, so extracting statistically significant information from it requires advanced statistical techniques and will be done in detail elsewhere.

figure 5

Map showing all the cities mentioned in 1944–1992, (a) globally and (b) in Europe; bubble size indicates the number of mentions. c Average number of mentions of the top-50 cities per 1000 stories; the red line shows Soviet cities and the blue line foreign cities. d Heatmap of city mentions per year for the top-50 most-mentioned cities (Moscow excluded due to heavy overrepresentation). e Heatmap of the top-50 most-mentioned cities (Moscow excluded) per 1000 stories in the periods 1954–1964, 1966–1976, and 1977–1992.

The analysis of newsreel production crews reveals the dynamics of the production labour market and the division of labour between genders over time. Newsreel production crew numbers (Fig. 6a) closely follow newsreel production volumes (Fig. 3a), with ten people working on a newsreel on average. Directors, who lead the productions, are expectedly vastly outnumbered by other crew, since newsreels contain multiple stories often shot by different cinematographers (on average nine crew members versus a single director per newsreel). The historical labour market features several prominent directors who led multiple teams (seen as high degree-centrality nodes in the director–crew network and in the node degree distribution in Fig. 6d, e) and pursued long-lasting careers (Fig. 6c). The analysis of director gender composition reveals three distinct periods: gender equality during 1945–1959, a women directors' era during 1960–1974, and a men directors' era during 1975–1992 (Fig. 6b).

figure 6

a Number of individuals working on newsreel production over time, coloured by role. b Number of individuals working as directors over time, coloured by assumed gender (women and men). c Director career longevity for the top-20 most productive directors. d Newsreel production crew network during 1954–1991, with edges drawn from directors to other crew members, coloured by role. e Degree distribution for the unipartite directed newsreel production crew network, with both axes on logarithmic scales.

Merging and explaining

As we have shown above, each dimension of analysis reveals new avenues for further qualitative and quantitative enquiry. In addition, analysing similar trends, interrelated themes, or temporal sequences that overarch the different dimensions of analysis, including combining them in statistical modelling, may help to explain the studied phenomenon better. In our case study, interested in the worldviews portrayed in the Soviet newsreels, bringing the results from the different dimensions of analysis together points to a period of emerging shifts. The most prominent temporal change, found across all dimensions of analysis, was the time of perestroika, which introduced major political and cultural changes in the Soviet Union (1985–1991). Although some of the identified changes during this period, such as the introduction of colour film (Fig. 4a), likely had little to do with the political changes, the dimensions of analysis show how profoundly the time of change affected different spheres of society. The number of yearly newsreel issues was cut in half, and the published issues contained far fewer news stories (1–3 stories per newsreel against the earlier 8–10, Fig. 3a). Simultaneously, the number of filmmakers producing newsreels decreased rapidly, following the shrinking newsreel production (Fig. 6a, b). It is possible that the collapsing Soviet economy and decentralising cultural policy, together with the prevalence of television, overran the outdated medium of newsreels in an era characterised by a gradual increase in freedom of speech and press (Rodgers, 2014). Digging deeper via qualitative inspection, we can see that in the 1990s the newsreel contents became focused on political meetings held in Moscow, which is visible in the emphasis on political and social topics (Fig. 3f) and in the geographical concentration on only a few cities (Fig. 5d). The newsreels of the perestroika era were characterised by long shots of speeches (Fig. 3c), as many newsreel issues at the time covered extensively the political discussions on the direction of the country, in a way providing the public with first-hand knowledge of who said what in those discussions. Clearly, the worldview that the News of the Day depicted to its audiences changed in many ways.

While many of these observations are not novel to studies of Soviet history, seeing a signal pointing to the particularity of this period in all the dimensions of analysis is important. It shows that the change of policy in the mid-1980s had far more profound effects than, for example, the change of leadership from Khrushchev to Brezhnev in 1964. Quantitative analysis of large amounts of data provides the necessary contextualisation emphasising the specificity of the period, which a solely qualitative study could not show in such a concrete manner. The signal, furthermore, becomes visible to a broader audience, beyond experts whose formation requires years of qualitative research. Additionally, harnessing the findings of the different yet complementary dimensions of analysis together reveals trends that may be interrelated. For example, the diminishing number of crew members can partially explain the decreasing number of issues and shots, and the concentration of the newsreels on only a few cities. With fewer people, it was impossible to cover a larger volume of news material from different places. Focusing on only one dimension of analysis in our case study would not have revealed this possible connection. Finally, all these findings can be enhanced by further qualitative enquiry referencing back to the historical dimensions of the data corpus used in this study and in preceding studies, as well as by statistical modelling focused on particular questions of interest.

Discussion and conclusions

In this paper, we proposed a framework for studying historical newsreels specifically, and audiovisual news more generally, in large quantities, while simultaneously maintaining an understanding of the multimodality and complexity of audiovisual data and the relational way of meaning-making associated with them. Analysing newsreels using long-term and large-scale data benefits our understanding of the societies in question, of the global information landscape and its geographical differences, and of the generic features of news content. As our case study on worldviews in the News of the Day newsreel series, produced weekly in the Soviet Union during 1944–1992, has demonstrated, combining different dimensions of quantitative analysis with qualitative enquiry helps to understand newsreel contents over a long continuum and in a more nuanced way than previously achieved. Quantitative visualisations driven by computational analysis methods help to contextualise smaller-scale qualitative analysis, while qualitative analysis helps to explain the detected long-term changes and their nuances. Acknowledging the complexity of the data, i.e. that new quality emerges from large quantities of data, allows for a better-rounded understanding of audiovisual culture. Necessitating a range of co-authors, our approach makes an argument for multidisciplinary research and advocates studying culture by combining different methods and approaches.

The outlined framework is a first attempt to combine the different disciplinary approaches into a comprehensive study of newsreels. Weaknesses in our proposition may of course become apparent when it is applied in a variety of studies, yet we argue that this too will necessitate similar multidisciplinary expertise, collaboration, and negotiation. The case study presented here provides a brief glimpse into the application of the framework. One limitation of our approach is that it selects the dimensions of analysis intuitively, albeit based on the expertise of the crowd of co-authors, and does not explore the selection of analysis methods in detail. This will be explored further in the future. In our case study, we have focused on preliminary exploratory enquiry rather than on confirmatory analysis or hypothesis testing. The different ways of comparing a variety of datasets coming from different sources have not been touched upon in this article and should be studied further to enhance transnational approaches to the study of newsreels. This article has proposed a methodological solution for studying audiovisual news, while the questions of copyright and access to comprehensive collections of audiovisual data and corresponding metadata continue to be major obstacles to the further development of this field (Arnold et al., 2021). A further potential hurdle in scaling the approach is the need for access to high-performance computing infrastructure for the effective processing of large-scale audiovisual data. In sum, however, with this framework we hope to open a discussion on how best to study audiovisual news in long-term and large-scale data.

Data availability

The data is available at the company's website ( https://www.net-film.ru/ ). The code used for accessing the data is available in the supplementary materials.

Althaus SL, Britzman K (2018) Researching the issued content of American newsreels. In: Vande Winkel R, Chambers C, Jönsson M (eds.) Researching Newsreels: local, national and transnational case studies. Palgrave Macmillan, Switzerland, p 247–263


Althaus SL, Usry K, Richards S, Van Thuyle B, Aron I, Huang L, Leetaru K et al. (2018) Global news broadcasting in the pre-television era: a cross-national comparative analysis of World War II Newsreel coverage. J Broadcast Electron Media 62(1):147–167. https://doi.org/10.1080/08838151.2017.1375500


Angelov D (2020) Top2Vec: distributed representations of topics, arXiv, https://doi.org/10.48550/arXiv.2008.09470

Arnold T, Scagliola S, Tilton L, Van Gorp J (2021) Introduction: special issue on audiovisual data in DH. Digit Hum Q 15(1). http://digitalhumanities.org/dhq/vol/15/1/000541/000541.html

Arnold T, Tilton L (2019) Distant viewing: analyzing large visual corpora. Digit Scholarsh Hum 34:i3–16. https://doi.org/10.1093/llc/fqz013

Arnold T, Tilton L (2023) Distant viewing: computational exploration of digital images. MIT Press, Cambridge, MA. https://doi.org/10.7551/mitpress/14046.001.0001

Bain M, Nagrani A, Schofield D, Berdugo S, Bessa J, Owen J, Hockings KJ, et al. (2021) Automated audiovisual behavior recognition in wild primates. Sci Adv 7(46). https://doi.org/10.1126/sciadv.abi4883

Bakels J-H, Grotkopp M, Scherer T, Stratil J (2020) Matching computational analysis and human experience: performative arts and the digital humanities. Digit Hum Q 14:4


Beals M, Bell E (2020) The atlas of digitised newspapers and metadata: reports from oceanic exchanges. https://doi.org/10.6084/m9.figshare.11560059.v1

Bergström Å, Jönsson M (2018) Screening war and peace: newsreel pragmatism in neutral Sweden, September 1939 and May 1945. In: Vande Winkel R, Chambers C, Jönsson M (eds.) Researching newsreels: local, national and transnational case studies. Palgrave Macmillan, Switzerland, p 157–182

Bhargav S, van Noord N, Kamps J (2019) Deep learning as a tool for early cinema analysis. Proceedings of the 1st workshop on structuring and understanding of multimedia heritage contents. SUMAC ’19. Association for Computing Machinery, New York, NY, USA, p 61–68. https://doi.org/10.1145/3347317.3357240


Bojanowski P, Grave E, Joulin A, Mikolov T (2017) Enriching word vectors with subword information. Trans Assoc Comput Linguist 5:135–146. https://doi.org/10.1162/tacl_a_00051

Bordwell D (1991) Making meaning: inference and rhetoric in the interpretation of cinema. Harvard University Press, Cambridge

Brasó G, Cetintas O, Leal-Taixé L (2022) Multi-object tracking and segmentation via neural message passing. Int J Comput Vis 130(12):3035–3053. https://doi.org/10.1007/s11263-022-01678-6

Brissman E, Johnander J, Danelljan M, Felsberg M (2022) Recurrent graph neural networks for video instance segmentation. Int J Comput Vis. https://doi.org/10.1007/s11263-022-01703-8

Broadwell P, Tangherlini TR (2021) Comparative K-Pop choreography analysis through deep-learning pose estimation across a large video corpus. Digit Hum Q 15:1

Burges J, Armoskaite S, Fox T, Mueller D, Romphf J, Sherwood E, Ullrich M (2021) Audiovisualities out of annotation: three case studies in teaching digital annotation with mediate. Digit Hum Q 15:1

Burghardt M, Heftberger A, Pause J, Walkowski N-O, Zeppelzauer M (2020) Film and video analysis in the digital humanities—an interdisciplinary dialog. Digit Hum Q 14:4

Carrive J, Beloued A, Goetschel P, Heiden S, Laurent A, Lisena P, Mazuet F, et al. (2021) Transdisciplinary analysis of a corpus of French newsreels: the ANTRACT project. Digit Hum Q 15 (1). http://digitalhumanities.org/dhq/vol/15/1/000523/000523.html

Cassirer E (1927) Das symbolproblem und seine stellung im system der philosophie. Z für ÄEsthet Allg Kunstwiss 21:295–322

Chambers C (2018) The Irish Question: newsreels and national identity. In: Vande Winkel R, Chambers C, Jönsson M (eds.) Researching newsreels: local, national and transnational case studies. Palgrave Macmillan, Switzerland, p 265–283

Chambers Ciara, Jönsson Mats, Vande Winkel Roel (eds.) (2018) Researching newsreels: local, national and transnational case studies. Palgrave Macmillan, Switzerland

Chapman, J, Glancy M, and Harper S (2007) The new film history: sources, methods, approaches . Springer

Chávez Heras D (2024) Cinema and machine vision: artificial intelligence, aesthetics and spectatorship. Edinburgh University Press, Edinburgh

Clariana-Rodagut A, Hagener M (2023) Transnational networks of avant-garde film in the interwar period. In: Roig-Sanz D, Rotger N (eds.) Global literary studies: key concepts. De Gruyter, Berlin, p 253–277

Cooper A, Nascimento F, Francis D (2021) Exploring film language with a digital analysis tool: the case of Kinolab. Digit Hum Q 15:1

CUDAN Open Lab Seminar series 2020–2024. https://www.youtube.com/@CUDANLab/videos)

Cutting JE, Brunick KL, DeLong JE, Iricinschi C, Candan A (2011) Quicker, faster, darker: changes in Hollywood film over 75 years. I-Percept 2(6):569–576. https://doi.org/10.1068/2Fi0441aap

Dang S-M (2020) Unknowable facts and digital databases: reflections on the women film pioneers project and women in film history. Digit Hum Q 14:4

Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In Proceedings of the IEEE conference on computer vision and pattern recognition , 248–255. https://doi.org/10.1109/CVPR.2009.5206848

Elo K (2020) Big Data, Bad Metadata: a methodological note on the importance of good metadata in the age of digital history. In: Fridlund M, Oiva M, Paju P (eds.) Digital histories. Emergent approaches within the new digital history. Helsinki University Press, Helsinki, pp 103–111 https://doi.org/10.33134/HUP-5-6

Fickers A (2021) Authenticity: historical data integrity and the layered materiality of digital objects. In: Fickers A, Schafer V, Takats S, Zaagsma G (eds.) Digital roots. Historicizing media and communication concepts of the digital age. De Gruyter Oldenbourg, Berlin, pp 299–312 https://doi.org/10.1515/9783110740202-017

Fickers A, Tatarinov J, Heijden T (2022) Digital history and hermeneutics—between theory and practice: an introduction. In: Fickers A, Tatarinov J (eds.) Digital history and hermeneutics: between theory and practice. De Gruyter Oldenbourg, Berlin, pp 1–19 https://doi.org/10.1515/9783110723991

Fielding R (2009) Newsreels. Encyclopedia of journalism, 3 , SAGE Publications, Thousand Oaks, pp 992–994

Gadamer H-G (2013 (1960)) Truth and method . Bloomsbury Academic

Garrett Cooper Mark, Levavy SaraBeth, Melnick Ross, Williams Mark (eds.) (2018) Rediscovering U.S. newsfilm: cinema, television, and the archive. Routledge, London

Gentles SJ, Charles C, Nicholas DB, Ploeg J, McKibbon KA (2016) Reviewing the research methods literature: principles and strategies illustrated by a systematic overview of sampling in qualitative research. Syst Rev 5(1):172. https://doi.org/10.1186/s13643-016-0343-0

Article   PubMed   PubMed Central   Google Scholar  

Gilburd E (2013) The Revival of Soviet Internationalism in the Mid to Late 1950s. In: Gilburd E, Kozlov D (eds.) The Thaw: Soviet Society and Culture during the 1950s and 1960s. University of Toronto Press, Toronto, p 362–401

Grootendorst M (2022) BERTopic: neural topic modeling with a class-based TF-IDF procedure. arXiv https://doi.org/10.48550/arXiv.2203.05794

Grósz T, Kallioniemi N, Kiiskinen H, Laine K, Moisio A, Römpötti T, Virkkunen A, Salmi H, Kurimo M, Laaksonen J (2022) Tracing signs of urbanity in the finnish fiction film of the 1950s: toward a multimodal analysis of audiovisual data. In: Proceedings of the 6th digital humanities in the nordic and Baltic countries conference (DHNB 2022), 3232: 63–78. https://ceur-ws.org/Vol-3232/paper05.pdf

Gulshad S, Long T, van Noord N (2023) Hierarchical explanations for video action recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, p 3703–3708. https://doi.org/10.48550/arXiv.2301.00436

Hanjalic A (2002) Shot-boundary detection: unraveled and resolved? IEEE Trans Circuits Syst Video Technol 12(2):90–105. https://doi.org/10.1109/76.988656

Hasegawa T, Kato S (2019) Dialogue mood estimation from speech sounds clusterized with speakers’ personality traits. In Proceedings of the IEEE 8th global conference on consumer electronics (GCCE) : 399–401. https://doi.org/10.1109/GCCE46687.2019.9015238

He K, Zhang X, Ren S, Sun J (2015) Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p. 770–778. https://doi.org/10.48550/arXiv.1512.03385

Heftberger A (2018) Digital humanities and film studies. Visualizing Dziga Vertov’s Work. Quantitative methods in the humanities and social sciences. Springer International Publishing, Switzerland, https://doi.org/10.1007/978-3-030-02864-0_4

Hickethier K (2016) The creation of cultural identity through weekly newsreels in Germany in the 1950s: as illustrated by the NEUE DEUTSCHE WOCHENSCHAU and the UFA-WOCHENSCHAU (With a Side Glance at the DEFA Weekly Newsreel DER AUGENZEUGE). In: Imesch K, Schade S, Sieber S (eds.) Constructions of cultural identities in newsreel cinema and television after 1945. Transcript Verlag, Bielefeld, p 39–54. https://doi.org/10.14361/9783839429754-003

Hielscher E (2020) The phenomenon of interwar city symphonies: a combined methodology of digital tools and traditional film analysis methods to study visual motifs and structural patterns of experimental-documentary city films. Digit Hum Q 14:4

Hoffmann K (2018) The commentary makes the difference: an analysis of the Suez War in East and West German Newsreels, 1956. In: Vande Winkel R, Chambers C, Jönsson M (eds.) Researching newsreels: local, national and transnational case studies. Palgrave Macmillan, Switzerland, p 77–92

Howell MC, Prevenier W (2001) From reliable sources: an introduction to historical methods. Cornell University Press, Ithaca

Hu X, Tang C, Chen H, Li X, Li J, Zhang Z (2022) Improving image segmentation with boundary patch refinement. Int J Comput Vis 130(11):2571–2589. https://doi.org/10.1007/s11263-022-01662-0

Imesch Kornelia, Schade Sigrid, Sieber Samuel (eds.) (2016) Constructions of cultural identities in newsreel cinema and television after 1945. Transcript Verlag, Bielefeld, 10.14361/9783839429754

Kandukuri RK, Achterhold J, Moeller M, Stueckler J (2022) Physical representation learning and parameter identification from video using differentiable physics. Int J Comput Vis 130(1):3–16. https://doi.org/10.1007/s11263-021-01493-5

Koivunen P (2016) Friends, ‘potential friends,’ and enemies: reimagining Soviet relations to the First, Second, and Third Worlds at the Moscow 1957 youth festival. In: Babiracki P, Jersild A (eds.) Socialist internationalism in the cold war. Springer International Publishing, New York, pp 219–47 https://doi.org/10.1007/978-3-319-32570-5_9

Kong Y, Fu Y (2022) Human action recognition and prediction: a survey. Int J Comput Vis 130(5):1366–1401. https://doi.org/10.1007/s11263-022-01594-9

Kordopatis-Zilos G, Tzelepis C, Papadopoulos S, Kompatsiaris I, Patras I (2022) DnS: distill-and-select for efficient and accurate video indexing and retrieval. Int J Comput Vis 130(10):2385–2407. https://doi.org/10.1007/s11263-022-01651-3

Lazer DMJ, Baum MA, Benkler Y, Berinsky AJ, Greenhill KM, Menczer F, Metzger MJ, Nyhan B, Pennycook G, Rothschild D (2018) The science of fake news. Science 359(6380):1094–1096

Article   ADS   CAS   PubMed   Google Scholar  

Levenshtein VI (1965) Binary codes capable of correcting spurious insertions and deletions of ones. Probl Inf Transm 1(1):8–17

Liu L, Ouyang W, Wang X, Fieguth P, Chen J, Liu X, Pietikäinen M (2020) Deep learning for generic object detection: a survey. Int J Comput Vis 128(2):261–318. https://doi.org/10.1007/s11263-019-01247-4

van der Maaten L, Hinton G (2008) Visualizing data using T-SNE. J Mach Learn Res 11:9

Maltby R, Biltereyst D, Meers P (eds.) (2011) Explorations in new cinema history: approaches and case studies . Wiley

Manovich L (2020) Cultural analytics. MIT Press, Cambridge, MA

Masson E, Olesen CG, Noord Nvan, Fossati G (2020) Exploring digitised moving image collections: the SEMIA project, visual analysis and the turn to abstraction. Digit Hum Q 14:4

McInnes L, Healy J, Melville J (2020) UMAP: uniform manifold approximation and projection for dimension reduction. ArXiv http://arxiv.org/abs/1802.03426

Morgan M (2008) Reality and media reality. The international encyclopedia of communication , John Wiley & Sons, New Jersey. https://doi.org/10.1002/9781405186407.wbiecr015

Mourchid Y, Renoust B, Roupin O, Văn L, Cherifi H, Hassouni ME (2019) Movienet: a movie multilayer network model using visual and textual semantic cues. Appl Netw Sci 4(1):121. https://doi.org/10.1007/s41109-019-0226-0

Navarro G (2001) A guided tour to approximate string matching. ACM Comput Surv 33(1):31–88. https://doi.org/10.1145/375360.375365

Navitski R, Poppe N (2017) Cosmopolitan film cultures in Latin America, 1896–1960. Indiana University Press, Indiana

Nazir S, Cagali T, Sadrzadeh M, Newell C (2020) Audiovisual, genre, neural and topical textual embeddings for TV programme content representation. In: Proceedings of the IEEE international symposium on multimedia (ISM) : p 197–200. https://doi.org/10.1109/ISM.2020.00041

van Noord N (2022) A survey of computational methods for iconic image analysis. Digit Scholarsh Hum 37(4):1316–1338. https://doi.org/10.1093/llc/fqac003

van Noord N, Olesen CG, Ordelman R, Noordegraaf J (2021) Automatic annotations and enrichments for audiovisual archives. ICAART 1:633–640

van Noord N, Wevers M, Blanke T, Noordegraaf J, and Worring M (2022) An analytics of culture: modeling subjectivity, scalability, contextuality, and temporality. arXiv https://doi.org/10.48550/arXiv.2211.07460

Oberbichler S, Boroş E, Doucet A, Marjanen J, Pfanzelter E, Rautiainen J, Toivonen H, Tolonen M (2022) Integrated interdisciplinary workflows for research on historical newspapers: perspectives from humanities scholars, computer scientists, and librarians. J Assoc Inf Sci Technol 73(2):225–239. https://doi.org/10.1002/asi.24565

Article   PubMed   Google Scholar  

Ohm T, Canet Solá M, Karjus A, and Schich M (2023) Collection space navigator: interactive visualization interface for multidimensional datasets. https://collection-space-navigator.github.io/

Olesen CG, Kisjes I (2018) From text mining to visual classification: rethinking computational new cinema history with Jean Desmet’s digitised business archive. TMG J Media Hist 21(2):127–145. https://doi.org/10.18146/2213-7653.2018.370

Olesen CG, Masson E, Van Gorp J, Fossati G, Noordegraaf J (2016) Data-driven research for film history: exploring the Jean Desmet collection. Mov Image: J Assoc Mov Image Arch 16(1):82–105. https://doi.org/10.5749/movingimage.16.1.0082

van Oort T, Jernudd Å, Lotze K, Pafort-Overduin C, Biltereyst D, Boter J, Dibeltulo S et al. (2020) Mapping film programming across post-war Europe (1952): arts and media. Res Data J Hum Soc Sci 5(2):109–125. https://doi.org/10.1163/24523666-00502009

OpenAI (2023) GPT-4 technical report. arXiv https://doi.org/10.48550/arXiv.2303.08774

Opoku A, Ahmed V, Akotia J (2016) Choosing an appropriate research methodology. In: Ahmed V, Opoku A, Aziz Z (eds.) Research methodology in the built environment: a selection of case studies. Routledge, London, p 32–50

Paju P, Oiva M, Fridlund M (2020) Digital and distant histories: introducing emergent approaches within the new digital history. In: Fridlund M, Oiva M, Paju P (eds.) Digital readings of history. history research in the digital era. Helsinki University Press, Helsinki, p 3–18. https://doi.org/10.33134/HUP-5

Park S, Bellur A, Han DK, Elhilali M (2021) Self-training for sound event detection in audio mixtures. In Proceedings of the ICASSP 2021–2021 IEEE international conference on acoustics, speech and signal processing (ICASSP) : 341–345. https://doi.org/10.1109/ICASSP39728.2021.9414450

Pérez AF, Sanguineti V, Morerio P, Murino V (2020) Audio-visual model distillation using acoustic images. In: Proceedings of the IEEE winter conference on applications of computer vision (WACV) : 2843–2852. https://doi.org/10.1109/WACV45572.2020.9093307

Pozdorovkin M (2012) Khronika: Soviet newsreel at the dawn of the information age . PhD thesis, Harvard University. https://dash.harvard.edu/handle/1/9823973

Pozharliev L, Gallo González D (2018) Martin Luther King’s assassination in Spain’s NO-DOs and in Bulgaria’s Kinopregledi. In: Vande Winkel R, Chambers C, Jönsson M (eds.) Researching newsreels: local, national and transnational case studies. Palgrave Macmillan, Switzerland, p 93–117

Pustu-Iren K, Sittel J, Mauer R, Bulgakowa O, Ewerth R (2020) Automated visual content analysis for film studies: current status and challenges. Digit Hum Q 14:4

Rasheed Z, Shah M (2003) Scene detection in Hollywood movies and TV shows. In: IEEE computer society conference on computer vision and pattern recognition, 2:II–343. https://doi.org/10.1109/CVPR.2003.1211489

Ravessoud C, Haver G (2016) Art exhibitions through newsreels: an avatar for identity politics (1945–1960). In: Imesch K, Schade S, Sieber S (eds.) Constructions of cultural identities in newsreel cinema and television after 1945. Transcript Verlag, Bielefeld, p 101–116. https://doi.org/10.14361/9783839429754-006

Ren H, Renoust B, Viaud M-L, Melançon G, and Satoh S (2018) Generating ‘Visual Clouds’ from multiplex networks for TV news archive query visualization. In: Proceedings of the International Conference on Content-Based Multimedia Indexing (CBMI) : 1–6. https://doi.org/10.1109/CBMI.2018.8516482

Rodgers J (2014) From Perestroika to Putin: journalism in Russia. In: Media Independence , Routledge, London, p 237–256

Roth-Ey K (2011) Moscow prime time: how the Soviet Union built the media empire that lost the cultural cold war. Cornell University Press, Ithaca

Salmi H (2020) What is digital history? Polity, Medford

Salt B (1974) Statistical style analysis of motion pictures. Film Q 28(1):13–22. https://doi.org/10.2307/1211438

Sampaio S (2022) From propaganda to attraction: reassessing the role of the public in the Newsreel Jornal Português (1938–1951). Hist J Film Radio Telev 42(3):470–492. https://doi.org/10.1080/01439685.2021.1976918

Sanguineti V, Morerio P, Del Bue A, Murino V (2022) Unsupervised synthetic acoustic image generation for audio-visual scene understanding. IEEE Trans Image Process 31:7102–7115. https://doi.org/10.1109/TIP.2022.3219228

Article   ADS   PubMed   Google Scholar  

Schich M (2017) The hermeneutic hypercycle. In: Brockman J (ed.) Know this: today’s most interesting and important scientific ideas. Harper Perennial, New York, p 561–563

Schich M (2019) Cultural analysis situs. ART-Dok eprint https://doi.org/10.11588/artdok.00006347

Senocak A, Oh T-H, Kim J, Yang M-H, Kweon IS (2018) Learning to localize sound source in visual scenes. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition : 4358–4366. https://doi.org/10.1109/CVPR.2018.00458

Snow CP (2001) The two cultures. Cambridge University Press, London (1959)

Sommer NM, Velipasalar S, Hirshfield L, Lu Y, Kakillioglu B (2020) Simultaneous and spatiotemporal detection of different levels of activity in multidimensional data. IEEE Access 8:118205–118218. https://doi.org/10.1109/ACCESS.2020.3005633

Studer L, Alberti M, Pondenkandath V, Goktepe P, Kolonko T, Fischer A, Liwicki M, and Ingold R (2019) A comprehensive study of imagenet pre-training for historical document image analysis. In: Proceedings of the international conference on document analysis and recognition (ICDAR) : 720–725. https://doi.org/10.1109/ICDAR.2019.00120

Treveri Gennari D, Sedgwick J (2022) Five Italian cities: comparative analysis of cinema types, film circulation and relative popularity in the mid-1950s. In: Sedgwick J (ed.) Towards a comparative economic history of cinema, 1930–1970. Springer International Publishing, Cham, p 249–279. https://doi.org/10.1007/978-3-031-05770-0_9

Tsivian Y (2009) Cinemetrics, part of the humanities’ cyberinfrastructure. In: Ross M, Grauer M, Freisleben B (eds.) Digital tools in media studies. analysis and research. An overview. Transcript, Bielefeld, p 93–100

Turoma Sanna, Waldstein Maxim (eds.) (2013) Empire de/centered: new spatial histories of Russia and the Soviet Union. Ashgate, Farnham

Tworek HJS (2019) News from Germany: the competition to control world communications, 1900–1945. News from Germany. Harvard University Press, Harvard, https://doi.org/10.4159/9780674240728

Valverde, FR, Valeria Hurtado J, and Valada A (2021) There is more than meets the eye: self-supervised multi-object detection and tracking with sound by distilling multimodal knowledge. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR) : 11607–11616. https://doi.org/10.1109/CVPR46437.2021.01144

Vande Winkel R, Biltereyst D (2018) Newsreel production, distribution and exhibition in Belgium, 1908-1994. In: Vande Winkel R, Chambers C, Jönsson M (eds.) Researching newsreels: local, national and transnational case studies. Palgrave Macmillan, Switzerland, p 231–246

Veldi M, Bell S, Kuhlmann F (2019) Five-year plan in four: Kolkhoz propaganda in film and documentaries in Estonia. SHS Web Conf 63:10002. https://doi.org/10.1051/shsconf/20196310002

Verhoeven D, Coate B, Zemaityte V (2019) Re-distributing gender in the global film industry: beyond #metoo and #methree. Media Ind J 6(1):135–155. https://doi.org/10.3998/mij.15031809.0006.108

Verhoeven D, Musial K, Palmer S, Taylor S, Abidi S, Zemaityte V, Simpson L (2020) Controlling for openness in the male-dominated collaborative networks of the global film industry. PLOS One 15(6):e0234460. https://doi.org/10.1371/journal.pone.0234460

Article   CAS   PubMed   PubMed Central   Google Scholar  

Wan S, Xu X, Wang T, Gu Z (2021) An intelligent video analysis method for abnormal event detection in intelligent transportation systems. IEEE Trans Intell Transport Syst 22(7):4487–4495. https://doi.org/10.1109/TITS.2020.3017505

Wang D, Zhang S (2022) Unsupervised person re-identification via multi-label classification. Int J Comput Vis 130(12):2924–2939. https://doi.org/10.1007/s11263-022-01680-y

Werenskjold R (2018) Around the world: the first Norwegian newsreel, 1930–1941. In: Vande Winkel R, Chambers C, Jönsson M (eds.) Researching newsreels: local, national and transnational case studies. Palgrave Macmillan, Switzerland, p 51–75

Wevers M (2021) Scene detection in De Boer historical photo collection. In: Proceedings of the 13th international conference on agents and artificial intelligence (ICAART 2021) , 1:601–610. https://doi.org/10.5220/0010288206010610

Wevers M, Smits T (2020) The visual digital turn: using neural networks to study historical images. Digit Scholarsh Hum 35(1):194–207. https://doi.org/10.1093/llc/fqy085

Williams M, Bell J (2021) The Media Ecology Project: Collaborative DH synergies to produce new research in visual culture history. Digit Hum Q 15:1

Winston B (2018) Wofull News from Wales: details at 11. news, newsreels, bulletins and documentaries. In: Vande Winkel R, Chambers C, Jönsson M (eds.) Researching newsreels: local, national and transnational case studies. Palgrave Macmillan, Switzerland, p 15–33

Yang K, Russell B, Salamon J (2020) Telling left from right: learning spatial correspondence of sight and sound. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR) : 9929–9938. https://doi.org/10.1109/CVPR42600.2020.00995

Ye K, Kovashka A (2022) Weakly-Supervised Action Detection Guided By Audio Narration. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (CVPRW) : 1527–1537. https://doi.org/10.1109/CVPRW56347.2022.00159

Zeiler M (1999) Modeling our world: the ESRI guide to geodatabase design . ESRI Press

Zhang Q, Chan AB (2022) 3D crowd counting via geometric attention-guided multi-view fusion. Int J Comput Vis 130(12):3123–3139. https://doi.org/10.1007/s11263-022-01685-7

Zhao B, Gong M, Li X (2021) AudioVisual video summarization. IEEE Transactions on Neural Networks and Learning Systems : 1–8. https://doi.org/10.1109/TNNLS.2021.3119969

Download references

Acknowledgements

European Union Horizon2020 research and innovation programme ERA Chair project for Cultural Data Analytics CUDAN (Project no. 810961); National programme of the Ministry of Education and Research of the Republic of Estonia for April 2022–February 2023 (Project no. EKKD77); Estonian Research Council Public Value of Open Cultural Data (Project no. PRG1641); European Union Horizon Europe research and innovation programme CresCine—Increasing the International Competitiveness of the Film Industry in Small European Markets (Project no. 101094988).

Author information

Authors and Affiliations

Tallinn University, Tallinn, Estonia

Mila Oiva, Ksenia Mukhina, Vejune Zemaityte, Andres Karjus, Mikhail Tamm, Tillmann Ohm, Mark Mets, Mar Canet Sola & Maximilian Schich

Estonian Business School, Tallinn, Estonia

Andres Karjus

King’s College London, London, UK

Daniel Chávez Heras

University of Tartu, Tartu, Estonia

Helena Hanna Juht

Contributions

MO, KM, VZ, TO, MT, MM, AK, DCH, MCS, and MS designed the research; KM and MO acquired the data; MO, KM, VZ, MT, TO, and HHJ prepared the data; MO, KM, VZ, AK, MT, TO, MM, and MS performed the analysis; and MO, VZ, and MS wrote the manuscript. MO is the first author. MS is the supervising author.

Corresponding author

Correspondence to Mila Oiva .

Ethics declarations

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Informed consent

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary material

Supplementary Table 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article

Oiva, M., Mukhina, K., Zemaityte, V. et al. A framework for the analysis of historical newsreels. Humanit Soc Sci Commun 11 , 530 (2024). https://doi.org/10.1057/s41599-024-02886-w

Download citation

Received : 14 July 2023

Accepted : 27 February 2024

Published : 25 April 2024

DOI : https://doi.org/10.1057/s41599-024-02886-w


content analysis in historical research

IMAGES

  1. PPT

    content analysis in historical research

  2. (PDF) Capturing the historical research methodology: an experimental

    content analysis in historical research

  3. How to write a historical document analysis

    content analysis in historical research

  4. What it is Content Analysis and How Can you Use it in Research

    content analysis in historical research

  5. FREE 7+ Historical Research Samples & Templates in PDF

    content analysis in historical research

  6. Historical Research

    content analysis in historical research

VIDEO

  1. The United States’ Long History of destabilizing Nations

  2. Haunting of Hill House: Grief and Hereditary Mental Illness

  3. Us: Classism, Racism, and Access to Care

  4. Infinity Pool: Lifestyles of the Rich and the Famous

  5. The Impact of Grief: Character Studies of the Crain Children in the Haunting of Hill House

  6. Haunting of Bly Manor: Why Humans Want to be Remembered and How

COMMENTS

  1. Historical Research

    Historical research is the process of investigating and studying past events, people, and societies using a variety of sources and methods. ... Content analysis: This involves analyzing the content of media from the past, such as films, television programs, and advertisements, to gain insights into cultural attitudes and beliefs.

  2. Content Analysis

    Content analysis is a research method used to identify patterns in recorded communication. To conduct content analysis, you systematically collect data from a set of texts, ... Amy has a master's degree in History of Art and has been working as a freelance writer and editor since 2014. She is passionate about helping people communicate ...

  3. Content Analysis

    Content analysis is a research method used to identify patterns in recorded communication. To conduct content analysis, you systematically collect data from a set of texts, which can be written, oral, or visual: Books, newspapers and magazines. Speeches and interviews. Web content and social media posts. Photographs and films.

  4. Chapter 17. Content Analysis

    Chapter 17. Content Analysis Introduction. Content analysis is a term that is used to mean both a method of data collection and a method of data analysis. Archival and historical works can be the source of content analysis, but so too can the contemporary media coverage of a story, blogs, comment posts, films, cartoons, advertisements, brand packaging, and photographs posted on Instagram or ...

  5. Guide: Using Content Analysis

    Using Content Analysis. This guide provides an introduction to content analysis, a research methodology that examines words or phrases within a wide range of texts. Introduction to Content Analysis: Read about the history and uses of content analysis. Conceptual Analysis: Read an overview of conceptual analysis and its associated methodology.

  6. Content Analysis

    In his 1952 text on the subject of content analysis, Bernard Berelson traced the origins of the method to communication research and then listed what he called six distinguishing features of the approach. As one might expect, the six defining features reflect the concerns of social science as taught in the 1950s, an age in which the calls for an "objective," "systematic," and ...

  7. How to plan and perform a qualitative study using content analysis

    Abstract. This paper describes the research process - from planning to presentation, with the emphasis on credibility throughout the whole process - when the methodology of qualitative content analysis is chosen in a qualitative study. The groundwork for the credibility initiates when the planning of the study begins.

  8. How to use content analysis in historical research

    This paper illustrates the use of a content analysis in historical research. The purpose of a content analysis study is to illustrate the ways in which an individual organization participates in the processes of social change. Recommended Citation. Neimark, Marilyn (1983) ...

  9. Introduction

    Abstract. This chapter offers an inclusive definition of content analysis. This helps in clarifying some key terms and concepts. Three approaches to content analysis are introduced and defined briefly: basic content analysis, interpretive content analysis, and qualitative content analysis. Long-standing differences between quantitative and ...

  10. Content Analysis

    This is a collection of fifty-two published articles that cover the history of the process, discuss methodology, and provide important examples of content analysis studies that cover a number of social science fields, media (textual and visual), and approaches. ... Using quantitative content analysis in research. 3d ed. New York: Routledge.

  11. PDF History

    bols, historical documents, anthropological data, and psychotherapeutic exchanges; computer text analysis and the new media; and qualitative chal-lenges to content analysis. 1.1 Some Precursors Content analysis entails a systematic reading of a body of texts, images, and sym-bolic matter, not necessary from an author's or user's perspective.

  12. Qualitative Content Analysis

    It is a flexible research method ( Anastas, 1999 ). Qualitative content analysis may use either newly collected data, existing texts and materials, or a combination of both. It may be used in exploratory, descriptive, comparative, or explanatory research designs, though its primary use is descriptive.

  13. Content Analysis

    Step 1—Define the research questions. Ground the focus of the analysis of content in the literature/theory informing the data collection efforts. Consider the purpose of both the research in general as well as the information required to proceed with the next steps of the research. Step 2—Define the population.

  14. Content Analysis Method and Examples

    Content analysis is a research tool used to determine the presence of certain words, themes, or concepts within some given qualitative data (i.e. text). ... essays, discussions, newspaper headlines, speeches, media, historical documents). A single study may analyze various forms of text in its analysis. To analyze the text using content ...

  15. Content analysis

    History. Content analysis is research using the categorization and classification of speech, written text, interviews, images, or other forms of communication. In its beginnings, using the first newspapers at the end of the 19th century, analysis was done manually by measuring the number of columns given a subject. ...

  16. Historical research and content analysis: Relevance and possibilities

    The possibilities and adequation of the methodology of "content analysis" for the historical research are discussed. For such, theorical-methodological aspects of History are approached; the ...

  17. Content Analysis, Historical Research, and Mixed Methods

    Standing as pivotal qualitative methods, this chapter discusses qualitative content analysis and historical methodologies. Within each, readers are provided a history, overview of key features, and logic associated with their respective method. In addition, this chapter examines the role, use, and importance of mixed methods research.

  18. A hands-on guide to doing content analysis

    Keywords: Qualitative research, Qualitative data analysis, Content analysis. ... Content analysis, as in all qualitative analysis, is a reflective process. There is no "step 1, 2, 3, done!" linear progression in the analysis. This means that identifying and condensing meaning units, coding, and categorising are not one-time events. ...

  19. Content Analysis

    Using a distinctive and somewhat novel style of content analysis that calls upon the notion of semantic networks, the chapter shows how the method can be used either ...

  20. Reflexive Content Analysis: An Approach to Qualitative Data Analysis

    Content analysis, initially a quantitative technique for identifying patterns in qualitative data, has evolved into a widely used qualitative method. ... This history has led to a maze of competing, ... (2004). Qualitative content analysis in nursing research: Concepts, procedures and measures to achieve trustworthiness. Nurse Education Today ...

  21. What is Content Analysis? Uses, Types & Advantages

    Content analysis is a research method used to identify the presence of various concepts, words, and themes in different texts. Two types of content analysis exist: conceptual analysis and relational analysis. In the former, researchers determine whether and how frequently certain concepts appear in a text. In relational analysis, researchers ...
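    The conceptual analysis described above reduces to counting how often predefined concepts appear in a text. A minimal sketch of that counting step in Python (the example speech, the concept list, and the function name are illustrative, not from any of the sources above):

    ```python
    import re
    from collections import Counter

    def conceptual_analysis(text, concepts):
        """Count how often each concept word appears in the text."""
        tokens = re.findall(r"[a-z']+", text.lower())  # crude tokeniser
        counts = Counter(tokens)
        return {concept: counts[concept.lower()] for concept in concepts}

    speech = "We must act. Change is coming, and change demands courage."
    print(conceptual_analysis(speech, ["change", "courage", "fear"]))
    # {'change': 2, 'courage': 1, 'fear': 0}
    ```

    Relational analysis would go one step further, examining how the counted concepts co-occur (e.g., within the same sentence) rather than treating each frequency in isolation.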

  22. Content Analysis

    Content analysis is a research method used to analyze and interpret the characteristics of various forms of communication, such as text, images, or audio. It involves systematically analyzing the content of these materials, identifying patterns, themes, and other relevant features, and drawing inferences or conclusions based on the findings.

  23. Content Analysis

    This book provides an inclusive and carefully differentiated examination of contemporary content analysis research purposes and methods. Chapter 1 examines the conceptual base and history of content analysis. The next three chapters examine in depth each approach as a single approach to content analysis, using brief, illustrative exemplar studies.

  24. History and Definitions of Content Analysis

    1.1. History of Content Analysis Technique. We have already mentioned that Content Analysis is a natural, spontaneous process that we all use when we underline ideas in a text and try to organize them. But the history of Content Analysis as a scientific method, and therefore one subject to controlled and systematic procedures, goes back to the times ...

  25. A framework for the analysis of historical newsreels

    Despite promising developments in film history, computational video analysis, and other relevant fields, current research streams have limitations related to the scope of data used, the ...

  26. The Past, Present, and Future of Research on Mathematical Giftedness: A

    There is a well-established historical background of research on mathematical giftedness that can be traced back to the early 1900s. An overarching purpose of this research was to review and explore the existing research and its evolution since its emergence in educational and psychological studies.