Grad Coach

Qualitative Data Coding 101

How to code qualitative data, the smart way (with examples).

By: Jenna Crosley (PhD) | Reviewed by: Dr Eunice Rautenbach | December 2020

As we’ve discussed previously, qualitative research makes use of non-numerical data – for example, words, phrases or even images and video. To analyse this kind of data, the first dragon you’ll need to slay is qualitative data coding (or just “coding” if you want to sound cool). But what exactly is coding and how do you do it?

Overview: Qualitative Data Coding

In this post, we’ll explain qualitative data coding in simple terms. Specifically, we’ll dig into:

  • What exactly qualitative data coding is
  • What different types of coding exist
  • How to code qualitative data (the process)
  • Moving from coding to qualitative analysis
  • Tips and tricks for quality data coding

Qualitative Data Coding: The Basics

What is qualitative data coding?

Let’s start by understanding what a code is. At the simplest level, a code is a label that describes the content of a piece of text. For example, in the sentence:

“Pigeons attacked me and stole my sandwich.”

You could use “pigeons” as a code. This code simply describes that the sentence involves pigeons.

So, building on this, qualitative data coding is the process of creating and assigning codes to categorise data extracts. You’ll then use these codes later down the road to derive themes and patterns for your qualitative analysis (for example, thematic analysis). Coding and analysis can take place simultaneously, but it’s important to note that coding does not necessarily involve identifying themes (depending on which textbook you’re reading, of course). Instead, it generally refers to the process of labelling and grouping similar types of data to make generating themes and analysing the data more manageable.
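To make the idea concrete, here is a minimal Python sketch of a coded extract: a piece of text with a set of labels attached. The `CodedExtract` class and the second code (“theft”) are illustrative assumptions, not part of any particular coding tool.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class CodedExtract:
    """A data extract plus the codes (labels) assigned to it."""
    text: str
    codes: Set[str] = field(default_factory=set)


extract = CodedExtract("Pigeons attacked me and stole my sandwich.")
extract.codes.add("pigeons")  # the code from the example above
extract.codes.add("theft")    # hypothetical second code, for illustration only

print(extract.text, "->", sorted(extract.codes))
```

Keeping the codes in a set means the same label can’t be attached to an extract twice, while one extract can still carry several codes.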

Makes sense? Great. But why should you bother with coding at all? Why not just look for themes from the outset? Well, coding is a way of making sure your data is valid. In other words, it helps ensure that your analysis is undertaken systematically and that other researchers can review it (in the world of research, we call this transparency). Simply put, good coding is the foundation of high-quality analysis.

What are the different types of coding?

Now that we’ve got a plain-language definition of coding on the table, the next step is to understand what types of coding exist. Let’s start with the two main approaches, deductive and inductive coding.

Deductive coding 101

With deductive coding, you make use of pre-established codes, which are developed before you interact with the data at hand. This usually involves drawing up a set of codes based on a research question or previous research. You could also use a code set from the codebook of a previous study.

For example, if you were studying the eating habits of college students, you might have a research question along the lines of:

“What foods do college students eat the most?”

As a result of this research question, you might develop a code set that includes codes such as “sushi”, “pizza”, and “burgers”.  

Deductive coding allows you to approach your analysis with a very tightly focused lens and quickly identify relevant data. Of course, the downside is that you could miss out on some very valuable insights as a result of this tight, predetermined focus.
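If your data sits in a spreadsheet or text file, a deductive pass can be roughed out in a few lines of Python. The sketch below assumes that a simple keyword match is good enough to flag a code, which is only a crude stand-in for a researcher’s judgement. The two sample responses and the function name `apply_deductive_codes` are made up for illustration.

```python
# Code set drawn up before looking at the data (from the example above).
CODE_SET = {"sushi", "pizza", "burgers"}


def apply_deductive_codes(response, code_set):
    """Return every pre-established code whose label appears in the response."""
    text = response.lower()
    return {code for code in code_set if code in text}


# Hypothetical survey responses, invented for this sketch.
responses = [
    "I mostly eat pizza, and sushi when I can afford it.",
    "Honestly, instant noodles every night.",
]

for response in responses:
    print(sorted(apply_deductive_codes(response, CODE_SET)), "<-", response)
```

Notice that the second response ends up with no codes at all. That is the downside of the tight, predetermined focus in action: anything outside the code set goes unnoticed unless you look for it.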

Inductive coding 101 

But what about inductive coding? This type of coding involves jumping right into the data and then developing the codes based on what you find within the data.

For example, if you were to analyse a set of open-ended interviews, you wouldn’t necessarily know which direction the conversation would flow. If a conversation begins with a discussion of cats, it may go on to include other animals too, and so you’d add these codes as you progress with your analysis. Simply put, with inductive coding, you “go with the flow” of the data.

Inductive coding is great when you’re researching something that isn’t yet well understood because the coding derived from the data helps you explore the subject. Therefore, this type of coding is usually used when researchers want to investigate new ideas or concepts, or when they want to create new theories.

A little bit of both… hybrid coding approaches

If you’ve got a set of codes you’ve derived from a research topic, literature review or a previous study (i.e. a deductive approach), but you still don’t have a rich enough set to capture the depth of your qualitative data, you can combine deductive and inductive methods – this is called a hybrid coding approach.

To adopt a hybrid approach, you’ll begin your analysis with a set of a priori codes (deductive) and then add new codes (inductive) as you work your way through the data. Essentially, the hybrid coding approach provides the best of both worlds, which is why it’s pretty common to see this in research.
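As a rough sketch of how a hybrid codebook might be tracked, the Python below starts with a priori codes and records any code that wasn’t there at the outset as inductive. The starting codes, the `assign_code` helper and the order in which codes come up are all assumptions made for illustration.

```python
# A priori (deductive) codes, decided before reading the data.
codebook = {"cats": "deductive", "dogs": "deductive"}


def assign_code(codebook, code):
    """Record a code and return its origin.

    Codes already in the codebook keep their origin; anything new is
    added and marked as inductive.
    """
    return codebook.setdefault(code, "inductive")


# Codes in the order a researcher might reach for them while reading.
for code in ["cats", "alpacas", "dogs", "bunnies"]:
    origin = assign_code(codebook, code)
    print(f"{code}: {origin}")
```

Tagging each code with its origin makes it easy to report later which parts of your code set came from prior work and which emerged from the data.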

How to code qualitative data

Now that we’ve looked at the main approaches to coding, the next question you’re probably asking is “how do I actually do it?”. Let’s take a look at the coding process, step by step.

Both inductive and deductive methods of coding typically occur in two stages: initial coding and line by line coding.

In the initial coding stage, the objective is to get a general overview of the data by reading through and understanding it. If you’re using an inductive approach, this is also where you’ll develop an initial set of codes. Then, in the second stage (line by line coding), you’ll delve deeper into the data and (re)organise it according to (potentially new) codes. 

Step 1 – Initial coding

The first step of the coding process is to identify the essence of the text and code it accordingly. While there are various qualitative analysis software packages available, you can just as easily code textual data using Microsoft Word’s “comments” feature.

Let’s take a look at a practical example of coding. Assume you had the following interview data from two interviewees:

What pets do you have?

I have an alpaca and three dogs.

Only one alpaca? They can die of loneliness if they don’t have a friend.

I didn’t know that! I’ll just have to get five more. 

I have twenty-three bunnies. I initially only had two, I’m not sure what happened. 

In the initial stage of coding, you could assign the code of “pets” or “animals”. These are just initial, fairly broad codes that you can (and will) develop and refine later. In the initial stage, broad, rough codes are fine – they’re just a starting point which you will build onto in the second stage.

How to decide which codes to use

But how exactly do you decide what codes to use when there are many ways to read and interpret any given sentence? Well, there are a few different approaches you can adopt. The main approaches to initial coding include:

  • In vivo coding
  • Process coding
  • Open coding
  • Descriptive coding
  • Structural coding
  • Values coding

Let’s take a look at each of these:

In vivo coding

When you use in vivo coding, you make use of a participant’s own words, rather than your interpretation of the data. In other words, you use direct quotes from participants as your codes. By doing this, you’ll avoid trying to infer meaning, instead staying as close to the original phrases and words as possible.

In vivo coding is particularly useful when your data are derived from participants who speak different languages or come from different cultures. In these cases, it’s often difficult to accurately infer meaning due to linguistic or cultural differences. 

For example, English speakers typically view the future as in front of them and the past as behind them. However, this isn’t the same in all cultures. Speakers of Aymara view the past as in front of them and the future as behind them. Why? Because the future is unknown, so it must be out of sight (or behind us). They know what happened in the past, so their perspective is that it’s positioned in front of them, where they can “see” it. 

In a scenario like this one, it’s not possible to derive the reason for viewing the past as in front and the future as behind without knowing the Aymara culture’s perception of time. Therefore, in vivo coding is particularly useful, as it avoids interpretation errors.

Process coding

Next up, there’s process coding, which makes use of action-based codes. Action-based codes are codes that indicate a movement or procedure. These actions are often indicated by gerunds (words ending in “-ing”) – for example, running, jumping or singing.

Process coding is useful as it allows you to code parts of data that aren’t necessarily spoken, but that are still imperative to understanding the meaning of the texts. 

An example here would be if a participant were to say something like, “I have no idea where she is”. A sentence like this can be interpreted in many different ways depending on the context and movements of the participant. The participant could shrug their shoulders, which would indicate that they genuinely don’t know where the girl is; however, they could also wink, showing that they do actually know where the girl is. 

Simply put, process coding is useful as it allows you to concisely identify the main occurrences in a set of data and provide a dynamic account of events. For example, you may have action codes such as “describing a panda”, “singing a song about bananas”, or “arguing with a relative”.

Descriptive coding

Descriptive coding aims to summarise extracts by using a single word or noun that encapsulates the general idea of the data. These words will typically describe the data in a highly condensed manner, which allows the researcher to quickly refer to the content.

Descriptive coding is very useful when dealing with data that appear in forms other than traditional text – i.e. video clips, sound recordings or images. For example, a descriptive code could be “food” when coding a video clip that involves a group of people discussing what they ate throughout the day, or “cooking” when coding an image showing the steps of a recipe. 

Structural coding

Structural coding involves labelling and describing specific structural attributes of the data. Generally, it includes coding according to answers to the questions of “who”, “what”, “where”, and “how”, rather than the actual topics expressed in the data. This type of coding is useful when you want to access segments of data quickly, and it can help tremendously when you’re dealing with large data sets.

For example, if you were coding a collection of theses or dissertations (which would be quite a large data set), structural coding could be useful as you could code according to different sections within each of these documents – i.e. according to the standard dissertation structure. What-centric labels such as “hypothesis”, “literature review”, and “methodology” would help you to efficiently refer to sections and navigate without having to work through sections of data all over again.

Structural coding is also useful for data from open-ended surveys. These data may initially be difficult to code as they lack the set structure of other forms of data (such as an interview with a strict set of questions to be answered). In this case, it would be useful to code sections of data that answer certain questions such as “who?”, “what?”, “where?” and “how?”.

Let’s take a look at a practical example. If we were to send out a survey asking people about their dogs, we may end up with a (highly condensed) response such as the following: 

Bella is my best friend. When I’m at home I like to sit on the floor with her and roll her ball across the carpet for her to fetch and bring back to me. I love my dog.

In this set, we could code “Bella” as “who”, “dog” as “what”, “home” and “floor” as “where”, and “roll her ball” as “how”.
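If you wanted to store this structural coding digitally, one simple option (a sketch only, using a plain Python dictionary rather than any specific analysis package) is to map each structural question to the segments that answer it. The small check at the end is an added convenience, not something the coding method itself requires.

```python
response = (
    "Bella is my best friend. When I'm at home I like to sit on the floor "
    "with her and roll her ball across the carpet for her to fetch and "
    "bring back to me. I love my dog."
)

# Structural codes from the example above: question -> coded segments.
structural_codes = {
    "who": ["Bella"],
    "what": ["dog"],
    "where": ["home", "floor"],
    "how": ["roll her ball"],
}

# Sanity check: every coded segment should appear verbatim in the response.
for code, segments in structural_codes.items():
    for segment in segments:
        assert segment in response, f"{segment!r} not found in response"

print(structural_codes["where"])
```

With the data arranged this way, pulling out every “where” across hundreds of survey responses becomes a quick lookup instead of a re-read.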

Values coding

Finally, values coding involves coding that relates to the participant’s worldviews. Typically, this type of coding focuses on excerpts that reflect the values, attitudes, and beliefs of the participants. Values coding is therefore very useful for research exploring cultural values and intrapersonal experiences and actions.

To recap, the aim of initial coding is to understand and familiarise yourself with your data, to develop an initial code set (if you’re taking an inductive approach) and to take the first shot at coding your data. The coding approaches above allow you to arrange your data so that it’s easier to navigate during the next stage, line by line coding (we’ll get to this soon).

While these approaches can all be used individually, it’s important to remember that it’s possible, and potentially beneficial, to combine them. For example, when conducting initial coding with interviews, you could begin by using structural coding to indicate who speaks when. Then, as a next step, you could apply descriptive coding so that you can navigate to, and between, conversation topics easily.

Step 2 – Line by line coding

Once you’ve got an overall idea of your data, are comfortable navigating it and have applied some initial codes, you can move on to line by line coding. Line by line coding is pretty much exactly what it sounds like – reviewing your data, line by line, digging deeper and assigning additional codes to each line.

With line-by-line coding, the objective is to pay close attention to your data to add detail to your codes. For example, if you have a discussion of beverages and you previously just coded this as “beverages”, you could now go deeper and code more specifically, such as “coffee”, “tea”, and “orange juice”. The aim here is to scratch below the surface. This is the time to get detailed and specific so as to capture as much richness from the data as possible.
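Here’s a small Python sketch of that two-stage idea: each line keeps its broad initial code and gains a more specific one on the second pass. The three sample lines and the `refinements` mapping are invented for illustration; in practice the finer codes come from your own close reading, not from a lookup table.

```python
# Stage 1: every line carries the broad initial code.
lines = [
    ("I can't start the day without coffee.", {"beverages"}),
    ("My flatmate only drinks tea.", {"beverages"}),
    ("We share orange juice at breakfast.", {"beverages"}),
]

# Stage 2: finer codes chosen on a second, line-by-line read
# (keyed by line position; hypothetical researcher decisions).
refinements = {0: "coffee", 1: "tea", 2: "orange juice"}

# Add the specific code while keeping the broad one.
refined = [
    (text, codes | {refinements[i]})
    for i, (text, codes) in enumerate(lines)
]

for text, codes in refined:
    print(sorted(codes), "<-", text)
```

Keeping the broad code alongside the specific one means you can still retrieve everything about beverages in one go, while also being able to zoom in on coffee alone.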

In the line-by-line coding process, it’s useful to code everything in your data, even if you don’t think you’re going to use it (you may just end up needing it!). As you go through this process, your coding will become more thorough and detailed, and you’ll have a much better understanding of your data as a result of this, which will be incredibly valuable in the analysis phase.

Moving from coding to analysis

Once you’ve completed your initial coding and line by line coding, the next step is to start your analysis. Of course, the coding process itself will get you in “analysis mode” and you’ll probably already have some insights and ideas as a result of it, so you should always keep notes of your thoughts as you work through the coding.

When it comes to qualitative data analysis, there are many different types of analyses (we discuss some of the most popular ones here) and the type of analysis you adopt will depend heavily on your research aims, objectives and questions. Therefore, we’re not going to go down that rabbit hole here, but we’ll cover the important first steps that build the bridge from qualitative data coding to qualitative analysis.

When starting to think about your analysis, it’s useful to ask yourself the following questions to get the wheels turning:

  • What actions are shown in the data? 
  • What are the aims of these interactions and excerpts? What are the participants potentially trying to achieve?
  • How do participants interpret what is happening, and how do they speak about it? What does their language reveal?
  • What are the assumptions made by the participants? 
  • What are the participants doing? What is going on? 
  • Why do I want to learn about this? What am I trying to find out? 
  • Why did I include this particular excerpt? What does it represent and how?

Code categorisation

Categorisation is simply the process of reviewing everything you’ve coded and then creating code categories that can be used to guide your future analysis. In other words, it’s about creating categories for your code set. Let’s take a look at a practical example.

If you were discussing different types of animals, your initial codes may be “dogs”, “llamas”, and “lions”. In the process of categorisation, you could label (categorise) these three animals as “mammals”, whereas you could categorise “flies”, “crickets”, and “beetles” as “insects”. By creating these code categories, you will be making your data more organised, as well as enriching it so that you can see new connections between different groups of codes. 
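The animal example above can be sketched in Python as a mapping from codes to categories, followed by a grouping step. The `group_by_category` function is a hypothetical helper written for this illustration.

```python
from collections import defaultdict

# Each code is assigned to one category (from the example above).
code_to_category = {
    "dogs": "mammals",
    "llamas": "mammals",
    "lions": "mammals",
    "flies": "insects",
    "crickets": "insects",
    "beetles": "insects",
}


def group_by_category(codes):
    """Group a list of codes under their categories, preserving code order."""
    grouped = defaultdict(list)
    for code in codes:
        grouped[code_to_category[code]].append(code)
    return dict(grouped)


categories = group_by_category(
    ["dogs", "flies", "llamas", "beetles", "lions", "crickets"]
)
print(categories)
```

This version assumes each code belongs to exactly one category. If a code could reasonably sit in two, you would map it to a list of categories instead.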

Theme identification

From the coding and categorisation processes, you’ll naturally start noticing themes. Therefore, the logical next step is to identify and clearly articulate the themes in your data set. When you determine themes, you’ll take what you’ve learned from the coding and categorisation and group it all together to develop themes. This is the part of the coding process where you’ll try to draw meaning from your data, and start to produce a narrative. The nature of this narrative depends on your research aims and objectives, as well as your research questions (sounds familiar?) and the qualitative data analysis method you’ve chosen, so keep these factors front of mind as you scan for themes.

Tips & tricks for quality coding

Before we wrap up, let’s quickly look at some general advice, tips and suggestions to ensure your qualitative data coding is top-notch.

  • Before you begin coding, plan out the steps you will take and the coding approach and technique(s) you will follow to avoid inconsistencies.
  • When adopting deductive coding, it’s useful to use a codebook from the start of the coding process. This will keep your work organised and will ensure that you don’t forget any of your codes.
  • Whether you’re adopting an inductive or deductive approach, keep track of the meanings of your codes and remember to revisit these as you go along.
  • Avoid using synonyms for codes that are similar, if not the same. This will allow you to have a more uniform and accurate coded dataset and will also help you to not get overwhelmed by your data.
  • While coding, make sure that you remind yourself of your aims and coding method. This will help you to avoid directional drift, which happens when coding is not kept consistent.
  • If you are working in a team, make sure that everyone has been trained and understands how codes need to be assigned.
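Two of these tips (keeping a codebook with meanings, and avoiding synonyms) lend themselves to a small Python sketch. The codes, definitions, synonym list and the `normalise` helper below are all assumptions made for illustration; your own codebook will look different.

```python
# Codebook: each code with its agreed meaning (hypothetical entries).
codebook = {
    "beverages": "Any mention of something the participant drinks.",
    "pets": "Animals kept at home by the participant.",
}

# Labels that should be folded into an existing code rather than
# becoming codes of their own (hypothetical).
synonyms = {"drinks": "beverages", "animals": "pets"}


def normalise(code):
    """Map a label to its canonical code, refusing anything undefined."""
    code = code.strip().lower()
    code = synonyms.get(code, code)
    if code not in codebook:
        raise KeyError(f"{code!r} is not in the codebook; define it first")
    return code


print(normalise(" Drinks "))  # folded into the canonical code
print(normalise("pets"))      # already canonical
```

Raising an error for undefined codes is a deliberate choice here: it forces you (or a teammate) to write down a meaning before a new code creeps into the dataset.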

Coding Qualitative Data: How to Code Qualitative Research

Authored by Alyona Medelyan, PhD – Natural Language Processing & Machine Learning

How many hours have you spent sitting in front of Excel spreadsheets trying to find new insights from customer feedback?

You know that asking open-ended survey questions gives you more actionable insights than asking your customers for just a numerical Net Promoter Score (NPS). But when you ask open-ended, free-text questions, you end up with hundreds (or even thousands) of free-text responses.

How can you turn all of that text into quantifiable, applicable information about your customers’ needs and expectations? By coding qualitative data.

Keep reading to learn:

  • What coding qualitative data means (and why it’s important)
  • Different methods of coding qualitative data
  • How to manually code qualitative data to find significant themes in your data

What is coding in qualitative research?

Coding is the process of labeling and organizing your qualitative data to identify different themes and the relationships between them.

When coding customer feedback, you assign labels to words or phrases that represent important (and recurring) themes in each response. These labels can be words, phrases, or numbers; we recommend using words or short phrases, since they’re easier to remember, skim, and organize.

Coding qualitative research to find common themes and concepts is part of thematic analysis. Thematic analysis extracts themes from text by analyzing the word and sentence structure.

Within the context of customer feedback, it's important to understand the many different types of qualitative feedback a business can collect, such as open-ended surveys, social media comments, reviews & more.

What is qualitative data analysis?

Qualitative data analysis is the process of examining and interpreting qualitative data to understand what it represents.

Qualitative data is defined as any non-numerical and unstructured data; when looking at customer feedback, qualitative data usually refers to any verbatim or text-based feedback such as reviews, open-ended responses in surveys, complaints, chat messages, customer interviews, case notes or social media posts.

For example, the NPS metric can be strictly quantitative, but when you ask customers why they gave you a particular score, you will need qualitative data analysis methods in place to understand the comments that customers leave alongside numerical responses.

Methods of qualitative data analysis

  • Content analysis: This refers to the categorization, tagging and thematic analysis of qualitative data. This can include combining the results of the analysis with behavioural data for deeper insights.
  • Narrative analysis: Some qualitative data, such as interviews or field notes, may contain a story: for example, the process of choosing a product, using it, evaluating its quality and deciding whether or not to buy it next time. Narrative analysis helps you understand the underlying events and their effect on the overall outcome.
  • Discourse analysis: This refers to the analysis of what people say in its social and cultural context. It’s particularly useful when your focus is on building or strengthening a brand.
  • Framework analysis: When performing qualitative data analysis, it is useful to have a framework. A code frame (a hierarchical set of themes used in coding qualitative data) is an example of such a framework.
  • Grounded theory: This method of analysis starts by formulating a theory around a single data case. Therefore the theory is “grounded” in actual data. Then additional cases can be examined to see if they are relevant and can add to the original theory.

Automatic coding software

Advances in natural language processing & machine learning have made it possible to automate the analysis of qualitative data, in particular content and framework analysis.

While manual human analysis is still popular due to its perceived high accuracy, automating the analysis is quickly becoming the preferred choice. Manual analysis is prone to bias and doesn’t scale to the amount of qualitative data that is generated today. Automated analysis, by contrast, is more consistent (and can therefore be more accurate), and it can also save a ton of time, and therefore money.

The most commonly used software for automated coding of qualitative data is text analytics software such as Thematic.

Why is it important to code qualitative data?

Coding qualitative data makes it easier to interpret customer feedback. Assigning codes to words and phrases in each response helps capture what the response is about which, in turn, helps you better analyze and summarize the results of the entire survey.

Researchers use coding and other qualitative data analysis processes to help them make data-driven decisions based on customer feedback. When you use coding to analyze your customer feedback, you can quantify the common themes in customer language. This makes it easier to accurately interpret and analyze customer satisfaction.
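To show what quantifying themes can look like once responses are coded, here is a minimal Python sketch using the standard library’s `collections.Counter`. The four coded responses are hypothetical, invented purely to demonstrate the counting step.

```python
from collections import Counter

# Each survey response has already been assigned a set of codes
# (hypothetical data for illustration).
coded_responses = [
    {"wait time", "staff friendliness"},
    {"wait time"},
    {"pricing"},
    {"wait time", "pricing"},
]

# Count how many responses mention each code.
theme_counts = Counter(code for codes in coded_responses for code in codes)

for theme, n in theme_counts.most_common():
    share = n / len(coded_responses)
    print(f"{theme}: {n} responses ({share:.0%})")
```

Because each response stores its codes as a set, a code is counted at most once per response, so the percentages read as “share of responses mentioning this theme”.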

Automated vs. Manual coding of qualitative data

Methods of coding qualitative data fall into two categories: automated coding and manual coding.

You can automate the coding of your qualitative data with thematic analysis software. Thematic analysis and qualitative data analysis software use machine learning, artificial intelligence (AI), and natural language processing (NLP) to code your qualitative data and break text up into themes.

Thematic analysis software is autonomous, which means…

  • You don’t need to set up themes or categories in advance.
  • You don’t need to train the algorithm — it learns on its own.
  • You can easily capture the “unknown unknowns” to identify themes you may not have spotted on your own.

…all of which will save you time (and lots of unnecessary headaches) when analyzing your customer feedback.

Businesses are also seeing the benefit of using thematic analysis software that can act as a single data source, helping to break down data silos and unify data across an organization. This is now being referred to as Unified Data Analytics.

What is thematic coding?

Thematic coding, also called thematic analysis, is a type of qualitative data analysis that finds themes in text by analyzing the meaning of words and sentence structure.

When you use thematic coding to analyze customer feedback, for example, you can learn which themes are most frequent in feedback. This helps you understand what drives customer satisfaction in an accurate, actionable way.

To learn more about how thematic analysis software helps you automate the data coding process, check out this article.

How to manually code qualitative data

For the rest of this post, we’ll focus on manual coding. Different researchers have different processes, but manual coding usually looks something like this:

  • Choose whether you’ll use deductive or inductive coding.
  • Read through your data to get a sense of what it looks like. Assign your first set of codes.
  • Go through your data line-by-line to code as much as possible. Your codes should become more detailed at this step.
  • Categorize your codes and figure out how they fit into your coding frame.
  • Identify which themes come up the most — and act on them.

Let’s break it down a little further…

Deductive coding vs. inductive coding

Before you start qualitative data coding, you need to decide which codes you’ll use.

What is Deductive Coding?

Deductive coding means you start with a predefined set of codes, then assign those codes to the new qualitative data. These codes might come from previous research, or you might already know what themes you’re interested in analyzing. Deductive coding is also called concept-driven coding.

For example, let’s say you’re conducting a survey on customer experience. You want to understand the problems that arise from long call wait times, so you choose to make “wait time” one of your codes before you start looking at the data.

The deductive approach can save time and help guarantee that your areas of interest are coded. But you also need to be careful of bias; when you start with predefined codes, you have a bias as to what the answers will be. Make sure you don’t miss other important themes by focusing too hard on proving your own hypothesis.

What is Inductive Coding?

Inductive coding, also called open coding, starts from scratch and creates codes based on the qualitative data itself. You don’t have a set codebook; all codes arise directly from the survey responses.

Here’s how inductive coding works:

  1. Break your qualitative dataset into smaller samples.
  2. Read a sample of the data.
  3. Create codes that will cover the sample.
  4. Reread the sample and apply the codes.
  5. Read a new sample of data, applying the codes you created for the first sample.
  6. Note where codes don’t match or where you need additional codes.
  7. Create new codes based on the second sample.
  8. Go back and recode all responses again.
  9. Repeat from step 5 until you’ve coded all of your data.

If you add a new code, split an existing code into two, or change the description of a code, make sure to review how this change will affect the coding of all responses. Otherwise, the same responses at different points in the survey could end up with different codes.
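One way to keep track of which responses need another look after a codebook change is to version the codebook. The sketch below is an assumption-laden illustration: the `Codebook` and `CodedResponse` classes and the `needs_review` helper are hypothetical names, not part of any tool mentioned in this article.

```python
from dataclasses import dataclass, field


@dataclass
class CodedResponse:
    text: str
    codes: set = field(default_factory=set)
    codebook_version: int = 0  # codebook version in force when this was coded


class Codebook:
    """Hypothetical codebook that bumps a version number on every change."""

    def __init__(self):
        self.descriptions = {}
        self.version = 0

    def add_or_update(self, code, description):
        """Add a new code or change the description of an existing one."""
        self.descriptions[code] = description
        self.version += 1

    def split(self, old_code, new_codes):
        """Replace one code with several narrower ones (label -> description)."""
        self.descriptions.pop(old_code)
        self.descriptions.update(new_codes)
        self.version += 1


def needs_review(responses, codebook):
    """Responses that were coded before the latest codebook change."""
    return [r for r in responses if r.codebook_version < codebook.version]


book = Codebook()
book.add_or_update("price", "Any mention of cost")
r1 = CodedResponse("Too expensive", {"price"}, book.version)

book.split("price", {
    "price too high": "Cost is seen as excessive",
    "unclear pricing": "Cost is hard to find or understand",
})
r2 = CodedResponse("I couldn't find the price list", {"unclear pricing"}, book.version)

stale = needs_review([r1, r2], book)  # only r1 predates the split
```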

Sounds like a lot of work, right? Inductive coding is an iterative process, which means it takes longer and is more thorough than deductive coding. But it also gives you a more complete, less biased look at the themes throughout your data.

Categorize your codes with coding frames

Once you create your codes, you need to put them into a coding frame. A coding frame represents the organizational structure of the themes in your research. There are two types of coding frames: flat and hierarchical.

Flat Coding Frame

A flat coding frame assigns the same level of specificity and importance to each code. While this might feel like an easier and faster method for manual coding, it can be difficult to organize and navigate the themes and concepts as you create more and more codes. It also makes it hard to figure out which themes are most important, which can slow down decision making.

Hierarchical Coding Frame

Hierarchical frames help you organize codes based on how they relate to one another. For example, you can organize the codes based on your customers’ feelings on a certain topic:

Hierarchical Coding Frame example

In this example:

  • The top-level code describes the topic (customer service)
  • The mid-level code specifies whether the sentiment is positive or negative
  • The third level details the attribute or specific theme associated with the topic

Hierarchical framing supports a larger code frame and lets you organize codes based on organizational structure. It also allows for different levels of granularity in your coding.
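To illustrate the different levels of granularity, here is a small sketch that stores each applied code as a path of (topic, sentiment, attribute) and counts codes at any depth. The attribute names and the `count_at_level` helper are invented for illustration.

```python
from collections import Counter

# Each applied code is a path: (topic, sentiment, attribute).
# The attributes below are made-up examples.
applied_codes = [
    ("customer service", "negative", "long wait time"),
    ("customer service", "negative", "rude staff"),
    ("customer service", "positive", "quick resolution"),
    ("customer service", "negative", "long wait time"),
]


def count_at_level(paths, depth):
    """Count codes after truncating each path to its first `depth` levels."""
    return Counter(path[:depth] for path in paths)


by_topic = count_at_level(applied_codes, 1)      # coarsest view
by_sentiment = count_at_level(applied_codes, 2)  # topic + sentiment
by_attribute = count_at_level(applied_codes, 3)  # full detail
```

Because every code carries its full path, the same coded data can be summarized at the topic level for a quick overview or at the attribute level for detail, without recoding anything.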

Whether your code frames are hierarchical or flat, your code frames should be flexible. Manually analyzing survey data takes a lot of time and effort; make sure you can use your results in different contexts.

For example, if your survey asks customers about customer service, you might only use codes that capture answers about customer service. Then you realize that the same survey responses have a lot of comments about your company’s products. To learn more about what people say about your products, you may have to code all of the responses from scratch! A flexible coding frame covers different topics and insights, which lets you reuse the results later on.

Tips for coding qualitative data

Now that you know the basics of coding your qualitative data, here are some tips on making the most of your qualitative research.

Use a codebook to keep track of your codes

As you code more and more data, it can be hard to remember all of your codes off the top of your head. Tracking your codes in a codebook helps keep you organized throughout the data analysis process. Your codebook can be as simple as an Excel spreadsheet or word processor document. As you code new data, add new codes to your codebook and reorganize categories and themes as needed.

Make sure to track:

  • The label used for each code
  • A description of the concept or theme the code refers to
  • Who originally coded it
  • The date that it was originally coded or updated
  • Any notes on how the code relates to other codes in your analysis
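If you keep your codebook in a spreadsheet, the fields listed above map naturally onto columns. The following is a minimal sketch using Python's standard `csv` module; the `CodebookEntry` class, the field names, and the sample entry are assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from datetime import date


@dataclass
class CodebookEntry:
    label: str
    description: str
    coded_by: str
    updated_on: str  # ISO date string
    notes: str = ""


entries = [
    CodebookEntry(
        label="wait time",
        description="Complaints about time spent waiting for support",
        coded_by="coder_a",
        updated_on=date(2024, 1, 15).isoformat(),
        notes="Child of 'customer service'",
    ),
]


def to_csv(rows):
    """Serialize codebook entries to CSV text that any spreadsheet tool can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(CodebookEntry)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


csv_text = to_csv(entries)
```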

How to create high-quality codes - 4 tips

1. Cover as many survey responses as possible.

The code should be generic enough to apply to multiple comments, but specific enough to be useful in your analysis. For example, “Product” is a broad code that will cover a variety of responses — but it’s also pretty vague. What about the product? On the other hand, “Product stops working after using it for 3 hours” is very specific and probably won’t apply to many responses. “Poor product quality” or “short product lifespan” might be a happy medium.

2. Avoid commonalities.

Having similar codes is okay as long as they serve different purposes. “Customer service” and “Product” are different enough from one another, while “Customer service” and “Customer support” may have subtle differences but should likely be combined into one code.

3. Capture the positive and the negative.

Try to create codes that contrast with each other to track both the positive and negative elements of a topic separately. For example, “Useful product features” and “Unnecessary product features” would be two different codes to capture two different themes.

4. Reduce data — to a point.

Let’s look at the two extremes: There are as many codes as there are responses, or each code applies to every single response. In both cases, the coding exercise is pointless; you don’t learn anything new about your data or your customers. To make your analysis as useful as possible, try to find a balance between having too many and too few codes.
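A quick way to spot both extremes is to count how many responses each code applies to. The sketch below flags codes used on exactly one response and codes used on every response; those two cut-offs, the sample data, and the `code_usage_report` helper are assumptions you would tune for your own project.

```python
from collections import Counter


def code_usage_report(coded_responses):
    """Flag codes that apply to only one response or to every response."""
    total = len(coded_responses)
    # set(codes) ensures a code is counted once per response.
    usage = Counter(code for codes in coded_responses for code in set(codes))
    return {
        "too_narrow": sorted(c for c, n in usage.items() if n == 1),
        "too_broad": sorted(c for c, n in usage.items() if n == total),
    }


coded_responses = [
    ["product", "short product lifespan"],
    ["product", "short product lifespan"],
    ["product", "stops working after 3 hours"],
    ["product", "poor product quality", "short product lifespan"],
]
report = code_usage_report(coded_responses)
```

Here "product" lands in `too_broad` and the two single-use codes land in `too_narrow`, which mirrors the vague and overly specific examples from tip 1.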

Group responses based on themes, not wording

Make sure to group responses with the same themes under the same code, even if they don’t use the same exact wording. For example, a code such as “cleanliness” could cover responses including words and phrases like:

  • Looked like a dump
  • Could eat off the floor

Having only a few codes and hierarchical framing makes it easier to group different words and phrases under one code. If you have too many codes, especially in a flat frame, your results can become ambiguous and themes can overlap. Manual coding also requires the coder to remember or be able to find all of the relevant codes; the more codes you have, the harder it is to find the ones you need, no matter how organized your codebook is.

Make accuracy a priority

Manually coding qualitative data means that the coder’s cognitive biases can influence the coding process. For each study, make sure you have coding guidelines and training in place to keep coding reliable, consistent, and accurate.

One thing to watch out for is definitional drift, which occurs when the data at the beginning of the data set is coded differently than the material coded later. Check for definitional drift across the entire dataset and keep notes with descriptions of how the codes vary across the results.
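As a rough heuristic (not a method prescribed here), you can compare how often each code is used in the first half of the dataset versus the second half. A large shift does not prove drift, since the responses themselves may differ, but it tells you where to look. The 0.2 threshold and the `drift_candidates` helper are arbitrary assumptions.

```python
from collections import Counter


def drift_candidates(codes_in_order, threshold=0.2):
    """Codes whose share of use shifts by more than `threshold` between halves."""
    midpoint = len(codes_in_order) // 2
    early, late = codes_in_order[:midpoint], codes_in_order[midpoint:]
    if not early or not late:
        return {}
    early_counts, late_counts = Counter(early), Counter(late)
    flagged = {}
    for code in set(codes_in_order):
        # Positive values mean the code is used more in the second half.
        difference = late_counts[code] / len(late) - early_counts[code] / len(early)
        if abs(difference) > threshold:
            flagged[code] = round(difference, 2)
    return flagged


# Codes in the order the responses were coded (made-up data).
codes_in_order = [
    "customer service", "customer service", "product", "customer service",
    "customer support", "product", "customer support", "customer support",
]
flagged = drift_candidates(codes_in_order)
```

In this made-up data the coder appears to switch from "customer service" to "customer support" halfway through, which is exactly the kind of pattern worth checking against the code descriptions.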

If you have multiple coders working on one team, have them check one another’s coding to help eliminate cognitive biases.
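One common way to quantify how well two coders agree is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. This article does not name a specific statistic, so treat the sketch below as one option among several; it assumes each coder assigned exactly one code per response, and the sample data is invented.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per response."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of responses")
    n = len(coder_a)
    # Share of responses where both coders chose the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


coder_a = ["price", "price", "service", "service", "quality", "price"]
coder_b = ["price", "service", "service", "service", "quality", "price"]
kappa = cohens_kappa(coder_a, coder_b)
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. If your coders can apply several codes to one response, you would need a different measure or a per-code comparison.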

Conclusion: 6 main takeaways for coding qualitative data

Here are 6 final takeaways for manually coding your qualitative data:

  • Coding is the process of labeling and organizing your qualitative data to identify themes. After you code your qualitative data, you can analyze it just like numerical data.
  • Inductive coding (without a predefined code frame) is more difficult, but less prone to bias, than deductive coding.
  • Code frames can be flat (easier and faster to use) or hierarchical (more powerful and organized).
  • Your code frames need to be flexible enough that you can make the most of your results and use them in different contexts.
  • When creating codes, make sure they cover several responses, contrast one another, and strike a balance between too much and too little information.
  • Consistent coding = accuracy. Establish coding procedures and guidelines and keep an eye out for definitional drift in your qualitative data analysis.

Happy coding!

CEO and Co-Founder

Alyona has a PhD in NLP and Machine Learning. Her peer-reviewed articles have been cited by over 2600 academics. Her love of writing comes from years of PhD research.

University Library, University of Illinois at Urbana-Champaign

Qualitative Data Analysis: Coding

Coding Qualitative Data

Planning your coding strategy.

Coding is a qualitative data analysis strategy in which some aspect of the data is assigned a descriptive label that allows the researcher to identify related content across the data. How you decide to code your data, or whether to code it at all, should be driven by your methodology. But there are rarely step-by-step descriptions, and you'll have to make many decisions about how to code for your own project.

Some questions to consider as you decide how to code your data:

What will you code? 

What aspects of your data will you code? If you are not coding all of your available data, how will you decide which elements need to be coded? If you have recordings of interviews or focus groups, or other types of multimedia data, will you create transcripts to analyze and code? Or will you code the media itself? (See Farley, Duppong Hurley, & Aitken, 2020, on direct coding of audio recordings rather than transcripts.)

Where will your codes come from? 

Depending on your methodology, your coding scheme may come from previous research and be applied to your data (deductive). Or you may try to develop codes entirely from the data, ignoring previous knowledge of the topic under study as much as possible, to develop a scheme grounded in your data (inductive). In practice, however, many projects will fall between these two approaches.

How will you apply your codes to your data? 

You may decide to use software to code your qualitative data, to re-purpose other software tools (e.g. Word or spreadsheet software) or work primarily with physical versions of your data. Qualitative software is not strictly necessary, though it does offer some advantages, like: 

  • Codes can be easily re-labeled, merged, or split. You can also choose to apply multiple coding schemes to the same data, which means you can explore multiple ways of understanding the same data. Your analysis, then, is not limited by how often you are able to work with physical data, such as paper transcripts. 
  • Most software programs for QDA include the ability to export and import coding schemes. This means you can create or re-use a coding scheme from a previous study, or one that was developed outside of the software, without having to manually create each code.
  • Some software for QDA includes the ability to directly code image, video, and audio files. This may mean saving time over creating transcripts. Or, your coding may be enhanced by access to the richness of mediated content, compared to transcripts.
  • Using QDA software may also allow you the ability to use auto-coding functions. You may be able to automatically code all of the statements by speaker in a focus group transcript, for example, or identify and code all of the paragraphs that include a specific phrase. 
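To make the idea of auto-coding concrete, here is a small Python sketch of the two examples just mentioned: coding statements by speaker and coding paragraphs that contain a phrase. It is not how any particular QDA package works internally. It assumes a transcript in which each turn starts with an upper-case speaker name and a colon, and both helper functions are hypothetical.

```python
import re

transcript = """\
MODERATOR: What did you think of the new app?
ANNA: I liked it, but the login screen kept freezing.
BEN: Same here. The login screen is a mess.

MODERATOR: Anything you enjoyed?
ANNA: The dark mode is great."""

# Assumed format: each turn starts with an upper-case speaker name and a colon.
SPEAKER_TURN = re.compile(r"^([A-Z]+):\s*(.+)$")


def code_by_speaker(text):
    """Group each speaker's statements under a code named after the speaker."""
    coded = {}
    for line in text.splitlines():
        match = SPEAKER_TURN.match(line)
        if match:
            speaker, statement = match.groups()
            coded.setdefault(speaker, []).append(statement)
    return coded


def code_paragraphs_with_phrase(text, phrase):
    """Return the blank-line-separated paragraphs that contain the phrase."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return [p for p in paragraphs if phrase.lower() in p.lower()]


by_speaker = code_by_speaker(transcript)
login_paragraphs = code_paragraphs_with_phrase(transcript, "login screen")
```

Auto-coding of this kind depends heavily on consistent formatting, which is why the software documentation refers to "properly prepared" focus group documents.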

What will be coded? 

Will you deploy a line-by-line coding approach, with smaller codes eventually condensed into larger categories or concepts? Or will you start with codes applied to larger segments of the text, perhaps later reviewing the examples to explore and re-code for differences between the segments? 

How will you explain the coding process? 

  • Regardless of how you approach coding, the process should be clearly communicated when you report your research, though this is not always the case (Deterding & Waters, 2021).
  • Carefully consider the use of phrases like "themes emerged." This phrasing implies that the themes lay passively in the data, waiting for the researcher to pluck them out. This description leaves little room for describing how the researcher "saw" the themes and decided which were relevant to the study. Ryan and Bernard (2003) offer a terrific guide to ways that you might identify themes in the data, using both your own observations as well as manipulations of the data. 

How will you report the results of your coding process? 

How you report your coding process should align with the methodology you've chosen. Your methodology may call for careful and consistent application of a coding scheme, with reports of inter-rater reliability and counts of how often a code appears within the data. Or you may use the codes to help develop a rich description of an experience, without needing to indicate precisely how often the code was applied. 

How will you code collaboratively?

If you are working with another researcher or a team, your coding process requires careful planning and implementation. You will likely need to have regular conversations about your process, particularly if your goal is to develop and consistently apply a coding scheme across your data. 

Coding Features in QDA Software Programs

  • Atlas.ti (Mac)
  • Atlas.ti (Windows)
  • NVivo (Windows)
  • NVivo (Mac)
  • Coding data See how to create and manage codes and apply codes to segments of the data (known as quotations in Atlas.ti).

  • Search and Code Using the search and code feature lets you locate and automatically code data through text search, regular expressions, Named Entity Recognition, and Sentiment Analysis.
  • Focus Group Coding Properly prepared focus group documents can be automatically coded by speaker.
  • Inter-Coder Agreement Coded text, audio, and video documents can be tested for inter-coder agreement. ICA is not available for images or PDF documents.
  • Quotation Reader Once you've coded data, you can view just the data that has been assigned that code.

  • Find Redundant Codings (Mac) This tool identifies "overlapping or embedded" quotations that have the same code, that are the result of manual coding or errors when merging project files.
  • Coding Data in Atlas.ti (Windows) Demonstrates how to create new codes, manage codes, and apply codes to segments of the data (known as quotations in Atlas.ti).
  • Search and Code in Atlas.ti (Windows) You can use a text search, regular expressions, Named Entity Recognition, and Sentiment Analysis to identify and automatically code data in Atlas.ti.
  • Focus Group Coding in Atlas.ti (Windows) Properly prepared focus group transcripts can be automatically coded by speaker.
  • Inter-coder Agreement in Atlas.ti (Windows) Coded text, audio, and video documents can be tested for inter-coder agreement. ICA is not available for images or PDF documents.
  • Quotation Reader in Atlas.ti (Windows) Once you've coded data, you can view and export the quotations that have been assigned that code.
  • Find Redundant Codings in Atlas.ti (Windows) This tool identifies "overlapping or embedded" quotations that have the same code, that are the result of manual coding or errors when merging project files.
  • Coding in NVivo (Windows) This page includes an overview of the coding features in NVivo.
  • Automatic Coding in Documents in NVivo (Windows) You can use paragraph formatting styles or speaker names to automatically code documents.
  • Coding Comparison Query in NVivo (Windows) You can use the coding comparison feature to compare how different users have coded data in NVivo.
  • Review the References in a Node in NVivo (Windows) References is the term that NVivo uses for coded segments of the data. This page shows you how to view references related to a code (or any node).
  • Text Search Queries in NVivo (Windows) Text queries let you search for specific text in your data. The results of your query can be saved as a node (a form of auto coding).
  • Coding Query in NVivo (Windows) Use a coding query to display references from your data for a single code or multiple codes.
  • Code Files and Manage Codes in NVivo (Mac) This page offers an overview of coding features in NVivo. Note that NVivo uses the concept of a node to refer to any structure around which you organize your data. Codes are a type of node, but you may see these terms used interchangeably.
  • Automatic Coding in Datasets in NVivo (Mac) A dataset in NVivo is data that is in rows and columns, as in a spreadsheet. If a column is set to be codable, you can also automatically code the data. This approach could be used for coding open-ended survey data.
  • Text Search Query in NVivo (Mac) Use the text search query to identify relevant text in your data and automatically code references by saving as a node.
  • Review the References in a Node in NVivo (Mac) NVivo uses the term references to refer to data that has been assigned to a code or any node. You can use the reference view to see the data linked to a specific node or combination of nodes.
  • Coding Comparison Query in NVivo (Mac) Use the coding comparison query to calculate a measure of inter-rater reliability when you've worked with multiple coders.

The MAXQDA interface is the same across Mac and Windows devices. 

  • The "Code System" in MAXQDA This section of the manual shows how to create and manage codes in MAXQDA's code system.
  • How to Code with MAXQDA

  • Display Coded Segments in the Document Browser Once you've coded a document within MAXQDA, you can choose which of those codings will appear on the document, as well as choose whether or not the text is highlighted in the color linked to the code.
  • Creative Coding in MAXQDA Use the creative coding feature to explore the relationships between codes in your system. If you develop a new structure for your codes that you like, you can apply the changes to your overall code scheme.
  • Text Search in MAXQDA Use a Text Search to identify data that matches your search terms and automatically code the results. You can choose whether to code only the matching results, the sentence the results are in, or the paragraph the results appear in.
  • Segment Retrieval in MAXQDA Data that has been coded is considered a segment. Segment retrieval is how you display the segments that match a code or combination of codes. You can use the activation feature to show only the segments from a document group, or that match a document variable.
  • Intercoder Agreement in MAXQDA MAXQDA includes the ability to compare coding between two coders on a single project.
  • Create Tags in Taguette Taguette uses the term tag to refer to codes. You can create single tags as well as a tag hierarchy using punctuation marks.
  • Highlighting in Taguette Select text within a document (a highlight) and apply tags to code data in Taguette.

Useful Resources on Coding

Deterding, N. M., & Waters, M. C. (2021). Flexible coding of in-depth interviews: A twenty-first-century approach. Sociological Methods & Research, 50(2), 708–739. https://doi.org/10.1177/0049124118799377

Farley, J., Duppong Hurley, K., & Aitken, A. A. (2020). Monitoring implementation in program evaluation with direct audio coding. Evaluation and Program Planning, 83, 101854. https://doi.org/10.1016/j.evalprogplan.2020.101854

Ryan, G. W., & Bernard, H. R. (2003). Techniques to identify themes. Field Methods, 15(1), 85–109. https://doi.org/10.1177/1525822X02239569

  • Last Updated: Apr 5, 2024 2:23 PM
  • URL: https://guides.library.illinois.edu/qualitative

Qualitative Data Analysis

21 Qualitative Coding

Mikaila Mariel Lemonik Arthur

Codes are words or phrases that capture a central or notable attribute of a particular segment of text or visual data (Saldaña 2016). Coding, then, is the process of applying codes to texts or visuals. It is one of the most common strategies for data reduction and analysis of qualitative data, though many qualitative projects do not require or use coding. This chapter will provide an overview of approaches based in coding, including how to develop codes and how to go through the coding process.

In order to understand coding, it is essential to think about what it means for something to be a code. To analogize to social media, codes might function a bit like tags or hashtags. They are words or phrases that convey content, ideas, perspectives, or other key elements of segments of text. Codes are not the same as themes. Themes are broader than codes: they are concepts or topics around which a discussion, analysis, or text focuses. Themes are more general and more explanatory; often, once we code, we find themes emerge as ideas to explore in our further analysis (Saldaña 2016). Codes are also different from descriptors. Descriptors are words or phrases that describe characteristics of the entire text and/or the person who created it. For example, if we note the profession of an interview respondent, whether an article is news or opinion, or the type of camera used to take a photograph, those would be descriptors. Saldaña (2016) instead calls these attributes. The term attributes more typically refers to the possible answer choices or options for a variable, so it is possible to think about descriptors as variables (or perhaps their attributes) as well.

Three boxes, one headlined codes, one headlined themes, and one headlined descriptors, each followed by a definition. Codes convey central ideas or contributions of segments of text. Themes are general, explanatory discussions of concepts or ideas in texts. And descriptors are characteristics of entire texts or their creators.

Let’s consider an example. Imagine that you were conducting an interview-based study looking at minor-league athletes’ workplace experiences and later-life career plans. In this study, themes might be broad ideas like “aspirations” or “work experiences.” There would be a vast array of codes, but they might include things like “short-term goals,” “educational plans,” “pay,” “team bonding,” “travel,” “treatment by managers,” “family demands,” and many more. Descriptors might include the athlete’s gender and what sport they play.
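For readers who find it helpful to see these distinctions as a data structure, the following is a minimal sketch of the minor-league athlete example. The class names, the sample interview text, and the assignment of codes to themes are illustrative assumptions rather than anything drawn from Saldaña (2016).

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    text: str
    codes: list  # codes describe this segment only


@dataclass
class Interview:
    descriptors: dict  # characteristics of the whole text or its creator
    segments: list = field(default_factory=list)


# Themes are broader than codes; this grouping is illustrative only.
THEMES = {
    "aspirations": {"short-term goals", "educational plans"},
    "work experiences": {"pay", "team bonding", "travel", "treatment by managers"},
}

interview = Interview(
    descriptors={"gender": "woman", "sport": "baseball"},
    segments=[
        Segment("I want to finish my degree once the season ends.", ["educational plans"]),
        Segment("The bus rides are long and the pay barely covers rent.", ["travel", "pay"]),
    ],
)


def themes_for(segment, themes=THEMES):
    """Themes whose code sets overlap with the codes applied to a segment."""
    return sorted(name for name, codes in themes.items() if codes & set(segment.codes))
```

Note that descriptors attach to the interview as a whole, codes attach to individual segments, and themes sit above the codes.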

Developing a Coding System

While all approaches to coding have in common the idea that codes are applied to segments of text or visuals, there are many different ways to go about coding. These approaches differ in terms of when they occur during the research process and how codes are developed. First of all, there is a distinction between first- and second-cycle coding approaches (Saldaña 2016). First-cycle coding happens early in the research process and is really a bridge from data reduction to data analysis, while second-cycle coding occurs later in the research process and is more analytical in nature. Another version of this distinction is the comparison between rough, analytic, and focused coding. Rough coding is really part of the process of data reduction. It often involves little more than putting a few words near each segment of text to make clear what is important in that segment, with the approach being further refined as coding continues. In contrast, analytic coding involves more detailed techniques designed to move towards the development of themes and findings. Finally, focused coding involves selecting ideas of interest and going back and re-coding your texts to orient your approach more specifically around these ideas (Bergin 2018).

A second set of distinctions concerns whether the data drives the development of codes or whether codes are instead developed in advance. If codes are determined in advance, or predetermined, researchers develop a set of codes based on their theory, hypothesis, or research question. This sort of coding is typically called deductive coding or closed coding. In contrast, open coding or inductive coding refers to a process in which researchers develop codes based on what they observe in their data, grounding their codes in the texts. This second approach is more common, though by no means universal, in qualitative data analysis. In both types of coding, however, researchers may rely upon ideas generated by writing theoretical memos as they work through the connections between concepts, theory, and data (Saldaña 2016).

Finally, a third set of distinctions focuses on what is coded. Manifest coding refers to the coding of surface-level and easily observable elements of texts (Berg 2009). In contrast, latent coding is a more interpretive approach based on looking deeply into texts for the meanings that are encoded within or symbolized by them (Berg 2009). For example, consider a research project focused on gender in car advertisements. A manifest approach might count the number of men versus women who appear in the ads. A latent approach would instead focus on the use of gendered language and the extent to which men and women are depicted in gender-stereotyped ways.

Researchers need to answer two more questions as they develop their coding systems. First, what to code, and second, how many codes. When thinking about what to code, researchers can look at the level of individual words, characters or actors in the text, paragraphs, entire textual items (like complete books or articles), or really any unit of text (Berg 2009), but the most useful procedure is to look for chunks of words that together express a thought or idea, here referred to as “segments of text” or “textual segments,” and then code to represent the ideas, concepts, emotions, or other relevant thoughts expressed in those chunks.

How many codes should a particular coding system have? There is no simple answer to this question. Some researchers develop complex coding systems with many codes and may have over a hundred different codes. Others may use no more than 25, perhaps fewer, even for the same size project (Saldaña 2016). Some researchers nest codes into code trees, with several related “child” codes (or subcodes) under a single “parent” code. For example, a code “negative emotions” could be the parent code for a series of codes like “anger,” “frustration,” “sadness,” and “fear.” This approach enables researchers to use a smaller or larger number of codes in their analysis as seems fit after coding is complete. While there is no formula for determining the right number of codes for a particular project, researchers should be attentive to overgrowth in the number of codes. Codes have limited analytical value if they are used only once or twice. If a coding system includes many codes that are applied only a small number of times, consider whether there are larger categories of codes that might be more useful. Occasionally, there are codes worth keeping but applying rarely, for example when there is a rare but important phenomenon that arises in the data. But for the most part, codes should be used with some degree of frequency in order for them to be useful for uncovering themes and patterns.
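The idea of folding rarely used child codes into their parent can be sketched in a few lines of Python. The parent and child codes come from the example above, while the `min_uses` cut-off of 3, the sample data, and the `roll_up_rare_codes` helper are assumptions; there is no standard threshold.

```python
from collections import Counter

# Child code -> parent code, following the "negative emotions" example.
PARENT = {
    "anger": "negative emotions",
    "frustration": "negative emotions",
    "sadness": "negative emotions",
    "fear": "negative emotions",
}


def roll_up_rare_codes(applied, parent=PARENT, min_uses=3):
    """Replace child codes used fewer than `min_uses` times with their parent code."""
    counts = Counter(applied)
    return [
        parent[code] if code in parent and counts[code] < min_uses else code
        for code in applied
    ]


applied = ["anger", "anger", "anger", "fear", "sadness", "anger", "fear"]
rolled = roll_up_rare_codes(applied)
rolled_counts = Counter(rolled)
```

A mechanical roll-up like this should be reviewed by hand, since a rarely applied code may capture a rare but important phenomenon that deserves to stay separate.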

Types of Codes

A wide variety of different types of codes can be used in coding systems. The discussion below, which draws heavily on the work of Saldaña (2016), details a variety of different approaches to coding and code development. Researchers do not need to choose just one of these approaches—most researchers combine multiple coding approaches to create an overall system that is right for the texts they are coding and the project they are conducting. The approaches detailed here are presented roughly in order of the degree of complexity they represent.

At the most basic level is descriptive coding. Descriptive codes are nouns or phrases describing the content covered in a segment of text or the topic the segment of text focuses on. All studies can use descriptive coding, but it often yields less rich data for analysis than other approaches might. Descriptive coding is often used as part of rough coding and data reduction to prepare for later iterations of coding that delve more deeply into the texts. So, for instance, that study of sexism in advertisements might involve some rough coding in which the researcher notes what type of product or service is being advertised in each advertisement.

Structural coding, in contrast, attends more closely to the research question rather than to the ideas in the text. In structural coding, codes indicate which specific research question, part of a research question, or hypothesis is being addressed by a particular segment of text. This may be most useful as part of rough coding to help researchers ensure that their data addresses the questions and foci central to their project.

In vivo coding captures short phrases derived from participants’ own language, typically action-oriented. This is particularly important when researchers are studying subcultural groups that use language in different ways than researchers are accustomed to and where this language is important for subsequent analysis (Manning 2017). In this approach, researchers choose actual portions of respondents’ words and use those as codes. In vivo coding can be used as part of both rough and analytical coding processes.

A related approach is process coding, which involves “the use of gerunds to label actual or conceptual actions relayed by participants” (Saldaña 2016:77). (Gerunds are verb forms that end in -ing and can function grammatically as if they are nouns when used in sentences.) Process coding draws researchers’ attention to actions, but in contrast to in vivo coding it uses the researcher’s vocabulary to build the coding system. So, for instance, in the study of minor league athletes discussed earlier in the chapter, process codes might include “traveling,” “planning,” “exercising,” “competing,” and “socializing.”

Concept coding involves codes consisting of words or short phrases that represent broader concepts or ideas rather than tangible objects or actions. Sticking with the minor league athletes example, concept codes might include “for the love of the game,” “youth,” and “exploitation.” A combination of concept, process, and descriptive coding may be useful if researchers want their coding system to result in an inventory of the ideas, objects, and actions discussed in the texts.

Figure 2. Emoticodes: a 5 by 5 grid of emoji, including grinning face, grinning face with sunglasses, grinning face with a tear, laughing face, grinning face with glasses, face with tongue sticking out, smiling face with sunglasses, grinning face with hearts for eyes, kissing face blowing a kiss, kissing face, winking face with tongue sticking out, face with glasses and tongue sticking out, face with rolling eyes, smirking face with glasses, squinting face with frown, relieved face, frowning face, confounded face, face with surgical mask, confused face, grimacing face, flushed face, face with crossed-out eyes, angry face with surgical mask, and unamused face.

Emotion codes are codes indicating the emotions that participants discuss in, or that are evoked by, a segment of text. A more contemporary version of emotion codes relies on “emoticodes,” or emoji that express specific kinds of emotions, as shown in Figure 2.

Values coding involves the use of codes designed to represent the “perspectives or worldview” of a respondent by conveying participants’ “values, attitudes, and beliefs” (Saldaña 2016:131). For example, a project on elementary school teachers’ workplace satisfaction might include values codes like “equity,” “learning,” “commitment,” and “the pursuit of excellence.” Do note that choices made in values coding are, even more so than in other forms of coding, likely to reflect the values and worldviews of the coder. Thus, it can be essential to use a team of multiple coders with different backgrounds and perspectives in order to ensure a values coding approach that reflects the contents of the texts rather than the ideas of the coders.

Versus coding requires the construction of a series of binary oppositions and then the application of one or the other of the items in the binary as a code to each relevant segment of text. This may be a particularly useful approach for deductive coding, as the researcher can set out a series of hypothesized binaries to use as the basis for coding. For example, the project on elementary school teachers’ workplace satisfaction might use binaries like feeling supported vs. feeling unsupported, energized vs. tired, unfulfilled needs vs. fulfilled needs, kids ready to learn vs. kids needing services, academic vs non-academic concerns, and so on.
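To make the logic of versus coding concrete, the sketch below stores a few of the binaries from the teacher example and only accepts a code that belongs to the chosen binary. The dictionary keys, the function name, and the sample segment are invented for illustration; they are not part of any established procedure.

```python
# Hypothetical binaries, based on the teacher-satisfaction example above.
BINARIES = {
    "support": ("feeling supported", "feeling unsupported"),
    "energy": ("energized", "tired"),
    "needs": ("fulfilled needs", "unfulfilled needs"),
}


def apply_versus_code(binary, side):
    """Return the chosen side of a binary, rejecting anything outside it."""
    options = BINARIES[binary]
    if side not in options:
        raise ValueError(f"{side!r} is not one of {options}")
    return side


# Invented segment, paired with the coder's judgment about which side applies.
segment = "My principal always has my back."
code = apply_versus_code("support", "feeling supported")
print(segment, "->", code)
```

The judgment about which side applies still belongs to the coder; the structure only enforces that every coded segment lands on one side or the other.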

Evaluation coding is used to signify what is and is not working in the policy, program, or endeavor that respondents are discussing or that the research focuses on. This approach is obviously especially useful in evaluation research designed to assess the merit or functioning of particular policies or programs. For example, if the project about elementary school teachers was part of a mentoring program designed to keep new teachers in the education profession, codes might include “future orientation” to flag portions of the text in which teachers discuss their longer-term plans and “mentor/mentee match” to flag portions in which they explore how they feel about their mentors, both key elements of the program and its goals.

There are a variety of other approaches more common outside of sociology, such as dramaturgical coding, which is a coding approach that treats interview transcripts or fieldnotes as if they are scripts for a play, coding such things as actors, attitudes, conflicts, and subtexts; coding approaches relying on terms and ideas from literary analysis; and those drawn from communications studies, which focus on facets of verbal exchange. Finally, some researchers have outlined very specific coding strategies and procedures such that someone else could pick up their methods and apply them exactly. This sort of approach is typically deductive, as it requires the advance specification of the decisions that will be made about coding.

Some coding strategies incorporate measures of weight or intensity, and this can be combined with many of the approaches detailed above. For example, consider a project collecting narratives of people’s experiences with losing their jobs. Respondents might include a variety of emotional content in their narratives, whether sadness, fear, stress, relief, or something else. But the emotions they discuss will vary not only in type but also in extent. A worker who is fired from a job they liked well enough but who knows they will be able to find another job soon may express sadness, while a worker whose company closed after they worked there for 20 years and who has few other equivalent employment opportunities in the region may express devastation. Code weights help account for these differences.
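One way to picture code weights is as an extra field recorded with each application of a code. The sketch below assumes a 1 to 5 intensity scale and uses invented excerpts that echo the two workers described above; the scale and the class name are illustrative assumptions, since weighting schemes differ from project to project.

```python
from dataclasses import dataclass


@dataclass
class WeightedCode:
    segment: str  # excerpt of text being coded
    code: str     # the code applied, here an emotion code
    weight: int   # intensity on an assumed scale of 1 (mild) to 5 (extreme)

    def __post_init__(self):
        if not 1 <= self.weight <= 5:
            raise ValueError("weight must be between 1 and 5")


# Invented excerpts for illustration only.
applications = [
    WeightedCode("I liked it there, but I'll find something soon.", "sadness", 2),
    WeightedCode("Twenty years, and there is nothing else around here.", "sadness", 5),
]

# The same code appears twice; the weight is what tells the two apart.
strongest = max(applications, key=lambda a: a.weight)
print(strongest.segment)
```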

A final question researchers must consider is whether they will apply only one code per segment of text or will permit overlapping codes. Overlapping codes make data analysis more complex but can facilitate the process of looking for relationships between different concepts or ideas in the data.
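If overlapping codes are permitted, each application of a code can be recorded as a span of text, and overlaps then become something the researcher can look up. The following is a minimal sketch that assumes spans are stored as character offsets; the codes and offsets are invented.

```python
from itertools import combinations

# Each application is (code, start, end) in character offsets, end exclusive.
# Codes and offsets are invented for illustration.
applications = [
    ("sadness", 0, 120),
    ("job search", 80, 200),
    ("family", 300, 380),
]


def overlapping_pairs(apps):
    """Return pairs of distinct codes whose spans share at least one character."""
    pairs = []
    for (c1, s1, e1), (c2, s2, e2) in combinations(apps, 2):
        if c1 != c2 and s1 < e2 and s2 < e1:
            pairs.append((c1, c2))
    return pairs


print(overlapping_pairs(applications))  # [('sadness', 'job search')]
```

Pairs that overlap often are candidates for a closer look at how the two ideas relate in the data.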

As a coding system is developed and certainly upon its completion, researchers create documents known as codebooks. As is the case with survey research, codebooks lay out the details of how the measurement instrument works to capture data and measure it. For surveys, a codebook tells researchers how to transform the multiple-choice and short-answer responses to survey questions into the numerical data used for quantitative analysis. For qualitative coding, codebooks instead explain when and how to use each of the codes included in the project. Codebooks are an important part of the coding process because they remind the researcher, and any other coders working on the project, what each code means, what types of data it is meant to apply to, and when it should and should not be used (Luker 2008). Even if a researcher is coding without others, it is easy to lose sight of what you were thinking when you initially developed your coding system, and so the codebook serves as an important reminder.

For each code, the codebook should state the name of the code, describe in a couple of sentences what the code is and what it should be used for, note when the code should not be used, give examples of both typical and atypical conditions under which the code would be used, and discuss the role the code plays in analysis (Saldaña 2016). Codebooks thus serve as instruction manuals for when and how to apply codes. They can also help researchers think about taxonomies of codes as they organize the codebook, with higher-level ideas serving as categories for groups of child, or more precise, codes.
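The elements of a codebook entry listed above map naturally onto a simple record. The sketch below is one possible layout, not a standard format: the field names are assumptions, and the entry wording is invented around the mentoring-program codes mentioned earlier.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CodebookEntry:
    name: str
    description: str                       # what the code is and is used for
    when_not_to_use: str = ""              # exclusion guidance
    typical_examples: List[str] = field(default_factory=list)
    atypical_examples: List[str] = field(default_factory=list)
    role_in_analysis: str = ""
    parent: Optional[str] = None           # higher-level category, if any


def taxonomy(entries):
    """Group code names under their parent category."""
    groups = {}
    for entry in entries:
        groups.setdefault(entry.parent or "(no parent)", []).append(entry.name)
    return groups


# Hypothetical entries; the wording is illustrative only.
codebook = [
    CodebookEntry(
        name="mentor/mentee match",
        description="Teacher explores how they feel about their mentor.",
        when_not_to_use="Passing mentions of a mentor with no evaluation.",
        role_in_analysis="Flags a key element of the mentoring program.",
        parent="program elements",
    ),
    CodebookEntry(
        name="future orientation",
        description="Teacher discusses longer-term plans.",
        parent="program elements",
    ),
]

print(taxonomy(codebook))
```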

The Process of Coding

So, what does the process of coding look like? While qualitative research can and does involve deductive approaches, the process that will be detailed here is an inductive approach, as this is more common in qualitative research. This discussion will lay out a series of steps in the coding process as well as some additional questions researchers and analysts must consider as they develop and carry out their coding.

The first step in inductive coding is to completely and thoroughly read through the data several times while taking detailed notes. To Saldaña (2016), the most important question to ask during this initial read is what is especially interesting or surprising or otherwise stands out. In addition, researchers might contemplate the actions people take, how people go about accomplishing things, how people use language or understand the world, and what people seem to be thinking. The notes should include anything and everything—objects, people, emotions, actions, theoretical ideas, questions—really anything, whether it comes up again and again in the data or only once, though it is useful to flag or highlight those concepts that seem to recur frequently in the data.

Next, researchers need to organize these notes into a coding system. This involves deciding which coding approach(es) to incorporate, whether or not to use parent and child codes, and what sort of vocabulary to use for codes. Remember that readers will not see the coding system except insofar as the researcher chooses to convey it, so vocabulary and terms should be chosen based on the extent to which they make sense to the research team. Once a coding system has been developed, the researcher must create a codebook. If paper coding will be used, a paper codebook should be created. If researchers will be using CAQDAS, or computer-aided qualitative data analysis software, to do their coding, it is often the case that the codebook can be built into the software itself.

Next, the researcher or research team should rough code, applying codes to the text while taking notes to reflect upon missing pieces in the coding system, ways to reorganize the codes or combine them to enhance meaning, and relevant theoretical ideas and insights. Upon completing the rough coding process, researchers should revise the coding system and codebook to fully reflect the data and the project’s needs.

At this point, researchers are ready to engage in coding using the revised codebook. They should always have someone else code a portion of the texts (usually a minimum of 10%) for interrater reliability checks, and if a larger research team is used, 10% of the texts should be coded in common by all coders who are part of the research team. Even in cases where researchers are working alone, it truly strengthens data analysis to be able to check for interrater reliability, so most analysts suggest having a portion of the data coded by another coder, using the codebook. If at all possible, additional coding staff should not be told what the hypothesis or research question is, as one of the strengths of this approach is that additional coding staff will be less likely to be influenced by preexisting ideas about what the data should show (Luker 2008). There are various quantitative measures, such as Cronbach’s alpha and kappa, that researchers use to calculate interrater reliability, the measure of how closely the ratings of multiple coders correspond. All coders should keep detailed notes about their coding process and any obstacles or difficulties they encounter.
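As an illustration of one such measure, the sketch below computes Cohen's kappa, which compares the agreement two coders actually reached with the agreement expected by chance given how often each coder used each code. It assumes exactly two coders and exactly one code per segment; the code labels and the two coders' decisions are invented.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per segment."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equal-length, non-empty lists")
    n = len(coder_a)
    # Share of segments on which the two coders chose the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


# Invented decisions by two coders on the same six segments.
coder_a = ["praise", "criticism", "praise", "suggestion", "criticism", "praise"]
coder_b = ["praise", "criticism", "suggestion", "suggestion", "criticism", "praise"]

print(round(cohens_kappa(coder_a, coder_b), 2))  # 0.75
```

Here the coders agree on five of six segments (about 0.83), chance agreement is one third, and kappa is therefore (0.83 - 0.33) / (1 - 0.33) = 0.75. Projects with more than two coders or with overlapping codes would need a different measure.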

How do researchers know they are done coding? Not just because they have gone through each text once or twice! Researchers may need to continue repeating this process of revision and re-coding until additional coding does not reveal anything more. This repetition is an essential part of coding, as coding always requires refinement and rethinking (Saldaña 2016). In Berg’s (2009:354-55) words, it is essential to “code minutely,” beginning with a rough view of the entire text and then refining as you go until you are examining each detail of a text. Then, researchers think about why and how they developed their codes and what jumps out at them as important from the research as they delve into findings, making sure that nothing has been left out of the coding process before they move towards data analysis.

One interesting question is whether the identities and standpoints (as discussed in the chapter “The Qualitative Approach”) of coders matter to the coding process. Eduardo Bonilla-Silva (Zuberi and Bonilla-Silva 2008:17) has described how, after a presentation discussing his research on racism, a colleague asked whether the coders were White or Black, and he responded by asking the colleague “if he asked such questions across the board or only to researchers saying race matters.” As Bonilla-Silva’s question suggests, race (like other aspects of identity and experience, such as gender, immigration status, disability status, age, and social class, just to name a few) very well might shape the way coders see and understand data, functioning as part of a particular coding filter (Saldaña 2016). But that shaping extends broadly across all issues, not just those we might assume are particularly salient in relationship to identities. Thus, it is best for research teams to be diverse so as to ensure that a variety of perspectives are brought to bear on the data and that the findings reflect more than just a narrow set of ideas about how the world works.

Coding and What Comes After

If researchers will code by hand, they will need multiple copies of their data, one for reference and one for writing on (Luker 2008). On the copy that will be written on, researchers use a note-taking system that makes sense to them (different-colored markers, Roman numerals in the margins, a complex series of sticky notes, or whatever else works) to mark the application of various codes to sections of their data. You can see an example of what hand coding might look like in Figure 3 below, which is taken from a study of the comments faculty members make on student writing. Segments of text are highlighted in different colors, with codes noted in the margins next to the text. You can see how codes are repeated but in different combinations. Once the initial coding process is complete, researchers often cut apart the pieces of paper to make chunks of text with individual codes and sort the pieces of paper by code (if multiple codes appear in individual chunks of text, additional copies might be needed). Then, each pile is organized and used as the basis for writing theoretical memos. Another option for coding by hand is to use an index sheet (Berg 2009). This approach entails developing a set of codes and categories, arranging them on paper, and entering transcript, page, and paragraph information to identify where relevant quotes can be found.
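The index sheet described above is, in effect, a lookup table from each code to the places where it appears. A minimal sketch of the same idea, with invented transcript names and locations, might look like this:

```python
from collections import defaultdict

# code -> list of (transcript, page, paragraph) locations
index_sheet = defaultdict(list)


def record(code, transcript, page, paragraph):
    """Note one place where a quote relevant to the code can be found."""
    index_sheet[code].append((transcript, page, paragraph))


# Invented entries for illustration.
record("Praise", "Comment 01", 1, 1)
record("Criticism", "Comment 01", 1, 1)
record("Praise", "Comment 02", 3, 2)

for code in sorted(index_sheet):
    places = "; ".join(f"{t}, p. {p}, para. {g}" for t, p, g in index_sheet[code])
    print(f"{code}: {places}")
```

Whether kept on paper or in a file, the sheet serves the same purpose: it lets the researcher return to the original passage rather than relying on the code alone.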

For more complex analytical processes, researchers will likely want to use software, though there are limitations to software. Luker (2008), for instance, argues that when coding manually, she tends to start with big themes and only breaks them into their constituent parts later, while coding using software leads her to start with the smallest possible codes. (One solution to this, offered by some software packages, is upcoding, where a so-called “parent” code is simultaneously applied to all of the “child” codes under it. For instance, you might have a parent code of “activism” and then child codes that you apply to different kinds of activism, whether protest, legislative advocacy, community organizing, or whatever.)

Figure 3. Hand coding example: a page of text highlighted in different colors, with codes in the margin.

  • "You are off to a strong start here, but your literature review does need more work." Codes: Overall Criticism, Praise
  • As you can see, "I did a lot of editing to your word usage and sentence structure; you might want to consider going to the writing center with drafts of your work in the future for help learning how to edit and proofread your work more effectively. Sometimes reading out loud can be an effective way to catch some errors." Codes: Editing, Criticism, Suggestions
  • As I noted in the marginal comments, "you have some problems with your citations and are missing at least one source." Codes: Citations, Criticism
  • On the other hand, "you did a good job of trying to combine the themes of your articles into a flowing document. Still, I would suggest a bit of reorganization. For instance, you might start with a paragraph describing the reasons why international students choose to study in other countries (perhaps one of your sources also has statistics about the number of international students in the US; if not, let me know and I might know where to find some). Next, you might turn to a paragraph or two discussing some of the benefits that international students provide, both to their host countries and to their sending countries. Third, write a paragraph discussing some of the difficulties international students have when adjusting to their new circumstances, and then finally turn to the other risks and difficulties you outlined. This will build seamlessly toward" Codes: Organization, Suggestions
  • "your research question, which is a really interesting one!" Codes: Research Q, Praise
  • "If you want to send me an email reminding me, there is a news article in the Chronicle of Higher Education about a series of for-profit colleges in the US that preyed upon international students; it might make an interesting case for your introduction when you write the proposal, and if you remind me I will send it to you." Codes: Sources, Suggestion
  • "In any case, if you do work on the omissions and issues facing this literature review, I think you’ll be in good shape for a really interesting final project." Code: Overall Praise

Coding does not stand on its own, and thus simply completing the coding process does not move a research project from data to analysis. While the analysis process will be discussed in more detail in a subsequent chapter, there are several steps researchers take alongside coding or immediately after completing coding that facilitate analysis and are thus useful to discuss in the context of coding. Many of these are best understood as part of the process of data reduction. One of the most important of these is categorizing codes into larger groupings, a step that helps to enable the development of themes. These larger groupings, sometimes called “parent” codes, can collapse related but not identical ideas. This is always useful, but it is especially useful in cases where researchers have used a large number of codes and each one is applied only a few times. Once parent codes have been created, researchers then go back and ensure that the appropriate parent code is assigned to all segments of text that were initially coded with the relevant “child” codes (a step that can be automated in CAQDAS). If appropriate, researchers may repeat this process to see if parent codes can be further grouped. An alternative approach to this grouping process is to wait until coding is complete, and then create more analytical categories that make sense as thematic groupings for the codes that have been utilized in the project so far (Saldaña 2016).
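The step of assigning parent codes to every segment already carrying a child code is mechanical, which is why software can automate it. The sketch below shows the idea using the activism hierarchy from the upcoding example earlier in this section; the segment identifiers and the other codes are invented, and real CAQDAS packages implement this in their own ways.

```python
# Hypothetical hierarchy: child code -> parent code.
PARENT_OF = {
    "protest": "activism",
    "legislative advocacy": "activism",
    "community organizing": "activism",
}


def upcode(segment_codes):
    """Return a segment's codes plus the parent of every child code present."""
    parents = {PARENT_OF[c] for c in segment_codes if c in PARENT_OF}
    return set(segment_codes) | parents


# Invented coded segments.
coded = {
    "seg-1": {"protest", "fear"},
    "seg-2": {"family"},
}

upcoded = {seg: upcode(codes) for seg, codes in coded.items()}
print(sorted(upcoded["seg-1"]))  # ['activism', 'fear', 'protest']
```

If parent codes are later grouped again, the same function can be run a second time with a mapping from parents to their own higher-level categories.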

There are a variety of other approaches researchers may take as part of data reduction or preliminary analysis after completing coding. They may outline the codes that have occurred most frequently for specific participants or texts, or for the entire body of data, or the codes that are most likely to co-occur in the same segment of text or in the same document. They may print out or photocopy documents or segments of text and rearrange them on a surface until the arrangement is analytically meaningful. They may develop diagrams or models of the relationships between codes. In doing this, it is especially helpful to focus on the use of verbs or other action words to specify the nature of these relationships—not just stating that relationships exist, but exploring what the relationships do and how they work.
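Counting how often codes occur, and how often they occur together, is straightforward once each segment's codes are recorded. The sketch below tallies both for a handful of segments whose code combinations follow the hand coding example in Figure 3; treating each segment as a plain set of codes is a simplifying assumption.

```python
from collections import Counter
from itertools import combinations

# One set of codes per segment, following the combinations in Figure 3.
segments = [
    {"Overall Criticism", "Praise"},
    {"Editing", "Criticism", "Suggestions"},
    {"Citations", "Criticism"},
    {"Organization", "Suggestions"},
    {"Research Q", "Praise"},
]

# How many segments each code was applied to.
code_counts = Counter(code for codes in segments for code in codes)

# How many segments each pair of codes shares (pairs sorted alphabetically).
pair_counts = Counter(
    pair for codes in segments for pair in combinations(sorted(codes), 2)
)

print(code_counts["Criticism"])                   # 2
print(pair_counts[("Criticism", "Suggestions")])  # 1
```

Counts like these are a starting point for data reduction, not a finding in themselves; the relationships they point to still have to be examined in the text.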

In inductive coding especially, it is often useful to write theoretical and analytical memos while coding occurs, and after coding is completed it is a good time to go back and review and refine these memos. Here, researchers both clearly articulate to themselves how the coding process occurred and what methodological choices they made as well as what preliminary ideas they have about analysis and potential findings. It can be very useful to summarize one’s thinking and any patterns that might have been observed so far as a step in moving towards analysis. However, it is extremely important to remember the data and not just the codes. Qualitative researchers always go back to the actual text and not just the summaries or categories. So a final step in the process of moving toward analysis might be to flag quotes or data excerpts that seem particularly noteworthy, meaningful, or analytically useful, as researchers need these examples to make their data come alive during analysis and when they ultimately present their results.

Becoming a Coder

This chapter has provided an overview of how to develop a coding system and apply that system to the task of conducting qualitative coding as part of a research project. Many new researchers find it easy—if sometimes time-consuming and not always fascinating—to get engaged with the coding process. But what does it take to become an effective coder? Saldaña (2016) emphasizes personality attributes and skills that can help. Some of these are attributes and skills that are important for anyone who is involved in any aspect of research and data analysis: organization, to keep track of data, ideas, and procedures; perseverance, to ensure that one keeps going even when the going is tough, as is often the case in research; and ethics, to ensure proper treatment of research participants, appropriate data security behaviors, and integrity in the use of sources. In most aspects of data analysis, creativity is also important, though there are some roles in quantitative data analysis that require more in the way of technical skills and ability to follow directions. In qualitative data analysis, creativity remains important because of the need to think deeply and differently about the data as analysis continues. Flexibility and the ability to deal with ambiguity are much more important in qualitative research, as the data itself is more variable and less concrete; quantitative research tends to place more emphasis on rules and procedures. A final strength that is particularly important for those working in qualitative coding is having a strong vocabulary, as vocabulary both helps researchers understand the data and enhances their ability to create effective and useful coding systems. The best way to develop a stronger vocabulary is to read more, especially within your discipline or field but broadly as well, so researchers should be sure to stay engaged with reading, learning, and growing.

Reading, learning, and growing, along with a lot of practice, is of course how researchers enhance their data collection, coding, and data analysis skills, so keep working at it. Qualitative research can indeed be easy to get started with, but it takes time to become an expert. Put in the time, and you, too, can become a skilled qualitative data analyst.

  • Female respondent
  • The relationship between poverty and social control
  • The process of divorce
  • Social hierarchies
  • Pick a research topic you find interesting and determine which of the approaches to coding detailed in this chapter might be most appropriate for your topic, then write a paragraph about why this approach is the best.
  • Sticking with the same topic you used to respond to Exercise 2, brainstorm some codes that might be useful for coding texts related to this topic. Then, write appropriate text for a codebook for each of those codes.
  • Select a hashtag of interest on a particular social media site and randomly sample every other post using that hashtag until you have selected 15 posts. Then inductively code those posts and engage in summarization or classification to determine what the most important themes they express might be.
  • Create a codebook based on what you did in Exercise 4. Exchange codebooks and posts with a classmate and code each other’s posts according to the instructions in the codebook. Compare your results: how often did your coding decisions agree and how often did they disagree? What does this tell you about interrater reliability, codebook construction, and coder training?

Media Attributions

  • codes themes descriptors © Mikaila Mariel Lemonik Arthur is licensed under a CC BY-NC (Attribution NonCommercial) license
  • Emoticodes © AnnaliseArt is licensed under a CC BY (Attribution) license
  • Hand Coding Example © Mikaila Mariel Lemonik Arthur is licensed under a CC BY-NC-ND (Attribution NonCommercial NoDerivatives) license

Words or phrases that capture a central or notable attribute of a particular segment of textual or visual data.

The process of assigning observations to categories.

Concepts, topics, or ideas around which a discussion, analysis, or text focuses.

A category in an information storage system; more specifically in Dedoose, a characteristic of an author or entire text. Also, the word used to indicate that category or characteristic.

The possible levels or response choices of a given variable.

Coding that occurs early in the research process as part of a bridge from data reduction to data analysis.

Analytical coding that occurs later in the data analysis process.

Coding for data reduction or as part of an initial pass through the data.

Coding designed to move analysis towards the development of themes and findings.

Selective coding designed to orient an analytical approach around certain ideas.

Coding in which the researcher developed a coding system in advance based on their theory, hypothesis, or research question.

Coding in which the researcher develops codes based on what they observe in the data they have collected.

Coding of surface-level and/or easily observable elements of texts.

Interpretive coding that focuses on meanings within texts.

Coding that relies on nouns or phrases describing the content or topic of a segment of text.

Coding that indicates which research question or hypothesis is being addressed by a given segment of text.

Coding that relies on research participants' own language.

Coding in which gerunds are applied to actions that are described in segments of text.

Verb forms that end in -ing and function grammatically in sentences as if they are nouns.

Coding using words or phrases that represent concepts or ideas.

Codes indicating emotions discussed by or present in the text, sometimes indicated by the use of emoji/emoticons.

Coding that relies on codes indicating the perspective, worldview, values, attitudes, and/or beliefs of research participants.

Coding that relies on a series of binary oppositions, one of which must be applied to each segment of text.

A coding system used to indicate what is or is not working in a program or policy.

Coding that treats texts as if they are scripts for a play.

Elements of a coding strategy that help identify the intensity or degree of presence of a code in a text.

Documents that lay out the details of measurement. Codebooks may be used in surveys to indicate the way survey questions and responses are entered into data analysis software. Codebooks may be used in coding to lay out details about how and when to use each code that has been developed.

A measure of association especially likely to be used for testing interrater reliability.

Social Data Analysis Copyright © 2021 by Mikaila Mariel Lemonik Arthur is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

  • Browse content in Trusts Law
  • Wills and Probate or Succession
  • Browse content in Medicine and Health
  • Browse content in Allied Health Professions
  • Arts Therapies
  • Clinical Science
  • Dietetics and Nutrition
  • Occupational Therapy
  • Operating Department Practice
  • Physiotherapy
  • Radiography
  • Speech and Language Therapy
  • Browse content in Anaesthetics
  • General Anaesthesia
  • Neuroanaesthesia
  • Clinical Neuroscience
  • Browse content in Clinical Medicine
  • Acute Medicine
  • Cardiovascular Medicine
  • Clinical Genetics
  • Clinical Pharmacology and Therapeutics
  • Dermatology
  • Endocrinology and Diabetes
  • Gastroenterology
  • Genito-urinary Medicine
  • Geriatric Medicine
  • Infectious Diseases
  • Medical Toxicology
  • Medical Oncology
  • Pain Medicine
  • Palliative Medicine
  • Rehabilitation Medicine
  • Respiratory Medicine and Pulmonology
  • Rheumatology
  • Sleep Medicine
  • Sports and Exercise Medicine
  • Community Medical Services
  • Critical Care
  • Emergency Medicine
  • Forensic Medicine
  • Haematology
  • History of Medicine
  • Browse content in Medical Skills
  • Clinical Skills
  • Communication Skills
  • Nursing Skills
  • Surgical Skills
  • Browse content in Medical Dentistry
  • Oral and Maxillofacial Surgery
  • Paediatric Dentistry
  • Restorative Dentistry and Orthodontics
  • Surgical Dentistry
  • Medical Ethics
  • Medical Statistics and Methodology
  • Browse content in Neurology
  • Clinical Neurophysiology
  • Neuropathology
  • Nursing Studies
  • Browse content in Obstetrics and Gynaecology
  • Gynaecology
  • Occupational Medicine
  • Ophthalmology
  • Otolaryngology (ENT)
  • Browse content in Paediatrics
  • Neonatology
  • Browse content in Pathology
  • Chemical Pathology
  • Clinical Cytogenetics and Molecular Genetics
  • Histopathology
  • Medical Microbiology and Virology
  • Patient Education and Information
  • Browse content in Pharmacology
  • Psychopharmacology
  • Browse content in Popular Health
  • Caring for Others
  • Complementary and Alternative Medicine
  • Self-help and Personal Development
  • Browse content in Preclinical Medicine
  • Cell Biology
  • Molecular Biology and Genetics
  • Reproduction, Growth and Development
  • Primary Care
  • Professional Development in Medicine
  • Browse content in Psychiatry
  • Addiction Medicine
  • Child and Adolescent Psychiatry
  • Forensic Psychiatry
  • Learning Disabilities
  • Old Age Psychiatry
  • Psychotherapy
  • Browse content in Public Health and Epidemiology
  • Epidemiology
  • Public Health
  • Browse content in Radiology
  • Clinical Radiology
  • Interventional Radiology
  • Nuclear Medicine
  • Radiation Oncology
  • Reproductive Medicine
  • Browse content in Surgery
  • Cardiothoracic Surgery
  • Gastro-intestinal and Colorectal Surgery
  • General Surgery
  • Neurosurgery
  • Paediatric Surgery
  • Peri-operative Care
  • Plastic and Reconstructive Surgery
  • Surgical Oncology
  • Transplant Surgery
  • Trauma and Orthopaedic Surgery
  • Vascular Surgery
  • Browse content in Science and Mathematics
  • Browse content in Biological Sciences
  • Aquatic Biology
  • Biochemistry
  • Bioinformatics and Computational Biology
  • Developmental Biology
  • Ecology and Conservation
  • Evolutionary Biology
  • Genetics and Genomics
  • Microbiology
  • Molecular and Cell Biology
  • Natural History
  • Plant Sciences and Forestry
  • Research Methods in Life Sciences
  • Structural Biology
  • Systems Biology
  • Zoology and Animal Sciences
  • Browse content in Chemistry
  • Analytical Chemistry
  • Computational Chemistry
  • Crystallography
  • Environmental Chemistry
  • Industrial Chemistry
  • Inorganic Chemistry
  • Materials Chemistry
  • Medicinal Chemistry
  • Mineralogy and Gems
  • Organic Chemistry
  • Physical Chemistry
  • Polymer Chemistry
  • Study and Communication Skills in Chemistry
  • Theoretical Chemistry
  • Browse content in Computer Science
  • Artificial Intelligence
  • Computer Architecture and Logic Design
  • Game Studies
  • Human-Computer Interaction
  • Mathematical Theory of Computation
  • Programming Languages
  • Software Engineering
  • Systems Analysis and Design
  • Virtual Reality
  • Browse content in Computing
  • Business Applications
  • Computer Security
  • Computer Games
  • Computer Networking and Communications
  • Digital Lifestyle
  • Graphical and Digital Media Applications
  • Operating Systems
  • Browse content in Earth Sciences and Geography
  • Atmospheric Sciences
  • Environmental Geography
  • Geology and the Lithosphere
  • Maps and Map-making
  • Meteorology and Climatology
  • Oceanography and Hydrology
  • Palaeontology
  • Physical Geography and Topography
  • Regional Geography
  • Soil Science
  • Urban Geography
  • Browse content in Engineering and Technology
  • Agriculture and Farming
  • Biological Engineering
  • Civil Engineering, Surveying, and Building
  • Electronics and Communications Engineering
  • Energy Technology
  • Engineering (General)
  • Environmental Science, Engineering, and Technology
  • History of Engineering and Technology
  • Mechanical Engineering and Materials
  • Technology of Industrial Chemistry
  • Transport Technology and Trades
  • Browse content in Environmental Science
  • Applied Ecology (Environmental Science)
  • Conservation of the Environment (Environmental Science)
  • Environmental Sustainability
  • Environmentalist Thought and Ideology (Environmental Science)
  • Management of Land and Natural Resources (Environmental Science)
  • Natural Disasters (Environmental Science)
  • Nuclear Issues (Environmental Science)
  • Pollution and Threats to the Environment (Environmental Science)
  • Social Impact of Environmental Issues (Environmental Science)
  • History of Science and Technology
  • Browse content in Materials Science
  • Ceramics and Glasses
  • Composite Materials
  • Metals, Alloying, and Corrosion
  • Nanotechnology
  • Browse content in Mathematics
  • Applied Mathematics
  • Biomathematics and Statistics
  • History of Mathematics
  • Mathematical Education
  • Mathematical Finance
  • Mathematical Analysis
  • Numerical and Computational Mathematics
  • Probability and Statistics
  • Pure Mathematics
  • Browse content in Neuroscience
  • Cognition and Behavioural Neuroscience
  • Development of the Nervous System
  • Disorders of the Nervous System
  • History of Neuroscience
  • Invertebrate Neurobiology
  • Molecular and Cellular Systems
  • Neuroendocrinology and Autonomic Nervous System
  • Neuroscientific Techniques
  • Sensory and Motor Systems
  • Browse content in Physics
  • Astronomy and Astrophysics
  • Atomic, Molecular, and Optical Physics
  • Biological and Medical Physics
  • Classical Mechanics
  • Computational Physics
  • Condensed Matter Physics
  • Electromagnetism, Optics, and Acoustics
  • History of Physics
  • Mathematical and Statistical Physics
  • Measurement Science
  • Nuclear Physics
  • Particles and Fields
  • Plasma Physics
  • Quantum Physics
  • Relativity and Gravitation
  • Semiconductor and Mesoscopic Physics
  • Browse content in Psychology
  • Affective Sciences
  • Clinical Psychology
  • Cognitive Psychology
  • Cognitive Neuroscience
  • Criminal and Forensic Psychology
  • Developmental Psychology
  • Educational Psychology
  • Evolutionary Psychology
  • Health Psychology
  • History and Systems in Psychology
  • Music Psychology
  • Neuropsychology
  • Organizational Psychology
  • Psychological Assessment and Testing
  • Psychology of Human-Technology Interaction
  • Psychology Professional Development and Training
  • Research Methods in Psychology
  • Social Psychology
  • Browse content in Social Sciences
  • Browse content in Anthropology
  • Anthropology of Religion
  • Human Evolution
  • Medical Anthropology
  • Physical Anthropology
  • Regional Anthropology
  • Social and Cultural Anthropology
  • Theory and Practice of Anthropology
  • Browse content in Business and Management
  • Business Ethics
  • Business Strategy
  • Business History
  • Business and Technology
  • Business and Government
  • Business and the Environment
  • Comparative Management
  • Corporate Governance
  • Corporate Social Responsibility
  • Entrepreneurship
  • Health Management
  • Human Resource Management
  • Industrial and Employment Relations
  • Industry Studies
  • Information and Communication Technologies
  • International Business
  • Knowledge Management
  • Management and Management Techniques
  • Operations Management
  • Organizational Theory and Behaviour
  • Pensions and Pension Management
  • Public and Nonprofit Management
  • Strategic Management
  • Supply Chain Management
  • Browse content in Criminology and Criminal Justice
  • Criminal Justice
  • Criminology
  • Forms of Crime
  • International and Comparative Criminology
  • Youth Violence and Juvenile Justice
  • Development Studies
  • Browse content in Economics
  • Agricultural, Environmental, and Natural Resource Economics
  • Asian Economics
  • Behavioural Finance
  • Behavioural Economics and Neuroeconomics
  • Econometrics and Mathematical Economics
  • Economic History
  • Economic Systems
  • Economic Methodology
  • Economic Development and Growth
  • Financial Markets
  • Financial Institutions and Services
  • General Economics and Teaching
  • Health, Education, and Welfare
  • History of Economic Thought
  • International Economics
  • Labour and Demographic Economics
  • Law and Economics
  • Macroeconomics and Monetary Economics
  • Microeconomics
  • Public Economics
  • Urban, Rural, and Regional Economics
  • Welfare Economics
  • Browse content in Education
  • Adult Education and Continuous Learning
  • Care and Counselling of Students
  • Early Childhood and Elementary Education
  • Educational Equipment and Technology
  • Educational Strategies and Policy
  • Higher and Further Education
  • Organization and Management of Education
  • Philosophy and Theory of Education
  • Schools Studies
  • Secondary Education
  • Teaching of a Specific Subject
  • Teaching of Specific Groups and Special Educational Needs
  • Teaching Skills and Techniques
  • Browse content in Environment
  • Applied Ecology (Social Science)
  • Climate Change
  • Conservation of the Environment (Social Science)
  • Environmentalist Thought and Ideology (Social Science)
  • Natural Disasters (Environment)
  • Social Impact of Environmental Issues (Social Science)
  • Browse content in Human Geography
  • Cultural Geography
  • Economic Geography
  • Political Geography
  • Browse content in Interdisciplinary Studies
  • Communication Studies
  • Museums, Libraries, and Information Sciences
  • Browse content in Politics
  • African Politics
  • Asian Politics
  • Chinese Politics
  • Comparative Politics
  • Conflict Politics
  • Elections and Electoral Studies
  • Environmental Politics
  • European Union
  • Foreign Policy
  • Gender and Politics
  • Human Rights and Politics
  • Indian Politics
  • International Relations
  • International Organization (Politics)
  • International Political Economy
  • Irish Politics
  • Latin American Politics
  • Middle Eastern Politics
  • Political Behaviour
  • Political Economy
  • Political Institutions
  • Political Methodology
  • Political Communication
  • Political Philosophy
  • Political Sociology
  • Political Theory
  • Politics and Law
  • Public Policy
  • Public Administration
  • Quantitative Political Methodology
  • Regional Political Studies
  • Russian Politics
  • Security Studies
  • State and Local Government
  • UK Politics
  • US Politics
  • Browse content in Regional and Area Studies
  • African Studies
  • Asian Studies
  • East Asian Studies
  • Japanese Studies
  • Latin American Studies
  • Middle Eastern Studies
  • Native American Studies
  • Scottish Studies
  • Browse content in Research and Information
  • Research Methods
  • Browse content in Social Work
  • Addictions and Substance Misuse
  • Adoption and Fostering
  • Care of the Elderly
  • Child and Adolescent Social Work
  • Couple and Family Social Work
  • Developmental and Physical Disabilities Social Work
  • Direct Practice and Clinical Social Work
  • Emergency Services
  • Human Behaviour and the Social Environment
  • International and Global Issues in Social Work
  • Mental and Behavioural Health
  • Social Justice and Human Rights
  • Social Policy and Advocacy
  • Social Work and Crime and Justice
  • Social Work Macro Practice
  • Social Work Practice Settings
  • Social Work Research and Evidence-based Practice
  • Welfare and Benefit Systems
  • Browse content in Sociology
  • Childhood Studies
  • Community Development
  • Comparative and Historical Sociology
  • Economic Sociology
  • Gender and Sexuality
  • Gerontology and Ageing
  • Health, Illness, and Medicine
  • Marriage and the Family
  • Migration Studies
  • Occupations, Professions, and Work
  • Organizations
  • Population and Demography
  • Race and Ethnicity
  • Social Theory
  • Social Movements and Social Change
  • Social Research and Statistics
  • Social Stratification, Inequality, and Mobility
  • Sociology of Religion
  • Sociology of Education
  • Sport and Leisure
  • Urban and Rural Studies
  • Browse content in Warfare and Defence
  • Defence Strategy, Planning, and Research
  • Land Forces and Warfare
  • Military Administration
  • Military Life and Institutions
  • Naval Forces and Warfare
  • Other Warfare and Defence Issues
  • Peace Studies and Conflict Resolution
  • Weapons and Equipment

The Oxford Handbook of Qualitative Research

A newer edition of this book is available.


28 Coding and Analysis Strategies

Johnny Saldaña, School of Theatre and Film, Arizona State University

  • Published: 04 August 2014

This chapter provides an overview of selected qualitative data analytic strategies with a particular focus on codes and coding. Preparatory strategies for a qualitative research study and data management are first outlined. Six coding methods are then profiled using comparable interview data: process coding, in vivo coding, descriptive coding, values coding, dramaturgical coding, and versus coding. Strategies for constructing themes and assertions from the data follow. Analytic memo writing is woven throughout the preceding as a method for generating additional analytic insight. Next, display and arts-based strategies are provided, followed by recommended qualitative data analytic software programs and a discussion on verifying the researcher’s analytic findings.

Coding and Analysis Strategies

Anthropologist Clifford Geertz (1983) charmingly mused, “Life is just a bowl of strategies” (p. 25). Strategy, as I use it here, refers to a carefully considered plan or method to achieve a particular goal. The goal in this case is to develop a write-up of your analytic work with the qualitative data you have been given and collected as part of a study. The plans and methods you might employ to achieve that goal are what this article profiles.

Some may perceive strategy as an inappropriate if not colonizing word, suggesting formulaic or regimented approaches to inquiry. I assure you that that is not my intent. My use of strategy is actually dramaturgical in nature: strategies are actions that characters in plays take to overcome obstacles to achieve their objectives. Actors portraying these characters rely on action verbs to generate belief within themselves and to motivate them as they interpret the lines and move appropriately on stage. So what I offer is a qualitative researcher’s array of actions from which to draw to overcome the obstacles to thinking to achieve an analysis of your data. But unlike the pre-scripted text of a play in which the obstacles, strategies, and outcomes have been predetermined by the playwright, your work must be improvisational—acting, reacting, and interacting with data on a moment-by-moment basis to determine what obstacles stand in your way, and thus what strategies you should take to reach your goals.

Another intriguing quote to keep in mind comes from research methodologist Robert E. Stake (1995) who posits, “Good research is not about good methods as much as it is about good thinking” (p. 19). In other words, strategies can take you only so far. You can have a box full of tools, but if you do not know how to use them well or use them creatively, the collection seems rather purposeless. One of the best ways we learn is by doing. So pick up one or more of these strategies (in the form of verbs) and take analytic action with your data. Also keep in mind that these are discussed in the order in which they may typically occur, although humans think cyclically, iteratively, and reverberatively, and each particular research project has its own unique contexts and needs. So be prepared for your mind to jump purposefully and/or idiosyncratically from one strategy to another throughout the study.

QDA (Qualitative Data Analysis) Strategy: To Foresee

To foresee in QDA is to reflect beforehand on what forms of data you will most likely need and collect, which thus informs what types of data analytic strategies you anticipate using.

Analysis, in a way, begins even before you collect data. As you design your research study in your mind and on a word processor page, one strategy is to consider what types of data you may need to help inform and answer your central and related research questions. Interview transcripts, participant observation field notes, documents, artifacts, photographs, video recordings, and so on are not only forms of data but foundations for how you may plan to analyze them. A participant interview, for example, suggests that you will transcribe all or relevant portions of the recording, and use both the transcription and the recording itself as sources for data analysis. Any analytic memos (discussed later) or journal entries you make about your impressions of the interview also become data to analyze. Even the computing software you plan to employ will be relevant to data analysis as it may help or hinder your efforts.

As your research design formulates, compose one to two paragraphs that outline how your QDA may proceed. This will necessitate that you have some background knowledge of the vast array of methods available to you. Thus surveying the literature is vital preparatory work.

QDA Strategy: To Survey

To survey in QDA is to look for and consider the applicability of the QDA literature in your field that may provide useful guidance for your forthcoming data analytic work.

General sources in QDA will provide a good starting point for acquainting you with the data analytic strategies available for the variety of genres in qualitative inquiry (e.g., ethnography, phenomenology, case study, arts-based research, mixed methods). One of the most accessible is Graham R. Gibbs’ (2007) Analysing Qualitative Data, and one of the most richly detailed is Frederick J. Wertz et al.’s (2011) Five Ways of Doing Qualitative Analysis. The author’s core texts for this article came from The Coding Manual for Qualitative Researchers (Saldaña, 2009, 2013) and Fundamentals of Qualitative Research (Saldaña, 2011).

If your study’s methodology or approach is grounded theory, for example, then a survey of methods works by such authors as Barney G. Glaser, Anselm L. Strauss, Juliet Corbin and, in particular, the prolific Kathy Charmaz (2006) may be expected. But there has been a recent outpouring of additional book publications in grounded theory by Birks & Mills (2011), Bryant & Charmaz (2007), Stern & Porr (2011), plus the legacy of thousands of articles and chapters across many disciplines that have addressed grounded theory in their studies.

Particular fields such as education, psychology, social work, health care, and others also have their own QDA methods literature in the form of texts and journals, plus international conferences and workshops for members of the profession. Most important is to have had some university coursework and/or mentorship in qualitative research to suitably prepare you for the intricacies of QDA. Also acknowledge that the emergent nature of qualitative inquiry may require you to adopt different analytic strategies from what you originally planned.

QDA Strategy: To Collect

To collect in QDA is to receive the data given to you by participants and those data you actively gather to inform your study.

QDA is concurrent with data collection and management. As interviews are transcribed, field notes are fleshed out, and documents are filed, the researcher uses the opportunity to carefully read the corpus and make preliminary notations directly on the data documents by highlighting, bolding, italicizing, or noting in some way any particularly interesting or salient portions. As these data are initially reviewed, the researcher also composes supplemental analytic memos that include first impressions, reminders for follow-up, preliminary connections, and other thinking matters about the phenomena at work.

Some of the most common fieldwork tools you might use to collect data are notepads, pens and pencils, file folders for documents, a laptop or desktop with word processing software (Microsoft Word and Excel are most useful) and internet access, a digital camera, and a voice recorder. Some fieldworkers may even employ a digital video camera to record social action, as long as participant permissions have been secured. But everything originates from the researcher himself or herself. Your senses are immersed in the cultural milieu you study, taking in and holding on to relevant details or “significant trivia,” as I call them. You become a human camera, zooming out to capture the broad landscape of your field site one day, then zooming in on a particularly interesting individual or phenomenon the next. Your analysis is only as good as the data you collect.

Fieldwork can be an overwhelming experience because so many details of social life are happening in front of you. Take a holistic approach to your entree, but as you become more familiar with the setting and participants, actively focus on things that relate to your research topic and questions. Of course, keep yourself open to the intriguing, surprising, and disturbing (Sunstein & Chiseri-Strater, 2012, p. 115), for these facets enrich your study by making you aware of the unexpected.

QDA Strategy: To Feel

To feel in QDA is to gain deep emotional insight into the social worlds you study and what it means to be human.

Virtually everything we do has an accompanying emotion(s), and feelings are both reactions and stimuli for action. Others’ emotions clue you to their motives, attitudes, values, beliefs, worldviews, identities, and other subjective perceptions and interpretations. Acknowledge that emotional detachment is not possible in field research. Attunement to the emotional experiences of your participants plus sympathetic and empathetic responses to the actions around you are necessary in qualitative endeavors. Your own emotional responses during fieldwork are also data because they document the tacit and visceral. It is important during such analytic reflection to assess why your emotional reactions were as they were. But it is equally important not to let emotions alone steer the course of your study. A proper balance must be found between feelings and facts.

QDA Strategy: To Organize

To organize in QDA is to maintain an orderly repository of data for easy access and analysis.

Even in the smallest of qualitative studies, a large amount of data will be collected across time. Prepare both a hard drive and hard copy folders for digital data and paperwork, and back up all materials for security from loss. I recommend that each data “chunk” (e.g., one interview transcript, one document, one day’s worth of field notes) get its own file, with subfolders specifying the data forms and research study logistics (e.g., interviews, field notes, documents, Institutional Review Board correspondence, calendar).
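The folder arrangement described above can be set up once at the start of a study. The following is only a minimal sketch in Python: the subfolder names are taken from the examples in the text, while the function names, the `.txt` extension, and the file-naming pattern are assumptions made for illustration, not a prescribed layout.

```python
from pathlib import Path
from typing import List

# Subfolder names follow the examples in the text; adjust them to your study.
SUBFOLDERS = [
    "interviews",
    "field_notes",
    "documents",
    "irb_correspondence",
    "calendar",
]


def create_study_folders(root: Path) -> List[Path]:
    """Create one subfolder per data form under `root` and return their paths."""
    created = []
    for name in SUBFOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to run more than once
        created.append(folder)
    return created


def chunk_path(root: Path, data_form: str, label: str) -> Path:
    """Build the path of the file that holds one data chunk (hypothetical naming)."""
    return root / data_form / f"{label}.txt"
```

For instance, `create_study_folders(Path("my_study"))` would prepare the folders, and `chunk_path(Path("my_study"), "interviews", "interview_01")` would give the location for one interview transcript. Backing up the whole `my_study` folder remains a separate step.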

For small-scale qualitative studies, I have found it quite useful to maintain one large master file with all participant and field site data copied and combined with the literature review and accompanying researcher analytic memos. This master file is used to cut and paste related passages together, deleting what seems unnecessary as the study proceeds, and eventually transforming the document into the final report itself. Cosmetic devices such as font style, font size, rich text (italicizing, bolding, underlining, etc.), and color can help you distinguish between different data forms and highlight significant passages. For example, descriptive, narrative passages of field notes are logged in regular font. “Quotations, things spoken by participants, are logged in bold font.” Observer’s comments, such as the researcher’s subjective impressions or analytic jottings, are set in italics.

QDA Strategy: To Jot

To jot in QDA is to write occasional, brief notes about your thinking or reminders for follow-up.

A jot is a phrase or brief sentence that will literally fit on a standard size “sticky note.” As data are brought and documented together, take some initial time to review their contents and to jot some notes about preliminary patterns, participant quotes that seem quite vivid, anomalies in the data, and so forth.

As you work on a project, keep something to write with or to voice record with you at all times to capture your fleeting thoughts. You will most likely find yourself thinking about your research when you're not working exclusively on the project, and a “mental jot” may occur to you as you ruminate on logistical or analytic matters. Get the thought documented in some way for later retrieval and elaboration as an analytic memo.

QDA Strategy: To Prioritize

To prioritize in QDA is to determine which data are most significant in your corpus and which tasks are most necessary.

During fieldwork, massive amounts of data in various forms may be collected, and your mind can get easily overwhelmed by the magnitude of the quantity, its richness, and its management. Decisions will need to be made about the most pertinent of them because they help answer your research questions or emerge as salient pieces of evidence. As a sweeping generalization, approximately one half to two thirds of what you collect may become unnecessary as you proceed toward the more formal stages of QDA.

To prioritize in QDA is also to determine what matters most in your assembly of codes, categories, themes, assertions, and concepts. Return to your research purpose and questions to keep you framed for what the focus should be.

QDA Strategy: To Analyze

To analyze in QDA is to observe and discern patterns within data and to construct meanings that seem to capture their essences and essentials.

Just as there are a variety of genres, elements, and styles of qualitative research, so too are there a variety of methods available for QDA. Analytic choices are most often based on what methods will harmonize with your genre selection and conceptual framework, what will generate the most sufficient answers to your research questions, and what will best represent and present the project’s findings.

Analysis can range from the factual to the conceptual to the interpretive. Analysis can also range from a straightforward descriptive account to an emergently constructed grounded theory to an evocatively composed short story. A qualitative research project’s outcomes may range from rigorously achieved, insightful answers to open-ended, evocative questions; from rich descriptive detail to a bullet-pointed list of themes; and from third-person, objective reportage to first-person, emotion-laden poetry. Just as there are multiple destinations in qualitative research, there are multiple pathways and journeys along the way.

Analysis is accelerated as you take cognitive ownership of your data. By reading and rereading the corpus, you gain intimate familiarity with its contents and begin to notice significant details as well as make new insights about their meanings. Patterns, categories, and their interrelationships become more evident the more you know the subtleties of the database.

Since qualitative research’s design, fieldwork, and data collection are most often provisional, emergent, and evolutionary processes, you reflect on and analyze the data as you gather them and proceed through the project. If preplanned methods are not working, you change them to secure the data you need. There is generally a post-fieldwork period when continued reflection and more systematic data analysis occur, concurrent with or followed by additional data collection, if needed, and the more formal write-up of the study, which is in itself an analytic act. Through field note writing, interview transcribing, analytic memo writing, and other documentation processes, you gain cognitive ownership of your data; and the intuitive, tacit, synthesizing capabilities of your brain begin sensing patterns, making connections, and seeing the bigger picture. The purpose and outcome of data analysis is to reveal to others through fresh insights what we have observed and discovered about the human condition. And fortunately, there are heuristics for reorganizing and reflecting on your qualitative data to help you achieve that goal.

QDA Strategy: To Pattern

To pattern in QDA is to detect similarities within and regularities among the data you have collected.

The natural world is filled with patterns because we, as humans, have constructed them as such. Stars in the night sky are not just a random assembly; our ancestors pieced them together to form constellations like the Big Dipper. A collection of flowers growing wild in a field has a pattern, as does an individual flower’s patterns of leaves and petals. Look at the physical objects humans have created and notice how pattern oriented we are in our construction, organization, and decoration. Look around you in your environment and notice how many patterns are evident on your clothing, in a room, and on most objects themselves. Even our sometimes mundane daily and long-term human actions are reproduced patterns in the form of roles, relationships, rules, routines, and rituals.

This human propensity for pattern making follows us into QDA. From the vast array of interview transcripts, field notes, documents, and other forms of data, there is this instinctive, hardwired need to bring order to the collection—not just to reorganize it but to look for and construct patterns out of it. The discernment of patterns is one of the first steps in the data analytic process, and the methods described next are recommended ways to construct them.

QDA Strategy: To Code

To code in QDA is to assign a truncated, symbolic meaning to each datum for purposes of qualitative analysis.

Coding is a heuristic (a method of discovery) for uncovering the meanings of individual sections of data. These codes function as a way of patterning, classifying, and later reorganizing the data into emergent categories for further analysis. Different types of codes exist for different types of research genres and qualitative data analytic approaches, but this article will focus on only a few selected methods. First, a definition of a code:

A code in qualitative data analysis is most often a word or short phrase that symbolically assigns a summative, salient, essence-capturing, and/or evocative attribute for a portion of language-based or visual data. The data can consist of interview transcripts, participant observation fieldnotes, journals, documents, literature, artifacts, photographs, video, websites, e-mail correspondence, and so on. The portion of data to be coded can... range in magnitude from a single word to a full sentence to an entire page of text to a stream of moving images.... Just as a title represents and captures a book or film or poem’s primary content and essence, so does a code represent and capture a datum’s primary content and essence. [Saldaña, 2009, p. 3]

One helpful pre-coding task is to divide long selections of field note or interview transcript data into shorter stanzas. Stanza division “chunks” the corpus into more manageable paragraph-like units for coding assignments and analysis. The transcript sample that follows illustrates one possible way of inserting line breaks between self-standing passages of interview text for easier readability.

Process Coding

As a first coding example, the following interview excerpt about an employed, single, lower-middle-class adult male’s spending habits during the difficult economic times in the U.S. during 2008–2012 is coded in the right-hand margin in capital letters. The superscript numbers match the datum unit with its corresponding code. This particular method is called process coding, which uses gerunds (“-ing” words) exclusively to represent action suggested by the data. Processes can consist of observable human actions (e.g., BUYING BARGAINS), mental processes (e.g., THINKING TWICE), and more conceptual ideas (e.g., APPRECIATING WHAT YOU’VE GOT). Notice that the interviewer’s (I) portions are not coded, just the participant’s (P). A code is applied each time the subtopic of the interview shifts—even within a stanza—and the same codes can (and should) be used more than once if the subtopics are similar. The central research question driving this qualitative study is, “In what ways are middle-class Americans influenced and affected by the current [2008–2012] economic recession?”

Different researchers analyzing this same piece of data may develop completely different codes, depending on their lenses and filters. The previous codes are only one person’s interpretation of what is happening in the data, not the definitive list. The process codes have transformed the raw data units into new representations for analysis. A listing of them applied to this interview transcript, in the order they appear, reads:

BUYING BARGAINS

QUESTIONING A PURCHASE

THINKING TWICE

STOCKING UP

REFUSING SACRIFICE

PRIORITIZING

FINDING ALTERNATIVES

LIVING CHEAPLY

NOTICING CHANGES

STAYING INFORMED

MAINTAINING HEALTH

PICKING UP THE TAB

APPRECIATING WHAT YOU’VE GOT

Coding the data is the first step in this particular approach to QDA, and categorization is just one of the next possible steps.
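Because process coding relies on gerunds, a coded list can be given a quick mechanical check before moving on. The following is only a minimal sketch, assuming the codes are kept as plain uppercase strings; the helper name `is_gerund_led` is hypothetical, and a word ending in "-ing" is a rough proxy for a gerund rather than a linguistic test.

```python
# Process codes exactly as listed in this article, in order of appearance.
PROCESS_CODES = [
    "BUYING BARGAINS",
    "QUESTIONING A PURCHASE",
    "THINKING TWICE",
    "STOCKING UP",
    "REFUSING SACRIFICE",
    "PRIORITIZING",
    "FINDING ALTERNATIVES",
    "LIVING CHEAPLY",
    "NOTICING CHANGES",
    "STAYING INFORMED",
    "MAINTAINING HEALTH",
    "PICKING UP THE TAB",
    "APPRECIATING WHAT YOU'VE GOT",
]


def is_gerund_led(code: str) -> bool:
    """Return True if the first word of the code ends in -ING (hypothetical helper)."""
    words = code.split()
    return bool(words) and words[0].upper().endswith("ING")


# Collect any codes that break the "-ing" convention so they can be reworded.
non_gerund = [code for code in PROCESS_CODES if not is_gerund_led(code)]
print(non_gerund)
```

A check like this only flags wording; deciding what the code should be remains an interpretive act.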

QDA Strategy: To Categorize

To categorize in QDA is to cluster similar or comparable codes into groups for pattern construction and further analysis.

Humans categorize things in innumerable ways. Think of an average apartment or house’s layout. The rooms of a dwelling have been constructed or categorized by their builders and occupants according to function. A kitchen is designated as an area to store and prepare food and the cooking and dining materials such as pots, pans, and utensils. A bedroom is designated for sleeping, a closet for clothing storage, a bathroom for bodily functions and hygiene, and so on. Each room is like a category in which related and relevant patterns of human action occur. Of course, there are exceptions now and then, such as eating breakfast in bed rather than in a dining area or living in a small studio apartment in which most possessions are contained within one large room (but nonetheless are most often organized and clustered into subcategories according to function and optimal use of space).

The point here is that the patterns of social action we designate into particular categories during QDA are not perfectly bounded. Category construction is our best attempt to cluster the most seemingly alike things into the most seemingly appropriate groups. Categorizing is reorganizing and reordering the vast array of data from a study because it is from these smaller, larger, and meaning-rich units that we can better grasp the particular features of each one and the categories’ possible interrelationships with one another.

One analytic strategy with a list of codes is to classify them into similar clusters. Obviously, the same codes share the same category, but it is also possible that a single code can merit its own group if you feel it is unique enough. After the codes have been classified, a category label is applied to each grouping. Sometimes a code can also double as a category name if you feel it best summarizes the totality of the cluster. Like coding, categorizing is an interpretive act, for there can be different ways of separating and collecting codes that seem to belong together. The cut-and-paste functions of a word processor are most useful for exploring which codes share something in common.

Below is my categorization of the fifteen codes generated from the interview transcript presented earlier. Like the gerunds for process codes, the categories have also been labeled as “-ing” words to connote action. And there was no particular reason why fifteen codes resulted in three categories; there could have been fewer or even more, but this is how the array came together after my reflections on which codes seemed to belong together. The category labels are ways of answering “why” they belong together. For at-a-glance differentiation, I place codes in CAPITAL LETTERS and categories in upper and lower case Bold Font:

Category 1: Thinking Strategically

Category 2: Spending Strategically

Category 3: Living Strategically

APPRECIATING WHAT YOU'VE GOT

Notice that the three category labels share a common word: “strategically.” Where did this word come from? It came from analytic reflection on the original data, the codes, and the process of categorizing the codes and generating their category labels. It was the analyst’s choice based on the interpretation of what primary action was happening. Your categories generated from your coded data do not need to share a common word or phrase, but I find that this technique, when appropriate, helps build a sense of unity to the initial analytic scheme.
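For readers who keep their codes in a script rather than a word processor, the clustering step can be sketched as a simple lookup. This is an illustration under stated assumptions, not the author's procedure: the function `group_codes` is hypothetical, and the only code-to-category pairings used are those the article's analytic memo states in so many words. Every other code is reported as unassigned rather than guessed.

```python
from collections import defaultdict

# Pairings taken from the wording of the article's analytic memo only.
# Anything not named there is left out on purpose.
CODE_TO_CATEGORY = {
    "THINKING TWICE": "Thinking Strategically",
    "QUESTIONING A PURCHASE": "Thinking Strategically",
    "PRIORITIZING": "Living Strategically",
    "REFUSING SACRIFICE": "Living Strategically",
    "FINDING ALTERNATIVES": "Living Strategically",
}


def group_codes(codes, code_to_category):
    """Cluster codes by category; return (groups, codes with no category yet)."""
    groups = defaultdict(list)
    unassigned = []
    for code in codes:
        category = code_to_category.get(code)
        if category is None:
            unassigned.append(code)
        else:
            groups[category].append(code)
    return dict(groups), unassigned


codes_in_order = [
    "QUESTIONING A PURCHASE",
    "THINKING TWICE",
    "STOCKING UP",
    "REFUSING SACRIFICE",
    "PRIORITIZING",
    "FINDING ALTERNATIVES",
]
groups, unassigned = group_codes(codes_in_order, CODE_TO_CATEGORY)
for category, members in groups.items():
    print(category, "->", members)
print("Unassigned:", unassigned)
```

Keeping an explicit "unassigned" list mirrors the point that a single code may still need its own group or further reflection.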

The three categories (Thinking Strategically, Spending Strategically, and Living Strategically) are then reflected upon for how they might interact and interplay. This is where the next major facet of data analysis, analytic memos, enters the scheme. But a necessary section on the basic principles of interrelationship and analytic reasoning must precede that discussion.

QDA Strategy: To Interrelate

To interrelate in QDA is to propose connections within, between, and among the constituent elements of analyzed data.

One task of QDA is to explore the ways our patterns and categories interact and interplay. I use these terms to suggest the qualitative equivalent of statistical correlation, but interaction and interplay are much more than a simple relationship. They imply interrelationship. Interaction refers to reverberative connections: for example, how one or more categories might influence and affect the others, how categories operate concurrently, or whether there is some kind of “domino” effect to them. Interplay refers to the structural and processual nature of categories: for example, whether some type of sequential order, hierarchy, or taxonomy exists; whether any overlaps occur; whether there is superordinate and subordinate arrangement; and what types of organizational frameworks or networks might exist among them. The positivist construct of “cause and effect” becomes influences and affects in QDA.

There can even be patterns of patterns and categories of categories if your mind thinks conceptually and abstractly enough. Our minds can intricately connect multiple phenomena but only if the data and their analyses support the constructions. We can speculate about interaction and interplay all we want, but it is only through a more systematic investigation of the data—in other words, good thinking—that we can plausibly establish any possible interrelationships.

QDA Strategy: To Reason

To reason in QDA is to think in ways that lead to causal probabilities, summative findings, and evaluative conclusions.

Unlike quantitative research, with its statistical formulas and established hypothesis-testing protocols, qualitative research has no standardized methods of data analysis. Rest assured, there are recommended guidelines from the field’s scholars and a legacy of analytic strategies from which to draw. But the primary heuristics (or methods of discovery) you apply during a study are deductive, inductive, abductive, and retroductive reasoning. Deduction is what we generally draw and conclude from established facts and evidence. Induction is what we experientially explore and infer to be transferable from the particular to the general, based on an examination of the evidence and an accumulation of knowledge. Abduction is surmising from the evidence that which is most likely, those explanatory hunches based on clues. “Whereas deductive inferences are certain (so long as their premises are true) and inductive inferences are probable, abductive inferences are merely plausible” (Shank, 2008, p. 1). Retroduction is historic reconstruction, working backwards to figure out how the current conditions came to exist.

It is not always necessary to know the names of these four ways of reasoning as you proceed through analysis. In fact, you will more than likely reverberate quickly from one to another depending on the task at hand. But what is important to remember about reasoning is:

to base your conclusions primarily on the participants’ experiences, not just your own

not to take the obvious for granted, as the expected won't always happen. Your hunches can be quite right and, at other times, quite wrong

to examine the evidence carefully and make reasonable inferences

to logically yet imaginatively think about what is going on and how it all comes together.

Futurists and inventors propose three questions when they think about creating new visions for the world: What is possible (induction)? What is plausible (abduction)? What is preferable (deduction)? These same three questions might be posed as you proceed through QDA and particularly through analytic memo writing, which is retroductive reflection on your analytic work thus far.

QDA Strategy: To Memo

To memo in QDA is to reflect in writing on the nuances, inferences, meanings, and transfer of coded and categorized data plus your analytic processes.

As with field note writing, perspectives vary among practitioners as to the methods for documenting the researcher’s analytic insights and subjective experiences. Some advise that such reflections should be included in field notes as relevant to the data. Others advise that a separate researcher’s journal should be maintained for recording these impressions. And still others advise that these thoughts be documented as separate analytic memos. I prescribe the last of these as a method because an analytic memo is generated by and directly connected to the data themselves.

An analytic memo is a “think piece” of reflexive free writing, a narrative that sets in words your interpretations of the data. Coding and categorizing are heuristics to detect some of the possible patterns and interrelationships at work within the corpus, and an analytic memo further articulates your deductive, inductive, abductive, and retroductive thinking processes on what things may mean. Though the metaphor is a bit flawed and limiting, think of codes and their consequent categories as separate jigsaw puzzle pieces, and their integration into an analytic memo as the trial assembly of the complete picture.

What follows is an example of an analytic memo based on the earlier process coded and categorized interview transcript. It is not intended as the final write-up for a publication but as an open-ended reflection on the phenomena and processes suggested by the data and their analysis thus far. As the study proceeds, however, initial and substantive analytic memos can be revisited and revised for eventual integration into the final report. Note how the memo is dated and given a title for future and further categorization, how participant quotes are occasionally included for evidentiary support, and how the category names are bolded and the codes kept in capital letters to show how they integrate or weave into the thinking:

March 18, 2012

EMERGENT CATEGORIES: A STRATEGIC AMALGAM

There’s a popular saying now: “Smart is the new rich.” This participant is Thinking Strategically about his spending through such tactics as THINKING TWICE and QUESTIONING A PURCHASE before he decides to invest in a product. There’s a heightened awareness of both immediate trends and forthcoming economic bad news that positively affects his Spending Strategically. However, he seems unaware that there are even more ways of LIVING CHEAPLY by FINDING ALTERNATIVES. He dines at all-you-can-eat restaurants as a way of STOCKING UP on meals, but doesn’t state that he could bring lunch from home to work, possibly saving even more money. One of his “bad habits” is cigarettes, which he refuses to give up; but he doesn’t seem to realize that by quitting smoking he could save even more money, not to mention possible health care costs. He balks at the idea of paying $1.50 for a soft drink, but doesn’t mind paying $6.00–$7.00 for a pack of cigarettes. Penny-wise and pound-foolish. Addictions skew priorities. Living Strategically, for this participant during “scary times,” appears to be a combination of PRIORITIZING those things which cannot be helped, such as pet care and personal dental care; REFUSING SACRIFICE for maintaining personal creature-comforts; and FINDING ALTERNATIVES to high costs and excessive spending. Living Strategically is an amalgam of thinking and action-oriented strategies.

There are several recommended topics for analytic memo writing throughout the qualitative study. Memos are opportunities to reflect on and write about:

how you personally relate to the participants and/or the phenomenon

your study’s research questions

your code choices and their operational definitions

the emergent patterns, categories, themes, assertions, and concepts

the possible networks (links, connections, overlaps, flows) among the codes, patterns, categories, themes, assertions, and concepts

an emergent or related existent theory

any problems with the study

any personal or ethical dilemmas with the study

future directions for the study

the analytic memos generated thus far [labeled “metamemos”]

the final report for the study [adapted from Saldaña, 2013, p. 49]

Since writing is analysis, analytic memos expand on the inferential meanings of the truncated codes and categories as a transitional stage into a more coherent narrative with hopefully rich social insight.
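If memos are stored electronically, the features noted earlier (a date, a title, and codes woven into the prose) lend themselves to a small record structure. The sketch below is one possible layout, not a prescribed format: the `AnalyticMemo` class and `codes_mentioned` helper are hypothetical names, and the body text is a single sentence excerpted from the example memo dated March 18, 2012.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AnalyticMemo:
    """A dated, titled piece of reflective writing (hypothetical structure)."""
    written_on: date
    title: str
    body: str


def codes_mentioned(memo, codebook):
    """Return the codebook entries that appear verbatim in the memo body."""
    return [code for code in codebook if code in memo.body]


memo = AnalyticMemo(
    written_on=date(2012, 3, 18),
    title="EMERGENT CATEGORIES: A STRATEGIC AMALGAM",
    body=(
        "This participant is Thinking Strategically about his spending "
        "through such tactics as THINKING TWICE and QUESTIONING A PURCHASE "
        "before he decides to invest in a product."
    ),
)

codebook = ["BUYING BARGAINS", "QUESTIONING A PURCHASE", "THINKING TWICE"]
print(codes_mentioned(memo, codebook))
```

Matching on exact capitalized strings works here only because codes are written in capital letters inside the memo; a different house style would need a different lookup.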

QDA Strategy: To Code—A Different Way

The first example of coding illustrated process coding, a way of exploring general social action among humans. But sometimes a researcher works with an individual case study whose language is unique, or with someone the researcher wishes to honor by maintaining the authenticity of his or her speech in the analysis. These reasons suggest that a more participant-centered form of coding may be more appropriate.

In Vivo Coding

A second frequently applied method of coding is called in vivo coding. The root meaning of “in vivo” is “in that which is alive” and refers to a code based on the actual language used by the participant (Strauss, 1987). The words or phrases in the data record that you select as codes are those that seem to stand out as significant or summative of what is being said.

Using the same transcript of the male participant living in difficult economic times, in vivo codes are listed in the right-hand column. I recommend that in vivo codes be placed in quotation marks as a way of designating that the code is extracted directly from the data record. Note that instead of fifteen codes generated from process coding, the total number of in vivo codes is thirty. This is not to suggest that there should be specific numbers or ranges of codes used for particular methods. In vivo codes, though, tend to be applied more frequently to data. Again, the interviewer’s questions and prompts are not coded, just the participant's responses:

The thirty in vivo codes are then extracted from the transcript and listed in the order they appear to prepare them for analytic action and reflection:

“SKYROCKETED”

“TWO-FOR-ONE”

“THE LITTLE THINGS”

“THINK TWICE”

“ALL-YOU-CAN-EAT”

“CHEAP AND FILLING”

“BAD HABITS”

“DON'T REALLY NEED”

“LIVED KIND OF CHEAP”

“NOT A BIG SPENDER”

“HAVEN'T CHANGED MY HABITS”

“NOT PUTTING AS MUCH INTO SAVINGS”

“SPENDING MORE”

“ANOTHER DING IN MY WALLET”

“HIGH MAINTENANCE”

“COUPLE OF THOUSAND”

“INSURANCE IS JUST WORTHLESS”

“PICK UP THE TAB”

“IT ALL ADDS UP”

“NOT AS BAD OFF”

“SCARY TIMES”

Even though no systematic reorganization or categorization has been conducted with the codes thus far, an analytic memo of first impressions can still be composed:

March 19, 2012

CODE CHOICES: THE EVERYDAY LANGUAGE OF ECONOMICS

After eyeballing the in vivo codes list, I noticed that variants of “CHEAP” appear most often. I recall a running joke between me and a friend of mine when we were shopping for sales. We’d say, “We're not ‘cheap,’ we're frugal.” There’s no formal economic or business language in this transcript (no terms such as “recession” or “downsizing”), just the everyday language of one person trying to cope during “SCARY TIMES” with “ANOTHER DING IN MY WALLET.” The participant notes that he’s always “LIVED KIND OF CHEAP” and is “NOT A BIG SPENDER” and, due to his employment, “NOT AS BAD OFF” as others in the country. Yet even with his middle class status, he’s still feeling the monetary pinch, dining at inexpensive “ALL-YOU-CAN-EAT” restaurants and worried about the rising price of peanut butter, observing that he’s “NOT PUTTING AS MUCH INTO SAVINGS” as he used to. Of all the codes, “ANOTHER DING IN MY WALLET” stands out to me, particularly because on the audio recording he sounded bitter and frustrated. It seems that he’s so concerned about “THE LITTLE THINGS” because of high veterinary and dental charges. The only way to cope with a “COUPLE OF THOUSAND” dollars’ worth of medical expenses is to find ways of trimming the excess in everyday facets of living: “IT ALL ADDS UP.”
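The "eyeballing" described in the memo can be supplemented with a simple search across the code list. What follows is a rough sketch, assuming the codes are stored as plain strings; `codes_containing` is a hypothetical helper. Only the in vivo codes actually listed in this article are included, so the result covers that subset rather than the full set of thirty.

```python
# In vivo codes as listed in this article (a subset of the thirty mentioned).
IN_VIVO_CODES = [
    "SKYROCKETED",
    "TWO-FOR-ONE",
    "THE LITTLE THINGS",
    "THINK TWICE",
    "ALL-YOU-CAN-EAT",
    "CHEAP AND FILLING",
    "BAD HABITS",
    "DON'T REALLY NEED",
    "LIVED KIND OF CHEAP",
    "NOT A BIG SPENDER",
    "HAVEN'T CHANGED MY HABITS",
    "NOT PUTTING AS MUCH INTO SAVINGS",
    "SPENDING MORE",
    "ANOTHER DING IN MY WALLET",
    "HIGH MAINTENANCE",
    "COUPLE OF THOUSAND",
    "INSURANCE IS JUST WORTHLESS",
    "PICK UP THE TAB",
    "IT ALL ADDS UP",
    "NOT AS BAD OFF",
    "SCARY TIMES",
]


def codes_containing(term, codes):
    """Return every code that contains the search term, ignoring case."""
    term = term.upper()
    return [code for code in codes if term in code.upper()]


cheap_codes = codes_containing("cheap", IN_VIVO_CODES)
print(cheap_codes)
```

A keyword search of this kind points toward a possible pattern; whether the pattern matters is still a judgment made in the memo.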

Like process coding, in vivo codes could be clustered into similar categories, but another simple data analytic strategy is also possible.

QDA Strategy: To Outline

To outline in QDA is to hierarchically, processually, and/or temporally assemble such things as codes, categories, themes, assertions, and concepts into a coherent, text-based display.

Traditional outlining formats and content provide not only templates for writing a report but templates for analytic organization. This principle can be found in several CAQDAS (Computer Assisted Qualitative Data Analysis Software) programs through their use of such functions as “hierarchies,” “trees,” and “nodes,” for example. Basic outlining is simply a way of arranging primary, secondary, and sub-secondary items into a patterned display. For example, an organized listing of things in a home might consist of:

Large appliances

Refrigerator

Stove-top oven

Microwave oven

Small appliances

Coffee maker

Dining room

In QDA, outlining may include descriptive nouns or topics but, depending on the study, it may also involve processes or phenomena in extended passages, such as in vivo codes or themes.
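The household listing can also be held as nested data and printed as an indented outline, which is roughly what the "trees" in CAQDAS programs do. This is a minimal sketch: `render_outline` is a hypothetical helper, and the nesting of appliances under their headings is an assumption about the intended hierarchy.

```python
def render_outline(items, depth=0, indent="    "):
    """Turn nested (label, children) pairs into indented outline lines."""
    lines = []
    for label, children in items:
        lines.append(f"{indent * depth}{label}")
        lines.extend(render_outline(children, depth + 1, indent))
    return lines


# Assumed nesting: each appliance sits under its heading.
HOME = [
    ("Large appliances", [
        ("Refrigerator", []),
        ("Stove-top oven", []),
        ("Microwave oven", []),
    ]),
    ("Small appliances", [
        ("Coffee maker", []),
    ]),
]

outline_lines = render_outline(HOME)
print("\n".join(outline_lines))
```

Moving an item to a different parent in the data and printing again serves the same purpose as cutting and pasting on a word processor page.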

The complexity of what we learn in the field can be overwhelming, and outlining is a way of organizing and ordering that complexity so that it does not become complicated. The cut-and-paste and tab functions of a word processor page enable you to arrange and rearrange the salient items from your preliminary coded analytic work into a more streamlined flow. By no means do I suggest that the intricate messiness of life can always be organized into neatly formatted arrangements, but outlining is an analytic act that stimulates deep reflection on both the interconnectedness and interrelationships of what we study. As an example, here are the thirty in vivo codes generated from the initial transcript analysis, arranged in such a way as to construct five major categories:

“DON’T REALLY NEED”

“HAVEN’T CHANGED MY HABITS”

Now that the codes have been rearranged into an outline format, an analytic memo is composed to expand on the rationale and constructed meanings in progress:

March 19, 2012

NETWORKS: EMERGENT CATEGORIES

The five major categories I constructed from the in vivo codes are: “SCARY TIMES,” “PRIORITY,” “ANOTHER DING IN MY WALLET,” “THE LITTLE THINGS,” and “LIVED KIND OF CHEAP.” One of the things that hit me today was that the reason he may be pinching pennies on smaller purchases is that he cannot control the larger ones he has to deal with. Perhaps the only way we can cope with or seem to have some sense of agency over major expenses is to cut back on the smaller ones that we can control. $1,000 for a dental bill? Skip lunch for a few days a week. Insulin medication to buy for a pet? Don’t buy a soft drink from a vending machine. Using this reasoning, let me try to interrelate and weave the categories together as they relate to this particular participant: During these scary economic times, he prioritizes his spending because there seems to be just one ding after another to his wallet. A general lifestyle of living cheaply and keeping an eye out for how to save money on the little things compensates for those major expenses beyond his control.

QDA Strategy: To Code—In Even More Ways

The process and in vivo coding examples thus far have demonstrated only two specific methods of thirty-two documented approaches (Saldaña, 2013). Which one(s) you choose for your analysis depends on such factors as your conceptual framework, the genre of qualitative research for your project, the types of data you collect, and so on. The following sections present a few other approaches available for coding qualitative data that you may find useful as starting points.

Descriptive Coding

Descriptive codes are primarily nouns that simply summarize the topic of a datum. This coding approach is particularly useful when you have different types of data gathered for one study, such as interview transcripts, field notes, documents, and visual materials such as photographs. Descriptive codes not only help categorize but also index the data corpus’ basic contents for further analytic work. An example of an interview portion coded descriptively, taken from the participant living in tough economic times, follows to illustrate how the same data can be coded in multiple ways:

For initial analysis, descriptive codes are clustered into similar categories to detect such patterns as frequency (i.e., categories with the largest number of codes), interrelationship (i.e., categories that seem to connect in some way), and initial work for grounded theory development.

Values Coding

Values coding identifies the values, attitudes, and beliefs of a participant, as shared by the individual and/or interpreted by the analyst. This coding method infers the “heart and mind” of an individual or group’s worldview as to what is important, perceived as true, maintained as opinion, and felt strongly. The three constructs are coded separately but are part of a complex interconnected system.

Briefly, a value (V) is what we attribute as important, be it a person, thing, or idea. An attitude (A) is the evaluative way we think and feel about ourselves, others, things, or ideas. A belief (B) is what we think and feel as true or necessary, formed from our “personal knowledge, experiences, opinions, prejudices, morals, and other interpretive perceptions of the social world” (Saldaña, 2009, pp. 89–90). Values coding explores intrapersonal, interpersonal, and cultural constructs or ethos. It is an admittedly slippery task to code this way, for it is sometimes difficult to discern what is a value, attitude, or belief because they are intricately interrelated. But the depth you can potentially obtain is rich. An example of values coding follows:

For analysis, categorize the codes for each of the three different constructs together (i.e., all values in one group, attitudes in a second group, and beliefs in a third group). Analytic memo writing about the patterns and possible interrelationships may reveal a more detailed and intricate worldview of the participant.

Dramaturgical Coding

Dramaturgical coding perceives life as performance and its participants as characters in a social drama. Codes are assigned to the data (i.e., a “play script”) that analyze the characters in action, reaction, and interaction. Dramaturgical coding of participants examines their objectives (OBJ) or wants, needs, and motives; the conflicts (CON) or obstacles they face as they try to achieve their objectives; the tactics (TAC) or strategies they employ to reach their objectives; their attitudes (ATT) toward others and their given circumstances; the particular emotions (EMO) they experience throughout; and their subtexts (SUB) or underlying and unspoken thoughts. The following is an example of dramaturgically coded data:

Not included in this particular interview excerpt are the emotions the participant may have experienced or talked about. His later line, “that’s another ding in my wallet,” would have been coded EMO: BITTER. A reader may not have inferred that specific emotion from seeing the line in print. But the interviewer, present during the event and listening carefully to the audio recording during transcription, noted that feeling in his tone of voice.

For analysis, group similar codes together (e.g., all objectives in one group, all conflicts in another group, all tactics in a third group), or string together chains of how participants deal with their circumstances to overcome their obstacles through tactics (e.g., OBJ: SAVING MEAL MONEY > TAC: SKIPPING MEALS). Explore how the individuals or groups manage problem solving in their daily lives. Dramaturgical coding is particularly useful as preliminary work for narrative inquiry story development or arts-based research representations such as performance ethnography.
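Grouping by construct is easy to automate once codes carry their three-letter prefix. The sketch below assumes each code is written as "PREFIX: LABEL", as in the examples in this section; `group_by_prefix` is a hypothetical helper, and the three sample codes are the only ones quoted in the text.

```python
from collections import defaultdict

# The six dramaturgical constructs and their abbreviations, as given in the text.
DRAMATURGICAL_LABELS = {
    "OBJ": "objectives",
    "CON": "conflicts",
    "TAC": "tactics",
    "ATT": "attitudes",
    "EMO": "emotions",
    "SUB": "subtexts",
}


def group_by_prefix(codes):
    """Group 'PREFIX: LABEL' codes under their prefix; reject malformed codes."""
    groups = defaultdict(list)
    for code in codes:
        prefix, _, label = code.partition(":")
        prefix, label = prefix.strip(), label.strip()
        if prefix not in DRAMATURGICAL_LABELS or not label:
            raise ValueError(f"not a dramaturgical code: {code!r}")
        groups[prefix].append(label)
    return dict(groups)


sample_codes = ["OBJ: SAVING MEAL MONEY", "TAC: SKIPPING MEALS", "EMO: BITTER"]
grouped = group_by_prefix(sample_codes)
for prefix, labels in grouped.items():
    print(DRAMATURGICAL_LABELS[prefix], "->", labels)
```

Stringing objectives to tactics, as in the chain example, would still be done by hand, since that link depends on reading the passage rather than on the prefixes.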

Versus Coding

Versus coding identifies the conflicts, struggles, and power issues observed in social action, reaction, and interaction as an X VS. Y code, such as: MEN VS. WOMEN, CONSERVATIVES VS. LIBERALS, FAITH VS. LOGIC, and so on. Conflicts are rarely this dichotomous. They are typically nuanced and much more complex. But humans tend to perceive these struggles with an US VS. THEM mindset. The codes can range from the observable to the conceptual and can be applied to data that show humans in tension with others, themselves, or ideologies.

What follows are examples of versus codes applied to the case study participant’s descriptions of his major medical expenses:

As an initial analytic tactic, group the versus codes into one of three categories: the Stakeholders, their Perceptions and/or Actions, and the Issues at stake. Examine how the three interrelate and identify the central ideological conflict at work as an X vs. Y category. Analytic memos and the final write-up can detail the nuances of the issues.
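Before sorting versus codes into the three categories, it can help to split each code into its two sides so that each side can be considered on its own. This is a small sketch under the assumption that every code follows the "X VS. Y" pattern exactly; `split_versus` is a hypothetical helper, and the codes are the generic examples given earlier in this section.

```python
def split_versus(code):
    """Split an 'X VS. Y' code into its two opposing sides."""
    left, separator, right = code.partition(" VS. ")
    left, right = left.strip(), right.strip()
    if not separator or not left or not right:
        raise ValueError(f"not a versus code: {code!r}")
    return left, right


VERSUS_CODES = [
    "MEN VS. WOMEN",
    "CONSERVATIVES VS. LIBERALS",
    "FAITH VS. LOGIC",
    "US VS. THEM",
]

pairs = [split_versus(code) for code in VERSUS_CODES]
for side_x, side_y in pairs:
    print(f"{side_x}  <->  {side_y}")
```

Deciding whether a given side is a stakeholder, a perception or action, or an issue is left to the analyst; the split only lays the material out.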

Remember that what has been profiled in this section is a broad brushstroke description of just a few basic coding processes, several of which can be compatibly “mixed and matched” within a single analysis (see Saldaña’s [2013] The Coding Manual for Qualitative Researchers for a complete discussion). Certainly with additional data, more in-depth analysis can occur, but coding is only one approach to extracting and constructing preliminary meanings from the data corpus. What now follows are additional methods for qualitative analysis.

QDA Strategy: To Theme

To theme in QDA is to construct summative, phenomenological meanings from data through extended passages of text.

Unlike codes, which are most often single words or short phrases that symbolically represent a datum, themes are extended phrases or sentences that summarize the manifest (apparent) and latent (underlying) meanings of data (Auerbach & Silverstein, 2003; Boyatzis, 1998). Themes, intended to represent the essences and essentials of humans’ lived experiences, can also be categorized or listed in superordinate and subordinate outline formats as an analytic tactic.

Below is the interview transcript example used in the coding sections above. (Hopefully you are not too fatigued at this point with the transcript, but it’s important to know how inquiry with the same data set can be approached in several different ways.) During the investigation of the ways middle-class Americans are influenced and affected by the current (2008–2012) economic recession, the researcher noticed that participants’ stories exhibited facets of what he labeled “economic intelligence” or EI (based on the formerly developed theories of Howard Gardner’s multiple intelligences and Daniel Goleman’s emotional intelligence). Notice how themeing interprets what is happening through the use of two distinct phrases—ECONOMIC INTELLIGENCE IS (i.e., manifest or apparent meanings) and ECONOMIC INTELLIGENCE MEANS (i.e., latent or underlying meanings):

Unlike the fifteen process codes and thirty in vivo codes in the previous examples, there are now fourteen themes to work with. In the order they appear, they are:

EI IS TAKING ADVANTAGE OF UNEXPECTED OPPORTUNITY

EI MEANS THINKING BEFORE YOU ACT

EI IS BUYING CHEAP

EI MEANS SACRIFICE

EI IS SAVING A FEW DOLLARS NOW AND THEN

EI MEANS KNOWING YOUR FLAWS

EI IS SETTING PRIORITIES

EI IS FINDING CHEAPER FORMS OF ENTERTAINMENT

EI MEANS LIVING AN INEXPENSIVE LIFESTYLE

EI IS NOTICING PERSONAL AND NATIONAL ECONOMIC TRENDS

EI MEANS YOU CANNOT CONTROL EVERYTHING

EI IS TAKING CARE OF ONE’S OWN HEALTH

EI MEANS KNOWING YOUR LUCK

There are several ways to categorize the themes as preparation for analytic memo writing. The first is to arrange them in outline format with superordinate and subordinate levels, based on how the themes seem to take organizational shape and structure. Simply cutting and pasting the themes in multiple arrangements on a word processor page eventually develops a sense of order to them. For example:

A second approach is to categorize the themes into similar clusters and to develop different category labels or theoretical constructs. A theoretical construct is an abstraction that transforms the central phenomenon’s themes into broader applications but can still use “is” and “means” as prompts to capture the bigger picture at work:

Theoretical Construct 1: EI Means Knowing the Unfortunate Present

Supporting Themes:

Theoretical Construct 2: EI is Cultivating a Small Fortune

Theoretical Construct 3: EI Means a Fortunate Future

What follows is an analytic memo generated from the cut-and-paste arrangement of themes into an outline and into theoretical constructs:

March 19, 2012

EMERGENT THEMES: FORTUNE/FORTUNATELY/UNFORTUNATELY

I first reorganized the themes by listing them in two groups: “is” and “means.” The “is” statements seemed to contain positive actions and constructive strategies for economic intelligence. The “means” statements held primarily a sense of caution and restriction with a touch of negativity thrown in. The first outline with two major themes, LIVING AN INEXPENSIVE LIFESTYLE and YOU CANNOT CONTROL EVERYTHING also had this same tone. This reminded me of the old children’s picture book, Fortunately/Unfortunately, and the themes of “fortune” as a motif for the three theoretical constructs came to mind. Knowing the Unfortunate Present means knowing what’s (most) important and what’s (mostly) uncontrollable in one’s personal economic life. Cultivating a Small Fortune consists of those small money-saving actions that, over time, become part of one’s lifestyle. A Fortunate Future consists of heightened awareness of trends and opportunities at micro and macro levels, with the understanding that health matters can idiosyncratically affect one’s fortune. These three constructs comprise this particular individual’s EI—economic intelligence.
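The first step described in the memo, sorting the themes into “is” and “means” groups, is mechanical enough to sketch in code. The Python snippet below is only an illustration, assuming the themes are stored as plain strings exactly as they were listed earlier; the function name `group_by_prompt` is hypothetical. Naming and interpreting the theoretical constructs remains the researcher’s job.

```python
from collections import defaultdict

# Themes copied from the list earlier in this section.
THEMES = [
    "EI IS TAKING ADVANTAGE OF UNEXPECTED OPPORTUNITY",
    "EI MEANS THINKING BEFORE YOU ACT",
    "EI IS BUYING CHEAP",
    "EI MEANS SACRIFICE",
    "EI IS SAVING A FEW DOLLARS NOW AND THEN",
    "EI MEANS KNOWING YOUR FLAWS",
    "EI IS SETTING PRIORITIES",
    "EI IS FINDING CHEAPER FORMS OF ENTERTAINMENT",
    "EI MEANS LIVING AN INEXPENSIVE LIFESTYLE",
    "EI IS NOTICING PERSONAL AND NATIONAL ECONOMIC TRENDS",
    "EI MEANS YOU CANNOT CONTROL EVERYTHING",
    "EI IS TAKING CARE OF ONE’S OWN HEALTH",
    "EI MEANS KNOWING YOUR LUCK",
]


def group_by_prompt(themes):
    """Split themes into 'IS' and 'MEANS' groups (hypothetical helper)."""
    groups = defaultdict(list)
    for theme in themes:
        # Each theme has the form "EI IS ..." or "EI MEANS ...".
        _, prompt, rest = theme.split(" ", 2)
        groups[prompt].append(rest)
    return dict(groups)


groups = group_by_prompt(THEMES)
for prompt, items in groups.items():
    print(f"EI {prompt}:")
    for item in items:
        print(f"  - {item}")
```

Seeing the two lists side by side is what allows the contrast in tone between the “is” and “means” statements to become visible; the code does the cutting and pasting, not the noticing.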

Again, keep in mind that the examples above for coding and themeing were from one small interview transcript excerpt. The number of codes and their categorization would obviously increase, given a longer interview and/or multiple interviews to analyze. But the same basic principles apply: codes and themes relegated into patterned and categorized forms are heuristics—stimuli for good thinking through the analytic memo-writing process on how everything plausibly interrelates. Methodologists vary in the number of recommended final categories that result from analysis, ranging anywhere from three to seven, with traditional grounded theorists prescribing one central or core category from coded work.

QDA Strategy: To Assert

To assert in QDA is to put forward statements that summarize particular fieldwork and analytic observations that the researcher believes credibly represent and transcend the experiences.

Educational anthropologist Frederick Erickson (1986) wrote a significant and influential chapter on qualitative methods that outlined heuristics for assertion development. Assertions are declarative statements of summative synthesis, supported by confirming evidence from the data, and revised when disconfirming evidence or discrepant cases require modification of the assertions. These summative statements are generated from an interpretive review of the data corpus and then supported and illustrated through narrative vignettes—reconstructed stories from field notes, interview transcripts, or other data sources that provide a vivid profile as part of the evidentiary warrant.

Coding or themeing data can certainly precede assertion development as a way of gaining intimate familiarity with the data, but Erickson’s methods are a more admittedly intuitive yet systematic heuristic for analysis. Erickson promotes analytic induction and exploration of and inferences about the data, based on an examination of the evidence and an accumulation of knowledge. The goal is not to look for “proof” to support the assertions but plausibility of inference-laden observations about the local and particular social world under investigation.

Assertion development is the writing of general statements, plus subordinate yet related ones called subassertions, and a major statement called a key assertion that represents the totality of the data. One also looks for key linkages between them, meaning that the key assertion links to its related assertions, which then link to their respective subassertions. Subassertions can include particulars about any discrepant related cases or specify components of their parent assertions.

Excerpts from the interview transcript of our case study will be used to illustrate assertion development at work. By now, you should be quite familiar with the contents, so I will proceed directly to the analytic example. First, there is a series of thematically related statements the participant makes:

“Buy one package of chicken, get the second one free. Now that was a bargain. And I got some.”

“With Sweet Tomatoes I get those coupons for a few bucks off for lunch, so that really helps.”

“I don’t go to movies anymore. I rent DVDs from Netflix or Redbox or watch movies online—so much cheaper than paying over ten or twelve bucks for a movie ticket.”

Assertions can be categorized into low-level and high-level inferences. Low-level inferences address and summarize “what is happening” within the particulars of the case or field site—the “micro.” High-level inferences extend beyond the particulars to speculate on “what it means” in the more general social scheme of things—the “meso” or “macro.” A reasonable low-level assertion about the three statements above collectively might read: The participant finds several small ways to save money during a difficult economic period. A high-level inference that transcends the case to the macro level might read: Selected businesses provide alternatives and opportunities to buy products and services at reduced rates during a recession to maintain consumer spending.
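For readers who keep their analysis in digital form, the relationship between an assertion, its level of inference, and its supporting or discrepant evidence can be sketched as a small data structure. The Python below is one possible arrangement, not a method Erickson prescribes; the class and field names are hypothetical, and the evidence entries are shorthand labels for the three participant statements rather than the quotations themselves.

```python
from dataclasses import dataclass, field


@dataclass
class Assertion:
    """A declarative statement plus the evidence bearing on it (hypothetical structure)."""
    text: str
    level: str  # "low" = what is happening; "high" = what it means
    confirming: list = field(default_factory=list)
    discrepant: list = field(default_factory=list)

    def is_warranted(self):
        # An assertion needs at least one concrete instance behind it.
        return len(self.confirming) > 0

    def needs_revision(self):
        # Discrepant cases call for modifying the assertion.
        return len(self.discrepant) > 0


# Shorthand labels for the three statements quoted above.
evidence = [
    "two-for-one chicken bargain",
    "Sweet Tomatoes lunch coupons",
    "renting or streaming movies instead of buying tickets",
]

low_level = Assertion(
    text="The participant finds several small ways to save money "
         "during a difficult economic period.",
    level="low",
    confirming=evidence,
)

high_level = Assertion(
    text="Selected businesses provide alternatives and opportunities to buy "
         "products and services at reduced rates during a recession to "
         "maintain consumer spending.",
    level="high",
    confirming=evidence,
)

for assertion in (low_level, high_level):
    print(assertion.level, assertion.is_warranted(), assertion.needs_revision())
```

Keeping confirming and discrepant instances attached to each statement makes it easier to see at a glance which assertions rest on the data and which still need modification.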

Assertions are instantiated (i.e., supported) by concrete instances of action or participant testimony, whose patterns lead to more general description outside the specific field site. The author’s interpretive commentary can be interspersed throughout the report, but the assertions should be supported with the evidentiary warrant. A few assertions and subassertions based on the case interview transcript might read (and notice how high-level assertions serve as the paragraphs’ topic sentences):

Selected businesses provide alternatives and opportunities to buy products and services at reduced rates during a recession to maintain consumer spending. Restaurants, for example, need to find ways during difficult economic periods when potential customers may be opting to eat inexpensively at home rather than spending more money by dining out. Special offers can motivate cash-strapped clientele to patronize restaurants more frequently. An adult male dealing with such major expenses as underinsured dental care offers: “With Sweet Tomatoes I get those coupons for a few bucks off for lunch, so that really helps.” The film and video industries also seem to be suffering from a double-whammy during the current recession: less consumer spending on higher-priced entertainment, resulting in a reduced rate of movie theatre attendance (currently 39 percent of the American population, according to CNN); coupled with a media technology and business revolution that provides consumers less costly alternatives through video rentals and internet viewing: “I don’t go to movies anymore. I rent DVDs from Netflix or Redbox or watch movies online—so much cheaper than paying over ten or twelve bucks for a movie ticket.”

“Particularizability”—the search for specific and unique dimensions of action at a site and/or the specific and unique perspectives of an individual participant—is not intended to filter out trivial excess but to magnify the salient characteristics of local meaning. Although generalizable knowledge serves little purpose in qualitative inquiry since each naturalistic setting will contain its own unique set of social and cultural conditions, there will be some aspects of social action that are plausibly universal or “generic” across settings and perhaps even across time. To work toward this, Erickson advocates that the interpretive researcher look for “concrete universals” by studying actions at a particular site in detail, then comparing those to other sites that have also been studied in detail. The exhibit or display of these generalizable features is to provide a synoptic representation, or a view of the whole. What the researcher attempts to uncover is what is both particular and general at the site of interest, preferably from the perspective of the participants. It is from the detailed analysis of actions at a specific site that these universals can be concretely discerned, rather than abstractly constructed as in grounded theory.

In sum, assertion development is a qualitative data analytic strategy that relies on the researcher’s intense review of interview transcripts, field notes, documents, and other data to inductively formulate composite statements that credibly summarize and interpret participant actions and meanings, and their possible representation of and transfer into broader social contexts and issues.

QDA Strategy: To Display

To display in QDA is to visually present the processes and dynamics of human or conceptual action represented in the data.

Qualitative researchers use not only language but illustrations to both analyze and display the phenomena and processes at work in the data. Tables, charts, matrices, flow diagrams, and other models help both you and your readers cognitively and conceptually grasp the essence and essentials of your findings. As you have seen thus far, even simple outlining of codes, categories, and themes is one visual tactic for organizing the scope of the data. Rich text, font, and format features such as italicizing, bolding, capitalizing, indenting, and bullet pointing provide simple emphasis to selected words and phrases within the longer narrative.

“Think display” was a phrase coined by methodologists Miles and Huberman (1994) to encourage the researcher to think visually as data were collected and analyzed. The magnitude of text can be essentialized into graphics for “at-a-glance” review. Bins in various shapes and lines of various thicknesses, along with arrows suggesting pathways and direction, render the study as a portrait of action. Bins can include the names of codes, categories, concepts, processes, key participants, and/or groups.

As a simple example, Figure 28.1 illustrates the three categories’ interrelationship derived from process coding. It displays what could be the apex of this interaction, LIVING STRATEGICALLY, and its connections to THINKING STRATEGICALLY, which influences and affects SPENDING STRATEGICALLY.

Figure 28.2 represents a slightly more complex (if not playful) model, based on the five major in vivo codes/categories generated from analysis. The graphic is used as a way of initially exploring the interrelationship and flow from one category to another. The use of different font styles, font sizes, and line and arrow thicknesses are intended to suggest the visual qualities of the participant’s language and his dilemmas—a way of heightening in vivo coding even further.

Accompanying graphics are not always necessary for a qualitative report. They can be very helpful for the researcher during the analytic stage as a heuristic for exploring how major ideas interrelate, but illustrations are generally included in published work when they will help supplement and clarify complex processes for readers. Photographs of the field setting or the participants (and only with their written permission) also provide evidentiary reality to the write-up and help your readers get a sense of being there.

QDA Strategy: To Narrate

To narrate in QDA is to create an evocative literary representation and presentation of the data in the form of creative nonfiction.

All research reports are stories of one kind or another. But there is yet another approach to QDA that intentionally documents the research experience as story, in its traditional literary sense. Narrative inquiry plots and story lines the participant’s experiences into what might be initially perceived as a fictional short story or novel. But the story is carefully crafted and creatively written to provide readers with an almost omniscient perspective about the participants’ worldview. The transformation of the corpus from database to creative nonfiction ranges from systematic transcript analysis to open ended literary composition. The narrative, though, should be solidly grounded in and emerge from the data as a plausible rendering of social life.

Figure 28.1. A simple illustration of category interrelationship.

Figure 28.2. An illustration with rich text and artistic features.

The following is a narrative vignette based on interview transcript selections from the participant living through tough economic times:

Jack stood in front of the soft drink vending machine at work and looked almost worriedly at the selections. With both hands in his pants pockets, his fingers jingled the few coins he had inside them as he contemplated whether he could afford the purchase. One dollar and fifty cents for a twenty-ounce bottle of Diet Coke. One dollar and fifty cents. “I can practically get a two-liter bottle for that same price at the grocery store,” he thought. Then Jack remembered the upcoming dental surgery he needed—that would cost one thousand dollars—and the bottle of insulin and syringes he needed to buy for his diabetic, “high maintenance” cat—about one hundred and twenty dollars. He sighed, took his hands out of his pockets, and walked away from the vending machine. He was skipping lunch that day anyway so he could stock up on dinner later at the cheap-but-filling-all-you-can-eat Chinese buffet. He could get his Diet Coke there.

Narrative inquiry representations, like literature, vary in tone, style, and point of view. The common goal, however, is to create an evocative portrait of participants through the aesthetic power of literary form. A story does not always have to have a moral explicitly stated by its author. The reader reflects on personal meanings derived from the piece and how the specific tale relates to one’s self and the social world.

QDA Strategy: To Poeticize

To poeticize in QDA is to create an evocative literary representation and presentation of the data in the form of poetry.

One form for analyzing or documenting analytic findings is to strategically truncate interview transcripts, field notes, and other pertinent data into poetic structures. Like coding, poetic constructions capture the essence and essentials of data in a creative, evocative way. The elegance of the format attests to the power of carefully chosen language to represent and convey complex human experience.

In vivo codes (codes based on the actual words used by participants themselves) can provide imagery, symbols, and metaphors for rich category, theme, concept, and assertion development, plus evocative content for arts-based interpretations of the data. Poetic inquiry takes note of what words and phrases seem to stand out from the data corpus as rich material for reinterpretation. Using some of the participant’s own language from the interview transcript illustrated above, a poetic reconstruction or “found poetry” might read:

Scary Times

Scary times...
spending more
(another ding in my wallet)
a couple of thousand
(another ding in my wallet)
insurance is just worthless
(another ding in my wallet)
pick up the tab
(another ding in my wallet)
not putting as much into savings
(another ding in my wallet)
It all adds up.

Think twice:
don’t really need
skip
Think twice, think cheap:
coupons
bargains
two-for-one
free
Think twice, think cheaper:
stock up
all-you-can-eat
(cheap—and filling)
It all adds up.

Anna Deavere Smith, a verbatim theatre performer, attests that people speak in forms of “organic poetry” in everyday life. Thus in vivo codes can provide core material for poetic representation and presentation of lived experiences, potentially transforming the routine and mundane into the epic. Some researchers also find the genre of poetry to be the most effective way to compose original work that reflects their own fieldwork experiences and autoethnographic stories.

QDA Strategy: To Compute

To compute in QDA is to employ specialized software programs for qualitative data management and analysis.

CAQDAS is an acronym for Computer Assisted Qualitative Data Analysis Software. There are diverse opinions among practitioners in the field about the utility of such specialized programs for qualitative data management and analysis. The software, unlike statistical computation, does not actually analyze data for you at higher conceptual levels. CAQDAS packages serve primarily as a repository for your data (both textual and visual) that enables you to code them. They can also calculate the number of times a particular word or phrase appears in the data corpus (a particularly useful function for content analysis) and display selected facets after coding, such as possible interrelationships. Certainly, basic office software such as Microsoft Word, Excel, and Access provides utilities that can store and, with some pre-formatting and strategic entry, organize qualitative data to enable the researcher’s analytic review. The following internet addresses are listed to help in exploring these CAQDAS packages and obtaining demonstration/trial software and tutorials:

AnSWR: www.cdc.gov/hiv/topics/surveillance/resources/software/answr

ATLAS.ti: www.atlasti.com

Coding Analysis Toolkit (CAT): cat.ucsur.pitt.edu/

Dedoose: www.dedoose.com

HyperRESEARCH: www.researchware.com

MAXQDA: www.maxqda.com

NVivo: www.qsrinternational.com

QDA Miner: www.provalisresearch.com

Qualrus: www.qualrus.com

Transana (for audio and video data materials): www.transana.org

Weft QDA: www.pressure.to/qda/

Some qualitative researchers attest that the software is indispensable for qualitative data management, especially for large-scale studies. Others feel that the learning curve of CAQDAS is too lengthy to be of pragmatic value, especially for small-scale studies. From my own experience, if you have an aptitude for picking up quickly on the scripts of software programs, explore one or more of the packages listed. If you are a novice to qualitative research, though, I recommend working manually or “by hand” for your first project so you can focus exclusively on the data and not on the software.
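The word- and phrase-counting function mentioned above does not require a dedicated package. As a rough illustration, the Python sketch below uses only the standard library to count words and a chosen phrase across a few excerpts from the participant’s statements quoted earlier. The function names `word_counts` and `phrase_count` are hypothetical, the tokenizing rule is a simplifying assumption, and none of this reproduces how any particular CAQDAS program works.

```python
import re
from collections import Counter

# Excerpts from the participant statements quoted earlier in this chapter.
excerpts = [
    "Buy one package of chicken, get the second one free. "
    "Now that was a bargain. And I got some.",
    "With Sweet Tomatoes I get those coupons for a few bucks off for lunch, "
    "so that really helps.",
    "I don't go to movies anymore.",
]


def word_counts(texts):
    """Count lowercased words across all texts (simple tokenizer, an assumption)."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def phrase_count(texts, phrase):
    """Count case-insensitive occurrences of a phrase across all texts."""
    return sum(text.lower().count(phrase.lower()) for text in texts)


counts = word_counts(excerpts)
print(counts.most_common(5))
print("a few bucks:", phrase_count(excerpts, "a few bucks"))
```

A tally like this is a starting point for content analysis at most; as noted above, counting does not analyze the data at higher conceptual levels.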

QDA Strategy: To Verify

To verify in QDA is to administer an audit of “quality control” to your analysis.

After your data analysis and the development of key findings, you may be thinking to yourself, “Did I get it right?” “Did I learn anything new?” Reliability and validity are terms and constructs of the positivist quantitative paradigm that refer to the replicability and accuracy of measures. But in the qualitative paradigm, other constructs are more appropriate.

Credibility and trustworthiness (Lincoln & Guba, 1985) are two factors to consider when collecting and analyzing the data and presenting your findings. In our qualitative research projects, we need to present a convincing story to our audiences that we “got it right” methodologically. In other words, the amount of time we spent in the field, the number of participants we interviewed, the analytic methods we used, the thinking processes evident to reach our conclusions, and so on should be “just right” to persuade the reader that we have conducted our jobs soundly. But remember that we can never conclusively “prove” something; we can only, at best, convincingly suggest. Research is an act of persuasion.

Credibility in a qualitative research report can be established in several ways. First, citing the key writers of related works in your literature review is a must. Seasoned researchers will sometimes assess whether a novice has “done her homework” by reviewing the bibliography or references. You need not list everything that seminal writers have published about a topic, but their names should appear at least once as evidence that you know the field’s key figures and their work.

Credibility can also be established by specifying the particular data analytic methods you employed (e.g., “Interview transcripts were taken through two cycles of process coding, resulting in five primary categories”), through corroboration of data analysis with the participants themselves (e.g., “I asked my participants to read and respond to a draft of this report for their confirmation of accuracy and recommendations for revision”) or through your description of how data and findings were substantiated (e.g., “Data sources included interview transcripts, participant observation field notes, and participant response journals to gather multiple perspectives about the phenomenon”).

Creativity scholar Sir Ken Robinson is attributed with offering this cautionary advice about making a convincing argument: “Without data, you’re just another person with an opinion.” Thus researchers can also support their findings with relevant, specific evidence by quoting participants directly and/or including field note excerpts from the data corpus. These serve both as illustrative examples for readers and to present more credible testimony of what happened in the field.

Trustworthiness, or providing credibility to the writing, is when we inform the reader of our research processes. Some make the case by stating the duration of fieldwork (e.g., “Seventy-five clock hours were spent in the field”; “The study extended over a twenty-month period”). Others put forth the amounts of data they gathered (e.g., “Twenty-seven individuals were interviewed”; “My field notes totaled approximately 250 pages”). Sometimes trustworthiness is established when we are up front or confessional with the analytic or ethical dilemmas we encountered (e.g., “It was difficult to watch the participant’s teaching effectiveness erode during fieldwork”; “Analysis was stalled until I recoded the entire data corpus with a new perspective.”).

The bottom line is that credibility and trustworthiness are matters of researcher honesty and integrity. Anyone can write that he worked ethically, rigorously, and reflexively, but only the writer will ever know the truth. There is no shame if something goes wrong with your research. In fact, it is more than likely the rule, not the exception. Work and write transparently to achieve credibility and trustworthiness with your readers.

The length of this article does not enable me to expand on other qualitative data analytic strategies, such as to conceptualize, abstract, theorize, and write. Yet there are even more subtle thinking strategies to employ throughout the research enterprise, such as to synthesize, problematize, persevere, imagine, and create. Each researcher has his or her own ways of working, and deep reflection (another strategy) on your own methodology and methods as a qualitative inquirer throughout fieldwork and writing provides you with metacognitive awareness of data analytic processes and possibilities.

Data analysis is one of the most elusive processes in qualitative research, perhaps because it is a backstage, behind-the-scenes, in-your-head enterprise. It is not that there are no models to follow. It is just that each project is contextual and case specific. The unique data you collect from your unique research design must be approached with your unique analytic signature. It truly is a learning-by-doing process, so accept that and leave yourself open to discovery and insight as you carefully scrutinize the data corpus for patterns, categories, themes, concepts, assertions, and possibly new theories through strategic analysis.

Auerbach, C. F., & Silverstein, L. B. (2003). Qualitative data: An introduction to coding and analysis. New York: New York University Press.

Birks, M., & Mills, J. (2011). Grounded theory: A practical guide. London: Sage.

Boyatzis, R. E. (1998). Transforming qualitative information: Thematic analysis and code development. Thousand Oaks, CA: Sage.

Bryant, A., & Charmaz, K. (Eds.). (2007). The Sage handbook of grounded theory. London: Sage.

Charmaz, K. (2006). Constructing grounded theory: A practical guide through qualitative analysis. Thousand Oaks, CA: Sage.

Erickson, F. (1986). Qualitative methods in research on teaching. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed.) (pp. 119–161). New York: Macmillan.

Geertz, C. (1983). Local knowledge: Further essays in interpretive anthropology. New York: Basic Books.

Gibbs, G. R. (2007). Analysing qualitative data. London: Sage.

Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Newbury Park, CA: Sage.

Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis (2nd ed.). Thousand Oaks, CA: Sage.

Saldaña, J. (2009). The coding manual for qualitative researchers. London: Sage.

Saldaña, J. (2011). Fundamentals of qualitative research. New York: Oxford University Press.

Saldaña, J. (2013). The coding manual for qualitative researchers (2nd ed.). London: Sage.

Shank, G. (2008). Abduction. In L. M. Given (Ed.), The Sage encyclopedia of qualitative research methods (pp. 1–2). Thousand Oaks, CA: Sage.

Stake, R. E. (1995). The art of case study research. Thousand Oaks, CA: Sage.

Stern, P. N., & Porr, C. J. (2011). Essentials of accessible grounded theory. Walnut Creek, CA: Left Coast Press.

Strauss, A. L. (1987). Qualitative analysis for social scientists. Cambridge: Cambridge University Press.

Sunstein, B. S., & Chiseri-Strater, E. (2012). FieldWorking: Reading and writing research (4th ed.). Boston: Bedford/St. Martin’s.

Wertz, F. J., Charmaz, K., McMullen, L. M., Josselson, R., Anderson, R., & McSpadden, E. (2011). Five ways of doing qualitative analysis: Phenomenological psychology, grounded theory, discourse analysis, narrative research, and intuitive inquiry. New York: Guilford.

Chapter 19. Advanced Codes and Coding

Introduction: forest and trees.

Chapter 17 introduced you to content analysis, a particular way of analyzing historical artifacts, media, and other such “content” for its communicative aspects. Chapter 18 introduced you to the more general process of data analysis for qualitative research, how you would go about beginning to organize, simplify, and code interview transcripts and fieldnotes. This chapter takes you a bit deeper into the specifics of codes and how to use them, particularly the later stages of coding, in which our codes are refined, simplified, combined, and organized for the purpose of identifying what it all means, theoretically. These later rounds of coding are essential to getting the most out of the data we’ve collected. By the end of the chapter, you should understand how “findings” are actually found.

I am going to use a particular analogy throughout this chapter, that of the relationship between the forest and trees. You know the saying “You can’t see the forest for the trees”? Think about what this actually means. One is so focused on individual trees that one neglects to notice the overall system of which the trees are a part. This is something beginning researchers do all the time, and the laborious process of coding can make this tendency worse. You focus on the details of your codes but forget that they are merely the first step in the analysis process, that after you have tagged your trees, you need to step back and look at the big picture that is the entire forest. Keep this metaphor in mind. We will come back to it a few times.

Let’s imagine you have interviewed fifty college students about their experiences during the pandemic, both as students and as workers. Each of these interviews has been transcribed and runs to about 35 pages, double-spaced. That is 1,750 pages of data you will need to code before you can properly begin to make sense of it all. Taking a sample of the interviews for a first round of coding (see chapter 17), you are likely to first note things that are common to the interviews. A general feeling of fear, anxiety, or frustration may jump out at you. There is something about the human brain that is primed to look for “the one common story” at the outset. Often, we are wrong about this. The process of coding and recoding and memoing will often show us that our initial takes on “what the data say” are seriously misleading for a couple of reasons: first, because voices or stories that counter the predominant theme are often ignored in the first round, and, second, because what startles us or surprises us can drive away the more mundane findings that actually are at the heart of what the data are saying. If we have experienced the pandemic with little anxiety, seeing anxiety in the interviews will surprise us and make us overstate its importance in general. If we expect to find something and we see something very different, we tend to overnotice that difference. This is basic psychology, I am sure.

This is where coding comes in to help you verify, amplify, complicate, or delimit your initial first impressions. Coding is a rigorous process because it helps us move away from preconceptions and other judgment errors and pin down what is actually present in the data. It helps you identify the trees, which is actually important before we can properly see the forest. We start with “It’s a forest” (not really that helpful), then move to “These are specific trees, with particular roots and branches,” and finally move back to a better understanding of the forest (“It’s a boreal forest that works like this…”). Coding is the rigorous connecting process between the first (often wrong or incomplete) impression and the final interpretation, the “results” of the study (figure 19.1). If you remember that this is the point of coding, you will be less likely to get lost in the woods. Coding is not about tagging every possible root and branch of every tree to create some kind of master compendium of forest particulars. Coding is about learning how to identify what is important about that forest overall. [1] When you are new to the forest, you won’t know which root or branch is of importance, but as you walk through it again and again, you will learn to appreciate its rhythms and know what to pick up as important and what to discard as irrelevant.

There is no single correct way to go about coding your data. When I first began teaching qualitative research methods, I resolutely refused to “teach” coding, as I thought it was a little like trying to teach people to write fiction. It’s very personal and best developed through practice. But I have come to see the value of providing some guidelines—maps through the forest, if you will. I have drawn heavily here from Johnny Saldaña’s extensive and beautiful “coding manual,” but the particular suggestions here are what have worked best for me. We are going to walk through the forest many times, first in an open exploratory way and then in a more focused way once we have found our stride. Finally, we will sit down with all of our maps and materials and see what it is we can discover about the world by looking at our data.

First Walks in the Woods: Open Coding

Saldaña (2014) provides dozens of types of codes and coding processes, but we are going to confine our discussion to five. These are the five kinds of codes that I think work best for beginning researchers in your first walks through the woods. Used together, they have the potential to get at the heart of what is important in social science research. They are descriptive, in vivo, process, values, and emotions. Select a sample of your data in the first round of coding. If you try to tag everything in these initial rounds, you will never get out of the woods. Your sample should be broad enough to capture essential aspects of your data corpus but small enough to allow you free rein to pick up as many branches as you think interesting. Set aside a significant amount of time for this. And then double or triple that time allotment. You’ll need it.

Descriptive codes are codes used to tag specific activities, places, and things that seem to be important in particular passages. They are identifying tags (“This is a branch from an elm tree”; “This is an acorn”). Be careful here because you can really end up trying to identify everything—every word, every line, every passage. Don’t do that! It’s helpful to remind yourself what your research is about—what is your research question or focus? Some twigs can stay on the forest floor. Saldaña’s ( 2014 ) use of the term is narrower. Descriptive codes are meant to summarize the basic topic of a passage in a single word or short phrase, what is also called “topic coding” or “index coding.” These descriptive codes will allow you to easily search for and return to passages about a particular topic or feature of the forest; this will allow you to make better comparisons in later rounds of analysis. The actual word or phrase you come up with will be rather personal to you and dependent on the focus of your research. Here is an exemplary passage from a fictitious interview with a working-class college student: “I had no idea what scholarships were available! No one in my family had ever gone to college before, so there was no one I could ask. And my high school counselor was always too busy. What a joke! Plus, I was a little embarrassed, to be honest. So, yeah, I owe a lot of money. It’s really not that fair.”

What descriptive codes can be developed here? How would you define the topic or topics of this passage? On the one hand, the subject appears to be scholarships or how this student paid for college. “How Pay” might be a good descriptive code for the entire passage. But there are a lot of other interesting things going on here too. If your focus is on how peer groups work or social networks, you might focus on those aspects of the passage. Perhaps “No Assistance” could work as a descriptive code in this first round of coding. Descriptive codes are pretty straightforward, so they are easy for beginning researchers to use, but “they may not enable more complex and theoretical analyses as the study progresses, particularly with interview transcript data” ( 137 ).

In vivo codes are codes that use the actual words people have used to tag an important point or message. In the above passage, “no one I could ask” might be such a code. These indigenous terms or phrases are particularly useful when seeking to “honor or prioritize” the voice of the participants ( Saldaña 2014:138 ). They don’t require you to impose your own sense on a passage. They are also rather enjoyable to generate, as they encourage you to step into the shoes of those you have interviewed or observed. The terms or phrases should jump out at you as something salient to your research question or focus (or simply jump out at you in surprising ways that you hadn’t expected, given your research question).

Process codes are codes that label conceptual actions. This is another way to describe the data, but rather than focus on the topic, we organize it around key actions and activities. For example, we could tag the passage above with “asking for help.” By convention, process codes are gerunds , those strange verb forms that end in -ing and operate a bit like nouns. Process codes are particularly helpful for studies that focus on change and development over time, as the use of tagged gerunds can really highlight stages, if such exist. Grounded theorists often employ process codes for this reason. I find it useful, as it reminds me to focus not only on what participants say and how they say it but on the activities that they are engaged in.

Values codes are codes that reflect the attitudes, beliefs, or values held by a participant. Values codes capture things such as principles, moral codes and situational norms (“values”), the way we think about ourselves and others (“attitudes”), and all of our personal knowledge, experience, opinions, assumptions, biases, prejudices, morals, and other interpretive perceptions of the world (“beliefs”). They are extremely powerful tags and absolutely essential for phenomenological researchers. We might attach the values code “unfair” to the passage above or even note the “What a joke!” passage as disbelief or disgust.

Values codes are a particular subset of affective coding , where codes are developed to “investigate subjective qualities of human experience (e.g., emotions, values, conflicts, judgments) by directly acknowledging and naming those experiences” ( Saldaña 2014:159 ). The fifth suggested code is also another form of affective coding, emotions codes , labels of feelings shared by the participants. “Embarrassment” is an obvious emotion code in the above passage. In the kinds of research I mostly do, phenomenological and interview based, often about sensitive subjects around discrimination, power, and marginalization, coding emotions is incredibly helpful and productive: “Emotion coding is appropriate for virtually all qualitative studies, but particularly for those that explore intrapersonal or interpersonal participant experiences and actions, especially in matters of identity, social relationships, reasoning, decision-making, judgment, and risk-taking” ( 160 ).
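To see how the five kinds of first-cycle codes can sit side by side, here is a minimal sketch that records the tags suggested above for the fictitious interview passage. The `CodedSegment` structure, its field names, and the choice of which extract each code is attached to are my own illustrative assumptions; they are not drawn from Saldaña's manual or from any CAQDAS program.

```python
from collections import defaultdict
from dataclasses import dataclass

# The fictitious interview passage quoted earlier in this section.
PASSAGE = (
    "I had no idea what scholarships were available! No one in my family "
    "had ever gone to college before, so there was no one I could ask. "
    "And my high school counselor was always too busy. What a joke! Plus, "
    "I was a little embarrassed, to be honest. So, yeah, I owe a lot of "
    "money. It's really not that fair."
)


@dataclass(frozen=True)
class CodedSegment:
    """One tag applied to one extract of text (hypothetical structure)."""
    extract: str    # the words being tagged
    code: str       # the label the researcher assigns
    code_type: str  # descriptive, in vivo, process, values, or emotions


# Codes come from the discussion above; the extracts are assumptions.
segments = [
    CodedSegment(PASSAGE, "How Pay", "descriptive"),
    CodedSegment("there was no one I could ask", "No Assistance", "descriptive"),
    CodedSegment("no one I could ask", "no one I could ask", "in vivo"),
    CodedSegment("there was no one I could ask", "asking for help", "process"),
    CodedSegment("It's really not that fair.", "unfair", "values"),
    CodedSegment("I was a little embarrassed", "embarrassment", "emotions"),
]

# Group the codes by type so each kind of tag can be reviewed on its own.
codes_by_type = defaultdict(list)
for segment in segments:
    codes_by_type[segment.code_type].append(segment.code)

for code_type, codes in codes_by_type.items():
    print(f"{code_type}: {codes}")
```

Note that the same extract can carry several codes of different types, which is exactly what happens in these early exploratory walks.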

A Final Purposeful Hike through the Forest: Closed Coding

After initial rounds of coding (several walks through the woods), you should begin to see important themes emerge from your data and have a general idea of what is important enough to look at more closely. Between first-cycle coding and your last hike through the forest, you will have created a list of codes or even a codebook that records these emergent categories and themes (see chapter 18). It is quite possible your research question(s) or focus has shifted based on what you have seen in the first rounds of coding. [2] If you need more data collection based on these shifts, collect more data. Once you feel comfortable that you have reached saturation and know what it is you are looking at and for, you are ready for one final purposeful hike through your forest to tag (code) all your data using a pared-down set of codes.

Building Meaning, Identifying Patterns, Comparing Trees, and Seeing Forests

The final cycle of coding is also the time to generate analyses of your data. As with so much qualitative research, this is not a linear process (finish stage A and move to stage B followed by stage C). To some extent, analysis is happening all the time, even when you are in the field. Journaling, reflecting, and writing analytical memos are important in all stages of coding. But it is in the final stages of coding that you truly start to put everything together—that’s when you start understanding the nature of the forest you have been walking through. That, after all, is the point. What do all these codes of various people’s actions (fieldnotes) or people’s words (interviews) tell you about the larger phenomenon of interest? This will require mapping your codes across your data set, comparing and contrasting themes and patterns often relative to demographic factors, and overall trying to “see” the forest instead of the trees.

Different researchers employ various tools and methods to do this. Some draw pictures or concept maps, seeking to understand the connections between the themes that have emerged. Others spend time counting code frequencies or drawing elaborate outlines of codes and reworking these in search of general patterns and structure. Some even use in vivo codes to generate found poems that might provide insight into the deeper meanings and connections of the data. Mapping word clouds is a similar process. As a sociologist who is interested in issues of identity, my go-to method is to look for interactions between the codes, noting demographic elements of comparison. For example, in the very first study I conducted ( Hurst 2010a ), I used emotion codes. Specifically, I found numerous examples of sadness, anger, shame, embarrassment, pride, resentment, and fear. With the exception of pride, these are not very positive emotions. I could have stopped there, with the finding of overwhelming instances of negative emotions in the stories told by working-class college students. But I played around with these categories, clustering them by incidence and frequency and then comparing these across demographic categories (age, race, gender). I found no race or gender differences and only a hint of a difference between traditional-age college students and older students. What I did find, however, was that the emotions sorted themselves out in clusters relative to other codes. Embarrassment, shame, resentment, and fear were often found together in the same interview, along with a pattern of using “they” to refer to working-class people like the interviewees’ families. Conversely, anger, sadness, and pride were often found together, along with a pattern of using “we” to refer to working-class people. 
This led me to develop a theory about how working-class students manage their class identities in college, with some desirous of becoming middle class (“Renegades”) and others wanting very strongly to remain identified as working class (“Loyalists”; Hurst 2010a ).
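The clustering described above can be approximated by counting how often two codes turn up in the same interview. The sketch below is one assumed way of doing that; the interview identifiers and code sets are invented for illustration and are not the data from the 2010 study.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: the set of codes found in each interview transcript.
codes_per_interview = {
    "interview_01": {"embarrassment", "shame", "fear", "they"},
    "interview_02": {"anger", "sadness", "pride", "we"},
    "interview_03": {"shame", "resentment", "fear", "they"},
    "interview_04": {"anger", "pride", "we"},
}

# Count every unordered pair of codes that appears in the same interview.
pair_counts = Counter()
for codes in codes_per_interview.values():
    for pair in combinations(sorted(codes), 2):
        pair_counts[pair] += 1


def cooccurrence(code_a, code_b):
    """Number of interviews in which both codes were applied."""
    return pair_counts[tuple(sorted((code_a, code_b)))]


print(cooccurrence("shame", "they"))  # interviews containing both codes
print(cooccurrence("shame", "we"))
print(pair_counts.most_common(5))
```

Pairs with high counts are candidates for a cluster; they are a prompt to go back and reread those interviews, not a finding in themselves.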

Saldaña (2014) summarizes many of these techniques. He draws a distinction between “code mapping” and “code landscaping.” Code mapping is a systematic and rigorous reordering of all codes into an increasingly simplified hierarchical organization. One can move from fifty or so specific stand-alone codes of various types (e.g., sadness, “I was so alone,” socializing, financial aid) and attempt to impose some meaningful order on them by clustering like phenomena with like phenomena. Perhaps sadness (an emotion code), “I was so alone” (an in vivo code), and socializing (a process code) are understood as belonging together, perhaps under a category of SOCIAL CONNECTIONS or, depending on what has emerged from your data, EXCLUSION. Code mapping is an iterative process, meaning that you can do a second or a third take of simplification and reordering. In the end, you might be left with one or two big conceptual themes or patterns.
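A rough sketch of two passes of code mapping follows. The four codes and the SOCIAL CONNECTIONS category come from the example above; the PAYING FOR COLLEGE category, the NAVIGATING COLLEGE theme, and the `group_by` helper are hypothetical names added only to show the mechanics of repeated simplification.

```python
from collections import defaultdict


def group_by(mapping):
    """Invert a child -> parent mapping into parent -> sorted children."""
    grouped = defaultdict(list)
    for child, parent in mapping.items():
        grouped[parent].append(child)
    return {parent: sorted(children) for parent, children in grouped.items()}


# First pass: cluster stand-alone codes under categories.
code_to_category = {
    "sadness": "SOCIAL CONNECTIONS",
    "I was so alone": "SOCIAL CONNECTIONS",
    "socializing": "SOCIAL CONNECTIONS",
    "financial aid": "PAYING FOR COLLEGE",  # hypothetical category
}

# Second pass: cluster categories under a broader theme (hypothetical).
category_to_theme = {
    "SOCIAL CONNECTIONS": "NAVIGATING COLLEGE",
    "PAYING FOR COLLEGE": "NAVIGATING COLLEGE",
}

categories = group_by(code_to_category)
themes = group_by(category_to_theme)

# Print the resulting hierarchy: theme > category > code.
for theme, theme_categories in themes.items():
    print(theme)
    for category in theme_categories:
        print("  " + category)
        for code in categories[category]:
            print("    " + code)
```

Each further take of simplification is just another mapping layered on top of the last one, so earlier groupings stay visible and can be revised.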

Code landscaping “integrates textual and visual methods to see both the forest and trees” ( Saldaña 2014:285 ). Using computer-assisted word cloud mapping (WordItOut.com, wordclouds.com, wordle.net) is one way of doing this, or at least a way to jump-start the process. Word clouds quickly allow you to see what stands out in the interview or fieldnotes and can suggest relationships of importance between codes. Manually, one can also diagram the codes in terms of relationship, stressing the processual elements (what leads to what: “I felt so alone” >> sadness).
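Word cloud tools rest on simple word counts, and you can produce those counts yourself before turning to a website. The following sketch counts words in part of the fictitious passage used earlier; the stop-word list is a small assumed set chosen for this example, not a standard one.

```python
import re
from collections import Counter

text = (
    "I had no idea what scholarships were available! No one in my family "
    "had ever gone to college before, so there was no one I could ask."
)

# Assumed stop words: common words we choose to leave out of the count.
stop_words = {"i", "had", "in", "my", "to", "so", "there", "was",
              "what", "were", "ever"}

# Lowercase the text, split it into words, and drop the stop words.
words = re.findall(r"[a-z']+", text.lower())
word_counts = Counter(word for word in words if word not in stop_words)

# The largest words in a word cloud correspond to the top of this list.
for word, count in word_counts.most_common(5):
    print(f"{word}: {count}")
```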

Another helpful suggestion is to chart the incidence of codes across your data set. This is particularly helpful with interview data. What (simplified) codes emerge in each interview transcript? Is there a pattern here? The two categories of Loyalist and Renegade would not have emerged had I not made these kinds of code comparisons by person interviewed. You might create a master document or spreadsheet that places each interview subject on its own row, with a brief description of that person’s story (what emerges as the focus of the interview or who they are in terms of social location, character, etc.) in a separate column and then a third column listing the key codes found in the interview. This is a good way to “see” the forest in a snapshot.
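Such a master document is easy to build as a CSV file that any spreadsheet program can open. Here is a minimal sketch, assuming invented participants and codes; the column names and the `build_master_sheet` helper are hypothetical.

```python
import csv
import io

# Hypothetical interview summaries: one dictionary per person interviewed.
interviews = [
    {"participant": "P01",
     "story": "First in family to attend college and carrying heavy debt",
     "codes": ["unfair", "embarrassment", "No Assistance"]},
    {"participant": "P02",
     "story": "Older student returning to college after years of work",
     "codes": ["pride", "anger"]},
]


def build_master_sheet(rows):
    """Return CSV text with one row per participant and three columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["participant", "story", "key_codes"])
    for row in rows:
        # Sort codes alphabetically, ignoring case, for easy scanning.
        key_codes = "; ".join(sorted(row["codes"], key=str.lower))
        writer.writerow([row["participant"], row["story"], key_codes])
    return buffer.getvalue()


sheet = build_master_sheet(interviews)
print(sheet)
```

To save the sheet, write the returned text to a file ending in `.csv`. Reading down the third column is where patterns by person start to show.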

Whatever method or technique is employed, the general direction is to move from simple tags (codes) to categories to themes/concepts (figure 19.2). Eventually, those identified themes/concepts will help you build a new theory or at a minimum produce relevant theoretically informed findings, as in the second example at the end of this chapter.

[Figure 19.2]

Grounded Theory has its own vocabulary when it comes to coding and data analysis, so if you are trying to do a “proper” Grounded Theory study, you might want to read up on this in more detail ( Charmaz 2014 ; Strauss 1987 ; Strauss and Corbin 2015 ). A quick summary of the approach follows. First-cycle coding employs the following kinds of codes: in vivo , process, and initial. Second-cycle coding employs focused , axial , and theoretical codes. The names of these second-cycle codes are meant to evoke the Grounded Theory approach itself: in the second cycle, the grounded theorists focus the study on axes of importance to generate theories. Focused coding pulls out the most frequent or significant codes from the first round. Axial coding reassembles data around a category, or axis. These categories or axes are meant to be concept generating: “Categories should not be so abstract as to lose their sensitizing aspect, but yet must be abstract enough to make [the emerging] theory a general guide” ( Glaser and Strauss 1967:242 ). Theoretical codes “function like umbrellas that cover and account for all other codes and categories” ( Saldaña 2014:314 ). Key words or key phrases (e.g., “Exclusion” or “Always Crying”) capture the emergent theory in the theoretical code.

Describing and Explaining the Forest: Findings and Theories

It is only now, after the laborious process of coding is complete, that you can actually move on to generate and present findings about your data. Many beginning researchers attempt to skip the middle work and get straight to writing, only to find that what they say about the data is pretty thin. The quality of qualitative research comes from the entire analytical process: open and closed coding, writing analytical memos, identifying patterns, making comparisons, and searching for order in the voluminous transcripts and fieldnotes.

But let’s say that you have followed all the steps so far. You have done multiple rounds of coding—refining, simplifying, and ordering your codes. You’ve looked for patterns. You think you have seen some master concepts emerge, and you have a good idea of what the important themes and stories are in your data. How do you begin to explain and describe those themes and stories and theories to an audience? Chapter 20 will go into further detail on how to present your work (e.g., formats, length, audience, etc.), but before we get to that, we need to talk about the stage after coding but before writing. You will want to be clear in your mind that you have the story right, that you have not missed anything of importance, and that you have searched for disconfirming evidence and not found it (if you have, you have to go back to the data and start again on a new track).

Begin with your research question(s), either as originally asked or as reformulated. What is your answer to these questions? How have your underlying goals (see chapter 4) been addressed or achieved by these answers? In other words, what is the outcome of your study? Is it about describing a culture, raising awareness of a problem, finding solutions, or delineating strategies employed by participants? Perhaps you have taken a critical approach, and your outcome is all about “giving voice” to those whose voices are often unheard. In that case, your findings will be participant driven, and your challenge will be to present passages (direct quotes) that exemplify the most salient themes found in your data. On the other hand, if you have engaged in an ethnographic study, your findings may be thick, theoretically informed descriptions of the culture under study. Your challenge there will be writing evocatively. Or to take a final example, perhaps you undertook a mixed methods study to find the best way to improve a program or policy. Your findings should then point to particular recommendations. Note that in none of these cases are you presenting your codes as your findings! The coding process merely helps you find what is important to say about the case based on your research questions and underlying aims and goals.

The gold star of qualitative research presentation is the formulation of theory. Even for those not following the Grounded Theory tradition, finding something to say that goes beyond the particulars of your case is an important part of doing social science research. Remember, social science is generally not idiographic. A “theory” need not be earth shattering, as in the case of Freud’s theory of Ego, Id, and Superego. A theory is simply an explanation of something general. [3] It is a story we tell about how the world works. Theories are provisional. They can never be proven (although they can be disproven). My description of Loyalists and Renegades is a theory about how college students from the working class manage the problem of class identity when their class backgrounds no longer match their class destinations. While qualitative research is not statistically generalizable , it is and should be theoretically generalizable in this way. Loyalists and Renegades are strategies that I believe occur generally among those who are experiencing upward social mobility; they are not confined solely to the twenty-one students I interviewed in 2005 in a college in the Pacific Northwest.

What is the story your research results are telling about the world? That is the ultimate question to ask yourself as you conclude your data analysis and begin to think about writing up your results.

Further Readings

Note: Please see chapter 18 for further reading on coding generally.

Charmaz, Kathy. 2014. Constructing Grounded Theory. 2nd ed. Thousand Oaks, CA: SAGE. Although this is a general textbook on conducting all stages of Grounded Theory research, a significant portion is directed at the coding process.

Strauss, Anselm. 1987. Qualitative Analysis for Social Scientists . Cambridge: Cambridge University Press. An essential reading on coding Grounded Theory for advanced students, written by one of the originators of the Grounded Theory approach. Not an easy read.

Strauss, Anselm, and Juliet Corbin. 2015. Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. 4th ed. Thousand Oaks, CA: SAGE. A good basic textbook for those exploring Grounded Theory. Accessible to undergraduates and graduate students.

[1] A small aside here on social science in general and sociology in particular: It is often believed that sociologists are concerned about “people” and what people do and believe. Actually, people are our trees. We are really interested in the forest, or society. We try to understand society by listening to and observing the people who compose it. Behavioral science, in contrast, does take the individual as the object of study.
[2] It might be helpful to read the first example of writings about qualitative data analysis in the “Further Readings” section.
[3] Saldaña (2014) lists five essential characteristics of a social science theory: “(1) expresses a patterned relationship between two or more concepts; (2) predicts and controls action through if-then logic; (3) accounts for parameters of or variation in the empirical observations; (4) explains how and/or why something happens by stating its cause(s); and (5) provides insights and guidance for improving social life” (349).

Affective coding: A form of first-cycle coding in which codes are developed to “investigate subjective qualities of human experience (e.g., emotions, values, conflicts, judgments) by directly acknowledging and naming those experiences” (Saldaña 2021:159). See also emotions coding and values coding.

Code mapping: A technique of second-cycle coding in which codes developed in the first rounds of coding are restructured into an increasingly simplified hierarchical organization, thereby allowing the general patterns and underlying structure of the field data to emerge more clearly.

Code landscaping: A technique of second-cycle coding that “integrates textual and visual methods to see both the forest and trees” (Saldaña 2021:285).

In vivo coding: A first-cycle coding process in which terms or phrases used by the participants become the code applied to a particular passage. It is also known as “verbatim coding,” “indigenous coding,” “natural coding,” “emic coding,” and “inductive coding,” depending on the tradition of inquiry of the researcher. It is common in Grounded Theory approaches and has even given its name to one of the primary CAQDAS programs (“NVivo”).

Focused coding: A later-stage coding process used in Grounded Theory that pulls out the most frequent or significant codes from initial coding.

Axial coding: A later-stage coding process used in Grounded Theory in which data is reassembled around a category, or axis.

Theoretical coding: A later-stage coding process used in Grounded Theory in which key words or key phrases capture the emergent theory.

Introduction to Qualitative Research Methods Copyright © 2023 by Allison Hurst is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License , except where otherwise noted.

Untangling the qualitative research codebook: a guide to crafting your own

Last updated: 27 February 2023 | Reviewed by: Miroslav Damyanov

Researchers from various disciplines use qualitative research codebooks, including:

  • Anthropology

The most effective use of codebooks is in studies that employ methods such as:

  • Grounded theory
  • Content analysis
  • Ethnography

The qualitative research codebook provides a clear and consistent coding framework. This enables researchers to identify patterns and themes in the data and draw meaningful conclusions about their research questions .

What is coding in qualitative research?

In qualitative research terms, coding is the process of identifying, categorizing, and labeling important ideas, concepts, and patterns that emerge from the data. This process is essential in analyzing qualitative data .

Different types of coding techniques can be used in qualitative research, including:

Open coding: this is the initial stage of coding, where the data is broken down into small pieces and given initial codes based on the meaning and context of the data

Axial coding: the initial codes are combined, reorganized, and connected to form larger categories or themes

Selective coding: this establishes a comprehensive theme to integrate and explain the relationship between all codes and categories

Coding in qualitative research can be done using software programs. These programs allow users to create and manage codes, categories, and themes. The coding process is iterative: as the data is analyzed, the researcher may modify, add, or remove codes to reflect new insights or to refine the analysis.

How to create a codebook for qualitative data

Creating a codebook is vital in coding qualitative data. It provides a structured, consistent framework for organizing and analyzing the data.

Here are the steps to create a codebook for qualitative data:

Begin by reviewing the data to identify the key concepts, themes, and patterns that emerge. This may involve reading through transcripts or notes, listening to audio recordings, or watching video recordings.

Based on the themes and patterns emerging from the data, identify the codes you will use to organize it. These codes should be concise and descriptive and capture the essence of the themes and patterns.

Define each code in clear and specific terms, including examples of what the code does and does not include.

Develop a coding hierarchy specifying how the codes are related. This can involve grouping similar codes under broader categories or creating subcodes related to specific themes or concepts.

Once the codebook is developed, review the data and assign codes to it based on the guidelines and criteria specified in the codebook.

Refine the codebook as needed, based on feedback from coders or changes in the research question. This step keeps the codebook accurate and relevant throughout the coding process.

Following these steps, researchers can ensure the coding is accurate and consistent, capturing the key themes and patterns that emerge from the data.
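The steps above can be sketched as a small data structure. This is a minimal illustration only: the `CodebookEntry` fields, the example codes, and the two helper functions are assumptions made for this sketch and do not reflect the format of any particular research platform.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CodebookEntry:
    """One code with its definition, boundary examples, and parent."""
    name: str
    definition: str
    includes: list[str] = field(default_factory=list)  # what the code covers
    excludes: list[str] = field(default_factory=list)  # what it does not
    parent: Optional[str] = None                       # broader category


# Hypothetical entries for a study of how students pay for college.
entries = [
    CodebookEntry("Paying for college",
                  "Any reference to funding higher education."),
    CodebookEntry("Debt",
                  "Participant describes money owed for college.",
                  includes=["I owe a lot of money"],
                  excludes=["general complaints about prices"],
                  parent="Paying for college"),
    CodebookEntry("Scholarships",
                  "Participant mentions scholarships or grants.",
                  includes=["what scholarships were available"],
                  parent="Paying for college"),
]
codebook = {entry.name: entry for entry in entries}


def children_of(book, parent_name):
    """List the subcodes grouped under a broader category."""
    return sorted(e.name for e in book.values() if e.parent == parent_name)


def missing_parents(book):
    """Find parent names that have no entry of their own in the codebook."""
    return sorted({e.parent for e in book.values()
                   if e.parent is not None and e.parent not in book})


print(children_of(codebook, "Paying for college"))
print(missing_parents(codebook))
```

Keeping inclusion and exclusion examples next to each definition gives coders a concrete reference when a passage is borderline, and a check such as `missing_parents` catches gaps in the hierarchy as the codebook is refined.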

How do you determine what codes to use?

Determining what codes to use in qualitative research involves the following steps:

Review the research question to ensure the codes used are relevant and aligned with the research goals. This involves identifying the key concepts, themes, or phenomena that are of interest to the research.

Conduct a preliminary data review to gain a broad understanding of the topics covered and identify potential codes or categories. You can read and review the data multiple times and note recurring patterns or themes.

Develop an initial coding framework based on the preliminary review of the data. This framework should be flexible and allow new codes or categories to be added as the analysis progresses.

Apply the initial coding framework to the data. This involves coding the data line by line or segment by segment using the identified codes or categories.

Refine or modify the coding framework as the coding progresses to capture new insights or patterns that emerge from the data. This process may involve adding new codes or categories, merging or splitting existing codes, or redefining the codes.

Ensure consistency and rigor by establishing clear definitions and guidelines for each code or category so that the coding framework is applied consistently.

The process should be flexible, transparent, and well documented to ensure the credibility of the research findings.
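Refining the framework often means merging codes that turn out to overlap. The sketch below shows one possible way to apply such a merge to existing code assignments while leaving the original record untouched, which helps keep the process documented. The segment identifiers, codes, and the `merge_codes` helper are hypothetical.

```python
def merge_codes(assignments, codes_to_merge, merged_code):
    """Return a copy of the assignments with overlapping codes merged.

    assignments maps a segment identifier to the set of codes applied to it.
    The input is not modified, so the earlier coding pass stays on record.
    """
    merged = {}
    for segment_id, codes in assignments.items():
        merged[segment_id] = {
            merged_code if code in codes_to_merge else code
            for code in codes
        }
    return merged


# Hypothetical first-pass assignments.
first_pass = {
    "segment_1": {"lonely", "isolated"},
    "segment_2": {"debt"},
    "segment_3": {"isolated", "debt"},
}

# After review, "lonely" and "isolated" are judged to be the same idea.
second_pass = merge_codes(first_pass, {"lonely", "isolated"}, "isolation")

print(second_pass)
```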

Automated vs. manual coding of qualitative data

Automated and manual coding are two different approaches to coding qualitative data.

Manual coding consists of:

  • Reviewing the data
  • Identifying key themes or concepts
  • Assigning them to a code or category

This process requires a high level of attention to detail and is performed by a human coder. Manual coding offers a more contextual data analysis, as the coder can consider the specific context and meaning of the data.

Automated coding involves using software programs or algorithms to automatically identify and categorize key themes or concepts in the data. Automated coding can be faster and more efficient than manual coding. It helps identify patterns and relationships that may not be immediately obvious to a human coder.

The choice between automated and manual coding will depend on several factors, including:

  • The research question
  • The size and complexity of the data set
  • The level of detail and nuance required for the analysis

In some cases, both approaches may be used. Automated coding can be used first to quickly identify patterns or themes, then manual coding is employed to further refine the analysis and capture the context and meaning of the data.
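A combined workflow might look like the following sketch, in which a simple keyword pass suggests codes and anything it cannot tag is queued for a human coder. Keyword matching is only one very basic form of automated coding, chosen here for brevity; the rules, extracts, and function names are hypothetical.

```python
import re

# Hypothetical rules: each code is suggested by any of its keywords.
keyword_rules = {
    "financial aid": ["scholarship", "loan", "owe"],
    "embarrassment": ["embarrassed", "ashamed"],
}


def auto_code(text, rules):
    """Suggest codes whose keywords start a word somewhere in the text."""
    lowered = text.lower()
    return sorted(
        code for code, keywords in rules.items()
        if any(re.search(r"\b" + re.escape(keyword), lowered)
               for keyword in keywords)
    )


extracts = [
    "I had no idea what scholarships were available!",
    "What a joke!",
    "Plus, I was a little embarrassed, to be honest.",
    "So, yeah, I owe a lot of money.",
]

# First pass: automated suggestions for every extract.
suggested = {extract: auto_code(extract, keyword_rules) for extract in extracts}

# Second pass: extracts with no suggestion are queued for manual coding.
manual_queue = [extract for extract, codes in suggested.items() if not codes]

print(suggested)
print(manual_queue)
```

The sarcastic "What a joke!" carries no keyword, so it lands in the manual queue, where a human coder can weigh its context and meaning.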

Tips for coding qualitative data

We’ve put together some top tips for coding qualitative data :

Use a codebook to keep track of your codes

A codebook is a document outlining the coding framework for a research project, including the codes and categories you will use to analyze the data.

Using a codebook is essential for keeping track of codes and categories in qualitative data analysis. It helps ensure the coding is consistent and transparent and facilitates collaboration among research team members.

Avoid commonalities

It's vital to avoid relying solely on surface-level commonalities when creating codes. While it may be tempting to group text segments sharing a similar topic or idea, this approach can overlook crucial nuances in the data.

Consider multiple perspectives when coding the data. This includes the point of view of the research participants, the research team, and the broader literature. This can help ensure the coding is grounded in the data while considering wider theoretical and conceptual frameworks.

Capture the positive and the negative

Use a coding framework that covers the positive and negative aspects of the data. Capturing both aspects of the data can provide a more comprehensive understanding of the phenomena being studied.

Reduce data—to a point

By reducing the data to a manageable size, while still preserving data richness and complexity, researchers can generate a more accurate analysis without becoming overwhelmed by the data that needs analyzing.

Identify key themes and patterns emerging from the data, and focus on coding data relating to these key themes and patterns. This can reduce the data that needs to be coded while still capturing the most important aspects.

Cover as many responses as possible

Develop a comprehensive codebook that covers all possible responses to the survey questions to ensure the data is analyzed comprehensively and systematically.

Covering as many survey responses as possible can help researchers generate a more accurate analysis of the survey data. It can also provide valuable insights into the research question being studied and help identify areas for further research and exploration.

Group responses based on themes, not wording

Grouping responses based on themes rather than wording can make sure similar responses are categorized together, even if they are expressed using different words or phrases.

Use multiple coders to review the data to minimize the risk of overlooking themes or introducing bias into the coding process. This approach can help identify patterns and themes that may not be immediately apparent and provide valuable insights into the research questions.

Make accuracy a priority

Prioritizing accuracy can help ensure the data is analyzed in a thorough, reliable manner.

Develop clear guidelines for the coding process, including definitions of each code and specific criteria for applying them. This can ensure all coders are using the same criteria and the coding is consistent across the entire dataset.
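One way to check whether coders are applying the guidelines consistently is to compare the codes two coders assign to the same segments. The sketch below computes simple percent agreement and Cohen's kappa, a widely used statistic that adjusts agreement for chance. The choice of these measures is an assumption made for illustration, and the coder data is invented.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of segments on which both coders chose the same code."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement between two coders, corrected for chance agreement."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a = Counter(codes_a)
    counts_b = Counter(codes_b)
    # Chance agreement: probability both coders pick the same code at random,
    # given how often each coder used each code.
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same six segments.
coder_a = ["unfair", "embarrassment", "unfair",
           "No Assistance", "unfair", "embarrassment"]
coder_b = ["unfair", "embarrassment", "No Assistance",
           "No Assistance", "unfair", "unfair"]

print(round(percent_agreement(coder_a, coder_b), 3))
print(round(cohens_kappa(coder_a, coder_b), 3))
```

Low agreement on a particular code usually points to a definition in the codebook that needs tightening rather than to a careless coder.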

What is codebook thematic analysis?

Codebook thematic analysis is a qualitative data analysis method. It involves creating a codebook or set of codes to identify and analyze themes in a data set. This approach offers a systematic and rigorous way to analyze qualitative data and can identify patterns and relationships in the data.

What is thematic coding?

Thematic coding is a qualitative data analysis method that identifies and categorizes patterns or themes in the data. It can identify themes across multiple data sources, such as interviews, focus groups, and open-ended survey responses.

What is a data dictionary vs. a codebook?

Data dictionaries and codebooks are both tools used in data management and analysis, but there are differences between the two.

A data dictionary provides information about the structure and content of a dataset.

A codebook provides information about the coding schema used in qualitative data analysis to ensure the qualitative data is systematically and consistently analyzed.

Codes and Coding

  • First Online: 27 January 2024

  • Ajay Gupta

Part of the book series: Springer Texts in Social Sciences ((STSS))

Qualitative research is built on codes, and researchers must master the processes for creating codes and drawing insights from their analyses. In this chapter, the author discusses the various types of codes and approaches to coding. The quality of the output of qualitative data analysis depends on the codes and the coding process. Codes may be relevant or irrelevant, and the chapter explains the differences between them and their significance. It also discusses code categories and groups, and how themes are derived from them. Finally, the chapter introduces the coding cycle and Computer Assisted/Aided Qualitative Data Analysis Software (CAQDAS). After a brief overview, the author highlights the advantages and limitations of CAQDAS.

Recommended Readings

Abbott, A. D. (2004). Methods of discovery Heuristics for the Social Sciences.

Google Scholar  

Bernard, H. R. (2006). Social research methods: Qualitative and quantitative approaches . Sage.

Bernard, H. R. (2013). Social research methods: Qualitative and quantitative approaches . Sage.

Boeije, H. (2010). Analysis in qualitative research . Sage Publications Ltd.

Boyatzis, R. E. (1998).  Transforming qualitative information: Thematic analysis and code development . sage.

Bryant, A., & Charmaz, K. (Eds.). (2019). The SAGE handbook of current developments in grounded theory . Sage.

Charmaz, K. (2014). Constructing grounded theory (2nd ed.). Sage.

Charmaz, K., & Mitchell, R. G. (2001). Grounded theory in ethnography. In Handbook of ethnography (pp. 160–174).

Coffey, A., & Atkinson, P. (1996). Making sense of qualitative data: Complementary research strategies . Sage Publications, Inc.

Coghlan, D., & Shani, A. B. (2014). Creating action research quality in organization development: Rigorous, reflective and relevant. Systemic Practice and Action Research, 27 , 523–536.

Article   Google Scholar  

Coghlan, D., & Brannick, T. (2014). Doing Action Research in your own organization (4th ed.). London. Sage.

Corbin, J. (2007). Strategies for qualitative data analysis. Journal of Qualitative Research , 67–85.

Corbin, J., & Strauss, A. (2015). Basics of qualitative research: techniques and procedures for developing grounded theory (4th ed.). Sage.

Crabtree, B. F., & Miller, W. F. (1992). A template approach to text analysis: Developing and using codebooks.

Creswell, J. W. (2015). Revisiting mixed methods and advancing scientific practices. In The Oxford handbook of multimethod and mixed methods research inquiry .

Dey, I. (1999). Grounding grounded theory: Guidelines for qualitative inquiry. No title .

Eisenhardt, K. M., & Graebner, M. E. (2007). Theory building from cases: Opportunities and challenges. Academy of Management Journal, 50 (1), 25–32.

Fox, M., Martin, P., & Green, G. (2007). Doing practitioner research . Sage.

Book   Google Scholar  

Franzosi, R. (Ed.). (2010). Quantitative narrative analysis (No. 162). Sage.

Friese, S. (2019). Qualitative data analysis with ATLAS. ti. Sage.

Frith, H., & Gleeson, K. (2004). Clothing and embodiment: men managing body image and appearance. Psychology of Men & Masculinity, 5 (1), 40–48.

Gioia, D. A., Corley, K. G., & Hamilton, A. L. (2013). Seeking qualitative rigor in inductive research: Notes on the Gioia methodology. Organizational Research Methods, 16 (1), 15–31.

Glaser, B. (2005). The grounded theory perspective III: Theoretical coding . Sociology Press

Glaser, B., & Strauss, A. (1967). Grounded theory: The discovery of grounded theory. Sociology the Journal of the British Sociological Association, 12 (1), 27–49.

Glesne, C. (2011). Becoming qualitative researchers: An introduction (4th ed.). Pearson Education Inc.

Grbich, C. (2007). An introduction: Qualitative data analysis . London, UK: Sage. Grootenhuis, M. A., & Last, B. F. (1997). Predictors of parental emotional adjustment to childhood cancer. Psycho-Oncology, 6 (2), 115–128.

Grbich, C. (2012). Qualitative data analysis: An introduction. Qualitative Data Analysis , 1–344.

Gupta, A. K. (2014). A comparative study of middle managers morale in two public sector banks in India.

Hatch, J. A. (2002a). Doing qualitative research in education settings. Suny Press.

Hatch, J. A. (2002b). Doing qualitative research in educational settings . State University of New York Press.

Hayes, N. (1997). Theory-led thematic analysis: social identification in small companies. In N. Hayes (Ed.), Doing qualitative analysis in psychology . Psychology Press.

Hennink, M., Hutter, I., & Bailey, A. (2011). Qualitative research methods . Sage Publications.

Kelle, U., & Bird, K. (Eds.). (1995). Computer-aided qualitative data analysis: Theory, methods and practice . Sage.

Layder, D. (1998). Sociological practice: Linking theory and social research. Sociological Practice , 1–208.

Lewins, A., & Silver, C. (2007). Qualitative coding in software: principles and processes. Using software in qualitative research. Using Software in Qualitative Research . 10.9780857025012.

Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic Inquiry . Sage.

Manning, J., & Kunkel, A. (2014a). Making meaning of meaning-making research: Using qualitative research for studies of social and personal relationships. Journal of Social and Personal Relationships, 31 (4), 433–441.

Manning, J., & Kunkel, A. (2014b). Researching interpersonal relationships: Qualitative methods, studies, and analysis . Sage.

Mason, J. (2002). Qualitative researching (2nd ed.). Sage.

Maxwell, J. A. (2012). The importance of qualitative research for causal explanation in education. Qualitative Inquiry, 18 (8), 655–661.

Merton, R. K. (1987). The focussed interview and focus groups: Continuities and discontinuities. The Public Opinion Quarterly, 51 (4), 550–566.

Miles, M. B., Huberman, A. M., & Saldaňa, J. (2014). Qualitative data analysis: A methods sourcebook (3rd ed.).

Morrison, K. (2009). Causation in educational research . Taylor & Francis eBooks DRM Free Collection.

Morrison, K. (2012). Causation in educational research . Routledge.

Muhr, T. (1991). ATLAS/ti—A prototype for the support of text interpretation. Qualitative Sociology, 14(4), 349–371.

Munton, A., Silvester, J., Stratton, P., & Hanks, H. (1999). Attributions in action . Wiley.

Patton, M. Q. (2002). Two decades of developments in qualitative inquiry: A personal, experiential perspective. Qualitative Social Work, 1 (3), 261–283.

Pierce, C. (1978). Pragmatism and abduction. In C. Hartshorne & P. Weiss (Eds.), Collected papers (Vol. 5, pp. 180–212). Harvard University Press.

Quine, S., Bernard, D., & Kendig, H. (2006). Understanding baby boomers’ expectations and plans for their retirement: Findings from a qualitative study. Australasian Journal on Ageing, 25 (3), 145–150.

Richards, L., & Morse, J. M. (2007). Coding. In Readme first for a user’s guide to qualitative methods (pp. 133–151).

Richards, L., & Morse, J. M. (2012). Readme first for a user′s guide to qualitative methods . Sage publications.

Richards, L., & Morse, J. M. (2013). Readme first for a user’s guide to qualitative methods (3rd ed.). London, England. Sage.

Rossman, G. B., & Rallis, S. F. (2003). Learning in the field: An introduction to qualitative research (2nd ed.). Sage Publications.

Saldana, J. (2016). Saldana-coding manual for qualitative research-Introduction to codes & coding. The coding manual for qualitative researchers , 1–39.

Saldaña, J. (2009). The coding manual for qualitative researchers . Sage.

Saldaña, J. (2021). The coding manual for qualitative researchers . sage.

Spradley, J. P. (1980). Making an ethnographic record . Participant observation.

Spradley, J. P. (2016). Participant observation . Waveland Press.

Stebbins, R. A. (2001). What is exploration. Exploratory Research in the Social Sciences, 48 , 2–17.

Stern, P. N., & Porr, C. J. (2011). Essentials of grounded theory .

Stenner, P. (2014). Pattern. In Lury, C., & Wakeford, N. (Eds.), Inventive methods: The happening of the social (pp. 136–146). New York: Routledge.

Strauss, A. L. (1987). Qualitative analysis for social scientists . Cambridge University Press.

Strauss, A., & Corbin, J. (1998). Basics of qualitative research techniques.

Stringer, E. T. (2014). Action research (4th ed.). Sage Publishing.

Swain, J. (2018). A hybrid approach to thematic analysis in qualitative research: Using a practical example . SAGE Publications Ltd.

Timmermans, S., & Tavory, I. (2012). Theory construction in qualitative research: From grounded theory to abductive analysis. Sociological Theory, 30 (3), 167–186.

Vogt, W. P., Gardner, D. C., Haeffele, L. M., & Vogt, E. R. (2014). Selecting the right analyses for your data: Quantitative, qualitative, and mixed methods . Guilford Publications.

Wolcott, H. F. (1994). Transforming qualitative data: Description, analysis, and interpretation . Sage.

Download references

Author information

Authors and affiliations.

Human Resource Management, VES Business School, Mumbai, Maharashtra, India

Corresponding author

Correspondence to Ajay Gupta .

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Gupta, A. (2023). Codes and Coding. In: Qualitative Methods and Data Analysis Using ATLAS.ti. Springer Texts in Social Sciences. Springer, Cham. https://doi.org/10.1007/978-3-031-49650-9_4

DOI : https://doi.org/10.1007/978-3-031-49650-9_4

Published : 27 January 2024

Publisher Name : Springer, Cham

Print ISBN : 978-3-031-49649-3

Online ISBN : 978-3-031-49650-9

eBook Packages : Social Sciences Social Sciences (R0)

Academic Success Center

Research Writing and Analysis

Analysis and coding example: qualitative data.

The following is an example of how to engage in a three-step analytic process of coding, categorizing, and identifying themes within the data presented. Note that different researchers would come up with different results based on their specific research questions, literature review findings, and theoretical perspective.

The literature describes many ways to analyze qualitative data. The analytic plan in this exercise uses a constant comparative approach (Glaser & Strauss, 1967) built on a three-step process of open coding, categorizing, and synthesizing themes. The constant comparative process involves thinking about how the comments are interrelated. Intertwined with this three-step process, the example uses content analysis techniques as described by Patton (1987), through which coherent and salient themes and patterns are identified throughout the data. These appear as the congruencies and incongruencies recorded in the memos and relational matrix.

Step 1: Open Coding

Codes for the qualitative data are created through a line-by-line analysis of the comments. Codes are based on the research questions, literature review, and theoretical perspective articulated. Numbering the lines is helpful so that the researcher can note which comments they might like to quote in their report.

It is also useful to include memos to remind yourself of what you were thinking and to let you reflect on your initial interpretations as you work through the next two analytic steps. Memos also serve as a reminder of issues to address if there is an opportunity for follow-up data collection. This technique gives the researcher time to reflect on how their biases might affect the analysis. Using different colored text for memos makes it easy to differentiate thoughts from the data.

Many novice researchers forgo this step. Instead, they move straight to arranging entire statements into categories that have been pre-identified. There are two problems with that shortcut. First, because the categories were listed without open coding, it is unclear where they came from. By contrast, a researcher who uses open coding looks at each line of text individually, without consideration for the others. Breaking the data into pieces and then putting them back together through analysis ensures that the researcher considers all of the data equally and limits the bias that might be introduced. Second, if a researcher is coding interviews or other large amounts of qualitative data, skipping this step will likely become overwhelming as the researcher tries to organize and remember the context each piece of data came from.
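The bookkeeping described in this step (numbered lines, codes, and memos kept apart from the data) can be sketched in a few lines of Python. This is only a minimal illustration: the `CodedLine` structure, the code labels, and the memo text are assumptions made for the sketch, while the two comments are quotes that appear later in this example.

```python
from dataclasses import dataclass, field


@dataclass
class CodedLine:
    number: int   # line number, useful for locating quotes later
    text: str     # the participant comment, unchanged
    codes: list = field(default_factory=list)
    memo: str = ""  # researcher's reflection, stored apart from the data


def number_lines(comments):
    """Give each comment a 1-based line number so it can be cited later."""
    return [CodedLine(number=i, text=c) for i, c in enumerate(comments, start=1)]


comments = [
    "Some teachers are carrying the weight for other teachers.",
    "We need consistent decisions about school rules",
]
lines = number_lines(comments)

# Illustrative codes and memo (assumptions, not the guide's own coding).
lines[0].codes.append("unequal workload")
lines[0].memo = "Possible belief that some peers are less qualified?"
lines[1].codes.append("inconsistent decisions")

for line in lines:
    print(f"{line.number:>3} | {line.text} | codes={line.codes} | memo={line.memo!r}")
```

Keeping the memo in its own field plays the same role as the colored text suggested above: the researcher's thoughts never get mixed into the participant's words.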

Step 2: Categorizing

To categorize the codes developed in Step 1, list the codes and group them by similarity. Then, identify an appropriate label for each group. The following table reflects the result of this activity.

Step 3: Identification of Themes

In this step, review the categories as well as the memos to determine the themes that emerge. In the discussion below, three themes emerged from the synthesis of the categories. Relevant quotes from the data that exemplify the essence of each theme are included. These can be used in the discussion of findings. The relational matrix demonstrates the researcher’s pattern of thinking during this step of the analysis. This is similar to an axial coding strategy.

Note that this set of data is limited and leaves some questions open. In a well-developed study, this would be just one part of the data collected, and there would be other data sets and/or opportunities to clarify or verify some of the interpretations made below. In addition, since there is no literature review or theoretical statement, there are no reference points from which to draw inferences from the data. Some assumptions were made in these areas for the purposes of this demonstration.
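Steps 2 and 3 amount to two mappings: codes are grouped into labeled categories, and categories are synthesized into themes. The sketch below shows one way to record and roll up those mappings. The code and category labels are hypothetical, invented for illustration; only the three theme names come from this example.

```python
from collections import defaultdict

# Hypothetical open codes mapped to hypothetical category labels (Step 2).
code_to_category = {
    "unequal workload": "peer comparison",
    "wants feedback": "recognition",
    "team rivalry": "team conflict",
    "old vs new staff": "team conflict",
    "inconsistent decisions": "decision making",
}

# Categories mapped to the three themes named in this example (Step 3).
category_to_theme = {
    "peer comparison": "Professional Standing",
    "recognition": "Professional Standing",
    "team conflict": "Group Dynamics and Collegiality",
    "decision making": "Leadership Issues",
}


def roll_up(code_to_category, category_to_theme):
    """Build a theme -> category -> sorted list of codes hierarchy."""
    themes = defaultdict(lambda: defaultdict(list))
    for code, category in code_to_category.items():
        theme = category_to_theme[category]
        themes[theme][category].append(code)
    return {
        theme: {cat: sorted(codes) for cat, codes in cats.items()}
        for theme, cats in themes.items()
    }


hierarchy = roll_up(code_to_category, category_to_theme)
for theme, categories in hierarchy.items():
    print(theme, "->", categories)
```

Writing the mappings down explicitly makes it possible to trace any theme back to the codes, and therefore the lines of data, that support it.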

Theme 1:  Professional Standing

Individual participants have articulated issues related to their own professional position. They are concerned about what and when they will teach, their performance, and the respect and prestige that they have within the school. For example, they are concerned about both their physical environment and the steps that they have to take to ensure that they have the up-to-date tools they need. They also want their efforts to be acknowledged, sometimes in relation to their peers and their belief that they are more effective than those peers.

Selected quotes:

  • Some teachers are carrying the weight for other teachers. (Demonstrates that they think that some of their peers are not qualified.)
  • We need objective observations and feedback from the principal. (Demonstrates that they are looking for acknowledgement of their efforts. Alternatively, this could be interpreted as a belief that their peers who are less qualified should be acknowledged.)
  • There is a lack of support for individual teachers

Theme 2:  Group Dynamics and Collegiality

Rationale: Groups or cliques have formed, and this seems to be the basis for some of the conflict. This conflict is closely related to the status and professional standing themes. This theme, however, has more to do with group issues, while the first theme reflects an individual perspective. Some teachers and/or subjects are seen as more prestigious than others. Some of this is related to longevity. This creates jealousy and inhibits collegiality, which in turn affects peer interaction, instruction, and communication.

  • Grade level teams work against each other rather than together.
  • Each team of teachers has stereotypes about the other teams.
  • There is a division between the old and new teachers

Theme 3:  Leadership Issues

Rationale: There seems to be a lack of leadership and of shared understanding about the general direction in which the school will go. This is also reflected in a lack of two-way communication. The leadership of the school does not seem to offer information, nor does there seem to be an opportunity for individuals to share their thoughts, let alone take part in decision making. Leadership also seems not to intervene in the conflict.

  • Decisions are made on inaccurate information.
  • We need consistent decisions about school rules

Coding Example - Category - Relationships - Themes

Glaser, B. G., & Strauss, A. (1967). The discovery of grounded theory: Strategies for qualitative research. Chicago, IL: Aldine.

Patton, M. Q. (1987). How to use qualitative methods in evaluation. Newbury Park, CA: Sage Publications.

  • Last Updated: Apr 2, 2024 6:35 PM
  • URL: https://resources.nu.edu/researchtools

From Science to Programming: The Role of Coding in Research

Level up your research skills with the ultimate guide to coding in research. Start mastering this skill today and become an expert!

In today’s rapidly evolving research landscape, the integration of coding and programming has emerged as a powerful force, revolutionizing the way we approach scientific inquiry. With the exponential growth of data and the increasing complexity of research questions, coding has become an essential tool for researchers across a wide range of disciplines.

The synergy between coding and research extends beyond data analysis. Through simulation and modeling, researchers can use code to create virtual experiments and test hypotheses in silico. By emulating complex systems and scenarios, researchers gain valuable insights into the behavior of biological, physical, and social phenomena that may be difficult or impossible to observe directly. Such simulations enable researchers to make predictions, optimize processes, and design experiments with greater precision and efficiency. 

This article explores the pivotal role that coding plays in research, highlighting its transformative impact on scientific practices and outcomes.

Introduction to Coding in Research

The history of coding and programming incorporation into research methodologies is rich and fascinating, punctuated by important milestones that influenced how the scientific community approaches data analysis, automation, and discovery.

Coding in research dates back to the middle of the 20th century, when advances in computing technology created new opportunities for processing and analyzing data. In the beginning, coding was largely concerned with low-level programming and with algorithms for solving mathematical problems. Programming languages like Fortran and COBOL were created during this period, laying the foundation for further advances in research coding.

A turning point came in the 1960s and 1970s, when researchers realized how effective coding could be at managing massive amounts of data. The emergence of statistical software such as SAS and SPSS during this period let researchers analyze data sets more quickly and carry out sophisticated statistical calculations. Researchers in disciplines like the social sciences, economics, and epidemiology came to rely on coding to find patterns in their data, test hypotheses, and derive valuable insights.

Personal computers became widespread and coding tools became more accessible during the 1980s and 1990s. Integrated development environments (IDEs) and graphical user interfaces (GUIs) lowered barriers to entry and helped coding become a common research technique by opening it to a wider range of researchers. The development of scripting languages like Python and R also provided new opportunities for data analysis, visualization, and automation, further establishing coding’s role in research.

The fast development of technology at the turn of the 21st century drove the big data era and ushered in a new era of coding in academic research. In order to extract useful insights, researchers had to deal with enormous amounts of complicated and heterogeneous data, which called for advanced coding approaches. 

Data science emerged as a result, merging coding expertise with statistical analysis, machine learning, and data visualization. With the introduction of open-source frameworks and libraries like TensorFlow, PyTorch, and scikit-learn, researchers now have access to powerful tools for tackling challenging research problems and maximizing the potential of machine learning algorithms.

Today, coding is a crucial component of research in a wide range of fields, from the natural sciences to the social sciences and beyond. It has evolved into a universal language that enables researchers to examine and analyze data, model and automate processes, and simulate complex systems. Increasingly, coding is combined with cutting-edge technologies like artificial intelligence, cloud computing, and big data analytics to push the boundaries of research and help scientists solve difficult problems and discover novel insights.

Types of Coding in Research

There are many different types and applications of coding used in research, and researchers use them to improve their studies. Here are a few of the main coding types that are employed in research:

Data Analysis Coding

Data analysis coding means writing code to process, clean, and analyze large and complicated datasets. Using languages like Python, R, MATLAB, or SQL, researchers can run statistical analyses, visualize data, and identify patterns or trends to extract valuable insights.
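As a small, hedged illustration of the clean-then-analyze pattern described above, the Python sketch below filters out unusable records and computes basic descriptive statistics using only the standard library. The measurements and the `clean` and `summarize` helpers are hypothetical.

```python
import statistics

# Hypothetical raw measurements; None and the text value stand in for bad records.
raw = [12.5, None, 14.1, "n/a", 13.3, 55.0, 12.9]


def clean(values):
    """Keep only numeric entries, dropping missing or malformed ones."""
    return [v for v in values if isinstance(v, (int, float))]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


summary = summarize(clean(raw))
print(summary)
```

The gap between the mean and the median in this toy data hints at an outlier (55.0), which is the kind of pattern a researcher would then investigate rather than silently discard.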

Automation Coding

Automation coding focuses on automating repetitive tasks and workflows in research processes. By writing scripts or programs, researchers can speed up data collection, data preparation, experimental procedures, or report generation. This saves time and ensures consistency between experiments or analyses.

Simulation and Modeling Coding

Simulation and modeling coding is used to develop computer-based simulations or models that replicate real-world systems or phenomena. With coded simulations, researchers can test hypotheses, examine the behavior of complex systems, and investigate scenarios that would be challenging or expensive to recreate in the real world.
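A toy example may make the idea concrete. The sketch below is a minimal Monte Carlo simulation in Python that estimates the probability of rolling a total of 7 with two fair dice and compares it with the known exact value of 1/6. The scenario and the function name `simulate_dice_sum` are assumptions chosen for illustration; real research simulations are far richer, but they follow the same pattern of repeated randomized trials.

```python
import random


def simulate_dice_sum(target, trials, seed=0):
    """Estimate P(sum of two fair dice == target) by repeated random trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        if rng.randint(1, 6) + rng.randint(1, 6) == target:
            hits += 1
    return hits / trials


estimate = simulate_dice_sum(target=7, trials=100_000)
print(f"simulated: {estimate:.3f}, exact: {1 / 6:.3f}")
```

Fixing the seed is a deliberate choice here: it lets another researcher rerun the script and obtain the same estimate, which supports reproducibility.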

Machine Learning and Artificial Intelligence (AI)

Machine learning and AI coding entail teaching algorithms and models to analyze information, identify trends, forecast outcomes, or carry out certain tasks. In fields like image analysis, natural language processing, or predictive analytics, researchers use coding techniques to preprocess data, construct and fine-tune models, evaluate performance, and use these models to solve research challenges.

Web Development and Data Visualization

Web development coding is used in research to produce interactive web-based tools, data dashboards, or online surveys to gather and display data. To successfully explain the results of research, researchers may also use coding to create plots, charts, or interactive visualizations.

Software Development and Tool Creation

To complement their research, some researchers may create specific software tools or applications. To enable data management, analysis, or experimental control, this type of coding entails building, developing, and maintaining software solutions adapted to particular research aims. 

Collaborative Coding

Working on coding projects with peers or colleagues is known as collaborative coding. To increase transparency, reproducibility, and collective scientific knowledge, researchers can participate in code reviews, contribute to open-source projects, and share their code and methodology.

Methods of Coding Qualitative Data

Researchers use a variety of techniques to code qualitative data and make sense of what they have collected. The following are some common methods for coding qualitative data:

  • Thematic Coding: Researchers identify recurring themes or patterns in the data by assigning descriptive codes to segments of text that represent specific themes, facilitating organization and analysis of qualitative information.
  • Descriptive Coding: Codes are allocated to data segments based on the content or qualities of the information. This allows for the creation of an initial overview and the identification of different aspects or dimensions of the phenomenon under research.
  • In Vivo Coding: Participants’ own words or phrases are used as codes to distill their experiences or perspectives. This preserves authenticity and puts an emphasis on lived experiences.
  • Conceptual Coding: Data are coded based on theoretical concepts or frameworks pertinent to the research. This allows for the use of pre-existing theories and the establishment of connections between qualitative data and theoretical constructs.
  • Comparative Coding: Systematic comparisons between different situations or individuals are undertaken to uncover similarities and differences in the data. These comparisons are then represented by codes. This approach improves comprehension of variances and subtleties in the data set.
  • Pattern Coding: Recurring patterns or sequences of occurrences are found in the qualitative data, and codes are assigned to them to indicate the patterns. By revealing temporal or causal connections, pattern coding sheds light on underlying dynamics or processes.
  • Relationship Coding: Within the qualitative data, connections, dependencies, or linkages between different concepts or themes are analyzed. In order to understand the interactions and linkages between many different data items, researchers develop codes that describe these relationships.

Advantages of Qualitative Research Coding

Qualitative research coding has a number of advantages for data analysis. Firstly, it gives the analytic process structure and order, enabling researchers to logically categorize and organize qualitative data. By reducing the amount of data, it also makes important themes and patterns easier to identify.

Coding additionally makes it possible to thoroughly explore the data, revealing context and hidden meanings. By offering a documented and repeatable process, it also improves the research’s transparency and rigor. 

Coding makes data comparison and synthesis more straightforward, aids in the creation of theories, and produces deep insights for interpretation. It provides adaptability, flexibility, and the capacity for group analysis, which promotes consensus and strengthens the reliability of findings.  

Coding enables an improved understanding of the research topic by combining qualitative data with other research methods.  

In general, qualitative research coding improves the quality, depth, and interpretive capacity of data analysis, allowing researchers to gain insightful knowledge and develop their fields of study.

Tips for Coding Qualitative Data

  • Become familiar with the data: Before starting the coding process, thoroughly understand the content and context of the qualitative data by reading and immersing yourself in it.
  • Utilize a coding system: Whether utilizing descriptive codes, thematic codes, or a combination of methods, create a clear and consistent coding system. To ensure uniformity throughout the research, describe your coding system in writing.
  • Code inductively and deductively: Consider using both inductive and deductive coding to capture a wide range of ideas. Inductive coding involves identifying themes that emerge from the data; deductive coding involves using theories or concepts that already exist.
  • Use open coding initially: Start by freely assigning codes to different data segments without using predetermined categories. This open coding strategy enables exploration and the discovery of early patterns and themes.
  • Review and refine codes: As you move through the analysis, regularly examine and make adjustments to the codes. Clarify definitions, combine similar codes, and make sure that codes appropriately reflect the content to which they are assigned.
  • Establish an audit trail: Record your coding decisions, rationales, and thought processes in great detail. This audit trail serves as a reference for upcoming analysis or discussions and helps to maintain transparency and reproducibility. 
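To make the deductive side of these tips concrete, the Python sketch below applies a small predefined codebook to text segments by matching keywords, and sets aside the segments that no existing code covers as candidates for inductive coding. The codebook, its keywords, and the function name are hypothetical, and keyword matching is only a crude stand-in for the judgment a human coder applies.

```python
import re

# Hypothetical deductive codebook: each code lists keywords that trigger it.
CODEBOOK = {
    "leadership": ["principal", "decision", "rules"],
    "collegiality": ["team", "teams", "colleague"],
}


def deductive_codes(segment, codebook=CODEBOOK):
    """Return the sorted codes whose keywords appear as whole words in the segment."""
    words = set(re.findall(r"[a-z']+", segment.lower()))
    return sorted(
        code for code, keywords in codebook.items() if words & set(keywords)
    )


segments = [
    "We need consistent decisions about school rules",
    "Grade level teams work against each other rather than together.",
    "There is a lack of support for individual teachers",
]
coded = {s: deductive_codes(s) for s in segments}

# Segments with no matching code are candidates for new, inductively derived codes.
uncoded = [s for s, codes in coded.items() if not codes]

for segment, codes in coded.items():
    print(codes, "<-", segment)
```

Any code created for the uncoded segments would then be added to the written coding system, with a note in the audit trail explaining when and why it was introduced.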

Ethical Considerations in Coding

When coding qualitative data, ethics must come first. Prioritizing informed consent can help researchers ensure that participants have given their approval for data usage, including coding and analysis. In order to protect participants’ names and personal information during the coding process, anonymity and confidentiality are essential.

To ensure impartiality and fairness, researchers must be reflective about personal biases and their influence on coding decisions. It is important to respect the opinions and experiences of participants and to refrain from exploiting or misrepresenting them. 

The ability to recognize and convey different points of view with proper cultural awareness is indispensable, as well as treating participants with respect and upholding any agreements made. 

By addressing these ethical considerations, researchers uphold integrity, protect participants’ rights, and contribute to responsible qualitative research practices.
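One practical way to support anonymity during coding is to replace participant names with neutral labels before the transcripts are shared with coders. The Python sketch below shows a minimal version of that idea. The names, the transcript text, and the `pseudonymize` function are hypothetical, and a real project would also need to handle nicknames, misspellings, and other identifying details that simple matching cannot catch.

```python
import re


def pseudonymize(text, names):
    """Replace each known participant name with a stable label such as P1, P2."""
    labels = {name: f"P{i}" for i, name in enumerate(names, start=1)}
    for name, label in labels.items():
        # \b keeps "Ann" from matching inside a longer word like "Annual"
        text = re.sub(rf"\b{re.escape(name)}\b", label, text)
    return text, labels


transcript = "Maria said the Annual review helped. Ann agreed with Maria."
clean_text, key = pseudonymize(transcript, ["Maria", "Ann"])
print(clean_text)
```

The returned `key` links labels back to real names, so it should be stored separately from the coded data and protected under whatever agreements were made with participants.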

Common Mistakes to Avoid in Coding in Research

When coding in research, it’s important to be aware of common mistakes that can impact the quality and accuracy of your analysis. Here are some mistakes to avoid:

  • Lack of precise code instructions: To preserve consistency, make sure there are explicit coding instructions.
  • Overcoding or undercoding: Strike a balance between capturing important details and fragmenting the data into more codes than the analysis can use.
  • Ignoring or dismissing deviant cases: Acknowledge and code outliers for comprehensive insights.
  • Failure to maintain consistency: Consistently apply coding rules and review codes for reliability.
  • Lack of intercoder reliability: Establish consensus among team members to address discrepancies.
  • Not documenting coding decisions: Maintain a detailed audit trail for transparency and future reference.
  • Bias and assumptions: Stay aware of biases and strive for objectivity in coding.
  • Insufficient training or familiarity with data: Invest time in understanding the data and seek guidance if needed.
  • Lack of data exploration: Thoroughly analyze the data to capture its richness and depth.
  • Failure to review and validate codes: Regularly review and seek input to refine the coding scheme.
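
To make the intercoder reliability point concrete, here is a minimal Python sketch of Cohen's kappa, one common agreement statistic for two coders. Kappa is not named in the list above; it is brought in here purely as an illustration, and the code labels and the `cohen_kappa` helper are hypothetical.

```python
from collections import Counter

def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per segment."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of segments")
    n = len(coder_a)
    # Observed agreement: share of segments given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: computed from each coder's own code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned to six interview segments.
coder_a = ["access", "cost", "access", "community", "cost", "access"]
coder_b = ["access", "cost", "community", "community", "cost", "access"]
kappa = cohen_kappa(coder_a, coder_b)
```

For this made-up data the coders agree on five of six segments, and kappa comes out at 0.75 once chance agreement is discounted. The segments where the coders disagree are the ones worth discussing as a team.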

About Jessica Abbadia

Jessica Abbadia is a lawyer that has been working in Digital Marketing since 2020, improving organic performance for apps and websites in various regions through ASO and SEO. Currently developing scientific and intellectual knowledge for the community's benefit. Jessica is an animal rights activist who enjoys reading and drinking strong coffee.

Ethnography Made Easy OER

Coding Qualitative Data

Camila Torres Rivera

You’ve gathered all your data carefully so… now what?

Qualitative researchers often ask the same question at this point in the process…now what? It can be daunting to look over your massive collection of interviews, field notes, and transcripts and not know how to begin to use it to answer your research question.  While you have gathered important information, the connections of this information to your research question may not be completely clear. Organizing the raw data in a way that makes sense to you will require you to read and re-read your raw data carefully and patiently. While there are various ways to prepare data for analysis, this chapter will focus on a commonly used method called coding.

What are codes?

A code is a word or short phrase that assigns an attribute (e.g. translation, feeling, category, summary, idea) to a section of text (Saldaña, 2015).  Coding is the act of assigning a code to a section of raw data text for interpretive purposes to gain meaning (Charmaz & Mitchell, 2001). The purpose of coding is to filter and organize the raw data so it will be easier to detect patterns or sequences, to identify themes, and to build theories to answer the research question (Bogdan & Biklen, 1997).

Let’s look at a simple coding example for a portion of an interview conducted to learn the value of the open-access music site called SoundCloud. The interview, organized in a table format below, has two key features: the columns and the rows. The left column has the transcript of the interview; the right column has codes aligned with the related text. When you read, you should read across the entire row (i.e. the question and the question codes together) and then read the next row the same way (i.e. the answer and the answer codes together).

The eight unique codes in the right column (accessible was used three times) are attributes that are associated with the 135 words of the text. Rather than just reading the 135 words and hoping to remember what we read, assigning codes now gives us the following advantages:

  • The codes leave a trail of the thought process during the reading.
  • We will be able to find the ideas later.
  • We will be able to associate related sections of the data with each other.
  • We know how many times the same attribute was used and which attributes were primary and secondary.
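
The advantages above can be sketched in a few lines of Python. The segments, excerpts, and code names below are invented for illustration (they are not the SoundCloud transcript); only the idea of counting codes and finding coded sections again comes from the text.

```python
from collections import Counter, defaultdict

# Hypothetical coded segments: (segment id, text excerpt, codes).
segments = [
    (1, "Anyone can upload a track for free.", ["accessible", "creator"]),
    (2, "Listeners can comment on any song.", ["accessible", "listener"]),
    (3, "It works on my phone too.", ["accessible"]),
]

# How many times was each code used?
code_counts = Counter(code for _, _, codes in segments for code in codes)

# Where was each code used? Maps a code to the ids of its segments.
code_index = defaultdict(list)
for segment_id, _, codes in segments:
    for code in codes:
        code_index[code].append(segment_id)
```

Here `code_counts` answers "how often was this attribute used?", while `code_index` lets us return to every section that shares a code.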

How do ethnographers prepare for coding?

The purpose of qualitative research is to answer a research question. Therefore, before we begin to work with the raw data, we must be certain that we are also focused on this goal.  Aside from concentrating on the research question, ethnographers should consider other factors in their coding such as their approach to the research, topics, and efficiency.

Research approach: Deductive vs. Inductive

  • Inductive research is more commonly used in qualitative studies (Bogdan & Biklen, 1997). In the inductive research approach, the researcher reads the data and develops theories based on what appears in the raw data rather than on any preconceived notions. The researcher will develop codes DURING the reading of the data and connect ideas in their mind through a process called open coding. Open-coding is a reactive and iterative process between the data and the researcher. In other words, open-coding documents the researcher’s reactions to the data as the researcher continuously interacts with it. As an example, imagine the researcher is reading a transcript and something they read strikes them as familiar, unusual, or obvious. The data then causes a natural, spontaneous reaction within the researcher. In this case, the researcher would create a code to mark that section of text for future reference and/or analysis.
  • In some circumstances, a researcher may want to know whether certain predetermined ideas exist in the data. Sometimes a researcher will already have some experience with a topic and want to further their understanding of it. When the researcher wants to use the raw data to prove their ideas or hypothesis, they engage in deductive research. The researcher will create a code BEFORE they start the coding process and search the data for examples of the code. In this case, the predetermined code is referred to as an a priori code. In previous chapters, we discussed how ethical research requires value neutrality. While deductive research and a priori codes do not automatically introduce bias into the coding, the researcher should be especially careful to code a section of text with an a priori code only when there is a very clear connection between the code and the text. Using both of these methods can help lessen potential biases.
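
As a rough illustration of the deductive approach, the Python sketch below searches a section of text for codes that were fixed before reading. The codes, their indicator words, and the `apply_a_priori_codes` helper are all assumptions made for this example. A keyword match only flags a candidate; the researcher still has to confirm that the connection between code and text is clear.

```python
import re

# Hypothetical a priori codes, each with indicator words chosen BEFORE coding.
A_PRIORI_CODES = {
    "cost": ["free", "price", "pay"],
    "community": ["fans", "followers", "community"],
}

def apply_a_priori_codes(text, codebook=A_PRIORI_CODES):
    """Return the predetermined codes whose indicator words occur in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(code for code, terms in codebook.items() if words & set(terms))

candidates = apply_a_priori_codes("Uploading is free, and my fans can comment.")
```

An inductive reading would work the other way around: the codebook would start empty and grow as the researcher reacts to the data.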

Coding Families

Some codes are used so often in qualitative data analysis that it is almost expected that they will be mentioned at some point in the analysis regardless of the subject matter. Coding families are not codes. Rather, they are categories that suggest different ways in which coding can be accomplished or may be necessary (Glaser, 1978; Bogdan & Biklen, 1997). Because coding families describe large, overarching concepts (i.e. definitions, settings, language structure, etc.), there are usually significant overlaps between coding families. As always, the selection of a coding family should be determined by how the coding family lends itself to answering the research question.

Using the same SoundCloud interview raw data we used earlier in this chapter, let’s look at how different coding families relate to the same text.

Notice that the same text could be used as an example for several coding families. However, you should use the research question as a guide when selecting some coding families over others. For example, if the research question asks, “How do musicians interact with technology?”, we may decide that the word “interact” in the research question relates most to the “activity code” family. However, if the research question asks, “What problems do musicians face in the Music Industry?”, we may decide that the word “problem” in the question relates to “the definition of situation code” family and/or “the perspective codes” family. There may be times in your research where you will have to focus more on one type of code over others, although you should strive to use all of them.

Here is a list of some commonly used code families for your work.  This is not a complete list of code families, but it can give you some ideas on how to begin your coding work.

Selecting Coding Supplies: Paper/Pencil vs. Technology

Ultimately, the researcher will need to physically mark their ideas on the selected text. Deciding whether this should be done using traditional office supplies (e.g. pencils, pens, highlighters, sticky-notes, paper clips, envelopes) or computer programs (e.g. Microsoft Programs like Word or Excel, Google programs like Docs or Sheets, Qualitative Data software) is a personal decision for the researcher.  Both types of supplies have their pros and cons so the selection of the coding supply should be made based on efficiency and cost.

Traditional office supplies (e.g. pencils, pens, highlighters, colored pencils, sticky-notes, scissors, paper clips, envelopes, etc.) are inexpensive, are readily available, and require no training. When researchers use office supplies to code, they simply write codes in the margins of the pages of the data, highlight sections of text using colored highlighters or colored pencils, and/or mark pages with sticky-notes. Sometimes, researchers will assign colors to codes so they can find the related coded text more easily. Sections of text with the same code (or color) can be cut out with scissors and organized into groups with paper clips or envelopes.

There are some drawbacks to consider as well. With traditional office supplies, accounting for all the codes and data manually can become messy and cumbersome.  Also, there may be sections of raw data that can be coded with more than one code so those pages would need to be reproduced and coded more than once (once for each code).  Finally, there is only one copy of the coded text so if the data is lost the researcher would have to start over.

Using technology to code has advantages and disadvantages too. An advantage to using word processing office technology (e.g. Microsoft Word, Google Docs) is that the data can be duplicated and saved quickly. Sections of text can be coded once, or more times, by using the highlighter or comment features. If sections of text have the same code, they can be cut/pasted into new documents to help organize all the codes into different pages or several different files. Depending on the researcher’s familiarity and comfort level, spreadsheet program (e.g. Microsoft Excel, Google Sheets) functionality can be combined with the word processor to automatically create tables with the duplicated comments and codes. Of course, the premier qualitative research programs (e.g. Qualtrics, Quirkos, MAXQDA) offer the greatest functionality by allowing researchers to do things like click-and-drag sections of data into code folders, keep running counts of codes, and create reminders for the researcher.
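
As a rough sketch of the spreadsheet approach, the Python snippet below writes one row per code assignment to CSV text that a spreadsheet program can open. The column names, excerpts, and the `to_code_table` helper are assumptions made for this example, not part of any particular program.

```python
import csv
import io

# Hypothetical coded segments; a segment may carry more than one code.
coded_segments = [
    {"segment_id": 1, "excerpt": "Anyone can upload a track.", "codes": ["accessible", "creator"]},
    {"segment_id": 2, "excerpt": "Fans leave comments.", "codes": ["listener"]},
]

def to_code_table(segments):
    """Return CSV text with one row per (code, segment) pair, sorted by code."""
    rows = [
        (code, segment["segment_id"], segment["excerpt"])
        for segment in segments
        for code in segment["codes"]
    ]
    rows.sort()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["code", "segment_id", "excerpt"])
    writer.writerows(rows)
    return buffer.getvalue()

table = to_code_table(coded_segments)
```

Because a segment with two codes produces two rows, the text never has to be photocopied and coded twice, which is one of the drawbacks of paper noted earlier.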

Technology is a wonderful tool, but drawbacks to using technology should be considered carefully too. Obviously, the researcher will need access to a computer, electricity, and possibly the internet to be able to conduct any work. While some programs are free (e.g. Google Docs, Google Sheets), some programs can cost hundreds of dollars (e.g. Microsoft Office, Qualtrics, Quirkos, MAXQDA). Also, some programs are fairly intuitive and easy to use (e.g. MS Word, Google Docs), but some programs are more complex (e.g. MS Excel, Google Sheets) and specialized programs require training (e.g. Qualtrics, Quirkos, MAXQDA).

The Coding Cycle

After we have made some decisions on how to approach the coding, we can begin the coding cycle. The coding cycle is a repeating cycle of three phases: the coding phase, writing memos phase, and reviewing/revising/refining codes phase.

In the coding phase, the researcher will start to assign codes to the text based on their research question and their other preparation decisions (i.e. inductive vs. deductive research, coding families, efficiency). After a researcher codes a section of text, the researcher writes a memo —a short journal entry to document processes, important research notes, and possible theories. After writing their memos, the researcher will review their codes, both old and new, and decide if there should be any revisions to the codes.

As an example of this entire process, we can use the SoundCloud interview from earlier in this chapter to look at each phase of the cycle in more detail. As our preparation, we will be using the data to answer the question “How are Technology and Equity related within the Music Industry?” We will use an inductive research approach and apply open-coding processes with no a priori codes. We have selected the following coding families as a guide: Settings Codes, Definition of Situations Codes, Perspectives Held By Subjects Codes, Activity Codes, Strategy Codes, and Relationship Codes. The codes will be documented using Microsoft Word’s comment feature.

Coding Phase

As previously described, the interview text shown here was assigned codes using the comment feature on Microsoft Word during the reading of the text.  Since some text applied to more than one code, the same section of text was highlighted several times and assigned a different code each time.  At the end of the coding period, the text looked like this:

Memo Writing Phase

When the researcher stops coding, they must document their experience in a memo. The memos are short journal entries that document the researcher’s interaction with the data. Since the memos are written for the researcher’s personal use, formal writing (e.g. full sentences, punctuation, paragraphs) is not required. Instead, the researcher will use simple bullets or short phrases to record the experiences quickly.

In our example, we used the benefits of technology to convert the highlighted text with the comments into a table and sort the codes alphabetically.  Most of the codes seemed to be within the Relationship Code Family (i.e. creator, listener, fan, community).  A note was made that the researcher wondered if a larger community is related to more opportunities which could result in more equity.

Reviewing/Revising/Refining Code Phase

The number of times a code appears does not necessarily imply the code is (or is not) important and it is certainly not the only indicator. In qualitative research, the quality of the final research report will be directly linked to the quality of the codes (Saldaña, 2003). While reviewing, revising, and/or refining the codes during the coding cycle, the researcher is making decisions on the importance a particular code may have to their research question.  After the researcher reviews their codes carefully, they may decide to refine their codes by merging two (or more) codes into one code, dividing one code into two (or more) codes, renaming a code, or eliminating a code completely. When a researcher decides to merge, unmerge, rename codes, and/or eliminate codes, a note should be added to the memos. In our example, we added a note to the memo that we decided to merge both the “listener/fan” codes and the “solution/simplify” codes with instructions to only use “Listener” and “Solution” in future coding sessions.
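
The merge decision from our example can be sketched in Python as follows. The `refine_codes` helper and the sample segments are hypothetical; the two merges (fan into listener, simplify into solution) are the ones described above, and each one is also written to the memo.

```python
# Merges decided during review: old code -> code to use from now on.
MERGES = {"fan": "listener", "simplify": "solution"}

def refine_codes(coded_segments, merges):
    """Apply merges to each segment's codes and return (refined codes, memo notes)."""
    memo = [f"merged '{old}' into '{new}'" for old, new in merges.items()]
    refined = []
    for codes in coded_segments:
        # Replace merged codes, then drop duplicates while keeping the order.
        refined.append(list(dict.fromkeys(merges.get(code, code) for code in codes)))
    return refined, memo

# Hypothetical codes attached to three sections of text.
before = [["fan", "creator"], ["listener", "fan"], ["simplify"]]
after, memo_notes = refine_codes(before, MERGES)
```

Recording the merge in the memo at the same moment it is applied keeps the later coding sessions consistent with the decision.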

Another good practice in the review/revise/refine phase is to begin to conceptualize the codes. The conceptualization of a code is the creation of a definition of the code in specific, concrete terms (Saunders et al., 2018). Future use of a conceptualized code is more intentional because the characteristics of the code are clear and, thus, codes are more easily identified in raw data. Any code conceptualizations or working definitions should always be written in the memo. In our example, code definitions were written in italics next to the most frequently used codes.

At this point, the researcher has completed the full cycle one time. The researcher will start the cycle again and continue the cycle with new sections of text. Each time a new section of text is coded, the researcher will follow the same three phases: coding, writing a memo to document the progress, and reviewing/revising/refining their codes.

The coding cycle continues over and over until the researcher believes saturation has occurred. Saturation is the point in the research study where no new information seems to emerge from the coding cycle and there appears to be enough information to answer the research question (Strauss & Corbin, 1997).  There is no one event or threshold that can alert a researcher when saturation has occurred. When interviews seem predictable, the same codes are being used over and over, and possible answers to the research question seem reasonable, the researcher should decide if further iterations of the cycle will result in any valuable insights or the coding cycle should end.
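
Since no single threshold signals saturation, the Python sketch below only tracks one of the clues mentioned above: how many previously unseen codes each coding session produces. The session data and the `new_codes_per_session` helper are hypothetical, and a run of zeros is an input to the researcher's judgment, not a stopping rule.

```python
def new_codes_per_session(sessions):
    """Count, for each session in order, the codes that had not appeared before."""
    seen = set()
    counts = []
    for codes in sessions:
        fresh = set(codes) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts

# Hypothetical codes used in four consecutive coding sessions.
sessions = [
    ["creator", "listener", "accessible"],
    ["listener", "community", "accessible"],
    ["creator", "community"],
    ["listener", "accessible"],
]
trend = new_codes_per_session(sessions)
```

In this made-up case the last two sessions add no new codes, which matches the "same codes are being used over and over" clue; whether the research question can be answered still has to be decided by the researcher.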

Chapter Summary

In this chapter, we learned that the coding cycle is the link between raw data and the theories that will answer the research question. We engaged in a mock coding cycle example to experience how a researcher begins to organize their raw data, records their initial reactions to the data, and documents emergent ideas. The key take-aways from the chapter are:

  • Coding is the organizational process where researchers assign codes to raw qualitative data in order to answer their research question.
  • Coding is a reactive and iterative process between the data and the researcher.
  • The research question is the principal guide in preparing to code and selecting codes.
  • Coding can be used for either deductive research or inductive research studies. However, it is more commonly used for inductive research studies.
  • The coding cycle is comprised of coding, memo writing, and reviewing/revising/refining codes.
  • The coding cycle ends when no new information appears to emerge from the raw data. This phenomenon is referred to as saturation.

  • How is coding data different than summarizing data?
  • How are the research question and the coding process related?
  • “Coding is reactive and iterative.”  In your own words, explain the meaning of this statement.
  • What are the advantages of writing memos at the end of each coding phase?
  • What are some clues that may suggest saturation has occurred? Give an example.

In your opinion, how does coding help the researcher understand their raw data?

Code – A word or short phrase that assigns an attribute (e.g. translation, feeling, category, summary, idea) to a section of text.

Coding – The act of assigning a code to a section of raw data text for interpretive purposes to gain meaning.

Coding Cycle – A repeating cycle in which the researcher organizes their data. The coding cycle is comprised of three phases: the coding phase, writing memos phase, and reviewing/revising/refining codes phase.

Coding Families – General categories that suggest different ways in which coding can be accomplished.

Conceptualization – Descriptions or definitions of abstract ideas in specific, concrete terms.

Deductive Research – A research approach conducted to further an already existing theory. (The data is used to prove an existing theory).

Inductive Research – A research approach conducted to develop a theory based on what exists in the data. (The theory emerges from the data).

Memo – A short journal entry to document processes, important research notes, and possible theories.

Open-Coding – The assigning of codes to the raw data without any previously selected codes.

A priori code – A code selected before the start of the coding cycle.

Saturation – The point in the research study where no new information seems to emerge from the coding cycle and there appears to be enough information to answer the research question.

Bogdan, R., & Biklen, S. K. (1997). Qualitative research for education. Boston, MA: Allyn & Bacon.

Charmaz, K., & Mitchell, R. G. (2014). Grounded theory in ethnography. In Atkinson, P., Coffey, A., Delamont, S., Lofland, J., & Lofland, L. (Eds.), Handbook of ethnography (pp. 160–174). Thousand Oaks, CA: Sage Publications.

Soundcloud. (2010, August 19). Creative Commons. https://creativecommons.org/2010/08/19/soundcloud/

Creswell, J. W., & Poth, C. N. (2016). Qualitative inquiry and research design: Choosing among five approaches. Thousand Oaks, CA: Sage Publications.

Glaser, B. G., & Strauss, A. (1967). The discovery of grounded theory: Strategies for qualitative research. New York: Aldine Publishing Co.

Saldaña, J. (2015). The coding manual for qualitative researchers. Thousand Oaks, CA: Sage Publications.

Saunders, B., Sim, J., Kingstone, T., Baker, S., Waterfield, J., Bartlam, B., Jinks, C. (2018). Saturation in qualitative research: exploring its conceptualization and operationalization. Quality & Quantity, 52(4), 1893–1907.

Strauss, A., & Corbin, J. M. (1997). Grounded theory in practice. Thousand Oaks, CA: Sage Publications.

Attribution-NonCommercial-ShareAlike 4.0 International

This entry is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.

Learn How To Code Open-Ended Survey Questions

When harnessed correctly, survey results can lead to big improvements, both for your customers and for your bottom line.

A fundamental step in the survey process is analyzing the open-ended responses from your customers or employees.

If you don't, you stand to miss out on a lot of useful information.

Coding your survey results is therefore key.

Coding, or sorting your survey responses into categories, allows you to organize your text data in a logical way.

This makes it easier to draw out insights that can inform important business decisions.

In this guide we'll go through what survey coding is and show you how to code a questionnaire. 

Finally, we'll reveal how you can visualize your qualitative results.

Feel free to jump ahead to the section you're most interested in:

  • What is Survey Coding?
  • Survey Coding Approaches -- Which Is Best?
  • 4 Survey Coding Tips to Keep in Mind
  • From Survey Coding to Data Visualization
  • Auto-Code Your Surveys the Easy Way

Firstly, What is Survey Coding?

Survey coding is where you review all of your open-ended, qualitative responses, identify themes or commonalities, and then sort them into categories or groups using tags.

Coding your survey data is really a form of analysis in itself. But it's just the starting point of your analysis journey.

The end goal of coding your survey results is to have organized information which you can then further mine for important insights that tell you more about your customer or target audience. 

Survey Coding Approaches -- Which is Best?

There are two main ways you can code survey responses. These methods are manual coding which you do alone and automated coding where you enlist the help of AI. Here we'll look into how both methods work, the pros, and then the cons.

Before we begin, it's worth noting that you don't have to choose one or the other. In many cases the two can be combined. For example you can automate your coding but still do manual quality control checks as the AI works, or once it's finished.

It's also important to know that automated coding is really only an option if you have around 10,000 or more responses. This is because automated analysis tools need a certain number of text samples to learn from.

Whichever way you choose, you'll find some great survey analysis tools out there to help you.

Manual Survey Coding 

Manual coding in its purest form is a process which involves a human reading through every open-ended survey response and deciding which category it falls into according to its content. That human would then assign a tag to the response and it would be officially coded.

This type of coding has been around for a long time in qualitative research fields. There are a number of qualitative analysis tools like NVivo and MAXQDA that you can use to manually code. You can also use good old Excel to analyze your survey data.

Within manual coding there are two avenues you can follow:  

  • Inductive coding
  • Deductive coding

Inductive coding is where you begin with a blank slate and derive all of your codes from the data as you explore it.

With deductive coding however, you begin with some guidelines or a "codebook" already in place to steer the coding process. This codebook or annotation handbook is normally established from an initial review of the data or from your research requirements.

Manual coding is still a method favored by many academics, particularly those who work with smaller data sets and like to read through all of their data and responses.

Pros of manually coding surveys

With manual coding you are always close to your data and can read every last word if you want to. This means that you potentially won't miss anything, and can perceive any nuances in the responses. 

Cons of manually coding surveys

If you have large amounts of survey responses it's impossible to manually analyze everything.

Even if your data set is a size where you can code it manually, or if you have a team to help, manual coding can be a repetitive, tedious task. It also may not be the best use of your researchers' or data experts' time.

If you try to code all of your data manually, you run the risk of missing things --- we are human after all.

As a human you also bring your own opinions and feelings to the table which can result in subjective conclusions. All this leads to inaccuracies in your data, which means your insights may be skewed.

Automated Survey Coding

Automated coding, as the name suggests, is when you make the tagging and grouping of your survey results automatic. This is done with the help of Natural Language Processing (NLP) and machine learning algorithms.

Automated coding is an excellent approach if you have a lot of survey data and you want to code it quickly and effectively.

Open-source libraries can be useful if you want to create your own program to tag and code your data. You can also use ready-to-go tools like MonkeyLearn.

MonkeyLearn offers no-code text analysis templates that can run your survey data through various text analysis techniques like sentiment analysis, topic analysis, and keyword extraction.

How does MonkeyLearn work?

All you have to do is upload your data and wait for MonkeyLearn to categorize or code your survey data in ways that make sense for your goals. You can define these categories alongside MonkeyLearn's data science team.

Want to learn more about automatically coding your survey responses? Book a demo with MonkeyLearn.

Pros of automatically coding surveys

Automatic coding is often the only practical way to go if you have a lot of data. It applies the same rules to every response, which limits the inconsistency and subjectivity that individual human coders bring. It's also a fast process, and seeing as time often equals money, this is an indispensable quality.

Cons of automatically coding surveys

You are one step removed from your data with this kind of processing. While it can be argued that automatic processing helps you get to the information that matters more efficiently, some people are more comfortable reading through every response. 

4 Survey Coding Tips to Keep in Mind

Regardless of how you choose to code your survey results, there are some principles that you should stick to.

Here we'll go through 4 tips for survey coding:

1. Don't make assumptions

You should always start your analysis with an open mind. If you go into it with a subjective opinion of how the results should turn out, you might miss something, or add your own bias to the results and insights.

You need to make sure that your categories are defined based on actual data rather than your assumptions about the data.

Because your assumptions might just be wrong! Take a look at this client testimonial:

2. Define categories that can be used consistently

As you work through tagging and coding, it's best to maintain well-defined categories with clear guidelines or an annotation handbook.

This means making sure that there are no overlapping concepts between them. Each category should be unique and distinct.

This is important because if your tagging is inconsistent or if you have tags that are similar but not the same, your human annotators or automated tagging tools will get confused. And this will affect the accuracy of your insights.

Here's an example of how you might define your categories in an annotation handbook:

Annotation handbook tagging examples using a hierarchical tagging method: first-level tags and second-level.
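
If you keep your annotation handbook in a machine-readable form, a small check can catch overlapping tag names early. The Python sketch below is one possible layout, assuming a two-level hierarchy; the tags, definitions, and the `repeated_tags` helper are made up for illustration.

```python
from collections import Counter

# Hypothetical handbook: first-level tags map to second-level tags,
# and each second-level tag has a one-line definition.
HANDBOOK = {
    "Pricing": {
        "Too expensive": "Respondent says the price is higher than they want to pay.",
        "Good value": "Respondent says the price is fair for what they get.",
    },
    "Support": {
        "Slow response": "Respondent waited too long for help.",
        "Helpful staff": "Respondent praises the people who helped them.",
    },
}

def repeated_tags(handbook):
    """Return second-level tag names that appear more than once, ignoring case."""
    names = Counter(
        tag.strip().lower()
        for second_level in handbook.values()
        for tag in second_level
    )
    return sorted(name for name, count in names.items() if count > 1)

problems = repeated_tags(HANDBOOK)
```

A name check like this only finds identical labels. Tags that are worded differently but mean the same thing still need a human read-through of the definitions.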

3. Keep categories to a minimum

You want to avoid spreading yourself too thin because this will dilute your insights and make it harder to draw conclusions.

Therefore you should aim for a maximum of 10-15 categories. If there is a category that seems too small or is too niche you can remove it or see if it could fit into an existing category.

4. It's an iterative process

Your tagging work will evolve over time and new tags will probably appear as your data grows. Your codes will probably be broader at the start, then as you go through your data, you will be able to tighten up the categories and add new categories over time. 

From Survey Coding to Data Visualization

Once your survey data is coded and ready for further analysis, you need to think about how you are going to visualize this information. Visualizing your data allows you to spot trends, patterns, or any outliers.

Strong data visualizations make it easier to come to conclusions and inform important business decisions. They're also more practical and accessible than raw data or text reports when it comes to sharing with team members and stakeholders.

There are a number of data visualization tools that can help you. A popular, accessible choice is Microsoft Excel. With Excel you can easily plot your coded survey results as bar charts, pie charts, scatter plots, and more by choosing from the Insert menu.

Here's an example of a visualization in Excel:

An Excel bar chart plotting survey results.
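
If you just want a quick look at your coded results before opening a charting tool, a plain-text bar chart is enough. The Python sketch below is an alternative to the Excel chart, not a description of it; the tag counts and the `text_bar_chart` helper are hypothetical.

```python
def text_bar_chart(counts, width=20):
    """Render tag counts as text bars, largest first, scaled to `width` characters."""
    if not counts or max(counts.values()) <= 0:
        return ""
    largest = max(counts.values())
    lines = []
    for label, n in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        bar = "#" * round(n / largest * width)
        lines.append(f"{label:<12} {bar} {n}")
    return "\n".join(lines)

# Hypothetical number of responses per tag.
tag_counts = {"pricing": 10, "support": 5, "usability": 2}
chart = text_bar_chart(tag_counts)
print(chart)
```

Sorting by count puts the most frequent tags at the top, which is usually the first thing you want to see.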

MonkeyLearn also offers in-app data visualization capabilities. Once you've uploaded your survey data to one of MonkeyLearn's plug-and-play templates, you'll automatically receive the results in a dashboard.

Let's use the NPS analysis template as an example.

Learn more about NPS analysis, or continue scrolling to see MonkeyLearn's NPS analysis dashboard.

Just upload your data, then MonkeyLearn will analyze it and deliver a dashboard that you can filter by sentiment, topic, keyword, NPS category and much more.

Ta-da! Here's a snapshot of MonkeyLearn's dashboard:

A MonkeyLearn NPS survey analysis dashboard.

Want to see how you can create your own dashboard? Sign up for a demo today.

Auto-Code Your Surveys the Easy Way

Coding is an essential piece of the data analysis puzzle. Without this step, your analysis would be chaotic and difficult to manage, and useful insights could be lost.

What you will need to decide, however, is whether to go down the manual or the automated route. This decision will depend largely on the amount of data you are working with.

If you have a large amount of data, chances are your only real option will be the automated route in order to maximize both your survey results and your time.

Tools like MonkeyLearn can help you make this process easier and let you focus on the insights you get out of your analysis, rather than the nuts and bolts behind it. Sign up for your free demo today to see how you can streamline and automate your survey coding process.


Rachel Wolff

February 7th, 2022


Writing Strong Research Questions | Criteria & Examples

Published on October 26, 2022 by Shona McCombes. Revised on November 21, 2023.

A research question pinpoints exactly what you want to find out in your work. A good research question is essential to guide your research paper, dissertation, or thesis.

All research questions should be:

  • Focused on a single problem or issue
  • Researchable using primary and/or secondary sources
  • Feasible to answer within the timeframe and practical constraints
  • Specific enough to answer thoroughly
  • Complex enough to develop the answer over the space of a paper or thesis
  • Relevant to your field of study and/or society more broadly


Table of contents

  • How to write a research question
  • What makes a strong research question
  • Using sub-questions to strengthen your main research question
  • Research questions quiz
  • Other interesting articles
  • Frequently asked questions about research questions

You can follow these steps to develop a strong research question:

  • Choose your topic
  • Do some preliminary reading about the current state of the field
  • Narrow your focus to a specific niche
  • Identify the research problem that you will address

The way you frame your question depends on what your research aims to achieve. The table below shows some examples of how you might formulate questions for different purposes.

Using your research problem to develop your research question

Note that while most research questions can be answered with various types of research, the way you frame your question should help determine your choices.


Research questions anchor your whole project, so it’s important to spend some time refining them. The criteria below can help you evaluate the strength of your research question.

  • Focused and researchable
  • Feasible and specific
  • Complex and arguable
  • Relevant and original

Chances are that your main research question can’t be answered all at once. That’s why sub-questions are important: they allow you to answer your main question in a step-by-step manner.

Good sub-questions should be:

  • Less complex than the main question
  • Focused only on 1 type of research
  • Presented in a logical order

Here are a few examples of descriptive and framing questions:

  • Descriptive: According to current government arguments, how should a European bank tax be implemented?
  • Descriptive: Which countries have a bank tax/levy on financial transactions?
  • Framing: How should a bank tax/levy on financial transactions look at a European level?

Keep in mind that sub-questions are by no means mandatory. They should only be asked if you need the findings to answer your main question. If your main question is simple enough to stand on its own, it’s okay to skip the sub-question part. As a rule of thumb, the more complex your subject, the more sub-questions you’ll need.

Try to limit yourself to 4 or 5 sub-questions, maximum. If you feel you need more than this, it may be an indication that your main research question is not sufficiently specific. In this case, it’s better to revisit your problem statement and try to tighten up your main question.


If you want to know more about the research process, methodology, research bias, or statistics, make sure to check out some of our other articles with explanations and examples.

Methodology

  • Sampling methods
  • Simple random sampling
  • Stratified sampling
  • Cluster sampling
  • Likert scales
  • Reproducibility

Statistics

  • Null hypothesis
  • Statistical power
  • Probability distribution
  • Effect size
  • Poisson distribution

Research bias

  • Optimism bias
  • Cognitive bias
  • Implicit bias
  • Hawthorne effect
  • Anchoring bias
  • Explicit bias

The way you present your research problem in your introduction varies depending on the nature of your research paper. A research paper that presents a sustained argument will usually encapsulate this argument in a thesis statement.

A research paper designed to present the results of empirical research tends to present a research question that it seeks to answer. It may also include a hypothesis, a prediction that will be confirmed or disproved by your research.

As you cannot possibly read every source related to your topic, it’s important to evaluate sources to assess their relevance. Use preliminary evaluation to determine whether a source is worth examining in more depth.

This involves:

  • Reading abstracts, prefaces, introductions, and conclusions
  • Looking at the table of contents to determine the scope of the work
  • Consulting the index for key terms or the names of important scholars

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (“x affects y because …”).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study, the statistical hypotheses correspond logically to the research hypothesis.


Formulating a main research question can be a difficult task. Overall, your question should contribute to solving the problem that you have defined in your problem statement.

However, it should also fulfill criteria in three main areas:

  • Researchability
  • Feasibility and specificity
  • Relevance and originality

Cite this Scribbr article


McCombes, S. (2023, November 21). Writing Strong Research Questions | Criteria & Examples. Scribbr. Retrieved April 9, 2024, from https://www.scribbr.com/research-process/research-questions/


April 5, 2024

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility:

fact-checked

trusted source

AI medical coding research adds to big picture

by West Virginia University


Much like a game of connect the dots, Megan McDougal's academic and professional careers share points that have come together to form one big picture.

An associate professor in health informatics and information management in the West Virginia University School of Medicine, McDougal has recently added research to her itinerary. She never thought either would be part of her future.

"It's quite interesting how things line up and you don't realize how each experience you have in your life really helps prepare for the next big thing," she said.

She once aspired to become an X-ray technician. While she liked the medical side of her studies, patient care just didn't fit with what she wanted to do. She gave psychology a try, then moved on to elementary education. Both had that something she was looking for, but neither fit the bill. Then along came health information technology, a job in medical records at J.W. Ruby Memorial Hospital and a degree in allied health administration. It seemed she'd found her place. That was until a supervisor recognized her underlying potential and added student preceptor to her job duties.

"One aspect was working with students who were just like me previously," she explained. "They needed internships and practical experience to graduate and to see if those careers were right for them. That sparked my love of educating others again."

Still, she felt her own education was incomplete. She went on to earn a master's degree in health information management with no concrete plans for what came next.

"You never know what will happen in life. It's so weird how everything falls into place."

That's exactly what happened.

When a position for an online adjunct instructor in health information technology came open at another university, McDougal decided to make a move. It wasn't long after that WVU began to build a new bachelor of science in health informatics and information management program, the only one of its kind in West Virginia. The new HIIM program director, Sally Lucci, heard of McDougal's background mentoring students at the hospital and her recent adjunct experience, and connected with her to gauge her interest in working with the program. Her path led her back to WVU, first as an adjunct and then as full-time faculty in 2017.

"My favorite thing about the field is it's never the same day-to-day and constant new things emerge to challenge you. It's good to remain curious and continue to keep updated on changes and how you can make an impact."

Next came the research project, something else McDougal said she "stumbled upon" when a group of WVU physicians began wondering what future role artificial intelligence will play in medical coding.

As a result, McDougal, another HIIM faculty member, Ashley Simmons, and WVU Medicine colleagues Drs. Brian Dilchner, Jami Pincavitch, Ankit Sakhuja and Lukas Meadows set out to determine how effective AI can be for the medical coding workflow.

"Medical coding is an important aspect of the U.S. health care system, and currently, computerized assistive coding technology—CAC—is used to assist medical coding professionals with their workflow," she said. "Some studies have looked into powering CAC with use of artificial intelligence , which potentially opens up new opportunities for the coding workflow."

Using 50 fictional patient records, the team divided duties. A certified coder on the team coded the clinical notes and kept track of the time she put in and resulting diagnoses. The others fed the same clinical notes into various open source AI large language models following specific prompts, and also kept track of the time spent and resulting diagnoses.

Although all the LLM platforms extracted codes quicker, there seemed to be minimal agreement between the human coder and the system.
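The article does not say how the team measured agreement. As one plausible way to quantify it, the sketch below compares the set of diagnosis codes a human coder assigned to each record with the set a model returned, reporting the exact-match rate and the mean Jaccard overlap. The function names, record IDs, and code assignments are all invented for illustration and are not taken from the study.

```python
def jaccard(a, b):
    """Overlap between two code sets: |A & B| / |A | B| (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def agreement_summary(human, model):
    """Compare per-record code sets keyed by record ID."""
    shared = sorted(human.keys() & model.keys())
    if not shared:
        raise ValueError("no records in common")
    exact = sum(set(human[r]) == set(model[r]) for r in shared)
    mean_jaccard = sum(jaccard(human[r], model[r]) for r in shared) / len(shared)
    return {
        "records": len(shared),
        "exact_match_rate": exact / len(shared),
        "mean_jaccard": mean_jaccard,
    }

# Hypothetical records and code assignments, for illustration only.
human_codes = {"rec1": ["E11.9", "I10"], "rec2": ["J45.909"], "rec3": ["N18.9"]}
model_codes = {"rec1": ["E11.9"], "rec2": ["J45.909"], "rec3": ["I10"]}
summary = agreement_summary(human_codes, model_codes)
print(summary)
```

Exact match is strict, since one missing code counts as a miss for the whole record, while Jaccard overlap gives partial credit; reporting both makes it easier to see whether a model is close or simply wrong.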

"While the study is still underway, what we are finding from the data we have collected so far is that the LLMs have limited performance in their ability to accurately abstract ICD-10 diagnostic codes from the clinical notes provided," McDougal said. "The actual performance of the LLMs doing the specialized tasks asked have been very poor."

McDougal emphasized the purpose of the study wasn't to see if AI could replace humans, but that did spark the idea.

"You're still going to need a human to make sure you get the right diagnostic code and make sure you are compliant with regulations."

The team is considering experimenting with other LLMs to make comparisons in efficiency.


medRxiv

De novo variants in the non-coding spliceosomal snRNA gene RNU4-2 are a frequent cause of syndromic neurodevelopmental disorders

  • ORCID record for Vijay S Ganesh
  • ORCID record for David R Adams
  • ORCID record for Seth I Berger
  • ORCID record for Jonathan A Bernstein
  • ORCID record for Lindsay C Burrage
  • ORCID record for Alison G Compton
  • ORCID record for Chloe A Cunningham
  • ORCID record for Precilla D'Souza
  • ORCID record for Kimberly Ezell
  • ORCID record for Jamie L Fraser
  • ORCID record for Lyndon Gallacher
  • ORCID record for Christina L Grant
  • ORCID record for Seema R Lalani
  • ORCID record for Elsa Leitão
  • ORCID record for Anna Le Fevre
  • ORCID record for Richard J Leventer
  • ORCID record for Paul J Lockhart
  • ORCID record for Alan S Ma
  • ORCID record for Ellen F Macnamara
  • ORCID record for Taylor M Maurer
  • ORCID record for Hector R Mendez
  • ORCID record for Stephen B Montgomery
  • ORCID record for Elizabeth E Palmer
  • ORCID record for Heidi L Rehm
  • ORCID record for Chloe M Reuter
  • ORCID record for Rocio Rius
  • ORCID record for Jill A Rosenfeld
  • ORCID record for Cas Simons
  • ORCID record for Zornitza Stark
  • ORCID record for Tiong Yang Tan
  • ORCID record for Natalie B Tan
  • ORCID record for David R Thorburn
  • ORCID record for Cynthia J Tifft
  • ORCID record for Eloise Uebergang
  • ORCID record for Grace E VanNoy
  • ORCID record for Eric Vilain
  • ORCID record for Matthew T Wheeler
  • ORCID record for Susan M White
  • ORCID record for Monica Wojcik
  • ORCID record for Changrui Xiao
  • ORCID record for David Zocche
  • ORCID record for Christel Depienne
  • ORCID record for Joanna MM Howson
  • ORCID record for Stephan J Sanders
  • ORCID record for Anne O'Donnell-Luria
  • ORCID record for Nicola Whiffin

Around 60% of individuals with neurodevelopmental disorders (NDD) remain undiagnosed after comprehensive genetic testing, primarily of protein-coding genes [1]. Increasingly, large genome-sequenced cohorts are improving our ability to discover new diagnoses in the non-coding genome. Here, we identify the non-coding RNA RNU4-2 as a novel syndromic NDD gene. RNU4-2 encodes the U4 small nuclear RNA (snRNA), which is a critical component of the U4/U6.U5 tri-snRNP complex of the major spliceosome [2]. We identify an 18 bp region of RNU4-2 mapping to two structural elements in the U4/U6 snRNA duplex (the T-loop and Stem III) that is severely depleted of variation in the general population, but in which we identify heterozygous variants in 119 individuals with NDD. The vast majority of individuals (77.3%) have the same highly recurrent single base-pair insertion (n.64_65insT). We estimate that variants in this region explain 0.41% of individuals with NDD. We demonstrate that RNU4-2 is highly expressed in the developing human brain, in contrast to its contiguous counterpart RNU4-1 and other U4 homologs, supporting RNU4-2's role as the primary U4 transcript in the brain. Overall, this work underscores the importance of non-coding genes in rare disorders. It will provide a diagnosis to thousands of individuals with NDD worldwide and pave the way for the development of effective treatments for these individuals.

Competing Interest Statement

NW receives research funding from Novo Nordisk and has consulted for ArgoBio studio. SJS receives research funding from BioMarin Pharmaceutical. AODL is on the scientific advisory board for Congenica, was a paid consultant for Tome Biosciences and Ono Pharma USA Inc., and received reagents from PacBio to support rare disease research. HLR has received support from Illumina and Microsoft to support rare disease gene discovery and diagnosis. MHW has consulted for Illumina and Sanofi and received speaking honoraria from Illumina and GeneDx. SBM is an advisor for BioMarin, Myome and Tenaya Therapeutics. SMS has received honoraria for educational events or advisory boards from Angelini Pharma, Biocodex, Eisai, Zogenix/UCB and institutional contributions for advisory boards, educational events or consultancy work from Eisai, Jazz/GW Pharma, Stoke Therapeutics, Takeda, UCB and Zogenix. The Department of Molecular and Human Genetics at Baylor College of Medicine receives revenue from clinical genetic testing completed at Baylor Genetics Laboratories. JMMH is a full-time employee of Novo Nordisk and holds shares in Novo Nordisk A/S. DGM is a paid consultant for GlaxoSmithKline, Insitro, and Overtone Therapeutics and receives research support from Microsoft.

Funding Statement

YC is supported by a studentship from Novo Nordisk. NW is supported by a Sir Henry Dale Fellowship jointly funded by the Wellcome Trust and the Royal Society (220134/Z/20/Z). GL was supported by the Fonds de recherche en santé du Québec (FRQS), VSG by NIAMS K23AR083505, and SLS by a fellowship from Manton Center for Orphan Disease Research at Boston Children's Hospital. The research was supported by grant funding from Novo Nordisk and the Rosetrees Trust (PGL19-2/10025 to NW), the Simons Foundation Autism Research Initiative (SFARI #736613, SJS), the NIMH (R01 MH129751 to SJS), the HDR-UK Molecules to Health Records Driver Programme (SJS), the Australian National Health and Medical Research Council (1164479, 1155244, GNT2001513), the Australian NHMRC Centre for Research Excellence in Neurocognitive Disorders (NHMRC-RG172296), the Australian Medical Research Future Fund (MRF2007677, GHFM76747), NHGRI (U01HG011762, U01HG011745, U24HG011746, UM1HG008900, U01HG011755, R21HG012397, and R01HG009141), NINDS (U01NS134358, U01NS106845, U54NS115052, 1U24NS131172), the Chan Zuckerberg Initiative Donor-Advised Fund at the Silicon Valley Community Foundation (funder DOI 10.13039/100014989) grants 2019-199278, 2020-224274, (https://doi.org/10.37921/236582yuakxy), the US Department of Defense Congressionally Directed Medical Research Programs (PR170396), the National Institute of Neurological Disorders and Stroke of the National Institutes of Health (U01HG007709, U01HG007942, and U01HG010217), and the Clinical Translational Core of the Baylor College of Medicine IDDRC (P50HD103555) from the Eunice Kennedy Shriver National Institute of Child Health and Human Development. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Eunice Kennedy Shriver National Institute of Child Health and Human Development or the National Institutes of Health.
The Rare Disease Flagship acknowledges financial support from the Royal Children's Hospital Foundation, the Murdoch Children's Research Institute and the Harbig Foundation. Massimo's Mission acknowledges funding support from the Australian Government Department of Health and Aged Care (EPCD000034). Sequencing and analysis were supported by the Deutsche Forschungsgemeinschaft (DFG) Research Infrastructure West German Genome Center (project 407493903) as part of the Next Generation Sequencing Competence Network (project 423957469). Short-read genome sequencing was carried out at the production site Cologne (Cologne Center for Genomics; CCG). CD received the DFG 458099954 as part of the DFG Sequencing call #3. SMS is supported by the Epilepsy Society.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The 100,000 Genomes Project Protocol has ethical approval from the HRA Committee East of England Cambridge South (REC Ref 14/EE/1112). This study was registered with Genomics England under Research Registry Projects 354. Clinical data were collected from research participants after obtaining written informed consent from the parents or legal guardians, with the study approved by the local regulatory authority.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Data Availability

All data produced in the present work are contained in the manuscript.


COMMENTS

  1. Qualitative Data Coding 101 (With Examples)

    Deductive coding 101. With deductive coding, we make use of pre-established codes, which are developed before you interact with the present data. This usually involves drawing up a set of codes based on a research question or previous research. You could also use a code set from the codebook of a previous study.

  2. Coding Qualitative Data: How to Code Qualitative Research

    You can automate the coding of your qualitative data with thematic analysis software. Thematic analysis and qualitative data analysis software use machine learning, artificial intelligence (AI), and natural language processing (NLP) to code your qualitative data and break text up into themes. Thematic analysis software is autonomous, which ...

  3. Coding

    Planning your coding strategy. Coding is a qualitative data analysis strategy in which some aspect of the data is assigned a descriptive label that allows the researcher to identify related content across the data. How you decide to code, or whether to code, your data should be driven by your methodology.

  4. Chapter 18. Data Analysis and Coding

    The Coding Manual for Qualitative Researchers. 2nd ed. Thousand Oaks, CA: SAGE. The most complete and comprehensive compendium of coding techniques out there. Essential reference. Silver, Christina. 2014. Using Software in Qualitative Research: A Step-by-Step Guide. 2nd ed. Thousand Oaks, CA; SAGE. If you are unsure which CAQDAS program you are ...

  5. A Guide to Coding Qualitative Research Data

    The primary goal of coding qualitative data is to change data into a consistent format in support of research and reporting. A code can be a phrase or a word that depicts an idea or recurring theme in the data. The code's label must be intuitive and encapsulate the essence of the researcher's observations or participants' responses.

  6. PDF Introduction to Qualitative Research Coding

    Deductive Coding: Codes emerge from your research question and/or the literature review. Inductive Coding: Codes emerge through engagement with your ... These questions will be more specific than the research questions that motivate your study, and will focus on your actual data. 2. Identify Codes and Attributes Associated with the Specific Questions

  7. Coding qualitative data: a synthesis guiding the novice

    Having pooled our experience in coding qualitative material and teaching students how to code, in this paper we synthesize the extensive literature on coding in the form of a hands-on review ...

  8. Qualitative Coding

    In structural coding, codes indicate which specific research question, part of a research question, or hypothesis is being addressed by a particular segment of text. This may be most useful as part of rough coding to help researchers ensure that their data addresses the questions and foci central to their project.

  9. (PDF) Qualitative Data Coding

    Workshop: Qualitative Data Coding. Abstract. In the quest to address a research problem, meeting the purpose of the study, and answering qualitative research question(s), we actively look ...

  10. Coding and Analysis Strategies

    Abstract. This chapter provides an overview of selected qualitative data analytic strategies with a particular focus on codes and coding. Preparatory strategies for a qualitative research study and data management are first outlined. Six coding methods are then profiled using comparable interview data: process coding, in vivo coding ...

  11. PDF Tips & Tools #18: Coding Qualitative Data

    Coding Qualitative Data. This tip sheet provides an overview of the process of coding qualitative data, which is an ... list of research questions, problem areas, etc. Your prior knowledge of the subject matter and your subject expertise will also help you create these codes. For instance, if you are interviewing MUH owners and managers, you ...

  12. Essential Guide to Coding Qualitative Data

    The process of coding qualitative data varies widely depending on the objective of your research. But in general, it involves a process of reading through your data, applying codes to excerpts, conducting various rounds of coding, grouping codes according to themes, and then making interpretations that lead to your ultimate research findings.

  13. Chapter 19. Advanced Codes and Coding

    Figure 19.1. A Walk Through the Forest as Model of Analytical Coding. There is no single correct way to go about coding your data. When I first began teaching qualitative research methods, I resolutely refused to "teach" coding, as I thought it was a little like trying to teach people to write fiction.

  14. How to Craft Your Own Qualitative Research Codebook

    Creating a codebook is vital in coding qualitative data. It provides a structured, consistent framework for organizing and analyzing the data. Here are the steps to create a codebook for qualitative data: Begin by reviewing the data to identify the key concepts, themes, and patterns that emerge.

  15. Codes and Coding

    Inductive coding is a research question-based, data-driven, bottom-up process. It is not based on existing theories or concepts. Glaser and Strauss established an inductive method of qualitative research in which data collection and interpretation are carried out concurrently. Constant comparison and theoretical sampling are employed to aid in ...

  16. Analysis and Coding Example- Qualitative Data

    The following is an example of how to engage in a three step analytic process of coding, categorizing, and identifying themes within the data presented. Note that different researchers would come up with different results based on their specific research questions, literature review findings, and theoretical perspective.

  17. From Science to Programming: The Role of Coding in Research

    09/12/2023. In today's rapidly evolving research landscape, the integration of coding and programming has emerged as a powerful force, revolutionizing the way we approach scientific inquiry. With the exponential growth of data and the increasing complexity of research questions, coding has become an essential tool for researchers across a ...

  18. Coding

    Coding is the organizational process where researchers assign codes to raw qualitative data in order to answer their research question. Coding is a reactive and iterative process between the data and the researcher. The research question is the principal guide in preparing to code and selecting codes.

  19. How to Develop a Research Question for Qualitative Coding

    4 Formulate your question. The final step is to formulate your research question for qualitative coding. Your question should be clear, concise, and relevant to your topic and focus. It should ...

  20. Understanding and Identifying 'Themes' in Qualitative Case Study Research

    In the first level, that is, open coding, the researcher is required to label or code data directly from the interviews or other textual data that has been collected. If using grounded theory, you may be doing line-by-line coding (Charmaz, 2006). Line-by-line coding allows you to detect hidden patterns that chunk coding may often miss.

  21. How to Do Thematic Analysis

    When to use thematic analysis. Thematic analysis is a good approach to research where you're trying to find out something about people's views, opinions, knowledge, experiences or values from a set of qualitative data - for example, interview transcripts, social media profiles, or survey responses. Some types of research questions you might use thematic analysis to answer:

  22. Learn How To Code Open-Ended Survey Questions

    Here we'll go through 4 tips for survey coding: 1. Don't make assumptions. You should always start your analysis with an open mind. If you go into it with a subjective opinion of how the results should turn out, you might miss something, or add your own bias to the results and insights.

  23. Writing Strong Research Questions

    A good research question is essential to guide your research paper, dissertation, or thesis. All research questions should be: Focused on a single problem or issue. Researchable using primary and/or secondary sources. Feasible to answer within the timeframe and practical constraints. Specific enough to answer thoroughly.

  24. AI medical coding research adds to big picture

    AI medical coding research adds to big picture. by West Virginia University. Megan McDougal, WVU associate professor in health informatics and information management, is researching the role ...

  25. Coding activity for interviews.docx

    First step: identify key words/categories from the research question/s that can establish themes. For example: Home influences; Community influences; Building bridges. Second step: create a 'tree' of concepts that will 'develop under' each of the above key words/categories. For example: Home influences could include language ...

  26. De novo variants in the non-coding spliceosomal snRNA gene RNU4-2 are a

    Around 60% of individuals with neurodevelopmental disorders (NDD) remain undiagnosed after comprehensive genetic testing, primarily of protein-coding genes. Increasingly, large genome-sequenced cohorts are improving our ability to discover new diagnoses in the non-coding genome. Here, we identify the non-coding RNA RNU4-2 as a novel syndromic NDD gene. RNU4-2 encodes the U4 small nuclear RNA ...