7 Data Collection Methods & Tools For Research

busayo.longe

The underlying need for data collection is to capture quality evidence that answers the questions that have been posed. Through data collection, businesses or management can deduce quality information that is a prerequisite for making informed decisions.

To improve the quality of information, it is expedient that data is collected so that you can draw inferences and make informed decisions on what is considered factual.

At the end of this article, you will understand why picking the best data collection method is necessary for achieving your set objective.

Sign up on Formplus Builder to create your preferred online surveys or questionnaire for data collection. You don’t need to be tech-savvy! Start creating quality questionnaires with Formplus.

What is Data Collection?

Data collection is a methodical process of gathering and analyzing specific information to proffer solutions to relevant questions and evaluate the results. It focuses on finding out all there is to a particular subject matter. Data is collected to be further subjected to hypothesis testing which seeks to explain a phenomenon.

Hypothesis testing eliminates assumptions while making a proposition from the basis of reason.

For collectors of data, there is a range of outcomes for which the data is collected. But the key purpose for which data is collected is to put a researcher in a vantage position to make predictions about future probabilities and trends.

The core forms in which data can be collected are primary and secondary data. While the former is collected by a researcher through first-hand sources, the latter is collected by an individual other than the user. 

Types of Data Collection 

Before broaching the various types of data collection, it is pertinent to note that data collection falls under two broad categories: primary data collection and secondary data collection.

Primary Data Collection

Primary data collection is the gathering of raw data at the source. It is the process by which a researcher collects original data for a specific research purpose. It can be further divided into two segments: qualitative and quantitative data collection methods.

  • Qualitative Research Method 

Qualitative research methods of data collection do not involve numbers or data that must be deduced through mathematical calculation. Rather, they are based on non-quantifiable elements such as the feelings, opinions, or emotions of the respondents. An example of such a method is an open-ended questionnaire.

  • Quantitative Method

Quantitative methods are presented in numbers and require a mathematical calculation to deduce. An example would be the use of a questionnaire with close-ended questions to arrive at figures to be calculated mathematically. Other examples include measures such as the mean, mode, and median, and methods such as correlation and regression.

Read Also: 15 Reasons to Choose Quantitative over Qualitative Research

Secondary Data Collection

Secondary data collection, on the other hand, is referred to as the gathering of second-hand data collected by an individual who is not the original user. It is the process of collecting data that is already existing, be it already published books, journals, and/or online portals. In terms of ease, it is much less expensive and easier to collect.

Your choice between Primary data collection and secondary data collection depends on the nature, scope, and area of your research as well as its aims and objectives. 

Importance of Data Collection

There are several underlying reasons for collecting data, especially for a researcher. Here are a few:

  • Integrity of the Research

A key reason for collecting data, be it through quantitative or qualitative methods is to ensure that the integrity of the research question is indeed maintained.

  • Reduce the likelihood of errors

The correct use of appropriate data collection methods reduces the likelihood of errors in the results.

  • Decision Making

To minimize the risk of errors in decision-making, it is important that accurate data is collected so that the researcher doesn’t make uninformed decisions. 

  • Save Cost and Time

Data collection saves the researcher time and funds that would otherwise be misspent without a deeper understanding of the topic or subject matter.

  • To support a need for a new idea, change, and/or innovation

To prove the need for a change in the norm or the introduction of new information that will be widely accepted, it is important to collect data as evidence to support these claims.

What is a Data Collection Tool?

Data collection tools refer to the devices or instruments used to collect data, such as a paper questionnaire or a computer-assisted interviewing system. Case studies, checklists, interviews, surveys or questionnaires, and sometimes observation are all tools used to collect data.

It is important to decide on the tools for data collection because research is carried out in different ways and for different purposes. The objective behind data collection is to capture quality evidence that allows analysis to lead to the formulation of convincing and credible answers to the posed questions.

The Formplus online data collection tool is perfect for gathering primary data, i.e. raw data collected from the source. With our online and offline data-gathering tool, you can easily collect data using at least three data collection methods: online questionnaires, focus groups, and reporting.

In our previous articles, we’ve explained why quantitative research methods are more effective than qualitative methods. However, with the Formplus data collection tool, you can gather all types of primary data for academic, opinion or product research.

Top Data Collection Methods and Tools for Academic, Opinion, or Product Research

The following are the top 7 data collection methods for Academic, Opinion-based, or product research. Also discussed in detail are the nature, pros, and cons of each one. At the end of this segment, you will be best informed about which method best suits your research. 

  • INTERVIEW

An interview is a face-to-face conversation between two individuals with the sole purpose of collecting relevant information to satisfy a research purpose. Interviews are of three types: structured, semi-structured, and unstructured, each varying slightly from the others.

Use this interview consent form template to let an interviewee give you consent to use data obtained from your interviews for investigative research purposes.

  • Structured Interviews – Simply put, it is a verbally administered questionnaire. In terms of depth, it is surface level and is usually completed within a short period. For speed and efficiency, it is highly recommendable, but it lacks depth.
  • Semi-structured Interviews – In this method, there are several key questions which cover the scope of the areas to be explored. It allows a little more leeway for the researcher to explore the subject matter.
  • Unstructured Interviews – It is an in-depth interview that allows the researcher to collect a wide range of information with a purpose. An advantage of this method is the freedom it gives a researcher to combine structure with flexibility even though it is more time-consuming.

Pros

  • In-depth information
  • Freedom of flexibility
  • Accurate data

Cons

  • Time-consuming
  • Expensive to collect

What are The Best Data Collection Tools for Interviews? 

For collecting data through interviews, here are a few tools you can use to easily collect data.

  • Audio Recorder

An audio recorder is used for recording sound on disc, tape, or film. Audio information can meet the needs of a wide range of people, as well as provide alternatives to print data collection tools.

  • Digital Camera

An advantage of a digital camera is that the images it captures can be transmitted to a monitor screen when the need arises.

  • Camcorder

A camcorder is used for collecting data through interviews. It combines an audio recorder and a video camera. The data provided is qualitative in nature and allows the respondents to answer questions exhaustively. If you need to collect sensitive information during an interview, a camcorder might not work for you, as you would need to maintain your subject’s privacy.

Want to conduct an interview for qualitative data research or a special report? Use this online interview consent form template to allow the interviewee to give their consent before you use the interview data for research or report. With premium features like e-signature, upload fields, form security, etc., Formplus Builder is the perfect tool to create your preferred online consent forms without coding experience. 

  • QUESTIONNAIRES

This is the process of collecting data through an instrument consisting of a series of questions and prompts to receive a response from the individuals it is administered to. Questionnaires are designed to collect data from a group. 

For clarity, it is important to note that a questionnaire isn’t a survey, rather it forms a part of it. A survey is a process of data gathering involving a variety of data collection methods, including a questionnaire.

A questionnaire uses three kinds of questions: fixed-alternative, scale, and open-ended. Each question is tailored to the nature and scope of the research.
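As an illustration of the three kinds of questions, the sketch below models a questionnaire item in Python and checks whether an answer fits its kind. The `Question` class, its field names, and the sample questions are hypothetical and are not part of Formplus or any other tool.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Question:
    """One questionnaire item; `kind` is 'fixed', 'scale', or 'open'."""
    text: str
    kind: str
    options: List[str] = field(default_factory=list)  # used by 'fixed'
    scale_min: int = 1                                 # used by 'scale'
    scale_max: int = 5                                 # used by 'scale'

    def accepts(self, answer: Any) -> bool:
        """Return True if `answer` is a valid response for this question."""
        if self.kind == "fixed":
            # Fixed-alternative: the answer must be one of the listed options.
            return answer in self.options
        if self.kind == "scale":
            # Scale: the answer must be a whole number within the range.
            return isinstance(answer, int) and self.scale_min <= answer <= self.scale_max
        if self.kind == "open":
            # Open-ended: any non-empty text is accepted.
            return isinstance(answer, str) and bool(answer.strip())
        raise ValueError(f"unknown question kind: {self.kind!r}")


# A hypothetical three-question questionnaire, one question of each kind.
questionnaire = [
    Question("Do you own a smartphone?", "fixed", options=["Yes", "No"]),
    Question("How satisfied are you with it?", "scale"),
    Question("What would you improve?", "open"),
]
```

Fixed-alternative and scale answers can be counted and averaged directly, while open-ended answers need to be read or coded before they can be analyzed.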

Pros

  • Can be administered in large numbers and is cost-effective.
  • It can be used to compare and contrast previous research to measure change.
  • Easy to visualize and analyze.
  • Questionnaires offer actionable data.
  • Respondent identity is protected.
  • Questionnaires can cover all areas of a topic.
  • Relatively inexpensive.

Cons

  • Answers may be dishonest, or the respondents may lose interest midway.
  • Questionnaires built only from close-ended questions can’t produce qualitative data.
  • Questions might be left unanswered.
  • Respondents may have a hidden agenda.
  • Not all questions can be analyzed easily.

What are the Best Data Collection Tools for Questionnaires? 

  • Formplus Online Questionnaire

Formplus lets you create powerful forms to help you collect the information you need. Use the Formplus online questionnaire form template to get actionable trends and measurable responses. Conduct research, optimize knowledge of your brand, or just get to know an audience with this form template. The form template is fast, free, and fully customizable.

  • Paper Questionnaire

A paper questionnaire is a data collection tool consisting of a series of questions and/or prompts for the purpose of gathering information from respondents. Mostly designed for statistical analysis of the responses, they can also be used as a form of data collection.

  • REPORTING

By definition, data reporting is the process of gathering and submitting data to be further subjected to analysis. The key aspect of data reporting is reporting accurate data because inaccurate data reporting leads to uninformed decision-making.

Pros

  • Informed decision-making.
  • Easily accessible.

Cons

  • Self-reported answers may be exaggerated.
  • The results may be affected by bias.
  • Respondents may be too shy to give out all the details.
  • Inaccurate reports will lead to uninformed decisions.

What are the Best Data Collection Tools for Reporting?

Reporting tools enable you to extract and present data in charts, tables, and other visualizations so users can find useful information. You could source data for reporting from Non-Governmental Organizations (NGO) reports, newspapers, website articles, and hospital records.

  • NGO Reports

An NGO report is an in-depth and comprehensive account of the activities carried out by the NGO, covering areas such as business and human rights. The information contained in these reports is research-specific and forms an acceptable academic base for collecting data. NGOs often focus on development projects which are organized to promote particular causes.

  • Newspapers

Newspaper data are relatively easy to collect and are sometimes the only continuously available source of event data. Even though there is a problem of bias in newspaper data, it is still a valid tool for collecting data for reporting.

  • Website Articles

Gathering and using data contained in website articles is another tool for data collection. Collecting data from web articles is a quicker and less expensive data collection method. Two major disadvantages of using this data reporting method are biases inherent in the data collection process and possible security/confidentiality concerns.

  • Hospital Care records

Health care involves a diverse set of public and private data collection systems, including health surveys, administrative enrollment and billing records, and medical records, used by various entities, including hospitals, CHCs, physicians, and health plans. The data provided is clear, unbiased and accurate, but must be obtained under legal means as medical data is kept with the strictest regulations.

  • EXISTING DATA

This is the introduction of new investigative questions in addition to/other than the ones originally used when the data was initially gathered. It involves adding measurement to a study or research. An example would be sourcing data from an archive.

Pros

  • Accuracy is very high.
  • Easily accessible information.

Cons

  • Problems with evaluation.
  • Difficulty in understanding.

What are the Best Data Collection Tools for Existing Data?

The concept of Existing data means that data is collected from existing sources to investigate research questions other than those for which the data were originally gathered. Tools to collect existing data include: 

  • Research Journals – Unlike newspapers and magazines, research journals are intended for an academic or technical audience, not general readers. A journal is a scholarly publication containing articles written by researchers, professors, and other experts.
  • Surveys – A survey is a data collection tool for gathering information from a sample population, with the intention of generalizing the results to a larger population. Surveys have a variety of purposes and can be carried out in many ways depending on the objectives to be achieved.

  • OBSERVATION

This is a data collection method by which information on a phenomenon is gathered through observation. The observation can be carried out as a complete observer, an observer as a participant, a participant as an observer, or a complete participant. This method is a key base for formulating a hypothesis.

Pros

  • Easy to administer.
  • Results tend to be more accurate.
  • It is a universally accepted practice.
  • It gets around respondents’ unwillingness to provide a report.
  • It is appropriate for certain situations.

Cons

  • Some phenomena aren’t open to observation.
  • It cannot always be relied upon.
  • Bias may arise.
  • It is expensive to administer.
  • Its validity cannot be predicted accurately.

What are the Best Data Collection Tools for Observation?

Observation involves the active acquisition of information from a primary source. Observation can also involve the perception and recording of data via the use of scientific instruments. The best tools for Observation are:

  • Checklists – State specific criteria that allow users to gather information and make judgments about what they should know in relation to the outcomes. They offer systematic ways of collecting data about specific behaviors, knowledge, and skills.
  • Direct observation – This is an observational study method of collecting evaluative information. The evaluator watches the subject in his or her usual environment without altering that environment.

  • FOCUS GROUPS

The opposite of quantitative research, which involves numerical data, this data collection method focuses on qualitative research. It falls under the primary category of data and is based on the feelings and opinions of the respondents. It involves asking open-ended questions to a group of individuals, usually 6 to 10 people, to obtain feedback.

Pros

  • Information obtained is usually very detailed.
  • Cost-effective when compared to one-on-one interviews.
  • It reflects speed and efficiency in the supply of results.

Cons

  • Lacking depth in covering the nitty-gritty of a subject matter.
  • Bias might still be evident.
  • Requires interviewer training.
  • The researcher has very little control over the outcome.
  • A few vocal voices can drown out the rest.
  • Difficulty in assembling an all-inclusive group.

What are the Best Data Collection Tools for Focus Groups?

A focus group is a data collection method that is tightly facilitated and structured around a set of questions. The purpose of the meeting is to extract detailed responses to these questions from the participants. The best tools for tackling focus groups are:

  • Two-Way – One group watches another group answer the questions posed by the moderator. After listening to what the other group has to offer, the group that listens is able to facilitate more discussion and could potentially draw different conclusions.
  • Dueling-Moderator – There are two moderators who play the devil’s advocate. The main positive of the dueling-moderator focus group is to facilitate new ideas by introducing new ways of thinking and varying viewpoints.

  • COMBINATION RESEARCH

This method of data collection encompasses the use of innovative methods to enhance participation by both individuals and groups. Also under the primary category, it is a combination of interviews and focus groups for collecting qualitative data. This method is key when addressing sensitive subjects.

Pros

  • It encourages participants to give responses.
  • It stimulates a deeper connection between participants.
  • The relative anonymity of respondents increases participation.
  • It improves the richness of the data collected.

Cons

  • It costs the most out of all the top 7.
  • It’s the most time-consuming.

What are the Best Data Collection Tools for Combination Research? 

The Combination Research method involves two or more data collection methods, for instance, interviews as well as questionnaires or a combination of semi-structured telephone interviews and focus groups. The best tools for combination research are: 

  • Online Survey – The two tools combined here are online interviews and questionnaires. This is a questionnaire that the target audience can complete over the Internet. It is timely, effective, and efficient, especially since the data to be collected is quantitative in nature.
  • Dual-Moderator – The two tools combined here are focus groups and structured questionnaires. The structured questionnaires give a direction as to where the research is headed while two moderators take charge of the proceedings. Whilst one ensures the focus group session progresses smoothly, the other makes sure that the topics in question are all covered. Dual-moderator focus groups typically result in a more productive session and essentially lead to an optimum collection of data.

Why Formplus is the Best Data Collection Tool

  • Vast Options for Form Customization 

With Formplus, you can create your unique survey form. With options to change themes, font color, font, font type, layout, width, and more, you can create an attractive survey form. The builder also gives you as many features as possible to choose from and you do not need to be a graphic designer to create a form.

  • Extensive Analytics

Form Analytics, a feature in Formplus, helps you view the number of respondents, unique visits, total visits, abandonment rate, and average time spent before submission. This tool eliminates the need for a manual calculation of the received data and/or responses as well as the conversion rate for your poll.

  • Embed Survey Form on Your Website

Copy the link to your form and embed it as an iframe which will automatically load as your website loads, or as a popup that opens once the respondent clicks on the link. Embed the link on your Twitter page to give instant access to your followers.

  • Geolocation Support

The geolocation feature on Formplus lets you ascertain where individual responses are coming from. It utilises Google Maps to pinpoint the longitude and latitude of the respondent, as accurately as possible, along with the responses.

  • Multi-Select feature

This feature helps to conserve horizontal space as it allows you to put multiple options in one field. This translates to including more information on the survey form. 

Read Also: 10 Reasons to Use Formplus for Online Data Collection

How to Use Formplus to collect online data in 8 simple steps.

1. Register or sign up on Formplus builder: Start creating your preferred questionnaire or survey by signing up with either your Google, Facebook, or Email account.

Formplus gives you a free plan with basic features you can use to collect online data. Pricing plans with vast features start at $20 monthly, with reasonable discounts for Education and Non-Profit Organizations.

2. Input your survey title and use the form builder choice options to start creating your surveys. 

Use the choice option fields like single select, multiple select, checkbox, radio, and image choices to create your preferred multi-choice surveys online.

3. Do you want customers to rate any of your products or services delivery? 

Use ratings to allow survey respondents to rate your products or services. This is an ideal quantitative research method of collecting data.

4. Beautify your online questionnaire with Formplus Customisation features.

  • Change the theme color
  • Add your brand’s logo and image to the forms
  • Change the form width and layout
  • Edit the submission button if you want
  • Change text font color and sizes
  • Do you already have custom CSS to beautify your questionnaire? If yes, just copy and paste it into the CSS option.

5. Edit your survey questionnaire settings for your specific needs

Choose where to store your files and responses. Select a submission deadline, choose a timezone, limit respondents’ responses, enable Captcha to prevent spam, and collect location data of customers.

Set an introductory message to respondents before they begin the survey, toggle the “start button”, set a final post-submission message, or redirect respondents to another page when they submit their questionnaires.

Change the Email Notifications inventory and initiate an autoresponder message to all your survey questionnaire respondents. You can also transfer your forms to other users who can become form administrators.

6. Share links to your survey questionnaire page with customers.

There’s an option to copy and share the link as “Popup” or “Embed code”. The data collection tool automatically creates a QR code for the survey questionnaire, which you can download and share as appropriate.

Congratulations if you’ve made it to this stage. You can start sharing the link to your survey questionnaire with your customers.

7. View your Responses to the Survey Questionnaire

Toggle the presentation of your summary among the available options: single, table, or cards.

8. Allow Formplus Analytics to interpret your Survey Questionnaire Data

With online form builder analytics, a business can determine:

  • The number of times the survey questionnaire was filled
  • The number of customers reached
  • Abandonment Rate: The rate at which customers exit the form without submitting it.
  • Conversion Rate: The percentage of customers who completed the online form
  • Average time spent per visit
  • Location of customers/respondents.
  • The type of device used by the customer to complete the survey questionnaire.
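The two rates in the list above can be sketched as simple arithmetic. The article does not give the exact formulas Formplus uses, so this example assumes that every visit ends either in a submission or in an abandonment; the `form_rates` function and its parameter names are hypothetical.

```python
def form_rates(total_visits: int, submissions: int) -> dict:
    """Return conversion and abandonment rates as percentages.

    Assumption: each visit either ends in a submission (conversion)
    or in the respondent leaving without submitting (abandonment).
    """
    if total_visits <= 0:
        raise ValueError("total_visits must be positive")
    if not 0 <= submissions <= total_visits:
        raise ValueError("submissions must be between 0 and total_visits")

    conversion_rate = submissions / total_visits * 100
    abandonment_rate = 100 - conversion_rate
    return {
        "conversion_rate": conversion_rate,
        "abandonment_rate": abandonment_rate,
    }


# Made-up numbers: 200 visits to the form, 50 completed submissions.
rates = form_rates(total_visits=200, submissions=50)
print(rates)
```

Under this assumption the two rates always add up to 100%, so a rising abandonment rate is the same signal as a falling conversion rate.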

7 Tips to Create The Best Surveys For Data Collections

  • Define the goal of your survey – Once the goal of your survey is outlined, it will aid in deciding which questions are the top priority. A clear, attainable goal would, for example, mirror a clear reason as to why something is happening, e.g. “The goal of this survey is to understand why employees are leaving an establishment.”
  • Use close-ended clearly defined questions – Avoid open-ended questions and ensure you’re not suggesting your preferred answer to the respondent. If possible offer a range of answers with choice options and ratings.
  • Survey outlook should be attractive and Inviting – An attractive-looking survey encourages a higher number of recipients to respond to the survey. Check out Formplus Builder for colorful options to integrate into your survey design. You could use images and videos to keep participants glued to their screens.
  • Assure respondents about the safety of their data – You want your respondents to be assured whilst disclosing details of their personal information to you. It’s your duty to inform the respondents that the data they provide is confidential and only collected for the purpose of research.
  • Ensure your survey can be completed in record time – Ideally, in a typical survey, users should be able to respond in 100 seconds. It is pertinent to note that they, the respondents, are doing you a favor. Don’t stress them. Be brief and get straight to the point.
  • Do a trial survey – Preview your survey before sending out your surveys to the intended respondents. Make a trial version which you’ll send to a few individuals. Based on their responses, you can draw inferences and decide whether or not your survey is ready for the big time.
  • Attach a reward upon completion for users – Give your respondents something to look forward to at the end of the survey. Think of it as a penny for their troubles. It could well be the encouragement they need to not abandon the survey midway.

Try out Formplus today. You can start making your own surveys with the Formplus online survey builder. By applying these tips, you will definitely get the most out of your online surveys.

Top Survey Templates For Data Collection 

  • Customer Satisfaction Survey Template 

On the template, you can collect data to measure customer satisfaction over key areas like the commodity purchase and the level of service they received. It also gives insight as to which products the customer enjoyed, how often they buy such a product, and whether or not the customer is likely to recommend the product to a friend or acquaintance. 

  • Demographic Survey Template

With this template, you can measure, with accuracy, the ratio of male to female, age range, and the number of unemployed persons in a particular country as well as obtain their personal details such as names and addresses.

Respondents are also able to state their religious and political views about the country under review.

  • Feedback Form Template

The online feedback form template captures the details of a product and/or service used, identifying the product or service and documenting how long the customer has used it.

The overall satisfaction is measured as well as the delivery of the services. The likelihood that the customer recommends the product is also measured.

  • Online Questionnaire Template

The online questionnaire template houses the respondent’s data as well as educational qualifications to collect information to be used for academic research.

Respondents can also provide their gender, race, and field of study as well as present living conditions as prerequisite data for the research study.

  • Student Data Sheet Form Template 

The template is a data sheet containing all the relevant information of a student. The student’s name, home address, guardian’s name, record of attendance as well as performance in school is well represented on this template. This is a perfect data collection method to deploy for a school or an education organization.

Also included is a record for interaction with others as well as a space for a short comment on the overall performance and attitude of the student. 

  • Interview Consent Form Template

This online interview consent form template allows the interviewee to sign off their consent to use the interview data for research or report to journalists. With premium features like short text fields, upload, e-signature, etc., Formplus Builder is the perfect tool to create your preferred online consent forms without coding experience.

What is the Best Data Collection Method for Qualitative Data?

Answer: Combination Research

The best data collection method for a researcher for gathering qualitative data which generally is data relying on the feelings, opinions, and beliefs of the respondents would be Combination Research.

The reason why combination research is the best fit is that it encompasses the attributes of Interviews and Focus Groups. It is also useful when gathering data that is sensitive in nature. It can be described as an all-purpose qualitative data collection method.

Above all, combination research improves the richness of data collected when compared with other data collection methods for qualitative data.

What is the Best Data Collection Method for Quantitative Research Data?

Answer: Questionnaire

The best data collection method a researcher can employ in gathering quantitative data, that is, data that can be represented in numbers and figures and deduced mathematically, is the questionnaire.

These can be administered to a large number of respondents while saving costs. For quantitative data that may be bulky or voluminous in nature, the use of a Questionnaire makes such data easy to visualize and analyze.

Another key advantage of the Questionnaire is that it can be used to compare and contrast previous research work done to measure changes.

Technology-Enabled Data Collection Methods

Technology has revolutionized the way data is collected, so many diverse methods are now available. It has provided efficient and innovative methods that anyone, especially researchers and organizations, can use. Below are some technology-enabled data collection methods:

  • Online Surveys: Online surveys have gained popularity due to their ease of use and wide reach. You can distribute them through email, social media, or embed them on websites. Online surveys allow quick data collection, automated data capture, and real-time analysis. Online surveys also offer features like skip logic, validation checks, and multimedia integration.
  • Mobile Surveys: With the widespread use of smartphones, mobile surveys’ popularity is also on the rise. Mobile surveys leverage the capabilities of mobile devices, which allows respondents to participate at their convenience. They can include multimedia elements, location-based information, and real-time feedback. Mobile surveys are best for capturing in-the-moment experiences or opinions.
  • Social Media Listening: Social media platforms are a good source of unstructured data that you can analyze to gain insights into customer sentiment and trends. Social media listening involves monitoring and analyzing social media conversations, mentions, and hashtags to understand public opinion, identify emerging topics, and assess brand reputation.
  • Wearable Devices and Sensors: Wearable devices, such as fitness trackers or smartwatches, and sensors embedded in everyday objects can capture continuous data on various physiological and environmental variables. This data can provide you with insights into health behaviors, activity patterns, sleep quality, and environmental conditions, among others.
  • Big Data Analytics: Big data analytics leverages large volumes of structured and unstructured data from various sources, such as transaction records, social media, and internet browsing. Advanced analytics techniques, like machine learning and natural language processing, can extract meaningful insights and patterns from this data, enabling organizations to make data-driven decisions.
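The skip logic and validation checks mentioned for online surveys can be illustrated with a minimal sketch. The question names (age, owns_smartphone, daily_usage_hours) and the rules are hypothetical, chosen only for illustration; they are not taken from any particular survey tool.

```python
def validate_response(answers):
    """Return a list of validation problems for one survey response."""
    problems = []
    age = answers.get("age")
    # Validation check: age must be a plausible whole number.
    if not isinstance(age, int) or not 18 <= age <= 120:
        problems.append("age must be a whole number between 18 and 120")
    # Validation check: closed question with two allowed answers.
    if answers.get("owns_smartphone") not in ("yes", "no"):
        problems.append("owns_smartphone must be 'yes' or 'no'")
    return problems


def next_question(answers):
    """Skip logic: only smartphone owners are asked about daily usage."""
    if answers.get("owns_smartphone") == "yes":
        return "daily_usage_hours"
    return "end_of_survey"
```

Checking answers at entry time, rather than during analysis, means that invalid responses can be corrected while the respondent is still available.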

Faulty Data Collection Practices – Common Mistakes & Sources of Error

While technology-enabled data collection methods offer numerous advantages, there are some pitfalls and sources of error that you should be aware of. Here are some common mistakes and sources of error in data collection:

  • Population Specification Error: Population specification error occurs when the target population is not clearly defined or misidentified. This error leads to a mismatch between the research objectives and the actual population being studied, resulting in biased or inaccurate findings.
  • Sample Frame Error: Sample frame error occurs when the sampling frame, the list or source from which the sample is drawn, does not adequately represent the target population. This error can introduce selection bias and affect the generalizability of the findings.
  • Selection Error: Selection error occurs when the process of selecting participants or units for the study introduces bias. It can happen due to nonrandom sampling methods, inadequate sampling techniques, or self-selection bias. Selection error compromises the representativeness of the sample and affects the validity of the results.
  • Nonresponse Error: Nonresponse error occurs when selected participants choose not to participate or fail to respond to the data collection effort. Nonresponse bias can result in an unrepresentative sample if those who choose not to respond differ systematically from those who do respond. Efforts should be made to mitigate nonresponse and encourage participation to minimize this error.
  • Measurement Error: Measurement error arises from inaccuracies or inconsistencies in the measurement process. It can happen due to poorly designed survey instruments, ambiguous questions, respondent bias, or errors in data entry or coding. Measurement errors can lead to distorted or unreliable data, affecting the validity and reliability of the findings.

To mitigate these errors and ensure high-quality data collection, carefully plan your data collection procedures and validate your measurement tools. Use appropriate sampling techniques, employ randomization where possible, and minimize nonresponse through effective communication and incentives. Conduct regular checks, and implement validation and data cleaning procedures to identify and rectify errors during data analysis.
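As a rough illustration of randomization and nonresponse monitoring, the sketch below draws a simple random sample from a sampling frame and computes a response rate. The frame of 200 member IDs and the count of 14 responses are made-up values, and the function names are hypothetical.

```python
import random


def draw_simple_random_sample(sampling_frame, sample_size, seed=None):
    """Select units without replacement, each with equal probability."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(sampling_frame, sample_size)


def response_rate(num_invited, num_responded):
    """Share of invited participants who actually responded."""
    if num_invited <= 0:
        raise ValueError("num_invited must be positive")
    return num_responded / num_invited


# Hypothetical sampling frame: a list of 200 member IDs.
frame = [f"member-{i:03d}" for i in range(1, 201)]
sample = draw_simple_random_sample(frame, 20, seed=42)
rate = response_rate(len(sample), 14)  # assume 14 of the 20 replied
```

A low response rate does not prove nonresponse bias on its own, but it signals that responders and nonresponders should be compared where possible.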

Best Practices for Data Collection

  • Clearly Define Objectives: Clearly define the research objectives and questions to guide the data collection process. This helps ensure that the collected data aligns with the research goals and provides relevant insights.
  • Plan Ahead: Develop a detailed data collection plan that includes the timeline, resources needed, and specific procedures to follow. This helps maintain consistency and efficiency throughout the data collection process.
  • Choose the Right Method: Select data collection methods that are appropriate for the research objectives and target population. Consider factors such as feasibility, cost-effectiveness, and the ability to capture the required data accurately.
  • Pilot Test: Before full-scale data collection, conduct a pilot test to identify any issues with the data collection instruments or procedures. This allows for refinement and improvement before data collection with the actual sample.
  • Train Data Collectors: If data collection involves human interaction, ensure that data collectors are properly trained on the data collection protocols, instruments, and ethical considerations. Consistent training helps minimize errors and maintain data quality.
  • Maintain Consistency: Follow standardized procedures throughout the data collection process to ensure consistency across data collectors and time. This includes using consistent measurement scales, instructions, and data recording methods.
  • Minimize Bias: Be aware of potential sources of bias in data collection and take steps to minimize their impact. Use randomization techniques, employ diverse data collectors, and implement strategies to mitigate response biases.
  • Ensure Data Quality: Implement quality control measures to ensure the accuracy, completeness, and reliability of the collected data. Conduct regular checks for data entry errors, inconsistencies, and missing values.
  • Maintain Data Confidentiality: Protect the privacy and confidentiality of participants’ data by implementing appropriate security measures. Ensure compliance with data protection regulations and obtain informed consent from participants.
  • Document the Process: Keep detailed documentation of the data collection process, including any deviations from the original plan, challenges encountered, and decisions made. This documentation facilitates transparency, replicability, and future analysis.
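The quality control checks described above (data entry errors, inconsistencies, and missing values) can be sketched as a small routine. The field names and valid ranges below are assumptions made for the example, not a standard.

```python
def check_records(records, required_fields, valid_ranges):
    """Return (row_index, problem) pairs for duplicates, gaps, and bad values."""
    issues = []
    seen_ids = set()
    for index, record in enumerate(records):
        respondent_id = record.get("respondent_id")
        if respondent_id in seen_ids:
            issues.append((index, "duplicate respondent_id"))
        seen_ids.add(respondent_id)
        for field in required_fields:
            if record.get(field) in (None, ""):
                issues.append((index, f"missing {field}"))
        for field, (low, high) in valid_ranges.items():
            value = record.get(field)
            if isinstance(value, (int, float)) and not low <= value <= high:
                issues.append((index, f"{field} out of range"))
    return issues


# Hypothetical records with one missing value, one impossible rating,
# and one repeated respondent.
records = [
    {"respondent_id": "r1", "age": 34, "rating": 4},
    {"respondent_id": "r2", "age": None, "rating": 7},
    {"respondent_id": "r1", "age": 29, "rating": 3},
]
issues = check_records(
    records,
    required_fields=["age", "rating"],
    valid_ranges={"age": (18, 120), "rating": (1, 5)},
)
```

Returning the row index with each problem keeps the check separate from the decision about how to fix or exclude the record.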

FAQs about Data Collection

  • What are secondary sources of data collection? Secondary sources of data collection are defined as the data that has been previously gathered and is available for your use as a researcher. These sources can include published research papers, government reports, statistical databases, and other existing datasets.
  • What are the primary sources of data collection? Primary sources of data collection involve collecting data directly from the original source, also known as the firsthand source. You can do this through surveys, interviews, observations, experiments, or other direct interactions with individuals or subjects of study.
  • How many types of data are there? There are two main types of data: qualitative and quantitative. Qualitative data is non-numeric and it includes information in the form of words, images, or descriptions. Quantitative data, on the other hand, is numeric and you can measure and analyze it statistically.

Data Collection | Definition, Methods & Examples

Published on June 5, 2020 by Pritha Bhandari. Revised on June 21, 2023.

Data collection is a systematic process of gathering observations or measurements. Whether you are performing research for business, governmental or academic purposes, data collection allows you to gain first-hand knowledge and original insights into your research problem.

While methods and aims may differ between fields, the overall process of data collection remains largely the same. Before you begin collecting data, you need to consider:

  • The  aim of the research
  • The type of data that you will collect
  • The methods and procedures you will use to collect, store, and process the data

To collect high-quality data that is relevant to your purposes, follow these four steps.

Table of contents

  • Step 1: Define the aim of your research
  • Step 2: Choose your data collection method
  • Step 3: Plan your data collection procedures
  • Step 4: Collect the data
  • Other interesting articles
  • Frequently asked questions about data collection

Step 1: Define the aim of your research

Before you start the process of data collection, you need to identify exactly what you want to achieve. You can start by writing a problem statement: what is the practical or scientific issue that you want to address and why does it matter?

Next, formulate one or more research questions that precisely define what you want to find out. Depending on your research questions, you might need to collect quantitative or qualitative data:

  • Quantitative data is expressed in numbers and graphs and is analyzed through statistical methods.
  • Qualitative data is expressed in words and analyzed through interpretations and categorizations.

If your aim is to test a hypothesis, measure something precisely, or gain large-scale statistical insights, collect quantitative data. If your aim is to explore ideas, understand experiences, or gain detailed insights into a specific context, collect qualitative data. If you have several aims, you can use a mixed methods approach that collects both types of data.

  • Your first aim is to assess whether there are significant differences in perceptions of managers across different departments and office locations.
  • Your second aim is to gather meaningful feedback from employees to explore new ideas for how managers can improve.


Step 2: Choose your data collection method

Based on the data you want to collect, decide which method is best suited for your research.

  • Experimental research is primarily a quantitative method.
  • Interviews, focus groups, and ethnographies are qualitative methods.
  • Surveys, observations, archival research and secondary data collection can be quantitative or qualitative methods.

Carefully consider what method you will use to gather data that helps you directly answer your research questions.

Step 3: Plan your data collection procedures

When you know which method(s) you are using, you need to plan exactly how you will implement them. What procedures will you follow to make accurate observations or measurements of the variables you are interested in?

For instance, if you’re conducting surveys or interviews, decide what form the questions will take; if you’re conducting an experiment, make decisions about your experimental design (e.g., determine inclusion and exclusion criteria).

Operationalization

Sometimes your variables can be measured directly: for example, you can collect data on the average age of employees simply by asking for dates of birth. However, often you’ll be interested in collecting data on more abstract concepts or variables that can’t be directly observed.

Operationalization means turning abstract conceptual ideas into measurable observations. When planning how you will collect data, you need to translate the conceptual definition of what you want to study into the operational definition of what you will actually measure.

  • You ask managers to rate their own leadership skills on 5-point scales assessing the ability to delegate, decisiveness and dependability.
  • You ask their direct employees to provide anonymous feedback on the managers regarding the same topics.
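One way to turn the ratings in this example into a single measurable quantity is to average the three 5-point items into a composite score. This is only a sketch of one possible operational definition; the item keys, the equal weighting, and the sample ratings are assumptions.

```python
from statistics import mean

# Hypothetical item names for the three rated abilities.
ITEMS = ("delegation", "decisiveness", "dependability")


def leadership_score(ratings):
    """Average the three 5-point items into one composite score."""
    for item in ITEMS:
        if ratings[item] not in (1, 2, 3, 4, 5):
            raise ValueError(f"{item} must be a whole number from 1 to 5")
    return mean(ratings[item] for item in ITEMS)


# Made-up ratings: one manager's self-rating and two employees' ratings.
self_rating = {"delegation": 4, "decisiveness": 5, "dependability": 3}
employee_ratings = [
    {"delegation": 3, "decisiveness": 4, "dependability": 2},
    {"delegation": 2, "decisiveness": 4, "dependability": 3},
]

self_score = leadership_score(self_rating)
employee_score = mean(leadership_score(r) for r in employee_ratings)
```

Comparing the self score with the employee score gives two operational views of the same abstract concept.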

You may need to develop a sampling plan to obtain data systematically. This involves defining a population, the group you want to draw conclusions about, and a sample, the group you will actually collect data from.

Your sampling method will determine how you recruit participants or obtain measurements for your study. To decide on a sampling method you will need to consider factors like the required sample size, accessibility of the sample, and timeframe of the data collection.
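As one example of a sampling method, the sketch below draws a proportionate stratified sample, taking the same fraction from each subgroup of the population. The department names, group sizes, and the 10% fraction are made up, and the function is a hypothetical helper rather than a prescribed procedure.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction, seed=None):
    """Draw the same fraction of units from every stratum, at random."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Take at least one unit so that no stratum is left out.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 60 sales and 40 engineering employees.
employees = [("sales", i) for i in range(60)] + [
    ("engineering", i) for i in range(40)
]
sample = stratified_sample(employees, lambda e: e[0], fraction=0.1, seed=1)
```

Stratifying keeps each department represented in proportion to its size, which a small simple random sample does not guarantee.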

Standardizing procedures

If multiple researchers are involved, write a detailed manual to standardize data collection procedures in your study.

This means laying out specific step-by-step instructions so that everyone in your research team collects data in a consistent way – for example, by conducting experiments under the same conditions and using objective criteria to record and categorize observations. This helps you avoid common research biases like omitted variable bias or information bias.

This helps ensure the reliability of your data, and you can also use it to replicate the study in the future.

Creating a data management plan

Before beginning data collection, you should also decide how you will organize and store your data.

  • If you are collecting data from people, you will likely need to anonymize and safeguard the data to prevent leaks of sensitive information (e.g. names or identity numbers).
  • If you are collecting data via interviews or pencil-and-paper formats, you will need to perform transcriptions or data entry in systematic ways to minimize distortion.
  • You can prevent loss of data by having an organization system that is routinely backed up.
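A minimal sketch of safeguarding identifiers: direct identifiers are dropped and replaced with a keyed hash, so records can still be linked without storing names or identity numbers. Strictly speaking this is pseudonymization rather than full anonymization. The field names are hypothetical, and the sketch assumes the secret key is stored separately from the data.

```python
import hashlib
import hmac


def pseudonymize(identifier, secret_key):
    """Derive a stable participant code from an identifier and a secret key."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def strip_identifiers(record, secret_key, direct_identifiers=("name", "id_number")):
    """Remove direct identifiers and attach a participant code instead."""
    cleaned = {k: v for k, v in record.items() if k not in direct_identifiers}
    cleaned["participant_code"] = pseudonymize(record["id_number"], secret_key)
    return cleaned


# Made-up record and key, for illustration only.
key = b"example-key-kept-outside-the-dataset"
record = {"name": "A. Participant", "id_number": "12345", "rating": 4}
safe_record = strip_identifiers(record, key)
```

The same identifier always maps to the same code under one key, so repeated measurements from one participant can still be matched.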

Step 4: Collect the data

Finally, you can implement your chosen methods to measure or observe the variables you are interested in.

The closed-ended questions ask participants to rate their manager’s leadership skills on scales from 1–5. The data produced is numerical and can be statistically analyzed for averages and patterns.

To ensure that high quality data is recorded in a systematic way, here are some best practices:

  • Record all relevant information as and when you obtain data. For example, note down whether or how lab equipment is recalibrated during an experimental study.
  • Double-check manual data entry for errors.
  • If you collect quantitative data, you can assess the reliability and validity to get an indication of your data quality.
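One common indicator of reliability for multi-item rating scales is Cronbach's alpha, which measures internal consistency. It is named here as an illustrative choice, not as the method the text prescribes. The sketch assumes each row holds one respondent's scores on the same items, and it will fail with a division by zero if all respondents have identical total scores.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Internal consistency of a scale; rows are respondents, columns are items."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up scores from four respondents on three 5-point items.
scores = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3]]
alpha = cronbach_alpha(scores)
```

Values closer to 1 indicate that the items vary together, which is what a scale intended to measure one concept should show.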

Frequently asked questions about data collection

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g. understanding the needs of your consumers or user testing your website)
  • You can control and standardize the process for high reliability and validity (e.g. choosing appropriate measurements and sampling methods)

However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses. Qualitative methods allow you to explore concepts and experiences in more detail.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.
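The idea of consistency under the same conditions can be illustrated with a test-retest check: the same participants are measured twice and the two sets of scores are correlated. The sketch below computes a Pearson correlation by hand; the scores are made up, and test-retest is only one of several ways to assess reliability.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Made-up scores for five participants measured on two occasions.
first_session = [12, 15, 9, 20, 17]
second_session = [13, 14, 10, 19, 18]
retest_r = pearson_r(first_session, second_session)
```

A correlation near 1 suggests the measure gives stable results; it says nothing about validity, since a measure can be consistent and still measure the wrong thing.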

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data, it’s important to consider how you will operationalize the variables that you want to measure.

In mixed methods research, you use both qualitative and quantitative data collection and analysis methods to answer your research question.

Cite this Scribbr article

Bhandari, P. (2023, June 21). Data Collection | Definition, Methods & Examples. Scribbr. Retrieved April 8, 2024, from https://www.scribbr.com/methodology/data-collection/

  • J Grad Med Educ
  • v.8(2); 2016 May

Design: Selection of Data Collection Methods

Editor's Note: The online version of this article contains resources for further reading and a table of strengths and limitations of qualitative data collection methods.

The Challenge

Imagine that residents in your program have been less than complimentary about interprofessional rounds (IPRs). The program director asks you to determine what residents are learning about in collaboration with other health professionals during IPRs. If you construct a survey asking Likert-type questions such as “How much are you learning?” you likely will not gather the information you need to answer this question. You understand that qualitative data deal with words rather than numbers and could provide the needed answers. How do you collect “good” words? Should you use open-ended questions in a survey format? Should you conduct interviews, focus groups, or conduct direct observation? What should you consider when making these decisions?

Introduction

Qualitative research is often employed when there is a problem and no clear solutions exist, as in the case above that elicits the following questions: Why are residents complaining about rounds? How could we make rounds better? In this context, collecting “good” information or words (qualitative data) is intended to produce information that helps you to answer your research questions, capture the phenomenon of interest, and account for context and the rich texture of the human experience. You may also aim to challenge previous thinking and invite further inquiry.

Coherence or alignment between all aspects of the research project is essential. In this Rip Out we focus on data collection, but in qualitative research, the entire project must be considered. 1 , 2 Careful design of the data collection phase requires the following: deciding who will do what, where, when, and how at the different stages of the research process; acknowledging the role of the researcher as an instrument of data collection; and carefully considering the context studied and the participants and informants involved in the research.

Types of Data Collection Methods

Data collection methods are important, because how the information collected is used and what explanations it can generate are determined by the methodology and analytical approach applied by the researcher. 1 , 2 Five key data collection methods are presented here, with their strengths and limitations described in the online supplemental material.

  • 1 Questions added to surveys to obtain qualitative data typically are open-ended with a free-text format. Surveys are ideal for documenting perceptions, attitudes, beliefs, or knowledge within a clear, predetermined sample of individuals. “Good” open-ended questions should be specific enough to yield coherent responses across respondents, yet broad enough to invite a spectrum of answers. Examples for this scenario include: What is the function of IPRs? What is the educational value of IPRs, according to residents? Qualitative survey data can be analyzed using a range of techniques.
  • 2 Interviews are used to gather information from individuals 1-on-1, using a series of predetermined questions or a set of interest areas. Interviews are often recorded and transcribed. They can be structured or unstructured; they can either follow a tightly written script that mimics a survey or be inspired by a loose set of questions that invite interviewees to express themselves more freely. Interviewers need to actively listen and question, probe, and prompt further to collect richer data. Interviews are ideal when used to document participants' accounts, perceptions of, or stories about attitudes toward and responses to certain situations or phenomena. Interview data are often used to generate themes, theories, and models. Many research questions that can be answered with surveys can also be answered through interviews, but interviews will generally yield richer, more in-depth data than surveys. Interviews do, however, require more time and resources to conduct and analyze. Importantly, because interviewers are the instruments of data collection, interviewers should be trained to collect comparable data. The number of interviews required depends on the research question and the overarching methodology used. Examples of these questions include: How do residents experience IPRs? What do residents' stories about IPRs tell us about interprofessional care hierarchies?
  • 3 Focus groups are used to gather information in a group setting, either through predetermined interview questions that the moderator asks of participants in turn or through a script to stimulate group conversations. Ideally, they are used when the sum of a group of people's experiences may offer more than a single individual's experiences in understanding social phenomena. Focus groups also allow researchers to capture participants' reactions to the comments and perspectives shared by other participants, and are thus a way to capture similarities and differences in viewpoints. The number of focus groups required will vary based on the questions asked and the number of different stakeholders involved, such as residents, nurses, social workers, pharmacists, and patients. The optimal number of participants per focus group, to generate rich discussion while enabling all members to speak, is 8 to 10 people. 3 Examples of questions include: How would residents, nurses, and pharmacists redesign or improve IPRs to maximize engagement, participation, and use of time? How do suggestions compare across professional groups?
  • 4 Observations are used to gather information in situ using the senses: vision, hearing, touch, and smell. Observations allow us to investigate and document what people do —their everyday behavior—and to try to understand why they do it, rather than focus on their own perceptions or recollections. Observations are ideal when used to document, explore, and understand, as they occur, activities, actions, relationships, culture, or taken-for-granted ways of doing things. As with the previous methods, the number of observations required will depend on the research question and overarching research approach used. Examples of research questions include: How do residents use their time during IPRs? How do they relate to other health care providers? What kind of language and body language are used to describe patients and their families during IPRs?
  • 5 Textual or content analysis is ideal when used to investigate changes in official, institutional, or organizational views on a specific topic or area to document the context of certain practices or to investigate the experiences and perspectives of a group of individuals who have, for example, engaged in written reflection. Textual analysis can be used as the main method in a research project or to contextualize findings from another method. The choice and number of documents has to be guided by the research question, but can include newspaper or research articles, governmental reports, organization policies and protocols, letters, records, films, photographs, art, meeting notes, or checklists. The development of a coding grid or scheme for analysis will be guided by the research question and will be iteratively applied to selected documents. Examples of research questions include: How do our local policies and protocols for IPRs reflect or contrast with the broader discourses of interprofessional collaboration? What are the perceived successful features of IPRs in the literature? What are the key features of residents' reflections on their interprofessional experiences during IPRs?

How You Can Start TODAY

  • Review medical education journals to find qualitative research in your area of interest and focus on the methods used as well as the findings.
  • When you have chosen a method, read several different sources on it.
  • From your readings, identify potential colleagues with expertise in your choice of qualitative method as well as others in your discipline who would like to learn more and organize potential working groups to discuss challenges that arise in your work.

What You Can Do LONG TERM

  • Either locally or nationally, build a community of like-minded scholars to expand your qualitative expertise.
  • Use a range of methods to develop a broad program of qualitative research.

Chapter 5: Collecting data

Tianjing Li, Julian PT Higgins, Jonathan J Deeks

Key Points:

  • Systematic reviews have studies, rather than reports, as the unit of interest, and so multiple reports of the same study need to be identified and linked together before or after data extraction.
  • Because of the increasing availability of data sources (e.g. trials registers, regulatory documents, clinical study reports), review authors should decide on which sources may contain the most useful information for the review, and have a plan to resolve discrepancies if information is inconsistent across sources.
  • Review authors are encouraged to develop outlines of tables and figures that will appear in the review to facilitate the design of data collection forms. The key to successful data collection is to construct easy-to-use forms and collect sufficient and unambiguous data that faithfully represent the source in a structured and organized manner.
  • Effort should be made to identify data needed for meta-analyses, which often need to be calculated or converted from data reported in diverse formats.
  • Data should be collected and archived in a form that allows future access and data sharing.
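As an illustration of converting reported data into the form a meta-analysis needs, the sketch below applies two standard conversions for a single group's mean: a standard deviation from a standard error, and a standard error from a 95% confidence interval. The second conversion assumes a large sample, so that the interval spans 3.92 standard errors. The function names are hypothetical.

```python
from math import sqrt


def sd_from_se(standard_error, n):
    """Standard deviation of a group from the standard error of its mean."""
    if n <= 0:
        raise ValueError("n must be positive")
    return standard_error * sqrt(n)


def se_from_ci95(lower, upper):
    """Standard error from a 95% confidence interval (large-sample normal)."""
    if upper < lower:
        raise ValueError("upper must not be smaller than lower")
    return (upper - lower) / 3.92  # 3.92 = 2 * 1.96
```

For small samples the interval is usually based on a t distribution, so the divisor 3.92 would need to be replaced by twice the appropriate t value.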

Cite this chapter as: Li T, Higgins JPT, Deeks JJ (editors). Chapter 5: Collecting data. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook .

5.1 Introduction

Systematic reviews aim to identify all studies that are relevant to their research questions and to synthesize data about the design, risk of bias, and results of those studies. Consequently, the findings of a systematic review depend critically on decisions relating to which data from these studies are presented and analysed. Data collected for systematic reviews should be accurate, complete, and accessible for future updates of the review and for data sharing. Methods used for these decisions must be transparent; they should be chosen to minimize biases and human error. Here we describe approaches that should be used in systematic reviews for collecting data, including extraction of data directly from journal articles and other reports of studies.

5.2 Sources of data

Studies are reported in a range of sources which are detailed later. As discussed in Section 5.2.1, it is important to link together multiple reports of the same study. The relative strengths and weaknesses of each type of source are discussed in Section 5.2.2. For guidance on searching for and selecting reports of studies, refer to Chapter 4.

Journal articles are the source of the majority of data included in systematic reviews. Note that a study can be reported in multiple journal articles, each focusing on some aspect of the study (e.g. design, main results, and other results).

Conference abstracts are commonly available. However, the information presented in conference abstracts is highly variable in reliability, accuracy, and level of detail (Li et al 2017).

Errata and letters can be important sources of information about studies, including critical weaknesses and retractions, and review authors should examine these if they are identified (see MECIR Box 5.2.a).

Trials registers (e.g. ClinicalTrials.gov) catalogue trials that have been planned or started, and have become an important data source for identifying trials, for comparing published outcomes and results with those planned, and for obtaining efficacy and safety data that are not available elsewhere (Ross et al 2009, Jones et al 2015, Baudard et al 2017).

Clinical study reports (CSRs) contain unabridged and comprehensive descriptions of the clinical problem, design, conduct and results of clinical trials, following a structure and content guidance prescribed by the International Conference on Harmonisation (ICH 1995). To obtain marketing approval of drugs and biologics for a specific indication, pharmaceutical companies submit CSRs and other required materials to regulatory authorities. Because CSRs also incorporate tables and figures, with appendices containing the protocol, statistical analysis plan, sample case report forms, and patient data listings (including narratives of all serious adverse events), they can be thousands of pages in length. CSRs often contain more data about trial methods and results than any other single data source (Mayo-Wilson et al 2018). CSRs are often difficult to access, and are usually not publicly available. Review authors could request CSRs from the European Medicines Agency (Davis and Miller 2017). The US Food and Drug Administration had historically avoided releasing CSRs but launched a pilot programme in 2018 whereby selected portions of CSRs for new drug applications were posted on the agency’s website. Many CSRs are obtained through unsealed litigation documents, repositories (e.g. clinicalstudydatarequest.com), and other open data and data-sharing channels (e.g. The Yale University Open Data Access Project) (Doshi et al 2013, Wieland et al 2014, Mayo-Wilson et al 2018).

Regulatory reviews such as those available from the US Food and Drug Administration or European Medicines Agency provide useful information about trials of drugs, biologics, and medical devices submitted by manufacturers for marketing approval (Turner 2013). These documents are summaries of CSRs and related documents, prepared by agency staff as part of the process of approving the products for marketing, after reanalysing the original trial data. Regulatory reviews often are available only for the first approved use of an intervention and not for later applications (although review authors may request those documents, which are usually brief). Using regulatory reviews from the US Food and Drug Administration as an example, drug approval packages are available on the agency’s website for drugs approved since 1997 (Turner 2013); for drugs approved before 1997, information must be requested through a freedom of information request. The drug approval packages contain various documents: approval letter(s), medical review(s), chemistry review(s), clinical pharmacology review(s), and statistical review(s).

Individual participant data (IPD) are usually sought directly from the researchers responsible for the study, or may be identified from open data repositories (e.g. www.clinicalstudydatarequest.com ). These data typically include variables that represent the characteristics of each participant, intervention (or exposure) group, prognostic factors, and measurements of outcomes (Stewart et al 2015). Access to IPD has the advantage of allowing review authors to reanalyse the data flexibly, in accordance with the preferred analysis methods outlined in the protocol, and can reduce the variation in analysis methods across studies included in the review. IPD reviews are addressed in detail in Chapter 26 .

MECIR Box 5.2.a Relevant expectations for conduct of intervention reviews

5.2.1 Studies (not reports) as the unit of interest

In a systematic review, studies rather than reports of studies are the principal unit of interest. Since a study may have been reported in several sources, a comprehensive search for studies for the review may identify many reports from a potentially relevant study (Mayo-Wilson et al 2017a, Mayo-Wilson et al 2018). Conversely, a report may describe more than one study.

Multiple reports of the same study should be linked together (see MECIR Box 5.2.b ). Some authors prefer to link reports before they collect data, and collect data from across the reports onto a single form. Other authors prefer to collect data from each report and then link together the collected data across reports. Either strategy may be appropriate, depending on the nature of the reports at hand. It may not be clear that two reports relate to the same study until data collection has commenced. Although sometimes there is a single report for each study, it should never be assumed that this is the case.

MECIR Box 5.2.b Relevant expectations for conduct of intervention reviews

It can be difficult to link multiple reports from the same study, and review authors may need to do some ‘detective work’. Multiple sources about the same trial may not reference each other, may not share common authors (Gøtzsche 1989, Tramèr et al 1997), or may report discrepant information about the study design, characteristics, outcomes, and results (von Elm et al 2004, Mayo-Wilson et al 2017a).

Some of the most useful criteria for linking reports are:

  • trial registration numbers;
  • authors’ names;
  • sponsor for the study and sponsor identifiers (e.g. grant or contract numbers);
  • location and setting (particularly if institutions, such as hospitals, are named);
  • specific details of the interventions (e.g. dose, frequency);
  • numbers of participants and baseline data; and
  • date and duration of the study (which also can clarify whether different sample sizes are due to different periods of recruitment), length of follow-up, or subgroups selected to address secondary goals.

Review authors should use as many trial characteristics as possible to link multiple reports. When uncertainties remain after considering these and other factors, it may be necessary to correspond with the study authors or sponsors for confirmation.
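As a rough illustration of the first criterion in the list above, the sketch below groups report records by trial registration number and sets aside reports that lack one for manual checking against the other criteria. The record layout, the field names (`report_id`, `registration`) and the example identifiers are all hypothetical; a real review would combine several criteria rather than rely on registration numbers alone.

```python
from collections import defaultdict


def link_reports(reports):
    """Group reports into candidate studies by trial registration number.

    Reports without a registration number are returned separately so that
    they can be linked by hand using authors, sponsor, setting, and so on.
    """
    studies = defaultdict(list)
    unlinked = []
    for report in reports:
        # Normalize case and whitespace so trivially different spellings match.
        registration = (report.get("registration") or "").strip().upper()
        if registration:
            studies[registration].append(report["report_id"])
        else:
            unlinked.append(report["report_id"])
    return dict(studies), unlinked


# Hypothetical reports; the registration numbers are placeholders.
reports = [
    {"report_id": "journal-article-1", "registration": "NCT00000001"},
    {"report_id": "conference-abstract-1", "registration": "nct00000001 "},
    {"report_id": "journal-article-2", "registration": "ISRCTN00000002"},
    {"report_id": "letter-1", "registration": None},
]

studies, unlinked = link_reports(reports)
print(studies)
print(unlinked)
```

Reports that end up in `unlinked` are the ones that still need the ‘detective work’ described earlier, and correspondence with study authors where uncertainty remains.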

5.2.2 Determining which sources might be most useful

A comprehensive search to identify all eligible studies from all possible sources is resource-intensive but necessary for a high-quality systematic review (see Chapter 4 ). Because some data sources are more useful than others (Mayo-Wilson et al 2018), review authors should consider which data sources may be available and which may contain the most useful information for the review. These considerations should be described in the protocol. Table 5.2.a summarizes the strengths and limitations of different data sources (Mayo-Wilson et al 2018). Gaining access to CSRs and IPD often takes a long time. Review authors should begin searching repositories and contact trial investigators and sponsors as early as possible to negotiate data usage agreements (Mayo-Wilson et al 2015, Mayo-Wilson et al 2018).

Table 5.2.a Strengths and limitations of different data sources for systematic reviews

5.2.3 Correspondence with investigators

Review authors often find that they are unable to obtain all the information they seek from available reports about the details of the study design, the full range of outcomes measured and the numerical results. In such circumstances, authors are strongly encouraged to contact the original investigators (see MECIR Box 5.2.c ). Contact details of study authors, when not available from the study reports, often can be obtained from more recent publications, from university or institutional staff listings, from membership directories of professional societies, or by a general search of the web. If the contact author named in the study report cannot be contacted or does not respond, it is worthwhile attempting to contact other authors.

Review authors should consider the nature of the information they require and make their request accordingly. For descriptive information about the conduct of the trial, it may be most appropriate to ask open-ended questions (e.g. how was the allocation process conducted, or how were missing data handled?). If specific numerical data are required, it may be more helpful to request them specifically, possibly providing a short data collection form (either uncompleted or partially completed). If IPD are required, they should be specifically requested (see also Chapter 26 ). In some cases, study investigators may find it more convenient to provide IPD rather than conduct additional analyses to obtain the specific statistics requested.

MECIR Box 5.2.c Relevant expectations for conduct of intervention reviews

5.3 What data to collect

5.3.1 What are data?

For the purposes of this chapter, we define ‘data’ to be any information about (or derived from) a study, including details of methods, participants, setting, context, interventions, outcomes, results, publications, and investigators. Review authors should plan in advance what data will be required for their systematic review, and develop a strategy for obtaining them (see MECIR Box 5.3.a ). The involvement of consumers and other stakeholders can be helpful in ensuring that the categories of data collected are sufficiently aligned with the needs of review users ( Chapter 1, Section 1.3 ). The data to be sought should be described in the protocol, with consideration wherever possible of the issues raised in the rest of this chapter.

The data collected for a review should adequately describe the included studies, support the construction of tables and figures, facilitate the risk of bias assessment, and enable syntheses and meta-analyses. Review authors should familiarize themselves with reporting guidelines for systematic reviews (see online Chapter III and the PRISMA statement (Liberati et al 2009)) to ensure that relevant elements and sections are incorporated. The following sections review the types of information that should be sought, and these are summarized in Table 5.3.a (Li et al 2015).

MECIR Box 5.3.a Relevant expectations for conduct of intervention reviews

Table 5.3.a Checklist of items to consider in data collection

*Full description required for assessments of risk of bias (see Chapter 8 , Chapter 23 and Chapter 25 ).

5.3.2 Study methods and potential sources of bias

Different research methods can influence study outcomes by introducing different biases into results. Important study design characteristics should be collected to allow the selection of appropriate methods for assessment and analysis, and to enable description of the design of each included study in a table of ‘Characteristics of included studies’, including whether the study is randomized, whether the study has a cluster or crossover design, and the duration of the study. If the review includes non-randomized studies, appropriate features of the studies should be described (see Chapter 24 ).

Detailed information should be collected to facilitate assessment of the risk of bias in each included study. Risk-of-bias assessment should be conducted using the tool most appropriate for the design of each study, and the information required to complete the assessment will depend on the tool. Randomized studies should be assessed using the tool described in Chapter 8 . The tool covers bias arising from the randomization process, due to deviations from intended interventions, due to missing outcome data, in measurement of the outcome, and in selection of the reported result. For each item in the tool, a description of what happened in the study is required, which may include verbatim quotes from study reports. Information for assessment of bias due to missing outcome data and selection of the reported result may be most conveniently collected alongside information on outcomes and results. Chapter 7 (Section 7.3.1) discusses some issues in the collection of information for assessments of risk of bias. For non-randomized studies, the most appropriate tool is described in Chapter 25 . A separate tool also covers bias due to missing results in meta-analysis (see Chapter 13 ).

A particularly important piece of information is the funding source of the study and potential conflicts of interest of the study authors.

Some review authors will wish to collect additional information on study characteristics that bear on the quality of the study’s conduct but that may not lead directly to risk of bias, such as whether ethical approval was obtained and whether a sample size calculation was performed a priori.

5.3.3 Participants and setting

Details of participants are collected to enable an understanding of the comparability of, and differences between, the participants within and between included studies, and to allow assessment of how directly or completely the participants in the included studies reflect the original review question.

Typically, aspects that should be collected are those that could (or are believed to) affect presence or magnitude of an intervention effect and those that could help review users assess applicability to populations beyond the review. For example, if the review authors suspect important differences in intervention effect between different socio-economic groups, this information should be collected. If intervention effects are thought constant over such groups, and if such information would not be useful to help apply results, it should not be collected. Participant characteristics that are often useful for assessing applicability include age and sex. Summary information about these should always be collected unless they are obvious from the context. These characteristics are likely to be presented in different formats (e.g. ages as means or medians, with standard deviations or ranges; sex as percentages or counts for the whole study or for each intervention group separately). Review authors should seek consistent quantities where possible, and decide whether it is more relevant to summarize characteristics for the study as a whole or by intervention group. It may not be possible to select the most consistent statistics until data collection is complete across all or most included studies. Other characteristics that are sometimes important include ethnicity, socio-demographic details (e.g. education level) and the presence of comorbid conditions. Clinical characteristics relevant to the review question (e.g. glucose level for reviews on diabetes) also are important for understanding the severity or stage of the disease.

Diagnostic criteria that were used to define the condition of interest can be a particularly important source of diversity across studies and should be collected. For example, in a review of drug therapy for congestive heart failure, it is important to know how the definition and severity of heart failure was determined in each study (e.g. systolic or diastolic dysfunction, severe systolic dysfunction with ejection fractions below 20%). Similarly, in a review of antihypertensive therapy, it is important to describe baseline levels of blood pressure of participants.

If the settings of studies may influence intervention effects or applicability, then information on these should be collected. Typical settings of healthcare intervention studies include acute care hospitals, emergency facilities, general practice, and extended care facilities such as nursing homes, offices, schools, and communities. Sometimes studies are conducted in different geographical regions with important differences that could affect delivery of an intervention and its outcomes, such as cultural characteristics, economic context, or rural versus city settings. Timing of the study may be associated with important technology differences or trends over time. If such information is important for the interpretation of the review, it should be collected.

Important characteristics of the participants in each included study should be summarized for the reader in the table of ‘Characteristics of included studies’.

5.3.4 Interventions

Details of all experimental and comparator interventions of relevance to the review should be collected. Again, details are required for aspects that could affect the presence or magnitude of an effect or that could help review users assess applicability to their own circumstances. Where feasible, information should be sought (and presented in the review) that is sufficient for replication of the interventions under study. This includes any co-interventions administered as part of the study, and applies similarly to comparators such as ‘usual care’. Review authors may need to request missing information from study authors.

The Template for Intervention Description and Replication (TIDieR) provides a comprehensive framework for full description of interventions and has been proposed for use in systematic reviews as well as reports of primary studies (Hoffmann et al 2014). The checklist includes descriptions of:

  • the rationale for the intervention and how it is expected to work;
  • any documentation that instructs the recipient on the intervention;
  • what the providers do to deliver the intervention (procedures and processes);
  • who provides the intervention (including their skill level), how (e.g. face to face, web-based) and in what setting (e.g. home, school, or hospital);
  • the timing and intensity;
  • whether any variation is permitted or expected, and whether modifications were actually made; and
  • any strategies used to ensure or assess fidelity or adherence to the intervention, and the extent to which the intervention was delivered as planned.

For clinical trials of pharmacological interventions, key information to collect will often include routes of delivery (e.g. oral or intravenous delivery), doses (e.g. amount or intensity of each treatment, frequency of delivery), timing (e.g. within 24 hours of diagnosis), and length of treatment. For other interventions, such as those that evaluate psychotherapy, behavioural and educational approaches, or healthcare delivery strategies, the amount of information required to characterize the intervention will typically be greater, including information about multiple elements of the intervention, who delivered it, and the format and timing of delivery. Chapter 17 provides further information on how to manage intervention complexity, and how the intervention Complexity Assessment Tool (iCAT) can facilitate data collection (Lewin et al 2017).

Important characteristics of the interventions in each included study should be summarized for the reader in the table of ‘Characteristics of included studies’. Additional tables or diagrams such as logic models ( Chapter 2, Section 2.5.1 ) can assist descriptions of multi-component interventions so that review users can better assess review applicability to their context.

5.3.4.1 Integrity of interventions

The degree to which specified procedures or components of the intervention are implemented as planned can have important consequences for the findings from a study. We describe this as intervention integrity ; related terms include adherence, compliance and fidelity (Carroll et al 2007). The verification of intervention integrity may be particularly important in reviews of non-pharmacological trials such as behavioural interventions and complex interventions, which are often implemented in conditions that present numerous obstacles to idealized delivery.

It is generally expected that reports of randomized trials provide detailed accounts of intervention implementation (Zwarenstein et al 2008, Moher et al 2010). In assessing whether interventions were implemented as planned, review authors should bear in mind that some interventions are standardized (with no deviations permitted in the intervention protocol), whereas others explicitly allow a degree of tailoring (Zwarenstein et al 2008). In addition, the growing field of implementation science has led to an increased awareness of the impact of setting and context on delivery of interventions (Damschroder et al 2009). (See Chapter 17, Section 17.1.2.1 for further information and discussion about how an intervention may be tailored to local conditions in order to preserve its integrity.)

Information about integrity can help determine whether unpromising results are due to a poorly conceptualized intervention or to an incomplete delivery of the prescribed components. It can also reveal important information about the feasibility of implementing a given intervention in real life settings. If it is difficult to achieve full implementation in practice, the intervention will have low feasibility (Dusenbury et al 2003).

Whether a lack of intervention integrity leads to a risk of bias in the estimate of its effect depends on whether review authors and users are interested in the effect of assignment to intervention or the effect of adhering to intervention, as discussed in more detail in Chapter 8, Section 8.2.2 . Assessment of deviations from intended interventions is important for assessing risk of bias in the latter, but not the former (see Chapter 8, Section 8.4 ), but both may be of interest to decision makers in different ways.

An example of a Cochrane Review evaluating intervention integrity is provided by a review of smoking cessation in pregnancy (Chamberlain et al 2017). The authors found that process evaluation of the intervention occurred in only some trials and that the implementation was less than ideal in others, including some of the largest trials. The review highlighted how the transfer of an intervention from one setting to another may reduce its effectiveness when elements are changed, or aspects of the materials are culturally inappropriate.

5.3.4.2 Process evaluations

Process evaluations seek to evaluate the process (and mechanisms) between the intervention’s intended implementation and the actual effect on the outcome (Moore et al 2015). Process evaluation studies are characterized by a flexible approach to data collection and the use of numerous methods to generate a range of different types of data, encompassing both quantitative and qualitative methods. Guidance for including process evaluations in systematic reviews is provided in Chapter 21 . When it is considered important, review authors should aim to collect information on whether the trial accounted for, or measured, key process factors and whether the trials that thoroughly addressed integrity showed a greater impact. Process evaluations can be a useful source of factors that potentially influence the effectiveness of an intervention.

5.3.5 Outcomes

An outcome is an event or a measurement value observed or recorded for a particular person or intervention unit in a study during or following an intervention, and that is used to assess the efficacy and safety of the studied intervention (Meinert 2012). Review authors should indicate in advance whether they plan to collect information about all outcomes measured in a study or only those outcomes of (pre-specified) interest in the review. Research has shown that trials addressing the same condition and intervention seldom agree on which outcomes are the most important, and consequently report on numerous different outcomes (Dwan et al 2014, Ismail et al 2014, Denniston et al 2015, Saldanha et al 2017a). The selection of outcomes across systematic reviews of the same condition is also inconsistent (Page et al 2014, Saldanha et al 2014, Saldanha et al 2016, Liu et al 2017). Outcomes used in trials and in systematic reviews of the same condition have limited overlap (Saldanha et al 2017a, Saldanha et al 2017b).

We recommend that only the outcomes defined in the protocol be described in detail. However, a complete list of the names of all outcomes measured may allow a more detailed assessment of the risk of bias due to missing outcome data (see Chapter 13 ).

Review authors should collect all five elements of an outcome (Zarin et al 2011, Saldanha et al 2014):

1. outcome domain or title (e.g. anxiety);

2. measurement tool or instrument (including definition of clinical outcomes or endpoints); for a scale, name of the scale (e.g. the Hamilton Anxiety Rating Scale), upper and lower limits, and whether a high or low score is favourable, definitions of any thresholds if appropriate;

3. specific metric used to characterize each participant’s results (e.g. post-intervention anxiety, or change in anxiety from baseline to a post-intervention time point, or post-intervention presence of anxiety (yes/no));

4. method of aggregation (e.g. mean and standard deviation of anxiety scores in each group, or proportion of people with anxiety);

5. timing of outcome measurements (e.g. assessments at end of eight-week intervention period, events occurring during eight-week intervention period).
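The five elements listed above can be captured as one structured record per outcome, which makes it easy to spot an incompletely specified outcome during extraction. The following is a minimal sketch only: the class name `OutcomeSpec`, its field names and the example values are illustrative assumptions, not part of any established tool.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OutcomeSpec:
    """One outcome described by its five elements (hypothetical layout)."""
    domain: str            # 1. outcome domain or title
    measurement_tool: str  # 2. measurement tool or instrument
    metric: str            # 3. specific metric for each participant's result
    aggregation: str       # 4. method of aggregation
    timing: str            # 5. timing of outcome measurement

    def missing_elements(self):
        """Return the names of elements that were left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


anxiety = OutcomeSpec(
    domain="anxiety",
    measurement_tool="Hamilton Anxiety Rating Scale",
    metric="change from baseline",
    aggregation="mean and standard deviation in each group",
    timing="",  # not yet extracted from the report
)

print(anxiety.missing_elements())
```

A record with a non-empty `missing_elements()` list signals that the report should be rechecked, or the study authors contacted, before the outcome is used in a synthesis.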

Further considerations for economics outcomes are discussed in Chapter 20 , and for patient-reported outcomes in Chapter 18 .

5.3.5.1 Adverse effects

Collection of information about the harmful effects of an intervention can pose particular difficulties, discussed in detail in Chapter 19 . These outcomes may be described using multiple terms, including ‘adverse event’, ‘adverse effect’, ‘adverse drug reaction’, ‘side effect’ and ‘complication’. Many of these terminologies are used interchangeably in the literature, although some are technically different. Harms might additionally be interpreted to include undesirable changes in other outcomes measured during a study, such as a decrease in quality of life where an improvement may have been anticipated.

In clinical trials, adverse events can be collected either systematically or non-systematically. Systematic collection refers to collecting adverse events in the same manner for each participant using defined methods such as a questionnaire or a laboratory test. For systematically collected outcomes representing harm, data can be collected by review authors in the same way as efficacy outcomes (see Section 5.3.5 ).

Non-systematic collection refers to collection of information on adverse events using methods such as open-ended questions (e.g. ‘Have you noticed any symptoms since your last visit?’), or reported by participants spontaneously. In either case, adverse events may be selectively reported based on their severity, and whether the participant suspected that the effect may have been caused by the intervention, which could lead to bias in the available data. Unfortunately, most adverse events are collected non-systematically rather than systematically, creating a challenge for review authors. The following pieces of information are useful and worth collecting (Nicole Fusco, personal communication):

  • any coding system or standard medical terminology used (e.g. COSTART, MedDRA), including version number;
  • name of the adverse events (e.g. dizziness);
  • reported intensity of the adverse event (e.g. mild, moderate, severe);
  • whether the trial investigators categorized the adverse event as ‘serious’;
  • whether the trial investigators identified the adverse event as being related to the intervention;
  • time point (most commonly measured as a count over the duration of the study);
  • any reported methods for how adverse events were selected for inclusion in the publication (e.g. ‘We reported all adverse events that occurred in at least 5% of participants’); and
  • associated results.

Different collection methods lead to very different accounting of adverse events (Safer 2002, Bent et al 2006, Ioannidis et al 2006, Carvajal et al 2011, Allen et al 2013). Non-systematic collection methods tend to underestimate how frequently an adverse event occurs. It is particularly problematic when the adverse event of interest to the review is collected systematically in some studies but non-systematically in other studies. Different collection methods introduce an important source of heterogeneity. In addition, when non-systematic adverse events are reported based on quantitative selection criteria (e.g. only adverse events that occurred in at least 5% of participants were included in the publication), use of reported data alone may bias the results of meta-analyses. Review authors should be cautious of (or refrain from) synthesizing adverse events that are collected differently.
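One way to act on this caution during data management is to flag any adverse event that was collected systematically in some studies and non-systematically in others, before any synthesis is attempted. The sketch below assumes a hypothetical extraction table with one row per study and adverse event; the field names and study labels are invented for illustration.

```python
from collections import defaultdict


def mixed_collection_events(records):
    """Return adverse events that were collected by more than one method."""
    methods_by_event = defaultdict(set)
    for record in records:
        methods_by_event[record["event"]].add(record["collection"])
    return sorted(
        event for event, methods in methods_by_event.items() if len(methods) > 1
    )


# Hypothetical extraction rows.
records = [
    {"study": "Study A", "event": "dizziness", "collection": "systematic"},
    {"study": "Study B", "event": "dizziness", "collection": "non-systematic"},
    {"study": "Study A", "event": "nausea", "collection": "systematic"},
    {"study": "Study C", "event": "nausea", "collection": "systematic"},
]

print(mixed_collection_events(records))
```

Events returned by this check are candidates for separate presentation by collection method rather than a single pooled estimate.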

Regardless of the collection methods, precise definitions of adverse effect outcomes and their intensity should be recorded, since they may vary between studies. For example, in a review of aspirin and gastrointestinal haemorrhage, some trials simply reported gastrointestinal bleeds, while others reported specific categories of bleeding, such as haematemesis, melaena, and proctorrhagia (Derry and Loke 2000). The definition and reporting of severity of the haemorrhages (e.g. major, severe, requiring hospital admission) also varied considerably among the trials (Zanchetti and Hansson 1999). Moreover, a particular adverse effect may be described or measured in different ways among the studies. For example, the terms ‘tiredness’, ‘fatigue’ or ‘lethargy’ may all be used in reporting of adverse effects. Study authors also may use different thresholds for ‘abnormal’ results (e.g. hypokalaemia diagnosed at a serum potassium concentration of 3.0 mmol/L or 3.5 mmol/L).

No mention of adverse events in trial reports does not necessarily mean that no adverse events occurred. It is usually safest to assume that they were not reported. Quality of life measures are sometimes used as a measure of the participants’ experience during the study, but these are usually general measures that do not look specifically at particular adverse effects of the intervention. While quality of life measures are important and can be used to gauge overall participant well-being, they should not be regarded as substitutes for a detailed evaluation of safety and tolerability.

5.3.6 Results

Results data arise from the measurement or ascertainment of outcomes for individual participants in an intervention study. Results data may be available for each individual in a study (i.e. individual participant data; see Chapter 26 ), or summarized at arm level, or summarized at study level into an intervention effect by comparing two intervention arms. Results data should be collected only for the intervention groups and outcomes specified to be of interest in the protocol (see MECIR Box 5.3.b ). Results for other outcomes should not be collected unless the protocol is modified to add them. Any modification should be reported in the review. However, review authors should be alert to the possibility of important, unexpected findings, particularly serious adverse effects.

MECIR Box 5.3.b Relevant expectations for conduct of intervention reviews

Reports of studies often include several results for the same outcome. For example, different measurement scales might be used, results may be presented separately for different subgroups, and outcomes may have been measured at different follow-up time points. Variation in the results can be very large, depending on which data are selected (Gøtzsche et al 2007, Mayo-Wilson et al 2017a). Review protocols should be as specific as possible about which outcome domains, measurement tools, time points, and summary statistics (e.g. final values versus change from baseline) are to be collected (Mayo-Wilson et al 2017b). A framework should be pre-specified in the protocol to facilitate making choices between multiple eligible measures or results. For example, a hierarchy of preferred measures might be created, or plans articulated to select the result with the median effect size, or to average across all eligible results for a particular outcome domain (see also Chapter 9, Section 9.3.3 ). Any additional decisions or changes to this framework made once the data are collected should be reported in the review as changes to the protocol.
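A pre-specified hierarchy of preferred measures can be written down as an ordered list and applied mechanically to each study. The sketch below shows one possible way to do this; the ordering of tools, the abbreviations and the field names are hypothetical and would come from the review protocol in practice.

```python
def select_result(results, preferred_tools):
    """Pick the result measured with the highest-ranked tool in the hierarchy.

    Returns None when no reported result uses a tool named in the hierarchy.
    """
    rank = {tool: position for position, tool in enumerate(preferred_tools)}
    eligible = [result for result in results if result["tool"] in rank]
    if not eligible:
        return None
    return min(eligible, key=lambda result: rank[result["tool"]])


# Hypothetical protocol hierarchy for an anxiety outcome, most preferred first.
hierarchy = ["HAM-A", "BAI", "STAI"]

# Hypothetical results reported by one study for the same outcome domain.
study_results = [
    {"tool": "STAI", "mean_difference": -3.1},
    {"tool": "BAI", "mean_difference": -2.4},
]

chosen = select_result(study_results, hierarchy)
print(chosen)
```

Fixing the hierarchy before data collection means the choice cannot be influenced by which result looks most favourable, and any later change to it can be reported as a change to the protocol.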

Section 5.6 describes the numbers that will be required to perform meta-analysis, if appropriate. The unit of analysis (e.g. participant, cluster, body part, treatment period) should be recorded for each result when it is not obvious (see Chapter 6, Section 6.2). The type of outcome data determines the nature of the numbers that will be sought for each outcome. For example, for a dichotomous (‘yes’ or ‘no’) outcome, the number of participants and the number who experienced the outcome will be sought for each group. It is important to collect the sample size relevant to each result, although this is not always obvious. A flow diagram as recommended in the CONSORT Statement (Moher et al 2001) can help to determine the flow of participants through a study. If one is not available in a published report, review authors can consider drawing one (a template is available from www.consort-statement.org).
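For a dichotomous outcome, the two numbers sought for each group can be checked for internal consistency at the point of data entry. The following sketch validates the counts and computes the observed risk in each arm; the function name, arm labels and counts are made up for illustration.

```python
def arm_risk(events, participants):
    """Return the observed risk in one arm after basic consistency checks."""
    if participants <= 0:
        raise ValueError("number of participants must be positive")
    if not 0 <= events <= participants:
        raise ValueError("events must lie between 0 and the number of participants")
    return events / participants


# Hypothetical extracted counts for one dichotomous outcome.
arms = {
    "intervention": {"events": 12, "participants": 100},
    "comparator": {"events": 20, "participants": 100},
}

risks = {name: arm_risk(**counts) for name, counts in arms.items()}
print(risks)
```

A `ValueError` here usually points to a transcription slip or to a sample size taken from the wrong stage of the participant flow.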

The numbers required for meta-analysis are not always available. Often, other statistics can be collected and converted into the required format. For example, for a continuous outcome, it is usually most convenient to seek the number of participants, the mean and the standard deviation for each intervention group. These are often not available directly, especially the standard deviation. Alternative statistics enable calculation or estimation of the missing standard deviation (such as a standard error, a confidence interval, a test statistic (e.g. from a t-test or F-test) or a P value). These should be extracted if they provide potentially useful information (see MECIR Box 5.3.c ). Details of recalculation are provided in Section 5.6 . Further considerations for dealing with missing data are discussed in Chapter 10, Section 10.12 .
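Two of these conversions are simple enough to sketch here. The code assumes that the standard error or confidence interval refers to the mean of a single intervention group, that the interval is a 95% interval, and that the sample is large enough for the normal approximation (multiplier 1.96); for small samples a value from the t distribution would be needed instead. The function names and the numbers in the example are hypothetical.

```python
import math


def sd_from_se(se, n):
    """Standard deviation from the standard error of a single group mean."""
    return se * math.sqrt(n)


def sd_from_ci95(lower, upper, n):
    """Standard deviation from a 95% CI for a single group mean.

    Uses the large-sample normal approximation, where the CI width
    equals 2 * 1.96 standard errors.
    """
    se = (upper - lower) / (2 * 1.96)
    return sd_from_se(se, n)


# Hypothetical group of 100 participants.
print(sd_from_se(0.5, 100))
print(sd_from_ci95(9.02, 10.98, 100))
```

Both calls describe the same hypothetical group, so they should agree up to floating-point rounding; recording which conversion was used for each study keeps the provenance of the derived standard deviation clear.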

MECIR Box 5.3.c Relevant expectations for conduct of intervention reviews

5.3.7 Other information to collect

We recommend that review authors collect the key conclusions of the included study as reported by its authors. It is not necessary to report these conclusions in the review, but they should be used to verify the results of analyses undertaken by the review authors, particularly in relation to the direction of effect. Further comments by the study authors, for example any explanations they provide for unexpected findings, may be noted. References to other studies that are cited in the study report may be useful, although review authors should be aware of the possibility of citation bias (see Chapter 7, Section 7.2.3.2 ). Documentation of any correspondence with the study authors is important for review transparency.

5.4 Data collection tools

5.4.1 Rationale for data collection forms

Data collection for systematic reviews should be performed using structured data collection forms (see MECIR Box 5.4.a). These can be paper forms, electronic forms (e.g. Google Forms), or commercially or custom-built data systems (e.g. Covidence, EPPI-Reviewer, Systematic Review Data Repository (SRDR)) that allow online form building, data entry by several users, data sharing, and efficient data management (Li et al 2015). Whichever means of data collection is chosen, a data collection form is required.

MECIR Box 5.4.a Relevant expectations for conduct of intervention reviews

The data collection form is a bridge between what is reported by the original investigators (e.g. in journal articles, abstracts, personal correspondence) and what is ultimately reported by the review authors. The data collection form serves several important functions (Meade and Richardson 1997). First, the form is linked directly to the review question and criteria for assessing eligibility of studies, and provides a clear summary of these that can be used to identify and structure the data to be extracted from study reports. Second, the data collection form is the historical record of the provenance of the data used in the review, as well as the multitude of decisions (and changes to decisions) that occur throughout the review process. Third, the form is the source of data for inclusion in an analysis.

Given the important functions of data collection forms, ample time and thought should be invested in their design. Because each review is different, data collection forms will vary across reviews. However, there are many similarities in the types of information that are important. Thus, forms can be adapted from one review to the next. Although we use the term ‘data collection form’ in the singular, in practice it may be a series of forms used for different purposes: for example, a separate form could be used to assess the eligibility of studies for inclusion in the review to assist in the quick identification of studies to be excluded from or included in the review.

5.4.2 Considerations in selecting data collection tools

The choice of data collection tool is largely dependent on review authors’ preferences, the size of the review, and resources available to the author team. Potential advantages and considerations of selecting one data collection tool over another are outlined in Table 5.4.a (Li et al 2015). A significant advantage that data systems have is in data management ( Chapter 1, Section 1.6 ) and re-use. They make review updates more efficient, and also facilitate methodological research across reviews. Numerous ‘meta-epidemiological’ studies have been carried out using Cochrane Review data, resulting in methodological advances which would not have been possible if thousands of studies had not all been described using the same data structures in the same system.

Some data collection tools and formats, such as Covidence and CSV files (e.g. exported from Excel), allow extracted data to be imported automatically into RevMan (Cochrane’s authoring tool). Details are available at https://documentation.cochrane.org/revman-kb/populate-study-data-260702462.html

Table 5.4.a Considerations in selecting data collection tools

5.4.3 Design of a data collection form

Regardless of whether data are collected using a paper or electronic form, or a data system, the key to successful data collection is to construct easy-to-use forms and collect sufficient and unambiguous data that faithfully represent the source in a structured and organized manner (Li et al 2015). In most cases, a document format should be developed for the form before building an electronic form or a data system. This can be distributed to others, including programmers and data analysts, and used as a guide for creating an electronic form and any guidance or codebook to be used by data extractors. Review authors also should consider compatibility of any electronic form or data system with analytical software, as well as mechanisms for recording, assessing and correcting data entry errors.

Data described in multiple reports (or even within a single report) of a study may not be consistent. Review authors will need to describe how they work with multiple reports in the protocol, for example, by pre-specifying which report will be used when sources contain conflicting data that cannot be resolved by contacting the investigators. Likewise, when there is only one report identified for a study, review authors should specify the section within the report (e.g. abstract, methods, results, tables, and figures) for use in case of inconsistent information.

If review authors wish to automatically import their extracted data into RevMan, it is advised that their data collection forms match the data extraction templates available via the RevMan Knowledge Base. Details are available at https://documentation.cochrane.org/revman-kb/data-extraction-templates-260702375.html

A good data collection form should minimize the need to go back to the source documents. When designing a data collection form, review authors should involve all members of the team, that is, content area experts, authors with experience in systematic review methods and data collection form design, statisticians, and persons who will perform data extraction. Here are suggested steps and some tips for designing a data collection form, based on the informal collation of experiences from numerous review authors (Li et al 2015).

Step 1. Develop outlines of tables and figures expected to appear in the systematic review, considering the comparisons to be made between different interventions within the review, and the various outcomes to be measured. This step will help review authors decide the right amount of data to collect (not too much or too little). Collecting too much information can lead to forms that are longer than original study reports, and can be very wasteful of time. Collection of too little information, or omission of key data, can lead to the need to return to study reports later in the review process.

Step 2. Assemble and group data elements to facilitate form development. Review authors should consult Table 5.3.a , in which the data elements are grouped to facilitate form development and data collection. Note that it may be more efficient to group data elements in the order in which they are usually found in study reports (e.g. starting with reference information, followed by eligibility criteria, intervention description, statistical methods, baseline characteristics and results).

Step 3. Identify the optimal way of framing the data items. Much has been written about how to frame data items for developing robust data collection forms in primary research studies. We summarize a few key points and highlight issues that are pertinent to systematic reviews.

  • Ask closed-ended questions (i.e. questions that define a list of permissible responses) as much as possible. Closed-ended questions do not require post hoc coding and provide better control over data quality than open-ended questions. When setting up a closed-ended question, one must anticipate and structure possible responses and include an ‘other, specify’ category because the anticipated list may not be exhaustive. Avoid asking data extractors to summarize data into uncoded text, no matter how short it is.
  • Avoid asking a question in a way that the response may be left blank. Include ‘not applicable’, ‘not reported’ and ‘cannot tell’ options as needed. The ‘cannot tell’ option tags uncertain items that may prompt review authors to contact study authors for clarification, especially on data items critical to reach conclusions.
  • Remember that the form will focus on what is reported in the article rather than what has been done in the study. The study report may not fully reflect how the study was actually conducted. For example, a question ‘Did the article report that the participants were masked to the intervention?’ is more appropriate than ‘Were participants masked to the intervention?’
  • Where a judgement is required, record the raw data (i.e. quote directly from the source document) used to make the judgement. It is also important to record the source of information collected, including where it was found in a report or whether information was obtained from unpublished sources or personal communications. As much as possible, questions should be asked in a way that minimizes subjective interpretation and judgement to facilitate data comparison and adjudication.
  • Incorporate flexibility to allow for variation in how data are reported. It is strongly recommended that outcome data be collected in the format in which they were reported and transformed in a subsequent step if required. Review authors also should consider the software they will use for analysis and for publishing the review (e.g. RevMan).
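The points above can be sketched in code for an electronic form. This is a minimal illustration, not a feature of any particular data system: the class and field names are hypothetical. It shows a closed-ended item with a fixed list of permissible responses (including ‘not reported’, ‘cannot tell’ and ‘not applicable’), together with the supporting quote and its location in the report.

```python
from dataclasses import dataclass
from enum import Enum


class Reported(Enum):
    """Permissible responses for a closed-ended yes/no item (hypothetical list)."""
    YES = "yes"
    NO = "no"
    NOT_REPORTED = "not reported"
    CANNOT_TELL = "cannot tell"
    NOT_APPLICABLE = "not applicable"


@dataclass
class ItemResponse:
    question: str
    answer: Reported
    supporting_quote: str = ""  # raw text quoted from the source document
    location: str = ""          # e.g. "Methods, p. 3" or "personal communication"

    def needs_follow_up(self) -> bool:
        # 'cannot tell' tags items for possible contact with study authors
        return self.answer is Reported.CANNOT_TELL

    def is_complete(self) -> bool:
        # A 'yes' or 'no' judgement must be backed by a quote and its location
        if self.answer in (Reported.YES, Reported.NO):
            return bool(self.supporting_quote and self.location)
        return True


item = ItemResponse(
    question="Did the article report that the participants were masked to the intervention?",
    answer=Reported("cannot tell"),
)
print(item.needs_follow_up(), item.is_complete())
```

Because `Reported` is an enumeration, a response outside the list (for example `Reported("maybe")`) raises a `ValueError` rather than being stored as uncoded text.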

Step 4. Develop and pilot-test data collection forms, ensuring that they provide data in the right format and structure for subsequent analysis. In addition to data items described in Step 2, data collection forms should record the title of the review as well as the person who is completing the form and the date of completion. Forms occasionally need revision; forms should therefore include the version number and version date to reduce the chances of using an outdated form by mistake. Because a study may be associated with multiple reports, it is important to record the study ID as well as the report ID. Definitions and instructions helpful for answering a question should appear next to the question to improve quality and consistency across data extractors (Stock 1994). Provide space for notes, regardless of whether paper or electronic forms are used.
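The identifying information listed in Step 4 could be captured as a header attached to every completed form. The sketch below is only one possible layout; all names and the version string are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

CURRENT_FORM_VERSION = "1.2"  # hypothetical version of the form currently in use


@dataclass(frozen=True)
class FormHeader:
    review_title: str
    form_version: str
    form_version_date: date
    extractor: str
    completion_date: date
    study_id: str    # one study may have several reports
    report_id: str
    notes: str = ""  # free-text space for notes


def is_outdated(header: FormHeader, current_version: str = CURRENT_FORM_VERSION) -> bool:
    """Flag forms completed with a version other than the current one."""
    return header.form_version != current_version


header = FormHeader(
    review_title="Example review",
    form_version="1.1",
    form_version_date=date(2024, 1, 15),
    extractor="Extractor A",
    completion_date=date(2024, 3, 2),
    study_id="STUDY-001",
    report_id="STUDY-001-R2",
)
print(is_outdated(header))
```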

All data collection forms and data systems should be thoroughly pilot-tested before launch (see MECIR Box 5.4.a ). Testing should involve several people extracting data from at least a few articles. The initial testing focuses on the clarity and completeness of questions. Users of the form may provide feedback that certain coding instructions are confusing or incomplete (e.g. a list of options may not cover all situations). The testing may identify data that are missing from the form, or likely to be superfluous. After initial testing, accuracy of the extracted data should be checked against the source document or verified data to identify problematic areas. It is wise to draft entries for the table of ‘Characteristics of included studies’ and complete a risk of bias assessment ( Chapter 8 ) using these pilot reports to ensure all necessary information is collected. A consensus between review authors may be required before the form is modified to avoid any misunderstandings or later disagreements. It may be necessary to repeat the pilot testing on a new set of reports if major changes are needed after the first pilot test.

Problems with the data collection form may surface after pilot testing has been completed, and the form may need to be revised after data extraction has started. When changes are made to the form or coding instructions, it may be necessary to return to reports that have already undergone data extraction. In some situations, it may be necessary to clarify only coding instructions without modifying the actual data collection form.

5.5 Extracting data from reports

5.5.1 Introduction

In most systematic reviews, the primary source of information about each study is published reports of studies, usually in the form of journal articles. Despite recent developments in machine learning models to automate data extraction in systematic reviews (see Section 5.5.9 ), data extraction is still largely a manual process. Electronic searches for text can provide a useful aid to locating information within a report. Examples include using search facilities in PDF viewers, internet browsers and word processing software. However, text searching should not be considered a replacement for reading the report, since information may be presented using variable terminology and presented in multiple formats.

5.5.2 Who should extract data?

Data extractors should have at least a basic understanding of the topic, and have knowledge of study design, data analysis and statistics. They should pay attention to detail while following instructions on the forms. Because errors that occur at the data extraction stage are rarely detected by peer reviewers, editors, or users of systematic reviews, it is recommended that more than one person extract data from every report to minimize errors and reduce introduction of potential biases by review authors (see MECIR Box 5.5.a ). As a minimum, information that involves subjective interpretation and information that is critical to the interpretation of results (e.g. outcome data) should be extracted independently by at least two people (see MECIR Box 5.5.a ). In common with implementation of the selection process ( Chapter 4, Section 4.6 ), it is preferable that data extractors are from complementary disciplines, for example a methodologist and a topic area specialist. It is important that everyone involved in data extraction has practice using the form and, if the form was designed by someone else, receives appropriate training.

Evidence in support of duplicate data extraction comes from several indirect sources. One study observed that independent data extraction by two authors resulted in fewer errors than data extraction by a single author followed by verification by a second (Buscemi et al 2006). A high prevalence of data extraction errors (errors in 20 out of 34 reviews) has been observed (Jones et al 2005). A further study of data extraction to compute standardized mean differences found that a minimum of seven out of 27 reviews had substantial errors (Gøtzsche et al 2007).

MECIR Box 5.5.a Relevant expectations for conduct of intervention reviews

5.5.3 Training data extractors

Training of data extractors is intended to familiarize them with the review topic and methods, the data collection form or data system, and issues that may arise during data extraction. Results of the pilot testing of the form should prompt discussion among review authors and extractors of ambiguous questions or responses to establish consistency. Training should take place at the onset of the data extraction process and periodically over the course of the project (Li et al 2015). For example, when data related to a single item on the form are present in multiple locations within a report (e.g. abstract, main body of text, tables, and figures) or in several sources (e.g. publications, ClinicalTrials.gov, or CSRs), the development and documentation of instructions to follow an agreed algorithm are critical and should be reinforced during the training sessions.

Some have proposed that some information in a report, such as its authors, be blinded to the review author prior to data extraction and assessment of risk of bias (Jadad et al 1996). However, blinding of review authors to aspects of study reports generally is not recommended for Cochrane Reviews as there is little evidence that it alters the decisions made (Berlin 1997).

5.5.4 Extracting data from multiple reports of the same study

Studies frequently are reported in more than one publication or in more than one source (Tramèr et al 1997, von Elm et al 2004). A single source rarely provides complete information about a study; on the other hand, multiple sources may contain conflicting information about the same study (Mayo-Wilson et al 2017a, Mayo-Wilson et al 2017b, Mayo-Wilson et al 2018). Because the unit of interest in a systematic review is the study and not the report, information from multiple reports often needs to be collated and reconciled. It is not appropriate to discard any report of an included study without careful examination, since it may contain valuable information not included in the primary report. Review authors will need to decide between two strategies:

  • Extract data from each report separately, then combine information across multiple data collection forms.
  • Extract data from all reports directly into a single data collection form.

The choice of which strategy to use will depend on the nature of the reports and may vary across studies and across reports. For example, when a full journal article and multiple conference abstracts are available, it is likely that the majority of information will be obtained from the journal article; completing a new data collection form for each conference abstract may be a waste of time. Conversely, when there are two or more detailed journal articles, perhaps relating to different periods of follow-up, then it is likely to be easier to perform data extraction separately for these articles and collate information from the data collection forms afterwards. When data from all reports are extracted into a single data collection form, review authors should identify the ‘main’ data source for each study when sources include conflicting data and these differences cannot be resolved by contacting authors (Mayo-Wilson et al 2018). Flow diagrams such as those modified from the PRISMA statement can be particularly helpful when collating and documenting information from multiple reports (Mayo-Wilson et al 2018).

5.5.5 Reliability and reaching consensus

When more than one author extracts data from the same reports, there is potential for disagreement. After data have been extracted independently by two or more extractors, responses must be compared to assure agreement or to identify discrepancies. An explicit procedure or decision rule should be specified in the protocol for identifying and resolving disagreements. Most often, the source of the disagreement is an error by one of the extractors and is easily resolved. Thus, discussion among the authors is a sensible first step. More rarely, a disagreement may require arbitration by another person. Any disagreement that cannot be resolved should be addressed by contacting the study authors; if this is unsuccessful, the disagreement should be reported in the review.
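When forms are electronic, the comparison of independent extractions can be automated. This is a minimal sketch assuming each extractor’s responses are held as a dictionary keyed by data item; the function name and item names are hypothetical.

```python
def find_discrepancies(extractor_a: dict, extractor_b: dict) -> dict:
    """Return items on which two extractors differ, with both responses.

    An item missing from one extractor's form is reported as None for
    that extractor, so omissions are flagged as well as disagreements.
    """
    items = sorted(set(extractor_a) | set(extractor_b))
    return {
        item: (extractor_a.get(item), extractor_b.get(item))
        for item in items
        if extractor_a.get(item) != extractor_b.get(item)
    }


a = {"n_randomized": 120, "masking_reported": "yes", "follow_up_weeks": 12}
b = {"n_randomized": 118, "masking_reported": "yes"}
for item, (value_a, value_b) in find_discrepancies(a, b).items():
    print(item, value_a, value_b)
```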

The presence and resolution of disagreements should be carefully recorded. Maintaining a copy of the data ‘as extracted’ (in addition to the consensus data) allows assessment of reliability of coding. Examples of ways in which this can be achieved include the following:

  • Use one author’s (paper) data collection form and record changes after consensus in a different ink colour.
  • Enter consensus data onto an electronic form.
  • Record original data extracted and consensus data in separate forms (some online tools do this automatically).

Agreement of coded items before reaching consensus can be quantified, for example using kappa statistics (Orwin 1994), although this is not routinely done in Cochrane Reviews. If agreement is assessed, this should be done only for the most important data (e.g. key risk of bias assessments, or availability of key outcomes).
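For readers who do wish to quantify agreement, the sketch below computes an unweighted Cohen’s kappa for two extractors coding the same items. Choosing Cohen’s kappa is an assumption made for illustration (the text refers only to kappa statistics in general), and the function name and example codes are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a: list, ratings_b: list) -> float:
    """Unweighted Cohen's kappa for two raters coding the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(ratings_a)
    # Observed proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n**2
    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


extractor_a = ["low", "low", "high", "high"]
extractor_b = ["low", "high", "high", "high"]
print(cohens_kappa(extractor_a, extractor_b))
```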

Throughout the review process informal consideration should be given to the reliability of data extraction. For example, if after reaching consensus on the first few studies, the authors note a frequent disagreement for specific data, then coding instructions may need modification. Furthermore, an author’s coding strategy may change over time, as the coding rules are forgotten, indicating a need for retraining and, possibly, some recoding.

5.5.6 Extracting data from clinical study reports

Clinical study reports (CSRs) obtained for a systematic review are likely to be in PDF format. Although CSRs can be thousands of pages in length and very time-consuming to review, they typically follow the content and format required by the International Conference on Harmonisation (ICH 1995). Information in CSRs is usually presented in a structured and logical way. For example, numerical data pertaining to important demographic, efficacy, and safety variables are placed within the main text in tables and figures. Because of the clarity and completeness of information provided in CSRs, data extraction from CSRs may be clearer and conducted more confidently than from journal articles or other short reports.

To extract data from CSRs efficiently, review authors should familiarize themselves with the structure of the CSRs. In practice, review authors may want to browse or create ‘bookmarks’ within a PDF document that record section headers and subheaders and search key words related to the data extraction (e.g. randomization). In addition, it may be useful to utilize optical character recognition software to convert tables of data in the PDF to an analysable format when additional analyses are required, saving time and minimizing transcription errors.

CSRs may contain many outcomes and present many results for a single outcome (due to different analyses) (Mayo-Wilson et al 2017b). We recommend review authors extract results only for outcomes of interest to the review (Section 5.3.6 ). With regard to different methods of analysis, review authors should have a plan and pre-specify preferred metrics in their protocol for extracting results pertaining to different populations (e.g. ‘all randomized’, ‘all participants taking at least one dose of medication’), methods for handling missing data (e.g. ‘complete case analysis’, ‘multiple imputation’), and adjustment (e.g. unadjusted, adjusted for baseline covariates). It may be important to record the range of analysis options available, even if not all are extracted in detail. In some cases it may be preferable to use metrics that are comparable across multiple included studies, which may not be clear until data collection for all studies is complete.

CSRs are particularly useful for identifying outcomes assessed but not presented to the public. For efficacy outcomes and systematically collected adverse events, review authors can compare what is described in the CSRs with what is reported in published reports to assess the risk of bias due to missing outcome data (Chapter 8, Section 8.5) and in selection of the reported result (Chapter 8, Section 8.7). Note that non-systematically collected adverse events are not amenable to such comparisons because these adverse events may not be known ahead of time and thus not pre-specified in the protocol.

5.5.7 Extracting data from regulatory reviews

Data most relevant to systematic reviews can be found in the medical and statistical review sections of a regulatory review. Both of these are substantially longer than journal articles (Turner 2013). A list of all trials on a drug usually can be found in the medical review. Because trials are referenced by a combination of numbers and letters, it may be difficult for the review authors to link the trial with other reports of the same trial (Section 5.2.1).

Many of the documents downloaded from the US Food and Drug Administration’s website for older drugs are scanned copies and are not searchable because of redaction of confidential information (Turner 2013). Optical character recognition software can convert most of the text. Reviews for newer drugs have been redacted electronically; documents remain searchable as a result.

Compared to CSRs, regulatory reviews contain less information about trial design, execution, and results. They provide limited information for assessing the risk of bias. In terms of extracting outcomes and results, review authors should follow the guidance provided for CSRs (Section 5.5.6).

5.5.8 Extracting data from figures with software

Sometimes numerical data needed for systematic reviews are only presented in figures. Review authors may request the data from the study investigators, or alternatively, extract the data from the figures either manually (e.g. with a ruler) or by using software. Numerous tools are available, many of which are free. Those available at the time of writing include Plot Digitizer, WebPlotDigitizer, Engauge, Dexter, ycasd and GetData Graph Digitizer. The software works by taking an image of a figure and then digitizing the data points off the figure using the axes and scales set by the users. The numbers exported can be used for systematic reviews, although additional calculations may be needed to obtain the summary statistics, such as calculation of means and standard deviations from individual-level data points (or conversion of time-to-event data presented on Kaplan-Meier plots to hazard ratios; see Chapter 6, Section 6.8.2).
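The underlying idea of digitizing can be sketched as a linear mapping from pixel positions to data values, calibrated from two reference points on an axis. This is an illustration of the principle only, not a description of how any of the named tools is implemented; it assumes a linear (not logarithmic) axis, and all names and numbers are hypothetical.

```python
import statistics


def make_axis_mapper(pixel_1: float, value_1: float, pixel_2: float, value_2: float):
    """Build a pixel -> data value function from two calibration points on a linear axis."""
    scale = (value_2 - value_1) / (pixel_2 - pixel_1)
    return lambda pixel: value_1 + (pixel - pixel_1) * scale


# Calibration set by the user: pixel row 400 is 0 on the y-axis, pixel row 100 is 30.
to_y = make_axis_mapper(400, 0.0, 100, 30.0)

# Pixel rows of individual data points picked off the figure.
clicked_rows = [300, 250, 200]
values = [to_y(row) for row in clicked_rows]

# Summary statistics computed from the individual-level points.
print(values)
print(statistics.mean(values), statistics.stdev(values))
```

Note that `statistics.stdev` returns the sample standard deviation (dividing by n − 1).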

It has been demonstrated that software is more convenient and accurate than visual estimation or use of a ruler (Gross et al 2014, Jelicic Kadic et al 2016). Review authors should consider using software for extracting numerical data from figures when the data are not available elsewhere.

5.5.9 Automating data extraction in systematic reviews

Because data extraction is time-consuming and error-prone, automating or semi-automating this step may make the extraction process more efficient and accurate. The state of science relevant to automating data extraction is summarized here (Jonnalagadda et al 2015).

  • At least 26 studies have tested various natural language processing and machine learning approaches for facilitating data extraction for systematic reviews.
  • Each tool focuses on only a limited number of data elements (ranges from one to seven). Most of the existing tools focus on the PICO information (e.g. number of participants, their age, sex, country, recruiting centres, intervention groups, outcomes, and time points). A few are able to extract study design and results (e.g. objectives, study duration, participant flow), and two extract risk of bias information (Marshall et al 2016, Millard et al 2016). To date, well over half of the data elements needed for systematic reviews have not been explored for automated extraction.
  • Most tools highlight the sentence(s) that may contain the data elements as opposed to directly recording these data elements into a data collection form or a data system.
  • There is no gold standard or common dataset to evaluate the performance of these tools, limiting our ability to interpret the significance of the reported accuracy measures.

At the time of writing, we cannot recommend a specific tool for automating data extraction for routine systematic review production. There is a need for review authors to work with experts in informatics to refine these tools and evaluate them rigorously. Such investigations should address how the tool will fit into existing workflows. For example, the automated or semi-automated data extraction approaches may first act as checks for manual data extraction before they can replace it.

5.5.10 Suspicions of scientific misconduct

Systematic review authors can uncover suspected misconduct in the published literature. Misconduct includes fabrication or falsification of data or results, plagiarism, and research that does not adhere to ethical norms. Review authors need to be aware of scientific misconduct because the inclusion of fraudulent material could undermine the reliability of a review’s findings. Plagiarism of results data in the form of duplicated publication (either by the same or by different authors) may, if undetected, lead to study participants being double counted in a synthesis.

It is preferable to identify potential problems before, rather than after, publication of the systematic review, so that readers are not misled. However, empirical evidence indicates that the extent to which systematic review authors explore misconduct varies widely (Elia et al 2016). Text-matching software and systems such as CrossCheck may be helpful for detecting plagiarism, but they can detect only matching text, so data tables or figures need to be inspected by hand or using other systems (e.g. to detect image manipulation). Lists of data such as in a meta-analysis can be a useful means of detecting duplicated studies. Furthermore, examination of baseline data can lead to suspicions of misconduct for an individual randomized trial (Carlisle et al 2015). For example, Al-Marzouki and colleagues concluded that a trial report was fabricated or falsified on the basis of highly unlikely baseline differences between two randomized groups (Al-Marzouki et al 2005).
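As a simple aid to scanning lists of data for duplicated studies, rows with identical summary statistics can be grouped. The sketch below is a screening heuristic only, with hypothetical names and data: a match is a prompt for manual inspection, not evidence of duplication or misconduct.

```python
from collections import defaultdict


def flag_possible_duplicates(rows: list[dict]) -> list[list[str]]:
    """Group study IDs whose sample size, mean and SD are all identical."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["n"], row["mean"], row["sd"])
        groups[key].append(row["study_id"])
    # Keep only groups where more than one study shares the same numbers
    return [ids for ids in groups.values() if len(ids) > 1]


rows = [
    {"study_id": "Study A 2010", "n": 45, "mean": 12.3, "sd": 4.1},
    {"study_id": "Study B 2011", "n": 60, "mean": 10.8, "sd": 3.9},
    {"study_id": "Study C 2012", "n": 45, "mean": 12.3, "sd": 4.1},
]
print(flag_possible_duplicates(rows))
```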

Cochrane Review authors are advised to consult with Cochrane editors if cases of suspected misconduct are identified. Searching for comments, letters or retractions may uncover additional information. Sensitivity analyses can be used to determine whether the studies arousing suspicion are influential in the conclusions of the review. Guidance for editors for addressing suspected misconduct will be available from Cochrane’s Editorial Publishing and Policy Resource (see community.cochrane.org ). Further information is available from the Committee on Publication Ethics (COPE; publicationethics.org ), including a series of flowcharts on how to proceed if various types of misconduct are suspected. Cases should be followed up, typically including an approach to the editors of the journals in which suspect reports were published. It may be useful to write first to the primary investigators to request clarification of apparent inconsistencies or unusual observations.

Because investigations may take time, and institutions may not always be responsive (Wager 2011), articles suspected of being fraudulent should be classified as ‘awaiting assessment’. If a misconduct investigation indicates that the publication is unreliable, or if a publication is retracted, it should not be included in the systematic review, and the reason should be noted in the ‘excluded studies’ section.

5.5.11 Key points in planning and reporting data extraction

In summary, the methods section of both the protocol and the review should detail:

  • the data categories that are to be extracted;
  • how extracted data from each report will be verified (e.g. extraction by two review authors, independently);
  • whether data extraction is undertaken by content area experts, methodologists, or both;
  • pilot testing, training and existence of coding instructions for the data collection form;
  • how data are extracted from multiple reports from the same study; and
  • how disagreements are handled when more than one author extracts data from each report.

5.6 Extracting study results and converting to the desired format

In most cases, it is desirable to collect summary data separately for each intervention group of interest and to enter these into software in which effect estimates can be calculated, such as RevMan. Sometimes the required data may be obtained only indirectly, and the relevant results may not be obvious. Chapter 6 provides many useful tips and techniques to deal with common situations. When summary data cannot be obtained from each intervention group, or where it is important to use results of adjusted analyses (for example to account for correlations in crossover or cluster-randomized trials) effect estimates may be available directly.

5.7 Managing and sharing data

When data have been collected for each individual study, it is helpful to organize them into a comprehensive electronic format, such as a database or spreadsheet, before entering data into a meta-analysis or other synthesis. When data are collated electronically, all or a subset of them can easily be exported for cleaning, consistency checks and analysis.

Tabulation of collected information about studies can facilitate classification of studies into appropriate comparisons and subgroups. It also allows identification of comparable outcome measures and statistics across studies. It will often be necessary to perform calculations to obtain the required statistics for presentation or synthesis. It is important through this process to retain clear information on the provenance of the data, with a clear distinction between data from a source document and data obtained through calculations. Statistical conversions, for example from standard errors to standard deviations, ideally should be undertaken with a computer rather than using a hand calculator to maintain a permanent record of the original and calculated numbers as well as the actual calculations used.
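One way to keep source data and calculated data distinct is to store every number together with its origin and, for calculated values, the calculation used. The following is a minimal sketch of that idea; the class, field and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Statistic:
    name: str
    value: float
    origin: str           # "source document" or "calculated"
    derivation: str = ""  # where it was found, or the calculation used


def sd_from_se_with_provenance(se: Statistic, n: int) -> Statistic:
    """Convert a standard error to a standard deviation, recording how."""
    return Statistic(
        name="sd",
        value=se.value * math.sqrt(n),
        origin="calculated",
        derivation=f"sd = se * sqrt(n) = {se.value} * sqrt({n}); se from {se.derivation}",
    )


se = Statistic("se", 0.5, "source document", "Table 2, p. 6")
sd = sd_from_se_with_provenance(se, 100)
print(sd.value, sd.origin)
print(sd.derivation)
```

The original standard error is left untouched, so both the reported and the calculated numbers remain on record.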

Ideally, data only need to be extracted once and should be stored in a secure and stable location for future updates of the review, regardless of whether the original review authors or a different group of authors update the review (Ip et al 2012). Standardizing and sharing data collection tools as well as data management systems among review authors working in similar topic areas can streamline systematic review production. Review authors have the opportunity to work with trialists, journal editors, funders, regulators, and other stakeholders to make study data (e.g. CSRs, IPD, and any other form of study data) publicly available, increasing the transparency of research. When legal and ethical to do so, we encourage review authors to share the data used in their systematic reviews to reduce waste and to allow verification and reanalysis because data will not have to be extracted again for future use (Mayo-Wilson et al 2018).

5.8 Chapter information

Editors: Tianjing Li, Julian PT Higgins, Jonathan J Deeks

Acknowledgements: This chapter builds on earlier versions of the Handbook. For details of previous authors and editors of the Handbook, see Preface. Andrew Herxheimer, Nicki Jackson, Yoon Loke, Deirdre Price and Helen Thomas contributed text. Stephanie Taylor and Sonja Hood contributed suggestions for designing data collection forms. We are grateful to Judith Anzures, Mike Clarke, Miranda Cumpston and Peter Gøtzsche for helpful comments.

Funding: JPTH is a member of the National Institute for Health Research (NIHR) Biomedical Research Centre at University Hospitals Bristol NHS Foundation Trust and the University of Bristol. JJD received support from the NIHR Birmingham Biomedical Research Centre at the University Hospitals Birmingham NHS Foundation Trust and the University of Birmingham. JPTH received funding from National Institute for Health Research Senior Investigator award NF-SI-0617-10145. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

5.9 References

Al-Marzouki S, Evans S, Marshall T, Roberts I. Are these data real? Statistical methods for the detection of data fabrication in clinical trials. BMJ 2005; 331 : 267-270.

Allen EN, Mushi AK, Massawe IS, Vestergaard LS, Lemnge M, Staedke SG, Mehta U, Barnes KI, Chandler CI. How experiences become data: the process of eliciting adverse event, medical history and concomitant medication reports in antimalarial and antiretroviral interaction trials. BMC Medical Research Methodology 2013; 13 : 140.

Baudard M, Yavchitz A, Ravaud P, Perrodeau E, Boutron I. Impact of searching clinical trial registries in systematic reviews of pharmaceutical treatments: methodological systematic review and reanalysis of meta-analyses. BMJ 2017; 356 : j448.

Bent S, Padula A, Avins AL. Better ways to question patients about adverse medical events: a randomized, controlled trial. Annals of Internal Medicine 2006; 144 : 257-261.

Berlin JA. Does blinding of readers affect the results of meta-analyses? University of Pennsylvania Meta-analysis Blinding Study Group. Lancet 1997; 350 : 185-186.

Buscemi N, Hartling L, Vandermeer B, Tjosvold L, Klassen TP. Single data extraction generated more errors than double data extraction in systematic reviews. Journal of Clinical Epidemiology 2006; 59 : 697-703.

Carlisle JB, Dexter F, Pandit JJ, Shafer SL, Yentis SM. Calculating the probability of random sampling for continuous variables in submitted or published randomised controlled trials. Anaesthesia 2015; 70 : 848-858.

Carroll C, Patterson M, Wood S, Booth A, Rick J, Balain S. A conceptual framework for implementation fidelity. Implementation Science 2007; 2 : 40.

Carvajal A, Ortega PG, Sainz M, Velasco V, Salado I, Arias LHM, Eiros JM, Rubio AP, Castrodeza J. Adverse events associated with pandemic influenza vaccines: Comparison of the results of a follow-up study with those coming from spontaneous reporting. Vaccine 2011; 29 : 519-522.

Chamberlain C, O'Mara-Eves A, Porter J, Coleman T, Perlen SM, Thomas J, McKenzie JE. Psychosocial interventions for supporting women to stop smoking in pregnancy. Cochrane Database of Systematic Reviews 2017; 2 : CD001055.

Damschroder LJ, Aron DC, Keith RE, Kirsh SR, Alexander JA, Lowery JC. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implementation Science 2009; 4 : 50.

Davis AL, Miller JD. The European Medicines Agency and publication of clinical study reports: a challenge for the US FDA. JAMA 2017; 317 : 905-906.

Denniston AK, Holland GN, Kidess A, Nussenblatt RB, Okada AA, Rosenbaum JT, Dick AD. Heterogeneity of primary outcome measures used in clinical trials of treatments for intermediate, posterior, and panuveitis. Orphanet Journal of Rare Diseases 2015; 10 : 97.

Derry S, Loke YK. Risk of gastrointestinal haemorrhage with long term use of aspirin: meta-analysis. BMJ 2000; 321 : 1183-1187.

Doshi P, Dickersin K, Healy D, Vedula SS, Jefferson T. Restoring invisible and abandoned trials: a call for people to publish the findings. BMJ 2013; 346 : f2865.

Dusenbury L, Brannigan R, Falco M, Hansen WB. A review of research on fidelity of implementation: implications for drug abuse prevention in school settings. Health Education Research 2003; 18 : 237-256.

Dwan K, Altman DG, Clarke M, Gamble C, Higgins JPT, Sterne JAC, Williamson PR, Kirkham JJ. Evidence for the selective reporting of analyses and discrepancies in clinical trials: a systematic review of cohort studies of clinical trials. PLoS Medicine 2014; 11 : e1001666.

Elia N, von Elm E, Chatagner A, Popping DM, Tramèr MR. How do authors of systematic reviews deal with research malpractice and misconduct in original studies? A cross-sectional analysis of systematic reviews and survey of their authors. BMJ Open 2016; 6 : e010442.

Gøtzsche PC. Multiple publication of reports of drug trials. European Journal of Clinical Pharmacology 1989; 36 : 429-432.

Gøtzsche PC, Hróbjartsson A, Maric K, Tendal B. Data extraction errors in meta-analyses that use standardized mean differences. JAMA 2007; 298 : 430-437.

Gross A, Schirm S, Scholz M. Ycasd - a tool for capturing and scaling data from graphical representations. BMC Bioinformatics 2014; 15 : 219.

Hoffmann TC, Glasziou PP, Boutron I, Milne R, Perera R, Moher D, Altman DG, Barbour V, Macdonald H, Johnston M, Lamb SE, Dixon-Woods M, McCulloch P, Wyatt JC, Chan AW, Michie S. Better reporting of interventions: template for intervention description and replication (TIDieR) checklist and guide. BMJ 2014; 348 : g1687.

ICH. ICH Harmonised tripartite guideline: Structure and content of clinical study reports E3. ICH; 1995. www.ich.org/fileadmin/Public_Web_Site/ICH_Products/Guidelines/Efficacy/E3/E3_Guideline.pdf.

Ioannidis JPA, Mulrow CD, Goodman SN. Adverse events: The more you search, the more you find. Annals of Internal Medicine 2006; 144 : 298-300.

Ip S, Hadar N, Keefe S, Parkin C, Iovin R, Balk EM, Lau J. A web-based archive of systematic review data. Systematic Reviews 2012; 1 : 15.

Ismail R, Azuara-Blanco A, Ramsay CR. Variation of clinical outcomes used in glaucoma randomised controlled trials: a systematic review. British Journal of Ophthalmology 2014; 98 : 464-468.

Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds DJM, Gavaghan DJ, McQuay H. Assessing the quality of reports of randomized clinical trials: Is blinding necessary? Controlled Clinical Trials 1996; 17 : 1-12.

Jelicic Kadic A, Vucic K, Dosenovic S, Sapunar D, Puljak L. Extracting data from figures with software was faster, with higher interrater reliability than manual extraction. Journal of Clinical Epidemiology 2016; 74 : 119-123.

Jones AP, Remmington T, Williamson PR, Ashby D, Smyth RL. High prevalence but low impact of data extraction and reporting errors were found in Cochrane systematic reviews. Journal of Clinical Epidemiology 2005; 58 : 741-742.

Jones CW, Keil LG, Holland WC, Caughey MC, Platts-Mills TF. Comparison of registered and published outcomes in randomized controlled trials: a systematic review. BMC Medicine 2015; 13 : 282.

Jonnalagadda SR, Goyal P, Huffman MD. Automating data extraction in systematic reviews: a systematic review. Systematic Reviews 2015; 4 : 78.

Lewin S, Hendry M, Chandler J, Oxman AD, Michie S, Shepperd S, Reeves BC, Tugwell P, Hannes K, Rehfuess EA, Welch V, McKenzie JE, Burford B, Petkovic J, Anderson LM, Harris J, Noyes J. Assessing the complexity of interventions within systematic reviews: development, content and use of a new tool (iCAT_SR). BMC Medical Research Methodology 2017; 17 : 76.

Li G, Abbade LPF, Nwosu I, Jin Y, Leenus A, Maaz M, Wang M, Bhatt M, Zielinski L, Sanger N, Bantoto B, Luo C, Shams I, Shahid H, Chang Y, Sun G, Mbuagbaw L, Samaan Z, Levine MAH, Adachi JD, Thabane L. A scoping review of comparisons between abstracts and full reports in primary biomedical research. BMC Medical Research Methodology 2017; 17 : 181.

Li TJ, Vedula SS, Hadar N, Parkin C, Lau J, Dickersin K. Innovations in data collection, management, and archiving for systematic reviews. Annals of Internal Medicine 2015; 162 : 287-294.

Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JPA, Clarke M, Devereaux PJ, Kleijnen J, Moher D. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Medicine 2009; 6 : e1000100.

Liu ZM, Saldanha IJ, Margolis D, Dumville JC, Cullum NA. Outcomes in Cochrane systematic reviews related to wound care: an investigation into prespecification. Wound Repair and Regeneration 2017; 25 : 292-308.

Marshall IJ, Kuiper J, Wallace BC. RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. Journal of the American Medical Informatics Association 2016; 23 : 193-201.

Mayo-Wilson E, Doshi P, Dickersin K. Are manufacturers sharing data as promised? BMJ 2015; 351 : h4169.

Mayo-Wilson E, Li TJ, Fusco N, Bertizzolo L, Canner JK, Cowley T, Doshi P, Ehmsen J, Gresham G, Guo N, Haythornthwaite JA, Heyward J, Hong H, Pham D, Payne JL, Rosman L, Stuart EA, Suarez-Cuervo C, Tolbert E, Twose C, Vedula S, Dickersin K. Cherry-picking by trialists and meta-analysts can drive conclusions about intervention efficacy. Journal of Clinical Epidemiology 2017a; 91 : 95-110.

Mayo-Wilson E, Fusco N, Li TJ, Hong H, Canner JK, Dickersin K, MUDS Investigators. Multiple outcomes and analyses in clinical trials create challenges for interpretation and research synthesis. Journal of Clinical Epidemiology 2017b; 86 : 39-50.

Mayo-Wilson E, Li T, Fusco N, Dickersin K. Practical guidance for using multiple data sources in systematic reviews and meta-analyses (with examples from the MUDS study). Research Synthesis Methods 2018; 9 : 2-12.

Meade MO, Richardson WS. Selecting and appraising studies for a systematic review. Annals of Internal Medicine 1997; 127 : 531-537.

Meinert CL. Clinical trials dictionary: Terminology and usage recommendations. Hoboken (NJ): Wiley; 2012.

Millard LAC, Flach PA, Higgins JPT. Machine learning to assist risk-of-bias assessments in systematic reviews. International Journal of Epidemiology 2016; 45 : 266-277.

Moher D, Schulz KF, Altman DG. The CONSORT Statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Lancet 2001; 357 : 1191-1194.

Moher D, Hopewell S, Schulz KF, Montori V, Gøtzsche PC, Devereaux PJ, Elbourne D, Egger M, Altman DG. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ 2010; 340 : c869.

Moore GF, Audrey S, Barker M, Bond L, Bonell C, Hardeman W, Moore L, O'Cathain A, Tinati T, Wight D, Baird J. Process evaluation of complex interventions: Medical Research Council guidance. BMJ 2015; 350 : h1258.

Orwin RG. Evaluating coding decisions. In: Cooper H, Hedges LV, editors. The Handbook of Research Synthesis. New York (NY): Russell Sage Foundation; 1994. p. 139-162.

Page MJ, McKenzie JE, Kirkham J, Dwan K, Kramer S, Green S, Forbes A. Bias due to selective inclusion and reporting of outcomes and analyses in systematic reviews of randomised trials of healthcare interventions. Cochrane Database of Systematic Reviews 2014; 10 : MR000035.

Ross JS, Mulvey GK, Hines EM, Nissen SE, Krumholz HM. Trial publication after registration in ClinicalTrials.Gov: a cross-sectional analysis. PLoS Medicine 2009; 6 .

Safer DJ. Design and reporting modifications in industry-sponsored comparative psychopharmacology trials. Journal of Nervous and Mental Disease 2002; 190 : 583-592.

Saldanha IJ, Dickersin K, Wang X, Li TJ. Outcomes in Cochrane systematic reviews addressing four common eye conditions: an evaluation of completeness and comparability. PloS One 2014; 9 : e109400.

Saldanha IJ, Li T, Yang C, Ugarte-Gil C, Rutherford GW, Dickersin K. Social network analysis identified central outcomes for core outcome sets using systematic reviews of HIV/AIDS. Journal of Clinical Epidemiology 2016; 70 : 164-175.

Saldanha IJ, Lindsley K, Do DV, Chuck RS, Meyerle C, Jones LS, Coleman AL, Jampel HD, Dickersin K, Virgili G. Comparison of clinical trial and systematic review outcomes for the 4 most prevalent eye diseases. JAMA Ophthalmology 2017a; 135 : 933-940.

Saldanha IJ, Li TJ, Yang C, Owczarzak J, Williamson PR, Dickersin K. Clinical trials and systematic reviews addressing similar interventions for the same condition do not consider similar outcomes to be important: a case study in HIV/AIDS. Journal of Clinical Epidemiology 2017b; 84 : 85-94.

Stewart LA, Clarke M, Rovers M, Riley RD, Simmonds M, Stewart G, Tierney JF, PRISMA-IPD Development Group. Preferred reporting items for a systematic review and meta-analysis of individual participant data: the PRISMA-IPD statement. JAMA 2015; 313 : 1657-1665.

Stock WA. Systematic coding for research synthesis. In: Cooper H, Hedges LV, editors. The Handbook of Research Synthesis. New York (NY): Russell Sage Foundation; 1994. p. 125-138.

Tramèr MR, Reynolds DJ, Moore RA, McQuay HJ. Impact of covert duplicate publication on meta-analysis: a case study. BMJ 1997; 315 : 635-640.

Turner EH. How to access and process FDA drug approval packages for use in research. BMJ 2013; 347 .

von Elm E, Poglia G, Walder B, Tramèr MR. Different patterns of duplicate publication: an analysis of articles used in systematic reviews. JAMA 2004; 291 : 974-980.

Wager E. Coping with scientific misconduct. BMJ 2011; 343 : d6586.

Wieland LS, Rutkow L, Vedula SS, Kaufmann CN, Rosman LM, Twose C, Mahendraratnam N, Dickersin K. Who has used internal company documents for biomedical and public health research and where did they find them? PloS One 2014; 9 .

Zanchetti A, Hansson L. Risk of major gastrointestinal bleeding with aspirin (Authors' reply). Lancet 1999; 353 : 149-150.

Zarin DA, Tse T, Williams RJ, Califf RM, Ide NC. The ClinicalTrials.gov results database: update and key issues. New England Journal of Medicine 2011; 364 : 852-860.

Zwarenstein M, Treweek S, Gagnier JJ, Altman DG, Tunis S, Haynes B, Oxman AD, Moher D. Improving the reporting of pragmatic trials: an extension of the CONSORT statement. BMJ 2008; 337 : a2390.



Qualitative Data Collection: What it is + Methods to do it


Qualitative data collection is vital in qualitative research. It helps researchers understand individuals’ attitudes, beliefs, and behaviors in a specific context.

Several methods are used to collect qualitative data, including interviews, surveys, focus groups, and observations. Understanding the various methods used for gathering qualitative data is essential for successful qualitative research.

In this post, we will discuss qualitative data and the methods used to collect it.

Content Index

  • What is Qualitative Data?
  • What is Qualitative Data Collection?
  • What is the need for Qualitative Data Collection?
  • Effective Qualitative Data Collection Methods
  • Qualitative Data Analysis
  • Advantages of Qualitative Data Collection

What is Qualitative Data?

Qualitative data is defined as data that approximates and characterizes. It can be observed and recorded.

This data type is non-numerical in nature. This type of data is collected through methods of observations, one-to-one interviews, conducting focus groups, and similar methods.

Qualitative data in statistics is also known as categorical data – data that can be arranged categorically based on the attributes and properties of a thing or a phenomenon.

It’s pretty easy to understand the difference between qualitative and quantitative data. Qualitative data does not include numbers in its definition of traits, whereas quantitative research data is all about numbers.

  • The cake is orange, blue, and black in color (qualitative).
  • Females have brown, black, blonde, and red hair (qualitative).

What is Qualitative Data Collection?

Qualitative data collection is gathering non-numerical information, such as words, images, and observations, to understand individuals’ attitudes, behaviors, beliefs, and motivations in a specific context. It is an approach used in qualitative research that seeks to understand social phenomena through in-depth exploration and analysis of people’s perspectives, experiences, and narratives. In statistical analysis, distinguishing between categorical data and numerical data is essential: categorical data involves distinct categories or labels, while numerical data consists of measurable quantities.
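
The distinction between categorical and numerical data shows up in what you can sensibly compute. The short sketch below, using made-up values, counts how often each category occurs for a categorical variable and averages a numerical one; the variable names and data are purely illustrative.

```python
from collections import Counter
from statistics import mean

# Categorical data: labels with no arithmetic meaning (illustrative values).
hair_colour = ["brown", "black", "blonde", "brown", "red"]

# Numerical data: measurable quantities (illustrative values).
ages = [21, 34, 28, 45, 30]

category_counts = Counter(hair_colour)   # frequencies make sense for categories
average_age = mean(ages)                 # averages make sense for numbers

print(category_counts.most_common())
print(average_age)
```

Averaging hair colours would be meaningless, which is why categorical responses are usually summarised as counts or proportions instead.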

The data collected through qualitative methods are often subjective, open-ended, and unstructured and can provide a rich and nuanced understanding of complex social phenomena.

What is the need for Qualitative Data Collection?

Qualitative research is a type of study carried out with a qualitative approach to explore the reasons behind a specific program or phenomenon and to examine how and why it operates the way it does. A researcher can choose from numerous qualitative data collection methods, depending on which ones are relevant to the study.

Qualitative data collection methods serve the primary purpose of collecting textual data for research and analysis, such as thematic analysis. The collected research data is used to examine:

  • Knowledge of a specific issue or program, and people’s experience of it.
  • Meaning and relationships.
  • Social norms and contextual or cultural practices that demean people or impact a cause.

Qualitative data is textual or non-numerical. It mostly covers images, videos, texts, and people’s written or spoken words. You can opt for digital data collection methods, like structured or semi-structured surveys, or settle for the traditional approach comprising individual interviews, group discussions, etc.

Data at hand leads to a smooth process ensuring all the decisions made are for the business’s betterment. You will be able to make informed decisions only if you have relevant data.

With quality data, you will improve not only the quality of decision-making but also the quality of the results expected from any endeavor.

Effective Qualitative Data Collection Methods

Qualitative data collection methods are exploratory. Those are usually more focused on gaining insights and understanding the underlying reasons by digging deeper.

Because qualitative data cannot be quantified, measuring or analyzing it can be difficult. Due to this lack of measurability, qualitative data collection methods are primarily unstructured; in rare cases they are structured, and then only to some extent.

Let’s explore the most common methods used for the collection of qualitative data:

Individual interview

It is one of the most trusted, widely used, and familiar qualitative data collection methods primarily because of its approach. An individual or face-to-face interview is a direct conversation between two people with a specific structure and purpose.

The interview questionnaire is designed to elicit the interviewee’s knowledge or perspective on a topic, program, or issue.

At times, depending on the interviewer’s approach, the conversation can be unstructured or informal but focused on understanding the individual’s beliefs, values, understandings, feelings, experiences, and perspectives on an issue.

More often, the interviewer chooses to ask open-ended questions in individual interviews. If the interviewee selects answers from a set of given options, it becomes a structured, fixed response or a biased discussion.

The individual interview is an ideal qualitative data collection method, particularly when the researchers want highly personalized information from the participants. It is especially useful when the interviewer wants to probe further and ask follow-up questions to gain more insights.

Qualitative surveys

To develop an informed hypothesis, many researchers use qualitative research surveys to collect detailed information about a product or an issue. If you want to create questionnaires for collecting textual or qualitative data, ask more open-ended questions.

To answer such qualitative research questions, respondents have to write their opinion or perspective on a specific topic or issue. Unlike other collection methods, online surveys have a wider reach. People can provide you with quality data that is highly credible and valuable.

  • Paper surveys
  • Online surveys

Focus group discussions

Focus group discussions can also be considered a type of interview, but they are conducted in a group setting. Usually, a focus group consists of 8 to 10 people (the size may vary depending on the researcher’s requirements). The researchers ensure the participants are given appropriate space to discuss a topic or issue in context. The participants are free to agree or disagree with each other’s comments.

With a focus group discussion, researchers learn how a particular group of participants perceives the topic. Researchers analyze what participants think of an issue, the range of opinions expressed, and the ideas discussed. The data is collected by noting the variations or inconsistencies (if any exist) among the participants, especially in terms of beliefs, experiences, and practices.

The participants of focus group discussions are selected based on the topic or issue for which the researcher wants actionable insights. For example, if the research is about the recovery of college students from drug addiction, the participants have to be college students who are recovering from drug addiction.

Other parameters such as age, qualification, financial background, social presence, and demographics are also considered, but not primarily, as the group needs diverse participants. Frequently, the qualitative data collected through focus group discussions is more descriptive and highly detailed.

Record keeping

This method uses reliable documents and other sources of information that already exist as the data source. This information can help with the new study. It’s a lot like going to the library. There, you can look through books and other sources to find information that can be used in your research.

Case studies

In this method, data is collected by looking at case studies in detail. This method’s flexibility is shown by the fact that it can be used to analyze both simple and complicated topics. This method’s strength is how well it draws conclusions from a mix of one or more qualitative data collection methods.

Observations

Observation is one of the traditional methods of qualitative data collection. It is used by researchers to gather descriptive analysis data by observing people and their behavior at events or in their natural settings. In this method, the researcher is completely immersed in watching people by taking a participatory stance to take down notes.

There are two main types of observation:

  • Covert: In this method, the observer is concealed without letting anyone know that they are being observed. For example, a researcher studying the rituals of a wedding in nomadic tribes must join them as a guest and quietly see everything. 
  • Overt: In this method, everyone is aware that they are being watched. For example, a researcher who wants to study the wedding rituals of a nomadic tribe can reveal why they are attending the wedding and even use a video camera to record everything around them.

Observation is a useful method of qualitative data collection, especially when you want to study the ongoing process, situation, or reactions on a specific issue related to the people being observed.

When you want to understand people’s behavior or their way of interacting in a particular community or demographic, you can rely on observation data. Remember, if you fail to get quality data through surveys, qualitative interviews, or group discussions, you can fall back on observation.

It is among the most trusted methods of qualitative data collection, as it requires little to no effort from the participants.

Qualitative Data Analysis

You invested time and money acquiring your data, so analyze it; otherwise you remain in the dark after all your hard work. Qualitative data analysis starts with knowing its two basic approaches, although there are no fixed rules.

  • Deductive Approach: The deductive data analysis uses a researcher-defined structure to analyze qualitative data. This method is quick and easy when a researcher knows what the sample population will say.
  • Inductive Approach: The inductive technique has no structure or framework. When a researcher knows little about the event, an inductive approach is applied.

Whether you want to analyze qualitative data from a one-on-one interview or a survey, these simple steps will ensure a comprehensive qualitative data analysis.

Step 1: Arrange your Data

Once collected, the data is mostly unstructured and sometimes unclear. Arranging your data is the first stage in qualitative data analysis, so researchers must transcribe the data before analyzing it.

Step 2: Organize all your Data

After transforming and arranging your data, the next step is to organize it. One of the best ways to organize the data is to think back to your research goals and then organize the data based on the research questions you asked.

Step 3: Set a Code to the Data Collected

Setting up appropriate codes for the collected data gets you one step closer. Coding is one of the most effective methods for compressing a massive amount of data. It allows you to derive theories from relevant research findings.
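
To make the idea of coding concrete, here is a minimal sketch of a deductive, keyword-based approach: a researcher-defined codebook maps each code to a few trigger words, and every response is tagged with the codes whose keywords it contains. The codebook, the responses and the function name are all hypothetical, and real coding usually relies on human judgement rather than keyword matching alone.

```python
from collections import Counter

# Hypothetical codebook: each code maps to keywords that signal it.
CODEBOOK = {
    "price": {"expensive", "cheap", "cost", "price"},
    "usability": {"easy", "confusing", "intuitive", "difficult"},
    "support": {"support", "helpdesk", "response"},
}

def assign_codes(response, codebook=CODEBOOK):
    """Return the sorted list of codes whose keywords appear in the response."""
    words = {w.strip(".,!?;:").lower() for w in response.split()}
    return sorted(code for code, keywords in codebook.items() if words & keywords)

# Illustrative open-ended survey answers.
responses = [
    "The app is easy to use but too expensive.",
    "Support took days to send a response.",
    "I found the menus confusing.",
]

coded = [(text, assign_codes(text)) for text in responses]
code_counts = Counter(code for _, codes in coded for code in codes)

for text, codes in coded:
    print(codes, "<-", text)
print(code_counts)
```

Counting codes across responses compresses a large body of text into a few themes, which can then be reviewed against the original transcripts.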

Step 4: Validate your Data

Qualitative data analysis success requires data validation. Data validation should be done throughout the research process, not just once. There are two sides to validating data:

  • The accuracy of your research design or methods.
  • Reliability—how well the approaches deliver accurate data.

Step 5: Concluding the Analysis Process

Finally, present your conclusions in a clear report. The report should describe your research methods, their pros and cons, and the limitations of the research. It should also include your findings, inferences, and directions for future research.

QuestionPro is a comprehensive online survey software that offers a variety of qualitative data analysis tools to help businesses and researchers in making sense of their data. Users can use many different qualitative analysis methods to learn more about their data.

Users of QuestionPro can see their data in different charts and graphs, which makes it easier to spot patterns and trends. It can help researchers and businesses learn more about their target audience, which can lead to better decisions and better results.

Advantages of Qualitative Data Collection

Qualitative data collection has several advantages, including:

  • In-depth understanding: It provides in-depth information about attitudes and behaviors, leading to a deeper understanding of the research.
  • Flexibility: The methods allow researchers to modify questions or change direction if new information emerges.
  • Contextualization: Qualitative research data is in context, which helps to provide a deep understanding of the experiences and perspectives of individuals.
  • Rich data: It often produces rich, detailed, and nuanced information that cannot be captured through numerical data.
  • Engagement: The methods, such as interviews and focus groups, involve active engagement with participants, leading to a deeper understanding.
  • Multiple perspectives: This can provide various views and a rich array of voices, adding depth and complexity.
  • Realistic setting: It often occurs in realistic settings, providing more authentic experiences and behaviors.

Qualitative research is one of the best tools for identifying the behaviors and patterns governing social conditions, issues, or topics. It goes a step beyond quantitative data by providing the reasons and rationale behind a phenomenon that cannot be explored quantitatively.

You can use QuestionPro for qualitative data collection through various methods. Used correctly, our robust suite can enhance the quality and integrity of the collected data.

Analytics for Decisions

What are the Data Collection Tools and How to Use Them?

Ever wondered how researchers are able to collect enormous amounts of data and use it effectively while ensuring accuracy and reliability? With the explosion of AI research over the past decade, data collection has become even more critical. Data collection tools are vital components in both qualitative and quantitative research, as they help in collecting and analyzing data effectively.

The process involves the use of different techniques and tools that vary depending on the type of research being conducted. But with so many different tools available, it can sometimes be overwhelming to know which ones to use.

So, in this article, we’ll discuss some of the commonly used data collection tools, how they work, and their application in qualitative and quantitative research. By the end of the article, you’ll be confident about which data collection method is the one for you. Let’s get going!

What are the Data Collection Methods and How to Use Them? (Tools for Qualitative and Quantitative Research)

Data collection – qualitative vs. quantitative.

When researchers conduct studies or experiments, they need to collect data to answer their research questions, which is where data collection tools come in. Data collection tools are methods or instruments that researchers use to gather and analyze data.

Data collection tools can be used in both qualitative and quantitative research, which are two different research methodologies. Qualitative research is focused on understanding people’s experiences and perspectives, while quantitative research is focused on gathering numerical data to test hypotheses.

Importance of Data Collection Tools

Data collection tools are essential for conducting reliable and accurate research. They provide a structured way of gathering information, which helps ensure that data is collected in a consistent and organized manner. This is important because it helps reduce errors and bias in the data, which can impact the validity and reliability of research findings. Moreover, using data collection tools can also help you analyze and interpret data more accurately and confidently.

For example, if you’re conducting a survey, using a standardized questionnaire will make it easier to compare responses and identify trends, leading to more meaningful insights and better-informed decisions. Hence, data collection tools are a vital part of the research process, and help ensure that your research is credible and trustworthy.

5 Types of Data Collection Tools

Researchers use various types of data collection tools, depending on the research methodology and the nature of the data they aim to collect. The five most commonly used are:

  • Interviews
  • Surveys
  • Observations
  • Focus groups
  • Case studies

Let’s explore these methods in detail below, along with their real-life examples.

  • Interviews

Interviews are among the primary data collection tools in qualitative research. They involve a one-on-one conversation between the researcher and the participant and can be either structured or unstructured, depending on the nature of the research. Structured interviews have a predetermined set of questions, while unstructured interviews are more open-ended and allow the researcher to explore the participant’s perspective in depth.

Interviews are useful in collecting rich and detailed data about a specific topic or experience, and they provide an opportunity for the researcher to understand the participant’s perspective in-depth.

Example: A researcher conducting a study on the experiences of cancer patients can use interviews to collect data about the patients’ experiences with their disease, including their emotional responses, coping strategies, and interactions with healthcare providers.

  • Surveys

Surveys are a popular data collection tool in quantitative research. They involve asking a series of questions to a group of participants and can be conducted through many different media, such as in person, by phone or email, or online. Surveys are helpful in collecting large amounts of data quickly and efficiently, and they can be used to measure attitudes, beliefs, and behaviors.

Example: A researcher conducting a study on the public’s opinion on a political issue can use surveys to collect data about people’s beliefs, opinions, and values related to that issue.

  • Observations

Observations involve watching and recording the behavior of individuals or groups in a natural or controlled setting. They are commonly used in qualitative research and are useful in collecting data about social interactions and behaviors. Observations can be structured or unstructured and can be conducted overtly or covertly, depending on the needs of the research.

Example: A researcher studying the behavior of children in a playground can use observations to collect data about how children interact with one another, what games they play, and how they resolve conflicts.

  • Focus Groups

Focus groups involve bringing together a group of individuals to discuss a specific topic or issue and are also used in qualitative research. This method is useful for collecting data about attitudes, beliefs, and opinions. Like surveys, focus groups can be conducted through a variety of media, such as in person or online. Another benefit of focus groups is that they give participants an opportunity to interact with one another, which can lead to a more comprehensive understanding of the topic being studied.

Example: A researcher studying the attitudes of consumers towards a new product can use focus groups to collect data about how consumers perceive the product, what they like and dislike about it, and how they would use it in their daily lives.

  • Case Studies

Case studies involve an in-depth analysis of a specific individual, group, or situation and are useful in collecting detailed data about a specific phenomenon. Case studies can involve interviews, observations, and document analysis and can provide a rich understanding of the topic being studied.

Example: A researcher studying the impact of a new teaching method can use a case study to collect data about the experiences of a specific group of students who were taught using the new method, including their learning outcomes and perceptions of the method.


How to use data collection tools.

Now that we have discussed some of the commonly used data collection tools, it is essential to understand how to use them effectively. Here are the detailed steps involved in using data collection tools that you can follow:

  • Plan the Research

The first step in using data collection tools effectively is to plan the research carefully. This involves defining the research question or hypothesis, selecting the appropriate methodology, and identifying the target population. A clear research plan helps you select the data collection tool that best aligns with the research objectives, building a solid base for the rest of the steps.

  • Choose the Right Method

Once the research plan is complete, the next step is to choose the right data collection tool. It’s essential to select a tool or method that aligns with the research question, methodology, and target population. You should also weigh the strengths and limitations of each method and choose the one that is most appropriate for your research.

For example, if the research objective is to measure attitudes, beliefs, and behaviors, a survey may be the most appropriate data collection tool.

  • Prepare for Data Collection

Preparing for data collection involves creating a protocol, training data collectors, and testing the tool. At this point, you need to ensure that the data collection process is standardized and all data collectors are familiar with the tool and the research objectives.

Creating a protocol that outlines the steps involved in data collection and data recording is important to ensure consistency in the process. Additionally, training data collectors and testing the tool can help in identifying and addressing any potential issues.

  • Collect Data

This step is where you actually collect the data. It involves administering the tool to the target population. Ensuring that the data collection process is ethical and that all participants give informed consent is essential. The data collection process should be done systematically, and all data should be recorded accurately to ensure reliability and validity.

  • Analyze Data

The last step is to analyze the data. This involves organizing the data, cleaning it, and conducting statistical or qualitative analysis. The choice of analysis method will depend on the research question and methodology.

For example, if the research objective is to compare the means of two groups, a t-test may be used for statistical analysis. On the other hand, if the research objective is to explore a phenomenon, qualitative analysis may be more appropriate.
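To make the t-test example concrete, here is a minimal sketch that computes the test statistic for two independent groups using only the Python standard library. It uses Welch's variant of the t-test (which does not assume equal variances); that choice, the function name `welch_t`, and the sample scores are all illustrative assumptions, not something the article prescribes. Turning the statistic into a p-value requires the t distribution, which a statistics package would normally provide.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # Sample variances (n - 1 in the denominator), scaled by group size.
    se_a = statistics.variance(group_a) / n_a
    se_b = statistics.variance(group_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Made-up scores for two hypothetical groups.
scores_group_1 = [78, 85, 90, 72, 88, 95, 81]
scores_group_2 = [70, 75, 80, 68, 74, 79, 77]

t_stat, dof = welch_t(scores_group_1, scores_group_2)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A larger absolute t value indicates a bigger difference between the group means relative to the variability within the groups.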

Data collection tools are critical in both qualitative and quantitative research, and they help in collecting accurate and reliable data to build a solid foundation for any research project. Selecting the appropriate tool depends on several factors, including the research question, methodology, and target population. Therefore, careful planning, proper preparation, systematic data collection, and accurate data analysis are essential for successful research outcomes.

Lastly, here are answers to some of the most frequently asked questions, so you can jump straight to the one you need.

  • What is the difference between qualitative and quantitative data collection tools?

Qualitative data collection tools are used to collect non-numerical data, such as attitudes, beliefs, and experiences, while quantitative data collection tools are used to collect numerical data, such as measurements and statistics.

  • What are some examples of qualitative data collection tools?

Examples of qualitative data collection tools include interviews, focus groups, observations, and case studies.

  • What are some examples of quantitative data collection tools?

Examples of quantitative data collection tools include surveys, experiments, and statistical analysis.

  • Why is it essential to choose the right data collection tool?

Choosing the right data collection tool is crucial as it can have a significant impact on the accuracy and validity of the data collected. Using an inappropriate tool can lead to biased or incomplete data, making it difficult to draw valid conclusions or make informed decisions.

  • What are some common challenges faced during data collection? 

Some common challenges include difficulty in accessing the target population, low response rates, data collection errors, and ethical concerns. So, always make sure you plan and prepare adequately to address these challenges effectively.

Emidio Amadebai

I am an IT engineer who is passionate about learning and sharing. For almost five years I have worked with and learned from data engineers, data analysts, business analysts, and key decision makers. I am interested in learning more about data science and how to leverage it for better decision-making in my business, and I hope to help you do the same in yours.


Data Collection Methods | Step-by-Step Guide & Examples

Published on 4 May 2022 by Pritha Bhandari.

Data collection is a systematic process of gathering observations or measurements. Whether you are performing research for business, governmental, or academic purposes, data collection allows you to gain first-hand knowledge and original insights into your research problem.

While methods and aims may differ between fields, the overall process of data collection remains largely the same. Before you begin collecting data, you need to consider:

  • The aim of the research
  • The type of data that you will collect
  • The methods and procedures you will use to collect, store, and process the data

To collect high-quality data that is relevant to your purposes, follow these four steps.

Table of contents

  • Step 1: Define the aim of your research
  • Step 2: Choose your data collection method
  • Step 3: Plan your data collection procedures
  • Step 4: Collect the data
  • Frequently asked questions about data collection

Step 1: Define the aim of your research

Before you start the process of data collection, you need to identify exactly what you want to achieve. You can start by writing a problem statement: what is the practical or scientific issue that you want to address, and why does it matter?

Next, formulate one or more research questions that precisely define what you want to find out. Depending on your research questions, you might need to collect quantitative or qualitative data:

  • Quantitative data is expressed in numbers and graphs and is analysed through statistical methods.
  • Qualitative data is expressed in words and analysed through interpretations and categorisations.

If your aim is to test a hypothesis, measure something precisely, or gain large-scale statistical insights, collect quantitative data. If your aim is to explore ideas, understand experiences, or gain detailed insights into a specific context, collect qualitative data.

If you have several aims, you can use a mixed methods approach that collects both types of data.

For example, in a study of how employees perceive their managers:

  • Your first aim is to assess whether there are significant differences in perceptions of managers across different departments and office locations.
  • Your second aim is to gather meaningful feedback from employees to explore new ideas for how managers can improve.


Step 2: Choose your data collection method

Based on the data you want to collect, decide which method is best suited for your research.

  • Experimental research is primarily a quantitative method.
  • Interviews, focus groups, and ethnographies are qualitative methods.
  • Surveys, observations, archival research, and secondary data collection can be quantitative or qualitative methods.

Carefully consider what method you will use to gather data that helps you directly answer your research questions.

Step 3: Plan your data collection procedures

When you know which method(s) you are using, you need to plan exactly how you will implement them. What procedures will you follow to make accurate observations or measurements of the variables you are interested in?

For instance, if you’re conducting surveys or interviews, decide what form the questions will take; if you’re conducting an experiment, make decisions about your experimental design.

Operationalisation

Sometimes your variables can be measured directly: for example, you can collect data on the average age of employees simply by asking for dates of birth. However, often you’ll be interested in collecting data on more abstract concepts or variables that can’t be directly observed.

Operationalisation means turning abstract conceptual ideas into measurable observations. When planning how you will collect data, you need to translate the conceptual definition of what you want to study into the operational definition of what you will actually measure.

For example, the leadership of managers could be operationalised in two ways:

  • You ask managers to rate their own leadership skills on 5-point scales assessing the ability to delegate, decisiveness, and dependability.
  • You ask their direct employees to provide anonymous feedback on the managers regarding the same topics.

You may need to develop a sampling plan to obtain data systematically. This involves defining a population, the group you want to draw conclusions about, and a sample, the group you will actually collect data from.

Your sampling method will determine how you recruit participants or obtain measurements for your study. To decide on a sampling method you will need to consider factors like the required sample size, accessibility of the sample, and time frame of the data collection.
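As a rough illustration of drawing a sample from a population, the sketch below shows two common sampling methods using Python's standard `random` module: a simple random sample and a proportionate stratified sample. The choice of these two methods, the function names, the `employees` sampling frame, and the 20% fraction are all hypothetical and only serve the example.

```python
import random


def simple_random_sample(population, size, seed=0):
    """Draw `size` units without replacement, each unit equally likely."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return rng.sample(population, size)


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Sample the same fraction from every stratum (at least one unit each)."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical sampling frame: 100 employees across three departments.
departments = ["sales"] * 60 + ["engineering"] * 30 + ["hr"] * 10
employees = [{"id": i, "department": d} for i, d in enumerate(departments)]

srs = simple_random_sample(employees, 20)
strat = stratified_sample(employees, lambda e: e["department"], 0.2)
print(len(srs), len(strat))
```

Stratifying by department guarantees that small groups appear in the sample, whereas a simple random sample may miss them by chance.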

Standardising procedures

If multiple researchers are involved, write a detailed manual to standardise data collection procedures in your study.

This means laying out specific step-by-step instructions so that everyone in your research team collects data in a consistent way – for example, by conducting experiments under the same conditions and using objective criteria to record and categorise observations.

This helps ensure the reliability of your data, and you can also use it to replicate the study in the future.

Creating a data management plan

Before beginning data collection, you should also decide how you will organise and store your data.

  • If you are collecting data from people, you will likely need to anonymise and safeguard the data to prevent leaks of sensitive information (e.g. names or identity numbers).
  • If you are collecting data via interviews or pencil-and-paper formats, you will need to perform transcriptions or data entry in systematic ways to minimise distortion.
  • You can prevent loss of data by having an organisation system that is routinely backed up.
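One way to act on the first point is to replace direct identifiers with stable participant codes before the data is stored. The sketch below is a minimal illustration using a keyed hash from Python's standard library; the field names, the key, and the helper names are hypothetical. Note that this is pseudonymisation rather than full anonymisation: anyone holding the key can regenerate the codes, so the key has to be stored separately from the data.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a stable code that cannot be reversed without the key."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def strip_identifiers(record, secret_key, id_field="name",
                      drop_fields=("identity_number",)):
    """Return a copy of the record with direct identifiers removed."""
    cleaned = {k: v for k, v in record.items()
               if k != id_field and k not in drop_fields}
    cleaned["participant_code"] = pseudonymise(record[id_field], secret_key)
    return cleaned


# Hypothetical key and record, for illustration only.
key = b"replace-with-a-secret-stored-separately"
raw = {"name": "Jane Doe", "identity_number": "000-00-0000", "rating": 4}

safe = strip_identifiers(raw, key)
print(sorted(safe))
```

Because the same name always maps to the same code, responses from one participant can still be linked across datasets without storing the name itself.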

Step 4: Collect the data

Finally, you can implement your chosen methods to measure or observe the variables you are interested in.

For example, closed-ended survey questions might ask participants to rate their manager’s leadership skills on scales from 1 to 5. The data produced is numerical and can be statistically analysed for averages and patterns.
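A minimal sketch of what analysing such 1-to-5 ratings for averages and patterns could look like, using only the Python standard library. The ratings and the function name `summarise_ratings` are made up for illustration.

```python
from collections import Counter
import statistics


def summarise_ratings(ratings, scale=(1, 5)):
    """Return count, mean, median, and frequency of each scale point."""
    low, high = scale
    invalid = [r for r in ratings if not low <= r <= high]
    if invalid:
        raise ValueError(f"ratings outside the {low}-{high} scale: {invalid}")
    counts = Counter(ratings)
    return {
        "n": len(ratings),
        "mean": statistics.mean(ratings),
        "median": statistics.median(ratings),
        # Include scale points nobody chose, so gaps stay visible.
        "distribution": {p: counts.get(p, 0) for p in range(low, high + 1)},
    }


# Made-up ratings from ten hypothetical respondents.
ratings = [4, 5, 3, 4, 2, 5, 4, 3, 4, 1]
summary = summarise_ratings(ratings)
print(summary)
```

Rejecting out-of-range values up front catches data entry mistakes before they distort the averages.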

To ensure that high-quality data is recorded in a systematic way, here are some best practices:

  • Record all relevant information as and when you obtain data. For example, note down whether or how lab equipment is recalibrated during an experimental study.
  • Double-check manual data entry for errors.
  • If you collect quantitative data, you can assess the reliability and validity to get an indication of your data quality.
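One common way to double-check manual data entry is double entry: two people key in the same forms independently and the two versions are compared. The sketch below assumes each version is a dictionary keyed by a participant ID; the function name and the sample records are hypothetical.

```python
def compare_entries(first_pass, second_pass):
    """List (record_id, field, first_value, second_value) for every disagreement."""
    mismatches = []
    for record_id in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(record_id, {})
        b = second_pass.get(record_id, {})
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                mismatches.append((record_id, field, a.get(field), b.get(field)))
    return mismatches


# Two independent keyings of the same (made-up) paper forms.
entry_1 = {"P01": {"age": 34, "rating": 4}, "P02": {"age": 41, "rating": 5}}
entry_2 = {"P01": {"age": 34, "rating": 4}, "P02": {"age": 14, "rating": 5}}

for mismatch in compare_entries(entry_1, entry_2):
    print(mismatch)
```

Each reported mismatch points to a value that should be checked against the original form; records missing from one version show up as fields with `None` on that side.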

Frequently asked questions about data collection

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organisations.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g., understanding the needs of your consumers or user testing your website).
  • You can control and standardise the process for high reliability and validity (e.g., choosing appropriate measurements and sampling methods).

However, there are also some drawbacks: data collection can be time-consuming, labour-intensive, and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to test a hypothesis by systematically collecting and analysing data, while qualitative methods allow you to explore ideas and experiences in depth.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).
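As an illustration of putting a number on consistency, the sketch below computes Cronbach's alpha, a widely used statistic for the internal consistency of a multi-item scale. The article does not name this statistic; it is used here as one possible example, and the function name and responses are made up. The formula is alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items.

```python
import statistics


def cronbach_alpha(responses):
    """responses: one list of k item ratings per respondent."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Variance of each item (column) across respondents.
    item_variances = [statistics.variance(col) for col in zip(*responses)]
    # Variance of each respondent's total score.
    total_variance = statistics.variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary, alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up answers: five respondents, three items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 4],
    [2, 3, 3],
    [4, 4, 5],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Values closer to 1 mean the items move together more consistently. A high alpha says nothing about validity, since a scale can be consistent and still measure the wrong thing.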

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

In mixed methods research, you use both qualitative and quantitative data collection and analysis methods to answer your research question.

Operationalisation means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioural avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data, it’s important to consider how you will operationalise the variables that you want to measure.


Bhandari, P. (2022, May 04). Data Collection Methods | Step-by-Step Guide & Examples. Scribbr. Retrieved 9 April 2024, from https://www.scribbr.co.uk/research-methods/data-collection-guide/


A handbook of data collection tools

  • https://www.betterevaluation.org/sites/default/files/a_handbook_of_data_collection_tools.pdf (PDF, 447.48 KB)

This handbook, written by Jane Reisman, Anne Gienapp and Sarah Stachowiak for the Annie E. Casey Foundation, is a companion to A guide to measuring advocacy and policy. The handbook includes a range of practical tools and processes that can be used to gather data about policy and advocacy initiatives.

"These examples are actual or modified tools used for evaluating existing campaigns or related efforts. We aimed to identify a wide range of data collection methods rather than rely primarily on traditional pre/post surveys and wide opinion polling. When possible, we included innovative applications of tools or methods to provide a broad range of options for grantees and funders.

We primarily identified sample tools to measure the core outcome areas related to social change or policy change. For each outcome area, you will find several data collection options as well as relevant methodological notes on ways to implement or adapt particular methods. In addition, we have included examples of tools and methods related to other types of evaluation design." (Reisman, Gienapp and Stachowiak, 2007)

  • Interview Protocol: Changes in Awareness and Prioritization
  • Focus Group: Changes in Attitudes
  • Meeting Observation Checklist: Changes in Community Members’ Beliefs about the Importance of a Particular Issue
  • Survey: Changes in Prioritization of Specific Issues
  • Rolling Sample Survey: Changes in Community Awareness
  • Self-Assessment Tool: Alliance for Justice Advocacy Capacity Assessment
  • Self-Assessment: Spider Diagram
  • Self-Assessment: KIDS COUNT Self-Assessment Tool
  • Outcome Area: Strengthened Alliances
  • Tools for Measuring Public Support
  • Logs: Increased Public Involvement in an Issue
  • Log: Increased Engagement of Champions
  • Survey: Increased Public Involvement
  • Self-Assessment: Checklist for Mobilization and Advocacy
  • Tools for Measuring Media Support
  • Media Tracking Form: Increased Media Coverage
  • Composite News Scores: Media Impact
  • Log: Increased Visibility
  • Log: Legislative Process Tracking
  • Log: Policy Tracking Analysis
  • Survey: Assessing Number and Type of Policies
  • Log: Monitoring Policy Implementation
  • Environmental Assessments: Changes in Physical Environments
  • Outcome Area: Changes in Impact
  • Tools and Methods for Other Evaluation Designs
  • Method: Appreciative Inquiry Approach to Process Evaluation
  • Measuring Short-Term Incremental Objectives
  • Method: Case Studies
  • Method: Theory-based Evaluation
  • Method: Use of Coding Protocol with Qualitative Data
  • Reporting Tool: Alliance for Justice Advocacy Evaluation Tool

Reisman, J., Gienapp, A. and Stachowiak, S. (2007). A handbook of data collection tools. Annie E. Casey Foundation. Retrieved from: http://www.organizationalresearch.com/publicationsandresources/a_handbook_of_data_collection_tools.pdf



Data collection in qualitative research

Volume 21, Issue 3

  • David Barrett 1
  • Alison Twycross 2 (http://orcid.org/0000-0003-1130-5603)
  • 1 Faculty of Health Sciences, University of Hull, Hull, UK
  • 2 School of Health and Social Care, London South Bank University, London, UK
  • Correspondence to Dr David Barrett, Faculty of Health Sciences, University of Hull, Hull HU6 7RX, UK; D.I.Barrett{at}hull.ac.uk

https://doi.org/10.1136/eb-2018-102939


Qualitative research methods allow us to better understand the experiences of patients and carers; they allow us to explore how decisions are made and provide us with a detailed insight into how interventions may alter care. To develop such insights, qualitative research requires data which are holistic, rich and nuanced, allowing themes and findings to emerge through careful analysis. This article provides an overview of the core approaches to data collection in qualitative research, exploring their strengths, weaknesses and challenges.

Interviews

Collecting data through interviews with participants is a characteristic of many qualitative studies. Interviews give the most direct and straightforward approach to gathering detailed and rich data regarding a particular phenomenon. The type of interview used to collect data can be tailored to the research question, the characteristics of participants and the preferred approach of the researcher. Interviews are most often carried out face-to-face, though the use of telephone interviews to overcome geographical barriers to participant recruitment is becoming more prevalent. 1

A common approach in qualitative research is the semistructured interview, where core elements of the phenomenon being studied are explicitly asked about by the interviewer. A well-designed semistructured interview should ensure data are captured in key areas while still allowing flexibility for participants to bring their own personality and perspective to the discussion. Finally, interviews can be much more rigidly structured to provide greater control for the researcher, essentially becoming questionnaires where responses are verbal rather than written.

Deciding where to place an interview design on this ‘structural spectrum’ will depend on the question to be answered and the skills of the researcher. A very structured approach is easy to administer and analyse but may not allow the participant to express themselves fully. At the other end of the spectrum, an open approach allows for freedom and flexibility, but requires the researcher to walk an investigative tightrope that maintains the focus of an interview without forcing participants into particular areas of discussion.

Example of an interview schedule 3

What do you think is the most effective way of assessing a child’s pain?

Have you come across any issues that make it difficult to assess a child’s pain?

What pain-relieving interventions do you find most useful and why?

When managing pain in children what is your overall aim?

Whose responsibility is pain management?

What involvement do you think parents should have in their child’s pain management?

What involvement do children have in their pain management?

Is there anything that currently stops you managing pain as well as you would like?

What would help you manage pain better?

Interviews present several challenges to researchers. Most interviews are recorded and will need transcribing before analysing. This can be extremely time-consuming, with 1 hour of interview requiring 5–6 hours to transcribe. 4 The analysis itself is also time-consuming, requiring transcriptions to be pored over word-for-word and line-by-line. Interviews also present the problem of bias: the researcher needs to take care to avoid leading questions or providing non-verbal signals that might influence the responses of participants.
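The transcription ratio quoted above (5–6 hours of transcription per hour of interview) can be turned into a quick planning estimate. The sketch below is only arithmetic on that ratio; the function name and the study size of twelve 45-minute interviews are hypothetical.

```python
def transcription_hours(interview_hours, low_ratio=5, high_ratio=6):
    """Return the (low, high) estimate of hours needed to transcribe."""
    return interview_hours * low_ratio, interview_hours * high_ratio


# Hypothetical study: 12 interviews of 45 minutes (0.75 hours) each.
total_interview_hours = 12 * 0.75
low, high = transcription_hours(total_interview_hours)
print(f"{total_interview_hours} h of interviews: {low} to {high} h of transcription")
```

Even a small interview study can therefore require more than a working week of transcription before analysis begins.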

Focus groups

The focus group is a method of data collection in which a moderator/facilitator (usually a coresearcher) speaks with a group of 6–12 participants about issues related to the research question. As an approach, the focus group offers qualitative researchers an efficient method of gathering the views of many participants at one time. Also, the fact that many people are discussing the same issue together can result in an enhanced level of debate, with the moderator often able to step back and let the focus group enter into a free-flowing discussion. 5 This provides an opportunity to gather rich data from a specific population about a particular area of interest, such as barriers perceived by student nurses when trying to communicate with patients with cancer. 6

From a participant perspective, the focus group may provide a more relaxing environment than a one-to-one interview; they will not need to be involved with every part of the discussion and may feel more comfortable expressing views when they are shared by others in the group. Focus groups also allow participants to ‘bounce’ ideas off each other which sometimes results in different perspectives emerging from the discussion. However, focus groups are not without their difficulties. As with interviews, focus groups provide a vast amount of data to be transcribed and analysed, with discussions often lasting 1–2 hours. Moderators also need to be highly skilled to ensure that the discussion can flow while remaining focused and that all participants are encouraged to speak, while ensuring that no individuals dominate the discussion. 7

Observation

Participant and non-participant observation are powerful tools for collecting qualitative data, as they give nurse researchers an opportunity to capture a wide array of information (such as verbal and non-verbal communication, actions (eg, techniques of providing care) and environmental factors) within a care setting. Another advantage of observation is that the researcher gains a first-hand picture of what actually happens in clinical practice. 8 If the researcher is adopting a qualitative approach to observation they will normally record field notes. Field notes can take many forms, such as a chronological log of what is happening in the setting, a description of what has been observed, a record of conversations with participants or an expanded account of impressions from the fieldwork. 9 10

As with other qualitative data collection techniques, observation provides an enormous amount of data to be captured and analysed; one approach to helping with collection and analysis is to digitally record observations to allow for repeated viewing. 11 Observation also provides the researcher with some unique methodological and ethical challenges. Methodologically, the act of being observed may change the behaviour of the participant (often referred to as the ‘Hawthorne effect’), impacting on the value of findings. However, most researchers report a process of habituation taking place where, after a relatively short period of time, those being observed revert to their normal behaviour. Ethically, the researcher will need to consider when and how they should intervene if they view poor practice that could put patients at risk.

The three core approaches to data collection in qualitative research—interviews, focus groups and observation—provide researchers with rich and deep insights. All methods require skill on the part of the researcher, and all produce a large amount of raw data. However, with careful and systematic analysis 12 the data yielded with these methods will allow researchers to develop a detailed understanding of patient experiences and the work of nurses.

  • Twycross AM ,
  • Williams AM ,
  • Huang MC , et al
  • Onwuegbuzie AJ ,
  • Dickinson WB ,
  • Leech NL , et al
  • Twycross A ,
  • Emerson RM ,
  • Meriläinen M ,
  • Ala-Kokko T

Competing interests None declared.

Patient consent Not required.

Provenance and peer review Commissioned; internally peer reviewed.


Table of Contents

  • What Is Data Collection?
  • Why Do We Need Data Collection?
  • What Are the Different Data Collection Methods?
  • Data Collection Tools
  • The Importance of Ensuring Accurate and Appropriate Data Collection
  • Issues Related to Maintaining the Integrity of Data Collection
  • What Are Common Challenges in Data Collection?
  • What Are the Key Steps in the Data Collection Process?
  • Data Collection Considerations and Best Practices
  • Choose the Right Data Science Program
  • Are You Interested in a Career in Data Science?
  • What Is Data Collection: Methods, Types, Tools

What is Data Collection? Definition, Types, Tools, and Techniques

Data collection is the process of gathering and analyzing accurate data from various sources to find answers to research problems, identify trends and probabilities, and evaluate possible outcomes. Knowledge is power, information is knowledge, and data is information in digitized form, at least as defined in IT. Hence, data is power. But before you can leverage that data into a successful strategy for your organization or business, you need to gather it. That’s your first step.

So, to help you get the process started, we shine a spotlight on data collection. What exactly is it? Believe it or not, it’s more than just doing a Google search! Furthermore, what are the different types of data collection? And what kinds of data collection tools and data collection techniques exist?

If you want to get up to speed on what the data collection process is, you’ve come to the right place.


Data collection is the process of collecting and evaluating information or data from multiple sources to find answers to research problems, answer questions, evaluate outcomes, and forecast trends and probabilities. It is an essential phase in all types of research, analysis, and decision-making, including that done in the social sciences, business, and healthcare.

Accurate data collection is necessary to make informed business decisions, ensure quality assurance, and keep research integrity.

During data collection, the researchers must identify the data types, the sources of data, and what methods are being used. We will soon see that there are many different data collection methods. There is heavy reliance on data collection in research, commercial, and government fields.

Before an analyst begins collecting data, they must answer three questions first:

  • What’s the goal or purpose of this research?
  • What kinds of data are they planning on gathering?
  • What methods and procedures will be used to collect, store, and process the information?

Additionally, we can break up data into qualitative and quantitative types. Qualitative data covers descriptions such as color, size, quality, and appearance. Quantitative data, unsurprisingly, deals with numbers, such as statistics, poll numbers, percentages, etc.

Before a judge makes a ruling in a court case or a general creates a plan of attack, they must have as many relevant facts as possible. The best courses of action come from informed decisions, and information and data are synonymous.

The concept of data collection isn’t a new one, as we’ll see later, but the world has changed. There is far more data available today, and it exists in forms that were unheard of a century ago. The data collection process has had to change and grow with the times, keeping pace with technology.

Whether you’re in the world of academia, trying to conduct research, or part of the commercial sector, thinking of how to promote a new product, you need data collection to help you make better choices.

Now that you know what data collection is and why we need it, let's take a look at the different methods of data collection. While the phrase “data collection” may sound all high-tech and digital, it doesn’t necessarily entail things like computers, big data, and the internet. Data collection could mean a telephone survey, a mail-in comment card, or even some guy with a clipboard asking passersby some questions. But let’s see if we can sort the different data collection methods into a semblance of organized categories.

Primary and secondary methods of data collection are two approaches used to gather information for research or analysis purposes. Let's explore each data collection method in detail:

1. Primary Data Collection:

Primary data collection involves the collection of original data directly from the source or through direct interaction with the respondents. This method allows researchers to obtain firsthand information specifically tailored to their research objectives. There are various techniques for primary data collection, including:

a. Surveys and Questionnaires: Researchers design structured questionnaires or surveys to collect data from individuals or groups. These can be conducted through face-to-face interviews, telephone calls, mail, or online platforms.

b. Interviews: Interviews involve direct interaction between the researcher and the respondent. They can be conducted in person, over the phone, or through video conferencing. Interviews can be structured (with predefined questions), semi-structured (allowing flexibility), or unstructured (more conversational).

c. Observations: Researchers observe and record behaviors, actions, or events in their natural setting. This method is useful for gathering data on human behavior, interactions, or phenomena without direct intervention.

d. Experiments: Experimental studies involve the manipulation of variables to observe their impact on the outcome. Researchers control the conditions and collect data to draw conclusions about cause-and-effect relationships.

e. Focus Groups: Focus groups bring together a small group of individuals who discuss specific topics in a moderated setting. This method helps in understanding opinions, perceptions, and experiences shared by the participants.

2. Secondary Data Collection:

Secondary data collection involves using existing data collected by someone else for a purpose different from the original intent. Researchers analyze and interpret this data to extract relevant information. Secondary data can be obtained from various sources, including:

a. Published Sources: Researchers refer to books, academic journals, magazines, newspapers, government reports, and other published materials that contain relevant data.

b. Online Databases: Numerous online databases provide access to a wide range of secondary data, such as research articles, statistical information, economic data, and social surveys.

c. Government and Institutional Records: Government agencies, research institutions, and organizations often maintain databases or records that can be used for research purposes.

d. Publicly Available Data: Data shared by individuals, organizations, or communities on public platforms, websites, or social media can be accessed and utilized for research.

e. Past Research Studies: Previous research studies and their findings can serve as valuable secondary data sources. Researchers can review and analyze the data to gain insights or build upon existing knowledge.

Now that we’ve explained the various techniques, let’s narrow our focus even further by looking at some specific tools. For example, we mentioned interviews as a technique, but we can further break that down into different interview types (or “tools”).

Word Association

The researcher gives the respondent a set of words and asks them what comes to mind when they hear each word.

Sentence Completion

Researchers use sentence completion to understand what kind of ideas the respondent has. This tool involves giving an incomplete sentence and seeing how the interviewee finishes it.

Role-Playing

Respondents are presented with an imaginary situation and asked how they would act or react if it were real.

In-Person Surveys

The researcher asks questions in person.

Online/Web Surveys

These surveys are easy to accomplish, but some users may be unwilling to answer truthfully, if at all.

Mobile Surveys

These surveys take advantage of the increasing proliferation of mobile technology. Mobile collection surveys rely on mobile devices like tablets or smartphones to conduct surveys via SMS or mobile apps.

Phone Surveys

No researcher can call thousands of people at once, so they need a third party to handle the chore. However, many people have call screening and won’t answer.

Observation

Sometimes, the simplest method is the best. Researchers who make direct observations collect data quickly and easily, with little intrusion or third-party bias. Naturally, it’s only effective in small-scale situations.

Accurate data collection is crucial to preserving the integrity of research, regardless of the field of study or the preferred way of defining data (quantitative or qualitative). Errors are less likely to occur when appropriate data collection tools are used, whether they are newly developed, modified, or already existing.

The effects of incorrectly performed data collection include the following:

  • Erroneous conclusions that squander resources
  • Decisions that compromise public policy
  • Inability to answer research questions accurately
  • Harm to human or animal participants
  • Misleading other researchers into pursuing fruitless avenues of investigation
  • Inability to replicate and validate the study

Although the degree of impact from flawed data collection may vary by discipline and by the type of investigation, there is potential for disproportionate harm when these study findings are used to support recommendations for public policy.

Let us now look at the various issues that we might face while maintaining the integrity of data collection.

The main rationale for maintaining data integrity is to support the detection of errors in the data collection process, whether they are made intentionally (deliberate falsifications) or not (systematic or random errors).

Quality assurance and quality control are two strategies that help protect data integrity and guarantee the scientific validity of study results.

Each strategy is used at various stages of the research timeline:

  • Quality control - activities that take place during and after data collection
  • Quality assurance - activities that take place before data collection begins

Let us explore each of them in more detail now.

Quality Assurance

Because quality assurance precedes data collection, its primary goal is "prevention" (i.e., forestalling problems with data collection). Prevention is the best way to protect the integrity of data collection. The clearest example of this proactive measure is a standardized protocol set out in a comprehensive and detailed procedures manual for data collection.

Poorly written manuals increase the likelihood of failing to spot problems and errors early in the research effort. These shortcomings can show up in several ways:

  • Failure to identify the specific content and strategies for training or retraining the staff members who collect data
  • An incomplete list of the items to be collected
  • No system in place to track changes to procedures that may occur as the investigation continues
  • A vague description of the data collection tools to be used, instead of detailed, step-by-step instructions on how to administer tests
  • Uncertainty about the timing, methods, and identity of the person or people responsible for reviewing the data
  • Unclear instructions for using, adjusting, and calibrating the data collection equipment

Now, let us look at how to ensure Quality Control.


Quality Control

Although quality control activities (detection/monitoring and intervention) take place during and after data collection, the specifics should be carefully documented in the procedures manual. A clearly defined communication structure is a prerequisite for establishing monitoring systems. Once data collection problems are discovered, there should be no ambiguity about the flow of information between the primary investigators and staff. A poorly designed communication system encourages lax oversight and reduces opportunities for error detection.

Detection or monitoring can take the form of direct staff observation during site visits, conference calls, or regular and frequent reviews of data reports to spot inconsistencies, extreme values, or invalid codes. Site visits might not be appropriate for all disciplines. Still, without routine auditing of records, whether qualitative or quantitative, it will be challenging for investigators to confirm that data collection is taking place in accordance with the methods defined in the manual. Additionally, quality control determines the appropriate responses, or "actions," to correct flawed data collection procedures and reduce recurrences.
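A routine review of data reports can be partly automated. The sketch below is only an illustration under stated assumptions: the field names, the allowed codes, and the age range are hypothetical stand-ins for whatever a study's own procedures manual defines.

```python
from dataclasses import dataclass

# Hypothetical codebook: allowed codes and plausible ranges per field.
VALID_CODES = {"smoker": {"Y", "N", "U"}}
VALID_RANGES = {"age": (18, 99)}


@dataclass
class Issue:
    record_id: str
    field: str
    value: object
    reason: str


def audit(records):
    """Return a list of Issue objects for invalid codes and out-of-range values."""
    issues = []
    for rec in records:
        for field, allowed in VALID_CODES.items():
            if rec.get(field) not in allowed:
                issues.append(Issue(rec["id"], field, rec.get(field), "invalid code"))
        for field, (low, high) in VALID_RANGES.items():
            value = rec.get(field)
            if not isinstance(value, (int, float)) or not low <= value <= high:
                issues.append(Issue(rec["id"], field, value, "out of range"))
    return issues


records = [
    {"id": "r1", "age": 34, "smoker": "N"},
    {"id": "r2", "age": 340, "smoker": "Y"},  # likely a keying error
    {"id": "r3", "age": 27, "smoker": "X"},   # code not in the codebook
]
issues = audit(records)
for issue in issues:
    print(issue.record_id, issue.field, issue.value, issue.reason)
```

A check like this only flags records for follow-up; deciding whether a flagged value is a genuine error still belongs to the people named in the procedures manual.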

Problems with data collection, for instance, that call for immediate action include:

  • Fraud or misbehavior
  • Systematic mistakes, procedure violations 
  • Individual data items with errors
  • Issues with certain staff members or a site's performance 

In the social and behavioral sciences, where primary data collection involves human subjects, researchers are trained to include one or more secondary measures that can be used to verify the quality of the information obtained from each subject.

For instance, a researcher conducting a survey might be interested in learning more about the prevalence of risky behaviors among young adults, as well as the social factors that influence how likely and how frequent these behaviors are. Let us now explore the common challenges with regard to data collection.

Several challenges are common when collecting data. Let us explore a few of them so that we can understand and avoid them.

Data Quality Issues

The main threat to the broad and successful application of machine learning is poor data quality. Data quality must be your top priority if you want to make technologies like machine learning work for you. Let's talk about some of the most prevalent data quality problems in this blog article and how to fix them.

Inconsistent Data

When working with various data sources, it's conceivable that the same information will have discrepancies between sources. The differences could be in formats, units, or occasionally spellings. The introduction of inconsistent data might also occur during firm mergers or relocations. Inconsistencies in data have a tendency to accumulate and reduce the value of data if they are not continually resolved. Organizations that have heavily focused on data consistency do so because they only want reliable data to support their analytics.
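As a minimal sketch of resolving such discrepancies, the example below maps spelling variants and units onto one canonical form before records from two sources are compared. The alias table, the unit table, and the record fields are hypothetical and chosen only for illustration.

```python
# Hypothetical lookup tables; a real project would derive these from its own sources.
COUNTRY_ALIASES = {
    "usa": "United States",
    "u.s.": "United States",
    "united states": "United States",
}
TO_KILOGRAMS = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}


def normalize(record):
    """Return a copy of the record with a canonical country name and weight in kilograms."""
    country_key = record["country"].strip().lower()
    factor = TO_KILOGRAMS[record["weight_unit"].strip().lower()]
    return {
        "country": COUNTRY_ALIASES.get(country_key, record["country"].strip()),
        "weight_kg": round(record["weight"] * factor, 3),
    }


# The same shipment as it might be recorded in two different systems.
source_a = {"country": "USA", "weight": 2.0, "weight_unit": "kg"}
source_b = {"country": " united states ", "weight": 2000, "weight_unit": "g"}
normalized_a = normalize(source_a)
normalized_b = normalize(source_b)
print(normalized_a == normalized_b)
```

Running the normalization at ingestion time, rather than at analysis time, is one way to keep inconsistencies from accumulating.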

Data Downtime

Data is the driving force behind the decisions and operations of data-driven businesses. However, there may be brief periods when their data is unreliable or not prepared. Customer complaints and subpar analytical outcomes are only two ways that this data unavailability can have a significant impact on businesses. A data engineer spends about 80% of their time updating, maintaining, and guaranteeing the integrity of the data pipeline. In order to ask the next business question, there is a high marginal cost due to the lengthy operational lead time from data capture to insight.

Schema modifications and migration problems are just two examples of the causes of data downtime. Data pipelines can be difficult to manage due to their size and complexity. Data downtime must be continuously monitored and reduced through automation.

Ambiguous Data

Even with thorough oversight, some errors can still occur in massive databases or data lakes. For data streaming at a fast speed, the issue becomes more overwhelming. Spelling mistakes can go unnoticed, formatting difficulties can occur, and column headings might be misleading. This ambiguous data might cause a number of problems for reporting and analytics.


Duplicate Data

Streaming data, local databases, and cloud data lakes are just a few of the sources of data that modern enterprises must contend with. They might also have application and system silos. These sources are likely to duplicate and overlap each other quite a bit. For instance, duplicate contact information has a substantial impact on customer experience. If certain prospects are ignored while others are engaged repeatedly, marketing campaigns suffer. The likelihood of biased analytical outcomes increases when duplicate data are present. It can also result in ML models with biased training data.
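One simple way to handle duplicate contact information is to compare records on a normalized key. The sketch below assumes, purely for illustration, that the email address identifies a contact and that the first record seen should be kept; real systems often need fuzzier matching on names and addresses.

```python
def dedupe_contacts(contacts):
    """Keep the first record seen for each normalized email address."""
    seen = set()
    unique = []
    for contact in contacts:
        key = contact["email"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(contact)
    return unique


# Made-up contacts; the first two differ only in formatting.
contacts = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "A. Lovelace", "email": " ADA@example.com "},
    {"name": "Grace Hopper", "email": "grace@example.com"},
]
unique_contacts = dedupe_contacts(contacts)
print(len(contacts), "->", len(unique_contacts))
```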

Too Much Data

While we emphasize data-driven analytics and its advantages, too much data is itself a data quality problem. There is a risk of getting lost in an abundance of data when searching for information pertinent to your analytical efforts. Data scientists, data analysts, and business users devote 80% of their work to finding and organizing the appropriate data. With an increase in data volume, other problems with data quality become more serious, particularly when dealing with streaming data and big files or databases.

Inaccurate Data

For highly regulated businesses like healthcare, data accuracy is crucial. Given the current experience, it is more important than ever to increase the data quality for COVID-19 and later pandemics. Inaccurate information does not provide you with a true picture of the situation and cannot be used to plan the best course of action. Personalized customer experiences and marketing strategies underperform if your customer data is inaccurate.

Data inaccuracies can be attributed to a number of things, including data decay, human error, and data drift. Worldwide, data is reported to decay at a rate of about 3% per month, which is quite concerning. Data integrity can be compromised while being transferred between different systems, and data quality might deteriorate with time.
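To see what a monthly decay rate of about 3% implies over a year, the short calculation below assumes that the rate compounds, meaning each month 3% of the records that are still accurate go stale. That compounding model is an assumption made for illustration; the article does not say how the figure was measured.

```python
MONTHLY_DECAY = 0.03  # the roughly 3% per month figure quoted above


def fraction_still_valid(months, monthly_decay=MONTHLY_DECAY):
    """Fraction of records still accurate, assuming decay compounds monthly."""
    return (1 - monthly_decay) ** months


remaining = fraction_still_valid(12)
print(f"{remaining:.1%} still valid, {1 - remaining:.1%} decayed after 12 months")
```

Under this assumption, roughly 69% of records would still be accurate after twelve months, so close to a third of an untouched data set would need refreshing each year.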

Hidden Data

The majority of businesses only utilize a portion of their data, with the remainder sometimes being lost in data silos or discarded in data graveyards. For instance, the customer service team might not receive client data from sales, missing an opportunity to build more precise and comprehensive customer profiles. Hidden data means missed opportunities to develop novel products, enhance services, and streamline procedures.

Finding Relevant Data

Finding relevant data is not easy. There are several factors that we need to consider while trying to find relevant data, including:

  • Relevant domain
  • Relevant demographics
  • Relevant time period

Data that is not relevant to our study on any of these factors is of little use, and we cannot effectively proceed with its analysis. This could lead to incomplete research or analysis, repeated re-collection of data, or shutting down the study.

Deciding the Data to Collect

Determining what data to collect is one of the most important decisions in data collection and should be one of the first. We must choose the subjects the data will cover, the sources we will use to gather it, and the quantity of information we will require. Our answers to these questions will depend on our aims, or what we expect to achieve using our data. As an illustration, we may choose to gather information on the categories of articles that website visitors between the ages of 20 and 50 most frequently access. We can also decide to compile data on the typical age of all the clients who made a purchase from our business over the previous month.

Not addressing this could lead to double work and collection of irrelevant data or ruining your study as a whole.

Dealing With Big Data

Big data refers to extremely large data sets with more intricate and diversified structures. These traits typically make storing and analyzing the data, and applying further methods to extract results, more challenging. The term applies especially to data sets so large or complex that conventional data processing tools are insufficient, such as the overwhelming amount of structured and unstructured data that a business faces on a daily basis.

The amount of data produced by healthcare applications, the internet, social networking sites, sensor networks, and many other businesses is growing rapidly as a result of recent technological advancements. Big data refers to the vast volume of data created from numerous sources in a variety of formats at extremely fast rates. Dealing with this kind of data is one of the many challenges of data collection and is a crucial step toward collecting effective data.

Low Response and Other Research Issues

Poor design and low response rates were shown to be two issues with data collecting, particularly in health surveys that used questionnaires. This might lead to an insufficient or inadequate supply of data for the study. Creating an incentivized data collection program might be beneficial in this case to get more responses.

Now, let us look at the key steps in the data collection process.

There are five key steps in the data collection process. They are explained briefly below.

1. Decide What Data You Want to Gather

The first thing that we need to do is decide what information we want to gather. We must choose the subjects the data will cover, the sources we will use to gather it, and the quantity of information that we would require. For instance, we may choose to gather information on the categories of products that an average e-commerce website visitor between the ages of 30 and 45 most frequently searches for. 

2. Establish a Deadline for Data Collection

The process of creating a strategy for data collection can now begin. We should set a deadline for our data collection at the outset of our planning phase. Some forms of data we might want to continuously collect. We might want to build up a technique for tracking transactional data and website visitor statistics over the long term, for instance. However, we will track the data throughout a certain time frame if we are tracking it for a particular campaign. In these situations, we will have a schedule for when we will begin and finish gathering data. 

3. Select a Data Collection Approach

We will select the data collection technique that will serve as the foundation of our data gathering plan at this stage. We must take into account the type of information that we wish to gather, the time period during which we will receive it, and the other factors we decide on to choose the best gathering strategy.

4. Gather Information

Once our plan is complete, we can put it into action and begin gathering data. We can store and organize our data in our data management platform (DMP). We need to be careful to follow our plan and keep an eye on how it's doing. Especially if we are collecting data regularly, setting up a timetable for when we will be checking in on how our data gathering is going may be helpful. As circumstances change and we learn new details, we might need to amend our plan.

5. Examine the Information and Apply Your Findings

It's time to examine our data and arrange our findings after we have gathered all of our information. The analysis stage is essential because it transforms unprocessed data into insightful knowledge that can be applied to better our marketing plans, goods, and business judgments. The analytics tools included in our DMP can be used to assist with this phase. We can put the discoveries to use to enhance our business once we have discovered the patterns and insights in our data.

Let us now look at some data collection considerations and best practices that one might follow.

We must plan carefully before spending time and money traveling to the field to gather data. Effective data collection strategies can help us collect richer, more accurate data while saving time and resources.

Below, we will be discussing some of the best practices that we can follow for the best results -

1. Take Into Account the Price of Each Extra Data Point

Once we have decided on the data we want to gather, we need to make sure to take the expense of doing so into account. Our surveyors and respondents will incur additional costs for each additional data point or survey question.

2. Plan How to Gather Each Data Piece

There is a dearth of freely accessible data. Sometimes the data is there, but we may not have access to it. For instance, unless we have a compelling cause, we cannot openly view another person's medical information. It could be challenging to measure several types of information.

Consider how time-consuming and difficult it will be to gather each piece of information while deciding what data to acquire.

3. Think About Your Choices for Data Collecting Using Mobile Devices

Mobile-based data collecting can be divided into three categories -

  • IVRS (interactive voice response system) - Calls the respondents and asks them questions that have already been recorded.
  • SMS data collection - Sends a text message to the respondent, who can then answer questions by text on their phone.
  • Field surveyors - Can enter data directly into an interactive questionnaire while speaking to each respondent, thanks to smartphone apps.

We need to make sure to select the appropriate tool for our survey and respondents because each one has its own advantages and disadvantages.

4. Carefully Consider the Data You Need to Gather

It's all too easy to get information about anything and everything, but it's crucial to only gather the information that we require. 

It is helpful to consider these 3 questions:

  • What details will be helpful?
  • What details are available?
  • What specific details do you require?

5. Remember to Consider Identifiers

Identifiers, or details describing the context and source of a survey response, are just as crucial as the information about the subject or program that we are actually researching.

In general, adding more identifiers will enable us to pinpoint our program's successes and failures with greater accuracy, but moderation is the key.
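The value of identifiers shows up when responses are broken down by them. The sketch below uses made-up survey responses with two hypothetical identifiers, `site` and `surveyor`, and averages a satisfaction score for each value of a chosen identifier.

```python
from collections import defaultdict
from statistics import mean

# Made-up responses; "site" and "surveyor" are hypothetical identifiers.
responses = [
    {"site": "north", "surveyor": "s1", "satisfaction": 4},
    {"site": "north", "surveyor": "s2", "satisfaction": 5},
    {"site": "south", "surveyor": "s1", "satisfaction": 2},
    {"site": "south", "surveyor": "s2", "satisfaction": 3},
]


def mean_by(responses, identifier, measure="satisfaction"):
    """Average a measure separately for each value of the given identifier."""
    groups = defaultdict(list)
    for response in responses:
        groups[response[identifier]].append(response[measure])
    return {value: mean(scores) for value, scores in groups.items()}


by_site = mean_by(responses, "site")
by_surveyor = mean_by(responses, "surveyor")
print(by_site)
print(by_surveyor)
```

Without the `site` identifier, the gap between the two locations in this toy data would be invisible in the overall average.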

6. Data Collecting Through Mobile Devices is the Way to Go

Although collecting data on paper is still common, modern technology relies heavily on mobile devices. They enable us to gather many various types of data at relatively lower prices and are accurate as well as quick. There aren't many reasons not to pick mobile-based data collecting with the boom of low-cost Android devices that are available nowadays.


1. What is data collection with example?

Data collection is the process of collecting and analyzing information on relevant variables in a predetermined, methodical way so that one can respond to specific research questions, test hypotheses, and assess results. Data collection can be either qualitative or quantitative. Example: A company collects customer feedback through online surveys and social media monitoring to improve their products and services.

2. What are the primary data collection methods?

As is well known, gathering primary data is costly and time intensive. The main techniques for gathering data are observation, interviews, questionnaires, schedules, and surveys.

3. What are data collection tools?

The term "data collecting tools" refers to the tools/devices used to gather data, such as a paper questionnaire or a system for computer-assisted interviews. Tools used to gather data include case studies, checklists, interviews, occasionally observation, surveys, and questionnaires.

4. What’s the difference between quantitative and qualitative methods?

While qualitative research focuses on words and meanings, quantitative research deals with figures and statistics. You can systematically measure variables and test hypotheses using quantitative methods. You can delve deeper into ideas and experiences using qualitative methodologies.

5. What are quantitative data collection methods?

While there are numerous other ways to get quantitative information, probability sampling, interviews, questionnaires, observation, and document review are the most typical and frequently employed methods, whether collecting information offline or online.

6. What is mixed methods research?

User research that includes both qualitative and quantitative techniques is known as mixed methods research. For deeper user insights, mixed methods research combines insightful user data with useful statistics.

7. What are the benefits of collecting data?

Collecting data offers several benefits, including:

  • Knowledge and Insight
  • Evidence-Based Decision Making
  • Problem Identification and Solution
  • Validation and Evaluation
  • Identifying Trends and Predictions
  • Support for Research and Development
  • Policy Development
  • Quality Improvement
  • Personalization and Targeting
  • Knowledge Sharing and Collaboration

8. What’s the difference between reliability and validity?

Reliability is about consistency and stability, while validity is about accuracy and appropriateness. Reliability focuses on the consistency of results, while validity focuses on whether the results are actually measuring what they are intended to measure. Both reliability and validity are crucial considerations in research to ensure the trustworthiness and meaningfulness of the collected data and measurements.
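One common way to put a number on reliability is a test-retest check: give the same questionnaire to the same respondents twice and correlate the two sets of scores. The sketch below uses made-up scores and a hand-written Pearson correlation. It speaks only to consistency; a high value says nothing about whether the questionnaire is valid.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up scores for five respondents who took the same questionnaire twice.
first_round = [12, 15, 9, 20, 17]
second_round = [13, 14, 10, 19, 18]
test_retest = pearson(first_round, second_round)
print(f"test-retest correlation: {test_retest:.2f}")
```

A correlation close to 1 suggests the instrument produces stable scores; checking validity would require comparing the scores against an independent measure of the same concept.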

Are you thinking about pursuing a career in the field of data science? Simplilearn's Data Science courses are designed to provide you with the necessary skills and expertise to excel in this rapidly changing field. Here's a detailed comparison for your reference:

Data Scientist Master's Program

  • Geo: All Geos
  • University: Simplilearn
  • Course Duration: 11 Months
  • Coding Experience Required: Basic
  • Skills You Will Learn: 10+ skills including data structure, data manipulation, NumPy, Scikit-Learn, Tableau and more
  • Additional Benefits: Applied Learning via Capstone and 25+ Data Science Projects
  • Cost: $$

Post Graduate Program In Data Science (Purdue)

  • Geo: All Geos
  • University: Purdue
  • Course Duration: 11 Months
  • Coding Experience Required: Basic
  • Skills You Will Learn: 8+ skills including Exploratory Data Analysis, Descriptive Statistics, Inferential Statistics, and more
  • Additional Benefits: Purdue Alumni Association Membership, Free IIMJobs Pro-Membership of 6 months, Resume Building Assistance
  • Cost: $$$$

Post Graduate Program In Data Science (Caltech)

  • Geo: Not Applicable in US
  • University: Caltech
  • Course Duration: 11 Months
  • Coding Experience Required: No
  • Skills You Will Learn: 8+ skills including Supervised & Unsupervised Learning, Deep Learning, Data Visualization, and more
  • Additional Benefits: Upto 14 CEU Credits, Caltech CTME Circle Membership
  • Cost: $$$$

We live in the Data Age, and if you want a career that fully takes advantage of this, you should consider a career in data science. Simplilearn offers a Caltech Post Graduate Program in Data Science  that will train you in everything you need to know to secure the perfect position. This Data Science PG program is ideal for all working professionals, covering job-critical topics like R, Python programming , machine learning algorithms , NLP concepts , and data visualization with Tableau in great detail. This is all provided via our interactive learning model with live sessions by global practitioners, practical labs, and industry projects.

Data Science & Business Analytics Courses Duration and Fees

Data Science & Business Analytics programs typically range from a few weeks to several months, with fees varying based on program and institution.

Recommended Reads

Data Science Career Guide: A Comprehensive Playbook To Becoming A Data Scientist

Capped Collection in MongoDB

An Ultimate One-Stop Solution Guide to Collections in C# Programming With Examples

Managing Data

Difference Between Collection and Collections in Java

What Are Java Collections and How to Implement Them?

Get Affiliated Certifications with Live Class programs

Data scientist.

  • Add the IBM Advantage to your Learning
  • 25 Industry-relevant Projects and Integrated labs

Caltech Data Sciences-Bootcamp

  • Exclusive visit to Caltech’s Robotics Lab

Caltech Post Graduate Program in Data Science

  • Earn a program completion certificate from Caltech CTME
  • Curriculum delivered in live online sessions by industry experts
  • PMP, PMI, PMBOK, CAPM, PgMP, PfMP, ACP, PBA, RMP, SP, and OPM3 are registered marks of the Project Management Institute, Inc.
  • Privacy Policy

Buy Me a Coffee

Research Method

Data Collection – Methods Types and Examples

Data Collection

Definition:

Data collection is the process of gathering information from various sources so that it can be analyzed and used to make informed decisions. This can involve various methods, such as surveys, interviews, experiments, and observation.

In order for data collection to be effective, it is important to have a clear understanding of what data is needed and what the purpose of the data collection is. This can involve identifying the population or sample being studied, determining the variables to be measured, and selecting appropriate methods for collecting and recording data.

Types of Data Collection

Types of Data Collection are as follows:

Primary Data Collection

Primary data collection is the process of gathering original and firsthand information directly from the source or target population. This type of data collection involves collecting data that has not been previously gathered, recorded, or published. Primary data can be collected through various methods such as surveys, interviews, observations, experiments, and focus groups. The data collected is usually specific to the research question or objective and can provide valuable insights that cannot be obtained from secondary data sources. Primary data collection is often used in market research, social research, and scientific research.

Secondary Data Collection

Secondary data collection is the process of gathering existing information that has already been collected and analyzed by someone else, rather than conducting new research to collect primary data. Secondary data can be collected from various sources, such as published reports, books, journals, newspapers, websites, government publications, and other documents.

Qualitative Data Collection

Qualitative data collection is used to gather non-numerical data such as opinions, experiences, perceptions, and feelings, through techniques such as interviews, focus groups, observations, and document analysis. It seeks to understand the deeper meaning and context of a phenomenon or situation and is often used in social sciences, psychology, and humanities. Qualitative data collection methods allow for a more in-depth and holistic exploration of research questions and can provide rich and nuanced insights into human behavior and experiences.

Quantitative Data Collection

Quantitative data collection is used to gather numerical data that can be analyzed using statistical methods. This data is typically collected through surveys, experiments, and other structured data collection methods. Quantitative data collection seeks to quantify and measure variables, such as behaviors, attitudes, and opinions, in a systematic and objective way. This data is often used to test hypotheses, identify patterns, and establish correlations between variables. Quantitative data collection methods allow for precise measurement and generalization of findings to a larger population. It is commonly used in fields such as economics, psychology, and natural sciences.
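To make the idea of establishing correlations between variables concrete, here is a minimal sketch of the Pearson correlation coefficient written from its textbook definition. The variable names and the example figures (hours studied and test scores) are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (spread_x * spread_y)


# Hypothetical survey data: hours studied and test score per student.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]
print(round(pearson_r(hours, score), 3))
```

A value near 1 or -1 suggests a strong linear relationship; a correlation alone does not show that one variable causes the other.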

Data Collection Methods

Data Collection Methods are as follows:

Surveys

Surveys involve asking questions to a sample of individuals or organizations to collect data. Surveys can be conducted in person, over the phone, or online.

Interviews

Interviews involve a one-on-one conversation between the interviewer and the respondent. Interviews can be structured or unstructured and can be conducted in person or over the phone.

Focus Groups

Focus groups are group discussions that are moderated by a facilitator. Focus groups are used to collect qualitative data on a specific topic.

Observation

Observation involves watching and recording the behavior of people, objects, or events in their natural setting. Observation can be done overtly or covertly, depending on the research question.

Experiments

Experiments involve manipulating one or more variables and observing the effect on another variable. Experiments are commonly used in scientific research.
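Experiments usually need a way to decide which participants receive the manipulated condition. The sketch below shows simple random assignment into a treatment group and a control group. This technique is an assumption added for illustration, since the description above does not specify how groups are formed; the function name and participant labels are hypothetical.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split participants into treatment and control groups."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical participant identifiers.
groups = assign_groups(["P01", "P02", "P03", "P04", "P05", "P06"], seed=42)
print(groups)
```

Random assignment helps ensure that differences between the groups are due to the manipulated variable rather than to how the groups were chosen.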

Case Studies

Case studies involve in-depth analysis of a single individual, organization, or event. Case studies are used to gain detailed information about a specific phenomenon.

Secondary Data Analysis

Secondary data analysis involves using existing data that was collected for another purpose. Secondary data can come from various sources, such as government agencies, academic institutions, or private companies.

How to Collect Data

The following are some steps to consider when collecting data:

  • Define the objective: Before you start collecting data, you need to define the objective of the study. This will help you determine what data you need to collect and how to collect it.
  • Identify the data sources: Identify the sources of data that will help you achieve your objective. These sources can be primary sources, such as surveys, interviews, and observations, or secondary sources, such as books, articles, and databases.
  • Determine the data collection method: Once you have identified the data sources, you need to determine the data collection method. This could be through online surveys, phone interviews, or face-to-face meetings.
  • Develop a data collection plan: Develop a plan that outlines the steps you will take to collect the data. This plan should include the timeline, the tools and equipment needed, and the personnel involved.
  • Test the data collection process: Before you start collecting data, test the data collection process to ensure that it is effective and efficient.
  • Collect the data: Collect the data according to the data collection plan you developed earlier. Make sure you record the data accurately and consistently.
  • Analyze the data: Once you have collected the data, analyze it to draw conclusions and make recommendations.
  • Report the findings: Report the findings of your data analysis to the relevant stakeholders. This could be in the form of a report, a presentation, or a publication.
  • Monitor and evaluate the data collection process: After the data collection process is complete, monitor and evaluate the process to identify areas for improvement in future data collection efforts.
  • Ensure data quality: Ensure that the collected data is of high quality and free from errors. This can be achieved by validating the data for accuracy, completeness, and consistency.
  • Maintain data security: Ensure that the collected data is secure and protected from unauthorized access or disclosure. This can be achieved by implementing data security protocols and using secure storage and transmission methods.
  • Follow ethical considerations: Follow ethical considerations when collecting data, such as obtaining informed consent from participants, protecting their privacy and confidentiality, and ensuring that the research does not cause harm to participants.
  • Use appropriate data analysis methods: Use appropriate data analysis methods based on the type of data collected and the research objectives. This could include statistical analysis, qualitative analysis, or a combination of both.
  • Record and store data properly: Record and store the collected data properly, in a structured and organized format. This will make it easier to retrieve and use the data in future research or analysis.
  • Collaborate with other stakeholders: Collaborate with other stakeholders, such as colleagues, experts, or community members, to ensure that the data collected is relevant and useful for the intended purpose.
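The data quality step above mentions validating data for accuracy, completeness, and consistency. One way this might look in practice is sketched below, assuming survey responses are stored as dictionaries. The field names (`respondent_id`, `age`, `satisfaction`), the allowed ranges, and the function itself are all hypothetical.

```python
def validate_records(records, required_fields, valid_ranges):
    """Return a list of (record_index, problem) pairs found in the records."""
    problems = []
    seen_ids = set()
    for index, record in enumerate(records):
        # Completeness: every required field must be present and non-empty.
        for field in required_fields:
            if record.get(field) in (None, ""):
                problems.append((index, f"missing {field}"))
        # Accuracy: numeric answers must fall inside the allowed range.
        for field, (low, high) in valid_ranges.items():
            value = record.get(field)
            if isinstance(value, (int, float)) and not low <= value <= high:
                problems.append((index, f"{field} out of range"))
        # Consistency: each respondent should appear only once.
        respondent = record.get("respondent_id")
        if respondent is not None:
            if respondent in seen_ids:
                problems.append((index, "duplicate respondent_id"))
            seen_ids.add(respondent)
    return problems


# Hypothetical survey responses with three deliberate problems.
responses = [
    {"respondent_id": 1, "age": 34, "satisfaction": 4},
    {"respondent_id": 2, "age": None, "satisfaction": 7},
    {"respondent_id": 1, "age": 29, "satisfaction": 3},
]
issues = validate_records(
    responses,
    required_fields=["respondent_id", "age", "satisfaction"],
    valid_ranges={"age": (18, 99), "satisfaction": (1, 5)},
)
for index, problem in issues:
    print(f"record {index}: {problem}")
```

Running a check like this during the pilot test, and again after collection, makes problems visible before analysis begins.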

Applications of Data Collection

Data collection methods are widely used in different fields, including social sciences, healthcare, business, education, and more. Here are some examples of how data collection methods are used in different fields:

  • Social sciences: Social scientists often use surveys, questionnaires, and interviews to collect data from individuals or groups. They may also use observation to collect data on social behaviors and interactions. This data is often used to study topics such as human behavior, attitudes, and beliefs.
  • Healthcare: Data collection methods are used in healthcare to monitor patient health and track treatment outcomes. Electronic health records and medical charts are commonly used to collect data on patients’ medical history, diagnoses, and treatments. Researchers may also use clinical trials and surveys to collect data on the effectiveness of different treatments.
  • Business: Businesses use data collection methods to gather information on consumer behavior, market trends, and competitor activity. They may collect data through customer surveys, sales reports, and market research studies. This data is used to inform business decisions, develop marketing strategies, and improve products and services.
  • Education: In education, data collection methods are used to assess student performance and measure the effectiveness of teaching methods. Standardized tests, quizzes, and exams are commonly used to collect data on student learning outcomes. Teachers may also use classroom observation and student feedback to gather data on teaching effectiveness.
  • Agriculture: Farmers use data collection methods to monitor crop growth and health. Sensors and remote sensing technology can be used to collect data on soil moisture, temperature, and nutrient levels. This data is used to optimize crop yields and minimize waste.
  • Environmental sciences: Environmental scientists use data collection methods to monitor air and water quality, track climate patterns, and measure the impact of human activity on the environment. They may use sensors, satellite imagery, and laboratory analysis to collect data on environmental factors.
  • Transportation: Transportation companies use data collection methods to track vehicle performance, optimize routes, and improve safety. GPS systems, on-board sensors, and other tracking technologies are used to collect data on vehicle speed, fuel consumption, and driver behavior.

Examples of Data Collection

Examples of Data Collection are as follows:

  • Traffic Monitoring: Cities collect real-time data on traffic patterns and congestion through sensors on roads and cameras at intersections. This information can be used to optimize traffic flow and improve safety.
  • Social Media Monitoring: Companies can collect real-time data on social media platforms such as Twitter and Facebook to monitor their brand reputation, track customer sentiment, and respond to customer inquiries and complaints in real time.
  • Weather Monitoring: Weather agencies collect real-time data on temperature, humidity, air pressure, and precipitation through weather stations and satellites. This information is used to provide accurate weather forecasts and warnings.
  • Stock Market Monitoring: Financial institutions collect real-time data on stock prices, trading volumes, and other market indicators to make informed investment decisions and respond to market fluctuations in real time.
  • Health Monitoring: Medical devices such as wearable fitness trackers and smartwatches can collect real-time data on a person’s heart rate, blood pressure, and other vital signs. This information can be used to monitor health conditions and detect early warning signs of health issues.

Purpose of Data Collection

The purpose of data collection can vary depending on the context and goals of the study, but generally, it serves to:

  • Provide information: Data collection provides information about a particular phenomenon or behavior that can be used to better understand it.
  • Measure progress: Data collection can be used to measure the effectiveness of interventions or programs designed to address a particular issue or problem.
  • Support decision-making: Data collection provides decision-makers with evidence-based information that can be used to inform policies, strategies, and actions.
  • Identify trends: Data collection can help identify trends and patterns over time that may indicate changes in behaviors or outcomes.
  • Monitor and evaluate: Data collection can be used to monitor and evaluate the implementation and impact of policies, programs, and initiatives.

When to use Data Collection

Data collection is used when there is a need to gather information or data on a specific topic or phenomenon. It is typically used in research, evaluation, and monitoring and is important for making informed decisions and improving outcomes.

Data collection is particularly useful in the following scenarios:

  • Research: When conducting research, data collection is used to gather information on variables of interest to answer research questions and test hypotheses.
  • Evaluation: Data collection is used in program evaluation to assess the effectiveness of programs or interventions, and to identify areas for improvement.
  • Monitoring: Data collection is used in monitoring to track progress towards achieving goals or targets, and to identify any areas that require attention.
  • Decision-making: Data collection is used to provide decision-makers with information that can be used to inform policies, strategies, and actions.
  • Quality improvement: Data collection is used in quality improvement efforts to identify areas where improvements can be made and to measure progress towards achieving goals.

Characteristics of Data Collection

Data collection can be characterized by several important characteristics that help to ensure the quality and accuracy of the data gathered. These characteristics include:

  • Validity: Validity refers to the accuracy and relevance of the data collected in relation to the research question or objective.
  • Reliability: Reliability refers to the consistency and stability of the data collection process, ensuring that the results obtained are consistent over time and across different contexts.
  • Objectivity: Objectivity refers to the impartiality of the data collection process, ensuring that the data collected is not influenced by the biases or personal opinions of the data collector.
  • Precision: Precision refers to the level of detail and exactness of the data collected, ensuring that the data is specific enough to answer the research question or objective.
  • Timeliness: Timeliness refers to collecting the data quickly enough to meet the needs of the research or evaluation.
  • Ethical considerations: Ethical considerations refer to the ethical principles that must be followed when collecting data, such as ensuring confidentiality and obtaining informed consent from participants.

Advantages of Data Collection

There are several advantages of data collection that make it an important process in research, evaluation, and monitoring. These advantages include:

  • Better decision-making: Data collection provides decision-makers with evidence-based information that can be used to inform policies, strategies, and actions, leading to better decision-making.
  • Improved understanding: Data collection helps to improve our understanding of a particular phenomenon or behavior by providing empirical evidence that can be analyzed and interpreted.
  • Evaluation of interventions: Data collection is essential in evaluating the effectiveness of interventions or programs designed to address a particular issue or problem.
  • Identifying trends and patterns: Data collection can help identify trends and patterns over time that may indicate changes in behaviors or outcomes.
  • Increased accountability: Data collection increases accountability by providing evidence that can be used to monitor and evaluate the implementation and impact of policies, programs, and initiatives.
  • Validation of theories: Data collection can be used to test hypotheses and validate theories, leading to a better understanding of the phenomenon being studied.
  • Improved quality: Data collection is used in quality improvement efforts to identify areas where improvements can be made and to measure progress towards achieving goals.

Limitations of Data Collection

While data collection has several advantages, it also has some limitations that must be considered. These limitations include:

  • Bias: Data collection can be influenced by the biases and personal opinions of the data collector, which can lead to inaccurate or misleading results.
  • Sampling bias: The sample from which data is collected may not be representative of the entire population, leading to inaccurate results.
  • Cost: Data collection can be expensive and time-consuming, particularly for large-scale studies.
  • Limited scope: Data collection is limited to the variables being measured, which may not capture the entire picture or context of the phenomenon being studied.
  • Ethical considerations: Data collection must follow ethical principles to protect the rights and confidentiality of the participants, which can limit the type of data that can be collected.
  • Data quality issues: Data collection may result in data quality issues such as missing or incomplete data, measurement errors, and inconsistencies.
  • Limited generalizability: Findings based on the collected data may not apply to other contexts or populations.
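Sampling bias, listed above, is often reduced by sampling each subgroup in proportion to its size. The sketch below shows proportional stratified sampling, a standard technique that is added here for illustration and is not described in the list itself. The function name, the `area` field, and the population are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(population, key, fraction, seed=None):
    """Draw the same fraction from every stratum defined by key(unit)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[key(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Take at least one unit so small strata are not left out.
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical population: 80 urban and 20 rural households.
households = [{"id": i, "area": "urban"} for i in range(80)]
households += [{"id": i, "area": "rural"} for i in range(80, 100)]

chosen = stratified_sample(households, key=lambda h: h["area"], fraction=0.1, seed=7)
print(len(chosen), "households selected")
```

Because each stratum contributes in proportion to its size, the rural households cannot be missed by chance, which can happen with a small simple random sample.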

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Facts & Figures 2024: How Does Your State Compare?

How do taxes in your state compare regionally and nationally? Facts and Figures, a resource we’ve provided to U.S. taxpayers and legislators since 1941, serves as a one-stop state tax data resource that compares all 50 states on over 40 measures of tax rates, collections, burdens, and more.

For visualizations and further analysis of 2024 state tax data, explore our state tax maps, the latest edition of our State Business Tax Climate Index, and subscribe to our weekly tax newsletter. Download and explore the latest 2024 state tax data with our interactive tool below.


High Energy Physics - Theory

Title: A Systematic Approach to Celestial Holography: A Case Study in Einstein Gravity

Abstract: We propose a systematic approach to celestial holography in massless theories beginning by studying the implications of properly incorporating field configurations built using the eigenstates of central interest: massless conformal primary wavefunctions that diagonalize the dilatation generator. Due to their singular behaviour on the locus $k\cdot x=0$, they do not belong to the space of Fourier decomposable functions, and incorporating them in the path integral domain requires careful manipulations. In this paper, we include these singular field configurations by a splitting procedure using large pure gauge/diffeomorphism transformations on the action functional. We demonstrate that doing so splits the action into an integrand supported on the singular locus $k\cdot x=0$ and an integrand on the rest of the space. Mellin transforms single out the scalings/conformal dimension in $x$; geometrically, we treat this as a proper non-compact scaling reduction, where we are able to further isolate the dynamics of the large pure diffeomorphism transformations. This takes the form of a 2d chiral CFT on a 2d sphere on the singular locus $k\cdot x=0$ - the celestial sphere where the null cone of the origin cuts $\mathscr{I}$. Using this framework, we study Einstein gravity perturbatively around its self-dual sector, where the resulting microscopic 2d CFT couples to bulk scattering states. We are able to obtain an explicit representation of the $\mathcal{L} w_{1+\infty}$ algebra and leading soft splitting functions. With further marginal deformations, we also write down effective interaction vertices which provide form factors of tree-level graviton scattering in Minkowski space.

ORIGINAL RESEARCH article

This article is part of the research topic.

Integration-Focused Approaches of Educational Systems Across the EU

Navigating the Peer-to-Peer Workflow in Non-Formal Education Through an Innovative E-learning Platform: A Case Study of the KIDS4ALLL Educational Project in Hungary and Italy

Provisionally Accepted

  • 1 University of Turin, Italy
  • 2 TÁRKI Social Research Institute, Hungary

The final, formatted version of the article will be published soon.

The digital revolution is affecting all aspects of life, radically transforming everyday tasks and routines. The ability to cope with new challenges in life, including new forms of learning, is a key skill in the 21st century; however, education systems often struggle to tackle digital inequalities. A digital learning platform developed by the KIDS4ALLL educational project, implemented in face-to-face student interactions, aims to mitigate the divide and the resulting social disadvantages among children with and without a migration/ethnic minority background. Analysing data collected during the pilot phase of the project in two of the participating countries, Italy and Hungary, this paper examines how students and teaching staff adapt to a newly introduced digital learning tool based on peer-to-peer workflows. Firstly, it examines the role of educators' interpersonal competences in navigating the innovative learning activities and delves into how they use them and how they manage resources. Secondly, the study explores what attitudes and behaviours are observed among students engaged in the proposed peer-led activities, in particular in terms of their ability to cope with uncertainty and complexity. The analytical framework of the paper is based on two cultural dimensions offered by Hofstede (2001), the index of uncertainty avoidance (UAI) and power distance (PDI), and it utilizes the personal, social and learning-to-learn competence of the 8 LLL Key Competences as defined by the European Commission to conceptualize the skills of educators and students. Interpreting data from Italy and Hungary in their respective social and educational contexts, the study finds that the most important features that proved to be effective and useful during the pilot phase were the democratic power-relations between students and educators, the peer-to-peer scheme and its further development to the peer-for-peer approach. The child-friendly and real-life-related new curriculum and its appealing digital learning platform, embedded into a flexible, playful and child-centred pedagogical approach, were also successful. These all complement the traditional, formal school environment and pedagogy which, despite all developments in formal education in the past decades, can be characterized as teacher-centred and frontal.

Keywords: peer-to-peer learning, Educational inclusion, LLL Key competences, uncertainty avoidance, Power distance

Received: 10 Jan 2024; Accepted: 08 Apr 2024.

Copyright: © 2024 Schroot, Lőrincz and Bernát. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Dr. Tanja Schroot, University of Turin, Turin, Italy

People also looked at

IMAGES

  1. The case study data collection and analysis process (an author's view

    data collection tools for case study

  2. How to Collect Data

    data collection tools for case study

  3. 7 Data Collection Methods & Tools For Research

    data collection tools for case study

  4. case study as a tool of data collection

    data collection tools for case study

  5. Our toolbox for primary data collection

    data collection tools for case study

  6. Standard statistical tools in research and data analysis

    data collection tools for case study

VIDEO

  1. Data Collection for Qualitative Studies

  2. Comparison of Data Collection Tools

  3. TM Activities and Tools + Case Study || Technology Management || Sir. Farooq Javeed

  4. LVC01 TOOLS OF DATA COLLECTION

  5. Meeting in week 5 Data collection tools 20231130 130539 Meeting Recording

  6. Top 6 data collection tools

COMMENTS

  1. (PDF) Collecting data through case studies

    The case study is a data collection method in which in-depth descriptive information. about specific entities, or cases, is collected, organized, interpreted, and presented in a. narrative format ...

  2. 7 Data Collection Methods & Tools For Research

    Data collection tools refer to the devices/instruments used to collect data, such as a paper questionnaire or computer-assisted interviewing system. Case Studies, Checklists, Interviews, Observation sometimes, and Surveys or Questionnaires are all tools used to collect data.

  3. Data Collection

    Data Collection | Definition, Methods & Examples. Published on June 5, 2020 by Pritha Bhandari.Revised on June 21, 2023. Data collection is a systematic process of gathering observations or measurements. Whether you are performing research for business, governmental or academic purposes, data collection allows you to gain first-hand knowledge and original insights into your research problem.

  4. Case Study Method: A Step-by-Step Guide for Business Researchers

    The case study method involves a range of empirical material collection tools in order to answer the research questions with maximum breadth. Semistructured interviews can be conducted along with meeting observations and documents collection. ... The authors interpreted the raw data for case studies with the help of a four-step interpretation ...

  5. Design: Selection of Data Collection Methods

    Data collection methods are important, because how the information collected is used and what explanations it can generate are determined by the methodology and analytical approach applied by the researcher. 1, 2 Five key data collection methods are presented here, with their strengths and limitations described in the online supplemental material.

  6. Best Practices in Data Collection and Preparation: Recommendations for

    We offer best-practice recommendations for journal reviewers, editors, and authors regarding data collection and preparation. Our recommendations are applicable to research adopting different epistemological and ontological perspectives—including both quantitative and qualitative approaches—as well as research addressing micro (i.e., individuals, teams) and macro (i.e., organizations ...

  7. Chapter 5: Collecting data

    The choice of data collection tool is largely dependent on review authors' preferences, the size of the review, and resources available to the author team. ... a case study in HIV/AIDS. Journal of Clinical Epidemiology 2017b; 84: 85-94. Stewart LA, Clarke M, Rovers M, Riley RD, Simmonds M, Stewart G, Tierney JF, PRISMA-IPD Development Group ...

  8. PDF Basic Tools for Data Collection

    A case study is not a data collection tool in itself. It is a descriptive piece of work that can provide in-depth information on a topic. It is often based on information acquired through one or more of the other tools described in this paper, such as interviews or observation. Case studies are usually written, but can also be presented as ...

  9. 10+ Best Data Collection Tools With Data Gathering Strategies

    Different data collection strategies include Case Studies, Usage Data, Checklists, Observations, Interviews, Focus Groups, Surveys, and Document analysis. Primary data is the data that is collected for the first time by the researcher. This will be the original data and will be relevant to the research topic.

  10. Qualitative Data Collection: What it is + Methods to do it

    Qualitative data collection methods serve the primary purpose of collecting textual data for research and analysis, such as thematic analysis. The collected research data is used to examine: ... In this method, data is collected by looking at case studies in detail. This method's flexibility is shown ...

  11. What are the Data Collection Tools and How to Use Them?

    Data Collection - Qualitative Vs. Quantitative. When researchers conduct studies or experiments, they need to collect data to answer their research questions, which is where data collection tools come in. Data collection tools are methods or instruments that researchers use to gather and analyze data. Data collection tools can be used in both ...

  12. Case Study

    The data collection method should be selected based on the research questions and the nature of the case study phenomenon. Analyze the data: The data collected from the case study should be analyzed using various techniques, such as content analysis, thematic analysis, or grounded theory. The analysis should be guided by the research questions ...

  13. Data Collection Methods and Tools for Research; A Step-by-Step Guide to

    Data Collection Methods and Tools for Research; A ... One of the main stages in a research study is data collection that enables the researcher to find answers to research questions. Data collection is the process of collecting data aiming to gain insights ... It means the findings of case studies can be used just for the same issues

  14. Collecting data through case studies

    The article describes the decisions that need to be made in planning case study research and then presents examples of how case studies can be used in several performance technology applications. The advantages and disadvantages of case studies as a data collection method are discussed and guidelines for their use are given.

  15. Data Collection Methods

    Table of contents. Step 1: Define the aim of your research. Step 2: Choose your data collection method. Step 3: Plan your data collection procedures. Step 4: Collect the data. Frequently asked questions about data collection.

  16. A handbook of data collection tools

    Case Study Documentation of Process and Impacts. Method: Case Studies; Method: Theory-based Evaluation; Method: Use of Coding Protocol with Qualitative Data; Reporting Tool: Alliance for Justice Advocacy Evaluation Tool. Sources: Reisman, J., Gienapp, A., and Stachowiak, S. (2007). A handbook of data collection tools, ...

  17. (PDF) QUALITATIVE DATA COLLECTION INSTRUMENTS: THE MOST ...

    The qualitative research interview is an important data collection tool for a variety of methods used within the broad spectrum of medical education research. ... case study research; grounded ...

  18. Data collection in qualitative research

    The three core approaches to data collection in qualitative research—interviews, focus groups and observation—provide researchers with rich and deep insights. All methods require skill on the part of the researcher, and all produce a large amount of raw data. However, with careful and systematic analysis 12 the data yielded with these ...

  19. Case Study Methodology of Qualitative Research: Key Attributes and

    A case study is one of the most commonly used methodologies of social research. This article attempts to look into the various dimensions of a case study research strategy, the different epistemological strands which determine the particular case study type and approach adopted in the field, discusses the factors which can enhance the effectiveness of a case study research, and the debate ...

  20. Collecting data through case studies

    The article describes the decisions that need to be made in planning case study research and then presents examples of how case studies can be used in several performance technology applications. The advantages and disadvantages of case studies as a data collection method are discussed and guidelines for their use are given.

  21. What Is Data Collection: Methods, Types, Tools

    Data collection is the process of collecting and evaluating information or data from multiple sources to find answers to research problems, answer questions, evaluate outcomes, and forecast trends and probabilities. It is an essential phase in all types of research, analysis, and decision-making, including that done in the social sciences ...

  22. Data Collection

    Data collection is the process of gathering data from various sources and then analyzing it to find trends and patterns. ... Case Studies. Case studies involve in-depth analysis of a single individual, organization, or event. ... the tools and equipment needed, and the personnel involved. Test the data collection process: Before you start ...

  23. 2024 State Tax Data: Facts & Figures

    Related Resources: State Tax Maps; State Data Explorer; State Tax Changes Taking Effect January 1, 2024; State Individual Income Tax. An individual income tax (or personal income tax) is levied on the wages, salaries, investments, or other forms of income an individual or household earns. The U.S. imposes a progressive income tax where rates increase with income.

  24. A systematic approach to celestial holography: a case study in Einstein

    Using this framework, we study Einstein gravity perturbatively around its self-dual sector, where the resulting microscopic 2d CFT couples to bulk scattering states. We are able to obtain an explicit representation of the $\mathcal{L} w_{1+\infty}$ algebra and leading soft splitting functions.

  25. Frontiers

    The digital revolution is affecting all aspects of life, radically transforming everyday tasks and routines. The ability to cope with new challenges in life, including new forms of learning, is a key skill in the 21st century; however, education systems often struggle to tackle digital inequalities. A digital learning platform developed by the KIDS4ALLL educational project, implemented in ...