
What is Secondary Research? | Definition, Types, & Examples

Published on January 20, 2023 by Tegan George. Revised on January 12, 2024.

Secondary research is a research method that uses data that was collected by someone else. In other words, whenever you conduct research using data that already exists, you are conducting secondary research. By contrast, research in which you collect the data yourself is called primary research.

Secondary research can be qualitative or quantitative in nature. It often uses data gathered from published peer-reviewed papers, meta-analyses, or government or private sector databases and datasets.


When to use secondary research

Secondary research is a very common research method, used in lieu of collecting your own primary data. It can serve as the basis of a research design in its own right, or as a way to start your research process if you plan to conduct primary research later on.

Since it is often inexpensive or free to access, secondary research is a low-stakes way to determine if further primary research is needed, as gaps in secondary research are a strong indication that primary research is necessary. For this reason, while secondary research can theoretically be exploratory or explanatory in nature, it is usually explanatory, aiming to explain the causes and consequences of a well-defined problem.


Types of secondary research

Secondary research can take many forms, but the most common types are:

  • Statistical analysis
  • Literature reviews
  • Case studies
  • Content analysis

There is ample data available online from a variety of sources, often in the form of datasets. These datasets are often open access or downloadable at a low cost, and are ideal for conducting statistical analyses such as hypothesis testing or regression analysis.
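As a rough illustration of the kind of statistical analysis a downloaded dataset supports, the sketch below fits a simple least-squares regression line in plain Python. The figures are invented for illustration (hypothetical years of schooling and hourly wage), and `fit_line` is a helper name introduced here rather than a function from any particular library.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope = covariance(x, y) / variance(x); the shared 1/(n-1) factors cancel.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical secondary data: years of schooling vs. hourly wage.
schooling = [10, 12, 12, 14, 16, 16, 18]
wage = [14.0, 17.5, 16.0, 20.0, 24.5, 23.0, 28.0]

slope, intercept = fit_line(schooling, wage)
print(f"estimated wage = {intercept:.2f} + {slope:.2f} * schooling")
```

A fitted slope describes an association in the data; whether it can be read causally depends on how the original data were collected.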

Credible sources for existing data include:

  • The government
  • Government agencies
  • Non-governmental organizations
  • Educational institutions
  • Businesses or consultancies
  • Libraries or archives
  • Newspapers, academic journals, or magazines

A literature review is a survey of preexisting scholarly sources on your topic. It provides an overview of current knowledge, allowing you to identify relevant themes, debates, and gaps in the research you analyze. You can later apply these to your own work, or use them as a jumping-off point to conduct primary research of your own.

Structured much like a regular academic paper (with a clear introduction, body, and conclusion), a literature review is a great way to evaluate the current state of research and demonstrate your knowledge of the scholarly debates around your topic.

A case study is a detailed study of a specific subject. It is usually qualitative in nature and can focus on a person, group, place, event, organization, or phenomenon. A case study is a great way to utilize existing research to gain concrete, contextual, and in-depth knowledge about your real-world subject.

You can choose to focus on just one complex case, exploring a single subject in great detail, or examine multiple cases if you’d prefer to compare different aspects of your topic. Preexisting interviews, observational studies, or other sources of primary data make for great case studies.

Content analysis is a research method that studies patterns in recorded communication by utilizing existing texts. It can be either quantitative or qualitative in nature, depending on whether you choose to analyze countable or measurable patterns, or more interpretive ones. Content analysis is popular in communication studies, but it is also widely used in historical analysis, anthropology, and psychology to draw qualitative inferences about meaning.
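The quantitative side of content analysis can be sketched as counting how often terms from a coding scheme appear in existing texts. The example below is a minimal illustration only: the excerpts, the two categories, and the `code_document` helper are all assumptions made up for this sketch, and a real coding scheme would be developed and validated far more carefully.

```python
import re
from collections import Counter

# Hypothetical excerpts standing in for existing texts (e.g. archived speeches).
documents = [
    "Jobs and growth are our priority. Growth creates jobs.",
    "We will protect the climate and create green jobs.",
    "Climate action cannot wait; the climate is changing.",
]

# Coding scheme (an assumption for this sketch): each category is a set of terms.
categories = {
    "economy": {"jobs", "growth"},
    "environment": {"climate", "green"},
}

def code_document(text):
    """Count how often each category's terms occur in one document."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for category, terms in categories.items():
            if word in terms:
                counts[category] += 1
    return counts

# Aggregate the per-document counts across the whole corpus.
totals = Counter()
for doc in documents:
    totals.update(code_document(doc))

print(dict(totals))
```

Counting terms captures only the measurable patterns; the interpretive, qualitative reading of the same texts still has to be done by the researcher.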


Secondary research is a broad research approach that can be pursued in many ways. Here are a few examples of different ways you can use secondary research to explore your research topic.

Secondary research is a very common research approach, but has distinct advantages and disadvantages.

Advantages of secondary research

Advantages include:

  • Secondary data is very easy to source and readily available.
  • It is also often free or accessible through your educational institution’s library or network, making it much cheaper to conduct than primary research.
  • As you are relying on research that already exists, conducting secondary research is much less time-consuming than primary research. Since your timeline is so much shorter, your research can be ready to publish sooner.
  • Using data from others allows you to show reproducibility and replicability, bolstering prior research and situating your own work within your field.

Disadvantages of secondary research

Disadvantages include:

  • Ease of access does not signify credibility. It’s important to be aware that secondary research is not always reliable, and can often be out of date. It’s critical to analyze any data you’re thinking of using prior to getting started, using a method like the CRAAP test.
  • Secondary research often relies on primary research already conducted. If this original research is biased in any way, those research biases could creep into the secondary results.
  • Many researchers using the same secondary research to form similar conclusions can also take away from the uniqueness and reliability of your research. Many datasets become “kitchen-sink” models, where too many variables are added in an attempt to draw increasingly niche conclusions from overused data. Data cleansing may be necessary to test the quality of the research.



Frequently asked questions

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

The research methods you use depend on the type of data you need to answer your research question.

  • If you want to measure something or test a hypothesis, use quantitative methods. If you want to explore ideas, thoughts and meanings, use qualitative methods.
  • If you want to analyze a large amount of readily available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables, use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses. Qualitative methods allow you to explore concepts and experiences in more detail.



Secondary Data – Types, Methods and Examples


Definition:

Secondary data refers to information that has been collected, processed, and published by someone else, rather than the researcher gathering the data firsthand. This can include data from sources such as government publications, academic journals, market research reports, and other existing datasets.

Secondary Data Types

Types of secondary data are as follows:

  • Published data: Published data refers to data that has been published in books, magazines, newspapers, and other print media. Examples include statistical reports, market research reports, and scholarly articles.
  • Government data: Government data refers to data collected by government agencies and departments. This can include data on demographics, economic trends, crime rates, and health statistics.
  • Commercial data: Commercial data is data collected by businesses for their own purposes. This can include sales data, customer feedback, and market research data.
  • Academic data: Academic data refers to data collected by researchers for academic purposes. This can include data from experiments, surveys, and observational studies.
  • Online data: Online data refers to data that is available on the internet. This can include social media posts, website analytics, and online customer reviews.
  • Organizational data: Organizational data is data collected by businesses or organizations for their own purposes. This can include data on employee performance, financial records, and customer satisfaction.
  • Historical data: Historical data refers to data that was collected in the past and is still available for research purposes. This can include census data, historical documents, and archival records.
  • International data: International data refers to data collected from other countries for research purposes. This can include data on international trade, health statistics, and demographic trends.
  • Public data: Public data refers to data that is available to the general public. This can include data from government agencies, non-profit organizations, and other sources.
  • Private data: Private data refers to data that is not available to the general public. This can include confidential business data, personal medical records, and financial data.
  • Big data: Big data refers to large, complex datasets that are difficult to manage and analyze using traditional data processing methods. This can include social media data, sensor data, and other types of data generated by digital devices.

Secondary Data Collection Methods

Secondary data collection methods are as follows:

  • Published sources: Researchers can gather secondary data from published sources such as books, journals, reports, and newspapers. These sources often provide comprehensive information on a variety of topics.
  • Online sources: With the growth of the internet, researchers can now access a vast amount of secondary data online. This includes websites, databases, and online archives.
  • Government sources: Government agencies often collect and publish a wide range of secondary data on topics such as demographics, crime rates, and health statistics. Researchers can obtain this data through government websites, publications, or data portals.
  • Commercial sources: Businesses often collect and analyze data for marketing research or customer profiling. Researchers can obtain this data through commercial data providers or by purchasing market research reports.
  • Academic sources: Researchers can also obtain secondary data from academic sources such as published research studies, academic journals, and dissertations.
  • Personal contacts: Researchers can also obtain secondary data from personal contacts, such as experts in a particular field or individuals with specialized knowledge.

Secondary Data Formats

Secondary data can come in various formats depending on the source from which it is obtained. Here are some common formats of secondary data:

  • Numeric Data: Numeric data is often in the form of statistics and numerical figures that have been compiled and reported by organizations such as government agencies, research institutions, and commercial enterprises. This can include data such as population figures, GDP, sales figures, and market share.
  • Textual Data: Textual data is often in the form of written documents, such as reports, articles, and books. This can include qualitative data such as descriptions, opinions, and narratives.
  • Audiovisual Data: Audiovisual data is often in the form of recordings, videos, and photographs. This can include data such as interviews, focus group discussions, and other types of qualitative data.
  • Geospatial Data: Geospatial data is often in the form of maps, satellite images, and geographic information systems (GIS) data. This can include data such as demographic information, land use patterns, and transportation networks.
  • Transactional Data: Transactional data is often in the form of digital records of financial and business transactions. This can include data such as purchase histories, customer behavior, and financial transactions.
  • Social Media Data: Social media data is often in the form of user-generated content from social media platforms such as Facebook, Twitter, and Instagram. This can include data such as user demographics, content trends, and sentiment analysis.

Secondary Data Analysis Methods

Secondary data analysis involves the use of pre-existing data for research purposes. Here are some common methods of secondary data analysis:

  • Descriptive Analysis: This method involves describing the characteristics of a dataset, such as the mean, standard deviation, and range of the data. Descriptive analysis can be used to summarize data and provide an overview of trends.
  • Inferential Analysis: This method involves making inferences and drawing conclusions about a population based on a sample of data. Inferential analysis can be used to test hypotheses and determine the statistical significance of relationships between variables.
  • Content Analysis: This method involves analyzing textual or visual data to identify patterns and themes. Content analysis can be used to study the content of documents, media coverage, and social media posts.
  • Time-Series Analysis: This method involves analyzing data over time to identify trends and patterns. Time-series analysis can be used to study economic trends, climate change, and other phenomena that change over time.
  • Spatial Analysis: This method involves analyzing data in relation to geographic location. Spatial analysis can be used to study patterns of disease spread, land use patterns, and the effects of environmental factors on health outcomes.
  • Meta-Analysis: This method involves combining data from multiple studies to draw conclusions about a particular phenomenon. Meta-analysis can be used to synthesize the results of previous research and provide a more comprehensive understanding of a particular topic.
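To make the meta-analysis method above more concrete, here is a minimal sketch of one common way of combining study results: a fixed-effect, inverse-variance weighted average. The effect sizes and standard errors are hypothetical, `pool_fixed_effect` is a name introduced for this example, and the fixed-effect model assumes all studies estimate the same underlying effect, which will not always hold (random-effects models relax that assumption).

```python
import math

def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted average of study-level effect estimates."""
    # More precise studies (smaller standard error) receive larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical effect sizes and standard errors from three published studies.
effects = [0.30, 0.45, 0.20]
ses = [0.10, 0.20, 0.10]

pooled, pooled_se = pool_fixed_effect(effects, ses)
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled effect = {pooled:.3f} (95% CI {low:.3f} to {high:.3f})")
```

The pooled standard error is smaller than that of any single study, which is the sense in which combining studies gives a more comprehensive picture than each one alone.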

Secondary Data Gathering Guide

Here are some steps to follow when gathering secondary data:

  • Define your research question: Start by defining your research question and identifying the specific information you need to answer it. This will help you identify the type of secondary data you need and where to find it.
  • Identify relevant sources: Identify potential sources of secondary data, including published sources, online databases, government sources, and commercial data providers. Consider the reliability and validity of each source.
  • Evaluate the quality of the data: Evaluate the quality and reliability of the data you plan to use. Consider the data collection methods, sample size, and potential biases. Make sure the data is relevant to your research question and is suitable for the type of analysis you plan to conduct.
  • Collect the data: Collect the relevant data from the identified sources. Use a consistent method to record and organize the data to make analysis easier.
  • Validate the data: Validate the data to ensure that it is accurate and reliable. Check for inconsistencies, missing data, and errors. Address any issues before analyzing the data.
  • Analyze the data: Analyze the data using appropriate statistical and analytical methods. Use descriptive and inferential statistics to summarize and draw conclusions from the data.
  • Interpret the results: Interpret the results of your analysis and draw conclusions based on the data. Make sure your conclusions are supported by the data and are relevant to your research question.
  • Communicate the findings: Communicate your findings clearly and concisely. Use appropriate visual aids such as graphs and charts to help explain your results.
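The validation and analysis steps above can be sketched in a few lines of Python. This is an illustrative example only: the CSV extract, its column names, and the 0 to 100 plausibility range are assumptions made for the sketch, and a real dataset would be read from a file and checked against its own documentation.

```python
import csv
import io
from statistics import mean

# Hypothetical extract of a downloaded dataset; in practice this would be a file.
raw = """region,year,unemployment_rate
North,2021,5.2
South,2021,
East,2021,6.1
East,2021,6.1
West,2021,-3.0
"""

rows = list(csv.DictReader(io.StringIO(raw)))

missing, duplicates, out_of_range, clean = [], [], [], []
seen = set()
for row in rows:
    key = (row["region"], row["year"])
    value = row["unemployment_rate"].strip()
    if not value:
        missing.append(row)          # no value recorded
    elif key in seen:
        duplicates.append(row)       # same region and year already accepted
    elif not 0.0 <= float(value) <= 100.0:
        out_of_range.append(row)     # a rate must lie between 0 and 100
    else:
        seen.add(key)
        clean.append({**row, "unemployment_rate": float(value)})

print(f"{len(missing)} missing, {len(duplicates)} duplicate, "
      f"{len(out_of_range)} out of range, {len(clean)} usable")
print("mean rate:", round(mean(r["unemployment_rate"] for r in clean), 2))
```

Keeping the rejected rows in separate lists, rather than silently dropping them, makes it easier to report how much of the source data was unusable and why.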

Examples of Secondary Data

Here are some examples of secondary data from different fields:

  • Healthcare: Hospital records, medical journals, clinical trial data, and disease registries are examples of secondary data sources in healthcare. These sources can provide researchers with information on patient demographics, disease prevalence, and treatment outcomes.
  • Marketing: Market research reports, customer surveys, and sales data are examples of secondary data sources in marketing. These sources can provide marketers with information on consumer preferences, market trends, and competitor activity.
  • Education: Student test scores, graduation rates, and enrollment statistics are examples of secondary data sources in education. These sources can provide researchers with information on student achievement, teacher effectiveness, and educational disparities.
  • Finance: Stock market data, financial statements, and credit reports are examples of secondary data sources in finance. These sources can provide investors with information on market trends, company performance, and creditworthiness.
  • Social Science: Government statistics, census data, and survey data are examples of secondary data sources in social science. These sources can provide researchers with information on population demographics, social trends, and political attitudes.
  • Environmental Science: Climate data, remote sensing data, and ecological monitoring data are examples of secondary data sources in environmental science. These sources can provide researchers with information on weather patterns, land use, and biodiversity.

Purpose of Secondary Data

The purpose of secondary data is to provide researchers with information that has already been collected by others for other purposes. Secondary data can be used to address research questions, test hypotheses, and meet research objectives. Some of the key purposes of secondary data are:

  • To gain a better understanding of the research topic: Secondary data can be used to provide context and background information on a research topic. This can help researchers understand the historical and social context of their research and gain insights into relevant variables and relationships.
  • To save time and resources: Collecting new primary data can be time-consuming and expensive. Using existing secondary data sources can save researchers time and resources by providing access to pre-existing data that has already been collected and organized.
  • To provide comparative data: Secondary data can be used to compare and contrast findings across different studies or datasets. This can help researchers identify trends, patterns, and relationships that may not have been apparent from individual studies.
  • To support triangulation: Triangulation is the process of using multiple sources of data to confirm or refute research findings. Secondary data can be used to support triangulation by providing additional sources of data to support or refute primary research findings.
  • To supplement primary data: Secondary data can be used to supplement primary data by providing additional information or insights that were not captured by the primary research. This can help researchers gain a more complete understanding of the research topic and draw more robust conclusions.

When to use Secondary Data

Secondary data can be useful in a variety of research contexts, and there are several situations in which it may be appropriate to use secondary data. Some common situations in which secondary data may be used include:

  • When primary data collection is not feasible: Collecting primary data can be time-consuming and expensive, and in some cases, it may not be feasible to collect primary data. In these situations, secondary data can provide valuable insights and information.
  • When exploring a new research area: Secondary data can be a useful starting point for researchers who are exploring a new research area. Secondary data can provide context and background information on a research topic, and can help researchers identify key variables and relationships to explore further.
  • When comparing and contrasting research findings: Secondary data can be used to compare and contrast findings across different studies or datasets. This can help researchers identify trends, patterns, and relationships that may not have been apparent from individual studies.
  • When triangulating research findings: Triangulation is the process of using multiple sources of data to confirm or refute research findings. Secondary data can be used to support triangulation by providing additional sources of data to support or refute primary research findings.
  • When validating research findings: Secondary data can be used to validate primary research findings by providing additional sources of data that support or refute the primary findings.

Characteristics of Secondary Data

Secondary data have several characteristics that distinguish them from primary data. Here are some of the key characteristics of secondary data:

  • Non-reactive: Secondary data are non-reactive with respect to the current study. Because they were collected independently of it, the study itself cannot affect how participants behaved or responded. The trade-off is that the researcher has no control over how the data were collected.
  • Time-saving: Secondary data are pre-existing, meaning that they have already been collected and organized by someone else. This can save the researcher time and resources, as they do not need to collect the data themselves.
  • Wide-ranging: Secondary data sources can provide a wide range of information on a variety of topics. This can be useful for researchers who are exploring a new research area or seeking to compare and contrast research findings.
  • Less expensive: Secondary data are generally less expensive than primary data, as they do not require the researcher to incur the costs associated with data collection.
  • Potential for bias: Secondary data may be subject to biases that were present in the original data collection process. For example, data may have been collected using a biased sampling method or the data may be incomplete or inaccurate.
  • Lack of control: The researcher has no control over the data collection process and cannot ensure that the data were collected using appropriate methods or measures.
  • Requires careful evaluation: Secondary data sources must be evaluated carefully to ensure that they are appropriate for the research question and analysis. This includes assessing the quality, reliability, and validity of the data sources.

Advantages of Secondary Data

There are several advantages to using secondary data in research, including:

  • Time-saving: Collecting primary data can be time-consuming and expensive. Secondary data can be accessed quickly and easily, which can save researchers time and resources.
  • Cost-effective: Secondary data are generally less expensive than primary data, as they do not require the researcher to incur the costs associated with data collection.
  • Large sample size: Secondary data sources often have larger sample sizes than primary data sources, which can increase the statistical power of the research.
  • Access to historical data: Secondary data sources can provide access to historical data, which can be useful for researchers who are studying trends over time.
  • Fewer ethical hurdles in data collection: Secondary data are already in existence, so the researcher does not have to recruit or interact with human subjects. Consent, privacy, and data-protection requirements may still apply to how the data are reused.
  • May be more objective: Secondary data may be more objective than primary data, as the data were not collected for the specific purpose of the research study.
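The sample-size advantage listed above can be made concrete. The standard error of a sample mean is the standard deviation divided by the square root of the sample size, so larger datasets give tighter estimates. The sketch below assumes a standard deviation of 15 and uses illustrative sample sizes; `standard_error` is a helper name introduced for this example.

```python
import math

def standard_error(std_dev, n):
    """Standard error of the mean for a sample of size n."""
    return std_dev / math.sqrt(n)

# Assumed standard deviation; the sample sizes are illustrative.
for n in (50, 500, 5000, 50000):
    print(f"n = {n:>6}: standard error = {standard_error(15.0, n):.3f}")
```

A smaller standard error only reflects precision. A large dataset collected with a biased sampling method remains biased, however many rows it contains.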

Limitations of Secondary Data

While there are many advantages to using secondary data in research, there are also some limitations that should be considered. Some of the main limitations of secondary data include:

  • Lack of control over data quality: Researchers do not have control over the data collection process, which means they cannot ensure the accuracy or completeness of the data.
  • Limited availability: Secondary data may not be available for the specific research question or study design.
  • Lack of information on sampling and data collection methods: Researchers may not have access to information on the sampling and data collection methods used to gather the secondary data. This can make it difficult to evaluate the quality of the data.
  • Data may not be up-to-date: Secondary data may not be up-to-date or relevant to the current research question.
  • Data may be incomplete or inaccurate: Secondary data may be incomplete or inaccurate due to missing or incorrect data points, data entry errors, or other factors.
  • Biases in data collection: The data may have been collected using biased sampling or data collection methods, which can limit the validity of the data.
  • Lack of control over variables: Researchers have limited control over the variables that were measured in the original data collection process, which can limit the ability to draw conclusions about causality.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer


What Is Secondary Data? A Complete Guide

What is secondary data, and why is it important? Find out in this post.

Within data analytics, there are many ways of categorizing data. A common distinction, for instance, is that between qualitative and quantitative data. In addition, you might also distinguish your data based on factors like sensitivity. For example, is it publicly available or is it highly confidential?

Probably the most fundamental distinction between different types of data is their source. Namely, are they primary, secondary, or third-party data? Each of these vital data sources supports the data analytics process in its own way. In this post, we’ll focus specifically on secondary data. We’ll look at its main characteristics, provide some examples, and highlight the main pros and cons of using secondary data in your analysis.

We’ll cover the following topics:

  • What is secondary data?

  • What’s the difference between primary, secondary, and third-party data?
  • What are some examples of secondary data?
  • How to analyse secondary data
  • Advantages of secondary data
  • Disadvantages of secondary data
  • Wrap-up and further reading

Ready to learn all about secondary data? Then let’s go.

1. What is secondary data?

Secondary data (also known as second-party data) refers to any dataset collected by any person other than the one using it.  

Secondary data sources are extremely useful. They allow researchers and data analysts to build large, high-quality databases that help solve business problems. By expanding their datasets with secondary data, analysts can enhance the quality and accuracy of their insights. Most secondary data comes from external organizations. However, secondary data also refers to that collected within an organization and then repurposed.

Secondary data has various benefits and drawbacks, which we’ll explore in detail later in this post. First, though, it’s essential to contextualize secondary data by understanding its relationship to two other sources of data: primary and third-party data. We’ll look at these next.

2. What’s the difference between primary, secondary, and third-party data?

To best understand secondary data, we need to know how it relates to the other main data sources: primary and third-party data.

What is primary data?

‘Primary data’ (also known as first-party data) are those directly collected or obtained by the organization or individual that intends to use them. Primary data are always collected for a specific purpose. This could be to inform a defined goal or objective or to address a particular business problem. 

For example, a real estate organization might want to analyze current housing market trends. This might involve conducting interviews, collecting facts and figures through surveys and focus groups, or capturing data via electronic forms. Focusing only on the data required to complete the task at hand ensures that primary data remain highly relevant. They’re also well-structured and of high quality.

What is secondary data?

As explained, ‘secondary data’ describes those collected for a purpose other than the task at hand. Secondary data can come from within an organization but more commonly originate from an external source. If it helps to make the distinction, secondary data is essentially just another organization’s primary data.

Secondary data sources are so numerous that they play an increasingly vital role in research and analytics. They are easier to source than primary data and can be repurposed to solve many different problems. While secondary data may be less relevant for a given task than primary data, they are often still well-structured and reliable.

What is third-party data?

‘Third-party data’ (sometimes referred to as tertiary data) refers to data collected and aggregated from numerous discrete sources by third-party organizations. Because third-party data combine data from numerous sources and aren’t collected with a specific goal in mind, the quality can be lower.

Third-party data also tend to be largely unstructured. They’re often beset by errors, duplicates, and so on, and require more processing to get them into a usable format. Nevertheless, used appropriately, third-party data are still a useful data analytics resource.
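As a small illustration of the extra processing that aggregated data can need, the sketch below normalises and de-duplicates a handful of records. The records, the choice of email address as the matching key, and the `normalise` helper are all assumptions made for this example rather than a prescribed cleaning procedure.

```python
# Hypothetical contact records aggregated from several sources.
records = [
    {"name": "Ada Lovelace", "email": "Ada@Example.com "},
    {"name": "ada lovelace", "email": "ada@example.com"},
    {"name": "Alan Turing", "email": "alan@example.com"},
    {"name": "Grace Hopper", "email": ""},
]

def normalise(record):
    """Tidy the name and lower-case the email, used here as the match key."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
    }

deduplicated = {}
rejected = []
for record in map(normalise, records):
    if not record["email"]:
        rejected.append(record)  # cannot be matched without a key
    else:
        # setdefault keeps the first record seen for each email address.
        deduplicated.setdefault(record["email"], record)

print(list(deduplicated.values()))
print("rejected:", rejected)
```

Matching on a single normalised field is the simplest possible rule; real aggregated datasets often need fuzzier matching across several fields.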

OK, now that we’ve placed secondary data in context, let’s explore some common sources and types of secondary data.

3. What are some examples of secondary data?

External secondary data.

Before we get to examples of secondary data, we first need to understand the types of organizations that generally provide them. Frequent sources of secondary data include:  

  • Government departments
  • Public sector organizations
  • Industry associations
  • Trade and industry bodies
  • Educational institutions
  • Private companies
  • Market research providers

While all these organizations provide secondary data, government sources are perhaps the most freely accessible. They are legally obliged to keep records when registering people, providing services, and so on. This type of secondary data is known as administrative data. It’s especially useful for creating detailed segment profiles, where analysts home in on a particular region, trend, market, or other demographic.

Types of secondary data vary. Popular examples of secondary data include:

  • Tax records and social security data
  • Census data (the U.S. Census Bureau is oft-referenced, as well as our favorite, the U.S. Bureau of Labor Statistics)
  • Electoral statistics
  • Health records
  • Books, journals, or other print media
  • Social media monitoring, internet searches, and other online data
  • Sales figures or other reports from third-party companies
  • Libraries and electronic filing systems
  • App data, e.g. location data, GPS data, timestamp data, etc.

Internal secondary data 

As mentioned, secondary data is not limited to that from a different organization. It can also come from within an organization itself.  

Sources of internal secondary data might include:

  • Sales reports
  • Annual accounts
  • Quarterly sales figures
  • Customer relationship management systems
  • Emails and metadata
  • Website cookies

In the right context, we can define practically any type of data as secondary data. The key takeaway is that the term ‘secondary data’ doesn’t refer to any inherent quality of the data themselves, but to how they are used. Any data source (external or internal) used for a task other than that for which it was originally collected can be described as secondary data.

4. How to analyze secondary data

Secondary data can be analyzed either quantitatively or qualitatively, depending on the kind of data the researcher is dealing with. Quantitative analysis applies to numerical data, which are analyzed mathematically. Qualitative analysis works with words and other non-numerical material to provide in-depth information about the data.

There are different stages of secondary data analysis, which involve events before, during, and after data collection. These stages include:

  • Statement of purpose: Before collecting secondary data, you need a clear statement of purpose. This means having a clear awareness of the goal of the research and how the data will help achieve it. This will guide you in collecting the right data and then choosing the best data source and method of analysis.
  • Research design: This is a plan for how the research activities will be carried out. It describes the kind of data to be collected, the sources of data collection, the method of data collection, the tools used, and the method of analysis. Once the purpose of the research has been identified, the researcher should design a research process that will guide the data analysis process.
  • Developing the research questions: Once you’ve identified the research purpose, you should also prepare research questions to help identify secondary data. For example, if a researcher is looking to learn more about why working adults are increasingly interested in the “gig economy” as opposed to full-time work, they may ask, “What are the main factors that influence adults’ decisions to engage in freelance work?” or, “Does education level have an effect on how people engage in freelance work?”
  • Identifying secondary data: Using the research questions as a guide, researchers then identify relevant data from the available sources. For example, if the research calls for qualitative data, a researcher can filter the sources for qualitative material.
  • Evaluating secondary data: Once relevant data has been identified and collated, it is evaluated to ensure it fulfils the criteria of the research topic. Then it is analyzed using either the quantitative or the qualitative method, depending on the type of data.

You can learn more about secondary data analysis in this post.

5. Advantages of secondary data

Secondary data is suitable for any number of analytics activities. The only limitation is a dataset’s format, structure, and whether or not it relates to the topic or problem at hand. 

When analyzing secondary data, the process has some minor differences, mainly in the preparation phase. Otherwise, it follows much the same path as any traditional data analytics project. 

More broadly, though, what are the advantages and disadvantages of using secondary data? Let’s take a look.

Advantages of using secondary data

It’s an economical use of time and resources: Because secondary data have already been collected, cleaned, and stored, this saves analysts much of the hard work that comes from collecting these data firsthand. For instance, for qualitative data, the complex tasks of deciding on appropriate research questions or how best to record the answers have already been completed. Secondary data saves data analysts and data scientists from having to start from scratch.

It provides a unique, detailed picture of a population: Certain types of secondary data, especially government administrative data, can provide access to levels of detail that it would otherwise be extremely difficult (or impossible) for organizations to collect on their own. Data from public sources, for instance, can provide organizations and individuals with a far greater level of population detail than they could ever hope to gather in-house. You can also obtain data covering longer time spans if you need them, e.g. stock market data, which provides decades’ worth of information.

Secondary data can build useful relationships: Acquiring secondary data usually involves making connections with organizations and analysts in fields that share some common ground with your own. This opens the door to a cross-pollination of disciplinary knowledge. You never know what nuggets of information or additional data resources you might find by building these relationships.

Secondary data tend to be high-quality: Unlike some data sources, e.g. third-party data, secondary data tend to be in excellent shape. In general, secondary datasets have already been validated and therefore require minimal checking. Often, as in the case of government data, datasets are also gathered and quality-assured by organizations with much more time and resources available. This further improves data quality and benefits smaller organizations that don’t have endless resources available.

It’s excellent for both data enrichment and informing primary data collection: Another benefit of secondary data is that they can be used to enhance and expand existing datasets. Secondary data can also inform primary data collection strategies. They can provide analysts or researchers with initial insights into the type of data they might want to collect themselves further down the line.

6. Disadvantages of secondary data

They aren’t always free: Sometimes you will have to pay for access to secondary data. While this can be a financial burden, the cost of purchasing a secondary dataset is usually far outweighed by the cost of planning for and collecting the data firsthand.

The data isn’t always suited to the problem at hand: While secondary data may tick many boxes concerning its relevance to a business problem, this is not always true. For instance, secondary data collection might have been in a geographical location or time period ill-suited to your analysis. Because analysts were not present when the data were initially collected, this may also limit the insights they can extract.

The data may not be in the preferred format: Even when a dataset provides the necessary information, that doesn’t mean it’s appropriately stored. A basic example: numbers might be stored as categorical data rather than numerical data. Another issue is that there may be gaps in the data. Categories that are too vague may limit the information you can glean. For instance, a dataset of people’s hair color that is limited to ‘brown, blonde and other’ will tell you very little about people with auburn, black, white, or gray hair.  
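As a minimal sketch of the format problem described above, the following Python snippet converts numbers that were stored as text into numeric values and flags the gaps. The rows, field names, and the `to_number` helper are hypothetical and exist only for illustration; a real dataset will likely need its own cleaning rules.

```python
def to_number(raw):
    """Convert a text-coded value such as "250,000" to a float.

    Returns None when the value is missing or cannot be parsed.
    """
    if raw is None:
        return None
    cleaned = str(raw).strip().replace(",", "")
    if cleaned == "":
        return None
    try:
        return float(cleaned)
    except ValueError:
        return None


# Hypothetical rows from a secondary dataset where prices were stored as text
rows = [
    {"region": "North", "median_price": "250,000"},
    {"region": "South", "median_price": " 198500 "},
    {"region": "East", "median_price": "n/a"},
]

# Replace each text value with its numeric equivalent (or None)
for row in rows:
    row["median_price"] = to_number(row["median_price"])

# Regions whose value could not be converted, i.e. gaps in the data
missing = [row["region"] for row in rows if row["median_price"] is None]
print(missing)
```

Returning `None` for values that cannot be parsed keeps the gaps visible so they can be reviewed rather than silently dropped.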

You can’t be sure how the data were collected: A structured, well-ordered secondary dataset may appear to be in good shape. However, it’s not always possible to know what issues might have occurred during data collection that will impact their quality. For instance, poor response rates will provide a limited view. While issues relating to data collection are sometimes made available alongside the datasets (e.g. for government data) this isn’t always the case. You should therefore treat secondary data with a reasonable degree of caution.

Being aware of these disadvantages is the first step towards mitigating them. While you should be aware of the risks associated with using secondary datasets, in general, the benefits far outweigh the drawbacks.

7. Wrap-up and further reading

In this post we’ve explored secondary data in detail. As we’ve seen, it’s not so different from other forms of data. What defines data as secondary data is how it is used rather than an inherent characteristic of the data themselves. 

To learn more about data analytics, check out this free, five-day introductory data analytics short course. You can also check out these articles to learn more about the data analytics process:

  • What is data cleaning and why is it important?
  • What is data visualization? A complete introductory guide
  • 10 Great places to find free datasets for your next project


Secondary research: definition, methods, & examples.

This ultimate guide to secondary research helps you understand changes in market trends, customers’ buying patterns, and your competition using existing data sources.

In situations where you’re not involved in the data gathering process (primary research), you have to rely on existing information and data to arrive at specific research conclusions or outcomes. This approach is known as secondary research.

In this article, we’re going to explain what secondary research is, how it works, and share some examples of it in practice.

What is secondary research?

Secondary research, also known as desk research, is a research method that involves compiling existing data sourced from a variety of channels. This includes internal sources (e.g. in-house research) or, more commonly, external sources (such as government statistics, organizational bodies, and the internet).

Secondary research comes in several formats, such as published datasets, reports, and survey responses, and can also be sourced from websites, libraries, and museums.

The information is usually free, or available at a limited access cost, and was originally gathered using surveys, telephone interviews, observation, face-to-face interviews, and more.

When using secondary research, researchers collect, verify, analyze, and incorporate existing data to help them meet their research goals.

As well as the above, it can be used to review previous research into an area of interest. Researchers can look for patterns across data spanning several years and identify trends — or use it to verify early hypothesis statements and establish whether it’s worth continuing research into a prospective area.

How to conduct secondary research

There are five key steps to conducting secondary research effectively and efficiently:

1.    Identify and define the research topic

First, understand what you will be researching and define the topic by thinking about the research questions you want to be answered.

Ask yourself: What is the point of conducting this research? Then, ask: What do we want to achieve?

This may indicate an exploratory reason (why something happened) or confirm a hypothesis. The answers may indicate ideas that need primary or secondary research (or a combination) to investigate them.

2.    Find research and existing data sources

If secondary research is needed, think about where you might find the information. This helps you narrow down your secondary sources to those that help you answer your questions. What keywords do you need to use?

Which organizations are closely working on this topic already? Are there any competitors that you need to be aware of?

Create a list of the data sources, information, and people that could help you with your work.

3.    Begin searching and collecting the existing data

Now that you have the list of data sources, start accessing the data and collect the information into an organized system. This may mean you start setting up research journal accounts or making telephone calls to book meetings with third-party research teams to verify the details around data results.

As you search and access information, remember to check the data’s date, the credibility of the source, the relevance of the material to your research topic, and the methodology used by the third-party researchers. Start small and as you gain results, investigate further in the areas that help your research’s aims.
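The checks above can be recorded in a simple source log. The sketch below is only an illustration: the `Source` fields, the five-year age threshold, and the example entries are assumptions, not part of any standard, and credibility and relevance would still need a human judgment.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Source:
    name: str
    published: date               # date of the data
    methodology_documented: bool  # is the collection method available?


def screen(sources, as_of, max_age_years=5):
    """Split sources into those kept and those flagged for closer review."""
    kept, flagged = [], []
    for source in sources:
        age_years = (as_of - source.published).days / 365.25
        if age_years <= max_age_years and source.methodology_documented:
            kept.append(source.name)
        else:
            flagged.append(source.name)
    return kept, flagged


# Hypothetical entries in a research log
sources = [
    Source("Census extract", date(2021, 4, 1), True),
    Source("Trade body report", date(2012, 6, 30), True),
    Source("Anonymous blog figures", date(2023, 1, 15), False),
]

kept, flagged = screen(sources, as_of=date(2024, 1, 1))
print(kept, flagged)
```

Flagged sources are not necessarily unusable; older data can still be valuable for long-term comparisons.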

4.    Combine the data and compare the results

When you have your data in one place, you need to understand, filter, order, and combine it intelligently. Data may come in different formats where some data could be unusable, while other information may need to be deleted.

After this, you can start to look at different data sets to see what they tell you. You may find that you need to compare the same datasets over different periods for changes over time or compare different datasets to notice overlaps or trends. Ask yourself: What does this data mean to my research? Does it help or hinder my research?
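Comparing the same dataset over two periods can be sketched in a few lines of Python. The figures and region names below are invented for illustration, and `compare_periods` is a hypothetical helper rather than a standard function.

```python
# Hypothetical figures: units sold per region in two reporting periods
sales_2022 = {"North": 1200, "South": 950, "East": 400}
sales_2023 = {"North": 1380, "South": 900, "West": 310}


def compare_periods(earlier, later):
    """Return the percentage change for keys present in both periods,
    plus the keys that appear in only one of them."""
    shared = sorted(set(earlier) & set(later))
    changes = {
        key: round((later[key] - earlier[key]) / earlier[key] * 100, 1)
        for key in shared
        if earlier[key] != 0  # avoid dividing by zero
    }
    unmatched = sorted(set(earlier) ^ set(later))
    return changes, unmatched


changes, unmatched = compare_periods(sales_2022, sales_2023)
print(changes)    # percentage change per shared region
print(unmatched)  # regions that cannot be compared directly
```

Listing the unmatched keys separately makes it clear which parts of the data cannot be compared and may need another source.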

5.    Analyze your data and explore further

In this last stage of the process, look at the information you have and ask yourself if this answers your original questions for your research. Are there any gaps? Do you understand the information you’ve found? If you feel there is more to cover, repeat the steps and delve deeper into the topic so that you can get all the information you need.

If secondary research can’t provide these answers, consider supplementing your results with data gained from primary research. As you explore further, add to your knowledge and update your findings. This will help you present clear, credible information.

Primary vs secondary research

Unlike secondary research, primary research involves creating data first-hand by directly working with interviewees, target users, or a target market. Primary research focuses on the method for carrying out research, asking questions, and collecting data using approaches such as:

  • Interviews (panel, face-to-face or over the phone)
  • Questionnaires or surveys
  • Focus groups

Using these methods, researchers can get in-depth, targeted responses to questions, making results more accurate and specific to their research goals. However, it does take time to do and administer.

Unlike primary research, secondary research uses existing data, which also includes published results from primary research. Researchers summarize the existing research and use the results to support their research goals.

Both primary and secondary research have their places. Primary research can support the findings from secondary research (and fill knowledge gaps), while secondary research can be a starting point for further primary research. Because of this, these research methods are often combined to produce results that are accurate at both the micro and macro level.

Sources of Secondary Research

There are two types of secondary research sources: internal and external. Internal data refers to in-house data that can be gathered from the researcher’s organization. External data refers to data published outside of and not owned by the researcher’s organization.

Internal data

Internal data is a good first port of call for insights and knowledge, as you may already have relevant information stored in your systems. Because you own this information, and it won’t be available to other researchers, it can give you a competitive edge. Examples of internal data include:

  • Database information on sales history and business goal conversions
  • Information from website applications and mobile site data
  • Customer-generated data on product and service efficiency and use
  • Previous research results or supplemental research areas
  • Previous campaign results

External data

External data is useful when you: 1) need information on a new topic, 2) want to fill in gaps in your knowledge, or 3) want data that breaks down a population or market for trend and pattern analysis. Examples of external data include:

  • Government, non-government agencies, and trade body statistics
  • Company reports and research
  • Competitor research
  • Public library collections
  • Textbooks and research journals
  • Media stories in newspapers
  • Online journals and research sites

Three examples of secondary research methods in action

How and why might you conduct secondary research? Let’s look at a few examples:

1.    Collecting factual information from the internet on a specific topic or market

There are plenty of sites that hold data for people to view and use in their research. For example, Google Scholar, ResearchGate, or Wiley Online Library all provide previous research on a particular topic. Researchers can create free accounts and use the search facilities to look into a topic by keyword, before following the instructions to download or export results for further analysis.

This can be useful for exploring a new market that your organization wants to consider entering. For instance, by viewing U.S. Census Bureau demographic data for that area, you can see what the demographics of your target audience are, and create compelling marketing campaigns accordingly.

2.    Finding out the views of your target audience on a particular topic

If you’re interested in seeing the historical views on a particular topic, for example, attitudes to women’s rights in the US, you can turn to secondary sources.

Textbooks, news articles, reviews, and journal entries can all provide qualitative reports and interviews covering how people discussed women’s rights. There may be multimedia elements like video or documented posters of propaganda showing biased language usage.

By gathering this information, synthesizing it, and evaluating the language, who created it and when it was shared, you can create a timeline of how a topic was discussed over time.

3.    When you want to know the latest thinking on a topic

Educational institutions, such as schools and colleges, create a lot of research-based reports on younger audiences or their academic specialisms. Students’ dissertations can also be submitted to research journals, making these journals useful places to see the latest insights from a new generation of academics.

Information can be requested — and sometimes academic institutions may want to collaborate and conduct research on your behalf. This can provide key primary data in areas that you want to research, as well as secondary data sources for your research.

Advantages of secondary research

There are several benefits of using secondary research, which we’ve outlined below:

  • Easily and readily available data – There is an abundance of readily accessible data sources that have been pre-collected for use, in person at local libraries and online using the internet. This data is usually sorted by filters or can be exported into spreadsheet format, meaning that little technical expertise is needed to access and use the data.
  • Faster research speeds – Since the data is already published and in the public arena, you don’t need to collect this information through primary research. This can make the research easier to do and faster, as you can get started with the data quickly.
  • Low financial and time costs – Most secondary data sources can be accessed for free or at a small cost to the researcher, so the overall research costs are kept low. In addition, by saving on preliminary research, the time costs for the researcher are kept down as well.
  • Secondary data can drive additional research actions – The insights gained can support future research activities (like conducting a follow-up survey or specifying future detailed research topics) or help add value to these activities.
  • Secondary data can be useful pre-research insights – Secondary source data can provide pre-research insights and information on effects that can help resolve whether research should be conducted. It can also help highlight knowledge gaps, so subsequent research can consider this.
  • Ability to scale up results – Secondary sources can include large datasets (like Census data results across several states) so research results can be scaled up quickly using large secondary data sources.

Disadvantages of secondary research

The disadvantages of secondary research are worth considering in advance of conducting research:

  • Secondary research data can be out of date – Secondary sources can be updated regularly, but if you’re exploring the data between two updates, the data can be out of date. Researchers will need to consider whether the data available provides the right research coverage dates, so that insights are accurate and timely, or if the data needs to be updated. Also, fast-moving markets may find secondary data expires very quickly.
  • Secondary research needs to be verified and interpreted – Where there’s a lot of data from one source, a researcher needs to review and analyze it. The data may need to be verified against other data sets or your hypotheses for accuracy and to ensure you’re using the right data for your research.
  • The researcher has had no control over the secondary research – As the researcher was not involved in the original data collection, invalid data can affect the results. It’s therefore vital to review the methodology and controls closely to confirm that the data was collected in a systematic, reliable way.
  • Secondary research data is not exclusive – As data sets are commonly available, there is no exclusivity and many researchers can use the same data. This can be problematic where researchers want to have exclusive rights over the research results and risk duplication of research in the future.

When do we conduct secondary research?

Now that you know the basics of secondary research, when do researchers normally conduct secondary research?

It’s often used at the beginning of research, when the researcher is trying to understand the current landscape. In addition, if the research area is new to the researcher, it can form crucial background context to help them understand what information exists already. This can plug knowledge gaps, supplement the researcher’s own learning or add to the research.

Secondary research can also be used in conjunction with primary research. Secondary research can become the formative research that helps pinpoint where further primary research is needed to find out specific information. It can also support or verify the findings from primary research.

You can use secondary research where high levels of control aren’t needed by the researcher, but a lot of knowledge on a topic is required from different angles.

Secondary research should not be used in place of primary research as both are very different and are used for various circumstances.

Questions to ask before conducting secondary research

Before you start your secondary research, ask yourself these questions:

  • Is there similar internal data that we have created for a similar area in the past?

If your organization has past research, it’s best to review this work before starting a new project. The older work may provide you with the answers, and give you a starting dataset and context of how your organization approached the research before. However, be mindful that the work is probably out of date and view it with that note in mind. Read through and look for where this helps your research goals or where more work is needed.

  • What am I trying to achieve with this research?

When you have clear goals, and understand what you need to achieve, you can look for the type of secondary or primary research that best supports those aims. Different secondary research data will provide you with different information. For example, news stories won’t tell you as much about your market’s buying patterns as internal or external e-commerce and sales data sources.

  • How credible will my research be?

If you are looking for credibility, you want to consider how accurate the research results will need to be, and if you can sacrifice credibility for speed by using secondary sources to get you started. Bear in mind which sources you choose — low-credibility data sites, like political party websites that are highly biased to favor their own party, would skew your results.

  • What is the date of the secondary research?

When you’re looking to conduct research, you want the results to be as useful as possible, so data that is 10 years old is unlikely to be as accurate as data created a year ago. Since a lot can change in a few years, note the date of the research and look for newer data sets that give you a more recent picture. One exception is long-term data used for comparisons with earlier periods, which can tell you about the rate and direction of change.

  • Can the data sources be verified? Does the information you have check out?

If you can’t verify the data by looking at the research methodology, speaking to the original team or cross-checking the facts with other research, it could be hard to be sure that the data is accurate. Think about whether you can use another source, or if it’s worth doing some supplementary primary research to replicate and verify results to help with this issue.


  • Sound Studies
  • Browse content in Performing Arts
  • Browse content in Philosophy
  • Aesthetics and Philosophy of Art
  • Epistemology
  • Feminist Philosophy
  • History of Western Philosophy
  • Metaphysics
  • Moral Philosophy
  • Non-Western Philosophy
  • Philosophy of Language
  • Philosophy of Mind
  • Philosophy of Perception
  • Philosophy of Action
  • Philosophy of Law
  • Philosophy of Religion
  • Philosophy of Science
  • Philosophy of Mathematics and Logic
  • Practical Ethics
  • Social and Political Philosophy
  • Browse content in Religion
  • Biblical Studies
  • Christianity
  • East Asian Religions
  • History of Religion
  • Judaism and Jewish Studies
  • Qumran Studies
  • Religion and Education
  • Religion and Health
  • Religion and Politics
  • Religion and Science
  • Religion and Law
  • Religion and Art, Literature, and Music
  • Religious Studies
  • Browse content in Society and Culture
  • Cookery, Food, and Drink
  • Cultural Studies
  • Customs and Traditions
  • Ethical Issues and Debates
  • Hobbies, Games, Arts and Crafts
  • Lifestyle, Home, and Garden
  • Natural world, Country Life, and Pets
  • Popular Beliefs and Controversial Knowledge
  • Sports and Outdoor Recreation
  • Technology and Society
  • Travel and Holiday
  • Visual Culture
  • Browse content in Law
  • Arbitration
  • Browse content in Company and Commercial Law
  • Commercial Law
  • Company Law
  • Browse content in Comparative Law
  • Systems of Law
  • Competition Law
  • Browse content in Constitutional and Administrative Law
  • Government Powers
  • Judicial Review
  • Local Government Law
  • Military and Defence Law
  • Parliamentary and Legislative Practice
  • Construction Law
  • Contract Law
  • Browse content in Criminal Law
  • Criminal Procedure
  • Criminal Evidence Law
  • Sentencing and Punishment
  • Employment and Labour Law
  • Environment and Energy Law
  • Browse content in Financial Law
  • Banking Law
  • Insolvency Law
  • History of Law
  • Human Rights and Immigration
  • Intellectual Property Law
  • Browse content in International Law
  • Private International Law and Conflict of Laws
  • Public International Law
  • IT and Communications Law
  • Jurisprudence and Philosophy of Law
  • Law and Society
  • Law and Politics
  • Browse content in Legal System and Practice
  • Courts and Procedure
  • Legal Skills and Practice
  • Primary Sources of Law
  • Regulation of Legal Profession
  • Medical and Healthcare Law
  • Browse content in Policing
  • Criminal Investigation and Detection
  • Police and Security Services
  • Police Procedure and Law
  • Police Regional Planning
  • Browse content in Property Law
  • Personal Property Law
  • Study and Revision
  • Terrorism and National Security Law
  • Browse content in Trusts Law
  • Wills and Probate or Succession
  • Browse content in Medicine and Health
  • Browse content in Allied Health Professions
  • Arts Therapies
  • Clinical Science
  • Dietetics and Nutrition
  • Occupational Therapy
  • Operating Department Practice
  • Physiotherapy
  • Radiography
  • Speech and Language Therapy
  • Browse content in Anaesthetics
  • General Anaesthesia
  • Neuroanaesthesia
  • Clinical Neuroscience
  • Browse content in Clinical Medicine
  • Acute Medicine
  • Cardiovascular Medicine
  • Clinical Genetics
  • Clinical Pharmacology and Therapeutics
  • Dermatology
  • Endocrinology and Diabetes
  • Gastroenterology
  • Genito-urinary Medicine
  • Geriatric Medicine
  • Infectious Diseases
  • Medical Toxicology
  • Medical Oncology
  • Pain Medicine
  • Palliative Medicine
  • Rehabilitation Medicine
  • Respiratory Medicine and Pulmonology
  • Rheumatology
  • Sleep Medicine
  • Sports and Exercise Medicine
  • Community Medical Services
  • Critical Care
  • Emergency Medicine
  • Forensic Medicine
  • Haematology
  • History of Medicine
  • Browse content in Medical Skills
  • Clinical Skills
  • Communication Skills
  • Nursing Skills
  • Surgical Skills
  • Medical Ethics
  • Browse content in Medical Dentistry
  • Oral and Maxillofacial Surgery
  • Paediatric Dentistry
  • Restorative Dentistry and Orthodontics
  • Surgical Dentistry
  • Medical Statistics and Methodology
  • Browse content in Neurology
  • Clinical Neurophysiology
  • Neuropathology
  • Nursing Studies
  • Browse content in Obstetrics and Gynaecology
  • Gynaecology
  • Occupational Medicine
  • Ophthalmology
  • Otolaryngology (ENT)
  • Browse content in Paediatrics
  • Neonatology
  • Browse content in Pathology
  • Chemical Pathology
  • Clinical Cytogenetics and Molecular Genetics
  • Histopathology
  • Medical Microbiology and Virology
  • Patient Education and Information
  • Browse content in Pharmacology
  • Psychopharmacology
  • Browse content in Popular Health
  • Caring for Others
  • Complementary and Alternative Medicine
  • Self-help and Personal Development
  • Browse content in Preclinical Medicine
  • Cell Biology
  • Molecular Biology and Genetics
  • Reproduction, Growth and Development
  • Primary Care
  • Professional Development in Medicine
  • Browse content in Psychiatry
  • Addiction Medicine
  • Child and Adolescent Psychiatry
  • Forensic Psychiatry
  • Learning Disabilities
  • Old Age Psychiatry
  • Psychotherapy
  • Browse content in Public Health and Epidemiology
  • Epidemiology
  • Public Health
  • Browse content in Radiology
  • Clinical Radiology
  • Interventional Radiology
  • Nuclear Medicine
  • Radiation Oncology
  • Reproductive Medicine
  • Browse content in Surgery
  • Cardiothoracic Surgery
  • Gastro-intestinal and Colorectal Surgery
  • General Surgery
  • Neurosurgery
  • Paediatric Surgery
  • Peri-operative Care
  • Plastic and Reconstructive Surgery
  • Surgical Oncology
  • Transplant Surgery
  • Trauma and Orthopaedic Surgery
  • Vascular Surgery
  • Browse content in Science and Mathematics
  • Browse content in Biological Sciences
  • Aquatic Biology
  • Biochemistry
  • Bioinformatics and Computational Biology
  • Developmental Biology
  • Ecology and Conservation
  • Evolutionary Biology
  • Genetics and Genomics
  • Microbiology
  • Molecular and Cell Biology
  • Natural History
  • Plant Sciences and Forestry
  • Research Methods in Life Sciences
  • Structural Biology
  • Systems Biology
  • Zoology and Animal Sciences
  • Browse content in Chemistry
  • Analytical Chemistry
  • Computational Chemistry
  • Crystallography
  • Environmental Chemistry
  • Industrial Chemistry
  • Inorganic Chemistry
  • Materials Chemistry
  • Medicinal Chemistry
  • Mineralogy and Gems
  • Organic Chemistry
  • Physical Chemistry
  • Polymer Chemistry
  • Study and Communication Skills in Chemistry
  • Theoretical Chemistry
  • Browse content in Computer Science
  • Artificial Intelligence
  • Computer Architecture and Logic Design
  • Game Studies
  • Human-Computer Interaction
  • Mathematical Theory of Computation
  • Programming Languages
  • Software Engineering
  • Systems Analysis and Design
  • Virtual Reality
  • Browse content in Computing
  • Business Applications
  • Computer Games
  • Computer Security
  • Computer Networking and Communications
  • Digital Lifestyle
  • Graphical and Digital Media Applications
  • Operating Systems
  • Browse content in Earth Sciences and Geography
  • Atmospheric Sciences
  • Environmental Geography
  • Geology and the Lithosphere
  • Maps and Map-making
  • Meteorology and Climatology
  • Oceanography and Hydrology
  • Palaeontology
  • Physical Geography and Topography
  • Regional Geography
  • Soil Science
  • Urban Geography
  • Browse content in Engineering and Technology
  • Agriculture and Farming
  • Biological Engineering
  • Civil Engineering, Surveying, and Building
  • Electronics and Communications Engineering
  • Energy Technology
  • Engineering (General)
  • Environmental Science, Engineering, and Technology
  • History of Engineering and Technology
  • Mechanical Engineering and Materials
  • Technology of Industrial Chemistry
  • Transport Technology and Trades
  • Browse content in Environmental Science
  • Applied Ecology (Environmental Science)
  • Conservation of the Environment (Environmental Science)
  • Environmental Sustainability
  • Environmentalist Thought and Ideology (Environmental Science)
  • Management of Land and Natural Resources (Environmental Science)
  • Natural Disasters (Environmental Science)
  • Nuclear Issues (Environmental Science)
  • Pollution and Threats to the Environment (Environmental Science)
  • Social Impact of Environmental Issues (Environmental Science)
  • History of Science and Technology
  • Browse content in Materials Science
  • Ceramics and Glasses
  • Composite Materials
  • Metals, Alloying, and Corrosion
  • Nanotechnology
  • Browse content in Mathematics
  • Applied Mathematics
  • Biomathematics and Statistics
  • History of Mathematics
  • Mathematical Education
  • Mathematical Finance
  • Mathematical Analysis
  • Numerical and Computational Mathematics
  • Probability and Statistics
  • Pure Mathematics
  • Browse content in Neuroscience
  • Cognition and Behavioural Neuroscience
  • Development of the Nervous System
  • Disorders of the Nervous System
  • History of Neuroscience
  • Invertebrate Neurobiology
  • Molecular and Cellular Systems
  • Neuroendocrinology and Autonomic Nervous System
  • Neuroscientific Techniques
  • Sensory and Motor Systems
  • Browse content in Physics
  • Astronomy and Astrophysics
  • Atomic, Molecular, and Optical Physics
  • Biological and Medical Physics
  • Classical Mechanics
  • Computational Physics
  • Condensed Matter Physics
  • Electromagnetism, Optics, and Acoustics
  • History of Physics
  • Mathematical and Statistical Physics
  • Measurement Science
  • Nuclear Physics
  • Particles and Fields
  • Plasma Physics
  • Quantum Physics
  • Relativity and Gravitation
  • Semiconductor and Mesoscopic Physics
  • Browse content in Psychology
  • Affective Sciences
  • Clinical Psychology
  • Cognitive Psychology
  • Cognitive Neuroscience
  • Criminal and Forensic Psychology
  • Developmental Psychology
  • Educational Psychology
  • Evolutionary Psychology
  • Health Psychology
  • History and Systems in Psychology
  • Music Psychology
  • Neuropsychology
  • Organizational Psychology
  • Psychological Assessment and Testing
  • Psychology of Human-Technology Interaction
  • Psychology Professional Development and Training
  • Research Methods in Psychology
  • Social Psychology
  • Browse content in Social Sciences
  • Browse content in Anthropology
  • Anthropology of Religion
  • Human Evolution
  • Medical Anthropology
  • Physical Anthropology
  • Regional Anthropology
  • Social and Cultural Anthropology
  • Theory and Practice of Anthropology
  • Browse content in Business and Management
  • Business Ethics
  • Business History
  • Business Strategy
  • Business and Technology
  • Business and Government
  • Business and the Environment
  • Comparative Management
  • Corporate Governance
  • Corporate Social Responsibility
  • Entrepreneurship
  • Health Management
  • Human Resource Management
  • Industrial and Employment Relations
  • Industry Studies
  • Information and Communication Technologies
  • International Business
  • Knowledge Management
  • Management and Management Techniques
  • Operations Management
  • Organizational Theory and Behaviour
  • Pensions and Pension Management
  • Public and Nonprofit Management
  • Strategic Management
  • Supply Chain Management
  • Browse content in Criminology and Criminal Justice
  • Criminal Justice
  • Criminology
  • Forms of Crime
  • International and Comparative Criminology
  • Youth Violence and Juvenile Justice
  • Development Studies
  • Browse content in Economics
  • Agricultural, Environmental, and Natural Resource Economics
  • Asian Economics
  • Behavioural Finance
  • Behavioural Economics and Neuroeconomics
  • Econometrics and Mathematical Economics
  • Economic History
  • Economic Methodology
  • Economic Systems
  • Economic Development and Growth
  • Financial Markets
  • Financial Institutions and Services
  • General Economics and Teaching
  • Health, Education, and Welfare
  • History of Economic Thought
  • International Economics
  • Labour and Demographic Economics
  • Law and Economics
  • Macroeconomics and Monetary Economics
  • Microeconomics
  • Public Economics
  • Urban, Rural, and Regional Economics
  • Welfare Economics
  • Browse content in Education
  • Adult Education and Continuous Learning
  • Care and Counselling of Students
  • Early Childhood and Elementary Education
  • Educational Equipment and Technology
  • Educational Strategies and Policy
  • Higher and Further Education
  • Organization and Management of Education
  • Philosophy and Theory of Education
  • Schools Studies
  • Secondary Education
  • Teaching of a Specific Subject
  • Teaching of Specific Groups and Special Educational Needs
  • Teaching Skills and Techniques
  • Browse content in Environment
  • Applied Ecology (Social Science)
  • Climate Change
  • Conservation of the Environment (Social Science)
  • Environmentalist Thought and Ideology (Social Science)
  • Natural Disasters (Environment)
  • Social Impact of Environmental Issues (Social Science)
  • Browse content in Human Geography
  • Cultural Geography
  • Economic Geography
  • Political Geography
  • Browse content in Interdisciplinary Studies
  • Communication Studies
  • Museums, Libraries, and Information Sciences
  • Browse content in Politics
  • African Politics
  • Asian Politics
  • Chinese Politics
  • Comparative Politics
  • Conflict Politics
  • Elections and Electoral Studies
  • Environmental Politics
  • European Union
  • Foreign Policy
  • Gender and Politics
  • Human Rights and Politics
  • Indian Politics
  • International Relations
  • International Organization (Politics)
  • International Political Economy
  • Irish Politics
  • Latin American Politics
  • Middle Eastern Politics
  • Political Behaviour
  • Political Economy
  • Political Institutions
  • Political Theory
  • Political Methodology
  • Political Communication
  • Political Philosophy
  • Political Sociology
  • Politics and Law
  • Public Policy
  • Public Administration
  • Quantitative Political Methodology
  • Regional Political Studies
  • Russian Politics
  • Security Studies
  • State and Local Government
  • UK Politics
  • US Politics
  • Browse content in Regional and Area Studies
  • African Studies
  • Asian Studies
  • East Asian Studies
  • Japanese Studies
  • Latin American Studies
  • Middle Eastern Studies
  • Native American Studies
  • Scottish Studies
  • Browse content in Research and Information
  • Research Methods
  • Browse content in Social Work
  • Addictions and Substance Misuse
  • Adoption and Fostering
  • Care of the Elderly
  • Child and Adolescent Social Work
  • Couple and Family Social Work
  • Developmental and Physical Disabilities Social Work
  • Direct Practice and Clinical Social Work
  • Emergency Services
  • Human Behaviour and the Social Environment
  • International and Global Issues in Social Work
  • Mental and Behavioural Health
  • Social Justice and Human Rights
  • Social Policy and Advocacy
  • Social Work and Crime and Justice
  • Social Work Macro Practice
  • Social Work Practice Settings
  • Social Work Research and Evidence-based Practice
  • Welfare and Benefit Systems
  • Browse content in Sociology
  • Childhood Studies
  • Community Development
  • Comparative and Historical Sociology
  • Economic Sociology
  • Gender and Sexuality
  • Gerontology and Ageing
  • Health, Illness, and Medicine
  • Marriage and the Family
  • Migration Studies
  • Occupations, Professions, and Work
  • Organizations
  • Population and Demography
  • Race and Ethnicity
  • Social Theory
  • Social Movements and Social Change
  • Social Research and Statistics
  • Social Stratification, Inequality, and Mobility
  • Sociology of Religion
  • Sociology of Education
  • Sport and Leisure
  • Urban and Rural Studies
  • Browse content in Warfare and Defence
  • Defence Strategy, Planning, and Research
  • Land Forces and Warfare
  • Military Administration
  • Military Life and Institutions
  • Naval Forces and Warfare
  • Other Warfare and Defence Issues
  • Peace Studies and Conflict Resolution
  • Weapons and Equipment

The Oxford Handbook of Quantitative Methods in Psychology: Vol. 2: Statistical Analysis

28 Secondary Data Analysis

Department of Psychology, Michigan State University

Richard E. Lucas, Department of Psychology, Michigan State University, East Lansing, MI

  • Published: 01 October 2013

Secondary data analysis refers to the analysis of existing data collected by others. Secondary analysis affords researchers the opportunity to investigate research questions using large-scale data sets that are often inclusive of under-represented groups, while saving time and resources. Despite the immense potential for secondary analysis as a tool for researchers in the social sciences, it is not widely used by psychologists and is sometimes met with sharp criticism among those who favor primary research. The goal of this chapter is to summarize the promises and pitfalls associated with secondary data analysis and to highlight the importance of archival resources for advancing psychological science. In addition to describing areas of convergence and divergence between primary and secondary data analysis, we outline basic steps for getting started and finding data sets. We also provide general guidance on issues related to measurement, handling missing data, and the use of survey weights.

The goal of research in the social sciences is to gain a better understanding of the world and how well theoretical predictions match empirical realities. Secondary data analysis contributes to these objectives through the application of “creative analytical techniques to data that have been amassed by others” (Kiecolt & Nathan, 1985, p. 10). Primary researchers design new studies to answer research questions, whereas the secondary data analyst uses existing resources. There is a deliberate coupling of research design and data analysis in primary research; however, the secondary data analyst rarely has had input into the design of the original studies in terms of the sampling strategy and measures selected for the investigation. For better or worse, the secondary data analyst simply has access to the final products of the data collection process in the form of a codebook or set of codebooks and a cleaned data set.

The analysis of existing data sets is routine in disciplines such as economics, political science, and sociology, but it is less well established in psychology ( but see   Brooks-Gunn & Chase-Lansdale, 1991 ; Brooks-Gunn, Berlin, Leventhal, & Fuligini, 2000 ). Moreover, biases against secondary data analysis in favor of primary research may be present in psychology ( see   McCall & Appelbaum, 1991 ). One possible explanation for this bias is that psychology has a rich and vibrant experimental tradition, and the training of many psychologists has likely emphasized this approach as the “gold standard” for addressing research questions and establishing causality ( see , e.g., Cronbach, 1957 ). As a result, the nonexperimental methods that are typically used in secondary analyses may be viewed by some as inferior. Psychological scientists trained in the experimental tradition may not fully appreciate the unique strengths that nonexperimental techniques have to offer and may underestimate the time, effort, and skills required for conducting secondary data analyses in a competent and professional manner. Finally, biases against secondary data analysis might stem from lingering concerns over the validity of the self-report methods that are typically used in secondary data analysis. These can include concerns about the possibility that placement of items in a survey can influence responses (e.g., differences in the average levels of reported marital and life satisfaction when questions occur back to back as opposed to having the questions separated in the survey; see   Schwarz, 1999 ; Schwarz & Strack, 1999 ) and concerns with biased reporting of sensitive behaviors ( but see   Akers, Massey, & Clarke, 1983 ).

Despite the initial reluctance to widely embrace secondary data analysis as a tool for psychological research, there are promising signs that the skepticism toward secondary analyses will diminish as psychology seeks to position itself as a hub science that plays a key role in interdisciplinary inquiry ( see   Mroczek, Pitzer, Miller, Turiano, & Fingerman, 2011 ). Accordingly, there is a compelling argument for including secondary data analysis into the suite of methodological approaches used by psychologists ( see   Trzesniewski, Donnellan, & Lucas, 2011 ).

The goal of this chapter is to summarize the promises and pitfalls associated with secondary data analysis and to highlight the importance of archival resources for advancing psychological science. We limit our discussion to analyses based on large-scale and often longitudinal national data sets such as the National Longitudinal Study of Adolescent Health (Add Health), the British Household Panel Study (BHPS), the German Socioeconomic Panel Study (GSOEP), and the National Institute of Child Health and Human Development (NICHD) Study of Early Child Care and Youth Development (SECCYD). However, much of our discussion applies to all secondary analyses. The perspective and specific recommendations found in this chapter draw on the edited volume by Trzesniewski et al. (2011). Following a general introduction to secondary data analysis, we outline the necessary steps for getting started and finding data sets. Finally, we provide some general guidance on issues related to measurement, approaches to handling missing data, and survey weighting. Our treatment of these important topics is intended to draw attention to the relevant issues rather than to provide extensive coverage. Throughout, we take a practical approach to the issues and offer tips and guidance rooted in our experiences as data analysts and researchers with substantive interests in personality and life span developmental psychology.

Comparing Primary Research and Secondary Research

As noted in the opening section, it is possible that biases against secondary data analysis exist in the minds of some psychological scientists. To address these concerns, we have found it can be helpful to explicitly compare the processes of secondary analyses with primary research (see also McCall & Appelbaum, 1991). An idealized and simplified list of steps is provided in Table 28.1. As is evident from this table, both techniques start with a research question that is ideally rooted in existing theory and previous empirical results. The areas of biggest divergence between primary and secondary approaches occur after researchers have identified their questions (i.e., Steps 2 through 5 in Table 28.1). At this point, the primary researcher develops a set of procedures and then engages in pilot testing to refine procedures and methods, whereas the secondary analyst searches for data sets and evaluates codebooks. The primary researcher attempts to refine her or his procedures, whereas the secondary analyst determines whether a particular resource is appropriate for addressing the question at hand. In the next stages, the primary researcher collects new data, whereas the secondary data analyst constructs a working data set from a much larger data archive. At these stages, both types of researchers must grapple with the practical considerations imposed by real-world constraints. There is no such thing as a perfect single study (see Hunter & Schmidt, 2004), as all data sets are subject to limitations stemming from design and implementation. For example, the primary researcher may not have enough subjects to generate adequate levels of statistical power (because of a failure to take power calculations into account during the design phase, time or other resource constraints during the data collection phase, or because of problems with sample retention), whereas the secondary data analyst may have to cope with impoverished measurement of core constructs. Both sets of considerations will affect the ability of a given study to detect effects and provide unbiased estimates of effect sizes.

Table 28.1 also illustrates the fact that there are considerable areas of overlap between the two techniques. Researchers stemming from both traditions analyze data, interpret results, and write reports for dissemination to the wider scientific community. Both kinds of research require a significant investment of time and intellectual resources. Many skills required in conducting high-quality primary research are also required in conducting high-quality secondary data analysis including sound scientific judgment, attention to detail, and a firm grasp of statistical methodology.

Note: Steps modified and expanded from McCall and Appelbaum (1991).

We argue that both primary research and secondary data analysis have the potential to provide meaningful and scientifically valid research findings for psychology. Both approaches can generate new knowledge and are therefore reasonable ways of evaluating research questions. Blanket pronouncements that one approach is inherently superior to the other are usually difficult to justify. Many of the concerns about secondary data analysis are raised in the context of an unfair comparison—a contrast between the idealized conceptualization of primary research with the actual process of a secondary data analysis. Our point is that both approaches can be conducted in a thoughtful and rigorous manner, yet both approaches involve concessions to real-world constraints. Accordingly, we encourage all researchers and reviewers of papers to keep an open mind about the importance of both types of research.

Advantages and Disadvantages of Secondary Data Analysis

The foremost reason why psychologists should learn about secondary data analysis is that there are many existing data sets that can be used to answer interesting and important questions. Individuals who are unaware of these resources are likely to miss crucial opportunities to contribute new knowledge to the discipline and even risk reinventing the proverbial wheel by collecting new data. Regrettably, new data collection efforts may occur on a smaller scale than what is available in large national data sets. Researchers who are unaware of the potential treasure trove of variables in existing data sets risk spending considerable time and effort duplicating work that has already been done. At the very least, researchers may wish to familiarize themselves with publicly available data so that the projects they undertake with new data collection truly address gaps in the literature.

The biggest advantage of secondary analyses is that the data have already been collected and are ready to be analyzed (see Hofferth, 2005), thus conserving time and resources. Existing data sources are often much larger and of higher quality than could feasibly be collected by a single investigator. This advantage is especially pronounced when considering the investments of time and money necessary to collect longitudinal data. Some data sets were collected with scientific sampling plans (such as the GSOEP), which make it possible to generalize the findings to a specific population. Further, many publicly available data sets are quite large, and therefore provide adequate statistical power for conducting many analyses, including tests of hypotheses about statistical interactions. Investigations of interactions often require a surprisingly high number of participants to achieve respectable levels of statistical power in the face of measurement error (see Aiken & West, 1991).¹ Large-scale data sets are also well suited for subgroup analyses of populations that are often under-represented in smaller research studies.
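The sample-size point about interactions can be illustrated with a rough calculation. The following is a minimal sketch, not a procedure taken from this chapter: the function names, the true effect of r = .10, and the reliabilities of .70 are all hypothetical values chosen for illustration. It treats the interaction as a correlation-sized effect, uses the Fisher z approximation for the required sample size, and assumes the two predictors are centered and uncorrelated, in which case the reliability of their product is the product of their reliabilities.

```python
from math import atanh, ceil, sqrt
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate N needed to detect a correlation r with a two-sided test,
    based on the Fisher z approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


def attenuated_r(true_r, rel_predictor, rel_outcome=1.0):
    """Classical attenuation: observed r shrinks by sqrt of the reliabilities."""
    return true_r * sqrt(rel_predictor * rel_outcome)


# Hypothetical values, for illustration only.
true_r = 0.10            # interaction effect with perfect measurement
rel_x, rel_z = 0.70, 0.70
# Assumption: centered, uncorrelated predictors, so the product term's
# reliability is the product of the component reliabilities.
rel_product = rel_x * rel_z

observed_r = attenuated_r(true_r, rel_product)
n_perfect = n_for_correlation(true_r)
n_attenuated = n_for_correlation(observed_r)

print(f"N with perfect measurement:   {n_perfect}")
print(f"N with unreliable predictors: {n_attenuated}")
```

Under these assumptions the required sample roughly doubles once measurement error in the predictors is taken into account, which is one reason the large samples in national data sets are attractive for testing interactions.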

Another advantage of secondary data analysis is that it forces researchers to adopt an open and transparent approach to their craft. Because data are publicly available, other investigators may attempt to replicate findings and specify alternative models for a given research question. This reality encourages transparency and detailed record keeping on the part of the researcher, including careful reporting of analysis and a reasoned justification for all analytic decisions. Freese (2007) has provided a useful discussion about policies for archiving material necessary for replicating results, and his treatment of the issues provides guidance to researchers interested in maintaining good records.

Despite the many advantages of secondary data analysis, it is not without its disadvantages. The most significant challenge is simply the flipside of the primary advantage: the data have already been collected by somebody else! Analysts must take advantage of what has been collected without input into design and measurement issues. In some cases, an existing data set may not be available to address the particular research questions of a given investigator without some limitations in terms of sampling, measurement, or other design features. For example, data sets commonly used for secondary analysis often have a great deal of breadth in terms of the range of constructs assessed (e.g., finances, attitudes, personality, life satisfaction, physical health), but these constructs are often measured with a limited number of survey items. Issues of measurement reliability and validity are usually a major concern. Therefore, a strong grounding in basic and advanced psychometrics is extremely helpful for responding to criticisms and concerns about measurement issues that arise during the peer-review process.
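One way to see why short measures raise reliability concerns is the Spearman-Brown prophecy formula, which predicts how reliability changes when a scale is lengthened or shortened. The sketch below is illustrative only: the function name and the example scale (a 10-item measure with reliability .80 cut down to 2 items) are hypothetical, and the formula assumes the items are parallel, which real survey items rarely are exactly.

```python
def spearman_brown(reliability, length_ratio):
    """Predicted reliability after changing test length by `length_ratio`
    (new number of items / original number of items).
    Assumes parallel items."""
    return (length_ratio * reliability) / (1 + (length_ratio - 1) * reliability)


# Hypothetical example: a 10-item scale with reliability .80
# is represented in a survey by only 2 of its items.
full_scale_reliability = 0.80
short_form = spearman_brown(full_scale_reliability, length_ratio=2 / 10)

print(f"Predicted reliability of the 2-item version: {short_form:.2f}")
```

A predicted value well below the conventional benchmarks for the full scale is the kind of result a secondary analyst should be prepared to discuss with reviewers.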

A second consequence of the fact that the data have been collected by somebody else is that analysts may not have access to all of the information about data collection procedures and issues. The analyst simply receives a cleaned data set to use for subsequent analyses. Perhaps not obvious to the user is the amount of actual cleaning that occurred behind the scenes. Similarly, the complicated sampling procedures used in a given study may not be readily apparent to users, and this issue can prevent the appropriate use of survey weights ( Shrout & Napier, 2011 ).

Another significant disadvantage of secondary data analysis is the large amount of time and energy initially required to review data documentation. It can take hours or even weeks to become familiar with the codebooks and to discover which research questions have already been addressed by investigators using the existing data sets. It is very easy to underestimate how long it will take to move from an initial research idea to a competent final analysis. There is a risk that, unbeknownst to one another, researchers in different locations will pursue answers to the same research questions. On the other hand, once a researcher has become familiar with a data set and developed skills to work with the resource, they are able to pursue additional research questions resulting in multiple publications from the same data set. It is our experience that the process of learning about a data set can help generate new research ideas as it becomes clearer how the resource can be used to contribute to psychological science. Thus, the initial time and energy expended to learn about a resource can be viewed as an initial investment that holds the potential to pay larger dividends over time.

Finally, a possible disadvantage concerns how secondary data analyses are viewed within particular subdisciplines of psychology and by referees during the peer-review process. Some journals and some academic departments may not value secondary data analyses as highly as primary research. Such preferences might break along Cronbach’s two disciplines or two streams of psychology—correlational versus experimental ( Cronbach, 1957 ; Tracy, Robins, & Sherman, 2009 ). The reality is that if original data collection is more highly valued in a given setting, then new investigators looking to build a strong case for getting hired or getting promoted might face obstacles if they base a career exclusively on secondary data analysis. Similarly, if experimental methods are highly valued and correlational methods are denigrated in a particular subfield, then results of secondary data analyses will face difficulties getting attention (and even getting published). The best advice is to be aware of local norms and to act accordingly.

Steps for Beginning a Secondary Data Analysis

Step 1: Find Existing Data Sets . After generating a substantive question, the first task is to find relevant data sets ( see   Pienta, O’Rouke, & Franks, 2011 ). In some cases researchers will be aware of existing data sets through familiarity with the literature given that many well-cited papers have used such resources. For example, the GSOEP has now been widely used to address questions about correlates and developmental course of subjective well-being (e.g., Baird, Lucas, & Donnellan, 2010 ; Gerstorf, Ram, Estabrook, Schupp, Wagner, & Lindenberger, 2008 ; Gerstorf, Ram, Goebel, Schupp, Lindenberger, & Wagner, 2010 ; Lucas, 2005 ; 2007 ), and thus, researchers in this area know to turn to this resource if a new question arises. In other cases, however, researchers will attempt to find data sets using established archives such as the University of Michigan’s Interuniversity Consortium for Political and Social Research (ICPSR; http://www.icpsr.umich.edu/icpsrweb/ICPSR/ ). In addition to ICPSR, there are a number of other major archives ( see   Pienta et al., 2011 ) that house potentially relevant data sets. Here are just a few starting points:

The Henry A. Murray Research Archive ( http://www.murray.harvard.edu/ )

The Howard W Odum Institute for Research in Social Science ( http://www.irss.unc.edu/odum/jsp/home2.jsp )

The National Opinion Research Center ( http://norc.org/homepage.htm )

The Roper Center of Public Opinion Research ( http://ropercenter.uconn.edu/ )

The United Kingdom Data Archive ( http://www.data-archive.ac.uk/ )

Individuals in charge of these archives and data depositories often catalog metadata, which is the technical term for information about the constituent data sets. Typical kinds of metadata include information about the original investigators, a description of the design and process of data collection, a list of the variables assessed, and notes about sampling weights and missing data. Searching through this information is an efficient way of gaining familiarity with data sets. In particular, the ICPSR has an impressive infrastructure for allowing researchers to search for data sets through a cataloguing of study metadata. The ICPSR is thus a useful starting point for finding the raw material for a secondary data analysis. The ICPSR also provides a new user tutorial for searching their holdings ( http://www.icpsr.umich.edu/icpsrweb/ICPSR/help/newuser.jsp ). We recommend that researchers search through their holdings to make a list of potential data sets. At that point, the next task is to obtain relevant codebooks to learn more about each resource.

Step 2: Read Codebooks. Researchers interested in using an existing data set are strongly advised to thoroughly read the accompanying codebook (Pienta et al., 2011). There are several reasons why a comprehensive understanding of the codebook is a critical first step when conducting a secondary data analysis. First, the codebook will detail the procedures and methods used to acquire the data and provide a list of all of the questions and assessments collected. A thorough reading of the codebook can provide insights into important covariates that can be included in subsequent models, and a careful reading will draw the analyst’s attention to key variables that will be missing because no such information was collected. Reading through a codebook can also help to generate new research questions.

Second, high-quality codebooks often report basic descriptive information for each variable such as raw frequency distributions and information about the extent of missing values. The descriptive information in the codebook can give investigators a baseline expectation for variables under consideration, including the expected distributions of the variables and the frequencies of under-represented groups (such as ethnic minority participants). Because it is important to verify that the descriptive statistics in the published codebook match those in the file analyzed by the secondary analyst, a familiarity with the codebook is essential. In addition to codebooks, many existing resources provide copies of the actual surveys completed by participants ( Pienta et al., 2011 ). However, the use of actual pencil-and-paper surveys is becoming less common with the advent of computer assisted interview techniques and Internet surveys. It is often the case that survey methods involve skip patterns (e.g., a participant is not asked about the consequences of her drinking if she responds that she doesn’t drink alcohol) that make it more difficult to assume the perspective of the “typical” respondent in a given study ( Pienta et al., 2011 ). Nonetheless, we recommend that analysts try to develop an understanding for the experiences of the participant in a given study. This perspective can help secondary analysts develop an intuitive understanding of certain patterns of missing data and anticipate concerns about question ordering effects ( see , e.g., Schwarz, 1999 ).

Step 3: Acquire Datasets and Construct a Working Datafile . Although there is a growing availability of Web-based resources for conducting basic analyses using selected data sets (e.g., the Survey Documentation Analysis software used by ICPSR), we are convinced that there is no substitute for the analysis of the raw data using the software packages of preference for a given investigator. This means that the analysts will need to acquire the data sets that they consider most relevant. This is typically a very straightforward process that involves acknowledging researcher responsibilities before downloading the entire data set from a website. In some cases, data are classified as restricted-use, and there are more extensive procedures for obtaining access that may involve submitting a detailed security plan and accompanying legal paperwork before becoming an authorized data user. When data involve children and other sensitive groups, Institutional Review Board approval is often required.

Each data set has different usage requirements, so it is difficult to provide blanket guidance. Researchers should be aware of the policies for using each data set and recognize their ethical responsibility for adhering to those regulations. A central issue is that the researcher must avoid deductive disclosure whereby otherwise anonymous participants are identified because of prior knowledge in conjunction with the personal characteristics coded in the dataset (e.g., gender, racial/ethnic group, geographic location, birth date). Such a practice violates the major ethical principles followed by responsible social scientists and has the potential to harm research participants.

Once the entire set of raw data is acquired, it is usually straightforward to import the files into the kinds of statistical packages used by researchers (e.g., R, SAS, SPSS, and STATA). At this point, it is likely that researchers will want to create a smaller “working” file by pulling only relevant variables from the larger master files. It is often too cumbersome to work with a computer file that may have more than a thousand columns of information. The solution is to construct a working data file that has all of the needed variables tied to a particular research project. Researchers may also need to link multiple files by matching longitudinal data sets and linking to contextual variables such as information about schools or neighborhoods for data sets with a multilevel structure (e.g., individuals nested in schools or neighborhoods).

Explicit guidance about managing a working data file can be found in Willms (2011 ). Here, we simply highlight some particularly useful advice: (1) keep exquisite notes about what variables were selected and why; (2) keep detailed notes regarding changes to each variable and reasons why; and (3) keep track of sample sizes throughout this entire process. The guiding philosophy is to create documentation that is clear enough for an outside user to follow the logic and procedures used by the researcher. It is far too easy to overestimate the power of memory only to be disappointed when it comes time to revisit a particular analysis. Careful documentation can save time and prevent frustration. Willms (2011 ) noted that “keeping good notes is the sine qua non of the trade” (p. 33).
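The working-file and note-keeping advice above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed workflow: the column names, the `schools` lookup, and the helper `build_working_file` are all hypothetical, and a real project would read the master file from disk with the analyst's preferred package.

```python
# Hypothetical master file: one dict per respondent. All column names are invented.
master = [
    {"id": 1, "school_id": "A", "se1": 4, "se2": 5, "height_cm": 170},
    {"id": 2, "school_id": "B", "se1": 2, "se2": 3, "height_cm": 165},
    {"id": 3, "school_id": "A", "se1": 5, "se2": None, "height_cm": 180},
]
# Hypothetical contextual (school-level) file keyed by school_id.
schools = {"A": {"school_size": 1200}, "B": {"school_size": 450}}

KEEP = ["id", "school_id", "se1", "se2"]  # variables needed for this project
log = []  # running notes: what was done and the resulting sample size


def build_working_file(rows, keep, context, key):
    """Keep only the selected columns and attach contextual variables."""
    working = []
    for row in rows:
        slim = {name: row[name] for name in keep}
        slim.update(context.get(row[key], {}))
        working.append(slim)
    return working


working = build_working_file(master, KEEP, schools, "school_id")
log.append(("selected self-esteem items and linked school size", len(working)))

# A documented exclusion, shown only to illustrate how sample sizes are tracked.
complete = [r for r in working if r["se1"] is not None and r["se2"] is not None]
log.append(("kept cases with both self-esteem items observed", len(complete)))

for note, n in log:
    print(f"{note}: n = {n}")
```

Writing each decision and the resulting sample size to a log at the moment it is made is one simple way to produce documentation that an outside user could follow.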

Step 4: Conduct Analyses. After assembling the working data file, the researcher will likely construct major study variables by creating scale composites (e.g., the mean of the responses to the items assessing the same construct) and conduct initial analyses. As previously noted, a comparison of the distributions and sample sizes with those in the study codebook is essential at this stage. Any discrepancies between the variables in the working data file and the codebook should be understood and documented. It is particularly useful to keep track of missing values to make sure that they have been properly coded. It should go without saying that an observed value of -9999 will typically require recoding to a missing value in the working file. Similarly, errors in reverse scoring items can be particularly common (and troubling), so researchers are well advised to conduct thorough item-level and scale analyses and check to make sure that reverse scoring was done correctly (e.g., examine the inter-item correlation matrix when calculating internal consistency estimates to screen for negative correlations). Willms (2011) provides some very savvy advice for the initial stages of actual data analysis: “Be wary of surprise findings” (p. 35). He noted that “too many times I have been excited by results only to find that I have made some mistake” (p. 35). Caution, skepticism, and a good sense of the underlying data set are essential for detecting mistakes.
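The checks described in this step (recoding a missing-value code, reverse scoring, and screening for negative inter-item correlations) can be illustrated with a short Python sketch. The responses, the 1-to-5 response scale, and the helper names are hypothetical; real codebooks define their own missing-value codes.

```python
MISSING_CODE = -9999  # code mentioned in the text; check the codebook for the real one


def recode_missing(values, code=MISSING_CODE):
    """Replace the numeric missing-value code with None."""
    return [None if v == code else v for v in values]


def reverse_score(values, low=1, high=5):
    """Reverse-score an item on an assumed low-to-high response scale."""
    return [None if v is None else (low + high) - v for v in values]


def pearson(x, y):
    """Pearson correlation using only cases observed on both variables."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical responses to a positively and a negatively worded item.
item_pos = recode_missing([5, 4, -9999, 2, 1, 4])
item_neg = recode_missing([1, 2, 3, 5, 4, -9999])

r_before = pearson(item_pos, item_neg)                 # negative: a warning sign
r_after = pearson(item_pos, reverse_score(item_neg))   # positive after reversal
print(round(r_before, 2), round(r_after, 2))
```

A negative correlation between two items meant to assess the same construct is the kind of pattern that signals a reverse-scoring problem before a composite is formed.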

An important comment about the nature of secondary data analysis is again worth emphasizing: These data sets are available to others in the scholarly community. This means that others should be able to replicate your results! It is also very useful to adopt a self-critical perspective because others will be able to subject findings to their own empirical scrutiny. Contemplate alternative explanations and attempt to conduct analyses to evaluate the plausibility of these explanations. Accordingly, we recommend that researchers strive to think of theoretically relevant control variables and include them in the analytic models when appropriate. Such an approach is useful both from the perspective of scientific progress (i.e., attempting to curb confirmation biases) and in terms of surviving the peer-review process.

Special Issue: Measurement Concerns in Existing Datasets

One issue with secondary data analyses that is likely to perplex psychologists is the measurement of core constructs. The reality is that many of the measures available in large-scale data sets consist of a subset of items derived from instruments commonly used by psychologists (see Russell & Matthews, 2011). For example, the 10-item Rosenberg Self-Esteem scale (Rosenberg, 1965) is the most commonly used measure of global self-esteem in the literature (Donnellan, Trzesniewski, & Robins, 2011). Measures of self-esteem are available in many data sets like Monitoring the Future (see Trzesniewski & Donnellan, 2010), but these measures are typically shorter than the original Rosenberg scale. Similarly, the GSOEP has a single-item rating of subjective well-being in the form of happiness, whereas psychologists might be more accustomed to measuring this construct with at least five items (e.g., Diener, Emmons, Larsen, & Griffin, 1985). Researchers using existing data sets will have to grapple with the consequences of having relatively short assessments in terms of the impact on reliability and validity.

For purposes of this chapter, we will make use of a conventional distinction between reliability and validity. Reliability will refer to the degree of measurement error present in a given set of scores (or alternatively the degree of consistency or precision in scores), whereas validity will refer to the degree to which measures capture the construct of interest and predict other variables in ways that are consistent with theory. More detailed but accessible discussions of reliability and validity can be found in Briggs and Cheek (1986 ), Clark and Watson (1995 ), John and Soto (2007 ), Messick (1995 ), Simms (2008 ), and Simms and Watson (2007 ). Widaman, Little, Preacher, and Sawalani (2011 ) have provided a discussion of these issues in the context of the shortened assessments available in existing data sets.

Short Measures and Reliability. Classical Test Theory (e.g., Lord & Novick, 1968) is the measurement perspective most commonly used among psychologists. According to this measurement philosophy, any observed score is a function of the underlying attribute (the so-called “true score”) and measurement error. Measurement error is conceptualized as any deviation or inconsistency in observed scores for the same attribute across multiple assessments of that attribute, and reliability is the relative absence of such error. A thought experiment may help crystallize insights about reliability (e.g., Lord & Novick, 1968): Imagine a thousand identical clones each completing the same self-esteem instrument simultaneously. The underlying self-esteem attribute (i.e., the true scores) should be the same for each clone (by definition), whereas the observed scores may fluctuate across clones because of random measurement errors (e.g., a single clone misreading an item vs. another clone being frustrated by an extremely hot testing room). The extent of the observed fluctuations in reported scores across clones offers insight into how much measurement error is present in this instrument. If scores are tightly clustered around a single value, then measurement error is minimal; however, if scores are dramatically different across clones, then there is a clear indication of problems with reliability. The measure is imprecise because it yields inconsistent values across the same true scores.

These ideas about reliability can be applied to observed samples of scores such that the total observed variance is attributable to true score variance (i.e., true individual differences in underlying attributes) and variance stemming from random measurement errors. The assumption that measurement error is random means that it has an expected value of zero across observations. Using this framework, reliability can then be defined as the ratio of true score variance to the total observed variance. An assessment that is perfectly reliable (i.e., has no measurement error) will have a ratio of 1.0, whereas an assessment that is completely unreliable will yield a ratio of 0.0 ( see   John & Soto, 2007 , for an expanded discussion). This perspective provides a formal definition of a reliability coefficient.
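The definition of reliability as the ratio of true score variance to total observed variance can be made concrete with a small simulation. The sketch below is illustrative only: the sample size, the standard deviations, and the assumption of normally distributed true scores and errors are choices made for the example, not values from any real data set.

```python
import random
from statistics import pvariance

random.seed(1)  # fixed seed so the sketch is reproducible

N = 20000
TRUE_SD = 1.0   # assumed spread of true scores
ERROR_SD = 1.0  # assumed spread of random measurement error (mean zero)

true_scores = [random.gauss(0.0, TRUE_SD) for _ in range(N)]
# Observed score = true score + random error, as in Classical Test Theory.
observed = [t + random.gauss(0.0, ERROR_SD) for t in true_scores]

# Reliability: share of observed variance that is true score variance.
reliability = pvariance(true_scores) / pvariance(observed)
expected = TRUE_SD**2 / (TRUE_SD**2 + ERROR_SD**2)
print(round(reliability, 2), expected)
```

With equal true score and error variances the ratio should land near 0.50; shrinking `ERROR_SD` toward zero pushes it toward 1.0, and inflating it pushes the ratio toward 0.0.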

Psychologists have developed several tools to estimate the reliability of their measures, but the approach that is most commonly used is coefficient α (Cronbach, 1951; see Schmitt, 1996, for an accessible review). This approach considers reliability from the perspective of internal consistency. The basic idea is that fluctuations across items assessing the same construct reflect the presence of measurement error. The formula for the standardized α is a fairly simple function of the average inter-item correlation (a measure of inter-item homogeneity) and the total number of items in a scale. The α coefficient is typically judged acceptable if it is above 0.70, but the justification for this particular cutoff is somewhat arbitrary (see Lance, Butts, & Michels, 2006). Researchers are therefore advised to take a more critical perspective on this statistic. A relevant concern is that, for a given average inter-item correlation, α is lower when the measure is short.

Given concerns with scale length and α, many methodologically oriented researchers recommend evaluating and reporting the average inter-item correlation because it can be interpreted independently of length and thus represents a “more straightforward indicator of internal consistency” (Clark & Watson, 1995, p. 316). Consider that it is common to observe an average inter-item correlation for the 10-item Rosenberg Self-Esteem (Rosenberg, 1965) scale around 0.40 (this is based on typically reported α coefficients; see Donnellan et al., 2011). This same level of internal homogeneity (i.e., an inter-item correlation of 0.40) yields an α of around 0.67 with a 3-item scale but an α of around 0.87 with 10 items. A measure of a broader construct like Extraversion may generate an average inter-item correlation of 0.20 (Clark & Watson, 1995, p. 316), which would translate to an α of 0.43 for a 3-item scale and 0.71 for a 10-item scale. The point is that α coefficients will fluctuate with scale length and the breadth of the construct. Because most scales in existing resources are short, the α coefficients might fall below the 0.70 convention despite having a respectable level of inter-item correlation.
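The figures in the preceding paragraph follow from the standard formula for standardized α, which for k items with average inter-item correlation r is kr / (1 + (k − 1)r). A minimal Python sketch (the function name is ours) reproduces them:

```python
def standardized_alpha(k, mean_r):
    """Standardized coefficient alpha for k items with average inter-item r."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)


# Scale lengths and average inter-item correlations discussed in the text.
for mean_r in (0.40, 0.20):
    for k in (3, 10):
        print(f"r = {mean_r:.2f}, k = {k:2d}: alpha = {standardized_alpha(k, mean_r):.2f}")
```

Holding the average inter-item correlation fixed and varying only k makes the dependence of α on scale length explicit.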

Given these considerations, we recommend that researchers consider the average inter-item correlation more explicitly when working with secondary data sets. It is also important to consider the breadth of the underlying construct to generate expectations for reasonable levels of item homogeneity as indexed by the average inter-item correlation. Clark and Watson (1995; see also Briggs & Cheek, 1986) recommend values of around 0.40 to 0.50 for measures of fairly narrow constructs (e.g., self-esteem) and values of around 0.15 to 0.20 for measures of broader constructs (e.g., neuroticism). It is our experience that considerations about internal consistency often need to be made explicit in manuscripts so that reviewers will not take an unnecessarily harsh perspective on α’s that fall below their expectations. Finally, we want to emphasize that internal consistency is but one kind of reliability. In some cases, it might be that test-retest reliability is more informative and diagnostic of the quality of a measure (McCrae, Kurtz, Yamagata, & Terracciano, 2011). Fortunately, many secondary data sets are longitudinal, so it is possible to get an estimate of longer term test-retest reliability from the existing data.
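For analysts who want to report the average inter-item correlation directly, the computation is simply the mean of the correlations among all pairs of items. The sketch below assumes complete data on three hypothetical items answered by five respondents; the values are invented for illustration.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation for two equal-length lists of complete data."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_inter_item_r(items):
    """Average of the correlations among all distinct pairs of items."""
    rs = [pearson(a, b) for a, b in combinations(items, 2)]
    return sum(rs) / len(rs)


# Hypothetical item responses (one list per item, one entry per respondent).
items = [
    [1, 2, 3, 4, 5],
    [2, 1, 4, 3, 5],
    [1, 3, 2, 5, 4],
]
print(round(mean_inter_item_r(items), 2))
```

The same `pearson` helper applied to scale scores from two waves of a longitudinal data set would give a simple test-retest estimate.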

Beyond simply reporting estimates of reliability, it is worth considering why measurement reliability is such an important issue in the first place. One consequence of reliability for substantive research is that measurement imprecision tends to depress observed correlations with other variables. This notion of attenuation resulting from measurement error and a solution were discussed by Spearman as far back as 1904 ( see , e.g., pp. 88–94). Unreliable measures can affect the conclusions drawn from substantive research by imposing a downward bias on effect size estimation. This is perhaps why Widaman et al. (2011 ) advocate using latent variable structural modeling methods to combat this important consequence of measurement error. Their recommendation is well worth considering for those with experience with this technique ( see   Kline, 2011 , for an introduction). Regardless of whether researchers use observed variables or latent variables for their analyses, it is important to recognize and appreciate the consequences of reliability.
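The attenuating effect of measurement error can be illustrated with the classical correction for attenuation associated with Spearman: under Classical Test Theory assumptions, the observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. The reliabilities and the true correlation below are hypothetical values chosen for the example.

```python
from math import sqrt


def attenuated_r(true_r, rel_x, rel_y):
    """Expected observed correlation given the reliabilities of x and y."""
    return true_r * sqrt(rel_x * rel_y)


def disattenuated_r(observed_r, rel_x, rel_y):
    """Classical correction for attenuation (inverse of the function above)."""
    return observed_r / sqrt(rel_x * rel_y)


# Hypothetical case: true correlation of .50, a short scale (reliability .67)
# correlated with a longer criterion measure (reliability .80).
observed = attenuated_r(0.50, 0.67, 0.80)
recovered = disattenuated_r(observed, 0.67, 0.80)
print(round(observed, 2), round(recovered, 2))
```

In this example a true association of .50 would be expected to appear as roughly .37 in the observed data, which shows how short measures can lead to understated effect sizes.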

Short Measures and Validity . Validity, for our purposes, reflects how well a measure captures the underlying conceptual attribute of interest. All discussions of validity are based, in part, on agreement in a field as to how to understand the construct in question. Validity, like reliability, is assessed as a matter of degree rather than a categorical distinction between valid or invalid measures. Cronbach and Meehl (1955 ) have provided a classic discussion of construct validity, perhaps the most overarching and fundamental form of validity considered in psychological research ( see also   Smith, 2005 ). However, we restrict our discussion to content validity and criterion-related validity because these two types of validity are particularly relevant for secondary data analysis and they are more immediately addressable.

Content validity describes how well a measure captures the entire domain of the construct in question. Judgments regarding content validity are ideally made by panels of experts familiar with the focal construct. A measure is considered construct deficient if it fails to assess important elements of the construct. For example, if thoughts of suicide are an integral aspect of the concept depression and a given self-report measure is missing items that tap this content, then the measure would be deemed construct-deficient. A measure can also suffer from construct contamination if it includes extraneous items that are irrelevant to the focal construct. For example, if somatic symptoms like a rapid heartbeat are considered to reflect the construct of anxiety and not part of depression, then a depression inventory that has such an item would suffer from construct contamination. Given the reduced length of many assessments, concerns over construct deficiency are likely to be especially pressing. A short assessment may not include enough items to capture the full breadth of a broad construct. This limitation is not readily addressed and should be acknowledged ( see   Widaman et al., 2011 ). In particular, researchers may need to clearly specify that their findings are based on a narrower content domain than is normally associated with the focal construct of interest.

A subtle but important point can arise when considering the content of measures with particularly narrow content. Internal consistency will increase when there is redundancy among items in the scale; however, the presence of similar items may decrease predictive power. This is known as the attenuation paradox in psychometrics (see Clark & Watson, 1995). When items are nearly identical, they contribute redundant information about a very specific aspect of the construct. However, the very specific attribute may not have predictive power. In essence, reliability can be maximized at the expense of creating a measure that is not very useful from the point of view of prediction (and likely explanation). Indeed, Clark and Watson (1995) have argued that the “goal of scale construction is to maximize validity rather than reliability” (p. 316). In short, an evaluation of content validity is also important when considering the predictive power of a given measure.

Whereas content validity is focused on the internal attributes of a measure, criterion-related validity is based on the empirical relations between measures and other variables. Using previous research and theory surrounding the focal construct, the researcher should develop an expectation regarding the magnitude and direction of observed associations (i.e., correlations) with other variables. A good supporting theory of a construct should stipulate a pattern of association, or nomological network, concerning those other variables that should be related and unrelated to the focal construct. This latter requirement is often more difficult to specify from existing theories, which tend to provide a more elaborate discussion of convergent associations rather than discriminant validity ( Widaman et al., 2011 ). For example, consider a very truncated nomological network for Agreeableness (dispositional kindness and empathy). Measures of this construct should be positively associated with romantic relationship quality, negatively related to crime (especially violent crime), and distinct from measures of cognitive ability such as tests of general intelligence.

Evaluations of criterion-related validity can be conducted within a data set as researchers document that a measure has an expected pattern of associations with existing criterion-related variables. Investigators using secondary data sets may want to conduct additional research to document the criterion-related validity of short measures with additional convenience samples (e.g., the ubiquitous college student samples used by many psychologists; Sears, 1986). For example, there are six items in the Add Health data set that appear to measure self-esteem (e.g., “I have a lot of good qualities” and “I like myself just the way I am”) (see Russell, Crockett, Shen, & Lee, 2008). Although many of the items bear a strong resemblance to the items on the Rosenberg Self-Esteem scale (Rosenberg, 1965), they are not exactly the same items. To obtain some additional data on the usefulness of this measure, we administered the Add Health items to a sample of 387 college students at our university along with the Rosenberg Self-Esteem scale and an omnibus measure of personality based on the Five-Factor model (Goldberg, 1999). The six Add Health items were strongly correlated with the Rosenberg (r = 0.79), and both self-esteem measures had a similar pattern of convergent and divergent associations with the facets of the Five-Factor model (the two profiles were very strongly associated: r > 0.95). This additional information can help bolster the case for the validity of the short Add Health self-esteem measure.

Special Issue: Missing Data in Existing Data Sets

Missing data are a fact of life in research: individuals may drop out of longitudinal studies or refuse to answer particular questions. These behaviors can affect the generalizability of findings because results may only apply to those individuals who choose to complete a study or a measure. Missing data can also diminish statistical power when common techniques like listwise deletion are used (e.g., only using cases with complete information, thereby reducing the sample size) and even lead to biased effect size estimates (e.g., McKnight & McKnight, 2011; McKnight, McKnight, Sidani, & Figueredo, 2007; Widaman, 2006). Thus, concerns about missing data are important for all aspects of research, including secondary data analysis. The development of specific techniques for appropriately handling missing data is an active area of research in quantitative methods (Schafer & Graham, 2002).

Unfortunately, the literature surrounding missing data techniques is often technical and steeped in jargon, as noted by McKnight et al. (2007). The reality is that researchers attempting to understand issues of missing data need to pay careful attention to terminology. For example, a novice researcher may not immediately grasp the classification of missing data used in the literature (see Schafer & Graham, 2002, for a clear description). Consider the confusion that may stem from learning that data are missing at random (MAR) versus missing completely at random (MCAR). The term MAR does not mean that missing values occurred only because of chance factors; that situation is the one labeled MCAR. Data that are MCAR are absent because of truly random factors. MAR instead refers to the situation in which the probability that observations are missing depends only on other available information in the data set. Data that are MAR can be essentially “ignored” when the other factors are included in a statistical model. The last type of missing data, data missing not at random (MNAR), in which missingness depends on the unobserved values themselves, is likely to characterize the variables in many real-life data sets. As it stands, methods for handling data that are MAR and MCAR are better developed and more easily implemented than methods for handling data that are MNAR. Thus, many applied researchers will assume data are MAR for purposes of statistical modeling (and the ability to sleep comfortably at night). Fortunately, such an assumption might not create major problems for many analyses and may in fact represent the “practical state of the art” (Schafer & Graham, 2002, p. 173).
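The distinction between MCAR and MAR can be illustrated with a small simulation. Everything in the sketch is an assumption made for the example: the data-generating model, the missingness probabilities, and the simple regression adjustment, which stands in for (and is much cruder than) the model-based methods discussed in the missing data literature.

```python
import random
from statistics import mean

random.seed(2)  # fixed seed so the sketch is reproducible
N = 20000
x = [random.gauss(0, 1) for _ in range(N)]          # always observed
y = [0.6 * xi + random.gauss(0, 0.8) for xi in x]   # population mean of y is 0

# MCAR: every y value has the same 50% chance of being observed.
y_mcar = [yi if random.random() < 0.5 else None for yi in y]
# MAR: the chance that y is observed depends only on the observed x.
y_mar = [yi if random.random() < (0.9 if xi <= 0 else 0.1) else None
         for xi, yi in zip(x, y)]


def observed_mean(values):
    """Mean of the non-missing values (what complete-case analysis uses)."""
    return mean(v for v in values if v is not None)


def adjusted_mean(x_all, y_partial):
    """Regress y on x among observed cases, then average predictions for everyone."""
    pairs = [(a, b) for a, b in zip(x_all, y_partial) if b is not None]
    mx = mean(a for a, _ in pairs)
    my = mean(b for _, b in pairs)
    slope = (sum((a - mx) * (b - my) for a, b in pairs)
             / sum((a - mx) ** 2 for a, _ in pairs))
    intercept = my - slope * mx
    return intercept + slope * mean(x_all)


print("MCAR, observed cases:", round(observed_mean(y_mcar), 2))
print("MAR, observed cases: ", round(observed_mean(y_mar), 2))
print("MAR, adjusted for x: ", round(adjusted_mean(x, y_mar), 2))
```

Under MCAR the observed-case mean stays close to the population value, whereas under MAR it is clearly biased until the variable that drives missingness is brought into the model.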

The literature on missing data techniques is growing, so we simply recommend that researchers keep current on developments in this area. McKnight et al. (2007 ) and Widaman (2006 ) both provide an accessible primer on missing data techniques. In keeping with the largely practical bent to the chapter, we suggest that researchers keep careful track of the amount of missing data present in their analyses and report such information clearly in research papers ( see   McKnight & McKnight, 2011 ). Similarly, we recommend that researchers thoroughly screen their data sets for evidence that missing values depend on other measured variables (e.g., scores at Time 1 might be associated with Time 2 dropout). In general, we suggest that researchers avoid listwise and pairwise deletion methods because there is very little evidence that these are good practices ( see   Jeličić, Phelps, & Lerner, 2009 ; Widaman, 2006 ). Rather, it might be easiest to use direct fitting methods such as the estimation procedures used in conventional structural equation modeling packages (e.g., Full Information Maximum Likelihood; see   Allison, 2003 ). At the very least, it is usually instructive to compare results using listwise deletion with results obtained with direct model fitting in terms of the effect size estimates and basic conclusions regarding the statistical significance of focal coefficients.
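The recommendations to track the amount of missing data and to screen for dependence on measured variables can be sketched as a small report. The two-wave scores below are hypothetical, and the standardized difference is only one of several reasonable screening statistics.

```python
from statistics import mean, stdev


def missingness_report(time1, time2):
    """Summarize Time 2 missingness and compare Time 1 scores by dropout status."""
    stayed = [t1 for t1, t2 in zip(time1, time2) if t2 is not None]
    dropped = [t1 for t1, t2 in zip(time1, time2) if t2 is None]
    sd_t1 = stdev(time1)  # overall Time 1 standard deviation, used for scaling
    return {
        "n_total": len(time1),
        "n_missing_t2": len(dropped),
        "prop_missing_t2": len(dropped) / len(time1),
        "t1_mean_stayed": mean(stayed),
        "t1_mean_dropped": mean(dropped),
        "std_diff": (mean(dropped) - mean(stayed)) / sd_t1,
    }


# Hypothetical two-wave data; None marks a participant lost at Time 2.
time1 = [3.0, 4.0, 2.0, 5.0, 1.5, 4.5, 2.5, 3.5]
time2 = [3.2, 4.1, None, 4.8, None, 4.4, None, 3.6]

report = missingness_report(time1, time2)
for name, value in report.items():
    print(name, value)
```

A sizable standardized difference, as in this invented example where low scorers at Time 1 are the ones who drop out, is evidence that missingness depends on measured variables and should be reported.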

Special Issue: Sample Weighting in Existing Data Sets

One of the advantages of many existing data sets is that they were collected using probabilistic sampling methods so that researchers can obtain unbiased population estimates. Such estimates, however, are only obtained when complex survey weights are formally incorporated into the statistical modeling procedures. Such weighting schemes can affect the correlations between variables, and therefore all users of secondary data sets should become familiar with sampling design when they begin working with a new data set. A considerable amount of time and effort is dedicated toward generating complex weighting schemes that account for the precise sampling strategies used in the given study, and users of secondary data sets should give careful consideration to using these weights appropriately.
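
A toy calculation shows why weights matter for population estimates. The sketch below is illustrative only: the two groups, their population shares, and the scores are invented, and the weights are assumed to be proportional to the inverse of each respondent's selection probability.

```python
def weighted_mean(values, weights):
    """Weighted mean; weights are assumed to be inverse selection probabilities."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical sample: 4 respondents from group A and 4 from group B.
# Group A is assumed to be 90% of the population and group B 10%, so each
# sampled A respondent stands in for nine times as many people as a B respondent.
scores = [10, 10, 10, 10, 20, 20, 20, 20]
weights = [9, 9, 9, 9, 1, 1, 1, 1]

unweighted = sum(scores) / len(scores)
weighted = weighted_mean(scores, weights)
print(f"unweighted mean: {unweighted}")
print(f"weighted mean:   {weighted}")
```

Because group B is oversampled, the unweighted mean overstates the population value, while the weighted mean recovers it (0.9 × 10 + 0.1 × 20). This sketch covers point estimates only; standard errors under complex designs also depend on clustering and stratification, which is why dedicated survey routines are typically needed.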

In some cases, the addition of sampling weights will have little substantive effect on findings, so extensive concern over weighting might be overstated. On the other hand, any potential difference is ultimately an empirical question, so researchers are well advised to consider the importance of sampling weights (Shrout & Napier, 2011). The problem is that many psychologists are not well versed in the use of sampling weights (Shrout & Napier, 2011). Thus, psychologists may not be in a strong position to evaluate whether sample weighting concerns are relevant. In addition, it is sometimes necessary to use specialized software packages or add-ons to adjust analytic models appropriately for sampling weights. Programs such as Stata and SAS have such capabilities in the base package, whereas packages like SPSS sometimes require a complex survey model add-on that integrates with their existing capabilities. Whereas the graduate training of the modal sociologist or demographer is likely to emphasize survey research and thus presumably cover sampling, this is not the case with the methodological training of many psychologists (Aiken, West, & Millsap, 2008). Psychologists who are unfamiliar with sample weighting procedures are well advised to seek the counsel of a survey methodologist before undertaking data analysis.

In terms of practical recommendations, it is important for the user of the secondary data set to develop a clear understanding of how the data were collected by reading documentation about the design and sampling procedure ( Shrout & Napier, 2011 ). This insight will provide a conceptual framework for understanding weighting schemes and for deciding how to appropriately weight the data. Once researchers have a clear idea of the sampling scheme and potential weights, actually incorporating available weights into analyses is not terribly difficult, provided researchers have the appropriate software ( Shrout & Napier, 2011 ). Weighting tutorials are often available for specific data sets. For example, the Add Health project has a document describing weighting ( http://www.cpc.unc.edu/projects/addhealth/faqs/aboutdata/weight1.pdf ) as does the Centers for Disease Control and Prevention for use with their Youth Risk Behavior Surveys ( http://www.cdc.gov/HealthyYouth/yrbs/pdf/YRBS_analysis_software.pdf ). These free documents may also provide useful and accessible background even for those who may not use the data from these projects.

Secondary data analysis refers to the analysis of existing data that may not have been explicitly collected to address a particular research question. Many of the quantitative techniques described in this volume can be applied using existing resources. To be sure, strong data analytic skills are important for fully realizing the potential benefits of secondary data sets, and such skills can help researchers recognize the limits of a data set for any given analysis.

In particular, measurement issues are likely to create the biggest hurdles for psychologists conducting secondary analyses, both in offering a reasonable interpretation of the results and in surviving the peer-review process. Accordingly, a familiarity with basic issues in psychometrics is very helpful. Beyond such skills, the effective use of these existing resources requires patience and strong attention to detail. Effective secondary data analysis also requires a fair bit of curiosity to seek out those resources that might be used to make important contributions to psychological science.

Ultimately, we hope that the field of psychology becomes more and more accepting of secondary data analysis. As psychologists use this approach with increasing frequency, it is likely that the organizers of major ongoing data collection efforts will be increasingly open to including measures of prime interest to psychologists. The individuals in charge of projects like the BHPS, the GSOEP, and the National Center for Education Statistics ( http://nces.ed.gov/ ) want their data to be used by the widest possible audiences and will respond to researcher demands. We believe that it is time that psychologists join their colleagues in economics, sociology, and political science in taking advantage of these existing resources. It is also time to move beyond divisive discussions surrounding the presumed superiority of primary data collection over secondary analysis. There is no reason to choose one over the other when the field of psychology can profit from both. We believe that the relevant topics of debate are not about the method of initial data collection but, rather, about the importance and intrinsic interest of the underlying research questions. If the question is important and the research design and measures are suitable, then there is little doubt in our minds that secondary data analysis can make a contribution to psychological science.

Author Note

M. Brent Donnellan, Department of Psychology, Michigan State University, East Lansing, MI 48824.

Richard E. Lucas, Department of Psychology, Michigan State University, East Lansing, MI 48824.

One consequence of large sample sizes, however, is that issues of effect size interpretation become paramount given that very small correlations or very small mean differences between groups are likely to be statistically significant using conventional null hypothesis significance tests (e.g., Trzesniewski & Donnellan, 2009 ). Researchers will therefore need to grapple with issues related to null hypothesis significance testing ( see   Kline, 2004 ).
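
The point about sample size and significance can be checked with a short calculation. The sketch below is an illustration under stated assumptions: the correlation of .03 and the two sample sizes are hypothetical, and the p-value uses a normal approximation to the t statistic rather than the exact t distribution.

```python
import math

def correlation_p_value(r, n):
    """Two-sided p-value for H0: rho = 0, using a normal approximation to the t statistic."""
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    return math.erfc(abs(t) / math.sqrt(2))

r = 0.03  # hypothetical correlation; it accounts for 0.09% of the variance
p_small = correlation_p_value(r, 100)
p_large = correlation_p_value(r, 10_000)
print(f"n = 100:    p = {p_small:.3f}")
print(f"n = 10,000: p = {p_large:.4f}")
```

The same trivial correlation is nowhere near significant in the small sample but clears conventional thresholds in the large one, which is why effect size interpretation becomes the central issue.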

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.

Aiken, L. S., West, S. G., & Millsap, R. E. (2008). Doctoral training in statistics, measurement, and methodology in psychology: Replication and extension of Aiken, West, Sechrest, and Reno’s (1990) survey of Ph.D. programs in North America. American Psychologist, 63, 32–50.

Akers, R. L., Massey, J., & Clarke, W. (1983). Are self-reports of adolescent deviance valid? Biochemical measures, randomized response, and the bogus pipeline in smoking behavior. Social Forces, 62, 234–251.

Allison, P. D. (2003). Missing data techniques for structural equation modeling. Journal of Abnormal Psychology, 112, 545–557.

Baird, B. M., Lucas, R. E., & Donnellan, M. B. (2010). Life satisfaction across the lifespan: Findings from two nationally representative panel studies. Social Indicators Research, 99, 183–203.

Briggs, S. R., & Cheek, J. M. (1986). The role of factor analysis in the development and evaluation of personality scales. Journal of Personality, 54, 106–148.

Brooks-Gunn, J., Berlin, L. J., Leventhal, T., & Fuligni, A. S. (2000). Depending on the kindness of strangers: Current national data initiatives and developmental research. Child Development, 71, 257–268.

Brooks-Gunn, J., & Chase-Lansdale, P. L. (Eds.). (1991). Secondary data analyses in developmental psychology [Special section]. Developmental Psychology, 27, 899–951.

Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309–319.

Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.

Cronbach, L. J. (1957). The two disciplines of scientific psychology. American Psychologist, 12, 671–684.

Cronbach, L. J., & Meehl, P. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.

Diener, E., Emmons, R. A., Larsen, R. J., & Griffin, S. (1985). The Satisfaction with Life Scale. Journal of Personality Assessment, 49, 71–75.

Donnellan, M. B., Trzesniewski, K. H., & Robins, R. W. (2011). Self-esteem: Enduring issues and controversies. In T. Chamorro-Premuzic, S. von Stumm, & A. Furnham (Eds.), The Wiley-Blackwell Handbook of Individual Differences (pp. 710–746). New York: Wiley-Blackwell.

Freese, J. (2007). Replication standards for quantitative social science: Why not sociology? Sociological Methods & Research, 36, 153–172.

Gerstorf, D., Ram, N., Estabrook, R., Schupp, J., Wagner, G. G., & Lindenberger, U. (2008). Life satisfaction shows terminal decline in old age: Longitudinal evidence from the German Socio-Economic Panel Study (SOEP). Developmental Psychology, 44, 1148–1159.

Gerstorf, D., Ram, N., Goebel, J., Schupp, J., Lindenberger, U., & Wagner, G. G. (2010). Where people live and die makes a difference: Individual and geographic disparities in well-being progression at the end of life. Psychology and Aging, 25, 661–676.

Goldberg, L. R. (1999). A broad-bandwidth, public domain, personality inventory measuring the lower-level facets of several five-factor models. In I. Mervielde, I. Deary, F. De Fruyt, & F. Ostendorf (Eds.), Personality psychology in Europe (Vol. 7, pp. 7–28). Tilburg, The Netherlands: Tilburg University Press.

Hofferth, S. L. (2005). Secondary data analysis in family research. Journal of Marriage and the Family, 67, 891–907.

Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Newbury Park, CA: Sage.

Jeličić, H., Phelps, E., & Lerner, R. M. (2009). Use of missing data methods in longitudinal studies: The persistence of bad practices in developmental psychology. Developmental Psychology, 45, 1195–1199.

John, O. P., & Soto, C. J. (2007). The importance of being valid. In R. W. Robins, R. C. Fraley, & R. F. Krueger (Eds.), Handbook of Research Methods in Personality Psychology (pp. 461–494). New York: Guilford Press.

Kiecolt, K. J., & Nathan, L. E. (1985). Secondary analysis of survey data (Sage University Paper series on Quantitative Applications in the Social Sciences, No. 53). Newbury Park, CA: Sage.

Kline, R. B. (2004). Beyond significance testing: Reforming data analysis methods in behavioral research. Washington, DC: American Psychological Association.

Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York: Guilford Press.

Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff criteria: What did they really say? Organizational Research Methods, 9, 202–220.

Lord, F., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

Lucas, R. E. (2005). Time does not heal all wounds. Psychological Science, 16, 945–950.

Lucas, R. E. (2007). Adaptation and the set-point model of subjective well-being: Does happiness change after major life events? Current Directions in Psychological Science, 16, 75–79.

McCall, R. B., & Appelbaum, M. I. (1991). Some issues of conducting secondary analyses. Developmental Psychology, 27, 911–917.

McCrae, R. R., Kurtz, J. E., Yamagata, S., & Terracciano, A. (2011). Internal consistency, retest reliability, and their implications for personality scale validity. Personality and Social Psychology Review, 15, 28–50.

Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741–749.

McKnight, P. E., & McKnight, K. M. (2011). Missing data in secondary data analysis. In K. H. Trzesniewski, M. B. Donnellan, & R. E. Lucas (Eds.), Secondary data analysis: An introduction for psychologists (pp. 83–101). Washington, DC: American Psychological Association.

McKnight, P. E., McKnight, K. M., Sidani, S., & Figueredo, A. J. (2007). Missing data: A gentle introduction. New York: Guilford Press.

Mroczek, D. K., Pitzer, L., Miller, L., Turiano, N., & Fingerman, K. (2011). The use of secondary data in adult development and aging research. In K. H. Trzesniewski, M. B. Donnellan, & R. E. Lucas (Eds.), Secondary data analysis: An introduction for psychologists (pp. 121–132). Washington, DC: American Psychological Association.

Pienta, A. M., O’Rourke, J. M., & Franks, M. M. (2011). Getting started: Working with secondary data. In K. H. Trzesniewski, M. B. Donnellan, & R. E. Lucas (Eds.), Secondary data analysis: An introduction for psychologists (pp. 13–25). Washington, DC: American Psychological Association.

Rosenberg, M. (1965). Society and the adolescent self-image. Princeton, NJ: Princeton University Press.

Russell, S. T., Crockett, L. J., Shen, Y.-L., & Lee, S.-A. (2008). Cross-ethnic invariance of self-esteem and depression measures for Chinese, Filipino, and European American adolescents. Journal of Youth and Adolescence, 37, 50–61.

Russell, S. T., & Matthews, E. (2011). Using secondary data to study adolescence and adolescent development. In K. H. Trzesniewski, M. B. Donnellan, & R. E. Lucas (Eds.), Secondary data analysis: An introduction for psychologists (pp. 163–176). Washington, DC: American Psychological Association.

Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7, 147–177.

Schmitt, N. (1996). Uses and abuses of coefficient alpha. Psychological Assessment, 8, 350–353.

Schwarz, N. (1999). Self-reports: How the questions shape the answers. American Psychologist, 54, 93–105.

Schwarz, N., & Strack, F. (1999). Reports of subjective well-being: Judgmental processes and their methodological implications. In D. Kahneman, E. Diener, & N. Schwarz (Eds.), Well-being: The foundations of hedonic psychology (pp. 61–84). New York: Russell Sage Foundation.

Sears, D. O. (1986). College sophomores in the lab: Influences of a narrow data base on social psychology’s view of human nature. Journal of Personality and Social Psychology, 51, 515–530.

Shrout, P. E., & Napier, J. L. (2011). Analyzing survey data with complex sampling designs. In K. H. Trzesniewski, M. B. Donnellan, & R. E. Lucas (Eds.), Secondary data analysis: An introduction for psychologists (pp. 63–81). Washington, DC: American Psychological Association.

Simms, L. J. (2008). Classical and modern methods of psychological scale construction. Social and Personality Psychology Compass, 2/1, 414–433.

Simms, L. J., & Watson, D. (2007). The construct validation approach to personality scale creation. In R. W. Robins, R. C. Fraley, & R. F. Krueger (Eds.), Handbook of Research Methods in Personality Psychology (pp. 240–258). New York: Guilford Press.

Smith, G. T. (2005). On construct validity: Issues of method and measurement. Psychological Assessment, 17, 396–408.

Tracy, J. L., Robins, R. W., & Sherman, J. W. (2009). The practice of psychological science: Searching for Cronbach’s two streams in social-personality psychology. Journal of Personality and Social Psychology, 96, 1206–1225.

Trzesniewski, K. H., & Donnellan, M. B. (2009). Re-evaluating the evidence for increasing self-views among high school students: More evidence for consistency across generations (1976–2006). Psychological Science, 20, 920–922.

Trzesniewski, K. H., & Donnellan, M. B. (2010). Rethinking “Generation Me”: A study of cohort effects from 1976–2006. Perspectives on Psychological Science, 5, 58–75.

Trzesniewski, K. H., Donnellan, M. B., & Lucas, R. E. (Eds.). (2011). Secondary data analysis: An introduction for psychologists. Washington, DC: American Psychological Association.

Widaman, K. F. (2006). Missing data: What to do with or without them. Monographs of the Society for Research in Child Development, 71, 42–64.

Widaman, K. F., Little, T. D., Preacher, K. J., & Sawalani, G. M. (2011). On creating and using short forms of scales in secondary research. In K. H. Trzesniewski, M. B. Donnellan, & R. E. Lucas (Eds.), Secondary data analysis: An introduction for psychologists (pp. 39–61). Washington, DC: American Psychological Association.

Willms, J. D. (2011). Managing and using secondary data sets with multidisciplinary research teams. In K. H. Trzesniewski, M. B. Donnellan, & R. E. Lucas (Eds.), Secondary data analysis: An introduction for psychologists (pp. 27–38). Washington, DC: American Psychological Association.

Sage Research Methods Community

Using Secondary Data in Mixed Methods is More Straight-Forward Than You Think

By Daphne C. Watkins

Professor Watkins is the author of Secondary Data in Mixed Methods Research. She is a professor of social work and a University Diversity and Social Transformation Professor at the University of Michigan. Professor Watkins is also the founding director of the Gender and Health Research (GendHR) Lab and the award-winning Young Black Men, Masculinities, and Mental Health (YBMen) Project, which leverages technology to provide mental health education and social support for young Black men. She currently directs the Vivian A. and James L. Curtis Center for Health Equity Research and Training at the University of Michigan.

Novice researchers often assume they are not really doing research if their process does not involve collecting data. However, more seasoned researchers know this could not be further from the truth. While writing my 2022 book, Secondary Data in Mixed Methods Research, I took a deep dive into the origins of secondary data and its strengths and limitations for use in mixed methods studies. Under certain circumstances, you must demonstrate your ability to carry out the complete research cycle, from beginning to end, including idea conception, data collection, data analysis, and reporting. But beyond these isolated occurrences, secondary data are available and worth consideration when addressing your research questions.

Secondary data can help you become more familiar with a topic you do not know. When I ask a room full of graduate students about the most fundamental benefits of secondary data, they usually tell me that its money- and time-saving advantages are what make secondary data most attractive. However, I believe secondary data are an invitation from the original researchers to think more intentionally about our research questions outside our typical (albeit limited) boundaries. Furthermore, having a chance to work with data you did not collect can offer you an insider perspective on someone else’s research design, its strengths and weaknesses, and its ability to answer research questions that may or may not have been in the minds of the original researchers. An inside view of secondary data can also teach budding researchers about the decisions they will make when someday leading their own research projects.

Easy-to-follow instructions do not always accompany secondary data sets. Therefore, your ability to sift through data records and make sense of the available information associated with secondary data sets is essential. In this post, I briefly cover the advantages and disadvantages of using secondary data in mixed methods research, how to prioritize secondary qualitative and quantitative data in a mixed methods project, and the role of theory in mixed methods with secondary data.

In research, there are at least two types of data: primary and secondary. Primary data refer to data collected in the present moment for a specific project; they are also known as "original" or "raw" data. On the other hand, secondary data have previously been collected by you or someone else and can be used for a project now or in the future. Secondary data can take various forms, such as publicly available data like the U.S. Census or smaller, privately owned data like those from a local after-school program. The distinction between primary and secondary data also involves a time element because it is possible to collect and use primary data for secondary purposes. However, no set amount of time must pass before primary data can be considered secondary. The only distinguishing factor is whether there is a fundamental research question to answer and whether the data to address that question already exist. For instance, if data were collected a year ago to answer one research question and are now being used to answer a different research question, those data are considered secondary data being used for a secondary purpose.

Using secondary data in mixed methods research is the process of identifying, evaluating, and incorporating one or more secondary qualitative or quantitative data sources into a mixed methods project. Incorporating secondary data expands on the original definition of mixed methods research, which involves collecting, analyzing, and integrating qualitative and quantitative approaches to study a research problem from multiple perspectives. Conducting a mixed methods project can lead to a more comprehensive understanding of the problem being studied because qualitative methods provide insight into the experiences and perspectives of individuals, while quantitative methods provide statistical results and help to generalize findings. The use of secondary data in mixed methods research broadens the purpose of secondary data analysis, which is to analyze existing data by addressing research questions similar to or different from those for which the original data were collected.

Incorporating secondary data into your mixed methods research will let you make good use of data in which time, resources, and energy have already been invested. But a common question is how theory should be used in mixed methods with secondary data. Your purpose statement, research questions, and prioritization of either the quantitative or qualitative phase of the study should guide decisions about how to use theory in your mixed methods study with secondary data. For instance, in sequential designs, prioritized phases come before other phases of the study. So, if the quantitative phase of your mixed methods study takes priority, the theory will guide your selection of variables. If the prioritized quantitative phase includes secondary data, the theory used by the original investigators may guide your preliminary selection and understanding of the variables used in those data to answer your research questions. In a mixed methods study where the qualitative phase is prioritized, a conceptual framework is generated to be tested using a subsequent quantitative study. If the qualitative data happen to be secondary, you might use the qualitative concepts from the original researchers to guide your theory construction and then test this theory in a subsequent quantitative phase. So you see, theory does not take a backseat but rather can play a critical role in shaping the study design for mixed methods with secondary data.

In closing, I want to emphasize the benefits of using secondary data in mixed methods research, such as cost and time savings, not to mention the chance to access larger datasets that may not be feasible to collect as primary data. Additionally, using secondary data can help you triangulate findings and increase the validity and reliability of your results by combining multiple data sources with similar variables and concepts. Whether you are an aspiring researcher or a seasoned scholar, you must know both the benefits and challenges of using secondary data in your research and make informed decisions about whether to use them based on your research questions and goals.

Secondary Research: Definition, Methods and Examples.

In the world of research, there are two main types of data sources: primary and secondary. While primary research involves collecting new data directly from individuals or sources, secondary research involves analyzing existing data already collected by someone else. Today we’ll discuss secondary research.

One common source of this research is published research reports and other documents. These materials can often be found in public libraries, on websites, or even as data extracted from previously conducted surveys. In addition, many government and non-government agencies maintain extensive data repositories that can be accessed for research purposes.

While secondary research may not offer the same level of control as primary research, it can be a highly valuable tool for gaining insights and identifying trends. Researchers can save time and resources by leveraging existing data sources while still uncovering important information.

What is Secondary Research: Definition

Secondary research is a research method that involves using already existing data. Existing data is summarized and collated to increase the overall effectiveness of the research.

One of the key advantages of secondary research is that it allows us to gain insights and draw conclusions without having to collect new data ourselves. This can save time and resources and also allow us to build upon existing knowledge and expertise.

When conducting secondary research, it’s important to be thorough and thoughtful in our approach. This means carefully selecting the sources and ensuring that the data we’re analyzing is reliable and relevant to the research question. It also means being critical and analytical in the analysis and recognizing any potential biases or limitations in the data.


Secondary research is much more cost-effective than primary research because it uses already existing data. In primary research, by contrast, organizations or businesses collect data firsthand or employ a third party to collect it on their behalf.


Secondary Research Methods with Examples

Secondary research is cost-effective, which is one of the reasons it is a popular choice among many businesses and organizations. Not every organization is able to pay a huge sum of money to conduct research and gather data. Secondary research is therefore aptly termed “desk research”, as the data can be retrieved while sitting behind a desk.


The following are popularly used secondary research methods and examples:

1. Data Available on The Internet

One of the most popular ways to collect secondary data is the internet. Data is readily available on the internet and can be downloaded at the click of a button.

This data is practically free of cost, or one may have to pay a negligible amount to download the already existing data. Websites have a lot of information that businesses or organizations can use to suit their research needs. However, organizations should collect information only from authentic and trusted websites.

2. Government and Non-Government Agencies

Data for secondary research can also be collected from some government and non-government agencies. For example, US Government Printing Office, US Census Bureau, and Small Business Development Centers have valuable and relevant data that businesses or organizations can use.

There is a certain cost applicable to download or use data available with these agencies. Data obtained from these agencies are authentic and trustworthy.

3. Public Libraries

Public libraries are another good source to search for data for this research. Public libraries have copies of important research that were conducted earlier. They are a storehouse of important information and documents from which information can be extracted.

The services provided in these public libraries vary from one library to another. More often than not, libraries have a huge collection of government publications with market statistics, as well as large collections of business directories and newsletters.

4. Educational Institutions

The importance of collecting data from educational institutions for secondary research is often overlooked. However, more research is conducted in colleges and universities than in any other business sector.

The data collected by universities is mainly for primary research. However, businesses or organizations can approach educational institutions and request data from them.

5. Commercial Information Sources

Local newspapers, journals, magazines, radio and TV stations are a great source to obtain data for secondary research. These commercial information sources have first-hand information on economic developments, political agenda, market research, demographic segmentation and similar subjects.

Businesses or organizations can request to obtain data that is most relevant to their study. Businesses not only have the opportunity to identify their prospective clients but can also know about the avenues to promote their products or services through these sources as they have a wider reach.

Key Differences between Primary Research and Secondary Research

Understanding the distinction between primary research and secondary research is essential in determining which research method is best for your project. These are the two main types of research methods, each with advantages and disadvantages. In this section, we will explore the critical differences between the two and when it is appropriate to use them.

How to Conduct Secondary Research?

We have already learned about the differences between primary and secondary research. Now, let’s take a closer look at how to conduct it.

Secondary research is an important tool for gathering information already collected and analyzed by others. It can help us save time and money and allow us to gain insights into the subject we are researching. So, in this section, we will discuss some common methods and tips for conducting it effectively.

Here are the steps involved in conducting secondary research:

1. Identify the topic of research: Before beginning secondary research, identify the topic that needs research. Once that’s done, list the research attributes and the purpose of the research.

2. Identify research sources: Next, narrow down the information sources that will provide the most relevant data and information for your research.

3. Collect existing data: Once the data collection sources are narrowed down, check for any existing data that is closely related to the topic. Data related to the research can be obtained from various sources such as newspapers, public libraries, and government and non-government agencies.

4. Combine and compare: Once data is collected, combine and compare it to remove any duplication and assemble the data into a usable format. Make sure to collect data from authentic sources. Incorrect data can hamper research severely.

5. Analyze data: Analyze the collected data and identify whether all questions are answered. If not, repeat the process to delve further into actionable insights.
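The "combine and compare" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names (`region`, `year`, `households`), and the helper `combine_sources` are hypothetical and not part of any particular tool.

```python
from typing import Iterable


def combine_sources(*sources: Iterable[dict], key_fields=("region", "year")) -> list[dict]:
    """Merge records from several sources, keeping the first record seen per key."""
    seen: dict[tuple, dict] = {}
    for source in sources:
        for record in source:
            # Records that share the same key fields are treated as duplicates.
            key = tuple(record[field] for field in key_fields)
            seen.setdefault(key, record)
    return list(seen.values())


# Hypothetical records gathered from two different secondary sources.
agency_data = [
    {"region": "North", "year": 2022, "households": 1200},
    {"region": "South", "year": 2022, "households": 950},
]
library_data = [
    {"region": "South", "year": 2022, "households": 950},  # duplicate of an agency record
    {"region": "East", "year": 2022, "households": 700},
]

combined = combine_sources(agency_data, library_data)
print(len(combined))  # 3 unique records remain
```

Keeping the first record per key is a design choice for this sketch. If sources disagree on a value, a real project would need a rule for deciding which source to trust.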

Advantages of Secondary Research

Secondary research offers a number of advantages to researchers, including efficiency, the ability to build upon existing knowledge, and the ability to conduct research in situations where primary research may not be possible or ethical. By carefully selecting their sources and being thoughtful in their approach, researchers can leverage secondary research to drive impact and advance the field. Some key advantages are the following:

1. Most information in this research is readily available. There are many sources from which relevant data can be collected and used, unlike primary research, where data needs to be collected from scratch.

2. This is a less expensive and less time-consuming process, as the data required is easily available and doesn’t cost much if extracted from authentic sources. Obtaining the data involves minimal expenditure.

3. The data that is collected through secondary research gives organizations or businesses an idea about the effectiveness of primary research. Hence, organizations or businesses can form a hypothesis and evaluate cost of conducting primary research.

4. Secondary research is quicker to conduct because of the availability of data. It can be completed within a few weeks depending on the objective of businesses or scale of data needed.

As we can see, this research is the process of analyzing data already collected by someone else, and it can offer a number of benefits to researchers.

Disadvantages of Secondary Research

On the other hand, secondary research comes with some disadvantages. Some of the most notable are the following:

1. Although data is readily available, credibility evaluation must be performed to understand the authenticity of the information available.

2. Not all secondary data resources offer the latest reports and statistics. Even when the data is accurate, it may not be recent enough to reflect current conditions.

3. Secondary research derives its conclusions from collective primary research data. The success of your research will depend, to a great extent, on the quality of the primary research already conducted.


In conclusion, secondary research is an important tool for researchers exploring various topics. By leveraging existing data sources, researchers can save time and resources, build upon existing knowledge, and conduct research in situations where primary research may not be feasible.

There are a variety of methods and examples of secondary research, from analyzing public data sets to reviewing previously published research papers. As students and aspiring researchers, it’s important to understand the benefits and limitations of this research and to approach it thoughtfully and critically. By doing so, we can continue to advance our understanding of the world around us and contribute to meaningful research that positively impacts society.

QuestionPro can be a useful tool for conducting secondary research in a variety of ways. You can create online surveys that target a specific population, collecting data that can be analyzed to gain insights into consumer behavior, attitudes, and preferences; analyze existing data sets that you have obtained through other means or benchmark your organization against others in your industry or against industry standards. The software provides a range of benchmarking tools that can help you compare your performance on key metrics, such as customer satisfaction, with that of your peers.

What is Secondary Data? + [Examples, Sources, & Analysis]

busayo.longe


Aside from consulting the primary origin or source, data can also be collected through a third party, a process common with secondary data. It takes advantage of the data collected from previous research and uses it to carry out new research.

Secondary data is one of the two main types of data, the other being primary data. Both data types are very useful in research and statistics, but for the sake of this article, we will restrict our scope to secondary data.

We will study secondary data, its examples, sources, and methods of analysis.

What is Secondary Data?  

Secondary data is the data that has already been collected through primary sources and made readily available for researchers to use for their own research. It is a type of data that has already been collected in the past.

A researcher may have collected the data for a particular project, then made it available to be used by another researcher. The data may also have been collected for general use with no specific research purpose like in the case of the national census.

Data classified as secondary for particular research may be said to be primary for another research. This is the case when data is being reused, making it primary data for the first research and secondary data for the second research it is being used for.

Sources of Secondary Data

Sources of secondary data include books, personal sources, journals, newspapers, websites, government records, etc. Secondary data is known to be more readily available than primary data. Using these sources requires very little research and manpower.

With the advent of electronic media and the internet, secondary data sources have become more easily accessible. Some of these sources are highlighted below.

  • Books

Books are one of the most traditional ways of collecting data. Today, there are books available for all topics you can think of. When carrying out research, all you have to do is look for a book on the topic being researched, then select from the available repository of books in that area. Books, when carefully chosen, are a source of authentic data and can be useful in preparing a literature review.

  • Published Sources

There are a variety of published sources available for different research topics. The authenticity of the data generated from these sources depends majorly on the writer and publishing company. 

Published sources may be printed or electronic as the case may be. They may be paid or free depending on the writer and publishing company’s decision.

  • Unpublished Personal Sources

These may not be as readily available or easily accessible as published sources. They only become accessible if the researcher who holds them shares them with another researcher, who is not allowed to share them with a third party.

For example, the product management team of an organization may need data on customer feedback to assess what customers think about their product and improvement suggestions. They will need to collect the data from the customer service department, which primarily collected the data to improve customer service.

  • Journals

Journals are gradually becoming more important than books where data collection is concerned. This is because journals are updated regularly with new publications, and therefore provide up-to-date information.

Also, journals are usually more specific when it comes to research. For example, we can have a journal on “Secondary data collection for quantitative data”, while a book will simply be titled “Secondary data collection”.

  • Newspapers

In most cases, the information passed through a newspaper is very reliable, which makes newspapers one of the most authentic sources of secondary data.

The kind of data commonly shared in newspapers is usually more political, economic, and educational than scientific. Therefore, newspapers may not be the best source for scientific data collection.

  • Websites

The information shared on websites is mostly not regulated and as such may not be as trustworthy as other sources. However, there are some regulated websites that only share authentic data and can be trusted by researchers.

Most of these websites are government websites or belong to private organizations that are paid data collectors.

  • Blogs

Blogs are one of the most common online sources for data and may even be less authentic than websites. These days, practically everyone owns a blog, and a lot of people use these blogs to drive traffic to their website or make money through paid ads.

Therefore, they cannot always be trusted. For example, a blogger may write good things about a product because he or she was paid to do so by the manufacturer even though these things are not true.

  • Diaries

Diaries are personal records and as such are rarely used for data collection by researchers. They are usually private, although these days some people share public diaries containing specific events in their lives.

A common example of this is Anne Frank’s diary, a firsthand account of life in hiding during the Nazi occupation of the Netherlands.

  • Government Records

Government records are a very important and authentic source of secondary data. They contain information useful in marketing, management, humanities, and social science research.

Some of these records include census data, health records, education institute records, etc. They are usually collected to aid proper planning, allocation of funds, and prioritizing of projects.

  • Podcasts

Podcasts are becoming very common these days, and a lot of people listen to them as an alternative to radio. They are more or less like online radio stations and are growing in popularity.

Information is usually shared during podcasts, and listeners can use it as a source of data collection. 

Some other sources of data collection include:

  • Radio stations
  • Public sector records.

What are the Secondary Data Collection Tools?

Popular tools used to collect secondary data include bots, devices, libraries, etc. To ease the data collection process from the sources of secondary data highlighted above, researchers use these important tools, which are explained below.

  • Bots

There is a lot of data online, and it may be difficult for researchers to browse through all of it and find what they are actually looking for. To ease this process of data collection, programmers have created bots that automatically scrape the web for relevant data.

These bots are “software robots” programmed to perform some task for the researcher. It is common for businesses to use bots to pull data from forums and social media for sentiment and competitive analysis.
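As a rough illustration of what such a bot does once it has a page in hand, the sketch below pulls headings out of an HTML document using Python's standard-library `html.parser`. The sample page, the choice of `<h2>` as the element of interest, and the class name `HeadingCollector` are assumptions made for the example. A real bot would also fetch pages over the network and should respect each site's terms of use.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of every <h2> element in an HTML document."""

    def __init__(self):
        super().__init__()
        self._in_h2 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.headings.append("")  # start a new heading buffer

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        if self._in_h2:
            self.headings[-1] += data


# Hypothetical forum page with two review posts.
SAMPLE_PAGE = """
<html><body>
<h2>Great battery life</h2><p>Lasts two days.</p>
<h2>Screen scratches easily</h2><p>Needed a protector.</p>
</body></html>
"""

collector = HeadingCollector()
collector.feed(SAMPLE_PAGE)
review_titles = [heading.strip() for heading in collector.headings]
print(review_titles)  # ['Great battery life', 'Screen scratches easily']
```

The extracted titles could then feed a sentiment or competitive analysis step, which is outside the scope of this sketch.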

  • Internet-Enabled Devices

This could be a mobile phone, PC, or tablet that has access to an internet connection. They are used to access journals, books, blogs, etc. to collect secondary data.

  • Library

This is a traditional secondary data collection tool for researchers. The library contains relevant materials for virtually all the research areas you can think of, and it is accessible to everyone.

A researcher might decide to sit in the library for some time to collect secondary data or borrow the materials for some time and return when done collecting the required data.

  • Radio

Radio stations are one of the secondary sources of data collection, and one needs a radio to access them. Advances in technology have made it possible to listen to the radio on mobile phones, making it unnecessary to own a radio.

Secondary Data Analysis  

Secondary data analysis is the process of analyzing data collected from another researcher who primarily collected this data for another purpose. Researchers leverage secondary data to save time and resources that would have been spent on primary data collection.

The secondary data analysis process can be carried out quantitatively or qualitatively depending on the kind of data the researcher is dealing with. The quantitative method of secondary data analysis is used on numerical data and is analyzed mathematically, while the qualitative method uses words to provide in-depth information about data.

How to Analyse Secondary Data

There are different stages of secondary data analysis, which involve events before, during, and after data collection. These stages include:

  • Statement of Purpose

Before collecting secondary data for analysis, you need to know your statement of purpose. That is, a clear understanding of why you are collecting the data—the ultimate aim of the research work and how this data will help achieve it.

This will help direct your path towards collecting the right data, and choosing the best data source and method of analysis.

  • Research Design

This is a written-down plan on how the research activities will be carried out. It describes the kind of data to be collected, the sources of data collection, method of data collection, tools, and even method of analysis.

A research design may also contain a timeline of when each of these activities will be carried out, and therefore serves as a guide for the secondary data analysis.

After identifying the purpose of the research, the researcher should design a research process that will guide the data analysis process.

  • Developing the Research Questions

It is not enough to just know the research purpose; you need to develop research questions that will help in better identifying secondary data. This is because there is usually a pool of data to choose from, and asking the right questions will assist in collecting authentic data.

For example, a researcher trying to collect data about the best fish feeds for fast growth in fish will have to ask questions like: What kind of fish is being considered? Is the data meant to be quantitative or qualitative? What is the content of the fish feed? What is the growth rate in fish after feeding on it?

  • Identifying Secondary Data

After developing the research questions, researchers use them as a guide to identifying relevant data from the data repository. For example, if the kind of data to be collected is qualitative, a researcher can filter out qualitative data.

The suitable secondary data will be the one that correctly answers the questions highlighted above. When looking for the solutions to a linear programming problem, for instance, the solutions will be numbers that satisfy both the objective and the constraints.

Any answer that doesn’t satisfy both is not a solution.

  • Evaluating Secondary Data

This stage is what many classify as the real data analysis stage because it is the point where analysis is actually performed. However, the stages highlighted above are a part of the data analysis process, because they influence how the analysis is performed.

Once a dataset that appears viable in addressing the initial requirements discussed above is located, the next step in the process is the evaluation of the dataset to ensure the appropriateness for the research topic. The data is evaluated to ensure that it really addresses the statement of the problem and answers the research questions.

After that, the data is analyzed using either the quantitative method or the qualitative method, depending on the type of data it is.
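The evaluation stage can be partly automated with simple checks. The sketch below assumes a hypothetical dataset on fish feeds, echoing the research questions above, and flags records that lack a required field or are older than a chosen cutoff year. The field names, the cutoff, and the function `evaluate_dataset` are illustrative assumptions, not a standard procedure.

```python
def evaluate_dataset(records, required_fields, min_year, year_field="year"):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    for index, record in enumerate(records):
        # Check that every field needed to answer the research questions is present.
        missing = [field for field in required_fields if record.get(field) is None]
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")
        # Check that the record is recent enough for the study.
        year = record.get(year_field)
        if year is not None and year < min_year:
            problems.append(f"record {index}: year {year} is older than {min_year}")
    return problems


# Hypothetical secondary dataset on fish feed and growth.
records = [
    {"species": "tilapia", "feed": "A", "growth_g_per_week": 12.5, "year": 2021},
    {"species": "catfish", "feed": "B", "growth_g_per_week": None, "year": 2022},
    {"species": "tilapia", "feed": "C", "growth_g_per_week": 9.8, "year": 2009},
]

problems = evaluate_dataset(
    records,
    required_fields=["species", "feed", "growth_g_per_week"],
    min_year=2015,
)
for problem in problems:
    print(problem)
```

Checks like these only cover completeness and recency. Judging whether the data truly addresses the research problem still requires the researcher's own assessment.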

Advantages of Secondary Data

  • Ease of Access

Most of the sources of secondary data are easily accessible to researchers. Most of these sources can be accessed online through a mobile device.  People who do not have access to the internet can also access them through print.

They are usually available in libraries, book stores, and can even be borrowed from other people.

  • Inexpensive

Secondary data mostly require little to no cost for people to acquire them. Many books, journals, and magazines can be downloaded for free online.  Books can also be borrowed for free from public libraries by people who do not have access to the internet.

Researchers do not have to spend money on investigations, and very little is spent on acquiring books if any.

  • Time-Saving

The time spent on collecting secondary data is usually very little compared to that of primary data. The only investigation necessary for secondary data collection is the process of sourcing for necessary data sources.

This cuts the time that would normally be spent on the investigation and saves the researcher a significant amount of time.

  • Longitudinal and Comparative Studies

Secondary data makes it easy to carry out longitudinal studies without having to wait for a couple of years to draw conclusions. For example, you may want to compare the country’s population according to the census taken 5 years ago with the population now.

Rather than waiting for 5 years, the comparison can easily be made by collecting the census data from 5 years ago and the current figures.
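A comparison like this reduces to a percentage change between two published figures. The sketch below uses made-up population counts purely for illustration, and the years and numbers are hypothetical.

```python
def percent_change(earlier: float, later: float) -> float:
    """Percentage change from an earlier value to a later one."""
    return (later - earlier) / earlier * 100


# Hypothetical census counts taken five years apart.
census = {2015: 4_000_000, 2020: 4_300_000}

change = percent_change(census[2015], census[2020])
print(f"Population change: {change:.1f}%")  # Population change: 7.5%
```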

  • Generating new insights

When re-evaluating data, especially through another person’s lens or point of view, new things are uncovered. There might be a thing that wasn’t discovered in the past by the primary data collector, that secondary data collection may reveal.

For example, when customers complain to the customer service team about difficulty using an app, the team may decide to create a user guide teaching customers how to use it. However, when a product developer has access to this data, it may be uncovered that the issue comes from a UI/UX design that needs to be worked on.

Disadvantages of Secondary Data  

  • Data Quality:

The data collected through secondary sources may not be as authentic as when collected directly from the source. This is a very common disadvantage with online sources due to a lack of regulatory bodies to monitor the kind of content that is being shared.

Therefore, working with this kind of data may have negative effects on the research being carried out.

  • Irrelevant Data:

Researchers spend so much time surfing through a pool of irrelevant data before finally getting the one they need. This is because the data was not collected mainly for the researcher.

In some cases, a researcher may not even find the exact data he or she needs, but have to settle for the next best alternative. 

  • Exaggerated Data

Some data sources are known to exaggerate the information that is being shared. This bias may be meant to maintain a good public image or may be due to a paid advert.

This is very common with many online blogs, which may even go a step further and share false information just to gain web traffic. For example, a FinTech startup may exaggerate the amount of money it has processed just to attract more customers.

A researcher gathering this data to investigate the total amount of money processed by FinTech startups in the US for the quarter may have to use this exaggerated data.

  • Outdated Information

Some of the data sources are outdated and there are no new available data to replace the old ones. For example, the national census is not usually updated yearly.

Therefore, there have been changes in the country’s population since the last census. However, someone working with the country’s population will have to settle for the previously recorded figure even though it is outdated.

Secondary data has various uses in research, business, and statistics. Researchers choose secondary data for different reasons, including price, availability, or the needs of the research.

Although it may be old, secondary data may be the only source of data in some cases. This may be due to the huge cost of performing research or because data collection is delegated to a particular body (e.g. national census).

In short, secondary data has shortcomings that may negatively affect the outcome of the research, but it also has some advantages over primary data. It all depends on the situation, the researcher in question, and the kind of research being carried out.



Secondary Research Guide: Definition, Methods, Examples

Apr 3, 2024


The internet has vastly expanded our access to information, allowing us to learn almost anything about everything. But not all market research is created equal, and this secondary research guide explains why.

There are two key ways to do research. One is to test your own ideas, make your own observations, and collect your own data to derive conclusions. The other is to use secondary research — where someone else has done most of the heavy lifting for you. 

Here’s an overview of secondary research and the value it brings to data-driven businesses.

Secondary Research Definition: What Is Secondary Research?

Primary vs Secondary Market Research

What Are Secondary Research Methods?

Advantages of Secondary Research

Disadvantages of Secondary Research

Best Practices for Secondary Research

How to Conduct Secondary Research with Meltwater

Secondary research definition: The process of collecting information from existing sources and data that have already been analyzed by others.

Secondary research (aka desk research) provides a foundation to help you understand a topic, with the goal of building on existing knowledge. Secondary sources often cover the same information as primary sources, but they add a layer of analysis and explanation.


Users can choose from several secondary research types and sources, including:

  • Journal articles
  • Research papers

With secondary sources, users can draw insights, detect trends, and validate findings to jumpstart their research efforts.

Primary vs. Secondary Market Research

We’ve touched a little on primary research, but it’s essential to understand exactly how primary and secondary research are unique.


Think of primary research as the “thing” itself, and secondary research as the analysis of the “thing,” like these primary and secondary research examples:

  • An expert gives an interview (primary research) and a marketer uses that interview to write an article (secondary research).
  • A company conducts a consumer satisfaction survey (primary research) and a business analyst uses the survey data to write a market trend report (secondary research).
  • A marketing team launches a new advertising campaign across various platforms (primary research) and a marketing research firm, like Meltwater for market research, compiles the campaign performance data to benchmark against industry standards (secondary research).

In other words, primary sources make original contributions to a topic or issue, while secondary sources analyze, synthesize, or interpret primary sources.

Both are necessary when optimizing a business, gaining a competitive edge, improving marketing, or understanding consumer trends that may impact your business.

What Are Secondary Research Methods?

Secondary research methods focus on analyzing existing data rather than collecting primary data. Common examples of secondary research methods include:

  • Literature review. Researchers analyze and synthesize existing literature (e.g., white papers, research papers, articles) to find knowledge gaps and build on current findings.
  • Content analysis. Researchers review media sources and published content to find meaningful patterns and trends.
  • AI-powered secondary research. Platforms like Meltwater for market research analyze vast amounts of complex data and use AI technologies like natural language processing and machine learning to turn data into contextual insights.

Researchers today have access to more market research tools and technology than ever before, allowing them to streamline their efforts and improve their findings.


Conducting secondary research offers benefits in every job function and use case, from marketing to the C-suite. Here are a few advantages you can expect.

Cost and time efficiency

Using existing research saves you time and money compared to conducting primary research. Secondary data is readily available and easily accessible via libraries, free publications, or the Internet. This is particularly advantageous when you face time constraints or when a project requires a large amount of data and research.

Access to large datasets

Secondary data gives you access to larger data sets and sample sizes compared to what primary methods may produce. Larger sample sizes can improve the statistical power of the study and add more credibility to your findings.
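
The link between sample size and statistical power can be made concrete with a short calculation. The sketch below is illustrative only: it assumes a two-sided, two-sample z-test with equal group sizes and a standardized effect size, and the function name `two_sample_power` is a hypothetical helper, not part of any tool mentioned here.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (an assumption of this
    sketch); n_per_group is the number of observations in each group.
    """
    std_normal = NormalDist()
    # Critical value for a two-sided test at significance level alpha.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Same assumed effect size, increasing sample sizes.
    for n in (20, 64, 200):
        print(n, round(two_sample_power(0.5, n), 3))
```

With the effect size held fixed, the computed power rises as the per-group sample size grows, which is the sense in which a larger secondary dataset can strengthen a study.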

Ability to analyze trends and patterns

Using larger sample sizes, researchers have more opportunities to find and analyze trends and patterns. The more data that supports a trend or pattern, the more trustworthy the trend becomes and the more useful for making decisions. 

Historical context

Using a combination of older and recent data allows researchers to gain historical context about patterns and trends. Learning what’s happened before can help decision-makers gain a better current understanding and improve how they approach a problem or project.

Basis for further research

Ideally, you’ll use secondary research to further other efforts. Secondary sources help to identify knowledge gaps, highlight areas for improvement, or conduct deeper investigations.

Secondary research comes with a few drawbacks, though these aren’t necessarily deal breakers when deciding to use secondary sources.

Reliability concerns

Researchers don’t always know where the data comes from or how it was collected, which can lead to reliability concerns. They don’t control the initial process, nor do they always know the original purpose for collecting the data, both of which can lead to skewed results.

Potential bias

The original data collectors may have a specific agenda when doing their primary research, which may lead to biased findings. Evaluating the credibility and integrity of secondary data sources can prove difficult.

Outdated information

Secondary sources may contain outdated information, especially when dealing with rapidly evolving trends or fields. Using outdated information can lead to inaccurate conclusions and widen knowledge gaps.

Limitations in customization

Relying on secondary data means being at the mercy of what’s already published. It doesn’t consider your specific use cases, which limits you as to how you can customize and use the data.

A lack of relevance

Secondary research rarely holds all the answers you need, at least from a single source. You typically need multiple secondary sources to piece together a narrative, and even then you might not find the specific information you need.

To make secondary market research your new best friend, you’ll need to think critically about its strengths and find ways to overcome its weaknesses. Let’s review some best practices to use secondary research to its fullest potential.

Identify credible sources for secondary research

To overcome the challenges of bias, accuracy, and reliability, choose secondary sources that have a demonstrated history of excellence. For example, an article published in a medical journal naturally has more credibility than a blog post on a little-known website.

Assess credibility based on peer reviews, author expertise, sampling techniques, publication reputation, and data collection methodologies. Cross-reference the data with other sources to gain a general consensus of truth.

The more credibility “factors” a source has, the more confidently you can rely on it. 

Evaluate the quality and relevance of secondary data

You can gauge the quality of the data by asking simple questions:

  • How complete is the data? 
  • How old is the data? 
  • Is this data relevant to my needs?
  • Does the data come from a known, trustworthy source?

It’s best to focus on data that aligns with your research objectives. Knowing the questions you want to answer and the outcomes you want to achieve ahead of time helps you focus only on data that offers meaningful insights.
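
The first two questions above (how complete and how old the data is) can be answered numerically before any analysis begins. The following is a minimal sketch under stated assumptions: records are held as a list of dictionaries, a missing value is `None` or an empty string, and the field names and helper functions are hypothetical examples.

```python
from datetime import date


def completeness(records: list[dict], fields: list[str]) -> float:
    """Share of expected cells that actually hold a value (0.0 to 1.0)."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for row in records
        for field in fields
        if row.get(field) not in (None, "")
    )
    return filled / total


def age_in_years(collected_on: date, today: date) -> float:
    """Approximate age of a dataset in years, using 365.25 days per year."""
    return (today - collected_on).days / 365.25


if __name__ == "__main__":
    # Hypothetical survey extract with one missing value.
    sample = [
        {"region": "North", "score": 4, "segment": "retail"},
        {"region": "South", "score": None, "segment": "retail"},
    ]
    print(round(completeness(sample, ["region", "score", "segment"]), 2))
    print(round(age_in_years(date(2020, 1, 1), date(2024, 1, 1)), 1))
```

What counts as complete enough or recent enough depends on the research objective, so any thresholds applied to these numbers are a judgment call rather than a fixed rule.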

Document your sources 

If you’re sharing secondary data with others, it’s essential to document your sources to gain others’ trust. They don’t have the benefit of being “in the trenches” with you during your research, and sharing your sources can add credibility to your findings and gain instant buy-in.

Secondary market research offers an efficient, cost-effective way to learn more about a topic or trend, providing a comprehensive understanding of the customer journey. Compared to primary research, users can gain broader insights, analyze trends and patterns, and gain a solid foundation for further exploration by using secondary sources.

Meltwater for market research speeds up the time to value in using secondary research with AI-powered insights, enhancing your understanding of the customer journey. Using natural language processing, machine learning, and trusted data science processes, Meltwater helps you find relevant data and automatically surfaces insights to help you understand its significance. Our solution identifies hidden connections between data points you might not know to look for and spells out what the data means, allowing you to make better decisions based on accurate conclusions.

Dissertations 4: Methodology: Methods

Primary & Secondary Sources, Primary & Secondary Data

When describing your research methods, you can start by stating what kind of secondary and, if applicable, primary sources you used in your research. Explain why you chose such sources, how well they served your research, and identify possible issues encountered using these sources.  

Definitions  

There is some confusion on the use of the terms primary and secondary sources, and primary and secondary data. The confusion is also due to disciplinary differences (Lombard 2010). Whilst you are advised to consult the research methods literature in your field, we can generalise as follows:  

Secondary sources 

Secondary sources normally include the literature (books and articles) with the experts' findings, analysis and discussions on a certain topic (Cottrell, 2014, p123). Secondary sources often interpret primary sources.  

Primary sources 

Primary sources are "first-hand" information such as raw data, statistics, interviews, surveys, law statutes and law cases. Even literary texts, pictures and films can be primary sources if they are the object of research (rather than, for example, documentaries reporting on something else, in which case they would be secondary sources). The distinction between primary and secondary sources sometimes lies in the use you make of them (Cottrell, 2014, p123).

Primary data 

Primary data are data (primary sources) you directly obtained through your empirical work (Saunders, Lewis and Thornhill 2015, p316). 

Secondary data 

Secondary data are data (primary sources) that were originally collected by someone else (Saunders, Lewis and Thornhill 2015, p316).   

Comparison between primary and secondary data   

Use  

Virtually all research will use secondary sources, at least as background information. 

Often, especially at the postgraduate level, it will also use primary sources - secondary and/or primary data. The engagement with primary sources is generally appreciated, as less reliant on others' interpretations, and closer to 'facts'. 

The use of primary data, as opposed to secondary data, demonstrates the researcher's effort to do empirical work and find evidence to answer her specific research question and fulfill her specific research objectives. Thus, primary data contribute to the originality of the research.    

Ultimately, you should state in this section of the methodology: 

What sources and data you are using and why (how they are going to help you answer the research question and/or test the hypothesis).

If using primary data, why you employed certain strategies to collect them. 

What the advantages and disadvantages of your data collection strategies are (also refer to the research in your field and the research methods literature).

Quantitative, Qualitative & Mixed Methods

The methodology chapter should reference your use of quantitative research, qualitative research and/or mixed methods. The following is a description of each along with their advantages and disadvantages. 

Quantitative research 

Quantitative research uses numerical data (quantities) deriving, for example, from experiments, closed questions in surveys, questionnaires, structured interviews or published data sets (Cottrell, 2014, p93). It normally processes and analyses this data using quantitative analysis techniques like tables, graphs and statistics to explore, present and examine relationships and trends within the data (Saunders, Lewis and Thornhill, 2015, p496). 

Qualitative research  

Qualitative research is generally undertaken to study human behaviour and psyche. It uses methods like in-depth case studies, open-ended survey questions, unstructured interviews, focus groups, or unstructured observations (Cottrell, 2014, p93). The nature of the data is subjective, and the researcher's analysis also involves a degree of subjective interpretation. Subjectivity can be controlled for in the research design, or has to be acknowledged as a feature of the research. Subject-specific books on (qualitative) research methods offer guidance on such research designs.

Mixed methods 

Mixed-method approaches combine both qualitative and quantitative methods, and therefore combine the strengths of both types of research. Mixed methods have gained popularity in recent years.  

When undertaking mixed-methods research you can collect the qualitative and quantitative data either concurrently or sequentially. If sequentially, you can, for example, start with a few semi-structured interviews, providing qualitative insights, and then design a questionnaire to obtain quantitative evidence that your qualitative findings can also apply to a wider population (Specht, 2019, p138).

Ultimately, your methodology chapter should state: 

Whether you used quantitative research, qualitative research or mixed methods. 

Why you chose such methods (and refer to research method sources). 

Why you rejected other methods. 

How well the method served your research. 

The problems or limitations you encountered. 

Doug Specht, Senior Lecturer at the Westminster School of Media and Communication, explains mixed methods research in the following video:

LinkedIn Learning Video on Academic Research Foundations: Quantitative

The video covers the characteristics of quantitative research, and explains how to approach different parts of the research process, such as creating a solid research question and developing a literature review. He goes over the elements of a study, explains how to collect and analyze data, and shows how to present your data in written and numeric form.

Some Types of Methods

There are several methods you can use to get primary data. To reiterate, the choice of the methods should depend on your research question/hypothesis. 

Whatever methods you will use, you will need to consider: 

Why did you choose one technique over another? What were the advantages and disadvantages of the technique you chose?

What was the size of your sample? Who made up your sample? How did you select your sample population? Why did you choose that particular sampling strategy?

ethical considerations (see also tab...)  

safety considerations  

validity  

feasibility  

recording  

procedure of the research (see box procedural method...).  

Check Stella Cottrell's book  Dissertations and Project Reports: A Step by Step Guide  for some succinct yet comprehensive information on most methods (the following account draws mostly on her work). Check a research methods book in your discipline for more specific guidance.  

Experiments 

Experiments are useful to investigate cause and effect, when the variables can be tightly controlled. They can test a theory or hypothesis in controlled conditions. Experiments do not prove or disprove a hypothesis; instead, they support or fail to support it. When using the empirical and inductive method it is not possible to achieve conclusive results. The results may only be valid until falsified by other experiments and observations.

Observations 

Observational methods are useful for in-depth analyses of behaviours in people, animals, organisations, events or phenomena. They can test a theory or products in real life or simulated settings. They are generally a qualitative research method.

Questionnaires and surveys 

Questionnaires and surveys are useful to gain opinions, attitudes, preferences, understandings on certain matters. They can provide quantitative data that can be collated systematically; qualitative data, if they include opportunities for open-ended responses; or both qualitative and quantitative elements. 

Interviews  

Interviews are useful to gain rich, qualitative information about individuals' experiences, attitudes or perspectives. With interviews you can follow up immediately on responses for clarification or further details. There are three main types of interviews: structured (following a strict pattern of questions, which expect short answers), semi-structured (following a list of questions, with the opportunity to follow up the answers with improvised questions), and unstructured (following a short list of broad questions, where the respondent can take more of a lead in the conversation) (Specht, 2019, p142).

This short video on qualitative interviews discusses best practices and covers qualitative interview design, preparation and data collection methods. 

Focus groups   

In this case, a group of people (normally, 4-12) is gathered for an interview where the interviewer puts questions to the group. Group interactions and discussions can be highly productive, but the researcher has to beware of the group effect, whereby certain participants and views dominate the interview (Saunders, Lewis and Thornhill 2015, p419). The researcher can try to minimise this by encouraging involvement of all participants and promoting a multiplicity of views.

This video focuses on strategies for conducting research using focus groups.  

Check out the guidance on online focus groups by Aliaksandr Herasimenka, which is attached at the bottom of this text box. 

Case study 

Case studies are often a convenient way to narrow the focus of your research by studying how a theory or literature fares with regard to a specific person, group, organisation, event or other type of entity or phenomenon you identify. Case studies can be researched using other methods, including those described in this section. Case studies give in-depth insights on the particular reality that has been examined, but they may not be representative of what happens in general, may not be generalisable, and may not be relevant to other contexts. These limitations have to be acknowledged by the researcher.

Content analysis 

Content analysis consists in the study of words or images within a text. In its broad definition, texts include books, articles, essays, historical documents, speeches, conversations, advertising, interviews, social media posts, films, theatre, paintings or other visuals. Content analysis can be quantitative (e.g. word frequency) or qualitative (e.g. analysing intention and implications of the communication). It can detect propaganda, identify intentions of writers, and can see differences in types of communication (Specht, 2019, p146). Check this page on collecting, cleaning and visualising Twitter data.
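
As an illustration of the quantitative end of content analysis, the sketch below counts word frequencies in a short text. It is a minimal example under stated assumptions: tokens are runs of lowercase letters and apostrophes, the stopword list is a small illustrative set rather than a standard one, and `word_frequencies` is a hypothetical helper name.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption of this sketch);
# real studies normally use a longer, documented list.
STOPWORDS = frozenset({"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"})


def word_frequencies(text: str, stopwords: frozenset = STOPWORDS) -> Counter:
    """Count how often each non-stopword token appears in the text."""
    # Lowercase the text and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(token for token in tokens if token not in stopwords)


if __name__ == "__main__":
    sample = "The data is old. The data is free, and free data is tempting."
    freq = word_frequencies(sample)
    # Print the two most frequent remaining words with their counts.
    print(freq.most_common(2))
```

Counts like these only describe what appears in the text; interpreting intention or implication remains the qualitative part of the analysis.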

Extra links and resources:  

Research Methods  

A clear and comprehensive overview of research methods by Emerald Publishing. It includes: crowdsourcing as a research tool; mixed methods research; case study; discourse analysis; grounded theory; repertory grid; ethnographic method and participant observation; interviews; focus group; action research; analysis of qualitative data; survey design; questionnaires; statistics; experiments; empirical research; literature review; secondary data and archival materials; data collection.

Doing your dissertation during the COVID-19 pandemic  

Resources providing guidance on doing dissertation research during the pandemic: online research methods; secondary data sources; webinars, conferences and podcasts.

  • Virtual Focus Groups Guidance on managing virtual focus groups

5 Minute Methods Videos

The following are a series of useful videos that introduce research methods in five minutes. These resources have been produced by lecturers and students with the University of Westminster's School of Media and Communication. 

Case Study Research

Research Ethics

Quantitative Content Analysis 

Sequential Analysis 

Qualitative Content Analysis 

Thematic Analysis 

Social Media Research 

Mixed Method Research 

Procedural Method

In this part, provide an accurate, detailed account of the methods and procedures that were used in the study or the experiment (if applicable!). 

Include specifics about participants, sample, materials, design and methods. 

If the research involves human subjects, then include a detailed description of who and how many participated along with how the participants were selected.  

Describe all materials used for the study, including equipment, written materials and testing instruments. 

Identify the study's design and any variables or controls employed. 

Write out the steps in the order that they were completed. 

Indicate what participants were asked to do, how measurements were taken and any calculations made to raw data collected. 

Specify statistical techniques applied to the data to reach your conclusions. 

Provide evidence that you incorporated rigor into your research. This is the quality of being thorough and accurate and considers the logic behind your research design. 

Highlight any drawbacks that may have limited your ability to conduct your research thoroughly. 

You have to provide details to allow others to replicate the experiment and/or verify the data, to test the validity of the research. 

Bibliography

Cottrell, S. (2014). Dissertations and Project Reports: A Step by Step Guide. Hampshire, England: Palgrave Macmillan.

Lombard, E. (2010). Primary and secondary sources. The Journal of Academic Librarianship, 36(3), 250-253.

Saunders, M.N.K., Lewis, P. and Thornhill, A. (2015). Research Methods for Business Students. New York: Pearson Education.

Specht, D. (2019). The Media And Communications Study Skills Student Guide. London: University of Westminster Press.

  • Last Updated: Sep 14, 2022 12:58 PM
  • URL: https://libguides.westminster.ac.uk/methodology-for-dissertations

How to Analyse Secondary Data for a Dissertation

Secondary data refers to data that has already been collected by another researcher. For researchers (and students!) with limited time and resources, secondary data, whether qualitative or quantitative, can be a highly viable source of data. In addition, with advances in technology and the access to peer-reviewed journals and studies provided by the internet, it is increasingly popular as a form of data collection. The question that frequently arises amongst students, however, is: how is secondary data best analysed?

The process of data analysis in secondary research

Secondary analysis (i.e., the use of existing data) is a systematic methodological approach that has some clear steps that need to be followed for the process to be effective.  In simple terms there are three steps:

  • Step One: Development of research questions
  • Step Two: Identification of dataset
  • Step Three: Evaluation of the dataset

Let’s look at each of these in more detail:

Step One: Development of research questions

Using secondary data means you need to apply theoretical knowledge and conceptual skills to be able to use the dataset to answer research questions. The first step is therefore to define and develop your research questions clearly, so that you know which areas of interest to explore when locating the most appropriate secondary data.

Step Two: Identification of Dataset

This stage should start with identification, through investigation, of what is currently known in the subject area and where there are gaps, and thus what data is available to address these gaps. Sources can be academic, from prior studies that have used quantitative or qualitative data, which can then be gathered and collated to produce a new secondary dataset. In addition, other more informal or “grey” literature can also be incorporated, including consumer reports, commercial studies or similar. One of the values of using secondary research is that original survey works often do not use all the data collected, which means this unused information can be applied to different settings or perspectives.

Key point: Effective use of secondary data means identifying how the data can be used to deliver meaningful and relevant answers to the research questions. In other words, the data used must be a good fit for the study and research questions.

Step Three: Evaluation of the dataset for effectiveness/fit

A good tip is to use a reflective approach for data evaluation. In other words, for each piece of secondary data to be utilised, it is sensible to identify the purpose of the work, the credentials of the authors (i.e., their credibility), what data is provided in the original work, and how long ago it was collected. In addition, consider the methods used and the level of consistency with other works. This is important because the primary method of data collection affects the overall evaluation and analysis when the data is used as a secondary source. In essence, if there is no understanding of the coding used in qualitative data analysis to identify key themes, then there will be a mismatch with interpretations when the data is used for secondary purposes. Furthermore, having multiple sources which draw similar conclusions gives a higher level of validity than relying on only one or two secondary sources.

A useful framework provides a flow chart of decision making, as shown in the figure below.

Analyse Secondary Data

Following this process ensures that only those sources that are most appropriate for your research questions are included in the final dataset, and also demonstrates to your readers that you have been thorough in identifying the right works to use.

Writing up the Analysis

Once you have your dataset, writing up the analysis will depend on the process used. If the data is qualitative in nature, then you should follow the process below.

Pre-Planning

  • Read and re-read all sources, identifying initial observations, correlations, and relationships between themes and how they apply to your research questions.
  • Once initial themes are identified, it is sensible to explore further and identify sub-themes which lead on from the core themes and correlations in the dataset, which encourages identification of new insights and contributes to the originality of your own work.

Structure of the Analysis Presentation

Introduction.

The introduction should commence with an overview of all your sources. It is good practice to present these in a table, listed chronologically so that your work has an orderly and consistent flow. The introduction should also incorporate a brief (2-3 sentences) overview of the key outcomes and results identified.
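
A chronological source table of the kind described above can be assembled with a few lines of code. This is only a sketch: the `Source` fields, the column layout, and the placeholder entries are hypothetical and stand in for your own sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    author: str
    year: int
    method: str


def chronological_table(sources: list[Source]) -> str:
    """Return a plain-text table of sources, oldest first."""
    # Sort by year, then by author so ties have a stable order.
    rows = sorted(sources, key=lambda s: (s.year, s.author))
    header = f"{'Year':<6}{'Author':<12}{'Method'}"
    lines = [header] + [f"{s.year:<6}{s.author:<12}{s.method}" for s in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder entries only; replace with the sources in your dataset.
    placeholder_sources = [
        Source("Author C", 2019, "survey"),
        Source("Author A", 2012, "interviews"),
        Source("Author B", 2015, "case study"),
    ]
    print(chronological_table(placeholder_sources))
```

Keeping the list in code or a spreadsheet also makes it easy to re-sort or add columns (for example, sample size) as the analysis develops.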

The body text for secondary data, irrespective of whether quantitative or qualitative data is used, should be broken up into sub-sections for each argument or theme presented. In the case of qualitative data, depending on whether content, narrative or discourse analysis is used, this means presenting the key papers in the area, their conclusions and how these answer, or not, your research questions. Each source should be clearly cited and referenced at the end of the work. In the case of quantitative data, any figures or tables should be reproduced with the correct citations to their original source. In both cases, it is good practice to give a main heading of a key theme, with sub-headings for each of the sub-themes identified in the analysis.

Do not use direct quotes from secondary data unless they are:

  • properly referenced, and
  • are key to underlining a point or conclusion that you have drawn from the data.

All results sections, regardless of whether primary or secondary data has been used, should refer back to the research questions and prior works. This is because, regardless of whether the results back up or contradict previous research, including previous works shows a wider level of reading and understanding of the topic being researched and gives a greater depth to your own work.

Summary of results

The summary of the results section of a secondary data dissertation should deliver a summing up of key findings, and if appropriate a conceptual framework that clearly illustrates the findings of the work. This shows that you have understood your secondary data, how it has answered your research questions, and furthermore that your interpretation has led to some firm outcomes.

Qualitative Secondary Analysis: A Case Exemplar

Judith Ann Tate

The Ohio State University, College of Nursing

Mary Beth Happ

Qualitative secondary analysis (QSA) is the use of qualitative data collected by someone else or to answer a different research question. Secondary analysis of qualitative data provides an opportunity to maximize data utility, particularly with difficult-to-reach patient populations. However, QSA methods require careful consideration and explicit description to best understand, contextualize, and evaluate the research results. In this paper, we describe methodologic considerations using a case exemplar to illustrate challenges specific to QSA and strategies to overcome them.

Health care research requires significant time and resources. Secondary analysis of existing data provides an efficient alternative to collecting data from new groups or the same subjects. Secondary analysis, defined as the reuse of existing data to investigate a different research question (Heaton, 2004), has a similar purpose whether the data are quantitative or qualitative. Common goals include to (1) perform additional analyses on the original dataset, (2) analyze a subset of the original data, (3) apply a new perspective or focus to the original data, or (4) validate or expand findings from the original analysis (Hinds, Vogel, & Clarke-Steffen, 1997). Synthesis of knowledge from meta-analysis or aggregation may be viewed as an additional purpose of secondary analysis (Heaton, 2004).

Qualitative studies utilize several different data sources, such as interviews, observations, field notes, archival meeting minutes or clinical record notes, to produce rich descriptions of human experiences within a social context. The work typically requires significant resources (e.g., personnel effort/time) for data collection and analysis. When feasible, qualitative secondary analysis (QSA) can be a useful and cost-effective alternative to designing and conducting redundant primary studies. With advances in computerized data storage and analysis programs, sharing qualitative datasets has become easier. However, little guidance is available for conducting, structuring procedures, or evaluating QSA (Szabo & Strang, 1997).

QSA has been described as “an almost invisible enterprise in social research” (Fielding, 2004). Primary data is often re-used; however, descriptions of this practice are embedded within the methods section of qualitative research reports rather than explicitly identified as QSA. Moreover, searching or classifying reports as QSA is difficult because many researchers refrain from identifying their work as secondary analyses (Hinds et al., 1997; Thorne, 1998a). In this paper, we provide an overview of QSA, the purposes, and modes of data sharing and approaches. A unique, expanded QSA approach is presented as a methodological exemplar to illustrate considerations.

QSA Typology

Heaton (2004) classified QSA studies based on the relationship between the secondary and primary questions and the scope of data analyzed. Types of QSA included studies that (1) investigated questions different from the primary study, (2) applied a unique theoretical perspective, or (3) extended the primary work. Heaton’s literature review (2004) showed that studies varied in the choice of data used, from selected portions to entire or combined datasets.

Modes of Data Sharing

Heaton (2004) identified three modes of data sharing: formal, informal and auto-data. Formal data sharing involves accessing and analyzing deposited or archived qualitative data by an independent group of researchers. Historical research often uses formal data sharing. Informal data sharing refers to requests for direct access to an investigator’s data for use alone or to pool with other data, usually as a result of informal networking. In some instances, the primary researchers may be invited to collaborate. The most common mode of data sharing is auto-data, defined as further exploration of a qualitative data set by the primary research team. Due to the iterative nature of qualitative research, when using auto-data, it may be difficult to determine where the original study questions end and discrete, distinct analysis begins (Heaton, 1998).

An Exemplar QSA

Below we describe a QSA exemplar conducted by the primary author of this paper (JT), a member of the original research team, who used a supplementary approach to examine concepts revealed but not fully investigated in the primary study. First, we describe an overview of the original study on which the QSA was based. Then, the exemplar QSA is presented to illustrate: (1) the use of auto-data when the new research questions are closely related to or extend the original study aims ( Table 1 ), (2) the collection of additional clinical record data to supplement the original dataset and (3) the performance of separate member checking in the form of expert review and opinion. Considerations and recommendations for use of QSA are reviewed with illustrations taken from the exemplar study ( Table 2 ). Finally, discussion of conclusions and implications is included to assist with planning and implementation of QSA studies.

Table 1. Research question comparison

Table 2. Application of the Exemplar Qualitative Secondary Analysis (QSA)

The Primary Study

Briefly, the original study was a micro-level ethnography designed to describe the processes of care and communication with patients weaning from prolonged mechanical ventilation (PMV) in a 28-bed Medical Intensive Care Unit ( Broyles, Colbert, Tate, & Happ, 2008 ; Happ, Swigart, Tate, Arnold, Sereika, & Hoffman, 2007 ; Happ et al., 2007 , 2010 ). Both the primary study and the QSA were approved by the Institutional Review Board at the University of Pittsburgh. Data were collected by two experienced investigators and a PhD student-research project coordinator. Data sources consisted of sustained field observations, interviews with patients, family members and clinicians, and clinical record review, including all narrative clinical documentation recorded by direct caregivers.

During iterative data collection and analysis in the original study, it became apparent that anxiety and agitation had an effect on the duration of ventilator weaning episodes, an observation that helped to formulate the questions for the QSA ( Tate, Dabbs, Hoffman, Milbrandt & Happ, 2012 ). Thus, the secondary topic was closely aligned as an important facet of the primary phenomenon. The close, natural relationship between the primary and QSA research questions is demonstrated in the side-by-side comparison in Table 1 . This QSA focused on new questions which extended the original study to recognition and management of anxiety or agitation, behaviors that often accompany mechanical ventilation and weaning but occur throughout the trajectory of critical illness and recovery.

Considerations when Undertaking QSA ( Table 2 )

Practical advantages.

A key practical advantage of QSA is maximizing use of existing data. Data collection efforts represent a significant percentage of the research budget in terms of cost and labor ( Coyer & Gallo, 2005 ). This is particularly important in view of the competition for research funding. Planning and implementing a qualitative study involves considerable time and expertise not only for data collecting (e.g., interviews, participant observation or focus group), but in establishing access, credibility and relationships ( Thorne, 1994 ) and in conducting the analysis. The cost of QSA is often seen as negligible since the outlay of resources for data collection is assumed by the original study. However, QSA incurs costs related to storage, researcher’s effort for review of existing data, analysis, and any further data collection that may be necessary.

Another advantage of QSA is access to data from an assembled cohort. In conducting original primary research, practical concerns arise when participants are difficult to locate or reluctant to divulge sensitive details to a researcher. In the case of vulnerable critically ill patients, participation in research may seem an unnecessary burden to family members who may be unwilling to provide proxy consent ( Fielding, 2004 ). QSA permits new questions to be asked of data collected previously from these vulnerable groups ( Rew, Koniak-Griffin, Lewis, Miles, & O'Sullivan, 2000 ), or from groups or events that occur with scarcity ( Thorne, 1994 ). Participants’ time and effort in the primary study therefore becomes more worthwhile. In fact, it is recommended that data already collected from existing studies of vulnerable populations or about sensitive topics be analyzed prior to engaging new participants. In this way, QSA becomes a cumulative rather than a repetitive process ( Fielding, 2004 ).

Data Adequacy and Congruency

Secondary researchers must determine that the primary data set meets the needs of the QSA. Data may be insufficient to answer a new question or the focus of the QSA may be so different as to render the pursuit of a QSA impossible ( Heaton, 1998 ). The underlying assumptions, sampling plan, research questions, and conceptual framework selected to answer the original study question may not fit the question posed during QSA ( Coyer & Gallo, 2005 ). The researchers of the primary study may have selectively sampled participants and analyzed the resulting data in a manner that produced a narrow or uneven scope of data ( Hinds et al., 1997 ). Thus, the data needed to fully answer questions posed by the QSA may be inadequately addressed in the primary study. A critical review of the existing dataset is an important first step in determining whether the primary data fits the secondary questions ( Hinds et al., 1997 ).

Passage of Time

The timing of the QSA is another important consideration. If the primary study and secondary study are performed sequentially, findings of the original study may influence the secondary study. On the other hand, studies performed concurrently offer the benefit of access to both the primary research team and the participants for member checking ( Hinds et al., 1997 ).

The passage of time since the primary study was conducted can also have a distinct effect on the usefulness of the primary dataset. Data may be outdated or contain a historical bias ( Coyer & Gallo, 2005 ). Since context changes over time, characteristics of the phenomena of interest may have changed. Analysis of older datasets may not illuminate the phenomena as they exist today ( Hinds et al., 1997 ). Even if participants could be re-contacted, their perspectives, memories and experiences change. The passage of time also has an effect on the relationship of the primary researchers to the data, so auto-data may be interpreted differently by the same researcher with the passage of time. Data are bound by time and history and therefore may pose a threat to internal validity unless a new investigator is able to account for these effects when interpreting data ( Rew et al., 2000 ).

Researcher stance/Context involvement

Issues related to context are a major source of criticism of QSA ( Gladstone, Volpe, & Boydell, 2007 ). One of the hallmarks of qualitative research is the relationship of the researcher to the participants. It can be argued that removing active contact with participants violates this premise. Tacit understandings developed in the field may be difficult or impossible to reconstruct ( Thorne, 1994 ). Qualitative fieldworkers often react and redirect the data collection based on a growing knowledge of the setting. The setting may change as a result of external or internal factors. Interpretation of researchers as participants in a unique time and social context may be impossible to re-construct even if the secondary researchers were members of the primary team ( Mauthner, Parry, & Milburn, 1998 ). Because the context in which the data were originally produced cannot be recovered, the ability of the researcher to react to the lived experience may be curtailed in QSA ( Gladstone et al., 2007 ). Researchers utilize a number of tactics to filter and prioritize what to include as data that may not be apparent in either the written or spoken records of those events ( Thorne, 1994 ). Reflexivity between the researcher, participants and setting is impossible to recreate when examining pre-existing data.

Relationship of QSA Researcher to Primary Study

The relationship of the QSA researcher to the primary study is an important consideration. When the QSA researcher is not part of the original study team, contractual arrangements detailing access to data, its format, access to the original team, and authorship are required ( Hinds et al., 1997 ). The QSA researcher should assess the condition of the data, documents including transcripts, memos and notes, and clarity and flow of interactions ( Hinds et al., 1997 ). An outline of the original study and data collection procedures should be critically reviewed ( Heaton, 1998 ). If the secondary researcher was not a member of the original study team, access to the original investigative team for the purpose of ongoing clarification is essential ( Hinds et al., 1997 ).

Membership on the original study team may, however, offer the secondary researcher little advantage depending on their role in the primary study. Some research team members may have had responsibility for only one type of data collection or data source. There may be differences in involvement with analysis of the primary data.

Informed Consent of Participants

Thorne (1998) questioned whether data collected for one study purpose can ethically be re-examined to answer another question without participants’ consent. Many institutional review boards permit consent forms to include language about the possibility of future use of existing data. While this mechanism is becoming routine and welcomed by researchers, concerns have been raised that a generic consent cannot possibly address all future secondary questions and may violate the principle of full informed consent ( Gladstone et al., 2007 ). Local variations in study approval practices by institutional review boards may influence the ability of researchers to conduct a QSA.

Rigor of QSA

The primary standards for evaluating rigor of qualitative studies are trustworthiness (logical relationship between the data and the analytic claims), fit (the context within which the findings are applicable), transferability (the overall generalizability of the claims) and auditability (the transparency of the procedural steps and the analytic processes) ( Lincoln & Guba, 1991 ). Thorne suggests that standard procedures for assuring rigor can be modified for QSA ( Thorne, 1994 ). For instance, the original researchers may be viewed as sources of confirmation while new informants, other related datasets and validation by clinical experts are sources of triangulation that may overcome the lack of access to primary subjects ( Heaton, 2004 ; Thorne, 1994 ).

Our observations, derived from the experience of posing a new question of existing qualitative data, serve as a template for researchers considering QSA. Considerations regarding quality, availability and appropriateness of existing data are of primary importance. A realistic plan for collecting additional data to answer questions posed in QSA should consider burden and resources for data collection, analysis, storage and maintenance. Researchers should consider context as a potential limitation to new analyses. Finally, the cost of QSA should be fully evaluated prior to making a decision to pursue QSA.

Acknowledgments

This work was funded by the National Institute of Nursing Research (RO1-NR07973, M Happ PI) and a Clinical Practice Grant from the American Association of Critical Care Nurses (JA Tate, PI).

Disclosure statement: Drs. Tate and Happ have no potential conflicts of interest to disclose that relate to the content of this manuscript and do not anticipate conflicts in the foreseeable future.

Contributor Information

Judith Ann Tate, The Ohio State University, College of Nursing.

Mary Beth Happ, The Ohio State University, College of Nursing.

  • Broyles L, Colbert A, Tate J, Happ MB. Clinicians’ evaluation and management of mental health, substance abuse, and chronic pain conditions in the intensive care unit. Critical Care Medicine. 2008;36(1):87–93.
  • Coyer SM, Gallo AM. Secondary analysis of data. Journal of Pediatric Health Care. 2005;19(1):60–63.
  • Fielding N. Getting the most from archived qualitative data: Epistemological, practical and professional obstacles. International Journal of Social Research Methodology. 2004;7(1):97–104.
  • Gladstone BM, Volpe T, Boydell KM. Issues encountered in a qualitative secondary analysis of help-seeking in the prodrome to psychosis. Journal of Behavioral Health Services & Research. 2007;34(4):431–442.
  • Happ MB, Swigart VA, Tate JA, Arnold RM, Sereika SM, Hoffman LA. Family presence and surveillance during weaning from prolonged mechanical ventilation. Heart & Lung: The Journal of Acute and Critical Care. 2007;36(1):47–57.
  • Happ MB, Swigart VA, Tate JA, Hoffman LA, Arnold RM. Patient involvement in health-related decisions during prolonged critical illness. Research in Nursing & Health. 2007;30(4):361–72.
  • Happ MB, Tate JA, Swigart V, DiVirgilio-Thomas D, Hoffman LA. Wash and wean: Bathing patients undergoing weaning trials during prolonged mechanical ventilation. Heart & Lung: The Journal of Acute and Critical Care. 2010;39(6 Suppl):S47–56.
  • Heaton J. Secondary analysis of qualitative data. Social Research Update. 1998;(22).
  • Heaton J. Reworking Qualitative Data. London: SAGE Publications; 2004.
  • Hinds PS, Vogel RJ, Clarke-Steffen L. The possibilities and pitfalls of doing a secondary analysis of a qualitative data set. Qualitative Health Research. 1997;7(3):408–424.
  • Lincoln YS, Guba EG. Naturalistic inquiry. Beverly Hills, CA: Sage Publishing; 1991.
  • Mauthner N, Parry O, Milburn K. The data are out there, or are they? Implications for archiving and revisiting qualitative data. Sociology. 1998;32:733–745.
  • Rew L, Koniak-Griffin D, Lewis MA, Miles M, O'Sullivan A. Secondary data analysis: new perspective for adolescent research. Nursing Outlook. 2000;48(5):223–229.
  • Szabo V, Strang VR. Secondary analysis of qualitative data. Advances in Nursing Science. 1997;20(2):66–74.
  • Tate JA, Dabbs AD, Hoffman LA, Milbrandt E, Happ MB. Anxiety and agitation in mechanically ventilated patients. Qualitative Health Research. 2012;22(2):157–173.
  • Thorne S. Secondary analysis in qualitative research: Issues and implications. In: Morse JM, editor. Critical Issues in Qualitative Research. 2nd ed. Thousand Oaks, CA: SAGE; 1994.
  • Thorne S. Ethical and representational issues in qualitative secondary analysis. Qualitative Health Research. 1998;8(4):547–555.

Secondary Data: Advantages, Disadvantages, Sources, Types

If you know the advantages and disadvantages of secondary data, you can make informed decisions and create future-oriented strategies.

Wherever you work – in business, marketing, research, or statistics, secondary data sources can help you optimize your current and future results.

Let’s see how

On this page:

  • What is secondary data? Definition, meaning, importance
  • Secondary data advantages and disadvantages (comparison chart)
  • Examples, types, and sources of secondary data.
  • Infographics in PDF

What is secondary data? Definition and meaning.

Secondary data is data that has already been collected for another purpose but has some relevance to your current research needs.

In other words, it has already been collected in the past by someone else, not you. And now, you can use the data.

Secondary data is second-hand information. It is not used for the first time. That is why it is called secondary.

Typically, secondary data is found in resources like the Internet, libraries, or reports.

Web information, business reports, mass media products, encyclopedias, and government statistics are among the most popular examples of secondary data.

Advantages And Disadvantages Of Secondary Data

Advantages of Secondary Data:

  • Ease of access: Secondary data sources are easy to access. The Internet has changed the way secondary research works, and a great deal of information is now only a few clicks away.
  • Low cost or free: Most secondary sources are free to use or available at very low cost. This saves not only money but also effort. Unlike primary research, where you have to design and conduct a whole study from the beginning, secondary research lets you gather data with little or no spending. (See more in our post: primary vs secondary data.)
  • Time-saving: Because the data already exists, secondary research can be done quickly. Sometimes a few Google searches are enough to find a source of data.
  • New insights from previous analyses: Reanalyzing old data can bring unexpected new understandings and points of view, or even new relevant conclusions.
  • Longitudinal analysis: Secondary data allows you to perform longitudinal analysis, meaning studies that span a long period of time. This can help you identify trends. You can find secondary data from many years back up to a couple of hours ago, which lets you compare data over time.
  • Anyone can collect the data: Secondary data research can be performed by people who aren’t familiar with the different data collection methods. Practically anyone can collect it.
  • A huge amount of data from a wide variety of sources: Secondary data is the richest type of data available to you, covering a wide variety of sources and topics.

Disadvantages:

  • May not be specific to your needs: Secondary data was collected in the past for another purpose, so it may not match your needs and may be unreliable for your current question. Secondary data sources can give you a huge amount of information, but quantity does not always mean appropriateness.
  • You have no control over data quality: Secondary data might lack quality. The source of the information may be questionable, especially when you gather the data via the Internet. If you are relying on secondary data for your data-driven decision-making, you must evaluate the reliability of the information by finding out how it was collected and analyzed.
  • Bias: Because secondary data is collected by someone other than you, it may be biased toward the aims of whoever gathered it. It might not cover your requirements as a researcher or marketer.
  • Not timely: Secondary data was collected in the past, which means it might be out of date. This can be crucial in many situations.
  • You are not the owner of the information: Generally, secondary data is not collected specifically for your company. Instead, it is available to many companies and people, either for free or for a small fee. So it is not exactly a “competitive advantage” for you: your current and potential competitors also have access to the data.

Types Of Secondary Data

There are two types of secondary data, based on the data source:

  • Internal sources of data: information gathered within the researcher’s company or organization (for example, a database with customer details, sales reports, marketing analysis, your emails, your social media profiles, etc.).
  • External sources of data: data collected outside the organization (for example, government statistics, mass media channels, newspapers, etc.).

Secondary data can also be of two types, depending on the research strand:

  • Quantitative data: data that can be expressed as a number or can be quantified. Examples include the weight and height of a person, the number of working hours, and the volume of sales per month. Quantitative data are easily amenable to statistical manipulation.
  • Qualitative data: information that can’t be expressed as a number and can’t be measured. Qualitative data consist of words, pictures, observations, and symbols, not numbers. It is about qualities. Examples include eye color (brown, blue, green), socioeconomic status, and customer satisfaction.

Examples And Sources Of Secondary Data

Internal Sources Of Secondary Data

You might have loads of data in your company or organization that you aren’t using.

All types of organizations, whether they are businesses or non-profits, collect information during their everyday processes. Orders are fulfilled, costs and sales are recorded, customer inquiries about products are submitted, reports are presented, and so on.

Much of this information is of great use in your research. It can have hidden and unexpected value if you are able to incorporate it into your dashboards, allowing data analysts with advanced BI training to spot new relationships.

Here is a list of some common and hidden sources of internal information:

1. Sales data

Sales are essential to a company’s profitability.

Examples of sales data are revenue, profitability, price, distribution channels, buyer personas, etc. This information can show you areas of strength and weakness, which will drive your future decisions.

2. Finance data

Collecting and analyzing your financial data is a way to maximize profits. Examples of financial data are overheads and production costs, cash flow reports, amounts spent to manufacture products, etc.

3. General marketing data

Marketing departments are a gold mine when it comes to secondary data sources.

Examples of marketing data are reports on customer profiles,  market segmentation , level of customer satisfaction, level of brand awareness, customer engagement through content marketing, customer retention and loyalty, etc.

4. Human resource data

Human resource departments have information about the costs to recruit and train an employee, staff retention rate and churn, the productivity of an individual employee, etc.

Human resource data can help you uncover the areas where a company needs to improve its HR processes to empower staff skills, talent, and achievements.

5. Customer relationship management system (CRM software)

Businesses can also collect and analyze data within their own CRM system.

This system is a great source of secondary data such as clients’ company affiliations and regional or geographical details for customers.

6. Emails

The average office employee sends dozens of business emails per day and receives even more.

Emails, as sources of secondary data, provide important information such as product reviews, opinions, feedback, and so on.

7. Your social media profiles

Social media profiles on networks like Facebook, Twitter, and LinkedIn are a great source of information that you can analyze to learn more about, for example, how people are talking about your business and how users share and engage with your content.

Some examples of secondary data that you can collect from social profiles include: likes, shares, mentions, impressions, new followers, comments, URL clicks.

8. Your website analytics

There’s a huge amount of valuable secondary data accessible to you through your website analytics platform.

The most popular platforms for insights into your website statistics are Google Analytics and Google Search Console.

Examples of data that you can gather from your website include: visitors’ locations, patterns of visitor behavior, keywords used by visitors to find your site and business, visitors’ activities on the site, most popular content, etc.

External Sources Of Secondary Data

External data are any data generated outside the boundaries of the company or organization.

There are many advantages of using external sources of secondary data, especially online ones. They offer endless information which you can acquire efficiently and quickly.

Today, external secondary data is a foundation for executive decision-making, whether in business, medicine, science, or statistics.

Here are some key examples of external secondary data.

1. Data.gov

Data.gov provides over 150,000 free datasets from federal, state, and local governments, all accessible online.

Here, companies or students can find a ton of data, including information related to consumers, education, manufacturing, public safety, and much more.

2. World Bank Open Data

World Bank Open Data offers free and open access to global development data. Datasets provide population demographics and a vast number of economic indicators from across the world.

3. IMF Economic Data

The International Monetary Fund  (IMF) is an organization of 189 countries.

It provides data such as international financial statistics, regional economic reports, foreign exchange rates, debt rates, commodity prices, and investments.

4. Crayon Intel Free 

Crayon Intel Free  is one of the best  free competitor analysis tools  that can help you track, analyze, and act on many things that happen outside of your business.

5. Talkwalker’s Free Social Search

Talkwalker’s Free Social Search  is a real-time free social media search engine that can provide you with unlimited searches across all major social networks.

It allows you to find out what the internet is saying about you or your competitors in seconds. You can know who’s talking about you with live audience insights.

6. Feedly

Feedly is a free news aggregator site that allows you to keep up with all the topics that matter to you, all in one place.

With Feedly, you can easily monitor news about your products, your competitors, important posts, content, Tweets, or even YouTube videos.

7. Mailcharts 

Mailcharts  is a quite powerful tool for email marketers as well as for those who want to spy on the competition.

It collects emails from competing campaigns to help you develop your own. Mailcharts has an enormous library of emails from countless brands.

8. Glassdoor

Glassdoor  is one of the world’s largest and most popular job and recruiting sites. It provides a free database with millions of company reviews, CEO approval ratings, interview reviews and questions, salary reports, benefits reviews, office photos, and more.

9. Google Alerts

Google Alerts  is one of the most popular free alert services that allows you to follow mentions on the internet about practically anything you want – company, brand, customers, purchasing patterns, and so on.

10. HubSpot Marketing Statistics

HubSpot  offers a large and very valuable free repository of marketing data.

You can find the latest marketing statistics and trends in areas such as Organic Search, Conversion Rate Optimization (CRO), Ecommerce, Local SEO, Mobile Search, and others.

11. Crunchbase 

Crunchbase  is one of the best and most innovative platforms for finding business information about private and public companies.

Crunchbase data include investments and funding information, news and industry trends, individuals in leadership positions, mergers, and more.

For many businesses, the sources of secondary data are a key way to gather information about their customers in order to better understand and serve them.

We are living in the big data age. Knowing the advantages and disadvantages of secondary data can ensure better decision making for all management levels and types.

It is a good basis for creating new opportunities, running data-driven marketing , and improving your results and performance.

About The Author

Silvia Valcheva

Silvia Valcheva is a digital marketer with over a decade of experience creating content for the tech industry. She has a strong passion for writing about emerging software and technologies such as big data, AI (Artificial Intelligence), IoT (Internet of Things), process automation, etc.

Comparative effectiveness research methodology using secondary data: A starting user's guide

Affiliations.

  • 1 Division of Urological Surgery and Center for Surgery and Public Health, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; The Lank Center for Genitourinary Oncology, Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA.
  • 2 Division of Urological Surgery and Center for Surgery and Public Health, Brigham and Women's Hospital, Harvard Medical School, Boston, MA.
  • PMID: 29146037
  • DOI: 10.1016/j.urolonc.2017.10.011

Background: The use of secondary data, such as claims or administrative data, in comparative effectiveness research has grown tremendously in recent years.

Purpose: We believe that the current review can help investigators relying on secondary data to (1) gain insight into both the methodologies and statistical methods, (2) better understand the necessity of rigorous planning before initiating a comparative effectiveness investigation, and (3) optimize the quality of their investigations.

Main findings: Specifically, we review concepts of adjusted analyses and confounders, methods of propensity score analyses, and instrumental variable analyses, risk prediction models (logistic and time-to-event), decision-curve analysis, as well as the interpretation of the P value and hypothesis testing.

Conclusions: Overall, we hope that the current review article can help research investigators relying on secondary data to perform comparative effectiveness research better understand the necessity of a rigorous planning before study start, and gain better insight in the choice of statistical methods so as to optimize the quality of the research study.

Keywords: Comparative effectiveness research; Oncology; Review; Secondary data; Urology.

Copyright © 2017 Elsevier Inc. All rights reserved.

Publication types

  • Comparative Effectiveness Research / methods
  • Comparative Effectiveness Research / standards*
  • Guidelines as Topic
  • Logistic Models
  • Medical Oncology / methods*
  • Medical Oncology / standards
  • Propensity Score
  • Research Design / standards*
  • Risk Assessment / methods
  • Urology / methods*
  • Urology / standards

Eisenberg Family Depression Center

Secondary data for mental health research: A Primer

Secondary data are a valuable resource for researchers, broadly speaking, and offer particular value in the field of mental health. Yet, they are often underutilized. Here, we offer a primer on what secondary data are and the value these sources offer, and we address some misconceptions about their utility.

Secondary data: What is it?

Within the Data & Design Core, our goal is to connect researchers to valuable secondary sources of mental health data, and increase the use of such data sources in depression and mental health research. But what do we mean by secondary data? Secondary data refers to data that is collected by someone other than the researcher, often on an ongoing basis, and often with the goal of encouraging broad use by multiple research teams to answer many different questions. It differs from primary data, which is collected directly by the research team to answer a more specific, narrow research question.

While there are several forms of secondary data, our team primarily works with large, population-based datasets that are nationally representative and/or longitudinal in nature (such as the Health and Retirement Study, the National Survey on Drug Use and Health, and additional examples on our website), as well as clinical, administrative data (such as Michigan Medicine’s DataDirect). We often use the terms “secondary data” and “existing data” interchangeably.

Why bother?

Secondary data may be overlooked by researchers who don’t have hands-on experience with them, because they often seem too complicated or time-consuming. The truth is, utilizing secondary data drastically cuts down the costs and time involved with primary data collection. It is not unusual for primary data collection to take five years or longer, including time spent securing funding; secondary data projects can often be completed within one to two years or less, depending on scope. In particular, secondary data are especially valuable because:

  • The work of collecting data has already been done! This eliminates several years of work and significant costs from a project’s timeline and budget.
  • Many data sources are easily accessible and downloadable online at no cost.
  • There is a huge breadth and depth of secondary data available for mental health research on a range of diverse topics, including other mental health co-morbidities, physical health co-morbidities, social determinants of health, disease prevention, and health across the lifespan, among others. Explore available data by topic using the filters on our website.
  • Most secondary data sources have very robust sample sizes, into the tens or hundreds of thousands of participants and more.
  • As many secondary data sources are nationally representative and/or longitudinal in nature, they allow the researcher to gain insight into national and/or longitudinal trends, which is often not possible with primary data collection.
  • Working with secondary data does not typically require full IRB review or newly-required data sharing plans, reducing start-up time.
  • Due to the reduced cost and time burdens, secondary data offer a lower-risk way to test preliminary hypotheses and identify areas of need for additional research.
  • Secondary data are particularly valuable for trainees and early-career faculty, who often face many obstacles in getting research work completed, including limited funding, protected time, bandwidth, research staff, collaborators and others.

Dispelling misconceptions

Despite the value that secondary data offers to researchers, it is underutilized, especially in the mental health field, and that may be due in part to some commonly held misconceptions. These might include:

  • Secondary data is too complicated to figure out: Oftentimes, researchers who have not used secondary data before may feel overwhelmed or intimidated by the prospect. While some data sources vary in their readiness and ease of use, there are many high-quality sources of data that have excellent documentation and are very user-friendly. If you have questions about which data sources to use, or the data cleaning process, we encourage you to contact our team.
  • Secondary data is easy: On the other hand, some may consider using secondary data to be taking the “easy way out” or not “real” research; while secondary data does certainly reduce many of the barriers and challenges related to original data collection, it still requires skill and knowledge to utilize.
  • You can’t build an academic career without collecting original data: Most researchers would say that you must collect original data in order to have a successful academic career, due to the need to get funding and publish. Some may think that they won’t be able to get funding, publish, or find collaborators when working with secondary data. This is a misconception! There are many funding opportunities available through federal agencies to support secondary data analysis, opening the door for opportunities to find collaborators and publish extensively. There are many examples of prolific researchers who have made significant and innovative advancements using secondary data in the field of mental health.
  • Secondary data isn’t precise enough: Some may hesitate to use secondary data because they don’t think the dataset will have exactly the variables that they are looking for, or because they think secondary data is just for “fishing expeditions”. It is true that secondary data analysis limits you to the data that are available, so at times it may require some creativity and flexibility. It can also illuminate the need for additional primary data collection. While there is potential to use secondary data for fishing expeditions, our team avoids this by publicly pre-registering research questions and analysis plans, and we recommend that others do the same.

We hope this article provided a helpful overview for working with secondary data. If you have questions or need help getting started, please contact our team at [email protected] .

This article was inspired by multiple sources, including this article and this article , and a presentation given by Amy Byers, PhD

Office of the Vice President for Research

Four CLAS faculty researchers secure prestigious early career awards

Continuing an upward trend of University of Iowa faculty securing prestigious early-career grants, four investigators from the Departments of Physics and Astronomy and Computer Science have received notable awards to advance their careers.

DeRoo, Hoadley advance space instrumentation with Nancy Grace Roman Technology Fellowships in Astrophysics for Early Career Researchers

Casey DeRoo and Keri Hoadley , both assistant professors in the Department of Physics and Astronomy, each received a Nancy Grace Roman Technology Fellowship in Astrophysics for Early Career Researchers. The NASA fellowship provides each researcher with $500,000 over two years to support their research in space-based instrumentation. 

Keri Hoadley

Hoadley’s research is two-pronged. She will design and ultimately prototype a mirror-based vacuum ultraviolet polarizer, which will allow researchers to access polarized light from space at wavelengths below 120 nanometers. Polarizing light at such short wavelengths is crucial to building optics for NASA’s future Habitable Worlds Observatory (HWO), the agency’s next flagship astrophysics mission after the Nancy Grace Roman Space Telescope.

“Our vacuum ultraviolet polarizer project is meant to help set up our lab to propose to NASA for one or more follow-up technology programs, including adapting this polarizer for use in vacuum systems, duplicating it and measuring its efficiency to measure additional flavors of polarized UV light, quantifying the polarization effects introduced by UV optical components that may be used on HWO, and building an astronomical instrument to measure the polarization of UV from around massive stars and throughout star-forming regions,” said Hoadley.

In addition, Hoadley and her team will build a facility to align, calibrate, and integrate small space telescopes before flight, using a vacuum chamber and wavelengths of light typically only accessible in space, which could help the university win future small satellite and suborbital missions from NASA. 

Casey DeRoo

DeRoo will work to advance diffraction gratings made with electron beams that pattern structures on a nanometer scale. Like a prism, diffraction gratings spread out and direct light coming from stars and galaxies, allowing researchers to deduce things like the temperature, density, or composition of an astronomical object.

The fellowship will allow DeRoo to upgrade the university’s Raith Voyager tool, a specialized fabrication tool hosted by OVPR’s Materials Analysis, Testing and Fabrication (MATFab) facility.

“These upgrades will let us perform algorithmic patterning, which uses computer code to quickly generate the patterns to be manufactured,” DeRoo said. “This is a major innovation that should enable us to make more complex grating shapes as well as make gratings more quickly.” DeRoo added that the enhancements mean his team may be able to make diffraction gratings that allow space instrument designs that are distinctly different from those launched to date.

“For faculty who develop space-based instruments, the Nancy Grace Roman Technology Fellowship is on par with the prestige of an NSF CAREER or Department of Energy Early Career award,” said Mary Hall Reno, professor and department chair. “Our track record with the program elevates our status as a destination university for astrophysics and space physics missions.”

Uppu pursues building blocks of quantum computing with NSF CAREER Award

Ravitej Uppu

Ravitej Uppu, assistant professor in the Department of Physics and Astronomy, received a 5-year NSF CAREER award of $550,000 to conduct research aimed at amplifying the power of quantum computing and making its application more practical. 

Uppu and his team will explore the properties of light-matter interactions at the level of a single photon interacting with a single molecule, enabling them to generate efficient and high-quality multiphoton entangled states of light. Multiphoton entangled states, in which photons become inextricably linked, are necessary for photons to serve as practical quantum interconnects, transmitting information between quantum computing units, akin to classical cluster computers. 

“In our pursuit of secure communication, exploiting quantum properties of light is the final frontier,” said Uppu. “However, unavoidable losses that occur in optical fiber links between users can easily nullify the secure link. Our research on multiphoton entangled states is a key building block for implementing ‘quantum repeaters’ that can overcome this challenge.”

Jiang tackles real-world data issues with NSF CAREER Award

Peng Jiang

Peng Jiang, assistant professor in the Department of Computer Science, received an NSF CAREER Award that will provide $548,944 over five years to develop tools to support the use of sampling-based algorithms. 

Sampling-based algorithms reduce computing costs by processing only a random selection of a dataset, which has made them increasingly popular, but the method still faces limited efficiency. Jiang will develop a suite of tools that simplify the implementation of sampling-based algorithms and improve their efficacy across a wide range of computing and big data applications.

“A simple example of a real-world application is subgraph matching,” Jiang said. “For example, one might be interested in finding a group of people with certain connections in a social network. The use of sampling-based algorithms can significantly accelerate this process.”

In addition to providing undergraduate students the opportunity to engage with this research, Jiang also plans for the project to enhance projects in computer science courses.
