• Skip to main content
  • Skip to primary sidebar
  • Skip to footer
  • QuestionPro

survey software icon

  • Solutions Industries Gaming Automotive Sports and events Education Government Travel & Hospitality Financial Services Healthcare Cannabis Technology Use Case NPS+ Communities Audience Contactless surveys Mobile LivePolls Member Experience GDPR Positive People Science 360 Feedback Surveys
  • Resources Blog eBooks Survey Templates Case Studies Training Help center

sample size for a research project

Home Audience

Sample Size Determination: Definition, Formula, and Example

sample size for a research project

Are you ready to survey your research target? Research surveys help you gain insights from your target audience. The data you collect gives you insights to meet customer needs, leading to increased sales and customer loyalty. Sample size calculation and determination are imperative to the researcher to determine the right number of respondents, keeping in mind the research study’s quality.

So, how should you do the sample size determination? How do you know who should get your survey? How do you decide on the number of the target audience?

Sending out too many surveys can be expensive without giving you a definitive advantage over a smaller sample. But if you send out too few, you won’t have enough data to draw accurate conclusions. 

Knowing how to calculate and determine the appropriate sample size accurately can give you an edge over your competitors. Let’s take a look at what a good sample includes. Also, let’s look at the sample size calculation formula so you can determine the perfect sample size for your next survey.

What is Sample Size?

‘Sample size’ is a market research term used for defining the number of individuals included in conducting research. Researchers choose their sample based on demographics, such as age, gender questions , or physical location. It can be vague or specific. 

For example, you may want to know what people within the 18-25 age range think of your product. Or, you may only require your sample to live in the United States, giving you a wide population range. The total number of individuals in a particular sample is the sample size.

What is sample size determination?

Sample size determination is the process of choosing the right number of observations or people from a larger group to use in a sample. The goal of figuring out the sample size is to ensure that the sample is big enough to give statistically valid results and accurate estimates of population parameters but small enough to be manageable and cost-effective.

In many research studies, getting information from every member of the population of interest is not possible or useful. Instead, researchers choose a sample of people or events that is representative of the whole to study. How accurate and precise the results are can depend a lot on the size of the sample.

Choosing the statistically significant sample size depends on a number of things, such as the size of the population, how precise you want your estimates to be, how confident you want to be in the results, how different the population is likely to be, and how much money and time you have for the study. Statistics are often used to figure out how big a sample should be for a certain type of study and research question.

Figuring out the sample size is important in ensuring that research findings and conclusions are valid and reliable.

Why do you need to determine the sample size?

Let’s say you are a market researcher in the US and want to send out a survey or questionnaire . The survey aims to understand your audience’s feelings toward a new cell phone you are about to launch. You want to know what people in the US think about the new product to predict the phone’s success or failure before launch.

Hypothetically, you choose the population of New York, which is 8.49 million. You use a sample size determination formula to select a sample of 500 individuals that fit into the consumer panel requirement. You can use the responses to help you determine how your audience will react to the new product.

However, determining a sample size requires more than just throwing your survey at as many people as possible. If your estimated sample sizes are too big, it could waste resources, time, and money. A sample size that’s too small doesn’t allow you to gain maximum insights, leading to inconclusive results.

LEARN ABOUT: Survey Sample Sizes

What are the terms used around the sample size?

Before we jump into sample size determination, let’s take a look at the terms you should know:

terms_used_around_sample_size

1. Population size: 

Population size is how many people fit your demographic. For example, you want to get information on doctors residing in North America. Your population size is the total number of doctors in North America. 

Don’t worry! Your population size doesn’t always have to be that big. Smaller population sizes can still give you accurate results as long as you know who you’re trying to represent.

2. Confidence level: 

The confidence level tells you how sure you can be that your data is accurate. It is expressed as a percentage and aligned to the confidence interval. For example, if your confidence level is 90%, your results will most likely be 90% accurate.

3. The margin of error (confidence interval): 

There’s no way to be 100% accurate when it comes to surveys. Confidence intervals tell you how far off from the population means you’re willing to allow your data to fall. 

A margin of error describes how close you can reasonably expect a survey result to fall relative to the real population value. Remember, if you need help with this information, use our margin of error calculator .

4. Standard deviation: 

Standard deviation is the measure of the dispersion of a data set from its mean. It measures the absolute variability of a distribution. The higher the dispersion or variability, the greater the standard deviation and the greater the magnitude of the deviation. 

For example, you have already sent out your survey. How much variance do you expect in your responses? That variation in response is the standard deviation.

Sample size calculation formula – sample size determination

With all the necessary terms defined, it’s time to learn how to determine sample size using a sample calculation formula.

Your confidence level corresponds to a Z-score. This is a constant value needed for this equation. Here are the z-scores for the most common confidence levels:

90% – Z Score = 1.645

95% – Z Score = 1.96

99% – Z Score = 2.576

If you choose a different confidence level, various online tools can help you find your score.

Necessary Sample Size = (Z-score)2 * StdDev*(1-StdDev) / (margin of error)2

Here is an example of how the math works, assuming you chose a 90% confidence level, .6 standard deviation, and a margin of error (confidence interval) of +/- 4%.

((1.64)2 x .6(.6)) / (.04)2

( 2.68x .0.36) / .0016

.9648 / .0016

603 respondents are needed, and that becomes your sample size.

Free Sample Size Calculator

How is a sample size determined?

Determining the right sample size for your survey is one of the most common questions researchers ask when they begin a market research study. Luckily, sample size determination isn’t as hard to calculate as you might remember from an old high school statistics class.

Before calculating your sample size, ensure you have these things in place:

Goals and objectives: 

What do you hope to do with the survey? Are you planning on projecting the results onto a whole demographic or population? Do you want to see what a specific group thinks? Are you trying to make a big decision or just setting a direction? 

Calculating sample size is critical if you’re projecting your survey results on a larger population. You’ll want to make sure that it’s balanced and reflects the community as a whole. The sample size isn’t as critical if you’re trying to get a feel for preferences. 

For example, you’re surveying homeowners across the US on the cost of cooling their homes in the summer. A homeowner in the South probably spends much more money cooling their home in the humid heat than someone in Denver, where the climate is dry and cool. 

For the most accurate results, you’ll need to get responses from people in all US areas and environments. If you only collect responses from one extreme, such as the warm South, your results will be skewed.

Precision level: 

How close do you want the survey results to mimic the true value if everyone responded? Again, if this survey determines how you’re going to spend millions of dollars, then your sample size determination should be exact. 

The more accurate you need to be, the larger the sample you want to have, and the more your sample will have to represent the overall population. If your population is small, say, 200 people, you may want to survey the entire population rather than cut it down with a sample.

Confidence level: 

Think of confidence from the perspective of risk. How much risk are you willing to take on? This is where your Confidence Interval numbers become important. How confident do you want to be — 98% confident, 95% confident? 

Understand that the confidence percentage you choose greatly impacts the number of completions you’ll need for accuracy. This can increase the survey’s length and how many responses you need, which means increased costs for your survey. 

Knowing the actual numbers and amounts behind percentages can help make more sense of your correct sample size needs vs. survey costs. 

For example, you want to be 99% confident. After using the sample size determination formula, you find you need to collect an additional 1000 respondents. 

This, in turn, means you’ll be paying for samples or keeping your survey running for an extra week or two. You have to determine if the increased accuracy is more important than the cost.

Population variability: 

What variability exists in your population? In other words, how similar or different is the population?

If you are surveying consumers on a broad topic, you may have lots of variations. You’ll need a larger sample size to get the most accurate picture of the population. 

However, if you’re surveying a population with similar characteristics, your variability will be less, and you can sample fewer people. More variability equals more samples, and less variability equals fewer samples. If you’re not sure, you can start with 50% variability.

Response rate: 

You want everyone to respond to your survey. Unfortunately, every survey comes with targeted respondents who either never open the study or drop out halfway. Your response rate will depend on your population’s engagement with your product, service organization, or brand. 

The higher the response rate, the higher your population’s engagement level. Your base sample size is the number of responses you must get for a successful survey.

Consider your audience: 

Besides the variability within your population, you need to ensure your sample doesn’t include people who won’t benefit from the results. One of the biggest mistakes you can make in sample size determination is forgetting to consider your actual audience. 

For example, you don’t want to send a survey asking about the quality of local apartment amenities to a group of homeowners.

Select your respondents

Focus on your survey’s objectives: 

You may start with general demographics and characteristics, but can you narrow those characteristics down even more? Narrowing down your audience makes getting a more accurate result from a small sample size easier. 

For example, you want to know how people will react to new automobile technology. Your current population includes anyone who owns a car in a particular market. 

However, you know your target audience is people who drive cars that are less than five years old. You can remove anyone with an older vehicle from your sample because they’re unlikely to purchase your product.

Once you know what you hope to gain from your survey and what variables exist within your population, you can decide how to calculate sample size. Using the formula for determining sample size is a great starting point to get accurate results. 

After calculating the sample size, you’ll want to find reliable customer survey software to help you accurately collect survey responses and turn them into analyzed reports.

LEARN MORE: Population vs Sample

In sample size determination, statistical analysis plan needs careful consideration of the level of significance, effect size, and sample size. 

Researchers must reconcile statistical significance with practical and ethical factors like practicality and cost. A well-designed study with a sufficient sample size can improve the odds of obtaining statistically significant results.

To meet the goal of your survey, you may have to try a few methods to increase the response rate, such as:

  • Increase the list of people who receive the survey.
  • To reach a wider audience, use multiple distribution channels, such as SMS, website, and email surveys.
  • Send reminders to survey participants to complete the survey.
  • Offer incentives for completing the survey, such as an entry into a prize drawing or a discount on the respondent’s next order.
  • Consider your survey structure and find ways to simplify your questions. The less work someone has to do to complete the survey, the more likely they will finish it. 
  • Longer surveys tend to have lower response rates due to the length of time it takes to complete the survey. In this case, you can reduce the number of questions in your survey to increase responses.  

QuestionPro’s sample size calculator makes it easy to find the right sample size for your research based on your desired level of confidence, your margin of error, and the size of the population.

FREE TRIAL         LEARN MORE

Frequently Asked Questions (FAQ)

The four ways to determine sample size are: 1. Power analysis 2. Convenience sampling, 3. Random sampling , 4. Stratified sampling

The three factors that determine sample size are: 1. Effect size, 2. Level of significance 3. Power

Using statistical techniques like power analysis, the minimal detectable effect size, or the sample size formula while taking into account the study’s goals and practical limitations is the best way to calculate the sample size.

The sample size is important because it affects how precise and accurate the results of a study are and how well researchers can spot real effects or relationships between variables.

The sample size is the number of observations or study participants chosen to be representative of a larger group

MORE LIKE THIS

NPS Survey Platform

NPS Survey Platform: Types, Tips, 11 Best Platforms & Tools

Apr 26, 2024

user journey vs user flow

User Journey vs User Flow: Differences and Similarities

gap analysis tools

Best 7 Gap Analysis Tools to Empower Your Business

Apr 25, 2024

employee survey tools

12 Best Employee Survey Tools for Organizational Excellence

Other categories.

  • Academic Research
  • Artificial Intelligence
  • Assessments
  • Brand Awareness
  • Case Studies
  • Communities
  • Consumer Insights
  • Customer effort score
  • Customer Engagement
  • Customer Experience
  • Customer Loyalty
  • Customer Research
  • Customer Satisfaction
  • Employee Benefits
  • Employee Engagement
  • Employee Retention
  • Friday Five
  • General Data Protection Regulation
  • Insights Hub
  • Life@QuestionPro
  • Market Research
  • Mobile diaries
  • Mobile Surveys
  • New Features
  • Online Communities
  • Question Types
  • Questionnaire
  • QuestionPro Products
  • Release Notes
  • Research Tools and Apps
  • Revenue at Risk
  • Survey Templates
  • Training Tips
  • Uncategorized
  • Video Learning Series
  • What’s Coming Up
  • Workforce Intelligence
  • A/B Monadic Test
  • A/B Pre-Roll Test
  • Key Driver Analysis
  • Multiple Implicit
  • Penalty Reward
  • Price Sensitivity
  • Segmentation
  • Single Implicit
  • Category Exploration
  • Competitive Landscape
  • Consumer Segmentation
  • Innovation & Renovation
  • Product Portfolio
  • Marketing Creatives
  • Advertising
  • Shelf Optimization
  • Performance Monitoring
  • Better Brand Health Tracking
  • Ad Tracking
  • Trend Tracking
  • Satisfaction Tracking
  • AI Insights
  • Case Studies

quantilope is the Consumer Intelligence Platform for all end-to-end research needs

How To Determine Sample Size for Quantitative Research

mrx glossary sample size

This blog post looks at how large a sample size should be for reliable, usable market research findings.

Table of Contents: 

What is sample size , why do you need to determine sample size , variables that impact sample size.

  • Determining sample size

The sample size of a quantitative study is the number of people who complete questionnaires in a research project. It is a representative sample of the target audience in which you are interested.

Back to Table of Contents

You need to determine how big of a sample size you need so that you can be sure the quantitative data you get from a survey is reflective of your target population as a whole - and so that the decisions you make based on the research have a firm foundation. Too big a sample and a project can be needlessly expensive and time-consuming. Too small a sample size, and you risk interviewing the wrong respondents - meaning ultimately you miss out on valuable insights.

There are a few variables to be aware of before working out the right sample size for your project.

Population size

The subject matter of your research will determine who your respondents are - chocolate eaters, dentists, homeowners, drivers, people who work in IT, etc. For your respective group of interest, the total of this target group (i.e. the number of chocolate eaters/homeowners/drivers that exist in the general population) will guide how many respondents you need to interview for reliable results in that field.

Ideally, you would use a random sample of people who fit within the group of people you’re interested in. Some of these people are easy to get hold of, while others aren‘t as easy. Some represent smaller groups of people in the population, so a small sample is inevitable. For example, if you’re interviewing chocolate eaters aged 5-99 you’ll have a larger sample size - and a much easier time sampling the population - than if you’re interviewing healthcare professionals who specialize in a niche branch of medicine.

Confidence interval (margin of error)

Confidence intervals, otherwise known as the margin of error, indicate the reliability of statistics that have been calculated by research; in other words, how certain you can be that the statistics are close to what they would be if it were possible to interview the entire population of the people you’re researching.

Confidence intervals are helpful since it would be impossible to interview all chocolate eaters in the US. However, statistics and research enable you to take a sample of that group and achieve results that reflect their opinions as a total population. Before starting a research project, you can decide how large a margin of error you will allow between the mean number of your sample and the mean number of its total population. The confidence interval is expressed as +/- a number, indicating the margin of error on either side of your statistic. For example, if 35% of chocolate eaters say that they eat chocolate for breakfast and your margin of error is 5, you’ll know that if you had asked the entire population, 30-40% of people would admit to eating chocolate at that time of day.

Confidence level

The confidence level indicates how probable it is that if you were to repeat your study multiple times with a random sample, you would get the same statistics and they would fall within the confidence interval every time.

In the example above, if you were to repeat the chocolate study over and over, you would have a certain level of confidence that those eating chocolate for breakfast would always fall within the 30-40% parameters. Most research studies have confidence intervals of 90% confident, 95% confident, or 99% confident. The number you choose will depend on whether you are happy to accept a broadly accurate set of data or whether the nature of your study demands one that is almost completely reliable.

Standard deviation

Standard deviation represents how much the results will vary from the mean number and from each other. A high standard deviation means that there is a wide range of responses to your research questions, while a low standard deviation indicates that responses are more similar to each other, clustered around the mean number. A standard deviation of 0.5 is a safe level to pick to ensure that the sample size is large enough.

Population variability 

If you already know anything about your target audience, you should have a feel for the degree to which their opinions vary. If you’re interviewing the entire population of a city, without any other criteria, their views are going to be wildly diverse so you’ll want to sample a high number of residents. If you’re honing in on a sample of chocolate breakfast eaters - there’s probably a limited number of reasons why that’s their meal of choice, so you can feel confident with a much smaller sample.

Project scope

The scope and objectives of the research will have an influence on how big the sample is. If the project aims to evaluate four different pieces of stimulus (an advert, a concept, a website, etc.) and each respondent is giving feedback on a single piece, then a higher number of respondents will need to be interviewed than if each respondent were evaluating all four; the same would be true when looking for reads on four different sub-audiences vs. not needing any sub-group data cuts.

Determining a good sample size for quantitative research

Sample size, as we’ve seen, is an important factor to consider in market research projects. Getting the sample size right will result in research findings you can use confidently when translating them into action. So now that you’ve thought about the subject of your research, the population that you’d like to interview, and how confident you want to be with the findings, how do you calculate the appropriate sample size?

There are many factors that can go into determining the sample size for a study, including z-scores, standard deviations, confidence levels, and margins of error. The great thing about quantilope is that your research consultants and data scientists are the experts in helping you land on the right target so you can focus on the actual study and the findings. 

To learn more about determining sample size for quantitative research, get in touch below: 

Get in touch to learn more about quantitative sample sizes!

Related posts, survey results: how to analyze data and report on findings, how florida's natural leveraged better brand health tracking, quirk's virtual event: fab-brew-lous collaboration: product innovations from the melitta group, brand value: what it is, how to build it, and how to measure it.

sample size for a research project

Survey Software & Market Research Solutions - Sawtooth Software

  • Technical Support
  • Technical Papers
  • Knowledge Base
  • Question Library

Call our friendly, no-pressure support team.

Understanding and Determining Sample Size for Survey Research

Pedestrians on a footpath. The people walking are blurred while the ones standing still are clear. Represents sample size determination.

Table of Contents

Sample Size Determination

Folks wanting to learn how to determine the right sample size for their research studies are badly underserved: nearly every article you can find on the internet tells, at best, just half the story. An inadequate sample size could lead to results that are far from the truth, costing your company millions in misguided investments. 

The most common advice you’ll find on the internet often leads straight to those inadequate sample sizes. There are different samples size calculations for different purposes – for means (single or multiple, independent or dependent), for proportions (single, paired, independent), for multivariate statistics (factor analysis, regression, logit, etc.) and for experiments (e.g., conjoint, MaxDiff). For brevity’s sake we’ll focus on sample size for a single proportions, leaving the reader to generalize for cases of two proportions, and for single, paired and independent means.

We’ll cover some rules of thumb about multivariate statistics and experiments. We’ll also differentiate between sample size for confidence intervals (the topic of almost every other article about sample size that you’ll find) and sample size for statistical testing (a topic that is almost uniformly neglected).

In this comprehensive guide, we'll dive deep into:

  • The definition of sample size and its significance in research
  • Factors influencing the determination of sample size
  • Step-by-step calculation methods for both needs, confidence intervals and hypotheses testing. 
  • Sample size advice for studies with complex analyses

Sample Size Definition

When we talk about sample size we just mean the number of respondents (people) that you include in your study . This number depends on whether you want to ensure that the results will (a) reflect the overall population's characteristics or (b) support managerially valuable hypothesis tests, or both.

Significance of Sample Size in Market Research?

Sample size is the currency with which you buy accuracy in survey research, both by generating quantifiable margins of error around any statistics we generate and by delivering credible hypothesis testing results.

A properly defined sample size balances cost-efficiency with statistical rigor . It gives your study credibility and it offers a clearer lens through which you can understand your research findings.

To Summarize:

  • Sample Size Definition : The number of observations or respondents in a study.
  • Significance of Sample Size in Market Research : It directly impacts the credibility and value of the research.

Need Sample for Your Research?

Let us connect you with your ideal audience! Reach out to us to request sample for your survey research.

Request Sample

Factors Influencing Sample Size Determination

How to determine the appropriate sample size depends on a few factors. Each requires careful consideration. Let's delve into these key factors.

Confidence versus power

This factor depends on whether you want your sample size scaled for precision (your margin of error or your confidence interval) or for power (i.e., for supporting hypothesis testing). Just for purposes of a sneak preview, the two formulas are slightly different (the formula for statistical power of a hypothesis test has one extra variable in it).

Population Size

Population sizes only matter in the rare case when your sample size will exceed 5% of the total population size. This happens so infrequently that we can refer anyone interested to Google “finite population correction factor,” which you can then add straightforwardly to your sample size formula.

Margin of Error (Confidence Interval)

The margin of error is the range within which the population parameter is expected to fall. Smaller margins require larger sample sizes. Simply put, the more precise you want to be, the larger your sample size needs to be.

Confidence Level

Confidence level refers to the probability that the sample results will represent the population within the margin of error. Common levels are 90%, 95%, and 99%. Higher confidence levels require larger sample sizes.

Standard Deviation

Standard deviation measures how spread out the values in your data set are. When you expect a high variation, you'll need a larger sample size to capture it accurately.

Quick Reference Table:

Sample Size Formulas

Sample size formula for margin of error (confidence interval, precision).

You may recall when learning statistics that your professor showed a formula for a confidence interval, then did some algebra to use it to solve for sample size (n). That’s where this formula comes from, from the confidence interval around a single proportion:

Sample Size Formula for Margin of Error

  • n = Sample Size
  • Z a/2 = Z-value that corresponds to desired confidence level (1.96 corresponds with the typical 95% confidence level)
  • p = Proportion of the population (since this is often not known, we usually use a worst case estimate of 0.5)
  • d = Margin of error (the radius of the confidence interval, or the precision)

Sample size formula for hypothesis testing

What your professor didn’t show you is that there’s a different formula when you want your sample size to support statistical testing. That’s where this formula comes from:

Sample Size Formula for Hypothesis Testing

  • n, Z a/2 , p and d are as above and
  • Zb =the Z-value that corresponds to the desired level of statistical power (0.84 corresponds to the commonly used 80% power)

How to Determine Sample Size: The Process

The sample size calculation process looks harder than it is. Just break it down into systematic steps. Here's how you can approach it, complete with real-world examples.

Step 1: Determine Confidence Level—Choose Wisely

The confidence level you select specifies how confident you can be that your sample results will reflect the true population parameter (a de facto standard is to shoot for 95% confidence). A higher confidence level, such as 99%, will provide greater assurance but will demand a larger sample size. A level like 99% might be appropriate for projects that carry high stakes, such as healthcare studies or regulatory compliance assessments.

On the flip side, a lower confidence level, like 90%, may suffice for quick market assessments or pilot studies. While it reduces the sample size needed, it does come at the cost of confidence in your findings. Here you accept a slightly higher risk that your sample results may not perfectly represent the broader population.

Rule of Thumb : For most business or academic research, a confidence level of 95% is considered a good starting point. For high-stakes, mission-critical projects, aim for 99%. For more exploratory or pilot projects where you can tolerate a bit more risk, 90% might be acceptable.

Z a/2 -the Z score for Confidence Level

In the context of confidence levels, this Z-score gives us the confidence level we want to have that the population score (mean, proportion, whatever you’re measuring) is within the margin of error, or contained within the confidence interval.

To calculate the Z-score, you can look it up in the standard normal distribution table, or use statistical software. The Z-score table below shows the Z-scores for the most commonly used confidence levels in market research (90%, 95%, and 99%) .

Z-score Table for Common Confidence Levels

Remember, the choice of confidence level dictates how much risk you're willing to accept, and in turn, influences the sample size and potentially, the viability of your project.

Example : Let's say you're researching consumer preferences for a new type of organic snack bar. You decide to go with a 95% confidence level, that is a 95% chance that your margin of error will include the population’s preference for the new snack bar. This equates to a Z-score of 1.96.

Step 2: Choose the Margin of Error/Precision

The margin of error measures the precision of your survey results. Simply put, a smaller margin of error (e.g., 2%) provides more accurate insights but requires a larger sample size. This can be particularly valuable when you're working on high-stakes projects or research where even minor errors could have significant business or policy implications.

Conversely, a larger margin of error (e.g., 5% or 10%) may suffice for exploratory studies or when resource constraints are a significant concern. In these cases, the benefit of a larger sample size may not outweigh the additional time and costs involved.

Rule of Thumb: Always weigh the trade-off between precision and resources to arrive at an optimal margin of error for your study. Larger samples give you more precision but they also cost more. Your margin of error directly influences both the quality and feasibility of your market research. This selection is not merely a statistical decision; it’s a strategic one that can have a meaningful impact on your project's success.

Example : Continuing with the organic snack bar study, you decide a 5% (0.05) margin of error is acceptable: you want your estimate to be accurate to with +/- 5% of the population percentage.

Step 3: Estimate Standard Deviation 

The standard deviation is a measure of the dispersion or spread of your data points around their average value. A high standard deviation implies more variability, whereas a low standard deviation indicates that the values are more bunched around the mean.

Why Standard Deviation Matters : A high standard deviation, means that there's a larger spread in the opinions, attitudes, or behaviors of your target population. This level of variability could require a larger sample size to capture the differences adequately. In contrast, a low standard deviation simplifies things; the closer your data points are to the mean, the less sample you may need for precise results.

Rule of Thumb : If you don't have prior data to calculate the actual standard deviation, a typical approach for proportions is to assume a 50:50 split or a proportion (p) of 0.05. This conservative estimate maximizes your sample size and thereby reduces the chance of underestimating it. However, if you have historical data or pilot studies to draw from, use the observed standard deviation as it will provide a more accurate sample size tailored to your research.

Example : Given the lack of preliminary data on consumer preferences for organic snack bars, you choose p = 0.5 to maximize your sample size.

Step 4: Determine your level of power (for hypothesis testing only)

Power is your ability to identify a difference of a particular size in hypothesis testing. If being able to detect a difference of 5% is really important to you, then you want to have a lot of power to detect that size of difference.

Why Power Matters: In a statistical test we have to worry about both confidence and power, because we seek to avoid both false positives (through the confidence level) and false negatives (via the power level). If you calculate sample size and ignore power, your sample sill be too small to detect the things that matter to you and you increase your risk of experiencing a false negative. False negatives can be very costly in practice. Let’s say a new ad campaign will be so successful that it will increase sales by 10%. If your product has $500 million in sales, that 10% increase is $50 million. If you cut costs on sample size and get a false negative result, however, you could conclude that the new ad isn’t a success, and cost your company $50 million in lost sales.

Rule of Thumb : We usually want at least 70% or 80% power to detect differences when they are real. In truth, however, when setting both the confidence level and power, we should consider how costly are false negatives (concluding the advertising doesn’t work when in fact it does) and false positives (concluding a new ad is successful when it is not) and then tailor our confidence and power to reflect those costs.

Step 5: Apply the Appropriate Sample Size Formula

This is where determining the correct sample size formula comes into play. Let’s say we want to make sure our study can identify the percentage of respondents who want our new product. We want 95% confidence the proportion we measure will be within 10 percentage points of the population proportion, but we don’t really have a clue what that might be.

Example : Plug in the Z-score (1.96), estimated proportion (0.5), and margin of error (0.05) into the sample size formula for margin of error:

Sample Size Formula for margin of error in Action

Note that we rounded our answer up to 385 because we can’t interview 0.16 of a respondent.

Actually, it turns out management wants to know the results of a statistical test. The current advertising scored 50% while it was in the testing phase, so we want to know if our new ad can beat the old one by 5%. Moreover, because we stand to lose sales if we get a false negative here, we want to have 80% power to detect a significant difference. Now we use the sample size formula for power:

Sample Size Formula for power in Action

Note that when we took power into account because we wanted to avoid a false negative) our sample size requirement more than doubled, from 385 to 784. Had the company gone out with a sample of 385, it would have had only a 50% chance of identifying a successful ad campaign! That’s research money very poorly spent, but it’s exactly what happens if you don’t take power into account.

Summary Checklist: Sample Size Determination Steps

  • Determine Confidence Level : Usually 95%, but sometimes 90% or 99%.
  • Choose Margin of Error : A small percentage (2-5%) is common.
  • Estimate Proportion of Population : Often 0.5 to maximize sample size.
  • Choose a level of power (hypothesis testing only) : 80% is common, 70% is usually a minimum recommendation
  • Apply the Appropriate Sample Size Formula : Use the formula to find the ideal sample size.

By following these steps, you're well on your way to determining an sample size for your study. This is a cornerstone of robust and credible market research, one that balances the risks of false positives and false negatives so as to maximize the value of your findings.

Using Sample Size Calculators

Though the sample size formula is a reliable tool for manual calculations, let's face it—math can be tedious. Sample size calculators can offer a more convenient route , often giving you the same level of accuracy with just a few clicks. However, most online sample size calculators use only the sample size for precision formula and thus do not take into account power. To remedy this, you may want just to double the sample size from an online calculator (because when we chose 80% power in the example above, the sample size, 784, was about double the one that came from considering only the confidence interval.

Key Takeaway: Sample size calculators are your go-to tools for quick, accurate, and convenient calculations. Most sample size calculators neglect statistical power, however, so use them with caution.

Troubleshooting Sample Size Issues

Sometimes your calculated sample size may be impractical (unaffordable). However, there are some strategies you can employ to come up with a more affordable sample size (hopefully without compromising your research too much).

Lowering the Confidence Level

If your sample size is turning out too large for your resources, one option is to lower the confidence level . A move from a 99% to a 95% confidence level can noticeably reduce the needed sample size. Remember though, this makes your results less robust.

Lowering the Power

While this comes with risks, lowering your power to 70% from 80%, say, can reduce your sample size.

Increasing the Margin of Error

Similarly, widening the margin of error will also decrease your required sample size. While this increases the range within which your population parameter is expected to fall, it's a trade-off that can sometimes make the research process more feasible.

Key Takeaway: Tweaking your confidence level, power or margin of error can reduce sample size needs, but always weigh the pros and cons.

Troubleshooting Options

Remember, these are options to help make your study feasible, but they do come with trade-offs. Always consider the impact of these adjustments on the reliability and credibility of your findings.

Real-Life Sample Size Applications

Understanding the mechanics of sample size determination is great, but what does this mean in real-world settings? How has accurate sample size determination influenced the outcomes of actual market research projects?

Success Story

Let's consider a tech company that recently launched a new feature and wanted to gauge user satisfaction. By carefully calculating a sample size that took into account a 95% confidence level and a 4% margin of error, the company was able to reliably conclude that the feature was well-received, leading to its continued investment and improvement.

Consequences of Poor Sample Size

On the flip side, another business failed to adequately calculate their sample size for a similar user-satisfaction survey. They concluded there was no change in user satisfaction, but there was and they missed it leading to misguided business decisions.

Key Takeaway: Accurate sample size determination isn't just academic; it has tangible implications for your business decisions and overall strategy.

Real-Life Implications

  • Success Scenarios : Precise sample size -> Reliable data -> Informed Decisions
  • Failure Scenarios : Inaccurate sample size -> Unreliable Data -> Misguided Decisions

Sample size determination is more than a statistical necessity; it's a vital business tool that can guide a company toward success or contribute to its failure.

Get Started with Survey Research Today!

Ready for your next research study? Get access to our free survey research tool. In just a few minutes, you can create powerful surveys with our easy-to-use interface.

Start Survey Research for Free or Request a Product Tour

Sample Sizes for Different Research Methods

The calculations above work for a single proportion. Similar equations exist for confidence intervals and statistical tests involving differences in proportions and differences in means. Complex statistical models have their own sample size requirements.

Regression analysis/driver analysis

The old rule of thumb of 10 observations per variable in the model is useful and works for data of average condition. When using particularly clean data we may get by with as few as 5 observations per variable. More common will be data with higher than average levels of multicollinearity and this will require larger sample sizes. So if our regression model has 12 variables, the basic recommendation would be n = 10k = 10(12) = 120.

Because it estimates the shape of an S-curve rather than a straight line, logit is more sample size intensive than regression. The rule of thumb is 10 times the number of variables in the model divided by the smaller of the two percentages of the binary response: n = 10k/p. So if our model has 2 predictors and we expect the response will be about 60/40 we’d go with n = 10(12)/(0.40) = 300.

Segmentation

Previous advice was a bit all over the board, but the most recent paper on the topic suggested a sample size of 100 for every basis variable included in the segmentation analysis. So if we have 20 basis variables, that suggests n=2,000.

Factor analysis 

One source suggests that samples of less than a hundred are held to be “poor,” 200 to be “fair” and 300 “good.” Others suggest that when the number of factors is small and correlations are large and reliable, samples of as few as 50 may be workable. Given the messiness of most survey research data, erring on the side of larger sample size seems prudent.

Tree-Based Segmentation

In classification or regression trees, sample is split and then split again, repeatedly. After three levels of pairwise splits, a tree model could have eight groups. For this reason, we usually recommend having at least 1000 respondents.

Conjoint Analysis/MaxDiff

Our usual recommendation about multivariate statistics (like conjoint analysis and MaxDiff analysis ) is to have at least 300 respondents, or at least 200 per separately reportable subgroup. Another way to think about conjoint analysis is to work backward from the simulator: what size differences in shares would be worth capturing, and what size of sample do you need to capture them (using a sample size formula for the difference in two proportions).

Key Takeaway : The methodology you choose can significantly impact your sample size needs, so choose wisely and calculate accordingly. Tailoring your sample size to the specific demands of your chosen methodology isn't just best practice; it's crucial for obtaining valid, actionable insights.

FAQ: Frequently Asked Questions about Sample Size

You've journeyed through the intricate maze of sample size determination, but you may still have lingering questions. Let's tackle some of those.

How do you define sample size?

Sample size refers to the number of individual data points or subjects that are included in a study. It's a crucial aspect of market research that impacts the reliability and credibility of your findings.

What is a good sample size?

A "good" sample size is one that allows for a high confidence level and a low margin of error (and for statistical testing, a high level of power), all while remaining manageable and cost-effective. The ideal sample size can vary based on the research methodology.

How do I calculate sample size?

To calculate the ideal sample size, you typically use a sample size formula that takes into account the statistic you want to study, your desired levels of confidence (and power), and the acceptable margin of error. Some online calculators can also do this for you.

And there you have it—a detailed guide on Understanding and Calculating Sample Size for Surveys . 

Sawtooth Software

3210 N Canyon Rd Ste 202

Provo UT 84604-6508

United States of America

sample size for a research project

Support: [email protected]

Consulting: [email protected]

Sales: [email protected]

Products & Services

Support & Resources

sample size for a research project

GeoPoll

How to Determine Sample Size for a Research Study

Frankline kibuacha | apr. 06, 2021 | 3 min. read.

sample size research

This article will discuss considerations to put in place when determining your sample size and how to calculate the sample size.

Confidence Interval and Confidence Level

As we have noted before, when selecting a sample there are multiple factors that can impact the reliability and validity of results, including sampling and non-sampling errors . When thinking about sample size, the two measures of error that are almost always synonymous with sample sizes are the confidence interval and the confidence level.

Confidence Interval (Margin of Error)

Confidence intervals measure the degree of uncertainty or certainty in a sampling method and how much uncertainty there is with any particular statistic. In simple terms, the confidence interval tells you how confident you can be that the results from a study reflect what you would expect to find if it were possible to survey the entire population being studied. The confidence interval is usually a plus or minus (±) figure. For example, if your confidence interval is 6 and 60% percent of your sample picks an answer, you can be confident that if you had asked the entire population, between 54% (60-6) and 66% (60+6) would have picked that answer.

Confidence Level

The confidence level refers to the percentage of probability, or certainty that the confidence interval would contain the true population parameter when you draw a random sample many times. It is expressed as a percentage and represents how often the percentage of the population who would pick an answer lies within the confidence interval. For example, a 99% confidence level means that should you repeat an experiment or survey over and over again, 99 percent of the time, your results will match the results you get from a population.

The larger your sample size, the more confident you can be that their answers truly reflect the population. In other words, the larger your sample for a given confidence level, the smaller your confidence interval.

Standard Deviation

Another critical measure when determining the sample size is the standard deviation, which measures a data set’s distribution from its mean. In calculating the sample size, the standard deviation is useful in estimating how much the responses you receive will vary from each other and from the mean number, and the standard deviation of a sample can be used to approximate the standard deviation of a population.

The higher the distribution or variability, the greater the standard deviation and the greater the magnitude of the deviation. For example, once you have already sent out your survey, how much variance do you expect in your responses? That variation in responses is the standard deviation.

Population Size

population

As demonstrated through the calculation below, a sample size of about 385 will give you a sufficient sample size to draw assumptions of nearly any population size at the 95% confidence level with a 5% margin of error, which is why samples of 400 and 500 are often used in research. However, if you are looking to draw comparisons between different sub-groups, for example, provinces within a country, a larger sample size is required. GeoPoll typically recommends a sample size of 400 per country as the minimum viable sample for a research project, 800 per country for conducting a study with analysis by a second-level breakdown such as females versus males, and 1200+ per country for doing third-level breakdowns such as males aged 18-24 in Nairobi.

Try Qualtrics for free

Determining sample size: how to make sure you get the correct sample size.

16 min read Sample size can make or break your research project. Here’s how to master the delicate art of choosing the right sample size.

What is sample size?

Sample size is the beating heart of any research project. It’s the invisible force that gives life to your data, making your findings robust, reliable and believable.

Sample size is what determines if you see a broad view or a focus on minute details; the art and science of correctly determining it involves a careful balancing act. Finding an appropriate sample size demands a clear understanding of the level of detail you wish to see in your data and the constraints you might encounter along the way.

Remember, whether you’re studying a small group or an entire population, your findings are only ever as good as the sample you choose.

Free eBook: Empower your market research efforts today

Let’s delve into the world of sampling and uncover the best practices for determining sample size for your research.

How to determine sample size

“How much sample do we need?” is one of the most commonly-asked questions and stumbling points in the early stages of  research design . Finding the right answer to it requires first understanding and answering two other questions:

How important is statistical significance to you and your stakeholders?

What are your real-world constraints.

At the heart of this question is the goal to confidently differentiate between groups, by describing meaningful differences as statistically significant.  Statistical significance  isn’t a difficult concept, but it needs to be considered within the unique context of your research and your measures.

First, you should consider when you deem a difference to be meaningful in your area of research. While the standards for statistical significance are universal, the standards for “meaningful difference” are highly contextual.

For example, a 10% difference between groups might not be enough to merit a change in a marketing campaign for a breakfast cereal, but a 10% difference in efficacy of breast cancer treatments might quite literally be the difference between life and death for hundreds of patients. The exact same magnitude of difference has very little meaning in one context, but has extraordinary meaning in another. You ultimately need to determine the level of precision that will help you make your decision.

Within sampling, the lowest amount of magnification – or smallest sample size – could make the most sense, given the level of precision needed, as well as timeline and budgetary constraints.

If you’re able to detect statistical significance at a difference of 10%, and 10% is a meaningful difference, there is no need for a larger sample size, or higher magnification. However, if the study will only be useful if a significant difference is detected for smaller differences – say, a difference of 5% — the sample size must be larger to accommodate this needed precision. Similarly, if 5% is enough, and 3% is unnecessary, there is no need for a larger statistically significant sample size.

You should also consider how much you expect your responses to vary. When there isn’t a lot of variability in response, it takes a lot more sample to be confident that there are statistically significant differences between groups.

For instance, it will take a lot more sample to find statistically significant differences between groups if you are asking, “What month do you think Christmas is in?” than if you are asking, “How many miles are there between the Earth and the moon?”. In the former, nearly everybody is going to give the exact same answer, while the latter will give a lot of variation in responses. Simply put, when your variables do not have a lot of variance, larger sample sizes make sense.

Statistical significance

The likelihood that the results of a study or experiment did not occur randomly or by chance, but are meaningful and indicate a genuine effect or relationship between variables.

Magnitude of difference

The size or extent of the difference between two or more groups or variables, providing a measure of the effect size or practical significance of the results.

Actionable insights

Valuable findings or conclusions drawn from  data analysis  that can be directly applied or implemented in decision-making processes or strategies to achieve a particular goal or outcome.

It’s crucial to understand the differences between the concepts of “statistical significance”, “magnitude of difference” and “actionable insights” – and how they can influence each other:

  • Even if there is a statistically significant difference, it doesn’t mean the magnitude of the difference is large: with a large enough sample, a 3% difference could be statistically significant
  • Even if the magnitude of the difference is large, it doesn’t guarantee that this difference is statistically significant: with a small enough sample, an 18% difference might not be statistically significant
  • Even if there is a large, statistically significant difference, it doesn’t mean there is a story, or that there are actionable insights

There is no way to guarantee statistically significant differences at the outset of a study – and that is a good thing.

Even with a sample size of a million, there simply may not be any differences – at least, any that could be described as statistically significant. And there are times when a lack of significance is positive.

Imagine if your main competitor ran a multi-million dollar ad campaign in a major city and a huge pre-post study to detect campaign effects, only to discover that there were no statistically significant differences in  brand awareness . This may be terrible news for your competitor, but it would be great news for you.

relative importance of age

With Stats iQ™ you can analyze your research results and conduct significance testing

As you determine your sample size, you should consider the real-world constraints to your research.

Factors revolving around timings, budget and target population are among the most common constraints, impacting virtually every study. But by understanding and acknowledging them, you can definitely navigate the practical constraints of your research when pulling together your sample.

Timeline constraints

Gathering a larger sample size naturally requires more time. This is particularly true for elusive audiences, those hard-to-reach groups that require special effort to engage. Your timeline could become an obstacle if it is particularly tight, causing you to rethink your sample size to meet your deadline.

Budgetary constraints

Every sample, whether large or small, inexpensive or costly, signifies a portion of your budget. Samples could be like an open market; some are inexpensive, others are pricey, but all have a price tag attached to them.

Population constraints

Sometimes the individuals or groups you’re interested in are difficult to reach; other times, they’re a part of an extremely small population. These factors can limit your sample size even further.

What’s a good sample size?

A good sample size really depends on the context and goals of the research. In general, a good sample size is one that accurately represents the population and allows for reliable statistical analysis.

Larger sample sizes are typically better because they reduce the likelihood of  sampling errors  and provide a more accurate representation of the population. However, larger sample sizes often increase the impact of practical considerations, like time, budget and the availability of your audience. Ultimately, you should be aiming for a sample size that provides a balance between statistical validity and practical feasibility.

4 tips for choosing the right sample size

Choosing the right sample size is an intricate balancing act, but following these four tips can take away a lot of the complexity.

1) Start with your goal

The foundation of your research is a clearly defined goal. You need to determine what you’re trying to understand or discover, and use your goal to guide your  research methods  – including your sample size.

If your aim is to get a broad overview of a topic, a larger, more diverse sample may be appropriate. However, if your goal is to explore a niche aspect of your subject, a smaller, more targeted sample might serve you better. You should always align your sample size with the objectives of your research.

2) Know that you can’t predict everything

Research is a journey into the unknown. While you may have hypotheses and predictions, it’s important to remember that you can’t foresee every outcome – and this uncertainty should be considered when choosing your sample size.

A larger sample size can help to mitigate some of the risks of unpredictability, providing a more diverse range of data and potentially more accurate results. However, you shouldn’t let the fear of the unknown push you into choosing an impractically large sample size.

3) Plan for a sample that meets your needs and considers your real-life constraints

Every research project operates within certain boundaries – commonly budget, timeline and the nature of the sample itself. When deciding on your sample size, these factors need to be taken into consideration.

Be realistic about what you can achieve with your available resources and time, and always tailor your sample size to fit your constraints – not the other way around.

4) Use best practice guidelines to calculate sample size

There are many established guidelines and formulas that can help you in determining the right sample size.

The easiest way to define your sample size is using a  sample size calculator , or you can use a manual sample size calculation if you want to test your math skills. Cochran’s formula is perhaps the most well known equation for calculating sample size, and widely used when the population is large or unknown.

Cochran's sample size formula

Beyond the formula, it’s vital to consider the confidence interval, which plays a significant role in determining the appropriate sample size – especially when working with a  random sample  – and the sample proportion. This represents the expected ratio of the target population that has the characteristic or response you’re interested in, and therefore has a big impact on your correct sample size.

If your population is small, or its variance is unknown, there are steps you can still take to determine the right sample size. Common approaches here include conducting a small pilot study to gain initial estimates of the population variance, and taking a conservative approach by assuming a larger variance to ensure a more representative sample size.

Empower your market research

Conducting meaningful research and extracting actionable intelligence are priceless skills in today’s ultra competitive business landscape. It’s never been more crucial to stay ahead of the curve by leveraging the power of market research to identify opportunities, mitigate risks and make informed decisions.

Equip yourself with the tools for success with our essential eBook,  “The ultimate guide to conducting market research” .

With this front-to-back guide, you’ll discover the latest strategies and best practices that are defining effective market research. Learn about practical insights and real-world applications that are demonstrating the value of research in driving business growth and innovation.

Learn how to determine sample size

To choose the correct sample size, you need to consider a few different factors that affect your research, and gain a basic understanding of the statistics involved. You’ll then be able to use a sample size formula to bring everything together and sample confidently, knowing that there is a high probability that your survey is statistically accurate.

The steps that follow are suitable for finding a sample size for continuous data – i.e. data that is counted numerically. It doesn’t apply to categorical data – i.e. put into categories like green, blue, male, female etc.

Stage 1: Consider your sample size variables

Before you can calculate a sample size, you need to determine a few things about the target population and the level of accuracy you need:

1. Population size

How many people are you talking about in total? To find this out, you need to be clear about who does and doesn’t fit into your group. For example, if you want to know about dog owners, you’ll include everyone who has at some point owned at least one dog. (You may include or exclude those who owned a dog in the past, depending on your research goals.) Don’t worry if you’re unable to calculate the exact number. It’s common to have an unknown number or an estimated range.

2. Margin of error (confidence interval)

Errors are inevitable – the question is how much error you’ll allow. The margin of error , AKA confidence interval, is expressed in terms of mean numbers. You can set how much difference you’ll allow between the mean number of your sample and the mean number of your population. If you’ve ever seen a political poll on the news, you’ve seen a confidence interval and how it’s expressed. It will look something like this: “68% of voters said yes to Proposition Z, with a margin of error of +/- 5%.”

3. Confidence level

This is a separate step to the similarly-named confidence interval in step 2. It deals with how confident you want to be that the actual mean falls within your margin of error. The most common confidence intervals are 90% confident, 95% confident, and 99% confident.

4. Standard deviation

This step asks you to estimate how much the responses you receive will vary from each other and from the mean number. A low standard deviation means that all the values will be clustered around the mean number, whereas a high standard deviation means they are spread out across a much wider range with very small and very large outlying figures. Since you haven’t yet run your survey, a safe choice is a standard deviation of .5 which will help make sure your sample size is large enough.

Stage 2: Calculate sample size

Now that you’ve got answers for steps 1 – 4, you’re ready to calculate the sample size you need. This can be done using an  online sample size calculator  or with paper and pencil.

1. Find your Z-score

Next, you need to turn your confidence level into a Z-score. Here are the Z-scores for the most common confidence levels:

  • 90% – Z Score = 1.645
  • 95% – Z Score = 1.96
  • 99% – Z Score = 2.576

If you chose a different confidence level, use this  Z-score table  (a resource owned and hosted by SJSU.edu) to find your score.

2. Use the sample size formula

Plug in your Z-score, standard of deviation, and confidence interval into the  sample size calculator  or use this sample size formula to work it out yourself:

Sample size formula graphic

This equation is for an unknown population size or a very large population size. If your population is smaller and known, just  use the sample size calculator.

What does that look like in practice?

Here’s a worked example, assuming you chose a 95% confidence level, .5 standard deviation, and a margin of error (confidence interval) of +/- 5%.

((1.96)2 x .5(.5)) / (.05)2

(3.8416 x .25) / .0025

.9604 / .0025

385 respondents are needed

Voila! You’ve just determined your sample size.

eBook: 2022 Market Research Global Trends Report

Related resources

Convenience sampling 15 min read, non-probability sampling 17 min read, probability sampling 8 min read, stratified random sampling 13 min read, simple random sampling 10 min read, sampling methods 15 min read, sampling and non-sampling errors 10 min read, request demo.

Ready to learn more about Qualtrics?

Our approach

  • Responsibility
  • Infrastructure
  • Try Meta AI

RECOMMENDED READS

  • 5 Steps to Getting Started with Llama 2
  • The Llama Ecosystem: Past, Present, and Future
  • Introducing Code Llama, a state-of-the-art large language model for coding
  • Meta and Microsoft Introduce the Next Generation of Llama
  • Today, we’re introducing Meta Llama 3, the next generation of our state-of-the-art open source large language model.
  • Llama 3 models will soon be available on AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake, and with support from hardware platforms offered by AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm.
  • We’re dedicated to developing Llama 3 in a responsible way, and we’re offering various resources to help others use it responsibly as well. This includes introducing new trust and safety tools with Llama Guard 2, Code Shield, and CyberSec Eval 2.
  • In the coming months, we expect to introduce new capabilities, longer context windows, additional model sizes, and enhanced performance, and we’ll share the Llama 3 research paper.
  • Meta AI, built with Llama 3 technology, is now one of the world’s leading AI assistants that can boost your intelligence and lighten your load—helping you learn, get things done, create content, and connect to make the most out of every moment. You can try Meta AI here .

Today, we’re excited to share the first two models of the next generation of Llama, Meta Llama 3, available for broad use. This release features pretrained and instruction-fine-tuned language models with 8B and 70B parameters that can support a broad range of use cases. This next generation of Llama demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning. We believe these are the best open source models of their class, period. In support of our longstanding open approach, we’re putting Llama 3 in the hands of the community. We want to kickstart the next wave of innovation in AI across the stack—from applications to developer tools to evals to inference optimizations and more. We can’t wait to see what you build and look forward to your feedback.

Our goals for Llama 3

With Llama 3, we set out to build the best open models that are on par with the best proprietary models available today. We wanted to address developer feedback to increase the overall helpfulness of Llama 3 and are doing so while continuing to play a leading role on responsible use and deployment of LLMs. We are embracing the open source ethos of releasing early and often to enable the community to get access to these models while they are still in development. The text-based models we are releasing today are the first in the Llama 3 collection of models. Our goal in the near future is to make Llama 3 multilingual and multimodal, have longer context, and continue to improve overall performance across core LLM capabilities such as reasoning and coding.

State-of-the-art performance

Our new 8B and 70B parameter Llama 3 models are a major leap over Llama 2 and establish a new state-of-the-art for LLM models at those scales. Thanks to improvements in pretraining and post-training, our pretrained and instruction-fine-tuned models are the best models existing today at the 8B and 70B parameter scale. Improvements in our post-training procedures substantially reduced false refusal rates, improved alignment, and increased diversity in model responses. We also saw greatly improved capabilities like reasoning, code generation, and instruction following making Llama 3 more steerable.

sample size for a research project

*Please see evaluation details for setting and parameters with which these evaluations are calculated.

In the development of Llama 3, we looked at model performance on standard benchmarks and also sought to optimize for performance for real-world scenarios. To this end, we developed a new high-quality human evaluation set. This evaluation set contains 1,800 prompts that cover 12 key use cases: asking for advice, brainstorming, classification, closed question answering, coding, creative writing, extraction, inhabiting a character/persona, open question answering, reasoning, rewriting, and summarization. To prevent accidental overfitting of our models on this evaluation set, even our own modeling teams do not have access to it. The chart below shows aggregated results of our human evaluations across of these categories and prompts against Claude Sonnet, Mistral Medium, and GPT-3.5.

sample size for a research project

Preference rankings by human annotators based on this evaluation set highlight the strong performance of our 70B instruction-following model compared to competing models of comparable size in real-world scenarios.

Our pretrained model also establishes a new state-of-the-art for LLM models at those scales.

sample size for a research project

To develop a great language model, we believe it’s important to innovate, scale, and optimize for simplicity. We adopted this design philosophy throughout the Llama 3 project with a focus on four key ingredients: the model architecture, the pretraining data, scaling up pretraining, and instruction fine-tuning.

Model architecture

In line with our design philosophy, we opted for a relatively standard decoder-only transformer architecture in Llama 3. Compared to Llama 2, we made several key improvements. Llama 3 uses a tokenizer with a vocabulary of 128K tokens that encodes language much more efficiently, which leads to substantially improved model performance. To improve the inference efficiency of Llama 3 models, we’ve adopted grouped query attention (GQA) across both the 8B and 70B sizes. We trained the models on sequences of 8,192 tokens, using a mask to ensure self-attention does not cross document boundaries.

Training data

To train the best language model, the curation of a large, high-quality training dataset is paramount. In line with our design principles, we invested heavily in pretraining data. Llama 3 is pretrained on over 15T tokens that were all collected from publicly available sources. Our training dataset is seven times larger than that used for Llama 2, and it includes four times more code. To prepare for upcoming multilingual use cases, over 5% of the Llama 3 pretraining dataset consists of high-quality non-English data that covers over 30 languages. However, we do not expect the same level of performance in these languages as in English.

To ensure Llama 3 is trained on data of the highest quality, we developed a series of data-filtering pipelines. These pipelines include using heuristic filters, NSFW filters, semantic deduplication approaches, and text classifiers to predict data quality. We found that previous generations of Llama are surprisingly good at identifying high-quality data, hence we used Llama 2 to generate the training data for the text-quality classifiers that are powering Llama 3.

We also performed extensive experiments to evaluate the best ways of mixing data from different sources in our final pretraining dataset. These experiments enabled us to select a data mix that ensures that Llama 3 performs well across use cases including trivia questions, STEM, coding, historical knowledge, etc.

Scaling up pretraining

To effectively leverage our pretraining data in Llama 3 models, we put substantial effort into scaling up pretraining. Specifically, we have developed a series of detailed scaling laws for downstream benchmark evaluations. These scaling laws enable us to select an optimal data mix and to make informed decisions on how to best use our training compute. Importantly, scaling laws allow us to predict the performance of our largest models on key tasks (for example, code generation as evaluated on the HumanEval benchmark—see above) before we actually train the models. This helps us ensure strong performance of our final models across a variety of use cases and capabilities.

We made several new observations on scaling behavior during the development of Llama 3. For example, while the Chinchilla-optimal amount of training compute for an 8B parameter model corresponds to ~200B tokens, we found that model performance continues to improve even after the model is trained on two orders of magnitude more data. Both our 8B and 70B parameter models continued to improve log-linearly after we trained them on up to 15T tokens. Larger models can match the performance of these smaller models with less training compute, but smaller models are generally preferred because they are much more efficient during inference.

To train our largest Llama 3 models, we combined three types of parallelization: data parallelization, model parallelization, and pipeline parallelization. Our most efficient implementation achieves a compute utilization of over 400 TFLOPS per GPU when trained on 16K GPUs simultaneously. We performed training runs on two custom-built 24K GPU clusters . To maximize GPU uptime, we developed an advanced new training stack that automates error detection, handling, and maintenance. We also greatly improved our hardware reliability and detection mechanisms for silent data corruption, and we developed new scalable storage systems that reduce overheads of checkpointing and rollback. Those improvements resulted in an overall effective training time of more than 95%. Combined, these improvements increased the efficiency of Llama 3 training by ~three times compared to Llama 2.

Instruction fine-tuning

To fully unlock the potential of our pretrained models in chat use cases, we innovated on our approach to instruction-tuning as well. Our approach to post-training is a combination of supervised fine-tuning (SFT), rejection sampling, proximal policy optimization (PPO), and direct preference optimization (DPO). The quality of the prompts that are used in SFT and the preference rankings that are used in PPO and DPO has an outsized influence on the performance of aligned models. Some of our biggest improvements in model quality came from carefully curating this data and performing multiple rounds of quality assurance on annotations provided by human annotators.

Learning from preference rankings via PPO and DPO also greatly improved the performance of Llama 3 on reasoning and coding tasks. We found that if you ask a model a reasoning question that it struggles to answer, the model will sometimes produce the right reasoning trace: The model knows how to produce the right answer, but it does not know how to select it. Training on preference rankings enables the model to learn how to select it.

Building with Llama 3

Our vision is to enable developers to customize Llama 3 to support relevant use cases and to make it easier to adopt best practices and improve the open ecosystem. With this release, we’re providing new trust and safety tools including updated components with both Llama Guard 2 and Cybersec Eval 2, and the introduction of Code Shield—an inference time guardrail for filtering insecure code produced by LLMs.

We’ve also co-developed Llama 3 with torchtune , the new PyTorch-native library for easily authoring, fine-tuning, and experimenting with LLMs. torchtune provides memory efficient and hackable training recipes written entirely in PyTorch. The library is integrated with popular platforms such as Hugging Face, Weights & Biases, and EleutherAI and even supports Executorch for enabling efficient inference to be run on a wide variety of mobile and edge devices. For everything from prompt engineering to using Llama 3 with LangChain we have a comprehensive getting started guide and takes you from downloading Llama 3 all the way to deployment at scale within your generative AI application.

A system-level approach to responsibility

We have designed Llama 3 models to be maximally helpful while ensuring an industry leading approach to responsibly deploying them. To achieve this, we have adopted a new, system-level approach to the responsible development and deployment of Llama. We envision Llama models as part of a broader system that puts the developer in the driver’s seat. Llama models will serve as a foundational piece of a system that developers design with their unique end goals in mind.

sample size for a research project

Instruction fine-tuning also plays a major role in ensuring the safety of our models. Our instruction-fine-tuned models have been red-teamed (tested) for safety through internal and external efforts. ​​Our red teaming approach leverages human experts and automation methods to generate adversarial prompts that try to elicit problematic responses. For instance, we apply comprehensive testing to assess risks of misuse related to Chemical, Biological, Cyber Security, and other risk areas. All of these efforts are iterative and used to inform safety fine-tuning of the models being released. You can read more about our efforts in the model card .

Llama Guard models are meant to be a foundation for prompt and response safety and can easily be fine-tuned to create a new taxonomy depending on application needs. As a starting point, the new Llama Guard 2 uses the recently announced MLCommons taxonomy, in an effort to support the emergence of industry standards in this important area. Additionally, CyberSecEval 2 expands on its predecessor by adding measures of an LLM’s propensity to allow for abuse of its code interpreter, offensive cybersecurity capabilities, and susceptibility to prompt injection attacks (learn more in our technical paper ). Finally, we’re introducing Code Shield which adds support for inference-time filtering of insecure code produced by LLMs. This offers mitigation of risks around insecure code suggestions, code interpreter abuse prevention, and secure command execution.

With the speed at which the generative AI space is moving, we believe an open approach is an important way to bring the ecosystem together and mitigate these potential harms. As part of that, we’re updating our Responsible Use Guide (RUG) that provides a comprehensive guide to responsible development with LLMs. As we outlined in the RUG, we recommend that all inputs and outputs be checked and filtered in accordance with content guidelines appropriate to the application. Additionally, many cloud service providers offer content moderation APIs and other tools for responsible deployment, and we encourage developers to also consider using these options.

Deploying Llama 3 at scale

Llama 3 will soon be available on all major platforms including cloud providers, model API providers, and much more. Llama 3 will be everywhere .

Our benchmarks show the tokenizer offers improved token efficiency, yielding up to 15% fewer tokens compared to Llama 2. Also, Group Query Attention (GQA) now has been added to Llama 3 8B as well. As a result, we observed that despite the model having 1B more parameters compared to Llama 2 7B, the improved tokenizer efficiency and GQA contribute to maintaining the inference efficiency on par with Llama 2 7B.

For examples of how to leverage all of these capabilities, check out Llama Recipes which contains all of our open source code that can be leveraged for everything from fine-tuning to deployment to model evaluation.

What’s next for Llama 3?

The Llama 3 8B and 70B models mark the beginning of what we plan to release for Llama 3. And there’s a lot more to come.

Our largest models are over 400B parameters and, while these models are still training, our team is excited about how they’re trending. Over the coming months, we’ll release multiple models with new capabilities including multimodality, the ability to converse in multiple languages, a much longer context window, and stronger overall capabilities. We will also publish a detailed research paper once we are done training Llama 3.

To give you a sneak preview for where these models are today as they continue training, we thought we could share some snapshots of how our largest LLM model is trending. Please note that this data is based on an early checkpoint of Llama 3 that is still training and these capabilities are not supported as part of the models released today.

sample size for a research project

We’re committed to the continued growth and development of an open AI ecosystem for releasing our models responsibly. We have long believed that openness leads to better, safer products, faster innovation, and a healthier overall market. This is good for Meta, and it is good for society. We’re taking a community-first approach with Llama 3, and starting today, these models are available on the leading cloud, hosting, and hardware platforms with many more to come.

Try Meta Llama 3 today

We’ve integrated our latest models into Meta AI, which we believe is the world’s leading AI assistant. It’s now built with Llama 3 technology and it’s available in more countries across our apps.

You can use Meta AI on Facebook, Instagram, WhatsApp, Messenger, and the web to get things done, learn, create, and connect with the things that matter to you. You can read more about the Meta AI experience here .

Visit the Llama 3 website to download the models and reference the Getting Started Guide for the latest list of all available platforms.

You’ll also soon be able to test multimodal Meta AI on our Ray-Ban Meta smart glasses.

As always, we look forward to seeing all the amazing products and experiences you will build with Meta Llama 3.

Our latest updates delivered to your inbox

Subscribe to our newsletter to keep up with Meta AI news, events, research breakthroughs, and more.

Join us in the pursuit of what’s possible with AI.

sample size for a research project

Product experiences

Foundational models

Latest news

Meta © 2024

U.S. flag

An official website of the United States government

The .gov means it’s official. Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

The site is secure. The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

  • Publications
  • Account settings

Preview improvements coming to the PMC website in October 2024. Learn More or Try it out now .

  • Advanced Search
  • Journal List
  • Indian J Psychol Med
  • v.35(2); Apr-Jun 2013

How to Calculate Sample Size for Different Study Designs in Medical Research?

Jaykaran charan.

Department of Pharmacology, Govt. Medical College, Surat, Gujarat, India

Tamoghna Biswas

1 Independent Researcher, Kolkata, West Bengal, India

Calculation of exact sample size is an important part of research design. It is very important to understand that different study design need different method of sample size calculation and one formula cannot be used in all designs. In this short review we tried to educate researcher regarding various method of sample size calculation available for different study designs. In this review sample size calculation for most frequently used study designs are mentioned. For genetic and microbiological studies readers are requested to read other sources.

INTRODUCTION

In the recent era of evidence-based medicine, biomedical statistics has come under increased scrutiny. Evidence is as good as the research it is based on, which in turn depends on the statistical soundness of the claims it make. One of the important issues faced by a biomedical researcher during the design phase of the study is sample size calculation. Various studies published in Indian and International journals revealed that sample size calculations are not reported properly in the published articles. Many of the studies published in these journals have less than required sample size and hence less power.[ 1 , 2 , 3 ] Many articles have been published in existing literature explaining the methods of calculation of sample size but still a lot of confusion exists.[ 4 , 5 , 6 ] It is very important to understand that method of sample size calculation is different for different study designs and one blanket formula for sample size calculation cannot be used for all study designs. In this article different formulae of sample size calculations are explained based on study designs. Readers are advised to understand the basics of prerequisites needed for calculation of sample size calculation through this article and from other sources also and once they have understood the basics they can use different paid/freely available software available for sample size calculations. For simple study designs formulae given in this article can be used for sample size calculation.

Sample size calculation for cross sectional studies/surveys

Cross sectional studies or cross sectional survey are done to estimate a population parameter like prevalence of some disease in a community or finding the average value of some quantitative variable in a population. Sample size formula for qualitative variable and quantities variable are different.

For qualitative variable

Suppose an epidemiologist want to know proportion of children who are hypertensive in a population then this formula should be used as proportion is a qualitative variable.

An external file that holds a picture, illustration, etc.
Object name is IJPsyM-35-121-g001.jpg

So if the researcher is interested in knowing the average systolic blood pressure in pediatric age group of that city at 5% of type of 1 error and precision of 5 mmHg of either side (more or less than mean systolic BP) and standard deviation, based on previously done studies, is 25 mmHg then formula for sample size calculation will be

An external file that holds a picture, illustration, etc.
Object name is IJPsyM-35-121-g002.jpg

So if the researcher wants to calculate sample size for the above-mentioned case control study to know link between childhood sexual abuse with psychiatric disorder in adulthood and he wants to fix power of study at 80% and assuming expected proportions in case group and control group are 0.35 and 0.20 respectively, and he wants to have equal number cases and control; then the sample size per group will be

An external file that holds a picture, illustration, etc.
Object name is IJPsyM-35-121-g003.jpg

So, the researcher has to take 59 samples in each group.

It is worthy of mention here that these formulas for case control and cohort study are for independent design studies. They are not for matched case control and cohort studies. These formulae can be modified or corrected depending on population size or ratio between sample size and population size. Detailed text should be read to know more about technical aspects of sample size calculation.[ 7 , 8 ] Readers are advised to use various freely available epidemiological calculators like openEpi given in appendix to calculate sample size formula.

Sample size calculation for testing a hypothesis (Clinical trials or clinical interventional studies)

In this kind of research design researcher wants to see the effect of an intervention. Suppose a researcher want to see the effect of an antihypertensive drug so he will select two groups, one group will be given antihypertensive drug and another group will be give placebo. After giving these drug s for a fixed time period blood pressure of both groups will be measured and mean blood pressure of both groups will be compared to see if difference is significant or not. Complex formulae are used for this type of studies and we want to advise readers to use statistical software for calculation of exact sample size. The procedure for calculation of samle size in clinical trials/intervention studies involving two groups is mentioned here. In the case of only two groups method of calculation is mentioned here but if design involves more than two groups then statistical software like G Power should be used for sample size calculation. But understanding of various prerequisites which are needed for sample size calculation is very important.

Formula for sample size calculation for comparison between two groups when endpoint is quantitative data

When the variable is quantitative data like blood pressure, weight, height, etc., then the followingformula can be used for calculation of sample size for comparison between two groups.

An external file that holds a picture, illustration, etc.
Object name is IJPsyM-35-121-g004.jpg

So researcher needs 294 subjects per group.

So simple calculation for sample size when comparison is for two independent groups can be done manually by given formulae but for more than two groups or for matched data and for other complex calculations software should be used [ appendix 1 ].

Sample size formula for animal studies

For animal studies there are two method of calculation of sample size. The most preferred method is the same method which has been mentioned in sample size calculation for testing the hypothesis. While all efforts should be done to calculate the sample size by that method, sometimes it is not possible to get information related to the prerequisites needed for sample size calculation by power analysis like standard deviation, effect size etc. In that condition a second method can be used this is called as “resource equation method”.[ 9 ] In this method a value E is calculated based on decided sample size. The value if E should lies within 10 to 20 for optimum sample size. If a value of E is less than 10 then more animal should be included and if it is more than 20 then sample size should be decreased.

E = Total number of animals - Total number of groups

Suppose in an animal study a researcher formed 4 groups of animal having 8 animals each for different interventions then total animals will be 32 (4 × 8). Hence E will be

E = 32 – 4 = 28

This is more than 20 hence animals should be decreased in each group. So if researcher takes 5 rats in each group then E will be

E = 20 – 4 = 16

E is 16 which lies within 10-20 hence five rats per group for four groups can be considered as appropriate sample size. This is a crude method and should be used only if sample size calculation cannot be done by power analysis method explained in above section for testing the hypothesis.

APPENDIX 1: – FREE SOFTWARE AND CALCULATORS AVAILABLE ONLINE FOR SAMPLE SIZE CALCULATION

An external file that holds a picture, illustration, etc.
Object name is IJPsyM-35-121-g005.jpg

Source of Support: Nil

Conflict of Interest: None.

IMAGES

  1. Sample Size Determination: Definition, Formula, and Example

    sample size for a research project

  2. FREE 12+ Sample Research Project Templates in PDF

    sample size for a research project

  3. How to find the correct sample size for your research survey (formula

    sample size for a research project

  4. What is Sample Size? Definition

    sample size for a research project

  5. FREE 12+ Sample Research Project Templates in PDF

    sample size for a research project

  6. how to determine sample size in research methodology

    sample size for a research project

VIDEO

  1. How to calculate/determine the Sample size for difference in proportion/percentage between 2 groups?

  2. 11. Business Research Methods / Non-probability sampling / Sample size

  3. how to calculate sample size in descriptive studies,sample size calculation in research paper/thesis

  4. How to calculate Sample Size? |Marathi| Research in Social Sciences| Research in Management

  5. Determining Sampling size

  6. Sample size determination & Power of study

COMMENTS

  1. How to Determine Sample Size

    3) Plan for a sample that meets your needs and considers your real-life constraints. Every research project operates within certain boundaries - commonly budget, timeline and the nature of the sample itself. When deciding on your sample size, these factors need to be taken into consideration.

  2. A Step-by-Step Process on Sample Size Determination for Medical Research

    In order to make up for a rough estimate of 20.0% of non-response rate, the minimum sample size requirement is calculated to be 254 patients (i.e. 203/0.8) by estimating the sample size based on the EPV 50, and is calculated to be 375 patients (i.e. 300/0.8) by estimating the sample size based on the formula n = 100 + 50i.

  3. Sample Size and its Importance in Research

    The sample size for a study needs to be estimated at the time the study is proposed; too large a sample is unnecessary and unethical, and too small a sample is unscientific and also unethical. The necessary sample size can be calculated, using statistical software, based on certain assumptions. If no assumptions can be made, then an arbitrary ...

  4. Sample size determination: A practical guide for health researchers

    Therefore, when estimating the sample size, population size is rarely important in medical research. 37 However, if the population is limited (e.g., in a study that evaluates an academic program, where the population is all students enrolled in the program), then the sample size equations can be adjusted for the population size. 37 , 38 , 39 ...

  5. Sample Size Determination: Definition, Formula, and Example

    What is Sample Size? 'Sample size' is a market research term used for defining the number of individuals included in conducting research. Researchers choose their sample based on demographics, such as age, gender questions, or physical location.It can be vague or specific.

  6. How To Determine Sample Size for Quantitative Research

    You need to determine how big of a sample size you need so that you can be sure the quantitative data you get from a survey is reflective of your target population as a whole - and so that the decisions you make based on the research have a firm foundation. Too big a sample and a project can be needlessly expensive and time-consuming.

  7. PDF Choosing the Size of the Sample

    For explorative research, a small sample size may suffice. Moreover, generally, the more important a study is, the larger the sample size required in order to satisfy the ... The number of elements selected for a research project will primarily be determined by the availability of resources. There is a direct relationship between amount of ...

  8. What Is Sample Size?

    Sample size is the number of observations or individuals included in a study or experiment. It is the number of individuals, items, or data points selected from a larger population to represent it statistically. The sample size is a crucial consideration in research because it directly impacts the reliability and extent to which you can ...

  9. Determining Sample Size for Survey Research

    Learn why determining the right sample size is critical for accurate research outcomes. Explore step-by-step methods for calculating sample size for confidence intervals and hypothesis testing. ... This can be particularly valuable when you're working on high-stakes projects or research where even minor errors could have significant business or ...

  10. Sample size: how many participants do I need in my research?

    It is the ability of the test to detect a difference in the sample, when it exists in the target population. Calculated as 1-Beta. The greater the power, the larger the required sample size will be. A value between 80%-90% is usually used. Relationship between non-exposed/exposed groups in the sample.

  11. Big enough? Sampling in qualitative inquiry

    So there was no uniform answer to the question and the ranges varied according to methodology. In fact, Shaw and Holland (2014) claim, sample size will largely depend on the method. (p. 87), "In truth," they write, "many decisions about sample size are made on the basis of resources, purpose of the research" among other factors. (p. 87).

  12. Sample size determination: A practical guide for health researchers

    For sample size estima-tion, researchers need to (1) provide information regarding the statistical analysis to be applied, (2) determine acceptable precision levels, (3) decide on study power, (4) specify the confidence level, and (5) determine the magnitude of practical significance differences (effect size).

  13. How to Determine Sample Size for a Research Study

    2.58. Put these figures into the sample size formula to get your sample size. Here is an example calculation: Say you choose to work with a 95% confidence level, a standard deviation of 0.5, and a confidence interval (margin of error) of ± 5%, you just need to substitute the values in the formula: ( (1.96)2 x .5 (.5)) / (.05)2.

  14. Sample Size: How Many Survey Participants Do I Need?

    (For more advanced students with an interest in statistics, the Creative Research Systems website (Creative Research Systems, 2003) has a more exact formula, along with a sample size calculator that you can use.

  15. How to Determine Sample Size in Research

    Stage 2: Calculate sample size. Now that you've got answers for steps 1 - 4, you're ready to calculate the sample size you need. This can be done using an online sample size calculator or with paper and pencil. 1. Find your Z-score. Next, you need to turn your confidence level into a Z-score.

  16. Full article: Sample size determination for research projects

    However, sample size determination is 'the mathematical process of deciding, before a study begins, how many subjects should be studied'. 1 Sample size calculations for research projects are an essential part of a study protocol for submission to ethical committees, research funding bodies and some peer review journals.

  17. Sample size determination: A practical guide for health researchers

    Therefore, when estimating the sample size, population size is rarely important in medical research. 37 However, if the population is limited (e.g., in a study that evaluates an academic program, where the population is all students enrolled in the program), then the sample size equations can be adjusted for the population size. 37-39 The size ...

  18. Sample size calculation: Basic principles

    If a difference of 15 mmHg in MAP is considered between the phenylephrine and the placebo group as clinically significant (μ1− μ2) and be detected with 80% power and a significance level alpha of 0.05. [ 7] n = 2 × ( [1.96 + 0.842] 2 × 20 2 )/15 2 = 27.9. That means 28 subjects per group is the sample size.

  19. Determining the Sample Size in Qualitative Research

    finds a variation of the sample size from 1 to 95 (averages being of 31 in the first ca se and 28 in the. second). The research region - one of t he cultural factors, plays a significant role in ...

  20. (PDF) Research Sampling and Sample Size Determination: A practical

    Department of Guidance and Counseling, Faculty of Arts and Education, University of Africa, Bayelsa State. E-Mail: [email protected]. Phone Number: 08036648341. Abstract. One of the major ...

  21. How sample size influences research outcomes

    An appropriate sample renders the research more efficient: Data generated are reliable, resource investment is as limited as possible, while conforming to ethical principles. The use of sample size calculation directly influences research findings. Very small samples undermine the internal and external validity of a study.

  22. Introducing Meta Llama 3: The most capable openly available LLM to date

    To develop a great language model, we believe it's important to innovate, scale, and optimize for simplicity. We adopted this design philosophy throughout the Llama 3 project with a focus on four key ingredients: the model architecture, the pretraining data, scaling up pretraining, and instruction fine-tuning.

  23. How to Calculate Sample Size for Different Study Designs in Medical

    In this method a value E is calculated based on decided sample size. The value if E should lies within 10 to 20 for optimum sample size. If a value of E is less than 10 then more animal should be included and if it is more than 20 then sample size should be decreased. E = Total number of animals - Total number of groups.