
Statistics LibreTexts

8.2: Hypothesis Testing with t


  • Foster et al.
  • University of Missouri-St. Louis, Rice University, & University of Houston, Downtown Campus via University of Missouri’s Affordable and Open Access Educational Resources Initiative

Hypothesis testing with the \(t\)-statistic works exactly the same way as \(z\)-tests did, following the four-step process of

  • Stating the Hypothesis
  • Finding the Critical Values
  • Computing the Test Statistic
  • Making the Decision.

We will work through an example: let’s say that you move to a new city and find an auto shop to change your oil. Your old mechanic did the job in about 30 minutes (though you never paid close enough attention to know how much that varied), and you suspect that your new shop takes much longer. After 4 oil changes, you think you have enough evidence to demonstrate this.

Step 1: State the Hypotheses Our hypotheses for 1-sample t-tests are identical to those we used for \(z\)-tests. We still state the null and alternative hypotheses mathematically in terms of the population parameter and written out in readable English. For our example:

\(H_0\): There is no difference in the average time to change a car’s oil

\(H_0: μ = 30\)

\(H_A\): This shop takes longer to change oil than your old mechanic

\(H_A: μ > 30\)

Step 2: Find the Critical Values As noted above, our critical values still delineate the area in the tails under the curve corresponding to our chosen level of significance. Because we have no reason to change significance levels, we will use \(α\) = 0.05, and because we suspect a direction of effect, we have a one-tailed test. To find our critical values for \(t\), we need to add one more piece of information: the degrees of freedom. For this example:

\[df = N – 1 = 4 – 1 = 3 \nonumber \]

Going to our \(t\)-table, we find the column corresponding to our one-tailed significance level and find where it intersects with the row for 3 degrees of freedom. As shown in Figure \(\PageIndex{1}\), our critical value is \(t*\) = 2.353.

fig 8.2.1.png

We can then shade this region on our \(t\)-distribution to visualize our rejection region

fig 8.2.2.png

Step 3: Compute the Test Statistic The four wait times you experienced for your oil changes at the new shop were 46 minutes, 58 minutes, 40 minutes, and 71 minutes. We will use these to calculate \(\overline{\mathrm{X}}\) and \(s\) by first filling in the sum of squares table in Table \(\PageIndex{1}\):

After filling in the first row to get \(\Sigma\)=215, we find that the mean is \(\overline{\mathrm{X}}\) = 53.75 (215 divided by sample size 4), which allows us to fill in the rest of the table to get our sum of squares \(SS\) = 564.74, which we then plug in to the formula for standard deviation from chapter 3:

\[s=\sqrt{\dfrac{\sum(X-\overline{X})^{2}}{N-1}}=\sqrt{\dfrac{S S}{d f}}=\sqrt{\dfrac{564.74}{3}}=13.72 \nonumber \]

Next, we take this value and plug it in to the formula for standard error:

\[s_{\overline{X}}=\dfrac{s}{\sqrt{n}}=\dfrac{13.72}{2}=6.86 \nonumber \]

And, finally, we put the standard error, sample mean, and null hypothesis value into the formula for our test statistic \(t\):

\[t=\dfrac{\overline{\mathrm{X}}-\mu}{s_{\overline{\mathrm{X}}}}=\dfrac{53.75-30}{6.86}=\dfrac{23.75}{6.86}=3.46 \nonumber \]

This may seem like a lot of steps, but it is really just taking our raw data to calculate one value at a time and carrying that value forward into the next equation: data → sample size/degrees of freedom → mean → sum of squares → standard deviation → standard error → test statistic. At each step, we simply match the symbols of what we just calculated to where they appear in the next formula to make sure we are plugging everything in correctly.
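The chain of calculations above can be sketched in a few lines of Python. This is only an illustration using the four wait times from the example; the variable names are our own choices, not part of the original text.

```python
import math

# Wait times (minutes) for the four oil changes in the example.
times = [46, 58, 40, 71]
mu_0 = 30  # mean under the null hypothesis

n = len(times)                              # sample size
df = n - 1                                  # degrees of freedom
mean = sum(times) / n                       # sample mean
ss = sum((x - mean) ** 2 for x in times)    # sum of squares
s = math.sqrt(ss / df)                      # sample standard deviation
se = s / math.sqrt(n)                       # standard error of the mean
t = (mean - mu_0) / se                      # test statistic

print(f"N = {n}, df = {df}, mean = {mean:.2f}")
print(f"SS = {ss:.2f}, s = {s:.2f}, SE = {se:.2f}, t = {t:.2f}")
```

Without intermediate rounding the sum of squares comes out as 564.75 rather than 564.74; the difference comes from rounding each squared deviation before adding, and it does not change \(t\) to two decimal places.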

Step 4: Make the Decision Now that we have our critical value and test statistic, we can make our decision using the same criteria we used for a \(z\)-test. Our obtained \(t\)-statistic was \(t\) = 3.46 and our critical value was \(t*\) = 2.353. Because \(t > t*\), we reject the null hypothesis and conclude:

Based on our four oil changes, the new mechanic takes longer on average (\(\overline{\mathrm{X}}\) = 53.75) to change oil than our old mechanic, \(t(3)\) = 3.46, \(p\) < .05.
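As a rough check on this decision, the one-tailed \(p\)-value can be approximated by numerically integrating the density of the \(t\)-distribution. The sketch below uses only the Python standard library; the helper names `t_pdf` and `upper_tail_p` are our own, and Simpson's rule is just one convenient integration method, not something prescribed by the text.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t].

    steps must be even. The distribution is symmetric, so the area to the
    right of t is 0.5 minus the area between 0 and t.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Values taken from the example above.
t_obs, df, t_crit, alpha = 3.46, 3, 2.353, 0.05

p = upper_tail_p(t_obs, df)
reject = t_obs > t_crit  # critical-value rule for this one-tailed test

print(f"t = {t_obs}, critical value = {t_crit}, reject H0: {reject}")
print(f"approximate one-tailed p = {p:.3f} (compare with alpha = {alpha})")
```

The two rules agree: the statistic exceeds the critical value exactly when the tail area beyond it is smaller than \(α\).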

Notice that we also include the degrees of freedom in parentheses next to \(t\). And because we found a significant result, we need to calculate an effect size, which is still Cohen’s \(d\), but now we use \(s\) in place of \(σ\):

\[d=\dfrac{\overline{X}-\mu}{s}=\dfrac{53.75-30.00}{13.72}=1.73 \nonumber \]

This is a large effect. It should also be noted that for some things, like the minutes in our current example, we can also interpret the magnitude of the difference we observed (23 minutes and 45 seconds) as an indicator of importance since time is a familiar metric.
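The effect size and the raw difference can be recomputed from the same four wait times. This is a small sketch using Python's `statistics` module; the variable names are illustrative.

```python
import statistics

times = [46, 58, 40, 71]  # wait times (minutes) from the example
mu_0 = 30                 # mean under the null hypothesis

mean = statistics.mean(times)
s = statistics.stdev(times)   # sample standard deviation (N - 1 denominator)

d = (mean - mu_0) / s         # Cohen's d, using s in place of sigma
raw_diff = mean - mu_0        # difference in minutes

# Express the raw difference as whole minutes and seconds.
minutes, seconds = divmod(round(raw_diff * 60), 60)

print(f"Cohen's d = {d:.2f}")
print(f"Raw difference = {minutes} minutes {seconds} seconds")
```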


The Ultimate Guide to T Tests


The t test is one of the simplest statistical techniques that is used to evaluate whether there is a statistical difference between the means from up to two different samples. The t test is especially useful when you have a small number of sample observations (under 30 or so), and you want to make conclusions about the larger population.

The characteristics of the data dictate the appropriate type of t test to run. All t tests are used as standalone analyses for very simple experiments and research questions as well as to perform individual tests within more complicated statistical models such as linear regression. In this guide, we’ll lay out everything you need to know about t tests, including providing a simple workflow to determine what t test is appropriate for your particular data or if you’d be better suited using a different model.

What is a t test?

A t test is a statistical technique used to quantify the difference between the mean (average value) of a variable from up to two samples (datasets). The variable must be numeric. Some examples are height, gross income, and amount of weight lost on a particular diet.

A t test tells you if the difference you observe is “surprising” based on the expected difference. T tests use t-distributions to evaluate the expected variability. When you have a reasonable-sized sample (over 30 or so observations), the t test can still be used, but other tests that use the normal distribution (the z test) can be used in its place.

Sometimes t tests are called “Student’s” t tests, which is simply a reference to their unusual history.

Barrels at the guinness brewery museum in Dublin, Ireland |sebastiangora (Adobe Stock)

It got its name because a brewer from the Guinness Brewery, William Gosset, published about the method under the pseudonym "Student". He wanted to get information out of very small sample sizes (often 3-5) because it took so much effort to brew each keg for his samples.

When should I use a t test?

A t test is appropriate to use when you’ve collected a small, random sample from some statistical “population” and want to compare the mean from your sample to another value. The value for comparison could be a fixed value (e.g., 10) or the mean of a second sample.

For example, if your variable of interest is the average height of sixth graders in your region, then you might measure the height of 25 or 30 randomly-selected sixth graders. A t test could be used to answer questions such as, “Is the average height greater than four feet?”

How does a t test work?

T tests make enough assumptions about your experiment to calculate an expected variability, and then they use that to determine whether the observed data are statistically significant. To do this, t tests rely on an assumed “null hypothesis.” With the above example, the null hypothesis is that the average height is less than or equal to four feet.

Say that we measure the height of 5 randomly selected sixth graders and the average height is five feet. Does that mean that the “true” average height of all sixth graders is greater than four feet or did we randomly happen to measure taller than average students?

To evaluate this, we need a distribution that shows every possible average value resulting from a sample of five individuals in a population where the true mean is four. That may seem impossible to do, which is why there are particular assumptions that need to be made to perform a t test.

With those assumptions, then all that’s needed to determine the “sampling distribution of the mean” is the sample size (5 students in this case) and standard deviation of the data (let’s say it’s 1 foot).

That’s enough to create a graphic of the distribution of the mean, which is:

8-degreesoffreedom

Notice the vertical line at x = 5, which was our sample mean. We (use software to) calculate the area to the right of the vertical line, which gives us the P value (0.09 in this case). Note that because our research question was asking if the average student is greater than four feet, the distribution is centered at four. Since we’re only interested in knowing if the average is greater than four feet, we use a one-tailed test in this case.

Using the standard significance level of 0.05 with this example, we don’t have evidence that the true average height of sixth graders is taller than 4 feet.

What are the assumptions for t tests?

  • One variable of interest: This is not correlation or regression, where you are interested in the relationship between multiple variables. With a t test, you can have different samples, but they are all measuring the same variable (e.g., height).
  • Numeric data: You are dealing with a list of measurements that can be averaged. This means you aren’t just counting occurrences in various categories (e.g., eye color or political affiliation).
  • Two groups or fewer: If you have more than two samples of data, a t test is the wrong technique. You most likely need to try ANOVA.
  • Random sample: You need a random sample from your statistical “population of interest” in order to draw valid conclusions about the larger population. If your population is so small that you can measure everything, then you have a “census” and don’t need statistics. This is because you don’t need to estimate the truth, since you have measured the truth without variability.
  • Normally distributed: The smaller your sample size, the more important it is that your data come from a normal (Gaussian) distribution, the familiar bell curve. If you have reason to believe that your data are not normally distributed, consider nonparametric t test alternatives. Normality matters less for larger samples (usually 25 or 30 or more, unless the data are heavily skewed). The reason is that the Central Limit Theorem applies in this case: even if the distribution of your data is not normal, the distribution of the mean of your data is approximately normal, so you can use a z-test rather than a t test.

How do I know which t test to use?

There are many types of t tests to choose from, but you don’t necessarily have to understand every detail behind each option.

You just need to be able to answer a few questions, which will lead you to pick the right t test. To that end, we put together this workflow for you to figure out which test is appropriate for your data.

Do you have one or two samples?

Are you comparing the means of two different samples, or comparing the mean from one sample to a fixed value? An example research question is, “Is the average height of my sample of sixth grade students greater than four feet?”

If you only have one sample of data, you can skip ahead to the one-sample t test example; otherwise, your next step is to ask:

Are observations in the two samples matched up or related in some way?

This could be before-and-after measurements of the same exact subjects, or perhaps your study split up “pairs” of subjects (who are technically different but share certain characteristics of interest) into the two samples. The same variable is measured in both cases.

If so, you are looking at some kind of paired samples t test. The paired samples section will help you dial in exactly which one in that family is best for you, either difference (most common) or ratio.

If you aren’t sure paired is right, ask yourself another question:

Are you comparing different observations in each of the two samples?

If the answer is yes, then you have an unpaired or independent samples t test. The two samples should measure the same variable (e.g., height), but are samples from two distinct groups (e.g., team A and team B). 

The goal is to compare the means to see if the groups are significantly different. For example, “Is the average height of team A greater than team B?” Unlike paired, the only relationship between the groups in this case is that we measured the same variable for both. There are two versions of unpaired samples t tests (pooled and unpooled) depending on whether you assume the same variance for each sample.

Have you run the same experiment multiple times on the same subject/observational unit?

If so, then you have a nested t test (unless you have more than two sample groups). This is a trickier concept to understand. One example is if you are measuring how well Fertilizer A works against Fertilizer B. Let’s say you have 12 pots to grow plants in (6 pots for each fertilizer), and you grow 3 plants in each pot.

In this case you have 6 observational units for each fertilizer, with 3 subsamples from each pot. You would want to analyze this with a nested t test . The “nested” factor in this case is the pots. It’s important to note that we aren’t interested in estimating the variability within each pot, we just want to take it into account.

You might be tempted to run an unpaired samples t test here, but that assumes you have 6*3 = 18 replicates for each fertilizer. However, the three replicates within each pot are related, and an unpaired samples t test wouldn’t take that into account.

What if none of these sound like my experiment?

If you’re not seeing your research question above, note that t tests are very basic statistical tools. Many experiments require more sophisticated techniques to evaluate differences. If the variable of interest is a proportion (e.g., 10 of 100 manufactured products were defective), then you’d use z-tests. If you take before and after measurements and have more than one treatment (e.g., control vs a treatment diet), then you need ANOVA.

How do I perform a t test using software?

If you’re wondering how to do a t test, the easiest way is with statistical software such as Prism or an online t test calculator.

If you’re using software, then all you need to know is which t test is appropriate (use the workflow above) and understand how to interpret the output. To do that, you’ll also need to:

  • Determine whether your test is one or two-tailed
  • Choose the level of significance

Is my test one or two-tailed? 

Whether or not you have a one- or two-tailed test depends on your research hypothesis. Choosing the appropriately tailed test is very important and requires integrity from the researcher. This is because you have more “power” with one-tailed tests, meaning that you can detect a statistically significant difference more easily. Unless you have written out your research hypothesis as one directional before you run your experiment, you should use a two-tailed test.

Two-tailed tests

Two-tailed tests are the most common, and they are applicable when your research question is simply asking, “is there a difference?”

One-tailed tests

Contrast that with one-tailed tests, where the research questions are directional, meaning that the question is either “is it greater than?” or “is it less than?” These tests can only detect a difference in one direction.

Choosing the level of significance

All t tests estimate whether a mean of a population is different than some other value, and with all estimates come some variability, or what statisticians call “error.” Before analyzing your data, you want to choose a level of significance, usually denoted by the Greek letter alpha, 𝛼. The scientific standard is setting alpha to be 0.05.

An alpha of 0.05 results in 95% confidence intervals, and determines the cutoff for when P values are considered statistically significant.

One sample t test

If you only have one sample of a list of numbers, you are doing a one-sample t test. All you are interested in doing is comparing the mean from this group with some known value to test whether there is evidence that it is significantly different from that standard. Use our free one-sample t test calculator for this.

A one sample t test example research question is, “Is the average fifth grader taller than four feet?”

It is the simplest version of a t test, and has all sorts of applications within hypothesis testing. Sometimes the “known value” is called the “null value”. While the null value in t tests is often 0, it could be any value. The name comes from being the value which exactly represents the null hypothesis, where no significant difference exists. 

Any time you know the exact number you are trying to compare your sample of data against, this could work well. And of course: it can be either one or two-tailed.

One sample t test formula

Statistical software handles this for you, but if you want the details, the formula for a one sample t test is:

\[t=\dfrac{M-\mu}{s / \sqrt{n}}\]

  • M: Calculated mean of your sample
  • μ: Hypothetical mean you are testing against
  • s: The standard deviation of your sample
  • n: The number of observations in your sample.

In a one-sample t test, calculating degrees of freedom is simple: one less than the number of objects in your dataset (you’ll see it written as n-1).

Example of a one sample t test

For our example within Prism, we have a dataset of 12 values from an experiment labeled “% of control”. Perhaps these are heights of a sample of plants that have been treated with a new fertilizer. A value of 100 represents the industry-standard control height. Likewise, 123 represents a plant with a height 123% that of the control (that is, 23% larger).

2 - one sample t test results

We’ll perform a two-tailed, one-sample t test to see if plants are shorter or taller on average with the fertilizer. We will use a significance threshold of 0.05. Here is the output:

You can see in the output that the actual sample mean was 111. Is that different enough from the industry standard (100) to conclude that there is a statistical difference?

The quick answer is yes, there’s strong evidence that the height of the plants with the fertilizer is greater than the industry standard (p=0.015). The nice thing about using software is that it handles some of the trickier steps for you. In this case, it calculates your test statistic (t=2.88), determines the appropriate degrees of freedom (11), and outputs a P value.

More informative than the P value is the confidence interval of the difference, which is 2.49 to 18.7. The confidence interval tells us that, based on our data, we are 95% confident that the true difference between the population mean and the baseline value of 100 is somewhere between 2.49 and 18.7. Whenever the difference is statistically significant, the interval will not contain zero.

You can follow these tips for interpreting your own one-sample test.

Graphing a one-sample t test

For some techniques (like regression), graphing the data is a very helpful part of the analysis. For t tests, making a chart of your data is still useful to spot any strange patterns or outliers, but the small sample size means you may already be familiar with any strange things in your data.

3- one sample t plot

Here we have a simple plot of the data points, perhaps with a mark for the average. We’ve made this as an example, but the truth is that graphing is usually more visually telling for two-sample t tests than for just one sample.

Two sample t tests

There are several kinds of two sample t tests, with the two main categories being paired and unpaired (independent) samples.

Paired samples t test

In a paired samples t test, also called dependent samples t test, there are two samples of data, and each observation in one sample is “paired” with an observation in the second sample. The most common example is when measurements are taken on each subject before and after a treatment. A paired t test example research question is, “Is there a statistical difference between the average red blood cell counts before and after a treatment?”

Having two samples that are closely related simplifies the analysis. Statistical software, such as this paired t test calculator, will simply take a difference between the two values, and then compare that difference to 0.

In some (rare) situations, taking a difference between the pairs violates the assumptions of a t test, because the average difference changes based on the size of the before value (e.g., there’s a larger difference between before and after when there were more to start with). In this case, instead of using a difference test, use a ratio of the before and after values, which is referred to as a ratio t test.

Paired t test formula

The formula for paired samples t test is:

\[t=\dfrac{M_{d}-0}{s_{d} / \sqrt{n}}\]

  • Md: Mean difference between the samples
  • sd: The standard deviation of the differences
  • n: The number of differences

Degrees of freedom are the same as before. If you’re studying for an exam, you can remember that the degrees of freedom are still n-1 (not n-2) because we are converting the data into a single column of differences rather than considering the two groups independently.

Also note that the null value here is simply 0. There is no real reason to include “minus 0” in an equation other than to illustrate that we are still doing a hypothesis test. After you take the difference between the two means, you are comparing that difference to 0.
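To make the "single column of differences" idea concrete, here is a minimal Python sketch of the paired calculation. The function name `paired_t` and the before/after values are hypothetical and chosen only for illustration; they are not the data behind the Prism example.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a paired samples t test on the differences."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Collapse the two samples into one column of differences.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    m_d = statistics.mean(diffs)    # mean difference (Md)
    s_d = statistics.stdev(diffs)   # standard deviation of the differences (sd)
    t = (m_d - 0) / (s_d / math.sqrt(n))  # null value is 0
    return t, n - 1

# Hypothetical measurements on five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 13, 11, 12, 15]

t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```

Because the test runs on one column of five differences, the degrees of freedom are 5 - 1 = 4, not 10 - 2 = 8.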

For our example data, we have five test subjects and have taken two measurements from each: before (“control”) and after a treatment (“treated”). If we set alpha = 0.05 and perform a two-tailed test, we observe a statistically significant difference between the treated and control group (p=0.0160, t=4.01, df = 4). We are 95% confident that the true mean difference between the treated and control group is between 0.449 and 2.47.

4-pairedtresults

Graphing a paired t test

The significant result of the P value suggests evidence that the treatment had some effect, and we can also look at this graphically. The lines that connect the observations can help us spot a pattern, if it exists. In this case the lines show that all observations increased after treatment. While not all graphics are this straightforward, here it is very consistent with the outcome of the t test. 

5-pairedt graph

Prism’s estimation plot is even more helpful because it shows both the data (like above) and the confidence interval for the difference between means. You can easily see the evidence of significance since the confidence interval on the right does not contain zero.

5B - estimation plot paired

Here are some more graphing tips for paired t tests.

Unpaired samples t test

Unpaired samples t test, also called independent samples t test, is appropriate when you have two sample groups that aren’t correlated with one another. A pharma example is testing a treatment group against a control group of different subjects. Compare that with a paired sample, which might be recording the same subjects before and after a treatment.

With unpaired t tests, in addition to choosing your level of significance and a one or two tailed test, you need to determine whether or not to assume that the variances between the groups are the same or not. If you assume equal variances, then you can “pool” the calculation of the standard error between the two samples. Otherwise, the standard choice is Welch’s t test which corrects for unequal variances. This choice affects the calculation of the test statistic and the power of the test, which is the test’s sensitivity to detect statistical significance. 

It’s best to choose whether you’ll use a pooled or unpooled (Welch’s) standard error before running your experiment, because the standard statistical test for equal variances is notoriously problematic.

As long as you’re using statistical software, such as this two-sample t test calculator, it’s just as easy to calculate a test statistic whether or not you assume that the variances of your two samples are the same. If you’re doing it by hand, however, the calculations get more complicated with unequal variances.

Unpaired (independent) samples t test formula

The general two-sample t test formula is:

\[t=\dfrac{M_{1}-M_{2}}{SE}\]

  • M1 and M2: Two means you are comparing, one from each dataset
  • SE : The combined standard error of the two samples (calculated using pooled or unpooled standard error)

The denominator (standard error) calculation can be complicated, as can the degrees of freedom. If the groups are not balanced (the same number of observations in each), you will need to account for both when determining n for the test as a whole.
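The two standard-error choices can be sketched as follows. This is an illustration under our own assumptions: the function names and the sample values are hypothetical, the pooled version uses the usual equal-variance formula with n1 + n2 - 2 degrees of freedom, and the Welch version uses the Welch-Satterthwaite approximation for its degrees of freedom.

```python
import math
import statistics

def pooled_t(a, b):
    """Unpaired t test assuming equal variances. Returns (t, df)."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / se, df

def welch_t(a, b):
    """Unpaired t test without assuming equal variances. Returns (t, df)."""
    n1, n2 = len(a), len(b)
    q1 = statistics.variance(a) / n1
    q2 = statistics.variance(b) / n2
    se = math.sqrt(q1 + q2)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return (statistics.mean(a) - statistics.mean(b)) / se, df

# Hypothetical unbalanced samples (6 and 5 observations).
group_a = [20.0, 22.0, 19.0, 24.0, 25.0, 22.0]
group_b = [18.0, 21.0, 17.0, 20.0, 19.0]

t_pooled, df_pooled = pooled_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)
print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.2f}")
```

With unbalanced groups, both sample sizes enter the standard error and the degrees of freedom, which is why the two versions give different results for the same data.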

As an example for this family, we conduct an unpaired samples t test assuming equal variances (pooled). Based on our research hypothesis, we’ll conduct a two-tailed test, and use alpha=0.05 for our level of significance. Our samples were unbalanced, with two samples of 6 and 5 observations respectively.

6-indeptresults

The P value (p=0.261, t = 1.20, df = 9) is higher than our threshold of 0.05. We have not found sufficient evidence to suggest a significant difference. You can see the confidence interval of the difference of the means is -9.58 to 31.2.

Note that the F-test result shows that the variances of the two groups are not significantly different from each other.

Graphing an unpaired samples t test

For an unpaired samples t test, graphing the data can quickly help you get a handle on the two groups and how similar or different they are. Like the paired example, this helps confirm the evidence (or lack thereof) that is found by doing the t test itself.

Below you can see that the observed mean for females is higher than that for males. But because of the variability in the data, we can’t tell if the means are actually different or if the difference is just by chance. 

7-indept graph

Nonparametric alternatives for t tests

If your data comes from a normal distribution (or something close enough to a normal distribution), then a t test is valid. If that assumption is violated, you can use nonparametric alternatives.

T tests evaluate whether the mean is different from another value, whereas nonparametric alternatives compare either the median or the rank. Medians are well-known to be much more robust to outliers than the mean.

The downside to nonparametric tests is that they don’t have as much statistical power, meaning a larger difference is required in order to determine that it’s statistically significant.

Wilcoxon signed-rank test

The Wilcoxon signed-rank test is the nonparametric cousin to the one-sample t test. This compares a sample median to a hypothetical median value. It is sometimes erroneously even called the Wilcoxon t test (even though it calculates a “W” statistic).

And if you have two related samples, you should use the Wilcoxon matched pairs test instead. The two versions of Wilcoxon are different, and the matched pairs version is specifically for comparing the median difference for paired samples. 

Mann-Whitney and Kolmogorov-Smirnov tests

For unpaired (independent) samples, there are multiple options for nonparametric testing. Mann-Whitney is more popular and compares the mean ranks (the ordering of values from smallest to largest) of the two samples. Mann-Whitney is often misrepresented as a comparison of medians, but that’s not always the case. Kolmogorov-Smirnov tests if the overall distributions differ between the two samples. 

More t test FAQs

What is the formula for a t test?

The exact formula depends on which type of t test you are running, although there is a basic structure that all t tests have in common. All t test statistics will have the form:

\[t=\dfrac{\text{Mean}_{1}-\text{Mean}_{2}}{\text{Standard Error of the Mean}}\]

  • t : The t test statistic you calculate for your test
  • Mean1 and Mean2: Two means you are comparing, at least 1 from your own dataset
  • Standard Error of the Mean: The standard error of the mean, also called the standard deviation of the mean, which takes into account the variance and size of your dataset

The exact formula for any t test can be slightly different, particularly the calculation of the standard error. Not only does it matter whether one or two samples are being compared, the relationship between the samples can make a difference too.

What is a t-distribution?

A t-distribution is similar to a normal distribution. It’s a bell-shaped curve, but compared to a normal it has fatter tails, which means that it’s more common to observe extremes. T-distributions are identified by the number of degrees of freedom. The higher the number, the closer the t-distribution gets to a normal distribution. After about 30 degrees of freedom, a t and a standard normal are practically the same.

1-tdistribution 4df
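The "fatter tails" claim can be checked numerically by comparing the probability of landing beyond a fixed cutoff under t-distributions with different degrees of freedom and under the standard normal. The sketch below is illustrative only: the helper names are our own, the cutoff of 2 is an arbitrary choice, and the tail areas are approximated with Simpson's rule.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

cutoff = 2.0
normal_tail = 1 - NormalDist().cdf(cutoff)

# Tail area beyond the cutoff for increasing degrees of freedom.
tails = {df: upper_tail_p(cutoff, df) for df in (4, 10, 30, 100)}

for df, tail in tails.items():
    print(f"df = {df:>3}: P(T > {cutoff}) = {tail:.4f}")
print(f"normal  : P(Z > {cutoff}) = {normal_tail:.4f}")
```

The tail area shrinks toward the normal value as the degrees of freedom grow, which is the sense in which the two distributions become practically the same.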

What are degrees of freedom?

Degrees of freedom are a measure of how large your dataset is. They aren’t exactly the number of observations, because they also take into account the number of parameters (e.g., mean, variance) that you have estimated.

What is the difference between paired vs unpaired t tests?

Both paired and unpaired t tests involve two sample groups of data. With a paired t test, the values in each group are related (usually they are before and after values measured on the same test subject). In contrast, with unpaired t tests, the observed values aren’t related between groups. An unpaired, or independent t test, example is comparing the average height of children at school A vs school B. 

When do I use a z-test versus a t test?

Z-tests, which compare data using a normal distribution rather than a t-distribution, are primarily used for two situations. The first is when you’re evaluating proportions (e.g., the number of failures on an assembly line). The second is when your sample size is large enough (usually around 30) that you can use a normal approximation to evaluate the means.

When should I use ANOVA instead of a t test?

Use ANOVA if you have more than two group means to compare.

What are the differences between t test vs chi square?

Chi square tests are used to evaluate contingency tables, which record a count of the number of subjects that fall into particular categories (e.g., truck, SUV, car). T tests compare the mean(s) of a variable of interest (e.g., height, weight).

What are P values?

A P value is the probability that you would get data as extreme as, or more extreme than, the observed data, given that the null hypothesis is true. It's a mouthful, and there are a lot of issues to be aware of with P values.

What are t test critical values?

Critical values are a classical form (they aren’t used directly with modern computing) of determining if a statistical test is significant or not. Historically you could calculate your test statistic from your data, and then use a t-table to look up the cutoff value (critical value) that represented a “significant” result. You would then compare your observed statistic against the critical value.
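To make the table lookup concrete, here is a minimal sketch that recovers a critical value numerically instead of reading it from a t-table. It is an illustration rather than how any particular package does it: the helper names (`t_pdf`, `t_cdf`, `t_critical`) are made up for this example, the area under the curve comes from Simpson's rule, and the cutoff is found by bisection.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=4000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_critical(alpha, df, tails=2):
    """Positive cutoff whose upper-tail area is alpha / tails (bisection)."""
    target = 1 - alpha / tails
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"two-tailed, alpha=0.05, df=19: {t_critical(0.05, 19):.3f}")
print(f"one-tailed, alpha=0.05, df=3 : {t_critical(0.05, 3, tails=1):.3f}")
```

An observed statistic whose absolute value exceeds the two-tailed cutoff would be called significant at that alpha.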

How do I calculate degrees of freedom for my t test?

In most practical usage, degrees of freedom are the number of observations you have minus the number of parameters you are trying to estimate. The calculation isn’t always straightforward and is approximated for some t tests.

Statistical software calculates degrees of freedom automatically as part of the analysis, so understanding them in more detail isn’t needed beyond assuaging any curiosity.
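For the simplest cases the "observations minus estimated parameters" rule reduces to one line each. The helpers below are a sketch with made-up names; they cover the one-sample, paired, and pooled (equal-variance) two-sample tests only, not the approximated cases.

```python
def df_one_sample(n):
    """One-sample t test: n observations, one estimated mean."""
    return n - 1

def df_paired(n_pairs):
    """Paired t test: a one-sample test on the n_pairs differences."""
    return n_pairs - 1

def df_pooled(n1, n2):
    """Two-sample t test with pooled variance: two estimated means."""
    return n1 + n2 - 2

print(df_one_sample(4), df_paired(20), df_pooled(20, 20))
```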


Independent t-test for two samples

Introduction.

The independent t-test, also called the two sample t-test, independent-samples t-test or Student's t-test, is an inferential statistical test that determines whether there is a statistically significant difference between the means in two unrelated groups.

Null and alternative hypotheses for the independent t-test

The null hypothesis for the independent t-test is that the population means from the two unrelated groups are equal:

\(H_0: \mu_1 = \mu_2\)

In most cases, we are looking to see if we can reject the null hypothesis in favor of the alternative hypothesis, which is that the population means are not equal:

\(H_A: \mu_1 \neq \mu_2\)

To do this, we need to set a significance level (also called alpha) that allows us to either reject or fail to reject the null hypothesis. Most commonly, this value is set at 0.05.

What do you need to run an independent t-test?

In order to run an independent t-test, you need the following:

  • One independent, categorical variable that has two levels/groups.
  • One continuous dependent variable.

Unrelated groups

Unrelated groups, also called unpaired groups or independent groups, are groups in which the cases (e.g., participants) in each group are different. Often we are investigating differences in individuals, which means that when comparing two groups, an individual in one group cannot also be a member of the other group and vice versa. An example would be gender - an individual would have to be classified as either male or female – not both.

Assumption of normality of the dependent variable

The independent t-test requires that the dependent variable is approximately normally distributed within each group.

Note: Technically, it is the residuals that need to be normally distributed, but for an independent t-test, both will give you the same result.

You can test for this using a number of different tests, but the Shapiro-Wilk test of normality or a graphical method, such as a Q-Q Plot, are very common. You can run these tests using SPSS Statistics, the procedure for which can be found in our Testing for Normality guide. However, the t-test is described as a robust test with respect to the assumption of normality. This means that some deviation away from normality does not have a large influence on Type I error rates. The exception to this is if the ratio of the largest to the smallest group size is greater than 1.5.

What to do when you violate the normality assumption

If you find that the data in one or both of your groups are not approximately normally distributed and group sizes differ greatly, you have two options: (1) transform your data so that the data become normally distributed (to do this in SPSS Statistics see our guide on Transforming Data), or (2) run the Mann-Whitney U test, which is a non-parametric test that does not require the assumption of normality (to run this test in SPSS Statistics see our guide on the Mann-Whitney U Test).

Assumption of homogeneity of variance

The independent t-test assumes the variances of the two groups you are measuring are equal in the population. If your variances are unequal, this can affect the Type I error rate. The assumption of homogeneity of variance can be tested using Levene's Test of Equality of Variances, which is produced in SPSS Statistics when running the independent t-test procedure. If you have run Levene's Test of Equality of Variances in SPSS Statistics, you will get a result similar to that below:

Levene's Test for Equality of Variances in the Independent T-Test Procedure within SPSS

This test for homogeneity of variance provides an F-statistic and a significance value (p-value). We are primarily concerned with the significance value: if it is greater than 0.05 (i.e., p > .05), our group variances can be treated as equal. However, if p < 0.05, we have unequal variances and we have violated the assumption of homogeneity of variances.

Overcoming a violation of the assumption of homogeneity of variance

If the Levene's Test for Equality of Variances is statistically significant, which indicates that the group variances are unequal in the population, you can correct for this violation by not using the pooled estimate for the error term for the t-statistic, but instead using an adjustment to the degrees of freedom using the Welch-Satterthwaite method. In all reality, you will probably never have heard of these adjustments because SPSS Statistics hides this information and simply labels the two options as "Equal variances assumed" and "Equal variances not assumed" without explicitly stating the underlying tests used. However, you can see the evidence of these tests as below:
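The difference between the two rows can be sketched from summary statistics. The code below is a minimal illustration under stated assumptions: the function names are invented for this example, the formulas are the standard pooled and Welch-Satterthwaite ones, and the input numbers are hypothetical (they are not the values from the SPSS output discussed here).

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """'Equal variances assumed': pooled error term. Returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

def welch_t(m1, s1, n1, m2, s2, n2):
    """'Equal variances not assumed': separate variances and
    Welch-Satterthwaite degrees of freedom. Returns (t, df)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (m1 - m2) / se, df

# Hypothetical groups: the smaller group has the larger spread.
group1 = (10.0, 1.0, 30)  # mean, standard deviation, sample size
group2 = (8.0, 4.0, 8)

t_p, df_p = pooled_t(*group1, *group2)
t_w, df_w = welch_t(*group1, *group2)
print(f"pooled: t({df_p}) = {t_p:.3f}")
print(f"Welch : t({df_w:.3f}) = {t_w:.3f}")
```

In this made-up configuration both the t-statistic and the degrees of freedom are smaller in the Welch row. When the two groups have equal sizes and equal standard deviations, the two rows coincide.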

Differences in the t-statistic and the degrees of freedom when homogeneity of variance is not assumed

From the result of Levene's Test for Equality of Variances, we can reject the null hypothesis that there is no difference in the variances between the groups and conclude that there is a statistically significant difference in the variances between groups. The effect of not being able to assume equal variances is evident in the final column of the above figure, where we see a reduction in the value of the t-statistic and a large reduction in the degrees of freedom (df). This has the effect of increasing the p-value above the critical significance level of 0.05. In this case, we therefore fail to reject the null hypothesis and conclude that there is no statistically significant difference between the means. This would not have been our conclusion had we not tested for homogeneity of variances.


Reporting the result of an independent t-test

When reporting the result of an independent t-test, you need to include the t-statistic value, the degrees of freedom (df) and the significance value of the test (p-value). The format of the test result is: t(df) = t-statistic, p = significance value. Therefore, for the example above, you could report the result as t(7.001) = 2.233, p = 0.061.
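The reporting format is easy to produce programmatically. The helper below is only a sketch; `format_t_result` is a name invented here, and the choice of three decimal places simply mirrors the example above.

```python
def format_t_result(t_stat, df, p_value, decimals=3):
    """Return 't(df) = t-statistic, p = significance value' as a string."""
    # Whole-number df (pooled test) prints as-is; fractional df (Welch) is rounded.
    df_text = str(df) if isinstance(df, int) else f"{df:.{decimals}f}"
    return f"t({df_text}) = {t_stat:.{decimals}f}, p = {p_value:.{decimals}f}"

print(format_t_result(2.233, 7.001, 0.061))
print(format_t_result(2.470, 38, 0.018))
```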

Fully reporting your results

In order to provide enough information for readers to fully understand the results when you have run an independent t-test, you should include the result of normality tests, Levene's Equality of Variances test, the two group means and standard deviations, the actual t-test result and the direction of the difference (if any). In addition, you might also wish to include the difference between the groups along with a 95% confidence interval. For example:

Inspection of Q-Q Plots revealed that cholesterol concentration was normally distributed for both groups and that there was homogeneity of variance as assessed by Levene's Test for Equality of Variances. Therefore, an independent t-test was run on the data with a 95% confidence interval (CI) for the mean difference. It was found that after the two interventions, cholesterol concentrations in the dietary group (6.15 ± 0.52 mmol/L) were significantly higher than in the exercise group (5.80 ± 0.38 mmol/L) (t(38) = 2.470, p = 0.018), with a difference of 0.35 (95% CI, 0.06 to 0.64) mmol/L.

To know how to run an independent t-test in SPSS Statistics, see our SPSS Statistics Independent-Samples T-Test guide. Alternatively, you can carry out an independent-samples t-test using Excel, R and RStudio.


5.2 - Writing Hypotheses

The first step in conducting a hypothesis test is to write the hypothesis statements that are going to be tested. For each test you will have a null hypothesis (\(H_0\)) and an alternative hypothesis (\(H_a\)).

When writing hypotheses there are three things that we need to know: (1) the parameter that we are testing (2) the direction of the test (non-directional, right-tailed or left-tailed), and (3) the value of the hypothesized parameter.

  • At this point we can write hypotheses for a single mean (\(\mu\)), paired means (\(\mu_d\)), a single proportion (\(p\)), the difference between two independent means (\(\mu_1-\mu_2\)), the difference between two proportions (\(p_1-p_2\)), a simple linear regression slope (\(\beta\)), and a correlation (\(\rho\)).
  • The research question will give us the information necessary to determine if the test is two-tailed (e.g., "different from," "not equal to"), right-tailed (e.g., "greater than," "more than"), or left-tailed (e.g., "less than," "fewer than").
  • The research question will also give us the hypothesized parameter value. This is the number that goes in the hypothesis statements (i.e., \(\mu_0\) and \(p_0\)). For the difference between two groups, regression, and correlation, this value is typically 0.
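The three ingredients listed above (parameter, direction, hypothesized value) are enough to write both statements mechanically. The following sketch shows one way to do that; the function name and the direction labels are choices made for this example.

```python
def write_hypotheses(parameter, null_value, direction):
    """Build (null, alternative) statements from the three ingredients."""
    signs = {"two-tailed": "≠", "right-tailed": ">", "left-tailed": "<"}
    if direction not in signs:
        raise ValueError(f"direction must be one of {sorted(signs)}")
    # The null hypothesis always carries the equality.
    null = f"H0: {parameter} = {null_value}"
    alternative = f"Ha: {parameter} {signs[direction]} {null_value}"
    return null, alternative

print(write_hypotheses("μ", 30, "right-tailed"))
print(write_hypotheses("μ1 - μ2", 0, "two-tailed"))
```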

Hypotheses are always written in terms of population parameters (e.g., \(p\) and \(\mu\)).  The tables below display all of the possible hypotheses for the parameters that we have learned thus far. Note that the null hypothesis always includes the equality (i.e., =).


Section 3.3: Independent T-test Assumptions, Interpretation, and Write Up

Learning Objectives

At the end of this chapter you should be able to answer the following questions:

  • Is the Independent T-test a Between Groups or Within Groups test?
  • How many assumptions underpin the Independent Samples T-test?
  • What is the first test to examine within the Independent Groups T-test output?
  • What is the second test to examine within the Independent Groups T-test output?
  • What elements or individual statistics should be reported when writing up an Independent T-test?

An Independent T-test or Independent Samples T-test is an important test for Between Groups differences.

Here we will discuss the underlying assumptions of the Independent t-test and explain how to interpret the results of the t-test. There are a number of assumptions that need to be met before performing an Independent t-test:

  • The dependent variable (the variable of interest) needs a continuous scale (i.e., the data needs to be at either an interval or ratio measurement). An example of a continuous dependent variable might be the weight of an athlete. Their weight could be anywhere between 50 and 70 kilograms.
  • The independent variable needs to be categorical with exactly two levels, forming two independent groups. An example of this independent variable could be regional vs metropolitan Australians.
  • The data should have independence of observations. More specifically,  there shouldn’t be the same participants in both groups.
  • The dependent variable should be normally or near-to-normally distributed for each group. The t-test is robust to minor violations of normality. However, if your data are very non-normal, it might be worth using a non-parametric test or bootstrapping (see later chapters for more information).
  • There should be no spurious outliers.
  • The data must have homogeneity of variances. This assumption can be tested using Levene’s test for homogeneity of variances in the statistics package which is shown in the output included in the next chapter.

Independent T-test Interpretation

The order of interpreting test statistics can be important and there are multiple test statistics to interpret within the Independent Groups T-test output.

Keep in mind that we are examining two groups of individuals. In this example, we are looking at metropolitan versus regional Australians. The dependent or outcome variable is mental distress.

And here we have the output from the T-test.

PowerPoint: Independent T-test Output

You will need to click on the below link to access the output:

  • Chapter Three Independent T-test Output

Table of statistics on mental distress

Green: Levene’s test

Red: Test statistics

Blue: Means and standard deviations

Green: The first thing you should examine is Levene’s test. If this test is nonsignificant, that means you have homogeneity of variance between the two groups on the dependent or outcome variable. If Levene’s test is significant, this means that the two groups did not show homogeneity of variance on the dependent or outcome variable.  In our example, Levene’s test is nonsignificant so we can move on to the statistics for the tests under the condition of equal variances assumed.

You should notice that there are two lines or rows of statistics given in the output.  The first row, which we are using, provides statistics for the tests under the condition of equal variances assumed.  The second row, which we are not using, provides statistics for the tests under the condition of equal variances not assumed.

Red: The next thing you should look at is the t value, the degrees of freedom, and the p value statistics in the first or top row of the output. The p-value of .024 shows that there is a significant difference in levels of mental distress reported by metropolitan and regional Australians. If we look at the mean scores, we can tell that regional Australians reported higher levels of mental distress (38.867) than the Australians who live in major cities (35.904).

You will also notice that there is a 95% CI presented, which is a 95% Confidence Interval of the difference.  This CI has a lower limit at -5.525 and an upper limit at -.401.  Because the CI does not include 0 we can infer that the difference between the two groups does exist in the population.

Blue: Next, make sure you have a look at the mean, standard deviation, and sample size (N) for both groups. You can get the effect size (Cohen's d) by using an effect size calculator.

You may find an effect size calculator here: https://www.socscistatistics.com/effectsize/default3.aspx

If you enter the mean, standard deviation, and sample size for both groups, you should get a Cohen's d of .239.
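The effect size can also be computed directly. The sketch below is hedged in one important way: the sample sizes are not reproduced in this section, so it weights the two standard deviations equally (root mean square) instead of pooling by sample size, which is an assumption of this example. The function name `cohens_d` is also chosen here.

```python
import math

def cohens_d(mean1, sd1, mean2, sd2):
    """Cohen's d with an equally weighted pooled standard deviation.

    Assumption: group sizes are similar enough that equal weighting is
    a reasonable stand-in for the sample-size-weighted pooled SD.
    """
    pooled_sd = math.sqrt((sd1**2 + sd2**2) / 2)
    return (mean1 - mean2) / pooled_sd

# Regional vs metropolitan means, with the SDs from the write-up.
d = cohens_d(38.867, 12.69, 35.904, 12.10)
print(round(d, 3))
```

With these inputs the result rounds to the same .239 that the calculator reports.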

Independent T-test Write-Up

You will need to report the Means and SD for each group, along with the t test statistic (t), its p value, and its effect size d.

It is common in many formats to round to two decimal places. Therefore, a Write-Up for an Independent T-test should look like this:

An independent samples t-test showed that the metropolitan sample (M = 35.90, SD = 12.10) reported lower levels of mental distress (t = -2.27, p = .024, d = .24) than the regional sample (M = 38.87, SD = 12.69).

Statistics for Research Students Copyright © 2022 by University of Southern Queensland is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.


Statology

Statistics Made Easy

Paired Samples t-test: Definition, Formula, and Example

A paired samples t-test is used to compare the means of two samples when each observation in one sample can be paired with an observation in the other sample.

This tutorial explains the following:

  • The motivation for performing a paired samples t-test.
  • The formula to perform a paired samples t-test.
  • The assumptions that should be met to perform a paired samples t-test.
  • An example of how to perform a paired samples t-test.

Paired Samples t-test: Motivation

A paired samples t-test is commonly used in two scenarios:

1. A measurement is taken on a subject before and after some treatment – e.g. the max vertical jump of college basketball players is measured before and after participating in a training program.

2. A measurement is taken under two different conditions  – e.g. the response time of a patient is measured on two different drugs.

In both cases we are interested in comparing the mean measurement between two groups in which each observation in one sample can be paired with an observation in the other sample.

Paired Samples t-test: Formula

A paired samples t-test always uses the following null hypothesis:

  • \(H_0: \mu_1 = \mu_2\) (the two population means are equal)

The alternative hypothesis can be either two-tailed, left-tailed, or right-tailed:

  • \(H_1\) (two-tailed): \(\mu_1 \neq \mu_2\) (the two population means are not equal)
  • \(H_1\) (left-tailed): \(\mu_1 < \mu_2\) (population 1 mean is less than population 2 mean)
  • \(H_1\) (right-tailed): \(\mu_1 > \mu_2\) (population 1 mean is greater than population 2 mean)

We use the following formula to calculate the test statistic t:

\(t = \dfrac{\bar{x}_{diff}}{s_{diff}/\sqrt{n}}\)

  • \(\bar{x}_{diff}\): sample mean of the differences
  • \(s_{diff}\): sample standard deviation of the differences
  • \(n\): sample size (i.e. number of pairs)

If the p-value that corresponds to the test statistic t with (n-1) degrees of freedom is less than your chosen significance level (common choices are 0.10, 0.05, and 0.01) then you can reject the null hypothesis.
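As a rough sketch of the formula applied to raw data, the function below takes two equally long lists of paired measurements, forms the differences, and returns the test statistic with its degrees of freedom. The function name and the tiny dataset are made up for illustration.

```python
import math
import statistics

def paired_t(sample1, sample2):
    """Paired samples t statistic and degrees of freedom.

    Differences are taken as sample1 - sample2, pair by pair.
    """
    if len(sample1) != len(sample2):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(sample1, sample2)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1

# Hypothetical before/after measurements for four subjects.
before = [1, 2, 3, 4]
after = [2, 4, 5, 8]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```

A negative statistic here means the second measurement tends to be larger than the first.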

Paired Samples t-test: Assumptions

For the results of a paired samples t-test to be valid, the following assumptions should be met:

  • The participants should be selected randomly from the population.
  • The differences between the pairs should be approximately normally distributed.
  • There should be no extreme outliers in the differences.

Paired Samples t-test: Example

Suppose we want to know whether or not a certain training program is able to increase the max vertical jump (in inches) of college basketball players.

To test this, we may recruit a simple random sample of 20 college basketball players and measure each of their max vertical jumps. Then, we may have each player use the training program for one month and then measure their max vertical jump again at the end of the month.

Paired t-test example dataset

To determine whether or not the training program actually had an effect on max vertical jump, we will perform a paired samples t-test at significance level α = 0.05 using the following steps:

Step 1: Calculate the summary data for the differences.

Paired samples t-test dataset

  • \(\bar{x}_{diff}\): sample mean of the differences = -0.95
  • \(s_{diff}\): sample standard deviation of the differences = 1.317
  • \(n\): sample size (i.e. number of pairs) = 20

Step 2: Define the hypotheses.

We will perform the paired samples t-test with the following hypotheses:

  • \(H_0: \mu_1 = \mu_2\) (the two population means are equal)
  • \(H_1: \mu_1 \neq \mu_2\) (the two population means are not equal)

Step 3: Calculate the test statistic t.

\(t = \dfrac{\bar{x}_{diff}}{s_{diff}/\sqrt{n}} = \dfrac{-0.95}{1.317/\sqrt{20}} = -3.226\)

Step 4: Calculate the p-value of the test statistic t.

According to the T Score to P Value Calculator, the p-value associated with t = -3.226 and degrees of freedom = n-1 = 20-1 = 19 is 0.00445.
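Steps 3 and 4 can be reproduced without an online calculator. The sketch below is an assumption-laden illustration, not the calculator's method: it integrates the t density with Simpson's rule to get a two-sided p-value, and the helper names (`t_pdf`, `two_sided_p`) are invented for this example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), via Simpson's rule on [0, |t_stat|]."""
    a = abs(t_stat)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # P(-a < T < a), by symmetry
    return 1 - central_area

# Summary data from Step 1.
mean_diff, sd_diff, n = -0.95, 1.317, 20
t_stat = mean_diff / (sd_diff / math.sqrt(n))
p_value = two_sided_p(t_stat, n - 1)
print(f"t = {t_stat:.3f}, p = {p_value:.5f}")
```

The result should agree closely with the values quoted above, which leads to the same decision in Step 5.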

Step 5: Draw a conclusion.

Since this p-value is less than our significance level α = 0.05, we reject the null hypothesis. We have sufficient evidence to say that the mean max vertical jump of players is different before and after participating in the training program.

Note: You can also perform this entire paired samples t-test by simply using the Paired Samples t-test Calculator.

Additional Resources

The following tutorials explain how to perform a paired samples t-test using different statistical programs:

  • How to Perform a Paired Samples t-Test in Excel
  • How to Perform a Paired Samples t-test in SPSS
  • How to Perform a Paired Samples t-test in Stata
  • How to Perform a Paired Samples t-test on a TI-84 Calculator
  • How to Perform a Paired Samples t-test in R
  • How to Perform a Paired Samples t-Test in Python
  • How to Perform a Paired Samples t-Test by Hand


Published by Zach
