
One Sample t-test: Definition, Formula, and Example

A  one sample t-test  is used to test whether or not the mean of a population is equal to some value.

This tutorial explains the following:

  • The motivation for performing a one sample t-test.
  • The formula to perform a one sample t-test.
  • The assumptions that should be met to perform a one sample t-test.
  • An example of how to perform a one sample t-test.

One Sample t-test: Motivation

Suppose we want to know whether or not the mean weight of a certain species of turtle in Florida is equal to 310 pounds. Since there are thousands of turtles in Florida, it would be extremely time-consuming and costly to go around and weigh each individual turtle.

Instead, we might take a simple random sample of 40 turtles and use the mean weight of the turtles in this sample to estimate the true population mean:

Sample from population example

However, it’s virtually guaranteed that the mean weight of turtles in our sample will differ from 310 pounds. The question is whether or not this difference is statistically significant . Fortunately, a one sample t-test allows us to answer this question.

One Sample t-test: Formula

A one-sample t-test always uses the following null hypothesis:

  • H0: μ = μ0 (population mean is equal to some hypothesized value μ0)

The alternative hypothesis can be either two-tailed, left-tailed, or right-tailed:

  • H1 (two-tailed): μ ≠ μ0 (population mean is not equal to some hypothesized value μ0)
  • H1 (left-tailed): μ < μ0 (population mean is less than some hypothesized value μ0)
  • H1 (right-tailed): μ > μ0 (population mean is greater than some hypothesized value μ0)

We use the following formula to calculate the test statistic t:

t = (x̄ – μ0) / (s/√n)

  • x̄: sample mean
  • μ0: hypothesized population mean
  • s: sample standard deviation
  • n: sample size

If the p-value that corresponds to the test statistic t with (n-1) degrees of freedom is less than your chosen significance level (common choices are 0.10, 0.05, and 0.01) then you can reject the null hypothesis.

One Sample t-test: Assumptions

For the results of a one sample t-test to be valid, the following assumptions should be met:

  • The variable under study should be either an interval or ratio variable .
  • The observations in the sample should be independent .
  • The variable under study should be approximately normally distributed.  You can check this assumption by creating a histogram and visually checking if the distribution has roughly a “bell shape.”
  • The variable under study should have no outliers. You can check this assumption by creating a boxplot and visually checking for outliers.

One Sample t-test: Example

Suppose we want to know whether or not the mean weight of a certain species of turtle is equal to 310 pounds. To test this, we will perform a one-sample t-test at significance level α = 0.05 using the following steps:

Step 1: Gather the sample data.

Suppose  we collect a random sample of turtles with the following information:

  • Sample size n = 40
  • Sample mean weight x̄ = 300
  • Sample standard deviation s = 18.5

Step 2: Define the hypotheses.

We will perform the one sample t-test with the following hypotheses:

  • H0: μ = 310 (population mean is equal to 310 pounds)
  • H1: μ ≠ 310 (population mean is not equal to 310 pounds)

Step 3: Calculate the test statistic  t .

t = (x̄ – μ0) / (s/√n) = (300 – 310) / (18.5/√40) = -3.4187

Step 4: Calculate the p-value of the test statistic  t .

According to the T Score to P Value Calculator, the p-value associated with t = -3.4187 and degrees of freedom = n-1 = 40-1 = 39 is 0.00149.

Step 5: Draw a conclusion.

Since this p-value is less than our significance level α = 0.05, we reject the null hypothesis. We have sufficient evidence to say that the mean weight of this species of turtle is not equal to 310 pounds.

Note:  You can also perform this entire one sample t-test by simply using the One Sample t-test calculator .
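For readers working in Python, the test statistic and p-value can also be reproduced from the summary statistics alone. The following is a minimal sketch using SciPy (not the calculator mentioned above); the numbers are taken from the worked example.

```python
from math import sqrt
from scipy import stats

# Summary statistics from the turtle example above
n, xbar, s = 40, 300.0, 18.5
mu0 = 310.0  # hypothesized population mean

se = s / sqrt(n)                                 # standard error of the mean
t_stat = (xbar - mu0) / se                       # test statistic
p_two_sided = 2 * stats.t.sf(abs(t_stat), n - 1) # two-tailed p-value

print(round(t_stat, 4), round(p_two_sided, 5))   # ≈ -3.4187 and ≈ 0.0015, matching the values above
```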

Additional Resources

The following tutorials explain how to perform a one-sample t-test using different statistical programs:

  • How to Perform a One Sample t-test in Excel
  • How to Perform a One Sample t-test in SPSS
  • How to Perform a One Sample t-test in Stata
  • How to Perform a One Sample t-test in R
  • How to Conduct a One Sample t-test in Python
  • How to Perform a One Sample t-test on a TI-84 Calculator



The One-Sample t-Test

What is the one-sample t-test?

The one-sample t-test is a statistical hypothesis test used to determine whether an unknown population mean is different from a specific value.

When can I use the test?

You can use the test for continuous data. Your data should be a random sample from a normal population.

What if my data isn’t nearly normally distributed?

If your sample sizes are very small, you might not be able to test for normality. You might need to rely on your understanding of the data. When you cannot safely assume normality, you can perform a nonparametric test that doesn’t assume normality.

Using the one-sample t-test

See how to perform a one-sample t -test using statistical software.

  • Download JMP to follow along using the sample data included with the software.
  • To see more JMP tutorials, visit the JMP Learning Library .

The sections below discuss what we need for the test, checking our data, performing the test, understanding test results and statistical details.

What do we need?

For the one-sample t -test, we need one variable.

We also have an idea, or hypothesis, that the mean of the population has some value. Here are two examples:

  • A hospital has a random sample of cholesterol measurements for men. These patients were seen for issues other than cholesterol. They were not taking any medications for high cholesterol. The hospital wants to know if the unknown mean cholesterol for patients is different from a goal level of 200 mg.
  • We measure the grams of protein for a sample of energy bars. The label claims that the bars have 20 grams of protein. We want to know if the labels are correct or not.

One-sample t-test assumptions

For a valid test, we need data values that are:

  • Independent (values are not related to one another).
  • Continuous.
  • Obtained via a simple random sample from the population.

Also, the population is assumed to be normally distributed .

One-sample t-test example

Imagine we have collected a random sample of 31 energy bars from a number of different stores to represent the population of energy bars available to the general consumer. The labels on the bars claim that each bar contains 20 grams of protein.

Table 1: Grams of protein in random sample of energy bars

If you look at the table above, you see that some bars have less than 20 grams of protein. Other bars have more. You might think that the data support the idea that the labels are correct. Others might disagree. The statistical test provides a sound method to make a decision, so that everyone makes the same decision on the same set of data values. 

Checking the data

Let’s start by answering: Is the t -test an appropriate method to test that the energy bars have 20 grams of protein ? The list below checks the requirements for the test.

  • The data values are independent. The grams of protein in one energy bar do not depend on the grams in any other energy bar. An example of dependent values would be if you collected energy bars from a single production lot. A sample from a single lot is representative of that lot, not energy bars in general.
  • The data values are grams of protein. The measurements are continuous.
  • We assume the energy bars are a simple random sample from the population of energy bars available to the general consumer (i.e., a mix of lots of bars).
  • We assume the population from which we are collecting our sample is normally distributed, and for large samples, we can check this assumption.

We decide that the t -test is an appropriate method.

Before jumping into analysis, we should take a quick look at the data. The figure below shows a histogram and summary statistics for the energy bars.

Histogram and summary statistics for the grams of protein in energy bars

From a quick look at the histogram, we see that there are no unusual points, or outliers . The data look roughly bell-shaped, so our assumption of a normal distribution seems reasonable.

From a quick look at the statistics, we see that the average is 21.40, above 20. Does this  average from our sample of 31 bars invalidate the label's claim of 20 grams of protein for the unknown entire population mean? Or not?

How to perform the one-sample t-test

For the t -test calculations we need the mean, standard deviation and sample size. These are shown in the summary statistics section of Figure 1 above.

We round the statistics to two decimal places. Software will show more decimal places, and use them in calculations. (Note that Table 1 shows only two decimal places; the actual data used to calculate the summary statistics has more.)

We start by finding the difference between the sample mean and 20:

$ 21.40-20\ =\ 1.40$

Next, we calculate the standard error for the mean. The calculation is:

Standard Error for the mean = $ \frac{s}{\sqrt{n}}= \frac{2.54}{\sqrt{31}}=0.456 $

This matches the value in Figure 1 above.

We now have the pieces for our test statistic. We calculate our test statistic as:

$ t =  \frac{\text{Difference}}{\text{Standard Error}}= \frac{1.40}{0.456}=3.07 $

To make our decision, we compare the test statistic to a value from the t- distribution. This activity involves four steps.

  • We calculate a test statistic. Our test statistic is 3.07.
  • We decide on the risk we are willing to take for declaring a difference when there is not a difference. For the energy bar data, we decide that we are willing to take a 5% risk of saying that the unknown population mean is different from 20 when in fact it is not. In statistics-speak, we set α = 0.05. In practice, the risk level (α) should be set before collecting the data.

  • We find the value from the t-distribution based on our decision. For a t-test, we need the degrees of freedom to find this value. The degrees of freedom are based on the sample size. For the energy bar data:

degrees of freedom = $ n - 1 = 31 - 1 = 30 $

The critical value of t with α = 0.05 and 30 degrees of freedom is +/- 2.042. Most statistics books have look-up tables for the distribution. You can also find tables online. The most likely situation is that you will use software and will not use printed tables.

  • We compare the value of our statistic (3.07) to the t value. Since 3.07 > 2.042, we reject the null hypothesis that the mean grams of protein is equal to 20. We make a practical conclusion that the labels are incorrect, and the population mean grams of protein is greater than 20.
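As a cross-check, the hand calculation above can be reproduced from the rounded summary statistics. This is a minimal sketch in Python using SciPy; it is not JMP output, just the same arithmetic.

```python
from math import sqrt
from scipy import stats

# Rounded summary statistics for the energy bar sample
n, xbar, s = 31, 21.40, 2.54
mu0 = 20.0      # label claim (null hypothesis value)
alpha = 0.05

se = s / sqrt(n)                              # 2.54 / sqrt(31) ≈ 0.456
t_stat = (xbar - mu0) / se                    # ≈ 3.07
t_crit = stats.t.ppf(1 - alpha / 2, n - 1)    # two-sided critical value ≈ 2.042

print(t_stat, t_crit, abs(t_stat) > t_crit)   # True -> reject the null hypothesis
```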

Statistical details

Let’s look at the energy bar data and the 1-sample t -test using statistical terms.

Our null hypothesis is that the underlying population mean is equal to 20. The null hypothesis is written as:

$ H_0: \mu = 20 $

The alternative hypothesis is that the underlying population mean is not equal to 20. The labels claiming 20 grams of protein would be incorrect. This is written as:

$ H_a: \mu \neq 20 $

This is a two-sided test. We are testing if the population mean is different from 20 grams in either direction. If we can reject the null hypothesis that the mean is equal to 20 grams, then we make a practical conclusion that the labels for the bars are incorrect. If we cannot reject the null hypothesis, then we make a practical conclusion that the labels for the bars may be correct.

We calculate the average for the sample and then calculate the difference with the population mean, mu:

$  \overline{x} - \mathrm{\mu} $

We calculate the standard error as:

$ \frac{s}{ \sqrt{n}} $

The formula shows the sample standard deviation as s and the sample size as n .  

The test statistic uses the formula shown below:

$  \dfrac{\overline{x} - \mathrm{\mu}} {s / \sqrt{n}} $

We compare the test statistic to a t value with our chosen alpha value and the degrees of freedom for our data. Using the energy bar data as an example, we set α = 0.05. The degrees of freedom ( df ) are based on the sample size and are calculated as:

$ df = n - 1 = 31 - 1 = 30 $

Statisticians write the t value with α = 0.05 and 30 degrees of freedom as:

$ t_{0.05,30} $

The t value for a two-sided test with α = 0.05 and 30 degrees of freedom is +/- 2.042. There are two possible results from our comparison:

  • The test statistic is less extreme than the critical  t  values; in other words, the test statistic is not less than -2.042, or is not greater than +2.042. You fail to reject the null hypothesis that the mean is equal to the specified value. In our example, you would be unable to conclude that the label for the protein bars should be changed.
  • The test statistic is more extreme than the critical  t  values; in other words, the test statistic is less than -2.042, or is greater than +2.042. You reject the null hypothesis that the mean is equal to the specified value. In our example, you conclude that either the label should be updated or the production process should be improved to produce, on average, bars with 20 grams of protein.

Testing for normality

The normality assumption is more important for small sample sizes than for larger sample sizes.

Normal distributions are symmetric, which means they are “even” on both sides of the center. Normal distributions do not have extreme values, or outliers. You can check these two features of a normal distribution with graphs. Earlier, we decided that the energy bar data was “close enough” to normal to go ahead with the assumption of normality. The figure below shows a normal quantile plot for the data, and supports our decision.

Normal quantile plot for energy bar data

You can also perform a formal test for normality using software. The figure below shows results of testing for normality with JMP software. We cannot reject the hypothesis of a normal distribution. 

Testing for normality using JMP software

We can go ahead with the assumption that the energy bar data is normally distributed.

What if my data are not from a Normal distribution?

If your sample size is very small, it is hard to test for normality. In this situation, you might need to use your understanding of the measurements. For example, for the energy bar data, the company knows that the underlying distribution of grams of protein is normally distributed. Even for a very small sample, the company would likely go ahead with the t -test and assume normality.

What if you know the underlying measurements are not normally distributed? Or what if your sample size is large and the test for normality is rejected? In this situation, you can use a nonparametric test. Nonparametric analyses do not depend on an assumption that the data values are from a specific distribution. For the one-sample t-test, one possible nonparametric alternative is the Wilcoxon Signed Rank test.
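As an illustration of that nonparametric alternative, the sketch below applies SciPy's Wilcoxon signed rank test. The raw energy bar values are not reproduced in this article, so the numbers here are hypothetical and used for demonstration only; the test is applied to the differences from the hypothesized value of 20.

```python
from scipy import stats

# Hypothetical protein measurements in grams (illustration only)
protein = [19.8, 21.3, 22.1, 20.5, 23.0, 18.9, 21.7, 20.2, 22.4, 21.1]

# Wilcoxon signed rank test: are the values centered at 20?
# The test is run on the differences (x - 20) against symmetry about zero.
stat, p_value = stats.wilcoxon([x - 20 for x in protein])
print(stat, p_value)
```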

Understanding p-values

Using a visual, you can check to see if your test statistic is more extreme than a specified value in the distribution. The figure below shows a t- distribution with 30 degrees of freedom.

t-distribution with 30 degrees of freedom and α = 0.05

Since our test is two-sided and we set α = 0.05, the figure shows that the value of 2.042 “cuts off” 5% of the data in the tails combined.

The next figure shows our results. You can see the test statistic falls above the specified critical value. It is far enough “out in the tail” to reject the hypothesis that the mean is equal to 20.

Our results displayed in a t-distribution with 30 degrees of freedom

Putting it all together with Software

You are likely to use software to perform a t -test. The figure below shows results for the 1-sample t -test for the energy bar data from JMP software.  

One-sample t-test results for energy bar data using JMP software

The software shows the null hypothesis value of 20 and the average and standard deviation from the data. The test statistic is 3.07. This matches the calculations above.

The software shows results for a two-sided test and for one-sided tests. We want the two-sided test. Our null hypothesis is that the mean grams of protein is equal to 20. Our alternative hypothesis is that the mean grams of protein is not equal to 20. The software shows a p-value of 0.0046 for the two-sided test. This p-value describes the likelihood of seeing a sample average as extreme as 21.4, or more extreme, when the underlying population mean is actually 20; in other words, the probability of observing a sample mean as different, or even more different, from 20 than the mean we observed in our sample. A p-value of 0.0046 means there are about 46 chances out of 10,000 of such a result if the population mean really is 20. We feel confident in rejecting the null hypothesis that the population mean is equal to 20.
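The two-sided p-value reported by the software can also be recovered from the test statistic and degrees of freedom alone. A minimal SciPy sketch:

```python
from scipy import stats

t_stat, df = 3.07, 30
p_two_sided = 2 * stats.t.sf(abs(t_stat), df)  # area in both tails beyond |t|
print(round(p_two_sided, 4))                   # ≈ 0.0045, matching the reported 0.0046 up to rounding of t
```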


One Sample T Test – Clearly Explained with Examples

  • October 8, 2020
  • Selva Prabhakaran

One sample T-Test tests if the given sample of observations could have been generated from a population with a specified mean.

If it is found from the test that the means are statistically different, we infer that the sample is unlikely to have come from the population.

For example: If you want to test a car manufacturer’s claim that their cars give a highway mileage of 20kmpl on an average. You sample 10 cars from the dealership, measure their mileage and use the T-test to determine if the manufacturer’s claim is true.

By end of this, you will know when and how to do the T-Test, the concept, math, how to set the null and alternate hypothesis, how to use the T-tables, how to understand the one-tailed and two-tailed T-Test and see how to implement in R and Python using a practical example.


Introduction

This post covers:

  • The purpose of the one sample T test
  • How to set the null and alternate hypothesis
  • The procedure to do a one sample T test
  • A one sample T test example
  • One sample T test implementation in R and Python
  • How to decide which T test to perform: two tailed, upper tailed or lower tailed

The ‘One sample T Test’ is one of the 3 types of T Tests . It is used when you want to test if the mean of the population from which the sample is drawn is of a hypothesized value. You will understand this statement better (and all of about One Sample T test) better by the end of this post.

The T Test was first invented by William Sealy Gosset in 1908. Since he published his method under the pseudonym 'Student' in the journal Biometrika, the test came to be known as Student's T Test.

Since it assumes that the test statistic, typically the sample mean, follows a known sampling distribution, the Student's T Test is considered a parametric test.

The purpose of the One Sample T Test is to determine if a sample of observations could have come from a process that follows a specific parameter (like the mean).

It is typically implemented on small samples.

For example, given a sample of 15 items, you want to test if the sample mean is the same as a hypothesized mean (population). That is, essentially you want to know if the sample came from the given population or not.

Let’s suppose, you want to test if the mean weight of a manufactured component (from a sample size 15) is of a particular value (55 grams), with a 99% confidence.


How did we determine One sample T-test is the right test for this?


Because there is only one sample involved and you want to compare the mean of this sample against a particular (hypothesized) value.

To do this, you need to set up a null hypothesis and an alternate hypothesis .

The null hypothesis usually assumes that there is no difference between the sample mean and the hypothesized mean (comparison mean). The purpose of the T Test is to test if the null hypothesis can be rejected or not.

Depending on how the problem is stated, the alternate hypothesis can be one of the following 3 cases:

  • Case 1: H1 : x̅ != µ. Used when the true sample mean is not equal to the comparison mean. Use Two Tailed T Test.
  • Case 2: H1 : x̅ > µ. Used when the true sample mean is greater than the comparison mean. Use Upper Tailed T Test.
  • Case 3: H1 : x̅ < µ. Used when the true sample mean is lesser than the comparison mean. Use Lower Tailed T Test.

Where x̅ is the sample mean and µ is the population mean for comparison. We will go more into the detail of these three cases after solving some practical examples.

Example 1: A customer service company wants to know if their support agents are performing on par with industry standards.

According to a report the standard mean resolution time is 20 minutes per ticket. The sample group has a mean at 21 minutes per ticket with a standard deviation of 7 minutes.

Can you tell if the company’s support performance is better than the industry standard or not?

Example 2: A farming company wants to know if a new fertilizer has improved crop yield or not.

Historic data shows the average yield of the farm is 20 tonne per acre. They decide to test a new organic fertilizer on a smaller sample of farms and observe the new yield is 20.175 tonne per acre with a standard deviation of 3.02 tonne for 12 different farms.

Did the new fertilizer work?

Step 1: Define the Null Hypothesis (H0) and Alternate Hypothesis (H1)

H0: Sample mean (x̅) = Hypothesized Population mean (µ)

H1: Sample mean (x̅) != Hypothesized Population mean (µ)

The alternate hypothesis can also state that the sample mean is greater than or less than the comparison mean.

Step 2: Compute the test statistic (T)

$$t = \frac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}}$$

where the denominator, \(\hat{\sigma}/\sqrt{n}\), is the standard error of the mean.

Step 3: Find the T-critical from the T-Table

Use the degrees of freedom and the alpha level (0.05) to find the T-critical value.

Step 4: Determine if the computed test statistic falls in the rejection region.

Alternately, simply compute the P-value. If it is less than the significance level (0.05 or 0.01), reject the null hypothesis.

Problem Statement:

We have the potato yield from 12 different farms. We know that the standard potato yield for the given variety is µ=20.

x = [21.5, 24.5, 18.5, 17.2, 14.5, 23.2, 22.1, 20.5, 19.4, 18.1, 24.1, 18.5]

Test if the potato yield from these farms is significantly better than the standard yield.

Step 1: Define the Null and Alternate Hypothesis

H0: x̅ = 20

H1: x̅ > 20

n = 12. Since this is one sample T test, the degree of freedom = n-1 = 12-1 = 11.

Let’s set alpha = 0.05, to meet 95% confidence level.

Step 2: Calculate the Test Statistic (T)

1. Calculate the sample mean

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$$

$$\bar{x} = 20.175$$

2. Calculate the sample standard deviation

$$\sigma = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n-1}}$$

$$\sigma = 3.0211$$

3. Substitute into the T Statistic formula

$$T = \frac{\bar{x} - \mu}{se} = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}$$

$$T = (20.175 - 20)/(3.0211/\sqrt{12}) = 0.2006$$

Step 3: Find the T-Critical

Confidence level = 0.95, alpha = 0.05. For a one-tailed test, look under the 0.05 column. For d.o.f. = 12 - 1 = 11, T-Critical = 1.796.

Now you might wonder why a 'One Tailed test' was chosen. This is because of the way you define the alternate hypothesis. Had the alternate hypothesis simply stated that the sample mean is not equal to 20, we would have gone for a two tailed test. More details about this topic in the next section.

Image showing T-Table for one sample T Test

Step 4: Does it fall in rejection region?

Since the computed T Statistic is less than the T-critical, it does not fall in the rejection region.

Image showing one-tailed T Test

Clearly, the calculated T statistic does not fall in the rejection region. So, we do not reject the null hypothesis.
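If you want to verify the hand calculation, the following Python sketch (using NumPy and SciPy) reproduces the sample mean, standard deviation, test statistic, and critical value for the potato data.

```python
import numpy as np
from scipy import stats

x = np.array([21.5, 24.5, 18.5, 17.2, 14.5, 23.2,
              22.1, 20.5, 19.4, 18.1, 24.1, 18.5])
mu0, alpha = 20.0, 0.05

xbar = x.mean()                                  # 20.175
s = x.std(ddof=1)                                # ≈ 3.0211 (sample standard deviation)
t_stat = (xbar - mu0) / (s / np.sqrt(len(x)))    # ≈ 0.2006
t_crit = stats.t.ppf(1 - alpha, len(x) - 1)      # upper-tailed critical value ≈ 1.796

print(t_stat, t_crit, t_stat > t_crit)           # False -> do not reject H0
```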

Since you want to perform a ‘One Tailed Greater than’ test (that is, the sample mean is greater than the comparison mean), you need to specify alternative='greater' in the t.test() function. Because, by default, the t.test() does a two tailed test (which is what you do when your alternate hypothesis simply states sample mean != comparison mean).

The P-value computed here is nothing but p = Pr(T > t) (upper-tailed), where t is the calculated T statistic.

Image showing T-Distribution for P-value Computation for One Sample T-Test

In Python, the One sample T Test is implemented in the ttest_1samp() function in the scipy package. However, it does a two-tailed test by default and reports a signed T statistic. That means the reported P-value will always be computed for a two-tailed test. To calculate the correct one-tailed P value, you need to divide the output P-value by 2.

Apply the following logic if you are performing a one tailed test:

  • For a greater-than test: Reject H0 if p/2 < alpha (0.05) and t > 0.
  • For a lesser-than test: Reject H0 if p/2 < alpha (0.05) and t < 0.

Since it is a one-tailed test, the real p-value is 0.8446/2 = 0.4223. We still do not reject the Null Hypothesis.
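A minimal sketch of the SciPy call for the potato data is shown below. Note that the alternative argument used in the last call is only available in newer SciPy releases (1.6 and later); with older versions you must halve the two-tailed p-value as described above.

```python
from scipy import stats

x = [21.5, 24.5, 18.5, 17.2, 14.5, 23.2, 22.1, 20.5, 19.4, 18.1, 24.1, 18.5]

# Default call: two-tailed test against the hypothesized mean of 20
t_stat, p_two_tailed = stats.ttest_1samp(x, 20)
print(t_stat, p_two_tailed)      # ≈ 0.2006, ≈ 0.8446

# Upper-tailed p-value by halving (valid here because t > 0)
print(p_two_tailed / 2)          # ≈ 0.4223 -> do not reject H0

# In SciPy 1.6+ the one-tailed test can be requested directly
t_stat, p_greater = stats.ttest_1samp(x, 20, alternative='greater')
print(p_greater)                 # ≈ 0.4223
```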

The decision of whether the computed test statistic falls in the rejection region depends on how the alternate hypothesis is defined.

We know the Null Hypothesis is H0: µD = 0, where µD is the difference in the means, that is, the sample mean minus the comparison mean.

You can also write H0 as: x̅ = µ , where x̅ is sample mean and ‘µ’ is the comparison mean.

Case 1: If H1: x̅ != µ, then the rejection region lies on both tails of the T-Distribution (two-tailed). The alternate hypothesis simply states that the means are not equal; it does not say which one is greater or lesser than the other.

In this case, use Two Tailed T Test .

Here, P value = 2 · Pr(T > |t|)

Image showing two-tailed-test

Case 2: If H1: x̅ > µ, then the rejection region lies on the upper tail of the T-Distribution (upper-tailed). Use this when the mean of the sample of interest is expected to be greater than the comparison mean. Example: testing if Component A has a longer time-to-failure than Component B.

In such case, use Upper Tailed based test.

Here, P-value = Pr(T > t)

Image showing upper tailed T-Distribution

Case 3: If H1: x̅ < µ, then the rejection region lies on the lower tail of the T-Distribution (lower-tailed). Use this when the mean of the sample of interest is expected to be lesser than the comparison mean.

In such case, use lower tailed test.

Here, P-value = Pr(T < t)

Image showing T-Distribution for Lower Tailed T-Test

Hope you are now familiar and clear about the One Sample T Test. If something is still not clear, write in the comments. The next topic is the Two sample T test. Stay tuned.


SPSS Tutorials: One Sample t Test


Sample Data Files

Our tutorials reference a dataset called "sample" in many examples. If you'd like to download the sample dataset to work through the examples, choose one of the files below:

  • Data definitions (*.pdf)
  • Data - Comma delimited (*.csv)
  • Data - Tab delimited (*.txt)
  • Data - Excel format (*.xlsx)
  • Data - SAS format (*.sas7bdat)
  • Data - SPSS format (*.sav)
  • SPSS Syntax (*.sps) Syntax to add variable labels, value labels, set variable types, and compute several recoded variables used in later tutorials.
  • SAS Syntax (*.sas) Syntax to read the CSV-format sample data and set variable labels and formats/value labels.

The One Sample t Test examines whether the mean of a population is statistically different from a known or hypothesized value. The One Sample t Test is a parametric test.

This test is also known as:

  • Single Sample t Test

The variable used in this test is known as:

  • Test variable

In a One Sample t Test, the test variable's mean is compared against a "test value", which is a known or hypothesized value of the mean in the population. Test values may come from a literature review, a trusted research organization, legal requirements, or industry standards. For example:

  • A particular factory's machines are supposed to fill bottles with 150 milliliters of product. A plant manager wants to test a random sample of bottles to ensure that the machines are not under- or over-filling the bottles.
  • The United States Environmental Protection Agency (EPA) sets clearance levels for the amount of lead present in homes: no more than 10 micrograms per square foot on floors and no more than 100 micrograms per square foot on window sills ( as of December 2020 ). An inspector wants to test if samples taken from units in an apartment building exceed the clearance level.

Common Uses

The One Sample  t  Test is commonly used to test the following:

  • Statistical difference between a mean and a known or hypothesized value of the mean in the population.
  • Statistical difference between a change score and zero. This approach involves creating a change score from two variables, and then comparing the mean change score to zero, which will indicate whether any change occurred between the two time points for the original measures. If the mean change score is not significantly different from zero, no significant change occurred.

Note: The One Sample t Test can only compare a single sample mean to a specified constant. It can not compare sample means between two or more groups. If you wish to compare the means of multiple groups to each other, you will likely want to run an Independent Samples t Test (to compare the means of two groups) or a One-Way ANOVA (to compare the means of two or more groups).

Data Requirements

Your data must meet the following requirements:

  • Test variable that is continuous (i.e., interval or ratio level)
  • Scores on the test variable are independent (i.e., there is no relationship between scores); violation of this assumption will yield an inaccurate p value
  • Random sample of data from the population
  • Test variable that is approximately normally distributed; non-normal population distributions, especially those that are thick-tailed or heavily skewed, considerably reduce the power of the test, although among moderate or large samples a violation of normality may still yield accurate p values
  • Homogeneity of variances (i.e., variances approximately equal in both the sample and population)
  • No outliers

The null hypothesis (H0) and (two-tailed) alternative hypothesis (H1) of the one sample t test can be expressed as:

H0: µ = µ0  ("the population mean is equal to the [proposed] population mean")
H1: µ ≠ µ0  ("the population mean is not equal to the [proposed] population mean")

where µ is the "true" population mean and µ0 is the proposed value of the population mean.

Test Statistic

The test statistic for a One Sample t Test is denoted t , which is calculated using the following formula:

$$ t = \frac{\overline{x}-\mu{}_{0}}{s_{\overline{x}}} $$

$$ s_{\overline{x}} = \frac{s}{\sqrt{n}} $$

where

\(\mu_{0}\) = the test value (the proposed constant for the population mean)
\(\bar{x}\) = sample mean
\(n\) = sample size (i.e., number of observations)
\(s\) = sample standard deviation
\(s_{\bar{x}}\) = estimated standard error of the mean, \(s/\sqrt{n}\)

The calculated t value is then compared to the critical t value from the t distribution table with degrees of freedom df = n - 1 and the chosen confidence level. If the calculated t value exceeds the critical t value in absolute value, then we reject the null hypothesis.
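As a rough illustration of this formula (plain Python arithmetic, not SPSS output), the sketch below computes t, the degrees of freedom, the critical value, and the two-sided p-value from summary statistics. The bottle-filling numbers passed to the function are hypothetical.

```python
from math import sqrt
from scipy import stats

def one_sample_t(xbar, s, n, mu0, alpha=0.05):
    """Return (t, df, critical t, two-sided p) for a one-sample t test
    computed from summary statistics."""
    se = s / sqrt(n)              # estimated standard error of the mean
    t_stat = (xbar - mu0) / se
    df = n - 1
    t_crit = stats.t.ppf(1 - alpha / 2, df)   # reject H0 if |t| > t_crit
    p = 2 * stats.t.sf(abs(t_stat), df)
    return t_stat, df, t_crit, p

# Hypothetical example: sample mean 152.1 ml, SD 4.8 ml, n = 50, test value 150 ml
print(one_sample_t(152.1, 4.8, 50, 150))
```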

Data Set-Up

Your data should include one continuous, numeric variable (represented in a column) that will be used in the analysis. The variable's measurement level should be defined as Scale in the Variable View window.

Run a One Sample t Test

To run a One Sample t Test in SPSS, click  Analyze > Compare Means > One-Sample T Test .

The One-Sample T Test window opens where you will specify the variables to be used in the analysis. All of the variables in your dataset appear in the list on the left side. Move variables to the Test Variable(s) area by selecting them in the list and clicking the arrow button.


A Test Variable(s): The variable whose mean will be compared to the hypothesized population mean (i.e., Test Value). You may run multiple One Sample t Tests simultaneously by selecting more than one test variable. Each variable will be compared to the same Test Value. 

B Test Value: The hypothesized population mean against which your test variable(s) will be compared.

C Estimate effect sizes: Optional. If checked, will print effect size statistics -- namely, Cohen's d -- for the test(s). (Note: Effect size calculations for t tests were first added to SPSS Statistics in version 27, making them a relatively recent addition. If you do not see this option when you use SPSS, check what version of SPSS you're using.)

D Options: Clicking Options will open a window where you can specify the Confidence Interval Percentage and how the analysis will address Missing Values (i.e., Exclude cases analysis by analysis or Exclude cases listwise ). Click Continue when you are finished making specifications.


Click OK to run the One Sample t Test.

Problem Statement

According to the CDC , the mean height of U.S. adults ages 20 and older is about 66.5 inches (69.3 inches for males, 63.8 inches for females).

In our sample data, we have a sample of 435 college students from a single college. Let's test if the mean height of students at this college is significantly different than 66.5 inches using a one-sample t test. The null and alternative hypotheses of this test will be:

H 0 : µ Height = 66.5  ("the mean height is equal to 66.5") H 1 : µ Height ≠ 66.5  ("the mean height is not equal to 66.5")

Before the Test

In the sample data, we will use the variable Height, which is a continuous variable representing each respondent’s height in inches. The heights exhibit a range of values from 55.00 to 88.41 ( Analyze > Descriptive Statistics > Descriptives ).

Let's create a histogram of the data to get an idea of the distribution, and to see if  our hypothesized mean is near our sample mean. Click Graphs > Legacy Dialogs > Histogram . Move variable Height to the Variable box, then click OK .


To add vertical reference lines at the mean (or another location), double-click on the plot to open the Chart Editor, then click Options > X Axis Reference Line . In the Properties window, you can enter a specific location on the x-axis for the vertical line, or you can choose to have the reference line at the mean or median of the sample data. Click Apply to make sure your new line is added to the chart. Here, we have added two reference lines: one at the sample mean (the solid black line), and the other at 66.5 (the dashed red line).

From the histogram, we can see that height is relatively symmetrically distributed about the mean, though there is a slightly longer right tail. The reference lines indicate that the sample mean is slightly greater than the hypothesized mean, but not by a huge amount. It's possible that our test result could come back significant.

Running the Test

To run the One Sample t Test, click  Analyze > Compare Means > One-Sample T Test.  Move the variable Height to the Test Variable(s) area. In the Test Value field, enter 66.5.



Two sections (boxes) appear in the output: One-Sample Statistics and One-Sample Test . The first section, One-Sample Statistics , provides basic information about the selected variable, Height , including the valid (nonmissing) sample size ( n ), mean, standard deviation, and standard error. In this example, the mean height of the sample is 68.03 inches, which is based on 408 nonmissing observations.


The second section, One-Sample Test , displays the results most relevant to the One Sample t Test. 


A Test Value : The number we entered as the test value in the One-Sample T Test window.

B t Statistic : The test statistic of the one-sample t test, denoted t . In this example, t = 5.810. Note that t is calculated by dividing the mean difference (E) by the standard error mean (from the One-Sample Statistics box).

C df : The degrees of freedom for the test. For a one-sample t test, df = n - 1; so here, df = 408 - 1 = 407.

D Significance (One-Sided p and Two-Sided p): The p-values corresponding to one of the possible one-sided alternative hypotheses (in this case, µ Height > 66.5) and two-sided alternative hypothesis (µ Height ≠ 66.5), respectively. In our problem statement above, we were only interested in the two-sided alternative hypothesis.

E Mean Difference : The difference between the "observed" sample mean (from the One Sample Statistics box) and the "expected" mean (the specified test value (A)). The sign of the mean difference corresponds to the sign of the t value (B). The positive t value in this example indicates that the mean height of the sample is greater than the hypothesized value (66.5).

F Confidence Interval for the Difference : The confidence interval for the difference between the specified test value and the sample mean.

Decision and Conclusions

Recall that our hypothesized population value was 66.5 inches, the [approximate] average height of the overall adult population in the U.S. Since p < 0.001, we reject the null hypothesis that the mean height of students at this college is equal to the hypothesized population mean of 66.5 inches and conclude that the mean height is significantly different than 66.5 inches.

Based on the results, we can state the following:

  • There is a significant difference in the mean height of the students at this college and the overall adult population in the U.S. ( p < .001).
  • The average height of students at this college is about 1.5 inches taller than the U.S. adult population average (95% CI [1.013, 2.050]).
Single Sample T-Test

The StatsTest Flow: Difference >> Continuous Variable of Interest >> One Sample Tests (single group) >> Normal Variable of Interest

Not sure this is the right statistical method? Use the Choose Your StatsTest workflow to select the right method.

What is a Single Sample T-Test?

The Single Sample T-Test is a statistical test used to determine if a single group is significantly different from a known or hypothesized population value on your variable of interest. Your variable of interest should be continuous and normally distributed and you should have enough data (more than 5 values).

Diagram: a Single Sample T-Test compares a sample mean from a bell-shaped, normal distribution with a known population mean.

The Single Sample T-Test is also called a One-Sample T-Test, Single Sample Student T-Test, or One-Sample Test of Means.

Assumptions for a Single Sample T-Test

Every statistical method has assumptions. Assumptions mean that your data must satisfy certain properties in order for statistical method results to be accurate.

The assumptions for the Single Sample T-Test include:

  • Continuous Variable of Interest
  • Normally Distributed Variable of Interest
  • Random Sample
  • Enough Data

Let’s dive in to each one of these separately.

The variable that you care about (and want to see if it is different between your group and the population) must be continuous. Continuous means that the variable can take on any reasonable value.

Some good examples of continuous variables include age, weight, height, test scores, survey scores, yearly salary, etc.

If the variable that you care about is a proportion (48% of males voted vs 56% of females voted) and you have more than 5 in each group then you should use the One-Proportion Z-Test . If your variable of interest is a proportion and you have less than 5 in a group, you should use the Exact Test of Goodness of Fit .

Normally Distributed Variable of Interest

The variable that you care about must be spread out in a normal way. In statistics, this is called being normally distributed (aka it must look like a bell curve when you graph the data). Only use a single sample t-test with your data if the variable you care about is normally distributed.

A normal distribution is bell shaped with most of the data in the middle as seen on the top of this image. A skewed distribution is leaning left or right with most of the data on the edge as seen on the bottom of this image.

If your variable is not normally distributed, you should use Single-Sample Wilcoxon Signed-Rank Test instead.

The data points for each group in your analysis must have come from a simple random sample. This means that if you wanted to see if drinking sugary soda makes you gain weight, you would need to randomly select a group of soda drinkers for your soda drinker group, and then you would compare that to a known population weight for non-sugary-soda drinkers.

The key here is that the data points for each group were randomly selected. This is important because if your group is not randomly determined then your analysis will be incorrect. In statistical terms this is called bias, or a tendency to have incorrect results because of bad data.

If you do not have a random sample, the conclusions you can draw from your results are very limited. You should try to get a simple random sample. If you have paired samples (2 measurements from the same group of subjects) then you should use a Paired Samples T-Test instead. If you want to compare 2 groups of subjects instead of a single group with a population mean, then you should use an Independent Samples T-Test instead.

The sample size (or data set size) should be greater than 5 in your group. Some people argue for more than 15 or even 30, but more than 5 is probably sufficient.

It also depends on the expected size of the difference between groups. If you expect a large difference between groups, then you can get away with a smaller sample size. If you expect a small difference between groups, then you likely need a larger sample (30+).

The sample size needed for statistically significant results with a single sample t-test depends on the effect size: roughly 199 participants for a small effect size, 34 for a medium effect size, and 15 for a large effect size.

If your sample size is greater than 30 (and you know the average and standard deviation or spread of the population values), you should run a Single Sample Z-Test instead.

When to use a Single Sample T-Test?

You should use a Single Sample T-Test in the following scenario:

  • You want to know if one group is different from a known or hypothesized population value on your variable of interest
  • Your variable of interest is continuous
  • You have one group
  • Your variable of interest is normally distributed

Let’s clarify these to help you know when to use a Single Sample T-Test.

You are looking for a statistical test to see whether a single group is significantly different from a population value on your variable of interest. This is a difference question. Other types of analyses include examining the relationship between two variables (correlation) or predicting one variable using another variable (prediction).

Continuous Data

Your variable of interest must be continuous. Continuous means that your variable of interest can basically take on any value, such as heart rate, height, weight, number of ice cream bars you can eat in 1 minute, etc.

Types of data that are NOT continuous include ordered data (such as finishing place in a race, best business rankings, etc.), categorical data (gender, eye color, race, etc.), or binary data (purchased the product or not, has the disease or not, etc.).

A Single Sample T-Test can only be used to compare a single group with a known population value on your variable of interest.

If you have three or more groups, you should use a One Way Anova analysis instead. If you have two groups to compare, you should use an Independent Samples T-Test instead.

Normally distributed was covered earlier and means that your variable of interest should look like a bell curve when you graph it as a histogram.

If you get a group of students to take a pre-test and the same students to take a post-test, you have two different variables for the same group of students, which would be paired data, in which case you would need to use a Paired Samples T-Test instead.

Single Sample T-Test Example

  • Group 1: Received the experimental medical treatment.
  • Population Value: On average in the population, it takes 12 days to recover from the disease.
  • Variable of interest: Time to recover from the disease in days.

In this example, group 1 is our treatment group because they received the experimental medical treatment. The population value is essentially our control group because they did not receive the treatment.

The null hypothesis, which is statistical lingo for what would happen if the treatment does nothing, is that group 1 and our population will recover from the disease in about the same number of days, on average. We are trying to determine if receiving the experimental medical treatment will shorten the number of days it takes for patients to recover from the disease.

As we run the experiment, we track how long it takes for each patient to fully recover from the disease. In order to use a Single Sample T-Test on our data, our variable of interest has to be normally distributed (bell curve shaped). In this case, recovery from the disease in days is normal for our treatment group.

After the experiment is over, we compare our treatment group to the population value on our variable of interest (days to fully recover) using a Single Sample T-Test. When we run the analysis, we get a t-statistic and a p-value. The t-statistic is a measure of how different our group is from the population value on our recovery variable of interest. A p-value is the chance of seeing our results assuming the treatment actually doesn’t do anything. A p-value less than or equal to 0.05 means that our result is statistically significant and we can trust that the difference is not due to chance alone.
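The recovery-time data in this example are not listed, so the sketch below uses hypothetical values purely for illustration. It runs a single sample (one sample) t-test in Python against the population value of 12 days, with a lower-tailed alternative (the alternative argument requires SciPy 1.6 or later).

```python
from scipy import stats

# Hypothetical recovery times in days for the treatment group (group 1);
# the population value for untreated patients is 12 days.
recovery_days = [10, 11, 9, 12, 10, 8, 11, 10, 9, 13, 10, 11]

# Lower-tailed test: is the treatment group's mean recovery time less than 12?
t_stat, p_value = stats.ttest_1samp(recovery_days, 12, alternative='less')
print(t_stat, p_value)   # p <= 0.05 would suggest a real reduction in recovery time
```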

Frequently Asked Questions

Q: What is the difference between a single sample t-test and a one sample t-test? A: Nothing. They are two names for the same analysis.

Q: What if I don’t know the population average for my variable of interest? A: You cannot run a single sample t-test without a comparison group or value. You either need to collect data for a control group or find data on what the population average is.

Q: How do I run a single sample t-test in SPSS, R, SAS, or STATA? A: This resource is focused on helping you pick the right statistical method every time. There are many resources available to help you figure out how to run this method with your data: SPSS article: https://libguides.library.kent.edu/SPSS/OneSampletTest SPSS video: https://www.youtube.com/watch?v=2zVeV1ohGCU R article: http://www.sthda.com/english/wiki/one-sample-t-test-in-r R video: https://www.youtube.com/watch?v=kvmSAXhX9Hs

If you still can’t figure something out, feel free to reach out .



One-Sample T-Test – Quick Tutorial & Example

Null Hypothesis, Assumptions, Effect Size, Confidence Intervals for Means, APA Style Reporting

A one-sample t-test evaluates if a population mean is likely to be x : some hypothesized value.

One Sample T-Test Diagram

One-Sample T-Test Example

A school director thinks his students perform poorly due to low IQ scores. Now, most IQ tests have been calibrated to have a mean of 100 points in the general population. So the question is: does the student population have a mean IQ score of 100? Our school has 1,114 students and the IQ tests are somewhat costly to administer. Our director therefore draws a simple random sample of N = 38 students and tests them on 4 IQ components:

  • verb (Verbal Intelligence)
  • math (Mathematical Ability)
  • clas (Classification Skills)
  • logi (Logical Reasoning Skills)

The raw data thus collected are in this Googlesheet , partly shown below. Note that a couple of scores are missing due to illness and unknown reasons.

One Sample T-Test Example Data

We'll try to demonstrate that our students have low IQ scores by rejecting the null hypothesis that the mean IQ score for the entire student population is 100 for each of the 4 IQ components measured. Our main challenge is that we only have data on a sample of 38 students from a population of N = 1,114. But let's first just look at some descriptive statistics for each component:

  • N - sample size;
  • M - sample mean and
  • SD - sample standard deviation.

Descriptive Statistics

Descriptive Statistics for One-Sample T-Test

Our first basic conclusion is that our 38 students score lower than 100 points on all 4 IQ components. The differences for verb (99.29) and math (97.97) are small. Those for clas (93.91) and logi (94.74) seem somewhat more serious. Now, our sample of 38 students may obviously come up with slightly different means than our population of N = 1,114. So what can we (not) conclude regarding our population? We'll try to generalize these sample results to our population with 2 different approaches:

  • Statistical significance : how likely are these sample means if the population means are really all 100 points?
  • Confidence intervals : given the sample results, what are likely ranges for the population means?

Both approaches require some assumptions so let's first look into those.

The assumptions required for our one-sample t-tests are

  • independent observations and
  • normality : the IQ scores must be normally distributed in the entire population.

Do our data meet these assumptions? First, our students didn't interact during their tests, so our observations are likely to be independent. Second, normality is only needed for small sample sizes, say N < 25 or so. For the data at hand, normality is no issue. For smaller sample sizes, you could evaluate the normality assumption by

  • inspecting if the histograms roughly follow normal curves,
  • inspecting if both skewness and kurtosis are close to 0 and
  • running a Shapiro-Wilk test or a Kolmogorov-Smirnov test .

However, the data at hand meet all assumptions so let's now look into the actual tests.

If we'd draw many samples of students, such samples would come up with different means. We can compute the standard deviation of those means over hypothetical samples: the standard error of the mean, or \(SE_{mean}\):

$$SE_{mean} = \frac{SD}{\sqrt{N}}$$

For our first IQ component, this results in

$$SE_{mean} = \frac{12.45}{\sqrt{38}} = 2.02$$

Our null hypothesis is that the population mean \(\mu_0 = 100\). If this is true, then the average sample mean should also be 100. We now basically compute the z-score for our sample mean: the test statistic \(t\)

$$t = \frac{M - \mu_0}{SE_{mean}}$$

For our first IQ component, this results in

$$t = \frac{99.29 - 100}{2.02} = -0.35$$

If the assumptions are met, \(t\) follows a t-distribution with the degrees of freedom, or \(df\), given by

$$df = N - 1$$

For a sample of 38 respondents, this results in

$$df = 38 - 1 = 37$$

Given \(t\) and \(df\), we can simply look up that the 2-tailed significance level \(p\) = 0.73 in this Googlesheet, partly shown below.

One Sample T-Test In Googlesheets
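If you prefer code over spreadsheet formulas, these numbers can also be reproduced with a few lines of R. This is only a minimal sketch based on the summary statistics reported above; it is not the Googlesheet itself:

  M   <- 99.29    # sample mean for verb
  SD  <- 12.45    # sample standard deviation for verb
  N   <- 38       # sample size
  mu0 <- 100      # hypothesized population mean
  SE  <- SD / sqrt(N)          # standard error of the mean, 2.02
  t   <- (M - mu0) / SE        # test statistic, -0.35
  df  <- N - 1                 # degrees of freedom, 37
  p   <- 2 * pt(-abs(t), df)   # 2-tailed significance, roughly 0.73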

Interpretation

As a rule of thumb, we reject the null hypothesis if p < 0.05. We just found that p = 0.73, so we don't reject our null hypothesis: given our sample data, the population mean being 100 is a credible statement. So precisely what does p = 0.73 mean? Well, it means that -if the population mean really is 100- there's a 0.73 (or 73%) probability of finding t < -0.35 or t > 0.35. The figure below illustrates how this probability results from the sampling distribution , t(37).

2-Tailed Significance In T-Distribution

Next, remember that t is just a standardized mean difference. For our data, t = -0.35 corresponds to a difference of -0.71 IQ points. Therefore, p = 0.73 means that there's a 0.73 probability of finding an absolute mean difference of at least 0.71 points. Roughly speaking, the sample mean we found is likely to occur if the null hypothesis is true.

The only effect size measure for a one-sample t-test is Cohen’s D defined as $$Cohen's\;D = \frac{M - \mu_0}{SD}$$ For our first IQ test component, this results in $$Cohen's\;D = \frac{99.29 - 100}{12.45} = -0.06$$ Some general conventions are that

  • | Cohen’s D | = 0.20 indicates a small effect size;
  • | Cohen’s D | = 0.50 indicates a medium effect size;
  • | Cohen’s D | = 0.80 indicates a large effect size.

This means that Cohen’s D = -0.06 indicates a negligible effect size for our first test component. Cohen’s D is absent from SPSS versions prior to SPSS 27. However, we can easily obtain it from JASP. The JASP output below shows the effect sizes for all 4 IQ test components.

One Sample T-Test Jasp Output

Note that the last 2 IQ components -clas and logi- almost have medium effect sizes. These are also the 2 components whose means differ significantly from 100: p < 0.05 for both means (third table column).
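As a quick check, Cohen’s D for the first component can be reproduced from the descriptives above; a minimal R sketch (JASP reports the same value directly):

  M   <- 99.29   # sample mean for verb
  SD  <- 12.45   # sample standard deviation for verb
  mu0 <- 100     # hypothesized population mean
  D   <- (M - mu0) / SD   # roughly -0.06, a negligible effect size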

Our data came up with sample means for our 4 IQ test components. Now, we know that sample means typically differ somewhat from their population counterparts. So what are likely ranges for the population means we're after? This is often answered by computing 95% confidence intervals . We'll demonstrate the procedure for our last IQ component, logical reasoning. Since we have 34 observations, t follows a t-distribution with df = 33. We'll first look up which t-values enclose the most likely 95% from the inverse t-distribution. We'll do so by typing =T.INV(0.025,33) into any cell of a Googlesheet , which returns -2.03. Note that 0.025 is 2.5%. This is because the 5% most unlikely values are divided over both tails of the distribution as shown below.

Finding Critical Values for Confidence Intervals from an Inverse T-Distribution in Googlesheets

Now, our t-value of -2.03 tells us that 95% of our sample means are expected to fall within ±2.03 standard errors of the population mean. For our last IQ component, $$SE_{mean} = \frac{12.57}{\sqrt{34}} = 2.16$$ That is, 95% of our sample means are estimated to fluctuate within ± 2.03 · 2.16 = 4.39 IQ test points. Last, we combine this fluctuation with our observed sample mean of 94.74: $$CI_{95\%} = [94.74 - 4.39,94.74 + 4.39] = [90.35,99.13]$$ Note that our 95% confidence interval does not enclose our hypothesized population mean of 100. This implies that we'll reject this null hypothesis at α = 0.05. We don't even need to run the actual t-test to draw this conclusion.
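The same interval can be reproduced in R from the summary statistics for logi; a minimal sketch (small rounding differences aside):

  M  <- 94.74    # sample mean for logi
  SD <- 12.57    # sample standard deviation for logi
  N  <- 34       # observations for logi
  SE     <- SD / sqrt(N)                # 2.16
  t_crit <- qt(0.025, df = N - 1)       # -2.03
  c(M + t_crit * SE, M - t_crit * SE)   # roughly [90.35, 99.13]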

A single t-test is usually reported in text as in “The mean for verbal skills did not differ from 100, t(37) = -0.35, p = 0.73, Cohen’s D = -0.06.” For multiple tests, a simple overview table as shown below is recommended. We feel that confidence intervals for means (not mean differences ) should also be included. Since the APA does not mention these, we left them out for now.

APA Style Reporting Table for One-Sample T-Test

Right. Well, I can't think of anything else that is relevant regarding the one-sample t-test. If you do, don't be shy. Just write us a comment below. We're always happy to hear from you!

Thanks for reading!



One sample t test

A one sample t test compares the mean with a hypothetical value. In most cases, the hypothetical value comes from theory. For example, if you express your data as 'percent of control', you can test whether the average differs significantly from 100. The hypothetical value can also come from previous data. For example, compare whether the mean systolic blood pressure differs from 135, a value determined in a previous study.


In this article you will learn the requirements and assumptions of a one sample t test, how to format and interpret the results of a one sample t test, and when to use different types of t tests.

One sample t test: Overview

The one sample t test, also referred to as a single sample t test, is a statistical hypothesis test used to determine whether the mean calculated from sample data collected from a single group is different from a designated value specified by the researcher. This designated value does not come from the data itself, but is an external value chosen for scientific reasons. Often, this designated value is a mean previously established in a population, a standard value of interest, or a mean concluded from other studies. Like all hypothesis testing, the one sample t test determines if there is enough evidence to reject the null hypothesis (H0) in favor of an alternative hypothesis (H1). The null hypothesis for a one sample t test can be stated as: "The population mean equals the specified mean value." The alternative hypothesis for a one sample t test can be stated as: "The population mean is different from the specified mean value."

Single sample t test

The one sample t test differs from most statistical hypothesis tests because it does not compare two separate groups or look at a relationship between two variables. It is a straightforward comparison between data gathered on a single variable from one population and a specified value defined by the researcher. The one sample t test can be used to look for a difference in only one direction from the standard value (a one-tailed t test ) or can be used to look for a difference in either direction from the standard value (a two-tailed t test ).

Requirements and Assumptions for a one sample t test

A one sample t test should be used only when data has been collected on one variable for a single population and there is no comparison being made between groups. For a valid one sample t test analysis, the data must satisfy all of the following requirements:

The one sample t test assumes that all "errors" in the data are independent. The term "error" refers to the difference between each value and the group mean. The results of a t test only make sense when the scatter is random - that whatever factor caused a value to be too high or too low affects only that one value. Prism cannot test this assumption, but there are graphical ways to explore data to verify this assumption is met.

A t test is only appropriate to apply in situations where data represent variables that are continuous measurements. As they rely on the calculation of a mean value, variables that are categorical should not be analyzed using a t test.

The results of a t test should be based on a random sample and only be generalized to the larger population from which samples were drawn.

As with all parametric hypothesis testing, the one sample t test assumes that you have sampled your data from a population that follows a normal (or Gaussian) distribution. While this assumption is not as important with large samples, it is important with small sample sizes, especially less than 10. If your data do not come from a Gaussian distribution , there are three options to accommodate this. One option is to transform the values to make the distribution more Gaussian, perhaps by transforming all values to their reciprocals or logarithms. Another choice is to use the Wilcoxon signed rank nonparametric test instead of the t test. A final option is to use the t test anyway, knowing that the t test is fairly robust to departures from a Gaussian distribution with large samples.

How to format a one sample t test

Ideally, data for a one sample t test should be collected and entered as a single column from which a mean value can be easily calculated. If data is entered on a table with multiple subcolumns, Prism requires one of the following choices to be selected to perform the analysis:

  • Each subcolumn of data can be analyzed separately
  • An average of the values in the columns across each row can be calculated, and the analysis conducted on this new stack of means, or
  • All values in all columns can be treated as one sample of data (paying no attention to which row or column any values are in).

How the one sample t test calculator works

Prism calculates the t ratio by dividing the difference between the actual and hypothetical means by the standard error of the actual mean. The equation is written as follows, where \(\bar{x}\) is the calculated sample mean, μ is the hypothetical mean (specified value), S is the standard deviation of the sample, and n is the sample size:

$$t = \frac{\bar{x} - \mu}{S / \sqrt{n}}$$

A p value is computed based on the calculated t ratio and the number of degrees of freedom (which equals the sample size minus 1). The one sample t test calculator assumes it is a two-tailed one sample t test, meaning you are testing for a difference in either direction from the specified value.
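Prism performs this calculation for you; purely to illustrate the arithmetic, here is a small hypothetical R helper (not a Prism function) that computes the same t ratio and two-tailed p value from a vector of raw data:

  # one_sample_t() is an illustrative name, not part of Prism
  one_sample_t <- function(x, mu0) {
    n <- length(x)
    t <- (mean(x) - mu0) / (sd(x) / sqrt(n))   # t ratio
    p <- 2 * pt(-abs(t), df = n - 1)           # two-tailed p value
    list(t = t, df = n - 1, p = p)
  }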

How to interpret results of a one sample t test

As discussed, a one sample t test compares the mean of a single column of numbers against a hypothetical mean. This hypothetical mean can be based upon a specific standard or other external prediction. The test produces a P value which requires careful interpretation.

The p value answers this question: If the data were sampled from a Gaussian population with a mean equal to the hypothetical value you entered, what is the chance of randomly selecting N data points and finding a mean as far (or further) from the hypothetical value as observed here?

If the p value is large (usually defined to mean greater than 0.05), the data do not give you any reason to conclude that the population mean differs from the designated value to which it has been compared. This is not the same as saying that the true mean equals the hypothetical value, but rather states that there is no evidence of a difference. Thus, we cannot reject the null hypothesis (H0).

If the p value is small (usually defined to mean less than or equal to 0.05), then it is unlikely that the discrepancy observed between the sample mean and hypothetical mean is due to a coincidence arising from random sampling. There is evidence to reject the idea that the difference is coincidental and conclude instead that the population has a mean that is different from the hypothetical value to which it has been compared. The difference is statistically significant, and the null hypothesis is therefore rejected.

If the null hypothesis is rejected, the question of whether the difference is scientifically important still remains. The confidence interval can be a useful tool in answering this question. Prism reports the 95% confidence interval for the difference between the actual and hypothetical mean. In interpreting these results, one can be 95% sure that this range includes the true difference. It requires scientific judgment to determine if this difference is truly meaningful.


When to use different types of t tests

There are three types of t tests which can be used for hypothesis testing:

  • One sample t test
  • Independent two-sample (or unpaired) t test
  • Paired sample t test

As described, a one sample t test should be used only when data has been collected on one variable for a single population and there is no comparison being made between groups. It only applies when the mean value for data is intended to be compared to a fixed and defined number.

In most cases involving data analysis, however, there are multiple groups of data either representing different populations being compared, or the same population being compared at different times or conditions. For these situations, it is not appropriate to use a one sample t test. Other types of t tests are appropriate for these specific circumstances:

Independent Two-Sample t test (Unpaired t test)

The independent sample t test, also referred to as the unpaired t test, is used to compare the means of two different samples. The independent two-sample t test comes in two different forms:

  • the standard Student's t test, which assumes that the variances of the two groups are equal.
  • the Welch's t test , which is less restrictive compared to the original Student's test. This is the test where you do not assume that the variance is the same in the two groups, which results in fractional degrees of freedom.

The two methods give very similar results when the sample sizes are equal and the variances are similar.

Paired Sample t test

The paired sample t test is used to compare the means of two related groups of samples. Put into other words, it is used in a situation where you have two values (i.e., a pair of values) for the same group of samples. Often these two values are measured from the same samples either at two different times, under two different conditions, or after a specific intervention.

You can perform multiple independent two-sample comparison tests simultaneously in Prism. Select from parametric and nonparametric tests and specify if the data are unpaired or paired. Try performing a t test with a 30-day free trial of Prism .

Watch this video to learn how to choose between a paired and unpaired t test.

Example of how to apply the appropriate t test

"Alkaline" labeled bottled drinking water has become fashionable over the past several years. Imagine we have collected a random sample of 30 bottles of "alkaline" drinking water from a number of different stores to represent the population of "alkaline" bottled water for a particular brand available to the general consumer. The labels on each of the bottles claim that the pH of the "alkaline" water is 8.5. A laboratory then proceeds to measure the exact pH of the water in each bottle.

Table 1: pH of water in random sample of "alkaline bottled water"

If you look at the table above, you see that some bottles have a pH measured to be lower than 8.5, while other bottles have a pH measured to be higher. What can the data tell us about the actual pH levels found in this brand of "alkaline" water bottles marketed to the public as having a pH of 8.5? Statistical hypothesis testing provides a sound method to evaluate this question. Which specific test to use, however, depends on the specific question being asked.

Is a t test appropriate to apply to this data?

Let's start by asking: Is a t test an appropriate method to analyze this set of pH data? The following list reviews the requirements and assumptions for using a t test:

  • Independent sampling : In an independent sample t test, the data values are independent. The pH of one bottle of water does not depend on the pH of any other water bottle. (An example of dependent values would be if you collected water bottles from a single production lot. A sample from a single lot is representative only of that lot, not of alkaline bottled water in general).
  • Continuous variable : The data values are pH levels, which are numerical measurements that are continuous.
  • Random sample : We assume the water bottles are a simple random sample from the population of "alkaline" water bottles produced by this brand as they are a mix of many production lots.
  • Normal distribution : We assume the population from which we collected our samples has pH levels that are normally distributed. To verify this, we should visualize the data graphically. The figure below shows a histogram for the pH measurements of the water bottles. From a quick look at the histogram, we see that there are no unusual points, or outliers. The data look roughly bell-shaped, so our assumption of a normal distribution seems reasonable. The QQ plot can also be used to graphically assess normality and is the preferred choice when the sample size is small.

QQplot ph measurements

Based upon these features and assumptions being met, we can conclude that a t test is an appropriate method to be applied to this set of data.

Which t test is appropriate to use?

The next decision is which t test to apply, and this depends on the exact question we would like our analysis to answer. This example illustrates how each type of t test could be chosen for a specific analysis, and why the one sample t test is the correct choice to determine if the measured pH of the bottled water samples match the advertised pH of 8.5.

We could be interested in determining whether a certain characteristic of a water bottle is associated with having a higher or lower pH, such as whether bottles are glass or plastic. For this question, we would effectively be dividing the bottles into 2 separate groups and comparing the means of the pH between the 2 groups. For this analysis, we would elect to use a two sample t test because we are comparing the means of two independent groups.

We could also be interested in learning if pH is affected by a water bottle being opened and exposed to the air for a week. In this case, each original sample would be tested for pH level after a week had elapsed and the water had been exposed to the air, creating a second set of sample data. To evaluate whether this exposure affected pH, we would again be comparing two different groups of data, but this time the data are in paired samples each having an original pH measurement and a second measurement from after the week of exposure to the open air. For this analysis, it is appropriate to use a paired t test so that data for each bottle is assembled in rows, and the change in pH is considered bottle by bottle.

Returning to the original question we set out to answer-whether bottled water that is advertised to have a pH of 8.5 actually meets this claim-it is now clear that neither an independent two sample t test nor a paired t test would be appropriate. In this case, all 30 pH measurements are sampled from one group representing bottled drinking water labeled "alkaline" available to the general consumer. We wish to compare this measured mean with an expected advertised value of 8.5. This is the exact situation for which one should employ a one sample t test!

From a quick look at the descriptive statistics, we see that the mean of the sample measurements is 8.513, slightly above 8.5. Does this average from our sample of 30 bottles validate the advertised claim of pH 8.5? By applying Prism's one sample t test analysis to this data set, we will get results by which we can evaluate whether the null hypothesis (that there is no difference between the mean pH level in the water bottles and the pH level advertised on the bottles) should be rejected or not.
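Outside Prism, the same comparison could be run in one line of R with t.test(). The sketch below uses clearly labeled simulated stand-in values, because the individual pH measurements are not reproduced here:

  set.seed(1)
  ph <- rnorm(30, mean = 8.51, sd = 0.05)   # simulated stand-in for the 30 measured pH values (illustration only)
  t.test(ph, mu = 8.5)                      # two-sided one sample t test against the advertised pH of 8.5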

How to Perform a One Sample T Test in Prism

In prior versions of Prism, the one sample t test and the Wilcoxon signed rank test were computed as part of Prism's Column Statistics analysis. Now, starting with Prism 8, performing one sample t tests is even easier with a separate analysis in Prism.

Steps to perform a one sample t test in Prism

  • Create a Column data table.
  • Enter each data set in a single Y column so all values from each group are stacked into a column. Prism will perform a one sample t test (or Wilcoxon signed rank test) on each column you enter.
  • Click Analyze, look in the list of Column analyses, and choose one sample t test and Wilcoxon test.

It's that simple! Prism streamlines your t test analysis so you can make more accurate and more informed data interpretations. Start your 30-day free trial of Prism and try performing your first one sample t test in Prism.

Watch this video for a step-by-step tutorial on how to perform a t test in Prism.


One sample t-test

The t-test is one of the most common hypothesis tests in statistics. The t-test determines either whether a sample mean differs from a known population mean, or whether two sample means differ from each other statistically. The t-test distinguishes between

  • one sample t-test
  • independent sample t-test
  • paired samples t-test

One sample t-test

The choice of which t-test to use depends on whether one or two samples are available. If two samples are available, a distinction is made between dependent and independent samples. In this tutorial you will find everything about the one sample t-test .

Tip: Do you want to calculate the t-value? You can easily calculate it for all three t-tests online in the t-test calculator on DATAtab

The one sample t-test is used to test whether the population mean differs from a fixed value. So, the question is: are there statistically significant differences between a sample mean and the fixed value? The fixed value may, for example, reflect a known population mean or a quality target that is to be checked.

Social science example:

You want to find out whether the health perception of managers in Canada differs from that of the population as a whole. For this purpose you ask 50 managers about their perception of health.

Technical example:

You want to find out if the screws your company produces really weigh 10 grams on average. To test this, weigh 50 screws and compare the actual weight with the weight they should have (10 grams).

Medical example:

A pharmaceutical company promises that its new drug lowers blood pressure by 10 mmHg in one week. You want to find out if this is correct. To do this, compare the observed reduction in blood pressure of 75 test subjects with the expected reduction of 10 mmHg.

Assumptions

In a one sample t-test, the data under consideration must be from a random sample, have metric scale of measurement , and be normally distributed.

One tailed and two tailed t-test

So if you want to know whether a sample differs from the population, you have to calculate a one sample t-test . But before the t-test can be calculated, a question and the hypotheses must first be defined. This determines whether a one tailed (directional) or a two tailed (non-directional) t-test must be calculated.

The question helps you to define the object of investigation. In the case of the one sample t-test the question is:

Two tailed (non-directional)

Is there a statistically significant difference between the mean value of the sample and the population?

One tailed (directional)

Is the mean value of the sample significantly larger (or smaller) than the mean value of the population?

For the examples above, this gives us the following questions:

  • Does the health perception of managers in Canada differ from that of the overall population in Canada?
  • Does the production plant produce screws with a weight of 10 grams?
  • Does the new drug lower blood pressure by 10 mmHg within one week?

Hypotheses t-Test

In order to perform a one sample t-test, the following hypotheses are formulated:

Two tailed (non-directional):

  • Null hypothesis H 0 : The mean value of the population is equal to the specified value.
  • Alternative hypothesis H 1 : The mean value of the population is not equal to the specified value.

One tailed (directional):

  • Null hypothesis H 0 : The mean value of the population is equal to or greater than (or less than) the specified value.
  • Alternative hypothesis H 1 : The mean value of the population is smaller (or larger) than the specified value.

One sample t-test equation

You can calculate the t-test either with statistics software like DATAtab or by hand. For the calculation by hand you first need the test statistic t , which for the one sample t-test is given by

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

where \(\bar{x}\) is the sample mean, \(\mu_0\) the specified value, \(s\) the sample standard deviation (i.e. the population standard deviation estimated using the sample) and \(n\) the sample size.

In order to check whether the sample mean differs significantly from the specified value, the critical t-value must be calculated. First the number of degrees of freedom, abbreviated df , is required, which is calculated as the number of observations minus one.

If the number of degrees of freedom is known, the critical t-value can be determined using a table of t-values . For a sample of 12 people, the number of degrees of freedom is 11, and the significance level is assumed to be 5%. The table below shows the critical values of the t-distribution. Depending on whether you want to calculate a one tailed (directional) or two tailed (non-directional) t-test, you must read the t-value at either 0.95 or 0.975. For the non-directional hypothesis and a significance level of 5%, the critical t-value is 2.201.

If the calculated t value is below the critical t value, there is no significant difference between the sample and the population; if it is above the critical t value, there is a significant difference.
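Instead of a printed table, the critical t-values can also be looked up in R; a minimal sketch for df = 11 and a 5% significance level:

  qt(0.95,  df = 11)   # one tailed (directional) critical value, about 1.80
  qt(0.975, df = 11)   # two tailed (non-directional) critical value, 2.201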

Interpret t-value

The t-value is calculated by dividing the measured difference by the scatter in the sample data. The larger the magnitude of t, the more this argues against the null hypothesis. If the calculated t-value is larger than the critical t-value, the null hypothesis is rejected.

Number of degrees of freedom - df

The number of degrees of freedom indicates how many values are allowed to vary freely. The degrees of freedom are therefore the number of independent individual pieces of information.

One sample t-test example

As an example for the t-test for one sample, we examine whether an online statistics tutorial newly introduced at the university has an effect on the students' examination results.

The average score in the statistics test at a university has been 28 points for years. This semester a new online statistics tutorial was introduced. Now the course management would like to know whether the success of the studies has changed since the introduction of the statistics tutorial: Does the online statistics tutorial have a positive effect on exam results?

The population considered is all students who have written the statistics exam since the new statistics tutorial was introduced. The reference value to be compared is 28.

Null hypothesis H0

The mean value from the sample and the predefined value do not differ significantly. The online statistics tutorial has no significant effect on exam results.

Here's how it goes on DATAtab:

Do you want to calculate a t-test independently? Calculate the example in the Statistics Calculator. Just copy the upper table including the first row into the t-Test Calculator . Datatab will then provide you with the tables below.

The following results are obtained with DATAtab: The mean value is 32.33 and the standard deviation 5.46. This leads to a standard error of the mean of 1.57. The t-statistic is thus 2.75.

You would now like to know whether the difference from the reference value of 28 is statistically significant. To do this, you first specify a significance level in DATAtab; usually 5% is used, which is preselected. You will then get the table below in DATAtab.

One sample t-test (Test Value = 28)

t = 2.75, df = 11, p (2-tailed) = 0.02, mean difference = 4.33, 95% confidence interval of the difference: 0.86 to 7.81

To interpret whether your hypothesis is significant one of the two values can be used:

  • p-value (2-tailed)
  • lower and upper confidence interval of the difference

In this example the p-value (2-tailed) is 0.02, i.e. 2%. Put into words this means: if the population mean really is 28, the probability of drawing a sample whose mean differs from it by 4.33 or more is 2%. The significance level was set at 5%, which is greater than 2%. For this reason, a significant difference between the sample and the reference value is assumed.

Whether or not there is a significant difference can also be read from the confidence interval of the difference. If the lower and upper limits include zero, there is no significant difference; if they do not, there is a significant difference. In this example, the lower value is 0.86 and the upper value is 7.81. Since the interval does not include zero, there is a significant difference.
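For readers who want to verify the table by hand, a minimal R sketch reproduces these results from the reported summary statistics (values match up to rounding):

  M <- 32.33; SD <- 5.46; n <- 12; mu0 <- 28
  SE <- SD / sqrt(n)                  # about 1.57
  t  <- (M - mu0) / SE                # about 2.75
  p  <- 2 * pt(-abs(t), df = n - 1)   # about 0.02
  (M - mu0) + c(-1, 1) * qt(0.975, df = n - 1) * SE   # CI of the difference, about 0.86 to 7.80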

APA Style | One sample t-test

If we were to report the above results for publication in an APA journal, that is, in APA format, we would write it like this:

A one sample t-test showed a statistically reliable difference between the mean score of students who attended the online course (M = 32.33, SD = 5.46) and the reference value of 28, t(11) = 2.75, p = .02, α = .05.


Calcworkshop

One Sample T Test Easily Explained w/ 5+ Examples!


Did you know that a hypothesis test for a sample mean is the same thing as a one sample t-test?


Learn the how-to with 5 step-by-step examples.

Let’s go!

What is a One Sample T Test?

A one sample t-test determines whether or not the sample mean is statistically different (statistically significant) from a population mean.

While significance tests for population proportions are based on z-scores and the normal distribution, hypothesis testing for population means depends on whether or not the population standard deviation is known or unknown.

For a one sample t test, we compare a test variable against a test value. Whether or not we know the population standard deviation determines which type of test statistic we calculate.

T Test Vs. Z Test

So, determining whether to use a z-test or a t-test comes down to four things:

  • Are we working with a proportion (z-test) or a mean (z-test or t-test)?
  • Do we know the population standard deviation (z-test) or not (t-test)?
  • Is the population normally distributed?
  • What is the sample size? If the sample is smaller than 30, we need the population to be approximately normal; if the sample is larger than 30, the central limit theorem lets us treat the sampling distribution of the mean as approximately normal.

How To Calculate a Test Statistic

Standard Deviation Known

If the population standard deviation is known , then our significance test will follow a z-value. And as we learned while conducting confidence intervals, if our sample size is larger than 30, the central limit theorem lets us treat the sampling distribution of the mean as approximately normal; if our sample size is less than 30, we need the population itself to be normally distributed (or approximately so).

Z test statistic formula:

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

Standard Deviation Unknown

If the population standard deviation is unknown , we will use a sample standard deviation that will be close enough to the unknown population standard deviation. But this will also cause us to have to use a t-distribution instead of a normal distribution as noted by StatTrek .

Just like we saw with confidence intervals for population means, the t-distribution has an additional parameter representing the degrees of freedom or the number of observations that can be chosen freely.

T test statistic formula:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

This means that our test statistic will be a t-value rather than a z-value. But thankfully, how we find our p-value and draw our final inference is the same as for hypothesis testing for proportions, as the graphic below illustrates.

How To Find The P Value

Example Question

For example, imagine a company wants to test the claim that their batteries last more than 40 hours. A simple random sample of 15 batteries yielded a mean of 44.9 hours, with a standard deviation of 8.9 hours. Test this claim using a significance level of 0.05.

One Sample T Test Example

How To Find P Value From T

So, our p-value is a probability: it tells us how likely it is, assuming the null hypothesis is true, to obtain a test statistic as extreme as or more extreme than the one we observed. To find this value we either use a calculator or a t-table, as we will demonstrate in the video.

We have significant evidence to support the company’s claim that their batteries last more than 40 hours.
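A minimal R sketch of this example, using the reported summary statistics (right-tailed test of H0: μ = 40 against H1: μ > 40):

  xbar <- 44.9; s <- 8.9; n <- 15; mu0 <- 40
  t <- (xbar - mu0) / (s / sqrt(n))            # about 2.13
  p <- pt(t, df = n - 1, lower.tail = FALSE)   # one-tailed p value, about 0.026 < 0.05, so we reject H0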

What Does The P Value Mean?

Together we will work through various examples of how to create a hypothesis test about population means using normal distributions and t-distributions.

One Sample T Test – Lesson & Examples (Video)

  • Introduction to Video: One Sample t-test
  • 00:00:43 – Steps for conducting a hypothesis test for population means (one sample z-test or one sample t-test)
  • 00:03:49 – Conduct a hypothesis test and confidence interval when population standard deviation is known (Example #1)
  • 00:13:49 – Test the null hypothesis when population standard deviation is known (Example #2)
  • 00:18:56 – Use a one-sample t-test to test a claim (Example #3)
  • 00:26:50 – Conduct a hypothesis test and confidence interval when population standard deviation is unknown (Example #4)
  • 00:37:16 – Conduct a hypothesis test by using a one-sample t-test and provide a confidence interval (Example #5)
  • 00:49:19 – Test the hypothesis by first finding the sample mean and standard deviation (Example #6)
  • Practice Problems with Step-by-Step Solutions
  • Chapter Tests with Video Solutions



Single Sample t Test

Menu location: Analysis_Parametric_Single Sample t .

This function gives a single sample Student t test with a confidence interval for the mean difference.

The single sample t method tests a null hypothesis that the population mean is equal to a specified value. If this value is zero (or not entered) then the confidence interval for the sample mean is given ( Altman, 1991; Armitage and Berry, 1994 ).

The test statistic is calculated as:

$$t = \frac{\bar{x} - \mu}{\sqrt{s^2 / n}}$$

- where \(\bar{x}\) is the sample mean, \(s^2\) is the sample variance, \(n\) is the sample size, \(\mu\) is the specified population mean and \(t\) is a Student t quantile with \(n-1\) degrees of freedom.

Power is calculated as the power achieved with the given sample size and variance for detecting the observed mean difference with a two-sided type I error probability of (100-CI%)% ( Dupont, 1990 ).

Test workbook (Parametric worksheet: Systolic BP).

Consider 20 first-year female resident doctors drawn at random from one area; their resting systolic blood pressures, measured using an electronic sphygmomanometer, were:

From previous large studies of women drawn at random from the healthy general public, a resting systolic blood pressure of 120 mm Hg was predicted as the population mean for the relevant age group. To analyse these data in StatsDirect first prepare a workbook column containing the 20 data above or open the test workbook and select the single sample t test from the parametric methods section of the analysis menu. Select the column marked "Systolic BP" when prompted and enter the population mean as 120.

For this example:

Single sample t test

Sample name: Systolic BP

Sample mean = 130.05

Population mean = 120

Sample size n = 20

Sample sd = 9.960316

95% confidence interval for mean difference = 5.388429 to 14.711571

t = 4.512404

One sided P = .0001

Two sided P = .0002

Power (for 5% significance) = 98.71%

A null hypothesis of no difference between sample and population means has clearly been rejected. Using the 95% CI we expect the mean systolic BP for this population of doctors to be at least 5 mm Hg greater than the age and sex matched general public, lying somewhere between 125 and 135 mm Hg.
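The same numbers can be reproduced outside StatsDirect; a minimal R sketch from the summary statistics above:

  xbar <- 130.05; s <- 9.960316; n <- 20; mu0 <- 120
  SE <- s / sqrt(n)
  t  <- (xbar - mu0) / SE                                # 4.51
  p2 <- 2 * pt(-abs(t), df = n - 1)                      # two sided P, about 0.0002
  (xbar - mu0) + c(-1, 1) * qt(0.975, df = n - 1) * SE   # 95% CI for the mean difference, about 5.39 to 14.71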



Single Sample T-Test Calculator

A single sample t-test (or one sample t-test) is used to compare the mean of a single sample of scores to a known or hypothetical population mean. So, for example, it could be used to determine whether the mean diastolic blood pressure of a particular group differs from 85, a value determined by a previous study.

Requirements

  • The data is normally distributed
  • Scale of measurement should be interval or ratio
  • A randomized sample from a defined population

Null Hypothesis

H 0 : M - μ = 0, where M is the sample mean and μ is the population or hypothesized mean.

As above, the null hypothesis is that there is no difference between the sample mean and the known or hypothesized population mean.


Statistics LibreTexts

11.3: The Independent Samples t-test (Student Test)


  • Danielle Navarro
  • University of New South Wales


Although the one sample t-test has its uses, it’s not the most typical example of a t-test 189 . A much more common situation arises when you’ve got two different groups of observations. In psychology, this tends to correspond to two different groups of participants, where each group corresponds to a different condition in your study. For each person in the study, you measure some outcome variable of interest, and the research question that you’re asking is whether or not the two groups have the same population mean. This is the situation that the independent samples t-test is designed for.

Suppose we have 33 students taking Dr Harpo’s statistics lectures, and Dr Harpo doesn’t grade to a curve. Actually, Dr Harpo’s grading is a bit of a mystery, so we don’t really know anything about what the average grade is for the class as a whole. There are N1 = 15 students in Anastasia’s tutorials, and N2 = 18 in Bernadette’s tutorials. The research question I’m interested in is whether Anastasia or Bernadette is a better tutor, or if it doesn’t make much of a difference. Dr Harpo emails me the course grades, in the harpo.Rdata file. As usual, I’ll load the file and have a look at what variables it contains:

As we can see, there’s a single data frame with two variables, grade and tutor . The grade variable is a numeric vector, containing the grades for all N=33 students taking Dr Harpo’s class; the tutor variable is a factor that indicates who each student’s tutor was. The first six observations in this data set are shown below:

We can calculate means and standard deviations, using the mean() and sd() functions. Rather than show the R output, here’s a nice little summary table:

To give you a more detailed sense of what’s going on here, I’ve plotted histograms showing the distribution of grades for both tutors (Figure 13.6 and 13.7). Inspection of these histograms suggests that the students in Anastasia’s class may be getting slightly better grades on average, though they also seem a little more variable.


Here is a simpler plot showing the means and corresponding confidence intervals for both groups of students (Figure 13.8).


Introducing the test

The independent samples t-test comes in two different forms, Student’s and Welch’s. The original Student t-test – which is the one I’ll describe in this section – is the simpler of the two, but relies on much more restrictive assumptions than the Welch t-test. Assuming for the moment that you want to run a two-sided test, the goal is to determine whether two “independent samples” of data are drawn from populations with the same mean (the null hypothesis) or different means (the alternative hypothesis). When we say “independent” samples, what we really mean here is that there’s no special relationship between observations in the two samples. This probably doesn’t make a lot of sense right now, but it will be clearer when we come to talk about the paired samples t-test later on. For now, let’s just point out that if we have an experimental design where participants are randomly allocated to one of two groups, and we want to compare the two groups’ mean performance on some outcome measure, then an independent samples t-test (rather than a paired samples t-test) is what we’re after.

Okay, so let’s let μ 1 denote the true population mean for group 1 (e.g., Anastasia’s students), and μ 2 will be the true population mean for group 2 (e.g., Bernadette’s students), 190 and as usual we’ll let \(\bar{X}_{1}\) and \(\bar{X}_{2}\) denote the observed sample means for both of these groups. Our null hypothesis states that the two population means are identical (μ 1 =μ 2 ) and the alternative to this is that they are not (μ 1 ≠μ 2 ). Written in mathematical-ese, this is…

H0: μ1 = μ2

H1: μ1 ≠ μ2


To construct a hypothesis test that handles this scenario, we start by noting that if the null hypothesis is true, then the difference between the population means is exactly zero, μ1 − μ2 = 0. As a consequence, a diagnostic test statistic will be based on the difference between the two sample means, because if the null hypothesis is true, then we’d expect

\(\bar{X}_{1}\) - \(\bar{X}_{2}\)

to be pretty close to zero. However, just like we saw with our one-sample tests (i.e., the one-sample z-test and the one-sample t-test) we have to be precise about exactly how close to zero this difference should be. The solution is the same as before: we divide the difference between the sample means by an estimate of its standard error, which gives a t-statistic of the form

\(t = \dfrac{\bar{X}_1 - \bar{X}_2}{\mathrm{SE}}\)

We just need to figure out what this standard error estimate actually is. This is a bit trickier than was the case for either of the two tests we’ve looked at so far, so we need to go through it a lot more carefully to understand how it works.

“pooled estimate” of the standard deviation

In the original “Student t-test”, we make the assumption that the two groups have the same population standard deviation: that is, regardless of whether the population means are the same, we assume that the population standard deviations are identical, σ1 = σ2. Since we’re assuming that the two standard deviations are the same, we drop the subscripts and refer to both of them as σ. How should we estimate this? How should we construct a single estimate of a standard deviation when we have two samples? The answer is, basically, we average them. Well, sort of. Actually, what we do is take a weighted average of the variance estimates, which we use as our pooled estimate of the variance . The weight assigned to each sample is equal to the number of observations in that sample, minus 1. Mathematically, we can write this as

\(\omega_1 = N_1 - 1\)

\(\omega_2 = N_2 - 1\)

Now that we’ve assigned weights to each sample, we calculate the pooled estimate of the variance by taking the weighted average of the two variance estimates, \(\ \hat{\sigma_1}^2\) and \(\ \hat{\sigma_2}^2\)

\(\ \hat{\sigma_p}^2 ={ \omega_{1}\hat{\sigma_1}^2+\omega_{2}\hat{\sigma_2}^2 \over \omega_{1}+\omega_{2}}\)

Finally, we convert the pooled variance estimate to a pooled standard deviation estimate, by taking the square root. This gives us the following formula for \(\ \hat{\sigma_p}\),

\(\ \hat{\sigma_p} =\sqrt{\omega_1\hat{\sigma_1}^2+\omega_2\hat{\sigma_2}^2\over \omega_1+\omega_2} \)

And if you mentally substitute \(\omega_1 = N_1 - 1\) and \(\omega_2 = N_2 - 1\) into this equation you get a very ugly looking formula; a very ugly formula that actually seems to be the “standard” way of describing the pooled standard deviation estimate. It’s not my favourite way of thinking about pooled standard deviations, however. 191

same pooled estimate, described differently

I prefer to think about it like this. Our data set actually corresponds to a set of N observations, which are sorted into two groups. So let’s use the notation X ik to refer to the grade received by the i-th student in the k-th tutorial group: that is, X 11 is the grade received by the first student in Anastasia’s class, X 21 is her second student, and so on. And we have two separate group means \(\ \bar{X_1}\) and \(\ \bar{X_2}\), which we could “generically” refer to using the notation \(\ \bar{X_k}\), i.e., the mean grade for the k-th tutorial group. So far, so good. Now, since every single student falls into one of the two tutorials, we can describe their deviation from the group mean as the difference

\(\ X_{ik} - \bar{X_k}\)

So why not just use these deviations (i.e., the extent to which each student’s grade differs from the mean grade in their tutorial?) Remember, a variance is just the average of a bunch of squared deviations, so let’s do that. Mathematically, we could write it like this:

$$\dfrac{\sum_{ik} (X_{ik}-\bar{X}_k)^2}{N}$$

where the notation “∑ ik ” is a lazy way of saying “calculate a sum by looking at all students in all tutorials”, since each “ik” corresponds to one student. 192 But, as we saw in Chapter 10, calculating the variance by dividing by N produces a biased estimate of the population variance. And previously, we needed to divide by N−1 to fix this. However, as I mentioned at the time, the reason why this bias exists is because the variance estimate relies on the sample mean; and to the extent that the sample mean isn’t equal to the population mean, it can systematically bias our estimate of the variance. But this time we’re relying on two sample means! Does this mean that we’ve got more bias? Yes, yes it does. And does this mean we now need to divide by N−2 instead of N−1, in order to calculate our pooled variance estimate? Why, yes…

\(\hat{\sigma}_p^{\,2}=\dfrac{\sum_{ik}\left(X_{ik}-\bar{X}_{k}\right)^{2}}{N-2}\)

Oh, and if you take the square root of this then you get \(\ \hat{\sigma_{P}}\), the pooled standard deviation estimate. In other words, the pooled standard deviation calculation is nothing special: it’s not terribly different to the regular standard deviation calculation.

Completing the test

Regardless of which way you want to think about it, we now have our pooled estimate of the standard deviation. From now on, I’ll drop the silly p subscript, and just refer to this estimate as \(\hat{\sigma}\). Great. Let’s now go back to thinking about the bloody hypothesis test, shall we? Our whole reason for calculating this pooled estimate was that we knew it would be helpful when calculating our standard error estimate. But, standard error of what ? In the one-sample t-test, it was the standard error of the sample mean, \(\operatorname{SE}(\bar{X})\), and since \(\operatorname{SE}(\bar{X}) = \hat{\sigma}/\sqrt{N}\), that’s what the denominator of our t-statistic looked like. This time around, however, we have two sample means. And what we’re interested in, specifically, is the difference between the two, \(\bar{X}_1 - \bar{X}_2\). As a consequence, the standard error that we need to divide by is in fact the standard error of the difference between means. As long as the two variables really do have the same standard deviation, then our estimate for the standard error is

\(\operatorname{SE}\left(\bar{X}_{1}-\bar{X}_{2}\right)=\hat{\sigma} \sqrt{\dfrac{1}{N_{1}}+\dfrac{1}{N_{2}}}\)

and our t-statistic is therefore

\(t=\dfrac{\bar{X}_{1}-\bar{X}_{2}}{\operatorname{SE}\left(\bar{X}_{1}-\bar{X}_{2}\right)}\)

which (shocking, isn’t it?) follows a t distribution as long as the null hypothesis is true and all of the assumptions of the test are met. The degrees of freedom, however, are slightly different. As usual, we can think of the degrees of freedom as being equal to the number of data points minus the number of constraints. In this case, we have N observations (N1 in sample 1, and N2 in sample 2), and 2 constraints (the sample means). So the total degrees of freedom for this test are N−2.
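Before turning to R's built-in functions, here is a minimal sketch of the whole calculation by hand, using the (rounded) group means and standard deviations reported later in this section; the summary values are taken from that write-up, not re-derived from the raw data:

  N1 <- 15; N2 <- 18           # students per tutorial group
  m1 <- 74.5; m2 <- 69.1       # group means (Anastasia, Bernadette)
  s1 <- 9.0;  s2 <- 5.8        # group standard deviations
  sp <- sqrt(((N1 - 1) * s1^2 + (N2 - 1) * s2^2) / (N1 + N2 - 2))   # pooled SD
  SE <- sp * sqrt(1 / N1 + 1 / N2)   # standard error of the difference
  t  <- (m1 - m2) / SE               # about 2.1
  df <- N1 + N2 - 2                  # 31
  p  <- 2 * pt(-abs(t), df)          # about .04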

Doing the test in R

Not surprisingly, you can run an independent samples t-test using the t.test() function (Section 13.7), but once again I’m going to start with a somewhat simpler function in the lsr package. That function is unimaginatively called independentSamplesTTest() . First, recall that our data look like this:

The outcome variable for our test is the student grade , and the groups are defined in terms of the tutor for each class. So you probably won’t be too surprised to see that we’re going to describe the test that we want in terms of an R formula that reads like this grade ~ tutor . The specific command that we need is:
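  independentSamplesTTest(
      formula = grade ~ tutor,   # formula specifying the outcome and the grouping variable
      data = harpo,              # data frame containing these variables
      var.equal = TRUE           # assume equal variances (Student test)
  )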

The first two arguments should be familiar to you. The first one is the formula that tells R what variables to use and the second one tells R the name of the data frame that stores those variables. The third argument is not so obvious. By saying var.equal = TRUE , what we’re really doing is telling R to use the Student independent samples t-test. More on this later. For now, lets ignore that bit and look at the output:

The output has a very familiar form. First, it tells you what test was run, and it tells you the names of the variables that you used. The second part of the output reports the sample means and standard deviations for both groups (i.e., both tutorial groups). The third section of the output states the null hypothesis and the alternative hypothesis in a fairly explicit form. It then reports the test results: just like last time, the test results consist of a t-statistic, the degrees of freedom, and the p-value. The final section reports two things: it gives you a confidence interval, and an effect size. I’ll talk about effect sizes later. The confidence interval, however, I should talk about now.

It’s pretty important to be clear on what this confidence interval actually refers to: it is a confidence interval for the difference between the group means. In our example, Anastasia’s students had an average grade of 74.5, and Bernadette’s students had an average grade of 69.1, so the difference between the two sample means is 5.4. But of course the difference between population means might be bigger or smaller than this. The confidence interval reported by the independentSamplesTTest() function tells you that there’s a 95% chance that the true difference between means lies between 0.2 and 10.8.

In any case, the difference between the two groups is significant (just barely), so we might write up the result using text like this:

The mean grade in Anastasia’s class was 74.5% (std dev = 9.0), whereas the mean in Bernadette’s class was 69.1% (std dev = 5.8). A Student’s independent samples t-test showed that this 5.4% difference was significant (t(31)=2.1, p<.05, CI 95 =[0.2,10.8], d=.74), suggesting that a genuine difference in learning outcomes has occurred.

Notice that I’ve included the confidence interval and the effect size in the stat block. People don’t always do this. At a bare minimum, you’d expect to see the t-statistic, the degrees of freedom and the p value. So you should include something like this at a minimum: t(31)=2.1, p<.05. If statisticians had their way, everyone would also report the confidence interval and probably the effect size measure too, because they are useful things to know. But real life doesn’t always work the way statisticians want it to: you should make a judgment based on whether you think it will help your readers, and (if you’re writing a scientific paper) the editorial standard for the journal in question. Some journals expect you to report effect sizes, others don’t. Within some scientific communities it is standard practice to report confidence intervals, in others it is not. You’ll need to figure out what your audience expects. But, just for the sake of clarity, if you’re taking my class: my default position is that it’s usually worth including the effect size, but don’t worry about the confidence interval unless the assignment asks you to or implies that you should.

Positive and negative t values

Before moving on to talk about the assumptions of the t-test, there’s one additional point I want to make about the use of t-tests in practice. It relates to the sign of the t-statistic (that is, whether it is a positive number or a negative one). One very common worry that students have when they start running their first t-test is that they often end up with negative values for the t-statistic, and don’t know how to interpret it. In fact, it’s not at all uncommon for two people working independently to end up with R outputs that are almost identical, except that one person has a negative t value and the other one has a positive t value. Assuming that you’re running a two-sided test, then the p-values will be identical. On closer inspection, the students will notice that the confidence intervals also have the opposite signs. This is perfectly okay: whenever this happens, what you’ll find is that the two versions of the R output arise from slightly different ways of running the t-test. What’s happening here is very simple. The t-statistic that R is calculating here is always of the form

\(t=\dfrac{(\text{mean } 1)-(\text{mean } 2)}{\mathrm{SE}}\)

If “mean 1” is larger than “mean 2” the t statistic will be positive, whereas if “mean 2” is larger then the t statistic will be negative. Similarly, the confidence interval that R reports is the confidence interval for the difference “(mean 1) minus (mean 2)”, which will be the reverse of what you’d get if you were calculating the confidence interval for the difference “(mean 2) minus (mean 1)”.
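To see this in action, here is a minimal sketch in base R. The data are made up (I’m assuming class sizes of 15 and 18 purely for illustration), and I’m using the built-in t.test() function with var.equal = TRUE rather than independentSamplesTTest(), but the point about signs is the same: swap the order of the two groups and the t-statistic and confidence interval flip sign, while the two-sided p-value stays put.

set.seed(1)
anastasia  <- rnorm(15, mean = 74.5, sd = 9.0)   # hypothetical grades for group 1
bernadette <- rnorm(18, mean = 69.1, sd = 5.8)   # hypothetical grades for group 2

t1 <- t.test(anastasia, bernadette, var.equal = TRUE)  # "mean 1" = Anastasia
t2 <- t.test(bernadette, anastasia, var.equal = TRUE)  # "mean 1" = Bernadette

t1$statistic; t2$statistic   # same magnitude, opposite signs
t1$p.value;  t2$p.value      # identical
t1$conf.int; t2$conf.int     # same interval, with the signs reversed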

Okay, that’s pretty straightforward when you think about it, but now consider our t-test comparing Anastasia’s class to Bernadette’s class. Which one should we call “mean 1” and which one should we call “mean 2”? It’s arbitrary. However, you really do need to designate one of them as “mean 1” and the other one as “mean 2”. Not surprisingly, the way that R handles this is also pretty arbitrary. In earlier versions of the book I used to try to explain it, but after a while I gave up, because it’s not really all that important, and to be honest I can never remember myself. Whenever I get a significant t-test result, and I want to figure out which mean is the larger one, I don’t try to figure it out by looking at the t-statistic. Why would I bother doing that? It’s foolish. It’s easier to just look at the actual group means, since the R output actually shows them!

Here’s the important thing. Because it really doesn’t matter what R printed out, I usually try to report the t-statistic in such a way that the numbers match up with the text. Here’s what I mean… suppose that what I want to write in my report is “Anastasia’s class had higher grades than Bernadette’s class”. The phrasing here implies that Anastasia’s group comes first, so it makes sense to report the t-statistic as if Anastasia’s class corresponded to group 1. If so, I would write

Anastasia’s class had higher grades than Bernadette’s class (t(31)=2.1,p=.04).

(I wouldn’t actually emphasise the word “higher” in real life, I’m just doing it to emphasise the point that “higher” corresponds to positive t values). On the other hand, suppose the phrasing I wanted to use has Bernadette’s class listed first. If so, it makes more sense to treat her class as group 1, and if so, the write up looks like this:

Bernadette’s class had lower grades than Anastasia’s class (t(31)=−2.1,p=.04).

Because I’m talking about one group having “lower” scores this time around, it is more sensible to use the negative form of the t-statistic. It just makes it read more cleanly.

One last thing: please note that you can’t do this for other types of test statistics. It works for t-tests, but it wouldn’t be meaningful for chi-square tests, F-tests or indeed for most of the tests I talk about in this book. So don’t overgeneralise this advice! I’m really just talking about t-tests here and nothing else!

Assumptions of the test

As always, our hypothesis test relies on some assumptions. So what are they? For the Student t-test there are three assumptions, some of which we saw previously in the context of the one sample t-test (see Section 13.2.3):

  • Normality . Like the one-sample t-test, it is assumed that the data are normally distributed. Specifically, we assume that both groups are normally distributed. In Section 13.9 we’ll discuss how to test for normality, and in Section 13.10 we’ll discuss possible solutions.
  • Independence . Once again, it is assumed that the observations are independently sampled. In the context of the Student test this has two aspects to it. Firstly, we assume that the observations within each sample are independent of one another (exactly the same as for the one-sample test). However, we also assume that there are no cross-sample dependencies. If, for instance, it turns out that you included some participants in both experimental conditions of your study (e.g., by accidentally allowing the same person to sign up to different conditions), then there are some cross sample dependencies that you’d need to take into account.
  • Homogeneity of variance (also called “homoscedasticity”). The third assumption is that the population standard deviation is the same in both groups. You can test this assumption using the Levene test, which I’ll talk about later on in the book (Section 14.7). However, there’s a very simple remedy for this assumption, which I’ll talk about in the next section.


SCIPAC: quantitative estimation of cell-phenotype associations

Dailin Gan, Yini Zhu, Xin Lu & Jun Li (ORCID: orcid.org/0000-0003-4353-5761)

Genome Biology, volume 25, Article number: 119 (2024). Open access; published 13 May 2024.

Numerous algorithms have been proposed to identify cell types in single-cell RNA sequencing data, yet a fundamental problem remains: determining associations between cells and phenotypes such as cancer. We develop SCIPAC, the first algorithm that quantitatively estimates the association between each cell in single-cell data and a phenotype. SCIPAC also provides a p -value for each association and applies to data with virtually any type of phenotype. We demonstrate SCIPAC’s accuracy in simulated data. On four real cancerous or noncancerous datasets, insights from SCIPAC help interpret the data and generate new hypotheses. SCIPAC requires minimum tuning and is computationally very fast.

Single-cell RNA sequencing (scRNA-seq) technologies are revolutionizing biomedical research by providing comprehensive characterizations of diverse cell populations in heterogeneous tissues [ 1 , 2 ]. Unlike bulk RNA sequencing (RNA-seq), which measures the average expression profile of the whole tissue, scRNA-seq gives the expression profiles of thousands of individual cells in the tissue [ 3 , 4 , 5 , 6 , 7 ]. Based on this rich data, cell types may be discovered/determined in an unsupervised (e.g., [ 8 , 9 ]), semi-supervised (e.g., [ 10 , 11 , 12 , 13 ]), or supervised manner (e.g., [ 14 , 15 , 16 ]). Despite the fast development, there are still limitations with scRNA-seq technologies. Notably, the cost for each scRNA-seq experiment is still high; as a result, most scRNA-seq data are from a single or a few biological samples/tissues. Very few datasets consist of large numbers of samples with different phenotypes, e.g., cancer vs. normal. This places great difficulties in determining how a cell type contributes to a phenotype based on single-cell studies (especially if the cell type is discovered in a completely unsupervised manner or if people have limited knowledge of this cell type). For example, without having single-cell data from multiple cancer patients and multiple normal controls, it could be hard to computationally infer whether a cell type may promote or inhibit cancer development. However, such association can be critical for cancer research [ 17 ], disease diagnosis [ 18 ], cell-type targeted therapy development [ 19 ], etc.

Fortunately, this difficulty may be overcome by borrowing information from bulk RNA-seq data. Over the past decade, a considerable amount of bulk RNA-seq data from a large number of samples with different phenotypes have been accumulated and made available through databases like The Cancer Genome Atlas (TCGA) [ 20 ] and cBioPortal [ 21 , 22 ]. Data in these databases often contain comprehensive patient phenotype information, such as cancer status, cancer stages, survival status and time, and tumor metastasis. Combining single-cell data from a single or a few individuals and bulk data from a relatively large number of individuals regarding a particular phenotype can be a cost-effective way to determine how a cell type contributes to the phenotype. A recent method Scissor [ 23 ] took an essential step in this direction. It uses single-cell and bulk RNA-seq data with phenotype information to classify the cells into three discrete categories: Scissor+, Scissor−, and null cells, corresponding to cells that are positively associated, negatively associated, and not associated with the phenotype.

Here, we present a method that takes another big step in this direction, called Single-Cell and bulk data-based Identifier for Phenotype Associated Cells, or SCIPAC for short. SCIPAC enables quantitative estimation of the strength of association between each cell in an scRNA-seq dataset and a phenotype, with the help of bulk RNA-seq data with phenotype information. Moreover, SCIPAC also enables the estimation of the statistical significance of the association. That is, it gives a p-value for the association between each cell and the phenotype. Furthermore, SCIPAC enables the estimation of association between cells and an ordinal phenotype (e.g., different stages of cancer), which could be informative as people may not only be interested in the emergence/existence of cancer (cancer vs. healthy, a binary problem) but also in the progression of cancer (different stages of cancer, an ordinal problem).

To study the performance of SCIPAC, we first apply SCIPAC to simulated data under three schemes. SCIPAC shows high accuracy with low false positive rates. We further show the broad applicability of SCIPAC on real datasets across various diseases, including prostate cancer, breast cancer, lung cancer, and muscular dystrophy. The association inferred by SCIPAC is highly informative. In real datasets, some cell types have definite and well-studied functions, while others are less well-understood: their functions may be disease-dependent or tissue-dependent, and they may contain different sub-types with distinct functions. In the former case, SCIPAC’s results agree with current biological knowledge. In the latter case, SCIPAC’s discoveries inspire the generation of new hypotheses regarding the roles and functions of cells under different conditions.

An overview of the SCIPAC algorithm

SCIPAC is a computational method that identifies cells in single-cell data that are associated with a given phenotype. This phenotype can be binary (e.g., cancer vs. normal), ordinal (e.g., cancer stage), continuous (e.g., quantitative traits), or survival (i.e., survival time and status). SCIPAC uses input data consisting of three parts: single-cell RNA-seq data that measures the expression of p genes in m cells, bulk RNA-seq data that measures the expression of the same set of p genes in n samples/tissues, and the statuses/values of the phenotype of the n bulk samples/tissues. The output of SCIPAC is the strength and the p -value of the association between each cell and the phenotype.

SCIPAC proposes the following definition of “association” between a cell and a phenotype: A group of cells that are likely to play a similar role in the phenotype (such as cells of a specific cell type or sub-type, cells in a particular state, cells in a cluster, cells with similar expression profiles, or cells with similar functions) is considered to be positively/negatively associated with a phenotype if an increase in their proportion within the tissue likely indicates an increased/decreased probability of the phenotype’s presence. SCIPAC assigns the same association to all cells within such a group. Taking cancer as the phenotype as an example, if increasing the proportion of a cell type indicates a higher chance of having cancer (binary), having a higher cancer stage (ordinal), or a higher hazard rate (survival), all cells in this cell type are positively associated with cancer.

The SCIPAC algorithm consists of four steps. First, the cells in the single-cell data are grouped into clusters according to their expression profiles. The Louvain algorithm from the Seurat package [ 24 , 25 ] is used as the default clustering algorithm, but the user may choose any clustering algorithm they prefer. Alternatively, if information on the cell types or other groupings of cells is available a priori, it may be supplied to SCIPAC as the cell clusters, and this clustering step can be skipped. In the second step, a regression model is learned from the bulk gene expression data with the phenotype. Depending on the type of the phenotype, this model can be logistic regression, ordinary linear regression, a proportional odds model, or a Cox proportional hazards model. To achieve higher prediction power with less variance, by default, the elastic net (a blend of Lasso and ridge regression [ 26 ]) is used to fit the model. In the third step, SCIPAC computes the association strength \(\Lambda\) between each cell cluster and the phenotype based on a mathematical formula that we derive. Finally, the p-values are computed. The association strength and its p-value between a cell cluster and the phenotype are given to all cells in the cluster.
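To make the workflow concrete, here is a rough sketch in R of what these four steps might look like when assembled from standard tools. This is not the SCIPAC package itself, just an illustration: sc_counts (a genes-by-cells count matrix), bulk_expr (a samples-by-genes expression matrix), and bulk_y (a binary phenotype vector) are hypothetical placeholder objects, the p-value step is omitted, and the association-strength step follows the averaged, \(\varvec{\beta }\)-weighted contrast between a cluster’s mean expression and the mean bulk profile described in the “ Methods ” section.

library(Seurat)
library(glmnet)

# Step 1: cluster the single cells (Louvain; resolution = 2.0, SCIPAC's default)
so <- CreateSeuratObject(counts = sc_counts)
so <- NormalizeData(so)
so <- FindVariableFeatures(so)
so <- ScaleData(so)
so <- RunPCA(so)
so <- FindNeighbors(so)
so <- FindClusters(so, resolution = 2.0)

# Step 2: fit a penalized regression (elastic net) on the bulk data
fit  <- cv.glmnet(x = as.matrix(bulk_expr), y = bulk_y,
                  family = "binomial", alpha = 0.4, nfolds = 10)
beta <- as.numeric(coef(fit, s = "lambda.min"))[-1]   # drop the intercept

# Step 3: association strength for each cluster, using the cluster's average
# expression g_k and the average bulk profile G_bar (this sketch assumes the
# bulk and single-cell matrices share gene names and comparable scales)
G_bar  <- colMeans(as.matrix(bulk_expr))
Lambda <- sapply(split(colnames(so), Idents(so)), function(cells) {
  g_k <- Matrix::rowMeans(GetAssayData(so, slot = "data")[, cells, drop = FALSE])
  sum(beta * (g_k[names(G_bar)] - G_bar))
})

# Step 4 (omitted here): p-values; every cell inherits its cluster's Lambda and p-value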

SCIPAC requires minimum tuning. When the cell-type information is given in step 1, SCIPAC does not have any (hyper)parameter. Otherwise, the Louvain algorithm used in step 1 has a “resolution” parameter that controls the number of cell clusters: a larger resolution results in more clusters. SCIPAC inherits this parameter as its only parameter. Since SCIPAC gives the same association strength and p -value to cells from the same cluster, this parameter also determines the resolution of results provided by SCIPAC. Thus, we still call it “resolution” in SCIPAC. Because of its meaning, we recommend setting it so that the number of cell clusters given by the clustering algorithm is comparable to, or reasonably larger than, the number of cell types (or sub-types) in the data. We will see that the performance of SCIPAC is insensitive to this resolution parameter, and the default value 2.0 typically works well.

The details of the SCIPAC algorithm are given in the “ Methods ” section.

Performance in simulated data

We assess the performance of SCIPAC in simulated data under three different schemes. The first scheme is simple and consists of only three cell types. The second scheme is more complicated and consists of seven cell types, which better imitates actual scRNA-seq data. In the third scheme, we simulate cells under different cell development stages to test the performance of SCIPAC under an ordinal phenotype. Details of the simulation are given in Additional file 1.

Simulation scheme I

Under this scheme, the single-cell data consists of three cell types: one is positively associated with the phenotype, one is negatively associated, and the third is not associated (we call it “null”). Figure 1 a gives the UMAP [ 27 ] plot of the three cell types, and Fig. 1 b gives the true associations of these three cell types with the phenotype, with red, blue, and light gray denoting positive, negative, and null associations.

Figure 1. UMAP visualization and numeric measures of the simulated data under scheme I. All the plots in a–e are scatterplots of the two-dimensional single-cell data given by UMAP. The x and y axes represent the two dimensions, and their scales are not shown as their specific values are not directly relevant. Points in the plots represent single cells, and they are colored differently in each subplot to reflect different information/results. a Cell types. b True associations. The associations between cell types 1, 2, and 3 and the phenotype are positive, negative, and null, respectively. c Association strengths \(\Lambda\) given by SCIPAC under different resolutions. Red/blue represents the sign of \(\Lambda\) , and the shade gives the absolute value of \(\Lambda\) . Every cell is colored red or blue since no \(\Lambda\) is exactly zero. Below each subplot, Res stands for resolution, and K stands for the number of cell clusters given by this resolution. d p-values given by SCIPAC. Only cells with p-value \(< 0.05\) are colored red (positive association) or blue (negative association); others are colored white. e Results given by Scissor under different \(\alpha\) values. Red, blue, and light gray stand for Scissor+, Scissor−, and background (i.e., null) cells. f F1 scores and g FSC for SCIPAC and Scissor under different parameter values. For SCIPAC, each bar is the value under a resolution/number of clusters. For Scissor, each bar is the value under an \(\alpha\) value.

We apply SCIPAC to the simulated data. For the resolution parameter (see the “ Methods ” section), values 0.5, 1.0, and 1.5 give 3, 4, and 4 clusters, respectively, close to the actual number of cell types. They are good choices based on the guidance for choosing this parameter. To show how SCIPAC behaves under parameter misspecification, we also set the resolution up to 4.0, which gives a whopping 61 clusters. Figure 1 c and d give the association strengths \(\Lambda\) and the p -values given by four different resolutions (results under other resolutions are provided in Additional file 1: Fig. S1 and S2). In Fig. 1 c, red and blue denote positive and negative associations, respectively, and the shade of the color represents the strength of the association, i.e., the absolute value of \(\Lambda\) . Every cell is colored blue or red, as none of \(\Lambda\) is exactly zero. In Fig. 1 d, red and blue denote positive and negative associations that are statistically significant ( p -value \(< 0.05\) ). Cells whose associations are not statistically significant ( p -value \(\ge 0.05\) ) are shown in white. To avoid confusion, it is worth repeating that cells that are colored in red/blue in Fig. 1 c are shown in red/blue in Fig. 1 d only if they are statistically significant; otherwise, they are colored white in Fig. 1 d.

From Fig. 1 c, d (as well as Additional file 1: Fig. S1 and S2), it is clear that the results of SCIPAC are highly consistent under different resolution values, including both the estimated association strengths and the p -values. It is also clear that SCIPAC is highly accurate: most truly associated cells are identified as significant, and most, if not all, truly null cells are identified as null.

As the first algorithm that quantitatively estimates the association strength and the first algorithm that gives the p-value of the association, SCIPAC does not have a real competitor. A previous algorithm, Scissor, is able to classify cells into three discrete categories according to their associations with the phenotype. So, we compare SCIPAC with Scissor with respect to the ability to differentiate positively associated, negatively associated, and null cells.

Running Scissor requires tuning a parameter called \(\alpha\) , which is a number between 0 and 1 that balances the amount of regularization for the smoothness and for the sparsity of the associations. The Scissor R package does not provide a default value for this \(\alpha\) or a function to help select this value. The Scissor paper suggests the following criterion: “the number of Scissor-selected cells should not exceed a certain percentage of total cells (default 20%) in the single-cell data. In each experiment, a search on the above searching list is performed from the smallest to the largest until a value of \(\alpha\) meets the above criteria.” In practice, we have found that this criterion often does not work properly, as the truly associated cells may not compose 20% of all cells in actual data. Therefore, instead of setting \(\alpha\) to any particular value, we use \(\alpha\) values that span its whole range to see the best possible performance of Scissor.

The performance of Scissor on our simulated data under four different \(\alpha\) values is shown in Fig. 1 e, and results under more \(\alpha\) values are shown in Additional file 1: Fig. S3. In the figures, red, blue, and light gray denote Scissor+, Scissor−, and null (called “background” in Scissor) cells, respectively. The results of Scissor have several characteristics different from SCIPAC. First, Scissor does not give the strength or statistical significance of the association, and thus the colors of the cells in the figures do not have different shades. Second, different \(\alpha\) values give very different results. Greater \(\alpha\) values generally give fewer Scissor+ and Scissor− cells, but there are additional complexities. One complexity is that the Scissor+ (or Scissor−) cells under a greater \(\alpha\) value are not a strict subset of the Scissor+ (or Scissor−) cells under a smaller \(\alpha\) value. For example, the number of truly negatively associated cells detected as Scissor− increases when \(\alpha\) increases from 0.01 to 0.30. Another complexity is that the direction of the association may flip as \(\alpha\) increases. For example, most cells of cell type 2 are identified as Scissor+ under \(\alpha =0.01\) , but many of them are identified as Scissor− under larger \(\alpha\) values. Third, Scissor does not achieve high power and a low false-positive rate at the same time under any \(\alpha\) . No matter what the \(\alpha\) value is, only a small proportion of cells from cell type 2 are correctly identified as negatively associated, and there is always a non-negligible proportion of null cells (i.e., cells from cell type 3) that are incorrectly identified as positively or negatively associated. Fourth, Scissor+ and Scissor− cells can be close to each other in the figure, even under a large \(\alpha\) value. This means that cells with nearly identical expression profiles are detected to be associated with the phenotype in opposite directions, which can make the results difficult to interpret.

SCIPAC overcomes the difficulties of Scissor and gives results that are more informative (quantitative strengths with p -values), more accurate (both high power and low false-positive rate), less sensitive to the tuning parameter, and easier to interpret (cells with similar expression typically have similar associations to the phenotype).

SCIPAC’s higher accuracy than Scissor in differentiating positively associated, negatively associated, and null cells can also be measured numerically using the F1 score and the fraction of sign correctness (FSC). F1, which is the harmonic mean of precision and recall, is a commonly used measure of calling accuracy. Note that precision and recall are only defined for two-class problems, which try to classify desired signals/discoveries (so-called “positives”) against noises/trivial results (so-called “negatives”). Our case, on the other hand, is a three-class problem: positive association, negative association, and null. To compute F1, we combine the positive and negative associations and treat them as “positives,” and treat null as “negatives.” This F1 score ignores the direction of the association; thus, it alone is not enough to describe the performance of an association-detection algorithm. For example, an algorithm may have a perfect F1 score even if it incorrectly calls all negative associations positive. To measure an algorithm’s ability to determine the direction of the association, we propose a statistic called FSC, defined as the fraction of true discoveries that also have the correct direction of the association. The F1 score and FSC are numbers between 0 and 1, and higher values are preferred. A mathematical definition of these two measures is given in Additional file 1.
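As a rough illustration of how these two measures could be computed (the exact definitions are in Additional file 1, so this is only a sketch of the interpretation given above), assume truth and pred are vectors with one entry per cell, each taking the value "+", "-", or "null":

f1_and_fsc <- function(truth, pred) {
  # F1: "positive" = any association called (+ or -), "negative" = null
  called    <- pred  != "null"
  assoc     <- truth != "null"
  precision <- sum(called & assoc) / sum(called)
  recall    <- sum(called & assoc) / sum(assoc)
  f1        <- 2 * precision * recall / (precision + recall)
  # FSC: among true discoveries (truly associated cells that were called),
  # the fraction whose called direction matches the true direction
  disc <- called & assoc
  fsc  <- sum(pred[disc] == truth[disc]) / sum(disc)
  c(F1 = f1, FSC = fsc)
}

# Example with made-up labels
truth <- c("+", "+", "-", "null", "null")
pred  <- c("+", "-", "-", "null", "+")
f1_and_fsc(truth, pred)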

Figure 1 f, g show the F1 score and FSC of SCIPAC and Scissor under different values of tuning parameters. The F1 score of Scissor is between 0.2 and 0.7 under different \(\alpha\) ’s. The FSC of Scissor increases from around 0.5 to nearly 1 as \(\alpha\) increases, but Scissor does not achieve high F1 and FSC scores at the same time under any \(\alpha\) . On the other hand, the F1 score of SCIPAC is close to perfection when the resolution parameter is properly set, and it is still above 0.90 even if the resolution parameter is set too large. The FSC of SCIPAC is always above 0.96 under different resolutions. That is, SCIPAC achieves high F1 and FSC scores simultaneously under a wide range of resolutions, representing a much higher accuracy than Scissor.

Simulation scheme II

This more complicated simulation scheme has seven cell types, which are shown in Fig. 2 a. As shown in Fig. 2 b, cell types 1 and 3 are negatively associated (colored blue), 2 and 4 are positively associated (colored red), and 5, 6, and 7 are not associated (colored light gray).

Figure 2. UMAP visualization of the simulated data under a–g scheme II and h–k scheme III. a Cell types. b True associations. c , d Association strengths \(\Lambda\) and p-values given by SCIPAC under the default resolution. e Results given by Scissor under different \(\alpha\) values. f F1 scores and g FSC for SCIPAC and Scissor under different parameter values. h Cell differentiation paths. The four paths have the same starting location, which is in the center, but different ending locations. This can be considered as a progenitor cell type differentiating into four specialized cell types. i Cell differentiation steps. These steps are used to create four stages, each containing 500 steps. Thus, this plot of differentiation steps can also be viewed as the plot of true association strengths. j , k Association strengths \(\Lambda\) and p-values given by SCIPAC under the default resolution.

The association strengths and p -values given by SCIPAC under the default resolution are illustrated in Fig. 2 c, d, respectively. Results under several other resolutions are given in Additional file 1: Fig. S4 and S5. Again, we find that SCIPAC gives highly consistent results under different resolutions. SCIPAC successfully identifies three out of the four truly associated cell types. For the other truly associated cell type, cell type 1, SCIPAC correctly recognizes its association with the phenotype as negative, although the p -values are not significant enough. The F1 score is 0.85, and the FSC is greater than 0.99, as shown in Fig. 2 f, g.

The results of Scissor under four different \(\alpha\) values are given in Fig. 2 e (more are shown in Additional file 1: Fig. S6). Under this highly challenging simulation scheme, Scissor can only identify one out of the four truly associated cell types. Its F1 score is below 0.4.

Simulation scheme III

This simulation scheme assesses the performance of SCIPAC for ordinal phenotypes. We simulate cells along four cell-differentiation paths with the same starting location but different ending locations, as shown in Fig. 2 h. These cells can be considered as a progenitor cell population differentiating into four specialized cell types. In Fig. 2 i, the “step” reflects a cell’s position along the differentiation path, with step 0 meaning the start and step 2000 meaning the end of the differentiation. Then, the “stage” is generated according to the step: cells in steps 0 \(\sim\) 500, 501 \(\sim\) 1000, 1001 \(\sim\) 1500, and 1501 \(\sim\) 2000 are assigned to stages I, II, III, and IV, respectively. This stage is treated as the ordinal phenotype. Under this simulation scheme, Fig. 2 i also gives the actual associations, and all cells are associated with the phenotype.

The results of SCIPAC under the default resolution are shown in Fig. 2 j, k. Clearly, the associations SCIPAC identifies are highly consistent with the truth. Particularly, it successfully identifies the cells in the center as early-stage cells and most cells at the end of branches as last-stage cells. The results of SCIPAC under other resolutions are given in Additional file 1: Fig. S7 and S8, which are highly consistent. Scissor does not work with ordinal phenotypes; thus, no results are reported here.

Performance in real data

We consider four real datasets: a prostate cancer dataset, a breast cancer dataset, a lung cancer dataset, and a muscular dystrophy dataset. The bulk RNA-seq data of the three cancer datasets are obtained from the TCGA database, and that of the muscular dystrophy dataset is obtained from a published paper [ 28 ]. A detailed description of these datasets is given in Additional file 1. We will use these datasets to assess the performance of SCIPAC on different types of phenotypes. The cell type information (i.e., which cell belongs to which cell type) is available for the first three datasets, but we ignore this information so that we can make a fair comparison with Scissor, which cannot utilize this information.

Prostate cancer data with a binary phenotype

We use the single-cell expression of 8,700 cells from prostate-cancer tumors sequenced by [ 29 ]. The cell types of these cells are known and given in Fig. 3 a. The bulk data comprises 550 TCGA-PRAD (prostate adenocarcinoma) samples with phenotype (cancer vs. normal) information. Here the phenotype is cancer, and it is binary: present or absent.

Figure 3. UMAP visualization of the prostate cancer data, with a zoom-in view for the red-circled region (cell type MNP). a True cell types. BE, HE, and CE stand for basal, hillock, and club epithelial cells, LE-KLK3 and LE-KLK4 stand for luminal epithelial cells with high levels of kallikrein-related peptidase 3 and 4, and MNP stands for mononuclear phagocytes. In the zoom-in view, the sub-types of MNP cells are given. b Association strengths \(\Lambda\) given by SCIPAC under the default resolution. The cyan-circled cells are B cells, which are estimated by SCIPAC as negatively associated with cancer but estimated by Scissor as Scissor+ or null. c p-values given by SCIPAC. The MNP cell type, which is red-circled in the plot, is estimated by SCIPAC to be strongly negatively associated with cancer but estimated by Scissor to be positively associated with cancer. d Results given by Scissor under different \(\alpha\) values.

Results from SCIPAC with the default resolution are shown in Fig. 3 b, c (results with other resolutions, given in Additional file 1: Fig. S9 and S10, are highly consistent with the results here). Compared with the results from Scissor, shown in Fig. 3 d, the results from SCIPAC again show three advantages. First, results from SCIPAC are richer and more comprehensive. SCIPAC gives estimated associations and the corresponding p-values, and the estimated associations are quantitative (shown in Fig. 3 b as different shades of the red or blue color) instead of discrete (shown in Fig. 3 d as a uniform shade of the red, blue, or light gray color). Second, SCIPAC’s results can be easier to interpret, as the red and blue colors are more block-wise instead of scattered. Third, unlike Scissor, which produces multiple sets of results that vary with the parameter \(\alpha\) (a parameter without a default value or tuning guidance), SCIPAC typically needs only a single set of results obtained under its default settings.

Comparing the results from our SCIPAC method with those from Scissor is non-trivial, as the latter’s outcomes are scattered and include multiple sets. We propose the following approach to summarize the inferred association of a known cell type with the phenotype using a specific method (Scissor under a specific \(\alpha\) value, or SCIPAC with the default setting). We first calculate the proportion of cells in this cell type identified as Scissor+ (by Scissor at a specific \(\alpha\) value) or as significantly positively associated (by SCIPAC), denoted by \(p_{+}\) . We also calculate the proportion of all cells, encompassing any cell type, which are identified as Scissor+ or significantly positively associated, serving as the average background strength, denoted by \(p_{a}\) . Then, we compute the log odds ratio for this cell type to be positively associated with the phenotype compared to the background, represented as:
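In symbols, and assuming the standard definition of a log odds ratio between the two proportions just defined, this presumably reads

\[\rho _{+} = \log \frac{p_{+}/(1 - p_{+})}{p_{a}/(1 - p_{a})}.\]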

Similarly, the log odds ratio for the cell type to be negatively associated with the phenotype, \(\rho _-\) , is computed in a parallel manner.

For SCIPAC, a cell type is summarized as positively associated with the phenotype if \(\rho _+ \ge 1\) and \(\rho _- < 1\)  and negatively associated if \(\rho _- \ge 1\) and \(\rho _+ < 1\) . If neither condition is met, the association is inconclusive. For Scissor, we apply it under six different \(\alpha\) values: 0.01, 0.05, 0.10, 0.15, 0.20, and 0.25. A cell type is summarized as positively associated with the phenotype if \(\rho _+ \ge 1\) and \(\rho _- < 1\) in at least four of these \(\alpha\) values and negatively associated if \(\rho _- \ge 1\) and \(\rho _+ < 1\) in at least four \(\alpha\) values. If these criteria are not met, the association is deemed inconclusive. The above computation of log odds ratios and the determination of associations are performed only on cell types that each compose at least 1% of the cell population, to ensure adequate power.

For the prostate cancer data, the log odds ratios for each cell type using each method are presented in Tables S1 and S2. The final associations determined for each cell type are summarized in Table S3. In the last column of this table, we also indicate whether the conclusions drawn from SCIPAC and Scissor are consistent or not.

We find that SCIPAC’s results agree with Scissor on most cell types. However, there are three exceptions: mononuclear phagocytes (MNPs), B cells, and LE-KLK4.

MNPs are red-circled, with a zoom-in view, in each sub-figure of Fig. 3 . Most cells in this cell type are colored red in Fig. 3 d but colored dark blue in Fig. 3 b. In other words, while Scissor determines that this cell type is Scissor+, SCIPAC makes the opposite inference. Moreover, SCIPAC is confident about its judgment, giving small p-values, as shown in Fig. 3 c. It is not easy to see which inference is closer to the biological facts, as MNPs biologically contain a number of sub-types that each have different functions [ 30 , 31 ]. Fortunately, this cell population has been studied in detail in the original paper that generated this dataset [ 29 ], and the sub-type information for each cell is provided there: this MNP population contains six sub-types, which are dendritic cells (DC), M1 macrophages (Mac1), metallothionein-expressing macrophages (Mac-MT), M2 macrophages (Mac2), proliferating macrophages (Mac-cycling), and monocytes (Mono), as shown in the zoom-in view of Fig. 3 a. Among these six sub-types, DC, Mac1, and Mac-MT are believed to inhibit cancer development and can serve as targets in cancer immunotherapy [ 29 ]; they compose more than 60% of all MNP cells in this dataset. SCIPAC makes the correct inference on this majority of MNP cells. Another sub-type, Mac2, is reported to promote tumor development [ 32 ], but it composes less than \(15\%\) of the MNPs. How the other two sub-types, Mac-cycling and Mono, are associated with cancer is less studied. Overall, the results given by SCIPAC are more consistent with current biological knowledge.

B cells are cyan-circled in Fig. 3 b. B cells are generally believed to have anti-tumor activity by producing tumor-reactive antibodies and forming tertiary lymphoid structures [ 29 , 33 ]. This means that B cells are likely to be negatively associated with cancer. SCIPAC successfully identifies this negative association, while Scissor fails.

LE-KLK4, a subtype of cancer cells, is thought to be positively associated with the tumor phenotype [ 29 ]. SCIPAC successfully identified this positive association, in contrast to Scissor, which failed to do so (in the figure, a proportion of LE-KLK4 cells are identified as Scissor+, especially under the smallest \(\alpha\) value; however, this proportion is not significantly higher than the background Scissor+ level under the majority of \(\alpha\) values).

In summary, across all three cell types, the results from SCIPAC appear to be more consistent with current biological knowledge. For more discussions regarding this dataset, refer to Additional file 1.

Breast cancer data with an ordinal phenotype

The scRNA-seq data for breast cancer are from [ 34 ], and we use the 19,311 cells from the five HER2+ tumor tissues. The true cell types are shown in Fig. 4 a. The bulk data include 1215 TCGA-BRCA samples with information on the cancer stage (I, II, III, or IV), which is treated as an ordinal phenotype.

Figure 4. UMAP visualization of the breast cancer data. a True cell types. CAFs stand for cancer-associated fibroblasts, PB stands for plasmablasts, and PVL stands for perivascular-like cells. b , c Association strengths \(\Lambda\) and p-values given by SCIPAC under the default resolution. Cyan-circled are a group of T cells that are estimated by SCIPAC to be most significantly associated with the cancer stage in the negative direction, and orange-circled are a group of T cells that are estimated by SCIPAC to be significantly positively associated with the cancer stage. d DE analysis of the cyan-circled T cells vs. all the other T cells. e DE analysis of the cyan-circled T cells vs. all the other cells. f Expression of CD8+ T cell marker genes in the cyan-circled cells and all the other cells. g DE analysis of the orange-circled T cells vs. all the other cells. h Expression of regulatory T cell marker genes in the orange-circled cells and all the other cells.

Association strengths and p -values given by SCIPAC under the default resolution are shown in Fig. 4 b, c. Results under other resolutions are given in Additional file 1: Fig. S11 and S12, and again they are highly consistent with results under the default resolution. We do not present the results from Scissor, as Scissor does not take ordinal phenotypes.

In the SCIPAC results, cells that are most strongly and statistically significantly associated with the phenotype in the positive direction are the cancer-associated fibroblasts (CAFs). This finding agrees with the literature: CAFs contribute to therapy resistance and metastasis of cancer cells via the production of secreted factors and direct interaction with cancer cells [ 35 ], and they are also active players in breast cancer initiation and progression [ 36 , 37 , 38 , 39 ]. Another large group of cells identified as positively associated with the phenotype is the cancer epithelial cells. They are malignant cells in breast cancer tissues and are thus expected to be associated with severe cancer stages.

Of the cells identified as negatively associated with severe cancer stages, a large portion of T cells is the most noticeable. Biologically, T cells contain many sub-types, including CD4+, CD8+, regulatory T cells, and more, and their functions are diverse in the tumor microenvironment [ 40 ]. To explore SCIPAC’s discoveries, we compare T cells that are identified as most statistically significant, with p -values \(< 10^{-6}\) and circled in Fig. 4 d, with the other T cells. Differential expression (DE) analysis (details about DE analysis and other analyses are given in Additional file 1) identifies seven genes upregulated in these most significant T cells. Of these seven genes, at least five are supported by the literature: CCL4, XCL1, IFNG, and GZMB are associated with CD8+ T cell infiltration; they have been shown to have anti-tumor functions and are involved in cancer immunotherapy [ 41 , 42 , 43 ]. Also, IL2 has been shown to serve an important role in combination therapies for autoimmunity and cancer [ 44 ]. We also perform an enrichment analysis [ 45 ], in which a pathway called Myc stands out with a \(\textit{p}\text{-value}<10^{-7}\) , much smaller than all other pathways. Myc is downregulated in the T cells that are identified as most negatively associated with cancer stage progress. This agrees with current biological knowledge about this pathway: Myc is known to contribute to malignant cell transformation and tumor metastasis [ 46 , 47 , 48 ].

Above, we compared the T cells that are most significantly associated with cancer stage in the negative direction with the other T cells using DE and pathway analysis, and the results could suggest that these cells are tumor-infiltrated CD8+ T cells with tumor-inhibition functions. To check this hypothesis, we perform DE analysis of these cells against all other cells (i.e., the other T cells and all the other cell types). The DE genes are shown in Fig. 4 e. It can be noted that CD8+ T cell marker genes such as CD8A, CD8B, and GZMK are upregulated. We further obtain CD8+ T cell marker genes from CellMarker [ 49 ] and check their expression, as illustrated in Fig. 4 f. Marker genes CD8A, CD8B, CD3D, GZMK, and CD7 show significantly higher expression in these T cells. This again supports our hypothesis that these cells are tumor-infiltrated CD8+ T cells that have anti-tumor functions.

Interestingly, not all T cells are identified as negatively associated with severe cancer stages; a group of T cells is identified as positively associated, as circled in Fig. 4 c. To explore the function of this group of T cells, we perform DE analysis of these T cells against the other T cells. The DE genes are shown in Fig. 4 g. Based on the literature, six out of eight over-expressed genes are associated with cancer development. High expression of the NUSAP1 gene is associated with poor patient overall survival, and this gene also serves as a prognostic factor in breast cancer [ 50 , 51 , 52 ]. The gene MKI67 has been treated as a candidate prognostic predictor of cancer proliferation [ 53 , 54 ]. The over-expression of RRM2 has been linked to higher proliferation and invasiveness of malignant cells [ 55 , 56 ], and the upregulation of RRM2 in breast cancer suggests it to be a possible prognostic indicator [ 57 , 58 , 59 , 60 , 61 , 62 ]. High expression of the UBE2C gene always occurs in cancers with a high degree of malignancy, low differentiation, and high metastatic tendency [ 63 ]. For the gene TOP2A, it has been proposed that the HER2 amplification in HER2 breast cancers may be a direct result of the frequent co-amplification of TOP2A [ 64 , 65 , 66 ], and there is a high correlation between high expression of TOP2A and the oncogene HER2 [ 67 , 68 ]. The gene CENPF is a cell cycle-associated gene, and it has been identified as a marker of cell proliferation in breast cancers [ 69 ]. The over-expression of these genes strongly supports the correctness of the association identified by SCIPAC. To further validate this positive association, we perform DE analysis of these cells against all the other cells. We find that the top marker genes obtained from CellMarker [ 49 ] for regulatory T cells, which are known to be immunosuppressive and promote cancer progression [ 70 ], are over-expressed with statistical significance, as shown in Fig. 4 h. This finding again provides strong evidence that the positive association identified by SCIPAC for this group of T cells is correct.

Lung cancer data with survival information

The scRNA-seq data for lung cancer are from [ 71 ], and we use two lung adenocarcinoma (LUAD) patients’ data with 29,888 cells. The true cell types are shown in Fig. 5 a. The bulk data consist of 576 TCGA-LUAD samples with survival status and time.

Figure 5. UMAP visualization of a–d the lung cancer data and e–g the muscular dystrophy data. a True cell types. b , c Association strengths \(\Lambda\) and p-values given by SCIPAC under the default resolution. d Results given by Scissor under different \(\alpha\) values. e , f Association strengths \(\Lambda\) and p-values given by SCIPAC under the default resolution. Circled are a group of cells that are identified by SCIPAC as significantly positively associated with the disease but identified by Scissor as null. g Results given by Scissor under different \(\alpha\) values.

Association strengths and p -values given by SCIPAC are given in Fig. 5 b, c (results under other resolutions are given in Additional file 1: Fig. S13 and S14). In Fig. 5 c, most cells with statistically significant associations are CD4+ T cells or B cells. These associations are negative, meaning that the abundance of these cells is associated with a reduced death rate, i.e., longer survival time. This agrees with the literature: CD4+ T cells primarily mediate anti-tumor immunity and are associated with favorable prognosis in lung cancer patients [ 72 , 73 , 74 ]; B cells also show anti-tumor functions in all stages of human lung cancer development and play an essential role in anti-tumor responses [ 75 , 76 ].

The results by Scissor under different \(\alpha\) values are shown in Fig. 5 d. The highly scattered Scissor+ and Scissor− cells make identifying and interpreting meaningful phenotype-associated cell groups difficult.

Muscular dystrophy data with a binary phenotype

This dataset contains cells from four facioscapulohumeral muscular dystrophy (FSHD) samples and two control samples [ 77 ]. We pool all the 7047 cells from these six samples together. The true cell types of these cells are unknown. The bulk data consists of 27 FSHD patients and eight controls from [ 28 ]. Here the phenotype is FSHD, and it is binary: present or absent.

The results of SCIPAC with the default resolution are given in Fig. 5 e, f. Results under other resolutions are highly similar (shown in Additional file 1: Fig. S15 and S16). For comparison, results given by Scissor under different \(\alpha\) values are presented in Fig. 5 g. The agreements between the results of SCIPAC and Scissor are clear. For example, both methods identify cells located at the top and lower left part of the UMAP plots as negatively associated with FSHD, and cells located at the center and right parts of the UMAP plots as positively associated. However, the discrepancies in their results are also evident. The most pronounced one is a large group of cells (circled in Fig. 5 f) that are identified by SCIPAC as significantly positively associated but are completely ignored by Scissor. Examining this group of cells, we find that over 90% (424 out of 469) come from the FSHD patients, and less than 10% come from the control samples. However, cells from FSHD patients compose only 73% (5133) of all the 7047 cells. This statistically significant ( p-value \(<10^{-15}\) , Fisher’s exact test) over-representation (odds ratio = 3.51) suggests that the positive association identified by SCIPAC is likely to be correct.

SCIPAC is computationally highly efficient. On an 8-core machine with a 2.50 GHz CPU and 16 GB RAM, SCIPAC takes 7, 24, and 2 seconds to finish all the computation and give the estimated association strengths and p-values on the prostate cancer, lung cancer, and muscular dystrophy datasets, respectively. As a reference, Scissor takes 314, 539, and 171 seconds, respectively.

SCIPAC works with various phenotype types, including binary, continuous, survival, and ordinal. It can easily accommodate other types by using a proper regression model with a systematic component in the form of Eq. 3 (see the “ Methods ” section). For example, a Poisson or negative binomial log-linear model can be used if the phenotype is a count (i.e., non-negative integer).

In SCIPAC’s definition of association, a cell type is associated with the phenotype if increasing the proportion of this cell type leads to a change in the probability of the phenotype occurring. The strength of association represents the extent of the increase or decrease in this probability. In the case of a binary response, this change is measured by the log odds ratio. For example, if the association strength of cell type A is twice that of cell type B, increasing cell type A by a certain proportion leads to twice the amount of change in the log odds ratio of having the phenotype compared to increasing cell type B by the same proportion. The association strength under other types of phenotypes can be interpreted similarly, with the major difference lying in the measure of the change in probability. For quantitative, ordinal, and survival outcomes, the difference in the quantitative outcome, the log odds ratio of the right-tail probability, and the log hazard ratio are used, respectively. Despite the differences in the exact form of the association strength under different types of phenotypes, the underlying concept remains the same: a larger (absolute value of) association strength indicates that the same increase/decrease in a cell type leads to a larger change in the occurrence of the phenotype.

As SCIPAC utilizes both bulk RNA-seq data with phenotype and single-cell RNA-seq data, the estimated associations for the cells are influenced by the choice of the bulk data. Although different bulk data can yield varying estimations of the association for the same single cells, the estimated associations appear to be reasonably robust even when minor changes are made to the bulk data. See Additional file 1 for further discussions.

When using the Louvain algorithm in the Seurat package to cluster cells, SCIPAC’s default resolution is 2.0, larger than the default setting of Seurat. This allows for the identification of potential subtypes within the major cell type and enables the estimation of individual association strengths. Consequently, a more detailed and comprehensive description of the association between single cells and the phenotype can be obtained by SCIPAC.

When applying SCIPAC to real datasets, we made a deliberate choice to disregard the cell annotation provided by the original publications and instead relied on the inferred cell clusters produced by the Louvain algorithm. We made this decision for several reasons. Firstly, we aimed to ensure a fair comparison with Scissor, as it does not utilize cell-type annotations. Secondly, the original annotation might not be sufficiently comprehensive or detailed. Presumed cell types could potentially encompass multiple subtypes, each of which may exhibit distinct associations with the phenotype under investigation. In such cases, employing the Louvain algorithm with a relatively high resolution, which is the default setting in SCIPAC, enables us to differentiate between these subtypes and allows SCIPAC to assign varying association strengths to each subtype.

SCIPAC fits the regression model using the elastic net, a machine-learning algorithm that maximizes a penalized version of the likelihood. The elastic net can be replaced by other penalized estimates of regression models, such as SCAD [ 78 ], without altering the rest of the SCIPAC algorithm. The combination of a regression model and a penalized estimation algorithm such as the elastic net has shown comparable or higher prediction power than other sophisticated methods such as random forests, boosting, or neural networks in numerous applications, especially for gene expression data [ 79 ]. However, there can still be datasets where other models have higher prediction power. It will be future work to incorporate these models into SCIPAC.

The use of metacells is becoming an efficient way to handle large single-cell datasets [ 80 , 81 , 82 , 83 ]. Conceptually, SCIPAC can incorporate metacells and their representatives as an alternative to its default setting of using cell clusters/types and their centroids. We have explored this aspect using metacells provided by SEACells [ 81 ]. Details are given in Additional file 1. Our comparative analysis reveals that combining SCIPAC with SEACells results in significantly reduced performance compared to using SCIPAC directly on original single-cell data. The primary reason for this appears to be the subpar performance of SEACells in cell grouping, especially when contrasted with the Louvain algorithm. Given these findings, we do not suggest using metacells provided by SEACells for SCIPAC applications in the current stage.

Conclusions

SCIPAC is a novel algorithm for studying the associations between cells and phenotypes. Compared to the previous algorithm, SCIPAC gives a much more detailed and comprehensive description of the associations by enabling a quantitative estimation of the association strength and by providing a quality control—the p -value. Underlying SCIPAC are a general statistical model that accommodates virtually all types of phenotypes, including ordinal (and potentially count) phenotypes that have never been considered before, and a concise and closed-form mathematical formula that quantifies the association, which minimizes the computational load. The mathematical conciseness also largely frees SCIPAC from parameter tuning. The only parameter (i.e., the resolution) barely changes the results given by SCIPAC. Overall, compared with its predecessor, SCIPAC represents a substantially more capable software by being much more informative, versatile, robust, and user-friendly.

The improvement in accuracy is also remarkable. In simulated data, SCIPAC achieves high power and low false positives, which is evident from the UMAP plot, F1 score, and FSC score. In real data, SCIPAC gives results that are consistent with current biological knowledge for cell types whose functions are well understood. For cell types whose functions are less studied or more multifaceted, SCIPAC gives support to certain biological hypotheses or helps identify/discover cell sub-types.

SCIPAC’s identification of cell-phenotype associations closely follows its definition of association: when increasing the fraction of a cell type increases (or decreases) the probability for a phenotype to be present, this cell type is positively (or negatively) associated with the phenotype.

The increase of the fraction of a cell type

For a bulk sample, let vector \(\varvec{G} \in \mathbb {R}^p\) be its expression profile, that is, its expression on the p genes. Suppose there are K cell types in the tissue, and let \(\varvec{g}_{k}\) be the representative expression of the k ’th cell type. Usually, people assume that \(\varvec{G}\) can be decomposed by
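In symbols, and using the quantities defined just below, this decomposition presumably takes the form

\[\varvec{G} = \sum _{k = 1}^{K} \gamma _{k}\, \varvec{g}_{k}.\]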

where \(\gamma _{k}\) is the proportion of cell type k in the bulk tissue, with \(\sum _{k = 1}^{K}\gamma _{k} = 1\) . This equation links the bulk and single-cell expression data.

Now consider increasing cells from cell type k by \(\Delta \gamma\) proportion of the original number of cells. Then, the new proportion of cell type k becomes \(\frac{\gamma _{k} + \Delta \gamma }{1 + \Delta \gamma }\) , and the new proportion of cell type \(j \ne k\) becomes \(\frac{\gamma _{j}}{1 + \Delta \gamma }\)  (note that the new proportions of all cell types should still add up to 1). Thus, the bulk expression profile with the increase of cell type k becomes
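With these updated proportions, the new profile presumably reads

\[\varvec{G}^{*} = \frac{\gamma _{k} + \Delta \gamma }{1 + \Delta \gamma }\, \varvec{g}_{k} + \sum _{j \ne k} \frac{\gamma _{j}}{1 + \Delta \gamma }\, \varvec{g}_{j}.\]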

Plugging in Eq. 1, we get
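Consistent with the observation below that the result no longer involves the \(\gamma _{k}\) ’s, it presumably simplifies to

\[\varvec{G}^{*} = \frac{\varvec{G} + \Delta \gamma \, \varvec{g}_{k}}{1 + \Delta \gamma }.\]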

Interestingly, this expression of \(\varvec{G}^*\) does not include \(\gamma _{1}, \ldots , \gamma _{K}\) . This means that there is no need to actually compute \(\gamma _{1}, \ldots , \gamma _{K}\) in Eq. 1 , which could otherwise be done using cell-type-decomposition software, but an accurate and robust decomposition is non-trivial [ 84 , 85 , 86 ]. See Additional file 1 for a more in-depth discussion of the connections of SCIPAC with decomposition/deconvolution.

The change in chance of a phenotype

In this section, we consider how the increase in the fraction of a cell type will change the chance for a binary phenotype such as cancer to occur. Other types of phenotypes will be considered in the next section.

Let \(\pi (\varvec{G})\) be the chance of an individual with gene expression profile \(\varvec{G}\) for this phenotype to occur. We assume a logistic regression model to describe the relationship between \(\pi (\varvec{G})\) and \(\varvec{G}\) :
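For a binary phenotype, this model presumably takes the standard logistic form

\[\log \frac{\pi (\varvec{G})}{1 - \pi (\varvec{G})} = \beta _{0} + \varvec{\beta }^T \varvec{G}.\]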

here the left-hand side is the log odds of \(\pi (\varvec{G})\) , \(\beta _{0}\) is the intercept, and \(\varvec{\beta }\) is a length- p vector of coefficients. In the section after the next, we will describe how we obtain \(\beta _{0}\) and \(\varvec{\beta }\) from the data.

When increasing cells from cell type k by \(\Delta \gamma\) , \(\varvec{G}\) becomes \(\varvec{G}^*\) in Eq. 3. Plugging in Eq. 2, we get
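Given the logistic model above and the expression for \(\varvec{G}^{*}\) , this presumably reads

\[\log \frac{\pi (\varvec{G}^{*})}{1 - \pi (\varvec{G}^{*})} = \beta _{0} + \varvec{\beta }^T \varvec{G}^{*} = \beta _{0} + \frac{\varvec{\beta }^T (\varvec{G} + \Delta \gamma \, \varvec{g}_{k})}{1 + \Delta \gamma }.\]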

We further take the difference between Eqs. 4 and 3 and get
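Working from the two displays above, the difference presumably simplifies to

\[\log \frac{\pi (\varvec{G}^{*}) / (1 - \pi (\varvec{G}^{*}))}{\pi (\varvec{G}) / (1 - \pi (\varvec{G}))} = \frac{\Delta \gamma }{1 + \Delta \gamma }\, \varvec{\beta }^T (\varvec{g}_{k} - \varvec{G}).\]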

The left-hand side of this equation is the log odds ratio (i.e., the change of log odds). On the right-hand side, \(\frac{\Delta \gamma }{1 + \Delta \gamma }\) is an increasing function with respect to \(\Delta \gamma\) , and \(\varvec{\beta }^T(\varvec{g}_{k} - \varvec{G})\) is independent of \(\Delta \gamma\) . This indicates that given any specific \(\Delta \gamma\) , the log odds ratio under over-representation of cell type k is proportional to
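That is, the quantity that captures the effect of cell type k on this bulk sample is presumably

\[\lambda _{k} = \varvec{\beta }^T (\varvec{g}_{k} - \varvec{G}).\]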

\(\lambda _k\) describes the strength of the effect of increasing cell type k in a bulk sample with expression profile \(\varvec{G}\) . Given the presence of numerous bulk samples, employing multiple \(\lambda _k\) ’s could be cumbersome and obscure the overall effect of a particular cell type. To concisely summarize the association of cell type k , we propose averaging these effects. The average effect over all bulk samples can be obtained by
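Consistent with the definition of \(\bar{\varvec{G}}\) given just below, this average presumably takes the form

\[\Lambda _{k} = \varvec{\beta }^T (\varvec{g}_{k} - \bar{\varvec{G}}),\]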

where \(\bar{\varvec{G}}\) is the average expression profile of all bulk samples.

\(\Lambda _k\) gives an overall impression of how strongly over-representation of cell type k affects the probability for the phenotype to be present. Its sign gives the direction of the change: a positive value means an increase in probability, and a negative value means a decrease. Its absolute value gives the strength of the effect. In SCIPAC, we call \(\Lambda _k\) the association strength between cell type k and the phenotype.
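
The definitions above translate directly into a few lines of linear algebra. The sketch below (synthetic inputs only; in practice \(\varvec{\beta }\), the centroids, and the bulk profiles come from the data) computes the per-sample effects \(\lambda _k\), the log odds ratios of Eq. 5 for a given \(\Delta \gamma\), and the association strengths \(\Lambda _k\) of Eq. 7, and checks that \(\Lambda _k\) equals the average of the per-sample effects.

```python
# Minimal sketch (illustration only; beta, g, and bulk are synthetic stand-ins,
# not SCIPAC outputs): the per-sample effect lambda_k = beta^T (g_k - G) and its
# average over bulk samples, the association strength Lambda_k of Eq. 7.
import numpy as np

rng = np.random.default_rng(1)
n, p, K = 30, 50, 4                               # bulk samples, genes, cell types (toy sizes)
bulk = rng.gamma(2.0, 1.0, size=(n, p))           # bulk profiles G_1..G_n
g = rng.gamma(2.0, 1.0, size=(K, p))              # cell-type centroids g_1..g_K
beta = rng.normal(size=p)                         # coefficients of the fitted phenotype model

lam = (g[None, :, :] - bulk[:, None, :]) @ beta   # shape (n, K): lambda_k for every bulk sample
delta = 0.1
log_odds_ratio = delta / (1 + delta) * lam        # Eq. 5 for a 10% increase of each cell type

G_bar = bulk.mean(axis=0)                         # average bulk profile
Lambda = (g - G_bar) @ beta                       # Eq. 7: association strength of each cell type
assert np.allclose(Lambda, lam.mean(axis=0))      # averaging lambda_k over samples gives Lambda_k
```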

Note that this derivation does not involve a likelihood, although the computation of \(\varvec{\beta }\) does; the derivation serves mainly to define the association strength.

Definition of the association strength for other types of phenotype

Our definition of \(\Lambda _k\) relies on vector \(\varvec{\beta }\) . In the case of a binary phenotype, \(\varvec{\beta }\) are the coefficients of a logistic regression that describes a linear relationship between the expression profile and the log odds of having the phenotype, as shown in Eq. 3 . For other types of phenotype, \(\varvec{\beta }\) can be defined/computed similarly.

For a quantitative (i.e., continuous) phenotype, an ordinary linear regression can be used, and the left-hand side of Eq. 3 is changed to the quantitative value of the phenotype.

For a survival phenotype, a Cox proportional hazards model can be used, and the left-hand side of Eq. 3 is changed to the log hazard ratio.

For an ordinal phenotype, we use a proportional odds model

\(\log \frac{\Pr (Y_{i} \ge j + 1 \mid X)}{\Pr (Y_{i} \le j \mid X)} = \beta _{0j} + \varvec{\beta }^T X,\)
where \(j \in \{1, 2, \ldots , (J - 1)\}\) and J is the number of ordinal levels. Note that we model the right-tail probability \(\Pr (Y_{i} \ge j + 1 \mid X)\) instead of the commonly used cumulative (left-tail) probability \(\Pr (Y_{i} \le j \mid X)\). This choice makes the interpretation consistent with the other types of phenotypes: in our model, a larger value on the right-hand side indicates a larger chance for \(Y_{i}\) to take a higher level, which guarantees that the sign of the association strength defined from this \(\varvec{\beta }\) has the usual meaning, namely that a positive \(\Lambda _k\) means a positive association with the phenotype. Using cancer stage as an example, a positive \(\Lambda _k\) means that over-representation of cell type k increases the chance of a higher cancer stage. In contrast, using the cumulative probability would lead to a counter-intuitive, reversed interpretation.

Computation of the association strength in practice

In practice, \(\varvec{\beta }\) in Eq. 3 needs to be learned from the bulk data. By default, SCIPAC uses the elastic net, a popular and powerful penalized regression method:

\((\hat{\beta }_{0}, \hat{\varvec{\beta }}) = \mathop{\mathrm{arg\,min}}_{\beta _{0}, \varvec{\beta }} \left\{ -l(\beta _{0}, \varvec{\beta }) + \lambda \left[ \alpha \Vert \varvec{\beta }\Vert _{1} + \frac{1 - \alpha }{2}\Vert \varvec{\beta }\Vert _{2}^{2}\right] \right\} .\)

In this model, \(l(\beta _{0}, \varvec{\beta })\) is the log-likelihood of the linear model (i.e., logistic regression for a binary phenotype, ordinary linear regression for a quantitative phenotype, Cox proportional hazards model for a survival phenotype, and proportional odds model for an ordinal phenotype). \(\alpha\), a number between 0 and 1, controls the mix of the \(\ell _1\) and \(\ell _2\) penalties, and \(\lambda\) is the penalty strength. SCIPAC fixes \(\alpha\) at 0.4 (see Additional file 1 for a discussion of this choice) and uses 10-fold cross-validation to choose \(\lambda\) automatically, so neither needs to be treated as a hyperparameter by the user.

In SCIPAC, the fitting and cross-validation of the elastic net are done by calling the ordinalNet [87] R package for the ordinal phenotype and the glmnet R package [88, 89, 90, 91] for the other types of phenotypes.
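
For readers who prefer Python, the following sketch shows the same idea for a binary phenotype using scikit-learn's elastic-net logistic regression. It is only an analogue of what SCIPAC does with glmnet (the objective parameterization differs slightly, with scikit-learn's l1_ratio playing the role of \(\alpha\) and the grid over C standing in for the cross-validated \(\lambda\)), and the data here are random placeholders.

```python
# Illustrative Python analogue (SCIPAC itself calls the glmnet/ordinalNet R
# packages; this sketch only mimics the idea for a binary phenotype using
# scikit-learn, so results will not match SCIPAC exactly).
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(2)
n, p = 120, 200                           # toy bulk cohort: 120 samples, 200 genes
X = rng.normal(size=(n, p))               # bulk expression matrix (e.g., log-normalized)
y = rng.integers(0, 2, size=n)            # binary phenotype labels (e.g., tumor vs. normal)

# l1_ratio plays the role of glmnet's alpha (fixed at 0.4 in SCIPAC); the grid
# over C (inverse penalty strength) is chosen by 10-fold cross-validation,
# analogous to glmnet's automatic choice of lambda.
fit = LogisticRegressionCV(
    Cs=20, cv=10, penalty="elasticnet", solver="saga",
    l1_ratios=[0.4], max_iter=5000,
).fit(X, y)

beta0 = fit.intercept_[0]                 # beta_0 in Eq. 3
beta = fit.coef_.ravel()                  # beta in Eq. 3 (length p)
```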

The computation of the association strength defined by Eq. 7 requires not only \(\varvec{\beta }\) but also \(\varvec{g}_k\) and \(\bar{\varvec{G}}\). \(\bar{\varvec{G}}\) is simply the average expression profile of all bulk samples. \(\varvec{g}_k\), on the other hand, requires knowing the cell type of each cell. By default, SCIPAC does not assume this information is given; it uses the Louvain clustering implemented in the Seurat [24, 25] R package to infer it. This clustering algorithm has one tuning parameter, called "resolution," whose default value in SCIPAC is 2.0, although the user can choose other values. With the inferred or given cell types, \(\varvec{g}_k\) is computed as the centroid (i.e., the mean expression profile) of the cells in cluster k.

Given \(\varvec{\beta }\), \(\bar{\varvec{G}}\), and \(\varvec{g}_k\), the association strength can be computed using Eq. 7. Knowing the association strength for each cell type and the cell-type label of each cell, we also know the association strength for every single cell. In practice, we standardize the association strengths over all cells: we compute their mean and standard deviation and use them to center and scale the association strengths, respectively. We have found that such standardization makes SCIPAC more robust to possible imbalance in the bulk sample sizes of different phenotype groups.
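
A minimal sketch of these bookkeeping steps, with synthetic placeholders for the single-cell matrix, cluster labels, bulk matrix, and \(\varvec{\beta }\):

```python
# Minimal sketch (illustration only; all inputs are synthetic placeholders):
# compute cell-type centroids g_k from clustered single-cell data, the
# association strength Lambda_k of Eq. 7, and the standardized per-cell values.
import numpy as np

rng = np.random.default_rng(3)
n_cells, n_bulk, p, K = 500, 30, 50, 4
sc_expr = rng.gamma(2.0, 1.0, size=(n_cells, p))   # single-cell expression matrix
clusters = rng.integers(0, K, size=n_cells)        # cluster labels (SCIPAC infers these with Seurat's Louvain clustering)
bulk = rng.gamma(2.0, 1.0, size=(n_bulk, p))       # bulk expression matrix
beta = rng.normal(size=p)                          # coefficients of the fitted phenotype model

centroids = np.vstack([sc_expr[clusters == k].mean(axis=0) for k in range(K)])  # g_k per cluster
G_bar = bulk.mean(axis=0)                          # average bulk profile
Lambda = (centroids - G_bar) @ beta                # association strength per cell type (Eq. 7)

cell_Lambda = Lambda[clusters]                     # each cell inherits its cluster's strength
cell_Lambda = (cell_Lambda - cell_Lambda.mean()) / cell_Lambda.std()   # standardize across cells
```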

Computation of the p -value

SCIPAC uses the non-parametric bootstrap [92] to compute the standard deviation, and hence the p-value, of the association. Fifty bootstrap samples, which are believed to be enough for computing the standard error of most statistics [93], are generated from the bulk expression data, and each is used to compute (standardized) \(\Lambda\) values for all the cells. For cell i, let its original \(\Lambda\) value be \(\Lambda _i\) and the bootstrapped values be \(\Lambda _i^{(1)}, \ldots , \Lambda _i^{(50)}\). A z-score is then computed as

\(z_i = \Lambda _i \big / \textrm{sd}\left( \Lambda _i^{(1)}, \ldots , \Lambda _i^{(50)}\right) ,\)

and the p-value is then computed according to the cumulative distribution function of the standard Gaussian distribution. See Additional file 1 for more discussion of the p-value calculation.
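
The sketch below outlines this bootstrap scheme. It is an illustration rather than the SCIPAC implementation: `fit_beta` and `association_strength` are hypothetical helpers standing in for the elastic-net fit and Eq. 7, the z-score is assumed to be the observed value divided by its bootstrap standard deviation, and the p-value is taken as two-sided.

```python
# Rough sketch of the bootstrap scheme described above (illustration only, not
# the SCIPAC implementation; `fit_beta` and `association_strength` are
# hypothetical helpers standing in for the elastic-net fit and Eq. 7).
import numpy as np
from scipy.stats import norm

def bootstrap_pvalues(bulk, y, centroids, clusters, fit_beta, association_strength,
                      n_boot=50, seed=0):
    rng = np.random.default_rng(seed)
    n_bulk = bulk.shape[0]

    def per_cell_lambda(X, yy):
        beta = fit_beta(X, yy)                          # e.g., elastic-net coefficients
        Lam = association_strength(beta, centroids, X)  # Eq. 7, one value per cell type
        lam_cells = Lam[clusters]                       # map cell-type strengths to cells
        return (lam_cells - lam_cells.mean()) / lam_cells.std()  # standardized values

    observed = per_cell_lambda(bulk, y)
    boot = np.empty((n_boot, observed.size))
    for b in range(n_boot):                             # resample bulk samples with replacement
        idx = rng.integers(0, n_bulk, size=n_bulk)
        boot[b] = per_cell_lambda(bulk[idx], y[idx])

    z = observed / boot.std(axis=0)                     # assumed form of the z-score
    p = 2 * norm.sf(np.abs(z))                          # two-sided Gaussian p-value (assumption)
    return z, p
```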

Availability of data and materials

The simulated datasets [94] under three schemes are available at Zenodo with DOI 10.5281/zenodo.11013320 [95]. The SCIPAC package is available on GitHub at https://github.com/RavenGan/SCIPAC under the MIT license [96]. The source code of SCIPAC is also deposited at Zenodo with DOI 10.5281/zenodo.11013696 [97]. A vignette of the R package is available on the GitHub page and in Additional file 2. The prostate cancer scRNA-seq data are obtained from the Prostate Cell Atlas https://www.prostatecellatlas.org [29]; the scRNA-seq data for breast cancer are from the Gene Expression Omnibus (GEO) under accession number GSE176078 [34, 98]; the scRNA-seq data for lung cancer are from E-MTAB-6149 [99] and E-MTAB-6653 [71, 100]; the scRNA-seq data for facioscapulohumeral muscular dystrophy are from GEO under accession number GSE122873 [101]. The bulk RNA-seq data are obtained from the TCGA database via the TCGAbiolinks (ver. 2.25.2) R package [102]. More details about the simulated and real scRNA-seq and bulk RNA-seq data can be found in Additional file 1.

Yofe I, Dahan R, Amit I. Single-cell genomic approaches for developing the next generation of immunotherapies. Nat Med. 2020;26(2):171–7.

Zhang Q, He Y, Luo N, Patel SJ, Han Y, Gao R, et al. Landscape and dynamics of single immune cells in hepatocellular carcinoma. Cell. 2019;179(4):829–45.

Fan J, Slowikowski K, Zhang F. Single-cell transcriptomics in cancer: computational challenges and opportunities. Exp Mol Med. 2020;52(9):1452–65.

Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell. 2015;161(5):1187–201.

Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell. 2015;161(5):1202–14.

Rosenberg AB, Roco CM, Muscat RA, Kuchina A, Sample P, Yao Z, et al. Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science. 2018;360(6385):176–82.

Zheng GX, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8(1):1–12.

Abdelaal T, Michielsen L, Cats D, Hoogduin D, Mei H, Reinders MJ, et al. A comparison of automatic cell identification methods for single-cell RNA sequencing data. Genome Biol. 2019;20(1):1–19.

Luecken MD, Theis FJ. Current best practices in single-cell RNA-seq analysis: a tutorial. Mol Syst Biol. 2019;15(6):e8746.

Guo H, Li J. scSorter: assigning cells to known cell types according to marker genes. Genome Biol. 2021;22(1):1–18.

Pliner HA, Shendure J, Trapnell C. Supervised classification enables rapid annotation of cell atlases. Nat Methods. 2019;16(10):983–6.

Zhang AW, O’Flanagan C, Chavez EA, Lim JL, Ceglia N, McPherson A, et al. Probabilistic cell-type assignment of single-cell RNA-seq for tumor microenvironment profiling. Nat Methods. 2019;16(10):1007–15.

Zhang Z, Luo D, Zhong X, Choi JH, Ma Y, Wang S, et al. SCINA: a semi-supervised subtyping algorithm of single cells and bulk samples. Genes. 2019;10(7):531.

Johnson TS, Wang T, Huang Z, Yu CY, Wu Y, Han Y, et al. LAmbDA: label ambiguous domain adaptation dataset integration reduces batch effects and improves subtype detection. Bioinformatics. 2019;35(22):4696–706.

Ma F, Pellegrini M. ACTINN: automated identification of cell types in single cell RNA sequencing. Bioinformatics. 2020;36(2):533–8.

Tan Y, Cahan P. SingleCellNet: a computational tool to classify single cell RNA-Seq data across platforms and across species. Cell Syst. 2019;9(2):207–13.

Salcher S, Sturm G, Horvath L, Untergasser G, Kuempers C, Fotakis G, et al. High-resolution single-cell atlas reveals diversity and plasticity of tissue-resident neutrophils in non-small cell lung cancer. Cancer Cell. 2022;40(12):1503–20.

Good Z, Sarno J, Jager A, Samusik N, Aghaeepour N, Simonds EF, et al. Single-cell developmental classification of B cell precursor acute lymphoblastic leukemia at diagnosis reveals predictors of relapse. Nat Med. 2018;24(4):474–83.

Wagner J, Rapsomaniki MA, Chevrier S, Anzeneder T, Langwieder C, Dykgers A, et al. A single-cell atlas of the tumor and immune ecosystem of human breast cancer. Cell. 2019;177(5):1330–45.

Weinstein JN, Collisson EA, Mills GB, Shaw KR, Ozenberger BA, Ellrott K, et al. The cancer genome atlas pan-cancer analysis project. Nat Genet. 2013;45(10):1113–20.

Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Disc. 2012;2(5):401–4.

Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, Sumer SO, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal. 2013;6(269):1.

Sun D, Guan X, Moran AE, Wu LY, Qian DZ, Schedin P, et al. Identifying phenotype-associated subpopulations by integrating bulk and single-cell sequencing data. Nat Biotechnol. 2022;40(4):527–38.

Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech Theory Exp. 2008;2008(10):P10008.

Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck WM III, et al. Comprehensive integration of single-cell data. Cell. 2019;177(7):1888–902.

Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc Ser B Stat Methodol. 2005;67(2):301–20.

McInnes L, Healy J, Melville J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. 2018. arXiv preprint arXiv:1802.03426 .

Wong CJ, Wang LH, Friedman SD, Shaw D, Campbell AE, Budech CB, et al. Longitudinal measures of RNA expression and disease activity in FSHD muscle biopsies. Hum Mol Genet. 2020;29(6):1030–43.

Tuong ZK, Loudon KW, Berry B, Richoz N, Jones J, Tan X, et al. Resolving the immune landscape of human prostate at a single-cell level in health and cancer. Cell Rep. 2021;37(12):110132.

Hume DA. The mononuclear phagocyte system. Curr Opin Immunol. 2006;18(1):49–53.

Hume DA, Ross IL, Himes SR, Sasmono RT, Wells CA, Ravasi T. The mononuclear phagocyte system revisited. J Leukoc Biol. 2002;72(4):621–7.

Raggi F, Bosco MC. Targeting mononuclear phagocyte receptors in cancer immunotherapy: new perspectives of the triggering receptor expressed on myeloid cells (TREM-1). Cancers. 2020;12(5):1337.

Largeot A, Pagano G, Gonder S, Moussay E, Paggetti J. The B-side of cancer immunity: the underrated tune. Cells. 2019;8(5):449.

Wu SZ, Al-Eryani G, Roden DL, Junankar S, Harvey K, Andersson A, et al. A single-cell and spatially resolved atlas of human breast cancers. Nat Genet. 2021;53(9):1334–47.

Fernández-Nogueira P, Fuster G, Gutierrez-Uzquiza Á, Gascón P, Carbó N, Bragado P. Cancer-associated fibroblasts in breast cancer treatment response and metastasis. Cancers. 2021;13(13):3146.

Ao Z, Shah SH, Machlin LM, Parajuli R, Miller PC, Rawal S, et al. Identification of cancer-associated fibroblasts in circulating blood from patients with metastatic breast cancer. Identification of cCAFs from metastatic cancer patients. Cancer Res. 2015;75(22):4681–7.

Arcucci A, Ruocco MR, Granato G, Sacco AM, Montagnani S. Cancer: an oxidative crosstalk between solid tumor cells and cancer associated fibroblasts. BioMed Res Int. 2016;2016.  https://pubmed.ncbi.nlm.nih.gov/27595103/ .

Buchsbaum RJ, Oh SY. Breast cancer-associated fibroblasts: where we are and where we need to go. Cancers. 2016;8(2):19.

Ruocco MR, Avagliano A, Granato G, Imparato V, Masone S, Masullo M, et al. Involvement of breast cancer-associated fibroblasts in tumor development, therapy resistance and evaluation of potential therapeutic strategies. Curr Med Chem. 2018;25(29):3414–34.

Savas P, Virassamy B, Ye C, Salim A, Mintoff CP, Caramia F, et al. Single-cell profiling of breast cancer T cells reveals a tissue-resident memory subset associated with improved prognosis. Nat Med. 2018;24(7):986–93.

Bassez A, Vos H, Van Dyck L, Floris G, Arijs I, Desmedt C, et al. A single-cell map of intratumoral changes during anti-PD1 treatment of patients with breast cancer. Nat Med. 2021;27(5):820–32.

Romero JM, Grünwald B, Jang GH, Bavi PP, Jhaveri A, Masoomian M, et al. A four-chemokine signature is associated with a T-cell-inflamed phenotype in primary and metastatic pancreatic cancer. Chemokines in Pancreatic Cancer. Clin Cancer Res. 2020;26(8):1997–2010.

Tamura R, Yoshihara K, Nakaoka H, Yachida N, Yamaguchi M, Suda K, et al. XCL1 expression correlates with CD8-positive T cells infiltration and PD-L1 expression in squamous cell carcinoma arising from mature cystic teratoma of the ovary. Oncogene. 2020;39(17):3541–54.

Hernandez R, Põder J, LaPorte KM, Malek TR. Engineering IL-2 for immunotherapy of autoimmunity and cancer. Nat Rev Immunol. 2022:22:1–15.  https://pubmed.ncbi.nlm.nih.gov/35217787/ .

Korotkevich G, Sukhov V, Budin N, Shpak B, Artyomov MN, Sergushichev A. Fast gene set enrichment analysis. BioRxiv. 2016:060012.  https://www.biorxiv.org/content/10.1101/060012v3.abstract .

Dang CV. MYC on the path to cancer. Cell. 2012;149(1):22–35.

Gnanaprakasam JR, Wang R. MYC in regulating immunity: metabolism and beyond. Genes. 2017;8(3):88.

Oshi M, Takahashi H, Tokumaru Y, Yan L, Rashid OM, Matsuyama R, et al. G2M cell cycle pathway score as a prognostic biomarker of metastasis in estrogen receptor (ER)-positive breast cancer. Int J Mol Sci. 2020;21(8):2921.

Zhang X, Lan Y, Xu J, Quan F, Zhao E, Deng C, et al. Cell Marker: a manually curated resource of cell markers in human and mouse. Nucleic Acids Res. 2019;47(D1):D721–8.

Chen L, Yang L, Qiao F, Hu X, Li S, Yao L, et al. High levels of nucleolar spindle-associated protein and reduced levels of BRCA1 expression predict poor prognosis in triple-negative breast cancer. PLoS ONE. 2015;10(10):e0140572.

Li M, Yang B. Prognostic value of NUSAP1 and its correlation with immune infiltrates in human breast cancer. Crit Rev Eukaryot Gene Expr. 2022;32(3). https://pubmed.ncbi.nlm.nih.gov/35695609/ .

Zhang X, Pan Y, Fu H, Zhang J. Nucleolar and spindle associated protein 1 (NUSAP1) inhibits cell proliferation and enhances susceptibility to epirubicin in invasive breast cancer cells by regulating cyclin D kinase (CDK1) and DLGAP5 expression. Med Sci Monit: Int Med J Exp Clin Res. 2018;24:8553.

Geyer FC, Rodrigues DN, Weigelt B, Reis-Filho JS. Molecular classification of estrogen receptor-positive/luminal breast cancers. Adv Anat Pathol. 2012;19(1):39–53.

Karamitopoulou E, Perentes E, Tolnay M, Probst A. Prognostic significance of MIB-1, p53, and bcl-2 immunoreactivity in meningiomas. Hum Pathol. 1998;29(2):140–5.

Duxbury MS, Whang EE. RRM2 induces NF- \(\kappa\) B-dependent MMP-9 activation and enhances cellular invasiveness. Biochem Biophys Res Commun. 2007;354(1):190–6.

Zhou BS, Tsai P, Ker R, Tsai J, Ho R, Yu J, et al. Overexpression of transfected human ribonucleotide reductase M2 subunit in human cancer cells enhances their invasive potential. Clin Exp Metastasis. 1998;16(1):43–9.

Zhang H, Liu X, Warden CD, Huang Y, Loera S, Xue L, et al. Prognostic and therapeutic significance of ribonucleotide reductase small subunit M2 in estrogen-negative breast cancers. BMC Cancer. 2014;14(1):1–16.

Putluri N, Maity S, Kommagani R, Creighton CJ, Putluri V, Chen F, et al. Pathway-centric integrative analysis identifies RRM2 as a prognostic marker in breast cancer associated with poor survival and tamoxifen resistance. Neoplasia. 2014;16(5):390–402.

Koleck TA, Conley YP. Identification and prioritization of candidate genes for symptom variability in breast cancer survivors based on disease characteristics at the cellular level. Breast Cancer Targets Ther. 2016;8:29.

Li Jp, Zhang Xm, Zhang Z, Zheng Lh, Jindal S, Liu Yj. Association of p53 expression with poor prognosis in patients with triple-negative breast invasive ductal carcinoma. Medicine. 2019;98(18).  https://pubmed.ncbi.nlm.nih.gov/31045815/ .

Gong MT, Ye SD, Lv WW, He K, Li WX. Comprehensive integrated analysis of gene expression datasets identifies key anti-cancer targets in different stages of breast cancer. Exp Ther Med. 2018;16(2):802–10.

Chen Wx, Yang Lg, Xu Ly, Cheng L, Qian Q, Sun L, et al. Bioinformatics analysis revealing prognostic significance of RRM2 gene in breast cancer. Biosci Rep. 2019;39(4).  https://pubmed.ncbi.nlm.nih.gov/30898978/ .

Hao Z, Zhang H, Cowell J. Ubiquitin-conjugating enzyme UBE2C: molecular biology, role in tumorigenesis, and potential as a biomarker. Tumor Biol. 2012;33(3):723–30.

Arriola E, Rodriguez-Pinilla SM, Lambros MB, Jones RL, James M, Savage K, et al. Topoisomerase II alpha amplification may predict benefit from adjuvant anthracyclines in HER2 positive early breast cancer. Breast Cancer Res Treat. 2007;106(2):181–9.

Knoop AS, Knudsen H, Balslev E, Rasmussen BB, Overgaard J, Nielsen KV, et al. Retrospective analysis of topoisomerase IIa amplifications and deletions as predictive markers in primary breast cancer patients randomly assigned to cyclophosphamide, methotrexate, and fluorouracil or cyclophosphamide, epirubicin, and fluorouracil: Danish Breast Cancer Cooperative Group. J Clin Oncol. 2005;23(30):7483–90.

Tanner M, Isola J, Wiklund T, Erikstein B, Kellokumpu-Lehtinen P, Malmstrom P, et al. Topoisomerase II \(\alpha\) gene amplification predicts favorable treatment response to tailored and dose-escalated anthracycline-based adjuvant chemotherapy in HER-2/neu-amplified breast cancer: Scandinavian Breast Group Trial 9401. J Clin Oncol. 2006;24(16):2428–36.

Arriola E, Moreno A, Varela M, Serra JM, Falo C, Benito E, et al. Predictive value of HER-2 and topoisomerase II \(\alpha\) in response to primary doxorubicin in breast cancer. Eur J Cancer. 2006;42(17):2954–60.

Järvinen TA, Tanner M, Bärlund M, Borg Å, Isola J. Characterization of topoisomerase II \(\alpha\) gene amplification and deletion in breast cancer. Gene Chromosome Cancer. 1999;26(2):142–50.

Landberg G, Erlanson M, Roos G, Tan EM, Casiano CA. Nuclear autoantigen p330d/CENP-F: a marker for cell proliferation in human malignancies. Cytom J Int Soc Anal Cytol. 1996;25(1):90–8.

Bettelli E, Carrier Y, Gao W, Korn T, Strom TB, Oukka M, et al. Reciprocal developmental pathways for the generation of pathogenic effector TH17 and regulatory T cells. Nature. 2006;441(7090):235–8.

Lambrechts D, Wauters E, Boeckx B, Aibar S, Nittner D, Burton O, et al. Phenotype molding of stromal cells in the lung tumor microenvironment. Nat Med. 2018;24(8):1277–89.

Bremnes RM, Busund LT, Kilvær TL, Andersen S, Richardsen E, Paulsen EE, et al. The role of tumor-infiltrating lymphocytes in development, progression, and prognosis of non-small cell lung cancer. J Thorac Oncol. 2016;11(6):789–800.

Schalper KA, Brown J, Carvajal-Hausdorf D, McLaughlin J, Velcheti V, Syrigos KN, et al. Objective measurement and clinical significance of TILs in non–small cell lung cancer. J Natl Cancer Inst. 2015;107(3):dju435.

Tay RE, Richardson EK, Toh HC. Revisiting the role of CD4+ T cells in cancer immunotherapy—new insights into old paradigms. Cancer Gene Ther. 2021;28(1):5–17.

Dieu-Nosjean MC, Goc J, Giraldo NA, Sautès-Fridman C, Fridman WH. Tertiary lymphoid structures in cancer and beyond. Trends Immunol. 2014;35(11):571–80.

Wang Ss, Liu W, Ly D, Xu H, Qu L, Zhang L. Tumor-infiltrating B cells: their role and application in anti-tumor immunity in lung cancer. Cell Mol Immunol. 2019;16(1):6–18.

van den Heuvel A, Mahfouz A, Kloet SL, Balog J, van Engelen BG, Tawil R, et al. Single-cell RNA sequencing in facioscapulohumeral muscular dystrophy disease etiology and development. Hum Mol Genet. 2019;28(7):1064–75.

Fan J, Li R. Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc. 2001;96(456):1348–60.

Hastie T, Tibshirani R, Friedman JH, Friedman JH. The elements of statistical learning: data mining, inference, and prediction, vol. 2. New York: Springer; 2009.

Baran Y, Bercovich A, Sebe-Pedros A, Lubling Y, Giladi A, Chomsky E, et al. MetaCell: analysis of single-cell RNA-seq data using K-nn graph partitions. Genome Biol. 2019;20(1):1–19.

Persad S, Choo ZN, Dien C, Sohail N, Masilionis I, Chaligné R, et al. SEACells infers transcriptional and epigenomic cellular states from single-cell genomics data. Nat Biotechnol. 2023;41:1–12.  https://pubmed.ncbi.nlm.nih.gov/36973557/ .

Ben-Kiki O, Bercovich A, Lifshitz A, Tanay A. Metacell-2: a divide-and-conquer metacell algorithm for scalable scRNA-seq analysis. Genome Biol. 2022;23(1):100.

Bilous M, Tran L, Cianciaruso C, Gabriel A, Michel H, Carmona SJ, et al. Metacells untangle large and complex single-cell transcriptome networks. BMC Bioinformatics. 2022;23(1):336.

Avila Cobos F, Alquicira-Hernandez J, Powell JE, Mestdagh P, De Preter K. Benchmarking of cell type deconvolution pipelines for transcriptomics data. Nat Commun. 2020;11(1):1–14.

Jin H, Liu Z. A benchmark for RNA-seq deconvolution analysis under dynamic testing environments. Genome Biol. 2021;22(1):1–23.

Wang X, Park J, Susztak K, Zhang NR, Li M. Bulk tissue cell type deconvolution with multi-subject single-cell expression reference. Nat Commun. 2019;10(1):380.

Wurm MJ, Rathouz PJ, Hanlon BM. Regularized ordinal regression and the ordinalNet R package. 2017. arXiv preprint arXiv:1706.05003 .

Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33(1):1.

Simon N, Friedman J, Hastie T. A blockwise descent algorithm for group-penalized multiresponse and multinomial regression. 2013. arXiv preprint arXiv:1311.6529 .

Simon N, Friedman J, Hastie T, Tibshirani R. Regularization paths for Cox’s proportional hazards model via coordinate descent. J Stat Softw. 2011;39(5):1.

Tibshirani R, Bien J, Friedman J, Hastie T, Simon N, Taylor J, et al. Strong rules for discarding predictors in lasso-type problems. J R Stat Soc Ser B Stat Methodol. 2012;74(2):245–66.

Efron B. Bootstrap methods: another look at the jackknife. In: Breakthroughs in statistics. New York: Springer; 1992. pp. 569–593.

Efron B, Tibshirani RJ. An introduction to the bootstrap. London: CRC Press; 1994.

Zappia L, Phipson B, Oshlack A. Splatter: simulation of single-cell RNA sequencing data. Genome Biol. 2017;18(1):174.

Gan D, Zhu Y, Lu X, Li J. Simulated datasets used in SCIPAC analysis. Zenodo. 2024. https://doi.org/10.5281/zenodo.11013320 .

Gan D, Zhu Y, Lu X, Li J. SCIPAC R package. GitHub. 2024. https://github.com/RavenGan/SCIPAC . Accessed 24 Apr 2024.

Gan D, Zhu Y, Lu X, Li J. SCIPAC source code. Zenodo. 2024. https://doi.org/10.5281/zenodo.11013696 .

Wu SZ, Al-Eryani G, Roden DL, Junankar S, Harvey K, Andersson A, et al. A single-cell and spatially resolved atlas of human breast cancers. Datasets. 2021. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE176078 . Gene Expression Omnibus. Accessed 1 Oct 2022.

Lambrechts D, Wauters E, Boeckx B, Aibar S, Nittner D, Burton O, et al. Phenotype molding of stromal cells in the lung tumor microenvironment. Datasets. 2018. https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-6149 . ArrayExpress. Accessed 24 July 2022.

Lambrechts D, Wauters E, Boeckx B, Aibar S, Nittner D, Burton O, et al. Phenotype molding of stromal cells in the lung tumor microenvironment. Datasets. 2018. https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-6653 . ArrayExpress. Accessed 24 July 2022.

van den Heuvel A, Mahfouz A, Kloet SL, Balog J, van Engelen BG, Tawil R, et al. Single-cell RNA sequencing in facioscapulohumeral muscular dystrophy disease etiology and development. Datasets. 2019. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE122873 . Gene Expression Omnibus. Accessed 13 Aug 2022.

Colaprico A, Silva TC, Olsen C, Garofano L, Cava C, Garolini D, et al. TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 2016;44(8):e71.

Review history

The review history is available as Additional file 3.

Peer review information

Veronique van den Berghe was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Funding

This work is supported by the National Institutes of Health (R01CA280097 to X.L. and J.L., R01CA252878 to J.L.) and the DOD BCRP Breakthrough Award, Level 2 (W81XWH2110432 to J.L.).

Author information

Authors and Affiliations

Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, 46556, IN, USA

Dailin Gan & Jun Li

Department of Biological Sciences, Boler-Parseghian Center for Rare and Neglected Diseases, Harper Cancer Research Institute, Integrated Biomedical Sciences Graduate Program, University of Notre Dame, Notre Dame, 46556, IN, USA

Yini Zhu & Xin Lu

Tumor Microenvironment and Metastasis Program, Indiana University Melvin and Bren Simon Comprehensive Cancer Center, Indianapolis, 46202, IN, USA

Contributions

J.L. conceived and supervised the study. J.L. and D.G. proposed the methods. D.G. implemented the methods and analyzed the data. D.G. and J.L. drafted the paper. D.G., Y.Z., X.L., and J.L. interpreted the results and revised the paper.

Corresponding author

Correspondence to Jun Li.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1. Supplementary materials that include additional results and plots.

Additional file 2. A vignette of the SCIPAC package.

Additional file 3. Review history.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Gan, D., Zhu, Y., Lu, X. et al. SCIPAC: quantitative estimation of cell-phenotype associations. Genome Biol 25 , 119 (2024). https://doi.org/10.1186/s13059-024-03263-1

Received : 30 January 2023

Accepted : 30 April 2024

Published : 13 May 2024

DOI : https://doi.org/10.1186/s13059-024-03263-1


Keywords

  • Phenotype association
  • Single cell
  • RNA sequencing
  • Cancer research

Genome Biology

ISSN: 1474-760X
