
Hypothesis Testing in Data Science: Its Usage and Types

Hypothesis Testing in Data Science is a crucial method for making informed decisions from data. This blog explores its essential usage in analysing trends and patterns, and the different types such as null, alternative, one-tailed, and two-tailed tests, providing a comprehensive understanding for both beginners and advanced practitioners.


Table of Contents  

1) What is Hypothesis Testing in Data Science? 

2) Importance of Hypothesis Testing in Data Science 

3) Types of Hypothesis Testing 

4) Basic steps in Hypothesis Testing 

5) Real-world use cases of Hypothesis Testing 

6) Conclusion 

What is Hypothesis Testing in Data Science?  

Hypothesis Testing in Data Science is a statistical method used to assess the validity of assumptions or claims about a population based on sample data. It involves formulating two Hypotheses, the null Hypothesis (H0) and the alternative Hypothesis (Ha or H1), and then using statistical tests to find out if there is enough evidence to support the alternative Hypothesis.  

Hypothesis Testing is a critical tool for making data-driven decisions, evaluating the significance of observed effects or differences, and drawing meaningful conclusions from data. It allows Data Scientists to uncover patterns, relationships, and insights that inform decisions in domains from medicine to business and beyond. 
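The two Hypotheses can be illustrated with a minimal Python sketch (scipy is assumed, and the coin-flip counts are invented for illustration): we test H0 "the coin is fair" against Ha "the coin is biased".

```python
from scipy import stats

# Hypothetical example: is a coin fair?
# H0: p = 0.5 (no effect); Ha: p != 0.5.
heads, flips = 62, 100
result = stats.binomtest(heads, flips, p=0.5, alternative="two-sided")
print(f"p-value = {result.pvalue:.4f}")

if result.pvalue <= 0.05:
    print("Reject H0: evidence the coin is biased")
else:
    print("Fail to reject H0: insufficient evidence of bias")
```

With 62 heads in 100 flips, the exact binomial test yields a p-value just above 0.02, so at the conventional 0.05 level we would reject the null Hypothesis.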


Importance of Hypothesis Testing in Data Science  

The significance of Hypothesis Testing in Data Science cannot be overstated. It serves as the cornerstone of data-driven decision-making. By systematically testing Hypotheses, Data Scientists can: 


Objective decision-making 

Hypothesis Testing provides a structured and impartial method for making decisions based on data. In a world where biases can skew perceptions, Data Scientists rely on this method to ensure that their conclusions are grounded in empirical evidence, making their decisions more objective and trustworthy. 

Statistical rigour 

Data Scientists deal with large amounts of data, and Hypothesis Testing helps them make sense of it. It quantifies the significance of observed patterns, differences, or relationships. This statistical rigour is essential in distinguishing between mere coincidences and meaningful findings, reducing the likelihood of making decisions based on random chance. 

Resource allocation 

Resources, whether they are financial, human, or time-related, are often limited. Hypothesis Testing enables efficient resource allocation by guiding Data Scientists towards strategies or interventions that are statistically significant. This ensures that efforts are directed where they are most likely to yield valuable results. 

Risk management 

In domains like healthcare and finance, where lives and livelihoods are at stake, Hypothesis Testing is a critical tool for risk assessment. For instance, in drug development, Hypothesis Testing is used to determine the safety and efficacy of new treatments, helping mitigate potential risks to patients. 

Innovation and progress 

Hypothesis Testing fosters innovation by providing a systematic framework to evaluate new ideas, products, or strategies. It encourages a cycle of experimentation, feedback, and improvement, driving continuous progress and innovation. 

Strategic decision-making 

Organisations base their strategies on data-driven insights. Hypothesis Testing enables them to make informed decisions about market trends, customer behaviour, and product development. These decisions are grounded in empirical evidence, increasing the likelihood of success. 

Scientific integrity 

In scientific research, Hypothesis Testing is integral to maintaining the integrity of research findings. It ensures that conclusions are drawn from rigorous statistical analysis rather than conjecture. This is essential for advancing knowledge and building upon existing research. 

Regulatory compliance 

Many industries, such as pharmaceuticals and aviation, operate under strict regulatory frameworks. Hypothesis Testing is essential for demonstrating compliance with safety and quality standards. It provides the statistical evidence required to meet regulatory requirements. 


Types of Hypothesis Testing  

Hypothesis Testing can be classified into several types. The five main types are described below: 


Alternative Hypothesis

The Alternative Hypothesis, denoted as Ha or H1, is the assertion or claim that researchers aim to support with their data analysis. It represents the opposite of the null Hypothesis (H0) and suggests that there is a significant effect, relationship, or difference in the population. In simpler terms, it's the statement that researchers hope to find evidence for during their analysis. For example, if you are testing a new drug's efficacy, the alternative Hypothesis might state that the drug has a measurable positive effect on patients' health. 

Null Hypothesis 

The Null Hypothesis, denoted as H0, is the default assumption in Hypothesis Testing. It posits that there is no significant effect, relationship, or difference in the population being studied. In other words, it represents the status quo or the absence of an effect. Researchers typically set out to challenge or disprove the Null Hypothesis by collecting and analysing data. Using the drug efficacy example again, the Null Hypothesis might state that the new drug has no effect on patients' health. 

Non-directional Hypothesis 

A Non-directional Hypothesis, also known as a two-tailed Hypothesis, is used when researchers are interested in whether there is any significant difference, effect, or relationship in either direction (positive or negative). This type of Hypothesis allows for the possibility of finding effects in both directions. For instance, in a study comparing the performance of two groups, a Non-directional Hypothesis would suggest that there is a significant difference between the groups, without specifying which group performs better. 

Directional Hypothesis 

A Directional Hypothesis, also called a one-tailed Hypothesis, is employed when researchers have a specific expectation about the direction of the effect, relationship, or difference they are investigating. In this case, the Hypothesis predicts an outcome in a particular direction—either positive or negative. For example, if you expect that a new teaching method will improve student test scores, a directional Hypothesis would state that the new method leads to higher test scores. 
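The distinction between Non-directional and Directional Hypotheses maps directly onto two-tailed and one-tailed tests. The following Python sketch (scipy assumed; the test scores and the historical mean of 70 are made up) runs both on the same data:

```python
from scipy import stats

# Hypothetical test scores under a new teaching method; the historical
# mean is 70. All numbers here are invented for illustration.
scores = [74, 71, 69, 78, 72, 75, 70, 73, 76, 68, 77, 72, 74, 71, 75]

# Non-directional (two-tailed): Ha says the mean differs from 70.
two_tailed = stats.ttest_1samp(scores, popmean=70, alternative="two-sided")
# Directional (one-tailed): Ha says the mean is greater than 70.
one_tailed = stats.ttest_1samp(scores, popmean=70, alternative="greater")

print(f"two-tailed p = {two_tailed.pvalue:.4f}")
print(f"one-tailed p = {one_tailed.pvalue:.4f}")
```

When the observed effect lies in the predicted direction, the one-tailed p-value is half the two-tailed one, which is why a directional Hypothesis should be chosen before seeing the data, not after.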

Statistical Hypothesis 

A Statistical Hypothesis is a Hypothesis formulated in a way that it can be tested using statistical methods. It involves specific numerical values or parameters that can be measured or compared. Statistical Hypotheses are crucial for quantitative research and often involve means, proportions, variances, correlations, or other measurable quantities. These Hypotheses provide a precise framework for conducting statistical tests and drawing conclusions based on data analysis. 


Basic steps in Hypothesis Testing  

Hypothesis Testing is a systematic approach used in statistics to make informed decisions based on data. It is a critical tool in Data Science, research, and many other fields where data analysis is employed. The following are the basic steps involved in Hypothesis Testing: 


1) Formulate Hypotheses 

The first step in Hypothesis Testing is to clearly define your research question and translate it into two mutually exclusive Hypotheses: 

a) Null Hypothesis (H0): This is the default assumption, often representing the status quo or the absence of an effect. It states that there is no significant difference, relationship, or effect in the population. 

b) Alternative Hypothesis (Ha or H1): This is the statement that contradicts the null Hypothesis. It suggests that there is a significant difference, relationship, or effect in the population. 

The formulation of these Hypotheses is crucial, as they serve as the foundation for your entire Hypothesis Testing process. 

2) Collect data 

With your Hypotheses in place, the next step is to gather relevant data through surveys, experiments, observations, or any other suitable method. The data collected should be representative of the population you are studying. The quality and quantity of data are essential factors in the success of your Hypothesis Testing. 

3) Choose a significance level (α) 

Before conducting the statistical test, you need to decide on the level of significance, denoted as α. The significance level represents the threshold for statistical significance and determines how confident you want to be in your results. A common choice is α = 0.05, which implies a 5% chance of making a Type I error (rejecting the null Hypothesis when it's true). You can choose a different α value based on the specific requirements of your analysis. 
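The meaning of α as a Type I error rate can be checked by simulation. This sketch (Python with scipy and numpy assumed; purely illustrative) repeatedly tests data generated with the null Hypothesis true and counts how often it is falsely rejected:

```python
import numpy as np
from scipy import stats

# When H0 is true, a level-alpha test should reject about alpha of the time.
rng = np.random.default_rng(0)
alpha, rejections, trials = 0.05, 0, 2000

for _ in range(trials):
    sample = rng.normal(loc=0, scale=1, size=30)  # H0 true: mean really is 0
    p = stats.ttest_1samp(sample, popmean=0).pvalue
    if p <= alpha:
        rejections += 1

print(f"Observed Type I error rate: {rejections / trials:.3f}")
```

The observed rejection rate hovers near 0.05, confirming that α directly controls how often true null Hypotheses are rejected by chance.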

4) Perform the test 

Based on the nature of your data and the Hypotheses you've formulated, select the appropriate statistical test. There are various tests available, including t-tests, chi-squared tests, ANOVA, regression analysis, and more. The chosen test should align with the type of data (e.g., continuous or categorical) and the research question (e.g., comparing means or testing for independence). 

Execute the selected statistical test on your data to obtain test statistics and p-values. The test statistics quantify the difference or effect you are investigating, while the p-value represents the probability of obtaining the observed results if the null Hypothesis were true. 
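As an illustrative sketch of this step (Python with scipy assumed, and both samples are invented), a two-sample t-test returns exactly the two quantities described above:

```python
from scipy import stats

# Made-up measurements from two groups; the test returns both the
# test statistic and the p-value.
group_a = [5.1, 4.9, 5.4, 5.0, 5.2, 4.8, 5.3]
group_b = [5.6, 5.8, 5.5, 5.9, 5.7, 5.4, 6.0]

stat, p = stats.ttest_ind(group_a, group_b)
print(f"t statistic = {stat:.2f}, p-value = {p:.4f}")
```

Here the large-magnitude t statistic and tiny p-value together indicate a difference between the group means that is very unlikely under the null Hypothesis.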

5) Analyse the results 

Once you have the test statistics and p-value, it's time to interpret the results. The primary focus is on the p-value: 

a) If the p-value is less than or equal to your chosen significance level (α), typically 0.05, you have evidence to reject the null Hypothesis. This indicates that there is a significant difference, relationship, or effect in the population. 

b) If the p-value is greater than α, you fail to reject the null Hypothesis, indicating that there is insufficient evidence to support the alternative Hypothesis. 

6) Draw conclusions 

Based on the analysis of the p-value and the comparison to the significance level, you can draw conclusions about your research question: 

a) If you reject the null Hypothesis, you can accept the alternative Hypothesis and make inferences based on the evidence provided by your data. 

b) If you fail to reject the null Hypothesis, you do not accept the alternative Hypothesis, and you acknowledge that there is no significant evidence to support your claim. 

It's important to communicate your findings clearly, including the implications and limitations of your analysis. 

Real-world use cases of Hypothesis Testing  

The following are some of the real-world use cases of Hypothesis Testing. 

a) Medical research: Hypothesis Testing is crucial in determining the efficacy of new medications or treatments. For instance, in a clinical trial, researchers use Hypothesis Testing to assess whether a new drug is significantly more effective than a placebo in treating a particular condition. 

b) Marketing and advertising: Businesses employ Hypothesis Testing to evaluate the impact of marketing campaigns. A company may test whether a new advertising strategy leads to a significant increase in sales compared to the previous approach. 

c) Manufacturing and quality control: Manufacturing industries use Hypothesis Testing to ensure product quality. For example, in the automotive industry, Hypothesis Testing can be applied to test whether a new manufacturing process results in a significant reduction in defects. 

d) Education: In the field of education, Hypothesis Testing can be used to assess the effectiveness of teaching methods. Researchers may test whether a new teaching approach leads to statistically significant improvements in student performance. 

e) Finance and investment: Investment strategies are often evaluated using Hypothesis Testing. Investors may test whether a new investment strategy outperforms a benchmark index over a specified period.  
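The manufacturing use case above can be sketched as a chi-squared test of independence on a 2×2 table of defect counts (Python with scipy assumed; all counts are invented):

```python
from scipy import stats

# Hypothetical quality-control data: defect counts from an old and a new
# manufacturing process. H0: the defect rate is the same for both.
table = [[40, 960],   # old process: defective, non-defective
         [22, 978]]   # new process: defective, non-defective

chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p-value = {p:.4f}")
```

A p-value below 0.05 here would support the claim that the new process genuinely reduces defects rather than the difference being sampling noise.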


Conclusion 

To sum it up, Hypothesis Testing in Data Science is a powerful tool that enables Data Scientists to make evidence-based decisions and draw meaningful conclusions from data. Understanding the types, methods, and steps involved in Hypothesis Testing is essential for any Data Scientist. By rigorously applying Hypothesis Testing techniques, you can gain valuable insights and drive informed decision-making in various domains. 



University of Colorado Boulder

Statistical Inference and Hypothesis Testing in Data Science Applications

This course is part of Data Science Foundations: Statistical Inference Specialization


Instructor: Jem Corcoran


Recommended experience

Intermediate level

Sequence in calculus up through Calculus II (preferably multivariate calculus) and some programming experience in R

What you'll learn

  • Define a composite hypothesis and the level of significance for a test with a composite null hypothesis.
  • Define a test statistic, level of significance, and the rejection region for a hypothesis test. Give the form of a rejection region.
  • Perform tests concerning a true population variance.
  • Compute the sampling distributions for the sample mean and sample minimum of the exponential distribution.


There are 6 modules in this course

This course will focus on theory and implementation of hypothesis testing, especially as it relates to applications in data science. Students will learn to use hypothesis tests to make informed decisions from data. Special attention will be given to the general logic of hypothesis testing, error and error rates, power, simulation, and the correct computation and interpretation of p-values. Attention will also be given to the misuse of testing concepts, especially p-values, and the ethical implications of such misuse.

This course can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics. Learn more about the MS-DS program at https://www.coursera.org/degrees/master-of-science-data-science-boulder.

Start Here!

Welcome to the course! This module contains logistical information to get you started!

What's included

6 readings 1 app item 1 discussion prompt 1 ungraded lab

6 readings • Total 57 minutes

  • Introducing the Yellowdig Learning Community Pilot • 10 minutes
  • Earn Academic Credit for your Work! • 10 minutes
  • Course Support • 10 minutes
  • Course Resources • 10 minutes
  • Getting Started with Yellowdig • 15 minutes
  • Join the Discussion in our Yellowdig Community • 2 minutes

1 app item • Total 60 minutes

  • Statistical Inference and Hypothesis Testing in Data Science Applications Yellowdig Community • 60 minutes

1 discussion prompt • Total 10 minutes

  • Introduce Yourself • 10 minutes

1 ungraded lab • Total 60 minutes

  • Introduction to Jupyter Notebooks and R • 60 minutes

Fundamental Concepts of Hypothesis Testing

In this module, we will define a hypothesis test and develop the intuition behind designing a test. We will learn the language of hypothesis testing, which includes definitions of a null hypothesis, an alternative hypothesis, and the level of significance of a test. We will walk through a very simple test.

6 videos 12 readings 1 quiz 1 programming assignment 2 ungraded labs

6 videos • Total 69 minutes

  • What is Hypothesis Testing? • 3 minutes • Preview module
  • Types of Hypotheses • 14 minutes
  • Normal Computations • 23 minutes
  • Errors in Hypothesis Testing • 7 minutes
  • Test Statistics and Significance • 14 minutes
  • A First Test • 4 minutes

12 readings • Total 107 minutes

  • What is Hypothesis Testing? • 5 minutes
  • Types of Hypotheses • 10 minutes
  • Video Slides for Types of Hypotheses • 10 minutes
  • Normal Computations • 10 minutes
  • Video Slides for Normal Computations • 10 minutes
  • Errors in Hypothesis Testing • 10 minutes
  • Video Slides for Errors in Hypothesis Testing • 10 minutes
  • Test Statistics and Significance • 10 minutes
  • Video Slides for Test Statistics and Level of Significance • 10 minutes
  • A First Test • 10 minutes
  • Video Slides for A First Test • 10 minutes

1 quiz • Total 30 minutes

  • Introduction to Hypothesis Testing • 30 minutes

1 programming assignment • Total 180 minutes

  • Intro to Hypothesis Testing Lab • 180 minutes

2 ungraded labs • Total 120 minutes

  • An Introduction to R and Jupyter Notebooks • 60 minutes
  • Visualizing Errors in Hypothesis Testing • 60 minutes

Composite Tests, Power Functions, and P-Values

In this module, we will expand the lessons of Module 1 to composite hypotheses for both one and two-tailed tests. We will define the “power function” for a test and discuss its interpretation and how it can lead to the idea of a “uniformly most powerful” test. We will discuss and interpret “p-values” as an alternate approach to hypothesis testing.
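The power function this module introduces can be sketched in a few lines. The course itself works in R; the version below is an equivalent Python sketch (scipy and numpy assumed, with illustrative parameter choices) for a one-sided z-test of H0: mu ≤ 0 against Ha: mu > 0:

```python
import numpy as np
from scipy import stats

# Power function sketch: one-sided z-test, alpha = 0.05, n = 25, known sigma = 1.
alpha, n, sigma = 0.05, 25, 1.0
z_crit = stats.norm.ppf(1 - alpha)  # reject H0 when the z statistic exceeds this

powers = {}
for mu in [0.0, 0.2, 0.4, 0.6]:
    # Power(mu) = P(reject H0 | true mean is mu)
    powers[mu] = 1 - stats.norm.cdf(z_crit - mu * np.sqrt(n) / sigma)
    print(f"mu = {mu:.1f}: power = {powers[mu]:.3f}")
```

At mu = 0 the power equals α exactly, and it rises toward 1 as the true mean moves away from the null value, which is the behaviour the "power function" formalizes.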

7 videos 8 readings 1 quiz 1 programming assignment 1 ungraded lab

7 videos • Total 124 minutes

  • Composite Hypotheses and Level of Significance • 16 minutes • Preview module
  • One-Tailed Tests • 20 minutes
  • Power Functions • 13 minutes
  • Hypothesis Testing with P-Values • 21 minutes
  • Two Tailed Tests • 12 minutes
  • CLT: A Brief Review • 16 minutes
  • Hypothesis Tests for Proportions • 23 minutes

8 readings • Total 72 minutes

  • Video Slides for Composite Hypotheses and Level of Significance • 10 minutes
  • Video Slides for One-Tailed Tests • 10 minutes
  • Video Slides for Power Functions • 10 minutes
  • Video Slides for Hypothesis Testing with P-Values • 10 minutes
  • Video Slides for Two-Tailed Tests • 10 minutes
  • Video Slides for CLT: A Brief Review • 10 minutes
  • Video Slides for Hypothesis Tests for Proportions • 10 minutes
  • Constructing Tests • 30 minutes
  • The Basics of Hypothesis Testing • 180 minutes
  • Distributions of P-Values • 60 minutes

t-Tests and Two-Sample Tests

In this module, we will learn about the chi-squared and t distributions and their relationships to sampling distributions. We will learn to identify when hypothesis tests based on these distributions are appropriate. We will review the concept of sample variance and derive the “t-test”. Additionally, we will derive our first two-sample test and apply it to make some decisions about real data.

7 videos • Total 139 minutes

  • The t and Chi-Squared Distributions • 41 minutes • Preview module
  • The Sample Variance for the Normal Distribution • 23 minutes
  • t-Tests • 18 minutes
  • Two Sample Tests for Means • 15 minutes
  • Two Sample t-Tests for a Difference of Means • 17 minutes
  • Welch's t-Test and Paired Data • 13 minutes
  • Comparing Population Proportions • 8 minutes
  • Video Slides for the t and Chi-Squared Distributions • 10 minutes
  • Video Slides for the Sample Variance and the Normal Distribution • 10 minutes
  • Video Slides for t-Tests • 10 minutes
  • Video Slides for Two Sample Tests for Means • 10 minutes
  • Video Slides for Differences in Population Means • 10 minutes
  • Video Slides for Welch's Test and Paired Data • 10 minutes
  • Video Slides for Comparing Population Proportions • 10 minutes
  • More Hypothesis Tests! • 30 minutes
  • t-Tests • 180 minutes
  • t-Tests and Two Sample Tests • 60 minutes

Beyond Normality

In this module, we will consider some problems where the assumption of an underlying normal distribution is not appropriate and will expand our ability to construct hypothesis tests for this case. We will define the concept of a “uniformly most powerful” (UMP) test, whether or not such a test exists for specific problems, and we will revisit some of our earlier tests from Modules 1 and 2 through the UMP lens. We will also introduce the F-distribution and its role in testing whether or not two population variances are equal.

6 videos 7 readings 2 quizzes

6 videos • Total 117 minutes

  • Properties of the Exponential Distribution • 13 minutes • Preview module
  • Two Tests • 27 minutes
  • Best Tests • 22 minutes
  • UMP Tests • 10 minutes
  • A Test for the Variance of the Normal Distribution • 12 minutes
  • The F-Distribution and a Ratio of Variances • 31 minutes

7 readings • Total 62 minutes

  • Video Slides for Properties of the Exponential Distribution • 10 minutes
  • Video Slides for Two Hypothesis Tests for the Exponential • 10 minutes
  • Video Slides for Best Tests • 10 minutes
  • Video Slides for UMP Tests • 10 minutes
  • Video Slides for a Normal Variance Test • 10 minutes
  • Video Slides for an F-Distribution and a Ratio of Variances • 10 minutes

2 quizzes • Total 60 minutes

  • Best Tests and Some General Skills • 30 minutes
  • Uniformly Most Powerful Tests and F-Tests • 30 minutes

Likelihood Ratio Tests and Chi-Squared Tests

In this module, we develop a formal approach to hypothesis testing, based on a “likelihood ratio” that can be more generally applied than any of the tests we have discussed so far. We will pay special attention to the large sample properties of the likelihood ratio, especially Wilks’ Theorem, that will allow us to come up with approximate (but easy) tests when we have a large sample size. We will close the course with two chi-squared tests that can be used to test whether the distributional assumptions we have been making throughout this course are valid.
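The chi-squared goodness-of-fit test mentioned at the end of this module checks whether observed counts match a hypothesized distribution. The course uses R; this is an equivalent Python sketch (scipy assumed, die-roll counts invented):

```python
from scipy import stats

# Goodness-of-fit sketch: do 120 hypothetical die rolls fit a fair die?
# H0: each face has probability 1/6.
observed = [18, 22, 16, 25, 17, 22]   # counts for faces 1..6 (made up)
expected = [20] * 6                   # fair die: 120 * (1/6) per face

chi2_stat, p_value = stats.chisquare(observed, f_exp=expected)
print(f"chi2 = {chi2_stat:.2f}, p-value = {p_value:.3f}")
```

With a p-value well above 0.05, these counts give no reason to reject the fair-die assumption.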

5 videos 7 readings 1 quiz 1 programming assignment 1 ungraded lab

5 videos • Total 93 minutes

  • MLEs • 23 minutes • Preview module
  • The GRLT • 15 minutes
  • Wilks' Theorem • 12 minutes
  • Chi-Squared Goodness of Fit Test • 23 minutes
  • Independence and Homogeneity • 19 minutes
  • Video Slides for MLEs • 10 minutes
  • Video Slides for the GLRT • 10 minutes
  • Video Slides for Wilks' Theorem • 10 minutes
  • Video Slides for Chi-Squared Goodness of Fit Test • 10 minutes
  • Video Slides for Independence and Homogeneity • 10 minutes
  • Share Your Feedback on Yellowdig • 10 minutes
  • Adventures in GLRTs • 30 minutes
  • Chi-Squared Tests and Mo • 180 minutes
  • Exploring Wilks' Theorem • 60 minutes


CU-Boulder is a dynamic community of scholars and learners on one of the most spectacular college campuses in the country. As one of 34 U.S. public institutions in the prestigious Association of American Universities (AAU), we have a proud tradition of academic excellence, with five Nobel laureates and more than 50 members of prestigious academic academies.




Fundamentals of Hypothesis Testing

In this section, you’ll learn what a hypothesis test is, when to use it, and how to calculate it.

A hypothesis is a formalized guess about the value of a random variable. A hypothesis test is a statistical tool that allows you to make an inference about a population from a sample drawn from that population.

Hypothesis tests are organized into a null hypothesis (H_0) and an alternative hypothesis (H_1).

The goal of hypothesis testing is to determine whether there is enough evidence in the sample data to reject the null hypothesis in favor of the alternative hypothesis .

For example, you might hypothesize that a new drug will reduce blood pressure in a population of patients with high blood pressure. You would then split your patients into two groups, giving one of them the drug and one of them a placebo and measuring their blood pressure.

The hypotheses might look like this:

H₀: The drug had no impact on patients’ high blood pressure.

Hₐ: The drug did have an impact on patients’ high blood pressure.

You would then calculate the test statistic to determine whether the data obtained from the sample is enough to reject the null in favor of the alternative. Different hypothesis tests use different test statistics; which one applies depends on what you’re comparing.

The Null Hypothesis is No Effect

Note that the null and alternative hypotheses are set up in this way, with the hypothesis you’re interested in testing set as the alternative hypothesis.

That is, your guess of the value of the variable is the alternative hypothesis. The null hypothesis is often that there was no effect, no difference, or no change in the population.

The alternative hypothesis is the effect, difference, or change you’re trying to support, and you collect data to see if you can reject the null in favor of the alternative.

The Null and Alternative Hypothesis

Together, the null hypothesis H₀ and the alternative hypothesis Hₐ represent all possible outcomes being tested.

A common hypothesis test is to check whether some parameter θ is different from a given value, denoted θ₀.

For example, you might want to check whether the population mean μ is different from a set number, such as 5, using the sample mean as an estimate. The hypotheses look like this:

H₀: μ = 5

Hₐ: μ ≠ 5

Or you may want to see if the mean μ is greater than 5.

H₀: μ ≤ 5

Hₐ: μ > 5

Notice that your guess is in the alternative hypothesis, not the null! Let’s look at why.

Why H₀ Is No Effect

In statistics, you can never be absolutely sure of an outcome. So you would never say that you accept a hypothesis. Instead, you say that there’s enough evidence to reject a hypothesis.

Your data can’t prove a hypothesis. However, you can show how unlikely it would be to get the data you collected if your hypothesis were false.

For example, if out of 20 people you surveyed, 13 said that they like coffee more than tea, you can’t say this definitively proves that more people like coffee than tea. But, you can calculate how likely it would be to get this result if, in reality, only 50% of people prefer coffee to tea.
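The coffee example above can be computed directly. This is a sketch, assuming a fair 50/50 null hypothesis, using only Python’s standard library:

```python
from math import comb

# Probability of seeing 13 or more coffee-preferrers out of 20
# if, in reality, preferences are 50/50 (the null hypothesis).
n, k = 20, 13
p_value = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
print(round(p_value, 4))  # ≈ 0.1316
```

A one-tailed probability of about 13% means this result is quite plausible under a 50/50 split, so 13 out of 20 is not strong evidence on its own.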

Statistical Significance of a Hypothesis Test

A hypothesis test has a significance level, denoted by α, which takes values in (0, 1). The significance level is chosen by you, the researcher, before carrying out the test; it is distinct from the p-value, which is computed from the data.

The significance level is the probability of rejecting H₀ when H₀ is actually true.

Commonly used values are 0.05, 0.01, and 0.10. The lower the significance level, the harder it is to reject the null hypothesis, but the more convincing it is when a result does reject the null. That is what it means for a result to be statistically significant.

Using a Significance Level

After a significance level is chosen, the next step is to calculate the test statistic. The test statistic depends on the kind of test being done: for example, a t-test, a Chi-square test, or an F-test.

The test statistic computed from the sample data is then compared to a critical value of that same statistic, the threshold that corresponds to the previously chosen significance level α. If the test statistic is more extreme than the critical value, you can reject the null hypothesis at the chosen significance level, for example, 0.05 or 0.01.

Alternatively, you can compute a p-value directly. If the p-value computed from the sample data is below your previously chosen significance level, you reject the null hypothesis in favor of the alternative.

When you reject the null in favor of the alternative, it means that the data you collected is unlikely to have looked that way if the null hypothesis were true. This is what it means to have a statistically significant result.

Type I and Type II Error

The type of error where you reject H₀ when it is true is called a type I error (read: “type one error”) or a false positive.

Similarly, the type of error where you fail to reject H₀ when it is false is called a type II error (read: “type two error”) or a false negative.
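A small simulation can make the type I error rate concrete. The sketch below uses a hypothetical setup, a two-sided z-test with known σ = 1 and α = 0.05, built only on Python’s standard library; when H₀ is true, the rate of false positives should come out close to α:

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(42)
norm = NormalDist()  # standard normal

# Simulate 2,000 experiments where H0 is TRUE (population mean 0,
# known sigma = 1, n = 30) and count how often a two-sided z-test
# at alpha = 0.05 wrongly rejects H0 -- the type I error rate.
alpha, n, trials = 0.05, 30, 2000
false_positives = 0
for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = mean(sample) / (1 / sqrt(n))   # z-statistic under known sigma
    p = 2 * norm.cdf(-abs(z))          # two-sided p-value
    if p < alpha:
        false_positives += 1

print(false_positives / trials)  # close to alpha = 0.05
```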

One-Sided and Two-Sided Tests

The sidedness of a test is the number of directions in which H₀ can be rejected. A test can be one-sided or two-sided.

A two-sided test:

H₀: θ = 0

H₁: θ ≠ 0

H₀ can be rejected by concluding either that θ > 0 or that θ < 0, making it a two-sided test.

A one-sided test:

H₀: θ ≤ 0

H₁: θ > 0

The only way to reject H₀ is to conclude θ > 0, so this is a one-sided test.

Practically, the main difference between one-sided and two-sided tests is that a two-sided test splits its α across the two rejection regions, using α/2 in each tail to take into account the multiple ways to reject H₀.
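To see the α/2 split in terms of critical values, here is a quick illustration (not part of the course material) using Python’s standard-library `statistics.NormalDist`:

```python
from statistics import NormalDist

norm = NormalDist()
alpha = 0.05

# One-sided test: all of alpha goes into one tail.
z_one_sided = norm.inv_cdf(1 - alpha)       # ≈ 1.645

# Two-sided test: alpha is split across both tails (alpha/2 each),
# pushing the critical value further out.
z_two_sided = norm.inv_cdf(1 - alpha / 2)   # ≈ 1.960

print(round(z_one_sided, 3), round(z_two_sided, 3))
```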

How to Carry Out a Hypothesis Test

The following reviews the steps you’d carry out when using a hypothesis test.

  • Null Hypothesis H₀: The default hypothesis, which often states that there is no effect, no difference, or no change in the population.
  • Alternative Hypothesis Hₐ: The opposite of the null hypothesis, suggesting an effect, difference, or change. It is what you are trying to support with the data collected.
  • The significance level, denoted by α, is the probability of rejecting the null hypothesis when it is actually true. Commonly used values are 0.05, 0.01, or 0.10.
  • The smaller the significance level, the harder it will be to reject the null hypothesis.
  • Conversely, the smaller the significance level is, the more confidence you have in the alternative hypothesis if you end up rejecting the null. This is statistical significance.
  • Collect a sample of data and calculate the test statistic. This could involve calculating means, standard deviations, or conducting tests like t-tests or chi-square tests.
  • Compute the p-value from the test statistic and compare it with the significance level. If the p-value is less than or equal to α, you reject the null hypothesis. If the p-value is greater than α, you fail to reject the null hypothesis.
  • If you reject the null hypothesis, you conclude that there is evidence to support the claim of the alternative hypothesis.
  • If you fail to reject the null hypothesis, you do not have enough evidence to support the alternative hypothesis.
  • You never say that you accept the alternative hypothesis, only that you reject the null hypothesis or fail to reject the null.
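The steps above can be sketched end to end as a small Python function. The numbers are hypothetical, and the sketch assumes a known population σ so the standard normal distribution applies:

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample z-test (population sigma assumed known)."""
    se = sigma / sqrt(n)                       # standard error of the mean
    z = (sample_mean - mu0) / se               # test statistic
    p = 2 * NormalDist().cdf(-abs(z))          # two-sided p-value
    decision = "reject H0" if p <= alpha else "fail to reject H0"
    return z, p, decision

# Hypothetical numbers: a sample of n=40 with mean 10.6, testing
# H0: mu = 10 against HA: mu != 10, with a known sigma of 2.
z, p, decision = one_sample_z_test(10.6, 10, 2, 40)
print(round(z, 2), round(p, 4), decision)
```

With these numbers the p-value lands just above 0.05, so we fail to reject H₀ at α = 0.05, a useful reminder that “fail to reject” is not the same as “accept.”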


9   Hypothesis testing

In scientific studies, you’ll often see phrases like “the results are statistically significant”. This points to a technique called hypothesis testing, where we use p-values, a type of probability, to test our initial assumption or hypothesis.

In hypothesis testing, rather than providing an estimate of the parameter we’re studying, we provide a probability that serves as evidence supporting or contradicting a specific hypothesis. The hypothesis usually involves whether a parameter is different from a predetermined value (often 0).

Hypothesis testing is used when you can phrase your research question in terms of whether a parameter differs from this predetermined value. It’s applied in various fields, asking questions such as: Does a medication extend the lives of cancer patients? Does an increase in gun sales correlate with more gun violence? Does class size affect test scores?

Take, for instance, the previously used example with colored beads. We might not be concerned about the exact proportion of blue beads, but instead ask: Are there more blue beads than red ones? This could be rephrased as asking if the proportion of blue beads is more than 0.5.

The initial hypothesis that the parameter equals the predetermined value is called the “null hypothesis”. It’s popular because it allows us to focus on the data’s properties under this null scenario. Once data is collected, we estimate the parameter and calculate the p-value, which is the probability of the estimate being as extreme as observed if the null hypothesis is true. If the p-value is small, it indicates the null hypothesis is unlikely, providing evidence against it.

We will see more examples of hypothesis testing in Chapter 17 .

9.1 p-values

Suppose we take a random sample of \(N=100\) and we observe \(52\) blue beads, which gives us \(\bar{X} = 0.52\) . This seems to be pointing to the existence of more blue than red beads since 0.52 is larger than 0.5. However, we know there is chance involved in this process and we could get a 52 even when the actual \(p=0.5\) . We call the assumption that \(p = 0.5\) a null hypothesis . The null hypothesis is the skeptic’s hypothesis.

We have observed a random variable \(\bar{X} = 0.52\) , and the p-value is the answer to the question: How likely is it to see a value this large, when the null hypothesis is true? If the p-value is small enough, we reject the null hypothesis and say that the results are statistically significant .

The p-value of 0.05 as a threshold for statistical significance is conventionally used in many areas of research. A cutoff of 0.01 is also used to define highly significant results. The choice of 0.05 is somewhat arbitrary and was popularized by the British statistician Ronald Fisher in the 1920s. We do not recommend using these cutoffs without justification, and we recommend avoiding the phrase statistically significant.

To obtain a p-value for our example, we write:

\[\mbox{Pr}(\mid \bar{X} - 0.5 \mid > 0.02 ) \]

assuming \(p=0.5\) . Under the null hypothesis we know that:

\[ \sqrt{N}\frac{\bar{X} - 0.5}{\sqrt{0.5(1-0.5)}} \]

is standard normal. We, therefore, can compute the probability above, which is the p-value.

\[\mbox{Pr}\left(\sqrt{N}\frac{\mid \bar{X} - 0.5\mid}{\sqrt{0.5(1-0.5)}} > \sqrt{N} \frac{0.02}{ \sqrt{0.5(1-0.5)}}\right)\]

In this case, there is actually a large chance of seeing 52 or larger under the null hypothesis.
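As a sketch, this p-value can be computed with Python’s standard library (the numbers follow the example: \(N=100\), \(\bar{X}=0.52\)):

```python
from math import sqrt
from statistics import NormalDist

N = 100
x_bar = 0.52
p0 = 0.5  # proportion of blue beads under the null hypothesis

# Under H0, sqrt(N) * (X_bar - 0.5) / sqrt(0.5 * 0.5) is standard normal,
# so the two-sided p-value is the probability of a |z| at least this large.
z = sqrt(N) * (x_bar - p0) / sqrt(p0 * (1 - p0))   # = 0.4
p_value = 2 * (1 - NormalDist().cdf(z))
print(round(p_value, 3))  # ≈ 0.689
```

A p-value near 0.69 confirms the point in the text: a result like 52 out of 100 is entirely unsurprising under the null.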

Keep in mind that there is a close connection between p-values and confidence intervals. If a 95% confidence interval of the spread does not include 0, we know that the p-value must be smaller than 0.05.

To learn more about p-values, you can consult any statistics textbook. However, in general, we prefer reporting confidence intervals over p-values because it gives us an idea of the size of the estimate. If we just report the p-value, we provide no information about the significance of the finding in the context of the problem.

We can show mathematically that if a \((1-\alpha)\times 100\) % confidence interval does not contain the null hypothesis value, the null hypothesis is rejected with a p-value as small or smaller than \(\alpha\) . So statistical significance can be determined from confidence intervals. However, unlike the confidence interval, the p-value does not provide an estimate of the magnitude of the effect. For this reason, we recommend avoiding p-values whenever you can compute a confidence interval.

Pollsters are judged not on providing correct confidence intervals, but on predicting who will win. When we took a sample of 25 beads, the confidence interval for the spread included 0. If this were a poll and we were forced to make a declaration, we would have to say it was a “toss-up”.

One problem with our poll results is that, given the sample size and the value of \(p\) , we would have to sacrifice the probability of an incorrect call to create an interval that does not include 0.

This does not mean that the election is close. It only means that we have a small sample size. In statistical textbooks, this is called lack of power . In the context of polls, power is the probability of detecting spreads different from 0.

By increasing our sample size, we lower our standard error, and thus, have a much better chance of detecting the direction of the spread.

9.3 Exercises

  • Generate a sample of size \(N=1000\) from an urn model with 50% blue beads:

then compute a p-value to test if \(p=0.5\) . Repeat this 10,000 times and report how often the p-value is lower than 0.05. How often is it lower than 0.01?

  • Make a histogram of the p-values you generated in exercise 1. Which of the following seems to be true?
  • The p-values are all 0.05.
  • The p-values are normally distributed; CLT seems to hold.
  • The p-values are uniformly distributed.
  • The p-values are all less than 0.05.

Demonstrate, mathematically, why we see the histogram we see in exercise 2.

Generate a sample of size \(N=1000\) from an urn model with 52% blue beads:

Compute a p-value to test if \(p=0.5\) . Repeat this 10,000 times and report how often the p-value is larger than 0.05. Note that you are computing 1 − power.

  • Repeat the previous exercise but for the following values:

Plot power as a function of \(N\) with a different color curve for each value of p .

Hypothesis Testing

Hypothesis Tests (or Significance Tests) are statistical tests to see if a difference we observe is due to chance.

There are many different types of hypothesis tests for different scenarios, but they all have the same basic ideas. Below are the general steps to performing a hypothesis test:

  • Formulate your Null and Alternative Hypotheses.
  • Ho- Null Hypothesis : The null hypothesis is the hypothesis of no effect. It's the dull, boring hypothesis that says that nothing interesting is going on. If we are trying to test if a difference we observe is due to chance, the null says it is!
  • Ha- Alternative Hypothesis : The alternative hypothesis is the opposite of the null. It's what you are trying to test. If we are trying to test if a difference we observe is due to chance, the alternative says it is not!

Think about what you would expect to get if you randomly sampled from the population, assuming the null is true. Compare your observed data and expected data and calculate the test statistic .

Calculate the probability of getting the data you got or something even more extreme if the null were true. This is called the p-value .

  • Make your conclusion and interpret it in the context of the problem. If p is very low, we say that the data support rejecting the null hypothesis.

How low is “very low”?

The convention is to reject the null when P < 5% (P < 0.05) and call the result “significant”. There’s no particular justification for this value but it’s commonly used.

The P-value cut-off is called the significance level and is often represented by the Greek letter alpha (α).

The One Sample Z Test: One-sided Hypothesis

The first type of hypothesis test we are going to look at is the one-sample z-test. You can do a z-test for means or for proportions. This is the simplest type of hypothesis test, and it uses z-scores and the normal curve. Let’s look at one below!

Hypothesis Test Example : Suppose a large university claims that the average ACT score of their incoming freshman class is 30, but we think the University may be inflating their average. To test the University’s claim we take a simple random sample of 50 students and find their average to be only 28.3 with an SD of 4. Perform a hypothesis test to test the claim. Here are the 4 steps:

  • This can be written in symbols as well: Ho: μ = 30
  • μ is the symbol for the population mean
  • This can be written in symbols as well: Ha: μ < 30
  • Our test statistic for the one sample z test is z! We can calculate z using our z-score formula for random variables since we are dealing with a sample of 50 students.

z = (observed value − expected value) / SE

  • In our case, the expected value (EV) is 30 since we are assuming our null hypothesis is true (until proven otherwise).
  • Since we are dealing with means, our SE is found using the following formula:

SE = SD / √n = 4 / √50 ≈ 0.57

Our z-score is -3. See the calculation below:

z = (28.3 − 30) / (4 / √50) ≈ −3

  • Calculate the probability of getting the data you got or something even more extreme if the null were true. This is called the p-value . In this case, our p-value is going to be the area to the left of z = -3. We can use Python to calculate this by using norm.cdf(-3).
  • We get that the p-value is 0.0013.
  • This is the probability that we would get a sample average of 28.3 given that the null hypothesis was true (the true average was 30).


  • Our p-value is less than 5% so we reject our Null Hypothesis. In other words, there is evidence of the Alternative Hypothesis (that the University is inflating their average).
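The whole calculation can be reproduced in a few lines of Python; the sketch below uses the standard-library `statistics.NormalDist` in place of scipy’s `norm.cdf`:

```python
from math import sqrt
from statistics import NormalDist

# One-sample z-test: sample of n=50 students, mean ACT 28.3, SD 4,
# testing H0: mu = 30 against the one-sided Ha: mu < 30.
n, sample_mean, sd, mu0 = 50, 28.3, 4, 30
se = sd / sqrt(n)                    # standard error of the mean
z = (sample_mean - mu0) / se         # ≈ -3
p_value = NormalDist().cdf(z)        # area to the left of z
print(round(z, 2), round(p_value, 4))  # ≈ -3.01 0.0013
```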

The One Sample Z Test: Two-sided Hypothesis

Hypothesis Test Example : Now we're going to test the above claim but with a different alternative hypothesis. The large university still claims that the average ACT score of their incoming freshman class is 30, but now we think the University may be inflating or deflating their average. To test the University’s claim we take a simple random sample of 50 students and find their average to be only 28.3 with an SD of 4. Perform a hypothesis test to test the claim with our new alternative hypothesis. Here are the 4 steps:

  • This can be written in symbols as well: Ha: μ ≠ 30

Step 2 is the same as the one-sided example, so our z score is still -3.

Calculate the probability of getting the data you got or something even more extreme if the null were true. This is called the p-value . In this case, our p-value is going to be the area to the left of z = -3 plus the area to the right of z = 3. We can use Python to calculate this by using 2*norm.cdf(-3).

  • We get that the p-value is 0.0027.
  • Our p-value is less than 5% so we reject our Null Hypothesis. In other words, there is evidence of the Alternative Hypothesis (that the University is inflating or deflating their average).
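The two-sided version differs only in the final step, doubling the tail area (again using the standard library rather than scipy):

```python
from math import sqrt
from statistics import NormalDist

# Same data, but the two-sided Ha: mu != 30 counts extreme results in
# both directions, so the p-value doubles the single-tail area.
n, sample_mean, sd, mu0 = 50, 28.3, 4, 30
z = (sample_mean - mu0) / (sd / sqrt(n))     # ≈ -3
p_value = 2 * NormalDist().cdf(-abs(z))
print(round(p_value, 4))  # ≈ 0.0027
```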



Hypothesis Testing in Data Science [Types, Process, Example]


In day-to-day life, we come across a lot of data and a wide variety of content. Sometimes there is so much information that we get confused about whether it is correct or not. That is where “hypothesis testing” comes in: it helps in finding proof and evidence for some belief or piece of information.

What is Hypothesis Testing?

Hypothesis testing is an integral part of statistical inference. It is used to decide whether the given sample data from the population satisfies a given hypothetical condition. Using several factors, it predicts and decides whether the hypothesis holds. In simpler terms, it is a way of testing whether a fact or statement is true or not.

For example, if you predict that students who sit on the last bench perform worse than students sitting on the first bench, this is a hypothetical statement that needs to be verified through experiments. Another example is implementing new business strategies and evaluating whether they will work for the business or not. All of this is essential when you work with data as a Data Scientist. If you are interested in learning about data science, visit this Data Science full course to learn data science.

How is Hypothesis Testing Used in Data Science?

It is important to know how and where we can use hypothesis testing techniques in the field of data science. Data Scientists predict a lot of things in their day-to-day work, and to check how certain those findings are, we use hypothesis testing. Its main goal is to gauge how well predictions perform based on the sample data drawn from a population. If you want to know more about applications of data, refer to this Data Science course in India, which offers more insight into application-based work. When Data Scientists build models using various machine learning algorithms, they need confidence in those models and their forecasts. They provide sample data to the model for training so that it can produce statistically significant results that represent the entire population.

Where and When to Use Hypothesis Test?

Hypothesis testing is widely used when we need to compare results against predictions, that is, before-and-after comparisons. For example, suppose someone claims that students writing exams with a blue pen always score above 90%. To prove this statement correct, experiments need to be done: data is collected on the students’ pen choice and marks, tests are run, and after various experiments and observations of marks versus pen used, conclusions are drawn. Hypothesis testing is then used to compare the first and second results, to see the difference and closeness of both outputs. This is how hypothesis testing is done.

How Does Hypothesis Testing Work in Data Science?

In the data science life cycle, hypothesis testing is done at various stages, starting from the initial part, where EDA, data pre-processing, and manipulation are done. In this stage, we do our initial hypothesis testing to anticipate outcomes in later stages. The next test is done after we have built our model: once the model is ready and hypothesis testing is done, we compare the results of the initial testing with the second one to check whether the insights generated from the first cycle match the second. This helps us know how the model responds to the sample training data. As noted above, hypothesis testing is also needed whenever we plan to contrast more than two groups. While checking the results, it is important to check how well they generalize from the sample to the population, and then judge whether any disagreement in the results is meaningful or not.

Different Types of Hypothesis Testing

Hypothesis testing can be seen in several types. In total, we have 5 types of hypothesis testing. They are described below:


1. Alternative Hypothesis

The alternative hypothesis explains and defines the relationship between two variables. It indicates a positive relationship between two variables, meaning they have a statistical bond, and that the sample observed is going to influence or affect the outcome. The alternative hypothesis is denoted by Ha or H1. For example: children who study from the beginning of the class have fewer chances to fail. An alternative hypothesis is accepted once the statistical predictions become significant. The alternative hypothesis can be further divided into three parts.

  • Left-tailed: A left-tailed hypothesis claims that the true value is less than the hypothesized value.
  • Right-tailed: A right-tailed hypothesis claims that the true value is greater than the hypothesized value.
  • Two-tailed: A two-tailed hypothesis claims that the true value is not equal to the hypothesized value, in either direction.

2. Null Hypothesis

The null hypothesis simply states that there is no relation between the statistical variables. If the data fail to show an effect beyond what chance would produce, we fail to reject the null hypothesis. The null hypothesis is represented as H0. For example: children who study from the beginning of the class have no fewer chances to fail. The types of null hypothesis are described below:

Simple Hypothesis:  It helps in denoting and indicating the distribution of the population.   

Composite Hypothesis:  It does not denote the population distribution.

Exact Hypothesis:  In the exact hypothesis, the value of the hypothesis is the same as the sample distribution. Example- μ= 10   

Inexact Hypothesis:  Here, the hypothesis values are not equal to the sample. It will denote a particular range of values.   

3. Non-directional Hypothesis 

The non-directional hypothesis is a two-tailed hypothesis indicating that the true value does not equal the predicted value. In simpler terms, no direction is specified between the two variables. An example of a non-directional hypothesis: girls and boys use different methodologies to solve a problem. The example states only that their approaches differ, not which direction the difference goes.

4. Directional Hypothesis

In the Directional hypothesis, there is a direct relationship between two variables. Here any of the variables influence the other.   

5. Statistical Hypothesis

A statistical hypothesis helps in understanding the nature and character of the population. It is a good method to decide whether the values and the data we have satisfy the given hypothesis or not, and it helps us make probabilistic statements to predict the outcome for the population. Several types of tests exist for this, such as the t-test, Z-test, and ANOVA tests.

Methods of Hypothesis Testing

1. Frequentist Hypothesis Testing

Frequentist hypothesis testing makes predictions and assumptions based only on the current, observed data. All conclusions are drawn from the data at hand. The most famous kind of frequentist approach is null hypothesis testing.

2. Bayesian Hypothesis Testing

Bayesian testing is a more modern way of hypothesis testing. It works with past data to predict the future possibilities of the hypothesis: the “prior” refers to the prior distribution or prior probability assigned before seeing the observed data. In the medical industry, for example, doctors assess patients’ diseases using past historical records; such records help them understand and predict the current and upcoming health condition of the patient.

Importance of Hypothesis Testing in Data Science

Most of the time, people assume that data science is all about applying machine learning algorithms and getting results. That is partly true, but to work in the data science field, one also needs to be well versed in statistics, as much of the background work in data science is done through statistics. When we pre-process, manipulate, and analyse data, statistics come into play. Hypothesis testing in particular helps in making confident decisions, predicting correct outcomes, and finding insightful conclusions about the population. To get more familiar with hypothesis testing and other prediction models, attend the KnowledgeHut Data Science full course, which will give you more domain knowledge and assist you in working on industry-related projects.

Basic Steps in Hypothesis Testing [Workflow]

1. Null and Alternative Hypothesis

After we have done our initial research about the prediction we want to test, it is important to state whether the hypothesis is a null hypothesis (H0) or an alternative hypothesis (Ha). Once we understand the type of hypothesis, it is easy to do mathematical research on it. A null hypothesis usually indicates no relationship between the variables, whereas an alternative hypothesis describes the relationship between two variables.

  • H0 – Girls, on average, are not stronger than boys
  • Ha – Girls, on average, are stronger than boys

2. Data Collection

To prove the validity of our statistical test, it is essential to check the data and sample it correctly to get accurate hypothesis results. If the target data is not prepared and ready, it becomes difficult to make predictions or statistical inferences about the population. It is important to prepare efficient data so that hypothesis findings are easy to predict.

3. Selection of an appropriate test statistic

To perform analyses on the data, we need to choose a statistical test. There are various types of statistical tests available. Based on the spread of the data, that is, the variance within a group, and on how different the data categories are from one another, that is, the variance between groups, we can proceed with our further research study.

4. Selection of the appropriate significant level

Once we get the result of the statistical test, we have to decide whether to reject or fail to reject the null hypothesis. The significance level is indicated by alpha (α). It describes the probability of rejecting the null hypothesis when it is actually true. For example, a significance level of α = 0.05 means we accept a 5% risk of wrongly rejecting the null hypothesis.

5. Calculation of the test statistics and the p-value

The p-value is the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. It helps in evaluating and verifying hypotheses against the sample data. The lower the p-value, the stronger the evidence against the null hypothesis; for example, if the p-value is 0.05 or less, the result is conventionally considered statistically significant. These values are calculated from the deviation between the observed value and the reference value: the greater the difference between them, the lower the p-value will be.

6. Findings of the test

After finding the p-value and assessing statistical significance, we can determine our results and decide whether to reject or fail to reject the null hypothesis based on the facts and statistics presented to us.

How to Calculate Hypothesis Testing?

Hypothesis testing can be done using various statistical tests. One such test is the Z-test. The formula for the Z-test is given below:

            Z = (x̄ − μ₀) / (σ / √n)

In the above equation:

  • x̄ is the sample mean
  • μ₀ is the population mean under the null hypothesis
  • σ is the standard deviation
  • n is the sample size

Now, depending on the Z-test result, the examination proceeds further: we either fail to reject the null hypothesis or reject it in favor of the alternative. The hypotheses can be stated with the formulas below:

  • H0: μ=μ0   
  • Ha: μ≠μ0   
  • Here,   
  • H0 = null hypothesis   
  • Ha = alternate hypothesis   

In this way, we carry out hypothesis testing and can apply it to real-world scenarios.
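As a quick illustration, the Z-test above can be sketched in a few lines of Python. The sample mean, population mean, standard deviation, and sample size below are made-up numbers, not from the text:

```python
import math

def z_test(sample_mean, pop_mean, pop_std, n):
    """Compute the Z statistic: Z = (x̄ - μ0) / (σ / √n)."""
    return (sample_mean - pop_mean) / (pop_std / math.sqrt(n))

# Hypothetical numbers: claimed population mean 100, sample of 50 with mean 103, σ = 10
z = z_test(103, 100, 10, 50)

# Two-tailed decision at α = 0.05 uses the critical value 1.96
reject_null = abs(z) > 1.96
```

Here the statistic works out to about 2.12, which exceeds 1.96, so the null hypothesis would be rejected at the 5% level.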

Real-World Examples of Hypothesis Testing

Hypothesis testing has a wide variety of use cases that prove beneficial to various industries.

1. Healthcare

In the healthcare industry, research and experiments that predict the success of a new medicine or drug rely on hypothesis testing.

2. Education sector

Hypothesis testing assists in experimenting with different teaching techniques and assessing their effect on students' understanding.

3. Mental Health

Hypothesis testing helps identify the factors that may contribute to serious mental health issues.

4. Manufacturing

Hypothesis testing can determine whether a change to a manufacturing process has improved output quality or quantity. In the same way, many other sectors offer use cases for hypothesis testing.

Error Terms in Hypothesis Testing

1. Type-I error

A Type I error occurs in hypothesis testing when the null hypothesis is rejected even though it is true. This kind of error is also known as a false positive: the test reports an effect that does not exist. For example, an innocent person going to jail because he is wrongly judged guilty is a Type I error.

2. Type-II error

A Type II error occurs in hypothesis testing when the null hypothesis is not rejected even though it is false. This kind of error is also called a false negative: the test misses an effect that does exist. For example, a guilty person being proven innocent in court is a Type II error.

3. Level of Significance

The level of significance measures the confidence with which a null hypothesis can be rejected. It is the threshold used to decide whether to reject H0, and it gauges whether the result of a hypothesis test is statistically significant.

4. P-Value

P-value stands for probability value. It tells us the likelihood, computed by a statistical test, of observing the data at hand when the null hypothesis is true. Its main purpose is to check the significance of a statistical claim.

5. High P-Values

A higher p-value indicates that the result is not statistically significant. For example, a p-value greater than 0.05 is conventionally considered high. A high p-value means the evidence is not strong enough to support a conclusion about the population.

In hypothesis testing, every step contributes to the outcome, from selecting the statistical test to preparing the data. Whenever you plan to predict outcomes or experiment with a sample, hypothesis testing is a useful concept to apply.

Frequently Asked Questions (FAQs)

We can test a hypothesis by selecting an appropriate statistical test and drawing conclusions from its results.

Many statistical tests are used for hypothesis testing, including the Z-test, the T-test, and others.

A hypothesis helps us run experiments on a specific research topic and predict the results.

The steps are: state the null and alternative hypotheses, collect data, select a statistical test, choose a significance level, calculate the p-value, and interpret your findings.

In simple terms, parametric tests rely on assumptions about the underlying population distribution, whereas non-parametric tests make no such assumptions and work directly with the data collected from the sample.

Profile

Gauri Guglani

Gauri Guglani works as a Data Analyst at Deloitte Consulting. She majored in Information Technology and holds a strong interest in the field of Data Science. She combines technical and managerial skills with strong communication. Since her undergraduate studies, Gauri has developed a profound interest in writing and sharing her knowledge through blogs and articles. She loves writing on topics related to Statistics, Python libraries, Machine Learning, Natural Language Processing, and many more.


Introduction to Data Science I & II

Hypothesis Testing #

Hypothesis testing is about choosing between two views, called hypotheses, on how data were generated (for example, SSR=1 or SSR=1.05, where SSR is the secondary sex ratio we defined in the previous section). The hypotheses, called null and alternative, should be specified before doing the analysis. Testing is a way to select the hypothesis that is better supported by the data. Note that the null hypothesis corresponds to what we called “the data-generating model” in the previous section.

Ingredients of hypothesis testing:

A null hypothesis \(H_0\) (e.g. SSR=1.05);

An alternative hypothesis \(H_A\) (e.g. SSR \(\neq\) 1.05);

A test statistic;

A decision or a measure of significance that is obtained from the test statistic and its distribution under the null hypothesis.

In the previous section we investigated \(H_0\) : SSR=1 by simulating from the binomial distribution. The natural alternative there was \(H_A:\) SSR \(\neq\) 1. The test statistic we used was the number of boys.

Let’s look in more detail at the components of a hypothesis test on a subset of these data. Assume that someone claims that Illinois has a different SSR based on what they have seen in a hospital they work for. You decide to investigate this using the natality data we introduced above. Before looking at the data, you need to decide on the first three ingredients:

Null hypothesis is generally the default view (generally believed to be true), and it needs to lead to clear rules on how the data were generated. In this case, it makes sense to declare that \(H_0:\) SSR_IL=1.05 (the probability of having a boy in Illinois is 0.512, which corresponds to a secondary sex ratio of 1.05).

Alternative hypothesis should be the opposite of the null, but it can have variations (for example, we can use SSR_IL<1.05 or SSR_IL>1.05). Here, because there was no additional information provided, it is natural to use \(H_A:\) SSR_IL \(\neq\) 1.05. Note that the choice of alternative will impact the measure of significance that is discussed below.

Test statistic is the summary of the data that will be used for investigating consistency. We aim to choose the statistic that is most informative for the hypotheses we are investigating. We will use here the observed SSR in IL as a test statistic.

Below are the cells that show the data and the histogram of test statistics generated under \(H_0\) .
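The original notebook cells are not reproduced here, but a minimal sketch of what such a simulation might look like is below; the Illinois sample size and the number of simulations are assumed purely for illustration:

```python
import random

random.seed(0)
p_boy = 0.512      # H0: SSR = 1.05, i.e. P(boy) = 0.512
n_births = 5000    # assumed number of recorded births in the sample
n_sims = 500       # number of simulated datasets under H0

def simulate_ssr():
    """Generate one dataset under H0 and return its observed SSR (boys/girls)."""
    boys = sum(random.random() < p_boy for _ in range(n_births))
    return boys / (n_births - boys)

null_stats = [simulate_ssr() for _ in range(n_sims)]
# A histogram of null_stats approximates the null distribution of the SSR
```

Plotting `null_stats` as a histogram and marking the observed Illinois SSR on it reproduces the comparison described below.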

[Figure: histogram of the test statistic simulated under \(H_0\), with the observed statistic marked by a red dot.]

In the above histogram, the observed statistic (indicated by the red dot) seems to be a natural realization from the distribution summarized by the histogram. There seems to be no evidence against \(H_0\) . A more complete conclusion: using the number of boys as the test statistic, we find no evidence to reject \(H_0\) (no evidence that the data are inconsistent with SSR=1.05).

Significance as measured by the p-value #

P-values capture the consistency of the data (test statistic) with the null hypothesis (distribution of the statistic under the null).

The p-value is the chance, under the null hypothesis , that the test statistic is equal to the observed value or is further in the direction of the alternative.

It is important to correctly use the specified alternative hypothesis when choosing the tail or tails of the null distribution of the statistic.

The decision is made using the null distribution of the test statistic ( probability distribution ); we will use an approximation given by an empirical distribution . The p-value is the area in the tail (or tails) of that distribution.

In the above example, the proportion of simulations that lead to more extreme values than the one observed is:
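The cell computing that proportion is not shown; a hedged sketch of the computation, with an assumed observed boy count and sample size, could look like:

```python
import random

random.seed(1)
p_boy = 0.512                  # H0: SSR = 1.05
n_births, n_sims = 5000, 500   # assumed sample size and number of simulations

def sim_boys():
    return sum(random.random() < p_boy for _ in range(n_births))

null_stats = [sim_boys() for _ in range(n_sims)]

expected = p_boy * n_births    # expected boy count under H0
observed = 2600                # a made-up observed boy count

# Two-sided empirical p-value: the proportion of simulations at least as far
# from the H0 expectation as the observed statistic
obs_dev = abs(observed - expected)
p_value = sum(abs(s - expected) >= obs_dev for s in null_stats) / n_sims
```

With these invented numbers the observed count sits close to the expectation, so the empirical p-value comes out well above conventional significance thresholds.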

Interpretation of p-values #

When \(H_0\) is true : the p-value is (approximately) uniformly distributed on the interval [0,1]:

about half of p-values are larger than 0.5

about 10% are smaller than 0.1

about 5% are smaller than 0.05
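These uniformity properties can be checked by simulation. The sketch below repeatedly generates coin-flip data under \(H_0: p = 0.5\) and computes two-sided p-values via a normal approximation; the sample sizes are arbitrary choices:

```python
import math
import random

random.seed(2)
n_flips, p0, n_tests = 400, 0.5, 2000   # arbitrary simulation sizes

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return (1 + math.erf(x / math.sqrt(2))) / 2

p_values = []
for _ in range(n_tests):
    heads = sum(random.random() < p0 for _ in range(n_flips))  # data under H0
    z = (heads - n_flips * p0) / math.sqrt(n_flips * p0 * (1 - p0))
    p_values.append(2 * (1 - normal_cdf(abs(z))))              # two-sided p

frac_below_005 = sum(pv < 0.05 for pv in p_values) / n_tests   # ≈ 0.05
frac_above_half = sum(pv > 0.5 for pv in p_values) / n_tests   # ≈ 0.5
```

About 5% of the simulated p-values fall below 0.05 and roughly half exceed 0.5, matching the bullets above.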

A small p-value (typically smaller than 0.05 or 0.01) indicates evidence against the null hypothesis (the smaller the p-value, the stronger the evidence). A large p-value indicates no evidence (or weak evidence) against the null.

What is Hypothesis Testing?


A statistical hypothesis is an assumption about a population parameter. This assumption may or may not be true. Hypothesis testing refers to the formal procedures used by statisticians to accept or reject statistical hypotheses.

Statistical Hypotheses

The best way to determine whether a statistical hypothesis is true would be to examine the entire population. Since that is often impractical, researchers typically examine a random sample from the population. If the sample data are not consistent with the statistical hypothesis, the hypothesis is rejected.

There are two types of statistical hypotheses.

Null hypothesis. The null hypothesis, denoted by H0, is usually the hypothesis that sample observations result purely from chance.

Alternative hypothesis. The alternative hypothesis, denoted by H1 or Ha, is the hypothesis that sample observations are influenced by some non-random cause.

For instance, suppose we wanted to determine whether a coin was fair and balanced. A null hypothesis might be that half the flips would result in Heads and half in Tails. The alternative hypothesis might be that the numbers of Heads and Tails would be very different. Symbolically, these hypotheses would be expressed as

H0: P = 0.5

Ha: P ≠ 0.5

Suppose we flipped the coin 50 times, resulting in 40 Heads and 10 Tails. Given this outcome, we would be inclined to reject the null hypothesis. We would conclude, based on the evidence, that the coin was probably not fair and balanced.
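This intuition can be made exact with a two-sided binomial p-value; the short Python sketch below computes it for 40 heads in 50 flips:

```python
import math

def binom_pmf(k, n, p):
    """P(exactly k successes in n Bernoulli(p) trials)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, observed = 50, 40   # 50 flips, 40 heads

# Two-sided exact p-value under H0: P = 0.5 — sum the probability of every
# outcome at least as far from the expected 25 heads as the observed 40
# (i.e. 40 or more heads, or 10 or fewer)
p_value = sum(binom_pmf(k, n, 0.5) for k in range(n + 1)
              if abs(k - n / 2) >= abs(observed - n / 2))
# p_value is far below 0.05, so we reject H0
```

The resulting p-value is a few hundred-thousandths, which is why rejecting the null hypothesis is the natural conclusion here.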

Can We Accept the Null Hypothesis?

Some researchers say that a hypothesis test can have one of two outcomes: you accept the null hypothesis or you reject the null hypothesis. Many statisticians, however, take issue with the notion of “accepting the null hypothesis.” Instead, they say: you reject the null hypothesis, or you fail to reject the null hypothesis.

Why the distinction between “acceptance” and “failure to reject”? Acceptance implies that the null hypothesis is true. Failure to reject implies that the data are not sufficiently persuasive for us to prefer the alternative hypothesis over the null hypothesis.

Hypothesis Tests

Statisticians follow a formal process to determine whether to reject a null hypothesis, based on sample data. This process, called hypothesis testing, consists of four steps.

State the hypotheses. This involves stating the null and alternative hypotheses. The hypotheses are stated in such a way that they are mutually exclusive: if one is true, the other must be false.

Formulate an analysis plan. The analysis plan describes how to use sample data to evaluate the null hypothesis. The evaluation often focuses on a single test statistic.

Analyse sample data. Find the value of the test statistic (mean score, proportion, t statistic, z-score, and so on) described in the analysis plan.

Interpret results. Apply the decision rule described in the analysis plan. If the value of the test statistic is unlikely under the null hypothesis, reject the null hypothesis.

Decision Errors

Two types of errors can result from a hypothesis test.

Type I error. A Type I error occurs when the researcher rejects a null hypothesis when it is true. The probability of committing a Type I error is called the significance level. This probability is also called alpha and is often denoted by α.

Type II error. A Type II error occurs when the researcher fails to reject a null hypothesis that is false. The probability of committing a Type II error is called beta and is often denoted by β. The probability of not committing a Type II error is called the power of the test.

Decision Rules

The analysis plan includes decision rules for rejecting the null hypothesis. In practice, statisticians describe these decision rules in two ways – with reference to a P-value or with reference to a region of acceptance.

P-value. The strength of evidence against the null hypothesis is measured by the P-value. Suppose the test statistic is equal to S. The P-value is the probability of observing a test statistic as extreme as S, assuming the null hypothesis is true. If the P-value is less than the significance level, we reject the null hypothesis.

Region of acceptance. The region of acceptance is a range of values. If the test statistic falls within the region of acceptance, the null hypothesis is not rejected. The region of acceptance is defined so that the chance of making a Type I error is equal to the significance level.

The set of values outside the region of acceptance is called the region of rejection. If the test statistic falls within the region of rejection, the null hypothesis is rejected. In such cases, we say that the hypothesis has been rejected at the α level of significance.

These approaches are equivalent. Some statistics texts use the P-value approach; others use the region of acceptance approach. On this website, we tend to use the region of acceptance approach.
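The equivalence of the two decision rules can be seen in a small sketch; the observed Z statistic below is a made-up number:

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return (1 + math.erf(x / math.sqrt(2))) / 2

alpha = 0.05
z_crit = 1.96        # two-tailed critical value for α = 0.05
z_observed = 2.3     # a made-up test statistic

# P-value approach: reject when the p-value drops below α
p_value = 2 * (1 - normal_cdf(abs(z_observed)))
reject_by_p = p_value < alpha

# Region-of-acceptance approach: reject when Z falls outside [-1.96, 1.96]
reject_by_region = abs(z_observed) > z_crit

# For matching α, the two rules agree
assert reject_by_p == reject_by_region
```

Here the p-value is about 0.021, below 0.05, and the statistic 2.3 lies outside [-1.96, 1.96], so both rules reject the null hypothesis.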

One-Tailed and Two-Tailed Tests 

A test of a statistical hypothesis, where the region of rejection is on only one side of the sampling distribution, is called a one-tailed test. For example, suppose the null hypothesis states that the mean is less than or equal to 10. The alternative hypothesis would be that the mean is greater than 10. The region of rejection would consist of a range of numbers located on the right side of the sampling distribution; that is, a set of numbers greater than 10.

A test of a statistical hypothesis, where the region of rejection is on both sides of the sampling distribution, is called a two-tailed test. For example, suppose the null hypothesis states that the mean is equal to 10. The alternative hypothesis would be that the mean is less than 10 or greater than 10. The region of rejection would consist of a range of numbers located on both sides of the sampling distribution; that is, it would consist partly of numbers less than 10 and partly of numbers greater than 10.


Data Science Stunt

What is hypothesis testing in data science?

Hypothesis testing is a statistical technique used to evaluate hypotheses about a population based on sample data. In data science, hypothesis testing is an essential tool used to make inferences about the population based on a representative sample. In this blog, we will discuss the key aspects of hypothesis testing, including null hypothesis, alternate hypothesis, significance level, type I and type II errors, p-value, region of acceptance, typical steps involved in p-value approach, and key terms around type I error and type II error.

Table of Contents

What is hypothesis testing in data science or AI or ML or Statistics?

Hypothesis testing is a statistical technique used to test whether a claim or hypothesis about a population is true or not. Imagine that you have a hypothesis that eating breakfast helps students perform better in school. To test this hypothesis, you would collect data from a sample of students and compare their performance in school to whether they ate breakfast or not. You would then compare the results to what would be expected by chance, assuming the null hypothesis is true, which means there is no difference in performance between those who ate breakfast and those who didn’t. If the results show a statistically significant difference, you can reject the null hypothesis and accept the alternative hypothesis, which in this case is that eating breakfast does, in fact, help students perform better in school. Hypothesis testing is important in many fields, including science and medicine, to help make decisions based on empirical evidence.

Hypothesis testing is a statistical tool that helps to test assumptions or claims about a population based on a sample of data. The process of hypothesis testing involves formulating a null hypothesis and an alternative hypothesis, selecting an appropriate test statistic, choosing a level of significance, calculating the p-value, and making a decision whether to reject or fail to reject the null hypothesis based on the p-value. Hypothesis testing is widely used in various fields, such as social sciences, medicine, engineering, and economics, to evaluate research questions and determine the statistical significance of results. It is a crucial component of data analysis that can help to make sound decisions based on empirical evidence.


What are the null hypothesis and alternative hypothesis with examples?

Hypothesis testing involves comparing two statements, the null hypothesis and the alternative hypothesis. The null hypothesis is the starting assumption we make about a population, which suggests there is no difference or relationship between two or more variables being tested. For instance, if we want to test whether there is a difference in the average height between boys and girls, the null hypothesis would state that there is no difference in height between the two groups.

On the other hand, the alternative hypothesis is a statement that assumes there is a difference or relationship between the variables being tested. In this example, the alternative hypothesis would propose that there is a difference in height between boys and girls.

To evaluate which hypothesis to accept or reject, we collect data from a sample and compare it to what would be expected by chance, assuming the null hypothesis is correct. If the results reveal a statistically significant difference, we reject the null hypothesis and accept the alternative hypothesis. However, if there is insufficient evidence to dismiss the null hypothesis, we fail to reject it. To sum up, the null hypothesis is the original assumption we test, and the alternative hypothesis is the statement we attempt to validate.

The null hypothesis is a statement of no effect or no difference between two groups. This statement is usually the starting point in hypothesis testing, and we assume it to be true. For example, if we want to test the efficacy of a new drug, the null hypothesis would state that the drug has no effect.

The alternate hypothesis is the opposite of the null hypothesis. It is the statement we want to test, and it suggests that there is an effect or difference between two groups. For example, if we want to test the efficacy of a new drug, the alternate hypothesis would suggest that the drug is effective.

Here are some more examples to help illustrate the concept of null and alternative hypotheses:

Example 1: Null hypothesis: There is no difference in exam scores between students who study alone versus students who study in groups. Alternative hypothesis: Students who study in groups score higher on exams than students who study alone.

Example 2: Null hypothesis: There is no difference in sales between two different advertising campaigns. Alternative hypothesis: One advertising campaign leads to more sales than the other.

Example 3: Null hypothesis: There is no relationship between the amount of sleep a person gets and their cognitive performance. Alternative hypothesis: The amount of sleep a person gets is positively correlated with their cognitive performance.
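Example 1 can be tested without any distributional formula using a permutation test, which shuffles the group labels to build the null distribution of the mean difference. The exam scores below are invented purely for illustration:

```python
import random

random.seed(3)

# Made-up exam scores for Example 1
alone = [72, 68, 75, 70, 66, 74, 71, 69]   # studied alone
group = [78, 74, 80, 73, 77, 79, 72, 76]   # studied in groups

observed_diff = sum(group) / len(group) - sum(alone) / len(alone)

# Permutation test: under H0 the labels "alone"/"group" are interchangeable,
# so shuffling them builds the null distribution of the mean difference
pooled = alone + group
n_perm, extreme = 10_000, 0
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = (sum(pooled[len(alone):]) / len(group)
            - sum(pooled[:len(alone)]) / len(alone))
    if diff >= observed_diff:   # one-sided, matching "group scores higher"
        extreme += 1

p_value = extreme / n_perm      # small → reject H0 for this invented data
```

For this made-up data the observed difference of 5.5 points is large relative to the shuffled differences, so the p-value is small and we would reject the null hypothesis of Example 1.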

What is the significance level in hypothesis testing with examples?

In hypothesis testing, the significance level is a number that we pick before we do a test. It helps us know how sure we are about our results. It’s like saying, “I’m only going to say something is true if I’m really sure.”

For example, if we’re trying to find out if girls are taller than boys on average, we might set a significance level of 0.05. This means we’re only going to say girls are taller if we’re at least 95% sure that’s true. If our test gives us a result that is less than 95% sure, we won’t say for sure that girls are taller.

The significance level is like a safety net to make sure we’re not jumping to conclusions without enough evidence. It helps us be more careful and confident about what we say is true based on the data we have.

In hypothesis testing, the significance level is the probability threshold that we set for rejecting the null hypothesis. The significance level, also known as alpha (α), is usually chosen before the test is conducted and is typically set at 0.05, meaning that we are willing to accept a 5% chance of making a type I error, which is the probability of rejecting the null hypothesis when it is actually true.

For example, let’s say we’re testing the hypothesis that boys and girls have different average heights. We would start by setting a significance level of 0.05. If the results of our study indicate that there is a statistically significant difference in height between boys and girls, with a p-value less than 0.05, we would reject the null hypothesis and conclude that there is a significant difference in height between the two groups. However, if the p-value is greater than 0.05, we would fail to reject the null hypothesis and conclude that there is insufficient evidence to support the claim that there is a difference in height between boys and girls.

In summary, the significance level is the probability threshold we set for rejecting the null hypothesis, and it helps us determine how confident we are in our conclusions based on the data we have collected. It is a critical component of hypothesis testing and helps us make informed decisions based on empirical evidence.

The significance level is the probability of making a type I error, or the probability of rejecting the null hypothesis when it is actually true. It is denoted by alpha (α) and is typically set to 0.05 or 0.01.

What are Type I and Type II Errors in hypothesis testing with examples?

When we do a test to see if something is true or not, sometimes we can make mistakes. There are two kinds of mistakes we can make:

The first mistake is called a Type I error. This happens when we say something is true, but it’s really not true. It’s like thinking you found a diamond in the sand, but it’s just a piece of glass.

The second mistake is called a Type II error. This happens when we say something is not true, but it’s actually true. It’s like thinking there are no more cookies in the cookie jar, but there are still some left.

We use some special words to describe these mistakes. Type I errors are also called false positives, and Type II errors are also called false negatives.

To make things even more confusing, we use a letter called beta (β) to talk about the chance of making a Type II error. But don’t worry too much about that for now. Just remember that sometimes we can make mistakes when we do tests, and we have special names for those mistakes.

Type I error occurs when we reject the null hypothesis when it is actually true. It is also known as a false positive. Type II error occurs when we fail to reject the null hypothesis when it is actually false. It is also known as a false negative. The probability of making a type II error is denoted by beta (β).

We also have something called power. Power is like the opposite of a false negative. It’s the chance that we’ll figure out something is true, if it really is true.

Another thing we think about is the sample size. That means how many things we’re looking at to figure out if something is true or not.

And finally, we think about the effect size. That’s how big the difference is between two things we’re looking at. If the difference is really big, we might be more likely to find out if something is true or not.

So basically, when we do hypothesis testing, we try to figure out if something is true or not. But sometimes we make mistakes. We also think about how many things we’re looking at, how big the difference is between them, and how likely we are to figure out if something is true if it really is true.

What is the p-value in hypothesis testing with examples?

When we do a test to see if something is true or not, we get a number called the p-value. The p-value tells us how likely it is that we got the result we did just by chance.

Here’s an example: Let’s say we’re trying to find out if cats are smarter than dogs. We do a test and get a p-value of 0.03. This means that if cats and dogs were really equally smart, there would be only a 3% chance of seeing a result at least as extreme as ours.

We also have something called a significance level, which is like a safety net to make sure we’re not jumping to conclusions without enough evidence. It’s like saying, “I’m only going to say something is true if I’m really sure.”

If the p-value is less than the significance level, we can say that we’re really sure our result is true, and we reject the idea that cats and dogs are equally smart. But if the p-value is greater than the significance level, we don’t have enough evidence to say for sure that cats are smarter than dogs.

So the p-value is a number that helps us decide if our test is giving us good evidence or if it’s just a fluke.

The p-value is the probability of obtaining a test statistic as extreme as the observed one, assuming that the null hypothesis is true. If the p-value is less than the significance level, we reject the null hypothesis. If the p-value is greater than the significance level, we fail to reject the null hypothesis.

What is the Region of Acceptance in hypothesis testing with examples?

When we do a test to see if something is true or not, we use a number called the test statistic. This number tells us how different our result is from what we would expect if the null hypothesis were true.

Sometimes, we get a test statistic that is in a certain range, and we say that it’s not different enough from what we would expect to be sure that our result is true. This range is called the region of acceptance.

For example, let’s say we’re trying to find out if a new medicine helps people sleep better. We do a test and get a test statistic of 1.2. We look at our chart and see that the region of acceptance is from -1.96 to 1.96. This means that our test statistic of 1.2 is in this range, so we can’t be sure that the medicine really works.

The region of acceptance is like a zone of uncertainty. It’s saying, “We’re not sure if this result is really different from what we would expect.”

The complement of the region of acceptance is called the critical region. This is the range of values that is different enough from what we would expect that we can be pretty sure our result is true.

So the region of acceptance is just the opposite of the critical region. If our test statistic is in the region of acceptance, we can’t say for sure if our result is true or not. But if it’s in the critical region, we can be pretty sure that it is.

The region of acceptance is the range of values of the test statistic that leads to failing to reject the null hypothesis. It is the complement of the critical region.
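The sleep-medicine example above translates directly into code:

```python
# Matching the example: test statistic 1.2, region of acceptance [-1.96, 1.96]
# at α = 0.05 (two-tailed)
z_stat = 1.2
low, high = -1.96, 1.96

in_acceptance = low <= z_stat <= high   # True → fail to reject H0
in_critical = not in_acceptance         # the critical region is the complement
```

Because 1.2 falls inside the region of acceptance, we fail to reject the null hypothesis that the medicine has no effect.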

Typical Steps Involved in P-Value Approach

What are the 5 steps in hypothesis testing?

When we do a hypothesis test using the p-value approach, there are some typical steps we follow to find out if our hypothesis is true or not. Here are the steps we usually follow:

  • We start by stating what we think is true (the null hypothesis) and what we think might be true instead (the alternate hypothesis).
  • Then we pick a number called the test statistic. This number helps us figure out if our result is different enough from what we would expect to be sure it’s true.
  • Next, we calculate something called the p-value. This is like a score that tells us how likely it would be to see a result at least as extreme as ours just by chance, based on the test statistic.
  • After that, we compare the p-value to another number called the significance level. If the p-value is smaller than the significance level, it means our result is very unlikely to be a coincidence, so we can say it’s true. If the p-value is bigger than the significance level, it means our result might just be a coincidence, so we can’t be sure it’s true.
  • Finally, we make a decision about whether our hypothesis is true or not, based on the comparison we made in step 4. We also explain what our results mean in real-life terms.

So basically, when we do a hypothesis test using the p-value approach, we follow these steps to figure out if our hypothesis is true or not.
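The five steps can be sketched end to end in Python; all the numbers below (a claimed mean of 120, σ = 15, a sample of 36 with mean 126) are invented for illustration:

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return (1 + math.erf(x / math.sqrt(2))) / 2

# Step 1: state the hypotheses. H0: μ = 120 vs. Ha: μ ≠ 120
mu0, sigma, n, sample_mean, alpha = 120, 15, 36, 126, 0.05

# Step 2: pick the test statistic — a Z statistic, since σ is taken as known
z = (sample_mean - mu0) / (sigma / math.sqrt(n))

# Step 3: compute the p-value for the two-sided alternative
p_value = 2 * (1 - normal_cdf(abs(z)))

# Step 4: compare the p-value to the significance level
reject = p_value < alpha

# Step 5: state the decision in plain terms
conclusion = "reject H0" if reject else "fail to reject H0"
```

With these invented numbers, z = 2.4 and the p-value is about 0.016, so the sketch ends in rejecting the null hypothesis.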

To summarize, hypothesis testing is a critical technique in data science that helps us draw conclusions about the population based on sample data. By setting up the null and alternate hypotheses, selecting an appropriate significance level, and computing the p-value, we can make informed judgments about whether to accept or reject the null hypothesis. It’s essential to recognize the kinds of errors that can happen during hypothesis testing and to choose a suitable sample size and effect size to reduce these errors.


Share this:

  • Click to share on Twitter (Opens in new window)
  • Click to share on Facebook (Opens in new window)
  • Click to share on LinkedIn (Opens in new window)
  • Click to share on Reddit (Opens in new window)
  • Click to share on Tumblr (Opens in new window)
  • Click to share on Pinterest (Opens in new window)
  • Click to share on Telegram (Opens in new window)
  • Click to share on WhatsApp (Opens in new window)
  • Click to print (Opens in new window)
  • Click to email a link to a friend (Opens in new window)
  • Click to share on Pocket (Opens in new window)
  • Click to share on Mastodon (Opens in new window)

Leave a Reply Cancel reply

Data Science from Scratch (ch7) - Hypothesis and Inference

Connecting probability and statistics to hypothesis testing and inference

Table of contents

  • Central Limit Theorem
  • Hypothesis Testing
  • Confidence Intervals
  • Connecting dots with Python

This is a continuation of my progress through Data Science from Scratch by Joel Grus. We'll use a classic coin-flipping example in this post because it is simple to illustrate with both concept and code. The goal of this post is to connect the dots between several concepts, including the Central Limit Theorem, hypothesis testing, p-values and confidence intervals, using Python to build our intuition.

Central Limit Theorem

Terms like “null” and “alternative” hypothesis are used quite frequently, so let’s set some context. The “null” is the default position. The “alternative”, alt for short, is something we’re comparing to the default (null).

The classic coin-flipping exercise is to test the fairness of a coin. If a coin is fair, it'll land on heads 50% of the time (and tails 50% of the time). Let's translate this into hypothesis-testing language:

Null Hypothesis : Probability of landing on Heads = 0.5.

Alt Hypothesis : Probability of landing on Heads != 0.5.

Each coin flip is a Bernoulli trial, an experiment with two outcomes: outcome 1, "success" (probability p), and outcome 0, "fail" (probability 1 - p). A coin flip is a Bernoulli trial because it has only two possible outcomes (heads or tails). Read more about Bernoulli here.

Here’s the code for a single Bernoulli Trial:
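A sketch of that function, in the book's from-scratch style:

```python
import random

def bernoulli_trial(p: float) -> int:
    """Returns 1 ("success") with probability p and 0 ("fail") otherwise."""
    return 1 if random.random() < p else 0
```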

When you sum independent Bernoulli trials, you get a Binomial(n, p) random variable, a variable whose possible values have a probability distribution. The central limit theorem says that as n, the number of independent Bernoulli trials, gets large, the binomial distribution approaches a normal distribution.

Here’s the code for when you sum all the Bernoulli Trials to get a Binomial random variable:
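A sketch of that function, reusing bernoulli_trial:

```python
import random

def bernoulli_trial(p: float) -> int:
    """Returns 1 with probability p and 0 otherwise."""
    return 1 if random.random() < p else 0

def binomial(n: int, p: float) -> int:
    """Returns the sum of n bernoulli(p) trials."""
    return sum(bernoulli_trial(p) for _ in range(n))
```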

Note : A single ‘success’ in a Bernoulli trial is ‘x’. Summing up all those x’s into X, is a Binomial random variable. Success doesn’t imply desirability, nor does “failure” imply undesirability. They’re just terms to count the cases we’re looking for (i.e., number of heads in multiple coin flips to assess a coin’s fairness).

Given that our null is (p = 0.5) and alt is (p != 0.5), we can run some independent bernoulli trials, then sum them up to get a binomial random variable.

[Figure: two runs of 1,000 independent coin flips]

Each bernoulli_trial is an experiment with either 0 or 1 as its outcome. The binomial function sums up n bernoulli(0.5) trials. We ran both twice and got different results. Each Bernoulli experiment can be a success (1) or a fail (0); summing them into a binomial random variable means we take the probability p (0.5) that a coin flips heads and run the experiment 1,000 times.

The first 1,000 flips gave us 510 heads; the second 1,000 flips gave us 495. We can repeat this process many times to get a distribution, and we can plot that distribution to reinforce our understanding. To do this we'll use the binomial_histogram function, which picks points from a Binomial(n, p) random variable and plots their histogram.
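binomial_histogram relies on matplotlib for the plotting; a plotting-free sketch of the sampling it performs (with the repetition count reduced from the post's 10,000 so it runs quickly):

```python
import random
from collections import Counter

def bernoulli_trial(p: float) -> int:
    return 1 if random.random() < p else 0

def binomial(n: int, p: float) -> int:
    return sum(bernoulli_trial(p) for _ in range(n))

# Draw many Binomial(n, p) values -- each one is 1,000 coin flips.
random.seed(0)
data = [binomial(1000, 0.5) for _ in range(1000)]

# binomial_histogram would bar-plot these counts with matplotlib.
counts = Counter(data)
mean = sum(data) / len(data)
print(round(mean, 1))
```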

This plot is then rendered:

[Figure: histogram of binomial coin-fairness samples]

What we did was sum independent bernoulli_trial(s) of 1,000 coin flips, where the probability of heads is p = 0.5, to create a binomial random variable. We then repeated this a large number of times (N = 10,000) and plotted a histogram of the distribution of all the binomial random variables. Because we did it so many times, the histogram approximates a normal distribution (a smooth bell-shaped curve).

Just to demonstrate how this works, we can generate several binomial random variables:

[Figure: several binomial random variables]

If we do this 10,000 times, we'll generate the above histogram. You'll notice that because we are testing whether the coin is fair, the probability of heads (success) should be 0.5 and, from 1,000 coin flips, the mean (mu) should be 500.

We have another function that can help us calculate normal_approximation_to_binomial :
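A sketch of that function, using mu = np and sigma = sqrt(np(1 - p)):

```python
import math
from typing import Tuple

def normal_approximation_to_binomial(n: int, p: float) -> Tuple[float, float]:
    """Returns mu and sigma for the normal approximation to a Binomial(n, p)."""
    mu = p * n
    sigma = math.sqrt(p * (1 - p) * n)
    return mu, sigma

mu, sigma = normal_approximation_to_binomial(1000, 0.5)
print(mu, round(sigma, 4))  # 500.0 15.8114
```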

Calling the function with our parameters gives a mean mu of 500 (from 1,000 coin flips) and a standard deviation sigma of 15.8114. This means that 68% of the time the binomial random variable will fall within 500 +/- 15.8114, and 95% of the time within 500 +/- 31.6228 (see the 68-95-99.7 rule).

Hypothesis Testing

Now that we have seen the results of our "coin fairness" experiment plotted on a binomial distribution (approximately normal), for the purpose of testing our hypothesis we will be interested in the probability that its realized value (the binomial random variable) lies within, or outside, a particular interval.

This means we’ll be interested in questions like:

  • What’s the probability that the binomial(n,p) is below a threshold?
  • Above a threshold?
  • Between an interval?
  • Outside an interval?

First, the normal_cdf (normal cumulative distribution function), which we learned in a previous post, gives the probability of a variable being below a certain threshold.

Here, with the probability of success (heads for a "fair coin") at 0.5 (mu = 500, sigma = 15.8113), we want to find the probability that X falls below 490, which comes out to roughly 26%.

On the other hand, normal_probability_above, the probability that X falls above 490, would be 1 - 0.2635 = 0.7365, or roughly 74%.
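Sketches of normal_cdf and normal_probability_above, using math.erf:

```python
import math

def normal_cdf(x: float, mu: float = 0, sigma: float = 1) -> float:
    """Probability that a normal(mu, sigma) variable falls below x."""
    return (1 + math.erf((x - mu) / (math.sqrt(2) * sigma))) / 2

# Being below a threshold is exactly what normal_cdf computes.
normal_probability_below = normal_cdf

def normal_probability_above(lo: float, mu: float = 0, sigma: float = 1) -> float:
    """Probability that a normal(mu, sigma) variable falls above lo."""
    return 1 - normal_cdf(lo, mu, sigma)

print(round(normal_probability_below(490, 500, 15.8113), 4))  # ~0.2635
print(round(normal_probability_above(490, 500, 15.8113), 4))  # ~0.7365
```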

To make sense of this, recall the binomial distribution that approximates the normal distribution, but this time with a vertical line drawn at 490.

[Figure: binomial distribution with a vertical line at 490]

We're asking: given the binomial distribution with mu at 500 and sigma at 15.8113, what is the probability that a binomial random variable falls below the threshold (left of the line)? The answer is approximately 26%; correspondingly, the probability of falling above the threshold (right of the line) is approximately 74%.

Between interval

We may also wonder about the probability of a binomial random variable falling between 490 and 520:

[Figure: binomial distribution with vertical lines at 490 and 520]

Here is the function to calculate this probability; it comes out to approximately 63%. Note: bear in mind that the full area under the curve is 1.0, or 100%.

Finally, the area outside of the interval should be 1 - 0.6335 = 0.3665:
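Sketches of the between/outside helpers:

```python
import math

def normal_cdf(x: float, mu: float = 0, sigma: float = 1) -> float:
    return (1 + math.erf((x - mu) / (math.sqrt(2) * sigma))) / 2

def normal_probability_between(lo: float, hi: float,
                               mu: float = 0, sigma: float = 1) -> float:
    """Probability that a normal(mu, sigma) variable is between lo and hi."""
    return normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)

def normal_probability_outside(lo: float, hi: float,
                               mu: float = 0, sigma: float = 1) -> float:
    """Probability that it falls outside the interval [lo, hi]."""
    return 1 - normal_probability_between(lo, hi, mu, sigma)

between = normal_probability_between(490, 520, 500, 15.8113)
print(round(between, 4))                                          # ~0.6335
print(round(normal_probability_outside(490, 520, 500, 15.8113), 4))  # ~0.3665
```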

In addition to the above, we may be interested in finding (symmetric) intervals around the mean that account for a certain level of likelihood, for example a 60% probability centered around the mean.

For this operation we would use the inverse_normal_cdf :

First we’d have to find the cutoffs where the upper and lower tails each contain 20% of the probability. We calculate normal_upper_bound and normal_lower_bound and use those to calculate the normal_two_sided_bounds .
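The book builds inverse_normal_cdf with a binary search; as a sketch, the stdlib NormalDist.inv_cdf gives equivalent results:

```python
from statistics import NormalDist

def normal_upper_bound(probability: float, mu: float = 0, sigma: float = 1) -> float:
    """Returns the z for which P(Z <= z) equals the given probability."""
    return NormalDist(mu, sigma).inv_cdf(probability)

def normal_lower_bound(probability: float, mu: float = 0, sigma: float = 1) -> float:
    """Returns the z for which P(Z >= z) equals the given probability."""
    return NormalDist(mu, sigma).inv_cdf(1 - probability)

def normal_two_sided_bounds(probability: float, mu: float = 0, sigma: float = 1):
    """Returns symmetric bounds (about the mean) containing the probability."""
    tail_probability = (1 - probability) / 2
    upper_bound = normal_lower_bound(tail_probability, mu, sigma)  # tail above
    lower_bound = normal_upper_bound(tail_probability, mu, sigma)  # tail below
    return lower_bound, upper_bound

lo, hi = normal_two_sided_bounds(0.60, 500, 15.8113)
print(round(lo, 2), round(hi, 2))  # ~486.69 513.31
```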

So if we wanted to know what the cutoff points were for a 60% probability around the mean and standard deviation ( mu = 500, sigma = 15.8113), it would be between 486.69 and 513.31 .

Said differently, this means roughly 60% of the time, we can expect the binomial random variable to fall between 486 and 513.

Significance and Power

Now that we have a handle on the binomial distribution (and its normal approximation), thresholds (left and right of the mean), and cut-off points, we want to make a decision about significance. Probably the most important part of statistical significance is that it is a decision to be made, not a standard that is externally set.

Significance is a decision about how willing we are to make a type 1 error (false positive), which we explored in a previous post . The convention is to set it to a 5% or 1% willingness to make a type 1 error. Suppose we say 5%.

We would say that out of 1,000 coin flips, 95% of the time, we’d get between 469 and 531 heads on a “fair coin” and 5% of the time, outside of this 469-531 range.

If we recall our hypotheses:

Null Hypothesis : Probability of landing on Heads = 0.5 (fair coin)

Alt Hypothesis : Probability of landing on Heads != 0.5 (biased coin)

For each test consisting of 1,000 Bernoulli trials, if the number of heads falls outside the range 469-531, we'll reject the null hypothesis that the coin is fair. And we'll be wrong (a false positive) 5% of the time. It's a false positive when we incorrectly reject the null hypothesis even though it's actually true.

We also want to avoid making a type-2 error (false negative), where we fail to reject the null hypothesis, when it’s actually false.

Note: It's important to keep in mind that terms like significance and power describe tests: in our case, the test of whether a coin is fair or not. Each test is the sum of 1,000 independent Bernoulli trials.

For a “test” that has a 95% significance, we’ll assume that out of a 1,000 coin flips, it’ll land on heads between 469-531 times and we’ll determine the coin is fair. For the 5% of the time it lands outside of this range, we’ll determine the coin to be “unfair”, but we’ll be wrong because it actually is fair.

To calculate the power of the test, we’ll take the assumed mu and sigma with a 95% bounds (based on the assumption that the probability of the coin landing on heads is 0.5 or 50% - a fair coin). We’ll determine the lower and upper bounds:
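A sketch of that bounds calculation (using the stdlib NormalDist in place of the from-scratch helpers):

```python
import math
from statistics import NormalDist

# mu and sigma under the null hypothesis of a fair coin (p = 0.5)
mu_0 = 1000 * 0.5
sigma_0 = math.sqrt(1000 * 0.5 * 0.5)

# symmetric bounds containing 95% of the probability under the null
dist_0 = NormalDist(mu_0, sigma_0)
lo, hi = dist_0.inv_cdf(0.025), dist_0.inv_cdf(0.975)
print(round(lo), round(hi))  # 469 531
```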

If the coin were actually biased, we should reject the null, but sometimes we'll fail to. Let's suppose the actual probability that the coin lands on heads is 55% (biased towards heads):

Using the same range 469 - 531, where the coin is assumed ‘fair’ with mu at 500 and sigma at 15.8113:

[Figure: 95% significance bounds on the binomial distribution]

If the coin, in fact, had a bias towards head (p = 0.55), the distribution would shift right, but if our 95% significance test remains the same, we get:

[Figure: type-2 error region where the shifted distribution overlaps the bounds]

The probability of making a type-2 error is 11.345%. This is the probability that we see the coin's results fall within the previous interval of 469-531 and conclude we should accept the null hypothesis (that the coin is fair), when in actuality we are failing to see that the distribution has shifted because the coin is biased towards heads.

The other way to arrive at this is to find the probability, under the new mu and sigma (new distribution), that X (number of successes) will fall below 531.
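A sketch of that calculation (again with the stdlib NormalDist standing in for the from-scratch helpers):

```python
import math
from statistics import NormalDist

# actual (biased) coin: p = 0.55
mu_1 = 1000 * 0.55
sigma_1 = math.sqrt(1000 * 0.55 * 0.45)

# the 95% upper cutoff computed under the null (fair-coin) distribution
hi = NormalDist(500, math.sqrt(250)).inv_cdf(0.975)

# probability the biased coin still produces a result inside the "fair" region
type_2_probability = NormalDist(mu_1, sigma_1).cdf(hi)
power = 1 - type_2_probability
print(round(type_2_probability, 3), round(power, 3))  # ~0.113 0.887
```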

So the probability of making a type-2 error or the probability that the new distribution falls below 531 is approximately 11.3%.

The power of the test is 1.00 minus the probability of a type-2 error (1 - 0.113 = 0.887), or 88.7%.

Finally, we may be interested in increasing the power of the test. Instead of using the normal_two_sided_bounds function to find the cut-off points (i.e., 469 and 531), we could use a one-sided test that rejects the null hypothesis ("fair coin") when X (the number of heads) is much larger than 500.

Here’s the code, using normal_upper_bound :
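A sketch of the one-sided version (the stdlib NormalDist replaces normal_upper_bound here):

```python
import math
from statistics import NormalDist

mu_0, sigma_0 = 500, math.sqrt(250)                  # null: fair coin
mu_1, sigma_1 = 550, math.sqrt(1000 * 0.55 * 0.45)   # actual: biased coin

# one-sided test: put the full 5% in the upper tail
hi = NormalDist(mu_0, sigma_0).inv_cdf(0.95)
type_2_probability = NormalDist(mu_1, sigma_1).cdf(hi)
power = 1 - type_2_probability
print(round(hi), round(type_2_probability, 3), round(power, 3))  # 526 ~0.064 0.936
```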

This shifts the upper bound from 531 to 526, putting more probability in the upper tail. As a result, the probability of a type-2 error goes down from 11.3% to 6.3%.

[Figure: increased power with a one-sided test]

And the new (stronger) power of the test is 1.0 - 0.064 = 0.936, or 93.6% (up from 88.7% above).

p-Values represent another way of deciding whether to accept or reject the Null Hypothesis. Instead of choosing bounds, thresholds or cut-off points, we could compute the probability, assuming the Null Hypothesis is true, that we would see a value as extreme as the one we just observed.

Here is the code:
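A sketch of two_sided_p_value:

```python
import math

def normal_cdf(x: float, mu: float = 0, sigma: float = 1) -> float:
    return (1 + math.erf((x - mu) / (math.sqrt(2) * sigma))) / 2

def two_sided_p_value(x: float, mu: float = 0, sigma: float = 1) -> float:
    """How likely is a value at least as extreme as x, in either direction?"""
    if x >= mu:
        return 2 * (1 - normal_cdf(x, mu, sigma))   # tail above x
    else:
        return 2 * normal_cdf(x, mu, sigma)         # tail below x

print(round(two_sided_p_value(529.5, 500, 15.8113), 3))  # ~0.062
```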

If we wanted to compute, assuming we have a “fair coin” ( mu = 500, sigma = 15.8113), what is the probability of seeing a value like 530? ( note : We use 529.5 instead of 530 below due to continuity correction )

Answer: approximately 6.2%

The p-value, 6.2% is higher than our (hypothetical) 5% significance, so we don’t reject the null. On the other hand, if X was slightly more extreme, 532, the probability of seeing that value would be approximately 4.3%, which is less than 5% significance, so we would reject the null.

For one-sided tests, we would use the normal_probability_above and normal_probability_below functions created above:
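Sketches of the one-sided p-value helpers:

```python
import math

def normal_cdf(x: float, mu: float = 0, sigma: float = 1) -> float:
    return (1 + math.erf((x - mu) / (math.sqrt(2) * sigma))) / 2

def upper_p_value(x: float, mu: float = 0, sigma: float = 1) -> float:
    """One-sided p-value: probability of a value at least as large as x."""
    return 1 - normal_cdf(x, mu, sigma)

def lower_p_value(x: float, mu: float = 0, sigma: float = 1) -> float:
    """One-sided p-value: probability of a value at least as small as x."""
    return normal_cdf(x, mu, sigma)

print(round(upper_p_value(529.5, 500, 15.8113), 3))  # ~0.031
```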

Under the two_sided_p_values test, the extreme value of 529.5 had a probability of 6.2% of showing up, but not low enough to reject the null hypothesis.

However, with a one-sided test, upper_p_value for the same threshold is now 3.1% and we would reject the null hypothesis.

Confidence Intervals

A third approach to deciding whether to reject the null is to use confidence intervals. We'll use 530, as we did in the p-Values example.

The confidence interval for a coin landing on heads 530 times (out of 1,000) is (0.4991, 0.5609). Since this interval contains p = 0.5 (heads 50% of the time, assuming a fair coin), we do not reject the null.

If the extreme value were more extreme at 540, we would arrive at a different conclusion:

Here we would be 95% confident that the mean of this distribution is contained between 0.5091 and 0.5709, which does not contain 0.500 (albeit by a slim margin), so we reject the null hypothesis that this is a fair coin.
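Both intervals can be reproduced with a short sketch (confidence_interval is a hypothetical helper name; the stdlib NormalDist stands in for the book's from-scratch functions):

```python
import math
from statistics import NormalDist

def confidence_interval(heads: int, n: int, level: float = 0.95):
    """Normal-approximation confidence interval for the heads probability."""
    p_hat = heads / n
    sigma = math.sqrt(p_hat * (1 - p_hat) / n)
    tail = (1 - level) / 2
    dist = NormalDist(p_hat, sigma)
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)

print(tuple(round(b, 4) for b in confidence_interval(530, 1000)))  # ~(0.4991, 0.5609)
print(tuple(round(b, 4) for b in confidence_interval(540, 1000)))  # ~(0.5091, 0.5709)
```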

Note: Confidence intervals are statements about the interval, not about the probability p. We interpret the confidence interval as follows: if you were to repeat the experiment many times, 95% of the time the "true" parameter (in our example p = 0.5) would lie within the observed confidence interval.

Connecting the Dots

We used several Python functions to build intuition around statistical hypothesis testing. To highlight the "from scratch" aspect of the book, here is a diagram tying together the various Python functions used in this post:

[Figure: diagram connecting the Python functions used in this post]

This post is part of an ongoing series where I document my progress through Data Science from Scratch by Joel Grus .


