Statology

Statistics Made Easy

4 Examples of Hypothesis Testing in Real Life

In statistics, hypothesis tests are used to test whether or not some hypothesis about a population parameter is true.

To perform a hypothesis test in the real world, researchers will obtain a random sample from the population and perform a hypothesis test on the sample data, using a null and alternative hypothesis:

  • Null Hypothesis (H 0 ): The sample data occurs purely from chance.
  • Alternative Hypothesis (H A ): The sample data is influenced by some non-random cause.

If the p-value of the hypothesis test is less than some significance level (e.g. α = .05), then we can reject the null hypothesis and conclude that we have sufficient evidence to say that the alternative hypothesis is true.

The following examples provide several situations where hypothesis tests are used in the real world.

Example 1: Biology

Hypothesis tests are often used in biology to determine whether some new treatment, fertilizer, pesticide, chemical, etc. causes increased growth, stamina, immunity, etc. in plants or animals.

For example, suppose a biologist believes that a certain fertilizer will cause plants to grow more during a one-month period than they normally do, which is currently 20 inches. To test this, she applies the fertilizer to each of the plants in her laboratory for one month.

She then performs a hypothesis test using the following hypotheses:

  • H 0 : μ = 20 inches (the fertilizer will have no effect on the mean plant growth)
  • H A : μ > 20 inches (the fertilizer will cause mean plant growth to increase)

If the p-value of the test is less than some significance level (e.g. α = .05), then she can reject the null hypothesis and conclude that the fertilizer leads to increased plant growth.

Example 2: Clinical Trials

Hypothesis tests are often used in clinical trials to determine whether some new treatment, drug, procedure, etc. causes improved outcomes in patients.

For example, suppose a doctor believes that a new drug is able to reduce blood pressure in obese patients. To test this, he may measure the blood pressure of 40 patients before and after using the new drug for one month.

He then performs a hypothesis test using the following hypotheses:

  • H 0 : μ after = μ before (the mean blood pressure is the same before and after using the drug)
  • H A : μ after < μ before (the mean blood pressure is less after using the drug)

If the p-value of the test is less than some significance level (e.g. α = .05), then he can reject the null hypothesis and conclude that the new drug leads to reduced blood pressure.

Example 3: Advertising Spend

Hypothesis tests are often used in business to determine whether or not some new advertising campaign, marketing technique, etc. causes increased sales.

For example, suppose a company believes that spending more money on digital advertising leads to increased sales. To test this, the company may increase money spent on digital advertising during a two-month period and collect data to see if overall sales have increased.

They may perform a hypothesis test using the following hypotheses:

  • H 0 : μ after = μ before (the mean sales is the same before and after spending more on advertising)
  • H A : μ after > μ before (the mean sales increased after spending more on advertising)

If the p-value of the test is less than some significance level (e.g. α = .05), then the company can reject the null hypothesis and conclude that increased digital advertising leads to increased sales.

Example 4: Manufacturing

Hypothesis tests are also used often in manufacturing plants to determine if some new process, technique, method, etc. causes a change in the number of defective products produced.

For example, suppose a certain manufacturing plant wants to test whether or not some new method changes the number of defective widgets produced per month, which is currently 250. To test this, they may measure the mean number of defective widgets produced before and after using the new method for one month.

They can then perform a hypothesis test using the following hypotheses:

  • H 0 : μ after = μ before (the mean number of defective widgets is the same before and after using the new method)
  • H A : μ after ≠ μ before (the mean number of defective widgets produced is different before and after using the new method)

If the p-value of the test is less than some significance level (e.g. α = .05), then the plant can reject the null hypothesis and conclude that the new method leads to a change in the number of defective widgets produced per month.

Additional Resources

Introduction to Hypothesis Testing Introduction to the One Sample t-test Introduction to the Two Sample t-test Introduction to the Paired Samples t-test

' src=

Published by Zach

Leave a reply cancel reply.

Your email address will not be published. Required fields are marked *

8 Hypothesis Testing Examples in Real Life

Hypothesis testing refers to the systematic and scientific method of examining whether the hypothesis set by the researcher is valid or not. Hypothesis testing verifies that the findings of an experiment are valid and the particular results did not happen by chance. If the particular results have occurred by chance then that experiment can not be repeated and its findings won’t be reliable. For example, if you conduct a study that finds that a particular drug is responsible for the blood pressure problem in diabetic patients. But, when you repeat this experiment and it does not give the same results, no one would trust this experiment’s findings. Hence, hypothesis testing is a very crucial step to verify the experimental findings. The main criterion of hypothesis testing is to check whether the null hypothesis is rejected or retained. The null hypothesis assumes that there does not exist any relationship between the variables under investigation, while the alternate hypothesis confirms the association between the variables under investigation. If the null hypothesis is rejected, it means that alternative hypotheses (research hypothesis) are accepted, and if the null hypothesis is accepted, the alternate hypothesis is rejected automatically. In this article, we’ll learn about hypothesis testing and various real-life examples of hypothesis testing.

Understanding Hypothesis Testing

The hypothesis testing broadly involves the following steps,

  • Step 1 : Formulate the research hypothesis and the null hypothesis of the experiment.
  • Step 2: Set the characteristics of the comparison distribution.
  • Step3: Set the criterion for decision making, i.e., cut off sample score for the comparison to reject or retain the null hypothesis.
  • Step:4 Set the outcome of the sample on the comparison distribution.
  • Step 5: Make a decision whether to reject or retain the null hypothesis.

Let us understand these steps through the following example,

Suppose the researcher wants to examine whether the memorizing power of students improves after consuming caffeine or not. To examine this he conducts experiments, the experiment involves two groups say group A (experimental group) and group B (control group). Group A consumed the coffee before the memory test, while group B consumed the water before the memory test. The average normally distributed score of the people of the experimental group has a standard deviation of 4 and a mean of 19. On the basis of the score, the researcher can state that there is an association between the two variables, i.e., the memory power and the caffeine, but the researcher can not predict any particular direction, i.e., which out of the experimental group and the control group had performed better in the memory tests. Hence, the level of significance value, i.e., 5 per cent will help to draw the conclusion. Following is the stepwise hypothesis testing of this example,

Step 1: Formulating Null hypothesis and alternate hypothesis 

There exist two sample populations, i.e., group A and group B.

Group A: People who consumed coffee before the experiment

Group B: People who consumed water before the experiment.

On the basis of this, the null hypothesis and the alternative hypothesis would be as follows.

Alternate Hypothesis: Group A will perform differently from Group B, i.e., there exists an association between the two variables.

Null Hypothesis: There will not be any difference between the performance of both groups, i.e., Group A and Group B both will perform similarly.

Step 2: Characteristics of the comparison distribution 

The characteristics of the comparison distribution in this example are given below,

Population Mean = 19

Standard Deviation= 4, normally distributed.

Step 3: Cut off score

In this test the direction of effect is not stated, i.e., it is a two-tailed test. In the case of a two-tailed test, the cut off sample scores is equal to +1.96 and -1.99 at the 5 per cent level.

Step 4: Outcome of Sample Score

The sample score is then converted into the Z value. Using the appropriate method of conversion this value is turned out to be equal to 2.

Step 5: Decision Making

The Z score value of 2 is far more than the cut off Z value, ie., +1.96, hence the result is significant, ie., rejection of the null hypothesis, i.e., there exists an association between the memory power and the consumption of the coffee before the test.

Click here , to understand hypothesis testing in detail.

Hypothesis Testing Real Life Examples

Following are some real-life examples of hypothesis testing.

1. To Check the Manufacturing Processes

Hypothesis testing finds its application in the manufacturing processes such as in determining whether the implication of the new technique or process in the manufacturing plant caused the anomalies in the quality of the product or not. Let us suppose, that manufacturing plant X decides to verify that the particular method results in an increase in the defective products per quarter, say this number to be 200. Now, to verify this the researcher needs to calculate the mean of the number of defective products produced before the start and the end of the quarter.

Following is the representation of the Hypothesis testing of this example,

Null Hypothesis (Ho) :  The average of the defective products produced is the same before and after the implementation of the new manufacturing method, i.e., μ after = μ before

Alternative Hypothesis (Ha) : The average number of defective products produced are different before and after the implementation of the new manufacturing method, i.e., μ after ≠ μ before

If the resultant p-value of the hypothesis testing comes lesser than the significant value, i.e., α = .05, then the null hypothesis is rejected and it can be concluded that the changes in the method of production lead to the rise in the number of defective products production per quarter.

2. To Plan the Marketing Strategies

Many businesses often use hypothesis testing to determine the impact of the newly implemented marketing techniques, campaigns or other tactics on the sales of the product. For example, the marketing department of the company assumed that if they spend more the digital advertisements it would lead to a rise in sales. To verify this assumption, the marketing department may raise the digital advertisement budget for a particular period, and then analyse the collected data at the end of that period. They have to perform hypothesis testing to verify their assumption. Here,

Null Hypothesis (Ho) : The average sales are the same before and after the rise in the digital advertisement budget, i.e., μafter = μbefore

Alternative Hypothesis (Ha) : The average sales increase after the rise in the digital advertisement budget, i.e., μafter > μbefore

If the P-value is smaller than the significant value (say .05), then the null hypothesis can be rejected by the marketing department, and they can conclude that the rise in the digital advertisement budget can result in a rise in the sales of the product.

3. In Clinical Trials

Many pharmacists and doctors use hypothesis testing for clinical trials. The impact of the new clinical methods, medicines or procedures on the condition of the patients is analysed through hypothesis testing. For example, a pharmacist believes that the new medicine is resulting in the rise of blood pressure in diabetic patients. To test this assumption, the researcher has to measure the blood pressure of the sample patients (patients under investigation) before and after the intake of the new medicine for nearly a particular period say one month. The following procedure of the hypothesis testing is then followed,

Null Hypothesis (H0) : The average blood pressure is the same after and before the consumption of the medicine, i.e., μafter = μbefore

Alternative Hypothesis (Ha): The average blood pressure after the consumption of the medicine is less than the average blood pressure before the consumption of the medicine, i.e., μafter < μbefore

If the p-value of the hypothesis test is less than the significance value (say .o5), the null hypothesis is rejected, i.e.,  it can be concluded that the new drug is responsible for the rise in the blood pressure of diabetic patients.

4. In Testing Effectiveness of Essential Oils

Essential oils are gaining popularity nowadays due to their various benefits. Various essential oils such as ylang-ylang, lavender, and chamomile claim to reduce anxiety. You might like to test the true healing powers of all these essential oils. Suppose you assume that the lavender essential oil has the ability to reduce stress and anxiety. To check this assumption you may conduct the hypothesis testing by restating the hypothesis as follows,

Null Hypothesis (Ho) : Lavender essential has no effect on reducing anxiety.

Alternative Hypothesis (Ha): Lavender oil helps in reducing anxiety.

In this experiment, group A, i.e., the experimental group are provided with the lavender oil, while group B, i.e., the control group is provided with the placebo. The data is then collected using the various statistical tools and the stress level of both the groups, i.e., the experimental and the control group is then analysed. After the calculation, the significance level, and the p-value are found to be 0.25, and 0.05 respectively. The p values are less than the significance values, hence the null hypothesis is rejected, and it can be concluded that the lavender oil helps in reducing the stress among the people.

5. In Testing Fertilizer’s Impact on Plants

Nowadays, hypothesis testing is also used to examine the impact of pesticides, fertilizers, and other chemicals on the growth of plants or animals. Let us suppose a researcher wants to check his assumption that the particular fertilizer may result in the faster growth of the plant in a month than its usual growth of 10 inches. To verify this assumption he consistently gave that fertilizer to the plant for nearly a month. Following is the mathematical procedure of the hypothesis testing in this case,

Null Hypothesis (H0): The fertilizer does not have any influence on the growth of the plant. i.e., μ = 20 inches

Alternative Hypothesis (Ha): The fertiliser results in the faster growth of the plant, i.e., μ > 20 inches

Now, if the p-value of the hypothesis testing comes smaller than the level of significance, say .05, then the null hypothesis can be rejected, and you can conclude that the particular fertilizer is responsible for the faster plant growth.

6. In Testing the Effectiveness of Vitamin E

Suppose the researcher assumes that Vitamin E helps in the faster growth of the Hair. He conduct an experiment in which the experimental group is provided with vitamin E for three months while the controlled group is provided with the placebo. The results are then analysed after the duration of three months. To verify his assumption he restates the hypothesis as follows,

Null Hypothesis (H0) : There is no association between the Vitamin E and the hair growth of the sample group, i.e., μafter = μbefore

Alternative Hypothesis (Ha) : The group of people who consumed the vitamin E shows faster hair growth than the average hair growth of them before the consumption of the Vitamin E provided other variables remains constant. Here, μafter > μbefore.

After performing the statistical analysis, the significance level and the p-value in this scenario are o.o5, and 0.20 respectively. Hence, the researcher can conclude that the consumption of vitamin E results in faster hair growth.

7. In Testing the Teaching Strategy

Suppose the two teachers say Mr X and Mr Y argue about the best teaching strategy. Mr X says that children will perform better in the annual exams if they are given the weekly tests, while Mr Y argues that the weekly test would not impact the performance of the children in the annual exams and it is waste of time. Now, to verify who is right between the both, we may conduct hypothesis testing. The researcher may formulate the hypothesis as follows,

Null Hypothesis (Ho): There is no association between the weekly tests on the performance of the children in the annual exams, i.e., the average marks scored by the children when they were given the weekly exams and when not, were the same. (μafter = μbefore)

Alternative Hypothesis (Ha): The children will perform better in the annual exams, when they have to give the weekly tests, rather than just giving the annual exams, i.e., μafter > μbefore.

Now, if the p-value of the hypothesis testing comes smaller than the level of significance, say .05, then the null hypothesis can be rejected, and the researcher can conclude that the children will perform better in the annual exams if the weekly examination system would be implemented.

8. In Verifying the Assumption related to Intelligence

Suppose a principle states that the students studying in her school have an IQ level of above average. To support her statement, the researcher may take a sample of around 50 random students from that school. Let’s say the average IQ score of those children is around 110, and the average IQ score of the mean population is 100 with a standard deviation of 15. The hypothesis testing is given as follows,

Null Hypothesis (Ho) : The population mean IQ score of 100 is a general fact, i.e., μ = 100.

Alternative Hypothesis (Ha): The average IQ score of the students is above average, i.e., μ > 100

It’s a one-tailed test as we are aiming for the ‘greater than’ assumption. Let us suppose the alpha level or we can say the significance level, in this case, is 5 per cent, i.e., 0.05, and this corresponds to the Z score equal to 1.645. The Z score is found by the statistical formula given by (112.5 – 100) / (15/√30) = 4.56. Now, the final step is to compare the values of the expected z score and the calculated z score. Here, the calculated Z score is lesser than the expected Z score, hence, the Null Hypothesis is rejected, i.e., the average IQ score of the children belonging to that school is above average.

Related Posts

8 Placebo Effect Examples in Real Life

8 Placebo Effect Examples in Real Life

Social Control Theory Examples

Social Control Theory Examples

Revocation & Lapse of Offer

Freud’s Psychoanalytic Theories Explained

Freud’s Psychoanalytic Theories Explained

2 Applied Ethics Examples in Real Life

2 Applied Ethics Examples in Real Life

16 Critical Thinking Examples in Real Life

16 Critical Thinking Examples in Real Life

Add comment cancel reply.

  • Prompt Library
  • DS/AI Trends
  • Stats Tools
  • Interview Questions
  • Generative AI
  • Machine Learning
  • Deep Learning

Hypothesis Testing Steps & Examples

Hypothesis Testing Workflow

Table of Contents

What is a Hypothesis testing?

As per the definition from Oxford languages, a hypothesis is a supposition or proposed explanation made on the basis of limited evidence as a starting point for further investigation. As per the Dictionary page on Hypothesis , Hypothesis means a proposition or set of propositions, set forth as an explanation for the occurrence of some specified group of phenomena, either asserted merely as a provisional conjecture to guide investigation (working hypothesis) or accepted as highly probable in the light of established facts.

The hypothesis can be defined as the claim that can either be related to the truth about something that exists in the world, or, truth about something that’s needs to be established a fresh . In simple words, another word for the hypothesis is the “claim” . Until the claim is proven to be true, it is called the hypothesis. Once the claim is proved, it becomes the new truth or new knowledge about the thing. For example , let’s say that a claim is made that students studying for more than 6 hours a day gets more than 90% of marks in their examination. Now, this is just a claim or a hypothesis and not the truth in the real world. However, in order for the claim to become the truth for widespread adoption, it needs to be proved using pieces of evidence, e.g., data.  In order to reject this claim or otherwise, one needs to do some empirical analysis by gathering data samples and evaluating the claim. The process of gathering data and evaluating the claims or hypotheses with the goal to reject or otherwise (failing to reject) can be called as hypothesis testing . Note the wordings – “failing to reject”. It means that we don’t have enough evidence to reject the claim. Thus, until the time that new evidence comes up, the claim can be considered the truth. There are different techniques to test the hypothesis in order to reach the conclusion of whether the hypothesis can be used to represent the truth of the world.

One must note that the hypothesis testing never constitutes a proof that the hypothesis is absolute truth based on the observations. It only provides added support to consider the hypothesis as truth until the time that new evidences can against the hypotheses can be gathered. We can never be 100% sure about truth related to those hypotheses based on the hypothesis testing.

Simply speaking, hypothesis testing is a framework that can be used to assert whether the claim or the hypothesis made about a real-world/real-life event can be seen as the truth or otherwise based on the given data (evidences).

Hypothesis Testing Examples

Before we get ahead and start understanding more details about hypothesis and hypothesis testing steps, lets take a look at some  real-world examples  of how to think about hypothesis and hypothesis testing when dealing with real-world problems :

  • Customers are churning because they ain’t getting response to their complaints or issues
  • Customers are churning because there are other competitive services in the market which are providing these services at lower cost.
  • Customers are churning because there are other competitive services which are providing more services at the same cost.
  • It is claimed that a 500 gm sugar packet for a particular brand, say XYZA, contains sugar of less than 500 gm, say around 480gm.  Can this claim be taken as truth? How do we know that this claim is true? This is a hypothesis until proved.
  • A group of doctors claims that quitting smoking increases lifespan. Can this claim be taken as new truth? The hypothesis is that quitting smoking results in an increase in lifespan.
  • It is claimed that brisk walking for half an hour every day reverses diabetes. In order to accept this in your lifestyle, you may need evidence that supports this claim or hypothesis.
  • It is claimed that doing Pranayama yoga for 30 minutes a day can help in easing stress by 50%. This can be termed as hypothesis and would require testing / validation for it to be established as a truth and recommended for widespread adoption.
  • One common real-life example of hypothesis testing is election polling. In order to predict the outcome of an election, pollsters take a sample of the population and ask them who they plan to vote for. They then use hypothesis testing to assess whether their sample is representative of the population as a whole. If the results of the hypothesis test are significant, it means that the sample is representative and that the poll can be used to predict the outcome of the election. However, if the results are not significant, it means that the sample is not representative and that the poll should not be used to make predictions.
  • Machine learning models make predictions based on the input data. Each of the machine learning model representing a function approximation can be taken as a hypothesis. All different models constitute what is called as hypothesis space .
  • As part of a linear regression machine learning model , it is claimed that there is a relationship between the response variables and predictor variables? Can this hypothesis or claim be taken as truth? Let’s say, the hypothesis is that the housing price depends upon the average income of people already staying in the locality. How true is this hypothesis or claim? The relationship between response variable and each of the predictor variables can be evaluated using T-test and T-statistics .
  • For linear regression model , one of the hypothesis is that there is no relationship between the response variable and any of the predictor variables. Thus, if b1, b2, b3 are three parameters, all of them is equal to 0. b1 = b2 = b3 = 0. This is where one performs F-test and use F-statistics to test this hypothesis.

You may note different hypotheses which are listed above. The next step would be validate some of these hypotheses. This is where data scientists will come into picture. One or more data scientists may be asked to work on different hypotheses. This would result in these data scientists looking for appropriate data related to the hypothesis they are working. This section will be detailed out in near future.

State the Hypothesis to begin Hypothesis Testing

The first step to hypothesis testing is defining or stating a hypothesis. Before the hypothesis can be tested, we need to formulate the hypothesis in terms of mathematical expressions. There are two important aspects to pay attention to, prior to the formulation of the hypothesis. The following represents different types of hypothesis that could be put to hypothesis testing:

  • Claim made against the well-established fact : The case in which a fact is well-established, or accepted as truth or “knowledge” and a new claim is made about this well-established fact. For example , when you buy a packet of 500 gm of sugar, you assume that the packet does contain at the minimum 500 gm of sugar and not any less, based on the label of 500 gm on the packet. In this case, the fact is given or assumed to be the truth. A new claim can be made that the 500 gm sugar contains sugar weighing less than 500 gm. This claim needs to be tested before it is accepted as truth. Such cases could be considered for hypothesis testing if this is claimed that the assumption or the default state of being is not true. The claim to be established as new truth can be stated as “alternate hypothesis”. The opposite state can be stated as “null hypothesis”. Here the claim that the 500 gm packet consists of sugar less than 500 grams would be stated as alternate hypothesis. The opposite state which is the sugar packet consists 500 gm is null hypothesis.
  • Claim to establish the new truth : The case in which there is some claim made about the reality that exists in the world (fact). For example , the fact that the housing price depends upon the average income of people already staying in the locality can be considered as a claim and not assumed to be true. Another example could be the claim that running 5 miles a day would result in a reduction of 10 kg of weight within a month. There could be varied such claims which when required to be proved as true have to go through hypothesis testing. The claim to be established as new truth can be stated as “alternate hypothesis”. The opposite state can be stated as “null hypothesis”. Running 5 miles a day would result in reduction of 10 kg within a month would be stated as alternate hypothesis.

Based on the above considerations, the following hypothesis can be stated for doing hypothesis testing.

  • The packet of 500 gm of sugar contains sugar of weight less than 500 gm. (Claim made against the established fact). This is a new knowledge which requires hypothesis testing to get established and acted upon.
  • The housing price depends upon the average income of the people staying in the locality. This is a new knowledge which requires hypothesis testing to get established and acted upon.
  • Running 5 miles a day results in a reduction of 10 kg of weight within a month. This is a new knowledge which requires hypothesis testing to get established for widespread adoption.

Formulate Null & Alternate Hypothesis as Next Step

Once the hypothesis is defined or stated, the next step is to formulate the null and alternate hypothesis in order to begin hypothesis testing as described above.

What is a null hypothesis?

In the case where the given statement is a well-established fact or default state of being in the real world, one can call it a null hypothesis (in the simpler word, nothing new). Well-established facts don’t need any hypothesis testing and hence can be called the null hypothesis. In cases, when there are any new claims made which is not well established in the real world, the null hypothesis can be thought of as the default state or opposite state of that claim. For example , in the previous section, the claim or hypothesis is made that the students studying for more than 6 hours a day gets more than 90% of marks in their examination. The null hypothesis, in this case, will be that the claim is not true or real. The null hypothesis can be stated that there is no relationship or association between the students reading more than 6 hours a day and they getting 90% of the marks. Any occurrence is only a chance occurrence. Another example of hypothesis is when somebody is alleged that they have performed a crime.

Null hypothesis is denoted by letter H with 0, e.g., [latex]H_0[/latex]

What is an alternate hypothesis?

When the given statement is a claim (unexpected event in the real world) and not yet proven, one can call/formulate it as an alternate hypothesis and accordingly define a null hypothesis which is the opposite state of the hypothesis. The alternate hypothesis is a new knowledge or truth that needs to be established. In simple words, the hypothesis or claim that needs to be tested against reality in the real world can be termed the alternate hypothesis. In order to reach a conclusion that the claim (alternate hypothesis) can be considered the new knowledge or truth (based on the available evidence), it would be important to reject the null hypothesis. It should be noted that null and alternate hypotheses are mutually exclusive and at the same time asymmetric. In the example given in the previous section, the claim that the students studying for more than 6 hours get more than 90% of marks can be termed as the alternate hypothesis.

Alternate hypothesis is denoted with H subscript a, e.g., [latex]H_a[/latex]

Once the hypothesis is formulated as null([latex]H_0[/latex]) and alternate hypothesis ([latex]H_a[/latex]), there are two possible outcomes that can happen from hypothesis testing. These outcomes are the following:

  • Reject the null hypothesis : There is enough evidence based on which one can reject the null hypothesis. Let’s understand this with the help of an example provided earlier in this section. The null hypothesis is that there is no relationship between the students studying more than 6 hours a day and getting more than 90% marks. In a sample of 30 students studying more than 6 hours a day, it was found that they scored 91% marks. Given that the null hypothesis is true, this kind of hypothesis testing result will be highly unlikely. This kind of result can’t happen by chance. That would mean that the claim can be taken as the new truth or new knowledge in the real world. One can go and take further samples of 30 students to perform some more testing to validate the hypothesis. If similar results show up with other tests, it can be said with very high confidence that there is enough evidence to reject the null hypothesis that there is no relationship between the students studying more than 6 hours a day and getting more than 90% marks. In such cases, one can go to accept the claim as new truth that the students studying more than 6 hours a day get more than 90% marks. The hypothesis can be considered the new truth until the time that new tests provide evidence against this claim.
  • Fail to reject the null hypothesis : There is not enough evidence-based on which one can reject the null hypothesis (well-established fact or reality). Thus, one would fail to reject the null hypothesis. In a sample of 30 students studying more than 6 hours a day, the students were found to score 75%. Given that the null hypothesis is true, this kind of result is fairly likely or expected. With the given sample, one can’t reject the null hypothesis that there is no relationship between the students studying more than 6 hours a day and getting more than 90% marks.

Examples of formulating the null and alternate hypothesis

The following are some examples of the null and alternate hypothesis.

Hypothesis Testing Steps

Here is the diagram which represents the workflow of Hypothesis Testing.

Hypothesis Testing Workflow

Figure 1. Hypothesis Testing Steps

Based on the above, the following are some of the  steps to be taken when doing hypothesis testing:

  • State the hypothesis : First and foremost, the hypothesis needs to be stated. The hypothesis could either be the statement that is assumed to be true or the claim which is made to be true.
  • Formulate the hypothesis : This step requires one to identify the Null and Alternate hypotheses or in simple words, formulate the hypothesis. Take an example of the canned sauce weighing 500 gm as the Null Hypothesis.
  • Set the criteria for a decision : Identify test statistics that could be used to assess the Null Hypothesis. The test statistics with the above example would be the average weight of the sugar packet, and t-statistics would be used to determine the P-value. For different kinds of problems, different kinds of statistics including Z-statistics, T-statistics, F-statistics, etc can be used.
  • Identify the level of significance (alpha) : Before starting the hypothesis testing, one would be required to set the significance level (also called as  alpha ) which represents the value for which a P-value less than or equal to  alpha  is considered statistically significant. Typical values of  alpha  are 0.1, 0.05, and 0.01. In case the P-value is evaluated as statistically significant, the null hypothesis is rejected. In case, the P-value is more than the  alpha  value, the null hypothesis is failed to be rejected.
  • Compute the test statistics : Next step is to calculate the test statistics (z-test, t-test, f-test, etc) to determine the P-value. If the sample size is more than 30, it is recommended to use z-statistics. Otherwise, t-statistics could be used. In the current example where 20 packets of canned sauce is selected for hypothesis testing, t-statistics will be calculated for the mean value of 505 gm (sample mean). The t-statistics would then be calculated as the difference of 505 gm (sample mean) and the population means (500 gm) divided by the sample standard deviation divided by the square root of sample size (20).
  • Calculate the P-value of the test statistics : Once the test statistics have been calculated, find the P-value using either of t-table or a z-table. P-value is the probability of obtaining a test statistic (t-score or z-score) equal to or more extreme than the result obtained from the sample data, given that the null hypothesis H0 is true.
  • Compare P-value with the level of significance : The significance level is set as the allowable range within which if the value appears, one will be failed to reject the Null Hypothesis. This region is also called as Non-rejection region . The value of alpha is compared with the p-value. If the p-value is less than the significance level, the test is statistically significant and hence, the null hypothesis will be rejected.

P-Value: Key to Statistical Hypothesis Testing

Once you formulate the hypotheses, there is the need to test those hypotheses. Meaning, say that the null hypothesis is stated as the statement that housing price does not depend upon the average income of people staying in the locality, it would be required to be tested by taking samples of housing prices and, based on the test results, this Null hypothesis could either be rejected or failed to be rejected . In hypothesis testing, the following two are the outcomes:

  • Reject the Null hypothesis
  • Fail to Reject the Null hypothesis

Take the above example of the sugar packet weighing 500 gm. The Null hypothesis is set as the statement that the sugar packet weighs 500 gm. After taking a sample of 20 sugar packets and testing/taking its weight, it was found that the average weight of the sugar packets came to 495 gm. The test statistics (t-statistics) were calculated for this sample and the P-value was determined. Let’s say the P-value was found to be 15%. Assuming that the level of significance is selected to be 5%, the test statistic is not statistically significant (P-value > 5%) and thus, the null hypothesis fails to get rejected. Thus, one could safely conclude that the sugar packet does weigh 500 gm. However, if the average weight of canned sauce would have found to be 465 gm, this is way beyond/away from the mean value of 500 gm and one could have ended up rejecting the Null Hypothesis based on the P-value .

Hypothesis Testing for Problem Analysis & Solution Implementation

Hypothesis testing can be applied in both problem analysis and solution implementation. The following represents method on how you can apply hypothesis testing technique for both problem and solution space:

  • Problem Analysis : Hypothesis testing is a systematic way to validate assumptions or educated guesses during problem analysis. It allows for a structured investigation into the nature of a problem and its potential root causes. In this process, a null hypothesis and an alternative hypothesis are usually defined. The null hypothesis generally asserts that no significant change or effect exists, while the alternative hypothesis posits the opposite. Through controlled experiments, data collection, or statistical analysis, these hypotheses are then tested to determine their validity. For example, if a software company notices a sudden increase in user churn rate, they might hypothesize that the recent update to their application is the root cause. The null hypothesis could be that the update has no effect on churn rate, while the alternative hypothesis would assert that the update significantly impacts the churn rate. By analyzing user behavior and feedback before and after the update, and perhaps running A/B tests where one user group has the update and another doesn’t, the company can test these hypotheses. If the alternative hypothesis is confirmed, the company can then focus on identifying specific issues in the update that may be causing the increased churn, thereby moving closer to a solution.
  • Solution Implementation : Hypothesis testing can also be a valuable tool during the solution implementation phase, serving as a method to evaluate the effectiveness of proposed remedies. By setting up a specific hypothesis about the expected outcome of a solution, organizations can create targeted metrics and KPIs to measure success. For example, if a retail business is facing low customer retention rates, they might implement a loyalty program as a solution. The hypothesis could be that introducing a loyalty program will increase customer retention by at least 15% within six months. The null hypothesis would state that the loyalty program has no significant effect on retention rates. To test this, the company can compare retention metrics from before and after the program’s implementation, possibly even setting up control groups for more robust analysis. By applying statistical tests to this data, the company can determine whether their hypothesis is confirmed or refuted, thereby gauging the effectiveness of their solution and making data-driven decisions for future actions.
  • Tests of Significance
  • Hypothesis testing for the Mean
  • z-statistics vs t-statistics (Khan Academy)

Hypothesis testing quiz

The claim that needs to be established is set as ____________, the outcome of hypothesis testing is _________.

Please select 2 correct answers

P-value is defined as the probability of obtaining the result as extreme given the null hypothesis is true

There is a claim that doing pranayama yoga results in reversing diabetes. which of the following is true about null hypothesis.

In this post, you learned about hypothesis testing and related nuances such as the null and alternate hypothesis formulation techniques, ways to go about doing hypothesis testing etc. In data science, one of the reasons why one needs to understand the concepts of hypothesis testing is the need to verify the relationship between the dependent (response) and independent (predictor) variables. One would, thus, need to understand the related concepts such as hypothesis formulation into null and alternate hypothesis, level of significance, test statistics calculation, P-value, etc. Given that the relationship between dependent and independent variables is a sort of hypothesis or claim , the null hypothesis could be set as the scenario where there is no relationship between dependent and independent variables.

Recent Posts

Ajitesh Kumar

  • Free IBM Data Sciences Courses on Coursera - April 6, 2024
  • Self-Supervised Learning vs Transfer Learning: Examples - April 3, 2024
  • OKRs vs KPIs vs KRAs: Differences and Examples - February 21, 2024

Ajitesh Kumar

Leave a reply cancel reply.

Your email address will not be published. Required fields are marked *

  • Search for:
  • Excellence Awaits: IITs, NITs & IIITs Journey

ChatGPT Prompts (250+)

  • Generate Design Ideas for App
  • Expand Feature Set of App
  • Create a User Journey Map for App
  • Generate Visual Design Ideas for App
  • Generate a List of Competitors for App
  • Free IBM Data Sciences Courses on Coursera
  • Self-Supervised Learning vs Transfer Learning: Examples
  • OKRs vs KPIs vs KRAs: Differences and Examples
  • CEP vs Traditional Database Examples
  • Retrieval Augmented Generation (RAG) & LLM: Examples

Data Science / AI Trends

  • • Prepend any arxiv.org link with talk2 to load the paper into a responsive chat application
  • • Custom LLM and AI Agents (RAG) On Structured + Unstructured Data - AI Brain For Your Organization
  • • Guides, papers, lecture, notebooks and resources for prompt engineering
  • • Common tricks to make LLMs efficient and stable
  • • Machine learning in finance

Free Online Tools

  • Create Scatter Plots Online for your Excel Data
  • Histogram / Frequency Distribution Creation Tool
  • Online Pie Chart Maker Tool
  • Z-test vs T-test Decision Tool
  • Independent samples t-test calculator

Recent Comments

I found it very helpful. However the differences are not too understandable for me

Very Nice Explaination. Thankyiu very much,

in your case E respresent Member or Oraganization which include on e or more peers?

Such a informative post. Keep it up

Thank you....for your support. you given a good solution for me.

4 Examples of Hypothesis Testing in Real Life

In statistics, hypothesis tests are used to test whether or not some hypothesis about a population parameter is true.

To perform a hypothesis test in the real world, researchers will obtain a random sample from the population and perform a hypothesis test on the sample data, using a null and alternative hypothesis:

  • Null Hypothesis (H 0 ): The sample data occurs purely from chance.
  • Alternative Hypothesis (H A ): The sample data is influenced by some non-random cause.

If the p-value of the hypothesis test is less than some significance level (e.g. α = .05), then we can reject the null hypothesis and conclude that we have sufficient evidence to say that the alternative hypothesis is true.

The following examples provide several situations where hypothesis tests are used in the real world.

Example 1: Biology

Hypothesis tests are often used in biology to determine whether some new treatment, fertilizer, pesticide, chemical, etc. causes increased growth, stamina, immunity, etc. in plants or animals.

For example, suppose a biologist believes that a certain fertilizer will cause plants to grow more during a one-month period than they normally do, which is currently 20 inches. To test this, she applies the fertilizer to each of the plants in her laboratory for one month.

She then performs a hypothesis test using the following hypotheses:

  • H 0 : μ = 20 inches (the fertilizer will have no effect on the mean plant growth)
  • H A : μ > 20 inches (the fertilizer will cause mean plant growth to increase)

If the p-value of the test is less than some significance level (e.g. α = .05), then she can reject the null hypothesis and conclude that the fertilizer leads to increased plant growth.

Example 2: Clinical Trials

Hypothesis tests are often used in clinical trials to determine whether some new treatment, drug, procedure, etc. causes improved outcomes in patients.

For example, suppose a doctor believes that a new drug is able to reduce blood pressure in obese patients. To test this, he may measure the blood pressure of 40 patients before and after using the new drug for one month.

He then performs a hypothesis test using the following hypotheses:

  • H 0 : μ after = μ before (the mean blood pressure is the same before and after using the drug)
  • H A : μ after before (the mean blood pressure is less after using the drug)

If the p-value of the test is less than some significance level (e.g. α = .05), then he can reject the null hypothesis and conclude that the new drug leads to reduced blood pressure.

Example 3: Advertising Spend

Hypothesis tests are often used in business to determine whether or not some new advertising campaign, marketing technique, etc. causes increased sales.

For example, suppose a company believes that spending more money on digital advertising leads to increased sales. To test this, the company may increase money spent on digital advertising during a two-month period and collect data to see if overall sales have increased.

They may perform a hypothesis test using the following hypotheses:

  • H 0 : μ after = μ before (the mean sales is the same before and after spending more on advertising)
  • H A : μ after > μ before (the mean sales increased after spending more on advertising)

If the p-value of the test is less than some significance level (e.g. α = .05), then the company can reject the null hypothesis and conclude that increased digital advertising leads to increased sales.

Example 4: Manufacturing

Hypothesis tests are also used often in manufacturing plants to determine if some new process, technique, method, etc. causes a change in the number of defective products produced.

For example, suppose a certain manufacturing plant wants to test whether or not some new method changes the number of defective widgets produced per month, which is currently 250. To test this, they may measure the mean number of defective widgets produced before and after using the new method for one month.

They can then perform a hypothesis test using the following hypotheses:

  • H 0 : μ after = μ before (the mean number of defective widgets is the same before and after using the new method)
  • H A : μ after ≠ μ before (the mean number of defective widgets produced is different before and after using the new method)

If the p-value of the test is less than some significance level (e.g. α = .05), then the plant can reject the null hypothesis and conclude that the new method leads to a change in the number of defective widgets produced per month.

Additional Resources

Introduction to Hypothesis Testing Introduction to the One Sample t-test Introduction to the Two Sample t-test Introduction to the Paired Samples t-test

How to Perform a Partial F-Test in Excel

4 examples of confidence intervals in real life, related posts, how to normalize data between -1 and 1, how to interpret f-values in a two-way anova, how to create a vector of ones in..., vba: how to check if string contains another..., how to determine if a probability distribution is..., what is a symmetric histogram (definition & examples), how to find the mode of a histogram..., how to find quartiles in even and odd..., how to calculate sxy in statistics (with example), how to calculate sxx in statistics (with example).

real life example of hypothesis testing

User Preferences

Content preview.

Arcu felis bibendum ut tristique et egestas quis:

  • Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
  • Duis aute irure dolor in reprehenderit in voluptate
  • Excepteur sint occaecat cupidatat non proident

Keyboard Shortcuts

S.3.3 hypothesis testing examples.

  • Example: Right-Tailed Test
  • Example: Left-Tailed Test
  • Example: Two-Tailed Test

Brinell Hardness Scores

An engineer measured the Brinell hardness of 25 pieces of ductile iron that were subcritically annealed. The resulting data were:

The engineer hypothesized that the mean Brinell hardness of all such ductile iron pieces is greater than 170. Therefore, he was interested in testing the hypotheses:

H 0 : μ = 170 H A : μ > 170

The engineer entered his data into Minitab and requested that the "one-sample t -test" be conducted for the above hypotheses. He obtained the following output:

Descriptive Statistics

$\mu$: mean of Brinelli

Null hypothesis    H₀: $\mu$ = 170 Alternative hypothesis    H₁: $\mu$ > 170

The output tells us that the average Brinell hardness of the n = 25 pieces of ductile iron was 172.52 with a standard deviation of 10.31. (The standard error of the mean "SE Mean", calculated by dividing the standard deviation 10.31 by the square root of n = 25, is 2.06). The test statistic t * is 1.22, and the P -value is 0.117.

If the engineer set his significance level α at 0.05 and used the critical value approach to conduct his hypothesis test, he would reject the null hypothesis if his test statistic t * were greater than 1.7109 (determined using statistical software or a t -table):

t distribution graph for df = 24 and a right tailed test of .05 significance level

Since the engineer's test statistic, t * = 1.22, is not greater than 1.7109, the engineer fails to reject the null hypothesis. That is, the test statistic does not fall in the "critical region." There is insufficient evidence, at the \(\alpha\) = 0.05 level, to conclude that the mean Brinell hardness of all such ductile iron pieces is greater than 170.

If the engineer used the P -value approach to conduct his hypothesis test, he would determine the area under a t n - 1 = t 24 curve and to the right of the test statistic t * = 1.22:

t distribution graph of right tailed test showing the p-value of 0117 for a t-value of 1.22

In the output above, Minitab reports that the P -value is 0.117. Since the P -value, 0.117, is greater than \(\alpha\) = 0.05, the engineer fails to reject the null hypothesis. There is insufficient evidence, at the \(\alpha\) = 0.05 level, to conclude that the mean Brinell hardness of all such ductile iron pieces is greater than 170.

Note that the engineer obtains the same scientific conclusion regardless of the approach used. This will always be the case.

Height of Sunflowers

A biologist was interested in determining whether sunflower seedlings treated with an extract from Vinca minor roots resulted in a lower average height of sunflower seedlings than the standard height of 15.7 cm. The biologist treated a random sample of n = 33 seedlings with the extract and subsequently obtained the following heights:

The biologist's hypotheses are:

H 0 : μ = 15.7 H A : μ < 15.7

The biologist entered her data into Minitab and requested that the "one-sample t -test" be conducted for the above hypotheses. She obtained the following output:

$\mu$: mean of Height

Null hypothesis    H₀: $\mu$ = 15.7 Alternative hypothesis    H₁: $\mu$ < 15.7

The output tells us that the average height of the n = 33 sunflower seedlings was 13.664 with a standard deviation of 2.544. (The standard error of the mean "SE Mean", calculated by dividing the standard deviation 13.664 by the square root of n = 33, is 0.443). The test statistic t * is -4.60, and the P -value, 0.000, is to three decimal places.

Minitab Note. Minitab will always report P -values to only 3 decimal places. If Minitab reports the P -value as 0.000, it really means that the P -value is 0.000....something. Throughout this course (and your future research!), when you see that Minitab reports the P -value as 0.000, you should report the P -value as being "< 0.001."

If the biologist set her significance level \(\alpha\) at 0.05 and used the critical value approach to conduct her hypothesis test, she would reject the null hypothesis if her test statistic t * were less than -1.6939 (determined using statistical software or a t -table):s-3-3

Since the biologist's test statistic, t * = -4.60, is less than -1.6939, the biologist rejects the null hypothesis. That is, the test statistic falls in the "critical region." There is sufficient evidence, at the α = 0.05 level, to conclude that the mean height of all such sunflower seedlings is less than 15.7 cm.

If the biologist used the P -value approach to conduct her hypothesis test, she would determine the area under a t n - 1 = t 32 curve and to the left of the test statistic t * = -4.60:

t-distribution for left tailed test with significance level of 0.05 shown in left tail

In the output above, Minitab reports that the P -value is 0.000, which we take to mean < 0.001. Since the P -value is less than 0.001, it is clearly less than \(\alpha\) = 0.05, and the biologist rejects the null hypothesis. There is sufficient evidence, at the \(\alpha\) = 0.05 level, to conclude that the mean height of all such sunflower seedlings is less than 15.7 cm.

t-distribution graph for left tailed test with a t-value of -4.60 and left tail area of 0.000

Note again that the biologist obtains the same scientific conclusion regardless of the approach used. This will always be the case.

Gum Thickness

A manufacturer claims that the thickness of the spearmint gum it produces is 7.5 one-hundredths of an inch. A quality control specialist regularly checks this claim. On one production run, he took a random sample of n = 10 pieces of gum and measured their thickness. He obtained:

The quality control specialist's hypotheses are:

H 0 : μ = 7.5 H A : μ ≠ 7.5

The quality control specialist entered his data into Minitab and requested that the "one-sample t -test" be conducted for the above hypotheses. He obtained the following output:

$\mu$: mean of Thickness

Null hypothesis    H₀: $\mu$ = 7.5 Alternative hypothesis    H₁: $\mu \ne$ 7.5

The output tells us that the average thickness of the n = 10 pieces of gums was 7.55 one-hundredths of an inch with a standard deviation of 0.1027. (The standard error of the mean "SE Mean", calculated by dividing the standard deviation 0.1027 by the square root of n = 10, is 0.0325). The test statistic t * is 1.54, and the P -value is 0.158.

If the quality control specialist sets his significance level \(\alpha\) at 0.05 and used the critical value approach to conduct his hypothesis test, he would reject the null hypothesis if his test statistic t * were less than -2.2616 or greater than 2.2616 (determined using statistical software or a t -table):

t-distribution graph of two tails with a significance level of .05 and t values of -2.2616 and 2.2616

Since the quality control specialist's test statistic, t * = 1.54, is not less than -2.2616 nor greater than 2.2616, the quality control specialist fails to reject the null hypothesis. That is, the test statistic does not fall in the "critical region." There is insufficient evidence, at the \(\alpha\) = 0.05 level, to conclude that the mean thickness of all of the manufacturer's spearmint gum differs from 7.5 one-hundredths of an inch.

If the quality control specialist used the P -value approach to conduct his hypothesis test, he would determine the area under a t n - 1 = t 9 curve, to the right of 1.54 and to the left of -1.54:

t-distribution graph for a two tailed test with t values of -1.54 and 1.54, the corresponding p-values are 0.0789732 on both tails

In the output above, Minitab reports that the P -value is 0.158. Since the P -value, 0.158, is greater than \(\alpha\) = 0.05, the quality control specialist fails to reject the null hypothesis. There is insufficient evidence, at the \(\alpha\) = 0.05 level, to conclude that the mean thickness of all pieces of spearmint gum differs from 7.5 one-hundredths of an inch.

Note that the quality control specialist obtains the same scientific conclusion regardless of the approach used. This will always be the case.

In our review of hypothesis tests, we have focused on just one particular hypothesis test, namely that concerning the population mean \(\mu\). The important thing to recognize is that the topics discussed here — the general idea of hypothesis tests, errors in hypothesis testing, the critical value approach, and the P -value approach — generally extend to all of the hypothesis tests you will encounter.

U.S. flag

An official website of the United States government

The .gov means it’s official. Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

The site is secure. The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

  • Publications
  • Account settings

Preview improvements coming to the PMC website in October 2024. Learn More or Try it out now .

  • Advanced Search
  • Journal List
  • Educ Psychol Meas
  • v.77(4); 2017 Aug

Hypothesis Testing in the Real World

Jeff miller.

1 University of Otago, Dunedin, New Zealand

Critics of null hypothesis significance testing suggest that (a) its basic logic is invalid and (b) it addresses a question that is of no interest. In contrast to (a), I argue that the underlying logic of hypothesis testing is actually extremely straightforward and compelling. To substantiate that, I present examples showing that hypothesis testing logic is routinely used in everyday life. These same examples also refute (b) by showing circumstances in which the logic of hypothesis testing addresses a question of prime interest. Null hypothesis significance testing may sometimes be misunderstood or misapplied, but these problems should be addressed by improved education.

One important goal of statistical analysis is to find real patterns in data. This is difficult when the data are subject to random noise, because random noise can produce illusory patterns “just by chance.” Given the difficulty of separating real patterns from coincidental ones within noisy data, it is important for researchers to use all of the appropriate tools and models to make inferences from their data (e.g., Gigerenzer & Marewski, 2015 ).

Null hypothesis significance testing (NHST) is one of the most commonly used types of statistical analysis, but it has been criticized severely (e.g., Kline, 2004 ; Ziliak & McCloskey, 2008 ). According to Cohen (1994) , for example, “NHST has not only failed to support the advance of psychology as a science but also has seriously impeded it” (p. 997). There have been calls for it to be supplemented with other types of analysis (e.g., Wilkinson & the Task Force on Statistical Inference, 1999 ), and at least one journal has banned its use outright ( Trafimow & Marks, 2015 ).

This note reviews the basic logic of NHST and responds to some criticisms of it. I argue that the basic logic is straightforward and compelling—so much so that it is commonly used in everyday reasoning. It is suitable for answering certain types of research questions, and of course it can be supplemented with additional techniques to address other questions. Criticisms of NHST’s logic either distort it or implicitly deny the possibility of ever finding patterns in data. The major problem with NHST is that some aspects of the method can be misunderstood, but the solution to that problem is to improve education—not to adopt new methods that address a different set of questions but are incapable of answering the question addressed by NHST. I conclude that it would be a mistake to throw out NHST.

The Common Sense Logic of NHST

Critics of NHST assert that it uses arcane, twisted, and ultimately flawed probabilistic logic (e.g., Cohen, 1994 ; Hubbard & Lindsay, 2008 ). To the contrary, the heart of NHST is a simple, intuitive, and familiar “common sense” logic that most people routinely use when they are trying to decide whether something they observe might have happened by coincidence (a.k.a., “randomly,” “by accident,” or “by chance”).

For example, suppose that you and five colleagues attend a departmental picnic. An hour after eating, three of you start to feel queasy. It comes out in discussion that those feeling queasy ate potato salad and that those not feeling queasy did not eat the potato salad. What could be more natural than to conclude that there was something wrong with the potato salad?

It is important to realize that this nonstatistical example fully embodies the underlying logic of hypothesis testing. First, a pattern is observed. In this example, the pattern is that people who ate potato salad felt queasy. Second, it is acknowledged that the pattern might have arisen just by chance. In this example, for instance, exactly those people who ate the potato salad—and no one else—might coincidentally all have been coming down with the flu, and the flu might have caused their queasiness. Third, there is reason to believe that the observed coincidence—while possible—would be very unlikely. In the example, real-world experience suggests that coming down with flu is a rare event, so it would be quite unlikely for several people to do so at just the same time, and it would of course be even more unlikely that those were exactly the people who ate the potato salad. Fourth, it is concluded that the observed pattern did not arise by chance. In this example, the “not by chance” conclusion suggests that there was something wrong with the potato salad.

To further clarify the analogy between NHST and the potato salad example, consider how a standard coin-flipping “statistical” data analysis situation could be described in parallel terms. Suppose a coin is flipped 50 times and it comes up heads 48 of them (pattern). This quite strong pattern could happen by coincidence, but elementary probability theory says that such a coincidence would be extremely unlikely. It therefore seems reasonable to conclude that the pattern was not just a coincidence; instead, the coin appears to be biased to come up heads. This is exactly the same line of reasoning used in the potato salad example: The observed pattern would be very unlikely to occur by chance, so it is reasonable to conclude that it arose for some other reason.

There are many other nonstatistical examples of the reasoning used in NHST. For instance, if you see an unusually large number of cars parked on the street where you live (pattern), you will probably conclude that something special is going on nearby. It is logically possible for all those cars to be there at the same time just by coincidence, but you know from your experience that this would be unlikely, so you reject the “just by chance” idea. Analogously, if two statistics students make an identical series of calculation errors on a homework problem (pattern), their instructor might well conclude that they had not done the homework independently. Although it is logically possible that the two students made the same errors by chance, that would seem so unlikely—at least for some types of errors—that the instructor would reject that explanation. These and many similar examples show that people often use the logic of hypothesis testing in the real world; essentially, they do so every time they conclude “that could not just be a coincidence.” Statistical hypothesis testing differs only in that laws of probability—rather than every-day experiences with various coincidences—are used to assess the likelihood that an observed pattern would occur by chance.

Criticisms of NHST’s Logic

According to Berkson (1942) , “There is no logical warrant for considering an event known to occur in a given hypothesis, even if infrequently, as disproving the hypothesis” (p. 326). In terms of our examples, Berkson is saying that it is illogical to consider 3/6 queasy friends as proving that there was something wrong with the potato salad, because it could be just a coincidence. Taken to its logical extreme, his statement implies that observing 48/50 heads should also not be regarded as disproving the hypothesis of a fair coin, because that too could happen by chance. To be sure, Berkson is mathematically correct that the suggested conclusions about the quality of the potato salad and the fairness of the coin do not follow from the observed patterns with the same 100% certainty that implications have in propositional logic (e.g., modus ponens ). On the other hand, it is unrealistic to demand that level of certainty before reaching conclusions from noisy data, because such data will almost never support any interesting conclusions with 100% certainty. In practice, 48/50 heads seems like ample evidence to conclude—with no further assumptions—that a coin must be biased, and the “logical” objection that this could have happened by chance seems rather intransigent. Given that logical certainty is unattainable due to the presence of noise in the data, one can only consider the probabilities of various correct and incorrect decisions (e.g., Type I error rates, power) under various hypothesized conditions, which is exactly what NHST does.

Another long-standing objection to NHST is that its conclusions depend on the probabilities of events that did not actually occur (e.g., Cox, 1958 ; Wagenmakers, 2007 ). For example, in deciding whether 3/6 people feeling queasy was too much of a coincidence, people might be influenced by how often they had seen 4/6, 5/6, or 6/6 people in a group feel queasy by chance, even though only 3/6 had actually been observed. It is difficult to see much practical force to this objection, however. In trying to decide whether a particular pattern is too strong to be observed by chance, it seems quite relevant to consider all of the different patterns that might be observed by chance—especially the patterns that are even stronger. Proponents of this objection generally support it with artificial probability distributions in which stronger patterns are at least as likely to occur by chance as weaker patterns, but such distributions rarely if ever arise in actual research scenarios.

Critics of NHST sometimes claim that its logical form is parallel to that of the argument shown in Table 1 (e.g., Cohen, 1994 ; Pollard & Richardson, 1987 ). There is obviously something wrong with the argument in this table, and NHST must be flawed if it uses the same logic. This criticism is unfounded, however, because the logic illustrated in Table 1 is not parallel to that of NHST.

A Misleading Caricature of Null Hypothesis Significance Testing’s Logical Form.

The argument given in Table 1 suggests that a null hypothesis—in this case, that a person is an American—should be rejected whenever the observed results are unlikely under that hypothesis. NHST requires more than that, however. Implicitly, in order to reject a null hypothesis, NHST requires that the observed results must be more likely under an alternative hypothesis than under the null. In the potato salad example, for instance, rejecting the coincidence explanation requires not only that the observed pattern is unlikely by chance when the potato salad is good, but also that this pattern is more likely when the potato salad is bad (i.e., more likely when the null hypothesis is false than when it is true).

Figure 1 shows how this additional requirement arises within NHST using the Z test as an example. The null hypothesis predicts that the outcome is a draw from the depicted standard normal distribution, and Region A (i.e., the cross-hatched tails) of this distribution represent the Z values for which the null would be rejected at p < .05. Critically, Region B in the middle of the distribution also depicts an area of 5%. If NHST really only required that the rejection region had a probability of 5% under the null hypothesis, as implied by the argument in Table 1 , then rejecting the null for an observation in Region B would be just as appropriate as rejecting it for an observation in Region A. This is not all that NHST requires, however, and in fact outcomes in Region B would not be considered evidence against the null hypothesis. The null hypothesis is rejected for outcomes in A but not for those in B, because of the requirement that an outcome in the rejection region must have higher probability when the null hypothesis is false than when it is true. Region B of Figure 1 clearly does not satisfy this additional requirement, because this area will have a higher probability when the null hypothesis is true than when it is not.

An external file that holds a picture, illustration, etc.
Object name is 10.1177_0013164416667984-fig1.jpg

A standard normal ( Z ) distribution of observed scores under the null hypothesis.

Note . Region A: The two cross-hatched areas indicate the standard two-tailed rejection region—that is, the 5% of the distribution most discrepant from the mean. Region B: The dark shaded area in the middle of the distribution also represents an area of 5%. Under NHST, only observations in the tails are taken as evidence that the null hypothesis should be rejected, even though the probability of an observation in Region B is just as low (i.e., 5%).

Likewise, the example of Table 1 clearly does not satisfy the additional requirement that the observed results should be more likely under some alternative to the null hypothesis. The probability that a person is a member of Congress is lower—not higher—if the person is not an American. In fact, the logic of NHST actually requires a first premise of the form:

  • 1′. If a person is an American, then he is probably not a member of Congress; on the other hand, if he is not an American, then he is more likely to be a member of Congress.

Premise 1′ is obviously false, so the conclusion (3) is obviously not supported within NHST.

Finally, critics of NHST often complain that its conclusions can depend on the sampling methods used to collect the data as well as on the data themselves (e.g., Wagenmakers, 2007 ). This dependence arises because NHST’s assessment of “how likely is such an extreme pattern by chance” depends on the exact probabilities of various outcomes, and these in turn depend on the details of how the sampling was carried out. This is thought to be a problem for NHST, because—according to critics—the conclusion from a data set should depend only on what the data are, but not on the sampling plan used to collect them. This argument begs the question, however. Of course, the assessment of what will happen “by chance” can only be done within a well-defined set of possible outcomes. These outcomes are necessarily determined by the sampling plan, so the plan must influence the assessment of the various patterns’ probabilities. Viewed in this manner, it seems quite reasonable that any conclusion about the presence of an unusual pattern would depend on the sampling plan as well as on the observations themselves.

Ancillary Criticisms of NHST

Additional criticisms have been directed at aspects of NHST other than its logic. For example, it is sometimes claimed that NHST does not address the question of main interest. Critics often assert that researchers “really” want to know the probability that a pattern is coincidental given the data (e.g., Berger & Berry, 1988 ; Cohen, 1994 ; Kline, 2004 ). Within the current examples, then, the claim is that people really want to know “the probability that these 3/6 picnic-goers feel sick by coincidence” or “the probability that the coin is biased towards heads.”

It is clear that NHST does not provide such probabilities, but it is not so clear that everyone always wants them. In many cases, people simply want to decide whether the pure chance explanation is tenable; for example, it is difficult to imagine a picnic-goer asking for a precise probability that the potato salad was bad. In any case, to obtain such probabilities requires knowing all of the other possible explanations, plus their prior probabilities (e.g., Efron, 2013 ). In many situations where NHST is used, the complete set of other possible explanations and their probabilities are simply unknown. In these situations, no statistical method can compute the probability that researchers supposedly want, and it seems unfair to criticize NHST for failing to provide something that cannot be determined with any other technique either.

Surely the most frequent and justified criticisms of NHST revolve around the idea that researchers do not completely understand it (e.g., Batanero, 2000 ; Wainer & Robinson, 2003 ). A number of findings suggest that one aspect of NHST in particular—the so-called “ p value”—is widely misunderstood (e.g., Gelman, 2013 ; Haller & Kraus, 2002 ; Hubbard & Lindsay, 2008 ; Kline, 2004 ). Explicitly or implicitly, such findings are taken as evidence that NHST should be abandoned because it is too difficult to use properly (e.g., Cohen, 1994 ).

Unfortunately, similar data suggest that many other concepts in probability and statistics are also poorly understood (e.g., Campbell, 1974 ). If we abandon all methods based on misunderstood statistical concepts, then almost all statistically based methods will have to go, including some apparently quite practical and important ones (e.g., diagnostic testing in medicine; Gigerenzer, Gaissmaier, Kurz-Milcke, Schwartz, & Woloshin, 2008 ). Within this difficult context, there seems to be no reason to abandon NHST selectively, because there is “no evidence that NHST is misused any more often than any other procedure” ( Wainer & Robinson, 2003 , p. 22). Moreover, if one accepted the argument that all poorly understood methods should be abandoned, then some useful but poorly understood nonstatistical methods would presumably also have to go (e.g., propositional logic; Rips & Marcus, 1977 ; Wason, 1968 ). Surely it would be a mistake to abandon a valuable tool or technique simply because considerable training and effort are required to use it correctly.

The current discussion of frequent false positives and low replicability in research areas using NHST (e.g., Francis, 2012 ; Nosek, Spies, & Motyl, 2012 ; Simmons, Nelson, & Simonsohn, 2011 ) also suggests that there are misunderstandings and misuse of this technique. Specifically, there is evidence that researchers capitalize on flexibility in the selection of their data and in the application of their analyses (i.e., “ p -hacking”) in order to obtain statistically significant and therefore publishable results (e.g., Bakker, Van Dijk, & Wicherts, 2012 ; John, Loewenstein, & Prelec, 2012 ; Tsilidis et al., 2013 ). Such practices are a misuse of NHST, and they inflate positive rates, especially in combination with existing biases toward publication of surprising new findings and with the relative scarcity of such findings within well-studied areas (e.g., Ferguson & Heene, 2012 ; Ioannidis, 2005 ). The false positive problem is not specific to NHST, however; it would arise analogously within any statistical framework. Whatever statistical methods are used to detect new patterns in noisy data, the rate of reporting imaginary patterns (i.e., false positives) will be inflated by flexibility in the selection of the data, flexibility in the application of the methods, and flexibility in the choice of what findings are reported.

To the extent that misunderstanding of NHST presents a problem, better education of researchers seems like the best path toward a solution (e.g., Holland, 2007 ; Kalinowski, Fidler, & Cumming, 2008 ; Leek & Peng, 2015 ). Although the underlying logic of NHST has considerable common sense appeal—as shown by the real-world examples described earlier—this logic is often obscured when the methods are taught to beginners. This is partly because of the specialized and unintuitive terminology that has been developed for NHST (e.g., “null hypothesis,” “Type I error,” “Type II error,” “power”). Another problem is that introductions to NHST nearly always focus primarily on the mathematical formulas used to compute the probabilities of observing various patterns by chance (i.e., “distributions under the null hypothesis”). Students can easily be so confused about the workings of these formulas that they fail to appreciate the simplicity of the underlying logic.

Conclusions

NHST is a useful heuristic for detecting nonrandom patterns, and abandoning it would be counterproductive. Its underlying logic—both in scientific research and in everyday life—is that chance can be rejected as an explanation of observed patterns that would rarely occur by coincidence. It is true that the conclusion of a biased coin does not follow with 100% certainty, and it will be wrong when an unlikely pattern really does occur by chance. Researchers should certainly keep this possibility in mind and resist the tendency to believe that every pattern documented statistically—whether by NHST or any other technique—necessarily reflects the true state of the world. As a practical strategy for detecting non-random patterns in a noisy world, however, it seems quite a reasonable heuristic to conclude tentatively that something other than chance is responsible for systematic observed patterns.

While NHST is extremely useful for deciding whether patterns might have arisen by chance, it is, of course, not the only useful statistical technique. In fact, when NHST is employed, “the answer to the significance test is rarely the only thing we should consider” ( Cox, 1958 , p. 367), so it is not sufficient for researchers to try to answer all research questions entirely within the NHST framework. For example, NHST is not appropriate for evaluating how strongly a data set supports a null hypothesis (e.g., Grant, 1962 ). For that purpose, it is better to use confidence intervals or Bayesian techniques (e.g., Cumming & Fidler, 2009 ; Rouder, Speckman, Sun, Morey, & Iverson, 2009 ; Wainer & Robinson, 2003 ; Wetzels, Raaijmakers, Jakab, & Wagenmakers, 2009 ). Fortunately, there is no fundamental limit on the number of statistical tools that researchers can use. Researchers should always use the set of tools most suitable for the questions under consideration. In many cases, that set will include NHST.

Acknowledgments

I thank Scott Brown, Patricia Haden, Wolf Schwarz, and two anonymous reviewers for constructive comments on earlier versions of the article.

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Preparation of this article was supported by a research award from the Alexander von Humboldt Foundation.

  • Bakker M., Van Dijk A., Wicherts J. M. (2012). The rules of the game called psychological science . Perspectives on Psychological Science , 7 , 543-554. doi: 10.1177/1745691612459060 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Batanero C. (2000). Controversies around the role of statistical tests in experimental research . Mathematical Thinking and Learning , 2 , 75-97. doi: 10.1207/S15327833MTL0202_4 [ CrossRef ] [ Google Scholar ]
  • Berger J. O., Berry D. A. (1988). Statistical analysis and the illusion of objectivity . American Scientist , 76 , 159-165. [ Google Scholar ]
  • Berkson J. (1942). Tests of significance considered as evidence . Journal of the American Statistical Association , 37 , 325-335. doi: 10.1080/01621459.1942.10501760 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Campbell S. K. (1974). Flaws and fallacies in statistical thinking . Englewood Cliffs, NJ: Prentice-Hall. [ Google Scholar ]
  • Cohen J. (1994). The earth is round ( p < .05) . American Psychologist , 49 , 997-1003. doi: 10.1037//0003-066X.49.12.997 [ CrossRef ] [ Google Scholar ]
  • Cox D. R. (1958). Some problems connected with statistical inference . Annals of Mathematical Statistics , 29 , 357-372. doi: 10.1214/aoms/1177706618 [ CrossRef ] [ Google Scholar ]
  • Cumming G., Fidler F. (2009). Confidence intervals: Better answers to better questions . Zeitschrift für Psychologie , 217 , 15-26. doi: 10.1027/0044-3409.217.1.15 [ CrossRef ] [ Google Scholar ]
  • Efron B. (2013). Bayes’ theorem in the 21st century . Science , 340 , 1177-1178. doi: 10.1126/science.1236536 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Ferguson C. J., Heene M. (2012). A vast graveyard of undead theories: Publication bias and psychological science’s aversion to the null . Perspectives on Psychological Science , 7 , 555-561. doi: 10.1177/1745691612459059 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Francis G. (2012). Publication bias and the failure of replication in experimental psychology . Psychonomic Bulletin & Review , 19 , 975-991. doi: 10.3758/s13423-012-0322-y [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Gelman A. (2013). Commentary: P values and statistical practice . Epidemiology , 24 , 69-72. [ PubMed ] [ Google Scholar ]
  • Gigerenzer G., Gaissmaier W., Kurz-Milcke E., Schwartz L. M., Woloshin S. (2008). Helping doctors and patients make sense of health statistics . Psychological Science in the Public Interest , 8 , 53-96. doi: 10.1111/j.1539-6053.2008.00033.x [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Gigerenzer G., Marewski J. N. (2015). Surrogate science: The idol of a universal method for scientific inference . Journal of Management , 41 , 421-440. doi: 10.1177/0149206314547522 [ CrossRef ] [ Google Scholar ]
  • Grant D. A. (1962). Testing the null hypothesis and the strategy and tactics of investigating theoretical models . Psychological Review , 69 , 54-61. doi: 10.1037/h0038813 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Haller H., Kraus S. (2002). Misinterpretations of significance: A problem students share with their teachers? Methods of Psychological Research , 7 , 1-20. [ Google Scholar ]
  • Holland B. K. (2007). A classroom demonstration of hypothesis testing . Teaching Statistics , 29 , 71-73. doi: 10.1111/j.1467-9639.2007.00269.x [ CrossRef ] [ Google Scholar ]
  • Hubbard R., Lindsay R. M. (2008). Why p values are not a useful measure of evidence in statistical significance testing . Theory & Psychology , 18 , 69-88. doi: 10.1177/0959354307086923 [ CrossRef ] [ Google Scholar ]
  • Ioannidis J. P. A. (2005). Why most published research findings are false . PLoS Medicine , 2 ( 8 ), e124. doi: 10.1371/journal.pmed.0020124 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • John L. K., Loewenstein G., Prelec D. (2012). Measuring the prevalence of questionable research practices with incentives for truth-telling . Psychological Science , 23 , 524-532. doi: 10.1177/0956797611430953 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kalinowski P., Fidler F., Cumming G. (2008). Overcoming the inverse probability fallacy: A comparison of two teaching interventions . Methodology: European Journal of Research Methods for the Behavioral and Social Sciences , 4 , 152-158. [ Google Scholar ]
  • Kline R. B. (2004). Beyond significance testing: Reforming data analysis methods in behavioral research . Washington, DC: American Psychological Association. [ Google Scholar ]
  • Leek J. T., Peng R. D. (2015). P values are just the tip of the iceberg . Nature , 520 , 612. [ PubMed ] [ Google Scholar ]
  • Nosek B. A., Spies J. R., Motyl M. (2012). Scientific utopia II. Restructuring incentives and practices to promote truth over publishability . Perspectives on Psychological Science , 7 , 615-631. [ Google Scholar ]
  • Pollard P., Richardson J. T. E. (1987). On the probability of making Type I errors . Psychological Bulletin , 102 , 159-163. doi: 10.1037/0033-2909.102.1.159 [ CrossRef ] [ Google Scholar ]
  • Rips L. J., Marcus S. L. (1977). Suppositions and the analysis of conditional sentences . In Just M. A., Carpenter P. A. (Eds.), Cognitive processes in comprehension (pp. 185-220). Hillsdale, NJ: Lawrence Erlbaum. [ Google Scholar ]
  • Rouder J. N., Speckman P. L., Sun D., Morey R. D., Iverson G. (2009). Bayesian t tests for accepting and rejecting the null hypothesis . Psychonomic Bulletin & Review , 16 , 225-237. doi: 10.3758/PBR.16.2.225 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Simmons J. P., Nelson L. D., Simonsohn U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant . Psychological Science , 22 , 1359-1366. doi: 10.1177/0956797611417632 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Trafimow D., Marks M. (2015). Editorial . Basic and Applied Social Psychology , 37 ( 1 ), 1-2. doi: 10.1080/01973533.2015.1012991 [ CrossRef ] [ Google Scholar ]
  • Tsilidis K. K., Panagiotou O. A., Sena E. S., Aretouli E., Evangelou E., Howells D. W., … Ioannidis J. P. A. (2013). Evaluation of excess significance bias in animal studies of neurological diseases . PLoS Biology , 11 ( 7 ), e1001609. doi: 10.1371/journal.pbio.1001609 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wagenmakers E. J. (2007). A practical solution to the pervasive problems of p values . Psychonomic Bulletin & Review , 14 , 779-804. [ PubMed ] [ Google Scholar ]
  • Wainer H., Robinson D. H. (2003). Shaping up the practice of null hypothesis significance testing . Educational Researcher , 32 , 22-30. doi: 10.3102/0013189X032007022 [ CrossRef ] [ Google Scholar ]
  • Wason P. C. (1968). Reasoning about a rule . Quarterly Journal of Experimental Psychology , 20 , 273-281. doi: 10.1080/14640746808400161 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wetzels R., Raaijmakers J. G. W., Jakab E., Wagenmakers E. J. (2009). How to quantify support for and against the null hypothesis: A flexible WinBUGS implementation of a default Bayesian t test . Psychonomic Bulletin & Review , 16 , 752-760. doi: 10.3758/PBR.16.4.752 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wilkinson L., & the Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations . American Psychologist , 54 , 594-604. doi: 10.1037/0003-066X.54.8.594 [ CrossRef ] [ Google Scholar ]
  • Ziliak S. T., McCloskey D. (2008). The cult of statistical significance: How the standard error costs us jobs, justice, and lives . Ann Arbor: University of Michigan Press. [ Google Scholar ]

Tutorial Playlist

Statistics tutorial, everything you need to know about the probability density function in statistics, the best guide to understand central limit theorem, an in-depth guide to measures of central tendency : mean, median and mode, the ultimate guide to understand conditional probability.

A Comprehensive Look at Percentile in Statistics

The Best Guide to Understand Bayes Theorem

Everything you need to know about the normal distribution, an in-depth explanation of cumulative distribution function, a complete guide to chi-square test, a complete guide on hypothesis testing in statistics, understanding the fundamentals of arithmetic and geometric progression, the definitive guide to understand spearman’s rank correlation, a comprehensive guide to understand mean squared error, all you need to know about the empirical rule in statistics, the complete guide to skewness and kurtosis, a holistic look at bernoulli distribution.

All You Need to Know About Bias in Statistics

A Complete Guide to Get a Grasp of Time Series Analysis

The Key Differences Between Z-Test Vs. T-Test

The Complete Guide to Understand Pearson's Correlation

A complete guide on the types of statistical studies, everything you need to know about poisson distribution, your best guide to understand correlation vs. regression, the most comprehensive guide for beginners on what is correlation, what is hypothesis testing in statistics types and examples.

Lesson 10 of 24 By Avijeet Biswal

A Complete Guide on Hypothesis Testing in Statistics

Table of Contents

In today’s data-driven world , decisions are based on data all the time. Hypothesis plays a crucial role in that process, whether it may be making business decisions, in the health sector, academia, or in quality improvement. Without hypothesis & hypothesis tests, you risk drawing the wrong conclusions and making bad decisions. In this tutorial, you will look at Hypothesis Testing in Statistics.

What Is Hypothesis Testing in Statistics?

Hypothesis Testing is a type of statistical analysis in which you put your assumptions about a population parameter to the test. It is used to estimate the relationship between 2 statistical variables.

Let's discuss few examples of statistical hypothesis from real-life - 

  • A teacher assumes that 60% of his college's students come from lower-middle-class families.
  • A doctor believes that 3D (Diet, Dose, and Discipline) is 90% effective for diabetic patients.

Now that you know about hypothesis testing, look at the two types of hypothesis testing in statistics.

Hypothesis Testing Formula

Z = ( x̅ – μ0 ) / (σ /√n)

  • Here, x̅ is the sample mean,
  • μ0 is the population mean,
  • σ is the standard deviation,
  • n is the sample size.

How Hypothesis Testing Works?

An analyst performs hypothesis testing on a statistical sample to present evidence of the plausibility of the null hypothesis. Measurements and analyses are conducted on a random sample of the population to test a theory. Analysts use a random population sample to test two hypotheses: the null and alternative hypotheses.

The null hypothesis is typically an equality hypothesis between population parameters; for example, a null hypothesis may claim that the population means return equals zero. The alternate hypothesis is essentially the inverse of the null hypothesis (e.g., the population means the return is not equal to zero). As a result, they are mutually exclusive, and only one can be correct. One of the two possibilities, however, will always be correct.

Your Dream Career is Just Around The Corner!

Your Dream Career is Just Around The Corner!

Null Hypothesis and Alternate Hypothesis

The Null Hypothesis is the assumption that the event will not occur. A null hypothesis has no bearing on the study's outcome unless it is rejected.

H0 is the symbol for it, and it is pronounced H-naught.

The Alternate Hypothesis is the logical opposite of the null hypothesis. The acceptance of the alternative hypothesis follows the rejection of the null hypothesis. H1 is the symbol for it.

Let's understand this with an example.

A sanitizer manufacturer claims that its product kills 95 percent of germs on average. 

To put this company's claim to the test, create a null and alternate hypothesis.

H0 (Null Hypothesis): Average = 95%.

Alternative Hypothesis (H1): The average is less than 95%.

Another straightforward example to understand this concept is determining whether or not a coin is fair and balanced. The null hypothesis states that the probability of a show of heads is equal to the likelihood of a show of tails. In contrast, the alternate theory states that the probability of a show of heads and tails would be very different.

Become a Data Scientist with Hands-on Training!

Become a Data Scientist with Hands-on Training!

Hypothesis Testing Calculation With Examples

Let's consider a hypothesis test for the average height of women in the United States. Suppose our null hypothesis is that the average height is 5'4". We gather a sample of 100 women and determine that their average height is 5'5". The standard deviation of population is 2.

To calculate the z-score, we would use the following formula:

z = ( x̅ – μ0 ) / (σ /√n)

z = (5'5" - 5'4") / (2" / √100)

z = 0.5 / (0.045)

 We will reject the null hypothesis as the z-score of 11.11 is very large and conclude that there is evidence to suggest that the average height of women in the US is greater than 5'4".

Steps of Hypothesis Testing

Step 1: specify your null and alternate hypotheses.

It is critical to rephrase your original research hypothesis (the prediction that you wish to study) as a null (Ho) and alternative (Ha) hypothesis so that you can test it quantitatively. Your first hypothesis, which predicts a link between variables, is generally your alternate hypothesis. The null hypothesis predicts no link between the variables of interest.

Step 2: Gather Data

For a statistical test to be legitimate, sampling and data collection must be done in a way that is meant to test your hypothesis. You cannot draw statistical conclusions about the population you are interested in if your data is not representative.

Step 3: Conduct a Statistical Test

Other statistical tests are available, but they all compare within-group variance (how to spread out the data inside a category) against between-group variance (how different the categories are from one another). If the between-group variation is big enough that there is little or no overlap between groups, your statistical test will display a low p-value to represent this. This suggests that the disparities between these groups are unlikely to have occurred by accident. Alternatively, if there is a large within-group variance and a low between-group variance, your statistical test will show a high p-value. Any difference you find across groups is most likely attributable to chance. The variety of variables and the level of measurement of your obtained data will influence your statistical test selection.

Step 4: Determine Rejection Of Your Null Hypothesis

Your statistical test results must determine whether your null hypothesis should be rejected or not. In most circumstances, you will base your judgment on the p-value provided by the statistical test. In most circumstances, your preset level of significance for rejecting the null hypothesis will be 0.05 - that is, when there is less than a 5% likelihood that these data would be seen if the null hypothesis were true. In other circumstances, researchers use a lower level of significance, such as 0.01 (1%). This reduces the possibility of wrongly rejecting the null hypothesis.

Step 5: Present Your Results 

The findings of hypothesis testing will be discussed in the results and discussion portions of your research paper, dissertation, or thesis. You should include a concise overview of the data and a summary of the findings of your statistical test in the results section. You can talk about whether your results confirmed your initial hypothesis or not in the conversation. Rejecting or failing to reject the null hypothesis is a formal term used in hypothesis testing. This is likely a must for your statistics assignments.

Types of Hypothesis Testing

To determine whether a discovery or relationship is statistically significant, hypothesis testing uses a z-test. It usually checks to see if two means are the same (the null hypothesis). Only when the population standard deviation is known and the sample size is 30 data points or more, can a z-test be applied.

A statistical test called a t-test is employed to compare the means of two groups. To determine whether two groups differ or if a procedure or treatment affects the population of interest, it is frequently used in hypothesis testing.

Chi-Square 

You utilize a Chi-square test for hypothesis testing concerning whether your data is as predicted. To determine if the expected and observed results are well-fitted, the Chi-square test analyzes the differences between categorical variables from a random sample. The test's fundamental premise is that the observed values in your data should be compared to the predicted values that would be present if the null hypothesis were true.

Hypothesis Testing and Confidence Intervals

Both confidence intervals and hypothesis tests are inferential techniques that depend on approximating the sample distribution. Data from a sample is used to estimate a population parameter using confidence intervals. Data from a sample is used in hypothesis testing to examine a given hypothesis. We must have a postulated parameter to conduct hypothesis testing.

Bootstrap distributions and randomization distributions are created using comparable simulation techniques. The observed sample statistic is the focal point of a bootstrap distribution, whereas the null hypothesis value is the focal point of a randomization distribution.

A variety of feasible population parameter estimates are included in confidence ranges. In this lesson, we created just two-tailed confidence intervals. There is a direct connection between these two-tail confidence intervals and these two-tail hypothesis tests. The results of a two-tailed hypothesis test and two-tailed confidence intervals typically provide the same results. In other words, a hypothesis test at the 0.05 level will virtually always fail to reject the null hypothesis if the 95% confidence interval contains the predicted value. A hypothesis test at the 0.05 level will nearly certainly reject the null hypothesis if the 95% confidence interval does not include the hypothesized parameter.

Simple and Composite Hypothesis Testing

Depending on the population distribution, you can classify the statistical hypothesis into two types.

Simple Hypothesis: A simple hypothesis specifies an exact value for the parameter.

Composite Hypothesis: A composite hypothesis specifies a range of values.

A company is claiming that their average sales for this quarter are 1000 units. This is an example of a simple hypothesis.

Suppose the company claims that the sales are in the range of 900 to 1000 units. Then this is a case of a composite hypothesis.

One-Tailed and Two-Tailed Hypothesis Testing

The One-Tailed test, also called a directional test, considers a critical region of data that would result in the null hypothesis being rejected if the test sample falls into it, inevitably meaning the acceptance of the alternate hypothesis.

In a one-tailed test, the critical distribution area is one-sided, meaning the test sample is either greater or lesser than a specific value.

In two tails, the test sample is checked to be greater or less than a range of values in a Two-Tailed test, implying that the critical distribution area is two-sided.

If the sample falls within this range, the alternate hypothesis will be accepted, and the null hypothesis will be rejected.

Become a Data Scientist With Real-World Experience

Become a Data Scientist With Real-World Experience

Right Tailed Hypothesis Testing

If the larger than (>) sign appears in your hypothesis statement, you are using a right-tailed test, also known as an upper test. Or, to put it another way, the disparity is to the right. For instance, you can contrast the battery life before and after a change in production. Your hypothesis statements can be the following if you want to know if the battery life is longer than the original (let's say 90 hours):

  • The null hypothesis is (H0 <= 90) or less change.
  • A possibility is that battery life has risen (H1) > 90.

The crucial point in this situation is that the alternate hypothesis (H1), not the null hypothesis, decides whether you get a right-tailed test.

Left Tailed Hypothesis Testing

Alternative hypotheses that assert the true value of a parameter is lower than the null hypothesis are tested with a left-tailed test; they are indicated by the asterisk "<".

Suppose H0: mean = 50 and H1: mean not equal to 50

According to the H1, the mean can be greater than or less than 50. This is an example of a Two-tailed test.

In a similar manner, if H0: mean >=50, then H1: mean <50

Here the mean is less than 50. It is called a One-tailed test.

Type 1 and Type 2 Error

A hypothesis test can result in two types of errors.

Type 1 Error: A Type-I error occurs when sample results reject the null hypothesis despite being true.

Type 2 Error: A Type-II error occurs when the null hypothesis is not rejected when it is false, unlike a Type-I error.

Suppose a teacher evaluates the examination paper to decide whether a student passes or fails.

H0: Student has passed

H1: Student has failed

Type I error will be the teacher failing the student [rejects H0] although the student scored the passing marks [H0 was true]. 

Type II error will be the case where the teacher passes the student [do not reject H0] although the student did not score the passing marks [H1 is true].

Level of Significance

The alpha value is a criterion for determining whether a test statistic is statistically significant. In a statistical test, Alpha represents an acceptable probability of a Type I error. Because alpha is a probability, it can be anywhere between 0 and 1. In practice, the most commonly used alpha values are 0.01, 0.05, and 0.1, which represent a 1%, 5%, and 10% chance of a Type I error, respectively (i.e. rejecting the null hypothesis when it is in fact correct).

Future-Proof Your AI/ML Career: Top Dos and Don'ts

Future-Proof Your AI/ML Career: Top Dos and Don'ts

A p-value is a metric that expresses the likelihood that an observed difference could have occurred by chance. As the p-value decreases the statistical significance of the observed difference increases. If the p-value is too low, you reject the null hypothesis.

Here you have taken an example in which you are trying to test whether the new advertising campaign has increased the product's sales. The p-value is the likelihood that the null hypothesis, which states that there is no change in the sales due to the new advertising campaign, is true. If the p-value is .30, then there is a 30% chance that there is no increase or decrease in the product's sales.  If the p-value is 0.03, then there is a 3% probability that there is no increase or decrease in the sales value due to the new advertising campaign. As you can see, the lower the p-value, the chances of the alternate hypothesis being true increases, which means that the new advertising campaign causes an increase or decrease in sales.

Why is Hypothesis Testing Important in Research Methodology?

Hypothesis testing is crucial in research methodology for several reasons:

  • Provides evidence-based conclusions: It allows researchers to make objective conclusions based on empirical data, providing evidence to support or refute their research hypotheses.
  • Supports decision-making: It helps make informed decisions, such as accepting or rejecting a new treatment, implementing policy changes, or adopting new practices.
  • Adds rigor and validity: It adds scientific rigor to research using statistical methods to analyze data, ensuring that conclusions are based on sound statistical evidence.
  • Contributes to the advancement of knowledge: By testing hypotheses, researchers contribute to the growth of knowledge in their respective fields by confirming existing theories or discovering new patterns and relationships.

Limitations of Hypothesis Testing

Hypothesis testing has some limitations that researchers should be aware of:

  • It cannot prove or establish the truth: Hypothesis testing provides evidence to support or reject a hypothesis, but it cannot confirm the absolute truth of the research question.
  • Results are sample-specific: Hypothesis testing is based on analyzing a sample from a population, and the conclusions drawn are specific to that particular sample.
  • Possible errors: During hypothesis testing, there is a chance of committing type I error (rejecting a true null hypothesis) or type II error (failing to reject a false null hypothesis).
  • Assumptions and requirements: Different tests have specific assumptions and requirements that must be met to accurately interpret results.

After reading this tutorial, you would have a much better understanding of hypothesis testing, one of the most important concepts in the field of Data Science . The majority of hypotheses are based on speculation about observed behavior, natural phenomena, or established theories.

If you are interested in statistics of data science and skills needed for such a career, you ought to explore Simplilearn’s Post Graduate Program in Data Science.

If you have any questions regarding this ‘Hypothesis Testing In Statistics’ tutorial, do share them in the comment section. Our subject matter expert will respond to your queries. Happy learning!

1. What is hypothesis testing in statistics with example?

Hypothesis testing is a statistical method used to determine if there is enough evidence in a sample data to draw conclusions about a population. It involves formulating two competing hypotheses, the null hypothesis (H0) and the alternative hypothesis (Ha), and then collecting data to assess the evidence. An example: testing if a new drug improves patient recovery (Ha) compared to the standard treatment (H0) based on collected patient data.

2. What is hypothesis testing and its types?

Hypothesis testing is a statistical method used to make inferences about a population based on sample data. It involves formulating two hypotheses: the null hypothesis (H0), which represents the default assumption, and the alternative hypothesis (Ha), which contradicts H0. The goal is to assess the evidence and determine whether there is enough statistical significance to reject the null hypothesis in favor of the alternative hypothesis.

Types of hypothesis testing:

  • One-sample test: Used to compare a sample to a known value or a hypothesized value.
  • Two-sample test: Compares two independent samples to assess if there is a significant difference between their means or distributions.
  • Paired-sample test: Compares two related samples, such as pre-test and post-test data, to evaluate changes within the same subjects over time or under different conditions.
  • Chi-square test: Used to analyze categorical data and determine if there is a significant association between variables.
  • ANOVA (Analysis of Variance): Compares means across multiple groups to check if there is a significant difference between them.

3. What are the steps of hypothesis testing?

The steps of hypothesis testing are as follows:

  • Formulate the hypotheses: State the null hypothesis (H0) and the alternative hypothesis (Ha) based on the research question.
  • Set the significance level: Determine the acceptable level of error (alpha) for making a decision.
  • Collect and analyze data: Gather and process the sample data.
  • Compute test statistic: Calculate the appropriate statistical test to assess the evidence.
  • Make a decision: Compare the test statistic with critical values or p-values and determine whether to reject H0 in favor of Ha or not.
  • Draw conclusions: Interpret the results and communicate the findings in the context of the research question.

4. What are the 2 types of hypothesis testing?

  • One-tailed (or one-sided) test: Tests for the significance of an effect in only one direction, either positive or negative.
  • Two-tailed (or two-sided) test: Tests for the significance of an effect in both directions, allowing for the possibility of a positive or negative effect.

The choice between one-tailed and two-tailed tests depends on the specific research question and the directionality of the expected effect.

5. What are the 3 major types of hypothesis?

The three major types of hypotheses are:

  • Null Hypothesis (H0): Represents the default assumption, stating that there is no significant effect or relationship in the data.
  • Alternative Hypothesis (Ha): Contradicts the null hypothesis and proposes a specific effect or relationship that researchers want to investigate.
  • Nondirectional Hypothesis: An alternative hypothesis that doesn't specify the direction of the effect, leaving it open for both positive and negative possibilities.

Find our Data Analyst Online Bootcamp in top cities:

About the author.

Avijeet Biswal

Avijeet is a Senior Research Analyst at Simplilearn. Passionate about Data Analytics, Machine Learning, and Deep Learning, Avijeet is also interested in politics, cricket, and football.

Recommended Resources

The Key Differences Between Z-Test Vs. T-Test

Free eBook: Top Programming Languages For A Data Scientist

Normality Test in Minitab: Minitab with Statistics

Normality Test in Minitab: Minitab with Statistics

A Comprehensive Look at Percentile in Statistics

Machine Learning Career Guide: A Playbook to Becoming a Machine Learning Engineer

  • PMP, PMI, PMBOK, CAPM, PgMP, PfMP, ACP, PBA, RMP, SP, and OPM3 are registered marks of the Project Management Institute, Inc.

Understand Hypothesis testing with real-life examples

hypothesis-testing-real-life-example

If you are learning or working in the field of Data Science, you should have clear understanding of what are hypothesis testing and p values in statistics. If you learn those two things then understanding any basic machine learning algorithms like linear regression, and logistic regression will be easier for you. In this tutorial I will explain hypothesis testing with real life examples.

Many people have lots of confusion about hypothesis testing and p value. In this tutorial, I will explain you what is hypothesis testing in statistics with a very simple and real life example.

What is Hypothesis

Hypothesis testing is a statistical method to evaluate or investigate the validity of a claim or assumption about something based on sample data.

In simple English, meaning of Hypothesis is guess . Now a guess can be anything like: With a ladder tall enough, I could touch the sky or If I had wings, I could fly to the moon, If I had a time machine, I could travel to any point in history, etc.

Some guesses are just imagination, you can not prove or investigate them with anything. If we can investigate or test any guesses with data, that test is called Hypothesis testing in statistics.

Types of Hypothesis Testing

At this point, we know what is hypothesis testing and its use. Now there are two types of hypotheses:

  • Null Hypothesis ( H 0 )
  • Alternative Hypothesis ( H 1 )

Null Hypothesis

In general programming language, we denote Null as empty or zero. In statistics, Null hypothesis means default hypothesis , the hypothesis or guess which is already established for a certain question.

Why it is called “Null”? We call it Null Hypothesis because we always want to nullify the existing theory with something else. This is why we perform any test right? If we do not want to prove something new then why should we put effort into doing any new test?

In statistics, Null Hypothesis is denoted as H 0

Alternative Hypothesis

The alternative hypothesis (H 1 ) is the opposite of the null hypothesis. The alternative hypothesis is the hypothesis that we want to or try to prove .

For example, previously, people believed that the sun is rotating around the Earth until Galileo, an Italian astronomer, proved through his observations that the Earth is actually revolving around the sun . This discovery challenged the widely accepted belief of geocentrism, which people used to believe for centuries.

Here previously believed theory (the Sun is rotating around the Earth) is the Null Hypothesis and the theory Galileo wanted to prove (Earth is rotating around the Sun) is the Alternative Hypothesis. Here the test or observation Galileo perform can be called a Hypothesis testing.

In statistics, Alternative Hypothesis is denoted as H 1

Hypothesis testing example

Let me now give your rela life example where you can implement hypothesis testing as a data science professional. Let’s say for a particular candy manufacturing company, it is believed that a candy machine makes chocolate bars that are an average of 5 grams . That candy manufacturing plant has been making chocolate bars for 10 years now suddenly a worker claims that the machine no longer makes 5 grams of each chocolate bar .

real-life-application-of-hypothesis-testing

So is the worker wrong or the machine is not functioning properly? To answer those questions the owner of that candy company should perform Hypothesis testing.

Here the Null Hypothesis is the statement “The machine is functioning correctly and making each chocolate bar 5 grams” and the alternative hypothesis is the statement of the worker “The machine no longer makes 5 grams of each chocolate bar”.

If we want to write above English text into math then we can write like below:

hypothesis-testing-in-statistics

Pass outcome of Hypothesis testing

There are only two outcomes of hypothesis testing in statistics those are:

  • Reject Null Hypothesis (H 0 ) – the machine is producing each chocolate bar with 5g of weight
  • Fail to reject null hypothesis – In other words, alternative hypothesis (H 1 ) is proven or new theory is established

Application of Hypothesis Testing

Like the above example, there are so many places we can implement hypothesis testing in real life problems. Let’s also see some other important areas where we can apply hypothesis testing.

  • Medicine: Hypothesis testing used to determine effectiveness of new treatments or drugs.
  • Marketing: You can understand the effectiveness of advertising campaigns or new product launches.
  • Quality Control: Find out whether a manufacturing process is producing products within acceptable quality limits. This is the example I explained above.
  • Environmental Science: You can test the effects of pollutants on ecosystems using hypothesis testing

Terminologies to know

Before calculating hypothesis testing you should know some statistical terminologies, let’s understand those.

Statistical Significance

Statistically Significant is a term used to describe whether a result from a statistical test or analysis is likely to be real or just by chance. It is the point where we can draw the line to make a decision. Again the decision is whether we are rejecting the null hypothesis or not.

Level of Confidence

The Level of Confidence (c) says that, how much confident are we when taking our decision to reject null hypothesis or fail to reject the null hypothesis. Level of confidence is measured as a probability value. Now when you are trying to prove something new (rejecting null hypo), you must have high confidence in your test so that people or statistics can believe in your new theory. Good confidence level is more than 95%.

Level of Significance

Level of Significance is a probability value used in hypothesis testing to determine if the results of a statistical test are statistically significant or not. This level of significance is also known as the p-value, which I will explain in a later post. The equation of LOS is as follows:

Level of Significance = 1 – Level of confidence

Now if for some test, level of confidence = 95%

Then, Level of Significance = 1 – 95 = 0.05

Similarly, if LOC = 99% then LOS = 0.01.

You can see sum of Level of Significance and Level of confidence is equal to 1 ( LOS + LOC = 1 ). The sum should always be 1.

Good level of significance score is less than 0.05 (which is greater than 95% of LOS).

Level of Confidence vs Level of Significance

In general, there is no difference between level of confidence and level of significance. They are both the same thing. Different problems or research papers may use different parameters to indicate significance. But both will tell you the same thing, how sure are you to make any statistical decision?

Degrees of Freedom

In statistics, degrees of freedom (df) is the number of values in a sample that are free to vary after imposing certain restrictions or constraints.

In simple words, Degrees of freedom represents the number of independent observations or data points that can be used to estimate statistical parameters or test hypotheses. The more degrees of freedom available, the more reliable and accurate the statistical analysis is likely to be.

The equation for degrees of freedom is as follows:

DF = number of sample data (n) – 1

Hypothesis Testing Formula

Before we calculate hypothesis testing in hand from scratch, we should know the equation to test hypothesis. There are mainly two ways to calculate hypothesis those are t-test and z-test .

t-test equation

t-test equation is: t = ( x̅ – μ 0 ) / (s / √n)

  • Where x̅ is sample mean
  • μ 0 is already proven value (you can say null hypothesis value)
  • s sample standard deviation
  • n is the sample size

z-test equation

z-test equation is: Z = ( x̅ – μ0 ) / (σ / √n)

  • σ is the population standard deviation

Difference between sample SD and population SD

For t-test, I mentioned sample standard deviation and for z-test, I mentioned population standard deviation. Now, what is the difference between sample SD and population SD?

The main difference between sample standard deviation (SD) and population standard deviation (SD) is the data set they are calculated from.

Sample standard deviation (SD) is calculated from a subset of the population data called a sample. A sample is a smaller group of data selected from the entire population for statistical analysis.

On the other hand Population standard deviation (SD) is calculated from the entire population or entire data set.

Z-Test vs. T-Test: When to Use Which?

The z-test and t-test are both statistical tests used to calculate hypothesis testing. The z-test is used when the population standard deviation is known and the sample size is large (typically n > 30).

On the other hand, t-test is used when the population standard deviation is unknown, and the sample size is small (typically n < 30).

In general, if the sample size is large and the population standard deviation is known then the z-test is preferred. If the sample size is small and the population standard deviation is unknown, then the t-test is more appropriate (you can use sample standard deviation).

How to calculate Hypothesis Testing in Excel

So we understand what is hypothesis. Now how to calculate it? or how do you actually determine if you reject the null hypothesis or not? For that, you need to perform some statistical test using some sample data.

Data Collection

Coming back to our main example of a candy-making factory where a worker suddenly made a statement that “the candy making machine is no longer makes 5 grams of chocolate bars”.

Now, this is a big and scary statement for the owner of that factory. So the owner assigned a quality inspector to inspect this.

The quality inspector checked the weight of ten chocolate randomly . I mentioned randomly which means the sample data should be normally distributed . Now the quality inspector can bring this randomness in various ways like he can check chocolate weight at different time of a day or week or check weight while different worker is working etc.

Now let’s say the quality inspector collected following ten random samples:

Selecting Test Type

As I explained earlier since the data or samle size is less and we do not know the population standard deviation so we will use T-test to calculate hypothesis for this problem .

Note: To solve almost all real life problem, we use t-test to calculate hypothesis testing because of lack of overall information (unknown population standard deviation).

Calculate required Parameters

So to apply t-test, now we need to calculate sample standard distribution for our example. You can do it easily in Excel. Now in excel, there are mainly two formulas for standard deviation. Those are:

  • STDEV.S => To calculate standard deviation for sample
  • STDEV.P +> To calculate population standard deviation

We can find our sample standard deviation value is 0.171 (using STDEV.S method in excel)

An our sample mean (x̅) = (5.2 + 4.8 + 5 + 4.9 + 5.2 + 4.7 + 5 + 4.8 + 4.9 + 5.1) / 10 = 4.96

So let’s now list down all required information to calculate hypothesis testing for our problem (measure likelihood of the worker’s claim). Informations are:

  • Already proven value (μ 0 ) = 5 grams
  • Sample standard deviation (s) = 0.171
  • Number of samples collected (n) = 10
  • Degrees of Freedom (DF) = (n-1) = 9
  • Sample mean (x̅) = 4.96

Assuming that the company’s claim of a chocolate bar weight being 5 grams each is true, We need to prove that:

  • Either H0: μ=μ0 (null hypothesis is true – fail to reject null hypothesis – worker’s claimed information is not correct)
  • Or H 1 : μ≠μ0 (Rejecting null hypothesis – worker’s claim is correct)

Draw distribution graph

So our sample mean at 4.96 and our sample standard deviation at 0.171. Now if you draw the distribution of the possible values of our sample mean given the population mean is 5 (already proven weight of a chocolate bar which is 5 grams each). Assuming the null hypothesis is true, below would be the distribution:

standard-normal-distribution-gaussian-bell-graph-curve-concept-hypothesis-testing-gauss-distribution

So basically this hypothesis test is really going to be asking how extreme is this value of 4.96 (sample mean x̅) if the null hypothesis is true. Is it too extreme for us to accept null hypothesis or closer to population mean (already proven value 5) to reject null hypothesis? To answer this question we need to calculate a test statistic here using all provided or known informations.

Apply T test

Applying the T-test formula:

t = ( x̅ – μ 0 ) / (s / √n)

t = (4.96 – 5) / (0.171 / √ 10) = -0.04/0.054 = -0.74074

Decision Rule

So we have our T score which is -0.74074. Now the next thing we need to do is to consider our decision rule. Like normal distribution, T distribution is also a bell curve type shape. The only difference is that the T distribution is defined by its number of degrees of freedom. In our example, degrees of freedom (DF) is 9.

So if degree of freedom is 9 and if we want 95% level of confidence or 0.05 level of significance, we can reject null hypothesis if our calculated T score lies in the green region on the negative side.

t-distribution-bell-curve-for-hypothesis-testing-with-005-level-of-significance

Output of Hypothesis Testing

Now from the T Critical values table, we can find that if degrees of freedom = 9 and Level of significance = 0.05, then T score should be greater than -1.833 (we will keep this value negative because our T score is negative) then only we can reject the null hypothesis.

Note: This is just a one-tailed test as we are only looking at one direction from our expected value (in the negative direction). If we were looking for both the direction of distribution bell curve then we could call it a two-tailed test.

Below is the T Critical values table for your reference.

T distribution table

The calculated T score for our example is -0.74074 > -1.833, therefore the null hypothesis is true (we failed to reject the null hypothesis). That means the claim by the worker is not true . The candy machine is working properly and it is still producing each chocolate bar around 5 grams.

Graph interpretation

We can see the above interpretation in the below T distribution bell curve:

output-decision-of-hypothesis-testing-calculation-in-hand-and-excel-from-scratch

In statistics, we should always test a hypothesis to make a proper judgment. Now there are two types of hypotheses. The first one is the Null Hypothesis, which is an already established concept. The second one is the Alternative Hypothesis, which is a new concept that we are trying to establish by proving the Null Hypothesis wrong.

We are making these decisions (rejecting null hypothesis or not) based on Level of confidence or Level of Significance score. Good score for those parameters are more than 95% for LOC or less than 0.05 for LOS.

In this post, I tried to explain Hypothesis testing with easy and real life examples. Please let me know if you have any questions or suggestions regarding this tutorial.

Anindya Naskar

Hi there, I’m Anindya Naskar, Data Science Engineer. I created this website to show you what I believe is the best possible way to get your start in the field of Data Science.

Related Posts

  • pip install multiple packages in Python
  • 19 Python Regular Expression Exercises and Solutions
  • Neuralink: A Chip can control your Brain – Got FDA Approval
  • FastText Word Embeddings Python implementation
  • 19 Python Programming Lists Practice Exercises
  • Replace Text in a PDF File with Python
  • How to Write Loops the Pythonic Way
  • Build Question Answering System with BERT model

Leave a comment Cancel reply

Save my name, email, and website in this browser for the next time I comment.

404 Not found

real life example of hypothesis testing

Hypothesis Testing In real life

Carolina Bento

Ready to learn Data Science? Browse  Data Science Training and Certification courses developed by industry thought leaders and Experfy in Harvard Innovation Lab.

Real world problems solved with Math

Every day you test ideas, recipes, new routes so you can get to your destination faster or with less traffic … The important question, however, it  was that idea/recipe/route  significantly  better than your previous one?

It's Friday night and you want to watch a movie. There are three movies that caught your eye, but you're not really sure if they're good or not. In this modern day and age, you're that kind of person that still relies on family and friends for recommendations. So, you ask them to rate those movies and get ready to crunch the data.

real life example of hypothesis testing

The beauty of dummy data 

Even though it looks like your friends are somewhat skeptical about  The Emoji Movie , you need to examine each rating distribution in order to understand more about the central trend of your friends' votes.

Ratings for  Interstellar

  • median — the centre of the distribution — of 5 units;
  • mean  of ~4.35 units;
  • standard deviation — how far apart the values are from the median — of ~0.76 units;

Ratings for  The Emoji Movie  have

  • median of 2 units;
  • mean of ~2.2 units;
  • standard deviation of ~0.59 units;

Ratings for  Star Wars: The Last Jedi  have

  • median of 5 units;
  • mean of ~4.5 units;
  • standard deviation of ~0.62 units;

This is great! But actually it doesn't tell you much more than what you already knew: The Emoji Movie might not be that appealing, and there's a clear competition between Interstellar and Star Wars …

To clear out any questions about which movie your friends rated as best, you decide to run some  statistical tests  and compare the three rating distributions.

Hypothesis Testing

Hypothesis Tests, or Statistical Hypothesis Testing, is a technique used to compare two datasets or a sample from a dataset. It is a  statistical   inference method  so, in the end of the test, you'll  draw a conclusion — you'll infer something — about the characteristics of what you're comparing.

In the case of your Friday night movie choice, you want to pick a movie that is the best choice among your three possibilities.

What kind of test should you use?

In order to answer this question, first, you need to know  what distribution it follows . Because the different tests assume that data follows a specific distribution.

You already calculated a few statistics the ratings data — mean, median and standard deviation — but what shape does your data take?

One of the most famous distributions is the so called  Bell Curve , the  Normal Distribution . In this distribution, the data is centered at the mean, which you can identify by the peak of the bell curve. In this case, it also corresponds to the value in the middle, the median.

The data points in a Normal Distribution are  spread around  the mean/median according to the standard deviation.

real life example of hypothesis testing

Example of a dataset that follows a Normal Distribution with mean 0 and standard deviation of 1

In this example of a Normal Distribution, it's easy to see that most values are centered around zero — the mean and median of the distribution — and that sides of the curve are  moving away from the mean  in increments of 1 unit.

Do the movie ratings follow a Normal Distribution?

Thankfully, Statisticians have thought about identifying the  shape of your data . They created an easy way to figure that out: the  Quantile-Quantile Plot , a.k.a., Q-Q Plot.

real life example of hypothesis testing

A dataset that follows a Normal Distribution and the Q-Q plot that compares it with the Normal Distribution

Q-Q plots help visualize the quantiles of two probability distributions against one another. Quantiles are simply a way of saying that you are  dividing the distribution into equal parts . For instance,  quart iles, divide a distribution into quarters, 4 equal parts.

How to read this Q-Q Plot?

In the example above we already knew the dataset followed a Normal Distribution.

What the Q-Q plot intends to visually represent is that, if both datasets follow the same distribution, they'll roughly be aligned along the diagonal red line. The more the blue dots, corresponding to your dataset, deviate from the diagonal line, corresponding to the distribution to compare to, the bigger the difference between the two distributions.

So, to figure out what kind of distribution each movie rating dataset follows you can compare them with a Normal Distribution using a Q-Q plot.

real life example of hypothesis testing

Distribution of each movie rating and corresponding Q-Q plot vs Normal Distribution

The first thing that may come to mind is  This doesn't look at all like the Q-Q plot I was expecting!  Well, sort of.

The data points are distributed along the diagonal line however, the reason why it doesn't follow the red line entirely is because the ratings are discrete values instead of continuous. That's why we see, for instance, in the Star Wars ratings a few blue dots horizontally aligned with the value 4 and on top of the red line and then, further up, a few more dots aligned with value 5.

So you can prove that it follows a Normal Distribution because, although in a discrete, step-wise way, the data follows the diagonal line.

Now that you figured out that your ratings follow a Normal Distribution, it's time to pick a statistical test.

Not quite yet.

Before even thinking about what test you are going to use, you need to

  • Define your hypothesis;
  • Set the significance level of the statistical test.

Then you're good to pick the statistical test!

1. Defining Your Hypothesis

A hypothesis test is usually composed by

  • Null Hypothesis  (H0, read "H zero"): states that all things remain equal. No phenomena are observed or there is no relationship between what you are comparing;
  • Alternative Hypothesis  (H1, read “H one”): states the opposite of the Null Hypothesis. That there was some change or observed relationship between what you are comparing

For you Friday movie night, what you really want to know is if one movie is significantly better than the others. In this case, you can build your hypothesis on the difference between the average rating your friends gave to each movie.

real life example of hypothesis testing

Which you can read as  Null Hypothesis  (H0):  The mean of movie A is equal to the mean of movie B  and  Alternative Hypothesis  (H1):  The mean of movie A is not equal to the mean of movie B.

2. Set the Significance Level of the Statistical Test

The goal of the statistical test is to try to prove that there is an observable phenomenon.

The goal of the statistical test is to try to Reject the Null Hypothesis, which states there's no observable change or behaviour

It could be either proving a treatment that shows improvement in patient health, a sample that has characteristics of a larger population or two datasets that are considered different, i.e., they couldn't have been drawn from the same population. So, at the end of the test, you want to be confident about rejecting the Null Hypothesis.

This leads to defining the  significance level of the test .

Described as a probability, and represented by the Greek letter  alpha , it specifies the probability of rejecting the Null Hypothesis when it was actually true, i.e., you couldn't observe the phenomenon or change in question.

I think about the significance level as  setting a standard of quality  for your test, in order to be able to draw accurate conclusions.

In your Friday night movie quest, not identifying a good movie to watch has very minimal consequences: some potentially wasted time, and a bit of frustration. But you can see the importance of setting the appropriate significance level in scenarios like clinical trials, where you're testing a new drug or treatment.

The significance levels that are normally used are 1% and 5%.

For this movie night pick, we can settle at 5%, i.e., alpha = 0.05.

What Statistical Test To Use?

Knowing that the data follows a Normal Distribution and that you want to compare the means of your friends'ratings, one particular statistical test comes to mind.

Student’s t-Test

This statistical test is normally used to verify if there is a significant difference between two datasets. And, as I mentioned earlier, first you have to guarantee that both datasets have the following characteristics

  • Follow a Normal Distribution;
  • Are independent of each other.

Let’s assume your friends weren’t biased when they rated each movie, in order to attribute a completely independent rating.

From what we've seen so far, you're good to use Student's t-Test!

In order to verify if one of the movies is significantly better than the other, you can conduct an  independent two-sample t-test.  This test will also have to be a  two-tailed test because we’re trying to capture a general significant difference, either lower or higher. Think about the “tails” of the Normal Distribution plot.

Ready to run the t-test? Wait, not yet!

All your friends rated the different movies, however, as you verified earlier, each movie rating distribution has a different standard deviation. This means that each distribution has a different variance.

Because the variance of the distributions is not the same, you have to use instead of a slightly different test,  Welch’s t-Test .

Welch’s t-Test

This is also called the  unequal variances t-test . It’s an adaptation of Student’s t-Test and still requires the data to be normally distributed. However, it takes into account both variances when computing the test.

real life example of hypothesis testing

The numerator accounts for the difference between the two means, represented by X1 and X2, while the denominator takes into account the variance, represented by  s  and the size of each dataset  N .

In the Friday night movie example, the size of the dataset is going to be the same for both movies, because all your friends rate all three movies. But with Welch's t-test, we make sure that the variance of each rating distribution is factored in when verifying if there is a significant difference between ratings.

With the Welch’s t-Test, and for each for each pair of distributions, you calculate the  test statistic , which every statistical software generates once you run the test.

Now, the significance level comes back to action, because you’re ready to draw a conclusion about the data.

Alongside the test statistic, your software of choice will also provide you with the  p-value . Also expressed as a probability, the p-value is the probability of observing a value as extreme as the test statistic, given that the Null Hypothesis is true.

In this Friday movie night scenario, the p-value would be the probability of having a mean rating so much higher or so lower than the one we’re comparing to.

You have all the pieces of the puzzle now!

You ran the test, got the test statistic and the p-value and now you can use the  p-value and the significance level to determine if there’s a statistically significant difference between the dataset .

Crunching all the data with the statistical software of your choice you get the following results

Interstellar vs The Emoji Movie

  • Test-Statistic ~= 15.07
  • p-value = 1.833202972237421e-25

The Emoji Movie vs Star Wars: The Last Jedi

  • Test-Statistic ~= -18.54
  • p-value = 3.4074007278724943e-32

Looking at the absolute value of the test-statistics above, given that they're so large, you can conclude that there's significant difference between the two pairs movies.

Comparing the significance level with each p-value, you can safely reject the Null Hypothesis, which states that there's no difference between the mean rating of these movies.

This applies to both  Interstellar vs The Emoji Movie  and  The Emoji Movie vs Star Wars: The Last Jedi , because in both cases the p-value is much smaller than the significance level of 0.05 we set before running the test.

You just concluded that there’s actually a significant difference between the average rating of The Emoji Movie (2.2 units) compared with both Interstellar (4.35 units) and Star Wars (4.5 units).

Given that the average rating of the latter movies is  significantly higher  you can safely exclude The Emoji Movie from your candidate list.

Now there are only two contestants left …

Interstellar vs Star Wars: The Last Jedi

  • Test-Statistic ~=-1.35
  • p-value = 0.18046156732197555

From these results, you can't prove that there is a statistically significant difference between these two movies. If you recall, their average rating is very close — 4.35 compared to 4.5 units.

Even though it's tempting to say the Null Hypothesis is true, and that there is no difference between the two means, you can't.

What you can say is that you don't have enough empirical evidence to reject the Null Hypothesis.

If you want to abide by the Statistics rules, you'd have a technical tie

As a tie-breaker, you could ask the opinion of an unbiased third-party or just watch the one that has the highest average rating.

Experfy Insights

Top articles, research, podcasts, webinars and more delivered to you monthly.

real life example of hypothesis testing

Blockchain: The Next Step for Data-Driven Industries

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Save my name, email, and website in this browser for the next time I comment.

real life example of hypothesis testing

If you’ve ever been on call, you’ve probably experienced the pain of being woken up at 4 a.m., unactionable alerts, alerts going to the wrong team, and other unfortunate events. But, there’s an aspect of being on call that is less talked about, but even more ubiquitous – the cognitive load. “Cognitive load” has perhaps

real life example of hypothesis testing

Knowing your customer in the digital age Want to know more about your customers? About their demographics, personal choices, and preferable buying journey? Who do you think is the best source for such insights? You’re right. The customer. But, in a fast-paced world, it is almost impossible to extract all relevant information about a customer

real life example of hypothesis testing

Cloud computing is the anytime, anywhere delivery of IT services like compute, storage, networking, and application software over the internet to end-users. The underlying physical resources, as well as processes, are masked to the end-user, who accesses only the files and apps they want. Companies (usually) pay for only the cloud computing services they use,

Incubated in Harvard Innovation Lab, Experfy specializes in pipelining and deploying the world's best AI and engineering talent at breakneck speed, with exceptional focus on quality and compliance. Enterprises and governments also leverage our award-winning SaaS platform to build their own customized future of work solutions such as talent clouds.

1700 West Park Drive, Suite 190 Westborough, MA 01581

Email:  [email protected]

Toll Free: (844) EXPERFY or (844) 397-3739

Library homepage

  • school Campus Bookshelves
  • menu_book Bookshelves
  • perm_media Learning Objects
  • login Login
  • how_to_reg Request Instructor Account
  • hub Instructor Commons
  • Download Page (PDF)
  • Download Full Book (PDF)
  • Periodic Table
  • Physics Constants
  • Scientific Calculator
  • Reference & Cite
  • Tools expand_more
  • Readability

selected template will load here

This action is not available.

Statistics LibreTexts

8.4: Hypothesis Test Examples for Proportions

  • Last updated
  • Save as PDF
  • Page ID 11533

  • In a hypothesis test problem, you may see words such as "the level of significance is 1%." The "1%" is the preconceived or preset \(\alpha\).
  • The statistician setting up the hypothesis test selects the value of α to use before collecting the sample data.
  • If no level of significance is given, a common standard to use is \(\alpha = 0.05\).
  • When you calculate the \(p\)-value and draw the picture, the \(p\)-value is the area in the left tail, the right tail, or split evenly between the two tails. For this reason, we call the hypothesis test left, right, or two tailed.
  • The alternative hypothesis, \(H_{a}\), tells you if the test is left, right, or two-tailed. It is the key to conducting the appropriate test.
  • \(H_{a}\) never has a symbol that contains an equal sign.
  • Thinking about the meaning of the \(p\)-value: A data analyst (and anyone else) should have more confidence that he made the correct decision to reject the null hypothesis with a smaller \(p\)-value (for example, 0.001 as opposed to 0.04) even if using the 0.05 level for alpha. Similarly, for a large p -value such as 0.4, as opposed to a \(p\)-value of 0.056 (\(\alpha = 0.05\) is less than either number), a data analyst should have more confidence that she made the correct decision in not rejecting the null hypothesis. This makes the data analyst use judgment rather than mindlessly applying rules.

Full Hypothesis Test Examples

Example \(\PageIndex{7}\)

Joon believes that 50% of first-time brides in the United States are younger than their grooms. She performs a hypothesis test to determine if the percentage is the same or different from 50% . Joon samples 100 first-time brides and 53 reply that they are younger than their grooms. For the hypothesis test, she uses a 1% level of significance.

Set up the hypothesis test:

The 1% level of significance means that α = 0.01. This is a test of a single population proportion .

\(H_{0}: p = 0.50\)  \(H_{a}: p \neq 0.50\)

The words "is the same or different from" tell you this is a two-tailed test.

Calculate the distribution needed:

Random variable: \(P′ =\) the percent of of first-time brides who are younger than their grooms.

Distribution for the test: The problem contains no mention of a mean. The information is given in terms of percentages. Use the distribution for P′ , the estimated proportion.

\[P' - N\left(p, \sqrt{\frac{p-q}{n}}\right)\nonumber \]

\[P' - N\left(0.5, \sqrt{\frac{0.5-0.5}{100}}\right)\nonumber \]

where \(p = 0.50, q = 1−p = 0.50\), and \(n = 100\)

Calculate the p -value using the normal distribution for proportions:

\[p\text{-value} = P(p′ < 0.47 or p′ > 0.53) = 0.5485\nonumber \]

where \[x = 53, p' = \frac{x}{n} = \frac{53}{100} = 0.53\nonumber \].

Interpretation of the \(p\text{-value})\: If the null hypothesis is true, there is 0.5485 probability (54.85%) that the sample (estimated) proportion \(p'\) is 0.53 or more OR 0.47 or less (see the graph in Figure).

Normal distribution curve of the percent of first time brides who are younger than the groom with values of 0.47, 0.50, and 0.53 on the x-axis. Vertical upward lines extend from 0.47 and 0.53 to the curve. 1/2(p-values) are calculated for the areas on outsides of 0.47 and 0.53.

\(\mu = p = 0.50\) comes from \(H_{0}\), the null hypothesis.

\(p′ = 0.53\). Since the curve is symmetrical and the test is two-tailed, the \(p′\) for the left tail is equal to \(0.50 – 0.03 = 0.47\) where \(\mu = p = 0.50\). (0.03 is the difference between 0.53 and 0.50.)

Compare \(\alpha\) and the \(p\text{-value}\):

Since \(\alpha = 0.01\) and \(p\text{-value} = 0.5485\). \(\alpha < p\text{-value}\).

Make a decision: Since \(\alpha < p\text{-value}\), you cannot reject \(H_{0}\).

Conclusion: At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of first-time brides who are younger than their grooms is different from 50%.

The \(p\text{-value}\) can easily be calculated.

Press STAT and arrow over to TESTS . Press 5:1-PropZTest . Enter .5 for \(p_{0}\), 53 for \(x\) and 100 for \(n\). Arrow down to Prop and arrow to not equals \(p_{0}\). Press ENTER . Arrow down to Calculate and press ENTER . The calculator calculates the \(p\text{-value}\) (\(p = 0.5485\)) and the test statistic (\(z\)-score). Prop not equals .5 is the alternate hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate ). Press ENTER . A shaded graph appears with \(\(z\) = 0.6\) (test statistic) and \(p = 0.5485\) (\(p\text{-value}\)). Make sure when you use Draw that no other equations are highlighted in \(Y =\) and the plots are turned off.

The Type I and Type II errors are as follows:

The Type I error is to conclude that the proportion of first-time brides who are younger than their grooms is different from 50% when, in fact, the proportion is actually 50%. (Reject the null hypothesis when the null hypothesis is true).

The Type II error is there is not enough evidence to conclude that the proportion of first time brides who are younger than their grooms differs from 50% when, in fact, the proportion does differ from 50%. (Do not reject the null hypothesis when the null hypothesis is false.)

Exercise \(\PageIndex{7}\)

A teacher believes that 85% of students in the class will want to go on a field trip to the local zoo. She performs a hypothesis test to determine if the percentage is the same or different from 85%. The teacher samples 50 students and 39 reply that they would want to go to the zoo. For the hypothesis test, use a 1% level of significance.

First, determine what type of test this is, set up the hypothesis test, find the \(p\text{-value}\), sketch the graph, and state your conclusion.

Since the problem is about percentages, this is a test of single population proportions.

  • \(H_{0} : p = 0.85\)
  • \(H_{a}: p \neq 0.85\)
  • \(p = 0.7554\)

9.6.13.png

Because \(p > \alpha\), we fail to reject the null hypothesis. There is not sufficient evidence to suggest that the proportion of students that want to go to the zoo is not 85%.

Example \(\PageIndex{8}\)

Suppose a consumer group suspects that the proportion of households that have three cell phones is 30%. A cell phone company has reason to believe that the proportion is not 30%. Before they start a big advertising campaign, they conduct a hypothesis test. Their marketing people survey 150 households with the result that 43 of the households have three cell phones.

Set up the Hypothesis Test:

\(H_{0}: p = 0.30, H_{a}: p \neq 0.30\)

Determine the distribution needed:

The random variable is \(P′ =\) proportion of households that have three cell phones.

The distribution for the hypothesis test is \(P' - N\left(0.30, \sqrt{\frac{(0.30 \cdot 0.70)}{150}}\right)\)

Exercise 9.6.8.2

a. The value that helps determine the \(p\text{-value}\) is \(p′\). Calculate \(p′\).

a. \(p' = \frac{x}{n}\) where \(x\) is the number of successes and \(n\) is the total number in the sample.

\(x = 43, n = 150\)

\(p′ = 43150\)

Exercise 9.6.8.3

b. What is a success for this problem?

b. A success is having three cell phones in a household.

Exercise 9.6.8.4

c. What is the level of significance?

c. The level of significance is the preset \(\alpha\). Since \(\alpha\) is not given, assume that \(\alpha = 0.05\).

Exercise 9.6.8.5

d. Draw the graph for this problem. Draw the horizontal axis. Label and shade appropriately.

Calculate the \(p\text{-value}\).

d. \(p\text{-value} = 0.7216\)

Exercise 9.6.8.6

e. Make a decision. _____________(Reject/Do not reject) \(H_{0}\) because____________.

e. Assuming that \(\alpha = 0.05, \alpha < p\text{-value}\). The decision is do not reject \(H_{0}\) because there is not sufficient evidence to conclude that the proportion of households that have three cell phones is not 30%.

Exercise \(\PageIndex{8}\)

Marketers believe that 92% of adults in the United States own a cell phone. A cell phone manufacturer believes that number is actually lower. 200 American adults are surveyed, of which, 174 report having cell phones. Use a 5% level of significance. State the null and alternative hypothesis, find the p -value, state your conclusion, and identify the Type I and Type II errors.

  • \(H_{0}: p = 0.92\)
  • \(H_{a}: p < 0.92\)
  • \(p\text{-value} = 0.0046\)

Because \(p < 0.05\), we reject the null hypothesis. There is sufficient evidence to conclude that fewer than 92% of American adults own cell phones.

  • Type I Error: To conclude that fewer than 92% of American adults own cell phones when, in fact, 92% of American adults do own cell phones (reject the null hypothesis when the null hypothesis is true).
  • Type II Error: To conclude that 92% of American adults own cell phones when, in fact, fewer than 92% of American adults own cell phones (do not reject the null hypothesis when the null hypothesis is false).

The next example is a poem written by a statistics student named Nicole Hart. The solution to the problem follows the poem. Notice that the hypothesis test is for a single population proportion. This means that the null and alternate hypotheses use the parameter \(p\). The distribution for the test is normal. The estimated proportion \(p′\) is the proportion of fleas killed to the total fleas found on Fido. This is sample information. The problem gives a preconceived \(\alpha = 0.01\), for comparison, and a 95% confidence interval computation. The poem is clever and humorous, so please enjoy it!

Example \(\PageIndex{9}\)

My dog has so many fleas,

They do not come off with ease. As for shampoo, I have tried many types Even one called Bubble Hype, Which only killed 25% of the fleas, Unfortunately I was not pleased.

I've used all kinds of soap, Until I had given up hope Until one day I saw An ad that put me in awe.

A shampoo used for dogs Called GOOD ENOUGH to Clean a Hog Guaranteed to kill more fleas.

I gave Fido a bath And after doing the math His number of fleas Started dropping by 3's! Before his shampoo I counted 42.

At the end of his bath, I redid the math And the new shampoo had killed 17 fleas. So now I was pleased.

Now it is time for you to have some fun With the level of significance being .01, You must help me figure out

Use the new shampoo or go without?

\(H_{0}: p \leq 0.25\)   \(H_{a}: p > 0.25\)

In words, CLEARLY state what your random variable \(\bar{X}\) or \(P′\) represents.

\(P′ =\) The proportion of fleas that are killed by the new shampoo

State the distribution to use for the test.

\[N\left(0.25, \sqrt{\frac{(0.25){1-0.25}}{42}}\right)\nonumber \]

Test Statistic: \(z = 2.3163\)

Calculate the \(p\text{-value}\) using the normal distribution for proportions:

\[p\text{-value} = 0.0103\nonumber \]

In one to two complete sentences, explain what the p -value means for this problem.

If the null hypothesis is true (the proportion is 0.25), then there is a 0.0103 probability that the sample (estimated) proportion is 0.4048 \(\left(\frac{17}{42}\right)\) or more.

Use the previous information to sketch a picture of this situation. CLEARLY, label and scale the horizontal axis and shade the region(s) corresponding to the \(p\text{-value}\).

Normal distribution graph of the proportion of fleas killed by the new shampoo with values of 0.25 and 0.4048 on the x-axis. A vertical upward line extends from 0.4048 to the curve and the area to the left of this is shaded in. The test statistic of the sample proportion is listed.

Indicate the correct decision (“reject” or “do not reject” the null hypothesis), the reason for it, and write an appropriate conclusion, using complete sentences.

Conclusion: At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of fleas that are killed by the new shampoo is more than 25%.

Construct a 95% confidence interval for the true mean or proportion. Include a sketch of the graph of the situation. Label the point estimate and the lower and upper bounds of the confidence interval.

Normal distribution graph of the proportion of fleas killed by the new shampoo with values of 0.26, 17/42, and 0.55 on the x-axis. A vertical upward line extends from 0.26 and 0.55. The area between these two points is equal to 0.95.

Confidence Interval: (0.26,0.55) We are 95% confident that the true population proportion p of fleas that are killed by the new shampoo is between 26% and 55%.

This test result is not very definitive since the \(p\text{-value}\) is very close to alpha. In reality, one would probably do more tests by giving the dog another bath after the fleas have had a chance to return.

Example \(\PageIndex{11}\)

In a study of 420,019 cell phone users, 172 of the subjects developed brain cancer. Test the claim that cell phone users developed brain cancer at a greater rate than that for non-cell phone users (the rate of brain cancer for non-cell phone users is 0.0340%). Since this is a critical issue, use a 0.005 significance level. Explain why the significance level should be so low in terms of a Type I error.

We will follow the four-step process.

  • \(H_{0}: p \leq 0.00034\)
  • \(H_{a}: p > 0.00034\)

If we commit a Type I error, we are essentially accepting a false claim. Since the claim describes cancer-causing environments, we want to minimize the chances of incorrectly identifying causes of cancer.

  • We will be testing a sample proportion with \(x = 172\) and \(n = 420,019\). The sample is sufficiently large because we have \(np = 420,019(0.00034) = 142.8\), \(nq = 420,019(0.99966) = 419,876.2\), two independent outcomes, and a fixed probability of success \(p = 0.00034\). Thus we will be able to generalize our results to the population.

Figure 9.6.11.

Figure 9.6.12.

  • Since the \(p\text{-value} = 0.0073\) is greater than our alpha value \(= 0.005\), we cannot reject the null. Therefore, we conclude that there is not enough evidence to support the claim of higher brain cancer rates for the cell phone users.

Example \(\PageIndex{12}\)

According to the US Census there are approximately 268,608,618 residents aged 12 and older. Statistics from the Rape, Abuse, and Incest National Network indicate that, on average, 207,754 rapes occur each year (male and female) for persons aged 12 and older. This translates into a percentage of sexual assaults of 0.078%. In Daviess County, KY, there were reported 11 rapes for a population of 37,937. Conduct an appropriate hypothesis test to determine if there is a statistically significant difference between the local sexual assault percentage and the national sexual assault percentage. Use a significance level of 0.01.

We will follow the four-step plan.

  • We need to test whether the proportion of sexual assaults in Daviess County, KY is significantly different from the national average.
  • \(H_{0}: p = 0.00078\)
  • \(H_{a}: p \neq 0.00078\)

Figure 9.6.13.

Figure 9.6.14.

  • Since the \(p\text{-value}\), \(p = 0.00063\), is less than the alpha level of 0.01, the sample data indicates that we should reject the null hypothesis. In conclusion, the sample data support the claim that the proportion of sexual assaults in Daviess County, Kentucky is different from the national average proportion.

The hypothesis test itself has an established process. This can be summarized as follows:

  • Determine \(H_{0}\) and \(H_{a}\). Remember, they are contradictory.
  • Determine the random variable.
  • Determine the distribution for the test.
  • Draw a graph, calculate the test statistic, and use the test statistic to calculate the \(p\text{-value}\). (A z -score and a t -score are examples of test statistics.)
  • Compare the preconceived α with the p -value, make a decision (reject or do not reject H 0 ), and write a clear conclusion using English sentences.

Notice that in performing the hypothesis test, you use \(\alpha\) and not \(\beta\). \(\beta\) is needed to help determine the sample size of the data that is used in calculating the \(p\text{-value}\). Remember that the quantity \(1 – \beta\) is called the Power of the Test . A high power is desirable. If the power is too low, statisticians typically increase the sample size while keeping α the same.If the power is low, the null hypothesis might not be rejected when it should be.

  • Data from Amit Schitai. Director of Instructional Technology and Distance Learning. LBCC.
  • Data from Bloomberg Businessweek . Available online at http://www.businessweek.com/news/2011- 09-15/nyc-smoking-rate-falls-to-record-low-of-14-bloomberg-says.html.
  • Data from energy.gov. Available online at http://energy.gov (accessed June 27. 2013).
  • Data from Gallup®. Available online at www.gallup.com (accessed June 27, 2013).
  • Data from Growing by Degrees by Allen and Seaman.
  • Data from La Leche League International. Available online at www.lalecheleague.org/Law/BAFeb01.html.
  • Data from the American Automobile Association. Available online at www.aaa.com (accessed June 27, 2013).
  • Data from the American Library Association. Available online at www.ala.org (accessed June 27, 2013).
  • Data from the Bureau of Labor Statistics. Available online at http://www.bls.gov/oes/current/oes291111.htm .
  • Data from the Centers for Disease Control and Prevention. Available online at www.cdc.gov (accessed June 27, 2013)
  • Data from the U.S. Census Bureau, available online at quickfacts.census.gov/qfd/states/00000.html (accessed June 27, 2013).
  • Data from the United States Census Bureau. Available online at www.census.gov/hhes/socdemo/language/.
  • Data from Toastmasters International. Available online at http://toastmasters.org/artisan/deta...eID=429&Page=1 .
  • Data from Weather Underground. Available online at www.wunderground.com (accessed June 27, 2013).
  • Federal Bureau of Investigations. “Uniform Crime Reports and Index of Crime in Daviess in the State of Kentucky enforced by Daviess County from 1985 to 2005.” Available online at http://www.disastercenter.com/kentucky/crime/3868.htm (accessed June 27, 2013).
  • “Foothill-De Anza Community College District.” De Anza College, Winter 2006. Available online at research.fhda.edu/factbook/DA...t_da_2006w.pdf.
  • Johansen, C., J. Boice, Jr., J. McLaughlin, J. Olsen. “Cellular Telephones and Cancer—a Nationwide Cohort Study in Denmark.” Institute of Cancer Epidemiology and the Danish Cancer Society, 93(3):203-7. Available online at http://www.ncbi.nlm.nih.gov/pubmed/11158188 (accessed June 27, 2013).
  • Rape, Abuse & Incest National Network. “How often does sexual assault occur?” RAINN, 2009. Available online at www.rainn.org/get-information...sexual-assault (accessed June 27, 2013).

Contributors and Attributions

Barbara Illowsky and Susan Dean (De Anza College) with many other contributing authors. Content produced by OpenStax College is licensed under a Creative Commons Attribution License 4.0 license. Download for free at http://cnx.org/contents/[email protected] .

Have a language expert improve your writing

Run a free plagiarism check in 10 minutes, generate accurate citations for free.

  • Knowledge Base
  • Type I & Type II Errors | Differences, Examples, Visualizations

Type I & Type II Errors | Differences, Examples, Visualizations

Published on January 18, 2021 by Pritha Bhandari . Revised on June 22, 2023.

In statistics , a Type I error is a false positive conclusion, while a Type II error is a false negative conclusion.

Making a statistical decision always involves uncertainties, so the risks of making these errors are unavoidable in hypothesis testing .

The probability of making a Type I error is the significance level , or alpha (α), while the probability of making a Type II error is beta (β). These risks can be minimized through careful planning in your study design.

  • Type I error (false positive) : the test result says you have coronavirus, but you actually don’t.
  • Type II error (false negative) : the test result says you don’t have coronavirus, but you actually do.

Table of contents

Error in statistical decision-making, type i error, type ii error, trade-off between type i and type ii errors, is a type i or type ii error worse, other interesting articles, frequently asked questions about type i and ii errors.

Using hypothesis testing, you can make decisions about whether your data support or refute your research predictions with null and alternative hypotheses .

Hypothesis testing starts with the assumption of no difference between groups or no relationship between variables in the population—this is the null hypothesis . It’s always paired with an alternative hypothesis , which is your research prediction of an actual difference between groups or a true relationship between variables .

In this case:

  • The null hypothesis (H 0 ) is that the new drug has no effect on symptoms of the disease.
  • The alternative hypothesis (H 1 ) is that the drug is effective for alleviating symptoms of the disease.

Then , you decide whether the null hypothesis can be rejected based on your data and the results of a statistical test . Since these decisions are based on probabilities, there is always a risk of making the wrong conclusion.

  • If your results show statistical significance , that means they are very unlikely to occur if the null hypothesis is true. In this case, you would reject your null hypothesis. But sometimes, this may actually be a Type I error.
  • If your findings do not show statistical significance, they have a high chance of occurring if the null hypothesis is true. Therefore, you fail to reject your null hypothesis. But sometimes, this may be a Type II error.

Type I and Type II error in statistics

Prevent plagiarism. Run a free check.

A Type I error means rejecting the null hypothesis when it’s actually true. It means concluding that results are statistically significant when, in reality, they came about purely by chance or because of unrelated factors.

The risk of committing this error is the significance level (alpha or α) you choose. That’s a value that you set at the beginning of your study to assess the statistical probability of obtaining your results ( p value).

The significance level is usually set at 0.05 or 5%. This means that your results only have a 5% chance of occurring, or less, if the null hypothesis is actually true.

If the p value of your test is lower than the significance level, it means your results are statistically significant and consistent with the alternative hypothesis. If your p value is higher than the significance level, then your results are considered statistically non-significant.

To reduce the Type I error probability, you can simply set a lower significance level.

Type I error rate

The null hypothesis distribution curve below shows the probabilities of obtaining all possible results if the study were repeated with new samples and the null hypothesis were true in the population .

At the tail end, the shaded area represents alpha. It’s also called a critical region in statistics.

If your results fall in the critical region of this curve, they are considered statistically significant and the null hypothesis is rejected. However, this is a false positive conclusion, because the null hypothesis is actually true in this case!

Type I error rate

A Type II error means not rejecting the null hypothesis when it’s actually false. This is not quite the same as “accepting” the null hypothesis, because hypothesis testing can only tell you whether to reject the null hypothesis.

Instead, a Type II error means failing to conclude there was an effect when there actually was. In reality, your study may not have had enough statistical power to detect an effect of a certain size.

Power is the extent to which a test can correctly detect a real effect when there is one. A power level of 80% or higher is usually considered acceptable.

The risk of a Type II error is inversely related to the statistical power of a study. The higher the statistical power, the lower the probability of making a Type II error.

Statistical power is determined by:

  • Size of the effect : Larger effects are more easily detected.
  • Measurement error : Systematic and random errors in recorded data reduce power.
  • Sample size : Larger samples reduce sampling error and increase power.
  • Significance level : Increasing the significance level increases power.

To (indirectly) reduce the risk of a Type II error, you can increase the sample size or the significance level.

Type II error rate

The alternative hypothesis distribution curve below shows the probabilities of obtaining all possible results if the study were repeated with new samples and the alternative hypothesis were true in the population .

The Type II error rate is beta (β), represented by the shaded area on the left side. The remaining area under the curve represents statistical power, which is 1 – β.

Increasing the statistical power of your test directly decreases the risk of making a Type II error.

Type II error rate

The Type I and Type II error rates influence each other. That’s because the significance level (the Type I error rate) affects statistical power, which is inversely related to the Type II error rate.

This means there’s an important tradeoff between Type I and Type II errors:

  • Setting a lower significance level decreases a Type I error risk, but increases a Type II error risk.
  • Increasing the power of a test decreases a Type II error risk, but increases a Type I error risk.

This trade-off is visualized in the graph below. It shows two curves:

  • The null hypothesis distribution shows all possible results you’d obtain if the null hypothesis is true. The correct conclusion for any point on this distribution means not rejecting the null hypothesis.
  • The alternative hypothesis distribution shows all possible results you’d obtain if the alternative hypothesis is true. The correct conclusion for any point on this distribution means rejecting the null hypothesis.

Type I and Type II errors occur where these two distributions overlap. The blue shaded area represents alpha, the Type I error rate, and the green shaded area represents beta, the Type II error rate.

By setting the Type I error rate, you indirectly influence the size of the Type II error rate as well.

Type I and Type II error

It’s important to strike a balance between the risks of making Type I and Type II errors. Reducing the alpha always comes at the cost of increasing beta, and vice versa .

Receive feedback on language, structure, and formatting

Professional editors proofread and edit your paper by focusing on:

  • Academic style
  • Vague sentences
  • Style consistency

See an example

real life example of hypothesis testing

For statisticians, a Type I error is usually worse. In practical terms, however, either type of error could be worse depending on your research context.

A Type I error means mistakenly going against the main statistical assumption of a null hypothesis. This may lead to new policies, practices or treatments that are inadequate or a waste of resources.

In contrast, a Type II error means failing to reject a null hypothesis. It may only result in missed opportunities to innovate, but these can also have important practical consequences.

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Normal distribution
  • Descriptive statistics
  • Measures of central tendency
  • Correlation coefficient
  • Null hypothesis

Methodology

  • Cluster sampling
  • Stratified sampling
  • Types of interviews
  • Cohort study
  • Thematic analysis

Research bias

  • Implicit bias
  • Cognitive bias
  • Survivorship bias
  • Availability heuristic
  • Nonresponse bias
  • Regression to the mean

In statistics, a Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s actually false.

The risk of making a Type I error is the significance level (or alpha) that you choose. That’s a value that you set at the beginning of your study to assess the statistical probability of obtaining your results ( p value ).

To reduce the Type I error probability, you can set a lower significance level.

The risk of making a Type II error is inversely related to the statistical power of a test. Power is the extent to which a test can correctly detect a real effect when there is one.

To (indirectly) reduce the risk of a Type II error, you can increase the sample size or the significance level to increase statistical power.

Statistical significance is a term used by researchers to state that it is unlikely their observations could have occurred under the null hypothesis of a statistical test . Significance is usually denoted by a p -value , or probability value.

Statistical significance is arbitrary – it depends on the threshold, or alpha value, chosen by the researcher. The most common threshold is p < 0.05, which means that the data is likely to occur less than 5% of the time under the null hypothesis .

When the p -value falls below the chosen alpha value, then we say the result of the test is statistically significant.

In statistics, power refers to the likelihood of a hypothesis test detecting a true effect if there is one. A statistically powerful test is more likely to reject a false negative (a Type II error).

If you don’t ensure enough power in your study, you may not be able to detect a statistically significant result even when it has practical significance. Your study might not have the ability to answer your research question.

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

Bhandari, P. (2023, June 22). Type I & Type II Errors | Differences, Examples, Visualizations. Scribbr. Retrieved April 4, 2024, from https://www.scribbr.com/statistics/type-i-and-type-ii-errors/

Is this article helpful?

Pritha Bhandari

Pritha Bhandari

Other students also liked, an easy introduction to statistical significance (with examples), understanding p values | definition and examples, statistical power and why it matters | a simple introduction, what is your plagiarism score.

  • Machine Learning Tutorial
  • Data Analysis Tutorial
  • Python - Data visualization tutorial
  • Machine Learning Projects
  • Machine Learning Interview Questions
  • Machine Learning Mathematics
  • Deep Learning Tutorial
  • Deep Learning Project
  • Deep Learning Interview Questions
  • Computer Vision Tutorial
  • Computer Vision Projects
  • NLP Project
  • NLP Interview Questions
  • Statistics with Python
  • 100 Days of Machine Learning
  • Data Analysis with Python

Introduction to Data Analysis

  • What is Data Analysis?
  • Data Analytics and its type
  • How to Install Numpy on Windows?
  • How to Install Pandas in Python?
  • How to Install Matplotlib on python?
  • How to Install Python Tensorflow in Windows?

Data Analysis Libraries

  • Pandas Tutorial
  • NumPy Tutorial - Python Library
  • Data Analysis with SciPy
  • Introduction to TensorFlow

Data Visulization Libraries

  • Matplotlib Tutorial
  • Python Seaborn Tutorial
  • Plotly tutorial
  • Introduction to Bokeh in Python

Exploratory Data Analysis (EDA)

  • Univariate, Bivariate and Multivariate data and its analysis
  • Measures of Central Tendency in Statistics
  • Measures of spread - Range, Variance, and Standard Deviation
  • Interquartile Range and Quartile Deviation using NumPy and SciPy
  • Anova Formula
  • Skewness of Statistical Data
  • How to Calculate Skewness and Kurtosis in Python?
  • Difference Between Skewness and Kurtosis
  • Histogram | Meaning, Example, Types and Steps to Draw
  • Interpretations of Histogram
  • Quantile Quantile plots
  • What is Univariate, Bivariate & Multivariate Analysis in Data Visualisation?
  • Using pandas crosstab to create a bar plot
  • Exploring Correlation in Python
  • Mathematics | Covariance and Correlation
  • Factor Analysis | Data Analysis
  • Data Mining - Cluster Analysis
  • MANOVA Test in R Programming
  • Python - Central Limit Theorem
  • Probability Distribution Function
  • Probability Density Estimation & Maximum Likelihood Estimation
  • Exponential Distribution in R Programming - dexp(), pexp(), qexp(), and rexp() Functions
  • Mathematics | Probability Distributions Set 4 (Binomial Distribution)
  • Poisson Distribution - Definition, Formula, Table and Examples
  • P-Value: Comprehensive Guide to Understand, Apply, and Interpret
  • Z-Score in Statistics
  • How to Calculate Point Estimates in R?
  • Confidence Interval
  • Chi-square test in Machine Learning

Understanding Hypothesis Testing

Data preprocessing.

  • ML | Data Preprocessing in Python
  • ML | Overview of Data Cleaning
  • ML | Handling Missing Values
  • Detect and Remove the Outliers using Python

Data Transformation

  • Data Normalization Machine Learning
  • Sampling distribution Using Python

Time Series Data Analysis

  • Data Mining - Time-Series, Symbolic and Biological Sequences Data
  • Basic DateTime Operations in Python
  • Time Series Analysis & Visualization in Python
  • How to deal with missing values in a Timeseries in Python?
  • How to calculate MOVING AVERAGE in a Pandas DataFrame?
  • What is a trend in time series?
  • How to Perform an Augmented Dickey-Fuller Test in R
  • AutoCorrelation

Case Studies and Projects

  • Top 8 Free Dataset Sources to Use for Data Science Projects
  • Step by Step Predictive Analysis - Machine Learning
  • 6 Tips for Creating Effective Data Visualizations

Hypothesis testing involves formulating assumptions about population parameters based on sample statistics and rigorously evaluating these assumptions against empirical evidence. This article sheds light on the significance of hypothesis testing and the critical steps involved in the process.

What is Hypothesis Testing?

Hypothesis testing is a statistical method that is used to make a statistical decision using experimental data. Hypothesis testing is basically an assumption that we make about a population parameter. It evaluates two mutually exclusive statements about a population to determine which statement is best supported by the sample data. 

Example: You say an average height in the class is 30 or a boy is taller than a girl. All of these is an assumption that we are assuming, and we need some statistical way to prove these. We need some mathematical conclusion whatever we are assuming is true.

Defining Hypotheses

\mu

Key Terms of Hypothesis Testing

\alpha

  • P-value: The P value , or calculated probability, is the probability of finding the observed/extreme results when the null hypothesis(H0) of a study-given problem is true. If your P-value is less than the chosen significance level then you reject the null hypothesis i.e. accept that your sample claims to support the alternative hypothesis.
  • Test Statistic: The test statistic is a numerical value calculated from sample data during a hypothesis test, used to determine whether to reject the null hypothesis. It is compared to a critical value or p-value to make decisions about the statistical significance of the observed results.
  • Critical value : The critical value in statistics is a threshold or cutoff point used to determine whether to reject the null hypothesis in a hypothesis test.
  • Degrees of freedom: Degrees of freedom are associated with the variability or freedom one has in estimating a parameter. The degrees of freedom are related to the sample size and determine the shape.

Why do we use Hypothesis Testing?

Hypothesis testing is an important procedure in statistics. Hypothesis testing evaluates two mutually exclusive population statements to determine which statement is most supported by sample data. When we say that the findings are statistically significant, thanks to hypothesis testing. 

One-Tailed and Two-Tailed Test

One tailed test focuses on one direction, either greater than or less than a specified value. We use a one-tailed test when there is a clear directional expectation based on prior knowledge or theory. The critical region is located on only one side of the distribution curve. If the sample falls into this critical region, the null hypothesis is rejected in favor of the alternative hypothesis.

One-Tailed Test

There are two types of one-tailed test:

\mu \geq 50

Two-Tailed Test

A two-tailed test considers both directions, greater than and less than a specified value.We use a two-tailed test when there is no specific directional expectation, and want to detect any significant difference.

\mu =

What are Type 1 and Type 2 errors in Hypothesis Testing?

In hypothesis testing, Type I and Type II errors are two possible errors that researchers can make when drawing conclusions about a population based on a sample of data. These errors are associated with the decisions made regarding the null hypothesis and the alternative hypothesis.

\alpha

How does Hypothesis Testing work?

Step 1: define null and alternative hypothesis.

H_0

We first identify the problem about which we want to make an assumption keeping in mind that our assumption should be contradictory to one another, assuming Normally distributed data.

Step 2 – Choose significance level

\alpha

Step 3 – Collect and Analyze data.

Gather relevant data through observation or experimentation. Analyze the data using appropriate statistical methods to obtain a test statistic.

Step 4-Calculate Test Statistic

The data for the tests are evaluated in this step we look for various scores based on the characteristics of data. The choice of the test statistic depends on the type of hypothesis test being conducted.

There are various hypothesis tests, each appropriate for various goal to calculate our test. This could be a Z-test , Chi-square , T-test , and so on.

  • Z-test : If population means and standard deviations are known. Z-statistic is commonly used.
  • t-test : If population standard deviations are unknown. and sample size is small than t-test statistic is more appropriate.
  • Chi-square test : Chi-square test is used for categorical data or for testing independence in contingency tables
  • F-test : F-test is often used in analysis of variance (ANOVA) to compare variances or test the equality of means across multiple groups.

We have a smaller dataset, So, T-test is more appropriate to test our hypothesis.

T-statistic is a measure of the difference between the means of two groups relative to the variability within each group. It is calculated as the difference between the sample means divided by the standard error of the difference. It is also known as the t-value or t-score.

Step 5 – Comparing Test Statistic:

In this stage, we decide where we should accept the null hypothesis or reject the null hypothesis. There are two ways to decide where we should accept or reject the null hypothesis.

Method A: Using Crtical values

Comparing the test statistic and tabulated critical value we have,

  • If Test Statistic>Critical Value: Reject the null hypothesis.
  • If Test Statistic≤Critical Value: Fail to reject the null hypothesis.

Note: Critical values are predetermined threshold values that are used to make a decision in hypothesis testing. To determine critical values for hypothesis testing, we typically refer to a statistical distribution table , such as the normal distribution or t-distribution tables based on.

Method B: Using P-values

We can also come to an conclusion using the p-value,

p\leq\alpha

Note : The p-value is the probability of obtaining a test statistic as extreme as, or more extreme than, the one observed in the sample, assuming the null hypothesis is true. To determine p-value for hypothesis testing, we typically refer to a statistical distribution table , such as the normal distribution or t-distribution tables based on.

Step 7- Interpret the Results

At last, we can conclude our experiment using method A or B.

Calculating test statistic

To validate our hypothesis about a population parameter we use statistical functions . We use the z-score, p-value, and level of significance(alpha) to make evidence for our hypothesis for normally distributed data .

1. Z-statistics:

When population means and standard deviations are known.

z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}

  • μ represents the population mean, 
  • σ is the standard deviation
  • and n is the size of the sample.

2. T-Statistics

T test is used when n<30,

t-statistic calculation is given by:

t=\frac{x̄-μ}{s/\sqrt{n}}

  • t = t-score,
  • x̄ = sample mean
  • μ = population mean,
  • s = standard deviation of the sample,
  • n = sample size

3. Chi-Square Test

Chi-Square Test for Independence categorical Data (Non-normally distributed) using:

\chi^2 = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}}

  • i,j are the rows and columns index respectively.

E_{ij}

Real life Hypothesis Testing example

Let’s examine hypothesis testing using two real life situations,

Case A: D oes a New Drug Affect Blood Pressure?

Imagine a pharmaceutical company has developed a new drug that they believe can effectively lower blood pressure in patients with hypertension. Before bringing the drug to market, they need to conduct a study to assess its impact on blood pressure.

  • Before Treatment: 120, 122, 118, 130, 125, 128, 115, 121, 123, 119
  • After Treatment: 115, 120, 112, 128, 122, 125, 110, 117, 119, 114

Step 1 : Define the Hypothesis

  • Null Hypothesis : (H 0 )The new drug has no effect on blood pressure.
  • Alternate Hypothesis : (H 1 )The new drug has an effect on blood pressure.

Step 2: Define the Significance level

Let’s consider the Significance level at 0.05, indicating rejection of the null hypothesis.

If the evidence suggests less than a 5% chance of observing the results due to random variation.

Step 3 : Compute the test statistic

Using paired T-test analyze the data to obtain a test statistic and a p-value.

The test statistic (e.g., T-statistic) is calculated based on the differences between blood pressure measurements before and after treatment.

t = m/(s/√n)

  • m  = mean of the difference i.e X after, X before
  • s  = standard deviation of the difference (d) i.e d i ​= X after, i ​− X before,
  • n  = sample size,

then, m= -3.9, s= 1.8 and n= 10

we, calculate the , T-statistic = -9 based on the formula for paired t test

Step 4: Find the p-value

The calculated t-statistic is -9 and degrees of freedom df = 9, you can find the p-value using statistical software or a t-distribution table.

thus, p-value = 8.538051223166285e-06

Step 5: Result

  • If the p-value is less than or equal to 0.05, the researchers reject the null hypothesis.
  • If the p-value is greater than 0.05, they fail to reject the null hypothesis.

Conclusion: Since the p-value (8.538051223166285e-06) is less than the significance level (0.05), the researchers reject the null hypothesis. There is statistically significant evidence that the average blood pressure before and after treatment with the new drug is different.

Python Implementation of Hypothesis Testing

Let’s create hypothesis testing with python, where we are testing whether a new drug affects blood pressure. For this example, we will use a paired T-test. We’ll use the scipy.stats library for the T-test.

Scipy is a mathematical library in Python that is mostly used for mathematical equations and computations.

We will implement our first real life problem via python,

In the above example, given the T-statistic of approximately -9 and an extremely small p-value, the results indicate a strong case to reject the null hypothesis at a significance level of 0.05. 

  • The results suggest that the new drug, treatment, or intervention has a significant effect on lowering blood pressure.
  • The negative T-statistic indicates that the mean blood pressure after treatment is significantly lower than the assumed population mean before treatment.

Case B : Cholesterol level in a population

Data: A sample of 25 individuals is taken, and their cholesterol levels are measured.

Cholesterol Levels (mg/dL): 205, 198, 210, 190, 215, 205, 200, 192, 198, 205, 198, 202, 208, 200, 205, 198, 205, 210, 192, 205, 198, 205, 210, 192, 205.

Populations Mean = 200

Population Standard Deviation (σ): 5 mg/dL(given for this problem)

Step 1: Define the Hypothesis

  • Null Hypothesis (H 0 ): The average cholesterol level in a population is 200 mg/dL.
  • Alternate Hypothesis (H 1 ): The average cholesterol level in a population is different from 200 mg/dL.

As the direction of deviation is not given , we assume a two-tailed test, and based on a normal distribution table, the critical values for a significance level of 0.05 (two-tailed) can be calculated through the z-table and are approximately -1.96 and 1.96.

(203.8 - 200) / (5 \div \sqrt{25})

Step 4: Result

Since the absolute value of the test statistic (2.04) is greater than the critical value (1.96), we reject the null hypothesis. And conclude that, there is statistically significant evidence that the average cholesterol level in the population is different from 200 mg/dL

Limitations of Hypothesis Testing

  • Although a useful technique, hypothesis testing does not offer a comprehensive grasp of the topic being studied. Without fully reflecting the intricacy or whole context of the phenomena, it concentrates on certain hypotheses and statistical significance.
  • The accuracy of hypothesis testing results is contingent on the quality of available data and the appropriateness of statistical methods used. Inaccurate data or poorly formulated hypotheses can lead to incorrect conclusions.
  • Relying solely on hypothesis testing may cause analysts to overlook significant patterns or relationships in the data that are not captured by the specific hypotheses being tested. This limitation underscores the importance of complimenting hypothesis testing with other analytical approaches.

Hypothesis testing stands as a cornerstone in statistical analysis, enabling data scientists to navigate uncertainties and draw credible inferences from sample data. By systematically defining null and alternative hypotheses, choosing significance levels, and leveraging statistical tests, researchers can assess the validity of their assumptions. The article also elucidates the critical distinction between Type I and Type II errors, providing a comprehensive understanding of the nuanced decision-making process inherent in hypothesis testing. The real-life example of testing a new drug’s effect on blood pressure using a paired T-test showcases the practical application of these principles, underscoring the importance of statistical rigor in data-driven decision-making.

Frequently Asked Questions (FAQs)

1. what are the 3 types of hypothesis test.

There are three types of hypothesis tests: right-tailed, left-tailed, and two-tailed. Right-tailed tests assess if a parameter is greater, left-tailed if lesser. Two-tailed tests check for non-directional differences, greater or lesser.

2.What are the 4 components of hypothesis testing?

Null Hypothesis ( ): No effect or difference exists. Alternative Hypothesis ( ): An effect or difference exists. Significance Level ( ): Risk of rejecting null hypothesis when it’s true (Type I error). Test Statistic: Numerical value representing observed evidence against null hypothesis.

3.What is hypothesis testing in ML?

Statistical method to evaluate the performance and validity of machine learning models. Tests specific hypotheses about model behavior, like whether features influence predictions or if a model generalizes well to unseen data.

4.What is the difference between Pytest and hypothesis in Python?

Pytest purposes general testing framework for Python code while Hypothesis is a Property-based testing framework for Python, focusing on generating test cases based on specified properties of the code.

Please Login to comment...

Similar reads.

  • data-science
  • Data Science
  • Machine Learning
  • CBSE Exam Format Changed for Class 11-12: Focus On Concept Application Questions
  • 10 Best Waze Alternatives in 2024 (Free)
  • 10 Best Squarespace Alternatives in 2024 (Free)
  • Top 10 Owler Alternatives & Competitors in 2024
  • 30 OOPs Interview Questions and Answers (2024)

Improve your Coding Skills with Practice

 alt=

What kind of Experience do you want to share?

  • Hypothesis testing in statistics

Real flavour of statistics. Hypothesis testing in real life.

Parvesh Kumar

Hypothesis testing is a fundamental concept in statistics, which is often raised during maths tuition , and plays a crucial role in real-life scenarios. In this blog post, we will discuss the importance of hypothesis testing and its applications in real-life situations.

Hypothesis testing is a statistical method used to determine whether the observed data is likely to have occurred by chance or whether it represents a true difference between groups or variables. In other words, it is a way of evaluating the validity of a hypothesis by comparing it with data from the real world.

Find tutors

When is hypothesis testing used?

  • Medical research:

One of the most common applications of hypothesis testing is in medical research. Clinical trials often use hypothesis testing to determine whether a new treatment is more effective than an existing one. For example, suppose a pharmaceutical company has developed a new drug to treat a particular illness. In that case, they must conduct a clinical trial to determine whether the drug is effective and safe for human use. Hypothesis testing can be used to compare the effectiveness of the new drug with the existing treatment.

  • Market research:

Another application of hypothesis testing is in market research. Companies often use hypothesis testing to determine whether a new product or advertising campaign is likely to be successful. For example, a company may conduct a survey to determine whether customers prefer a new product over an existing one. Hypothesis testing can be used to analyse the survey results and determine whether the new product is likely to be successful.

  • Environmental research:

Hypothesis testing is also used in environmental research. Scientists may use hypothesis testing to determine whether a particular pollutant is affecting a particular species or ecosystem. For example, scientists may hypothesize that a chemical spill in a river is causing a decline in the fish population. Hypothesis testing can be used to analyse data from the river and determine whether the hypothesis is supported by the data.

In conclusion, hypothesis testing is a vital tool for making informed decisions in real-life situations. From medical research to market research to environmental research, hypothesis testing plays a crucial role.

If you require further assistance with probability,   statistsics and hypothesis testing, I have a background in maths and experience as a maths tutor . Feel free to get in touch for private maths lessons .

Articles by other tutors

Latest published articles.

  • Search Search Please fill out this field.

What Is Hypothesis Testing?

Step 1: define the hypothesis, step 2: set the criteria, step 3: calculate the statistic, step 4: reach a conclusion, types of errors, the bottom line.

  • Trading Skills
  • Trading Basic Education

Hypothesis Testing in Finance: Concept and Examples

Charlene Rhinehart is a CPA , CFE, chair of an Illinois CPA Society committee, and has a degree in accounting and finance from DePaul University.

real life example of hypothesis testing

Your investment advisor proposes you a monthly income investment plan that promises a variable return each month. You will invest in it only if you are assured of an average $180 monthly income. Your advisor also tells you that for the past 300 months, the scheme had investment returns with an average value of $190 and a standard deviation of $75. Should you invest in this scheme? Hypothesis testing comes to the aid for such decision-making.

Key Takeaways

  • Hypothesis testing is a mathematical tool for confirming a financial or business claim or idea.
  • Hypothesis testing is useful for investors trying to decide what to invest in and whether the instrument is likely to provide a satisfactory return.
  • Despite the existence of different methodologies of hypothesis testing, the same four steps are used: define the hypothesis, set the criteria, calculate the statistic, and reach a conclusion.
  • This mathematical model, like most statistical tools and models, has limitations and is prone to certain errors, necessitating investors also considering other models in conjunction with this one

Hypothesis or significance testing is a mathematical model for testing a claim, idea or hypothesis about a parameter of interest in a given population set, using data measured in a sample set. Calculations are performed on selected samples to gather more decisive information about the characteristics of the entire population, which enables a systematic way to test claims or ideas about the entire dataset.

Here is a simple example: A school principal reports that students in their school score an average of 7 out of 10 in exams. To test this “hypothesis,” we record marks of say 30 students (sample) from the entire student population of the school (say 300) and calculate the mean of that sample. We can then compare the (calculated) sample mean to the (reported) population mean and attempt to confirm the hypothesis.

To take another example, the annual return of a particular mutual fund is 8%. Assume that mutual fund has been in existence for 20 years. We take a random sample of annual returns of the mutual fund for, say, five years (sample) and calculate its mean. We then compare the (calculated) sample mean to the (claimed) population mean to verify the hypothesis.

This article assumes readers' familiarity with concepts of a normal distribution table, formula, p-value and related basics of statistics.

Different methodologies exist for hypothesis testing, but the same four basic steps are involved:

Usually, the reported value (or the claim statistics) is stated as the hypothesis and presumed to be true. For the above examples, the hypothesis will be:

  • Example A: Students in the school score an average of 7 out of 10 in exams.
  • Example B: The annual return of the mutual fund is 8% per annum.

This stated description constitutes the “ Null Hypothesis (H 0 ) ” and is  assumed  to be true – the way a defendant in a jury trial is presumed innocent until proven guilty by the evidence presented in court. Similarly, hypothesis testing starts by stating and assuming a “ null hypothesis ,” and then the process determines whether the assumption is likely to be true or false.

The important point to note is that we are testing the null hypothesis because there is an element of doubt about its validity. Whatever information that is against the stated null hypothesis is captured in the  Alternative Hypothesis (H 1 ).  For the above examples, the alternative hypothesis will be:

  • Students score an average that is not equal to 7.
  • The annual return of the mutual fund is not equal to 8% per annum.

In other words, the alternative hypothesis is a direct contradiction of the null hypothesis.

As in a trial, the jury assumes the defendant's innocence (null hypothesis). The prosecutor has to prove otherwise (alternative hypothesis). Similarly, the researcher has to prove that the null hypothesis is either true or false. If the prosecutor fails to prove the alternative hypothesis, the jury has to let the defendant go (basing the decision on the null hypothesis). Similarly, if the researcher fails to prove an alternative hypothesis (or simply does nothing), then the null hypothesis is assumed to be true.

The decision-making criteria have to be based on certain parameters of datasets.

The decision-making criteria have to be based on certain parameters of datasets and this is where the connection to normal distribution comes into the picture.

As per the standard statistics postulate  about sampling distribution , “For any sample size n, the sampling distribution of X̅ is normal if the population X from which the sample is drawn is normally distributed.” Hence, the probabilities of all other possible sample mean that one could select are normally distributed.

For e.g., determine if the average daily return, of any stock listed on XYZ stock market , around New Year's Day is greater than 2%.

H 0 : Null Hypothesis: mean = 2%

H 1 : Alternative Hypothesis: mean > 2% (this is what we want to prove)

Take the sample (say of 50 stocks out of total 500) and compute the mean of the sample.

For a normal distribution, 95% of the values lie within two standard deviations of the population mean. Hence, this normal distribution and central limit assumption for the sample dataset allows us to establish 5% as a significance level. It makes sense as, under this assumption, there is less than a 5% probability (100-95) of getting outliers that are beyond two standard deviations from the population mean. Depending upon the nature of datasets, other significance levels can be taken at 1%, 5% or 10%. For financial calculations (including behavioral finance), 5% is the generally accepted limit. If we find any calculations that go beyond the usual two standard deviations, then we have a strong case of outliers to reject the null hypothesis.  

Graphically, it is represented as follows:

In the above example, if the mean of the sample is much larger than 2% (say 3.5%), then we reject the null hypothesis. The alternative hypothesis (mean >2%) is accepted, which confirms that the average daily return of the stocks is indeed above 2%.

However, if the mean of the sample is not likely to be significantly greater than 2% (and remains at, say, around 2.2%), then we CANNOT reject the null hypothesis. The challenge comes on how to decide on such close range cases. To make a conclusion from selected samples and results, a level of significance is to be determined, which enables a conclusion to be made about the null hypothesis. The alternative hypothesis enables establishing the level of significance or the "critical value” concept for deciding on such close range cases.

According to the textbook standard definition , “A critical value is a cutoff value that defines the boundaries beyond which less than 5% of sample means can be obtained if the null hypothesis is true. Sample means obtained beyond a critical value will result in a decision to reject the null hypothesis."   In the above example, if we have defined the critical value as 2.1%, and the calculated mean comes to 2.2%, then we reject the null hypothesis. A critical value establishes a clear demarcation about acceptance or rejection.

This step involves calculating the required figure(s), known as test statistics (like mean, z-score , p-value , etc.), for the selected sample. (We'll get to these in a later section.)

With the computed value(s), decide on the null hypothesis. If the probability of getting a sample mean is less than 5%, then the conclusion is to reject the null hypothesis. Otherwise, accept and retain the null hypothesis.

There can be four possible outcomes in sample-based decision-making, with regard to the correct applicability to the entire population:

The “Correct” cases are the ones where the decisions taken on the samples are truly applicable to the entire population. The cases of errors arise when one decides to retain (or reject) the null hypothesis based on the sample calculations, but that decision does not really apply for the entire population. These cases constitute Type 1 ( alpha ) and Type 2 ( beta ) errors, as indicated in the table above.

Selecting the correct critical value allows eliminating the type-1 alpha errors or limiting them to an acceptable range.

Alpha denotes the error on the level of significance and is determined by the researcher. To maintain the standard 5% significance or confidence level for probability calculations, this is retained at 5%.

According to the applicable decision-making benchmarks and definitions:

  • “This (alpha) criterion is usually set at 0.05 (a = 0.05), and we compare the alpha level to the p-value. When the probability of a Type I error is less than 5% (p < 0.05), we decide to reject the null hypothesis; otherwise, we retain the null hypothesis.”  
  • The technical term used for this probability is the p-value . It is defined as “the probability of obtaining a sample outcome, given that the value stated in the null hypothesis is true. The p-value for obtaining a sample outcome is compared to the level of significance."  
  • A Type II error, or beta error, is defined as the probability of incorrectly retaining the null hypothesis, when in fact it is not applicable to the entire population.  

A few more examples will demonstrate this and other calculations.

A monthly income investment scheme exists that promises variable monthly returns. An investor will invest in it only if they are assured of an average $180 monthly income. The investor has a sample of 300 months’ returns which has a mean of $190 and a standard deviation of $75. Should they invest in this scheme?

Let’s set up the problem. The investor will invest in the scheme if they are assured of the investor's desired $180 average return.

H 0 : Null Hypothesis: mean = 180

H 1 : Alternative Hypothesis: mean > 180

Method 1: Critical Value Approach

Identify a critical value X L for the sample mean, which is large enough to reject the null hypothesis – i.e. reject the null hypothesis if the sample mean >= critical value X L

P (identify a Type I alpha error) = P(reject H 0  given that H 0  is true),

This would be achieved when the sample mean exceeds the critical limits.

= P (given that H 0  is true) = alpha

Graphically, it appears as follows:

Taking alpha = 0.05 (i.e. 5% significance level), Z 0.05  = 1.645 (from the Z-table or normal distribution table)

           = > X L  = 180 +1.645*(75/sqrt(300)) = 187.12

Since the sample mean (190) is greater than the critical value (187.12), the null hypothesis is rejected, and the conclusion is that the average monthly return is indeed greater than $180, so the investor can consider investing in this scheme.

Method 2: Using Standardized Test Statistics

One can also use standardized value z.

Test Statistic, Z = (sample mean – population mean) / (std-dev / sqrt (no. of samples).

Then, the rejection region becomes the following:

Z= (190 – 180) / (75 / sqrt (300)) = 2.309

Our rejection region at 5% significance level is Z> Z 0.05  = 1.645.

Since Z= 2.309 is greater than 1.645, the null hypothesis can be rejected with a similar conclusion mentioned above.

Method 3: P-value Calculation

We aim to identify P (sample mean >= 190, when mean = 180).

= P (Z >= (190- 180) / (75 / sqrt (300))

= P (Z >= 2.309) = 0.0084 = 0.84%

The following table to infer p-value calculations concludes that there is confirmed evidence of average monthly returns being higher than 180:

A new stockbroker (XYZ) claims that their brokerage fees are lower than that of your current stock broker's (ABC). Data available from an independent research firm indicates that the mean and std-dev of all ABC broker clients are $18 and $6, respectively.

A sample of 100 clients of ABC is taken and brokerage charges are calculated with the new rates of XYZ broker. If the mean of the sample is $18.75 and std-dev is the same ($6), can any inference be made about the difference in the average brokerage bill between ABC and XYZ broker?

H 0 : Null Hypothesis: mean = 18

H 1 : Alternative Hypothesis: mean <> 18 (This is what we want to prove.)

Rejection region: Z <= - Z 2.5  and Z>=Z 2.5  (assuming 5% significance level, split 2.5 each on either side).

Z = (sample mean – mean) / (std-dev / sqrt (no. of samples))

= (18.75 – 18) / (6/(sqrt(100)) = 1.25

This calculated Z value falls between the two limits defined by:

- Z 2.5  = -1.96 and Z 2.5  = 1.96.

This concludes that there is insufficient evidence to infer that there is any difference between the rates of your existing broker and the new broker.

Alternatively, The p-value = P(Z< -1.25)+P(Z >1.25)

= 2 * 0.1056 = 0.2112 = 21.12% which is greater than 0.05 or 5%, leading to the same conclusion.

Graphically, it is represented by the following:

Criticism Points for the Hypothetical Testing Method:

  • A statistical method based on assumptions
  • Error-prone as detailed in terms of alpha and beta errors
  • Interpretation of p-value can be ambiguous, leading to confusing results

Hypothesis testing allows a mathematical model to validate a claim or idea with a certain confidence level. However, like the majority of statistical tools and models, it is bound by a few limitations. The use of this model for making financial decisions should be considered with a critical eye, keeping all dependencies in mind. Alternate methods like  Bayesian Inference are also worth exploring for similar analysis.

Sage Publications. " Introduction to Hypothesis Testing ," Page 13.

Sage Publications. " Introduction to Hypothesis Testing ," Page 11.

Sage Publications. " Introduction to Hypothesis Testing ," Page 7.

Sage Publications. " Introduction to Hypothesis Testing ," Pages 10-11.

real life example of hypothesis testing

  • Terms of Service
  • Editorial Policy
  • Privacy Policy
  • Your Privacy Choices

MarketSplash

How To Conduct Hypothesis Testing In R For Effective Data Analysis

Learn the essentials of hypothesis testing in R, a crucial skill for developers. This article guides you through setting up your environment, formulating hypotheses, executing tests, and interpreting results with practical examples

💡 KEY INSIGHTS

  • Hypothesis testing involves using a random population sample to test the null and alternative hypotheses , where the null hypothesis typically represents equality between population parameters​​.
  • The null hypothesis (H0) assumes no event occurrence and is critical unless rejected, while the alternate hypothesis (H1) is its logical opposite and is considered upon the rejection of H0​​.
  • The p-value is a crucial metric in hypothesis testing, indicating the likelihood of an observed difference occurring by chance; a lower p-value suggests a higher probability of the alternate hypothesis being true​​.
  • Hypothesis testing is significant in research methodology as it provides evidence-based conclusions , supports decision-making , adds rigor and validity , and contributes to the advancement of knowledge in various fields​​.

Hypothesis testing in R is a fundamental skill for programmers and developers looking to analyze and interpret data effectively. This article guides you through the essential steps and techniques, using R's robust statistical tools. Whether you're new to R or seeking to refine your data analysis skills, these insights will enhance your ability to make data-driven decisions.

real life example of hypothesis testing

Setting Up Your R Environment

Formulating and testing your hypothesis, interpreting test results, frequently asked questions.

Before diving into hypothesis testing, ensure you have R and RStudio installed. R is the programming language used for statistical computing, while RStudio provides an integrated development environment (IDE) to work with R. Download R from CRAN and RStudio from RStudio's website.

Configuring Your Workspace

Installing necessary packages, loading data into r, exploratory data analysis, basic data manipulation.

After installation, open RStudio and set up your workspace. This involves organizing your scripts, data files, and outputs. Use setwd() to define your working directory:

R's functionality is extended through packages. For hypothesis testing, packages like ggplot2 for data visualization and stats for statistical functions are essential. Install packages using install.packages() :

After installation, load them into your session using library() :

Data can be loaded into R using various functions depending on the file format. For a CSV file, use read.csv() :

Before hypothesis testing, it's crucial to understand your data. Use summary functions and visualization to explore:

Data often requires cleaning and manipulation. Functions like subset() and transform() are useful:

These commands help in refining your dataset, making it ready for hypothesis testing.

The first step in hypothesis testing is to Formulate a Clear Hypothesis . This typically involves stating a null hypothesis (H0) that indicates no effect or no difference, and an alternative hypothesis (H1) that suggests the presence of an effect or a difference.

Null And Alternative Hypothesis

Choosing the right test, t-test example, interpreting the results, analyzing the output, visualizing the data.

For example, if you're testing whether a new programming tool improves efficiency:

  • H0: The tool does not improve efficiency.
  • H1: The tool improves efficiency.

Selecting an appropriate statistical test is crucial. The choice depends on your data type and the nature of your hypothesis. Common tests include t-tests, chi-square tests, and ANOVA.

If you're comparing means between two groups, a t-test is appropriate. In R, use t.test() :

The output of t.test() includes the P-Value , which helps determine the significance of your results. A p-value lower than your significance level (commonly 0.05) indicates that you can reject the null hypothesis.

After running t.test() , analyze the output:

  • P-Value : Indicates the probability of observing your data if the null hypothesis is true.
  • Confidence Interval : Provides a range in which the true mean difference likely lies.

Visualizing your data can provide additional insights. For instance, use ggplot2 to create a plot that compares the groups:

Understanding P-Values

Interpreting confidence intervals, effect size, calculating and interpreting effect size, creating a plot for results.

The P-Value is central in interpreting hypothesis test results. It represents the probability of observing your data, or something more extreme, if the null hypothesis is true. A small p-value (typically ≤ 0.05) suggests that the observed data is unlikely under the null hypothesis, leading to its rejection.

Evaluating Significance

When you run a test, R provides a p-value:

Confidence Intervals

Confidence Intervals offer a range of values within which the true parameter value lies with a certain level of confidence (usually 95%). Narrow intervals indicate more precise estimates.

From your test output, extract and examine the confidence interval:

While p-values indicate whether an effect exists, the Effect Size measures its magnitude. It's crucial for understanding the practical significance of your results.

For a t-test, you might calculate Cohen's d:

For instance, create a plot to visualize the difference:

What is Effect Size and Why is it Important?

Effect size is a quantitative measure of the magnitude of the experimental effect. Unlike p-values, which tell you if an effect exists, effect size tells you how large that effect is. It's important for understanding the practical significance of your results.

How Do I Interpret a Confidence Interval?

A confidence interval gives a range of values within which the true value is likely to lie. For example, a 95% confidence interval means that if the same study were repeated many times, 95% of the intervals would contain the true value.

What Does 'Rejecting the Null Hypothesis' Mean in Practical Terms?

Rejecting the null hypothesis suggests that there is enough statistical evidence to support the alternative hypothesis. In practical terms, it means that the observed effect or difference is unlikely to be due to chance.

Can I Perform Hypothesis Testing on Non-Numeric Data?

Yes, you can perform hypothesis testing on non-numeric (categorical) data. Tests like the Chi-Square test are designed for categorical data and can test hypotheses about proportions or frequencies.

Let’s test your knowledge!

What is the function used in R to perform a t-test?

Continue learning with these 'programming' guides.

  • How To Debug In R: Effective Strategies For Developers
  • How To Use R For Simulation: Effective Strategies And Techniques
  • How To Install R Packages: Steps For Efficient Integration
  • How To Import Data In R: Essential Steps For Efficient Data Analysis
  • How To Clean Data In R: Essential Techniques For Effective Data Management

Subscribe to our newsletter

Subscribe to be notified of new content on marketsplash..

IMAGES

  1. 8 Hypothesis Testing Examples in Real Life

    real life example of hypothesis testing

  2. Hypothesis Testing Solved Examples(Questions and Solutions)

    real life example of hypothesis testing

  3. Hypothesis to Be Tested: Definition and 4 Steps for Testing with Example

    real life example of hypothesis testing

  4. Hypothesis Testing Steps & Examples

    real life example of hypothesis testing

  5. How To Write A Hypothesis Test Statistics

    real life example of hypothesis testing

  6. Hypothesis Testing Solved Problems

    real life example of hypothesis testing

VIDEO

  1. Two-Sample Hypothesis Testing: Dependent Sample

  2. Hypothesis Testing

  3. Hypothesis Testing

  4. Hypothesis Testing for Population Mean (Large sample, Z test) (Hindi/Urdu)

  5. Hypothesis Testing for Population Proportion Using Rejection Region and P-value (Cell Phone Example)

  6. HYPOTHESIS TESTING CONCEPT AND EXAMPLE #shorts #statistics #data #datanalysis #analysis #hypothesis

COMMENTS

  1. 4 Examples of Hypothesis Testing in Real Life

    In statistics, hypothesis tests are used to test whether or not some hypothesis about a population parameter is true. To perform a hypothesis test in the real world, researchers will obtain a random sample from the population and perform a hypothesis test on the sample data, using a null and alternative hypothesis:. Null Hypothesis (H 0): The sample data occurs purely from chance.

  2. 8 Hypothesis Testing Examples in Real Life

    The hypothesis testing broadly involves the following steps, Step 1: Formulate the research hypothesis and the null hypothesis of the experiment. Step 2: Set the characteristics of the comparison distribution. Step3: Set the criterion for decision making, i.e., cut off sample score for the comparison to reject or retain the null hypothesis.

  3. Hypothesis Testing In Real Life

    Hypothesis Testing. Hypothesis Tests, or Statistical Hypothesis Testing, is a technique used to compare two datasets, or a sample from a dataset. It is a statistical inference method so, in the end of the test, you'll draw a conclusion — you'll infer something — about the characteristics of what you're comparing.

  4. PDF Hypothesis Testing

    Example 3: Public Opinion About President Step 1. Determine the null and alternative hypotheses. Null hypothesis: There is no clear winning opinion on this issue; the proportions who would answer yes or no are each 0.50. Alternative hypothesis: Fewer than 0.50, or 50%, of the population would answer yes to this question.

  5. Hypothesis Testing

    There are 5 main steps in hypothesis testing: State your research hypothesis as a null hypothesis and alternate hypothesis (H o) and (H a or H 1 ). Collect data in a way designed to test the hypothesis. Perform an appropriate statistical test. Decide whether to reject or fail to reject your null hypothesis. Present the findings in your results ...

  6. Hypothesis Testing Steps & Examples

    Hypothesis testing is a technique that helps scientists, researchers, or for that matter, anyone test the validity of their claims or hypotheses about real-world or real-life events in order to establish new knowledge. Hypothesis testing techniques are often used in statistics and data science to analyze whether the claims about the occurrence of the events are true, whether the results ...

  7. 4 Examples of Hypothesis Testing in Real Life

    In statistics, hypothesis tests are used to test whether or not some hypothesis about a population parameter is true. To perform a hypothesis test in the real world, researchers will obtain a random sample from the population and perform a hypothesis test on the sample data, using a null and alternative hypothesis:. Null Hypothesis (H 0): The sample data occurs purely from chance.

  8. S.3.3 Hypothesis Testing Examples

    If the biologist set her significance level \(\alpha\) at 0.05 and used the critical value approach to conduct her hypothesis test, she would reject the null hypothesis if her test statistic t* were less than -1.6939 (determined using statistical software or a t-table):s-3-3. Since the biologist's test statistic, t* = -4.60, is less than -1.6939, the biologist rejects the null hypothesis.

  9. Introduction to Hypothesis Testing with Examples

    Likelihood ratio. In the likelihood ratio test, we reject the null hypothesis if the ratio is above a certain value i.e, reject the null hypothesis if L(X) > 𝜉, else accept it. 𝜉 is called the critical ratio.. So this is how we can draw a decision boundary: we separate the observations for which the likelihood ratio is greater than the critical ratio from the observations for which it ...

  10. Hypothesis Testing in the Real World

    To substantiate that, I present examples showing that hypothesis testing logic is routinely used in everyday life. These same examples also refute (b) by showing circumstances in which the logic of hypothesis testing addresses a question of prime interest. Null hypothesis significance testing may sometimes be misunderstood or misapplied, but ...

  11. What is Hypothesis Testing in Statistics? Types and Examples

    Hypothesis Testing is a type of statistical analysis in which you put your assumptions about a population parameter to the test. It is used to estimate the relationship between 2 statistical variables. Let's discuss few examples of statistical hypothesis from real-life -

  12. Understand Hypothesis testing with real-life examples

    There are mainly two ways to calculate hypothesis those are t-test and z-test. t-test equation. t-test equation is: t = ( x̅ - μ0 ) / (s / √n) Where x̅ is sample mean. μ0 is already proven value (you can say null hypothesis value) s sample standard deviation. n is the sample size. z-test equation.

  13. Hypothesis Testing

    Step 2: State the Alternate Hypothesis. The claim is that the students have above average IQ scores, so: H 1: μ > 100. The fact that we are looking for scores "greater than" a certain point means that this is a one-tailed test. Step 3: Draw a picture to help you visualize the problem. Step 4: State the alpha level.

  14. 4 Examples of Hypothesis Testing in Real Life

    In statistics, hypothesis tests are used to test whether instead not several hypothesis learn a population parameter is true. To perform a hypo test in the real world, researchers will obtain a random sample from the population and run a hypothesis test on this sample data, with adenine null and alternative hypothesis:. Null Hypotheses (H 0): The example data occurs rein from chance.

  15. Examples of Hypothesis Tests: Busting Myths about the Battle of the

    For our first example of a hypothesis test, we'll test the myth that women multitask better than men. To determine whether this is true, ten men and ten women perform a standard set of tasks that require multitasking. The Mythbusters create a scoring system that measures how well each subject performs the tasks.

  16. Hypothesis Testing In real life

    Hypothesis Tests, or Statistical Hypothesis Testing, is a technique used to compare two datasets or a sample from a dataset. It is a statistical inference method so, in the end of the test, you'll draw a conclusion — you'll infer something — about the characteristics of what you're comparing. Before even thinking about what test you are going to use, you need to define your hypothesis to ...

  17. 8.4: Hypothesis Test Examples for Proportions

    Example 8.4.7. Joon believes that 50% of first-time brides in the United States are younger than their grooms. She performs a hypothesis test to determine if the percentage is the same or different from 50%. Joon samples 100 first-time brides and 53 reply that they are younger than their grooms.

  18. Hypothesis Testing In Real Life

    The video attempts to explain the application of statistical hypothesis testing to real life scenarios. In this attempt it also aims at clarifying misconcept...

  19. Statistical Inference: Definition, Methods & Example

    Statistical inference is the process of using a sample to infer the properties of a population. Statistical procedures use sample data to estimate the characteristics of the whole population from which the sample was drawn. Scientists typically want to learn about a population. When studying a phenomenon, such as the effects of a new medication ...

  20. Type I & Type II Errors

    Example: Null and alternative hypothesis. You test whether a new drug intervention can alleviate symptoms of an autoimmune disease. In this case: The null hypothesis (H 0) is that the new drug has no effect on symptoms of the disease. The alternative hypothesis (H 1) is that the drug is effective for alleviating symptoms of the disease.

  21. Understanding Hypothesis Testing

    The article also elucidates the critical distinction between Type I and Type II errors, providing a comprehensive understanding of the nuanced decision-making process inherent in hypothesis testing. The real-life example of testing a new drug's effect on blood pressure using a paired T-test showcases the practical application of these ...

  22. Real flavour of statistics. Hypothesis testing in real life

    Hypothesis testing is a fundamental concept in statistics, which is often raised during maths tuition, and plays a crucial role in real-life scenarios.In this blog post, we will discuss the importance of hypothesis testing and its applications in real-life situations. Hypothesis testing is a statistical method used to determine whether the observed data is likely to have occurred by chance or ...

  23. Hypothesis Testing in Finance: Concept and Examples

    Step 1: Define the Hypothesis. Usually, the reported value (or the claim statistics) is stated as the hypothesis and presumed to be true. For the above examples, the hypothesis will be: Example A ...

  24. How To Conduct Hypothesis Testing In R For Effective ...

    💡 KEY INSIGHTS; Hypothesis testing involves using a random population sample to test the null and alternative hypotheses, where the null hypothesis typically represents equality between population parameters .; The null hypothesis (H0) assumes no event occurrence and is critical unless rejected, while the alternate hypothesis (H1) is its logical opposite and is considered upon the rejection ...