8.8 Hypothesis Tests for a Population Proportion

Learning Objectives

  • Conduct and interpret hypothesis tests for a population proportion.

Some notes about conducting a hypothesis test:

  • The null hypothesis [latex]H_0[/latex] is always an “equal to.”  The null hypothesis is the original claim about the population parameter.
  • The alternative hypothesis [latex]H_a[/latex] is a “less than,” “greater than,” or “not equal to.”  The form of the alternative hypothesis depends on the context of the question.
  • If the alternative hypothesis is a “less than”, then the test is left-tail.  The p -value is the area in the left-tail of the distribution.
  • If the alternative hypothesis is a “greater than”, then the test is right-tail.  The p -value is the area in the right-tail of the distribution.
  • If the alternative hypothesis is a “not equal to”, then the test is two-tail.  The p -value is the sum of the area in the two-tails of the distribution.  Each tail represents exactly half of the p -value.
  • Think about the meaning of the p -value.  A data analyst (and anyone else) should have more confidence that they made the correct decision to reject the null hypothesis with a smaller p -value (for example, 0.001 as opposed to 0.04) even if using a significance level of 0.05. Similarly, for a large p -value such as 0.4, as opposed to a p -value of 0.056 (a significance level of 0.05 is less than either number), a data analyst should have more confidence that they made the correct decision in not rejecting the null hypothesis. This makes the data analyst use judgment rather than mindlessly applying rules.
  • The significance level must be identified before collecting the sample data and conducting the test.  Generally, the significance level will be included in the question.  If no significance level is given, a common standard is to use a significance level of 5%.

Suppose the hypotheses for a hypothesis test are:

[latex]\begin{eqnarray*} H_0: & & p=20 \% \\ H_a: & & p \gt 20\% \end{eqnarray*}[/latex]

Because the alternative hypothesis is a [latex]\gt[/latex], this is a right-tail test.  The p -value is the area in the right-tail of the distribution.

Normal distribution curve of a single population proportion with the value of 0.2 on the x-axis. The p-value points to the area on the right tail of the curve.

[latex]\begin{eqnarray*} H_0: & & p=50 \% \\ H_a: & & p \neq  50\% \end{eqnarray*}[/latex]

Because the alternative hypothesis is a [latex]\neq[/latex], this is a two-tail test.  The p -value is the sum of the areas in the two tails of the distribution.  Each tail contains exactly half of the p -value.

Normal distribution curve of a single population proportion with the value of 0.5 on the x-axis. For a two-tailed test, the areas in the left and right tails of the curve are each labelled 1/2(p-value).

[latex]\begin{eqnarray*} H_0: & & p=10\% \\ H_a: & & p \lt  10\% \end{eqnarray*}[/latex]

Because the alternative hypothesis is a [latex]\lt[/latex], this is a left-tail test.  The p -value is the area in the left-tail of the distribution.

Steps to Conduct a Hypothesis Test for a Population Proportion

  • Write down the null and alternative hypotheses in terms of the population proportion [latex]p[/latex].  Include appropriate units with the values of the proportion.
  • Use the form of the alternative hypothesis to determine if the test is left-tailed, right-tailed, or two-tailed.
  • Collect the sample information for the test and identify the significance level [latex]\alpha[/latex].
  • Find the p -value (the area in the corresponding tail) for the test using the appropriate distribution:
      • If [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p) \geq 5[/latex], use the normal distribution with [latex]\displaystyle{z=\frac{\hat{p}-p}{\sqrt{\frac{p \times (1-p)}{n}}}}[/latex].
      • If at least one of [latex]n \times p \lt 5[/latex] or [latex]n \times (1-p) \lt 5[/latex], use a binomial distribution.
  • Compare the p -value to the significance level and state the outcome of the test:
      • If the p -value [latex]\leq \alpha[/latex], reject the null hypothesis in favour of the alternative hypothesis.  The results of the sample data are significant.  There is sufficient evidence to conclude that the null hypothesis [latex]H_0[/latex] is an incorrect belief and that the alternative hypothesis [latex]H_a[/latex] is most likely correct.
      • If the p -value [latex]\gt \alpha[/latex], do not reject the null hypothesis.  The results of the sample data are not significant.  There is not sufficient evidence to conclude that the alternative hypothesis [latex]H_a[/latex] may be correct.
  • Write down a concluding sentence specific to the context of the question.
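
The two decision points in these steps (which distribution to use, and whether to reject the null hypothesis) can be sketched in Python. This is only an illustration: the function names choose_distribution and decide are hypothetical and do not come from Excel or from this text, and the rejection rule assumes the usual convention of rejecting when the p -value is at most the significance level.

```python
def choose_distribution(n: int, p0: float) -> str:
    """Apply the rule of thumb: normal if both n*p and n*(1-p) are at least 5."""
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    if expected_successes >= 5 and expected_failures >= 5:
        return "normal"
    return "binomial"


def decide(p_value: float, alpha: float) -> str:
    """Compare the p-value to the significance level (assumed convention: reject if p <= alpha)."""
    return "reject H0" if p_value <= alpha else "do not reject H0"


# p0 is always the value claimed in the null hypothesis.
print(choose_distribution(200, 0.92))
print(decide(0.0046, 0.01))
```

Here choose_distribution(200, 0.92) checks 200 × 0.92 = 184 and 200 × 0.08 = 16, both at least 5, so the normal distribution applies.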

USING EXCEL TO CALCULATE THE P -VALUE FOR A HYPOTHESIS TEST ON A POPULATION PROPORTION

The p -value for a hypothesis test on a population proportion is the area in the tail(s) of the distribution of the sample proportion.  If both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p) \geq 5[/latex], use the normal distribution to find the p -value.  If at least one of [latex]n \times p \lt 5[/latex] or [latex]n \times (1-p) \lt 5[/latex], use the binomial distribution to find the p -value.

If both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p) \geq 5[/latex]:

  • For x , enter the value for [latex]\hat{p}[/latex].
  • For [latex]\mu[/latex] , enter the mean of the sample proportions [latex]p[/latex].  Note:  Because the test is run assuming the null hypothesis is true, the value for [latex]p[/latex] is the claim from the null hypothesis.
  • For [latex]\sigma[/latex] , enter the standard error of the proportions [latex]\displaystyle{\sqrt{\frac{p \times (1-p)}{n}}}[/latex].
  • For the logic operator , enter true .  Note:  Because we are calculating the area under the curve, we always enter true for the logic operator.
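  • The norm.dist function returns the area to the left of x .  Use norm.dist( x , [latex]\mu[/latex], [latex]\sigma[/latex], true) to find the area in the left-tail and 1-norm.dist( x , [latex]\mu[/latex], [latex]\sigma[/latex], true) to find the area in the right-tail.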
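
For readers working outside Excel, the same tail areas can be sketched in Python with the standard library's statistics.NormalDist class. The helper name normal_tail_areas is hypothetical; the cdf call plays the role of norm.dist with the logic operator set to true.

```python
from math import sqrt
from statistics import NormalDist


def normal_tail_areas(p_hat: float, p0: float, n: int) -> tuple[float, float]:
    """Return (left_tail, right_tail) areas for a sample proportion p_hat.

    p0 is the proportion claimed in the null hypothesis and n is the sample size.
    """
    standard_error = sqrt(p0 * (1 - p0) / n)
    sampling_dist = NormalDist(mu=p0, sigma=standard_error)
    left = sampling_dist.cdf(p_hat)  # plays the role of norm.dist(p_hat, p, se, true)
    right = 1 - left                 # plays the role of 1-norm.dist(p_hat, p, se, true)
    return left, right


left, right = normal_tail_areas(0.56, 0.5, 100)
print(f"left tail = {left:.4f}, right tail = {right:.4f}")
```

A left-tailed test uses the first value, a right-tailed test uses the second, and the two areas always add to 1.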

If at least one of [latex]n \times p \lt 5[/latex] or [latex]n \times (1-p) \lt 5[/latex]:

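  • The p -value is found using the binomial distribution.
  • For a left-tailed test, the p -value is the probability of at most x successes, found with binom.dist( x , n , p , logic operator):
      • For x , enter the number of successes.
      • For n , enter the sample size.
      • For p , enter the value of the population proportion [latex]p[/latex] from the null hypothesis.
      • For the logic operator , enter true .  Note:  Because we are calculating an at most probability, the logic operator is always true.
  • For a right-tailed test, the p -value is the probability of at least x successes, found with 1-binom.dist( x -1, n , p , logic operator):
      • For x -1, enter the number of successes minus 1.
      • For n , enter the sample size.
      • For p , enter the value of the population proportion [latex]p[/latex] in the null hypothesis.
      • For the logic operator , enter true .  Note:  Because we are calculating an at least probability, the logic operator is always true.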
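
The two binomial probabilities described above can also be sketched in Python using only math.comb from the standard library. The function names binom_cdf and binom_at_least are hypothetical and are meant to mirror binom.dist and 1-binom.dist, not to reproduce Excel's implementation.

```python
from math import comb


def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ Binomial(n, p), the 'at most x successes' probability."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def binom_at_least(x: int, n: int, p: float) -> float:
    """P(X >= x), using the complement rule: 1 - P(X <= x - 1)."""
    return 1 - binom_cdf(x - 1, n, p)


# Small check with two fair coin flips: P(at most 0 heads) and P(at least 1 head).
print(binom_cdf(0, 2, 0.5), binom_at_least(1, 2, 0.5))
```

Note the x - 1 inside binom_at_least: "at least x" is the complement of "at most x - 1", which is why the first field in 1-binom.dist is one less than the number of successes.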

Marketers believe that 92% of adults own a cell phone.  A cell phone manufacturer believes that number is actually lower.  In a sample of 200 adults, 87% own a cell phone.  At the 1% significance level, determine if the proportion of adults that own a cell phone is lower than the marketers’ claim.

Hypotheses:

[latex]\begin{eqnarray*} H_0: & & p=92\% \mbox{ of adults own a cell phone} \\ H_a: & & p \lt 92\% \mbox{ of adults own a cell phone} \end{eqnarray*}[/latex]

From the question, we have [latex]n=200[/latex], [latex]\hat{p}=0.87[/latex], and [latex]\alpha=0.01[/latex].

To determine the distribution, we check [latex]n \times p[/latex] and [latex]n \times (1-p)[/latex].  For the value of [latex]p[/latex], we use the claim from the null hypothesis ([latex]p=0.92[/latex]).

[latex]\begin{eqnarray*} n \times p & = & 200 \times 0.92=184 \geq 5 \\ n \times (1-p) & = & 200 \times (1-0.92)=16 \geq 5\end{eqnarray*}[/latex]

Because both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p)  \geq 5[/latex] we use a normal distribution to calculate the p -value.  Because the alternative hypothesis is a [latex]\lt[/latex], the p -value is the area in the left tail of the distribution.

This is a normal distribution curve. On the left side of the center a vertical line extends to the curve with the area to the left of this vertical line shaded. The p-value equals the area of this shaded region.

So the p -value[latex]=0.0046[/latex].
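
As a cross-check outside Excel, the same left-tail area can be computed with a short Python sketch. The variable names are illustrative only, and statistics.NormalDist stands in for norm.dist.

```python
from math import sqrt
from statistics import NormalDist

# Values from the question; p0 is the claim in the null hypothesis.
n, p0, p_hat, alpha = 200, 0.92, 0.87, 0.01

standard_error = sqrt(p0 * (1 - p0) / n)
# Left-tailed test: area to the left of the sample proportion.
p_value = NormalDist(mu=p0, sigma=standard_error).cdf(p_hat)

print(f"standard error = {standard_error:.4f}")
print(f"p-value = {p_value:.4f}")
```

Rounded to four decimal places, the result should agree with the p -value of 0.0046 reported above.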

Conclusion:

Because p -value[latex]=0.0046 \lt 0.01=\alpha[/latex], we reject the null hypothesis in favour of the alternative hypothesis.  At the 1% significance level there is enough evidence to suggest that the proportion of adults who own a cell phone is lower than 92%.

  • The null hypothesis [latex]p=92\%[/latex] is the claim that 92% of adults own a cell phone.
  • The alternative hypothesis [latex]p \lt 92\%[/latex] is the claim that less than 92% of adults own a cell phone.
  • The function is norm.dist because we are finding the area in the left tail of a normal distribution.
  • Field 1 is the value of [latex]\hat{p}[/latex].
  • Field 2 is the value of [latex]p[/latex] from the null hypothesis.  Remember, we run the test assuming the null hypothesis is true, so that means we assume [latex]p=0.92[/latex].
  • Field 3 is the standard deviation for the sample proportions [latex]\displaystyle{\sqrt{\frac{p \times (1-p)}{n}}}[/latex].
  • The p -value of 0.0046 tells us that under the assumption that 92% of adults own a cell phone (the null hypothesis), there is only a 0.46% chance that the proportion of adults who own a cell phone in a sample of 200 is 87% or less.  This is a small probability, and so is unlikely to happen assuming the null hypothesis is true.  This suggests that the assumption that the null hypothesis is true is most likely incorrect, and so the conclusion of the test is to reject the null hypothesis in favour of the alternative hypothesis.  In other words, the proportion of adults who own a cell phone is most likely less than 92%.

A consumer group claims that the proportion of households that have at least three cell phones is 30%.  A cell phone company has reason to believe that the proportion of households with at least three cell phones is much higher.  Before they start a big advertising campaign based on the proportion of households that have at least three cell phones, they want to test their claim.  Their marketing people survey 150 households with the result that 54 of the households have at least three cell phones.  At the 1% significance level, determine if the proportion of households that have at least three cell phones is more than 30%.

[latex]\begin{eqnarray*} H_0: & & p=30\% \mbox{ of households have at least 3 cell phones} \\ H_a: & & p \gt 30\% \mbox{ of households have at least 3 cell phones} \end{eqnarray*}[/latex]

From the question, we have [latex]n=150[/latex], [latex]\displaystyle{\hat{p}=\frac{54}{150}=0.36}[/latex], and [latex]\alpha=0.01[/latex].

To determine the distribution, we check [latex]n \times p[/latex] and [latex]n \times (1-p)[/latex].  For the value of [latex]p[/latex], we use the claim from the null hypothesis ([latex]p=0.3[/latex]).

[latex]\begin{eqnarray*} n \times p & = & 150 \times 0.3=45 \geq 5 \\ n \times (1-p) & = & 150 \times (1-0.3)=105 \geq 5\end{eqnarray*}[/latex]

Because both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p)  \geq  5[/latex] we use a normal distribution to calculate the p -value.  Because the alternative hypothesis is a [latex]\gt[/latex], the p -value is the area in the right tail of the distribution.

This is a normal distribution curve. On the right side of the center a vertical line extends to the curve with the area to the right of this vertical line shaded. The p-value equals the area of this shaded region.

So the p -value[latex]=0.0544[/latex].
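
To see where this value comes from, the test statistic can be worked out by hand from the formula for [latex]z[/latex] given in the steps earlier in this section.  This is only a sketch of the arithmetic, with intermediate values rounded:

[latex]\begin{eqnarray*} z & = & \frac{\hat{p}-p}{\sqrt{\frac{p \times (1-p)}{n}}} \\ & = & \frac{0.36-0.3}{\sqrt{\frac{0.3 \times (1-0.3)}{150}}} \\ & \approx & \frac{0.06}{0.0374} \\ & \approx & 1.604 \end{eqnarray*}[/latex]

The area to the right of [latex]z \approx 1.604[/latex] under the standard normal curve is about 0.054, which is consistent with the p -value found using the normal distribution of the sample proportions.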

Because p -value[latex]=0.0544 \gt 0.01=\alpha[/latex], we do not reject the null hypothesis.  At the 1% significance level there is not enough evidence to suggest that the proportion of households with at least three cell phones is more than 30%.

  • The null hypothesis [latex]p=30\%[/latex] is the claim that 30% of households have at least three cell phones.
  • The alternative hypothesis [latex]p \gt 30\%[/latex] is the claim that more than 30% of households have at least three cell phones.
  • The function is 1-norm.dist because we are finding the area in the right tail of a normal distribution.
  • Field 2 is the value of [latex]p[/latex] from the null hypothesis.  Remember, we run the test assuming the null hypothesis is true, so that means we assume [latex]p=0.3[/latex].
  • The p -value of 0.0544 tells us that under the assumption that 30% of households have at least three cell phones (the null hypothesis), there is a 5.44% chance that the proportion of households with at least three cell phones in a sample of 150 is 36% or more.  Compared to the 1% significance level, this is a large probability, and so is likely to happen assuming the null hypothesis is true.  This suggests that the assumption that the null hypothesis is true is most likely correct, and so the conclusion of the test is to not reject the null hypothesis.  In other words, the claim that 30% of households have at least three cell phones is most likely correct.

A teacher believes that 70% of students in the class will want to go on a field trip to the local zoo.  The students in the class believe the proportion is much higher and ask the teacher to verify her claim.  The teacher samples 50 students and 39 reply that they would want to go to the zoo.  At the 5% significance level, determine if the proportion of students who want to go on the field trip is higher than 70%.

[latex]\begin{eqnarray*} H_0: & & p = 70\% \mbox{ of students want to go on the field trip}  \\ H_a: & & p \gt 70\% \mbox{ of students want to go on the field trip}   \end{eqnarray*}[/latex]

From the question, we have [latex]n=50[/latex], [latex]\displaystyle{\hat{p}=\frac{39}{50}=0.78}[/latex], and [latex]\alpha=0.05[/latex].

[latex]\begin{eqnarray*} n \times p & = & 50 \times 0.7=35 \geq 5 \\ n \times (1-p) & = & 50 \times (1-0.7)=15 \geq 5\end{eqnarray*}[/latex]

Because both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p)  \geq 5[/latex] we use a normal distribution to calculate the p -value.  Because the alternative hypothesis is a [latex]\gt[/latex], the p -value is the area in the right tail of the distribution.

So the p -value[latex]=0.1085[/latex].

Because p -value[latex]=0.1085 \gt 0.05=\alpha[/latex], we do not reject the null hypothesis.  At the 5% significance level there is not enough evidence to suggest that the proportion of students who want to go on the field trip is higher than 70%.

  • The null hypothesis [latex]p=70\%[/latex] is the claim that 70% of the students want to go on the field trip.
  • The alternative hypothesis [latex]p \gt 70\%[/latex] is the claim that more than 70% of students want to go on the field trip.
  • The p -value of 0.1085 tells us that under the assumption that 70% of students want to go on the field trip (the null hypothesis), there is a 10.85% chance that the proportion of students who want to go on the field trip in a sample of 50 students is 78% or more.  Compared to the 5% significance level, this is a large probability, and so is likely to happen assuming the null hypothesis is true.  This suggests that the assumption that the null hypothesis is true is most likely correct, and so the conclusion of the test is to not reject the null hypothesis.  In other words, the teacher’s claim that 70% of students want to go on the field trip is most likely correct.

Joan believes that 50% of first-time brides in the United States are younger than their grooms.  She performs a hypothesis test to determine if the percentage is the same or different from 50%.  Joan samples 100 first-time brides and 56 reply that they are younger than their grooms.  Use a 5% significance level.

[latex]\begin{eqnarray*} H_0: & & p=50\% \mbox{ of first-time brides are younger than the groom} \\ H_a: & & p \neq 50\% \mbox{ of first-time brides are younger than the groom} \end{eqnarray*}[/latex]

From the question, we have [latex]n=100[/latex], [latex]\displaystyle{\hat{p}=\frac{56}{100}=0.56}[/latex], and [latex]\alpha=0.05[/latex].

To determine the distribution, we check [latex]n \times p[/latex] and [latex]n \times (1-p)[/latex].  For the value of [latex]p[/latex], we use the claim from the null hypothesis ([latex]p=0.5[/latex]).

[latex]\begin{eqnarray*} n \times p & = & 100 \times 0.5=50 \geq 5 \\ n \times (1-p) & = & 100 \times (1-0.5)=50 \geq 5\end{eqnarray*}[/latex]

Because both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p)  \geq 5[/latex] we use a normal distribution to calculate the p -value.  Because the alternative hypothesis is a [latex]\neq[/latex], the p -value is the sum of area in the tails of the distribution.

This is a normal distribution curve. On the left side of the center a vertical line extends to the curve with the area to the left of this vertical line shaded and labeled as one half of the p-value. On the right side of the center a vertical line extends to the curve with the area to the right of this vertical line shaded and labeled as one half of the p-value. The p-value equals the sum of area of these two shaded regions.

Because there is only one sample, we only have information relating to one of the two tails, either the left or the right.  We need to know if the sample relates to the left or right tail because that will determine how we calculate out the area of that tail using the normal distribution.  In this case, the sample proportion [latex]\hat{p}=0.56[/latex] is greater than the value of the population proportion in the null hypothesis [latex]p=0.5[/latex] ([latex]\hat{p}=0.56>0.5=p[/latex]), so the sample information relates to the right-tail of the normal distribution.  This means that we will calculate out the area in the right tail using 1-norm.dist .  However, this is a two-tailed test where the p -value is the sum of the area in the two tails and the area in the right-tail is only one half of the p -value.  The area in the left tail equals the area in the right tail and the p -value is the sum of these two areas.

So the area in the right tail is 0.1151 and  [latex]\frac{1}{2}[/latex]( p -value)[latex]=0.1151[/latex].  This is also the area in the left tail, so

p -value[latex]=0.1151+0.1151=0.2302[/latex]
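
The doubling step can be sketched in Python as follows. The variable names are illustrative, and statistics.NormalDist stands in for norm.dist. Taking the smaller of the two tail areas is one way to pick the tail the sample falls in without checking by hand whether the sample proportion is above or below the claimed value.

```python
from math import sqrt
from statistics import NormalDist

# Values from the question; p0 is the claim in the null hypothesis.
n, p0, p_hat = 100, 0.5, 0.56

sampling_dist = NormalDist(mu=p0, sigma=sqrt(p0 * (1 - p0) / n))
left_area = sampling_dist.cdf(p_hat)

# The sample sits in whichever tail has the smaller area (here, the right tail).
one_tail = min(left_area, 1 - left_area)
p_value = 2 * one_tail  # two-tailed test: both tails have the same area

print(f"one tail = {one_tail:.4f}, p-value = {p_value:.4f}")
```

Doubling the unrounded tail area gives a value that can differ from 0.2302 in the last decimal place, because the calculation above doubles the already-rounded area 0.1151.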

Because p -value[latex]=0.2302 \gt 0.05=\alpha[/latex], we do not reject the null hypothesis.  At the 5% significance level there is not enough evidence to suggest that the proportion of first-time brides that are younger than the groom is different from 50%.

  • The null hypothesis [latex]p=50\%[/latex] is the claim that the proportion of first-time brides that are younger than the groom is 50%.
  • The alternative hypothesis [latex]p \neq 50\%[/latex] is the claim that the proportion of first-time brides that are younger than the groom is different from 50%.
  • We use norm.dist([latex]\hat{p}[/latex],[latex]p[/latex],[latex]\mbox{sqrt}(p*(1-p)/n)[/latex],true) to find the area in the left tail.  The area in the right tail equals the area in the left tail, so we can find the p -value by adding the output from this function to itself.
  • We use 1-norm.dist([latex]\hat{p}[/latex],[latex]p[/latex],[latex]\mbox{sqrt}(p*(1-p)/n)[/latex],true) to find the area in the right tail.  The area in the left tail equals the area in the right tail, so we can find the p -value by adding the output from this function to itself.
  • The p -value of 0.2302 is a large probability compared to the 5% significance level, and so is likely to happen assuming the null hypothesis is true.  This suggests that the assumption that the null hypothesis is true is most likely correct, and so the conclusion of the test is to not reject the null hypothesis.  In other words, the claim that the proportion of first-time brides who are younger than the groom is 50% is most likely correct.

Watch this video: Hypothesis Testing for Proportions: z -test by ExcelIsFun [7:27] 

An online retailer believes that 93% of the visitors to its website will make a purchase.   A researcher in the marketing department thinks the actual percent is lower than claimed.  The researcher examines a sample of 50 visits to the website and finds that 45 of the visits resulted in a purchase.  At the 1% significance level, determine if the proportion of visits to the website that result in a purchase is lower than claimed.

[latex]\begin{eqnarray*} H_0: & & p=93\% \mbox{ of visitors make a purchase} \\ H_a: & & p \lt 93\% \mbox{ of visitors make a purchase} \end{eqnarray*}[/latex]

From the question, we have [latex]n=50[/latex], [latex]x=45[/latex], and [latex]\alpha=0.01[/latex].

To determine the distribution, we check [latex]n \times p[/latex] and [latex]n \times (1-p)[/latex].  For the value of [latex]p[/latex], we use the claim from the null hypothesis ([latex]p=0.93[/latex]).

[latex]\begin{eqnarray*} n \times p & = & 50 \times 0.93=46.5 \geq 5 \\ n \times (1-p) & = & 50 \times (1-0.93)=3.5 \lt 5\end{eqnarray*}[/latex]

Because [latex]n \times (1-p)  \lt 5[/latex] we use a binomial distribution to calculate the p -value.  Because the alternative hypothesis is a [latex]\lt[/latex], the p -value is the probability of getting at most 45 successes in 50 trials.

So the p -value[latex]=0.2710[/latex].
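
The "at most 45 successes" probability can be reproduced with a short Python sketch that sums the binomial probabilities directly. The variable names are illustrative; the sum plays the role of binom.dist with the logic operator set to true.

```python
from math import comb

# Values from the question; p0 is the claim in the null hypothesis.
n, x, p0, alpha = 50, 45, 0.93, 0.01

# Left-tailed test: P(X <= 45) for X ~ Binomial(50, 0.93).
p_value = sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x + 1))

print(f"p-value = {p_value:.4f}")
```

Rounded to four decimal places, the result should agree with the p -value of 0.2710 reported above.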

Because p -value[latex]=0.2710 \gt 0.01=\alpha[/latex], we do not reject the null hypothesis.  At the 1% significance level there is not enough evidence to suggest that the proportion of visitors who make a purchase is lower than 93%.

  • The null hypothesis [latex]p=93\%[/latex] is the claim that 93% of visitors to the website make a purchase.
  • The alternative hypothesis [latex]p \lt 93\%[/latex] is the claim that less than 93% of visitors to the website make a purchase.
  • The function is binom.dist because we are finding the probability of at most 45 successes.
  • Field 1 is the number of successes [latex]x[/latex].
  • Field 2 is the sample size [latex]n[/latex].
  • Field 3 is the probability of success [latex]p[/latex].  This is the claim about the population proportion made in the null hypothesis, so that means we assume [latex]p=0.93[/latex].
  • The p -value of 0.2710 tells us that under the assumption that 93% of visitors make a purchase (the null hypothesis), there is a 27.10% chance that the number of visitors in a sample of 50 who make a purchase is 45 or less.  This is a large probability compared to the significance level, and so is likely to happen assuming the null hypothesis is true.  This suggests that the assumption that the null hypothesis is true is most likely correct, and so the conclusion of the test is to not reject the null hypothesis.  In other words, the proportion of visitors to the website who make a purchase is most likely 93%.

A drug company claims that only 4% of people who take their new drug experience any side effects from the drug.  A researcher believes that the percent is higher than the drug company’s claim.  The researcher takes a sample of 80 people who take the drug and finds that 10% of the people in the sample experience side effects from the drug.  At the 5% significance level, determine if the proportion of people who experience side effects from taking the drug is higher than claimed.

[latex]\begin{eqnarray*} H_0: & & p=4\% \mbox{ of people experience side effects} \\ H_a: & & p \gt 4\% \mbox{ of people experience side effects} \end{eqnarray*}[/latex]

From the question, we have [latex]n=80[/latex], [latex]\hat{p}=0.1[/latex], and [latex]\alpha=0.05[/latex].

To determine the distribution, we check [latex]n \times p[/latex] and [latex]n \times (1-p)[/latex].  For the value of [latex]p[/latex], we use the claim from the null hypothesis ([latex]p=0.04[/latex]).

[latex]\begin{eqnarray*} n \times p & = & 80 \times 0.04=3.2 \lt 5\end{eqnarray*}[/latex]

Because [latex]n \times p  \lt 5[/latex] we use a binomial distribution to calculate the p -value.  Because the alternative hypothesis is a [latex]\gt[/latex], the p -value is the probability of getting at least 8 successes in 80 trials.  (Note:  In the sample of size 80, 10% have the characteristic of interest, so this means that [latex]80 \times 0.1=8[/latex] people in the sample have the characteristic of interest.)

So the p -value[latex]=0.0147[/latex].
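
The "at least 8 successes" probability can be checked with a Python sketch that applies the complement rule. The variable names are illustrative; the expression mirrors 1-binom.dist with the number of successes reduced by 1.

```python
from math import comb

# Values from the question; p0 is the claim in the null hypothesis.
n, p_hat, p0, alpha = 80, 0.10, 0.04, 0.05

# Convert the sample proportion into a count of successes: 80 * 0.1 = 8.
x = round(n * p_hat)

# Right-tailed test: P(X >= 8) = 1 - P(X <= 7) for X ~ Binomial(80, 0.04).
at_most_x_minus_1 = sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x))
p_value = 1 - at_most_x_minus_1

print(f"x = {x}, p-value = {p_value:.4f}")
```

Rounded to four decimal places, the result should agree with the p -value of 0.0147 reported above.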

Because p -value[latex]=0.0147 \lt 0.05=\alpha[/latex], we reject the null hypothesis in favour of the alternative hypothesis.  At the 5% significance level there is enough evidence to suggest that the proportion of people who experience side effects from taking the drug is higher than 4%.

  • The null hypothesis [latex]p=4\%[/latex] is the claim that 4% of the people experience side effects from taking the drug.
  • The alternative hypothesis [latex]p \gt 4\%[/latex] is the claim that more than 4% of the people experience side effects from taking the drug.
  • The function is 1-binom.dist because we are finding the probability of at least 8 successes.
  • Field 1 is [latex]x-1[/latex] where [latex]x[/latex] is the number of successes.  In this case, we are using the complement rule to change the probability of at least 8 successes into 1 minus the probability of at most 7 successes.
  • Field 3 is the probability of success [latex]p[/latex].  This is the claim about the population proportion made in the null hypothesis, so that means we assume [latex]p=0.04[/latex].
  • The p -value of 0.0147 tells us that under the assumption that 4% of people experience side effects (the null hypothesis), there is a 1.47% chance that the number of people in a sample of 80 who experience side effects is 8 or more.  This is a small probability compared to the significance level, and so is unlikely to happen assuming the null hypothesis is true.  This suggests that the assumption that the null hypothesis is true is most likely incorrect, and so the conclusion of the test is to reject the null hypothesis in favour of the alternative hypothesis.  In other words, the proportion of people who experience side effects is most likely greater than 4%.

Concept Review

The hypothesis test for a population proportion is a well-established process:

  • Find the p -value (the area in the corresponding tail) for the test using the appropriate distribution (normal or binomial).
  • Compare the p -value to the significance level and state the outcome of the test.

Attribution

“ 9.6   Hypothesis Testing of a Single Mean and Single Proportion “ in Introductory Statistics by OpenStax  is licensed under a  Creative Commons Attribution 4.0 International License.

Introduction to Statistics Copyright © 2022 by Valerie Watts is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


9.4 - Comparing Two Proportions

So far, all of our examples involved testing whether a single population proportion p equals some value \(p_0\). Now, let's turn our attention for a bit towards testing whether one population proportion \(p_1\) equals a second population proportion \(p_2\). Additionally, most of our examples thus far have involved left-tailed tests in which the alternative hypothesis involved \(H_A \colon p < p_0\) or right-tailed tests in which the alternative hypothesis involved \(H_A \colon p > p_0\). Here, let's consider an example that tests the equality of two proportions against the alternative that they are not equal. Using statistical notation, we'll test:

\(H_0 \colon p_1 = p_2\) versus \(H_A \colon p_1 \ne p_2\)

Example 9-5

Time magazine reported the result of a telephone poll of 800 adult Americans. The question posed of the Americans who were surveyed was: "Should the federal tax on cigarettes be raised to pay for health care reform?" The results of the survey were: of the 605 non-smokers surveyed, 351 (58%) answered "yes," and of the 195 smokers surveyed, 41 (21%) answered "yes."

Is there sufficient evidence at the \(\alpha = 0.05\), say, to conclude that the two populations — smokers and non-smokers — differ significantly with respect to their opinions?

If \(p_1\) = the proportion of the non-smoker population who reply "yes" and \(p_2\) = the proportion of the smoker population who reply "yes," then we are interested in testing the null hypothesis:

\(H_0 \colon p_1 = p_2\)

against the alternative hypothesis:

\(H_A \colon p_1 \ne p_2\)

Before we can actually conduct the hypothesis test, we'll have to derive the appropriate test statistic.

The test statistic for testing the difference in two population proportions, that is, for testing the null hypothesis \(H_0:p_1-p_2=0\) is:

\(Z=\dfrac{(\hat{p}_1-\hat{p}_2)-0}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}\)

where:

\(\hat{p}=\dfrac{Y_1+Y_2}{n_1+n_2}\)

is the proportion of "successes" in the two samples combined.

Recall that:

\(\hat{p}_1-\hat{p}_2\)

is approximately normally distributed with mean:

\(p_1-p_2\)

and variance:

\(\dfrac{p_1(1-p_1)}{n_1}+\dfrac{p_2(1-p_2)}{n_2}\)

But, if we assume that the null hypothesis is true, then the population proportions equal some common value p , say, that is, \(p_1 = p_2 = p\). In that case, then the variance becomes:

\(p(1-p)\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)\)

So, under the assumption that the null hypothesis is true, we have that:

\( {\displaystyle Z=\frac{\left(\hat{p}_{1}-\hat{p}_{2}\right)- \color{blue}\overbrace{\color{black}\left(p_{1}-p_{2}\right)}^0}{\sqrt{p(1-p)\left(\frac{1}{n_{1}}+\frac{1}{n_{2}}\right)}} } \)

follows (at least approximately) the standard normal N (0,1) distribution. Since we don't know the (assumed) common population proportion p any more than we know the proportions \(p_1\) and \(p_2\) of each population, we can estimate p using:

\(\hat{p}=\dfrac{Y_1+Y_2}{n_1+n_2}\)

the proportion of "successes" in the two samples combined. And, hence, our test statistic becomes:

\(Z=\dfrac{(\hat{p}_1-\hat{p}_2)-0}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}\)

as was to be proved.

Example 9-5 (continued)

The overall sample proportion is:

\(\hat{p}=\dfrac{41+351}{195+605}=\dfrac{392}{800}=0.49\)

That implies then that the test statistic for testing:

\(H_0:p_1=p_2\) versus \(H_A:p_1 \neq p_2\)

is:

\(Z=\dfrac{(0.58-0.21)-0}{\sqrt{0.49(0.51)\left(\dfrac{1}{195}+\dfrac{1}{605}\right)}}=8.99\)

Errr.... that Z -value is off the charts, so to speak. Let's go through the formalities anyway making the decision first using the rejection region approach, and then using the P -value approach. Putting half of the rejection region in each tail, we have:

That is, we reject the null hypothesis \(H_0\) if \(Z ≥ 1.96\) or if \(Z ≤ −1.96\). We clearly reject \(H_0\), since 8.99 falls in the "red zone," that is, 8.99 is (much) greater than 1.96. There is sufficient evidence at the 0.05 level to conclude that the two populations differ with respect to their opinions concerning imposing a federal tax to help pay for health care reform.

Now for the P -value approach:

That is, the P -value is less than 0.0001. Because \(P < 0.0001 ≤ \alpha = 0.05\), we reject the null hypothesis. Again, there is sufficient evidence at the 0.05 level to conclude that the two populations differ with respect to their opinions concerning imposing a federal tax to help pay for health care reform.
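
Both approaches can be sketched in a few lines of Python. The counts used here (351 of 605 non-smokers and 41 of 195 smokers answering "yes") are read off the pooled-proportion calculation and the sample proportions 0.58 and 0.21 above; the variable names are illustrative only, and statistics.NormalDist supplies the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist

# Counts inferred from the calculations above (an assumption of this sketch).
y1, n1 = 351, 605  # non-smokers answering "yes", sample size
y2, n2 = 41, 195   # smokers answering "yes", sample size
alpha = 0.05

p1_hat, p2_hat = y1 / n1, y2 / n2
p_pooled = (y1 + y2) / (n1 + n2)  # proportion of "successes" in both samples combined
z = (p1_hat - p2_hat) / sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

standard_normal = NormalDist()
critical_value = standard_normal.inv_cdf(1 - alpha / 2)  # rejection region: |Z| >= this
p_value = 2 * (1 - standard_normal.cdf(abs(z)))          # two-tailed P-value

print(f"Z = {z:.2f}, critical value = {critical_value:.2f}")
print("reject H0" if abs(z) >= critical_value and p_value <= alpha else "do not reject H0")
```

Because the sketch uses the unrounded sample proportions rather than 0.58 and 0.21, the test statistic can differ slightly from the hand calculation before rounding. The P -value is so small that it may print as zero at ordinary display precision.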

Thankfully, as should always be the case, the two approaches (the critical value approach and the P -value approach) lead to the same conclusion.

Note!

For testing \(H_0 \colon p_1 = p_2\), some statisticians use the test statistic:

\(Z=\dfrac{(\hat{p}_1-\hat{p}_2)-0}{\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}\)

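instead of the one we used:

\(Z=\dfrac{(\hat{p}_1-\hat{p}_2)-0}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}\)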

An advantage of doing so is again that the interpretation of the confidence interval — does it contain 0? — is always consistent with the hypothesis test decision.

Mathematics LibreTexts

8.7: Hypothesis Test of Single Population Proportion with Examples


Steps for performing Hypothesis Test for a Single Population Proportion

Step 1: State your hypotheses about the population proportion.

Step 2: Summarize the data. State a significance level. State and check the conditions required for the procedure.

  • \(\hat{P}=\frac{X}{n}\)
  • Sample is random with independent observations .
  • Sample is large. Check that the sample has 5 or more expected successes and 5 or more expected failures
  • Population is large relative to the sample size . The population size is at least 10 times bigger than the sample size.

Step 3: Perform the procedure

  • Find the Standard Error (SE) based on the assumption that \(H_{0}\) is true.
  • Compute the observed value of the test statistic \(Z_{obs}\).
  • Find the p-value in order to measure your level of surprise.

Step 4: Make a decision about \(H_{0}\) and \(H_{a}\)

  • Do you reject or not reject your null hypothesis? What about the alternative hypothesis?

Step 5 : Make a conclusion.

  • What does this mean in the context of the data?
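The numeric conditions in Step 2 can be checked mechanically. The helper below is a minimal sketch; the function name `check_conditions` and the example inputs are hypothetical, and the requirement of a random sample with independent observations cannot be verified in code.

```python
def check_conditions(n, p0, population_size=None):
    """Return a dict of the large-sample checks for a one-proportion z-test."""
    checks = {
        "expected successes >= 5": n * p0 >= 5,
        "expected failures >= 5": n * (1 - p0) >= 5,
    }
    if population_size is not None:
        checks["population >= 10 * n"] = population_size >= 10 * n
    return checks

# Hypothetical inputs: n = 100 observations, hypothesized p0 = 0.50
result = check_conditions(100, 0.50, population_size=50_000)
print(result)
```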

Examples: Hypothesis Test for a Single Population Proportion

Example \(\PageIndex{1}\)

Joon believes that 50% of first-time brides in the United States are younger than their grooms. She performs a hypothesis test to determine if the percentage is the same or different from 50% . Joon samples 100 first-time brides and 53 reply that they are younger than their grooms. For the hypothesis test, she uses a 1% level of significance.

Set up the hypothesis test:

The 1% level of significance means that α = 0.01. This is a test of a single population proportion .

\(H_{0}: p = 0.50\)  \(H_{a}: p \neq 0.50\)

The words "is the same or different from" tell you this is a two-tailed test.

Calculate the distribution needed:

Random variable: \(\hat{P} =\) the percent of first-time brides who are younger than their grooms.

Distribution for the test: The problem contains no mention of a mean. The information is given in terms of percentages. Use the Normal distribution for \(\hat{P}\), the estimated proportion.

\[ \hat{P} \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)\nonumber \]

\[ \hat{P} \sim N\left(0.5, \sqrt{\frac{0.5(0.5)}{100}}\right)\nonumber \]

where \(p = 0.50, q = 1−p = 0.50\), and \(n = 100\)

Calculate the p -value using the normal distribution for proportions:

\[p\text{-value} = P( \hat{P} < 0.47 \space or \space \hat{P} > 0.53) = 0.5485\nonumber \]

where \[x = 53, \hat{P} = \frac{x}{n} = \frac{53}{100} = 0.53\nonumber \].

Interpretation of the p-value: If the null hypothesis is true, there is 0.5485 probability (54.85%) that the sample (estimated) proportion \( \hat{P} \) is 0.53 or more OR 0.47 or less (see the graph in Figure).

Normal distribution curve of the percent of first time brides who are younger than the groom with values of 0.47, 0.50, and 0.53 on the x-axis. Vertical upward lines extend from 0.47 and 0.53 to the curve. 1/2(p-values) are calculated for the areas on outsides of 0.47 and 0.53.

\(\mu = p = 0.50\) comes from \(H_{0}\), the null hypothesis.

\( \hat{P} = 0.53\). Since the curve is symmetrical and the test is two-tailed, the \( \hat{P} \) for the left tail is equal to \(0.50 – 0.03 = 0.47\) where \(\mu = p = 0.50\). (0.03 is the difference between 0.53 and 0.50.)

Compare \(\alpha\) and the \(p\text{-value}\):

Since \(\alpha = 0.01\) and \(p\text{-value} = 0.5485\), \(\alpha < p\text{-value}\).

Make a decision: Since \(\alpha < p\text{-value}\), you cannot reject \(H_{0}\).

Conclusion: At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of first-time brides who are younger than their grooms is different from 50%.

The \(p\text{-value}\) can easily be calculated.

Press STAT and arrow over to TESTS . Press 5:1-PropZTest . Enter .5 for \(p_{0}\), 53 for \(x\) and 100 for \(n\). Arrow down to Prop and arrow to not equals \(p_{0}\). Press ENTER . Arrow down to Calculate and press ENTER . The calculator calculates the \(p\text{-value}\) (\(p = 0.5485\)) and the test statistic (\(z\)-score). Prop not equals .5 is the alternate hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate ). Press ENTER . A shaded graph appears with \(z = 0.6\) (test statistic) and \(p = 0.5485\) (\(p\text{-value}\)). Make sure when you use Draw that no other equations are highlighted in \(Y =\) and the plots are turned off.
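For readers without a TI calculator, the same numbers can be reproduced in software. The sketch below uses Python's standard library; the function name `one_proportion_z_test` and its `alternative` argument are illustrative choices, not part of the original example.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(x, n, p0, alternative="two-sided"):
    """z-test for one proportion; returns (p_hat, z, p_value)."""
    p_hat = x / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error assuming H0 is true
    z = (p_hat - p0) / se
    cdf = NormalDist().cdf(z)
    if alternative == "two-sided":
        p_value = 2 * min(cdf, 1 - cdf)
    elif alternative == "greater":
        p_value = 1 - cdf
    elif alternative == "less":
        p_value = cdf
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return p_hat, z, p_value

# Joon's data: 53 of 100 brides, hypothesized proportion 0.50
p_hat, z, p_value = one_proportion_z_test(53, 100, 0.50)
print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

The result agrees with the calculator output quoted above (\(z = 0.6\), \(p\text{-value} = 0.5485\)).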

The Type I and Type II errors are as follows:

The Type I error is to conclude that the proportion of first-time brides who are younger than their grooms is different from 50% when, in fact, the proportion is actually 50%. (Reject the null hypothesis when the null hypothesis is true).

The Type II error is to conclude that there is not enough evidence that the proportion of first-time brides who are younger than their grooms differs from 50% when, in fact, the proportion does differ from 50%. (Do not reject the null hypothesis when the null hypothesis is false.)

Exercise \(\PageIndex{1}\)

A teacher believes that 85% of students in the class will want to go on a field trip to the local zoo. She performs a hypothesis test to determine if the percentage is the same or different from 85%. The teacher samples 50 students and 39 reply that they would want to go to the zoo. For the hypothesis test, use a 1% level of significance.

First, determine what type of test this is, set up the hypothesis test, find the \(p\text{-value}\), sketch the graph, and state your conclusion.

Since the problem is about percentages, this is a test of single population proportions.

  • \(H_{0} : p = 0.85\)
  • \(H_{a}: p \neq 0.85\)
  • \(p\text{-value} = 0.1657\)


Because the \(p\text{-value} > \alpha\), we fail to reject the null hypothesis. There is not sufficient evidence to suggest that the proportion of students that want to go to the zoo is not 85%.

Example \(\PageIndex{2}\)

Suppose a consumer group suspects that the proportion of households that have three cell phones is 30%. A cell phone company has reason to believe that the proportion is not 30%. Before they start a big advertising campaign, they conduct a hypothesis test. Their marketing people survey 150 households with the result that 43 of the households have three cell phones.

Set up the Hypothesis Test:

\(H_{0}: p = 0.30, H_{a}: p \neq 0.30\)

Determine the distribution needed:

The random variable is \( \hat{P} =\) proportion of households that have three cell phones.

The distribution for the hypothesis test is \( \hat{P} \sim N\left(0.30, \sqrt{\frac{0.30 \cdot (0.70)}{150}}\right)\)

Exercise \(\PageIndex{2}\).2

a. The value that helps determine the \(p\text{-value}\) is \( \hat{P} \). Calculate \( \hat{P} \).

a. \( \hat{P} = \frac{x}{n}\) where \(x\) is the number of successes and \(n\) is the total number in the sample.

\(x = 43, n = 150\)

\( \hat{P} = 43/150=0.2867\)

Exercise \(\PageIndex{2}\).3

b. What is a success for this problem?

b. A success is having three cell phones in a household.

Exercise \(\PageIndex{2}\).4

c. What is the level of significance?

c. The level of significance is the preset \(\alpha\). Since \(\alpha\) is not given, assume that \(\alpha = 0.05\).

Exercise \(\PageIndex{2}\).5

d. Draw the graph for this problem. Draw the horizontal axis. Label and shade appropriately.

Calculate the \(p\text{-value}\).

d. \(p\text{-value} = 0.7216\)

Exercise \(\PageIndex{2}\).6

e. Make a decision. _____________(Reject/Do not reject) \(H_{0}\) because____________.

e. Assuming that \(\alpha = 0.05, \alpha < p\text{-value}\). The decision is do not reject \(H_{0}\) because there is not sufficient evidence to conclude that the proportion of households that have three cell phones is not 30%.

Exercise \(\PageIndex{2}\)

Marketers believe that 92% of adults in the United States own a cell phone. A cell phone manufacturer believes that number is actually lower. 200 American adults are surveyed, of which, 174 report having cell phones. Use a 5% level of significance. State the null and alternative hypothesis, find the p -value, state your conclusion, and identify the Type I and Type II errors.

  • \(H_{0}: p = 0.92\)
  • \(H_{a}: p < 0.92\)
  • \(p\text{-value} = 0.0046\)

Because the \(p\text{-value} < 0.05\), we reject the null hypothesis. There is sufficient evidence to conclude that fewer than 92% of American adults own cell phones.

  • Type I Error: To conclude that fewer than 92% of American adults own cell phones when, in fact, 92% of American adults do own cell phones (reject the null hypothesis when the null hypothesis is true).
  • Type II Error: To conclude that 92% of American adults own cell phones when, in fact, fewer than 92% of American adults own cell phones (do not reject the null hypothesis when the null hypothesis is false).
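The left-tailed \(p\text{-value}\) in this exercise can be checked with a few lines of Python. This is a sketch using the standard library, with the counts taken from the problem statement; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n, p0, alpha = 174, 200, 0.92, 0.05

p_hat = x / n                        # 0.87
se = math.sqrt(p0 * (1 - p0) / n)    # standard error under H0
z = (p_hat - p0) / se
p_value = NormalDist().cdf(z)        # left-tail area, because Ha is p < 0.92
reject = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```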

The next example is a poem written by a statistics student named Nicole Hart. The solution to the problem follows the poem. Notice that the hypothesis test is for a single population proportion. This means that the null and alternate hypotheses use the parameter \(p\). The distribution for the test is normal. The estimated proportion \(\hat{p}\) is the proportion of fleas killed to the total fleas found on Fido. This is sample information. The problem gives a preconceived \(\alpha = 0.01\), for comparison, and a 95% confidence interval computation. The poem is clever and humorous, so please enjoy it!

Example \(\PageIndex{3}\)

My dog has so many fleas,

They do not come off with ease. As for shampoo, I have tried many types Even one called Bubble Hype, Which only killed 25% of the fleas, Unfortunately I was not pleased.

I've used all kinds of soap, Until I had given up hope Until one day I saw An ad that put me in awe.

A shampoo used for dogs Called GOOD ENOUGH to Clean a Hog Guaranteed to kill more fleas.

I gave Fido a bath And after doing the math His number of fleas Started dropping by 3's! Before his shampoo I counted 42.

At the end of his bath, I redid the math And the new shampoo had killed 17 fleas. So now I was pleased.

Now it is time for you to have some fun With the level of significance being .01, You must help me figure out

Use the new shampoo or go without?

\(H_{0}: p \leq 0.25\)   \(H_{a}: p > 0.25\)

In words, CLEARLY state what your random variable \(\bar{X}\) or \( \hat{P} \) represents.

\( \hat{P} =\) The proportion of fleas that are killed by the new shampoo

State the distribution to use for the test.

\[N\left(0.25, \sqrt{\frac{0.25 \cdot (1-0.25)}{42}}\right)\nonumber \]

Test Statistic: \(z_{obs} = 2.3163\)

Calculate the \(p\text{-value}\) using the normal distribution for proportions:

\[p\text{-value} = 0.0103\nonumber \]

In one to two complete sentences, explain what the p -value means for this problem.

If the null hypothesis is true (the proportion is 0.25), then there is a 0.0103 probability that the sample (estimated) proportion is 0.4048 \(\left(\frac{17}{42}\right)\) or more.

Use the previous information to sketch a picture of this situation. CLEARLY, label and scale the horizontal axis and shade the region(s) corresponding to the \(p\text{-value}\).

Normal distribution graph of the proportion of fleas killed by the new shampoo with values of 0.25 and 0.4048 on the x-axis. A vertical upward line extends from 0.4048 to the curve and the area to the right of this is shaded in. The test statistic of the sample proportion is listed.

Indicate the correct decision (“reject” or “do not reject” the null hypothesis), the reason for it, and write an appropriate conclusion, using complete sentences.

Decision: Do not reject \(H_{0}\), because \(\alpha = 0.01 < p\text{-value} = 0.0103\).

Conclusion: At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of fleas that are killed by the new shampoo is more than 25%.

Construct a 95% confidence interval for the true mean or proportion. Include a sketch of the graph of the situation. Label the point estimate and the lower and upper bounds of the confidence interval.

Normal distribution graph of the proportion of fleas killed by the new shampoo with values of 0.26, 17/42, and 0.55 on the x-axis. A vertical upward line extends from 0.26 and 0.55. The area between these two points is equal to 0.95.

Confidence Interval: (0.26,0.55) We are 95% confident that the true population proportion p of fleas that are killed by the new shampoo is between 26% and 55%.

This test result is not very definitive since the \(p\text{-value}\) is very close to alpha. In reality, one would probably do more tests by giving the dog another bath after the fleas have had a chance to return.
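The test statistic, \(p\text{-value}\), and confidence interval for the flea example can be reproduced as follows. This is a sketch with the standard library; note that the test uses the null value 0.25 in the standard error, while the interval uses the sample proportion.

```python
import math
from statistics import NormalDist

x, n, p0, alpha = 17, 42, 0.25, 0.01
p_hat = x / n

# Test statistic: standard error is computed from the null value p0
se_null = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_null
p_value = 1 - NormalDist().cdf(z)        # right-tail area

# 95% confidence interval: standard error is computed from p_hat
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
z_star = NormalDist().inv_cdf(0.975)
ci = (p_hat - z_star * se_hat, p_hat + z_star * se_hat)

print(f"z = {z:.4f}, p-value = {p_value:.4f}, CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```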

Example \(\PageIndex{4}\)

In a study of 420,019 cell phone users, 172 of the subjects developed brain cancer. Test the claim that cell phone users developed brain cancer at a greater rate than that for non-cell phone users (the rate of brain cancer for non-cell phone users is 0.0340%). Since this is a critical issue, use a 0.005 significance level. Explain why the significance level should be so low in terms of a Type I error.

We will follow the five-step process.

If we commit a Type I error, we are essentially accepting a false claim. Since the claim describes cancer-causing environments, we want to minimize the chances of incorrectly identifying causes of cancer.

  • \(H_{0}: p \leq 0.00034\)
  • \(H_{a}: p > 0.00034\)
  • \(\hat{P}=\frac{172}{420,019}=0.00041\).
  • The sample is sufficiently large because we have \(np = 420,019(0.00034) = 142.8\), \(nq = 420,019(0.99966) = 419,876.2\) both greater than five. Sample is random with independent observations. Thus we will be able to generalize our results to the population.
  • \(SE=\sqrt{\frac{0.00034(1-0.00034)}{420,019}}=0.000028\)
  • \(Z_{obs}=\frac{0.00041-0.00034}{0.000028}=2.5\)
  • \(p\text{-value} = 1-P(Z<2.5)=0.0062\)
  • Since the \(p\text{-value} = 0.0062\) is greater than our alpha value \(= 0.005\), we cannot reject the null and cannot support alternative.
  • Therefore, we conclude that there is not enough evidence to support the claim of higher brain cancer rates for the cell phone users.
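The steps above round \(\hat{P}\) and the standard error before dividing. As a hedged cross-check, the sketch below carries full precision through the calculation; the test statistic then comes out slightly below 2.5 and the \(p\text{-value}\) slightly above 0.0062, and the decision at \(\alpha = 0.005\) is unchanged.

```python
import math
from statistics import NormalDist

x, n, p0, alpha = 172, 420_019, 0.00034, 0.005

p_hat = x / n
se = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
p_value = 1 - NormalDist().cdf(z)    # right-tail area, because Ha is p > 0.00034

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```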

Example \(\PageIndex{5}\)

According to the US Census there are approximately 268,608,618 residents aged 12 and older. Statistics from the Rape, Abuse, and Incest National Network indicate that, on average, 207,754 rapes occur each year (male and female) for persons aged 12 and older. This translates into a percentage of sexual assaults of 0.078%. In Daviess County, KY, there were reported 11 rapes for a population of 37,937. Conduct an appropriate hypothesis test to determine if there is a statistically significant difference between the local sexual assault percentage and the national sexual assault percentage. Use a significance level of 0.01.

We will follow the five-step plan.

  • We need to test whether the proportion of sexual assaults in Daviess County, KY is significantly different from the national average.
  • \(H_{0}: p = 0.00078\)
  • \(H_{a}: p \neq 0.00078\)
  • \(p\text{-value} = 0.00063\)
  • Since the \(p\text{-value} = 0.00063\) is less than the alpha level of 0.01, the sample data indicate that we should reject the null hypothesis.
  • In conclusion, the sample data support the claim that the proportion of sexual assaults in Daviess County, Kentucky is different from the national average proportion.
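The solution above reports only the \(p\text{-value}\). The intermediate quantities can be sketched in Python as follows, using the counts and the national proportion given in the problem; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n, p0, alpha = 11, 37_937, 0.00078, 0.01

p_hat = x / n
expected_successes = n * p0              # should be at least 5
se = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed area

print(f"p_hat = {p_hat:.5f}, z = {z:.2f}, p-value = {p_value:.5f}")
```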

The hypothesis test itself has an established process. This can be summarized as follows:

  • Determine \(H_{0}\) and \(H_{a}\). Remember, they are contradictory.
  • Determine the random variable.
  • Determine the distribution for the test.
  • Draw a graph, calculate the test statistic, and use the test statistic to calculate the \(p\text{-value}\). (A z-score, \(Z_{obs}\), is an example of a test statistic.)
  • Compare the preconceived \(\alpha\) with the \(p\text{-value}\), make a decision (reject or do not reject \(H_{0}\)), and write a clear conclusion using English sentences.

Notice that in performing the hypothesis test, you use \(\alpha\) and not \(\beta\). \(\beta\) is needed to help determine the sample size of the data that is used in calculating the \(p\text{-value}\). Remember that the quantity \(1 – \beta\) is called the Power of the Test . A high power is desirable. If the power is too low, statisticians typically increase the sample size while keeping \(\alpha\) the same. If the power is low, the null hypothesis might not be rejected when it should be.
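To illustrate how power grows with sample size at a fixed \(\alpha\), here is a minimal sketch based on the normal approximation. The function name `power_two_sided` and the scenario (null value 0.50, true proportion 0.55) are hypothetical and not taken from the examples above.

```python
import math
from statistics import NormalDist

def power_two_sided(p0, p_true, n, alpha):
    """Approximate power of a two-sided one-proportion z-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se0 = math.sqrt(p0 * (1 - p0) / n)          # SE under H0
    se1 = math.sqrt(p_true * (1 - p_true) / n)  # SE under the true proportion
    lower = p0 - z_crit * se0                   # reject if p_hat <= lower
    upper = p0 + z_crit * se0                   # or if p_hat >= upper
    return (std_normal.cdf((lower - p_true) / se1)
            + 1 - std_normal.cdf((upper - p_true) / se1))

# Hypothetical scenario: H0 says p = 0.50 but the true proportion is 0.55
for n in (100, 400, 1600):
    print(n, round(power_two_sided(0.50, 0.55, n, alpha=0.01), 3))
```

When the true proportion equals the null value, the function returns \(\alpha\) itself, which is the Type I error rate.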


Exercises - Hypothesis Tests for Proportions (Two Samples)


A researcher wants to examine the possibility that women are more likely to commit suicide off a cliff than men are. Data from the past 5 years show that out of 987 men who committed suicide, 394 jumped off a cliff. On the other hand, $44\%$ of the 500 women who committed suicide did so by jumping off a cliff.

What can be said about the researcher's claim? Use the $P$-value method with a significance level of $\alpha=0.05$.

A survey asks a random sample of men and women whether they agree with a particular statement. The result of the survey is that $35\%$ of the men and $43\%$ of the women agree with the statement. The maximum error of the estimate for the difference in proportions is calculated to be 0.135.

  • Find the confidence interval for the difference in the proportions of men and women who agree.
  • Should we reject the null hypothesis $p_M=p_W$? Explain.
  • $-.215 \lt p_M-p_W \lt .055$
  • Fail to reject the null hypothesis because 0 is in the confidence interval.
  • Find a $95\%$ confidence interval for the difference in the proportions of satisfied freshmen and sophomores.
  • Using the confidence interval above, test the claim that there is no difference in the proportion of satisfied freshmen and sophomores. (Use $ \alpha=.05$.)

Assumptions: $29,\ 11,\ 23,\ 17 \geq 5$; $E=.206$

Confidence interval: $-.056 \lt p_F-p_S \lt .356$

Interpretation: We are $95\%$ confident that the difference in the proportions of satisfied freshmen and sophomores is between -.056 and .356.

  • The null hypothesis for testing this claim would be $H_0 : p_F - p_S = 0$. As this $0$ is in the confidence interval (as expected), we fail to reject the null hypothesis. There is no evidence of a difference in the proportions of satisfied freshmen and sophomores.
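The confidence-interval approach used in these exercises amounts to simple arithmetic. As a sketch, the snippet below rebuilds the interval for the earlier survey of men and women ($35\%$, $43\%$, maximum error 0.135) and checks whether it contains 0; the variable names are illustrative.

```python
# Survey result from the exercise: 35% of men, 43% of women, margin of error 0.135
p_men, p_women, margin = 0.35, 0.43, 0.135

diff = p_men - p_women
lower, upper = diff - margin, diff + margin
contains_zero = lower < 0 < upper   # True means: fail to reject p_M = p_W

print(f"{lower:.3f} < p_M - p_W < {upper:.3f}; contains 0: {contains_zero}")
```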

How to Perform Hypothesis Testing for a Proportion

Last Updated: July 31, 2023

This article was co-authored by Joseph Quinones . Joseph Quinones is a High School Physics Teacher working at South Bronx Community Charter High School. Joseph specializes in astronomy and astrophysics and is interested in science education and science outreach, currently practicing ways to make physics accessible to more students with the goal of bringing more students of color into the STEM fields. He has experience working on Astrophysics research projects at the Museum of Natural History (AMNH). Joseph received his Bachelor's degree in Physics from Lehman College and his Master's in Physics Education from City College of New York (CCNY). He is also a member of a network called New York City Men Teach. This article has been viewed 34,292 times.

Hypothesis testing for a proportion is used to determine if a sampled proportion is significantly different from a specified population proportion. For example, suppose you expect the proportion of male births to be 50 percent, but the actual proportion of male births is 53 percent in a sample of 1000 births. Is this significantly different from the hypothesized population parameter? To find out, follow these steps.

Step 1 Formulate your research question.

  • Are there more than 50 percent of Americans who self-identify as liberal?
  • Is the percentage of defects in a given manufacturing plant more than 5%?
  • Is the proportion of babies born male different from 50 percent?
  • Are there more Americans who self-identify as liberal than as conservative? (Use hypothesis testing for 2 proportions instead.)
  • Is the mean number of defects in a given manufacturing plant more than 50 per month? (Use hypothesis testing for one sample t-test instead.)
  • Are male births related to paternal age? (Use chi-square test for independence instead.)

Step 2 Check to see if the following assumptions are met:

  • Simple random sampling is used.
  • Each sample point can result in only one of two possible outcomes. These outcomes are called successes and failures.
  • The sample includes at least 10 successes and 10 failures.
  • The population size is at least 20 times as big as the sample size.

Step 3 State the null hypothesis and the alternative hypothesis.

  • Right-tailed: Research question: Is the sample proportion greater than the hypothesized population proportion? Your hypotheses would be stated as follows: H0: p<=p0; Ha: p>p0.
  • Left-tailed: Research question: Is the sample proportion less than the hypothesized population proportion? Your hypotheses would be stated as follows: H0: p>=p0; Ha: p<p0.
  • Two-tailed: Research question: Is the sample proportion different from the hypothesized population proportion? Your hypotheses would be stated as follows: H0: p=p0; Ha: p<>p0.
  • In your example, you can use a two-tailed test to see if the sample proportion of male births, 0.53, is different from the hypothesized population proportion of 0.50. So H0: p=0.50; Ha: p<>0.50. Typically, if there is no a priori reason to believe that any differences must be unidirectional, the two-tailed test is preferred as it is a more stringent test.

Step 4 Set an appropriate significance level (alpha).

Step 5 Calculate the test statistic.

  • In our example, p=0.53 (the sample proportion), p0=0.50, and n=1000. The standard error is s = sqrt(0.50*(1-0.50)/1000) = 0.0158. The test statistic is z = (0.53-0.50)/0.0158 = 1.8974.

Step 6 Convert the test statistic to a p value.

  • Normal distribution probability z table. It is important to read the table description to note what probability is listed by the table. Some tables list cumulative (left side) area, others list right tail area, still others list only area from mean up to a positive z value.
  • Excel. Use the Excel function =norm.s.dist(z,cumulative). Substitute the numeric value for z and "true" for cumulative. This Excel formula gives the cumulative area to the left of a given z value. For your example, you would use the formula =norm.s.dist(1.8974,true) to find the cumulative left side area, which includes the left tail and the body. (Body is the area from -z to z.) You can subtract this from 1 to find the right tail area. Since your example is 2-tailed, you would then multiply by 2. A formula for p can be =2*(1-norm.s.dist(1.8974,true)). The output is 0.0578.
  • Texas Instruments calculator, such as TI-83 or TI-84.
  • Online normal distribution calculators.
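If you prefer Python to the tools listed above, the standard library offers an equivalent of the Excel formula. In this sketch, `NormalDist().cdf(z)` plays the role of =norm.s.dist(z,true); the variable names are illustrative.

```python
from statistics import NormalDist

z = 1.8974

# Cumulative area to the left of z, like =norm.s.dist(1.8974, true)
left_area = NormalDist().cdf(z)

# Right-tail area, doubled because the example is two-tailed
p_two_tailed = 2 * (1 - left_area)

print(round(p_two_tailed, 4))
```

The printed value should match the Excel output of 0.0578 quoted above.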

Step 7 Decide between null hypothesis or alternative hypothesis.


Hypothesis Test: Difference Between Proportions

This lesson explains how to conduct a hypothesis test to determine whether the difference between two proportions is significant. The test procedure, called the two-proportion z-test , is appropriate when the following conditions are met:

  • The sampling method for each population is simple random sampling .
  • The samples are independent .
  • Each sample includes at least 10 successes and 10 failures.
  • Each population is at least 20 times as big as its sample.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

State the Hypotheses

Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The table below shows three sets of hypotheses. Each makes a statement about the difference d between two population proportions, P1 and P2. (In the table, the symbol ≠ means "not equal to".)

The first set of hypotheses (Set 1) is an example of a two-tailed test , since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis. The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests , since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis.

When the null hypothesis states that there is no difference between the two population proportions (i.e., d = P1 - P2 = 0), the null and alternative hypothesis for a two-tailed test are often stated in the following form.

Ho: P1 = P2

Ha: P1 ≠ P2

Formulate an Analysis Plan

The analysis plan describes how to use sample data to reject or fail to reject the null hypothesis. It should specify the following elements.

  • Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
  • Test method. Use the two-proportion z-test (described in the next section) to determine whether the hypothesized difference between population proportions differs significantly from the observed sample difference.

Analyze Sample Data

Using sample data, complete the following computations to find the test statistic and its associated P-Value.

p = (p1 * n1 + p2 * n2) / (n1 + n2)

SE = sqrt{ p * (1 - p) * [ (1/n1) + (1/n2) ] }

z = (p1 - p2) / SE

  • P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a z-score, use the Normal Distribution Calculator to assess the probability associated with the z-score. (See sample problems at the end of this lesson for examples of how this is done.)

The analysis described above is a two-proportion z-test.

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level , and rejecting the null hypothesis when the P-value is less than the significance level.

Test Your Understanding

In this section, two sample problems illustrate how to conduct a hypothesis test for the difference between two proportions. The first problem involves a two-tailed test; the second problem, a one-tailed test.

Problem 1: Two-Tailed Test

Suppose the Acme Drug Company develops a new drug, designed to prevent colds. The company states that the drug is equally effective for men and women. To test this claim, they choose a simple random sample of 100 women and 200 men from a population of 100,000 volunteers.

At the end of the study, 38% of the women caught a cold; and 51% of the men caught a cold. Based on these findings, can we reject the company's claim that the drug is equally effective for men and women? Use a 0.05 level of significance.

Solution: The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:

Null hypothesis: P1 = P2

Alternative hypothesis: P1 ≠ P2

  • Formulate an analysis plan . For this analysis, the significance level is 0.05. The test method is a two-proportion z-test.

p = [(0.38 * 100) + (0.51 * 200)] / (100 + 200)

p = 140/300 = 0.467

SE = sqrt [ 0.467 * 0.533 * ( 1/100 + 1/200 ) ]

SE = sqrt [0.003733] = 0.061

z = (p 1 - p 2 ) / SE = (0.38 - 0.51)/0.061 = -2.13

where p 1 is the sample proportion in sample 1, where p 2 is the sample proportion in sample 2, n 1 is the size of sample 1, and n 2 is the size of sample 2.

Since we have a two-tailed test , the P-value is the probability that the z-score is less than -2.13 or greater than 2.13.

  • Interpret results . Since the P-value (0.034) is less than the significance level (0.05), we cannot accept the null hypothesis.

Note: If you use this approach on an exam, you may also want to mention why this approach is appropriate. Specifically, the approach is appropriate because the sampling method was simple random sampling, the samples were independent, each population was at least 10 times larger than its sample, and each sample included at least 10 successes and 10 failures.
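The arithmetic in Problem 1 can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `two_proportion_z_test` is a hypothetical helper introduced here for illustration, not something from the lesson. Because the script does not round the z-score before looking up the tail area, its P-value comes out near 0.033 rather than the hand-rounded 0.034.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-tailed P-value)."""
    # Pooled sample proportion, weighting each sample by its size.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference under the null hypothesis P1 = P2.
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-tailed: area below -|z| plus area above +|z|.
    p_two_tailed = 2 * NormalDist().cdf(-abs(z))
    return z, p_two_tailed


# Problem 1: 38% of 100 women and 51% of 200 men caught a cold.
z, p_value = two_proportion_z_test(0.38, 100, 0.51, 200)
alpha = 0.05
print(f"z = {z:.2f}, P-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The decision is the same either way: the P-value is below 0.05, so the null hypothesis of equal effectiveness is rejected.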

Problem 2: One-Tailed Test

Suppose the previous example is stated a little bit differently. Suppose the Acme Drug Company develops a new drug, designed to prevent colds. The company states that the drug is more effective for women than for men. To test this claim, they choose a simple random sample of 100 women and 200 men from a population of 100,000 volunteers.

At the end of the study, 38% of the women caught a cold; and 51% of the men caught a cold. Based on these findings, can we conclude that the drug is more effective for women than for men? Use a 0.01 level of significance.

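  • State the hypotheses.

Null hypothesis: P1 >= P2

Alternative hypothesis: P1 < P2

  • Formulate an analysis plan. For this analysis, the significance level is 0.01. The test method is a two-proportion z-test.
  • Analyze sample data. The sample data are the same as in Problem 1, so the pooled sample proportion, the standard error, and the z-score (-2.13) are unchanged. Since we have a one-tailed test, the P-value is the probability that the z-score is less than -2.13, which is 0.017.
  • Interpret results. Since the P-value (0.017) is greater than the significance level (0.01), we cannot reject the null hypothesis.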
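For the one-tailed version, only the P-value calculation changes: the area in the left tail alone is used. The sketch below reuses the Problem 1 numbers; the variable names are illustrative choices, not part of the lesson.

```python
from math import sqrt
from statistics import NormalDist

# Same sample data as Problem 1 (sample 1 = women, sample 2 = men).
p1, n1 = 0.38, 100
p2, n2 = 0.51, 200
alpha = 0.01

pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

# Left-tailed test (alternative hypothesis P1 < P2): area below z only.
p_left = NormalDist().cdf(z)

print(f"z = {z:.2f}, one-tailed P-value = {p_left:.3f}")
print("reject H0" if p_left < alpha else "fail to reject H0")
```

With a significance level of 0.01 the one-tailed P-value is not small enough, which matches the conclusion in the text.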

Practice Questions – Hypothesis Testing for a Single Proportion

Published by admin on August 17, 2022

Question 1:

A certain cell phone company believes that the proportion of customers they have among the population in a certain town is 33%. In order to test this, the company runs a survey, which 350 random respondents answered, and 126 respondents confirmed having this brand.

  • At a 5% level of significance, is there strong enough evidence to conclude that the actual proportion within the population is greater than 33%?
  • At a 2% level of significance, is there strong enough evidence to conclude that the proportion in the population is different from 33%?
  • Find a confidence interval for the true proportion in the population, based upon the results of the survey.

This is what we know:

n = 350, x = 126, so the sample proportion is 126/350 = 0.36; the hypothesized proportion is 0.33.

  • This is the relevant calculation:

Z = (0.36 - 0.33) / sqrt(0.33 * 0.67 / 350) = 1.19

And so, the critical Z for a 5% level of significance is 1.645. Since 1.19 < 1.645, there is not enough evidence to reject the null hypothesis in favor of the alternative hypothesis: at the 5% level of significance, we cannot conclude that the true proportion is greater than 33%.

  • Using the same calculation, at a 2% level of significance the relevant two-sided critical value is 2.33. Since 1.19 < 2.33, in this case as well there is not enough evidence to reject the null hypothesis: at the 2% level of significance, the actual proportion isn't significantly different from 33%.
  • For the confidence interval: since no confidence level is specified, we'll use a 5% level of significance, that is, a 95% confidence interval.
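The three parts of Question 1 can be reproduced with the sketch below. It uses only the Python standard library, and the variable names are illustrative. One assumption is worth flagging: the original calculation for the confidence interval is not available, so the code uses the common large-sample interval with the sample proportion in the standard error.

```python
from math import sqrt
from statistics import NormalDist

n, x, p0 = 350, 126, 0.33
p_hat = x / n  # sample proportion, 0.36
std_normal = NormalDist()

# Test statistic: the standard error uses the hypothesized proportion p0.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Part 1: right-tailed test at the 5% level.
z_crit_right = std_normal.inv_cdf(1 - 0.05)
# Part 2: two-tailed test at the 2% level (1% in each tail).
z_crit_two = std_normal.inv_cdf(1 - 0.02 / 2)

# Part 3: 95% confidence interval (assumption: standard error based on p_hat).
margin = std_normal.inv_cdf(0.975) * sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)

print(f"z = {z:.2f}")
print(f"5% right-tail critical value = {z_crit_right:.3f}, reject: {z > z_crit_right}")
print(f"2% two-tail critical value = {z_crit_two:.3f}, reject: {abs(z) > z_crit_two}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Both tests fail to reject the null hypothesis, in line with the answers above.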

Question 2:

A politician believes that the level of support in her area is 56%.

To validate this, she runs a study of 450 respondents. 272 respondents confirm their support.

  • Is there reason to believe that the politician's level of support has changed?
  • Assuming there's no prior knowledge of the politician's true level of support, find a 90% confidence interval for the true proportion of her support.
  • What is the minimum sample size required for a confidence interval with a margin of error of less than 5% around her true proportion of support, at a 5% level of significance?

  • This is the relevant test statistic for these measures: the sample proportion is 272/450 = 0.604, which gives Z = 1.93.

The two-sided p-value for Z = 1.93 is approximately 0.054.

So given the result reported, at the 5% level of significance, there's no reason to conclude there's a significant change in the true proportion of support.

  • With no prior knowledge, the proportion of support we'd assume is 0.50, and that value is used in the standard error of the 90% confidence interval.
  • The minimum sample size is found by solving the margin-of-error formula for n.
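The test and the sample-size calculation for Question 2 can be sketched as follows. Two assumptions are made and labeled in the comments: the standard error uses the sample proportion, because that reproduces the reported Z = 1.93 (using the hypothesized 0.56 instead would give roughly 1.90), and the sample-size step uses the conservative proportion 0.50 with a 0.05 margin of error.

```python
from math import ceil, sqrt
from statistics import NormalDist

n, x, p0 = 450, 272, 0.56
p_hat = x / n
std_normal = NormalDist()

# Assumption: standard error based on p_hat, which matches the reported Z = 1.93.
z = (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)

# Two-sided p-value, since the question asks about a change in either direction.
p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))

# Minimum sample size for a 0.05 margin of error at 95% confidence,
# assuming the conservative proportion 0.50 (no prior knowledge).
margin = 0.05
z_star = std_normal.inv_cdf(0.975)
n_required = ceil((z_star / margin) ** 2 * 0.5 * 0.5)

print(f"Z = {z:.2f}, two-sided p-value = {p_two_sided:.3f}")
print(f"minimum sample size = {n_required}")
```

Rounding up with `ceil` is deliberate: a fractional respondent is not possible, and rounding down would leave the margin of error slightly above the target.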

Writing hypotheses for testing the difference of proportions

  • (Choice A) H0: p1 = p2; Ha: p1 < p2
  • (Choice B) H0: p1 = p2; Ha: p1 > p2
  • (Choice C) H0: p1 = p2; Ha: p1 ≠ p2
  • (Choice D) H0: p1 ≠ p2; Ha: p1 = p2

9.E: Hypothesis Testing with One Sample (Exercises)

These are homework exercises to accompany the Textmap created for "Introductory Statistics" by OpenStax.

9.1: Introduction

9.2: Null and Alternative Hypotheses

Some of the following statements refer to the null hypothesis, some to the alternate hypothesis.

State the null hypothesis, \(H_{0}\), and the alternative hypothesis, \(H_{a}\), in terms of the appropriate parameter \((\mu \text{ or } p)\).

  • The mean number of years Americans work before retiring is 34.
  • At most 60% of Americans vote in presidential elections.
  • The mean starting salary for San Jose State University graduates is at least $100,000 per year.
  • Twenty-nine percent of high school seniors get drunk each month.
  • Fewer than 5% of adults ride the bus to work in Los Angeles.
  • The mean number of cars a person owns in her lifetime is not more than ten.
  • About half of Americans prefer to live away from cities, given the choice.
  • Europeans have a mean paid vacation each year of six weeks.
  • The chance of developing breast cancer is under 11% for women.
  • Private universities' mean tuition cost is more than $20,000 per year.

Answers:

  • \(H_{0}: \mu = 34; H_{a}: \mu \neq 34\)
  • \(H_{0}: p \leq 0.60; H_{a}: p > 0.60\)
  • \(H_{0}: \mu \geq 100,000; H_{a}: \mu < 100,000\)
  • \(H_{0}: p = 0.29; H_{a}: p \neq 0.29\)
  • \(H_{0}: p = 0.05; H_{a}: p < 0.05\)
  • \(H_{0}: \mu \leq 10; H_{a}: \mu > 10\)
  • \(H_{0}: p = 0.50; H_{a}: p \neq 0.50\)
  • \(H_{0}: \mu = 6; H_{a}: \mu \neq 6\)
  • \(H_{0}: p \geq 0.11; H_{a}: p < 0.11\)
  • \(H_{0}: \mu \leq 20,000; H_{a}: \mu > 20,000\)

Over the past few decades, public health officials have examined the link between weight concerns and teen girls' smoking. Researchers surveyed a group of 273 randomly selected teen girls living in Massachusetts (between 12 and 15 years old). After four years the girls were surveyed again. Sixty-three said they smoked to stay thin. Is there good evidence that more than thirty percent of the teen girls smoke to stay thin? The alternative hypothesis is:

  • \(p < 0.30\)
  • \(p \leq 0.30\)
  • \(p \geq 0.30\)
  • \(p > 0.30\)

A statistics instructor believes that fewer than 20% of Evergreen Valley College (EVC) students attended the opening night midnight showing of the latest Harry Potter movie. She surveys 84 of her students and finds that 11 attended the midnight showing. An appropriate alternative hypothesis is:

  • \(p = 0.20\)
  • \(p > 0.20\)
  • \(p < 0.20\)
  • \(p \leq 0.20\)

Previously, an organization reported that teenagers spent 4.5 hours per week, on average, on the phone. The organization thinks that, currently, the mean is higher. Fifteen randomly chosen teenagers were asked how many hours per week they spend on the phone. The sample mean was 4.75 hours with a sample standard deviation of 2.0. Conduct a hypothesis test. The null and alternative hypotheses are:

  • \(H_{0}: \bar{x} = 4.5, H_{a}: \bar{x} > 4.5\)
  • \(H_{0}: \mu \geq 4.5, H_{a}: \mu < 4.5\)
  • \(H_{0}: \mu = 4.75, H_{a}: \mu > 4.75\)
  • \(H_{0}: \mu = 4.5, H_{a}: \mu > 4.5\)

9.3: Outcomes and the Type I and Type II Errors

State the Type I and Type II errors in complete sentences given the following statements.

  • The mean number of cars a person owns in his or her lifetime is not more than ten.
  • Private universities' mean tuition cost is more than $20,000 per year.

Answers:

  • Type I error: We conclude that the mean is not 34 years, when it really is 34 years. Type II error: We conclude that the mean is 34 years, when in fact it really is not 34 years.
  • Type I error: We conclude that more than 60% of Americans vote in presidential elections, when the actual percentage is at most 60%. Type II error: We conclude that at most 60% of Americans vote in presidential elections when, in fact, more than 60% do.
  • Type I error: We conclude that the mean starting salary is less than $100,000, when it really is at least $100,000. Type II error: We conclude that the mean starting salary is at least $100,000 when, in fact, it is less than $100,000.
  • Type I error: We conclude that the proportion of high school seniors who get drunk each month is not 29%, when it really is 29%. Type II error: We conclude that the proportion of high school seniors who get drunk each month is 29% when, in fact, it is not 29%.
  • Type I error: We conclude that fewer than 5% of adults ride the bus to work in Los Angeles, when the percentage that do is really 5% or more. Type II error: We conclude that 5% or more adults ride the bus to work in Los Angeles when, in fact, fewer than 5% do.
  • Type I error: We conclude that the mean number of cars a person owns in his or her lifetime is more than 10, when in reality it is not more than 10. Type II error: We conclude that the mean number of cars a person owns in his or her lifetime is not more than 10 when, in fact, it is more than 10.
  • Type I error: We conclude that the proportion of Americans who prefer to live away from cities is not about half, though the actual proportion is about half. Type II error: We conclude that the proportion of Americans who prefer to live away from cities is half when, in fact, it is not half.
  • Type I error: We conclude that the duration of paid vacations each year for Europeans is not six weeks, when in fact it is six weeks. Type II error: We conclude that the duration of paid vacations each year for Europeans is six weeks when, in fact, it is not.
  • Type I error: We conclude that the proportion is less than 11%, when it is really at least 11%. Type II error: We conclude that the proportion of women who develop breast cancer is at least 11%, when in fact it is less than 11%.
  • Type I error: We conclude that the average tuition cost at private universities is more than $20,000, though in reality it is at most $20,000. Type II error: We conclude that the average tuition cost at private universities is at most $20,000 when, in fact, it is more than $20,000.

For statements a-j in Exercise 9.109, answer the following in complete sentences.

  • State a consequence of committing a Type I error.
  • State a consequence of committing a Type II error.

When a new drug is created, the pharmaceutical company must subject it to testing before receiving the necessary permission from the Food and Drug Administration (FDA) to market the drug. Suppose the null hypothesis is “the drug is unsafe.” What is the Type II Error?

  • To conclude the drug is safe when, in fact, it is unsafe.
  • Not to conclude the drug is safe when, in fact, it is safe.
  • To conclude the drug is safe when, in fact, it is safe.
  • Not to conclude the drug is unsafe when, in fact, it is unsafe.

A statistics instructor believes that fewer than 20% of Evergreen Valley College (EVC) students attended the opening midnight showing of the latest Harry Potter movie. She surveys 84 of her students and finds that 11 of them attended the midnight showing. The Type I error is to conclude that the percent of EVC students who attended is ________.

  • at least 20%, when in fact, it is less than 20%.
  • 20%, when in fact, it is 20%.
  • less than 20%, when in fact, it is at least 20%.
  • less than 20%, when in fact, it is less than 20%.

It is believed that Lake Tahoe Community College (LTCC) Intermediate Algebra students get less than seven hours of sleep per night, on average. A survey of 22 LTCC Intermediate Algebra students generated a mean of 7.24 hours with a standard deviation of 1.93 hours. At a level of significance of 5%, do LTCC Intermediate Algebra students get less than seven hours of sleep per night, on average?

The Type II error is not to reject that the mean number of hours of sleep LTCC students get per night is at least seven when, in fact, the mean number of hours

  • is more than seven hours.
  • is at most seven hours.
  • is at least seven hours.
  • is less than seven hours.

Previously, an organization reported that teenagers spent 4.5 hours per week, on average, on the phone. The organization thinks that, currently, the mean is higher. Fifteen randomly chosen teenagers were asked how many hours per week they spend on the phone. The sample mean was 4.75 hours with a sample standard deviation of 2.0. Conduct a hypothesis test. The Type I error is:

  • to conclude that the current mean hours per week is higher than 4.5, when in fact, it is higher
  • to conclude that the current mean hours per week is higher than 4.5, when in fact, it is the same
  • to conclude that the mean hours per week currently is 4.5, when in fact, it is higher
  • to conclude that the mean hours per week currently is no higher than 4.5, when in fact, it is not higher

9.4: Distribution Needed for Hypothesis Testing

It is believed that Lake Tahoe Community College (LTCC) Intermediate Algebra students get less than seven hours of sleep per night, on average. A survey of 22 LTCC Intermediate Algebra students generated a mean of 7.24 hours with a standard deviation of 1.93 hours. At a level of significance of 5%, do LTCC Intermediate Algebra students get less than seven hours of sleep per night, on average? The distribution to be used for this test is \(\bar{X} \sim\) ________________

  • \(N\left(7.24, \frac{1.93}{\sqrt{22}}\right)\)
  • \(N\left(7.24, 1.93\right)\)

9.5: Rare Events, the Sample, Decision and Conclusion

The National Institute of Mental Health published an article stating that in any one-year period, approximately 9.5 percent of American adults suffer from depression or a depressive illness. Suppose that in a survey of 100 people in a certain town, seven of them suffered from depression or a depressive illness. Conduct a hypothesis test to determine if the true proportion of people in that town suffering from depression or a depressive illness is lower than the percent in the general adult American population.

  • Is this a test of one mean or proportion?
  • State the null and alternative hypotheses. \(H_{0}\) : ____________________ \(H_{a}\) : ____________________
  • Is this a right-tailed, left-tailed, or two-tailed test?
  • What symbol represents the random variable for this test?
  • In words, define the random variable for this test.
  • \(x =\) ________________
  • \(n =\) ________________
  • \(p′ =\) _____________
  • Calculate \(\sigma_{x} =\) __________. Show the formula set-up.
  • State the distribution to use for the hypothesis test.
  • Find the \(p\text{-value}\).
  • Reason for the decision:
  • Conclusion (write out in a complete sentence):

9.6: Additional Information and Full Hypothesis Test Examples

For each of the word problems, use a solution sheet to do the hypothesis test. The solution sheet is found in [link]. Please feel free to make copies of the solution sheets. For the online version of the book, it is suggested that you copy the .doc or the .pdf files.

If you are using a Student's \(t\)-distribution for one of the following homework problems, you may assume that the underlying population is normally distributed. (In general, you must first prove that assumption, however.)

A particular brand of tires claims that its deluxe tire averages at least 50,000 miles before it needs to be replaced. From past studies of this tire, the standard deviation is known to be 8,000. A survey of owners of that tire design is conducted. From the 28 tires surveyed, the mean lifespan was 46,500 miles with a standard deviation of 9,800 miles. Using \(\alpha = 0.05\), is the data highly inconsistent with the claim?

  • \(H_{0}: \mu \geq 50,000\)
  • \(H_{a}: \mu < 50,000\)
  • Let \(\bar{X} =\) the average lifespan of a brand of tires.
  • normal distribution
  • \(z = -2.315\)
  • \(p\text{-value} = 0.0103\)
  • Check student’s solution.
  • alpha: 0.05
  • Decision: Reject the null hypothesis.
  • Reason for decision: The \(p\text{-value}\) is less than 0.05.
  • Conclusion: There is sufficient evidence to conclude that the mean lifespan of the tires is less than 50,000 miles.
  • \((43,537, 49,463)\)
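The tire answers above can be verified numerically. This is a minimal sketch with illustrative variable names; it uses the known population standard deviation of 8,000 (so a normal distribution applies) and a 95% confidence level for the interval.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 50_000     # claimed mean lifespan under the null hypothesis
sigma = 8_000    # known population standard deviation
n = 28
x_bar = 46_500   # sample mean
alpha = 0.05

se = sigma / sqrt(n)
z = (x_bar - mu0) / se

# Left-tailed test, because the alternative hypothesis is mu < 50,000.
p_value = NormalDist().cdf(z)

# 95% confidence interval for the mean, centered on the sample mean.
z_star = NormalDist().inv_cdf(0.975)
ci = (x_bar - z_star * se, x_bar + z_star * se)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
print(f"95% CI = ({ci[0]:.0f}, {ci[1]:.0f})")
```

Note that the sample standard deviation of 9,800 is not used: when the population standard deviation is known, it takes precedence.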

From generation to generation, the mean age when smokers first start to smoke varies. However, the standard deviation of that age remains constant at around 2.1 years. A survey of 40 smokers of this generation was done to see if the mean starting age is at least 19. The sample mean was 18.1 with a sample standard deviation of 1.3. Do the data support the claim at the 5% level?

The cost of a daily newspaper varies from city to city. However, the variation among prices remains steady with a standard deviation of 20¢. A study was done to test the claim that the mean cost of a daily newspaper is $1.00. Twelve costs yield a mean cost of 95¢ with a standard deviation of 18¢. Do the data support the claim at the 1% level?

  • \(H_{0}: \mu = \$1.00\)
  • \(H_{a}: \mu \neq \$1.00\)
  • Let \(\bar{X} =\) the average cost of a daily newspaper.
  • \(z = -0.866\)
  • \(p\text{-value} = 0.3865\)
  • \(\alpha: 0.01\)
  • Decision: Do not reject the null hypothesis.
  • Reason for decision: The \(p\text{-value}\) is greater than 0.01.
  • Conclusion: There is insufficient evidence to reject the claim that the mean cost of daily papers is $1. The mean cost could be $1.
  • \((\$0.84, \$1.06)\)

An article in the San Jose Mercury News stated that students in the California state university system take 4.5 years, on average, to finish their undergraduate degrees. Suppose you believe that the mean time is longer. You conduct a survey of 49 students and obtain a sample mean of 5.1 with a sample standard deviation of 1.2. Do the data support your claim at the 1% level?

The mean number of sick days an employee takes per year is believed to be about ten. Members of a personnel department do not believe this figure. They randomly survey eight employees. The number of sick days they took for the past year are as follows: 12; 4; 15; 3; 11; 8; 6; 8. Let \(x =\) the number of sick days they took for the past year. Should the personnel team believe that the mean number is ten?

  • \(H_{0}: \mu = 10\)
  • \(H_{a}: \mu \neq 10\)
  • Let \(\bar{X} =\) the mean number of sick days an employee takes per year.
  • Student’s t -distribution
  • \(t = -1.12\)
  • \(p\text{-value} = 0.300\)
  • \(\alpha: 0.05\)
  • Reason for decision: The \(p\text{-value}\) is greater than 0.05.
  • Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the mean number of sick days is not ten.
  • \((4.9443, 11.806)\)
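The test statistic and interval for the sick-day data can be checked from the raw values. The sketch below stays within the Python standard library, which has no Student's t-distribution, so the p-value is not computed here; the critical value 2.365 (95% confidence, 7 degrees of freedom) is taken from a standard t-table and hard-coded as an assumption.

```python
from math import sqrt
from statistics import mean, stdev

sick_days = [12, 4, 15, 3, 11, 8, 6, 8]
mu0 = 10  # hypothesized mean number of sick days

n = len(sick_days)
x_bar = mean(sick_days)
s = stdev(sick_days)  # sample standard deviation (n - 1 in the denominator)
se = s / sqrt(n)

t = (x_bar - mu0) / se

# Assumption: t-table critical value for 95% confidence with n - 1 = 7 df.
t_star = 2.365
ci = (x_bar - t_star * se, x_bar + t_star * se)

print(f"mean = {x_bar:.3f}, s = {s:.3f}, t = {t:.2f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because the hypothesized mean of ten lies inside the interval, the interval agrees with the decision not to reject the null hypothesis.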

In 1955, Life Magazine reported that the 25 year-old mother of three worked, on average, an 80 hour week. Recently, many groups have been studying whether or not the women's movement has, in fact, resulted in an increase in the average work week for women (combining employment and at-home work). Suppose a study was done to determine if the mean work week has increased. 81 women were surveyed with the following results. The sample mean was 83; the sample standard deviation was ten. Does it appear that the mean work week has increased for women at the 5% level?

Your statistics instructor claims that 60 percent of the students who take her Elementary Statistics class go through life feeling more enriched. For some reason that she can't quite figure out, most people don't believe her. You decide to check this out on your own. You randomly survey 64 of her past Elementary Statistics students and find that 34 feel more enriched as a result of her class. Now, what do you think?

  • \(H_{0}: p \geq 0.6\)
  • \(H_{a}: p < 0.6\)
  • Let \(P′ =\) the proportion of students who feel more enriched as a result of taking Elementary Statistics.
  • normal for a single proportion
  • \(p\text{-value} = 0.1308\)
  • Conclusion: There is insufficient evidence to conclude that less than 60 percent of her students feel more enriched.

The “plus-4s” confidence interval is \((0.411, 0.648)\)
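Both the p-value and the “plus-4s” interval for the enrichment problem can be reproduced as below. This is a sketch with illustrative names; the “plus-4s” method is implemented as it is usually described (add two successes and two failures before computing a 95% interval).

```python
from math import sqrt
from statistics import NormalDist

n, x, p0 = 64, 34, 0.60
p_prime = x / n  # sample proportion

# Left-tailed test for a single proportion (alternative: p < 0.6).
z = (p_prime - p0) / sqrt(p0 * (1 - p0) / n)
p_value = NormalDist().cdf(z)

# "Plus-4s" interval: add 2 successes and 2 failures, then proceed as usual.
n_adj = n + 4
p_adj = (x + 2) / n_adj
z_star = NormalDist().inv_cdf(0.975)
margin = z_star * sqrt(p_adj * (1 - p_adj) / n_adj)
ci = (p_adj - margin, p_adj + margin)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print(f"plus-4s 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```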

A Nissan Motor Corporation advertisement read, “The average man’s I.Q. is 107. The average brown trout’s I.Q. is 4. So why can’t man catch brown trout?” Suppose you believe that the brown trout’s mean I.Q. is greater than four. You catch 12 brown trout. A fish psychologist determines the I.Q.s as follows: 5; 4; 7; 3; 6; 4; 5; 3; 6; 3; 8; 5. Conduct a hypothesis test of your belief.

Refer to Exercise 9.119. Conduct a hypothesis test to see if your decision and conclusion would change if your belief were that the brown trout’s mean I.Q. is not four.

  • \(H_{0}: \mu = 4\)
  • \(H_{a}: \mu \neq 4\)
  • Let \(\bar{X} =\) the average I.Q. of a set of brown trout.
  • two-tailed Student's t-test
  • \(t = 1.95\)
  • \(p\text{-value} = 0.076\)
  • Reason for decision: The \(p\text{-value}\) is greater than 0.05
  • Conclusion: There is insufficient evidence to conclude that the average IQ of brown trout is not four.
  • \((3.8865,5.9468)\)

According to an article in Newsweek, the natural ratio of girls to boys is 100:105. In China, the birth ratio is 100:114 (46.7% girls). Suppose you don’t believe the reported figures of the percent of girls born in China. You conduct a study. In this study, you count the number of girls and boys born in 150 randomly chosen recent births. There are 60 girls and 90 boys born of the 150. Based on your study, do you believe that the percent of girls born in China is 46.7?

A poll done for Newsweek found that 13% of Americans have seen or sensed the presence of an angel. A contingent doubts that the percent is really that high. It conducts its own survey. Out of 76 Americans surveyed, only two had seen or sensed the presence of an angel. As a result of the contingent’s survey, would you agree with the Newsweek poll? In complete sentences, also give three reasons why the two polls might give different results.

  • \(H_{a}: p < 0.13\)
  • Let \(P′ =\) the proportion of Americans who have seen or sensed angels
  • Test statistic: \(z = -2.688\)
  • \(p\text{-value} = 0.0036\)
  • Reason for decision: The \(p\text{-value}\) is less than 0.05.
  • Conclusion: There is sufficient evidence to conclude that the percentage of Americans who have seen or sensed an angel is less than 13%.

The “plus-4s” confidence interval is (0.0022, 0.0978)

The mean work week for engineers in a start-up company is believed to be about 60 hours. A newly hired engineer hopes that it’s shorter. She asks ten engineering friends in start-ups for the lengths of their mean work weeks. Based on the results that follow, should she count on the mean work week to be shorter than 60 hours?

Data (length of mean work week): 70; 45; 55; 60; 65; 55; 55; 60; 50; 55.

Use the “Lap time” data for Lap 4 (see [link]) to test the claim that Terri finishes Lap 4, on average, in less than 129 seconds. Use all twenty races given.

  • \(H_{0}: \mu \geq 129\)
  • \(H_{a}: \mu < 129\)
  • Let \(\bar{X} =\) the average time in seconds that Terri finishes Lap 4.
  • Student's t -distribution
  • \(t = 1.209\)
  • Conclusion: There is insufficient evidence to conclude that Terri’s mean lap time is less than 129 seconds.
  • \((128.63, 130.37)\)

Use the “Initial Public Offering” data (see [link]) to test the claim that the mean offer price was $18 per share. Do not use all the data. Use your random number generator to randomly survey 15 prices.

The following questions were written by past students. They are excellent problems!

"Asian Family Reunion," by Chau Nguyen

Every two years it comes around.

We all get together from different towns.

In my honest opinion,

It's not a typical family reunion.

Not forty, or fifty, or sixty,

But how about seventy companions!

The kids would play, scream, and shout

One minute they're happy, another they'll pout.

The teenagers would look, stare, and compare

From how they look to what they wear.

The men would chat about their business

That they make more, but never less.

Money is always their subject

And there's always talk of more new projects.

The women get tired from all of the chats

They head to the kitchen to set out the mats.

Some would sit and some would stand

Eating and talking with plates in their hands.

Then come the games and the songs

And suddenly, everyone gets along!

With all that laughter, it's sad to say

That it always ends in the same old way.

They hug and kiss and say "good-bye"

And then they all begin to cry!

I say that 60 percent shed their tears

But my mom counted 35 people this year.

She said that boys and men will always have their pride,

So we won't ever see them cry.

I myself don't think she's correct,

So could you please try this problem to see if you object?

  • \(H_{0}: p = 0.60\)
  • \(H_{a}: p < 0.60\)
  • Let \(P′ =\) the proportion of family members who shed tears at a reunion.
  • Test statistic: \(z = -1.71\)
  • Reason for decision: \(p\text{-value} < \alpha\)
  • Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the proportion of family members who shed tears at a reunion is less than 0.60. However, the test is weak because the \(p\text{-value}\) and alpha are quite close, so other tests should be done.
  • We are 95% confident that between 38.29% and 61.71% of family members will shed tears at a family reunion. \((0.3829, 0.6171)\). The “plus-4s” confidence interval (see chapter 8) is \((0.3861, 0.6139)\)

Note that here the “large-sample” 1-PropZTest provides the approximate \(p\text{-value}\) of 0.0438. Whenever a \(p\text{-value}\) based on a normal approximation is close to the level of significance, the exact \(p\text{-value}\) based on binomial probabilities should be calculated whenever possible. This is beyond the scope of this course.
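For readers curious about the exact calculation mentioned in the note, the sketch below computes both the normal-approximation p-value and the exact binomial p-value for the reunion data (35 of 70 shed tears, null proportion 0.60). It goes beyond what the exercise requires and is offered only as an illustration; the variable names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

n, x, p0 = 70, 35, 0.60

# Normal approximation, as a large-sample one-proportion z-test would use.
z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
p_normal = NormalDist().cdf(z)

# Exact left-tail p-value: P(X <= 35) when X ~ Binomial(70, 0.60).
p_exact = sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x + 1))

print(f"z = {z:.2f}")
print(f"normal-approximation p-value = {p_normal:.4f}")
print(f"exact binomial p-value       = {p_exact:.4f}")
```

Comparing the two printed values shows how much the approximation matters when the p-value sits close to the significance level.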

"The Problem with Angels," by Cyndy Dowling

Although this problem is wholly mine,

The catalyst came from the magazine, Time.

On the magazine cover I did find

The realm of angels tickling my mind.

Inside, 69% I found to be

In angels, Americans do believe.

Then, it was time to rise to the task,

Ninety-five high school and college students I did ask.

Viewing all as one group,

Random sampling to get the scoop.

So, I asked each to be true,

"Do you believe in angels?" Tell me, do!

Hypothesizing at the start,

Totally believing in my heart

That the proportion who said yes

Would be equal on this test.

Lo and behold, seventy-three did arrive,

Out of the sample of ninety-five.

Now your job has just begun,

Solve this problem and have some fun.

"Blowing Bubbles," by Sondra Prull

Studying stats just made me tense,

I had to find some sane defense.

Some light and lifting simple play

To float my math anxiety away.

Blowing bubbles lifts me high

Takes my troubles to the sky.

POIK! They're gone, with all my stress

Bubble therapy is the best.

The label said each time I blew

The average number of bubbles would be at least 22.

I blew and blew and this I found

From 64 blows, they all are round!

But the number of bubbles in 64 blows

Varied widely, this I know.

20 per blow became the mean

They deviated by 6, and not 16.

From counting bubbles, I sure did relax

But now I give to you your task.

Was 22 a reasonable guess?

Find the answer and pass this test!

  • \(H_{0}: \mu \geq 22\)
  • \(H_{a}: \mu < 22\)
  • Let \(\bar{X} =\) the mean number of bubbles per blow.
  • Test statistic: -2.667
  • \(p\text{-value} = 0.00486\)
  • Conclusion: There is sufficient evidence to conclude that the mean number of bubbles per blow is less than 22.
  • \((18.501, 21.499)\)

"Dalmatian Darnation," by Kathy Sparling

A greedy dog breeder named Spreckles

Bred puppies with numerous freckles

The Dalmatians he sought

Possessed spot upon spot

The more spots, he thought, the more shekels.

His competitors did not agree

That freckles would increase the fee.

They said, “Spots are quite nice

But they don't affect price;

One should breed for improved pedigree.”

The breeders decided to prove

This strategy was a wrong move.

Breeding only for spots

Would wreak havoc, they thought.

His theory they want to disprove.

They proposed a contest to Spreckles

Comparing dog prices to freckles.

In records they looked up

One hundred one pups:

Dalmatians that fetched the most shekels.

They asked Mr. Spreckles to name

An average spot count he'd claim

To bring in big bucks.

Said Spreckles, “Well, shucks,

It's for one hundred one that I aim.”

Said an amateur statistician

Who wanted to help with this mission.

“Twenty-one for the sample

Standard deviation's ample:

They examined one hundred and one

Dalmatians that fetched a good sum.

They counted each spot,

Mark, freckle and dot

And tallied up every one.

Instead of one hundred one spots

They averaged ninety six dots

Can they muzzle Spreckles’

Obsession with freckles

Based on all the dog data they've got?

"Macaroni and Cheese, please!!" by Nedda Misherghi and Rachelle Hall

As a poor starving student I don't have much money to spend for even the bare necessities. So my favorite and main staple food is macaroni and cheese. It's high in taste and low in cost and nutritional value.

One day, as I sat down to determine the meaning of life, I got a serious craving for this, oh, so important, food of my life. So I went down the street to Greatway to get a box of macaroni and cheese, but it was SO expensive! $2.02 !!! Can you believe it? It made me stop and think. The world is changing fast. I had thought that the mean cost of a box (the normal size, not some super-gigantic-family-value-pack) was at most $1, but now I wasn't so sure. However, I was determined to find out. I went to 53 of the closest grocery stores and surveyed the prices of macaroni and cheese. Here are the data I wrote in my notebook:

Price per box of Mac and Cheese:

  • 5 stores @ $2.02
  • 15 stores @ $0.25
  • 3 stores @ $1.29
  • 6 stores @ $0.35
  • 4 stores @ $2.27
  • 7 stores @ $1.50
  • 5 stores @ $1.89
  • 8 stores @ $0.75

I could see that the cost varied but I had to sit down to figure out whether or not I was right. If it does turn out that this mouth-watering dish is at most $1, then I'll throw a big cheesy party in our next statistics lab, with enough macaroni and cheese for just me. (After all, as a poor starving student I can't be expected to feed our class of animals!)

  • \(H_{0}: \mu \leq 1\)
  • \(H_{a}: \mu > 1\)
  • Let \(\bar{X} =\) the mean cost in dollars of macaroni and cheese in a certain town.
  • Student's \(t\)-distribution
  • \(t = 0.340\)
  • \(p\text{-value} = 0.36756\)
  • Conclusion: The mean cost could be $1, or less. At the 5% significance level, there is insufficient evidence to conclude that the mean price of a box of macaroni and cheese is more than $1.
  • \((0.8291, 1.241)\)
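
The notebook data are given as a frequency table, so the sample mean and standard deviation must be computed before the test statistic. The sketch below is one way to do that in Python, assuming the eight (count, price) pairs listed above and a hypothesized mean of $1; the variable names are illustrative.

```python
import math
import statistics

# (number of stores, price in dollars) pairs from the notebook
price_counts = [(5, 2.02), (15, 0.25), (3, 1.29), (6, 0.35),
                (4, 2.27), (7, 1.50), (5, 1.89), (8, 0.75)]
# Expand the frequency table into one price per store
prices = [price for count, price in price_counts for _ in range(count)]

n = len(prices)
xbar = statistics.mean(prices)
s = statistics.stdev(prices)           # sample standard deviation (n - 1 divisor)
mu0 = 1.00
t_stat = (xbar - mu0) / (s / math.sqrt(n))
print(n, round(xbar, 4), round(s, 4), round(t_stat, 3))
```

The sample mean is only slightly above $1 relative to its standard error, which is consistent with the small test statistic and the large p-value in the answer.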

"William Shakespeare: The Tragedy of Hamlet, Prince of Denmark," by Jacqueline Ghodsi

THE CHARACTERS (in order of appearance):

  • HAMLET, Prince of Denmark and student of Statistics
  • POLONIUS, Hamlet’s tutor
  • HORATIO, friend to Hamlet and fellow student

Scene: The great library of the castle, in which Hamlet does his lessons

(The day is fair, but the face of Hamlet is clouded. He paces the large room. His tutor, Polonius, is reprimanding Hamlet regarding the latter’s recent experience. Horatio is seated at the large table at right stage.)

POLONIUS: My Lord, how cans’t thou admit that thou hast seen a ghost! It is but a figment of your imagination!

HAMLET: I beg to differ; I know of a certainty that five-and-seventy in one hundred of us, condemned to the whips and scorns of time as we are, have gazed upon a spirit of health, or goblin damn’d, be their intents wicked or charitable.

POLONIUS: If thou doest insist upon thy wretched vision then let me invest your time; be true to thy work and speak to me through the reason of the null and alternate hypotheses. (He turns to Horatio.) Did not Hamlet himself say, “What piece of work is man, how noble in reason, how infinite in faculties?” Then let not this foolishness persist. Go, Horatio, make a survey of three-and-sixty and discover what the true proportion be. For my part, I will never succumb to this fantasy, but deem man to be devoid of all reason should thy proposal of at least five-and-seventy in one hundred hold true.

HORATIO (to Hamlet): What should we do, my Lord?

HAMLET: Go to thy purpose, Horatio.

HORATIO: To what end, my Lord?

HAMLET: That you must teach me. But let me conjure you by the rights of our fellowship, by the consonance of our youth, by the obligation of our ever-preserved love, be even and direct with me, whether I am right or no.

(Horatio exits, followed by Polonius, leaving Hamlet to ponder alone.)

(The next day, Hamlet awaits anxiously the presence of his friend, Horatio. Polonius enters and places some books upon the table just a moment before Horatio enters.)

POLONIUS: So, Horatio, what is it thou didst reveal through thy deliberations?

HORATIO: In a random survey, for which purpose thou thyself sent me forth, I did discover that one-and-forty believe fervently that the spirits of the dead walk with us. Before my God, I might not this believe, without the sensible and true avouch of mine own eyes.

POLONIUS: Give thine own thoughts no tongue, Horatio. (Polonius turns to Hamlet.) But look to’t I charge you, my Lord. Come Horatio, let us go together, for this is not our test. (Horatio and Polonius leave together.)

HAMLET: To reject, or not reject, that is the question: whether ‘tis nobler in the mind to suffer the slings and arrows of outrageous statistics, or to take arms against a sea of data, and, by opposing, end them. (Hamlet resignedly attends to his task.)

(Curtain falls)

"Untitled," by Stephen Chen

I've often wondered how software is released and sold to the public. Ironically, I work for a company that sells products with known problems. Unfortunately, most of the problems are difficult to create, which makes them difficult to fix. I usually use the test program X, which tests the product, to try to create a specific problem. When the test program is run to make an error occur, the likelihood of generating an error is 1%.

So, armed with this knowledge, I wrote a new test program Y that will generate the same error that test program X creates, but more often. To find out if my test program is better than the original, so that I can convince the management that I'm right, I ran my test program to find out how often I can generate the same error. When I ran my test program 50 times, I generated the error twice. While this may not seem much better, I think that I can convince the management to use my test program instead of the original test program. Am I right?

  • \(H_{0}: p = 0.01\)
  • \(H_{a}: p > 0.01\)
  • Let \(P′ =\) the proportion of errors generated
  • Normal for a single proportion
  • Decision: Reject the null hypothesis
  • Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the proportion of errors generated is more than 0.01.

The “plus-4s” confidence interval is \((0.004, 0.144)\).
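
The “plus-4s” interval adds two successes and two failures to the sample before computing the usual interval for a proportion. The sketch below shows that arithmetic for the 2 errors in 50 runs, assuming a 95% confidence level (the exercise does not state one); the variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n = 2, 50                           # errors observed, runs of test program Y
z_star = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence (assumed)

p_tilde = (x + 2) / (n + 4)            # add two successes and two failures
se = math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
lower = p_tilde - z_star * se
upper = p_tilde + z_star * se
print(round(lower, 3), round(upper, 3))
```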

"Japanese Girls’ Names"

by Kumi Furuichi

It used to be very typical for Japanese girls’ names to end with “ko.” (The trend might have started around my grandmothers’ generation and its peak might have been around my mother’s generation.) “Ko” means “child” in Chinese characters. Parents would name their daughters with “ko” attaching to other Chinese characters which have meanings that they want their daughters to become, such as Sachiko—happy child, Yoshiko—a good child, Yasuko—a healthy child, and so on.

However, I noticed recently that only two out of nine of my Japanese girlfriends at this school have names which end with “ko.” More and more, parents seem to have become creative, modernized, and, sometimes, westernized in naming their children.

I have a feeling that, while 70 percent or more of my mother’s generation would have names with “ko” at the end, the proportion has dropped among my peers. I wrote down all my Japanese friends’, ex-classmates’, co-workers’, and acquaintances’ names that I could remember. Following are the names. (Some are repeats.) Test to see if the proportion has dropped for this generation.

Ai, Akemi, Akiko, Ayumi, Chiaki, Chie, Eiko, Eri, Eriko, Fumiko, Harumi, Hitomi, Hiroko, Hiroko, Hidemi, Hisako, Hinako, Izumi, Izumi, Junko, Junko, Kana, Kanako, Kanayo, Kayo, Kayoko, Kazumi, Keiko, Keiko, Kei, Kumi, Kumiko, Kyoko, Kyoko, Madoka, Maho, Mai, Maiko, Maki, Miki, Miki, Mikiko, Mina, Minako, Miyako, Momoko, Nana, Naoko, Naoko, Naoko, Noriko, Rieko, Rika, Rika, Rumiko, Rei, Reiko, Reiko, Sachiko, Sachiko, Sachiyo, Saki, Sayaka, Sayoko, Sayuri, Seiko, Shiho, Shizuka, Sumiko, Takako, Takako, Tomoe, Tomoe, Tomoko, Touko, Yasuko, Yasuko, Yasuyo, Yoko, Yoko, Yoko, Yoshiko, Yoshiko, Yoshiko, Yuka, Yuki, Yuki, Yukiko, Yuko, Yuko.

"Phillip’s Wish," by Suzanne Osorio

My nephew likes to play

Chasing the girls makes his day.

He asked his mother

If it is okay

To get his ear pierced.

She said, “No way!”

To poke a hole through your ear,

Is not what I want for you, dear.

He argued his point quite well,

Says even my macho pal, Mel,

Has gotten this done.

It’s all just for fun.

C’mon please, mom, please, what the hell.

Again Phillip complained to his mother,

Saying half his friends (including their brothers)

Are piercing their ears

And they have no fears

He wants to be like the others.

She said, “I think it’s much less.

We must do a hypothesis test.

And if you are right,

I won’t put up a fight.

But, if not, then my case will rest.”

We proceeded to call fifty guys

To see whose prediction would fly.

Nineteen of the fifty

Said piercing was nifty

And earrings they’d occasionally buy.

Then there’s the other thirty-one,

Who said they’d never have this done.

So now this poem’s finished.

Will his hopes be diminished,

Or will my nephew have his fun?

  • \(H_{0}: p = 0.50\)
  • \(H_{a}: p < 0.50\)
  • Let \(P′ =\) the proportion of friends that has a pierced ear.
  • –1.70
  • \(p\text{-value} = 0.0448\)
  • Reason for decision: The \(p\text{-value}\) is less than 0.05. (However, they are very close.)
  • Conclusion: There is sufficient evidence to support the claim that less than 50% of his friends have pierced ears.
  • Confidence Interval: \((0.245, 0.515)\): The “plus-4s” confidence interval is \((0.259, 0.519)\).
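
The test statistic and p-value above follow from the normal approximation for a single proportion. The following is a minimal sketch, assuming the standard error is computed from the hypothesized proportion; the function name `one_proportion_z_test` and its `tail` argument are illustrative, not part of any library.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(x, n, p0, tail):
    """Return (z, p_value); tail is 'left', 'right', or 'two'."""
    p_hat = x / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error under the null hypothesis
    z = (p_hat - p0) / se
    cdf = NormalDist().cdf(z)
    if tail == "left":
        p_value = cdf                  # area to the left of z
    elif tail == "right":
        p_value = 1 - cdf              # area to the right of z
    else:
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value

# 19 of 50 friends said yes; H0: p = 0.50, Ha: p < 0.50
z, p_value = one_proportion_z_test(19, 50, 0.50, "left")
print(round(z, 2), round(p_value, 4))
```

The `tail` argument mirrors the form of the alternative hypothesis: a "less than" gives a left-tailed test, a "greater than" a right-tailed test, and a "not equal to" a two-tailed test.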

"The Craven," by Mark Salangsang

Once upon a morning dreary

In stats class I was weak and weary.

Pondering over last night’s homework

Whose answers were now on the board

This I did and nothing more.

While I nodded nearly napping

Suddenly, there came a tapping.

As someone gently rapping,

Rapping my head as I snore.

Quoth the teacher, “Sleep no more.”

“In every class you fall asleep,”

The teacher said, his voice was deep.

“So a tally I’ve begun to keep

Of every class you nap and snore.

The percentage being forty-four.”

“My dear teacher I must confess,

While sleeping is what I do best.

The percentage, I think, must be less,

A percentage less than forty-four.”

This I said and nothing more.

“We’ll see,” he said and walked away,

And fifty classes from that day

He counted till the month of May

The classes in which I napped and snored.

The number he found was twenty-four.

At a significance level of 0.05,

Please tell me am I still alive?

Or did my grade just take a dive

Plunging down beneath the floor?

Upon thee I hereby implore.

Toastmasters International cites a report by Gallup Poll that 40% of Americans fear public speaking. A student believes that less than 40% of students at her school fear public speaking. She randomly surveys 361 schoolmates and finds that 135 report they fear public speaking. Conduct a hypothesis test to determine if the percent at her school is less than 40%.

  • \(H_{0}: p = 0.40\)
  • \(H_{a}: p < 0.40\)
  • Let \(P′ =\) the proportion of schoolmates who fear public speaking.
  • –1.01
  • \(p\text{-value} = 0.1563\)
  • Conclusion: There is insufficient evidence to support the claim that less than 40% of students at the school fear public speaking.
  • Confidence Interval: \((0.3241, 0.4240)\): The “plus-4s” confidence interval is \((0.3257, 0.4250)\).

Sixty-eight percent of online courses taught at community colleges nationwide were taught by full-time faculty. To test if 68% also represents California’s percent for full-time faculty teaching the online classes, Long Beach City College (LBCC) in California, was randomly selected for comparison. In the same year, 34 of the 44 online courses LBCC offered were taught by full-time faculty. Conduct a hypothesis test to determine if 68% represents California. NOTE: For more accurate results, use more California community colleges and this past year's data.

According to an article in Bloomberg Businessweek, New York City's most recent adult smoking rate is 14%. Suppose that a survey is conducted to determine this year’s rate. Nine out of 70 randomly chosen N.Y. City residents reply that they smoke. Conduct a hypothesis test to determine if the rate is still 14% or if it has decreased.

  • \(H_{0}: p = 0.14\)
  • \(H_{a}: p < 0.14\)
  • Let \(P′ =\) the proportion of NYC residents that smoke.
  • –0.2756
  • \(p\text{-value} = 0.3914\)
  • At the 5% significance level, there is insufficient evidence to conclude that the proportion of NYC residents who smoke is less than 0.14.
  • Confidence Interval: \((0.0502, 0.2070)\): The “plus-4s” confidence interval (see chapter 8) is \((0.0676, 0.2297)\).

The mean age of De Anza College students in a previous term was 26.6 years old. An instructor thinks the mean age for online students is older than 26.6. She randomly surveys 56 online students and finds that the sample mean is 29.4 with a standard deviation of 2.1. Conduct a hypothesis test.

Registered nurses earned an average annual salary of $69,110. For that same year, a survey was conducted of 41 California registered nurses to determine if the annual salary is higher than $69,110 for California nurses. The sample average was $71,121 with a sample standard deviation of $7,489. Conduct a hypothesis test.

  • \(H_{0}: \mu = 69,110\)
  • \(H_{a}: \mu > 69,110\)
  • Let \(\bar{X} =\) the mean salary in dollars for California registered nurses.
  • \(t = 1.719\)
  • \(p\text{-value}: 0.0466\)
  • Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the mean salary of California registered nurses exceeds $69,110.
  • \(($68,757, $73,485)\)

La Leche League International reports that the mean age of weaning a child from breastfeeding is age four to five worldwide. In America, most nursing mothers wean their children much earlier. Suppose a random survey is conducted of 21 U.S. mothers who recently weaned their children. The mean weaning age was nine months (3/4 year) with a standard deviation of 4 months. Conduct a hypothesis test to determine if the mean weaning age in the U.S. is less than four years old.

Over the past few decades, public health officials have examined the link between weight concerns and teen girls' smoking. Researchers surveyed a group of 273 randomly selected teen girls living in Massachusetts (between 12 and 15 years old). After four years the girls were surveyed again. Sixty-three said they smoked to stay thin. Is there good evidence that more than thirty percent of the teen girls smoke to stay thin?

After conducting the test, your decision and conclusion are

  • Reject \(H_{0}\): There is sufficient evidence to conclude that more than 30% of teen girls smoke to stay thin.
  • Do not reject \(H_{0}\): There is not sufficient evidence to conclude that less than 30% of teen girls smoke to stay thin.
  • Do not reject \(H_{0}\): There is not sufficient evidence to conclude that more than 30% of teen girls smoke to stay thin.
  • Reject \(H_{0}\): There is sufficient evidence to conclude that less than 30% of teen girls smoke to stay thin.

A statistics instructor believes that fewer than 20% of Evergreen Valley College (EVC) students attended the opening night midnight showing of the latest Harry Potter movie. She surveys 84 of her students and finds that 11 of them attended the midnight showing.

At a 1% level of significance, an appropriate conclusion is:

  • There is insufficient evidence to conclude that the percent of EVC students who attended the midnight showing of Harry Potter is less than 20%.
  • There is sufficient evidence to conclude that the percent of EVC students who attended the midnight showing of Harry Potter is more than 20%.
  • There is sufficient evidence to conclude that the percent of EVC students who attended the midnight showing of Harry Potter is less than 20%.
  • There is insufficient evidence to conclude that the percent of EVC students who attended the midnight showing of Harry Potter is at least 20%.

Previously, an organization reported that teenagers spent 4.5 hours per week, on average, on the phone. The organization thinks that, currently, the mean is higher. Fifteen randomly chosen teenagers were asked how many hours per week they spend on the phone. The sample mean was 4.75 hours with a sample standard deviation of 2.0. Conduct a hypothesis test.

At a significance level of \(\alpha = 0.05\), what is the correct conclusion?

  • There is enough evidence to conclude that the mean number of hours is more than 4.75
  • There is enough evidence to conclude that the mean number of hours is more than 4.5
  • There is not enough evidence to conclude that the mean number of hours is more than 4.5
  • There is not enough evidence to conclude that the mean number of hours is more than 4.75

Hypothesis testing: For the following ten exercises, answer each question.

State the null and alternate hypothesis.

State the \(p\text{-value}\).

State \(\alpha\).

What is your decision?

Write a conclusion.

Answer any other questions asked in the problem.

According to the Center for Disease Control website, in 2011 at least 18% of high school students have smoked a cigarette. An Introduction to Statistics class in Davies County, KY conducted a hypothesis test at the local high school (a medium sized–approximately 1,200 students–small city demographic) to determine if the local high school’s percentage was lower. One hundred fifty students were chosen at random and surveyed. Of the 150 students surveyed, 82 have smoked. Use a significance level of 0.05 and using appropriate statistical evidence, conduct a hypothesis test and state the conclusions.

A recent survey in the N.Y. Times Almanac indicated that 48.8% of families own stock. A broker wanted to determine if this survey could be valid. He surveyed a random sample of 250 families and found that 142 owned some type of stock. At the 0.05 significance level, can the survey be considered to be accurate?

  • \(H_{0}: p = 0.488\) \(H_{a}: p \neq 0.488\)
  • \(p\text{-value} = 0.0114\)
  • \(\alpha = 0.05\)
  • Reject the null hypothesis.
  • At the 5% level of significance, there is enough evidence to conclude that the proportion of families that own stocks is not 48.8%.
  • The survey does not appear to be accurate.
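
For a two-tailed test such as this one, the p-value is the combined area in both tails. The sketch below works through the stock-ownership numbers (142 of 250 families, hypothesized proportion 0.488), assuming the normal approximation; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n, p0, alpha = 142, 250, 0.488, 0.05
p_hat = x / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

one_tail = 1 - NormalDist().cdf(abs(z))   # area beyond |z| in a single tail
p_value = 2 * one_tail                    # two-tailed: both tails count
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(round(z, 3), round(p_value, 4), decision)
```

Rejecting the null hypothesis here means the sample is inconsistent with a true proportion of 48.8%, which is why the survey figure does not appear to be accurate.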

Driver error can be listed as the cause of approximately 54% of all fatal auto accidents, according to the American Automobile Association. Thirty randomly selected fatal accidents are examined, and it is determined that 14 were caused by driver error. Using \(\alpha = 0.05\), is the AAA proportion accurate?

The US Department of Energy reported that 51.7% of homes were heated by natural gas. A random sample of 221 homes in Kentucky found that 115 were heated by natural gas. Does the evidence support the claim for Kentucky at the \(\alpha = 0.05\) level? Are the results applicable across the country? Why?

  • \(H_{0}: p = 0.517\) \(H_{a}: p \neq 0.517\)
  • \(p\text{-value} = 0.9203\).
  • \(\alpha = 0.05\).
  • Do not reject the null hypothesis.
  • At the 5% significance level, there is not enough evidence to conclude that the proportion of homes in Kentucky that are heated by natural gas differs from 0.517.
  • However, we cannot generalize this result to the entire nation. First, the sample’s population is only the state of Kentucky. Second, it is reasonable to assume that homes in the extreme north and south will have extreme high usage and low usage, respectively. We would need to expand our sample base to include these possibilities if we wanted to generalize this claim to the entire nation.

For Americans using library services, the American Library Association claims that at most 67% of patrons borrow books. The library director in Owensboro, Kentucky feels this is not true, so she asked a local college statistics class to conduct a survey. The class randomly selected 100 patrons and found that 82 borrowed books. Did the class demonstrate that the percentage was higher in Owensboro, KY? Use \(\alpha = 0.01\) level of significance. What is the possible proportion of patrons that do borrow books from the Owensboro Library?

The Weather Underground reported that the mean amount of summer rainfall for the northeastern US is at least 11.52 inches. Ten cities in the northeast are randomly selected and the mean rainfall amount is calculated to be 7.42 inches with a standard deviation of 1.3 inches. At the \(\alpha = 0.05\) level, can it be concluded that the mean rainfall was below the reported average? What if \(\alpha = 0.01\)? Assume the amount of summer rainfall follows a normal distribution.

  • \(H_{0}: \mu \geq 11.52\) \(H_{a}: \mu < 11.52\)
  • \(p\text{-value} = 0.000002\) which is almost 0.
  • At the 5% significance level, there is enough evidence to conclude that the mean amount of summer rain in the northeastern US is less than 11.52 inches, on average.
  • We would make the same conclusion if alpha was 1% because the \(p\text{-value}\) is almost 0.

A survey in the N.Y. Times Almanac finds the mean commute time (one way) is 25.4 minutes for the 15 largest US cities. The Austin, TX chamber of commerce feels that Austin’s commute time is less and wants to publicize this fact. The mean for 25 randomly selected commuters is 22.1 minutes with a standard deviation of 5.3 minutes. At the \(\alpha = 0.10\) level, is the Austin, TX commute significantly less than the mean commute time for the 15 largest US cities?

A report by the Gallup Poll found that a woman visits her doctor, on average, at most 5.8 times each year. A random sample of 20 women results in these yearly visit totals

3; 2; 1; 3; 7; 2; 9; 4; 6; 6; 8; 0; 5; 6; 4; 2; 1; 3; 4; 1

At the \(\alpha = 0.05\) level, can it be concluded that the mean number of visits is higher than 5.8 per year?

  • \(H_{0}: \mu \leq 5.8\) \(H_{a}: \mu > 5.8\)
  • \(p\text{-value} = 0.9987\)
  • At the 5% level of significance, there is not enough evidence to conclude that a woman visits her doctor, on average, more than 5.8 times a year.
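
The very large p-value above makes sense once the sample statistics are computed from the raw data. The sketch below is a minimal Python check of the sample mean, sample standard deviation, and t statistic for the 20 yearly visit totals, assuming a one-sample t-test against 5.8; the variable names are illustrative.

```python
import math
import statistics

visits = [3, 2, 1, 3, 7, 2, 9, 4, 6, 6, 8, 0, 5, 6, 4, 2, 1, 3, 4, 1]
mu0 = 5.8

n = len(visits)
xbar = statistics.mean(visits)
s = statistics.stdev(visits)           # sample standard deviation (n - 1 divisor)
t_stat = (xbar - mu0) / (s / math.sqrt(n))
# In a right-tailed test, a negative t leaves more than half the area to its right.
print(n, xbar, round(s, 3), round(t_stat, 3))
```

Since the sample mean falls below 5.8, the test statistic is negative, and the right-tail area (the p-value) must exceed 0.5. The sample offers no support for a mean greater than 5.8.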

According to the N.Y. Times Almanac the mean family size in the U.S. is 3.18. A sample of a college math class resulted in the following family sizes:

5; 4; 5; 4; 4; 3; 6; 4; 3; 3; 5; 5; 6; 3; 3; 2; 7; 4; 5; 2; 2; 2; 3; 2

At the \(\alpha = 0.05\) level, is the class’s mean family size greater than the national average? Does the Almanac result remain valid? Why?

The student academic group on a college campus claims that freshman students study at least 2.5 hours per day, on average. One Introduction to Statistics class was skeptical. The class took a random sample of 30 freshman students and found a mean study time of 137 minutes with a standard deviation of 45 minutes. At the \(\alpha = 0.01\) level, is the student academic group’s claim correct?

  • \(H_{0}: \mu \geq 150\) \(H_{a}: \mu < 150\)
  • \(p\text{-value} = 0.0622\)
  • \(\alpha = 0.01\)
  • At the 1% significance level, there is not enough evidence to conclude that freshmen students study less than 2.5 hours per day, on average.
  • The student academic group’s claim appears to be correct.

