Hypothesis test

A significance test, also referred to as a statistical hypothesis test, is a method of statistical inference in which observed data is compared to a claim (referred to as a hypothesis) in order to assess the truth of the claim. For example, one might wonder whether age affects the number of apples a person can eat, and may use a significance test to determine whether there is any evidence to suggest that it does.

Generally, the process of statistical hypothesis testing involves the following steps:

  • State the null hypothesis.
  • State the alternative hypothesis.
  • Select the appropriate test statistic and select a significance level.
  • Compute the observed value of the test statistic and its corresponding p-value.
  • Reject the null hypothesis in favor of the alternative hypothesis, or do not reject the null hypothesis.
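The steps above can be sketched in a few lines of code. The following is a minimal illustration using Python's standard library; the data, the hypothesized mean, and the known standard deviation are all hypothetical, and a one-sample z-test is used purely as a concrete example.

```python
from statistics import NormalDist

# Hypothetical example: test whether a sample's mean differs from an
# assumed population mean of 3.0, with known population SD 1.2.
mu_0, sigma = 3.0, 1.2                    # null-hypothesis mean, known SD
sample = [2.1, 2.8, 3.4, 1.9, 2.6, 3.0, 2.2, 2.7, 2.5, 3.1]
alpha = 0.05                              # chosen significance level

# Compute the test statistic (here, a z-score) and its two-sided p-value.
n = len(sample)
x_bar = sum(sample) / n
z = (x_bar - mu_0) / (sigma / n ** 0.5)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Decide: reject H0 only if the p-value is at or below alpha.
if p_value <= alpha:
    decision = "reject H0 in favor of H1"
else:
    decision = "do not reject H0"
```

With these particular made-up numbers the p-value comes out well above 0.05, so the null hypothesis is not rejected.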

The null hypothesis

The null hypothesis, H₀, is the claim that is being tested in a statistical hypothesis test. It is typically a statement that there is no difference between the populations being studied, or that there is no evidence to support a claim being made. For example, "age has no effect on the number of apples a person can eat."

A significance test is designed to test the evidence against the null hypothesis. This is because it is easier to prove that a claim is false than to prove that it is true; demonstrating that the claim is false in one case is sufficient, while proving that it is true requires that the claim be true in all cases.

The alternative hypothesis

The alternative hypothesis is the opposite of the null hypothesis in that it is a statement that there is some difference between the populations being studied. For example, "younger people can eat more apples than older people."

The alternative hypothesis is typically the hypothesis that researchers are trying to prove. A significance test is meant to determine whether there is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. Note that the results of a significance test should either be to reject the null hypothesis in favor of the alternative hypothesis, or to not reject the null hypothesis. The result should not be to reject the alternative hypothesis or to accept the alternative hypothesis.

Test statistics and significance level

A test statistic is a statistic that is calculated as part of hypothesis testing that compares the distribution of observed data to the expected distribution, based on the null hypothesis. Examples of test statistics include the Z-score, T-statistic, F-statistic, and the Chi-square statistic. The test statistic used is dependent on the significance test used, which is dependent on the type of data collected and the type of relationship to be tested.

In many cases, the chosen significance level is 0.05, though 0.01 is also used. A significance level of 0.05 indicates that there is a 5% chance of rejecting the null hypothesis when the null hypothesis is actually true. Thus, a smaller selected significance level will require more evidence if the null hypothesis is to be rejected in favor of the alternative hypothesis.

After the test statistic is computed, the p-value can be determined from it. The p-value is the probability of obtaining test results at least as extreme as the observed results, under the assumption that the null hypothesis is correct. It tells us how likely the observed result is to arise from chance alone: the smaller the p-value, the less likely the result is to occur purely by chance. For example, a p-value of 0.01 means that, if the null hypothesis is true, there is a 1% probability of obtaining a result at least as extreme as the one observed; a p-value of 0.90 means that probability is 90%.

A p-value is strongly affected by sample size. For a fixed difference between populations, a larger sample size tends to produce a smaller p-value, even when the difference is too small to be meaningful. On the other hand, if the sample size is too small, a meaningful difference may go undetected.
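The effect of sample size on the p-value can be seen directly. In this sketch (hypothetical numbers, one-sample z-test, Python standard library), the observed difference and the standard deviation are identical in both cases; only the sample size changes.

```python
from statistics import NormalDist

def z_test_p(x_bar, mu_0, sigma, n):
    """Two-sided p-value for a one-sample z-test with known sigma."""
    z = (x_bar - mu_0) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same observed difference (0.2) and spread (sigma = 1.0), different n:
p_small = z_test_p(x_bar=5.2, mu_0=5.0, sigma=1.0, n=25)
p_large = z_test_p(x_bar=5.2, mu_0=5.0, sigma=1.0, n=2500)
```

The small sample gives a p-value around 0.32 (not significant at 0.05), while the large sample drives the p-value essentially to zero for exactly the same observed difference.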

The last step in a significance test is to determine whether the p-value provides evidence that the null hypothesis should be rejected in favor of the alternative hypothesis. This is based on the selected significance level. If the p-value is less than or equal to the selected significance level, the null hypothesis is rejected in favor of the alternative hypothesis, and the result is deemed statistically significant. If the p-value is greater than the selected significance level, the null hypothesis is not rejected, and the result is deemed not statistically significant.

Hypothesis (mathematical usage)

A hypothesis is a proposition that is consistent with known data, but has been neither verified nor shown to be false.

In statistics, a hypothesis (sometimes called a statistical hypothesis) refers to a statement on which hypothesis testing will be based. Particularly important statistical hypotheses include the null hypothesis and alternative hypothesis.

In symbolic logic, a hypothesis is the first part of an implication (with the second part being known as the conclusion, or consequent).

In general mathematical usage, "hypothesis" is roughly synonymous with "conjecture."

Cite this as:

Weisstein, Eric W. "Hypothesis." From MathWorld --A Wolfram Web Resource. https://mathworld.wolfram.com/Hypothesis.html



Statistics LibreTexts

9.1: Introduction to Hypothesis Testing


  • Kyle Siegrist
  • University of Alabama in Huntsville via Random Services

Basic Theory

Preliminaries.

As usual, our starting point is a random experiment with an underlying sample space and a probability measure \(\P\). In the basic statistical model, we have an observable random variable \(\bs{X}\) taking values in a set \(S\). In general, \(\bs{X}\) can have quite a complicated structure. For example, if the experiment is to sample \(n\) objects from a population and record various measurements of interest, then \[ \bs{X} = (X_1, X_2, \ldots, X_n) \] where \(X_i\) is the vector of measurements for the \(i\)th object. The most important special case occurs when \((X_1, X_2, \ldots, X_n)\) are independent and identically distributed. In this case, we have a random sample of size \(n\) from the common distribution.

The purpose of this section is to define and discuss the basic concepts of statistical hypothesis testing. Collectively, these concepts are sometimes referred to as the Neyman-Pearson framework, in honor of Jerzy Neyman and Egon Pearson, who first formalized them.

A statistical hypothesis is a statement about the distribution of \(\bs{X}\). Equivalently, a statistical hypothesis specifies a set of possible distributions of \(\bs{X}\): the set of distributions for which the statement is true. A hypothesis that specifies a single distribution for \(\bs{X}\) is called simple; a hypothesis that specifies more than one distribution for \(\bs{X}\) is called composite.

In hypothesis testing, the goal is to see if there is sufficient statistical evidence to reject a presumed null hypothesis in favor of a conjectured alternative hypothesis. The null hypothesis is usually denoted \(H_0\) while the alternative hypothesis is usually denoted \(H_1\).

An hypothesis test is a statistical decision; the conclusion will either be to reject the null hypothesis in favor of the alternative, or to fail to reject the null hypothesis. The decision that we make must, of course, be based on the observed value \(\bs{x}\) of the data vector \(\bs{X}\). Thus, we will find an appropriate subset \(R\) of the sample space \(S\) and reject \(H_0\) if and only if \(\bs{x} \in R\). The set \(R\) is known as the rejection region or the critical region. Note the asymmetry between the null and alternative hypotheses. This asymmetry is due to the fact that we assume the null hypothesis, in a sense, and then see if there is sufficient evidence in \(\bs{x}\) to overturn this assumption in favor of the alternative.

An hypothesis test is a statistical analogy to proof by contradiction, in a sense. Suppose for a moment that \(H_1\) is a statement in a mathematical theory and that \(H_0\) is its negation. One way that we can prove \(H_1\) is to assume \(H_0\) and work our way logically to a contradiction. In an hypothesis test, we don't prove anything of course, but there are similarities. We assume \(H_0\) and then see if the data \(\bs{x}\) are sufficiently at odds with that assumption that we feel justified in rejecting \(H_0\) in favor of \(H_1\).

Often, the critical region is defined in terms of a statistic \(w(\bs{X})\), known as a test statistic, where \(w\) is a function from \(S\) into another set \(T\). We find an appropriate rejection region \(R_T \subseteq T\) and reject \(H_0\) when the observed value \(w(\bs{x}) \in R_T\). Thus, the rejection region in \(S\) is then \(R = w^{-1}(R_T) = \left\{\bs{x} \in S: w(\bs{x}) \in R_T\right\}\). As usual, the use of a statistic often allows significant data reduction when the dimension of the test statistic is much smaller than the dimension of the data vector.
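The preimage construction can be made concrete with a toy sketch; the statistic and rejection region below are hypothetical, chosen only to illustrate the composition.

```python
# Toy illustration of R = w^{-1}(R_T): the rejection region in the sample
# space S is the set of data vectors whose statistic lands in R_T.
def w(x):
    """Test statistic: the sample mean (maps S into T = the reals)."""
    return sum(x) / len(x)

def in_R_T(t):
    """Hypothetical rejection region in T: |t| > 1.0."""
    return abs(t) > 1.0

def in_R(x):
    """Rejection region in S, defined as the preimage of R_T under w."""
    return in_R_T(w(x))
```

Note the data reduction: the decision depends on the data only through the single number \(w(\bs{x})\).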

The ultimate decision may be correct or may be in error. There are two types of errors, depending on which of the hypotheses is actually true.

Types of errors:

  • A type 1 error is rejecting the null hypothesis \(H_0\) when \(H_0\) is true.
  • A type 2 error is failing to reject the null hypothesis \(H_0\) when the alternative hypothesis \(H_1\) is true.

Similarly, there are two ways to make a correct decision: we could reject \(H_0\) when \(H_1\) is true or we could fail to reject \(H_0\) when \(H_0\) is true. The possibilities are summarized in the following table:

  • Fail to reject \(H_0\): correct decision if \(H_0\) is true; type 2 error if \(H_1\) is true.
  • Reject \(H_0\): type 1 error if \(H_0\) is true; correct decision if \(H_1\) is true.

Of course, when we observe \(\bs{X} = \bs{x}\) and make our decision, either we will have made the correct decision or we will have committed an error, and usually we will never know which of these events has occurred. Prior to gathering the data, however, we can consider the probabilities of the various errors.

If \(H_0\) is true (that is, the distribution of \(\bs{X}\) is specified by \(H_0\)), then \(\P(\bs{X} \in R)\) is the probability of a type 1 error for this distribution. If \(H_0\) is composite, then \(H_0\) specifies a variety of different distributions for \(\bs{X}\) and thus there is a set of type 1 error probabilities.

The maximum probability of a type 1 error, over the set of distributions specified by \( H_0 \), is the significance level of the test or the size of the critical region.

The significance level is often denoted by \(\alpha\). Usually, the rejection region is constructed so that the significance level is a prescribed, small value (typically 0.1, 0.05, 0.01).

If \(H_1\) is true (that is, the distribution of \(\bs{X}\) is specified by \(H_1\)), then \(\P(\bs{X} \notin R)\) is the probability of a type 2 error for this distribution. Again, if \(H_1\) is composite then \(H_1\) specifies a variety of different distributions for \(\bs{X}\), and thus there will be a set of type 2 error probabilities. Generally, there is a tradeoff between the type 1 and type 2 error probabilities. If we reduce the probability of a type 1 error, by making the rejection region \(R\) smaller, we necessarily increase the probability of a type 2 error because the complementary region \(S \setminus R\) is larger.

The extreme cases can give us some insight. First consider the decision rule in which we never reject \(H_0\), regardless of the evidence \(\bs{x}\). This corresponds to the rejection region \(R = \emptyset\). A type 1 error is impossible, so the significance level is 0. On the other hand, the probability of a type 2 error is 1 for any distribution defined by \(H_1\). At the other extreme, consider the decision rule in which we always reject \(H_0\), regardless of the evidence \(\bs{x}\). This corresponds to the rejection region \(R = S\). A type 2 error is impossible, but now the probability of a type 1 error is 1 for any distribution defined by \(H_0\). In between these two worthless tests are meaningful tests that take the evidence \(\bs{x}\) into account.
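The tradeoff between the two error probabilities can be checked by simulation. The sketch below uses a hypothetical setup: a right-tailed z-test of \(H_0: \mu = 0\) against the particular alternative \(\mu = 1\), with known \(\sigma = 1\) and \(n = 10\), estimating the type 1 error rate and the power by Monte Carlo.

```python
import random
from statistics import NormalDist

random.seed(0)                                     # reproducible simulation
n, sigma, alpha = 10, 1.0, 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)           # right-tailed critical value
cutoff = z_crit * sigma / n ** 0.5                 # reject H0 when x_bar > cutoff

def rejection_rate(true_mu, trials=5000):
    """Fraction of simulated samples whose mean lands in the rejection region."""
    hits = 0
    for _ in range(trials):
        x_bar = sum(random.gauss(true_mu, sigma) for _ in range(n)) / n
        hits += x_bar > cutoff
    return hits / trials

type1_rate = rejection_rate(true_mu=0.0)   # should be close to alpha = 0.05
power = rejection_rate(true_mu=1.0)        # 1 minus the type 2 error probability
```

Shrinking the rejection region (a larger cutoff) would lower the type 1 error rate but also lower the power, i.e. raise the type 2 error probability.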

If \(H_1\) is true (that is, the distribution of \(\bs{X}\) is specified by \(H_1\)), then \(\P(\bs{X} \in R)\), the probability of rejecting \(H_0\), is the power of the test for that distribution.

Thus the power of the test for a distribution specified by \( H_1 \) is the probability of making the correct decision.

Suppose that we have two tests, corresponding to rejection regions \(R_1\) and \(R_2\), respectively, each having significance level \(\alpha\). The test with region \(R_1\) is uniformly more powerful than the test with region \(R_2\) if \[ \P(\bs{X} \in R_1) \ge \P(\bs{X} \in R_2) \text{ for every distribution of } \bs{X} \text{ specified by } H_1 \]

Naturally, in this case, we would prefer the first test. Often, however, two tests will not be uniformly ordered; one test will be more powerful for some distributions specified by \(H_1\) while the other test will be more powerful for other distributions specified by \(H_1\).

If a test has significance level \(\alpha\) and is uniformly more powerful than any other test with significance level \(\alpha\), then the test is said to be a uniformly most powerful test at level \(\alpha\).

Clearly a uniformly most powerful test is the best we can do.

\(P\)-value

In most cases, we have a general procedure that allows us to construct a test (that is, a rejection region \(R_\alpha\)) for any given significance level \(\alpha \in (0, 1)\). Typically, \(R_\alpha\) decreases (in the subset sense) as \(\alpha\) decreases.

The \(P\)-value of the observed value \(\bs{x}\) of \(\bs{X}\), denoted \(P(\bs{x})\), is defined to be the smallest \(\alpha\) for which \(\bs{x} \in R_\alpha\); that is, the smallest significance level for which \(H_0\) is rejected, given \(\bs{X} = \bs{x}\).

Knowing \(P(\bs{x})\) allows us to test \(H_0\) at any significance level for the given data \(\bs{x}\): If \(P(\bs{x}) \le \alpha\) then we would reject \(H_0\) at significance level \(\alpha\); if \(P(\bs{x}) \gt \alpha\) then we fail to reject \(H_0\) at significance level \(\alpha\). Note that \(P(\bs{X})\) is a statistic. Informally, \(P(\bs{x})\) can often be thought of as the probability of an outcome as or more extreme than the observed value \(\bs{x}\), where extreme is interpreted relative to the null hypothesis \(H_0\).
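For a concrete (hypothetical) illustration, take a right-tailed z-test, where \(R_\alpha = \{z : z > \Phi^{-1}(1-\alpha)\}\). These regions shrink as \(\alpha\) decreases, and the smallest \(\alpha\) whose region still contains the observed value is exactly \(1 - \Phi(z_{\text{obs}})\):

```python
from statistics import NormalDist

z_obs = 2.1                              # hypothetical observed test statistic
p = 1 - NormalDist().cdf(z_obs)          # P-value of the right-tailed test

def in_R(alpha):
    """Is z_obs inside the rejection region at significance level alpha?"""
    return z_obs > NormalDist().inv_cdf(1 - alpha)
```

Here z_obs lies in \(R_\alpha\) precisely when \(\alpha \ge p\): with these numbers, \(H_0\) is rejected at the 0.05 level but not at the 0.01 level.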

Analogy with Justice Systems

There is a helpful analogy between statistical hypothesis testing and the criminal justice system in the US and various other countries. Consider a person charged with a crime. The presumed null hypothesis is that the person is innocent of the crime; the conjectured alternative hypothesis is that the person is guilty of the crime. The test of the hypotheses is a trial, with the evidence presented by both sides playing the role of the data. After considering the evidence, the jury delivers the decision as either not guilty or guilty. Note that innocent is not a possible verdict of the jury, because it is not the point of the trial to prove the person innocent. Rather, the point of the trial is to see whether there is sufficient evidence to overturn the null hypothesis that the person is innocent in favor of the alternative hypothesis that the person is guilty. A type 1 error is convicting a person who is innocent; a type 2 error is acquitting a person who is guilty. Generally, a type 1 error is considered the more serious of the two possible errors, so in an attempt to hold the chance of a type 1 error to a very low level, the standard for conviction in serious criminal cases is beyond a reasonable doubt.

Tests of an Unknown Parameter

Hypothesis testing is a very general concept, but an important special class occurs when the distribution of the data variable \(\bs{X}\) depends on a parameter \(\theta\) taking values in a parameter space \(\Theta\). The parameter may be vector-valued, so that \(\bs{\theta} = (\theta_1, \theta_2, \ldots, \theta_k)\) and \(\Theta \subseteq \R^k\) for some \(k \in \N_+\). The hypotheses generally take the form \[ H_0: \theta \in \Theta_0 \text{ versus } H_1: \theta \notin \Theta_0 \] where \(\Theta_0\) is a prescribed subset of the parameter space \(\Theta\). In this setting, the probabilities of making an error or a correct decision depend on the true value of \(\theta\). If \(R\) is the rejection region, then the power function \( Q \) is given by \[ Q(\theta) = \P_\theta(\bs{X} \in R), \quad \theta \in \Theta \] The power function gives a lot of information about the test.

The power function satisfies the following properties:

  • \(Q(\theta)\) is the probability of a type 1 error when \(\theta \in \Theta_0\).
  • \(\max\left\{Q(\theta): \theta \in \Theta_0\right\}\) is the significance level of the test.
  • \(1 - Q(\theta)\) is the probability of a type 2 error when \(\theta \notin \Theta_0\).
  • \(Q(\theta)\) is the power of the test when \(\theta \notin \Theta_0\).
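As a concrete sketch of these properties, consider a right-tailed z-test of \(H_0: \theta \le 0\) with known \(\sigma = 1\), \(n = 25\), and \(\alpha = 0.05\) (all hypothetical choices). The power function then has a closed form:

```python
from statistics import NormalDist

n, sigma, alpha, theta_0 = 25, 1.0, 0.05, 0.0
z_crit = NormalDist().inv_cdf(1 - alpha)   # reject when the z-score > z_crit

def Q(theta):
    """Power function: probability of rejecting H0 when the true value is theta."""
    # Under theta, the standardized sample mean is shifted by
    # (theta - theta_0) * sqrt(n) / sigma relative to the null case.
    shift = (theta - theta_0) * n ** 0.5 / sigma
    return 1 - NormalDist().cdf(z_crit - shift)
```

\(Q\) is increasing in \(\theta\): it equals the significance level at the boundary \(\theta = \theta_0\), stays below it on \(\Theta_0\) (where it gives the type 1 error probability), and gives the power beyond it.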

If we have two tests, we can compare them by means of their power functions.

Suppose that we have two tests, corresponding to rejection regions \(R_1\) and \(R_2\), respectively, each having significance level \(\alpha\). The test with rejection region \(R_1\) is uniformly more powerful than the test with rejection region \(R_2\) if \( Q_1(\theta) \ge Q_2(\theta)\) for all \( \theta \notin \Theta_0 \).

Most hypothesis tests of an unknown real parameter \(\theta\) fall into three special cases:

Suppose that \( \theta \) is a real parameter and \( \theta_0 \in \Theta \) a specified value. The tests below are respectively the two-sided test, the left-tailed test, and the right-tailed test.

  • \(H_0: \theta = \theta_0\) versus \(H_1: \theta \ne \theta_0\)
  • \(H_0: \theta \ge \theta_0\) versus \(H_1: \theta \lt \theta_0\)
  • \(H_0: \theta \le \theta_0\) versus \(H_1: \theta \gt \theta_0\)

Thus the tests are named after the conjectured alternative. Of course, there may be other unknown parameters besides \(\theta\) (known as nuisance parameters).

Equivalence Between Hypothesis Test and Confidence Sets

There is an equivalence between hypothesis tests and confidence sets for a parameter \(\theta\).

Suppose that \(C(\bs{x})\) is a \(1 - \alpha\) level confidence set for \(\theta\). The following test has significance level \(\alpha\) for the hypothesis \( H_0: \theta = \theta_0 \) versus \( H_1: \theta \ne \theta_0 \): Reject \(H_0\) if and only if \(\theta_0 \notin C(\bs{x})\)

By definition, \(\P[\theta \in C(\bs{X})] = 1 - \alpha\). Hence if \(H_0\) is true so that \(\theta = \theta_0\), then the probability of a type 1 error is \(\P[\theta \notin C(\bs{X})] = \alpha\).

Equivalently, we fail to reject \(H_0\) at significance level \(\alpha\) if and only if \(\theta_0\) is in the corresponding \(1 - \alpha\) level confidence set. In particular, this equivalence applies to interval estimates of a real parameter \(\theta\) and the common tests for \(\theta\) given above.

In each case below, the confidence interval has confidence level \(1 - \alpha\) and the test has significance level \(\alpha\).

  • Suppose that \(\left[L(\bs{X}), U(\bs{X})\right]\) is a two-sided confidence interval for \(\theta\). Reject \(H_0: \theta = \theta_0\) versus \(H_1: \theta \ne \theta_0\) if and only if \(\theta_0 \lt L(\bs{X})\) or \(\theta_0 \gt U(\bs{X})\).
  • Suppose that \(L(\bs{X})\) is a confidence lower bound for \(\theta\). Reject \(H_0: \theta \le \theta_0\) versus \(H_1: \theta \gt \theta_0\) if and only if \(\theta_0 \lt L(\bs{X})\).
  • Suppose that \(U(\bs{X})\) is a confidence upper bound for \(\theta\). Reject \(H_0: \theta \ge \theta_0\) versus \(H_1: \theta \lt \theta_0\) if and only if \(\theta_0 \gt U(\bs{X})\).
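The equivalence in the two-sided case can be sketched directly; the numbers below are hypothetical, for a normal population with known \(\sigma\).

```python
from statistics import NormalDist

# A 1 - alpha two-sided z-interval for the mean, from hypothetical data.
alpha, sigma, n, x_bar = 0.05, 2.0, 16, 10.8
z = NormalDist().inv_cdf(1 - alpha / 2)            # about 1.96 for alpha = 0.05
half_width = z * sigma / n ** 0.5
L, U = x_bar - half_width, x_bar + half_width      # confidence interval [L, U]

def reject(theta_0):
    """Two-sided test of H0: theta = theta_0, via the confidence interval."""
    return theta_0 < L or theta_0 > U
```

\(H_0: \theta = \theta_0\) is rejected at level \(\alpha\) exactly when \(\theta_0\) falls outside the \(1-\alpha\) confidence interval.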

Pivot Variables and Test Statistics

Recall that confidence sets of an unknown parameter \(\theta\) are often constructed through a pivot variable, that is, a random variable \(W(\bs{X}, \theta)\) that depends on the data vector \(\bs{X}\) and the parameter \(\theta\), but whose distribution does not depend on \(\theta\) and is known. In this case, a natural test statistic for the basic tests given above is \(W(\bs{X}, \theta_0)\).


17 Introduction to Hypothesis Testing

Jenna Lehmann

What is Hypothesis Testing?

Hypothesis testing is a big part of what we would actually consider testing for inferential statistics. It's a procedure and set of rules that allow us to move beyond descriptive statistics and make inferences about a population based on sample data. It is a statistical method that uses sample data to evaluate a hypothesis about a population.

This type of test is usually used in a research context. If we expect a difference between a treated and an untreated group (in some cases the untreated group is described by known population parameters), we expect the means of the two groups to differ while the standard deviation stays the same, as if a constant value had been added to or subtracted from each individual score.

Steps of Hypothesis Testing

The following steps are tailored to the first kind of hypothesis test we will learn: the single-sample z-test. There are many other kinds of tests, so keep this in mind.

  • Null Hypothesis (H0): states that in the general population there is no change, no difference, or no relationship, or in the context of an experiment, it predicts that the independent variable has no effect on the dependent variable.
  • Alternative Hypothesis (H1): states that there is a change, a difference, or a relationship for the general population, or in the context of an experiment, it predicts that the independent variable has an effect on the dependent variable.

  • Alpha Level: the significance level chosen before the test, commonly \(\alpha = 0.05\); it sets how unlikely a sample result must be, under the null hypothesis, before we reject it.

  • Critical Region: Composed of the extreme sample values that are very unlikely to be obtained if the null hypothesis is true. Determined by alpha level. If sample data fall in the critical region, the null hypothesis is rejected, because it’s very unlikely they’ve fallen there by chance.
  • After collecting the data, we find the sample mean. Now we can compare the sample mean with the null hypothesis by computing a z-score that describes where the sample mean is located relative to the hypothesized population mean. We use the z-score formula.
  • We decided previously what the two z-score boundaries are for a critical score. If the z-score we get after plugging the numbers in the aforementioned equation is outside of that critical region, we reject the null hypothesis. Otherwise, we would say that we failed to reject the null hypothesis.

Regions of the Distribution

Because we’re making judgments based on probability and proportion, our normal distributions and certain regions within them come into play.

The Critical Region is composed of the extreme sample values that are very unlikely to be obtained if the null hypothesis is true. Determined by alpha level. If sample data fall in the critical region, the null hypothesis is rejected, because it’s very unlikely they’ve fallen there by chance.

These regions come into play when talking about different errors.

A Type I Error occurs when a researcher rejects a null hypothesis that is actually true; the researcher concludes that a treatment has an effect when it actually doesn’t. This happens when a researcher unknowingly obtains an extreme, non-representative sample. This goes back to alpha level: it’s the probability that the test will lead to a Type I error if the null hypothesis is true.

A Type II Error occurs when a researcher fails to reject a null hypothesis that is actually false; the researcher concludes that a treatment has no effect when it actually does. Its probability is denoted \(\beta\).

A result is said to be significant or statistically significant if it is very unlikely to occur when the null hypothesis is true. That is, the result is sufficient to reject the null hypothesis. For instance, two means can be significantly different from one another.

Factors that Influence and Assumptions of Hypothesis Testing

Assumptions of Hypothesis Testing:

  • Random sampling: it is assumed that the participants used in the study were selected randomly so that we can confidently generalize our findings from the sample to the population.
  • Independent observations: two observations are independent if there is no consistent, predictable relationship between the first observation and the second.
  • The value of \(\sigma\) is unchanged by the treatment: if the population standard deviation is unknown, we assume that the standard deviation for the unknown population (after treatment) is the same as it was for the population before treatment. There are ways of checking whether this is true in SPSS or Excel.
  • Normal sampling distribution: in order to use the unit normal table to identify the critical region, we need the distribution of sample means to be normal (which means we need the population to be distributed normally and/or each sample size needs to be 30 or greater based on what we know about the central limit theorem).

Factors that influence hypothesis testing:

  • The variability of the scores, which is measured by either the standard deviation or the variance. The variability influences the size of the standard error in the denominator of the z-score.
  • The number of scores in the sample. This value also influences the size of the standard error in the denominator.

Test statistic: indicates that the sample data are converted into a single, specific statistic that is used to test the hypothesis (in this case, the z-score statistic).

Directional Hypotheses and Tailed Tests

In a directional hypothesis test, also known as a one-tailed test, the statistical hypotheses specify either an increase or a decrease in the population mean. That is, they make a statement about the direction of the effect.

The Hypotheses for a Directional Test:

  • H0: The test scores are not increased/decreased (the treatment doesn’t work)
  • H1: The test scores are increased/decreased (the treatment works as predicted)

Because we’re only worried about scores that are either greater or less than the scores predicted by the null hypothesis, we only worry about what’s going on in one tail meaning that the critical region only exists within one tail. This means that all of the alpha is contained in one tail rather than split up into both (so the whole 5% is located in the tail we care about, rather than 2.5% in each tail). So before, we cared about what’s going on at the 0.025 mark of the unit normal table to look at both tails, but now we care about 0.05 because we’re only looking at one tail.

A one-tailed test allows you to reject the null hypothesis when the difference between the sample and the population is relatively small, as long as that difference is in the direction you predicted. A two-tailed test, on the other hand, requires a relatively large difference, independent of direction. In practice, researchers often hypothesize using a one-tailed method but base their findings on whether the results fall into the critical region of a two-tailed method. For the purposes of this class, make sure to calculate your results using the test that is specified in the problem.
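The difference between the two critical regions is easy to see numerically. A minimal sketch with the standard normal distribution (the observed z-score of 1.8 is hypothetical):

```python
from statistics import NormalDist

alpha = 0.05
z_one_tail = NormalDist().inv_cdf(1 - alpha)       # all 5% in one tail: about 1.645
z_two_tail = NormalDist().inv_cdf(1 - alpha / 2)   # 2.5% in each tail: about 1.960

# An observed z of 1.8 (in the predicted direction) rejects under the
# one-tailed rule but not under the two-tailed rule.
z_observed = 1.8
one_tailed_rejects = z_observed > z_one_tail
two_tailed_rejects = abs(z_observed) > z_two_tail
```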

Effect Size

A measure of effect size is intended to provide a measurement of the absolute magnitude of a treatment effect, independent of the size of the sample(s) being used. Usually done with Cohen’s d. If you imagine the two distributions, they’re layered over one another. The more they overlap, the smaller the effect size (the means of the two distributions are close). The more they are spread apart, the greater the effect size (the means of the two distributions are farther apart).
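For the single-sample case, Cohen's d is simply the mean difference in standard-deviation units. The numbers below are hypothetical, chosen to match the conventional small/medium/large benchmarks of 0.2, 0.5, and 0.8:

```python
def cohens_d(x_bar, mu_0, sigma):
    """Cohen's d for a single-sample comparison: mean difference in SD units."""
    return (x_bar - mu_0) / sigma

small = cohens_d(52.0, 50.0, 10.0)    # d = 0.2: distributions overlap heavily
medium = cohens_d(55.0, 50.0, 10.0)   # d = 0.5
large = cohens_d(58.0, 50.0, 10.0)    # d = 0.8: distributions well separated
```

Unlike a p-value, d does not change with the sample size: doubling n leaves these numbers untouched.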

Statistical Power

The power of a statistical test is the probability that the test will correctly reject a false null hypothesis. It's usually what we're hoping for when we run an experiment. Power and effect size are connected: the greater the distance between the means, the greater the effect size, and if the two distributions overlap very little, there is a greater chance of selecting a sample that leads to rejecting the null hypothesis.

This chapter was originally posted to the Math Support Center blog at the University of Baltimore on June 11, 2019.

Math and Statistics Guides from UB's Math & Statistics Center Copyright © by Jenna Lehmann is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Share This Book


Hypothesis Testing


A hypothesis test is a statistical inference method used to test the significance of a proposed (hypothesized) relation between population statistics (parameters) and their corresponding sample estimators. In other words, hypothesis tests are used to determine whether a sample provides enough evidence to support a claim about the entire population.

The test considers two hypotheses: the null hypothesis, which is a statement meant to be tested, usually something like "there is no effect," with the intention of proving this false, and the alternate hypothesis, which is the statement meant to stand after the test is performed. The two hypotheses must be mutually exclusive; moreover, in most applications, the two are complementary (one being the negation of the other). The test works by comparing the \(p\)-value to the level of significance (a chosen target). If the \(p\)-value is less than or equal to the level of significance, then the null hypothesis is rejected.

When analyzing data, only samples of a certain size may be manageable for efficient computation. In many situations the quantities of interest follow a continuous or infinite distribution, hence the use of samples to assess the accuracy of the chosen test statistic. The method of hypothesis testing gives an advantage over guessing which distribution, or which parameters, the data follow.

Definitions and Methodology

Hypothesis test and confidence intervals.

In statistical inference, properties (parameters) of a population are analyzed by sampling data sets. Given assumptions on the distribution, i.e. a statistical model of the data, certain hypotheses can be deduced from the known behavior of the model. These hypotheses must be tested against sampled data from the population.

The null hypothesis \((\)denoted \(H_0)\) is a statement that is assumed to be true. If the null hypothesis is rejected, then there is enough evidence (statistical significance) to accept the alternate hypothesis \((\)denoted \(H_1).\) Before doing any test for significance, both hypotheses must be clearly stated and non-conflictive, i.e. mutually exclusive, statements.

Rejecting the null hypothesis, given that it is true, is called a type I error; its probability of occurrence is denoted \(\alpha\). Failing to reject the null hypothesis, given that it is false, is called a type II error; its probability of occurrence is denoted \(\beta\). Also, \(\alpha\) is known as the significance level, and \(1-\beta\) is known as the power of the test. The possible outcomes:

  • Reject \(H_0\): type I error if \(H_0\) is true; correct decision if \(H_0\) is false.
  • Fail to reject \(H_0\): correct decision if \(H_0\) is true; type II error if \(H_0\) is false.

The test statistic is the standardized value computed from the sampled data under the assumption that the null hypothesis is true, for a chosen particular test. These tests depend on the statistic to be studied and the distribution it is assumed to follow, e.g. the population mean following a normal distribution. The \(p\)-value is the probability of observing a test statistic at least as extreme, in the direction of the alternate hypothesis, given that the null hypothesis is true. The critical value is the value of the assumed distribution of the test statistic such that the probability of making a type I error is small.
Methodology: Given an estimator \(\hat \theta\) of a population parameter \(\theta\), computed from a sample \(\mathcal{S}\) and following a known distribution \(P(T)\) under the null hypothesis, and given a significance level \(\alpha\):

  • Define \(H_0\) and \(H_1\), and compute the test statistic \(t^*\).
  • \(p\)-value approach (most prevalent): Find the \(p\)-value using \(t^*\) (right-tailed). If the \(p\)-value is at most \(\alpha\), reject \(H_0\); otherwise, fail to reject \(H_0\).
  • Critical value approach: Find the critical value \(t_\alpha\) by solving \(P(T\geq t_\alpha)=\alpha\) (right-tailed). If \(t^*>t_\alpha\), reject \(H_0\); otherwise, fail to reject \(H_0\).

Note: Failing to reject \(H_0\) only means there is insufficient evidence to accept \(H_1\); it does not mean accepting \(H_0\).
Example: Assume a normally distributed population has recorded cholesterol levels. From a sample of 100 subjects in the population, the sample mean was 214.12 mg/dL (milligrams per deciliter), with a sample standard deviation of 45.71 mg/dL. Perform a hypothesis test, with significance level 0.05, to test whether there is enough evidence to conclude that the population mean is larger than 200 mg/dL.

Solution: We will perform a hypothesis test using the \(p\)-value approach with significance level \(\alpha=0.05\):

  • Define \(H_0\): \(\mu=200\).
  • Define \(H_1\): \(\mu>200\).
  • Since our values are normally distributed, the test statistic is \(z^*=\frac{\bar X - \mu_0}{\frac{s}{\sqrt{n}}}=\frac{214.12 - 200}{\frac{45.71}{\sqrt{100}}}\approx 3.09\).
  • Using a standard normal distribution, we find that our \(p\)-value is approximately \(0.001\).
  • Since the \(p\)-value is at most \(\alpha=0.05\), we reject \(H_0\).

Therefore, the test shows sufficient evidence to support the claim that \(\mu\) is larger than \(200\) mg/dL.
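The calculation above can be sketched in a few lines of Python using only the standard library; the upper-tail normal probability is obtained from the complementary error function:

```python
from math import erfc, sqrt

def z_statistic(x_bar, mu0, s, n):
    """Standardized test statistic for a one-sample test of the mean."""
    return (x_bar - mu0) / (s / sqrt(n))

def normal_sf(z):
    """Upper-tail probability P(Z >= z) of the standard normal."""
    return 0.5 * erfc(z / sqrt(2))

# Values from the cholesterol example: n = 100, sample mean 214.12,
# sample standard deviation 45.71, H0: mu = 200, H1: mu > 200.
z = z_statistic(214.12, 200, 45.71, 100)  # ≈ 3.09
p = normal_sf(z)                          # ≈ 0.001, below alpha = 0.05
```

Since the \(p\)-value falls below 0.05, the sketch reproduces the rejection of \(H_0\) reached above.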

If the sample size were smaller, the \(t\)-distribution should be used in place of the normal distribution. Moreover, if the question asks whether the mean is simply different from a given value, rather than larger, a two-tailed test is required instead.

Example: Assume a population's cholesterol levels are recorded and various statistics are computed. From a sample of 25 subjects, the sample mean was 214.12 mg/dL (milligrams per deciliter), with a sample standard deviation of 45.71 mg/dL. Perform a hypothesis test, with significance level 0.05, to test whether there is enough evidence to conclude that the population mean is not equal to 200 mg/dL.

Solution: We will perform a hypothesis test using the \(p\)-value approach with significance level \(\alpha=0.05\) and the \(t\)-distribution with 24 degrees of freedom:

  • Define \(H_0\): \(\mu=200\).
  • Define \(H_1\): \(\mu\neq 200\).
  • Using the \(t\)-distribution, the test statistic is \(t^*=\frac{\bar X - \mu_0}{\frac{s}{\sqrt{n}}}=\frac{214.12 - 200}{\frac{45.71}{\sqrt{25}}}\approx 1.54\).
  • Using a \(t\)-distribution with 24 degrees of freedom, we find that our \(p\)-value is approximately \(2(0.068)=0.136\). We multiply by two since this is a two-tailed test, i.e. the mean can be smaller or larger than the hypothesized value.
  • Since the \(p\)-value is larger than \(\alpha=0.05\), we fail to reject \(H_0\).

Therefore, the test does not show sufficient evidence to support the claim that \(\mu\) is not equal to \(200\) mg/dL.
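A quick standard-library sketch of the test statistic (the tail area itself is read from a \(t\)-table, since Python's standard library does not provide the \(t\)-distribution CDF):

```python
from math import sqrt

def t_statistic(x_bar, mu0, s, n):
    """t statistic for a one-sample test of the mean (sigma unknown)."""
    return (x_bar - mu0) / (s / sqrt(n))

# n = 25, sample mean 214.12, sample standard deviation 45.71, H0: mu = 200.
t = t_statistic(214.12, 200, 45.71, 25)  # ≈ 1.54, with 24 degrees of freedom
# From a t-table, P(T > 1.54) ≈ 0.068; doubling for the two-tailed test
# gives p ≈ 0.136 > 0.05, so we fail to reject H0.
```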

A two-tailed hypothesis test (with significance level \(\alpha\)) for a population parameter \(\theta\) is equivalent to constructing a confidence interval (with confidence level \(1-\alpha\)) for \(\theta\). If the hypothesized value of \(\theta\) falls inside the confidence interval, then the test fails to reject the null hypothesis (with \(p\)-value greater than \(\alpha\)). Otherwise, if the hypothesized value does not fall in the confidence interval, then the null hypothesis is rejected in favor of the alternate (with \(p\)-value at most \(\alpha\)).
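This equivalence can be checked numerically. A minimal sketch, reusing the small-sample cholesterol data above and taking 2.064 as the two-tailed critical value of the \(t\)-distribution with 24 degrees of freedom (read from a table):

```python
from math import sqrt

def t_confidence_interval(x_bar, s, n, t_crit):
    """Two-sided confidence interval for the mean: x_bar +/- t* * s/sqrt(n)."""
    half_width = t_crit * s / sqrt(n)
    return (x_bar - half_width, x_bar + half_width)

# 95% CI for n = 25, sample mean 214.12, sample standard deviation 45.71.
lo, hi = t_confidence_interval(214.12, 45.71, 25, 2.064)  # ≈ (195.25, 232.99)
inside = lo <= 200 <= hi  # True: 200 lies inside, so the test fails to reject H0
```

The hypothesized mean of 200 mg/dL lies inside the interval, matching the two-tailed test's failure to reject \(H_0\) at \(\alpha=0.05\).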

  • Statistics (Estimation)
  • Normal Distribution
  • Correlation
  • Confidence Intervals


Cambridge University Faculty of Mathematics


Published 2008 Revised 2019

Understanding Hypotheses


From 'What happens if ...?' to 'This will happen if ...'

The experimentation of children continually moves on to the exploration of new ideas and the refinement of their world view of previously understood situations. This description of the playtime patterns of young children very nicely models the concept of 'making and testing hypotheses'. It follows this pattern:

  • Make some observations. Collect some data based on the observations.
  • Draw a conclusion (called a 'hypothesis') which will explain the pattern of the observations.
  • Test out your hypothesis by making some more targeted observations.

So, we have

  • A hypothesis is a statement or idea which gives an explanation to a series of observations.

Sometimes, following observation, a hypothesis will clearly need to be refined or rejected. This happens if a single contradictory observation occurs. For example, suppose that a child is trying to understand the concept of a dog. He reads about several dogs in children's books and sees that they are always friendly and fun. He makes the natural hypothesis in his mind that dogs are friendly and fun . He then meets his first real dog: his neighbour's puppy who is great fun to play with. This reinforces his hypothesis. His cousin's dog is also very friendly and great fun. He meets some of his friends' dogs on various walks to playgroup. They are also friendly and fun. He is now confident that his hypothesis is sound. Suddenly, one day, he sees a dog, tries to stroke it and is bitten. This experience contradicts his hypothesis. He will need to amend the hypothesis. We see that

  • Gathering more evidence/data can strengthen a hypothesis if it is in agreement with the hypothesis.
  • If the data contradicts the hypothesis then the hypothesis must be rejected or amended to take into account the contradictory situation.


  • A contradictory observation can cause us to know for certain that a hypothesis is incorrect.
  • Accumulation of supporting experimental evidence will strengthen a hypothesis but will never let us know for certain that the hypothesis is true.

In short, it is possible to show that a hypothesis is false, but impossible to prove that it is true!

Whilst we can never prove a scientific hypothesis to be true, there will be a certain stage at which we decide that there is sufficient supporting experimental data for us to accept the hypothesis. The point at which we make the choice to accept a hypothesis depends on many factors. In practice, the key issues are

  • What are the implications of mistakenly accepting a hypothesis which is false?
  • What are the cost / time implications of gathering more data?
  • What are the implications of not accepting in a timely fashion a true hypothesis?

For example, suppose that a drug company is testing a new cancer drug. They hypothesise that the drug is safe with no side effects. If they are mistaken in this belief and release the drug then the results could have a disastrous effect on public health. However, running extended clinical trials might be very costly and time consuming. Furthermore, a delay in accepting the hypothesis and releasing the drug might also have a negative effect on the health of many people.

In short, whilst we can never achieve absolute certainty with the testing of hypotheses, in order to make progress in science or industry decisions need to be made. There is a fine balance to be made between action and inaction.

Hypotheses and mathematics

So where does mathematics enter into this picture? In many ways, both obvious and subtle:

  • A good hypothesis needs to be clear, precisely stated and testable in some way. Creation of these clear hypotheses requires clear general mathematical thinking.
  • The data from experiments must be carefully analysed in relation to the original hypothesis. This requires the data to be structured, operated upon, prepared and displayed in appropriate ways. The levels of this process can range from simple to exceedingly complex.

Very often, the situation under analysis will appear to be complicated and unclear. Part of the mathematics of the task will be to impose a clear structure on the problem. The clarity of thought required will actively be developed through more abstract mathematical study. Those without sufficient general mathematical skill will be unable to perform an appropriate logical analysis.

Using deductive reasoning in hypothesis testing

There is often confusion between the ideas surrounding proof, which is mathematics, and making and testing an experimental hypothesis, which is science. The difference is rather simple:

  • Mathematics is based on deductive reasoning : a proof is a logical deduction from a set of clear inputs.
  • Science is based on inductive reasoning : hypotheses are strengthened or rejected based on an accumulation of experimental evidence.

Of course, to be good at science, you need to be good at deductive reasoning, although experts at deductive reasoning need not be mathematicians. Detectives, such as Sherlock Holmes and Hercule Poirot, are such experts: they collect evidence from a crime scene and then draw logical conclusions from the evidence to support the hypothesis that, for example, Person M. committed the crime. They use this evidence to create sufficiently compelling deductions to support their hypotheses beyond reasonable doubt. The key word here is 'reasonable'. There is always the possibility of creating an exceedingly outlandish scenario to explain away any hypothesis of a detective or prosecution lawyer, but judges and juries in courts eventually make the decision that the probability of such eventualities is 'small' and the chance of the hypothesis being correct 'high'.


  • If a set of data is normally distributed with mean 0 and standard deviation 0.5 then there is a 97.7% certainty that a measurement will not exceed 1.0.
  • If the mean of a sample of data is 12, how confident can we be that the true mean of the population lies between 11 and 13?
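The first of these statements is easy to verify numerically. A small Python check using the error function from the standard library (a measurement of 1.0 sits two standard deviations above the mean):

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2))))

# Mean 0, standard deviation 0.5: P(X <= 1.0) = Phi(2) ≈ 0.977, i.e. 97.7%.
p = normal_cdf(1.0, mu=0.0, sigma=0.5)
```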

It is at this point that making and testing hypotheses becomes a true branch of mathematics. This mathematics is difficult, but fascinating and highly relevant in the information-rich world of today.

To read more about the technical side of hypothesis testing, take a look at What is a Hypothesis Test?

You might also enjoy reading the articles on statistics on the Understanding Uncertainty website

This resource is part of the collection Statistics - Maths of Real Life

9.1 Null and Alternative Hypotheses

The actual test begins by considering two hypotheses . They are called the null hypothesis and the alternative hypothesis . These hypotheses contain opposing viewpoints.

H 0 , the null hypothesis: a statement of no difference between sample means or proportions, or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

H a , the alternative hypothesis: a claim about the population that is contradictory to H 0 and what we conclude when we reject H 0 .

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision. They are reject H 0 if the sample information favors the alternative hypothesis or do not reject H 0 or decline to reject H 0 if the sample information is insufficient to reject the null hypothesis.

Mathematical Symbols Used in H 0 and H a :

H 0 always has a symbol with an equal in it. H a never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.

Example 9.1

H 0 : No more than 30 percent of the registered voters in Santa Clara County voted in the primary election: p ≤ 0.30.
H a : More than 30 percent of the registered voters in Santa Clara County voted in the primary election: p > 0.30.

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25 percent. State the null and alternative hypotheses.

Example 9.2

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are the following: H 0 : μ = 2.0 H a : μ ≠ 2.0

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 66
  • H a : μ __ 66

Example 9.3

We want to test if college students take fewer than five years to graduate from college, on the average. The null and alternative hypotheses are the following: H 0 : μ ≥ 5 H a : μ < 5

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 45
  • H a : μ __ 45

Example 9.4

An article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third of the students pass. The same article stated that 6.6 percent of U.S. students take advanced placement exams and 4.4 percent pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6 percent. State the null and alternative hypotheses. H 0 : p ≤ 0.066 H a : p > 0.066

On a state driver’s test, about 40 percent pass the test on the first try. We want to test if more than 40 percent pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : p __ 0.40
  • H a : p __ 0.40

Collaborative Exercise

Bring to class a newspaper, some news magazines, and some internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.


Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Statistics
  • Publication date: Mar 27, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/statistics/pages/9-1-null-and-alternative-hypotheses

© Jan 23, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.

Hypothesis Testing

Hypothesis testing is a tool for making statistical inferences about the population data. It is an analysis tool that tests assumptions and determines how likely something is within a given standard of accuracy. Hypothesis testing provides a way to verify whether the results of an experiment are valid.

A null hypothesis and an alternative hypothesis are set up before performing the hypothesis testing. This helps to arrive at a conclusion regarding the sample obtained from the population. In this article, we will learn more about hypothesis testing, its types, steps to perform the testing, and associated examples.

What is Hypothesis Testing in Statistics?

Hypothesis testing uses sample data from the population to draw useful conclusions regarding the population probability distribution . It tests an assumption made about the data using different types of hypothesis testing methodologies. The hypothesis testing results in either rejecting or not rejecting the null hypothesis.

Hypothesis Testing Definition

Hypothesis testing can be defined as a statistical tool that is used to identify if the results of an experiment are meaningful or not. It involves setting up a null hypothesis and an alternative hypothesis. These two hypotheses will always be mutually exclusive. This means that if the null hypothesis is true then the alternative hypothesis is false and vice versa. An example of hypothesis testing is setting up a test to check if a new medicine works on a disease in a more efficient manner.

Null Hypothesis

The null hypothesis is a concise mathematical statement that is used to indicate that there is no difference between two possibilities. In other words, there is no difference between certain characteristics of data. This hypothesis assumes that the outcomes of an experiment are based on chance alone. It is denoted as \(H_{0}\). Hypothesis testing is used to conclude if the null hypothesis can be rejected or not. Suppose an experiment is conducted to check if girls are shorter than boys at the age of 5. The null hypothesis will say that they are the same height.

Alternative Hypothesis

The alternative hypothesis is an alternative to the null hypothesis. It is used to show that the observations of an experiment are due to some real effect. It indicates that there is a statistical significance between two possible outcomes and can be denoted as \(H_{1}\) or \(H_{a}\). For the above-mentioned example, the alternative hypothesis would be that girls are shorter than boys at the age of 5.

Hypothesis Testing P Value

In hypothesis testing, the p value is used to indicate whether the results obtained after conducting a test are statistically significant. It also indicates the probability of making an error in rejecting or not rejecting the null hypothesis. This value is always a number between 0 and 1. The p value is compared to an alpha level, \(\alpha\), also called the significance level. The alpha level can be defined as the acceptable risk of incorrectly rejecting the null hypothesis. The alpha level is usually chosen between 1% and 5%.

Hypothesis Testing Critical region

All sets of values that lead to rejecting the null hypothesis lie in the critical region. Furthermore, the value that separates the critical region from the non-critical region is known as the critical value.

Hypothesis Testing Formula

Depending upon the type and size of the data available, different types of hypothesis tests are used to determine whether the null hypothesis can be rejected or not. The formulas for some important test statistics are given below:

  • z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\). \(\overline{x}\) is the sample mean, \(\mu\) is the population mean, \(\sigma\) is the population standard deviation and n is the size of the sample.
  • t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\). s is the sample standard deviation.
  • \(\chi ^{2} = \sum \frac{(O_{i}-E_{i})^{2}}{E_{i}}\). \(O_{i}\) is the observed value and \(E_{i}\) is the expected value.
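As a concrete illustration of the chi-square formula, here is a minimal Python sketch; the observed and expected counts are hypothetical numbers chosen for illustration, not values from the text:

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for three categories:
stat = chi_square_statistic([10, 20, 30], [15, 25, 20])  # ≈ 7.67
```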

We will learn more about these test statistics in the upcoming section.

Types of Hypothesis Testing

Selecting the correct test for performing hypothesis testing can be confusing. These tests are used to determine a test statistic on the basis of which the null hypothesis can either be rejected or not rejected. Some of the important tests used for hypothesis testing are given below.

Hypothesis Testing Z Test

A z test is a way of hypothesis testing that is used for a large sample size (n ≥ 30). It is used to determine whether there is a difference between the population mean and the sample mean when the population standard deviation is known. It can also be used to compare the mean of two samples. It is used to compute the z test statistic. The formulas are given as follows:

  • One sample: z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\).
  • Two samples: z = \(\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{\sigma_{1}^{2}}{n_{1}}+\frac{\sigma_{2}^{2}}{n_{2}}}}\).
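The two-sample formula above translates directly to Python. A sketch with hypothetical inputs (the sample values are made up for illustration; under \(H_0\) the hypothesized difference of means is zero):

```python
from math import sqrt

def two_sample_z(x1, x2, mu_diff, sigma1, sigma2, n1, n2):
    """z statistic for the difference of two sample means, sigmas known."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((x1 - x2) - mu_diff) / standard_error

# Hypothetical samples: means 105 and 100, sigmas 12 and 10, sizes 50 and 60.
z = two_sample_z(105.0, 100.0, 0.0, 12.0, 10.0, 50, 60)  # ≈ 2.34
```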

Hypothesis Testing t Test

The t test is another method of hypothesis testing that is used for a small sample size (n < 30). It is also used to compare the sample mean and population mean. However, the population standard deviation is not known. Instead, the sample standard deviation is known. The mean of two samples can also be compared using the t test.

  • One sample: t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\).
  • Two samples: t = \(\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}}\).

Hypothesis Testing Chi Square

The Chi square test is a hypothesis testing method that is used to check whether the variables in a population are independent or not. It is used when the test statistic is chi-squared distributed.

One Tailed Hypothesis Testing

One tailed hypothesis testing is done when the rejection region is only in one direction. It can also be known as directional hypothesis testing because the effects can be tested in one direction only. This type of testing is further classified into the right tailed test and left tailed test.

Right Tailed Hypothesis Testing

The right tail test is also known as the upper tail test. This test is used to check whether the population parameter is greater than some value. The null and alternative hypotheses for this test are given as follows:

\(H_{0}\): The population parameter is ≤ some value

\(H_{1}\): The population parameter is > some value.

If the test statistic is greater than the critical value, then the null hypothesis is rejected.


Left Tailed Hypothesis Testing

The left tail test is also known as the lower tail test. It is used to check whether the population parameter is less than some value. The hypotheses for this hypothesis testing can be written as follows:

\(H_{0}\): The population parameter is ≥ some value

\(H_{1}\): The population parameter is < some value.

The null hypothesis is rejected if the test statistic has a value less than the critical value.


Two Tailed Hypothesis Testing

In this hypothesis testing method, the critical region lies on both sides of the sampling distribution. It is also known as non-directional hypothesis testing. The two-tailed test is used to determine whether the population parameter differs from some hypothesized value, in either direction. The hypotheses can be set up as follows:

\(H_{0}\): the population parameter = some value

\(H_{1}\): the population parameter ≠ some value

The null hypothesis is rejected if the test statistic falls in either rejection region, that is, if its absolute value is greater than the critical value.


Hypothesis Testing Steps

Hypothesis testing can be easily performed in five simple steps. The most important step is to correctly set up the hypotheses and identify the right method for hypothesis testing. The basic steps to perform hypothesis testing are as follows:

  • Step 1: Set up the null hypothesis by correctly identifying whether it is the left-tailed, right-tailed, or two-tailed hypothesis testing.
  • Step 2: Set up the alternative hypothesis.
  • Step 3: Choose the correct significance level, \(\alpha\), and find the critical value.
  • Step 4: Calculate the appropriate test statistic (z, t, or \(\chi^{2}\)) and the p-value.
  • Step 5: Compare the test statistic with the critical value or compare the p-value with \(\alpha\) to arrive at a conclusion. In other words, decide if the null hypothesis is to be rejected or not.

Hypothesis Testing Example

The best way to solve a problem on hypothesis testing is by applying the 5 steps mentioned in the previous section. Suppose a researcher claims that the mean weight of men is greater than 100 kg, with a known population standard deviation of 15 kg. A sample of 30 men is chosen, with an average weight of 112.5 kg. Using hypothesis testing, check if there is enough evidence to support the researcher's claim. The confidence level is given as 95%.

Step 1: This is an example of a right-tailed test. Set up the null hypothesis as \(H_{0}\): \(\mu\) = 100.

Step 2: The alternative hypothesis is given by \(H_{1}\): \(\mu\) > 100.

Step 3: As this is a one-tailed test, \(\alpha\) = 100% - 95% = 5%. This can be used to determine the critical value.

1 - \(\alpha\) = 1 - 0.05 = 0.95

0.95 gives the required area under the curve. Now using a normal distribution table, the area 0.95 is at z = 1.645. A similar process can be followed for a t-test. The only additional requirement is to calculate the degrees of freedom given by n - 1.

Step 4: Calculate the z test statistic. The z test applies because the sample size is 30 and the population standard deviation is known, along with the sample and population means.

z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\).

\(\mu\) = 100, \(\overline{x}\) = 112.5, n = 30, \(\sigma\) = 15

z = \(\frac{112.5-100}{\frac{15}{\sqrt{30}}}\) = 4.56

Step 5: Conclusion. Since 4.56 > 1.645, the null hypothesis is rejected.
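The five steps of this example can be reproduced numerically; a minimal Python sketch using the values above:

```python
from math import sqrt

# Step 1-2: H0: mu = 100, H1: mu > 100 (right-tailed).
mu0, x_bar, sigma, n = 100, 112.5, 15, 30

# Step 3-4: critical value at alpha = 0.05 (from a normal table) and z statistic.
z_critical = 1.645
z = (x_bar - mu0) / (sigma / sqrt(n))  # ≈ 4.56

# Step 5: compare and decide.
reject_h0 = z > z_critical             # True: reject the null hypothesis
```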

Hypothesis Testing and Confidence Intervals

Confidence intervals form an important part of hypothesis testing, because the alpha level can be determined from a given confidence level. Suppose the confidence level is 95%. Subtracting it from 100% gives 100 - 95 = 5%, or 0.05. This is the alpha value for a one-tailed hypothesis test. To obtain the alpha value for each tail of a two-tailed hypothesis test, divide this value by 2: 0.05 / 2 = 0.025.

Related Articles:

  • Probability and Statistics
  • Data Handling

Important Notes on Hypothesis Testing

  • Hypothesis testing is a technique that is used to verify whether the results of an experiment are statistically significant.
  • It involves the setting up of a null hypothesis and an alternate hypothesis.
  • There are three types of tests that can be conducted under hypothesis testing - z test, t test, and chi square test.
  • Hypothesis testing can be classified as right tail, left tail, and two tail tests.

Examples on Hypothesis Testing

  • Example 1: The average weight of a dumbbell in a gym is 90 lbs. However, a physical trainer believes that the average weight might be higher. A random sample of 5 dumbbells has an average weight of 110 lbs and a standard deviation of 18 lbs. Using hypothesis testing, check if the physical trainer's claim can be supported at a 95% confidence level. Solution: As the sample size is less than 30, the t-test is used. \(H_{0}\): \(\mu\) = 90, \(H_{1}\): \(\mu\) > 90. \(\overline{x}\) = 110, \(\mu\) = 90, n = 5, s = 18, \(\alpha\) = 0.05. Using the t-distribution table, the critical value is 2.132. t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\) = 2.484. As 2.484 > 2.132, the null hypothesis is rejected. Answer: The average weight of the dumbbells may be greater than 90 lbs.
  • Example 2: The average score on a test is 80 with a standard deviation of 10. With a new teaching curriculum introduced, it is believed that this score will change. On randomly testing the scores of 36 students, the mean was found to be 88. With a 0.05 significance level, is there any evidence to support this claim? Solution: This is an example of two-tailed hypothesis testing. The z test will be used. \(H_{0}\): \(\mu\) = 80, \(H_{1}\): \(\mu\) ≠ 80. \(\overline{x}\) = 88, \(\mu\) = 80, n = 36, \(\sigma\) = 10, \(\alpha\) = 0.05 / 2 = 0.025. The critical value using the normal distribution table is 1.96. z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\) = \(\frac{88-80}{\frac{10}{\sqrt{36}}}\) = 4.8. As 4.8 > 1.96, the null hypothesis is rejected. Answer: There is a difference in the scores after the new curriculum was introduced.
  • Example 3: The average score of a class is 90. However, a teacher believes that the average score might be lower. The scores of 6 students were randomly measured. The mean was 82 with a standard deviation of 18. With a 0.05 significance level, use hypothesis testing to check if this claim is true. Solution: The t test will be used. \(H_{0}\): \(\mu\) = 90, \(H_{1}\): \(\mu\) < 90. \(\overline{x}\) = 82, \(\mu\) = 90, n = 6, s = 18. The critical value from the t table is -2.015. t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\) = \(\frac{82-90}{\frac{18}{\sqrt{6}}}\) = -1.088. As -1.088 > -2.015, we fail to reject the null hypothesis. Answer: There is not enough evidence to support the claim.
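The arithmetic in the three examples can be checked with one small helper; the same ratio serves as both the z and the t statistic, and only the reference distribution differs:

```python
from math import sqrt

def standardized_stat(x_bar, mu0, spread, n):
    """(x_bar - mu0) / (spread / sqrt(n)), used as a z or t statistic."""
    return (x_bar - mu0) / (spread / sqrt(n))

t1 = standardized_stat(110, 90, 18, 5)  # Example 1: ≈ 2.48 > 2.132, reject H0
z2 = standardized_stat(88, 80, 10, 36)  # Example 2: = 4.8 > 1.96, reject H0
t3 = standardized_stat(82, 90, 18, 6)   # Example 3: ≈ -1.09 > -2.015, fail to reject
```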


FAQs on Hypothesis Testing

What is Hypothesis Testing?

Hypothesis testing in statistics is a tool that is used to make inferences about the population data. It is also used to check if the results of an experiment are valid.

What is the z Test in Hypothesis Testing?

The z test in hypothesis testing is used to find the z test statistic for normally distributed data. The z test is used when the standard deviation of the population is known and the sample size is greater than or equal to 30.

What is the t Test in Hypothesis Testing?

The t test in hypothesis testing is used when the data follows a Student's t distribution. It is used when the sample size is less than 30 and the standard deviation of the population is not known.

What is the formula for z test in Hypothesis Testing?

The formula for a one sample z test in hypothesis testing is z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\) and for two samples is z = \(\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{\sigma_{1}^{2}}{n_{1}}+\frac{\sigma_{2}^{2}}{n_{2}}}}\).

What is the p Value in Hypothesis Testing?

The p value helps to determine if the test results are statistically significant or not. In hypothesis testing, the null hypothesis can either be rejected or not rejected based on the comparison between the p value and the alpha level.

What is One Tail Hypothesis Testing?

When the rejection region is only on one side of the distribution curve then it is known as one tail hypothesis testing. The right tail test and the left tail test are two types of directional hypothesis testing.

What is the Alpha Level in Two Tail Hypothesis Testing?

To get the alpha level in a two tail hypothesis testing divide \(\alpha\) by 2. This is done as there are two rejection regions in the curve.


Unit 12: Significance tests (hypothesis testing)

About this unit

The idea of significance tests

  • Simple hypothesis testing
  • Idea behind hypothesis testing
  • Examples of null and alternative hypotheses
  • P-values and significance tests
  • Comparing P-values to different significance levels
  • Estimating a P-value from a simulation
  • Using P-values to make conclusions
  • Simple hypothesis testing (practice)
  • Writing null and alternative hypotheses (practice)
  • Estimating P-values from simulations (practice)

Error probabilities and power

  • Introduction to Type I and Type II errors
  • Type 1 errors
  • Examples identifying Type I and Type II errors
  • Introduction to power in significance tests
  • Examples thinking about power in significance tests
  • Consequences of errors and significance
  • Type I vs Type II error (practice)
  • Error probabilities and power (practice)

Tests about a population proportion

  • Constructing hypotheses for a significance test about a proportion
  • Conditions for a z test about a proportion
  • Reference: Conditions for inference on a proportion
  • Calculating a z statistic in a test about a proportion
  • Calculating a P-value given a z statistic
  • Making conclusions in a test about a proportion
  • Writing hypotheses for a test about a proportion (practice)
  • Conditions for a z test about a proportion (practice)
  • Calculating the test statistic in a z test for a proportion (practice)
  • Calculating the P-value in a z test for a proportion (practice)
  • Making conclusions in a z test for a proportion (practice)

Tests about a population mean

  • Writing hypotheses for a significance test about a mean
  • Conditions for a t test about a mean
  • Reference: Conditions for inference on a mean
  • When to use z or t statistics in significance tests
  • Example calculating t statistic for a test about a mean
  • Using TI calculator for P-value from t statistic
  • Using a table to estimate P-value from t statistic
  • Comparing P-value from t statistic to significance level
  • Free response example: Significance test for a mean
  • Writing hypotheses for a test about a mean (practice)
  • Conditions for a t test about a mean (practice)
  • Calculating the test statistic in a t test for a mean (practice)
  • Calculating the P-value in a t test for a mean (practice)
  • Making conclusions in a t test for a mean (practice)

More significance testing videos

  • Hypothesis testing and p-values
  • One-tailed and two-tailed tests
  • Z-statistics vs. T-statistics
  • Small sample hypothesis test
  • Large sample proportion hypothesis testing


5.2 - Writing Hypotheses

The first step in conducting a hypothesis test is to write the hypothesis statements that are going to be tested. For each test you will have a null hypothesis (\(H_0\)) and an alternative hypothesis (\(H_a\)).

When writing hypotheses there are three things that we need to know: (1) the parameter that we are testing, (2) the direction of the test (non-directional, right-tailed, or left-tailed), and (3) the value of the hypothesized parameter.

  • At this point we can write hypotheses for a single mean (\(\mu\)), paired means (\(\mu_d\)), a single proportion (\(p\)), the difference between two independent means (\(\mu_1-\mu_2\)), the difference between two proportions (\(p_1-p_2\)), a simple linear regression slope (\(\beta\)), and a correlation (\(\rho\)).
  • The research question will give us the information necessary to determine if the test is two-tailed (e.g., "different from," "not equal to"), right-tailed (e.g., "greater than," "more than"), or left-tailed (e.g., "less than," "fewer than").
  • The research question will also give us the hypothesized parameter value. This is the number that goes in the hypothesis statements (i.e., \(\mu_0\) and \(p_0\)). For the difference between two groups, regression, and correlation, this value is typically 0.

Hypotheses are always written in terms of population parameters (e.g., \(p\) and \(\mu\)).  The tables below display all of the possible hypotheses for the parameters that we have learned thus far. Note that the null hypothesis always includes the equality (i.e., =).
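As a concrete instance of the tables described above, the three possible hypothesis pairs for a single mean with hypothesized value \(\mu_0\) can be written as follows (the other parameters follow the same pattern, with the null always carrying the equality):

```latex
\begin{align*}
\text{Two-tailed:}   && H_0&: \mu = \mu_0 & H_a&: \mu \neq \mu_0 \\
\text{Right-tailed:} && H_0&: \mu = \mu_0 & H_a&: \mu > \mu_0 \\
\text{Left-tailed:}  && H_0&: \mu = \mu_0 & H_a&: \mu < \mu_0
\end{align*}
```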

How Do You Formulate (Important) Hypotheses?

  • Open Access
  • First Online: 03 December 2022


James Hiebert, Jinfa Cai, Stephen Hwang, Anne K. Morris & Charles Hohensee

Part of the book series: Research in Mathematics Education ((RME))


Building on the ideas in Chap. 1, we describe formulating, testing, and revising hypotheses as a continuing cycle of clarifying what you want to study, making predictions about what you might find together with developing your reasons for these predictions, imagining tests of these predictions, revising your predictions and rationales, and so on. Many resources feed this process, including reading what others have found about similar phenomena, talking with colleagues, conducting pilot studies, and writing drafts as you revise your thinking. Although you might think you cannot predict what you will find, it is always possible—with enough reading and conversations and pilot studies—to make some good guesses. And, once you guess what you will find and write out the reasons for these guesses you are on your way to scientific inquiry. As you refine your hypotheses, you can assess their research importance by asking how connected they are to problems your research community really wants to solve.


Part I. Getting Started

We want to begin by addressing a question you might have had as you read the title of this chapter. You are likely to hear, or read in other sources, that the research process begins by asking research questions . For reasons we gave in Chap. 1 , and more we will describe in this and later chapters, we emphasize formulating, testing, and revising hypotheses. However, it is important to know that asking and answering research questions involve many of the same activities, so we are not describing a completely different process.

We acknowledge that many researchers do not actually begin by formulating hypotheses. In other words, researchers rarely get a researchable idea by writing out a well-formulated hypothesis. Instead, their initial ideas for what they study come from a variety of sources. Then, after they have the idea for a study, they do lots of background reading and thinking and talking before they are ready to formulate a hypothesis. So, for readers who are at the very beginning and do not yet have an idea for a study, let’s back up. Where do research ideas come from?

There are no formulas or algorithms that spawn a researchable idea. But as you begin the process, you can ask yourself some questions. Your answers to these questions can help you move forward.

What are you curious about? What are you passionate about? What have you wondered about as an educator? These are questions that look inward, questions about yourself.

What do you think are the most pressing educational problems? Which problems are you in the best position to address? What change(s) do you think would help all students learn more productively? These are questions that look outward, questions about phenomena you have observed.

What are the main areas of research in the field? What are the big questions that are being asked? These are questions about the general landscape of the field.

What have you read about in the research literature that caught your attention? What have you read that prompted you to think about extending the profession’s knowledge about this? What have you read that made you ask, “I wonder why this is true?” These are questions about how you can build on what is known in the field.

What are some research questions or testable hypotheses that have been identified by other researchers for future research? This, too, is a question about how you can build on what is known in the field. Taking up such questions or hypotheses can help by providing some existing scaffolding that others have constructed.

What research is being done by your immediate colleagues or your advisor that is of interest to you? These are questions about topics for which you will likely receive local support.

Exercise 2.1

Brainstorm some answers for each set of questions. Record them. Then step back and look at the places of intersection. Did you have similar answers across several questions? Write out, as clearly as you can, the topic that captures your primary interest, at least at this point. We will give you a chance to update your responses as you study this book.

Part II. Paths from a General Interest to an Informed Hypothesis

There are many different paths you might take from conceiving an idea for a study, maybe even a vague idea, to formulating a prediction that leads to an informed hypothesis that can be tested. We will explore some of the paths we recommend.

We will assume you have completed Exercise 2.1 in Part I and have some written answers to the six questions that preceded it as well as a statement that describes your topic of interest. This very first statement could take several different forms: a description of a problem you want to study, a question you want to address, or a hypothesis you want to test. We recommend that you begin with one of these three forms, the one that makes most sense to you. There is an advantage to using all three and flexibly choosing the one that is most meaningful at the time and for a particular study. You can then move from one to the other as you think more about your research study and you develop your initial idea. To get a sense of how the process might unfold, consider the following alternative paths.

Beginning with a Prediction If You Have One

Sometimes, when you notice an educational problem or have a question about an educational situation or phenomenon, you quickly have an idea that might help solve the problem or answer the question. Here are three examples.

You are a teacher, and you noticed a problem with the way the textbook presented two related concepts in two consecutive lessons. Almost as soon as you noticed the problem, it occurred to you that the two lessons could be taught more effectively in the reverse order. You predicted better outcomes if the order was reversed, and you even had a preliminary rationale for why this would be true.

You are a graduate student and you read that students often misunderstand a particular aspect of graphing linear functions. You predicted that, by listening to small groups of students working together, you could hear new details that would help you understand this misconception.

You are a curriculum supervisor and you observed sixth-grade classrooms where students were learning about decimal fractions. After talking with several experienced teachers, you predicted that beginning with percentages might be a good way to introduce students to decimal fractions.

We begin with the path of making predictions because we see the other two paths as leading into this one at some point in the process (see Fig. 2.1 ). Starting with this path does not mean you did not sense a problem you wanted to solve or a question you wanted to answer.

Fig. 2.1 Three pathways to formulating informed hypotheses: each starts with a problem situation and leads, through questions and predictions, to a hypothesis.

Notice that your predictions can come from a variety of sources—your own experience, reading, and talking with colleagues. Most likely, as you write out your predictions you also think about the educational problem for which your prediction is a potential solution. Writing a clear description of the problem will be useful as you proceed. Notice also that it is easy to change each of your predictions into a question. When you formulate a prediction, you are actually answering a question, even though the question might be implicit. Making that implicit question explicit can generate a first draft of the research question that accompanies your prediction. For example, suppose you are the curriculum supervisor who predicts that teaching percentages first would be a good way to introduce decimal fractions. In an obvious shift in form, you could ask, “In what ways would teaching percentages benefit students’ initial learning of decimal fractions?”

The difference between a question and a prediction: a question simply asks what you will find, whereas a prediction also says what you expect to find.

There are advantages to starting with the prediction form if you can make an educated guess about what you will find. Making a prediction forces you to think now about several things you will need to think about at some point anyway. It is better to think about them earlier rather than later. If you state your prediction clearly and explicitly, you can begin to ask yourself three questions about your prediction: Why do I expect to observe what I am predicting? Why did I make that prediction? (These two questions essentially ask what your rationale is for your prediction.) And, how can I test to see if it’s right? This is where the benefits of making predictions begin.

Asking yourself why you predicted what you did, and then asking yourself why you answered the first “why” question as you did, can be a powerful chain of thought that lays the groundwork for an increasingly accurate prediction and an increasingly well-reasoned rationale. For example, suppose you are the curriculum supervisor above who predicted that beginning by teaching percentages would be a good way to introduce students to decimal fractions. Why did you make this prediction? Maybe because students are familiar with percentages in everyday life so they could use what they know to anchor their thinking about hundredths. Why would that be helpful? Because if students could connect hundredths in percentage form with hundredths in decimal fraction form, they could bring their meaning of percentages into decimal fractions. But how would that help? If students understood that a decimal fraction like 0.35 meant 35 of 100, then they could use their understanding of hundredths to explore the meaning of tenths, thousandths, and so on. Why would that be useful? By continuing to ask yourself why you gave the previous answer, you can begin building your rationale and, as you build your rationale, you will find yourself revisiting your prediction, often making it more precise and explicit. If you were the curriculum supervisor and continued the reasoning in the previous sentences, you might elaborate your prediction by specifying the way in which percentages should be taught in order to have a positive effect on particular aspects of students’ understanding of decimal fractions.

Developing a Rationale for Your Predictions

Keeping your initial predictions in mind, you can read what others already know about the phenomenon. Your reading can now become targeted with a clear purpose.

By reading and talking with colleagues, you can develop more complete reasons for your predictions. It is likely that you will also decide to revise your predictions based on what you learn from your reading. As you develop sound reasons for your predictions, you are creating your rationales, and your predictions together with your rationales become your hypotheses. The more you learn about what is already known about your research topic, the more refined will be your predictions and the clearer and more complete your rationales. We will use the term more informed hypotheses to describe this evolution of your hypotheses.

Developing more informed hypotheses is a good thing because it means: (1) you understand the reasons for your predictions; (2) you will be able to imagine how you can test your hypotheses; (3) you can more easily convince your colleagues that they are important hypotheses—they are hypotheses worth testing; and (4) at the end of your study, you will be able to more easily interpret the results of your test and to revise your hypotheses to demonstrate what you have learned by conducting the study.

Imagining Testing Your Hypotheses

Because we have tied together predictions and rationales to constitute hypotheses, testing hypotheses means testing predictions and rationales. Testing predictions means comparing empirical observations, or findings, with the predictions. Testing rationales means using these comparisons to evaluate the adequacy or soundness of the rationales.

Imagining how you might test your hypotheses does not mean working out the details for exactly how you would test them. Rather, it means thinking ahead about how you could do this. Recall the descriptor of scientific inquiry: “experience carefully planned in advance” (Fisher, 1935). Asking whether predictions are testable and whether rationales can be evaluated is simply planning in advance.

You might read that testing hypotheses means simply assessing whether predictions are correct or incorrect. In our view, it is more useful to think of testing as a means of gathering enough information to compare your findings with your predictions, revise your rationales, and propose more accurate predictions. So, asking yourself whether hypotheses can be tested means asking whether information could be collected to assess the accuracy of your predictions and whether the information will show you how to revise your rationales to sharpen your predictions.

Cycles of Building Rationales and Planning to Test Your Predictions

Scientific reasoning is a dialogue between the possible and the actual, an interplay between hypotheses and the logical expectations they give rise to: there is a restless to-and-fro motion of thought, the formulation and rectification of hypotheses (Medawar, 1982, p. 72).

As you ask yourself about how you could test your predictions, you will inevitably revise your rationales and sharpen your predictions. Your hypotheses will become more informed, more targeted, and more explicit. They will make clearer to you and others what, exactly, you plan to study.

When will you know that your hypotheses are clear and precise enough? Because of the way we define hypotheses, this question asks about both rationales and predictions. If a rationale you are building lets you make a number of quite different predictions that are equally plausible rather than a single, primary prediction, then your hypothesis needs further refinement by building a more complete and precise rationale. Also, if you cannot briefly describe to your colleagues a believable way to test your prediction, then you need to phrase it more clearly and precisely.

Each time you strengthen your rationales, you might need to adjust your predictions. And, each time you clarify your predictions, you might need to adjust your rationales. The cycle of going back and forth to keep your predictions and rationales tightly aligned has many payoffs down the road. Every decision you make from this point on will be in the interests of providing a transparent and convincing test of your hypotheses and explaining how the results of your test dictate specific revisions to your hypotheses. As you make these decisions (described in the succeeding chapters), you will probably return to clarify your hypotheses even further. But, you will be in a much better position, at each point, if you begin with well-informed hypotheses.

Beginning by Asking Questions to Clarify Your Interests

Instead of starting with predictions, a second path you might take devotes more time at the beginning to asking questions as you zero in on what you want to study. Some researchers suggest you start this way (e.g., Gournelos et al., 2019 ). Specifically, with this second path, the first statement you write to express your research interest would be a question. For example, you might ask, “Why do ninth-grade students change the way they think about linear equations after studying quadratic equations?” or “How do first graders solve simple arithmetic problems before they have been taught to add and subtract?”

The first phrasing of your question might be quite general or vague. As you think about your question and what you really want to know, you are likely to ask follow-up questions. These questions will almost always be more specific than your first question. The questions will also express more clearly what you want to know. So, the question “How do first graders solve simple arithmetic problems before they have been taught to add and subtract” might evolve into “Before first graders have been taught to solve arithmetic problems, what strategies do they use to solve arithmetic problems with sums and products below 20?” As you read and learn about what others already know about your questions, you will continually revise your questions toward clearer and more explicit and more precise versions that zero in on what you really want to know. The question above might become, “Before they are taught to solve arithmetic problems, what strategies do beginning first graders use to solve arithmetic problems with sums and products below 20 if they are read story problems and given physical counters to help them keep track of the quantities?”

Imagining Answers to Your Questions

If you monitor your own thinking as you ask questions, you are likely to begin forming some guesses about answers, even to the early versions of the questions. What do students learn about quadratic functions that influences changes in their proportional reasoning when dealing with linear functions? It could be that if you analyze the moments during instruction on quadratic equations that are extensions of the proportional reasoning involved in solving linear equations, there are times when students receive further experience reasoning proportionally. You might predict that these are the experiences that have a “backward transfer” effect (Hohensee, 2014 ).

These initial guesses about answers to your questions are your first predictions. The first predicted answers are likely to be hunches or fuzzy, vague guesses. This simply means you do not know very much yet about the question you are asking. Your first predictions, no matter how unfocused or tentative, represent the most you know at the time about the question you are asking. They help you gauge where you are in your thinking.

Shifting to the Hypothesis Formulation and Testing Path

Research questions can play an important role in the research process. They provide a succinct way of capturing your research interests and communicating them to others. When colleagues want to know about your work, they will often ask “What are your research questions?” It is good to have a ready answer.

However, research questions have limitations. They do not capture the three images of scientific inquiry presented in Chap. 1 . Due, in part, to this less expansive depiction of the process, research questions do not take you very far. They do not provide a guide that leads you through the phases of conducting a study.

Consequently, when you can imagine an answer to your research question, we recommend that you move onto the hypothesis formulation and testing path. Imagining an answer to your question means you can make plausible predictions. You can now begin clarifying the reasons for your predictions and transform your early predictions into hypotheses (predictions along with rationales). We recommend you do this as soon as you have guesses about the answers to your questions because formulating, testing, and revising hypotheses offers a tool that puts you squarely on the path of scientific inquiry. It is a tool that can guide you through the entire process of conducting a research study.

This does not mean you are finished asking questions. Predictions are often created as answers to questions. So, we encourage you to continue asking questions to clarify what you want to know. But your target shifts from only asking questions to also proposing predictions for the answers and developing reasons the answers will be accurate predictions. It is by predicting answers, and explaining why you made those predictions, that you become engaged in scientific inquiry.

Cycles of Refining Questions and Predicting Answers

An example might provide a sense of how this process plays out. Suppose you are reading about Vygotsky’s ( 1987 ) zone of proximal development (ZPD), and you realize this concept might help you understand why your high school students had trouble learning exponential functions. Maybe they were outside this zone when you tried to teach exponential functions. In order to recognize students who would benefit from instruction, you might ask, “How can I identify students who are within the ZPD around exponential functions?” What would you predict? Maybe students in this ZPD are those who already had knowledge of related functions. You could write out some reasons for this prediction, like “students who understand linear and quadratic functions are more likely to extend their knowledge to exponential functions.” But what kind of data would you need to test this? What would count as “understanding”? Are linear and quadratic the functions you should assess? Even if they are, how could you tell whether students who scored well on tests of linear and quadratic functions were within the ZPD of exponential functions? How, in the end, would you measure what it means to be in this ZPD? So, asking a series of reasonable questions raised some red flags about the way your initial question was phrased, and you decide to revise it.

You set the stage for revising your question by defining ZPD as the zone within which students can solve an exponential function problem by making only one additional conceptual connection between what they already know and exponential functions. Your revised question is, “Based on students’ knowledge of linear and quadratic functions, which students are within the ZPD of exponential functions?” This time you know what kind of data you need: the number of conceptual connections students need to bridge from their knowledge of related functions to exponential functions. How can you collect these data? Would you need to see into the minds of the students? Or, are there ways to test the number of conceptual connections someone makes to move from one topic to another? Do methods exist for gathering these data? You decide this is not realistic, so you now have a choice: revise the question further or move your research in a different direction.

Notice that we do not use the term research question for all these early versions of questions that begin clarifying for yourself what you want to study. These early versions are too vague and general to be called research questions. In this book, we save the term research question for a question that comes near the end of the work and captures exactly what you want to study . By the time you are ready to specify a research question, you will be thinking about your study in terms of hypotheses and tests. When your hypotheses are in final form and include clear predictions about what you will find, it will be easy to state the research questions that accompany your predictions.

To reiterate one of the key points of this chapter: hypotheses carry much more information than research questions. Using our definition, hypotheses include predictions about what the answer might be to the question plus reasons for why you think so. Unlike research questions, hypotheses capture all three images of scientific inquiry presented in Chap. 1 (planning, observing and explaining, and revising one’s thinking). Your hypotheses represent the most you know, at the moment, about your research topic. The same cannot be said for research questions.

Beginning with a Research Problem

When you wrote answers to the six questions at the end of Part I of this chapter, you might have identified a research interest by stating it as a problem. This is the third path you might take to begin your research. Perhaps your description of your problem might look something like this: “When I tried to teach my middle school students by presenting them with a challenging problem without showing them how to solve similar problems, they didn’t exert much effort trying to find a solution but instead waited for me to show them how to solve the problem.” You do not have a specific question in mind, and you do not have an idea for why the problem exists, so you do not have a prediction about how to solve it. Writing a statement of this problem as clearly as possible could be the first step in your research journey.

As you think more about this problem, it will feel natural to ask questions about it. For example, why did some students show more initiative than others? What could I have done to get them started? How could I have encouraged the students to keep trying without giving away the solution? You are now on the path of asking questions—not research questions yet, but questions that are helping you focus your interest.

As you continue to think about these questions, reflect on your own experience, and read what others know about this problem, you will likely develop some guesses about the answers to the questions. They might be somewhat vague answers, and you might not have lots of confidence they are correct, but they are guesses that you can turn into predictions. Now you are on the hypothesis-formulation-and-testing path. This means you are on the path of asking yourself why you believe the predictions are correct, developing rationales for the predictions, asking what kinds of empirical observations would test your predictions, and refining your rationales and predictions as you read the literature and talk with colleagues.

A simple diagram that summarizes the three paths we have described is shown in Fig. 2.1 . Each row of arrows represents one pathway for formulating an informed hypothesis. The dotted arrows in the first two rows represent parts of the pathways that a researcher may have implicitly travelled through already (without an intent to form a prediction) but that ultimately inform the researcher’s development of a question or prediction.

Part III. One Researcher’s Experience Launching a Scientific Inquiry

Martha was in the third year of her doctoral program and beginning to identify a topic for her dissertation. Based on (a) her experience as a high school mathematics teacher and a curriculum supervisor, (b) the reading she has done to this point, and (c) her conversations with her colleagues, she has developed an interest in what kinds of professional development experiences (let's call them learning opportunities [LOs] for teachers) are most effective. Where does she go from here?

Exercise 2.2

Before you continue reading, please write down some suggestions for Martha about where she should start.

A natural thing for Martha to do at this point is to ask herself some additional questions, questions that specify further what she wants to learn: What kinds of LOs do most teachers experience? How do these experiences change teachers’ practices and beliefs? Are some LOs more effective than others? What makes them more effective?

To focus her questions and decide what she really wants to know, she continues reading but now targets her reading toward everything she can find that suggests possible answers to these questions. She also talks with her colleagues to get more ideas about possible answers to these or related questions. Over several weeks or months, she finds herself being drawn to questions about what makes LOs effective, especially for helping teachers teach more conceptually. She zeroes in on the question, “What makes LOs for teachers effective for improving their teaching for conceptual understanding?”

This question is more focused than her first questions, but it is still too general for Martha to define a research study. How does she know it is too general? She uses two criteria. First, she notices that the predictions she makes about the answers to the question are all over the place; they are not constrained by the reasons she has assembled for her predictions. One prediction is that LOs are more effective when they help teachers learn content. Martha makes this guess because previous research suggests that effective LOs for teachers include attention to content. But this rationale allows lots of different predictions. For example, LOs are more effective when they focus on the content teachers will teach; LOs are more effective when they focus on content beyond what teachers will teach so teachers see how their instruction fits with what their students will encounter later; and LOs are more effective when they are tailored to the level of content knowledge participants have when they begin the LOs. The rationale she can provide at this point does not point to a particular prediction.

A second measure Martha uses to decide that her question is too general is that the predictions she can make regarding the answers seem very difficult to test. How could she test, for example, whether LOs should focus on content beyond what teachers will teach? What does "content beyond what teachers teach" mean? How could she tell whether teachers use their new knowledge of later content to inform their teaching?

Before anticipating what Martha’s next question might be, it is important to pause and recognize how predicting the answers to her questions moved Martha into a new phase in the research process. As she makes predictions, works out the reasons for them, and imagines how she might test them, she is immersed in scientific inquiry. This intellectual work is the main engine that drives the research process. Also notice that revisions in the questions asked, the predictions made, and the rationales built represent the updated thinking (Chap. 1 ) that occurs as Martha continues to define her study.

Based on all these considerations and her continued reading, Martha revises the question again. The question now reads, “Do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?” Although she feels like the question is more specific, she realizes that the answer to the question is either “yes” or “no.” This, by itself, is a red flag. Answers of “yes” or “no” would not contribute much to understanding the relationships between these LOs for teachers and changes in their teaching. Recall from Chap. 1 that understanding how things work, explaining why things work, is the goal of scientific inquiry.

Martha continues by trying to understand why she believes the answer is "yes." When she tries to write out reasons for predicting "yes," she realizes that her prediction depends on a variety of factors. If teachers already have deep knowledge of the content, the LOs might not affect them as much as they would affect other teachers. If the LOs do not help teachers develop their own conceptual understanding, they are not likely to change their teaching. By trying to build the rationale for her prediction—thus formulating a hypothesis—Martha realizes that the question still is not precise and clear enough.

Martha uses what she learned when developing the rationale and rephrases the question as follows: "Under what conditions do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?" Through several additional cycles of thinking through the rationale for her predictions and how she might test them, Martha specifies her question even further: "Under what conditions do middle school teachers who lack conceptual knowledge of linear functions benefit from LOs that engage them in conceptual learning of linear functions as assessed by changes in their teaching toward a more conceptual emphasis on linear functions?"

Each version of Martha’s question has become more specific. This has occurred as she has (a) identified a starting condition for the teachers—they lack conceptual knowledge of linear functions, (b) specified the mathematics content as linear functions, and (c) included a condition or purpose of the LO—it is aimed at conceptual learning.

Because of the way Martha’s question is now phrased, her predictions will require thinking about the conditions that could influence what teachers learn from the LOs and how this learning could affect their teaching. She might predict that if teachers engaged in LOs that extended over multiple sessions, they would develop deeper understanding which would, in turn, prompt changes in their teaching. Or she might predict that if the LOs included examples of how their conceptual learning could translate into different instructional activities for their students, teachers would be more likely to change their teaching. Reasons for these predictions would likely come from research about the effects of professional development on teachers’ practice.

As Martha thinks about testing her predictions, she realizes it will probably be easier to measure the conditions under which teachers are learning than the changes in the conceptual emphasis in their instruction. She makes a note to continue searching the literature for ways to measure the “conceptualness” of teaching.

As she refines her predictions and expresses her reasons for the predictions, she formulates a hypothesis (in this case several hypotheses) that will guide her research. As she makes predictions and develops the rationales for these predictions, she will probably continue revising her question. She might decide, for example, that she is not interested in studying the condition of different numbers of LO sessions and so decides to remove this condition from consideration by including in her question something like “. . . over five 2-hour sessions . . .”

At this point, Martha has developed a research question, articulated a number of predictions, and developed rationales for them. Her current question is: “Under what conditions do middle school teachers who lack conceptual knowledge of linear functions benefit from five 2-hour LO sessions that engage them in conceptual learning of linear functions as assessed by changes in their teaching toward a more conceptual emphasis on linear functions?” Her hypothesis is:

Prediction: Participating teachers will show changes in their teaching with a greater emphasis on conceptual understanding, with larger changes on linear function topics directly addressed in the LOs than on other topics.

Brief Description of Rationale: (1) Past research has shown correlations between teachers' specific mathematics knowledge of a topic and the quality of their teaching of that topic. This does not mean an increase in knowledge causes higher quality teaching, but it allows for that possibility. (2) Transfer is usually difficult for teachers, but the examples developed during the LO sessions will help them use what they learned to teach for conceptual understanding. This is because those examples are much like the ones the teachers will use in their own classrooms. So larger changes will be found when teachers are teaching the linear function topics addressed in the LOs.

Notice it is more straightforward to imagine how Martha could test this prediction because it is more precise than previous predictions. Notice also that by asking how to test a particular prediction, Martha will be faced with a decision about whether testing this prediction will tell her something she wants to learn. If not, she can return to the research question and consider how to specify it further and, perhaps, constrain further the conditions that could affect the data.

As Martha formulates her hypotheses and goes through multiple cycles of refining her question(s), articulating her predictions, and developing her rationales, she is constantly building the theoretical framework for her study. Because the theoretical framework is the topic for Chap. 3 , we will pause here and pick up Martha’s story in the next chapter. Spoiler alert: Martha’s experience contains some surprising twists and turns.

Before leaving Martha, however, we point out two aspects of the process in which she has been engaged. First, it can be useful to think about the process as identifying (1) the variables targeted in her predictions, (2) the mechanisms she believes explain the relationships among the variables, and (3) the definitions of all the terms that are special to her educational problem. By variables, we mean things that can be measured and, when measured, can take on different values. In Martha's case, the variables are the conceptualness of teaching and the content topics addressed in the LOs. The mechanisms are cognitive processes that enable teachers to see the relevance of what they learn in the LOs to their own teaching and that enable the transfer of learning from one setting to another. Definitions are the precise descriptions of how the important ideas relevant to the research are conceptualized. In Martha's case, definitions must be provided for terms like conceptual understanding, linear functions, LOs, each of the topics related to linear functions, instructional setting, and knowledge transfer.

A second aspect of the process is a practice that Martha acquired as part of her graduate program, a practice that can go unnoticed. Martha writes out, in full sentences, her thinking as she wrestles with her research question, her predictions of the answers, and the rationales for her predictions. Writing is a tool for organizing thinking and we recommend you use it throughout the scientific inquiry process. We say more about this at the end of the chapter.

Here are the questions Martha wrote as she developed a clearer sense of what question she wanted to answer and what answer she predicted. The list shows the increasing refinement that occurred as she continued to read, think, talk, and write.

Early questions: What kinds of LOs do most teachers experience? How do these experiences change teachers’ practices and beliefs? Are some LOs more effective than others? What makes them more effective?

First focused question: What makes LOs for teachers effective for improving their teaching for conceptual understanding?

Question after trying to predict the answer and imagining how to test the prediction: Do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?

Question after developing an initial rationale for her prediction: Under what conditions do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?

Question after developing a more precise prediction and richer rationale: Under what conditions do middle school teachers who lack conceptual knowledge of linear functions benefit from five 2-hour LO sessions that engage them in conceptual learning of linear functions as assessed by changes in their teaching toward a more conceptual emphasis on linear functions?

Part IV. An Illustrative Dialogue

The story of Martha described the major steps she took to refine her thinking. However, there is a lot of work that went on behind the scenes that wasn’t part of the story. For example, Martha had conversations with fellow students and professors that sharpened her thinking. What do these conversations look like? Because they are such an important part of the inquiry process, it will be helpful to “listen in” on the kinds of conversations that students might have with their advisors.

Here is a dialogue between a beginning student, Sam (S), and their advisor, Dr. Avery (A). They are meeting to discuss data Sam collected for a course project. The dialogue below is happening very early on in Sam’s conceptualization of the study, prior even to systematic reading of the literature.

S: Thanks for meeting with me today. As you know, I was able to collect some data for a course project a few weeks ago, but I'm having trouble analyzing the data, so I need your help. Let me try to explain the problem. I wanted to understand what middle-school teachers do to promote girls' achievement in a mathematics class. I conducted four observations in each of three teachers' classrooms. I also interviewed each teacher once about the four lessons I observed, and I interviewed two girls from each of the teachers' classes. Obviously, I have a ton of data. But when I look at all these data, I don't really know what I learned about my topic. When I was observing the teachers, I thought I might have observed some ways the teachers were promoting girls' achievement, but then I wasn't sure how to interpret my data. I didn't know if the things I was observing were actually promoting girls' achievement.

A: What were some of your observations?

S: Well, in a couple of my classroom observations, teachers called on girls to give an answer, even when the girls didn't have their hands up. I thought that this might be a way that teachers were promoting the girls' achievement. But then the girls didn't say anything about that when I interviewed them, and the teachers also didn't do it in every class. So, it's hard to know what effect, if any, this might have had on their learning or their motivation to learn. I didn't want to ask the girls during the interview specifically about the teacher calling on them, and without the girls bringing it up themselves, I didn't know if it had any effect.

Well, why didn’t you want to ask the girls about being called on?

S: Because I wanted to leave it as open as possible; I didn't want to influence what they were going to say. I didn't want to put words in their mouths. I wanted to know what they thought the teacher was doing that promoted their mathematical achievement, and so I only asked the girls general questions, like "Do you think the teacher does things to promote girls' mathematical achievement?" and "Can you describe specific experiences you have had that you believe do and do not promote your mathematical achievement?"

A: So then, how did they answer those general questions?

S: Well, with very general answers, such as that the teacher knows their names, offers review sessions, grades their homework fairly, gives them opportunities to earn extra credit, lets them ask questions, and always answers their questions. Nothing specific that helps me know what teaching actions specifically target girls' mathematics achievement.

A: OK. Any ideas about what you might do next?

S: Well, I remember that when I was planning this data collection for my course, you suggested I might want to be more targeted and specific about what I was looking for. I can see now that more targeted questions would have made my data more interpretable in terms of connecting teaching actions to the mathematical achievement of girls. But I just didn't want to influence what the girls would say.

A: Yes, I remember when you were planning your course project, you wanted to keep it open. You didn't want to miss out on discovering something new and interesting. What do you think now about this issue?

Well, I still don’t want to put words in their mouths. I want to know what they think. But I see that if I ask really open questions, I have no guarantee they will talk about what I want them to talk about. I guess I still like the idea of an open study, but I see that it’s a risky approach. Leaving the questions too open meant I didn’t constrain their responses and there were too many ways they could interpret and answer the questions. And there are too many ways I could interpret their responses.

By this point in the dialogue, Sam has realized that open data (i.e., data not testing a specific prediction) is difficult to interpret. In the next part, Dr. Avery explains why collecting open data was not helping Sam achieve goals for her study that had motivated collecting open data in the first place.

A: Yes, I totally agree. Even for an experienced researcher, it can be difficult to make sense of this kind of open, messy data. However, if you design a study with a more specific focus, you can create questions for participants that are more targeted because you will be interested in their answers to these specific questions. Let's reflect back on your data collection. What can you learn from it for the future?

S: When I think about it now, I realize that I didn't think about the distinction between all the different constructs at play in my study, and I didn't choose which one I was focusing on. One construct was the teaching moves that teachers think could be promoting achievement. Another was what teachers deliberately do to promote girls' mathematics achievement, if anything. Another was the teaching moves that actually do support girls' mathematics achievement. Another was what teachers were doing that supported girls' mathematics achievement versus the mathematics achievement of all students. Another was students' perception of what their teacher was doing to promote girls' mathematics achievement. I now see that any one of these constructs could have been the focus of a study and that I didn't really decide which of these was the focus of my course project prior to collecting data.

A: So, since you told me that the topic of this course project is probably what you'll eventually want to study for your dissertation, which of these constructs are you most interested in?

I think I’m more interested in the teacher moves that teachers deliberately do to promote girls’ achievement. But I’m still worried about asking teachers directly and getting too specific about what they do because I don’t want to bias what they will say. And I chose qualitative methods and an exploratory design because I thought it would allow for a more open approach, an approach that helps me see what’s going on and that doesn’t bias or predetermine the results.

A: Well, it seems to me you are conflating three issues. One issue is how to conduct an unbiased study. Another issue is how specific to make your study. And the third issue is whether or not to choose an exploratory or qualitative study design. Those three issues are not the same. For example, designing a study that's more open or more exploratory is not how researchers make studies fair and unbiased. In fact, it would be quite easy to create an open study that is biased. For example, you could ask very open questions and then interpret the responses in a way that unintentionally, and even unknowingly, aligns with what you were hoping the findings would say. Actually, you could argue that by adding more specificity and narrowing your focus, you're creating constraints that prevent bias. The same goes for an exploratory or qualitative study; they can be biased or unbiased. So, let's talk about what is meant by getting more specific. Within your new focus on what teachers deliberately do, there are many things that would be interesting to look at, such as teacher moves that address math anxiety, moves that allow girls to answer questions more frequently, moves that are specifically fitted to student thinking about specific mathematical content, and so on. What are one or two things that are most interesting to you? One way to answer this question is by thinking back to where your interest in this topic began.

In the preceding part of the dialogue, Dr. Avery explained how the goals Sam had for their study were not being met with open data. In the next part, Sam begins to articulate a prediction, which Sam and Dr. Avery then sharpen.

S: Actually, I became interested in this topic because of an experience I had in college when I was in a class of mostly girls. During whole class discussions, we were supposed to critically evaluate each other's mathematical thinking, but we were too polite to do that. Instead, we just praised each other's work. But it was so different in our small groups. It seemed easier to critique each other's thinking and to push each other to better solutions in small groups. I began wondering how to get girls to be more critical of each other's thinking in a whole class discussion in order to push everyone's thinking.

A: Okay, this is great information. Why not use this idea to zoom in on a more manageable and interpretable study? You could look specifically at how teachers support girls in critically evaluating each other's thinking during whole class discussions. That would be a much more targeted and specific topic. Do you have predictions about what teachers could do in that situation, keeping in mind that you are looking specifically at girls' mathematical achievement, not students in general?

S: Well, what I noticed was that small groups provided more social and emotional support for girls, whereas the whole class discussion did not provide that same support. The girls felt more comfortable critiquing each other's thinking in small groups. So, I guess I predict that when the social and emotional supports that are present in small groups are extended to the whole class discussion, girls would be more willing to evaluate each other's mathematical thinking critically during whole class discussion. I guess ultimately, I'd like to know how the whole class discussion could be used to enhance, rather than undermine, the social and emotional support that is present in the small groups.

A: Okay, then where would you start? Would you start with a study of what the teachers say they will do during whole class discussion and then observe if that happens during whole class discussion?

S: But part of my prediction also involves the small groups. So, I'd also like to include small groups in my study if possible. If I focus only on the whole class discussion, I won't be exploring what I am interested in. My interest is broader than just the whole class discussion.

A: That makes sense, but there are many different things you could look at as part of your prediction, more than you can do in one study. For instance, if your prediction is that when the social and emotional supports that are present in small groups are extended to whole class discussions, girls would be more willing to evaluate each other's mathematical thinking critically during whole class discussions, then you could ask the following questions: What are the social and emotional supports that are present in small groups? In which small groups do they exist? Is it groups that are made up only of girls? Does every small group do this, and for groups that do this, when do these supports get created? What kinds of small group activities are associated with these supports? Do the same social and emotional supports that apply to small groups even apply to whole group discussion?

S: All your questions make me realize that my prediction about extending social and emotional supports to whole class discussions first requires me to have a better understanding of the social and emotional supports that exist in small groups. In fact, I first need to find out whether those supports commonly exist in small groups or whether that is just my experience of working in small groups. So, I think I will first have to figure out what small groups do to support each other and then, in a later study, I could ask a teacher to implement those supports during whole class discussions and find out how to do that. Yeah, now I'm seeing that.

The previous part of the dialogue illustrates how continuing to ask questions about one’s initial prediction is a good way to make it more and more precise (and researchable). In the next part, we see how developing a precise prediction has the added benefit of setting the researcher up for future studies.

A: Yes, I agree that for your first study, you should probably look at small groups. In other words, you should focus on only a part of your prediction for now, namely the part that says there are social and emotional supports in small groups that support girls in critiquing each other's thinking. That begins to sharpen the focus of your prediction, but you'll want to continue to refine it. For example, right now, the question that this prediction leads to is a question with a yes or no answer, but what you've said so far suggests to me that you are looking for more than that.

S: Yes, I want to know more than just whether there are supports. I'd like to know what kinds. That's why I wanted to do a qualitative study.

A: Okay, this aligns more with my thinking about research as being prediction driven. It's about collecting data that would help you revise your existing predictions into better ones. What I mean is that you would focus on collecting data that would allow you to refine your prediction, make it more nuanced, and go beyond what is already known. Does that make sense, and if so, what would that look like for your prediction?

S: Oh yes, I like that. I guess that would mean that, based on the data I collect for this next study, I could develop a more refined prediction that, for example, more specifically identifies and differentiates between different kinds of social and emotional supports that are present in small groups, or maybe that identifies the kinds of small groups that they occur in, or that predicts when and how frequently or infrequently they occur, or that identifies the features of the small group tasks in which they occur, etc. I now realize that, although I chose qualitative research to make my study more open, the real reason qualitative research fits my purposes is that it will allow me to explore fine-grained aspects of social and emotional supports that may exist for girls in small groups.

A: Yes, exactly! And then, based on the data you collect, you can include in your revised prediction those new fine-grained aspects. Furthermore, you will have a story to tell about your study in your written report, namely the story about your evolving prediction. In other words, your written report can largely tell how you filled out and refined your prediction as you learned more from carrying out the study. And even though you might not use them right away, you are also going to develop new predictions about social and emotional supports in small groups, and about your aim of extending them to whole-class discussions, that you would not have thought of had you not done this study. That will set you up to follow up on those new predictions in future studies. For example, you might have more refined ideas after you collect the data about the goals for critiquing student thinking in small groups versus the goals for critiquing student thinking during whole class discussion. You might even begin to think that some of the social and emotional supports you observe are not replicable in, or even applicable or appropriate to, whole-class discussions, because the supports play different roles in different contexts. So, to summarize what I'm saying, what you look at in this study, even though it will be very focused, sets you up for a research program that will allow you to more fully investigate your broader interest in this topic, where each new study builds on your prior body of work. That's why it is so important to be explicit about the best place to start this research, so that you can build on it.

S: I see what you are saying. We started this conversation talking about my course project data. What I think I should have done was figure out explicitly what I needed to learn from that study, with the intention of then taking what I learned and using it as the basis for the next study. I didn't do that, and so I didn't collect data that pushed my thinking forward in ways that would guide my next study. It's as if I'm starting over with my next study.

Sam and Dr. Avery have just explored how specifying a prediction reveals additional complexities that could become fodder for developing a systematic research program. Next, we watch Sam beginning to recognize the level of specificity required for a prediction to be testable.

A: One thing that would have really helped would have been if you had had a specific prediction going into your data collection for your course project.

Well, I didn’t really have much of an explicit prediction in mind when I designed my methods.

A: Think back: you must have had some kind of prediction, even if it was implicit.

S: Well, yes, I guess I was predicting that teachers would enact moves that supported girls' mathematical achievement. And I observed classrooms to identify those teacher moves, I interviewed teachers to ask them about the moves I observed, and I interviewed students to see if they mentioned those moves as promoting their mathematical achievement. The goal of my course project was to identify teacher moves that support girls' mathematical achievement. And my specific research question was: What teacher moves support girls' mathematical achievement?

A: So, really you were asking the teacher and students to show and tell you what those moves are and the effects of those moves, as a result putting the onus on your participants to provide the answers to your research question for you. I have an idea: let's try a thought experiment. You come up with data collection methods for testing the prediction that there are social and emotional supports in small groups that support girls in critiquing each other's thinking that still put the onus on the participants. And then I'll see if I can think of data collection methods that would not put the onus on the participants.

S: Hmm, well... I guess I could simply interview girls who participated in small groups and ask them, "Are there social and emotional supports that you use in small groups that support your group in critiquing each other's thinking, and if so, what are they?" In that case, I would be putting the onus on them to be aware of the social dynamics of small groups and to have thought about these constructs as much as I have. Okay, now can you continue the thought experiment? What might the data collection methods look like if I didn't put the onus on the participants?

First, I would pick a setting in which it was only girls at this point to reduce the number of variables. Then, personally I would want to observe a lot of groups of girls interacting in groups around tasks. I would be looking for instances when the conversation about students’ ideas was shut down and instances when the conversation about students’ ideas involved critiquing of ideas and building on each other’s thinking. I would also look at what happened just before and during those instances, such as: did the student continue to talk after their thinking was critiqued, did other students do anything to encourage the student to build on their own thinking (i.e., constructive criticism), or how did they support or shut down continued participation? In fact, now that I think about it, “critiquing each other’s thinking” can be defined in a number of different ways. It could mean just commenting on someone’s thinking, judging correctness and incorrectness, constructive criticism that moves the thinking forward, etc. If you put the onus on the participants to answer your research question, you are stuck with their definition, and they won’t have thought about this very much, if at all.

I think that what you are also saying is that my definitions would affect my data collection. If I think that critiquing each other’s thinking means that the group moves their thinking forward toward more valid and complete mathematical solutions, then I’m going to focus on different moves than if I define it another way, such as just making a comment on each other’s thinking and making each other feel comfortable enough to keep participating. In fact, am I going to look at individual instances of critiquing or look at entire sequences in which the critiquing leads to a goal? This seems like a unit of analysis question, and I would need to develop a more nuanced prediction that would make explicit what that unit of analysis is.

I agree, your definition of “critiquing each other’s thinking” could entirely change what you are predicting. One prediction could be based on defining critiquing as a one-shot event in which someone makes one comment on another person’s thinking. In this case the prediction would be that there are social and emotional supports in small groups that support girls in making an evaluative comment on another student’s thinking. Another prediction could be based on defining critiquing as a back-and-forth process in which the thinking gets built on and refined. In that case, the prediction would be something like that there are social and emotional supports in small groups that support girls in critiquing each other’s thinking in ways that do not shut down the conversation but that lead to sustained conversations that move each other toward more valid and complete solutions.

Well, I think I am more interested in the second prediction because it is more compatible with my long-term interest in extending small group supports to whole class discussions. The second prediction is more appropriate for eventually looking at girls in whole class discussion. During whole class discussion, the teacher tries to get a sustained conversation going that moves the students’ thinking forward. So, if I learn about small group supports that lead to sustained conversations that move each other toward more valid and complete solutions, those supports might transfer to whole class discussions.

In the previous part of the dialogue, Dr. Avery and Sam showed how narrowing down a prediction to one that is testable requires making numerous important decisions, including how to define the constructs referred to in the prediction. In the final part of the dialogue, Dr. Avery and Sam begin to outline the reading Sam will have to do to develop a rationale for the specific prediction.

Do you see how your prediction and definitions are getting more and more specific? You now need to read extensively to further refine your prediction.

Well, I should probably read about micro dynamics of small group interactions, anything about interactions in small groups, and what is already known about small group interactions that support sustained conversations that move students’ thinking toward more valid and complete solutions. I guess I could also look at research on whole-class discussion methods that support sustained conversations that move the class to more mathematically valid and complete solutions, because it might give me ideas for what to look for in the small groups. I might also need to focus on research about how learners develop understandings about a particular subject matter so that I know what “more valid and complete solutions” look like. I also need to read about social and emotional supports but focus on how they support students cognitively, rather than in other ways.

Sounds good, let’s get together after you have processed some of this literature and we can talk about refining your prediction based on what you read and also the methods that will best suit testing that prediction.

Great! Thanks for meeting with me. I feel like I have a much better set of tools that push my own thinking forward and allow me to target something specific that will lead to more interpretable data.

Part V. Is It Always Possible to Formulate Hypotheses?

In Chap. 1, we noted that you are likely to read that research does not require formulating hypotheses. Some sources describe doing research without making predictions and developing rationales for these predictions. Some researchers say you cannot always make predictions—you do not know enough about the situation. In fact, some argue for the value of not making predictions (e.g., Glaser & Holton, 2004; Merton, 1968; Nemirovsky, 2011). These are important points of view, so we will devote this section to discussing them.

Can You Always Predict What You Will Find?

One reason some researchers say you do not need to make predictions is that it can be difficult to imagine what you will find. This argument comes up most often for descriptive studies. Suppose you want to describe the nature of a situation you do not know much about. Can you still make a prediction about what you will find? We believe that, although you do not know exactly what you will find, you probably have a hunch or, at a minimum, a very fuzzy idea. It would be unusual to ask a question about a situation you want to know about without at least a fuzzy inkling of what you might find; the original question just would not occur to you. We acknowledge you might have only a vague idea of what you will find and you might not have much confidence in your prediction. However, we expect that if you monitor your own thinking you will discover you have developed a suspicion along the way, regardless of how vague the suspicion might be. Through the cyclic process we discussed above, that suspicion or hunch gradually evolves into a prediction.

The Benefits of Making Predictions Even When They Are Wrong: An Example from the 1970s

One of us was a graduate student at the University of Wisconsin in the late 1970s, assigned as a research assistant to a project that was investigating young children’s thinking about simple arithmetic. A new curriculum was being written, and the developers wanted to know how to introduce the earliest concepts and skills to kindergarten and first-grade children. The directors of the project did not know what to expect because, at the time, there was little research on five- and six-year-olds’ pre-instruction strategies for adding and subtracting.

After consulting what literature was available, talking with teachers, analyzing the nature of different types of addition and subtraction problems, and debating with each other, the research team formulated some hypotheses about children’s performance. Following the usual assumptions at the time and recognizing the new curriculum would introduce the concepts, the researchers predicted that, before instruction, most children would not be able to solve the problems. Based on the rationale that some young children did not yet recognize the simple form for written problems (e.g., 5 + 3 = ___), the researchers predicted that the best chance for success would be to read problems as stories (e.g., Jesse had 5 apples and then found 3 more. How many does she have now?). They reasoned that, even though children would have difficulty on all the problems, some story problems would be easier because the semantic structure is easier to follow. For example, they predicted the above story about adding 3 apples to 5 would be easier than a problem like, “Jesse had some apples in the refrigerator. She put in 2 more and now has 6. How many were in the refrigerator at the beginning?” Based on the rationale that children would need to count to solve the problems and that it can be difficult to keep track of the numbers, they predicted children would be more successful if they were given counters. Finally, accepting the common reasoning that larger numbers are more difficult than smaller numbers, they predicted children would be more successful if all the numbers in a problem were below 10.

Although these predictions were not very precise and the rationales were not strongly convincing, these hypotheses prompted the researchers to design the study to test their predictions. This meant they would collect data by presenting a variety of problems under a variety of conditions. Because the goal was to describe children’s thinking, problems were presented to students in individual interviews. Problems with different semantic structures were included, counters were available for some problems but not others, and some problems had sums to 9 whereas others had sums to 20 or more.

The punchline of this story is that gathering data under these conditions, prompted by the predictions, made all the difference in what the researchers learned. Contrary to predictions, children could solve addition and subtraction problems before instruction. Counters were important because almost all the solution strategies were based on counting, which meant that memory was an issue because many strategies required counting in two ways simultaneously. For example, subtracting 4 from 7 was usually solved by counting down from 7 while counting up from 1 to 4 to keep track of the counting down. Because children acted out the stories with their counters, the semantic structure of the story was also important. Stories that were easier to read and write were also easier to solve.

To make a very long story very short, other researchers were, at about the same time, reporting similar results about children’s pre-instruction arithmetic capabilities. A clear pattern emerged regarding the relative difficulty of different problem types (semantic structures) and the strategies children used to solve each type. As the data were replicated, the researchers recognized that kindergarten and first-grade teachers could make good use of this information when they introduced simple arithmetic. This is how Cognitively Guided Instruction (CGI) was born (Carpenter et al., 1989; Fennema et al., 1996).

To reiterate, the point of this example is that the study conducted to describe children’s thinking would have looked quite different if the researchers had made no predictions. They would have had no reason to choose the particular problems and present them under different conditions. The fact that some of the predictions were completely wrong is not the point. The predictions created the conditions under which the predictions were tested which, in turn, created learning opportunities for the researchers that would not have existed without the predictions. The lesson is that even research that aims to simply describe a phenomenon can benefit from hypotheses. As signaled in Chap. 1, this also serves as another example of “failing productively.”

Suggestions for What to Do When You Do Not Have Predictions

There likely are exceptions to our claim about being able to make a prediction about what you will find. For example, there could be rare cases where researchers truly have no idea what they will find, can come up with no predictions or even hunches, and can locate no reported research on related phenomena that would offer guidance. If you find yourself in this position, we suggest one of three approaches: revise your question, conduct a pilot study, or choose another question.

Because there are many advantages to making predictions explicit and then writing out the reasons for these predictions, one approach is to adjust your question just enough to allow you to make a prediction. Perhaps you can build on descriptions that other researchers have provided for related situations and consider how you can extend this work. Building on previous descriptions will enable you to make predictions about the situation you want to describe.

A second approach is to conduct a small pilot study or, better, a series of small pilot studies to develop some preliminary ideas of what you might find. If you can identify a small sample of participants who are similar to those in your study, you can try out at least some of your research plans to help make and refine your predictions. As we detail later, you can also use pilot studies to check whether key aspects of your methods (e.g., tasks, interview questions, data collection methods) work as you expect.

A third approach is to return to your list of interests and choose one that has been studied previously. Sometimes this is the wisest choice. It is very difficult for beginning researchers to conduct research in brand-new areas where no hunches or predictions are possible. In addition, the contributions of this research can be limited. Recall the earlier story about one of us “failing productively” by completing a dissertation in a somewhat new area. If, after an exhaustive search, you find that no one has investigated the phenomenon in which you are interested or even related phenomena, it can be best to move in a different direction. You will read recommendations in other sources to find a “gap” in the research and develop a study to “fill the gap.” This can be helpful advice if the gap is very small. However, if the gap is large, too large to predict what you might find, the study will present severe challenges. It will be more productive to extend work that has already been done than to launch into an entirely new area.

Should You Always Try to Predict What You Will Find?

In short, our answer to the question in the heading is “yes.” But this calls for further explanation.

Suppose you want to observe a second-grade classroom in order to investigate how students talk about adding and subtracting whole numbers. You might think, “I don’t want to bias my thinking; I want to be completely open to what I see in the classroom.” Sam shared a similar point of view at the beginning of the dialogue: “I wanted to leave it as open as possible; I didn’t want to influence what they were going to say.” Some researchers say that beginning your research study by making predictions is inappropriate precisely because it will bias your observations and results. The argument is that by bringing a set of preconceptions, you will confirm what you expected to find and be blind to other observations and outcomes. The following quote illustrates this view: “The first step in gaining theoretical sensitivity is to enter the research setting with as few predetermined ideas as possible—especially logically deducted, a priori hypotheses. In this posture, the analyst is able to remain sensitive to the data by being able to record events and detect happenings without first having them filtered through and squared with pre-existing hypotheses and biases” (Glaser, 1978, pp. 2–3).

We take a different point of view. In fact, we believe there are several compelling reasons for making your predictions explicit.

Making Your Predictions Explicit Increases Your Chances of Productive Observations

Because your predictions are an extension of what is already known, they prepare you to identify more nuanced relationships that can advance our understanding of a phenomenon. For example, rather than simply noticing, in a general sense, that students talking about addition and subtraction leads them to better understandings, you might, based on your prediction, make the specific observation that talking about addition and subtraction in a particular way helps students to think more deeply about a particular concept related to addition and subtraction. Going into a study without predictions can bring less sensitivity rather than more to the study of a phenomenon. Drawing on knowledge about related phenomena by reading the literature and conducting pilot studies allows you to be much more sensitive and your observations to be more productive.

Making Your Predictions Explicit Allows You to Guard Against Biases

Some genres and methods of educational research are, in fact, rooted in philosophical traditions (e.g., Husserl, 1929/1973) that explicitly call for researchers to temporarily “bracket,” or set aside, existing theory as well as their prior knowledge and experience to better enter into the experience of the participants in the research. However, this does not mean ignoring one’s own knowledge and experience or turning a blind eye to what has been learned by others. Much more than the simplistic image of emptying one’s mind of preconceptions and implicit biases (arguably an impossible feat to begin with), the goal is to be as reflective as possible about one’s prior knowledge and conceptions and as transparent as possible about how they may guide observations and shape interpretations (Levitt et al., 2018).

We believe it is better to be honest about the predictions you are almost sure to have because then you can deliberately plan to minimize the chances they will influence what you find and how you interpret your results. For starters, it is important to recognize that acknowledging you have some guesses about what you will find does not make them more influential. Because you are likely to have them anyway, we recommend being explicit about what they are. It is easier to deal with biases that are explicit than those that lurk in the background and are not acknowledged.

What do we mean by “deal with biases”? Some journals require you to include a statement about your “positionality” with respect to the participants in your study and the observations you are making to gather data. Formulating clear hypotheses is, in our view, a direct response to this request. The reasons for your predictions are your explicit statements about your positionality. Often there are methodological strategies you can use to protect the study from undue influences of bias. In other words, making your vague predictions explicit can help you design your study so you minimize the bias of your findings.

Making Your Predictions Explicit Can Help You See What You Did Not Predict

Making your predictions explicit does not need to blind you to what is different than expected. It does not need to force you to see only what you want to see. Instead, it can actually increase your sensitivity to noticing features of the situation that are surprising, features you did not predict. Results can stand out when you did not expect to see them.

In contrast, not bringing your biases to consciousness might subtly shift your attention away from these unexpected results in ways that you are not aware of. This path can lead to claiming no biases and no unexpected findings without being conscious of them. You cannot observe everything, and some things inevitably will be overlooked. If you have predicted what you will see, you can design your study so that the unexpected results become more salient rather than less.

Returning to the example of observing a second-grade classroom, we note that the field already knows a great deal about how students talk about addition and subtraction. Being cognizant of what others have observed allows you to enter the classroom with some clear predictions about what will happen. The rationales for these predictions are based on all the related knowledge you have before stepping into the classroom, and the predictions and rationales help you to better deal with what you see. This is partly because you are likely to be surprised by the things you did not anticipate. There is almost always something that will surprise you because your predictions will almost always be incomplete or too general. This sensitivity to the unanticipated—the sense of surprise that sparks your curiosity—is an indication of your openness to the phenomenon you are studying.

Making Your Predictions Explicit Allows You to Plan in Advance

Recall from Chap. 1 the descriptor of scientific inquiry: “Experience carefully planned in advance.” If you make no predictions about what might happen, it is very difficult, if not impossible, to plan your study in advance. Again, you cannot observe everything, so you must make decisions about what you will observe. What kind of data will you plan to collect? Why would you collect these data instead of others? If you have no idea what to expect, on what basis will you make these consequential decisions? Even if your predictions are vague and your rationales for the predictions are a bit shaky, at least they provide a direction for your plan. They allow you to explain why you are planning this study and collecting these data. They allow you to “carefully plan in advance.”

Making Your Predictions Explicit Allows You to Put Your Rationales in Harm’s Way

Rationales are developed to justify the predictions. Rationales represent your best reasoning about the research problem you are studying. How can you tell whether your reasoning is sound? You can try it out with colleagues. However, the best way to test it is to put it in “harm’s way” (Cobb, Confrey, diSessa, Lehrer, & Schauble, 2003, p. 10). And the best approach to putting your reasoning in harm’s way is to test the predictions it generates. Regardless of whether you are conducting a qualitative or quantitative study, rationales can be improved only if they generate testable predictions. This is possible only if predictions are explicit and precise. As we described earlier, rationales are evaluated for their soundness and refined in light of the specific differences between predictions and empirical observations.

Making Your Predictions Explicit Forces You to Organize and Extend Your (and the Field’s) Thinking

By writing out your predictions (even hunches or fuzzy guesses) and by reflecting on why you have these predictions and making these reasons explicit for yourself, you are advancing your thinking about the questions you really want to answer. This means you are making progress toward formulating your research questions and your final hypotheses. Making more progress in your own thinking before you conduct your study increases the chances your study will be of higher quality and will be exactly the study you intended. Making predictions, developing rationales, and imagining tests are tools you can use to push your thinking forward before you even collect data.

Suppose you wonder how preservice teachers (PSTs) in your university’s teacher preparation program will solve particular kinds of math problems. You are interested in this question because you have noticed several PSTs solve them in unexpected ways. As you ask the question you want to answer, you make predictions about what you expect to see. When you reflect on why you made these predictions, you realize that some PSTs might use particular solution strategies because they were taught to use some of them in an earlier course, and they might believe you expect them to solve the problems in these ways. By being explicit about why you are making particular predictions, you realize that you might be answering a different question than you intend (“How much do PSTs remember from previous courses?” or even “To what extent do PSTs believe different instructors have similar expectations?”). Now you can either change your question or change the design of your study (e.g., the sample of students you will use) or both. You are advancing your thinking by being explicit about your predictions and why you are making them.

The Costs of Not Making Predictions

Avoiding making predictions, for whatever reason, comes with significant costs. It prevents you from learning very much about your research topic. It would require not reading related research, not talking with your colleagues, and not conducting pilot studies because, if you do, you are likely to find a prediction creeping into your thinking. Not doing these things would forego the benefits of advancing your thinking before you collect data. It would amount to conducting the study with as little forethought as possible.

Part VI. How Do You Formulate Important Hypotheses?

We provided a partial answer in Chap. 1 to the question of a hypothesis’s importance when we encouraged considering the ultimate goal to which a study’s findings might contribute. You might want to reread Part III of Chap. 1, where we offered our opinions about the purposes of doing research. We also recommend reading the March 2019 editorial in the Journal for Research in Mathematics Education (Cai et al., 2019b), in which we address what constitutes important educational research.

As we argued in Chap. 1 and in the March 2019 editorial, a worthy ultimate goal for educational research is to improve the learning opportunities for all students. However, arguments can be made for other ultimate goals as well. To gauge the importance of your hypotheses, think about how clearly you can connect them to a goal the educational community considers important. In addition, given the descriptors of scientific inquiry proposed in Chap. 1, think about how testing your hypotheses will help you (and the community) understand what you are studying. Will you have a better explanation for the phenomenon after your study than before?

Although we address the question of importance again, and in more detail, in Chap. 5, it is useful to know here that you can determine the significance or importance of your hypotheses when you formulate them. The importance need not depend on the data you collect or the results you report. The importance can come from the fact that, based on the results of your study, you will be able to offer revised hypotheses that help the field better understand an important issue. In large part, it is these revised hypotheses rather than the data that determine a study’s importance.

A critical caveat to this discussion is that few hypotheses are self-evidently important. They are important only if you make the case for their importance. Even if you follow closely the guidelines we suggest for formulating an important hypothesis, you must develop an argument that convinces others. This argument will be presented in the research paper you write.

Consider Martha’s hypothesis presented earlier. When we left Martha, she predicted that “Participating teachers will show changes in their teaching with a greater emphasis on conceptual understanding with larger changes on linear function topics directly addressed in the LOs than on other topics.” For researchers and educators not intimately familiar with this area of research, it is not apparent why someone should spend a year or more conducting a dissertation to test this prediction. Her rationale, summarized earlier, begins to describe why this could be an important hypothesis. But it is by writing a clear argument that explains her rationale to readers that she will convince them of its importance.

How Martha fills in her rationale so she can create a clear written argument for its importance is taken up in Chap. 3 . As we indicated, Martha’s work in this regard led her to make some interesting decisions, in part due to her own assessment of what was important.

Part VII. Beginning to Write the Research Paper for Your Study

It is common to think that researchers conduct a study and then, after the data are collected and analyzed, begin writing the paper about the study. We recommend an alternative, especially for beginning researchers. We believe it is better to write drafts of the paper at the same time you are planning and conducting your study. The paper will gradually evolve as you work through successive phases of the scientific inquiry process. Consequently, we will call this paper your evolving research paper.

You will use your evolving research paper to communicate your study, but you can also use writing as a tool for thinking and organizing your thinking while planning and conducting the study. Used as a tool for thinking, you can write drafts of your ideas to check on the clarity of your thinking, and then you can step back and reflect on how to clarify it further. Be sure to avoid jargon and general terms that are not well defined. Ask yourself whether someone not in your field, maybe a sibling, a parent, or a friend, would be able to understand what you mean. You are likely to write multiple drafts with lots of scribbling, crossing out, and revising.

Used as a tool for communicating, writing the best version of what you know before moving to the next phase will help you record your decisions and the reasons for them before you forget important details. This best-version-for-now paper also provides the basis for your thinking about the next phase of your scientific inquiry.

At this point in the process, you will be writing your (research) questions, the answers you predict, and the rationales for your predictions. The predictions you make should be direct answers to your research questions and should flow logically from (or be directly supported by) the rationales you present. In addition, you will have a written statement of the study’s purpose or, said another way, an argument for the importance of the hypotheses you will be testing. It is in the early sections of your paper that you will convince your audience about the importance of your hypotheses.

In our experience, presenting research questions is a more common form of stating the goal of a research study than presenting well-formulated hypotheses. Authors sometimes present a hypothesis, often as a simple prediction of what they might find. The hypothesis is then forgotten and not used to guide the analysis or interpretations of the findings. In other words, authors seldom use hypotheses to do the kind of work we describe. This means that many research articles you read will not treat hypotheses as we suggest. We believe these are missed opportunities to present research in a more compelling and informative way. We intend to provide enough guidance in the remaining chapters for you to feel comfortable organizing your evolving research paper around formulating, testing, and revising hypotheses.

While we were editing one of the leading research journals in mathematics education (JRME), we conducted a study of reviewers’ critiques of papers submitted to the journal. Two of the five most common concerns were: (1) the research questions were unclear, and (2) the answers to the questions did not make a substantial contribution to the field. These are likely to be major concerns for the reviewers of all research journals. We hope the knowledge and skills you have acquired working through this chapter will allow you to write the opening to your evolving research paper in a way that addresses these concerns. Much of the chapter should help make your research questions clear, and the prior section on formulating “important hypotheses” will help you convey the contribution of your study.

Exercise 2.3

Look back at your answers to the sets of questions before part II of this chapter.

Think about how you would argue for the importance of your current interest.

Write your interest in the form of (1) a research problem, (2) a research question, and (3) a prediction with the beginnings of a rationale. You will update these as you read the remaining chapters.

Part VIII. The Heart of Scientific Inquiry

In this chapter, we have described the process of formulating hypotheses. This process is at the heart of scientific inquiry. It is where doing research begins. Conducting research always involves formulating, testing, and revising hypotheses. This is true regardless of your research questions and whether you are using qualitative, quantitative, or mixed methods. Without engaging in this process in a deliberate, intense, relentless way, your study will reveal less than it could. By engaging in this process, you are maximizing what you, and others, can learn from conducting your study.

In the next chapter, we build on the ideas we have developed in the first two chapters to describe the purpose and nature of theoretical frameworks . The term theoretical framework, along with closely related terms like conceptual framework, can be somewhat mysterious for beginning researchers and can seem like a requirement for writing a paper rather than an aid for conducting research. We will show how theoretical frameworks grow from formulating hypotheses—from developing rationales for the predicted answers to your research questions. We will propose some practical suggestions for building theoretical frameworks and show how useful they can be. In addition, we will continue Martha’s story from the point at which we paused earlier—developing her theoretical framework.



Author information

Authors and Affiliations

School of Education, University of Delaware, Newark, DE, USA

James Hiebert, Anne K Morris & Charles Hohensee

Department of Mathematical Sciences, University of Delaware, Newark, DE, USA

Jinfa Cai & Stephen Hwang


Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.


Copyright information

© 2023 The Author(s)

About this chapter

Hiebert, J., Cai, J., Hwang, S., Morris, A.K., Hohensee, C. (2023). How Do You Formulate (Important) Hypotheses?. In: Doing Research: A New Researcher’s Guide. Research in Mathematics Education. Springer, Cham. https://doi.org/10.1007/978-3-031-19078-0_2



Hypothesis | Definition & Meaning

A hypothesis is a claim or statement  that makes sense in the context of some information or data at hand but hasn’t been established as true or false through experimentation or proof.

In mathematics, any statement or equation that describes a relationship between certain variables can be termed a hypothesis if it is consistent with some initial supporting data or information but has yet to be proven true or false by a definitive, trustworthy experiment or mathematical law.

The following example illustrates one such hypothesis and sheds some light on this fundamental concept, which is used in many areas of mathematics.

Example of Hypothesis

Figure 1: Example of Hypothesis

Here we consider the example of a young startup company that manufactures state-of-the-art batteries. The company's hypothesis, or claim, is that its batteries have a mean life of more than 1000 hours. It is tempting to think the company could prove this claim with a testing experiment in its lab.

However, the claim can only be established if at least one batch of production batteries has actually been deployed in the real world for more than 1000 hours. After 1000 hours, data needs to be collected to assess the probability of the claim being true.
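Once such field data exist, the claim could be checked with a one-sided test on the mean. The sketch below is illustrative only: the lifetimes, the assumed population standard deviation, and the helper name are all hypothetical, and a z-test with known sigma is just one way to frame the check.

```python
from statistics import NormalDist, mean

def z_test_mean_greater(sample, mu0, sigma):
    """One-sided z-test of H0: mu = mu0 against Ha: mu > mu0 (sigma assumed known)."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)   # standardized test statistic
    return z, 1 - NormalDist().cdf(z)               # right-tail area is the p-value

# Hypothetical lifetimes (hours) observed for 12 deployed batteries.
lifetimes = [1030, 1005, 998, 1043, 1021, 1012, 989, 1050, 1017, 1026, 1008, 1035]
z, p = z_test_mean_greater(lifetimes, mu0=1000, sigma=20)
# A small p-value would count as evidence for the claim "mean life > 1000 hours".
```

A small p-value here would support rejecting "the mean life is 1000 hours" in favor of the company's claim.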

The following paragraphs further explain this concept.

As explained in the earlier example, a hypothesis in mathematics is an untested claim that is backed up by all the known data, other discoveries, or some weak experiments.

In any mathematical discovery, we first start by assuming something or some relationship. This assumed statement is called a supposition. A supposition becomes a hypothesis, however, when it is supported by all available data and by the absence of contradictory findings.

The hypothesis is an important part of the scientific method that is widely used today for making new discoveries, a process that mathematics inherited. The following figure shows this cycle as a graphic:


Figure 2: Role of Hypothesis in the Scientific Method 

The above figure shows a simplified version of the scientific method. Whenever a supposition is supported by some data, it is termed a hypothesis. Once a hypothesis is proven by some well-known and widely accepted experiment or proof, it becomes a law. If the hypothesis is rejected by contradictory results, the supposition is revised and the cycle continues.

Let's try to understand the scientific method and the concept of a hypothesis with the help of an example. Say a teacher wants to analyze the relationship between students' performance in a certain course, call it A, and whether or not they studied a minor course, call it B.

The teacher puts forth a supposition that students taking course B prior to course A must perform better in the latter due to the obvious linkages in key concepts. Because of this supporting linkage, the supposition can be termed a hypothesis.

However, to test the hypothesis, the teacher has to collect data from all of the students, recording which students have taken course B and which haven't. Then, at the end of the semester, the students' performance must be measured and compared with their course-B enrollments.

If the students who took course B prior to course A perform better, the hypothesis is supported. Otherwise, the supposition may need revision.

The following figure explains this problem graphically.


Figure 3: Teacher and Course Example of Hypothesis
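Once the semester's scores are in, the teacher's hypothesis could be checked by comparing the two groups' mean scores. One simple, assumption-light way to do this is a permutation test; the scores and function name below are invented purely for illustration.

```python
import random
from statistics import mean

def permutation_p_value(group_b, group_no_b, trials=10_000, seed=0):
    """Approximate one-sided p-value for H0: taking course B makes no difference.

    Repeatedly shuffles the pooled scores and counts how often a random split
    produces a mean difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_no_b)
    pooled = group_b + group_no_b
    n = len(group_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        if mean(pooled[:n]) - mean(pooled[n:]) >= observed:
            hits += 1
    return hits / trials

# Hypothetical course-A scores for students who did / did not take course B first.
took_b = [78, 85, 90, 72, 88, 95, 81, 84]
no_b = [70, 65, 80, 60, 75, 68, 73, 66]
p = permutation_p_value(took_b, no_b)
# A small p suggests rejecting "course B makes no difference to course-A scores".
```

If the resulting p is small, the data support the teacher's hypothesis; otherwise the supposition may need revision, exactly as described above.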

Important Terms Related to Hypothesis

To further elaborate on the concept of a hypothesis, we first need to understand a few key terms that are widely used in this area, such as conjecture, contradiction, and some special types of hypotheses (simple, complex, null, alternative, empirical, statistical). These terms are briefly explained below:

A conjecture is a term used to describe a mathematical assertion that has not been proved. While testing may occasionally turn up millions of examples in favour of a conjecture, most experts in the area will typically only accept a proof. In mathematics, this term is synonymous with the term hypothesis.

In mathematics, a contradiction occurs if the results of an experiment or proof are against some hypothesis.  In other words, a contradiction discredits a hypothesis.

A simple hypothesis is a type of hypothesis that claims there is a correlation between two variables. The first is known as the dependent variable, while the second is known as the independent variable.

A complex hypothesis is a type of hypothesis that claims there is a correlation between more than two variables. Both the dependent and independent variables in this hypothesis may be more than one in number.

A null hypothesis, usually denoted by H0, is a type of hypothesis that claims there is no statistical relationship or significance between two sets of observed data or measured phenomena. In short, the variables are independent.

An alternative hypothesis, usually denoted by H1 or Ha, is a type of hypothesis in which the variables may be statistically influenced by some unknown factors or variables. In short, the variables depend on some unknown phenomenon.

An empirical hypothesis is a type of hypothesis that is built on top of some empirical data, experiment, or formulation.

A statistical hypothesis is a type of hypothesis that is built on top of some statistical data, experiment, or formulation. It may be logical or illogical in nature.

According to the Riemann hypothesis, the Riemann zeta function has zeros only at the negative even integers and at complex numbers with real part 1/2. It is regarded by many as the most significant open problem in pure mathematics.


Figure 4: Riemann Hypothesis

The Riemann hypothesis is the most well-known mathematical conjecture, and it has been the subject of innumerable proof efforts.

Numerical Examples

Identify the hypothesis and conclusion in each of the following statements. Also state whether the conclusion supports the hypothesis.

Part (a): If 30x = 30, then x = 1

Part (b): If 10x + 2 = 50, then x = 24

Solution Part (a)

Hypothesis: 30x = 30

Conclusion: x = 1

Supports Hypothesis: Yes

Solution Part (b)

Hypothesis: 10x + 2 = 50

Conclusion: x = 24

Supports Hypothesis: No (solving 10x + 2 = 50 gives x = 4.8, not 24)
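Both parts can be sanity-checked mechanically by substituting each claimed conclusion back into its hypothesis. The helper below is a hypothetical illustration, not part of the original exercise.

```python
def supports(coeff, const, rhs, claimed_x):
    """Return True if the claimed conclusion satisfies coeff*x + const == rhs."""
    return coeff * claimed_x + const == rhs

# Part (a): hypothesis 30x = 30, claimed conclusion x = 1
part_a = supports(30, 0, 30, 1)
# Part (b): hypothesis 10x + 2 = 50, claimed conclusion x = 24
part_b = supports(10, 2, 50, 24)
x_b = (50 - 2) / 10  # the value that would actually support part (b): 4.8
```

Substitution confirms part (a) and refutes part (b), matching the solutions above.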

All images/mathematical drawings were created with GeoGebra.


Hypothesis Definition

In statistics, hypothesis testing is used to determine whether the variation between groups of data is due to true variation or to chance. Sample data are drawn from the population, and assumptions are made about the population parameter. Hypotheses can be classified into various types. In this article, let us discuss the definition of a hypothesis, the various types of hypotheses, and the significance of hypothesis testing, each explained in detail.

Hypothesis Definition in Statistics

In statistics, a hypothesis is defined as a formal statement that explains the relationship between two or more variables of the specified population. It helps the researcher translate the given problem into a clear explanation for the outcome of the study. It clearly explains and predicts the expected outcome, indicates the type of experimental design, and directs the study of the research process.

Types of Hypothesis

The hypothesis can be broadly classified into different types. They are:

Simple Hypothesis

A simple hypothesis is a hypothesis that there exists a relationship between two variables. One is called a dependent variable, and the other is called an independent variable.

Complex Hypothesis

A complex hypothesis is used when there is a relationship among more than two variables; in this hypothesis, the dependent and independent variables together number more than two.

Null Hypothesis

Under the null hypothesis, there is no significant difference between the populations specified in the experiment; any observed difference is due to experimental or sampling error. The null hypothesis is denoted by H 0 .

Alternative Hypothesis

Under an alternative hypothesis, the observations are influenced by some non-random cause. It is denoted by H a or H 1 .

Empirical Hypothesis

An empirical hypothesis is formed by the experiments and based on the evidence.

Statistical Hypothesis

In a statistical hypothesis, the statement may be logical or illogical, and the hypothesis is verified statistically.

Apart from these types of hypothesis, there are also directional and non-directional hypotheses, associative hypotheses, and causal hypotheses.

Characteristics of Hypothesis

The important characteristics of the hypothesis are:

  • The hypothesis should be short and precise
  • It should be specific
  • A hypothesis must be related to the existing body of knowledge
  • It should be capable of verification



Mathematics > Differential Geometry

Title: Splitting Maps in Type I Ricci Flows

Abstract: We study the existence and small scale behaviour of almost splitting maps along a Ricci flow satisfying Type I curvature bounds. These are special solutions of the heat equation that serve as parabolic analogues of harmonic almost splitting maps, which have proven to be an indispensable tool in the study of the structure of the singular set of non-collapsed Ricci limit spaces. In this paper, motivated by the recent work of Cheeger-Jiang-Naber in the Ricci limit setting, we construct sharp splitting maps on Ricci flows that are almost self-similar, and then investigate their small scale behaviour. We show that, modulo linear transformations, an almost splitting map at a large scale remains a splitting map even at smaller scales, provided that the Ricci flow remains sufficiently self-similar. Allowing these linear transformations means that a priori an almost splitting map might degenerate at small scales. However, we show that under an additional summability hypothesis such degeneration does not occur.



9.5: Additional Information and Full Hypothesis Test Examples


  • In a hypothesis test problem, you may see words such as "the level of significance is 1%." The "1%" is the preconceived or preset \(\alpha\).
  • The statistician setting up the hypothesis test selects the value of \(\alpha\) to use before collecting the sample data.
  • If no level of significance is given, a common standard to use is \(\alpha = 0.05\).
  • When you calculate the \(p\)-value and draw the picture, the \(p\)-value is the area in the left tail, the right tail, or split evenly between the two tails. For this reason, we call the hypothesis test left-, right-, or two-tailed.
  • The alternative hypothesis, \(H_{a}\), tells you if the test is left, right, or two-tailed. It is the key to conducting the appropriate test.
  • \(H_{a}\) never has a symbol that contains an equal sign.
  • Thinking about the meaning of the \(p\)-value: A data analyst (and anyone else) should have more confidence that he made the correct decision to reject the null hypothesis with a smaller \(p\)-value (for example, 0.001 as opposed to 0.04) even if using the 0.05 level for alpha. Similarly, for a large p -value such as 0.4, as opposed to a \(p\)-value of 0.056 (\(\alpha = 0.05\) is less than either number), a data analyst should have more confidence that she made the correct decision in not rejecting the null hypothesis. This makes the data analyst use judgment rather than mindlessly applying rules.
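The tail rules in the notes above can be sketched as a small helper that maps the direction of \(H_{a}\) to the corresponding tail area of the standard normal distribution. This is an illustrative sketch, not part of the original text.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """p-value for a z test statistic, given the direction of H_a.

    alternative: "<" (left-tailed), ">" (right-tailed), "!=" (two-tailed).
    H_a never contains an equal sign, so these are the only cases.
    """
    phi = NormalDist().cdf
    if alternative == "<":
        return phi(z)                   # area in the left tail
    if alternative == ">":
        return 1 - phi(z)               # area in the right tail
    if alternative == "!=":
        return 2 * (1 - phi(abs(z)))    # area split evenly between both tails
    raise ValueError("alternative must be '<', '>', or '!='")

left_p = p_value(-2.08, "<")     # left-tail area for a test statistic of -2.08
two_p = p_value(1.96, "!=")      # two-tailed area for a test statistic of 1.96
```

The same helper covers all three examples that follow; only the `alternative` argument changes.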

The following examples illustrate a left-, right-, and two-tailed test.

Example \(\PageIndex{1}\)

\(H_{0}: \mu = 5, H_{a}: \mu < 5\)

Test of a single population mean. \(H_{a}\) tells you the test is left-tailed. The picture of the \(p\)-value is as follows:

Normal distribution curve of a single population mean with a value of 5 on the x-axis and the p-value points to the area on the left tail of the curve.

Exercise \(\PageIndex{1}\)

\(H_{0}: \mu = 10, H_{a}: \mu < 10\)

Assume the \(p\)-value is 0.0935. What type of test is this? Draw the picture of the \(p\)-value.

left-tailed test


Example \(\PageIndex{2}\)

\(H_{0}: \mu \leq 0.2, H_{a}: \mu > 0.2\)

This is a test of a single population proportion. \(H_{a}\) tells you the test is right-tailed. The picture of the \(p\)-value is as follows:

Normal distribution curve of a single population proportion with the value of 0.2 on the x-axis. The p-value points to the area on the right tail of the curve.

Exercise \(\PageIndex{2}\)

\(H_{0}: \mu \leq 1, H_{a}: \mu > 1\)

Assume the \(p\)-value is 0.1243. What type of test is this? Draw the picture of the \(p\)-value.

right-tailed test


Example \(\PageIndex{3}\)

\(H_{0}: \mu = 50, H_{a}: \mu \neq 50\)

This is a test of a single population mean. \(H_{a}\) tells you the test is two-tailed. The picture of the \(p\)-value is as follows.

Normal distribution curve of a single population mean with a value of 50 on the x-axis. The p-value formulas, 1/2(p-value), for a two-tailed test is shown for the areas on the left and right tails of the curve.

Exercise \(\PageIndex{3}\)

\(H_{0}: \mu = 0.5, H_{a}: \mu \neq 0.5\)

Assume the \(p\)-value is 0.2564. What type of test is this? Draw the picture of the \(p\)-value.

two-tailed test


Full Hypothesis Test Examples

Example \(\PageIndex{4}\)

Jeffrey, as an eight-year-old, established a mean time of 16.43 seconds for swimming the 25-yard freestyle, with a standard deviation of 0.8 seconds. His dad, Frank, thought that Jeffrey could swim the 25-yard freestyle faster using goggles. Frank bought Jeffrey a new pair of expensive goggles and timed Jeffrey for 15 25-yard freestyle swims. For the 15 swims, Jeffrey's mean time was 16 seconds. Frank thought that the goggles helped Jeffrey to swim faster than the 16.43 seconds. Conduct a hypothesis test using a preset \(\alpha = 0.05\). Assume that the swim times for the 25-yard freestyle are normal.

Set up the Hypothesis Test:

Since the problem is about a mean, this is a test of a single population mean.

\(H_{0}: \mu = 16.43, H_{a}: \mu < 16.43\)

For Jeffrey to swim faster, his time will be less than 16.43 seconds. The "\(<\)" tells you this is left-tailed.

Determine the distribution needed:

Random variable: \(\bar{X} =\) the mean time to swim the 25-yard freestyle.

Distribution for the test: \(\bar{X}\) is normal (population standard deviation is known: \(\sigma = 0.8\))

\(\bar{X} \sim N\left(\mu, \frac{\sigma_{x}}{\sqrt{n}}\right)\). Therefore, \(\bar{X} \sim N\left(16.43, \frac{0.8}{\sqrt{15}}\right)\)

\(\mu = 16.43\) comes from \(H_{0}\) and not the data. \(\sigma = 0.8\), and \(n = 15\).

Calculate the \(p-\text{value}\) using the normal distribution for a mean:

\(p\text{-value} = P(\bar{x} < 16) = 0.0187\) where the sample mean in the problem is given as 16.

\(p\text{-value} = 0.0187\) (This is called the actual level of significance.) The \(p\text{-value}\) is the area to the left of the sample mean, which is given as 16.

Normal distribution curve for the average time to swim the 25-yard freestyle with values 16, as the sample mean, and 16.43 on the x-axis. A vertical upward line extends from 16 on the x-axis to the curve. An arrow points to the left tail of the curve.

\(\mu = 16.43\) comes from \(H_{0}\). Our assumption is \(\mu = 16.43\).

Interpretation of the \(p\text{-value}\): If \(H_{0}\) is true, there is a 0.0187 probability (1.87%) that Jeffrey's mean time to swim the 25-yard freestyle is 16 seconds or less. Because a 1.87% chance is small, the mean time of 16 seconds or less is unlikely to have happened randomly. It is a rare event.

Compare \(\alpha\) and the \(p-\text{value}\):

\(\alpha = 0.05\), \(p\text{-value} = 0.0187\), so \(\alpha > p\text{-value}\).

Make a decision: Since \(\alpha > p\text{-value}\), reject \(H_{0}\).

This means that you reject \(\mu = 16.43\). In other words, you do not think Jeffrey swims the 25-yard freestyle in 16.43 seconds but faster with the new goggles.

Conclusion: At the 5% significance level, we conclude that Jeffrey swims faster using the new goggles. The sample data show there is sufficient evidence that Jeffrey's mean time to swim the 25-yard freestyle is less than 16.43 seconds.

The \(p\)-value can easily be calculated.

Press STAT and arrow over to TESTS . Press 1:Z-Test . Arrow over to Stats and press ENTER . Arrow down and enter 16.43 for \(\mu_{0}\) (null hypothesis), .8 for σ , 16 for the sample mean, and 15 for n . Arrow down to \(\mu\) : (alternate hypothesis) and arrow over to \(< \mu_{0}\). Press ENTER . Arrow down to Calculate and press ENTER . The calculator not only calculates the p -value (\(p = 0.0187\)) but it also calculates the test statistic ( z -score) for the sample mean. \(\mu < 16.43\) is the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate ). Press ENTER . A shaded graph appears with \(z = -2.08\) (test statistic) and \(p = 0.0187\) (\(p-\text{value}\)). Make sure when you use Draw that no other equations are highlighted in \(Y =\) and the plots are turned off.

When the calculator does a \(Z\)-Test, the Z-Test function finds the p -value by doing a normal probability calculation using the central limit theorem:

\(P(\bar{X} < 16) =\) 2nd DISTR normcdf\((-10^{99}, 16, 16.43, \frac{0.8}{\sqrt{15}})\).
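For readers without the calculator, the same normcdf computation can be reproduced in a few lines. This is a sketch using Python's `statistics.NormalDist`; the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

# Reproduce the calculator's normcdf(-1E99, 16, 16.43, 0.8/sqrt(15)).
mu0, sigma, n, xbar = 16.43, 0.8, 15, 16.0
se = sigma / sqrt(n)                     # standard error of the sample mean
z = (xbar - mu0) / se                    # test statistic, about -2.08
p_value = NormalDist(mu0, se).cdf(xbar)  # left-tail area, about 0.0187
```

The left-tail area agrees with the calculator's output of \(p = 0.0187\) and \(z = -2.08\).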

The Type I and Type II errors for this problem are as follows:

The Type I error is to conclude that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he actually swims the 25-yard freestyle, on average, in 16.43 seconds. (Reject the null hypothesis when the null hypothesis is true.)

The Type II error is that there is not evidence to conclude that Jeffrey swims the 25-yard free-style, on average, in less than 16.43 seconds when, in fact, he actually does swim the 25-yard free-style, on average, in less than 16.43 seconds. (Do not reject the null hypothesis when the null hypothesis is false.)

Exercise \(\PageIndex{4}\)

The mean throwing distance of a football for Marco, a high school freshman quarterback, is 40 yards, with a standard deviation of two yards. The team coach tells Marco to adjust his grip to get more distance. The coach records the distances for 20 throws. For the 20 throws, Marco’s mean distance was 45 yards. The coach thought the different grip helped Marco throw farther than 40 yards. Conduct a hypothesis test using a preset \(\alpha = 0.05\). Assume the throw distances for footballs are normal.

First, determine what type of test this is, set up the hypothesis test, find the p -value, sketch the graph, and state your conclusion.

Press STAT and arrow over to TESTS. Press 1: \(Z\)-Test. Arrow over to Stats and press ENTER. Arrow down and enter 40 for \(\mu_{0}\) (null hypothesis), 2 for \(\sigma\), 45 for the sample mean, and 20 for \(n\). Arrow down to \(\mu\): (alternative hypothesis) and set it either as \(<\), \(\neq\), or \(>\). Press ENTER. Arrow down to Calculate and press ENTER. The calculator not only calculates the p -value but it also calculates the test statistic ( z -score) for the sample mean. Select \(<\), \(\neq\), or \(>\) for the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate). Press ENTER. A shaded graph appears with test statistic and \(p\)-value. Make sure when you use Draw that no other equations are highlighted in \(Y =\) and the plots are turned off.

Since the problem is about a mean, this is a test of a single population mean.

  • \(H_{0}: \mu = 40\)
  • \(H_{a}: \mu > 40\)
  • \(p = 0.0062\)


Because \(p < \alpha\), we reject the null hypothesis. There is sufficient evidence to suggest that the change in grip improved Marco’s throwing distance.

Historical Note

The traditional way to compare the two probabilities, \(\alpha\) and the \(p\text{-value}\), is to compare the critical value (\(z\)-score from \(\alpha\)) to the test statistic (\(z\)-score from data). The calculated test statistic for the \(p\)-value is –2.08. (From the Central Limit Theorem, the test statistic formula is \(z = \frac{\bar{x}-\mu_{x}}{\left(\frac{\sigma_{x}}{\sqrt{n}}\right)}\). For this problem, \(\bar{x} = 16\), \(\mu_{x} = 16.43\) from the null hypothesis, \(\sigma_{x} = 0.8\), and \(n = 15\).) You can find the critical value for \(\alpha = 0.05\) in the normal table (see 15.Tables in the Table of Contents). The \(z\)-score for an area to the left equal to 0.05 is midway between –1.65 and –1.64 (0.05 is midway between 0.0505 and 0.0495). The \(z\)-score is –1.645. Since –1.645 > –2.08 (which demonstrates that \(\alpha > p\text{-value}\)), reject \(H_{0}\). Traditionally, the decision to reject or not reject was done in this way. Today, comparing the two probabilities \(\alpha\) and the \(p\)-value is very common. For this problem, the \(p\text{-value}\), 0.0187, is considerably smaller than \(\alpha = 0.05\). You can be confident about your decision to reject. The graph shows \(\alpha\), the \(p\text{-value}\), the test statistic, and the critical value.

Distribution curve comparing the α to the p-value. Values of -2.15 and -1.645 are on the x-axis. Vertical upward lines extend from both of these values to the curve. The p-value is equal to 0.0158 and points to the area to the left of -2.15. α is equal to 0.05 and points to the area between the values of -2.15 and -1.645.
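The critical-value comparison can also be sketched in code. `inv_cdf` supplies the \(z\)-score for a given left-tail area; the variable names below are ours, for illustration only.

```python
from statistics import NormalDist

# Critical-value method for the left-tailed swim-time test.
alpha = 0.05
critical_z = NormalDist().inv_cdf(alpha)  # about -1.645 for a left-tailed test
test_statistic = -2.08                    # z-score computed from the swim data
reject_h0 = test_statistic < critical_z   # the statistic falls in the rejection region
```

Because \(-2.08 < -1.645\), the test statistic lands in the rejection region, reproducing the table-based decision above.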

Example \(\PageIndex{5}\)

A college football coach thought that his players could bench press a mean weight of 275 pounds . It is known that the standard deviation is 55 pounds . Three of his players thought that the mean weight was more than that amount. They asked 30 of their teammates for their estimated maximum lift on the bench press exercise. The data ranged from 205 pounds to 385 pounds. The actual different weights were (frequencies are in parentheses) 205(3); 215(3); 225(1); 241(2); 252(2); 265(2); 275(2); 313(2); 316(5); 338(2); 341(1); 345(2); 368(2); 385(1).

Conduct a hypothesis test using a 2.5% level of significance to determine if the bench press mean is more than 275 pounds.

Since the problem is about a mean weight, this is a test of a single population mean.

  • \(H_{0}: \mu = 275\)
  • \(H_{a}: \mu > 275\)

This is a right-tailed test.

Calculating the distribution needed:

Random variable: \(\bar{X} =\) the mean weight, in pounds, lifted by the football players.

Distribution for the test: It is normal because \(\sigma\) is known.

  • \(\bar{X} \sim N\left(275, \frac{55}{\sqrt{30}}\right)\)
  • \(\bar{x} = 286.2\) pounds (from the data).
  • \(\sigma = 55\) pounds (Always use \(\sigma\) if you know it.) We assume \(\mu = 275\) pounds unless our data shows us otherwise.

Calculate the \(p\)-value using the normal distribution for a mean and using the sample mean as input (see [link] for using the data as input):

\[p\text{-value} = P(\bar{x} > 286.2) = 0.1323.\nonumber \]
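This p-value can be reproduced from the summary statistics alone (a sketch; the variable names are ours).

```python
from math import sqrt
from statistics import NormalDist

# Right-tailed test using the rounded sample mean, as in the text.
mu0, sigma, n, xbar = 275, 55, 30, 286.2
se = sigma / sqrt(n)                         # standard error of the sample mean
z = (xbar - mu0) / se                        # test statistic, about 1.12
p_value = 1 - NormalDist(mu0, se).cdf(xbar)  # right-tail area, about 0.1323
```

The right-tail area matches the stated \(p\text{-value} = 0.1323\).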

Interpretation of the \(p\)-value: If \(H_{0}\) is true, then there is a 0.1323 probability (13.23%) that the football players can lift a mean weight of 286.2 pounds or more. Because a 13.23% chance is large enough, a mean weight lift of 286.2 pounds or more is not a rare event.

Normal distribution curve of the average weight lifted by football players with values of 275 and 286.2 on the x-axis. A vertical upward line extends from 286.2 to the curve. The p-value points to the area to the right of 286.2.

\(\alpha = 0.025\), \(p\text{-value} = 0.1323\)

Make a decision: Since \(\alpha < p\text{-value}\), do not reject \(H_{0}\).

Conclusion: At the 2.5% level of significance, from the sample data, there is not sufficient evidence to conclude that the true mean weight lifted is more than 275 pounds.

The \(p-\text{value}\) can easily be calculated.

Put the data and frequencies into lists. Press STAT and arrow over to TESTS . Press 1:Z-Test . Arrow over to Data and press ENTER . Arrow down and enter 275 for \(\mu_{0}\), 55 for \(\sigma\), the name of the list where you put the data, and the name of the list where you put the frequencies. Arrow down to \(\mu\): and arrow over to \(> \mu_{0}\). Press ENTER . Arrow down to Calculate and press ENTER . The calculator not only calculates the \(p\text{-value}\) (\(p = 0.1331\), slightly different from the previous calculation because the calculator uses the data rather than the sample mean rounded to one decimal place), but it also calculates the test statistic ( z -score) for the sample mean, the sample mean, and the sample standard deviation. \(\mu > 275\) is the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate ). Press ENTER . A shaded graph appears with \(z = 1.112\) (test statistic) and \(p = 0.1331\) (\(p\text{-value}\)). Make sure when you use Draw that no other equations are highlighted in \(Y =\) and the plots are turned off.
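The same computation can be done without the calculator. Here is a short Python sketch (using the standard identity \(\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)\) for the normal CDF) that reproduces the test statistic and \(p\text{-value}\) from the frequency data:

```python
from math import sqrt, erf

def normal_sf(z):
    """Right-tail area P(Z > z) for the standard normal."""
    return 1 - 0.5 * (1 + erf(z / sqrt(2)))

# (weight, frequency) pairs from the players' survey
data = [(205, 3), (215, 3), (225, 1), (241, 2), (252, 2), (265, 2),
        (275, 2), (313, 2), (316, 5), (338, 2), (341, 1), (345, 2),
        (368, 2), (385, 1)]
n = sum(f for _, f in data)                # 30 players
xbar = sum(w * f for w, f in data) / n     # sample mean, about 286.2
sigma, mu0 = 55, 275                       # known sd and H0 mean

z = (xbar - mu0) / (sigma / sqrt(n))       # test statistic, about 1.112
p_value = normal_sf(z)                     # right-tailed, about 0.1331
print(round(xbar, 1), round(z, 3), round(p_value, 4))
```

Because this uses the full data set rather than the rounded sample mean, it gives \(p = 0.1331\), matching the calculator's Z-Test output.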

Example \(\PageIndex{6}\)

Statistics students believe that the mean score on the first statistics test is 65. A statistics instructor thinks the mean score is higher than 65. He samples ten statistics students and obtains the scores 65 65 70 67 66 63 63 68 72 71. He performs a hypothesis test using a 5% level of significance. The data are assumed to be from a normal distribution.

Set up the hypothesis test:

A 5% level of significance means that \(\alpha = 0.05\). This is a test of a single population mean .

\(H_{0}: \mu = 65\)  \(H_{a}: \mu > 65\)

Since the instructor thinks the average score is higher, use a "\(>\)". The "\(>\)" means the test is right-tailed.

Random variable: \(\bar{X} =\) average score on the first statistics test.

Distribution for the test: If you read the problem carefully, you will notice that there is no population standard deviation given . You are only given \(n = 10\) sample data values. Notice also that the data come from a normal distribution. This means that the distribution for the test is a student's \(t\).

Use \(t_{df}\). Therefore, the distribution for the test is \(t_{9}\) where \(n = 10\) and \(df = 10 - 1 = 9\).

Calculate the \(p\)-value using the Student's \(t\)-distribution:

\(p\text{-value} = P(\bar{x} > 67) = 0.0396\) where the sample mean and sample standard deviation are calculated as 67 and 3.1972 from the data.

Interpretation of the p -value: If the null hypothesis is true, then there is a 0.0396 probability (3.96%) that the sample mean is 67 or more.

Normal distribution curve of average scores on the first statistics test with 65 and 67 values on the x-axis. A vertical upward line extends from 67 to the curve. The p-value points to the area to the right of 67.

Since \(\alpha = 0.05\) and \(p\text{-value} = 0.0396\), we have \(\alpha > p\text{-value}\).

This means you reject \(\mu = 65\). In other words, you believe the average test score is more than 65.

Conclusion: At a 5% level of significance, the sample data show sufficient evidence that the mean (average) test score is more than 65, just as the math instructor thinks.

The \(p\text{-value}\) can easily be calculated.

Put the data into a list. Press STAT and arrow over to TESTS . Press 2:T-Test . Arrow over to Data and press ENTER . Arrow down and enter 65 for \(\mu_{0}\), the name of the list where you put the data, and 1 for Freq: . Arrow down to \(\mu\): and arrow over to \(> \mu_{0}\). Press ENTER . Arrow down to Calculate and press ENTER . The calculator not only calculates the \(p\text{-value}\) (p = 0.0396) but it also calculates the test statistic ( t -score) for the sample mean, the sample mean, and the sample standard deviation. \(\mu > 65\) is the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate ). Press ENTER . A shaded graph appears with \(t = 1.9781\) (test statistic) and \(p = 0.0396\) (\(p\text{-value}\)). Make sure when you use Draw that no other equations are highlighted in \(Y =\) and the plots are turned off.
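A sketch of the same computation in Python, using only the standard library (this produces the \(t\)-statistic; a \(t_{9}\) tail area, e.g. `scipy.stats.t.sf(t, 9)`, would then give the \(p\text{-value} \approx 0.0396\)):

```python
from math import sqrt
from statistics import stdev

scores = [65, 65, 70, 67, 66, 63, 63, 68, 72, 71]
n = len(scores)
xbar = sum(scores) / n       # sample mean = 67.0
s = stdev(scores)            # sample standard deviation, about 3.1972
mu0 = 65                     # H0 mean

# t-statistic with df = n - 1 = 9
t = (xbar - mu0) / (s / sqrt(n))
print(round(xbar, 1), round(s, 4), round(t, 4))   # 67.0 3.1972 1.9781
```

Note that `statistics.stdev` computes the sample (not population) standard deviation, which is what the \(t\)-test requires.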

Exercise \(\PageIndex{6}\)

It is believed that a stock price for a particular company will grow at a rate of $5 per week with a standard deviation of $1. An investor believes the stock won’t grow as quickly. The change in stock price is recorded for each of ten weeks: $4, $3, $2, $3, $1, $7, $2, $1, $1, $2. Perform a hypothesis test using a 5% level of significance. State the null and alternative hypotheses, find the p -value, state your conclusion, and identify the Type I and Type II errors.

  • \(H_{0}: \mu = 5\)
  • \(H_{a}: \mu < 5\)
  • \(p = 0.0082\)

Because \(p < \alpha\), we reject the null hypothesis. There is sufficient evidence to suggest that the stock price of the company grows at a rate less than $5 a week.

  • Type I Error: To conclude that the stock price is growing slower than $5 a week when, in fact, the stock price is growing at $5 a week (reject the null hypothesis when the null hypothesis is true).
  • Type II Error: To conclude that the stock price is growing at a rate of $5 a week when, in fact, the stock price is growing slower than $5 a week (do not reject the null hypothesis when the null hypothesis is false).

Example \(\PageIndex{7}\)

Joon believes that 50% of first-time brides in the United States are younger than their grooms. She performs a hypothesis test to determine if the percentage is the same or different from 50% . Joon samples 100 first-time brides and 53 reply that they are younger than their grooms. For the hypothesis test, she uses a 1% level of significance.

The 1% level of significance means that α = 0.01. This is a test of a single population proportion .

\(H_{0}: p = 0.50\)  \(H_{a}: p \neq 0.50\)

The words "is the same or different from" tell you this is a two-tailed test.

Calculate the distribution needed:

Random variable: \(P′ =\) the percent of first-time brides who are younger than their grooms.

Distribution for the test: The problem contains no mention of a mean. The information is given in terms of percentages. Use the distribution for P′ , the estimated proportion.

\[P' \sim N\left(p, \sqrt{\frac{p \cdot q}{n}}\right)\nonumber \]

\[P' \sim N\left(0.5, \sqrt{\frac{0.5 \cdot 0.5}{100}}\right)\nonumber \]

where \(p = 0.50, q = 1−p = 0.50\), and \(n = 100\)

Calculate the p -value using the normal distribution for proportions:

\[p\text{-value} = P(p′ < 0.47 \text{ or } p′ > 0.53) = 0.5485\nonumber \]

where \[x = 53,\; p' = \frac{x}{n} = \frac{53}{100} = 0.53.\nonumber \]

Interpretation of the p-value: If the null hypothesis is true, there is 0.5485 probability (54.85%) that the sample (estimated) proportion \(p'\) is 0.53 or more OR 0.47 or less (see the graph in Figure).

Normal distribution curve of the percent of first time brides who are younger than the groom with values of 0.47, 0.50, and 0.53 on the x-axis. Vertical upward lines extend from 0.47 and 0.53 to the curve. 1/2(p-values) are calculated for the areas on outsides of 0.47 and 0.53.

\(\mu = p = 0.50\) comes from \(H_{0}\), the null hypothesis.

\(p′ = 0.53\). Since the curve is symmetrical and the test is two-tailed, the \(p′\) for the left tail is equal to \(0.50 – 0.03 = 0.47\) where \(\mu = p = 0.50\). (0.03 is the difference between 0.53 and 0.50.)

Compare \(\alpha\) and the \(p\text{-value}\):

Since \(\alpha = 0.01\) and \(p\text{-value} = 0.5485\), we have \(\alpha < p\text{-value}\).

Make a decision: Since \(\alpha < p\text{-value}\), you cannot reject \(H_{0}\).

Conclusion: At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of first-time brides who are younger than their grooms is different from 50%.

Press STAT and arrow over to TESTS . Press 5:1-PropZTest . Enter .5 for \(p_{0}\), 53 for \(x\) and 100 for \(n\). Arrow down to Prop and arrow to not equals \(p_{0}\). Press ENTER . Arrow down to Calculate and press ENTER . The calculator calculates the \(p\text{-value}\) (\(p = 0.5485\)) and the test statistic (\(z\)-score). Prop not equals .5 is the alternate hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate ). Press ENTER . A shaded graph appears with \(z = 0.6\) (test statistic) and \(p = 0.5485\) (\(p\text{-value}\)). Make sure when you use Draw that no other equations are highlighted in \(Y =\) and the plots are turned off.
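The two-tailed proportion test can also be sketched in Python (the normal CDF again comes from the standard `math.erf` identity):

```python
from math import sqrt, erf

def normal_sf(z):
    """Right-tail area P(Z > z) for the standard normal."""
    return 1 - 0.5 * (1 + erf(z / sqrt(2)))

x, n, p0 = 53, 100, 0.50
p_hat = x / n                      # sample proportion = 0.53
se = sqrt(p0 * (1 - p0) / n)       # 0.05, computed under H0

z = (p_hat - p0) / se              # test statistic = 0.6
p_value = 2 * normal_sf(abs(z))    # two-tailed, about 0.5485
print(round(z, 2), round(p_value, 4))
```

Doubling the single tail area beyond \(|z|\) is what makes this a two-tailed test, matching the 1-PropZTest output above.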

The Type I and Type II errors are as follows:

The Type I error is to conclude that the proportion of first-time brides who are younger than their grooms is different from 50% when, in fact, the proportion is actually 50%. (Reject the null hypothesis when the null hypothesis is true).

The Type II error is there is not enough evidence to conclude that the proportion of first time brides who are younger than their grooms differs from 50% when, in fact, the proportion does differ from 50%. (Do not reject the null hypothesis when the null hypothesis is false.)

Exercise \(\PageIndex{7}\)

A teacher believes that 85% of students in the class will want to go on a field trip to the local zoo. She performs a hypothesis test to determine if the percentage is the same or different from 85%. The teacher samples 50 students and 39 reply that they would want to go to the zoo. For the hypothesis test, use a 1% level of significance.

First, determine what type of test this is, set up the hypothesis test, find the \(p\text{-value}\), sketch the graph, and state your conclusion.

Since the problem is about percentages, this is a test of single population proportions.

  • \(H_{0} : p = 0.85\)
  • \(H_{a}: p \neq 0.85\)
  • \(p = 0.7554\)


Because \(p > \alpha\), we fail to reject the null hypothesis. There is not sufficient evidence to suggest that the proportion of students that want to go to the zoo is not 85%.

Example \(\PageIndex{8}\)

Suppose a consumer group suspects that the proportion of households that have three cell phones is 30%. A cell phone company has reason to believe that the proportion is not 30%. Before they start a big advertising campaign, they conduct a hypothesis test. Their marketing people survey 150 households with the result that 43 of the households have three cell phones.

\(H_{0}: p = 0.30, H_{a}: p \neq 0.30\)

The random variable is \(P′ =\) proportion of households that have three cell phones.

The distribution for the hypothesis test is \(P' \sim N\left(0.30, \sqrt{\frac{(0.30 \cdot 0.70)}{150}}\right)\)

Exercise \(\PageIndex{8}\).2

a. The value that helps determine the \(p\text{-value}\) is \(p′\). Calculate \(p′\).

a. \(p' = \frac{x}{n}\) where \(x\) is the number of successes and \(n\) is the total number in the sample.

\(x = 43, n = 150\)

\(p′ = \frac{43}{150}\)

Exercise \(\PageIndex{8}\).3

b. What is a success for this problem?

b. A success is having three cell phones in a household.

Exercise \(\PageIndex{8}\).4

c. What is the level of significance?

c. The level of significance is the preset \(\alpha\). Since \(\alpha\) is not given, assume that \(\alpha = 0.05\).

Exercise \(\PageIndex{8}\).5

d. Draw the graph for this problem. Draw the horizontal axis. Label and shade appropriately.

Calculate the \(p\text{-value}\).

d. \(p\text{-value} = 0.7216\)

Exercise \(\PageIndex{8}\).6

e. Make a decision. _____________(Reject/Do not reject) \(H_{0}\) because____________.

e. Assuming that \(\alpha = 0.05, \alpha < p\text{-value}\). The decision is do not reject \(H_{0}\) because there is not sufficient evidence to conclude that the proportion of households that have three cell phones is not 30%.
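Under the assumed \(\alpha = 0.05\), the numbers above can be checked with a short Python sketch:

```python
from math import sqrt, erf

def normal_sf(z):
    return 1 - 0.5 * (1 + erf(z / sqrt(2)))   # P(Z > z)

x, n, p0 = 43, 150, 0.30
p_hat = x / n                      # about 0.2867
se = sqrt(p0 * (1 - p0) / n)       # standard error under H0

z = (p_hat - p0) / se              # about -0.3563
p_value = 2 * normal_sf(abs(z))    # two-tailed, about 0.7216
print(round(p_hat, 4), round(p_value, 4))
```

Since the \(p\text{-value}\) far exceeds 0.05, the data are entirely consistent with a 30% proportion.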

Exercise \(\PageIndex{8}\)

Marketers believe that 92% of adults in the United States own a cell phone. A cell phone manufacturer believes that number is actually lower. Of 200 American adults surveyed, 174 report having cell phones. Use a 5% level of significance. State the null and alternative hypotheses, find the p -value, state your conclusion, and identify the Type I and Type II errors.

  • \(H_{0}: p = 0.92\)
  • \(H_{a}: p < 0.92\)
  • \(p\text{-value} = 0.0046\)

Because \(p < 0.05\), we reject the null hypothesis. There is sufficient evidence to conclude that fewer than 92% of American adults own cell phones.

  • Type I Error: To conclude that fewer than 92% of American adults own cell phones when, in fact, 92% of American adults do own cell phones (reject the null hypothesis when the null hypothesis is true).
  • Type II Error: To conclude that 92% of American adults own cell phones when, in fact, fewer than 92% of American adults own cell phones (do not reject the null hypothesis when the null hypothesis is false).
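This left-tailed proportion test can be sketched as:

```python
from math import sqrt, erf

def normal_cdf(z):
    return 0.5 * (1 + erf(z / sqrt(2)))   # P(Z <= z)

x, n, p0 = 174, 200, 0.92
p_hat = x / n                    # 0.87
se = sqrt(p0 * (1 - p0) / n)     # standard error under H0

z = (p_hat - p0) / se            # about -2.606
p_value = normal_cdf(z)          # left-tailed, about 0.0046
print(round(z, 3), round(p_value, 4))
```

For a left-tailed test the \(p\text{-value}\) is the area to the left of \(z\), so no doubling is needed.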

The next example is a poem written by a statistics student named Nicole Hart. The solution to the problem follows the poem. Notice that the hypothesis test is for a single population proportion. This means that the null and alternate hypotheses use the parameter \(p\). The distribution for the test is normal. The estimated proportion \(p′\) is the proportion of fleas killed to the total fleas found on Fido. This is sample information. The problem gives a preconceived \(\alpha = 0.01\), for comparison, and a 95% confidence interval computation. The poem is clever and humorous, so please enjoy it!

Example \(\PageIndex{9}\)

My dog has so many fleas,

They do not come off with ease.
As for shampoo, I have tried many types
Even one called Bubble Hype,
Which only killed 25% of the fleas,
Unfortunately I was not pleased.

I've used all kinds of soap,
Until I had given up hope
Until one day I saw
An ad that put me in awe.

A shampoo used for dogs
Called GOOD ENOUGH to Clean a Hog
Guaranteed to kill more fleas.

I gave Fido a bath
And after doing the math
His number of fleas
Started dropping by 3's!
Before his shampoo
I counted 42.

At the end of his bath,
I redid the math
And the new shampoo had killed 17 fleas.
So now I was pleased.

Now it is time for you to have some fun
With the level of significance being .01,
You must help me figure out

Use the new shampoo or go without?

\(H_{0}: p \leq 0.25\)   \(H_{a}: p > 0.25\)

In words, CLEARLY state what your random variable \(\bar{X}\) or \(P′\) represents.

\(P′ =\) The proportion of fleas that are killed by the new shampoo

State the distribution to use for the test.

\[N\left(0.25, \sqrt{\frac{(0.25)(1-0.25)}{42}}\right)\nonumber \]

Test Statistic: \(z = 2.3163\)

Calculate the \(p\text{-value}\) using the normal distribution for proportions:

\[p\text{-value} = 0.0103\nonumber \]

In one to two complete sentences, explain what the p -value means for this problem.

If the null hypothesis is true (the proportion is 0.25), then there is a 0.0103 probability that the sample (estimated) proportion is 0.4048 \(\left(\frac{17}{42}\right)\) or more.

Use the previous information to sketch a picture of this situation. CLEARLY, label and scale the horizontal axis and shade the region(s) corresponding to the \(p\text{-value}\).

Normal distribution graph of the proportion of fleas killed by the new shampoo with values of 0.25 and 0.4048 on the x-axis. A vertical upward line extends from 0.4048 to the curve and the area to the right of this is shaded in. The test statistic of the sample proportion is listed.

Indicate the correct decision (“reject” or “do not reject” the null hypothesis), the reason for it, and write an appropriate conclusion, using complete sentences.

Conclusion: At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of fleas that are killed by the new shampoo is more than 25%.

Construct a 95% confidence interval for the true mean or proportion. Include a sketch of the graph of the situation. Label the point estimate and the lower and upper bounds of the confidence interval.

Normal distribution graph of the proportion of fleas killed by the new shampoo with values of 0.26, 17/42, and 0.55 on the x-axis. A vertical upward line extends from 0.26 and 0.55. The area between these two points is equal to 0.95.

Confidence Interval: (0.26,0.55) We are 95% confident that the true population proportion p of fleas that are killed by the new shampoo is between 26% and 55%.

This test result is not very definitive since the \(p\text{-value}\) is very close to alpha. In reality, one would probably do more tests by giving the dog another bath after the fleas have had a chance to return.
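The 95% confidence interval above can be reproduced with the usual normal approximation, \(p' \pm 1.96\sqrt{p'(1-p')/n}\):

```python
from math import sqrt

x, n = 17, 42
p_hat = x / n                      # about 0.4048 (17 of 42 fleas killed)
se = sqrt(p_hat * (1 - p_hat) / n) # standard error using p' itself
z_crit = 1.96                      # critical value for 95% confidence

lower = p_hat - z_crit * se        # about 0.26
upper = p_hat + z_crit * se        # about 0.55
print(round(lower, 2), round(upper, 2))
```

Note that the confidence interval uses \(p'\) in the standard error, whereas the hypothesis test used the hypothesized \(p = 0.25\).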

Example \(\PageIndex{10}\)

The National Institute of Standards and Technology provides exact data on conductivity properties of materials. Following are conductivity measurements for 11 randomly selected pieces of a particular type of glass.

1.11; 1.07; 1.11; 1.07; 1.12; 1.08; .98; .98; 1.02; .95; .95

Is there convincing evidence that the average conductivity of this type of glass is greater than one? Use a significance level of 0.05. Assume the population is normal.

Let’s follow a four-step process to answer this statistical question.

  • \(H_{0}: \mu \leq 1\)
  • \(H_{a}: \mu > 1\)
  • Plan : We are testing a sample mean without a known population standard deviation. Therefore, we need to use a Student's-t distribution. Assume the underlying population is normal.
  • Do the calculations : We will input the sample data into the TI-83 as follows.

[TI-83 T-Test output for the conductivity data]

  • State the conclusions : Since the \(p\text{-value}\) (\(p = 0.036\)) is less than our alpha value, we will reject the null hypothesis. It is reasonable to state that the data support the claim that the average conductivity level is greater than one.
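A sketch of the \(t\)-statistic computation for this example (a \(t_{10}\) tail area, e.g. `scipy.stats.t.sf(t, 10)`, then gives \(p \approx 0.036\)):

```python
from math import sqrt
from statistics import stdev

conductivity = [1.11, 1.07, 1.11, 1.07, 1.12, 1.08,
                0.98, 0.98, 1.02, 0.95, 0.95]
n = len(conductivity)
xbar = sum(conductivity) / n   # sample mean = 1.04
s = stdev(conductivity)        # sample sd, about 0.0659

# t-statistic with df = n - 1 = 10
t = (xbar - 1) / (s / sqrt(n)) # about 2.014
print(round(xbar, 2), round(s, 4), round(t, 3))
```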

Example \(\PageIndex{11}\)

In a study of 420,019 cell phone users, 172 of the subjects developed brain cancer. Test the claim that cell phone users developed brain cancer at a greater rate than that for non-cell phone users (the rate of brain cancer for non-cell phone users is 0.0340%). Since this is a critical issue, use a 0.005 significance level. Explain why the significance level should be so low in terms of a Type I error.

We will follow the four-step process.

  • \(H_{0}: p \leq 0.00034\)
  • \(H_{a}: p > 0.00034\)

If we commit a Type I error, we reject a true null hypothesis and accept a false claim. Since the claim concerns cancer-causing environments, we want to minimize the chance of incorrectly identifying causes of cancer, so we set \(\alpha\) very low.

  • We will be testing a sample proportion with \(x = 172\) and \(n = 420,019\). The sample is sufficiently large because we have \(np = 420,019(0.00034) = 142.8\), \(nq = 420,019(0.99966) = 419,876.2\), two independent outcomes, and a fixed probability of success \(p = 0.00034\). Thus we will be able to generalize our results to the population.


Figure \(\PageIndex{11}\).


Figure \(\PageIndex{12}\).

  • Since the \(p\text{-value} = 0.0073\) is greater than our alpha value \(= 0.005\), we cannot reject the null. Therefore, we conclude that there is not enough evidence to support the claim of higher brain cancer rates for the cell phone users.
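The test statistic and \(p\text{-value}\) for this right-tailed proportion test can be reproduced in Python:

```python
from math import sqrt, erf

def normal_sf(z):
    return 1 - 0.5 * (1 + erf(z / sqrt(2)))   # P(Z > z)

x, n, p0 = 172, 420019, 0.00034
p_hat = x / n                    # about 0.00041
se = sqrt(p0 * (1 - p0) / n)     # standard error under H0

z = (p_hat - p0) / se            # about 2.44
p_value = normal_sf(z)           # right-tailed, about 0.0073
print(round(z, 2), round(p_value, 4))
```

Even with this enormous sample, the \(p\text{-value}\) of 0.0073 does not fall below the deliberately strict \(\alpha = 0.005\).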

Example \(\PageIndex{12}\)

According to the US Census there are approximately 268,608,618 residents aged 12 and older. Statistics from the Rape, Abuse, and Incest National Network indicate that, on average, 207,754 rapes occur each year (male and female) for persons aged 12 and older. This translates into a percentage of sexual assaults of 0.078%. In Daviess County, KY, there were 11 reported rapes for a population of 37,937. Conduct an appropriate hypothesis test to determine if there is a statistically significant difference between the local sexual assault percentage and the national sexual assault percentage. Use a significance level of 0.01.

We will follow the four-step plan.

  • We need to test whether the proportion of sexual assaults in Daviess County, KY is significantly different from the national average.
  • \(H_{0}: p = 0.00078\)
  • \(H_{a}: p \neq 0.00078\)


Figure \(\PageIndex{13}\).


Figure \(\PageIndex{14}\).

  • Since the \(p\text{-value}\), \(p = 0.00063\), is less than the alpha level of 0.01, the sample data indicates that we should reject the null hypothesis. In conclusion, the sample data support the claim that the proportion of sexual assaults in Daviess County, Kentucky is different from the national average proportion.
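This two-tailed proportion test can be verified with the same normal-approximation sketch used earlier:

```python
from math import sqrt, erf

def normal_sf(z):
    return 1 - 0.5 * (1 + erf(z / sqrt(2)))   # P(Z > z)

x, n, p0 = 11, 37937, 0.00078
p_hat = x / n                    # about 0.00029
se = sqrt(p0 * (1 - p0) / n)     # standard error under H0

z = (p_hat - p0) / se            # about -3.42
p_value = 2 * normal_sf(abs(z))  # two-tailed, about 0.00063
print(round(z, 2), round(p_value, 5))
```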

The hypothesis test itself has an established process. This can be summarized as follows:

  • Determine \(H_{0}\) and \(H_{a}\). Remember, they are contradictory.
  • Determine the random variable.
  • Determine the distribution for the test.
  • Draw a graph, calculate the test statistic, and use the test statistic to calculate the \(p\text{-value}\). (A z -score and a t -score are examples of test statistics.)
  • Compare the preconceived α with the p -value, make a decision (reject or do not reject H 0 ), and write a clear conclusion using English sentences.

Notice that in performing the hypothesis test, you use \(\alpha\) and not \(\beta\). \(\beta\) is needed to help determine the sample size of the data that is used in calculating the \(p\text{-value}\). Remember that the quantity \(1 – \beta\) is called the Power of the Test . A high power is desirable. If the power is too low, the null hypothesis might not be rejected when it should be, so statisticians typically increase the sample size while keeping \(\alpha\) the same.
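To illustrate how power grows with sample size, here is a sketch for a right-tailed \(z\)-test. The specific numbers (\(\mu_{0} = 275\), a hypothetical true mean of 290, \(\sigma = 55\), \(\alpha = 0.05\)) are assumptions chosen purely for illustration:

```python
from math import sqrt, erf

def normal_cdf(z):
    return 0.5 * (1 + erf(z / sqrt(2)))   # P(Z <= z)

# Hypothetical right-tailed test: H0: mu = 275, sigma = 55, alpha = 0.05,
# and suppose the true mean is actually 290.
mu0, mu_true, sigma, z_alpha = 275, 290, 55, 1.645

powers = {}
for n in (30, 60, 120):
    se = sigma / sqrt(n)
    # Reject H0 when xbar > mu0 + z_alpha * se;
    # power = P(reject H0 | true mean = mu_true)
    powers[n] = 1 - normal_cdf(z_alpha - (mu_true - mu0) / se)
    print(n, round(powers[n], 3))
```

Doubling and quadrupling the sample size raises the power substantially while \(\alpha\) stays fixed at 0.05.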

Assume \(H_{0}: \mu = 9\) and \(H_{a}: \mu < 9\). Is this a left-tailed, right-tailed, or two-tailed test?

This is a left-tailed test.

Exercise \(\PageIndex{9}\)

Assume \(H_{0}: \mu \leq 6\) and \(H_{a}: \mu > 6\). Is this a left-tailed, right-tailed, or two-tailed test?

Exercise \(\PageIndex{10}\)

Assume \(H_{0}: p = 0.25\) and \(H_{a}: p \neq 0.25\). Is this a left-tailed, right-tailed, or two-tailed test?

This is a two-tailed test.

Exercise \(\PageIndex{11}\)

Draw the general graph of a left-tailed test.

Exercise \(\PageIndex{12}\)

Draw the graph of a two-tailed test.


Exercise \(\PageIndex{13}\)

A bottle of water is labeled as containing 16 fluid ounces of water. You believe it is less than that. What type of test would you use?

Exercise \(\PageIndex{14}\)

Your friend claims that his mean golf score is 63. You want to show that it is higher than that. What type of test would you use?

a right-tailed test

Exercise \(\PageIndex{15}\)

A bathroom scale claims to be able to identify correctly any weight within a pound. You think that it cannot be that accurate. What type of test would you use?

Exercise \(\PageIndex{16}\)

You flip a coin and record whether it shows heads or tails. You know the probability of getting heads is 50%, but you think it is less for this particular coin. What type of test would you use?

a left-tailed test

Exercise \(\PageIndex{17}\)

If the alternative hypothesis has a not equals ( \(\neq\) ) symbol, you know to use which type of test?

Exercise \(\PageIndex{18}\)

Assume the null hypothesis states that the mean is at least 18. Is this a left-tailed, right-tailed, or two-tailed test?

Exercise \(\PageIndex{19}\)

Assume the null hypothesis states that the mean is at most 12. Is this a left-tailed, right-tailed, or two-tailed test?

Exercise \(\PageIndex{20}\)

Assume the null hypothesis states that the mean is equal to 88. The alternative hypothesis states that the mean is not equal to 88. Is this a left-tailed, right-tailed, or two-tailed test?

  • Data from Amit Schitai. Director of Instructional Technology and Distance Learning. LBCC.
  • Data from Bloomberg Businessweek . Available online at www.businessweek.com/news/2011- 09-15/nyc-smoking-rate-falls-to-record-low-of-14-bloomberg-says.html.
  • Data from energy.gov. Available online at http://energy.gov (accessed June 27. 2013).
  • Data from Gallup®. Available online at www.gallup.com (accessed June 27, 2013).
  • Data from Growing by Degrees by Allen and Seaman.
  • Data from La Leche League International. Available online at www.lalecheleague.org/Law/BAFeb01.html.
  • Data from the American Automobile Association. Available online at www.aaa.com (accessed June 27, 2013).
  • Data from the American Library Association. Available online at www.ala.org (accessed June 27, 2013).
  • Data from the Bureau of Labor Statistics. Available online at http://www.bls.gov/oes/current/oes291111.htm .
  • Data from the Centers for Disease Control and Prevention. Available online at www.cdc.gov (accessed June 27, 2013)
  • Data from the U.S. Census Bureau, available online at quickfacts.census.gov/qfd/states/00000.html (accessed June 27, 2013).
  • Data from the United States Census Bureau. Available online at www.census.gov/hhes/socdemo/language/.
  • Data from Toastmasters International. Available online at http://toastmasters.org/artisan/deta...eID=429&Page=1 .
  • Data from Weather Underground. Available online at www.wunderground.com (accessed June 27, 2013).
  • Federal Bureau of Investigations. “Uniform Crime Reports and Index of Crime in Daviess in the State of Kentucky enforced by Daviess County from 1985 to 2005.” Available online at http://www.disastercenter.com/kentucky/crime/3868.htm (accessed June 27, 2013).
  • “Foothill-De Anza Community College District.” De Anza College, Winter 2006. Available online at research.fhda.edu/factbook/DA...t_da_2006w.pdf.
  • Johansen, C., J. Boice, Jr., J. McLaughlin, J. Olsen. “Cellular Telephones and Cancer—a Nationwide Cohort Study in Denmark.” Institute of Cancer Epidemiology and the Danish Cancer Society, 93(3):203-7. Available online at http://www.ncbi.nlm.nih.gov/pubmed/11158188 (accessed June 27, 2013).
  • Rape, Abuse & Incest National Network. “How often does sexual assault occur?” RAINN, 2009. Available online at www.rainn.org/get-information...sexual-assault (accessed June 27, 2013).

IMAGES

  1. Examples of Hypothesis: 15+ Ideas to Help You Formulate Yours

    math version of hypothesis

  2. Hypothesis Testing Solved Problems

    math version of hypothesis

  3. Hypothesis Testing Solved Examples(Questions and Solutions)

    math version of hypothesis

  4. How to Write a Hypothesis

    math version of hypothesis

  5. Statistics 101: Introduction to Hypothesis Formulation

    math version of hypothesis

  6. How to Write a Strong Hypothesis in 6 Simple Steps

    math version of hypothesis

VIDEO

  1. 7TH MATH VERSION 4 SBA PEC 2024 MATH CLASS 7TH URDU ENGLISH MCQS QUESTIONS ANSWERS REAL PAERS

  2. MATH 1342

  3. Cupid but its math version👺//kny//•

  4. 8.1: Basics of Hypothesis Testing

  5. MATH 1342

COMMENTS

  1. Hypothesis test

    Hypothesis test. A significance test, also referred to as a statistical hypothesis test, is a method of statistical inference in which observed data is compared to a claim (referred to as a hypothesis) in order to assess the truth of the claim. For example, one might wonder whether age affects the number of apples a person can eat, and may use a significance test to determine whether there is ...

  2. 8.1: The Elements of Hypothesis Testing

    Hypothesis testing is a statistical procedure in which a choice is made between a null hypothesis and an alternative hypothesis based on information in a sample. The end result of a hypotheses testing procedure is a choice of one of the following two possible conclusions: Reject H0. H 0. (and therefore accept Ha.

  3. Hypothesis -- from Wolfram MathWorld

    A hypothesis is a proposition that is consistent with known data, but has been neither verified nor shown to be false. In statistics, a hypothesis (sometimes called a statistical hypothesis) refers to a statement on which hypothesis testing will be based. Particularly important statistical hypotheses include the null hypothesis and alternative hypothesis. In symbolic logic, a hypothesis is the ...

  4. Examples of null and alternative hypotheses

    It is the opposite of your research hypothesis. The alternative hypothesis--that is, the research hypothesis--is the idea, phenomenon, observation that you want to prove. If you suspect that girls take longer to get ready for school than boys, then: Alternative: girls time > boys time. Null: girls time <= boys time.

  5. 8.1.1: Introduction to Hypothesis Testing Part 1

    Review. In a hypothesis test, sample data is evaluated in order to arrive at a decision about some type of claim.If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we: Evaluate the null hypothesis, typically denoted with \(H_{0}\).The null is not rejected unless the hypothesis test shows otherwise.

  6. 9.1: Introduction to Hypothesis Testing

    This page titled 9.1: Introduction to Hypothesis Testing is shared under a CC BY 2.0 license and was authored, remixed, and/or curated by Kyle Siegrist ( Random Services) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request. In hypothesis testing, the goal is ...

  7. 14.3: Design Research Hypotheses and Experiment

    Hypotheses and Hypothesis. Testing For purposes of testing, we need to design hypotheses that are statements about population parameters. Some examples of hypotheses: At least 20% of juvenile offenders are caught and sentenced to prison. The mean monthly income for college graduates is over $5000.

  8. Introduction to Hypothesis Testing

    Step 3: Collect Data and Compute Sample Statistics. After collecting the data, we find the sample mean. Now we can compare the sample mean with the null hypothesis by computing a z-score that describes where the sample mean is located relative to the hypothesized population mean. We use the z-score formula. Step 4: Make a Decision.

  9. What Is a Hypothesis Test?

    The null hypothesis significance testing (NHST) framework. The general situation is this: we want to find out about some aspect of the real world, and we do this by performing an experiment. From the data collected in the experiment, we want to make a deduction about reality, a process known as statistical inference .

  10. Hypothesis Testing

    A hypothesis test is a statistical inference method used to test the significance of a proposed (hypothesized) relation between population statistics (parameters) and their corresponding sample estimators. In other words, hypothesis tests are used to determine if there is enough evidence in a sample to prove a hypothesis true for the entire population. The test considers two hypotheses: the ...

  11. PDF Hypothesis tests Math 218, Mathematical Statistics

    Hypothesis tests are directly related to con dence intervals, and we can turn this con dence interval into this hypothesis test: Reject the null hypothesis H 0 that p= 1 2 in favor of the alternative hypothesis H 1 if 1 2 2=[X 1= p n;X+ 1= p n]. In other words, we conclude, at the 95% con dence level, that the coin is unfair if Xis further from ...

  12. Understanding Hypotheses

    A hypothesis is a statement or idea which gives an explanation to a series of observations. Sometimes, following observation, a hypothesis will clearly need to be refined or rejected. This happens if a single contradictory observation occurs. For example, suppose that a child is trying to understand the concept of a dog.

  13. 9.1 Null and Alternative Hypotheses

    The actual test begins by considering two hypotheses.They are called the null hypothesis and the alternative hypothesis.These hypotheses contain opposing viewpoints. H 0, the —null hypothesis: a statement of no difference between sample means or proportions or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

  14. Hypothesis Testing

    Hypothesis testing is a technique that is used to verify whether the results of an experiment are statistically significant. It involves setting up a null hypothesis and an alternative hypothesis. Common tests conducted under hypothesis testing include the z test, the t test, and the chi-square test.
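
    As a minimal sketch of one of these, here is a two-sided one-sample z test using only the standard library (the normal CDF is built from math.erf). The sample numbers are invented for illustration.

```python
import math

def z_test(sample_mean, pop_mean, pop_sd, n):
    """Two-sided one-sample z test of H0: mu = pop_mean, with known population sd."""
    z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
    # Standard normal CDF: Phi(x) = (1 + erf(x / sqrt(2))) / 2
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# Hypothetical data: sample mean 52 from n = 100, against H0: mu = 50, sd = 10
z, p = z_test(sample_mean=52.0, pop_mean=50.0, pop_sd=10.0, n=100)
print(round(z, 2), round(p, 4))   # 2.0 0.0455 -> reject H0 at the 5% level
```

    The t test follows the same recipe but uses the sample standard deviation and the t distribution; the chi-square test compares observed and expected counts rather than means.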

  15. Significance tests (hypothesis testing)

    Significance tests give us a formal process for using sample data to evaluate the likelihood of some claim about a population value. Learn how to conduct significance tests and calculate p-values to see how likely a sample result is to occur by random chance. You'll also see how we use p-values to make conclusions about hypotheses.

  16. 5.2 - Writing Hypotheses

    The first step in conducting a hypothesis test is to write the hypothesis statements that are going to be tested. For each test you will have a null hypothesis (H0) and an alternative hypothesis (Ha). When writing hypotheses there are three things that we need to know: (1) the parameter that we are testing, (2) the ...

  17. 9.1: Null and Alternative Hypotheses

    Review. In a hypothesis test, sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we evaluate the null hypothesis, typically denoted H0. The null is not rejected unless the hypothesis test shows otherwise.

  18. How Do You Formulate (Important) Hypotheses?

    Building on the ideas in Chap. 1, we describe formulating, testing, and revising hypotheses as a continuing cycle of clarifying what you want to study, making predictions about what you might find together with developing your reasons for these predictions, imagining tests of these predictions, revising your predictions and rationales, and so ...

  19. Hypothesis

    Figure 2: Role of Hypothesis in the Scientific Method. The above figure shows a simplified version of the scientific method. It shows that whenever a supposition is supported by some data, it is termed a hypothesis. Once a hypothesis is confirmed by some well-known and widely accepted experiment or proof, it becomes a law. If the hypothesis is rejected by some contradictory results, then the ...

  20. Hypothesis Definition

    Types of Hypothesis. Hypotheses can be broadly classified into different types. They are: Simple Hypothesis. A simple hypothesis states that there exists a relationship between two variables: one is called the dependent variable, and the other is called the independent variable. Complex Hypothesis.

  21. 10.2: Null and Alternative Hypotheses

    The null hypothesis (H0) is a statement about the population that either is believed to be true or is used to put forth an argument unless it can be shown to be incorrect beyond a reasonable doubt. The alternative hypothesis (Ha) is a claim about the population that is contradictory to H0.

  24. 9.5: Additional Information and Full Hypothesis Test Examples

    This makes the data analyst use judgment rather than mindlessly applying rules. The following examples illustrate a left-, right-, and two-tailed test. Example 9.5.1: H0: μ = 5, Ha: μ < 5. Test of a single population mean. Ha tells you the test is left-tailed. The picture of the p-value is as follows: Figure 9.5.1.
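
    For the left-tailed example H0: μ = 5 vs Ha: μ < 5, the p-value is the probability, under H0, of seeing a test statistic at least as far into the left tail as the one observed. A minimal stdlib sketch, using a z statistic and invented sample numbers:

```python
import math

def left_tailed_p(sample_mean, mu0, sd, n):
    """p-value for a left-tailed z test of H0: mu = mu0 vs Ha: mu < mu0."""
    z = (sample_mean - mu0) / (sd / math.sqrt(n))
    # Left-tail probability P(Z <= z) from the standard normal CDF
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical sample: mean 4.6, sd 1.0, n = 25 gives z = (4.6 - 5) / 0.2 = -2.0
p = left_tailed_p(sample_mean=4.6, mu0=5.0, sd=1.0, n=25)
print(round(p, 4))   # 0.0228 -> evidence, at the 5% level, that mu < 5
```

    A right-tailed test would use 1 − Φ(z) instead, and a two-tailed test doubles the smaller tail probability.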
