Null hypothesis


Null hypothesis n., plural: null hypotheses [nʌl haɪˈpɒθɪsɪs] Definition: a hypothesis that is presumed true until invalidated by a statistical test


Null Hypothesis Definition

Informally, the null hypothesis is defined as “the commonly accepted fact (such as the sky is blue) that researchers aim to reject or nullify”.

More formally, a null hypothesis can be defined as “a statistical theory suggesting that no statistical relationship exists between given observed variables”.

In biology, the null hypothesis is used to nullify or reject a common belief: the researcher carries out research aimed at rejecting the commonly accepted belief.

What Is a Null Hypothesis?

A hypothesis is a theory or assumption based on inadequate evidence; it requires further experiments and testing for confirmation. Further experimentation can show a hypothesis to be either true or false (Blackwelder, 1982).

For example, Susie assumes that mineral water helps plants grow better than distilled water. To test this hypothesis, she runs an experiment for about a month, watering some plants with mineral water and others with distilled water.

When a hypothesis states that there is no statistically significant relationship between two variables, it is a null hypothesis. The investigator tries to disprove such a hypothesis. In the plant example above, the null hypothesis is:

There is no statistical relationship between the form of water given to plants and their growth and nourishment.

Usually, an investigator tries to prove the null hypothesis wrong, i.e., to establish a relationship or association between the two variables.

The opposite of the null hypothesis is known as the alternative hypothesis. In the plant example, the alternative hypothesis is:

There is a statistical relationship between the form of water given to plants and their growth and nourishment.

The example below shows the difference between null vs alternative hypotheses:

Alternative Hypothesis: The world is round.
Null Hypothesis: The world is not round.

Copernicus and many other scientists tried to prove such a null hypothesis wrong. Through experiment and observation, they convinced people that the alternative hypothesis was correct. Had they not experimentally disproved the null hypothesis, people would never have believed them or accepted the alternative hypothesis as true.

The null and alternative hypotheses for Susie’s assumption are:

  • Null Hypothesis: If one plant is watered with distilled water and the other with mineral water, then there is no difference in the growth and nourishment of these two plants.
  • Alternative Hypothesis:  If one plant is watered with distilled water and the other with mineral water, then the plant with mineral water shows better growth and nourishment.

The null hypothesis states that there is no significant statistical relationship. The relationship in question can involve a single set of variables or two sets of variables.

Most people consider the null hypothesis true until proven otherwise. Scientists perform experiments and carry out research in order to disprove, or nullify, the null hypothesis; for this purpose, they design an alternative hypothesis that they believe is correct. The null hypothesis symbol is H 0 (read as “H null” or “H zero”).

Why is it named the “Null”?

The name “null” clarifies that scientists are working to prove the hypothesis false, i.e., to nullify it. The name sometimes confuses readers, who take it to mean the statement is empty or blank; it is not. “Nullifiable hypothesis” would arguably be a more apt name.

Why do we need to test it? Why not just verify the alternative one?

Science proceeds by the scientific method, a series of steps scientists follow so that a hypothesis can be shown false or true, and so that limitations or inadequacies of a new hypothesis are revealed. Considering both the null and the alternative hypothesis makes the research sound. Omitting the null hypothesis reflects badly on a study: it suggests the researchers are not taking the research seriously and simply want to impose their results as true.

Development of the Null

In statistics, the first step is to formulate the null and alternative hypotheses from the given problem. Splitting the problem into small steps makes the path toward a solution easier. So how do you write a null hypothesis?

Writing a null hypothesis consists of two steps:

  • First, start by asking a question.
  • Second, restate the question so that it asserts there is no relationship among the variables.

In other words, assume that the treatment has no effect.

The usual recovery duration after knee surgery is considered to be about 8 weeks.

A researcher thinks the recovery period may be longer if patients go to a physiotherapist for rehabilitation twice per week instead of three times per week; in other words, the recovery duration is shorter if the patient goes to rehabilitation three times instead of twice.

Step 1: Identify the hypothesis in the problem. The hypothesis can be a phrase or a statement. In the above example, the hypothesis is:

“The expected recovery period in knee rehabilitation is more than 8 weeks”

Step 2: Turn the hypothesis into a mathematical statement. The average can be represented by μ, so the hypothesis becomes:

H 1 : μ > 8

In the above statement, the hypothesis is denoted by H 1 , the average by μ, and > indicates that the average is greater than eight.

Step 3: State what follows if the hypothesis does not hold, i.e., the rehabilitation period does not exceed 8 weeks. In that case the recovery duration is less than or equal to 8 weeks:

H 0 : μ ≤ 8

In the above statement, the null hypothesis is denoted by H 0 , the average by μ, and ≤ indicates that the average is less than or equal to eight.

What happens if the scientist has no prior knowledge of the outcome?

Problem: An investigator studies the post-operative impact of vigorous exercise on patients who have had knee surgery. The exercise may either improve recovery or make it worse. The usual recovery time is 8 weeks.

Step 1: Form the null hypothesis, i.e., the exercise has no effect and the recovery time remains about 8 weeks.

H 0 : μ = 8

In the above statement, the null hypothesis is denoted by H 0 , the average by μ, and the equal sign (=) indicates that the average is equal to eight.

Step 2: Form the alternative hypothesis, the reverse of the null hypothesis; in particular, what happens if the treatment (exercise) does have an impact?

H 1 : μ ≠ 8

In the above statement, the alternative hypothesis is denoted by H 1 , the average by μ, and the not-equal sign (≠) indicates that the average is not equal to eight.
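As an illustration (not from the source), the two-sided setup above can be sketched in Python. The recovery times and the population standard deviation below are hypothetical, and a z-test with known σ stands in for whatever test a researcher would actually choose:

```python
import math

def z_test_two_sided(sample, mu0, sigma):
    """Two-sided z-test of H0: mu = mu0, assuming a known population sigma."""
    n = len(sample)
    mean = sum(sample) / n
    z = (mean - mu0) / (sigma / math.sqrt(n))
    # Two-sided p-value from the standard normal CDF: 2 * Pr(Z >= |z|)
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# Hypothetical recovery times (weeks) for patients in the exercise program
recovery = [7.5, 8.2, 6.9, 7.8, 8.0, 7.2, 7.6, 7.4]
z, p = z_test_two_sided(recovery, mu0=8, sigma=0.8)
```

Here a large p-value means the data are consistent with H 0 : μ = 8, so the null hypothesis is not rejected.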

Significance Tests

To obtain a reasonable probabilistic interpretation of the data, a significance test is performed. The null hypothesis itself contains no data; it is a statement about numerical characteristics of the population. These can take different forms, such as means or proportions, differences of means or proportions, or an odds ratio.

The P-value is the chief statistical result of a significance test of the null hypothesis. In symbols:

  • P-value = Pr(data or data more extreme | H 0 true)
  • | = “given”
  • Pr = probability
  • H 0 = the null hypothesis

The first stage of Null Hypothesis Significance Testing (NHST) is to formulate the null and alternative hypotheses; this briefly states the research question.

Null Hypothesis = no effect of treatment, no difference, no association
Alternative Hypothesis = effective treatment, difference, association

When to reject the null hypothesis?

Researchers reject the null hypothesis if it is proven wrong by experimentation; until then, they accept it as true. Meanwhile, they try to strengthen the alternative hypothesis. A binomial test on a sample proceeds through the following series of steps (Frick, 1995).

Step 1: Read the research question carefully and form the null hypothesis. Verify that the sample supports a binomial proportion and, assuming no difference, determine the value of the binomial parameter.

Show the null hypothesis as:

H 0 : p = the value of p if H 0 is true

Then calculate the sample proportion to see how far it deviates from the value proposed by the null hypothesis.

Step 2: Build the test statistic for the binomial test under the null hypothesis. The test must be based on exact probabilities. Also identify the probability mass function (pmf) that applies when the null hypothesis is true:

When H 0 is true, X ~ b(n, p)

n = size of the sample

p = assumed value of the proportion if H 0 is true

Step 3: Find the P-value, i.e., the probability, under the null hypothesis, of data at least as extreme as that observed, where x is the observed number of successes:

For an increase: P-value = Pr(X ≥ x)

For a decrease: P-value = Pr(X ≤ x)

Step 4: Report the findings in a descriptive, detailed way, including:

  • Sample proportion
  • The direction of difference (either increases or decreases)
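The binomial steps above can be sketched in Python. The counts used here (20 trials, 18 successes, H 0 : p = 0.5) are hypothetical; the exact tail probability is summed directly from the binomial pmf:

```python
from math import comb

def binomial_p_value(n, p0, x, direction="greater"):
    """Exact one-sided binomial-test p-value under H0: p = p0.

    direction="greater" gives Pr(X >= x); direction="less" gives Pr(X <= x).
    """
    # Binomial pmf for every possible count k = 0..n under H0
    pmf = [comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(n + 1)]
    if direction == "greater":
        return sum(pmf[x:])
    return sum(pmf[: x + 1])

# Hypothetical example: 18 successes in 20 trials when H0 says p = 0.5
p_val = binomial_p_value(20, 0.5, 18, direction="greater")
```

A p-value this small would lead to rejecting H 0 at any conventional α-level.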

Perceived Problems With the Null Hypothesis

Variable or model selection and, in some cases, limited information are the chief issues affecting null hypothesis testing. Statistical tests of the null hypothesis are not particularly strong, and there is arbitrariness in what counts as significant (Gill, 1999). A further issue is that, strictly speaking, virtually all null hypotheses are false at some level.

There is another problem with the α-level. This is a well-known but often ignored issue: the α-level has no theoretical basis, so the conventional values (most commonly 0.1, 0.05, or 0.01) are arbitrary. Using a fixed value of α splits results into two categories (significant and non-significant). This arbitrary rejection or non-rejection is a practical problem when what matters is the strength of the evidence on a scientific question.

The P-value is of foremost importance in null hypothesis testing, but as an inferential tool it is problematic. The P-value is the probability of obtaining a test statistic at least as extreme as the one observed. The key point in this definition is that the P-value is not based on the observed results alone: unobserved, more extreme results also enter the calculation, which overstates the evidence against the null hypothesis. The P-value is therefore not a precise statement about the evidence from the observed data. For these reasons, some researchers find P-values objectionable and prefer not to use them in testing. The P-value also depends strictly on the null hypothesis: in some carefully designed experiments the null-hypothesis statistic and the actual sampling distribution are closely related, but this is rarely the case in observational studies.

Some researchers have pointed out that the P-value depends on the sample size: even when the true difference is small, the null hypothesis may be rejected in a large sample. This highlights the difference between biological importance and statistical significance (Killeen, 2005).

Another issue is the fixed α-level, e.g., 0.1. Depending on the α-level, the null hypothesis in a large sample may be accepted or rejected. Even if the sample size tends to infinity and the null hypothesis is true, there remains a chance of a Type I error, which is why this approach is not considered fully consistent and reliable. A further problem is that exact information about the size and precision of the estimated effect is not conveyed. The remedy is to state the size of the effect and its precision.

Null Hypothesis Examples

Here are some examples:

Example 1: Hypotheses with One Sample of One Categorical Variable

Among the human population, roughly 10% of people are left-handed. Suppose a researcher at Penn State claims that students in the College of Arts and Architecture are more likely to be left-handed than the general population. Here there is a single sample, and the sample proportion is compared to a known population value.

  • Research Question: Are art students more likely to be left-handed than people in the general population?
  • Response Variable: Classification of each student into one of two categories: left-handed or right-handed.
  • Null Hypothesis: Students of the College of Arts and Architecture are no more likely to be left-handed than the general population (the proportion of left-handed students in the college population is 10%, or p = 0.10).

Example 2: Hypotheses with One Sample of One Measurement Variable

A generic brand of the antihistamine diphenhydramine is sold in capsules with a 50 mg dose. The manufacturer is concerned that the machine has gone out of calibration and is no longer producing capsules with the appropriate dose.

  • Research Question: Does the data suggest that the population mean dosage differs from 50 mg?
  • Response Variable: A chemical assay used to measure the dose of the active ingredient.
  • Null Hypothesis: Capsules of this brand contain the usual 50 mg dose (population mean dosage = 50 mg).

Example 3: Hypotheses with Two Samples of One Categorical Variable

Many people choose vegetarian meals on a daily basis. The researcher suspects that females choose vegetarian meals more often than males.

  • Research Question: Does the data suggest that females prefer vegetarian meals more than males on a regular basis?
  • Response Variable: Classification of each person as vegetarian or non-vegetarian. Grouping Variable: Gender
  • Null Hypothesis: Gender is not related to regularly eating vegetarian meals (population percentage of women who regularly eat vegetarian meals = population percentage of men who do, or p women = p men).

Example 4: Hypotheses with Two Samples of One Measurement Variable

Obesity and excess weight are among today’s major health issues. A study is performed to test whether a low-carbohydrate diet leads to faster weight loss than a low-fat diet.

  • Research Question: Does the data suggest that a low-carbohydrate diet generally helps people lose weight faster than a low-fat diet?
  • Response Variable: Weight loss (pounds)
  • Explanatory Variable: Type of diet, either low-carbohydrate or low-fat
  • Null Hypothesis: There is no difference in mean weight loss between people on a low-carbohydrate diet and people on a low-fat diet (population mean weight loss on a low-carbohydrate diet = population mean weight loss on a low-fat diet).

Example 5: Hypotheses about the relationship between Two Categorical Variables

A case-control study was performed with stroke patients and controls, all nonsmokers, matched by occupation and age. The subjects were asked whether someone in their home or close surroundings smokes.

  • Research Question: Does second-hand smoke increase the chance of stroke?
  • Variables: There are two categorical variables: subject group (stroke patients vs. controls) and exposure (whether a smoker lives in the same house). The claim is that living with a smoker increases the chance of having a stroke.
  • Null Hypothesis: There is no relationship between passive smoking and stroke (the odds ratio between stroke and passive smoking equals 1).
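As a sketch of how this null hypothesis is quantified, the sample odds ratio can be computed from a 2×2 table; the counts below are hypothetical, not from any study:

```python
def odds_ratio(a, b, c, d):
    """Sample odds ratio for a 2x2 table:
                exposed  unexposed
    cases          a         b
    controls       c         d
    Under the null hypothesis the population odds ratio equals 1.
    """
    return (a * d) / (b * c)

# Hypothetical counts: stroke patients and controls by passive-smoke exposure
or_hat = odds_ratio(40, 60, 25, 75)
```

An estimated odds ratio well above 1 would then be tested against the null value of 1.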

Example 6: Hypotheses about the relationship between Two Measurement Variables

A financial expert suspects a positive relationship between the variation in stock price and the quantity of stock bought by non-management employees.

  • Response Variable: Daily change in price
  • Explanatory Variable: Stock bought by non-management employees
  • Null Hypothesis: The correlation between the daily stock price change ($) and the daily stock purchases by non-management employees ($) = 0.

Example 7: Hypotheses about comparing the relationship between Two Measurement Variables in Two Samples

  • Research Question: Is the relationship between the bill paid in a restaurant and the tip given to the waiter linear? Is this relationship different for fine-dining and family restaurants?
  • Explanatory Variable: Total bill amount
  • Response Variable: Amount of the tip
  • Null Hypothesis: The relationship between the total bill amount and the tip is the same at family and fine-dining restaurants.


  • Blackwelder, W. C. (1982). “Proving the null hypothesis” in clinical trials. Controlled Clinical Trials , 3(4), 345–353.
  • Frick, R. W. (1995). Accepting the null hypothesis. Memory & Cognition, 23(1), 132–138.
  • Gill, J. (1999). The insignificance of null hypothesis significance testing. Political Research Quarterly , 52(3), 647–674.
  • Killeen, P. R. (2005). An alternative to null-hypothesis significance tests. Psychological Science, 16(5), 345–353.

©BiologyOnline.com. Content provided and moderated by Biology Online Editors.

Last updated on June 16th, 2022


Statology


How to Write a Null Hypothesis (5 Examples)

A hypothesis test uses sample data to determine whether or not some claim about a population parameter is true.

Whenever we perform a hypothesis test, we always write a null hypothesis and an alternative hypothesis, which take the following forms:

H 0 (Null Hypothesis): Population parameter =,  ≤, ≥ some value

H A  (Alternative Hypothesis): Population parameter <, >, ≠ some value

Note that the null hypothesis always contains the equal sign.

We interpret the hypotheses as follows:

Null hypothesis: The sample data provides no evidence to support some claim being made by an individual.

Alternative hypothesis: The sample data  does provide sufficient evidence to support the claim being made by an individual.

For example, suppose it’s assumed that the average height of a certain species of plant is 20 inches tall. However, one botanist claims the true average height is greater than 20 inches.

To test this claim, she may go out and collect a random sample of plants. She can then use this sample data to perform a hypothesis test using the following two hypotheses:

H 0 : μ ≤ 20 (the true mean height of plants is less than or equal to 20 inches)

H A : μ > 20 (the true mean height of plants is greater than 20 inches)

If the sample data gathered by the botanist shows that the mean height of this species of plants is significantly greater than 20 inches, she can reject the null hypothesis and conclude that the mean height is greater than 20 inches.
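A minimal sketch of the botanist's test, with hypothetical sample heights and an assumed known population standard deviation (so a one-sided z-test can stand in for the t-test she would likely use in practice):

```python
import math

def z_test_greater(sample, mu0, sigma):
    """One-sided z-test of H0: mu <= mu0 vs HA: mu > mu0, with known sigma assumed."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    p = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))  # Pr(Z >= z) under H0
    return z, p

# Hypothetical plant heights (inches) from the botanist's random sample
heights = [21.5, 22.1, 20.8, 21.9, 22.4, 21.2, 20.9, 21.7, 22.0, 21.3]
z, p = z_test_greater(heights, mu0=20, sigma=1.5)
reject_h0 = p < 0.05  # reject H0: mu <= 20 in favor of HA: mu > 20
```

With these made-up data the p-value falls below 0.05, so the null hypothesis would be rejected.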

Read through the following examples to gain a better understanding of how to write a null hypothesis in different situations.

Example 1: Weight of Turtles

A biologist wants to test whether or not the true mean weight of a certain species of turtles is 300 pounds. To test this, he goes out and measures the weight of a random sample of 40 turtles.

Here is how to write the null and alternative hypotheses for this scenario:

H 0 : μ = 300 (the true mean weight is equal to 300 pounds)

H A : μ ≠ 300 (the true mean weight is not equal to 300 pounds)

Example 2: Height of Males

It’s assumed that the mean height of males in a certain city is 68 inches. However, an independent researcher believes the true mean height is greater than 68 inches. To test this, he goes out and collects the height of 50 males in the city.

H 0 : μ ≤ 68 (the true mean height is less than or equal to 68 inches)

H A : μ > 68 (the true mean height is greater than 68 inches)

Example 3: Graduation Rates

A university states that 80% of all students graduate on time. However, an independent researcher believes that less than 80% of all students graduate on time. To test this, she collects data on the proportion of students who graduated on time last year at the university.

H 0 : p ≥ 0.80 (the true proportion of students who graduate on time is 80% or higher)

H A : p < 0.80 (the true proportion of students who graduate on time is less than 80%)

Example 4: Burger Weights

A food researcher wants to test whether or not the true mean weight of a burger at a certain restaurant is 7 ounces. To test this, he goes out and measures the weight of a random sample of 20 burgers from this restaurant.

H 0 : μ = 7 (the true mean weight is equal to 7 ounces)

H A : μ ≠ 7 (the true mean weight is not equal to 7 ounces)

Example 5: Citizen Support

A politician claims that less than 30% of citizens in a certain town support a certain law. To test this, he goes out and surveys 200 citizens on whether or not they support the law.

H 0 : p ≥ 0.30 (the true proportion of citizens who support the law is greater than or equal to 30%)

H A : p < 0.30 (the true proportion of citizens who support the law is less than 30%)

Additional Resources

  • Introduction to Hypothesis Testing
  • Introduction to Confidence Intervals
  • An Explanation of P-Values and Statistical Significance


Published by Zach



Hypothesis Examples

A hypothesis is a prediction of the outcome of a test. It forms the basis for designing an experiment in the scientific method. A good hypothesis is testable, meaning it makes a prediction you can check with observation or experimentation. Here are different hypothesis examples.

Null Hypothesis Examples

The null hypothesis (H 0 ) is also known as the zero-difference or no-difference hypothesis. It predicts that changing one variable ( independent variable ) will have no effect on the variable being measured ( dependent variable ). Here are null hypothesis examples:

  • Plant growth is unaffected by temperature.
  • Increasing temperature has no effect on the solubility of salt.
  • Incidence of skin cancer is unrelated to ultraviolet light exposure.
  • All brands of light bulb last equally long.
  • Cats have no preference for the color of cat food.
  • All daisies have the same number of petals.

Sometimes the null hypothesis is stated even when there is a suspected correlation between two variables. For example, if you think plant growth is affected by temperature, you state the null hypothesis: “Plant growth is not affected by temperature.” Why do this, rather than saying “If you change temperature, plant growth will be affected”? Because it is easier to apply a statistical test that shows, with a high level of confidence, that a null hypothesis is correct or incorrect.
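One way to test such a null hypothesis without distributional assumptions is a permutation test. The sketch below uses hypothetical growth measurements; it estimates how often a difference in group means at least as large as the observed one arises when the group labels are shuffled, i.e., when the null hypothesis of "no temperature effect" is imposed:

```python
import random

def permutation_p_value(group_a, group_b, n_perm=10_000, seed=0):
    """Two-sided permutation test of H0: no difference in group means."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling consistent with H0
        diff = abs(sum(pooled[:n_a]) / n_a
                   - sum(pooled[n_a:]) / (len(pooled) - n_a))
        if diff >= observed:
            count += 1
    return count / n_perm

# Hypothetical plant growth (cm) at two temperatures
warm = [5.1, 5.8, 6.0, 5.5, 6.2]
cool = [4.2, 4.6, 4.9, 4.4, 4.8]
p = permutation_p_value(warm, cool)
```

A small p-value here means the observed difference would be rare if temperature truly had no effect, so the null hypothesis would be rejected.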

Research Hypothesis Examples

A research hypothesis (H 1 ) is a type of hypothesis used to design an experiment. This type of hypothesis is often written as an if-then statement because it makes it easy to identify the independent and dependent variables and see how one affects the other. If-then statements explore cause and effect. In other cases, the hypothesis expresses a correlation between two variables. Here are some research hypothesis examples:

  • If you leave the lights on, then it takes longer for people to fall asleep.
  • If you refrigerate apples, they last longer before going bad.
  • If you keep the curtains closed, then you need less electricity to heat or cool the house (the electric bill is lower).
  • If you leave a bucket of water uncovered, then it evaporates more quickly.
  • Goldfish lose their color if they are not exposed to light.
  • Workers who take vacations are more productive than those who never take time off.

Is It Okay to Disprove a Hypothesis?

Yes! You may even choose to write your hypothesis in such a way that it can be disproved, because it’s easier to prove a statement wrong than to prove it right. And if your prediction turns out to be incorrect, that doesn’t mean the science is bad: revising a hypothesis is common, and it demonstrates you learned something you did not know before you conducted the experiment.




Null & Alternative Hypotheses | Definitions, Templates & Examples

Published on May 6, 2022 by Shaun Turney . Revised on June 22, 2023.

The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test :

  • Null hypothesis ( H 0 ): There’s no effect in the population .
  • Alternative hypothesis ( H a or H 1 ) : There’s an effect in the population.

Table of contents

  • Answering your research question with hypotheses
  • What is a null hypothesis?
  • What is an alternative hypothesis?
  • Similarities and differences between null and alternative hypotheses
  • How to write null and alternative hypotheses
  • Other interesting articles
  • Frequently asked questions

The null and alternative hypotheses offer competing answers to your research question . When the research question asks “Does the independent variable affect the dependent variable?”:

  • The null hypothesis ( H 0 ) answers “No, there’s no effect in the population.”
  • The alternative hypothesis ( H a ) answers “Yes, there is an effect in the population.”

The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample . Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample. It’s critical for your research to write strong hypotheses .

You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.


The null hypothesis is the claim that there’s no effect in the population.

If the sample provides enough evidence against the claim that there’s no effect in the population ( p ≤ α), then we can reject the null hypothesis . Otherwise, we fail to reject the null hypothesis.

Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept. Be careful not to say you “prove” or “accept” the null hypothesis.

Null hypotheses often include phrases such as “no effect,” “no difference,” or “no relationship.” When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).

You can never know with complete certainty whether there is an effect in the population. Some percentage of the time, your inference about the population will be incorrect. When you incorrectly reject the null hypothesis, it’s called a type I error . When you incorrectly fail to reject it, it’s a type II error.
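The Type I error rate can be illustrated by simulation (an illustration added here, not from the source): when the null hypothesis really is true, a test at level α should incorrectly reject about a fraction α of the time. A sketch with simulated data:

```python
import math
import random

def simulate_type_i_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Estimate the Type I error rate of a two-sided z-test when H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: the population mean really is 0 (sigma = 1)
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) / (1 / math.sqrt(n))
        p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
        if p <= alpha:
            rejections += 1  # a Type I error: rejecting a true H0
    return rejections / trials

rate = simulate_type_i_rate()
```

The estimated rejection rate comes out close to the nominal α of 0.05, as the theory predicts.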

Examples of null hypotheses

The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.

*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p 1 = p 2 .

The alternative hypothesis ( H a ) is the other answer to your research question . It claims that there’s an effect in the population.

Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.

The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.

Alternative hypotheses often include phrases such as “an effect,” “a difference,” or “a relationship.” When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes < or >). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.

Examples of alternative hypotheses

The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.

Null and alternative hypotheses are similar in some ways:

  • They’re both answers to the research question.
  • They both make claims about the population.
  • They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the following table.


To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.

General template sentences

The only things you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:

Does independent variable affect dependent variable ?

  • Null hypothesis ( H 0 ): Independent variable does not affect dependent variable.
  • Alternative hypothesis ( H a ): Independent variable affects dependent variable.

Test-specific template sentences

Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.

Note: The template sentences above can be phrased for either one-tailed or two-tailed tests. Two-tailed tests (using ≠) are the more common default; use a one-tailed form only when you have a directional prediction you can justify in advance.

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Normal distribution
  • Descriptive statistics
  • Measures of central tendency
  • Correlation coefficient

Methodology

  • Cluster sampling
  • Stratified sampling
  • Types of interviews
  • Cohort study
  • Thematic analysis

Research bias

  • Implicit bias
  • Cognitive bias
  • Survivorship bias
  • Availability heuristic
  • Nonresponse bias
  • Regression to the mean

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Null and alternative hypotheses are used in statistical hypothesis testing . The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).

The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (“ x affects y because …”).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses . In a well-designed study , the statistical hypotheses correspond logically to the research hypothesis.

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

Turney, S. (2023, June 22). Null & Alternative Hypotheses | Definitions, Templates & Examples. Scribbr. Retrieved April 9, 2024, from https://www.scribbr.com/statistics/null-and-alternative-hypotheses/


  • Open access
  • Published: 29 August 2015

Defining the null hypothesis

  • Emma Saxon 1  

BMC Biology volume  13 , Article number:  68 ( 2015 ) Cite this article


Virus B is a newly emerged viral strain for which there is no current treatment. Drug A was identified as a potential treatment for infection with virus B. In this pre-clinical phase of drug testing, the effects of drug A on survival after infection with virus B were tested. There was no difference in survival between control (dark blue) and drug A-treated, virus B-infected mice (green), but a significant difference in survival between control and virus B-infected mice without drug treatment (light blue, z-test for proportions P < 0.05, n = 30 in each group). The authors therefore concluded that drug A is effective in reducing mouse mortality due to virus B.

Some studies report conclusions based on a null hypothesis different from the one that is actually tested. In this example, the authors tested the effect of a novel antiviral drug on mouse survival 7 days after infection with a virus. The virus alone reduced mouse survival (the light blue bar in Fig.  1 , z-test P  < 0.05), but there was no significant difference between uninfected, untreated control mice (dark blue) and infected, drug A-treated mice (green), so the authors concluded that the drug significantly increased the survival time of infected mice.

The effects of drug A on the relative survival of mice infected with virus B. Relative survival is significantly decreased in infected mice (light blue), but not in infected mice treated with drug A (green), compared with the control (dark blue); n = 30, z-test for proportions * P  < 0.05. n/s not significant

But the statistical test used to support the claim was applied inappropriately. In order to conclude that the drug increased the survival of infected mice, the authors would have had to compare infected treated mice (green) with infected untreated mice (light blue), and not with uninfected mice (dark blue). Their results do show that the survival of virus-infected mice was significantly lower than that of uninfected control mice, by 20%. But the difference between infected untreated and infected treated mice (the light blue versus green bars in Fig. 1, the correct comparison for testing the drug effect) is only 10%: as the non-significant difference in survival between uninfected control (dark blue) and infected drug-treated mice (green) was also 10%, it, too, will be non-significant. In this case, the data support the null hypothesis, contrary to the authors’ conclusions.
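
The distinction between the two comparisons can be made concrete with a pooled two-proportion z-test. This is a sketch, not the study's actual code, and the counts below are hypothetical numbers chosen only to match the percentages described: control 28/30, infected drug-treated 25/30 (10% lower), infected untreated 22/30 (20% lower).

```python
from math import erf, sqrt

def two_prop_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                      # proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# The comparison the authors made (uninfected control vs. infected treated):
z_wrong, p_wrong = two_prop_z(28, 30, 25, 30)
# The comparison that actually tests the drug (infected treated vs. infected untreated):
z_drug, p_drug = two_prop_z(25, 30, 22, 30)
print(p_wrong > 0.05, p_drug > 0.05)  # both 10% gaps are non-significant at n = 30
```

With n = 30 per group, neither 10-percentage-point difference approaches significance, which is the editorial's point: the correct comparison supports the null hypothesis.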

Note also that the effects are not large — the majority of infected animals survive — and that with 30 animals in each group the differences amount to six animals at most between the groups. This makes it difficult to know realistically what to make of the results. To address this problem, the authors would need to increase the power of their study by using larger sample sizes, which would show whether there is a significant increase in survival with drug treatment or not.
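
The sample-size problem can be quantified with the standard normal-approximation formula for comparing two proportions. The survival rates below are hypothetical stand-ins for a 10-percentage-point difference like the one described; the formula and critical values (1.96 for two-sided α = 0.05, 0.84 for 80% power) are conventional, not taken from the article.

```python
from math import ceil, sqrt

def n_per_group(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Approximate per-group sample size to detect p1 vs. p2
    (two-sided alpha = 0.05, power = 0.80, normal approximation)."""
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

needed = n_per_group(0.83, 0.73)
print(needed)  # hundreds of animals per group, far more than the 30 used
```

The result makes clear why 30 animals per group is badly underpowered for detecting an effect of this size.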

Indeed, UK funding agencies recently changed their animal experimental guidelines to reflect growing concerns that sample size is commonly too small in studies like this, which therefore may not have sufficient statistical power to detect real differences [ 1 ]. Appropriate sample sizes can be calculated based on the study design, and new tools are being developed to help researchers with this: one example is the Experimental Design Assistant, from the National Centre for the Replacement, Refinement & Reduction of Animals in Research [ 2 ], expected to launch later in 2015.

References

1. Cressey D. UK funders demand strong statistics for animal studies. Nature. 2015;520:271–2.

2. Experimental Design Assistant. https://www.nc3rs.org.uk/experimental-design-assistant-eda


Author information

Authors and affiliations.

BMC Biology, BioMed Central, 236 Gray’s Inn Road, London, WC1X 8HB, UK


Corresponding author

Correspondence to Emma Saxon .

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

Cite this article.

Saxon, E. Defining the null hypothesis. BMC Biol 13 , 68 (2015). https://doi.org/10.1186/s12915-015-0181-x



BMC Biology

ISSN: 1741-7007


9.1 Null and Alternative Hypotheses

The actual test begins by considering two hypotheses . They are called the null hypothesis and the alternative hypothesis . These hypotheses contain opposing viewpoints.

H0, the null hypothesis: a statement of no difference between sample means or proportions, or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

Ha, the alternative hypothesis: a claim about the population that is contradictory to H0 and what we conclude when we reject H0.

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options: reject H0 if the sample information favors the alternative hypothesis, or do not reject H0 (decline to reject H0) if the sample information is insufficient to reject the null hypothesis.

Mathematical Symbols Used in H 0 and H a :

H0 always has a symbol with an equal in it (=, ≤, or ≥). Ha never has a symbol with an equal in it (≠, <, or >). The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.

Example 9.1

H0: No more than 30 percent of the registered voters in Santa Clara County voted in the primary election (p ≤ 0.30). Ha: More than 30 percent of the registered voters in Santa Clara County voted in the primary election (p > 0.30).

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25 percent. State the null and alternative hypotheses.

Example 9.2

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are the following: H 0 : μ = 2.0 H a : μ ≠ 2.0
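
A test like Example 9.2 can be sketched in code. This uses a large-sample z approximation and a made-up GPA sample (for a small sample like this, a t-test would normally be preferred); none of the numbers come from the textbook.

```python
from math import erf, sqrt
from statistics import mean, stdev

def one_sample_z(sample, mu0):
    """z test of H0: mu = mu0 against the two-sided Ha: mu != mu0."""
    z = (mean(sample) - mu0) / (stdev(sample) / sqrt(len(sample)))
    p_two_sided = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_two_sided

# Hypothetical GPA sample
gpas = [2.3, 1.9, 2.8, 2.1, 3.0, 2.4, 1.7, 2.6, 2.2, 2.5,
        2.9, 2.0, 2.7, 2.3, 3.1, 1.8, 2.4, 2.6, 2.2, 2.8]
z, p = one_sample_z(gpas, 2.0)
print(p < 0.05)  # this sample's mean is well above 2.0, so H0 is rejected
```

Note the two-sided p-value: because Ha uses ≠, evidence in either direction away from 2.0 counts against H0.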

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 66
  • H a : μ __ 66

Example 9.3

We want to test if college students take fewer than five years to graduate from college, on the average. The null and alternative hypotheses are the following: H 0 : μ ≥ 5 H a : μ < 5

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 45
  • H a : μ __ 45

Example 9.4

An article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third of the students pass. The same article stated that 6.6 percent of U.S. students take advanced placement exams and 4.4 percent pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6 percent. State the null and alternative hypotheses. H 0 : p ≤ 0.066 H a : p > 0.066
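
The one-tailed test in Example 9.4 can be sketched as follows. The survey counts (90 of 1,000 students) are hypothetical; only the hypotheses H0: p ≤ 0.066 and Ha: p > 0.066 come from the example.

```python
from math import erf, sqrt

def one_prop_z_upper(x, n, p0):
    """One-proportion z-test against the upper-tailed alternative p > p0."""
    z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 1 - 0.5 * (1 + erf(z / sqrt(2)))  # upper-tail area only
    return z, p_value

z, p = one_prop_z_upper(90, 1000, 0.066)
print(p < 0.05)  # 9% observed vs. 6.6% under H0: reject H0 for this sample
```

Because the alternative is directional (p > 0.066), only the upper tail of the normal distribution contributes to the p-value.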

On a state driver’s test, about 40 percent pass the test on the first try. We want to test if more than 40 percent pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : p __ 0.40
  • H a : p __ 0.40

Collaborative Exercise

Bring to class a newspaper, some news magazines, and some internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.


Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Statistics
  • Publication date: Mar 27, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/statistics/pages/9-1-null-and-alternative-hypotheses

© Jan 23, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.


Ecology and Evolution, 11(11), June 2021

When are hypotheses useful in ecology and evolution?

Matthew G. Betts, Adam S. Hadley, David W. Frey, Sarah J. K. Frey, Dusty Gannon, Scott H. Harris, Urs G. Kormann, Kara Leimberger, Katie Moriarty, Joseph M. Northrup, Josée S. Rousseau, Thomas D. Stokely, Jonathon J. Valente, Diego Zárrate‐Charry

1 Forest Biodiversity Research Network, Department of Forest Ecosystems and Society, Oregon State University, Corvallis OR, USA

2 USDA Forest Service, Pacific Northwest Research Station, Corvallis OR, USA

3 Wildlife Research and Monitoring Section, Ontario Ministry of Natural Resources and Forestry, Environmental and Life Sciences Graduate Program, Trent University, Peterborough ON, Canada

Associated data

Data for the analysis of hypothesis use in ecology and evolution publications is available at https://figshare.com/articles/dataset/Betts_et_al_2021_When_are_hypotheses_useful_in_ecology_and_evolution_Ecology_and_Evolution/14110289 .

Research hypotheses have been a cornerstone of science since before Galileo. Many have argued that hypotheses (1) encourage discovery of mechanisms, and (2) reduce bias—both features that should increase transferability and reproducibility. However, we are entering a new era of big data and highly predictive models where some argue the hypothesis is outmoded. We hypothesized that hypothesis use has declined in ecology and evolution since the 1990s, given the substantial advancement of tools further facilitating descriptive, correlative research. Alternatively, hypothesis use may have become more frequent due to the strong recommendation by some journals and funding agencies that submissions have hypothesis statements. Using a detailed literature analysis ( N  = 268 articles), we found prevalence of hypotheses in eco–evo research is very low (6.7%–26%) and static from 1990–2015, a pattern mirrored in an extensive literature search ( N  = 302,558 articles). Our literature review also indicates that neither grant success nor citation rates were related to the inclusion of hypotheses, which may provide disincentive for hypothesis formulation. Here, we review common justifications for avoiding hypotheses and present new arguments based on benefits to the individual researcher. We argue that stating multiple alternative hypotheses increases research clarity and precision, and is more likely to address the mechanisms for observed patterns in nature. Although hypotheses are not always necessary, we expect their continued and increased use will help our fields move toward greater understanding, reproducibility, prediction, and effective conservation of nature.

We use a quantitative literature review to show that use of a priori hypotheses is still rare in the fields of ecology and evolution. We provide suggestions about the group and individual‐level benefits of hypothesis use.


1. INTRODUCTION

Why should ecologists have hypotheses? At the beginning of most science careers, there comes a time of “hypothesis angst” where students question the need for the hypothetico‐deductive approach their elders have deemed essential for good science. Why is it not sufficient to just have a research objective or question? Why can't we just collect observations and describe those in our research papers?

Research hypotheses are explanations for an observed phenomenon (Loehle,  1987 ; Wolff & Krebs,  2008 ) (see Box  1 ) and have been proposed as a central tool of science since Galileo and Francis Bacon in the mid‐1600s (Glass & Hall,  2008 ). Over the past century, there have been repeated calls for rigorous application of hypotheses in science, and arguments that hypothesis use is the cornerstone of the scientific method (Chamberlin,  1890 ; Popper,  1959 ; Romesburg,  1981 ). In a seminal paper in Science, Platt ( 1964 ) challenged all scientific fields to adopt and rigorously test multiple hypotheses (sensu Chamberlin,  1890 ), arguing that without such hypothesis tests, disciplines would be prone to “stamp collecting” (Landy,  1986 ). To constitute “strong inference,” Platt required the scientific method to be a three‐step process including (1) developing alternative hypotheses, (2) devising a set of “crucial” experiments to eliminate all but one hypothesis, and (3) performing the experiments (Elliott & Brook,  2007 ).

Definitions of hypotheses and associated terms

Hypothesis : An explanation for an observed phenomenon.

Research Hypothesis: A statement about a phenomenon that also includes the potential mechanism or cause of that phenomenon. Though a research hypothesis doesn't need to adhere to this strict framework, it is often best described as the “if” in an “if‐then” statement. In other words, “if X is true” (where X is the mechanism or cause for an observed phenomenon) “then Y” (where Y is the outcome of a crucial test that supports the hypothesis). These can also be thought of as “ mechanistic hypotheses ” since they link with a causal mechanism. For example, trees grow slowly at high elevation because of nutrient limitation (hypothesis); if this is the case, fertilizing trees should result in more rapid growth (prediction).

Prediction: The potential outcome of a test that would support a hypothesis. Most researchers call the second part of the if‐then statement a “prediction”.

Multiple alternative hypotheses: Multiple plausible explanations for the same phenomenon.

Descriptive Hypothesis: Descriptive statements or predictions with the word “hypothesis” in front of them. Typically researchers state their guess about the results they expect and call this the “hypothesis” (e.g., “I hypothesize trees at higher elevation will grow slowly”).

Statistical Hypothesis : A predicted pattern in data that should occur if a research hypothesis is true.

Null Hypothesis : A concise statement expressing the concept of “no difference” between a sample and the population mean.

The commonly touted strengths of hypotheses are two‐fold. First, by adopting multiple plausible explanations for a phenomenon (hereafter “ multiple alternative hypotheses ”; Box  1 ), a researcher reduces the chance that they will become attached to a single possibility, thereby biasing research in favor of this outcome (Chamberlin,  1890 ); this “confirmation bias” is a well‐known human trait (Loehle,  1987 ; Rosen,  2016 ) and likely decreases reproducibility (Munafò et al.,  2017 ). Second, various authors have argued that the a priori hypothesis framework forces one to think in advance about—and then test—various causes for patterns in nature (Wolff & Krebs,  2008 ), rather than simply examining the patterns themselves and coming up with explanations after the fact (so called “inductive research;” Romesburg,  1981 ). By understanding and testing mechanisms, science becomes more reliable and transferable (Ayres & Lombardero,  2017 ; Houlahan et al.,  2017 ; Sutherland et al.,  2013 ) (Figure  1 ). Importantly, both of these strengths should have strong, positive impacts on reproducibility of ecological and evolutionary studies (see Discussion).

Figure 1. Understanding mechanisms often increases model transferability. Panels (a and b) show snowshoe hares in winter and summer coloration, respectively. If a correlative (i.e., nonmechanistic) model for hare survival as a function of color was trained only on hares during the winter and then extrapolated into the summer months, it would perform poorly (white hares would die disproportionately under no‐snow conditions). On the other hand, a researcher testing mechanisms for hare survival would (ideally via experimentation) arrive at the conclusion that it is not the whiteness of hares, but rather blending with the background that confers survival (the “camouflage” hypothesis). Understanding mechanism results in model predictions being robust to novel conditions. Panel (c) shows x and y geographic locations of training (blue filled circles) and testing (blue open circles) locations for a hypothetical correlative model. Even if the model performs well on these independent test data (predicting open to closed circles), there is no guarantee that it will predict well outside of the spatial bounds of the existing data (red circles). Nonstationarity (in this case caused by a nonlinear relationship between predictor and response variable; panel d) could result in correlative relationships shifting substantially if extrapolated to new times or places. However, mechanistic hypotheses aimed at understanding the underlying factors driving the distribution of this species would be more likely to elucidate this nonlinear relationship. In both of these examples, understanding drivers behind ecological patterns—via testing mechanistic hypotheses—is likely to enhance model transferability.

However, we are entering a new era of ecological and evolutionary science that is characterized by massive datasets on genomes, species distributions, climate, land cover, and other remotely sensed information (e.g., bioacoustics, camera traps; Pettorelli et al.,  2017 ). Exceptional computing power and new statistical and machine‐learning algorithms now enable thousands of statistical models to be run in minutes. Such datasets and methods allow for pattern recognition at unprecedented spatial scales and for huge numbers of taxa and processes. Indeed, there have been recent arguments in both the scientific literature and popular press to do away with the traditional scientific method and a priori hypotheses (Glass & Hall,  2008 ; Golub,  2010 ). These arguments go something along the lines of “if we can get predictions right most of the time, why do we need to know the cause?”

In this paper, we sought to understand if hypothesis use in ecology and evolution has shifted in response to these pressures on the discipline. We, therefore, hypothesized that hypothesis use has declined in ecology and evolution since the 1990s, given the substantial advancement of tools further facilitating descriptive, correlative research (e.g., Cutler et al.,  2007 ; Elith et al.,  2008 ). We predicted that this decline should be particularly evident in the applied conservation literature—where the emergence of machine‐learning models has resulted in an explosion of conservation‐oriented species distribution models (Elith et al.,  2006 ). Our alternative hypothesis was that hypothesis use has become more frequent. The mechanism for such increases is that higher‐profile journals (e.g., Functional Ecology , Proceedings of the Royal Society of London Ser. B ) and competitive granting agencies (e.g., the U.S. National Science Foundation) now require or strongly encourage hypothesis statements.

As noted above, many have argued that hypotheses are useful and important for overall progress in science, because they facilitate the discovery of mechanisms, reduce bias, and increase reproducibility (Platt,  1964 ). However, for hypothesis use to be propagated among scientists, one would also expect hypotheses to confer benefits to the individual. We, therefore, tested whether hypothesis use was associated with individual‐level incentives relevant to academic success: publications, citations, and grants (Weinberg,  2010 ). If hypothesis use confers individual‐level advantages, then hypothesis‐based research should be (1) published in more highly ranked journals, (2) have higher citation rates, and (3) be supported by highly competitive funding sources.

Finally, we also present some common justifications for absence of hypotheses and suggest potential counterpoints researchers should consider prior to dismissing hypothesis use, including potential benefits to the individual researcher. We hope this communication provides practical recommendations for improving hypothesis use in ecology and evolution—particularly for new practitioners in the field (Box  2 ).

Recommendations for improving hypotheses use in ecology and evolution

Authors : Know that you are human and prone to confirmation bias and highly effective at false pattern recognition. Thus, inductive research and single working hypotheses should be rare in your research. Remember that if your work is to have a real “impact”, it needs to withstand multiple tests from other labs over the coming decades.

Editors and Reviewers : Reward research that is conducted using principles of sound scientific method. Be skeptical of research that smacks of data dredging, post hoc hypothesis development, and single hypotheses. If no hypotheses are stated in a paper and/or the paper is purely descriptive, ask whether the novelty of the system and question warrant this, or if the field would have been better served by a study with mechanistic hypotheses. If only single hypotheses are stated, ask whether appropriate precautions were taken for the researcher to avoid finding support for a pet idea (e.g., blinded experiments, randomized attribution of treatments, etc.). To paraphrase Platt ( 1964 ): beware of the person with only one method or one instrument, either experimental or theoretical.

Mentors : Encourage your advisees to think carefully about hypothesis use and teach them how to construct sound multiple, mechanistic hypotheses. Importantly, explain why hypotheses are important to the scientific method, the individual and group consequences of excluding them, and the rare instances where they may not be necessary.

Policymakers/media/educators/students/readers : Read scientific articles with skepticism; have a scrutinous eye out for single hypothesis studies and p‐hacking. Reward multi‐hypothesis, mechanistic, predictive science by giving it greater weight in policy decisions (Sutherland et al.,  2013 ), more coverage in the media, greater leverage in education, and more citations in reports.

2.1. Literature analysis

To examine hypothesis use over time and test whether hypothesis presence was associated with research type (basic vs. applied), journal impact factor, citation rates, and grants, we sampled the ecology and evolution literature using a stratified random sample of ecology and evolution journals in existence before 1991. First, we randomly selected 19 journals across impact factor (IF) strata ranging from 0.5–10.0 in two bins (<3 IF and ≥3 IF; see Figure  3 for full journal list). We then added three multidisciplinary journals that regularly publish ecology and evolution articles ( Proceedings of the National Academy of Sciences, Science, and Nature ). From this sample of 22 journals, we randomly selected ecology and evolution articles within 5‐year strata beginning in 1991 (3 articles/journal per 5‐year bin) to ensure the full date range was evenly sampled. We removed articles in the following categories: editorials, corrections, reviews, opinions, and methods papers. In multidisciplinary journals, we examined only ecology, evolution, and conservation biology articles, as indicated by section headers in each journal. Once selected, articles were randomly distributed to the authors of the current paper (hereafter “reviewers:” MGB, ASH, DF, SF, DG, SH, HK, UK, KL, KM, JN, BP, JSR, TSS, JV, DZC) for detailed examination. On rare occasions, an article was not found, or reviewers were not able to complete their review. Ultimately, our final sample comprised 268 articles.

[Figure 3; image file ECE3-11-5762-g004.jpg]

Frequency distributions showing the proportion of various hypothesis types across ecology and evolution journals included in our detailed literature search. Hypothesis use varied greatly across publication outlets. We considered J. Applied Ecology, J. Wildlife Management, J. Soil and Water Cons., Ecological Applications, Conservation Biology, and Biological Conservation to be applied journals; both applied and basic journals varied greatly in the prevalence of hypotheses

Reviewers were given a maximum of 10 min to find research hypothesis statements within the abstract or introduction of articles. We chose 10 min to simulate the amount of time that a journal editor pressed for time might spend evaluating the introductory material in an article. After this initial 10‐min period, we determined: (1) whether or not an article contained at least one hypothesis, (2) whether hypotheses were mechanistic or not (i.e., the authors claimed to examine the mechanism for an observed phenomenon), (3) whether multiple alternative hypotheses were considered (sensu Chamberlin,  1890 ), and (4) whether hypotheses were “descriptive” (that is, they did not explore a mechanism but simply stated the expected direction of an effect; we define this as a “prediction” [Box  1 ]). It is important to note that, under our protocol, articles did not need to contain the actual term “hypothesis” to be identified as having hypotheses; we also included articles using phrases such as “If X is true, we expected …” or “ we anticipated, ” both of which reflect a priori expectations about the data. We categorized each article as either basic (fundamental research without applications as a focus) or applied (a clear management or conservation focus). Finally, we also examined all articles for funding sources and noted the presence of a national or international‐level competitive grant (e.g., National Science Foundation, European Union, Natural Sciences and Engineering Research Council). We assumed that published articles would have fidelity to the hypotheses stated in the original grant proposals that funded the research; therefore, the acknowledgment of a successful grant is an indicator of financial reward for including hypotheses in initial proposals. Journal impact factors and individual article citation rates were gleaned directly from Web of Science.
We reasoned that many researchers seek out journals with higher impact factors for the first submission of their manuscripts (Paine & Fox,  2018 ). Our assumption was that studies with more careful experimental design—including hypotheses—should be published where initially submitted, whereas those without may eventually be published, on average, in lower‐impact journals (Opthof et al.,  2000 ). Ideally, we could have included articles that were rejected and never published in our analysis, but such articles are notoriously difficult to track (Thornton & Lee,  2000 ).

To support our detailed literature analysis, we also tested for temporal trends in hypothesis use within a broader sample of the ecology and evolution literature. For the same set of 22 journals in our detailed sample, we conducted a Web of Science search for articles containing “hypoth*” in the title or abstract. To calculate the proportion of articles with hypotheses (1990–2018), we divided the number of articles with hypotheses by the total number of articles ( N  = 302,558). Because our search method does not include the main text of articles and excludes more subtle ways of stating hypotheses (e.g., “We expected…,” “We predicted…”), we acknowledge that the proportion of papers identified is likely an underestimate of the true proportion. Nevertheless, we do not expect the degree of underestimation to change over time, so temporal trends in the proportion of papers containing hypotheses should be unbiased.
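The per‐year proportion described above is a simple ratio. As a minimal, hypothetical sketch (the data frame and column names are invented for illustration; the actual counts came from Web of Science queries, not a table like this):

```python
# Hypothetical sketch of the proportion calculation described above.
# Column names (year, has_hypoth_term) are invented; real counts came from
# Web of Science queries for "hypoth*" in titles/abstracts.
import pandas as pd

records = pd.DataFrame({
    "year": [1990, 1990, 1990, 1991, 1991],
    "has_hypoth_term": [True, False, False, True, True],
})

# Articles matching "hypoth*" divided by total articles, per year
prop_by_year = records.groupby("year")["has_hypoth_term"].mean()
```

Because any missed "subtle" hypothesis statements affect the numerator by a roughly constant factor each year, the temporal trend in this proportion is unbiased even if its level is underestimated.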

2.2. Statistical analysis

We used generalized linear mixed models (GLMMs) to test for change in the prevalence of various hypothesis types over time (descriptive, mechanistic, multiple, any hypothesis). Presence of a hypothesis was modeled as dichotomous (0,1) with binomial error structure, and “journal” was included as a random effect to account for potential lack of independence among articles published in the same outlet. The predictor variable (i.e., year) was scaled to enable convergence. Similarly, we tested for differences in hypothesis prevalence between basic and applied articles using GLMMs with “journal” as a random effect. Finally, we tested the hypothesis that hypothesis use might decline over time due to the emergence of machine learning in the applied conservation literature; specifically, we modeled “hypothesis presence” as a function of the statistical interaction between “year” and “basic versus applied” articles. We conducted this test for all hypothesis types. GLMMs were implemented in R (version 3.6.0) using the lme4 package (Bates et al.,  2018 ). In three of our models, the standard deviation of the “journal” random effect was estimated to be zero or nearly zero (i.e., 10⁻⁸); in such cases, the model is exceptionally difficult to estimate, and a near‐zero random effect standard deviation indicates that the random effect was likely not needed.

We tested whether the presence of hypotheses influenced the likelihood of publication in a high‐impact journal using generalized linear models with a Gaussian error structure. We used the log of journal impact factor (+0.5) as the response variable to improve the normality of model residuals. We tested the association between major competitive grants and the presence of a hypothesis using generalized linear models (logistic regression) with “hypothesis presence” (0,1) as a predictor and presence of a grant (0,1) as a response.

Finally, we tested whether hypotheses increase citation rates using linear mixed effects models (LMMs); the presence of each hypothesis type (0,1) was the predictor in univariate models, and average citations per year (log‐transformed) was the response. “Journal” was treated as a random effect, which accounts for the likelihood that articles within a particular journal are not independent in their citation rates. LMMs were implemented in R using the lme4 package (Bates et al.,  2015 ).
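The citation-rate LMM can be sketched like this. Again, this is a hypothetical re-implementation on synthetic data (the authors used R/lme4; here statsmodels MixedLM stands in, and all names and values are invented):

```python
# Illustrative sketch of the citation-rate LMM: log citations/yr as response,
# hypothesis presence as predictor, "journal" as a random intercept.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 268
df = pd.DataFrame({
    "journal": rng.integers(0, 22, size=n).astype(str),  # 22 journals sampled
    "has_hyp": rng.binomial(1, 0.25, size=n),            # any hypothesis (0/1)
    "log_cites": rng.normal(1.0, 0.5, size=n),           # log(citations/year)
})

# Random intercept per journal: articles in the same outlet share a baseline
lmm = smf.mixedlm("log_cites ~ has_hyp", df, groups=df["journal"]).fit()
slope = lmm.params["has_hyp"]            # estimated effect of having a hypothesis
```

The random intercept is the mixed-model analogue of lme4's `(1 | journal)` term: it absorbs journal-level baseline differences in citation rates rather than treating all 268 articles as independent.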

3.1. Trends in hypothesis use in ecology and evolution

In the ecology and evolution articles we examined in detail, the prevalence of multiple alternative hypotheses (6.7%) and mechanistic hypotheses (26%) was very low and showed no temporal trend (GLMM: multiple alternative: β̂ = 0.098 [95% CI: −0.383, 0.595], z = 0.40, p = 0.69; mechanistic: β̂ = 0.131 [95% CI: −0.149, 0.418], z = 0.92, p = 0.36; Figure  2a,b ). Descriptive hypothesis use was also low (8.5%), and although we observed a slight tendency to increase over time, 95% confidence intervals overlapped zero (GLMM: β̂ = 0.351 [95% CI: −0.088, 0.819], z = 1.53, p = 0.13, Figure  2c ). Although the proportion of papers containing no hypotheses appears to have declined (Figure  2d ), this effect was not statistically significant (GLMM: β̂ = −0.201 [95% CI: −0.483, 0.074], z = −1.41, p = 0.15). This overall pattern is consistent with a Web of Science search ( N  = 302,558 articles) for the term “hypoth*” in titles or abstracts, which shows essentially no trend over the same time period (Figure  2e,f ).

[Figure 2; image file ECE3-11-5762-g003.jpg]

Trends in hypothesis use, 1991–2015, in a sample of the ecological and evolutionary literature ( N  = 268): (a) multiple alternative hypotheses, (b) mechanistic hypotheses, (c) descriptive hypotheses [predictions], and (d) no hypotheses present. We detected no temporal trend in any of these variables. Lines reflect LOESS smoothing with 95% confidence intervals. Dots show raw data, with darker colors indicating overlapping data points. The total number of publications in ecology and evolution in selected journals has increased (e), but use of the term “hypoth*” in the titles or abstracts of these 302,558 articles has remained flat, and at very low prevalence (f)

Counter to our hypothesis, applied and basic articles did not show a statistically significant difference in the prevalence of either mechanistic (GLMM: β̂ = 0.054 [95% CI: −0.620, 0.728], z = 0.16, p = 0.875) or multiple alternative hypotheses (GLMM: β̂ = 0.517 [95% CI: −0.582, 1.80], z = 0.88, p = 0.375). Although both basic and applied ecology and evolution articles containing hypotheses were similarly rare overall, there was a tendency for applied ecology articles to show increasing prevalence of mechanistic hypothesis use over time, whereas basic ecology articles have remained relatively unchanged (Table  S1 , Figure  S1 ). However, there was substantial variation across both basic and applied journals in the prevalence of hypotheses (Figure  3 ).

3.2. Do hypotheses “pay?”

We found little evidence that presence of hypotheses increased paper citation rates. Papers with mechanistic (LMM: β̂ = −0.109 [95% CI: −0.329, 0.115], t = 0.042, p = 0.97, Figure  4a , middle panel) or multiple alternative hypotheses (LMM: β̂ = −0.008 [95% CI: −0.369, 0.391], t = 0.042, p = 0.96, Figure  4a , bottom panel) did not have higher average annual citation rates, nor did papers with at least one hypothesis type (LMM: β̂ = −0.024 [95% CI: −0.239, 0.194], t = 0.218, p = 0.83, Figure  4a , top panel).

[Figure 4; image file ECE3-11-5762-g006.jpg]

Results of our detailed literature search showing the relationship between having a hypothesis (or not) and three commonly sought‐after scientific rewards (average citations per year, journal impact factor, and the likelihood of holding a major national competitive grant). We found no statistically significant relationships between having a hypothesis and citation rates or grants, but articles with hypotheses tended to be published in higher‐impact journals

On the other hand, journal articles containing mechanistic hypotheses tended to be published in higher‐impact journals (GLM: β̂ = 0.290 [95% CI: 0.083, 0.497], t = 2.74, p = 0.006), but only slightly so (Figure  4b , middle panel). Including multiple alternative hypotheses in papers did not have a statistically significant effect (GLM: β̂ = 0.339 [95% CI: −0.029, 0.707], t = 1.80, p = 0.072, Figure  4b , bottom panel).

Finally, we found no association between obtaining a competitive national or international grant and the presence of a hypothesis (logistic regression: mechanistic: β̂ = −0.090 [95% CI: −0.637, 0.453], z = −0.36, p = 0.745; multiple alternative: β̂ = 0.080 [95% CI: −0.891, 1.052], z = 0.49, p = 0.870; any hypothesis: β̂ = −0.005 [95% CI: −0.536, 0.525], z = −0.02, p = 0.986, Figure  4c ).

4. DISCUSSION

Overall, the prevalence of hypothesis use in the ecological and evolutionary literature is strikingly low and has been so for the past 25 years despite repeated calls to reverse this pattern (Elliott & Brook,  2007 ; Peters,  1991 ; Rosen,  2016 ; Sells et al.,  2018 ). Why is this the case?

Clearly, hypotheses are not always necessary and a portion of the sampled articles may represent situations where hypotheses are truly not useful (see Box  3 : “When Are Hypotheses Not Useful?”). Some authors (Wolff & Krebs,  2008 ) overlook knowledge gathering and descriptive research as a crucial first step for making observations about natural phenomena—from which hypotheses can be formulated. This descriptive work is an important part of ecological science (Tewksbury et al.,  2014 ), but may not benefit from strict use of hypotheses. Similarly, some efforts are simply designed to be predictive, such as auto‐recognition of species via machine learning (Briggs et al.,  2012 ) or for prioritizing conservation efforts (Wilson et al.,  2006 ), where the primary concern is correct identification and prediction rather than the biological or computational reasons for correct predictions (Box  3 ). However, it would be surprising if 75% of ecology since 1990 has been purely descriptive work from little‐known systems or purely predictive in nature. Indeed, the majority of the articles we observed did not fall into these categories.

When are hypotheses not useful?

Of course, there are a number of instances where hypotheses might not be useful or needed. It is important to recognize these instances to prevent the pendulum from swinging so far that research without hypotheses ceases to be considered science (Wolff & Krebs,  2008 ). Below are several important types of ecological research where formulating hypotheses may not always be beneficial.

When the goal is prediction rather than understanding. Examples of this exception include species distribution models (Elith et al.,  2008 ) where the question is not why species are distributed as they are, but simply where species are predicted to be. Such results can be useful in conservation planning (Guisan et al.,  2013 ; see below). Another example lies in auto‐recognition of species (Briggs et al.,  2012 ) where the primary concern is getting identification right rather than the biological or computational reasons for correct predictions. In such instances, complex algorithms can be very effective at uncovering patterns (e.g., deep learning). A caveat and critical component of such efforts is to ensure that such models are tested on independent data. Further, if model predictions are made beyond the spatial or temporal bounds of training or test data, extreme caution should be applied (see Figure  4 ).

When the goal is description rather than understanding. In many applications, the objective is to simply quantify a pattern in nature; for example, where on Earth is forest loss most rapid (Hansen et al.,  2013 )? Further, sometimes so little is known about a system or species that formulating hypotheses is impossible and more description is necessary. In rare instances, an ecological system may be so poorly known and so different from other systems that generating testable hypotheses would be extremely challenging. Darwin's observations while traveling on the Beagle are some of the best examples of such “hypothesis generating” science; these initial observations resulted in the formulation of one of the most extensively tested hypotheses in biology. However, such novelty should be uncommon in ecological and evolutionary research where theoretical and empirical precedent abounds (Sells et al.,  2018 ). In the field of biogeography, there is the commonly held view that researchers should first observe and analyze patterns, and only then might explanations emerge (“pattern before process”); however, it has frequently been demonstrated that mechanistic hypotheses are useful even in disciplines where manipulative experiments are impossible (Crisp et al.,  2011 ).

When the objective is a practical planning outcome such as reserve design. In many conservation planning efforts, the goal is not to uncover mechanisms, but rather simply to predict efficient methods or contexts for conserving species (Myers et al.,  2000 ; Wilson et al.,  2006 ). Perhaps this is the reason for such low prevalence of hypotheses in conservation journals (e.g., Conservation Biology).

Alternatively, researchers may not include hypotheses because they see little individual‐level incentive for their inclusion. Our results suggest that currently there are relatively few measurable benefits to individuals. Articles with mechanistic hypotheses do tend to be published in higher impact factor journals, which, for better or worse, is one of the key predictors in obtaining an academic job (van Dijk et al.,  2014 ). However, few of the other typical academic metrics (i.e., citations or grant funding) appear to reward this behavior. Although hypotheses might be “useful” for overall progress in science (Platt,  1964 ), for their use to be propagated in the population of scientists, one would also expect them to provide benefits to the individuals conducting the science. Interestingly, the few existing papers on hypotheses (Loehle,  1987 ; Romesburg,  1981 ; Sells et al.,  2018 ) tended to explain the advantages in terms of benefits to the group by offering arguments such as “because hypotheses help the field move forward more rapidly”.

Here we address some common justifications for hypotheses being unnecessary and show how one's first instinct to avoid hypotheses may be mistaken. We also present four reasons that use of hypotheses may be of individual self‐interest.

5. RESPONSES TO COMMON JUSTIFICATIONS FOR THE ABSENCE OF HYPOTHESES

During our collective mentoring at graduate and undergraduate levels, as well as examination of the literature, we have heard a number of common justifications for why hypotheses are not included. We must admit that many of us have, on occasion, rationalized the absence of hypotheses in our own work using the same logic! We understand that clearly formulating and testing hypotheses can often be challenging, but propose that the justifications for avoiding hypotheses should be carefully considered.

  • “ But I do have hypotheses ”. Simply using the word “hypothesis” does not a hypothesis make. A common pattern in the literature we reviewed was for researchers to state their guess about the results they expect and call this the “hypothesis” (e.g., “I hypothesize trees at higher elevation will grow slowly”). But these are usually predictions derived from an implicit theoretical model (Symes et al.,  2015 ) or are simply descriptive statements with the word “hypothesis” in front of them (see Box  1 ). A research hypothesis must contain explanations for an observed phenomenon (Loehle,  1987 ; Wolff & Krebs,  2008 ). Such explanations are derived from existing or new theory (Symes et al.,  2015 ). Making the link between the expected mechanism (the hypothesis) and the logical outcome if that mechanism were true (the prediction) is a key element of strong inference. Similarly, using “statistical hypotheses” and “null hypothesis testing” is not the same as developing mechanistic research hypotheses (Romesburg,  1981 ; Sells et al.,  2018 ).
  • “ Not enough is known about my system to formulate hypotheses ”. This is perhaps the most common defense against needing hypotheses (Golub,  2010 ). The argument goes that, due to a lack of previous research, no mature theory has developed, so formal tests are impossible. Such arguments may have a basis in some truly novel contexts (e.g., exploratory research on genomes) (Golub,  2010 ). But on close inspection, similar work has often been conducted in other geographic regions, systems, or with different taxa. If the response by a researcher is “but we really need to know if X pattern also applies in this region” (e.g., does succession influence bird diversity in forests of Western North America the same way as it does in Eastern forests), this is fine, and it is certainly useful to accumulate descriptive studies globally for future synthetic work. However, continued efforts at description alone constitute missed opportunities for understanding the mechanisms behind a pattern (e.g., why does bird diversity decline when the forest canopy closes?). Often, with a little planning, both the initial descriptive local question (e.g., “is it?”) and the broader question (i.e., “why?”) can be tackled with minimal additional effort.
  • “ What about Darwin? Many important discoveries have been made without hypotheses .” Several authors (and many students) have argued that many important and reliable patterns in nature have emerged outside of the hypothetico‐deductive (H‐D) method (Brush,  1974 ). For instance, Darwin's discovery of natural selection as a key force for evolution has been put forward as an example of how reliable ideas can emerge without the H‐D method (May,  1981 ; Milner,  2018 ). Examination of Darwin's notebooks has suggested that he did not propose explicit hypotheses and test them (Brush,  1974 ). However, Darwin himself wrote “all observation must be for or against some view if it is to be of any service!” (Ayala,  2009 ). In fact, Darwin actually put forward and empirically tested hypotheses in multiple fields, including geology, plant morphology and physiology, psychology, and evolution (Ayala,  2009 ). This debate suggests that, like Darwin, we should continue to value systematic observation and descriptive science (Tewksbury et al.,  2014 ), but whenever possible, it should be with a view toward developing theory and testing hypotheses.

The statement that “many important discoveries have been made without hypotheses” stems from a common misconception that somehow hypotheses spring fully formed into the mind, and that speculation, chance and induction play no role in the H‐D method. As noted by Loehle ( 1987 ; p. 402) “The H‐D method and strong inference, however, are valid no matter how theories are obtained. Dreams, crystal balls, or scribbled notebooks are all allowed. In fact, induction may be used to create empirical relations which then become candidates for hypothesis testing even though induction cannot be used to prove anything”. So, although induction has frequently been used to develop theory, it is an unreliable means to test theory (Popper,  1959 ). As is well‐known, Darwin's theory of natural selection was heavily debated in scientific circles at the time, and it is only through countless hypothesis tests that it remains the best explanation for evolution even today (Mayr,  2002 ).

  • “ Ecology is too complex for hypotheses ”. In one of the most forcefully presented arguments for the H‐D method, Karl Popper ( 1959 ) argued that science should be done through a process of falsification; that is, multiple hypotheses should be constructed and the researcher's role is to successively eliminate these one at a time via experimentation until a single plausible hypothesis remains. This approach has caused some consternation among ecologists because the idea of single causes to phenomena doesn't match most of our experiences (Quinn & Dunham,  1983 ); rather, multiple interacting processes often overlap to drive observed patterns. For example, Robert Paine found that the distribution of a common seaweed was best explained by competition, physical disturbance, and dispersal ability (Paine,  1966 ).

It is worth asking whether Popperian logic has inoculated ecology and evolution against the frequent application of hypotheses in research. Perhaps because the bar of falsification and testable mutually exclusive hypotheses is so high, many have opted to ignore the need for hypotheses altogether. If this is the case, our response is that in ecology and evolution we must not let Popperian perfection be the enemy of strong inference. With sufficient knowledge of a system, formal a priori hypotheses can be formulated that directly address the possibility of nonlinear relationships and interactions among variables. An example from conservation biology is the well‐explored hypothesis that the effects of habitat fragmentation should be greatest when habitat amount is low due to dispersal limitation (i.e., there should be a statistical interaction between fragmentation and habitat loss; Andrén,  1994 ).

[Figure; image file ECE3-11-5762-g002.jpg]

Hypothesis generation is possible at all levels of organization, and does not need to reach the bottom of a causal hierarchy to be useful. As illustrated in this case study (after Betts et al.,  2015 ), using published work by the authors, support for a hypothesis at one level often generates a subsequent question and hypotheses at the next. After each new finding we had to return to the whiteboard and draw out new alternative hypotheses as we progressed further down the hierarchy. Supported hypotheses are shown in black, and the alternative hypotheses that were eliminated are in grey. A single study is not expected to tackle an entire mechanistic hierarchy. In fact, we have yet to uncover the physiological mechanisms involved in this phenomenon

  • “ But my model predicts patterns well ”. An increasingly common justification for not presenting and testing research hypotheses seems to be the notion that if large datasets and complex modeling methods can predict outcomes effectively, what is the need for hypothesizing a mechanism (Glass & Hall,  2008 ; Golub,  2010 )? Indeed, some have argued that prediction is a gold standard in ecology and evolution (Houlahan et al.,  2017 ). However, underlying such arguments is the critical assumption that the relationship between predictors (i.e., independent variables, 'x's) and responses ('y's) exhibit stationarity in time and space. Although this appears to be the case in cosmology (e.g., relativity is thought to apply wherever you are in the universe (Einstein,  1920 )), the assumption of stationarity has repeatedly been shown to be violated in ecological and evolutionary studies (Betts et al.,  2006 ; Osborne et al.,  2007 ; Thompson,  2005 ). Hence the well‐known maxim “correlation does not equal causation;” correlates of a phenomenon often shift, even if the underlying cause remains the same.

The advantage of understanding mechanism is that the relationship between cause and effect is less likely to shift in space and time than between the correlates of a phenomenon (Sells et al.,  2018 ) (Figure  1 ). For instance, climate‐envelope models are still commonly used to predict future species distributions (Beale et al.,  2008 ) despite the fact that links between correlates often fail (Gutiérrez et al.,  2014 ) and climate per se may not be the direct driver of distributions. In an example from our own group, predictions that fit observed data well in the region where the model was built completely failed when predicted to a new region only 250 km away (Betts et al.,  2006 ). Although it is true that mechanisms can also exhibit nonstationarity, at least in these instances logic can inform decisions about whether or not causal factors are likely to hold in a new place or time.

6. WHY SHOULD YOU HAVE HYPOTHESES? (A SELF‐INTERESTED PERSPECTIVE)

We have already described two arguments for hypothesis use, both of which should have positive influences on reproducibility and therefore progress in science: (1) multiple alternative hypotheses developed a priori prevent attachment to a single idea, and (2) hypotheses encourage exploration of mechanisms, which should increase the transferability of findings to new systems. Both these arguments have been made frequently in the eco‐evolutionary literature for decades (Elliott & Brook,  2007 ; Loehle,  1987 ; Rosen,  2016 ; Sells et al.,  2018 ), but our results show that such arguments have been lost on the majority of researchers. One hypothesis recently proposed to explain why “poor methods persist [in science] despite perennial calls for improvements” is that such arguments have largely failed because they do not appeal to researcher self‐interest (Smaldino & McElreath,  2016 ). In periods of intense competition for grants and top‐tier publications, perhaps arguments that rely on altruism fall short. However, happily, there are at least four self‐interested reasons that students of ecological and evolutionary science should adopt the hypothetico‐deductive method.

  • Clarity and Precision in Research

First, and most apparent during our review of the literature, hypotheses force clarity and precision in thinking. We often found it difficult to determine the core purpose of papers that lacked clear hypotheses. One of the key goals of scientific writing is to communicate ideas efficiently (Schimel,  2011 ). Increased clarity through the use of hypotheses could even explain why manuscripts with hypotheses tend to be published in higher‐impact journals. Editors are increasingly pressed for time and forced to reject the majority of papers submitted to higher‐impact outlets prior to detailed review (AAAS,  2018 ). “Unclear message” and “lack of clear hypotheses” are top reasons a paper ends up in the editor's reject pile (Eassom,  2018 ; Elsevier,  2015 ). If editors struggle, as we often did, to determine the purpose of a paper, this does not bode well for publication. Clear communication through succinctly stated hypotheses is likely to enhance publication success.

Hypotheses also provide crucial direction during study design. Nothing is more frustrating than realizing that your hard‐earned data cannot actually address the key study objectives or rule out alternative explanations. Developing clear hypotheses and, in particular, multiple alternative hypotheses ensures that you actually design your study in a way that can answer the key questions of interest.

  • Personal Fulfillment

Second, science is more likely to be fulfilling and fun when the direction of research is clear and, perhaps more importantly, when questions are addressed with more than one plausible answer. Results are often disappointing or unfulfilling when a study starts out with a single biological hypothesis in mind (Symes et al.,  2015 )—particularly if there is no support for this hypothesis. If multiple alternative hypotheses are well crafted, something interesting and rewarding will result regardless of the outcome. Researchers are then much more likely to enjoy the process of science because the stress of wanting a particular result is removed. Subsequently, as Chamberlin ( 1890 ) proposed, “the dangers of parental affection for a favorite theory can be circumvented,” which should reduce the risk of creeping bias. In our experience reviewing competitive grant proposals at the U.S. National Science Foundation, proposals testing several compelling hypotheses were consistently more likely to be well received—presumably because reviewers are risk‐averse and understand that finding support for any of the outcomes will ultimately pay off. Why bet on just one horse when you can bet on them all?

  • Intrinsic Value to Mechanism

Mechanism seems to have intrinsic value for humans—regardless of the practical application. Humans tend to be interested in acquiring understanding rather than just accumulating facts. As a species, we crave answers to the question “why.” Indeed, it is partly this desire for mechanism that is driving a recent perceived “crisis” in machine learning, with the entire field being referred to as “alchemy” (Hutson,  2018 ); algorithms continue to increase in performance, but the mechanisms for such improvements are often a mystery—even to the researchers themselves. “Because our model predicts well” is the unsatisfying scientific equivalent to a parent answering a child's “why?” with “because that's just the way it is.” This problem is beginning to spawn a new field in artificial intelligence, “AI neuroscience,” which attempts to get into the “black box” of machine‐learning algorithms to understand how and why they are predictive (Voosen,  2017 ).

Even in some of our most applied research, we find that managers and policymakers, when confronted with a result (e.g., thinning trees to 70% of initial densities reduced bird diversity), want to know why (e.g., thinning eliminated nesting substrate for 4 species); if the answer to this question is not available, policy is much less likely to change (Sells et al.,  2018 ). So, formulating mechanistic hypotheses will not only be more personally satisfying, but we expect it may also be more likely to result in real‐world changes.

  • You Are More Likely To be Right

In a highly competitive era, it seems that in the quest for high publication rates and funding, researchers can lose sight of the original aim of science: to discover a truth about nature that is transferable to other systems. In a recent poll conducted by Nature, more than 70% of researchers reported having tried and failed to reproduce another scientist's experiments (Baker,  2016 ). Ultimately, each researcher has a choice: put forward multiple explanations for a phenomenon, or become “attached” to a single hypothesis and risk bias entering the work, rendering it irreproducible and liable to be overturned by a future researcher. Imagine if Lamarck had not championed a single hypothesis for the mechanisms of evolution. Although Lamarck potentially had a vital impact as an early proponent of the idea that biological evolution occurred and proceeded in accordance with natural laws (Stafleu,  1971 ), in the modern era he is unfortunately remembered largely for his pet hypothesis. It may be a stretch to argue that he would necessarily have come up with natural selection, but had he considered it, the idea would have emerged 50 years earlier, substantially accelerating scientific progress and limiting his infamy as an early evolutionary biologist. An interesting contemporary example is provided by Prof. Amy Cuddy's research on “power posing” as a means to succeed. The work featured in one of the most viewed TED talks of all time but rather famously turned out to be irreproducible (Ranehill et al.,  2015 ). When asked in a TED interview what she would do differently now, Prof. Cuddy noted that she would include a greater diversity of theory and multiple potential lines of evidence to “shed light on the psychological mechanisms” (Biello,  2017 ).

7. CONCLUSION

We acknowledge that formulating effective hypotheses can feel like a daunting hurdle for ecologists. However, we suggest that initial justifications for absence of hypotheses may often be unfounded. We argue that there are both selfish and altruistic reasons to include multiple alternative mechanistic hypotheses in your research: (1) testing multiple alternative hypotheses simultaneously makes for rapid and powerful progress which is to the benefit of all (Platt,  1964 ), (2) you lessen the chance that confirmation bias will result in you publishing an incorrect but provocative idea, (3) hypotheses provide clarity in design and writing, (4) research using hypotheses is more likely to be published in a high‐impact journal, and (5) you are able to provide satisfying answers to “why?” phenomena occur. However, few current academic metrics appear to reward use of hypotheses. Therefore, we propose that in order to promote hypothesis use we may need to provide additional incentives (Edwards & Roy,  2016 ; Smaldino & McElreath,  2016 ). We suggest editors reward research conducted using principles of sound scientific method and be skeptical of research that smacks of data dredging, post hoc hypothesis development, and single hypotheses. If no hypotheses are stated in a paper and/or the paper is purely descriptive, editors should ask whether the novelty of the system and question warrant this, or if the field would have been better served by a study with mechanistic hypotheses. Eleven of the top 20 ecology journals already indicate a desire for hypotheses in their instructions for authors—with some going as far as indicating “priority will be given” for manuscripts testing clearly stated hypotheses. Although hypotheses are not necessary in all instances, we expect that their continued and increased use will help our disciplines move toward greater understanding, higher reproducibility, better prediction, and more effective management and conservation of nature. 
We recommend authors, editors, and readers encourage their use (Box  2 ).

CONFLICT OF INTEREST

The authors have no conflicts of interests to declare.

AUTHOR CONTRIBUTIONS

Matthew G. Betts: Conceptualization (lead); data curation (lead); formal analysis (lead); funding acquisition (lead); investigation (lead); methodology (equal); project administration (lead); resources (lead); supervision (lead); visualization (lead); writing‐original draft (lead); writing‐review & editing (lead). Adam S. Hadley: Conceptualization (lead); data curation (lead); funding acquisition (equal); investigation (equal); methodology (lead); project administration (equal); resources (supporting); software (supporting); supervision (lead); validation (lead); visualization (lead); writing‐original draft (equal); writing‐review & editing (equal). David W. Frey: Conceptualization (supporting); data curation (supporting); formal analysis (supporting); funding acquisition (supporting); writing‐review & editing (supporting). Sarah J. K. Frey: Conceptualization (supporting); Investigation (equal); writing‐review & editing (equal). Dusty Gannon: Conceptualization (supporting); Investigation (equal); writing‐review & editing (equal). Scott H. Harris: Conceptualization (supporting); Investigation (equal); methodology (equal); writing‐review & editing (equal). Hankyu Kim: Conceptualization (supporting); Investigation (equal); Methodology (equal); writing‐review & editing (equal). Kara Leimberger: Conceptualization (supporting); Investigation (equal); Methodology (equal); writing‐review & editing (equal). Katie Moriarty: Conceptualization (supporting); Investigation (equal); methodology (equal); writing‐review & editing (equal). Joseph M. Northrup: Investigation (equal); methodology (equal); writing‐review & editing (equal). Ben Phalan: Investigation (equal); Methodology (equal); writing‐review & editing (equal). Josée S. Rousseau: Investigation (equal); Methodology (equal); writing‐review & editing (equal). Thomas D. Stokely: Investigation (equal); methodology (equal); writing‐review & editing (equal). Jonathon J. 
Valente: Investigation (equal); methodology (equal); writing‐review & editing (equal). Urs G. Kormann: Methodology (supporting); resources (equal); writing‐review & editing (supporting). Chris Wolf: Formal analysis (supporting); writing‐review & editing (supporting). Diego Zárrate‐Charry: Investigation (equal); Methodology (equal); writing‐review & editing (equal).

ETHICAL APPROVAL

The authors adhered to all standards for the ethical conduct of research.

Supporting information

Supplementary Material

ACKNOWLEDGMENTS

Funding from the National Science Foundation (NSF‐DEB‐1457837) to MGB and ASH supported this research. We thank Rob Fletcher, Craig Loehle and anonymous reviewers for thoughtful comments on early versions of this manuscript, as well as Joe Nocera and his graduate student group at the University of New Brunswick for constructive comments on the penultimate version of the paper. The authors are also grateful to A. Dream for providing additional resources to enable the completion of this manuscript.

Betts MG, Hadley AS, Frey DW, et al. (2021). When are hypotheses useful in ecology and evolution? Ecology and Evolution, 11, 5762–5776. 10.1002/ece3.7365

Matthew G. Betts and Adam S. Hadley contributed equally to this manuscript.

DATA AVAILABILITY STATEMENT

REFERENCES

  • AAAS (2018). What percentage of submissions does Science accept? AAAS Science Contributors. Retrieved from http://www.sciencemag.org/site/feature/contribinfo/faq/index.xhtml‐pct_faq
  • Andrén, H. (1994). Effects of habitat fragmentation on birds and mammals in landscapes with different proportions of suitable habitat: A review. Oikos, 71, 355–366. 10.2307/3545823
  • Ayala, F. J. (2009). Darwin and the scientific method. Proceedings of the National Academy of Sciences, 106, 10033–10039. 10.1073/pnas.0901404106
  • Ayres, M. P., & Lombardero, M. J. (2017). Forest pests and their management in the Anthropocene. Canadian Journal of Forest Research, 48, 292–301. 10.1139/cjfr-2017-0033
  • Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533, 452–454.
  • Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed‐effects models using lme4. Journal of Statistical Software, 67, 1–48.
  • Bates, D., Maechler, M., & Bolker, B. (2018). 'lme4' Linear mixed‐effects models using S4 classes. R Core Team. Retrieved from https://cran.r‐project.org/web/packages/lme4/lme4.pdf
  • Beale, C. M., Lennon, J. J., & Gimona, A. (2008). Opening the climate envelope reveals no macroscale associations with climate in European birds. Proceedings of the National Academy of Sciences, 105, 14908–14912. 10.1073/pnas.0803506105
  • Betts, M., Diamond, T., Forbes, G. J., Villard, M.‐A., & Gunn, J. (2006). The importance of spatial autocorrelation, extent and resolution in predicting forest bird occurrence. Ecological Modeling, 191, 197–224.
  • Betts, M. G., Hadley, A. S., & Kress, J. (2015). Pollinator recognition in a keystone tropical plant. Proceedings of the National Academy of Sciences, 112, 3433–3438.
  • Biello, D. (2017). Inside the debate about power posing: A Q & A with Amy Cuddy. Ideas.TED.com. Retrieved from https://ideas.ted.com/inside‐the‐debate‐about‐power‐posing‐a‐q‐a‐with‐amy‐cuddy/
  • Briggs, F., Lakshminarayanan, B., Neal, L., Fern, X. Z., Raich, R., Hadley, S. J. K., Hadley, A. S., & Betts, M. G. (2012). Acoustic classification of multiple simultaneous bird species: A multi‐instance multi‐label approach. The Journal of the Acoustical Society of America, 131, 4640–4650. 10.1121/1.4707424
  • Brush, S. G. (1974). Should the history of science be rated X? Science, 183, 1164–1172.
  • Chamberlin, T. C. (1890). The method of multiple working hypotheses. Science, 15, 92–96.
  • Crisp, M. D., Trewick, S. A., & Cook, L. G. (2011). Hypothesis testing in biogeography. Trends in Ecology & Evolution, 26, 66–72. 10.1016/j.tree.2010.11.005
  • Cutler, D. R., Edwards, T. C., Beard, K. H., Cutler, A., Hess, K. T., Gibson, J., & Lawler, J. J. (2007). Random forests for classification in ecology. Ecology, 88, 2783–2792. 10.1890/07-0539.1
  • Eassom, H. (2018). 9 common reasons for rejection. Wiley: Discover the Future of Research. Retrieved from https://hub.wiley.com/community/exchanges/discover/blog/2018/2001/2031/2019‐common‐reasons‐for‐rejection
  • Edwards, M. A., & Roy, S. (2016). Academic research in the 21st century: Maintaining scientific integrity in a climate of perverse incentives and hypercompetition. Environmental Engineering Science, 34, 51–61. 10.1089/ees.2016.0223
  • Egler, F. E. (1986). "Physics envy" in ecology. Bulletin of the Ecological Society of America, 67, 233–235.
  • Einstein, A. (1920). Relativity: The special and general theory (78 pp.). Henry Holt and Company.
  • Elith, J., Graham, C. H., Anderson, R. P., Dudík, M., Ferrier, S., Guisan, A., Hijmans, R. J., Huettmann, F., Leathwick, J. R., Lehmann, A., Li, J., Lohmann, L. G., Loiselle, B. A., Manion, G., Moritz, C., Nakamura, M., Nakazawa, Y., Overton, J. M. M., Townsend Peterson, A., … Zimmermann, N. E. (2006). Novel methods improve prediction of species' distributions from occurrence data. Ecography, 29, 129–151.
  • Elith, J., Leathwick, J. R., & Hastie, T. (2008). A working guide to boosted regression trees. Journal of Animal Ecology, 77, 802–813. 10.1111/j.1365-2656.2008.01390.x
  • Elliott, L. P., & Brook, B. W. (2007). Revisiting Chamberlin: Multiple working hypotheses for the 21st century. BioScience, 57, 608–614. 10.1641/B570708
  • Elsevier (2015). 5 ways you can ensure your manuscript avoids the desk reject pile. Elsevier Connect. Retrieved from https://www.elsevier.com/authors‐update/story/publishing‐tips/5‐ways‐you‐can‐ensure‐your‐manuscript‐avoids‐the‐desk‐reject‐pile
  • Fahrig, L. (2003). Effects of habitat fragmentation on biodiversity. Annual Review of Ecology, Evolution, and Systematics, 34, 487–515.
  • Glass, D. J., & Hall, N. (2008). A brief history of the hypothesis. Cell, 134, 378–381. 10.1016/j.cell.2008.07.033
  • Golub, T. (2010). Counterpoint: Data first. Nature, 464, 679. 10.1038/464679a
  • Guisan, A., Tingley, R., Baumgartner, J. B., Naujokaitis‐Lewis, I., Sutcliffe, P. R., Tulloch, A. I. T., Regan, T. J., Brotons, L., McDonald‐Madden, E., Mantyka‐Pringle, C., Martin, T. G., Rhodes, J. R., Maggini, R., Setterfield, S. A., Elith, J., Schwartz, M. W., Wintle, B. A., Broennimann, O., Austin, M., … Buckley, Y. M. (2013). Predicting species distributions for conservation decisions. Ecology Letters, 16, 1424–1435. 10.1111/ele.12189
  • Hansen, M. C., Potapov, P. V., Moore, R., Hancher, M., Turubanova, S. A., Tyukavina, A., Thau, D., Stehman, S. V., Goetz, S. J., Loveland, T. R., Kommareddy, A., Egorov, A., Chini, L., Justice, C. O., & Townshend, J. R. G. (2013). High‐resolution global maps of 21st‐century forest cover change. Science, 342, 850–853. 10.1126/science.1244693
  • Houlahan, J. E., McKinney, S. T., Anderson, T. M., & McGill, B. J. (2017). The priority of prediction in ecological understanding. Oikos, 126, 1–7.
  • Hutson, M. (2018). AI researchers allege that machine learning is alchemy. Science, May 3, 2018. 10.1126/science.aau0577
  • Illán, J. G., Thomas, C. D., Jones, J. A., Wong, W.‐K., Shirley, S. M., & Betts, M. G. (2014). Precipitation and winter temperature predict long‐term range‐scale abundance changes in Western North American birds. Global Change Biology, 20, 3351–3364. 10.1111/gcb.12642
  • Lack, D. (1954). The natural regulation of animal numbers. Clarendon Press.
  • Landy, F. (1986). Stamp collecting versus science: Validation as hypothesis testing. American Psychologist, 41, 1183–1192.
  • Loehle, C. (1987). Hypothesis testing in ecology: Psychological aspects and the importance of theory maturation. The Quarterly Review of Biology, 62, 397–409.
  • May, R. M. (1981). The role of theory in ecology. American Zoologist, 21, 903–910.
  • Mayr, E. (2002). What evolution is. Basic Books.
  • Milner, S. (2018). Newton didn't frame hypotheses. Why should we? Real Clear Science / Physics Today, April 25, 2018. 10.1063/PT.2016.2013.20180424a
  • Munafò, M., Nosek, B., Bishop, D., Button, K. S., Chambers, C. D., du Sert, N. P., Simonsohn, U., Wagenmakers, E.‐J., Ware, J. J., & Ioannidis, J. P. A. (2017). A manifesto for reproducible science. Nature Human Behavior, 1, 0021.
  • Myers, N., Mittermeier, R. A., Mittermeier, C. G., da Fonseca, G. A. B., & Kent, J. (2000). Biodiversity hotspots for conservation priorities. Nature, 403, 853.
  • O'Neill, R. V., Johnson, A. R., & King, A. W. (1989). A hierarchical framework for the analysis of scale. Landscape Ecology, 3, 193–205. 10.1007/BF00131538
  • Opthof, T., Furstner, F., van Geer, M., & Coronel, R. (2000). Regrets or no regrets? No regrets! The fate of rejected manuscripts. Cardiovascular Research, 45, 255–258. 10.1016/S0008-6363(99)00339-9
  • Osborne, P. E., Foody, G. M., & Suárez‐Seoane, S. (2007). Non‐stationarity and local approaches to modelling the distributions of wildlife. Diversity and Distributions, 13, 313–323. 10.1111/j.1472-4642.2007.00344.x
  • Paine, C. E. T., & Fox, C. W. (2018). The effectiveness of journals as arbiters of scientific impact. Ecology and Evolution, 8, 9666–9685.
  • Paine, R. T. (1966). Food web complexity and species diversity. American Naturalist, 100, 65–75. 10.1086/282400
  • Peters, R. H. (1991). A critique for ecology. Cambridge University Press.
  • Pettorelli, N., Nagendra, H., Rocchini, D., Rowcliffe, M., Williams, R., Ahumada, J., de Angelo, C., Atzberger, C., Boyd, D., Buchanan, G., Chauvenet, A., Disney, M., Duncan, C., Fatoyinbo, T., Fernandez, N., Haklay, M., He, K., Horning, N., Kelly, N., … Wegmann, M. (2017). Remote sensing in ecology and conservation: Three years on. Remote Sensing in Ecology and Conservation, 3, 53–56.
  • Platt, J. R. (1964). Strong inference. Science, 146, 347–353.
  • Popper, K. (1959). The logic of scientific discovery. Basic Books.
  • Quinn, J. F., & Dunham, A. E. (1983). On hypothesis testing in ecology and evolution. The American Naturalist, 122, 602–617. 10.1086/284161
  • Ranehill, E., Dreber, A., Johannesson, M., Leiberg, S., Sul, S., & Weber, R. A. (2015). Assessing the robustness of power posing: No effect on hormones and risk tolerance in a large sample of men and women. Psychological Science, 26, 653–656. 10.1177/0956797614553946
  • Romesburg, H. C. (1981). Wildlife science: Gaining reliable knowledge. The Journal of Wildlife Management, 45, 293–313. 10.2307/3807913
  • Rosen, J. (2016). Research protocols: A forest of hypotheses. Nature, 536, 239–241. 10.1038/nj7615-239a
  • Schimel, J. (2011). Writing science: How to write papers that get cited and proposals that get funded. Oxford University Press.
  • Sells, S. N., Bassing, S. B., Barker, K. J., Forshee, S. C., Keever, A. C., Goerz, J. W., & Mitchell, M. S. (2018). Increased scientific rigor will improve reliability of research and effectiveness of management. Journal of Wildlife Management, 82, 485–494. 10.1002/jwmg.21413
  • Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society Open Science, 3(9), 160384. 10.1098/rsos.160384
  • Stafleu, F. (1971). Lamarck: The birth of biology. Taxon, 20, 397–442. 10.2307/1218244
  • Sutherland, W. J., Spiegelhalter, D., & Burgman, M. A. (2013). Twenty tips for interpreting scientific claims. Nature, 503, 335–337.
  • Symes, L. B., Serrell, N., & Ayres, M. P. (2015). A practical guide for mentoring scientific inquiry. The Bulletin of the Ecological Society of America, 96, 352–367.
  • Tewksbury, J. J., Anderson, J. G. T., Bakker, J. D., Billo, T. J., Dunwiddie, P. W., Groom, M. J., Hampton, S. E., Herman, S. G., Levey, D. J., Machnicki, N. J., del Rio, C. M., Power, M. E., Rowell, K., Salomon, A. K., Stacey, L., Trombulak, S. C., & Wheeler, T. A. (2014). Natural history's place in science and society. BioScience, 64, 300–310. 10.1093/biosci/biu032
  • Thompson, J. N. (2005). The geographic mosaic of coevolution. University of Chicago Press.
  • Thornton, A., & Lee, P. (2000). Publication bias in meta‐analysis: Its causes and consequences. Journal of Clinical Epidemiology, 53, 207–216. 10.1016/S0895-4356(99)00161-4
  • van Dijk, D., Manor, O., & Carey, L. B. (2014). Publication metrics and success on the academic job market. Current Biology, 24, R516–R517. 10.1016/j.cub.2014.04.039
  • Voosen, P. (2017). The AI detectives. Science, 357, 22–27. 10.1126/science.357.6346.22
  • Weinberg, R. (2010). Point: Hypotheses first. Nature, 464, 678. 10.1038/464678a
  • Wilson, K. A., McBride, M. F., Bode, M., & Possingham, H. P. (2006). Prioritizing global conservation efforts. Nature, 440, 337. 10.1038/nature04366
  • Wolff, J. O., & Krebs, C. J. (2008). Hypothesis testing and the scientific method revisited. Acta Zoologica Sinica, 54, 383–386.


Biology LibreTexts

2.2: Standard Statistical Hypothesis Testing


  • Luke J. Harmon
  • University of Idaho

Standard hypothesis testing approaches focus almost entirely on rejecting null hypotheses. In this framework (usually referred to as the frequentist approach to statistics) one first defines a null hypothesis. This null hypothesis represents your expectation if some pattern, such as a difference among groups, is not present, or if some process of interest were not occurring. For example, perhaps you are interested in comparing the mean body size of two species of lizards, an anole and a gecko. The null hypothesis would be that the two species do not differ in body size. The alternative, which one can conclude by rejecting that null hypothesis, is that one species is larger than the other. Another example might involve investigating two variables, like body size and leg length, across a set of lizard species 1 . Here the null hypothesis would be that there is no relationship between body size and leg length. The alternative hypothesis, which again represents the situation where the phenomenon of interest is actually occurring, is that there is a relationship between body size and leg length. For frequentist approaches, the alternative hypothesis is always the negation of the null hypothesis; as you will see below, other approaches allow one to compare the fit of a set of models without this restriction and choose the best amongst them.

The next step is to define a test statistic, some way of measuring the patterns in the data. In the two examples above, we would consider test statistics that measure the difference in mean body size among our two species of lizards, or the slope of the relationship between body size and leg length, respectively. One can then compare the value of this test statistic in the data to the expectation of this test statistic under the null hypothesis. The relationship between the test statistic and its expectation under the null hypothesis is captured by a P-value. The P-value is the probability of obtaining a test statistic at least as extreme as the actual test statistic in the case where the null hypothesis is true. In other words, the P-value measures how probable it is, under the null hypothesis, that you would obtain a test statistic at least as extreme as what you see in the data. In particular, if the P-value is very large, say P  = 0.94, then a test statistic at least as extreme as yours would arise almost all the time under the null hypothesis, and your data provide no evidence against it.

If the test statistic is very different from what one would expect under the null hypothesis, then the P-value will be small. This means that we are unlikely to obtain the test statistic seen in the data if the null hypothesis were true. In that case, we reject the null hypothesis as long as P is less than some value chosen in advance. This value is the significance threshold, α , and is almost always set to α  = 0.05. By contrast, if that probability is large, then there is nothing “special” about your data, at least from the standpoint of your null hypothesis. The test statistic is within the range expected under the null hypothesis, and we fail to reject that null hypothesis. Note the careful language here – in a standard frequentist framework, you never accept the null hypothesis, you simply fail to reject it.

Getting back to our lizard-flipping example, we can use a frequentist approach. In this case, our particular example has a name; this is a binomial test, which assesses whether a given event with two outcomes has a certain probability of success. In this case, we are interested in testing the null hypothesis that our lizard is a fair flipper; that is, that the probability of heads p H  = 0.5. The binomial test uses the number of “successes” (we will use the number of heads, H  = 63) as a test statistic. We then ask whether this test statistic is either much larger or much smaller than we might expect under our null hypothesis. So, our null hypothesis is that p H  = 0.5; our alternative, then, is that p H takes some other value: p H  ≠ 0.5.

To carry out the test, we first need to consider how many "successes" we should expect if the null hypothesis were true. We consider the distribution of our test statistic (the number of heads) under our null hypothesis ( p H  = 0.5). This distribution is a binomial distribution (Figure 2.1).

Figure 2.1. The unfair lizard. We use the null hypothesis to generate a null distribution for our test statistic, which in this case is a binomial distribution centered around 50. We then look at our test statistic and calculate the probability of obtaining a result at least as extreme as this value. Image by the author, can be reused under a CC-BY-4.0 license.
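
The null distribution shown in Figure 2.1 is easy to tabulate directly. Here is a minimal sketch in Python (the variable names are ours, not from the text): the exact binomial probabilities for each possible number of heads under the null hypothesis p H = 0.5.

```python
from math import comb

n = 100  # number of lizard flips in the experiment

# Exact binomial probability of each possible number of heads k,
# under the null hypothesis p_H = 0.5: C(n, k) * 0.5^n
pmf = [comb(n, k) / 2**n for k in range(n + 1)]

# The null distribution is centered on 50 heads, as in Figure 2.1
mode = max(range(n + 1), key=lambda k: pmf[k])
```

Plotting `pmf` against `range(n + 1)` reproduces the bell-shaped null distribution of Figure 2.1.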

We can use the known probabilities of the binomial distribution to calculate our P-value. We want to know the probability of obtaining a result at least as extreme as our data when drawing from a binomial distribution with parameters p  = 0.5 and n  = 100. We calculate the area of this distribution at or to the right of 63. This area, P  = 0.006, can be obtained either from a table, from statistical software, or by a relatively simple calculation. The value, 0.006, represents the probability of obtaining at least 63 heads out of 100 trials with p H  = 0.5. This number is the P-value from our binomial test. Because we only calculated the area of our null distribution in one tail (in this case, the right, where values are greater than or equal to 63), this is actually a one-tailed test, and we are only considering the part of our alternative hypothesis where p H  > 0.5. Such an approach might be suitable in some cases, but more typically we multiply this number by 2 to get a two-tailed test; thus, P  = 0.012. This two-tailed P-value includes the possibility of results as extreme as our test statistic in either direction, either too many or too few heads. Since P < 0.05, our chosen α value, we reject the null hypothesis, and conclude that we have an unfair lizard.
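
The tail area can be cross-checked exactly by summing binomial probabilities. A sketch in Python (the function name `binom_tail` is ours, not a library routine):

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the one-tailed area."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_one = binom_tail(63, 100)  # one-tailed: P(at least 63 heads) ≈ 0.006
p_two = 2 * p_one            # two-tailed; doubling is valid because the
                             # null p = 0.5 makes the distribution symmetric
```

The same two-tailed value is what a canned exact binomial test (e.g., R's `binom.test`) reports for 63 heads out of 100 against p = 0.5.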

In biology, null hypotheses play a critical role in many statistical analyses. So why not end this chapter now? One issue is that biological null hypotheses are almost always uninteresting. They often describe the situation where patterns in the data occur only by chance. However, if you are comparing living species to each other, there are almost always some differences between them. In fact, for biology, null hypotheses are quite often obviously false. For example, two different species living in different habitats are not identical, and if we measure them enough we will discover this fact. From this point of view, both outcomes of a standard hypothesis test are unenlightening. One either rejects a silly hypothesis that was probably known to be false from the start, or one “fails to reject” this null hypothesis 2 . There is much more information to be gained by estimating parameter values and carrying out model selection in a likelihood or Bayesian framework, as we will see below. Still, frequentist statistical approaches are common, have their place in our toolbox, and will come up in several sections of this book.

One key concept in standard hypothesis testing is the idea of statistical error. Statistical errors come in two flavors: type I and type II errors. Type I errors occur when the null hypothesis is true but the investigator mistakenly rejects it. Standard hypothesis testing controls type I errors using a parameter, α , which defines the accepted rate of type I errors. For example, if α  = 0.05, one should expect to commit a type I error about 5% of the time. When multiple standard hypothesis tests are carried out, investigators often “correct” their P-values using Bonferroni correction. If you do this, then the chance of committing at least one type I error across all of the tests considered is held at 5%. This singular focus on type I errors, however, has a cost. One can also commit type II errors, when the null hypothesis is false but one fails to reject it. The rate of type II errors in statistical tests can be extremely high. While statisticians do take care to create approaches that have high power, traditional hypothesis testing usually fixes type I errors at 5% while type II error rates remain unknown. There are simple ways to calculate type II error rates (e.g. power analyses) but these are only rarely carried out. Furthermore, Bonferroni correction dramatically increases the type II error rate. This is important because – as stated by Perneger (1998) – “… type II errors are no less false than type I errors.” This extreme emphasis on controlling type I errors at the expense of type II errors is, to me, the main weakness of the frequentist approach 3 .
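
Both kinds of error can be illustrated by simulation with the lizard-flip example. The sketch below (helper names and the alternative value p = 0.65 are our own choices, not from the text) estimates the type I rate by simulating experiments where the null is true, and the power (1 minus the type II rate) by simulating experiments where the coin really is biased:

```python
import random
from math import comb

def p_two_tailed(heads, n=100):
    # Exact two-tailed binomial p-value against the null p_H = 0.5
    k = min(heads, n - heads)
    one_tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * one_tail)

def rejection_rate(p_true, n=100, trials=2000, alpha=0.05):
    # Fraction of simulated experiments in which the null is rejected
    hits = sum(
        p_two_tailed(sum(random.random() < p_true for _ in range(n))) < alpha
        for _ in range(trials)
    )
    return hits / trials

random.seed(1)
type_I = rejection_rate(0.5)   # null true: should not exceed alpha
power = rejection_rate(0.65)   # null false: chance of a correct rejection
type_II = 1 - power            # chance of failing to reject a false null
```

With n = 100 this exact test rejects only for 61 or more (or 39 or fewer) heads, so its true type I rate is about 0.035, a little below the nominal α = 0.05 (exact tests on discrete data are conservative); the power against a true p of 0.65 comes out near 0.8, so the type II rate is still around 20% even for a fairly large bias.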

I will cover some examples of the frequentist approach in this book, mainly when discussing traditional methods like phylogenetic independent contrasts (PICs). Also, one of the model selection approaches used frequently in this book, the likelihood ratio test, relies on a standard frequentist set-up with null and alternative hypotheses.

However, there are two good reasons to look for better ways to do comparative statistics. First, as stated above, standard methods rely on testing null hypotheses that, for evolutionary questions, are usually very likely, a priori, to be false. For a relevant example, consider a study comparing the rate of speciation between two clades of carnivores. The null hypothesis is that the two clades have exactly equal rates of speciation, which is almost certainly false, although we might question how different the two rates are. Second, in my opinion, standard frequentist methods place too much emphasis on P-values and not enough on the size of statistical effects. A small P-value could reflect a large effect, very large sample sizes, or both.

In summary, frequentist statistical methods are common in comparative statistics but can be limiting. I will discuss these methods often in this book, mainly due to their prevalent use in the field. At the same time, we will look for alternatives whenever possible.

8.1 – The null and alternative hypotheses

  • Introduction

  • Statistical inference in the NHST framework
  • NHST workflow
  • Null hypothesis
  • Alternative hypothesis
  • Alternative hypothesis often may be the research hypothesis
  • How to interpret the results of a statistical test
  • Outcomes of an experiment
  • Chapter 8 contents

Results of a statistical test (e.g., an X^{2} test) are reported with:

  • the calculated test statistic
  • the degrees of freedom associated with the calculation of the test statistic
  • the p-value

Recall from our previous discussion (Chapter 8.2) that this is not strictly the interpretation of the p-value, but a short-hand for how likely the data fit the null hypothesis. The p-value alone can’t tell us about “truth.” In the event we reject the null hypothesis, we provisionally accept the alternative hypothesis.

By inference we mean a formal process by which a conclusion is reached from the analysis of data from the outcomes of an experiment. At its best, the process leads to conclusions based on evidence. In statistics, evidence comes about from the careful and reasoned application of statistical procedures and the evaluation of probability (Abelson 1995).

Formally, statistics is rich in inference process. We begin by defining the classical frequentist, aka Neyman-Pearson, approach to inference, which involves the pairing of two kinds of statistical hypotheses: the null hypothesis (H0) and the alternate hypothesis (HA). Whether we accept the null hypothesis or not is evaluated against a decision criterion, a fixed statistical significance level (Lehmann 1992). Significance level refers to the setting of a p-value threshold before testing is done. The threshold is often set to a Type I error rate of 5% (Cowles & Davis 1982), but researchers should always consider whether this threshold is appropriate for their work (Benjamin et al 2017).

This inference process is referred to as Null Hypothesis Significance Testing, NHST. Additionally, a probability value will be obtained for the test outcome or test statistic value. In the Fisherian likelihood tradition, the magnitude of this statistic value can be associated with a probability value, the p-value, of how likely the result is given the null hypothesis is “true”. (Again, keep in mind that this is not strictly the interpretation of p-value, it’s a short-hand for how likely the data fit the null hypothesis. P-value alone can’t tell us about “truth”, per our  discussion, Chapter 8.2 .)

About -logP . P-values are traditionally reported as a decimal, like 0.000134, in the open interval (0, 1) — p-values can never be exactly zero or one. The smaller the value, the less the chance our data agree with the null prediction. Small numbers like this can be confusing, particularly if many p-values are reported, as in many genomics studies, e.g., GWAS. Instead of reporting vanishingly small p-values, studies may report the negative log 10 p-value , or -logP . Instead of small numbers, large numbers are reported: the larger the value, the more the evidence weighs against the null hypothesis. Thus, our p-value of 0.000134 becomes a -logP of 3.87.

Why log 10 and not some other base transform? Simply convenience: powers of 10 are easy to read off.

The antilog of 3.87 (that is, 10 raised to the power -3.87) returns our p-value.
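The transform and its inverse are easy to check numerically. A minimal sketch in Python (the book's own examples use R and spreadsheets; this assumes only the standard library), using the 0.000134 p-value from the text:

```python
import math

# -logP transform of the example p-value from the text
p = 0.000134
neg_log_p = -math.log10(p)
print(round(neg_log_p, 2))        # 3.87

# The antilog recovers the original p-value
p_back = 10 ** (-neg_log_p)
print(f"{p_back:.6f}")            # 0.000134
```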

For convenience, a partial p-value -logP transform table

On your own, complete the table up to -logP 5 – 10. See Question 7 below .

In the introduction to Chapter 8 we presented, without discussion, a simple flow chart to illustrate the decision process (Figure 1). Here we repeat the flow chart diagram, followed by descriptions of its elements.

NHST decision flow chart

Figure 1. Flow chart of inductive statistical reasoning.

What’s missing from the flow chart is the very necessary caveat that interpretation of the null hypothesis is associated with two kinds of error, Type I error and Type II error. These points and others are discussed in the following sections.

We start with the hypothesis statements. For illustration we discuss hypotheses in terms of comparisons involving just two groups, also called two sample tests . One sample tests, in contrast, refer to scenarios where you compare a sample statistic to a population value. Extending these concepts to more than two samples is straightforward, but we leave that discussion to Chapters 12 – 18.

By far the most common application of the null hypothesis testing paradigm involves the comparisons of different treatment groups on some outcome variable. These kinds of null hypotheses are the subject of Chapters 8 through 12.

The  Null hypothesis  (H O ) is a statement about the comparisons, e.g., between a sample statistic and the population, or between two treatment groups. The former is referred to as a one sample test whereas the latter is called a two sample test . The null hypothesis is typically “no statistical difference” between the comparisons.

For example, a one sample, two tailed null hypothesis.

\begin{align*} H_{0}: \bar{X} = \mu \end{align*}

and we read it as “there is no statistical difference between our sample mean and the population mean.” For the more likely case in which no population mean is available, we provide another example, a two sample, two tailed null hypothesis.

\begin{align*} H_{0}: \bar{X}_{1} = \bar{X}_{2} \end{align*}

Here, we read the statement as “there is no difference between our two sample means.” Equivalently, we interpret the statement as both sample means estimate the same population mean.

\begin{align*} H_{0}: \bar{X}_{1} = \bar{X}_{2} = \mu \end{align*}

Under the Neyman-Pearson approach to inference we have two hypotheses: the null hypothesis and the alternate hypothesis. The null hypothesis was defined above.

Tails of a test are discussed further in chapter 8.4 .

Alternative hypothesis  (H A ): If we conclude that the null hypothesis is false, or, more precisely, if we provisionally reject the null hypothesis, then we provisionally accept the alternative hypothesis . The view then is that something other than random chance has influenced the sample observations. Note that the pairing of null and alternative hypotheses covers all possible outcomes. We do not, however, say that we have evidence for the alternative hypothesis under this statistical regimen (Abelson 1995). We tested the null hypothesis, not the alternative hypothesis. Thus, having found a statistical difference between two drug treatments, say aspirin and acetaminophen for relief of migraine symptoms, it is incorrect to conclude that we have proven that acetaminophen improves symptoms of migraine sufferers.

For the one sample, two tailed null hypothesis, the alternative hypothesis is

\begin{align*} H_{A}: \bar{X}\neq \mu \end{align*}

and we read it as “there is a statistical difference between our sample mean and the population mean.” For the two sample, two tailed null hypothesis, the alternative hypothesis would be

\begin{align*} H_{A}: \bar{X}_{1}\neq \bar{X}_{2} \end{align*}

and we read it as “there is a statistical difference between our two sample means.”
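As a concrete sketch of this two-sample pairing, here is how the test and decision might look in code. The chapter's own examples use R and spreadsheets; this sketch assumes Python with scipy, and the measurements are hypothetical, purely for illustration:

```python
from scipy import stats

# Hypothetical measurements for two treatment groups (illustration only)
control   = [5.1, 4.9, 5.3, 5.0, 5.2]
treatment = [5.8, 6.0, 5.7, 6.1, 5.9]

# H0: the two sample means estimate the same population mean
t_stat, p_value = stats.ttest_ind(control, treatment)

alpha = 0.05   # Type I error rate, fixed before testing
if p_value <= alpha:
    print("provisionally reject H0, accept HA")
else:
    print("fail to reject H0")
```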

It may be helpful to distinguish among technical hypotheses, scientific hypotheses, and hypotheses about the equality of different kinds of treatments. Tests of technical hypotheses include the testing of statistical assumptions like the normality assumption (see Chapter 13.3 ) and homogeneity of variances ( Chapter 13.4 ). The results of inferences about technical hypotheses are used by the statistician to justify selection of parametric statistical tests ( Chapter 13 ). The testing of a scientific hypothesis, like whether there is a positive link between lifespan and insulin-like growth factor levels in humans (Fontana et al 2008), similar to the link between lifespan and IGFs in other organisms (Holtzenberger et al 2003), can be further advanced by considering multiple hypotheses, including tests of nested hypotheses evaluated in either Bayesian or likelihood approaches ( Chapter 16 and Chapter 17 ).

Any number of statistical tests may be used to calculate the value of the  test statistic . For example, a one sample t-test may be used to evaluate the difference between the sample mean and the population mean ( Chapter 8.5 ), or the independent sample t-test may be used to evaluate the difference between means of the control group and the treatment group ( Chapter 10 ). The test statistic is the particular value of the outcome of our evaluation of the hypothesis, and it is associated with the p-value. In other words, given the assumption of a particular probability distribution, in this case the t-distribution, we can associate a probability, the p-value, of observing the particular value of the test statistic given that the null hypothesis is true in the reference population.

By convention, we determine  statistical significance  (Cox 1982; Whitley & Ball 2002) by assigning ahead of time a decision probability called the  Type I error rate , often given the symbol  α  (alpha). The practice is to look up the  critical value  that corresponds to the outcome of the test, with the degrees of freedom of your experiment, at the Type I error rate you selected. The  Degrees of Freedom  (DF, df, or sometimes noted by the symbol  v ) are the number of independent pieces of information available to you. Knowing the degrees of freedom is crucial for making the correct tests; each statistical test has a specific formula for obtaining the independent information available for the test. We were first introduced to DF when we calculated the sample variance with the Bessel correction , n – 1, instead of dividing through by n. With the df in hand, the value of the test statistic is compared to the critical value for our null hypothesis. If the test statistic is smaller than the critical value, we fail to reject the null hypothesis. If, however, the test statistic is greater than the critical value, then we provisionally reject the null hypothesis. The critical value comes from a probability distribution appropriate for the kind of sampling and the properties of the measurement we are using. In other words, the rejection criterion for the null hypothesis is set to a critical value, which corresponds to a known probability, the Type I error rate.
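To make the critical-value comparison concrete, here is a short sketch using scipy's t-distribution (the chapter itself uses tables and R; this Python example and its observed test statistic are illustrative assumptions):

```python
from scipy import stats

alpha = 0.05          # Type I error rate, chosen before testing
df = 9                # e.g., a one-sample test with n = 10 observations

# Two-tailed critical value: the quantile leaving alpha/2 in each tail
t_crit = stats.t.ppf(1 - alpha / 2, df)
print(round(t_crit, 3))            # 2.262

t_obs = 3.69                       # hypothetical test statistic
print(abs(t_obs) > t_crit)         # True -> provisionally reject H0
```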

Before proceeding with yet another, and hopefully less technical, discussion of test statistics and critical values, we need to discuss the two types of statistical errors. The Type I error rate is the probability that we reject a null hypothesis as a result of our evaluation of our data when, in the reference population, the null hypothesis is in fact true. In biology we generally use a Type I error rate of α = 0.05 as the level of significance. If α = 0.05, we say that the probability of obtaining the observed value AND H O  being true is 1 in 20 (5%). Put another way, we are willing to reject the null hypothesis when there is only a 5% chance that the observations could occur and the null hypothesis still be true. Our test statistic is associated with the p-value; the critical value is associated with the Type I error rate. If and only if the test statistic value equals the critical value will the p-value equal the Type I error rate.
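One way to internalize the Type I error rate is a small simulation: draw both samples from the same population, so the null hypothesis is true by construction, and count how often the test nonetheless rejects. A sketch assuming Python with numpy and scipy (all settings here are illustrative); the rejection rate should come out close to 5%:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, n_experiments = 0.05, 10_000

false_positives = 0
for _ in range(n_experiments):
    # Both samples come from the SAME population, so H0 is true
    a = rng.normal(loc=0.0, scale=1.0, size=20)
    b = rng.normal(loc=0.0, scale=1.0, size=20)
    if stats.ttest_ind(a, b).pvalue <= alpha:
        false_positives += 1

# The observed rejection rate approximates the Type I error rate
print(false_positives / n_experiments)
```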

The second error type associated with hypothesis testing is β, the  Type II statistical error rate . This is the case where we accept, or fail to reject, a null hypothesis based on our data when, in the reference population, the null hypothesis is actually false.

Thus, we end with a concept that may take you a while to come to terms with — there are four, not two, possible outcomes of an experiment.

What are the possible outcomes of a comparative experiment? We have two treatments: one in which subjects are given a treatment, and the other in which subjects receive a placebo. Subjects are followed and an outcome is measured. We calculate the descriptive statistics, aka summary statistics (means, standard deviations, and perhaps other statistics), and then ask whether there is a difference between the statistics for the groups. So, two possible outcomes of the experiment, correct? If the treatment has no effect, then we would expect the two groups to have roughly the same values for means, etc.; in other words, any difference between the groups is due to chance fluctuations in the measurements and not because of any systematic effect of the treatment. Conversely, if there is an effect of the treatment, we expect to see a large enough difference in the statistics that we would notice the systematic effect due to the treatment.

Actually, there are four, not two, possible outcomes of an experiment, just as there were four and not two conclusions about the results of a clinical assay. The four possible outcomes of a test of a statistical null hypothesis are illustrated in Table 1.

Table 1. When conducting hypothesis testing, four outcomes are possible.

In the actual population, a thing happens or it doesn’t; the null hypothesis is either true or it is not. But we don’t have access to the reference population; we don’t have a census. In other words, there is truth, but we don’t have access to the truth. We can, however, weight our decisions, with the weight expressed as a probability or p-value, by how likely our results are given the assumption that the truth is indeed “no difference.”

If you recall, we’ve seen a table like Table 1 before in our discussion of conditional probability and risk analysis ( Chapter 7.3 ). We made the point that statistical inference and the interpretation of clinical tests are similar (Browner and Newman 1987). From the perspective of ordering a diagnostic test , the proper null hypothesis would be the patient does not have the disease. For your review, here’s that table (Table 2).

Table 2. Interpretations of results of a diagnostic or clinical test.

Thus, a positive diagnostic test result is interpreted as rejecting the null hypothesis. If the person actually does not have the disease, then the positive diagnostic test is a false positive.

Questions

1. Match the corresponding entries in the two tables. For example, which outcome from the inference/hypothesis table matches specificity of the test?
2. Find three sources on the web for definitions of the p-value. Write out these definitions in your notes and compare them.
3. In your own words distinguish between the test statistic and the critical value.
4. Can the p-value associated with the test statistic ever be zero? Explain.
5. Given that the p-value is calculated assuming the null hypothesis is true, what value must the p-value take for us to provisionally reject the null hypothesis?
6. All of our discussions have been about testing the null hypothesis, about accepting or rejecting, provisionally, the null hypothesis. If we reject the null hypothesis, can we say that we have evidence for the alternate hypothesis?
7. What are the p-values for -logP of 5, 6, 7, 8, 9, and 10? Complete the p-value -logP transform table .
8. Instead of a log 10 transform, create a similar table but for a negative natural log transform. Which is more convenient? Hint: log(x, base=exp(1))
Chapter 8 contents

  • The null and alternative hypotheses
  • The controversy over proper hypothesis testing
  • Sampling distribution and hypothesis testing
  • Tails of a test
  • One sample t-test
  • Confidence limits for the estimate of population mean
  • References and suggested readings


Null and Alternative Hypotheses | Definitions & Examples

Published on 5 October 2022 by Shaun Turney . Revised on 6 December 2022.

The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test :

  • Null hypothesis (H 0 ): There’s no effect in the population .
  • Alternative hypothesis (H A ): There’s an effect in the population.

The effect is usually the effect of the independent variable on the dependent variable .

Table of contents

  • Answering your research question with hypotheses
  • What is a null hypothesis?
  • What is an alternative hypothesis?
  • Differences between null and alternative hypotheses
  • How to write null and alternative hypotheses
  • Frequently asked questions about null and alternative hypotheses

The null and alternative hypotheses offer competing answers to your research question . When the research question asks “Does the independent variable affect the dependent variable?”, the null hypothesis (H 0 ) answers “No, there’s no effect in the population.” On the other hand, the alternative hypothesis (H A ) answers “Yes, there is an effect in the population.”

The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample . Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample.

You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.

The null hypothesis is the claim that there’s no effect in the population.

If the sample provides enough evidence against the claim that there’s no effect in the population ( p ≤ α), then we can reject the null hypothesis . Otherwise, we fail to reject the null hypothesis.

Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept. Be careful not to say you “prove” or “accept” the null hypothesis.

Null hypotheses often include phrases such as “no effect”, “no difference”, or “no relationship”. When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).

Examples of null hypotheses

The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.

*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p 1 = p 2 .

The alternative hypothesis (H A ) is the other answer to your research question . It claims that there’s an effect in the population.

Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.

The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.

Alternative hypotheses often include phrases such as “an effect”, “a difference”, or “a relationship”. When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes > or <). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.

Examples of alternative hypotheses

The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.

Null and alternative hypotheses are similar in some ways:

  • They’re both answers to the research question
  • They both make claims about the population
  • They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the following table.

To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.

The only thing you need to know to use these general template sentences is your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:

Does independent variable affect dependent variable ?

  • Null hypothesis (H 0 ): Independent variable does not affect dependent variable .
  • Alternative hypothesis (H A ): Independent variable affects dependent variable .

Test-specific

Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.

Note: The template sentences above assume that you’re performing two-tailed tests . Two-tailed tests are appropriate for most studies.

The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).

The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (‘ x affects y because …’).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study , the statistical hypotheses correspond logically to the research hypothesis.


Turney, S. (2022, December 06). Null and Alternative Hypotheses | Definitions & Examples. Scribbr. Retrieved 9 April 2024, from https://www.scribbr.co.uk/stats/null-and-alternative-hypothesis/



Null Hypothesis and Alternative Hypothesis


Hypothesis testing involves the careful construction of two statements: the null hypothesis and the alternative hypothesis. These hypotheses can look very similar but are actually different.

How do we know which hypothesis is the null and which one is the alternative? We will see that there are a few ways to tell the difference.

The Null Hypothesis

The null hypothesis reflects that there will be no observed effect in our experiment. In a mathematical formulation of the null hypothesis, there will typically be an equal sign. This hypothesis is denoted by H 0 .

The null hypothesis is what we attempt to find evidence against in our hypothesis test. We hope to obtain a small enough p-value that it is lower than our level of significance alpha and we are justified in rejecting the null hypothesis. If our p-value is greater than alpha, then we fail to reject the null hypothesis.

If the null hypothesis is not rejected, then we must be careful to say what this means. The thinking on this is similar to a legal verdict. Just because a person has been declared "not guilty", it does not mean that he is innocent. In the same way, just because we failed to reject a null hypothesis it does not mean that the statement is true.

For example, we may want to investigate the claim that despite what convention has told us, the mean adult body temperature is not the accepted value of 98.6 degrees Fahrenheit . The null hypothesis for an experiment to investigate this is “The mean adult body temperature for healthy individuals is 98.6 degrees Fahrenheit.” If we fail to reject the null hypothesis, then our working hypothesis remains that the average adult who is healthy has a temperature of 98.6 degrees. We do not prove that this is true.

If we are studying a new treatment, the null hypothesis is that our treatment will not change our subjects in any meaningful way. In other words, the treatment will not produce any effect in our subjects.

The Alternative Hypothesis

The alternative or experimental hypothesis reflects that there will be an observed effect for our experiment. In a mathematical formulation of the alternative hypothesis, there will typically be an inequality, or not equal to symbol. This hypothesis is denoted by either H a or by H 1 .

The alternative hypothesis is what we are attempting to demonstrate in an indirect way by the use of our hypothesis test. If the null hypothesis is rejected, then we accept the alternative hypothesis. If the null hypothesis is not rejected, then we do not accept the alternative hypothesis. Going back to the above example of mean human body temperature, the alternative hypothesis is “The average adult human body temperature is not 98.6 degrees Fahrenheit.”

If we are studying a new treatment, then the alternative hypothesis is that our treatment does, in fact, change our subjects in a meaningful and measurable way.

The following set of negations may help when you are forming your null and alternative hypotheses. Most technical papers rely on just the first formulation, even though you may see some of the others in a statistics textbook.

  • Null hypothesis: “ x is equal to y .” Alternative hypothesis “ x is not equal to y .”
  • Null hypothesis: “ x is at least y .” Alternative hypothesis “ x is less than y .”
  • Null hypothesis: “ x is at most y .” Alternative hypothesis “ x is greater than y .”
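The three negation pairs above map directly onto the `alternative` argument of scipy's one-sample t-test (available in scipy 1.6 and later). A sketch continuing the 98.6-degree body-temperature example, with hypothetical measurements:

```python
from scipy import stats

x = [96.8, 99.1, 97.5, 98.2, 98.9, 97.9]   # hypothetical temperatures (deg F)
mu0 = 98.6

# H0: mean equals mu0      HA: mean is not equal to mu0
_, p_two = stats.ttest_1samp(x, mu0, alternative='two-sided')
# H0: mean at least mu0    HA: mean is less than mu0
_, p_less = stats.ttest_1samp(x, mu0, alternative='less')
# H0: mean at most mu0     HA: mean is greater than mu0
_, p_greater = stats.ttest_1samp(x, mu0, alternative='greater')

# For the symmetric t-distribution, the two one-tailed p-values sum to 1,
# and the smaller of them is half the two-tailed p-value.
print(round(p_less + p_greater, 6))                       # 1.0
print(abs(2 * min(p_less, p_greater) - p_two) < 1e-9)     # True
```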


Statistics LibreTexts

4.1: One-Sample t-Test


  • John H. McDonald
  • University of Delaware

Learning Objectives

  • Use Student's \(t\)–test for one sample when you have one measurement variable and a theoretical expectation of what the mean should be under the null hypothesis. It tests whether the mean of the measurement variable is different from the null expectation.

There are several statistical tests that use the \(t\)-distribution and can be called a \(t\)-test. One is Student's \(t\)-test for one sample, named after "Student," the pseudonym that William Gosset used to hide his employment by the Guinness brewery in the early 1900s (they had a rule that their employees weren't allowed to publish, and Guinness didn't want other employees to know that they were making an exception for Gosset). Student's \(t\)-test for one sample compares a sample to a theoretical mean. It has so few uses in biology that I didn't cover it in previous editions of this Handbook, but then I recently found myself using it (McDonald and Dunn 2013), so here it is.

When to use it

Use Student's \(t\)-test when you have one measurement variable, and you want to compare the mean value of the measurement variable to some theoretical expectation. It is commonly used in fields such as physics (you've made several observations of the mass of a new subatomic particle—does the mean fit the mass predicted by the Standard Model of particle physics?) and product testing (you've measured the amount of drug in several aliquots from a new batch—is the mean of the new batch significantly less than the standard you've established for that drug?). It's rare to have this kind of theoretical expectation in biology, so you'll probably never use the one-sample \(t\)-test.

I've had a hard time finding a real biological example of a one-sample \(t\)-test, so imagine that you're studying joint position sense, our ability to know what position our joints are in without looking or touching. You want to know whether people over- or underestimate their knee angle. You blindfold \(10\) volunteers, bend their knee to a \(120^{\circ}\) angle for a few seconds, then return the knee to a \(90^{\circ}\) angle. Then you ask each person to bend their knee to the \(120^{\circ}\) angle. The measurement variable is the angle of the knee, and the theoretical expectation from the null hypothesis is \(120^{\circ}\). You get the following imaginary data:

If the null hypothesis were true that people don't over- or underestimate their knee angle, the mean of these \(10\) numbers would be \(120\). The mean of these ten numbers is \(117.2\); the one-sample \(t\)–test will tell you whether that is significantly different from \(120\).

Null hypothesis

The statistical null hypothesis is that the mean of the measurement variable is equal to a number that you decided on before doing the experiment. For the knee example, the biological null hypothesis is that people don't under- or overestimate their knee angle. You decided to move people's knees to \(120^{\circ}\), so the statistical null hypothesis is that the mean angle of the subjects' knees will be \(120^{\circ}\).

How the test works

Calculate the test statistic,\(t_s\), using this formula:

\[t_s=\frac{(\bar{x}-\mu _\theta )}{(s/\sqrt{n})}\]

where \(\bar{x}\) is the sample mean, \(\mu _\theta\) is the mean expected under the null hypothesis, \(s\) is the sample standard deviation and \(n\) is the sample size. The test statistic, \(t_s\), gets bigger as the difference between the observed and expected means gets bigger, as the standard deviation gets smaller, or as the sample size gets bigger.

Applying this formula to the imaginary knee position data gives a \(t\)-value of \(-3.69\).

You calculate the probability of getting the observed \(t_s\) value under the null hypothesis using the t-distribution. The shape of the \(t\)-distribution, and thus the probability of getting a particular \(t_s\) value, depends on the number of degrees of freedom. The degrees of freedom for a one-sample \(t\)-test is the total number of observations in the group minus \(1\). For our example data, the \(P\) value for a \(t\)-value of \(-3.69\) with \(9\) degrees of freedom is \(0.005\), so you would reject the null hypothesis and conclude that people return their knee to a significantly smaller angle than the original position.
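The formula above can be written out directly and checked against a library routine. A sketch assuming Python with scipy; the sample values are hypothetical, since the chapter's knee-angle data table is not reproduced here:

```python
import math
from scipy import stats

def one_sample_t(sample, mu0):
    """Hand calculation of the one-sample t statistic and two-tailed p-value."""
    n = len(sample)
    xbar = sum(sample) / n
    # Sample standard deviation with the Bessel correction (n - 1)
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    t_s = (xbar - mu0) / (s / math.sqrt(n))
    p = 2 * stats.t.sf(abs(t_s), df=n - 1)
    return t_s, p

sample = [118, 122, 119, 121, 118]   # hypothetical knee angles (degrees)
t_s, p = one_sample_t(sample, 120)

# scipy's built-in one-sample t-test agrees with the hand calculation
t_chk, p_chk = stats.ttest_1samp(sample, 120)
print(abs(t_s - t_chk) < 1e-9 and abs(p - p_chk) < 1e-9)   # True
```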

Assumptions

The \(t\)-test assumes that the observations within each group are normally distributed. If the distribution is symmetrical, such as a flat or bimodal distribution, the one-sample \(t\)-test is not at all sensitive to the non-normality; you will get accurate estimates of the \(P\) value, even with small sample sizes. A severely skewed distribution can give you too many false positives unless the sample size is large (above \(50\) or so). If your data are severely skewed and you have a small sample size, you should try a data transformation to make them less skewed. With large sample sizes (simulations I've done suggest \(50\) is large enough), the one-sample \(t\)-test will give accurate results even with severely skewed data.

McDonald and Dunn (2013) measured the correlation of transferrin (labeled red) and Rab-10 (labeled green) in five cells. The biological null hypothesis is that transferrin and Rab-10 are not colocalized (found in the same subcellular structures), so the statistical null hypothesis is that the correlation coefficient between red and green signals in each cell image has a mean of zero. The correlation coefficients were \(0.52,\; 0.20,\; 0.59,\; 0.62\) and \(0.60\) in the five cells. The mean is \(0.51\), which is highly significantly different from \(0\) (\(t=6.46,\; 4d.f.,\; P=0.003\)), indicating that transferrin and Rab-10 are colocalized in these cells.
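The reported numbers can be verified with a quick one-sample t-test on the five correlation coefficients given in the text. A sketch assuming Python with scipy:

```python
from scipy import stats

# Correlation coefficients reported for the five cells
r = [0.52, 0.20, 0.59, 0.62, 0.60]

# H0: the mean correlation between red and green signals is zero
t, p = stats.ttest_1samp(r, 0)
print(round(t, 2))    # 6.46
print(round(p, 3))    # 0.003
```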

Graphing the results

Because you're just comparing one observed mean to one expected value, you probably won't put the results of a one-sample \(t\) - test in a graph. If you've done a bunch of them, I guess you could draw a bar graph with one bar for each mean, and a dotted horizontal line for the null expectation.

Similar tests

The paired \(t\)-test is a special case of the one-sample \(t\)-test; it tests the null hypothesis that the mean difference between two measurements (such as the strength of the right arm minus the strength of the left arm) is equal to zero. Experiments that use a paired \(t\)-test are much more common in biology than experiments using the one-sample \(t\)-test, so I treat the paired \(t\)-test as a completely different test.
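To make the connection concrete, here is a Python sketch with made-up right-arm and left-arm strengths, showing that the paired \(t\)-test is just a one-sample \(t\)-test on the differences:

```python
import math
import statistics

# Made-up strengths of right and left arm for six people (illustrative only).
right = [42.0, 38.5, 44.1, 40.3, 39.8, 43.2]
left  = [40.1, 37.9, 42.8, 40.5, 38.2, 41.9]

# The paired t-test is a one-sample t-test on the differences,
# with a null-hypothesis mean of zero.
d = [r - l for r, l in zip(right, left)]
n = len(d)
t_s = statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(n))
df = n - 1
```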

The two-sample \(t\)-test compares the means of two different samples. If one of your samples is very large, you may be tempted to treat the mean of the large sample as a theoretical expectation, but this is incorrect. For example, let's say you want to know whether college softball pitchers have greater shoulder flexion angles than normal people. You might be tempted to look up the "normal" shoulder flexion angle (\(150^{\circ}\)) and compare your data on pitchers to the normal angle using a one-sample \(t\)-test. However, the "normal" value doesn't come from some theory; it is based on data that have a mean, a standard deviation, and a sample size. At the very least you should dig out the original study and compare your sample to the sample the \(150^{\circ}\) "normal" was based on, using a two-sample \(t\)-test that takes the variation and sample size of both samples into account.
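A minimal sketch of the two-sample alternative, using Welch's \(t\)-statistic and made-up summary statistics (the numbers below are purely illustrative, not from any real study of pitchers):

```python
import math

# Made-up summary statistics (illustrative only): your pitcher sample
# versus the published sample behind the 150-degree "normal" angle.
m1, s1, n1 = 155.0, 8.0, 20    # pitchers: mean, SD, sample size
m2, s2, n2 = 150.0, 10.0, 50   # published "normals": mean, SD, sample size

# Welch's two-sample t-statistic uses the variation and sample size
# of both samples, unlike a one-sample test against a fixed 150.
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
t_s = (m1 - m2) / se
```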

How to do the test

Spreadsheets.

I have set up a spreadsheet to perform the one-sample \(t\)-test: onesamplettest.xls. It will handle up to \(1000\) observations.

There are web pages to do the one-sample \(t\)-test here and here.

Salvatore Mangiafico's \(R\) Companion has a sample R program for the one-sample \(t\)-test.

In SAS, you can use PROC TTEST for Student's \(t\)-test; for a one-sample test, the VAR statement names the measurement variable and the H0= option gives the theoretical value (no CLASS statement is needed, since there is only one group). Here is an example program for the joint position sense data above. Note that the H0 parameter for the theoretical value is \(H\) followed by the numeral zero, not a capital letter \(O\).

DATA jps;
   INPUT angle;
   DATALINES;
120.6
116.4
117.2
118.1
114.1
116.9
113.3
121.1
116.9
117.0
;
PROC TTEST DATA=jps H0=120;
   VAR angle;
RUN;

The output includes some descriptive statistics, plus the \(t\)-value and \(P\) value. For these data, the \(P\) value is \(0.005\).

DF    t Value    Pr > |t|
 9      -3.69      0.0050

Power analysis

To estimate the sample size needed to detect a significant difference between a mean and a theoretical value, you need the following:

  • the effect size, or the difference between the observed mean and the theoretical value that you hope to detect
  • the standard deviation
  • alpha, or the significance level (usually \(0.05\))
  • power (\(1-\text{beta}\)), the probability of rejecting the null hypothesis when it is false (\(0.50,\; 0.80\) and \(0.90\) are common values)

The G*Power program will calculate the sample size needed for a one-sample \(t\)-test. Choose "t tests" from the "Test family" menu and "Means: Difference from constant (one sample case)" from the "Statistical test" menu. Click on the "Determine" button and enter the theoretical value ("Mean \(H0\)") and a mean with the smallest difference from the theoretical that you hope to detect ("Mean \(H1\)"). Enter an estimate of the standard deviation. Click on "Calculate and transfer to main window". Change "tails" to two, set your alpha (this will almost always be \(0.05\)) and your power (\(0.5\), \(0.8\), or \(0.9\) are commonly used).

As an example, let's say you want to follow up the knee joint position sense study that I made up above with a study of hip joint position sense. You're going to set the hip angle to \(70^{\circ}\) (Mean \(H0=70\)) and you want to detect an over- or underestimation of this angle of \(1^{\circ}\), so you set Mean \(H1=71\). You don't have any hip angle data, so you use the standard deviation from your knee study and enter \(2.4\) for SD. You want to do a two-tailed test at the \(P<0.05\) level, with a probability of detecting a difference this large, if it exists, of \(90\%\) (\(1-\text {beta}=0.90\)). Entering all these numbers in G*Power gives a sample size of \(63\) people.
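You can approximate G*Power's answer with the standard normal-approximation formula for sample size, \(n \approx ((z_{1-\alpha/2}+z_{\text{power}})\,\sigma/\delta)^2\). It gives about \(61\) here, a little less than G*Power's \(63\), because the exact calculation uses the heavier-tailed \(t\) distribution:

```python
import math
from statistics import NormalDist

# Hip example values: detect a 1-degree difference, SD 2.4,
# two-tailed alpha 0.05, power 0.90.
delta, sd, alpha, power = 1.0, 2.4, 0.05, 0.90

z = NormalDist().inv_cdf
n = math.ceil(((z(1 - alpha / 2) + z(power)) * sd / delta) ** 2)
# n is 61 by this normal approximation; G*Power's exact t-based answer is 63.
```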

  • McDonald, J.H., and K.W. Dunn. 2013. Statistical tests for measures of colocalization in biological microscopy. Journal of Microscopy 252: 295-302.
