9.1 Null and Alternative Hypotheses

The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints.

H 0 , the null hypothesis: a statement of no difference between sample means or proportions, or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

H a , the alternative hypothesis: a claim about the population that is contradictory to H 0 and what we conclude when we reject H 0.

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision: "reject H 0" if the sample information favors the alternative hypothesis, or "do not reject H 0" (also phrased "decline to reject H 0") if the sample information is insufficient to reject the null hypothesis.

Mathematical Symbols Used in H 0 and H a :

H 0 always has a symbol with an equal in it. H a never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.
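The pairing rule above can be sketched as a small lookup. The names `NULL_TO_ALTERNATIVE` and `alternative_symbol` are hypothetical and exist only for this illustration.

```python
# Hypothetical helper: pairs each symbol allowed in H0 with the
# contradictory symbol that Ha must use.
NULL_TO_ALTERNATIVE = {"=": "≠", "≤": ">", "≥": "<"}


def alternative_symbol(null_symbol: str) -> str:
    """Return the symbol Ha uses when H0 uses `null_symbol`."""
    if null_symbol not in NULL_TO_ALTERNATIVE:
        # H0 must contain some form of equality, so anything else is rejected.
        raise ValueError(f"H0 needs =, ≤, or ≥, got {null_symbol!r}")
    return NULL_TO_ALTERNATIVE[null_symbol]


for symbol in ("=", "≤", "≥"):
    print(f"H0: μ {symbol} 66    Ha: μ {alternative_symbol(symbol)} 66")
```

The lookup follows the strict pairing. Under the common practice mentioned above, H 0 would be written with = even when H a uses > or <.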

Example 9.1

H 0 : No more than 30 percent of the registered voters in Santa Clara County voted in the primary election. p ≤ 0.30

H a : More than 30 percent of the registered voters in Santa Clara County voted in the primary election. p > 0.30
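As a rough sketch of how hypotheses like these might be tested, the code below runs a right-tailed z test for one proportion using a normal approximation. The sample counts (170 of 500 voters) are invented for illustration and are not from the text; the function name is also an assumption.

```python
from statistics import NormalDist


def one_sided_proportion_test(successes, n, p0):
    """Right-tailed z test of H0: p <= p0 against Ha: p > p0."""
    p_hat = successes / n
    # The standard error is computed at the boundary value p = p0,
    # which is why writing H0 with "=" leads to the same calculation.
    standard_error = (p0 * (1 - p0) / n) ** 0.5
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical sample: 170 of 500 registered voters report having voted.
z, p_value = one_sided_proportion_test(successes=170, n=500, p0=0.30)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```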

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25 percent. State the null and alternative hypotheses.

Example 9.2

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are the following:

  • H 0 : μ = 2.0
  • H a : μ ≠ 2.0
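A minimal sketch of a two-sided test for this pair of hypotheses follows. It uses a large-sample normal approximation rather than a t test, and the sample summary (mean 2.1, standard deviation 0.5, n = 100) is made up for illustration.

```python
from statistics import NormalDist


def two_sided_mean_test(sample_mean, sample_sd, n, mu0):
    """Two-sided z test of H0: mu = mu0 against Ha: mu != mu0."""
    standard_error = sample_sd / n ** 0.5
    z = (sample_mean - mu0) / standard_error
    # Both tails count as evidence against H0 because Ha uses "≠".
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical sample summary, not taken from the text.
z, p_value = two_sided_mean_test(sample_mean=2.1, sample_sd=0.5, n=100, mu0=2.0)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```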

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 66
  • H a : μ __ 66

Example 9.3

We want to test if college students take fewer than five years to graduate from college, on the average. The null and alternative hypotheses are the following:

  • H 0 : μ ≥ 5
  • H a : μ < 5

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 45
  • H a : μ __ 45

Example 9.4

An article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third of the students pass. The same article stated that 6.6 percent of U.S. students take advanced placement exams and 4.4 percent pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6 percent. State the null and alternative hypotheses.

  • H 0 : p ≤ 0.066
  • H a : p > 0.066

On a state driver’s test, about 40 percent pass the test on the first try. We want to test if more than 40 percent pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : p __ 0.40
  • H a : p __ 0.40

Collaborative Exercise

Bring to class a newspaper, some news magazines, and some internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.


This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Statistics
  • Publication date: Mar 27, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/statistics/pages/9-1-null-and-alternative-hypotheses

© Jan 23, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.


Null and Alternative Hypotheses | Definitions & Examples

Published on 5 October 2022 by Shaun Turney. Revised on 6 December 2022.

The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test:

  • Null hypothesis (H 0 ): There’s no effect in the population.
  • Alternative hypothesis (H A ): There’s an effect in the population.

The effect is usually the effect of the independent variable on the dependent variable.


The null and alternative hypotheses offer competing answers to your research question. When the research question asks “Does the independent variable affect the dependent variable?”, the null hypothesis (H 0 ) answers “No, there’s no effect in the population.” On the other hand, the alternative hypothesis (H A ) answers “Yes, there is an effect in the population.”

The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample. Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample.

You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.

The null hypothesis is the claim that there’s no effect in the population.

If the sample provides enough evidence against the claim that there’s no effect in the population (p ≤ α), then we can reject the null hypothesis. Otherwise, we fail to reject the null hypothesis.
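The decision rule can be written as a short function. This is only a sketch; the name `decide` and the default α of 0.05 are assumptions made for the example.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule: reject H0 when p <= alpha, otherwise fail to reject."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


print(decide(0.03))   # below alpha
print(decide(0.20))   # above alpha
```

Note that the function never returns "accept H0", in line with the wording convention described in the text.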

Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept. Be careful not to say you “prove” or “accept” the null hypothesis.

Null hypotheses often include phrases such as “no effect”, “no difference”, or “no relationship”. When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).

Examples of null hypotheses

The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.

*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p 1 = p 2 .

The alternative hypothesis (H A ) is the other answer to your research question. It claims that there’s an effect in the population.

Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.

The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.

Alternative hypotheses often include phrases such as “an effect”, “a difference”, or “a relationship”. When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes > or <). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.

Examples of alternative hypotheses

The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.

Null and alternative hypotheses are similar in some ways:

  • They’re both answers to the research question
  • They both make claims about the population
  • They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the following table.

To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.

The only things you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:

Does [independent variable] affect [dependent variable]?

  • Null hypothesis (H 0 ): [Independent variable] does not affect [dependent variable].
  • Alternative hypothesis (H A ): [Independent variable] affects [dependent variable].
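The general template sentences above can be filled in programmatically. The sketch below uses a hypothetical function, `general_hypotheses`, and assumes a singular independent variable so that "affects" stays grammatical.

```python
def general_hypotheses(independent: str, dependent: str) -> dict:
    """Fill the general template sentences with the two variable names."""
    # Capitalize only the first letter so the rest of the phrase is unchanged.
    subject = independent[0].upper() + independent[1:]
    return {
        "question": f"Does {independent} affect {dependent}?",
        "null": f"{subject} does not affect {dependent}.",
        "alternative": f"{subject} affects {dependent}.",
    }


hypotheses = general_hypotheses("daily meditation", "the incidence of depression")
for label, sentence in hypotheses.items():
    print(f"{label}: {sentence}")
```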

Test-specific

Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.

Note: The template sentences above assume that you’re performing two-tailed tests. Two-tailed tests are appropriate for most studies.

The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).

The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (‘x affects y because …’).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study, the statistical hypotheses correspond logically to the research hypothesis.


Turney, S. (2022, December 06). Null and Alternative Hypotheses | Definitions & Examples. Scribbr. Retrieved 14 May 2024, from https://www.scribbr.co.uk/stats/null-and-alternative-hypothesis/


Module 9: Hypothesis Testing With One Sample

Null and Alternative Hypotheses

Learning Outcomes

  • Describe hypothesis testing in general and in practice

The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints.

H 0 : The null hypothesis: It is a statement about the population that either is believed to be true or is used to put forth an argument unless it can be shown to be incorrect beyond a reasonable doubt.

H a : The alternative hypothesis : It is a claim about the population that is contradictory to H 0 and what we conclude when we reject H 0 .

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision. They are “reject H 0” if the sample information favors the alternative hypothesis or “do not reject H 0” or “decline to reject H 0” if the sample information is insufficient to reject the null hypothesis.

Mathematical Symbols Used in  H 0 and H a :

H 0 always has a symbol with an equal in it. H a never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers (including one of the co-authors in research work) use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.

H 0 : No more than 30% of the registered voters in Santa Clara County voted in the primary election. p ≤ 0.30

H a : More than 30% of the registered voters in Santa Clara County voted in the primary election. p > 0.30

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25%. State the null and alternative hypotheses.

H 0 : The drug reduces cholesterol by 25%. p = 0.25

H a : The drug does not reduce cholesterol by 25%. p ≠ 0.25

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are:

H 0 : μ = 2.0

H a : μ ≠ 2.0

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : μ __ 66 H a : μ __ 66

  • H 0 : μ = 66
  • H a : μ ≠ 66

We want to test if college students take less than five years to graduate from college, on the average. The null and alternative hypotheses are:

H 0 : μ ≥ 5

H a : μ < 5

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : μ __ 45 H a : μ __ 45

  • H 0 : μ ≥ 45
  • H a : μ < 45

In an issue of U.S. News and World Report , an article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third pass. The same article stated that 6.6% of U.S. students take advanced placement exams and 4.4% pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6%. State the null and alternative hypotheses.

H 0 : p ≤ 0.066

H a : p > 0.066

On a state driver’s test, about 40% pass the test on the first try. We want to test if more than 40% pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : p __ 0.40 H a : p __ 0.40

  • H 0 : p = 0.40
  • H a : p > 0.40

Concept Review

In a hypothesis test, sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we:

  • Evaluate the null hypothesis, typically denoted with H 0. The null is not rejected unless the hypothesis test shows otherwise. The null statement must always contain some form of equality (=, ≤, or ≥).
  • Always write the alternative hypothesis, typically denoted with H a or H 1, using less than, greater than, or not equals symbols, i.e., (≠, >, or <).
  • If we reject the null hypothesis, then we can assume there is enough evidence to support the alternative hypothesis.
  • Never state that a claim is proven true or false. Keep in mind the underlying fact that hypothesis testing is based on probability laws; therefore, we can talk only in terms of non-absolute certainties.

Formula Review

H 0 and H a are contradictory.

  • OpenStax, Statistics, Null and Alternative Hypotheses. Provided by : OpenStax. Located at : http://cnx.org/contents/[email protected]:58/Introductory_Statistics . License : CC BY: Attribution
  • Introductory Statistics . Authored by : Barbara Illowski, Susan Dean. Provided by : Open Stax. Located at : http://cnx.org/contents/[email protected] . License : CC BY: Attribution . License Terms : Download for free at http://cnx.org/contents/[email protected]
  • Simple hypothesis testing | Probability and Statistics | Khan Academy. Authored by : Khan Academy. Located at : https://youtu.be/5D1gV37bKXY . License : All Rights Reserved . License Terms : Standard YouTube License


Chapter 13: Inferential Statistics

Understanding Null Hypothesis Testing

Learning Objectives

  • Explain the purpose of null hypothesis testing, including the role of sampling error.
  • Describe the basic logic of null hypothesis testing.
  • Describe the role of relationship strength and sample size in determining statistical significance and make reasonable judgments about statistical significance based on these two factors.

The Purpose of Null Hypothesis Testing

As we have seen, psychological research typically involves measuring one or more variables for a sample and computing descriptive statistics for that sample. In general, however, the researcher’s goal is not to draw conclusions about that sample but to draw conclusions about the population that the sample was selected from. Thus researchers must use sample statistics to draw conclusions about the corresponding values in the population. These corresponding values in the population are called  parameters . Imagine, for example, that a researcher measures the number of depressive symptoms exhibited by each of 50 clinically depressed adults and computes the mean number of symptoms. The researcher probably wants to use this sample statistic (the mean number of symptoms for the sample) to draw conclusions about the corresponding population parameter (the mean number of symptoms for clinically depressed adults).

Unfortunately, sample statistics are not perfect estimates of their corresponding population parameters. This is because there is a certain amount of random variability in any statistic from sample to sample. The mean number of depressive symptoms might be 8.73 in one sample of clinically depressed adults, 6.45 in a second sample, and 9.44 in a third—even though these samples are selected randomly from the same population. Similarly, the correlation (Pearson’s  r ) between two variables might be +.24 in one sample, −.04 in a second sample, and +.15 in a third—again, even though these samples are selected randomly from the same population. This random variability in a statistic from sample to sample is called  sampling error . (Note that the term error  here refers to random variability and does not imply that anyone has made a mistake. No one “commits a sampling error.”)
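Sampling error can be seen in a small simulation. The sketch below builds a hypothetical population of symptom counts (mean 8, standard deviation 3, both invented) and draws three random samples of 50 from it; the function name and seed are assumptions for the example.

```python
import random
import statistics


def draw_sample_means(population, sample_size, n_samples, seed=0):
    """Draw several random samples from one population and return their means."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population of symptom scores; the parameters are invented.
population_rng = random.Random(42)
population = [population_rng.gauss(8.0, 3.0) for _ in range(100_000)]

sample_means = draw_sample_means(population, sample_size=50, n_samples=3)
print("Population mean:", round(statistics.mean(population), 2))
print("Sample means:   ", [round(m, 2) for m in sample_means])
```

Each sample mean lands near the population mean but not exactly on it, and the three means differ from one another even though every sample comes from the same population.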

One implication of this is that when there is a statistical relationship in a sample, it is not always clear that there is a statistical relationship in the population. A small difference between two group means in a sample might indicate that there is a small difference between the two group means in the population. But it could also be that there is no difference between the means in the population and that the difference in the sample is just a matter of sampling error. Similarly, a Pearson’s  r  value of −.29 in a sample might mean that there is a negative relationship in the population. But it could also be that there is no relationship in the population and that the relationship in the sample is just a matter of sampling error.

In fact, any statistical relationship in a sample can be interpreted in two ways:

  • There is a relationship in the population, and the relationship in the sample reflects this.
  • There is no relationship in the population, and the relationship in the sample reflects only sampling error.

The purpose of null hypothesis testing is simply to help researchers decide between these two interpretations.

The Logic of Null Hypothesis Testing

Null hypothesis testing  is a formal approach to deciding between two interpretations of a statistical relationship in a sample. One interpretation is called the   null hypothesis  (often symbolized  H 0  and read as “H-naught”). This is the idea that there is no relationship in the population and that the relationship in the sample reflects only sampling error. Informally, the null hypothesis is that the sample relationship “occurred by chance.” The other interpretation is called the  alternative hypothesis  (often symbolized as  H 1 ). This is the idea that there is a relationship in the population and that the relationship in the sample reflects this relationship in the population.

Again, every statistical relationship in a sample can be interpreted in either of these two ways: It might have occurred by chance, or it might reflect a relationship in the population. So researchers need a way to decide between them. Although there are many specific null hypothesis testing techniques, they are all based on the same general logic. The steps are as follows:

  • Assume for the moment that the null hypothesis is true. There is no relationship between the variables in the population.
  • Determine how likely the sample relationship would be if the null hypothesis were true.
  • If the sample relationship would be extremely unlikely, then reject the null hypothesis  in favour of the alternative hypothesis. If it would not be extremely unlikely, then  retain the null hypothesis .
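The three steps above can be sketched with a permutation test, which is one of many null hypothesis testing techniques and is chosen here only because it follows the steps closely. The scores for the two groups are made up, and the function name is hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate how often chance alone gives a mean difference this large."""
    rng = random.Random(seed)
    n_a = len(group_a)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        # Step 1: assume H0 is true, so group labels are interchangeable.
        rng.shuffle(pooled)
        # Step 2: see what difference arises under that assumption.
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


# Hypothetical scores for two groups.
group_a = [12, 15, 14, 16, 13, 17, 15, 14]
group_b = [10, 11, 9, 12, 10, 11, 13, 10]
p = permutation_p_value(group_a, group_b)
# Step 3: reject H0 if the sample result would be very unlikely under H0.
print("p =", p, "->", "reject H0" if p <= 0.05 else "retain H0")
```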

Following this logic, we can begin to understand why Mehl and his colleagues concluded that there is no difference in talkativeness between women and men in the population. In essence, they asked the following question: “If there were no difference in the population, how likely is it that we would find a small difference of  d  = 0.06 in our sample?” Their answer to this question was that this sample relationship would be fairly likely if the null hypothesis were true. Therefore, they retained the null hypothesis—concluding that there is no evidence of a sex difference in the population. We can also see why Kanner and his colleagues concluded that there is a correlation between hassles and symptoms in the population. They asked, “If the null hypothesis were true, how likely is it that we would find a strong correlation of +.60 in our sample?” Their answer to this question was that this sample relationship would be fairly unlikely if the null hypothesis were true. Therefore, they rejected the null hypothesis in favour of the alternative hypothesis—concluding that there is a positive correlation between these variables in the population.

A crucial step in null hypothesis testing is finding the probability of a result at least as extreme as the sample result if the null hypothesis were true. This probability is called the p value. A low p value means that the sample result would be unlikely if the null hypothesis were true and leads to the rejection of the null hypothesis. A high p value means that the sample result would be likely if the null hypothesis were true and leads to the retention of the null hypothesis. But how low must the p value be before the sample result is considered unlikely enough to reject the null hypothesis? In null hypothesis testing, this criterion is called α (alpha) and is almost always set to .05. If there is less than a 5% chance of a result as extreme as the sample result if the null hypothesis were true, then the null hypothesis is rejected. When this happens, the result is said to be statistically significant. If there is greater than a 5% chance of a result as extreme as the sample result when the null hypothesis is true, then the null hypothesis is retained. This does not necessarily mean that the researcher accepts the null hypothesis as true, only that there is not currently enough evidence to reject it. Researchers often use the expression “fail to reject the null hypothesis” rather than “retain the null hypothesis,” but they never use the expression “accept the null hypothesis.”

The Misunderstood  p  Value

The p value is one of the most misunderstood quantities in psychological research (Cohen, 1994) [1]. Even professional researchers misinterpret it, and it is not unusual for such misinterpretations to appear in statistics textbooks!

The most common misinterpretation is that the p value is the probability that the null hypothesis is true, that is, the probability that the sample result occurred by chance. For example, a misguided researcher might say that because the p value is .02, there is only a 2% chance that the result is due to chance and a 98% chance that it reflects a real relationship in the population. But this is incorrect. The p value is really the probability of a result at least as extreme as the sample result if the null hypothesis were true. So a p value of .02 means that if the null hypothesis were true, a sample result this extreme would occur only 2% of the time.

You can avoid this misunderstanding by remembering that the p value is not the probability that any particular hypothesis is true or false. Instead, it is the probability of obtaining a result at least as extreme as the sample result if the null hypothesis were true.
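One way to see that the p value is computed under the assumption that the null hypothesis is true is to simulate many studies in which the null really is true. In the sketch below, every simulated study samples 30 values from a population with mean 0 and known standard deviation 1 (all invented settings) and runs a two-sided z test.

```python
import random
from statistics import NormalDist, mean


def two_sided_p(sample, mu0, sigma):
    """Two-sided z-test p-value for H0: mu = mu0 with known sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


rng = random.Random(1)
n_studies = 5_000
p_values = [
    two_sided_p([rng.gauss(0.0, 1.0) for _ in range(30)], mu0=0.0, sigma=1.0)
    for _ in range(n_studies)
]

# Share of simulated studies that would be declared significant at alpha = .05
share_significant = sum(p <= 0.05 for p in p_values) / n_studies
print(f"Share with p <= .05 when H0 is true: {share_significant:.3f}")
```

The share comes out close to .05. A small p value is therefore a statement about how rare the data would be under the null, not about how likely the null is.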

Role of Sample Size and Relationship Strength

Recall that null hypothesis testing involves answering the question, “If the null hypothesis were true, what is the probability of a sample result as extreme as this one?” In other words, “What is the  p  value?” It can be helpful to see that the answer to this question depends on just two considerations: the strength of the relationship and the size of the sample. Specifically, the stronger the sample relationship and the larger the sample, the less likely the result would be if the null hypothesis were true. That is, the lower the  p  value. This should make sense. Imagine a study in which a sample of 500 women is compared with a sample of 500 men in terms of some psychological characteristic, and Cohen’s  d  is a strong 0.50. If there were really no sex difference in the population, then a result this strong based on such a large sample should seem highly unlikely. Now imagine a similar study in which a sample of three women is compared with a sample of three men, and Cohen’s  d  is a weak 0.10. If there were no sex difference in the population, then a relationship this weak based on such a small sample should seem likely. And this is precisely why the null hypothesis would be rejected in the first example and retained in the second.
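The two studies described above can be compared with a rough calculation. For two equal-sized groups, the test statistic is approximately d·√(n/2), where n is the size of each group. The sketch below converts that to a two-sided p value with a normal approximation. This approximation is crude for very small samples, and the function name is an assumption.

```python
from statistics import NormalDist


def approx_p_from_d(d, n_per_group):
    """Approximate two-sided p-value for Cohen's d with equal group sizes."""
    # With a pooled SD, (mean1 - mean2) / (SD * sqrt(2 / n)) equals d * sqrt(n / 2).
    z = abs(d) * (n_per_group / 2) ** 0.5
    return 2 * (1 - NormalDist().cdf(z))


large_strong = approx_p_from_d(d=0.50, n_per_group=500)
small_weak = approx_p_from_d(d=0.10, n_per_group=3)
print(f"d = 0.50, n = 500 per group: p is about {large_strong:.3g}")
print(f"d = 0.10, n = 3 per group:   p is about {small_weak:.3g}")
```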

Of course, sometimes the result can be weak and the sample large, or the result can be strong and the sample small. In these cases, the two considerations trade off against each other so that a weak result can be statistically significant if the sample is large enough and a strong relationship can be statistically significant even if the sample is small. Table 13.1 shows roughly how relationship strength and sample size combine to determine whether a sample result is statistically significant. The columns of the table represent the three levels of relationship strength: weak, medium, and strong. The rows represent four sample sizes that can be considered small, medium, large, and extra large in the context of psychological research. Thus each cell in the table represents a combination of relationship strength and sample size. If a cell contains the word Yes, then this combination would be statistically significant for both Cohen’s d and Pearson’s r. If it contains the word No, then it would not be statistically significant for either. There is one cell where the decision for d and r would be different and another where it might be different depending on some additional considerations, which are discussed in Section 13.2 “Some Basic Null Hypothesis Tests.”

Although Table 13.1 provides only a rough guideline, it shows very clearly that weak relationships based on medium or small samples are never statistically significant and that strong relationships based on medium or larger samples are always statistically significant. If you keep this lesson in mind, you will often know whether a result is statistically significant based on the descriptive statistics alone. It is extremely useful to be able to develop this kind of intuitive judgment. One reason is that it allows you to develop expectations about how your formal null hypothesis tests are going to come out, which in turn allows you to detect problems in your analyses. For example, if your sample relationship is strong and your sample is medium, then you would expect to reject the null hypothesis. If for some reason your formal null hypothesis test indicates otherwise, then you need to double-check your computations and interpretations. A second reason is that the ability to make this kind of intuitive judgment is an indication that you understand the basic logic of this approach in addition to being able to do the computations.

Statistical Significance Versus Practical Significance

Table 13.1 illustrates another extremely important point. A statistically significant result is not necessarily a strong one. Even a very weak result can be statistically significant if it is based on a large enough sample. This is closely related to Janet Shibley Hyde’s argument about sex differences (Hyde, 2007) [2] . The differences between women and men in mathematical problem solving and leadership ability are statistically significant. But the word  significant  can cause people to interpret these differences as strong and important—perhaps even important enough to influence the college courses they take or even who they vote for. As we have seen, however, these statistically significant differences are actually quite weak—perhaps even “trivial.”

This is why it is important to distinguish between the  statistical  significance of a result and the  practical  significance of that result.  Practical significance refers to the importance or usefulness of the result in some real-world context. Many sex differences are statistically significant—and may even be interesting for purely scientific reasons—but they are not practically significant. In clinical practice, this same concept is often referred to as “clinical significance.” For example, a study on a new treatment for social phobia might show that it produces a statistically significant positive effect. Yet this effect still might not be strong enough to justify the time, effort, and other costs of putting it into practice—especially if easier and cheaper treatments that work almost as well already exist. Although statistically significant, this result would be said to lack practical or clinical significance.
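To make the gap between statistical and practical significance concrete, the sketch below estimates how many participants per group would be needed before a given Cohen’s d reaches significance at α = .05. It relies on the same normal approximation for two equal-sized groups (test statistic roughly d·√(n/2)), which is an assumption of this example rather than a method from the text.

```python
import math
from statistics import NormalDist


def min_n_per_group(d, alpha=0.05):
    """Smallest equal group size at which d is significant (two-sided, approx.)."""
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Solve d * sqrt(n / 2) >= z_critical for n and round up.
    return math.ceil(2 * (z_critical / abs(d)) ** 2)


for d in (0.50, 0.10, 0.05):
    print(f"d = {d:.2f}: about {min_n_per_group(d)} participants per group")
```

Any nonzero effect, however small, eventually crosses the significance threshold once the sample is large enough. That is why significance alone says little about importance.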

Key Takeaways

  • Null hypothesis testing is a formal approach to deciding whether a statistical relationship in a sample reflects a real relationship in the population or is just due to chance.
  • The logic of null hypothesis testing involves assuming that the null hypothesis is true, finding how likely the sample result would be if this assumption were correct, and then making a decision. If the sample result would be unlikely if the null hypothesis were true, then it is rejected in favour of the alternative hypothesis. If it would not be unlikely, then the null hypothesis is retained.
  • The probability of obtaining the sample result if the null hypothesis were true (the  p  value) is based on two considerations: relationship strength and sample size. Reasonable judgments about whether a sample relationship is statistically significant can often be made by quickly considering these two factors.
  • Statistical significance is not the same as relationship strength or importance. Even weak relationships can be statistically significant if the sample size is large enough. It is important to consider relationship strength and the practical significance of a result in addition to its statistical significance.
Exercises

  • Discussion: Imagine a study showing that people who eat more broccoli tend to be happier. Explain for someone who knows nothing about statistics why the researchers would conduct a null hypothesis test.
  • Practice: Use Table 13.1 to decide whether each of the following results is statistically significant.
  • The correlation between two variables is r = −.78 based on a sample size of 137.
  • The mean score on a psychological characteristic for women is 25 (SD = 5) and the mean score for men is 24 (SD = 5). There were 12 women and 10 men in this study.
  • In a memory experiment, the mean number of items recalled by the 40 participants in Condition A was 0.50 standard deviations greater than the mean number recalled by the 40 participants in Condition B.
  • In another memory experiment, the mean scores for participants in Condition A and Condition B came out exactly the same!
  • A student finds a correlation of r = .04 between the number of units the students in his research methods class are taking and the students’ level of stress.

Long Descriptions

“Null Hypothesis” long description: A comic depicting a man and a woman talking in the foreground. In the background is a child working at a desk. The man says to the woman, “I can’t believe schools are still teaching kids about the null hypothesis. I remember reading a big study that conclusively disproved it years ago.”

“Conditional Risk” long description: A comic depicting two hikers beside a tree during a thunderstorm. A bolt of lightning goes “crack” in the dark sky as thunder booms. One of the hikers says, “Whoa! We should get inside!” The other hiker says, “It’s okay! Lightning only kills about 45 Americans a year, so the chances of dying are only one in 7,000,000. Let’s go on!” The comic’s caption says, “The annual death rate among people who know that statistic is one in six.”

Media Attributions

  • Null Hypothesis by XKCD, CC BY-NC (Attribution NonCommercial)
  • Conditional Risk by XKCD, CC BY-NC (Attribution NonCommercial)

Notes

  1. Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003.
  2. Hyde, J. S. (2007). New directions in the study of gender similarities and differences. Current Directions in Psychological Science, 16, 259–263.

Parameters: Values in a population that correspond to variables measured in a study.

Sampling error: The random variability in a statistic from sample to sample.

Null hypothesis testing: A formal approach to deciding between two interpretations of a statistical relationship in a sample.

Null hypothesis: The idea that there is no relationship in the population and that the relationship in the sample reflects only sampling error.

Alternative hypothesis: The idea that there is a relationship in the population and that the relationship in the sample reflects this relationship in the population.

Reject the null hypothesis: When the relationship found in the sample would be extremely unlikely, the idea that the relationship occurred “by chance” is rejected.

Retain the null hypothesis: When the relationship found in the sample is likely to have occurred by chance, the null hypothesis is not rejected.

p value: The probability that, if the null hypothesis were true, the result found in the sample would occur.

α (alpha): How low the p value must be before the sample result is considered unlikely in null hypothesis testing.

Statistically significant: When there is less than a 5% chance of a result as extreme as the sample result occurring if the null hypothesis were true, and the null hypothesis is rejected.

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

What is The Null Hypothesis & When Do You Reject The Null Hypothesis

Julia Simkus

Editor at Simply Psychology

BA (Hons) Psychology, Princeton University

Julia Simkus is a graduate of Princeton University with a Bachelor of Arts in Psychology. She is currently studying for a Master's Degree in Counseling for Mental Health and Wellness in September 2023. Julia's research has been published in peer reviewed journals.


Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


A null hypothesis is a statistical concept suggesting no significant difference or relationship between measured variables. It’s the default assumption unless empirical evidence proves otherwise.

The null hypothesis states no relationship exists between the two variables being studied (i.e., one variable does not affect the other).

The null hypothesis is the statement that a researcher or an investigator wants to disprove.

Testing the null hypothesis can tell you whether your results are due to the effects of manipulating the independent variable or due to random chance.

How to Write a Null Hypothesis

Null hypotheses (H0) start as research questions that the investigator rephrases as statements indicating no effect or relationship between the independent and dependent variables.

It is a default position that your research aims to challenge or confirm.

For example, if studying the impact of exercise on weight loss, your null hypothesis might be:

There is no significant difference in weight loss between individuals who exercise daily and those who do not.

Examples of Null Hypotheses

When Do We Reject the Null Hypothesis?

We reject the null hypothesis when the data provide strong enough evidence to conclude that it is likely incorrect. This often occurs when the p-value (the probability of observing data at least as extreme as the sample, given that the null hypothesis is true) is below a predetermined significance level.

If the collected data does not meet the expectation of the null hypothesis, a researcher can conclude that the data lacks sufficient evidence to back up the null hypothesis, and thus the null hypothesis is rejected. 

Rejecting the null hypothesis suggests that a relationship does exist between a set of variables and that the effect is statistically significant (p < 0.05).

If the data collected from the random sample are not statistically significant, then the researchers fail to reject the null hypothesis; they cannot conclude that there is a relationship between the variables.

You need to perform a statistical test on your data in order to evaluate how consistent it is with the null hypothesis. A p-value is one statistical measurement used to validate a hypothesis against observed data.

Calculating the p-value is a critical part of null-hypothesis significance testing because it quantifies how strongly the sample data contradicts the null hypothesis.

The level of statistical significance is often expressed as a p-value between 0 and 1. The smaller the p-value, the stronger the evidence that you should reject the null hypothesis.

Usually, a researcher uses a confidence level of 95% or 99% (a significance level of 0.05 or 0.01) as a general guideline to decide whether to reject or retain the null hypothesis.

When your p-value is less than or equal to your significance level, you reject the null hypothesis.

In other words, smaller p-values are taken as stronger evidence against the null hypothesis. Conversely, when the p-value is greater than your significance level, you fail to reject the null hypothesis.

In this case, the sample data provide insufficient evidence to conclude that the effect exists in the population.
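The decision rule just described can be written down in a few lines. This is a minimal sketch rather than a prescribed procedure; the function name `decide` and the default significance level of 0.05 are assumptions made for illustration.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 when p <= alpha, otherwise fail to reject."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # Note the wording: the null hypothesis is never "accepted".
    return "reject H0" if p_value <= alpha else "fail to reject H0"


print(decide(0.03))              # p is below the default alpha of 0.05
print(decide(0.03, alpha=0.01))  # same p, stricter significance level
```

The same p-value can lead to different decisions at different significance levels, which is why the level should be chosen before looking at the data.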

Because you can never know with complete certainty whether there is an effect in the population, your inferences about a population will sometimes be incorrect.

When you incorrectly reject the null hypothesis, it’s called a type I error. When you incorrectly fail to reject it, it’s called a type II error.

Why Do We Never Accept The Null Hypothesis?

The reason we do not say “accept the null” is because we are always assuming the null hypothesis is true and then conducting a study to see if there is evidence against it. And, even if we don’t find evidence against it, a null hypothesis is not accepted.

A lack of evidence only means that you haven’t proven that something exists. It does not prove that something doesn’t exist. 

It is risky to conclude that the null hypothesis is true merely because we did not find evidence to reject it. It is always possible that researchers elsewhere have disproved the null hypothesis, so we cannot accept it as true, but instead, we state that we failed to reject the null. 

One can either reject the null hypothesis, or fail to reject it, but can never accept it.

Why Do We Use The Null Hypothesis?

We can never prove with 100% certainty that a hypothesis is true; we can only collect evidence that supports a theory. However, testing a hypothesis can set the stage for rejecting or failing to reject this hypothesis within a certain confidence level.

The null hypothesis is useful because it can tell us whether the results of our study are due to random chance or the manipulation of a variable (with a certain level of confidence).

A null hypothesis is rejected if the measured data would be very unlikely to have occurred were it true, and it is retained (not rejected) if the observed outcome is consistent with the position held by the null hypothesis.

Rejecting the null hypothesis sets the stage for further experimentation to see if a relationship between two variables exists. 

Hypothesis testing is a critical part of the scientific method as it helps decide whether the results of a research study support a particular theory about a given population. Hypothesis testing is a systematic way of backing up researchers’ predictions with statistical analysis.

It helps provide sufficient statistical evidence that either favors or rejects a certain hypothesis about the population parameter. 

Purpose of a Null Hypothesis 

  • The primary purpose of the null hypothesis is to disprove an assumption. 
  • Whether rejected or accepted, the null hypothesis can help further progress a theory in many scientific cases.
  • A null hypothesis can be used to ascertain how consistent the outcomes of multiple studies are.

Do you always need both a Null Hypothesis and an Alternative Hypothesis?

The null (H0) and alternative (Ha or H1) hypotheses are two competing claims that describe the effect of the independent variable on the dependent variable. They are mutually exclusive, which means that only one of the two hypotheses can be true. 

While the null hypothesis states that there is no effect in the population, an alternative hypothesis states that there is an effect or relationship between two variables.

The goal of hypothesis testing is to make inferences about a population based on a sample. In order to undertake hypothesis testing, you must express your research hypothesis as a null and alternative hypothesis. Both hypotheses are required to cover every possible outcome of the study. 

What is the difference between a null hypothesis and an alternative hypothesis?

The alternative hypothesis is the complement to the null hypothesis. The null hypothesis states that there is no effect or no relationship between variables, while the alternative hypothesis claims that there is an effect or relationship in the population.

It is the claim that you expect or hope will be true. The null hypothesis and the alternative hypothesis are always mutually exclusive, meaning that only one can be true at a time.

What are some problems with the null hypothesis?

One major problem with the null hypothesis is that researchers typically assume that failing to reject the null is a failure of the experiment. However, either outcome is informative. Even if the null is not refuted, the researchers will still learn something new.

Why can a null hypothesis not be accepted?

We can either reject or fail to reject a null hypothesis, but never accept it. If your test fails to detect an effect, this is not proof that the effect doesn’t exist. It just means that your sample did not have enough evidence to conclude that it exists.

We can’t accept a null hypothesis because a lack of evidence does not prove something that does not exist. Instead, we fail to reject it.

Failing to reject the null indicates that the sample did not provide sufficient evidence to conclude that an effect exists.

If the p-value is greater than the significance level, then you fail to reject the null hypothesis.

Is a null hypothesis directional or non-directional?

A hypothesis test can contain either a directional alternative hypothesis or a non-directional alternative hypothesis. A directional hypothesis is one that contains the less than (“<”) or greater than (“>”) sign.

A nondirectional hypothesis contains the not equal sign (“≠”).  However, a null hypothesis is neither directional nor non-directional.

A null hypothesis is a prediction that there will be no change, relationship, or difference between two variables.

The directional hypothesis or nondirectional hypothesis would then be considered alternative hypotheses to the null hypothesis.
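The choice between a directional and a non-directional alternative changes which tail of the distribution counts as evidence against the null. The sketch below illustrates this for a test statistic that follows a standard normal distribution; the function name `p_value_from_z` and the labels `"less"`, `"greater"`, and `"two-sided"` are assumptions for this example, not terms from the text above.

```python
from statistics import NormalDist


def p_value_from_z(z: float, alternative: str) -> float:
    """p-value for a z statistic under a directional or non-directional alternative."""
    cdf = NormalDist().cdf
    if alternative == "less":        # directional: Ha uses "<"
        return cdf(z)
    if alternative == "greater":     # directional: Ha uses ">"
        return 1 - cdf(z)
    if alternative == "two-sided":   # non-directional: Ha uses "≠"
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


for alt in ("less", "greater", "two-sided"):
    print(alt, round(p_value_from_z(1.5, alt), 4))
```

For a positive z, the two-sided p-value is twice the "greater" p-value, so a non-directional test needs stronger evidence to reach the same significance level.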



Null hypothesis

The null hypothesis (H 0 ) is the basis of statistical hypothesis testing. It is the default hypothesis (assumed to be true) that states that there is no statistically significant difference between some population parameter (such as the mean), and a hypothesized value. It is typically based on previous analysis or knowledge.

The null hypothesis is used for various purposes, such as to verify statistical assumptions, to verify that multiple experiments are producing consistent results, to directly advance theories, and more.

Most commonly, the null hypothesis is used to state the equality between two or more variables, such as a drug and a placebo. This equality is then tested in a statistical hypothesis test. Generally, the null hypothesis is the hypothesis that the researcher is attempting to disprove, though this is not necessarily always the goal. It is contrasted with the alternative hypothesis (H a ), which is a statement that there is some difference (value is greater than, less than, or not the same), and seeks to provide evidence that any observed differences are statistically significant, rather than due to random variation.

For example, the null hypothesis may state that the GPA of students at a given high school is not better than the state average. The corresponding alternative hypothesis may state that the GPA of students at a given high school is better than the state average, and a hypothesis test would then be conducted to determine whether there is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis.

Mathematically, the null hypothesis is denoted as H 0 , and is stated as

H 0 : μ = μ 0

where μ 0 is the assumed or hypothesized population mean, and μ is the mean of the population from which samples are drawn. Since the null hypothesis is a statement that there is no difference between these population parameters,

μ - μ 0 = 0

The alternative hypothesis generally takes one of three forms:

H a : μ < μ 0

H a : μ > μ 0

H a : μ ≠ μ 0

H 0 can also be stated as an inequality that includes equality:

H 0 : μ ≥ μ 0

The corresponding alternative hypothesis is stated as:

H a : μ < μ 0

Statistical hypothesis testing

A statistical hypothesis test adheres to the following general procedure:

  • State the null and alternative hypotheses.
  • Select a significance level, α (the probability of rejecting the null hypothesis when the null hypothesis is true), and the appropriate test statistic.
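  • Determine the critical region for the chosen test (for example, a Z-test or a chi-squared test).
  • Compute the observed value of the test statistic from the sample data.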
  • Reject the null hypothesis in favor of the alternative hypothesis if the observed value lies within the critical region. Otherwise, do not reject the null hypothesis.

Alternatively, instead of using critical regions, it is possible to calculate the p-value and compare it to the chosen significance level:

  • If the p-value is less than or equal to the significance level, reject the null hypothesis in favor of the alternative hypothesis.
  • If the p-value is greater than the significance level, do not reject the null hypothesis.

Note that the aim of this type of hypothesis test is to determine whether there is evidence to reject the null hypothesis in favor of the alternative hypothesis at a given significance level. This is not the same as proving or accepting an alternative hypothesis, since there may be evidence for the alternative hypothesis at one significance level, but not another. Also, if there is insufficient evidence for the alternative hypothesis, we fail to reject the null hypothesis, rather than accepting it; it is not possible to accept the null hypothesis.
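The critical-region approach and the p-value approach lead to the same decision. The sketch below checks this for a two-tailed Z-test; it assumes a test statistic that is standard normal under the null hypothesis, and the function name `two_tailed_z_decisions` is hypothetical.

```python
from statistics import NormalDist


def two_tailed_z_decisions(z: float, alpha: float = 0.05) -> tuple[bool, bool]:
    """Return (reject by critical region, reject by p-value) for a two-tailed Z-test."""
    std_normal = NormalDist()
    # Critical value that leaves alpha / 2 in each tail (about 1.96 for alpha = 0.05).
    critical = std_normal.inv_cdf(1 - alpha / 2)
    in_critical_region = abs(z) >= critical
    # Two-tailed p-value: probability of a statistic at least this extreme under H0.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return in_critical_region, p_value <= alpha


for z in (-3.0, -1.0, 0.5, 2.5):
    print(z, two_tailed_z_decisions(z))
```

Both elements of each returned pair agree for these values, which is why either approach may be used in practice.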

The national average SAT score, calculated for all juniors, was 1150 with a standard deviation of 75. A sample of 35 juniors from a given high school had an average score of 1250. Assuming a significance level of 0.05, use a Z-test to determine whether the difference between the average score of the class of 35 and the national average is statistically significant.

1. State the null and alternative hypotheses:

H 0 : μ = 1150

H a : μ ≠ 1150

2. The selected significance level is 0.05, and test scores follow a normal distribution, so it is appropriate to calculate the Z-score of the test statistic and conduct a Z-test.

3. Since we want to determine if any difference exists, a two-tailed test is appropriate, which means that the 0.05 critical region is broken up into two critical regions comprising an area of 0.025 each; the critical regions for a two-tailed Z-test given a 0.05 significance level are:

Z < -1.96 and Z > 1.96

4. Calculate the Z-score of the observed value. Because the observed value is the mean of a sample of n = 35, its standard error is σ/√n:

Z = (x̄ - μ 0 ) / (σ / √n) = (1250 - 1150) / (75 / √35) ≈ 100 / 12.68 ≈ 7.89

5. Since the Z-score of the observed value lies within the critical region (7.89 > 1.96), we reject the null hypothesis in favor of the alternative hypothesis.

Rejecting the null hypothesis suggests that there is a statistically significant difference between the average score of the class of 35 and the national average at a significance level of 0.05.

A significance level α of 0.05 means that there is a 5% chance of rejecting the null hypothesis when the null hypothesis is true. When this occurs, the error is referred to as a type I error, or a false positive. In cases where the opposite occurs, and we fail to reject the null hypothesis when it is false, it is referred to as a type II error, or a false negative.
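The meaning of α as a type I error rate can be checked with a small simulation. The following is a sketch under stated assumptions: samples are drawn from a standard normal population in which the null hypothesis (μ = 0, σ = 1 known) really is true, and a two-tailed Z-test is applied to each sample. The function name, sample size, number of trials, and seed are all arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(alpha: float = 0.05, n: int = 30,
                      trials: int = 5000, seed: int = 0) -> float:
    """Fraction of simulated samples that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: the population mean is 0 and sigma is 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) / (1.0 / sqrt(n))
        if abs(z) >= critical:
            rejections += 1  # a false positive (type I error)
    return rejections / trials


print(type_i_error_rate())
```

With these settings the proportion of false positives should come out close to the chosen α, although the exact figure depends on the random seed.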

Null Hypothesis Examples

ThoughtCo / Hilary Allison


In statistical analysis, the null hypothesis assumes there is no meaningful relationship between two variables. Testing the null hypothesis can tell you whether your results are due to the effect of manipulating an independent variable or due to chance. It's often used in conjunction with an alternative hypothesis, which assumes there is, in fact, a relationship between two variables.

The null hypothesis is among the easiest hypotheses to test using statistical analysis, making it perhaps the most valuable hypothesis for the scientific method. By evaluating a null hypothesis in addition to another hypothesis, researchers can support their conclusions with a higher level of confidence. Below are examples of how you might formulate a null hypothesis to fit certain questions.

What Is the Null Hypothesis?

The null hypothesis states there is no relationship between the measured phenomenon (the dependent variable) and the independent variable, which is the variable an experimenter typically controls or changes. You do not need to believe that the null hypothesis is true to test it. On the contrary, you will likely suspect there is a relationship between a set of variables. One way to support this is to reject the null hypothesis. Rejecting a hypothesis does not mean an experiment was "bad" or that it didn't produce results. In fact, it is often one of the first steps toward further inquiry.

To distinguish it from other hypotheses, the null hypothesis is written as H 0 (which is read as “H-nought,” "H-null," or "H-zero"). A significance test is used to determine how likely the observed results would be if the null hypothesis were true. A confidence level of 95% or 99% (a significance level of 0.05 or 0.01) is common. Keep in mind that even if the confidence level is high, there is still a small chance the conclusion is wrong, perhaps because the experimenter did not account for a critical factor or because of chance. This is one reason why it's important to repeat experiments.

Examples of the Null Hypothesis

To write a null hypothesis, first start by asking a question. Rephrase that question in a form that assumes no relationship between the variables. In other words, assume a treatment has no effect. Write your hypothesis in a way that reflects this.

Other Types of Hypotheses

In addition to the null hypothesis, the alternative hypothesis is also a staple in traditional significance tests . It's essentially the opposite of the null hypothesis because it assumes the claim in question is true. For the first item in the table above, for example, an alternative hypothesis might be "Age does have an effect on mathematical ability."

Key Takeaways

  • In hypothesis testing, the null hypothesis assumes no relationship between two variables, providing a baseline for statistical analysis.
  • Rejecting the null hypothesis suggests there is evidence of a relationship between variables.
  • By formulating a null hypothesis, researchers can systematically test assumptions and draw more reliable conclusions from their experiments.


Statistics LibreTexts

8.1.1: Null and Alternative Hypotheses


The actual test begins by considering two hypotheses . They are called the null hypothesis and the alternative hypothesis . These hypotheses contain opposing viewpoints.

\(H_0\): The null hypothesis: It is a statement of no difference between the variables; they are not related. This can often be considered the status quo, and as a result, rejecting the null typically calls for some action.

\(H_a\): The alternative hypothesis: It is a claim about the population that is contradictory to \(H_0\) and what we conclude when we reject \(H_0\). This is usually what the researcher is trying to prove.

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision. They are "reject \(H_0\)" if the sample information favors the alternative hypothesis or "do not reject \(H_0\)" or "decline to reject \(H_0\)" if the sample information is insufficient to reject the null hypothesis.

\(H_{0}\) always has a symbol with an equal in it. \(H_{a}\) never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers (including one of the co-authors in research work) use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.

Example \(\PageIndex{1}\)

  • \(H_{0}\): No more than 30% of the registered voters in Santa Clara County voted in the primary election. \(p \leq 0.30\)
  • \(H_{a}\): More than 30% of the registered voters in Santa Clara County voted in the primary election. \(p > 0.30\)

Exercise \(\PageIndex{1}\)

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25%. State the null and alternative hypotheses.

Answer

  • \(H_{0}\): The drug reduces cholesterol by 25%. \(p = 0.25\)
  • \(H_{a}\): The drug does not reduce cholesterol by 25%. \(p \neq 0.25\)

Example \(\PageIndex{2}\)

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are:

  • \(H_{0}: \mu = 2.0\)
  • \(H_{a}: \mu \neq 2.0\)

Exercise \(\PageIndex{2}\)

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol \((=, \neq, \geq, <, \leq, >)\) for the null and alternative hypotheses.

  • \(H_{0}: \mu \_ 66\)
  • \(H_{a}: \mu \_ 66\)

Answer

  • \(H_{0}: \mu = 66\)
  • \(H_{a}: \mu \neq 66\)

Example \(\PageIndex{3}\)

We want to test if college students take less than five years to graduate from college, on the average. The null and alternative hypotheses are:

  • \(H_{0}: \mu \geq 5\)
  • \(H_{a}: \mu < 5\)

Exercise \(\PageIndex{3}\)

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • \(H_{0}: \mu \_ 45\)
  • \(H_{a}: \mu \_ 45\)

Answer

  • \(H_{0}: \mu \geq 45\)
  • \(H_{a}: \mu < 45\)

Example \(\PageIndex{4}\)

In an issue of U. S. News and World Report , an article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third pass. The same article stated that 6.6% of U.S. students take advanced placement exams and 4.4% pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6%. State the null and alternative hypotheses.

  • \(H_{0}: p \leq 0.066\)
  • \(H_{a}: p > 0.066\)

Exercise \(\PageIndex{4}\)

On a state driver’s test, about 40% pass the test on the first try. We want to test if more than 40% pass on the first try. Fill in the correct symbol (\(=, \neq, \geq, <, \leq, >\)) for the null and alternative hypotheses.

  • \(H_{0}: p \_ 0.40\)
  • \(H_{a}: p \_ 0.40\)

Answer

  • \(H_{0}: p = 0.40\)
  • \(H_{a}: p > 0.40\)
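Once hypotheses such as \(H_{0}: p = 0.40\) and \(H_{a}: p > 0.40\) are stated, they can be tested against sample data. The sketch below shows one common way to do this, a right-tailed one-proportion z-test using the normal approximation. The sample figures (90 passes out of 200 first attempts) are invented purely for illustration and do not come from the exercise, and the function name `one_proportion_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes: int, n: int, p0: float) -> tuple[float, float]:
    """Right-tailed z-test of H0: p = p0 against Ha: p > p0 (normal approximation)."""
    p_hat = successes / n
    # The standard error is computed under H0, so it uses p0 rather than p_hat.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)  # area in the upper tail only
    return z, p_value


# Hypothetical data: 90 of 200 drivers pass on the first try.
z, p_value = one_proportion_z_test(successes=90, n=200, p0=0.40)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

Because the alternative hypothesis is directional (\(>\)), only the upper tail is used for the p-value; a \(\neq\) alternative would use both tails.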

COLLABORATIVE EXERCISE

Bring to class a newspaper, some news magazines, and some Internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.

In a hypothesis test, sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we:

  • Evaluate the null hypothesis, typically denoted with \(H_{0}\). The null is not rejected unless the hypothesis test shows otherwise. The null statement must always contain some form of equality \((=, \leq, \text{ or } \geq)\).
  • Always write the alternative hypothesis, typically denoted with \(H_{a}\) or \(H_{1}\), using less than, greater than, or not equals symbols, i.e., \((\neq, >, \text{ or } <)\).
  • If we reject the null hypothesis, then we can assume there is enough evidence to support the alternative hypothesis.
  • Never state that a claim is proven true or false. Keep in mind the underlying fact that hypothesis testing is based on probability laws; therefore, we can talk only in terms of non-absolute certainties.

Formula Review

\(H_{0}\) and \(H_{a}\) are contradictory.

  • If \(\alpha \leq p\)-value, then do not reject \(H_{0}\).
  • If \(\alpha > p\)-value, then reject \(H_{0}\).
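The two rules above can be written as a small function. This is only a sketch: the function name `decide` and the default of 0.05 for α are assumptions made for illustration, not part of the text.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 only when alpha > p-value.

    `decide` and the default alpha of 0.05 are illustrative choices.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    # alpha > p-value: reject H0; alpha <= p-value: do not reject H0.
    return "reject H0" if alpha > p_value else "do not reject H0"


print(decide(0.03))              # alpha = 0.05 is greater than 0.03
print(decide(0.05))              # alpha equals the p-value, so H0 is not rejected
print(decide(0.20, alpha=0.10))
```

Note that the boundary case, where α equals the p-value, falls under "do not reject" because the first rule uses ≤.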

\(\alpha\) is preconceived. Its value is set before the hypothesis test starts. The \(p\)-value is calculated from the data.

References

Data from the National Institute of Mental Health. Available online at http://www.nimh.nih.gov/publicat/depression.cfm .

Null hypothesis

Null hypothesis n., plural: null hypotheses [nʌl haɪˈpɒθɪsɪs] Definition: a hypothesis that is valid or presumed true until invalidated by a statistical test

Null Hypothesis Definition

Null hypothesis is defined as “the commonly accepted fact (such as the sky is blue) and researchers aim to reject or nullify this fact”.

More formally, we can define a null hypothesis as “a statistical theory suggesting that no statistical relationship exists between given observed variables”.

In biology, the null hypothesis is used to nullify or reject a common belief. The researcher carries out research aimed at rejecting the commonly accepted belief.

What Is a Null Hypothesis?

A hypothesis is defined as a theory or an assumption that is based on inadequate evidence. It requires more experiments and testing for confirmation. With further experiments and testing, a hypothesis may turn out to be either false or true (Blackwelder, 1982).

For example, Susie assumes that mineral water helps plants grow and nourishes them better than distilled water does. To test this hypothesis, she runs an experiment for almost a month, watering some plants with mineral water and some with distilled water.

When a hypothesis states that there is no statistically significant relationship between two variables, it is said to be a null hypothesis. The investigator is trying to disprove such a hypothesis. In the above example of plants, the null hypothesis is:

There are no statistical relationships among the forms of water that are given to plants for growth and nourishment.

Usually, an investigator tries to prove the null hypothesis wrong and tries to explain a relation and association between the two variables.

The opposite of the null hypothesis is known as the alternate hypothesis. In the example of plants, the alternate hypothesis is:

There are statistical relationships among the forms of water that are given to plants for growth and nourishment.

The example below shows the difference between null vs alternative hypotheses:

  • Alternate Hypothesis: The world is round.
  • Null Hypothesis: The world is not round.

Copernicus and many other scientists tried to prove the null hypothesis wrong. Through their experiments and testing, they made people believe that the alternate hypothesis was correct. If they had not shown experimentally that the null hypothesis was wrong, people would not have believed them and would never have considered the alternative hypothesis true.

The alternative and null hypothesis for Susie’s assumption is:

  • Null Hypothesis: If one plant is watered with distilled water and the other with mineral water, then there is no difference in the growth and nourishment of these two plants.
  • Alternative Hypothesis:  If one plant is watered with distilled water and the other with mineral water, then the plant with mineral water shows better growth and nourishment.

The null hypothesis suggests that there is no significant or statistical relationship. The relation can either be in a single set of variables or among two sets of variables.

Most people consider the null hypothesis true and correct. Scientists perform experiments and do a variety of research so that they can prove the null hypothesis wrong, or nullify it. For this purpose, they design an alternate hypothesis that they think is correct. The null hypothesis symbol is H 0 (read as H null or H zero).

Why is it named the “Null”?

The name null is given to this hypothesis to make clear that scientists are working to prove it false, i.e., to nullify it. The name sometimes confuses readers, who may think it means the statement is empty or says nothing, but that is not the case. It would be more appropriate to call it a nullifiable hypothesis instead of the null hypothesis.

Why do we need to assess it? Why not just verify an alternate one?

In science, the scientific method is used. It involves a series of steps that scientists perform so that a hypothesis can be shown to be false or true. Scientists do this to check whether there is any limitation or inadequacy in the new hypothesis. Experiments are designed with both the alternative and the null hypothesis in mind, which makes the research safer. Leaving the null hypothesis out of a study has a negative impact on the research: it suggests that you are not taking your research seriously and just want to impose your results as correct and true.

Development of the Null

In statistics, it is first necessary to formulate the alternate and null hypotheses from the given problem. Splitting the problem into small steps makes the path to the solution easier and less challenging.

How to write a null hypothesis?

Writing a null hypothesis consists of two steps:

  • First, start by asking a question.
  • Second, restate the question in a form that assumes there is no relationship among the variables.

In other words, assume that the treatment does not have any effect.

The usual recovery duration after knee surgery is considered almost 8 weeks.

A researcher thinks that the recovery period may get elongated if patients go to a physiotherapist for rehabilitation twice per week, instead of thrice per week, i.e. recovery duration reduces if the patient goes three times for rehabilitation instead of two times.

Step 1: Identify the hypothesis in the problem. The hypothesis can be a single phrase or a full statement. In the above example, the hypothesis is:

“The expected recovery period in knee rehabilitation is more than 8 weeks”

Step 2: Convert the hypothesis into a mathematical statement. The average is represented by μ, so the formula for this hypothesis is:

H 1 : μ > 8

In the above equation, H 1 denotes the hypothesis, μ denotes the average, and > indicates that the average is greater than eight.

Step 3: State what would be true if the hypothesis is not correct, i.e., the rehabilitation period does not exceed 8 weeks.

The only remaining possibility is that the recovery takes less than or equal to 8 weeks.

H 0 : μ ≤ 8

In the above equation, the null hypothesis is equivalent to H 0 , the average is denoted by μ and ≤ represents that the average is less than or equal to eight.

What will happen if the scientist does not have any knowledge about the outcome?

Problem: An investigator investigates the post-operative impact and influence of radical exercise on patients who have operative procedures of the knee. The chances are either the exercise will improve the recovery or will make it worse. The usual time for recovery is 8 weeks.

Step 1: Make a null hypothesis i.e. the exercise does not show any effect and the recovery time remains almost 8 weeks.

H 0 : μ = 8

In the above equation, the null hypothesis is equivalent to H 0 , the average is denoted by μ, and the equal sign (=) shows that the average is equal to eight.

Step 2: Make the alternate hypothesis, which is the reverse of the null hypothesis. Specifically, what will happen if the treatment (exercise) makes an impact?

H 1 : μ ≠ 8

In the above equation, H 1 denotes the alternate hypothesis, μ denotes the average, and the not-equal sign (≠) indicates that the average is not equal to eight.

Significance Tests

A significance test is performed to judge whether the data are reasonably explained by the null hypothesis. The null hypothesis itself contains no data; it is a statement about a numerical characteristic of the population. That characteristic can take different forms, such as a mean, a proportion, a difference of means or proportions, or an odds ratio.

The P-value is the main statistical result of a significance test of the null hypothesis. It is defined as follows, with the symbols explained below:

  • P-value = Pr(data or data more extreme | H 0 true)
  • | = “given”
  • Pr = probability
  • H 0 = the null hypothesis

The first stage of Null Hypothesis Significance Testing (NHST) is to form an alternate and null hypothesis. By this, the research question can be briefly explained.

  • Null Hypothesis = no effect of treatment, no difference, no association
  • Alternative Hypothesis = effective treatment, difference, association

When to reject the null hypothesis?

Researchers reject the null hypothesis when the data provide sufficient evidence against it. Until then, the null hypothesis is treated as true. On the other hand, researchers try to strengthen the alternate hypothesis. As an illustration, the binomial test is performed on a sample in the following series of steps (Frick, 1995).

Step 1: Read the research question carefully and formulate a null hypothesis. Verify that the sample supports a binomial proportion. Then find the value of the binomial parameter under the assumption of no difference.

Show the null hypothesis as:

H 0 : p = the value of p if H 0 is true

Calculate the sample proportion to see how far the observed data depart from the value stated in the null hypothesis.

Step 2: Identify the distribution of the test statistic under the null hypothesis. The binomial test is based on exact probabilities, so write down the probability mass function (pmf) that applies when the null hypothesis is true.

When H 0 is true, X ~ b(n, p)

n = size of the sample

p = assumed value if H 0 is true

Step 3: Find the P-value, the probability of observing data at least as extreme as the data actually observed, given that H 0 is true.

If the alternative hypothesis claims an increase: P-value = Pr(X ≥ x)

If the alternative hypothesis claims a decrease: P-value = Pr(X ≤ x)

x = observed number of successes

Step 4: Report the findings descriptively, including:

  • Sample proportion
  • The direction of difference (either increases or decreases)
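The four steps above can be sketched in a few lines of Python using only the standard library. The sample below (20 students, 5 of them left-handed, tested against p = 0.10) is hypothetical, and the function names `binom_pmf` and `binomial_p_value` are assumptions chosen for illustration.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Pr(X = k) when X ~ b(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_p_value(x: int, n: int, p0: float, direction: str) -> float:
    """One-sided exact binomial P-value under H0: p = p0.

    direction="greater" gives Pr(X >= x); direction="less" gives Pr(X <= x).
    """
    if direction == "greater":
        ks = range(x, n + 1)
    elif direction == "less":
        ks = range(0, x + 1)
    else:
        raise ValueError("direction must be 'greater' or 'less'")
    return sum(binom_pmf(k, n, p0) for k in ks)


# Hypothetical data: n = 20 students, x = 5 left-handed, H0: p = 0.10.
n, x, p0 = 20, 5, 0.10
sample_proportion = x / n
p_value = binomial_p_value(x, n, p0, "greater")

print(f"sample proportion = {sample_proportion:.2f}")
print(f"direction of difference = {'increase' if sample_proportion > p0 else 'decrease'}")
print(f"one-sided P-value = {p_value:.4f}")
```

Summing the pmf directly keeps the test exact, which matches the description in Step 2; for large n a statistics library would usually be more convenient.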

Perceived Problems With the Null Hypothesis

Variable or model selection and limited information are, in some cases, the main issues affecting the testing of the null hypothesis. Statistical tests of the null hypothesis are not very strong, and there is arbitrariness in what counts as significant (Gill, 1999). A central criticism of null hypotheses is that, strictly speaking, they are all false.

There is another problem with the α-level. This problem is well known but often ignored. The value of the α-level has no theoretical basis, so the conventional values, most commonly 0.1, 0.05, or 0.01, are arbitrary. Using a fixed value of α splits results into two categories (significant and non-significant). This arbitrary rejection or non-rejection is also a practical problem when what matters is the strength of the evidence on a scientific question.

The P-value is of foremost importance in testing the null hypothesis, but as an inferential tool it is difficult to interpret. The P-value is the probability of getting a test statistic at least as extreme as the observed one, assuming the null hypothesis is true.

The main point about the definition is: Observed results are not based on a-value

Moreover, the evidence against the null hypothesis was overstated due to unobserved results. A-value has importance more than just being a statement. It is a precise statement about the evidence from the observed results or data. Similarly, researchers found that P-values are objectionable. They do not prefer null hypotheses in testing. It is also clear that the P-value is strictly dependent on the null hypothesis. It is computer-based statistics. In some precise experiments, the null hypothesis statistics and actual sampling distribution are closely related but this does not become possible in observational studies.

Some researchers have pointed out that the P-value depends on the sample size. Even if the true difference is small, the null hypothesis may be rejected when the sample is large. This shows the difference between biological importance and statistical significance (Killeen, 2005).
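The dependence on sample size can be illustrated with a rough sketch. It assumes a two-sided z-test with a known standard deviation and, as a simplification, treats the observed mean difference as fixed at 0.1 standard deviations for every sample size; the function name and all numbers are hypothetical.

```python
from math import erfc, sqrt


def two_sided_z_p_value(mean_diff: float, sigma: float, n: int) -> float:
    """Two-sided P-value of a z-test for an observed mean difference.

    Assumes a known standard deviation `sigma` and a sample of size `n`.
    """
    z = mean_diff / (sigma / sqrt(n))
    # For a standard normal Z, Pr(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


# The same small difference (0.1 standard deviations) at growing sample sizes.
for n in (25, 100, 400, 1600):
    p = two_sided_z_p_value(mean_diff=0.1, sigma=1.0, n=n)
    print(f"n = {n:5d}  P-value = {p:.5f}")
```

The difference itself never changes in this sketch; only the sample size does, yet the P-value moves from clearly non-significant to clearly significant at conventional α-levels.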

Another issue is the fixed α-level, e.g., 0.1. With a fixed α-level, the null hypothesis for a large sample may be accepted or rejected. Even if the sample size is infinite and the null hypothesis is true, there is still a chance of a Type I error. For this reason, the approach is not considered consistent and reliable. A further problem is that a significance test alone gives no exact information about the size of the estimated effect or its precision. The only solution is to state the size of the effect and its precision.

Null Hypothesis Examples

Here are some examples:

Example 1: Hypotheses with One Sample of One Categorical Variable

About 10% of the human population is left-handed. Suppose a researcher at Penn State claims that students at the College of Arts and Architecture are more likely to be left-handed than people in the general population. In this case there is only one sample, and the population proportion for that sample is compared with the known population value.

  • Research Question: Are artists more likely to be left-handed than people in the general population?
  • Response Variable: Classification of each student as either left-handed or right-handed.
  • Null Hypothesis: Students at the College of Arts and Architecture are no more likely to be left-handed than people in the general population (the proportion of left-handed students in the College of Arts and Architecture population is 10%, or p = 0.10).

Example 2: Hypotheses with One Sample of One Measurement Variable

A generic brand of the antihistamine diphenhydramine is sold as capsules with a 50 mg dose. The manufacturer is concerned that the machine has gone out of calibration and is no longer producing capsules with the appropriate dose.

  • Research Question: Do the data suggest that the population mean dosage differs from 50 mg?
  • Response Variable: Dosage of the active ingredient, as determined by a chemical assay.
  • Null Hypothesis: On average, the dosage of capsules of this brand is 50 mg (population mean dosage = 50 mg).

Example 3: Hypotheses with Two Samples of One Categorical Variable

Many people choose vegetarian meals on a daily basis. The researcher believes that females prefer vegetarian meals more than males do.

  • Research Question: Do the data suggest that females (women) prefer vegetarian meals more than males (men) on a regular basis?
  • Response Variable: Classification of each person as vegetarian or non-vegetarian.
  • Grouping Variable: Gender
  • Null Hypothesis: Gender is not related to a preference for vegetarian meals (population percent of women who eat vegetarian meals regularly = population percent of men who eat vegetarian meals regularly, or p women = p men).

Example 4: Hypotheses with Two Samples of One Measurement Variable

Nowadays, obesity and being overweight are among the major health issues. Research is performed to test whether a low-carbohydrate diet leads to faster weight loss than a low-fat diet.

  • Research Question: Do the data suggest that, on average, a low-carbohydrate diet leads to faster weight loss than a low-fat diet?
  • Response Variable: Weight loss (pounds)
  • Explanatory Variable: Type of diet, either low-carbohydrate or low-fat
  • Null Hypothesis: There is no significant difference between the mean weight loss of people on a low-carbohydrate diet and that of people on a low-fat diet (population mean weight loss on a low-carbohydrate diet = population mean weight loss on a low-fat diet).

Example 5: Hypotheses about the relationship between Two Categorical Variables

A case-control study was performed. The study includes nonsmokers: stroke patients and controls. The subjects are of the same occupation and age, and each was asked whether someone in their home or close surroundings smokes.

  • Research Question: Does second-hand smoke increase the chances of stroke?
  • Variables: There are two categorical variables: group (controls or stroke patients) and whether a smoker lives in the same house. The research hypothesis is that the chances of having a stroke increase if a person lives with a smoker.
  • Null Hypothesis: There is no significant relationship between passive smoking and stroke (the odds ratio between stroke and passive smoking is equal to 1).

Example 6: Hypotheses about the relationship between Two Measurement Variables

A financial expert believes that there is a positive relationship between the variation in stock price and the quantity of stock bought by non-management employees.

  • Response Variable: Regular change in stock price
  • Explanatory Variable: Stock bought by non-management employees
  • Null Hypothesis: The association between the regular stock price change ($) and the daily stock purchases by non-management employees ($) = 0.

Example 7: Hypotheses about comparing the relationship between Two Measurement Variables in Two Samples

  • Research Question: Is the relationship between the bill paid in a restaurant and the tip given to the waiter linear? Is this relationship different for dining and family restaurants?
  • Explanatory Variable: Total bill amount
  • Response Variable: Amount of the tip
  • Null Hypothesis: The relationship between the total bill amount and the tip is the same at family restaurants and dining restaurants.

  • Blackwelder, W. C. (1982). “Proving the null hypothesis” in clinical trials. Controlled Clinical Trials, 3(4), 345–353.
  • Frick, R. W. (1995). Accepting the null hypothesis. Memory & Cognition, 23(1), 132–138.
  • Gill, J. (1999). The insignificance of null hypothesis significance testing. Political Research Quarterly, 52(3), 647–674.
  • Killeen, P. R. (2005). An alternative to null-hypothesis significance tests. Psychological Science, 16(5), 345–353.

©BiologyOnline.com. Content provided and moderated by Biology Online Editors.

Last updated on June 16th, 2022

Statology

Statistics Made Easy

Understanding the Null Hypothesis for Linear Regression

Linear regression is a technique we can use to understand the relationship between one or more predictor variables and a response variable.

If we only have one predictor variable and one response variable, we can use simple linear regression, which uses the following formula to estimate the relationship between the variables:

ŷ = β 0 + β 1 x

  • ŷ: The estimated response value.
  • β 0 : The average value of y when x is zero.
  • β 1 : The average change in y associated with a one unit increase in x.
  • x: The value of the predictor variable.

Simple linear regression uses the following null and alternative hypotheses:

  • H 0 : β 1 = 0
  • H A : β 1 ≠ 0

The null hypothesis states that the coefficient β 1 is equal to zero. In other words, there is no statistically significant relationship between the predictor variable, x, and the response variable, y.

The alternative hypothesis states that β 1 is not equal to zero. In other words, there is a statistically significant relationship between x and y.

If we have multiple predictor variables and one response variable, we can use multiple linear regression, which uses the following formula to estimate the relationship between the variables:

ŷ = β 0 + β 1 x 1 + β 2 x 2 + … + β k x k

  • β 0 : The average value of y when all predictor variables are equal to zero.
  • β i : The average change in y associated with a one unit increase in x i .
  • x i : The value of the predictor variable x i .

Multiple linear regression uses the following null and alternative hypotheses:

  • H 0 : β 1 = β 2 = … = β k = 0
  • H A : at least one β i ≠ 0

The null hypothesis states that all coefficients in the model are equal to zero. In other words, none of the predictor variables have a statistically significant relationship with the response variable, y.

The alternative hypothesis states that not every coefficient is simultaneously equal to zero.

The following examples show how to decide to reject or fail to reject the null hypothesis in both simple linear regression and multiple linear regression models.

Example 1: Simple Linear Regression

Suppose a professor would like to use the number of hours studied to predict the exam score that students will receive in his class. He collects data for 20 students and fits a simple linear regression model.

The output of the regression model is shown below:

[Screenshot: Output of simple linear regression in Excel]

The fitted simple linear regression model is:

Exam Score = 67.1617 + 5.2503*(hours studied)

To determine if there is a statistically significant relationship between hours studied and exam score, we need to analyze the overall F value of the model and the corresponding p-value:

  • Overall F-Value:  47.9952
  • P-value:  0.000

Since this p-value is less than .05, we can reject the null hypothesis. In other words, there is a statistically significant relationship between hours studied and exam score received.
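As a rough sketch of where an overall F value comes from, the following pure-Python code fits a simple linear regression by least squares and computes the F statistic. The data are hypothetical (they are not the professor's 20 students), and the function name `simple_regression_f` is an assumption made for illustration.

```python
def simple_regression_f(x, y):
    """Fit y = b0 + b1*x by least squares and return (b0, b1, F, (df1, df2))."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length, at least 3")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # estimated slope
    b0 = y_bar - b1 * x_bar        # estimated intercept
    # Regression (explained) and residual (unexplained) sums of squares.
    ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # F = mean square regression / mean square error, with (1, n - 2) df.
    f_stat = (ssr / 1) / (sse / (n - 2))
    return b0, b1, f_stat, (1, n - 2)


# Hypothetical data: hours studied and exam scores for 8 students.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [68, 74, 75, 83, 84, 92, 95, 99]

b0, b1, f_stat, df = simple_regression_f(hours, scores)
print(f"Exam Score = {b0:.4f} + {b1:.4f}*(hours studied)")
print(f"Overall F-value = {f_stat:.4f} with df = {df}")
```

The p-value is the upper-tail probability of this F statistic under an F distribution with the returned degrees of freedom; spreadsheet software or a statistics library would normally supply it, so it is not computed here.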

Example 2: Multiple Linear Regression

Suppose a professor would like to use the number of hours studied and the number of prep exams taken to predict the exam score that students will receive in his class. He collects data for 20 students and fits a multiple linear regression model.

[Screenshot: Multiple linear regression output in Excel]

The fitted multiple linear regression model is:

Exam Score = 67.67 + 5.56*(hours studied) – 0.60*(prep exams taken)

To determine if there is a jointly statistically significant relationship between the two predictor variables and the response variable, we need to analyze the overall F value of the model and the corresponding p-value:

  • Overall F-Value:  23.46
  • P-value:  0.00

Since this p-value is less than .05, we can reject the null hypothesis. In other words, hours studied and prep exams taken have a jointly statistically significant relationship with exam score.

Note: Although the p-value for prep exams taken (p = 0.52) is not significant, prep exams combined with hours studied has a significant relationship with exam score.

Additional Resources

  • Understanding the F-Test of Overall Significance in Regression
  • How to Read and Interpret a Regression Table
  • How to Report Regression Results
  • How to Perform Simple Linear Regression in Excel
  • How to Perform Multiple Linear Regression in Excel

Hey there. My name is Zach Bobbitt. I have a Masters of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.

PLOS ONE

Why we habitually engage in null-hypothesis significance testing: A qualitative study

Jonah Stunt

1 Department of Health Sciences, Section of Methodology and Applied Statistics, Vrije Universiteit, Amsterdam, The Netherlands

2 Department of Radiation Oncology, Erasmus Medical Center, Rotterdam, The Netherlands

Leonie van Grootel

3 Rathenau Institute, The Hague, The Netherlands

4 Department of Philosophy, Vrije Universiteit, Amsterdam, The Netherlands

5 Department of Epidemiology and Data Science, Amsterdam University Medical Centers, Amsterdam, The Netherlands

David Trafimow

6 Psychology Department, New Mexico State University, Las Cruces, New Mexico, United States of America

Trynke Hoekstra

Michiel de Boer

7 Department of General Practice and Elderly Care, University Medical Center Groningen, Groningen, The Netherlands

Associated Data

A full study protocol, including a detailed data analysis plan, was preregistered ( https://osf.io/4qg38/ ). At the start of this study, preregistration forms for qualitative studies had not yet been developed. Therefore, preregistration for this study is based on an outdated form. Presently, there is a preregistration form available for qualitative studies. Information about data collection, data management, data sharing and data storage is described in a Data Management Plan. Sensitive data is stored in Darkstor, an offline archive for storing sensitive information or data (e.g., information that involves privacy or copyright). As the recordings and transcripts of the interviews and focus groups contain privacy-sensitive data, these files are archived in Darkstor and can be accessed only on request by authorized individuals (i.e., the original researcher or a research coordinator)1. Non-sensitive data is stored in DANS ( https://doi.org/10.17026/dans-2at-nzfs ) (Data Archiving and Networked Services; the Netherlands institute for permanent access to digital research resources). 1. Data requests can be sent to ln.uv@mdr .

Null Hypothesis Significance Testing (NHST) is the most familiar statistical procedure for making inferences about population effects. Important problems associated with this method have been addressed and various alternatives that overcome these problems have been developed. Despite its many well-documented drawbacks, NHST remains the prevailing method for drawing conclusions from data. Reasons for this have been insufficiently investigated. Therefore, the aim of our study was to explore the perceived barriers and facilitators related to the use of NHST and alternative statistical procedures among relevant stakeholders in the scientific system.

Individual semi-structured interviews and focus groups were conducted with junior and senior researchers, lecturers in statistics, editors of scientific journals and program leaders of funding agencies. During the focus groups, important themes that emerged from the interviews were discussed. Data analysis was performed using the constant comparison method, allowing emerging (sub)themes to be fully explored. A theory substantiating the prevailing use of NHST was developed based on the main themes and subthemes we identified.

Twenty-nine interviews and six focus groups were conducted. Several interrelated facilitators and barriers associated with the use of NHST and alternative statistical procedures were identified. These factors were subsumed under three main themes: the scientific climate, scientific duty, and reactivity. As a result of the factors, most participants feel dependent in their actions upon others, have become reactive, and await action and initiatives from others. This may explain why NHST is still the standard and ubiquitously used by almost everyone involved.

Our findings demonstrate how perceived barriers to shift away from NHST set a high threshold for actual behavioral change and create a circle of interdependency between stakeholders. By taking small steps it should be possible to decrease the scientific community’s strong dependence on NHST and p-values.

Introduction

Empirical studies often start from the idea that there might be an association between a specific factor and a certain outcome within a population. This idea is referred to as the alternative hypothesis (H1). Its complement, the null hypothesis (H0), typically assumes no association or effect (although it is possible to test other effect sizes than no effect with the null hypothesis). At the stage of data-analysis, the probability of obtaining the observed, or a more extreme, association is calculated under the assumption of no effect in the population (H0) and a number of inferential assumptions [ 1 ]. The probability of obtaining the observed, or more extreme, data is known as ‘the p-value’. The p-value demonstrates the compatibility between the observed data and the expected data under the null hypothesis, where 0 is complete incompatibility and 1 is perfect compatibility [ 2 ]. When the p-value is smaller than a prespecified value (labelled as alpha, usually set at 5% (0.05)), results are generally declared to be statistically significant. At this point, researchers commonly reject the null hypothesis and accept the alternative hypothesis [ 2 ]. Assessing statistical significance by means of contrasting the data with the null hypothesis is called Null Hypothesis Significance Testing (NHST). NHST is the best known and most widely used statistical procedure for making inferences about population effects. The procedure has become the prevailing paradigm in empirical science [ 3 ], and reaching and being able to report statistically significant results has become the ultimate goal for many researchers.

Despite their widespread use, NHST and the p-value have been criticized since their inception. Numerous publications have addressed problems associated with NHST and p-values. Arguably the most important drawback is the fact that NHST is a form of indirect or inverse inference: researchers usually want to know if the null or alternative hypothesis can be accepted and use NHST to conclude either way. But with NHST, the probability of a finding, or more extreme findings, given the null hypothesis is calculated [ 4 ]. Ergo, NHST does not tell us what we want to know. In fact, p-values were never meant to serve as a basis to draw conclusions, but as a continuous measure of incompatibility between empirical findings and a statistical model [ 2 ]. Moreover, the procedure promotes a dichotomous way of thinking, by using the outcome of a significance test as a dichotomous indicator for an effect (p<0.05: effect, p>0.05: no effect). Reducing empirical findings to two categories also results in a great loss of information. Further, a significant outcome is often unjustly interpreted as relevant, but a p-value does not convey any information about the strength or importance of the association. Worse yet, the p-values on which NHST is based confound effect size and sample size. A trivial effect size may nevertheless result in statistical significance given a sufficiently large sample size. Conversely, an important effect size may fail to result in statistical significance if the sample size is too small. P-values do not validly index the size, relevance, or precision of an effect [ 5 ]. Furthermore, statistical models include not only null hypotheses, but additional assumptions, some of which are wrong, such as the ubiquitous assumption of random and independent sampling from a defined population [ 1 ]. Therefore, although p-values validly index the incompatibility of data with models, p-values do not validly index incompatibility of data with hypotheses that are embedded in wrong models. These are important drawbacks rendering NHST unsuitable as the default procedure for drawing conclusions from empirical data [ 2 , 3 , 5 – 13 ].
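The confounding of effect size and sample size can be illustrated with a short Python sketch. It assumes a one-sample z-test with a known population standard deviation, and the three scenarios and the helper name `z_test_p_value` are hypothetical choices made for illustration only.

```python
from math import erfc, sqrt


def z_test_p_value(effect_size, n):
    """Two-sided p-value for a one-sample z-test.

    effect_size is the standardized mean difference (mean / SD); the
    population SD is assumed known, which is a simplifying assumption.
    """
    z = effect_size * sqrt(n)
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))


# Hypothetical scenarios: (label, standardized effect, sample size)
scenarios = [
    ("trivial effect, small sample", 0.02, 100),
    ("trivial effect, huge sample", 0.02, 1_000_000),
    ("large effect, tiny sample", 0.80, 4),
]
for label, d, n in scenarios:
    print(f"{label}: d = {d}, n = {n}, p = {z_test_p_value(d, n):.3g}")
```

Under these assumptions the same trivial effect moves from clearly non-significant to overwhelmingly significant purely because the sample grows, while the large effect stays above 0.05 in a sample of four. The p-value alone therefore says little about the size of the effect.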

A number of alternatives have been developed that overcome these pitfalls, such as Bayesian inference methods [ 7 , 11 , 14 , 15 ], informative hypothesis testing [ 9 , 16 ] and a priori inferential statistics [ 4 , 17 ]. These alternatives build on the idea that research usually starts with a more informed research question than one merely assuming the null hypothesis of no effect. These methods overcome the problem of inverse inference, although the first two might still lead to dichotomous thinking with the use of thresholds. Despite the availability of alternatives, statistical behavior in the research community has hardly changed. Researchers have been slow to adopt alternative methods and NHST is still the prevailing paradigm for making inferences about population effects [ 3 ].

Until now, reasons for the continuous and ubiquitous use of NHST and the p-value have scarcely been investigated. One explanation is that NHST provides a very simple means for drawing conclusions from empirical data, usually based on the 5% cut-off. Secondly, most researchers are unaware of the pitfalls of NHST; it has been shown that NHST and the p-value are often misunderstood and misinterpreted [ 2 , 3 , 8 , 11 , 18 , 19 ]. Thirdly, NHST has a central role in most methods and statistics courses in higher education. Courses on alternative methods are increasingly being offered but are usually not mandatory. To our knowledge, there is a lack of in-depth empirical research aimed at elucidating why NHST nevertheless remains the dominant approach, or what actions can be taken to shift the sciences away from NHST. Therefore, the aim of our study was to explore the perceived barriers and facilitators, as well as behavioral intentions related to the use of NHST and alternative statistical procedures, among all relevant stakeholders in the scientific system.

Theoretical framework

In designing our study, we used two theories. Firstly, we used the ‘diffusion of innovation theory’ of Rogers [ 20 ]. This theory describes the dissemination of an innovation as a process consisting of four elements: 1) an innovation is 2) communicated through certain channels 3) over time 4) among the members of a social system [ 20 ]. In the current study, the innovation consists of the idea that we should stop with the default use of NHST and instead consider using alternative methods for drawing conclusions from empirical data. The science system forms the social structure in which the innovation should take place. The most important members, and potential adopters of the innovation, that we identified are researchers, lecturers, editors of scientific journals and representatives of funding agencies. Rogers describes phases in the adoption process, which coincide with characteristics of the (potential) adopters of the idea: 1) innovators, 2) early adopters, 3) early majority adopters, 4) late majority adopters and 5) laggards. Innovators are the first to adopt an innovation. There are few innovators but these few are very important for bringing in new ideas. Early adopters form the second group to adopt an innovation. This group includes opinion leaders and role models for other stakeholders. The largest group consists of the early and late majority who follow the early adopters, and then there is a smaller group of laggards who resist the innovation until they are certain the innovation will not fail. The process of innovation adoption by individuals is described as a normal distribution ( Fig 1 ). For these five groups, the adoption of a new idea is influenced by the following five characteristics of the innovative idea: 1) its relative advantage, 2) its compatibility with current experiences, 3) its complexity, 4) its flexibility, and 5) its visibility [ 20 ]. Members of all four stakeholder groups could play an important role in the diffusion of the innovation of replacing NHST by its alternatives.

The innovativeness dimension, measured by the time at which an individual from an adopter category adopts an innovation. Each category is one or more standard deviations removed from the average time of adoption [ 20 ].
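To make the normal-distribution description concrete, the sketch below computes the share of adopters that falls in each category when adoption time is treated as normally distributed. The category boundaries (whole standard deviations around the mean) follow the usual presentation of Rogers's model and are an assumption here, since the text does not list them explicitly.

```python
from statistics import NormalDist

# Time of adoption in standard-deviation units around the mean (z-scores).
# The category boundaries below follow Rogers's usual presentation and are
# an assumption here; the source text does not list them explicitly.
boundaries = {
    "innovators": (float("-inf"), -2.0),
    "early adopters": (-2.0, -1.0),
    "early majority": (-1.0, 0.0),
    "late majority": (0.0, 1.0),
    "laggards": (1.0, float("inf")),
}

standard_normal = NormalDist(mu=0.0, sigma=1.0)
shares = {
    category: standard_normal.cdf(upper) - standard_normal.cdf(lower)
    for category, (lower, upper) in boundaries.items()
}

for category, share in shares.items():
    print(f"{category}: {share:.1%}")
```

Under these assumed boundaries the shares come out at roughly 2%, 14%, 34%, 34% and 16%, which matches the description of a few innovators, a large early and late majority, and a smaller group of laggards.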

Another important theory for our study is the ‘theory of planned behavior’, that was developed in the 1960s [ 21 ]. This theory describes how human behavior in a certain context can be predicted and explained. The theory was updated in 2010, under the name ‘the reasoned action approach’ [ 22 ]. A central factor in this theory is the intention to perform a certain behavior, in this case, to change the default use of NHST. According to the theory, people’s intentions determine their behaviors. An intention indexes to what extent someone is motivated to perform the behavior. Intentions are determined by three independent determinants: the person’s attitudes toward the behavior—the degree to which a person sees the behavior as favorable or unfavorable, perceived subjective norms regarding the behavior—the perceived social pressure to perform the behavior or not, and perceptions of control regarding the behavior—the perceived ease or difficulty of performing the behavior. Underlying (i.e. responsible for) these three constructs are corresponding behavioral, normative, and control beliefs [ 21 , 22 ] (see Fig 2 ).

Both theories have served as a lens for both data collection and analysis. We used sensitizing concepts [ 23 ] within the framework of the grounded theory approach [ 24 ] from both theories as a starting point for this qualitative study, and more specifically, for the topic list for the interviews and focus groups, providing direction and guidance for the data collection and data analysis.

Many of the concepts of Rogers’ and Fishbein and Ajzen’s theories can be seen as facilitators and barriers for embracing and implementing innovation in the scientific system.

A qualitative study among stakeholders using semi-structured interviews and focus groups was performed. Data collection and analysis were guided by the principle of constant comparison traditional to the grounded theory approach we followed [ 24 ]. Grounded theory is a methodology that uses inductive reasoning and aims to construct a theory through the collection and analysis of data. Constant comparison is the iterative process whereby each part of the data that emerges from the data analysis is compared with other parts of the data to thoroughly explore and validate the data. Concepts that have been extracted from the data are tagged with codes that are grouped into categories. These categories constitute themes, which (may) become the basis for a new theory. Data collection and analysis were continued until no new information was gained and data saturation had likely occurred within the identified themes.

The target population consisted of stakeholders relevant to our topic: junior and senior researchers, lecturers in statistics, editors of scientific journals and program leaders of funding agencies (see Tables 1 and 2). We approached participants in the field of medical sciences, health and life sciences, and psychology. In line with the grounded theory approach, theoretical sampling was used to identify and recruit eligible participants. Theoretical sampling is a form of purposive sampling. This means that we aimed to purposefully select participants, based on their characteristics that fit the parameters of the research questions [ 25 ]. Recruitment took place by approaching persons in our professional networks and/or the networks of the approached persons.

*The numbers between brackets represent the number of participants who were also interviewed.

Data collection

We conducted individual semi-structured interviews followed by focus groups. The aim of the interviews was to gain insight into the views of participants on the use of NHST and alternative methods and to examine potential barriers and facilitators related to these methods. The aim of the focus groups was to validate and further explore interview findings and to develop a comprehensive understanding of participants’ views and beliefs.

For the semi-structured interviews, we used a topic list (see Appendix 1 in S1 Appendix ). Questions addressed participants’ knowledge and beliefs about the concept of NHST, their familiarity with NHST, perceived attractiveness and drawbacks of the use of NHST, knowledge of the current NHST debate, knowledge of and views on alternative procedures and their views on the future of NHST. The topic list was slightly adjusted based on the interviews with editors and representatives from funding agencies (compared to the topic list for interviews with researchers and lecturers). Questions particularly focused on research and education were replaced by questions focused on policy (see Appendix 1 in S1 Appendix ).

The interviews were conducted between October 2017 and June 2018 by two researchers (L.v.G. and J.S.), both trained in qualitative research methods. Interviews lasted about one hour (range 31–86 minutes) and were voice-recorded. One interview was conducted by telephone; all others were face to face and took place at a location convenient for the participants, in most cases the participants’ work location.

Focus groups

During the focus groups, important themes that emerged from the interviews were discussed and explored. These include perceptions on NHST and alternatives and essential conditions to shift away from the default use of NHST.

Five focus groups included representatives from the different stakeholder groups. One focus group was homogeneous, including solely lecturers. The focus groups consisted of ‘old’ as well as ‘new’ participants, that is, some of the participants of the focus groups were also in the interview sample. We also selected persons who were open to further contribution to the NHST debate and were willing to help think about (implementing) alternatives to NHST.

The focus groups were conducted between September and December 2018 by three researchers (L.v.G., J.S. and A.d.K.), all trained in qualitative research methods. The focus groups lasted about one-and-a-half hours (range 86–100 minutes).

Data analysis

All interviews and focus groups were transcribed verbatim. Atlas.ti 8.0 software was used for data management and analysis. All transcripts were read thoroughly several times to identify meaningful and relevant text fragments and analyzed by two researchers (J.S. and L.v.G.). Deductive predefined themes and theoretical concepts were used to guide the development of the topic list for the semi-structured interviews and focus groups, and were used as sensitizing concepts [ 23 ] in data collection and data analysis. Inductive themes were identified during the interview process and analysis of the data [ 26 ].

Transcripts were open-, axial- and selectively coded by two researchers (J.S. and L.v.G.). Open coding is the first step in the data-analysis, whereby phenomena found in the text are identified and named (coded). With axial coding, connections between codes are drawn. Selective coding is the process of selecting one central category and relating all other categories to that category, capturing the essence of the research. The constant comparison method [ 27 ] was applied allowing emerging (sub)themes to be fully explored. First, the two researchers independently developed a set of initial codes. Subsequently, findings were discussed until consensus was reached. Codes were then grouped into categories that were covered under subthemes, belonging to main themes. Finally, a theory substantiating the prevailing use of NHST was developed based on the main themes and subthemes.

Ethical issues

This research was conducted in accordance with the Dutch “General Data Protection Regulation” and the “Netherlands Code of Conduct for Research Integrity”. The research protocol was submitted for review to, and approved by, the ethical review committee of the VU Faculty of Behavioral and Movement Sciences. In addition, the project was submitted to the Medical Ethics Committee (METC) of the Amsterdam University Medical Centre, which decided that the project is not subject to the Medical Research (Human Subjects) Act (WMO). At the start of data collection, all participants signed an informed consent form.

A full study protocol, including a detailed data analysis plan, was preregistered ( https://osf.io/4qg38/ ). At the start of this study, preregistration forms for qualitative studies had not yet been developed. Therefore, preregistration for this study is based on an outdated form. Presently, there is a preregistration form available for qualitative studies [ 28 ]. Information about data collection, data management, data sharing and data storage is described in a Data Management Plan. Sensitive data is stored in Darkstor, an offline archive for storing sensitive information or data (e.g., information that involves privacy or copyright). As the recordings and transcripts of the interviews and focus groups contain privacy-sensitive data, these files are archived in Darkstor and can be accessed only on request by authorized individuals (i.e., the original researcher or a research coordinator); data requests can be sent to rdm@vu.nl. Non-sensitive data is stored in DANS (Data Archiving and Networked Services, the Netherlands institute for permanent access to digital research resources; https://doi.org/10.17026/dans-2at-nzfs ).

Participant characteristics

Twenty-nine individual interviews and six focus groups were conducted. The focus groups included four to six participants per session. A total of 47 participants were included in the study (13 researchers, 15 lecturers, 11 editors of scientific journals and 8 representatives of funding agencies). Twenty-nine participants were interviewed. Twenty-seven participants took part in the focus groups. Nine of the twenty-seven participants were both interviewed and took part in the focus groups. Some participants had multiple roles (i.e., editor and researcher, editor and lecturer or lecturer and researcher) but were classified based on their primary role (assistant professors were classified as lecturers). The lecturers in statistics in our sample were not statisticians themselves. Although they all received training in statistics, they were primarily trained as psychologists, medical doctors, or health scientists. Some lecturers in our sample taught an applied subject, with statistics as part of it. Other lecturers taught Methodology and Statistics courses. Statistical skills and knowledge among lecturers varied from modest to quite advanced. Statistical skills and knowledge among participants from the other stakeholder groups varied from poor to quite advanced. All participants were working in the Netherlands. A general overview of the participants is presented in Table 1 . Participant characteristics split up by interviews and focus groups are presented in Table 2 .
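The participant counts above can be reconciled with a short check. The snippet below is only an arithmetic sketch using the numbers reported in this paragraph; the variable names are illustrative.

```python
# Counts as reported in the text above.
interviewed = 29
in_focus_groups = 27
in_both = 9

# Inclusion-exclusion: people counted in both samples are subtracted once.
unique_participants = interviewed + in_focus_groups - in_both

by_primary_role = {
    "researchers": 13,
    "lecturers": 15,
    "editors": 11,
    "funding agency representatives": 8,
}

print(unique_participants, sum(by_primary_role.values()))
```

Both routes give the same total of 47 participants: 29 + 27 - 9 for the two data collection methods, and 13 + 15 + 11 + 8 for the four stakeholder groups.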

Three main themes with sub-themes and categories emerged ( Fig 3 ): the green-colored compartments hold the three main themes: The scientific climate, The scientific duty and Reactivity. Each of these three main themes consists of subthemes, depicted by the yellow-colored compartments. In turn, some (but not all) of the 9 subthemes also have categories. These ‘lower level’ findings are not included in the figure but will be mentioned in the elaboration on the findings and are depicted in Appendix 2 in S1 Appendix . Fig 3 shows how the themes are related to each other. The blue arrows indicate that the themes are interrelated; factors influence each other. The scientific climate affects the way stakeholders perceive and fulfil their scientific duty, and the way stakeholders give substance to their scientific duty shapes and maintains the scientific climate. The scientific duty and the scientific climate cause a state of reactivity. Many participants have adopted a ‘wait and see’ attitude regarding behavioral changes with respect to statistical methods. They feel dependent on someone else’s action. This leads to a reactive (instead of a proactive) attitude and a low sense of responsibility. ‘Reactivity’ is the core theme, explaining the most critical problem with respect to the continuous and ubiquitous use of NHST.

Main themes and subthemes are numbered. Categories are mentioned in the body of the text in bold. ‘P’ stands for participant; ‘I’ stands for interviewer.

1. The scientific climate

The theme, ‘the scientific climate’, represents (Dutch) researchers’ perceptions of the many written and unwritten rules they face in the research environment. This theme concerns the opportunities and challenges participants encounter when working in the science system. Dutch academics feel pressured to publish fast and regularly, and to follow conventions and directions of those on whom they depend. They feel this comes at the expense of the quality of their work. Thus, the scientific climate in the Netherlands has a strong influence on the behavior of participants regarding how they set their priorities and control the quality of their work.

1.1 Quality control. Monitoring the quality of research is considered very important. Researchers, funding agencies and editors indicate they rely on their own knowledge, expertise, and insight, and those of their colleagues, to guarantee this quality. However, editors or funding agencies are often left with little choice when it comes to compiling an evaluation committee or a review panel. The choice is often like-knows-like-based. Given the limited choice, they are forced to trust the opinion of their consultants, but the question is whether this is justified.

I: “The ones who evaluate the statistics, do they have sufficient statistical knowledge?” P: “Ehhr, no, I don’t think so.” I: “Okay, interesting. So, there are manuscripts published of which you afterwards might think….” P: “Yes yes.” (Interview 18; Professor/editor, Medical Sciences)

1.2 Convention. The scientific system is built on mores and conventions, as this participant describes:

P: “There is science, and there is the sociology of science, that is, how we talk to each other, what we believe, how we connect. And at some point, it was agreed upon that we would talk to each other in this way.” (Interview 28, researcher, Medical Sciences)

And to these conventions, one (naturally) conforms. Stakeholders copy behavior and actions of others within their discipline, thereby causing particular behaviors and values to become conventional or normative. One of those conventions is the use of NHST and p-values. Everyone is trained with NHST and is used to applying this method. Another convention is the fact that significant results mean ‘success’, in the sense of successful research and being a successful researcher. Everyone is aware that ‘p is smaller than 0.05’ means the desired results are achieved and that publication and citation chances are increased.

P: “You want to find a significant result so badly. (…) Because people constantly think: I must find a significant result, otherwise my study is worthless.” (Focus group 4, lecturer, Medical Sciences)

Stakeholders rigidly hold on to the above-mentioned conventions and are not inclined to deviate from existing norms; they are, in other words, quite conservative. ‘We don’t know any better’ has been brought up as a valid argument by participants from various stakeholder groups to stick to current rules and conventions. Consequently, the status quo in the scientific system is being maintained.

P: “People hold on to….” I: “Everyone maintains the system?” P: “Yes, we kind of hang to the conservative manner. This is what we know, what someone, everyone, accepts.” (Interview 17, researcher, Health Sciences)

Everyone is trained with NHST and considers it an accessible and easy to interpret method. The familiarity and perceived simplicity of NHST, user-friendly software such as SPSS and the clear cut-off value for significance are important facilitators for the use of NHST and at the same time barriers to start using alternative methods. Applied researchers stressed the importance of the accessibility of NHST as a method to test hypotheses and draw conclusions. This accessibility also justifies the use of NHST when researchers want to communicate their study results and messages in understandable ways to their readership.

P: “It is harder, also to explain, to use an alternative. So, I think, but maybe I’m overstepping, but if you want to go in that direction [alternative methods] it needs to be better facilitated for researchers. Because at the moment… I did some research, but, you know, there are those uncommon statistical packages.” (Interview 16, researcher/editor, Medical Sciences)

1.3 Publication pressure. Most researchers mentioned that they perceive publication pressure. This motivates them to use NHST and hope for significant results, as ‘significant p-values’ increase publication chances. They perceive a high workload and the way the scientific reward system is constructed as barriers for behavioral change pertaining to the use of statistical methods; potential negative consequences for publication and career chances prevent researchers from deviating from (un)written rules.

P: “I would like to learn it [alternative methods], but it might very well be that I will not be able to apply it, because I will not get my paper published. I find that quite tricky.” (Interview 1, Assistant Professor, Health Sciences)

2. The scientific duty

Throughout the interviews, participants reported a sense of duty in several variations. “What does it mean to be a scientific researcher?” seemed to be a question that was reflected upon during rather than prior to the interview, suggesting that many scientists had not really thought about the moral and professional obligations of being a scientist in general—let alone what that would mean for their use of NHST. Once they had given it some thought, the opinions concerning what constitutes the scientific duty varied to a large extent. Some participants attached great importance to issues such as reproducibility and transparency in scientific research and continuing education and training for researchers. For others, these topics seemed to play a less important role. A distinction was made between moral and professional obligations that participants described concerning their scientific duty.

2.1 Moral obligation. The moral obligation concerns issues such as doing research in a thorough and honest way, refraining from questionable research practices (QRPs) and investing in better research. It concerns tasks and activities that are not often rewarded or acknowledged.

Throughout the interviews and the focus groups, participants very frequently touched upon the responsibility they felt for doing ‘the right thing’ and making the right choice in doing research and using NHST, in particular. The extent to which they felt responsible varied among participants. When it comes to choices during doing research—for example, drawing conclusions from data—participants felt a strong sense of responsibility to do this correctly. However, when it comes to innovation and new practices, and feeling responsible for your own research, let alone improving scientific practice in general, opinions differed. This quotation from one of the focus groups illustrates that:

P1: “If you people [statisticians, methodologists] want me to improve the statistics I use in my research, then you have to hand it to me. I am not going to make any effort to improve that myself.” P3: “No. It is your responsibility as an academic to keep growing and learning and so, also to start familiarizing yourself when you notice that your statistics might need improvement.” (Focus group 2, participant 1 (PhD researcher, Medical Sciences) and 3 (Associate Professor, Health Sciences))

The sense of responsibility for improving research practices regarding the use of NHST was strongly felt and emphasized by a small group of participants. They emphasized the responsibility of the researcher to think, interpret and be critical when interpreting the p-value in NHST. It was felt that you cannot leave that up to the reader. Moreover, scrutinizing and reflecting upon research results was considered a primary responsibility of a scientist, and failing to do so was seen as not living up to what your job demands you to do:

P: “Yes, and if I want to be very provocative—and I often want that, because then people tend to wake up and react: then I say that hiding behind alpha.05 is just scientific laziness. Actually, it is worse: it is scientific cowardice. I would even say it is ‘relieving yourself from your duty’, but that may sound a bit harsh…” (Interview 2, Professor, Health Sciences)

These participants were convinced that scientists have a duty to keep scientific practice in general at the highest level possible.

The avoidance of questionable research practices (QRPs) was considered a means of keeping scientific practice at a high level and was often touched upon during the interviews and focus groups as being part of the scientific duty. Statisticians saw NHST as directly facilitating QRPs and provided ample examples of how the use of NHST leads to QRPs, whereas most applied researchers perceived NHST as the common way of doing research and were not aware of the risks related to QRPs. Participants did mention the violation of assumptions underlying NHST as being a QRP. Participants also considered overinterpreting results as a QRP, including exaggerating the degree of significance. Although participants stated they were careful about interpreting and reporting p-values, they ‘admitted’ that statistical significance was a starting point for them. Most researchers indicated they search for information that could get their study published, which usually includes a low p-value (this also relates to the theme ‘Scientific climate’).

P: “We all know that a lot of weight is given to the p-value. So, if it is not significant, then that’s the end of it. If it ís significant, it just begins.” (Interview 5, lecturer, Psychology)

The term ‘sloppy science’ was mentioned in relation to efforts by researchers to reduce the p-value (a.k.a. p-hacking, data-dredging, and HARKing). HARKing is an acronym that refers to the questionable research practice of Hypothesizing After the Results are Known. It occurs when researchers formulate a hypothesis after the data have been collected and analyzed, but make it look like it is an a priori hypothesis [ 29 ]. Preregistration and replication were mentioned as being promising solutions for some of the problems caused by NHST.

2.2 Professional obligation. The theme professional obligation reflects participants’ expressions about what methodological knowledge scientists should have about NHST. In contrast to moral obligations, there appeared to be some consensus about scientists’ professional obligations. Participants considered critical evaluation of research results a core professional obligation. Also, within all the stakeholder groups, participants agreed that sufficient statistical knowledge is required for using NHST, but they varied in their insight into the principles, potential and limitations of NHST. This also applied to the extent to which participants were aware of the current debate about NHST.

Participants considered critical thinking as a requirement for fulfilling their professional obligation. It specifically refers to the process of interpreting outcomes and taking all relevant contextual information into consideration. Critical thinking was not only literally referred to by participants, but also emerged by interpreting text fragments on the emphasis within their research. Researchers differed quite strongly in where the emphasis of their research outcomes should be put and what kind of information is required when reporting study results. Participants mentioned the proven effectiveness of a particular treatment, giving a summary of the research results, effect sizes, clinical relevance, p-values, or whether you have made a considerable contribution to science or society.

P: “I come back to the point where I said that people find it arbitrary to state that two points difference on a particular scale is relevant. They prefer to hide behind an alpha of 0.05, as if it is a God given truth, that it counts for one and for all. But it is just as well an invented concept and an invented guideline, an invented cut-off value, that isn’t more objective than other methods?” (Interview 2, Professor, Health Sciences)

For some participants, especially those representing funding agencies, critical thinking was primarily seen as a prerequisite for the utility of the research. The focus, when formulating the research question and interpreting the results, should be on practical relevance and the contribution the research makes to society.

The term ‘ignorance’ arose in the context of the participants’ concern regarding the level of statistical knowledge scientists and other stakeholders have versus what knowledge they should have to adequately apply statistical analysis in their research. The more statistically competent respondents in the sample felt quite strongly about how problematic the lack of knowledge about NHST is among those who regularly use it in their research, let alone the lack of knowledge about alternative methods. They felt that regularly retraining yourself in research methods is an essential part of the professional obligation one has. Applied researchers in the sample agreed that a certain level of background knowledge on NHST was required to apply it properly to research and acknowledged their own ignorance. However, they had different opinions about what level of knowledge is required. Moreover, not all of them regarded it as part of their scientific duty to be informed about all ins and outs of NHST. Some saw it as the responsibility of statisticians to actively inform them (see also the subtheme periphery). Some participants were not aware of their ignorance or stated that some of their colleagues are not aware of their ignorance, i.e., that they are unconsciously incompetent and, without realizing it, poorly understand what the p-value and associated outcome measures actually mean.

P: “The worst, and I honestly think that this is the most common, is unconsciously incompetent, people don’t even understand that…” I: “Ignorance.” P: “Yes, but worse, ignorant and not even knowing you are ignorant.” (Interview 2, Professor, Health Sciences)

The lack of proper knowledge about statistical procedures was especially prevalent in the medical sciences. Participants working in or with the medical sciences all confirmed that there is little room for proper statistical training for medical students and that the level of knowledge is fairly low. NHST is often used because of its simplicity. It is especially attractive for medical PhD students because they need their PhD to get ahead in their medical career instead of pursuing a scientific career.

P: “I am not familiar with other ways of doing research. I would really like to learn, but I do not know where I could go. And I do not know whether there are better ways. So sometimes I do read studies of which I think: ‘this is something I could investigate with a completely different test. Apparently, this is also possible, but I don’t know how.’ Yes, there are courses, but I do not know what they are. And here in the medical center, a lot of research is done by medical doctors and these people have hardly been taught any statistics. Maybe they will get one or two statistics courses, they know how to do a t-test and that is about it. (…) And the courses have a very low level of statistics, so to say.” (Interview 1, Assistant Professor, Health Sciences)

Also, the term ‘awareness’ arose. Firstly, it refers to being conscious of the limitations of NHST. Secondly, it refers to the awareness of the ongoing discussions about NHST and, more broadly, about the replication crisis. The statisticians in the sample emphasized the importance of knowing that NHST has limitations and that it cannot be considered the holy grail of data analysis. They also emphasized the importance of being aware of the debate. A certain level of awareness was considered a necessary requirement for critical thinking. There was variation in that awareness. Some participants were quite informed and were also fairly engaged in the discussion, whereas others were very new to the discussion and to larger contextual factors, such as the replication crisis.

I: “Are you aware of the debate going on in academia on this topic [NHST]?” P: “No, I occasionally see some article sent by a colleague passing by. I have the idea that something is going on, but I do not know how the debate is conducted and how advanced it is.” (Interview 6, lecturer, Psychology)

With respect to the theme ‘the scientific duty’, participants differed in the extent to which they felt responsible for better and open science, for pioneering, for reviewing, and for growing and learning as a scientist. Participants had one commonality: although they strived for adherence to the norms of good research, the prevailing feeling is that this is very difficult, due to the scientific climate. Consequently, participants perceive an internal conflict: a discrepancy between what they want or believe, and what they do. Participants often found themselves struggling with the responsibility they felt they had. Making the scientifically most solid choice was often difficult due to feasibility, time constraints, or certain expectations from supervisors (this is also directly related to the themes ‘Scientific climate’ and ‘Reactivity’). Thus, the scientific climate strongly influences the behavior of scientists regarding how they set their priorities and fulfill their scientific duties. The strong sense of scientific duty was perceived by some participants as a facilitator and by others as a barrier for the use of alternative methods.

3. Reactivity

A consequence of the foregoing factors is that most stakeholders have adopted a reactive attitude and behave accordingly. People are disinclined to take responsibility and await external signals and initiatives of others. This might explain why NHST is being continuously used and remains the default procedure to make inferences about population effects.

The core theme ‘reactivity’ can be explained by the following subthemes and categories:

3.1. Periphery. The NHST-problem resides in the periphery in several ways. First, it is a subject that is not given much priority. Secondly, some applied researchers and editors believe that methodological knowledge, as it is not their field of expertise, should not be part of their job requirement. This also applies to the NHST debate. Thirdly, and partly related to the second point, there is a lack of cooperation within and between disciplines.

The term ‘priority’ was mentioned often when participants were asked to what extent the topic of NHST was subject of discussion in their working environment. Participants indicated that (too) little priority is given to statistics and the problems related to the subject. There is simply a lot going on in their research field and daily work, so there are always more important or urgent issues on the agenda.

P: “Discussions take place in the periphery; many people find it complicated. Or are just a little too busy.” (Interview 5, lecturer, Psychology)

As the NHST debate is not prioritized, initiatives with respect to this issue are not forthcoming. Moreover, researchers and lecturers claim there is neither time nor money available for training in statistics in general or acquiring more insight and skills with respect to (the use of) alternative methods. Busy working schedules were mentioned as an important barrier for improving statistical knowledge and skills.

P: “Well you can use your time once, so it is an issue low on the priority list.” (Focus group 5, researcher, Medical Sciences)

The NHST debate is perceived as the domain of statisticians and methodologists. Also, cooperation between different domains and domain-specific experts is perceived as complicated, as different perceptions and ways of thinking can clash. Therefore, some participants feel that separate worlds should be kept separate; put another way: stick to what you know!

P: “This part is not our job. The editorial staff, we have the assignment to ensure that it is properly written down. But the discussion about that [alternatives], that is outside our territory.” (Interview 26, editor, Medical Sciences)

Within disciplines, individuals tend to act on their own, not being aware that others are working on the same subject and that it would be worthwhile to join forces. The interviews and focus groups exposed that a modest number of participants actively try to change the current situation, but in doing that, feel like lone voices in the wilderness.

P1: “I mean, you become a lone voice in the wilderness.” P2: “Indeed, you don’t want that.” P1: “I get it, but no one listens. There is no audience.” (Focus Group 3, P1: MD, lecturer, Medical Sciences, P2: editor, Medical Sciences)

To succeed at positive change, participants emphasized that it is essential that people cooperate and join forces across disciplines, rather than operate on individual levels, focusing solely on their own working environment.

The caution people show with respect to taking initiative is reinforced by the fear of encountering resistance from their working environment when one voices that change regarding the use of NHST is needed. A condition that was mentioned as essential to bring about change was tactical implementation, that is, taking very small steps. As everyone is still using NHST, taking big steps brings the risk of losing especially the more conservative people along the way. Also, adjusting policy, guidelines and educational programs is a process for which we need to provide time and scope.

P: “Everyone still uses it, so I think we have to be more critical, and I think we have to look at some kind of culture change, that means that we are going to let go of it (NHST) more and we will also use other tests, that in the long term will overthrow NHST.” I: “And what about alternatives?” P: “I think you should never be too fanatic in those discussions, because then you will provoke resistance. (…) That is not how it works in communication. You will touch them on a sore spot, and they will think: ‘and who are you?’” I: “And what works?” P: “Well, gradualness. Tell them to use NHST, do not burn it to the ground, you do not want to touch people’s work, because it is close to their hearts. Instead, you say: ‘try to do another test next to NHST’. Be a pioneer yourself.” (Interview 5, lecturer, Psychology)

3.2. Efficacy. Most participants stated they feel they are not in the position to initiate change. On the one hand, this feeling is related to their hierarchical positions within their working environments. On the other hand, the feeling is caused by the fact that statistics is perceived as a very complex field of expertise and people feel they lack sufficient knowledge and skills, especially about alternative methods.

Many participants stated they felt little sense of empowerment, or self-efficacy. The academic system is perceived as hierarchical, having an unequal balance of power. Most participants believe that it is not in their power to take a lead in innovative actions or to stand up against the establishment, and think that this responsibility lies with other stakeholders who have more status.

P: “Ideally, there would be a kind of an emergency letter from several people whose names open up doors, in which they indicate that in the medical sciences we are throwing away money because research is not being interpreted properly. Well, if these people that we listen to send such an emergency letter to the board of The Netherlands Organization for Health Research and Development [the largest Dutch funding agency for innovation and research in healthcare], I can imagine that this will initiate a discussion.” (…) I: “And with a big name you mean someone from within the science system?” P: “Well, you know, ideally a chairman, or chairmen of the academic medical center. At that level. If they would put a letter together. Yes, that of course would have way more impact. Or some prominent medical doctors, yes, that would have more impact, than if some other person would send a letter, yes.” (Interview 19, representative from funding agency, Physical Sciences)

Some participants indicated that they did try to make a difference but encountered too much resistance and therefore gave up their efforts. PhD students feel they have insufficient power to choose their own directions and make their own choices.

P: “I am dependent on funding agencies and professors. In the end, I will write a grant application in that direction that gives me the greatest chance of eventually receiving that grant. Not primarily research that I think is the most optimal (…) If I know that reviewers believe the p-value is very important, well, of course I write down a method in which the p-value is central.” (Focus group 2, PhD-student, Medical Sciences)

With a sense of imperturbability, most participants accept that they cannot really change anything.

Lastly, the complexity of the subject is an obstacle for behavioral change. Statistics is perceived as a difficult subject. Participants indicate that they have a lack of knowledge and skills and that they are unsure about their own abilities. This applies to the ‘standard’ statistical methods (NHST), but to a greater extent to alternative methods. Many participants feel that they do not have the capacity to pursue a true understanding of (alternative) statistical methods.

P: “Statistics is just very hard. Time and again, research demonstrates that scientists, even the smartest, have a hard time with statistics.” (Focus group 3, PhD researcher, Psychology)

3.3. Interdependency. As mentioned, participants feel they are not in a sufficiently strong position to take initiative or to behave in an anti-establishment manner. Therefore, they await external signals from people within the scientific system with more status, power, or knowledge. This can be people within their own stakeholder group, or from other stakeholder groups. As a consequence of this attitude, a situation arises in which people’s actions largely depend on others. That is, a complex state of interdependency evolves: scientists argue that if the reward system does not change, they are not able to alter their statistical behavior. According to researchers, editors and funding agencies are still very much focused on NHST and especially (significant) p-values, and thus, scientists wait for editors and funders to adjust their policy regarding statistics:

P: “I wrote an article and submitted it to an internal medicine journal. I only mentioned confidence intervals. Then I was asked to also write down the p-values. So, I had to do that. This is how they [editors] can use their power. They decide.” (Interview 1, Assistant Professor, Health Sciences)

Editors and funders in their turn claim they do not maintain a strict policy. Their main position is that scientists should reach consensus about the best statistical procedure, and they will then adjust their policy and guidelines.

P: “We actually believe that the research field itself should direct the quality of its research, and thus, also the discussions.” (Interview 22, representative from funding agency, Neurosciences)

Lecturers, for their part, argue that they cannot revise their educational programs because the academic system and university policies are geared to NHST and p-values.

As most participants seem not to be aware of this process, a circle of interdependency arises that is difficult to break.

P: “Yes, the stupid thing about this perpetual circle is that you are educating people, let’s say in the department of cardiology. They must of course grow, and so they need to publish. If you want to publish you must meet the norms and values of the cardiology journals, so they will write down all those p-values. These people are trained and in twenty years they are on the editorial board of those journals, and then you never get rid of it [the p-value].” (Interview 18, Professor, editor, Medical Sciences)

3.4. Degree of eagerness. Exhibiting certain behavior or behavioral change is (partly) determined by the extent to which people want to employ particular behavior, their behavioral intention [ 22 ]. Some participants indicated they are willing to change their behavior regarding the use of statistical methods, but only if it is absolutely necessary, imposed or if they think that the current conventions have too many negative consequences. Thus, true, intrinsic willpower to change behavior is lacking among these participants. Instead, they have a rather opportunistic attitude, meaning that their behavior is mostly driven by circumstances, not by principles.

P: “If tomorrow an alternative is offered by people that make that call, then I will move along. But I am not the one calling the shots on this issue.” (Interview 26, editor, Medical Sciences)

In addition, pragmatism often outweighs the perceived urgency to change. Participants argue they ‘just want to do their jobs’ and mainly consider the practical consequences of their actions. This attitude creates a certain degree of inertia. Although participants claim they are willing to change their behavior, this would entail much more than ‘doing their jobs’, and thus, in the end, the NHST-debate remains a subject of ‘coffee talk’. People are open to discussion, but when it comes to taking action (and motivating others to do so), no one does.

P: “The endless analysis of your data to get something with a p-value less than 0.05… There are people that are more critical about that, and there are people that are less critical. But that is a subject for during the coffee break.” (Interview 18, professor, editor, Medical Sciences)

The goal of our study was to acquire in-depth insight into reasons why so many stakeholders from the scientific system keep using NHST as the default method to draw conclusions, despite its many well-documented drawbacks. Furthermore, we wanted to gain insight into the reasons for their reluctance to apply alternative methods. Using a theoretical framework [ 20 , 21 ], several interrelated facilitators and barriers associated with the use of NHST and alternative methods were identified. The identified factors are subsumed under three main themes: the scientific climate, the scientific duty and reactivity. The scientific climate is dominated by conventions, behavioral rules, and beliefs, of which the use of NHST and p-values is part. At the same time, stakeholders feel they have a (moral or professional) duty. For many participants, these two sides of the same coin are incompatible, leading to internal conflicts. There is a discrepancy between what participants want and what they do. As a result of these factors, the majority feel dependent on others and have thereby become reactive. Most participants are not inclined to take responsibility themselves but await action and initiatives from others. This may explain why NHST is still the standard and used by almost everyone involved.

The current study is closely related to the longstanding debate regarding NHST, which recently increased to a level not seen before. In 2015, the editors of the journal ‘Basic and Applied Social Psychology’ (BASP) prohibited the use of NHST (and p-values and confidence intervals) [ 30 ]. Subsequently, in 2016, the American Statistical Association published the so-called ‘Statement on p-values’ in the American Statistician. This statement consists of critical standpoints regarding the use of NHST and p-values and warns against the abuse of the procedure. In 2019, the American Statistician devoted an entire edition to the implementation of reforms regarding the use of NHST; in more than forty articles, scientists debated statistical significance, advocated to embrace uncertainty, and suggested alternatives such as the use of s-values, False Positive Risks, reporting results as effect sizes and confidence intervals and more holistic approaches to p-values and outcome measures [ 31 ]. In addition, in the same year, several articles appeared in which an appeal was made to stop using statistical significance testing [ 32 , 33 ]. A number of counter-reactions were published [ 34 – 36 ], stating, among other things, that banning statistical significance and, with that, abandoning clear rules for statistical analyses may create new problems with regard to statistical interpretation, study interpretations and objectivity. Also, some methodologists expressed the view that under certain circumstances the use of NHST and p-values is not problematic and can in fact provide useful answers [ 37 ]. Until recently, the NHST-debate was limited to mainly methodologists and statisticians. However, a growing number of scientists are getting involved in this lively debate and believe that a paradigm shift is desirable or even necessary.

The aforementioned publications have constructively contributed to this debate. In fact, since the publication of the special edition of the American Statistician, numerous scientific journals published editorials or revised, to a greater or lesser extent, their author guidelines [ 38 – 45 ]. Furthermore, following the American Statistical Association (ASA), the National Institute of Statistical Sciences (NISS) in the United States has also taken up the reform issue. However, real changes are still barely visible. It takes a long time before these kinds of initiatives translate into behavioral changes, and the widespread adoption by most of the scientific community is still far from accomplished. Debate alone will not lead to real changes, and therefore, our efforts to elucidate behavioral barriers and facilitators could provide a framework for potential effective initiatives that could be taken to reduce the default use of NHST. In fact, the debate could counteract behavioral change. If there is no consensus among statisticians and methodologists (the innovators), changing behavior cannot be expected from stakeholders with less statistical and methodological expertise. In other words, without agreement among innovators, early adopters might be reluctant to adopt the innovation.

Research has recently been conducted to explore the potential of behavioral change to improve Open Science behaviors. The adoption of open science behavior has increased in recent years, but uptake has been slow, due to firm barriers such as a lack of awareness about the subject, concerns about constraints on the creative process, worries about being “scooped” and holding on to existing working practices [ 46 ]. The developments in open science practices, and the parallels this line of research shows with the current study, might help to support behavioral change regarding the use of statistical methods.

The described obstacles to changing behavior are related to features of both the ‘innovative idea’ and the potential adopters of the idea. First, there are characteristics of ‘the innovation’ that form barriers. The first barrier is the complexity of the innovation: most participants perceive alternative methods as difficult to understand and to use. A second barrier concerns the feasibility of trying the innovation; most people do not feel flexible about trying out or experimenting with the new idea. There is a lack of time and monetary resources to get acquainted with alternative methods (for example, by following a course). Also, the possible negative consequences of the use of alternatives (lower publication chances, the chance that the statistical method and message are too complicated for one’s readership) are holding people back from experimenting with these alternatives. And lastly, it is unclear for most participants how visible the results of the new idea are. Up until now, the debate has mainly taken place among a small group of statisticians and methodologists. Many researchers are still not aware of the NHST debate and the idea to shift away from NHST and use alternative methods instead. Therefore, the question is how easily the benefits of the innovation can be made visible for a larger part of the scientific community. Thus, our study shows that, although the innovation is largely compatible with existing values (participants are critical about (the use of) NHST and the p-value and believe that there are better alternatives to NHST), important attributes of the innovative idea negatively affect the rate of adoption and consequently the diffusion of the innovation.

Due to the barriers mentioned above, most stakeholders do not have the intention to change their behavior and adopt the innovative idea. From the theory of planned behavior [ 21 ], it is known that behavioral intentions directly relate to performances of behaviors. The strength of the intention is shaped by attitudes, subjective norms, and perceived power. If people evaluate the suggested behavior as positive (attitude), and if they think others want them to perform the behavior (subjective norm), this leads to a stronger intention to perform that behavior. When an individual also perceives they have enough control over the behavior, they are likely to perform it. Although most participants have a positive attitude towards the behavior, or the innovative idea at stake, many participants think that others in their working environment believe that they should not perform the behavior—i.e., they do not approve of the use of alternative methods (social normative pressure). This is expressed, for example, in lower publication chances, negative judgements by supervisors or failing the requirements that are imposed by funding agencies. Thus, the perception about a particular behavior—the use of alternative methods—is negatively influenced by the (perceived) judgment of others. Moreover, we found that many participants have a low self-efficacy, meaning that there is a perceived lack of behavioral control, i.e., their perceived ability to engage in the behavior at issue is low. Also, participants feel a lack of authority (in the sense of knowledge and skills, but also power) to initiate behavioral change. The existing subjective norms and perceived behavioral control, and the negative attitudes towards performing the behavior, lead to a lower behavioral intention, and, ultimately, a lower chance of the performance of the actual behavior.

Several participants mentioned there is a need for people of stature (belonging to the group of early adopters) to take the lead and break down perceived barriers. Early adopters serve as role models and have opinion leadership, and form the next group (after the innovators, in this case statisticians and methodologists) to adopt an innovative idea [ 20 ] ( Fig 2 ). If early adopters would stand up, conveying a positive attitude towards the innovation, breaking down the described perceived barriers and facilitating the use of alternatives (for example by adjusting policy, guidelines and educational programs and making available financial resources for further training), this could positively affect the perceived social norms and self-efficacy of the early and late majority and ultimately laggards, which could ultimately lead to behavioral change among all stakeholders within the scientific community.

A strength of our study is that it is the first empirical study on views on the use of NHST, its alternatives and reasons for the prevailing use of NHST. Another strength is the method of coding which corresponds to the thematic approach from Braun & Clarke [ 47 ], which allows the researcher to move beyond just categorizing and coding the data, but also analyze how the codes are related to each other [ 47 ]. It provides a rich description of what is studied, linked to theory, but also generating new hypotheses. Moreover, two independent researchers coded all transcripts, which adds to the credibility of the study. All findings and the coding scheme were discussed by the two researchers, until consensus was reached. Also, interview results were further explored, enriched and validated by means of (mixed) focus groups. Important themes that emanated from the interviews, such as interdependency, perceptions on the scientific duty, perceived disadvantages of alternatives or the consequences of the current scientific climate, served as starting points and main subjects of the focus groups. This set-up provided more data, and more insight about the data and validation of the data. Lastly, the use of a theoretical framework [ 20 , 21 ] to develop the topic list, guide the interviews and focus groups, and guide their analysis is a strength as it provides structure to the analysis and substantiation of the results.

A limitation of this study is its sampling method. By using the network of members of the project group, and the fact that a relatively high proportion of those invited to participate refused because they thought they knew too little about the subject to be able to contribute, our sample was biased towards participants that are (somewhat) aware of the NHST debate. Our sample may also consist of people that are relatively critical towards the use of NHST, compared to the total population of researchers. It was not easy to include participants who were indifferent about or who were pro-NHST, as those were presumably less willing to make time and participate in this study. Even in our sample we found that the majority of our participants solely used NHST and perceived it as difficult if not impossible to change their behavior. These perceptions are thus probably even stronger in the target population. Another limitation, that is inherent to qualitative research, is the risk of interviewer bias. Respondents are unable, unwilling, or afraid to answer questions in good conscience, and instead provide socially desirable answers. In the context of our research, people are aware that, especially as a scientist, it does not look good to be conservative, complacent, or ignorant, or not to be open to innovation and new ideas. Therefore, some participants might have given a too favorable view of themselves. The interviewer bias can also take the other direction when values and expectations of the interviewer consciously or unconsciously influence the answers of the respondents. Although we have tried to be as neutral and objective as possible in asking questions and interpreting answers, we cannot rule out the chance that our views and opinions on the use of NHST have at times steered the respondents somewhat, potentially leading to the foregoing desirable answers.

Generalizability is a topic that is often debated in qualitative research methodology. Many researchers do not consider generalizability the purpose of qualitative research, but rather finding in-depth insights and explanations. However, this is an unjustified simplification, as generalizing of findings from qualitative research is possible. Three types of generalization in qualitative research are described: representational generalization (whether what is found in a sample can be generalized to the parent population of the sample), inferential generalization (whether findings from the study can be generalized to other settings), and theoretical generalization (where one draws theoretical statements from the findings of the study for more general application) [ 48 ]. The extent to which our results are generalizable is uncertain, as we used a theoretical sampling method, and our study was conducted exclusively in the Netherlands. We expect that the generic themes (reactivity, the scientific duty and the scientific climate) are applicable to academia in many countries across the world (inferential generalization). However, some elements, such as the Dutch educational system, will differ to a greater or lesser extent from other countries (and thus can only be representationally generalized). In the Netherlands there is, for example, only one educational route after secondary school that has an academic orientation (scientific education, equivalent to the US university level education). This route consists of a bachelor’s program (typically 3 years), and a master’s program (typically 1, 2 or 3 years). Not every study program contains (compulsory) statistical courses, and statistical courses differ in depth and difficulty levels depending on the study program. Thus, not all the results will hold for other parts of the world, and further investigation is required.

Our findings demonstrate how perceived barriers to shift away from NHST set a high threshold for behavioral change and create a circle of interdependency. Behavioral change is a complex process. As ‘the stronger the intention to engage in a behavior, the more likely should be its performance’ [ 21 ], further research on this subject should focus on how to influence the intention of behavior, i.e., which perceived barriers for the use of alternatives are most promising to break down in order to increase the intention for behavioral change. The present study shows that negative normative beliefs and a lack of perceived behavioral control regarding the innovation among individuals in the scientific system are a substantial problem. When social norms change in favor of the innovation, and control over the behavior increases, then the behavioral intention becomes a sufficient predictor of behavior [ 49 ]. An important follow-up question will therefore be: how can people be enthused and empowered to ultimately take up the use of alternative methods instead of NHST? Answering this question can, in the long run, lead to the diffusion of the innovation through the scientific system as a whole.

NHST has been the leading paradigm for many decades and is deeply rooted in our science system, despite longstanding criticism. The aim of this study was to gain insight as to why we continue to use NHST. Our findings have demonstrated how perceived barriers to make a shift away from NHST set a high threshold for actual behavioral change and create a circle of interdependency between stakeholders in the scientific system. Consequently, people find themselves in a state of reactivity, which limits behavioral change with respect to the use of NHST. The next step would be to get more insight into ways to effectively remove barriers and thereby increase the intention to take a step back from NHST. A paradigm shift within a couple of years is not realistic. However, we believe that by taking small steps, one at a time, it is possible to decrease the scientific community’s strong dependence on NHST and p-values.

Supporting information

S1 Appendix.

Acknowledgments

The authors are grateful to Anja de Kruif for her contribution to the design of the study and for moderating one of the focus groups.

Funding Statement

This research was funded by the NWO (Nederlandse Organisatie voor Wetenschappelijk Onderzoek; Dutch Organization for Scientific Research) (https://www.nwo.nl/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Availability

Miles D. Williams


Visiting Assistant Professor | Denison University


When the Research Hypothesis Is the Null

Posted on May 13, 2024 by Miles Williams in Methods, Statistics


What should you do if your research hypothesis is the null hypothesis? In other words, how should you approach hypothesis testing if your theory predicts no effect between two variables? A coauthor and I are working on a paper where a couple of our proposed hypotheses look like this, and we got some push-back from a reviewer about it. This prompted me to go down a rabbit hole of journal articles and message boards to see how others handle this situation. I quickly found that I had waded into a contentious issue that’s connected to a bigger philosophical debate about the merits of hypothesis testing in general and whether the null hypothesis in particular is even logically sound as a benchmark for hypothesis testing.

There’s too much to unpack with this debate for me to cover in a single blog post (and I’m sure I’d get some of the key points wrong anyway if I tried). The main issue I want to explore in this post is the practical problem of how to approach testing a null research hypothesis. From an applied perspective, this is a tricky problem that raises issues with how we calculate and interpret p-values. Thankfully, there is a sound solution for the null research hypothesis which I explore in greater detail below. It’s called a two one-sided test, and it’s easy to implement once you know what it is.

The usual approach

Most of the time, a scientist has a research hypothesis that goes something like: X has a positive effect on Y. For example, a political scientist might propose that a get-out-the-vote (GOTV) campaign (X) will increase voter turnout (Y).

The typical approach for testing this claim might be to estimate a regression model with voter turnout as the outcome and the GOTV campaign as the explanatory variable of interest:

Y = α + β X + ε

If the parameter β > 0, this would support the hypothesis that GOTV campaigns improve voter turnout. To test this hypothesis, in practice the researcher would actually test a different hypothesis that we call the null hypothesis. This is the hypothesis that says there is no true effect of GOTV campaigns on voter turnout.

By proposing and testing the null, we now have a point of reference for calculating a measure of uncertainty—that is, the probability of observing an empirical effect of a certain magnitude or greater if the null hypothesis is true. This probability is called a p-value, and by convention if it is less than 0.05 we say that we can reject the null hypothesis.

For the hypothetical regression model proposed above, to get this p-value we’d estimate β, then calculate its standard error, and then take the ratio of the former to the latter, giving us what’s called a t-statistic or t-value. Under the null hypothesis, the t-value has a known distribution, which makes it really easy to map any t-value to a p-value. The figure below illustrates this using a hypothetical data sample of size N = 200. You can see that the t-statistic’s distribution has a distinct bell shape centered around 0. You can also see the range of t-values in blue where, if we observed them in our empirical data, we’d fail to reject the null hypothesis at the p < 0.05 level. Values in gray are t-values that would lead us to reject the null hypothesis at this same level.

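[Figure: distribution of the t-statistic under the null hypothesis (N = 200), with the fail-to-reject range in blue and the rejection regions in gray]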
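To make the estimate, standard error, t-value, p-value chain concrete, here is a minimal Python sketch. It is not code from the post: the function name `slope_t_and_p`, the simulated data, and the seed are all made up for illustration, and the p-value uses a normal approximation to the t-distribution (an assumption that is reasonable at N = 200 but not exact).

```python
import math
import random
from statistics import NormalDist

def slope_t_and_p(x, y):
    """Return the OLS slope of y on x, its standard error, t-value, and two-sided p-value."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx                      # slope estimate
    alpha = mean_y - beta * mean_x        # intercept estimate
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of the slope
    t = beta / se                         # t-value: estimate over standard error
    # Assumption: normal approximation to the t-distribution with n - 2 df.
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return beta, se, t, p

# Hypothetical data: a binary "campaign" indicator and an unrelated outcome.
rng = random.Random(1)
x = [rng.randint(0, 1) for _ in range(200)]
y = [rng.gauss(0, 1) for _ in range(200)]
beta, se, t, p = slope_t_and_p(x, y)
print(f"beta={beta:.3f} se={se:.3f} t={t:.3f} p={p:.3f}")
```

In practice you would let a regression routine report these numbers; the point of spelling them out is only to show that the t-value is nothing more than the estimate divided by its standard error.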

When the null is the research hypothesis we want to test

There’s nothing new or special here. If you have even a basic stats background (particularly with Frequentist statistics), the conventional approach to hypothesis testing is pretty ubiquitous. Things get more tricky when our research hypothesis is that there is no effect. Say for a certain set of theoretical reasons we think that GOTV campaigns are basically useless at increasing voter turnout. If this argument is true, then if we estimate the same regression model as above, we’d expect β = 0.

The problem here is that our substantive research hypothesis is also the one that we want to try to find evidence against. We could just proceed like usual and just say that if we fail to reject the null this is evidence in support of our theory, but the problem with doing this is that failure to reject the null is not the same thing as finding support for the null hypothesis.

There are a few ideas in the literature for how we should approach this instead. Many of these approaches are Bayesian, but most of my research relies on Frequentist statistics, so these approaches were a no-go for me. However, there is one really simple approach that is consistent with the Frequentist paradigm: equivalence testing. The idea is simple. Propose some absolute effect size that is of minimal interest and then test whether an observed effect is different from it. This minimum effect is called the “smallest effect size of interest” (SESOI). I read about the approach in an article by Harms and Lakens (2018) in the Journal of Clinical and Translational Research.

Say, for example, that we deemed a t-value of +/-1.96 (the usual threshold for rejecting the null hypothesis) as extreme enough to constitute good evidence of a non-zero effect. We could make the appropriate adjustments to our t-distribution to identify a new range of t-values that would allow us to reject the hypothesis that an effect is non-zero. This is illustrated in the figure below. We can now see a range of t-values in the middle where we could reject the non-zero hypothesis at the p < 0.05 level. This distribution looks like it’s been inverted relative to the usual null distribution. The reason is that with this approach we’re conducting a pair of one-tailed tests. We’re testing both the hypothesis that β / se(β) - 1.96 > 0 and the hypothesis that β / se(β) + 1.96 < 0, and we need to reject both to conclude that the effect is negligible. In the Harms and Lakens paper cited above, they call this approach two one-sided tests or TOST (I’m guessing this is pronounced “toast”).

[Figure: distribution of the t-statistic with the range of t-values in the middle where the non-zero hypothesis is rejected]
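The pair of one-sided tests can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure from Harms and Lakens: the function name `tost_rejects_nonzero` is hypothetical, the equivalence bound is expressed directly in t-value units (+/-1.96, as in the text), and a normal approximation stands in for the t-distribution.

```python
from statistics import NormalDist

def tost_rejects_nonzero(t_value, bound=1.96, alpha=0.05):
    """Two one-sided tests on a t-value, with the bound given in t-value units."""
    z = NormalDist()  # assumption: normal approximation to the t-distribution
    # Test 1: null is t - bound >= 0; a small p says the effect sits below the upper bound.
    p_upper = z.cdf(t_value - bound)
    # Test 2: null is t + bound <= 0; a small p says the effect sits above the lower bound.
    p_lower = 1 - z.cdf(t_value + bound)
    # The non-zero hypothesis is rejected only if BOTH one-sided tests reject.
    return max(p_upper, p_lower) < alpha

# Implied cutoff: the largest |t| for which both tests still reject.
cutoff = 1.96 - NormalDist().inv_cdf(0.95)
# Chance that |t| falls under the cutoff when the true effect is exactly zero.
power_at_zero = 2 * NormalDist().cdf(cutoff) - 1
print(f"cutoff |t| < {cutoff:.3f}, rejection probability at zero effect = {power_at_zero:.3f}")
```

Under these assumptions the observed t-value has to land within roughly a third of a standard error of zero, which is why the test rejects the non-zero hypothesis only about a quarter of the time even when the true effect is exactly zero.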

Something to pay attention to with this approach is that the observed t-statistic needs to be very small in absolute magnitude for us to reject the hypothesis of a non-zero effect. This means that the bar for testing a null research hypothesis is actually quite high. This is demonstrated using the following simulation in R. Using the {seerrr} package, I had R generate 1,000 random draws (each of size 200) for a pair of variables x and y where the former is a binary “treatment” and the latter is a random normal “outcome.” By design, there is no true causal relationship between these variables. Once I simulated the data, I generated a set of estimates of the effect of x on y for each simulated dataset and collected the results in an object called sim_ests. I then visualized two metrics that I calculated with the simulated results: (1) the rejection rate for the null hypothesis test and (2) the rejection rate for the two one-sided equivalence tests. As you can see, if we were to try to test a research null hypothesis the usual way, we’d expect to fail to reject the null about 95% of the time. Conversely, if we were to use the two one-sided equivalence tests, we’d expect to reject the non-zero alternative hypothesis only about 25% of the time. I tested out a few additional simulations to see if a larger sample size would lead to improvements in power (not shown), but no dice.

[Figure: simulated rejection rates for the null hypothesis test and for the two one-sided equivalence tests]
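For readers who want to reproduce the gist of this simulation without R, here is a rough Python stand-in. It is not the {seerrr} code used for the post: the function names, the seed, and the use of normal critical values in place of t critical values are all assumptions made for this sketch.

```python
import math
import random
from statistics import NormalDist

Z = NormalDist()  # assumption: normal approximation to the t-distribution

def t_stat(x, y):
    """t-value for the OLS slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    beta = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    alpha = my - beta * mx
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    return beta / math.sqrt(rss / (n - 2) / sxx)

def simulate(n_sims=1000, n=200, bound=1.96, alpha=0.05, seed=42):
    """Return (null-test rejection rate, TOST rejection rate) when the true effect is zero."""
    rng = random.Random(seed)
    crit = Z.inv_cdf(1 - alpha / 2)           # usual two-sided critical value
    tost_cut = bound - Z.inv_cdf(1 - alpha)   # both one-sided tests reject iff |t| < tost_cut
    null_rej = tost_rej = 0
    for _ in range(n_sims):
        x = [rng.randint(0, 1) for _ in range(n)]  # binary "treatment"
        y = [rng.gauss(0, 1) for _ in range(n)]    # outcome unrelated to x by design
        t = t_stat(x, y)
        null_rej += abs(t) > crit
        tost_rej += abs(t) < tost_cut
    return null_rej / n_sims, tost_rej / n_sims

null_rate, tost_rate = simulate()
print(f"null rejected: {null_rate:.1%}; non-zero alternative rejected (TOST): {tost_rate:.1%}")
```

Because the bound is defined in t-value units, it shrinks along with the standard error as the sample grows, so raising `n` should leave the TOST rejection rate about where it is. That is consistent with the larger-sample simulations not improving power.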

The two one-sided tests approach strikes me as a nice method when dealing with a null research hypothesis. It’s actually pretty easy to implement, too. The one downside is that this test is under-powered. If the null is true, it will only reject the alternative 25% of the time (though you could select a different non-zero alternative which would possibly give you more power). However, this isn’t all bad. The flip side of the coin is that this is a really conservative test, so if you can reject the alternative that puts you on solid rhetorical footing to show the data really do seem consistent with the null.

Answers In Reason


Misunderstanding The Null Hypothesis and Knowledge.

The null hypothesis is used in science, and we atheists tend to have a lot of respect for science, the scientific method, scientific and empirical evidence. We think of ourselves as logical and rational people. We feel we know a lot about the topics we discuss, especially when it comes to science.

In conversations with religious folks, especially many young-earth creationists, we will correct their misrepresentation and misuse of terms like evolution and scientific theory and demand they “learn the science.”

It might come as a surprise to you, then, that there are many of us atheists who also misuse scientific terminology and who, when it is explained to them how these terms are used in science, often act no differently from the creationist.

If we atheists are going to hold others to a standard where scientific language should be used correctly then we ought to hold ourselves to that same standard and instead of using the scientific language in a colloquial way, find a different way to describe our thoughts.

So what are these mistakes we make? Well, whether theist, atheist or other non-theist we all have blind spots, biases and things we misunderstand and I try to help with those that I’m aware of.

This article hopes to address some of the errors made using scientific terminology.

The Null Hypothesis

The ‘Null Hypothesis’ is something frequently misunderstood by us atheists. We make statements that don’t really follow. Some atheists who use atheism as a lack of belief in gods say atheism is the null hypothesis whilst others will say the null hypothesis is that gods do not exist.

So let’s actually consider what a hypothesis is.

In short, a scientific hypothesis is a falsifiable tentative explanation of a phenomenon or a narrow set of phenomena observed in the natural world. For more detail check: ‘Forming a Hypothesis’.

Hypothesis is an assumption or an idea proposed for the sake of argument so that it can be tested. It is a precise, testable statement of what the researchers predict will be outcome of the study. https://www.enago.com/academy/how-to-develop-a-good-research-hypothesis/#:~:text=Hypothesis%20is%20an%20assumption%20or,be%20outcome%20of%20the%20study.

Is Lacking Belief in Gods the Null Hypothesis?

Is lacking belief in something a falsifiable explanation? No, no it’s not. You might lack belief in a hypothesis or all hypotheses in a set but that’s still not the same thing as the null.

Lacking belief is simply describing a mental state that is missing. It is not a hypothesis or the null.

So what is a Null Hypothesis?

A null hypothesis is the statement that there isn’t any statistically significant difference between variables. It gets falsified (or at least rejected), and another hypothesis accepted, if there is a statistically significant difference. In fact, depending on the field, the null can be that there is no difference or no relationship between variables.

Null hypothesis states a negative statement to support the researcher’s findings that there is no relationship between two variables. There will be no changes in the dependent variable due the manipulation of the independent variable. Furthermore, it states results are due to chance and are not significant in terms of supporting the idea being investigated. https://www.enago.com/academy/how-to-develop-a-good-research-hypothesis/

H0 does not stand alone and belongs to H1-n. Without H1-n there is no H0 and it only belongs to the variables set out as measurable. Making a jump from H0 being the base or default state of an untestable H1 requires quite a bit of something that just isn’t there.

The null is never accepted; it sits in the realm of not rejected and rejected. When the null is not rejected, it is because there is not enough evidence to reject it.

If our statistical analysis shows that the significance level is below the cut-off value we have set (e.g., either 0.05 or 0.01), we reject the null hypothesis and accept the alternative hypothesis. Alternatively, if the significance level is above the cut-off value, we fail to reject the null hypothesis and cannot accept the alternative hypothesis. You should note that you cannot accept the null hypothesis, but only find evidence against it. https://statistics.laerd.com/statistical-guides/hypothesis-testing-3.php#:~:text=Rejecting%20or%20failing%20to%20reject%20the%20null%20hypothesis&text=You%20should%20note%20that%20you,only%20find%20evidence%20against%20it.
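The decision rule in the quote above can be written out as a tiny Python function. This is only a sketch: the name `decide` is made up, and it reads the quote’s “significance level” as the p-value produced by the analysis. Note that the function has no branch that returns “accept H0”.

```python
def decide(p_value, alpha=0.05):
    """Map a p-value to one of the only two outcomes a significance test allows."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    # Below the cut-off: reject the null. Otherwise: fail to reject it.
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(decide(0.03))              # below the 0.05 cut-off
print(decide(0.03, alpha=0.01))  # same p-value, stricter cut-off
```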


Let’s assume for a moment that someone had a hypothesis that not all calories were equal.

Hypothesis and null would be along the lines of:

H1: All else being equal, the type of calorie you eat can change the amount of weight you gain or lose.

H0: There is no difference between the type of calorie you eat and the weight gained or lost.

They proposed that in an otherwise perfectly balanced diet, people would gain more weight eating 500 calories of sugar than they would by eating 500 calories of peanuts.

Now, assuming this is operationalised effectively, all variables controlled and the like we would be left with 3 possible outcomes.

In speaking with a few researchers and reading some articles these are sometimes regarded as outcomes to support the original hypothesis, or can be considered hypotheses of their own.

Below I have formulated these as H1, H2 and H0, but if you would rather see these as possible outcomes 1-3 it results in much the same for the purpose of the conversation.

H1. Eating 500 calories of sugar causes more weight gain than 500 calories of peanuts.

H2. Eating 500 calories of peanuts causes more weight gain than 500 calories of sugar.

And we are left with our null

H0. There is no significant difference between the weight gain caused by eating either sugar or peanuts.

Either H1 or H2 supports the original H1 of there being a difference based on calories eaten. H0 supports the original null hypothesis that there is no difference.

Assuming there is no difference, the null isn’t actually accepted, it’s just not rejected for now. The null is never really considered accepted, though one could argue if you exhausted every type of food on the planet and there was no significant difference between any of it then all that remains is the null.

A good little blog on why we don’t ever accept the null: https://communitymedicine4all.com/2014/04/05/the-null-hypothesis-why-it-can-never-be-accepted/

The null hypothesis can sometimes be seen as the opposite or negation of a hypothesis. As long as there’s an H1 that is able to be operationalised then you can have an H0 that could be the opposite. However, the null is something that sits in the not rejected/rejected area, rather than being accepted if H1 is not proved true.

In this case it’s “we haven’t falsified/rejected H0 yet” rather than “we haven’t proven H1 therefore we accept H0”.

Because there was no significant difference between the consumption of each 500 calories and the amount of weight gain, we have not falsified the null.

So, I think it is pretty clear that lacking belief in gods is not the null hypothesis. Not without changing both what hypothesis and null hypothesis mean.

It can be quite ironic to see comments like this:

[Image: screenshot of a comment]

When they follow it up with statements like this:

[Image: screenshot of a follow-up statement]

So What About The Null Hypothesis Being “No God Exists”?

Matt Dillahunty recently made a very confusing statement about the null hypothesis, theism, atheism, agnosticism, and the law of excluded middle.

I’m going to do my best to be as charitable with his words here and describe a few ways to look at what he’s saying.

Matt is one of the proponents of “only one definition of atheism” – he views theism and atheism purely in terms of the psychological state of believing and not believing in gods respectively. He then might use the terms implicit and explicit atheism or weak and strong atheism or any other modifier to separate different positions.

So, what he’s saying here is “you either do or don’t believe in gods” which isn’t exactly wrong. You believe something or you don’t but that doesn’t describe your whole position. An -ism also isn’t usually a psychological state, it’s the -ist that holds a psychological state in regard to or follows an -ism.

I cover off this relationship between -ism and ist in more detail in: Why Agnostic and not Agnostist?

We’ve also discussed how psychological systems or beliefs are not hypotheses. So using this definition, neither atheism nor theism are hypotheses and therefore cannot be the null hypothesis.

He does mention, “The null hypothesis is that no god exists.”

Another definition of atheism is the proposition gods do not exist with the atheist accepting (aka believing) this proposition true. Here we see a more typical relationship between an -ist and -ism.

In fact, using the propositional definition of theism (at least one god exists) and atheism (no gods exist) you also have a dichotomy of theism and atheism in the sense that only one or the other can be true.

If p = theism = at least one god exists is false then ¬p = atheism = gods do not exist is true and vice versa.

However, our belief positions in relation to these propositions are not quite as binary. There are a few different types of not believing but the one I will focus on is the suspense of judgement. Essentially, if you cannot decide if you think a proposition is true or false then the appropriate response is to suspend judgement, this could be described as uncertainty or by someone saying “I don’t know” and is also known as the psychological state of being agnostic or weak agnosticism.

Rational Belief

So, where Matt speaks of agnostic not being some middle ground, he’s sort of missing the point around how agnostic is usually applied in regard to propositions.

It’s not a middle ground in a straight line format, though it can be described that way for ease, but theism and atheism are regarded as ontological positions because they speak directly to the nature of God’s being, whereas agnosticism is an epistemic position about an ontological position.


So, you can see by this use it doesn’t violate the law of excluded middle as Matt asserts, and seeing as LEM is used for propositions rather than beliefs (though you can phrase a belief propositionally) it makes more sense than the way Matt is using it too.

So, if we are using propositional theism and atheism, does this change if atheism (gods do not exist) is the null hypothesis?

Well, let’s pretend for the moment that theism (not just a specific claim but theism in general) is both falsifiable and can be operationalised. This means we’d be able to fully describe all the qualities of God(s), and how we would test and measure the results that would demonstrate existence. The design ought to be as such that we are looking for a significant difference between existence and nonexistence.

Using his example though, and pretend for a moment that these propositions can be operationalised into hypotheses:

H1: at least one god exists

H0: no gods exist

There might be some who think that H0 would have to be no gods exist because “you can’t prove a negative” – except you can.

H0 would be a position that was never accepted, it just meant that we’d failed to falsify it.

What if we turn it on its head though?

H1: no gods exist

H0: at least one god exists

H0 would be a position that was never accepted, it just meant that we’d failed to falsify it.

Now, these propositions are not falsifiable so are not valid hypotheses and there is no null, and even if they were it could be hard to justify why one is “THE null” over the other. Not to mention there are definitions of gods that are more naturalistic in their formation and there could genuinely be no observable difference between them existing and not.

With this in mind, if these propositions were to be operationalised, the null hypothesis is not necessarily that God does not exist. As this could be approached from either end, it’s that there is no significant difference between God existing and God not existing. You might then conclude that God does not exist, but that is not the null hypothesis. You might also conclude there isn’t enough evidence to make a decision either way and suspend judgement, aka be agnostic, and again, this is not the null hypothesis.

Now, there may be some that insist that H0 would have to be that no gods exist because “ you can’t prove a negative! ” – except you can. This is especially clear when it comes down to logical contradictions, but not the only way to prove negatives.

Ultimately, the more you dig into what a hypothesis and the null are, the more you’ll find that it just doesn’t work for the god propositions.

If atheism is a lack of belief in gods – it is describing a mental state that is not present, therefore is neither a hypothesis nor the null.

If atheism is no gods exist then it is a proposition, and as it cannot be falsified or operationalised it doesn’t count as a valid hypothesis, null or alternative.

Whilst Matt is right that there is only theism or atheism if used as the propositions at least one god exists and no gods exist, that would mean one was only a theist or atheist if they accepted one of those propositions, and one that accepted neither of them and “only lacks belief” was neither theist nor atheist. Whilst there are a variety of other non-theist positions, one of those is known as agnostic, which can be simplified to being on the fence between atheism and theism. It’s not a third option in the sense that neither atheism nor theism is true, but it is the third epistemic answer an agent could give in regard to the proposition of theism.

Is Agnostic ‘only’ About Knowledge?

Lastly is Matt’s claim that “agnostic is about knowledge”, and he’s only partially right here, he’s missing the bigger picture.

Firstly, agnosticism was an epistemic principle coined by the late TH Huxley as a response to both the theist and atheist folks of his time and their arrogance towards metaphysical claims.


He has said it a few different ways but essentially it boils down to: “We shall not say we know or believe that which we have no scientific evidence for” .

Basically, even if you lean one way, you should externally suspend judgement until there is scientific evidence for your position. (This is essentially represented by evidentialism today.)

I’m sure you also noticed the “or believe” too. So, in its coining, it wasn’t only about knowledge, and in epistemology, knowledge is usually considered a subset of belief.

Speaking of epistemology, it comes from the Greek episteme meaning knowledge. Whereas the root of agnostic is Gno which means to know. What you have to understand though, is the use of know here wasn’t necessarily speaking of knowledge in a propositional sense but could be speaking of; certainty (I know it will rain tomorrow), to perceive (I know I am being watched), awareness (I am aware the Qur’an is believed to be perfect), understanding (yeah, I know what you mean mate) and so on.

Now, we could refer to at least some of these examples as knowledge-of, but not knowledge-how or knowledge-that. Knowledge-of isn’t propositional, at least in any meaningful external way. It’s basically just an acknowledgement of holding information. Knowledge-how isn’t propositional either, it is ability based. Knowledge-how can, at times, lead to propositional knowledge (knowledge-that), but not always.

Let’s give you an example.

A friend tells you about something you’ve not heard of before, and that it’s widely accepted in the scientific community… Perhaps he even shows you a few papers and peer reviews where they conclude this is the case, but not the whole papers themselves.

You have knowledge-of this conclusion, you might believe it is generally accepted but you don’t believe it yourself.

Next, let’s say you learn how they came to this conclusion by reading the papers in more detail. If you still don’t believe this conclusion it is likely because you have “knowledge-of-how” rather than actually understanding what they are saying. Perhaps it is something that you actively need to try yourself to gain that knowledge-how.

Let’s say you have access to a lab and can perform the experiment. You see how it all works and get the same results. You now have knowledge-how. Not only that, but you also try a number of different experiments to try and prove this hypothesis false. After much experimentation and analysis, you realise that this hypothesis is true. You now have knowledge-that this hypothesis is true.

Just one thing to clarify though, science tends to take a fallibilistic take on knowledge, that is to say, it never considers anything absolutely true, just that there’s enough evidence showing something works and, as yet, has not been proven false.

To bring us back around to the point about “agnostic dealing with knowledge”: it is the knowledge-of kind. You are uncertain if you think god exists or god does not exist.

Now, I do agree we are in a living language and we could be using these scientific terms in non-scientific ways and these philosophical terms in non-philosophical ways, and everyone is within their right to do so. The only danger is when they start:

  • Demanding other people get their terminology right (e.g. telling a creationist to learn about evolution and theory) whilst not holding themselves to the same standard. (Special Pleading)
  • Insisting other people use these colloquialisms in a prescriptive manner, especially if they start arguing for these colloquialisms being the only ones that are correct. (Prescriptivism)
  • Reasserting the “only” definition in a dogmatic way, instead of showing a degree of scepticism, when people fail to comply or challenge it. (Dogmatism)
  • Othering folks who don’t adhere to their dogma. (Poisoning the well, tribalism, no true Scotsman)
  • Accusing those who suggest there is more than one definition of a word of being prescriptive when they are actually being prescriptive themselves. (Projection, lack of self awareness)

It seems quite clear to me that folks use their language in a flexible way that they feel strengthens their position. That is why they will use a colloquial version of the null hypothesis when it suits them yet demand the scientific use of evolution or theory at other times.

The irony, of course, is that other than the people that are already their undying fans, they are not strengthening their position at all. In fact, when big-name atheists like Matt make these arguments, their reach has a negative impact on how all non-atheist folk see us. It also causes unnecessary infighting when folks try to correct them, but that can’t be avoided… especially when folks claim that when they are wrong, they want to know, or that they want to believe as many true things and as few false things as possible.


So how are we to act when the people that claim evidence will change their minds, and they want to believe as many true things as possible turn out to reject evidence and prefer to believe false things?

Relationship of -ist -ism and -ic

I’m Joe. I write under the name Davidian, not only because it is a Machine Head song I enjoy but because it was a game character I used to role-play that was always looking to better himself.

This is one of many things I hope to do with Answers In Reason.

I run our Twitter and IG accounts, as well as share responsibility for our FB group and page, and maintain the site, whilst writing articles, DJing, Podcasting (and producing), keeping fit and more.

Feel free to read a more detailed bio here: https://www.answers-in-reason.com/about/authors/4/

  • Open access
  • Published: 12 May 2024

The knowledge regarding the impacts and management of black triangles among dental professionals and laypeople

  • Mahmoud K. AL-Omiri 1 , 2 ,
  • Danial Waleed Ahmad Atieh 3 ,
  • Motasum Abu-Awwad 1 ,
  • Abdullah A. Al Nazeh 4 ,
  • Salem Almoammar 4 ,
  • Saeed Awod Bin Hassan 5 ,
  • Abdallah Ahmed Aljbab 6 ,
  • Mohammed A. Alfaifi 7 ,
  • Naji M. Shat 8 &
  • Edward Lynch 9  

Scientific Reports volume 14, Article number: 10840 (2024)


  • Dental diseases
  • Restorative dentistry

This study aimed to assess the knowledge regarding impacts, causes and management of black triangles (BT) among participants from different educational backgrounds including dental students, dentists and laypeople. This descriptive cross-sectional observational research included 435 participants who comprised 4 groups: pre-clinical (3rd year) dental students, clinical (4th and 5th year) dental students, dentists, and laypeople. A constructed self-reported questionnaire was utilized to assess participants’ demographic data and their knowledge of the impacts, causes and management of BT. The VAS scale was used to assess participants’ ratings for the impacts of BT on esthetics, with 0 meaning no impact and 10 meaning very severe negative impacts. The most reported treatments for BT were “cannot be treated” 99.3% and “non-surgical periodontal treatment” 67.1%. Meanwhile, the least reported was “modify the porcelain” 41.8%. The most reported cause of BT was “periodontal disease” 85.1%. However, the least reported were “parafunction” and “deep implants” 33.1% each. Dental professionals had better knowledge of the causes (t = 8.189, P < 0.001) and management (t = 8.289, P < 0.001) of BT than the non-dental participants. The dentists had the best knowledge, while the laypeople had the least knowledge of the causes (F = 62.056, P < 0.001) and treatment (F = 46.120, P < 0.001) of BT. The knowledge of the causes (t = 0.616, P = 0.538) and treatment (t = 1.113, P = 0.266) for BT was not significantly different between males and females. Age was not significantly related to the total knowledge about the causes (r = −0.034, P = 0.475) or treatment (r = −0.034, P = 0.482) for BT. Dental professionals had better knowledge of the impacts, causes and management of BT than the non-dental participants. The dentists were the best, while the laypeople were the worst in this regard. Age and gender had no relationships with the knowledge of causes or management of BT.

Introduction

The loss of interdental gingival papillary tissue results in the formation of a triangular space between the dentition known as open gingival embrasures or black triangles 1 , 2 , 3 , 4 . This might result in esthetic troubles, speech problems, food impaction and/or improper plaque control 5 , 6 , 7 . Black triangles, especially between the central incisors, are considered among the worst esthetic factors that negatively impact smile esthetics 8 , 9 , 10 , 11 , 12 .

The loss of support for interdental papillae is multifactorial and may result from the loss of tooth contact, loss of bone, or an increased distance from the tooth contact point to the bony crest. These changes can be caused by several factors, including periodontal disease, periodontal surgery, traumatic insults, improper tooth surface contours, aging, tooth spacing and loss of teeth 5 , 13 , 14 . Also, orthodontic treatment 15 and implant restorations are associated with higher chances of papillary loss 16 , 17 , 18 .

Currently, management of black triangles includes prosthetic 14 , 19 , 20 , orthodontic 21 , and surgical approaches 22 , 23 as well as tissue regeneration 24 and tissue volumising 25 . Considering the difficulty in regenerating the interdental papillary tissue 26 , it is important to prevent black triangles by having enough support for the interdental papilla and not exceeding certain dimensions between the contact of teeth and the alveolar bone crest 5 . This is challenging because regenerating lost tissue is difficult, and maintaining the interdental papillary tissue volume depends on specific conditions that are hard to achieve 8 , 26 . Also, patients preferred long interdental contacts to the presence of black triangles 8 .

Perception of esthetics is a complicated dynamic phenomenon affected by multiple dimensions including geographic, demographic (gender, age and education), socio-cultural and psychological factors 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 . Furthermore, previous research demonstrated significant differences between patients’ and dentists’ opinions regarding face and smile esthetics 35 , 36 .

Dental professionals were found to be more critical in their judgment of dental and smile esthetics than laypeople 12 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , and this might be due to their dental education 31 . In addition, dental specialists perceive the black triangle as less attractive than non-specialists or laypeople 28 , 37 , 39 . Moreover, younger patients and females perceive black triangles as less attractive than males and older patients 45 , 46 . Nevertheless, laypeople and periodontists were found to consider the inflamed gingiva as worse than black triangles 12 .

This potentially inspired investigators to better understand how to prevent and manage black triangles. In fact, a successful treatment would probably result when the goals and expectations of the patient and the clinician overlap 12 , 47 . This may help in directing the appropriate treatment to the patient and thereby save time, effort, and costs 29 , 30 , 33 , 34 .

The literature lacks studies that investigate the knowledge of participants concerning the causes and management of black triangles. In addition, the literature lacks studies that explore the knowledge among different study groups including dentists, clinical dental students, preclinical dental students and laypeople. Furthermore, the literature is short of studies concerning the associations between knowledge regarding black triangles and age and gender.

Consequently, this study was conducted to explore the knowledge regarding the impacts, causes and management of black triangles among participants from different educational backgrounds. This could add further guidance to better understand the factors involved in the perception and management of black triangles.

The aim of the current study was to identify the knowledge regarding the impacts, causes and management of black triangles, and the relationship between the knowledge and the educational background among preclinical dental students, clinical dental students, dentists and laypeople.

The null hypothesis for this study was that there is no difference in the knowledge regarding the impacts, causes and management of black triangles between preclinical dental students, clinical dental students, dentists and laypeople.

Materials and methods

Study design and population

This descriptive cross-sectional, observational investigation was conducted between June 2022 and October 2022 in the University of Jordan considering the guidelines of the Helsinki Declaration (9th version, 2013). It was ethically approved by the Institutional Review Board (IRB) of the University of Jordan (Reference number: 19-2022-238 dated 17-4-2022). A signed written informed consent was provided by each participant before inclusion in the study.

The participants were invited to participate and were recruited from their laboratories (3rd year pre-clinical dental students), clinics (4th and 5th year dental students), offices (employees) and practices (dentists).

Simple randomization utilizing computer-generated numbers was used to select the place (laboratories, clinics, offices and practices) of recruitment. Non-probability, convenience, and purposive sampling was used to recruit the participants in this study.

The invitation to participate in this study was extended to 450 participants, and 435 accepted to participate and were recruited (response rate = 96.7%). The study sample consisted of 4 groups including 3rd year preclinical dental students, 4th and 5th year clinical dental students, dentists and laypeople.

The participants were included if they were able to comprehend the questionnaire, did not have debilitating disease or mental disorders and were able to provide a signed informed consent. Also, dentists were included if they were currently practicing dentistry and registered with the Jordan Dental Association.

Dentists were excluded if they were not practicing or not registered with the Jordan Dental Association. Also, participants with history of mental disorders or debilitating disease were excluded.

Study instruments and procedures

After recruitment, the participants were requested to complete a constructed self-reported questionnaire adopted from Atieh (2023) 48 , where it was developed, used, and validated. The development of the questionnaire involved reviewing the relevant literature and drafting the available causes, impacts and management of black triangles. Then, a panel of dental professionals with previous experience with black triangles (4 prosthodontists, 2 periodontists, 2 orthodontists, 1 oral surgeon, and 2 general dental practitioners) was consulted regarding the drafted causes, impacts and management of black triangles. They were requested to comment on the clarity of the drafted questionnaire as well as to add any missing causes, impacts and managements of black triangles. Following their feedback, a final draft was prepared and sent back to the consulted professionals for final suggestions, after which the questionnaire was finalized. That previous investigation started with a pilot study, which validated and tested the questionnaire for clarity and effectiveness 48 . In addition, test–retest reliability was assessed by Atieh (2023) 48 as well as during this investigation to indicate the reliability of the used questionnaire.

To assess the reliability of participants’ responses to the questions, forty participants (ten from each group) were asked to answer the questions twice with a one week interval between the two occasions. In this regard, the Kappa value ranged between 0.8 and 0.9 for the tested questions, indicating an adequate reliability.
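As a rough illustration of the test–retest check described above, the sketch below computes a kappa coefficient for one questionnaire item answered on two occasions. It assumes Cohen's kappa is the statistic meant (the text only says "Kappa"), and the yes/no answers are hypothetical, not study data.

```python
from collections import Counter

def cohen_kappa(first, second):
    """Cohen's kappa for two equally long lists of categorical answers."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty answer lists of equal length")
    n = len(first)
    # Observed agreement: share of participants giving the same answer twice.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement from the marginal answer frequencies of each occasion.
    freq_first, freq_second = Counter(first), Counter(second)
    expected = sum(freq_first[c] * freq_second[c] for c in freq_first) / (n * n)
    if expected == 1.0:
        return 1.0  # every answer identical on both occasions
    return (observed - expected) / (1.0 - expected)

# Hypothetical yes/no answers from ten participants, one week apart.
week_1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes"]
week_2 = ["yes", "yes", "no", "yes", "no", "yes", "yes", "yes", "no", "yes"]
kappa = cohen_kappa(week_1, week_2)
```

In this made-up example nine of ten answers agree, yet kappa is lower than 0.9 because part of that agreement is expected by chance.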

The utilized questionnaire in this study included 4 parts. The first part included items to record the demographic data of the participants including gender, age, level of education, educational background, marital status, place of residence, income and experience for dentists.

The second part of the questionnaire included items to assess participants’ knowledge and awareness of black triangles in the everyday life, items to record whether the participants had previous experiences with black triangles, and items with VAS scales to measure their ratings for the impacts of black triangles on the esthetics and appearance of individuals.

The visual analogue scale (VAS) was used to assess participants’ ratings for the impacts of black triangles on esthetics and the appearance of individuals, where 0 meant no impact and 10 meant very severe negative impacts. The VAS was used in this study because it is considered a simple, valid and reliable method for assessment 48 , 49 , 50 , 51 . Also, an adequate level of reliability was shown for the VAS when it was used in previous literature regarding black triangles 48 , 51 .

The third part of the questionnaire included items to assess participants’ knowledge of the possible causes of black triangles. This part of the questionnaire assessed the participants’ knowledge of 11 investigated causes of black triangle. The participants were asked whether each one of the investigated causes could be a possible cause for black triangles or not. A total score of knowledge about causes of black triangles was calculated by denoting one for each correct answer selected by the participant (possible minimum score is 0 and possible maximum score is 11). The participants were also asked to report any other possible cause for black triangles that was not mentioned in the questionnaire.

The fourth part of the questionnaire included items to assess participants’ knowledge of the available management of black triangles. The study investigated the participants’ knowledge of 8 investigated treatments of black triangle. The participants were asked whether each one of the investigated managements could be a possible management for black triangles or not. A total score of knowledge about treatment of black triangles was calculated by denoting one for each correct answer selected by the participant (possible minimum score is 0 and possible maximum score is 8). The participants were also asked to report any other possible management for black triangles that was not mentioned in the questionnaire.
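The scoring rule used for both knowledge sections (one point per correct answer, summed to a total) can be sketched as follows. The function name, the three-item excerpt, the answer key values and the responses are all illustrative assumptions; only the item labels come from the text.

```python
def knowledge_score(responses, answer_key):
    """Award one point for each item answered in line with the key."""
    return sum(1 for item, correct in answer_key.items()
               if responses.get(item) == correct)

# Hypothetical three-item excerpt of the causes section. The item labels appear
# in the text, but the key values and the responses are illustrative assumptions.
causes_key = {"periodontal disease": True, "bone loss": True, "parafunction": True}
participant = {"periodontal disease": True, "bone loss": False, "parafunction": True}

score = knowledge_score(participant, causes_key)
```

With the full instrument the same function would return a value between 0 and 11 for causes and between 0 and 8 for treatments; unanswered items simply earn no point.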

Study outcome measures

The main outcome measures for this study were participants’ knowledge regarding impacts, causes, and management of black triangles, and the level of participants’ education. The secondary outcome measures were the relationship between participants’ demographics (age, gender, and educational background) and their knowledge regarding black triangles.

Statistical analysis

The statistical analysis for this investigation was carried out utilizing the Statistical Package for Social Sciences (IBM SPSS Statistics v23.0; IBM Corp., USA). The data was examined for normal distribution and the appropriate statistical tests were then utilized. The continuous data was expressed as means, standard errors, standard deviations and confidence intervals, while the categorical data was described as frequencies, percentages, medians, minimum, maximum and interquartile ranges.

Correlations between parametric variables were tested utilizing Pearson’s r test and the point-biserial correlation (r). The independent-samples Student’s t-test was used for two-group comparisons, and the one-way analysis of variance (ANOVA) test and post hoc analyses were used for comparison between more than two groups. Comparisons for non-parametric dependent variables (each of the tested causes and treatments of black triangles) between dental and non-dental participants were done using the Chi Square test. In addition, two-step hierarchical multiple linear regression analyses were carried out to examine the predictive power of the group and being from dental or non-dental backgrounds on the level of knowledge regarding black triangles, while controlling for the age and gender of participants. The significance level was set as two-tailed with P < 0.05 and 95% confidence intervals for all the analyses executed.
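For readers less familiar with the point-biserial correlation mentioned above, the following sketch shows that it is the same quantity as Pearson's r when one variable is coded 0/1. The data are hypothetical, and the analysis in the study itself was run in SPSS, not with this code.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def point_biserial(binary, scores):
    """Point-biserial r from group means; `binary` must be coded 0/1."""
    n = len(scores)
    group_1 = [s for b, s in zip(binary, scores) if b == 1]
    group_0 = [s for b, s in zip(binary, scores) if b == 0]
    mean_all = sum(scores) / n
    sd_population = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_gap = sum(group_1) / len(group_1) - sum(group_0) / len(group_0)
    return mean_gap / sd_population * sqrt(len(group_1) * len(group_0)) / n

# Hypothetical data: a 0/1 coded binary variable and a knowledge score.
coded_group = [0, 0, 0, 1, 1, 1, 1, 0]
knowledge = [3, 5, 4, 6, 7, 5, 8, 4]
r_pb = point_biserial(coded_group, knowledge)
r_pearson = pearson_r(coded_group, knowledge)
```

Both routes give the same coefficient, which is why a single r value can be reported for a binary variable alongside the Pearson correlations.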

The G*Power program (version 3.1.9.7) was used to perform a priori power analysis to determine the appropriate sample size for this investigation. The ANOVA test for multiple independent variables was utilized with a total of 4 groups, a power of 0.80, a significance level of 0.05 and an effect size of 0.25 based on Alomari et al. 2022 12 . This estimated a sample size of 180 participants. Allowing for a potential attrition rate of 20%, a sample size of 220 subjects was approximated. The invitation to participate was extended to 450 individuals, and 435 responded and participated in this investigation (response rate = 96.7%). The participants were the same cohort as in a previous investigation 51 .
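The text does not say how the 20% attrition allowance was applied to the 180 participants. As a sketch only, the two conventions commonly used for this adjustment are shown below; the function names are hypothetical, and the approximate figure of 220 falls between the two results.

```python
import math

def inflate_by_multiplying(n_required, attrition_percent):
    """Add the attrition percentage on top of the required sample size."""
    return math.ceil(n_required * (100 + attrition_percent) / 100)

def inflate_by_dividing(n_required, attrition_percent):
    """Recruit enough that the required size remains after attrition."""
    return math.ceil(n_required * 100 / (100 - attrition_percent))

N_REQUIRED = 180        # from the a priori power analysis reported in the text
ATTRITION_PERCENT = 20  # allowance reported in the text

lower_target = inflate_by_multiplying(N_REQUIRED, ATTRITION_PERCENT)
upper_target = inflate_by_dividing(N_REQUIRED, ATTRITION_PERCENT)
```

Either way, the 435 participants actually recruited exceed the target.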

Results

Overall, 435 participants (136 males (31.3%) and 299 females (68.7%)) were recruited, and had their data collected and analyzed. The participants’ mean age was 28 years (SD = 10 years, age range = 18–78 years, 95% CI = 27–29 years).
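The reported confidence interval for mean age can be checked from the summary figures alone. This is a minimal sketch assuming a normal approximation (z = 1.96); the statistical package may have used a t-based interval, which is nearly identical at this sample size.

```python
from math import sqrt

def mean_confidence_interval(mean, sd, n, z=1.96):
    """Approximate 95% CI for a mean from summary statistics."""
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width

# Summary values reported in the text: mean 28, SD 10, n = 435.
low, high = mean_confidence_interval(28, 10, 435)
```

Rounded to whole years, the bounds match the 27–29 interval given above.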

Table 1 demonstrates the distribution of participants’ demographic data in this study. The study sample comprised 4 groups: dentists (n = 110), pre-clinical (3rd year) dental students (n = 104), clinical (4th and 5th year) dental students (n = 110) and laypeople (n = 111) (Table 1 ).

General awareness of BT and knowledge of BT impacts on smile attractiveness

Table 2 shows the participants’ general awareness of black triangles and their knowledge of the significance and impacts of black triangles on smile attractiveness among the study sample. The dentists reported the highest general awareness of black triangles whilst the laypeople reported the least general awareness of the problem. The VAS scores for rating the impacts of black triangles on the esthetics and appearance of individuals were significantly different between groups (F = 3.769, P = 0.011). Further comparisons using Tukey post hoc test revealed that dentists (mean difference = −0.8119, P = 0.014) and clinical dental students (mean difference = −0.7337, P = 0.030) reported more negative impacts of black triangles on esthetics and appearance than laypeople.

Dentists, clinical and preclinical dental students heard more about black triangles than laypeople (P < 0.05, Table 3 ). Dentists saw more BT between teeth and prosthesis than clinical dental students, preclinical students and laymen (P < 0.05, Table 3 ). Clinical dental students saw more BT between dental prosthesis than preclinical dental students and laymen (P < 0.001, Table 3 ).

Knowledge of the causes and treatment of BT amongst the participants

Table 4 demonstrates the distribution of the knowledge regarding the causes and treatment of black triangles amongst the study participants. The most reported cause of black triangles among the study sample was “periodontal disease” (n = 370) followed by “bone loss” (n = 232). However, the least reported cause of black triangles was the “increased overjet/overbite” (n = 76) followed by “parafunction” and “deep implants” (n = 144 each) (Table 4 ). Meanwhile, the most reported treatment for black triangles among the study sample was “cannot be treated” (n = 432) followed by “non-surgical periodontal treatment” (n = 292). However, the least reported treatment for black triangles was “surgery without bone graft” (n = 106) followed by “removing implants” (n = 112) (Table 4 ).

Table 5 shows the presence of significant differences in participants’ knowledge about the causes (F = 62.056, P < 0.001) and treatment (F = 46.120, P < 0.001) of black triangles between the study groups. Further comparisons using the Scheffe post hoc test revealed that dentists had better knowledge about the causes of black triangles than clinical dental students, pre-clinical dental students and laypeople (P < 0.05, Table 5 ). Similarly, clinical dental students had better knowledge about the causes of black triangles than pre-clinical dental students and laypeople (P < 0.001, Table 5 ). Pre-clinical dental students also had better knowledge about the causes of black triangles than laypeople (P = 0.037, Table 5 ). Furthermore, dentists had better knowledge regarding the treatment for black triangles than pre-clinical dental students and laypeople (P < 0.001, Table 5 ). Similarly, clinical dental students had better knowledge about the treatment for black triangles than pre-clinical dental students and laypeople (P < 0.001, Table 5 ).

Furthermore, the participants with dental backgrounds (Mean = 5.31 ± 2.55) had better knowledge about the causes of black triangles (t = 8.189, P < 0.001) than the non-dental participants (Mean = 2.85 ± 2.25). Also, the participants with dental backgrounds (Mean = 4.53 ± 1.74) had better knowledge about the treatment for black triangles (t = 8.289, P < 0.001) than the non-dental participants (Mean = 3.05 ± 1.41).

Additionally, the dental participants demonstrated significantly better knowledge (P < 0.05, Table 6 ) regarding each tested cause of black triangles than the non-dental participants, except for “parafunction” (χ² = 0.000, P = 0.989) and “increased overjet/overbite” (χ² = 0.900, P = 0.343). Furthermore, the dental participants demonstrated significantly better knowledge (P < 0.05, Table 6 ) regarding each tested type of treatment for black triangles than the non-dental participants, except for “removing implants” (χ² = 3.314, P = 0.069) and “cannot be treated” (χ² = 1.124, P = 0.289).

However, no significant relationship was identified between participants’ age and the total knowledge about the causes (r = −0.034, P = 0.475) or the treatment (r = −0.034, P = 0.482) for black triangles. Besides, no significant differences were found between males and females regarding the knowledge of the causes (t = 0.616, P = 0.538) and treatment (t = 1.113, P = 0.266) for black triangles.

The two-step multiple hierarchical regression analyses showed that the group significantly contributed to the total knowledge regarding the causes of black triangles (R² = 0.296, R² change = 0.290, B = −1.361, β = −0.566, t = −8.432, P < 0.001, 95% CI of B = −1.678 to −1.061). Based on the unstandardized coefficient, being a dentist was associated with a causes-knowledge score about 1.361 points higher than that of clinical dental students, 2.72 points higher than that of preclinical dental students, and 4.08 points higher than that of laypeople.

Also, the group significantly contributed to the total knowledge regarding the treatment of black triangles (R² = 0.238, R² change = 0.234, B = −0.770, β = −0.486, t = −6.961, P < 0.001, 95% CI of B = −0.987 to −0.552). Based on the unstandardized coefficient, being a dentist was associated with a treatment-knowledge score about 0.77 points higher than that of clinical dental students, 1.54 points higher than that of preclinical dental students, and 2.31 points higher than that of laypeople.
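The group comparisons quoted for both regression models are consistent with reading B as a per-step difference in the knowledge score. The sketch below reproduces them under an assumption the text does not state explicitly: that the four groups were coded in equal steps with dentists at one end.

```python
# Unstandardized coefficients (B) reported in the text for the "group" predictor.
B_CAUSES = -1.361
B_TREATMENT = -0.770

# ASSUMPTION (not stated in the text): groups are coded in equal steps,
# dentists first, so each comparison group sits 1, 2 or 3 steps away.
STEPS_FROM_DENTISTS = {
    "clinical dental students": 1,
    "preclinical dental students": 2,
    "laypeople": 3,
}

def score_gaps(b_per_step):
    """Predicted score difference versus dentists for each comparison group."""
    return {group: round(abs(b_per_step) * steps, 2)
            for group, steps in STEPS_FROM_DENTISTS.items()}

causes_gaps = score_gaps(B_CAUSES)
treatment_gaps = score_gaps(B_TREATMENT)
```

Because this is a linear model, the gaps are differences in score points on the 0–11 and 0–8 scales, and they grow in equal increments from one group to the next.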

Discussion

The results of this study revealed the existence of associations between the knowledge regarding black triangles and the study group as well as being from dental or non-dental background. Consequently, the null hypothesis was rejected.

The findings showed that the dentists had experienced more cases of black triangles in comparison to dental students and laypeople, possibly due to having more clinical experience. Also, participants with dental educational backgrounds heard more about black triangles than laypeople. This could be explained by the lack of exposure of laypeople to dental education compared to the other groups. Also, dentists and clinical dental students reported more negative impacts of black triangles on esthetics than laypeople.

This concurs with other findings showing that individuals with a dental background were stricter in their evaluation of different esthetic parameters than the laypeople 12 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 52 , 53 , 54 , 55 . However, this opposes other studies that could not find any difference 28 , 56 , 57 , 58 , 59 , 60 , 61 .

In contrast to this study, Kay et al. (2014) reported no difference in disutility perception between dental professionals and patients in relation to tooth loss 61 . They found that both dental professionals and patients value tooth loss similarly and reported more disutility as the missing teeth were nearer to the front of the mouth, except for the loss of the upper canine that was rated to cause more disutility by the dental professionals. This relates to the black triangles problem as both the loss of anterior teeth and black triangles cause spaces that lead to negative impacts on esthetics.

In addition, missing teeth would cause larger spaces than the ones that result from black triangles, and this might account for negatively perceiving the esthetics regardless of being a dental professional or a patient. The differences in psychological, cultural, and social factors could also account for this contrast, as well as differences in the tested parameters and methodologies adopted during these studies.

Dental participants demonstrated better knowledge than the non-dental ones in most of the knowledge items related to the causes of and treatments for black triangles, which might be reflected by the dental education that they were exposed to.

The findings also demonstrated that knowledge regarding causes and treatment of black triangles was best among dentists, followed by clinical dental students, then preclinical dental students, and finally laypeople. This may be explained by the fact that dentists had already completed their dental degree, and that clinical students were further ahead in their degree than the pre-clinical dental students, and so they were more likely to have gained greater knowledge related to the black triangles and be more educated than the other groups. In addition, the laypeople had no dental education in this regard which resulted in them having the least knowledge regarding the causes and treatment of black triangles. No studies could be found that compared those specific aspects, but in a similar manner, the study by Costa and colleagues found that dentists had greater knowledge about sedation, followed by dental students and laypeople 62 . Moreover, the work by Al-Omiri and his group found that students in the higher years had better knowledge about oral health 30 , 63 .

No differences in knowledge were found between the male and female participants in this study. As no studies looking particularly at the knowledge about black triangles could be found, studies about knowledge of other dental aspects are referred to instead. For instance, studies have found opposing results, where females had better knowledge than males about the relation of sugar intake and dental caries 64 , 65 , 66 . Furthermore, females also had better knowledge about oral hygiene practices and oral health than males 30 , 66 , 67 , 68 . Those differences may be explained by the different aspects tested in this study; in addition, female participants were more represented in the sample, which calls for cautious interpretation. Utilizing various methods to measure knowledge and perception might also underlie this contrast.

Also, some researchers investigated the perception of black triangles as well as other esthetic parameters and concluded that women were more judgmental in their evaluation of black triangles and perceived them as less attractive than men 46 , 69 . However, this does not agree with the results of other studies investigating different esthetic factors 12 , 50 , 70 , 71 . This might be due to variations in the methods used to evaluate perception, differences in tested esthetic parameters as well as the sample demographics and the number of female participants.

Furthermore, no significant relationships were identified between participants’ age and the total knowledge about the causes or the treatment for black triangles. This might be related to the exposure of individuals to social media and having information regardless of the age.

No studies were available to compare with in this regard, so studies that tested other aspects are referred to. For example, this does not agree with previous findings that younger dentists were more familiar with preventive measures than the older counterparts, which is because they were exposed to the more recent dental education curriculum that puts more emphasis on the preventive approaches 72 . Also, multiple studies have shown that older subjects are less critical when it comes to esthetics 30 , 46 , 73 , 74 . Nonetheless, this was not shown in other studies 12 , 71 . This contrast might be attributed to variations in evaluated age groups and sample demographics, differences in the evaluated aspects of esthetics, and differences in education.

The limitations of the present study included that racial, social and cultural factors were not considered. Besides, the age and gender distribution were beyond control among some groups such as the dental students who had a small age range. However, careful interpretation of those factors was undertaken. In addition, the confounding effects of age and gender were considered in the hierarchical regression analysis. Furthermore, the responses to the study instrument were subjective and self-reported by the participants; however, the utilized questionnaire was simple, clear, easy to score, and the participants were well informed and had any query answered by the investigators. Also, the reliability of the items was tested and ensured. Furthermore, the participants were recruited from available locations, which may potentially limit the generalizability of the findings of the study.

More investigations are required to highlight the possible effects of cultural, social, personality and racial factors on the knowledge and perception of black triangles and the role of different educational backgrounds in this regard. Comparisons between participants from different social, cultural, and racial backgrounds would highlight the impacts of how black triangles are perceived by different populations, and provide an insight into a more holistic understanding of the black triangles problem. Evaluation of personality might also identify how various personality factors potentially impact the perception of black triangles. Also, further investigations using larger samples are advisable on different populations.

Conclusions

Within the limitations of this research, it was concluded that dental professionals have more negative perception of the impacts of black triangles on esthetics than laypeople. In addition, having a dental educational background was associated with better knowledge about the impacts, causes and treatment of black triangles.

Data availability

Data generated and analysed during this study are available from the corresponding author upon request.

References

Gonzalez, M. K. et al. Interdental papillary house: A new concept and guide for clinicians. Int. J. Periodontics Restor. Dent. 31 (6), e87-93 (2011).

Sharma, A. A. & Park, J. H. Esthetic considerations in interdental papilla: Remediation and regeneration. J. Esthet. Restor. Dent. 22 (1), 18–28. https://doi.org/10.1111/j.1708-8240.2009.00307.x (2010).

Sharma, P. & Sharma, P. Dental smile esthetics: The assessment and creation of the ideal smile. Semin. Orthod. 18 , 193–201. https://doi.org/10.1053/j.sodo.2012.04.004 (2012).

Pugliese, F., Hess, R. & Palomo, L. Black triangles: Preventing their occurrence, managing them when prevention is not practical. Semin. Orthod. 25 (2), 175–186. https://doi.org/10.1053/j.sodo.2019.05.006 (2019).

Tarnow, D. P., Magner, A. W. & Fletcher, P. The effect of the distance from the contact point to the crest of bone on the presence or absence of the interproximal dental papilla. J. Periodontol. 63 (12), 995–996. https://doi.org/10.1902/jop.1992.63.12.995 (1992).

Sarver, D. M. Principles of cosmetic dentistry in orthodontics: Part 1. Shape and proportionality of anterior teeth. Am. J. Orthod. Dentofac. Orthop. 126 (6), 749–753. https://doi.org/10.1016/j.ajodo.2004.07.034 (2004).

An, S. S., Choi, Y. J., Kim, J. Y., Chung, C. J. & Kim, K. H. Risk factors associated with open gingival embrasures after orthodontic treatment. Angle Orthod. 88 (3), 267–274. https://doi.org/10.2319/061917-399.12 (2018).

Hochman, M. N., Chu, S. J., da Silva, B. P. & Tarnow, D. P. Layperson’s esthetic preference to the presence or absence of the interdental papillae in the low smile line: A web-based study. J. Esthet. Restor. Dent. 31 (2), 113–117. https://doi.org/10.1111/jerd.12478 (2019).

Cunliffe, J. & Pretty, I. Patients’ ranking of interdental “black triangles” against other common aesthetic problems. Eur. J. Prosthodont. Restor. Dent. 17 (4), 177–181 (2009).

Foulger, T. E., Tredwin, C. J., Gill, D. S. & Moles, D. R. The influence of varying maxillary incisal edge embrasure space and interproximal contact area dimensions on perceived smile aesthetics. Br. Dent. J. 209 (3), E4. https://doi.org/10.1038/sj.bdj.2010.719 (2010).

Batra, P., Daing, A., Azam, I., Miglani, R. & Bhardwaj, A. Impact of altered gingival characteristics on smile esthetics: Laypersons’ perspectives by Q sort methodology. Am. J. Orthod. Dentofac. Orthop. 154 (1), 82-90.e82. https://doi.org/10.1016/j.ajodo.2017.12.010 (2018).

Alomari, S. A., Alhaija, E. S. A., AlWahadni, A. M. & Al-Tawachi, A. K. Smile microesthetics as perceived by dental professionals and laypersons. Angle Orthod. 92 (1), 101–109. https://doi.org/10.2319/020521-108.1 (2022).

Singh, V. P., Uppoor, A. S., Nayak, D. G. & Shah, D. Black triangle dilemma and its management in esthetic dentistry. Dent. Res. J. (Isfahan) 10 (3), 296–301 (2013).

Ziahosseini, P., Hussain, F. & Millar, B. J. Management of gingival black triangles. Br. Dent. J. 217 (10), 559–563. https://doi.org/10.1038/sj.bdj.2014.1004 (2014).

Rashid, Z. J., Gul, S. S., Shaikh, M. S., Abdulkareem, A. A. & Zafar, M. S. Incidence of gingival black triangles following treatment with fixed orthodontic appliance: A systematic review. Healthcare (Basel). 10 (8), 1373. https://doi.org/10.3390/healthcare10081373 (2022).

Choquet, V. et al. Clinical and radiographic evaluation of the papilla level adjacent to single-tooth dental implants. A retrospective study in the maxillary anterior region. J. Periodontol. 72 (10), 1364–1371. https://doi.org/10.1902/jop.2001.72.10.1364 (2001).

Tarnow, D. et al. Vertical distance from the crest of bone to the height of the interproximal papilla between adjacent implants. J. Periodontol. 74 (12), 1785–1788. https://doi.org/10.1902/jop.2003.74.12.1785 (2003).

Ryser, M. R., Block, M. S. & Mercante, D. E. Correlation of papilla to crestal bone levels around single tooth implants in immediate or delayed crown protocols. J. Oral Maxillofac. Surg. 63 (8), 1184–1195. https://doi.org/10.1016/j.joms.2005.04.025 (2005).

Alani, A., Maglad, A. & Nohl, F. The prosthetic management of gingival aesthetics. Br. Dent. J. 210 (2), 63–69. https://doi.org/10.1038/sj.bdj.2011.2 (2011).

An, H. S., Park, J. M. & Park, E. J. Evaluation of shear bond strengths of gingiva-colored composite resin to porcelain, metal and zirconia substrates. J. Adv. Prosthodont. 3 (3), 166–171. https://doi.org/10.4047/jap.2011.3.3.166 (2011).

Cardaropoli, D. & Re, S. Interdental papilla augmentation procedure following orthodontic treatment in a periodontal patient. J. Periodontol. 76 (4), 655–661. https://doi.org/10.1902/jop.2005.76.4.655 (2005).

Cortellini, P. & Tonetti, M. S. Microsurgical approach to periodontal regeneration. Initial evaluation in a case cohort. J. Periodontol. 72 (4), 559–569. https://doi.org/10.1902/jop.2001.72.4.559 (2001).

Kotschy, P. & Laky, M. Reconstruction of supracrestal alveolar bone lost as a result of severe chronic periodontitis. Five-year outcome: Case report. Int. J. Periodontics Restor. Dent. 26 (5), 425–431 (2006).

McGuire, M. K. & Scheyer, E. T. A randomized, double-blind, placebo-controlled study to determine the safety and efficacy of cultured and expanded autologous fibroblast injections for the treatment of interdental papillary insufficiency associated with the papilla priming procedure. J. Periodontol. 78 (1), 4–17. https://doi.org/10.1902/jop.2007.060105 (2007).

Ficho, A. C. et al. Is interdental papilla filling using hyaluronic acid a stable approach to treat black triangles? A systematic review. J. Esthet. Restor. Dent. 33 (3), 458–465. https://doi.org/10.1111/jerd.12694 (2021).

Carnio, J. Surgical reconstruction of interdental papilla using an interposed subepithelial connective tissue graft: A case report. Int. J. Periodontics Restor. Dent. 24 (1), 31–37 (2004).

Patnaik, G., Singla, R. K. & Bala, S. Anatomy of a beautiful face and smile. J. Anat. Soc. India. 52 , 74–80 (2003).

Kokich, V. O., Kokich, V. G. & Kiyak, H. A. Perceptions of dental professionals and laypersons to altered dental esthetics: Asymmetric and symmetric situations. Am. J. Orthod. Dentofac. Orthop. 130 (2), 141–151. https://doi.org/10.1016/j.ajodo.2006.04.017 (2006).

Al-Omiri, M. K. & Karasneh, J. Relationship between oral health-related quality of life, satisfaction, and personality in patients with prosthetic rehabilitations. J. Prosthodont. 19 (1), 2–9. https://doi.org/10.1111/j.1532-849X.2009.00518.x (2010).

Al-Omiri, M. K., Barghout, N. H., Shaweesh, A. I. & Malkawi, Z. Level of education and gender-specific self-reported oral health behavior among dental students. Oral Health Prev. Dent. 10 (1), 29–35 (2012).

Mehl, C., Wolfart, S., Vollrath, O., Wenz, H. J. & Kern, M. Perception of dental esthetics in different cultures. Int. J. Prosthodont. 27 (6), 523–529. https://doi.org/10.11607/ijp.3908 (2014).

Sütterlin, C. & Yu, X. Aristotle’s dream: Evolutionary and neural aspects of aesthetic communication in the arts. Psych. J. 10 (2), 224–243. https://doi.org/10.1002/pchj.416 (2021).

Al Nazeh, A. A. et al. Relationship between oral health impacts and personality profiles among orthodontic patients treated with Invisalign clear aligners. Sci. Rep. 10 (1), 20459. https://doi.org/10.1038/s41598-020-77470-8 (2020).

Article   CAS   PubMed   PubMed Central   Google Scholar  

Al-Omiri, M. K. et al. Oral health status, oral health–related quality of life and personality factors among users of three-sided sonic-powered toothbrush versus conventional manual toothbrush. Int. J. Dent. Hyg. 21 (2), 371–381. https://doi.org/10.1111/idh.12642 (2023).

Jørnung, J. & Fardal, Ø. Perceptions of patients’ smiles: A comparison of patients’ and dentists’ opinions. J. Am. Dent. Assoc. 138 (12), 1544–1553. https://doi.org/10.14219/jada.archive.2007.0103 (2007) ( quiz 1613–1544 ).

Sriphadungporn, C. & Chamnannidiadha, N. Perception of smile esthetics by laypeople of different ages. Prog. Orthod. 18 (1), 8. https://doi.org/10.1186/s40510-017-0162-4 (2017).

Kokich, V. O. Jr., Kiyak, H. A. & Shapiro, P. A. Comparing the perception of dentists and lay people to altered dental esthetics. J. Esthet. Dent. 11 (6), 311–324. https://doi.org/10.1111/j.1708-8240.1999.tb00414.x (1999).

Moore, T., Southard, K. A., Casko, J. S., Qian, F. & Southard, T. E. Buccal corridors and smile esthetics. Am. J. Orthod. Dentofac. Orthop. 127 (2), 208–213. https://doi.org/10.1016/j.ajodo.2003.11.027 (2005) ( quiz 261 ).

LaVacca, M. I., Tarnow, D. P. & Cisneros, G. J. Interdental papilla length and the perception of aesthetics. Pract. Proced. Aesthet. Dent. 17 (6), 405–412 (2005) ( quiz 414 ).

Machado, A. W., McComb, R. W., Moon, W. & Gandini, L. G. Jr. Influence of the vertical position of maxillary central incisors on the perception of smile esthetics among orthodontists and laypersons. J. Esthet. Restor. Dent. 25 (6), 392–401. https://doi.org/10.1111/jerd.12054 (2013).

Betrine Ribeiro, J., Alecrim Figueiredo, B. & Wilson Machado, A. Does the presence of unilateral maxillary incisor edge asymmetries influence the perception of smile esthetics?. J. Esthet. Restor. Dent. 29 (4), 291–297. https://doi.org/10.1111/jerd.12305 (2017).

Magne, P., Salem, P. & Magne, M. Influence of symmetry and balance on visual perception of a white female smile. J. Prosthet. Dent. 120 (4), 573–582. https://doi.org/10.1016/j.prosdent.2018.05.008 (2018).

Revilla-León, M. et al. Perception of occlusal plane that is nonparallel to interpupillary and commissural lines but with the maxillary dental midline ideally positioned. J. Prosthet. Dent. 122 (5), 482–490. https://doi.org/10.1016/j.prosdent.2019.01.023 (2019).

Babiuc, I. et al. A comparative study on the perception of dental esthetics of laypersons and dental students. Acta Medica Transilvanica. 25 (2), 61–63. https://doi.org/10.2478/amtsb-2020-0034 (2020).

Pithon, M. M. et al. Esthetic perception of black spaces between maxillary central incisors by different age groups. Am. J. Orthod. Dentofac. Orthop. 143 (3), 371–375. https://doi.org/10.1016/j.ajodo.2012.10.020 (2013).

Bolas-Colvee, B., Tarazona, B., Paredes-Gallardo, V. & Arias-De Luxan, S. Relationship between perception of smile esthetics and orthodontic treatment in Spanish patients. PLoS ONE 13 (8), e0201102. https://doi.org/10.1371/journal.pone.0201102 (2018).

Tortopidis, D., Hatzikyriakos, A., Kokoti, M., Menexes, G. & Tsiggos, N. Evaluation of the relationship between subjects’ perception and professional assessment of esthetic treatment needs. J. Esthet. Restor. Dent. 19 (3), 154–162. https://doi.org/10.1111/j.1708-8240.2007.00089.x (2007) ( discussion 163 ).

Atieh, D. W. A. The relationship between the perception of black triangles appearance, personality factors, and the level of education . MDSc Thesis. Jordan, Amman: The University of Jordan, 159–165 (2023).

Talic, N. & Al-Shakhs, M. Perception of facial profile attractiveness by a Saudi sample. Saudi Dent. J. 20 , 17–23 (2008).

Talic, N., Alomar, S. & Almaidhan, A. Perception of Saudi dentists and lay people to altered smile esthetics. Saudi Dent. J. 25 (1), 13–21. https://doi.org/10.1016/j.sdentj.2012.09.001 (2013).

Al-Omiri, M. K. et al. Relationships between perception of black triangles appearance, personality factors and level of education. Sci. Rep. 14 (1), 5675. https://doi.org/10.1038/s41598-024-55855-3 (2024).

Pinho, S., Ciriaco, C., Faber, J. & Lenza, M. A. Impact of dental asymmetries on the perception of smile esthetics. Am. J. Orthod. Dentofac. Orthop. 132 (6), 748–753. https://doi.org/10.1016/j.ajodo.2006.01.039 (2007).

Nascimento, D., Santos, Ê., Machado, A. & Bittencourt, M. Influence of buccal corridor dimension on smile esthetics. Dental Press J. Orthod. 17 , 145–150. https://doi.org/10.1590/S2176-94512012000500020 (2012).

Sadrhaghighi, H., Zarghami, A., Sadrhaghighi, S. & Eskandarinezhad, M. Esthetic perception of smile components by orthodontists, general dentists, dental students, artists, and laypersons. J. Investig. Clin. Dent. 8 (4), 12235. https://doi.org/10.1111/jicd.12235 (2017).

Ngoc, V. T. N. et al. Perceptions of dentists and non-professionals on some dental factors affecting smile aesthetics: A study from Vietnam. Int. J. Environ. Res. Public Health. 17 (5), 1638. https://doi.org/10.3390/ijerph17051638 (2020).

Parekh, S. M., Fields, H. W., Beck, M. & Rosenstiel, S. Attractiveness of variations in the smile arc and buccal corridor space as judged by orthodontists and laymen. Angle Orthod. 76 (4), 557–563. https://doi.org/10.1043/0003-3219(2006)076[0557:Aovits]2.0.Co;2 (2006).

Ritter, D. E., Gandini, L. G., Pinto Ados, S. & Locks, A. Esthetic influence of negative space in the buccal corridor during smiling. Angle Orthod. 76 (2), 198–203. https://doi.org/10.1043/0003-3219(2006)076[0198:Eionsi]2.0.Co;2 (2006).

Krishnan, V., Daniel, S. T., Lazar, D. & Asok, A. Characterization of posed smile by using visual analog scale, smile arc, buccal corridor measures, and modified smile index. Am. J. Orthod. Dentofac. Orthop. 133 (4), 515–523. https://doi.org/10.1016/j.ajodo.2006.04.046 (2008).

Barros, E., Carvalho, M., Mello, K., Botelho, P. & Normando, D. The ability of orthodontists and laypeople in the perception of gradual reduction of dentogingival exposure while smiling. Dental Press J. Orthod. 17 , 81–86. https://doi.org/10.1590/S2176-94512012000500012 (2012).

Saffarpour, A., Ghavam, M., Saffarpour, A., Dayani, R. & Fard, M. J. Perception of laypeople and dental professionals of smile esthetics. J. Dent. 13 (2), 85–91 (2016).

Kay, E. J., Nassani, M. Z., Aswad, M., Abdelkader, R. S. & Tarakji, B. The disutility of tooth loss: A comparison of patient and professional values. J Public Health Dent. 74 (2), 89–92. https://doi.org/10.1111/jphd.12042 (2014).

Costa, L. R. et al. Perceptions of dentists, dentistry undergraduate students, and the lay public about dental sedation. J. Appl. Oral Sci. 12 (3), 182–188. https://doi.org/10.1590/s1678-77572004000300004 (2004).

Al-Omiri, M. K., Alhijawi, M. M., Al-Shayyab, M. H., Kielbassa, A. M. & Lynch, E. Relationship between dental students’ personality profiles and self-reported oral health behaviour. Oral Health Prev. Dent. 17 (2), 125–129. https://doi.org/10.3290/j.ohpd.a42371 (2019).

Bhayat, A. Oral health knowledge and practice among administrative staff at Taibah university, Madina, KSA. Eur. J. Gen. Dent. 2 , 308–311. https://doi.org/10.4103/2278-9626.116025 (2013).

Elrashid, A. et al. Correlation of sociodemographic factors and oral health knowledge among residents in Riyadh City, Kingdom of Saudi Arabia. J. Oral Health Community Dent. 12 , 8–13. https://doi.org/10.5005/jp-journals-10062-0018 (2018).

Rajeh, M. T. Gender differences in oral health knowledge and practices among adults in Jeddah, Saudi Arabia. Clin. Cosmet. Investig. Dent. 14 , 235–244. https://doi.org/10.2147/ccide.S379171 (2022).

Halboub, E., Dhaifullah, E. & Yasin, R. Determinants of dental health status and dental health behavior among Sana’a University students, Yemen. J. Investig. Clin. Dent. 4 (4), 257–264. https://doi.org/10.1111/j.2041-1626.2012.00156.x (2013).

Abu-Gharbieh, E. et al. Oral health knowledge and behavior among adults in the United Arab Emirates. Biomed. Res. Int. 2019 , 7568679. https://doi.org/10.1155/2019/7568679 (2019).

Abu Alhaija, E. S., Al-Shamsi, N. O. & Al-Khateeb, S. Perceptions of Jordanian laypersons and dental professionals to altered smile aesthetics. Eur. J. Orthod. 33 (4), 450–456. https://doi.org/10.1093/ejo/cjq100 (2011).

Omar, H. & Tai, Y. Perception of smile esthetics among dental and nondental students. J. Educ. Ethics Dent. 4 (2), 54–60. https://doi.org/10.4103/0974-7761.148986 (2014).

Silva, B. P., Jiménez-Castellanos, E., Martinez-de-Fuentes, R., Fernandez, A. A. & Chu, S. Perception of maxillary dental midline shift in asymmetric faces. Int. J. Esthet. Dent. 10 (4), 588–596 (2015).

Yusuf, H. et al. Differences by age and sex in general dental practitioners’ knowledge, attitudes and behaviours in delivering prevention. Br. Dent. J. 219 (6), E7. https://doi.org/10.1038/sj.bdj.2015.711 (2015).

Hantash, R. O., Al-Omiri, M. K., Yunis, M. A., Dar-Odeh, N. & Lynch, E. Relationship between impacts of complete denture treatment on daily living, satisfaction and personality profiles. J. Contemp. Dent. Pract. 12 (3), 200–207. https://doi.org/10.5005/jp-journals-10024-1035 (2011).

Younis, A. et al. Relationship between dental impacts on daily living, satisfaction with the dentition and personality profiles among a Palestinian population. Odontostomatol. Trop. 35 (138), 21–30 (2012).

Download references

Acknowledgements

The authors thank Mrs. AbdelAziz M. for her help during the preparation of this manuscript. Thanks also to the University of Jordan, King Khalid University and De Montfort University for making this study possible and for providing administrative support.

This research received no external funding. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Authors and affiliations.

Department of Fixed and Removable Prosthodontics, School of Dentistry, The University of Jordan, Queen Rania Street, Amman, 11942, Jordan

Mahmoud K. AL-Omiri & Motasum Abu-Awwad

Department of Prosthodontics, The City of London Dental School, Canada Water, Lower Road, London, UK

Mahmoud K. AL-Omiri

Private Practice, Amman, Jordan

Danial Waleed Ahmad Atieh

Department of Pediatric Dentistry and Orthodontics Sciences, College of Dentistry, King Khalid University, Abha, Asir, Saudi Arabia

Abdullah A. Al Nazeh & Salem Almoammar

Department of Restorative Dental Sciences, College of Dentistry, King Khalid University, Abha, Asir, Saudi Arabia

Saeed Awod Bin Hassan

Alqurayyat Specialized Dental Center, Ministry of Health, Al Qurayyat, Saudi Arabia

Abdallah Ahmed Aljbab

Department of Prosthetic Dental Sciences, College of Dentistry, King Khalid University, Abha, Asir, Saudi Arabia

Mohammed A. Alfaifi

Department of Prosthodontics, Faculty of Dental and Oral Surgery, University of Palestine, Gaza, Palestine

Naji M. Shat

De Montfort University, Leicester, UK

Edward Lynch

Contributions

M.K.AL-O. conceived the study. M.K.AL-O. and D.W.A.A. designed the study. M.K.AL-O. and D.W.A.A. collected the data. M.K.AL-O., D.W.A.A., M.A.A., A.A.AlN., S.A., S.A.B.H., A.A.A., M.A.A., N.M.S. and E.L. interpreted the data, conceived the results, drafted sections of the manuscript, and revised the manuscript. M.K.AL-O. prepared the tests for the study. M.K.AL-O., D.W.A.A., M.A.A., A.A.AlN., S.A., S.A.B.H., A.A.A., M.A.A., N.M.S. and E.L. carried out the data analysis and critically revised the manuscript. All authors critically revised the manuscript. All authors read and approved the submitted final version of the manuscript.

Corresponding author

Correspondence to Mahmoud K. AL-Omiri.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

AL-Omiri, M.K., Atieh, D.W.A., Abu-Awwad, M. et al. The knowledge regarding the impacts and management of black triangles among dental professionals and laypeople. Sci Rep 14 , 10840 (2024). https://doi.org/10.1038/s41598-024-61356-0

Received: 30 January 2024

Accepted: 05 May 2024

Published: 12 May 2024

DOI: https://doi.org/10.1038/s41598-024-61356-0

  • Black triangles
  • Interdental papillae
  • Gingival embrasure
  • Satisfaction

null hypothesis knowledge

COMMENTS

  1. Null & Alternative Hypotheses

    The null and alternative hypotheses offer competing answers to your research question. When the research question asks "Does the independent variable affect the dependent variable?": The null hypothesis (H0) answers "No, there's no effect in the population." The alternative hypothesis (Ha) answers "Yes, there is an effect in the ...

  2. Null Hypothesis: Definition, Rejecting & Examples

    When your sample contains sufficient evidence, you can reject the null and conclude that the effect is statistically significant. Statisticians often denote the null hypothesis as H0 and the alternative hypothesis as HA. Null Hypothesis H0: No effect exists in the population. Alternative Hypothesis HA: The effect exists in the population. In every study or experiment, researchers assess an effect or relationship.

  3. 9.1 Null and Alternative Hypotheses

    The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints. H0, the null hypothesis: a statement of no difference between sample means or proportions or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

  4. Null hypothesis

    Basic definitions. The null hypothesis and the alternative hypothesis are types of conjectures used in statistical tests to make statistical inferences, which are formal methods of reaching conclusions and separating scientific claims from statistical noise. The statement being tested in a test of statistical significance is called the null hypothesis. The test of significance is designed ...

  5. Null and Alternative Hypotheses

    The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test: Null hypothesis (H0): There's no effect in the population. Alternative hypothesis (HA): There's an effect in the population. The effect is usually the effect of the independent variable on the dependent ...

  6. Null and Alternative Hypotheses

    The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints. H0: The null hypothesis: It is a statement about the population that either is believed to be true or is used to put forth an argument unless it can be shown to be incorrect beyond a reasonable doubt.

  7. 9.1: Null and Alternative Hypotheses

    Review. In a hypothesis test, sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we: Evaluate the null hypothesis, typically denoted with \(H_{0}\). The null is not rejected unless the hypothesis test shows otherwise.

  8. 16.3: The Process of Null Hypothesis Testing

    16.3.5 Step 5: Determine the probability of the data under the null hypothesis. This is the step where NHST starts to violate our intuition - rather than determining the likelihood that the null hypothesis is true given the data, we instead determine the likelihood of the data under the null hypothesis - because we started out by assuming that the null hypothesis is true!

  9. Examples of null and alternative hypotheses

    It is the opposite of your research hypothesis. The alternative hypothesis--that is, the research hypothesis--is the idea, phenomenon, observation that you want to prove. If you suspect that girls take longer to get ready for school than boys, then: Alternative: girls time > boys time. Null: girls time <= boys time.

  10. Understanding Null Hypothesis Testing

    A crucial step in null hypothesis testing is finding the likelihood of the sample result if the null hypothesis were true. This probability is called the p value. A low p value means that the sample result would be unlikely if the null hypothesis were true and leads to the rejection of the null hypothesis. A high p value means that the sample ...

  11. Hypothesis Testing

    The alternate hypothesis is usually your initial hypothesis that predicts a relationship between variables. The null hypothesis is a prediction of no relationship between the variables you are interested in. Hypothesis testing example. You want to test whether there is a relationship between gender and height. Based on your knowledge of human ...

  12. How to Write a Strong Hypothesis

    5. Phrase your hypothesis in three ways. To identify the variables, you can write a simple prediction in if…then form. The first part of the sentence states the independent variable and the second part states the dependent variable. If a first-year student starts attending more lectures, then their exam scores will improve.

  13. Null Hypothesis Definition and Examples

    Null Hypothesis Examples. "Hyperactivity is unrelated to eating sugar" is an example of a null hypothesis. If the hypothesis is tested and found to be false, using statistics, then a connection between hyperactivity and sugar ingestion may be indicated. A significance test is the most common statistical test used to establish confidence in a ...

  14. What Is The Null Hypothesis & When To Reject It

    When your p-value is less than or equal to your significance level, you reject the null hypothesis. In other words, smaller p-values are taken as stronger evidence against the null hypothesis. Conversely, when the p-value is greater than your significance level, you fail to reject the null hypothesis. In this case, the sample data provides ...

  15. How to Write a Null Hypothesis (5 Examples)

    Whenever we perform a hypothesis test, we always write a null hypothesis and an alternative hypothesis, which take the following forms: H0 (Null Hypothesis): Population parameter =, ≤, ≥ some value. HA (Alternative Hypothesis): Population parameter <, >, ≠ some value. Note that the null hypothesis always contains the equal sign.

  16. Null hypothesis

    The null hypothesis (H 0) is the basis of statistical hypothesis testing. It is the default hypothesis (assumed to be true) that states that there is no statistically significant difference between some population parameter (such as the mean), and a hypothesized value. It is typically based on previous analysis or knowledge.

  17. How to Formulate a Null Hypothesis (With Examples)

    To distinguish it from other hypotheses, the null hypothesis is written as H 0 (which is read as "H-nought," "H-null," or "H-zero"). A significance test is used to determine the likelihood that the results supporting the null hypothesis are not due to chance. A confidence level of 95% or 99% is common. Keep in mind, even if the confidence level is high, there is still a small chance the ...

  18. 8.1.1: Null and Alternative Hypotheses

    Review. In a hypothesis test, sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we: Evaluate the null hypothesis, typically denoted with \(H_{0}\). The null is not rejected unless the hypothesis test shows otherwise.

  19. Null hypothesis

    Biology definition: A null hypothesis is an assumption or proposition where an observed difference between two samples of a statistical population is purely accidental and not due to systematic causes. It is the hypothesis to be investigated through statistical hypothesis testing, so that refuting it indicates that the alternative hypothesis is true. Thus, a null hypothesis is a hypothesis ...

  20. Understanding the Null Hypothesis for Linear Regression

    xi: The value of the predictor variable xi. Multiple linear regression uses the following null and alternative hypotheses: H0: β1 = β2 = … = βk = 0. HA: at least one βi ≠ 0. The null hypothesis states that all coefficients in the model are equal to zero. In other words, none of the predictor variables have a statistically ...

  21. Null Hypothesis

    A null hypothesis is a theory based on insufficient evidence that requires further testing to prove whether the observed data is true or false. For example, a null hypothesis statement can be "the rate of plant growth is not affected by sunlight." It can be tested by measuring the growth of plants in the presence of sunlight and comparing ...

  22. Why we habitually engage in null-hypothesis significance testing: A

    Null Hypothesis Significance Testing (NHST) is the most familiar statistical procedure for making inferences about population effects. ... To our knowledge, there is a lack of in depth, empirical research, aimed at elucidating why NHST nevertheless remains the dominant approach, or what actions can be taken to shift the sciences away from NHST ...

  23. When the Research Hypothesis Is the Null

    Y = α + β X + ε. If the parameter β > 0, this would support the hypothesis that GOTV campaigns improve voter turnout. To test this hypothesis, in practice the researcher would actually test a different hypothesis that we call the null hypothesis. This is the hypothesis that says there is no true effect of GOTV campaigns on voter turnout.

  24. Establishing a Null Hypothesis in Business Studies

    Formulating your null hypothesis is about stating that there is no association or difference between the variables you're investigating. It's a declarative sentence that typically takes the form ...

  25. Misunderstanding The Null Hypothesis and Knowledge. » AiR

    Misunderstanding The Null Hypothesis and Knowledge. The null hypothesis is used in science, and we atheists tend to have a lot of respect for science, the scientific method, scientific and empirical evidence. We think of ourselves as logical and rational people. We feel we know a lot about the topics we discuss, especially when it comes to science.

  26. The knowledge regarding the impacts and management of black ...

    The null hypothesis for this study was that there is no difference in the knowledge regarding the impacts, causes and management of black triangles between preclinical dental students, clinical ...
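
The decision rule quoted in item 14 above (reject the null hypothesis when the p-value is less than or equal to the significance level) can be sketched in a few lines of Python. This is only an illustrative sketch: it reuses the voter-turnout hypotheses H0: p ≤ 0.30 and Ha: p > 0.30, but the sample counts (170 voters out of 500 sampled), the function names and the normal-approximation z-test are assumptions made for the example, not details from any of the sources listed here.

```python
import math


def one_sided_p_value(successes, n, p0):
    """Upper-tailed p-value for H0: p <= p0 versus Ha: p > p0.

    Uses the normal approximation to the binomial (a one-sample z-test
    for a proportion), which is an assumption of this sketch.
    """
    p_hat = successes / n                      # sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)          # standard error under H0
    z = (p_hat - p0) / se                      # test statistic
    return 0.5 * math.erfc(z / math.sqrt(2))   # P(Z >= z) for standard normal Z


def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p-value <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Hypothetical sample: 170 of 500 registered voters voted in the primary.
p_value = one_sided_p_value(successes=170, n=500, p0=0.30)
print(round(p_value, 4), decide(p_value))
```

Note that the function returns "fail to reject H0" rather than "accept H0", matching the wording used in the snippets above: a large p-value means the sample is insufficient to reject the null hypothesis, not that the null has been shown to be true.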